Apologies for originally posting this to galaxy-user; now I realize it belongs here.

Hello,

I'm a galaxy newbie and running into several issues trying to adapt an R script to be a galaxy tool.

I'm looking at the XY plotting tool for guidance (tools/plot/xy_plot.xml), but I decided not to embed my script in the XML. Instead I keep it in a separate script file, so that I can still run it from the command line and make sure it works as I make incremental changes. (So my script starts with args <- commandArgs(TRUE)). Also, if it works on the command line but not in galaxy, this suggests to me that there is a problem with my galaxy configuration.

First, I tried using the r_wrapper.sh script that comes with the XY plotting tool, but it threw away my arguments:

An error occurred running this job:
ARGUMENT '/Users/dtenenba/dev/galaxy-dist/database/files/000/dataset_4.dat' __ignored__
ARGUMENT '/Users/dtenenba/dev/galaxy-dist/database/files/000/dataset_3.dat' __ignored__
ARGUMENT 'Fly' __ignored__
ARGUMENT 'Tagwise' __ignored__
etc.

So then I tried just switching to Rscript:

<command interpreter="bash">Rscript RNASeq.R $countsTsv $designTsv "$organism" $dispersion $minimumCountsPerMillion $minimumSamplesPerTranscript $out_file1 $out_file2</command>

(My script produces as output a csv file and a pdf file. The final two arguments I'm passing are the names of those files.)

But then I get an error that Rscript can't be found. So I wrote a little wrapper script, Rscript_wrapper.sh:

#!/bin/sh
Rscript $*

And called that:

<command interpreter="bash">Rscript_wrapper.sh RNASeq.R $countsTsv $designTsv "$organism" $dispersion $minimumCountsPerMillion $minimumSamplesPerTranscript $out_file1 $out_file2</command>

Then I got an error that RNASeq.R could not be found. So then I added the absolute path to my R script to the <command> tag.
This seemed to work (that is, it got me further, to the next error), but I'm not sure why I had to do this; in all the other tools I'm looking at, the directory to the script to run does not have to be specified; I assumed that the command would run in the appropriate directory.

So now I've specified the full path to my R script:

<command interpreter="bash">Rscript_wrapper.sh /Users/dtenenba/dev/galaxy-dist/tools/bioc/RNASeq.R $countsTsv $designTsv "$organism" $dispersion $minimumCountsPerMillion $minimumSamplesPerTranscript $out_file1 $out_file2</command>

And I get the following long error, which includes all of the output of my R script:

Traceback (most recent call last):
  File "/Users/dtenenba/dev/galaxy-dist/lib/galaxy/jobs/runners/local.py", line 133, in run_job
    job_wrapper.finish( stdout, stderr )
  File "/Users/dtenenba/dev/galaxy-dist/lib/galaxy/jobs/__init__.py", line 725, in finish
    self.sa_session.flush()
  File "/Users/dtenenba/dev/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.7.egg/sqlalchemy/orm/scoping.py", line 127, in do
    return getattr(self.registry(), name)(*args, **kwargs)
  File "/Users/dtenenba/dev/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.7.egg/sqlalchemy/orm/session.py", line 1356, in flush
    self._flush(objects)
  File "/Users/dtenenba/dev/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.7.egg/sqlalchemy/orm/session.py", line 1434, in _flush
    flush_context.execute()
  File "/Users/dtenenba/dev/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.7.egg/sqlalchemy/orm/unitofwork.py", line 261, in execute
    UOWExecutor().execute(self, tasks)
  File "/Users/dtenenba/dev/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.7.egg/sqlalchemy/orm/unitofwork.py", line 753, in execute
    self.execute_save_steps(trans, task)
  File "/Users/dtenenba/dev/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.7.egg/sqlalchemy/orm/unitofwork.py", line 768, in execute_save_steps
    self.save_objects(trans, task)
  File "/Users/dtenenba/dev/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.7.egg/sqlalchemy/orm/unitofwork.py", line 759, in save_objects
    task.mapper._save_obj(task.polymorphic_tosave_objects, trans)
  File "/Users/dtenenba/dev/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.7.egg/sqlalchemy/orm/mapper.py", line 1413, in _save_obj
    c = connection.execute(statement.values(value_params), params)
  File "/Users/dtenenba/dev/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.7.egg/sqlalchemy/engine/base.py", line 824, in execute
    return Connection.executors[c](self, object, multiparams, params)
  File "/Users/dtenenba/dev/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.7.egg/sqlalchemy/engine/base.py", line 874, in _execute_clauseelement
    return self.__execute_context(context)
  File "/Users/dtenenba/dev/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.7.egg/sqlalchemy/engine/base.py", line 896, in __execute_context
    self._cursor_execute(context.cursor, context.statement, context.parameters[0], context=context)
  File "/Users/dtenenba/dev/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.7.egg/sqlalchemy/engine/base.py", line 950, in _cursor_execute
    self._handle_dbapi_exception(e, statement, parameters, cursor, context)
  File "/Users/dtenenba/dev/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.7.egg/sqlalchemy/engine/base.py", line 931, in _handle_dbapi_exception
    raise exc.DBAPIError.instance(statement, parameters, e, connection_invalidated=is_disconnect)
ProgrammingError: (ProgrammingError) You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings. u'UPDATE job SET update_time=?, stdout=?, stderr=? WHERE job.id = ?' ['2012-04-24 18:55:45.791417', '', 'BiocInstaller version 1.5.7, ?biocLite for help\nWarning message:\nNAs introduced by coercion \nLoading required package: methods\nLoading required package: limma\nLoading required package: BiasedUrn\nLoading required package: geneLenDataBase\nLoading required package: org.Dm.eg.db\nLoading required package: AnnotationDbi\nLoading required package: BiocGenerics\n\nAttaching package: \xe2\x80\x98BiocGenerics\xe2\x80\x99\n\nThe following object(s) are masked from \xe2\x80\x98package:stats\xe2\x80\x99:\n\n xtabs\n\nThe following object(s) are masked from \xe2\x80\x98package:base\xe2\x80\x99:\n\n anyDuplicated, cbind, colnames, duplicated, eval, Filter, Find,\n get, intersect, lapply, Map, mapply, mget, order, paste, pmax,\n pmax.int, pmin, pmin.int, Position, rbind, Reduce, rep.int,\n rownames, sapply, setdiff, table, tapply, union, unique\n\nLoading required package: Biobase\nWelcome to Bioconductor\n\n Vignettes contain introductory material; view with\n \'browseVignettes()\'. To cite Bioconductor, see\n \'citation("Biobase")\', and for packages \'citation("pkgname")\'.\n\nLoading required package: DBI\n\nCalculating library sizes from column totals.\nError in matrix(u, nrow = nrows, byrow = TRUE) : \n negative extents to matrix\nCalls: plotMDS.DGEList ... equalizeLibSizes -> splitIntoGroups -> lapply -> FUN -> matrix\nExecution halted\n', 15]

Note that if I run my script from the command line:

./Rscript_wrapper.sh RNASeq.R /Users/dtenenba/dev/galaxy-dist/database/files/000/dataset_4.dat /Users/dtenenba/dev/galaxy-dist/database/files/000/dataset_3.dat Fly 1 1 Tagwise MDSPlot.pdf outputs.csv

it works fine and does not produce a warning about "NAs introduced by coercion", nor does it fail with the "Error in matrix" above.

So, can anyone tell me what is going wrong here? Why does R behave differently in galaxy than it does on the command line? (I'm using the same instance of R, same machine, for my galaxy and command-line efforts).
Is this 8-bit bytestring error a red herring? Can I filter it so that galaxy is happy?

Finally, one other curiosity. Every time I hit "Execute" in galaxy to run my tool, it is run twice: two jobs are created, and each fails in the same way. Why is this?

My R script:
https://gist.github.com/2482783

My XML file:
https://gist.github.com/2482792

I can share more data (such as sample input files) if necessary.

Thanks for your help.
Dan