On Tue, Apr 22, 2014 at 4:34 PM, Ulf Schaefer <Ulf.Schaefer@phe.gov.uk> wrote:
> Hi Peter
> I removed the unnecessary code.
> If I run the tool with just a couple of inputs I see entries in the log files from either galaxy.jobs.runners.drmaa or galaxy.jobs.runners.local indicating that the job is being dispatched as normal.
> Unfortunately there is no sign of the job in the log files when using more input files.
> The command line that is supposed to be run is:
> bash home/galaxy/galaxy-dist/tools/vcf_processing/vcf_to_fasta.sh /galaxy/database/files/042/dataset_42275.dat 40 10 50 0 40 0.9 20 /galaxy/database/files/041/dataset_41720.dat, /galaxy/database/files/041/dataset_41980.dat,
> The first dat file is the output, and the ones at the end are a comma-separated list of the input files. On the command line this command works with much longer lists of input files.
I wouldn't bother with the commas - that is just wasting characters and eating into the maximum command line string length.
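Something along these lines is what I mean - just an untested sketch, since I haven't seen inside vcf_to_fasta.sh and am assuming the argument order from your example (output file, seven numeric parameters, then the inputs):

#!/bin/bash
# Untested sketch, not the real vcf_to_fasta.sh: take the output file,
# the seven numeric parameters, then any number of input files as
# plain space-separated arguments (no commas needed).
output=$1
shift 8              # skip the output path and the seven numeric parameters
for vcf in "$@"      # everything remaining is an input file
do
    echo "Would process $vcf into $output"
done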
> Any ideas?
Check the limit with "xargs --show-limits" or "getconf ARG_MAX", our CentOS server reports:

$ getconf ARG_MAX
2621440
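As a quick sanity check (the path and wildcard here are just an assumption based on your example), you can count the bytes your expanded argument list would take and compare that to the limit:

# Rough check (paths assumed from your example): how many bytes would
# the expanded list of input files take, versus the kernel limit?
printf '%s ' /galaxy/database/files/041/*.dat | wc -c
getconf ARG_MAX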
> Or is there a better way to pass a large number of input files to a bash script?
> Thanks
> Ulf
If there is any chance of your constructed command line string exceeding the system limit, I would construct an input file containing the filenames (e.g. one per line). That might be a practical solution anyway.

For the file-based approach, I would use the Galaxy <configfile> tag. Some of the tools bundled with Galaxy also use this (find them with grep), or for example one of mine:

https://github.com/peterjc/pico_galaxy/blob/master/tools/mira4/mira4_de_novo...

Regards,

Peter
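P.S. On the bash side the change is small. Again an untested sketch (I haven't seen vcf_to_fasta.sh, and the argument positions are assumed from your example): the final argument becomes a file listing the inputs, one per line, rather than the filenames themselves.

#!/bin/bash
# Untested sketch, not the real vcf_to_fasta.sh: the ninth argument is
# now a file listing the input VCF paths, one per line (e.g. written
# out by a <configfile> block in the tool XML).
output=$1
list=$9              # file containing the input filenames
while IFS= read -r vcf
do
    echo "Would process $vcf into $output"
done < "$list"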