Nate - Is there a specific place in the Galaxy code that forks samtools to index BAM files, either on the cluster or on the head node? I really need to track this down.
I re-uploaded 3 BAM files using the "Upload system file paths" tool. runner0.log shows:

galaxy.jobs DEBUG 2012-01-13 12:50:08,442 dispatching job 76 to pbs runner
galaxy.jobs INFO 2012-01-13 12:50:08,555 job 76 dispatched
galaxy.jobs.runners.pbs DEBUG 2012-01-13 12:50:08,697 (76) submitting file /home/galaxy/galaxy-dist-9/database/pbs/76.sh
galaxy.jobs.runners.pbs DEBUG 2012-01-13 12:50:08,697 (76) command is: python /home/galaxy/galaxy-dist-9/tools/data_source/upload.py /home/galaxy/galaxy-dist-9 /home/galaxy/galaxy-dist-9/datatypes_conf.xml /home/galaxy/galaxy-dist-9/database/tmp/tmpqrVYY7 208:/home/galaxy/galaxy-dist-9/database/job_working_directory/76/dataset_208_files:None 209:/home/galaxy/galaxy-dist-9/database/job_working_directory/76/dataset_209_files:None 210:/home/galaxy/galaxy-dist-9/database/job_working_directory/76/dataset_210_files:None; cd /home/galaxy/galaxy-dist-9; /home/galaxy/galaxy-dist-9/set_metadata.sh ./database/files ./database/tmp . datatypes_conf.xml ./database/job_working_directory/76/galaxy.json
galaxy.jobs.runners.pbs DEBUG 2012-01-13 12:50:08,699 (76) queued in default queue as 114.localhost.localdomain
galaxy.jobs.runners.pbs DEBUG 2012-01-13 12:50:09,037 (76/114.localhost.localdomain) PBS job state changed from N to R
galaxy.jobs.runners.pbs DEBUG 2012-01-13 12:51:09,205 (76/114.localhost.localdomain) PBS job state changed from R to E
galaxy.jobs.runners.pbs DEBUG 2012-01-13 12:51:10,206 (76/114.localhost.localdomain) PBS job state changed from E to C
galaxy.jobs.runners.pbs DEBUG 2012-01-13 12:51:10,206 (76/114.localhost.localdomain) PBS job has completed successfully

76.sh shows:

[galaxy@bic pbs]$ more 76.sh
#!/bin/sh
GALAXY_LIB="/home/galaxy/galaxy-dist-9/lib"
if [ "$GALAXY_LIB" != "None" ]; then
if [ -n "$PYTHONPATH" ]; then
export PYTHONPATH="$GALAXY_LIB:$PYTHONPATH"
else
export PYTHONPATH="$GALAXY_LIB"
fi
fi
cd /home/galaxy/galaxy-dist-9/database/job_working_directory/76
python /home/galaxy/galaxy-dist-9/tools/data_source/upload.py /home/galaxy/galaxy-dist-9 /home/galaxy/galaxy-dist-9/datatypes_conf.xml /home/galaxy/galaxy-dist-9/database/tmp/tmpqrVYY7 208:/home/galaxy/galaxy-dist-9/database/job_working_directory/76/dataset_208_files:None 209:/home/galaxy/galaxy-dist-9/database/job_working_directory/76/dataset_209_files:None 210:/home/galaxy/galaxy-dist-9/database/job_working_directory/76/dataset_210_files:None; cd /home/galaxy/galaxy-dist-9; /home/galaxy/galaxy-dist-9/set_metadata.sh ./database/files ./database/tmp . datatypes_conf.xml ./database/job_working_directory/76/galaxy.json

Right as the job ended I checked the job output files:

[galaxy@bic pbs]$ ll
total 4
-rw-rw-r-- 1 galaxy galaxy 950 Jan 13 12:50 76.sh
[galaxy@bic pbs]$ ll
total 4
-rw------- 1 galaxy galaxy 0 Jan 13 12:50 76.e
-rw------- 1 galaxy galaxy 0 Jan 13 12:50 76.o
-rw-rw-r-- 1 galaxy galaxy 950 Jan 13 12:50 76.sh

samtools is now running on the head node. Where does Galaxy decide how to run samtools? Maybe I can add a check of some sort to see what's going on?

On Fri, Jan 13, 2012 at 10:53 AM, Nate Coraor <nate@bx.psu.edu> wrote:
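[Editor's note: one way to add the check mentioned above is to place a small wrapper ahead of the real samtools on the head node's PATH, so every invocation logs which host ran it and with what arguments. This is a hypothetical diagnostic sketch, not part of Galaxy; the wrapper's log path and the location of the real binary are assumptions you would adjust for your site.]

```python
#!/usr/bin/env python
# Hypothetical samtools wrapper: install this ahead of the real samtools on
# PATH so each invocation records the host it ran on. Paths are assumptions.
import os
import socket
import sys
import time

REAL_SAMTOOLS = "/usr/local/bin/samtools"  # assumed path to the real binary
LOG_FILE = "/tmp/samtools_calls.log"       # assumed log location

def format_log_line(host, args):
    """One line per call: timestamp, hostname, and the samtools arguments."""
    return "%s host=%s args=%s\n" % (
        time.strftime("%Y-%m-%d %H:%M:%S"), host, " ".join(args))

if __name__ == "__main__" and len(sys.argv) > 1:
    with open(LOG_FILE, "a") as log:
        log.write(format_log_line(socket.gethostname(), sys.argv[1:]))
    # Hand off to the real samtools with the original arguments.
    os.execv(REAL_SAMTOOLS, [REAL_SAMTOOLS] + sys.argv[1:])
```

After a few uploads, the log would show whether the `samtools index` calls carry the head node's hostname or a compute node's.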
On Jan 12, 2012, at 11:41 PM, Ryan Golhar wrote:

> Any ideas as to how to fix this? We are interested in using Galaxy to host all our NGS data. If indexing on the head node is going to happen, then this is going to be an extremely slow process.

Could you post the contents of /home/galaxy/galaxy-dist-9/database/pbs/62.sh ?

Although I have to admit this is really baffling. The presence of this line without an error:

indicates that metadata was set externally and the relevant metadata files were present on disk.
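[Editor's note: for context on the internal/external distinction, the BAM metadata step is what builds the .bai index, and samtools runs on whichever host executes that step: the Galaxy head node when metadata is set internally by the server process, or the compute node when set_metadata.sh runs it inside the PBS job. A simplified, hypothetical sketch of that step follows; the function names are illustrative, not Galaxy's actual code.]

```python
# Illustrative sketch only; Galaxy's real metadata code differs.
import socket
import subprocess

def samtools_index_command(bam_path, index_path):
    # samtools index accepts an optional explicit output index name.
    return ["samtools", "index", bam_path, index_path]

def set_bam_metadata(bam_path, index_path):
    """Build the .bai index as part of setting BAM metadata.

    Whichever host calls this is where samtools runs: the head node if
    metadata is set internally, or the cluster node if the job script's
    set_metadata.sh step runs it (the "external" case described above).
    """
    print("indexing %s on %s" % (bam_path, socket.gethostname()))
    subprocess.check_call(samtools_index_command(bam_path, index_path))
```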
--nate