Hello,
I'm trying to incorporate a pipeline tool into our instance of Galaxy. My tool xml uses the <repeat> tag to allow an end-user to enter a number of different fasta/fastq files, with information about each of those files, so that RSEM (http://deweylab.biostat.wisc.edu/rsem/) can be run on all of the files one at a time with minimal end-user input. My repeat name is Files.

To do this, I need the tool xml to loop over each file and its input parameters to put together a string of input to a perl wrapper (the file is in the same directory as the xml). I tried using 'enumerate' to do this within the <command> tag with perl as the interpreter, but I'm noticing some strange behavior with the ordering of statements in the tag. Here's what I have for the tool xml's command tag (the aforementioned perl wrapper is run-rsem-1.2.3.pl).

<command interpreter="perl">
#for $i, $s in enumerate( $Files )
#if $s.input.fastqmatepair.matepair=="single"
#/opt/galaxy/galaxy-dist-dev/tools/ngs_rna/run-rsem-1.2.3.pl --seed-length $s.seedlength --forward-prob $s.fprob -p $s.cpus --single_fastq $s.input.fastqmatepair.singlefastq /opt/galaxy/references/rhesus/rheMac2_ensembl_1.64/RSEM_rheMac2_ensembl_1.64
#if $s.need_bam.yes_bamfile=="yes":
--output-genome-bam --bamfile $s.bam_res
#end if
#if $s.dostats.stat==1:
--fragment-length-mean $s.dostats.fraglenmean --fragment-length-sd $s.dostats.fraglensd
#end if
#if $s.dorspd.rspd==1:
--estimate-rspd --num-rspd-bins $s.dorspd.rspd_num
#end if
#end if
#end for
</command>
I must be missing something above, because it does not call run-rsem-1.2.3 once per loop iteration. Do I need a break or pause or something within the tag to make this enumerate loop work correctly? FWIW, to debug I added a print statement between the last '#end if' and the '#end for' above: 'print "loop iteration $i.\n";'. This correctly resulted in a number of separate calls to run-rsem-1.2.3, but the literal statement "print loop iteration #.\n" was also passed as an argument to the perl wrapper, which made it quit early because it received too many input arguments.
Please let me know if anything needs clarification. Any help is much appreciated - thanks!
Christina Shafer, Ph.D Bioinformatics Analyst Regenerative Biology Laboratory Morgridge Institute for Research
I ended up solving the immediate issue by adding a semicolon between the '#end if' and the '#end for'. Now it's just a matter of getting everything to play nicely between the perl wrapper and the xml... I think we can consider this "closed" though, so no need for anyone to look into it at this point.
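For anyone who finds this thread later, here is a rough Python sketch of why the semicolon probably helps. It is not Galaxy or Cheetah code. It assumes the loop in the <command> tag simply concatenates one fragment of text per repeat entry into a single string, and that the string is then handed to a shell, where ';' separates sequential commands. The function names, the file names a.fastq and b.fastq, and the seed length of 25 are all made up for illustration.

```python
# Hypothetical stand-in for the text the template loop produces:
# one run-rsem fragment per <repeat> entry, joined into one string.

def render_command(entries, separator=""):
    """Build a single command string with one fragment per repeat entry."""
    fragments = []
    for entry in entries:
        fragments.append(
            "run-rsem-1.2.3.pl --seed-length {seed} --single_fastq {fastq}".format(**entry)
            + separator
        )
    return " ".join(fragments)


def split_shell_commands(command_line):
    """Naively split on ';', roughly how a shell separates sequential commands."""
    return [part.strip() for part in command_line.split(";") if part.strip()]


# Made-up repeat entries, standing in for two uploaded files.
entries = [
    {"seed": 25, "fastq": "a.fastq"},
    {"seed": 25, "fastq": "b.fastq"},
]

without_semicolon = render_command(entries)
with_semicolon = render_command(entries, separator=";")

# Without a separator, everything runs together as one long invocation.
print(len(split_shell_commands(without_semicolon)))
# With ';' after each fragment, each entry becomes its own invocation.
print(len(split_shell_commands(with_semicolon)))
```

If that assumption holds, it would also explain the earlier debugging result: the print statement I pasted in ended with a ';', so it split the calls, while the rest of its text was passed along as extra arguments.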
Thanks, Christy
On Jul 25, 2013, at 1:44 PM, Christy Shafer <cshafer@morgridgeinstitute.org> wrote:
galaxy-dev@lists.galaxyproject.org