Re: [galaxy-user] varying number of output files in XML

5 Oct 2009

      Hi,

How does (or could) this work when creating a workflow? How can I route
all of the generated outputs to another tool that also accepts a dynamic
number of inputs?

Cheers,
Jelle

2009/10/1 Matthias Dodt <matthias.dodt@mdc-berlin.de>
Hi Guru!
Thank you very much for the detailed reply! It works now.
Just one thing is strange: twice as many files as expected appear in the
history, and half of them have size zero and contain nothing. I wrote
a program that splits a FASTA/FASTQ file into n files. On the command
line it works fine (the names of the output files can be specified
directly via a parameter).
In Galaxy, however, there are always 2n files in the history, half of them empty.
Any idea?
The XML file looks as follows:
<tool id="seqan_splitter_1" name="FASTA splitter" force_history_refresh="True">
  <description>Splits input files into pieces of desired size</description>
 <command>
  ./tools/RNA-seq/fasta-splitter/seqan_splitter
 --source $source
 --name-pattern primary_${resultset.id}_splitfile%_visible_fasta
 --target-dir $__new_file_path__/
 --maxsize $size
 #if $format_input.type =="fasta"
 --format fasta
 #else
 --format fastq
 #end if
...
/dev/null
##2> $log_report
 </command>
  <inputs>
    <conditional name="format_input">
      <param name="type" type="select" label="input file format" optional="false">
        <option value="fasta" selected="true">FASTA</option>
        <option value="fastqsanger">FASTQSanger</option>
        <option value="fastqsolexa">FASTQSolexa</option>
      </param>
      <when value="fasta">
        <param format="fasta" name="source" type="data" label="source file"/>
      </when>
      <when value="fastqsolexa">
        <param format="fastqsolexa" name="source" type="data" label="source file"/>
      </when>
      <when value="fastqsanger">
        <param format="fastqsanger" name="source" type="data" label="source file"/>
      </when>
    </conditional>
    <param name="size" type="integer" label="Size in Megabyte of each output file" value="500" optional="false"/>
  </inputs>
  <outputs>
    <data format="fasta" name="resultset" label="Splitted file"/>
    <!-- <data format="text" name="log_report" label="Detailed log report from splitter"/> -->
  </outputs>
<help>
</help>
</tool>
Thanks again!
greetings
mat
Guruprasad Ananda wrote:
Dear Matthias,
Yes, you can define the number of outputs dynamically in Galaxy. To do
this, declare one output dataset in your XML and pass
its ID ($out_file.id) to your Python script. Also
set force_history_refresh="True" in the tool tag of your XML, like this:
<tool id="split1" name="Split" force_history_refresh="True">
If your script names its outputs in the following format,
primary_associatedWithDatasetID_designation_visibility_extension(_DBKEY),
all of the datasets will show up in the history pane. The fields are:
- associatedWithDatasetID: the $out_file.id passed from the XML
- designation: a unique identifier for each output (set in your script)
- visibility: "visible" if you want the dataset shown in your history,
  "notvisible" otherwise
- extension: the format of your dataset (bed, tabular, fasta, etc.)
- DBKEY: optional; set it if required (e.g. hg18, mm9)
One of our tools "MAF to Interval converter"
(tools/maf/maf_to_interval.xml) already uses this feature. You can use
it as a reference.
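
The naming scheme above can be sketched in Python roughly as follows. This is only an illustration, not code taken from Galaxy or from maf_to_interval: the function names primary_output_name and write_chunks, the "splitN" designations, and the idea of receiving the target directory as an argument (for example the $__new_file_path__ value used elsewhere in this thread) are all assumptions made for the example. The underscore check is also an assumption, based on the fields being separated by underscores.

```python
import os


def primary_output_name(dataset_id, designation, ext, visible=True, dbkey=None):
    """Build a name of the form
    primary_<datasetID>_<designation>_<visibility>_<extension>[_<dbkey>].

    Hypothetical helper: the name and signature are illustrative only.
    """
    # Assumption: fields are separated by "_", so a designation that
    # itself contains "_" would make the name ambiguous.
    if "_" in designation:
        raise ValueError("designation must not contain underscores")
    parts = [
        "primary",
        str(dataset_id),
        designation,
        "visible" if visible else "notvisible",
        ext,
    ]
    if dbkey:
        parts.append(dbkey)
    return "_".join(parts)


def write_chunks(chunks, dataset_id, target_dir, ext="fasta"):
    """Write each chunk of text to its own file in target_dir.

    Hypothetical helper: target_dir is assumed to be the directory the
    tool was told to write additional outputs to.
    """
    paths = []
    for index, chunk in enumerate(chunks, start=1):
        name = primary_output_name(dataset_id, "split%d" % index, ext)
        path = os.path.join(target_dir, name)
        with open(path, "w") as handle:
            handle.write(chunk)
        paths.append(path)
    return paths
```

A script using this sketch would take the dataset ID and the target directory from its command line, as passed in from the <command> section of the tool XML.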
Hope this answers your question. Please feel free to email us if you
have any more queries.
Guru
Galaxy team.
On Sep 29, 2009, at 9:52 AM, Matthias Dodt wrote:
Hi galaxy-users!
I wrote a tool that splits a FASTA file into n output files, each one of
a predefined maximum size. The program could return the number of files
or a list of filenames...
Is it possible to define the number of outputs dynamically (i.e. the number
of output files depends on the input file size)?
Thanks!
So far I have experimented with:
<tool id="seqan_splitter_1" name="FASTA splitter">
 <description>Splits input files into pieces of desired size</description>
 <command interpreter="python">
 ./tools/RNA-seq/fasta-splitter/fasta-splitter.py
 --maxsize $size
 2> $log_report
 </command>
 <inputs>
  <param name="source" type="data" format="fasta" label="input fasta file"/>
  <param name="size" type="integer" label="Size in Megabyte of each output file" value="500" optional="false"/>
  <param name="files" type="hidden" value="10"/>
 </inputs>
<outputs>
#for $i < $files
<data format="fasta" name="\$i" label="Splitted file"/>
#end for
<data format="text" name="log_report" label="Detailed log report from splitter"/>
</outputs>
</tool>
_______________________________________________
galaxy-user mailing list
galaxy-user@bx.psu.edu <mailto:galaxy-user@bx.psu.edu>
http://mail.bx.psu.edu/cgi-bin/mailman/listinfo/galaxy-user
Regards,
Guruprasad Ananda
Graduate Student
Bioinformatics and Genomics
The Pennsylvania State University

Jelle Scholtalbers