2009/10/1 Matthias Dodt <matthias.dodt@mdc-berlin.de>

Hi Guru!

Thank you very much for the detailed reply! It works now.
Just one thing is strange: twice as many files as expected appear in the
history, and half of them have size zero and contain nothing. I wrote
a program that splits a FASTA/FASTQ file into n files. On the command
line it works fine (the names of the output files can be specified
directly via a parameter).
However, there are always 2n files in the history, half of them empty.
Any idea?

The XML file looks as follows:

<tool id="seqan_splitter_1" name="FASTA splitter" force_history_refresh="True">

<description>Splits input files into pieces of desired size</description>
<command>

./tools/RNA-seq/fasta-splitter/seqan_splitter
--source $source
--name-pattern primary_${resultset.id}_splitfile%_visible_fasta
--target-dir $__new_file_path__/
--maxsize $size
#if $format_input.type =="fasta"
--format fasta
#else
--format fastq
#end if
> /dev/null
##2> $log_report
</command>

<inputs>
<conditional name="format_input">
<param name="type" type="select" label="input file format" optional="false">
<option value="fasta" selected="true">FASTA</option>
<option value="fastqsanger">FASTQSanger</option>
<option value="fastqsolexa">FASTQSolexa</option>
</param>
<when value="fasta">
<param format="fasta" name="source" type="data"
label="source file"/>
</when>
<when value="fastqsolexa">
<param format="fastqsolexa" name="source" type="data"
label="source file"/>
</when>
<when value="fastqsanger">
<param format="fastqsanger" name="source" type="data" label="source file"/>
</when>
</conditional>

<param name="size" type="integer" label="Size in Megabyte of each output file" value="500" optional="false"/>

</inputs>

<outputs>
<data format="fasta" name="resultset" label="Splitted file"/>

</outputs>

<help>
</help>
</tool>
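
For reference, the naming scheme described in the quoted reply below (and
used in --name-pattern above) can be sketched in Python. This is only an
illustration: the function name primary_output_name and its parameters are
hypothetical and not part of Galaxy, and the dataset ID 42 is a made-up
stand-in for the value Galaxy would pass in as ${resultset.id}.

```python
def primary_output_name(dataset_id, designation, ext, visible=True, dbkey=None):
    """Build a name of the form
    primary_<datasetID>_<designation>_<visibility>_<extension>[_<dbkey>].

    Hypothetical helper for illustration; not a Galaxy API.
    """
    # The fields are separated by underscores, so this sketch rejects a
    # designation that contains one (an assumption made here to keep the
    # name unambiguous).
    if "_" in designation:
        raise ValueError("designation must not contain '_'")
    parts = [
        "primary",
        str(dataset_id),
        designation,
        "visible" if visible else "notvisible",
        ext,
    ]
    # DBKEY is optional and only appended when given (e.g. hg18, mm9).
    if dbkey:
        parts.append(dbkey)
    return "_".join(parts)


if __name__ == "__main__":
    # 42 is a made-up dataset ID used only for this example.
    for i in range(1, 4):
        print(primary_output_name(42, "splitfile%d" % i, "fasta"))
```

Each designation has to be unique per output file, which is why the loop
numbers them.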

Thanks again!

greetings

mat

Guruprasad Ananda wrote:

> Dear Matthias,
>
> Yes, you can define the number of outputs dynamically in Galaxy. To do
> this, declare one output dataset in your XML and pass its ID
> ($out_file.id) to your Python script. Also set
> force_history_refresh="True" in the tool tag of your XML, like this:
> <tool id="split1" name="Split" force_history_refresh="True">
> In your script, if your outputs are named in the following format,
> primary_associatedWithDatasetID_designation_visibility_extension(_DBKEY),
> all your datasets will show up in the history pane.
> - associatedWithDatasetID is the $out_file.id passed from the XML
> - designation is a unique identifier for each output (set in your
>   script)
> - visibility is "visible" if you want the dataset visible in your
>   history, or "notvisible" otherwise
> - extension is the required format for your dataset (bed, tabular,
>   fasta, etc.)
> - DBKEY is optional and can be set if required (e.g. hg18, mm9)
>
> One of our tools "MAF to Interval converter"
> (tools/maf/maf_to_interval.xml) already uses this feature. You can use
> it as a reference.
>
> Hope this answers your question. Please feel free to email us if you
> have any more queries.
> Guru
> Galaxy team.
>
> On Sep 29, 2009, at 9:52 AM, Matthias Dodt wrote:
>
>> Hi galaxy-users!
>>
>> I wrote a tool that splits a FASTA file into n output files, each one of
>> a predefined maximum size. The program could return the number of files
>> or a list of filenames...
>>
>> Is it possible to define the number of outputs dynamically (nr of output
>> files dependent on input-filesize)?
>>
>> Thanks!
>>
>>
>>
>> till now i experimented with:
>>
>> <tool id="seqan_splitter_1" name="FASTA splitter">
>> <description>Splits input files into pieces of desired
>> size</description>
>> <command interpreter="python">
>> ./tools/RNA-seq/fasta-splitter/fasta-splitter.py
>> --maxsize $size
>> 2> $log_report
>> </command>
>>
>> <inputs>
>> <param name="source" type="data" format="fasta" label="input fasta
>> file"/>
>> <param name="size" type="integer" label="Size in Megabyte of
>> each output file" value="500" optional="false"/>
>> <param name="files" type="hidden" value="10"/>
>> </inputs>
>>
>> <outputs>
>> #for $i < $files
>> <data format="fasta" name="\$i" label="Splitted file"/>
>> #end for
>> <data format="text" name="log_report" label="Detailed log report
>> from splitter"/>
>> </outputs>
>>
>> </tool>
>>
>> _______________________________________________
>> galaxy-user mailing list
>> galaxy-user@bx.psu.edu <mailto:galaxy-user@bx.psu.edu>
>> http://mail.bx.psu.edu/cgi-bin/mailman/listinfo/galaxy-user
>
> Regards,
>
> Guruprasad Ananda
> Graduate Student
> Bioinformatics and Genomics
> The Pennsylvania State University
>
>
>
