Hi Hagai,

Actually, using a workflow you can select multiple input files and have the workflow run separately on each of them. I would proceed by creating a data library for all your fastq files, which you can populate via FTP or from a system directory. Then use a sample of your fastq files to build the steps you want to perform in a history, and extract a workflow from it. Finally, copy all fastq files from the data library into a new history and run your workflow on all input files.

I hope this helps you further,
Joachim

Joachim Jacob
Rijvisschestraat 120, 9052 Zwijnaarde
Tel: +32 9 244.66.34
Bioinformatics Training and Services (BITS)
http://www.bits.vib.be
@bitsatvib
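P.S. If you prefer to drive this from a script, below is a rough sketch of the same idea using the bioblend library to talk to the Galaxy API, launching one workflow run per library dataset. The URL, API key, library name and workflow name are placeholders, and it assumes the workflow has a single input step, so treat it as a starting point rather than working code.

from bioblend.galaxy import GalaxyInstance

gi = GalaxyInstance(url='http://localhost:8080', key='YOUR_API_KEY')

# Look up the data library holding the fastq files and the extracted workflow.
library = gi.libraries.get_libraries(name='fastq files')[0]
workflow = gi.workflows.get_workflows(name='bowtie and stats')[0]

# Assumes the workflow has exactly one input step; look up its id once.
input_step = list(gi.workflows.show_workflow(workflow['id'])['inputs'])[0]

# One workflow run per fastq file in the library.
for item in gi.libraries.show_library(library['id'], contents=True):
    if item['type'] != 'file':
        continue  # skip folders
    # 'src': 'ld' tells Galaxy the input comes from a library dataset
    dataset_map = {input_step: {'id': item['id'], 'src': 'ld'}}
    gi.workflows.run_workflow(workflow['id'], dataset_map=dataset_map,
                              history_name='Bowtie run on %s' % item['name'])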
On 02/12/2013 04:02 PM, Hagai Cohen wrote:

Hi,

I'm looking for a preferred way of running Bowtie (or any other tool) on multiple input files and then running statistics on the Bowtie output.
The input is a directory of files fastq1..fastq100. The Bowtie output should be bed1..bed100. The statistics tool should run on bed1..bed100 and return xls1..xls100. Then I will write a tool which will take xls1..xls100 and merge them into one final output.
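A minimal version of that merge step might look like the following, assuming the statistics come out as tab-separated text with a shared header line (the file pattern and output name are placeholders):

import glob

# Merge the per-sample statistics into one final output,
# keeping the header from the first file only.
with open('merged_stats.tsv', 'w') as out:
    for i, path in enumerate(sorted(glob.glob('xls*.tsv'))):
        with open(path) as stats:
            header = stats.readline()
            if i == 0:
                out.write(header)
            for line in stats:
                out.write(line)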
I searched for similar cases, and I couldn't find anyone who has had this problem before. I can't use the parallelism tag, because what would be the input for each tool? It should be a fastq file, not a directory of fastq files. Nor would I like to run each fastq file in a separate workflow, which would create a mess.
I have thought of only two solutions:
1. Implement new datatypes, fastq_dir and bed_dir, and implement new tool wrappers which take a folder instead of a file (a rough sketch of this is below).
2. Merge the input files before sending them to Bowtie, and use the parallelism tag so that they are split and merged again at each tool.
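For option 1, I imagine something modelled on Galaxy's composite datatypes (see galaxy.datatypes.data in the Galaxy source). All names here are guesses and would need checking against the datatype documentation:

from galaxy.datatypes.data import Data

class FastqDir(Data):
    """A dataset whose extra files directory holds many fastq files."""
    file_ext = 'fastq_dir'
    composite_type = 'auto_primary_file'

    def generate_primary_file(self, dataset=None):
        # The primary file is only an index page; tools would read the
        # actual reads from dataset.extra_files_path.
        return '<html><body><p>fastq collection</p></body></html>'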
Does anyone have a better suggestion?
Thanks,
Hagai