Hi Tony,

Yes, that should work too. I have written a BioPerl hack that indexes the reads and pulls out the pairs; it is chugging away right now. If that does not work out, I will give your idea a shot. Thanks!
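
For anyone who finds this thread later, the indexing idea can be sketched roughly as follows in Python. This is not the BioPerl script itself, only a minimal illustration under some assumptions: every record is exactly four lines, read names end in /1 or /2, and the function names (base_id, index_fastq, extract_pairs) are made up for this example. It stores byte offsets rather than the reads, so memory grows with the number of read names and not with the file size.

```python
def base_id(header):
    """Return the read name without the leading '@' and a trailing /1 or /2."""
    name = header.strip().split()[0][1:]
    if name.endswith((b"/1", b"/2")):
        name = name[:-2]
    return name


def index_fastq(path):
    """Map each base read name to the byte offset of its 4-line record."""
    index = {}
    with open(path, "rb") as handle:
        while True:
            offset = handle.tell()
            header = handle.readline()
            if not header:
                break  # end of file
            if not header.strip():
                continue  # tolerate stray blank lines between records
            for _ in range(3):  # skip sequence, '+' and quality lines
                handle.readline()
            index[base_id(header)] = offset
    return index


def read_record(handle, offset):
    """Read one 4-line record starting at the given byte offset."""
    handle.seek(offset)
    record = b"".join(handle.readline() for _ in range(4))
    if not record.endswith(b"\n"):
        record += b"\n"  # last record of a file may lack a final newline
    return record


def extract_pairs(fwd_path, rev_path, fwd_out, rev_out):
    """Write reads present in both files, in matching order; return pair count."""
    fwd_index = index_fastq(fwd_path)
    rev_index = index_fastq(rev_path)
    pairs = 0
    with open(fwd_path, "rb") as fwd, open(rev_path, "rb") as rev, \
            open(fwd_out, "wb") as out1, open(rev_out, "wb") as out2:
        for read_id, fwd_offset in fwd_index.items():
            rev_offset = rev_index.get(read_id)
            if rev_offset is None:
                continue  # mate is missing, so drop this read
            out1.write(read_record(fwd, fwd_offset))
            out2.write(read_record(rev, rev_offset))
            pairs += 1
    return pairs
```

Calling extract_pairs("fwd.fastq", "rev.fastq", "fwd.paired.fastq", "rev.paired.fastq") (placeholder file names) would write the two output files with mates on matching lines, following the read order of the forward file.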

Best,
Surya

On Tue, Mar 29, 2011 at 4:20 PM, Barbet,Anthony F <barbet@ufl.edu> wrote:
Could you not run fastq join on the two files, then fastq filter to keep only reads of the full combined length (set the minimum and maximum size to the same value, and add a quality filter if you want), and then fastq splitter?

Tony
________________________________________
From: galaxy-user-bounces@lists.bx.psu.edu [galaxy-user-bounces@lists.bx.psu.edu] On Behalf Of Surya Saha [ss2489@cornell.edu]
Sent: Tuesday, March 29, 2011 4:00 PM
To: Anton Nekrutenko
Cc: galaxy-user@lists.bx.psu.edu
Subject: Re: [galaxy-user] Combining the paired reads from Illumina run

Hi Anton,

Thank you for the tip. The sequence names do end in /1 and /2, but that can be fixed using the Manipulate FASTQ tool, right?
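
Outside Galaxy, the same renaming could be sketched in a few lines of Python. This is only an illustration and not what the Manipulate FASTQ tool does internally; it assumes four-line records, and the function names are hypothetical. Only header lines are touched, so a quality string that happens to end in /1 is left alone. The '+' line is passed through unchanged, which assumes it does not repeat the read name.

```python
import re

_PAIR_SUFFIX = re.compile(r"/[12]$")


def strip_pair_suffix(header):
    """Remove a trailing /1 or /2 from the read name in a FASTQ header line."""
    name, sep, rest = header.partition(" ")
    return _PAIR_SUFFIX.sub("", name) + sep + rest


def rename_reads(fastq_lines):
    """Yield FASTQ lines (without newlines), fixing every fourth (header) line."""
    for number, line in enumerate(fastq_lines):
        line = line.rstrip("\n")
        if number % 4 == 0:  # header line of each 4-line record
            line = strip_pair_suffix(line)
        yield line
```

For example, list(rename_reads(["@r1/1\n", "ACGT\n", "+\n", "II/1\n"])) gives ["@r1", "ACGT", "+", "II/1"].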

-Surya

On Tue, Mar 29, 2011 at 3:46 PM, Anton Nekrutenko <anton@bx.psu.edu> wrote:
>
> You can try converting fastq to tabular (NGS: QC and Manipulation), joining (Join, Subtract and Group) the two files on IDs (provided they do not end in /1 and /2), splitting the result into two files with cut (Text manipulation), and going back to fastq with tabular-to-fastq (NGS: QC and Manipulation). With 30 million reads this will likely take some time, though.
> Thanks,
> anton
>
> On Mar 29, 2011, at 11:38 AM, Surya Saha wrote:
>
> These are Illumina reads
>
> -S.
>
> On Tue, Mar 29, 2011 at 11:37 AM, Anton Nekrutenko <anton@bx.psu.edu> wrote:
>>
>> Are these Illumina or SOLiD reads?
>>
>> Tx,
>>
>> anton
>>
>>
>> On Mar 29, 2011, at 11:29 AM, Surya Saha wrote:
>>
>> > Hi,
>> >
>> > I have two fastq files with the forward (/1) and reverse (/2) paired reads. The reads are not in the same order in the two files, some pairs are missing one mate, and the files are 8 GB each with about 30 million reads each.
>> >
>> > I am trying to pull out all the paired reads for which both fwd and rev exist. Can I use a combination of fastq tools in Galaxy to do this?
>> >
>> > Thanks!
>> >
>> > -Surya
>> > ___________________________________________________________
>> > The Galaxy User list should be used for the discussion of
>> > Galaxy analysis and other features on the public server
>> > at usegalaxy.org.  Please keep all replies on the list by
>> > using "reply all" in your mail client.  For discussion of
>> > local Galaxy instances and the Galaxy source code, please
>> > use the Galaxy Development list:
>> >
>> >  http://lists.bx.psu.edu/listinfo/galaxy-dev
>> >
>> > To manage your subscriptions to this and other Galaxy lists,
>> > please use the interface at:
>> >
>> >  http://lists.bx.psu.edu/
>>
>> Anton Nekrutenko
>> http://nekrut.bx.psu.edu
>> http://usegalaxy.org