On Tue, Mar 08, 2011 at 03:43:20PM -0500, Musa A. Hassan wrote:
Hi All, Is there anyone out there who knows how I can randomly divide an fq file containing Illumina short reads into 2 smaller files containing approximately equal numbers of reads? I have an fq file from the Illumina HiSeq platform; unfortunately, this file is huge and is causing all sorts of problems, and I'd like to divide it into two equal-sized files (based on number of reads).
I'm assuming you mean "in Galaxy", right? If so, check out the entries in 'Text Manipulation'. Using 'select first' and 'select last' you can turn one dataset into two datasets, each half the size.

If instead you mean on the Unix command line, use the tool 'split', or 'head' and 'tail'. Either way, cut at a line count that is a multiple of 4, since each FASTQ read normally occupies four lines and any other cut point would split a read across the two files.

--
Ry4an Brase 612-626-6575
Software Developer
Application Development
University of Minnesota Supercomputing Institute
http://www.msi.umn.edu
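
The tools named above cut the file in order (first half, second half) rather than at random. If a random assignment is wanted, as the question asks, one possible approach is a small streaming script. What follows is only a sketch, not something from Galaxy or from the tools above: the function name `split_fastq_randomly`, its parameters, and the file names in the usage line are all made up for illustration, and it assumes an uncompressed FASTQ file with exactly four lines per read. Each read goes to one of the two outputs with probability 1/2, so the two files end up approximately, not exactly, equal in size.

```python
import random
from itertools import islice


def split_fastq_randomly(in_path, out_path_a, out_path_b, seed=None):
    """Send each 4-line FASTQ record to one of two output files at random.

    Hypothetical helper for illustration. Assumes an uncompressed FASTQ
    file with exactly four lines per read. Returns the number of reads
    written to each output as a tuple (count_a, count_b).
    """
    rng = random.Random(seed)  # a fixed seed makes the split reproducible
    counts = [0, 0]
    with open(in_path) as src, \
            open(out_path_a, "w") as out_a, \
            open(out_path_b, "w") as out_b:
        outputs = (out_a, out_b)
        while True:
            # Read one record (four lines) at a time, so the whole file
            # never has to fit in memory.
            record = list(islice(src, 4))
            if not record:
                break
            if len(record) != 4:
                raise ValueError("truncated FASTQ record at end of file")
            choice = rng.randrange(2)  # 0 or 1, each with probability 1/2
            outputs[choice].writelines(record)
            counts[choice] += 1
    return tuple(counts)
```

It could be called as `split_fastq_randomly("reads.fq", "reads_a.fq", "reads_b.fq", seed=1)`, where the file names are placeholders. Reads keep their original relative order within each output file.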