Hello Karen,

It sounds as if the data contains replicates, but this should be confirmed with the data source by reviewing the methods for the specific experiment.

If indeed these are replicates, it is best to process these independently and submit them as replicates when using the NGS: RNA-seq tools. The tool authors have specific advice regarding replicates at their web site:
http://cufflinks.cbcb.umd.edu/howitworks.html#reps

If the files instead represent different conditions, you would also want to process them independently. This is probably obvious, but I wanted to mention it for completeness.
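Whichever case applies, the read1 and read2 files have to stay matched up. As a rough sketch only (this is not a Galaxy tool, and the file naming pattern `sampleA_R1_003.fastq` is an assumption made for illustration, not the actual ENCODE naming), the Python below pairs each read1 file with its read2 mate file so that each pair can be processed on its own. The `concatenate` helper is included only for the case where the data source confirms that the files are pieces of one run rather than replicates; `pair_files` and `concatenate` are hypothetical names.

```python
import re
import shutil
from collections import defaultdict

# Hypothetical naming scheme, e.g. "sampleA_R1_003.fastq".
# Adjust the pattern to match the real file names.
NAME_RE = re.compile(r"^(?P<sample>.+)_R(?P<read>[12])_(?P<part>\d+)\.fastq$")


def pair_files(names):
    """Return {sample: [(read1_file, read2_file), ...]} ordered by part number."""
    found = defaultdict(dict)  # (sample, part) -> {"1": name, "2": name}
    for name in names:
        m = NAME_RE.match(name)
        if m is None:
            raise ValueError(f"unrecognized file name: {name}")
        found[(m["sample"], int(m["part"]))][m["read"]] = name

    pairs = defaultdict(list)
    for sample, part in sorted(found):
        reads = found[(sample, part)]
        if set(reads) != {"1", "2"}:
            raise ValueError(f"missing mate file for {sample} part {part}")
        pairs[sample].append((reads["1"], reads["2"]))
    return dict(pairs)


def concatenate(paths, out_path):
    """Append uncompressed FASTQ files to out_path in the given order."""
    with open(out_path, "wb") as out:
        for path in paths:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)
```

If concatenation does turn out to be appropriate, the read1 files and the read2 files need to be joined in the same part order (for example, the first element of every pair into one file and the second element of every pair into another), otherwise the mates will no longer line up.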

Links to resources, including a Galaxy tutorial, can be found grouped in our wiki at:
http://wiki.galaxyproject.org/Support#Tools_on_the_Main_server

Hopefully this helps,

Jen
Galaxy team

On 12/11/12 11:03 AM, Karen Margrethe Jessen wrote:

Hi,

I have downloaded the fastq.tgz files for an ENCODE RNA-seq data set and unpacked them. The data set is paired-end Illumina. I am a bit confused, as there are 5 files for read1 and 5 files for read2 for each sample. Am I supposed to merge the 5 files before aligning to the hg19 genome?

If yes, how should I merge these files?

I would greatly appreciate any help you can provide.

Best regards,

Karen Margrethe Jessen

Cand. Scient., ph.d.-student

Aarhus University

Denmark
-- 
Jennifer Jackson
http://galaxyproject.org