Hi Suzan,

Running the groomer can take a while, and it is unnecessary if your sequences are already in FASTQ Sanger format (the quality score encoding that TopHat needs). I have noticed that when you upload FASTQ files to Galaxy, it always labels them as "fastq", even if they are actually fastq sanger, and that is why you would normally have to run the groomer. If you are sure your sequences are already fastq sanger, you can simply upload them with the file type set to "fastqsanger" instead of auto-detect or, if they are already uploaded, edit their attributes to change the datatype to fastqsanger. If you aren't sure which type of FASTQ quality score you have, I can send you a script to detect it. If your sequences were processed by the Illumina Casava pipeline (version 1.8 or later), they should be fastq sanger.
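
In case it is useful, here is a rough sketch of what such a detection script could look like. It is not the exact script I mentioned, only an illustration in Python. The function name guess_fastq_encoding and the max_reads parameter are made up for this example, and it assumes an uncompressed FASTQ file with exactly four lines per read. The cutoffs are a heuristic: characters below ASCII 59 can only occur with the Sanger (Phred+33) offset, while characters above ASCII 74 ('J') point to one of the older Phred+64 encodings.

```python
import itertools


def guess_fastq_encoding(path, max_reads=10000):
    """Guess the quality encoding from the range of quality characters.

    Heuristic only: assumes 4 lines per record and looks at the first
    max_reads records of an uncompressed FASTQ file.
    """
    lowest, highest = 255, 0
    with open(path) as handle:
        # Lines 4, 8, 12, ... of the file hold the quality strings.
        quality_lines = itertools.islice(handle, 3, max_reads * 4, 4)
        for line in quality_lines:
            codes = [ord(char) for char in line.rstrip("\r\n")]
            if codes:
                lowest = min(lowest, min(codes))
                highest = max(highest, max(codes))

    if lowest > highest:
        return "unknown (no quality lines read)"
    if lowest < 59:
        # Too low for any Phred+64 encoding (Solexa starts at ASCII 59).
        return "fastqsanger (Phred+33)"
    if highest > 74:
        # Above 'J', the usual top of the Phred+33 range.
        return "fastqillumina or fastqsolexa (Phred+64)"
    return "ambiguous"
```

You would call it as guess_fastq_encoding("my_reads.fastq"), where the file name is a placeholder. If it reports "ambiguous", the reads it sampled did not contain any character that settles the question, so try a larger max_reads.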

For the QC, it's up to you. You can either concatenate your files and QC them all at once, or do it one file at a time. You can test which is faster on your computer. To run TopHat with paired-end data, if I remember correctly, you have to give it two files: one with your forward reads and one with your reverse reads. Both need to be uploaded to Galaxy at that point, and no, do not combine forward with reverse. However, you can combine multiple forward files into one and use it with the combined reverse files, as long as both sets are concatenated in the same order so that the mates stay paired.
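
If you prefer to combine the files on your own computer before uploading, something like the following Python sketch would do it. The function name concatenate_files and the file names in the usage note are only placeholders, and it assumes uncompressed FASTQ files that each end with a newline. The important point is that the forward and reverse lists are given in the same order.

```python
import shutil


def concatenate_files(input_paths, output_path):
    """Append each input file, in the order given, to output_path."""
    with open(output_path, "wb") as out:
        for path in input_paths:
            with open(path, "rb") as src:
                # Stream the file across without loading it into memory.
                shutil.copyfileobj(src, out)
```

For one sample with two lanes (hypothetical names), you would run concatenate_files(["lane1_R1.fastq", "lane2_R1.fastq"], "sample_R1.fastq") and then concatenate_files(["lane1_R2.fastq", "lane2_R2.fastq"], "sample_R2.fastq"), keeping lane1 before lane2 in both calls.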

Hope this helps a bit.

Sincerely

David


Date: Wed, 8 Aug 2012 11:05:17 -0400
From: suz.katie@gmail.com
To: galaxy-dev@lists.bx.psu.edu; jeremy.goecks@emory.edu
Subject: [galaxy-dev] TopHat paired end

Hello,

I am trying to do RNAseq analysis on Paired end data Illumina. I have about 4 files for each sample (2 forward and 2 reverse).

I want to analyze the data  in Galaxy.

Do I have to groom and run the QC for each file? Should I join the paired files and run both tools on each pair, or should I combine all of the data for each sample (which I don't know how to do) and then groom and run the QC on all of the reads for the sample?

Ultimately, how do I run TopHat for paired-end data? When I select paired-end data in Galaxy, it gives an option to upload both files. Should I upload both forward files once and both reverse files once, or should I combine the files before running TopHat?

I am all confused. Any kind of help will be appreciated

Thanks in advance
