Hi Suzan,
Running the groomer can take a while, and it is unnecessary if your
sequences are already in FASTQ Sanger format (the quality score encoding
that TopHat needs). I have noticed that when you upload FASTQ files to
Galaxy, it always labels them as plain "fastq", even when they are
actually FASTQ Sanger, and that is why you would otherwise have to run
the groomer. If you are sure your sequences are already FASTQ Sanger,
you can simply upload them with the file format set to "fastqsanger"
instead of auto-detect, or, if they are already uploaded, edit their
attributes to change the datatype to fastqsanger. If you aren't sure
which type of FASTQ quality score you have, I can send you a script to
detect it. If your sequences came out of the Illumina Casava pipeline
(version 1.8 or later), they should be FASTQ Sanger.
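
A quality-encoding check of the kind mentioned above can be sketched
roughly as follows. This is not the script referred to in this message,
only a minimal illustration. It assumes plain 4-line FASTQ records (no
wrapped sequence lines), and the function names and the max_reads
default are made up for the example. The thresholds come from the ASCII
ranges of the encodings: Phred+33 (Sanger, Illumina 1.8+) starts at
code 33, Solexa+64 at 59, and Phred+64 (Illumina 1.3 to 1.7) at 64.

```python
import gzip
from itertools import islice


def classify_range(lo, hi):
    """Guess a Galaxy FASTQ datatype from the min/max quality character codes."""
    if lo > hi:
        return "unknown"          # no quality characters were seen
    if lo < 59:
        return "fastqsanger"      # codes below 59 cannot occur in a +64 encoding
    if hi > 74:
        # Codes above 74 ('J') are not expected in Phred+33 Illumina output.
        # A minimum of 64 or more may still be Solexa data without low scores.
        return "fastqsolexa" if lo < 64 else "fastqillumina"
    return "ambiguous"            # 59..74 fits both encodings; sample more reads


def guess_fastq_encoding(path, max_reads=10000):
    """Scan the quality lines of the first max_reads records of a FASTQ file."""
    opener = gzip.open if path.endswith(".gz") else open
    lo, hi = 255, 0
    with opener(path, "rt") as handle:
        # In 4-line FASTQ the quality string is line 4 of every record.
        for line in islice(handle, 3, max_reads * 4, 4):
            codes = [ord(c) for c in line.rstrip("\r\n")]
            if codes:
                lo = min(lo, min(codes))
                hi = max(hi, max(codes))
    return classify_range(lo, hi)
```

If this returns "fastqsanger", setting the datatype by hand should be
enough. Any other answer suggests running the groomer is the safer
route.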

For the QC, well, it's up to you. You can either concatenate your files
and QC them all at once, or do it one by one. You can test which is
faster on your computer.

To run TopHat with paired-end data, if I remember correctly, you have to
use two files: one with your forward reads and one with your reverse
reads. Both need to be uploaded to Galaxy at that point, and no, do not
combine forward with reverse. However, you can concatenate multiple
forward files into one and use it with the matching reverse files
concatenated into one, as long as both are joined in the same order.
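
If you prefer to join the files on your own computer before uploading,
the idea can be sketched like this. The file names in the usage comment
are hypothetical, and the sketch assumes uncompressed 4-line FASTQ
files. The important part is that the forward and reverse lists are in
the same order, so that read N in the forward file still pairs with
read N in the reverse file.

```python
import shutil


def concatenate(parts, out_path):
    """Append the files in `parts`, in the given order, into `out_path`."""
    with open(out_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)


def count_reads(path):
    """Count records in a 4-line FASTQ file (assumes no wrapped lines)."""
    with open(path, "rb") as handle:
        return sum(1 for _ in handle) // 4


# Hypothetical usage, with both lists in the same lane order:
# concatenate(["s1_L1_R1.fastq", "s1_L2_R1.fastq"], "s1_R1.fastq")
# concatenate(["s1_L1_R2.fastq", "s1_L2_R2.fastq"], "s1_R2.fastq")
# assert count_reads("s1_R1.fastq") == count_reads("s1_R2.fastq")
```

Equal read counts do not prove the mates are in sync, but a mismatch is
a quick sign that something went wrong.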

Hope this helps a bit.
Sincerely,
David
Date: Wed, 8 Aug 2012 11:05:17 -0400
From: suz.katie@gmail.com
To: galaxy-dev@lists.bx.psu.edu; jeremy.goecks@emory.edu
Subject: [galaxy-dev] TopHat paired end
Hello,
I am trying to do RNA-seq analysis on paired-end Illumina data. I have about 4 files for each sample (2 forward and 2 reverse).
I want to analyze the data in Galaxy.
Do I have to groom and run the QC for each file? Should I join the paired files and run both tools on each pair, or should I combine all of the data for each sample (which I don't know how to do) and then groom and run the QC for all of the reads for the sample?
Ultimately, how do I run TopHat for paired-end data? When I select paired-end data in Galaxy, it gives an option to upload both files. Should I upload both forward files once and both reverse files once, or should I combine the files before running TopHat?
I am all confused. Any kind of help will be appreciated.
Thanks in advance