How much can I trim my reads?
Dear All, I am analysing RNA-seq datasets for the differential splicing events between cell types. My reads are 36 bp long. In order to increase the quality of reads, I need to trim some nucleotides from ends. How many nucleotides can I trim? I am afraid that if I trim too much, the reliability of the alignment will be affected. Thanks in advance. Jianguang
Hello Jianguang,

This general protocol is also in the RNA-seq tutorial:
http://main.g2.bx.psu.edu/u/jeremy/p/galaxy-rna-seq-analysis-exercise
--> Understanding and QCing the reads

That said, I had a sample of your data from before, so I ran FastQC on it and I see what you mean: the quality drops off steadily after the first 10 bases or so, then falls below a Phred score of 20 around the middle of the sequence (for both ends).

There are a few options:

1 - Do as Ann suggests: leave the reads alone and test to see what happens in TopHat. If the mapping fails, then you will know that you need to do some quality cleanup.

2 - Use the FastQC results to decide on a lower quality score boundary and trim only the very worst sequences. Because the reads are only 36 bp long, yes, take care not to remove too much. As I stated, from the sample I looked at, even a cutoff of 20 would probably clip too aggressively.

In general it is best to do as little manipulation as possible with expression data. Some testing on your part will be needed to identify the correct processing, and the same process will not apply to all datasets. But the general path outlined in the tutorial is a good one for what you are trying to do and should be able to address your questions.

Take care,
Jen
Galaxy team

On 8/23/12 7:40 AM, Du, Jianguang wrote:
Dear All,
I am analysing RNA-seq datasets for the differential splicing events between cell types. My reads are 36 bp long. In order to increase the quality of reads, I need to trim some nucleotides from ends. How many nucleotides can I trim? I am afraid that if I trim too much, the reliability of the alignment will be affected.
Thanks in advance.
Jianguang
-- Jennifer Jackson http://galaxyproject.org
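
Option 2 in the reply above (pick a lower quality boundary from the FastQC report and trim only the worst bases, without shortening 36 bp reads too far) could be sketched roughly as below. This is only an illustration, not the Galaxy tool or the tutorial's method: the function names, the default cutoff of 10, the minimum length of 25 and the Phred+33 encoding are all assumptions that would need to be checked against the actual data (older Illumina files may use an offset of 64).

```python
def phred_scores(quality, offset=33):
    """Convert a FASTQ quality string to integer Phred scores.

    offset=33 is an assumption; pass offset=64 for Phred+64 files.
    """
    return [ord(ch) - offset for ch in quality]


def trim_3prime(seq, quality, min_quality=10, offset=33):
    """Remove bases from the 3' end while their score is below min_quality.

    min_quality=10 is a hypothetical, deliberately gentle default.
    """
    scores = phred_scores(quality, offset)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_quality:
        end -= 1
    return seq[:end], quality[:end]


def trim_or_discard(seq, quality, min_quality=10, min_length=25, offset=33):
    """Trim the 3' end, then drop the read if it became too short to map.

    min_length=25 is a hypothetical guard for short (36 bp) reads.
    Returns (seq, quality) or None when the read should be discarded.
    """
    trimmed_seq, trimmed_qual = trim_3prime(seq, quality, min_quality, offset)
    if len(trimmed_seq) < min_length:
        return None
    return trimmed_seq, trimmed_qual


if __name__ == "__main__":
    # 'I' encodes Phred 40 and '#' encodes Phred 2 under Phred+33.
    read = "ACGTACGTAC"
    qual = "IIIIIIII##"
    print(trim_3prime(read, qual))
    print(trim_or_discard(read, qual, min_length=9))
```

Running the same reads through the mapper with and without trimming, and with a couple of different cutoffs, is one way to do the testing suggested in the reply.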