Hi all,

After mapping, I used IGV to look at the alignments. There are a lot of mapped reads without their paired mates. Should I keep these reads, or is this a problem for Cufflinks analysis?

What I tried:
1. BAM to SAM
2. Filter SAM: set "Read mapped in a proper pair" to Yes.

As a result, only 1/5 of the reads were left.

Can anybody tell me whether this operation is appropriate? How do you normally optimize the mapping results from TopHat? Which considerations should I take into account?

Looking forward to your reply.

Jiwen
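For reference, the "Read mapped in a proper pair" filter corresponds to bit 0x2 of the SAM FLAG field (column 2). Before discarding reads, it can help to see why they fail that test: for example, whether the mate is unmapped (bit 0x8) or both mates mapped but not as a proper pair. The following is a minimal sketch in plain Python, assuming a tab-separated SAM text input; the function names `classify_flag` and `tally_pairing` are hypothetical and not part of Galaxy, TopHat, or Cufflinks. Skipping secondary alignments (bit 0x100) is an assumption made here so that multi-mapped reads are not counted more than once.

```python
from collections import Counter

# SAM FLAG bits, as defined in the SAM specification
PAIRED = 0x1          # read is part of a pair
PROPER_PAIR = 0x2     # aligner marked the pair as properly aligned
UNMAPPED = 0x4        # this read is unmapped
MATE_UNMAPPED = 0x8   # the mate is unmapped
SECONDARY = 0x100     # secondary alignment


def classify_flag(flag):
    """Assign one pairing category to a SAM FLAG value."""
    if not flag & PAIRED:
        return "unpaired"
    if flag & UNMAPPED:
        return "unmapped"
    if flag & MATE_UNMAPPED:
        return "mate_unmapped"
    if flag & PROPER_PAIR:
        return "proper_pair"
    return "both_mapped_not_proper"


def tally_pairing(sam_lines):
    """Count alignment records per pairing category.

    Header lines (starting with '@') and blank lines are skipped.
    Secondary alignments are skipped (an assumption of this sketch).
    """
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & SECONDARY:
            continue
        counts[classify_flag(flag)] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Example: samtools view -h accepted_hits.bam | python tally_pairing.py
    totals = tally_pairing(sys.stdin)
    n = sum(totals.values())
    for category, count in sorted(totals.items()):
        print(f"{category}\t{count}\t{count / n:.1%}")
```

If most of the non-proper reads fall under `mate_unmapped`, that points toward read quality or mapping sensitivity; if most are `both_mapped_not_proper`, the expected mate distance given to the mapper may be worth checking. `samtools flagstat` reports a similar summary directly from a BAM file.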
Hi Jiwen,

The short answer is that if your reads are not pairing, there may be a quality problem with the data, or there may be a problem with the TopHat mapping run. The best advice is to take a sample of your data, experiment with alternate TopHat parameters (using the protocol suggestions in the paper below or the manual at http://tophat.cbcb.umd.edu/), and see what works.

If your overall goal is simply transcript/gene discovery or assembly, then filtering is probably OK. But if you are going to do any statistical expression analysis, then targeted filtering of the data (i.e., anything beyond general quality filtering) should be done with caution, if at all, as you risk skewing the results.

You may have seen this already, but the Cufflinks tool authors have published a paper that covers best-practice RNA-seq protocols:
http://www.nature.com/nprot/journal/v7/n3/full/nprot.2012.016.html
(also linked from http://cufflinks.cbcb.umd.edu/, 2nd item down)

Apologies for the delayed reply. There were a few questions from you around this same time, and it wasn't clear whether everything had been addressed. I also don't think the paper link was sent out in an earlier reply, and it will likely be the most helpful resource.

Best,

Jen
Galaxy

In reply to the message from 杨继文 of 4/20/12, 4:51 AM.
-- Jennifer Jackson http://galaxyproject.org