Hi Jiwen,

The short answer is that if your reads are not pairing, there may be a data quality problem, or there may be a problem with the TopHat mapping run itself. The best advice is to take a sample of your data and experiment with alternate TopHat parameters, such as the expected mean inner distance between mate pairs (-r/--mate-inner-dist), using the protocol suggestions in the paper below or the manual (http://tophat.cbcb.umd.edu/), and see what works.
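
If you work outside of Galaxy, taking a sample could look something like the rough sketch below. It keeps every Nth read pair from two mate files so that the mates stay in sync. The function names and the every-10th choice are only illustrative assumptions (not part of TopHat or Galaxy), and it assumes plain, unwrapped 4-line FASTQ records with both files in the same read order.

```python
from itertools import islice

def fastq_records(lines):
    """Yield FASTQ records as 4-line tuples (assumes unwrapped 4-line records)."""
    it = iter(lines)
    while True:
        rec = tuple(islice(it, 4))
        if not rec:
            return
        if len(rec) != 4:
            raise ValueError("truncated FASTQ record")
        yield rec

def sample_pairs(r1_lines, r2_lines, every=10):
    """Keep every Nth pair from both mate files so the mates stay in sync.

    `every` is an arbitrary illustrative default, not a recommended value.
    """
    pairs = zip(fastq_records(r1_lines), fastq_records(r2_lines))
    for i, (rec1, rec2) in enumerate(pairs):
        if i % every == 0:
            yield rec1, rec2
```

Open file handles can be passed directly as `r1_lines` and `r2_lines`; write `rec1` to one output file and `rec2` to the other, then map the two smaller files with different parameter settings and compare.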

If your overall goal is simply transcript/gene discovery or assembly, then filtering is probably OK. But if you are going to do any statistical expression analysis, then targeted filtering of the data (i.e., anything beyond general quality filtering) should be done with caution, if at all, as you risk skewing the results.
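
Before deciding whether to filter, it can help to measure how much data a proper-pair filter would remove. The sketch below is one possible way to do that on SAM text: it reads the FLAG column (the second field) and tests the standard SAM bits 0x1 (paired), 0x2 (proper pair), and 0x4 (unmapped). The function name is hypothetical, and it counts alignment records, so a read reported at several locations is counted once per record.

```python
def proper_pair_stats(sam_lines):
    """Count mapped, paired, and properly paired alignment records in SAM text."""
    stats = {"mapped": 0, "paired": 0, "proper": 0}
    for line in sam_lines:
        # Skip header lines (they start with '@') and blank lines.
        if line.startswith("@") or not line.strip():
            continue
        flag = int(line.split("\t")[1])  # FLAG is the 2nd SAM column
        if flag & 0x4:  # read is unmapped
            continue
        stats["mapped"] += 1
        if flag & 0x1:  # read is part of a pair
            stats["paired"] += 1
            if flag & 0x2:  # aligner marked the pair as proper
                stats["proper"] += 1
    return stats
```

Comparing `proper` to `mapped` on a sample run for each parameter setting gives a quick indication of whether a change to the mapping options is improving the pairing.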

You may have seen this already, but the Cufflinks tool authors put out a new paper that covers best practice RNA-seq protocols:
http://www.nature.com/nprot/journal/v7/n3/full/nprot.2012.016.html
(also linked from http://cufflinks.cbcb.umd.edu/, 2nd item down)

Apologies for the delayed reply. There were a few questions from you around the same time, and it wasn't clear whether everything had been addressed. I also don't think the paper link was sent out in any reply, and it will likely be the most helpful resource.

Best,

Jen
Galaxy

On 4/20/12 4:51 AM, 杨继文 wrote:
 Hi all,

After mapping, I used IGV to look at the alignments. There are a lot of mapped reads whose mates are not mapped. Should I keep these reads, or is this a problem for the Cufflinks analysis?

What I tried is:
1. BAM to SAM
2. Filter SAM: set "Read mapped in a proper pair" to Yes.
The result is that only 1/5 of the reads were left.

Can anybody tell me if this operation is appropriate?
How do you normally optimize the mapping results from TopHat?
Which considerations should I take into account?
Looking forward to your reply.
Jiwen


___________________________________________________________
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org.  Please keep all replies on the list by
using "reply all" in your mail client.  For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:

  http://lists.bx.psu.edu/listinfo/galaxy-dev

To manage your subscriptions to this and other Galaxy lists,
please use the interface at:

  http://lists.bx.psu.edu/

-- 
Jennifer Jackson
http://galaxyproject.org