This subject has me very confused too. This thread at SEQanswers didn't settle it for me: http://seqanswers.com/forums/showthread.php?t=8730
It does have some good comments on the subject, though.
I tried the two options I can think of: fragment length - paired-end read length (both mates combined) - adaptor length
And: fragment length - paired-end read length (both mates combined)
With the latter I get around a 10% increase in properly paired reads. I wonder whether TopHat internally takes the adapters into account.
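For what it's worth, here is a minimal Python sketch of the arithmetic behind the two options. The function name `inner_distance` and the 120 bp total adapter length are my own assumptions for illustration only. The 300 bp / 50 bp figures come from the Galaxy help text quoted below. It does not say which option TopHat actually expects.

```python
def inner_distance(fragment_len, read_len, adapter_total=0):
    """Return fragment length minus both mates minus total adapter length.

    adapter_total is the combined length of the adapters on both ends;
    leave it at 0 if the fragment size already excludes adapters.
    """
    return fragment_len - 2 * read_len - adapter_total


# Option without adapters, using the Galaxy help example (300 bp, 2 x 50 bp).
print(inner_distance(300, 50))  # 200

# Option with adapters; 120 bp is a hypothetical total, not a real kit value.
print(inner_distance(300, 50, adapter_total=120))  # 80
```

Whichever result matches your library would be the value passed to -r.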
Still, it would be nice to get a definitive answer on this subject.
2012/3/6 杨继文 firstname.lastname@example.org:
When mapping paired-end RNA-seq reads with TopHat, we need to enter the "Mean Inner Distance between Mate Pairs". In Galaxy, the tool help gives the following information:
This is the expected (mean) inner distance between mate pairs. For example, for paired end runs with fragments selected at 300bp, where each end is 50bp, you should set -r to be 200. There is no default, and this parameter is required for paired end runs.
I think the fragment size (here 300bp) includes not only the length of the paired-end reads but also the length of the adaptors. So maybe the Mean Inner Distance between Mate Pairs should be: fragment length - paired-end read length - adaptor length. Am I right, or did I miss something?
Is it necessary to enter an accurate value?
Looking forward to your reply.