Tophat "Mean Inner Distance between Mate Pairs"
Hi all,

When mapping paired-end RNA-seq reads with TopHat, we need to enter a "Mean Inner Distance between Mate Pairs". In Galaxy, the tool help gives the following information:

"This is the expected (mean) inner distance between mate pairs. For example, for paired end runs with fragments selected at 300bp, where each end is 50bp, you should set -r to be 200. There is no default, and this parameter is required for paired end runs."

I think the fragment size (here 300bp) includes not only the length of the paired-end reads but also the length of the adaptors. So maybe the Mean Inner Distance between Mate Pairs should be: fragment length - paired-end read length - adaptor length. Am I right, or did I miss something?

Does the value have to be accurate?

Looking forward to your reply,
Jiwen
In reply to 杨继文, 2012/3/6:

Hi Jiwen,

This subject has me very confused too. This thread at SEQanswers didn't help much either, though it does have some good comments on the subject:
http://seqanswers.com/forums/showthread.php?t=8730

I tried the two options I can think of:

fragment length - paired-end read length - adaptor length

and:

fragment length - paired-end read length

With the latter I get around a 10% increase in properly paired reads. I wonder whether TopHat internally takes the adapters into account. Still, it would be nice to get a definitive answer on this subject.

Regards,
Carlos
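The two candidate formulas discussed in this thread can be written out as a small calculation. This is only a sketch of the arithmetic, not a statement of what TopHat does internally: the function name `mean_inner_distance` is made up for illustration, "paired-end read length" is taken to mean both mates together (2 x 50bp in the Galaxy help example), and the 120bp adaptor total is a purely hypothetical value.

```python
def mean_inner_distance(fragment_length, read_length, adaptor_length=0):
    """Gap between the two mates, in bp.

    fragment_length: size-selected fragment length
    read_length:     length of ONE mate (both mates assumed equal)
    adaptor_length:  total adaptor bases assumed to be included in
                     fragment_length (0 if the fragment size excludes them)
    """
    return fragment_length - 2 * read_length - adaptor_length


# Example from the Galaxy help text: 300bp fragments, 50bp reads.
print(mean_inner_distance(300, 50))  # 200

# Same library, assuming (hypothetically) 120bp of adaptor in the 300bp.
print(mean_inner_distance(300, 50, adaptor_length=120))  # 80
```

Which of the two values is appropriate depends on whether the reported fragment size already excludes the adaptors, which is exactly the open question here.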
___________________________________________________________ The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using "reply all" in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list:
http://lists.bx.psu.edu/listinfo/galaxy-dev
To manage your subscriptions to this and other Galaxy lists, please use the interface at:
Hi,

This might be an easy question, but I can't seem to find the solution in the standard Galaxy tools. How do I create a valid VCF file from a (filtered) samtools mpileup output?

Thanks in advance!
Geert
In reply to Geert Vandeweyer, 3/14/12 5:31 AM:

Hello Geert,

To produce and work with VCF data, please see the tool groups 'NGS: Variant Detection' and 'NGS: GATK Tools (beta)'. Each tool has notes about inputs, outputs, and sources. We are developing additional support within Galaxy for these tools as they move out of beta status. For now, the best way to learn more about functionality details is to follow the links on the tool forms to the original documentation at the program source.

These tools are new on the public main server since you originally wrote in; most are still considered 'beta', and implementation feedback is welcome.

A set of specific 'VCF Tools' is under development to perform file transformations and other tasks (which is what your original question asks for), but those available on our test server are not yet recommended or supported.

Best,
Jen
Galaxy team
-- Jennifer Jackson http://galaxyproject.org
In reply to 杨继文, 3/6/12 12:23 PM:

Hi Jiwen,

As the SEQanswers thread shows, there is some debate about this. One of the last posts there makes the most sense: "fragment length" is defined as the total genome bases covered by the aligned paired sequences (from the tip of the 5' start, through the gap, to the very 3' tail end), and "mean inner distance" is the gap in the middle, between the paired sequences, where there is no alignment. This could be tested with smaller samples of data and compared to your fragment selection to see how it matches up.

For a definitive answer, though, asking the tool authors is the best bet. The contact information is: tophat.cufflinks@gmail.com

If you are given a reply that explains the calculation, it would be great if the TopHat documentation itself were updated:
http://tophat.cbcb.umd.edu/manual.html

We would also be very glad to hear the details, so that here at Galaxy we could add the information to our RNA-seq help and the searchable mailing list archives.

Great question! If we find out ourselves in the meantime, an update will be posted back here to the mailing list.

Best,
Jen
Galaxy team
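One way to try the "test with a smaller sample" idea is to align a subset of read pairs and estimate the gap from the resulting SAM records. The sketch below is only an illustration under stated assumptions: the helper name `inner_distances` and the sample records are invented, both mates are assumed to have the same length, and only the standard SAM columns FLAG (column 2), TLEN (column 9), and SEQ (column 10) are used. Because TLEN is measured in reference coordinates, pairs that span an intron in a spliced RNA-seq alignment will report an inflated gap, so the result should be read as a rough check rather than a definitive value.

```python
import statistics


def inner_distances(sam_lines):
    """Yield one inner-distance estimate (bp) per properly paired fragment."""
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag, tlen, seq = int(fields[1]), int(fields[8]), fields[9]
        # 0x2 = "properly paired"; TLEN > 0 selects the leftmost mate,
        # so each pair is counted once.
        if not (flag & 0x2) or tlen <= 0 or seq == "*":
            continue
        # Assumption: the mate has the same length as this read.
        yield tlen - 2 * len(seq)


# Invented sample records: two properly paired fragments with 50bp reads.
read, qual = "A" * 50, "I" * 50
sample = [
    "@HD\tVN:1.0\tSO:coordinate",
    f"r1\t99\tchr1\t100\t50\t50M\t=\t350\t300\t{read}\t{qual}",
    f"r1\t147\tchr1\t350\t50\t50M\t=\t100\t-300\t{read}\t{qual}",
    f"r2\t99\tchr1\t500\t50\t50M\t=\t730\t280\t{read}\t{qual}",
    f"r2\t147\tchr1\t730\t50\t50M\t=\t500\t-280\t{read}\t{qual}",
]

gaps = list(inner_distances(sample))
print(gaps)                   # [200, 180]
print(statistics.mean(gaps))  # 190
```

With real data, the median may be a more robust summary than the mean, since a few intron-spanning pairs can dominate the average.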
participants (4)
- Carlos Borroto
- Geert Vandeweyer
- Jennifer Jackson
- 杨继文