On Wed, Aug 15, 2012 at 11:13 AM, Du, Jianguang <jiandu@iupui.edu> wrote:

Dear All,

I am analyzing downloaded RNA-seq datasets, but I am not sure what value to use for the Mean Inner Distance between Mate Pairs for these paired-end datasets.

Take one paired-end RNA-seq dataset as an example. Its description in the NCBI SRA database reads: "Layout: PAIRED, Orientation: 5'-3'-3'-5', Nominal length: 400, Nominal Std Dev: 20"

At first I thought the Mean Inner Distance between Mate Pairs should be about 325 bp, because the nominal length is 400 and the reads at both ends are 36 bp each. However, when I aligned the paired reads to transcripts and to the genome using BLASTn, the distance between the paired reads was about 200 bp. How should I decide on the Mean Inner Distance between Mate Pairs in my case?


The information from SRA is likely only an approximation. I do not think SRA validates these details.

You can probably use the distribution observed in your own data as the best estimate.
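
As a rough illustration of estimating this from the data, here is a minimal Python sketch. It assumes the read pairs have already been aligned and written in SAM format, that the aligner reports the template length (TLEN, column 9) as the span between the outer ends of the two reads, and that both reads have the same length (36 in this example). The function names inner_distances and summarize are hypothetical, chosen only for this sketch. For RNA-seq, pairs aligned to the genome can span introns and inflate the estimate, so alignments to transcripts are probably the safer input.

```python
import statistics


def inner_distances(sam_lines, read_length):
    """Yield one inner distance per properly paired fragment.

    Assumes TLEN (SAM column 9) spans the outer ends of both reads,
    so inner distance = TLEN - 2 * read_length.
    """
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        tlen = int(fields[8])
        # FLAG bit 0x2 marks a properly paired read. Keeping only
        # positive TLEN counts each pair once (the mate has -TLEN).
        if flag & 0x2 and tlen > 0:
            yield tlen - 2 * read_length


def summarize(distances):
    """Return (mean, sample standard deviation) of the inner distances."""
    values = list(distances)
    if not values:
        raise ValueError("no properly paired alignments found")
    mean = statistics.mean(values)
    sd = statistics.stdev(values) if len(values) > 1 else 0.0
    return mean, sd
```

To use it, pass an open SAM file, for example summarize(inner_distances(open("aligned.sam"), 36)), where "aligned.sam" is a placeholder file name. If the observed distribution has a long tail, the median may be a more robust summary than the mean.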

Sean 

Thanks.

Jianguang Du