Jeremy,

 

This is not strictly correct. Tophat/bowtie don’t report mapping quality values that are as meaningful as those from BWA, but the values tophat reports do carry some information. Tophat yields 5 distinct mapping quality values (you can verify this by doing a “unique” count on the mapping quality field of any SAM file from tophat):

 

255 = unique mapping

3 = maps to 2 locations in the target

2 = maps to 3 locations

1 = maps to 4-9 locations

0 = maps to 10 or more locations.
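For anyone who wants to run that "unique" count, here is a minimal sketch in Python. It assumes a plain-text, tab-delimited SAM file; the function name `count_mapq` and the file name in the usage comment are made up for illustration and are not part of tophat or Galaxy.

```python
from collections import Counter


def count_mapq(sam_lines):
    """Count how often each MAPQ value (5th SAM column) occurs.

    `sam_lines` is any iterable of text lines, e.g. an open file handle.
    Header lines (starting with '@') and blank lines are skipped.
    """
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a well-formed alignment line
        counts[int(fields[4])] += 1
    return counts


# Hypothetical usage (the file name is an assumption):
# with open("accepted_hits.sam") as handle:
#     for mapq, n in sorted(count_mapq(handle).items()):
#         print(mapq, n)
```

If the alignments are in BAM rather than SAM, they would need to be converted to text first (for example with samtools view) before feeding the lines to this function.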

 

Except for the 255 case, the simple rule that was encoded by the authors is the usual phred quality scale:

 

MapQ = -10 log10(P)

 

Where P = probability that this mapping is NOT the correct one. The authors ignore the number of mismatches in this calculation and simply assume that if a read maps to 2 locations then P = 0.5, 3 locations implies P = 2/3, 4 locations => P = 3/4, etc. In general, n locations gives P = (n-1)/n.

 

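It follows that MapQ = -10 log10(0.5) = 3.01 (rounds to 3); -10 log10(2/3) = 1.76 (rounds to 2); -10 log10(3/4) = 1.25 (rounds to 1), etc. At 9 locations, -10 log10(8/9) = 0.51 still rounds to 1, while at 10 locations -10 log10(9/10) = 0.46 rounds to 0, which is why 4-9 locations give 1 and 10 or more give 0.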
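The rule is easy to check numerically. The sketch below only reproduces the arithmetic described in this message; it is not taken from the tophat source, and the function name `tophat_mapq` is hypothetical. It assumes rounding to the nearest integer and treats a unique mapping as the special 255 case.

```python
import math


def tophat_mapq(num_locations):
    """MapQ implied by the rule above for a read mapping to `num_locations` places."""
    if num_locations < 1:
        raise ValueError("a mapped read has at least one location")
    if num_locations == 1:
        return 255  # unique mapping: special case, not computed from the formula
    # Assumed probability that this particular mapping is NOT the correct one.
    p_wrong = (num_locations - 1) / num_locations
    return round(-10 * math.log10(p_wrong))


for n in (1, 2, 3, 4, 9, 10):
    print(n, tophat_mapq(n))
```

This prints 255, 3, 2, 1, 1 and 0 for 1, 2, 3, 4, 9 and 10 locations respectively, matching the table of values above.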

 

-Kevin

 

Date: Tue, 7 Feb 2012 17:56:34 -0500

From: Jeremy Goecks <jeremy.goecks@emory.edu>

To: "Li, Jilong (MU-Student)" <jl482@mail.missouri.edu>

Cc: "galaxy-user@lists.bx.psu.edu" <galaxy-user@lists.bx.psu.edu>

Subject: Re: [galaxy-user] about Mapping Quality

Message-ID: <84AF17B7-3317-43CF-92BA-C60D17A6E037@emory.edu>


 

Tophat/Bowtie does not yield mapping quality, so, as per the SAM spec, that field is set to 255, indicating that quality is unavailable.

 

http://samtools.sourceforge.net/SAM1.pdf

 

Best,

J.

 

On Feb 7, 2012, at 5:46 PM, Li, Jilong (MU-Student) wrote:

 

> Hi all,

>

> I used TopHat to map RNA-Seq reads to genomes. In the output (.sam) file, the value of some mapping quality (the 5th column) is 255. What does it mean? And I found these reads which have mapping quality 255 mapped to unique place.

>

> Thanks!

>

> Victor
