I have a DNA seq data set from a mouse genome that is heterozygous for a known 15 bp deletion. The deletion is listed in the SAM tools pileup: mapping quality = 690, SNP quality = 690, mapping quality = 52, coverage = 62, 11 reads span the deletion, 52 reads are reference. When I use SAM tools ‘filter pileup on coverage and SNPs’, this deletion is filtered out. I’m using the default settings for the filter: coverage=3, quality cap=60 (is this the Phred quality score or is it the Phred quality coefficient?). Does anyone know whether this deletion is filtered out simply because it’s a deletion, or is it filtered out due to the mapping quality? Can anyone suggest an alternative tool / setting for filtering the pileup?
Thanks, Laura
Laura Reinholdt, PhD Research Scientist Genetic Resource Sciences The Jackson Laboratory 600 Main Street Bar Harbor, ME 04609-1500 (207) 288-6000, ext. 6693
Hello Laura,
On 10/3/11 11:05 AM, Laura Reinholdt wrote:
> I have a DNA seq data set from a mouse genome that is heterozygous for a known 15 bp deletion. The deletion is listed in the SAM tools pileup: mapping quality = 690, SNP quality = 690, mapping quality = 52, coverage = 62, 11 reads span the deletion, 52 reads are reference. When I use SAM tools ‘filter pileup on coverage and SNPs’, this deletion is filtered out. I’m using the default settings for the filter: coverage=3, quality cap=60 (is this the Phred quality score or is it the Phred quality coefficient?).
The tool's form does not have an upper quality threshold, but rather a minimum quality score value ("Do not consider read bases with quality lower than:", which defaults to 20). The value is a Phred 33 score.
> Does anyone know whether this deletion is filtered out simply because it’s a deletion, or is it filtered out due to the mapping quality? Can anyone suggest an alternative tool / setting for filtering the pileup?
If you had the minimum quality threshold set to 60, then yes, this could certainly have removed sequences from the region. Set this value to the default, or to another value determined by evaluating the range of base qualities in your data. To do that, use the tools under "NGS: QC and manipulation" -> 'Compute quality statistics' and 'Build base quality distribution'.
Hopefully this helps,
Best,
Jen
Galaxy team
___________________________________________________________ The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using "reply all" in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list:
http://lists.bx.psu.edu/listinfo/galaxy-dev
To manage your subscriptions to this and other Galaxy lists, please use the interface at:
-- Jennifer Jackson http://usegalaxy.org http://galaxyproject.org/Support
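As a rough illustration of the advice above about choosing a minimum base quality from the data, the following Python sketch decodes quality strings and reports what fraction of bases would survive a given cutoff. It assumes Phred+33 encoded FASTQ input with exactly four lines per record; the function names and the two-read example are hypothetical and are not part of Galaxy or SAM tools.

```python
from collections import Counter

PHRED_OFFSET = 33  # assumption: qualities are Phred+33 (Sanger) encoded


def decode_phred33(quality_string):
    """Convert an ASCII quality string into a list of integer Phred scores."""
    return [ord(ch) - PHRED_OFFSET for ch in quality_string]


def quality_histogram(fastq_lines):
    """Count Phred scores over every base in 4-line FASTQ records."""
    counts = Counter()
    for index, line in enumerate(fastq_lines):
        if index % 4 == 3:  # fourth line of each record holds the qualities
            counts.update(decode_phred33(line.rstrip("\n")))
    return counts


def fraction_retained(counts, min_quality):
    """Fraction of bases whose score is at least min_quality."""
    total = sum(counts.values())
    kept = sum(n for q, n in counts.items() if q >= min_quality)
    return kept / total if total else 0.0


# Hypothetical two-read example, not the poster's data.
example = [
    "@read1", "ACGT", "+", "II5!",
    "@read2", "GGCA", "+", "IIII",
]
hist = quality_histogram(example)
for cutoff in (20, 60):
    print(cutoff, fraction_retained(hist, cutoff))
```

In this toy input no base scores above 40, so a cutoff of 20 keeps most bases while a cutoff of 60 keeps none. Real data should be checked the same way before settling on a threshold.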
For you and others following this thread, there is one extra tool that can be recommended for quickly evaluating datasets: "NGS: QC and manipulation -> FastQC". In many cases, this can replace the other quality score evaluation tools.
Take care,
Jen
On 10/4/11 5:01 PM, Jennifer Jackson wrote:
> Use tools "NGS: QC and manipulation" -> 'Compute quality statistics' and 'Build base quality distribution'.