Removing low quality reads
Galaxy Users,

I would like to filter a .bam file to remove reads with low mapping quality, especially ambiguously mapped reads (MAPQ = 0). I can easily do this using the command line version of samtools as shown below.

samtools view -bq 20 hba1.bam > hba1_MAPQ20.bam

None of the options available under "NGS:SAM Tools" (e.g., Generate pileup and Filter SAM) provide an option for removing reads with low mapping quality. The history shown in

http://main.g2.bx.psu.edu/u/onsongo/h/obtaininghighqualityreads

shows the results I would like to obtain.

Data 2 shows the results of Picard tools SAM/BAM Alignment Summary Metrics <http://main.g2.bx.psu.edu/tool_runner?tool_id=PicardASMetrics> on hba1.bam, which contains reads with MAPQ values less than 20. As shown in this summary html, PF_READS_ALIGNED = 775 and PF_HQ_ALIGNED_READS = 241.

Data 4 shows the results of Picard tools SAM/BAM Alignment Summary Metrics <http://main.g2.bx.psu.edu/tool_runner?tool_id=PicardASMetrics> on hba1_MAPQ20.bam, which contains only reads with MAPQ greater than or equal to 20. As shown in this summary html, PF_READS_ALIGNED = 241 and PF_HQ_ALIGNED_READS = 241.

Is there a way in Galaxy to filter a bam file to remove low quality mapped reads similar to using the samtools command line alternative shown above?

Thanks,
Getiria

--
Getiria Onsongo, Ph.D.
Bioinformatics Research Scientist
Masonic Cancer Center, University of Minnesota
Minneapolis, MN 55455
Phone: 612-625-0101
Hi Getiria,

Galaxy does not have a tool to do this directly (but it would be a nice addition). Converting to SAM and using a combination of tools in Text Manipulation would likely be possible. It would involve several steps and some experimentation, but a workflow could be created from that result to do the filter all at once in the future.

Sorry that we could not help more,

Best,
Jen
Galaxy team

On 10/24/11 9:15 AM, Getiria Onsongo wrote:
Is there a way in Galaxy to filter a bam file to remove low quality mapped reads similar to using the samtools command line alternative shown above?
-- Jennifer Jackson http://usegalaxy.org http://galaxyproject.org/wiki/Support
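For anyone trying the text-based route suggested in the reply, the filter comes down to one test on the MAPQ field, which is column 5 of each SAM alignment record. The following is only a minimal sketch in Python of that test, not a Galaxy tool. The function name `filter_sam_by_mapq` and the sample records are made up for illustration. The threshold of 20 mirrors the `-q 20` in the samtools command from the original question, which keeps reads with MAPQ greater than or equal to the given value.

```python
def filter_sam_by_mapq(lines, min_mapq=20):
    """Yield SAM header lines plus alignment records with MAPQ >= min_mapq.

    Hypothetical helper for illustration; it works on plain SAM text,
    so a BAM file would first have to be converted to SAM.
    """
    for line in lines:
        # Header lines start with '@' and are passed through unchanged.
        if line.startswith("@"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        # A SAM alignment record has at least 11 mandatory fields;
        # anything shorter is skipped as malformed.
        if len(fields) < 11:
            continue
        # Column 5 (index 4) is MAPQ.
        if int(fields[4]) >= min_mapq:
            yield line


# Made-up records, only to show the behaviour of the filter.
SAMPLE = [
    "@HD\tVN:1.0\tSO:unsorted\n",
    "@SQ\tSN:chr16\tLN:1000\n",
    "read1\t0\tchr16\t100\t37\t4M\t*\t0\t0\tACGT\tIIII\n",
    "read2\t0\tchr16\t200\t0\t4M\t*\t0\t0\tACGT\tIIII\n",
    "read3\t16\tchr16\t300\t19\t4M\t*\t0\t0\tACGT\tIIII\n",
    "read4\t0\tchr16\t400\t20\t4M\t*\t0\t0\tACGT\tIIII\n",
]

kept = list(filter_sam_by_mapq(SAMPLE))
kept_names = [l.split("\t")[0] for l in kept if not l.startswith("@")]
print(kept_names)
```

In this toy input, the reads with MAPQ 0 and 19 are dropped and the reads with MAPQ 37 and 20 are kept, along with both header lines. Inside Galaxy, the same condition would have to be expressed with whatever column-filtering tool is available, taking care that header lines are not lost along the way.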