Begin forwarded message:

From: Dan Jones <djones@psu.edu>
Date: July 13, 2010 11:02:50 AM EDT
To: Anton Nekrutenko <anton@bx.psu.edu>
Subject: galaxy tool suggestion


Hi Anton,

This is Dan (from your bioinformatics class a couple of years ago). I have been playing around on Galaxy with a couple of new 454 metagenomics datasets. I have been going back and forth between the tools 'Build base quality distribution' and 'filter FASTQ' to assess the quality of my data and determine how it is affected by filtering sequences by length and quality (using FASTQ lets me operate on the sequences and quality scores simultaneously). I am mainly trying to understand a systematic decrease in quality that occurs after about 50% of the sequence length. But in order to go back and build a base quality distribution boxplot, I need to extract the quality scores from the FASTQ file, and I currently can't find a way to do this on Galaxy (unless I am missing something obvious, which is very possible! I see an option to convert FASTQ to FASTA, but I don't get the .qual file with it). I wrote a short Python script to do this (attached), and I think that something like it, to extract a .qual file from FASTQ, would be a nice addition to the Galaxy toolbox.
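
For illustration, the extraction described above could look roughly like the sketch below. This is not the attached script. It assumes four-line FASTQ records (no wrapped sequence or quality lines) and Sanger-style Phred+33 encoding, and the function name `fastq_to_qual` and its `offset` parameter are hypothetical.

```python
import io


def fastq_to_qual(fastq_handle, qual_handle, offset=33):
    """Write one .qual record per 4-line FASTQ record; return the record count.

    Assumptions (not from the original message): records are exactly four
    lines long and quality characters use the given ASCII offset (33 = Sanger).
    """
    count = 0
    while True:
        header = fastq_handle.readline()
        if not header:
            break  # end of file
        header = header.rstrip("\r\n")
        if not header:
            continue  # skip blank lines between records
        if not header.startswith("@"):
            raise ValueError(f"expected '@' header, got: {header!r}")
        seq = fastq_handle.readline().rstrip("\r\n")
        plus = fastq_handle.readline().rstrip("\r\n")
        qual = fastq_handle.readline().rstrip("\r\n")
        if not plus.startswith("+") or len(seq) != len(qual):
            raise ValueError(f"malformed record: {header!r}")
        # Decode each quality character to its integer Phred score.
        scores = [ord(ch) - offset for ch in qual]
        # .qual layout: FASTA-style header, then space-separated scores.
        qual_handle.write(">" + header[1:] + "\n")
        qual_handle.write(" ".join(str(s) for s in scores) + "\n")
        count += 1
    return count


if __name__ == "__main__":
    example = "@read1\nACGT\n+\nII5!\n"
    out = io.StringIO()
    fastq_to_qual(io.StringIO(example), out)
    print(out.getvalue(), end="")
```

The resulting .qual text has the same header-plus-scores layout as the quality files that accompany 454 FASTA output, so it could in principle be fed to a tool that expects that format. Files with a different quality encoding would need a different `offset`.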

Hope all is well!

Dan


---
Daniel Jones
PhD Candidate, Penn State University
Department of Geosciences
242 Deike Building
University Park, PA 16802
cell: 651-245-2775
lab: 814-865-9340