Begin forwarded message:
From: Dan Jones <djones@psu.edu>
Date: July 13, 2010 11:02:50 AM EDT
To: Anton Nekrutenko <anton@bx.psu.edu>
Subject: galaxy tool suggestion
Hi Anton,
This is Dan (from your bioinformatics class a couple of years ago). I have been playing around on Galaxy with a couple of new 454 metagenomics datasets. I have been going back and forth between the tools 'Build base quality distribution' and 'Filter FASTQ' to assess the quality of my data and to see how it is affected by filtering sequences by length and quality (using FASTQ lets me operate on the sequences and quality scores simultaneously). I am mainly trying to understand a systematic decrease in quality that occurs after about 50% of the sequence length. But in order to go back and build a base quality distribution boxplot, I need to extract the quality scores from the FASTQ file, and I currently can't find a way to do this on Galaxy (unless I am missing something obvious, which is very possible! I see an option to convert FASTQ to FASTA, but I don't get the .qual file with it). I wrote a short Python script to do this (attached), and I think that something like it, to extract a .qual file from FASTQ, would be a nice addition to the Galaxy toolbox.
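The attached script is not reproduced in this message. As a rough illustration of the kind of conversion described above, here is a minimal sketch and not the attached script itself. It assumes four-line FASTQ records (no wrapped sequence or quality lines) and Sanger-style Phred+33 quality encoding; the function name `fastq_to_qual` and its `offset` parameter are made up for this example.

```python
import io


def fastq_to_qual(fastq_handle, qual_handle, offset=33):
    """Write a .qual-style file from FASTQ input and return the record count.

    Assumes each record is exactly four lines (header, sequence, '+', quality)
    and that quality characters are encoded as Phred score + offset.
    """
    count = 0
    while True:
        header = fastq_handle.readline()
        if not header:
            break  # end of input
        header = header.rstrip("\r\n")
        if not header:
            continue  # skip blank lines between records
        fastq_handle.readline()  # sequence line, not needed for the .qual output
        plus = fastq_handle.readline()
        qual = fastq_handle.readline().rstrip("\r\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError("input does not look like four-line FASTQ")
        # Convert each quality character to its integer Phred score.
        scores = [ord(ch) - offset for ch in qual]
        qual_handle.write(">" + header[1:] + "\n")
        qual_handle.write(" ".join(str(s) for s in scores) + "\n")
        count += 1
    return count


if __name__ == "__main__":
    # Tiny in-memory example; real use would pass open file handles instead.
    example = io.StringIO("@read1\nACGT\n+\nII5!\n")
    result = io.StringIO()
    fastq_to_qual(example, result)
    print(result.getvalue(), end="")
```

If the FASTQ file uses a different encoding (for example Phred+64), the `offset` argument would need to change accordingly.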
Hope all is well!
Dan
---
Daniel Jones
PhD Candidate, Penn State University
Department of Geosciences
242 Deike Building
University Park, PA 16802
cell: 651-245-2775
lab: 814-865-9340
Anton Nekrutenko
http://nekrut.bx.psu.edu
http://usegalaxy.org