FASTX-Toolkit related problem in Galaxy?
Hello
I am using the Galaxy instance on http://main.g2.bx.psu.edu/root
For an input fastq file starting with
@IL14_1008:2:1:800:71/1
AAAAAAAAAAAAAAAAAAAAAAAAACAAAAAAAAAAA
+
>>>>>>>>>>>>>><>>>>->><<>><>><<
I get the following error when running FASTQ to FASTA on this input:
'An error occurred running this job: fastq_to_fasta: Invalid quality score value (char '-' ord 45 quality value -19) on line 4'
When I try to use 'Clip adapter sequences' with this input file, I get the following error against the 'Library to clip' widget:
'History does not include a dataset of the required format / build'
When I run the command-line version of the fastx-toolkit tools on a 64-bit Linux machine, I have similar problems:
mg8@sf-2-1-01:~/gal_data$ fastq_to_fasta -v -n -i 10000.fastq -o my.fa
fastq_to_fasta: Invalid quality score value (char '-' ord 45 quality value -19) on line 4
and
mg8@sf-2-1-01:~/gal_data$ fastx_clipper -i 10000.fastq -o clipped.fastq -a ACACTCTTTCCCTACACGACGCTCTTCCGATCT
fastx_clipper: Invalid quality score value (char '-' ord 45 quality value -19) on line 4
Therefore, the problem seems to be with the fastx-toolkit tools. My file is in the Sanger fastq format. Galaxy does allow me to convert it to the Illumina format without problems, but that does not help with the adapter clipping functionality.
Marina
On Mon, Dec 13, 2010 at 12:06 PM, Marina Gourtovaia <mg8@sanger.ac.uk> wrote:
The problem is that '-' is valid in Sanger FASTQ, but not in the Solexa or Illumina FASTQ variants, and it seems FASTX is assuming the latter.
The fastx tools can be told the quality offset to use (for Sanger use -Q 33, for Solexa/Illumina use -Q 64, the default). I've not looked at the Galaxy wrapper, but perhaps it isn't using this (poorly documented) FASTX option.
Peter
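The offset arithmetic behind the error message can be checked with a minimal sketch. It assumes only that a quality character decodes as its ASCII code minus the offset (33 for Sanger, 64 for Solexa/Illumina, as stated above). The function and constant names are hypothetical and are not part of FASTX or Galaxy.

```python
# Minimal sketch: decode FASTQ quality characters under a chosen ASCII offset.
# Names here are illustrative only; they do not come from FASTX or Galaxy.

SANGER_OFFSET = 33    # Sanger FASTQ (what -Q 33 selects, per the reply above)
ILLUMINA_OFFSET = 64  # Solexa/Illumina FASTQ (the FASTX default, per the reply above)


def quality_value(char, offset):
    """Return the quality value encoded by one character: ord(char) - offset."""
    return ord(char) - offset


def decode_quality(line, offset):
    """Decode every character of a quality line into a list of integers."""
    return [quality_value(c, offset) for c in line]


# Quality line copied from the FASTQ record in the original message.
quality_line = ">>>>>>>>>>>>>><>>>>->><<>><>><<"

print(quality_value("-", SANGER_OFFSET))    # 12: a valid Sanger score
print(quality_value("-", ILLUMINA_OFFSET))  # -19: the value in the error message
print(min(decode_quality(quality_line, SANGER_OFFSET)))  # 12
```

With offset 64, '-' (ASCII 45) decodes to -19, which is exactly the value FASTX reports. On the command line, the corresponding change would presumably be to add -Q 33 to the failing calls, for example `fastq_to_fasta -Q 33 -v -n -i 10000.fastq -o my.fa`; that invocation is inferred from the reply above and has not been run here.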
On Mon, Dec 13, 2010 at 12:18 PM, Peter <peter@maubp.freeserve.co.uk> wrote:
Dan's just fixed this; it should work the next time the Penn State Galaxy instance is updated:
https://bitbucket.org/galaxy/galaxy-central/changeset/93d7007bd859
Peter