Hi all,
The FASTQ Groomer offers Solexa or Illumina 1.3+ as input quality formats. I asked the sequencing facility about their machine and output, and they said their format is Illumina 1.8+ (the newest). I tried to convert my FASTQ file to Sanger with the FASTQ Groomer, using Illumina 1.3+ as the input option, and got all reads with a quality of around 10... Does this mean that Galaxy cannot be used on a dataset with 1.8+ encoding, or did something else go wrong?
Thanks,
Slon
Illumina 1.8+ is already using the Sanger FASTQ encoding, so you don't need to convert it with the groomer.
I think the Galaxy team might still recommend it as it doubles as a sanity test for corrupt FASTQ files.
Peter
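A likely explanation for the "quality of around 10" result is the difference in ASCII offset: Sanger and Illumina 1.8+ encode PHRED scores with offset 33, while Illumina 1.3+ uses offset 64. The sketch below is only an illustration of that arithmetic, not the Groomer's actual code; the function name and the sample quality string are made up for the example.

```python
def decode_quality(qual, offset):
    """Convert a FASTQ quality string to integer scores using an ASCII offset."""
    return [ord(ch) - offset for ch in qual]


# Hypothetical quality string with high scores, as a 1.8+ run might produce
qual = "JJIHGF"

as_sanger = decode_quality(qual, 33)      # Sanger / Illumina 1.8+ (offset 33)
as_illumina13 = decode_quality(qual, 64)  # Illumina 1.3+ (offset 64)

print(as_sanger)      # [41, 41, 40, 39, 38, 37]
print(as_illumina13)  # [10, 10, 9, 8, 7, 6]
```

Read with the wrong offset, the best scores in the file drop to about 10, which matches the symptom described above.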
If Illumina 1.8+ already uses the Sanger FASTQ encoding, the file should be recognized by downstream tools such as the quality statistics computer, the quality filter, etc. However, my file is not visible to those tools, and when I click on it, only "uploaded fastq file" is displayed, without encoding details.
S.
Have you told Galaxy it is fastqsanger? My guess is the upload tool has defaulted to the generic fastq datatype. Use the "pencil" icon to edit the attributes of the uploaded FASTQ file in your Galaxy history.
Peter
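Before setting the datatype to fastqsanger by hand, it can help to confirm the encoding from the quality lines themselves. The following is a rough heuristic sketch, not something Galaxy does for you: the function name and thresholds are assumptions chosen for illustration. Characters below ';' (ASCII 59) cannot occur in the offset-64 encodings, and characters above 'J' (ASCII 74) are unusual for raw offset-33 reads.

```python
def guess_offset(quality_lines):
    """Guess the ASCII offset (33 or 64) from characters seen in quality lines.

    Heuristic only. Returns None when the evidence is ambiguous or absent.
    """
    codes = [ord(ch) for line in quality_lines for ch in line.strip()]
    if not codes:
        return None
    if min(codes) < 59:
        # Too low for Solexa or Illumina 1.3+, so offset 33 (Sanger / 1.8+)
        return 33
    if max(codes) > 74:
        # Higher than raw Sanger-encoded reads normally reach, so probably offset 64
        return 64
    return None


print(guess_offset(["#1=DDFFFHHHHHJJJ"]))  # 33
print(guess_offset(["hhhhggggffff"]))      # 64
```

A result of None means the sampled lines fit both encodings, so more reads are needed before deciding.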
Actually, Illumina 1.8+ has one more quality value at the top of its range than fastqsanger (see http://en.wikipedia.org/wiki/FASTQ_format ).
So my question now is: if I use fastqsanger, would anything break when it encounters a 'J' in the quality values?
The Sanger FASTQ format has always allowed J (PHRED 41); the issue is that some tools might treat it as an error because it is unusually high for a raw read. For instance, you need at least FASTX v0.0.13 to cope with this; older versions didn't like it. http://seqanswers.com/forums/showthread.php?p=49667
Peter
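To see whether a file would trip a tool that rejects scores above 40, one option is to scan the quality lines for the highest PHRED value. This is a minimal sketch under stated assumptions: the function names are hypothetical, the limit of 40 stands in for whatever cap a strict tool applies, and offset 33 assumes Sanger / Illumina 1.8+ encoding.

```python
def max_phred(quality_lines, offset=33):
    """Return the highest PHRED score found in the given quality lines."""
    return max(ord(ch) - offset for line in quality_lines for ch in line.strip())


def reads_above(quality_lines, limit=40, offset=33):
    """Count quality lines containing at least one score above the limit."""
    return sum(
        1
        for line in quality_lines
        if any(ord(ch) - offset > limit for ch in line.strip())
    )


# Hypothetical quality lines; only the second contains a 'J' (PHRED 41)
quals = ["IIIIHHGG", "IIJJIIHH", "FFFFEEDD"]

print(max_phred(quals))    # 41
print(reads_above(quals))  # 1
```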
Hello Slon,
In case you are still having issues, the best approach for Illumina 1.8+ data is to run the FASTQ Groomer tool with the option "Sanger". As Peter noted, this assigns the expected datatype and verifies the content before you invest time in downstream analysis.
Please let us know if more help is needed,
Best,
Jen
Galaxy team
The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using "reply all" in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list:
http://lists.bx.psu.edu/listinfo/galaxy-dev
To manage your subscriptions to this and other Galaxy lists, please use the interface at:
Hi, I am getting a FASTQ Groomer error on some Illumina data, with the message below. Any ideas?
There was an error reading your input file. Your input file is likely malformed. It is suggested that you double-check your original input file for errors -- helpful information for this purpose has been provided below. However, if you think that you have encountered an actual error with this tool, please do tell us by using the bug reporting mechanism. The reported error is: 'Invalid fastq header: lab/solexa_public/Zon/111021_WICMT-SOLEXA_64KF7AAXX/QualityScore/s_3_1_sequence.txt
rich
Howdy, Rich,
My interpretation of the error report is that the FASTQ file you are trying to groom contains the indicated text (lab/solexa_public/Zon/111021_WICMT-SOLEXA_64KF7AAXX/QualityScore/s_3_1_sequence.txt) on a line where the tool expects a valid FASTQ header. I believe a valid header line begins with an at sign ("@"). So perhaps somewhere along the way, your FASTQ file's contents were replaced by a filename.
Bob H
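A quick way to test this interpretation is to check the record structure before uploading. The sketch below is not the Groomer's own validation; it assumes simple four-line records (no wrapped sequence or quality lines), and the function name is made up for the example.

```python
def first_fastq_problem(lines):
    """Return a description of the first malformed record, or None if none is found.

    Assumes four-line records: '@' header, sequence, '+' separator, quality.
    """
    lines = [ln.rstrip("\n") for ln in lines]
    for i in range(0, len(lines), 4):
        record = lines[i:i + 4]
        if not record[0].startswith("@"):
            return f"line {i + 1}: invalid header: {record[0]}"
        if len(record) < 4:
            return f"line {i + 1}: truncated record"
        _, seq, plus, qual = record
        if not plus.startswith("+"):
            return f"line {i + 3}: expected '+' separator"
        if len(seq) != len(qual):
            return f"line {i + 4}: sequence and quality lengths differ"
    return None


good = ["@read1", "ACGT", "+", "IIII"]
bad = ["lab/solexa_public/Zon/111021_WICMT-SOLEXA_64KF7AAXX/QualityScore/s_3_1_sequence.txt"]

print(first_fastq_problem(good))  # None
print(first_fastq_problem(bad))
```

For the second input the check stops at line 1 and reports an invalid header, the same kind of failure as in the error report above.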
galaxy-user@lists.galaxyproject.org