On Tue, Oct 18, 2011 at 9:02 AM, arabidopsis <svinekod@gmail.com> wrote:
Hi all,
FASTQ Groomer lists Solexa and Illumina 1.3+ as input quality formats. I asked the sequencing facility about their machine and output, and they said their format was Illumina 1.8+ (the newest). I tried to convert my FASTQ file to Sanger with FASTQ Groomer, using Illumina 1.3+ as the input option, and all reads came out with a quality of around 10. Does this mean that Galaxy cannot be used on a dataset with 1.8+ encoding, or did something else go wrong?
Thanks,
Slon
On Tue, Oct 18, 2011 at 10:12 AM, Peter Cock <p.j.a.cock@googlemail.com> wrote:
Illumina 1.8+ already uses the Sanger FASTQ encoding, so you don't need to convert it with the Groomer.
I think the Galaxy team might still recommend running it, as it doubles as a sanity test for corrupt FASTQ files.
Peter
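A likely explanation for the "quality of around 10" symptom is the offset mismatch: Sanger (and Illumina 1.8+) quality characters are PHRED+33, while the Illumina 1.3+ option assumes PHRED+64, so every score is read 31 too low. The sketch below only illustrates that arithmetic; the quality string and the function name `decode_quality` are made up for the example, and this is not the Groomer's actual code.

```python
def decode_quality(qual, offset):
    """Convert a FASTQ quality string to a list of PHRED scores."""
    return [ord(ch) - offset for ch in qual]

# Hypothetical Illumina 1.8+ quality string (PHRED+33) from high-quality bases.
qual = "IIIIJJJJ"

as_sanger = decode_quality(qual, 33)      # correct reading: offset 33
as_illumina13 = decode_quality(qual, 64)  # misreading: offset 64

print(as_sanger)      # [40, 40, 40, 40, 41, 41, 41, 41]
print(as_illumina13)  # [9, 9, 9, 9, 10, 10, 10, 10]
```

Under this assumption, reads that are really around Q40 would be reported at around Q10, which matches what Slon saw.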
On Tue, Oct 18, 2011 at 9:21 AM, arabidopsis <svinekod@gmail.com> wrote:
If Illumina 1.8+ already uses the Sanger FASTQ encoding, the file should be recognized by downstream tools such as the quality statistics computer, the quality filter, etc. However, my file is not visible to those tools, and when I click on it, only "uploaded fastq file" is displayed, without encoding details.
S.
On Tue, Oct 18, 2011 at 5:10 PM, Peter Cock <p.j.a.cock@googlemail.com> wrote:
Have you told Galaxy it is fastqsanger? My guess is the upload tool has defaulted to the generic fastq datatype. Use the "pencil" icon to edit the attributes of the uploaded FASTQ file in your Galaxy history.
Peter
On Tue, Nov 1, 2011 at 4:58 PM, Kevin Lam <aboulia@gmail.com> wrote:
Actually, Illumina 1.8+ has one more quality value at the top than fastqsanger (see http://en.wikipedia.org/wiki/FASTQ_format ).
My question now, I guess, is: if I use fastqsanger, would anything break when it encounters the 'J' in the quality values?
The Sanger FASTQ format has always allowed J (PHRED 41). The issue is that some tools might treat it as an error, as it is unusually high for a raw read. For instance, you need at least FASTX v0.0.13 to cope with this; older versions didn't like it.
http://seqanswers.com/forums/showthread.php?p=49667
Peter
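To make the point about 'J' concrete, here is a small sketch of a Sanger range check. It assumes the usual Sanger definition (ASCII 33 to 126, i.e. PHRED 0 to 93); the helper name `is_valid_sanger` and the sample strings are invented for illustration and do not come from Galaxy or FASTX.

```python
SANGER_OFFSET = 33
SANGER_MAX_PHRED = 93  # '~' (ASCII 126) is the top of the Sanger range

def is_valid_sanger(qual):
    """True if every character decodes to a PHRED score in 0..93."""
    return all(0 <= ord(ch) - SANGER_OFFSET <= SANGER_MAX_PHRED for ch in qual)

print(ord("J") - SANGER_OFFSET)     # 41
print(is_valid_sanger("FFHHIIJJ"))  # True: 'J' is well inside the range
print(is_valid_sanger("II II"))     # False: a space decodes to -1
```

A tool that rejects 'J' is therefore applying a stricter cap than the format itself, which is the behaviour Peter describes for older FASTX versions.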
On Wed, Nov 2, 2011 at 9:19 AM, Jennifer Jackson <jen@bx.psu.edu> wrote:
Hello Slon,
In case you are still having issues, the best approach for Illumina 1.8+ data is to run the FASTQ Groomer tool with the option "Sanger". As Peter noted, this assigns the expected datatype and verifies the content before you invest time in downstream analysis.
Please let us know if more help is needed,
Best,
Jen
Galaxy team
--
Jennifer Jackson
http://usegalaxy.org
http://galaxyproject.org/wiki/Support
___________________________________________________________
The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using "reply all" in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list:
http://lists.bx.psu.edu/listinfo/galaxy-dev
To manage your subscriptions to this and other Galaxy lists, please use the interface at:
http://lists.bx.psu.edu/
On Nov 2, 2011, at 10:09 AM, Richard Mark White wrote:
Hi,
I am getting a FASTQ Groomer error on some Illumina data, with the following message. Any ideas?
There was an error reading your input file. Your input file is likely malformed. It is suggested that you double-check your original input file for errors -- helpful information for this purpose has been provided below. However, if you think that you have encountered an actual error with this tool, please do tell us by using the bug reporting mechanism.
The reported error is: 'Invalid fastq header: lab/solexa_public/Zon/111021_WICMT-SOLEXA_64KF7AAXX/QualityScore/s_3_1_sequence.txt
rich
Howdy, Rich,
My interpretation of the error report is that the FASTQ file you are trying to groom contains the indicated text (lab/solexa_public/Zon/111021_WICMT-SOLEXA_64KF7AAXX/QualityScore/s_3_1_sequence.txt) on a line where the tool expects a valid FASTQ header. I believe a valid header line begins with an at sign ("@"). So perhaps somewhere along the way, your FASTQ file's contents were replaced by a filename.
Bob H
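Bob's reading of the error, that a record header must begin with "@", can be checked on a local copy of the file before uploading. The sketch below is a rough illustration only: it assumes the simple four-lines-per-record FASTQ layout, and the function name `first_bad_header` and the sample inputs (including the path) are hypothetical. It is not how the Groomer itself parses files.

```python
import io

def first_bad_header(handle):
    """Return (line_number, text) for the first record header lacking '@', else None.

    Assumes four lines per record, so headers sit on lines 1, 5, 9, ...
    """
    for index, line in enumerate(handle):
        if index % 4 == 0 and line.strip() and not line.startswith("@"):
            return index + 1, line.rstrip("\n")
    return None

# Hypothetical inputs: a file holding only a path, and a well-formed record.
bad = io.StringIO("lab/example/s_3_1_sequence.txt\n")
good = io.StringIO("@read1\nACGT\n+\nIIII\n")

print(first_bad_header(bad))   # (1, 'lab/example/s_3_1_sequence.txt')
print(first_bad_header(good))  # None
```

With a real file, the same function could be called on `open(path)`; a hit on line 1 would support the idea that the file holds a filename rather than sequence data.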
participants (6)
- arabidopsis
- Bob Harris
- Jennifer Jackson
- Kevin Lam
- Peter Cock
- Richard Mark White