Hello Mark,
Correct - not needed. Any Illumina data produced by a CASAVA 1.8 or later pipeline already has quality scores scaled to Sanger Phred with an ASCII offset of 33. This corresponds to the ".fastqsanger" format in Galaxy, so no grooming is required; you can just assign the datatype. How to do that is covered in the first link below, and all the details, including how to read a FastQC report to determine this, are in the screencast.
Thanks!
Jen
Galaxy team
On 12/13/13 8:58 AM, Mark Lindsay wrote:
Hi James
do you need to run the Groomer on the latest Illumina data?
BW
Mark
On 13 Dec 2013, at 16:53, Jennifer Jackson <jen@bx.psu.edu> wrote:
Hello,
For the first question, make sure that you are running the Groomer with the correct options. In almost all cases for Illumina data this will mean leaving all but one setting at its default. The setting to change is "Input FASTQ quality scores type:". The results of FastQC will tell you how to set this. An example is in this wiki section's screencast, plus the first bullet point:
https://wiki.galaxyproject.org/Support#Dataset_special_cases
http://vimeo.com/galaxyproject/fastqprep
For the second, I am not sure what you mean by 'different'. Do you mean the data may have contamination from another species? Or that the data content may be different with respect to quality?
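As a rough illustration of what the "quality scores type" setting refers to, here is a minimal Python sketch that guesses the ASCII offset from the quality lines of a FASTQ file. The function names and the cut-off values (59 and 74) are assumptions based on the commonly cited character ranges for Phred+33 and Phred+64 data; this is not how FastQC or the Groomer make the decision, only a way to see why the two encodings can be told apart.

```python
def guess_phred_offset(quality_lines):
    """Guess the ASCII offset (33 or 64) from FASTQ quality strings.

    Heuristic (an assumption, not a Galaxy or FastQC routine):
    characters below ASCII 59 only occur in Phred+33 data, and
    characters above ASCII 74 normally only occur in +64 data.
    Returns None when the observed range does not settle it.
    """
    codes = [ord(c) for line in quality_lines for c in line.strip()]
    if not codes:
        raise ValueError("no quality characters supplied")
    lo, hi = min(codes), max(codes)
    if lo < 59:
        return 33
    if hi > 74:
        return 64
    return None


def decode_qualities(quality_string, offset):
    """Convert a quality string to a list of integer Phred scores."""
    return [ord(c) - offset for c in quality_string]
```

Grooming with the wrong input type shifts every score by the difference between the offsets, which is one way a dataset that looked fine before grooming can look poor afterwards.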
In short, to filter based on quality as reported in the FastQC report, try tools in the same tool group such as "FASTQ Trimmer" or "FASTQ Quality Trimmer".
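To make the idea of quality-based trimming concrete, here is a minimal Python sketch that removes low-quality bases from the 3' end of a single read. It assumes Sanger (Phred+33) quality strings; the function name `trim_3prime` and the default threshold of 20 are illustrative assumptions, and this is not the algorithm used by the "FASTQ Trimmer" or "FASTQ Quality Trimmer" tools.

```python
def trim_3prime(seq, qual, min_q=20, offset=33):
    """Drop trailing bases whose Phred score is below min_q.

    seq and qual must have the same length; offset=33 assumes
    Sanger-scaled qualities (hypothetical defaults for illustration).
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality lengths differ")
    end = len(qual)
    # Walk back from the 3' end while the base quality is too low.
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]
```

For example, `trim_3prime("ACGTAC", "IIII##")` keeps the first four bases, because "I" decodes to 40 and "#" decodes to 2 at offset 33.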
The protocols included in our RNA-seq pipeline start out with some quality steps: https://wiki.galaxyproject.org/Support#Tools_on_the_Main_server:_RNA-seq
And many from our community have contributed RNA-seq tutorials: https://wiki.galaxyproject.org/Learn#Other_Tutorials
Hopefully this helps!
Jen
Galaxy team
On 12/13/13 7:05 AM, Jorge Braun wrote:
Hello mates,
I have two questions about Galaxy:
a) I have RNA-seq data from Illumina and ran FastQC on it; the results are good. But when I run FASTQ Groomer to convert it to Sanger format and then run FastQC again, the results are bad. Does anyone know the cause? I do not understand why.
b) With the same sequences, can I find out whether they are different RNAs and eliminate those I do not want to examine?
Merry Christmas :)
___________________________________________________________ The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org <http://usegalaxy.org>. Please keep all replies on the list by using "reply all" in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list:
http://lists.bx.psu.edu/listinfo/galaxy-dev
To manage your subscriptions to this and other Galaxy lists, please use the interface at:
To search Galaxy mailing lists use the unified search at:
-- Jennifer Hillman-Jackson http://galaxyproject.org