Hello Mark,

Correct - not needed. Any Illumina data that results from a CASAVA 1.8 or higher pipeline already has quality scores scaled to Sanger Phred with an ASCII offset of 33. This translates to the ".fastqsanger" format in Galaxy, so no grooming is required; you can just assign the datatype. How to do that is in the first link below. All the details, including how to read a FastQC report to determine this, are in the screencast.
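
As a rough illustration of what "Sanger Phred with an ASCII offset of 33" means, here is a minimal Python sketch. The function name and the sample quality string are made up for this example and are not part of Galaxy or any of its tools.

```python
def phred33_scores(quality_line):
    """Decode a FASTQ quality string that uses an ASCII offset of 33.

    Each character's ASCII code minus 33 is the Phred quality score.
    """
    return [ord(ch) - 33 for ch in quality_line]


# Hypothetical quality string: 'I' is ASCII 73, '5' is 53, '#' is 35.
print(phred33_scores("II5#"))  # [40, 40, 20, 2]
```

Data that already decodes sensibly this way is what Galaxy calls ".fastqsanger".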

Thanks!

Jen
Galaxy team

On 12/13/13 8:58 AM, Mark Lindsay wrote:
Hi James

do you need to run the Groomer on the latest Illumina data?

BW

Mark



On 13 Dec 2013, at 16:53, Jennifer Jackson <jen@bx.psu.edu> wrote:

Hello,

For the first question, make sure that you are running the groomer with the correct options. In almost all cases for Illumina data this will mean leaving all but one setting at default. The setting to change is "Input FASTQ quality scores type:". The results of FastQC will inform you about how to set this. An example is in this wiki section's screencast plus the first bullet point:
https://wiki.galaxyproject.org/Support#Dataset_special_cases
http://vimeo.com/galaxyproject/fastqprep
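
If it helps to see the idea behind choosing the input quality type, the guess can be approximated by hand from the range of characters on the quality lines. This is only a sketch under stated assumptions: plain four-line FASTQ records, a hypothetical function name, and a simple threshold heuristic. It is not how FastQC or the groomer is implemented.

```python
def guess_quality_offset(fastq_lines, max_records=1000):
    """Guess the ASCII offset (33 or 64) from the lowest quality character.

    Assumes plain four-line FASTQ records (no wrapped sequence lines).
    Returns None if no quality characters were seen.
    """
    lowest = None
    for i, line in enumerate(fastq_lines):
        if i // 4 >= max_records:
            break
        if i % 4 == 3:  # every fourth line holds the quality string
            qual = line.rstrip("\n")
            if qual:
                low = min(ord(c) for c in qual)
                lowest = low if lowest is None else min(lowest, low)
    if lowest is None:
        return None
    # Characters below ';' (ASCII 59) cannot occur in the older
    # offset-64 encodings, so the data must use offset 33.
    if lowest < 59:
        return 33
    # Probably offset 64, although offset-33 data with only very
    # high scores would look the same; treat this as a hint only.
    return 64


# Hypothetical single record for illustration.
record = ["@read1\n", "ACGT\n", "+\n", "II5#\n"]
print(guess_quality_offset(record))  # 33
```

Reading offset-33 characters with an offset of 64 makes every score appear 31 lower, which is one way a wrong input type can make good data look bad after grooming.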

For the second, I am not sure what you mean by 'different'. Do you mean the data may have contamination from another species? Or that the data content may be different with respect to quality?

In short, to filter based on quality as reported in the FastQC report, try tools in the same tool group such as "FASTQ Trimmer" or "FASTQ Quality Trimmer".
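
To give a feel for what quality trimming does, here is a deliberately simplified sketch that drops low-quality bases from the 3' end of a read. The function name, the threshold of 20, and the sample read are assumptions for illustration; the Galaxy tools named above have their own options and are not implemented this way.

```python
def trim_3prime(seq, qual, min_score=20, offset=33):
    """Drop trailing bases whose Phred score is below min_score.

    seq and qual are the sequence and quality strings of one read;
    offset is the ASCII offset of the quality encoding (33 for Sanger).
    """
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_score:
        end -= 1
    return seq[:end], qual[:end]


# Hypothetical read: the last two bases have scores 2 ('#') and 0 ('!').
print(trim_3prime("ACGTAC", "III5#!"))  # ('ACGT', 'III5')
```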

The protocols included in our RNA-seq pipeline start out with some quality steps:
https://wiki.galaxyproject.org/Support#Tools_on_the_Main_server:_RNA-seq

And many from our community have contributed RNA-seq tutorials:
https://wiki.galaxyproject.org/Learn#Other_Tutorials

Hopefully this helps!

Jen
Galaxy team

On 12/13/13 7:05 AM, Jorge Braun wrote:
Hello mates,

I have two questions about Galaxy:

a) I have RNA-seq data from Illumina and ran FastQC on it ... the results are good, but when I run FASTQ Groomer to convert to Sanger format and then run FastQC again ... the results are bad. Does anyone know the cause? I do not understand why.

b) With the same sequences, can I tell whether they are different RNAs and eliminate those that I do not want to examine?

merry christmas :)


___________________________________________________________
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org.  Please keep all replies on the list by
using "reply all" in your mail client.  For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:

  http://lists.bx.psu.edu/listinfo/galaxy-dev

To manage your subscriptions to this and other Galaxy lists,
please use the interface at:

  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:

  http://galaxyproject.org/search/mailinglists/

-- 
Jennifer Hillman-Jackson
http://galaxyproject.org