fastqc and blast?
Hello mates,
I have two questions about Galaxy:
a) I have RNA-seq data from Illumina. When I run FastQC on it, the results are good, but when I convert it to Sanger format with FASTQ Groomer and run FastQC again, the results are bad. Does anyone know the cause? I do not understand why.
b) With the same sequences, can I find out whether they are different RNAs and eliminate the ones I do not want to examine?
Merry Christmas :)
Hello,

For the first question, make sure that you are running the groomer with the correct options. In almost all cases for Illumina data, this will mean leaving all but one setting at default. The setting to change is "Input FASTQ quality scores type:". The results of FastQC will inform you about how to set this. An example is in this wiki section's screencast plus the first bullet point:
https://wiki.galaxyproject.org/Support#Dataset_special_cases
http://vimeo.com/galaxyproject/fastqprep

For the second, I am not sure what you mean by 'different'. Do you mean the data may have contamination from another species? Or that the data content may be different with respect to quality? In short, to filter based on quality as reported in the FastQC report, try tools in the same tool group such as "FASTQ Trimmer" or "FASTQ Quality Trimmer". The protocols included in our RNA-seq pipeline start out with some quality steps:
https://wiki.galaxyproject.org/Support#Tools_on_the_Main_server:_RNA-seq

And many from our community have contributed RNA-seq tutorials:
https://wiki.galaxyproject.org/Learn#Other_Tutorials

Hopefully this helps!

Jen
Galaxy team

On 12/13/13 7:05 AM, Jorge Braun wrote:
Hello mates,
I have two doubts galaxy:
a) I have rna-seq data from Illumina and do fastqc ... the results are good but when I fastgroomer to Sanger format and then fastqc ... the results are bad. Does anyone know the cause? I do not understand why.
b) With the same sequences can know if they are different Rna and eliminate those that do not want to examine?
merry christmas :)
___________________________________________________________ The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using "reply all" in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list:
http://lists.bx.psu.edu/listinfo/galaxy-dev
To manage your subscriptions to this and other Galaxy lists, please use the interface at:
http://lists.bx.psu.edu/
To search Galaxy mailing lists use the unified search at:
http://galaxyproject.org/search/mailinglists/
-- Jennifer Hillman-Jackson http://galaxyproject.org
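As a rough illustration of why the "Input FASTQ quality scores type:" setting matters: FASTQ quality characters are ASCII codes with an offset of 33 (Sanger, Illumina 1.8+) or 64 (older Illumina pipelines), and grooming with the wrong input type shifts every score. The sketch below is not how FASTQ Groomer or FastQC work internally; the function names are hypothetical, and it assumes simple four-line FASTQ records with no line wrapping.

```python
def quality_lines(fastq_text):
    """Return the quality line of each record, assuming 4-line FASTQ records."""
    lines = fastq_text.splitlines()
    return lines[3::4]


def guess_phred_offset(qualities):
    """Guess the quality offset (33 or 64) from the lowest ASCII code seen.

    Returns None when the lowest code falls in the ambiguous range 59-63.
    This is a heuristic: a small sample of very high-quality Phred+33 reads
    can still be mistaken for Phred+64.
    """
    lowest = min(ord(ch) for line in qualities for ch in line)
    if lowest < 59:
        return 33  # codes below 59 are only valid in Phred+33 data
    if lowest >= 64:
        return 64  # consistent with Phred+64 (older Illumina)
    return None    # 59-63: Solexa+64 or high-quality Phred+33, cannot tell


# Hypothetical one-record inputs, for illustration only.
sanger_like = "@read1\nACGT\n+\n!5?I\n"
old_illumina_like = "@read1\nACGT\n+\nhhhB\n"

print(guess_phred_offset(quality_lines(sanger_like)))
print(guess_phred_offset(quality_lines(old_illumina_like)))
```

If the guess for the raw data is 33, the data is already in Sanger encoding, and telling the groomer that the input is an older Illumina type would produce the kind of bad FastQC report described in the question.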
Hello,

Of course, Jennifer is right about the first question 😊.

For the second question, about BLAST: I wonder whether, after running BLAST in Galaxy, I can remove sequences that may be contaminating the data. Is that possible?

Last question: is Trinity 100% operational in Galaxy? Trinity ran but the result was empty, and I think the .py script failed.

Thanks, Jennifer and colleagues, for your patience and solutions.

Date: Fri, 13 Dec 2013 08:53:31 -0800
From: jen@bx.psu.edu
To: braun_bio@hotmail.com; galaxy-user@lists.bx.psu.edu
Subject: Re: [galaxy-user] fastqc and blast?
On Sat, Dec 14, 2013 at 8:52 AM, Jorge Braun <braun_bio@hotmail.com> wrote:
Hello, of course, Jennifer is right about the first question. For the second question, about BLAST: I wonder whether, after running BLAST in Galaxy, I can remove sequences that may be contaminating the data. Is that possible?
The BLAST suite is not available on the public Galaxy server at http://usegalaxy.org but is available from the Galaxy Tool Shed if you have a local Galaxy instance:
http://toolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus/

One way to filter your FASTA file based on BLAST hits would be to use the tabular output from BLAST with this sequence filtering tool:
http://toolshed.g2.bx.psu.edu/view/peterjc/seq_filter_by_id

For example, if you want to remove transcripts which seem to be mitochondrial, you could BLAST against a mitochondrial database and keep only the sequences with no hits.

Regards,
Peter
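For readers without the Tool Shed tool at hand, the filtering idea Peter describes can be sketched in plain Python. This is only an illustration, not the implementation of seq_filter_by_id; the function names are hypothetical. It assumes BLAST tabular output with the query ID in the first column, and FASTA headers whose ID is the first whitespace-delimited word after ">".

```python
def ids_with_hits(blast_tabular_lines):
    """Collect the query IDs (first column) from BLAST tabular output."""
    hits = set()
    for line in blast_tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        hits.add(line.split("\t")[0])
    return hits


def fasta_without_hits(fasta_lines, hit_ids):
    """Yield the FASTA lines of records whose ID is not in hit_ids."""
    keep = False
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            record_id = fields[0] if fields else ""
            keep = record_id not in hit_ids
        if keep:
            yield line


# Hypothetical data: transcript t2 has a hit against a mitochondrial database.
fasta = [">t1 some description", "ACGT", ">t2", "GGCC", ">t3", "TTAA"]
blast = ["t2\tmito1\t99.0\t4\t0\t0\t1\t4\t1\t4\t1e-5\t50"]

for line in fasta_without_hits(fasta, ids_with_hits(blast)):
    print(line)
```

To keep only the sequences that do have hits, invert the test to `record_id in hit_ids`.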
Hi Jorge,

I see that Peter helped with the BLAST question (thanks Peter!), but you are having trouble with Trinity, too. To my knowledge, Trinity should work without any issues. Make sure that you are running the latest distribution (or CloudMan) and have all of the dependencies set up. Then, if you are still having problems, send details to the galaxy-dev@bx.psu.edu mailing list only (not to a team member, and please start a new thread rather than replying on the galaxy-user list). This way our development team/community will see the thread and can help you troubleshoot the install.
https://wiki.galaxyproject.org/MailingLists#The_lists

Thanks!

Jen
Galaxy team

-- Jennifer Hillman-Jackson http://galaxyproject.org
Participants (3): Jennifer Jackson, Jorge Braun, Peter Cock