How to process paired reads in Galaxy
Hi Users,

How do I process paired reads from an Illumina MiSeq platform? If I first use "Groomer" and then "Filter by quality", the paired reads get out of sync. I read some responses to similar questions in this Forum saying that the paired reads must first be "joined" before filtering and then split for mapping. There is also a "FASTQ interlacer" command to join reads.

However, I also read in the Forum that Galaxy requires the filtered paired reads to be of equal length. Would the filtering process on the joined reads not also modify the length of each read differently? Or can the NGS: Picard command "Paired Read Mate Fixer" be used to re-synchronize the paired reads?

If there is no way around the requirement for equal lengths, then I guess that it is really not possible to process paired reads in Galaxy? But I am sure it is just my stupidity.

Full disclosure: be kind, I am a novice in this field, just learning Galaxy (which by the way is a fantastic resource!).

Larry Simpson
UCLA
Hi Larry,

If you are performing RNA-seq analysis, there is no need to filter the data to ensure exact pairs before running TopHat. Later steps will deal with any unpaired sequences.

If you are instead performing variant analysis, then in many cases you will want matched pairs. Join the data first, perform any manipulations that may remove reads entirely (such as filtering by quality), then split the data and perform any steps that only trim read ends (if they are needed at all).

The FASTQ splitter and FASTQ joiner are at the top of the "NGS: QC and manipulation" tool group; FASTQ Quality Trimmer is a bit lower down. Use FastQC to decide which thresholds to use for each dataset.

Hopefully this helps,

Jen
Galaxy team

In reply to Larry Simpson's message of 7/22/13, 9:10 PM.
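For readers who want to see why the join, filter, split order keeps mates together, here is a minimal Python sketch of the same idea outside Galaxy. It is not how the Galaxy tools are implemented. The function names (`parse_fastq`, `mean_quality`, `filter_pairs`), the mean-quality criterion, and the threshold of 20 are all assumptions chosen for illustration, and the quality strings are assumed to be Sanger-encoded (Phred+33). The point is only that a pair is kept or dropped as a unit, so the forward and reverse reads never get out of sync.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

# Hypothetical record type for this sketch: (header, sequence, quality string)
Record = Tuple[str, str, str]


def parse_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence, quality) from plain 4-line FASTQ records."""
    it = iter(lines)
    while True:
        chunk = [line.rstrip("\n") for line in islice(it, 4)]
        if len(chunk) < 4:  # end of input (or a truncated final record)
            return
        header, seq, _plus, qual = chunk
        yield header, seq, qual


def mean_quality(qual: str, offset: int = 33) -> float:
    """Mean Phred score of a quality string (Phred+33 assumed)."""
    if not qual:
        return 0.0
    return sum(ord(c) - offset for c in qual) / len(qual)


def filter_pairs(
    forward: Iterable[Record],
    reverse: Iterable[Record],
    min_mean_q: float = 20.0,  # illustrative threshold, not a recommendation
) -> List[Tuple[Record, Record]]:
    """Keep a pair only if BOTH mates pass; drop both mates otherwise.

    The two inputs are assumed to list mates in the same order.
    """
    kept = []
    for r1, r2 in zip(forward, reverse):
        if mean_quality(r1[2]) >= min_mean_q and mean_quality(r2[2]) >= min_mean_q:
            kept.append((r1, r2))
    return kept


if __name__ == "__main__":
    # Three made-up pairs: 'I' is Phred 40, '!' is Phred 0.
    fwd = ["@r1/1", "ACGT", "+", "IIII",
           "@r2/1", "ACGT", "+", "!!!!",
           "@r3/1", "ACGT", "+", "IIII"]
    rev = ["@r1/2", "TTTT", "+", "IIII",
           "@r2/2", "TTTT", "+", "IIII",
           "@r3/2", "TTTT", "+", "!!!!"]
    for r1, r2 in filter_pairs(parse_fastq(fwd), parse_fastq(rev)):
        print(r1[0], r2[0])
```

In the made-up data, pair r2 fails on its forward read and pair r3 fails on its reverse read, so both are removed whole and only pair r1 survives. Filtering each file separately would instead leave two reads in each file that no longer line up, which is the out-of-sync problem described in the question.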
-- Jennifer Hillman-Jackson Galaxy Support and Training http://galaxyproject.org