Building a full Trinity workflow from assembly through analysis
Greetings all,

I'm trying to build out a Galaxy workflow to support Trinity RNA-Seq de novo assembly and all the various downstream analyses. For the initial Trinity de novo assembly, I took Jeremy's initial workflow and tweaked it to work with the latest release - and submitted it to the Galaxy tool shed. So, step 1 is done.

But... here's what I'm ultimately trying to accomplish. Given fastq files for a number of different samples {A,B,C,...}:

1. merge {A,B,C,...} => MERGED.fq
2. run Trinity based on MERGED.fq to generate Trinity.fasta (existing workflow does this already)
3. align the original {A,B,C,...} separately to Trinity.fasta using bowtie
4. for each alignment, perform abundance estimation
5. combine results from each abundance estimation into a matrix file
6. run some Bioconductor tools to analyze differential expression

There are some complexities here, such as not knowing ahead of time how many different samples are to be processed - so this needs to be determined dynamically, which impacts the overall complexity of the total workflow. Is this possible in Galaxy, and if so, are there examples I can work from?

There are a whole bunch of other add-ons I'd like to include beyond the above, but I figure that if the above is doable, then the rest should be equally doable.

Many thanks!

--
Brian J. Haas
Manager, Genome Annotation and Analysis, Research and Development
The Broad Institute
http://broad.mit.edu/~bhaas
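As a rough illustration of step 5 in the list above, the sketch below combines per-sample abundance tables into one transcript-by-sample matrix without knowing the number of samples in advance. The function name `build_matrix`, the in-memory input layout (sample name mapped to transcript abundances), and the example transcript IDs are all assumptions made for illustration; they are not the output format of any particular abundance estimation tool.

```python
from typing import Dict, List


def build_matrix(per_sample: Dict[str, Dict[str, float]]) -> List[List[str]]:
    """Combine per-sample abundance tables into a transcript-by-sample matrix.

    per_sample maps a sample name to {transcript_id: abundance}.
    This input layout is hypothetical; a real tool's output would need parsing first.
    """
    # The number of samples is discovered at run time; sorting fixes the column order.
    samples = sorted(per_sample)
    # Union of all transcript IDs seen in any sample.
    transcripts = sorted({t for table in per_sample.values() for t in table})

    rows = [["transcript"] + samples]
    for t in transcripts:
        # A transcript missing from a sample is reported with abundance 0.0.
        rows.append([t] + [str(per_sample[s].get(t, 0.0)) for s in samples])
    return rows


if __name__ == "__main__":
    # Made-up example values for three samples.
    estimates = {
        "A": {"comp1_c0_seq1": 12.0, "comp2_c0_seq1": 3.5},
        "B": {"comp1_c0_seq1": 7.0},
        "C": {"comp2_c0_seq1": 1.0, "comp3_c0_seq1": 9.0},
    }
    for row in build_matrix(estimates):
        print("\t".join(row))
```

The same function works for two samples or twenty, which is the property the workflow needs; how the tables reach it inside Galaxy is a separate question.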
Brian,

> For the initial trinity de novo assembly, I took Jeremy's initial workflow and tweaked it to work with the latest release - and submitted it to the galaxy tool shed.

Excellent. I'll remove my old Trinity wrapper from our code base and point people to your wrapper.

> There are some complexities here, such as not knowing ahead of time how many different samples are to be processed - so this needs to be determined dynamically, which impacts the overall complexity of the total workflow. Is this possible in Galaxy, and if so, are there examples I can work from?

Take a look at the repeat element; it enables a user to specify a variable number of inputs to a tool.

http://wiki.g2.bx.psu.edu/Admin/Tools/Tool%20Config%20Syntax#A.3Crepeat.3E_t...

Tool wrappers that use the repeat element include GATK's Depth of Coverage and Cuffdiff.

Best,
J.
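To make the repeat element suggestion concrete, here is a minimal sketch, assuming a made-up tool config. The XML fragment uses the standard `<repeat>` and `<param>` elements, but the tool id, parameter names, and the `align_samples.sh` script are hypothetical and are not taken from the Trinity, GATK, or Cuffdiff wrappers. The Python around it only parses the fragment and imitates how a command-line template would loop over however many sample blocks the user added.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Hypothetical tool config fragment; all ids and names are illustrative only.
TOOL_XML = """
<tool id="align_samples" name="Align samples to assembly">
  <inputs>
    <param name="assembly" type="data" format="fasta" label="Trinity assembly"/>
    <repeat name="samples" title="Sample">
      <param name="reads" type="data" format="fastq" label="Reads"/>
    </repeat>
  </inputs>
</tool>
"""


def repeat_params(tool_xml: str) -> Dict[str, List[str]]:
    """Map each <repeat> block's name to the names of the params it repeats."""
    root = ET.fromstring(tool_xml.strip())
    return {
        rep.get("name"): [p.get("name") for p in rep.findall("param")]
        for rep in root.iter("repeat")
    }


def expand_command(assembly: str, samples: List[Dict[str, str]]) -> str:
    """Imitate a template loop over the repeat: one --reads argument per sample.

    The script name and flags are assumptions, not a real command line.
    """
    parts = ["align_samples.sh", "--assembly", assembly]
    for sample in samples:
        parts += ["--reads", sample["reads"]]
    return " ".join(parts)


if __name__ == "__main__":
    print(repeat_params(TOOL_XML))
    # The user decides at run time how many sample blocks to add.
    print(expand_command("Trinity.fasta", [{"reads": "A.fq"}, {"reads": "B.fq"}]))
```

In an actual wrapper the loop would live in the tool's command template rather than in Python; the point here is only that the number of inputs is not fixed in the config.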
Excellent. Thanks, Jeremy!

-b

On Fri, Jul 20, 2012 at 8:37 AM, Jeremy Goecks <jeremy.goecks@emory.edu> wrote:
> [...]
participants (2)
- Brian Haas
- Jeremy Goecks