On Thu, Mar 10, 2011 at 7:55 AM, Jeremy Goecks
<jeremy.goecks@emory.edu> wrote:
Jagat,
Just like any mRNA-seq experiment, I want to achieve the following objectives:
1. Reconstruct all transcripts of a particular gene, along with the corresponding transcripts that Cuffdiff calls as significantly expressed.
2. Identify the different isoforms.
3. Determine the location of splicing.
Across the various output files, which unique ID can be used to match one file, say Cuffdiff.expr (transcript/isoform/splicing), to another, such as the transcript.gtf for each sample or the combined GTF file?
I've got a script that does this for the cuffdiff isoform expression testing file and a GTF file; I'll wrap it up and add it to Galaxy in the next couple weeks. It would probably be useful to have similar scripts for the other expression testing files as well. Also, it would be nice to be able to take the FPKM values generated by Cuffdiff and attach them to their respective transcripts as attributes.
Hello all,
I've added a tool called 'Filter GTF data by attribute values list' to the galaxy-central code repository. This tool is available on our test server (http://test.g2.bx.psu.edu/) at Filter and Sort --> GFF --> Filter GTF data by attribute values list and will be available on our main server in the next few weeks.
As the name suggests, this tool filters a GTF file based on a list of attribute values. The list can also be a tabular file in which the attribute values are in the first column, as is the case for Cuffdiff output files. Attributes that can be filtered on include transcript_id, gene_id, tss_id, and p_id; conveniently, these are the IDs that Cuffdiff uses in its output files.
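
For anyone who wants to see the idea outside Galaxy, here is a rough Python sketch of this kind of filtering. It is not the tool's actual code; the function names (parse_attributes, filter_gtf) are made up for illustration, and it assumes a tab-separated GTF whose ninth column holds attributes in the usual name "value"; form.

```python
import re

# Matches GTF attribute pairs such as: transcript_id "TCONS_00000001";
ATTR_RE = re.compile(r'(\S+)\s+"([^"]*)"')


def parse_attributes(field):
    """Return a dict of attribute name -> value from GTF column 9."""
    return dict(ATTR_RE.findall(field))


def filter_gtf(gtf_lines, attribute, wanted_values):
    """Yield GTF lines whose `attribute` value is in `wanted_values`.

    Hypothetical helper for illustration; comment lines, blank lines,
    and lines with fewer than nine columns are skipped.
    """
    wanted = set(wanted_values)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue
        if parse_attributes(fields[8]).get(attribute) in wanted:
            yield line
```

Calling filter_gtf(open("combined.gtf"), "transcript_id", ids) would then yield only the lines for the listed transcripts; the file name here is just a placeholder.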
Here's an example workflow:
(1) Run Cufflinks, Cuffcompare, and Cuffdiff.
(2) Filter the Cuffdiff isoform differential expression file for transcripts that are differentially expressed; in other words, filter for c12=='yes'.
(3) Use 'Filter GTF data by attribute values list' to filter the Cuffcompare combined transcripts, using the filtered file from step (2) as the attribute values list. Voila: you have a GTF file of the differentially expressed transcripts that you can view in your favorite genome browser.
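
The c12=='yes' filter in the workflow can also be sketched in a few lines of Python, in case it helps to see what that step selects. This is only an illustration under assumptions: the function names are hypothetical, the input is taken to be tab-separated with the transcript ID in the first column, and the significance flag is taken to sit in the 12th column as described above (other Cuffdiff versions may place it elsewhere).

```python
import csv


def significant_ids(rows, id_col=0, flag_col=11, flag="yes"):
    """Return the IDs of rows whose flag column equals `flag`.

    Columns are 0-based here, so flag_col=11 corresponds to Galaxy's c12.
    A header row is dropped naturally because its flag cell is not "yes".
    """
    ids = []
    for row in rows:
        if len(row) > flag_col and row[flag_col] == flag:
            ids.append(row[id_col])
    return ids


def read_significant_ids(path):
    """Read a tab-separated expression testing file and return its significant IDs."""
    with open(path, newline="") as handle:
        return significant_ids(csv.reader(handle, delimiter="\t"))
```

The resulting list of IDs plays the role of the attribute values list used in the final step.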
Hope this helps; feedback is always welcome.
Best,
J.