Anything wrong with my Cufflinks process in Galaxy?
Dear all,

I recently ran Cufflinks in Galaxy online to compare two groups of samples. However, no transcript or gene passed the significance threshold, even though many of them have a large FPKM in one sample and an FPKM of 0 in the other. Any thoughts?

Below is my Cufflinks process:

I have four samples belonging to two groups: the test group has three samples, and the control group has one.

First, using accepted_hits.bam from TopHat, I ran Cufflinks without annotation on each sample.

Then I ran Cuffcompare on the four GTF files (one per sample) to combine the transcripts and compare them to the genome annotation. At this step, however, I found that the transcript accuracy is very low. One example:

Missed exons: 10673/11776 (90.6%)
Wrong exons: 1254/2007 (62.5%)
Missed introns: 8529/8637 (98.7%)
Wrong introns: 2/5 (40.0%)
Missed loci: 0/504 (0.0%)
Wrong loci: 1248/2002 (62.3%)

Finally, I ran Cuffdiff between the two groups.

Thank you.
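As a quick sanity check on the Cuffcompare summary quoted above, the percentages can be recomputed from the raw counts. The following is only a minimal sketch: the function name `parse_stats` and the regular expression are illustrative assumptions, not part of Cuffcompare, and the input is simply the six lines copied from the message.

```python
import re

# Summary lines copied from the Cuffcompare output quoted in the message.
STATS = """\
Missed exons: 10673/11776 ( 90.6%)
Wrong exons: 1254/2007 ( 62.5%)
Missed introns: 8529/8637 ( 98.7%)
Wrong introns: 2/5 ( 40.0%)
Missed loci: 0/504 ( 0.0%)
Wrong loci: 1248/2002 ( 62.3%)
"""

# Matches "Label: count/total" at the start of a line (hypothetical helper pattern).
LINE_RE = re.compile(r"^(?P<label>[A-Za-z ]+):\s*(?P<num>\d+)/(?P<den>\d+)")


def parse_stats(text):
    """Return {label: (count, total, percent)} for each 'label: n/m' line."""
    rows = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip anything that is not a count line
        num, den = int(match["num"]), int(match["den"])
        percent = 100.0 * num / den if den else 0.0
        rows[match["label"]] = (num, den, percent)
    return rows


if __name__ == "__main__":
    for label, (num, den, pct) in parse_stats(STATS).items():
        print(f"{label:15s} {num:6d}/{den:<6d} {pct:5.1f}%")
```

The recomputed values agree with the reported ones, so the low accuracy reflects the assembly itself and not a reporting glitch.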
Yao,

It's difficult to tell what's wrong without seeing your analysis. However, you may want to use the reference annotation during the Cufflinks phase to either estimate isoform expression or guide assembly (this option will appear on our public server soon). Read the Cufflinks documentation to understand these options and what they do for your assembly and FPKM values:

http://cufflinks.cbcb.umd.edu/manual.html#cufflinks

De novo assembly from mapped reads is often somewhat imprecise and incomplete, especially for low-coverage data. It's not surprising that a de novo assembly doesn't match especially well with the reference.

If you're still not seeing any differential expression after using the reference GTF in Cufflinks, Cuffcompare, and Cuffdiff, you may want to email the Cufflinks/compare/diff authors and ask for some pointers:

tophat.cufflinks@gmail.com

Good luck,
J.

On Aug 10, 2011, at 5:07 AM, yao chen wrote:
Dear all:
I recently ran Cufflinks in Galaxy online to compare two groups of samples. However, no transcript or gene passed the significance threshold, even though many of them have a large FPKM in one sample and an FPKM of 0 in the other.
Any thoughts?
Below is my Cufflinks process:
I have four samples belonging to two groups: the test group has three samples, and the control group has one.
First, using accepted_hits.bam from TopHat, I ran Cufflinks without annotation on each sample.
Then I ran Cuffcompare on the four GTF files (one per sample) to combine the transcripts and compare them to the genome annotation. At this step, however, I found that the transcript accuracy is very low. One example:
Missed exons: 10673/11776 (90.6%)
Wrong exons: 1254/2007 (62.5%)
Missed introns: 8529/8637 (98.7%)
Wrong introns: 2/5 (40.0%)
Missed loci: 0/504 (0.0%)
Wrong loci: 1248/2002 (62.3%)
Finally, I ran Cuffdiff between the two groups.
Thank you.
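For anyone running Cufflinks outside Galaxy, Jeremy's suggestion to use the reference annotation maps onto two command-line options described in the Cufflinks manual: `-G` (quantify only the annotated transcripts) and `-g` (use the annotation to guide assembly). The sketch below just builds the argument list and does not run anything. The helper name `cufflinks_command` and the file names `accepted_hits.bam` and `genes.gtf` are assumptions for illustration, and it is worth confirming the flags against `cufflinks --help` for the installed version.

```python
import shlex


def cufflinks_command(bam, annotation, mode="estimate", out_dir="cufflinks_out"):
    """Build a Cufflinks argument list that uses a reference annotation.

    mode="estimate" -> -G : estimate expression of annotated transcripts only
    mode="guide"    -> -g : use the annotation to guide transcript assembly
    """
    flags = {"estimate": "-G", "guide": "-g"}
    if mode not in flags:
        raise ValueError(f"mode must be one of {sorted(flags)}, got {mode!r}")
    # -o sets the output directory; the BAM file is the final positional argument.
    return ["cufflinks", "-o", out_dir, flags[mode], annotation, bam]


if __name__ == "__main__":
    # Hypothetical file names; substitute your own TopHat output and GTF.
    cmd = cufflinks_command("accepted_hits.bam", "genes.gtf", mode="guide")
    print(shlex.join(cmd))
```

Returning a list keeps the command safe to hand to `subprocess.run` without shell quoting issues.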
Hi Jeremy,

I tried to run Cufflinks to assemble transcripts after running TopHat against my own reference, and encountered the error below. What went wrong, and how can I fix it?

Error running cufflinks.
[21:01:14] Inspecting reads and determining fragment length distribution.
Processed 11130 loci.
Warning: Using default Gaussian distribution due to insufficient paired-end reads in open ranges. It is recommended that correct paramaters (--frag-len-mean and --frag-len-std-dev) be provided.
Map Properties:
Total Map Mass: 224153.00
Read Type: 50bp single-end
Fragment Length Distribution: Truncated Gaussian (default)
Default Mean: 200
Default Std Dev: 80
[21:01:18] Assembling transcripts and estimating abundances.
Processed 11130 loci.
[21:01:22] Loading reference annotation and sequence.
No fasta index found for ref.fa. Rebuilding, please wait..
Error: sequence lines in a FASTA record must have the same length!

Thanks,
Jiannong
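The last line of the log suggests that ref.fa has sequence lines of uneven length within at least one record, which prevents the FASTA index from being built. One possible fix is to rewrap every record to a fixed line width before rerunning. The sketch below assumes that uneven wrapping is the only problem with the file. The helper names (`read_fasta`, `has_uniform_lines`, `rewrap`, `rewrap_file`) and the width of 60 are illustrative choices, not anything prescribed by Cufflinks.

```python
def read_fasta(lines):
    """Yield (header, [sequence lines]) for each record in an iterable of lines."""
    header, seq_lines = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(">"):
            if header is not None:
                yield header, seq_lines
            header, seq_lines = line, []
        elif header is not None:
            seq_lines.append(line)
    if header is not None:
        yield header, seq_lines


def has_uniform_lines(seq_lines):
    """True if all lines but the last share one length and the last is
    non-empty and no longer than the others."""
    if len(seq_lines) <= 1:
        return True
    widths = {len(line) for line in seq_lines[:-1]}
    if len(widths) != 1:
        return False
    return 0 < len(seq_lines[-1]) <= widths.pop()


def rewrap(lines, width=60):
    """Yield FASTA lines with each record's sequence wrapped at `width`."""
    for header, seq_lines in read_fasta(lines):
        sequence = "".join(part.strip() for part in seq_lines)
        yield header
        for start in range(0, len(sequence), width):
            yield sequence[start:start + width]


def rewrap_file(src, dst, width=60):
    """Write a rewrapped copy of the FASTA file `src` to `dst`."""
    with open(src) as fin, open(dst, "w") as fout:
        for line in rewrap(fin, width):
            fout.write(line + "\n")


if __name__ == "__main__":
    # Small in-memory example with one ragged record (chr1) and one clean record.
    example = [">chr1", "ACGT", "AC", "ACGTAC", ">chr2", "GGGG", "GG"]
    for header, seq in read_fasta(example):
        print(header, "uniform" if has_uniform_lines(seq) else "ragged")
    print("\n".join(rewrap(example, width=5)))
```

To apply it to the reference, something like `rewrap_file("ref.fa", "ref.rewrapped.fa")` would produce a copy with consistent line lengths (the output name is hypothetical). Rerunning TopHat and Cufflinks against the rewrapped file keeps the reference consistent across steps.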
participants (3)
- Jeremy Goecks
- Jiannong Xu
- yao chen