Hello Yao,
The Ensembl-sourced reference annotation can often work with Cufflinks; however, it does need to be in GTF format (the file samples listed here are not in GTF format). Also, you will need to alter the chromosome names once the file is loaded into Galaxy. Specifically, Ensembl names human chromosomes "1", "2", "3", etc., and to match the Galaxy cached human reference genome exactly, a "chr" prefix needs to be added to create "chr1", "chr2", "chr3". A workflow to do this transformation is on the FAQ wiki here: http://main.g2.bx.psu.edu/u/jeremy/p/transcriptome-analysis-faq#faq5
Other issues with Ensembl GTF files have been known to pop up, so these data are not fully supported, and we still recommend using UCSC despite the missing gene_id information. But if you want to try, there is likely some sort of work-around that you could create on your own should a problem come up.
Hopefully this helps,
Jen
On 8/2/11 6:30 PM, yao chen wrote:
Dear all,
I have a similar problem when using Cufflinks in Galaxy (net version). If I don't select a reference annotation, I can get the FPKM values, but since there is no reference, I cannot get the transcript or gene names. The output looks like this:
test_id gene_id gene locus sample_1 sample_2 status value_1 value_2 ln(fold_change) test_stat p_value q_value significant
TCONS_00000002 XLOC_000025 - chr1:33860011-33860048 q1 q2 NOTEST 1.794e+06 0 -1.79769e+308 -1.79769e+308 0.0188163 1 no
There is no gene_id. However, if I use the reference annotation downloaded from Ensembl, I can get the gene_ids, but their FPKM values are all "0":
tracking_id class_code nearest_ref_id gene_id gene_short_name tss_id locus length coverage status FPKM FPKM_conf_lo FPKM_conf_hi
ENSMUSG00000024232 - - ENSMUSG00000024232 Bambi - 18:3507954-3516402 - - OK 0 0 0
ENSMUSG00000091539 - - ENSMUSG00000091539 Vmn1r238 - 18:3122454-3123465 - - OK 0 0 0
------------------ Any thoughts?
2011/8/3 Jennifer Jackson <jen@bx.psu.edu>
Hi Aleks,
Thanks for sending the data link, this helped to narrow down the root cause of the issue.
The UCSC-sourced GTF file has the attributes gene_id and transcript_id set to the same value (both as transcript_id). The result of this is that each transcript is interpreted by Cufflinks as a single gene, with no gene grouping (thus no isoforms).
We have plans to develop a work-around. This would likely involve (for the refGene track in particular) swapping the value in UCSC's primary table field refGene.name2 into the refGene GTF file's gene_id attribute. This would generate accurate gene-level statistics when the file is used as input to Cufflinks. You could do the same swap (outside of Galaxy) if you wanted to give it a try and have the resources.
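The gene_id swap described above could be sketched outside of Galaxy roughly as follows. This is only a minimal Python sketch, not the planned Galaxy work-around: it assumes you have already exported a mapping of refGene.name to refGene.name2 and loaded it into a dictionary, and that the GTF attribute column uses the usual gene_id "..."; transcript_id "..."; layout. The function name swap_gene_ids and the identifiers in the usage note are made up for illustration.

```python
import re

# Patterns for the two attributes in column 9 of a GTF line.
_GENE_ID = re.compile(r'gene_id "[^"]*"')
_TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]*)"')


def swap_gene_ids(gtf_lines, transcript_to_gene):
    """Yield GTF lines with gene_id replaced by the mapped gene name.

    transcript_to_gene is an assumed dict built by the caller, e.g. from
    the refGene table (name -> name2). Lines whose transcript_id is not
    in the mapping are passed through unchanged.
    """
    for line in gtf_lines:
        stripped = line.rstrip("\n")
        # Pass through blank lines and comments untouched.
        if not stripped or stripped.startswith("#"):
            yield stripped
            continue
        fields = stripped.split("\t")
        if len(fields) != 9:
            # Not a well-formed GTF record; leave it alone.
            yield stripped
            continue
        match = _TRANSCRIPT_ID.search(fields[8])
        gene = transcript_to_gene.get(match.group(1)) if match else None
        if gene is not None:
            # Use a function as the replacement so the gene name is
            # inserted literally, without regex escape processing.
            fields[8] = _GENE_ID.sub(
                lambda m: 'gene_id "%s"' % gene, fields[8], count=1
            )
        yield "\t".join(fields)
```

For example, with a mapping of {"NM_EXAMPLE": "GENE1"} (hypothetical identifiers), a record whose attributes read gene_id "NM_EXAMPLE"; transcript_id "NM_EXAMPLE"; would come out with gene_id "GENE1" and the transcript_id left as it was, so that transcripts sharing a name2 value are grouped under one gene.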
Very sorry for the current inconvenience,
Best,
Jen Galaxy team
On 7/25/11 11:26 AM, Jennifer Jackson wrote:
Hello Aleks,
Chromosome names must match exactly between all input files. Also, the SAM file and GTF file must both be sorted the same way. This FAQ may be of interest: http://main.g2.bx.psu.edu/u/jeremy/p/transcriptome-analysis-faq
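As an illustration of the chromosome-name requirement, here is a minimal Python sketch of one way to turn Ensembl-style names ("1", "18") into UCSC-style names ("chr1", "chr18") in a GTF file outside of Galaxy. It is not the FAQ workflow itself. The function names add_chr_prefix and seqnames are hypothetical, and the special-case renaming of "MT" to "chrM" is an assumption about which reference build you are matching against, so check it against your own genome's names.

```python
def add_chr_prefix(gtf_lines, prefix="chr", special=None):
    """Yield GTF lines with the prefix added to the first column.

    special maps names that need more than a plain prefix; the default
    (MT -> chrM) is an assumption for UCSC-style human/mouse builds.
    """
    if special is None:
        special = {"MT": "chrM"}
    for line in gtf_lines:
        stripped = line.rstrip("\n")
        # Pass through blank lines and comments untouched.
        if not stripped or stripped.startswith("#"):
            yield stripped
            continue
        seqname, sep, rest = stripped.partition("\t")
        if seqname in special:
            seqname = special[seqname]
        elif not seqname.startswith(prefix):
            seqname = prefix + seqname
        yield seqname + sep + rest


def seqnames(lines):
    """Return the set of first-column names in tab-separated records."""
    return {
        line.split("\t", 1)[0]
        for line in lines
        if line.strip() and not line.startswith("#")
    }
```

Comparing seqnames() of the converted GTF against the reference names listed in the SAM header (the SN: values on @SQ lines) is a quick way to spot any name that still does not match before running Cufflinks.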
If this is still a problem, please share the history with me directly, either by using my email address or by generating the share link and emailing it to me (only). Use "Options -> Share or Publish", not just your session's browser URL.
Best,
Jen Galaxy team
On 7/22/11 8:45 AM, Aleks Schein wrote:
Dear all, I am trying to run the Cufflinks installation in Galaxy on Solexa RNA-seq samples from HeLa cells. Running Cuffcompare, according to the manual, should produce a tmap file listing FMI values for detected isoforms. However, my files have only "100" or "0" in the FMI field, and the FPKM column contains only zeros. Is there something wrong with my input files or parameter settings? Or is it rather a specific issue with Galaxy's Cufflinks installation?
The data in question is available here: http://main.g2.bx.psu.edu/u/aleks/h/guided-assemblyadvanced
Thanks,
Aleks Schein
_____________________________________________________________ The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using "reply all" in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list:
http://lists.bx.psu.edu/listinfo/galaxy-dev
To manage your subscriptions to this and other Galaxy lists, please use the interface at:
-- Jennifer Jackson http://usegalaxy.org http://galaxyproject.org/Support