Hello Yao,
The Ensembl-sourced reference annotation can often work with Cufflinks,
but it does need to be in GTF format (the file samples listed here
are not in GTF format). Also, you will need to alter the chromosome
names once loaded into Galaxy. Specifically, Ensembl names human
chromosomes "1", "2", "3", etc., and to have them match exactly with the
Galaxy cached human reference genome, a "chr" needs to be added to create
"chr1", "chr2", "chr3". A workflow to do this transformation is on the
FAQ wiki here:
Other issues with Ensembl GTF files have been known to pop up, so these
data are not fully supported and we still recommend using UCSC
despite the missing gene_id information. But if you want to try, there
is likely some sort of work-around that you could create on your own
should a problem come up.
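
For the chromosome renaming mentioned above, here is a minimal sketch of
the same idea done outside of Galaxy. It is not the FAQ workflow itself,
only an illustration: the function name add_chr_prefix is made up, and
the example GTF line is illustrative (only the gene ID and coordinates
come from this thread). It assumes a tab-delimited GTF with the sequence
name in the first column.

```python
def add_chr_prefix(gtf_lines, prefix="chr"):
    """Yield GTF lines with `prefix` added to the sequence name (column 1)."""
    for line in gtf_lines:
        # Leave comment/header lines and blank lines untouched.
        if line.startswith("#") or not line.strip():
            yield line
            continue
        seqname, sep, rest = line.partition("\t")
        # Skip names that already carry the prefix, so re-running is harmless.
        if not seqname.startswith(prefix):
            seqname = prefix + seqname
        yield seqname + sep + rest


# Illustrative input only: one header line and one Ensembl-style record.
example = [
    "#!illustrative header line\n",
    '18\tprotein_coding\texon\t3507954\t3516402\t.\t+\t.\t'
    'gene_id "ENSMUSG00000024232";\n',
]
converted = list(add_chr_prefix(example))
print(converted[1].split("\t")[0])
```

A plain prefix will probably not be enough for every sequence: the
mitochondrial genome, for instance, is "MT" in Ensembl but "chrM" at
UCSC, so names like that would need their own mapping.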
Hopefully this helps,
Jen
On 8/2/11 6:30 PM, yao chen wrote:
Dear all,
I have a similar problem when using Cufflinks in Galaxy (net version).
If I don't select a reference annotation, I can get the FPKM values,
but since there is no reference, I cannot get the transcript or gene name.
It looks like this:
test_id gene_id gene locus sample_1 sample_2 status value_1 value_2 ln(fold_change) test_stat p_value q_value significant
TCONS_00000002 XLOC_000025 - chr1:33860011-33860048 q1 q2 NOTEST 1.794e+06 0 -1.79769e+308 -1.79769e+308 0.0188163 1 no
There is no gene_id. However, if I use the reference annotation
downloaded from Ensembl, I can get the gene_ids, but their FPKM values
are all "0":
tracking_id class_code nearest_ref_id gene_id gene_short_name tss_id locus length coverage status FPKM FPKM_conf_lo FPKM_conf_hi
ENSMUSG00000024232 - - ENSMUSG00000024232 Bambi - 18:3507954-3516402 - - OK 0 0 0
ENSMUSG00000091539 - - ENSMUSG00000091539 Vmn1r238 - 18:3122454-3123465 - - OK 0 0 0
------------------
Any thoughts?
2011/8/3 Jennifer Jackson <jen@bx.psu.edu>
Hi Aleks,
Thanks for sending the data link, this helped to narrow down the
root cause of the issue.
The UCSC-sourced GTF file has the attributes gene_id and
transcript_id set to the same value (both as transcript_id). The
result of this is that each transcript is interpreted by Cufflinks
as a single gene, with no gene grouping (thus no isoforms).
We have plans to develop a work-around. This would likely involve
(for the refGene track in particular) the value in the UCSC's
primary table refGene.name2 being swapped into the refGene GTF
file's gene_id value. This would generate accurate gene-level
statistics when the file is used as input to Cufflinks. You could do
the same swap (outside of Galaxy) if you wanted to give it a try and
have the resources.
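
A rough sketch of what such a swap could look like outside of Galaxy is
below. It is only an illustration, not the planned Galaxy work-around.
The function name swap_gene_ids and the example IDs are made up, and it
assumes you have already built a lookup from each transcript ID in the
GTF (presumably refGene.name) to the matching refGene.name2 value.

```python
import re

GENE_ID_RE = re.compile(r'gene_id "[^"]*"')
TRANSCRIPT_ID_RE = re.compile(r'transcript_id "([^"]*)"')


def swap_gene_ids(gtf_lines, transcript_to_gene):
    """Yield GTF lines with gene_id replaced by the mapped gene name.

    `transcript_to_gene` maps transcript_id values to gene names.
    Lines whose transcript_id is absent from the mapping pass through as-is.
    """
    for line in gtf_lines:
        match = TRANSCRIPT_ID_RE.search(line)
        gene = transcript_to_gene.get(match.group(1)) if match else None
        if gene is None:
            yield line
            continue
        # A function replacement avoids backslash-escape surprises in names.
        yield GENE_ID_RE.sub(lambda m: 'gene_id "%s"' % gene, line, count=1)


# Hypothetical IDs, for illustration only.
mapping = {"NM_EXAMPLE1": "GENE_A"}
gtf = [
    'chr1\trefGene\texon\t100\t200\t.\t+\t.\t'
    'gene_id "NM_EXAMPLE1"; transcript_id "NM_EXAMPLE1";\n',
    'chr1\trefGene\texon\t300\t400\t.\t+\t.\t'
    'gene_id "NM_UNMAPPED"; transcript_id "NM_UNMAPPED";\n',
]
swapped = list(swap_gene_ids(gtf, mapping))
print(swapped[0].split("\t")[8])
```

With gene_id carrying the gene name, transcripts that share a name2
value would share a gene_id, which is what lets them be grouped as
isoforms of one gene.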
Very sorry for the current inconvenience,
Best,
Jen
Galaxy team
On 7/25/11 11:26 AM, Jennifer Jackson wrote:
Hello Aleks,
Chromosome names must match exactly between all input files. Also,
the SAM file and GTF file must both be sorted the same way. This FAQ
may be of interest:
http://main.g2.bx.psu.edu/u/jeremy/p/transcriptome-analysis-faq
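
One quick way to check the chromosome names yourself is sketched below.
This is only an illustration under some assumptions: the function names
are made up, the SAM file is assumed to have a header with @SQ lines,
and the example lines are illustrative rather than taken from the
shared history.

```python
def sam_reference_names(sam_lines):
    """Collect reference names from the SN: tags of @SQ header lines."""
    names = set()
    for line in sam_lines:
        if not line.startswith("@"):
            break  # the header ends where alignment records begin
        if line.startswith("@SQ"):
            for field in line.rstrip("\n").split("\t")[1:]:
                if field.startswith("SN:"):
                    names.add(field[3:])
    return names


def gtf_sequence_names(gtf_lines):
    """Collect the sequence names (column 1) used in a GTF file."""
    return {
        line.split("\t", 1)[0]
        for line in gtf_lines
        if line.strip() and not line.startswith("#")
    }


def unmatched_gtf_names(sam_lines, gtf_lines):
    """Return GTF sequence names that have no exact match in the SAM header."""
    return sorted(gtf_sequence_names(gtf_lines) - sam_reference_names(sam_lines))


# Illustrative data: a UCSC-style SAM header and an Ensembl-style GTF line.
sam = ["@HD\tVN:1.0\tSO:coordinate\n", "@SQ\tSN:chr18\tLN:1000\n"]
gtf = ['18\tprotein_coding\texon\t3507954\t3516402\t.\t+\t.\tgene_id "G1";\n']
missing = unmatched_gtf_names(sam, gtf)
print(missing)
```

Any name this reports is one the two files disagree on. A SAM file
without @SQ lines would make every GTF name show up as unmatched, so
check the header first.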
If this is still a problem, please share the history with me directly,
either using my email address or by generating the share link and
emailing it to me (only). Use "Options -> Share or Publish", not just
your session's browser URL.
Best,
Jen
Galaxy team
On 7/22/11 8:45 AM, Aleks Schein wrote:
Dear all,
I am trying to run the Cufflinks installation in Galaxy on Solexa
RNAseq samples from HeLa cells.
Running Cuffcompare, according to the manual, should produce a tmap
file listing FMI values for detected isoforms. However, my files have
only either "100" or "0" in the FMI field, and the FPKM column contains
only zeros.
Is there something wrong with my input files or parameter settings? Or
is it rather a specific issue with Galaxy's Cufflinks installation?
The data in question is available here:
http://main.g2.bx.psu.edu/u/aleks/h/guided-assemblyadvanced
Thanks,
Aleks Schein
_____________________________________________________________
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server at
http://usegalaxy.org. Please keep all replies on the list by
using "reply all" in your mail client. For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:
http://lists.bx.psu.edu/listinfo/galaxy-dev
To manage your subscriptions to this and other Galaxy lists,
please use the interface at:
http://lists.bx.psu.edu/
--
Jennifer Jackson
http://usegalaxy.org
http://galaxyproject.org/Support