Using a reference annotation combines what is known with what is novel
in your experiment. The "Common uses" section of this web page explains
this concept: http://cufflinks.cbcb.umd.edu/tutorial.html
The UCSC Known Genes track incorporates all of RefSeq Genes along with
other sources, which gives it an advantage. But it is created at a
particular point in time (it was updated recently; check UCSC for the
exact date of any track's last update). Tracks from GenBank are updated
daily and include RefSeq Genes, which gives them a different advantage
as time passes. One or both tracks may or may not be available,
depending on the target genome.
Another choice is the iGenomes builds for UCSC, available at the
Cufflinks web site. The iGenomes 'genes.gtf' files have specific
attributes populated that activate the full functionality of Cufflinks
and related tools. These are: gene_id, transcript_id, gene_name, p_id,
and tss_id. For an example of how these are used, see the Cuffdiff
usage documentation. The iGenomes data are a popular choice and were
recently updated.
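
To check whether a GTF from another source carries the same attributes,
a short script can scan the ninth (attribute) column. This is only a
sketch: the function names are made up for illustration, and it assumes
the usual GTF layout of tab-separated columns with attributes written as
key "value"; pairs. It reports attributes that never appear anywhere in
the file, since not every record is expected to carry every attribute.

```python
import re

# Attributes listed above as enabling full Cufflinks/Cuffdiff functionality.
REQUIRED = {"gene_id", "transcript_id", "gene_name", "p_id", "tss_id"}

# Matches one attribute pair of the form: key "value"
ATTR_RE = re.compile(r'(\w+) "([^"]*)"')


def parse_attributes(field):
    """Turn a GTF attribute column into a dict of key -> value."""
    return dict(ATTR_RE.findall(field))


def missing_attributes(gtf_lines):
    """Return the REQUIRED keys that never appear in any record."""
    seen = set()
    for line in gtf_lines:
        # Skip blank lines and comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a well-formed GTF record
        seen.update(parse_attributes(cols[8]))
    return REQUIRED - seen
```

To run it on a real file, pass an open file handle, for example
missing_attributes(open("genes.gtf")). An empty set means all five
attributes occur at least once in the file.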
Note: mm9 'genes.gtf' is in Shared Data Libraries on Galaxy main.
Others are welcome to add to the post to share other good sources for
GTF datasets that include the attributes this tool package can utilize.
On 7/16/12 8:33 AM, Irene Bassano wrote:
How do we know when to select a reference annotation when running Cufflinks? And if so,
which one from UCSC should we choose: refgene or knowngenes?