Because it is sourced from UCSC, the GRCh37 genome is available in
Galaxy as "hg19". The full name is:
Human Feb. 2009 (GRCh37/hg19) (hg19)
This aligns with how Ensembl also understands the content:
For RNA-seq analysis (and sometimes other types of analysis) you may need
to adjust the chromosome names in your other input data to match the UCSC format.
This is explained in the RNA-seq FAQ:
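As a rough illustration of the renaming step (a sketch, not an official Galaxy tool), the Python below rewrites the first column of a GTF file from Ensembl-style names ("1", "X", "MT") to UCSC-style names ("chr1", "chrX", "chrM"). The function names are made up for this example. It only renames the primary chromosomes; any other sequence names, such as unplaced scaffolds, are left untouched because their UCSC names do not follow the simple "chr" prefix rule and need to be checked separately.

```python
# Primary GRCh37 chromosomes as named by Ensembl: 1..22, X, Y (MT handled below).
PRIMARY = {str(n) for n in range(1, 23)} | {"X", "Y"}


def ensembl_to_ucsc(name):
    """Return the UCSC-style name for a primary chromosome.

    Names that are not primary chromosomes are returned unchanged.
    """
    if name == "MT":
        return "chrM"  # UCSC calls the mitochondrial sequence chrM
    if name in PRIMARY:
        return "chr" + name
    return name


def convert_gtf(src, dst):
    """Copy GTF lines from src to dst, renaming the sequence column."""
    for line in src:
        # Pass comment/header lines and blank lines through unchanged.
        if line.startswith("#") or not line.strip():
            dst.write(line)
            continue
        fields = line.split("\t")
        fields[0] = ensembl_to_ucsc(fields[0])
        dst.write("\t".join(fields))


# Example use (file names are placeholders):
#   with open("ensembl.gtf") as src, open("ucsc_names.gtf", "w") as dst:
#       convert_gtf(src, dst)
```

The converted file can then be uploaded to Galaxy and assigned to the hg19 database.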
The data in the FTP link you provided is annotation data. To use
annotation data with the RNA-seq pipeline, a GTF file would be a good
format. The RNA-seq tool pages have a link out to Ensembl, but any
source based on the same genome is OK (UCSC, etc.).
On 4/2/12 4:14 AM, Karthik Srinivasan wrote:
How do I run Tophat and RNA-Seq analysis using the GRCH37- embl 66
genome? I noticed there is no input for this genome version.
Can I construct a reference genome from the following embl format source
and map it against my RNAseq data?
Karthik Srinivasan | Senior Application Engineer