Additional genomes will be specially sorted, indexed, and added to the
GATK tool suite as it moves out of beta status. Hg19 is short-listed for
addition in the near term.
We do take requests to have genomes added to tools and consider these
when ranking our priorities. Which genome did you want to use?
One small warning when using a custom reference genome with this
particular tool set - be sure to visit the GATK web site links directly
to understand the sorting criteria for genomes. These can differ from
how Galaxy, UCSC, and many of the existing tools already sort or
instruct users to sort genomes or data. In short, the genome must be
sorted in the exact order that it was originally released, but even this
can be slightly confusing, especially if working with a non-human genome
as there are few examples. Still, the documentation can help and tools
are easily tested (if the sorting is wrong, the tool will fail and let
you know).
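
If it helps, the ordering check can be done locally before uploading a
custom reference. The following is only a rough sketch in Python, not
anything GATK or Galaxy provides: the function names are made up for
illustration, and the expected contig order is a placeholder that you
would need to take from the GATK documentation for your genome. It reads
the FASTA header lines in order and reports the first position where the
file disagrees with the expected order.

```python
from typing import Iterable, List, Optional, Tuple


def fasta_contig_order(lines: Iterable[str]) -> List[str]:
    """Return contig names in the order their '>' header lines appear."""
    names = []
    for line in lines:
        if line.startswith(">"):
            # Take the first whitespace-delimited token after '>' as the name.
            fields = line[1:].split()
            names.append(fields[0] if fields else "")
    return names


def first_order_mismatch(
    actual: List[str], expected: List[str]
) -> Optional[Tuple[int, Optional[str], Optional[str]]]:
    """Return (index, actual_name, expected_name) at the first difference,
    or None if both lists are identical. A missing entry is reported as None."""
    for i in range(max(len(actual), len(expected))):
        a = actual[i] if i < len(actual) else None
        e = expected[i] if i < len(expected) else None
        if a != e:
            return (i, a, e)
    return None


# Hypothetical data: a tiny FASTA whose contigs were sorted alphabetically.
example_fasta = [">chr1 first contig", "ACGT", ">chr10", "ACGT", ">chr2", "ACGT"]
# Placeholder order only; the real list must come from the GATK documentation.
expected_order = ["chr1", "chr2", "chr10"]

actual_order = fasta_contig_order(example_fasta)
mismatch = first_order_mismatch(actual_order, expected_order)
print(mismatch)
```

For a real file, an open file handle can be passed in place of the list,
for example fasta_contig_order(open("my_genome.fa")), where the file
name is again just a stand-in.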
If others have requests for GATK native genomes, they are also welcome
to reply. In general, key model organisms would be ranked highest in
priority. We also try to get the largest genomes loaded natively first
(for purely practical reasons).
Good question, thanks!
On 6/5/12 8:01 AM, Richard Linchangco wrote:
I've been searching the lists for this type of issue and only found
one solution thus far which is the use of a custom reference. It
doesn't make sense in my situation because the reference I used was
from Galaxy itself when I mapped my data. I'm now trying to use GATK
to find SNPs but no matter what I've tried I can't get past this
issue. I'm trying to use the Count Covariates and the Unified
Genotyper but to no avail. The only issue appears to be that
"Sequences are not currently available for the specified build."
Any help would be much appreciated. Thanks