
If hg_g1k_v37 == "1000 Genomes version of GRCh37", then it is the GRCh37 primary assembly plus a decoy sequence intended to soak up off-target reads. The chromosome coordinates are the same, but the sequences included in the packages are different. Here is the description from the 1000 Genomes site:
http://www.1000genomes.org/category/assembly
Deanna
On 6/18/12 3:30 PM, "Hiram Clawson" <hiram@soe.ucsc.edu> wrote:
I'm curious: what is this genome called 'hg_g1k_v37', and how does it correspond to NCBI GRCh37, which is identical to UCSC hg19?
--Hiram
Jennifer Jackson wrote:
UCSC does not contain the genome 'hg_g1k_v37' - the genome available from UCSC is 'hg19'.
Even though these are technically the same human release, on a practical level they have a different arrangement for some of the chromosomes. You can compare NCBI GRCh37 <http://www.ncbi.nlm.nih.gov/genome/assembly/2758/> with UCSC hg19 <http://genome.ucsc.edu> for an explanation. Reference genomes must be /exact/ in order to be used with tools - base for base. When they are exact, the identifier will be exact between Galaxy and the source (UCSC, Ensembl), or the full Build name will provide enough information to make a connection to NCBI or another source.
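As a rough illustration of what "base for base" means in practice, here is a minimal Python sketch that compares two references by sequence name, length, and checksum. The function names and toy sequences are made up for illustration; this is not part of Galaxy or GATK. It assumes plain FASTA text and ignores letter case, so soft-masking alone does not count as a difference.

```python
import hashlib

def fasta_digests(lines):
    """Map each sequence name to (length, MD5 of its uppercased bases)."""
    digests = {}
    name, md5, length = None, None, 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Close out the previous record before starting a new one.
            if name is not None:
                digests[name] = (length, md5.hexdigest())
            name = line[1:].split()[0]  # first word of the header line
            md5, length = hashlib.md5(), 0
        elif name is not None:
            seq = line.upper()  # assumption: case (soft-masking) is ignored
            md5.update(seq.encode("ascii"))
            length += len(seq)
    if name is not None:
        digests[name] = (length, md5.hexdigest())
    return digests

def compare_references(a, b):
    """Return (names only in a, names only in b, shared names whose bases differ)."""
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    differing = sorted(n for n in set(a) & set(b) if a[n] != b[n])
    return only_a, only_b, differing

# Toy data standing in for two builds (hypothetical sequences).
ref_a = fasta_digests([">chr1", "ACGT", ">chrM", "GGCC"])
ref_b = fasta_digests([">chr1", "acgt", ">chrM", "GGCA", ">decoy", "TTTT"])
print(compare_references(ref_a, ref_b))
```

Any name that appears in only one reference, or whose digest differs, marks a region where data from one build cannot be used with the other without further work. With real files you would pass an open file handle instead of a list of strings.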
Sometimes genomes are similar enough that a dataset sourced from one can be used with another, if the database attribute is changed and the data from the regions that differ is removed. This may be possible in your case; only trying will show how difficult it actually is for your analysis. The GATK pipeline is very sensitive to exact inputs, so you will need to be careful with genome database assignments and similar settings. If this is something that you are going to try, following the links on the tool forms to the GATK help pages can provide more detail about expected inputs.
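The "remove the data from the regions that differ" step could look something like the following Python sketch for a tab-separated interval file such as BED. The function name and the set of shared sequence names are hypothetical; you would have to work out for yourself which sequences are truly identical between the two builds. Changing the database attribute itself is done in the dataset's attributes in Galaxy and is not shown here.

```python
def keep_shared_regions(bed_lines, shared_names):
    """Yield interval records whose sequence name is in shared_names.

    Blank lines and comment/track/browser header lines pass through untouched.
    """
    for line in bed_lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            yield line
            continue
        chrom = line.split("\t", 1)[0]  # first column is the sequence name
        if chrom in shared_names:
            yield line

# Toy records; the shared set is an assumption made for this example only.
records = ["track name=example", "chr1\t10\t20", "chrM\t5\t9", "chr2\t1\t4"]
shared = {"chr1", "chr2"}
for rec in keep_shared_regions(records, shared):
    print(rec)
```

This only drops records on sequences outside the shared set. It does not fix coordinates or sequence names, so it helps only when the retained sequences really are identical in both builds.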
The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using "reply all" in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list:
http://lists.bx.psu.edu/listinfo/galaxy-dev
To manage your subscriptions to this and other Galaxy lists, please use the interface at: