Hi Enis,

Thanks for that information. Now I am getting an error: the Unified_Genotyper fails to locate GenomeAnalysisTK.jar. I discovered that gatk2 needs to be downloaded and installed separately. I have done that, but I can't figure out where the env.sh file referenced below exists. Can you point me to the correct location of that file? Or do I need to create the file and, if so, where?

Thanks,
Iry

Galaxy wrapper for GATK2

This wrapper is copyright 2013 by Björn Grüning, Jim Johnson & the Galaxy Team.

The Genome Analysis Toolkit or GATK is a software package developed at the Broad Institute to analyse next-generation resequencing data. The toolkit offers a wide variety of tools, with a primary focus on variant discovery and genotyping as well as strong emphasis on data quality assurance. Its robust architecture, powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

http://www.broadinstitute.org/gatk
http://www.broadinstitute.org/gatk/about/citing-gatk

GATK is free for academics, and requires a fee for commercial use. Please study the GATK licensing website: http://www.broadinstitute.org/gatk/about/#licensing

Installation

The recommended installation is by means of the toolshed <http://toolshed.g2.bx.psu.edu/view/iuc/gatk2>. Galaxy should be able to install samtools dependencies automatically for you. GATK2, and its new licence model, does not allow us to distribute the GATK binaries. As a consequence you need to install GATK2 on your own; please see the GATK website for more information: http://www.broadinstitute.org/gatk/download

Once you have installed GATK2, you need to edit the env.sh files that are installed together with the wrappers. You must edit the GATK2_PATH environment variable in the file:

<tool_dependency_dir>/environment_settings/GATK2_PATH/iuc/gatk2/<hash_string>/env.sh

to point to the folder where you have installed GATK2.
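
One way to find those env.sh files is to search under the tool dependency directory, since <hash_string> differs per installation. The following is a minimal Python sketch, assuming the directory layout quoted above. The function name find_gatk2_env_files is hypothetical, and the argument is whatever tool_dependency_dir is set to in your Galaxy configuration.

```python
from pathlib import Path


def find_gatk2_env_files(tool_dependency_dir):
    """Return the env.sh files found under the GATK2 environment_settings folders.

    Assumes the layout described in the wrapper README:
    <tool_dependency_dir>/environment_settings/GATK2_*/iuc/gatk2/<hash_string>/env.sh
    """
    root = Path(tool_dependency_dir) / "environment_settings"
    # <hash_string> is installation specific, so match it with a wildcard.
    # GATK2_* covers both GATK2_PATH and GATK2_SITE_OPTIONS.
    return sorted(root.glob("GATK2_*/iuc/gatk2/*/env.sh"))
```

If the function returns an empty list, the wrapper's environment settings were probably not installed under that directory, which would suggest checking the tool_dependency_dir value first.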
Optionally, you may also want to edit the GATK2_SITE_OPTIONS environment variable in the file:

<tool_dependency_dir>/environment_settings/GATK2_SITE_OPTIONS/iuc/gatk2/<hash_string>/env.sh

to deactivate the 'call home feature' of GATK with something like:

GATK2_SITE_OPTIONS='-et NO_ET -K /data/gatk2_key_file'

GATK2_SITE_OPTIONS can also be used to insert other specific options into every GATK2 wrapper at runtime, without changing the actual wrapper. Read more about the "Phone Home" problem at: http://www.broadinstitute.org/gatk/guide/article?id=1250

Optionally, you may also want to add some commands to be executed before GATK (e.g. to load modules) to the file:

<tool_dependency_dir>/gatk2/default/env.sh

Finally, you should fill in additional information about your genomes and annotations in the gatk2_picard_index.loc and gatk2_annotations.txt. You can find them in the tool-data/ Galaxy directory.

From: Enis Afgan <afgane@gmail.com>
Date: Saturday, October 4, 2014 6:10 AM
To: Iry Witham <iry.witham@jax.org>
Cc: "galaxy-dev@lists.bx.psu.edu" <galaxy-dev@lists.bx.psu.edu>
Subject: Re: [galaxy-dev] Cloudman indices installation/configuration

Hi Iry,

Try adding the following to your /mnt/galaxy/galaxy-app/tool_data_table_conf.xml, populating the referenced files (tool-data/gatk2_picard_index.loc and tool-data/gatk2_annotations.txt) as desired, and restarting Galaxy:

<!-- Location of Picard dict files valid for GATK -->
<table name="gatk2_picard_indexes" comment_char="#">
    <columns>value, dbkey, name, path</columns>
    <file path="tool-data/gatk2_picard_index.loc" />
</table>
<!-- Available of GATK references -->
<table name="gatk2_annotations" comment_char="#">
    <columns>value, name, gatk_value, tools_valid_for</columns>
    <file path="tool-data/gatk2_annotations.txt" />
</table>

Hope this gets you going.
Let us know if it doesn't,
Enis

On Fri, Oct 3, 2014 at 1:36 PM, Iry Witham <Iry.Witham@jax.org> wrote:

It looks like I need to generate the dict file for the mm10 reference as well as add the reference to the srma_index.loc. My question is where do these need to exist? Do they belong in the repo directory structure or in the primary tool-data directory? The hg19.fa, hg19.fa.fai, and hg19.dict files exist, as do the same files for mm9 and GRCh37. However, the .dict does not exist for mm10. Even though that is the case, the references do not appear in the gatk2 tools. Any ideas?

Thanks,
Iry

From: Daniel Blankenberg <dan@bx.psu.edu>
Date: Thursday, October 2, 2014 1:57 PM
To: Iry Witham <iry.witham@jax.org>
Cc: "galaxy-dev@lists.bx.psu.edu" <galaxy-dev@lists.bx.psu.edu>
Subject: Re: [galaxy-dev] Cloudman indices installation/configuration

Hi Iry,

First thing to check is that your fields are tab delimited; they appear to be spaces instead of tabs in this email, but copying and pasting into email can munge things sometimes (also “gh19.fa” is probably a typo, but that wouldn’t prevent the selection option from showing up).

Thanks for using Galaxy,
Dan

On Oct 2, 2014, at 1:49 PM, Iry Witham <Iry.Witham@jax.org> wrote:

Hi Team,

I have a new instance of galaxy cloudman running and have added tools from the toolshed to it. When I attempt to run tools like sam-to-bam or any gatk tool I am prompted for a reference genome. However, indices/references are not available for these tools.
I have added the following line to the sam_fa_indices.loc, but that did nothing:

index hg19 /mnt/galaxyIndices/genomes/Hsapiens/hg19/seq/gh19.fa

I have also added the following three lines to the gatk2_picard_index.loc:

hg19 hg19 Human (hg19) /mnt/galaxyIndices/genomes/Hsapiens/hg19/seq/hg19.fa
GRCh37 GRCh37 Human (GRCh37) /mnt/galaxyIndices/genomes/Hsapiens/GRCh37/seq/GRCh37.fa
mm10 mm10 Mouse (mm10) /mnt/galaxyIndices/genomes/Mmusculus/mm10/seq/mm10.fa

I know I have missed something, but can't seem to figure it out. Could someone point me in the right direction?

Regards,
__________________________________
Iry T. Witham
Scientific Applications Administrator
Computational Sciences Group
The Jackson Laboratory
600 Main Street
Bar Harbor, ME 04609
Phone: 207-288-6744
email: iry.witham@jax.org

The information in this email, including attachments, may be confidential and is intended solely for the addressee(s). If you believe you received this email by mistake, please notify the sender by return email as soon as possible.
___________________________________________________________
Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
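
Dan's first suggestion, checking that the .loc fields are tab-delimited, can be sketched as a small Python check. This is an illustration only and not part of Galaxy: check_loc_lines is a hypothetical helper, and the default of four columns is taken from the "value, dbkey, name, path" column list given for gatk2_picard_indexes earlier in the thread.

```python
def check_loc_lines(lines, expected_columns=4, comment_char="#"):
    """Return (line_number, field_count) for rows whose tab-separated
    field count differs from expected_columns.

    Blank lines and lines starting with comment_char are skipped.
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        row = line.rstrip("\n")
        if not row.strip() or row.startswith(comment_char):
            continue
        # Split on tabs only: spaces inside a field such as
        # "Human (hg19)" are legitimate and must not split it.
        fields = row.split("\t")
        if len(fields) != expected_columns:
            problems.append((number, len(fields)))
    return problems


# Example use against a real file (path as used in the thread):
# with open("tool-data/gatk2_picard_index.loc") as handle:
#     for number, count in check_loc_lines(handle):
#         print(f"line {number}: {count} field(s), expected 4")
```

A row typed with spaces instead of tabs is reported as having a single field, which is the situation Dan describes.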