Jennifer,

What's the status of bowtie2/mm9 index on PSU main? When I select tophat2, it offers me mm9 as a choice for built-in indexes. However, when the job runs, I get the following error, indicating the bowtie2/mm9 indexes are missing (below).

Any insight into whether this is expected, or what the ETA is until the index would be installed, would be great. I'm trying to reproduce work on PSU I ran on my local galaxy, so that we can link to it for supplemental materials for a paper.

Thanks,
Curtis

PS - I clicked the submit bug button a few days ago, but haven't received a response yet.

Fatal error: Tool execution failed
[2013-10-29 10:13:27] Beginning TopHat run (v2.0.9)
-----------------------------------------------
[2013-10-29 10:13:27] Checking for Bowtie
Bowtie version: 2.1.0.0
[2013-10-29 10:13:27] Checking for Samtools
Samtools version: 0.1.18.0
[2013-10-29 10:13:27] Checking for Bowtie index files (genome)..
Error: Could not find Bowtie 2 index files (/galaxy/data/mm9/mm9full/bowtie2_index/mm9full.*.bt2)

From: Jennifer Jackson [mailto:jen@bx.psu.edu]
Sent: Friday, September 20, 2013 4:00 PM
To: Curtis Hendrickson (Campus)
Subject: Re: [galaxy-dev] datacache & bowtie2 for mm9 ?

Thanks Curtis,

I am actually working to try to get mm9 out there right now. No promises, but is just one (well, three, including variants)! If technical is a go, then will do it. Ideally others soonish. We'll see.

The last news brief has help for the Data manager, it may be that you need to do some config changes to get it going. I am certainly no expert - this is Dan's and under active development - but is where I would start.

Jen

On 9/20/13 1:25 PM, Curtis Hendrickson (Campus) wrote:

Thanks for the rapid reply! I have some questions and comments, but need to read up on Data Managers (that admin page seems non-functional in our local galaxy, despite being on latest code) first.

Regards,
Curtis

From: Jennifer Jackson [mailto:jen@bx.psu.edu]
Sent: Friday, September 20, 2013 2:34 PM
To: Curtis Hendrickson (Campus)
Cc: galaxy-dev@bx.psu.edu
Subject: Re: [galaxy-dev] datacache & bowtie2 for mm9 ?

Hello Curtis,

The datacache was originally pointed to the data staging area and is now pointed to the data published area. The difference is that the published area contains data and location (.loc) files that are in sync and have completed final testing.

It is your choice about whether to use the staged-only data - it depends how risk tolerant your project is and if you plan on testing. But, that said, I think it is almost certainly fine or our team wouldn't have staged it yet. A vanishingly small number of datasets are pulled back once they make it to staging, and this is why we were comfortable pointing datacache there in the first place (we were unable to point to the published area at first, but wanted to make the data available ASAP).

Going forward - I can let you know that these indexes are very easy to create: one command-line execution, then add one line to the associated .loc file. Instructions are here, see "Bowtie and Tophat":
http://wiki.galaxyproject.org/Admin/NGS%20Local%20Setup

For one or a few genomes, this is not a problem. For hundreds of genomes with variants, it can become tedious even with helper tools, and in our case the processing interacted with disk that was undergoing changes (as we have been working on system configuration most of the summer). Also, with the Data Manager now available, creating batch indexes for use via rsync became lower priority. Even so, I would expect more indexes to be fully published once the final configuration is in place, as many are already staged or close to being staged (watch the yellow banner on Main).

Hopefully this helps to explain the data, guides you to making an informed decision, and aids with creating your own indexes as needed,

Thanks!
Jen
Galaxy team

On 9/18/13 1:04 PM, Curtis Hendrickson (Campus) wrote:

Folks,

First, I wanted to thank you for making the datacache available (http://wiki.galaxyproject.org/Admin/Data%20Integration; rsync://datacache.g2.bx.psu.edu). It's a great resource.

However, what is the best way to stay abreast of changes to what's in datacache, and understand how these indexes are computed? We are currently upgrading to bowtie2, but I notice that the bowtie2 indices for mm9, which used to be in rsync://datacache.g2.bx.psu.edu/indexes/mm9/mm9*/bowtie2_index, have been removed, and only the hg19 genome has bowtie2 indices. Why only that one, and not the others? Where are the scripts you use to make these indices, in case I want to create bowtie2 indices for other

So, how do I find out *why* they were removed? (Can I safely use the copy I have, or was there a problem with them?)

More generally, how do I understand the policies and logic behind the datacache indices, and be notified of changes, short of running my own periodic rsync/diff?

Finally, since I'm doing "reproducible research", is anything planned for systematically versioning genome indices, so I can easily tell what version of a system (i.e., what BWA version) was used to create the index, and be sure that an index will not suddenly disappear?

Thanks,
Curtis
Research Associate/CTSA-Informatics Team
University of Alabama at Birmingham

___________________________________________________________
Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at:
http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
http://galaxyproject.org/search/mailinglists/

--
Jennifer Hillman-Jackson
http://galaxyproject.org
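
For anyone hitting the same "Could not find Bowtie 2 index files" error on a local install, here is a minimal sketch of a pre-flight check that could be run before submitting a job. It is not part of Galaxy or TopHat. The function name `missing_bowtie2_files` is hypothetical, and the suffix list assumes a standard (small) index as written by bowtie2-build; the large-index `.bt2l` variant is not handled. The base path in the example is the one from the error message in this thread.

```python
from pathlib import Path

# Assumption: a standard (small) Bowtie 2 index consists of these six files.
# Large indexes use a ".bt2l" extension instead and are not covered here.
BT2_SUFFIXES = (".1.bt2", ".2.bt2", ".3.bt2", ".4.bt2", ".rev.1.bt2", ".rev.2.bt2")


def missing_bowtie2_files(index_base):
    """Return the expected index files that do not exist for index_base.

    index_base is the path prefix shared by all index files, i.e. the
    value a .loc entry would point at (hypothetical helper, not a Galaxy API).
    """
    base = str(index_base)
    return [base + suffix for suffix in BT2_SUFFIXES
            if not Path(base + suffix).is_file()]


if __name__ == "__main__":
    # Base path copied from the error message quoted in this thread.
    base = "/galaxy/data/mm9/mm9full/bowtie2_index/mm9full"
    missing = missing_bowtie2_files(base)
    if missing:
        print("Missing Bowtie 2 index files:")
        for name in missing:
            print("  " + name)
    else:
        print("All Bowtie 2 index files present for " + base)
```

An empty list means every expected file is present under that prefix. A non-empty list suggests the .loc entry points at an index that was never built or has since been removed.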