Folks,

 

First, I wanted to thank you for making the datacache available (http://wiki.galaxyproject.org/Admin/Data%20Integration; rsync://datacache.g2.bx.psu.edu). It’s a great resource.

 

However, what is the best way to stay abreast of changes to the contents of the datacache, and to understand how these indices are built?

 

We are currently upgrading to bowtie2, but I notice that the bowtie2 indices for mm9, which used to be in

                rsync://datacache.g2.bx.psu.edu/indexes/mm9/mm9*/bowtie2_index

have been removed, and only the hg19 genome has bowtie2 indices. Why only that one, and not the others?

Where are the scripts you use to make these indices, in case I want to create bowtie2 indices for other genomes?

 

So, how do I find out *why* they were removed? (Can I safely use the copy I have, or was there a problem with them?)

 

More generally, how do I understand the policies and logic behind the datacache indices, and be notified of changes, short of running my own periodic rsync/diff?
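For context, the stopgap I have in mind looks roughly like the sketch below. It assumes I save the text output of `rsync --list-only -r` against the datacache on each run and compare the two most recent snapshots. The function names and the sample paths in the comments are my own illustration, not anything the datacache provides.

```python
def parse_listing(text):
    """Map path -> (size, date, time) from `rsync --list-only` style output.

    Each entry line is assumed to look like:
        -rw-r--r--        900,000 2013/01/10 09:00:00 some/path/file.bt2
    """
    entries = {}
    for line in text.splitlines():
        parts = line.split(None, 4)
        if len(parts) != 5:
            continue  # blank lines or anything too short to be an entry
        perms, size, date, time, path = parts
        # Skip server banners and other lines that are not file entries.
        if len(perms) != 10 or perms[0] not in "-dlcbps":
            continue
        entries[path] = (size.replace(",", ""), date, time)
    return entries


def diff_listings(old_text, new_text):
    """Return (added, removed, changed) path lists between two snapshots."""
    old = parse_listing(old_text)
    new = parse_listing(new_text)
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    # "Changed" means the size or the modification timestamp differs.
    changed = sorted(p for p in set(old) & set(new) if old[p] != new[p])
    return added, removed, changed
```

This would tell me *that* the mm9 bowtie2 indices went away, but not *why*, which is the part I cannot reconstruct on my own.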

 

Finally, since I’m doing “reproducible research”, is anything planned for systematically versioning genome indices, so that I can easily tell which version of a tool (e.g., which BWA version) was used to create an index, and be sure that an index will not suddenly disappear?
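To make the request concrete, here is a minimal sketch of the kind of per-index manifest I would find useful: the genome build, the tool and its version, and a checksum for every file. The field names, the `MANIFEST.json` file name, and the function names are purely hypothetical. I am not suggesting the datacache does or should use this exact layout.

```python
import hashlib
import json
from pathlib import Path

MANIFEST_NAME = "MANIFEST.json"  # hypothetical file name


def sha256_of(path, chunk_size=1 << 20):
    """Checksum a file in chunks so large index files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(index_dir, genome_build, tool, tool_version):
    """Describe an index directory: what built it, and what it contains."""
    index_dir = Path(index_dir)
    files = {}
    for path in sorted(index_dir.rglob("*")):
        if path.is_file() and path.name != MANIFEST_NAME:
            files[path.relative_to(index_dir).as_posix()] = sha256_of(path)
    return {
        "genome_build": genome_build,
        "tool": tool,
        "tool_version": tool_version,  # supplied by the caller, not detected
        "files": files,
    }


def write_manifest(index_dir, genome_build, tool, tool_version):
    """Write the manifest next to the index files and return it."""
    manifest = build_manifest(index_dir, genome_build, tool, tool_version)
    target = Path(index_dir) / MANIFEST_NAME
    target.write_text(json.dumps(manifest, indent=2, sort_keys=True))
    return manifest
```

With something like this alongside each index, I could record the manifest in my own analysis logs and later verify that the copy I used still matches.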

 


Thanks,

Curtis

Research Associate/CTSA-Informatics Team

University of Alabama at Birmingham