[galaxy-dev] Contributing to genome indexes on rsync server

21 Feb 2013

Hi all;
Is there a way for community members to contribute indexes to the rsync
server? This resource is awesome and I'm working on migrating the
CloudBioLinux retrieval scripts to use this instead of the custom S3
buckets we'd set up previously:

https://github.com/chapmanb/cloudbiolinux/blob/master/cloudbio/biodata/galax...

It's great to have this as a public shared resource and I'd like to be
able to contribute back. From an initial pass, here are the things I'd
like to do:

- Include bowtie2 indexes for more genomes.

- Include novoalign indexes for a number of commonly used genomes.

- Clean up hg19 to include a full, canonically sorted build with indexes.
  Broad has a nice version prepped so GATK will be happy with it, and
  you need to stick with this ordering if you're ever going to use a
  GATK tool on it. Right now there is only a partial hg19canon (without
  the random/haplotype chromosomes) and the structure is a bit complex.
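
The ordering requirement in the last item can be checked mechanically before
uploading anything. Here is a minimal sketch, assuming each reference has a
samtools-style .fai index next to it (tab-separated, contig name in the first
column). The function names and the file names in the usage note are
hypothetical, not part of any existing script:

```python
def read_contig_order(fai_path):
    """Return contig names in the order they appear in a .fai index file."""
    with open(fai_path) as handle:
        # The first tab-separated column of each .fai line is the contig name.
        return [line.rstrip("\n").split("\t")[0] for line in handle if line.strip()]


def same_relative_order(reference_contigs, candidate_contigs):
    """True if the contigs present in both lists appear in the same order.

    Contigs found in only one list (for example random/haplotype
    chromosomes missing from a partial build) are ignored.
    """
    shared = set(reference_contigs) & set(candidate_contigs)
    ref_shared = [c for c in reference_contigs if c in shared]
    cand_shared = [c for c in candidate_contigs if c in shared]
    return ref_shared == cand_shared
```

For example, same_relative_order(read_contig_order("broad_hg19.fa.fai"),
read_contig_order("hg19canon.fa.fai")) would report whether a partial build
at least agrees with the Broad ordering on the contigs it contains.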

What's the best way to contribute these? Right now I have a lot of the
indexes on S3. For instance, the hg19 indexes are here:

https://s3.amazonaws.com/biodata/genomes/hg19-bowtie.tar.xz
https://s3.amazonaws.com/biodata/genomes/hg19-bowtie2.tar.xz
https://s3.amazonaws.com/biodata/genomes/hg19-bwa.tar.xz
https://s3.amazonaws.com/biodata/genomes/hg19-novoalign.tar.xz
https://s3.amazonaws.com/biodata/genomes/hg19-seq.tar.xz
https://s3.amazonaws.com/biodata/genomes/hg19-ucsc.tar.xz
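
These archives follow a <genome>-<index>.tar.xz naming pattern, so a retrieval
script can build the URLs rather than hard-code them. A rough sketch of that
idea follows. It is an assumption that other genomes in the bucket use the
same pattern and the same set of index types, and the helper names are made
up for illustration:

```python
import tarfile
import urllib.request

BASE_URL = "https://s3.amazonaws.com/biodata/genomes"
# Index types taken from the hg19 archives listed above.
INDEX_TYPES = ["bowtie", "bowtie2", "bwa", "novoalign", "seq", "ucsc"]


def index_urls(genome, index_types=INDEX_TYPES):
    """Build archive URLs for one genome build (hypothetical helper)."""
    return ["{0}/{1}-{2}.tar.xz".format(BASE_URL, genome, idx)
            for idx in index_types]


def fetch_and_unpack(url, out_dir="."):
    """Download one .tar.xz archive and extract it into out_dir.

    Only use this on archives from a source you trust, since extractall
    writes whatever paths the archive contains.
    """
    archive = url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, archive)
    with tarfile.open(archive, "r:xz") as tar:
        tar.extractall(out_dir)
    return archive
```

Calling index_urls("hg19") reproduces the six URLs above, and each one can
then be passed to fetch_and_unpack.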

I'm happy to format these differently or upload somewhere that would
make it easy to include. Thanks again for setting this up. I'm looking
forward to working off a shared repository of data.
Brad

Brad Chapman