script to download genomes
Hello,

I'm working on a simple Python script that auto-magically creates all the indexes and does the pre-processing necessary to add a genome to Galaxy. I was wondering if anyone has a script to download all the genomes that are presented to you in the default genome drop-down box?

Also, I'm not too confident about what "all" preprocessing actually is (with respect to the default configuration). So far I create the indexes for the following tools and add the appropriate entry to each tool's tool-data config file:

sam index
bowtie, base/color
bwa, base/color
bfast, base (any advice on how to do color?)
blast (n and p)

Am I missing any tool that requires preprocessing?

Thanks,
James
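For reference, the indexing step described above might be sketched as follows. This is only an illustration, not James's actual script: the function name `index_commands`, the output layout, and the choice of base-space indexes only are assumptions, and the commands are built as argument lists rather than executed. The color-space and bfast cases are left out on purpose.

```python
from pathlib import Path


def index_commands(fasta, out_dir, build):
    """Return {tool: argv} for indexing one genome FASTA file.

    Nothing is executed here; the caller decides how to run the commands.
    `out_dir` and `build` (e.g. "hg19") are hypothetical naming choices.
    """
    fasta = str(fasta)
    prefix = str(Path(out_dir) / build)
    return {
        # samtools writes <fasta>.fai next to the input file
        "sam": ["samtools", "faidx", fasta],
        # bowtie (base space) index files are written with this prefix
        "bowtie": ["bowtie-build", fasta, prefix],
        # bwtsw is the construction algorithm intended for large genomes
        "bwa": ["bwa", "index", "-a", "bwtsw", "-p", prefix, fasta],
        # nucleotide BLAST database
        "blastn": ["makeblastdb", "-in", fasta, "-dbtype", "nucl", "-out", prefix],
    }
```

Each list can be passed to `subprocess.check_call` once the corresponding tool is on the PATH. In practice each tool would probably get its own subdirectory so the index files do not collide.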
James;
> I'm working on a simple Python script that auto-magically creates all the indexes and does the pre-processing necessary to add a genome to Galaxy. I was wondering if anyone has a script to download all the genomes that are presented to you in the default genome drop-down box?
Really great that you are interested in this area. We've been working on this problem as part of the cloudbiolinux project:

https://github.com/chapmanb/cloudbiolinux

and have a fabric install file:

https://github.com/chapmanb/cloudbiolinux/blob/master/data_fabfile.py

that pulls the genomes and index types specified by a configuration file:

https://github.com/chapmanb/cloudbiolinux/blob/master/config/biodata.yaml

This has two ways of installing the indexes. The first, install_data, will download the genomes from UCSC/Ensembl/NCBI and then build the indexes locally. The second, install_data_s3, uses pre-prepared and indexed genomes stored in Amazon S3 buckets. This lets you fetch and unpack ready-to-go genomes without the overhead of indexing, and it works from anywhere, whether local or Amazon cloud machines. After download, both methods update the appropriate *.loc files to integrate with Galaxy.

As you mentioned, there are a lot of different targets and genomes, and we're definitely interested in expanding to include them. It sounds like this approach is similar to what you were working on. If you're interested, we'd be very happy to have you involved. We've been working with Enis on this, but it's not an official Galaxy team project; rather, it's a community supplement to the great install automation they already provide.

Hope this helps,
Brad
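The last step mentioned above, updating a *.loc file after an index is built, could look roughly like the sketch below. This is not the code from data_fabfile.py: the helper name `add_loc_entry` is hypothetical, and the column layout differs between tools, so the fields must match what the target .loc file expects. The only assumption made here is that .loc files are plain tab-separated text with one entry per line.

```python
from pathlib import Path


def add_loc_entry(loc_path, fields):
    """Append one tab-separated row to a .loc file unless it already exists.

    Returns True if the row was written, False if it was already present.
    """
    loc_path = Path(loc_path)
    line = "\t".join(fields)
    text = loc_path.read_text() if loc_path.exists() else ""
    if line in text.splitlines():
        return False  # avoid duplicate entries on repeated runs
    with loc_path.open("a") as handle:
        # make sure we start on a fresh line if the file lacks a final newline
        if text and not text.endswith("\n"):
            handle.write("\n")
        handle.write(line + "\n")
    return True
```

Making the call idempotent means the install can be re-run safely for a genome that is already registered.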
participants (2)
- Brad Chapman
- James Lindsay