Automated deployment of Galaxy associated data files
Galaxy folks;

We recently moved our local Galaxy instance to a new server, and in the process developed an automated script to install and integrate associated data libraries and programs. I thought this might be of interest to other developers on the list; it is available at:

http://github.com/chapmanb/bcbb/blob/master/galaxy/galaxy_fabfile.py

This uses Fabric (http://docs.fabfile.org) to:

- Install third-party tools, focused around next-gen analysis: bowtie, bwa, maq, samtools, fastx-toolkit, UCSC programs.
- Download and index genome builds. Classes are provided to handle UCSC, Ensembl and NCBI downloads.
- Update associated tool-data files with links to the builds and indexes.

It works incrementally, skipping programs and libraries that are already installed, so you can use it to keep data files synchronized across multiple installations. We do this internally with a test and a deployment Galaxy, and find it useful to have the exact same genome builds on all of our machines for running off-Galaxy alignments.

Happy to hear thoughts and suggestions. My long-term goal is to get this to the point where it could be used alongside the existing Galaxy egg scrambling automation to automate deploying Galaxy on a bare-bones machine, like Amazon EC2 instances.

Hope others find this useful,
Brad
Brad Chapman
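
As a rough illustration of the incremental behaviour described in the message (skip what is already installed, index only missing genome builds, and add tool-data entries only once), here is a minimal sketch in plain Python. It is not the code from galaxy_fabfile.py and does not use Fabric. The function names, the `indexer` callback, the `demo.loc` file name and the two-column `build<TAB>path` layout are all assumptions made for the example; real Galaxy .loc files have tool-specific column layouts.

```python
import os
import shutil
import tempfile


def needs_install(program):
    """Return True if `program` cannot be found on the PATH."""
    return shutil.which(program) is None


def ensure_index(genome_dir, build, indexer):
    """Create the index for `build` only if its directory is missing.

    `indexer` is a hypothetical callback that writes index files into
    the directory it is given. Returns (index_dir, was_built).
    """
    index_dir = os.path.join(genome_dir, build)
    if os.path.isdir(index_dir):
        return index_dir, False  # already present, skip the expensive step
    os.makedirs(index_dir)
    indexer(index_dir)
    return index_dir, True


def update_loc_file(loc_path, build, index_dir):
    """Append an assumed `build<TAB>path` entry unless it is already listed."""
    entry = "%s\t%s" % (build, index_dir)
    existing = []
    if os.path.exists(loc_path):
        with open(loc_path) as handle:
            existing = [line.rstrip("\n") for line in handle]
    if entry in existing:
        return False
    with open(loc_path, "a") as handle:
        handle.write(entry + "\n")
    return True


if __name__ == "__main__":
    work_dir = tempfile.mkdtemp()
    loc_path = os.path.join(work_dir, "demo.loc")
    # Running the same steps twice shows that the second pass is a no-op.
    for attempt in (1, 2):
        index_dir, built = ensure_index(work_dir, "hg19", lambda d: None)
        added = update_loc_file(loc_path, "hg19", index_dir)
        print(attempt, built, added)
```

Because each step first checks whether its result already exists, the same script can be rerun safely on several machines, which is what makes this style suitable for keeping installations synchronized.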