"Best practice" for maintaining genome list in Galaxy

6 Jul 2017

Hi there,

Is there a current "best practice" for maintaining a collection of reference
genome data on a Galaxy server, i.e. filling in the dbkeys.loc and
all_fasta.loc tables and the associated data? I know of the following
discussions on this topic:

1. The data integration page:
https://galaxyproject.org/admin/data-integration/
2. The data managers page:
https://galaxyproject.org/admin/tools/data-managers/
3. This biostars answer: https://biostar.usegalaxy.org/p/7176/
4. This page on custom genomes that seems to be user / history oriented:
https://galaxyproject.org/learn/custom-genomes/
5. This page on rsyncing from the Galaxy reference data collection:
https://galaxyproject.org/admin/use-galaxy-rsync/ (a quick test gives me
an rsync error, though)
6. This guide to using the human reference data:
https://biostar.usegalaxy.org/p/14777/
7. I hear the usegalaxy.org reference collection is available via CVMFS; I
think this was discussed at a Galaxy Admins meetup at some point.

None of these is comprehensive. I have two sets of questions (in my role
as a Galaxy admin):

1. If I want to make reference genomes and common indices (such as those
for HISAT and BWA) available for, e.g., human and mouse, what is the best
way to do this?

2. If I want to make a genome for a non-model organism (e.g. M. tuberculosis
or L. calcarifer) available, what is the best way to do this? Which data
manager should or could I use?

Thanks,
Peter

Peter van Heusden
