This sounds like a fun and challenging project - good luck!
The route I would recommend pursuing largely hinges on whether all 5
Galaxy instances have a shared file system and run as a single user.
If they do I would recommend implementing a Galaxy "job runner" - all
the runners bundled with Galaxy can be found in
lib/galaxy/jobs/runners/. The standard cluster runners there are
drmaa.py, pbs.py, and condor.py. drmaa.py and pbs.py demonstrate
hooking Galaxy up to a library for submitting jobs to a cluster,
while condor.py demonstrates wrapping CLI tools. Along similar lines, there
is cli.py which is something of a general framework for submitting
jobs via CLI tools and can even be used to SSH before running the
submission scripts if that is useful.
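To give a feel for the CLI-wrapping approach condor.py takes, here is a small standalone sketch of the pattern: run a submission command, then parse the external job id out of its output. Everything here is hypothetical - "arcsub" is a placeholder for whatever ARC's submission client is called, and I fake it with `echo` so the sketch actually runs; a real runner would also persist the id so job state can be polled later.

```python
import re
import subprocess

# Placeholder for a real ARC submission command such as "arcsub ...";
# `echo` fakes the scheduler's output so this sketch is runnable.
SUBMIT_CMD = ["echo", "Job submitted with jobid: gsiftp://arc.example.org/12345"]


def submit_job(cmd=SUBMIT_CMD):
    """Run the submission command and return the external job id."""
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    match = re.search(r"jobid:\s*(\S+)", out)
    if match is None:
        raise RuntimeError("could not parse job id from: %r" % out)
    return match.group(1)


if __name__ == "__main__":
    print(submit_job())
```

A real runner would do the same thing for status checks and job deletion - wrap the CLI tool, parse its output, and map the result onto Galaxy's job states.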
That approach largely depends on having a large shared cluster if you
want to submit many different tools. If you don't mind modifying the
tools themselves, you could instead move the logic for staging files and
submitting to clusters into the tools - I can send some
links to example tools that have done this.
If you don't have the shared cluster and have many different tools you
would like to manage this way, I would suggest looking at Pulsar.
It can be used to
distribute jobs to remote clusters/machines. Pulsar has the concept of
"job managers" instead of "job runners" - they have a simpler
interface that would need to be implemented for ARC, and Pulsar's
bundled managers serve as examples.
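Roughly speaking, a job manager has to answer three questions: how to launch a job, what its status is, and how to kill it. The standalone sketch below illustrates that shape - the class and method names are my own for illustration, not Pulsar's actual interface, and it spawns a local process where a real ARC manager would call the ARC client.

```python
import subprocess
import sys


class ArcLikeManager:
    """Hypothetical illustration of the launch/get_status/kill shape a
    Pulsar-style job manager exposes; not Pulsar's exact interface."""

    def __init__(self):
        self._procs = {}

    def launch(self, job_id, command_line):
        # A real ARC manager would hand the command line to the ARC
        # submission client instead of spawning a local process.
        self._procs[job_id] = subprocess.Popen(command_line)

    def get_status(self, job_id):
        return "running" if self._procs[job_id].poll() is None else "complete"

    def kill(self, job_id):
        self._procs[job_id].terminate()


manager = ArcLikeManager()
manager.launch("1", [sys.executable, "-c", "print('hello from the cluster')"])
manager._procs["1"].wait()
```

The ARC-specific work would live almost entirely inside launch/get_status/kill, which is why the manager interface is a smaller surface to implement than a full Galaxy runner.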
Pulsar has a bunch of options for staging files (file system
copies/HTTP/scp/rsync) and these can be configured on a per-path basis
for each Galaxy instance allowing you to optimize the data transfer
for your 5 setups.
Pulsar can be deployed as a RESTful web service (in this case you
could probably do one web service for all 5 instances) or by
monitoring a message queue (without some small changes you would
probably need to stand up one Pulsar server for each of the 5 Galaxy
instances in this case).
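For the RESTful deployment, the Galaxy side is wired up in job_conf.xml along these lines. This is a sketch from memory - the plugin load path, param ids, URL, and token are illustrative, so verify them against your Galaxy release and the Pulsar docs before using:

```xml
<plugins>
  <plugin id="pulsar_rest" type="runner"
          load="galaxy.jobs.runners.pulsar:PulsarRESTJobRunner"/>
</plugins>
<destinations>
  <destination id="remote_cluster" runner="pulsar_rest">
    <!-- URL of the remote Pulsar web service (hypothetical host) -->
    <param id="url">https://pulsar.example.org:8913/</param>
    <!-- should match the token configured on the Pulsar server -->
    <param id="private_token">changeme</param>
  </destination>
</destinations>
```

You would then map tools to that destination to route their jobs through Pulsar.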
I like to give the warning that Galaxy is designed for large shared
file systems - Pulsar and other distributed strategies require
more effort to deploy (and in your case will definitely require novel
development time as well).
It is probably out of scope, but I would also note that it might be
significantly easier to deploy one Galaxy instance that routes jobs
to the local clusters, let them all share one large file system, and
just provide 5 different "faces" to that Galaxy. That
probably isn't possible due to hardware/institutional politics/etc.,
but I just wanted to make sure.
Along the same lines, it is worth considering whether writing a DRMAA
layer for ARC, or plugging it into Condor somehow, might be a more
robust solution that Galaxy can leverage without locking your
development efforts into Galaxy-specific solutions.
On Mon, Jan 5, 2015 at 6:35 AM, Abdulrahman Azab <azab(a)ifi.uio.no> wrote:
Hi Galaxy developers,
In our Elixir project (http://www.elixir-europe.org/), Norway, we have five
geographically distributed Galaxy instances. Each is working on a local
cluster (mostly SLURM). We are planning to interconnect those five clusters
using the meta-scheduler ARC (http://www.nordugrid.org/arc/) to achieve load
balancing, so that a Galaxy job can be reallocated to an external cluster in
case the local cluster is saturated.
ARC manages the interconnection very well. What we need is to create a
Galaxy job handler for ARC. Is there a general template or interface for a
job handler, i.e. for defining job submission commands ... etc.?
And how do we compile this new job handler and integrate it into
the Galaxy installation?
Head engineer, ELIXIR.NO / The Genomic HyperBrowser team
Department of Informatics, University of Oslo, Boks 1072 Blindern, NO-0316
Email: azab(a)ifi.uio.no, Cell-phone: +47 46797339
Senior Lecturer in Computer Engineering
Faculty of Engineering, University of Mansoura, 35516-Mansoura, Egypt