This is something I am very interested in. The three parts below make sense to me. I would
be very happy to discuss further and provide any help to move this forward.
On Oct 26, 2013, at 2:43 PM, Kyle Ellrott <kellrott(a)soe.ucsc.edu> wrote:
I think one of the aspects where Galaxy is a bit soft is the ability
to do distributed tasks. The current system of split/replicate/merge tasks based on file
type is a bit limited and hard for tool developers to expand upon. Distributed computing
is a non-trival thing to implement and I think it would be a better use of our time to use
an already existing framework. And it would also mean one less API for tool writers to
have to develop for.
I was wondering if anybody has looked at Mesos ( http://mesos.apache.org/
). You can see
an overview of the Mesos architecture at
The important thing about Mesos is that it provides an API for C/C++, Java/Scala and
Python to write distributed frameworks. There are already implementations of frameworks
for common parallel programming systems such as:
- Hadoop (https://github.com/mesos/hadoop
- Spark (http://spark-project.org
And you can find example Python framework at
Integration with Galaxy would have three parts:
1) Add a system config variable to Galaxy called 'MESOS_URL' that is then passed
to tool wrappers and allows them to contact the local mesos infrastructure (assuming the
system has been configured) or pass a null if the system isn't available.
2) Write a tool runner that works as a mesos framework to executes single cpu jobs on the
3) For instances where mesos is not available at a system wide level (say they only have
access to an SGE based cluster), but the user wants to run distributed jobs, write a
wrapper that can create a mesos cluster using the existing queueing system. For example,
right now I run a Mesos system under the SGE queue system.
I'm curious to see what other people think.
Please keep all replies on the list by using "reply all"
in your mail client. To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
To search Galaxy mailing lists use the unified search at:
Ravi K Madduri
MCS, Argonne National Laboratory
Computation Institute, University of Chicago