Thanks Nate! What types of plans do you have for multiple clusters, and do you have a committed timeline?

Since this forum post, I have spoken with a few other Galaxy users and have been doing some poking around. I did already read the link you provided, thanks much. Unfortunately, it was not clear from the doc whether multiple tool runners could be defined and, if so, how a tool runner was selected. Your response below confirms what I was figuring out, thanks! (I.e., you can have only one runner per tool and must pin the tool to a cluster, which is even harder for SGE since the connection info is environment-variable based.)

We are exploring writing our own job runner that might front multiple clusters and dispatch based on some simple rules. By the way, I have found the following website useful as I am new to SGE: http://arc.liv.ac.uk/SGE/howto/

The Galaxy architecture web page states that the job runners are extensible, and I am trying to understand the code pathways. I see that we would need to provide a subclass of BaseJobRunner and drop our job runner Python module into lib/galaxy/jobs/runners. Where does the logic sit that calls into the appropriate job runner based on the configuration? Is there documentation with guidance on how to implement our own runners?

Finally, one other idea to discuss is having multiple Galaxy installations (one per cluster) with a shared database and file storage. I am wondering whether this would be supported or has been done. Would there be potential for data corruption if multiple Galaxy instances, each dispatching jobs to its local cluster, were updating the same database (for example, risks of duplicate Galaxy IDs, dataset IDs, etc.)?

Thanks again,
Ann

On Sep 28, 2011, at 9:29 AM, Nate Coraor wrote:
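The "dispatch based on some simple rules" idea above could be prototyped as a small rule table that maps a tool to a runner URL, which a custom runner could consult before submitting. This is only an illustrative sketch; the cluster names, URLs, and function names are all made up and are not part of the Galaxy API.

```python
# Hypothetical rule-based cluster selection for a custom job runner.
# All names here (CLUSTER_URLS, HEAVY_TOOLS, select_cluster, ...) are
# illustrative, not actual Galaxy code.

CLUSTER_URLS = {
    "clusterA": "drmaa://clusterA/",
    "clusterB": "drmaa://clusterB/",
}

# Simple static rule: route known heavy tools to clusterB,
# everything else to the default cluster.
HEAVY_TOOLS = {"bowtie", "bwa"}

def select_cluster(tool_id, default="clusterA"):
    """Pick a cluster name for a tool using simple static rules."""
    if tool_id in HEAVY_TOOLS:
        return "clusterB"
    return default

def runner_url_for(tool_id):
    """Resolve the runner URL a custom dispatcher would submit to."""
    return CLUSTER_URLS[select_cluster(tool_id)]
```

A real runner would also need to hold a DRMAA session (or submit-host credentials) per cluster, which is where the SGE environment-variable issue discussed below comes in.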
Ann Black wrote:
Hello -
I am working on standing up our own Galaxy installation. We would like to have Galaxy front multiple clusters, and I have some questions I was hoping someone could help with.
1) From reading other forum posts on this subject, it seems I minimally need to do the following; is this correct? A) Have the Galaxy server (with SGE) register as a job-submitting host with the head node of each cluster. B) Configure multiple tool runners for each tool, one per remote cluster?
2) When Galaxy submits a job, how would a backend remote cluster be selected? When running workflows, would the same cluster be used to run the entire workflow, or could the workflow span remote clusters?
3) I am trying to understand some of the source code, where is the logic that would then dispatch the job and select a job runner to use?
4) Is there other advice or steps needed in order to get Galaxy to front multiple remote clusters?
Hi Ann,
This is all split per tool; there is no way to have a single tool run on more than one cluster. We're hoping to expand our cluster load-balancing support within the next year, however.
The method for setting the cluster options for a tool can be found at the bottom of the cluster wiki page:
http://wiki.g2.bx.psu.edu/Admin/Config/Performance/Cluster
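For reference, the per-tool pinning described there is done in the Galaxy config file. The excerpt below is a hedged illustration of that style of configuration; the tool IDs and runner URLs are placeholders, so check the wiki page above for the exact syntax your Galaxy version expects.

```ini
# Hypothetical excerpt from universe_wsgi.ini (tool IDs and queue
# names are placeholders, not recommendations).
[app:main]
# Runner used by any tool without its own entry below:
default_cluster_job_runner = drmaa:///

[galaxy:tool_runners]
# Pin individual tools to a specific runner URL:
bowtie_wrapper = drmaa://-q long.q/
bwa_wrapper = drmaa://-q short.q/
```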
With SGE this could be a bit tricky as the SGE cell to use is pulled from the environment. It might be possible to make copies of the drmaa runner (lib/galaxy/jobs/runners/drmaa.py) and set SGE_ROOT as the runner starts up, but changing it as each runner starts may break runners which have already started, so this would need some testing.
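The suggestion above could look roughly like the sketch below: a copied runner that pins SGE_ROOT (and SGE_CELL) in the environment before the drmaa library is imported. The class name, paths, and cell name are all illustrative assumptions, not actual Galaxy code, and the caveat Nate raises is real: os.environ is process-wide, so a runner started later can clobber one already running.

```python
import os

# Sketch of a per-cluster copy of the drmaa runner.  In a real copy
# this class would subclass Galaxy's BaseJobRunner; here it only
# demonstrates setting the SGE environment at startup.
class ClusterADRMAAJobRunner:
    # Illustrative paths; each copied runner would hard-code its own.
    SGE_ROOT = "/opt/sge/clusterA"
    SGE_CELL = "default"

    def __init__(self, app=None):
        # Set the environment BEFORE the drmaa library initializes,
        # since SGE's drmaa reads SGE_ROOT/SGE_CELL at import time.
        # Caveat: this mutates process-wide state, so starting a
        # second runner with different values may break this one.
        os.environ["SGE_ROOT"] = self.SGE_ROOT
        os.environ["SGE_CELL"] = self.SGE_CELL
        # import drmaa  # deferred until after the environment is set
```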
--nate
Thanks so much,
Ann