Andrey Tovchigrechko wrote:
We have decided to use a local Galaxy install as a front-end to our
metagenomic binning tool MGTAXA ( http://andreyto.github.com/mgtaxa/
I need some guidance from the Galaxy developers on the best way to set up the following:
1) The server will be on a DMZ, with no direct access to the internal
network, where the computes will be running on a local SGE cluster. The
best that our IT allowed is for a script on the internal cluster to
monitor a directory on the web server, pull inputs/tasks from there when
they appear, and put the results back. My current idea is to have the Galaxy
"local runner" start "proxy jobs": each proxy job is a local script
that does "put the input into the watched dir; until results appear in the
watched dir, sleep(30) and loop; then finish". In other words, Galaxy thinks
that it is running jobs locally, but in fact those jobs are just waiting
for the remote results to come back. Does that look like a sane
solution? How will it scale on the Galaxy side? E.g. how many such
simultaneous tasks can the local runner support? Any anticipated gotchas?
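The proxy-job loop described above might look like the following (a minimal sketch; the directory layout, the "done" flag convention, and the function name are all assumptions, not part of Galaxy):

```python
import os
import shutil
import time

def run_proxy_job(job_id, input_path, watch_dir, poll_seconds=30):
    """Proxy job: drop the input into the watched directory, then
    block until the cluster-side script writes a 'done' flag back.
    Galaxy sees an ordinary local job; the real work runs remotely."""
    task_dir = os.path.join(watch_dir, job_id)
    input_dir = os.path.join(task_dir, "input")
    os.makedirs(input_dir)
    shutil.copy(input_path, input_dir)
    # The internal cluster script is assumed to pull input/ and
    # eventually create output/done once results are fully written.
    done_flag = os.path.join(task_dir, "output", "done")
    while not os.path.exists(done_flag):
        time.sleep(poll_seconds)
    return os.path.join(task_dir, "output")
```

Using a separate "done" flag file (written last) avoids the proxy picking up half-written results.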
This will work, but one of the problems you'll run into is that all
those jobs will be considered "running" even if they're queued in SGE,
and will tie up the local job runner while giving a false status to your
users. To prevent a backlog, though, you can increase the number of
available local runner workers; a bunch of sleeping scripts
probably won't impact performance much.
Additionally, we will also be trying to run computes on our TeraGrid
account. I was thinking that the solution above could be applied to that
scenario as well, except that the proxy job would poll qsub on
TeraGrid through ssh, or call the Globus API. Here one problem is that a job
often has to wait in a TeraGrid queue for 24 hours or so. Will my proxy
jobs on Galaxy time out/get killed by any chance?
No, jobs can be queued indefinitely.
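A proxy that polls the remote queue over ssh could be sketched like this (hypothetical helper names; it assumes passwordless ssh keys and the default SGE qstat column layout, which may differ on your TeraGrid site):

```python
import subprocess

def parse_qstat_state(qstat_output, job_id):
    """Extract the state code (e.g. 'qw', 'r') for one job from
    default SGE qstat output; return None if the job is no longer
    listed (which we take to mean it has finished)."""
    for line in qstat_output.splitlines():
        fields = line.split()
        if fields and fields[0] == job_id:
            return fields[4]  # state is the fifth column by default
    return None

def remote_job_state(login_host, job_id):
    """Run qstat on the remote login node over ssh and report the
    state of one job."""
    out = subprocess.run(
        ["ssh", login_host, "qstat"],
        capture_output=True, text=True, check=True).stdout
    return parse_qstat_state(out, job_id)
```

Keeping the parsing separate from the ssh call makes it easy to adapt when the remote site's qstat format differs.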
The alternatives are: 1) write another runner (in addition to local and
torque) - how much work would that be?
This would actually be the cleanest route, and you could probably just
take the existing sge module and strip out all of the DRMAA code.
Simply have it generate the submission script and write it to the
cluster_files_directory and collect the outputs from the same directory
as usual. But instead of submitting the job itself, the runner does not need
to do anything, since your backend process will handle submission. The loop that
monitors job status can simply check for the existence of the output
files (assuming their appearance is atomic, i.e. once they exist they have been
completely written).
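That monitoring check could be as simple as the following (a sketch; the function and parameter names are made up, not part of the existing runner code):

```python
import os

def outputs_ready(job_dir, expected_outputs):
    """A job is considered finished once every expected output file
    exists in its directory. For this to be safe, the backend script
    should write each file under a temporary name and os.rename() it
    into place: rename is atomic on POSIX filesystems, so a file
    that exists is also complete."""
    return all(os.path.exists(os.path.join(job_dir, name))
               for name in expected_outputs)
```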
Or 2) write a fake SGE Python interface
and make Galaxy think it is using a local SGE?
This is probably more work than it'd be worth.
2) What repo is best to clone, given the scope of our activity
above? We will likely need to mess a bit with the Galaxy internals, not
just the tool definition. Should we clone galaxy-central or galaxy-dist?
What workflow would you recommend for updating, submitting patches etc?
galaxy-dist would be advisable here. Ry4an Brase did a lightning talk
on Mercurial for Galaxy Admins at our recent developer conference that
explains how to update Galaxy; his slides are on our wiki here:
For patches, either email them to us on the dev list (if they're not too
big), or set up a patch queue repository in Bitbucket, and send us a
link to those patches.
I will be very grateful for answers to the above, and also to any other suggestions.
galaxy-dev mailing list