On Tue, Apr 26, 2011 at 5:11 AM, Peter Cock <p.j.a.cock@googlemail.com> wrote:
Hi all,
So far we've been running our local Galaxy instance on a single machine, but I would like to be able to offload (some) jobs onto our local SGE cluster. I've been reading https://bitbucket.org/galaxy/galaxy-central/wiki/Config/Cluster
Unfortunately, in our setup the SGE cluster head node is a different machine from the Galaxy server, and the two do not (currently) share a file system. Within the cluster itself, the head node and the compute nodes do have a shared file system.
Therefore we will need some way of copying input data from the Galaxy server to the cluster, running the job, and once the job is done, copying the results back to the Galaxy server.
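For concreteness, this is roughly the per-job staging I imagine would be needed (just a rough sketch -- the hostname, paths and scp-based approach are all made up, and it assumes passwordless ssh/scp from the Galaxy server to the cluster head node):

#!/usr/bin/env python
# Rough sketch of per-job staging (hypothetical hostname and paths).
import subprocess

CLUSTER = "headnode.example.org"          # cluster head node (made up)
REMOTE_WORK = "/cluster/scratch/galaxy"   # shared directory on the cluster (made up)

def run(cmd):
    subprocess.check_call(cmd)

def stage_and_run(job_id, input_files, command_line, output_files):
    remote_dir = "%s/%s" % (REMOTE_WORK, job_id)
    run(["ssh", CLUSTER, "mkdir", "-p", remote_dir])
    # 1. Copy the job's input files from the Galaxy server to the cluster.
    run(["scp"] + input_files + ["%s:%s/" % (CLUSTER, remote_dir)])
    # 2. Submit the job and wait for it to finish (qsub -sync y blocks).
    run(["ssh", CLUSTER,
         "cd %s && qsub -sync y -b y -cwd %s" % (remote_dir, command_line)])
    # 3. Copy the results back to where Galaxy expects them.
    for name in output_files:
        run(["scp", "%s:%s/%s" % (CLUSTER, remote_dir, name), "."])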
The "Staged Method" on the wiki sounds relevant, but appears to be for TORQUE only (via pbs_python), not any of the other back ends (via DRMAA).
Have I overlooked anything on the "Cluster" wiki page?
Has anyone attempted anything similar, and could you offer any guidance or tips?
Hi, Peter.

You might consider setting up a separate SGE queue for the Galaxy jobs. You could then attach a prolog and an epilog script to that queue: the prolog copies the files from the Galaxy machine onto the cluster before the job runs, and the epilog copies the results back to the Galaxy machine afterwards. This assumes there is a way to map paths on one file system onto paths on the other, but for Galaxy that is probably the case (files on the Galaxy server live under the Galaxy instance, and on the cluster the jobs will probably all run as a single user, under that user's home directory). I have not done this myself, but the advantage of using prolog and epilog scripts is that the Galaxy jobs then need no special configuration -- all the staging is done transparently by SGE.

Sean
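P.S. To make that concrete, below is a minimal, untested sketch of what a combined prolog/epilog staging script might look like. It rests on several assumptions: that SGE exports the usual SGE_O_HOST, SGE_O_WORKDIR and JOB_ID variables to prolog/epilog scripts as it does to jobs, that passwordless scp works between the execution hosts and the Galaxy server, and that the Galaxy-side job working directory maps onto a directory under the cluster user's home (the script name, the ~/galaxy_jobs path and the scp approach are purely illustrative):

#!/usr/bin/env python
# stage.py -- illustrative prolog/epilog staging script (not tested).
# Configure the queue with "prolog .../stage.py pull" and "epilog .../stage.py push".
import os
import subprocess
import sys

galaxy_host = os.environ["SGE_O_HOST"]     # submit host, i.e. the Galaxy server
galaxy_dir = os.environ["SGE_O_WORKDIR"]   # job working directory on the Galaxy side
local_dir = os.path.expanduser("~/galaxy_jobs/%s" % os.environ["JOB_ID"])

if sys.argv[1] == "pull":
    # Prolog: copy the job's working directory from the Galaxy server onto
    # the cluster's shared file system before the job starts.
    subprocess.check_call(["scp", "-r", "-q",
                           "%s:%s" % (galaxy_host, galaxy_dir), local_dir])
elif sys.argv[1] == "push":
    # Epilog: copy the results back to the same directory on the Galaxy server.
    subprocess.check_call(["scp", "-r", "-q",
                           "%s/." % local_dir,
                           "%s:%s/" % (galaxy_host, galaxy_dir)])

The script would be attached to the dedicated queue via its prolog and epilog fields (qconf -mq <queue>; queue_conf(5) documents the syntax, including the user@path form if the scripts need to run as a particular user), and Galaxy would then be pointed at that queue through the DRMAA runner's native options (e.g. -q), as described on the Cluster wiki page.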