Re: [galaxy-dev] "\${GALAXY_SLOTS:-4}" usage help

8 Apr 2016

      On Fri, Apr 8, 2016 at 12:01 PM, Poole, Richard <r.poole@ucl.ac.uk> wrote:
...
Hi Nate,
Thanks for the speedy reply (hope all is well with you)!
I use Pulsar as I need the ability to stage the cluster jobs as the
cluster has no access to the data storage on my local machines (as well as
being unable to submit directly to cluster from my machine). I will take a
careful look at your job_conf.xml as an example and go from there - thanks.
A few more specific questions:
- is it possible to set the global GALAXY_SLOTS value somewhere (for local
submissions I use pretty much the default Galaxy job_conf settings which if
I understand correctly set the number of GALAXY_SLOTS to machine cores?!).
It's only possible to directly set it for the "local" job runner. By
default, its value is 1. It can be modified with the "local_slots"
destination parameter, as shown here:

https://github.com/galaxyproject/galaxy/blob/dev/config/job_conf.xml.sample_...

The "workers" option in the <plugins> section of job_conf.xml also has an
effect on local job concurrency: this is the number of concurrent jobs that
the local runner plugin will start (the workers value means something else
entirely for all other runner plugins). Thus with the local runner plugin
and one job destination that uses the local runner plugin, the number of
cores that would be used by jobs should be at most `workers * local_slots`.

By combining "workers" and "local_slots" you can get rudimentary control
over the number of local cores allowed for jobs, but it is imperfect. For
true control, a proper DRM is needed.
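
As a worked illustration of that upper bound, here is a small shell sketch.
The values 4 and 2 are made-up examples, not defaults, and the variable
names merely mirror the job_conf.xml settings discussed above:

```shell
#!/bin/sh
# Hypothetical values; substitute whatever your job_conf.xml sets.
workers=4        # concurrent jobs started by the local runner plugin
local_slots=2    # value handed to each local job as $GALAXY_SLOTS

# Upper bound on cores in use, assuming every tool honors $GALAXY_SLOTS.
max_cores=$((workers * local_slots))
echo "at most $max_cores cores"
```

A tool that ignores $GALAXY_SLOTS and spawns its own threads can still
exceed this figure, which is part of why the control is only rudimentary.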
...
- how then would this be varied for local vs Pulsar-staged (and cluster
submitted) jobs?
With Pulsar, you can define the DRM options to use under the job manager
configured in Pulsar's app.yml. For instance, the example in the Pulsar
documentation targets an SGE cluster and would result in $GALAXY_SLOTS
being set to 8:

  http://pulsar.readthedocs.org/en/latest/job_managers.html#drmaa

You can also configure the native specification as a destination parameter
in Galaxy's job_conf.xml, if you prefer.
...
- is there an explanation of the GALAXY_SLOTS syntax somewhere e.g. what
does the ':-4' mean?
This is Bourne shell parameter substitution syntax:

  http://www.tldp.org/LDP/abs/html/parameter-substitution.html

  ${parameter-default}, ${parameter:-default}
  If parameter not set, use default.

This means "if $GALAXY_SLOTS is unset or empty, substitute the value '4'
in its place."
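
A quick way to see the behavior is to try it in a shell. This is just a
minimal sketch; the result variable names are made up for the
demonstration:

```shell
#!/bin/sh
unset GALAXY_SLOTS
unset_result="${GALAXY_SLOTS:-4}"    # unset: the default is substituted

GALAXY_SLOTS=12
set_result="${GALAXY_SLOTS:-4}"      # set: the variable's own value is used

GALAXY_SLOTS=""
empty_colon="${GALAXY_SLOTS:-4}"     # set but empty, with colon: default is used
empty_plain="${GALAXY_SLOTS-4}"      # set but empty, no colon: stays empty

echo "$unset_result $set_result $empty_colon [$empty_plain]"
# prints: 4 12 4 []
```

The last two lines show the only difference between the two forms: the
colon makes an empty value count the same as an unset one.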

In case the documentation is unclear, $GALAXY_SLOTS is a variable for use
in tool configuration files (and should probably default to "1", not "4").
When configuring a Galaxy server, you should not have to manipulate the
$GALAXY_SLOTS variable directly.
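
For illustration, a tool's command line might consume the variable roughly
like this. `my_aligner`, its `--threads` flag, and `input.fastq` are
hypothetical stand-ins rather than a real tool, and the sketch only builds
and prints the command instead of running anything. It uses a default of 1,
as suggested above:

```shell
#!/bin/sh
# Build the command a tool wrapper might run. my_aligner and --threads are
# hypothetical; the ${GALAXY_SLOTS:-1} expansion is the point of the example.
build_command() {
    printf 'my_aligner --threads %s input.fastq' "${GALAXY_SLOTS:-1}"
}

unset GALAXY_SLOTS
without_slots=$(build_command)    # nothing set by Galaxy: falls back to 1

GALAXY_SLOTS=12
with_slots=$(build_command)       # value provided for the job: 12 threads

echo "$without_slots"
echo "$with_slots"
```

The same tool definition then works unchanged on a 4-core workstation and
a 12-core cluster node, since the thread count comes from the job's
environment rather than from the wrapper.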

--nate
...
Thanks,
Richard
On 8 Apr 2016, at 15:05, Nate Coraor <nate@bx.psu.edu> wrote:
On Fri, Apr 8, 2016 at 9:55 AM, Poole, Richard <r.poole@ucl.ac.uk> wrote:
...
Could somebody point me to a good explanation of how to set up and use
GALAXY_SLOTS correctly on my server?
A basic explanation is good but I also make use of Pulsar to stage some
jobs on our cluster here (my machine is 4-core and the cluster I use is
12-core) so I am wondering if GALAXY_SLOTS can handle this (so I don't need
to specify exact thread numbers in e.g. tool wrappers).
Hi Richard,
Is your cluster running a distributed resource manager (DRM) like PBS,
grid engine, etc.? If yes, then $GALAXY_SLOTS is handled automatically
based on whatever options you submit to your cluster with. If you request
that a job be allocated 12 cores on a node, $GALAXY_SLOTS will be set to 12
and any tools which respect $GALAXY_SLOTS (which include all of the
multicore devteam and IUC tools) will use 12 cores accordingly.
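
To make "handled automatically" a bit more concrete: DRMs typically export
the size of the allocation into the job's environment, and a slot count can
be derived from that. The following is only a simplified sketch of the
idea, not Galaxy's actual job script. `detect_slots` is a hypothetical
helper, and it checks just two variables, Slurm's SLURM_CPUS_ON_NODE and
Grid Engine's NSLOTS:

```shell
#!/bin/sh
# Hypothetical helper: derive a slot count from what the DRM exported.
detect_slots() {
    if [ -n "${SLURM_CPUS_ON_NODE:-}" ]; then
        echo "$SLURM_CPUS_ON_NODE"    # Slurm: CPUs available on this node
    elif [ -n "${NSLOTS:-}" ]; then
        echo "$NSLOTS"                # Grid Engine: slots granted to the job
    else
        echo 1                        # no DRM information: assume one core
    fi
}

# Simulate a Grid Engine job that was granted 12 slots.
unset SLURM_CPUS_ON_NODE
NSLOTS=12
GALAXY_SLOTS=$(detect_slots)
echo "GALAXY_SLOTS=$GALAXY_SLOTS"
```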
It is important to only submit tools which can use multiple cores with the
multicore option, otherwise you may allocate 12 cores for a tool which will
only use 1, wasting resources. Here is the job configuration file we use
for usegalaxy.org which shows how to map multicore tools to multicore
destinations running the Slurm DRM:
https://github.com/galaxyproject/usegalaxy-playbook/blob/master/templates/ga...
For example, `bowtie2` (line 201) runs on the `slurm_multi` destination
(126) via the `dynamic_local_stampede_select_dynamic_walltime` dynamic
destination (defined in another file, but the details are not relevant; in
your case you can map directly from a tool to a multicore destination
defined in job_conf.xml).
If your cluster is running a DRM, you most likely do not need to run
Pulsar (Galaxy has native support for pretty much all commonly used DRMs)
unless you need the ability to stage files to/from the cluster or do not
have direct submit access to the cluster from the Galaxy server.
--nate
...
Thanks,
Richard