Defining $GALAXY_CPUS for use in tool wrappers
Hello all,

Re:
http://lists.bx.psu.edu/pipermail/galaxy-dev/2012-June/010153.html
http://lists.bx.psu.edu/pipermail/galaxy-dev/2012-October/011557.html

Something I raised during GCC2013, and which we also talked about via Twitter, was a Galaxy environment variable for use within tool wrappers, setting the number of threads/CPUs to use.

The idea is that you can configure a default value, and then override this per runner or per tool etc. James Taylor had suggested calling this environment variable $GALAXY_CPUS, which seems fine to me (personally I'd say threads not CPUs but I don't really mind). e.g.

<command>my_tool --threads "\$GALAXY_CPUS" --input "$input" --output "$output"</command>

Everyone I spoke to about this seemed positive about the idea.

This would/should be integrated into the various cluster back ends. For example, for SGE/OGE the number of threads is already configurable via the DRMAA settings and available as the environment variable $NSLOTS for non-MPI jobs, so my guess is all Galaxy needs to do is something like this:

$ hg diff
diff -r ce0d758bb995 lib/galaxy/jobs/runners/drmaa.py
--- a/lib/galaxy/jobs/runners/drmaa.py Tue Jul 30 12:30:30 2013 +0100
+++ b/lib/galaxy/jobs/runners/drmaa.py Tue Jul 30 16:10:40 2013 +0100
@@ -43,6 +43,7 @@
 # - execute the command
 # - take the command's exit code ($?) and write it to a file.
 drm_template = """#!/bin/sh
+export GALAXY_CPUS="$NSLOTS"
 GALAXY_LIB="%s"
 if [ "$GALAXY_LIB" != "None" ]; then
     if [ -n "$PYTHONPATH" ]; then

Is there an open Trello card for this?

Thanks,

Peter
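For tools that are launched through a Python wrapper script rather than directly from the <command> tag, the same variable could be read from the environment. The following is only a minimal sketch under stated assumptions: the variable name $GALAXY_CPUS is the one proposed in the message above, the helper names get_thread_count and build_command are hypothetical, and falling back to a single thread when the variable is unset or invalid is an assumption, not something Galaxy is known to do.

```python
import os


def get_thread_count(environ=None, var_name="GALAXY_CPUS", default=1):
    """Return a positive thread count read from the environment.

    Falls back to `default` (an assumption of this sketch) when the
    variable is missing, empty, non-numeric, or not positive.
    """
    if environ is None:
        environ = os.environ
    raw = environ.get(var_name, "")
    try:
        value = int(raw)
    except ValueError:
        return default
    return value if value > 0 else default


def build_command(threads, input_path, output_path):
    """Build the argument list for the illustrative my_tool command line."""
    return [
        "my_tool",
        "--threads", str(threads),
        "--input", input_path,
        "--output", output_path,
    ]


if __name__ == "__main__":
    # in.dat and out.dat are placeholder file names for illustration.
    print(build_command(get_thread_count(), "in.dat", "out.dat"))
```

Passing the variable name as a parameter keeps the sketch usable if the variable ends up with a different name.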
I would actually prefer we use $GALAXY_SLOTS.

-- James Taylor, Assistant Professor, Biology/CS, Emory University

On Tue, Jul 30, 2013 at 11:18 AM, Peter Cock <p.j.a.cock@googlemail.com> wrote:
> The idea is that you can configure a default value, and then override this per runner or per tool etc. James Taylor had suggested calling this environment variable $GALAXY_CPUS which seems fine to me (personally I'd say threads not CPUs but I don't really mind). e.g.
>
> <command>my_tool --threads "\$GALAXY_CPUS" --input "$input" --output "$output"</command>

___________________________________________________________
Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
On Jul 30, 2013, at 11:40 AM, Peter Cock wrote:
> On Tue, Jul 30, 2013 at 4:39 PM, James Taylor <james@jamestaylor.org> wrote:
>> I would actually prefer we use $GALAXY_SLOTS.
>
> I like that too, CPUs is a bit of a fuzzy term with multiple cores, and SLOTS has familiarity from the SGE terminology as well.

I started some work on this during a bit of free time at the GCC. With some luck it may be finished in time for the September release.

--nate
On Tue, Jul 30, 2013 at 4:44 PM, Nate Coraor <nate@bx.psu.edu> wrote:
> I started some work on this during a bit of free time at the GCC. With some luck it may be finished in time for the September release.
>
> --nate

That's encouraging - is there a public branch of this yet?

Peter
On Jul 30, 2013, at 11:47 AM, Peter Cock wrote:
> That's encouraging - is there a public branch of this yet?

No, nothing committable.
On 2013-07-30 17:18, Peter Cock wrote:
> Something I raised during the GCC2013, and we talked about via Twitter as well was a Galaxy environment variable for use within Tool Wrappers setting the number of threads/CPUs to use.
>
> The idea is that you can configure a default value, and then override this per runner or per tool etc.

Thanks Peter for pushing this idea, I totally support this proposal. In the meantime, for my tools I've been using Jim Johnson's solution from his CD-HIT wrapper:

http://toolshed.g2.bx.psu.edu/view/jjohnson/cdhit

But this requires the system administrator to modify both the tool's env.sh and job_conf.xml and keep them in sync.

> Is there an open Trello card for this?

A Trello card would be useful indeed.

Nicola
On Thu, Aug 1, 2013 at 10:27 AM, Nicola Soranzo <soranzo@crs4.it> wrote:
>> Is there an open Trello card for this?
>
> A Trello card would be useful indeed.

Better than a Trello card, we now have a pull request from John:

https://bitbucket.org/galaxy/galaxy-central/pull-request/236/job-runner-enha...

Peter
On Saturday, 12.10.2013, at 19:42 +0100, Peter Cock wrote:
> Better than a Trello card, we now have a pull request from John:
>
> https://bitbucket.org/galaxy/galaxy-central/pull-request/236/job-runner-enha...

And thanks to John it's merged! Time for testing and migrating our tools :)
On Thu, Oct 17, 2013 at 8:37 AM, Bjoern Gruening <bjoern.gruening@gmail.com> wrote:
> And thanks to John it's merged! Time for testing and migrating our tools :)

So far $GALAXY_SLOTS seems to be working nicely for me.

However, I am wondering if it would be possible to use it inside the <configfile> section? Is that run at the time of job creation on the Galaxy server (where determining the number of threads may be hard) or as part of job execution (e.g. on the cluster, when $GALAXY_SLOTS would be known)?

I have tried using \$GALAXY_SLOTS but it remains in the file generated by <configfile> as the literal text $GALAXY_SLOTS rather than being substituted.

Thanks,

Peter
> So far $GALAXY_SLOTS seems to be working nicely for me.
>
> However, I am wondering if it would be possible to use it inside the <configfile> section? Is that run at the time of job creation on the Galaxy server (where determining the number of threads may be hard) or as part of job execution (e.g. on the cluster, when $GALAXY_SLOTS would be known).

Unfortunately, the former, and there is no easy way to change this. One could do some simple variable substitution on the config file when the job runs though (sed would do the job here, not pretty but it works).
On Thu, Oct 24, 2013 at 5:53 PM, James Taylor <james@jamestaylor.org> wrote:
> Unfortunately, the former, and there is no easy way to change this.

Ah. That was what I suspected :(

> One could do some simple variable substitution on the config file when the job runs though (sed would do the job here, not pretty but it works).

Yeah - I was thinking about something like that as a workaround, although since I have a wrapper Python script anyway I can do the edit there:

https://github.com/peterjc/pico_galaxy/tree/master/tools/mira4

Thanks for such a prompt answer,

Peter
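The run-time substitution discussed above could look roughly like the following when done from a Python wrapper script instead of sed. This is a sketch under stated assumptions, not the code used in the linked mira4 wrapper: the function names fill_slots and rewrite_config are hypothetical, the placeholder is assumed to be the literal text $GALAXY_SLOTS left behind in the generated config file, and defaulting to "1" when the variable is unset or empty is an assumption of this sketch.

```python
import os


def fill_slots(config_text, environ=None, placeholder="$GALAXY_SLOTS", default="1"):
    """Replace the literal placeholder with the slot count from the environment.

    This is a plain string replacement, so it would also match longer names
    that merely start with the placeholder text.
    """
    if environ is None:
        environ = os.environ
    # Treat an unset or empty variable the same way (assumed default).
    slots = environ.get("GALAXY_SLOTS") or default
    return config_text.replace(placeholder, slots)


def rewrite_config(path, environ=None):
    """Rewrite the config file in place, at job run time, before the tool starts."""
    with open(path) as handle:
        text = handle.read()
    with open(path, "w") as handle:
        handle.write(fill_slots(text, environ))
```

The wrapper would call rewrite_config on the file produced by <configfile> as its first step, since that is the point at which the job is running where $GALAXY_SLOTS is set.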
participants (5)
- Bjoern Gruening
- James Taylor
- Nate Coraor
- Nicola Soranzo
- Peter Cock