Setting per-tool parameters for the SGE job runner
Hi, I have Galaxy running now with SGE job runners. I also managed to set PATH using .sge_request. What I want to do now is set SGE parameters for a few special tools that need them (e.g. a parallel environment). Is there a way to set them in Galaxy (maybe using the job runner URL)? Or do I need to write wrappers? Thanks, Andreas
Hi Andreas, We'd like to be able to define specific required resources or perhaps even scheduler options in the future, but unfortunately for the moment the only way to do this is to create a separate queue (the assigned nodes can overlap those assigned to your default queue) that sets the options you need, and then point the tool at this new queue. --nate
Someone supplied a patch to this list a while back. I've used it on our local set-up and it works great. Sadly, I seem to have lost it :( I'll have a look through the archives and see if I can find it. Chris On Thu, 2009-11-05 at 14:41 +0100, Andreas Kuntzagk wrote:
Hi,
I have Galaxy running now with SGE job runners. I also managed to set PATH using .sge_request. What I want to do now is set SGE parameters for a few special tools that need them (e.g. a parallel environment). Is there a way to set them in Galaxy (maybe using the job runner URL)? Or do I need to write wrappers?
thanks, Andreas _______________________________________________ galaxy-dev mailing list galaxy-dev@bx.psu.edu http://mail.bx.psu.edu/cgi-bin/mailman/listinfo/galaxy-dev
On Thu, 2009-11-05 at 16:03 +0000, Chris Cole wrote:
Someone supplied a patch to this list a while back. I've used it on our local set-up and it works great. Sadly, I seem to have lost it :(
I'll have a look through the archives and see if I can find it.
Found it! It's here: http://mail.bx.psu.edu/pipermail/galaxy-dev/2009-August/000649.html Like I said in the thread at the time, I think it's a strong contender for adding to Galaxy by default.
Thanks, from the description this looks like what I need. Will test if it's still working against current revision. Andreas Chris Cole wrote:
On Thu, 2009-11-05 at 16:03 +0000, Chris Cole wrote:
Someone supplied a patch to this list a while back. I've used it on our local set-up and it works great. Sadly, I seem to have lost it :(
I'll have a look through the archives and see if I can find it.
Found it! It's here: http://mail.bx.psu.edu/pipermail/galaxy-dev/2009-August/000649.html
Like I said in the thread at the time, I think it's a strong contender for adding to Galaxy by default.
I'm definitely pro adding this to Galaxy. I found a minor problem: the line job = model.Job.get ( job_wrapper.job_id ) gave me an error (type object 'Job' has no attribute 'get'). But since I don't want the username as the job name, I just commented it out. Before adding it to Galaxy main I'd suggest reworking the parameter part of the URL. It looks ugly and confusing as it is (but I can't come up with something better ATM :-( ). regards, Andreas
This is on the development plan, and will be available in the distribution soon. On Nov 6, 2009, at 3:42 AM, Andreas Kuntzagk wrote:
I'm definitely pro adding this to galaxy.
Before adding it to Galaxy main I'd suggest reworking the parameter part of the URL. It looks ugly and confusing as it is (but I can't come up with something better ATM :-( ).
regards, Andreas
Greg Von Kuster Galaxy Development Team greg@bx.psu.edu
Andreas Kuntzagk wrote:
I'm definitely pro adding this to galaxy.
I found a minor problem: the line job = model.Job.get ( job_wrapper.job_id )
gave me an error. (type object 'Job' has no attribute 'get') But since I don't want the username as jobname I just commented it out.
In upgrading to SQLAlchemy 0.5, we removed the name mapping. This can be rewritten as:

    job = self.sa_session.query( self.app.model.Job ).get( job_wrapper.job_id )
Before adding it to Galaxy main I'd suggest reworking the parameter part of the URL. It looks ugly and confusing as it is (but I can't come up with something better ATM :-( ).
This is our hesitation as well, but we've come up with an idea that we hope is cleaner: define them as requirements in tool_conf.xml, like so:

    <tool file="filters/headWrapper.xml">
        <runner name="GalaxyProject">
            <requests virtual_free="7G"/>
            <parallel_environment>threads 4</parallel_environment>
        </runner>
    </tool>

Where 'GalaxyProject' is a 4-field runner URL as described by Assaf. --nate
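To make the proposal a bit more concrete, here is a minimal Python sketch of how such a <runner> element might be turned into qsub options. The element names are taken from the example above; the function name runner_to_qsub_args and the mapping itself (<requests> attributes to -l, <parallel_environment> to -pe) are assumptions for illustration only, not existing Galaxy code.

```python
import xml.etree.ElementTree as ET

# Fragment copied from the proposal in this thread (not a released schema).
TOOL_XML = """
<tool file="filters/headWrapper.xml">
  <runner name="GalaxyProject">
    <requests virtual_free="7G"/>
    <parallel_environment>threads 4</parallel_environment>
  </runner>
</tool>
"""


def runner_to_qsub_args(tool_elem):
    """Translate the <runner> child of a <tool> element into qsub arguments.

    Assumed mapping (hypothetical, not a Galaxy API):
      <requests k="v"/>                    -> -l k=v
      <parallel_environment>name n</...>   -> -pe name n
    """
    args = []
    runner = tool_elem.find("runner")
    if runner is None:
        return args
    for requests in runner.findall("requests"):
        # sorted() keeps the output stable when several attributes are given
        for key, value in sorted(requests.attrib.items()):
            args += ["-l", "%s=%s" % (key, value)]
    pe = runner.find("parallel_environment")
    if pe is not None and pe.text:
        args += ["-pe"] + pe.text.split()
    return args


qsub_args = runner_to_qsub_args(ET.fromstring(TOOL_XML))
print(qsub_args)
```

A tool without a <runner> element simply yields an empty list, so it would fall through to whatever default the job runner already uses.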
Nate Coraor wrote:
Andreas Kuntzagk wrote:
Before adding it to Galaxy main I'd suggest reworking the parameter part of the URL. It looks ugly and confusing as it is (but I can't come up with something better ATM :-( ).
This is our hesitation as well, but we've come up with an idea that we hope is cleaner: define them as requirements in tool_conf.xml, like so:
    <tool file="filters/headWrapper.xml">
        <runner name="GalaxyProject">
            <requests virtual_free="7G"/>
            <parallel_environment>threads 4</parallel_environment>
        </runner>
    </tool>
Where 'GalaxyProject' is a 4-field runner URL as described by Assaf.
This looks much cleaner. (Moving the runner config to the tool_conf was also somewhere on my wish list.) Only grief I have with this is that you need to have a complete mapping of SGE (or Torque) parameters to XML elements in Galaxy (and in the documentation). Or would you also have something generic like <other_argument>-l my_limit</other_argument> ? regards, Andreas
On Mon, 09 Nov 2009 09:30:36 +0100 Andreas Kuntzagk <andreas.kuntzagk@mdc-berlin.de> wrote:
This looks much cleaner. (Moving the runner config to the tool_conf was also somewhere on my wish list.)
I'm not so sure. I think it makes sense to keep all the SGE tool config in one place rather than strewn across various tool XML files. If the admin were to need to change some settings or migrate to a new set-up, it would be a nightmare to find them all. I would advocate keeping it in the universe_wsgi.ini file.
Only grief I have with this is that you need to have a complete mapping of SGE (or Torque) parameters to XML elements in Galaxy (and in the documentation). Or would you also have something generic like <other_argument>-l my_limit</other_argument> ?
The way I read the above is that the <requests> tag fulfils the -l role. i.e. qsub -l foo=bar would map to: <requests foo="bar" />. So it is generic already. I agree, though, that this system does require a full mapping of all the options. Assaf's solution at least only requires a knowledge of the qsub parameters. I've no idea if Assaf's solution is problematic for other schedulers or if there are security implications via passing of raw parameters, but I prefer his implementation TBH. Cheers, Chris
Chris Cole (by way of Chris Cole <chris@compbio.dundee.ac.uk>) wrote:
I'm not so sure. I think it makes sense to keep all the SGE tool config in one place rather than strewn across various tool XML files. If the admin were to need to change some settings or migrate to a new set-up, it would be a nightmare to find them all.
I would advocate keeping it in the universe_wsgi.ini file.
It would all be in one place; we're just considering moving it to tool_conf.xml, where it can be placed alongside the tools themselves. It would also be possible to have predefined requirements, if queues don't make sense. So rather than having to set the number of processors and the memory requirements per tool, you can just set them in an alias and assign that alias to a tool.
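The alias idea could look roughly like the following Python sketch. All names here (REQUIREMENT_ALIASES, TOOL_ALIASES, requirements_for, the alias names and the 'slots'/'memory' keys) are hypothetical and only illustrate the lookup; nothing like this exists in Galaxy as far as this thread shows.

```python
# Hypothetical alias table: alias name -> abstract resource requirements.
REQUIREMENT_ALIASES = {
    "default": {"slots": 1, "memory": "2G"},
    "bigmem_threaded": {"slots": 4, "memory": "7G"},
}

# Hypothetical tool -> alias assignment. The tool id is made up; it is
# loosely based on the filters/headWrapper.xml example in this thread.
TOOL_ALIASES = {
    "headWrapper": "bigmem_threaded",
}


def requirements_for(tool_id):
    """Return a copy of the requirements for a tool, falling back to 'default'."""
    alias = TOOL_ALIASES.get(tool_id, "default")
    # Copy so callers cannot accidentally modify the shared alias definition.
    return dict(REQUIREMENT_ALIASES[alias])


print(requirements_for("headWrapper"))
print(requirements_for("some_other_tool"))
```

Changing the memory limit for every tool that uses an alias then means editing one entry, which is the "all in one place" property discussed above.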
The way I read the above is that the <requests> tag fulfils the -l role. i.e. qsub -l foo=bar would map to: <requests foo="bar" />. So it is generic already.
I agree, though, that this system does require a full mapping of all the options. Assaf's solution at least only requires a knowledge of the qsub parameters.
I've no idea if Assaf's solution is problematic for other schedulers or if there are security implications via passing of raw parameters, but I prefer his implementation TBH. Cheers,
There is a bit of reinventing the wheel here since SGE already defines its own language, but we want to make this as RM-agnostic as possible. We can also have an SGE-specific tag that could be used in place of any of our language, so hopefully that would satisfy your preferences. Thanks, --nate
On Mon, 09 Nov 2009 09:51:22 -0500 Nate Coraor <nate@bx.psu.edu> wrote:
Chris Cole (by way of Chris Cole <chris@compbio.dundee.ac.uk>) wrote:
I'm not so sure. I think it makes sense to keep all the SGE tool config in one place rather than strewn across various tool XML files. If the admin were to need to change some settings or migrate to a new set-up, it would be a nightmare to find them all.
I would advocate keeping it in the universe_wsgi.ini file.
It'd be all in one place, we're just considering moving it to tool_conf.xml where it can be placed alongside the tools themselves. It'd be possible to have predefined requirements, as well, if queues don't make sense. So rather than having to set number of processors and memory requirements per tool, you can just set it in an alias and assign that alias to a tool.
That sounds ideal.
I've no idea if Assaf's solution is problematic for other schedulers or if there are security implications via passing of raw parameters, but I prefer his implementation TBH. Cheers,
There is a bit of reinventing the wheel here since SGE already defines its own language, but we want to make this as RM-agnostic as possible.
I thought as much.
We can also have an SGE-specific tag that could be used in place of any of our language, so hopefully that would satisfy your preferences.
Yeah, that would be useful. It's great that you're addressing this issue. Thanks very much for the update.
Andreas Kuntzagk wrote:
This looks much cleaner. (Moving the runner config to the tool_conf was also somewhere on my wish list.) Only grief I have with this is that you need to have a complete mapping of SGE (or Torque) parameters to XML elements in Galaxy (and in the documentation). Or would you also have something generic like <other_argument>-l my_limit</other_argument> ?
The idea is to use abstract parameters that can be applied to any RM that is plugged in to Galaxy in the future. However, it would make sense to provide an "RM-specific" field as you suggest, where you can add anything we don't account for. --nate
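As a rough illustration of abstract parameters plus an "RM-specific" escape hatch, here is a small Python sketch. The function build_submit_args, the keys 'memory', 'slots' and 'pe', and the translation to SGE's -l and -pe options are all assumptions made for this example; the raw string "-l my_limit" is the one Andreas suggested above.

```python
import shlex


def build_submit_args(abstract, rm_specific=""):
    """Combine abstract requirements with a raw, RM-specific argument string.

    `abstract` uses made-up keys ('memory', 'slots', 'pe'); mapping them to
    SGE's -l and -pe options is an assumption for illustration.
    """
    args = []
    if "memory" in abstract:
        args += ["-l", "virtual_free=%s" % abstract["memory"]]
    if "slots" in abstract:
        args += ["-pe", abstract.get("pe", "threads"), str(abstract["slots"])]
    # shlex.split tokenises the raw string without invoking a shell, so the
    # result can be passed on as an argument list rather than a command string.
    args += shlex.split(rm_specific)
    return args


print(build_submit_args({"memory": "7G", "slots": 4}, "-l my_limit"))
```

Keeping the raw part as a separate, clearly marked field means a different resource manager could ignore or reject it while still honouring the abstract keys.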
participants (4)
- Andreas Kuntzagk
- Chris Cole
- Greg Von Kuster
- Nate Coraor