Local jobs aren't dispatching in a balanced configuration deployed on a cluster
I have Galaxy running on my institution's cluster computing service, which uses PBS. It's in a balanced configuration.

Jobs going to the cluster submit without any problem at all. However, any job that I have specified to run locally in the universe_wsgi.ini file won't dispatch. There isn't any record of the job in manager.log or in any of the handler[#].log files. In fact, I've never seen anything in the handler logfiles after "serving on [GALAXY IP]:[SPECIFIC PORT NUMBER OF HANDLER]". The manager logfile has all the details about the jobs dispatched to the pbs runner, but nothing about local jobs.

HOWEVER, if I stop Galaxy using "GALAXY_RUN_ALL=1 sh ./run.sh --stop-daemon" and restart using "GALAXY_RUN_ALL=1 sh ./run.sh --daemon", then the local jobs that were waiting to run begin immediately. Information about them shows up in manager.log, but not in the handler0.log or handler1.log files.

I'm on an 8-core Dell R410 server, if that matters.

The server portion of my universe_wsgi.ini file is pasted below.

# ---- HTTP Server ----------------------------------------------------------

# Configuration of the internal HTTP server.

[server:web0]

# The internal HTTP server to use. Currently only Paste is provided. This
# option is required.
use = egg:Paste#http

# The port on which to listen.
port = 8080

# The address on which to listen. By default, only listen to localhost (Galaxy
# will not be accessible over the network). Use '0.0.0.0' to listen on all
# available network interfaces.
host = localhost

# Use a threadpool for the web server instead of creating a thread for each
# request.
use_threadpool = True

# Number of threads in the web server thread pool.
threadpool_workers = 7

[server:web1]
use = egg:Paste#http
port = 8081
host = localhost
use_threadpool = true
threadpool_workers = 7

[server:manager]
use = egg:Paste#http
port = 8079
host = localhost
use_threadpool = true
threadpool_workers = 5

[server:handler0]
use = egg:Paste#http
port = 8090
host = localhost
use_threadpool = true
threadpool_workers = 5

[server:handler1]
use = egg:Paste#http
port = 8091
host = localhost
use_threadpool = true
threadpool_workers = 5

[app:main]

# -- Application and filtering
job_manager = manager
job_handler = handler0,handler1

# ---- Custom Parameters ----------------------------------------------------
On Jun 21, 2012, at 4:50 PM, Dorset, Daniel C wrote:
I have galaxy running on my institution’s cluster computing service, which uses PBS. It’s in a balanced configuration.
Jobs going to the cluster submit without any problem at all. However, any job that I have specified to run locally in the universe_wsgi.ini file won’t dispatch. There isn’t any record of the job in manager.log or in any of the handler[#].log files. In fact, I’ve never seen anything in the handler logfiles after “serving on [GALAXY IP]:[SPECIFIC PORT NUMBER OF HANDLER]”. The manager logfile has all the details about the jobs dispatched to the pbs runner, but nothing about local jobs.
HOWEVER, if I stop Galaxy using “GALAXY_RUN_ALL=1 sh ./run.sh --stop-daemon” and restart using “GALAXY_RUN_ALL=1 sh ./run.sh --daemon” then the local jobs that were waiting to run begin immediately. Information about them shows up in manager.log, but not in the handler0.log or handler1.log files.
I’m on an 8-core Dell R410 server, if that matters.
The server portion of my universe_wsgi.ini file is pasted below.
# ---- HTTP Server ----------------------------------------------------------
# Configuration of the internal HTTP server.
[server:web0]
# The internal HTTP server to use. Currently only Paste is provided. This
# option is required.
use = egg:Paste#http
# The port on which to listen.
port = 8080
# The address on which to listen. By default, only listen to localhost (Galaxy
# will not be accessible over the network). Use '0.0.0.0' to listen on all
# available network interfaces.
host = localhost
# Use a threadpool for the web server instead of creating a thread for each
# request.
use_threadpool = True
# Number of threads in the web server thread pool.
threadpool_workers = 7
[server:web1]
use = egg:Paste#http
port = 8081
host = localhost
use_threadpool = true
threadpool_workers = 7
[server:manager]
use = egg:Paste#http
port = 8079
host = localhost
use_threadpool = true
threadpool_workers = 5
[server:handler0]
use = egg:Paste#http
port = 8090
host = localhost
use_threadpool = true
threadpool_workers = 5
[server:handler1]
use = egg:Paste#http
port = 8091
host = localhost
use_threadpool = true
threadpool_workers = 5
[app:main]
# -- Application and filtering
job_manager = manager
job_handler = handler0,handler1
Hi Daniel,
This parameter should be 'job_handlers'.
Are there any entries in your [galaxy:tool_handlers] and/or [galaxy:tool_runners] sections?
--nate
# ---- Custom Parameters ----------------------------------------------------
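The misnamed option Nate points out is the kind of problem a small script can flag before Galaxy is started. The following is a minimal sketch, not part of Galaxy: it uses Python's standard configparser, the function name check_job_config and the trimmed SAMPLE_INI text are made up for illustration, and it only checks the two options discussed in this thread. The thread does not confirm whether renaming the option resolved the local-job problem.

```python
import configparser

# Trimmed-down, illustrative copy of the relevant settings from the post.
SAMPLE_INI = """
[server:web0]
use = egg:Paste#http
port = 8080

[server:manager]
use = egg:Paste#http
port = 8079

[server:handler0]
use = egg:Paste#http
port = 8090

[server:handler1]
use = egg:Paste#http
port = 8091

[app:main]
job_manager = manager
job_handler = handler0,handler1
"""


def check_job_config(ini_text):
    """Return a list of problems found in the [app:main] job settings."""
    # interpolation=None so that Paste-style '%(here)s' values do not raise.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(ini_text)
    if not parser.has_section("app:main"):
        return ["missing [app:main] section"]
    main = parser["app:main"]

    # Names declared as [server:<name>] sections.
    servers = {
        name.split(":", 1)[1]
        for name in parser.sections()
        if name.startswith("server:")
    }

    problems = []
    if "job_handler" in main and "job_handlers" not in main:
        problems.append("'job_handler' is set but 'job_handlers' is not")

    # Every server named by job_manager / job_handlers should be declared.
    referenced = [main.get("job_manager", fallback="")]
    referenced += main.get("job_handlers", fallback="").split(",")
    for name in (n.strip() for n in referenced):
        if name and name not in servers:
            problems.append("no [server:%s] section" % name)
    return problems


if __name__ == "__main__":
    for problem in check_job_config(SAMPLE_INI):
        print(problem)
```

To check a real file, the text could be read with open("universe_wsgi.ini").read() and passed to the same function.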
Good catch, thanks Nate! I have plenty of tool_runners defined, but no tool_handlers. Do I have to specifically assign tool handlers in order for them to be used in job deployment?
Thanks!
Dan
Hi Daniel,
This parameter should be 'job_handlers'.
Are there any entries in your [galaxy:tool_handlers] and/or [galaxy:tool_runners] sections?
--nate
On Jun 22, 2012, at 12:26 PM, Dorset, Daniel C wrote:
Good catch, thanks Nate! I have plenty of tool_runners defined, but no tool_handlers. Do I have to specifically assign tool handlers in order for them to be used in job deployment?
No. If you don't define any specific handlers in the [galaxy:tool_handlers] section, each job will have one assigned randomly from the list in the 'job_handlers' parameter in [app:main].
--nate
Thanks!
Dan
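The assignment rule Nate describes (a tool-specific handler if one is configured, otherwise a random pick from the 'job_handlers' list) can be sketched as follows. This is an illustration of the described behaviour only, not Galaxy's actual code; the function name pick_handler and the tool IDs are hypothetical.

```python
import random


def pick_handler(tool_id, job_handlers, tool_handlers=None, rng=random):
    """Choose the handler process for a job.

    tool_handlers maps a tool ID to a specific handler name (the role of
    [galaxy:tool_handlers]); job_handlers is the list from [app:main].
    """
    tool_handlers = tool_handlers or {}
    if tool_id in tool_handlers:
        # An explicit per-tool assignment wins.
        return tool_handlers[tool_id]
    # Otherwise any handler in the list may be chosen at random.
    return rng.choice(job_handlers)


if __name__ == "__main__":
    handlers = ["handler0", "handler1"]
    # Hypothetical tool IDs, for illustration only.
    print(pick_handler("some_tool", handlers))
    print(pick_handler("pinned_tool", handlers, {"pinned_tool": "handler1"}))
```

With no [galaxy:tool_handlers] entries, as in Dan's setup, every job would take the random branch.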
participants (2)
- Dorset, Daniel C
- Nate Coraor