I am totally lost on what is happening now. I have Galaxy running, but jobs are not being run. This is my setup:

torque:

qmgr -c 'p s'
#
# Create queues and set their attributes.
#
#
# Create and define queue batch
#
create queue batch
set queue batch queue_type = Execution
set queue batch resources_default.nodes = 1
set queue batch resources_default.walltime = 01:00:00
set queue batch enabled = True
set queue batch started = True
#
# Set server attributes.
#
set server scheduling = True
set server acl_hosts = manager
set server managers = root@*
set server managers += jurgens@*
set server operators = galaxy@*
set server operators += jurgens@*
set server operators += root@*
set server default_queue = batch
set server log_events = 511
set server mail_from = adm
set server scheduler_iteration = 600
set server node_check_rate = 150
set server tcp_timeout = 300
set server job_stat_rate = 45
set server poll_jobs = True
set server mom_job_sync = True
set server keep_completed = 300
set server next_job_number = 17
set server moab_array_compatible = True

This is my job_conf.xml:

<?xml version="1.0"?>
<!-- A sample job config that explicitly configures job running the way it is configured by default (if there is no explicit config). -->
<job_conf>
    <plugins>
        <!-- <plugin id="drmaa" type="runner" load="galaxy.jobs.runners.drmaa:DRMAAJobRunner" workers="4"/> -->
        <plugin id="drmaa" type="runner" load="galaxy.jobs.runners.drmaa:DRMAAJobRunner"/>
    </plugins>
    <handlers default="batch">
        <handler id="cn01" tags="batch"/>
        <handler id="cn02" tags="batch"/>
    </handlers>
    <destinations default="batch">
        <destination id="batch" runner="drmaa" tag="cluster,batch">
            <param id="nativeSpecfication">-q batch</param>
        </destination>
    </destinations>
</job_conf>

These are parts of the universe_wsgi.ini:

# Configuration of the internal HTTP server.

[server:main]

# The internal HTTP server to use. Currently only Paste is provided. This
# option is required.
use = egg:Paste#http

# The port on which to listen.
port = 8989

# The address on which to listen. By default, only listen to localhost (Galaxy
# will not be accessible over the network). Use '0.0.0.0' to listen on all
# available network interfaces.
#host = 127.0.0.1
host = 0.0.0.0

# Use a threadpool for the web server instead of creating a thread for each
# request.
use_threadpool = True

# Number of threads in the web server thread pool.
#threadpool_workers = 10

# Set the number of seconds a thread can work before you should kill it (assuming it will never finish) to 3 hours.
threadpool_kill_thread_limit = 10800

[server:cn01]
use = egg:Paste#http
port = 8090
host = 127.0.0.1
use_threadpool = true
threadpool_worker = 5

[server:cn02]
use = egg:Paste#http
port = 8091
host = 127.0.0.1
use_threadpool = true
threadpool_worker = 5

Where cn01 and cn02 are cluster nodes.

echo $DRMAA_LIBRARY_PATH
/usr/local/lib/libdrmaa.so

On 8 August 2013 16:58, Nate Coraor <nate@bx.psu.edu> wrote:
On Aug 7, 2013, at 9:23 PM, shenwiyn wrote:
Yes, and I am confused about the same thing. When I set server:<id> in the universe_wsgi.ini as follows as a test, my Galaxy doesn't work with the cluster; if I remove server:<id>, it works.
Hi Shenwiyn,
Are you starting all of the servers that you have defined in universe_wsgi.ini? If using run.sh, setting GALAXY_RUN_ALL in the environment will do this for you:
http://wiki.galaxyproject.org/Admin/Config/Performance/Scaling
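The comment in the sample job_conf.xml quoted further down says that each handler id should match the name of a [server:<id>] section in universe_wsgi.ini. A minimal Python sketch of that check is below. The function names (server_ids, handler_ids, unmatched_handlers) and the sample strings are made up for illustration and are not part of Galaxy; the sketch only compares names and does not tell you whether the servers are actually running.

```python
import configparser
import xml.etree.ElementTree as ET

# Hypothetical sample inputs, trimmed from the configs posted in this thread.
SAMPLE_INI = """\
[server:main]
use = egg:Paste#http
port = 8989

[server:cn01]
use = egg:Paste#http
port = 8090

[server:cn02]
use = egg:Paste#http
port = 8091
"""

SAMPLE_JOB_CONF = """\
<?xml version="1.0"?>
<job_conf>
    <handlers default="batch">
        <handler id="cn01" tags="batch"/>
        <handler id="cn02" tags="batch"/>
    </handlers>
</job_conf>
"""


def server_ids(ini_text):
    """Return the <id> part of every [server:<id>] section in the ini text."""
    parser = configparser.RawConfigParser()
    parser.read_string(ini_text)
    prefix = "server:"
    return {s[len(prefix):] for s in parser.sections() if s.startswith(prefix)}


def handler_ids(job_conf_text):
    """Return the id attribute of every <handler> under <handlers>."""
    root = ET.fromstring(job_conf_text)
    handlers = root.findall("./handlers/handler")
    return {h.get("id") for h in handlers if h.get("id")}


def unmatched_handlers(ini_text, job_conf_text):
    """Return handler ids that have no matching [server:<id>] section."""
    return sorted(handler_ids(job_conf_text) - server_ids(ini_text))


if __name__ == "__main__":
    # An empty list means every handler id has a matching server section.
    print(unmatched_handlers(SAMPLE_INI, SAMPLE_JOB_CONF))
```

To run it against real files, read universe_wsgi.ini and job_conf.xml into strings and pass them to unmatched_handlers in place of the samples.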
[server:node01]
use = egg:Paste#http
port = 8080
host = 0.0.0.0
use_threadpool = true
threadpool_workers = 5

This is my job_conf.xml:

<?xml version="1.0"?>
<job_conf>
    <plugins workers="4">
        <plugin id="local" type="runner" load="galaxy.jobs.runners.local:LocalJobRunner" workers="4"/>
        <plugin id="pbs" type="runner" load="galaxy.jobs.runners.pbs:PBSJobRunner" workers="8"/>
    </plugins>
    <handlers default="batch">
        <handler id="node01" tags="batch"/>
        <handler id="node02" tags="batch"/>
    </handlers>
    <destinations default="regularjobs">
        <destination id="local" runner="local"/>
        <destination id="regularjobs" runner="pbs" tags="cluster">
            <param id="Resource_List">walltime=24:00:00,nodes=1:ppn=4,mem=10G</param>
            <param id="galaxy_external_runjob_script">scripts/drmaa_external_runner.py</param>
            <param id="galaxy_external_killjob_script">scripts/drmaa_external_killer.py</param>
            <param id="galaxy_external_chown_script">scripts/external_chown_script.py</param>
        </destination>
    </destinations>
</job_conf>
The galaxy_external_* options are only supported with the drmaa plugin, and for the moment they belong only in universe_wsgi.ini; they have not yet been migrated to the new-style job configuration. They should also only be used if you are attempting to set up "run jobs as the real user" job running capabilities.
Furthermore, when I try to kill my jobs by clicking <Catch(08-08-09-12-39).jpg> in the Galaxy web interface, the job keeps running in the background. I do not know how to fix this. I would be grateful for any help. Thank you very much.
Job deletion in the pbs runner was recently broken, but a fix for this bug will be part of the next stable release (on Monday).
--nate
shenwiyn
From: Jurgens de Bruin
Date: 2013-08-07 19:55
To: galaxy-dev
Subject: [galaxy-dev] Help with cluster setup

Hi,
This is my first Galaxy installation setup, so apologies for stupid questions. I am setting up Galaxy on a cluster running Torque as the resource manager. I am working through the documentation but I am unclear on some things:
Firstly, I am unable to find start_job_runners within the universe_wsgi.ini, and I don't want to just add it anywhere. Any help on this would be great.
Furthermore, this is my job_conf.xml:
<?xml version="1.0"?>
<!-- A sample job config that explicitly configures job running the way it is configured by default (if there is no explicit config). -->
<job_conf>
    <plugins>
        <plugin id="hpc" type="runner" load="galaxy.jobs.runners.drmaa:DRMAAJobRunner" workers="4"/>
    </plugins>
    <handlers>
        <!-- Additional job handlers - the id should match the name of a
             [server:<id>] in universe_wsgi.ini.
        <handler id="cn01"/>
        <handler id="cn02"/>
    </handlers>
    <destinations>
        <destination id="hpc" runner="drmaa"/>
    </destinations>
</job_conf>
Does this look meaningful? Furthermore, where do I set the additional server:<id> sections in the universe_wsgi.ini?
As background the cluster has 13 compute nodes and a shared storage array that can be accessed by all nodes in the cluster.
Thanks again
-- Regards/Groete/Mit freundlichen Grüßen/recuerdos/meilleures salutations/ distinti saluti/siong/duì yú/привет
Jurgens de Bruin
___________________________________________________________
Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/