I am at a loss as to what is happening: Galaxy is running, but jobs are never dispatched to the cluster.

This is my setup:
Torque configuration (output of qmgr -c 'p s'):
#
# Create queues and set their attributes.
#
#
# Create and define queue batch
#
create queue batch
set queue batch queue_type = Execution
set queue batch resources_default.nodes = 1
set queue batch resources_default.walltime = 01:00:00
set queue batch enabled = True
set queue batch started = True
#
# Set server attributes.
#
set server scheduling = True
set server acl_hosts = manager
set server managers = root@*
set server managers += jurgens@*
set server operators = galaxy@*
set server operators += jurgens@*
set server operators += root@*
set server default_queue = batch
set server log_events = 511
set server mail_from = adm
set server scheduler_iteration = 600
set server node_check_rate = 150
set server tcp_timeout = 300
set server job_stat_rate = 45
set server poll_jobs = True
set server mom_job_sync = True
set server keep_completed = 300
set server next_job_number = 17
set server moab_array_compatible = True


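As a first sanity check, it is worth confirming from the qmgr output above that the batch queue is actually enabled and started before blaming Galaxy. A minimal sketch that parses that output (plain Python; the function name is my own, not part of Torque or Galaxy):

```python
def parse_qmgr_queues(text):
    """Parse `qmgr -c 'p s'` output into {queue_name: {attribute: value}}."""
    queues = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("set queue "):
            continue
        # e.g. "set queue batch enabled = True"
        name, _, attr_val = line[len("set queue "):].partition(" ")
        attr, _, value = attr_val.partition(" = ")
        queues.setdefault(name, {})[attr.strip()] = value.strip()
    return queues

sample = """\
set queue batch queue_type = Execution
set queue batch enabled = True
set queue batch started = True
"""
q = parse_qmgr_queues(sample)
assert q["batch"]["enabled"] == "True" and q["batch"]["started"] == "True"
```

With the configuration pasted above, both attributes come back "True", so the queue itself looks healthy.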
This is my job_conf.xml:

<?xml version="1.0"?>
<!-- A sample job config that explicitly configures job running the way it is configured by default (if there is no explicit config). -->
<job_conf>
    <plugins>
         <!-- <plugin id="drmaa" type="runner" load="galaxy.jobs.runners.drmaa:DRMAAJobRunner" workers="4"/> -->
         <plugin id="drmaa" type="runner" load="galaxy.jobs.runners.drmaa:DRMAAJobRunner"/>
    </plugins>
    <handlers default="batch">
        <handler id="cn01"  tags="batch"/>
        <handler id="cn02"  tags="batch"/>
    </handlers>
    <destinations default="batch">
        <destination id="batch" runner="drmaa" tags="cluster,batch">
            <param id="nativeSpecification">-q batch</param>
        </destination>
    </destinations>
</job_conf>
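One thing I have been double-checking with this layout: every handler id in job_conf.xml must correspond to a [server:<id>] section in universe_wsgi.ini, and those servers must actually be started. An illustrative stdlib-only check (the function name is my own, not a Galaxy API):

```python
import configparser
import xml.etree.ElementTree as ET

def unmatched_handlers(job_conf_xml, universe_ini):
    """Return handler ids from job_conf.xml that have no [server:<id>] section
    in universe_wsgi.ini."""
    handlers = [h.get("id") for h in ET.fromstring(job_conf_xml).find("handlers")]
    cp = configparser.ConfigParser()
    cp.read_string(universe_ini)
    servers = {s.split(":", 1)[1] for s in cp.sections() if s.startswith("server:")}
    return [h for h in handlers if h not in servers]
```

With the two files above this returns an empty list (cn01 and cn02 both have server sections); any id it does return would be a handler that Galaxy can never hand jobs to.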


These are the relevant parts of my universe_wsgi.ini:

# Configuration of the internal HTTP server.

[server:main]

# The internal HTTP server to use.  Currently only Paste is provided.  This
# option is required.
use = egg:Paste#http

# The port on which to listen.
port = 8989

# The address on which to listen.  By default, only listen to localhost (Galaxy
# will not be accessible over the network).  Use '0.0.0.0' to listen on all
# available network interfaces.
#host = 127.0.0.1
host = 0.0.0.0

# Use a threadpool for the web server instead of creating a thread for each
# request.
use_threadpool = True

# Number of threads in the web server thread pool.
#threadpool_workers = 10

# Set the number of seconds a thread can work before you should kill it (assuming it will never finish) to 3 hours.
threadpool_kill_thread_limit = 10800

[server:cn01]
use = egg:Paste#http
port = 8090
host = 127.0.0.1
use_threadpool = true
threadpool_workers = 5
[server:cn02]
use = egg:Paste#http
port = 8091
host = 127.0.0.1
use_threadpool = true
threadpool_workers = 5


Here cn01 and cn02 are cluster nodes.

echo $DRMAA_LIBRARY_PATH
/usr/local/lib/libdrmaa.so
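Setting DRMAA_LIBRARY_PATH is only half the story; the user that runs Galaxy also has to be able to dlopen() that file. A quick way to test that from Python (this only checks loadability, nothing Galaxy-specific):

```python
import ctypes

def drmaa_loadable(path):
    """Return True if the shared library at `path` can be loaded via dlopen()."""
    try:
        ctypes.CDLL(path)
        return True
    except OSError:
        return False

# e.g. drmaa_loadable("/usr/local/lib/libdrmaa.so") should be True
# when run as the user that starts Galaxy.
```

If this returns False for the Galaxy user (missing file, wrong architecture, or unreadable permissions), the drmaa runner will fail to initialize regardless of the job_conf.xml contents.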

On 8 August 2013 16:58, Nate Coraor <nate@bx.psu.edu> wrote:
On Aug 7, 2013, at 9:23 PM, shenwiyn wrote:

> Yes, and I have the same confusion about that. Actually, when I set server:<id> in universe_wsgi.ini as follows as a test, my Galaxy does not work with the cluster; if I remove server:<id>, it works.

Hi Shenwiyn,

Are you starting all of the servers that you have defined in universe_wsgi.ini?  If using run.sh, setting GALAXY_RUN_ALL in the environment will do this for you:

    http://wiki.galaxyproject.org/Admin/Config/Performance/Scaling

> [server:node01]
> use = egg:Paste#http
> port = 8080
> host = 0.0.0.0
> use_threadpool = true
> threadpool_workers = 5
> This is my job_conf.xml :
> <?xml version="1.0"?>
> <job_conf>
>     <plugins workers="4">
>         <plugin id="local" type="runner" load="galaxy.jobs.runners.local:LocalJobRunner" workers="4"/>
>         <plugin id="pbs" type="runner" load="galaxy.jobs.runners.pbs:PBSJobRunner" workers="8"/>
>     </plugins>
>     <handlers default="batch">
>         <handler id="node01" tags="batch"/>
>         <handler id="node02" tags="batch"/>
>     </handlers>
>     <destinations default="regularjobs">
>         <destination id="local" runner="local"/>
>         <destination id="regularjobs" runner="pbs" tags="cluster">
>             <param id="Resource_List">walltime=24:00:00,nodes=1:ppn=4,mem=10G</param>
>             <param id="galaxy_external_runjob_script">scripts/drmaa_external_runner.py</param>
>             <param id="galaxy_external_killjob_script">scripts/drmaa_external_killer.py</param>
>             <param id="galaxy_external_chown_script">scripts/external_chown_script.py</param>
>         </destination>
>    </destinations>
> </job_conf>

The galaxy_external_* options are only supported with the drmaa plugin, and for the moment they belong only in universe_wsgi.ini; they have not been migrated to the new-style job configuration.  They should also only be used if you are attempting to set up the "run jobs as the real user" capability.

> Furthermore, when I try to kill a job by clicking <Catch(08-08-09-12-39).jpg> in the Galaxy web interface, the job keeps running in the background. I do not know how to fix this.
> Any help on this would be appreciated. Thank you very much.

Job deletion in the pbs runner was recently broken, but a fix for this bug will be part of the next stable release (on Monday).

--nate

>
> shenwiyn
>
> From: Jurgens de Bruin
> Date: 2013-08-07 19:55
> To: galaxy-dev
> Subject: [galaxy-dev] Help with cluster setup
> Hi,
>
> This is my first Galaxy installation, so apologies for any stupid questions. I am setting up Galaxy on a cluster running Torque as the resource manager. I am working through the documentation, but I am unclear on a few things:
>
> Firstly, I am unable to find start_job_runners in universe_wsgi.ini, and I don't want to just add it anywhere; any help on this would be great.
>
> Further more this is my job_conf.xml :
>
> <?xml version="1.0"?>
> <!-- A sample job config that explicitly configures job running the way it is configured by default (if there is no explicit config). -->
> <job_conf>
>     <plugins>
>         <plugin id="hpc" type="runner" load="galaxy.jobs.runners.drmaa:DRMAAJobRunner" workers="4"/>
>     </plugins>
>     <handlers>
>         <!-- Additional job handlers - the id should match the name of a
>              [server:<id>] in universe_wsgi.ini. -->
>         <handler id="cn01"/>
>         <handler id="cn02"/>
>     </handlers>
>     <destinations>
>         <destination id="hpc" runner="drmaa"/>
>     </destinations>
> </job_conf>
>
>
> Does this look meaningful? Furthermore, where do I set the additional server:<id> sections in universe_wsgi.ini?
>
> As background the cluster has 13 compute nodes and a shared storage array that can be accessed by all nodes in the cluster.
>
>
> Thanks again
>
>
>
> --
> Regards/Groete/Mit freundlichen Grüßen/recuerdos/meilleures salutations/
> distinti saluti/siong/duì yú/привет
>
> Jurgens de Bruin
> ___________________________________________________________
> Please keep all replies on the list by using "reply all"
> in your mail client.  To manage your subscriptions to this
> and other Galaxy lists, please use the interface at:
>  http://lists.bx.psu.edu/
>
> To search Galaxy mailing lists use the unified search at:
>  http://galaxyproject.org/search/mailinglists/




--
Regards/Groete/Mit freundlichen Grüßen/recuerdos/meilleures salutations/
distinti saluti/siong/duì yú/привет

Jurgens de Bruin

