Hi,
I have noticed that from time to time the job queue seems to get
"stuck" and can only be unstuck by restarting Galaxy.
The jobs sit in the queued state, the Python job handler
processes are barely ticking over, and the cluster is empty.
When I restart, the startup procedure notices that all jobs are
in the "new" state, assigns each one a job handler, and the jobs then
start fine.
Any ideas?
Thon
P.S. I am using the June release of Galaxy, and I DO set limits
on my users in job_conf.xml, as shown below. (Maybe that is related? Before
the queue went dormant, this user had started lots of jobs and may have hit
the limit, but I assumed this limit applies to the number of jobs running at
one time, right?)
<?xml version="1.0"?>
<job_conf>
    <plugins workers="4">
        <!-- "workers" is the number of threads for the runner's work queue.
             The default from <plugins> is used if not defined for a <plugin>. -->
        <plugin id="local" type="runner" load="galaxy.jobs.runners.local:LocalJobRunner" workers="2"/>
        <plugin id="drmaa" type="runner" load="galaxy.jobs.runners.drmaa:DRMAAJobRunner" workers="8"/>
        <plugin id="cli" type="runner" load="galaxy.jobs.runners.cli:ShellJobRunner" workers="2"/>
    </plugins>
    <handlers default="handlers">
        <!-- Additional job handlers - the id should match the name of a
             [server:<id>] in universe_wsgi.ini. -->
        <handler id="handler0" tags="handlers"/>
        <handler id="handler1" tags="handlers"/>
        <handler id="handler2" tags="handlers"/>
        <handler id="handler3" tags="handlers"/>
        <!--
        <handler id="handler10" tags="handlers"/>
        <handler id="handler11" tags="handlers"/>
        <handler id="handler12" tags="handlers"/>
        <handler id="handler13" tags="handlers"/>
        -->
    </handlers>
    <destinations default="regularjobs">
        <!-- Destinations define details about remote resources and how jobs
             should be executed on those remote resources. -->
        <destination id="local" runner="local"/>
        <destination id="regularjobs" runner="drmaa" tags="cluster">
            <!-- These are the parameters for qsub, such as queue etc. -->
            <param id="nativeSpecification">-V -q long.q -pe smp 1</param>
        </destination>
        <destination id="longjobs" runner="drmaa" tags="cluster,long_jobs">
            <!-- These are the parameters for qsub, such as queue etc. -->
            <param id="nativeSpecification">-V -q long.q -pe smp 1</param>
        </destination>
        <destination id="shortjobs" runner="drmaa" tags="cluster,short_jobs">
            <!-- These are the parameters for qsub, such as queue etc. -->
            <param id="nativeSpecification">-V -q short.q -pe smp 1</param>
        </destination>
        <destination id="multicorejobs4" runner="drmaa" tags="cluster,multicore_jobs">
            <!-- These are the parameters for qsub, such as queue etc. -->
            <param id="nativeSpecification">-V -q long.q -pe smp 4</param>
        </destination>
        <!--
        <destination id="real_user_cluster" runner="drmaa">
            <param id="galaxy_external_runjob_script">scripts/drmaa_external_runner.py</param>
            <param id="galaxy_external_killjob_script">scripts/drmaa_external_killer.py</param>
            <param id="galaxy_external_chown_script">scripts/external_chown_script.py</param>
        </destination>
        -->
        <destination id="dynamic" runner="dynamic">
            <!-- A destination that represents a method in the dynamic runner. -->
            <param id="type">python</param>
            <param id="function">interactiveOrCluster</param>
        </destination>
    </destinations>
    <tools>
        <!-- Tools can be configured to use specific destinations or handlers,
             identified by either the "id" or "tags" attribute. If assigned to
             a tag, a handler or destination that matches that tag will be
             chosen at random. -->
        <tool id="bwa_wrapper" destination="multicorejobs4"/>
    </tools>
    <limits>
        <!-- Certain limits can be defined.
        <limit type="registered_user_concurrent_jobs">500</limit>
        <limit type="unregistered_user_concurrent_jobs">1</limit>
        <limit type="concurrent_jobs" id="local">1</limit>
        <limit type="concurrent_jobs" tag="cluster">200</limit>
        <limit type="concurrent_jobs" tag="long_jobs">200</limit>
        <limit type="concurrent_jobs" tag="short_jobs">200</limit>
        <limit type="concurrent_jobs" tag="multicore_jobs">100</limit>
        -->
    </limits>
</job_conf>
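
To be explicit, the limit I am asking about is the per-user concurrent-jobs
one. Enabled (i.e. moved outside the comment block), it would read as below;
the 500 is just the example value from the commented section above, not
necessarily the value I should be using:

    <limits>
        <!-- Limit on concurrent jobs per registered user. -->
        <limit type="registered_user_concurrent_jobs">500</limit>
    </limits>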