I submitted a workflow that in turn submits a DRM job to Sun Grid Engine.
The queue had an error (probably due to a problem with automount).
Traceback (most recent call last):
  File "/inside/depot4/galaxy/lib/galaxy/jobs/runners/__init__.py", line 60, in run_next
    method(arg)
  File "/inside/depot4/galaxy/lib/galaxy/jobs/runners/drmaa.py", line 169, in queue_job
    external_job_id = self.ds.runJob(jt)
  File "/inside/depot4/galaxy/eggs/drmaa-0.4b3-py2.7.egg/drmaa/__init__.py", line 331, in runJob
    _h.c(_w.drmaa_run_job, jid, _ct.sizeof(jid), jobTemplate)
  File "/inside/depot4/galaxy/eggs/drmaa-0.4b3-py2.7.egg/drmaa/helpers.py", line 213, in c
    return f(*(args + (error_buffer, sizeof(error_buffer))))
  File "/inside/depot4/galaxy/eggs/drmaa-0.4b3-py2.7.egg/drmaa/errors.py", line 90, in error_check
    raise _ERRORS[code-1]("code %s: %s" % (code, error_buffer.value))
DeniedByDrmException: code 17: error: no suitable queues
When the sysadmin cleared the error, the job started running normally after being in an error state for 10 minutes.
The cool thing is that Galaxy kept running without a problem.
galaxy.jobs.runners.drmaa DEBUG 2013-06-27 14:51:22,968 (10481/4767487) state change: job is running
galaxy.jobs.runners.drmaa DEBUG 2013-06-27 15:00:23,628 (10481/4767487) state change: job is queued and active
galaxy.jobs.runners.drmaa DEBUG 2013-06-27 15:00:27,751 (10481/4767487) state change: job is running
However, in the history panel the job shows as queued but not running, even if I refresh the history panel.
Is this normal or should the status change to running?
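For anyone comparing the runner log against what the history panel shows, a small script can pull the state changes out of the DEBUG lines quoted above. This is only a sketch: the regular expression is written against those three lines, and it assumes the pair in parentheses is the Galaxy job id followed by the DRM job id. The names STATE_CHANGE and state_changes are made up for this example and are not part of Galaxy.

```python
import re
from datetime import datetime

# Pattern written against the three DEBUG lines quoted in this message.
# Assumption: the pair in parentheses is (Galaxy job id / DRM job id).
STATE_CHANGE = re.compile(
    r"DEBUG (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+ "
    r"\((\d+)/(\d+)\) state change: (.+)$"
)


def state_changes(lines):
    """Yield (timestamp, galaxy_job_id, drm_job_id, message) for each matching line."""
    for line in lines:
        match = STATE_CHANGE.search(line)
        if match:
            stamp, galaxy_id, drm_id, message = match.groups()
            when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
            yield (when, galaxy_id, drm_id, message.strip())


log = [
    "galaxy.jobs.runners.drmaa DEBUG 2013-06-27 14:51:22,968 (10481/4767487) state change: job is running",
    "galaxy.jobs.runners.drmaa DEBUG 2013-06-27 15:00:23,628 (10481/4767487) state change: job is queued and active",
    "galaxy.jobs.runners.drmaa DEBUG 2013-06-27 15:00:27,751 (10481/4767487) state change: job is running",
]

events = list(state_changes(log))
for when, galaxy_id, drm_id, message in events:
    print(when, galaxy_id, drm_id, message)

# The most recent state the runner logged for this job.
last_reported = events[-1][3]
```

The last entry is the state the runner most recently logged, which is the one to compare with the history panel.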
I'm using this version of straight galaxy:
changeset: 10162:f295092476c7
branch: stable
parent: 10160:2efb1083676b
user: Dannon Baker dannonbaker@me.com
date: Sat Jun 15 09:08:09 2013 -0400
summary: Fix reports import issue reported by Lance, https://trello.com/card/bug-in-reports-webapp-imports/506338ce32ae458f6d15e4...
-Robert Baertsch UC Santa Cruz
Ignore this last post.
On Jun 27, 2013, at 3:09 PM, Robert Baertsch wrote:
On Thu, Jun 27, 2013 at 11:09 PM, Robert Baertsch robert.baertsch@gmail.com wrote:
However in the history panel, the job shows as queued but not running, even if I refresh the history panel.
Is this normal or should the status change to running?
Good question - I've just had a job "fail" like this, with the following in my log:
2014-05-12T14:24:28.366979+01:00 ppserver galaxy.jobs.runners ERROR 2014-05-12 14:24:28,261 (10622) Unhandled exception calling queue_job#012Traceback (most recent call last):#012 File "/mnt/galaxy/galaxy-dist/lib/galaxy/jobs/runners/__init__.py", line 62, in run_next#012 method(arg)#012 File "/mnt/galaxy/galaxy-dist/lib/galaxy/jobs/runners/drmaa.py", line 154, in queue_job#012 external_job_id = self.ds.runJob(jt)#012 File "/mnt/galaxy/galaxy-dist/eggs/drmaa-0.6-py2.6.egg/drmaa/__init__.py", line 331, in runJob#012 _h.c(_w.drmaa_run_job, jid, _ct.sizeof(jid), jobTemplate)#012 File "/mnt/galaxy/galaxy-dist/eggs/drmaa-0.6-py2.6.egg/drmaa/helpers.py", line 213, in c#012 return f(*(args + (error_buffer, sizeof(error_buffer))))#012 File "/mnt/galaxy/galaxy-dist/eggs/drmaa-0.6-py2.6.egg/drmaa/errors.py", line 90, in error_check#012 raise _ERRORS[code-1]("code %s: %s" % (code, error_buffer.value))#012DeniedByDrmException: code 17: error: no suitable queues
(Apologies for the lack of line breaks; we send our logging to syslog.)
i.e. DeniedByDrmException: code 17: error: no suitable queues
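To read a traceback that has gone through syslog, the line breaks can be put back with a one-line replacement. The sketch below assumes the "#012" sequences in the entry above are syslog's escaped newlines (octal 012 is the line-feed character); restore_line_breaks is a name made up for this example, and the sample string is an abbreviated copy of the entry above.

```python
def restore_line_breaks(syslog_line):
    """Replace the '#012' escape (octal 012 = line feed) with real newlines."""
    return syslog_line.replace("#012", "\n")


# Abbreviated copy of the log entry quoted above.
sample = (
    "Unhandled exception calling queue_job"
    "#012Traceback (most recent call last):"
    "#012DeniedByDrmException: code 17: error: no suitable queues"
)

restored = restore_line_breaks(sample)
print(restored)
```

Piping the real syslog entry through the same function should give back the usual multi-line Python traceback.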
From the user's perspective the job is still in, say, the "grey" pending state.
Is that the correct behaviour? It sounds like (from Robert's email) the job is pending and if the cluster administrator attended to it, the job might run.
We're using Univa Grid Engine, and perhaps I am looking in the wrong place, but the job seems to have been rejected rather than held. If so, then Galaxy should treat this as a failed job (red).
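The behaviour being asked for can be sketched roughly as follows: if the DRM rejects the submission outright, report the job as failed instead of leaving it pending. This is not Galaxy's runner code and not the content of any patch. DeniedByDrmException here is a local stand-in for the exception of the same name in the traceback, and FakeSession and submit are hypothetical names that let the sketch run without a cluster.

```python
class DeniedByDrmException(Exception):
    """Local stand-in for the drmaa exception seen in the traceback above."""


class FakeSession:
    """Hypothetical stand-in for a DRMAA session whose runJob always rejects."""

    def runJob(self, job_template):
        raise DeniedByDrmException("code 17: error: no suitable queues")


def submit(session, job_template):
    """Return ('queued', job_id) on success or ('error', reason) on rejection."""
    try:
        external_job_id = session.runJob(job_template)
    except DeniedByDrmException as exc:
        # The DRM refused the job, so there is no external job left to wait for.
        return ("error", str(exc))
    return ("queued", external_job_id)


state, detail = submit(FakeSession(), job_template=None)
print(state, detail)
```

Whether a rejection should be final or retried is a policy choice; the sketch only shows the "treat as failed" option described above.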
Peter
Hey Peter,
Sorry about that. Nate and I talked about this and we believe the following pull request should likely fix the problem.
https://bitbucket.org/galaxy/galaxy-central/pull-request/390
-John
On Mon, May 12, 2014 at 8:53 AM, Peter Cock p.j.a.cock@googlemail.com wrote:
___________________________________________________________
Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
galaxy-dev@lists.galaxyproject.org