On Jan 20, 2012, at 7:31 PM, Edward Kirton wrote:
Yes, Nate, but that fails the job even though it is, in fact, still running; the error should be ignored.
    except Exception, e:
        # so we don't kill the monitor thread
        log.exception("(%s/%s) Unable to check job status" % ( galaxy_job_id, job_id ) )
        log.warning("(%s/%s) job will now be errored" % ( galaxy_job_id, job_id ) )
        drm_job_state.fail_message = "Cluster could not complete job"
        self.work_queue.put( ( 'fail', drm_job_state ) )
        continue
I was curious why Ann's DrmCommunicationException appeared to be uncaught. I see now that I misread it: it was caught and then printed via log.exception().
Okay, I applied your catch in 6578:84ee6eeedb41. Thanks!
--nate
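For anyone following the thread, here is a minimal sketch of the distinction being discussed: a communication error leaves the job on the watch list (it may still be running), while any other error fails the job without killing the monitor loop. This is not the code from drmaa.py or from the changeset above. It is written in Python 3 so it runs standalone, and the names DrmCommunicationError, query_status, and the tuple layout on the queue are all hypothetical.

```python
import logging
import queue

log = logging.getLogger(__name__)


class DrmCommunicationError(Exception):
    """Hypothetical stand-in for a transient 'could not reach the DRM' error."""


def check_watched_items(watched, query_status, work_queue):
    """Poll each watched job once and return the jobs that stay watched.

    Assumptions for this sketch: `watched` is a list of
    (galaxy_job_id, job_id) pairs, `query_status` is a callable that
    returns "running" or "done" (or raises), and `work_queue` receives
    ('finish' | 'fail', ids) tuples.
    """
    still_watched = []
    for galaxy_job_id, job_id in watched:
        try:
            state = query_status(job_id)
        except DrmCommunicationError:
            # Transient: the job may still be running, so keep watching it.
            log.warning("(%s/%s) Unable to reach DRM, will retry",
                        galaxy_job_id, job_id)
            still_watched.append((galaxy_job_id, job_id))
            continue
        except Exception:
            # Anything else: log it and fail the job, but do not let the
            # exception escape and kill the monitor loop.
            log.exception("(%s/%s) Unable to check job status",
                          galaxy_job_id, job_id)
            work_queue.put(("fail", (galaxy_job_id, job_id)))
            continue
        if state == "running":
            still_watched.append((galaxy_job_id, job_id))
        else:
            work_queue.put(("finish", (galaxy_job_id, job_id)))
    return still_watched
```

The order of the except clauses matters here: the narrower communication error has to come before the catch-all, otherwise the catch-all would fail a job that is still running.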
On Fri, Jan 20, 2012 at 9:40 AM, Nate Coraor <nate@bx.psu.edu> wrote:
Hi Ann,
The cause of the exception aside, this should be caught by the except block below it in drmaa.py (in check_watched_items()):
    except Exception, e:
        # so we don't kill the monitor thread
        log.exception("(%s/%s) Unable to check job status" % ( galaxy_job_id, job_id ) )
What changeset are you running?
--nate