Re: [galaxy-dev] Galaxy Hang after DrmCommunicationException

21 Jan 2012


Yes, Nate, but that block fails the job even though it is, in fact, still
running. The error should be ignored instead:

            except Exception, e:
                # so we don't kill the monitor thread
                log.exception("(%s/%s) Unable to check job status" % ( galaxy_job_id, job_id ) )
                log.warning("(%s/%s) job will now be errored" % ( galaxy_job_id, job_id ) )
                drm_job_state.fail_message = "Cluster could not complete job"
                self.work_queue.put( ( 'fail', drm_job_state ) )
                continue
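
What I have in mind is roughly the following. This is only a self-contained sketch, not the actual drmaa.py code: the DrmCommunicationException class is a local stand-in for the DRMAA error in the subject line, and get_status, the dict-based job state, and the "done" status string are made up for illustration. The idea is to treat a communication error as transient and keep the job on the watched list, while still failing the job on any other exception.

```python
import logging
import queue

log = logging.getLogger(__name__)


class DrmCommunicationException(Exception):
    """Local stand-in for the DRMAA communication error (assumption)."""


def check_watched_items(watched, get_status, work_queue):
    """Poll each watched job once and return the jobs to keep watching.

    `get_status` is a hypothetical callable that returns a status string
    for a job id, or raises if the cluster cannot be queried.
    """
    still_watched = []
    for job_state in watched:
        job_id = job_state["job_id"]
        try:
            status = get_status(job_id)
        except DrmCommunicationException:
            # Transient: the job may still be running, so check again later.
            log.warning("(%s) could not reach the DRM, will retry", job_id)
            still_watched.append(job_state)
            continue
        except Exception:
            # Any other error: fail the job without killing the monitor loop.
            log.exception("(%s) Unable to check job status", job_id)
            job_state["fail_message"] = "Cluster could not complete job"
            work_queue.put(("fail", job_state))
            continue
        if status == "done":
            work_queue.put(("finish", job_state))
        else:
            still_watched.append(job_state)
    return still_watched
```

A real version would probably also want to cap the number of consecutive retries, so that a job whose scheduler never comes back does not stay on the watched list forever.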

On Fri, Jan 20, 2012 at 9:40 AM, Nate Coraor <nate@bx.psu.edu> wrote:
...
Hi Ann,
The cause of the exception aside, this should be caught by the except block below it in drmaa.py (in check_watched_items()):
           except Exception, e:
               # so we don't kill the monitor thread
               log.exception("(%s/%s) Unable to check job status" % ( galaxy_job_id, job_id ) )
What changeset are you running?
--nate

Edward Kirton