Thanks for your response. That seemed to work, up to a point. I reran the same FASTQ->FASTA conversion test and could see, using command-line utilities, that the job started on the grid (SGE is behind it).
However, about 10 minutes have passed since the job stopped running on the grid, and the interface is still spinning as if the job were running. I checked the Galaxy log for that job ID and see these entries:
galaxy.jobs.runners.drmaa INFO 2013-02-26 10:51:15,670 (103) queued as 5224332
galaxy.jobs.runners.drmaa DEBUG 2013-02-26 10:51:16,390 (103/5224332) state change: job is queued and active
galaxy.jobs.runners.drmaa DEBUG 2013-02-26 10:51:30,422 (103/5224332) state change: job is running
galaxy.jobs.runners.drmaa DEBUG 2013-02-26 10:52:32,623 (103/5224332) is still in running state, adding to the DRM queue
galaxy.jobs.runners.drmaa DEBUG 2013-02-26 10:54:31,961 (103/5224332) job left DRM queue with following message: code 18: The job specified by the 'jobid' does not exist.
The SGE qacct utility for that job shows me that it took 3.5 minutes to run and that the return status was 0, so it executed successfully.
Could Galaxy be missing the completion event for the job?
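In case it helps anyone following the same trail, here is a rough Python sketch for pulling one job's entries out of the Galaxy log and measuring the time between state changes. The regular expression is only inferred from the five lines quoted above, so other Galaxy versions or logging configurations may need a different pattern. The names LINE_RE, job_events and SAMPLE are made up for illustration and are not part of Galaxy.

```python
import re
from datetime import datetime

# Pattern inferred from the log lines quoted in this message (an assumption,
# not a documented Galaxy log format):
#   <logger> <LEVEL> <date> <time>,<ms> (<galaxy id>[/<DRM id>]) <message>
LINE_RE = re.compile(
    r"^(?P<logger>\S+) (?P<level>[A-Z]+) "
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(?P<ms>\d{3}) "
    r"\((?P<galaxy_id>\d+)(?:/(?P<drm_id>\d+))?\) (?P<msg>.*)$"
)


def job_events(lines, galaxy_id):
    """Return (timestamp, message) pairs for one Galaxy job ID, in file order."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None or match.group("galaxy_id") != str(galaxy_id):
            continue
        # Milliseconds are dropped; second resolution is enough here.
        ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S")
        events.append((ts, match.group("msg")))
    return events


# The five entries quoted above, used as sample input.
SAMPLE = """\
galaxy.jobs.runners.drmaa INFO 2013-02-26 10:51:15,670 (103) queued as 5224332
galaxy.jobs.runners.drmaa DEBUG 2013-02-26 10:51:16,390 (103/5224332) state change: job is queued and active
galaxy.jobs.runners.drmaa DEBUG 2013-02-26 10:51:30,422 (103/5224332) state change: job is running
galaxy.jobs.runners.drmaa DEBUG 2013-02-26 10:52:32,623 (103/5224332) is still in running state, adding to the DRM queue
galaxy.jobs.runners.drmaa DEBUG 2013-02-26 10:54:31,961 (103/5224332) job left DRM queue with following message: code 18: The job specified by the 'jobid' does not exist.
"""

events = job_events(SAMPLE.splitlines(), 103)
for ts, msg in events:
    print(ts, msg)

# Seconds between "job is running" and the job leaving the DRM queue.
running_to_gone = (events[-1][0] - events[2][0]).total_seconds()
print("seconds from running to leaving the DRM queue:", running_to_gone)
```

To run it against a real log, pass an open file object in place of SAMPLE.splitlines().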
Joshua
On Tue, Feb 26, 2013 at 12:17 AM, Ross <ross.lazarus@gmail.com> wrote:
See if adding the default queue name to the job runner path, e.g.
default_cluster_job_runner = drmaa:///default
works any better?
AFAIK, Galaxy will default to the local runner if it can't find the nominated drmaa path, and I don't think 'default' is the default :)