Thanks for your response.  That seemed to work, up to a point.  I reran the same FASTQ->FASTA conversion test and could see, using command-line utilities, that the job started on the grid (SGE is behind it).

However, about 10 minutes have gone by since the job stopped running on the grid, and the interface is still spinning as if the job were running.  I checked the Galaxy log for that job ID and I see these entries:

galaxy.jobs.runners.drmaa INFO 2013-02-26 10:51:15,670 (103) queued as 5224332
galaxy.jobs.runners.drmaa DEBUG 2013-02-26 10:51:16,390 (103/5224332) state change: job is queued and active
galaxy.jobs.runners.drmaa DEBUG 2013-02-26 10:51:30,422 (103/5224332) state change: job is running
galaxy.jobs.runners.drmaa DEBUG 2013-02-26 10:52:32,623 (103/5224332) is still in running state, adding to the DRM queue
galaxy.jobs.runners.drmaa DEBUG 2013-02-26 10:54:31,961 (103/5224332) job left DRM queue with following message: code 18: The job specified by the 'jobid' does not exist.

The SGE qacct utility for that job shows me that it took 3.5 minutes to run and that the return status was 0, so it executed successfully.

It seems that Galaxy is missing the completion event for the job?
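
To make the timing in the log above easier to read, here is a minimal Python sketch that parses those lines and measures the gap between two events. The regular expression is only inferred from the five entries quoted above, and the helper names (parse_line, job_timeline, seconds_between) are made up for illustration; none of this is part of Galaxy.

```python
import re
from datetime import datetime

# Layout inferred from the quoted entries only (an assumption, not a Galaxy spec):
# <logger> <LEVEL> <YYYY-MM-DD HH:MM:SS,mmm> (<galaxy id>[/<DRM id>]) <message>
LINE_RE = re.compile(
    r"^(?P<logger>\S+)\s+(?P<level>[A-Z]+)\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"\((?P<galaxy_id>\d+)(?:/(?P<drm_id>\d+))?\)\s+(?P<msg>.*)$"
)

SAMPLE_LOG = """
galaxy.jobs.runners.drmaa INFO 2013-02-26 10:51:15,670 (103) queued as 5224332
galaxy.jobs.runners.drmaa DEBUG 2013-02-26 10:51:16,390 (103/5224332) state change: job is queued and active
galaxy.jobs.runners.drmaa DEBUG 2013-02-26 10:51:30,422 (103/5224332) state change: job is running
galaxy.jobs.runners.drmaa DEBUG 2013-02-26 10:52:32,623 (103/5224332) is still in running state, adding to the DRM queue
galaxy.jobs.runners.drmaa DEBUG 2013-02-26 10:54:31,961 (103/5224332) job left DRM queue with following message: code 18: The job specified by the 'jobid' does not exist.
"""


def parse_line(line):
    """Return a dict for one log line, or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["ts"] = datetime.strptime(record["ts"], "%Y-%m-%d %H:%M:%S,%f")
    return record


def job_timeline(lines, galaxy_id):
    """Keep only the parsed entries that belong to one Galaxy job ID."""
    parsed = (parse_line(line) for line in lines)
    return [rec for rec in parsed if rec and rec["galaxy_id"] == str(galaxy_id)]


def seconds_between(events, first_marker, second_marker):
    """Seconds from the first entry containing first_marker to the first containing second_marker."""
    first = next(rec for rec in events if first_marker in rec["msg"])
    second = next(rec for rec in events if second_marker in rec["msg"])
    return (second["ts"] - first["ts"]).total_seconds()


if __name__ == "__main__":
    events = job_timeline(SAMPLE_LOG.splitlines(), 103)
    gap = seconds_between(events, "state change: job is running", "job left DRM queue")
    print("running -> left DRM queue: %.3f s" % gap)
```

On the entries above, this reports roughly three minutes between the "job is running" state change and the "job left DRM queue" message, which is in the same range as the run time qacct reported.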

Joshua




On Tue, Feb 26, 2013 at 12:17 AM, Ross <ross.lazarus@gmail.com> wrote:
See if adding the default queue name to the job runner path, e.g.:
default_cluster_job_runner = drmaa:///default
works any better?
AFAIK, Galaxy will fall back to the local runner if it can't find the nominated drmaa path, and I don't think 'default' is the default :)



On Tue, Feb 26, 2013 at 4:51 PM, Joshua Orvis <jorvis@gmail.com> wrote:
I have a working local Galaxy instance and wanted to enable DRMAA support to utilize our SGE (or LSF) grid.  Following the guide here, I set what appeared to be needed to make this work: the DRMAA_LIBRARY_PATH environment variable, the configuration settings in universe_wsgi.ini, reconfiguring the server hosting Galaxy as a submit host, and so on.  Some specific config file changes made:

    new_file_path = /seq/gscidA/www/gscid_devel/htdocs/galaxy-dist/database/tmp
    start_job_runners = drmaa
    default_cluster_job_runner = drmaa:///
    set_metadata_externally = True
    outputs_to_working_directory = True

I then killed and restarted the Galaxy instance and tried a simple FASTQ -> FASTA test execution, but it ran locally.  I couldn't find any sort of errors or messages related to DRMAA in the server log, and the job ran to completion.  I commented out the local tool runner overrides.  What can I do to test my DRMAA configuration and where should I look for errors?
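
As a first sanity check on the environment side, the following Python sketch looks only at the DRMAA_LIBRARY_PATH variable mentioned above: whether it is set, whether it points to a file, and whether that file can be loaded as a shared library. The function name check_drmaa_library is hypothetical, and the check says nothing about Galaxy's own configuration. It would need to run as the same user and in the same environment that starts Galaxy.

```python
import ctypes
import os


def check_drmaa_library(environ=None):
    """Return a list of problems found; an empty list means the basic checks passed.

    Only DRMAA_LIBRARY_PATH is inspected. This does not submit a job and does
    not read any Galaxy configuration file.
    """
    environ = os.environ if environ is None else environ
    problems = []
    path = environ.get("DRMAA_LIBRARY_PATH")
    if not path:
        problems.append("DRMAA_LIBRARY_PATH is not set")
    elif not os.path.isfile(path):
        problems.append("DRMAA_LIBRARY_PATH does not point to a file: %s" % path)
    else:
        try:
            # Loading the shared object catches a wrong architecture or missing dependencies.
            ctypes.CDLL(path)
        except OSError as exc:
            problems.append("could not load %s: %s" % (path, exc))
    return problems


if __name__ == "__main__":
    for line in check_drmaa_library() or ["DRMAA_LIBRARY_PATH looks usable"]:
        print(line)
```

An empty result only means the library can be found and loaded; whether Galaxy actually routes jobs to the grid still depends on the runner settings listed above.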

Thanks -

Joshua
