I'm running a test Galaxy system on a cluster (merged galaxy-dist on January 4th), and I've noticed some odd behavior from the DRMAA job runner.

I'm running a multi-process setup: one web server, one job_manager, and three job_handlers. DRMAA is the default job runner (the runner URL for tophat2 is drmaa://-V -l mem_total=7G -pe smp 2/), with SGE 6.2u5 as the engine underneath.
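
In case the runner URL itself matters, here is a minimal sketch of how I understand it to break down. It assumes that everything between `drmaa://` and the trailing `/` is handed to SGE as the native specification. The helper name `native_spec_from_runner_url` is hypothetical and is not Galaxy's own code.

```python
def native_spec_from_runner_url(url):
    """Hypothetical helper: return the native options embedded in a drmaa:// URL."""
    prefix = "drmaa://"
    if not url.startswith(prefix):
        raise ValueError("not a drmaa runner URL: %r" % url)
    # Drop the scheme and the trailing slash; what remains is assumed
    # to be the native specification passed to the DRM (SGE here).
    return url[len(prefix):].rstrip("/").strip()


spec = native_spec_from_runner_url("drmaa://-V -l mem_total=7G -pe smp 2/")
print(spec)
```

Under that assumption, each tophat2 job would be submitted with `-V -l mem_total=7G -pe smp 2`, i.e. asking for 7G of memory and two slots in the smp parallel environment.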

My test involves running three different Tophat2 jobs. The first two seem to start up (and get put on the SGE queue), but the third stays grey, with the job manager listing it in state 'new' with command line 'None'. It never seems to leave this state. Both of the jobs that actually got onto the queue die (reasons unknown, but much too early, probably some tophat/bowtie problem). One of them is listed in error state with stderr 'Job output not returned from cluster', while the other (which is no longer in the SGE queue) is still listed as running.

Any ideas?

Kyle