Dear list,
I have implemented a new DRMAA job runner that can detect runtime and memory violations. It does this using the drmaa library functions job_info + wait or, as a fallback, the command-line tools qstat + qacct. An additional advantage is that the runner also works in setups that use the external runner scripts to start jobs as the real user (the original DRMAAJobRunner cannot query these jobs at all).
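To illustrate the fallback idea, here is a minimal sketch (not the code from the pull request) of how the output of `qacct -j <job id>` could be checked for memory and runtime violations. It assumes Grid Engine style accounting fields named maxvmem and ru_wallclock; the function names, the limits passed in, and the returned labels are hypothetical and only serve the example.

```python
import re

# Multipliers for the size suffixes assumed to appear in qacct output.
_UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}


def parse_qacct(text):
    """Turn `qacct -j <id>` style output into a dict of field -> value string."""
    info = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        # Separator lines ("=====") and empty lines have no value part.
        if len(parts) == 2:
            info[parts[0]] = parts[1].strip()
    return info


def to_bytes(value):
    """Convert a size such as '2.500G' or '512M' to a number of bytes."""
    match = re.fullmatch(r"([0-9.]+)\s*([KMGT]?)B?", value.strip(), re.IGNORECASE)
    if match is None:
        raise ValueError("unrecognised size: %r" % value)
    return float(match.group(1)) * _UNITS[match.group(2).upper()]


def classify(info, mem_limit_bytes, time_limit_s):
    """Return a (hypothetical) label describing why the job may have failed."""
    if to_bytes(info.get("maxvmem", "0")) >= mem_limit_bytes:
        return "memory_violation"
    # Some versions append an 's' to the wallclock value; strip it if present.
    wallclock = float(info.get("ru_wallclock", "0").rstrip("s"))
    if wallclock >= time_limit_s:
        return "runtime_violation"
    return "ok"
```

A runner could call something like classify(parse_qacct(output), mem_limit, time_limit) once the job has left the queue and use the result to decide whether to resubmit with larger limits. The exact fields and thresholds depend on the scheduler configuration.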
You may have a look here:
https://github.com/galaxyproject/galaxy/pull/4275
I have successfully tested both settings: jobs submitted as the galaxy user and as the real user. Resubmission after memory and time violations also seems to work.
It would be great to get some comments.
One problem I encountered is that upload jobs do not work in the real user setting (with either the original or the new runner): the permissions of the uploaded file in /gpfs1/data/galaxy_server/galaxy-dev/database/tmp/ are not changed. Any idea what needs to be changed to get this running?
Best, Matthias