I finally found out that this nativeSpecification is NOT about the native parameters of Slurm. It comes from the DRMAA "wrapper" for Slurm: http://apps.man.poznan.pl/trac/slurm-drmaa/wiki/WikiStart#Nativespecificatio... and it does not support --mem-per-cpu, which is a huge restriction. The logger seems to print drmaa because that is the base class of the SlurmRunner.

Has anybody stumbled upon this before? One way out would be a Slurm runner that uses native calls instead of DRMAA. Pros/cons? Any opinions on this topic, or solutions to get the --mem-per-cpu option? I read that this is criticised by some people in the slurm-drmaa community too. Patches are available, but not included in a release.
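To make the idea concrete, such a runner would in essence just shell out to sbatch, so every option sbatch itself understands would work. A minimal sketch (not Galaxy code; submit_native and its parameters are made up for illustration):

    import re
    import subprocess

    def submit_native(script_path, nodes=1, cpus_per_task=1, mem_per_cpu="4G"):
        """Hypothetical native submission: call sbatch directly, with no
        slurm-drmaa native-specification parser in between."""
        cmd = [
            "sbatch",
            "-N", str(nodes),
            "--cpus-per-task", str(cpus_per_task),
            "--mem-per-cpu=" + mem_per_cpu,
            script_path,
        ]
        out = subprocess.check_output(cmd).decode()
        # sbatch prints something like: "Submitted batch job 12345"
        match = re.search(r"\d+", out)
        if not match:
            raise RuntimeError("could not parse job id from sbatch output: %r" % out)
        return match.group(0)

The obvious con is that everything DRMAA currently gives us for free - job state polling (squeue/sacct), error handling, cleanup - would have to be reimplemented.

Best,
Alexander

2015-07-31 11:38 GMT-05:00 Alexander Vowinkel <vowinkel.alexander@gmail.com>: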
Hi,
can someone tell me something about this? I am particularly puzzled by the runner named in the log vs. the runner I expected.
Thanks, Alexander
2015-07-06 18:08 GMT-05:00 Alexander Vowinkel <vowinkel.alexander@gmail.com>:
Hi,
I have adapted /mnt/galaxy/galaxy-app/config/job_conf.xml: I added a destination with the following param:

    <param id="nativeSpecification">-N1 --cpus-per-task 1 --mem-per-cpu=4G</param>
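For context, the relevant pieces of job_conf.xml fit together roughly like this (the destination id slurm_mem is just a placeholder name):

    <job_conf>
        <plugins>
            <plugin id="slurm" type="runner" load="galaxy.jobs.runners.slurm:SlurmJobRunner"/>
        </plugins>
        <destinations default="slurm_mem">
            <destination id="slurm_mem" runner="slurm">
                <param id="nativeSpecification">-N1 --cpus-per-task 1 --mem-per-cpu=4G</param>
            </destination>
        </destinations>
    </job_conf>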
The Galaxy service was restarted after changing this. Now I get the error "Unable to run job due to a misconfiguration of the Galaxy job running system. Please contact a site administrator." when I try to run a job. The log shows:
galaxy.jobs.runners.drmaa DEBUG 2015-07-06 22:50:40,187 (493) submitting file /mnt/galaxy/tmp/job_working_directory/000/493/galaxy_493.sh
galaxy.jobs.runners.drmaa DEBUG 2015-07-06 22:50:40,187 (493) native specification is: -N1 --cpus-per-task 1 --mem-per-cpu=4G
galaxy.jobs.runners.drmaa ERROR 2015-07-06 22:50:40,187 (493) drmaa.Session.runJob() failed unconditionally
Traceback (most recent call last):
[...]
InvalidAttributeValueException: code 14: Invalid native specification: -N1 --cpus-per-task 1 --mem-per-cpu=4G
However, running this as the galaxy user in the console with

$ srun -N1 --cpus-per-task 1 --mem-per-cpu=4G /mnt/galaxy/tmp/job_working_directory/000/494/galaxy_494.sh

actually works fine.
I was wondering if this is connected with the fact that the log says "galaxy.jobs.runners.drmaa" and not something like galaxy.jobs.runners.slurm, as defined in job_conf.xml:

    <plugin id="slurm" type="runner" load="galaxy.jobs.runners.slurm:SlurmJobRunner" />
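(Side note on the logger name: in Python, a logger created with logging.getLogger(__name__) is named after the module that creates it, so methods inherited from a base class keep logging under the base module's name. A minimal sketch with made-up module names:)

    # runners_drmaa.py -- stands in for galaxy/jobs/runners/drmaa.py
    import logging

    log = logging.getLogger(__name__)  # logger is named after THIS module

    class DRMAAJobRunner(object):
        def queue_job(self, job_id):
            # Inherited methods log through the base module's logger,
            # so the record carries the base module's name.
            log.debug("(%s) submitting file ...", job_id)

    # runners_slurm.py -- stands in for galaxy/jobs/runners/slurm.py
    from runners_drmaa import DRMAAJobRunner

    class SlurmJobRunner(DRMAAJobRunner):
        pass  # queue_job() is inherited; its log lines still say "runners_drmaa"

So the drmaa in the log line does not by itself prove that the wrong runner was loaded.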
So: what is going wrong here? Am I working on the wrong job_conf.xml? What is Galaxy doing here?
Thanks for the help!
Alexander