Galaxy jobs on Apache Mesos with chronos.py runner
Dear experts,

I'm trying to run Galaxy (version 18.05) jobs on Apache Mesos (version 1.5.0). I configured NFS between the Galaxy server and the Mesos cluster nodes, sharing /home/galaxy and /path/to/galaxy/database (the citations, compiled_templates, files, ftp, home, job_working_directory, object_store_cache, and tmp directories). I then changed job_conf.xml accordingly: https://gist.github.com/pmandreoli/6ffba03193717393a2322586686f9aed

This setup works fine with my very simple test wrapper: https://gist.github.com/pmandreoli/ce120612afd0ac9ee80ce70c90e7d324

Next I enabled mulled containers in galaxy.yml in order to test the configuration with FastQC (version 0.72, owner devteam). In this case the job was correctly executed on the Mesos node (using the container quay.io/biocontainers/fastqc:0.11.8--1), but the results were not linked in the history (see fig. 1).

[fig1. history panel screenshot for the FastQC job on the chronos destination]

You can have a look at the output here: https://gist.github.com/pmandreoli/bbbeb2eab5c1d1772872220c01678e15

I checked the docker run command on Chronos, and the working directory is located at /root/working.

To check whether the problem is related to my job_conf.xml configuration rather than to the mulled containers, I changed job_conf.xml to run jobs locally ( https://gist.github.com/pmandreoli/484566b2c548d39d8bddb5aa54461ecc ) and ran the same tool (FastQC 0.72). In this case everything was fine. Is my job_conf.xml configuration correct?

I would also like to ask whether it is possible to add more than one volume to the Docker container run on Mesos, so that I can mount the reference data located on my Mesos slave node under /cvmfs. I tried to modify the job_conf.xml block in this way:

<param id="volumes">/export/galaxy/database/,/cvmfs/</param>

but the job failed. The docker run command sent to Chronos was indeed wrong: "-v /export/galaxy/database/,/cvmfs/:/export/galaxy/database/,/cvmfs/:rw". This is probably due to the definition of the "volumes" field in the chronos.py runner: https://github.com/galaxyproject/galaxy/blob/3b3b52f013ac8c6b5bf8a4765f9fe9c... which, if I understand correctly, allows only a single path. Did I miss something?

Any suggestion or correction is, of course, more than welcome.

Best regards,
Pietro Mandreoli
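To make the failure mode concrete, here is a minimal sketch, assuming the runner expands the whole "volumes" value as a single "-v <path>:<path>:rw" mount; it reproduces the broken flag above and is only an illustration, not the actual chronos.py code:

    # Illustration only, not the actual chronos.py code: treating the whole
    # comma-separated "volumes" parameter as one path reproduces the broken
    # docker flag observed on Chronos.
    volumes = '/export/galaxy/database/,/cvmfs/'
    flag = '-v {v}:{v}:rw'.format(v=volumes)
    print(flag)
    # -v /export/galaxy/database/,/cvmfs/:/export/galaxy/database/,/cvmfs/:rw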
Hi Pietro,

Thanks for trying that runner! Some background on the volume directory issue:

https://github.com/galaxyproject/galaxy/pull/3946
https://github.com/galaxyproject/galaxy/pull/3946/files/2099c09f6ab5a8f5951d...

It looks to be a known and documented limitation of this runner that it only allows one directory to be mounted in. I don't think the devteam really maintains this runner or has the throughput to, so you'll probably have to patch the runner to support multiple volumes if you need it to work :(. I assume the underlying API it leverages would allow that; the Kubernetes runner, for instance, supports multiple volumes. If you figure that out, we'd love a PR though.

Sorry I don't have better news and can't be more helpful.

-John
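As a starting point, a minimal sketch of the kind of patch this would take, assuming Chronos accepts a list of {containerPath, hostPath, mode} volume dicts in its Docker container spec; the parse_volumes helper name is hypothetical and not part of chronos.py:

    # Hypothetical helper, not part of chronos.py: split a comma-separated
    # "volumes" destination parameter into one Chronos volume spec per path,
    # instead of treating the whole string as a single mount.
    def parse_volumes(volumes_param):
        if not volumes_param:
            return []
        return [
            {'containerPath': path, 'hostPath': path, 'mode': 'RW'}
            for path in (p.strip() for p in volumes_param.split(','))
            if path
        ]

    # parse_volumes('/export/galaxy/database/,/cvmfs/') ->
    # [{'containerPath': '/export/galaxy/database/', 'hostPath': '/export/galaxy/database/', 'mode': 'RW'},
    #  {'containerPath': '/cvmfs/', 'hostPath': '/cvmfs/', 'mode': 'RW'}]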