dynamically send jobs to second cluster on high load
Hi,

The admin pages state that it is possible to specify multiple clusters in the universe file. We are currently investigating whether we can couple the university HPC platform to Galaxy to handle usage peaks. It would be ideal if the job manager could check the load of the dedicated cluster (e.g. queue length) and send jobs to the second cluster when the load is above a threshold.

Does such an approach exist already, or will it become available in the near future? As far as I understand, it is currently only possible to specify which jobs run on which cluster, without dynamic switching.

Best regards,

Geert

--
Geert Vandeweyer, Ph.D.
Department of Medical Genetics
University of Antwerp
Prins Boudewijnlaan 43
2650 Edegem
Belgium
Tel: +32 (0)3 275 97 56
E-mail: geert.vandeweyer@ua.ac.be
http://ua.ac.be/cognitivegenetics
http://www.linkedin.com/pub/geert-vandeweyer/26/457/726
Hello Geert,

I don't believe any such functionality is available out of the box, but I am confident that clever use of dynamic job runners (http://lists.bx.psu.edu/pipermail/galaxy-dev/2012-June/010080.html) could solve this problem.

One approach would be to move all of your job runners out of galaxy:tool_runners into a new section (say, galaxy:tool_runners_local) and then create another set of runners for your HPC resource (say, galaxy:tool_runners_hpc). Next, set your default_cluster_job_runner to dynamic:///python/default_runner and create a Python function called default_runner in lib/galaxy/jobs/rules/200_runners.py. The outline of that file might look something like this:

from ConfigParser import ConfigParser


def default_runner(tool_id):
    # When the dedicated cluster is busy, spill over to the HPC runners;
    # otherwise keep the job on the local runners.
    if _local_queue_busy():
        runner = _get_runner("galaxy:tool_runners_hpc", tool_id)
    else:
        runner = _get_runner("galaxy:tool_runners_local", tool_id)
    if not runner:
        runner = "local://"  # Or whatever default behavior you want.
    return runner


def _local_queue_busy():
    # TODO: check the local queue; would need to know more...
    return False  # Placeholder until the queue check is implemented.


def _get_runner(runner_section, tool_id):
    universe_config_file = "universe_wsgi.ini"
    parser = ConfigParser()
    parser.read(universe_config_file)
    job_runner = None
    if parser.has_option(runner_section, tool_id):
        job_runner = parser.get(runner_section, tool_id)
    return job_runner

You could tweak the logic here, for instance to submit only certain kinds of jobs to the HPC resource, or to specify a different default runner for each location.

Hopefully this is helpful. If you want more help defining this file, I could fill in the details if I knew more precisely what behavior you wanted for each queue and what command line determines whether the dedicated Galaxy resource is busy (or just which queue manager you are using, if any). Let me know if you go ahead and get this working; I am eager to hear success stories.

-John

------------------------------------------------
John Chilton
Senior Software Developer
University of Minnesota Supercomputing Institute
Office: 612-625-0917
Cell: 612-226-9223
Bitbucket: https://bitbucket.org/jmchilton
Github: https://github.com/jmchilton
Web: http://jmchilton.net
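For concreteness, the universe_wsgi.ini layout John describes might look something like the sketch below. Only the two section names and the dynamic default_cluster_job_runner setting come from his outline; the tool IDs and drmaa runner URLs are illustrative placeholders, not details from this thread.

[galaxy:tool_runners_local]
# Illustrative tool-to-runner mappings for the dedicated Galaxy cluster.
bwa_wrapper = drmaa://-q galaxy.q/
bowtie_wrapper = drmaa://-q galaxy.q/

[galaxy:tool_runners_hpc]
# Illustrative mappings for the university HPC resource.
bwa_wrapper = drmaa://-q hpc.q/
bowtie_wrapper = drmaa://-q hpc.q/

[app:main]
# Route jobs through the dynamic rule defined in
# lib/galaxy/jobs/rules/200_runners.py.
default_cluster_job_runner = dynamic:///python/default_runner

With this layout, the default_runner function above looks each tool up by ID in one of the two sections, so any tool listed in neither section falls through to the local:// default.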
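John notes that filling in _local_queue_busy() depends on which queue manager is in use. Purely as a hedged sketch, assuming an SGE- or Torque-style scheduler that exposes a qstat command and an arbitrary job-count threshold (both assumptions, not details from this thread), it could look like:

from subprocess import Popen, PIPE

# Illustrative threshold: spill over to the HPC runners once more
# than this many jobs show up in the local queue.
MAX_QUEUED_JOBS = 20


def _local_queue_busy():
    # Assumes a scheduler with a qstat command (SGE/Torque style).
    try:
        process = Popen(["qstat"], stdout=PIPE, stderr=PIPE)
    except OSError:
        # qstat not found; keep jobs on the dedicated cluster.
        return False
    stdout, stderr = process.communicate()
    if process.returncode != 0:
        # Queue cannot be inspected; play it safe and stay local.
        return False
    lines = [line for line in stdout.splitlines() if line.strip()]
    # SGE-style qstat prints two header lines before the job list.
    return max(0, len(lines) - 2) > MAX_QUEUED_JOBS

Counting qstat output lines is crude; a real implementation would parse per-queue job states or query the scheduler for pending-job counts directly, and might cache the result so the check does not run a subprocess on every job submission.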