On a related note, it's definitely nice to be able to consolidate the configuration into a single universe_wsgi.ini, but --daemon and --stop-daemon are not sufficient for smooth operation of a Galaxy instance with multiple web and job runner processes. I wonder if anyone has already adapted run.sh to restart the runners one by one, so that tools and reference datasets can be added without disrupting the user experience.

Cheers,
Alex

On Aug 28, 2012, at 9:03 AM, Nate Coraor <nate@bx.psu.edu> wrote:
On Aug 27, 2012, at 18:13, "Sebastian Schaaf" <schaaf@ibe.med.uni-muenchen.de> wrote:
If you refer to using one machine as master AND host simultaneously, which you never tried: let me tell you it does work well with SGE/OGE. I have it running, and it works out great. It improves load control on that machine, thanks to the extensive configuration options at the queue level (max. cores, queue ranking, user restrictions etc.). I have not connected it to Galaxy yet (which is also installed on the same machine), but I do not see a reason why that should not work. Indeed, the reason for doing it this way was the need to set up a modularized system with defined interfaces. Currently everything is physically on one machine, but that will change. Galaxy, for example, does not care where and how a cluster is set up physically; it just connects to a given resource.
Regarding the initial question: at least the jobs in the cluster queue will indeed finish, because after submission those jobs are decoupled from Galaxy by definition; it's like submitting from a desktop machine that can be switched off afterwards. Additionally, I would be surprised if Galaxy (which has really great fail-safe mechanisms) were not able to reconnect to the cluster after a restart and ask about jobs that were submitted before the restart and are not yet recorded as "finished" or "picked up" in its database. I would say test it and get back to the mailing list :).
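Coming back to the restart question at the top of this thread, a one-by-one restart could be sketched roughly as below. This is only an illustration, not something taken from run.sh: the server names (web0, web1, runner0) are hypothetical and would have to match the [server:...] sections in your own universe_wsgi.ini, the paster invocation and the .pid/.log file names are assumptions about how the processes are started, and the fixed pause is a crude stand-in for a real health check.

```python
import subprocess
import time

# Assumption: these names match [server:<name>] sections in universe_wsgi.ini.
SERVERS = ["web0", "web1", "runner0"]


def paster_cmd(server, action, config="universe_wsgi.ini"):
    """Build the command for one server; action is '--daemon' or '--stop-daemon'."""
    return [
        "python", "./scripts/paster.py", "serve", config,
        "--server-name=%s" % server,
        "--pid-file=%s.pid" % server,  # assumed naming scheme
        "--log-file=%s.log" % server,  # assumed naming scheme
        action,
    ]


def rolling_restart(servers, run=subprocess.call, pause=10, sleep=time.sleep):
    """Stop and start each server in turn so the remaining ones keep serving."""
    for server in servers:
        run(paster_cmd(server, "--stop-daemon"))
        run(paster_cmd(server, "--daemon"))
        sleep(pause)  # crude wait before touching the next server


# Usage, from the Galaxy root directory:
# rolling_restart(SERVERS)
```

The run and sleep parameters exist only so the loop can be dry-run (for example with run=print) before it is pointed at a live instance.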