Hi John,

I'm glad you found an alternative solution. Late last night, I realized I had forgotten to email you yesterday when I finished implementing the feature for disabling the master from running jobs within CloudMan (https://bitbucket.org/galaxy/cloudman/changeset/b21e967d30f9). I have not yet updated the official CloudMan (because there are some other features I'd like to add before doing so), but it is possible to get this code by pulling CloudMan's source from bitbucket, creating cm.tar.gz, and uploading it to your cluster's bucket. Let me know if you'd like to give it a shot and I can point you to a couple of scripts that make the task trivial.

I also like the solution you found because it does not necessarily exclude the master from running jobs; it just limits it. I'll see about adding that option as well.

On Tue, Mar 13, 2012 at 5:36 AM, John Major <john.e.major.jr@gmail.com> wrote:
Enis-
Thanks again for your advice. It led me on a small SGE learning dive and I actually settled upon a different solution that might be easier to configure and figured I'd share.
Using 'qconf -me IP' I set: 'complex_values slots=0' (or 1 if the head node was bigger).
This seemed to do the trick: the head node did not get overwhelmed, and nodes were added quickly when jobs were not fast to execute. It also seems to persist during auto scaling.
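A non-interactive variant of the same change might look like the sketch below. It assumes the exec host attribute is spelled complex_values and that 'qconf -mattr exechost' is available in the installed SGE version; the helper name cap_slots_cmd is made up for illustration. It only prints the command, so nothing is changed until the output is run as root on the master.

```shell
# Hypothetical helper: print (not run) the qconf call that caps a host's slots.
# Assumes the exec host attribute name is complex_values.
cap_slots_cmd() {
    host="$1"    # execution host name as SGE knows it
    slots="$2"   # 0 keeps jobs off the host, 1 allows a single job
    echo "qconf -mattr exechost complex_values slots=$slots $host"
}

# Dry run for the master from this thread.
cap_slots_cmd ip-10-204-170-63 0
```

Printing rather than executing keeps the sketch safe to try on a machine without SGE; whether CloudMan preserves the setting across scaling events is only what was observed above, not something the sketch checks.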
John
On Tue, Mar 6, 2012 at 4:00 PM, Enis Afgan <enis.afgan@irb.hr> wrote:
Hi John, Theoretically, this is a straightforward task but in reality CloudMan gets in the way of making it stick. Namely, if you are to manually remove the master instance from being an execution host, CloudMan will add it back in the next time a node is added or removed from the cluster, thus negating your manual modification. So, I will add this feature to CloudMan itself but I cannot commit to a date right now. It should be soon though.
In the meantime, if you'd like to script this yourself via an ad hoc solution that runs periodically, below is the procedure for manually removing a node from SGE's execution host list:

ubuntu@ip-10-204-170-63:~$ sudo -s
root@ip-10-204-170-63:~# qhost
HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
-------------------------------------------------------------------------------
global                  -               -     -       -       -       -       -
ip-10-204-170-63        lx24-amd64      1  1.11  615.2M  161.6M     0.0     0.0

# Remove the host from the list of execution hosts
root@ip-10-204-170-63:~# qconf -de ip-10-204-170-63
Host object "ip-10-204-170-63" is still referenced in cluster queue "all.q".

# Edit the configuration of allhosts and remove the host in question. If this is the only host in the list, replace its name with the word NONE
root@ip-10-204-170-63:~# qconf -mhgrp "@allhosts"
root@ip-10-204-170-63.ec2.internal modified "@allhosts" in host group list

# Show the configuration of group allhosts
root@ip-10-204-170-63:~# qconf -shgrp "@allhosts"
group_name @allhosts
hostlist NONE

# The host is now removed from the list of execution hosts
root@ip-10-204-170-63:~# qstat -f
root@ip-10-204-170-63:~#
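For a script that runs periodically, the hostlist edit from the procedure above could be sketched as below. This is only an illustration, not CloudMan code: strip_host is a made-up helper that computes the new hostlist value (NONE when no hosts remain), and the second host name in the example call is hypothetical. Applying the result to a live cluster is left as a comment and assumes 'qconf -dattr' behaves as in common SGE installs.

```shell
# Hypothetical helper: print a hostlist with one host removed.
# Usage: strip_host TARGET HOST [HOST ...]
strip_host() {
    target="${1%%.*}"   # compare short names, so ip-x matches ip-x.ec2.internal
    shift
    kept=""
    for h in "$@"; do
        if [ "${h%%.*}" != "$target" ]; then
            kept="${kept:+$kept }$h"
        fi
    done
    if [ -z "$kept" ]; then
        echo "NONE"     # SGE expects the word NONE for an empty hostlist
    else
        echo "$kept"
    fi
}

# Example: the master plus one made-up worker node.
strip_host ip-10-204-170-63 ip-10-204-170-63.ec2.internal ip-10-0-0-5.ec2.internal

# On a live cluster, an assumed one-step alternative to editing by hand:
#   qconf -dattr hostgroup hostlist ip-10-204-170-63 @allhosts
```

Because CloudMan re-adds the master whenever a node is added or removed, such a script would need to be rerun after each scaling event, for example from cron.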
On Wed, Mar 7, 2012 at 6:23 AM, John Major <john.e.major.jr@gmail.com> wrote:
Hello All-
I'd like to launch a galaxy-cloudman head node that does not accept SGE jobs itself; instead, submitted jobs should go to compute nodes (or cause compute nodes to be added when auto-scale is on). Primarily, this is b/c I'd like the head node to be a cheaper instance that can run long term, and only fire up more expensive compute nodes as they are actually needed.
How would I enable this?
Thanks- John