Glad to hear it's working now. Sorry about the trouble.
The startup procedure is available here:
As part of a larger documentation redo, I'll add it somewhere on the main
wiki as well.
On Mon, Jun 22, 2015 at 4:45 PM, Nicholas Dickens <
Nick.Dickens(a)glasgow.ac.uk> wrote:
Dear Enis,
Thanks for your help with this. I can confirm that if I use a password
that doesn’t start with an exclamation mark, adding nodes in CloudMan works
fine. The password can contain an ! but just can’t start with one.
Knowing where to look logs-wise will really help – is there a schematic
at all somewhere that shows the Cloudman startup procedure? I’m working on
one for my own understanding but I’m a firm believer in not reinventing the
wheel.
Best wishes,
Nick
--
Nick Dickens
DPhil BSc ARCS
Bioinformatics Team Leader
Wellcome Trust Centre for Molecular Parasitology
B6-21 SGDB
120 University Place
Glasgow
G12 8TA
Tel: +44 141 330 8282
http://fb.me/WTCMPbix
@WTCMPbix
http://www.gla.ac.uk/researchinstitutes/iii/staff/nickdickens/
http://www.gla.ac.uk/researchinstitutes/iii/staff/jeremymottram/comparati...
From: Nick Dickens <nick.dickens(a)glasgow.ac.uk>
Date: Monday, 22 June 2015 17:02
To: Enis Afgan <enis.afgan(a)irb.hr>
Cc: "galaxy-dev(a)lists.galaxyproject.org" <
galaxy-dev(a)lists.galaxyproject.org>
Subject: Re: [galaxy-dev] Cloudman cluster not starting workers
Aha - so neither /tmp/cm/cm_boot.py.log nor /mnt/cm/paster.log exists,
but the ec2autorun.log showed a reasonable error: I used a setup password
beginning with an exclamation mark, which it seems not to like. I realised
that I accidentally posted this to the list previously, so don't worry about
it being here; I've killed that particular cluster. When I was trying
different configurations, etc., I also consistently used the same password
format (which I will no longer use now that I've posted it to a public
mailing list like a moron).
I assume that since ec2autorun is first in the bootstrap setup, if it fails
then so does everything else. I'll try it with a different password format
and get back to you (I have a meeting just now). But this looks like an
issue with the password format to me...and possibly a bug in the script?
Best wishes,
Nick
[INFO] ec2autorun:57 2015-06-22 15:38:42,207: Getting user data from 'http://169.254.169.254/latest/user-data', attempt 0
[DEBUG] ec2autorun:61 2015-06-22 15:38:42,210: Saving user data in its original format to file '/tmp/cm/original_userData.yaml'
[DEBUG] ec2autorun:65 2015-06-22 15:38:42,211: Got user data
[INFO] ec2autorun:416 2015-06-22 15:38:42,211: Handling user data in YAML format.
Traceback (most recent call last):
  File "/usr/bin/ec2autorun.py", line 516, in <module>
    main()
  File "/usr/bin/ec2autorun.py", line 512, in main
    _parse_user_data(ud)
  File "/usr/bin/ec2autorun.py", line 504, in _parse_user_data
    _handle_yaml(ud)
  File "/usr/bin/ec2autorun.py", line 417, in _handle_yaml
    ud = _load_user_data(user_data)
  File "/usr/bin/ec2autorun.py", line 402, in _load_user_data
    ud = yaml.load(user_data)
  File "/usr/lib/python2.7/dist-packages/yaml/__init__.py", line 71, in load
    return loader.get_single_data()
  File "/usr/lib/python2.7/dist-packages/yaml/constructor.py", line 39, in get_single_data
    return self.construct_document(node)
  File "/usr/lib/python2.7/dist-packages/yaml/constructor.py", line 48, in construct_document
    for dummy in generator:
  File "/usr/lib/python2.7/dist-packages/yaml/constructor.py", line 398, in construct_yaml_map
    value = self.construct_mapping(node)
  File "/usr/lib/python2.7/dist-packages/yaml/constructor.py", line 208, in construct_mapping
    return BaseConstructor.construct_mapping(self, node, deep=deep)
  File "/usr/lib/python2.7/dist-packages/yaml/constructor.py", line 133, in construct_mapping
    value = self.construct_object(value_node, deep=deep)
  File "/usr/lib/python2.7/dist-packages/yaml/constructor.py", line 88, in construct_object
    data = constructor(self, node)
  File "/usr/lib/python2.7/dist-packages/yaml/constructor.py", line 414, in construct_undefined
    node.start_mark)
yaml.constructor.ConstructorError: could not determine a constructor for the tag '!galaxySATDEVZGK'
  in "<string>", line 4, column 13:
    freenxpass: !galaxySATDEVZGK
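The traceback above comes down to YAML syntax: an unquoted scalar that starts
with ! is read as a tag, so the parser goes looking for a constructor for it.
Below is a minimal sketch of one way to avoid that when the user data is
written by a script you control, by emitting the value as a double-quoted
scalar. The helper names yaml_line and parses_with_pyyaml are hypothetical and
are not part of CloudMan or ec2autorun; the password is a placeholder. The
sketch relies on a JSON string literal also being a valid YAML double-quoted
scalar, and the PyYAML check is skipped if that library is not installed.

```python
import json


def yaml_line(key, value):
    """Return 'key: "value"' with the value as a double-quoted YAML scalar.

    json.dumps produces a double-quoted, escaped string, which YAML also
    accepts, so a leading '!' is no longer read as a tag.
    """
    return "{}: {}".format(key, json.dumps(value))


def parses_with_pyyaml(text):
    """Return True or False if PyYAML is installed, or None if it is not."""
    try:
        import yaml
    except ImportError:
        return None
    try:
        yaml.safe_load(text)
        return True
    except yaml.YAMLError:
        return False


if __name__ == "__main__":
    password = "!examplepass"  # placeholder, not a real credential
    unquoted = "freenxpass: " + password
    quoted = yaml_line("freenxpass", password)
    print(unquoted, "->", parses_with_pyyaml(unquoted))
    print(quoted, "->", parses_with_pyyaml(quoted))
```

Whether the launch form itself should quote the value is a separate question;
this only illustrates why the quoted form parses and the unquoted one does not.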
On 22/06/15 14:45, Enis Afgan wrote:
Hmm - /mnt definitely should not be empty. There's nothing unusual in the
log you sent, so could you please send me the one from the worker? It's in
the same location (/mnt/cm/paster.log).
If it's not there, please follow the boot procedure through the logs below,
in order, and send me those:
1. /usr/bin/ec2autorun.log
2. /tmp/cm/cm_boot.py.log
3. /mnt/cm/paster.log
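The order above can be walked with a short script. This is a minimal sketch,
assuming it is run with Python 3 on the instance and has read access to the
three files; the helper names tail and first_missing_log are hypothetical and
are not part of CloudMan.

```python
import os
from collections import deque

# Boot-procedure logs in the order listed above.
BOOT_LOGS = [
    "/usr/bin/ec2autorun.log",
    "/tmp/cm/cm_boot.py.log",
    "/mnt/cm/paster.log",
]


def tail(path, n=20):
    """Return the last n lines of a text file as a list of strings."""
    with open(path, errors="replace") as handle:
        return [line.rstrip("\n") for line in deque(handle, maxlen=n)]


def first_missing_log(paths=BOOT_LOGS):
    """Return the first path that does not exist, or None if all exist."""
    for path in paths:
        if not os.path.isfile(path):
            return path
    return None


if __name__ == "__main__":
    for log in BOOT_LOGS:
        if not os.path.isfile(log):
            print("missing:", log)
            continue
        print("==>", log)
        for line in tail(log):
            print("   ", line)
    missing = first_missing_log()
    if missing:
        print("First missing log:", missing)
```

If a log is missing, the last lines of the log before it are the natural place
to look for the error.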
Thanks,
Enis
On Fri, Jun 19, 2015 at 5:02 PM, Nicholas Dickens <
Nick.Dickens(a)glasgow.ac.uk> wrote:
> Thanks – I’ve attached the log. I just tried to start a worker, let
> it go to the first reboot, and then copied this log. I logged into the
> worker and it looks OK (dmesg, etc.); the only noticeable thing was that
> /mnt is empty (just a lost+found directory), and I was expecting to see an
> NFS mount for the Galaxy export or something. But I’m still finding my way
> round the system. It may also have been the point in the reboot cycle at
> which I was there.
>
> Best wishes,
>
> Nick
>
> From: Enis Afgan <enis.afgan(a)irb.hr>
> Date: Friday, 19 June 2015 17:00
> To: Nick Dickens <nick.dickens(a)glasgow.ac.uk>
> Cc: "galaxy-dev(a)lists.galaxyproject.org" <
> galaxy-dev(a)lists.galaxyproject.org>
> Subject: Re: [spam?] [galaxy-dev] Cloudman cluster not starting workers
>
> Hi Nick,
> Sorry to hear you're having trouble. I just tried a couple of scenarios
> and they all worked as expected (e.g., with and without elastic IPs,
> different instance types).
>
> The main CloudMan log is located in /mnt/cm/paster.log, on both master
> and worker instances (if you didn't download the ssh key from cloudlaunch,
> you can ssh with ubuntu username and the same password as provided on the
> cloudlaunch form). The log is also available from the UI if you go to Admin
> page and then click 'Show CloudMan log' under 'System controls'. If you can
> share that, we can hopefully figure out what's going on.
>
> Best,
> Enis
>
> On Wed, Jun 17, 2015 at 11:47 AM, Nicholas Dickens <
> Nick.Dickens(a)glasgow.ac.uk> wrote:
>
>> Hi All,
>>
>> First-time post. As a quick intro: I have some reasonable experience
>> with EC2 & AWS from developing our own pipelines, I’m comfortable in Python
>> and *nix flavours, and I am developing a completely custom Galaxy in AWS for
>> members of the WTCMP, Glasgow. In the meantime, I have thrown up a quick
>> CloudMan Galaxy using the cloudstart and the CloudMan 2.3 AMI
>> (ami-a7dbf6ce) in us-east-1. Auto-scaling didn’t seem to work, so I’ve
>> switched it off and added nodes manually. I tried various sizes, including
>> ‘same as master’ instances, but they just don’t start. In the EC2 console I
>> can see them and see them running, but they’re constantly pending in the
>> /cloud interface, and in the log they reboot 4 times and then terminate,
>> apparently not responding: "10:16:56 - Instance i-xxxxxx not responding
>> after 4 reboots. Terminating instance".
>>
>> It’s out of the box. I edited the universe_wsgi.ini… file to disallow
>> user registration and allow me to impersonate users but didn’t change
>> anything else. The only other configuration I’ve done is associate an
>> elastic IP with the master instance so I can have a more static URL for a
>> couple of test users (if I need to destroy it and start again, etc.).
>>
>> I’m new to the system so don’t know which logs are best to check…and
>> am I missing something obvious? Is there a known bug when using elastic
>> IPs? I’ve googled but with no joy.
>>
>> Thanks for your help and best wishes,
>>
>> Nick
--
Nick Dickens
DPhil BSc ARCS
Bioinformatics Team Leader
Wellcome Trust Centre for Molecular Parasitology
B6-21 SGDB
120 University Place
Glasgow
G12 8TA
Tel: +44 141 330 8282
http://fb.me/WTCMPbix
@WTCMPbix
http://www.gla.ac.uk/researchinstitutes/iii/staff/nickdickens/
http://www.gla.ac.uk/researchinstitutes/iii/staff/jeremymottram/comparativegenomicsofleishmania/