CloudMan - Can't start nodes - galaxy-user

23 Jul 2012

Hi guys,

I created a new Galaxy instance (probably around early July) with the
web launcher (https://biocloudcentral.herokuapp.com/launch).

I've been coming back and re-using it since then.  However, for at
least the past week I haven't been able to launch new nodes.  They show
up as red on the indicators, and I've pasted the error messages below.

(Could this be related to a new version of CloudMan being released?)

Thanks,

Greg

This is the cluster status log from my last attempt:

    15:55:02 - Retrieved file 'persistent_data.yaml' from bucket 'cm-[redacted]' to 'pd.yaml'.
    15:55:02 - Master starting
    15:55:05 - Completed initial cluster configuration.
    15:55:25 - Prerequisites OK; starting service 'SGE'
    15:55:37 - Configuring SGE...
    15:55:37 - Setting up SGE did not go smoothly, running command 'cd /opt/sge; ./inst_sge -m -x -auto /opt/sge/galaxyEC2.conf' returned code '1' and following stderr: ''
    15:55:57 - Saved file 'persistent_data.yaml' to bucket 'cm-[redacted]'
    15:55:57 - Trouble comparing local (/mnt/cm/post_start_script) and remote (post_start_script) file modified times: [Errno 2] No such file or directory: '/mnt/cm/post_start_script'
    15:55:58 - Adding 2 instance(s)...
    15:57:32 - Instance 'i-56ba942e' reported alive
    15:57:33 - Successfully generated root user's public key.
    15:57:33 - Sent master public key to worker instance 'i-56ba942e'.
    15:57:47 - Adding instance 'i-56ba942e' as SGE administrative host.
    15:57:47 - Process encountered problems adding instance 'i-56ba942e' as administrative host. Process returned code 2
    15:57:47 - Adding instance 'i-56ba942e' to SGE execution host list.
    15:57:47 - Process encountered problems adding instance 'i-56ba942e' as execution host. Process returned code 2
    15:57:47 - Problems updating @allhosts aimed at adding 'i-56ba942e', running command 'export SGE_ROOT=/opt/sge;. $SGE_ROOT/default/common/settings.sh; /opt/sge/bin/lx24-amd64/qconf -Mhgrp /tmp/ah_add_15_57_47' returned code '2' and following stderr: '/bin/sh: 1: .: Can't open /opt/sge/default/common/settings.sh '
    15:57:47 - Waiting on worker instance 'i-56ba942e' to configure itself...
    15:57:47 - Instance 'i-54ba942c' reported alive
    15:57:47 - Sent master public key to worker instance 'i-54ba942c'.
    15:58:01 - Adding instance 'i-54ba942c' as SGE administrative host.
    15:58:01 - Process encountered problems adding instance 'i-54ba942c' as administrative host. Process returned code 2
    15:58:01 - Adding instance 'i-54ba942c' to SGE execution host list.
    15:58:01 - Process encountered problems adding instance 'i-54ba942c' as execution host. Process returned code 2
    15:58:01 - Problems updating @allhosts aimed at adding 'i-54ba942c', running command 'export SGE_ROOT=/opt/sge;. $SGE_ROOT/default/common/settings.sh; /opt/sge/bin/lx24-amd64/qconf -Mhgrp /tmp/ah_add_15_58_01' returned code '2' and following stderr: '/bin/sh: 1: .: Can't open /opt/sge/default/common/settings.sh '
    15:58:01 - Waiting on worker instance 'i-54ba942c' to configure itself...
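
To see which step fails first in a status log like this one, any lines
wrapped by the mail client can be rejoined and the entries scanned for
failure messages. The following is a minimal Python sketch; the function
names and the keyword list are assumptions drawn only from the messages
in this log, not from CloudMan itself.

```python
import re

# Assumption: these phrases are taken from the failure messages seen in
# this particular log; they are not an official CloudMan message list.
FAILURE_HINTS = (
    "did not go smoothly",
    "encountered problems",
    "Problems updating",
    "Trouble",
)

# A new entry starts with an HH:MM:SS timestamp followed by " - ".
ENTRY_START = re.compile(r"^\s*(\d{2}:\d{2}:\d{2}) - (.*)$")


def parse_entries(log_text):
    """Rejoin hard-wrapped lines into (timestamp, message) entries."""
    entries = []
    for line in log_text.splitlines():
        match = ENTRY_START.match(line)
        if match:
            entries.append([match.group(1), match.group(2).strip()])
        elif entries and line.strip():
            # Continuation of the previous entry (wrapped line).
            entries[-1][1] += " " + line.strip()
    return [tuple(entry) for entry in entries]


def failures(entries):
    """Return the entries whose message contains a failure hint."""
    return [
        (ts, msg)
        for ts, msg in entries
        if any(hint in msg for hint in FAILURE_HINTS)
    ]


# A few lines copied from the log above, including the wrapping.
SAMPLE = """\
    15:55:37 - Configuring SGE...
    15:55:37 - Setting up SGE did not go smoothly, running command 'cd
/opt/sge; ./inst_sge -m -x -auto /opt/sge/galaxyEC2.conf' returned
code '1' and following stderr: ''
    15:57:47 - Adding instance 'i-56ba942e' as SGE administrative host.
    15:57:47 - Process encountered problems adding instance
'i-56ba942e' as administrative host. Process returned code 2
"""

if __name__ == "__main__":
    flagged = failures(parse_entries(SAMPLE))
    # Print the earliest flagged entry.
    print(flagged[0])
```

On the sample, the earliest flagged entry is the `inst_sge` step at
15:55:37. The errors about `/opt/sge/default/common/settings.sh` all come
after it, which suggests (but does not prove) that they follow from that
first failure.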
