Hi guys,
I'm hitting an error using CloudMan with the Share-an-Instance option. It says:
Error creating volume from shared cluster's snapshot '['snap-cfa775ba']': 'filesystems'.
Also, disk status says 0 / 0, and the Applications light is yellow while the Data light is green.
I'm using the share string cm-808d863548acae7c2328c39a90f52e29/shared/2012-09-17--19-47
It's always worked in the past.
Thanks,
Greg

Here's the full log:
14:58:18 - Master starting
14:58:20 - Completed the initial cluster startup process. This is a new cluster; waiting to configure the type.
14:58:24 - Migration service prerequisites OK; starting the service
14:58:24 - SGE service prerequisites OK; starting the service
14:58:31 - Setting up SGE...
14:58:51 - HTCondor service prerequisites OK; starting the service
14:58:51 - HTCondor config file /etc/condor/condor_config not found!
14:58:59 - Hadoop service prerequisites OK; starting the service
14:59:48 - Done adding Hadoop service; service running.
15:01:45 - Error creating volume from shared cluster's snapshot '['snap-cfa775ba']': 'filesystems'
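A note on reading the error message above: the trailing `'filesystems'` has the shape Python produces when a `KeyError` is converted to a string, which would happen if the code looked up a `filesystems` key that the shared cluster's configuration does not contain. This is only a guess; nothing in the thread confirms it. The sketch below is not CloudMan code: the dict contents and the `describe_failure` helper are hypothetical and exist only to show how such a message can arise.

```python
# Hypothetical stand-in for an older shared cluster's configuration after it
# has been parsed into a dict. The key names here are illustrative only.
old_style_config = {
    "data_filesystems": {"galaxyData": [{"snap_id": "snap-cfa775ba"}]},
}


def describe_failure(config, key="filesystems"):
    """Mimic a handler that logs str(exception) after a failed key lookup."""
    try:
        config[key]
    except Exception as exc:
        # str(KeyError('filesystems')) is the quoted key name, nothing more,
        # which is why the logged message ends in a bare 'filesystems'.
        return "Error creating volume from shared cluster's snapshot: %s" % exc
    return "ok"


message = describe_failure(old_style_config)
print(message)
```

If this reading is right, comparing the top-level keys of the shared persistent_data.yaml with those of a freshly created cluster would be a reasonable next check.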
Hi guys,
Any thoughts on this? I'm kind of stuck.
(Even some pointers on where to look for more clues would be extremely helpful.)
Thanks,
Greg
On Fri, Jul 5, 2013 at 11:10 AM, greg <margeemail@gmail.com> wrote:
Hi guys,
I just thought I'd check in again. None of the researchers who want to run our genotyping program can do so until I figure this out. Any help or advice at all would be greatly appreciated.
Thanks,
Greg
On Mon, Jul 8, 2013 at 8:43 AM, greg <margeemail@gmail.com> wrote:
Hi Greg,
Sorry for replying so late.
So, I'm guessing this was an old cluster that was shared and is now being derived on a new cluster? We explored a large number of paths while getting ready for the upgrade, and I was of the opinion we had covered that path, but it seems things are not working as expected.
Can you look at the more detailed log on the Admin page (under CloudMan log) and see if there are more details about what's going on and why it's failing?
On Thu, Jul 11, 2013 at 3:14 PM, greg <margeemail@gmail.com> wrote:
Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
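On the suggestion to check the more detailed CloudMan log: that log is long and mostly DEBUG entries, so filtering by level can help. Below is a minimal sketch, assuming entries follow the `date time,millis LEVEL module:line message` shape seen in paster.log. The function name `interesting_entries` and the ERROR line in the sample are made up for illustration; the real failing entry's level and source are not shown in this thread.

```python
import re

# Matches entries shaped like
# "2013-07-15 14:08:32,406 DEBUG app:68 Initializing app"
ENTRY = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) (?P<source>\w+:\d+) (?P<msg>.*)"
)


def interesting_entries(lines, levels=("WARNING", "ERROR", "CRITICAL")):
    """Return (timestamp, source, message) for entries at the given levels."""
    hits = []
    for line in lines:
        match = ENTRY.match(line.strip())
        if match and match.group("level") in levels:
            hits.append(
                (match.group("ts"), match.group("source"), match.group("msg"))
            )
    return hits


sample = [
    "2013-07-15 14:08:32,406 DEBUG app:68 Initializing app",
    "2013-07-15 14:08:32,453 INFO app:103 Master starting",
    # Made-up entry: level and source are placeholders, not from a real log.
    "2013-07-15 14:11:53,000 ERROR master:0 Error creating volume ...",
]

hits = interesting_entries(sample)
for ts, source, msg in hits:
    print(ts, source, msg)
```

To run it against a real file, pass the open file object instead of `sample`; any traceback lines that follow an entry will not match the pattern and would need to be read in context.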
Thanks for getting back to me, Enis.
I went ahead and started a new cluster instance. I'll try to leave it running today in case there's anything we want to check.
Here's my whole process:

Start Screen Setup:
----------------------------------
http://snag.gy/DMIeC.jpg

Entering my share string:
----------------------------------------
http://snag.gy/wKFLy.jpg

Main Page Text and log:
------------------------------------
Cluster name: MSGGREG
Disk status: 0 / 0 (0%)
Worker status: Idle: 0 Available: 0 Requested: 0
Service status: Applications Data
Cluster status log
14:08:32 - Master starting
14:08:34 - Completed the initial cluster startup process. This is a new cluster; waiting to configure the type.
14:08:50 - Migration service prerequisites OK; starting the service
14:08:50 - SGE service prerequisites OK; starting the service
14:08:58 - Setting up SGE...
14:09:13 - HTCondor service prerequisites OK; starting the service
14:09:21 - Hadoop service prerequisites OK; starting the service
14:09:38 - Done adding Hadoop service; service running.
14:11:53 - Error creating volume from shared cluster's snapshot '['snap-cfa775ba']': 'filesystems'

Admin Page CloudMan Log: (unfortunately nothing is jumping out at me?)
-----------------------------------------------
CloudMan from Galaxy Admin | Report bugs | Wiki | Screencast The entire log file (paster.log) is shown. Show latest | Back to admin view
Python version: (2, 7)
Image configuration suports: {'apps': ['cloudman', 'galaxy']}
2013-07-15 14:08:32,406 DEBUG app:68 Initializing app
2013-07-15 14:08:32,407 DEBUG ec2:121 Gathering instance zone, attempt 0
2013-07-15 14:08:32,410 DEBUG ec2:127 Instance zone is 'us-east-1d'
2013-07-15 14:08:32,410 DEBUG ec2:45 Gathering instance ami, attempt 0
2013-07-15 14:08:32,412 DEBUG app:71 Running on 'ec2' type of cloud in zone 'us-east-1d' using image 'ami-118bfc78'.
2013-07-15 14:08:32,412 DEBUG app:89 Getting pd.yaml 2013-07-15 14:08:32,412 DEBUG ec2:338 No S3 Connection, creating a new one. 2013-07-15 14:08:32,413 DEBUG ec2:342 Got boto S3 connection. 2013-07-15 14:08:32,452 DEBUG misc:212 Checking if bucket 'cm-0479bd75a331acc874033e98b2e1e03e' exists... it does not. 2013-07-15 14:08:32,452 DEBUG misc:583 Bucket 'cm-0479bd75a331acc874033e98b2e1e03e' does not exist, did not get remote file 'persistent_data.yaml' 2013-07-15 14:08:32,452 DEBUG app:96 Setting deployment_version to 2 2013-07-15 14:08:32,453 INFO app:103 Master starting 2013-07-15 14:08:32,453 DEBUG master:55 Initializing console manager - cluster start time: 2013-07-15 14:08:32.453182 2013-07-15 14:08:32,453 DEBUG comm:42 AMQP Connection Failure: [Errno 111] Connection refused 2013-07-15 14:08:32,453 DEBUG master:791 Trying to discover any worker instances associated with this cluster... 2013-07-15 14:08:32,454 DEBUG ec2:317 Establishing boto EC2 connection 2013-07-15 14:08:32,535 DEBUG ec2:305 Got region as 'RegionInfo:us-east-1' 2013-07-15 14:08:32,777 DEBUG ec2:326 Got boto EC2 connection for region 'us-east-1' 2013-07-15 14:08:33,022 DEBUG misc:574 Retrieved file 'snaps.yaml' from bucket 'cloudman' on host 's3.amazonaws.com' to 'cm_snaps.yaml'. 
2013-07-15 14:08:33,035 DEBUG ec2:286 Got region name as 'us-east-1' 2013-07-15 14:08:33,035 DEBUG master:226 Loaded default snapshot data: [{'snap_id': 'snap-adad90fc', 'name': 'galaxy', 'roles': 'galaxyTools,galaxyData'}, {'snap_id': 'snap-5b030634', 'name': 'galaxyIndices', 'roles': 'galaxyIndices'}] 2013-07-15 14:08:33,035 DEBUG ec2:81 Gathering instance id, attempt 0 2013-07-15 14:08:33,037 DEBUG ec2:87 Instance ID is 'i-5346d733' 2013-07-15 14:08:33,125 DEBUG ec2:360 Adding tag 'clusterName:MSGGREG' to resource 'i-5346d733' 2013-07-15 14:08:33,307 DEBUG ec2:360 Adding tag 'role:master' to resource 'i-5346d733' 2013-07-15 14:08:33,554 DEBUG ec2:360 Adding tag 'Name:master: MSGGREG' to resource 'i-5346d733' 2013-07-15 14:08:33,744 DEBUG master:246 ud at manager start: {'region_name': u'us-east-1', 'region_endpoint': u'ec2.amazonaws.com', 'ec2_port': None, 'deployment_version': 2, 'cloud_name': u'Amazon', 'boot_script_name': 'cm_boot.py', 'is_secure': True, 'password': '4444', 'access_key': 'redacted!', 's3_port': None, 'cloud_type': u'ec2', 'cloudman_home': '/mnt/cm', 'cluster_name': u'MSGGREG', 'freenxpass': u'4444', 'bucket_default': 'cloudman', 'role': 'master', 'bucket_cluster': 'cm-0479bd75a331acc874033e98b2e1e03e', 'boot_script_path': '/tmp/cm', 'secret_key': u'redacted!', 's3_conn_path': u'/', 's3_host': u's3.amazonaws.com', 'ec2_conn_path': u'/'} 2013-07-15 14:08:33,744 DEBUG master:1858 Generating root user's public key... 
2013-07-15 14:08:33,763 DEBUG base:57 Enabling 'root' controller, class: CM 2013-07-15 14:08:33,766 DEBUG buildapp:88 Enabling 'httpexceptions' middleware 2013-07-15 14:08:33,770 DEBUG buildapp:94 Enabling 'recursive' middleware 2013-07-15 14:08:33,776 DEBUG buildapp:114 Enabling 'print debug' middleware 2013-07-15 14:08:33,788 DEBUG buildapp:128 Enabling 'error' middleware 2013-07-15 14:08:33,788 DEBUG buildapp:138 Enabling 'config' middleware 2013-07-15 14:08:33,789 DEBUG buildapp:142 Enabling 'x-forwarded-host' middleware Starting server in PID 1791. serving on 0.0.0.0:42284 view at http://127.0.0.1:42284 2013-07-15 14:08:34,198 DEBUG master:1861 Successfully generated root user's public key. 2013-07-15 14:08:34,199 DEBUG master:1869 Successfully retrieved root user's public key from file. 2013-07-15 14:08:34,199 DEBUG master:99 Updating dependencies for service Migration 2013-07-15 14:08:34,199 DEBUG master:99 Updating dependencies for service SGE 2013-07-15 14:08:34,200 DEBUG filesystem:32 Instantiating Filesystem object transient_nfs with service roles: TransientNFS 2013-07-15 14:08:34,200 DEBUG filesystem:594 Configuring instance transient storage at /mnt/transient_nfs with NFS. 2013-07-15 14:08:34,200 DEBUG master:99 Updating dependencies for service transient_nfs 2013-07-15 14:08:34,200 DEBUG pss:27 Configured PSS as master 2013-07-15 14:08:34,200 DEBUG master:99 Updating dependencies for service PSS 2013-07-15 14:08:34,200 DEBUG htcondor:25 Condor is preparing 2013-07-15 14:08:34,200 DEBUG master:99 Updating dependencies for service HTCondor 2013-07-15 14:08:34,201 DEBUG master:99 Updating dependencies for service Hadoop 2013-07-15 14:08:34,201 DEBUG master:321 Checking for and adding any previously defined cluster services 2013-07-15 14:08:34,201 DEBUG master:327 Processing filesystems in an existing cluster config 2013-07-15 14:08:34,201 DEBUG master:812 Trying to discover any volumes attached to this instance... 
2013-07-15 14:08:34,377 DEBUG master:829 Attached volumes: [Volume:vol-cc9b1791] 2013-07-15 14:08:34,378 DEBUG ec2:360 Adding tag 'clusterName:MSGGREG' to resource 'vol-cc9b1791' 2013-07-15 14:08:34,545 DEBUG master:385 Processing application services in an existing cluster config 2013-07-15 14:08:34,545 INFO master:298 Completed the initial cluster startup process. This is a new cluster; waiting to configure the type. 2013-07-15 14:08:34,545 DEBUG master:2342 Monitor started; manager started 2013-07-15 14:08:38,545 DEBUG master:2352 Trying to setup AMQP connection; conn = '' 2013-07-15 14:08:38,546 DEBUG comm:42 AMQP Connection Failure: [Errno 111] Connection refused 2013-07-15 14:08:42,546 DEBUG master:2352 Trying to setup AMQP connection; conn = '' 2013-07-15 14:08:42,547 DEBUG comm:42 AMQP Connection Failure: [Errno 111] Connection refused 2013-07-15 14:08:46,547 DEBUG master:2352 Trying to setup AMQP connection; conn = '' 2013-07-15 14:08:46,550 DEBUG connection:661 Start from server, version: 8.0, properties: {u'information': u'Licensed under the MPL. See http://www.rabbitmq.com/', u'product': u'RabbitMQ', u'copyright': u'Copyright (C) 2007-2011 VMware, Inc.', u'capabilities': {}, u'platform': u'Erlang/OTP', u'version': u'2.7.1'}, mechanisms: [u'PLAIN', u'AMQPLAIN'], locales: [u'en_US'] 2013-07-15 14:08:46,551 DEBUG connection:507 Open OK! 
known_hosts [] 2013-07-15 14:08:46,551 DEBUG channel:70 using channel_id: 1 2013-07-15 14:08:46,552 DEBUG channel:484 Channel open 2013-07-15 14:08:46,555 DEBUG comm:40 Successfully established AMQP connection 2013-07-15 14:08:50,556 DEBUG master:2377 S&S: Migration..Unstarted; SGE..Unstarted; FS-transient_nfs..Unstarted; PSS..Unstarted; HTCondor..Unstarted; Hadoop..Unstarted; 2013-07-15 14:08:50,556 DEBUG master:2288 Monitor adding service 'Migration' 2013-07-15 14:08:50,556 INFO __init__:284 Migration service prerequisites OK; starting the service 2013-07-15 14:08:50,556 DEBUG migration_service:284 Starting migration service... 2013-07-15 14:08:50,556 DEBUG migration_service:332 Old deployment version: 2 2013-07-15 14:08:50,556 DEBUG migration_service:324 Current deployment version: 2 2013-07-15 14:08:50,556 DEBUG migration_service:300 No migration required. Service complete. 2013-07-15 14:08:50,556 DEBUG master:2288 Monitor adding service 'SGE' 2013-07-15 14:08:50,557 INFO __init__:284 SGE service prerequisites OK; starting the service 2013-07-15 14:08:50,557 DEBUG sge:100 Unpacking SGE from '/opt/galaxy/pkg/ge6.2u5' 2013-07-15 14:08:50,560 DEBUG sge:123 Unpacking SGE to '/opt/sge'. 2013-07-15 14:08:58,715 INFO sge:159 Setting up SGE... 2013-07-15 14:08:58,716 DEBUG ec2:188 Gathering instance private IP, attempt 0 2013-07-15 14:08:58,719 DEBUG sge:166 Created SGE install template as file '/opt/sge/galaxyEC2.conf' 2013-07-15 14:08:58,719 DEBUG sge:168 Setting up SGE. 2013-07-15 14:08:58,719 DEBUG misc:750 Replacing string ' libc_version=`echo $libc_string | tr ' ,' '\n' | grep "2\." | cut -f 2 -d "."`' with ' libc_version=`echo $libc_string | tr ' ,' '\n' | grep "2\." | cut -f 2 -d "." 
| sort -u`' in file /opt/sge/util/arch 2013-07-15 14:08:58,721 DEBUG misc:750 Replacing string ' 2.[46].*)' with ' [23].[24567890].*)' in file /opt/sge/util/arch 2013-07-15 14:08:58,721 DEBUG misc:750 Replacing string ' 2.6.*)' with ' [23].[24567890].*)' in file /opt/sge/util/arch 2013-07-15 14:08:58,744 DEBUG misc:724 Modified /opt/sge/util/arch 2013-07-15 14:08:58,763 DEBUG misc:724 Successfully chmod /opt/sge/util/arch 2013-07-15 14:08:58,783 DEBUG misc:724 'sed -i.bak '/^127.0.1./s/^/# (Commented by CloudMan) /' /etc/hosts' command OK 2013-07-15 14:09:11,740 DEBUG misc:724 Successfully set up SGE 2013-07-15 14:09:11,740 DEBUG sge:171 Successfully setup SGE; configuring SGE 2013-07-15 14:09:11,740 DEBUG sge:172 Adding parallel environments 2013-07-15 14:09:11,769 DEBUG misc:724 'cd /opt/sge; ./bin/lx24-amd64/qconf -Ap /tmp/SMP_PE' command OK 2013-07-15 14:09:11,797 DEBUG misc:724 'cd /opt/sge; ./bin/lx24-amd64/qconf -Ap /tmp/MPI_PE' command OK 2013-07-15 14:09:11,797 DEBUG sge:180 Creating queue 'all.q' 2013-07-15 14:09:11,798 DEBUG sge:201 Created SGE all.q template as file '/opt/sge/all.q.conf' 2013-07-15 14:09:11,827 DEBUG misc:724 Successfully modified all.q 2013-07-15 14:09:11,828 DEBUG sge:206 Configuring users' SGE profiles 2013-07-15 14:09:11,828 DEBUG master:2288 Monitor adding service 'FS-transient_nfs' 2013-07-15 14:09:11,829 DEBUG filesystem:106 Trying to add file system service FS-transient_nfs 2013-07-15 14:09:11,829 DEBUG transient_storage:55 Adding transient file system at /mnt/transient_nfs 2013-07-15 14:09:11,881 DEBUG filesystem:403 Added '/mnt/transient_nfs *(rw,sync,no_root_squash,no_subtree_check)' line to NFS file /etc/exports 2013-07-15 14:09:11,882 DEBUG filesystem:468 FS-transient_nfs in 'Starting' state for 0 seconds 2013-07-15 14:09:13,065 DEBUG misc:724 As part of transient_nfs filesystem update, successfully restarted NFS server 2013-07-15 14:09:13,066 DEBUG filesystem:468 FS-transient_nfs in 'Starting' state for 1 seconds 
2013-07-15 14:09:13,067 DEBUG filesystem:135 Done adding devices to FS-transient_nfs (devices: [], [], [Transient storage @ /mnt/transient_nfs], -) 2013-07-15 14:09:13,067 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:09:13,067 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:09:13,067 DEBUG master:2288 Monitor adding service 'HTCondor' 2013-07-15 14:09:13,067 INFO __init__:284 HTCondor service prerequisites OK; starting the service 2013-07-15 14:09:13,067 DEBUG htcondor:38 Starting HTCondor service 2013-07-15 14:09:13,068 DEBUG htcondor:67 HTCondor params: {'flock_host': ''} 2013-07-15 14:09:14,208 DEBUG misc:724 '/etc/init.d/condor restart' command OK 2013-07-15 14:09:14,209 DEBUG master:2288 Monitor adding service 'Hadoop' 2013-07-15 14:09:14,209 DEBUG __init__:290 Hadoop service prerequisites are not yet satisfied, waiting for: []. Setting Hadoop service state to 'Unstarted' 2013-07-15 14:09:14,209 DEBUG master:2199 Storing cluster configuration to cluster's bucket 2013-07-15 14:09:14,798 DEBUG misc:212 Checking if bucket 'cm-0479bd75a331acc874033e98b2e1e03e' exists... it does not. 2013-07-15 14:09:15,090 DEBUG misc:224 Created bucket 'cm-0479bd75a331acc874033e98b2e1e03e'. 
2013-07-15 14:09:15,237 DEBUG misc:595 Saved file 'persistent_data.yaml' of size 537B to bucket 'cm-0479bd75a331acc874033e98b2e1e03e' 2013-07-15 14:09:15,237 DEBUG master:2244 Saving current instance boot script (/tmp/cm/cm_boot.py) to cluster bucket 'cm-0479bd75a331acc874033e98b2e1e03e' as 'cm_boot.py' 2013-07-15 14:09:15,424 DEBUG misc:595 Saved file 'cm_boot.py' of size 19845B to bucket 'cm-0479bd75a331acc874033e98b2e1e03e' 2013-07-15 14:09:15,424 DEBUG master:2254 Saving CloudMan source (/mnt/cm/cm.tar.gz) to cluster bucket 'cm-0479bd75a331acc874033e98b2e1e03e' as 'cm.tar.gz' 2013-07-15 14:09:15,680 DEBUG misc:595 Saved file 'cm.tar.gz' of size 793622B to bucket 'cm-0479bd75a331acc874033e98b2e1e03e' 2013-07-15 14:09:15,680 DEBUG misc:685 Setting metadata 'revision' for file 'cm.tar.gz' in bucket 'cm-0479bd75a331acc874033e98b2e1e03e' 2013-07-15 14:09:15,921 DEBUG master:2280 Saving '/mnt/cm/MSGGREG.clusterName' file to cluster bucket 'cm-0479bd75a331acc874033e98b2e1e03e' as 'MSGGREG.clusterName' 2013-07-15 14:09:15,985 DEBUG misc:595 Saved file 'MSGGREG.clusterName' of size 0B to bucket 'cm-0479bd75a331acc874033e98b2e1e03e' 2013-07-15 14:09:16,006 DEBUG master:1755 Checking for new version of CloudMan 2013-07-15 14:09:16,006 DEBUG misc:667 Getting metadata 'revision' for file 'cm.tar.gz' from bucket 'cm-0479bd75a331acc874033e98b2e1e03e' 2013-07-15 14:09:16,066 DEBUG misc:667 Getting metadata 'revision' for file 'cm.tar.gz' from bucket 'cloudman' 2013-07-15 14:09:16,184 DEBUG master:1762 Revision number for user's CloudMan: '732'; revision number for default CloudMan: '732' 2013-07-15 14:09:16,184 DEBUG ec2:61 Gathering instance type, attempt 0 2013-07-15 14:09:16,457 DEBUG ec2:258 Gathering instance public hostname, attempt 0 2013-07-15 14:09:21,091 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.76 lx24-amd64 '] 2013-07-15 14:09:21,096 DEBUG filesystem:468 FS-transient_nfs in 'Starting' state for 9 seconds 2013-07-15 14:09:21,096 DEBUG 
master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..Starting; PSS..Unstarted; HTCondor..OK; Hadoop..Unstarted; 2013-07-15 14:09:21,097 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:09:21,097 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:09:21,097 DEBUG master:2288 Monitor adding service 'Hadoop' 2013-07-15 14:09:21,097 INFO __init__:284 Hadoop service prerequisites OK; starting the service 2013-07-15 14:09:21,097 DEBUG hadoop:38 Configuring Hadoop 2013-07-15 14:09:21,098 DEBUG master:2199 Storing cluster configuration to cluster's bucket 2013-07-15 14:09:21,101 DEBUG hadoop:81 Unpacking Hadoop 2013-07-15 14:09:21,101 DEBUG hadoop:84 Hadoop path is /opt/hadoop/hadoop\.((([|0-9])*\.)*[0-9]*__([0-9]*\.)*[0-9]+){0,1}\.{0,1}tar\.gz 2013-07-15 14:09:21,101 DEBUG hadoop:87 Hadoop SGE integration path is /opt/hadoop/sge_integration\.(([0-9]*\.)*[0-9]+){0,1}\.{0,1}tar\.gz 2013-07-15 14:09:21,106 DEBUG hadoop:180 Extracted Hadoop version: 1.0.4 2013-07-15 14:09:21,106 DEBUG hadoop:181 Extracted Hadoop build version: 1.0 2013-07-15 14:09:21,174 DEBUG hadoop:180 Extracted Hadoop version: 1.0.4 2013-07-15 14:09:21,175 DEBUG hadoop:181 Extracted Hadoop build version: 1.0 2013-07-15 14:09:21,821 DEBUG misc:595 Saved file 'persistent_data.yaml' of size 537B to bucket 'cm-0479bd75a331acc874033e98b2e1e03e' 2013-07-15 14:09:21,822 DEBUG master:2244 Saving current instance boot script (/tmp/cm/cm_boot.py) to cluster bucket 'cm-0479bd75a331acc874033e98b2e1e03e' as 'cm_boot.py' 2013-07-15 14:09:21,902 DEBUG misc:595 Saved file 'cm_boot.py' of size 19845B to bucket 'cm-0479bd75a331acc874033e98b2e1e03e' 2013-07-15 14:09:21,902 DEBUG master:2254 Saving CloudMan source (/mnt/cm/cm.tar.gz) to cluster bucket 'cm-0479bd75a331acc874033e98b2e1e03e' as 'cm.tar.gz' 2013-07-15 14:09:22,158 DEBUG misc:595 Saved file 'cm.tar.gz' of size 793622B to bucket 'cm-0479bd75a331acc874033e98b2e1e03e' 2013-07-15 
14:09:22,158 DEBUG misc:685 Setting metadata 'revision' for file 'cm.tar.gz' in bucket 'cm-0479bd75a331acc874033e98b2e1e03e' 2013-07-15 14:09:22,621 DEBUG master:2280 Saving '/mnt/cm/MSGGREG.clusterName' file to cluster bucket 'cm-0479bd75a331acc874033e98b2e1e03e' as 'MSGGREG.clusterName' 2013-07-15 14:09:22,683 DEBUG misc:595 Saved file 'MSGGREG.clusterName' of size 0B to bucket 'cm-0479bd75a331acc874033e98b2e1e03e' 2013-07-15 14:09:26,686 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:09:26,686 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:09:30,689 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:09:30,690 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:09:35,352 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.76 lx24-amd64 '] 2013-07-15 14:09:35,356 DEBUG filesystem:468 FS-transient_nfs in 'Starting' state for 23 seconds 2013-07-15 14:09:35,357 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..Starting; PSS..Unstarted; HTCondor..OK; Hadoop..Starting; 2013-07-15 14:09:35,357 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:09:35,358 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:09:38,802 DEBUG hadoop:141 Hadoop extracted to /opt/hadoop 2013-07-15 14:09:38,821 DEBUG hadoop:146 Hadoop SGE integration extracted to /opt/hadoop 2013-07-15 14:09:38,848 DEBUG misc:724 'chown -R -c ubuntu /opt/hadoop/sge_integration.1.0.tar.gz' command OK 2013-07-15 14:09:38,872 DEBUG misc:724 'chown -R -c ubuntu /opt/hadoop/hadoop.1.0.4__1.0.tar.gz' command OK 2013-07-15 14:09:38,873 DEBUG hadoop:190 Setting up Hadoop environment 2013-07-15 14:09:38,874 DEBUG hadoop:195 Hadoop id_rsa set from::/opt/hadoop/id_rsa 2013-07-15 14:09:38,899 DEBUG misc:724 'chown -c ubuntu /home/ubuntu/.ssh/id_rsa' command OK 
2013-07-15 14:09:38,899 DEBUG hadoop:199 Hadoop authFile saved to /home/ubuntu/.ssh/id_rsa 2013-07-15 14:09:38,923 DEBUG misc:724 'chown -c ubuntu /home/ubuntu/.ssh/authorized_keys' command OK 2013-07-15 14:09:38,924 INFO hadoop:51 Done adding Hadoop service; service running. 2013-07-15 14:09:39,363 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:09:39,363 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:09:43,365 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:09:43,365 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:09:47,907 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.76 lx24-amd64 '] 2013-07-15 14:09:47,934 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:09:47,934 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:09:47,935 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:09:51,936 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:09:51,937 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:09:55,938 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:09:55,938 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:10:00,482 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.87 lx24-amd64 '] 2013-07-15 14:10:00,508 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:10:00,509 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:00,509 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:10:04,511 DEBUG master:2288 Monitor 
adding service 'PSS' 2013-07-15 14:10:04,512 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:10:08,513 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:08,514 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:10:16,307 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.87 lx24-amd64 '] 2013-07-15 14:10:16,336 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:10:16,336 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:16,337 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:10:20,339 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:20,339 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:10:24,890 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.87 lx24-amd64 '] 2013-07-15 14:10:24,921 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:10:24,922 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:24,922 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:10:28,924 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:28,924 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:10:32,926 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:32,927 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:10:37,472 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.88 lx24-amd64 '] 2013-07-15 14:10:37,500 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; 
FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:10:37,501 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:37,501 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:10:41,503 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:41,504 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:10:45,505 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:45,505 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:10:50,046 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.88 lx24-amd64 '] 2013-07-15 14:10:50,075 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:10:50,076 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:50,076 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:10:54,078 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:54,079 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:10:58,080 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:58,081 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:11:02,633 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.88 lx24-amd64 '] 2013-07-15 14:11:02,662 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:11:02,662 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:02,662 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:11:06,664 DEBUG master:2288 Monitor adding service 'PSS' 
2013-07-15 14:11:06,665 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:11:10,667 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:10,668 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:11:15,212 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.77 lx24-amd64 '] 2013-07-15 14:11:15,240 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:11:15,241 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:15,241 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:11:19,243 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:19,243 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:11:23,245 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:23,246 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:11:27,785 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.77 lx24-amd64 '] 2013-07-15 14:11:27,814 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:11:27,814 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:27,815 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:11:31,816 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:31,817 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:11:35,818 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:35,819 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 
14:11:40,362 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.77 lx24-amd64 '] 2013-07-15 14:11:40,390 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:11:40,391 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:40,391 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:11:44,393 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:44,394 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:11:48,395 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:48,396 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:11:50,209 DEBUG master:1252 Initializing a shared cluster from 'cm-808d863548acae7c2328c39a90f52e29/shared/2012-09-17--19-47' 2013-07-15 14:11:50,392 DEBUG misc:574 Retrieved file 'shared/2012-09-17--19-47/shared_instance_file_list.txt' from bucket 'cm-808d863548acae7c2328c39a90f52e29' on host 's3.amazonaws.com' to 'shared_instance_file_list.txt'. 
2013-07-15 14:11:50,394 DEBUG misc:615 Establishing handle with key object 'shared/2012-09-17--19-47/persistent_data.yaml' 2013-07-15 14:11:50,394 DEBUG misc:619 Copying file 'cm-808d863548acae7c2328c39a90f52e29/shared/2012-09-17--19-47/persistent_data.yaml' to file 'cm-0479bd75a331acc874033e98b2e1e03e/persistent_data.yaml' 2013-07-15 14:11:50,550 DEBUG misc:615 Establishing handle with key object 'shared/2012-09-17--19-47/cm.tar.gz' 2013-07-15 14:11:50,550 DEBUG misc:619 Copying file 'cm-808d863548acae7c2328c39a90f52e29/shared/2012-09-17--19-47/cm.tar.gz' to file 'cm-0479bd75a331acc874033e98b2e1e03e/cm.tar.gz' 2013-07-15 14:11:50,771 DEBUG misc:615 Establishing handle with key object 'shared/2012-09-17--19-47/cm.tar.gz_2012-09-13' 2013-07-15 14:11:50,771 DEBUG misc:619 Copying file 'cm-808d863548acae7c2328c39a90f52e29/shared/2012-09-17--19-47/cm.tar.gz_2012-09-13' to file 'cm-0479bd75a331acc874033e98b2e1e03e/cm.tar.gz_2012-09-13' 2013-07-15 14:11:50,930 DEBUG misc:615 Establishing handle with key object 'shared/2012-09-17--19-47/cm_boot.py' 2013-07-15 14:11:50,930 DEBUG misc:619 Copying file 'cm-808d863548acae7c2328c39a90f52e29/shared/2012-09-17--19-47/cm_boot.py' to file 'cm-0479bd75a331acc874033e98b2e1e03e/cm_boot.py' 2013-07-15 14:11:51,085 DEBUG misc:615 Establishing handle with key object 'shared/2012-09-17--19-47/post_start_script' 2013-07-15 14:11:51,085 DEBUG misc:619 Copying file 'cm-808d863548acae7c2328c39a90f52e29/shared/2012-09-17--19-47/post_start_script' to file 'cm-0479bd75a331acc874033e98b2e1e03e/post_start_script' 2013-07-15 14:11:51,326 DEBUG misc:574 Retrieved file 'persistent_data.yaml' from bucket 'cm-0479bd75a331acc874033e98b2e1e03e' on host 's3.amazonaws.com' to 'shared_p_d.yaml'. 
2013-07-15 14:11:52,944 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.77 lx24-amd64 '] 2013-07-15 14:11:52,973 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:11:52,973 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:52,973 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:11:53,730 ERROR master:1337 Error creating volume from shared cluster's snapshot '['snap-cfa775ba']': 'filesystems' 2013-07-15 14:11:56,975 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:56,976 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:00,978 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:00,978 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:05,521 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.72 lx24-amd64 '] 2013-07-15 14:12:05,550 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:12:05,551 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:05,551 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:09,553 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:09,553 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:13,555 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:13,556 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:18,102 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.72 lx24-amd64 '] 2013-07-15 14:12:18,131 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; 
FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:12:18,131 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:18,131 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:22,133 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:22,134 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:25,087 DEBUG ec2:166 Gathering instance public keys (i.e., key pairs), attempt 0 2013-07-15 14:12:25,091 DEBUG ec2:173 Got key pair: 'cloudman_key_pair' 2013-07-15 14:12:26,136 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:26,137 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:30,689 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.72 lx24-amd64 '] 2013-07-15 14:12:30,717 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:12:30,717 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:30,718 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:34,719 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:34,720 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:38,721 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:38,721 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:43,263 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.65 lx24-amd64 '] 2013-07-15 14:12:43,289 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:12:43,290 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:43,290 
DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:47,292 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:47,292 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:51,293 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:51,293 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:55,836 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.65 lx24-amd64 '] 2013-07-15 14:12:55,865 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:12:55,865 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:55,866 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:59,867 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:59,868 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:13:03,868 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:13:03,869 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:13:08,419 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.65 lx24-amd64 '] 2013-07-15 14:13:08,448 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:13:08,449 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:13:08,449 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) On Fri, Jul 12, 2013 at 12:08 PM, Enis Afgan <eafgan@emory.edu> wrote:
Hi Greg, Sorry for replying really late.
So, I'm guessing this was an old cluster that was shared and is now being derived on a new cluster? There was a large number of paths we explored while getting ready for the upgrade and I was of the opinion we covered that path, but it seems things are not working as expected. Can you look at the more detailed log on the Admin page (under CloudMan log) and see if there are more details about what's going on and why it's failing?
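For anyone combing through the CloudMan log for clues: most of paster.log is DEBUG output from the monitor loop, so filtering by level can help. Below is a minimal Python sketch, not part of CloudMan. The function names are made up for illustration, and it assumes each entry starts with a timestamp like 2013-07-15 14:11:53,730 followed by a level and a module:line tag, as in the log pasted in this thread. It also copes with entries that got run together on one line.

```python
import re

# Assumed entry prefix: "YYYY-MM-DD HH:MM:SS,mmm " (as seen in the pasted log).
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} ")
ENTRY = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) (?P<source>\S+) (?P<msg>.*)",
    re.DOTALL,
)


def split_entries(text):
    """Split log text into entries, even if several share one line."""
    starts = [m.start() for m in TIMESTAMP.finditer(text)]
    bounds = starts + [len(text)]
    return [text[a:b].strip() for a, b in zip(bounds, bounds[1:])]


def interesting(text, levels=("WARNING", "ERROR", "CRITICAL")):
    """Yield (timestamp, level, source, message) for entries at the given levels."""
    for entry in split_entries(text):
        m = ENTRY.match(entry)
        if m and m.group("level") in levels:
            yield (m.group("ts"), m.group("level"),
                   m.group("source"), m.group("msg").strip())


if __name__ == "__main__":
    # Three entries copied from the log in this thread, joined on one line.
    sample = (
        "2013-07-15 14:11:52,973 DEBUG pss:95 Not adding PSS svc; it completed "
        "(False) or the cluster was not yet initialized (None) "
        "2013-07-15 14:11:53,730 ERROR master:1337 Error creating volume from "
        "shared cluster's snapshot '['snap-cfa775ba']': 'filesystems' "
        "2013-07-15 14:11:56,975 DEBUG master:2288 Monitor adding service 'PSS'"
    )
    for row in interesting(sample):
        print(" | ".join(row))
```

Passing levels=("INFO", "WARNING", "ERROR", "CRITICAL") instead would roughly reproduce the short cluster status log shown on the main page, since those entries appear at INFO level in the detailed log.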
On Thu, Jul 11, 2013 at 3:14 PM, greg <margeemail@gmail.com> wrote:
Hi guys,
I just thought I'd check in again. None of the researchers who want to run our genotyping program can do so until I figure this out. Any help or advice at all would be greatly appreciated.
Thanks,
Greg
On Mon, Jul 8, 2013 at 8:43 AM, greg <margeemail@gmail.com> wrote:
Hi guys,
Any thoughts on this? I'm kind of stuck.
(Even some pointers on where to look for more clues would be extremely helpful.)
Thanks,
Greg
On Fri, Jul 5, 2013 at 11:10 AM, greg <margeemail@gmail.com> wrote:
Hi guys,
I'm hitting an error using CloudMan using the Share-an-Instance option. It says:
Error creating volume from shared cluster's snapshot '['snap-cfa775ba']': 'filesystems'.
Also disk stats says 0 /0 and the Applications light is yellow while the data light is green.
I'm using the share string cm-808d863548acae7c2328c39a90f52e29/shared/2012-09-17--19-47
It's always worked in the past.
Thanks,
Greg
Here's the full log:
14:58:18 - Master starting
14:58:20 - Completed the initial cluster startup process. This is a new cluster; waiting to configure the type.
14:58:24 - Migration service prerequisites OK; starting the service
14:58:24 - SGE service prerequisites OK; starting the service
14:58:31 - Setting up SGE...
14:58:51 - HTCondor service prerequisites OK; starting the service
14:58:51 - HTCondor config file /etc/condor/condor_config not found!
14:58:59 - Hadoop service prerequisites OK; starting the service
14:59:48 - Done adding Hadoop service; service running.
15:01:45 - Error creating volume from shared cluster's snapshot '['snap-cfa775ba']': 'filesystems'
Hey Greg,
We put together a quick workaround until we're able to resolve the underlying issues. You can launch the previous incarnation of Galaxy (pre-update AMI and CloudMan versions) using https://main.g2.bx.psu.edu/cloudlaunch?ami=ami-da58aab3&bucket_default=gxy-workshop
Once your instance is up, enter your share string as usual and it'll work fine. We expect to fix these issues next week, but this should get your instance back up and running for now.
On Mon, Jul 15, 2013 at 10:19 AM, greg <margeemail@gmail.com> wrote:
Thanks for getting back to me, Enis.
I went ahead and started a new cluster instance. I'll try to leave it running today in case there's anything we want to check.
Here's my whole process:
Start Screen Setup: ---------------------------------- http://snag.gy/DMIeC.jpg
Entering my share string: ---------------------------------------- http://snag.gy/wKFLy.jpg
Main Page Text and log:
------------------------------------
Cluster name: MSGGREG
Disk status: 0 / 0 (0%)
Worker status: Idle: 0 Available: 0 Requested: 0
Service status: Applications, Data
Cluster status log:
14:08:32 - Master starting
14:08:34 - Completed the initial cluster startup process. This is a new cluster; waiting to configure the type.
14:08:50 - Migration service prerequisites OK; starting the service
14:08:50 - SGE service prerequisites OK; starting the service
14:08:58 - Setting up SGE...
14:09:13 - HTCondor service prerequisites OK; starting the service
14:09:21 - Hadoop service prerequisites OK; starting the service
14:09:38 - Done adding Hadoop service; service running.
14:11:53 - Error creating volume from shared cluster's snapshot '['snap-cfa775ba']': 'filesystems'
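A side note on the error text itself: the trailing 'filesystems' looks like what Python prints for a KeyError, which would suggest (this is a guess, not confirmed anywhere in this thread) that the code expected a "filesystems" entry in the shared cluster's persistent_data.yaml and did not find one. The sketch below is not CloudMan code; describe_failure and the old_style dict are hypothetical and only show how a missing dictionary key yields a message of this shape.

```python
def describe_failure(shared_config, snap_ids):
    """Mimic the shape of the logged message when a config key is missing.

    Hypothetical helper for illustration only; not taken from CloudMan.
    """
    try:
        filesystems = shared_config["filesystems"]  # raises KeyError if absent
        return "Found %d filesystem entries" % len(filesystems)
    except Exception as exc:
        # str(KeyError('filesystems')) is "'filesystems'", quotes included.
        return ("Error creating volume from shared cluster's snapshot "
                "'%s': %s" % (snap_ids, exc))


# Hypothetical older-style config that has no 'filesystems' key.
old_style = {"services": []}
print(describe_failure(old_style, ["snap-cfa775ba"]))
```

If that reading is right, comparing the keys in the shared persistent_data.yaml against one written by a freshly created cluster would be a reasonable next place to look.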
Admin Page CloudMan Log: (unfortunately nothing is jumping out at me?) -----------------------------------------------
CloudMan from Galaxy Admin | Report bugs | Wiki | Screencast The entire log file (paster.log) is shown. Show latest | Back to admin view Python version: (2, 7) Image configuration suports: {'apps': ['cloudman', 'galaxy']} 2013-07-15 14:08:32,406 DEBUG app:68 Initializing app 2013-07-15 14:08:32,407 DEBUG ec2:121 Gathering instance zone, attempt 0 2013-07-15 14:08:32,410 DEBUG ec2:127 Instance zone is 'us-east-1d' 2013-07-15 14:08:32,410 DEBUG ec2:45 Gathering instance ami, attempt 0 2013-07-15 14:08:32,412 DEBUG app:71 Running on 'ec2' type of cloud in zone 'us-east-1d' using image 'ami-118bfc78'. 2013-07-15 14:08:32,412 DEBUG app:89 Getting pd.yaml 2013-07-15 14:08:32,412 DEBUG ec2:338 No S3 Connection, creating a new one. 2013-07-15 14:08:32,413 DEBUG ec2:342 Got boto S3 connection. 2013-07-15 14:08:32,452 DEBUG misc:212 Checking if bucket 'cm-0479bd75a331acc874033e98b2e1e03e' exists... it does not. 2013-07-15 14:08:32,452 DEBUG misc:583 Bucket 'cm-0479bd75a331acc874033e98b2e1e03e' does not exist, did not get remote file 'persistent_data.yaml' 2013-07-15 14:08:32,452 DEBUG app:96 Setting deployment_version to 2 2013-07-15 14:08:32,453 INFO app:103 Master starting 2013-07-15 14:08:32,453 DEBUG master:55 Initializing console manager - cluster start time: 2013-07-15 14:08:32.453182 2013-07-15 14:08:32,453 DEBUG comm:42 AMQP Connection Failure: [Errno 111] Connection refused 2013-07-15 14:08:32,453 DEBUG master:791 Trying to discover any worker instances associated with this cluster... 2013-07-15 14:08:32,454 DEBUG ec2:317 Establishing boto EC2 connection 2013-07-15 14:08:32,535 DEBUG ec2:305 Got region as 'RegionInfo:us-east-1' 2013-07-15 14:08:32,777 DEBUG ec2:326 Got boto EC2 connection for region 'us-east-1' 2013-07-15 14:08:33,022 DEBUG misc:574 Retrieved file 'snaps.yaml' from bucket 'cloudman' on host 's3.amazonaws.com' to 'cm_snaps.yaml'. 
2013-07-15 14:08:33,035 DEBUG ec2:286 Got region name as 'us-east-1' 2013-07-15 14:08:33,035 DEBUG master:226 Loaded default snapshot data: [{'snap_id': 'snap-adad90fc', 'name': 'galaxy', 'roles': 'galaxyTools,galaxyData'}, {'snap_id': 'snap-5b030634', 'name': 'galaxyIndices', 'roles': 'galaxyIndices'}] 2013-07-15 14:08:33,035 DEBUG ec2:81 Gathering instance id, attempt 0 2013-07-15 14:08:33,037 DEBUG ec2:87 Instance ID is 'i-5346d733' 2013-07-15 14:08:33,125 DEBUG ec2:360 Adding tag 'clusterName:MSGGREG' to resource 'i-5346d733' 2013-07-15 14:08:33,307 DEBUG ec2:360 Adding tag 'role:master' to resource 'i-5346d733' 2013-07-15 14:08:33,554 DEBUG ec2:360 Adding tag 'Name:master: MSGGREG' to resource 'i-5346d733' 2013-07-15 14:08:33,744 DEBUG master:246 ud at manager start: {'region_name': u'us-east-1', 'region_endpoint': u'ec2.amazonaws.com', 'ec2_port': None, 'deployment_version': 2, 'cloud_name': u'Amazon', 'boot_script_name': 'cm_boot.py', 'is_secure': True, 'password': '4444', 'access_key': 'redacted!', 's3_port': None, 'cloud_type': u'ec2', 'cloudman_home': '/mnt/cm', 'cluster_name': u'MSGGREG', 'freenxpass': u'4444', 'bucket_default': 'cloudman', 'role': 'master', 'bucket_cluster': 'cm-0479bd75a331acc874033e98b2e1e03e', 'boot_script_path': '/tmp/cm', 'secret_key': u'redacted!', 's3_conn_path': u'/', 's3_host': u's3.amazonaws.com', 'ec2_conn_path': u'/'} 2013-07-15 14:08:33,744 DEBUG master:1858 Generating root user's public key... 
2013-07-15 14:08:33,763 DEBUG base:57 Enabling 'root' controller, class: CM 2013-07-15 14:08:33,766 DEBUG buildapp:88 Enabling 'httpexceptions' middleware 2013-07-15 14:08:33,770 DEBUG buildapp:94 Enabling 'recursive' middleware 2013-07-15 14:08:33,776 DEBUG buildapp:114 Enabling 'print debug' middleware 2013-07-15 14:08:33,788 DEBUG buildapp:128 Enabling 'error' middleware 2013-07-15 14:08:33,788 DEBUG buildapp:138 Enabling 'config' middleware 2013-07-15 14:08:33,789 DEBUG buildapp:142 Enabling 'x-forwarded-host' middleware Starting server in PID 1791. serving on 0.0.0.0:42284 view at http://127.0.0.1:42284 2013-07-15 14:08:34,198 DEBUG master:1861 Successfully generated root user's public key. 2013-07-15 14:08:34,199 DEBUG master:1869 Successfully retrieved root user's public key from file. 2013-07-15 14:08:34,199 DEBUG master:99 Updating dependencies for service Migration 2013-07-15 14:08:34,199 DEBUG master:99 Updating dependencies for service SGE 2013-07-15 14:08:34,200 DEBUG filesystem:32 Instantiating Filesystem object transient_nfs with service roles: TransientNFS 2013-07-15 14:08:34,200 DEBUG filesystem:594 Configuring instance transient storage at /mnt/transient_nfs with NFS. 2013-07-15 14:08:34,200 DEBUG master:99 Updating dependencies for service transient_nfs 2013-07-15 14:08:34,200 DEBUG pss:27 Configured PSS as master 2013-07-15 14:08:34,200 DEBUG master:99 Updating dependencies for service PSS 2013-07-15 14:08:34,200 DEBUG htcondor:25 Condor is preparing 2013-07-15 14:08:34,200 DEBUG master:99 Updating dependencies for service HTCondor 2013-07-15 14:08:34,201 DEBUG master:99 Updating dependencies for service Hadoop 2013-07-15 14:08:34,201 DEBUG master:321 Checking for and adding any previously defined cluster services 2013-07-15 14:08:34,201 DEBUG master:327 Processing filesystems in an existing cluster config 2013-07-15 14:08:34,201 DEBUG master:812 Trying to discover any volumes attached to this instance... 
2013-07-15 14:08:34,377 DEBUG master:829 Attached volumes: [Volume:vol-cc9b1791] 2013-07-15 14:08:34,378 DEBUG ec2:360 Adding tag 'clusterName:MSGGREG' to resource 'vol-cc9b1791' 2013-07-15 14:08:34,545 DEBUG master:385 Processing application services in an existing cluster config 2013-07-15 14:08:34,545 INFO master:298 Completed the initial cluster startup process. This is a new cluster; waiting to configure the type. 2013-07-15 14:08:34,545 DEBUG master:2342 Monitor started; manager started 2013-07-15 14:08:38,545 DEBUG master:2352 Trying to setup AMQP connection; conn = '' 2013-07-15 14:08:38,546 DEBUG comm:42 AMQP Connection Failure: [Errno 111] Connection refused 2013-07-15 14:08:42,546 DEBUG master:2352 Trying to setup AMQP connection; conn = '' 2013-07-15 14:08:42,547 DEBUG comm:42 AMQP Connection Failure: [Errno 111] Connection refused 2013-07-15 14:08:46,547 DEBUG master:2352 Trying to setup AMQP connection; conn = '' 2013-07-15 14:08:46,550 DEBUG connection:661 Start from server, version: 8.0, properties: {u'information': u'Licensed under the MPL. See http://www.rabbitmq.com/', u'product': u'RabbitMQ', u'copyright': u'Copyright (C) 2007-2011 VMware, Inc.', u'capabilities': {}, u'platform': u'Erlang/OTP', u'version': u'2.7.1'}, mechanisms: [u'PLAIN', u'AMQPLAIN'], locales: [u'en_US'] 2013-07-15 14:08:46,551 DEBUG connection:507 Open OK! 
known_hosts [] 2013-07-15 14:08:46,551 DEBUG channel:70 using channel_id: 1 2013-07-15 14:08:46,552 DEBUG channel:484 Channel open 2013-07-15 14:08:46,555 DEBUG comm:40 Successfully established AMQP connection 2013-07-15 14:08:50,556 DEBUG master:2377 S&S: Migration..Unstarted; SGE..Unstarted; FS-transient_nfs..Unstarted; PSS..Unstarted; HTCondor..Unstarted; Hadoop..Unstarted; 2013-07-15 14:08:50,556 DEBUG master:2288 Monitor adding service 'Migration' 2013-07-15 14:08:50,556 INFO __init__:284 Migration service prerequisites OK; starting the service 2013-07-15 14:08:50,556 DEBUG migration_service:284 Starting migration service... 2013-07-15 14:08:50,556 DEBUG migration_service:332 Old deployment version: 2 2013-07-15 14:08:50,556 DEBUG migration_service:324 Current deployment version: 2 2013-07-15 14:08:50,556 DEBUG migration_service:300 No migration required. Service complete. 2013-07-15 14:08:50,556 DEBUG master:2288 Monitor adding service 'SGE' 2013-07-15 14:08:50,557 INFO __init__:284 SGE service prerequisites OK; starting the service 2013-07-15 14:08:50,557 DEBUG sge:100 Unpacking SGE from '/opt/galaxy/pkg/ge6.2u5' 2013-07-15 14:08:50,560 DEBUG sge:123 Unpacking SGE to '/opt/sge'. 2013-07-15 14:08:58,715 INFO sge:159 Setting up SGE... 2013-07-15 14:08:58,716 DEBUG ec2:188 Gathering instance private IP, attempt 0 2013-07-15 14:08:58,719 DEBUG sge:166 Created SGE install template as file '/opt/sge/galaxyEC2.conf' 2013-07-15 14:08:58,719 DEBUG sge:168 Setting up SGE. 2013-07-15 14:08:58,719 DEBUG misc:750 Replacing string ' libc_version=`echo $libc_string | tr ' ,' '\n' | grep "2\." | cut -f 2 -d "."`' with ' libc_version=`echo $libc_string | tr ' ,' '\n' | grep "2\." | cut -f 2 -d "." 
| sort -u`' in file /opt/sge/util/arch 2013-07-15 14:08:58,721 DEBUG misc:750 Replacing string ' 2.[46].*)' with ' [23].[24567890].*)' in file /opt/sge/util/arch 2013-07-15 14:08:58,721 DEBUG misc:750 Replacing string ' 2.6.*)' with ' [23].[24567890].*)' in file /opt/sge/util/arch 2013-07-15 14:08:58,744 DEBUG misc:724 Modified /opt/sge/util/arch 2013-07-15 14:08:58,763 DEBUG misc:724 Successfully chmod /opt/sge/util/arch 2013-07-15 14:08:58,783 DEBUG misc:724 'sed -i.bak '/^127.0.1./s/^/# (Commented by CloudMan) /' /etc/hosts' command OK 2013-07-15 14:09:11,740 DEBUG misc:724 Successfully set up SGE 2013-07-15 14:09:11,740 DEBUG sge:171 Successfully setup SGE; configuring SGE 2013-07-15 14:09:11,740 DEBUG sge:172 Adding parallel environments 2013-07-15 14:09:11,769 DEBUG misc:724 'cd /opt/sge; ./bin/lx24-amd64/qconf -Ap /tmp/SMP_PE' command OK 2013-07-15 14:09:11,797 DEBUG misc:724 'cd /opt/sge; ./bin/lx24-amd64/qconf -Ap /tmp/MPI_PE' command OK 2013-07-15 14:09:11,797 DEBUG sge:180 Creating queue 'all.q' 2013-07-15 14:09:11,798 DEBUG sge:201 Created SGE all.q template as file '/opt/sge/all.q.conf' 2013-07-15 14:09:11,827 DEBUG misc:724 Successfully modified all.q 2013-07-15 14:09:11,828 DEBUG sge:206 Configuring users' SGE profiles 2013-07-15 14:09:11,828 DEBUG master:2288 Monitor adding service 'FS-transient_nfs' 2013-07-15 14:09:11,829 DEBUG filesystem:106 Trying to add file system service FS-transient_nfs 2013-07-15 14:09:11,829 DEBUG transient_storage:55 Adding transient file system at /mnt/transient_nfs 2013-07-15 14:09:11,881 DEBUG filesystem:403 Added '/mnt/transient_nfs *(rw,sync,no_root_squash,no_subtree_check)' line to NFS file /etc/exports 2013-07-15 14:09:11,882 DEBUG filesystem:468 FS-transient_nfs in 'Starting' state for 0 seconds 2013-07-15 14:09:13,065 DEBUG misc:724 As part of transient_nfs filesystem update, successfully restarted NFS server 2013-07-15 14:09:13,066 DEBUG filesystem:468 FS-transient_nfs in 'Starting' state for 1 seconds 
2013-07-15 14:09:13,067 DEBUG filesystem:135 Done adding devices to FS-transient_nfs (devices: [], [], [Transient storage @ /mnt/transient_nfs], -) 2013-07-15 14:09:13,067 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:09:13,067 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:09:13,067 DEBUG master:2288 Monitor adding service 'HTCondor' 2013-07-15 14:09:13,067 INFO __init__:284 HTCondor service prerequisites OK; starting the service 2013-07-15 14:09:13,067 DEBUG htcondor:38 Starting HTCondor service 2013-07-15 14:09:13,068 DEBUG htcondor:67 HTCondor params: {'flock_host': ''} 2013-07-15 14:09:14,208 DEBUG misc:724 '/etc/init.d/condor restart' command OK 2013-07-15 14:09:14,209 DEBUG master:2288 Monitor adding service 'Hadoop' 2013-07-15 14:09:14,209 DEBUG __init__:290 Hadoop service prerequisites are not yet satisfied, waiting for: []. Setting Hadoop service state to 'Unstarted' 2013-07-15 14:09:14,209 DEBUG master:2199 Storing cluster configuration to cluster's bucket 2013-07-15 14:09:14,798 DEBUG misc:212 Checking if bucket 'cm-0479bd75a331acc874033e98b2e1e03e' exists... it does not. 2013-07-15 14:09:15,090 DEBUG misc:224 Created bucket 'cm-0479bd75a331acc874033e98b2e1e03e'. 
2013-07-15 14:09:15,237 DEBUG misc:595 Saved file 'persistent_data.yaml' of size 537B to bucket 'cm-0479bd75a331acc874033e98b2e1e03e' 2013-07-15 14:09:15,237 DEBUG master:2244 Saving current instance boot script (/tmp/cm/cm_boot.py) to cluster bucket 'cm-0479bd75a331acc874033e98b2e1e03e' as 'cm_boot.py' 2013-07-15 14:09:15,424 DEBUG misc:595 Saved file 'cm_boot.py' of size 19845B to bucket 'cm-0479bd75a331acc874033e98b2e1e03e' 2013-07-15 14:09:15,424 DEBUG master:2254 Saving CloudMan source (/mnt/cm/cm.tar.gz) to cluster bucket 'cm-0479bd75a331acc874033e98b2e1e03e' as 'cm.tar.gz' 2013-07-15 14:09:15,680 DEBUG misc:595 Saved file 'cm.tar.gz' of size 793622B to bucket 'cm-0479bd75a331acc874033e98b2e1e03e' 2013-07-15 14:09:15,680 DEBUG misc:685 Setting metadata 'revision' for file 'cm.tar.gz' in bucket 'cm-0479bd75a331acc874033e98b2e1e03e' 2013-07-15 14:09:15,921 DEBUG master:2280 Saving '/mnt/cm/MSGGREG.clusterName' file to cluster bucket 'cm-0479bd75a331acc874033e98b2e1e03e' as 'MSGGREG.clusterName' 2013-07-15 14:09:15,985 DEBUG misc:595 Saved file 'MSGGREG.clusterName' of size 0B to bucket 'cm-0479bd75a331acc874033e98b2e1e03e' 2013-07-15 14:09:16,006 DEBUG master:1755 Checking for new version of CloudMan 2013-07-15 14:09:16,006 DEBUG misc:667 Getting metadata 'revision' for file 'cm.tar.gz' from bucket 'cm-0479bd75a331acc874033e98b2e1e03e' 2013-07-15 14:09:16,066 DEBUG misc:667 Getting metadata 'revision' for file 'cm.tar.gz' from bucket 'cloudman' 2013-07-15 14:09:16,184 DEBUG master:1762 Revision number for user's CloudMan: '732'; revision number for default CloudMan: '732' 2013-07-15 14:09:16,184 DEBUG ec2:61 Gathering instance type, attempt 0 2013-07-15 14:09:16,457 DEBUG ec2:258 Gathering instance public hostname, attempt 0 2013-07-15 14:09:21,091 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.76 lx24-amd64 '] 2013-07-15 14:09:21,096 DEBUG filesystem:468 FS-transient_nfs in 'Starting' state for 9 seconds 2013-07-15 14:09:21,096 DEBUG 
master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..Starting; PSS..Unstarted; HTCondor..OK; Hadoop..Unstarted; 2013-07-15 14:09:21,097 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:09:21,097 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:09:21,097 DEBUG master:2288 Monitor adding service 'Hadoop' 2013-07-15 14:09:21,097 INFO __init__:284 Hadoop service prerequisites OK; starting the service 2013-07-15 14:09:21,097 DEBUG hadoop:38 Configuring Hadoop 2013-07-15 14:09:21,098 DEBUG master:2199 Storing cluster configuration to cluster's bucket 2013-07-15 14:09:21,101 DEBUG hadoop:81 Unpacking Hadoop 2013-07-15 14:09:21,101 DEBUG hadoop:84 Hadoop path is
/opt/hadoop/hadoop\.((([|0-9])*\.)*[0-9]*__([0-9]*\.)*[0-9]+){0,1}\.{0,1}tar\.gz 2013-07-15 14:09:21,101 DEBUG hadoop:87 Hadoop SGE integration path is /opt/hadoop/sge_integration\.(([0-9]*\.)*[0-9]+){0,1}\.{0,1}tar\.gz 2013-07-15 14:09:21,106 DEBUG hadoop:180 Extracted Hadoop version: 1.0.4 2013-07-15 14:09:21,106 DEBUG hadoop:181 Extracted Hadoop build version: 1.0 2013-07-15 14:09:21,174 DEBUG hadoop:180 Extracted Hadoop version: 1.0.4 2013-07-15 14:09:21,175 DEBUG hadoop:181 Extracted Hadoop build version: 1.0 2013-07-15 14:09:21,821 DEBUG misc:595 Saved file 'persistent_data.yaml' of size 537B to bucket 'cm-0479bd75a331acc874033e98b2e1e03e' 2013-07-15 14:09:21,822 DEBUG master:2244 Saving current instance boot script (/tmp/cm/cm_boot.py) to cluster bucket 'cm-0479bd75a331acc874033e98b2e1e03e' as 'cm_boot.py' 2013-07-15 14:09:21,902 DEBUG misc:595 Saved file 'cm_boot.py' of size 19845B to bucket 'cm-0479bd75a331acc874033e98b2e1e03e' 2013-07-15 14:09:21,902 DEBUG master:2254 Saving CloudMan source (/mnt/cm/cm.tar.gz) to cluster bucket 'cm-0479bd75a331acc874033e98b2e1e03e' as 'cm.tar.gz' 2013-07-15 14:09:22,158 DEBUG misc:595 Saved file 'cm.tar.gz' of size 793622B to bucket 'cm-0479bd75a331acc874033e98b2e1e03e' 2013-07-15 14:09:22,158 DEBUG misc:685 Setting metadata 'revision' for file 'cm.tar.gz' in bucket 'cm-0479bd75a331acc874033e98b2e1e03e' 2013-07-15 14:09:22,621 DEBUG master:2280 Saving '/mnt/cm/MSGGREG.clusterName' file to cluster bucket 'cm-0479bd75a331acc874033e98b2e1e03e' as 'MSGGREG.clusterName' 2013-07-15 14:09:22,683 DEBUG misc:595 Saved file 'MSGGREG.clusterName' of size 0B to bucket 'cm-0479bd75a331acc874033e98b2e1e03e' 2013-07-15 14:09:26,686 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:09:26,686 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:09:30,689 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:09:30,690 DEBUG pss:95 Not adding PSS svc; it 
completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:09:35,352 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.76 lx24-amd64 '] 2013-07-15 14:09:35,356 DEBUG filesystem:468 FS-transient_nfs in 'Starting' state for 23 seconds 2013-07-15 14:09:35,357 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..Starting; PSS..Unstarted; HTCondor..OK; Hadoop..Starting; 2013-07-15 14:09:35,357 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:09:35,358 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:09:38,802 DEBUG hadoop:141 Hadoop extracted to /opt/hadoop 2013-07-15 14:09:38,821 DEBUG hadoop:146 Hadoop SGE integration extracted to /opt/hadoop 2013-07-15 14:09:38,848 DEBUG misc:724 'chown -R -c ubuntu /opt/hadoop/sge_integration.1.0.tar.gz' command OK 2013-07-15 14:09:38,872 DEBUG misc:724 'chown -R -c ubuntu /opt/hadoop/hadoop.1.0.4__1.0.tar.gz' command OK 2013-07-15 14:09:38,873 DEBUG hadoop:190 Setting up Hadoop environment 2013-07-15 14:09:38,874 DEBUG hadoop:195 Hadoop id_rsa set from::/opt/hadoop/id_rsa 2013-07-15 14:09:38,899 DEBUG misc:724 'chown -c ubuntu /home/ubuntu/.ssh/id_rsa' command OK 2013-07-15 14:09:38,899 DEBUG hadoop:199 Hadoop authFile saved to /home/ubuntu/.ssh/id_rsa 2013-07-15 14:09:38,923 DEBUG misc:724 'chown -c ubuntu /home/ubuntu/.ssh/authorized_keys' command OK 2013-07-15 14:09:38,924 INFO hadoop:51 Done adding Hadoop service; service running. 
2013-07-15 14:09:39,363 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:09:39,363 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:09:43,365 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:09:43,365 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:09:47,907 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.76 lx24-amd64 '] 2013-07-15 14:09:47,934 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:09:47,934 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:09:47,935 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:09:51,936 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:09:51,937 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:09:55,938 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:09:55,938 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:10:00,482 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.87 lx24-amd64 '] 2013-07-15 14:10:00,508 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:10:00,509 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:00,509 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:10:04,511 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:04,512 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:10:08,513 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:08,514 DEBUG pss:95 Not adding PSS svc; it 
completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:10:16,307 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.87 lx24-amd64 '] 2013-07-15 14:10:16,336 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:10:16,336 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:16,337 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:10:20,339 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:20,339 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:10:24,890 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.87 lx24-amd64 '] 2013-07-15 14:10:24,921 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:10:24,922 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:24,922 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:10:28,924 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:28,924 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:10:32,926 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:32,927 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:10:37,472 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.88 lx24-amd64 '] 2013-07-15 14:10:37,500 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:10:37,501 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:37,501 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 
14:10:41,503 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:41,504 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:10:45,505 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:45,505 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:10:50,046 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.88 lx24-amd64 '] 2013-07-15 14:10:50,075 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:10:50,076 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:50,076 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:10:54,078 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:54,079 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:10:58,080 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:58,081 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:11:02,633 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.88 lx24-amd64 '] 2013-07-15 14:11:02,662 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:11:02,662 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:02,662 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:11:06,664 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:06,665 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:11:10,667 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:10,668 DEBUG pss:95 Not adding PSS svc; it completed 
(False) or the cluster was not yet initialized (None) 2013-07-15 14:11:15,212 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.77 lx24-amd64 '] 2013-07-15 14:11:15,240 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:11:15,241 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:15,241 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:11:19,243 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:19,243 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:11:23,245 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:23,246 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:11:27,785 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.77 lx24-amd64 '] 2013-07-15 14:11:27,814 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:11:27,814 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:27,815 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:11:31,816 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:31,817 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:11:35,818 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:35,819 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:11:40,362 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.77 lx24-amd64 '] 2013-07-15 14:11:40,390 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:11:40,391 
DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:40,391 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:11:44,393 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:44,394 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:11:48,395 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:48,396 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:11:50,209 DEBUG master:1252 Initializing a shared cluster from 'cm-808d863548acae7c2328c39a90f52e29/shared/2012-09-17--19-47' 2013-07-15 14:11:50,392 DEBUG misc:574 Retrieved file 'shared/2012-09-17--19-47/shared_instance_file_list.txt' from bucket 'cm-808d863548acae7c2328c39a90f52e29' on host 's3.amazonaws.com' to 'shared_instance_file_list.txt'. 2013-07-15 14:11:50,394 DEBUG misc:615 Establishing handle with key object 'shared/2012-09-17--19-47/persistent_data.yaml' 2013-07-15 14:11:50,394 DEBUG misc:619 Copying file
'cm-808d863548acae7c2328c39a90f52e29/shared/2012-09-17--19-47/persistent_data.yaml' to file 'cm-0479bd75a331acc874033e98b2e1e03e/persistent_data.yaml' 2013-07-15 14:11:50,550 DEBUG misc:615 Establishing handle with key object 'shared/2012-09-17--19-47/cm.tar.gz' 2013-07-15 14:11:50,550 DEBUG misc:619 Copying file 'cm-808d863548acae7c2328c39a90f52e29/shared/2012-09-17--19-47/cm.tar.gz' to file 'cm-0479bd75a331acc874033e98b2e1e03e/cm.tar.gz' 2013-07-15 14:11:50,771 DEBUG misc:615 Establishing handle with key object 'shared/2012-09-17--19-47/cm.tar.gz_2012-09-13' 2013-07-15 14:11:50,771 DEBUG misc:619 Copying file
'cm-808d863548acae7c2328c39a90f52e29/shared/2012-09-17--19-47/cm.tar.gz_2012-09-13' to file 'cm-0479bd75a331acc874033e98b2e1e03e/cm.tar.gz_2012-09-13' 2013-07-15 14:11:50,930 DEBUG misc:615 Establishing handle with key object 'shared/2012-09-17--19-47/cm_boot.py' 2013-07-15 14:11:50,930 DEBUG misc:619 Copying file 'cm-808d863548acae7c2328c39a90f52e29/shared/2012-09-17--19-47/cm_boot.py' to file 'cm-0479bd75a331acc874033e98b2e1e03e/cm_boot.py' 2013-07-15 14:11:51,085 DEBUG misc:615 Establishing handle with key object 'shared/2012-09-17--19-47/post_start_script' 2013-07-15 14:11:51,085 DEBUG misc:619 Copying file
'cm-808d863548acae7c2328c39a90f52e29/shared/2012-09-17--19-47/post_start_script' to file 'cm-0479bd75a331acc874033e98b2e1e03e/post_start_script' 2013-07-15 14:11:51,326 DEBUG misc:574 Retrieved file 'persistent_data.yaml' from bucket 'cm-0479bd75a331acc874033e98b2e1e03e' on host 's3.amazonaws.com' to 'shared_p_d.yaml'. 2013-07-15 14:11:52,944 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.77 lx24-amd64 '] 2013-07-15 14:11:52,973 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:11:52,973 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:52,973 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:11:53,730 ERROR master:1337 Error creating volume from shared cluster's snapshot '['snap-cfa775ba']': 'filesystems' 2013-07-15 14:11:56,975 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:56,976 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:00,978 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:00,978 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:05,521 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.72 lx24-amd64 '] 2013-07-15 14:12:05,550 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:12:05,551 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:05,551 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:09,553 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:09,553 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:13,555 DEBUG master:2288 Monitor adding service 'PSS' 
2013-07-15 14:12:13,556 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:18,102 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.72 lx24-amd64 '] 2013-07-15 14:12:18,131 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:12:18,131 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:18,131 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:22,133 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:22,134 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:25,087 DEBUG ec2:166 Gathering instance public keys (i.e., key pairs), attempt 0 2013-07-15 14:12:25,091 DEBUG ec2:173 Got key pair: 'cloudman_key_pair' 2013-07-15 14:12:26,136 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:26,137 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:30,689 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.72 lx24-amd64 '] 2013-07-15 14:12:30,717 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:12:30,717 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:30,718 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:34,719 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:34,720 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:38,721 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:38,721 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:43,263 DEBUG sge:538 
qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.65 lx24-amd64 '] 2013-07-15 14:12:43,289 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:12:43,290 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:43,290 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:47,292 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:47,292 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:51,293 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:51,293 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:55,836 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.65 lx24-amd64 '] 2013-07-15 14:12:55,865 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:12:55,865 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:55,866 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:59,867 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:59,868 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:13:03,868 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:13:03,869 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:13:08,419 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.65 lx24-amd64 '] 2013-07-15 14:13:08,448 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:13:08,449 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:13:08,449 DEBUG pss:95 Not adding 
PSS svc; it completed (False) or the cluster was not yet initialized (None)
On Fri, Jul 12, 2013 at 12:08 PM, Enis Afgan <eafgan@emory.edu> wrote:
Hi Greg, Sorry for replying really late.
So, I'm guessing this was an old cluster that was shared and is now being derived on a new cluster? There was a large number of paths we explored while getting ready for the upgrade, and I was of the opinion we had covered that path, but it seems things are not working as expected. Can you look at the more detailed log on the Admin page (under CloudMan log) and see if there are more details about what's going on and why it's failing?
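Most of the CloudMan log on the Admin page is repeated DEBUG output from the monitor, so one way to follow this suggestion is to filter the pasted text down to the entries above DEBUG level. A minimal sketch, assuming entries follow the `YYYY-MM-DD HH:MM:SS,mmm LEVEL module:line message` layout seen in the log pasted in this thread; the name `interesting_entries` is made up for illustration and is not part of CloudMan:

```python
import re

# Timestamp at the start of every entry, e.g. "2013-07-15 14:11:53,730".
TS = r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}"

# One entry runs from its timestamp up to the next timestamp (or end of text),
# so this also works when the log was pasted as a single wrapped line.
ENTRY = re.compile(
    r"(?P<ts>" + TS + r") (?P<level>[A-Z]+) (?P<where>\S+) "
    r"(?P<msg>.*?)(?=\s*" + TS + r" [A-Z]+ |\s*\Z)",
    re.DOTALL,
)


def interesting_entries(text, levels=("WARNING", "ERROR", "CRITICAL")):
    """Return (timestamp, level, location, message) for the chosen levels."""
    hits = []
    for match in ENTRY.finditer(text):
        if match.group("level") in levels:
            # Collapse internal whitespace left over from line wrapping.
            message = " ".join(match.group("msg").split())
            hits.append((match.group("ts"), match.group("level"),
                         match.group("where"), message))
    return hits


# Three entries copied from the log in this thread.
SAMPLE = (
    "2013-07-15 14:11:52,973 DEBUG pss:95 Not adding PSS svc; it completed "
    "(False) or the cluster was not yet initialized (None) "
    "2013-07-15 14:11:53,730 ERROR master:1337 Error creating volume from "
    "shared cluster's snapshot '['snap-cfa775ba']': 'filesystems' "
    "2013-07-15 14:11:56,975 DEBUG master:2288 Monitor adding service 'PSS'"
)

for entry in interesting_entries(SAMPLE):
    print(entry)
```

Passing `levels=("INFO", "WARNING", "ERROR", "CRITICAL")` instead gives roughly the same view as the short cluster status log on the main page, with the source module and line number attached.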
On Thu, Jul 11, 2013 at 3:14 PM, greg <margeemail@gmail.com> wrote:
Hi guys,
I just thought I'd check in again. None of the researchers who want to run our genotyping program can do so until I figure this out. Any help or advice at all would be greatly appreciated.
Thanks,
Greg
On Mon, Jul 8, 2013 at 8:43 AM, greg <margeemail@gmail.com> wrote:
Hi guys,
Any thoughts on this? I'm kind of stuck.
(Even some pointers on where to look for more clues would be extremely helpful.)
Thanks,
Greg
On Fri, Jul 5, 2013 at 11:10 AM, greg <margeemail@gmail.com> wrote:
Hi guys,
I'm hitting an error using CloudMan using the Share-an-Instance option. It says:
Error creating volume from shared cluster's snapshot '['snap-cfa775ba']': 'filesystems'.
Also disk stats says 0 /0 and the Applications light is yellow while the data light is green.
I'm using the share string cm-808d863548acae7c2328c39a90f52e29/shared/2012-09-17--19-47
It's always worked in the past.
Thanks,
Greg
Here's the full log:
14:58:18 - Master starting
14:58:20 - Completed the initial cluster startup process. This is a new cluster; waiting to configure the type.
14:58:24 - Migration service prerequisites OK; starting the service
14:58:24 - SGE service prerequisites OK; starting the service
14:58:31 - Setting up SGE...
14:58:51 - HTCondor service prerequisites OK; starting the service
14:58:51 - HTCondor config file /etc/condor/condor_config not found!
14:58:59 - Hadoop service prerequisites OK; starting the service
14:59:48 - Done adding Hadoop service; service running.
15:01:45 - Error creating volume from shared cluster's snapshot '['snap-cfa775ba']': 'filesystems'
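The trailing `'filesystems'` in the error reads like the string form of a Python `KeyError`, which would be consistent with the code looking up a `filesystems` key that the shared cluster's `persistent_data.yaml` does not contain. That is a guess from the message format alone and is not confirmed anywhere in this thread. A minimal sketch of how such a message can arise, with a made-up dictionary standing in for the parsed file and a hypothetical `first_snapshot` helper that is not CloudMan's actual code:

```python
def first_snapshot(persistent_data):
    """Hypothetical lookup that assumes a 'filesystems' key is present."""
    return persistent_data["filesystems"][0]["snap_id"]


def describe_failure(persistent_data, snap_ids):
    """Format any lookup failure the way the log line in this thread looks."""
    try:
        first_snapshot(persistent_data)
    except Exception as exc:
        # str(KeyError('filesystems')) is "'filesystems'", quotes included,
        # which is why the log shows only the bare key name.
        return ("Error creating volume from shared cluster's snapshot "
                "'%s': %s" % (snap_ids, exc))
    return "ok"


# Invented layout for illustration only: any mapping without a
# 'filesystems' key triggers the same message.
without_key = {"volumes": [{"snap_id": "snap-cfa775ba"}]}

message = describe_failure(without_key, ["snap-cfa775ba"])
print(message)
```

If this reading is right, comparing the keys in the shared `persistent_data.yaml` (the log shows it being copied into the new cluster's bucket) against one written by a freshly created cluster would show whether the layout changed between versions.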
Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
Thanks guys! I'm going to try it out now.
Would you mind letting me know when the issues are fixed next week? (Updating this thread would be fine.)
Thanks again,
Greg
On Tue, Jul 16, 2013 at 12:42 PM, Dannon Baker <dannon.baker@gmail.com> wrote:
Hey Greg,
We put together a quick workaround until we're able to resolve the underlying issues. You can launch the previous incarnation of Galaxy (pre-update AMI and CloudMan versions) using this link: https://main.g2.bx.psu.edu/cloudlaunch?ami=ami-da58aab3&bucket_default=gxy-workshop
Once your instance is up, enter your share string like usual and it'll work fine. We expect to fix these issues next week, but this should get your instance back up and running for now.
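The workaround link differs from a default launch only in its query string. A small sketch that pulls the two parameters out with the standard library, mainly to make visible what is being overridden; what each parameter does on the server side is inferred from its name and from the image and `bucket_default` values that appear in the log in this thread, not from documentation. The helper name `launch_overrides` is made up for illustration:

```python
from urllib.parse import parse_qs, urlsplit

# The link given in the message above.
WORKAROUND_URL = (
    "https://main.g2.bx.psu.edu/cloudlaunch"
    "?ami=ami-da58aab3&bucket_default=gxy-workshop"
)


def launch_overrides(url):
    """Return the query parameters of a launch URL as a plain dict."""
    query = parse_qs(urlsplit(url).query)
    # parse_qs returns a list per key; each key appears once here.
    return {name: values[0] for name, values in query.items()}


overrides = launch_overrides(WORKAROUND_URL)
print(overrides)
```

For comparison, the failing instance in the log reports image 'ami-118bfc78' and 'bucket_default': 'cloudman', so both values differ from the ones in the workaround link.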
On Mon, Jul 15, 2013 at 10:19 AM, greg <margeemail@gmail.com> wrote:
Thanks for getting back to me, Enis.
I went ahead and started a new cluster instance. I'll try to leave it running today in case there's anything we want to check.
Here's my whole process:
Start Screen Setup:
----------------------------------
http://snag.gy/DMIeC.jpg

Entering my share string:
----------------------------------------
http://snag.gy/wKFLy.jpg

Main Page Text and log:
------------------------------------
Cluster name: MSGGREG
Disk status: 0 / 0 (0%)
Worker status: Idle: 0 Available: 0 Requested: 0
Service status: Applications, Data
Cluster status log
14:08:32 - Master starting
14:08:34 - Completed the initial cluster startup process. This is a new cluster; waiting to configure the type.
14:08:50 - Migration service prerequisites OK; starting the service
14:08:50 - SGE service prerequisites OK; starting the service
14:08:58 - Setting up SGE...
14:09:13 - HTCondor service prerequisites OK; starting the service
14:09:21 - Hadoop service prerequisites OK; starting the service
14:09:38 - Done adding Hadoop service; service running.
14:11:53 - Error creating volume from shared cluster's snapshot '['snap-cfa775ba']': 'filesystems'
Admin Page CloudMan Log: (unfortunately nothing is jumping out at me?)
-----------------------------------------------
The entire log file (paster.log) is shown. Python version: (2, 7) Image configuration suports: {'apps': ['cloudman', 'galaxy']} 2013-07-15 14:08:32,406 DEBUG app:68 Initializing app 2013-07-15 14:08:32,407 DEBUG ec2:121 Gathering instance zone, attempt 0 2013-07-15 14:08:32,410 DEBUG ec2:127 Instance zone is 'us-east-1d' 2013-07-15 14:08:32,410 DEBUG ec2:45 Gathering instance ami, attempt 0 2013-07-15 14:08:32,412 DEBUG app:71 Running on 'ec2' type of cloud in zone 'us-east-1d' using image 'ami-118bfc78'. 2013-07-15 14:08:32,412 DEBUG app:89 Getting pd.yaml 2013-07-15 14:08:32,412 DEBUG ec2:338 No S3 Connection, creating a new one. 2013-07-15 14:08:32,413 DEBUG ec2:342 Got boto S3 connection. 2013-07-15 14:08:32,452 DEBUG misc:212 Checking if bucket 'cm-0479bd75a331acc874033e98b2e1e03e' exists... it does not. 2013-07-15 14:08:32,452 DEBUG misc:583 Bucket 'cm-0479bd75a331acc874033e98b2e1e03e' does not exist, did not get remote file 'persistent_data.yaml' 2013-07-15 14:08:32,452 DEBUG app:96 Setting deployment_version to 2 2013-07-15 14:08:32,453 INFO app:103 Master starting 2013-07-15 14:08:32,453 DEBUG master:55 Initializing console manager - cluster start time: 2013-07-15 14:08:32.453182 2013-07-15 14:08:32,453 DEBUG comm:42 AMQP Connection Failure: [Errno 111] Connection refused 2013-07-15 14:08:32,453 DEBUG master:791 Trying to discover any worker instances associated with this cluster... 2013-07-15 14:08:32,454 DEBUG ec2:317 Establishing boto EC2 connection 2013-07-15 14:08:32,535 DEBUG ec2:305 Got region as 'RegionInfo:us-east-1' 2013-07-15 14:08:32,777 DEBUG ec2:326 Got boto EC2 connection for region 'us-east-1' 2013-07-15 14:08:33,022 DEBUG misc:574 Retrieved file 'snaps.yaml' from bucket 'cloudman' on host 's3.amazonaws.com' to 'cm_snaps.yaml'.
2013-07-15 14:08:33,035 DEBUG ec2:286 Got region name as 'us-east-1' 2013-07-15 14:08:33,035 DEBUG master:226 Loaded default snapshot data: [{'snap_id': 'snap-adad90fc', 'name': 'galaxy', 'roles': 'galaxyTools,galaxyData'}, {'snap_id': 'snap-5b030634', 'name': 'galaxyIndices', 'roles': 'galaxyIndices'}] 2013-07-15 14:08:33,035 DEBUG ec2:81 Gathering instance id, attempt 0 2013-07-15 14:08:33,037 DEBUG ec2:87 Instance ID is 'i-5346d733' 2013-07-15 14:08:33,125 DEBUG ec2:360 Adding tag 'clusterName:MSGGREG' to resource 'i-5346d733' 2013-07-15 14:08:33,307 DEBUG ec2:360 Adding tag 'role:master' to resource 'i-5346d733' 2013-07-15 14:08:33,554 DEBUG ec2:360 Adding tag 'Name:master: MSGGREG' to resource 'i-5346d733' 2013-07-15 14:08:33,744 DEBUG master:246 ud at manager start: {'region_name': u'us-east-1', 'region_endpoint': u'ec2.amazonaws.com', 'ec2_port': None, 'deployment_version': 2, 'cloud_name': u'Amazon', 'boot_script_name': 'cm_boot.py', 'is_secure': True, 'password': '4444', 'access_key': 'redacted!', 's3_port': None, 'cloud_type': u'ec2', 'cloudman_home': '/mnt/cm', 'cluster_name': u'MSGGREG', 'freenxpass': u'4444', 'bucket_default': 'cloudman', 'role': 'master', 'bucket_cluster': 'cm-0479bd75a331acc874033e98b2e1e03e', 'boot_script_path': '/tmp/cm', 'secret_key': u'redacted!', 's3_conn_path': u'/', 's3_host': u's3.amazonaws.com', 'ec2_conn_path': u'/'} 2013-07-15 14:08:33,744 DEBUG master:1858 Generating root user's public key... 
2013-07-15 14:08:33,763 DEBUG base:57 Enabling 'root' controller, class: CM 2013-07-15 14:08:33,766 DEBUG buildapp:88 Enabling 'httpexceptions' middleware 2013-07-15 14:08:33,770 DEBUG buildapp:94 Enabling 'recursive' middleware 2013-07-15 14:08:33,776 DEBUG buildapp:114 Enabling 'print debug' middleware 2013-07-15 14:08:33,788 DEBUG buildapp:128 Enabling 'error' middleware 2013-07-15 14:08:33,788 DEBUG buildapp:138 Enabling 'config' middleware 2013-07-15 14:08:33,789 DEBUG buildapp:142 Enabling 'x-forwarded-host' middleware Starting server in PID 1791. serving on 0.0.0.0:42284 view at http://127.0.0.1:42284 2013-07-15 14:08:34,198 DEBUG master:1861 Successfully generated root user's public key. 2013-07-15 14:08:34,199 DEBUG master:1869 Successfully retrieved root user's public key from file. 2013-07-15 14:08:34,199 DEBUG master:99 Updating dependencies for service Migration 2013-07-15 14:08:34,199 DEBUG master:99 Updating dependencies for service SGE 2013-07-15 14:08:34,200 DEBUG filesystem:32 Instantiating Filesystem object transient_nfs with service roles: TransientNFS 2013-07-15 14:08:34,200 DEBUG filesystem:594 Configuring instance transient storage at /mnt/transient_nfs with NFS. 2013-07-15 14:08:34,200 DEBUG master:99 Updating dependencies for service transient_nfs 2013-07-15 14:08:34,200 DEBUG pss:27 Configured PSS as master 2013-07-15 14:08:34,200 DEBUG master:99 Updating dependencies for service PSS 2013-07-15 14:08:34,200 DEBUG htcondor:25 Condor is preparing 2013-07-15 14:08:34,200 DEBUG master:99 Updating dependencies for service HTCondor 2013-07-15 14:08:34,201 DEBUG master:99 Updating dependencies for service Hadoop 2013-07-15 14:08:34,201 DEBUG master:321 Checking for and adding any previously defined cluster services 2013-07-15 14:08:34,201 DEBUG master:327 Processing filesystems in an existing cluster config 2013-07-15 14:08:34,201 DEBUG master:812 Trying to discover any volumes attached to this instance... 
2013-07-15 14:08:34,377 DEBUG master:829 Attached volumes: [Volume:vol-cc9b1791] 2013-07-15 14:08:34,378 DEBUG ec2:360 Adding tag 'clusterName:MSGGREG' to resource 'vol-cc9b1791' 2013-07-15 14:08:34,545 DEBUG master:385 Processing application services in an existing cluster config 2013-07-15 14:08:34,545 INFO master:298 Completed the initial cluster startup process. This is a new cluster; waiting to configure the type. 2013-07-15 14:08:34,545 DEBUG master:2342 Monitor started; manager started 2013-07-15 14:08:38,545 DEBUG master:2352 Trying to setup AMQP connection; conn = '' 2013-07-15 14:08:38,546 DEBUG comm:42 AMQP Connection Failure: [Errno 111] Connection refused 2013-07-15 14:08:42,546 DEBUG master:2352 Trying to setup AMQP connection; conn = '' 2013-07-15 14:08:42,547 DEBUG comm:42 AMQP Connection Failure: [Errno 111] Connection refused 2013-07-15 14:08:46,547 DEBUG master:2352 Trying to setup AMQP connection; conn = '' 2013-07-15 14:08:46,550 DEBUG connection:661 Start from server, version: 8.0, properties: {u'information': u'Licensed under the MPL. See http://www.rabbitmq.com/', u'product': u'RabbitMQ', u'copyright': u'Copyright (C) 2007-2011 VMware, Inc.', u'capabilities': {}, u'platform': u'Erlang/OTP', u'version': u'2.7.1'}, mechanisms: [u'PLAIN', u'AMQPLAIN'], locales: [u'en_US'] 2013-07-15 14:08:46,551 DEBUG connection:507 Open OK! 
known_hosts [] 2013-07-15 14:08:46,551 DEBUG channel:70 using channel_id: 1 2013-07-15 14:08:46,552 DEBUG channel:484 Channel open 2013-07-15 14:08:46,555 DEBUG comm:40 Successfully established AMQP connection 2013-07-15 14:08:50,556 DEBUG master:2377 S&S: Migration..Unstarted; SGE..Unstarted; FS-transient_nfs..Unstarted; PSS..Unstarted; HTCondor..Unstarted; Hadoop..Unstarted; 2013-07-15 14:08:50,556 DEBUG master:2288 Monitor adding service 'Migration' 2013-07-15 14:08:50,556 INFO __init__:284 Migration service prerequisites OK; starting the service 2013-07-15 14:08:50,556 DEBUG migration_service:284 Starting migration service... 2013-07-15 14:08:50,556 DEBUG migration_service:332 Old deployment version: 2 2013-07-15 14:08:50,556 DEBUG migration_service:324 Current deployment version: 2 2013-07-15 14:08:50,556 DEBUG migration_service:300 No migration required. Service complete. 2013-07-15 14:08:50,556 DEBUG master:2288 Monitor adding service 'SGE' 2013-07-15 14:08:50,557 INFO __init__:284 SGE service prerequisites OK; starting the service 2013-07-15 14:08:50,557 DEBUG sge:100 Unpacking SGE from '/opt/galaxy/pkg/ge6.2u5' 2013-07-15 14:08:50,560 DEBUG sge:123 Unpacking SGE to '/opt/sge'. 2013-07-15 14:08:58,715 INFO sge:159 Setting up SGE... 2013-07-15 14:08:58,716 DEBUG ec2:188 Gathering instance private IP, attempt 0 2013-07-15 14:08:58,719 DEBUG sge:166 Created SGE install template as file '/opt/sge/galaxyEC2.conf' 2013-07-15 14:08:58,719 DEBUG sge:168 Setting up SGE. 2013-07-15 14:08:58,719 DEBUG misc:750 Replacing string ' libc_version=`echo $libc_string | tr ' ,' '\n' | grep "2\." | cut -f 2 -d "."`' with ' libc_version=`echo $libc_string | tr ' ,' '\n' | grep "2\." | cut -f 2 -d "." 
| sort -u`' in file /opt/sge/util/arch 2013-07-15 14:08:58,721 DEBUG misc:750 Replacing string ' 2.[46].*)' with ' [23].[24567890].*)' in file /opt/sge/util/arch 2013-07-15 14:08:58,721 DEBUG misc:750 Replacing string ' 2.6.*)' with ' [23].[24567890].*)' in file /opt/sge/util/arch 2013-07-15 14:08:58,744 DEBUG misc:724 Modified /opt/sge/util/arch 2013-07-15 14:08:58,763 DEBUG misc:724 Successfully chmod /opt/sge/util/arch 2013-07-15 14:08:58,783 DEBUG misc:724 'sed -i.bak '/^127.0.1./s/^/# (Commented by CloudMan) /' /etc/hosts' command OK 2013-07-15 14:09:11,740 DEBUG misc:724 Successfully set up SGE 2013-07-15 14:09:11,740 DEBUG sge:171 Successfully setup SGE; configuring SGE 2013-07-15 14:09:11,740 DEBUG sge:172 Adding parallel environments 2013-07-15 14:09:11,769 DEBUG misc:724 'cd /opt/sge; ./bin/lx24-amd64/qconf -Ap /tmp/SMP_PE' command OK 2013-07-15 14:09:11,797 DEBUG misc:724 'cd /opt/sge; ./bin/lx24-amd64/qconf -Ap /tmp/MPI_PE' command OK 2013-07-15 14:09:11,797 DEBUG sge:180 Creating queue 'all.q' 2013-07-15 14:09:11,798 DEBUG sge:201 Created SGE all.q template as file '/opt/sge/all.q.conf' 2013-07-15 14:09:11,827 DEBUG misc:724 Successfully modified all.q 2013-07-15 14:09:11,828 DEBUG sge:206 Configuring users' SGE profiles 2013-07-15 14:09:11,828 DEBUG master:2288 Monitor adding service 'FS-transient_nfs' 2013-07-15 14:09:11,829 DEBUG filesystem:106 Trying to add file system service FS-transient_nfs 2013-07-15 14:09:11,829 DEBUG transient_storage:55 Adding transient file system at /mnt/transient_nfs 2013-07-15 14:09:11,881 DEBUG filesystem:403 Added '/mnt/transient_nfs *(rw,sync,no_root_squash,no_subtree_check)' line to NFS file /etc/exports 2013-07-15 14:09:11,882 DEBUG filesystem:468 FS-transient_nfs in 'Starting' state for 0 seconds 2013-07-15 14:09:13,065 DEBUG misc:724 As part of transient_nfs filesystem update, successfully restarted NFS server 2013-07-15 14:09:13,066 DEBUG filesystem:468 FS-transient_nfs in 'Starting' state for 1 seconds 
2013-07-15 14:09:13,067 DEBUG filesystem:135 Done adding devices to FS-transient_nfs (devices: [], [], [Transient storage @ /mnt/transient_nfs], -) 2013-07-15 14:09:13,067 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:09:13,067 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:09:13,067 DEBUG master:2288 Monitor adding service 'HTCondor' 2013-07-15 14:09:13,067 INFO __init__:284 HTCondor service prerequisites OK; starting the service 2013-07-15 14:09:13,067 DEBUG htcondor:38 Starting HTCondor service 2013-07-15 14:09:13,068 DEBUG htcondor:67 HTCondor params: {'flock_host': ''} 2013-07-15 14:09:14,208 DEBUG misc:724 '/etc/init.d/condor restart' command OK 2013-07-15 14:09:14,209 DEBUG master:2288 Monitor adding service 'Hadoop' 2013-07-15 14:09:14,209 DEBUG __init__:290 Hadoop service prerequisites are not yet satisfied, waiting for: []. Setting Hadoop service state to 'Unstarted' 2013-07-15 14:09:14,209 DEBUG master:2199 Storing cluster configuration to cluster's bucket 2013-07-15 14:09:14,798 DEBUG misc:212 Checking if bucket 'cm-0479bd75a331acc874033e98b2e1e03e' exists... it does not. 2013-07-15 14:09:15,090 DEBUG misc:224 Created bucket 'cm-0479bd75a331acc874033e98b2e1e03e'. 
2013-07-15 14:09:15,237 DEBUG misc:595 Saved file 'persistent_data.yaml' of size 537B to bucket 'cm-0479bd75a331acc874033e98b2e1e03e' 2013-07-15 14:09:15,237 DEBUG master:2244 Saving current instance boot script (/tmp/cm/cm_boot.py) to cluster bucket 'cm-0479bd75a331acc874033e98b2e1e03e' as 'cm_boot.py' 2013-07-15 14:09:15,424 DEBUG misc:595 Saved file 'cm_boot.py' of size 19845B to bucket 'cm-0479bd75a331acc874033e98b2e1e03e' 2013-07-15 14:09:15,424 DEBUG master:2254 Saving CloudMan source (/mnt/cm/cm.tar.gz) to cluster bucket 'cm-0479bd75a331acc874033e98b2e1e03e' as 'cm.tar.gz' 2013-07-15 14:09:15,680 DEBUG misc:595 Saved file 'cm.tar.gz' of size 793622B to bucket 'cm-0479bd75a331acc874033e98b2e1e03e' 2013-07-15 14:09:15,680 DEBUG misc:685 Setting metadata 'revision' for file 'cm.tar.gz' in bucket 'cm-0479bd75a331acc874033e98b2e1e03e' 2013-07-15 14:09:15,921 DEBUG master:2280 Saving '/mnt/cm/MSGGREG.clusterName' file to cluster bucket 'cm-0479bd75a331acc874033e98b2e1e03e' as 'MSGGREG.clusterName' 2013-07-15 14:09:15,985 DEBUG misc:595 Saved file 'MSGGREG.clusterName' of size 0B to bucket 'cm-0479bd75a331acc874033e98b2e1e03e' 2013-07-15 14:09:16,006 DEBUG master:1755 Checking for new version of CloudMan 2013-07-15 14:09:16,006 DEBUG misc:667 Getting metadata 'revision' for file 'cm.tar.gz' from bucket 'cm-0479bd75a331acc874033e98b2e1e03e' 2013-07-15 14:09:16,066 DEBUG misc:667 Getting metadata 'revision' for file 'cm.tar.gz' from bucket 'cloudman' 2013-07-15 14:09:16,184 DEBUG master:1762 Revision number for user's CloudMan: '732'; revision number for default CloudMan: '732' 2013-07-15 14:09:16,184 DEBUG ec2:61 Gathering instance type, attempt 0 2013-07-15 14:09:16,457 DEBUG ec2:258 Gathering instance public hostname, attempt 0 2013-07-15 14:09:21,091 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.76 lx24-amd64 '] 2013-07-15 14:09:21,096 DEBUG filesystem:468 FS-transient_nfs in 'Starting' state for 9 seconds 2013-07-15 14:09:21,096 DEBUG 
master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..Starting; PSS..Unstarted; HTCondor..OK; Hadoop..Unstarted; 2013-07-15 14:09:21,097 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:09:21,097 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:09:21,097 DEBUG master:2288 Monitor adding service 'Hadoop' 2013-07-15 14:09:21,097 INFO __init__:284 Hadoop service prerequisites OK; starting the service 2013-07-15 14:09:21,097 DEBUG hadoop:38 Configuring Hadoop 2013-07-15 14:09:21,098 DEBUG master:2199 Storing cluster configuration to cluster's bucket 2013-07-15 14:09:21,101 DEBUG hadoop:81 Unpacking Hadoop 2013-07-15 14:09:21,101 DEBUG hadoop:84 Hadoop path is
/opt/hadoop/hadoop\.((([|0-9])*\.)*[0-9]*__([0-9]*\.)*[0-9]+){0,1}\.{0,1}tar\.gz 2013-07-15 14:09:21,101 DEBUG hadoop:87 Hadoop SGE integration path is /opt/hadoop/sge_integration\.(([0-9]*\.)*[0-9]+){0,1}\.{0,1}tar\.gz 2013-07-15 14:09:21,106 DEBUG hadoop:180 Extracted Hadoop version: 1.0.4 2013-07-15 14:09:21,106 DEBUG hadoop:181 Extracted Hadoop build version: 1.0 2013-07-15 14:09:21,174 DEBUG hadoop:180 Extracted Hadoop version: 1.0.4 2013-07-15 14:09:21,175 DEBUG hadoop:181 Extracted Hadoop build version: 1.0 2013-07-15 14:09:21,821 DEBUG misc:595 Saved file 'persistent_data.yaml' of size 537B to bucket 'cm-0479bd75a331acc874033e98b2e1e03e' 2013-07-15 14:09:21,822 DEBUG master:2244 Saving current instance boot script (/tmp/cm/cm_boot.py) to cluster bucket 'cm-0479bd75a331acc874033e98b2e1e03e' as 'cm_boot.py' 2013-07-15 14:09:21,902 DEBUG misc:595 Saved file 'cm_boot.py' of size 19845B to bucket 'cm-0479bd75a331acc874033e98b2e1e03e' 2013-07-15 14:09:21,902 DEBUG master:2254 Saving CloudMan source (/mnt/cm/cm.tar.gz) to cluster bucket 'cm-0479bd75a331acc874033e98b2e1e03e' as 'cm.tar.gz' 2013-07-15 14:09:22,158 DEBUG misc:595 Saved file 'cm.tar.gz' of size 793622B to bucket 'cm-0479bd75a331acc874033e98b2e1e03e' 2013-07-15 14:09:22,158 DEBUG misc:685 Setting metadata 'revision' for file 'cm.tar.gz' in bucket 'cm-0479bd75a331acc874033e98b2e1e03e' 2013-07-15 14:09:22,621 DEBUG master:2280 Saving '/mnt/cm/MSGGREG.clusterName' file to cluster bucket 'cm-0479bd75a331acc874033e98b2e1e03e' as 'MSGGREG.clusterName' 2013-07-15 14:09:22,683 DEBUG misc:595 Saved file 'MSGGREG.clusterName' of size 0B to bucket 'cm-0479bd75a331acc874033e98b2e1e03e' 2013-07-15 14:09:26,686 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:09:26,686 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:09:30,689 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:09:30,690 DEBUG pss:95 Not adding PSS svc; it 
completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:09:35,352 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.76 lx24-amd64 '] 2013-07-15 14:09:35,356 DEBUG filesystem:468 FS-transient_nfs in 'Starting' state for 23 seconds 2013-07-15 14:09:35,357 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..Starting; PSS..Unstarted; HTCondor..OK; Hadoop..Starting; 2013-07-15 14:09:35,357 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:09:35,358 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:09:38,802 DEBUG hadoop:141 Hadoop extracted to /opt/hadoop 2013-07-15 14:09:38,821 DEBUG hadoop:146 Hadoop SGE integration extracted to /opt/hadoop 2013-07-15 14:09:38,848 DEBUG misc:724 'chown -R -c ubuntu /opt/hadoop/sge_integration.1.0.tar.gz' command OK 2013-07-15 14:09:38,872 DEBUG misc:724 'chown -R -c ubuntu /opt/hadoop/hadoop.1.0.4__1.0.tar.gz' command OK 2013-07-15 14:09:38,873 DEBUG hadoop:190 Setting up Hadoop environment 2013-07-15 14:09:38,874 DEBUG hadoop:195 Hadoop id_rsa set from::/opt/hadoop/id_rsa 2013-07-15 14:09:38,899 DEBUG misc:724 'chown -c ubuntu /home/ubuntu/.ssh/id_rsa' command OK 2013-07-15 14:09:38,899 DEBUG hadoop:199 Hadoop authFile saved to /home/ubuntu/.ssh/id_rsa 2013-07-15 14:09:38,923 DEBUG misc:724 'chown -c ubuntu /home/ubuntu/.ssh/authorized_keys' command OK 2013-07-15 14:09:38,924 INFO hadoop:51 Done adding Hadoop service; service running. 
2013-07-15 14:09:39,363 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:09:39,363 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:09:43,365 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:09:43,365 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:09:47,907 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.76 lx24-amd64 '] 2013-07-15 14:09:47,934 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:09:47,934 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:09:47,935 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:09:51,936 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:09:51,937 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:09:55,938 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:09:55,938 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:10:00,482 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.87 lx24-amd64 '] 2013-07-15 14:10:00,508 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:10:00,509 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:00,509 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:10:04,511 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:04,512 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:10:08,513 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:08,514 DEBUG pss:95 Not adding PSS svc; it 
completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:10:16,307 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.87 lx24-amd64 '] 2013-07-15 14:10:16,336 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:10:16,336 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:16,337 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:10:20,339 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:20,339 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:10:24,890 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.87 lx24-amd64 '] 2013-07-15 14:10:24,921 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:10:24,922 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:24,922 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:10:28,924 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:28,924 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:10:32,926 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:32,927 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:10:37,472 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.88 lx24-amd64 '] 2013-07-15 14:10:37,500 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:10:37,501 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:37,501 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 
14:10:41,503 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:41,504 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:10:45,505 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:45,505 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:10:50,046 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.88 lx24-amd64 '] 2013-07-15 14:10:50,075 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:10:50,076 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:50,076 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:10:54,078 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:54,079 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:10:58,080 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:10:58,081 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:11:02,633 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.88 lx24-amd64 '] 2013-07-15 14:11:02,662 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:11:02,662 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:02,662 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:11:06,664 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:06,665 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:11:10,667 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:10,668 DEBUG pss:95 Not adding PSS svc; it completed 
(False) or the cluster was not yet initialized (None) 2013-07-15 14:11:15,212 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.77 lx24-amd64 '] 2013-07-15 14:11:15,240 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:11:15,241 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:15,241 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:11:19,243 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:19,243 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:11:23,245 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:23,246 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:11:27,785 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.77 lx24-amd64 '] 2013-07-15 14:11:27,814 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:11:27,814 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:27,815 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:11:31,816 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:31,817 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:11:35,818 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:35,819 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:11:40,362 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.77 lx24-amd64 '] 2013-07-15 14:11:40,390 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:11:40,391 
DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:40,391 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:11:44,393 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:44,394 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:11:48,395 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:48,396 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:11:50,209 DEBUG master:1252 Initializing a shared cluster from 'cm-808d863548acae7c2328c39a90f52e29/shared/2012-09-17--19-47' 2013-07-15 14:11:50,392 DEBUG misc:574 Retrieved file 'shared/2012-09-17--19-47/shared_instance_file_list.txt' from bucket 'cm-808d863548acae7c2328c39a90f52e29' on host 's3.amazonaws.com' to 'shared_instance_file_list.txt'. 2013-07-15 14:11:50,394 DEBUG misc:615 Establishing handle with key object 'shared/2012-09-17--19-47/persistent_data.yaml' 2013-07-15 14:11:50,394 DEBUG misc:619 Copying file
'cm-808d863548acae7c2328c39a90f52e29/shared/2012-09-17--19-47/persistent_data.yaml' to file 'cm-0479bd75a331acc874033e98b2e1e03e/persistent_data.yaml' 2013-07-15 14:11:50,550 DEBUG misc:615 Establishing handle with key object 'shared/2012-09-17--19-47/cm.tar.gz' 2013-07-15 14:11:50,550 DEBUG misc:619 Copying file 'cm-808d863548acae7c2328c39a90f52e29/shared/2012-09-17--19-47/cm.tar.gz' to file 'cm-0479bd75a331acc874033e98b2e1e03e/cm.tar.gz' 2013-07-15 14:11:50,771 DEBUG misc:615 Establishing handle with key object 'shared/2012-09-17--19-47/cm.tar.gz_2012-09-13' 2013-07-15 14:11:50,771 DEBUG misc:619 Copying file
'cm-808d863548acae7c2328c39a90f52e29/shared/2012-09-17--19-47/cm.tar.gz_2012-09-13' to file 'cm-0479bd75a331acc874033e98b2e1e03e/cm.tar.gz_2012-09-13' 2013-07-15 14:11:50,930 DEBUG misc:615 Establishing handle with key object 'shared/2012-09-17--19-47/cm_boot.py' 2013-07-15 14:11:50,930 DEBUG misc:619 Copying file 'cm-808d863548acae7c2328c39a90f52e29/shared/2012-09-17--19-47/cm_boot.py' to file 'cm-0479bd75a331acc874033e98b2e1e03e/cm_boot.py' 2013-07-15 14:11:51,085 DEBUG misc:615 Establishing handle with key object 'shared/2012-09-17--19-47/post_start_script' 2013-07-15 14:11:51,085 DEBUG misc:619 Copying file
'cm-808d863548acae7c2328c39a90f52e29/shared/2012-09-17--19-47/post_start_script' to file 'cm-0479bd75a331acc874033e98b2e1e03e/post_start_script' 2013-07-15 14:11:51,326 DEBUG misc:574 Retrieved file 'persistent_data.yaml' from bucket 'cm-0479bd75a331acc874033e98b2e1e03e' on host 's3.amazonaws.com' to 'shared_p_d.yaml'. 2013-07-15 14:11:52,944 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.77 lx24-amd64 '] 2013-07-15 14:11:52,973 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:11:52,973 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:52,973 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:11:53,730 ERROR master:1337 Error creating volume from shared cluster's snapshot '['snap-cfa775ba']': 'filesystems' 2013-07-15 14:11:56,975 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:11:56,976 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:00,978 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:00,978 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:05,521 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.72 lx24-amd64 '] 2013-07-15 14:12:05,550 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:12:05,551 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:05,551 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:09,553 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:09,553 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:13,555 DEBUG master:2288 Monitor adding service 'PSS' 
2013-07-15 14:12:13,556 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:18,102 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.72 lx24-amd64 '] 2013-07-15 14:12:18,131 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:12:18,131 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:18,131 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:22,133 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:22,134 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:25,087 DEBUG ec2:166 Gathering instance public keys (i.e., key pairs), attempt 0 2013-07-15 14:12:25,091 DEBUG ec2:173 Got key pair: 'cloudman_key_pair' 2013-07-15 14:12:26,136 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:26,137 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:30,689 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.72 lx24-amd64 '] 2013-07-15 14:12:30,717 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:12:30,717 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:30,718 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:34,719 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:34,720 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:38,721 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:38,721 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:43,263 DEBUG sge:538 
qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.65 lx24-amd64 '] 2013-07-15 14:12:43,289 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:12:43,290 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:43,290 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:47,292 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:47,292 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:51,293 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:51,293 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:55,836 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.65 lx24-amd64 '] 2013-07-15 14:12:55,865 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:12:55,865 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:55,866 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:12:59,867 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:12:59,868 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:13:03,868 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:13:03,869 DEBUG pss:95 Not adding PSS svc; it completed (False) or the cluster was not yet initialized (None) 2013-07-15 14:13:08,419 DEBUG sge:538 qstat: ['all.q@ip-10-235-1-48.ec2.inter BIP 0/0/2 0.65 lx24-amd64 '] 2013-07-15 14:13:08,448 DEBUG master:2377 S&S: Migration..Completed; SGE..OK; FS-transient_nfs..OK; PSS..Unstarted; HTCondor..OK; Hadoop..OK; 2013-07-15 14:13:08,449 DEBUG master:2288 Monitor adding service 'PSS' 2013-07-15 14:13:08,449 DEBUG pss:95 Not adding 
PSS svc; it completed (False) or the cluster was not yet initialized (None)
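Because the log above is pasted as one continuous run of text, the single ERROR entry is easy to miss among the repeated PSS and qstat lines. The following is a minimal Python sketch, assuming only the `date time,millis LEVEL module:line message` layout visible in the log above; it splits such a run back into entries and keeps those at a chosen level. The function names are illustrative and are not part of CloudMan.

```python
import re

# Every entry begins with a timestamp such as "2013-07-15 14:11:53,730".
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}")
ENTRY = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+(?P<source>\S+:\d+)\s+(?P<msg>.*)",
    re.DOTALL,
)


def split_entries(text):
    """Split a flattened log into dicts with ts, level, source and msg."""
    # Collapse all whitespace so entries wrapped across lines are rejoined.
    flat = " ".join(text.split())
    starts = [m.start() for m in TIMESTAMP.finditer(flat)]
    entries = []
    for begin, end in zip(starts, starts[1:] + [len(flat)]):
        match = ENTRY.match(flat[begin:end].strip())
        if match:
            entries.append(match.groupdict())
    return entries


def only_level(entries, level="ERROR"):
    """Keep only the entries logged at the given level."""
    return [e for e in entries if e["level"] == level]


if __name__ == "__main__":
    import sys

    # Usage: python filter_log.py < pasted_log.txt
    for entry in only_level(split_entries(sys.stdin.read())):
        print(entry["ts"], entry["source"], entry["msg"])
```

Printing the entries just before and after each match (by index into the list returned from `split_entries`) gives the surrounding context, which here is the copy of the shared `persistent_data.yaml` immediately preceding the failure.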
On Fri, Jul 12, 2013 at 12:08 PM, Enis Afgan <eafgan@emory.edu> wrote:
Hi Greg, Sorry for replying really late.
So, I'm guessing this was an old cluster that was shared and is now being derived on a new cluster? We explored a large number of paths while getting ready for the upgrade, and I thought we had covered this one, but it seems things are not working as expected. Can you look at the more detailed log on the Admin page (under CloudMan log) and see if there are more details about what's going on and why it's failing?
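One possible reading of the error text, not confirmed anywhere in this thread: the trailing `'filesystems'` looks like what Python prints for a `KeyError`, which would fit a share created before the upgrade whose saved configuration has no `filesystems` entry. The sketch below only demonstrates that formatting. The dictionary contents and the function name are hypothetical, and the real layout of `persistent_data.yaml` is not shown here.

```python
def describe_failure(shared_config, snapshot_ids):
    """Mimic the shape of the logged message when a key lookup fails."""
    try:
        # Hypothetical lookup; raises KeyError if the key is absent.
        return shared_config["filesystems"]
    except KeyError as exc:
        # str(KeyError('filesystems')) is "'filesystems'", quotes included.
        return ("Error creating volume from shared cluster's snapshot "
                "'%s': %s" % (snapshot_ids, exc))


# Hypothetical older-style configuration that lacks a 'filesystems' key.
old_style = {"some_other_key": []}
print(describe_failure(old_style, ["snap-cfa775ba"]))
```

If the detailed log shows a traceback ending in a `KeyError`, comparing the keys in the shared `persistent_data.yaml` with those in a freshly created cluster's file would be a reasonable next step.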
On Thu, Jul 11, 2013 at 3:14 PM, greg <margeemail@gmail.com> wrote:
Hi guys,
I just thought I'd check in again. None of the researchers who want to run our genotyping program can do so until I figure this out. Any help or advice at all would be greatly appreciated.
Thanks,
Greg
On Mon, Jul 8, 2013 at 8:43 AM, greg <margeemail@gmail.com> wrote:
Hi guys,
Any thoughts on this? I'm kind of stuck.
(Even some pointers on where to look for more clues would be extremely helpful.)
Thanks,
Greg
On Fri, Jul 5, 2013 at 11:10 AM, greg <margeemail@gmail.com> wrote:
Hi guys,
I'm hitting an error in CloudMan when using the Share-an-Instance option. It says:
Error creating volume from shared cluster's snapshot '['snap-cfa775ba']': 'filesystems'.
Also, disk stats says 0/0, and the Applications light is yellow while the Data light is green.
I'm using the share string cm-808d863548acae7c2328c39a90f52e29/shared/2012-09-17--19-47
It's always worked in the past.
Thanks,
Greg
Here's the full log:
14:58:18 - Master starting
14:58:20 - Completed the initial cluster startup process. This is a new cluster; waiting to configure the type.
14:58:24 - Migration service prerequisites OK; starting the service
14:58:24 - SGE service prerequisites OK; starting the service
14:58:31 - Setting up SGE...
14:58:51 - HTCondor service prerequisites OK; starting the service
14:58:51 - HTCondor config file /etc/condor/condor_config not found!
14:58:59 - Hadoop service prerequisites OK; starting the service
14:59:48 - Done adding Hadoop service; service running.
15:01:45 - Error creating volume from shared cluster's snapshot '['snap-cfa775ba']': 'filesystems'
Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
participants (3): Dannon Baker, Enis Afgan, greg