Hi guys,

I'm fairly new to cloud computing and about 2 days into using CloudMan for Galaxy. I have set up an m2.4xlarge master node and flexible load for up to 4 workers of the m2.xlarge type, minimum one. I was able to upload 6 samples of paired-end RNA-seq (12 files), with gz file sizes around 5-8 GB. Grooming the files took about a day, but my previous experience was on a pretty small in-house Galaxy install, so I didn't think anything of it at the time.

I started 3 TopHat jobs and noticed the UI being a bit sluggish to respond, so I added a 4th one, hoping it might push the load over the edge onto the worker nodes. Unfortunately, 12 hours later, the TopHat jobs are still running, the master node is way over 100%, and the worker is reported idle. While the cluster log has a few errors in it, it ends by saying that the instance for the worker node is ready, despite errors adding it as an SGE execution host (return code 1).

Has anyone run into this before, or does anyone have any insight into fixing this? I'm pasting the lower portion of the status log below. Thanks very much for any help!

(Also, several other emails about cloud installs were directed to /dev. If that's not the right place for this question, I apologize and can change to the /user list.)

--
Brian Lin
contact@brian-lin.com
brian.lin@tufts.edu

13:15:38 - Instance 'i-e2bb0491' reported alive
13:15:38 - Sent master public key to worker instance 'i-e2bb0491'.
13:15:54 - Adding instance i-e2bb0491 as SGE administrative host.
13:16:06 - Adding instance 'i-e2bb0491' to SGE execution host list.
13:16:11 - Process encountered problems adding instance 'i-e2bb0491' as an SGE execution host. Process returned code 1
13:16:26 - Waiting on worker instance 'i-e2bb0491' to configure itself...
13:17:05 - Instance 'i-e2bb0491' (IP: 23.23.24.81) ready
13:42:23 - Rebooting instance i-e2bb0491 (reboot #3).
13:44:27 - Instance 'i-e2bb0491' reported alive
13:44:27 - Sent master public key to worker instance 'i-e2bb0491'.
13:44:56 - Adding instance i-e2bb0491 as SGE administrative host.
13:44:56 - Adding instance 'i-e2bb0491' to SGE execution host list.
13:44:56 - Process encountered problems adding instance 'i-e2bb0491' as an SGE execution host. Process returned code 1
13:44:56 - Waiting on worker instance 'i-e2bb0491' to configure itself...
13:46:09 - Instance 'i-e2bb0491' (IP: 23.23.24.81) ready
15:19:37 - Rebooting instance i-e2bb0491 (reboot #4).
15:21:16 - Instance 'i-e2bb0491' reported alive
15:21:16 - Sent master public key to worker instance 'i-e2bb0491'.
15:21:38 - Adding instance i-e2bb0491 as SGE administrative host.
15:21:44 - Adding instance 'i-e2bb0491' to SGE execution host list.
15:21:48 - Process encountered problems adding instance 'i-e2bb0491' as an SGE execution host. Process returned code 1
15:21:51 - Waiting on worker instance 'i-e2bb0491' to configure itself...
15:22:13 - Instance 'i-e2bb0491' (IP: 23.23.24.81) ready
15:54:32 - Instance i-e2bb0491 not responding after 4 reboots. Terminating instance.
15:54:32 - Terminating instance i-e2bb0491
15:54:35 - Instance 'i-e2bb0491' removed from the internal instance list.
15:56:10 - Adding 1 on-demand instance(s)
15:56:14 - Cannot get cloud instance object without an instance ID?
15:58:26 - Instance 'i-98932deb' reported alive
15:58:26 - Sent master public key to worker instance 'i-98932deb'.
15:59:02 - Adding instance i-98932deb as SGE administrative host.
15:59:09 - Adding instance 'i-98932deb' to SGE execution host list.
15:59:17 - Successfully added instance 'i-98932deb' to SGE
15:59:17 - Waiting on worker instance 'i-98932deb' to configure itself...
15:59:46 - Instance 'i-98932deb' (IP: 54.234.100.22) ready
16:12:47 - Rebooting instance i-98932deb (reboot #1).
16:12:47 - Rebooting instance i-98932deb (reboot #1).
16:12:47 - Rebooting instance i-98932deb (reboot #1).
16:14:06 - Instance 'i-98932deb' reported alive
16:14:06 - Sent master public key to worker instance 'i-98932deb'.
16:14:29 - Adding instance i-98932deb as SGE administrative host.
16:14:29 - Adding instance 'i-98932deb' to SGE execution host list.
16:14:29 - Process encountered problems adding instance 'i-98932deb' as an SGE execution host. Process returned code 1
16:14:29 - Waiting on worker instance 'i-98932deb' to configure itself...
16:14:39 - Instance 'i-98932deb' (IP: 54.234.100.22) ready
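
In case it helps anyone reading the log, here is a rough Python sketch that tallies, per worker instance, how often the SGE add step succeeded or failed and how many reboot lines appear. It is not part of CloudMan: the function name `summarize` and the embedded `LOG` excerpt (a handful of lines copied from the status log in this message) are purely illustrative, and the matching relies only on the message wording seen here.

```python
import re
from collections import defaultdict

# Illustrative excerpt: a few lines copied verbatim from the status log.
LOG = """\
13:16:11 - Process encountered problems adding instance 'i-e2bb0491' as an SGE execution host. Process returned code 1
13:42:23 - Rebooting instance i-e2bb0491 (reboot #3).
13:44:56 - Process encountered problems adding instance 'i-e2bb0491' as an SGE execution host. Process returned code 1
15:19:37 - Rebooting instance i-e2bb0491 (reboot #4).
15:21:48 - Process encountered problems adding instance 'i-e2bb0491' as an SGE execution host. Process returned code 1
15:59:17 - Successfully added instance 'i-98932deb' to SGE
16:12:47 - Rebooting instance i-98932deb (reboot #1).
16:14:29 - Process encountered problems adding instance 'i-98932deb' as an SGE execution host. Process returned code 1
"""

# Assumption: instance IDs look like "i-" followed by 8 hex digits, as in this log.
INSTANCE = re.compile(r"\bi-[0-9a-f]{8}\b")


def summarize(log_text):
    """Count SGE add successes, SGE add failures, and reboot lines per instance."""
    counts = defaultdict(lambda: {"sge_ok": 0, "sge_failed": 0, "reboots": 0})
    for line in log_text.splitlines():
        match = INSTANCE.search(line)
        if not match:
            continue  # line does not mention an instance ID
        entry = counts[match.group(0)]
        if "Successfully added instance" in line:
            entry["sge_ok"] += 1
        elif "Process encountered problems adding instance" in line:
            entry["sge_failed"] += 1
        elif "Rebooting instance" in line:
            entry["reboots"] += 1
    return {instance: dict(tally) for instance, tally in counts.items()}


if __name__ == "__main__":
    for instance, tally in summarize(LOG).items():
        print(instance, tally)
```

On the excerpt, the first worker never registers with SGE, and the second registers once and then fails again after its reboot, which matches what the full log shows.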