Hi Alistair,

One major goal of the Galaxy cloud offering is that it works out of the box with a very large selection of tools and indices installed and ready to go, with no configuration necessary.  My workshop heuristic is two students per node using regular m1.xlarge nodes (though input from others on this is absolutely welcome!).  For costs, see http://aws.amazon.com/ec2/#pricing for detailed pricing information, but for the most part what you need to know is:

You'll need a head node up for the entire lifespan of your cluster.  You can terminate it and start it back up later (and avoid paying for the node in the interim), but any time you want to interact with the cluster it will need to be running.  We generally use a high-memory quadruple extra large instance at ~$1.65/hour.  I'd start with this, and start with *more* space on the Galaxy volume than you think you'll need -- space is not expensive, especially if you're not keeping it for more than the few days it takes to cover a workshop.

Worker nodes will be the bulk of your costs, and fortunately these are *trivial* to add and remove, so they can be added just before the workshop and removed right after it.  If you use standard m1.xlarge instances, they're ~$0.48/hour.  So, for 10 of them to cover your 20 students, you're looking at about $5/hour for the workshop -- not too bad.
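
If it helps to play with the numbers, here is a rough Python sketch of that sizing arithmetic.  It only encodes the figures from this email (two students per worker, ~$1.65/hour for the head node, ~$0.48/hour per m1.xlarge worker); the function and constant names are made up for illustration, and the rates should be checked against the AWS pricing page before you rely on them.

```python
import math

# Figures quoted in this email (USD per hour). These are assumptions that
# should be re-checked against the AWS pricing page.
HEAD_NODE_RATE = 1.65      # high-memory quadruple extra large head node
WORKER_RATE = 0.48         # standard m1.xlarge worker
STUDENTS_PER_WORKER = 2    # workshop heuristic, not a hard rule


def workers_needed(students, per_worker=STUDENTS_PER_WORKER):
    """Number of worker nodes for a class, rounding up for odd class sizes."""
    return math.ceil(students / per_worker)


def hourly_cost(students):
    """Return (workers, worker cost per hour, total per hour incl. head node)."""
    workers = workers_needed(students)
    worker_cost = workers * WORKER_RATE
    total = worker_cost + HEAD_NODE_RATE
    return workers, round(worker_cost, 2), round(total, 2)


if __name__ == "__main__":
    workers, worker_cost, total = hourly_cost(20)
    print(f"{workers} workers: ${worker_cost:.2f}/hour for workers, "
          f"${total:.2f}/hour including the head node")
```

For 20 students this gives 10 workers at $4.80/hour, which is where the "about $5/hour" above comes from; the head node adds its own ~$1.65/hour on top of that.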

Amazon will bill you for every hour you have an instance running, regardless of whether you're using it or not, so do make sure to terminate instances when you're done with them.  You may want to look into CloudMan's auto-scaling to handle this for you -- it allows you to say something like "Keep 2 worker instances up all the time, but under heavy use scale up to at most 10".  I wouldn't recommend auto-scaling for a workshop, however; I'd have the instances ready ahead of time, since you're fairly certain you'll need them.  As for setting up ahead of time, I'd recommend setting up just the master instance (which you will pay for) in advance, testing it out, and only adding the workers when it's time to kick off the workshop.
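
To put rough numbers on that staging advice, here is a small Python sketch comparing "master up a day early, workers only during the workshop" with "everything up the whole time".  The rates are the ones quoted in this email; the durations (24 hours for the master, 6.5 hours for the workshop) are made-up example values, and the sketch assumes a partial hour is billed as a full hour, which you should confirm against the current EC2 billing rules.

```python
import math


def billed_hours(hours_up):
    # Assumption: a partial instance-hour is billed as a full hour.
    return math.ceil(hours_up)


def cluster_cost(master_hours, worker_hours, n_workers,
                 master_rate=1.65, worker_rate=0.48):
    """Estimated cost in USD for one master plus n_workers identical workers."""
    master = billed_hours(master_hours) * master_rate
    workers = n_workers * billed_hours(worker_hours) * worker_rate
    return round(master + workers, 2)


if __name__ == "__main__":
    # Hypothetical timeline: master tested a day ahead, 6.5-hour workshop.
    staged = cluster_cost(master_hours=24, worker_hours=6.5, n_workers=10)
    always_on = cluster_cost(master_hours=24, worker_hours=24, n_workers=10)
    print(f"workers added just for the workshop: ${staged:.2f}")
    print(f"workers up for the full 24 hours:    ${always_on:.2f}")
```

Under those assumptions the staged setup costs less than half as much as leaving the workers up alongside the master, which is why I'd only add them at kick-off.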

One more performance tip for workshops: once you've added the extra worker nodes, go into the CloudMan admin panel and disable job running on the master node (just a single button click).  This keeps the master free to serve the Galaxy application itself.

We're in the middle of an update that will be released early next week (already accessible using the cloudman-dev bucket), which fixes several known issues with the current tool offering.  If you're able to, it's probably best to wait for that release before starting a new workshop cluster.  Additionally, we have a supported 'workshop-ready' offering that has preloaded data for our Galaxy101 and RNA-seq exercises, among other things.  This will be updated with our forthcoming release, but see http://wiki.galaxyproject.org/Teach/WorkshopAMI for more details.

Good luck, and let me know if there's anything else I can do to help out!

Dannon


On Thu, Aug 22, 2013 at 9:45 PM, Alistair Chilcott <Alistair.Chilcott@utas.edu.au> wrote:

Hi,

This is my first post to the list, so be gentle :D

We have been looking at the Galaxy in the Cloud option for our researchers.

We have been doing a fair bit of reading of the various sources but haven’t found any solid answers to the following questions:

- Once it is started (AWS account set up and launched from the “new cloud cluster” link), how much configuration is required to get it running all the tools, such as Megablast, Bowtie, Tophat, etc.?

(We have also been setting up a local install of Galaxy but are struggling with the setup of these tools, which don’t come bundled with the base Galaxy install.)

- What size AWS cluster would be required to support a class of 20 or so students running a range of relatively short tasks with Megablast, Bowtie, Tophat, Fastq Groomer, and SAM tools?

- How is the AWS charge calculated? Does it run while the cluster is available, or just for the actual compute time used? I.e., could we set it up and have it ready, with it not costing anything while it isn’t processing data?

Regards,

Alistair

Alistair Chilcott
Systems Administrator, (Domain)
Information Technology Services
Email: alistair.chilcott@utas.edu.au | P: +61 3 6226 7743
University of Tasmania, Locked Bag 23, Hobart  Tas.  7000

___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/