Hi Clare,

Jeremy (from the team) ran a similar workshop several months ago and used some resource-intensive tools (e.g., TopHat). We were concerned about the same scalability issues, so we simply started 4 separate clusters and divided the users across them. The approach worked well, and it turned out we did not see any scalability issues. I think we went way overboard with 4 clusters, but the approach did demonstrate an additional 'coolness' of the project: it lets one spin up 4 complete, identical clusters in a matter of minutes...

So, I feel you could replicate a similar approach but could probably go with only 2 clusters? Jeremy can hopefully provide some first-hand comments as well.

When it comes to instance types, especially for the master node, I would strongly suggest an instance with a lot of memory. This is one thing I've noticed that greatly aids cluster responsiveness; plus, BWA can be a bit memory hungry.

Please let us know how the workshop goes (because, like you said, it's hard to test such environments),

Enis

On Thu, Nov 17, 2011 at 5:27 AM, Clare Sloggett <sloc@unimelb.edu.au> wrote:
Hi all (especially Enis :) ),
We are planning to use Amazon (Galaxy CloudMan) to run a workshop for about 50 people. We won't need to transfer any data during the workshop, but need the virtual cluster to be reasonably responsive and cope with:
a) the load on the front end
b) the workshop participants each trying to run a bwa alignment - at the moment each alignment would be of about 2.8M reads, but we could cut it down
c) any other scalability issues I may not have thought of?
I wanted to ask if anyone has used CloudMan for a similar purpose, or has an understanding, based on running a Galaxy cluster, of any problems we might encounter? I can add enough nodes to the cluster on the day to cope with the computational load (I assume) but I'm not sure if I should be expecting any other problems.
Is the size of the node (e.g. Amazon's 4-core vs 8-core nodes) very important? I can scale out by adding more nodes, but should I be concerned about the capacity of the master node which handles the traffic?
Also, is there any sensible way for me to test it in advance (in terms of the user load)?
Many thanks for any advice!
Clare
-- E: sloc@unimelb.edu.au P: 03 903 53357 M: 0414 854 759