Hi Enis, Jeremy,

Thanks, that's really helpful! Jeremy, do you remember what kind of instance you used per node, e.g. Amazon's 'large' (2 cores / 4 ECU / 7.5 GB) or 'xlarge' (4 cores / 8 ECU / 15 GB)? This would be a good sanity check for me.

Enis, when you say a lot of memory, would you consider the 15 GB nodes 'a lot'? I would generally consider that a lot in the context of a web server, but not so much in NGS, so it's all relative! Amazon does have double-memory instances available. Also, when you say a lot of memory especially for the master node, does this imply that I can choose a different specification for the master node than for the compute nodes? I thought they all had to be identical, but just checking.

Thanks,
Clare

On Fri, Nov 18, 2011 at 12:56 AM, Jeremy Goecks <jeremy.goecks@emory.edu> wrote:
Jeremy (from the team) ran a similar workshop several months ago and used some resource-intensive tools (e.g., TopHat). We were concerned about the same scalability issues, so we just started 4 separate clusters and divided the users across those. The approach worked well and it turned out we did not see any scalability issues. I think we went way overboard with 4 clusters, but the approach did demonstrate an additional 'coolness' of the project: being able to spin up 4 complete, identical clusters in a matter of minutes... So I feel you could take a similar approach, but could probably go with only 2 clusters? Jeremy can hopefully provide some first-hand comments as well.
Yes, this approach worked well when I did it. I created all the data and workflows needed for the workshop on one Galaxy instance, shared/cloned that instance and set up 3 additional instances, and divided users evenly amongst the instances. 2-3 clusters will probably meet your needs with 50 participants.
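If you'd rather script the launches than click through the launch form for each cluster, a rough sketch with boto might look like the one below. The AMI ID, key pair, security group name, and the contents of the user-data string are placeholders, not the real values; check the CloudMan documentation for what the Galaxy cloud image actually expects.

```python
# Sketch only: AMI ID, key pair, security group and user-data contents are
# placeholders -- consult the CloudMan docs for the actual values expected
# by the Galaxy cloud image.
import boto.ec2

N_CLUSTERS = 3  # e.g. one cluster per ~15-20 workshop participants

conn = boto.ec2.connect_to_region('us-east-1')

for i in range(N_CLUSTERS):
    # Fill in per the CloudMan docs: cluster name, password, AWS credentials.
    cloudman_user_data = "...cluster %d settings go here..." % i

    reservation = conn.run_instances(
        'ami-xxxxxxxx',              # placeholder Galaxy/CloudMan AMI
        instance_type='m1.xlarge',   # 4 cores / 15 GB, as discussed above
        key_name='my-keypair',       # placeholder key pair name
        security_groups=['CloudMan'],
        user_data=cloudman_user_data,
    )
    print('Started master for cluster %d: %s' % (i, reservation.instances[0].id))
```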
Scalability issues are more likely to arise on the back end than the front end, so you'll want to ensure that you have enough compute nodes. BWA uses four nodes by default (Enis, does the cloud config change this parameter?), so you'll want 4 x 50 = 200 total nodes if you want everyone to be able to run a BWA job simultaneously.
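To make that back-of-the-envelope sizing explicit, here is a tiny worked example; the per-job figure of 4 is taken from the note above and should be adjusted if the cloud config sets BWA's default differently:

```python
# Back-of-the-envelope capacity check for concurrent workshop BWA jobs.
# The per-job figure of 4 is the default mentioned above; adjust it if the
# cloud config changes it.
participants = 50
units_per_bwa_job = 4   # compute nodes used by one BWA run (assumed default)
clusters = 2            # how many separate Galaxy clusters you launch

total_units = participants * units_per_bwa_job      # 4 x 50 = 200
units_per_cluster = -(-total_units // clusters)     # ceiling division -> 100

print("Total compute units needed: %d" % total_units)
print("Per cluster (%d clusters): %d" % (clusters, units_per_cluster))
```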
Good luck, J.
--
E: sloc@unimelb.edu.au
P: 03 903 53357
M: 0414 854 759