Clare,
I wonder if you would be open to sharing the details of your setup and what you had to do for this once you are done. I think it would be incredibly useful to the community (at least it would be for me ;-))

Thanks!
On Nov 17, 2011, at 5:14 PM, Clare Sloggett wrote:

Hi Enis, Jeremy,

Thanks, that's really helpful! Jeremy, do you remember what kind of
instance you used per node, e.g. Amazon's 'large' (2 core / 4 ECU /
7.5 GB) or 'xlarge' (4 core / 8 ECU / 15 GB)? This would be a good
sanity check for me.

Enis, when you say a lot of memory, would you consider the 15 GB nodes
'a lot'? I would generally consider that a lot in the context of a web
server but not so much in NGS, so it's all relative! Amazon does have
double-memory instances available.

Also, when you say a lot of memory especially for the master node -
does this imply that I can choose a different specification for the
master node than for the compute nodes? I thought they all had to be
identical, but just checking.

Thanks,
Clare

On Fri, Nov 18, 2011 at 12:56 AM, Jeremy Goecks <jeremy.goecks@emory.edu> wrote:

Jeremy (from the team) ran a similar workshop several months ago and used some resource-intensive tools (e.g., Tophat). We were concerned about the same scalability issues, so we just started 4 separate clusters and divided the users across those. The approach worked well and it turned out we did not see any scalability issues. I think we went way overboard with 4 clusters, but the approach did demonstrate an additional 'coolness' of the project, allowing one to spin up 4 complete, identical clusters in a matter of minutes...
So, I feel you could replicate a similar approach but could probably go with only 2 clusters? Jeremy can hopefully provide some first-hand comments as well.

Yes, this approach worked well when I did it. I created all the data and workflows needed for the workshop on one Galaxy instance, shared/cloned that instance to set up 3 additional instances, and divided users evenly amongst the instances. 2-3 clusters will probably meet your needs for 50 participants.
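
As a rough illustration of the "spin up several identical clusters" idea (only a sketch using boto, not the exact procedure; the AMI ID, key pair, security group and user-data field names below are placeholders that depend on the CloudMan image you actually launch):

    import boto.ec2

    # Placeholder region; AWS credentials come from the boto config/environment.
    conn = boto.ec2.connect_to_region("us-east-1")

    for i in range(1, 3):  # e.g. two clusters for ~50 participants
        # CloudMan-style user data; the field names here are assumptions,
        # so check the documentation for the image you use.
        user_data = ("cluster_name: workshop-%d\n"
                     "password: CHANGE_ME\n"
                     "access_key: YOUR_AWS_ACCESS_KEY\n"
                     "secret_key: YOUR_AWS_SECRET_KEY\n") % i
        conn.run_instances("ami-00000000",                # placeholder Galaxy/CloudMan AMI
                           instance_type="m1.xlarge",     # 4 core / 15 GB, as discussed above
                           key_name="workshop-keypair",   # placeholder key pair
                           security_groups=["CloudMan"],  # placeholder security group
                           user_data=user_data)

After that it is just a matter of pointing each group of participants at the address of "their" cluster.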

Scalability issues are more likely to arise on the back end than on the front end, so you'll want to ensure that you have enough compute nodes. BWA uses four nodes by default (Enis, does the cloud config change this parameter?), so you'll want 4 x 50 = 200 total nodes if you want everyone to be able to run a BWA job simultaneously.
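
Spelling that arithmetic out (a quick back-of-the-envelope sketch; the figure of four per BWA job is the default mentioned above, and the instance sizes are the 'large'/'xlarge' ones from earlier in the thread):

    # Rough capacity check: 4 per BWA job x 50 participants.
    per_bwa_job = 4
    participants = 50
    total_needed = per_bwa_job * participants
    print("Total needed: %d" % total_needed)   # 200, as above

    # If that four means cores/threads per job (an assumption worth
    # confirming with Enis), here is how it maps onto the instance
    # sizes mentioned earlier in the thread:
    for name, cores in [("large (2 core)", 2), ("xlarge (4 core)", 4)]:
        instances = -(-total_needed // cores)  # ceiling division
        print("%s: %d instances so everyone can run BWA at once"
              % (name, instances))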

Good luck,
J.

--
E: sloc@unimelb.edu.au
P: 03 903 53357
M: 0414 854 759

___________________________________________________________
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org.  Please keep all replies on the list by
using "reply all" in your mail client.  For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:

 http://lists.bx.psu.edu/listinfo/galaxy-dev

To manage your subscriptions to this and other Galaxy lists,
please use the interface at:

 http://lists.bx.psu.edu/

Ravi Madduri
madduri@mcs.anl.gov