Re: [galaxy-dev] I need some advice on what type of Galaxy server to implement

29 Jul 2015

What you should do (in my opinion) is install a grid scheduler (SGE, Torque, etc.) on the big server.  If you run Galaxy on a separate server, it can be configured to submit jobs to the scheduler.  Galaxy also has the concept of Web App and Handler components.  Essentially, handlers take care of talking to the scheduler, while the Web App serves pages to users.  By default, the Web App and Handlers are combined in the same process.  You can configure Galaxy to start multiple handler processes and multiple Web App processes.  Then you can use Apache or Nginx to load-balance user requests across the various Galaxy Web Apps.
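
To make the load-balancing step concrete, here is a minimal Python sketch of round-robin distribution, which is the default strategy of an Nginx upstream group. The backend addresses and the function name are made up for illustration; in a real deployment the proxy's own configuration does this, not code like the following.

```python
from itertools import cycle

# Hypothetical addresses: two Galaxy Web App processes on the same host.
WEB_APPS = ["127.0.0.1:8080", "127.0.0.1:8081"]


def round_robin(backends):
    """Return a function that hands out the backends in turn, one per request."""
    pool = cycle(backends)
    return lambda: next(pool)


pick_backend = round_robin(WEB_APPS)

# Simulate five incoming user requests and record where each one is sent.
assignments = [pick_backend() for _ in range(5)]
for number, backend in enumerate(assignments, start=1):
    print(f"request {number} -> {backend}")
```

With two Web App processes, consecutive requests simply alternate between them, so neither process has to serve every user.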

My recommendation is to start small. The following should accommodate about 5 concurrent users without noticeable performance issues:
1 Galaxy web app
1 Galaxy handler
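
As a rough sketch of what the job configuration for such a setup might contain, the Python below builds a job_conf.xml-style document with one handler and one scheduler-backed destination. The element and attribute names reflect my understanding of Galaxy's job_conf.xml layout and the DRMAA runner; the ids (handler0, cluster_default) and the helper function are hypothetical, so check everything against the Cluster page listed under the recommended reading before using it.

```python
import xml.etree.ElementTree as ET


def build_job_conf(num_handlers=1, runner_id="drmaa", dest_id="cluster_default"):
    """Build a job_conf.xml-style tree: one runner plugin, N handlers, one destination."""
    root = ET.Element("job_conf")

    # Runner plugin that submits jobs to the scheduler through DRMAA (assumed setup).
    plugins = ET.SubElement(root, "plugins")
    ET.SubElement(plugins, "plugin", id=runner_id, type="runner",
                  load="galaxy.jobs.runners.drmaa:DRMAAJobRunner")

    # One <handler> entry per handler process; ids here are illustrative.
    handlers = ET.SubElement(root, "handlers", default="handlers")
    for i in range(num_handlers):
        ET.SubElement(handlers, "handler", id=f"handler{i}", tags="handlers")

    # Default destination: send every job to the scheduler via the runner above.
    destinations = ET.SubElement(root, "destinations", default=dest_id)
    ET.SubElement(destinations, "destination", id=dest_id, runner=runner_id)
    return root


job_conf = build_job_conf(num_handlers=1)
if hasattr(ET, "indent"):  # pretty-printing is available from Python 3.9
    ET.indent(job_conf)
print(ET.tostring(job_conf, encoding="unicode"))
```

Scaling up later would then mostly be a matter of raising num_handlers (and starting the matching processes) rather than redesigning the layout.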

Recommended reading:
https://wiki.galaxyproject.org/Admin/Config/Performance/Scaling 
https://wiki.galaxyproject.org/Admin/Config/Performance/Cluster

-----Original Message-----
From: Shane Kelly [mailto:skk@shanek54.co.uk] 
Sent: July-29-15 7:32 AM
To: Kandalaft, Iyad
Cc: galaxy-dev@lists.galaxyproject.org
Subject: Re: [galaxy-dev] I need some advice on what type of Galaxy server to implement

Hi Iyad,
	Thanks for taking the time to get back to me.
I am an IT guy, and would not know how to answer these questions except to say that the server is to be used to facilitate biological and medical research in the areas of genomics, transcriptomics, epigenomics and metagenomics. (straight from the mission statement :-) )

	I think that they would say that overall performance is less important than being able to run large datasets (0.5-3 TB).

	I like the idea of a separate box for the web server, but I am not sure how the web server would communicate with the pipeline box. Is that ability built into Galaxy, or is it a well-worn path with plenty of examples that I could plagiarize?

	Sorry to be such a newb, but I don't know much about galaxy at all. Luckily I have 2-3 months to put this in place...

Again, thank you for your time.

Regards,
Shane
...
When you say NGS, is it genome assembly?  If so, what type of genomes, 
and do you have experience with its memory and CPU requirements?  We 
have noted that servers with large amounts of memory and many cores have 
a memory bus bottleneck. The other aspect is that heavy processing on 
the server will impact the performance of Galaxy unless Galaxy is given 
higher priority.
Note that if you overcommit the server, it can destabilize and bring 
down the Galaxy web app and database.
My general approach is Galaxy web app + proxy on a separate machine 
from the handlers.  The analysis server is either running a grid or the 
handlers.
I recommend multiple smaller servers if you can get away with it as 
long as you have one that can accommodate your LARGE workloads.  If you 
don't care about overall performance, large servers are the way to go 
as they are more "versatile".
Regards,
Iyad Kandalaft
Acting Chief Bioinformatician in Biodiversity, STB Agriculture and 
Agri-Food Canada / Government of Canada Iyad.Kandalaft@Agr.gc.ca / Tel: 
613-759-1228 / TTY: 613-773-2600
-----Original Message-----
From: galaxy-dev [mailto:galaxy-dev-bounces@lists.galaxyproject.org] On Behalf Of Shane Kelly
Sent: July-28-15 8:23 AM
To: galaxy-dev@lists.galaxyproject.org
Subject: [galaxy-dev] I need some advice on what type of Galaxy server to implement
Hi
  I have been tasked with getting a Galaxy server up and running
  for a group at work.
1. No-one can tell me how many users (concurrent or otherwise) there will be.
2. Most of the analyses will be NGS.
3. Tools will be developed in-house but we will use public domain tools also.
4. There will be a guy running the server/developing tools pretty much full time.
I have two favoured solutions at the moment:
1. A pipeline processor (64 cores, 512 GB RAM, with DAS of about 150 TB), 
and a web server to act as frontend and database server, and another, 
smaller box for a total install of Galaxy, but doing only the 
development work.
2. An all-in-one server with 128 cores, 1 TB RAM, DAS storage of 150 TB, 
and development work done on a VM.
Any input would be helpful.
Thanks
Shane
___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this and other 
Galaxy lists, please use the interface at:
 https://lists.galaxyproject.org/
To search Galaxy mailing lists use the unified search at:
 http://galaxyproject.org/search/mailinglists/