David,
Please direct questions about local installations to the galaxy-dev mailing list (cc'd); there are a large number of experienced people on this list who can likely help with this and similar questions.
Just a quick question about the compute infrastructure behind the public instance of Galaxy. Our Galaxy copy runs on our High Performance Cluster here at X, and the HPC team is currently canvassing users about the next upgrade. This cluster serves the needs of all the university academics here, so we share time with physicists, aerodynamics engineers, etc. However, the HPC board of directors tell me that they see biological computing as their next big growth area, especially now that we have a copy of Galaxy. What they would like to know is what mix of new nodes would be most useful to us. They have about X million pounds to spend (about X million dollars), and the question, as I understand it, is determining the right mix of expensive high-memory nodes (e.g. with 1 or 2 TB of RAM on board) and cheaper low-memory nodes (e.g. with 32 GB of RAM or even less). The problem for me is that I have no idea what to suggest (I am not that compute savvy), so any advice from your experience (or that of the Galaxy team) would be most helpful.
Your hardware mix will depend on what types of analyses you plan to support/run. For instance, assembly jobs (e.g. ABySS, Velvet, Trinity) require high-memory nodes, while read mapping jobs (e.g. BWA, Bowtie, TopHat) are best run on high-CPU nodes. Also, creating indexed files used in visualization, such as bigWig and bigBed, requires large amounts of memory as well. You'll want to do some research on the tools that you plan to run most often and determine the best hardware for them.
Nate Coraor's presentation at GCC 2011 has some useful information about setting up a production Galaxy as well as details about our public instance: http://wiki.g2.bx.psu.edu/Events/GCC2011 In short, we use multiple clusters to distribute resource-intensive jobs appropriately.
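In case it helps to see how that routing works in practice, per-tool job destinations are set in Galaxy's universe_wsgi.ini. The snippet below is only a rough sketch -- the queue names, native options, and tool IDs are illustrative, and it assumes an SGE-style scheduler behind DRMAA; check the cluster setup pages on the wiki for the exact runner URL syntax for your scheduler:

    # universe_wsgi.ini -- illustrative sketch only; queue names and tool IDs are made up
    [galaxy:tool_runners]
    # send de novo assemblers to a high-memory queue
    velvet = drmaa://-q highmem.q/
    trinity = drmaa://-q highmem.q/
    # send read mappers to ordinary multi-core nodes with several slots each
    bwa_wrapper = drmaa://-q general.q -pe smp 8/
    bowtie_wrapper = drmaa://-q general.q -pe smp 8/

With something like this in place, the expensive 1-2 TB nodes only ever see the jobs that actually need them, and everything else lands on the cheaper nodes.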
Also, you may want to search the galaxy-dev mailing list as there's been discussion about hardware specs for a Galaxy instance previously: http://gmod.827538.n3.nabble.com/Galaxy-Development-f815885.html
Nate and others, please feel free to chime in with additional information.
Good luck, J.
Hi Jeremy,
Many thanks for the feedback (I thought I'd cc'd the dev list but apparently not!). Any advice is most useful, especially as they are keen to hear what we want (they already have a list of demands from the aerospace guys!).
Best Wishes, David.
__________________________________ Dr David A. Matthews
Senior Lecturer in Virology, Room E49, Department of Cellular and Molecular Medicine, School of Medical Sciences, University Walk, University of Bristol, Bristol BS8 1TD, U.K.
Tel. +44 117 3312058 Fax. +44 117 3312091
D.A.Matthews@bristol.ac.uk