This isn't an easy question to answer. Here's why:

* There is significant variation in mammalian genome size. Larger genomes require more resources, but the relationship is difficult to quantify.
* Assembly can take anywhere from a day to a week, depending on software and resource choices.
* Variant detection can take anywhere from 1 to 4 days, depending on the software used.
* Completing both assembly and variant detection in 48 hours is challenging for even the most advanced genomics labs.

To answer your question, I'd start with a machine with 256-512 GB of RAM and 36-72 compute cores across a cluster. This is only a guess, of course. Before investing in hardware, you might try your analysis on the cloud (usegalaxy.org/cloud) to get a sense of the resources needed.
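
If it helps to put rough numbers on the data volume before testing, here is a minimal back-of-the-envelope sketch. The function name and the example inputs (a 3 Gbp genome and 100 bp reads) are my own illustrative assumptions, not measurements from any particular run, and the result says nothing by itself about RAM or runtime, which depend on the tools you choose.

```python
# Rough data-volume estimate for a resequencing run.
# All inputs are illustrative assumptions; substitute your own values.

def estimate_reads(genome_size_bp, coverage, read_length_bp):
    """Return the approximate sequencing volume for a given coverage."""
    total_bases = genome_size_bp * coverage      # bases needed for the target coverage
    n_reads = total_bases / read_length_bp       # number of reads of the given length
    # FASTQ stores one character per base plus one quality character per base,
    # so 2 bytes per base is a lower bound (read headers add more).
    fastq_bytes_lower_bound = total_bases * 2
    return {
        "total_gbases": total_bases / 1e9,
        "million_reads": n_reads / 1e6,
        "fastq_gb_lower_bound": fastq_bytes_lower_bound / 1e9,
    }

if __name__ == "__main__":
    # Assumed example: ~3 Gbp genome, 30-fold coverage, 100 bp reads.
    est = estimate_reads(genome_size_bp=3_000_000_000, coverage=30, read_length_bp=100)
    for key, value in est.items():
        print(f"{key}: {value:.1f}")
```

Running the same numbers on a small cloud test first should tell you how these volumes translate into memory and wall-clock time for your chosen software.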

Good luck,
J.

On Sep 11, 2013, at 8:34 AM, Gerald Bothe wrote:

Can I add a similar question on top of this: how many resources do you need for re-sequencing a mammalian genome (assembly and variant detection), one job at a time? E.g., how much RAM etc. would I need if I want the re-sequencing SAM file at 30-fold coverage to be done in 48 hours?
 
Gerald
 
Gerald Bothe
32 Plum Hill Road
East Lyme, CT 06333
(860) 451 8776

From: Nikos Sidiropoulos <nikos.sidiro@gmail.com>
To: Peter Cock <p.j.a.cock@googlemail.com>
Cc: "<galaxy-dev@bx.psu.edu>" <galaxy-dev@bx.psu.edu>
Sent: Wednesday, September 11, 2013 8:19 AM
Subject: Re: [galaxy-dev] Scaling and hardware requirements

Hi Peter

It's going to be one big machine, running both the Galaxy server and the jobs, in a multi-process configuration. If that is a terrible idea, please let me know so I can pass the feedback on.

The de novo assemblies may also include human/mouse genomes.

Bests,
Nikos


2013/9/11 Peter Cock <p.j.a.cock@googlemail.com>
On Wed, Sep 11, 2013 at 1:03 PM, Nikos Sidiropoulos
<nikos.sidiro@gmail.com> wrote:
> Hi all,
>
> I have a couple of questions regarding a server setup dedicated to Galaxy.
>
> The idea is to buy a 64-core, 256GB RAM server. From my experience I believe
> that Galaxy will be able to scale up to 64 CPUs, but I would like some more
> feedback on this. Also, is 4GB RAM per CPU core enough for NGS data
> (including de novo assembly)?
>
> Bests,
> Nikos

Hi Nikos,

Is this going to be one server both for running Galaxy (which
needs fairly low resources) and running jobs for Galaxy,
like de novo assemblies (which need high resources)?

i.e. you have one big machine only, no cluster?

For de novo assembly the RAM per core/CPU isn't important,
it is the total RAM on the machine. How much RAM you
need depends on which assembler you use, the organism
(both size and also complexity) and the volume of data.

What you've described should be fine for bacterial assemblies
and smaller eukaryotes - beyond that you'll need to give
more details.

Peter


___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/
