This isn't an easy question to answer. Here's why:
*mammalian genome sizes vary significantly; larger genomes require more resources, but the relationship is difficult to quantify;
*assembly can take anywhere from a day to a week, depending on software and resource choices;
*variant detection can take anywhere from 1 to 4 days, depending on the software used;
*completing both assembly and variant detection in 48 hours is challenging for even the most advanced genomics labs.
To answer your question, I'd start with 256-512 GB of RAM on a single machine and 36-72 compute cores across a cluster. This is simply a guess, of course. Before investing in hardware, you might try your analysis on the cloud (usegalaxy.org/cloud) to get a sense of the resources needed.
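For a rough sense of the data volume behind a 30-fold run, here is a back-of-the-envelope sketch. The ~3 Gbp genome size, the 100 bp read length, and the function names are all assumptions for illustration, not measurements. It also treats the work as evenly divisible across cores, which real assemblers and variant callers are not, so read the result as an order of magnitude at best.

```python
# Back-of-the-envelope sizing for a resequencing run.
# Every default below is an illustrative assumption, not a benchmark.


def sequenced_bases(genome_size_bp: float, coverage: float) -> float:
    """Total bases sequenced = genome size x fold coverage."""
    return genome_size_bp * coverage


def read_count(total_bases: float, read_length_bp: int) -> float:
    """Number of reads needed to hold `total_bases` at a fixed read length."""
    return total_bases / read_length_bp


def required_throughput(total_bases: float, hours: float, cores: int) -> float:
    """Bases each core must process per second to finish within `hours`.

    Assumes perfectly even scaling across cores (optimistic).
    """
    return total_bases / (hours * 3600) / cores


if __name__ == "__main__":
    genome_bp = 3.0e9   # assumed: roughly human-sized mammalian genome
    coverage = 30       # from the question: 30-fold coverage
    read_len = 100      # assumed read length in bp
    deadline_h = 48     # from the question: 48 hours

    total = sequenced_bases(genome_bp, coverage)
    print(f"bases sequenced: {total:.2e}")
    print(f"reads at {read_len} bp: {read_count(total, read_len):.2e}")
    for cores in (36, 72):
        rate = required_throughput(total, deadline_h, cores)
        print(f"{cores} cores: {rate:,.0f} bases/s per core")
```

Changing genome_bp shows how the totals scale with genome size, which is the part of the estimate that is easy to quantify; how runtime and RAM respond is what a trial run would tell you.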
Good luck,
J.
On Sep 11, 2013, at 8:34 AM, Gerald Bothe wrote:
Can I add a similar question on top of this: how many resources do you need for re-sequencing a mammalian genome (assembly and variant detection), one job at a time? E.g., how much RAM etc. would I need if I want the re-sequencing SAM file at 30-fold coverage to be done in 48 hours?
Gerald
Gerald Bothe
32 Plum Hill Road
East Lyme, CT 06333
(860) 451 8776
Hi Peter
It's going to be one big machine running both the Galaxy server and the jobs, in a multi-process configuration. If that is a terrible idea, please let me know so I can pass the feedback along.
The de novo assembly may also be for human or mouse genomes.
Bests,
Nikos
___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client. To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
http://lists.bx.psu.edu/
To search Galaxy mailing lists use the unified search at:
http://galaxyproject.org/search/mailinglists/
___________________________________________________________