Exactly. In addition, most relational databases are optimized for data that can change, but the access pattern for our raw data is write-once. We can implement more efficient storage formats and indexes outside the database for this purpose.

On Aug 31, 2010, at 5:17 PM, Hiram Clawson wrote:
Good afternoon Yury:
Typical file sizes currently run in the tens and hundreds of GB for most workflows. It isn't practical to try to stuff such large single entities into a database. It is much simpler to compute indexes into the file and store the indexes in the database. We do this all the time at the UCSC genome browser.
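
As a rough illustration of the index-in-the-database idea (a sketch only, not the UCSC genome browser's actual code), the Python below scans a flat file once, stores each record's byte offset and length in SQLite, and later seeks straight to a record without reading the rest of the file. The FASTA-like ">name" record format, the record_index table, and the build_index and fetch_record helpers are all assumptions made up for this example.

```python
import os
import sqlite3
import tempfile


def build_index(data_path, conn):
    """Scan the data file once; store each record's byte offset and length.

    Assumes a FASTA-like layout where every record starts with a '>name' line.
    The table and column names are hypothetical, chosen for this sketch.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS record_index ("
        "name TEXT PRIMARY KEY, "
        "byte_offset INTEGER NOT NULL, "
        "byte_length INTEGER NOT NULL)"
    )
    entries = []
    name, start, pos = None, 0, 0
    with open(data_path, "rb") as fh:
        for line in fh:
            if line.startswith(b">"):
                # Close the previous record before starting a new one.
                if name is not None:
                    entries.append((name, start, pos - start))
                fields = line[1:].split()
                name = fields[0].decode("ascii") if fields else ""
                start = pos
            pos += len(line)
    if name is not None:
        entries.append((name, start, pos - start))
    conn.executemany(
        "INSERT OR REPLACE INTO record_index VALUES (?, ?, ?)", entries
    )
    conn.commit()
    return len(entries)


def fetch_record(data_path, conn, name):
    """Look up one record in the index, then read only those bytes."""
    row = conn.execute(
        "SELECT byte_offset, byte_length FROM record_index WHERE name = ?",
        (name,),
    ).fetchone()
    if row is None:
        raise KeyError(name)
    byte_offset, byte_length = row
    with open(data_path, "rb") as fh:
        fh.seek(byte_offset)
        return fh.read(byte_length)


if __name__ == "__main__":
    # Tiny stand-in for a multi-gigabyte file.
    path = os.path.join(tempfile.mkdtemp(), "reads.fa")
    with open(path, "wb") as out:
        out.write(b">seq1 first\nACGT\nTTGA\n>seq2\nGGCC\n")
    db = sqlite3.connect(":memory:")
    build_index(path, db)
    print(fetch_record(path, db, "seq2"))
```

The database holds only a few small integers per record, so it stays tiny no matter how large the data file grows, and because the file is written once the offsets never go stale.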
--Hiram
Yury Bukhman wrote:
Thank you, James, for your reply. I wonder if you could elaborate ...
galaxy-user mailing list
galaxy-user@lists.bx.psu.edu
http://lists.bx.psu.edu/listinfo/galaxy-user
-- jt

James Taylor, Assistant Professor
Department of Biology
Department of Mathematics & Computer Science
Emory University