Hopefully Ross' email helped - it's sometimes difficult for us to know how easily people adapt Galaxy for their own use, outside of our public sites.
Michael Rusch wrote:
How does Galaxy scale? Does anybody have experience with scaling to thousands of datasets, or working with datasets in the hundreds of megabytes?
Our public sites host hundreds of thousands of datasets, ranging in size up to a few gigabytes each. The caveats Ross listed about moving data around apply, so good cluster infrastructure is important when working with larger datasets.
We have traditionally done most of our work using a MySQL backend. I haven't (yet) received the green light from our sysadmin to install Postgres, and I'm wondering if anybody has any experience running on MySQL. Is it possible? Are there pitfalls?
We do indeed test all of our builds on SQLite (the default database, but not recommended outside of development), Postgres, and MySQL.
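For what it's worth, switching backends is just a matter of the database connection string in Galaxy's config file; the exact key name and URL are from memory here, so treat this as a sketch and double-check against the sample config that ships with your version:

```ini
# Hypothetical example -- adjust user, password, host, and database name.
# Galaxy uses SQLAlchemy, so the URL follows its dialect://user:pass@host/dbname form.
database_connection = mysql://galaxy:secret@localhost/galaxydb
```

Since all database access goes through SQLAlchemy, the same Galaxy code runs against SQLite, Postgres, or MySQL with only this string changed.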
Has anybody by any chance implemented support for Condor as a job scheduler?
No, only TORQUE/PBS and Sun Grid Engine. Galaxy's job runner is modular and can support any number of configurable job runners, though.
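As an illustration of how a runner is selected, jobs are routed via URL-style runner strings in the config file; the key names below are assumptions based on common Galaxy setups, so verify them against your own config sample:

```ini
# Hypothetical sketch: enable the PBS runner plugin and send jobs to it by default.
start_job_runners = pbs
default_cluster_job_runner = pbs:///
```

A Condor runner would plug into the same mechanism: implement the runner interface, register it, and point the runner URL at it.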