File system performance varies wildly between storage architectures. There are storage server setups that can easily scale to orders of magnitude beyond the compute that backs usegalaxy.org - suffice it to say we are currently bound by the number of cores we have available and not by IO/network performance. (Nate may elaborate on the specifics of the public server's setup, but I am not sure it would be useful to you unless you have hundreds or thousands (or millions) of dollars to spend on new storage and network hardware :) ).

Also, my idea was not a second Galaxy instance - sorry I did not make that clearer. It was to restrict your current Galaxy instance to a smaller portion of your cluster. If your cluster is completely dedicated to Galaxy, this idea doesn't make sense, but if it is a shared Condor cluster that is also used for things other than Galaxy, it could.

Sorry I have not been more helpful.

-John

On Tue, Dec 17, 2013 at 6:12 PM, Ben Gift <corn8bit2@gmail.com> wrote:
How do you have it set up on the main public Galaxy install? I imagine that people run enough big jobs that there is enormous use of your shared file system. How did you scale that to so many nodes without bogging down the file system with large dataset transfers?
It seems that for now the solution of having a second Galaxy instance will work well, thank you very much, John :) But I'm still interested in a more permanent, scalable solution. After reading up more on our shared file system, it still seems like heavy traffic is bad for it, so could my initial idea still be a good one?