Re: [galaxy-dev] Request: Option to reduce server data transfer for big workflow in cluster

18 Dec 2013

File system performance varies wildly between storage architectures.
There are storage server setups that can easily scale to orders of
magnitude beyond the compute that backs usegalaxy.org. Suffice it to say,
we are currently bound by the number of cores we have available and
not by IO/network performance. (Nate may elaborate on the specifics of
the public servers' setup, but I am not sure it would be useful to you
unless you have hundreds or thousands (or millions) of dollars to spend on
new storage and network hardware :) ).

Also, my idea was not a second Galaxy instance; sorry I did not make
that clearer. It was to restrict your current Galaxy instance to a
smaller portion of your cluster. If your cluster is completely
dedicated to Galaxy, this idea doesn't make sense, but if it is a
shared Condor cluster used for other things besides Galaxy, it
could.

Sorry I have not been more helpful.

-John

On Tue, Dec 17, 2013 at 6:12 PM, Ben Gift <corn8bit2@gmail.com> wrote:
...
How do you have it set up on the main public galaxy install? I imagine that
people run enough big jobs that there is enormous use of your shared
file system. How did you scale that to so many nodes without bogging down
the file system with large dataset transfers?
It seems that for now the solution of having a second Galaxy instance will
work well, thank you very much John :) . But I'm still interested in a more
permanent scaled solution. After reading up more on our shared file system
it still seems like heavy traffic is bad, so could my initial idea still be
good?

John Chilton