Galaxy, Apache, WSGI and mod_wsgi
Hello,
I have recently had the opportunity to look at the deployment of Galaxy together with Apache, and I saw that the recommendation is to run the Galaxy Web server behind Apache with the latter acting as a proxy:
http://wiki.g2.bx.psu.edu/Admin/Config/Apache%20Proxy
Other than convenience - one can just put Apache in front of an existing server - is there any particular reason for doing things this way? It seems that Galaxy uses the built-in Web server provided by the Paste framework, which in turn is based on the Python standard library BaseHTTPServer framework, and although paste.httpserver seems to add capabilities to the underlying framework, each such server must still be constrained to running in a single process. I imagine that this then leads to the use of load balancing as described on the following page:
http://wiki.g2.bx.psu.edu/Admin/Config/Performance/Web%20Application%20Scali...
Given that Apache is an acceptable part of a large-scale solution, I would like to know if mod_wsgi has been considered for the deployment of Galaxy at all, and whether anyone has any positive or negative experiences with it. It seems to me that mod_rewrite is often something that should only be brought into play where other, typically more elegant, solutions cannot be used. Many Python-based Web applications have mod_wsgi as a recommended deployment option once their users look beyond CGI, and I wondered why this isn't the case with Galaxy.
Paul
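For readers unfamiliar with what mod_wsgi would actually host: it loads a Python script file and, by default, calls the object named application in that file for each request. The following is only a minimal sketch of such an entry point, written with the standard library. It is not Galaxy's application object, and call_once is a hypothetical helper added here purely to exercise the callable without any server.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A trivial WSGI callable; 'application' is the name mod_wsgi looks for by default."""
    body = b"Hello from a WSGI application\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_once(app):
    """Hypothetical helper: invoke a WSGI app once with a synthetic request."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a minimal, valid WSGI environ
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


if __name__ == "__main__":
    print(call_once(application))
```

The same callable could in principle be served by paste.httpserver, mod_wsgi or any other WSGI server, which is what makes the choice of server largely a deployment decision.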
Hi, Paul
> I would like to know if mod_wsgi has been considered for the deployment of Galaxy at all, and whether anyone has any positive or negative experiences with it.
I can only speak for the second part of your question: I've had some experience with Apache 2 + mod_wsgi, but within a Django stack.
There was a somewhat complex set up/configuration (not horrible, just tedious) - that may have been particular to our situation and/or Django. It might be easier in your Galaxy situation.
After set up, I found it pretty easy to work with:
- Very rarely had to deal with bugs/workarounds or modification in general.
- His/Their documentation for the mod is excellent - which is an often understated positive.
- I found no problems with logging, debugging, or the mod 'getting in the way' of normal Apache features.
- If I recall correctly, there was a good user base out there (~2-3 years ago).
Carl
On Mon, Oct 22, 2012 at 9:26 AM, Paul Boddie <paul.boddie@biotek.uio.no> wrote:
> Hello,
> I have recently had the opportunity to look at the deployment of Galaxy together with Apache, and I saw that the recommendation is to run the Galaxy Web server behind Apache with the latter acting as a proxy:
> http://wiki.g2.bx.psu.edu/Admin/Config/Apache%20Proxy
> Other than convenience - one can just put Apache in front of an existing server - is there any particular reason for doing things this way? It seems that Galaxy uses the built-in Web server provided by the Paste framework, which in turn is based on the Python standard library BaseHTTPServer framework, and although paste.httpserver seems to add capabilities to the underlying framework, each such server must still be constrained to running in a single process. I imagine that this then leads to the use of load balancing as described on the following page:
> http://wiki.g2.bx.psu.edu/Admin/Config/Performance/Web%20Application%20Scaling
> Given that Apache is an acceptable part of a large-scale solution, I would like to know if mod_wsgi has been considered for the deployment of Galaxy at all, and whether anyone has any positive or negative experiences with it. It seems to me that mod_rewrite is often something that should only be brought into play where other, typically more elegant, solutions cannot be used. Many Python-based Web applications have mod_wsgi as a recommended deployment option once their users look beyond CGI, and I wondered why this isn't the case with Galaxy.
> Paul
On 24/10/12 18:09, Carl Eberhard wrote:
>> I would like to know if mod_wsgi has been considered for the deployment of Galaxy at all, and whether anyone has any positive or negative experiences with it.
> I can only speak for the second part of your question: I've had some experience with Apache 2 + mod_wsgi, but within a Django stack.
> There was a somewhat complex set up/configuration (not horrible, just tedious) - that may have been particular to our situation and/or Django. It might be easier in your Galaxy situation.
I've administered applications on various mod_wsgi sites, but I've never had to set it up. It can't really be worse than mod_python, though, can it?
> After set up, I found it pretty easy to work with:
> - Very rarely had to deal with bugs/workarounds or modification in general.
> - His/Their documentation for the mod is excellent - which is an often understated positive.
> - I found no problems with logging, debugging, or the mod 'getting in the way' of normal Apache features.
> - If I recall correctly, there was a good user base out there (~2-3 years ago).
I think it's even more usable than it was 2-3 years ago, which may also have been when I first had to use it. I just wondered whether there were any issues preventing its use with Galaxy.
Having a "persistent" Galaxy-only Web server process, as the Paste-based server seems to be, might inadvertently encourage people to store things in memory (class and module globals, for example) that would mysteriously go away (and cause errors) under different deployment mechanisms, although I would imagine that doing load balancing of several Paste-based servers might provoke similar problems and so they would be known to the community.
It just seemed rather involved to have to manage mod_rewrite rules or use mod_proxy when an integrated solution exists and is widely used. The only really practical reason I can think of, given that the Paste-based server is probably not well regarded for performance, is that mod_wsgi isn't packaged for certain operating system distributions like Red Hat Enterprise Linux, but having just checked, that actually isn't the case for RHEL 6:
mod_wsgi.x86_64 3.2-1.el6 rhel-x86_64-server-6
Maybe the answer is that people either don't know or don't really care. :-)
Paul
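The concern about module globals "mysteriously going away" can be illustrated with a small sketch. It assumes nothing about Galaxy's code: WORKER_SOURCE and run_worker are hypothetical names, and each call simply starts a fresh Python interpreter to stand in for one worker process of a multi-process deployment (or one of several load-balanced servers).

```python
import subprocess
import sys

# Source for a stand-in "worker": it keeps a request counter in a module global.
WORKER_SOURCE = """
hits = 0  # module-level state, private to this interpreter process

def handle_request():
    global hits
    hits += 1
    return hits

for _ in range({requests}):
    handle_request()
print(hits)
"""


def run_worker(requests):
    """Run one worker process that handles `requests` requests; return its counter."""
    result = subprocess.run(
        [sys.executable, "-c", WORKER_SOURCE.format(requests=requests)],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


if __name__ == "__main__":
    # Five requests were handled in total, but no single process ever sees 5.
    print(run_worker(3), run_worker(2))
```

Each process reports only the requests it handled itself, so anything that must be visible to every worker would have to live outside process memory, for example in a database.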
The Galaxy application does store quite a bit in memory (not as globals though). This doesn't preclude running under mod_wsgi, but it will work best in a configuration that uses a small number of long-running processes with multiple threads.
Basically, we run nginx proxying Paste on the main Galaxy, so that is the best-tested configuration, and that is what we recommend (and can help support most easily). Other configurations might work just fine. I've used Galaxy with various other wsgi servers with issues.
The performance of Paste's http server isn't really an issue, since all requests that actually get to Paste require database access, which is typically orders of magnitude slower than Paste#http's overhead.
-- jt
On Wed, Oct 24, 2012 at 12:23 PM, Paul Boddie <paul.boddie@biotek.uio.no> wrote:
> Having a "persistent" Galaxy-only Web server process, as the Paste-based server seems to be, might inadvertently encourage people to store things in memory (class and module globals, for example) that would mysteriously go away (and cause errors) under different deployment mechanisms, although I would imagine that doing load balancing of several Paste-based servers might provoke similar problems and so they would be known to the community.
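As a rough illustration of why a small number of long-running, multi-threaded processes suits an application that keeps state in memory: threads inside one process all see the same objects, provided access is synchronised. This is only a sketch under that assumption. AppState and serve are hypothetical names and do not correspond to anything in Galaxy.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class AppState:
    """Hypothetical in-memory application state shared by all request threads."""

    def __init__(self):
        self._lock = threading.Lock()
        self.jobs = {}

    def record(self, job_id, status):
        # The lock keeps concurrent updates from different threads consistent.
        with self._lock:
            self.jobs[job_id] = status


def serve(state, n_threads, requests_per_thread):
    """Simulate n_threads request threads writing into one shared state object."""

    def worker(thread_no):
        for i in range(requests_per_thread):
            state.record((thread_no, i), "ok")

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        list(pool.map(worker, range(n_threads)))  # wait for every thread to finish
    return len(state.jobs)


if __name__ == "__main__":
    print(serve(AppState(), n_threads=4, requests_per_thread=50))
```

Because every thread writes into the same AppState instance, all 200 entries end up in one place, which is the opposite of what happens when the same work is spread over separate processes.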
participants (3)
- Carl Eberhard
- James Taylor
- Paul Boddie