the display of bam files from galaxy behind apache is not working
Hi,
I can't see the data at UCSC when I upload a BAM file and try to display it. The track arrives at UCSC. In the httpd access_log I see GET /display_application/... with a 200 status and a byte count, and in paster.log I see GET /display_application/... as well, but I don't see the sequence reads in the UCSC browser.
Some background information: I am setting up our local Galaxy server. The server is Red Hat 5. The IP is external, and the behavior is the same whether the firewall is up or down. The Galaxy install is running at localhost:8080. The default Red Hat Apache is running in front of Galaxy, with the rewrite rules enabled for static content and for all traffic to port 80. Apache security is not enabled (yet). A dedicated galaxy user and group runs Galaxy.
Thanks,
Terry
********************************************************** Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues
Hi there, I think we should join forces to find out what's going on... I have the very same issue. Can you check whether you can display at least a few thousand reads at the beginning of chromosome 10 (or whichever chromosome comes first in your BAM file)?
d
/* Davide Cittaro Cogentech - Consortium for Genomic Technologies via adamello, 16 20139 Milano Italy tel.: +39(02)574303007 e-mail: davide.cittaro@ifom-ieo-campus.it */
Davide,
Yes, you are correct: a few reads from the BAM file are showing up on chr10, from about 80,000 to 300,000; no other chromosome appears to have reads. That explains the data retrieval and why I see data going from the web servers to UCSC. It doesn't explain why nothing is displayed anywhere else. Is the entire file content being sent to UCSC with every request, rather than the particular chunk to be displayed, so that we are just hitting a UCSC data limit? How could that be accessed?
I took a look at the previous thread at http://lists.bx.psu.edu/pipermail/galaxy-dev/2010-May/002691.html but I didn't see a solution in it.
My Apache is not forcing authentication, Galaxy is supplying data to UCSC, and some of the data is getting there for display. The same file displays normally at UCSC when it is served from our local HTTP server, so the file is functional. The same file loaded on the public Galaxy server also works, so public Galaxy can provide the data properly.
The Galaxy build is the current dist build and Python is 2.5.
Are there any ideas about what I have missed to get UCSC to display my BAM files properly?
Thanks
Terry
_______________________________________________ galaxy-dev mailing list galaxy-dev@lists.bx.psu.edu http://lists.bx.psu.edu/listinfo/galaxy-dev
Hi Terry, On Jun 21, 2010, at 7:11 PM, TRBarrette wrote:
Davide, Yes you are correct, a few reads from the bam file are showing up on chr10, from about 80,000 - 300,000, no other chr appear to have reads.
Nice! :-(
Well that explains the data retrieval issue and seeing data come from the web servers to ucsc. It doesn't explain why no where else is displayed. Is the entire file content being sent to ucsc with every request rather than the particular chunk to be displayed, so we are just hitting a ucsc data limit? How could that be accessed?
I've tried a tcpdump on interfaces involved in this data stream. Well, it was for a bigWig file but I believe the cause of this weird behavior is the same. Basically the galaxy web server is not able to serve partial file contents to a client. In other words, the udc client from the genome browser asks for a chunk of data, while galaxy gives back the whole file... Check this post: http://lists.bx.psu.edu/pipermail/galaxy-dev/2010-June/002802.html
/* Davide Cittaro Cogentech - Consortium for Genomic Technologies via adamello, 16 20139 Milano Italy tel.: +39(02)574303007 e-mail: davide.cittaro@ifom-ieo-campus.it */
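Editorial sketch, not part of the original messages: the behaviour Davide describes (a chunk is requested, the whole file comes back) can be checked without tcpdump by sending a ranged request and looking at the status code. The function names below are hypothetical and the code assumes a modern Python 3, not the Python 2.5 mentioned in the thread. A server that honours the Range header should answer 206 with a Content-Range header; one that ignores it answers 200 with the full body.

```python
import urllib.request


def classify_range_response(status, headers):
    """Interpret the answer to a 'Range: bytes=...' request.

    Returns 'partial' for 206 plus a Content-Range header (range honoured),
    'full' for 200 (range ignored, whole file sent), 'other' otherwise.
    """
    # Header names are case-insensitive, so normalise before the lookup.
    names = {name.lower() for name in headers}
    if status == 206 and "content-range" in names:
        return "partial"
    if status == 200:
        return "full"
    return "other"


def probe(url, first=0, last=99):
    """Request bytes first..last of url and classify the response.

    Hypothetical helper: 'url' stands for the dataset URL that the
    genome browser fetches from the Galaxy server.
    """
    request = urllib.request.Request(
        url, headers={"Range": "bytes=%d-%d" % (first, last)}
    )
    with urllib.request.urlopen(request) as response:
        return classify_range_response(
            response.status, dict(response.headers.items())
        )
```

If `probe()` reports 'full' for a display URL, the server is probably ignoring byte ranges, which would match the symptom of only the first reads appearing.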
Thanks for the link to the tcp thread. Well, that's a little puzzling: how is it that the public server works, then? Is there a different implementation or Paste server? I tried Python 2.5 and 2.6.5 with my Galaxy build, with no change in results.
Thanks
Terry
On Jun 22, 2010, at 1:15 PM, TRBarrette wrote:
Thanks for the link to the tcp thread. Well that's a little puzzling, how is it that the public server works then?
Don't know... Are you proxying with apache? They are not... Can you try without the proxy?
Is there a different implementation or paste server?
I hope some of the developers can answer...
d
/* Davide Cittaro Cogentech - Consortium for Genomic Technologies via adamello, 16 20139 Milano Italy tel.: +39(02)574303007 e-mail: davide.cittaro@ifom-ieo-campus.it */
We use nginx in front of Galaxy, which provides partial GET support. You may be able to accomplish this with an alternative WSGI server as well.
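Editorial sketch, not part of the original messages: "partial GET support" means the server resolves the Range header against the file size and returns only that slice with a 206 status. The minimal Python 3 illustration below handles a single byte range on an in-memory payload; the function names are hypothetical, and this is neither nginx's nor Galaxy's implementation.

```python
import re

_RANGE_RE = re.compile(r"^bytes=(\d*)-(\d*)$")


def parse_byte_range(header, size):
    """Resolve a single-range 'Range' header against 'size' bytes.

    Returns (start, end), both inclusive, or None when the header is
    absent, malformed, multi-range, or cannot be satisfied.
    """
    if not header:
        return None
    match = _RANGE_RE.match(header.strip())
    if match is None:
        return None
    first, last = match.groups()
    if first == "" and last == "":
        return None
    if first == "":
        # Suffix form 'bytes=-N': the final N bytes of the file.
        length = int(last)
        if length == 0:
            return None
        start, end = max(size - length, 0), size - 1
    else:
        start = int(first)
        end = min(int(last), size - 1) if last else size - 1
    if start > end or start >= size:
        return None
    return start, end


def partial_response(data, range_header):
    """Return (status, headers, body) for an in-memory payload.

    Simplification: an unusable range falls back to the full body,
    where a real server would answer 416 for an unsatisfiable one.
    """
    span = parse_byte_range(range_header, len(data))
    if span is None:
        return 200, {"Content-Length": str(len(data))}, data
    start, end = span
    body = data[start:end + 1]
    headers = {
        "Content-Range": "bytes %d-%d/%d" % (start, end, len(data)),
        "Content-Length": str(len(body)),
    }
    return 206, headers, body
```

A genome browser reading a BAM index depends on the 206 branch: it asks for many small slices and expects each response to contain only the requested bytes.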
On Jun 22, 2010, at 3:17 PM, James Taylor wrote:
We use nginx in front of Galaxy which provides partial GET support. You may be able to accomplish this with an alternative wsgi server as well.
I've disabled the proxy and still have the issue :-( That means that the apache (or nginx) proxy doesn't affect this (probably)... Also, which alternative wsgi servers are supported by galaxy? And, most important, can you provide a BAM file on your public data library so that people can test "how it should work"? I've tried to produce one on your main server aligning some public data but the job crashed...
/* Davide Cittaro Cogentech - Consortium for Genomic Technologies via adamello, 16 20139 Milano Italy tel.: +39(02)574303007 e-mail: davide.cittaro@ifom-ieo-campus.it */
Davide Cittaro wrote:
On Jun 22, 2010, at 3:17 PM, James Taylor wrote:
We use nginx in front of Galaxy which provides partial GET support. You may be able to accomplish this with an alternative wsgi server as well.
I've disabled the proxy and still have the issue :-( That means that the apache (or nginx) proxy doesn't affect this (probably)...
nginx would need to be configured to use x-accel-redirect. Apache would need to use x-sendfile. This is so the dataset will be sent directly by the proxy, rather than just streamed through from Paste. --nate
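Editorial sketch, not part of the original messages: the mechanism Nate describes can be pictured as the application returning an empty body plus a header naming the file, so that the front-end server delivers the bytes and answers Range requests itself. The WSGI app below is a hypothetical illustration in Python 3, not Galaxy's code. It uses the X-Sendfile header read by mod_xsendfile; nginx's X-Accel-Redirect works similarly but takes an internal URI instead of a filesystem path.

```python
def make_download_app(path_for_request, use_xsendfile=True):
    """Build a tiny WSGI app that serves one file per request.

    'path_for_request' is a hypothetical callable that maps the WSGI
    environ to an absolute file path.
    """
    def app(environ, start_response):
        path = path_for_request(environ)
        if use_xsendfile:
            # Delegate delivery to the front-end server: it replaces this
            # empty response with the file contents and can honour Range.
            start_response("200 OK", [
                ("Content-Type", "application/octet-stream"),
                ("X-Sendfile", path),
            ])
            return [b""]
        # Fallback: send the whole file from the application and ignore
        # any Range header, which is the behaviour that breaks BAM display.
        with open(path, "rb") as handle:
            payload = handle.read()
        start_response("200 OK", [
            ("Content-Type", "application/octet-stream"),
            ("Content-Length", str(len(payload))),
        ])
        return [payload]
    return app
```

This is also consistent with Davide's observation that removing the proxy did not help: without a front end acting on the header, the application still sends the entire file.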
On Jun 22, 2010, at 3:39 PM, Nate Coraor wrote:
nginx would need to be configured to use x-accel-redirect. Apache would need to use x-sendfile. This is so the dataset will be sent directly by the proxy, rather than just streamed through from Paste.
OMFG! It works! Thanks! It works for BAM and bigWig (and about this, are you interested in my implementation or are you implementing it on your own?) d /* Davide Cittaro Cogentech - Consortium for Genomic Technologies via adamello, 16 20139 Milano Italy tel.: +39(02)574303007 e-mail: davide.cittaro@ifom-ieo-campus.it */
Davide, I'd really like to see how you got it to work.
Thanks
Terry
On Jun 22, 2010, at 7:30 PM, TRBarrette wrote:
Davide, I'd really like to see how you got it work.
I'm on my way home right now... can you wait until tomorrow? :-) I'll send you my configuration.
d
/* Davide Cittaro Cogentech - Consortium for Genomic Technologies via adamello, 16 20139 Milano Italy tel.: +39(02)574303007 e-mail: davide.cittaro@ifom-ieo-campus.it */
Hi Terry, On Jun 22, 2010, at 7:30 PM, TRBarrette wrote:
Davide, I'd really like to see how you got it work.
Note that my Galaxy configuration relies on a local mirror of the UCSC genome browser. Our Galaxy runs as a multiserver (2 web servers on ports 8081 and 8082, one runner on 8100). I've installed mod_xsendfile as Nate suggested (http://tn123.ath.cx/mod_xsendfile/) and enabled it:
LoadModule xsendfile_module /usr/lib/apache2/modules/mod_xsendfile.so
I've configured Apache to bind a virtual host to port 8080, which is now proxying Galaxy:
[begin apache conf for galaxy, I'll try to comment it]
NameVirtualHost *:8080
<VirtualHost *:8080>
    ServerAdmin davide.cittaro@ifom-ieo-campus.it
    #Here comes the proxy stuff... I think you already have this :-)
    <Proxy localhost:8081>
        Order allow,deny
        Allow from all
    </Proxy>
    <Proxy localhost:8082>
        Order allow,deny
        Allow from all
    </Proxy>
    <Proxy balancer://multi-galaxy>
        BalancerMember http://localhost:8081
        BalancerMember http://localhost:8082
    </Proxy>
    ProxyPass / balancer://multi-galaxy
    RewriteEngine on
    RewriteRule ^(.*) http://localhost:8081$1 [P]
    RewriteRule ^/static/style/(.*) /data/galaxy_dist/static/june_2007_style/blue/$1 [L]
    RewriteRule ^/static/(.*) /data/galaxy_dist/static/$1 [L]
    RewriteRule ^/images/(.*) /data/galaxy_dist/static/images/$1 [L]
    RewriteRule ^/favicon.ico /data/galaxy_dist/static/favicon.ico [L]
    RewriteRule ^/robots.txt /data/galaxy_dist/static/robots.txt [L]
    <Location />
        AuthType Basic
        AuthName Galaxy
        # Xsendfile as Nate suggested
        XSendFile On
        XSendFileAllowAbove On
        # I'm using our internal ldap, querying for name and email
        AuthBasicProvider ldap
        AuthLDAPURL "ldap://ldap.ifom-ieo-campus.it/dc=ifom-ieo-campus,dc=it?cn,mail?sub?(cn=*)"
        AuthLDAPRemoteUserAttribute mail
        Require ldap-filter objectClass=posixAccount
    </Location>
    # Set the http header to user e-mail so that galaxy is happy to authenticate :-)
    RequestHeader set REMOTE_USER %{AUTHENTICATE_MAIL}e
    <Location /root/display_as>
        Satisfy Any
        Order deny,allow
        Allow from genome.ifom-ieo-campus.it
    </Location>
    <LocationMatch /ucsc_(bam|big) >
        # This is to enable bam and bigWig (or bigBed in the future) by traversing the proxy
        # Allow from our internal network
        # and set the http header to a fake email address, this is required because of galaxy architecture...
        Satisfy any
        Order deny,allow
        Allow from 85.239.0.0/255.255.0.0
        RequestHeader set REMOTE_USER "ucsc_browser_display@ifom-ieo-campus.it"
    </LocationMatch>
    ErrorLog /var/log/apache2/galaxy-error.log
    LogLevel debug
    CustomLog /var/log/apache2/galaxy-access.log combined
    ServerSignature On
</VirtualHost>
[/end of apache conf file]
After this comes the Galaxy configuration file... this is pretty much the original one, so I'll write only the differences relevant here:
ucsc_display_sites = main,campus
#where campus is our local mirror. I left "main" although we are behind a firewall and it cannot communicate...
use_remote_user = True
apache_xsendfile = True
#remote_user_maildomain =
#commented and left blank... well, this is because I already have the whole mail address in the http header
Then there are some mods I've made to Galaxy code and files.
In ${GALAXY_ROOT}/tool-data/shared/ucsc/ucsc_build_sites.txt I've added the following, to enable our "campus" UCSC mirror:
#Harvested from http://genome.ifom-ieo-campus.it/cgi-bin/das/dsn
campus http://genome.ifom-ieo-campus.it/cgi-bin/hgTracks? hg19,hg18,hg17,mm9,mm8,rn4,danRer6,danRer5,ci2,ce6,ce4,cb3,dm3,sacCer2,sacCer1
Then I've modified a Python file to enable our local mirror:
diff -r 4cdf4cca0f31 lib/galaxy/web/framework/middleware/remoteuser.py
--- a/lib/galaxy/web/framework/middleware/remoteuser.py Mon Jun 21 13:46:52 2010 -0400
+++ b/lib/galaxy/web/framework/middleware/remoteuser.py Wed Jun 23 10:34:03 2010 +0200
@@ -44,6 +44,7 @@
     'hgw6.cse.ucsc.edu',
     'hgw7.cse.ucsc.edu',
     'hgw8.cse.ucsc.edu',
+    'genome.ifom-ieo-campus.it',
 )
 UCSC_ARCHAEA_SERVERS = (
     'lowepub.cse.ucsc.edu',
@@ -55,7 +56,7 @@
         self.maildomain = maildomain
         self.allow_ucsc_main = False
         self.allow_ucsc_archaea = False
-        if 'main' in ucsc_display_sites or 'test' in ucsc_display_sites:
+        if 'main' in ucsc_display_sites or 'test' in ucsc_display_sites or 'campus' in ucsc_display_sites:
             self.allow_ucsc_main = True
         if 'archaea' in ucsc_display_sites:
             self.allow_ucsc_archaea = True
@@ -69,7 +70,7 @@
             host = None
         if ( self.allow_ucsc_main and host in UCSC_MAIN_SERVERS ) or \
            ( self.allow_ucsc_archaea and host in UCSC_ARCHAEA_SERVERS ):
-            environ[ 'HTTP_REMOTE_USER' ] = 'ucsc_browser_display@example.org'
+            environ[ 'HTTP_REMOTE_USER' ] = 'ucsc_browser_display@ifom-ieo-campus.it'
             return self.app( environ, start_response )
         # Apache sets REMOTE_USER to the string '(null)' when using the
         # Rewrite* method for passing REMOTE_USER and a user is
I believe this is all... If your Galaxy can communicate with the main UCSC server you won't need some of the patches above, only the Apache configuration.
HTH
d
/* Davide Cittaro Cogentech - Consortium for Genomic Technologies via adamello, 16 20139 Milano Italy tel.: +39(02)574303007 e-mail: davide.cittaro@ifom-ieo-campus.it */
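Editorial sketch, not part of the original messages: the patch above makes requests from the local genome browser mirror count as a known display user. Below is a simplified, self-contained Python 3 illustration of that idea. It is not Galaxy's remoteuser.py: the class name and the `resolve` parameter are hypothetical, and it assumes the host name comes from a reverse DNS lookup of REMOTE_ADDR, which the diff does not show.

```python
import socket

# Values taken from the patch above; the tuple is a shortened stand-in
# for the server lists in the real file.
DISPLAY_SERVERS = ("genome.ifom-ieo-campus.it",)
DISPLAY_USER = "ucsc_browser_display@ifom-ieo-campus.it"


class DisplayServerUser(object):
    """WSGI middleware: label requests from known display servers."""

    def __init__(self, app, servers=DISPLAY_SERVERS, user=DISPLAY_USER,
                 resolve=None):
        self.app = app
        self.servers = servers
        self.user = user
        # 'resolve' maps an IP address to a host name; injectable so the
        # logic can be exercised without DNS.
        self.resolve = resolve or self._reverse_lookup

    @staticmethod
    def _reverse_lookup(address):
        try:
            return socket.gethostbyaddr(address)[0]
        except (socket.error, TypeError):
            return None

    def __call__(self, environ, start_response):
        host = self.resolve(environ.get("REMOTE_ADDR"))
        if host in self.servers:
            # Pretend the browser is an authenticated user so that the
            # dataset request passes the remote-user check.
            environ["HTTP_REMOTE_USER"] = self.user
        return self.app(environ, start_response)
```

Requests from any other host pass through unchanged, so normal users still have to authenticate through the proxy.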
Davide, Terry and Nate; This discussion is awesome. We've also been missing BAM display at UCSC and this thread finally cleared all the cobwebs so we could get it working.
Note that my galaxy configuration relies on a local mirror of the UCSC genome browser. Our galaxy runs as a multiserver (2 web servers on ports 8081 and 8082, one runner on 8100). I've installed mod_xsendfile as Nate suggested (http://tn123.ath.cx/mod_xsendfile/) and enabled it:
[... apache configuration ...]
I believe this is all... If your galaxy can communicate with main UCSC server you won't need some the patches above, but only the apache configuration.
I added the explicit information for getting this set up on nginx on the wiki:
http://bitbucket.org/galaxy/galaxy-central/wiki/Config/nginxProxy
All of the parts were there between the example configuration and Nate's talk, but this puts it together for those folks like me who hadn't made all the connections.
Nate, would you be able to provide some clues for the configuration to get uploads handled through nginx? Your talk mentions this is slightly more complex; I see the following variables that might be involved in universe_config:
nginx_x_archive_files_base
upstream_gzip
nginx_upload_store
nginx_upload_path
and could use a bit of direction about which items in your sample config are critical for enabling it.
Thanks,
Brad
Brad Chapman wrote:
I added the explicit information for getting this set up on nginx on the wiki:
http://bitbucket.org/galaxy/galaxy-central/wiki/Config/nginxProxy
Awesome, thanks!
All of the parts were there between the example configuration and Nate's talk, but this puts it together for those folks like me who hadn't made all the connections.
Nate, would you be able to provide some clues for the configuration to get uploads handled through nginx? Your talk mentions this is slightly more complex; I see the following variables that might be involved in universe_config:
nginx_x_archive_files_base upstream_gzip nginx_upload_store nginx_upload_path
and could use a bit of direction about which items in your sample config are critical for enabling it.
Yes, I'm planning on finally writing up the missing bits starting at the end of this week. --nate
Davide, Nate, Brad,
Davide, thanks for the configuration; there are many aspects here (LDAP, load balancing) that I will be working toward, and this is a great help.
Unfortunately I am still missing something. I don't have any auth running right now. I am using just the default setup of Apache, with the RewriteRule set from the Galaxy wiki. I installed and loaded the xsendfile_module. I added a <Location "/"> directive with XSendFile on, so that anything can be served through XSendFile (this is the 'get it to work' phase). Still no joy.
The request comes back from UCSC and hits Apache with the specific byte request. That gets passed to the Galaxy service, which interprets it and returns the entire file, part of which loads at UCSC. It seems that the hand-off to XSendFile is not being handled correctly.
I've attached a comment-stripped http.conf, a universe_wsgi.ini, a tail of the http access_log and a tail of the paster.log.
In the process I've tried adding and removing each stanza from Davide's example setup. The http.conf is what makes sense to me now, seeing as there is no auth and this setup is preliminary, meant to get things working before bringing the site up to fully functional. The RewriteRule seems to work with or without a proxy; I've tried it with a proxy stanza and without.
The Galaxy part of the http.conf is pretty simple in the end. Originally, I didn't bother to move the DocumentRoot, seeing as all traffic was redirected via the RewriteRules. I allowed XSendFile on / in this example, but I've also tried /galaxy/galaxy-dist, where the application is located. The BAM file is located on the NFS share at /exds/galaxy/galaxy-dist/database/files.
### added for galaxy ###
#rewrite rules for apache
RewriteEngine On
RewriteRule ^/static/style/(.*) /galaxy/galaxy-dist/static/june_2007_style/blue/$1 [L]
Rewriterule ^/static/(.*) /galaxy/galaxy-dist/static/$1 [L]
RewriteRule ^/images/(.*) /galaxy/galaxy-dist/images(.*) [L]
RewriteRule ^/favicon.ico /galaxy/galaxy-dist/favicon.ico [L]
RewriteRule ^/robots.txt /galaxy/galaxy-dist/robots.txt [L]
RewriteRule ^(.*) http://localhost:8080$1 [P]
#eos rewrite rules for apache
#apache_xsendfile directive#
<Location "/" >
    Satisfy Any
    order allow,deny
    allow from all
    XSendFile on
    XSendFileAllowAbove on
</Location>
#eos apache_xsendfile directive#
### eos added for galaxy ###
I'm missing something obvious, thanks for any help.
Terry
Hi Terry, On Jun 22, 2010, at 7:30 PM, TRBarrette wrote:
Davide, I'd really like to see how you got it work.
Note that my galaxy configuration relies on a local mirror of the UCSC genome browser. Our galaxy runs as a multiserver (2 web servers on ports 8081 and 8082, one runner on 8100). I've installed mod_xsendfile as Nate suggested (http://tn123.ath.cx/mod_xsendfile/) and enabled it:
LoadModule xsendfile_module /usr/lib/apache2/modules/mod_xsendfile.so
I've configured apache to bind a virtual host to port 8080, which is now proxying galaxy:
[begin apache conf for galaxy, I'll try to comment it]
NameVirtualHost *:8080 <VirtualHost *:8080> ServerAdmin davide.cittaro@ifom-ieo-campus.it <mailto:davide.cittaro@ifom-ieo-campus.it> #Here comes the proxy stuff... I think you already have this :-) <Proxy localhost:8081> Order allow,deny Allow from all </Proxy> <Proxy localhost:8082> Order allow,deny Allow from all </Proxy> <Proxy balancer://multi-galaxy> BalancerMember http://localhost:8081 BalancerMember http://localhost:8082 </Proxy> ProxyPass / balancer://multi-galaxy
RewriteEngine on RewriteRule ^(.*) http://localhost:8081$1 [P] RewriteRule ^/static/style/(.*) /data/galaxy_dist/static/june_2007_style/blue/$1 [L] RewriteRule ^/static/(.*) /data/galaxy_dist/static/$1 [L] RewriteRule ^/images/(.*) /data/galaxy_dist/static/images/$1 [L] RewriteRule ^/favicon.ico /data/galaxy_dist/static/favicon.ico [L] RewriteRule ^/robots.txt /data/galaxy_dist/static/robots.txt [L]
    <Location />
        AuthType Basic
        AuthName Galaxy
        # Xsendfile as Nate suggested
        XSendFile On
        XSendFileAllowAbove On
        # I'm using our internal ldap, querying for name and email
        AuthBasicProvider ldap
        AuthLDAPURL "ldap://ldap.ifom-ieo-campus.it/dc=ifom-ieo-campus,dc=it?cn,mail?sub?(cn=*)"
        AuthLDAPRemoteUserAttribute mail
        Require ldap-filter objectClass=posixAccount
    </Location>
    # Set the http header to user e-mail so that galaxy is happy to authenticate :-)
    RequestHeader set REMOTE_USER %{AUTHENTICATE_MAIL}e
    <Location /root/display_as>
        Satisfy Any
        Order deny,allow
        Allow from genome.ifom-ieo-campus.it
    </Location>
    <LocationMatch /ucsc_(bam|big) >
        # This is to enable bam and bigWig (or bigBed in the future) by traversing the proxy
        # Allow from our internal network
        # and set the http header to a fake email address, this is required because of galaxy architecture...
        Satisfy any
        Order deny,allow
        Allow from 85.239.0.0/255.255.0.0
        RequestHeader set REMOTE_USER "ucsc_browser_display@ifom-ieo-campus.it"
    </LocationMatch>
    ErrorLog /var/log/apache2/galaxy-error.log
    LogLevel debug
    CustomLog /var/log/apache2/galaxy-access.log combined
    ServerSignature On
</VirtualHost>
[/end of apache conf file]
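A note on rule order: mod_rewrite evaluates RewriteRule lines top to bottom, and both the [L] and [P] flags end processing at the first rule that matches. The specific static rules therefore only take effect when they come before the catch-all proxy rule, which is the order Terry's rule set uses. The following is a minimal Python sketch of that ordering behaviour only, not of Apache itself. The function name `resolve` and the request paths are made up for illustration; the target paths are the ones from the configuration above.

```python
import re

# (pattern, substitution, flag) triples in evaluation order.
# "L" and "P" both end processing at the first matching rule.
RULES = [
    (r"^/static/style/(.*)", "/data/galaxy_dist/static/june_2007_style/blue/$1", "L"),
    (r"^/static/(.*)", "/data/galaxy_dist/static/$1", "L"),
    (r"^/images/(.*)", "/data/galaxy_dist/static/images/$1", "L"),
    (r"^/favicon.ico", "/data/galaxy_dist/static/favicon.ico", "L"),
    (r"^/robots.txt", "/data/galaxy_dist/static/robots.txt", "L"),
    (r"^(.*)", "http://localhost:8081$1", "P"),
]


def resolve(path, rules=RULES):
    """Return ("file" | "proxy", target) for the first rule that matches."""
    for pattern, substitution, flag in rules:
        match = re.search(pattern, path)
        if match is None:
            continue
        target = substitution
        if match.groups():
            # Substitute the single captured group for $1.
            target = target.replace("$1", match.group(1))
        return ("proxy" if flag == "P" else "file", target)
    return ("unmatched", path)


if __name__ == "__main__":
    print(resolve("/static/style/base.css"))
    print(resolve("/display_application/abc"))
    # With the catch-all rule moved to the front, static rules are never reached.
    print(resolve("/static/style/base.css", rules=[RULES[-1]] + RULES[:-1]))
```

If the catch-all rule is listed first, every request is proxied to Galaxy. That still works, since Galaxy can serve its own static files, but Apache no longer serves them directly.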
After this comes the galaxy configuration file... this is pretty much the original one, I'll write only the differences for this scope:
ucsc_display_sites = main,campus
#where campus is our local mirror. I left "main" although we are behind a firewall and it cannot communicate...
use_remote_user = True
apache_xsendfile = True
#remote_user_maildomain =
#commented out and left blank... this is because I already have the whole mail address in the http header
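For readers unfamiliar with the mechanism behind apache_xsendfile: with X-Sendfile, the application answers a download request with an X-Sendfile header naming a file on disk and leaves the body empty, and mod_xsendfile in Apache then delivers the file. Below is a minimal WSGI sketch of that idea in Python. It is not Galaxy's actual code; `make_download_app`, `call`, and the dataset path are hypothetical names used only for illustration.

```python
def make_download_app(file_path, use_xsendfile):
    """Build a tiny WSGI app that serves one file (hypothetical helper)."""

    def app(environ, start_response):
        if use_xsendfile:
            # Hand the path to the front-end server; the body stays empty.
            headers = [
                ("Content-Type", "application/octet-stream"),
                ("X-Sendfile", file_path),
            ]
            start_response("200 OK", headers)
            return [b""]
        # Fallback: the application reads and returns the bytes itself.
        with open(file_path, "rb") as handle:
            data = handle.read()
        headers = [
            ("Content-Type", "application/octet-stream"),
            ("Content-Length", str(len(data))),
        ]
        start_response("200 OK", headers)
        return [data]

    return app


def call(app):
    """Invoke a WSGI app once and capture status, headers, and body."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(app({"REQUEST_METHOD": "GET", "PATH_INFO": "/"}, start_response))
    return captured["status"], captured["headers"], body


if __name__ == "__main__":
    # The path is a placeholder; nothing is read from disk in X-Sendfile mode.
    print(call(make_download_app("/hypothetical/dataset_1.dat", True)))
```

In this mode Apache must be allowed to read the named file, which is why the XSendFile and XSendFileAllowAbove directives appear in the Apache configuration.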
Then there are some mods I've done in galaxy code and files:
In ${GALAXY_ROOT}/tool-data/shared/ucsc/ucsc_build_sites.txt I've added
#Harvested from http://genome.ifom-ieo-campus.it/cgi-bin/das/dsn
campus http://genome.ifom-ieo-campus.it/cgi-bin/hgTracks? hg19,hg18,hg17,mm9,mm8,rn4,danRer6,danRer5,ci2,ce6,ce4,cb3,dm3,sacCer2,sacCer1
This enables our "campus" UCSC mirror. Then I modified a Python file so that requests from our local mirror are accepted:
diff -r 4cdf4cca0f31 lib/galaxy/web/framework/middleware/remoteuser.py
--- a/lib/galaxy/web/framework/middleware/remoteuser.py Mon Jun 21 13:46:52 2010 -0400
+++ b/lib/galaxy/web/framework/middleware/remoteuser.py Wed Jun 23 10:34:03 2010 +0200
@@ -44,6 +44,7 @@
     'hgw6.cse.ucsc.edu',
     'hgw7.cse.ucsc.edu',
     'hgw8.cse.ucsc.edu',
+    'genome.ifom-ieo-campus.it',
 )
 UCSC_ARCHAEA_SERVERS = (
     'lowepub.cse.ucsc.edu',
@@ -55,7 +56,7 @@
         self.maildomain = maildomain
         self.allow_ucsc_main = False
         self.allow_ucsc_archaea = False
-        if 'main' in ucsc_display_sites or 'test' in ucsc_display_sites:
+        if 'main' in ucsc_display_sites or 'test' in ucsc_display_sites or 'campus' in ucsc_display_sites:
             self.allow_ucsc_main = True
         if 'archaea' in ucsc_display_sites:
             self.allow_ucsc_archaea = True
@@ -69,7 +70,7 @@
             host = None
         if ( self.allow_ucsc_main and host in UCSC_MAIN_SERVERS ) or \
            ( self.allow_ucsc_archaea and host in UCSC_ARCHAEA_SERVERS ):
-            environ[ 'HTTP_REMOTE_USER' ] = 'ucsc_browser_display@example.org'
+            environ[ 'HTTP_REMOTE_USER' ] = 'ucsc_browser_display@ifom-ieo-campus.it'
             return self.app( environ, start_response )
         # Apache sets REMOTE_USER to the string '(null)' when using the
         # Rewrite* method for passing REMOTE_USER and a user is
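The effect of the patch can be sketched as a small standalone WSGI middleware: when the requesting host is in the allowed server list and the matching display site is enabled, the request is tagged with a fixed display user. This is a simplified illustration, not the real remoteuser.py. The class name `DisplayUserMiddleware`, the injected `resolve_host` callable, and `echo_app` are assumptions made so the sketch runs without DNS lookups; the server list is abbreviated.

```python
UCSC_MAIN_SERVERS = (
    "hgw8.cse.ucsc.edu",
    "genome.ifom-ieo-campus.it",  # local mirror added by the patch
)


class DisplayUserMiddleware:
    """Simplified stand-in for the patched middleware (hypothetical class name)."""

    def __init__(self, app, ucsc_display_sites, resolve_host):
        self.app = app
        # resolve_host maps a client address to a host name; it is injected
        # here so the sketch does not depend on real DNS lookups.
        self.resolve_host = resolve_host
        self.allow_ucsc_main = any(
            site in ucsc_display_sites for site in ("main", "test", "campus")
        )

    def __call__(self, environ, start_response):
        host = self.resolve_host(environ.get("REMOTE_ADDR"))
        if self.allow_ucsc_main and host in UCSC_MAIN_SERVERS:
            # Requests from an allowed browser host get the display user.
            environ["HTTP_REMOTE_USER"] = "ucsc_browser_display@ifom-ieo-campus.it"
        return self.app(environ, start_response)


def echo_app(environ, start_response):
    """Toy downstream app that returns whatever remote user it was given."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [environ.get("HTTP_REMOTE_USER", "").encode()]


if __name__ == "__main__":
    wrapped = DisplayUserMiddleware(
        echo_app, ["main", "campus"], lambda addr: "genome.ifom-ieo-campus.it"
    )
    print(b"".join(wrapped({"REMOTE_ADDR": "192.0.2.10"}, lambda s, h: None)))
```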
I believe this is all... If your galaxy can communicate with the main UCSC server you won't need some of the patches above, only the apache configuration.
HTH
d
/* Davide Cittaro
Cogentech - Consortium for Genomic Technologies
via adamello, 16
20139 Milano
Italy
tel.: +39(02)574303007
e-mail: davide.cittaro@ifom-ieo-campus.it
*/
On Jun 23, 2010, at 11:52 PM, TRBarrette wrote:
Davide, Nate, Brad Davide, Thanks for the configuration, there are many aspects here (ldap, load balancing) that I will be working toward and this is a great help.
Unfortunately I am still missing something here. I don't have any auth running right now. I am using just the default setup of apache. I am using the RewriteRule set from the galaxy wiki. I installed and loaded the xsendfile_module. I added the <Location "/"> directive with XSendFile on so that anything can be served via xsendfile (this is the 'get it to work' phase).
Still no joy.
I've been reading your httpd.conf... What happens if you define a <Directory /> and a <Location /> in the same file? Don't they conflict? Could it be that apache is ignoring the Location block because a Directory block is defined at the same level but before it? I would start by removing from httpd.conf everything you don't need (since it will work only as a proxy): the Directory, Alias and ScriptAlias definitions... The universe_wsgi.ini file looks fine...

d
Terry;
Davide, Thanks for the configuration, there are many aspects here (ldap, load balancing) that I will be working toward and this is a great help.
Unfortunately I am still missing something here. [...] I added the <Location "/"> directive with XSendFile on so that anything can be served via xsendfile (this is the 'get it to work' phase).
Still no joy.
In your universe_wsgi.ini configuration file, you have the 'apache_xsendfile = true' line under the message queue ([galaxy_amqp]) section of the configuration. I believe it should be under the [app:main] section instead.

Fingers crossed that moving the line up there will get it rolling,
Brad
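A quick way to check for this kind of misplacement is to ask which section an option actually ends up in. The following is a minimal sketch using Python's standard configparser. The sample text is a made-up fragment that mirrors the situation Brad describes, not Terry's real file, and `sections_with_option` is a hypothetical helper name.

```python
import configparser

# Made-up fragment reproducing the misplacement described above.
SAMPLE_INI = """
[app:main]
use_remote_user = True

[galaxy_amqp]
apache_xsendfile = True
"""


def sections_with_option(ini_text, option):
    """Return the names of all sections that define `option`."""
    # RawConfigParser avoids interpolation errors on values containing '%';
    # strict=False tolerates duplicate keys in hand-edited files.
    parser = configparser.RawConfigParser(strict=False)
    parser.read_string(ini_text)
    return [name for name in parser.sections() if parser.has_option(name, option)]


if __name__ == "__main__":
    print(sections_with_option(SAMPLE_INI, "apache_xsendfile"))
```

If the result lists galaxy_amqp rather than app:main, the setting is in a section the web application does not read for this option, so the line needs to be moved.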
participants (6)
- Brad Chapman
- Davide Cittaro
- James Taylor
- Nate Coraor
- Terrence Barrette
- TRBarrette