displaying ucsc tracks when using external auth
Hi All,

When attempting to display a custom track on the UCSC genome browser, Galaxy generates a URL such as:

http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg18&position=chr1:1-1000&hgt.customText=http%3A%2F%2Fgalaxy.example.com%2Froot%2Fdisplay_as%3Fid%3D171%26display_app%3Ducsc

The first problem I have is that the base URL is wrong for the customText - I am using https - is there a way to set that?

The customText parameter is the bit that allows UCSC to pick up the data - but it is not going to get past the external auth. Is there a reasonable way to fix this? I suppose there must be a way to bypass the auth if the request comes from UCSC and looks like a certain URL, but is there a better way?

thanks,
James
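To make the structure of that URL easier to see, here is a small Python sketch that decodes the example above using only the standard library. It is an illustration, not part of Galaxy: it simply shows that hgt.customText carries a second, percent-encoded URL pointing back at the Galaxy server, and that this inner URL is the one with the wrong scheme.

```python
from urllib.parse import urlsplit, parse_qs

# The example URL from the message above, split across lines for readability.
ucsc_url = (
    "http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg18&position=chr1:1-1000"
    "&hgt.customText=http%3A%2F%2Fgalaxy.example.com%2Froot%2Fdisplay_as"
    "%3Fid%3D171%26display_app%3Ducsc"
)

# parse_qs percent-decodes each value of the outer query string.
params = parse_qs(urlsplit(ucsc_url).query)
custom_text = params["hgt.customText"][0]

# The decoded value is itself a URL: the address UCSC fetches the track from.
callback = urlsplit(custom_text)
print(custom_text)
print(callback.scheme, callback.netloc, callback.path)
```

The scheme of the decoded callback URL is what Galaxy gets wrong behind an https proxy, and the callback request is what has to get past the external auth.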
James Casbon wrote:
When attempting to display a custom track on the UCSC genome browser, galaxy generates a URL such as...
The first problem I have is that the base url is wrong for the customText - I am using https - is there a way to set that?
This is a function of the Apache proxy. You connect to Apache via https, but it proxies to Galaxy via http. I wrote a fix for this but haven't committed it; I just learned that the UCSC browser doesn't support https. You could do some Apache trickery to make this work (i.e. use Location directives in your http VirtualHost to only allow through UCSC on http connections).
The customText parameter is the bit that allows UCSC to pick up the data - but it is not going to get past the external auth. Is there a reasonable way to fix this? I suppose there must be a way to bypass the auth if the request comes from UCSC and looks like a certain URL, but is there a better way?
The solution for this is provided here, but please be aware that it opens up a security hole:

http://g2.trac.bx.psu.edu/wiki/HowToInstall/ApacheProxy#DisplayatUCSC

I have a couple ideas about how to close this hole but nothing planned as of yet.

--nate
2009/1/8 Nate Coraor <nate@bx.psu.edu>:
James Casbon wrote:
When attempting to display a custom track on the UCSC genome browser, galaxy generates a URL such as...
The first problem I have is that the base url is wrong for the customText - I am using https - is there a way to set that?
This is a function of the Apache proxy. You connect to Apache via https, but it proxies to Galaxy via http. I wrote a fix for this but haven't committed it; I just learned that the browser doesn't support https.
Not sure I understand - you mean that the base url is technically correct but the proxy introduces the error? What we need is the ability to set the base url for this kind of situation.
You could do some Apache trickery to make this work (i.e. use Location directives in your http VirtualHost to only allow through UCSC on http connections).
The customText parameter is the bit that allows UCSC to pick up the data - but it is not going to get past the external auth. Is there a reasonable way to fix this? I suppose there must be a way to bypass the auth if the request comes from UCSC and looks like a certain URL, but is there a better way?
The solution for this is provided here, but please be aware that it opens up a security hole:
http://g2.trac.bx.psu.edu/wiki/HowToInstall/ApacheProxy#DisplayatUCSC
I have a couple ideas about how to close this hole but nothing planned as of yet.
Great, thanks. I hadn't spotted that on the wiki.
James Casbon wrote:
Not sure I understand - you mean that the base url is technically correct but the proxy introduces the error? What we need is the ability to set the base url for this kind of situation.
The URL scheme between Apache and Galaxy is http. Between Apache and the end user it can be anything (in your case, https). Galaxy only knows which scheme it's using with its immediate upstream.

I'll go ahead and commit my fix so that Galaxy generates the URL correctly. This should solve any future problems or integration with other external sources. But it won't fix UCSC, which I'll explain in a moment.

To generate proper 'https' URLs, inside the SSL VirtualHost in which you proxy Galaxy you'd need to add this:

    RequestHeader set X-URL-SCHEME https

Galaxy will then check the X-URL-SCHEME header and override the previously defined scheme (http) accordingly.

However, UCSC cannot read data over https[1], so fixing the URL will not actually help. However, you can serve UCSC data over http via a hackish workaround:

1. Don't set X-URL-SCHEME as described above. Galaxy will continue to generate (incorrect) 'http' links.
2. In the non-SSL VirtualHost for galaxy.example.com, set up a proxy similar to how you did in the SSL VirtualHost, but only for /root/display_as. You'll want to deny connections from everywhere but UCSC:

    <VirtualHost *:80>
        ... standard stuff ...
        Order deny,allow
        Deny from all
        Allow from hgw1.cse.ucsc.edu
        Allow from hgw2.cse.ucsc.edu
        Allow from hgw3.cse.ucsc.edu
        Allow from hgw4.cse.ucsc.edu
        Allow from hgw5.cse.ucsc.edu
        Allow from hgw6.cse.ucsc.edu
        Allow from hgw7.cse.ucsc.edu
        Allow from hgw8.cse.ucsc.edu
        RewriteEngine on
        RewriteRule ^(/root/display_as.*) http://localhost:8192$1 [P,L]
        RewriteRule ^(.*) - [F]
    </VirtualHost>

Authentication configuration is not necessary since you are only allowing through UCSC, unauthenticated. Everything else in that VirtualHost will return 403 Forbidden.

[1] http://www.soe.ucsc.edu/pipermail/genome/2008-April/015997.html

--nate
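For readers wondering what "check the X-URL-SCHEME header and override the scheme" could look like on the application side, here is a minimal WSGI sketch. It is not Galaxy's actual code: the class name URLSchemeMiddleware, the demo_app function, and the call helper are all hypothetical, and the sketch assumes the header can be trusted because only the front-end Apache can reach the application.

```python
class URLSchemeMiddleware:
    """Hypothetical WSGI middleware: trust X-URL-SCHEME set by the front-end proxy."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # WSGI exposes the request header X-URL-SCHEME as HTTP_X_URL_SCHEME.
        scheme = environ.get("HTTP_X_URL_SCHEME")
        if scheme in ("http", "https"):
            # Override the scheme seen on the Apache-to-application hop.
            environ["wsgi.url_scheme"] = scheme
        return self.app(environ, start_response)


def demo_app(environ, start_response):
    """Toy application that builds an absolute URL from the WSGI environ."""
    url = "{}://{}/root/display_as".format(
        environ["wsgi.url_scheme"], environ["HTTP_HOST"]
    )
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [url.encode("utf-8")]


def call(app, extra_headers):
    """Invoke a WSGI app with a minimal fake environ and return the body text."""
    environ = {"wsgi.url_scheme": "http", "HTTP_HOST": "galaxy.example.com"}
    environ.update(extra_headers)
    body = app(environ, lambda status, headers: None)
    return b"".join(body).decode("utf-8")


wrapped = URLSchemeMiddleware(demo_app)
print(call(wrapped, {}))
print(call(wrapped, {"HTTP_X_URL_SCHEME": "https"}))
```

Without the header the generated URL keeps the http scheme of the proxy hop; with the header set by the SSL VirtualHost, the same code generates https URLs.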
2009/1/9 Nate Coraor <nate@bx.psu.edu>:
James Casbon wrote:
Not sure I understand - you mean that the base url is technically correct but the proxy introduces the error? What we need is the ability to set the base url for this kind of situation.
The URL scheme between Apache and Galaxy is http. Between Apache and the end user it can be anything (in your case, https). Galaxy only knows which scheme it's using with its immediate upstream.
I'll go ahead and commit my fix so that Galaxy generates the URL correctly. This should solve any future problems or integration with other external sources. But it won't fix UCSC, which I'll explain in a moment.
To generate proper 'https' URLs, inside the SSL VirtualHost in which you proxy Galaxy you'd need to add this:
RequestHeader set X-URL-SCHEME https
Galaxy will then check the X-URL-SCHEME header and override the previously defined scheme (http) accordingly.
I tried out this fix and it works - which is great for the UCSC table browser integration.
However, UCSC cannot read data over https[1], so fixing the URL will not actually help. However, you can serve UCSC data over http via a hackish workaround:
1. Don't set X-URL-SCHEME as described above. Galaxy will continue to generate (incorrect) 'http' links.
2. In the non-SSL VirtualHost for galaxy.example.com, set up a proxy similar to how you did in the SSL VirtualHost, but only for /root/display_as. You'll want to deny connections from everywhere but UCSC:
    <VirtualHost *:80>
        ... standard stuff ...
        Order deny,allow
        Deny from all
        Allow from hgw1.cse.ucsc.edu
        Allow from hgw2.cse.ucsc.edu
        Allow from hgw3.cse.ucsc.edu
        Allow from hgw4.cse.ucsc.edu
        Allow from hgw5.cse.ucsc.edu
        Allow from hgw6.cse.ucsc.edu
        Allow from hgw7.cse.ucsc.edu
        Allow from hgw8.cse.ucsc.edu
        RewriteEngine on
        RewriteRule ^(/root/display_as.*) http://localhost:8192$1 [P,L]
        RewriteRule ^(.*) - [F]
    </VirtualHost>
Authentication configuration is not necessary since you are only allowing through UCSC, unauthenticated. Everything else in that VirtualHost will return 403 Forbidden.
[1] http://www.soe.ucsc.edu/pipermail/genome/2008-April/015997.html
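The intended effect of the http-only VirtualHost quoted above can be modelled in a few lines of Python. This is only a sketch of the outcome (UCSC hosts may fetch /root/display_as, everything else gets 403), not a description of Apache's internal processing order. The function name decide and the return values are hypothetical; the host names, the path pattern, and the backend address are taken from the quoted configuration.

```python
import re

# Hosts listed in the "Allow from" lines: hgw1 through hgw8.
ALLOWED_HOSTS = {"hgw%d.cse.ucsc.edu" % i for i in range(1, 9)}

# Same pattern as the first RewriteRule; it is matched against the URL path only.
DISPLAY_AS = re.compile(r"^(/root/display_as.*)")

# Backend address used as the proxy target in the quoted RewriteRule.
BACKEND = "http://localhost:8192"


def decide(client_host, path):
    """Return ("proxy", target_url) or ("forbidden", None) for a request."""
    if client_host not in ALLOWED_HOSTS:
        # "Deny from all" applies to every host without an "Allow from" line.
        return ("forbidden", None)
    match = DISPLAY_AS.match(path)
    if match:
        # First RewriteRule: proxy the request to the Galaxy backend ([P,L]).
        return ("proxy", BACKEND + match.group(1))
    # Second RewriteRule: everything else is answered with 403 ([F]).
    return ("forbidden", None)


print(decide("hgw1.cse.ucsc.edu", "/root/display_as"))
print(decide("hgw1.cse.ucsc.edu", "/root/index"))
print(decide("other.example.org", "/root/display_as"))
```

Only the first request is proxied; the other two are refused, which is why no authentication configuration is needed in that VirtualHost.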
I also tried this, which is also good. I can display custom tracks on the genome browser.

I would like a way to have both the table and the custom track integration. I suppose the best way is to use https, and if it is a UCSC client redirect it to http. That is if the UCSC client will honor redirects?

cheers,
James
2009/1/12 James Casbon <casbon@gmail.com>:
No it won't. So we either need an explicit exception to the URL-SCHEME code, or to do some more apache trickery to put non https traffic over the same port, just for the ucsc clients.
James Casbon wrote:
No it won't. So we either need an explicit exception to the URL-SCHEME code, or to do some more apache trickery to put non https traffic over the same port, just for the ucsc clients.
That's probably a task for a packet filter (firewall) rather than Apache.

I'm not clear on what this would be necessary for, though? UCSC will already talk to your Galaxy over http if you configure Apache to allow it. All other connections would have to use https as long as you use Deny statements in your http vhost to prevent non-UCSC sites from connecting.

--nate
2009/1/12 Nate Coraor <nate@bx.psu.edu>:
James Casbon wrote:
No it won't. So we either need an explicit exception to the URL-SCHEME code, or to do some more apache trickery to put non https traffic over the same port, just for the ucsc clients.
That's probably a task for a packet filter (firewall) rather than Apache.
I'm not clear on what this would be necessary for, though? UCSC will already talk to your Galaxy over http if you configure Apache to allow it. All other connections would have to use https as long as you use Deny statements in your http vhost to prevent non-UCSC sites from connecting.
I want the table browser to work and generate the correct (https) links back to my Galaxy install, which means I set the URL scheme so that these are correct. This has the side effect that links to the tracks for genome browser are also https, which doesn't work. What I need is:

1. https URLs for the table browser
2. http URLs for the tracks

One is going to be wrong at the moment, because Galaxy behaves correctly, but UCSC doesn't.

cheers,
James
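The "explicit exception to the URL-SCHEME code" mentioned earlier in the thread could look roughly like the Python sketch below. Nothing here is existing Galaxy code: link_scheme and external_url are hypothetical helpers, /tool_runner is a made-up example path, and the sketch assumes that /root/display_as is the only path UCSC needs to fetch over plain http.

```python
def link_scheme(path, proxy_scheme):
    """Pick the scheme for an externally visible link to `path`.

    proxy_scheme is the scheme reported by the front-end proxy,
    for example through the X-URL-SCHEME header.
    """
    if path.startswith("/root/display_as"):
        # Exception: UCSC cannot read track data over https.
        return "http"
    return proxy_scheme


def external_url(host, path, proxy_scheme):
    """Build an absolute URL using the per-path scheme choice."""
    return "{}://{}{}".format(link_scheme(path, proxy_scheme), host, path)


print(external_url("galaxy.example.com", "/root/display_as?id=171&display_app=ucsc", "https"))
print(external_url("galaxy.example.com", "/tool_runner", "https"))
```

With this kind of exception, track links would stay on http while every other generated link follows the scheme announced by the proxy. It would still rely on the http VirtualHost letting the UCSC hosts through.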
James Casbon wrote:
I want the table browser to work and generate the correct (https) links back to my galaxy install, which means I set the URL scheme so that these are correct. This has the side effect that links to the tracks for genome browser are also https, which doesn't work. What I need is:
1. https urls for the table browser
2. http urls for the tracks
one is going to be wrong at the moment, because galaxy behaves correctly, but UCSC doesn't.
Ah, I understand the problem now. A simple redirect is not going to work, because the browser would have to speak https to get the redirect in the first place.

Assuming that UCSC is happy to speak http to URLs starting with https, you should be able to use your favorite flavor of packet filtering to redirect all port 443 traffic originating from hgw[1-8].cse.ucsc.edu to port 80.

--nate
Participants (2): James Casbon, Nate Coraor