Cluster Installation Tips
Hi,

I am attempting to install Galaxy in our computing grid as a first step to integrating our tools within it. I have everything up and running as far as serving the galaxy suite except the job-submission aspect (to our compute cluster).

We have a commodity cluster using the SGE job scheduling system along with an extra node which acts just as the q-master and job submission node. Galaxy is going to be running on a standalone web-server which may or may not be outside of a firewall disconnecting it from the cluster and head node, although ports may be opened in the firewall if required.

What would be the advised strategy for enabling the web server to submit jobs to our cluster? Am I right in thinking that DRMAA allows remote hosts to submit jobs remotely? Do I need to install the SGE binaries on the web-server and use DRMAA_python to connect to the head node and submit the jobs?

Any advice or direction on how to tackle this would be greatly appreciated.

Cheers,
Matt Goyder
--
matthew.goyder@nationwidechildrens.org
Battelle Center for Mathematical Medicine
The Research Institute at Nationwide Children's Hospital
Office: (614)355-2395

-----------------------------------------
Confidentiality Notice: The following mail message, including any attachments, is for the sole use of the intended recipient(s) and may contain confidential and privileged information. The recipient is responsible to maintain the confidentiality of this information and to use the information only for authorized purposes. If you are not the intended recipient (or authorized to receive information for the intended recipient), you are hereby notified that any review, use, disclosure, distribution, copying, printing, or action taken in reliance on the contents of this e-mail is strictly prohibited. If you have received this communication in error, please notify us immediately by reply e-mail and destroy all copies of the original message. Thank you.
As far as I understand, your Galaxy server and Grid Engine need to share a filesystem over NFS. The solution I use is to run Galaxy on a computer which is a submit host of the cluster, and thus has access to the SGE binaries, and to have the web server reverse-proxy the Galaxy site. So if you run Galaxy on your head node, you just need to pass HTTP traffic through your firewall.

For using Apache as a proxy, see:
http://bitbucket.org/galaxy/galaxy-central/wiki/Config/ApacheProxy

For general production server configuration:
http://bitbucket.org/galaxy/galaxy-central/wiki/Config/ProductionServer

And for integrating SGE:
http://bitbucket.org/galaxy/galaxy-central/wiki/Config/Cluster

regards,
Andreas
_______________________________________________ galaxy-dev mailing list galaxy-dev@lists.bx.psu.edu http://lists.bx.psu.edu/listinfo/galaxy-dev
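The shared-filesystem requirement mentioned in the reply above can be checked before any job is submitted. The following is only a minimal sketch, assuming a Linux host where the mount table is readable from /proc/mounts; the helper names `find_mount` and `is_on_nfs` and the example paths are hypothetical and are not part of Galaxy or SGE.

```python
import os

# Filesystem types treated as NFS for this check (an assumption; extend as needed).
NETWORK_FS_TYPES = {"nfs", "nfs4"}


def find_mount(path, mounts_text):
    """Return (mount_point, fs_type) of the longest mount point containing path.

    mounts_text is expected in /proc/mounts format:
    "<device> <mount_point> <fs_type> <options> ..." on each line.
    """
    path = os.path.abspath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # Prefer the most specific (longest) matching mount point.
            if best is None or len(mount_point) > len(best[0]):
                best = (mount_point, fs_type)
    return best


def is_on_nfs(path, mounts_text):
    """True if path lives on a mount whose type is listed in NETWORK_FS_TYPES."""
    mount = find_mount(path, mounts_text)
    return mount is not None and mount[1] in NETWORK_FS_TYPES
```

On the Galaxy host this could be called as `is_on_nfs(galaxy_data_dir, open("/proc/mounts").read())`, where `galaxy_data_dir` stands for whichever directory holds Galaxy's datasets and job working files. It only confirms that the directory is NFS-mounted locally; it does not prove that the cluster nodes mount the same export at the same path.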
Thanks Andreas.

Matt, SGE must be installed on the Galaxy server, since you need access to $SGE_ROOT to submit and monitor jobs. The Galaxy server does not need to be the same machine as the head node, but the SGE commands need to work on it. You also need some sort of shared filesystem such as NFS, as explained by Andreas, since this is how the cluster nodes access the same files as the Galaxy server.

--nate
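As a rough illustration of the point about $SGE_ROOT and working SGE commands, here is a small preflight check that could be run as the Galaxy user on the Galaxy server. It is a sketch under assumptions: the function name `check_sge_client` is hypothetical, and the choice of `qsub`, `qstat`, and `qdel` as the commands to look for is an assumption about which SGE client tools matter for submitting and monitoring jobs.

```python
import os
import shutil

# SGE client commands assumed to be needed for submitting and monitoring jobs.
REQUIRED_COMMANDS = ("qsub", "qstat", "qdel")


def check_sge_client(env=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means the check passed.

    env and which are injectable so the check can be exercised without a real cluster.
    """
    env = os.environ if env is None else env
    problems = []

    sge_root = env.get("SGE_ROOT")
    if not sge_root:
        problems.append("SGE_ROOT is not set")
    elif not os.path.isdir(sge_root):
        problems.append(f"SGE_ROOT does not exist: {sge_root}")

    for command in REQUIRED_COMMANDS:
        if which(command) is None:
            problems.append(f"{command} not found on PATH")

    return problems
```

Running `print(check_sge_client())` prints an empty list when the environment looks usable. Passing this check does not guarantee that the host is registered as a submit host with the qmaster; a real test submission is still needed for that.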
participants (3)
- Andreas Kuntzagk
- Goyder, Matthew
- Nate Coraor