I need some help configuring Galaxy with the SGE scheduler using the unified method. Galaxy is running on a system separate from the SGE scheduler install. The cluster nodes can access the Galaxy install, galaxy-tools, and dataset files over NFS. I am not sure how DRMAA works or how Galaxy submits jobs to the cluster/scheduler. Do we need to specify some type of connection string or ssh-config to connect to the cluster/scheduler? Does it need any configuration changes on the SGE scheduler side? Any explanation would be really helpful.
-- Thanks, Shantanu.
Hi Shantanu,
You'll need to locate your DRMAA library, which can be found wherever SGE is installed. For example, if SGE is installed for 64-bit Linux in /galaxy/sge, then the DRMAA library should be located at:
/galaxy/sge/lib/lx24-amd64/libdrmaa.so.1.0
Once you have the path, do the following (adjusting the value for the path to libdrmaa.so.1.0 at your site):
export DRMAA_LIBRARY_PATH=/galaxy/sge/lib/lx24-amd64/libdrmaa.so.1.0
Then in universe_wsgi.ini, set:
start_job_runners = drmaa
default_cluster_job_runner = drmaa:///
This should be all you need.
--nate
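The environment step above can be sanity-checked before Galaxy is started. What follows is only a minimal sketch and not part of Galaxy: the helper name `check_drmaa_library` is hypothetical, and it merely confirms that `DRMAA_LIBRARY_PATH` is set and points at a readable file. It does not load the library or contact the SGE qmaster.

```python
import os


def check_drmaa_library(env=None):
    """Return (ok, message) describing the DRMAA_LIBRARY_PATH setting.

    Hypothetical helper for illustration only; it does not load the
    library or talk to SGE.
    """
    env = os.environ if env is None else env
    path = env.get("DRMAA_LIBRARY_PATH")
    if not path:
        return False, "DRMAA_LIBRARY_PATH is not set"
    if not os.path.isfile(path):
        return False, "no file at %s" % path
    if not os.access(path, os.R_OK):
        return False, "file at %s is not readable" % path
    return True, "found DRMAA library at %s" % path


if __name__ == "__main__":
    ok, message = check_drmaa_library()
    print(("OK: " if ok else "PROBLEM: ") + message)
```

Run it in the same shell session (and as the same user) that will start Galaxy, since the exported variable is only visible to processes started from that environment.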
On May 12, 2011, at 3:43 PM, Nate Coraor wrote:
Hi Shantanu,
You'll need to locate your DRMAA library, which can be found wherever SGE is installed. For example, if SGE is installed for 64-bit Linux in /galaxy/sge, then the DRMAA library should be located at:
/galaxy/sge/lib/lx24-amd64/libdrmaa.so.1.0
Once you have the path, do the following (adjusting the value for the path to libdrmaa.so.1.0 at your site):
export DRMAA_LIBRARY_PATH=/galaxy/sge/lib/lx24-amd64/libdrmaa.so.1.0
Then in universe_wsgi.ini, set:
start_job_runners = drmaa
default_cluster_job_runner = drmaa:///
This should be all you need.
Thanks for the explanation, Nate. I will try it out. Also, after integrating with the cluster, is any pre-processing or post-processing done locally on the Galaxy system? I would like to get a sense of how much RAM and other system resources the Galaxy system itself will require. I am assuming it can be a fairly thin system that runs the web server, database, and job submission processes. Is that a correct assumption, or am I missing something here?
-- Shantanu.
That is correct. Make sure you set:
set_metadata_externally = True
in universe_wsgi.ini. For the best performance, it's recommended that you run nginx and have it handle the uploads and downloads, as explained in the production server documentation:
http://usegalaxy.org/production
--nate
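The settings recommended so far in this thread can be checked with a short script. This is a hedged sketch rather than anything Galaxy ships: the function name `find_mismatches` is hypothetical, and it assumes the options live in the `[app:main]` section of universe_wsgi.ini. Interpolation is disabled so that `%(...)s` placeholders elsewhere in the file do not trip the parser.

```python
import configparser

# Option names and values are the ones given in the replies above.
EXPECTED = {
    "start_job_runners": "drmaa",
    "default_cluster_job_runner": "drmaa:///",
    "set_metadata_externally": "True",
}


def find_mismatches(ini_text, section="app:main"):
    """Return {option: (expected, actual)} for settings that differ.

    `actual` is None when the option (or the whole section) is missing.
    The section name is an assumption about the config file layout.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    mismatches = {}
    for option, expected in EXPECTED.items():
        actual = parser.get(section, option, fallback=None)
        if actual is None or actual.strip().lower() != expected.lower():
            mismatches[option] = (expected, actual)
    return mismatches
```

To use it, read the file contents (for example `open("universe_wsgi.ini").read()`) and pass the text to `find_mismatches`; an empty dictionary means all three options match.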
I just want to confirm the SGE configuration again. As mentioned earlier, we started with a separate Galaxy VM without any SGE installation. The SGE master node is installed on a separate system altogether. As I understand from your reply, we will need to install SGE on the Galaxy VM first and configure it as a submit host with the main SGE master node. Is that correct? Are there any alternative approaches?
-- Thanks, Shantanu.
I'm not an expert in SGE, so perhaps someone else can verify my response, but AFAIK there is no way to submit to a remote SGE server via DRMAA without having a local SGE installation to reference. You *might* be able to get away with just copying libdrmaa.so into the VM and some parts of the SGE config which define where the qmaster is, but I don't know what parts you'd need. I'd suggest asking in the SGE community if you'd like a definite answer since it's not a Galaxy-specific issue. --nate
Yes, your web server needs to be configured as an SGE submit host to work seamlessly with Galaxy. Alternatives include submitting the jobs to the cluster outside of Galaxy using another script that will either ssh or use expect. These alternatives are messy and should be avoided unless necessary. Conditions that would require them include submitting to multiple clusters or queues (e.g. user-specific queues, application-specific clusters) or needing cluster jobs to be submitted as individual users rather than as the galaxy user (e.g. for accounting).
On Mon, May 16, 2011 at 10:44 AM, Shantanu Pavgi <pavgi@uab.edu> wrote:
I just want to confirm the SGE configuration again. As mentioned earlier, we started with a separate Galaxy VM without any SGE installation. The SGE master node is installed on a separate system altogether. As I understand from your reply, we will need to install SGE on the Galaxy VM first and configure it as a submit host with the main SGE master node. Is that correct? Are there any alternative approaches?
On Jun 17, 2011, at 2:12 PM, Edward Kirton wrote:
Yes, your web server needs to be configured as an SGE submit host to work seamlessly with Galaxy. Alternatives include submitting the jobs to the cluster outside of Galaxy using another script that will either ssh or use expect. These alternatives are messy and should be avoided unless necessary. Conditions that would require them include submitting to multiple clusters or queues (e.g. user-specific queues, application-specific clusters) or needing cluster jobs to be submitted as individual users rather than as the galaxy user (e.g. for accounting).
Thanks for the reply, Edward and Nate. We got it working by configuring the Galaxy system as a submit host to the master SGE node, but I forgot to follow up on the thread afterwards. Thanks again for your input.
-- Shantanu.
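For anyone repeating the submit-host setup, one way to confirm the registration is to look for the Galaxy machine in the list printed by `qconf -ss` (an SGE administrator adds a host with `qconf -as <hostname>`). The sketch below is an illustration under stated assumptions, not a Galaxy or SGE utility: both function names are hypothetical, and the short-name comparison is a convenience that may be too permissive at sites where two domains share a host name.

```python
import socket
import subprocess


def is_submit_host(qconf_output, hostname):
    """Return True if hostname appears in the text printed by `qconf -ss`.

    Matching is case-insensitive and also accepts a match on the short
    host name, since sites differ in whether they register FQDNs.
    """
    wanted = hostname.strip().lower()
    wanted_short = wanted.split(".")[0]
    for line in qconf_output.splitlines():
        listed = line.strip().lower()
        if not listed:
            continue
        if listed == wanted or listed.split(".")[0] == wanted_short:
            return True
    return False


def current_host_is_submit_host():
    """Run `qconf -ss` locally and check this machine's name.

    Needs the SGE client tools on PATH and a configured SGE environment,
    so it only works on a machine that can already reach the qmaster.
    """
    result = subprocess.run(
        ["qconf", "-ss"], capture_output=True, text=True, check=True
    )
    return is_submit_host(result.stdout, socket.getfqdn())
```

Calling `current_host_is_submit_host()` on the Galaxy machine should return `True` once the host has been registered; `is_submit_host` can also be used on output copied from another machine.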
participants (3)
- Edward Kirton
- Nate Coraor
- Shantanu Pavgi