Clusters, Runners, and user credentials
I'm a systems administrator for an HPC cluster, and a faculty member here has asked me to try to get Galaxy working on our cluster. There are one or two outstanding questions that I can't seem to find the answer to, and I'm hoping someone here can help me out.

In particular, is Galaxy, and the PBS runner specifically, capable of submitting jobs under specific user names? Essentially, if I set up Galaxy to push jobs to our cluster, will they all show up under one user credential (e.g. the "galaxy" user), or can we set it up so that each job is submitted as the user who is logged into Galaxy?

This one is kind of a show-stopper, since our internal policies require that all jobs have a specific user credential, with one person per username.

Thanks,
Lloyd

--
Lloyd Brown
Systems Administrator
Fulton Supercomputing Lab
Brigham Young University
http://marylou.byu.edu
Lloyd,

See Nate's email below (title: "Actual user code"). We have been working on implementing this feature in Galaxy. The code is still in development, but feel free to test it out and let us know how it works for you.

Best,
Ilya

-----Original Message----- From: galaxy-dev-bounces@lists.bx.psu.edu [mailto:galaxy-dev-bounces@lists.bx.psu.edu] On Behalf Of Lloyd Brown Sent: Monday, October 31, 2011 2:35 PM To: Galaxy Dev List Subject: [galaxy-dev] Clusters, Runners, and user credentials

___________________________________________________________ Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
BTW, I am not sure whether PBS works with DRMAA. If not, then the code will need to be ported to work with PBS.

Ilya

-----Original Message----- From: galaxy-dev-bounces@lists.bx.psu.edu [mailto:galaxy-dev-bounces@lists.bx.psu.edu] On Behalf Of Chorny, Ilya Sent: Monday, October 31, 2011 3:27 PM To: Lloyd Brown; Galaxy Dev List Subject: Re: [galaxy-dev] Clusters, Runners, and user credentials
Many of us are using the PBS job runner (for TORQUE) and would definitely be interested in a port. How do you make sure the user's environment is configured properly? We use a Python virtualenv and load specific module files with tested tool versions in our galaxy user's startup scripts on our cluster.

On Oct 31, 2011, at 6:29 PM, "Chorny, Ilya" <ichorny@illumina.com> wrote:
BTW, I am not sure if PBS works with drmaa. If not then the code will need to be ported to work with pbs.
Ilya
I modified drmaa.py to pass the galaxy user's path variable to the actual user. As long as the galaxy user's environment is correct, the actual user's environment should be correct.

-----Original Message----- From: Glen Beane [mailto:Glen.Beane@jax.org] Sent: Monday, October 31, 2011 4:20 PM To: Chorny, Ilya Cc: Lloyd Brown; Galaxy Dev List Subject: Re: [galaxy-dev] Clusters, Runners, and user credentials
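The environment pass-through Ilya describes could look roughly like the sketch below. It is not the actual drmaa.py change, which is not shown in this thread. It assumes "path variable" means `PATH`, and the helper names `passthrough_env` and `export_lines` are made up for illustration. The idea is to copy only a chosen set of variables from the galaxy user's environment and hand them to the job that runs as the actual user.

```python
import os
import shlex


def passthrough_env(source_env, names=("PATH",)):
    """Return only the named variables that are set in source_env.

    Hypothetical helper: copying a short whitelist avoids leaking the
    rest of the galaxy user's environment into the actual user's job.
    """
    return {name: source_env[name] for name in names if name in source_env}


def export_lines(env):
    """Render env as shell export statements for the top of a job script."""
    return [
        "export {}={}".format(name, shlex.quote(value))
        for name, value in sorted(env.items())
    ]


if __name__ == "__main__":
    # VIRTUAL_ENV is included only as an example of a second variable
    # a site using a Python virtualenv might want to carry over.
    env = passthrough_env(os.environ, names=("PATH", "VIRTUAL_ENV"))
    print("\n".join(export_lines(env)))
```

Whether the variables are written into the job script, as here, or set on the job template depends on the runner; a site that loads module files in startup scripts would probably need more than `PATH`.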
I recall at the Galaxy conf there were questions on how secure this is (having the 'galaxy' user submit jobs as someone else). This would involve switching users on the cluster or would require user login information, correct?

The way we planned on working around this was to just specify a user account string (using '-A') instead of bothering with switching users. I believe our local cluster disallows switching users via PBS unless the submitter has admin privs, but the accounting string works fine (I suppose one could use the project option as well).

chris

On Oct 31, 2011, at 6:30 PM, Chorny, Ilya wrote:
I modified drmaa.py to pass the galaxy users path variable to the actual user. As long as the galaxy user's environment is correct then the actual user's environment should be correct.
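The accounting-string workaround Chris mentions can be sketched as follows. This is only an illustration: `qsub -A` is the PBS/TORQUE option for an account string, but the function name `qsub_command`, the allowed-character rule, and the idea of deriving the string from the Galaxy user are assumptions, not anything Galaxy is known to do. Note that the job is still submitted by the galaxy account; only the accounting attribution changes.

```python
import re

# Assumed site policy: account strings are short and contain only
# these characters. Adjust to whatever the local scheduler accepts.
_SAFE_ACCOUNT = re.compile(r"[A-Za-z0-9_.@-]+")


def qsub_command(script_path, account_string):
    """Build a qsub argument list that tags the job with an account string.

    The list form is meant for subprocess without a shell, so the
    account string is never interpreted by a shell.
    """
    if not _SAFE_ACCOUNT.fullmatch(account_string):
        raise ValueError("unsafe account string: {!r}".format(account_string))
    return ["qsub", "-A", account_string, script_path]


if __name__ == "__main__":
    # "lbrown" stands in for whatever identifier the site maps a
    # Galaxy user to; the mapping itself is not shown here.
    print(qsub_command("galaxy_job.sh", "lbrown"))
```

This does not meet a policy that requires each job to run under its owner's username, but it may be enough where the concern is accounting rather than process ownership.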
Ilya, Nate,

To add a bit of background to the below, we have several clusters on campus that use very different accounting systems; some run a regular cron job to process job run info, while others use a qsub wrapper to check service units prior to job submission (a byproduct of being part of TeraGrid/XSEDE). It seems the most direct route to work around accounting-level differences is to submit the job as a user (so I'm interested in this solution), but the security questions I mention below were raised by a number of our local cluster sysadmins as well as (if I'm not mistaken) at the conference.

Were these ever addressed, or is it considered a non-issue? Apologies about re-sending; I didn't know if this had been answered elsewhere, but this was a serious concern that may block us from using some pretty nice HPC resources.

chris

On Nov 1, 2011, at 4:59 PM, Fields, Christopher J wrote:
I recall at the Galaxy conf there were questions on how secure this is (having the 'galaxy' user submit jobs as someone else). This would involve switching users on the cluster or would require user login information, correct?
The way we planned on working around this was to just specify a user account string (using '-A') instead of bothering with switching users. I believe our local cluster disallows switching users via PBS unless the submitter has admin privs, but the accounting string works fine (I suppose one could use the project option as well).
chris
Hi Chris,

Ilya's solution uses sudo to submit the job via drmaa after switching to the actual user's uid and gid. This means giving your Galaxy user sudo rights to run 3 scripts as root:

* A script to submit jobs
* A script to kill jobs
* A script to chown a directory

This could be tightened up a bit, in the case of the first two by sudoing directly to the user, rather than to root and then setuid()ing. In the case of the latter script, a path is passed to the script rather than a Galaxy job id, so it could be used by the Galaxy user to chown anything that root can chown. In addition, if your Galaxy data lives in NFS with root squashing enabled, this script would fail.

Of course, the paths to these scripts are configurable, so they can be replaced with site-suitable versions.

Another option to avoid sudo entirely would be for Galaxy to start as root and then drop privileges, but I am not incredibly fond of this solution, since it allows for the possibility of privilege separation exploits. Perhaps a stripped-down Galaxy data daemon that runs with elevated privileges, whose sole job it is to manage permissions and move data?

As with the existing Galaxy implementation, Galaxy's data is not copied around at job runtime for tool input; it simply exists in one place and is expected to be locatable on the cluster resource at the same path. My next development goal is to remove this limitation.

The assumption is also made that tool inputs are readable by the actual user, which was a problem in some environments.

If administrators prefer to give the Galaxy account the permission to run jobs as other users directly in the DRM, this would certainly solve the problem. Galaxy would just need minor modification to take advantage of the feature.

As you probably recall, there were many people at the GCC brainstorming this problem, and I don't recall that we ever came up with the perfectly secure solution. This solution may be good enough for some sites.

If there's a desire for tightened security, I would be happy to review and accept any work done on that. =)

--nate
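The two tightenings Nate suggests (sudo directly to the target user, and deriving the chown target from a job id instead of accepting an arbitrary path) might be sketched like this. It is an illustration under stated assumptions, not the scripts from Ilya's code: `JOB_ROOT`, the script path, and all function names are hypothetical, and the username and job id rules are guesses at a reasonable site policy. `sudo -n -u USER` runs a command as that user without prompting.

```python
import os
import re

# Hypothetical location of per-job working directories.
JOB_ROOT = "/galaxy/job_working_directory"

_USERNAME = re.compile(r"[a-z_][a-z0-9_-]*")
_JOB_ID = re.compile(r"[0-9]+")


def sudo_as_user(username, script, *args):
    """Build a sudo command that runs script as username, not as root."""
    if not _USERNAME.fullmatch(username):
        raise ValueError("unexpected username: {!r}".format(username))
    return ["sudo", "-n", "-u", username, script] + list(args)


def job_dir_for(job_id, job_root=JOB_ROOT):
    """Map a numeric job id to its working directory under job_root."""
    job_id = str(job_id)
    if not _JOB_ID.fullmatch(job_id):
        raise ValueError("job id must be numeric: {!r}".format(job_id))
    return os.path.join(job_root, job_id)


def is_inside(path, root):
    """True if path resolves to a location strictly below root."""
    real_path = os.path.realpath(path)
    real_root = os.path.realpath(root)
    if real_path == real_root:
        return False
    return os.path.commonpath([real_path, real_root]) == real_root


if __name__ == "__main__":
    print(sudo_as_user("lbrown", "/opt/galaxy/scripts/submit_job.sh", "job.sh"))
    print(job_dir_for(42), is_inside(job_dir_for(42), JOB_ROOT))
```

A privileged chown helper that accepts only a job id and refuses anything for which `is_inside` is false can no longer be pointed at arbitrary paths. It does nothing for the root-squashing case, where the chown fails regardless.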
On Nov 3, 2011, at 9:50 AM, Nate Coraor wrote:
Hi Chris,
Ilya's solution uses sudo to submit the job via drmaa after switching to the actual user's uid and gid. This means giving your Galaxy user sudo rights to run 3 scripts as root:
* A script to submit jobs
* A script to kill jobs
* A script to chown a directory
This could be tightened up a bit, in the case of the first two by sudoing directly to the user, rather than to root and then setuid()ing. In the case of the latter script, a path is passed to the script rather than a Galaxy job id, so it could be used by the Galaxy user to chown anything that root can chown. In addition, if your Galaxy data lives in NFS with root squashing enabled, this script would fail.
Yes, we may run into this or worse; we're setting up gpfs locally for our NFS.
Of course, the paths to these scripts are configurable, so they can be replaced with site-suitable versions.
Another option to avoid sudo entirely would be for Galaxy to start as root and then drop privileges, but I am not incredibly fond of this solution, since it allows for the possibility of privilege separation exploits. Perhaps a stripped down Galaxy data daemon that runs with elevated privileges, whose sole job it is to manage permissions and move data?
That sounds like a feasible option.
As with the existing Galaxy implementation, Galaxy's data is not copied around at job runtime for tool input, it simply exists in one place and is expected to be locatable on the cluster resource at the same path. My next development goal is to remove this limitation.
This is something we will run into at some point, particularly with some of the NCSA resources (where user paths are quite different from other clusters on campus).
The assumption is also made that tool inputs are readable by the actual user, which was a problem in some environments.
If administrators prefer to give the Galaxy account the permission to run jobs as other users directly in the DRM, this would certainly solve the problem. Galaxy would just need minor modification to take advantage of the feature.
As you probably recall, there were many people at the GCC brainstorming this problem, and I don't recall that we ever came up with the perfectly secure solution. This solution may be good enough for some sites.
Right, I think this is more a problem when the cluster is not under our control and has already been configured. Not that it's impossible, but there is definitely an additional level of sysadmin concerns we have to deal with. And with multiple clusters (with multiple configurations, sysadmins, etc) this becomes more complex. We're deploying step-wise (on one cluster initially, then others down the road) for this reason.
If there's a desire for tightened security, I would be happy to review and accept any work done on that. =)
--nate
That is a possibility. We have had initial talks with a few of the myproxy folks here about security concerns and possible solutions for user job submissions (unfortunately, nothing much has come of it yet beyond what you already covered).

chris
Fields, Christopher J wrote:
Ilya, Nate,
To add a bit of background to the below, we have several clusters on campus that use very different accounting systems; some run as a regular cron job to process job run info, however others use a qsub wrapper to check service units prior to job submission (a byproduct of being part of teragrid/xcede). It seems the most direct route to work around accounting-level differences is to submit the job as a user (so I'm interested in this solution), but the below security questions I mentioned were raised by a number of our local cluster sysadmins as well as (if I'm not mistaken) at the conference.
Were these ever addressed, or is it considered an non-issue? Apologies about re-sending, I didn't know if this had been answered elsewhere, but this was a serious concern that may block us from using some pretty nice HPC resources.
chris
On Nov 1, 2011, at 4:59 PM, Fields, Christopher J wrote:
I recall at the Galaxy conf there were questions on how secure this is (having the 'galaxy' user submit jobs as someone else). This would involve switching users on the cluster or would require user login information, correct?
The way we planned on working around this was to just specify a user account string (using '-A') instead of bothering with switching users. I believe our local cluster disallows switching users via PBS unless the submitter has admin privs, but the accounting string works fine (I suppose one could use the project option as well).
chris
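The accounting-string approach described above can be sketched roughly as follows. This is only an illustration, not code from Galaxy: `qsub -A` is the standard PBS/TORQUE option for an account string, but the helper names and the `ACCOUNT_FOR_USER` mapping from Galaxy logins to account strings are hypothetical, and a real site would supply its own lookup.

```python
import shlex

# Hypothetical mapping from a Galaxy login to a cluster accounting string.
# A real deployment would look this up from site-specific records.
ACCOUNT_FOR_USER = {"alice@example.edu": "alice_lab"}


def build_qsub_command(script_path, account, extra_args=()):
    """Build a qsub argument list that tags the job with an account string.

    The job is still submitted by whichever Unix user runs qsub (for example
    the "galaxy" user); only the accounting string changes.
    """
    if not account or any(ch.isspace() for ch in account):
        raise ValueError("account string must be non-empty and contain no whitespace")
    return ["qsub", "-A", account, *extra_args, script_path]


def command_for_galaxy_user(email, script_path):
    """Look up the (hypothetical) account string for a Galaxy user."""
    return build_qsub_command(script_path, ACCOUNT_FOR_USER[email])


if __name__ == "__main__":
    cmd = command_for_galaxy_user("alice@example.edu", "/tmp/job.sh")
    print(shlex.join(cmd))  # qsub -A alice_lab /tmp/job.sh
```

Note that this addresses accounting only. It would not satisfy a policy that requires each job to run under the submitting person's own username.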
On Oct 31, 2011, at 6:30 PM, Chorny, Ilya wrote:
I modified drmaa.py to pass the galaxy user's PATH variable to the actual user. As long as the galaxy user's environment is correct, the actual user's environment should be correct.
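The actual change to drmaa.py is not shown in this thread, so the following is only a minimal sketch of the idea: copy selected variables (here just PATH) from the submitting process's environment into a dictionary that is handed to the job. The function name `job_environment` is hypothetical. With the drmaa Python bindings, such a dictionary could presumably be assigned to a job template's `jobEnvironment` attribute, but that wiring is an assumption and is left out so the sketch runs without a DRM installed.

```python
import os


def job_environment(base_env=None, keys=("PATH",)):
    """Return a dict of selected variables copied from the submitting process.

    base_env defaults to os.environ, i.e. the environment of the process
    doing the submission (the "galaxy" user in this thread). Variables that
    are not set in base_env are skipped.
    """
    source = os.environ if base_env is None else base_env
    return {key: source[key] for key in keys if key in source}


if __name__ == "__main__":
    # The resulting dict is what would be passed along with the job so that
    # the actual user sees the same PATH as the galaxy user.
    print(job_environment())
```

Passing only named variables rather than the whole environment keeps the galaxy user's unrelated settings out of the actual user's job.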
-----Original Message----- From: Glen Beane [mailto:Glen.Beane@jax.org] Sent: Monday, October 31, 2011 4:20 PM To: Chorny, Ilya Cc: Lloyd Brown; Galaxy Dev List Subject: Re: [galaxy-dev] Clusters, Runners, and user credentials
Many of us are using the PBS job runner (for TORQUE) and would definitely be interested in a port.
How do you deal with making sure the user's environment is configured properly? We use a python virtualenv and load specific module files with tested tool versions in our galaxy users startup scripts on our cluster.
On Oct 31, 2011, at 6:29 PM, "Chorny, Ilya" <ichorny@illumina.com> wrote:
BTW, I am not sure if PBS works with drmaa. If not then the code will need to be ported to work with pbs.
Ilya
-----Original Message----- From: galaxy-dev-bounces@lists.bx.psu.edu [mailto:galaxy-dev-bounces@lists.bx.psu.edu] On Behalf Of Chorny, Ilya Sent: Monday, October 31, 2011 3:27 PM To: Lloyd Brown; Galaxy Dev List Subject: Re: [galaxy-dev] Clusters, Runners, and user credentials
Lloyd,
See Nate's email below Title: Actual user code. We have been working on implementing this feature in galaxy. The code is still in development but feel free to test it out and let us know how it works for you.
Best,
Ilya
-----Original Message----- From: galaxy-dev-bounces@lists.bx.psu.edu [mailto:galaxy-dev-bounces@lists.bx.psu.edu] On Behalf Of Lloyd Brown Sent: Monday, October 31, 2011 2:35 PM To: Galaxy Dev List Subject: [galaxy-dev] Clusters, Runners, and user credentials
I'm a systems administrator for an HPC cluster, and have been asked by a faculty member here to try to get galaxy to work on our cluster. Unfortunately, there are one or two outstanding questions that I can't seem to find the answer to, and I'm hoping someone here can help me out.
In particular, is galaxy, and the PBS runner specifically, capable of submitting jobs under specific user names? Essentially, if I set up galaxy to push jobs to our cluster, will they all show up under one user credential (e.g. the "galaxy" user), or can we set it up so that the user logged into galaxy is used to submit the job?
This one is kind of a show-stopper, since our internal policies require that all jobs have a specific user credential, with one person per username.
Thanks, Lloyd
-- Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu
Glen Beane wrote:
Many of us are using the PBS job runner (for TORQUE) and would definitely be interested in a port.
Technically, using the drmaa runner with TORQUE is supposed to work. I just tried it here to test Ilya's code, and Galaxy was segfaulting when trying to interact with TORQUE's libdrmaa while setting up the job template. I didn't look into it any further; I'm using a fairly old TORQUE client here, so I suspect it may just be due to that.
--nate
On Nov 2, 2011, at 4:57 PM, Nate Coraor wrote:
Glen Beane wrote:
Many of us are using the PBS job runner (for TORQUE) and would definitely be interested in a port.
Technically, using the drmaa runner with TORQUE is supposed to work. I just tried it here to test Ilya's code, and Galaxy was segfaulting when trying to interact with TORQUE's libdrmaa while setting up the job template. I didn't look into it any further; I'm using a fairly old TORQUE client here, so I suspect it may just be due to that.
I don't believe TORQUE's DRMAA support receives that much attention these days.
-- Glen L. Beane Senior Software Engineer The Jackson Laboratory (207) 288-6153
participants (5)
- Chorny, Ilya
- Fields, Christopher J
- Glen Beane
- Lloyd Brown
- Nate Coraor