Galaxy sites usually do all work on a compute cluster, with all jobs submitted
as a "galaxy" unix user, so there isn't any "fair-share" accounting between
users.
Other sysops have created a solution to run jobs as the actual unix user,
which may be feasible for an intranet site but is undesirable for a site
accessible via the internet due to security reasons.
A simpler and more secure method to enable fair-share is by using projects.
Here's a simple scenario and straightforward solution: Multiple groups in
an organization use the same galaxy site and it is desirable to enable
fair-share accounting between the groups. All users in a group consume the
same fair-share, which is generally acceptable.
1) Configure the scheduler with a project for each group, configure each user
to use their group's project by default, and grant the galaxy user access to
submit jobs to any project; all users should be associated with a project.
There's a good chance your grid is already configured this way.
2) Create a database which maps galaxy user id to a project; I use a cron
job to create a standalone sqlite3 db. Since this is site-specific, code
is not provided but hints are given below. Rather than having a separate
database, the project could have been added to the galaxy db, but I sought to
minimize my changes.
3) Add a snippet of code to drmaa.py's queue_job method to look up the project
from job_wrapper.user_id and append it to jt.nativeSpecification; see below.
Here are the changes required. They're small enough that I didn't do this as
a clone/patch.
(1) lib/galaxy/jobs/runners/drmaa.py:
    import sqlite3
    ...
    native_spec = self.get_native_spec( runner_url )

    # BEGIN ADD USER'S PROJ
    if self.app.config.user_proj_map_db is not None:
        try:
            conn = sqlite3.connect(self.app.config.user_proj_map_db)
            try:
                c = conn.cursor()
                c.execute('SELECT PROJ FROM USER_PROJ WHERE GID=?',
                          [job_wrapper.user_id])
                row = c.fetchone()
            finally:
                # always release the db handle, even if the query fails
                conn.close()
            if row is not None:
                # native_spec may be empty if the runner url has no options
                native_spec = (native_spec or '') + ' -P ' + row[0]
            else:
                log.debug("No proj mapped for user %s" %
                          job_wrapper.user_id)
        except Exception:
            log.debug("Cannot look up proj of user %s" %
                      job_wrapper.user_id)
    # END ADD USER'S PROJ
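The lookup can also be tried outside of Galaxy before editing drmaa.py. Below is a minimal sketch, assuming the USER_PROJ table with GID and PROJ columns used in the query above; the function name add_project is hypothetical and not part of Galaxy.

```python
import sqlite3


def add_project(native_spec, db_path, galaxy_user_id):
    """Return native_spec with ' -P <proj>' appended when the user is mapped.

    add_project is a hypothetical helper for testing the lookup; only the
    table and column names (USER_PROJ, GID, PROJ) come from the snippet above.
    """
    conn = sqlite3.connect(db_path)
    try:
        row = conn.execute(
            'SELECT PROJ FROM USER_PROJ WHERE GID=?', (galaxy_user_id,)
        ).fetchone()
    finally:
        conn.close()
    if row is None:
        # No mapping for this user: leave the native spec untouched.
        return native_spec
    # Treat a missing native spec as empty so the concatenation cannot fail.
    return ((native_spec or '') + ' -P ' + row[0]).strip()
```

Calling add_project('-q long.q', '/some/path/to/user_proj_map_db.sqlite', 42) would return the original options plus the project of galaxy user 42, or the options unchanged if that user has no row in the table.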
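(2) lib/galaxy/config.py: add support for user_proj_map_db variable
    # only resolve the path when the option is set, so an unset option stays None
    self.user_proj_map_db = kwargs.get( "user_proj_map_db", None )
    if self.user_proj_map_db is not None:
        self.user_proj_map_db = resolve_path( self.user_proj_map_db, self.root )
(3) universe_wsgi.ini:
    user_proj_map_db = /some/path/to/user_proj_map_db.sqlite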
(4) Here are some suggestions to help get you started on a script to make the
sqlite3 db.
a) parse ldap tree example: (to get uid:email)
    ldapsearch -LLL -x -b 'ou=aliases,dc=jgi,dc=gov'
b) parse scheduler config: (to get uid:proj)
    qconf -suserl | /usr/bin/xargs -I '{}' qconf -suser '{}' | egrep 'name|default_project'
c) query galaxy db: (to get gid:email)
    select id, email from galaxy_user;
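The three hints above might be combined roughly as follows. This is only a sketch under stated assumptions: the function names are hypothetical, the qconf output is assumed to consist of "name <uid>" and "default_project <proj>" lines with NONE meaning no default project, and the ldap and galaxy query results are assumed to have already been parsed into dicts (that part is site-specific). Only the USER_PROJ table with GID and PROJ columns is taken from the drmaa.py snippet.

```python
import sqlite3


def parse_qconf_users(text):
    """Turn filtered 'qconf -suser' output into a {uid: project} dict.

    Assumes each user appears as a 'name <uid>' line followed by a
    'default_project <proj>' line; 'NONE' is taken to mean no project.
    """
    uid_proj = {}
    uid = None
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue
        key, value = fields
        if key == 'name':
            uid = value
        elif key == 'default_project' and uid is not None and value != 'NONE':
            uid_proj[uid] = value
    return uid_proj


def build_user_proj_db(db_path, gid_email, email_uid, uid_proj):
    """Join gid:email, email:uid and uid:proj, then write USER_PROJ(GID, PROJ).

    email_uid keys are assumed to be lowercase. Returns the number of rows written.
    """
    rows = []
    for gid, email in gid_email.items():
        uid = email_uid.get(email.lower())
        proj = uid_proj.get(uid)
        if proj is not None:
            rows.append((gid, proj))
    conn = sqlite3.connect(db_path)
    try:
        # Rebuild the table from scratch on every cron run.
        conn.execute('DROP TABLE IF EXISTS USER_PROJ')
        conn.execute(
            'CREATE TABLE USER_PROJ (GID INTEGER PRIMARY KEY, PROJ TEXT NOT NULL)'
        )
        conn.executemany('INSERT INTO USER_PROJ (GID, PROJ) VALUES (?, ?)', rows)
        conn.commit()
    finally:
        conn.close()
    return len(rows)
```

Users without a matching email or default project are simply left out of the table, so their jobs are submitted without a -P option. Since Galaxy may read the file while cron is rewriting it, one option is to build the db under a temporary name and rename it into place.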
The limitation of this method is that all jobs submitted by a user will
always be charged to the same project (which may be okay, depending on how
your organization uses projects). However, a user may have access to
several projects and may wish to associate some jobs with a particular
project. This could be accomplished by adding an option to the user
preferences; a user would choose a project from their available projects, and
any jobs submitted would have to record the currently chosen project.
Alternatively, histories could be associated with a particular project.
This solution would require significant changes to Galaxy, so I haven't
implemented it (and the simple solution works well enough for me).
Edward Kirton
US DOE JGI