
Dear All,

I am using Galaxy with Condor as the scheduler via the DRMAA Job Runner on Debian 6. Jobs are submitted to Condor, but they are held because of permissions:

condor@vm-debian:~$ condor_q -analyze 10.0
-- Submitter: vm-debian.uchad.uchospitals.edu : <165.68.219.43:40312> : vm-debian.uchad.uchospitals.edu
---
010.000: Request is held.
Hold reason: Error from vm-debian.uchad.uchospitals.edu: Failed to open '/home/galaxy/galaxy-dist/database/pbs/19.o' as standard output: Permission denied (errno 13)

condor@vm-debian:~$ ls -l /home/galaxy/galaxy-dist/database/pbs/19.o
-rw-r--r-- 1 galaxy galaxy 0 Oct 4 16:40 /home/galaxy/galaxy-dist/database/pbs/19.o

I am running Galaxy as the galaxy user. I also have a condor user. How can I ensure that jobs are submitted as the proper user?

Thanks so much,
Oren
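For readers following along, the ls -l listing above (mode -rw-r--r--, owner and group galaxy) can be checked against the usual POSIX permission rules. The sketch below is only an illustration: the function name can_write and the numeric ids are made up, and it ignores root and ACLs.

```python
import stat


def can_write(mode, file_uid, file_gid, uid, gids):
    """Apply the classic owner/group/other check for a non-root process.

    Hypothetical helper for illustration; it ignores root and ACLs.
    """
    if uid == file_uid:
        return bool(mode & stat.S_IWUSR)
    if file_gid in gids:
        return bool(mode & stat.S_IWGRP)
    return bool(mode & stat.S_IWOTH)


# -rw-r--r-- as in the listing; the numeric ids below are invented.
MODE = 0o644
GALAXY_UID, GALAXY_GID = 1001, 1001
OTHER_UID, OTHER_GID = 1002, 1002

owner_ok = can_write(MODE, GALAXY_UID, GALAXY_GID, GALAXY_UID, {GALAXY_GID})
other_ok = can_write(MODE, GALAXY_UID, GALAXY_GID, OTHER_UID, {OTHER_GID})
group_ok = can_write(MODE, GALAXY_UID, GALAXY_GID, OTHER_UID, {GALAXY_GID})

print(owner_ok, other_ok, group_ok)
```

With this mode only the owning account may write the file. Membership in the galaxy group is not enough, because the group write bit is not set.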

You might want to ask about this on condor-users@cs.wisc.edu, if it is a Condor problem. (cc'd)

On Tue, Oct 04, 2011 at 04:53:24PM -0500, Oren Livne wrote:
Dear All,
I am using Galaxy with Condor as the scheduler via the DRMAA Job Runner on Debian 6. Jobs are submitted to Condor, but they are held because of permissions:
condor@vm-debian:~$ condor_q -analyze 10.0
-- Submitter: vm-debian.uchad.uchospitals.edu : <165.68.219.43:40312> : vm-debian.uchad.uchospitals.edu
---
010.000: Request is held.
Hold reason: Error from vm-debian.uchad.uchospitals.edu: Failed to open '/home/galaxy/galaxy-dist/database/pbs/19.o' as standard output: Permission denied (errno 13)
condor@vm-debian:~$ ls -l /home/galaxy/galaxy-dist/database/pbs/19.o
-rw-r--r-- 1 galaxy galaxy 0 Oct 4 16:40 /home/galaxy/galaxy-dist/database/pbs/19.o
I am running Galaxy as the galaxy user. I also have a condor user. How can I ensure that jobs are submitted as the proper user?
Which user does condor believe you to be?
Thanks so much, Oren

Dear Nick,

Haven't heard back yet from condor. I found in previous posts and the condor docs that this might have to do with the condor UID_DOMAIN config variable, which is set to $(FULL_HOSTNAME) in my case, as recommended therein. Also, "condor_master" is being run as root.

Which user does condor believe you to be?

The user running the "condor_submit" command is "galaxy", which has permissions to write that output file. But I don't know if this is the relevant uid or if condor is trying to open that file as a different user within the condor_submit command.

Oren

On Wed, Oct 05, 2011 at 11:11:58AM -0500, Oren Livne wrote:
Dear Nick,
Haven't heard back yet from condor. I found in previous posts and the condor docs that this might have to do with the condor UID_DOMAIN config variable, which is set to $(FULL_HOSTNAME) in my case, as recommended therein. Also, "condor_master" is being run as root.
Which user does condor believe you to be?
The user running the "condor_submit" command is "galaxy", which has permissions to write that output file. But I don't know if this is the relevant uid or if condor is trying to open that file as a different user within the condor_submit command.
What is the output of the following?

condor_q -f '%s\n' FileSystemDomain <job>
condor_q -f '%s\n' Owner <job>
condor_q -f '%s\n' User <job>

Nathan Panike

Dear Nathan,

Thank you so much for your help.

condor@vm-debian:/home/galaxy/galaxy-dist/database/pbs$ condor_q
-- Submitter: vm-debian.uchad.uchospitals.edu : <165.68.219.43:56855> : vm-debian.uchad.uchospitals.edu
 ID      OWNER    SUBMITTED     RUN_TIME ST PRI SIZE CMD
 18.0    galaxy   10/5  12:01   0+00:00:01 H  0   0.0  galaxy_31.sh
1 jobs; 0 idle, 0 running, 1 held

condor@vm-debian:/home/galaxy/galaxy-dist/database/pbs$ condor_q -f '%s\n' FileSystemDomain 18.0
uchad.uchospitals.edu
condor@vm-debian:/home/galaxy/galaxy-dist/database/pbs$ condor_q -f '%s\n' Owner 18.0
galaxy
condor@vm-debian:/home/galaxy/galaxy-dist/database/pbs$ condor_q -f '%s\n' User 18.0
galaxy@uchad.uchospitals.edu

We added the condor user to the galaxy user group, but that didn't help.

Oren

Dear Nathan,

We found that condor was running the job as the "nobody" user, which had no permissions to the files galaxy created in the database/pbs directory. This was fixed by setting

TRUST_UID_DOMAIN = True

in the condor config file.

Now the job runs, but fails immediately because it cannot find the PATH environment variable. If I run the generated database/pbs/galaxy_<jobID>.sh file manually at the command line, it works fine. Do we need some environment setting in galaxy/condor?

Traceback (most recent call last):
  File "/home/galaxy/galaxy-dist/tools/stats/column_maker.py", line 7, in <module>
    from galaxy.tools import validation
  File "/home/galaxy/galaxy-dist/lib/galaxy/tools/__init__.py", line 15, in <module>
    from galaxy import util, jobs, model
  File "/home/galaxy/galaxy-dist/lib/galaxy/jobs/__init__.py", line 4, in <module>
    from galaxy import util, model
  File "/home/galaxy/galaxy-dist/lib/galaxy/model/__init__.py", line 13, in <module>
    import galaxy.datatypes.registry
  File "/home/galaxy/galaxy-dist/lib/galaxy/datatypes/registry.py", line 6, in <module>
    import data, tabular, interval, images, sequence, qualityscore, genetics, xml, coverage, tracks, chrominfo, binary, assembly, ngsindex, wsf
  File "/home/galaxy/galaxy-dist/lib/galaxy/datatypes/data.py", line 7, in <module>
    import metadata
  File "/home/galaxy/galaxy-dist/lib/galaxy/datatypes/metadata.py", line 5, in <module>
    from galaxy.web import form_builder
  File "/home/galaxy/galaxy-dist/lib/galaxy/web/__init__.py", line 5, in <module>
    from framework import expose, json, json_pretty, require_login, require_admin, url_for, error, form, FormBuilder, expose_api
  File "/home/galaxy/galaxy-dist/lib/galaxy/web/framework/__init__.py", line 18, in <module>
    import helpers
  File "/home/galaxy/galaxy-dist/lib/galaxy/web/framework/helpers/__init__.py", line 4, in <module>
    from webhelpers import *
  File "/home/galaxy/galaxy-dist/eggs/WebHelpers-0.2-py2.6.egg/webhelpers/__init__.py", line 1, in <module>
    from webhelpers.rails import *
  File "/home/galaxy/galaxy-dist/eggs/WebHelpers-0.2-py2.6.egg/webhelpers/rails/__init__.py", line 9, in <module>
    from text import *
  File "/home/galaxy/galaxy-dist/eggs/WebHelpers-0.2-py2.6.egg/webhelpers/rails/text.py", line 10, in <module>
    import webhelpers.textile as textile
  File "/home/galaxy/galaxy-dist/eggs/WebHelpers-0.2-py2.6.egg/webhelpers/textile.py", line 241, in <module>
    import tidy
  File "/usr/lib/pymodules/python2.6/tidy/__init__.py", line 43, in <module>
    from tidy.lib import parse, parseString
  File "/usr/lib/pymodules/python2.6/tidy/lib.py", line 24, in <module>
    os.environ['PATH'] = "%s%s%s" % (packagedir, os.pathsep, os.environ['PATH'])
  File "/usr/lib/python2.6/UserDict.py", line 22, in __getitem__
    raise KeyError(key)
KeyError: 'PATH'

Thank you so much again for your help.

Oren
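The last frames of the traceback show tidy/lib.py indexing os.environ['PATH'] directly, which raises KeyError when the job starts with no PATH at all. The sketch below reproduces that failure with an empty dictionary standing in for the job's environment and shows a more defensive way to prepend a directory. It is not a patch to the tidy module; prepend_to_path and DEFAULT_PATH are hypothetical names, and the fallback value is an assumption to adjust per system.

```python
import os

# Assumed fallback search path; adjust for the actual system.
DEFAULT_PATH = "/usr/local/bin:/usr/bin:/bin"


def prepend_to_path(directory, environ=None):
    """Prepend a directory to PATH, tolerating an environment without PATH."""
    env = os.environ if environ is None else environ
    current = env.get("PATH", DEFAULT_PATH)
    env["PATH"] = directory + os.pathsep + current
    return env["PATH"]


# An empty mapping stands in for the environment the job received.
stripped_env = {}
try:
    stripped_env["PATH"]
except KeyError as exc:
    print("KeyError:", exc)

# The defensive version succeeds on the same empty environment.
print(prepend_to_path("/opt/tools", stripped_env))
```

Running the generated script by hand works because an interactive shell already exports PATH, so the direct lookup never fails there.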

On Wed, Oct 05, 2011 at 01:00:35PM -0500, Oren Livne wrote:
Dear Nathan,
We found that condor was running the job as the "nobody" user, which had no permissions to the files galaxy created in the database/pbs directory. This was fixed by setting
TRUST_UID_DOMAIN = True
in the condor config file.
Now the job runs, but fails immediately because it cannot find the PATH environment variable. If I run the generated database/pbs/galaxy_<jobID>.sh file manually at the command line, it works fine. Do we need some environment setting in galaxy/condor?
You can set the environment by using an "environment = " command in the submit file or, if appropriate, you can copy your environment to the job by inserting "getenv = true" in the submit file. This is described in the condor manual at http://www.cs.wisc.edu/condor/manual/v7.7/2_5Submitting_Job.html#2516
Thank you so much again for your help.
You're welcome.

Nathan Panike
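As a rough illustration of the two submit-file options mentioned above, the sketch below renders a minimal submit description with either "getenv = true" or an explicit "environment" line. It assumes the job is submitted by hand with condor_submit; whether and how Galaxy's DRMAA runner passes these settings is not covered in this thread. The function name render_submit_description and the example PATH value are made up.

```python
def render_submit_description(executable, getenv=True, environment=None):
    """Build the text of a minimal Condor submit description.

    Hypothetical helper; assumes environment values contain no spaces
    or quote characters, which would need extra escaping.
    """
    lines = ["universe = vanilla", "executable = %s" % executable]
    if getenv:
        # Copy the submitting user's environment into the job.
        lines.append("getenv = true")
    if environment:
        pairs = " ".join(
            "%s=%s" % (key, value) for key, value in sorted(environment.items())
        )
        # Set selected variables explicitly instead.
        lines.append('environment = "%s"' % pairs)
    lines.append("queue")
    return "\n".join(lines) + "\n"


copy_env = render_submit_description("galaxy_31.sh")
explicit_env = render_submit_description(
    "galaxy_31.sh",
    getenv=False,
    environment={"PATH": "/usr/local/bin:/usr/bin:/bin"},
)
print(copy_env)
print(explicit_env)
```

Copying the whole environment is the quicker fix, while an explicit "environment" line keeps the job independent of whatever shell happened to submit it.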
participants (2)
- Nathan Panike
- Oren Livne