Chorny,

We had a similar issue with slurm-drmaa, and we exchanged some emails with the upstream DRMAA maintainer (Mariusz, also reading this list). In the end he decided to export the environment variables from the submitting node's shell to the worker nodes. That solves the problem in a more general way (it affects all environment variables, not only $TEMP).

On top of that, batch systems such as SGE have specific native flags for this behaviour (http://linux.die.net/man/3/drmaa_attributes):

"To have the remote command executed in a shell, such as ***to preserve environment settings***, use the drmaa_native_specification attribute to include the "-shell yes" option"

Having this TMP hack in drmaa.py would only complicate maintenance and traceability, imho. It is thus a matter of getting this functionality included upstream in the DRMAA connector for your job manager and controlling it via DRMAA's "nativeSpecification" setting (the drmaa:// URL in universe_wsgi.ini).

Cheers,
Roman

PS: Looking forward to seeing the "run as actual user" code reviewed/committed too :)
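For reference, a minimal sketch of how that native flag can be passed through the Python drmaa bindings; the queue name, command, and TMPDIR path below are illustrative assumptions, not settings from this thread:

import drmaa

# Sketch: submit a command with native SGE options that run it through a
# shell ("-shell yes", which preserves environment settings) and explicitly
# set TMPDIR via SGE's -v flag. Queue name and paths are illustrative.
s = drmaa.Session()
s.initialize()
jt = s.createJobTemplate()
jt.remoteCommand = "/bin/hostname"
jt.nativeSpecification = "-shell yes -q all.q -v TMPDIR=/scratch/galaxy/tmp"
job_id = s.runJob(jt)
print("Submitted job %s" % job_id)
s.deleteJobTemplate(jt)
s.exit()

As noted above, in Galaxy the same options would be supplied via the drmaa:// runner URL in universe_wsgi.ini rather than hand-written submission code.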
On 2011-09-03 01:51, Chorny, Ilya wrote:

Nate,
We ran into this issue with /tmp not having enough space and came up with a generic solution; we were wondering if you might want to add it to galaxy-central. We did not want to modify the SGE configuration, as that could affect other jobs that use the cluster.
We created a TMPDIR option in universe_wsgi.ini and then modified drmaa.py to add an "export TMPDIR=<configured value>" line to the drm_template when TMPDIR is defined in universe_wsgi.ini. Do you want to pull this code into galaxy-central? It's about 5 lines of code (a rough sketch of the idea follows).
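For illustration only, a minimal sketch of the idea, assuming the runner builds its job script from a template string; the function name, option name, and paths here are made up for the example, not the actual Galaxy code:

def make_job_script(command_line, working_directory, tmpdir=None):
    # Build a cluster job script, exporting TMPDIR when a scratch directory
    # has been configured (e.g. via a new "tmpdir" option in universe_wsgi.ini).
    export_line = 'export TMPDIR="%s"\n' % tmpdir if tmpdir else ""
    return (
        "#!/bin/sh\n"
        "#$ -S /bin/sh\n"
        + export_line
        + "cd %s\n" % working_directory
        + command_line + "\n"
    )

# Example usage with illustrative paths:
print(make_job_script("echo test", "/galaxy/job_working_dir", tmpdir="/scratch/galaxy/tmp"))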
Thanks and have a good weekend,
Ilya
BTW, any luck reviewing the run as actual user code?
-----Original Message----- From: galaxy-dev-bounces@lists.bx.psu.edu [mailto:galaxy-dev-bounces@lists.bx.psu.edu] On Behalf Of Assaf Gordon Sent: Thursday, August 04, 2011 2:31 PM To: Shantanu Pavgi; galaxydev psu Subject: Re: [galaxy-dev] TEMP variable in cluster install
As a follow-up on SGE and the TMP variables:
I've also encountered problems with setting the TEMP, TMP and TMPDIR variables. I tried setting "~/.sge_request" and changing the Galaxy environment variables before starting Python - nothing worked - TMPDIR and TMP were always set to "/tmp/all.q.XXXXX" by SGE.
What finally worked was changing the SGE queue configuration and simply setting the "tmpdir" variable to my desired temp directory.

Do that by running:

$ sudo qconf -mq all.q

and changing the line:

tmpdir    /tmp

to:

tmpdir    /my/temporary/directory/path

Problem solved :) No more messing around with TMP/TMPDIR variables in any Galaxy-related source files.
Hope this helps, -gordon
Shantanu Pavgi wrote, On 07/26/2011 05:40 PM:
On Jul 21, 2011, at 6:24 PM, Shantanu Pavgi wrote:
We have configured Galaxy to work with our SGE cluster using the drmaa job runner interface. We are using the 'unified method' for this install, and both the TEMP environment variable and new_file_path in the universe_wsgi.ini file have been configured correctly. However, we are seeing some errors where the local /tmp space on compute nodes is being referenced by the Galaxy tools. Specifically, we saw it mentioned in error messages from the following tools:

* bwa_wrapper and upload tools: 'No space left on device: /tmp...'
* sam_to_bam tool: 'Error sorting alignments from /tmp/filename..'
Shouldn't they be referencing the TEMP environment variable or the new_file_path configuration value? Is it getting overridden by the TMP or TMPDIR variables in the Python wrapper code? Has anyone else experienced a similar issue?
Further debugging showed that there are two more temporary-directory-related environment variables - TMP and TMPDIR - which were pointing to the local /tmp location on the compute nodes. We tried to set these variables in our shell environment (the drmaa URL uses -V to export the current shell environment); however, SGE overwrote TMP and TMPDIR before actual job execution. The TEMP variable remained unmodified by the SGE scheduler.
The Galaxy tools seemed to be using the temporary directory pointed to by TMP and TMPDIR, hence the local /tmp-related errors mentioned in my earlier post.
We have temporarily fixed this problem by hard-coding the TEMP, TMP and TMPDIR values in the job template files:
1. lib/galaxy/jobs/runners/drmaa.py
2. lib/galaxy/jobs/runners/sge.py
This will affect all jobs submitted through Galaxy. It would be helpful to know how this can be handled in the tool wrapper scripts themselves; it seems it could be done by passing a directory prefix when creating the temp file (http://docs.python.org/library/tempfile.html) - see the sketch below. Any thoughts/comments? Are there any other SGE users having a similar issue?
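A minimal sketch of that approach, assuming the wrapper can learn the scratch location from an environment variable; the variable name "GALAXY_TMPDIR" is an illustrative assumption, not an existing Galaxy setting:

import os
import tempfile

# Resolve a scratch directory; fall back to the system default if the
# illustrative GALAXY_TMPDIR variable is not set.
scratch_dir = os.environ.get("GALAXY_TMPDIR", tempfile.gettempdir())

# Passing dir= pins the temporary file to that location, regardless of what
# TMP/TMPDIR happen to resolve to on the compute node.
fd, tmp_path = tempfile.mkstemp(dir=scratch_dir, prefix="wrapper_", suffix=".sam")
os.close(fd)
print("Temporary file created at %s" % tmp_path)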
Also, there is a similar thread started today regarding the use of temporary directory locations in tool wrapper scripts. It would be helpful to know how Galaxy and the tool wrappers use the temporary directory and how it can be configured.
-- Thanks, Shantanu.