On Sep 27, 2011, at 1:14 PM, Nate Coraor wrote:
Glen Beane wrote:
We recently updated to the latest galaxy-dist, and learned that the sam_merge.xml tool now uses picard MergeSamFiles.jar to merge the files instead of the samtools merge wrapper sam_merge.py.
this is a problem for us because MergeSamFiles.jar does not honor $TMPDIR when creating temporary file names (the jvm developers inexplicably hard code the value of java.io.tmpdir to /tmp in Unix/Linux rather than doing the Right Thing) . On our cluster, TMPDIR is set to something like /scratch/batch_job_id/. This location has plenty of free space, however /tmp does not and now we can't successfully merge largeish bam files.
In case anyone else is bit by this, I think there are two options
the Picard tools take an optional TMP_DIR= argument that lets us specify the location we want to use for a temporary directory. Initially we ended up modifying the .xml to add TMP_DIR=\$TMPDIR to the arguments to MergeSamFiles.jar. This works, but we could potentially need to do this with multiple Picard tools and not just MergeSamFiles. Now I am probably going to go with the following solution:
add something like "export _JAVA_OPTIONS=-Djava.io.tmpdir=$TMPDIR" to the .bashrc file for my Galaxy user.
This is what I've just done on our local cluster as well. I was also confounded by the lack of a proper environment variable to do this.
It looks like the export _JAVA_OPTIONS=-Djava.io.tmpdir=$TMPDIR solution breaks some tools (like snpEFF, which is a 3rd party tool we use). The Jvm prints a diagnostic message to stderr that looks something like this: Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=/scratch/32095.scyld.localdomain so in this case the tool fails, since it does not have a wrapper. The sam_merge.xml tool redirects stderr, so it doesn't have this problem. I am thinking about putting a wrapper script for java in my galaxy user's path that adds -Djava.io.tmpdir=$TMPDIR to the arguments. -- Glen L. Beane Senior Software Engineer The Jackson Laboratory (207) 288-6153