Joachim, Nate,

Leon Mei pointed me to a mailing list post from August 2012 in which you two discussed a problem with uploads to Galaxy filling up /tmp. We have run into this several times ourselves, and I think I have now tracked it down.

Galaxy lets you configure the location of temporary files in a number of places, but there is (at least) one place that still uses Python's default temporary directory. That default can be changed with TMPDIR or a few other environment variables, but if none is set it is usually /tmp. The "unconfigurable" place is tools/data_source/upload.py, where the code reads:

    if dataset.type == 'url':
        try:
            page = urllib.urlopen( dataset.path ) #page will be .close()ed by sniff methods
            temp_name, dataset.is_multi_byte = sniff.stream_to_file( page, prefix='url_paste', source_encoding=util.get_charset_from_http_headers( page.headers ) )
        except Exception, e:
            file_err( 'Unable to fetch %s\n%s' % ( dataset.path, str( e ) ), dataset, json_file )
            return
        dataset.path = temp_name

sniff.stream_to_file uses the tempfile module, and since no "dir=" argument is passed in this call, the temporary file is created in Python's default temporary directory, which here is /tmp. The central solution for the main Galaxy code is in lib/galaxy/config.py:

        self.new_file_path = resolve_path( kwargs.get( "new_file_path", "database/tmp" ), self.root )
        tempfile.tempdir = self.new_file_path

But this assignment to tempfile.tempdir does not seem to help in this case. Is that because upload.py is a tool, and therefore runs in a separate process that never sees the assignment?
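
To illustrate why I suspect the process boundary matters, here is a small sketch. It assumes the tool is started as a fresh Python interpreter; the helper names (child_tempdir, demo) are made up for this example and are not Galaxy code. It sets tempfile.tempdir in the parent, then asks a child process which temporary directory it resolves, once with the inherited environment and once with TMPDIR set explicitly.

```python
import os
import subprocess
import sys
import tempfile

# One-liner executed by the child interpreter: report its temp directory.
CHILD_CODE = "import tempfile; print(tempfile.gettempdir())"


def child_tempdir(env=None):
    """Start a fresh interpreter and return the temp directory it resolves."""
    out = subprocess.check_output([sys.executable, "-c", CHILD_CODE], env=env)
    return out.decode().strip()


def demo():
    """Compare the temp directory seen by the parent and by child processes."""
    scratch = tempfile.mkdtemp()  # stand-in for a configured new_file_path
    saved = tempfile.tempdir
    try:
        # Same kind of assignment as in lib/galaxy/config.py.
        tempfile.tempdir = scratch
        in_parent = tempfile.gettempdir()
        # The child does not inherit the module-level assignment ...
        in_child = child_tempdir()
        # ... but it does inherit environment variables such as TMPDIR.
        in_child_with_env = child_tempdir(dict(os.environ, TMPDIR=scratch))
    finally:
        tempfile.tempdir = saved
        os.rmdir(scratch)
    return scratch, in_parent, in_child, in_child_with_env
```

If this is indeed what happens, the parent sees the scratch directory, the first child falls back to its own default, and only the child started with TMPDIR set uses the scratch directory.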

It would be nice to fix this. We can obviously patch it ourselves for our andromeda deployment, but it would be better to fix it centrally.
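
A minimal sketch of the kind of change I have in mind, assuming the helper that writes the upload can be given an explicit target directory. The function stream_to_temp_file below is a hypothetical stand-in written for this mail, not the actual sniff.stream_to_file; it only shows that passing "dir=" through to the tempfile module keeps the file out of the default directory.

```python
import os
import tempfile


def stream_to_temp_file(stream, prefix="", dir=None, chunk_size=1 << 20):
    """Hypothetical stand-in for a helper like sniff.stream_to_file.

    Copies a binary stream into a new temporary file and returns its path.
    With dir=None the tempfile module falls back to its default directory.
    """
    fd, temp_name = tempfile.mkstemp(prefix=prefix, dir=dir)
    with os.fdopen(fd, "wb") as out:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)
    return temp_name
```

Called as stream_to_temp_file(page, prefix='url_paste', dir=some_configured_path), the downloaded data would end up under some_configured_path, where that path would have to come from the Galaxy configuration in a way I have not worked out.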

Regards,

Rob 

--
Rob W.W. Hooft
Chief Technology Officer BioAssist, Netherlands Bioinformatics Centre
http://www.nbic.nl/    Skype: robhooft    GSM: +31 6 27034319