On Tue, Nov 12, 2013 at 7:13 PM, Ben Gift <corn8bit2@gmail.com> wrote:
I'm working with a lot of data on a cluster (Condor). If I save all the workflow intermediate data, as Galaxy does by default (and rightfully so), it fills the drives.
How can I tell Galaxy to use /tmp/ to store all intermediate data in a workflow, and keep only the result?
You can't. For a start, /tmp is usually machine specific, so the /tmp on one cluster node is generally not accessible from the other cluster nodes, and different stages of the workflow are likely to run on different nodes.
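For reference, the locations Galaxy uses for dataset and temporary files are set in universe_wsgi.ini, and on a cluster both must point at storage shared by every node. A sketch of the relevant settings (option names as in a standard Galaxy install of this era; the paths shown are the defaults, so check your own config):

```ini
# universe_wsgi.ini -- storage-related settings.
# On a cluster, both paths must be on a filesystem
# visible to all compute nodes, not a node-local /tmp.

# Where Galaxy stores dataset files:
file_path = database/files

# Where Galaxy writes temporary files:
new_file_path = database/tmp
```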
I imagine I'll have to work on how Galaxy handles jobs, but I'm hoping there is something built in for this.
Workflows can mark their output datasets, and the rest are automatically hidden/deleted on successful completion (but kept on disk and can be made visible again on request via the history menu). It might be nice if we could make that more aggressive and actually purge the intermediate files from disk as well?

Peter
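In the meantime, the maintenance scripts shipped under scripts/cleanup_datasets/ can reclaim disk space from datasets that have been deleted (including hidden workflow intermediates the user later deletes). A minimal sketch, assuming a standard Galaxy checkout; the flags here follow the bundled purge_datasets.sh wrapper, so verify them against your Galaxy version before running:

```shell
# Run from the Galaxy root directory.
# -d 10 : only act on datasets deleted more than 10 days ago
# -3    : purge deleted datasets
# -r    : actually remove the corresponding files from disk
python scripts/cleanup_datasets/cleanup_datasets.py universe_wsgi.ini -d 10 -3 -r
```

This only frees space for datasets already marked deleted, so it is a workaround rather than the automatic purge-on-completion suggested above.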