
Our local instance currently uses the traditional directories under `database/` for datasets, job working directories, and temporary files. Ultimately we wish to transition to using our Swift object store for storage. We've been doing some experimentation with Galaxy's Swift backend and have run into a few issues.

The first major issue we came across was Swift's 5 GB segment size limit, since the segmentation/multipart upload code is bypassed for instances of SwiftObjectStore [1]. SwiftStack support provided a patch enabling multipart uploads for Swift (PR #648) which has been working well for us so far. (Thanks, Charles!)

The next issue is that the path attribute of the cache tag in object_store_conf.xml appears to be ignored. The value does get stored to self.cache_path in _parse_config_xml, but elsewhere in the file self.staging_path is used instead.

Finally, adding extra_dir tags to the Swift object store config doesn't appear to do anything. Here's my object_store_conf.xml:

    <?xml version="1.0"?>
    <object_store type="hierarchical">
        <backends>
            <object_store type="swift" id="primary" order="0">
                <auth access_key="..." secret_key="..."/>
                <bucket name="galaxy_store"/>
                <connection host="tin.fhcrc.org" port="443"/>
                <cache path="database/object_store_cache" size="1000"/>
                <extra_dir type="temp" path="database/tmp"/>
                <extra_dir type="job_work" path="database/job_working_directory"/>
            </object_store>
            <object_store type="disk" id="secondary" order="1">
                <files_dir path="database/files"/>
            </object_store>
        </backends>
    </object_store>

The goal with the hierarchical setup above is for new datasets to be created in the primary (Swift) object store, caching to `database/object_store_cache`, while the job and temporary directories remain at `database/job_working_directory` and `database/tmp`, respectively. Existing (pre-Swift) datasets remain in `database/files` and are handled by the secondary disk store.
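To make the intended layout explicit, here is a minimal sketch that reads the config above with Python's standard `xml.etree.ElementTree` and lists the paths each backend declares. It is independent of Galaxy's own parsing code: `summarize_backends` is a hypothetical helper written only for this illustration, and it shows what the file asks for, not what Galaxy does with it.

```python
import xml.etree.ElementTree as ET

# The config from this message, embedded so the sketch is self-contained.
CONF = """<?xml version="1.0"?>
<object_store type="hierarchical">
    <backends>
        <object_store type="swift" id="primary" order="0">
            <auth access_key="..." secret_key="..."/>
            <bucket name="galaxy_store"/>
            <connection host="tin.fhcrc.org" port="443"/>
            <cache path="database/object_store_cache" size="1000"/>
            <extra_dir type="temp" path="database/tmp"/>
            <extra_dir type="job_work" path="database/job_working_directory"/>
        </object_store>
        <object_store type="disk" id="secondary" order="1">
            <files_dir path="database/files"/>
        </object_store>
    </backends>
</object_store>
"""


def summarize_backends(xml_text):
    """Hypothetical helper (not Galaxy code): list the paths each backend declares."""
    root = ET.fromstring(xml_text)
    summary = []
    for backend in root.findall("./backends/object_store"):
        cache = backend.find("cache")
        files_dir = backend.find("files_dir")
        summary.append({
            "id": backend.get("id"),
            "type": backend.get("type"),
            "order": int(backend.get("order", "0")),
            "cache_path": cache.get("path") if cache is not None else None,
            "files_dir": files_dir.get("path") if files_dir is not None else None,
            # Map each extra_dir type (e.g. "temp", "job_work") to its path.
            "extra_dirs": {
                e.get("type"): e.get("path") for e in backend.findall("extra_dir")
            },
        })
    # Lower "order" values come first, matching the order attribute in the file.
    return sorted(summary, key=lambda b: b["order"])


if __name__ == "__main__":
    for backend in summarize_backends(CONF):
        print(backend)
```

Comparing this summary against the directories Galaxy reports in its logs is a quick way to see which declared paths are being honored.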
What actually happens (after renaming self.cache_path to self.staging_path in _parse_config_xml to get the cache path working) is this:

    galaxy.jobs DEBUG 2015-02-06 16:07:26,615 (1) Working directory for job is: /home/bclaywel/workspace/galaxy-central/database/object_store_cache/000/1

That is, the job working directory is created directly under the cache path's hash directories. I assume temp files would probably end up there also.

We're quite excited to get Galaxy and Swift working well together, and I'm more than happy to help debug and test!

Cheers,
Brian

[1] https://bitbucket.org/galaxy/galaxy-central/src/54ed3adb6575addba47d627944eb...

--
Brian Claywell | programmer/analyst
Matsen Group | http://matsen.fredhutch.org
Fred Hutchinson Cancer Research Center