Hey Peter, It looks like ext will be undefined in your default implementation of _archive_main_file - is this going to cause an error or am I missing something? If this is the interface you would like - I could rework it to handle this ext issue and apply it directly to galaxy-central. Otherwise, I will await a pull request. Thanks for working on this! -John On Tue, Nov 26, 2013 at 10:55 AM, Peter Cock <p.j.a.cock@googlemail.com> wrote:
On Mon, Nov 25, 2013 at 10:37 PM, Peter Cock <p.j.a.cock@googlemail.com> wrote:
On Mon, Nov 25, 2013 at 7:47 PM, John Chilton <chilton@msi.umn.edu> wrote:
On Mon, Nov 25, 2013 at 11:59 AM, Peter Cock <p.j.a.cock@googlemail.com> wrote:
Hello all,
I would like to modify the default ZIP bundle behaviour for composite datatypes, which currently insists on creating an HTML "main" file. To do this I think the current monolithic _archive_composite_dataset function needs to be refactored in file lib/galaxy/datatypes/data.py
I would like to move the section which currently names and populates the HTML index file into a sub-method, i.e. this chunk:
path = data.file_name fname = os.path.split(path)[-1] efp = data.extra_files_path htmlname = os.path.splitext(outfname)[0] if not htmlname.endswith(ext): htmlname = '%s_%s' % (htmlname,ext) archname = '%s.html' % htmlname # fake the real nature of the html file try: archive.add(data.file_name,archname) except IOError: error = True log.exception( "Unable to add composite parent %s to temporary library download archive" % data.file_name) msg = "Unable to create archive for download, please report this error"
Then by overriding this new method a subclass (custom composite datatype) could for example set the filename used inside the archive to be "index.html" or "index.htm" or anything else like "README.txt" (doesn't have to be HTML); alter the contents of the index file; or even not include an index file in the archive.
Would a pull request to do this be considered?
Hey Peter,
Definitely. This sounds like a great change - I am looking forward to merging it!
-John
I'll give that a go then, thanks.
Peter
Hi John,
Here's a first cut - patch at end of email - and matching changes to the BLAST database class to use this:
https://github.com/peterjc/galaxy_blast/commit/0a205ab98a639dccf29db72f81b02...
Thoughts on the method signature welcome (should it get the full data object for example - for other more ambitious changes), and the return values (error condition signalling).
With this change, rather than having a dummy HTML file (named after a truncated version of the dataset name), I can use a non-HTML filename like blastdb.log (or just README.txt etc), or have no dummy central file at all (useful for old BLAST databases where the central dummy file is empty).
The logical next step would be to move the walk loop into a separate method as well (to allow that to be overridden to - for example if you want to enforce a subfolder structure inside the ZIP file).
Regards,
Peter
--
$ hg diff lib/galaxy/datatypes/data.py diff -r 359a822b2e5d lib/galaxy/datatypes/data.py --- a/lib/galaxy/datatypes/data.py Tue Nov 26 08:04:27 2013 -0600 +++ b/lib/galaxy/datatypes/data.py Tue Nov 26 16:45:47 2013 +0000 @@ -199,6 +199,29 @@ out = "Can't create peek %s" % str( exc ) return out
+ def _archive_main_file(self, archive, outfname, data_filename): + """Called from _archive_composite_dataset to add central file to archive. + + Unless subclassed, this will add the main dataset file (argument data_filename) + to the archive, as an HTML file with its filename derived from the dataset name + (argument outfname). + + Returns a tuple of boolean, string, string: (error, msg, messagetype) + """ + error, msg, messagetype = False, "", "" + htmlname = os.path.splitext(outfname)[0] + if not htmlname.endswith(ext): + htmlname = '%s_%s' % (htmlname, ext) + archname = '%s.html' % htmlname # fake the real nature of the html file + try: + archive.add(data_filename, archname) + except IOError: + error = True + log.exception("Unable to add composite parent %s to temporary library download archive" % data_filename) + msg = "Unable to create archive for download, please report this error" + messagetype = "error" + return error, msg, messagetype + def _archive_composite_dataset( self, trans, data=None, **kwd ): # save a composite object into a compressed archive for downloading params = util.Params( kwd ) @@ -233,33 +256,25 @@ messagetype = 'error' if not error: current_user_roles = trans.get_current_user_roles() - ext = data.extension path = data.file_name fname = os.path.split(path)[-1] efp = data.extra_files_path - htmlname = os.path.splitext(outfname)[0] - if not htmlname.endswith(ext): - htmlname = '%s_%s' % (htmlname,ext) - archname = '%s.html' % htmlname # fake the real nature of the html file - try: - archive.add(data.file_name,archname) - except IOError: - error = True - log.exception( "Unable to add composite parent %s to temporary library download archive" % data.file_name) - msg = "Unable to create archive for download, please report this error" - messagetype = 'error' - for root, dirs, files in os.walk(efp): - for fname in files: - fpath = os.path.join(root,fname) - rpath = os.path.relpath(fpath,efp) - try: - archive.add( fpath,rpath ) - except IOError: - error = True - log.exception( "Unable to add %s to temporary library download archive" % rpath) - msg = "Unable to create archive for download, please report this error" - messagetype = 'error' - continue + #Add any central file to the archive, + error, msg, messagetype = self._archive_main_file(archive, outfname, path) + if not error: + #Add any child files to the archive, + for root, dirs, files in os.walk(efp): + for fname in files: + fpath = os.path.join(root,fname) + rpath = os.path.relpath(fpath,efp) + try: + archive.add( fpath,rpath ) + except IOError: + error = True + log.exception( "Unable to add %s to temporary library download archive" % rpath) + msg = "Unable to create archive for download, please report this error" + messagetype = 'error' + continue if not error: if params.do_action == 'zip': archive.close()