Hello Jean-Frédéric,

Thanks very much for your wiki editing contributions. That page has been on my list for enhancements for some time, and I plan to make some additions this week.

I apologize for the difference in how the extraction directories are determined for tar and zip archives.  The use of the target_filename attribute for tar archives is a recent enhancement that was required for migrating the lastz and bowtie tools from the Galaxy distribution, and I have not yet had a chance to update the wiki page accordingly (I will attempt to do so this week).

The tool_dependency.xml definitions have evolved enough that they are sufficient for defining the installation process for many third-party tool dependencies.  However, they will continue to be refined as additional dependencies are discovered that cannot be installed using the current definitions.  I will continue to update the tool shed wiki page as these definition tag sets evolve.

Thanks very much for your message and your contributions.

Greg Von Kuster


On Jan 30, 2013, at 10:49 AM, Jean-Frédéric Berthelot wrote:

Hi list,

While working to get a tool wrapper Toolshed-ready, I have explored how tool dependency installation works, especially the download_by_url feature.

I have summarised my understanding on the Wiki (hope this is clear and accurate):
<http://wiki.galaxyproject.org/ToolShedToolFeatures?action=diff&rev1=33&rev2=34>

The difference in behaviour between Tar and Zip files lies in the methods tar_extraction_directory() and zip_extraction_directory() (in [1]), which try to locate the extracted files:

* zip_extraction_directory() counts the non-.zip “files” in the current directory. If there is more than one, the current path is returned; if there is exactly one, this “file” is assumed to be the extracted directory, and the full path to it is returned. The file_name variable (set to the name of the downloaded file) is not used.

* tar_extraction_directory() takes the name of the downloaded file and removes the extension. If a subdirectory of that name exists, it is returned. Otherwise, if the full path to the downloaded file exists, the current path is returned. Otherwise, an exception is raised.
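
To make the contrast concrete, here is a minimal Python sketch of the two behaviours as I understand them. It is not the actual code from [1]: the work_dir parameter, the list of extensions, the ValueError, and the handling of an empty directory in the zip case are all my own assumptions.

```python
import os
import tempfile


def zip_extraction_directory(work_dir, file_name):
    """Sketch of the zip behaviour; file_name is accepted but never used."""
    entries = [e for e in os.listdir(work_dir) if not e.endswith(".zip")]
    if len(entries) == 1:
        # A single non-.zip entry is assumed to be the extracted directory.
        return os.path.join(work_dir, entries[0])
    # More than one entry (or none, which is my assumption): current path.
    return work_dir


def tar_extraction_directory(work_dir, file_name):
    """Sketch of the tar behaviour; the directory is derived from file_name."""
    base = file_name
    # Assumed set of extensions, longest first.
    for ext in (".tar.gz", ".tar.bz2", ".tgz", ".tar"):
        if base.endswith(ext):
            base = base[: -len(ext)]
            break
    candidate = os.path.join(work_dir, base)
    if os.path.isdir(candidate):
        return candidate
    if os.path.exists(os.path.join(work_dir, file_name)):
        # The archive is there but no matching subdirectory was found.
        return work_dir
    raise ValueError("Could not find extraction directory for %s" % file_name)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as work_dir:
        # Simulate "download.tar.gz" having been unpacked into "tool-1.0/".
        open(os.path.join(work_dir, "download.tar.gz"), "w").close()
        os.mkdir(os.path.join(work_dir, "tool-1.0"))
        # Names differ: the current path is returned, not the subdirectory.
        print(tar_extraction_directory(work_dir, "download.tar.gz"))
        # Names match: the subdirectory is returned.
        print(tar_extraction_directory(work_dir, "tool-1.0.tar.gz"))
```

In this sketch, the tar variant only finds the subdirectory when the archive name (minus its extension) equals the directory name, while the zip variant finds it whatever the archive is called.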

This difference in behaviour can be very confusing. My local wrapper installations kept failing because, for my tool, the extracted directory name did not match the tar.gz name. I was confused because I was using the bowtie wrapper [2] as an example, where this is not an issue.

The obvious workaround is to use the target_filename attribute so that the downloaded archive name matches the subdirectory name (which I have done, and it works perfectly now). But I was wondering whether it would be a good idea to harmonize the behaviour of both methods?

(P.S. As I am not yet very familiar with the Galaxy development process, I believe posting here is appropriate; but I’d be happy to open an issue if needed, on the Trello if I am not mistaken?)

[1] <https://bitbucket.org/galaxy/galaxy-central/src/f49ce2c1883e/lib/galaxy/tool_shed/tool_dependencies/common_util.py>
[2] <http://toolshed.g2.bx.psu.edu/repos/devteam/bowtie_wrappers/file/0c7e4eadfb3c/tool_dependencies.xml>

Cheers,
-- 
Jean-Frédéric
Bonsai Bioinformatics group