Comments on and a fix for John Chilton's tool-dependency-resolver-plugins
Hi John,

This is in regard to this: https://bitbucket.org/galaxy/galaxy-central/pull-request/228/tool-dependency...

Overall, this is very useful, just what I need, thanks. I'd *really* like to see this feature in the mainline Galaxy. Is there some voting necessary on Trello to achieve this, or is it enough to be enthusiastic here?

I tested the ModuleDependencyResolver, and fixed three problems:

1. Fixed up module loading to work properly. The problem is that 'module' is not a first-class command; it's a shell function, and it only works from interactive shells. The solution is to use the underlying modulecmd command. This requires deeper knowledge in the modules resolver of how environment modules work, which obviates the DEFAULT_MODULE_COMMAND and the flexibility to override it.

2. Made versionless fallback work, i.e. use the matching version if it exists, and only fall back to a generic match if it doesn't.

3. Enhanced the DirectoryModuleChecker to look along the modulepath, not just in a single directory. The default path is initialised appropriately from the environment variables MODULEPATH and MODULESHOME, as per module(1). This can be overridden with the attribute modulepath rather than directory in the config file.

Fix attached - I presume a Mercurial export is all you need?

It may be better to default prefetch to false (but I didn't change that). Otherwise the Galaxy server needs restarting after new system packages become available.

Now, there's one more thing required, which I'm not sure how to achieve. I intend to run with this config:

<dependency_resolvers>
  <modules prefetch="false" versionless="true"/>
</dependency_resolvers>

So in particular I'm not interested in tool_shed_packages. However, when I install from the toolshed, say, the emboss tool, it still downloads the source tarball and tries to compile it locally (which fails, as I don't have make installed on my production Galaxy, nor do I want it). The emboss tool status in the "Manage installed tool shed repositories" list is "Installed, missing tool dependencies", but actually my installed modules mean the tool dependencies are satisfied.

The behaviour I'm after is not even to try to do the actions in a tool_dependency.xml package spec in the toolshed, if I have dependency resolvers configured without tool_shed_packages.

What are your thoughts on that?

cheers,
Simon
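The three fixes listed in this message can be sketched roughly as follows. This is a minimal illustration, not the attached patch: the function names are made up, it assumes modulecmd is on PATH and accepts `python` as its output mode, and it assumes modulefiles are laid out as `<dir>/<name>/<version>` with `$MODULESHOME/modulefiles` as the fallback directory.

```python
import os
import subprocess


def modulecmd_args(name, version=None, modulecmd="modulecmd"):
    """Build the modulecmd call that loads a module.

    'module' is a shell function wrapping modulecmd, so a server
    process calls the binary directly. The 'python' argument asks for
    Python statements instead of shell code.
    """
    spec = name if version is None else "%s/%s" % (name, version)
    return [modulecmd, "python", "load", spec]


def load_module(name, version=None):
    """Run modulecmd and apply the environment changes it prints."""
    code = subprocess.check_output(modulecmd_args(name, version)).decode()
    exec(code, {"os": os})  # the emitted statements refer to os.environ


def default_modulepath(environ=None):
    """Directories to search: MODULEPATH first, then MODULESHOME."""
    environ = os.environ if environ is None else environ
    if environ.get("MODULEPATH"):
        return environ["MODULEPATH"].split(os.pathsep)
    if environ.get("MODULESHOME"):
        # Assumed layout: modulefiles live under $MODULESHOME/modulefiles.
        return [os.path.join(environ["MODULESHOME"], "modulefiles")]
    return []


def find_module(name, version, modulepath):
    """Return (name, version) for the best match, or None.

    An exact name/version modulefile anywhere on the path wins; only
    if there is none do we fall back to a match on the name alone.
    """
    versionless = None
    for directory in modulepath:
        candidate = os.path.join(directory, name)
        if version and os.path.isfile(os.path.join(candidate, version)):
            return (name, version)
        if versionless is None and os.path.exists(candidate):
            versionless = (name, None)
    return versionless
```

With this shape, a request for emboss 5.0.0 resolves to the exact modulefile even if it sits in a later directory on the path, and a request for a version that is not installed resolves to the bare module name, leaving the choice of version to the modules system.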
Simon,

Thanks for the fixes, they look great! (Glad to have someone who knows what they are doing with respect to modules helping!) I have updated the PR with these changes and some additional unit testing I did of the behavior you outlined. I will merge these changes in at some point next week.

I understand the desire to not want to try to execute the tool shed actions if tool_shed_packages are not specified in the dependency resolvers list. I have created a Trello card for it (https://trello.com/c/CPeU3VlR). It sounds like the new status quo will be functional; it will just be kind of annoying to have those actions try and fail and then be marked as errors, right? If that is right, then for this reason I don't think implementing this behavior will be a high priority for the team, but if you send another brilliant patch or pull request I will be happy to test it and merge it.

Thanks again for the contribution,
-John

On Wed, Oct 2, 2013 at 9:43 PM, Guest, Simon <Simon.Guest@agresearch.co.nz> wrote:
At Fri, 4 Oct 2013 23:12:01 -0500, John Chilton wrote:
I understand the desire to not want to try to execute the tool shed actions if tool_shed_packages are not specified in the dependency resolvers list. I have created a Trello card for it (https://trello.com/c/CPeU3VlR). It sounds like the new status quo will be functional; it will just be kind of annoying to have those actions try and fail and then be marked as errors, right? If that is right, then for this reason I don't think implementing this behavior will be a high priority for the team, but if you send another brilliant patch or pull request I will be happy to test it and merge it.
Hi John,

I have now implemented this. Mercurial changeset attached.

This adds another configuration item for universe_wsgi.ini as documented in the updated universe_wsgi.ini.sample:

# This option may be used to disable installation of package
# dependencies from tool sheds, which usually means downloading a
# source tarball and compiling it locally. You would disable this if
# you want to make use of system installed packages for example. In
# that case, alternative tool dependency resolvers should be
# configured in dependency_resolvers_conf.xml
#enable_tool_shed_package_dependency_installation = True

The default behaviour is per standard toolshed. If configured to False, then the package components of tool dependencies will never be installed from the toolshed. Rather, they will rely on the tool dependency manager to resolve the requirement. This is integrated with the missing_tool_dependencies status, so that if dependency resolution fails, the repository gets flagged with "missing tool dependencies" on the installed tool shed repositories page, and the paster.log file contains a warning of exactly what wasn't found. Of course, if the dependency can be satisfied, then it's green lights all the way, and everything should work fine.

I've tested this interactively by installing the standard emboss_5 repository, with various combinations of environment module available and not. It looks to work fine. I'm afraid I haven't yet read up on the Galaxy unit testing framework, so no unit tests for now.

What do you think?

cheers,
Simon
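For illustration, reading such an option with a default of True might look like the sketch below. It is not taken from the changeset: the helper name is made up, and the `[app:main]` section name is an assumption about how universe_wsgi.ini is laid out.

```python
import configparser

OPTION = "enable_tool_shed_package_dependency_installation"


def tool_shed_package_installation_enabled(ini_text, section="app:main"):
    """Return the option's value, defaulting to True when it is unset."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    # A commented-out line (leading '#') is ignored, so the fallback applies.
    return parser.getboolean(section, OPTION, fallback=True)
```

Leaving the sample line commented out keeps the standard toolshed behaviour; only an explicit `= False` switches package installation off.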
Simon,

Very cool! I have two concerns. Rather than adding a new configuration option I think I would prefer to just check the configured dependency resolvers and then infer from them if the tool shed will be used. The configuration option strikes me as having to configure the same thing twice, and this change would make your setup slightly easier. Do you have any objection to me reworking your patch to do this? On the other hand, perhaps it is made more clear to the deployer that they are definitely disabling tool dependency installations if they have to add the explicit option this way.

Greg, Dave, have you looked at this? In particular, do you think there are any downsides to marking a ToolDependency as installed if nothing has actually been installed by Galaxy? Would it be better to add a new option - EXTERNALLY_CONFIGURED?

-John

P.S. I know I said I would merge that module stuff last week, but usegalaxy.org is running off of galaxy-central right now and I am being extra cautious about not breaking Galaxy. It will get merged though!

On Thu, Oct 10, 2013 at 7:55 PM, Guest, Simon <Simon.Guest@agresearch.co.nz> wrote:
At Mon, 14 Oct 2013 20:22:06 -0500, John Chilton wrote:
Simon,
Very cool! I have two concerns. Rather than adding a new configuration option I think I would prefer to just check the configured dependency resolvers and then infer from them if the tool shed will be used. The configuration option strikes me as having to configure the same thing twice, and this change would make your setup slightly easier. Do you have any objection to me reworking your patch to do this? On the other hand, perhaps it is made more clear to the deployer that they are definitely disabling tool dependency installations if they have to add the explicit option this way.
Hi John,

I have no problem with you reworking it in that way. There are two reasons I didn't do that myself:

1. I would have had to change the interface to the dependency resolvers somehow to support this query, and I wasn't sure that was a good thing.

2. I wanted to make it explicit that toolshed package installation was disabled in this case, as I thought that would make it more likely this change gets accepted into the mainline.

Whichever way you, Greg, and Dave are happy with is OK by me. Actually, I like your implicit approach better, so I hope that's the one that gets agreed.

cheers,
Simon
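The implicit approach discussed here could be sketched as a check on the resolver configuration itself. The helper name below is hypothetical, and the sketch assumes the tool shed resolver appears as a `tool_shed_packages` element directly under `dependency_resolvers`; it only covers the case where a configuration file is present.

```python
import xml.etree.ElementTree as ET


def uses_tool_shed_packages(conf_xml):
    """True if a tool_shed_packages resolver is configured."""
    root = ET.fromstring(conf_xml)
    # Only direct children of <dependency_resolvers> are considered.
    return any(child.tag == "tool_shed_packages" for child in root)
```

For the modules-only configuration quoted earlier in the thread this returns False, which is the signal to skip the tool_dependency.xml package actions.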
Simon,

As you have probably noticed, a new stable Galaxy was released. It includes 95% of what we discussed, including this implicit check to see if tool shed packages are enabled. Your help implementing, testing, and driving these changes was greatly appreciated!

I couldn't, however, pull the trigger and mark packages resolved via modules as "Installed" - so they will still appear to be in an error state (though your check is in there and they won't attempt to be installed, they will just be marked as errors). The upshot is you can change this one line of code in your Galaxy instance to get the behavior you desire (patch attached).

The reason I don't want to mark these packages as "Installed" is that I am worried about Galaxy deployments that maybe want to use modules at first but transition to tool shed packages down the road. I am unsure what will happen if things are marked as "Installed" even if no files corresponding to the installation exist. I think the state NEVER_INSTALLED may be preferable - but I need to understand more about what that means. For your own instance, if you are certainly committed to using modules and not using the tool shed - it should be easy to apply the above patch. Is this a fair compromise for the time being?

Until I can resolve this last issue, the Trello card remains open, but it should now be quite trivial to modify Galaxy to get the behavior you desire, and hopefully this can serve as a model for how others can hook in other dependency resolution mechanisms.

I would be eager to hear how this experiment progresses and how you feel about the implementation.

Thanks for your contributions,
-John

On Mon, Oct 14, 2013 at 8:33 PM, Guest, Simon <Simon.Guest@agresearch.co.nz> wrote:
I think that the best approach for this is an additional ToolDependency status, as John proposed a while back, as this approach allows for the flexibility needed to support just about anything users may want to do. I haven't had a chance to look at this yet, and I'm currently working on a couple of high priority items, but I can probably take a look at this within the next couple of weeks. If there is an example tool dependency package that you have that uses modules, it would really help me with this implementation. Of course, if you want to take this, John, it would be much appreciated as well.

Greg Von Kuster

On Nov 5, 2013, at 12:03 AM, John Chilton <chilton@msi.umn.edu> wrote:
<modules_installed.patch>
At Mon, 4 Nov 2013 23:03:08 -0600, John Chilton wrote:
As you have probably noticed a new stable galaxy was released. It includes 95% of what we discussed including this implicit check to see if tool shed packages are enabled. Your help implementing, testing, and driving these changes was greatly appreciated!
Hi John, Thanks very much for this! I'm afraid I won't be able to have a go at this immediately, as I'm a bit swamped with other stuff right now, but I'll certainly be getting round to it at some stage.
I couldn't however pull the trigger and mark packages resolved via modules as "Installed" - so they will still appear to be in an error state (though your check is in there and they won't attempt to be installed, they will be just marked as errors).
The reason I don't want to mark these packages as "Installed" is that I am worried about Galaxy deployments that maybe want to use modules at first but transition to tool shed packages down the road. I am unsure what will happen if things are marked as "Installed" even if no files corresponding to the installation exist. I think the state NEVER_INSTALLED may be preferable - but I need to understand more about what that means. For your own instance, if you are certainly committed to using modules and not using the tool shed - it should be easy to apply the above patch. Is this a fair compromise for the time being?
This sounds sensible. When I have a moment, I will probably investigate the idea of a new tool state INSTALLED_WITH_EXTERNAL_DEPENDENCIES. That would enable someone to see from the installation status screen what system packages need installing by the local sys admin, and not compromise the record of whether the dependencies have been installed from the toolshed. cheers, Simon
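One way to picture the proposed extra state is the sketch below. It is purely hypothetical: Galaxy's real status handling is not shown in this thread, the display label for the new state is invented, and the function and its input shape are illustrative only.

```python
from enum import Enum


class RepositoryStatus(Enum):
    INSTALLED = "Installed"
    MISSING_TOOL_DEPENDENCIES = "Installed, missing tool dependencies"
    # Hypothetical state proposed in this thread; the label is made up.
    INSTALLED_WITH_EXTERNAL_DEPENDENCIES = "Installed, with external dependencies"


def repository_status(dependencies):
    """Summarise a repository from (from_tool_shed, resolved_externally) pairs."""
    deps = list(dependencies)
    # Any dependency satisfied by neither route counts as missing.
    if any(not shed and not external for shed, external in deps):
        return RepositoryStatus.MISSING_TOOL_DEPENDENCIES
    # Everything is satisfied, but at least one only by a system package.
    if any(not shed for shed, _external in deps):
        return RepositoryStatus.INSTALLED_WITH_EXTERNAL_DEPENDENCIES
    return RepositoryStatus.INSTALLED
```

Keeping the two flags separate is what preserves the record of whether anything was actually installed from the toolshed, which is the concern raised about a later move from modules to tool shed packages.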
participants (4)
- Greg Von Kuster
- Guest, Simon
- Guest, Simon
- John Chilton