Stepping back a little, is this the right way to address Python dependencies? I was a big advocate for inter-repository dependencies, but I think taking it to the level of individual Python packages might be going too far. My thought was that they were needed for big 100 MB programs and things like that. At the Java jar/Python library/Ruby gem level, I think using the platform-specific packaging tools to create isolated environments for each program might be a better way to go.

Brad, Enis, and I came up with this idea to use virtualenv to automatically create environments for Galaxy tools in CloudBioLinux based on a requirements file, and then activate that environment in the tool's env.sh file.

https://github.com/chapmanb/cloudbiolinux/commit/0e4489275bba2e8f77e1218e3cc...

It would be easier for tool authors if they could just say "here is a requirements.txt file" and have the Python environment created automatically, or "here is a Gemfile" and use rvm+bundler to automatically configure a Ruby environment.

Thanks,
-John

On Tue, Apr 16, 2013 at 4:50 AM, Björn Grüning <bjoern.gruening@pharmazie.uni-freiburg.de> wrote:
Hi Greg.
If numpy is not required for compiling matplotlib components (i.e., matplotlib components just use numpy after installation), then you should be able to make this work using a complex repository dependency for numpy in your tool_dependencies.xml definition for matplotlib. The discussion for doing this is at http://wiki.galaxyproject.org/DefiningRepositoryDependencies#Complex_reposit...
Thanks! But it is required at compile time.
Ok, we may need to do a bit of work to support this requirement, but I'm not quite sure. What I've described to you should still be your approach, but we'll need to ensure that the package_numpy_1_7_1 repository is installed before the package_matplotlib_1_2_1 repository is installed. Guaranteeing this is not currently possible, but this is a feature I am hoping to have available this week. This is a feature that Ira Cooke has needed for his repositories. When the feature is available, it will support an attribute named "prior_installation_required" in the <repository> tag, so this tag will look something like: <repository toolshed="www" name="xxx" owner="yyy" changeset_revision="zzz" prior_installation_required="True" />
Such a tag would also ensure that we do not end up in a dependency loop, right?
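As a rough illustration of how an installer might read the proposed attribute, here is a minimal Python sketch. The tag and its placeholder values are the ones given above; the function name `requires_prior_installation` is hypothetical, and treating a missing attribute as "False" is an assumption rather than confirmed Galaxy behaviour.

```python
import xml.etree.ElementTree as ET

# The <repository> tag from the message above, with its placeholder values.
FRAGMENT = (
    '<repository toolshed="www" name="xxx" owner="yyy" '
    'changeset_revision="zzz" prior_installation_required="True" />'
)


def requires_prior_installation(tag_text):
    """Return True if the <repository> tag asks to be installed first.

    Hypothetical helper: a missing attribute is assumed to mean False.
    """
    elem = ET.fromstring(tag_text)
    value = elem.get("prior_installation_required", "False")
    # Compare case-insensitively so "True" and "true" are both accepted.
    return value.strip().lower() == "true"
```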
What this will do is skip installation of the repository that contains this dependency until the repository that is associated with the "prior_installation_required" attribute is installed (unless that repository is not in the current list of repositories being installed).
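The ordering described above can be sketched in Python as a topological sort. This is only an illustration under stated assumptions, not the Tool Shed implementation: the function name `installation_order` and the dictionary layout are hypothetical, and raising an error when a loop is found is one possible answer to the dependency-loop question, not something the thread confirms. It needs Python 3.9 or later for `graphlib`.

```python
from graphlib import CycleError, TopologicalSorter


def installation_order(prior_required, to_install):
    """Order repositories so each one follows the repositories it requires.

    prior_required maps a repository name to the names it marks with
    prior_installation_required. Requirements that are not in to_install
    are ignored, mirroring the exception described above.
    """
    selected = set(to_install)
    graph = {
        name: {dep for dep in prior_required.get(name, ()) if dep in selected}
        for name in to_install
    }
    try:
        # static_order() yields every node after all of its predecessors.
        return list(TopologicalSorter(graph).static_order())
    except CycleError as err:
        # Assumption: a loop is reported instead of being installed.
        raise ValueError(f"dependency loop: {err.args[1]}") from err
```

With the two repositories from this thread, `installation_order({"package_matplotlib_1_2_1": ["package_numpy_1_7_1"]}, ["package_matplotlib_1_2_1", "package_numpy_1_7_1"])` puts package_numpy_1_7_1 first.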
What I think still needs to be worked out is how to ensure that the tool_dependencies.xml definition that installs the matplotlib package will find the previously installed numpy binary during compilation of matplotlib. Currently, the numpy binary will only be available to the installed and compiled matplotlib binary. I'll create a Trello card for this and let you know an estimate of when it will be available.
I already created a trello card after talking with InitHello in IRC. https://trello.com/c/QTeSmNSs
My idea would be to populate all env.sh scripts associated with all <repository toolshed="www"> tags during the execution of <action type="shell_command"></action> commands.
The tool author can add the ./lib/ folder to LD_LIBRARY_PATH, and any program that depends on it at compile time can then use it, as long as <repository toolshed="www" name="dep_with_populated_LD_LIBRARY_PATH"> is included.
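If "populating" is read as sourcing each dependency's env.sh before the shell command runs, the idea could be sketched in Python as below. The function name `with_dependency_env` and the example paths are hypothetical; how Galaxy actually locates the env.sh files is not covered in this thread.

```python
import shlex


def with_dependency_env(env_sh_paths, command):
    """Prefix a shell command so every dependency env.sh is sourced first.

    env_sh_paths: env.sh files of the repositories named in <repository> tags.
    command: the body of a shell_command action, passed through unchanged.
    """
    # ". file" is the POSIX way to source a script into the current shell.
    sourced = [f". {shlex.quote(path)}" for path in env_sh_paths]
    # "&&" stops the chain if any env.sh fails to load.
    return " && ".join(sourced + [command])
```

For example, `with_dependency_env(["/deps/numpy/1.7.1/env.sh"], "python setup.py install")` returns a single command line in which the numpy environment (including any LD_LIBRARY_PATH it exports) is loaded before the matplotlib build starts.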
By the way,
I noticed that revision 2:c5fbe4aa5a74 of your package_numpy_1_7 repository on the test tool shed includes the following contents. Is this the repository you are working with? Strangely, the repository dependency should be invalid because it should not be possible for a repository to define a dependency upon any revision of itself. You may have uncovered a way to do this using a tool dependency definition with a complex repository dependency. I'll look into this and make sure to provide a fix for the scenario you used.
Oh, OK. In revision 3 of both packages you should see what I was trying to do.
Ok, revision 3 looks good as long as it correctly installs and compiles numpy.
Instead of the above approach, your approach here should be to include only the tool_dependencies.xml definition file for installing only numpy version 1.7.1 in a repository named package_numpy_1_7_1 (use the full version in naming the repository). You should create a separate repository named package_matplotlib_1_2_1 that similarly contains a single tool_dependencies.xml file that (in addition to defining how to install and compile matplotlib) defines a complex repository dependency on the package_numpy_1_7_1 repository as described in the wiki at the link above.
This approach creates 2 separate orphan tool dependencies, the second of which (matplotlib) has a complex repository dependency on the first (numpy). When you install the package_matplotlib_1_2_1 repository and check the box for handling tool dependencies during the installation, it will install the package_numpy_1_7_1 repository and create a pointer to the numpy binary in the env.sh file within the package_matplotlib_1_2_1 repository environment. This enables matplotlib to locate the required version of numpy.
I know this is a bit tricky, so please let me know if it still does not make sense.
Let's see if I got it right.
repository_dependencies.xml will be parsed first. The repositories defined there, and the system variables they include and populate, will be available in tool_dependencies.xml, which is parsed afterwards. Is that correct?
I'm not quite sure I understand your statements above, but I've looked at revision 3 of your package_matplotlib_1_2_1 repository and the tool_dependencies.xml definition looks good (with the exception of the currently unsupported "prior_installation_required" attribute), so I think you've successfully deciphered my documentation.
I'll make sure to keep you informed as I make progress on the missing pieces that will support what you need this week.
I will try that. Thanks! Bjoern
Thanks very much,
Greg Von Kuster
On Apr 15, 2013, at 3:29 PM, Björn Grüning wrote:
Hi,
is there a general rule to handle dependencies inside of tool_dependencies.xml?
Let's assume I write a matplotlib orphan tool_dependencies.xml file. matplotlib depends on numpy. Numpy already has an orphan definition.
Is there a way to include numpy as a dependency inside the matplotlib definition, so that I do not need to fetch and compile numpy inside of matplotlib?
I tried to specify it beforehand but that did not work.
Thanks! Bjoern
___________________________________________________________ Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/