Done.

On the topic of the freeze: if Dannon's change requiring all metadata to be "set externally" is going to be included, I would suggest someone look at build_command_line in runners/__init__.py.

https://bitbucket.org/galaxy/galaxy-central/src/237209336f0337ea9f47df39548d...

I think there is a bug where setting metadata externally masks the return code of the tool (likewise if from_work_dir is used). I just created a Trello card (https://trello.com/c/JfB2w1Br) with an idea for how to address it, but I think the problem is going to be more severe when everyone is setting metadata externally. I have only observed this for the from_work_dir case, but based on code inspection I don't know how setting metadata externally would be different.

Also, that same change broke the LWR, so it would be very much appreciated if PR 166 could be accepted before the release is tagged :) or at least the first two changesets.

Thanks all,
-John

On Mon, May 20, 2013 at 8:17 AM, Nate Coraor <nate@bx.psu.edu> wrote:
John,
Could you create a pull request with your changes from the branch in github? I'll accept them and then commit my additions and changes. Today is the "freeze" so I'd like to get this in to the next release.
Thanks, ---nate
On May 17, 2013, at 11:21 AM, John Chilton wrote:
Hey All,
There was a long conversation about this topic in IRC yesterday (among people who don't actually use the tool shed all that frequently). I have posted it to the new unofficial Galaxy Google+ group if anyone would like to read and chime in.
https://plus.google.com/111860405027053012444/posts/TkCFwA2jkDN
-John
On Tue, May 14, 2013 at 3:59 PM, Nate Coraor <nate@bx.psu.edu> wrote:
Greg created the following card, and I'm working on a few changes to your commit:
https://trello.com/card/toolshed-consider-enhancing-tool-dependency-definiti...
Thanks, --nate
On May 14, 2013, at 1:45 PM, Nate Coraor wrote:
On May 14, 2013, at 10:58 AM, John Chilton wrote:
Hey Nate,
On Tue, May 14, 2013 at 8:40 AM, Nate Coraor <nate@bx.psu.edu> wrote:
Hi John,
A few of us in the lab here at Penn State actually discussed automatic creation of virtualenvs for dependency installations a couple weeks ago. This was in the context of Bjoern's request for supporting compile-time dependencies. I think it's a great idea, but there's a limitation that we'd need to account for.
If you're going to have frequently used and expensive-to-build libraries (e.g. numpy, R + rpy) in dependency-only repositories and then have your tool(s) depend on those repositories, the activate method won't work: virtualenvs cannot depend on other virtualenvs or be active at the same time as other virtualenvs. We could work around it by setting PYTHONPATH in the dependencies' env.sh like we do now. But then, other than making installation a bit easier (e.g. by allowing the use of pip), we have not gained much.
I don't know what to make of your response. It seems like a no, but the word no doesn't appear anywhere.
Sorry about being wishy-washy. Unless anyone has any objections or can foresee other problems, I would say yes to this. But I believe it should not break the concept of common-dependency-only repositories.
I'm pretty sure that as long as the process of creating a venv also adds the venv's site-packages to PYTHONPATH in that dependency's env.sh, the problem should be automatically dealt with.
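To make the PYTHONPATH idea concrete, here is a minimal sketch of what generating those env.sh lines could look like. It is not taken from the patch under discussion: the function names, the example install path, and the default Python version are all hypothetical, and it assumes the usual POSIX virtualenv layout (lib/pythonX.Y/site-packages).

```python
import posixpath


def site_packages_dir(venv_dir, python_version="2.7"):
    """Return the site-packages path inside a virtualenv.

    Assumes the usual POSIX layout: <venv>/lib/pythonX.Y/site-packages.
    """
    return posixpath.join(
        venv_dir, "lib", "python" + python_version, "site-packages"
    )


def env_sh_lines(venv_dir, python_version="2.7"):
    """Build the lines a dependency's env.sh would need (hypothetical helper).

    The venv's site-packages is prepended to PYTHONPATH instead of the venv
    being activated, so several such dependencies can be sourced together.
    """
    site_packages = site_packages_dir(venv_dir, python_version)
    return [
        # ${PYTHONPATH:+:$PYTHONPATH} appends the old value only if it is set.
        'PYTHONPATH="%s${PYTHONPATH:+:$PYTHONPATH}"' % site_packages,
        "export PYTHONPATH",
    ]


if __name__ == "__main__":
    # Hypothetical install location for a numpy dependency.
    for line in env_sh_lines("/deps/numpy/1.7.1/venv"):
        print(line)
```

Because each env.sh only prepends a directory, sourcing two of them in sequence leaves both site-packages directories on PYTHONPATH, which is what a tool depending on several dependency-only repositories would need.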
I don't know the particulars of rpy, but numpy installs fine via this method and I see no problem with each application having its own copy of numpy. I think relying on OS-managed Python packages, for instance, is something of a bad practice; when developing and distributing software I use virtualenvs for everything. I think that stand-alone Python packages defined in the tool shed are directly analogous to OS-managed packages.
Completely agree that we want to avoid OS-managed python packages. I had, in the past, considered that for something like numpy, we ought to make it easy for an administrator to allow their own version of numpy to be used, since numpy can be linked against a number of optimized libraries for significant performance gains, and this generally won't happen for versions installed from the toolshed unless the system already has stuff like atlas-dev installed. But I think we still allow admins that possibility with reasonable ease since dependency management in Galaxy is not a requirement.
What we do want to avoid is the situation where someone clones a new copy of Galaxy, wants to install 10 different tools that all depend on numpy, and has to wait an hour while 10 versions of numpy compile. Add that in with other tools that will have a similar process (installing R + packages + rpy) plus the hope that down the line you'll be able to automatically maintain separate builds for remote resources that are not the same (i.e. multiple clusters with differing operating systems) and this hopefully highlights why I think reducing duplication where possible will be important.
I also disagree that we have not gained much. Setting up these repositories is an onerous, brittle process. This patch provides some high-level functionality for creating virtualenvs, which negates the need for creating separate repositories per package.
This is a good point. I probably also sold short the benefit of being able to install with pip, since this does indeed remove a similarly brittle and tedious step of downloading and installing modules.
--nate
-John
--nate
On May 13, 2013, at 6:49 PM, John Chilton wrote:
> The proliferation of individual python package install definitions has
> continued and it has spread to some MSI managed tools. I worry about
> the tedium I will have to endure in the future if that becomes an
> established best practice :) so I have implemented the python version
> of what I had described in this thread:
>
> As patch:
> https://github.com/jmchilton/galaxy-central/commit/161d3b288016077a99fb7196b...
> Pretty version:
> https://github.com/jmchilton/galaxy-central/commit/161d3b288016077a99fb7196b...
>
> I understand that there are going to be differing opinions as to
> whether this is the best way forward but I thought I would give my
> position a better chance of succeeding by providing an implementation.
>
> Thanks for your consideration,
> -John
>
>
> On Wed, Apr 17, 2013 at 3:56 PM, Peter Cock <p.j.a.cock@googlemail.com> wrote:
>> On Tue, Apr 16, 2013 at 2:46 PM, John Chilton <chilton@msi.umn.edu> wrote:
>>> Stepping back a little, is this the right way to address Python
>>> dependencies?
>>
>> Looks like I missed this thread, hence:
>> http://lists.bx.psu.edu/pipermail/galaxy-dev/2013-April/014169.html
>>
>>> I was a big advocate for inter-repository dependencies,
>>> but I think taking it to the level of individual python packages might
>>> be going too far - my thought was they were needed for big 100Mb
>>> programs and stuff like that.
>>
>> It should work but it is a lot of boilerplate for something which
>> should be more automated.
>>
>>> At the Java jar/Python library/Ruby gem
>>> level I think using some of the platform specific packaging stuff to
>>> create isolated environments for each program might be a better way
>>> to go.
>>
>> I agree, the best way forward isn't obvious here, and it may make
>> sense to have tailored solutions for Python, Perl, Java, R, Ruby,
>> etc packages rather than the current Tool Shed package solution.
>>
>> I'd like to be able to just continue to write this kind of thing in my
>> tool XML files and have it actually taken care of (rather than ignored):
>>
>> <requirements>
>>     <requirement type="python-module">numpy</requirement>
>>     <requirement type="python-module">Bio</requirement>
>> </requirements>
>>
>> Adding a version key would be sensible, handling min/max etc
>> as per Python packaging norms.
>>
>> Peter
> ___________________________________________________________
> Please keep all replies on the list by using "reply all"
> in your mail client. To manage your subscriptions to this
> and other Galaxy lists, please use the interface at:
> http://lists.bx.psu.edu/
>
> To search Galaxy mailing lists use the unified search at:
> http://galaxyproject.org/search/mailinglists/
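As a rough illustration of Peter's suggestion above, the sketch below reads "python-module" requirements out of a tool XML file and turns them into pip-style specifiers. Nothing here is existing Galaxy behaviour: the function name, the sample tool XML, the "version" attribute (Peter's proposed "version key"), and the choice of an exact "==" pin are all assumptions. Note also that a module name such as Bio is not necessarily the name of the distribution that provides it.

```python
import xml.etree.ElementTree as ET

# Hypothetical tool XML; the version attribute is the proposed "version key".
TOOL_XML = """
<tool id="example">
  <requirements>
    <requirement type="python-module">numpy</requirement>
    <requirement type="python-module" version="1.61">Bio</requirement>
    <requirement type="package">samtools</requirement>
  </requirements>
</tool>
"""


def python_module_requirements(tool_xml):
    """Collect python-module requirements as pip-style specifier strings."""
    root = ET.fromstring(tool_xml)
    specs = []
    for req in root.iter("requirement"):
        if req.get("type") != "python-module":
            continue  # leave other requirement types to existing handling
        name = (req.text or "").strip()
        version = req.get("version")
        # Pin exactly when a version is given; min/max ranges are not handled.
        specs.append("%s==%s" % (name, version) if version else name)
    return specs


if __name__ == "__main__":
    print(python_module_requirements(TOOL_XML))
```

A list like this could then be handed to whatever installs packages into the tool's isolated environment; handling min/max constraints "as per Python packaging norms" would need a richer attribute scheme than the single version key assumed here.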