On Mon, Sep 23, 2013 at 2:30 PM, Carlos Borroto <carlos.borroto@gmail.com> wrote:
On Mon, Sep 23, 2013 at 2:22 PM, John Chilton <chilton@msi.umn.edu> wrote:
Hi John,
First, let me explain why I think duplicating the biopython install reduces reproducibility. I have this in my tool's setup.py: install_requires=[ "docopt", "biopython", "python-levenshtein" ],
While I could specify versions here (e.g. biopython==1.62), I feel that is not a good thing outside of Galaxy. I think pip should be free to install the latest version of these packages until I find there is an issue. I think this is the most common approach, though I might be wrong. This leaves me with a problem: when Galaxy installs my tool using virtualenv, it will grab the most up-to-date version of these packages, hence reducing reproducibility. Did I explain myself well enough? I'll be happy to debate any of this.
I understand what you are saying and I sympathize with you here. Still, I think the better approach is to copy these requirements into the setup_virtualenv block and specify hard-coded versions. This way you get reproducibility across all packages, not just biopython.
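To make the trade-off concrete, here is a minimal sketch that flags which entries in a requirement list lack an exact `==` pin and would therefore resolve to whatever release is newest at install time. The helper name `unpinned` is hypothetical; it is not part of setuptools, pip, or Galaxy, and the substring check is a simplification of real requirement parsing.

```python
def unpinned(requirements):
    """Return the requirement strings that do not carry an exact '==' pin."""
    return [req for req in requirements if "==" not in req]


# The loose list from setup.py: every entry floats to the latest release.
install_requires = ["docopt", "biopython", "python-levenshtein"]

# A pinned counterpart, using versions mentioned in this thread.
pinned = ["docopt==0.6.1", "python-levenshtein==0.10.2", "biopython==1.62"]

print(unpinned(install_requires))  # all three entries are unpinned
print(unpinned(pinned))            # nothing is unpinned
```

Keeping the loose list in setup.py and a pinned list for the Galaxy install is the "slight duplication" under discussion.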
Hi John,
Could you go a little further with this recommendation? How can I specify versions for required packages in setup_virtualenv? I now have this:

<install version="1.0">
    <actions>
        <action type="setup_virtualenv">ngs-tools==0.1.6</action>
    </actions>
</install>
I tried these two without luck: <action type="setup_virtualenv">docopt==0.6.1 python-levenshtein==0.10.2 biopython==1.62 ngs-tools==0.1.6</action>
So the contents are treated like a requirements.txt file, which means whitespace matters. (I have a plan to improve this and roughly synchronize the syntax used for Ruby, Python, and R, but for now it's just a file.) So you want this:

<action type="setup_virtualenv">docopt==0.6.1
python-levenshtein==0.10.2
biopython==1.62
ngs-tools==0.1.6
</action>

Newline between dependencies, and no whitespace to the left of each package. Someday the syntax will be:

<action type="setup_virtualenv">
    <package>docopt==0.6.1</package>
    <package>python-levenshtein==0.10.2</package>
    biopython==1.62
    ngs-tools==0.1.6
</action>
<action type="setup_virtualenv">docopt==0.6.1, python-levenshtein==0.10.2, biopython==1.62, ngs-tools==0.1.6</action>
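The difference between the one-line attempts and a requirements.txt-style layout can be illustrated with a small sketch. It does not reproduce how Galaxy or pip actually parse the block; `requirements_block` and `requirement_lines` are hypothetical helpers that only show what "one requirement per line, no leading whitespace" means in practice.

```python
def requirements_block(pins):
    """Join (name, version) pairs into newline-separated 'name==version' lines."""
    return "\n".join(f"{name}=={version}" for name, version in pins)


def requirement_lines(text):
    """Split text as a line-oriented file is read: one entry per non-empty line."""
    return [line for line in text.splitlines() if line.strip()]


pins = [
    ("docopt", "0.6.1"),
    ("python-levenshtein", "0.10.2"),
    ("biopython", "1.62"),
    ("ngs-tools", "0.1.6"),
]

good = requirements_block(pins)

# The two one-line attempts from this thread.
spaces = "docopt==0.6.1 python-levenshtein==0.10.2 biopython==1.62 ngs-tools==0.1.6"
commas = "docopt==0.6.1, python-levenshtein==0.10.2, biopython==1.62, ngs-tools==0.1.6"

print(len(requirement_lines(good)))    # 4
print(len(requirement_lines(spaces)))  # 1
print(len(requirement_lines(commas)))  # 1
```

Read line by line, the space- and comma-separated forms each collapse into a single entry rather than four separate requirements.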
I think this slight duplication is a smaller problem than the mixing of dependency mechanisms you described in your approach. To me it is analogous to installing some Python dependencies via OS packages and others via sudo pip install into /usr; it is a recipe for confusion.
While I'm getting convinced that maybe some duplication is not that bad after all, please note that my plan is to install everything from the toolshed. I also don't like mixing install methods. In fact, I would like 'install_pip' to have the best-practice option of always doing 'pip install --no-deps'. This would force you to first upload everything your package needs to the toolshed.
Thanks, Carlos
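As a rough illustration of the 'pip install --no-deps' idea, the sketch below builds a pip command that installs exactly one pinned requirement and skips its dependencies. The function name `pip_no_deps_command` is an assumption made for this example; it is not an existing Galaxy action, only a wrapper around pip's real `--no-deps` flag.

```python
import sys


def pip_no_deps_command(requirement):
    """Build a pip command that installs one requirement without its dependencies."""
    return [sys.executable, "-m", "pip", "install", "--no-deps", requirement]


cmd = pip_no_deps_command("ngs-tools==0.1.6")
print(" ".join(cmd[1:]))
# To actually run the install: subprocess.check_call(cmd)
```

With dependencies skipped, anything the package needs must already be installed by some other step, which is the forcing effect described above.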