On Thu, Aug 29, 2013 at 5:45 AM, Guest, Simon <Simon.Guest@agresearch.co.nz> wrote:
Dear Galaxians,
This email is about difficulties with the current approach for installing tool dependency binaries from the Galaxy Toolshed, and what might be done to improve the situation. It comes down to this: packaging software to run on different systems is tricky. It is a problem that has been solved by various Linux distributions with their packaging systems (RPM, deb, etc.), and package archives. The Galaxy Toolshed is trying to solve this problem again, but so far it doesn't work very well. There must be something better we can do.
I agree with you, and as more people try to package their tools and dependencies, I think more will too :(
Since gaining a better understanding from the Galaxy Community Conference of what the Toolshed is trying to do (versioned tools, reproducibility), I have been working on switching over from locally installed tools to Toolshed versions. However, it has not gone well, and I think I am about to revert to my previous approach. Here's the problem: building software from source on any system requires certain tweaks to the build process which are dependent on the target platform. An example is the NCBI BLAST+ suite, which failed to build on my (EL6) system, because it couldn't run /usr/bin/touch. That's pretty dumb, and pretty simple to solve in isolation - it needs to be running /bin/touch instead.
Can we continue this specific example here? http://lists.bx.psu.edu/pipermail/galaxy-dev/2013-August/015890.html ... http://lists.bx.psu.edu/pipermail/galaxy-dev/2013-August/016287.html Short answer: yes, I know. A new install XML process is being used on the Test Tool Shed which fixes this (but breaks in a not-yet-understood way on the Galaxy team's test cluster), and it is awaiting release to the main Tool Shed.
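To illustrate the /usr/bin/touch versus /bin/touch problem in general terms, here is a minimal Python sketch of looking a tool up on the search path rather than hard-coding its location. It is not how the BLAST+ build or the Tool Shed install process works; the function name `find_tool` and the `extra_dirs` default are assumptions made up for this example.

```python
import os
import shutil


def find_tool(name, extra_dirs=("/bin", "/usr/bin")):
    """Return the path of an executable called `name`.

    Searches the directories on PATH first, then `extra_dirs` (a
    hypothetical fallback list), instead of assuming one fixed location
    such as /usr/bin/touch.
    """
    search_dirs = os.environ.get("PATH", os.defpath).split(os.pathsep)
    search_dirs.extend(extra_dirs)
    found = shutil.which(name, path=os.pathsep.join(search_dirs))
    if found is None:
        raise FileNotFoundError(f"{name!r} not found in: {search_dirs}")
    return found
```

Calling `find_tool("touch")` should then resolve to whichever location the platform actually uses, and fail with a clear error if the tool is missing altogether.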
But the general point is this: it's not feasible (i.e. too much work, too hard) to produce build scripts to build software from source that work on any platform, even the common ones. Packaging source code for a given platform is a non-trivial task. The RPM and deb packagers are doing a good job here. It's a significant amount of work. I know that, as I've been packaging bioinformatics software as binary RPMs for EL6 for 18 months or so now, and have done nearly 300 packages.
What do we want? Simply to be able to install a given version of some software, and all its dependencies, with a single click, or a single command, and have it Just Work (tm). It's the dependencies that make this hard. Things get installed in different ways on different systems. Does your platform need #include <bam.h>, or #include <bam/bam.h>? If the former, then you'll have to patch tophat, say, (in a trivial way) before building it. I think this is simply too hard to do by embedding some commands and conditionals in Toolshed XML build files.
Indeed - "nice" tools being packaged will have something like a ./configure script to take care of that, but not all :(
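As a rough illustration of the kind of check a ./configure script does for the bam.h case, the Python sketch below probes a list of include directories and reports which #include form would work. It is only a sketch under stated assumptions: the function name `bam_include_line` is hypothetical, and a real configure script would test with the compiler rather than just looking for the file.

```python
from pathlib import Path


def bam_include_line(include_dirs):
    """Return the #include line matching where bam.h is installed.

    `include_dirs` is a list of candidate include directories supplied by
    the caller (for example ["/usr/include", "/usr/local/include"]).
    Returns None if bam.h is not found in either layout.
    """
    for directory in include_dirs:
        base = Path(directory)
        if (base / "bam.h").is_file():
            return "#include <bam.h>"
        if (base / "bam" / "bam.h").is_file():
            return "#include <bam/bam.h>"
    return None
```

A build recipe could use the result to decide whether a source patch is needed before compiling, which is the per-platform knowledge that is hard to encode in a single generic build file.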
It seems to me that a number of people out there are currently having some issues installing tool dependencies from the Toolshed, because things are not building as expected. I think it's much easier for just one person to troubleshoot why things go wrong when they are packaging the software for a given platform, rather than for each end user (Galaxy admin) to wonder why a tool failed to install.
So, what to do? My starting point is that I have packaged a large amount of bioinformatics software for EL6, which is freely available at http://rpm.agresearch.co.nz/. I'm after some Galaxy tool wrappers for the tools that we use here at AgResearch, which can simply make use of packages installed from this repo.
Is there any interest in exploring the merits or otherwise of this approach in the Galaxy community?
There is a similar but probably larger set of Debian packages available via Debian-Med and Bio-Linux too. The catch here is: can you install arbitrary versions of a tool in parallel? I think the answer, sadly, is no. The idea of standard recipe templates (e.g. a typical Python install) that James outlined here might help: http://lists.bx.psu.edu/pipermail/galaxy-dev/2013-August/016273.html
Peter