On Wed, Jun 1, 2011 at 4:22 PM, Greg Von Kuster <greg@bx.psu.edu> wrote:
Hello Peter - I finally got a chance to jump in - see my inline comments...
Hi :)
What happens with branches? Would the Tool Shed just show the default branch? That seems best for a simple UI.
Some of the branching details are yet to be worked out, but forks are easy because repository urls include the unique username of the Galaxy user.
Well, yes and no - as long as there are competing versions of a Galaxy tool (e.g. from an original author and a fork by a second author), and they use the same ID in their XML, you have a clash. This will have to be considered in the (automated) install interface. That is, in general, when installing or updating any tool, there may be existing versions of some components already present. In fact, two completely unrelated tools could even have the same XML ID by accident.
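As a rough illustration of the clash described above, here is a minimal sketch of the kind of check an install interface might run before adding tools. It assumes only that each tool XML file has a root `<tool>` element with an `id` attribute; the function name, the owner/repository labels and the example IDs are hypothetical, and this is not how the Tool Shed actually does it.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def find_id_clashes(tool_sources):
    """Return {tool_id: [labels]} for every tool ID declared by more than one source.

    tool_sources maps a label (hypothetical "owner/repository" string) to the
    text of a tool XML file.
    """
    by_id = defaultdict(list)
    for label, xml_text in tool_sources.items():
        root = ET.fromstring(xml_text)
        tool_id = root.get("id")
        # Only count documents whose root is <tool> and which declare an id.
        if root.tag == "tool" and tool_id:
            by_id[tool_id].append(label)
    return {
        tool_id: sorted(labels)
        for tool_id, labels in by_id.items()
        if len(labels) > 1
    }


# Hypothetical example: an original tool, a fork reusing its ID, and an unrelated tool.
EXAMPLE_SOURCES = {
    "alice/example": '<tool id="example_tool" name="Example" version="0.0.1"/>',
    "bob/example_fork": '<tool id="example_tool" name="Example (fork)" version="0.0.2"/>',
    "carol/other": '<tool id="other_tool" name="Other" version="1.0.0"/>',
}

clashes = find_id_clashes(EXAMPLE_SOURCES)
```

The same check would need to include the tools already installed locally, since a clash can come from an existing install as well as from two repositories selected together.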
I have a query regarding the way the tools are shown in tables and the "version" column, which shows a changeset and revision number. According to Greg's slides (slide #10, titled "Simpler tool versioning", which seems ironic to me), the old numerical version is still there in the XML - and I'd prefer to see that. How about having both shown (two columns, perhaps called "Public version" and "hg version" or "hg revision")?
We can certainly do this, but what would you like to see for tool suites and other tool "types"? The old Galaxy tool shed strictly required a suite_config.xml file that included the overall version of the suite. To make tool development easier, we're no longer requiring the inclusion of a suite_config.xml file (we don't even differentiate types of tools since everything is a repository). The definition of a tool in the next gen tool shed is fairly loose. A tool could be data, it could be an exported workflow, it could be a suite of tools, a single tool, or just a set of files. So we'll need to define an easy way to provide a version of the tool if it will be different than the version of the repository tip.
I see what you mean for the "suite" case. Maybe on the view details page each constituent tool could be shown with its "classical" version number from the XML file?
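To make the two-column idea concrete, here is a small sketch that builds one row per constituent tool, pairing the "classical" version from the tool XML with the repository revision. It assumes the version lives in the `version` attribute of the root `<tool>` element; the function name, the revision string and the "(not set)" placeholder are made up for illustration.

```python
import xml.etree.ElementTree as ET


def version_rows(hg_revision, tool_xml_texts):
    """Return (tool id, public version, hg revision) tuples, one per tool XML.

    hg_revision is passed in as a plain string here; how the Tool Shed would
    obtain it from Mercurial is outside the scope of this sketch.
    """
    rows = []
    for xml_text in tool_xml_texts:
        root = ET.fromstring(xml_text)
        # "(not set)" is a placeholder chosen for this example only.
        public_version = root.get("version", "(not set)")
        rows.append((root.get("id"), public_version, hg_revision))
    return rows


# Hypothetical suite with two tools in one repository.
suite_rows = version_rows(
    "3:0123456789ab",
    [
        '<tool id="tool_one" name="Tool one" version="0.0.4"/>',
        '<tool id="tool_two" name="Tool two"/>',
    ],
)

for tool_id, public_version, revision in suite_rows:
    print(f"{tool_id:<10} {public_version:<10} {revision}")
```

For a suite, every row shares the same hg revision while the public versions can differ, which is the distinction the two columns are meant to expose.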
Here are the future "big picture" highlights. Many of the details are yet to be defined and fleshed out...
We're hoping that in the near future there will be many local tool sheds (just like Galaxy instances). I'm thinking that there will be a central tool shed "broker" of sorts that is hosted by the Galaxy team. This broker will provide 2 basic functions. It will enable local tool sheds (including the current tool shed hosted by the Galaxy team) to advertise their tools, and it will allow local Galaxy instances to use those advertisements to find tools that the local Galaxy instance's users are interested in. This specific point has not yet been discussed to any depth, so consider it fluid for now.
I'm not immediately sold on this plan. To me one of the big plus points of having a single "Official" Tool Shed looked after by the Galaxy team is the convenience factor (a one stop shop), which requires critical mass, plus whatever QA happens as part of the current approval process. I would regard it as a step backwards if in order to hunt for a wrapper for a given tool, I had to resort to Google in order to find all the individual Galaxy Tool Sheds.
When a Galaxy instance's admin locates tools within a specific tool shed that they want to install, they will be able to install them via a Galaxy tool installation control panel. Think of a UI that provides a check-boxed list of tools that have been found in some tool shed or sheds. The Galaxy admin will check those tools he wants to install, and the tools, along with all dependencies, will automatically be installed in the local Galaxy instance. Dependencies could include 3rd party binaries, maybe some form of data, and other forms of dependencies. This is another good reason to keep tools separated in their own repositories.
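One way to picture "the tools, along with all dependencies" is as an ordering problem: given the checked tools and a declared dependency map, work out what else is needed and install prerequisites first. The sketch below uses Python's standard `graphlib` module (Python 3.9+) for the ordering. The dependency map format and all names are hypothetical, and it says nothing about how each item would actually be fetched or built.

```python
from graphlib import TopologicalSorter


def install_order(selected, depends_on):
    """Return the selected items plus their transitive dependencies,
    ordered so that every dependency comes before whatever needs it.

    depends_on maps an item name to the names it directly depends on.
    graphlib raises CycleError if the declared dependencies are circular.
    """
    needed = {}
    stack = list(selected)
    while stack:
        name = stack.pop()
        if name in needed:
            continue
        deps = depends_on.get(name, [])
        needed[name] = set(deps)
        stack.extend(deps)
    return list(TopologicalSorter(needed).static_order())


# Hypothetical declarations: a wrapper needing a binary and some index data.
DEPENDS_ON = {
    "wrapper_a": ["binary_x", "index_data"],
    "index_data": ["binary_x"],
    "wrapper_b": [],
}

order = install_order(["wrapper_a"], DEPENDS_ON)
```

Only the checked tool and what it pulls in appear in the result; unrelated repositories such as `wrapper_b` are left alone.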
If you mean by "dependencies" the small task of installing the tool XML and associated scripts and data files currently bundled in the tarballs on the current Tool Shed, that seems fine. Anything beyond that seems difficult and likely to impose a significant extra load on tool wrapper authors.
The installation will be virtually automatic, requiring little or no manual intervention, via a "package manager" of sorts. This will be done using a combination of fabric scripts and other components. All of the underlying mercurial stuff will be handled beneath the UI layer.
This larger aim of installing the underlying dependencies is impossible in general - but that seems to be what you want to aim for. Consider the obvious use case of closed source (non-redistributable) 3rd party binaries. I can think of several examples from the current Tool Shed wrappers, including the Roche "Newbler" off-instrument applications, TMHMM and SignalP.
Even if you just hope to cover open source tool dependencies, this is another big problem which seems like something Galaxy shouldn't be taking on. Frankly, the only way I expect this grand plan to have any practical chance of success is if you limit yourselves to a single existing Linux package management platform like RPM or Deb files (although doing that would limit Galaxy's appeal). For example, work hand in hand with Debian-Med to ensure any missing tool is covered.
Are you biting off more than you can chew? I hope I am misinterpreting your plans.
(And for the umpteenth time, I am frustrated I couldn't make it to the Galaxy conference last week in person - more for this kind of discussion rather than the talks themselves. Will you be at BOSC or ISMB 2011 in Vienna? Maybe that could be another thread...)
Regards,
Peter