Hello Peter - I finally got a chance to jump in - see my inline comments...
On Jun 1, 2011, at 11:00 AM, Peter Cock wrote:
On Wed, Jun 1, 2011 at 3:22 PM, Nate Coraor <nate@bx.psu.edu> wrote:
Hi Peter,
Greg will probably reply, but I'll throw in my $0.02 as well.
Great - but with your answers you've triggered more questions ;)
Peter Cock wrote:
Hi Greg et al,
I've just been looking over your slides from last week about the new 'Galaxy Tool Shed', which are posted online here:
http://wiki.g2.bx.psu.edu/GCC2011
http://wiki.g2.bx.psu.edu/GCC2011?action=AttachFile&do=get&target=GalaxyToolShed.pdf
They talk about how you will be tracking individual tools in hg repositories.
I can see two ways this might work:
(1) Each of these tool specific repositories (or branches if you just make one repository for each tool owner) would be a full fork of the Galaxy code base. This allows in principle tools to include changes to core functionality (but that seems dangerous due to potential merge clashes), and any existing tool contributor's pre-existing hg forks on bitbucket might be reused.
The tool shed isn't really intended for framework changes - I would suggest keeping these as bitbucket forks, although it would certainly be good if we had a way to locate the list of such forks centrally.
Well, as long as the repository is created by forking on bitbucket, the link already exists in the bitbucket web interface:
https://bitbucket.org/galaxy/galaxy-central/descendants
What's important here is that each tool or set of tools is its own separate entity - see the future "big picture" highlights below for reasons.
(2) Each of these tool specific repositories would ONLY track the tool specific files you'd add to Galaxy to install the tool. So, typically there would be an XML file, perhaps a wrapper script, maybe a sample loc file, and a plain text readme file.
I'm guessing you've gone for something along the lines of idea (2), but I
Yep.
It did seem the most likely route.
would love to hear more about how this will all work. e.g. Where would the tool shed repositories be hosted, and would tool authors use hg to work with them, or something like the current web based tool upload?
They're hosted here, and you can check them out and work with them locally as you do the Galaxy source itself, or use the new web-based upload to upload individual files or tarballs.
Have a look at the test instance of the next-gen toolshed here if you'd like to see how it works:
http://testtoolshed.g2.bx.psu.edu/
Please feel free to use this as a sandbox and report any issues you find.
I see the existing usernames and passwords from the old Tool Shed were transferred - that makes life easier. And it lists the hg information, e.g.
hg clone http://peterjc@testtoolshed.g2.bx.psu.edu/repos/peterjc/venn_list
hg clone http://peterjc@testtoolshed.g2.bx.psu.edu/repos/peterjc/tmhmm_and_signalp
What happens with branches? Would the Tool Shed just show the default branch? That seems best for a simple UI.
Some of the branching details are yet to be worked out, but forks are easy because repository urls include the unique username of the Galaxy user.
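To illustrate the URL layout described above, here is a minimal Python sketch that assembles a clone URL from an owner name and a repository name. The helper `build_clone_url` and its `login` parameter are hypothetical; the pattern is simply copied from the example `hg clone` commands in this thread and is not a Tool Shed API.

```python
from urllib.parse import quote

# Host of the test Tool Shed, copied from the clone commands quoted in this thread.
TEST_TOOL_SHED_HOST = "testtoolshed.g2.bx.psu.edu"


def build_clone_url(owner, repo_name, host=TEST_TOOL_SHED_HOST, login=None):
    """Return a URL of the form http://<login>@<host>/repos/<owner>/<repo_name>.

    Hypothetical helper: it only mirrors the URL pattern seen in the examples.
    """
    # By default the person cloning is assumed to be the repository owner.
    login = owner if login is None else login
    return "http://{}@{}/repos/{}/{}".format(
        quote(login, safe=""),
        host,
        quote(owner, safe=""),
        quote(repo_name, safe=""),
    )


if __name__ == "__main__":
    print("hg clone " + build_clone_url("peterjc", "venn_list"))
```

Because the owner's username is part of the path, two users can each hold a repository with the same name without clashing, which is what makes forks straightforward.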
I have a query regarding the way the tools are shown in tables and the "version" column, which shows a changeset and revision number. According to Greg's slides (slide #10, titled "Simpler tool versioning", which seems ironic to me), the old numerical version is still there in the XML, and I'd prefer to see that. How about showing both (two columns, perhaps called "Public version" and "hg version" or "hg revision")?
We can certainly do this, but what would you like to see for tool suites and other tool "types"? The old Galaxy tool shed strictly required a suite_config.xml file that included the overall version of the suite. To make tool development easier, we no longer require a suite_config.xml file (we don't even differentiate types of tools, since everything is a repository). The definition of a tool in the next-gen tool shed is fairly loose. A tool could be data, an exported workflow, a suite of tools, a single tool, or just a set of files. So we'll need to define an easy way to provide a version of the tool when it differs from the version of the repository tip.
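As a rough sketch of the two-column idea, the snippet below reads the `version` attribute from the root `<tool>` element of a wrapper and pairs it with a repository revision string. The sample XML, the function names, and the revision value are all made up for illustration; how the Tool Shed would actually obtain or display these values is not settled in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper contents; only the root <tool> element's attributes matter here.
SAMPLE_TOOL_XML = """<tool id="venn_list" name="Venn Diagram" version="0.0.3">
    <description>from lists</description>
</tool>"""


def public_version(tool_xml_text):
    """Return the version attribute of the root <tool> element, or None if absent."""
    root = ET.fromstring(tool_xml_text)
    if root.tag != "tool":
        # Not a tool wrapper (e.g. data or a workflow), so there is no XML version.
        return None
    return root.get("version")


def version_columns(tool_xml_text, hg_revision):
    """Pair the author-chosen version with the repository revision for display."""
    return {
        "Public version": public_version(tool_xml_text),
        "hg revision": hg_revision,
    }


if __name__ == "__main__":
    # "3:0123456789ab" is a made-up revision:changeset pair.
    print(version_columns(SAMPLE_TOOL_XML, "3:0123456789ab"))
```

Returning `None` for anything that is not a `<tool>` wrapper reflects the point above that a repository may hold data, workflows, or plain files, for which only the repository revision is available.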
With regards to the planned installation functionality, what happens when a tool repository (aka Tool Suite in the old model) contains several XML wrappers - would you be able to choose which are wanted?
Yes - see below...
The use case I have here is when several tools share some common dependency (which should be tracked in a single repository), and are therefore useful to bundle together as a suite, but where not all the tools will be of global interest (e.g. my TMHMM, SignalP, etc. suite).
Here's the future "big picture" highlights. Many of the details are yet to be defined and fleshed out...
We're hoping that in the near future there will be many local tool sheds (just like Galaxy instances). I'm thinking that there will be a central tool shed "broker" of sorts that is hosted by the Galaxy team. This broker will provide 2 basic functions. It will enable local tool sheds (including the current tool shed hosted by the Galaxy team) to advertise their tools, and it will allow local Galaxy instances to use those advertisements to find tools that the local Galaxy instance's users are interested in. This specific point has not yet been discussed to any depth, so consider it fluid for now.
When a Galaxy instance's admin locates tools within a specific tool shed that they want to install, they will be able to install them via a Galaxy tool installation control panel. Think of a UI that provides a check-boxed list of tools that have been found in some tool shed or sheds. The Galaxy admin will check the tools they want to install, and the tools, along with all dependencies, will automatically be installed in the local Galaxy instance. Dependencies could include 3rd party binaries, maybe some form of data, and other forms of dependencies. This is another good reason to keep tools separated in their own repositories.
The installation will be virtually automatic, requiring little or no manual intervention, via a "package manager" of sorts. This will be done using a combination of fabric scripts and other components. All of the underlying mercurial stuff will be handled beneath the UI layer.
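One way to picture "the tools, along with all dependencies" being installed is as a dependency-first ordering of whatever the admin ticked. The sketch below is purely illustrative: the function `install_order`, the dependency map, and the item names are hypothetical, and nothing here reflects how the planned installer will actually be built.

```python
def install_order(selected, dependencies):
    """Return the selected items plus all their dependencies, dependencies first.

    `dependencies` maps an item name to the list of names it needs.
    Raises ValueError if the dependency map contains a cycle.
    """
    order = []
    done = set()
    in_progress = set()

    def visit(item):
        if item in done:
            return
        if item in in_progress:
            raise ValueError("dependency cycle involving " + item)
        in_progress.add(item)
        # Install everything this item needs before the item itself.
        for dep in dependencies.get(item, []):
            visit(dep)
        in_progress.discard(item)
        done.add(item)
        order.append(item)

    for item in selected:
        visit(item)
    return order


if __name__ == "__main__":
    # Made-up dependency data: each wrapper needs a 3rd party binary.
    deps = {
        "tmhmm_wrapper": ["tmhmm_binary"],
        "signalp_wrapper": ["signalp_binary"],
    }
    # Only the ticked wrapper and its own dependency are returned.
    print(install_order(["tmhmm_wrapper"], deps))
```

Only the ticked items and what they require end up in the result, so an admin could pick a single wrapper from a suite without pulling in the rest.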
Peter
Greg Von Kuster
Galaxy Development Team
greg@bx.psu.edu