[galaxy-dev] Galaxy-less Tool Installing

15 Jul 2013

  One of my goals for the GCC was to sell the idea that tool shed
repositories need to be installable without a database present. I
talked with James Taylor and Enis Afgan about this briefly and they seemed
to think it was a good idea. I kept meaning to discuss it with Greg but
never got a good opportunity, though in the past Greg has made this sound
potentially doable and has never overtly objected to the goal.

  I have two specific use cases in mind (CloudBioLinux and LWR), but perhaps the
higher-level justification is this: a lot of effort from Greg and others
(Dave, Bjorn, Peter, Nate) has gone into building a modular dependency system
that could very easily be leveraged by applications other than Galaxy, so the
extra steps needed to make this possible should be taken, both to make the
codebase as broadly useful as possible and to encourage adoption. The Galaxy
community could benefit from other applications utilizing and populating the
tool shed, and Galaxy tool developers would be further incentivized to write
good, modular dependencies and publish them to the tool shed.

  A high-level task decomposition would be something like this:

  1. Rework installing tool shed repositories to not require a database. A kind
of messy way to do this might be adding a use_database flag throughout. A
cleaner way might be to allow the core functionality to work with callbacks
or plugins that perform the database interactions.

  2. Separate the core functionality out of the Galaxy code base entirely into
a reusable, stand-alone library.
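
  To make the callback idea in item 1 a bit more concrete, here is a minimal
sketch in Python. Every name in it (NullRecorder, ListRecorder,
install_repository) is hypothetical and made up for illustration; none of it
is existing Galaxy or tool shed code, and the actual download and build steps
are elided. The point is only that the install logic talks to a small recorder
object, and the default recorder does nothing, so no database is required.

```python
class NullRecorder:
    """Default hooks: record nothing, so no database is needed."""

    def package_installed(self, repo, package):
        pass

    def repository_installed(self, repo):
        pass


class ListRecorder(NullRecorder):
    """Stand-in for a Galaxy-side recorder that would write to the database."""

    def __init__(self):
        self.events = []

    def package_installed(self, repo, package):
        self.events.append(("package", repo, package))

    def repository_installed(self, repo):
        self.events.append(("repository", repo))


def install_repository(repo, packages, recorder=None):
    """Pretend to install `repo`, reporting progress through `recorder`."""
    if recorder is None:
        recorder = NullRecorder()
    installed = []
    for package in packages:
        # Real code would download and build the dependency here.
        installed.append(package)
        recorder.package_installed(repo, package)
    recorder.repository_installed(repo)
    return installed
```

  Under this assumption Galaxy would pass in a recorder backed by its database
models, while a stand-alone client would pass nothing at all.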

  I would love buy-in from the Galaxy team on item 2 above, but it is not
strictly needed for my goals - I imagine I could write a script to pull it out
of Galaxy and build the library automatically, or even just have the Galaxy
codebase present when using Galaxy-less tool shed dependencies.

  Buy-in on item 1 from the Galaxy team (specifically Greg and Dave B.) is
needed, however. Are there any objections to this idea? Do you have any broad
advice on how to approach this to ensure the changes make sense, work with your
long-term vision, and end up in Galaxy?

  Of all the things on my TODO list for the next year, this is probably the one
most likely to be of broad interest to this week's BOSC codefest attendees, so I
was going to pitch it as something to work on. The sales pitch would
include building a little tool shed version of the module command
(http://linux.die.net/man/1/module) to demonstrate this work and to produce
something immediately useful.

  The idea would be to create a command-line tool for utilizing tool shed
dependencies.

  # Unlike standard module, install procedure is available. Probably could
  # default to main tool shed and latest installable revision

  % tsmodule repo:install galaxyp/tint
  % tsmodule repo:install toolshed.g2.bx.psu.edu/galaxyp/tint/ab43b5ba7a4e
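
  As a rough illustration of how the two argument forms above might be parsed,
here is a small Python sketch. RepoSpec and parse_repo_spec are hypothetical
names, and treating a missing revision as "latest installable revision" is
just the default suggested in the comment above, not settled behavior.

```python
from collections import namedtuple

# Hypothetical container for the pieces of a repository argument.
RepoSpec = namedtuple("RepoSpec", ["tool_shed", "owner", "name", "revision"])

# Assumed default: the main tool shed used in the example above.
DEFAULT_TOOL_SHED = "toolshed.g2.bx.psu.edu"


def parse_repo_spec(spec):
    """Parse 'owner/name' or 'tool_shed/owner/name/revision'."""
    parts = spec.strip("/").split("/")
    if len(parts) == 2:
        owner, name = parts
        # revision=None stands for "latest installable revision".
        return RepoSpec(DEFAULT_TOOL_SHED, owner, name, None)
    if len(parts) == 4:
        return RepoSpec(*parts)
    # Three-part forms are ambiguous (shed/owner/name vs owner/name/revision).
    raise ValueError("unrecognized repository spec: %r" % spec)
```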

  # module lets you list packages, I guess tool shed version would need
  # repository and package listings:

  % tsmodule repo:list
  toolshed.g2.bx.psu.edu/galaxyp/tint/ab43b5ba7a4e
  % tsmodule package:list
  tint_proteomics_scripts/1.19.19/galaxyp/tint/ab43b5ba7a4e

  # Finally, a use command would source the env.sh script and make dependency
  # available in the command-line (might require starting new shell?):

  % tsmodule package:use tint_proteomics_scripts
  % tsmodule package:use tint_proteomics_scripts/1.19.19
  % tsmodule package:use tint_proteomics_scripts/1.19.19/galaxyp/tint/ab43b5ba7a4e

  # use apps that would be available to tools with valid requirements tags.
  % iQuantCLI
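
  On the "might require starting new shell?" question: a child process cannot
change the environment of the shell that launched it, so one possible approach
is for the use command to source env.sh in a throwaway shell, capture the
resulting environment, and then start a fresh shell with it. The Python sketch
below assumes env.sh can be sourced by a POSIX sh; the function names are
hypothetical and nothing here is existing tool shed code.

```python
import json
import os
import subprocess
import sys


def environment_after_sourcing(env_script):
    """Return the environment a POSIX shell has after sourcing env_script."""
    dump_env = "import json, os; print(json.dumps(dict(os.environ)))"
    output = subprocess.check_output([
        "sh", "-c", '. "$1" >/dev/null && exec "$2" -c "$3"', "sh",
        os.path.abspath(env_script),  # absolute path, so '.' skips PATH lookup
        sys.executable,               # interpreter used to dump the environment
        dump_env,
    ])
    return json.loads(output.decode("utf-8"))


def use_package(env_script, shell=None):
    """Replace this process with a shell that has the package environment."""
    shell = shell or os.environ.get("SHELL", "/bin/sh")
    os.execve(shell, [shell], environment_after_sourcing(env_script))
```

  With something like this, package:use would drop the user into a new shell
where the dependency is on the PATH, and exiting that shell would undo it.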

  This would be different from using the API scripts because there would be no
API, Galaxy instance, or Galaxy database involved - just the Galaxy code. If
this could be split out into its own Python library, one could even imagine
making something like tsmodule installable right from pip and having it pull
in a toolshed_client library or something like that as a dependency.

John Chilton