Hi everyone, 

This is my first post to the list, so forgive me if I miss something obvious. I tried to gather all the information I could, so if this has already been discussed, I’d be glad if you could point me to that thread.

 

I’m posting to start a discussion about how Galaxy is going to handle interactions with Docker.

I know there are already some tools and resources out there that make use of Docker. As I understand it, we can already specify Docker as a job destination in the tool XML, along with an image to use (see https://github.com/apetkau/galaxy-hackathon-2014/tree/master/smalt for a detailed example). This is cool, and it allows resources to be controlled, but I think we could take the integration a step further if we allowed Docker images to be generated from within Galaxy.

 

My idea is this: if there is a trusted base image (i.e. one without root access), we could let users install all the packages they need, commit the container after the install, and let them use the resulting Docker image. This could be beneficial for generating new tools with a sandboxed version of the Galaxy Tool Factory (http://www.ncbi.nlm.nih.gov/pubmed/23024011).
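
To make the install-then-commit idea a bit more concrete, here is a rough sketch of how it could be driven from Python through the docker command line. This is only an illustration, not what the dockertoolfactory does: the function names, the image and container names, and the unprivileged "galaxy" user inside the base image are all assumptions I made up for the example. Only the docker run, docker commit and docker rm subcommands themselves are real.

```python
import shlex
import subprocess


def build_commands(base_image, install_cmd, container_name, new_image, user="galaxy"):
    """Return the docker CLI calls for: run the install, commit, clean up.

    All names passed in are hypothetical; `user` is assumed to be an
    unprivileged account that already exists in the trusted base image.
    """
    run = ["docker", "run", "--name", container_name, "--user", user,
           base_image, "sh", "-c", install_cmd]
    commit = ["docker", "commit", container_name, new_image]
    remove = ["docker", "rm", container_name]
    return [run, commit, remove]


def build_user_image(base_image, install_cmd, container_name, new_image,
                     user="galaxy", dry_run=True):
    """Print the commands (dry run) or execute them in order."""
    commands = build_commands(base_image, install_cmd, container_name,
                              new_image, user=user)
    for cmd in commands:
        if dry_run:
            print(shlex.join(cmd))
        else:
            # check=True stops at the first failing step, so a failed
            # install is never committed as a new image.
            subprocess.run(cmd, check=True)
    return commands
```

With dry_run=True nothing is executed and the three commands are only printed, which is handy for checking what would run before giving it access to a real Docker daemon.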

 

I am currently working on this (https://bitbucket.org/mvdbeek/dockertoolfactory), and I would like to have your opinion on how to manage these user-generated Docker images.

Some questions that I came across:

Do we store the committed images in the user’s history (they could potentially become very big!)?

How do we display the available images to the user? Something like a per-user image registry?

How do we identify available images (using the dataset_id as the tag)?

How do we transfer user-generated images to the toolshed?

Is it a good idea at all to store user-generated images in the toolshed?

What do we do if the security of the base image is compromised? Obviously we can blacklist execution of images, but what if somebody has installed a dangerous image from the toolshed and is not aware of it?
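
As a starting point for the registry, tagging and blacklisting questions above, here is a minimal in-memory sketch. Everything in it is hypothetical (the "galaxy-user-<id>" repository naming, the ImageRegistry class and its methods); nothing like this exists in Galaxy as far as I know. The one thing it tries to show is that if we record which base image each committed image was built from, blacklisting a compromised base image can automatically block everything derived from it.

```python
import re

# Docker tags may contain letters, digits, '_', '.' and '-', must not start
# with '.' or '-', and are at most 128 characters long.
TAG_RE = re.compile(r"[A-Za-z0-9_][A-Za-z0-9_.-]{0,127}")


def image_ref(user_id, dataset_id):
    """Build a hypothetical per-user image reference, with dataset_id as the tag."""
    tag = str(dataset_id)
    if not TAG_RE.fullmatch(tag):
        raise ValueError(f"dataset_id {tag!r} is not a valid Docker tag")
    return f"galaxy-user-{user_id}:{tag}"


class ImageRegistry:
    """Toy per-user registry that remembers each image's base image."""

    def __init__(self):
        self._base_of = {}     # image reference -> reference of its base image
        self._blocked = set()  # blacklisted image references

    def register(self, user_id, dataset_id, base_image):
        ref = image_ref(user_id, dataset_id)
        self._base_of[ref] = base_image
        return ref

    def images_for(self, user_id):
        prefix = f"galaxy-user-{user_id}:"
        return sorted(ref for ref in self._base_of if ref.startswith(prefix))

    def block(self, image):
        self._blocked.add(image)

    def is_allowed(self, ref):
        """An image is allowed only if neither it nor any ancestor is blocked."""
        seen = set()
        while ref is not None and ref not in seen:
            if ref in self._blocked:
                return False
            seen.add(ref)
            ref = self._base_of.get(ref)
        return True
```

This does not answer the toolshed question: an image that was already installed elsewhere would only be caught if that instance also learns about the blacklist entry.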

 

Ultimately, I think this could be a way to let advanced users run their own scripts inside Galaxy, generate their own tools and tool dependencies inside Galaxy and, why not, even have “user-space tools”.

It would bridge the gap between Galaxy users and Galaxy tool developers, so I’m curious what you think about this.

 

Cheers,

Marius