Just my 2 cents about this:
I feel your pain; our connection to Docker Hub is horrible, and it takes an hour to pull the GIE images
(while it only takes seconds from the cloud).
The Galaxy Docker images are pretty fat because we use them VM-style; in principle there's nothing wrong with that.
So the base image is about 1.2 GB, add a few tools and you quickly reach 5 GB,
and if you use interactive environments on top of that, add a few more GB.
I think that instead of trying to reduce the size of the base image, it would be a better effort to separate the components
into a proxy image, a database image (perhaps two: one for tool data, one for user data), a Galaxy image,
a cluster image, and so on. That would let you update just the tools and the Galaxy image regularly,
plus you could do all the neat Docker stuff: versioning, committing, rolling updates, streaming database replication, worker scaling ...
It's certainly something I would be interested in.
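To make the idea concrete, here is a rough docker-compose sketch of what such a split could look like. The service layout and the my-org/galaxy-app image name are placeholders I made up, not anything that exists on the Hub today:

```yaml
# Hypothetical split of the monolithic image into services.
version: "2"
services:
  proxy:
    image: nginx:stable
    ports:
      - "80:80"
    depends_on:
      - galaxy
  galaxy-db:
    image: postgres:9.6
    environment:
      POSTGRES_USER: galaxy
      POSTGRES_PASSWORD: change_me
    volumes:
      - galaxy-db-data:/var/lib/postgresql/data
  galaxy:
    image: my-org/galaxy-app   # placeholder: Galaxy web/job handlers only, no tools baked in
    depends_on:
      - galaxy-db
volumes:
  galaxy-db-data:
```

With a layout like that, a routine tool or Galaxy update only rebuilds and re-pulls the galaxy service, while the proxy and database layers stay cached locally.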
The other thing is to make sure that the tool dependencies are as slim as possible. Having many different R packages
and their sources lying around adds up to a lot of data. Hopefully conda can alleviate that situation.
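For example (just a sketch, and the package names are arbitrary), a conda-based tool layer can purge its caches and tarballs in the same Dockerfile layer, so the downloaded sources never end up in the image:

```dockerfile
FROM continuumio/miniconda3
# Install example tools from bioconda, then drop caches/tarballs
# in the same RUN so the layer stays small.
RUN conda install -y -c conda-forge -c bioconda samtools bwa && \
    conda clean -y --all
```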
Cheers,
Marius