How I use Docker for Reesd

Docker is all the rage these days. I started to look at Docker seriously around April last year. It has gained a lot of momentum and I think this is just the beginning: new tools and services are created regularly, and Docker, Inc. has not yet unveiled its exact commercial offering (and I think they have a very rich view of what can be achieved with Docker).

When I started to build Reesd, I used Docker directly. Even if it is not production-ready, and even if there is some risk that it evolves differently from what I will need in the future, it is a great way to start building a complete service, and I'm glad I took that path. If necessary, it will still be possible to drop down to the LXC level directly. I think the greatest benefit is in learning and thinking about how to organize services the Docker way, even if I don't end up using Docker itself.

Now, there is probably more than a single Docker "way", especially when building multi-container setups. In this post I want to share the view I have and how I'm using Docker to build Reesd. Please see this blog post to learn about Reesd itself.

Self-contained services

At its heart, a Docker image is advertised as a lightweight, self-contained shipping unit for any application. The goal is to package your application with "everything" it needs. Consider a Python application (the situation would be much the same for another language ecosystem): you can use tools such as virtualenv to install the application's dependencies without interfering with another Python application's dependencies. But a tool such as virtualenv cannot cope with, say, C libraries that you would install through your OS packaging system.

Docker, however, provides a means to take care of your application's dependencies in a broader, more systematic way. That is, continuing with the Python example, you can use Docker to package your Python application, including its C dependencies, configuration files, or file system directories. When necessary, you can even package some supporting services. For instance, a syslog server can be shipped with your application within the container to turn syslog messages into stdout output (maybe by running tail -F on some file).
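
As an illustration, here is a minimal Dockerfile sketch for such a Python application. It is only a sketch: the base image, package names and file paths are hypothetical, not the actual Reesd setup.

# Hypothetical Dockerfile for a Python application with C-level dependencies.
FROM ubuntu:12.04

# System (C) dependencies that virtualenv alone cannot provide,
# here the PostgreSQL client library as an example.
RUN apt-get update && apt-get install -y python python-pip libpq-dev

# The application code, its Python dependencies and its configuration files.
ADD . /opt/myapp
RUN pip install -r /opt/myapp/requirements.txt

CMD ["python", "/opt/myapp/run.py"]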

Technically, when running a Docker image, you can select the program that is executed. For instance, if you use a base Ubuntu image, you can run echo, bash, or any of the other available programs:

> docker run ubuntu echo hello world
> docker run -t -i ubuntu bash

Obviously, when the image contains your application, possibly along with some startup script, bash, or any other executable you might need, you're free to fork other processes within the container. This is particularly true if you decide to use something like supervisord. In other words, since processes can fork other processes, there is a natural question: is a single container that bundles, say, your application server and a database better than two containers plus some additional wiring code to orchestrate them?

I think the Docker community advocates a (more or less) single-process-per-container approach, and I think it is right. Still, the Docker homepage talks about "packaging any application as a container". This might be a wording issue, but an application can be made of multiple processes, or services; indeed, we can talk about "distributed applications". So, while I don't want to define the words "application" or "service", it might help to think of your application as being packaged as multiple self-contained containers, each providing a single service. (I advertise Reesd as a redundant storage service; this is what I expose to my users, but to me it is a distributed application. To you it might be a service, part of your own distributed application.)
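
To make the two-container alternative concrete, the wiring can be as simple as a couple of docker run invocations, here using Docker's --link option and hypothetical image and container names:

> docker run -d --name db someuser/postgresql
> docker run -d --name web --link db:db someuser/application

Each container stays focused on a single service, and the link makes the database's address available to the application through environment variables.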

Uniformity and flexibility

To explain why I believe in the single-process-per-container approach, we can go back to the problem we are trying to solve. We will see that Docker solves it nicely, but that it solves it even better when we use the approach I advocate.

One of the first reasons people get interested in Docker is as a lightweight Vagrant alternative. The goal is to have reproducible environments -- that is, being able to share a common environment between developers, and between development, integration, and production deployments (or really any additional environment you might need).

If the primary reason for you to use Docker is as a lightweight Vagrant alternative, there is nothing wrong with that. Indeed, the reusability of Dockerfiles or the very short startup time are perfectly enjoyable on their own.

Uniformity examples: docker logs: the same command for every service.
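
For instance, whatever service a container runs, its output is retrieved with the same command (the container names here are hypothetical):

> docker logs web
> docker logs db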

Isolation of (trusted or untrusted) code execution.

Reproducible environments

More than environments: deterministic tooling: e.g. your compiler and associated libraries are provided by Docker images. It is easy to rebuild the dependencies together, swap compiler versions, ...
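
A rough sketch of what this can look like, assuming a hypothetical someuser/gcc image that packages a given compiler version and its libraries:

> docker run -v "$PWD":/src someuser/gcc:4.7 gcc -o /src/hello /src/hello.c
> docker run -v "$PWD":/src someuser/gcc:4.8 gcc -o /src/hello /src/hello.c

Swapping compiler versions is then just a matter of changing the image tag.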
