Skip to content

Instantly share code, notes, and snippets.

@StevenACoffman
Last active February 9, 2024 12:54
Show Gist options
  • Save StevenACoffman/41fee08e8782b411a4a26b9700ad7af5 to your computer and use it in GitHub Desktop.
Save StevenACoffman/41fee08e8782b411a4a26b9700ad7af5 to your computer and use it in GitHub Desktop.
Docker Best Practices

Mistakes to Avoid: Docker Antipatterns

Whichever route you take to implementing containers, you’ll want to steer clear of common pitfalls that can undermine the efficiency of your Docker stack.

Don’t run too many processes inside a single container

The beauty of containers—and an advantage of containers over virtual machines—is that it is easy to make multiple containers interact with one another in order to compose a complete application. There is no need to run a full application inside a single container. Instead, break your application down as much as possible into discrete services, and distribute services across multiple containers. This maximizes flexibility and reliability.

Don’t install operating systems inside Docker containers

It is possible to install a complete Linux operating system inside a container. In most cases, however, this is not necessary. If your goal is to host just a single application or part of an application in the container, you need to install only the essential pieces inside the container. Installing an operating system on top of those pieces adds unnecessary overhead.

Don’t run unnecessary services inside a container

To make the very most of containers, you want each container to be as lean as possible. This maximizes performance and minimizes security risks. For this reason, avoid running services that are not strictly necessary. For example, unless it is absolutely essential to have an SSH service running inside the container—which is probably not the case because there are other ways to log in to a container, such as with the docker exec call—don’t include an SSH service.

Don't bloat runtime images with buildtime dependencies and tools

Use the “builder pattern”. Docker Multistage builds in Docker CE 17.05+. In a single dockerfile, you have multiple FROM stages. The ephemeral builder stage container will be discarded so that the final runtime container image will be as lean as possible.

Pros:

  • Builds are faster
  • Need less storage
  • Cold starts (image pull) are faster
  • Potentially less attack surface

Cons:

  • Less tooling inside container
  • “Non-standard” environment

Remember to include security and monitoring tools in your container stack

You can run containers without security and monitoring solutions in place. But if you want to use containers securely and efficiently, you’ll need to include these additional components in your stack.

Don't Trust Arbitrary Base Images!

Static Analysis of Containers:

Don’t store sensitive data inside a container image

This is a mistake because anyone with access to the registry in which the container image is stored, or to the running container, can read the information. Instead, store sensitive data on a secure filesystem to which the container can connect. In most scenarios, this filesystem would exist on the container host or be available over the network.

Don’t store data or logs inside containers

As noted earlier, any data stored inside a running container will disappear forever when that container shuts down, unless it is exported to a persistent storage location that is external to the container. That means you need to be sure to have such a persistent storage solution in place for any logs or other container data that you want to store permanently.

Don’t use container registries for purposes other than storing container images

A container registry is designed to do one thing and one thing only: store container images. Although container registries offer features that might make them attractive as all-purpose repositories for storing other types of data, doing so is a mistake. Vine learned this the hard way in 2016, when the company was using a container registry to store sensitive source code, which was inadvertently exposed to the public.

Don't write to your container's filesystem

Every time you write something to container's filesystem, it activates copy on write strategy.

A new storage layer is created using a storage driver (devicemapper, overlayfs or others). In case of active usage, it can put a lot of load on storage drivers, especially in case of Devicemapper or BTRFS.

Make sure your containers write data only to volumes. You can use tmpfs for small (as tmpfs stores everything in memory) temporary files:

apiVersion: v1
kind: Pod
metadata:
  name: test-pd
spec:
  containers:
  - image: busybox
    name: test-container
    volumeMounts:
    - mountPath: /tmp
      name: tempdir
  volumes:
  - name: tempdir
    emptyDir: {}

Don't run PID 1

Use tini or dumb-init (see below for why)

Dumb-init vs Tini vs Gosu vs Su-Exec

Dumb-init or Tini

PID 1 is special in unix, and so omitting an init system often leads to incorrect handling of processes and signals, and can result in problems such as containers which can't be gracefully stopped, or leaking containers which should have been destroyed.

In Linux, processes in a PID namespace form a tree with each process having a parent process. Only one process at the root of the tree doesn't really have a parent. This is the "init" process, which has PID 1.

Processes can start other processes using the fork and exec syscalls. When they do this, the new process' parent is the process that called the fork syscall. fork is used to start another copy of the running process and exec is used to start different process. Each process has an entry in the OS process table. This records info about the process' state and exit code. When a child process has finished running, its process table entry remains until the parent process has retrieved its exit code using the wait syscall. This is called "reaping" zombie processes.

Zombie processes are processes that have stopped running but their process table entry still exists because the parent process hasn't retrieved it via the wait syscall. Technically each process that terminates is a zombie for a very short period of time but they could live for longer.

Something like tini or dumb-init can be used if you have a process that spawns new processes but doesn't have good signal handlers implemented to catch child signals and stop your child if your process should be stopped etc.

Bash scripts for example do NOT handle and emit signals properly.

Dumb-init supports signal rewriting and Tini doesn't, but Tini supports subreapers and they don't.

Gosu or Su-exec

Similarly, 'su' and 'sudo' have very strange and often annoying TTY and signal-forwarding behavior. They're also somewhat complex to setup and use (especially in the case of sudo), which allows for a great deal of expressivity, but falls flat if all you need is "run this specific application as this specific user and get out of the pipeline". gosu is a tool for simpler, lighterweight behavior. su-exec is a very minimal re-write of gosu in C, making for a much smaller binary, and is available in the main Alpine package repository.

Gosu (or su-exec) is useful inside containers for processes that are root, but don't want to be. While creating an image, the dockerfile USER is ideal. After the image is created, gosu (or su-exec) is more useful as part of a container initialization when you can no longer change users between run commands in your Dockerfile.

Again, after the image is created, something like gosu allows you to drop root permissions at the end of your entrypoint inside of a container. You may initially need root access to do some initialization steps (fixing uid's, host mounted volume permissions, etc). Then once initialized, you run the final service without root privileges and as pid 1 to handle signals cleanly. For instance, you may trap for instance SIGHUP for reloading the process as you would normally achieve via systemctl reload or such.

References

@jakub-bochenski
Copy link

Just wanted to point out there are native (and smaller) alternatives to gosu and su-exec.

In debian there is setuidgid.

Busybox su does exactly what su-exec does so there is no need for su-exec in Alpine (unless they fix busybox su), and you can also use setuidgid

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment