
Docker Best Practices

  • Keep containers stateless.
  • Use COPY instead of ADD.
  • Make COPY the last instruction before CMD or ENTRYPOINT.
    • Each instruction in the Dockerfile is cached as a layer.
    • Separate the COPY of requirements.txt from the COPY of the source code.
  • CMD vs ENTRYPOINT: ENTRYPOINT is the main command. Treat CMD as the default flag for the entrypoint. Example:
ENTRYPOINT ["s3cmd"]
CMD ["--help"]
  • Bind to 0.0.0.0. Otherwise you will run into issues with Kubernetes.
  • Make services expose healthcheck endpoints (see the sketch after this list).
  • Switch to a non-root user:
RUN groupadd -r myapp && useradd -r -g myapp myapp
USER myapp
  • Log everything to STDOUT.
  • One process per container.
    • No supervisord; run uWSGI with a single worker.
  • Limit access from network. Example:
docker network create --driver bridge isolated_nw
docker run --network=isolated_nw --name=container busybox
  • Handle SIGTERM and SIGINT to exit gracefully (SIGKILL cannot be caught).
  • Order instructions from the least to the most frequently changing content.
  • Avoid COPY . (copy only what is needed, if possible).
  • Don't use the latest tag; pin a specific version.
  • Multi-stage builds:
    • DRY
    • Small image size
    • Build different images for test/run/lint from the same base built image
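
A minimal sketch tying several of these points together (the Python web app, the myapp user, port 8000 and the /health endpoint are illustrative assumptions, not part of the original list):

FROM python:3.9-slim

WORKDIR /app

# Copy the dependency manifest first so this layer stays cached while the source code changes.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Run as a non-root user.
RUN groupadd -r myapp && useradd -r -g myapp myapp
USER myapp

# Copy the source code last, just before CMD.
COPY . .

# Healthcheck (assumes the app serves /health on port 8000).
HEALTHCHECK --interval=30s --timeout=3s \
  CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')" || exit 1

# Bind to 0.0.0.0 so the service is reachable inside Kubernetes (assumes gunicorn is in requirements.txt).
CMD ["gunicorn", "--bind", "0.0.0.0:8000", "app:app"]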

Removing images and containers with no tags

#!/bin/bash

# Delete all stopped containers
sudo docker rm $(sudo docker ps -q -f status=exited)
# Delete all dangling (unused) images
sudo docker rmi $(sudo docker images -q -f dangling=true)

Alternative using --no-run-if-empty

xargs with --no-run-if-empty is even better, as it cleanly handles the case where there is nothing to remove.

#!/bin/bash

# Delete all stopped containers
sudo docker ps -q -f status=exited | xargs --no-run-if-empty sudo docker rm
# Delete all dangling (unused) images
sudo docker images -q -f dangling=true | xargs --no-run-if-empty sudo docker rmi

Alternative: deleting images with no tags

sudo docker rmi $(sudo docker images | grep "^<none>" | awk '{print $3}')
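
On current Docker versions, the built-in prune commands achieve the same cleanup (a hedged alternative, not part of the original script):

# Delete all stopped containers
sudo docker container prune -f
# Delete all dangling (unused) images
sudo docker image prune -f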

Mistakes to Avoid

Docker Antipatterns

Whichever route you take to implementing containers, you’ll want to steer clear of common pitfalls that can undermine the efficiency of your Docker stack.

For containers

Don’t run too many processes inside a single container

The beauty of containers—and an advantage of containers over virtual machines—is that it is easy to make multiple containers interact with one another in order to compose a complete application. There is no need to run a full application inside a single container. Instead, break your application down as much as possible into discrete services, and distribute services across multiple containers. This maximizes flexibility and reliability.

Don’t install operating systems inside Docker containers

It is possible to install a complete Linux operating system inside a container. In most cases, however, this is not necessary. If your goal is to host just a single application or part of an application in the container, you need to install only the essential pieces inside the container. Installing an operating system on top of those pieces adds unnecessary overhead.

Don’t run unnecessary services inside a container

To make the very most of containers, you want each container to be as lean as possible. This maximizes performance and minimizes security risks. For this reason, avoid running services that are not strictly necessary. For example, unless it is absolutely essential to have an SSH service running inside the container—which is probably not the case because there are other ways to log in to a container, such as with the docker exec call—don’t include an SSH service.

Don’t use container registries for purposes other than storing container images

A container registry is designed to do one thing and one thing only: store container images. Although container registries offer features that might make them attractive as all-purpose repositories for storing other types of data, doing so is a mistake. Vine learned this the hard way in 2016, when the company was using a container registry to store sensitive source code, which was inadvertently exposed to the public.

Don't write to your container's filesystem

Every time you write something to the container's filesystem, it activates the copy-on-write strategy.

A new storage layer is created using a storage driver (devicemapper, overlayfs or others). Under heavy write activity, this can put a lot of load on the storage driver, especially in the case of devicemapper or BTRFS.

Make sure your containers write data only to volumes. For small temporary files you can use tmpfs (which stores everything in memory), e.g. a memory-backed emptyDir in Kubernetes:

apiVersion: v1
kind: Pod
metadata:
  name: test-pd
spec:
  containers:
  - image: busybox
    name: test-container
    volumeMounts:
    - mountPath: /tmp
      name: tempdir
  volumes:
  - name: tempdir
    emptyDir:
      medium: Memory
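
Outside Kubernetes, plain docker run can mount a tmpfs directly with the --tmpfs flag (the path and size here are illustrative):

docker run --tmpfs /tmp:rw,size=64m my_image:v1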

Images

Don't Trust Arbitrary Base Images!

Static Analysis of Containers:

Don't bloat runtime images with buildtime dependencies and tools

Use the “builder pattern”: Docker multi-stage builds, available in Docker CE 17.05+. In a single Dockerfile, you have multiple FROM stages. The ephemeral builder stages are discarded, so the final runtime image is as lean as possible (see the Multi-stage builds section below for a full example).

Pros:

  • Builds are faster
  • Need less storage
  • Cold starts (image pull) are faster
  • Potentially less attack surface

Cons:

  • Less tooling inside container
  • “Non-standard” environment

Your DATA

Don’t store sensitive data inside a container image

This is a mistake because anyone with access to the registry in which the container image is stored, or to the running container, can read the information. Instead, store sensitive data on a secure filesystem to which the container can connect. In most scenarios, this filesystem would exist on the container host or be available over the network.

Don’t store data or logs inside containers

As noted earlier, any data stored inside a running container will disappear forever when that container shuts down, unless it is exported to a persistent storage location that is external to the container. That means you need to be sure to have such a persistent storage solution in place for any logs or other container data that you want to store permanently.

Others

Remember to include security and monitoring tools in your container stack

You can run containers without security and monitoring solutions in place. But if you want to use containers securely and efficiently, you’ll need to include these additional components in your stack.

Don't run your application as PID 1

Use tini or dumb-init (see below for why).
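
A minimal sketch using tini as PID 1 (the Alpine-based node image and the index.js entrypoint are assumptions):

FROM node:12-alpine
# tini is available as an Alpine package.
RUN apk add --no-cache tini
# tini runs as PID 1, forwards signals to the app and reaps zombie processes.
ENTRYPOINT ["/sbin/tini", "--"]
CMD ["node", "index.js"]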

Dumb-init vs Tini vs Gosu vs Su-Exec

Dumb-init or Tini

PID 1 is special in unix, and so omitting an init system often leads to incorrect handling of processes and signals, and can result in problems such as containers which can't be gracefully stopped, or leaking containers which should have been destroyed.

In Linux, processes in a PID namespace form a tree with each process having a parent process. Only one process at the root of the tree doesn't really have a parent. This is the "init" process, which has PID 1.

Processes can start other processes using the fork and exec syscalls. When they do this, the new process' parent is the process that called the fork syscall. fork is used to start another copy of the running process and exec is used to start a different process. Each process has an entry in the OS process table. This records info about the process' state and exit code. When a child process has finished running, its process table entry remains until the parent process has retrieved its exit code using the wait syscall. This is called "reaping" zombie processes.

Zombie processes are processes that have stopped running but their process table entry still exists because the parent process hasn't retrieved it via the wait syscall. Technically each process that terminates is a zombie for a very short period of time but they could live for longer.

Something like tini or dumb-init can be used if you have a process that spawns child processes but doesn't implement good signal handlers to catch child signals and to stop its children when it is told to stop.

Bash scripts for example do NOT handle and emit signals properly.

Dumb-init supports signal rewriting and Tini doesn't, while Tini supports subreapers and dumb-init doesn't.

Gosu or Su-exec

Similarly, 'su' and 'sudo' have very strange and often annoying TTY and signal-forwarding behavior. They're also somewhat complex to set up and use (especially in the case of sudo), which allows for a great deal of expressivity, but falls flat if all you need is "run this specific application as this specific user and get out of the pipeline". gosu is a tool for simpler, lighter-weight behavior. su-exec is a very minimal rewrite of gosu in C, making for a much smaller binary, and is available in the main Alpine package repository.

Gosu (or su-exec) is useful inside containers for processes that are root but don't want to be. While creating an image, the Dockerfile USER instruction is ideal. After the image is created, gosu (or su-exec) is more useful as part of container initialization, when you can no longer change users between RUN commands in your Dockerfile.

Again, after the image is created, something like gosu allows you to drop root permissions at the end of your entrypoint inside a container. You may initially need root access to do some initialization steps (fixing uids, host-mounted volume permissions, etc.). Then, once initialized, you run the final service without root privileges and as PID 1 so it can handle signals cleanly. For instance, you may trap SIGHUP to reload the process, as you would normally do via systemctl reload or similar.
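
A sketch of such an entrypoint script using gosu (the myapp user and the /data volume are illustrative assumptions):

#!/bin/sh
set -e

# Root-only initialization: fix ownership of a host-mounted volume.
chown -R myapp:myapp /data

# Drop root and replace this shell so the service runs as PID 1 under myapp.
exec gosu myapp "$@"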

Just wanted to point out there are native (and smaller) alternatives to gosu and su-exec.

In Debian there is setuidgid.

BusyBox su does exactly what su-exec does, so there is no need for su-exec in Alpine (unless they fix BusyBox su), and you can also use setuidgid.

Resources

Docker commands definition

Containers

Your basic isolated Docker process. Containers are to Virtual Machines as threads are to processes. Or you can think of them as chroots on steroids.

Lifecycle

Command          Definition
docker create    creates a container but does not start it.
docker run       creates and starts a container in one operation.
docker stop      stops a running container.
docker start     starts it again.
docker restart   restarts a container.
docker rm        deletes a container.
docker kill      sends a SIGKILL to a container.
docker attach    connects to a running container.
docker wait      blocks until the container stops.

Info

Command          Definition
docker ps        lists all running containers (ps = process status).
docker ps -a     shows running and stopped containers.
docker logs      gets logs from a container.
docker inspect   looks at all the info on a container (including its IP address).
docker events    gets events from a container.
docker port      shows the public-facing port of a container.
docker top       shows the running processes in a container.
docker stats     shows a container's resource usage statistics.
docker diff      shows the changed files in a container's filesystem.

Import / Export

docker cp can copy files or folders both into and out of a container's filesystem. Alternatively, you can mount a host file or directory as a data volume and copy data to or from it from inside the container.

Command          Definition
docker cp        copies files or folders between a container's filesystem and the local machine.
docker export    turns a container's filesystem into a tarball archive streamed to STDOUT.

Executing Commands

Command                          Definition
docker exec                      executes a command in a container.
docker exec -it foo /bin/bash    enters a running container by attaching a new shell process to the container called foo.

Images

Images are just templates for docker containers.

Lifecycle

Command          Definition
docker images    shows all images.
docker import    creates an image from a tarball.
docker build     creates an image from a Dockerfile.
docker commit    creates an image from a container.
docker rmi       removes an image.
docker insert    inserts a file from a URL into an image (kind of odd, you'd think images would be immutable after creation).
docker load      loads an image from a tar archive on STDIN, including images and tags (as of 0.7).
docker save      saves an image to a tar archive streamed to STDOUT with all parent layers, tags & versions (as of 0.7).

Info

Command          Definition
docker history   shows the history of an image.
docker tag       tags an image with a name (local or registry).

Note: image IDs are sensitive information and should not be exposed to the outside world. Treat them like passwords.

Registry & Repository

A repository is a hosted collection of tagged images that together create the file system for a container.

A registry is a host -- a server that stores repositories and provides an HTTP API for managing the uploading and downloading of repositories.

Docker.com hosts its own index to a central registry which contains a large number of repositories. Having said that, the central docker registry does not do a good job of verifying images and should be avoided if you're worried about security.

Command          Definition
docker login     logs in to a registry.
docker search    searches the registry for an image.
docker pull      pulls an image from the registry to the local machine.
docker push      pushes an image to the registry from the local machine.

Docker Compose

This section documents best practices for docker-compose files.

File Organization

  • Services should be sorted in alphabetical order
  • Variables in environment should be sorted in alphabetical order

Service Structure

  • Each service must have a container_name, and it must be the same as the service name.
version: '3'
services:
  api_gateway:
    container_name: api_gateway
    ...
  • Each service must have the following structure, where applicable (a filled-in example follows the template below):
service_name:
  container_name:
  build: *OPTIONAL - if building images locally from a Dockerfile.*
  image: *OPTIONAL - if pulling images from dockerhub repo.*
  restart: *OPTIONAL - if container needs to continually restart or reconnect on failure.*
  volumes:
    - local_folder:container_folder
  environment:
    - container_env_variable=$variable_from_env_file
  ports: *OPTIONAL - if container needs to publicly expose ports.*
    - $some_port:$some_port
  expose: *OPTIONAL - if container only needs to expose ports within the private network.*
    - $some_private_port
  networks:
    - network_name
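
A filled-in sketch following this structure (service names, ports and variables are illustrative):

version: '3'
services:
  api_gateway:
    container_name: api_gateway
    build: ./api_gateway
    restart: unless-stopped
    environment:
      - LOG_LEVEL=$LOG_LEVEL
      - PORT=$API_PORT
    ports:
      - $API_PORT:$API_PORT
    networks:
      - backend
  redis:
    container_name: redis
    image: redis:6.2
    expose:
      - 6379
    networks:
      - backend
networks:
  backend: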

Network Modes

host

Allows different docker containers to communicate via the localhost.

Important to note: this means each container that exposes a port has every other container's exposed ports visible inside it.

At the moment, you can't interact with the exposed ports from outside.

network_mode: host
ports:
  # Only internally exposed.
  - $P2P_PORT:$P2P_PORT
  - $RPC_PORT:$RPC_PORT

Create a network

$ docker network create --subnet=172.0.0.0/25 local_dev
$ docker network inspect local_dev
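
You can then attach a container to that network, optionally pinning an address from the subnet (the container name and IP below are illustrative):

$ docker run --network local_dev --ip 172.0.0.10 --name my_service -d my_image:v1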

DOCKER GUIDE

Docker is a tool that follows the "cattle, not pets" DevOps mantra. It describes your hosting environment via a Dockerfile. Each system deployment results in an entirely new, reprovisioned hosting environment, while the previous one is decommissioned in the background. Docker achieves this by making it cheap to create new system images and spawn fleets of new containers from them.

Quick example recap

  1. Create a new NodeJS project:
mkdir my-app && \
cd my-app && \
npm init --yes && \
npm i express && \
touch index.js && \
touch Dockerfile
  2. Paste the following hello world API in index.js:
const express = require('express')
const app = express()

app.get('/', (req, res) => {
	res.send('hello world')
})

app.listen(3000, () => console.log(`Server ready and listening on port ${3000}`))
  3. Paste the following in the Dockerfile:
# Use the official lightweight Node.js 12 image.
# https://hub.docker.com/_/node
FROM node:12-slim

# Create and change to the app directory.
WORKDIR /usr/src/app

# Copy application dependency manifests to the container image.
# A wildcard is used to ensure both package.json AND package-lock.json are copied.
# Copying this separately prevents re-running npm install on every code change.
COPY package*.json ./

# Install production dependencies.
RUN npm install --only=prod

# Copy local code to the container image.
COPY . ./

# Run the web service on container startup.
CMD ["node", "index.js"]
  4. Build an image for this project:
docker build -t nodejs:v0 .

Where:

  • -t is the tag option, which allows you to name (aka tag) the image.
  • <IMAGE NAME>:<VERSION TAG> is the usual image naming convention, but you can change it to whatever you prefer.
  • . means the image is built from the current directory.
  5. Launch a new container:
docker run -p 127.0.0.1:4000:3000 nodejs:v0

This last command port-forwards the traffic received on 127.0.0.1:4000 to port 3000 inside the container.
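
To verify that the forwarding works, hit the published port from the host (assuming curl is available):

curl http://127.0.0.1:4000
# hello world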

Key concepts

Image vs Container

A running instance of an image is called a container. An image is made of a set of layers. If you start an image, you have a running container of that image. You can have many running containers of the same image.

You can see all your images with docker images, whereas you can see your containers (both running and stopped) with docker ps -a.

You don't reconfigure containers, instead you reprovision them

A typical newbie head-scratcher is how to change a container's config after it has been started. The answer is: you can't. For example, say a container with an app listening on port 3000 has been provisioned as follows:

docker run my_image:v1

The app in this container is listening on port 3000. It cannot receive traffic from outside because no port binding has been configured. It is not possible to change that container later. Instead, create a new container from that image as follows, and delete the previous container:

docker run -p 127.0.0.1:3000:3000 my_image:v1

This means: port-forward traffic on 127.0.0.1:3000 to this container's internal port 3000.

This approach highlights the typical way of thinking with Docker. Because containers and images are cheap to create, you do not reconfigure them. Instead, you recreate them from scratch. That's the DevOps "cattle, not pets" mantra.

ENTRYPOINT vs CMD

Demystifying ENTRYPOINT and CMD in Docker

You define those 2 properties in the Dockerfile. For example:

ENTRYPOINT ["echo”, "Hello"]
CMD ["World"]

With those two set up, starting a container with docker run <YOUR-IMAGE> will execute the default command echo Hello World. As you can see, the default command is just the concatenation of the entrypoint and the cmd.

Though those two can be written as strings, Docker eventually converts them to arrays, so it's usually less confusing to always use arrays.

To learn more about this topic, please refer to the ENTRYPOINT and CMD section.

Getting started

Quick start

  • Install Docker
  • Make sure it runs (launch the Docker Desktop app). If it is not launched (i.e., the daemon is not started in the background), the docker command will fail.
  • Create a Dockerfile.
  • Build the image: docker build -t <IMAGE NAME>:<VERSION TAG> .

Where:

  • -t is the tag option, which allows you to name (aka tag) the image.
  • <IMAGE NAME>:<VERSION TAG> is the usual image naming convention, but you can change it to whatever you prefer.
  • . means the image is built from the current directory.
  • List your containers: docker ps -a
  • Launch your container: docker start -a <MY-CONTAINER-NAME>

docker build vs docker run vs docker start vs docker exec

This section explains the important differences between docker build, docker run, docker start and docker exec.

If you're still confused by the conceptual difference between an image and a container, please refer to the Image vs Container section.

Creating an image with docker build

Typical usage: docker build -t my_image:v1 . Where:

  • -t is the tag option, which allows you to name (aka tag) the image.
  • my_image:v1 is the usual image naming convention, but you can change it to whatever you prefer.
  • . means the image is built from the current directory.
  1. If your project does not contain a Dockerfile and a .dockerignore, create them. Typically, the Dockerfile imports the project files you need into the image and defines a command that can start your project, or an entrypoint that allows you to call your project's APIs.
  2. Create the image for your project with docker build -t <IMAGE NAME>:<VERSION TAG> . (e.g., docker build -t my-website:v1 .).
  3. Once that's done, you can see your image with docker images.
  4. To delete your image, run docker rmi <IMAGE ID>

Creating a container from an image with docker run

Typical usage: docker run -it my_image:v1, where -it allows you to interact with the container's STDIN (-i) via the terminal (-t).

Once you have an image on your local machine, you can start a container with docker run <IMAGE_ID|IMAGE_NAME:IMAGE_TAG>. To list the available images and their IDs, use docker images.

Each time you use docker run <IMAGE ID>, a new container is created and started. To test this, run the command multiple times and then execute docker ps -a. You should see multiple containers for that specific image ID.

To delete the containers you don't need, run docker rm <CONTAINER ID>.

Tips:

  • Use docker run -it <IMAGE ID> if you need to interact with that container directly via the terminal (-i means STDIN and -t means terminal).

Starting an existing container with docker start

Use docker start if you simply want to start an existing container instead of creating a new one from the image.

  1. List all the containers:
docker ps -a
  2. You can start any container using either its CONTAINER ID or its NAME:
docker start tender_bassi

Executing commands in a running container with docker exec

This command allows you to execute a command inside a running container. The most common use is to open a shell in a container to start interacting with it:

docker exec -it <CONTAINER ID> sh

Creating a Dockerfile

# Use the official lightweight Node.js 12 image.
# https://hub.docker.com/_/node
FROM node:12-slim

# Create and change to the app directory.
WORKDIR /usr/src/app

# Copy application dependency manifests to the container image.
# A wildcard is used to ensure both package.json AND package-lock.json are copied.
# Copying this separately prevents re-running npm install on every code change.
COPY package*.json ./

# Install production dependencies.
RUN npm install --only=prod

# Configure Nuxt with the host, otherwise, it won't be reachable from outside Docker
ENV NUXT_HOST=0.0.0.0

# Copy local code to the container image.
COPY . ./

# Run the web service on container startup.
CMD npm start

IMPORTANT: The order of the instructions affects the speed at which Docker can create/recreate the image. More details on this topic are in the The order of the instructions in your Dockerfile matters for performance section.

Popular commands

Command          Description
docker image ls  Lists all images.
docker ps -a     Lists all containers. The -a option includes non-running containers.

The Dockerfile

IMPORTANT: The order of the instructions in your Dockerfile matters for performance

Example:

# Comments can be used using the hashtag symbol
# Always start with FROM. This specifies the base image
FROM ubuntu:14.04
# MAINTAINER is not required, but that’s a good practice
MAINTAINER Nicolas Dao <nicolas.dao@gmail.com>
# Creates a new environment variable called myName
ENV myName John Doe
# Add all the files and folders from your docker project into your image under /app/src
ADD . /app/src
# Add all the files and folders from your docker project into your image under /app/src2. To 
# understand the difference between COPY and ADD jump to the next ADD vs COPY section
COPY . /app/src2
# Create a new data volume in your image. More info about data volume here.
VOLUME /new-data-volume
# By default, RUN uses /bin/sh. The following line is pretty easy to understand. More info here
RUN apt-get update && apt-get install -y ruby ruby-dev
# WORKDIR sets up the working directory for RUN, CMD, ENTRYPOINT, COPY, and ADD. Each time you use it,
# it is relative to the previous working directory (unless you specify an absolute path).
# If you use a directory which does not exist, that directory will be automatically created.
# Create a new 'a' directory at the container's root, and set it up as the working dir.
WORKDIR /a
RUN pwd    # /a
# Create a new 'b' directory under 'a', and set it up as the working dir.
WORKDIR b
RUN pwd    # /a/b
# ONBUILD is typically used in image intended to be used as base image. More details here
ONBUILD ADD . /app/src 
# CMD is the command that will be run after your container has started. More info here
CMD["echo", "Hello world" ]

ARG and ENV

The following Dockerfile defines a MSG argument and a HELLO environment variable:

FROM amazon/aws-lambda-nodejs:12
ARG MSG
ENV HELLO Hello ${MSG}

HELLO is an environment variable accessible in all the systems running inside the container defined by this image. It is set to the text message Hello ${MSG} where MSG is an argument that can be set via the docker build command:

docker build --build-arg MSG="Mike Davis" -t my-app .

If you have multiple arguments:

docker build --build-arg MSG="Mike Davis" --build-arg AGE=40 -t my-app .

If you need to set a default value for the ARG:

FROM amazon/aws-lambda-nodejs:12
ARG MSG=Baby
ENV HELLO Hello ${MSG}
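
If you build without passing --build-arg, the default applies and HELLO resolves to Hello Baby:

docker build -t my-app .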

ARG can also be nested:

FROM amazon/aws-lambda-nodejs:12
ARG MSG=Baby
ARG COOLMSG="you hot ${MSG}"

RUN echo ${COOLMSG}

ADD and COPY

Both expose the same API: ADD|COPY <src> <dest in the image>. The difference is that ADD supports more use cases, whereas COPY only supports sources accessible from your docker project.

ADD supports those following useful use cases:

  • src can be a URI
  • src can be a tar file. If the compression format is recognized, then the tar file will be automatically unpacked.

The best practice is to use COPY when possible. This is more transparent and obvious for anybody reading the Dockerfile.

RUN

RUN allows you to customize your base image with additional configurations or artifacts. Each time RUN is executed, a new docker commit is performed. When docker is done with the build, you'll be able to see that there is a commit for each RUN command.

RUN can be used in two different ways:

  • Shell form: RUN <shell command>
  • Exec form: RUN ["executable", "param1", "param2", ... "paramN"], which will result in the following shell command: executable param1 param2 ... paramN

In reality, the shell form is just a shortcut for the following exec form: RUN ["/bin/sh", "-c", "shell command"].

WARNING: The exec form does not invoke a command shell (the executable is called directly). That means the exec form does not support variable substitution out of the box. If you need variable substitution while using the exec form, you'll have to explicitly invoke the shell:

RUN ["echo", "$HOME"] # output: $HOME
RUN ["/bin/sh", "-c", "echo", "$HOME"] # output: /users/you/

RUN uses a cache which is not automatically flushed between builds. To explicitly refresh that cache, use the following command:

docker build --no-cache ...

Each RUN will create a new layer. The best practice is to try to keep layers to a minimum. Please refer to the Careful with single-line RUN commands section to learn more.

ENTRYPOINT and CMD

Basics ENTRYPOINT and CMD

Demystifying ENTRYPOINT and CMD in Docker

You define those 2 properties in the Dockerfile. For example:

ENTRYPOINT ["echo”, "Hello"]
CMD ["World"]

With those two set up, starting a container with docker run <YOUR-IMAGE> will execute the default command echo Hello World. As you can see, the default command is just the concatenation of the entrypoint and the cmd.

Though those two can be written as strings, Docker eventually converts them to arrays, so it's usually less confusing to always use arrays.

ENTRYPOINT also works with scripts:

COPY ./docker-entrypoint.sh /
ENTRYPOINT ["/docker-entrypoint.sh"]

CMD vs RUN

  • The main difference between CMD and RUN is that CMD does not result in a docker commit. CMD’s purpose is not to modify the base image, but instead to execute a command as soon as the container is up.
  • There can only be one CMD per Dockerfile. If there are more than one, only the last one will be executed.

Overriding them

To override CMD:

docker run <IMAGE> Baby

This will print: Hello Baby

To override ENTRYPOINT:

docker run -it --entrypoint /bin/bash <IMAGE>

This will allow to interact with the shell from inside the container.

ONBUILD

This directive registers an instruction that will be executed later, when the image is used as the base of another build. It is useful when building images intended to be used as base images. For example, let's imagine the following parent image called parentimage:0.0.1:

FROM ubuntu:14.04
...
ONBUILD ADD . /app/src
...

The following child image does not worry about copying its content to /app/src:

FROM parentimage:0.0.1
...

If the parent had been defined as follows:

FROM ubuntu:14.04
...
ADD . /app/src
...

This would have meant that the content of the parent's own build context (rather than the child's) would have been added to /app/src.

Multi-stage builds

Original article: https://docs.docker.com/develop/develop-images/multistage-build/

The old way - aka The Builder Pattern

Complex images require more layers to be built. Those additional layers increase the size of the final image, which impacts performance. To keep the final image as slim as possible, the first popular pattern, called the builder pattern, used a shell script that would run two Dockerfiles sequentially. The first one, called Dockerfile.build, would build all the artefacts, similarly to a build server. The size of the image created by this Dockerfile.build did not matter. The artefacts would then be passed to the production Dockerfile, which could therefore stay lean.

Later, Docker shipped a new feature called multi-stage build.

The new way - Out-of-the-box multi-stage build

Let's imagine we want to package some NodeJS code to inject those files in another image. We don't really care about NodeJS or NPM, we only want the artefact. The following Dockerfile uses a minimal node image to build our project:

FROM node:14-slim
ARG FUNCTION_DIR="/opt/nodejs/"

RUN mkdir -p $FUNCTION_DIR
WORKDIR $FUNCTION_DIR

COPY package*.json ./
RUN npm i --only=prod

ENTRYPOINT ["/bin/sh"]

However, this image is still 168MB. Because we only care about the actual files, we can simply build the smallest image possible and pass it the built artefacts using the multi-stage build API:

FROM node:14-slim AS builder
ARG FUNCTION_DIR="/opt/nodejs/"

RUN mkdir -p $FUNCTION_DIR
WORKDIR $FUNCTION_DIR

COPY package*.json ./
RUN npm i --only=prod

FROM busybox
ARG FUNCTION_DIR="/opt/nodejs/"

RUN mkdir -p $FUNCTION_DIR
WORKDIR $FUNCTION_DIR

COPY --from=builder $FUNCTION_DIR ./

ENTRYPOINT ["/bin/sh"]

The image's size is now 1.8MB.

Docker configuration

The Docker configuration is maintained in the ~/.docker/config.json file.

Adding other Docker registries

By default, deploying new images targets Docker Hub, but you may want to deploy your images using other services (e.g., Google Cloud Container Registry or AWS Elastic Container Registry (ECR)).

To add new registries, edit the ~/.docker/config.json file by adding a new credHelpers property as follows:

{
  "credHelpers": {
    "gcr.io": "gcloud",
    "us.gcr.io": "gcloud",
    "eu.gcr.io": "gcloud",
    "asia.gcr.io": "gcloud",
    "staging-k8s.gcr.io": "gcloud",
    "marketplace.gcr.io": "gcloud"
  }
}

Where each key represents a domain (e.g., gcr.io) and each value represents a program (e.g., gcloud). The above example shows the exhaustive config for setting up Google Cloud Container Registry for the gcloud CLI.
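
For Google Cloud specifically, the gcloud CLI can write these credHelpers entries for you:

gcloud auth configure-docker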

Development workflow

Designing the correct container

As a matter of fact, designing the right container is about designing the right image. Docker excels at creating new images and spawning containers from them in milliseconds. This efficiency comes from its layers design. The original layers take time to be pulled from the registry (e.g., the first time you pull a Linux distro preconfigured with NodeJS). But once those layers have been pulled, they are cached on your local system. As you build your own layers locally, those layers are also cached, and that's why iterating through building the image you need is so cheap. To test the image, you spawn a container from it. If the container needs to be fixed, you fix the underlying image definition (usually via its Dockerfile), generate a new image, and spawn a new container. These are the basics of the iteration process. You eventually end up with a lot of trash in your Docker images and containers; simply delete them once you're done.

The iteration process is similar to this:

  1. Create a Dockerfile (and a new .dockerignore) based on what you think you need.
  2. Create a new image from that Dockerfile as follows: docker build -t <YOUR-IMAGE-NAME>:<YOUR-TAG> .
  3. Create a new container from that new image: docker run -it <YOUR-IMAGE-NAME>:<YOUR-TAG>
  4. Start interacting with your running container: docker exec -it <CONTAINER ID> sh
  5. Iterate through that process until you have the container you want.

Logs

To print the container's logs:

docker logs <CONTAINER ID OR NAME>

The issue with this command is that it's a one-off. If you need to stream the logs continuously, use this instead:

docker logs --follow <CONTAINER ID OR NAME>
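
Other useful docker logs flags (both are standard options):

# Stream only the last 100 lines
docker logs --follow --tail 100 <CONTAINER ID OR NAME>
# Show only the logs produced in the last 10 minutes
docker logs --since 10m <CONTAINER ID OR NAME>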

Networking

Port binding

If you package an app in your container that listens on port 3000, and if you intend to expose that app outside of the container, then the following won't work:

docker run my_image:v1

Instead, you need to configure your container with port binding as follows:

docker run -p 127.0.0.1:3100:3000 my_image:v1

The above creates a container configured with port forwarding on the host, so that all traffic sent to the host on 127.0.0.1:3100 is forwarded to the app listening on port 3000 inside the container.

NOTE: -p stands for publish

Popular base images

Image Description Link
scratch The most minimal image, useful to build other images. It's pretty much the base image for all the others. https://hub.docker.com/_/scratch
busybox Minimal image with many UNIX tools preinstalled. https://hub.docker.com/_/busybox
alpine:<VERSION> Built on top of busybox, it is a very minimal Linux distribution. https://hub.docker.com/_/alpine
node:<version> Self-explanatory. https://hub.docker.com/_/node
node:<version>-alpine Same as node:<version> but with alpine as the base image to save on space. Might not work in all scenarios. https://hub.docker.com/_/node

Tips and tricks

Dynamic base image with ARG

The following example shows how to use a LAYER_01 argument to load a different layer image for an AWS Lambda image.

ARG LAYER_01
FROM $LAYER_01 AS layer01
FROM amazon/aws-lambda-nodejs:12
ARG FUNCTION_DIR="/var/task"
ARG LAYER01_PATH="/opt/layer01/"

# Copy layer01 files into the opt/ folder
COPY --from=layer01 /opt/nodejs/ $LAYER01_PATH
ENV NODE_PATH "${LAYER01_PATH}node_modules:${NODE_PATH}"

# Create function directory
RUN mkdir -p ${FUNCTION_DIR}

# Copy handler function and package.json
COPY index.js ${FUNCTION_DIR}

# Set the CMD to your handler (could also be done as a parameter override outside of the Dockerfile)
CMD [ "index.handler" ]

Best practices

The order of the instructions in your Dockerfile matters for performance

BAD EXAMPLE:

FROM node:12-slim
WORKDIR /usr/src/app
COPY . ./
RUN npm install --only=prod
CMD npm start

GOOD EXAMPLE:

FROM node:12-slim
WORKDIR /usr/src/app
COPY package*.json ./
RUN npm install --only=prod
COPY . ./
CMD npm start

All instructions in your Dockerfile create a new layer(1). Because layers are cached, and because a change to a layer forces the regeneration of all the following layers, it is recommended to start the Dockerfile with the instructions that rarely change. In the GOOD EXAMPLE, the first 4 lines should rarely change, while the 5th line (COPY . ./) almost always changes. Indeed, each time a file changes, Docker will consider the COPY . ./ layer as different, which forces the regeneration of all the following instructions. In the BAD EXAMPLE, this includes RUN npm install --only=prod. This is a waste, as the dependencies of a NodeJS project rarely change (for readers unfamiliar with NodeJS, those dependencies are explicitly defined in the package.json and package-lock.json files). The GOOD EXAMPLE avoids this by copying the dependency manifests and installing the dependencies before copying the rest of the source code.

(1) Only RUN, COPY and ADD generate layers that impact the image size. The other instructions only produce intermediate layers.

Keep your docker project small

Small means both bytes and number of files. All the files in your project will be sent to the Docker daemon at build time. If there are too many files, or if they are too big, you'll experience bad performance. If you do have a lot of files, but some of them are not required by the build process, use a .dockerignore file.

Use a .dockerignore

Add any files that are not needed by the build process to a .dockerignore file so they are not sent to the Docker daemon.
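
A typical NodeJS .dockerignore looks like the following (the full example is in the Annex):

.git
node_modules
npm-debug.log
Dockerfile
README.md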

Careful with single-line RUN commands

Never write RUN apt-get update on its own on a single line; otherwise Docker will cache the resulting layer, and no updates will ever be performed. Instead, use the following:

RUN apt-get update && apt-get install -y s3cmd=1.1.0.*

This is better because:

  • Explicitly adding or removing dependencies automatically invalidates the cache.
  • Now that the s3cmd version is explicit, updating it will force Docker to invalidate the cache.

It's also considered a best practice to organize your dependencies across multiple lines and in alphabetical order:

RUN apt-get update && apt-get install -y \ 
    aufs-tools \ 
    automake \ 
    btrfs-tools \ 
    build-essential \ 
    curl \
    s3cmd=1.1.0.*

Avoid downloading resources using ADD

Replace this:

ADD http://example.com/big.tar.xz /usr/src/things/
RUN tar -xJf /usr/src/things/big.tar.xz -C /usr/src/things
RUN make -C /usr/src/things all

With this:

RUN mkdir -p /usr/src/things \
    && curl -SL http://example.com/big.tar.xz \
    | tar -xJC /usr/src/things \
    && make -C /usr/src/things all

The second example is better than the first because the first creates a layer just for the ADD and keeps the downloaded resources in the image for no reason.

Use multi-stage builds instead of single build or the builder pattern

Multi-stage builds allow you to load all the tools and run all the expensive steps in intermediate images, and then copy only the final artefacts into a minimal image (small base image and a minimal number of layers). To learn more, please refer to the Multi-stage builds section.

FAQ

How to delete all the containers on my local machine?

docker stop $(docker ps -a -q)
docker rm $(docker ps -a -q)

How to delete all the images on my local machine?

docker stop $(docker ps -a -q)
docker rm $(docker ps -a -q)
docker rmi $(docker images -a -q) --force

How to run a container in the background?

This is called detached mode:

docker run -d my_image:v1

How to pass variables to the docker build command?

  1. Use the --build-arg option as follows:
docker build --build-arg HTTP_PROXY=http://10.20.30.2:1234 --build-arg FTP_PROXY=http://40.50.60.5:4567 .
  2. In your Dockerfile, just after the FROM, add the following commands:
ARG HTTP_PROXY
ARG FTP_PROXY

How to configure Docker to use other registries?

Please refer to the Adding other Docker registries section.

How to read or stream a container's logs?

Please refer to the Logs section.

How to set the PATH in the Dockerfile?

ENV PATH="/opt/gtk/bin:${PATH}"

How to create a global ARG in a multi-stage build?

ARGs are scoped per build stage. This means that each build stage MUST explicitly declare its own ARG. For example, the following is incorrect:

ARG PULUMI_BIN="/opt/.pulumi/bin"

FROM alpine:3.14 AS builder
RUN mkdir -p $PULUMI_BIN

FROM busybox
RUN mkdir -p $PULUMI_BIN

The correct version is:

ARG PULUMI_BIN="/opt/.pulumi/bin"

FROM alpine:3.14 AS builder
ARG PULUMI_BIN
RUN mkdir -p $PULUMI_BIN

FROM busybox
ARG PULUMI_BIN
RUN mkdir -p $PULUMI_BIN

NOTE: Both blocks will use the same value.

Annex

Simple NodeJS Dockerfile and .dockerignore files

Dockerfile

# Use the official lightweight Node.js 12 image.
# https://hub.docker.com/_/node
FROM node:12-slim

# Create and change to the app directory.
WORKDIR /usr/src/app

# Copy application dependency manifests to the container image.
# A wildcard is used to ensure both package.json AND package-lock.json are copied.
# Copying this separately prevents re-running npm install on every code change.
COPY package*.json ./

# Install production dependencies.
RUN npm install --only=prod

# Copy local code to the container image.
COPY . ./

# Run the web service on container startup.
CMD npm start

.dockerignore

Dockerfile
README.md
node_modules
npm-debug.log

Terminal shortcut config

The following config in your terminal profile allows you to use:

  • d instead of docker.
  • drm to delete all the containers.
  • drmi to delete all the images.
  • dhost 4000:8080 to build an image and launch a container and redirect traffic on 127.0.0.1:4000 to port 8080 in your container.
  • dit to build an image and launch a container and start a terminal in it.
  • dit --entrypoint /bin/bash to force starting bash.
  • dlocal 3000:3100 [other options] instead of docker run -p 127.0.0.1:3000:3100 [other options]
alias d="docker"
function drm() {
	docker stop $(docker ps -a -q) 
	docker rm $(docker ps -a -q)
}
function drmi() {
	docker stop $(docker ps -a -q)
	docker rm $(docker ps -a -q)
	docker rmi $(docker images -a -q) --force
}
function dlocal() {
	docker run -p 127.0.0.1:$1 $2 $3 $4 $5 $6
}
function dhost() {
	docker build -t localapp .
	docker run -p 127.0.0.1:$1 localapp:latest
}
function dit() {
	docker build -t localapp .
	docker run -it $1 $2 $3 localapp:latest
}
