@nirev
Created August 14, 2020 16:34
Caching docker images

Speeding up Docker builds in CI

Every time a build starts in a new agent, the docker build cache is empty, and thus a docker build will need to rebuild the entire image from scratch.

To avoid this we can repopulate the docker cache when starting a build.

Without docker multi stage

If your build is not using multi-stage yet, re-populating the cache is easier. The cache can be populated by pulling the latest image from registry before building:

docker pull service:dev || true
docker build -t service:dev .
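In a CI pipeline, the full cycle might look like this (a sketch; the `registry.example.com` prefix is a hypothetical registry, and `--cache-from` is added to tell the builder explicitly which image to use as a cache source):

```shell
# Pull the previous image to warm the build cache;
# "|| true" keeps the very first build from failing when no image exists yet
docker pull registry.example.com/service:dev || true

# Rebuild; unchanged layers are reused from the pulled image
docker build \
  --cache-from=registry.example.com/service:dev \
  --tag registry.example.com/service:dev .

# Push so the next build can pull this image as its cache
docker push registry.example.com/service:dev
```

With the classic builder, pulled layers may also be reused implicitly, but passing `--cache-from` makes the intent explicit and keeps the script working under BuildKit.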

However, a multi-stage build can be worthwhile in order to:

  • reduce final image size
  • reduce build time by building layers in parallel

With docker multi-stage

Take the following Dockerfile using multi stage build as an example:

FROM hexpm/elixir:1.10.4-erlang-23.0.2-alpine-3.11.6 as compile
WORKDIR /build
COPY something something2 /build/
RUN build.sh --target /app

FROM alpine:3.11.6 as runtime
COPY --from=compile /app /app
ENTRYPOINT ["/app/run"]

(note: this is not a real Dockerfile)

It creates a compile stage from a base Elixir image, copies the code in, and compiles it, saving the compiled artifacts to /app.

Then, it creates a new (and final) stage called runtime, that starts from a minimal alpine image, and copies the compiled artifacts from the compile stage.

What happens if you try to rebuild the cache the same way as before? Suppose you build and push the image:

docker build -t service:dev .
docker push service:dev

And then try to repopulate the cache:

docker pull service:dev || true
docker build -t service:dev .

The build process will start from scratch. This happens because the final tagged image contains only the runtime stage. That stage carries no cache for the compile stage, which most likely does the heavy lifting and is the longest build step.
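You can see this for yourself by inspecting the layers of the pushed tag (a sketch, using the same `service:dev` tag as above):

```shell
# List the layers stored in the tagged image.
# Only the runtime stage's layers appear -- none of the compile
# stage's layers are in the image, so after a pull there is
# nothing to reuse as cache for the compile stage.
docker history service:dev
```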

In order to cache it, you need to split your build to tag the intermediate step:

docker pull service:compile || true
docker pull service:dev || true

# Build compile stage using pulled image as cache:
docker build \
  --target compile \
  --cache-from=service:compile \
  --tag service:compile .

# Build runtime stage:
docker build \
  --target runtime \
  --cache-from=service:compile \
  --cache-from=service:dev \
  --tag service:dev .

With that, the cache will be populated from image and reused in following builds.

NOTE: remember to push the tagged images to your registry ;)

Parallel builds with BuildKit

Another improvement that can be made is using the newer docker build system: BuildKit (github repo)

BuildKit has been integrated with docker build since version 18.06 (2018-07-18), but is still in development and not enabled by default.

It is useful when you have a Dockerfile like this:

FROM something as base
WORKDIR /build
COPY something /build
RUN get-dependencies

FROM base as compile-backend
RUN compile-backend.sh

FROM base as compile-frontend
RUN compile-frontend.sh

FROM base as compiled
COPY --from=compile-backend /app /app
COPY --from=compile-frontend /assets /app/assets

FROM minimal-something as runtime
COPY --from=compile-backend /app /app
COPY --from=compile-frontend /assets /app/assets
ENTRYPOINT ["/app/run"]

In this example Dockerfile, compile-backend and compile-frontend share a dependency on base but can be built in parallel; compiled and runtime both wait for them, but can in turn also be built in parallel. The compiled stage exists solely for caching purposes, and it is the one that should be tagged and pushed to the registry.

To be able to build this in parallel with BuildKit, you need to enable it:

export DOCKER_BUILDKIT=1
docker build \
  --build-arg BUILDKIT_INLINE_CACHE=1 \
  --target compiled \
  --cache-from=service:compiled \
  --tag service:compiled .

BUILDKIT_INLINE_CACHE should be set so the pushed image contains all the cached layers, otherwise the cache won't be repopulated when pulling the image.
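Putting it together, a full BuildKit CI step might look like this (a sketch reusing the hypothetical `service` image names from above):

```shell
export DOCKER_BUILDKIT=1

# Warm the cache; tolerate missing images on the first run
docker pull service:compiled || true
docker pull service:dev || true

# Build the caching stage, embedding inline cache metadata
# so the pushed image can serve as a cache source later
docker build \
  --build-arg BUILDKIT_INLINE_CACHE=1 \
  --target compiled \
  --cache-from=service:compiled \
  --tag service:compiled .

# Build the final stage, reusing both caches
docker build \
  --build-arg BUILDKIT_INLINE_CACHE=1 \
  --target runtime \
  --cache-from=service:compiled \
  --cache-from=service:dev \
  --tag service:dev .

# Push both so future builds can pull them as cache
docker push service:compiled
docker push service:dev
```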

Here's what a BuildKit parallel build looks like: (image: building the deps-dev, deps-test, deps-prod and dialyzer stages in parallel)

Docker buildx

There is a new experimental docker buildx feature that integrates the docker build process more tightly with BuildKit. If you'd like to give it a try and report back, here it is: https://docs.docker.com/buildx/working-with-buildx/ :)
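With buildx, the cache can be exported to the registry separately from the image, which removes the need for a dedicated caching stage and tag. A sketch (the `service:buildcache` cache ref is a hypothetical name):

```shell
# Build, pull cache from and push cache to a dedicated registry ref.
# mode=max exports cache for all stages, not just the final one.
docker buildx build \
  --target runtime \
  --cache-from type=registry,ref=service:buildcache \
  --cache-to type=registry,ref=service:buildcache,mode=max \
  --tag service:dev \
  --push .
```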

