@dergachev
Last active May 25, 2023 03:55
Caching debian package installation with docker

TLDR: I now add the following snippet to all my Dockerfiles:

# If host is running squid-deb-proxy on port 8000, populate /etc/apt/apt.conf.d/30proxy
# By default, squid-deb-proxy 403s unknown sources, so apt shouldn't proxy ppa.launchpad.net
RUN route -n | awk '/^0.0.0.0/ {print $2}' > /tmp/host_ip.txt
RUN echo "HEAD /" | nc `cat /tmp/host_ip.txt` 8000 | grep squid-deb-proxy \
  && (echo "Acquire::http::Proxy \"http://$(cat /tmp/host_ip.txt):8000\";" > /etc/apt/apt.conf.d/30proxy) \
  && (echo "Acquire::http::Proxy::ppa.launchpad.net DIRECT;" >> /etc/apt/apt.conf.d/30proxy) \
  || echo "No squid-deb-proxy detected on docker host"

Caching apt-get install with docker

If you're using docker, you probably have a Dockerfile that starts like this:

FROM ubuntu:12.04
RUN apt-get update

# install all my favorite utilities, putting it early to facilitate docker caching
RUN apt-get install -y curl git vim make build-essential

# install all pre-requisite packages for our dockerized application
RUN apt-get install -y libyaml-dev libxml2-dev libxslt-dev ruby1.9.1 ruby1.9.1-dev

# other stuff...

The best part about docker (vs vagrant-lxc, for example) is that Docker will automatically cache each successful build step in the Dockerfile, and each time you tweak the Dockerfile and re-run docker build, it only needs to re-run from the first change in the Dockerfile. That's a massive win, unless you like watching your packages install!

Even with this workflow, you'll still regularly invalidate Docker's build cache and be forced to re-install your packages. For example, say you just realized that the base ubuntu image is missing man (what horror!). Sure, you could add it to the bottom of your Dockerfile and keep your caches, but that's just gonna bug you:

FROM ubuntu:12.04
RUN apt-get update

# install all my favorite utilities, putting it early to facilitate docker caching
RUN apt-get install -y curl git vim make build-essential

# install all pre-requisite packages for our dockerized application
RUN apt-get install -y libyaml-dev libxml2-dev libxslt-dev ruby1.9.1 ruby1.9.1-dev

# other stuff...

# all by itself
RUN apt-get install -y man

You can save yourself refactoring, time, and bandwidth by installing a caching proxy for apt. Assuming you're running Debian/Ubuntu on your docker host, the best one seems to be squid-deb-proxy.

I recommend installing it on the docker host, as follows:

sudo apt-get install -y squid-deb-proxy
# it automatically starts listening on port 8000 (on the host)

Now we need to get our docker containers to use the proxy. While there's a package called squid-deb-proxy-client that automatically detects the presence of a local proxy, it relies on the zeroconf daemon, and daemons generally don't work in docker containers without a lot of fuss. Instead, modify your Dockerfile to create /etc/apt/apt.conf.d/30proxy, which configures apt to use the proxy at http://HOST-IP:8000:

FROM ubuntu:12.04

# If host is running squid-deb-proxy on port 8000, populate /etc/apt/apt.conf.d/30proxy
# By default, squid-deb-proxy 403s unknown sources, so apt shouldn't proxy ppa.launchpad.net
RUN route -n | awk '/^0.0.0.0/ {print $2}' > /tmp/host_ip.txt
RUN echo "HEAD /" | nc `cat /tmp/host_ip.txt` 8000 | grep squid-deb-proxy \
  && (echo "Acquire::http::Proxy \"http://$(cat /tmp/host_ip.txt):8000\";" > /etc/apt/apt.conf.d/30proxy) \
  && (echo "Acquire::http::Proxy::ppa.launchpad.net DIRECT;" >> /etc/apt/apt.conf.d/30proxy) \
  || echo "No squid-deb-proxy detected on docker host"

RUN apt-get update
RUN apt-get install -y {PACKAGES}
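
The snippet above finds the host by reading the default gateway from route -n. On newer base images route (from net-tools) may be missing, so here is a minimal sketch of the same lookup against ip route output. It assumes iproute2 is installed in the image; the function name default_gateway and the sample addresses are made up for illustration. The parser reads from stdin so it can be tried outside a container.

```shell
#!/bin/sh
# Hypothetical helper: print the default gateway from `ip route` output on stdin.
default_gateway() {
    awk '$1 == "default" && $2 == "via" { print $3; exit }'
}

# Inside a container this would be: ip route | default_gateway
# Illustrative output from a container on the default bridge network.
sample='default via 172.17.0.1 dev eth0
172.17.0.0/16 dev eth0 proto kernel scope link src 172.17.0.2'

printf '%s\n' "$sample" | default_gateway   # prints 172.17.0.1
```

The printed address can stand in for the contents of /tmp/host_ip.txt in the Dockerfile above.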

Obviously it's still faster if you can get Docker to cache the install steps, but caching just the package downloads still gives you a big speedup.

Want more info? Here are my sources:

Enjoy!

@islander

islander commented Jun 14, 2019

HOST_IP=$(awk '/^[a-z]+[0-9]+\t00000000/ { printf("%d.%d.%d.%d\n", "0x" substr($3, 7, 2), "0x" substr($3, 5, 2), "0x" substr($3, 3, 2), "0x" substr($3, 1, 2)) }' < /proc/net/route)

This line doesn't work on my installation (Ubuntu 18.04 / GNU Awk 4.1.4).
The interface regexp should be [a-z0-9]+ to match newer interface names such as enp2s0. Also, printf doesn't parse hex digits from a string, so you have to convert explicitly with strtonum("0x" substr($3, 7, 2)) or pass the --non-decimal-data option to awk. Here's my fixed version of the script:

#!/bin/bash -ex
# see:
# https://github.com/sameersbn/docker-apt-cacher-ng
# https://gist.github.com/dergachev/8441335

CONFPATH=/etc/apt/apt.conf.d/01proxy 
APT_PROXY_PORT=$1
HOST_IP=$(awk --non-decimal-data '/^[a-z0-9]+\t00000000/ { printf("%d.%d.%d.%d\n", "0x" substr($3, 7, 2), "0x" substr($3, 5, 2), "0x" substr($3, 3, 2), "0x" substr($3, 1, 2)) }' < /proc/net/route)

if [[ -n "$APT_PROXY_PORT" && -n "$HOST_IP" ]]; then
    # heredoc body is left unindented: <<- would only strip leading tabs, not spaces
    cat > "$CONFPATH" <<EOL
Acquire::HTTP::Proxy "http://${HOST_IP}:${APT_PROXY_PORT}";
Acquire::HTTPS::Proxy "false";
EOL
    cat "$CONFPATH"
    echo "Using host's apt proxy"
else
    echo "No squid-deb-proxy detected on docker host"
fi

UPD: also, see this workaround for other versions of awk (mawk, etc)
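
For awk variants without --non-decimal-data or strtonum, one option is to let awk only pick out the hex field and do the conversion in the shell. The sketch below is a rough take on that idea, not the linked workaround: the helper names hex_to_ip and default_gateway_hex are made up, the sample row is illustrative, and like the awk version it assumes a little-endian host.

```shell
#!/bin/sh
# Hypothetical helper: convert a little-endian hex address, as found in the
# Gateway column of /proc/net/route, to dotted-quad notation.
hex_to_ip() {
    hex=$1
    b4=$(printf '%s' "$hex" | cut -c1-2)
    b3=$(printf '%s' "$hex" | cut -c3-4)
    b2=$(printf '%s' "$hex" | cut -c5-6)
    b1=$(printf '%s' "$hex" | cut -c7-8)
    # printf's %d accepts C-style 0x constants, so awk never has to parse hex.
    printf '%d.%d.%d.%d\n' "0x$b1" "0x$b2" "0x$b3" "0x$b4"
}

# Hypothetical helper: print the hex gateway of the default route
# (Destination column 00000000) from route-table text on stdin.
default_gateway_hex() {
    awk '$2 == "00000000" { print $3; exit }'
}

# Inside a container: hex_to_ip "$(default_gateway_hex < /proc/net/route)"
# Illustrative sample (tab-separated, like the real file):
sample=$(printf 'Iface\tDestination\tGateway\tFlags\neth0\t00000000\t010011AC\t0003\n')
hex_to_ip "$(printf '%s\n' "$sample" | default_gateway_hex)"   # prints 172.17.0.1
```

Matching on the Destination column alone also sidesteps the interface-name regexp, so names like enp2s0 need no special handling.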

@bytearchive

🆒

@Aposhian

Aposhian commented Aug 6, 2021

What are the advantages of using squid-deb-proxy vs using buildkit to cache your /var/cache/apt and /var/lib/apt directories between runs?

https://github.com/moby/buildkit/blob/master/frontend/dockerfile/docs/syntax.md#example-cache-apt-packages
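
For readers who haven't seen it, the BuildKit approach at that link looks roughly like the sketch below, adapted from the cache-mount example in the BuildKit Dockerfile docs. The output file name, the base image tag, and the gcc package are placeholders. The script only writes the Dockerfile and prints a build command; it does not run docker itself.

```shell
#!/bin/sh
# Write a Dockerfile that keeps apt's download cache and package lists in
# BuildKit cache mounts, so they survive from one build to the next.
# The file name, base image tag, and package (gcc) are placeholders.
dockerfile="${TMPDIR:-/tmp}/Dockerfile.aptcache"

cat > "$dockerfile" <<'EOF'
# syntax=docker/dockerfile:1
FROM ubuntu:22.04
# The official images delete downloaded .debs after each install; undo that
# so the cache mount has something to keep.
RUN rm -f /etc/apt/apt.conf.d/docker-clean; \
    echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' \
      > /etc/apt/apt.conf.d/keep-cache
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    apt-get update && apt-get install -y --no-install-recommends gcc
EOF

echo "Wrote $dockerfile; build with:"
echo "  DOCKER_BUILDKIT=1 docker build -f $dockerfile ."
```

Unlike the proxy, the cache mounts live in BuildKit's own storage on the build host, so nothing extra has to be installed or kept running.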

@aerickson

@Aposhian it doesn't cache during a docker build, only during a docker run.

I have a gist with what I'm currently using. It doesn't require modifying the Dockerfile at all, just the build command.

https://gist.github.com/aerickson/3f785dd2fb75de27c30468dbac91cb96
