@dergachev

Caching Debian package installation with Docker

TLDR: I now add the following snippet to all my Dockerfiles:

# If host is running squid-deb-proxy on port 8000, populate /etc/apt/apt.conf.d/30proxy
# By default, squid-deb-proxy 403s unknown sources, so apt shouldn't proxy ppa.launchpad.net
RUN route -n | awk '/^0.0.0.0/ {print $2}' > /tmp/host_ip.txt
RUN echo "HEAD /" | nc `cat /tmp/host_ip.txt` 8000 | grep squid-deb-proxy \
  && (echo "Acquire::http::Proxy \"http://$(cat /tmp/host_ip.txt):8000\";" > /etc/apt/apt.conf.d/30proxy) \
  && (echo "Acquire::http::Proxy::ppa.launchpad.net DIRECT;" >> /etc/apt/apt.conf.d/30proxy) \
  || echo "No squid-deb-proxy detected on docker host"

Caching apt-get install with docker

If you're using docker, you probably have a Dockerfile that starts like this:

FROM ubuntu:12.04
RUN apt-get update

# install all my favorite utilities, putting it early to facilitate docker caching
RUN apt-get install -y curl git vim make build-essential

# install all pre-requisite packages for our dockerized application
RUN apt-get install -y libyaml-dev libxml2-dev libxslt-dev ruby1.9.1 ruby1.9.1-dev

# other stuff...

The best part about Docker (vs vagrant-lxc, for example) is that it automatically caches each successful build step, so when you tweak the Dockerfile and re-run docker build, it only re-executes the steps from the first change onward. That's a massive win, unless you like watching your packages install!
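
For example, building the image twice in a row with no Dockerfile changes makes the cache visible: the second build prints "---> Using cache" for each unchanged step instead of re-running it (the image tag below is just a placeholder):

docker build -t myapp .   # first build: every RUN step actually executes
docker build -t myapp .   # second build: unchanged steps are reused from cache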

Even with this workflow, you'll still regularly invalidate Docker's build cache and be forced to re-install your packages. Say you just realized that the base ubuntu image is missing man (what horror!). Sure, you could add it to the bottom of your Dockerfile and keep your caches, but that's just going to bug you:

FROM ubuntu:12.04
RUN apt-get update

# install all my favorite utilities, putting it early to facilitate docker caching
RUN apt-get install -y curl git vim make build-essential

# install all pre-requisite packages for our dockerized application
RUN apt-get install -y libyaml-dev libxml2-dev libxslt-dev ruby1.9.1 ruby1.9.1-dev

# other stuff...

# all by itself
RUN apt-get install -y man

You can save yourself refactoring, time, and bandwidth by installing a caching proxy for apt. Assuming you're running Debian/Ubuntu on your docker host, the best option seems to be squid-deb-proxy.

I recommend installing it on the docker host, as follows:

sudo apt-get install -y squid-deb-proxy
# it automatically starts listening on port 8000 (on the host)
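
To sanity-check that the proxy is really listening on port 8000, you can run the same HEAD-request trick from the host that the Dockerfile snippet below uses for detection:

# should print a line mentioning squid-deb-proxy if the proxy is up
echo "HEAD /" | nc localhost 8000 | grep squid-deb-proxy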

Now we need to get our docker containers to use the proxy. There's a package called squid-deb-proxy-client that automatically detects a local proxy, but it relies on the zeroconf daemon, and daemons generally don't work in docker containers without a lot of fuss. Instead, modify your Dockerfile to create /etc/apt/apt.conf.d/30proxy, which configures apt to use the proxy at http://HOST-IP:8000:

FROM ubuntu:12.04

# If host is running squid-deb-proxy on port 8000, populate /etc/apt/apt.conf.d/30proxy
# By default, squid-deb-proxy 403s unknown sources, so apt shouldn't proxy ppa.launchpad.net
RUN route -n | awk '/^0.0.0.0/ {print $2}' > /tmp/host_ip.txt
RUN echo "HEAD /" | nc `cat /tmp/host_ip.txt` 8000 | grep squid-deb-proxy \
  && (echo "Acquire::http::Proxy \"http://$(cat /tmp/host_ip.txt):8000\";" > /etc/apt/apt.conf.d/30proxy) \
  && (echo "Acquire::http::Proxy::ppa.launchpad.net DIRECT;" >> /etc/apt/apt.conf.d/30proxy) \
  || echo "No squid-deb-proxy detected on docker host"

RUN apt-get update
RUN apt-get install -y {PACKAGES}
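
If the proxy is detected, the resulting /etc/apt/apt.conf.d/30proxy ends up containing two lines like these (172.17.42.1 is just an example of the bridge address that route -n typically reports from inside a container; yours may differ):

Acquire::http::Proxy "http://172.17.42.1:8000";
Acquire::http::Proxy::ppa.launchpad.net DIRECT;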

Obviously it's still faster if Docker can cache the install steps themselves, but caching just the package downloads is already a big speedup.
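
To confirm the proxy is actually serving packages from its cache, watch its access log on the host during a build; assuming the stock squid-deb-proxy install, repeated downloads show up as TCP_HIT or TCP_MEM_HIT entries rather than misses:

sudo tail -f /var/log/squid-deb-proxy/access.log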

Want more info? Here are my sources:

Enjoy!

@aerickson

@Aposhian it doesn't cache during a docker build, only a docker run.

I have a gist with what I'm currently using. It doesn't require modifying the Dockerfile at all, just the build command.

https://gist.github.com/aerickson/3f785dd2fb75de27c30468dbac91cb96
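
A minimal sketch of that build-command-only style (not necessarily what the linked gist does) is to pass the proxy in through Docker's predefined proxy build args, which apt inside the build picks up from the http_proxy environment variable; note that this proxies everything, including ppa.launchpad.net:

# 172.17.0.1 is the usual docker0 bridge address; substitute your host's
docker build --build-arg http_proxy=http://172.17.0.1:8000 -t myapp .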
