@mboersma
Last active November 22, 2016 19:24

Handling Updates to deis/base

This document proposes solutions to the process of updating dependent components when the deis/base image is updated.

Background

Most Deis Workflow images use deis/base as the starting point for their Docker images:

FROM deis/base

Besides ensuring consistency and network efficiency, a common base image allows security fixes to be applied in one place. The deis/base image was refactored to use ubuntu-slim specifically to enable security scanning.

When a relevant CVE is found and fixed, the Workflow team will apply that fix to deis/base. It is important that the new base image be rolled out quickly to all dependent components.

Current Challenges

There are a large number of Deis repositories that consume deis/base. There may be some obscure components that are easily forgotten.

Catching up with deis/base updates is a purely manual process. Creating a large number of Pull Requests is tedious and error-prone. Initiating the creation of these PRs is left to engineers' own initiative and thus may lag or be forgotten.

It isn't immediately obvious when deis/base has been updated. Currently the only signal is a Slack notification when a new tag is pushed, and even subscribers to the correct channel can easily miss that important event.

Mutable or Immutable Tags?

Another approach to handling base image updates is to rebuild the base image in place, without changing its tag. A variant of this is to update the full tag while maintaining "cascading" semantic version tags: in that scenario, deis/base:0.3 would always point to the latest patch release tag, such as deis/base:0.3.4.
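
For illustration only, a cascading tag scheme could be maintained with plain docker commands along these lines (the version numbers are examples, not an existing release process):

# Build and publish the new patch release...
docker build -t deis/base:0.3.4 .
docker push deis/base:0.3.4
# ...then re-point the floating minor tag at it, so "0.3" always means the latest 0.3.x.
docker tag deis/base:0.3.4 deis/base:0.3
docker push deis/base:0.3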

Discussion of using mutable tags concluded negatively. Simply rebuilding a dependent image to pick up base image changes, without a specific code change to point to, is problematic from both the CI and traceability points of view. And since the Workflow production charts use imagePullPolicy: IfNotPresent, it isn't realistic to expect users to pull updated Docker images in place.

Philosophically, the Workflow team prefers to be explicit and to pin software packages at specific versions. It's safer to err on the side of creating lots of annoying "housekeeping" PRs that make dependencies explicit, so that artifacts can be rebuilt reliably.

It's possible there is a better way to use mutable tags, and more feedback here would be welcome. For now, this document considers only immutable Docker tags in its proposals.

Goals

  • All dependent components are updated quickly to match deis/base updates
  • Updating components does not require extensive manual work

Proposed Solutions

When important, security-related changes are made to deis/base, it is released with a new git tag resulting in a new Docker image artifact matching that tag. How can components ensure they update the FROM line in relevant Dockerfiles?

Create PRs or Commits When deis/base Is Tagged

The CI system (specifically jenkins-jobs) maintains the list of dependent components that must be kept up to date. When deis/base is successfully deployed with a new tag, CI jobs run that do one of the following (a rough sketch of such a job appears after the list):

  • Programmatically create all the necessary Pull Requests to update Dockerfiles, attaching a p0 (priority 0) label
  • Programmatically commit Dockerfile changes to master rather than creating a PR
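
As a rough sketch only, such a job could look something like the following; the repository list, branch naming, and the direct use of the GitHub pull request API are assumptions for illustration, not an existing jenkins-jobs implementation:

#!/usr/bin/env bash
# Hypothetical job: open a FROM-bump PR in each dependent repo after deis/base is tagged.
# GITHUB_TOKEN is expected to be provided by the CI environment.
set -euo pipefail

NEW_TAG="$1"                                  # e.g. v0.3.5, supplied by the CI trigger
REPOS="controller builder router slugrunner"  # illustrative; the real list would live in CI config

for repo in $REPOS; do
  git clone --depth 1 "https://github.com/deis/${repo}.git"
  cd "$repo"
  git checkout -b "update-deis-base-${NEW_TAG}"
  # Point every Dockerfile at the new base image tag.
  grep -rl --include=Dockerfile '^FROM deis/base' . \
    | xargs sed -i "s|^FROM deis/base.*|FROM deis/base:${NEW_TAG}|"
  git commit -am "chore(Dockerfile): bump deis/base to ${NEW_TAG}"
  git push origin "update-deis-base-${NEW_TAG}"
  # Open the pull request via the GitHub API; applying the p0 label is a separate issues API call, omitted here.
  curl -s -H "Authorization: token ${GITHUB_TOKEN}" \
    -d "{\"title\": \"chore(Dockerfile): bump deis/base to ${NEW_TAG}\", \"head\": \"update-deis-base-${NEW_TAG}\", \"base\": \"master\"}" \
    "https://api.github.com/repos/deis/${repo}/pulls"
  cd ..
done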

Check for an Out-of-Date Base Image on master Builds

When code is merged to master in a dependent component, a check is made to see whether FROM deis/base is out of date. If it is, the build fails. An engineer must then create the Pull Request to update FROM deis/base and see it through to completion.

This check could be implemented directly in the project Makefile, so developers would be warned immediately during their normal build/test cycle that an update to the base image needs to happen ASAP.
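One way such a check might look is a small shell helper invoked from a make target; the Dockerfile path and the use of git tags as the source of truth are assumptions for illustration:

#!/usr/bin/env bash
# Hypothetical check a "make check-base-image" target could run before building.
set -euo pipefail

# The deis/base tag currently pinned in this component's Dockerfile.
current=$(grep -m1 '^FROM deis/base:' rootfs/Dockerfile | cut -d: -f2)

# The newest tag published in the deis/base repository, according to git.
latest=$(git ls-remote --tags https://github.com/deis/base.git \
  | awk -F/ '{print $NF}' | grep -v '{}' | sort -V | tail -n1)

if [ "$current" != "$latest" ]; then
  echo "Dockerfile pins deis/base:${current}, but ${latest} is available; update FROM deis/base" >&2
  exit 1
fi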

So that dependent components do not fall through the cracks, this would require CI to build components from master regularly, at a minimum once a day. The failure notifications should also be very hard to ignore.

Further Issues

While the two proposals above help with getting an updated FROM deis/base line into the master branch for dependent components, they do not make a patch release of that component itself. So if a major CVE is patched in deis/base and deis/controller updates its Dockerfile in master, it may still be necessary to release a new tag for deis/controller.

One problem with automating releases of the dependent components is that the master branch itself may not be intended to be released as-is. While all components should be releasable from master at any given time, in practice it might be undesirable to simply tack a FROM deis/base commit onto master and automatically release that. Previous changes may imply the next semantic version tag should be a minor (or major!) bump, not a patch. And proper quality control may mean the best approach is to cherry-pick the deis/base commit on top of a previous tagged release.
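
For example, cutting such a patch release might look roughly like this (the tags and the commit reference are purely illustrative):

# Start from the last released tag rather than master, take only the base-image bump, and tag a patch.
git checkout -b release-v2.9.1 v2.9.0
git cherry-pick <sha-of-the-FROM-deis/base-commit>
git tag v2.9.1
git push origin v2.9.1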

For now, the tagging and releasing of the dependent components themselves is outside the scope of what this document proposes to solve.

bacongobbler commented Oct 5, 2016

One of the reasons this issue came up is that no code changes are required when a security update is released for deis/base. Since there is no code change, there is no new commit to tag; the only option is to "technically" cut a new release from the same commit as the last patch release.

Therefore, I'd propose to change deis/base tags to distribution-based (i.e. deis/base:xenial), much like the underlying ubuntu:xenial image it relies on. In this case we would just rebuild the tag when any security update comes out and then the underlying components get the changes.

OR

Quit using deis/base altogether, stop trying to package a distribution with a common set of packages, and just let the components roll on their own using ubuntu:xenial, installing their packages as necessary. The image layer optimization problem is much smaller now because most Kubernetes clusters have fast networking and can usually fetch the entire set of Workflow component images in under 3 minutes. With v1 this was a bigger problem because we had people deploying vagrant clusters in third-world countries with huge packet loss. Our target market now uses these components separately (like the router) or runs on AWS/GKE/Azure, which have ample bandwidth to download Workflow. What's a few extra layers to get rid of this problem?

@jchauncey

I think from a security perspective it is better for us to have one base image that everything derives from rather than have everyone do their own thing. Most people aren't paying close attention to the image scans, so trying to manage that for 10+ components will get unwieldy. We will need to make sure we can quickly respond to any security problems that crop up, and having to patch each component separately will be way too much of a headache.

I kind of like the idea of having our base image have a name like xenial which reflects the underlying image name. I wonder if we should do something like

deis/base:xenial-1.0.0 so we can version the image but also have some understanding of what it was built from. The problem we have is that you can't just republish the same image with the same tag. Technically, that doesn't even fit semver. If we cut a new image with security patches in it we should roll to at least x.x.1.

mboersma commented Nov 22, 2016

For better or worse, it seems we're sticking with immutable semver tags for deis/base, and requiring dependent images to update explicitly. @bacongobbler and I did so for the recent v0.3.5 update, and it was a bit tedious but not terrible.

Unless and until this becomes a frequent task, the approach is going to be a manual one: update deis/base, tag it, then create a slew of PRs.
