edit: comment in moby/moby#32925

Buildkit is a proposal to separate out the docker build experience into a separate project, allowing different users to collaborate on the underlying technology and to reuse and customize it in different ways.

One of the main design goals of buildkit is to separate frontend and backend concerns during a build process. A frontend is something designed for users to describe their build definition. The backend solves the problem of finding the most efficient way to execute a common low-level description of the build operations that has been prepared for it by the frontends.

The purpose of buildkit is not to be an arbitrary task runner. Instead, buildkit solves the problem of converting source code to an artifact in a self-contained, portable, reproducible, and maximally efficient way. Invoking the builder should be traceable to immutable sources, and invoking it shouldn't have any side effects. Buildkit will support intelligent caching of artifacts from previous invocations so it can be used efficiently in a developer workflow.

Buildkit is meant to be used as a long-running service. It is optimized for parallel execution of complex projects and building multiple projects at the same time.

Design draft:

Buildkit is separated into the following subcomponents:

  • sources - getting data from remote sources
  • frontends - preparing build definitions from users into the low-level format
  • solver - finds the most efficient way to execute the build instruction graph and leverages caching for subsequent invocations
  • worker - component in charge of actually running the modification step on a source
  • exporter - component for getting the final results back from the builder
  • snapshots - implementation for filesystem manipulations while the builder executes
  • cache manager - component managing recently used artifacts for efficient rebuilds
  • controlAPI - definition for linking multiple builders together for nested invocation

Connection with the Docker platform

Buildkit is meant to become the next-generation backend implementation for the docker build command and the github.com/docker/docker/builder package. This doesn't mean any changes to the Dockerfile format, as buildkit draws a boundary between build backends and frontends. Dockerfile would be one of the frontend implementations.

When invoked from the Docker CLI, buildkit would be capable of exposing the client's context directory as a source and using Docker containers as a worker. The snapshots would be backed by Docker's layer store (containerd snapshot drivers). End results from the builder would be exported to Docker images.

Frontends

A frontend is a component that takes in a user-provided build definition, parses it, and prepares a generalized definition for the low-level builder.

Buildkit supports multiple frontends. The most common example of a frontend is the Dockerfile. A frontend also has access to the other components of the builder: it can access the sources directly and can store and retrieve resources from the cache. For example, for the Dockerfile frontend to correctly parse the FROM command, it needs to request the config for an image.
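The draft doesn't pin down a frontend API; purely as a hedged illustration, reusing the Op and CacheManager types defined later plus a hypothetical Source handle, a frontend entry point could look roughly like this:

type Frontend interface {
  // Convert turns the user-facing definition (e.g. a Dockerfile) into the
  // low-level Op graph, with access to sources and the cache so that it can,
  // for example, resolve the image config behind a FROM line.
  Convert(ctx context.Context, def []byte, sources []Source, cache CacheManager) ([]Op, error)
}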

Solver/Low-level builder

The core part of the builder is a solver that takes a DAG of low-level build instructions from the frontend and finds a way to execute them in the most efficient manner while keeping the cache for the next invocations.

For this, the graph of build instructions should be loaded into a content-addressable store. Every item in that store can have dependencies on previous items. That makes sure that no definitions are duplicated. To start a build, a root node from that graph is asked to be solved with the provided worker options. That internally calls the same action for its dependencies, and so on.

While solving an instruction, a cache key is computed to see if a result for the instruction can already be found without computing the step. If it is found, the snapshot associated with the cache key can be used as the result directly. After every instruction, the result of the operation is stored under the same cache key for future use.
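As an illustration only (the draft does not define a solver API), the recursive solve described above might look roughly like this; the Worker handle and the cacheKeyFor, lookupCache, storeCache, and executeOp helpers are hypothetical placeholders:

// solve resolves one node of the instruction graph: dependencies are solved
// first, then a cache key is checked, and the step is only executed on a miss.
func solve(ctx context.Context, op Op, w Worker) (snapshot.Snapshot, error) {
  var zero snapshot.Snapshot
  var inputs []snapshot.Snapshot
  for _, dep := range op.Deps {
    snap, err := solve(ctx, dep.Base, w) // dependencies could also be solved in parallel
    if err != nil {
      return zero, err
    }
    inputs = append(inputs, snap)
  }
  key := cacheKeyFor(op, inputs) // content-addressed key for this step (hypothetical helper)
  if snap, ok := lookupCache(key); ok {
    return snap, nil // cache hit: reuse the stored snapshot without running the step
  }
  snap, err := executeOp(ctx, w, op, inputs) // run the step on the worker (hypothetical helper)
  if err != nil {
    return zero, err
  }
  storeCache(key, snap) // keep the result under the same key for future invocations
  return snap, nil
}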

The goal is to:

  • Minimize duplication between builder invocations that share common steps.
  • Minimize duplication between build steps that return identical results.
  • Find efficient parallelization of steps.

Supported operations for LLB:

LLB is optimized for simplicity. The main operation that it supports is running a process in the context of one snapshot and capturing the modifications this process made. To simplify and optimize the implementation, there are built-in operations for copying data from one snapshot to another and for accessing data from one of the remote sources known to the builder.

type Input struct {
  Base  Op  // operation that produces this input
  Index int // which output of that operation to use
}

type Op struct {
  Deps []Input             // inputs this operation depends on
  Outs []snapshot.Snapshot // output snapshots produced by this operation
}

type ExecOp struct {
  Op
  Meta   ExecMeta
  Mounts []Mount // mapping for inputs to paths
}

type CopyOp struct {
  Op
  Sources []string // paths to copy from the input snapshots
  Dest    string   // destination path in the output snapshot
}

type SourceOp struct {
  Op
  Identifier string // reference to a remote source (image, git repo, archive, ...)
}

The low-level builder only works on snapshots. There are no methods for controlling image metadata changes. Image metadata can be managed by a frontend, should it be needed. The only component that could know about the image format is the image exporter. The ExecMeta structure is defined by buildkit and contains a minimal set of properties describing a running process. If more properties are needed (host networking, etc.), they must be set when initializing the worker, and the DAG solver has no idea of their existence.

ExecOp can depend on multiple snapshots as its inputs; one of them would be mounted at / and be used as the root filesystem. Every operation could also export multiple output snapshots.
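As an illustration only (none of the identifiers or field values below are defined by this draft), a frontend might assemble a tiny graph from these types by pulling an image with a SourceOp and running a command on top of it with an ExecOp:

// Pull a base image as the only source of data for this build.
src := SourceOp{Identifier: "docker-image://docker.io/library/alpine:latest"}

// Run a command with the source's first output used as the root filesystem.
run := ExecOp{
  Op: Op{Deps: []Input{{Base: src.Op, Index: 0}}},
  Meta: ExecMeta{
    Args: []string{"/bin/sh", "-c", "apk add --no-cache git"},
    Wd:   "/",
  },
  // Mounts would map input 0 to "/"; the Mount type is not specified in this draft.
}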

Another operation supports invoking other builders from within a build operation, enabling nested invocation of builders. This is covered in more detail in the ControlAPI section.

Sources

Sources is a component that allows registering transport methods with the builder that prepare remote data as snapshots. A build operation can refer to remote data by an identifier; that identifier is used to find the registered source provider that has the actual implementation.

Supported built-in sources include docker images, git repositories, HTTP archives, and local directories. It is likely that a source implementation uses the cache from previous invocations to speed up access to the data. The Docker image source would skip pulling image layers that it has pulled before; the git source could reuse a previously pulled repository and only pull in incremental changes.

When integrated with docker build, an extra source would be available that allows access to the files sent by the Docker client.
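The draft doesn't define the source API; as a hedged sketch, a registered source provider could look roughly like this:

type Source interface {
  // Resolve prepares the data behind an identifier (e.g. an image reference,
  // a git repository URL, or an HTTP archive) as a snapshot, reusing locally
  // cached data from previous invocations where possible.
  Resolve(ctx context.Context, identifier string) (snapshot.Snapshot, error)
}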

Worker

The worker is a component tasked with running a build step. The only required capabilities are moving data between snapshots and executing a command with the correct data mounts.

type ExecMeta struct {
  Args []string
  Env  []string
  User string
  Wd   string
  Tty  bool
  // DisableNetworking bool
}

Usually the worker would run a container to execute the process, but that is not a requirement set by the builder.
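A worker API is not specified in this draft; purely as an illustration, the two required capabilities could be expressed as something like:

type Worker interface {
  // Exec runs a process described by meta with the given input snapshots
  // mounted, and returns the snapshots that capture its modifications.
  Exec(ctx context.Context, meta ExecMeta, mounts []Mount) ([]snapshot.Snapshot, error)
  // Copy moves data between snapshots without running a process.
  Copy(ctx context.Context, inputs []snapshot.Snapshot, sources []string, dest string) (snapshot.Snapshot, error)
}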

Exporter

An exporter is a post-processing step that runs before data is returned. Unlike docker build, where every build results in an image being added to Docker's image store, buildkit can export the build results in other formats.

That means that the build result may be a plugin, an OCI image bundle, or just a file or a directory.

Snapshots

Snapshots cover the implementation of the filesystem modifications needed during the build. The component supports plugging in different backends. When buildkit is used as part of docker build, it would use Docker's layer store or containerd snapshot drivers as a backend. But alternative implementations could be provided; for example, a build would probably work quite well with a FUSE-based snapshot backend.

The snapshots API uses reference counting because the same data may be used outside the build as well. When a part of the system (e.g. a build operation) takes a reference to a snapshot, it can't be deleted by anything else.
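No snapshot API is spelled out in this draft; a minimal sketch of the reference-counting idea could be:

type SnapshotRef interface {
  Mount() (string, error) // path where the snapshot contents can be accessed
  Release() error         // drop this reference; data becomes collectable once unreferenced
}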

Cache

The persistent storage used by buildkit is managed automatically by a cache manager component. The user can specify a simple cache policy that is used by the garbage collector to clean up unneeded resources.

type GCPolicy struct {
    MaxSize         uint64
    MaxKeepDuration time.Duration
}

The build cache contains snapshots previously accessed by the builder and some metadata for the operations (cache keys) referring to these snapshots. If a builder has stopped using a snapshot, before releasing it, it would call RetainSnapshot(snapshot.Snapshot, CachePolicy) to make the cache manager responsible for keeping the snapshot data and releasing it once free space is needed.

The user also has control to see what is currently tracked by the cache manager and manually prune its contents.

type CacheManager interface {
    DiskUsage(context.Context) ([]CacheRecord, error)
    Prune(context.Context, CacheSelector) error
    GC(context.Context, GCPolicy) error
}
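For example (the concrete limits and the cacheManager/ctx variables are made up for illustration), a periodic cleanup could be triggered with:

err := cacheManager.GC(ctx, GCPolicy{
    MaxSize:         10e9,                // keep the total cache below ~10GB
    MaxKeepDuration: 30 * 24 * time.Hour, // drop records unused for 30 days
})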

Cache import/export

The build cache can be exported out of buildkit and imported on another machine. It can be stored in a registry using a distribution manifest.

ExportCache(selector CacheSelector) ([]byte, []snapshot.Snapshot, error)

ExportCache would export a config object with metadata about how the snapshots are referred to by operation cache keys. That data could be pushed to a registry, with every snapshot being pushed as a separate blob. A cache importer can read back this config and expose the operation cache to the currently running builder action. If a cache key requested by an operation is not found locally but exists in the imported configuration, the snapshot associated with it can be pulled in from the registry.
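As a rough sketch of that flow (the selector value and the pushConfig/pushBlob helpers are hypothetical, standing in for registry client code):

config, snaps, err := ExportCache(selector)
if err != nil {
    return err
}
pushConfig(config) // push the metadata config as a distribution manifest
for _, s := range snaps {
    pushBlob(s) // push every snapshot as a separate blob
}
// An importer on another machine reads the config back and pulls a blob only
// when a matching cache key is requested by an operation.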

ControlAPI

The control API is an API layer for controlling the builder while it is running as a long-running service. It supports invoking a build job and inspecting how a build job would execute. The user should be able to query a build target without executing it and see the vertex graph of all operations that would be executed, along with whether they are already backed by the cache.

Load(context.Context, []Op) error
Build(context.Context, digest.Digest, bool) ([]Vertex, error)

By defining a common interface, a client program can be used with multiple builder implementations. This also enables supporting nested builder invocations as a build operation. That would be similar to ExecOp, but instead of executing a process, the builder would invoke a controlapi.Build request.
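As an illustrative usage sketch (the draft doesn't define what the boolean argument to Build means; here it is assumed to request the plan-only inspection described above, and client, ops, and rootDigest are placeholders):

// Load the operation graph into the builder's content-addressable store,
// then ask for the root digest to be solved or, here, only inspected.
if err := client.Load(ctx, ops); err != nil {
    return err
}
vertices, err := client.Build(ctx, rootDigest, true)
if err != nil {
    return err
}
for _, v := range vertices {
    fmt.Println(v) // e.g. show each operation and whether it is already cached
}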

@AkihiroSuda

left comment here: moby/moby#32550 (comment) 👍
