@aweiteka
Last active July 29, 2019 18:11
Docker Image Signing and Provenance

Provenance (from the French provenir meaning "to come from"): the origin or source of something

Overview

We need a better way of distributing docker images securely. This includes:

  • data integrity
  • issuer trust and verification
  • supporting distributed access to assemble an image
  • attestation of the image assembly (commands/instructions to build it)

Definitions

  • image – immutable detached content that is the means for instantiating a container
  • image namespace – the owner scope of an image
  • registry – an endpoint that can serve information on named images, within a namespace.
  • namespace identity – cryptographically safe identity or fingerprint for an image namespace

Requirements

  • definition of a registry as an endpoint for identity of a requested image's namespace
  • allow detached validation of the identity of an image's namespace
  • a sane default registry (the docker hub)
  • configurable priority of registries (to solve conflict in image namespace)
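
As an illustration of the last requirement, the following minimal sketch resolves a namespace conflict by consulting registries in configured priority order. The configuration shape, the `pick_registry` helper, and the namespace sets are all hypothetical and not part of any docker API.

```python
from typing import Dict, List, Optional

# Hypothetical configuration: registries listed from highest to lowest priority,
# each with the namespaces it claims to serve. Names are illustrative only.
REGISTRIES: List[Dict] = [
    {"name": "registry.example.com", "namespaces": {"centos", "internal"}},
    {"name": "hub.docker.com", "namespaces": {"centos", "aweiteka"}},  # default, listed last
]


def pick_registry(namespace: str, registries: List[Dict]) -> Optional[str]:
    """Return the first (highest-priority) registry that serves the namespace."""
    for registry in registries:
        if namespace in registry["namespaces"]:
            return registry["name"]
    return None


print(pick_registry("centos", REGISTRIES))    # registry.example.com
print(pick_registry("aweiteka", REGISTRIES))  # hub.docker.com
```

Both registries claim `centos` here, so the one listed first wins; a namespace that no registry serves yields `None`.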

Foundational Concepts from IPFS

InterPlanetary File System (IPFS) is an emerging protocol that builds on popular projects such as Git and BitTorrent. IPFS provides a model for addressing many of the provenance needs of the Docker community. Readers are strongly encouraged to read the 11-page IPFS protocol paper. The goal is not to implement the full IPFS stack but to learn from it and use it as a model. See the Go implementation of IPFS.

Summary:

Implement Distributed Hash Tables of immutable image objects within a pool of registry nodes addressed by self-certified names in a cryptographically assigned global namespace.

Objects and Distributed Hash Tables

DHT provides a list of cryptographic image hashes. The hash guarantees the correct image will be used regardless of where it is stored or served from. Image objects are assumed to be permanent.

This work has already begun. See Issue#6959 and PR#5956 for the abstract layer ID and image ID discussions and PR#7262 for an implementation that is backward compatible. IPFS recommends a multihash format for flexibility. Example:

<function code><digest length><digest bytes>

The DHT answers requests for hashes: "Where can I get <docker_image_multihash>?" It is similar to the docker registry object cloud storage drivers (e.g. S3, Google Cloud Storage) in place today.
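
The `<function code><digest length><digest bytes>` layout can be sketched in a few lines. This is a minimal illustration, assuming SHA-256 (multihash function code 0x12) and assuming that the code and length each fit in a single byte; the helper names `encode_multihash` and `decode_multihash` are hypothetical.

```python
import hashlib
from typing import Tuple

SHA2_256_CODE = 0x12  # multihash function code for sha2-256


def encode_multihash(data: bytes) -> bytes:
    """Hash data with SHA-256 and prefix the digest with code and length."""
    digest = hashlib.sha256(data).digest()
    return bytes([SHA2_256_CODE, len(digest)]) + digest


def decode_multihash(mh: bytes) -> Tuple[int, bytes]:
    """Split a single-byte-header multihash into (function code, digest)."""
    if len(mh) < 2:
        raise ValueError("multihash too short")
    code, length = mh[0], mh[1]
    digest = mh[2:]
    if len(digest) != length:
        raise ValueError("digest length does not match header")
    return code, digest


mh = encode_multihash(b"example layer bytes")
print(mh.hex()[:4])  # 1220 (0x12 = sha2-256, 0x20 = 32 digest bytes)
```

Because the hash function is named in the prefix, a registry could later move to a different digest algorithm without making existing image IDs ambiguous.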

Example DHT data structure from the Pulp/crane registry implementation, which assumes one JSON file per repository namespace:

# centos.json
{
  "images": [
    { "id": "1a7dc42f78ba213ec1ac5cd04930011334536214ad26c8000f1eec72e302c041" },
    { "id": "34e94e67e63a0f079d9336b3c2a52e814d138e5b3f1f614a0cfe273814ed7c0a" },
    { "id": "511136ea3c5a64f264b78b5433614aec563103b4d4702f3ba7d4d2698e22c158" },
    { "id": "cd934e0010d5a63c3137a6d0d6b1cdeca68a02bafd2a51554be61dfd6b6dda80" }
  ],
  "protected": true,
  "repo-registry-id": "centos",
  "repository": "centos",
  "tags": {
    "centos6": "cd934e0010d5a63c3137a6d0d6b1cdeca68a02bafd2a51554be61dfd6b6dda80",
    "centos7": "1a7dc42f78ba213ec1ac5cd04930011334536214ad26c8000f1eec72e302c041",
    "latest": "1a7dc42f78ba213ec1ac5cd04930011334536214ad26c8000f1eec72e302c041"
  },
  "type": "docker-redirect",
  "url": "https://registry.example.com:/docker/centos/",
  "version": 1
}
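
A client could resolve a tag against such a document roughly as follows. This is a sketch only: the document is trimmed to the fields used, and `resolve_tag` is a hypothetical helper, not part of Pulp or crane.

```python
import json

# Trimmed copy of the centos.json document, keeping only the fields used here.
REPO_DOC = json.loads("""
{
  "images": [
    {"id": "1a7dc42f78ba213ec1ac5cd04930011334536214ad26c8000f1eec72e302c041"},
    {"id": "cd934e0010d5a63c3137a6d0d6b1cdeca68a02bafd2a51554be61dfd6b6dda80"}
  ],
  "repository": "centos",
  "tags": {
    "centos6": "cd934e0010d5a63c3137a6d0d6b1cdeca68a02bafd2a51554be61dfd6b6dda80",
    "latest": "1a7dc42f78ba213ec1ac5cd04930011334536214ad26c8000f1eec72e302c041"
  }
}
""")


def resolve_tag(doc: dict, tag: str) -> str:
    """Map a mutable tag name to the immutable image ID it currently points at."""
    image_id = doc["tags"].get(tag)
    if image_id is None:
        raise KeyError(f"unknown tag: {tag}")
    known_ids = {image["id"] for image in doc["images"]}
    if image_id not in known_ids:
        raise ValueError(f"tag {tag} points at an image not listed in the document")
    return image_id


print(resolve_tag(REPO_DOC, "latest")[:12])  # 1a7dc42f78ba
```

The tag is only a pointer; everything after the lookup can be addressed by hash alone, regardless of which registry serves the bytes.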

DHT enables remote assembly, where image layers may be safely pulled from multiple issuers when assembling application layers. See Separate naming and transport.

DHT Exchange

It is unclear whether a true peer-to-peer DHT model, in which a marketplace of nodes is rewarded for serving content, is required or desired. Some method for sharing hash tables is needed, however. IPFS uses the BitSwap protocol, where nodes share want_list and have_list information on object hashes.
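
The core of a want_list/have_list exchange can be sketched as a simple match between what a peer asks for and what the local node holds. This is an illustration of the idea only, not the BitSwap wire protocol; `blocks_to_send` and the short hash strings are hypothetical.

```python
from typing import Iterable, List


def blocks_to_send(have_list: Iterable[str], peer_want_list: Iterable[str]) -> List[str]:
    """Return the hashes the peer wants that this node can serve.

    The result keeps the order of the peer's want list, so the peer's own
    priorities are respected.
    """
    have = set(have_list)
    return [object_hash for object_hash in peer_want_list if object_hash in have]


local_have = ["1a7dc42f", "34e94e67", "511136ea"]
peer_want = ["cd934e00", "511136ea", "1a7dc42f"]
print(blocks_to_send(local_have, peer_want))  # ['511136ea', '1a7dc42f']
```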

Registry Node Identities

As the registry is decoupled from the image name, registries must be referenced using a secure naming scheme. IPFS implements a NodeId (our registry ID) as a cryptographic hash of a public key.

difficulty = <integer parameter>
n = Node{}
do {
  n.PubKey, n.PrivKey = PKI.genKeyPair()
  n.NodeId = hash(n.PubKey)
  p = count_preceding_zero_bits(hash(n.NodeId))
} while (p < difficulty)

The registry stores its public and private keys. Public keys are exchanged and the ID is checked against the public key hash.
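
The generation loop and the ID check can be sketched as runnable code. This is a minimal sketch with loud assumptions: random bytes from `os.urandom` stand in for a real public key (the standard library has no key-pair generation), SHA-256 stands in for the unspecified `hash`, and `generate_node_id` and `verify_node_id` are hypothetical names.

```python
import hashlib
import os
from typing import Tuple


def count_preceding_zero_bits(data: bytes) -> int:
    """Count the zero bits before the first one bit in data."""
    count = 0
    for byte in data:
        if byte == 0:
            count += 8
            continue
        count += 8 - byte.bit_length()  # leading zeros inside the first nonzero byte
        break
    return count


def generate_node_id(difficulty: int) -> Tuple[bytes, bytes]:
    """Retry until hash(NodeId) has at least `difficulty` leading zero bits."""
    while True:
        pub_key = os.urandom(32)  # ASSUMPTION: placeholder for a real public key
        node_id = hashlib.sha256(pub_key).digest()
        if count_preceding_zero_bits(hashlib.sha256(node_id).digest()) >= difficulty:
            return pub_key, node_id


def verify_node_id(pub_key: bytes, node_id: bytes, difficulty: int) -> bool:
    """Check that the ID is the hash of the key and meets the difficulty."""
    if hashlib.sha256(pub_key).digest() != node_id:
        return False
    return count_preceding_zero_bits(hashlib.sha256(node_id).digest()) >= difficulty


pub_key, node_id = generate_node_id(difficulty=4)
print(verify_node_id(pub_key, node_id, difficulty=4))  # True
```

Each additional bit of difficulty roughly doubles the expected number of key generations, which is what makes mass-producing identities expensive.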

Registry name resolution is important for disconnected environments with private registries. DNS TXT records can provide resolution.

registry.example.com. TXT "ipfs=XLF2ipQ4jD3UdeX5xp1KBgeHRhemUtaA8Vm"
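
Once the TXT record has been fetched by whatever resolver is available, extracting the registry ID is a small parsing step. The sketch below covers only that step; the `ipfs=` key comes from the record above, while `parse_registry_txt` is a hypothetical helper and the DNS query itself is out of scope.

```python
def parse_registry_txt(txt_value: str) -> str:
    """Extract the registry ID from a TXT record value of the form ipfs=<id>."""
    key, separator, value = txt_value.strip().strip('"').partition("=")
    if key != "ipfs" or not separator or not value:
        raise ValueError(f"not a registry ID record: {txt_value!r}")
    return value


print(parse_registry_txt('"ipfs=XLF2ipQ4jD3UdeX5xp1KBgeHRhemUtaA8Vm"'))
# XLF2ipQ4jD3UdeX5xp1KBgeHRhemUtaA8Vm
```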

Namespacing

While images are presumed to be immutable and effectively permanent, naming is considered mutable. IPFS provides a global namespace for accessing objects, but it does not appear to provide a global namespace for docker repositories (e.g. aweiteka/myapp). There is some desire in the docker community to ensure that aweiteka/myapp always resolves to a single thing. A centralized authority for resolving a global namespace (i.e. hub.docker.com) is one possible solution, but it does not work for disconnected registries, that is, registries behind a firewall that do not participate in the distributed hash table.

A possible approach to resolving this is to tie the repository names to hashable link objects. These objects would become part of the DHT pool of registry nodes serving content. This removes the central authority and spreads namespace resolution across the registry nodes. Disconnected registry nodes would have their own set of naming objects but they would not be part of the global DHT.

In the same way that each registry has a unique NodeId, each namespace owner has a UserId.

UserId = hash(PubKey)

This is addressed as

/ipns/XLF2ipQ4jD3UdeX5xp1KBgeHRhemUtaA8Vm/

Repositories in this namespace are expressed as

/ipns/XLF2ipQ4jD3UdeX5xp1KBgeHRhemUtaA8Vm/myapp

The published object would include a link name, providing hash-namespace resolution

/ipns/XLF2ipQ4jD3UdeX5xp1KBgeHRhemUtaA8Vm aweiteka
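
Resolution from a human-readable repository name to its hash-based path could then look like the following sketch. The `LINKS` table stands in for the published link objects and `resolve_repository` is a hypothetical helper; how link objects are actually stored and fetched is left open here.

```python
from typing import Dict

# Hypothetical link table: link name -> UserId, as published in the link objects.
LINKS: Dict[str, str] = {
    "aweiteka": "XLF2ipQ4jD3UdeX5xp1KBgeHRhemUtaA8Vm",
}


def resolve_repository(name: str, links: Dict[str, str]) -> str:
    """Turn 'owner/repo' into '/ipns/<UserId>/repo' using the link table."""
    owner, separator, repo = name.partition("/")
    if not separator or not owner or not repo:
        raise ValueError(f"expected 'owner/repo', got {name!r}")
    if owner not in links:
        raise KeyError(f"no link object published for {owner!r}")
    return f"/ipns/{links[owner]}/{repo}"


print(resolve_repository("aweiteka/myapp", LINKS))
# /ipns/XLF2ipQ4jD3UdeX5xp1KBgeHRhemUtaA8Vm/myapp
```

A disconnected registry would carry its own link table, so the same lookup works without reaching the global DHT.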

Image Signing

Today, image signing is done outside of docker by GPG-signing the tarball output of docker save using docker-utils. A native docker implementation is needed so that extra steps are not required to verify signatures. With a distributed image model, some users require that each image layer be signed. IPFS provides a method to sign each object. Signing changes the object hash, because the signature becomes part of the object. Object signatures are verified automatically.

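// multihash is a self-describing hash: <function code><digest length><digest bytes>.
type multihash []byte

type SignedObject struct {
  // Object is the raw object data that was signed.
  Object []byte

  // Signature is the hmac signature over Object.
  Signature []byte

  // PublicKey holds the multihash identifying the signing key.
  PublicKey []multihash
}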
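
The sign-then-verify flow, and the way signing changes the object hash, can be sketched with the standard library. Note the assumptions: this uses HMAC-SHA256 with a shared secret, following the "hmac signature" comment in the struct, whereas verification against a public key would need an asymmetric signature scheme from a cryptography library. All names here are hypothetical.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class SignedObject:
    obj: bytes        # raw object data that was signed
    signature: bytes  # HMAC-SHA256 over obj
    key_id: bytes     # SHA-256 of the key, standing in for the multihash key identifier


def sign(obj: bytes, key: bytes) -> SignedObject:
    """Wrap obj with an HMAC signature and an identifier for the key used."""
    signature = hmac.new(key, obj, hashlib.sha256).digest()
    return SignedObject(obj, signature, hashlib.sha256(key).digest())


def verify(signed: SignedObject, key: bytes) -> bool:
    """Recompute the HMAC and compare it in constant time."""
    expected = hmac.new(key, signed.obj, hashlib.sha256).digest()
    return hmac.compare_digest(expected, signed.signature)


def object_hash(signed: SignedObject) -> str:
    """Hash the whole signed object, signature and key identifier included."""
    return hashlib.sha256(signed.obj + signed.signature + signed.key_id).hexdigest()


layer = b"example layer bytes"
signed = sign(layer, b"demo-key")
print(verify(signed, b"demo-key"))   # True
print(verify(signed, b"other-key"))  # False
print(object_hash(signed) != hashlib.sha256(layer).hexdigest())  # True
```

The last line shows why a signed layer gets a different address than the unsigned one: the signature is part of the content being hashed.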