Towards Notary 2.0
This document is still a work in progress, and currently has more questions than answers.
Docker launched container image signing in 2015, but it is not widely used. Most images that Docker produces, such as official images, are signed, but few other users sign images, and only a small number of users check image signatures. Only a few registries, such as Docker Hub and Azure Container Registry, run a Notary server and so allow signatures to be added. However, customers want an image signing solution that is simple to use. This document examines what changes are needed to build a solution that meets customer needs.
The first area of work is to add a specification for signatures as an OCI extension, so that signature data is stored in the registry rather than in an additional service backed by a separate database. This is much simpler operationally, and it means that signatures can be stored in any registry that supports arbitrary OCI artifacts. There is ongoing work on extending the types of data that can be stored in registries at github.com/opencontainers/artifacts, and the plan would be to work within these standards. There are several options to explore here, in particular whether signatures are inline in the object they sign, or detached and referencing the object. TUF currently uses inline signatures, which creates difficulties because a canonical JSON subset of the content has to be signed, while for registry content, signing the blob hash itself (and associated data) seems more natural. The specification for signatures should be generic for any signature purpose, and should allow metadata about purpose, key types, and identity to be attached. These signatures should then be able to be pushed and pulled with images, without the validation constraints that Notary keys currently have, which tie the signature check to the registry location. Other considerations include signing registry image names as well as blobs, in a model similar to signing git tags. Currently TUF has its own list of names, which may not correspond to those in the registry; this can be confusing, although the tooling tries to keep them in sync.
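To make the detached option concrete, here is a minimal sketch of a signature document that references a blob by its digest instead of being embedded in it. Everything named here is an assumption made for illustration: the media type, the field names (`subject`, `purpose`, `keyId`), and the function names are hypothetical and come from no OCI or TUF specification. HMAC from the Python standard library stands in for a real public-key signature so that the example runs without extra dependencies.

```python
import hashlib
import hmac
import json

# Hypothetical media type; not defined by any OCI specification.
SIGNATURE_MEDIA_TYPE = "application/vnd.example.signature.v1+json"


def blob_digest(blob: bytes) -> str:
    """Return the registry-style content digest of a blob."""
    return "sha256:" + hashlib.sha256(blob).hexdigest()


def sign_detached(blob: bytes, key: bytes, key_id: str, purpose: str) -> dict:
    """Build a detached signature document that references the blob by digest."""
    signed = {
        "mediaType": SIGNATURE_MEDIA_TYPE,
        "subject": {"digest": blob_digest(blob), "size": len(blob)},
        "purpose": purpose,
        "keyId": key_id,
    }
    # The payload is serialized once and stored as-is, so the verifier
    # checks these exact bytes and never has to re-canonicalize JSON.
    payload = json.dumps(signed, sort_keys=True).encode()
    # ASSUMPTION: HMAC is only a placeholder for a public-key signature.
    signature = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"payload": payload.decode(), "signature": signature}


def verify_detached(blob: bytes, doc: dict, key: bytes) -> bool:
    """Check the signature over the payload, then check the blob matches it."""
    payload = doc["payload"].encode()
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, doc["signature"]):
        return False
    subject = json.loads(payload)["subject"]
    return subject["digest"] == blob_digest(blob) and subject["size"] == len(blob)
```

Nothing in the signed payload mentions a registry host or repository name, so in this sketch the document stays valid when the blob and its signature are copied to another registry. Signing image names as well would mean adding a name field to the payload, which is one of the open design questions above.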
Following the signature design, the plan would be to support the additional TUF metadata on top of the new signing scheme. This involves OCI document types for TUF metadata, and a registry-native TUF specification for how these are stored. There is currently optional support in the TUF specification for content hashes, but Notary does not use it at present; instead it uses relational database storage in the Notary server, which supports a transactional interface. Observability should be improved, with tooling, to help debug issues; currently it is difficult to inspect the TUF metadata when something goes wrong. Again, having TUF metadata in the registry means it can be pushed and pulled with images across registries. TUF needs a timestamp service to re-sign timestamps, so this would need to work with registries.
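As a rough illustration of what registry-native storage could look like, the sketch below keeps simplified TUF-like metadata in a content-addressed store, with each document referring to the next by digest, and shows why a timestamp has to be re-signed periodically. It rests on several assumptions: the in-memory `store` dictionary stands in for a registry blob store, the field names are simplified and do not follow the TUF metadata format, and signatures are left out entirely to keep the example short.

```python
import hashlib
import json
from datetime import datetime, timedelta, timezone

# ASSUMPTION: an in-memory dict stands in for a registry blob store.
store = {}


def put(doc: dict) -> str:
    """Store a metadata document and return its content digest."""
    data = json.dumps(doc, sort_keys=True).encode()
    digest = "sha256:" + hashlib.sha256(data).hexdigest()
    store[digest] = data
    return digest


def get(digest: str) -> dict:
    """Fetch a document by digest and check it against that digest."""
    data = store[digest]
    if "sha256:" + hashlib.sha256(data).hexdigest() != digest:
        raise ValueError("blob does not match digest")
    return json.loads(data)


def make_timestamp(snapshot_digest: str, now: datetime, ttl: timedelta) -> dict:
    """Build short-lived timestamp metadata pointing at a snapshot digest."""
    return {
        "role": "timestamp",
        "snapshot": snapshot_digest,
        "expires": (now + ttl).isoformat(),
    }


def is_fresh(timestamp: dict, now: datetime) -> bool:
    """A client rejects timestamp metadata once it has expired."""
    return now < datetime.fromisoformat(timestamp["expires"])
```

Because every document is addressed by its hash, no transactional database is needed to keep the set consistent: a client that trusts a fresh timestamp can follow digests from it to the snapshot and then to the targets. The cost is that some service must publish a new timestamp before the old one expires.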
Tooling needs to be developed to work with in-registry signatures and TUF metadata, such as client libraries and modifications of existing tooling such as containerd to support pluggable signature checks. Admission control examples for Kubernetes are needed; there are few for Notary at present.
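One possible shape for pluggable signature checks is sketched below. It is not the containerd or Kubernetes API: the `VerifierRegistry` class, its methods, and the verifier names are all hypothetical, and a verifier is reduced to a function from an image digest to a yes/no answer. The point is only the policy structure, where an image is admitted if every required check passes and a missing check counts as a failure.

```python
from typing import Callable, Dict, List

# A verifier takes an image digest and returns True if it accepts the image.
Verifier = Callable[[str], bool]


class VerifierRegistry:
    """Hypothetical registry of named signature checks."""

    def __init__(self) -> None:
        self._verifiers: Dict[str, Verifier] = {}

    def register(self, name: str, verifier: Verifier) -> None:
        """Make a check available under a name that policies can refer to."""
        self._verifiers[name] = verifier

    def admit(self, digest: str, required: List[str]) -> bool:
        """Admit only if every required check is registered and accepts."""
        for name in required:
            verifier = self._verifiers.get(name)
            # Fail closed: an unknown check is treated as a rejection.
            if verifier is None or not verifier(digest):
                return False
        return True
```

An admission controller built this way could map a namespace or cluster policy to the `required` list, so that different environments demand different signatures on the same image.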
The usability efforts that Docker has undertaken recently have focused on adding a simpler signing workflow to the "docker trust" series of commands, so as to remove the need to use the notary tooling directly. This has produced a better CLI workflow, but it has not improved the ability to debug issues, key management, or understanding of how to use signing effectively. Key management in particular is a weakness, due to the large number of keys required and the lack of integration with CI pipelines and cloud HSMs.
There are questions around the design of TUF and whether it needs further modifications for container registry use cases, having been designed for the somewhat similar use case of package management. Most users are unclear about what the security model is expected to provide. Other validation mechanisms, such as transparency logs or existing signature methods, might be better for some use cases.
I plan to present a first draft and a prototype of some parts in my talk at KubeCon: https://events19.linuxfoundation.org/events/kubecon-cloudnativecon-north-america-2019/schedule/