hlandau/rough-design.md

## rough-design.md

      
This is a rough sketch I've put together in my mind of how an 'ACME daemon' might end up looking.

### API

acmetool is designed for batch operation, which works well for small use cases, but
large-scale deployments will work better with a daemon. This will probably expose
a service via an HTTP API, so that arbitrary parts of a service provider's stack
can request certificates.

This API will need to be asynchronous, as it may take arbitrarily long for 'acmed'
to obtain certificates. For example, if a service provider's customer changes
their nameservers to those of the service provider, this change may take time to propagate.
The service provider will need to keep checking until the changes propagate and
it can complete challenges for the issuance of certificates.

This suggests that there should be an API for requesting certificates for certain hostnames;
the daemon then keeps performing challenge self-tests periodically until it determines
that it can obtain certificates, and then does so.
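
The request-then-poll flow described above can be sketched roughly as follows. This is only an illustration, not a proposed implementation: `Registry`, `State`, `self_test` and `issue` are hypothetical names, and the self-test and issuance steps are stand-in callables rather than real ACME operations.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict


class State(Enum):
    PENDING = "pending"   # requested, self-test not yet passing
    ISSUED = "issued"     # certificate obtained


@dataclass
class Registry:
    """Tracks requested hostnames. All names here are hypothetical."""
    self_test: Callable[[str], bool]   # could a challenge for this name be completed now?
    issue: Callable[[str], str]        # stand-in for the real ACME issuance step
    states: Dict[str, State] = field(default_factory=dict)
    certs: Dict[str, str] = field(default_factory=dict)

    def request(self, hostname: str) -> State:
        """Record the desire and return immediately; never blocks on issuance."""
        return self.states.setdefault(hostname, State.PENDING)

    def tick(self) -> None:
        """One periodic pass: self-test every pending name, issue where it passes."""
        for hostname, state in list(self.states.items()):
            if state is State.PENDING and self.self_test(hostname):
                self.certs[hostname] = self.issue(hostname)
                self.states[hostname] = State.ISSUED


# Simulate a nameserver change that has propagated only by the third pass.
passes = {"n": 0}

def fake_self_test(hostname: str) -> bool:
    return passes["n"] >= 3

reg = Registry(self_test=fake_self_test, issue=lambda h: f"CERT({h})")
reg.request("example.com")
for _ in range(3):
    passes["n"] += 1
    reg.tick()
```

The point of the sketch is that `request` returns straight away and the caller learns about success later, by polling or by retrieving the certificate.
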

Likewise, there will also need to be an API for the retrieval of certificates. There is some
prior art in the field of programmatically retrieving and using certificates: an open source
TLS terminator called Bud can look up certificates for SANs using HTTP requests. Implementing
this interface would be useful and desirable.

### Private Key Location

There's a question of who should generate, own and control private keys. On the one hand, it
seems more secure for private keys to be generated where they will ultimately be used. In this
case it would be necessary for this server to transmit a CSR to acmed for it to use to request
certificates. But this is unworkable where there is more than one server needing to make use
of the certificate. You could have some daemon to manage and hand out these keys, but that
sounds... rather like acmed, so you're just duplicating the role.

In other words, it makes the most sense for acmed to generate and manage these private keys,
and to hand them out to authorized clients. This is indeed required by the protocol Bud expects,
as described above.

### Problem Instantiation

The whole point of acmed would be to handle large-scale deployments that acmetool is unsuited for.
So there's no point if we don't start with a large-scale instantiation of the problem. So let's say:

- You have 10 load balancers, each of which needs to be able to obtain certificates and private keys.
- They all serve the same hostnames.
- Five of the load balancers are in one data centre and five are in another.
- Two instances of acmed must be used for redundancy, and acmed must support this.

Under this model, whatever database acmed uses will need to be suitable for use
by multiple accessors, and so networked. PostgreSQL will probably be preferred, but maybe other
backends can be supported too. Since PostgreSQL supports a 'LISTEN/NOTIFY' protocol, it also
may be possible to support configuring acmed by PostgreSQL database changes as an alternative
to the HTTP interface.

### Challenge Completion

acmed will need to automatically retry challenges periodically. Status information for a given hostname
should probably be exposed in machine-readable form via an HTTP endpoint. The ability to trigger a retry
should probably also be exposed via an HTTP endpoint.
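
As a rough illustration of what such machine-readable status and a retry trigger might look like, here is a minimal sketch. The `HostStatus` record, its field names and the fixed 300-second retry interval are all assumptions made for the example; nothing here is an existing acmetool or acmed interface.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class HostStatus:
    """Hypothetical per-hostname status record."""
    hostname: str
    attempts: int = 0
    last_error: Optional[str] = None
    next_retry_at: float = 0.0                       # UNIX timestamp
    recent_failures: List[str] = field(default_factory=list)

    def record_failure(self, error: str, now: float, interval: float = 300.0) -> None:
        """Note a failed challenge attempt and schedule the next periodic retry."""
        self.attempts += 1
        self.last_error = error
        self.recent_failures = (self.recent_failures + [error])[-5:]  # keep the last 5
        self.next_retry_at = now + interval

    def retry_now(self, now: float) -> None:
        """What a retry endpoint might do: make the name due immediately."""
        self.next_retry_at = now

    def due(self, now: float) -> bool:
        return now >= self.next_retry_at

    def to_json(self) -> str:
        """Body that a status endpoint could return."""
        return json.dumps(asdict(self), sort_keys=True)


status = HostStatus("example.com")
status.record_failure("DNS change not yet propagated", now=1000.0)
waiting = status.due(1100.0)      # False: next retry is scheduled for 1300.0
status.retry_now(1100.0)
forced = status.due(1100.0)       # True: a retry was requested explicitly
```
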

The HTTP proxy/listener, webroot, hook, TLS-SNI and DNS hook challenge completion methods
which are supported by acmetool should remain applicable in these cases, and therefore these
parts of acmetool can probably be reused with minor refactoring.

The process execution model of the ACME hooks system might in extreme cases become a bottleneck.
One possible solution to this is to allow UNIX domain sockets to be placed in the hooks directory,
in which case they are detected and a certain protocol is spoken over them. This protocol could be HTTP.
But more practically, it would probably make sense to simply allow HTTP-based hooks to be configured.

### Rough API

PUT /names/{hostname}
  Indicate a desire for a hostname.
DELETE /names/{hostname}
  Removes a desire for a hostname. The certificates are not deleted, but are no longer accessible
  via the API. If the hostname is requested again, the existing certificates are reused if they are
  not expired. Maybe have an option for revoking, in which case the certificates really are deleted.
GET /names/{hostname}/cert
  PEM-encoded certificate, if available; else 404 or maybe a default certificate or in-progress response.
GET /names/{hostname}/privkey
  PEM-encoded private key, if available.
GET /names/{hostname}/chain
  A series of PEM-encoded certificates in the chain, not including the end certificate.
GET /names/{hostname}/bud
  A Bud-compatible JSON response.
GET /names/{hostname}/status
  A JSON or maybe also HTML (via negotiation) response indicating recent failures in completing challenges
  or acquiring certificates.
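
The PUT/DELETE/GET semantics above, in particular that DELETE hides certificates without destroying them, can be modelled in a few lines. This is an in-memory sketch only, not an HTTP server: `NameStore`, `store_cert` and the 204 status codes are assumptions for illustration, while the 404 for an unavailable certificate follows the description above.

```python
from typing import Dict, Optional, Set, Tuple


class NameStore:
    """In-memory model of the /names endpoints; not an HTTP server."""

    def __init__(self) -> None:
        self.desired: Set[str] = set()
        # hostname -> (PEM text, expiry as UNIX time); survives DELETE.
        self.certs: Dict[str, Tuple[str, float]] = {}

    def put_name(self, hostname: str) -> int:
        """PUT /names/{hostname}: record the desire (204 is an assumed status)."""
        self.desired.add(hostname)
        return 204

    def delete_name(self, hostname: str) -> int:
        """DELETE /names/{hostname}: hide the name but keep its certificate."""
        self.desired.discard(hostname)
        return 204

    def store_cert(self, hostname: str, pem: str, not_after: float) -> None:
        """Called by the (hypothetical) issuance loop once a certificate exists."""
        self.certs[hostname] = (pem, not_after)

    def get_cert(self, hostname: str, now: float) -> Tuple[int, Optional[str]]:
        """GET /names/{hostname}/cert: 404 unless desired and unexpired."""
        entry = self.certs.get(hostname)
        if hostname not in self.desired or entry is None or entry[1] <= now:
            return 404, None
        return 200, entry[0]


store = NameStore()
store.put_name("example.com")
store.store_cert("example.com", "-----BEGIN CERTIFICATE-----...", not_after=2000.0)
store.delete_name("example.com")
hidden = store.get_cert("example.com", now=1000.0)   # (404, None): no longer accessible
store.put_name("example.com")
reused = store.get_cert("example.com", now=1000.0)   # 200: existing certificate reused
```
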

### Issues

Hostname lumping: How to control, and allow the expression of, the lumping of hostnames into different certificates?
In acmetool you get to specify this arbitrarily via target files. Some possibilities:

- Instead of the above interface, expose the idea of 'targets' just as acmetool does, possibly with a
  search or conditional-creation API to allow it to be determined whether there already exists a certificate
  satisfying the target.

- Allow lumping to be done arbitrarily and controlled by static configuration. For example, lump large numbers
  of unrelated hostnames into certificates. This may require delaying requests for a hostname until more requests
  pile up.
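
To make the second possibility concrete, here is a minimal sketch of one possible lumping policy. The function `lump` and its `max_names` and `min_names` parameters are hypothetical stand-ins for static configuration; a real policy would presumably also want an upper bound on how long a hostname may be held back, which is not modelled here.

```python
from typing import List


def lump(pending: List[str], max_names: int, min_names: int) -> List[List[str]]:
    """Group pending hostnames into certificates of at most max_names each.

    A trailing group smaller than min_names is held back (not returned),
    so that more requests can pile up before it is issued.
    """
    groups = [pending[i:i + max_names] for i in range(0, len(pending), max_names)]
    if groups and len(groups[-1]) < min_names:
        groups.pop()
    return groups


ready = lump(["a.example", "b.example", "c.example", "d.example", "e.example"],
             max_names=2, min_names=2)
# ready == [["a.example", "b.example"], ["c.example", "d.example"]]
# "e.example" waits until another request arrives.
```
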


This is a service-based design, as opposed to a library-based design. Either way, implementing the service
will probably result in components which are easy to use as a library.