mholt/for-servers.md

## for-servers.md

      
    CT For Server (Developers)

Intro

Similar to my advice regarding OCSP Stapling
for servers/server developers, and based on questions I've received about "CT best practices," I wanted to
write something for those writing server software. That is, this isn't targeted at server
operators, but at those writing software like Apache, nginx, Caddy, etc.
At the most basic level, the deployment of Certificate Transparency to date has largely tried to
focus the burden on CAs, rather than on server developers. If the CA is doing everything right,
the server developer and the server operator shouldn't need to bother with CT, and all is good. However,
with proposals like Expect-CT, it may be
that server operators want to opt in to CT, even without their CA's support, and so need support from the
server software.
The ranking of best to worst ways to deploy CT on a server today is:

1. Embedded within the certificate (requires the CA to support)
2. Embedded within an OCSP response (requires the CA to support, requires robust OCSP support)
3. Served as part of the TLS extension
4. Dynamically obtained by the server (similar to OCSP)
5. Statically configured by a file

However, for high-performance servers, the priorities look a bit more like:

1. Served as part of the TLS extension
   - SCTs dynamically obtained by the server
   - Including only enough SCTs that aren't covered by the OCSP response or certificate
2. Served as part of the OCSP response
3. Embedded within the certificate
4. Served as part of the TLS extension, statically configured by a file

The difference here is that high-performance servers want to avoid sending any
unnecessary data. The TLS extension allows the server to correct for any issues in
either the OCSP response or certificate, streamlining the process.
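
As a rough illustration of "including only enough SCTs that aren't covered by the OCSP response or certificate", the selection could look something like the sketch below. The `Sct` type, the `usable_log_ids` set, and the function name are all assumptions made for this example, not part of any existing server or library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sct:
    log_id: str      # hypothetical: hex-encoded ID of the issuing log
    timestamp: int   # milliseconds since the epoch


def scts_for_tls_extension(available, in_certificate, in_ocsp, usable_log_ids):
    """Pick stored SCTs that add something beyond the certificate and OCSP response."""
    # Logs the client will already see an SCT from, counting only logs
    # that are still considered usable.
    covered = {
        s.log_id
        for s in (*in_certificate, *in_ocsp)
        if s.log_id in usable_log_ids
    }
    extra = {}
    for sct in available:
        if sct.log_id not in usable_log_ids or sct.log_id in covered:
            continue
        # Keep a single SCT per log (the earliest one seen).
        current = extra.get(sct.log_id)
        if current is None or sct.timestamp < current.timestamp:
            extra[sct.log_id] = sct
    return sorted(extra.values(), key=lambda s: s.log_id)
```

If a log whose SCT is embedded in the certificate later becomes unusable, it drops out of `covered`, and stored SCTs from other logs start being sent in the extension without reissuing the certificate.
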
So what does robust CT support look like for a server to support these use cases? Well,
it's complicated, because some of the necessary bits aren't quite there yet, but here's
my list of things to think about.
Support dynamically updating a list of known logs

Much like a list of root CAs, the list of logs is currently in flux. New logs are coming up,
some with more reliable infrastructure and liberal acceptance policies, and older logs are
going away, whether due to natural deprecation due to size or due to growing pains as log
operators work out the system.
A server implementation should support dynamic reconfiguration of the logs. Ideally, this
would be something that, like a Root CA list, simply automatically updates.
Google currently hosts a set of JSON files at https://www.certificate-transparency.org/known-logs
which offers a starting point. It may be that an industry standard log schema will exist, or it
may turn out that, depending on the policies of other browsers and clients, it is better for
the server developer to transform/aggregate that data into a form specific to the capabilities of
the server software. For example, expressing "One Google and One non-Google" or "Independence" may
turn out to be too complex for a vendor-agnostic solution.
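
A minimal sketch of what "dynamic reconfiguration" might involve: parse a freshly fetched list, then work out which logs appeared and which went away. The JSON layout used here (a top-level `logs` array with `id`, `url`, and `key` fields) is an assumption for illustration only and is not the schema of any published log list.

```python
import json


def parse_log_list(raw):
    """Parse a log list in a hypothetical schema into {log_id: entry}."""
    doc = json.loads(raw)
    logs = {}
    for entry in doc["logs"]:
        # Skip entries missing the fields this sketch relies on.
        if not all(k in entry for k in ("id", "url", "key")):
            continue
        logs[entry["id"]] = entry
    return logs


def diff_log_lists(old, new):
    """Return (added_ids, removed_ids) between two parsed lists."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    return added, removed
```

The added IDs are candidates for fetching new SCTs; the removed IDs identify stored SCTs that should no longer be counted on.
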
Log Early, Log Often

Because the ecosystem is still fairly dynamic at this point, server operators shouldn't assume that
the SCTs that are enough today will remain enough for the lifetime of the certificate. For example,
a log may go away on short notice, at which point additional SCTs are needed. Instead of trying to log "just enough,"
a server implementation should try to obtain SCTs from as many logs as possible, and store them for
when they are needed, such as when a log is disqualified.
As new logs are added (per dynamic updates), try to get SCTs from them, if the policies would permit.
This probably means knowing a priori what the log will accept, whether it be the set of valid root CAs
or the policies around that.
It's probably less than ideal to "try all logs all the time", since at present, there are logs that don't
accept all certificates from all CAs. As a result, continually trying to log to a Log that will never
accept the certificate just creates unnecessary load on the server, for no benefit to the user. This probably requires
some expression of what the log will accept, or a change in policies mandating what logs should accept.
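
Assuming the set of roots each log accepts is known ahead of time, deciding where to submit could be sketched roughly as follows. The data shapes (root fingerprints as strings, logs as a mapping from ID to accepted fingerprints) are hypothetical.

```python
def logs_worth_trying(logs, chain_root_fingerprints, held_sct_log_ids):
    """Return IDs of logs that may accept the chain and have not issued us an SCT yet.

    logs: {log_id: set of accepted root fingerprints}   (hypothetical shape)
    chain_root_fingerprints: set of roots our certificate can chain to
    held_sct_log_ids: set of log IDs we already hold an SCT from
    """
    candidates = []
    for log_id, accepted_roots in sorted(logs.items()):
        if log_id in held_sct_log_ids:
            continue  # already logged there
        if accepted_roots & chain_root_fingerprints:
            candidates.append(log_id)
    return candidates
```

Logs that accept none of the certificate's roots are never contacted, which avoids the pointless load described above.
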
Don't Trust Logs

Since a certificate for the log's own server could be misissued, code defensively. Treat
all input as hostile, even when it comes from a "trusted" log, because the party answering may not
be the log at all. Implement sanity limits on the size of the responses, and don't treat
the log as any more trusted than a random HTTP server on the Internet.
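
One way to apply this to an RFC 6962 `add-chain` response is sketched below: cap the bytes read, then check every field's type and size before using it. The specific limits and function names are assumptions chosen for the example, and signature verification against the log's known public key is deliberately left out, although a real implementation would still need it.

```python
import base64
import io
import json

MAX_RESPONSE_BYTES = 16 * 1024   # assumed cap; a real SCT response is far smaller
MAX_SIGNATURE_BYTES = 1024       # assumed cap on the decoded signature
MAX_EXTENSIONS_BYTES = 65535     # extensions carry a 16-bit length in RFC 6962


def read_limited(stream, limit=MAX_RESPONSE_BYTES):
    """Read at most `limit` bytes, failing rather than buffering more."""
    data = stream.read(limit + 1)
    if len(data) > limit:
        raise ValueError("log response exceeds size limit")
    return data


def _b64_field(doc, name, max_len):
    value = doc.get(name)
    if not isinstance(value, str):
        raise ValueError(f"{name}: expected a base64 string")
    raw = base64.b64decode(value, validate=True)  # bad input raises a ValueError subclass
    if len(raw) > max_len:
        raise ValueError(f"{name}: decoded value too long")
    return raw


def parse_add_chain_response(body):
    """Validate the JSON fields of an add-chain response (signature NOT verified)."""
    try:
        doc = json.loads(body)
    except (ValueError, RecursionError) as exc:
        raise ValueError("log response is not valid JSON") from exc
    if not isinstance(doc, dict):
        raise ValueError("log response is not a JSON object")
    if type(doc.get("sct_version")) is not int or doc["sct_version"] != 0:
        raise ValueError("unsupported sct_version")
    timestamp = doc.get("timestamp")
    if type(timestamp) is not int or not 0 <= timestamp < 2**64:
        raise ValueError("timestamp out of range")
    log_id = _b64_field(doc, "id", 32)
    if len(log_id) != 32:
        raise ValueError("id: expected a 32-byte log ID")
    return {
        "log_id": log_id,
        "timestamp": timestamp,
        "extensions": _b64_field(doc, "extensions", MAX_EXTENSIONS_BYTES),
        "signature": _b64_field(doc, "signature", MAX_SIGNATURE_BYTES),
    }
```

The returned log ID should also be compared with the ID of the log the request was sent to before the SCT is stored.
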
Know the chain of command

In order to log, a complete certificate path to a CA trusted by the Log is necessary. As certificates
can have multiple paths, this may not be as simple as taking the configured path and sending it, especially
when an ideal TLS configuration would omit things like the root certificate, whereas a Log requires it
be included.
This may mean some minimal path-building support is needed in server software, to try to build a path to
one or more CAs trusted by the Log. This may also mean that some list of 'known' intermediates may be
needed by the server software (... and all of the updating mess that would entail), so that
authorityInfoAccess chasing isn't needed.
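
To make "minimal path-building" concrete, here is a toy depth-first search over a pool of known certificates. It matches issuer and subject names only; the `Cert` type is a stand-in invented for this sketch, and real path building would also need to verify signatures, validity periods, and constraints.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cert:
    subject: str
    issuer: str
    fingerprint: str  # hypothetical identifier, e.g. a SHA-256 hash of the DER bytes


def build_path(leaf, known_certs, accepted_roots, max_depth=8):
    """Return [leaf, ..., root] ending at a root the log accepts, or None."""
    by_subject = {}
    for cert in known_certs:
        by_subject.setdefault(cert.subject, []).append(cert)

    def extend(path):
        tip = path[-1]
        if tip.fingerprint in accepted_roots:
            return path
        if len(path) >= max_depth:
            return None
        for issuer in by_subject.get(tip.issuer, []):
            if issuer in path:
                continue  # avoid loops from self-signed or cross-signed certs
            found = extend(path + [issuer])
            if found:
                return found
        return None

    return extend([leaf])
```

Calling it once per log, with that log's accepted roots, yields a different path for each log when the certificate chains to more than one root.
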
Police the Policies

Currently, only one browser (Chrome) has defined policies around Certificate Transparency, and that
policy is that there be "one Google" and "one non-Google" SCT. So a server operator likely needs some
way to filter the set of SCTs to ensure that whatever is served will be accepted. It's likely that Chrome
will relax this in the future, but it's also possible that other browsers may introduce different policies,
even after that. So there likely needs to be some way to determine whether a given "bundle of SCTs" meets
those criteria, and the criteria themselves should be updatable, as they may change when the browser updates.
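
A sketch of keeping the policy check separate from the rest of the server, so it can be swapped out when a browser changes its rules. Only the "one Google, one non-Google" rule as described above is modeled; the operator labels and the `POLICIES` registry are assumptions for illustration.

```python
def one_google_one_non_google(sct_operators):
    """True if the bundle has an SCT from a Google log and one from another operator."""
    operators = set(sct_operators)
    return "Google" in operators and any(op != "Google" for op in operators)


# Hypothetical registry: policies keyed by client, replaceable at runtime.
POLICIES = {"chrome": one_google_one_non_google}


def satisfied_policies(sct_operators, policies=POLICIES):
    """Return the names of all policies the given bundle of SCTs satisfies."""
    return sorted(name for name, check in policies.items() if check(sct_operators))
```

Because the checks are plain callables in a mapping, an updated policy can replace an old one without touching the code that assembles the bundle.
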

  
## schema-data.md

      
    Things a list of logs likely needs

(Scratchpad for tracking various things a log schema likely needs to include, for CAs and for server developers)

- Log URL
- Log public key
- Accepted roots of the log
  - This means that the schema would need to update any time the Log's list of acceptable CAs updates
  - The logger can always fetch this data from the log, but then it means a thundering herd of loggers hitting the Log
- "Organization" operating the log
  - Today, that means determining whether a log is Google or not
  - Tomorrow, may include a more complex nexus of organizational relationships?
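
Purely as a scratch illustration, the fields above could map onto a record like the following. The class name, field names, and JSON keys are all made up for this sketch and do not correspond to any existing schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogDescription:
    url: str
    public_key_b64: str
    operator: str                  # e.g. "Google"; richer relationships not modeled
    accepted_roots: frozenset      # fingerprints of the roots the log accepts

    @classmethod
    def from_dict(cls, raw):
        """Build a record from a hypothetical JSON object, rejecting missing fields."""
        missing = [k for k in ("url", "key", "operator") if k not in raw]
        if missing:
            raise ValueError(f"log entry missing fields: {missing}")
        return cls(
            url=raw["url"],
            public_key_b64=raw["key"],
            operator=raw["operator"],
            accepted_roots=frozenset(raw.get("accepted_roots", ())),
        )
```

Treating `accepted_roots` as optional reflects the open question above: the roots may be shipped in the list or fetched from the log separately.
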


Things a client logging likely needs


- Full context of the potential chains
  - For example, if a given cert has a path to Root A and a path to Root B, the client must be prepared to handle the case where they need to send Path A to one Log, and Path B to another log, in order to obtain SCTs from either.
  - This could be simplified with policies for CAs (reducing the paths that exist)
  - This could be simplified with policies for logs (accepting CAs by organization, rather than key)


- Log's accepted roots
  - Can be obtained directly from the log, but that's dynamic
    - Support for HTTP Caching directives for that response likely necessary
      - Do today's logs provide those directives for cacheability?
      - Should that be part of operational guidance?
    - Thundering herd problem from a thousand and one log clients
  - Could be included within log metadata given to client
    - But now increases frequency of updates
      - Would differential updates be needed? JSON Patch?


- Knowledge of log's policies
  - No point talking to a log that will never talk back
  - What policies are acceptable, and what aren't?
  - Do we need a "Log BPF" syntax to express rules about what a log will accept?
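
On the caching and thundering-herd questions above: if logs did send `Cache-Control: max-age` on their accepted-roots response, a client could honor it and add random jitter so that many clients don't refresh at the same instant. The defaults, the floor, and the jitter fraction below are arbitrary values picked for this sketch.

```python
import random
import re

_MAX_AGE = re.compile(r'max-age\s*=\s*"?([0-9]+)"?', re.IGNORECASE)


def refresh_delay(cache_control, default=3600.0, minimum=300.0,
                  jitter=0.2, rng=random.random):
    """Seconds to wait before re-fetching a log's accepted roots."""
    base = default
    if cache_control:
        for directive in cache_control.split(","):
            match = _MAX_AGE.fullmatch(directive.strip())
            if match:
                base = float(match.group(1))
                break
    # Never poll more often than `minimum`, even if the log says max-age=0.
    base = max(base, minimum)
    # Spread refreshes over [base, base * (1 + jitter)].
    return base * (1.0 + jitter * rng())
```

The `rng` parameter exists only so the jitter can be made deterministic in tests.
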