lingqingmeng/keybase_cryptography.md

## Keybase Cryptography

This is a construction for encrypting and signing a message, using a
symmetric encryption key and a signing keypair, in a way that supports safe
streaming decryption. We need this for chat attachments because we've chosen
to use signing keys for authenticity in chat, and we don't want one
participant to be able to modify another's attachment, even with an evil
server's help. It's almost enough that we record the hash of the
attachment along with the symmetric key used to encrypt it, but that by
itself doesn't allow safe streaming decryption. Instead, we use this
construction to sign each chunk of the attachment as we encrypt it. (Note
that it's still possible for a sender with the server's help to modify their
own attachments after the fact, if clients aren't checking the hash. This
isn't perfect, but it's better than any participant being able to do it.)

This file has 100% test coverage. Please keep it that way :-)

### Technical Spec

Seal inputs:

- plaintext bytes (streaming is fine)
- a crypto_secretbox symmetric key
- a crypto_sign private key
- a globally unique (with respect to these keys) 16-byte nonce

Seal steps:

1. Chunk the message into chunks exactly one megabyte long (2^20 bytes), with
   exactly one short chunk at the end, which might be zero bytes.
2. Compute the SHA512 hash of each plaintext chunk.
3. Concatenate the 16-byte nonce above with the 8-byte unsigned big-endian
   chunk number, where the first chunk is zero. This is the 24-byte chunk
   nonce.
4. Concatenate four things:
   - "Keybase-Chat-Attachment-1\0" (that's a null byte at the end)
   - the encryption key (why?! read below)
   - the chunk nonce from #3
   - the hash from #2
5. Sign the concatenation from #4, giving a detached 64-byte crypto_sign
   signature.
6. Concatenate the signature from #5 + the plaintext chunk.
7. Encrypt the concatenation from #6 with the crypto_secretbox key and the
   chunk nonce from #3.
8. Concatenate all the ciphertexts from #7 into the output.
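As a rough illustration, seal steps 1 through 6 could be sketched in Go using only the standard library. This is not the code from the linked source file: the function names (`splitPlaintext`, `chunkNonce`, `signatureInput`, `signChunk`, `demoPrivateKey`) are hypothetical, the whole plaintext is buffered instead of streamed, and the sketch assumes crypto_sign keys are Ed25519 keys as handled by `crypto/ed25519`. Step 7 is left out because crypto_secretbox is not in the standard library; one option there would be `secretbox.Seal` from `golang.org/x/crypto/nacl/secretbox`.

```go
package main

import (
	"crypto/ed25519"
	"crypto/sha512"
	"encoding/binary"
	"fmt"
)

const (
	chunkLen        = 1 << 20                         // plaintext chunk size from step 1
	signaturePrefix = "Keybase-Chat-Attachment-1\x00" // context string from step 4
)

// splitPlaintext implements step 1: full chunks followed by exactly one
// short chunk, which is empty when the length is a multiple of chunkLen.
func splitPlaintext(plaintext []byte) [][]byte {
	var chunks [][]byte
	for len(plaintext) >= chunkLen {
		chunks = append(chunks, plaintext[:chunkLen])
		plaintext = plaintext[chunkLen:]
	}
	return append(chunks, plaintext)
}

// chunkNonce implements step 3: the 16-byte message nonce followed by the
// 8-byte big-endian chunk number.
func chunkNonce(nonce [16]byte, chunkNum uint64) [24]byte {
	var out [24]byte
	copy(out[:16], nonce[:])
	binary.BigEndian.PutUint64(out[16:], chunkNum)
	return out
}

// signatureInput implements steps 2 and 4: prefix, encryption key, chunk
// nonce, and the SHA512 hash of the plaintext chunk.
func signatureInput(encKey [32]byte, cn [24]byte, chunk []byte) []byte {
	hash := sha512.Sum512(chunk)
	buf := []byte(signaturePrefix)
	buf = append(buf, encKey[:]...)
	buf = append(buf, cn[:]...)
	return append(buf, hash[:]...)
}

// signChunk implements steps 5 and 6, returning signature || plaintext.
// Step 7 (crypto_secretbox encryption under the chunk nonce) is omitted.
func signChunk(priv ed25519.PrivateKey, encKey [32]byte, nonce [16]byte, chunkNum uint64, chunk []byte) []byte {
	sig := ed25519.Sign(priv, signatureInput(encKey, chunkNonce(nonce, chunkNum), chunk))
	return append(sig, chunk...)
}

// demoPrivateKey returns a fixed key for illustration only; real keys must
// come from a secure random source.
func demoPrivateKey() ed25519.PrivateKey {
	return ed25519.NewKeyFromSeed(make([]byte, ed25519.SeedSize))
}

func main() {
	var encKey [32]byte // all-zero demo key, for illustration only
	var nonce [16]byte
	for i, chunk := range splitPlaintext([]byte("hello attachment")) {
		signed := signChunk(demoPrivateKey(), encKey, nonce, uint64(i), chunk)
		fmt.Printf("chunk %d: %d signed bytes\n", i, len(signed))
	}
}
```

Each signed chunk is 64 bytes longer than its plaintext; the remaining 16 bytes of per-chunk overhead come from the Poly1305 authenticator added in step 7.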

Open inputs:

- ciphertext bytes (streaming is fine)
- the same crypto_secretbox symmetric key
- the corresponding crypto_sign public key
- the same nonce

Open steps:

1. Chop the input stream into chunks of exactly (2^20 + 80) bytes, with
   exactly one short chunk at the end. If this short chunk is less than 80
   bytes (the size of an Ed25519 signature and a Poly1305 authenticator put
   together), return a truncation error.
2. Decrypt each input chunk with the crypto_secretbox key and chunk nonce as
   in seal step #7.
3. Split each decrypted chunk into a 64-byte signature and the following
   plaintext.
4. Hash that plaintext and make the concatenation from seal step #4.
5. Verify the signature against that concatenation.
6. Emit each verified plaintext chunk as output.
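The chopping rule in open step 1 is easy to get wrong at chunk boundaries, so here is a minimal Go sketch of it. It assumes the ciphertext is fully buffered (the real construction streams), and `splitSealed` and `errTruncated` are hypothetical names, not identifiers from the linked source.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	plaintextChunkLen = 1 << 20
	overhead          = 64 + 16 // Ed25519 signature + Poly1305 authenticator
	sealedChunkLen    = plaintextChunkLen + overhead
)

var errTruncated = errors.New("truncated ciphertext")

// splitSealed chops a buffered ciphertext into full sealed chunks followed
// by exactly one short chunk. The short chunk must hold at least the
// 80 bytes of overhead, otherwise the input was truncated.
func splitSealed(ciphertext []byte) ([][]byte, error) {
	var chunks [][]byte
	for len(ciphertext) >= sealedChunkLen {
		chunks = append(chunks, ciphertext[:sealedChunkLen])
		ciphertext = ciphertext[sealedChunkLen:]
	}
	// Whatever remains is the final short chunk.
	if len(ciphertext) < overhead {
		return nil, errTruncated
	}
	return append(chunks, ciphertext), nil
}

func main() {
	// One full chunk plus an empty final chunk: accepted.
	chunks, err := splitSealed(make([]byte, sealedChunkLen+overhead))
	fmt.Println(len(chunks), err)

	// Input that ends exactly on a chunk boundary: rejected as truncated.
	_, err = splitSealed(make([]byte, sealedChunkLen))
	fmt.Println(err)
}
```

Note that a ciphertext ending exactly on a full-chunk boundary is rejected, because the final short chunk (80 bytes when its plaintext is empty) is missing.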

Design Notes:

Combining signing and encryption is surprisingly tricky! See
http://world.std.com/~dtd/sign_encrypt/sign_encrypt7.html for lots of
details about the issues that come up. (Note that "encryption" in that
paper refers mostly to RSA encryption like PGP's, which doesn't involve
a sender key the way Diffie-Hellman / NaCl's crypto_box does. This makes
me appreciate just how many problems the crypto_box construction is
solving.)

Many of these issues probably don't apply to chat attachments (yet?!),
because recipients will know what keys to use ahead of time. But there
are other places where we use signing+encryption that have different
properties, and I want to be able to use this design as a reference. The
short version of the problem is that both encrypt-then-sign and
sign-then-encrypt have to worry about what happens when someone reuses
the inner layer with a new outer layer.

Encrypt-then-sign has a "sender impersonation" problem. The
man-in-the-middle can re-sign an encrypted payload with their own key
and claim authorship of the message. If the message itself contains
secrets, like in an auth protocol for example, the MITM can fake knowing
those secrets. (Also, encrypt-then-sign has the more obvious downside
that encryption is hiding only the contents of a signature and not its
author.)

Sign-then-encrypt has a "surreptitious forwarding" problem. A recipient
can re-encrypt the signed payload to another unintended recipient.
Recipients must not rely on the encryption layer to mean that the sender
intended the message for them. In fact PGP is vulnerable to this attack,
unless the user/application understands the very subtle difference
between "I can read this" and "this was written to me".

So, simply using encryption and signing together isn't good enough! The
paper linked above mentions a few different solutions, but in general
the fix is that the inner layer needs to assert something about the
outer layer, so that the outer layer can't be changed without the inner
key. We could simply include the outer key verbatim inside the inner
layer, but a better approach is to mix the outer key into the inner
crypto, so that it's impossible to forget to check it.

We prefer sign-then-encrypt, because hiding the author of the signature
is a feature. That means the inner signing layer needs to assert the
encryption key. We do this by including the encryption key as
"associated data" that gets signed along with the plaintext. Since we
already need to do that with a nonce and a chunk number, including
the encryption key is easy. We don't need to worry about whether the
signature might leak the encryption key either, because the signature
gets encrypted.
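To make the key-binding idea concrete, the following Go sketch signs one chunk with one encryption key in the signed data and then checks the signature against a different key. It is only an illustration under stated assumptions: the helper names (`signatureInput`, `verifyChunk`, `demo`) are hypothetical, the keys are fixed demo values, and crypto_sign is assumed to be Ed25519 as provided by `crypto/ed25519`.

```go
package main

import (
	"crypto/ed25519"
	"crypto/sha512"
	"fmt"
)

const signaturePrefix = "Keybase-Chat-Attachment-1\x00"

// signatureInput builds the signed data from seal step #4: prefix,
// encryption key, chunk nonce, and the SHA512 hash of the chunk.
func signatureInput(encKey [32]byte, chunkNonce [24]byte, chunk []byte) []byte {
	hash := sha512.Sum512(chunk)
	buf := []byte(signaturePrefix)
	buf = append(buf, encKey[:]...)
	buf = append(buf, chunkNonce[:]...)
	return append(buf, hash[:]...)
}

// verifyChunk reports whether sig covers chunk under the given encryption
// key and chunk nonce.
func verifyChunk(pub ed25519.PublicKey, encKey [32]byte, chunkNonce [24]byte, chunk, sig []byte) bool {
	return ed25519.Verify(pub, signatureInput(encKey, chunkNonce, chunk), sig)
}

// demo signs a chunk bound to keyA, then verifies it against keyA and
// against a different key, keyB.
func demo() (okSameKey, okOtherKey bool) {
	// Fixed seed for illustration only; real keys must be random.
	priv := ed25519.NewKeyFromSeed(make([]byte, ed25519.SeedSize))
	pub := priv.Public().(ed25519.PublicKey)

	var keyA, keyB [32]byte
	keyB[0] = 1 // any key that differs from keyA
	var cn [24]byte
	chunk := []byte("attachment bytes")

	sig := ed25519.Sign(priv, signatureInput(keyA, cn, chunk))
	return verifyChunk(pub, keyA, cn, chunk, sig), verifyChunk(pub, keyB, cn, chunk, sig)
}

func main() {
	same, other := demo()
	fmt.Println(same, other) // true false
}
```

Because the encryption key is part of the signed data, a recipient who re-encrypts the signed chunk under a new key produces something that fails verification, which is the surreptitious-forwarding defense described above.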

Apart from these signing gymnastics, all the large-encrypted-message
considerations from https://www.imperialviolet.org/2015/05/16/aeads.html
apply here. Namely we use a chunk number to prevent reordering, and we
require a short chunk at the end to detect truncation. A globally unique
nonce (for encryption and signing) prevents chunk swapping in between
messages, and is required for encryption in any case. (It's expected
that the chat client will pass in all zeroes for the nonce, because both
keys are one-time-use. That's up to the client. G-d help us if we ever
reuse those keys.) We also follow the "prefix signatures with an ASCII
context string and a null byte" recommendation from
https://www.ietf.org/mail-archive/web/tls/current/msg14734.html.

Source: https://github.com/keybase/client/blob/dc664617ea326abdd5e5bca877aa0c25fb403efd/go/chat/signencrypt/codec.go