Cryptographic Best Practices
Putting cryptographic primitives together is a lot like putting a jigsaw puzzle together, where all the pieces are cut exactly the same way, but there is only one correct solution. Thankfully, there are some projects out there that are working hard to make sure developers are getting it right.
The following advice comes from years of research by leading security researchers, developers, and cryptographers. This Gist was forked from Thomas Ptacek's Gist to be more readable. Additional material has been incorporated from Latacora's Cryptographic Right Answers.
Other advice comes from years of discussions on Twitter, IRC, and mailing lists that would be too difficult to pin down or exhaustively list here.
If at any point, I disagree with some of the advice, I will note it and provide reasoning why. If you have any questions, or disagreements, let me know.
If you take only one thing away from this post, it should be to use a library that puts these puzzle pieces together correctly for you. Pick one of the following for your project:
- NaCl - by cryptographer Daniel Bernstein
- libsodium - a NaCl fork by developer Frank Denis
- monocypher - a libsodium-inspired library by developer Loup Vaillant
Throughout this document, when I refer to "just use NaCl", I mean one of these libraries.
Encrypting Data

If you are in a position to use a key management system (KMS), then you should use KMS. If you are not in a position to use KMS, then you should use authenticated encryption with associated data (AEAD).
Currently, the CAESAR competition is being held to find an AEAD algorithm that doesn't have some of the sharp edges of AES-GCM while also improving performance. When the announcement of the final portfolio drops, this document will be updated.
Some notes on AEAD:
- ChaCha20-Poly1305 is faster in software than AES-GCM.
- On hardware with AES-NI, AES-GCM will be faster than ChaCha20-Poly1305.
- AES-CTR with HMAC will be faster in software than AES-GCM.
- Poly1305 is also easier than GCM for library designers to implement safely.
- AES-GCM is the industry standard.
The NaCl libraries will handle AEAD for you natively.
Use, in order of preference:
- KMS, if available
- The NaCl, libsodium, or monocypher default
- AES-CTR with HMAC

Avoid:
- AES-CBC or AES-CTR by itself
- Block ciphers with 64-bit blocks, such as Blowfish
- OFB mode
- RC4, which is comically broken
Symmetric Key Length
See my blog post about The Physics of Brute Force to understand why a 256-bit key is more than sufficient. But remember: your AES key is far less likely to be broken than your public key pair, so the latter key size should be larger if you're going to obsess about this.
If your symmetric key is based on user input, such as a passphrase, then it should provide at least as many bits of theoretical entropic security as the symmetric key length. In other words, if your AES key is 128 bits and is built from a password, then that password should provide at least 128 bits of entropy.
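This entropy arithmetic only applies to secrets that are generated randomly. As a rough sketch, a Diceware-style passphrase drawn uniformly from a 7,776-word list yields about 12.9 bits per word, so matching a 128-bit key takes about 10 words:

```python
import math

# Entropy for a uniformly random secret: length * log2(alphabet size).
# This math is only valid for *randomly generated* secrets, not for
# passwords a human invents.
def entropy_bits(alphabet_size: int, length: int) -> float:
    return length * math.log2(alphabet_size)

# Diceware: 7,776-word list, ~12.925 bits per word.
per_word = math.log2(7776)

# Words needed to match a 128-bit AES key.
words_needed = math.ceil(128 / per_word)
```

For comparison, a 20-character password drawn uniformly from the 95 printable ASCII characters also clears 128 bits.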
As with asymmetric encryption, symmetric encryption key length is a vital security parameter. Academic, private, and government organizations provide different recommendations, with mathematical formulas to approximate the minimum key size requirement for security. See BlueKrypt's Cryptographic Key Length Recommendation for other recommendations and dates.
To protect data up through 2050, it is recommended to meet the minimum requirements for symmetric key lengths:
- Lenstra/Verheul: 109 bits
- Lenstra Updated: 102 bits
- ECRYPT II: 256 bits
- NIST: 192 bits
- ANSSI: 128 bits
- IAD-NSA: 256 bits
- BSI: 128 bits
Personally, I don't see any problem with using 256-bit key lengths. So, my recommendation would be:
- Minimum: 128-bit keys
- Maximum: 256-bit keys

Avoid:
- Constructions with huge keys
- Cipher "cascades"
- Key sizes under 128 bits
Symmetric Signatures

If you're authenticating but not encrypting, as with API requests, don't do anything complicated. There is a class of crypto implementation bugs that arises from how you feed data to your MAC, so, if you're designing a new system from scratch, Google "crypto canonicalization bugs". Also, use a secure compare function.
If you use HMAC, people will feel the need to point out that SHA-3 (and the truncated SHA-2 hashes) can do “KMAC”, which is to say you can just concatenate the key and data, and hash them to be secure. This means that in theory, HMAC is doing unnecessary extra work with SHA-3 or truncated SHA-2. But who cares? Think of HMAC as cheap insurance for your design, in case someone switches to non-truncated SHA-2.
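Both pieces of advice above, HMAC and a secure compare function, are in Python's standard library. A minimal sketch, assuming SHA-256 and a hypothetical server-side key:

```python
import hashlib
import hmac

key = b"server-side-secret-key"          # hypothetical shared secret
message = b"GET /api/v1/widgets?page=2"  # the request being authenticated

# HMAC-SHA-256 tag for the request.
tag = hmac.new(key, message, hashlib.sha256).digest()

# Verify with a constant-time comparison. Never use == on MACs:
# a naive byte-by-byte comparison leaks timing information.
def verify(received_tag: bytes) -> bool:
    expected = hmac.new(key, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, received_tag)
```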
Alternately, use in order of preference:
- Keyed BLAKE2b
- Keyed BLAKE2s
- Keyed SHA3-512
- Keyed SHA3-256

Avoid:
- Custom "keyed hash" constructions
- Complex polynomial MACs
- Encrypted hashes
- Anything CRC
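The keyed BLAKE2 options above are also in Python's standard hashlib; keyed BLAKE2b is a MAC on its own, with no HMAC construction needed. A sketch with a hypothetical key:

```python
import hashlib

key = b"0123456789abcdef0123456789abcdef"  # hypothetical 32-byte key
msg = b"message to authenticate"

# Keyed BLAKE2b acts as a MAC directly; digest_size is tunable.
tag = hashlib.blake2b(msg, key=key, digest_size=32).digest()

# A different key yields an unrelated tag.
other = hashlib.blake2b(msg, key=b"another-key", digest_size=32).digest()
```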
Hashing Algorithms

If you can get away with it, you want to use hashing algorithms that truncate their output, sidestepping length extension attacks. Meanwhile: it's less likely that you'll upgrade from SHA-2 to SHA-3 than it is that you'll upgrade from SHA-2 to BLAKE2, which is faster than SHA-3, and SHA-2 looks great right now, so get comfortable and cuddly with SHA-2.
Use (pick one):
- SHA-2 (fast, time-tested, industry standard)
- BLAKE2 (fastest, SHA-3 finalist)
- SHA-3 (slowest, industry standard)

Avoid:
- EDON-R (I'm looking at you, OpenZFS)
Random Numbers

When creating random IDs, numbers, URLs, initialization vectors, or anything else that is random, you should always use your operating system's kernelspace CSPRNG. On GNU/Linux (including Android), BSD, or Mac (including iOS), this is /dev/urandom. On Windows, this is CryptGenRandom.

NOTE: /dev/random is not more secure than /dev/urandom. They use the same CSPRNG. The only time you would obsess over this is when working on an information-theoretic cryptographic primitive that exploits the blocking behavior of /dev/random, which you aren't doing (you would know if you were).
The only time you should ever use a userspace RNG is when you're in a constrained environment, such as embedded firmware, where the OS RNG is not available. In that case, use fast key erasure. The problem here, however, is making sure that it is properly seeded with entropy on each boot. This is harder than it sounds, so really, at all costs, this should only be used as a worst-case fallback.
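A minimal fast-key-erasure sketch, using SHA-256 as a stand-in PRF (real implementations use a fast stream cipher such as ChaCha20, and still have to solve the seeding problem):

```python
import hashlib

class FastKeyErasureRNG:
    """Toy fast-key-erasure RNG: each read derives output blocks from
    the current key, then immediately replaces (erases) that key, so a
    later memory compromise cannot reveal earlier outputs."""

    def __init__(self, seed: bytes):
        if len(seed) < 32:
            raise ValueError("seed with at least 256 bits of entropy")
        self._key = hashlib.sha256(seed).digest()

    def read(self, n: int) -> bytes:
        out = bytearray()
        while len(out) < n:
            out += hashlib.sha256(self._key + b"output").digest()
            # Ratchet forward: the old key is gone after this line.
            self._key = hashlib.sha256(self._key + b"ratchet").digest()
        return bytes(out[:n])
```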
Use, in order of preference:
- Your operating system's CSPRNG
- Fast key erasure (as a fallback)

Create: 256-bit random numbers

Avoid:
- Userspace random number generators
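In Python, the stdlib secrets module is a thin wrapper over the kernel CSPRNG described above:

```python
import secrets

token = secrets.token_bytes(32)     # 256-bit random value
url_id = secrets.token_urlsafe(32)  # random URL-safe ID
n = secrets.randbelow(10**6)        # uniform integer in [0, 10**6)
```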
Password Hashing

When using scrypt for password hashing, be aware that it is very sensitive to its parameters, making it possible to end up weaker than bcrypt, and that it suffers from a time-memory trade-off (source #1 and source #2). When using bcrypt, pre-hash the password with SHA-256 and Base64-encode the digest before handing it to bcrypt, which sidesteps both the leading NULL byte problem and the 72-character password limit.
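A sketch of that pre-hashing step using only the standard library (the final bcrypt call, shown as a comment, assumes the third-party bcrypt package):

```python
import base64
import hashlib

def prehash(password: bytes) -> bytes:
    """Prepare a password for bcrypt: SHA-256 collapses any length to
    32 bytes (beating the 72-byte limit), and Base64 guarantees no NUL
    bytes (which bcrypt would treat as a terminator)."""
    return base64.b64encode(hashlib.sha256(password).digest())

digest = prehash(b"correct horse battery staple" * 10)
# Feed `digest` to a bcrypt implementation, e.g. (hypothetical usage):
#   bcrypt.hashpw(digest, bcrypt.gensalt(rounds=12))
```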
Initially, I was hesitant to recommend Argon2 for general production use. I no longer feel that way. It was the winner of the Password Hashing Competition, has had ample analysis, even before the competition finished, and is showing no signs of serious weaknesses.
Each password hashing algorithm requires a "cost" to implement correctly. For Argon2, this is spending sufficient time on the CPU and using a sufficient amount of RAM. For scrypt, this is using at least 16 MB of RAM. For bcrypt, this is a cost of at least "5". For sha512crypt and sha256crypt, this is at least 5,000 rounds. For PBKDF2, this is at least 1,000 rounds.
Jeremi Gosney, a professional password cracker, publishes benchmarks with Nvidia GPU clusters, such as with 8x Nvidia GTX 1080 Ti GPUs. It's worth looking over those numbers.
Use, in order of preference:
- Argon2 (tune appropriately)
- scrypt (>= 16 MB)
- bcrypt (>= 5)
- sha512crypt (>= 5,000 rounds)
- sha256crypt (>= 5,000 rounds)
- PBKDF2 (>= 1,000 rounds)

Avoid:
- Naked SHA-2, SHA-1, MD5
- Complex homebrew algorithms
- Any encryption algorithm
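The scrypt and PBKDF2 cost floors above map directly onto Python's standard hashlib (Argon2 and bcrypt need third-party packages). With n=16384 and r=8, scrypt uses 128 × r × n bytes = 16 MiB of RAM:

```python
import hashlib
import secrets

password = b"hunter2"
salt = secrets.token_bytes(16)

# scrypt: 128 * r * n bytes of RAM = 128 * 8 * 16384 = 16 MiB.
key1 = hashlib.scrypt(password, salt=salt, n=16384, r=8, p=1,
                      maxmem=64 * 1024 * 1024, dklen=32)

# PBKDF2-HMAC-SHA256 at the document's 1,000-round floor
# (use far more rounds in practice).
key2 = hashlib.pbkdf2_hmac("sha256", password, salt, 1000, dklen=32)
```

Note that maxmem must be raised above the memory scrypt actually needs, or some OpenSSL builds will reject the parameters.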
Asymmetric Encryption

It's time to stop using anything RSA, and start using NaCl. Of all the cryptographic "best practices", this is the one you're least likely to get right on your own. NaCl has been designed to prevent you from making stupid mistakes, it's highly favored among the cryptographic community, and it focuses on modern, highly secure cryptographic primitives.
It's time to start using ECC. Here are several reasons you should stop using RSA and switch to elliptic curve software:
- Progress in attacking RSA --- really, all the classic multiplicative group primitives, including DH and DSA and presumably ElGamal --- is proceeding faster than progress against elliptic curves.
- RSA (and DH) drag you towards "backwards compatibility" (ie: downgrade-attack compatibility) with insecure systems. Elliptic curve schemes generally don't need to be vigilant about accidentally accepting 768-bit parameters.
- RSA begs implementors to encrypt directly with its public key primitive, which is usually not what you want to do: not only does accidentally designing with RSA encryption usually forfeit forward-secrecy, but it also exposes you to new classes of implementation bugs. Elliptic curve systems don't promote this particular foot-gun.
- The weight of correctness/safety in elliptic curve systems falls primarily on cryptographers, who must provide a set of curve parameters optimized for security at a particular performance level; once that happens, there aren't many knobs for implementors to turn that can subvert security. The opposite is true in RSA. Even if you use RSA-KEM or RSA-OAEP, there are additional parameters to supply and things you have to know to get right.
If you absolutely have to use RSA, do use RSA-KEM. But don't use RSA. Use ECC.
Use: NaCl, libsodium, or monocypher

Avoid:
- Really, anything RSA
- OpenPGP, OpenSSL, BouncyCastle, etc.
Asymmetric Key Length
As with symmetric encryption, asymmetric encryption key length is a vital security parameter. Academic, private, and government organizations provide different recommendations, with mathematical formulas to approximate the minimum key size requirement for security. See BlueKrypt's Cryptographic Key Length Recommendation for other recommendations and dates.
To protect data up through 2050, it is recommended to meet the minimum requirements for asymmetric key lengths:
| Method | RSA | ECC | D-H Key | D-H Group |
|--------|-----|-----|---------|-----------|
Personally, I don't see any problem with using 2048-bit RSA/DH group and 256-bit ECC/DH key lengths. So, my recommendation would be:
- 256-bit minimum for ECC/DH Keys
- 2048-bit minimum for RSA/DH Group (but you're not using RSA, right?)
Avoid: Not following the above recommendations.
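For a rough sanity check on these numbers, NIST SP 800-57 Part 1 publishes comparable strengths between symmetric security levels and minimum RSA/DH and ECC key sizes. A small lookup makes the recommendation above concrete, and shows that 2048-bit RSA and 256-bit ECC are not equivalent:

```python
# NIST SP 800-57 Part 1 comparable strengths:
# symmetric security bits -> (RSA / DH group modulus bits, ECC key bits)
COMPARABLE = {
    112: (2048, 224),
    128: (3072, 256),
    192: (7680, 384),
    256: (15360, 512),
}

rsa_bits, _ = COMPARABLE[112]   # 2048-bit RSA ~ 112-bit security
_, ecc_bits = COMPARABLE[128]   # 256-bit ECC  ~ 128-bit security
```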
Asymmetric Signatures

The two dominating use cases within the last 10 years for asymmetric signatures are cryptocurrencies and forward-secret key agreement, as with ECDHE-TLS. The dominating algorithms for these use cases are all elliptic-curve based. Be wary of new systems that use RSA signatures.
In the last few years there has been a major shift away from conventional DSA signatures and towards misuse-resistant "deterministic" signature schemes, of which EdDSA and RFC6979 are the best examples. You can think of these schemes as "user-proofed" responses to the PlayStation 3 ECDSA flaw, in which reuse of a random number leaked secret keys. Use deterministic signatures in preference to any other signature scheme.
Ed25519, the NaCl default, is by far the most popular public key signature scheme outside of Bitcoin. It’s misuse-resistant and carefully designed in other ways as well. You shouldn’t freelance this either; get it from NaCl.
Use, in order of preference:
- NaCl, libsodium, or monocypher
- RFC6979 (deterministic DSA/ECDSA)

Avoid:
- Anything RSA
Diffie-Hellman

Developers should not freelance their own encrypted transports. To get a sense of the complexity of this issue, read the documentation for the Noise Protocol Framework. If you're doing a key exchange with Diffie-Hellman, you probably want an authenticated key exchange (AKE) that resists key compromise impersonation (KCI), and so the primitive you use for Diffie-Hellman is not the only important security concern.
It remains the case: if you can just use NaCl, use NaCl. You don’t even have to care what NaCl does. That’s the point of NaCl.
Don’t do ECDH with the NIST curves, where you’ll have to carefully verify elliptic curve points before computing with them to avoid leaking secrets. That attack is very simple to implement, easier than a CBC padding oracle, and far more devastating.
The previous edition of this document included a clause about using DH-1024 in preference to sketchy curve libraries. You know what? That's still a valid point. Valid, and stupid. The way to solve the "DH-1024 vs. sketchy curve library" problem is the same as the "should I use Blowfish or IDEA?" problem: don't have that problem. Use Curve25519.
Use, in order of preference:
- NaCl, libsodium, or monocypher
- 2048-bit Diffie-Hellman Group #14

Avoid:
- Conventional DH
- Handshakes and negotiation
- Elaborate key negotiation schemes that only use block ciphers
Website Security

By "website security", we mean "the library you use to make your web server speak HTTPS". If you can pay a web hosting provider to worry about this problem for you, then do that. Otherwise, use OpenSSL.
There was a dark period between 2010 and 2016 where OpenSSL might not have been the right answer, but that time has passed. OpenSSL has gotten better, and, more importantly, OpenSSL is on-the-ball with vulnerability disclosure and response.
Using anything besides OpenSSL will drastically complicate your system for little, no, or even negative security benefit. This means avoid LibreSSL, BoringSSL, or BearSSL for the time being. Not because they're bad, but because OpenSSL really is the Right Answer here. Just keep it simple; use OpenSSL.
Speaking of simple: Let's Encrypt is free and automated. Set up a cron job to re-fetch certificates regularly, and test it.
Use (pick one):
- A web hosting provider, like AWS
- OpenSSL with Let's Encrypt
Client-Server Application Security
What happens when you design your own custom RSA protocol is that 1-18 months afterwards, hopefully sooner but often later, you discover that you made a mistake and your protocol had virtually no security. A good example is Salt Stack. Salt managed to deploy e=1 RSA.
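To see why the Salt bug was fatal: RSA "encryption" is m^e mod n, so with e = 1 the ciphertext is simply the plaintext for any message smaller than the modulus. A toy illustration with textbook-sized numbers:

```python
# Toy numbers only; real RSA moduli are 2048+ bits.
n = 3233  # toy modulus (61 * 53)
m = 2     # "plaintext" encoded as an integer, m < n

# With a sane public exponent, encryption changes the message...
ct_sane = pow(m, 5, n)   # 2**5 mod 3233 = 32
# ...but with e = 1, "encryption" returns the plaintext unchanged.
ct_e1 = pow(m, 1, n)
```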
It seems a little crazy to recommend TLS given its recent history:
- The Logjam DH negotiation attack
- The FREAK export cipher attack
- The POODLE CBC oracle attack
- The RC4 fiasco
- The CRIME compression attack
- The Lucky13 CBC padding oracle timing attack
- The BEAST CBC chained IV attack
- Triple Handshakes
- Compromised CAs
Here's why you should still use TLS for your custom transport problem:
- Most of these attacks can be mitigated by hardcoding TLS 1.2+, ECDHE and AES-GCM. That sounds tricky, and it is, but it's less tricky than designing your own transport protocol with ECDHE and AES-GCM!
- In a custom transport scenario, you don't need to depend on CAs: you can self-sign a certificate and ship it with your code, just like Colin suggests you do with RSA keys.
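Hardcoding TLS 1.2+ with ECDHE and AES-GCM is a few lines with Python's stdlib ssl module (exact cipher availability depends on the local OpenSSL build, and the pinned certificate path is hypothetical):

```python
import ssl

# Client context that refuses anything below TLS 1.2.
ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
ctx.minimum_version = ssl.TLSVersion.TLSv1_2

# Restrict the TLS 1.2 cipher suites to forward-secret AEAD choices.
ctx.set_ciphers("ECDHE+AESGCM:ECDHE+CHACHA20")

# For a custom transport, pin your own self-signed certificate
# instead of trusting the public CA ecosystem:
# ctx.load_verify_locations("pinned-cert.pem")  # hypothetical path
```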
Avoid:
- Designing your own encrypted transport, which is a genuinely hard engineering problem
- Using TLS, but in a default configuration, like with "curl"
- Using "curl"
Online Backups

Of course, you should host your own backups in-house. The best security is the security where others just don't get access to your data.
The best solution, IMO, is OpenZFS. Not only do you get data integrity with 256-bit checksums, but you get redundancy, volume management, network transport, and many other options, all for free. FreeNAS makes setting this up trivial. Setting it up with Debian GNU/Linux isn't too difficult.
If using an online backup service, rather than hosting your own, use Tarsnap. It's withstood the test of time.
Alternatively, Keybase has its own keybase filesystem (KBFS) that supports public, private, and team repositories. The specification is sound, but they only provide 10 GB for free, without any paid plans currently. All data is end-to-end encrypted in your KBFS client before being stored on the filesystem, using the NaCl library.
Avoid:
- Amazon S3