Not all random values are created equal - for security-related code, you need a specific kind of random value.
A summary of this article, if you don't want to read the entire thing:
- Don't use `Math.random()`. There are extremely few cases where `Math.random()` is the right answer. Don't use it, unless you've read this entire article, and determined that it's necessary for your case.
- Don't use `crypto.randomBytes` directly. While it's a CSPRNG, it's easy to bias the result when 'transforming' it, such that the output becomes more predictable.
- If you want to generate random tokens or API keys: Use `uuid`, specifically the `uuid.v4()` method. Avoid `node-uuid` - it's not the same package, and doesn't produce reliably secure random values.
- If you want to generate random numbers in a range: Use `random-number-csprng`.
You should seriously consider reading the entire article, though - it's not that long :)
There exist roughly three types of "random":
- Truly random: Exactly as the name describes. True randomness, to which no pattern or algorithm applies. It's debatable whether this really exists.
- Unpredictable: Not truly random, but impossible for an attacker to predict. This is what you need for security-related code - it doesn't matter how the data is generated, as long as it can't be guessed.
- Irregular: This is what most people think of when they think of "random". An example is a game with a background of a star field, where each star is drawn in a "random" position on the screen. This isn't truly random, and it isn't even unpredictable - it just doesn't look like there's a pattern to it, visually.
Irregular data is fast to generate, but utterly worthless for security purposes - even if it doesn't seem like there's a pattern, there is almost always a way for an attacker to predict what the values are going to be. The only realistic use case for irregular data is things that are represented visually, such as game elements or randomly generated phrases on a joke site.
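To make "irregular" concrete, here is a minimal sketch of the kind of seeded generator that might place stars in a star field. The names `createIrregularGenerator` and `starField`, the seed, the canvas size, and the constants (a common textbook linear congruential generator) are illustrative assumptions, not something this article prescribes. The output looks patternless, yet anyone who knows or guesses the seed can reproduce every value:

```javascript
// Hypothetical example: a tiny linear congruential generator (LCG).
// Fine for visuals, useless for security: the whole sequence is
// determined by the seed.
function createIrregularGenerator(seed) {
  let state = seed >>> 0; // keep the state as an unsigned 32-bit integer

  return function next() {
    // Math.imul performs a 32-bit multiplication without losing precision.
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 0x100000000; // scale to the range [0, 1)
  };
}

// Place `count` "stars" on a hypothetical 800x600 canvas.
function starField(seed, count) {
  const next = createIrregularGenerator(seed);
  const stars = [];
  for (let i = 0; i < count; i++) {
    stars.push({ x: Math.floor(next() * 800), y: Math.floor(next() * 600) });
  }
  return stars;
}

const first = starField(12345, 5);
const second = starField(12345, 5);

// Same seed, same "random" stars.
console.log(JSON.stringify(first) === JSON.stringify(second)); // true
```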
Unpredictable data is a bit slower to generate, but still fast enough for most cases, and it's sufficiently hard to guess that it will be attacker-resistant. Unpredictable data is provided by what's called a CSPRNG.
- CSPRNG: A Cryptographically Secure Pseudo-Random Number Generator. This is what produces unpredictable data that you need for security purposes.
- PRNG: A Pseudo-Random Number Generator. This is a broader category that includes CSPRNGs and generators that just return irregular values - in other words, you cannot rely on a PRNG to provide you with unpredictable values.
- RNG: A Random Number Generator. The meaning of this term depends on the context. Most people use it as an even broader category that includes PRNGs and truly random number generators.
Every random value that you need for security-related purposes (i.e. anything where there exists the possibility of an "attacker") should be generated using a CSPRNG. This includes verification tokens, reset tokens, lottery numbers, API keys, generated passwords, encryption keys, and so on.
In Node.js, the most widely available CSPRNG is the `crypto.randomBytes` function, but you shouldn't use this directly, as it's easy to mess up and "bias" your random values - that is, making it more likely that a specific value or set of values is picked.
A common example of this mistake is using the `%` modulo operator when the number of possibilities doesn't divide evenly into 256 (the number of possible values of a single byte). Doing so actually makes lower values more likely to be picked than higher values.
For example, let's say that you have 36 possible random values - `0-9` plus every lowercase letter in `a-z`. A naive implementation might look something like this:
let randomCharacter = randomByte % 36;
That code is broken and insecure. With the code above, you essentially create the following ranges (all inclusive):
- 0-35 stays 0-35.
- 36-71 becomes 0-35.
- 72-107 becomes 0-35.
- 108-143 becomes 0-35.
- 144-179 becomes 0-35.
- 180-215 becomes 0-35.
- 216-251 becomes 0-35.
- 252-255 becomes 0-3.
If you look at the above list of ranges you'll notice that while 7 byte values map to each `randomCharacter` between 4 and 35 (inclusive), 8 byte values map to each `randomCharacter` between 0 and 3 (inclusive). This means that while there's a roughly 2.73% chance (7 in 256) of getting any particular value between 4 and 35, there's a 3.125% chance (8 in 256) of getting any particular value between 0 and 3.
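The counts in the list of ranges can be checked by brute force. The following sketch simply walks through all 256 possible byte values and tallies where `% 36` sends each one; the variable names are only illustrative:

```javascript
// Count how many of the 256 possible byte values map to each result of `byte % 36`.
const counts = new Array(36).fill(0);
for (let byte = 0; byte < 256; byte++) {
  counts[byte % 36] += 1;
}

console.log(counts.slice(0, 4)); // [ 8, 8, 8, 8 ]
console.log(counts.slice(4, 8)); // [ 7, 7, 7, 7 ]
```

Results 0 through 3 each have one more source byte than every other result, which is exactly the bias described above.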
This kind of difference may look small, but it's an easy and effective way for an attacker to reduce the number of guesses they need when brute-forcing something. And this is only one way in which you can make your random values insecure, despite them originally coming from a secure random source.
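One common technique for avoiding this particular modulo bias is rejection sampling: discard any byte that falls into the incomplete final range, and draw again. The sketch below is for illustration only. The helper name `unbiasedIndex` is hypothetical, it only handles up to 256 possibilities, and in real code a vetted implementation is still the safer choice over rolling your own:

```javascript
const crypto = require('node:crypto');

// Illustrative sketch of rejection sampling; `unbiasedIndex` is a hypothetical helper.
// Returns an integer in [0, possibilities), where 1 <= possibilities <= 256.
function unbiasedIndex(possibilities) {
  if (!Number.isInteger(possibilities) || possibilities < 1 || possibilities > 256) {
    throw new RangeError('possibilities must be an integer between 1 and 256');
  }

  // Largest multiple of `possibilities` that fits within a byte's 256 values.
  // For 36 possibilities this is 252, so bytes 252-255 get rejected.
  const limit = 256 - (256 % possibilities);

  while (true) {
    const byte = crypto.randomBytes(1)[0];
    if (byte < limit) {
      // Every result now has the same number of source bytes.
      return byte % possibilities;
    }
    // Otherwise the byte fell into the incomplete final range; draw again.
  }
}

const alphabet = '0123456789abcdefghijklmnopqrstuvwxyz';
const randomCharacter = alphabet[unbiasedIndex(alphabet.length)];
console.log(randomCharacter);
```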
In Node.js:
- If you need a sequence of random bytes: Use `crypto.randomBytes`.
- If you need individual random numbers in a certain range: Use `crypto.randomInt`.
- If you need a random string: You have two good options here, depending on your needs.
  - Use a v4 UUID. Safe ways to generate this are `crypto.randomUUID`, and the `uuid` library (only the v4 variant!).
  - Use a nanoid, using the `nanoid` library. This also allows specifying a custom alphabet to use for your random string.
Both of these use a CSPRNG, and 'transform' the bytes in an unbiased (i.e. secure) way.
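As a rough sketch of the built-in Node.js options from the list above (the third-party `uuid` and `nanoid` libraries are left out here, since they need to be installed separately); the variable names and the die-roll range are just examples:

```javascript
const crypto = require('node:crypto');

// A sequence of random bytes (here: 16 bytes, returned as a Buffer).
const bytes = crypto.randomBytes(16);

// A single random integer in a range. The lower bound is inclusive and the
// upper bound is exclusive, so this simulates a six-sided die: 1 to 6.
const dieRoll = crypto.randomInt(1, 7);

// A random v4 UUID string.
const id = crypto.randomUUID();

console.log(bytes.length); // 16
console.log(dieRoll);
console.log(id);
```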
In the browser:
- When using the Node.js options, your bundler should automatically select equivalently safe browser implementations for all of these.
- If not using a bundler:
  - If you need a sequence of random bytes: Use `crypto.getRandomValues` with a `Uint8Array`. Other array types will get you numbers in different ranges.
  - If you need a random string: You have two good options here, depending on your needs.
    - Use a v4 UUID, with the `crypto.randomUUID` method.
    - Use a nanoid, using the standalone build of the `nanoid` library. This also allows specifying a custom alphabet to use for your random string.
However, it is strongly recommended that you use a bundler, in general.
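For the no-bundler case, a minimal sketch of the two built-in browser options might look like the following. The `webCrypto` variable and its fallback are assumptions added only so that the same snippet can also be tried under Node.js; in a browser you would use the global `crypto` object directly. Note that browsers only expose `crypto.randomUUID` in secure contexts (such as pages served over HTTPS):

```javascript
// In a browser, `crypto` is a global. The fallback only exists so that this
// sketch can also be run under Node.js versions that lack the global.
const webCrypto = globalThis.crypto ?? require('node:crypto').webcrypto;

// A sequence of random bytes: fill a Uint8Array, so each element is 0-255.
const bytes = new Uint8Array(16);
webCrypto.getRandomValues(bytes);

// A random v4 UUID string.
const id = webCrypto.randomUUID();

console.log(bytes.length); // 16
console.log(id);
```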
@bipinstha7 So there's a general principle in cryptography: never add more complexity than is actually needed to effectively solve the problem. This is because every bit of complexity you add, increases the chance of failure; whether it's because of a bug in the implementation, or because you put the pieces together incorrectly. And because a single tiny cryptographic failure can compromise an entire system, it's crucial to reduce this risk to the absolute minimum.
It's for this same reason that when you see a cryptographic tool advertising itself with how complex it is ("triple hashing", "double encryption", "proprietary algorithm", etc.), you should treat it as completely insecure. When someone presents complexity as a feature, that's a sure sign that they don't understand how to handle risk in cryptography. More about that and related topics here.
But to get back to your case: consider that in your approach, you still need to have a secret key, which needs to be randomly generated, and a salt, which also needs to be randomly generated. So in the end, you still need a random source, you're just adding a bunch of cryptographic complexity on top of that. That means more things that could break.
Consider, for example, what would happen if a weakness were found in the key derivation function that allows someone with 10 outputs (with 10 different salts) to determine the original secret key. Suddenly, your key security got a lot weaker; a problem that would not have existed if you had just used a CSPRNG with a simple encoding scheme.
You're not avoiding the encoding scheme either - you say that it produces "string values", but it doesn't, at least not originally! Like all cryptographic operations, PBKDF2 operates on series of bytes, and produces a series of bytes too. It likely just gets encoded (hex-encoded, base64-encoded, etc.) before it is returned to you.
So why not just take random bytes from the CSPRNG, and encode them in some simple and non-biased way, completely skipping the whole "cryptographic key derivation with a salt" dance in between those two steps? And that is indeed exactly what UUID and nanoid implementations do.
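As a rough illustration of "CSPRNG output plus a simple encoding", something like the following would do. The helper name `generateToken` and the 32-byte length are assumptions made for the example, not a recommendation from any particular library:

```javascript
const crypto = require('node:crypto');

// Hypothetical helper: take bytes from the CSPRNG and hex-encode them.
// Hex maps every byte to exactly two characters, so the encoding adds no bias.
function generateToken(byteLength = 32) {
  return crypto.randomBytes(byteLength).toString('hex');
}

const token = generateToken();
console.log(token.length); // 64
```

There is no secret key, no salt, and no key derivation function involved, so there are fewer pieces that could break.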
You may also find this a useful read.