NIST recommends that when users are trying to set a password you should reject those that are commonly used or compromised:
When processing requests to establish and change memorized secrets,
verifiers SHALL compare the prospective secrets against a list that
contains values known to be commonly-used, expected, or compromised.
But how do you know which passwords are compromised? Luckily Troy Hunt put a lot of effort into building the "Have I Been Pwned (HIBP)" database with the SHA1 hashes of 501,636,842 passwords that have been compromised on the internet. Sweet.
This means that to prevent a user setting a compromised password like P@ssword you can look it up on a public HIBP service and reject it.
If you are running a security sensitive service it is probably a bad idea to send password hashes to a public lookup service. To get around that, the public Pwned Passwords API at https://haveibeenpwned.com/API/v2#PwnedPasswords has you send only the first 5 characters of the hash and responds with all the matches for that prefix. That might be slow, return a lot of data, or be offline. So you might want to load the HIBP database into a private store such as MongoDB and check the SHA1 hash against that authoritative store. You can then put a private, secure API in front of your own MongoDB and do an exact-match SHA1 check, which will be fast, and since it runs on your infrastructure you can make sure it is highly available.
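As a rough illustration of the prefix lookup described above, here is a minimal Python sketch. The endpoint URL in `RANGE_URL`, the `SUFFIX:COUNT` response layout, and the function names are assumptions for illustration and are not taken from this write-up, so check them against the API documentation before relying on them.

```python
import hashlib
import urllib.request
from typing import Tuple

# Assumed range-search endpoint; verify against the API documentation.
RANGE_URL = "https://api.pwnedpasswords.com/range/"


def split_sha1(password: str) -> Tuple[str, str]:
    """Return the (5-char prefix, 35-char suffix) of the upper-case SHA1 hex digest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_range(suffix: str, body: str) -> int:
    """Look for our suffix in a response made of SUFFIX:COUNT lines; 0 if absent."""
    for line in body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0


def public_pwned_count(password: str) -> int:
    """Ask the public service how often a password was seen (needs network access)."""
    prefix, suffix = split_sha1(password)
    request = urllib.request.Request(
        RANGE_URL + prefix, headers={"User-Agent": "hibp-sketch"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return count_in_range(suffix, response.read().decode("utf-8"))
```

Only the 5-character prefix leaves your machine; the comparison against the suffix happens locally. The network round trip in `public_pwned_count` is the part a private store removes.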
There is another gist on this site for loading into Redis. Redis needs to fit in memory so it would be expensive to run, but that gist suggests holding the most used passwords in Redis for fast checks before doing a slower check against all the hashes in MongoDB.
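The two-tier idea can be sketched roughly as follows. This is not the code from the Redis gist: a plain Python set stands in for the Redis cache of most used hashes, and `slow_lookup` is a hypothetical callable standing in for the exact-match query against the full MongoDB collection.

```python
import hashlib
from typing import Callable, Set


def make_checker(
    hot_hashes: Set[str], slow_lookup: Callable[[str], bool]
) -> Callable[[str], bool]:
    """Build a checker that tries the in-memory set before the full store."""

    def is_compromised(password: str) -> bool:
        digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
        if digest in hot_hashes:
            # Fast path: the most used passwords, held in memory.
            return True
        # Slow path: exact match against every hash in the big store.
        return slow_lookup(digest)

    return is_compromised
```

The slow lookup only runs when the fast tier misses, so the most frequently attempted bad passwords never reach the full store.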
These instructions assume that you are on a Mac but should be just as straightforward on Linux.
- Over 50 GiB of disk (the uncompressed database is 33 GiB, plus 8 GiB for the compressed download)
- Homebrew to install command line tools:
  - brew install aria2 for the aria2c BitTorrent download client
  - brew install p7zip for the 7za tool to uncompress the .txt.7z file
- A MongoDB database with sufficient disk space.
Note that it took an hour to download the 8 GiB torrent on my broadband.
The mongoimport command assumes that your mongod server is listening locally on the default port. If not, you can pass command-line arguments to mongoimport below to connect to a remote server.
aria2c https://downloads.pwnedpasswords.com/passwords/pwned-passwords-2.0.txt.7z.torrent
7za x -so pwned-passwords-2.0.txt.7z | sed 's/:/,/g' | mongoimport --fields "_id.binary(base64),c.int32()" --columnsHaveTypes --db hibp --collection pwndpsswds --type csv
If you log in and query the collection it looks something like:
> db.pwndpsswds.find()
{ "_id" : BinData(0,"5BAA61E4C9B93F3F0682250B6CF8331B7EE68FD8"), "c" : 3303003 }
{ "_id" : BinData(0,"3D4F2BF07DC1BE38B20CD6E46949A1071F9D0E3D"), "c" : 2900049 }
{ "_id" : BinData(0,"7C222FB2927D828AF22F592134E8932480637C0D"), "c" : 2680521 }
{ "_id" : BinData(0,"6367C48DD193D56EA7B0BAAD25B19455E529F5EE"), "c" : 2670319 }
{ "_id" : BinData(0,"E38AD214943DAAD1D64C102FAEC29DE4AFE9DA3D"), "c" : 2310111 }
Where the primary key _id is stored in a binary format to reduce the storage size compared to storing a string. That means that to query by the primary key you need to do a little bit of work to convert the SHA1 string into a BinData type. Because the import reads each hex digest as if it were base64 text, the query has to build the BinData value the same way. You should test your query solution against known passwords such as P@ssword so that you don't get false negatives.
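One way to build the lookup key is sketched below in Python. It assumes the collection was loaded with the mongoimport command above, so each 40-character hex digest was decoded as base64 text. The helper names are hypothetical, and `collection` is assumed to be any object with a pymongo-style `find_one` method.

```python
import base64
import hashlib


def hibp_id(password: str) -> bytes:
    """Build the _id bytes the import above would have stored for this password."""
    hex_digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    # The import treated the 40 hex characters as base64 text, so the stored
    # value is their base64 decoding (30 bytes), not the 20 raw digest bytes.
    return base64.b64decode(hex_digest)


def stored_count(collection, password: str) -> int:
    """Return the stored count, or 0 when the password is not in the collection."""
    doc = collection.find_one({"_id": hibp_id(password)})
    return doc["c"] if doc else 0


# Usage, assuming pymongo is installed and mongod is listening locally:
#   from pymongo import MongoClient
#   coll = MongoClient()["hibp"]["pwndpsswds"]
#   stored_count(coll, "P@ssword")
```

On Python 3, pymongo encodes plain bytes as BSON binary subtype 0, which matches the BinData(0, ...) values shown in the query output. In the mongo shell the equivalent key is BinData(0, "<upper-case hex digest>").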