@shello
Last active May 13, 2022 23:43
HIBP Pwned Passwords splitter

This Awk script splits pwned-passwords-1.0.txt.7z (and its updates) into 256 xz-compressed files, one per hash prefix (the first 2 hex digits of each hash).
The script assumes the input has one hash per line.
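
To make the bucketing concrete, here is a minimal sketch (not part of the gist) that only counts how many hashes would land in each bucket, without writing any files. The name `bucket_counts` is made up for illustration; it assumes one hash per line on standard input and needs nothing beyond a POSIX awk and sort.

```shell
# Hypothetical helper: count hashes per 2-digit prefix, reading stdin.
bucket_counts() {
    awk '{
        prefix = tolower(substr($0, 1, 2))  # awk strings are 1-indexed
        count[prefix]++
    }
    END {
        for (p in count) print p, count[p]
    }' | sort
}

# Three made-up, shortened "hashes" just to show the grouping.
printf '%s\n' 00AB12 00CD34 FF0011 | bucket_counts
# 00 2
# ff 1
```

The real script does the same prefix extraction, but pipes each line into an xz process for that prefix instead of counting.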

It can be run on more than one pwned-passwords-*.txt.7z file, such as pwned-passwords-update-1.txt.7z, as long as Troy keeps the format consistent. Hashes from later files are appended to the existing bucket files.
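
Since everything hinges on that format, it may be worth checking a new file before splitting it. A rough sketch, assuming each line should be exactly 40 hexadecimal characters (a bare SHA-1 hash); `count_malformed` is a hypothetical name, not something the gist provides.

```shell
# Hypothetical helper: print how many stdin lines are NOT a bare SHA-1 hash.
count_malformed() {
    # -x: whole-line match, -v: invert, -c: print only the count.
    # grep exits 1 when the count is 0, so treat that status as success.
    grep -Evxc '[0-9A-Fa-f]{40}' || [ $? -eq 1 ]
}

# Usage against a real archive (not run here):
# 7z e -so pwned-passwords-update-1.txt.7z | count_malformed
```

A result of 0 suggests the file matches the assumption. A count close to the total number of lines probably points to something systematic, such as extra columns or different line endings, rather than a few bad records.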

Requires 7z (p7zip), xz, and GNU Awk (tested with GNU Awk 4.1.4).

Usage

$ 7z e -so pwned-passwords-1.0.txt.7z | awk -f split.awk
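
For several archives, the same pipeline can be wrapped in a loop. This is only a sketch: `split_archives` is a made-up name, and it assumes split.awk sits in the current directory and that the archives are passed in the order they should be applied. Because the Awk script appends to the bucket files, start from a directory without leftover pwned-*.xz files from an earlier attempt, or the hashes will be duplicated.

```shell
# Hypothetical wrapper: run the Usage pipeline once per archive.
split_archives() {
    local archive
    for archive in "$@"; do
        echo "Splitting $archive" >&2
        # Stop at the first awk failure. A 7z failure is not detected here
        # unless `set -o pipefail` is enabled in bash.
        7z e -so "$archive" | awk -f split.awk || return 1
    done
}

# split_archives pwned-passwords-1.0.txt.7z pwned-passwords-update-1.txt.7z
```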

Why?

Because searching a multi-GiB compressed file is a pain in the ass, and loading the contents into a database for a few local searches is not worth the hassle. (says the one who wrote an awk script and ran it...)

Being able to use xzgrep is also nice.

What do I do with 256 files now?

Here's a quick and dirty bash function to perform a search on these files:

amipwned() {
    local pw pwsha
    read -rsp "Password: " pw
    echo
    pwsha=$(printf '%s' "$pw" | shasum | cut -f1 -d ' ')
    xzgrep -qm1 "$pwsha" "pwned-${pwsha:0:2}.xz" \
        && echo 'PWNED!' \
        || echo 'That one is safe, for now.'
}

Using it is simple:

$ amipwned
Password:
PWNED!

😰😰😰
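
If you already have a SHA-1 hash rather than a password (say, from another tool), the bucket can be derived from the hash alone. A small sketch; `bucket_for_hash` is a hypothetical helper, and it assumes the bucket files use the lowercase pwned-XX.xz naming produced by the splitter.

```shell
# Hypothetical helper: map a SHA-1 hash to the name of its bucket file.
bucket_for_hash() {
    local hash prefix
    # Lowercase first, because the splitter lowercases every line.
    hash=$(printf '%s' "$1" | tr 'A-F' 'a-f')
    prefix=$(printf '%s' "$hash" | cut -c1-2)
    printf 'pwned-%s.xz\n' "$prefix"
}

bucket_for_hash 5BAA61E4C9B93F3F0682250B6CF8331B7EE68FD8   # prints pwned-5b.xz
```

The lookup itself would then be something like `xzgrep -qim1 "$hash" "$(bucket_for_hash "$hash")"`, where `-i` covers hashes supplied in uppercase.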

Legalese

I'm not responsible for full disks, omg-my-password-is-pwned panic attacks, sweated shirts or soiled trousers, or whatever happens to you or your system derived from running any of this code, or reading this text.

This code is licensed under a CC0 1.0 Universal license: https://creativecommons.org/publicdomain/zero/1.0/legalcode.

Have fun. 😁

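#!/usr/bin/env bash
# Check a password against the xz-compressed hash buckets written by split.awk.

amipwned() {
    local pw pwsha bucket
    read -rsp "Password: " pw
    echo
    # printf avoids echo's option parsing for passwords such as "-n".
    pwsha=$(printf '%s' "$pw" | shasum | cut -f1 -d ' ')
    bucket="${1:-.}/pwned-${pwsha:0:2}.xz"
    [[ ! -f "$bucket" ]] && { echo "Hash bucket file not found!" >&2; exit 1; }
    xzgrep -qm1 "$pwsha" "$bucket" \
        && echo 'PWNED!' \
        || echo 'That one is safe, for now.'
}

[[ "$1" == '-h' ]] && { echo "Usage: $0 [bucket-files-directory]" >&2; exit 0; }
# The directory argument is optional and defaults to the current directory.
dir="${1:-.}"
[[ ! -d "$dir" ]] && { echo "Unable to access $dir." >&2; exit 1; }
amipwned "$dir"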
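# split.awk: split a stream of hashes into pwned-XX.xz buckets by 2-digit prefix.

# Build the shell command that appends to the bucket file for this prefix.
function cmd(prefix) {
    return "xz -0 >>pwned-" prefix ".xz";
}

# Print progress; the extra parameters after bucket_idx are awk-style locals.
function info(prefix, bucket_idx,    bck_pct, delta, rec_s) {
    bck_pct = bucket_idx * 100 / buckets_total;
    delta = systime() - start_time;
    rec_s = delta > 0 ? NR / delta : 0;
    print "Prefix", prefix, "(" bck_pct "%,", rec_s, "rec/s)";
}

BEGIN {
    last_prefix = "00";
    buckets_total = 256;
    bucket_idx = 0;
    start_time = systime();  # set before the first info() call
    info(last_prefix, bucket_idx);
    out_cmd = cmd(last_prefix);
}

{
    line = tolower($0);
    prefix = substr(line, 1, 2);  # awk strings are 1-indexed
    if (prefix != last_prefix) {
        close(out_cmd);
        out_cmd = cmd(prefix);
        last_prefix = prefix;
        info(prefix, ++bucket_idx);  # pre-increment so the percentage advances
    }
    print line | out_cmd;
}

END {
    close(out_cmd);
}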