abamacus / wiki-100k.txt
Last active January 30, 2022 22:57 — forked from h3xx/wiki-100k.txt
Wiktionary top 100,000 most frequently-used English words [for john the ripper]
#!comment: This was allegedly a list of "the top 100,000 most frequently-used English words"; see the repo I forked for more provenance.
#!comment: But it was very un-sanitized. I had a specific purpose in mind, and thought a cleaner list might be more generally useful to have and share in the future, so here is how I sanitized it:
#!comment: 0) put a number on each word (note that the list somehow only included 98,913 words to start with)
#!comment: 1) change all words to lower-case
#!comment: 2) blank out any words with characters other than a-z
#!comment: 3) remove any duplicates, keeping the lower (more frequent) number
#!comment: Now the list is 62,916 words long and still contains a lot of non-English words, but I think it is more useful.
#!comment:
#!comment: Format: Rank (original rank) Word
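The sanitization steps described in the comments above could be sketched roughly as follows. This is not the script actually used to produce this file; it is a minimal Python illustration under a few assumptions: the input is an iterable of raw entries already in frequency order, entries containing anything other than a-z are dropped outright rather than blanked, and the leading "Rank" in the output format is a fresh sequential number over the surviving words. The names `sanitize` and `format_line` are hypothetical.

```python
import re

# Matches words made up only of the letters a-z (step 2).
ONLY_A_TO_Z = re.compile(r"[a-z]+")


def sanitize(raw_words):
    """Return (new_rank, original_rank, word) tuples for the kept words.

    raw_words is assumed to be in frequency order, most frequent first.
    """
    seen = set()
    kept = []
    # Step 0: number every entry, starting from 1.
    for original_rank, raw in enumerate(raw_words, start=1):
        # Step 1: lower-case (stripping whitespace is an assumption).
        word = raw.strip().lower()
        # Step 2: skip anything with characters other than a-z.
        if not ONLY_A_TO_Z.fullmatch(word):
            continue
        # Step 3: skip duplicates; the first occurrence seen has the
        # lower (more frequent) number, so that is the one kept.
        if word in seen:
            continue
        seen.add(word)
        kept.append((len(kept) + 1, original_rank, word))
    return kept


def format_line(entry):
    """Render one entry as 'Rank (original rank) Word'."""
    new_rank, original_rank, word = entry
    return f"{new_rank} ({original_rank}) {word}"


if __name__ == "__main__":
    sample = ["the", "The", "of", "don't", "AND"]
    for entry in sanitize(sample):
        print(format_line(entry))
```

With this sketch, a duplicate such as "The" and a non a-z entry such as "don't" disappear from the output, while the original rank of each surviving word is preserved in parentheses.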
1 (1) the