@amake
Last active September 25, 2015 12:40
Generate a list of Unicode codepoints that are English words
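# Python 2: collect four-letter dictionary words spelled only with the hex digits A-F.
with open('/usr/share/dict/words') as infile:
    words = infile.read().split()
hex_words = [w.upper() for w in words
             if len(w) == 4 and all(l in 'ABCDEFabcdef' for l in w)]
# Print each word as a codepoint label followed by the character it names.
print('\n'.join([u'U+%s %s' % (w, unichr(int(w, base=16)))
                 for w in sorted(hex_words)]).encode('utf-8'))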

amake commented Sep 11, 2015

Output:

U+ABAC ꮬ
U+ABBA ꮺ
U+ABED ꯭
U+ACCA 곊
U+ADAD 궭
U+ADAD 궭
U+ADDA 귚
U+ADDA 귚
U+AFFA 꿺
U+BABA 몺
U+BABE 몾
U+BADE 뫞
U+BAFF 뫿
U+BEAD 뺭
U+BEEF 뻯
U+CABA 쪺
U+CACA 쫊
U+CADE 쫞
U+CEDE 컞
U+DABB
U+DACE
U+DADA
U+DADA
U+DADE
U+DAFF
U+DEAD
U+DEAF
U+DEED
U+ECAD 
U+ECCA 
U+EDDA 
U+EDEA 
U+FABA 諸
U+FACE 龜
U+FADE 﫞
U+FAFF 﫿
U+FEED ﻭ

(The actual U+Dxxx characters are omitted because those codepoints are surrogates.)
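
For anyone running this on Python 3, where a lone surrogate cannot be encoded as UTF-8, the note above could be handled in code rather than by hand. The following is only a sketch, not the gist's method: the helper names (`is_hex_word`, `is_surrogate`, `format_line`) and the small `sample` list are made up for illustration, and the sample stands in for /usr/share/dict/words.

```python
def is_hex_word(word):
    """True for four-letter words spelled only with the hex digits A-F."""
    return len(word) == 4 and all(c in 'ABCDEFabcdef' for c in word)


def is_surrogate(codepoint):
    """True for U+D800..U+DFFF, which are reserved for UTF-16 surrogates."""
    return 0xD800 <= codepoint <= 0xDFFF


def format_line(word):
    """Return 'U+XXXX c', leaving the character off for surrogates."""
    label = word.upper()
    codepoint = int(label, 16)
    if is_surrogate(codepoint):
        return 'U+%s' % label
    return 'U+%s %s' % (label, chr(codepoint))


# Hypothetical sample input; the gist reads /usr/share/dict/words instead.
sample = ['dead', 'beef', 'face', 'hello', 'feed']
lines = sorted(format_line(w) for w in sample if is_hex_word(w))
output = '\n'.join(lines)  # pass to print() to display
```

With this filter, DEAD is listed as a bare label while BEEF, FACE, and FEED keep their characters, which matches the shape of the output above.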


amake commented Sep 25, 2015

The only hexspeak words I was aware of before making this gist were 0xDEADBEEF and 0xCAFEBABE. The gist output does contain DEAD, BEEF, and BABE, but CAFE is not present because /usr/share/dict/words doesn't contain "cafe" or "café" for whatever reason.
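
The check described above can be sketched as follows. This is an illustrative snippet only: `hexspeak_parts` is a made-up helper, and `found` is typed in by hand from the words this comment says appear in the output rather than computed from the dictionary.

```python
def hexspeak_parts(value, width=4):
    """Split an integer's uppercase hex form into width-character chunks."""
    digits = '%X' % value
    return [digits[i:i + width] for i in range(0, len(digits), width)]


# Words this comment reports as present in the gist output (entered by hand).
found = {'DEAD', 'BEEF', 'BABE'}

# Map each constant to the halves that are absent from the output.
missing = {}
for constant in (0xDEADBEEF, 0xCAFEBABE):
    parts = hexspeak_parts(constant)
    missing[constant] = [p for p in parts if p not in found]
```

Here `missing[0xCAFEBABE]` ends up holding only CAFE. Note that "cafe" would pass the gist's filter if the word list contained it, whereas "café" would not, since "é" is not one of the letters A-F.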
