@amundo
Created June 11, 2012 03:16
look up names of characters in text
#!/usr/bin/env python
"""
pathall@gmail.com
Do Whatever the Fuck You Want To License
http://sam.zoy.org/wtfpl/
letters - pipe some UTF-8 text into this script, and it will print
the Unicode name of each character in the text that has one.
Characters without a name (such as newlines) are skipped.
"""
import sys
from unicodedata import name
import codecs
sys.stdout = codecs.getwriter('utf-8')(sys.stdout)
content = sys.stdin.read().strip()
text = content.decode('utf-8')
for letter in text:
    try:
        uniname = name(letter)
    except ValueError:
        continue
    print letter, uniname, "(U+%.4X)" % ord(letter), 'u"\\u' + "%.4X" % ord(letter) + '"'
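The script above is Python 2 (print statement, `str.decode`, `codecs.getwriter` wrapped around `sys.stdout`). What follows is a minimal sketch of the same loop for Python 3, not the author's code. It assumes Python 3.7+ for `sys.stdout.reconfigure`. The names `describe` and `main` are introduced here for illustration. The `\U` branch for code points above U+FFFF is also an addition, since a `\u` escape only holds four hex digits.

```python
#!/usr/bin/env python3
import sys
from unicodedata import name


def describe(text):
    """Yield one output line per character in text that has a Unicode name."""
    for letter in text:
        try:
            uniname = name(letter)
        except ValueError:
            # Unnamed characters (e.g. "\n" and other controls) are skipped,
            # as in the original script.
            continue
        code = ord(letter)
        # Assumption beyond the original: use \UXXXXXXXX above U+FFFF,
        # because \uXXXX cannot express those code points.
        escape = "\\u%.4X" % code if code <= 0xFFFF else "\\U%.8X" % code
        yield '%s %s (U+%.4X) u"%s"' % (letter, uniname, code, escape)


def main():
    # Decode stdin explicitly as UTF-8 instead of relying on the locale.
    text = sys.stdin.buffer.read().decode("utf-8").strip()
    # Force UTF-8 output, the Python 3 counterpart of codecs.getwriter.
    sys.stdout.reconfigure(encoding="utf-8")
    for line in describe(text):
        print(line)


if __name__ == "__main__":
    main()
```

Usage is unchanged: pipe UTF-8 text into the script, for example `echo "añ" | python3 letters.py`, where `letters.py` is just a placeholder file name.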