@taoy
Created December 16, 2015 01:08
Python (2.7): Normalize a Japanese text file with NFKC (hankaku / zenkaku)
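import unicodedata

# Read the source file as raw bytes (Python 2.7: read() returns a byte string).
filename = 'somefilename.txt'
f = open(filename, 'rb')
r = f.read()
f.close()

# Decode the UTF-8 bytes to unicode, then apply NFKC normalization:
# half-width katakana become full-width, full-width ASCII becomes half-width.
ur = r.decode('utf-8')
n = unicodedata.normalize('NFKC', ur)

# Encode the normalized text back to UTF-8 and write it out.
outfile = 'outfile.txt'
f = open(outfile, 'wb')
f.write(n.encode('utf-8'))
f.close()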
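
To make the effect of NFKC concrete, here is a minimal sketch of the same idea written for Python 3 (an assumption; the script above targets 2.7). The helper names `normalize_text` and `normalize_file` are hypothetical and not part of the original gist, and the sample strings are only illustrations.

```python
import io
import unicodedata


def normalize_text(text):
    """Return the NFKC-normalized form of a Unicode string."""
    return unicodedata.normalize('NFKC', text)


def normalize_file(src, dst):
    """Read src as UTF-8, NFKC-normalize it, and write the result to dst.

    Hypothetical helper: io.open decodes and encodes for us, so no
    explicit .decode()/.encode() calls are needed.
    """
    with io.open(src, 'r', encoding='utf-8') as f:
        text = f.read()
    with io.open(dst, 'w', encoding='utf-8') as f:
        f.write(normalize_text(text))


# Half-width katakana "ｶﾞｲﾄﾞ" (with separate voiced sound marks).
half_kana = '\uff76\uff9e\uff72\uff84\uff9e'
# Full-width alphanumerics "ＡＢＣ１２３".
full_alnum = '\uff21\uff22\uff23\uff11\uff12\uff13'

# NFKC maps the kana to full-width "ガイド" and the alphanumerics to ASCII.
normalized_kana = normalize_text(half_kana)
normalized_alnum = normalize_text(full_alnum)
```

Calling `normalize_file('somefilename.txt', 'outfile.txt')` would mirror the script above. Note that NFKC also folds other compatibility characters (for example circled digits and ligatures), so it may change more than just character width.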