@hktechn0
Created November 10, 2009 17:01
Convert a UTF-8 byte sequence packed into an int or long (e.g. 0xE38182) to a Unicode character in Python.
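try:
    unichr  # Python 2 built-in
except NameError:
    unichr = chr  # Python 3: chr() covers all code points

def utf2ucs(utf):
    if utf & 0x80:
        # multibyte: collect the low 6 bits of each continuation byte
        # (10xxxxxx), starting from the last byte
        buf = []
        while not (utf & 0x40):
            if not utf:
                # ran out of bytes without finding a lead byte (11xxxxxx)
                raise ValueError("malformed UTF-8 sequence")
            buf.append(utf & 0x3f)
            utf >>= 8
        # lead byte: its payload loses one bit per continuation byte
        buf.append(utf & (0x3f >> len(buf)))
        # reassemble the code point, most significant group first
        ucs = 0
        while buf:
            ucs <<= 6
            ucs += buf.pop()
    else:
        # ascii
        ucs = utf
    return unichr(ucs)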
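
For comparison, the same conversion can be sketched with the standard UTF-8 codec instead of manual bit shifting. This is a minimal sketch assuming Python 3 (it relies on `int.to_bytes` and `int.from_bytes`); the names `utf2ucs_builtin` and `ucs2utf` are hypothetical helpers for illustration and are not part of the gist. Unlike the function above, the codec rejects malformed sequences with `UnicodeDecodeError`.

```python
def utf2ucs_builtin(utf):
    """Decode UTF-8 bytes packed into an int using the standard codec."""
    # Number of bytes needed to hold the value; at least one so 0 maps to U+0000.
    length = max(1, (utf.bit_length() + 7) // 8)
    return utf.to_bytes(length, "big").decode("utf-8")


def ucs2utf(char):
    """Inverse direction: pack the UTF-8 bytes of one character into an int."""
    return int.from_bytes(char.encode("utf-8"), "big")


# Round-trip a 1-, 2-, 3- and 4-byte sequence.
for packed in (0x41, 0xC3A9, 0xE38182, 0xF09F9880):
    char = utf2ucs_builtin(packed)
    assert ucs2utf(char) == packed
    print(hex(packed), "->", "U+%04X" % ord(char))
# 0x41 -> U+0041
# 0xc3a9 -> U+00E9
# 0xe38182 -> U+3042
# 0xf09f9880 -> U+1F600
```

The round trip through `ucs2utf` is a convenient way to generate test inputs for `utf2ucs` from ordinary characters.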