@xiaohk
Created February 21, 2017 01:43
How Python 3 deals with Unicode characters and code points
# 'U+5EB8' is ordinary text, not an escape sequence
zh = 'U+5EB8'
print(zh) # U+5EB8
zh = '\u5EB8'
print(zh) # 庸
# The doubled backslash makes this a literal backslash followed by u5EB8
zh = '\\u5EB8'
print(zh) # \u5EB8
# str has no decode() in Python 3; the next line is left commented out because it raises
# print(zh.decode('unicode-escape')) # AttributeError: 'str' object has no attribute 'decode'
print(zh.encode('ascii').decode('unicode-escape')) # 庸
zh = b'\\u5EB8'
print(zh.decode('unicode-escape')) # 庸
zh = '庸'
print(zh) # 庸
print(zh.encode('utf-8')) # b'\xe5\xba\xb8'
print(zh.encode('unicode-escape')) # b'\\u5eb8'
print(zh.encode('unicode-escape').decode("ascii")) # \u5eb8
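
The snippet above shows that the text 'U+5EB8' is just a string, and that a literal backslash escape has to go through bytes before it can be decoded. A minimal sketch of small helpers for both conversions follows. The function names `from_u_notation`, `to_u_notation`, and `unescape` are hypothetical and not part of the original gist; only the built-ins `chr`, `ord`, `int`, `str.encode`, and `bytes.decode` are relied on. `unescape` assumes its input is ASCII-only, since `encode('ascii')` raises `UnicodeEncodeError` otherwise.

```python
def from_u_notation(label: str) -> str:
    """Turn a label such as 'U+5EB8' into the character it names (hypothetical helper)."""
    if not label.upper().startswith("U+"):
        raise ValueError(f"expected 'U+XXXX', got {label!r}")
    # The part after 'U+' is the code point in hexadecimal.
    return chr(int(label[2:], 16))


def to_u_notation(char: str) -> str:
    """Turn a single character into its 'U+XXXX' label (hypothetical helper)."""
    return f"U+{ord(char):04X}"


def unescape(text: str) -> str:
    """Decode literal backslash escapes in an ASCII-only str (hypothetical helper)."""
    # str -> bytes -> str, because only bytes objects have decode() in Python 3.
    return text.encode("ascii").decode("unicode-escape")


zh = from_u_notation("U+5EB8")
print(zh)                   # 庸
print(to_u_notation(zh))    # U+5EB8
print(unescape("\\u5EB8"))  # 庸
```

Going through `int(..., 16)` and `chr` avoids building an escape string by hand, which is why the first helper does not use the `unicode-escape` codec at all.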