@Sennahoi
Created November 17, 2014 09:33
Some basic Python 2 encoding/decoding/unicode examples
string_with_utf8 = "T\xc3\xa4ter" # byte str holding utf-8 data, not a unicode object
correct_unicode = string_with_utf8.decode("utf-8") # interpret the bytes as utf-8
print repr(correct_unicode) # u'T\xe4ter', correct unicode string
string_with_utf8_new = correct_unicode.encode("utf-8") # make a utf-8 str()
print repr(string_with_utf8_new) # equals repr(string_with_utf8), str()
#correct_unicode.encode("ascii") # UnicodeEncodeError because ascii has no representation for this char!
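#unicode("T\xc3\xa4ter") # UnicodeDecodeError: without an encoding argument, unicode() assumes ascii
bad_unicode = "T\xc3\xa4ter".decode("latin-1") # wrong codec: every byte becomes its own character
print repr(bad_unicode) # u'T\xc3\xa4ter', not interpreted as utf-8, so: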
correct_unicode = unicode("T\xc3\xa4ter", "utf-8") # make a unicode string from a utf-8 representation
print repr(correct_unicode) # u'T\xe4ter', nice!
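
The lines above are Python 2. As a rough sketch of how the same round trip might look in Python 3 (an assumption, since the original targets Python 2 only), `bytes` takes the role of `str` and `str` takes the role of `unicode`. The variable names below are illustrative and not part of the original.

```python
# Python 3 sketch of the same round trip (assumes Python 3.x; names are illustrative).
utf8_bytes = b"T\xc3\xa4ter"             # bytes holding utf-8 encoded data
text = utf8_bytes.decode("utf-8")        # bytes -> str (Unicode text)
print(ascii(text))                       # 'T\xe4ter'
round_trip = text.encode("utf-8")        # str -> bytes
print(round_trip == utf8_bytes)          # True

# ascii has no representation for U+00E4, so a strict encode raises.
try:
    text.encode("ascii")
except UnicodeEncodeError as exc:
    print("encode failed:", exc.reason)

# Decoding with the wrong codec does not raise; it silently yields mojibake.
mojibake = utf8_bytes.decode("latin-1")
print(ascii(mojibake))                   # 'T\xc3\xa4ter'

# str(bytes, encoding) is the counterpart of Python 2's unicode(str, encoding).
print(str(utf8_bytes, "utf-8") == text)  # True
```

`ascii()` is used here instead of `repr()` so that the non-ASCII character is shown escaped, as it is in the Python 2 output above.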