@dmtucker
Created August 23, 2018 06:23
$ python3 sep.py
different: cp037 b'a'
different: cp273 b'a'
different: cp424 b'a'
different: cp500 b'a'
different: cp875 b'a'
different: cp1026 b'a'
different: cp1140 b'a'
untested: cp65001
different: utf_32 b'\xff\xfe\x00\x00/\x00\x00\x00'
different: utf_32_be b'\x00\x00\x00/'
different: utf_32_le b'/\x00\x00\x00'
different: utf_16 b'\xff\xfe/\x00'
different: utf_16_be b'\x00/'
different: utf_16_le b'/\x00'
different: utf_8_sig b'\xef\xbb\xbf/'
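
The differences in the output above appear to fall into three groups: EBCDIC code pages (cp037 and relatives) map '/' to byte 0x61, which Python prints as b'a'; the UTF-16 and UTF-32 variants pad the character with zero bytes; and the BOM-carrying variants prepend a signature. A minimal sketch to confirm this, using only the standard library (the choice of which encodings to spot-check is an assumption, not part of the original script):

```python
import codecs

# EBCDIC: '/' is byte 0x61, which happens to be ASCII 'a'.
ebcdic = '/'.encode('cp037')
assert ebcdic == bytes([0x61])

# Fixed-width encodings pad the ASCII byte with zero bytes.
wide_le = '/'.encode('utf_16_le')
wide_be = '/'.encode('utf_32_be')
assert wide_le == b'/\x00'
assert wide_be == b'\x00\x00\x00/'

# utf_8_sig prepends the UTF-8 byte order mark.
with_bom = '/'.encode('utf_8_sig')
assert with_bom == codecs.BOM_UTF8 + b'/'

# Decoding with the matching codec always recovers the slash.
assert ebcdic.decode('cp037') == '/'
```
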
encodings = [
'ascii',
'big5',
'big5hkscs',
'cp037',
'cp273',
'cp424',
'cp437',
'cp500',
'cp720',
'cp737',
'cp775',
'cp850',
'cp852',
'cp855',
'cp856',
'cp857',
'cp858',
'cp860',
'cp861',
'cp862',
'cp863',
'cp864',
'cp865',
'cp866',
'cp869',
'cp874',
'cp875',
'cp932',
'cp949',
'cp950',
'cp1006',
'cp1026',
'cp1125',
'cp1140',
'cp1250',
'cp1251',
'cp1252',
'cp1253',
'cp1254',
'cp1255',
'cp1256',
'cp1257',
'cp1258',
'cp65001',
'euc_jp',
'euc_jis_2004',
'euc_jisx0213',
'euc_kr',
'gb2312',
'gbk',
'gb18030',
'hz',
'iso2022_jp',
'iso2022_jp_1',
'iso2022_jp_2',
'iso2022_jp_2004',
'iso2022_jp_3',
'iso2022_jp_ext',
'iso2022_kr',
'latin_1',
'iso8859_2',
'iso8859_3',
'iso8859_4',
'iso8859_5',
'iso8859_6',
'iso8859_7',
'iso8859_8',
'iso8859_9',
'iso8859_10',
'iso8859_11',
'iso8859_13',
'iso8859_14',
'iso8859_15',
'iso8859_16',
'johab',
'koi8_r',
'koi8_t',
'koi8_u',
'kz1048',
'mac_cyrillic',
'mac_greek',
'mac_iceland',
'mac_latin2',
'mac_roman',
'mac_turkish',
'ptcp154',
'shift_jis',
'shift_jis_2004',
'shift_jisx0213',
'utf_32',
'utf_32_be',
'utf_32_le',
'utf_16',
'utf_16_be',
'utf_16_le',
'utf_7',
'utf_8',
'utf_8_sig',
]
for encoding in encodings:
    try:
        encoded = u'/'.encode(encoding)
    except UnicodeEncodeError:
        print('unencodable:', encoding)
    except LookupError:
        print('untested:', encoding)
    else:
        if encoded != u'/'.encode('utf-8'):
            print('different:', encoding, encoded)
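
The same check can be wrapped in a function so it works for any character and returns its findings instead of printing them. This is only a sketch: the name `compare_to_utf8` and the shape of the returned dictionary are hypothetical and not part of the original script.

```python
def compare_to_utf8(char, encodings):
    """Group encodings by how they encode `char` relative to UTF-8.

    Hypothetical helper: mirrors the loop above but collects results.
    """
    reference = char.encode('utf-8')
    results = {'same': [], 'different': {}, 'unencodable': [], 'untested': []}
    for encoding in encodings:
        try:
            encoded = char.encode(encoding)
        except UnicodeEncodeError:
            # The codec exists but cannot represent the character.
            results['unencodable'].append(encoding)
        except LookupError:
            # The codec is not available in this Python build.
            results['untested'].append(encoding)
        else:
            if encoded == reference:
                results['same'].append(encoding)
            else:
                results['different'][encoding] = encoded
    return results


# 'no_such_codec' is a made-up name used to trigger the LookupError branch.
results = compare_to_utf8('/', ['ascii', 'cp037', 'utf_16_le', 'no_such_codec'])
print(results['different'])
```

Returning a dictionary keeps the classification testable and lets a caller decide how to report it.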