@gustavofonseca
Created July 13, 2016 13:54
Example of decoding byte sequences based on automatic encoding detection.
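"""
Usage:

    with open('amostra.txt', 'rb') as amostra:
        decode = make_decoder(amostra)

    with open('amostra.txt', 'rb') as dados:
        for linha in dados:
            print(decode(linha))
"""
from chardet.universaldetector import UniversalDetector


def make_decoder(data):
    """Return a function that decodes byte strings with the encoding detected in `data`.

    `data` must be an iterable of byte strings, e.g. a file opened in binary mode.
    """
    detector = UniversalDetector()
    # Feed the sample line by line and stop as soon as chardet is confident.
    for line in data:
        detector.feed(line)
        if detector.done:
            break
    detector.close()
    apparent_encoding = detector.result.get('encoding')
    if apparent_encoding is None:
        # chardet could not identify the encoding (e.g. empty sample).
        raise ValueError('could not detect the encoding of the sample')

    def decode(bstring):
        return bstring.decode(apparent_encoding)

    return decode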
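When chardet is not installed, or when it cannot settle on an encoding, the same closure pattern can be sketched with the standard library alone. This is a different technique from the one in the gist: instead of detecting the encoding statistically, it tries a fixed list of candidate encodings in order and binds the first one that decodes the sample without error. The names `make_fallback_decoder`, `candidates`, and the `encoding` attribute are hypothetical and chosen for illustration, and the default candidate list is an assumption that should be adapted to the data at hand.

```python
def make_fallback_decoder(sample, candidates=('utf-8', 'cp1252', 'latin-1')):
    """Return a decode function bound to the first candidate that fits `sample`.

    Hypothetical helper, not part of chardet: it tries each encoding in
    order rather than detecting one statistically.
    """
    for encoding in candidates:
        try:
            sample.decode(encoding)
        except UnicodeDecodeError:
            continue  # this candidate cannot represent the sample
        chosen = encoding
        break
    else:
        raise ValueError('none of the candidate encodings fits the sample')

    def decode(bstring):
        return bstring.decode(chosen)

    decode.encoding = chosen  # exposed so callers can inspect the choice
    return decode


# "decodificação" encoded as Latin-1: invalid as UTF-8, valid as cp1252.
sample = b'decodifica\xe7\xe3o'
decode = make_fallback_decoder(sample)
print(decode.encoding, decode(sample))
```

Order matters in this sketch: `latin-1` accepts any byte sequence, so it only makes sense as the last candidate, and a successful decode does not prove that the chosen encoding is the one the text was written in.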