@tyrion
Created November 4, 2014 18:15
Parse IRC logs with multiple encodings
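def decode(line, encodings):
    """Decode a bytes line, trying each encoding in order."""
    for encoding in encodings:
        try:
            return line.decode(encoding)
        except UnicodeDecodeError:
            pass
    # No encoding matched: fall back to UTF-8 and drop undecodable bytes.
    return line.decode('utf-8', 'ignore')


if __name__ == '__main__':
    # Order matters: latin1 accepts any byte sequence, so it goes last.
    encodings = ['utf-8', 'latin1']
    # Read raw bytes; write the decoded text explicitly as UTF-8 (Python 3).
    with open('error.log', 'rb') as infile, \
            open('error.out', 'a', encoding='utf-8') as outfile:
        outfile.writelines(decode(line, encodings) for line in infile)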
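A small sketch of how the fallback order behaves, assuming Python 3. The sample byte strings are made up for illustration, and `decode` is repeated from the script above only so the snippet runs on its own.

```python
def decode(line, encodings):
    # Same logic as the script above, repeated to keep this self-contained.
    for encoding in encodings:
        try:
            return line.decode(encoding)
        except UnicodeDecodeError:
            pass
    return line.decode('utf-8', 'ignore')


# Hypothetical log lines: the same text stored in two different encodings.
utf8_line = 'caf\u00e9\n'.encode('utf-8')     # b'caf\xc3\xa9\n'
latin1_line = 'caf\u00e9\n'.encode('latin1')  # b'caf\xe9\n'

# UTF-8 is tried first; the Latin-1 line fails that and falls through.
encodings = ['utf-8', 'latin1']
decoded = [decode(raw, encodings) for raw in (utf8_line, latin1_line)]

# With only UTF-8 listed, the final fallback silently drops the bad byte.
lossy = decode(latin1_line, ['utf-8'])
```

Both entries in `decoded` come back as the same text, while `lossy` loses the accented character. Latin-1 maps every byte to a character and never raises, so any encoding listed after it is never tried.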