@Jaxkr
Last active March 29, 2018 21:23
import unicodedata

# Read the raw bytes so the byte length can be compared with the character count.
with open("steam_crash.txt", 'rb') as myfile:
    message = myfile.read()
print('Length of message in bytes: ' + str(len(message)))
message = message.decode('utf-8')
print('Number of characters in message: ' + str(len(message)))

# When True, combining characters are only counted, not printed.
filter_combining_chars = True
number_combining_chars = 0
number_latin_chars = 0
for char in message:
    # Pass a default: unicodedata.name() raises ValueError for unnamed characters (e.g. '\n').
    name = unicodedata.name(char, 'UNNAMED U+%04X' % ord(char))
    if filter_combining_chars:
        if "COMBINING" not in name:
            print(name)
            number_latin_chars += 1
        else:
            number_combining_chars += 1
    else:
        print(name)
# "Latin" here means any character whose name does not contain "COMBINING".
print("Number of latin characters: " + str(number_latin_chars))
print("Number of combining characters: " + str(number_combining_chars))
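
The script above detects combining characters by looking for "COMBINING" in the Unicode name. A possible alternative, sketched here as an assumption rather than as part of the original gist, is to use the canonical combining class from `unicodedata.combining()`, which returns 0 for characters that are not combining. The helper name `count_combining` and the sample string are hypothetical. The two criteria are not identical, so the counts may differ slightly on some inputs.

```python
import unicodedata


def count_combining(text):
    """Return (non_combining, combining) character counts for text.

    A character counts as combining when its canonical combining
    class is non-zero.
    """
    combining = sum(1 for ch in text if unicodedata.combining(ch) != 0)
    return len(text) - combining, combining


# Hypothetical sample: "Z" followed by three COMBINING ACUTE ACCENT marks, then "a".
sample = "Z" + "\u0301" * 3 + "a"
print(count_combining(sample))
```

Unlike the name lookup, this check does not raise for unnamed characters such as newlines, so no default value is needed.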