@tdsymonds
Last active August 9, 2019 21:56
How to remove non-printable (control) characters from a string in Python, taken from: https://www.tutorialspoint.com/How-to-trim-down-non-printable-characters-from-a-string-in-Python
import re
import sys
import unicodedata

# Enumerate every Unicode code point
all_chars = (chr(i) for i in range(sys.maxunicode + 1))
# Keep only the control characters (Unicode category 'Cc')
control_chars = ''.join(c for c in all_chars if unicodedata.category(c) == 'Cc')
# Build a character-class regex that matches any of those characters
control_char_re = re.compile('[%s]' % re.escape(control_chars))

# Replace every control character in s with the empty string
def remove_control_chars(s):
    return control_char_re.sub('', s)

print(remove_control_chars('\x00\x01String'))
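For short inputs, the same filtering can be sketched without scanning every code point up front, by checking each character's category directly. This is only an illustrative alternative to the regex approach in the gist; the function name `strip_cc` is hypothetical and does not appear in the original.

```python
import unicodedata

def strip_cc(s):
    """Drop every character whose Unicode category is 'Cc' (control)."""
    return ''.join(c for c in s if unicodedata.category(c) != 'Cc')

print(strip_cc('\x00\x01String'))  # prints: String
```

The precompiled regex is likely the better choice when many strings are processed, since its setup cost is paid once; the per-character check avoids that setup entirely.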
@Gatsby-Lee

Here is similar code, with a comment noting that this approach seems risky because valid characters can be removed.

https://stackoverflow.com/questions/92438/stripping-non-printable-characters-from-a-string-in-python
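
One concrete case of that risk: tab, newline, and carriage return all belong to category 'Cc', so a filter on that category strips them too. A minimal sketch of a variant that preserves them, assuming those three are the only control characters worth keeping (the names `KEEP` and `remove_control_chars_keep_ws` are hypothetical):

```python
import unicodedata

# Assumption: tab, newline and carriage return are the control
# characters that should survive; adjust this set as needed.
KEEP = {'\t', '\n', '\r'}

def remove_control_chars_keep_ws(s):
    """Remove 'Cc' characters except those listed in KEEP."""
    return ''.join(
        c for c in s
        if c in KEEP or unicodedata.category(c) != 'Cc'
    )

text = 'line1\n\tline2\x00\x07'
print(repr(remove_control_chars_keep_ws(text)))  # 'line1\n\tline2'
```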
