@johnpena
Created February 8, 2011 20:00
Safely/sanely transforms a string from unicode to ascii.
import unicodedata

def normalize(s):
    """ Safely/sanely transforms a string from unicode to ascii. """
    # Python 2: decompose accented characters (NFKD), then drop anything non-ASCII.
    return unicodedata.normalize('NFKD', unicode(s)).encode('ascii', 'ignore')

def sanitize(s, replace_with=None):
    """ Replace control chars, non-ASCII chars, and whitespace with replace_with (default '?'). """
    if not replace_with:
        replace_with = '?'
    # Keep only printable ASCII strictly between space (32) and DEL (127).
    return ''.join(c if (32 < ord(c) < 127) else replace_with for c in s)
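
The functions above call `unicode`, which exists only in Python 2. As a rough sketch of how the same idea might look on Python 3, the variant below uses `str` instead and decodes the result back to text. The names `normalize_py3` and `sanitize_py3` are hypothetical and not part of the gist, and returning `str` rather than bytes is an assumption about what a caller would want.

```python
import unicodedata


def normalize_py3(s):
    """Python 3 sketch: decompose with NFKD, then drop non-ASCII code points."""
    decomposed = unicodedata.normalize('NFKD', str(s))
    # encode() returns bytes on Python 3, so decode back to str. This return
    # type is an assumption; the original returns a Python 2 byte string.
    return decomposed.encode('ascii', 'ignore').decode('ascii')


def sanitize_py3(s, replace_with='?'):
    """Replace anything outside printable, non-space ASCII with replace_with."""
    return ''.join(c if 32 < ord(c) < 127 else replace_with for c in s)


print(normalize_py3('caf\u00e9 na\u00efve'))  # accents are stripped: cafe naive
print(sanitize_py3('a b\tc\u00e9'))           # space, tab, and e-acute become '?'
```

One difference worth noting: because this sketch uses `'?'` as a plain default argument, passing `replace_with=''` deletes the offending characters, whereas the original falls back to `'?'` for any falsy value.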