@Capriatto
Forked from j4mie/normalise.py
Created April 1, 2016 23:37
Normalise (normalize) unicode data in Python to remove umlauts, accents etc.
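# -*- coding: utf-8 -*-
"""Normalise (normalize) unicode data in Python to remove umlauts, accents etc."""
import unicodedata

data = u'naïve café'
# NFKD splits each accented character into a base letter plus combining marks;
# encoding to ASCII with 'ignore' then drops the marks and any other non-ASCII.
normal = unicodedata.normalize('NFKD', data).encode('ASCII', 'ignore').decode('ASCII')
print(normal)
# prints "naive cafe"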
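One caveat worth illustrating: encoding to ASCII with 'ignore' silently discards any character that has no ASCII decomposition, such as 'ß'. The sketch below wraps the gist's approach in a helper and compares it with a variant that removes only combining marks. The function names `to_ascii` and `strip_accents` are hypothetical and not part of the original gist, and the code assumes Python 3.

```python
import unicodedata


def to_ascii(text):
    """Same approach as the gist: decompose, then drop everything non-ASCII."""
    decomposed = unicodedata.normalize('NFKD', text)
    return decomposed.encode('ascii', 'ignore').decode('ascii')


def strip_accents(text):
    """Hypothetical variant: drop only combining marks, keep other characters."""
    decomposed = unicodedata.normalize('NFKD', text)
    return ''.join(ch for ch in decomposed if not unicodedata.combining(ch))


print(to_ascii('naïve café'))       # naive cafe
print(to_ascii('Straße'))           # Strae  ('ß' has no decomposition, so it is dropped)
print(strip_accents('Straße'))      # Straße ('ß' is kept because it is not a combining mark)
```

Which variant fits depends on whether the output must be pure ASCII (for example, URL slugs) or only accent-free.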