@bcse
Created June 19, 2013 19:57
Remove all non-BMP characters (UTF-16 surrogate pairs) with a regular expression
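import re

# Assumes a narrow (UCS-2) Python 2 build, where a non-BMP character is stored as a surrogate pair.
txt = u'\U0001f600'
r = re.compile(u'[\uD800-\uDBFF][\uDC00-\uDFFF]')  # high surrogate followed by low surrogate
result = r.sub(u'', txt)  # the pair is removed on a narrow build, leaving u''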
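
The surrogate-pair pattern only matches when the string actually holds surrogates, which is the case on narrow Python 2 builds. On Python 3 (and wide Python 2 builds) a string is a sequence of code points, so `u'\U0001f600'` is a single character and that pattern leaves it untouched. A minimal sketch of the same idea for Python 3.3+, matching the non-BMP code point range directly; the names `NON_BMP` and `strip_non_bmp` are made up for illustration and are not part of the original gist:

```python
import re

# Any code point outside the Basic Multilingual Plane (U+10000 through U+10FFFF).
NON_BMP = re.compile('[\U00010000-\U0010FFFF]')


def strip_non_bmp(text):
    """Return text with every non-BMP character removed."""
    return NON_BMP.sub('', text)


if __name__ == '__main__':
    txt = 'hi \U0001f600 there'
    print(repr(strip_non_bmp(txt)))
```

Characters inside the BMP, such as accented Latin letters or most CJK ideographs, pass through unchanged; only code points that UTF-16 would encode as a surrogate pair are dropped.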