@unhammer
Created October 13, 2011 09:07
Decode HTML entities read from stdin (UTF-8 input and output)
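#!/usr/bin/env python2
# Decode HTML entities read from stdin and write the result to stdout (Python 2 only).
import sys, codecs
import HTMLParser

# Wrap the byte streams so that reads and writes use UTF-8 unicode strings.
sys.stdin = codecs.getreader('utf-8')(sys.stdin)
sys.stdout = codecs.getwriter('utf-8')(sys.stdout)
sys.stderr = codecs.getwriter('utf-8')(sys.stderr)

h = HTMLParser.HTMLParser()
for line in sys.stdin:
    # Each line already ends with a newline; write() avoids the extra one print would add.
    sys.stdout.write(h.unescape(line))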
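
The script above targets Python 2. In Python 3 the `HTMLParser` module was renamed to `html.parser` and its `unescape` method was later removed, so a rough equivalent can be sketched with the standard library's `html.unescape` instead. The function name `decode_stream` and the in-memory sample data are illustrative assumptions, not part of the original gist.

```python
#!/usr/bin/env python3
import html
import io


def decode_stream(src, dst):
    """Copy text from src to dst, replacing HTML entities with characters."""
    for line in src:
        # html.unescape handles named (&amp;) and numeric (&#233;) references.
        # Each line keeps its own trailing newline, so write() adds nothing.
        dst.write(html.unescape(line))


# Small in-memory demonstration (hypothetical sample input).
demo_in = io.StringIO("a &amp; b &lt;c&gt;\ncaf&eacute; &#233;\n")
demo_out = io.StringIO()
decode_stream(demo_in, demo_out)
```

To use this as a stdin filter like the original, one option is to call `decode_stream(sys.stdin, sys.stdout)` after `import sys`. On Python 3.7 or later, `sys.stdin.reconfigure(encoding='utf-8')` and `sys.stdout.reconfigure(encoding='utf-8')` can force UTF-8 regardless of locale, which takes the place of the `codecs` wrappers in the Python 2 version.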