Created November 24, 2008 10:33
pipe fiddling: (1) kill buffering (2) output redir kills stdout encoding, so force it
# Pipe-oriented I/O in Python (2.x). This is harder than it should be.
import codecs
import os
import sys

# (1) Kill stdout buffering. Makes redirects and tee easier to use.
if "<fdopen>" not in str(sys.stdout): sys.stdout = os.fdopen(1,'w',0)

# (2) Encoding madness. Note codecs.open() isn't available to us since we're using pipes.
sys.stdout = codecs.EncodedFile(sys.stdout,'utf-8','utf-8','ignore')
# or this too .. sys.stdout = codecs.getwriter('utf-8')(sys.stdout)

# I'm interested in safely handling potentially garbled input data, so I want to protect stdin.
# You'd think this would work (left commented out because it doesn't):
# sys.stdin = codecs.EncodedFile(sys.stdin,'utf-8','utf-8','ignore')
# But it fails on the line: "Smokey Joe's Caf\xe9\n" (not valid UTF-8; \xe9 is a Latin-1 e-acute)
# It thought the newline wasn't a newline and blended into the next line. wtf.
# So, if line-oriented input is what you want (usually is for me):
for line in codecs.iterdecode(sys.stdin,'utf-8','ignore'):
    print line,
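The snippet above is Python 2 only (`print line,` and `os.fdopen(1,'w',0)`). For comparison, here is a minimal sketch of the same idea in Python 3, where the byte-level streams are reachable as `sys.stdin.buffer` and `sys.stdout.buffer`. The helper names `decode_lines` and `copy_lines` are made up for illustration and are not part of the original gist or of any library. Treat it as one possible approach under those assumptions, not the canonical one.

```python
import sys

def decode_lines(byte_stream, encoding="utf-8", errors="ignore"):
    """Yield text lines from a binary stream, silently dropping undecodable bytes.

    Splitting happens at the byte level (on b"\n") before decoding,
    so a stray invalid byte cannot swallow the newline that follows it.
    """
    for raw in byte_stream:
        yield raw.decode(encoding, errors)

def copy_lines(src, dst, encoding="utf-8"):
    """Copy lines from binary stream src to binary stream dst.

    Each line is re-encoded explicitly and flushed right away, so the
    output does not depend on the terminal's encoding or on pipe buffering.
    """
    for line in decode_lines(src, encoding):
        dst.write(line.encode(encoding))
        dst.flush()

# Hypothetical usage as a filter script:
#     copy_lines(sys.stdin.buffer, sys.stdout.buffer)
```

Decoding one line at a time keeps the line boundaries from the raw bytes, which is the property the `codecs.iterdecode` loop relies on.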