@leafstorm
Created November 12, 2014 17:52
Python script for decoding NCSBE voter data
#!/usr/bin/env python3
"""
streamncv.py
============
Translates latin-1 tab-separated values read from stdin into UTF-8
comma-separated values on stdout, streaming row by row rather than loading the
entire file into memory.
It also prints a running row count to stderr every 1000 rows.
"""
import csv
import io
import sys
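# Decode stdin as latin-1; newline='' leaves line-ending handling to the csv module.
latin1stdin = io.TextIOWrapper(sys.stdin.buffer, encoding='latin1', newline='')
reader = csv.reader(latin1stdin, dialect='excel-tab')
# Force UTF-8 output instead of relying on the locale default (requires Python 3.7+).
sys.stdout.reconfigure(encoding='utf-8', newline='')
writer = csv.writer(sys.stdout, dialect='excel')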
n = 0
for row in reader:
    writer.writerow([c.strip() for c in row])
    n += 1
    if n % 1000 == 0:
        sys.stderr.write("\r{}".format(n))
        sys.stderr.flush()
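
For trying the conversion on a small sample without piping a real file through stdin, the same streaming approach can be sketched as a function over binary file objects. This is only an illustration, not part of the original script: the name `convert` and the sample bytes are hypothetical, and progress reporting is left out for brevity.

```python
import csv
import io


def convert(src, dst):
    """Stream latin-1 TSV bytes from src into UTF-8 CSV bytes on dst.

    src and dst are binary file objects (hypothetical helper, not in the
    original script). Returns the number of rows written.
    """
    # newline='' lets the csv module handle line endings on both sides.
    text_in = io.TextIOWrapper(src, encoding="latin1", newline="")
    text_out = io.TextIOWrapper(dst, encoding="utf-8", newline="")
    reader = csv.reader(text_in, dialect="excel-tab")
    writer = csv.writer(text_out, dialect="excel")
    n = 0
    for row in reader:
        # Same per-cell whitespace stripping as the script above.
        writer.writerow([c.strip() for c in row])
        n += 1
    text_out.flush()
    # Detach so the wrappers do not close the caller's streams.
    text_in.detach()
    text_out.detach()
    return n


# Made-up sample: one row containing a latin-1 encoded "e acute" (0xE9).
sample = io.BytesIO(b"Jos\xe9 \tRaleigh\n")
result = io.BytesIO()
rows = convert(sample, result)
print(rows, result.getvalue())
```

Working on binary streams keeps both encodings explicit, so the result does not depend on the terminal's locale.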