@myersjustinc
Created November 12, 2012 16:25
Rendering a CSV column as JSON

This script was developed to take a column of a CSV and render it as JSON. Because it was meant for U.S. choropleth maps, it makes a few assumptions about the input CSV's structure--specifically that it has at least three columns:

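  • FIPS contains a five-digit FIPS 6-4 county code for rows with county-level data. For rows with state-level data, a two-digit FIPS 5-2 state code is used by convention. The script left-pads this value with zeros to five characters and treats any result that starts with 00 as a state-level row, so the state code itself never appears in the output.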

  • Name contains the common English name for the area. This is used for rows with state-level data and ignored for rows with county-level data.

  • Any other column is treated as a data column that can be rendered as JSON.
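
As a rough illustration of that layout, the sketch below parses a small made-up CSV (the `Population` column and all values are placeholders, not real data) and derives a key for each row the way the script does: left-pad the FIPS value to five characters and treat anything that then starts with `00` as a state-level row. It is written for Python 3, and `key_for_row` is a hypothetical helper name.

```python
import csv
import io

# Made-up sample input; the column name "Population" and all values are placeholders.
SAMPLE_CSV = """FIPS,Name,Population
06,California,100
06037,Los Angeles County,40
1001,Autauga County,5
"""


def key_for_row(row):
    """Return the state name for state rows, else the zero-padded county FIPS code."""
    fips = row["FIPS"].rjust(5, "0")
    # A two-digit state code pads to "000xx", so a leading "00" marks a state row.
    if fips.startswith("00"):
        return row["Name"]
    return fips


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
keys = [key_for_row(row) for row in rows]
print(keys)  # ['California', '06037', '01001']
```

Note that the third row's FIPS value has lost its leading zero, as often happens when a spreadsheet treats the column as numeric; the padding step restores it.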

The output JSON keeps the original CSV's string representation of each data value, so numbers come out as quoted strings. The file will therefore probably need a regex-based find-and-replace (or similar post-processing) before it is convenient to use. Counties are keyed by FIPS code (as found in the FIPS column), and states are keyed by name (as found in the Name column).
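
One alternative to a regex pass, sketched here under the assumption that the data values are plain integers or decimals, is to load the generated JSON and convert the values in Python before using them. `coerce_value` is a hypothetical helper, and the sample data is made up.

```python
import json


def coerce_value(text):
    """Convert a CSV string to int or float when possible; otherwise return it unchanged."""
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            continue
    return text


# Same shape as the script's output: every value is still a string.
raw = json.loads('{"California": "100", "06037": "40.5", "01001": "n/a"}')
cleaned = {key: coerce_value(value) for key, value in raw.items()}
print(json.dumps(cleaned, indent=4))
```

Values with thousands separators or currency symbols would stay as strings under this sketch and would need their own handling.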

#!/usr/bin/env python
# Python 2 script: renders one column of a CSV as a JSON object.
from csv import DictReader
import codecs
import json
import sys

INDENT_LEVEL = 4


def main(input_filename, field_name):
    output_dict = {}

    input_file = open(input_filename, 'rb')
    reader = DictReader(input_file)
    for row in reader:
        # Left-pad to five characters so codes that lost leading zeros still match.
        area_fips = row['FIPS'].rjust(5, '0')
        area_name = area_fips
        # A padded value starting with '00' is a state row: key it by name.
        if area_name[0:2] == '00':
            area_name = row['Name']
        if field_name in row:
            output_dict[area_name] = row[field_name]
        else:
            print '%s not a valid field name' % field_name
            input_file.close()
            return
    input_file.close()

    # The output file is named after the requested column.
    output_filename = field_name + '.json'
    output_file = codecs.open(output_filename, 'w', 'latin-1')
    json.dump(output_dict, output_file, indent=INDENT_LEVEL)
    output_file.close()


if __name__ == '__main__':
    if len(sys.argv) != 3:
        print "Usage: %s foo.csv field_name" % sys.argv[0]
    else:
        main(sys.argv[1], sys.argv[2])
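
The script above targets Python 2 (print statements, CSV opened in binary mode). A minimal Python 3 sketch of the same conversion might look like the following. The function name `render_column`, the explicit `output_filename` argument, the UTF-8 input encoding, and raising `ValueError` instead of printing a message are all assumptions made for illustration, not part of the original.

```python
import csv
import json

INDENT_LEVEL = 4


def render_column(input_filename, field_name, output_filename):
    """Write one CSV column as JSON keyed by county FIPS code or state name."""
    output = {}
    # newline="" is what the csv module expects for files opened in text mode.
    with open(input_filename, newline="", encoding="utf-8") as input_file:
        reader = csv.DictReader(input_file)
        # Check the header once, before reading any rows.
        if field_name not in (reader.fieldnames or []):
            raise ValueError("%s not a valid field name" % field_name)
        for row in reader:
            fips = row["FIPS"].rjust(5, "0")
            key = row["Name"] if fips.startswith("00") else fips
            output[key] = row[field_name]
    # json.dump escapes non-ASCII characters by default, so latin-1 is safe here.
    with open(output_filename, "w", encoding="latin-1") as output_file:
        json.dump(output, output_file, indent=INDENT_LEVEL)
    return output
```

A call such as `render_column('foo.csv', 'field_name', 'field_name.json')` would mirror the original command-line usage, with the file names here being placeholders.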