Skip to content

Instantly share code, notes, and snippets.

@mango314
Last active December 7, 2018 01:58
Show Gist options
  • Star 9 You must be signed in to star a gist
  • Fork 2 You must be signed in to fork a gist
  • Save mango314/7387071 to your computer and use it in GitHub Desktop.
Save mango314/7387071 to your computer and use it in GitHub Desktop.
a map of all 2166 Census Tracts of New York City in Python Matplotlib

Census Tracts of New York City

Here at PyData NYC, I heard a tutorial of how to use numpy and iPython notebooks. In a previous gist, I wrote drew all the zip codes of the Bronx in d3.js

This would be great for reproducing inforgraphics like Educational Attainment in New York City -- Brooklyn which looks a bit like a jigsaw puzzle:

Where to Obtain the Data

Unzipping the 2010 NYC Census Tracts gave me 5 files:

nyct2010.dbf  
nyct2010.prj  
nyct2010.shp  
nyct2010.shp.xml  
nyct2010.shx

Looking up Shapefile (Español) I learned how *.shp, *.shx & *.dbf extensions work.

Eventually, I found ogr2ogr to switch between shape-files and geoJSON. You can just dump the zip file online.

Possible Speed-Up

Using topoJSON by Michael Bostock, we indeed get a 90% reduction from 8MB to 800KB and a partition of the map of Census Tracts into disjoint arcs. From gis.StackExchange:

The primary advantage of TopoJSON is size. By eliminating redundancy and using a more efficent fixed-precision integer     
encoding of coordinates, TopoJSON files are often an order of magnitude smaller than GeoJSON files. The secondary advantage 
of TopoJSON files is that encoding the topology has useful applications, such as topology-preserving simplification (similar 
to MapShaper) and automatic mesh generation (as in the state-state boundaries in this example choropleth).

For the purposes of matplotlib, we can just draw the 6195 arcs. Here is an example of an arc in the TopoJSON sense:

>>> x['objects']['nyc']['geometries'][0]
{'type': 'Polygon', 'arcs': [[0, 1, 2, 3, 4]]}

The topojson file has valuable information about the scaling of our polygon

>>> x['transform']
{'translate': [913175.1090087891, 120121.8812543377], 'scale': [15.422282169623212, 15.273768652066204]}

The arcs and be posive or negatively oriented. Let's examine arc #1. Still somewhat mysterious

>>> x['arcs'][1]  
[[3190, 3474], [-5, -1], [-10, -2], [-16, -4], [-17, -4], [3, -19], [4, -19], [-14, -3], [-1, 3], [-2, 9], [-4, 7], [-50, -10], [-39, -7]]

Future Directions

There are a few good libraries that I am excited about. Leaflet or CartoDB are useful for mapping all of the world. We would like to focus on New York States. My goal is something easy to use, that produces good results. I have a few examples I am following.

  • AtlasPR
  • A map of Puerto Rico using JavaScript d3.js
  • 3 scales: municipality, barrio and island
  • State.ly
  • a CSS Font whose glyphs are US States
  • easy Chropleth maps

It makes me really happy to say NYC OpenData website has improved substantially since last time I checked!

Alternative Approach using Geopandas

During PyData NYC 2013 I met Kelsey Johrdahl who offered an alternative approach to drawing these maps.

At this point GeoPandas is in pre-alpha. You download it directly from GitHub with git clone https://github.com/kjordahl/geopandas and then install it with python setup.py install.

If you have trouble installing, look at StackOverflow http://stackoverflow.com/questions/19883315/trouble-installing-fiona-in-python-cpl-error-h-no-such-file-or-directory/

Originally, my first step was to use ogr2ogr in order to convert the .shx files into geoJSON. Looks like this can be done instead directly with Fiona. Looks like Geopandas will be handling this.

apt-get install libgdal-dev
apt-get install libjpeg62
pip install fiona
pip install geopandas

Drawing Some Shapes

Polygons come from shapely so pip install shapely as well.

>>> import shapely, geopandas, matplotlib
>>> p1 = shapely.geometry.Polygon([(0, 0), (1, 0), (1, 1)])
>>> p2 = shapely.geometry.Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
>>> p3 = shapely.geometry.Polygon([(2, 0), (3, 0), (3, 1), (2, 1)])
>>> g = geopandas.GeoSeries([p1,p2,p3])
>>> g.area
>>> g.plot()
>>> matplotlib.pyplot.show()

Drawing New York City

>>> import shapely, geopandas, matplotlib
>>> boros = geopandas.GeoDataFrame.from_file('nyct2010_13a/nyct2010.shp')
>>> boros.ix[0]
BoroCT2010                                              5000900
BoroCode                                                      5
BoroName                                          Staten Island
CDEligibil                                                 None
CT2010                                                   000900
CTLabel                                                       9
NTACode                                                    SI22
NTAName               West New Brighton-New Brighton-St. George
PUMA                                                       3903
Shape_Area                                              2497010
Shape_Leng                                             7729.017
geometry      POLYGON ((962269.1260375976562500 173705.50018...
>>> boros.plot()
<matplotlib.axes.AxesSubplot object at 0x325e5d0>
>>> matplotlib.pyplot.show()

import simplejson as json
import numpy as np
import matplotlib.pyplot as plt
bx = file("nyc.json")
bx = json.loads(bx.read())
polygons = bx['features']
censusTracts = np.array([bx['features'][k]['geometry']['coordinates'][0] for k in range(len(bx['features']))])
for x in range(len(censusTracts)):
plt.plot(np.array(censusTracts[x])[:, 0] , np.array(censusTracts[x])[:, 1] , '-')
plt.xlim(xmin=900000, xmax=1100000)
plt.ylim(ymin=120000, ymax=280000)
plt.axes().set_aspect('equal', 'datalim')
plt.show()
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment