@jwass
Last active December 29, 2015 04:02
GeoPandas Tornado Analysis

Thinking about the Turf tornado analysis from https://www.mapbox.com/blog/60-years-of-tornadoes-with-turf/ and what a similar approach would look like in GeoPandas.

The two programs take slightly different approaches to the counting. Turf loops over the counties, counting how many tornadoes fall inside each county's borders. GeoPandas performs a spatial join, first building a spatial index on the tornadoes. The joined GeoDataFrame combines the columns (properties) of both datasets, with one row per matching county-tornado pair. A groupby operation then counts the number of rows for each county.
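To make the difference between the two counting strategies concrete, here is a minimal pure-Python sketch. It is not how Turf or GeoPandas are implemented: the toy counties are hypothetical axis-aligned boxes, the tornadoes are hypothetical points, and the "join" is a plain nested loop rather than an indexed spatial join. The names `COUNTIES`, `TORNADOES`, `contains`, `count_by_loop` and `count_by_join` are made up for illustration.

```python
from collections import Counter

# Hypothetical toy data: each "county" is an axis-aligned box
# (min_x, min_y, max_x, max_y) and each "tornado" is an (x, y) point.
COUNTIES = {
    "A": (0.0, 0.0, 1.0, 1.0),
    "B": (1.0, 0.0, 2.0, 1.0),
    "C": (2.0, 0.0, 3.0, 1.0),
}
TORNADOES = [(0.2, 0.5), (0.7, 0.1), (1.5, 0.5), (5.0, 5.0)]


def contains(box, point):
    """Return True if the point lies inside the half-open box."""
    min_x, min_y, max_x, max_y = box
    x, y = point
    return min_x <= x < max_x and min_y <= y < max_y


def count_by_loop(counties, tornadoes):
    """Loop strategy: for each county, count the points inside it."""
    return {
        name: sum(contains(box, pt) for pt in tornadoes)
        for name, box in counties.items()
    }


def count_by_join(counties, tornadoes):
    """Join strategy: build (county, tornado) pairs, then group and count."""
    pairs = [
        (name, pt)
        for pt in tornadoes
        for name, box in counties.items()
        if contains(box, pt)
    ]
    counts = Counter(name for name, _ in pairs)
    # Counties with no matching pairs are absent after an inner join,
    # so fill them in with zero explicitly.
    return {name: counts.get(name, 0) for name in counties}
```

Both functions return the same per-county counts. The join version materializes every matching pair before counting, which hints at why carrying many columns through a join can get expensive.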

The Turf version is significantly faster. The most likely causes are the slow spatial join operation in GeoPandas and the fact that all columns of both inputs are carried into the join, producing a large final DataFrame. Node is probably also just much faster than Python here. Overall, Turf's speed is impressive.

import geopandas as gpd
import geopandas.tools

# Load the counties and tornadoes files into GeoDataFrames
counties = gpd.read_file('counties.json')
tornadoes = gpd.read_file('tornadoes.json')
# Represent each tornado by the centroid of its geometry
tornadoes.set_geometry(tornadoes.centroid, inplace=True)

# Perform a spatial join of counties to tornadoes. 'intersects' is the default predicate
joined = geopandas.tools.sjoin(counties.reset_index(), tornadoes, how='inner')
count = joined.groupby('index').size()  # Number of joined rows (tornadoes) per county
# Counties with no tornadoes are missing from the inner join, so fill them with 0
counties['tornadoes'] = count.reindex(counties.index, fill_value=0)

with open('result_geopandas.geojson', 'w') as f:
    f.write(counties.to_json())