jwass/README.md

## README.md

      
Thinking about the Turf tornado analysis from https://www.mapbox.com/blog/60-years-of-tornadoes-with-turf/ and what the equivalent approach looks like in GeoPandas.
The two programs take slightly different approaches to the counting. Turf loops over the counties, counting how many tornadoes fall inside each county's borders. GeoPandas performs a spatial join instead, first forming a spatial index on the tornadoes. The joined GeoDataFrame combines the columns (properties) of both datasets, with one row per matching county-tornado pair. A groupby operation then counts the number of rows for each county.
The Turf version is significantly faster. The most likely causes are the slow spatial join operation in GeoPandas and the fact that all columns are carried through the join, resulting in a large final DataFrame. Node is probably also just much faster than Python here. Overall, Turf's speed is impressive.
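
To make the difference between the two counting strategies concrete, here is a minimal standard-library sketch. It is not the Turf or GeoPandas implementation: the rectangles standing in for county polygons, the sample points, and the helper names (`contains`, `count_by_looping`, `count_by_join`) are all hypothetical, and the nested loop in the join stands in for the spatial index a real spatial join would use.

```python
from collections import Counter

# Hypothetical toy data: axis-aligned rectangles stand in for county polygons,
# given as (min_x, min_y, max_x, max_y); tornadoes are (x, y) points.
counties = {
    "A": (0, 0, 10, 10),
    "B": (10, 0, 20, 10),
    "C": (20, 0, 30, 10),
}
tornadoes = [(1, 1), (5, 5), (12, 3), (40, 40)]


def contains(box, point):
    """Return True if the point lies inside the half-open rectangle."""
    min_x, min_y, max_x, max_y = box
    x, y = point
    return min_x <= x < max_x and min_y <= y < max_y


def count_by_looping(counties, tornadoes):
    """Loop style: visit each county and count the points inside it."""
    return {
        name: sum(contains(box, pt) for pt in tornadoes)
        for name, box in counties.items()
    }


def count_by_join(counties, tornadoes):
    """Join style: build (county, tornado) pairs first, then group and count."""
    joined = [
        (name, pt)
        for pt in tornadoes
        for name, box in counties.items()
        if contains(box, pt)
    ]
    counts = Counter(name for name, _ in joined)
    # An inner join drops counties with no matches, so restore them as zero.
    return {name: counts.get(name, 0) for name in counties}


loop_counts = count_by_looping(counties, tornadoes)
join_counts = count_by_join(counties, tornadoes)
```

Both functions give the same per-county totals. The join version materializes every matching pair before counting, which is one plausible reason the joined result grows large when each row also carries all the columns from both inputs.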

  
## tornado.py
import geopandas as gpd
import geopandas.tools

# Load the counties and tornadoes files into GeoDataFrames
counties = gpd.read_file('counties.json')
tornadoes = gpd.read_file('tornadoes.json')
tornadoes.set_geometry(tornadoes.centroid, inplace=True)

# Perform a spatial join of counties to tornadoes; 'intersects' is the default predicate
joined = geopandas.tools.sjoin(counties.reset_index(), tornadoes, how='inner')
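count = joined.groupby('index').size()  # Number of joined rows (tornadoes) per county
# Counties with no tornadoes are absent from the inner join, so fill them with 0
counties['tornadoes'] = count.reindex(counties.index, fill_value=0)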

with open('result_geopandas.geojson', 'w') as f:
    f.write(counties.to_json())