Author: Sean Gillies
Version: 1.0
This document describes a GeoJSON-like protocol for geo-spatial (GIS) vector data.
# pipelinedemo.py
# Data processing pipeline demo
# Uses GeoJSON like geometries for demonstration only
from fiona import workspace
from shapely.geometry import box
from shapely import wkb
from json import dumps
import urllib2
import logging
Latency Comparison Numbers (~2012)
----------------------------------
L1 cache reference                       0.5 ns
Branch mispredict                          5 ns
L2 cache reference                         7 ns             14x L1 cache
Mutex lock/unlock                         25 ns
Main memory reference                    100 ns             20x L2 cache, 200x L1 cache
Compress 1K bytes with Zippy           3,000 ns      3 us
Send 1K bytes over 1 Gbps network     10,000 ns     10 us
Read 4K randomly from SSD*           150,000 ns    150 us   ~1GB/sec SSD
As an ex-University lecturer I could do this in class and students would enter the workforce able to make their analysis look good on a map and communicate properly… What about those who can't be bothered reading a few pages from a book or a web site that shows them some useful tips?
You understand the academic mindset, not the hacking mindset. Academics go like this:
lectures -> books -> grades -> make a good map
Hacking goes like this:
make a map -> websites -> make a better map -> make a good map
Two alternative ways.
A possible elegant way to achieve this objective.
Starting point is here:
This is a non-technical reading list for technical people.
This is a list of software you should read like a novel.
Star this Gist to indicate preference for the deeper form (with "when" and "@type") of GeoJSON-LD Time (geojson/geojson-ld#9).
Thinking about the Turf tornado analysis from https://www.mapbox.com/blog/60-years-of-tornadoes-with-turf/ and what the similar approach is in GeoPandas.
The two programs take slightly different approaches to the counting. Turf loops over the counties, counting how many tornadoes fall inside each county's borders. GeoPandas performs a spatial join - first forming a spatial index on the tornadoes. The joined GeoDataFrame combines the columns (properties) of both sets. Then a groupby operation is performed, counting the number of entries for each county.
The Turf version is significantly faster - most likely because the spatial join operation in GeoPandas is slow and because all columns are included, resulting in a large final DataFrame - and probably also because Node is simply much faster than Python here. Overall, Turf's speed is impressive.
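To make the difference between the two counting strategies concrete, here is a minimal pure-Python sketch. It is not the Turf or GeoPandas code: the counties are simplified to hypothetical axis-aligned boxes, the tornadoes are hypothetical points, and the "join" is a plain nested loop with no spatial index. It only shows that the loop style and the join-then-groupby style arrive at the same counts.

```python
from collections import Counter

# Hypothetical data: counties simplified to axis-aligned boxes
# (minx, miny, maxx, maxy); tornadoes are (x, y) points.
COUNTIES = {
    "A": (0.0, 0.0, 10.0, 10.0),
    "B": (10.0, 0.0, 20.0, 10.0),
    "C": (0.0, 10.0, 10.0, 20.0),
}
TORNADOES = [(1.0, 1.0), (2.5, 9.0), (12.0, 3.0), (15.0, 15.0), (9.9, 0.1)]


def contains(box, point):
    """Return True if the point lies inside the half-open box."""
    minx, miny, maxx, maxy = box
    x, y = point
    return minx <= x < maxx and miny <= y < maxy


def count_by_loop(counties, points):
    """Loop style: for each county, count the points inside it."""
    return {
        name: sum(contains(box, p) for p in points)
        for name, box in counties.items()
    }


def count_by_join(counties, points):
    """Join style: pair each point with its county, then group and count."""
    # The joined rows carry both the county name and the point, much as a
    # joined table carries the columns of both inputs.
    joined = [
        (name, p)
        for p in points
        for name, box in counties.items()
        if contains(box, p)
    ]
    counts = Counter(name for name, _ in joined)
    # Counties with no matching points do not appear in the joined rows,
    # so fill them in with zero.
    return {name: counts.get(name, 0) for name in counties}


if __name__ == "__main__":
    print(count_by_loop(COUNTIES, TORNADOES))
    print(count_by_join(COUNTIES, TORNADOES))
```

The join style materializes one row per matching (county, tornado) pair before counting, which is one reason a joined table can grow large when every column from both inputs is carried along.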
a core issue in implementing good policy is a sound understanding of the problem to be solved. let me unpack that a bit. policy is good if it can solve a problem; it needs to address some issue which reasonable people collectively agree is worth solving. for example, let's say building code is in place to ensure fire safety standards or to withstand environmental tragedy (say earthquake) or similar. most people would agree that saving lives otherwise lost to non-standard building implementations is a good thing. so in order to attack that problem, groups of people (non-profit/industry groups, industry, government, academics, etc) advocate, strategize, architect, pass and implement policy (standards, law, regulation, executive order etc) which solves the problems around standardization of building codes.
good policy stands the test of time, is robust to challenge, moves with opinion (popular, scientific and/or otherwise) and is high on the adoption curve; which is to say people generally want to adopt it. so for the sa