Create a gist now

Instantly share code, notes, and snippets.

What would you like to do?
A Python Protocol for Geospatial Data

Author: Sean Gillies Version: 1.0

Abstract

This document describes a GeoJSON-like protocol for geo-spatial (GIS) vector data.

Introduction

Python has a number of built-in protocols (descriptors, iterators, etc). A very simple and familiar one involves string representations of objects. The built-in str() function calls the __str__() method of its single argument. By implementing __str__(), instances of any class can be printed by any other Python program.

>>> class A(object):
...     def __str__(self):
...         return "Eh!"
...
>>> a = A()
>>> str(a)
'Eh!'
>>> "%s" % a
'Eh!'

What if we could do something like this for geo-spatial objects? It might, for example, let any object be analyzed using any other hypothetical software package like this:

>>> from some_analytic_module import as_geometry
>>> as_geometry(obj).buffer(1.0).area   # obj is a "point" of some kind
3.1365484905459389

The hypothetical as_geometry() function of the hypothetical some_analytic_module module would access relevant data of its single argument using an agreed upon method or attribute.

__geo_interface__

Following the lead of numpy's Array Interface [1], let's agree on a __geo_interface__ property. To avoid creating even more protocols, let's make the value of this attribute a Python mapping. To further minimize invention, let's borrow from the GeoJSON format [2] for the structure of this mapping.

The keys are:

type (required)
A string indicating the geospatial type. Possible values are "Feature" or a geometry type: "Point", "LineString", "Polygon", etc.
bbox (optional)
A tuple of floats that describes the geo-spatial bounds of the object: (left, bottom, right, top) or (west, south, east, north).
properties (optional)
A mapping of feature properties (labels, populations ... you name it. Dependent on the data). Valid for "Feature" types only.
geometry (optional)
The geometric object of a "Feature" type, also as a mapping.
coordinates (required)
Valid only for geometry types. This is an (x, y) or (longitude, latitude) tuple in the case of a "Point", a list of such tuples in the "LineString" case, or a list of lists in the "Polygon" case. See the GeoJSON spec for details.

Examples

First, a toy class with a point representation:

>>> class Pointy(object):
...     __geo_interface__ = {'type': 'Point', 'coordinates': (0.0, 0.0)}
...
>>> as_geometry(Pointy()).buffer(1.0).area
3.1365484905459389

Next, a toy class with a feature representation:

>>> class Placemark(object):
...     __geo_interface__ = {
...         'type': 'Feature',
...         'properties': {'name': 'Phoo'},
...         'geometry': Pointy.__geo_interface__ }
>>> from my_analytic_module import as_feature
>>> as_feature(Placemark())['properties']['name']
'Phoo'

Implementations

Python programs and packages that you have heard of – and made be a frequent user of – already implement this protocol:

Shapely

Shapely [7] provides a shape() function that makes Shapely geometries from objects that provide __geo_interface__ and a mapping() function that writes geometries out as dictionaries:

>>> from shapely.geometry import Point
>>> from shapely.geometry import mapping, shape
>>> Point(0.0, 0.0).__geo_interface__
{'type': 'Point', 'coordinates': (0.0, 0.0)}
>>> shape(Point(0.0, 0.0))
<shapely.geometry.point.Point object at 0x...>
>>> mapping(Point(0.0, 0.0))
{'type': 'Point', 'coordinates': (0.0, 0.0)}

The Shapely version of the example in the introduction is:

>>> from shapely.geometry import shape
>>> shape(obj).buffer(1.0).area
3.1365484905459389

where obj could be a geometry object from ArcPy or PySAL, or even a mapping directly:

>>> shape({'type': 'Point', 'coordinates': (0.0, 0.0)}).buffer(1.0).area
3.1365484905459389

References

[1]http://docs.scipy.org/doc/numpy/reference/arrays.interface.html
[2]https://tools.ietf.org/html/rfc7946
[3]https://desktop.arcgis.com/en/arcmap/latest/analyze/arcpy-functions/asshape.htm
[4]https://bitbucket.org/sgillies/descartes/src/f97e54f3b8d4/descartes/patch.py#cl-14
[5]http://pypi.python.org/pypi/geojson/
[6]https://pysal.readthedocs.io/en/latest/users/tutorials/shapely.html
[7]https://github.com/Toblerity/Shapely

Pretty cool. Would be good to implement this into pyqgis too.

Just added to mapnik as well: mapnik/mapnik#2009

Should geo_interface have an optional crs key?

Since we already support geometries and features, let's go all the way and optionally allow representation of the complete GeoJSON hierarchy including FeatureCollections. Any objections from current users?

ayanmukg commented Nov 7, 2014

hi.. if the point is in lat,long and i want the buffer to be in meters or kilometers, is there a way to implement that?

I am not sure if anyone is still looking at this but should geo_interface have an optional crs key? I am curious if it was omitted for a reason or was just looked over.

@sgillies: Shouldn't the coordinates returned from __geo_interface__ be a list instead of a tuple to conform to the GeoJSON spec?

aolieman commented May 8, 2015

@aronbierbaum, I don't think so. Coordinate pairs don't benefit from being stored in a mutable python type, and a tuple is an efficient choice for what we want to represent here. JSON is a serialization format, and as such is inherently immutable.

I guess what you've overlooked here is that __geo_interface__ specifies an interface, not a serialization format. The geojson package provides a way to serialize __geo_interface__ values to GeoJSON (see encoding/decoding).

@jzmiller1 I think that __geo_interface__ should have an optional crs key. However, specifying the format could be a little problematic. I have some ideas on how to do it, but I feel that a post might be a little too limited. I am creating a document to explain my ideas of adding crs to the __geo_interface__.

cleder commented May 4, 2016

https://github.com/fortyninemaps/karta also implements the __geo_interface__.

bixb0012 commented Jan 9, 2017

I think it would be good to include some additional examples that clarify how tuples and lists should be handled in the output. Although there may not be a difference in terms of processing the output, there is a difference in terms of appearance, and there seems to be some debate as to which is the "better" way to go.

For example, should MULTILINESTRING((35 35, 45 45), (5 15, 15 25)) output look like

{'type': 'MultiLineString', 'coordinates': [[(35.0, 35.0), (45.0, 45.0)], [(5.0, 15.0), (15.0, 25.0)]]}

or

{'type': 'MultiLineString', 'coordinates': (((35.0, 35.0), (45.0, 45.0)), ((5.0, 15.0), (15.0, 25.0)))}

MRigal commented Jan 10, 2017 edited

Any known minimal adapter of the geo_interface for psycopg2 to avoid using the Python2-constricted ppygis or the heavier ogr or shapely?

@bixb0012

From the spec text:

coordinates (required)
Valid only for geometry types. This is an (x, y) or (longitude, latitude) tuple in the case of a "Point", a list of such tuples in the "LineString" case, or a list of lists in the "Polygon" case. See the GeoJSON spec for details.

This reads as if you should use tuples for a coordinates only and lists for any more complicated geometry. So the first example seems like it is correct:

{'type': 'MultiLineString', 'coordinates': [[(35.0, 35.0), (45.0, 45.0)], [(5.0, 15.0), (15.0, 25.0)]]}

Implementations are a whole different matter.
python-geojson seems to use lists all the way down:

>>> import geojson
In [5]: geojson.MultiLineString(coordinates=[[(35.0, 35.0), (45.0, 45.0)], [(5.0, 15.0), (15.0, 25.0)]])

Out[5]: {"coordinates": [[[35.0, 35.0], [45.0, 45.0]], [[5.0, 15.0], [15.0, 25.0]]], "type": "MultiLineString"}

pygeoif uses tuples all the way down.

>>> import pygeoif
In [12]: g=pygeoif.MultiLineString([[(35.0, 35.0), (45.0, 45.0)], [(5.0, 15.0), (15.0, 25.0)]])
In [15]: g.__geo_interface__
Out[15]: {'coordinates': (((35.0, 35.0), (45.0, 45.0)), ((5.0, 15.0), (15.0, 25.0))), 'type': 'MultiLineString'}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment