The challenge in using Shapefiles is understanding the format of the file, which I haven't found to be well documented.
The first step is to use an interactive Python session, or better yet IPython, to explore the Shapefile. In the following example, I will look at some Sedimentary Basin data. The first step is to open the Shapefile with GDAL (which is called osgeo for some reason):
>>> import osgeo imoprt ogr, osr
>>> data_set = ogr.Open('Sedimentary_Basins/Sedimentary_Basins_of_the_United_States.shp')
Once Shapefile is open, the next step is to understand the structure of the data which is arranged into layers, features, and fields:
>>> data_set.GetLayerCount()
1
>>> layer = data_set.GetLayer(0)
>>> layer.GetFeatureCount()
144
>>> feature = layer.GetFeature(0)
>>> feature.GetFieldCount()
10
>>> for i in range(feature.GetFieldCount()):
print(i, feature.GetField(i))
0 Atlantic Coastal Plain, Florida Peninsula, New England
1 277953138.655
2 3611.979
3 Pericratonic
4 Mesozoic
5 Passive Margin (incl. Deltaic)
6 Cohee and others (1962), Frezon and Finn (1988), and BOEMRE (2010)
7 1:5,000,000
8 West Atlantic Basin
9 144
The steps that you might use to examine the data would depend on what you found along each step. If there are multiple layers you might have to look at each of those. From this information, I can see which fields are important for the task at hand.
The next step is to write a script that performs the lookups of points. There are a few points that I should mention.
The fields of interest are defined and given a name:
self.fields = [ (8, 'name'), (5, 'basin_type'), (4, 'age'), (3, 'tectonic_setting'), (6, 'source'), ]
Given longitude and latitude pair (which I have assumed to be the WGS84 coordinate system) it is important that the coordinates are transformed into the coordinate system of the Shapefile. This transformation is done by:
point = ogr.CreateGeometryFromWkt("POINT(%s %s)" % (lon, lat)) wgs84 = osr.SpatialReference() wgs84.ImportFromEPSG(4326) point.AssignSpatialReference(wgs84) point.TransformTo(self.layer.GetSpatialRef())
For a specific point, each feature in the map is tested to see if the point is contained by the feature:
self.layer.ResetReading() for i in range(self.layer.GetFeatureCount()): feature = self.layer.GetFeature(i) # Need to close the ring prior to every reading. feature.GetGeometryRef().CloseRings() if feature.GetGeometryRef().Contains(point): return [feature.GetField(index) for index, name in self.fields] else: return [None for _ in self.fields]
This is relatively simple logic. Two things worth mentioning is that rings are closed if they aren't closed within the Shapefile. Ia ran into one badly formed Shapefile so I include this by default, but it isn't always required. At the start of the loop there is a self.layer.ResetReading()
, which resets the looping over the features.