Skip to content

Instantly share code, notes, and snippets.

@chryss
Created February 28, 2014 21:18
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save chryss/9280127 to your computer and use it in GitHub Desktop.
Save chryss/9280127 to your computer and use it in GitHub Desktop.
Display the source blob
Display the rendered blob
Raw
{
"metadata": {
"name": ""
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "heading",
"level": 1,
"metadata": {},
"source": [
"Calculate distances between Iditarod checkpoints along the trail"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The following is a brief tutorial on how to solve the problem, using popular Python scientific and geospatial tools: Given the track of a race trail and the location of checkpoints, what are the distances between any two successive checkpoints. The example is the Iditarod dog sled race: the trail is approximated as a sequence of route points, or rather, the straight-line segments that connect them -- this is called a linestring. The checkpoints are given separately. It is assumed that both are available as (separate) files in ESRI Shapefile format.\n",
"\n",
"First some libraries that will be needed later."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import math\n",
"import fiona\n",
"from shapely.geometry import shape, LineString, Point\n",
"from collections import OrderedDict"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 1
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's define some helper functions. The first calculates the 3D Euclidian distance between two points in space, with the points represented as lists or tuples. I'm sure there's something usitable in the `math` library, but it's not costing us much.\n",
"\n",
"The second function does something quite generally useful. First of all, it's not a function at all, but a generator: when repeatedly called with a sequence variable or iterator as an argument, it successively returns pairs (i, i+1) of elements. For example, calling `pairs([1, 2, 3, 4])` yields successively: \n",
"1, 2 \n",
"2, 3 \n",
"3, 4 \n",
"\n",
"We will use this function to iterate through pairs of successive checkpoints. \n",
"\n",
"Last, `getnearest()` is a function that determines the point on a linestring that is closest to a given point external to the line string. Such a function is needed because the checkpoint locations aren't actually **on** the trail linestring (which is only given as a discrete list of route points with imagined straight lines between them). The linestring is here simply a sequence variable or iterator made up of objects that can be compared to the point object via an appropriate distance function. The method is simplistic, but good enough if the line strings aren't too long. The main limitation is that it won't give the correct result if the line doubles back on itself and runs sections of the trail multiple times (think of a loop or an out-and-back). It would not be hard to write a more general function, but for the Iditarod, this one is good enough."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def dist(P3D1, P3D2):\n",
" sqr = (P3D1[0] - P3D2[0])*(P3D1[0] - P3D2[0]) \n",
" sqr += (P3D1[1] - P3D2[1])*(P3D1[1] - P3D2[1])\n",
" sqr += (P3D1[2] - P3D2[2])*(P3D1[2] - P3D2[2])\n",
" return math.sqrt(sqr)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 2
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def pairs(seq):\n",
" i = iter(seq)\n",
" prev = item = i.next()\n",
" for item in i:\n",
" yield prev, item\n",
" prev = item"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 3
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def getnearest(point, line, distancefunc=None):\n",
" \"\"\"\n",
" Get nearest index and point on a line for a given point external to the line\n",
" \n",
" point: some object representing a point in space \n",
" line: iterable composed of objects of the same type as point\n",
" distancefunc: a function taking two points and returning a number\n",
" \n",
" returns: index, point on line that is nearest to the input point\n",
" \n",
" Not efficient for very long lines, because it searches along the line.\n",
" \"\"\"\n",
" if not distancefunc:\n",
" try:\n",
" distancefunc = lambda x, y: (x - y) * (x - y)\n",
" except:\n",
" # if no distance function provided and objects can't be subtracted and multiplied, use constant\n",
" distancefunc = lambda x, y: 1\n",
" # This would be inefficient for long lines, but ok for simple applications\n",
" # making sure we have an iterator; if line is an iterator, this doesn't make a difference\n",
" xline = iter(line)\n",
" # set nearest to first line element\n",
" nearest = next(xline)\n",
" nidx = 0\n",
" # iterate through the rest of the line\n",
" for idx, pt in enumerate(xline):\n",
" if distancefunc(pt, point) < distancefunc(nearest, point):\n",
" nearest = pt\n",
" nidx = idx\n",
" return nidx, nearest "
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 4
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In order to measure distances, we want to be working in a projection that operates in metres and is reasonably regular. We use Alaska Albers with the NAD83 datum (EPSG:3338) for this purpose. The KML files are converted to ESRI shapefiles and reprojected from Lat/Long (GPS coordinates with WGS 83 datum - they identify themselves as EPSG:6326). "
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"data_track = fiona.open('Iditarod_routes_AKAlbers.shp', 'r')\n",
"data_checkpoints = fiona.open('Iditarod_chkp_AKAlbers.shp', 'r')"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 5
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This makes the data accessible as Fiona Collection objects. The data_track object, however, contains records, which means the trail is provided in 3 separate lines strings. In this case, it turns out the segments are: Willow to Ophir (common to both the northern or southern Iditarod route); then the northern route Ophir to Kaltag; last the second common bit, Kaltag to Nome. "
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"print type(data_track)\n",
"print len(data_track)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"<class 'fiona.collection.Collection'>\n",
"3\n"
]
}
],
"prompt_number": 6
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Luckily, the records are all of type LineString, and they are ordered end-to-end, so that we can stitch them together. This is verified by printing the first and last data point from each and checking that a line's end point coincides (or is at least very close to) the next line's starting point. Indeed, they coincide, which simplifies matters. We can simply stitch them together and then convert the result in a shapely linestring shape object. "
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"for record in data_track:\n",
" print record['geometry']['type']\n",
" print record['geometry']['coordinates'][0], record['geometry']['coordinates'][-1]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"LineString\n",
"(201864.3672577657, 1311075.3757290987, 0.0) (-126708.67456065794, 1466405.3248210528, 0.0)\n",
"LineString\n",
"(-126709.72589937256, 1466377.6315675275, 0.0) (-227464.665379459, 1604444.2870315968, 0.0)\n",
"LineString\n",
"(-227464.665379459, 1604444.2870315968, 0.0) (-545332.8008364791, 1662252.0556106432, 0.0)\n"
]
}
],
"prompt_number": 7
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"fullinecoords = []\n",
"for line in data_track:\n",
" fullinecoords.extend(line['geometry']['coordinates'])"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 8
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we have a single list of coordinates for the Iditarod trail (northern route) that we can convert into a Shapely LineString object."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"fulline = {\n",
" 'geometry': {\n",
" 'coordinates': fullinecoords,\n",
" 'type': 'LineString' \n",
" },\n",
" 'id': 0,\n",
" 'properties': OrderedDict([(u'Name', u'Iditarod_northern'), (u'descriptio', None), (u'timestamp', None), (u'begin', None), (u'end', None), (u'altitudeMo', None), (u'tessellate', 1), (u'extrude', -1), (u'visibility', -1)]) \n",
"}\n",
"fulltrack = shape(fulline['geometry'])\n",
"fulltrack.geom_type"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 9,
"text": [
"'LineString'"
]
}
],
"prompt_number": 9
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"1. Explicitly find nearest point on track to each checkpoint/other point of interest"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The first method to calculate distances is the more manual one. It has two steps: Place every checkpoint actually *on* the track, but finding the closest track route point to it, and then summing up distances along the track."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"chkp_ontrack = {}\n",
"for checkpoint in data_checkpoints:\n",
" name = checkpoint['properties']['Name']\n",
" coords = checkpoint['geometry']['coordinates']\n",
" nearestidx, nearestpoint = getnearest(coords, fulltrack.coords, dist)\n",
" chkp_ontrack[name] = {}\n",
" chkp_ontrack[name]['idx'] = nearestidx\n",
" chkp_ontrack[name]['coords'] = nearestpoint\n",
"chkp_ontrack = OrderedDict(sorted(chkp_ontrack.items(), key=lambda item: item[1]['idx']))\n",
"print chkp_ontrack"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"OrderedDict([(u'Willow', {'coords': (201864.3672577657, 1311075.3757290987, 0.0), 'idx': 0}), (u'Yentna', {'coords': (175541.922551426, 1310422.667657584, 0.0), 'idx': 16}), (u'Skwentna', {'coords': (148273.63926342092, 1335305.5819313568, 0.0), 'idx': 69}), (u'Finger Lake', {'coords': (101143.48518480505, 1335331.4841845965, 0.0), 'idx': 81}), (u'Rainy Pass', {'coords': (66366.61204266077, 1346355.6871907613, 0.0), 'idx': 102}), (u'Rohn', {'coords': (32414.164815141416, 1369057.6355107727, 0.0), 'idx': 148}), (u'Nikolai', {'coords': (-19254.89342679152, 1450344.9621536026, 0.0), 'idx': 184}), (u'McGrath', {'coords': (-80500.13741392425, 1443687.8063012932, 0.0), 'idx': 219}), (u'Takotna', {'coords': (-104457.12558471515, 1447988.9224941842, 0.0), 'idx': 226}), (u'Ophir', {'coords': (-126743.78601976226, 1466445.5738682337, 0.0), 'idx': 241}), (u'Cripple', {'coords': (-90772.02370913935, 1554895.7141499869, 0.0), 'idx': 296}), (u'Ruby', {'coords': (-70857.30227587471, 1642094.6524683584, 0.0), 'idx': 464}), (u'Galena', {'coords': (-139023.34706871994, 1644154.2649788973, 0.0), 'idx': 512}), (u'Nulato', {'coords': (-194114.96857541226, 1643433.5101878615, 0.0), 'idx': 553}), (u'Kaltag', {'coords': (-228603.29277456927, 1604009.7401093692, 0.0), 'idx': 580}), (u'Unalakeet', {'coords': (-332360.1309182973, 1562768.0305000974, 0.0), 'idx': 600}), (u'Shaktoolik', {'coords': (-347377.80681277846, 1619099.1299162228, 0.0), 'idx': 622}), (u'Koyuk', {'coords': (-337874.247600846, 1681707.5706750432, 0.0), 'idx': 626}), (u'Elim', {'coords': (-394343.0533200941, 1652837.1998616038, 0.0), 'idx': 661}), (u'Golovin', {'coords': (-430995.2088565377, 1649786.6383352731, 0.0), 'idx': 692}), (u'White Mountains', {'coords': (-447215.0767230113, 1667242.719142894, 0.0), 'idx': 697}), (u'Safety', {'coords': (-514874.63125333586, 1653411.267474647, 0.0), 'idx': 737}), (u'Nome', {'coords': (-545209.0742850348, 1662290.4894755771, 0.0), 'idx': 767})])\n"
]
}
],
"prompt_number": 10
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"totaldist = 0\n",
"for chkpt, nextchkpt in pairs(chkp_ontrack):\n",
" idx, nextidx = chkp_ontrack[chkpt]['idx'], chkp_ontrack[nextchkpt]['idx']\n",
" partialtrack = LineString(fulltrack.coords[idx:nextidx+1])\n",
" totaldist = totaldist + partialtrack.length\n",
" print '%s --> %s' % (chkpt, nextchkpt)\n",
" print ' Distance in km: %s' % format(partialtrack.length/1000, '.1f')\n",
" print ' Distance in miles: %s' % format(partialtrack.length/1600, '.1f')\n",
"print\n",
"print 'Total distance summed up: %s km -- %s miles' % (format(totaldist/1000, '.1f'), format(totaldist/1600, '.1f'))\n",
"print 'Total distance from full track: %s km -- %s miles' % (format(fulltrack.length/1000, '.1f'), format(fulltrack.length/1600, '.1f'))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"Willow --> Yentna\n",
" Distance in km: 49.2\n",
" Distance in miles: 30.7\n",
"Yentna --> Skwentna\n",
" Distance in km: 48.7\n",
" Distance in miles: 30.4\n",
"Skwentna --> Finger Lake\n",
" Distance in km: 47.3\n",
" Distance in miles: 29.6\n",
"Finger Lake --> Rainy Pass\n",
" Distance in km: 46.8\n",
" Distance in miles: 29.2\n",
"Rainy Pass --> Rohn\n",
" Distance in km: 54.4\n",
" Distance in miles: 34.0\n",
"Rohn --> Nikolai\n",
" Distance in km: 86.5\n",
" Distance in miles: 54.1\n",
"Nikolai --> McGrath\n",
" Distance in km: 98.2\n",
" Distance in miles: 61.4\n",
"McGrath --> Takotna\n",
" Distance in km: 26.6\n",
" Distance in miles: 16.6\n",
"Takotna --> Ophir\n",
" Distance in km: 29.5\n",
" Distance in miles: 18.4\n",
"Ophir --> Cripple\n",
" Distance in km: 115.7\n",
" Distance in miles: 72.3\n",
"Cripple --> Ruby\n",
" Distance in km: 111.7\n",
" Distance in miles: 69.8\n",
"Ruby --> Galena\n",
" Distance in km: 82.3\n",
" Distance in miles: 51.5\n",
"Galena --> Nulato\n",
" Distance in km: 77.5\n",
" Distance in miles: 48.4\n",
"Nulato --> Kaltag\n",
" Distance in km: 57.1\n",
" Distance in miles: 35.7\n",
"Kaltag --> Unalakeet\n",
" Distance in km: 118.0\n",
" Distance in miles: 73.7\n",
"Unalakeet --> Shaktoolik\n",
" Distance in km: 66.0\n",
" Distance in miles: 41.2\n",
"Shaktoolik --> Koyuk\n",
" Distance in km: 68.8\n",
" Distance in miles: 43.0\n",
"Koyuk --> Elim\n",
" Distance in km: 74.0\n",
" Distance in miles: 46.2\n",
"Elim --> Golovin\n",
" Distance in km: 43.0\n",
" Distance in miles: 26.9\n",
"Golovin --> White Mountains\n",
" Distance in km: 23.7\n",
" Distance in miles: 14.8\n",
"White Mountains --> Safety\n",
" Distance in km: 78.5\n",
" Distance in miles: 49.1\n",
"Safety --> Nome\n",
" Distance in km: 35.2\n",
" Distance in miles: 22.0\n",
"\n",
"Total distance summed up: 1438.7 km -- 899.2 miles\n",
"Total distance from full track: 1439.1 km -- 899.4 miles\n"
]
}
],
"prompt_number": 11
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"OK, this looks rather reasonable. Unalakleet is misspelled, because it was misspelled in the original KML file. The first leg seems to be quite a bit shorter than the Iditarod Trail Committee says, and looking at the track, there seems to be a rather long straight line at the beginning. We're therefore underestimating the first leg by quite a bit. For the other legs, we will always underestimate them by a few percent, given that we only have less than a route point per mile on the average: 771 route points for 900+ miles. "
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"len(fulltrack.coords)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 12,
"text": [
"771"
]
}
],
"prompt_number": 12
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"2. Using the shapely API to project the waypoints on the track"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Using the Shapely library, there is a more direct way: the `project` method of a LineString object, which takes a Point object as an argument, calculates the distance from the start point to a point obtained by projecting the Point onto the LineString. This method is also more precise. "
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"waypoints = {}\n",
"import itertools\n",
"for checkpoint in data_checkpoints:\n",
" name = checkpoint['properties']['Name']\n",
" chkptshape = Point(checkpoint['geometry']['coordinates'])\n",
" a = fulltrack.project(chkptshape)\n",
" waypoints[name] = a\n",
"waypoints = OrderedDict(sorted(waypoints.items(), key=lambda x: x[1]))\n",
"print waypoints"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"OrderedDict([(u'Willow', 0.0), (u'Yentna', 50714.33033776395), (u'Skwentna', 98456.95832563685), (u'Finger Lake', 154847.81950651482), (u'Rainy Pass', 194917.18510199385), (u'Rohn', 247108.63569004112), (u'Nikolai', 350916.59367488633), (u'McGrath', 432930.69745211414), (u'Takotna', 459168.7339948695), (u'Ophir', 491825.4313515935), (u'Cripple', 605267.813128124), (u'Ruby', 714936.960658677), (u'Galena', 799741.3740328938), (u'Nulato', 875872.404119346), (u'Kaltag', 932207.0987561418), (u'Unalakeet', 1048066.017263612), (u'Shaktoolik', 1122485.1410288452), (u'Koyuk', 1187335.6547237583), (u'Elim', 1258850.0709805435), (u'Golovin', 1301511.4599364558), (u'White Mountains', 1326225.760966473), (u'Safety', 1404359.4452045814), (u'Nome', 1438842.1274170913)])\n"
]
}
],
"prompt_number": 13
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"prev = 0\n",
"for pt1, pt2 in pairs(waypoints):\n",
" print '%s --> %s' % (pt1, pt2)\n",
" print ' Distance in km: %s' % format((waypoints[pt2] - waypoints[pt1])/1000, '.1f')\n",
" print ' Distance in miles: %s' % format((waypoints[pt2] - waypoints[pt1])/1600, '.1f')\n",
"print\n",
"print 'Total distance: %s km -- %s miles' % (format(waypoints[pt2]/1000, '.1f'), format(waypoints[pt2]/1600, '.1f'))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"Willow --> Yentna\n",
" Distance in km: 50.7\n",
" Distance in miles: 31.7\n",
"Yentna --> Skwentna\n",
" Distance in km: 47.7\n",
" Distance in miles: 29.8\n",
"Skwentna --> Finger Lake\n",
" Distance in km: 56.4\n",
" Distance in miles: 35.2\n",
"Finger Lake --> Rainy Pass\n",
" Distance in km: 40.1\n",
" Distance in miles: 25.0\n",
"Rainy Pass --> Rohn\n",
" Distance in km: 52.2\n",
" Distance in miles: 32.6\n",
"Rohn --> Nikolai\n",
" Distance in km: 103.8\n",
" Distance in miles: 64.9\n",
"Nikolai --> McGrath\n",
" Distance in km: 82.0\n",
" Distance in miles: 51.3\n",
"McGrath --> Takotna\n",
" Distance in km: 26.2\n",
" Distance in miles: 16.4\n",
"Takotna --> Ophir\n",
" Distance in km: 32.7\n",
" Distance in miles: 20.4\n",
"Ophir --> Cripple\n",
" Distance in km: 113.4\n",
" Distance in miles: 70.9\n",
"Cripple --> Ruby\n",
" Distance in km: 109.7\n",
" Distance in miles: 68.5\n",
"Ruby --> Galena\n",
" Distance in km: 84.8\n",
" Distance in miles: 53.0\n",
"Galena --> Nulato\n",
" Distance in km: 76.1\n",
" Distance in miles: 47.6\n",
"Nulato --> Kaltag\n",
" Distance in km: 56.3\n",
" Distance in miles: 35.2\n",
"Kaltag --> Unalakeet\n",
" Distance in km: 115.9\n",
" Distance in miles: 72.4\n",
"Unalakeet --> Shaktoolik\n",
" Distance in km: 74.4\n",
" Distance in miles: 46.5\n",
"Shaktoolik --> Koyuk\n",
" Distance in km: 64.9\n",
" Distance in miles: 40.5\n",
"Koyuk --> Elim\n",
" Distance in km: 71.5\n",
" Distance in miles: 44.7\n",
"Elim --> Golovin\n",
" Distance in km: 42.7\n",
" Distance in miles: 26.7\n",
"Golovin --> White Mountains\n",
" Distance in km: 24.7\n",
" Distance in miles: 15.4\n",
"White Mountains --> Safety\n",
" Distance in km: 78.1\n",
" Distance in miles: 48.8\n",
"Safety --> Nome\n",
" Distance in km: 34.5\n",
" Distance in miles: 21.6\n",
"\n",
"Total distance: 1438.8 km -- 899.3 miles\n"
]
}
],
"prompt_number": 14
}
],
"metadata": {}
}
]
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment