Skip to content

Instantly share code, notes, and snippets.

@xarg
Created June 28, 2011 14:31
Show Gist options
  • Star 4 You must be signed in to star a gist
  • Fork 2 You must be signed in to fork a gist
  • Save xarg/1051247 to your computer and use it in GitHub Desktop.
Save xarg/1051247 to your computer and use it in GitHub Desktop.
Douglas-Peucker
# pure-Python Douglas-Peucker line simplification/generalization
#
# this code was written by Schuyler Erle <schuyler@nocat.net> and is
# made available in the public domain.
#
# the code was ported from a freely-licensed example at
# http://www.3dsoftware.com/Cartography/Programming/PolyLineReduction/
#
# the original page is no longer available, but is mirrored at
# http://www.mappinghacks.com/code/PolyLineReduction/
"""
>>> line = [(0,0),(1,0),(2,0),(2,1),(2,2),(1,2),(0,2),(0,1),(0,0)]
>>> simplify_points(line, 1.0)
[(0, 0), (2, 0), (2, 2), (0, 2), (0, 0)]
>>> line = [(0,0),(0.5,0.5),(1,0),(1.25,-0.25),(1.5,.5)]
>>> simplify_points(line, 0.25)
[(0, 0), (0.5, 0.5), (1.25, -0.25), (1.5, 0.5)]
"""
import math
def simplify_points (pts, tolerance):
anchor = 0
floater = len(pts) - 1
stack = []
keep = set()
stack.append((anchor, floater))
while stack:
anchor, floater = stack.pop()
# initialize line segment
if pts[floater] != pts[anchor]:
anchorX = float(pts[floater][0] - pts[anchor][0])
anchorY = float(pts[floater][1] - pts[anchor][1])
seg_len = math.sqrt(anchorX ** 2 + anchorY ** 2)
# get the unit vector
anchorX /= seg_len
anchorY /= seg_len
else:
anchorX = anchorY = seg_len = 0.0
# inner loop:
max_dist = 0.0
farthest = anchor + 1
for i in range(anchor + 1, floater):
dist_to_seg = 0.0
# compare to anchor
vecX = float(pts[i][0] - pts[anchor][0])
vecY = float(pts[i][1] - pts[anchor][1])
seg_len = math.sqrt( vecX ** 2 + vecY ** 2 )
# dot product:
proj = vecX * anchorX + vecY * anchorY
if proj < 0.0:
dist_to_seg = seg_len
else:
# compare to floater
vecX = float(pts[i][0] - pts[floater][0])
vecY = float(pts[i][1] - pts[floater][1])
seg_len = math.sqrt( vecX ** 2 + vecY ** 2 )
# dot product:
proj = vecX * (-anchorX) + vecY * (-anchorY)
if proj < 0.0:
dist_to_seg = seg_len
else: # calculate perpendicular distance to line (pythagorean theorem):
dist_to_seg = math.sqrt(abs(seg_len ** 2 - proj ** 2))
if max_dist < dist_to_seg:
max_dist = dist_to_seg
farthest = i
if max_dist <= tolerance: # use line segment
keep.add(anchor)
keep.add(floater)
else:
stack.append((anchor, farthest))
stack.append((farthest, floater))
keep = list(keep)
keep.sort()
return [pts[i] for i in keep]
if __name__ == "__main__":
import doctest
doctest.testmod()
@SteffenWiesner
Copy link

Hi,

this is great code that I have actually used a lot myself. Today - by accident - I inspected it a little closer once more and found a confusing issue:

The max_dist test in line 71 is performed only within the branch starting at line 60. In other words, if the condition in line 58 is true (next point lies "behind" current segment), the value of dist_to_seg established in line 59 is never used and the current point will never be considered as a point to keep in this iteration, regardless of its actual shortest distance to the segment.

Although the code posted here is exactly like all other occurrences of that code I found on the web, I still have the strong feeling that this is actually a bug: The check in line 71 should be outside the if-else clause starting in line 58 (i.e., should be one indent level less) and should be performed each iteration.

I believe so for the following reasons:

a) considering what the algorithm is intended for, the current scheme doesn't seem to make sense geometrically (also after reading the cited site)
b) looking closer at the original C++ code on the cited site, it can be observed that the indentation used at that point is quite confusing and might likely have led to such error during porting to python. One has to carefully inspect the curly brackets of the original code in order to see that there the corresponding check is always performed.

Or am I missing something here? Can you comment on that? I am grateful for anyone proving me wrong.

Thanks and best regards

@intdxdt
Copy link

intdxdt commented Feb 10, 2013

@stevastator: good observation, the if statement (if max_dist < dist_to_seg:) on line 71 should be outside the else block starting at line 60. line 71 should belong to the top level block of the for loop (at line 50), same level as lines 51-60. The porting should consider the curly braces in cpp instead of indentation in cpp. there is a geometrical difference if not fixed. thanks stevastator !

@2ni
Copy link

2ni commented Jul 15, 2013

Couldn't you calculate the dist_len only when needed? Ie calculate line 55 at line 59 and not before, same for 64/68. You would save some expensive sqrt calculations.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment