Skip to content

Instantly share code, notes, and snippets.

@InI4
Last active August 27, 2018 21:19
Show Gist options
  • Save InI4/a14b829a225b447cc58f3c659568bdd2 to your computer and use it in GitHub Desktop.
Save InI4/a14b829a225b447cc58f3c659568bdd2 to your computer and use it in GitHub Desktop.
A flexible Modification of the Ramer Douglas Peucker Algorithm.
<?php
/**
* An implementation of a modified Ramer-Douglas-Peucker algorithm for reducing
* the number of points on a polyline (aka polygonal chain aka track).
* https://en.wikipedia.org/wiki/Polygonal_chain
*
* For more information, see:
* http://en.wikipedia.org/wiki/Ramer%E2%80%93Douglas%E2%80%93Peucker_algorithm
*
* Main change is how to measure distance between a connection (aka line, reference)
* and intermediate points in a polyline.
*
* The orthogonal line point distance in RDP is replaced by point to point distance (aka metric)
* which is applied to a triangle. (cf. https://en.wikipedia.org/wiki/Metric_(mathematics) )
* This metric has been abstracted into below PointMetric (functional) interface.
*
*/
interface PointMetric { function d($p1, $p2); }
/*
* Arguments for this modification:
* - It empirically improves algorithm behaviour for some paths (in the common euklidean setting).
* - It generalizes easily to other geometries like the globe, 3D or weighted metrics.
* - In the 2d euklidean case the resulting line to point metric can be easily interpreted as "detour length".
* - In case of many (point) metrics (i.e. euklidean or geo distance) the resulting line to point metrics
* still has the property to drop down to zero when an inner point lies on shortest distance between outers.
* - When the (point) metric is a true metric, i.e. satisfies triangular inequality, this suffices for
* the point to line distance to be strictly positive.
* - It allows easy algorithm modification to solve the "hike track problem", i.e. even on a perfectly straight
* line, a hiker might appreciate the additional waypoint as an achievement or confirmation.
* - Despite the differences the new approach often matches RDP for easy input sets.
*
* Below GeoDistHeight metric implementation contains the major reason of entre of this particular algorithm:
*
* I simply had GPX hiking tracks where some conventional RDP based on mere geocoordinates and ignoring
* height smoothed away necessary detours to avoid steep edges. Climbers can skip this variant. With this
* RDP variant it was easy to include height in the smoothing process and give it a reasonable weight.
*/
/**
* Sample metric: distance on the sphere.
* Input must be of the form array(lat,lon). Additional array entries are ignored.
*
* NOTE: This is not working out of the box, as haversine() is not provided in this file.
*/
class GeoDist implements PointMetric {
public function d($p1, $p2) { return haversine($p1[0], $p1[1], $p2[0], $p2[1]); }
}
/**
* Sample metric: distance on the sphere, plus a height.
*
* Input must be of the form array(lat,lon,height). Additional array entries are ignored.
* By convention the height is in meters and the haversine should throw out meters too.
* The weight makes vertical difference count more than horizontal.
*
* NOTE: The calculation is not exact, but a somewhat euklidean approximation to combine the 3rd coordinate!
* NOTE: This is not working out of the box, as haversine() is not provided in this file.
* NOTE: Historically this particular metric was the main reason to derive the described RDP variant.
*
*/
class GeoDistHeight implements PointMetric {
public function d($p1, $p2) {
$d0 = haversine($p1[0], $p1[1], $p2[0], $p2[1]);
return sqrt($d0*$d0 + 10.0*pow($p1[2] - $p2[2],2));
}
}
/**
* Sample metric: distance in the plane.
*
* Input must be of the form array(x,y). Additional array entries are ignored.
*/
class Euklid2DDist implements PointMetric {
public function d($p1, $p2) { return sqrt(pow($p1[0]-$p2[0], 2)+pow($p1[1]-$p2[1], 2)); }
}
/**
* Sample metric: distance in space.
*
* Input must be of the form array(x,y,z). Additional array entries are ignored.
*/
class Euklid3DDist implements PointMetric {
public function d($p1, $p2) { return sqrt(pow($p1[0]-$p2[0], 2)+pow($p1[1]-$p2[1], 2) + pow($p1[2]-$p2[2],2)); }
}
/**
* Special case metric: euklidean.
*
* Input must be of the form array(x1,x2,..,xn) all inputs with the same n!
* NOTE: This is used as default metric in case none is given in the modRDP constructor!
*/
class EuklidDist implements PointMetric {
public function d($p1, $p2) {
$sum = 0.0;
for($i = 0; $i < count($p1); $i++) {
$sum += pow($p1[$i]-$p2[$i], 2);
}
return sqrt($sum);
}
}
/**
*
* Modified RamerDouglasPeucker.
*
* Construct with an appropriate point to point metric or get n-dimension Euklid as a default.
*/
class ModRDP
{
/**
* This must be a point to point metric capable of measuring the input data.
* Use a metric, that fits to the properties of your given data.
* */
protected $metric;
/**
* Setting this >= 0 potentially adds extra points on long straight lines.
* - A value of 0 means classic track shape approximation.
* - Values > 0 introduce extra points along long lines to keep a hiker happy.
* Use 0.0 for shape reconstruction, 0.4 for creating hiking tracks.
* */
protected $hikingFactor;
/** When no metric is given, we take Euklid. */
function __construct($pointMetric = null, $hikingFactor = 0.1) {
$this->metric = $pointMetric != null ? $pointMetric : new EuklidDist();
$this->hikingFactor = $hikingFactor;
}
/**
* Makes a "point to line distance" from 3 point to point distances.
*
* THIS IS THE DIFFERENCE FROM ORIGINAL RDP!
*
* @param $d1 point distance new point to reference 1.
* @param $d2 point distance new point to reference 2.
* @param $d0 the reference length (to avoid recalculations)
* @param $epsilon accuracy requirement
*
* @return float The distance from the point to the line.
*/
protected function combineDistances($d1, $d2, $d0, $epsilon) {
// The first term is the modified point distance metric.
// The second term is another modification to add points on long lines
// which is meant for orientation when a track is reduced to a hiking path.
return ($d1+$d2-$d0)
+ $this->hikingFactor * $d1*$d2/($d1+$d2+$d0+$epsilon);
}
/**
* Reduces the number of points on a polyline by removing those that are
* closer to the line than the distance $epsilon.
*
* @param array $pointList An array of items, where each item is one point
* on the polyline. Item distance must be measurable by the given metric!
*
* @param float $epsilon The distance threshold to use.
*
* @return array $pointList An array of items with the same format as the input
* Each point returned in the result array retains all its original data.
*/
public function simplify($pointList, $epsilon) {
if ($epsilon <= 0) {
throw new InvalidArgumentException("Non positive epsilon $epsilon.");
}
if (count($pointList) <= 2) {
return $pointList;
}
$res = $this->iSimplify($pointList, 0, count($pointList), $epsilon);
// append that last point (we skip this in the recursion to save a lot of copying).
$res[] = $pointList[count($pointList)-1];
return $res;
}
/**
* Indexed simplifying to avoid too much copying of the input array.
* (Only the outputs are getting merged together.)
* As a convention the result is always returned without the right border.
*/
protected function iSimplify(&$pointList, $start, $totalPoints, $epsilon)
{
// edge cases there are always edge cases ...
if ( $totalPoints == 1 ) return array();
$tp1 = $start + $totalPoints - 1;
if ( $totalPoints == 2 ) return array($pointList[$start]);
// Main loop scan intermediate points between start and end.
// Find the pivot point, i.e. point with the maximum distance from start and end.
$dmax = 0;
$index = 0;
$d0 = $this->metric->d($pointList[$start], $pointList[$tp1]);
for ($i = $start + 1; $i < $tp1; $i++) {
// do the raw point 2 point distances.
$d1 = $this->metric->d($pointList[$i], $pointList[$start]);
$d2 = $this->metric->d($pointList[$i], $pointList[$tp1]);
$d = $this->combineDistances($d1, $d2, $d0, $epsilon);
if ($d > $dmax) {
$index = $i;
$dmax = $d;
}
}
// If our distance is greater than epsilon, recursively simplify
if ($dmax > $epsilon) {
// another performance shortcut, these short segments happen quite frequently.
if ( $totalPoints == 3 ) return array_slice($pointList, $start, 2);
// Recursive call on each part of the polyline
$recResults1 = $this->iSimplify($pointList, $start, $index + 1 - $start, $epsilon);
$recResults2 = $this->iSimplify($pointList, $index, $totalPoints - $index + $start, $epsilon);
// Build the result list, note that the convention to skip right border saves slicing away pivot
$resultList = array_merge($recResults1, $recResults2);
} else {
$resultList = array($pointList[$start]);
}
return $resultList;
}
}
?>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment