@mgiraldo
Last active August 29, 2015 14:04
Finding shape consensus among multiple geo polygons. See: http://nbviewer.ipython.org/gist/mgiraldo/a68b53175ce5892531bc
{
"metadata": {
"language": "ruby",
"name": "",
"signature": "sha256:8d8903b40719dd38d9ab7362c4532e99994a2c72f48e9ffc0559d21728f87a34"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Finding shape consensus among multiple geo polygons\n",
"\n",
"One of the tasks in the [Building Inspector](http://buildinginspector.nypl.org/) is [fixing building footprints](http://buildinginspector.nypl.org/fix). The user is presented with a map and an overlaid shape (red dots). The goal is to draw the correct shape (or shapes, since the red overlay may cover multiple building footprints).\n",
"\n",
"Multiple people receive the same map and overlay. This notebook describes a process to find the resulting consensus (or mean) shape.\n",
"\n",
"Below is an example showing the map, the original polygon shown to each user (red dots) and the resulting polygons drawn (yellow)."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"IRuby.html '<iframe src=\"http://jsfiddle.net/mgiraldo/pdkCb/3/embedded/result/\" width=500 height=400></iframe>'"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"<iframe src=\"http://jsfiddle.net/mgiraldo/pdkCb/3/embedded/result/\" width=500 height=400></iframe>"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 1,
"text": [
"\"<iframe src=\\\"http://jsfiddle.net/mgiraldo/pdkCb/3/embedded/result/\\\" width=500 height=400></iframe>\""
]
}
],
"prompt_number": 1
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It is hard to see, but there are 11 yellow polygons: one rectangle in the lower left part, one for the upper right part (both wrong), and 9 for the complete L-shaped building.\n",
"\n",
"# Requirements\n",
"\n",
"The process to find the geometry that best summarizes what is drawn by users has to take into account that:\n",
"\n",
"1. an overlay may span _multiple_ polygons (red dots covering more than one building)\n",
"1. polygons may have any number of vertices greater than or equal to three\n",
"1. users will not always draw the polygons the same way (e.g. use more or fewer points)\n",
"\n",
"The process described in this notebook uses the [DBSCAN clustering algorithm](https://en.wikipedia.org/wiki/DBSCAN) to find an unknown number of dense regions of points and derive the resulting geometries from them. The _input_ to this process is a GeoJSON FeatureCollection containing all the polygons drawn by contributors that are associated with a given red overlay. The expected _output_ is a list of geo point arrays with the summary shapes determined by the algorithm.\n",
"\n",
"**All the necessary code is included** and should be executable on any machine that has the required Ruby gems installed. _This code was tested on Ruby 2.1.0._\n",
"\n",
"# Process\n",
"\n",
"First, we need the [RGeo](https://github.com/rgeo/rgeo) package along with its [GeoJSON component](https://github.com/rgeo/rgeo-geojson):"
]
},
{
"cell_type": "code",
"collapsed": true,
"input": [
"require 'rgeo'\n",
"require 'rgeo-geojson'"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 2,
"text": [
"true"
]
}
],
"prompt_number": 2
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We will use a [Ruby implementation](https://github.com/matiasinsaurralde/dbscan) of the [DBSCAN clustering algorithm](https://en.wikipedia.org/wiki/DBSCAN)."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"require 'dbscan'"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 3,
"text": [
"true"
]
}
],
"prompt_number": 3
},
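{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the clustering step less of a black box, here is a minimal sketch of the DBSCAN idea in plain Ruby. It is **not** the code of the `dbscan` gem loaded above, and the names `dbscan_sketch`, `eps` and `min_points` are assumptions made for illustration. A point with at least `min_points` neighbours (itself included) within distance `eps` starts or extends a cluster, and points that no cluster reaches are labelled as noise (`-1`)."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Illustrative DBSCAN sketch (hypothetical helper, not the dbscan gem's implementation).\n",
"# points: array of [x, y]; eps: neighbourhood radius; min_points: density threshold.\n",
"def dbscan_sketch(points, eps, min_points)\n",
"  labels = Array.new(points.size) # nil = unvisited, -1 = noise, 0.. = cluster id\n",
"  # indices of all points within eps of point i (includes i itself)\n",
"  neighbors = lambda do |i|\n",
"    points.each_index.select do |j|\n",
"      Math.hypot(points[i][0] - points[j][0], points[i][1] - points[j][1]) <= eps\n",
"    end\n",
"  end\n",
"  cluster = -1\n",
"  points.each_index do |i|\n",
"    next unless labels[i].nil?\n",
"    seeds = neighbors.call(i)\n",
"    if seeds.size < min_points\n",
"      labels[i] = -1 # provisionally noise; may become a border point later\n",
"      next\n",
"    end\n",
"    cluster += 1\n",
"    labels[i] = cluster\n",
"    queue = seeds - [i]\n",
"    until queue.empty?\n",
"      j = queue.shift\n",
"      labels[j] = cluster if labels[j] == -1 # noise reached by a cluster becomes a border point\n",
"      next unless labels[j].nil?\n",
"      labels[j] = cluster\n",
"      more = neighbors.call(j)\n",
"      queue.concat(more) if more.size >= min_points # j is a core point: keep expanding\n",
"    end\n",
"  end\n",
"  labels\n",
"end\n",
"\n",
"# three nearby points form cluster 0, the distant point is noise\n",
"dbscan_sketch([[0, 0], [0, 1], [1, 0], [10, 10]], 1.5, 3) # => [0, 0, 0, -1]"
],
"language": "ruby",
"metadata": {},
"outputs": []
},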
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For visualization convenience in this notebook we will also use the awesome [Nyaplot](https://github.com/domitry/nyaplot), a D3-powered visualization library. I had to build it manually following [the instructions](https://github.com/domitry/nyaplot#installation) since it is not yet available on RubyGems.org."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"require 'nyaplot'"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 4,
"text": [
"true"
]
}
],
"prompt_number": 4
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Initialize Nyaplot to work in this IRuby Notebook:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"Nyaplot.init_iruby"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"<script>\n",
"if(window['d3'] === undefined ||\n",
" window['Nyaplot'] === undefined){\n",
" var path = {\"d3\":\"http://d3js.org/d3.v3.min\"};\n",
"\n",
"\n",
"\n",
" var shim = {\"d3\":{\"exports\":\"d3\"}};\n",
"\n",
" require.config({paths: path, shim:shim});\n",
"\n",
"\n",
"require(['d3'], function(d3){window['d3']=d3;console.log('finished loading d3');\n",
"\n",
"\tvar script = d3.select(\"head\")\n",
"\t .append(\"script\")\n",
"\t .attr(\"src\", \"https://rawgit.com/domitry/Nyaplotjs/master/release/nyaplot.js\")\n",
"\t .attr(\"async\", true);\n",
"\n",
"\tscript[0][0].onload = script[0][0].onreadystatechange = function(){\n",
"\t var event = document.createEvent(\"HTMLEvents\");\n",
"\t event.initEvent(\"load_nyaplot\",false,false);\n",
"\t window.dispatchEvent(event);\n",
"\t console.log('Finished loading Nyaplotjs');\n",
"\t};\n",
"\n",
"\n",
"});\n",
"}\n",
"</script>\n"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 5,
"text": [
"\"<script>\\nif(window['d3'] === undefined ||\\n window['Nyaplot'] === undefined){\\n var path = {\\\"d3\\\":\\\"http://d3js.org/d3.v3.min\\\"};\\n\\n\\n\\n var shim = {\\\"d3\\\":{\\\"exports\\\":\\\"d3\\\"}};\\n\\n require.config({paths: path, shim:shim});\\n\\n\\nrequire(['d3'], function(d3){window['d3']=d3;console.log('finished loading d3');\\n\\n\\tvar script = d3.select(\\\"head\\\")\\n\\t .append(\\\"script\\\")\\n\\t .attr(\\\"src\\\", \\\"https://rawgit.com/domitry/Nyaplotjs/master/release/nyaplot.js\\\")\\n\\t .attr(\\\"async\\\", true);\\n\\n\\tscript[0][0].onload = script[0][0].onreadystatechange = function(){\\n\\t var event = document.createEvent(\\\"HTMLEvents\\\");\\n\\t event.initEvent(\\\"load_nyaplot\\\",false,false);\\n\\t window.dispatchEvent(event);\\n\\t console.log('Finished loading Nyaplotjs');\\n\\t};\\n\\n\\n});\\n}\\n</script>\\n\""
]
}
],
"prompt_number": 5
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This is the GeoJSON that describes the shapes that have been drawn by the different contributors:\n",
"\n",
"_Note: this GeoJSON will not validate in [GeoJSONLint](http://geojsonlint.com/) because the first and last points of each polygon ring do not match._"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"geomstr = '{\"type\":\"FeatureCollection\",\"features\":[{\"type\":\"Feature\",\"properties\":{\"user_id\":638},\"geometry\":{\"type\":\"Polygon\",\"coordinates\":[[[-73.98620970547199,40.7356342514617],[-73.98627072572708,40.735547874977094],[-73.98632504045963,40.73557226364293],[-73.98622445762157,40.73570995781772],[-73.9861835539341,40.73569268254945],[-73.98621775209902,40.735640856717666]]]}},{\"type\":\"Feature\",\"properties\":{\"user_id\":666},\"geometry\":{\"type\":\"Polygon\",\"coordinates\":[[[-73.98620769381522,40.73563526765495],[-73.9862660318613,40.735547874977094],[-73.98632504045963,40.735570739351566],[-73.98622579872608,40.73570944972167],[-73.98618154227734,40.73569217445325],[-73.98621775209902,40.73563933242788]]]}},{\"type\":\"Feature\",\"properties\":{\"session_id\":\"79e7ee062a9e0333926e3e1fdc3e92db\"},\"geometry\":{\"type\":\"Polygon\",\"coordinates\":[[[-73.98632369935513,40.735570739351566],[-73.98622512817383,40.73570944972167],[-73.98618154227734,40.73569014206842],[-73.98621909320354,40.735640856717666],[-73.98620970547199,40.73563526765495],[-73.98627005517483,40.73554889117169]]]}},{\"type\":\"Feature\",\"properties\":{\"session_id\":\"3d3003b26bb6b2f3b9577924b9ed5e0e\"},\"geometry\":{\"type\":\"Polygon\",\"coordinates\":[[[-73.98621842265129,40.7356423810074],[-73.98620903491974,40.73563577575159],[-73.98627139627934,40.735547874977094],[-73.98632436990738,40.735571755545806],[-73.98622579872608,40.73570995781772],[-73.98618087172508,40.735689633972214]]]}},{\"type\":\"Feature\",\"properties\":{\"user_id\":596},\"geometry\":{\"type\":\"Polygon\",\"coordinates\":[[[-73.98626938462257,40.73554889117167],[-73.98632369935513,40.735572771740024],[-73.98622445762157,40.73570894162559],[-73.98618154227734,40.73569065016463],[-73.98621775209902,40.735640856717666],[-73.98620836436749,40.735634251461676]]]}},{\"type\":\"Feature\",\"properties\":{\"session_id\":\"0afaf74383ce51aceba02fc49ce5a9e3\"},\"geometry\":{\"type\":\"Polygon\",\"coor
dinates\":[[[-73.98621775209902,40.73563984052446],[-73.98620836436749,40.73563272717173],[-73.98626938462257,40.735550415463514],[-73.98632235825062,40.73557124744871],[-73.98622360456956,40.73570641325812],[-73.98618768252459,40.73568957578454]]]}},{\"type\":\"Feature\",\"properties\":{\"user_id\":538},\"geometry\":{\"type\":\"Polygon\",\"coordinates\":[[[-73.98632571101189,40.735571755545806],[-73.98622378706932,40.73570995781772],[-73.98618288338184,40.73569268254945],[-73.98621775209902,40.73564034862108],[-73.9862110465765,40.7356362838482],[-73.98627005517483,40.735550923560815]]]}},{\"type\":\"Feature\",\"properties\":{\"user_id\":580},\"geometry\":{\"type\":\"Polygon\",\"coordinates\":[[[-73.98632436990738,40.73557124744871],[-73.98626066744328,40.7356581319994],[-73.98625999689102,40.7356581319994],[-73.98620903491974,40.735634759558316],[-73.98626804351805,40.735547874977094]]]}},{\"type\":\"Feature\",\"properties\":{\"user_id\":580},\"geometry\":{\"type\":\"Polygon\",\"coordinates\":[[[-73.98626133799553,40.7356581319994],[-73.98622579872608,40.73570944972167],[-73.98618154227734,40.73569166635704],[-73.98621842265129,40.73563984052446]]]}},{\"type\":\"Feature\",\"properties\":{\"user_id\":548},\"geometry\":{\"type\":\"Polygon\",\"coordinates\":[[[-73.98620970547199,40.73563475955834],[-73.98627005517483,40.73554990736624],[-73.98632369935513,40.735571755545806],[-73.98622360456956,40.73570641325812],[-73.9861848950386,40.735689633972214],[-73.98621842265129,40.735640856717666]]]}},{\"type\":\"Feature\",\"properties\":{\"session_id\":\"53056025663f6d6564a39975971cb87c\"},\"geometry\":{\"type\":\"Polygon\",\"coordinates\":[[[-73.98621909320354,40.735638316234656],[-73.98620836436749,40.7356362838482],[-73.98620769381522,40.73563577575159],[-73.98627005517483,40.73554939926897],[-73.98632302880287,40.73557023125444],[-73.98622360456956,40.73570641325812],[-73.98617953062057,40.735689633972214]]]}}]}'"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 6,
"text": [
"\"{\\\"type\\\":\\\"FeatureCollection\\\",\\\"features\\\":[{\\\"type\\\":\\\"Feature\\\",\\\"properties\\\":{\\\"user_id\\\":638},\\\"geometry\\\":{\\\"type\\\":\\\"Polygon\\\",\\\"coordinates\\\":[[[-73.98620970547199,40.7356342514617],[-73.98627072572708,40.735547874977094],[-73.98632504045963,40.73557226364293],[-73.98622445762157,40.73570995781772],[-73.9861835539341,40.73569268254945],[-73.98621775209902,40.735640856717666]]]}},{\\\"type\\\":\\\"Feature\\\",\\\"properties\\\":{\\\"user_id\\\":666},\\\"geometry\\\":{\\\"type\\\":\\\"Polygon\\\",\\\"coordinates\\\":[[[-73.98620769381522,40.73563526765495],[-73.9862660318613,40.735547874977094],[-73.98632504045963,40.735570739351566],[-73.98622579872608,40.73570944972167],[-73.98618154227734,40.73569217445325],[-73.98621775209902,40.73563933242788]]]}},{\\\"type\\\":\\\"Feature\\\",\\\"properties\\\":{\\\"session_id\\\":\\\"79e7ee062a9e0333926e3e1fdc3e92db\\\"},\\\"geometry\\\":{\\\"type\\\":\\\"Polygon\\\",\\\"coordinates\\\":[[[-73.98632369935513,40.735570739351566],[-73.98622512817383,40.73570944972167],[-73.98618154227734,40.73569014206842],[-73.98621909320354,40.735640856717666],[-73.98620970547199,40.73563526765495],[-73.98627005517483,40.73554889117169]]]}},{\\\"type\\\":\\\"Feature\\\",\\\"properties\\\":{\\\"session_id\\\":\\\"3d3003b26bb6b2f3b9577924b9ed5e0e\\\"},\\\"geometry\\\":{\\\"type\\\":\\\"Polygon\\\",\\\"coordinates\\\":[[[-73.98621842265129,40.7356423810074],[-73.98620903491974,40.73563577575159],[-73.98627139627934,40.735547874977094],[-73.98632436990738,40.735571755545806],[-73.98622579872608,40.73570995781772],[-73.98618087172508,40.735689633972214]]]}},{\\\"type\\\":\\\"Feature\\\",\\\"properties\\\":{\\\"user_id\\\":596},\\\"geometry\\\":{\\\"type\\\":\\\"Polygon\\\",\\\"coordinates\\\":[[[-73.98626938462257,40.73554889117167],[-73.98632369935513,40.735572771740024],[-73.98622445762157,40.73570894162559],[-73.98618154227734,40.73569065016463],[-73.98621775209902,40.735640856717666],[-73.
98620836436749,40.735634251461676]]]}},{\\\"type\\\":\\\"Feature\\\",\\\"properties\\\":{\\\"session_id\\\":\\\"0afaf74383ce51aceba02fc49ce5a9e3\\\"},\\\"geometry\\\":{\\\"type\\\":\\\"Polygon\\\",\\\"coordinates\\\":[[[-73.98621775209902,40.73563984052446],[-73.98620836436749,40.73563272717173],[-73.98626938462257,40.735550415463514],[-73.98632235825062,40.73557124744871],[-73.98622360456956,40.73570641325812],[-73.98618768252459,40.73568957578454]]]}},{\\\"type\\\":\\\"Feature\\\",\\\"properties\\\":{\\\"user_id\\\":538},\\\"geometry\\\":{\\\"type\\\":\\\"Polygon\\\",\\\"coordinates\\\":[[[-73.98632571101189,40.735571755545806],[-73.98622378706932,40.73570995781772],[-73.98618288338184,40.73569268254945],[-73.98621775209902,40.73564034862108],[-73.9862110465765,40.7356362838482],[-73.98627005517483,40.735550923560815]]]}},{\\\"type\\\":\\\"Feature\\\",\\\"properties\\\":{\\\"user_id\\\":580},\\\"geometry\\\":{\\\"type\\\":\\\"Polygon\\\",\\\"coordinates\\\":[[[-73.98632436990738,40.73557124744871],[-73.98626066744328,40.7356581319994],[-73.98625999689102,40.7356581319994],[-73.98620903491974,40.735634759558316],[-73.98626804351805,40.735547874977094]]]}},{\\\"type\\\":\\\"Feature\\\",\\\"properties\\\":{\\\"user_id\\\":580},\\\"geometry\\\":{\\\"type\\\":\\\"Polygon\\\",\\\"coordinates\\\":[[[-73.98626133799553,40.7356581319994],[-73.98622579872608,40.73570944972167],[-73.98618154227734,40.73569166635704],[-73.98621842265129,40.73563984052446]]]}},{\\\"type\\\":\\\"Feature\\\",\\\"properties\\\":{\\\"user_id\\\":548},\\\"geometry\\\":{\\\"type\\\":\\\"Polygon\\\",\\\"coordinates\\\":[[[-73.98620970547199,40.73563475955834],[-73.98627005517483,40.73554990736624],[-73.98632369935513,40.735571755545806],[-73.98622360456956,40.73570641325812],[-73.9861848950386,40.735689633972214],[-73.98621842265129,40.735640856717666]]]}},{\\\"type\\\":\\\"Feature\\\",\\\"properties\\\":{\\\"session_id\\\":\\\"53056025663f6d6564a39975971cb87c\\\"},\\\"geometry\\\":{\\\"type\\\":\\\"
Polygon\\\",\\\"coordinates\\\":[[[-73.98621909320354,40.735638316234656],[-73.98620836436749,40.7356362838482],[-73.98620769381522,40.73563577575159],[-73.98627005517483,40.73554939926897],[-73.98632302880287,40.73557023125444],[-73.98622360456956,40.73570641325812],[-73.98617953062057,40.735689633972214]]]}}]}\""
]
}
],
"prompt_number": 6
},
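{
"cell_type": "markdown",
"metadata": {},
"source": [
"If strict validation matters, the rings can be closed before decoding. The sketch below is an assumption of how that could be done with Ruby's standard `json` library (the helper name `close_rings` is hypothetical and not part of RGeo). It appends a copy of the first position to every polygon ring whose first and last positions differ. RGeo does not seem to need this step: the decoded geometry printed further down already repeats the first point."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"require 'json'\n",
"\n",
"# Hypothetical helper: returns a GeoJSON string in which every Polygon ring is closed.\n",
"def close_rings(geojson_string)\n",
"  collection = JSON.parse(geojson_string)\n",
"  collection['features'].each do |feature|\n",
"    geometry = feature['geometry']\n",
"    next unless geometry['type'] == 'Polygon'\n",
"    geometry['coordinates'].each do |ring|\n",
"      # repeat the first position at the end when the ring is still open\n",
"      ring << ring.first.dup unless ring.first == ring.last\n",
"    end\n",
"  end\n",
"  JSON.generate(collection)\n",
"end\n",
"\n",
"closed_geomstr = close_rings(geomstr)"
],
"language": "ruby",
"metadata": {},
"outputs": []
},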
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We decode the GeoJSON into an `RGeo::GeoJSON` structure (see the [RGeo::GeoJSON docs](http://rdoc.info/github/rgeo/rgeo-geojson/frames)):"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"geocollection = RGeo::GeoJSON.decode(geomstr, :json_parser => :json)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 7,
"text": [
"#<RGeo::GeoJSON::FeatureCollection:0x80d363fc>"
]
}
],
"prompt_number": 7
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We wrap this in a function for convenience:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def parse_geojson(json)\n",
" RGeo::GeoJSON.decode(json, :json_parser => :json)\n",
"end"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 8,
"text": [
":parse_geojson"
]
}
],
"prompt_number": 8
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This structure is now a group of [features](http://rdoc.info/github/rgeo/rgeo-geojson/RGeo/GeoJSON/Feature), each with an [RGeo::Geos::CAPIPolygonImpl](http://rdoc.info/github/rgeo/rgeo/RGeo/Geos/CAPIPolygonImpl) geometry describing each polygon, among other properties (see the [RGeo docs](http://rdoc.info/github/rgeo/rgeo/frames)):"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"geocollection.first.geometry"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 9,
"text": [
"#<RGeo::Geos::CAPIPolygonImpl:0x80d3be74 \"POLYGON ((-73.98620970547199 40.7356342514617, -73.98627072572708 40.735547874977094, -73.98632504045963 40.73557226364293, -73.98622445762157 40.73570995781772, -73.9861835539341 40.73569268254945, -73.98621775209902 40.735640856717666, -73.98620970547199 40.7356342514617))\">"
]
}
],
"prompt_number": 9
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Algorithm\n",
"\n",
"The main logic behind this process is as follows:\n",
"\n",
"1. cluster all the polygons by their centroids (similar-shaped polygons should have similar centroids<sup>[1]</sup>; clustering also lets us identify outliers)\n",
"1. only use clusters that have three or more centroids (three or more people drew similar-shaped polygons)\n",
"1. for each cluster:\n",
"    1. cluster the vertices of its polygons\n",
"    1. find the mean vertex describing each cluster\n",
"    1. connect those mean vertices in the most likely order\n",
"    1. verify that the connected polygon makes sense (explained in more detail below)\n",
"\n",
"[1] _different polygons might also have similar centroids but we're skipping this for now :)_\n",
"\n",
"Since DBSCAN works with arrays of numbers, we need to convert the complex RGeo structures. Below is a simple centroid-extraction function:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def get_centroid(poly_feature)\n",
" return if (poly_feature.geometry.geometry_type.type_name != \"Polygon\")\n",
" c = poly_feature.geometry.centroid\n",
" return [c.x, c.y]\n",
"end"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 10,
"text": [
":get_centroid"
]
}
],
"prompt_number": 10
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's test it with the first polygon in the collection:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"centroid = get_centroid(geocollection.first)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 11,
"text": [
"[-73.98625268168838, 40.73562601945317]"
]
}
],
"prompt_number": 11
},
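{
"cell_type": "markdown",
"metadata": {},
"source": [
"For reference, the centroid of a simple polygon can also be computed by hand with the shoelace formula. The sketch below is an illustration in plain Ruby, not what RGeo/GEOS runs internally (the helper name `ring_centroid` is hypothetical). It treats coordinates as planar, and with raw longitude/latitude values of this magnitude the naive arithmetic can lose precision, so `get_centroid` remains the function used in the rest of the notebook."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Area-weighted centroid of a simple polygon via the shoelace formula.\n",
"# ring: array of [x, y] pairs, open or closed. A zero-area ring yields NaN.\n",
"def ring_centroid(ring)\n",
"  pts = ring.first == ring.last ? ring[0...-1] : ring # drop the repeated closing point\n",
"  area2 = 0.0 # twice the signed area\n",
"  cx = 0.0\n",
"  cy = 0.0\n",
"  pts.each_with_index do |(x0, y0), i|\n",
"    x1, y1 = pts[(i + 1) % pts.size] # next vertex, wrapping around\n",
"    cross = x0 * y1 - x1 * y0\n",
"    area2 += cross\n",
"    cx += (x0 + x1) * cross\n",
"    cy += (y0 + y1) * cross\n",
"  end\n",
"  [cx / (3.0 * area2), cy / (3.0 * area2)]\n",
"end\n",
"\n",
"# a 4 x 2 rectangle has its centroid at its middle\n",
"ring_centroid([[0, 0], [4, 0], [4, 2], [0, 2]]) # => [2.0, 1.0]"
],
"language": "ruby",
"metadata": {},
"outputs": []
},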
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we need a convenience function to get all the centroids of the collection. It returns a hash keyed by each polygon's index in the collection, because we later need to go back from a centroid to its corresponding polygon and a hash was the most convenient way I found to do that:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def get_all_centroids(geom)\n",
" centroids = {}\n",
" geom.each_with_index do |poly,index|\n",
" next if (poly.geometry.geometry_type.type_name != \"Polygon\")\n",
" centroids[index] = get_centroid(poly)\n",
" end\n",
" return centroids\n",
"end"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 12,
"text": [
":get_all_centroids"
]
}
],
"prompt_number": 12
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Test again:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"centroids = get_all_centroids(geocollection)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 13,
"text": [
"{0=>[-73.98625268168838, 40.73562601945317], 1=>[-73.98625173238652, 40.735625569382876], 2=>[-73.9862518966646, 40.73562642272427], 3=>[-73.986252242017, 40.735626656082445], 4=>[-73.98625152460835, 40.735626229414], 5=>[-73.98625207318744, 40.73562418649854], 6=>[-73.98625258509149, 40.7356272053874], 7=>[-73.98626592099406, 40.735602617283476], 8=>[-73.9862216645921, 40.73567482334759], 9=>[-73.98625254867669, 40.735624721075084], 10=>[-73.98625077341322, 40.73562552211442]}"
]
}
],
"prompt_number": 13
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A simple plot of all the centroids using Nyaplot:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"plot = Nyaplot::Plot.new\n",
"plot.width(400)\n",
"plot.height(400)\n",
"plot.zoom(true)\n",
"plot.rotate_x_label(-60)\n",
"points_x = centroids.map { |p| p[1][0] }\n",
"points_y = centroids.map { |p| p[1][1] }\n",
"df = Nyaplot::DataFrame.new({x:points_x,y:points_y})\n",
"# add some padding\n",
"xmin = points_x.min - 1e-5\n",
"xmax = points_x.max + 1e-5\n",
"ymin = points_y.min - 1e-5\n",
"ymax = points_y.max + 1e-5\n",
"plot.xrange([xmin,xmax])\n",
"plot.yrange([ymin,ymax])\n",
"# end padding\n",
"sc = plot.add_with_df(df, :scatter, :x, :y)\n",
"plot.show"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"<div id='vis-e5936183-5798-4eb7-b7c6-7c43d7186bf2'></div>\n",
"<script>\n",
"(function(){\n",
" var render = function(){\n",
" var model = {\"panes\":[{\"diagrams\":[{\"type\":\"scatter\",\"options\":{\"x\":\"x\",\"y\":\"y\"},\"data\":\"4f1cf4be-6ced-46d8-aaf6-69cc3be1f335\"}],\"options\":{\"width\":400,\"height\":400,\"zoom\":true,\"rotate_x_label\":-60,\"xrange\":[-73.98627592099406,-73.98621166459209],\"yrange\":[40.73559261728347,40.73568482334759]}}],\"data\":{\"4f1cf4be-6ced-46d8-aaf6-69cc3be1f335\":[{\"x\":-73.98625268168838,\"y\":40.73562601945317},{\"x\":-73.98625173238652,\"y\":40.735625569382876},{\"x\":-73.9862518966646,\"y\":40.73562642272427},{\"x\":-73.986252242017,\"y\":40.735626656082445},{\"x\":-73.98625152460835,\"y\":40.735626229414},{\"x\":-73.98625207318744,\"y\":40.73562418649854},{\"x\":-73.98625258509149,\"y\":40.7356272053874},{\"x\":-73.98626592099406,\"y\":40.735602617283476},{\"x\":-73.9862216645921,\"y\":40.73567482334759},{\"x\":-73.98625254867669,\"y\":40.735624721075084},{\"x\":-73.98625077341322,\"y\":40.73562552211442}]},\"extension\":[]}\n",
" Nyaplot.core.parse(model, '#vis-e5936183-5798-4eb7-b7c6-7c43d7186bf2');\n",
" };\n",
" if(window['Nyaplot']==undefined){\n",
" window.addEventListener('load_nyaplot', render, false);\n",
"\treturn;\n",
" }\n",
" render();\n",
"})();\n",
"</script>\n"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 14,
"text": [
"\"<div id='vis-e5936183-5798-4eb7-b7c6-7c43d7186bf2'></div>\\n<script>\\n(function(){\\n var render = function(){\\n var model = {\\\"panes\\\":[{\\\"diagrams\\\":[{\\\"type\\\":\\\"scatter\\\",\\\"options\\\":{\\\"x\\\":\\\"x\\\",\\\"y\\\":\\\"y\\\"},\\\"data\\\":\\\"4f1cf4be-6ced-46d8-aaf6-69cc3be1f335\\\"}],\\\"options\\\":{\\\"width\\\":400,\\\"height\\\":400,\\\"zoom\\\":true,\\\"rotate_x_label\\\":-60,\\\"xrange\\\":[-73.98627592099406,-73.98621166459209],\\\"yrange\\\":[40.73559261728347,40.73568482334759]}}],\\\"data\\\":{\\\"4f1cf4be-6ced-46d8-aaf6-69cc3be1f335\\\":[{\\\"x\\\":-73.98625268168838,\\\"y\\\":40.73562601945317},{\\\"x\\\":-73.98625173238652,\\\"y\\\":40.735625569382876},{\\\"x\\\":-73.9862518966646,\\\"y\\\":40.73562642272427},{\\\"x\\\":-73.986252242017,\\\"y\\\":40.735626656082445},{\\\"x\\\":-73.98625152460835,\\\"y\\\":40.735626229414},{\\\"x\\\":-73.98625207318744,\\\"y\\\":40.73562418649854},{\\\"x\\\":-73.98625258509149,\\\"y\\\":40.7356272053874},{\\\"x\\\":-73.98626592099406,\\\"y\\\":40.735602617283476},{\\\"x\\\":-73.9862216645921,\\\"y\\\":40.73567482334759},{\\\"x\\\":-73.98625254867669,\\\"y\\\":40.735624721075084},{\\\"x\\\":-73.98625077341322,\\\"y\\\":40.73562552211442}]},\\\"extension\\\":[]}\\n Nyaplot.core.parse(model, '#vis-e5936183-5798-4eb7-b7c6-7c43d7186bf2');\\n };\\n if(window['Nyaplot']==undefined){\\n window.addEventListener('load_nyaplot', render, false);\\n\\treturn;\\n }\\n render();\\n})();\\n</script>\\n\""
]
}
],
"prompt_number": 14
},
{
"cell_type": "code",
"collapsed": true,
"input": [
"# Pairwise distances between centroids, recording each unordered pair once\n",
"dists = []\n",
"done = {}\n",
"centroids.each_with_index do |cc1,i|\n",
"  centroids.each_with_index do |cc2,j|\n",
"    c1 = cc1[1]\n",
"    c2 = cc2[1]\n",
"    # skip self-pairs and pairs already recorded in the opposite order\n",
"    if c1 != c2 && !done[[c2,c1]]\n",
"      dists.push({:dist=>Math.hypot(c1[0]-c2[0],c1[1]-c2[1]),:from=>i,:to=>j,:from_centroid=>c1,:to_centroid=>c2})\n",
"    end\n",
"    done[[c1,c2]] = true\n",
"  end\n",
"end\n",
"# sort ascending by distance (sort_by! sorts in place)\n",
"dists.sort_by!{|k| k[:dist]}\n",
"dist_df = Nyaplot::DataFrame.new(dists)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"<table><tr><th>dist</th><th>from</th><th>to</th><th>from_centroid</th><th>to_centroid</th></tr><tr><td>4.1680249477628687e-07</td><td>2</td><td>3</td><td>[-73.9862518966646, 40.73562642272427]</td><td>[-73.986252242017, 40.735626656082445]</td></tr><tr><td>4.1927880127312373e-07</td><td>2</td><td>4</td><td>[-73.9862518966646, 40.73562642272427]</td><td>[-73.98625152460835, 40.735626229414]</td></tr><tr><td>6.476388201422145e-07</td><td>3</td><td>6</td><td>[-73.986252242017, 40.735626656082445]</td><td>[-73.98625258509149, 40.7356272053874]</td></tr><tr><td>6.919630457708901e-07</td><td>1</td><td>4</td><td>[-73.98625173238652, 40.735625569382876]</td><td>[-73.98625152460835, 40.735626229414]</td></tr><tr><td>7.154453870992346e-07</td><td>5</td><td>9</td><td>[-73.98625207318744, 40.73562418649854]</td><td>[-73.98625254867669, 40.735624721075084]</td></tr><tr><td>7.736974578659688e-07</td><td>0</td><td>3</td><td>[-73.98625268168838, 40.73562601945317]</td><td>[-73.986252242017, 40.735626656082445]</td></tr><tr><td>8.346982305084655e-07</td><td>3</td><td>4</td><td>[-73.986252242017, 40.735626656082445]</td><td>[-73.98625152460835, 40.735626229414]</td></tr><tr><td>8.690102573992017e-07</td><td>1</td><td>2</td><td>[-73.98625173238652, 40.735625569382876]</td><td>[-73.9862518966646, 40.73562642272427]</td></tr><tr><td>8.825474076951457e-07</td><td>0</td><td>2</td><td>[-73.98625268168838, 40.73562601945317]</td><td>[-73.9862518966646, 40.73562642272427]</td></tr><tr><td>9.601375433120375e-07</td><td>1</td><td>10</td><td>[-73.98625173238652, 40.735625569382876]</td><td>[-73.98625077341322, 40.73562552211442]</td></tr><tr><td>1.0317784775226883e-06</td><td>4</td><td>10</td><td>[-73.98625152460835, 40.735626229414]</td><td>[-73.98625077341322, 40.73562552211442]</td></tr><tr><td>1.0423498270990819e-06</td><td>2</td><td>6</td><td>[-73.9862518966646, 40.73562642272427]</td><td>[-73.98625258509149, 
40.7356272053874]</td></tr><tr><td>1.0505890250970463e-06</td><td>0</td><td>1</td><td>[-73.98625268168838, 40.73562601945317]</td><td>[-73.98625173238652, 40.735625569382876]</td></tr><tr><td>1.1759752372883875e-06</td><td>0</td><td>4</td><td>[-73.98625268168838, 40.73562601945317]</td><td>[-73.98625152460835, 40.735626229414]</td></tr><tr><td>1.1772662180684655e-06</td><td>1</td><td>9</td><td>[-73.98625173238652, 40.735625569382876]</td><td>[-73.98625254867669, 40.735624721075084]</td></tr><tr><td>1.1898617396243328e-06</td><td>0</td><td>6</td><td>[-73.98625268168838, 40.73562601945317]</td><td>[-73.98625258509149, 40.7356272053874]</td></tr><tr><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr><tr><td>8.468969718424401e-05</td><td>7</td><td>8</td><td>[-73.98626592099406, 40.735602617283476]</td><td>[-73.9862216645921, 40.73567482334759]</td></tr></table>"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 15,
"text": [
"#<Nyaplot::DataFrame:0x000001019de188 @name=\"6a47c46f-26e0-417d-8978-9ffbfef4b0cb\", @rows=[{\"dist\"=>4.1680249477628687e-07, \"from\"=>2, \"to\"=>3, \"from_centroid\"=>[-73.9862518966646, 40.73562642272427], \"to_centroid\"=>[-73.986252242017, 40.735626656082445]}, {\"dist\"=>4.1927880127312373e-07, \"from\"=>2, \"to\"=>4, \"from_centroid\"=>[-73.9862518966646, 40.73562642272427], \"to_centroid\"=>[-73.98625152460835, 40.735626229414]}, {\"dist\"=>6.476388201422145e-07, \"from\"=>3, \"to\"=>6, \"from_centroid\"=>[-73.986252242017, 40.735626656082445], \"to_centroid\"=>[-73.98625258509149, 40.7356272053874]}, {\"dist\"=>6.919630457708901e-07, \"from\"=>1, \"to\"=>4, \"from_centroid\"=>[-73.98625173238652, 40.735625569382876], \"to_centroid\"=>[-73.98625152460835, 40.735626229414]}, {\"dist\"=>7.154453870992346e-07, \"from\"=>5, \"to\"=>9, \"from_centroid\"=>[-73.98625207318744, 40.73562418649854], \"to_centroid\"=>[-73.98625254867669, 40.735624721075084]}, {\"dist\"=>7.736974578659688e-07, \"from\"=>0, \"to\"=>3, \"from_centroid\"=>[-73.98625268168838, 40.73562601945317], \"to_centroid\"=>[-73.986252242017, 40.735626656082445]}, {\"dist\"=>8.346982305084655e-07, \"from\"=>3, \"to\"=>4, \"from_centroid\"=>[-73.986252242017, 40.735626656082445], \"to_centroid\"=>[-73.98625152460835, 40.735626229414]}, {\"dist\"=>8.690102573992017e-07, \"from\"=>1, \"to\"=>2, \"from_centroid\"=>[-73.98625173238652, 40.735625569382876], \"to_centroid\"=>[-73.9862518966646, 40.73562642272427]}, {\"dist\"=>8.825474076951457e-07, \"from\"=>0, \"to\"=>2, \"from_centroid\"=>[-73.98625268168838, 40.73562601945317], \"to_centroid\"=>[-73.9862518966646, 40.73562642272427]}, {\"dist\"=>9.601375433120375e-07, \"from\"=>1, \"to\"=>10, \"from_centroid\"=>[-73.98625173238652, 40.735625569382876], \"to_centroid\"=>[-73.98625077341322, 40.73562552211442]}, {\"dist\"=>1.0317784775226883e-06, \"from\"=>4, \"to\"=>10, \"from_centroid\"=>[-73.98625152460835, 40.735626229414], 
\"to_centroid\"=>[-73.98625077341322, 40.73562552211442]}, {\"dist\"=>1.0423498270990819e-06, \"from\"=>2, \"to\"=>6, \"from_centroid\"=>[-73.9862518966646, 40.73562642272427], \"to_centroid\"=>[-73.98625258509149, 40.7356272053874]}, {\"dist\"=>1.0505890250970463e-06, \"from\"=>0, \"to\"=>1, \"from_centroid\"=>[-73.98625268168838, 40.73562601945317], \"to_centroid\"=>[-73.98625173238652, 40.735625569382876]}, {\"dist\"=>1.1759752372883875e-06, \"from\"=>0, \"to\"=>4, \"from_centroid\"=>[-73.98625268168838, 40.73562601945317], \"to_centroid\"=>[-73.98625152460835, 40.735626229414]}, {\"dist\"=>1.1772662180684655e-06, \"from\"=>1, \"to\"=>9, \"from_centroid\"=>[-73.98625173238652, 40.735625569382876], \"to_centroid\"=>[-73.98625254867669, 40.735624721075084]}, {\"dist\"=>1.1898617396243328e-06, \"from\"=>0, \"to\"=>6, \"from_centroid\"=>[-73.98625268168838, 40.73562601945317], \"to_centroid\"=>[-73.98625258509149, 40.7356272053874]}, {\"dist\"=>1.2002662963773613e-06, \"from\"=>1, \"to\"=>3, \"from_centroid\"=>[-73.98625173238652, 40.735625569382876], \"to_centroid\"=>[-73.986252242017, 40.735626656082445]}, {\"dist\"=>1.3051734635243496e-06, \"from\"=>0, \"to\"=>9, \"from_centroid\"=>[-73.98625268168838, 40.73562601945317], \"to_centroid\"=>[-73.98625254867669, 40.735624721075084]}, {\"dist\"=>1.424259229759705e-06, \"from\"=>1, \"to\"=>5, \"from_centroid\"=>[-73.98625173238652, 40.735625569382876], \"to_centroid\"=>[-73.98625207318744, 40.73562418649854]}, {\"dist\"=>1.4397193384948688e-06, \"from\"=>2, \"to\"=>10, \"from_centroid\"=>[-73.9862518966646, 40.73562642272427], \"to_centroid\"=>[-73.98625077341322, 40.73562552211442]}, {\"dist\"=>1.441231617061079e-06, \"from\"=>4, \"to\"=>6, \"from_centroid\"=>[-73.98625152460835, 40.735626229414], \"to_centroid\"=>[-73.98625258509149, 40.7356272053874]}, {\"dist\"=>1.8222869506551436e-06, \"from\"=>2, \"to\"=>9, \"from_centroid\"=>[-73.9862518966646, 40.73562642272427], \"to_centroid\"=>[-73.98625254867669, 
40.735624721075084]}, {\"dist\"=>1.8231297971933801e-06, \"from\"=>4, \"to\"=>9, \"from_centroid\"=>[-73.98625152460835, 40.735626229414], \"to_centroid\"=>[-73.98625254867669, 40.735624721075084]}, {\"dist\"=>1.844889313547645e-06, \"from\"=>1, \"to\"=>6, \"from_centroid\"=>[-73.98625173238652, 40.735625569382876], \"to_centroid\"=>[-73.98625258509149, 40.7356272053874]}, {\"dist\"=>1.8554461894036756e-06, \"from\"=>3, \"to\"=>10, \"from_centroid\"=>[-73.986252242017, 40.735626656082445], \"to_centroid\"=>[-73.98625077341322, 40.73562552211442]}, {\"dist\"=>1.8636745452851083e-06, \"from\"=>5, \"to\"=>10, \"from_centroid\"=>[-73.98625207318744, 40.73562418649854], \"to_centroid\"=>[-73.98625077341322, 40.73562552211442]}, {\"dist\"=>1.9313197748456895e-06, \"from\"=>0, \"to\"=>5, \"from_centroid\"=>[-73.98625268168838, 40.73562601945317], \"to_centroid\"=>[-73.98625207318744, 40.73562418649854]}, {\"dist\"=>1.947620190077808e-06, \"from\"=>9, \"to\"=>10, \"from_centroid\"=>[-73.98625254867669, 40.735624721075084], \"to_centroid\"=>[-73.98625077341322, 40.73562552211442]}, {\"dist\"=>1.9591563623025234e-06, \"from\"=>3, \"to\"=>9, \"from_centroid\"=>[-73.986252242017, 40.735626656082445], \"to_centroid\"=>[-73.98625254867669, 40.735624721075084]}, {\"dist\"=>1.972019255979438e-06, \"from\"=>0, \"to\"=>10, \"from_centroid\"=>[-73.98625268168838, 40.73562601945317], \"to_centroid\"=>[-73.98625077341322, 40.73562552211442]}, {\"dist\"=>2.115287830795299e-06, \"from\"=>4, \"to\"=>5, \"from_centroid\"=>[-73.98625152460835, 40.735626229414], \"to_centroid\"=>[-73.98625207318744, 40.73562418649854]}, {\"dist\"=>2.2431820796116284e-06, \"from\"=>2, \"to\"=>5, \"from_centroid\"=>[-73.9862518966646, 40.73562642272427], \"to_centroid\"=>[-73.98625207318744, 40.73562418649854]}, {\"dist\"=>2.4729711093657763e-06, \"from\"=>6, \"to\"=>10, \"from_centroid\"=>[-73.98625258509149, 40.7356272053874], \"to_centroid\"=>[-73.98625077341322, 40.73562552211442]}, 
{\"dist\"=>2.475348072708112e-06, \"from\"=>3, \"to\"=>5, \"from_centroid\"=>[-73.986252242017, 40.735626656082445], \"to_centroid\"=>[-73.98625207318744, 40.73562418649854]}, {\"dist\"=>2.4845791864857814e-06, \"from\"=>6, \"to\"=>9, \"from_centroid\"=>[-73.98625258509149, 40.7356272053874], \"to_centroid\"=>[-73.98625254867669, 40.735624721075084]}, {\"dist\"=>3.0619823175285526e-06, \"from\"=>5, \"to\"=>6, \"from_centroid\"=>[-73.98625207318744, 40.73562418649854], \"to_centroid\"=>[-73.98625258509149, 40.7356272053874]}, {\"dist\"=>2.563187052301914e-05, \"from\"=>5, \"to\"=>7, \"from_centroid\"=>[-73.98625207318744, 40.73562418649854], \"to_centroid\"=>[-73.98626592099406, 40.735602617283476]}, {\"dist\"=>2.5834017791549575e-05, \"from\"=>7, \"to\"=>9, \"from_centroid\"=>[-73.98626592099406, 40.735602617283476], \"to_centroid\"=>[-73.98625254867669, 40.735624721075084]}, {\"dist\"=>2.688755773883865e-05, \"from\"=>0, \"to\"=>7, \"from_centroid\"=>[-73.98625268168838, 40.73562601945317], \"to_centroid\"=>[-73.98626592099406, 40.735602617283476]}, {\"dist\"=>2.698361448529006e-05, \"from\"=>1, \"to\"=>7, \"from_centroid\"=>[-73.98625173238652, 40.735625569382876], \"to_centroid\"=>[-73.98626592099406, 40.735602617283476]}, {\"dist\"=>2.7460525955903987e-05, \"from\"=>7, \"to\"=>10, \"from_centroid\"=>[-73.98626592099406, 40.735602617283476], \"to_centroid\"=>[-73.98625077341322, 40.73562552211442]}, {\"dist\"=>2.7629347229741968e-05, \"from\"=>2, \"to\"=>7, \"from_centroid\"=>[-73.9862518966646, 40.73562642272427], \"to_centroid\"=>[-73.98626592099406, 40.735602617283476]}, {\"dist\"=>2.7654812048911204e-05, \"from\"=>4, \"to\"=>7, \"from_centroid\"=>[-73.98625152460835, 40.735626229414], \"to_centroid\"=>[-73.98626592099406, 40.735602617283476]}, {\"dist\"=>2.765824052887831e-05, \"from\"=>3, \"to\"=>7, \"from_centroid\"=>[-73.986252242017, 40.735626656082445], \"to_centroid\"=>[-73.98626592099406, 40.735602617283476]}, {\"dist\"=>2.797179207560266e-05, 
\"from\"=>6, \"to\"=>7, \"from_centroid\"=>[-73.98625258509149, 40.7356272053874], \"to_centroid\"=>[-73.98626592099406, 40.735602617283476]}, {\"dist\"=>5.677629272142381e-05, \"from\"=>6, \"to\"=>8, \"from_centroid\"=>[-73.98625258509149, 40.7356272053874], \"to_centroid\"=>[-73.9862216645921, 40.73567482334759]}, {\"dist\"=>5.7034997606567785e-05, \"from\"=>4, \"to\"=>8, \"from_centroid\"=>[-73.98625152460835, 40.735626229414], \"to_centroid\"=>[-73.9862216645921, 40.73567482334759]}, {\"dist\"=>5.7053171211386354e-05, \"from\"=>3, \"to\"=>8, \"from_centroid\"=>[-73.986252242017, 40.735626656082445], \"to_centroid\"=>[-73.9862216645921, 40.73567482334759]}, {\"dist\"=>5.706661497797782e-05, \"from\"=>2, \"to\"=>8, \"from_centroid\"=>[-73.9862518966646, 40.73562642272427], \"to_centroid\"=>[-73.9862216645921, 40.73567482334759]}, {\"dist\"=>5.725325369972539e-05, \"from\"=>8, \"to\"=>10, \"from_centroid\"=>[-73.9862216645921, 40.73567482334759], \"to_centroid\"=>[-73.98625077341322, 40.73562552211442]}, {\"dist\"=>5.7706371411325726e-05, \"from\"=>1, \"to\"=>8, \"from_centroid\"=>[-73.98625173238652, 40.735625569382876], \"to_centroid\"=>[-73.9862216645921, 40.73567482334759]}, {\"dist\"=>5.782629481832627e-05, \"from\"=>0, \"to\"=>8, \"from_centroid\"=>[-73.98625268168838, 40.73562601945317], \"to_centroid\"=>[-73.9862216645921, 40.73567482334759]}, {\"dist\"=>5.88563029015269e-05, \"from\"=>8, \"to\"=>9, \"from_centroid\"=>[-73.9862216645921, 40.73567482334759], \"to_centroid\"=>[-73.98625254867669, 40.735624721075084]}, {\"dist\"=>5.906583744015411e-05, \"from\"=>5, \"to\"=>8, \"from_centroid\"=>[-73.98625207318744, 40.73562418649854], \"to_centroid\"=>[-73.9862216645921, 40.73567482334759]}, {\"dist\"=>8.468969718424401e-05, \"from\"=>7, \"to\"=>8, \"from_centroid\"=>[-73.98626592099406, 40.735602617283476], \"to_centroid\"=>[-73.9862216645921, 40.73567482334759]}]>"
]
}
],
"prompt_number": 15
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Clustering centroids\n",
"\n",
"We can see here how the centroids reflect the three different basic shapes drawn by contributors above: the lone centroids for the upper-right and lower-left rectangles and the group of nine centroids for the L-shaped polygons in the \"center\".\n",
"\n",
"The problem now is finding a good distance threshold between centroids:\n",
"\n",
"- **big** enough to cover nearby centroids but also\n",
"- **small** enough to _not_ group polygons that don't belong with each other\n",
"\n",
"Let's create a table to see just how close/far these centroids are from each other (standard Euclidean distance: $\\sqrt{(\\Delta x)^2+(\\Delta y)^2}$). Notice that, since the coordinates are longitude/latitude values in decimal degrees with a _lot_ of digits to the right of the decimal point, we are dealing with very small distances, on the order of $10^{-6}$ degrees or less for nearby centroids: "
]
},
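{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick check of the formula, here is the distance worked out by hand for centroids 2 and 3, the first row of the table (the arithmetic is a rounded illustration, not part of the original analysis):\n",
"\n",
"$$\\Delta x = -73.9862518966646 - (-73.986252242017) \\approx 3.4535 \\times 10^{-7}$$\n",
"\n",
"$$\\Delta y = 40.73562642272427 - 40.735626656082445 \\approx -2.3336 \\times 10^{-7}$$\n",
"\n",
"$$\\sqrt{(\\Delta x)^2+(\\Delta y)^2} \\approx \\sqrt{1.1927 \\times 10^{-13} + 5.446 \\times 10^{-14}} \\approx 4.168 \\times 10^{-7}$$\n",
"\n",
"which agrees with the `dist` value of `4.1680249477628687e-07` listed for that pair."
]
},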
{
"cell_type": "markdown",
"metadata": {},
"source": [
"From the table (which is sorted by closest pairs first) we can see that the top 10 results are less than $10^{-6}$ units (0.000001) away from each other.\n",
"\n",
"## The DBSCAN algorithm\n",
"\n",
"To understand how clusters are formed, it is useful to understand how the [DBSCAN clustering algorithm](https://en.wikipedia.org/wiki/DBSCAN#Algorithm) works:\n",
"\n",
"> DBSCAN requires two parameters: \u03b5 (eps) and the minimum number of points (min_points) required to form a dense region. It starts with an arbitrary starting point that has not been visited. This point's \u03b5-neighborhood is retrieved, and if it contains sufficiently many points, a cluster is started. Otherwise, the point is labeled as noise. Note that this point might later be found in a sufficiently sized \u03b5-environment of a different point and hence be made part of a cluster.\n",
"\n",
"> If a point is found to be a dense part of a cluster, its \u03b5-neighborhood is also part of that cluster. Hence, all points that are found within the \u03b5-neighborhood are added, as is their own \u03b5-neighborhood when they are also dense. This process continues until the density-connected cluster is completely found. Then, a new unvisited point is retrieved and processed, leading to the discovery of a further cluster or noise.\n",
"\n",
"By playing around with different sets of polygons I came to a general \u03b5 of $1.8 \\times 10^{-6}$ and a `min_points` of 2 for **centroid clusters** (polygon vertex clusters have different input values as we will see below).\n",
"\n",
"This is the resulting centroid-clustering function:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def cluster_centroids(centroids, epsilon=1.8e-06, min_points=2)\n",
" dbscan = DBSCAN( centroids.map{|c| c[1]}, :epsilon => epsilon, :min_points => min_points, :distance => :euclidean_distance )\n",
" return dbscan.results.select{|k,v| k != -1} # omit the non-cluster\n",
"end"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 16,
"text": [
":cluster_centroids"
]
}
],
"prompt_number": 16
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's test it:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"centroid_clusters = cluster_centroids(centroids)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 17,
"text": [
"{0=>[[-73.98625268168838, 40.73562601945317], [-73.98625173238652, 40.735625569382876], [-73.9862518966646, 40.73562642272427], [-73.986252242017, 40.735626656082445], [-73.98625152460835, 40.735626229414], [-73.98625258509149, 40.7356272053874], [-73.98625254867669, 40.735624721075084], [-73.98625207318744, 40.73562418649854], [-73.98625077341322, 40.73562552211442]]}"
]
}
],
"prompt_number": 17
},
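{
"cell_type": "markdown",
"metadata": {},
"source": [
"Reading this result against the distance table helps explain which centroids were kept (a worked check using the table's rounded values and the default $\\varepsilon = 1.8 \\times 10^{-6}$, added here as an illustration):\n",
"\n",
"$$\\min_j d(7, j) = d(7, 5) \\approx 2.56 \\times 10^{-5} > \\varepsilon \\qquad \\min_j d(8, j) = d(8, 6) \\approx 5.68 \\times 10^{-5} > \\varepsilon$$\n",
"\n",
"Centroids 7 and 8 have no neighbor within $\\varepsilon$, so they are labeled as noise. Centroids 5 and 6, on the other hand, are about $3.06 \\times 10^{-6}$ apart, which is more than $\\varepsilon$, yet they end up in the same cluster because they are connected through intermediate centroids that are each within $\\varepsilon$ of the next:\n",
"\n",
"$$d(5, 1) \\approx 1.42 \\times 10^{-6}, \\quad d(1, 2) \\approx 8.69 \\times 10^{-7}, \\quad d(2, 6) \\approx 1.04 \\times 10^{-6}$$"
]
},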
{
"cell_type": "markdown",
"metadata": {},
"source": [
"DBSCAN's `results` is a hash whose `-1` key (if any) contains all the points that did not belong to a cluster, while keys `0..n` contain the different clusters. `cluster_centroids` drops the `-1` entry, so in this example it returns only one cluster, `centroid_clusters[0]`.\n",
"\n",
"Let's define a cluster plotting function and plot this (notice the \"disappearance\" of the two outliers that are being ignored by the function):"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def plot_clusters(clusters)\n",
" plot = Nyaplot::Plot.new\n",
" plot.width(300)\n",
" plot.height(400)\n",
" plot.zoom(true)\n",
" plot.rotate_x_label(-60)\n",
" pts = clusters.map{|c| c[1]}.flatten(1)\n",
" # add some padding\n",
" xmin = pts.map {|p| p[0]}.min - 1e-5\n",
" xmax = pts.map {|p| p[0]}.max + 1e-5\n",
" ymin = pts.map {|p| p[1]}.min - 1e-5\n",
" ymax = pts.map {|p| p[1]}.max + 1e-5\n",
" plot.xrange([xmin,xmax])\n",
" plot.yrange([ymin,ymax])\n",
" # now plot\n",
" clusters.each do |cluster|\n",
" if cluster[0] != -1 # ignore cluster -1 because not enough points\n",
" cluster_x = cluster[1].map { |c| c[0] }\n",
" cluster_y = cluster[1].map { |c| c[1] }\n",
" names = cluster[1].map { |c| cluster[0] }\n",
" df = Nyaplot::DataFrame.new({x:cluster_x,y:cluster_y,cluster:names})\n",
" sc = plot.add_with_df(df, :scatter, :x, :y)\n",
" sc.tooltip_contents([:cluster])\n",
" color = \"#\"+ \"%06x\" % (rand * 0xffffff)\n",
" sc.color(color)\n",
" end\n",
" end\n",
" plot.show\n",
" return plot\n",
"end"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 18,
"text": [
":plot_clusters"
]
}
],
"prompt_number": 18
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"plot = plot_clusters(centroid_clusters)\n",
"plot.show"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"<div id='vis-28fb465a-9286-4820-9c8c-b61a1ec17541'></div>\n",
"<script>\n",
"(function(){\n",
" var render = function(){\n",
" var model = {\"panes\":[{\"diagrams\":[{\"type\":\"scatter\",\"options\":{\"x\":\"x\",\"y\":\"y\",\"tooltip_contents\":[\"cluster\"],\"color\":\"#130783\"},\"data\":\"9a840fd3-526e-4c5f-840f-84cba5706672\"}],\"options\":{\"width\":300,\"height\":400,\"zoom\":true,\"rotate_x_label\":-60,\"xrange\":[-73.98626268168839,-73.98624077341321],\"yrange\":[40.73561418649854,40.735637205387405]}}],\"data\":{\"9a840fd3-526e-4c5f-840f-84cba5706672\":[{\"x\":-73.98625268168838,\"y\":40.73562601945317,\"cluster\":0},{\"x\":-73.98625173238652,\"y\":40.735625569382876,\"cluster\":0},{\"x\":-73.9862518966646,\"y\":40.73562642272427,\"cluster\":0},{\"x\":-73.986252242017,\"y\":40.735626656082445,\"cluster\":0},{\"x\":-73.98625152460835,\"y\":40.735626229414,\"cluster\":0},{\"x\":-73.98625258509149,\"y\":40.7356272053874,\"cluster\":0},{\"x\":-73.98625254867669,\"y\":40.735624721075084,\"cluster\":0},{\"x\":-73.98625207318744,\"y\":40.73562418649854,\"cluster\":0},{\"x\":-73.98625077341322,\"y\":40.73562552211442,\"cluster\":0}]},\"extension\":[]}\n",
" Nyaplot.core.parse(model, '#vis-28fb465a-9286-4820-9c8c-b61a1ec17541');\n",
" };\n",
" if(window['Nyaplot']==undefined){\n",
" window.addEventListener('load_nyaplot', render, false);\n",
"\treturn;\n",
" }\n",
" render();\n",
"})();\n",
"</script>\n"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 19,
"text": [
"\"<div id='vis-28fb465a-9286-4820-9c8c-b61a1ec17541'></div>\\n<script>\\n(function(){\\n var render = function(){\\n var model = {\\\"panes\\\":[{\\\"diagrams\\\":[{\\\"type\\\":\\\"scatter\\\",\\\"options\\\":{\\\"x\\\":\\\"x\\\",\\\"y\\\":\\\"y\\\",\\\"tooltip_contents\\\":[\\\"cluster\\\"],\\\"color\\\":\\\"#130783\\\"},\\\"data\\\":\\\"9a840fd3-526e-4c5f-840f-84cba5706672\\\"}],\\\"options\\\":{\\\"width\\\":300,\\\"height\\\":400,\\\"zoom\\\":true,\\\"rotate_x_label\\\":-60,\\\"xrange\\\":[-73.98626268168839,-73.98624077341321],\\\"yrange\\\":[40.73561418649854,40.735637205387405]}}],\\\"data\\\":{\\\"9a840fd3-526e-4c5f-840f-84cba5706672\\\":[{\\\"x\\\":-73.98625268168838,\\\"y\\\":40.73562601945317,\\\"cluster\\\":0},{\\\"x\\\":-73.98625173238652,\\\"y\\\":40.735625569382876,\\\"cluster\\\":0},{\\\"x\\\":-73.9862518966646,\\\"y\\\":40.73562642272427,\\\"cluster\\\":0},{\\\"x\\\":-73.986252242017,\\\"y\\\":40.735626656082445,\\\"cluster\\\":0},{\\\"x\\\":-73.98625152460835,\\\"y\\\":40.735626229414,\\\"cluster\\\":0},{\\\"x\\\":-73.98625258509149,\\\"y\\\":40.7356272053874,\\\"cluster\\\":0},{\\\"x\\\":-73.98625254867669,\\\"y\\\":40.735624721075084,\\\"cluster\\\":0},{\\\"x\\\":-73.98625207318744,\\\"y\\\":40.73562418649854,\\\"cluster\\\":0},{\\\"x\\\":-73.98625077341322,\\\"y\\\":40.73562552211442,\\\"cluster\\\":0}]},\\\"extension\\\":[]}\\n Nyaplot.core.parse(model, '#vis-28fb465a-9286-4820-9c8c-b61a1ec17541');\\n };\\n if(window['Nyaplot']==undefined){\\n window.addEventListener('load_nyaplot', render, false);\\n\\treturn;\\n }\\n render();\\n})();\\n</script>\\n\""
]
}
],
"prompt_number": 19
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Clustering vertices\n",
"\n",
"Now we need to:\n",
"\n",
"1. work backwards from the centroid clusters that have three or more centroids (only one in this case)\n",
"1. find the polygons they belong to and, finally,\n",
"1. find their vertices and cluster them\n",
"\n",
"Below is a function that retrieves the polygons for a given centroid cluster based on the structures we have built so far:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# given a list of centroids (lon,lat), find their poly's index in the centroid list (index => lon,lat)\n",
"def get_polys_for_centroid_cluster(cluster, centroids, original_polys)\n",
" polys = []\n",
" cluster.each do |cl|\n",
" # Hash#key returns the first key whose value equals cl, or nil if there is none\n",
" index = centroids.key(cl)\n",
" polys.push(original_polys[index]) unless index.nil?\n",
" end\n",
" return polys\n",
"end"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 20,
"text": [
":get_polys_for_centroid_cluster"
]
}
],
"prompt_number": 20
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Applying this to the only cluster that has useful centroids:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"cluster_polygons = get_polys_for_centroid_cluster(centroid_clusters[0], centroids, geocollection)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 21,
"text": [
"[#<RGeo::GeoJSON::Feature:0x80d3bde8 id=nil geom=\"POLYGON ((-73.98620970547199 40.7356342514617, -73.98627072572708 40.735547874977094, -73.98632504045963 40.73557226364293, -73.98622445762157 40.73570995781772, -73.9861835539341 40.73569268254945, -73.98621775209902 40.735640856717666, -73.98620970547199 40.7356342514617))\">, #<RGeo::GeoJSON::Feature:0x80d3b550 id=nil geom=\"POLYGON ((-73.98620769381522 40.73563526765495, -73.9862660318613 40.735547874977094, -73.98632504045963 40.735570739351566, -73.98622579872608 40.73570944972167, -73.98618154227734 40.73569217445325, -73.98621775209902 40.73563933242788, -73.98620769381522 40.73563526765495))\">, #<RGeo::GeoJSON::Feature:0x80d3afc4 id=nil geom=\"POLYGON ((-73.98632369935513 40.735570739351566, -73.98622512817383 40.73570944972167, -73.98618154227734 40.73569014206842, -73.98621909320354 40.735640856717666, -73.98620970547199 40.73563526765495, -73.98627005517483 40.73554889117169, -73.98632369935513 40.735570739351566))\">, #<RGeo::GeoJSON::Feature:0x80d3aa9c id=nil geom=\"POLYGON ((-73.98621842265129 40.7356423810074, -73.98620903491974 40.73563577575159, -73.98627139627934 40.735547874977094, -73.98632436990738 40.735571755545806, -73.98622579872608 40.73570995781772, -73.98618087172508 40.735689633972214, -73.98621842265129 40.7356423810074))\">, #<RGeo::GeoJSON::Feature:0x80d3a59c id=nil geom=\"POLYGON ((-73.98626938462257 40.73554889117167, -73.98632369935513 40.735572771740024, -73.98622445762157 40.73570894162559, -73.98618154227734 40.73569065016463, -73.98621775209902 40.735640856717666, -73.98620836436749 40.735634251461676, -73.98626938462257 40.73554889117167))\">, #<RGeo::GeoJSON::Feature:0x80d37dc4 id=nil geom=\"POLYGON ((-73.98632571101189 40.735571755545806, -73.98622378706932 40.73570995781772, -73.98618288338184 40.73569268254945, -73.98621775209902 40.73564034862108, -73.9862110465765 40.7356362838482, -73.98627005517483 40.735550923560815, -73.98632571101189 40.735571755545806))\">, 
#<RGeo::GeoJSON::Feature:0x80d36e4c id=nil geom=\"POLYGON ((-73.98620970547199 40.73563475955834, -73.98627005517483 40.73554990736624, -73.98632369935513 40.735571755545806, -73.98622360456956 40.73570641325812, -73.9861848950386 40.735689633972214, -73.98621842265129 40.735640856717666, -73.98620970547199 40.73563475955834))\">, #<RGeo::GeoJSON::Feature:0x80d3a0b0 id=nil geom=\"POLYGON ((-73.98621775209902 40.73563984052446, -73.98620836436749 40.73563272717173, -73.98626938462257 40.735550415463514, -73.98632235825062 40.73557124744871, -73.98622360456956 40.73570641325812, -73.98618768252459 40.73568957578454, -73.98621775209902 40.73563984052446))\">, #<RGeo::GeoJSON::Feature:0x80d3644c id=nil geom=\"POLYGON ((-73.98621909320354 40.735638316234656, -73.98620836436749 40.7356362838482, -73.98620769381522 40.73563577575159, -73.98627005517483 40.73554939926897, -73.98632302880287 40.73557023125444, -73.98622360456956 40.73570641325812, -73.98617953062057 40.735689633972214, -73.98621909320354 40.735638316234656))\">]"
]
}
],
"prompt_number": 21
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We need a method to extract the vertices from each polygon (in a DBSCAN-compatible format):"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def get_points(poly_feature)\n",
" geom = poly_feature.geometry\n",
" return false if (geom.geometry_type.type_name != \"Polygon\")\n",
" pts = []\n",
" points = geom.exterior_ring.points\n",
" points.each do |point|\n",
" pts.push([point.x,point.y])\n",
" end\n",
" return pts\n",
"end"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 22,
"text": [
":get_points"
]
}
],
"prompt_number": 22
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now let's plot what we have so far (vertices from the same polygon are the same color):"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def plot_polys(polys)\n",
" plot = Nyaplot::Plot.new\n",
" plot.width(500)\n",
" plot.height(500)\n",
" plot.zoom(true)\n",
" plot.rotate_x_label(-60)\n",
" polys.each do |poly|\n",
" plot_poly(poly, plot)\n",
" end\n",
" plot.show\n",
"end\n",
"def plot_poly(poly, plot = nil)\n",
" showplot = false\n",
" if plot == nil\n",
" showplot = true\n",
" plot = Nyaplot::Plot.new\n",
" plot.width(500)\n",
" plot.height(500)\n",
" plot.zoom(true)\n",
" plot.rotate_x_label(-60)\n",
" end\n",
" points = get_points(poly)\n",
" points_x = points.map { |p| p[0] }\n",
" points_y = points.map { |p| p[1] }\n",
" df = Nyaplot::DataFrame.new({x:points_x,y:points_y})\n",
" sc = plot.add_with_df(df, :scatter, :x, :y)\n",
" color = \"#\"+ \"%06x\" % (rand * 0xffffff)\n",
" sc.color(color)\n",
" plot.show if showplot\n",
"end"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 23,
"text": [
":plot_poly"
]
}
],
"prompt_number": 23
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"plot_polys(cluster_polygons)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"<div id='vis-d40f4dbd-5edd-44f5-8abc-d0d49eef4fef'></div>\n",
"<script>\n",
"(function(){\n",
" var render = function(){\n",
" var model = {\"panes\":[{\"diagrams\":[{\"type\":\"scatter\",\"options\":{\"x\":\"x\",\"y\":\"y\",\"color\":\"#5d9c36\"},\"data\":\"d2949b10-9d6e-4bca-b9c4-068e5ca6ed8d\"},{\"type\":\"scatter\",\"options\":{\"x\":\"x\",\"y\":\"y\",\"color\":\"#6e5b61\"},\"data\":\"1ae0953c-32a1-410d-9309-9c17b4ee24b1\"},{\"type\":\"scatter\",\"options\":{\"x\":\"x\",\"y\":\"y\",\"color\":\"#917ab4\"},\"data\":\"2ea818d3-6d40-42bd-8883-51c377a69610\"},{\"type\":\"scatter\",\"options\":{\"x\":\"x\",\"y\":\"y\",\"color\":\"#38ca36\"},\"data\":\"f5f387ce-162a-43f6-ae21-06efbe8b684a\"},{\"type\":\"scatter\",\"options\":{\"x\":\"x\",\"y\":\"y\",\"color\":\"#d16d7c\"},\"data\":\"3064cf5f-6bd9-4acb-b889-5819c1534187\"},{\"type\":\"scatter\",\"options\":{\"x\":\"x\",\"y\":\"y\",\"color\":\"#885b3e\"},\"data\":\"d9ba0ec6-75fe-4cc3-a40e-7d39cf8ecd29\"},{\"type\":\"scatter\",\"options\":{\"x\":\"x\",\"y\":\"y\",\"color\":\"#4e0177\"},\"data\":\"cb7313ae-25e6-4cc4-962f-e31e9b50368e\"},{\"type\":\"scatter\",\"options\":{\"x\":\"x\",\"y\":\"y\",\"color\":\"#e2fd3d\"},\"data\":\"c5be729a-e8d3-483c-87c6-fa8d28814b9b\"},{\"type\":\"scatter\",\"options\":{\"x\":\"x\",\"y\":\"y\",\"color\":\"#c02dc4\"},\"data\":\"c715c674-86da-496d-896b-c4d84ad44ec8\"}],\"options\":{\"width\":500,\"height\":500,\"zoom\":true,\"rotate_x_label\":-60,\"xrange\":[-73.98632571101189,-73.98617953062057],\"yrange\":[40.735547874977094,40.73570995781772]}}],\"data\":{\"d2949b10-9d6e-4bca-b9c4-068e5ca6ed8d\":[{\"x\":-73.98620970547199,\"y\":40.7356342514617},{\"x\":-73.98627072572708,\"y\":40.735547874977094},{\"x\":-73.98632504045963,\"y\":40.73557226364293},{\"x\":-73.98622445762157,\"y\":40.73570995781772},{\"x\":-73.9861835539341,\"y\":40.73569268254945},{\"x\":-73.98621775209902,\"y\":40.735640856717666},{\"x\":-73.98620970547199,\"y\":40.7356342514617}],\"1ae0953c-32a1-410d-9309-9c17b4ee24b1\":[{\"x\":-73.98620769381522,\"y\":40.73563526765495},{\"x\":-73.9862660318613,\"y\":40.735547874977094},{\"x\":-73.98632504045963
,\"y\":40.735570739351566},{\"x\":-73.98622579872608,\"y\":40.73570944972167},{\"x\":-73.98618154227734,\"y\":40.73569217445325},{\"x\":-73.98621775209902,\"y\":40.73563933242788},{\"x\":-73.98620769381522,\"y\":40.73563526765495}],\"2ea818d3-6d40-42bd-8883-51c377a69610\":[{\"x\":-73.98632369935513,\"y\":40.735570739351566},{\"x\":-73.98622512817383,\"y\":40.73570944972167},{\"x\":-73.98618154227734,\"y\":40.73569014206842},{\"x\":-73.98621909320354,\"y\":40.735640856717666},{\"x\":-73.98620970547199,\"y\":40.73563526765495},{\"x\":-73.98627005517483,\"y\":40.73554889117169},{\"x\":-73.98632369935513,\"y\":40.735570739351566}],\"f5f387ce-162a-43f6-ae21-06efbe8b684a\":[{\"x\":-73.98621842265129,\"y\":40.7356423810074},{\"x\":-73.98620903491974,\"y\":40.73563577575159},{\"x\":-73.98627139627934,\"y\":40.735547874977094},{\"x\":-73.98632436990738,\"y\":40.735571755545806},{\"x\":-73.98622579872608,\"y\":40.73570995781772},{\"x\":-73.98618087172508,\"y\":40.735689633972214},{\"x\":-73.98621842265129,\"y\":40.7356423810074}],\"3064cf5f-6bd9-4acb-b889-5819c1534187\":[{\"x\":-73.98626938462257,\"y\":40.73554889117167},{\"x\":-73.98632369935513,\"y\":40.735572771740024},{\"x\":-73.98622445762157,\"y\":40.73570894162559},{\"x\":-73.98618154227734,\"y\":40.73569065016463},{\"x\":-73.98621775209902,\"y\":40.735640856717666},{\"x\":-73.98620836436749,\"y\":40.735634251461676},{\"x\":-73.98626938462257,\"y\":40.73554889117167}],\"d9ba0ec6-75fe-4cc3-a40e-7d39cf8ecd29\":[{\"x\":-73.98632571101189,\"y\":40.735571755545806},{\"x\":-73.98622378706932,\"y\":40.73570995781772},{\"x\":-73.98618288338184,\"y\":40.73569268254945},{\"x\":-73.98621775209902,\"y\":40.73564034862108},{\"x\":-73.9862110465765,\"y\":40.7356362838482},{\"x\":-73.98627005517483,\"y\":40.735550923560815},{\"x\":-73.98632571101189,\"y\":40.735571755545806}],\"cb7313ae-25e6-4cc4-962f-e31e9b50368e\":[{\"x\":-73.98620970547199,\"y\":40.73563475955834},{\"x\":-73.98627005517483,\"y\":40.73554990736624},{\"x\":-73.98632
369935513,\"y\":40.735571755545806},{\"x\":-73.98622360456956,\"y\":40.73570641325812},{\"x\":-73.9861848950386,\"y\":40.735689633972214},{\"x\":-73.98621842265129,\"y\":40.735640856717666},{\"x\":-73.98620970547199,\"y\":40.73563475955834}],\"c5be729a-e8d3-483c-87c6-fa8d28814b9b\":[{\"x\":-73.98621775209902,\"y\":40.73563984052446},{\"x\":-73.98620836436749,\"y\":40.73563272717173},{\"x\":-73.98626938462257,\"y\":40.735550415463514},{\"x\":-73.98632235825062,\"y\":40.73557124744871},{\"x\":-73.98622360456956,\"y\":40.73570641325812},{\"x\":-73.98618768252459,\"y\":40.73568957578454},{\"x\":-73.98621775209902,\"y\":40.73563984052446}],\"c715c674-86da-496d-896b-c4d84ad44ec8\":[{\"x\":-73.98621909320354,\"y\":40.735638316234656},{\"x\":-73.98620836436749,\"y\":40.7356362838482},{\"x\":-73.98620769381522,\"y\":40.73563577575159},{\"x\":-73.98627005517483,\"y\":40.73554939926897},{\"x\":-73.98632302880287,\"y\":40.73557023125444},{\"x\":-73.98622360456956,\"y\":40.73570641325812},{\"x\":-73.98617953062057,\"y\":40.735689633972214},{\"x\":-73.98621909320354,\"y\":40.735638316234656}]},\"extension\":[]}\n",
" Nyaplot.core.parse(model, '#vis-d40f4dbd-5edd-44f5-8abc-d0d49eef4fef');\n",
" };\n",
" if(window['Nyaplot']==undefined){\n",
" window.addEventListener('load_nyaplot', render, false);\n",
"\treturn;\n",
" }\n",
" render();\n",
"})();\n",
"</script>\n"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 24,
"text": [
"\"<div id='vis-d40f4dbd-5edd-44f5-8abc-d0d49eef4fef'></div>\\n<script>\\n(function(){\\n var render = function(){\\n var model = {\\\"panes\\\":[{\\\"diagrams\\\":[{\\\"type\\\":\\\"scatter\\\",\\\"options\\\":{\\\"x\\\":\\\"x\\\",\\\"y\\\":\\\"y\\\",\\\"color\\\":\\\"#5d9c36\\\"},\\\"data\\\":\\\"d2949b10-9d6e-4bca-b9c4-068e5ca6ed8d\\\"},{\\\"type\\\":\\\"scatter\\\",\\\"options\\\":{\\\"x\\\":\\\"x\\\",\\\"y\\\":\\\"y\\\",\\\"color\\\":\\\"#6e5b61\\\"},\\\"data\\\":\\\"1ae0953c-32a1-410d-9309-9c17b4ee24b1\\\"},{\\\"type\\\":\\\"scatter\\\",\\\"options\\\":{\\\"x\\\":\\\"x\\\",\\\"y\\\":\\\"y\\\",\\\"color\\\":\\\"#917ab4\\\"},\\\"data\\\":\\\"2ea818d3-6d40-42bd-8883-51c377a69610\\\"},{\\\"type\\\":\\\"scatter\\\",\\\"options\\\":{\\\"x\\\":\\\"x\\\",\\\"y\\\":\\\"y\\\",\\\"color\\\":\\\"#38ca36\\\"},\\\"data\\\":\\\"f5f387ce-162a-43f6-ae21-06efbe8b684a\\\"},{\\\"type\\\":\\\"scatter\\\",\\\"options\\\":{\\\"x\\\":\\\"x\\\",\\\"y\\\":\\\"y\\\",\\\"color\\\":\\\"#d16d7c\\\"},\\\"data\\\":\\\"3064cf5f-6bd9-4acb-b889-5819c1534187\\\"},{\\\"type\\\":\\\"scatter\\\",\\\"options\\\":{\\\"x\\\":\\\"x\\\",\\\"y\\\":\\\"y\\\",\\\"color\\\":\\\"#885b3e\\\"},\\\"data\\\":\\\"d9ba0ec6-75fe-4cc3-a40e-7d39cf8ecd29\\\"},{\\\"type\\\":\\\"scatter\\\",\\\"options\\\":{\\\"x\\\":\\\"x\\\",\\\"y\\\":\\\"y\\\",\\\"color\\\":\\\"#4e0177\\\"},\\\"data\\\":\\\"cb7313ae-25e6-4cc4-962f-e31e9b50368e\\\"},{\\\"type\\\":\\\"scatter\\\",\\\"options\\\":{\\\"x\\\":\\\"x\\\",\\\"y\\\":\\\"y\\\",\\\"color\\\":\\\"#e2fd3d\\\"},\\\"data\\\":\\\"c5be729a-e8d3-483c-87c6-fa8d28814b9b\\\"},{\\\"type\\\":\\\"scatter\\\",\\\"options\\\":{\\\"x\\\":\\\"x\\\",\\\"y\\\":\\\"y\\\",\\\"color\\\":\\\"#c02dc4\\\"},\\\"data\\\":\\\"c715c674-86da-496d-896b-c4d84ad44ec8\\\"}],\\\"options\\\":{\\\"width\\\":500,\\\"height\\\":500,\\\"zoom\\\":true,\\\"rotate_x_label\\\":-60,\\\"xrange\\\":[-73.98632571101189,-73.98617953062057],\\\"yrange\\\":[40.735547874977094,40.73570995781772]}}],\\\"data\\\":{\\\"d2949b10-9d6
e-4bca-b9c4-068e5ca6ed8d\\\":[{\\\"x\\\":-73.98620970547199,\\\"y\\\":40.7356342514617},{\\\"x\\\":-73.98627072572708,\\\"y\\\":40.735547874977094},{\\\"x\\\":-73.98632504045963,\\\"y\\\":40.73557226364293},{\\\"x\\\":-73.98622445762157,\\\"y\\\":40.73570995781772},{\\\"x\\\":-73.9861835539341,\\\"y\\\":40.73569268254945},{\\\"x\\\":-73.98621775209902,\\\"y\\\":40.735640856717666},{\\\"x\\\":-73.98620970547199,\\\"y\\\":40.7356342514617}],\\\"1ae0953c-32a1-410d-9309-9c17b4ee24b1\\\":[{\\\"x\\\":-73.98620769381522,\\\"y\\\":40.73563526765495},{\\\"x\\\":-73.9862660318613,\\\"y\\\":40.735547874977094},{\\\"x\\\":-73.98632504045963,\\\"y\\\":40.735570739351566},{\\\"x\\\":-73.98622579872608,\\\"y\\\":40.73570944972167},{\\\"x\\\":-73.98618154227734,\\\"y\\\":40.73569217445325},{\\\"x\\\":-73.98621775209902,\\\"y\\\":40.73563933242788},{\\\"x\\\":-73.98620769381522,\\\"y\\\":40.73563526765495}],\\\"2ea818d3-6d40-42bd-8883-51c377a69610\\\":[{\\\"x\\\":-73.98632369935513,\\\"y\\\":40.735570739351566},{\\\"x\\\":-73.98622512817383,\\\"y\\\":40.73570944972167},{\\\"x\\\":-73.98618154227734,\\\"y\\\":40.73569014206842},{\\\"x\\\":-73.98621909320354,\\\"y\\\":40.735640856717666},{\\\"x\\\":-73.98620970547199,\\\"y\\\":40.73563526765495},{\\\"x\\\":-73.98627005517483,\\\"y\\\":40.73554889117169},{\\\"x\\\":-73.98632369935513,\\\"y\\\":40.735570739351566}],\\\"f5f387ce-162a-43f6-ae21-06efbe8b684a\\\":[{\\\"x\\\":-73.98621842265129,\\\"y\\\":40.7356423810074},{\\\"x\\\":-73.98620903491974,\\\"y\\\":40.73563577575159},{\\\"x\\\":-73.98627139627934,\\\"y\\\":40.735547874977094},{\\\"x\\\":-73.98632436990738,\\\"y\\\":40.735571755545806},{\\\"x\\\":-73.98622579872608,\\\"y\\\":40.73570995781772},{\\\"x\\\":-73.98618087172508,\\\"y\\\":40.735689633972214},{\\\"x\\\":-73.98621842265129,\\\"y\\\":40.7356423810074}],\\\"3064cf5f-6bd9-4acb-b889-5819c1534187\\\":[{\\\"x\\\":-73.98626938462257,\\\"y\\\":40.73554889117167},{\\\"x\\\":-73.98632369935513,\\\"y\\\":40.735572771740024},{\\\"x\
\\":-73.98622445762157,\\\"y\\\":40.73570894162559},{\\\"x\\\":-73.98618154227734,\\\"y\\\":40.73569065016463},{\\\"x\\\":-73.98621775209902,\\\"y\\\":40.735640856717666},{\\\"x\\\":-73.98620836436749,\\\"y\\\":40.735634251461676},{\\\"x\\\":-73.98626938462257,\\\"y\\\":40.73554889117167}],\\\"d9ba0ec6-75fe-4cc3-a40e-7d39cf8ecd29\\\":[{\\\"x\\\":-73.98632571101189,\\\"y\\\":40.735571755545806},{\\\"x\\\":-73.98622378706932,\\\"y\\\":40.73570995781772},{\\\"x\\\":-73.98618288338184,\\\"y\\\":40.73569268254945},{\\\"x\\\":-73.98621775209902,\\\"y\\\":40.73564034862108},{\\\"x\\\":-73.9862110465765,\\\"y\\\":40.7356362838482},{\\\"x\\\":-73.98627005517483,\\\"y\\\":40.735550923560815},{\\\"x\\\":-73.98632571101189,\\\"y\\\":40.735571755545806}],\\\"cb7313ae-25e6-4cc4-962f-e31e9b50368e\\\":[{\\\"x\\\":-73.98620970547199,\\\"y\\\":40.73563475955834},{\\\"x\\\":-73.98627005517483,\\\"y\\\":40.73554990736624},{\\\"x\\\":-73.98632369935513,\\\"y\\\":40.735571755545806},{\\\"x\\\":-73.98622360456956,\\\"y\\\":40.73570641325812},{\\\"x\\\":-73.9861848950386,\\\"y\\\":40.735689633972214},{\\\"x\\\":-73.98621842265129,\\\"y\\\":40.735640856717666},{\\\"x\\\":-73.98620970547199,\\\"y\\\":40.73563475955834}],\\\"c5be729a-e8d3-483c-87c6-fa8d28814b9b\\\":[{\\\"x\\\":-73.98621775209902,\\\"y\\\":40.73563984052446},{\\\"x\\\":-73.98620836436749,\\\"y\\\":40.73563272717173},{\\\"x\\\":-73.98626938462257,\\\"y\\\":40.735550415463514},{\\\"x\\\":-73.98632235825062,\\\"y\\\":40.73557124744871},{\\\"x\\\":-73.98622360456956,\\\"y\\\":40.73570641325812},{\\\"x\\\":-73.98618768252459,\\\"y\\\":40.73568957578454},{\\\"x\\\":-73.98621775209902,\\\"y\\\":40.73563984052446}],\\\"c715c674-86da-496d-896b-c4d84ad44ec8\\\":[{\\\"x\\\":-73.98621909320354,\\\"y\\\":40.735638316234656},{\\\"x\\\":-73.98620836436749,\\\"y\\\":40.7356362838482},{\\\"x\\\":-73.98620769381522,\\\"y\\\":40.73563577575159},{\\\"x\\\":-73.98627005517483,\\\"y\\\":40.73554939926897},{\\\"x\\\":-73.98632302880287,\\\"y\\\":40.
73557023125444},{\\\"x\\\":-73.98622360456956,\\\"y\\\":40.73570641325812},{\\\"x\\\":-73.98617953062057,\\\"y\\\":40.735689633972214},{\\\"x\\\":-73.98621909320354,\\\"y\\\":40.735638316234656}]},\\\"extension\\\":[]}\\n Nyaplot.core.parse(model, '#vis-d40f4dbd-5edd-44f5-8abc-d0d49eef4fef');\\n };\\n if(window['Nyaplot']==undefined){\\n window.addEventListener('load_nyaplot', render, false);\\n\\treturn;\\n }\\n render();\\n})();\\n</script>\\n\""
]
}
],
"prompt_number": 24
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's cluster these points. Below is a function that extracts the points from a list of polygons:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def get_all_poly_points(polys)\n",
"  # one array of [lon, lat] points per polygon\n",
"  polys.map { |poly| get_points(poly) }\n",
"end"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 25,
"text": [
":get_all_poly_points"
]
}
],
"prompt_number": 25
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"cluster_poly_points = get_all_poly_points(cluster_polygons)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 26,
"text": [
"[[[-73.98620970547199, 40.7356342514617], [-73.98627072572708, 40.735547874977094], [-73.98632504045963, 40.73557226364293], [-73.98622445762157, 40.73570995781772], [-73.9861835539341, 40.73569268254945], [-73.98621775209902, 40.735640856717666], [-73.98620970547199, 40.7356342514617]], [[-73.98620769381522, 40.73563526765495], [-73.9862660318613, 40.735547874977094], [-73.98632504045963, 40.735570739351566], [-73.98622579872608, 40.73570944972167], [-73.98618154227734, 40.73569217445325], [-73.98621775209902, 40.73563933242788], [-73.98620769381522, 40.73563526765495]], [[-73.98632369935513, 40.735570739351566], [-73.98622512817383, 40.73570944972167], [-73.98618154227734, 40.73569014206842], [-73.98621909320354, 40.735640856717666], [-73.98620970547199, 40.73563526765495], [-73.98627005517483, 40.73554889117169], [-73.98632369935513, 40.735570739351566]], [[-73.98621842265129, 40.7356423810074], [-73.98620903491974, 40.73563577575159], [-73.98627139627934, 40.735547874977094], [-73.98632436990738, 40.735571755545806], [-73.98622579872608, 40.73570995781772], [-73.98618087172508, 40.735689633972214], [-73.98621842265129, 40.7356423810074]], [[-73.98626938462257, 40.73554889117167], [-73.98632369935513, 40.735572771740024], [-73.98622445762157, 40.73570894162559], [-73.98618154227734, 40.73569065016463], [-73.98621775209902, 40.735640856717666], [-73.98620836436749, 40.735634251461676], [-73.98626938462257, 40.73554889117167]], [[-73.98632571101189, 40.735571755545806], [-73.98622378706932, 40.73570995781772], [-73.98618288338184, 40.73569268254945], [-73.98621775209902, 40.73564034862108], [-73.9862110465765, 40.7356362838482], [-73.98627005517483, 40.735550923560815], [-73.98632571101189, 40.735571755545806]], [[-73.98620970547199, 40.73563475955834], [-73.98627005517483, 40.73554990736624], [-73.98632369935513, 40.735571755545806], [-73.98622360456956, 40.73570641325812], [-73.9861848950386, 40.735689633972214], [-73.98621842265129, 40.735640856717666], 
[-73.98620970547199, 40.73563475955834]], [[-73.98621775209902, 40.73563984052446], [-73.98620836436749, 40.73563272717173], [-73.98626938462257, 40.735550415463514], [-73.98632235825062, 40.73557124744871], [-73.98622360456956, 40.73570641325812], [-73.98618768252459, 40.73568957578454], [-73.98621775209902, 40.73563984052446]], [[-73.98621909320354, 40.735638316234656], [-73.98620836436749, 40.7356362838482], [-73.98620769381522, 40.73563577575159], [-73.98627005517483, 40.73554939926897], [-73.98632302880287, 40.73557023125444], [-73.98622360456956, 40.73570641325812], [-73.98617953062057, 40.735689633972214], [-73.98621909320354, 40.735638316234656]]]"
]
}
],
"prompt_number": 26
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Finding a good \u03b5 value for these points is a bit more complicated. If it is too big, the L-shape will be lost: the points in that corner will be clustered together. After some fiddling I found that a value of $6 \\times 10^{-6}$ (in coordinate degrees) works well.\n",
"\n",
"An important aspect to account for here is that the GeoJSON spec requires the coordinate array of a polygon ring to begin _and end_ with the _same point_. That point would therefore be **counted twice** if we left the array as-is. Below are the resulting clustering function, the corresponding test, and the plot:"
]
},
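{
"cell_type": "markdown",
"metadata": {},
"source": [
"To get a feel for how small this \u03b5 is, the sketch below converts it from degrees to approximate meters at the latitude of these points. It assumes a spherical Earth with roughly 111,320 m per degree of latitude, which is only an approximation; the names `METERS_PER_DEGREE_LAT` and `epsilon_in_meters` are illustrative and not part of the original notebook.\n",
"\n",
"```ruby\n",
"# Approximate length of one degree of latitude (spherical-Earth assumption).\n",
"METERS_PER_DEGREE_LAT = 111_320.0\n",
"\n",
"# Returns [east_west_meters, north_south_meters] for a distance given in degrees.\n",
"def epsilon_in_meters(epsilon_deg, latitude_deg)\n",
"  lat_rad = latitude_deg * Math::PI / 180.0\n",
"  north_south = epsilon_deg * METERS_PER_DEGREE_LAT\n",
"  # a degree of longitude shrinks with the cosine of the latitude\n",
"  east_west = north_south * Math.cos(lat_rad)\n",
"  [east_west, north_south]\n",
"end\n",
"\n",
"ew, ns = epsilon_in_meters(6e-06, 40.7356)\n",
"puts format('%.2f m east-west, %.2f m north-south', ew, ns)\n",
"```\n",
"\n",
"Because the clustering uses plain Euclidean distance on degree values, the same \u03b5 covers somewhat less ground east-west than north-south at this latitude."
]
},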
{
"cell_type": "code",
"collapsed": false,
"input": [
"def cluster_points(points, epsilon=6e-06, min_points=2)\n",
"  # flatten one level so DBSCAN receives a single list of [lon, lat] points\n",
"  dbscan = DBSCAN( points.flatten(1), :epsilon => epsilon, :min_points => min_points, :distance => :euclidean_distance )\n",
"  return dbscan.results.select{|k,v| k != -1} # omit the non-cluster (label -1)\n",
"end"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 27,
"text": [
":cluster_points"
]
}
],
"prompt_number": 27
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# exclude the first item in each poly since it is the same as the last\n",
"unique_points = cluster_poly_points.map{|poly| poly[1..-1]}\n",
"vertex_clusters = cluster_points(unique_points)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 28,
"text": [
"{0=>[[-73.98627072572708, 40.735547874977094], [-73.9862660318613, 40.735547874977094], [-73.98627005517483, 40.73554889117169], [-73.98627139627934, 40.735547874977094], [-73.98626938462257, 40.73554889117167], [-73.98627005517483, 40.735550923560815], [-73.98627005517483, 40.73554990736624], [-73.98626938462257, 40.735550415463514], [-73.98627005517483, 40.73554939926897]], 1=>[[-73.98632504045963, 40.73557226364293], [-73.98632504045963, 40.735570739351566], [-73.98632369935513, 40.735570739351566], [-73.98632436990738, 40.735571755545806], [-73.98632369935513, 40.735572771740024], [-73.98632571101189, 40.735571755545806], [-73.98632369935513, 40.735571755545806], [-73.98632235825062, 40.73557124744871], [-73.98632302880287, 40.73557023125444]], 2=>[[-73.98622445762157, 40.73570995781772], [-73.98622579872608, 40.73570944972167], [-73.98622512817383, 40.73570944972167], [-73.98622579872608, 40.73570995781772], [-73.98622445762157, 40.73570894162559], [-73.98622378706932, 40.73570995781772], [-73.98622360456956, 40.73570641325812], [-73.98622360456956, 40.73570641325812], [-73.98622360456956, 40.73570641325812]], 3=>[[-73.9861835539341, 40.73569268254945], [-73.98618154227734, 40.73569217445325], [-73.98618154227734, 40.73569014206842], [-73.98618087172508, 40.735689633972214], [-73.98618154227734, 40.73569065016463], [-73.98618288338184, 40.73569268254945], [-73.9861848950386, 40.735689633972214], [-73.98618768252459, 40.73568957578454], [-73.98617953062057, 40.735689633972214]], 4=>[[-73.98621775209902, 40.735640856717666], [-73.98621775209902, 40.73563933242788], [-73.98621909320354, 40.735640856717666], [-73.98621842265129, 40.7356423810074], [-73.98621775209902, 40.73564034862108], [-73.98621842265129, 40.735640856717666], [-73.98621775209902, 40.73563984052446], [-73.98621909320354, 40.735638316234656], [-73.98621775209902, 40.735640856717666]], 5=>[[-73.98620970547199, 40.7356342514617], [-73.98620769381522, 40.73563526765495], [-73.98620970547199, 
40.73563526765495], [-73.98620903491974, 40.73563577575159], [-73.98620836436749, 40.735634251461676], [-73.9862110465765, 40.7356362838482], [-73.98620970547199, 40.73563475955834], [-73.98620836436749, 40.73563272717173], [-73.98620836436749, 40.7356362838482], [-73.98620769381522, 40.73563577575159]]}"
]
}
],
"prompt_number": 28
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"plot = plot_clusters(vertex_clusters)\n",
"plot.show"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"<div id='vis-6a777312-113c-4916-9476-4438cd56ec78'></div>\n",
"<script>\n",
"(function(){\n",
" var render = function(){\n",
" var model = {\"panes\":[{\"diagrams\":[{\"type\":\"scatter\",\"options\":{\"x\":\"x\",\"y\":\"y\",\"tooltip_contents\":[\"cluster\"],\"color\":\"#d10d45\"},\"data\":\"06607b62-1b6a-4ccb-a030-0da9f8e3f7ac\"},{\"type\":\"scatter\",\"options\":{\"x\":\"x\",\"y\":\"y\",\"tooltip_contents\":[\"cluster\"],\"color\":\"#12c935\"},\"data\":\"484d65e9-0b8c-4ed4-98d2-965306dd5bda\"},{\"type\":\"scatter\",\"options\":{\"x\":\"x\",\"y\":\"y\",\"tooltip_contents\":[\"cluster\"],\"color\":\"#de6ea2\"},\"data\":\"97316b13-e1ec-417f-824a-4a1c084c10c3\"},{\"type\":\"scatter\",\"options\":{\"x\":\"x\",\"y\":\"y\",\"tooltip_contents\":[\"cluster\"],\"color\":\"#0c0dcd\"},\"data\":\"a89cda52-f106-4b44-95aa-aaa6defc129b\"},{\"type\":\"scatter\",\"options\":{\"x\":\"x\",\"y\":\"y\",\"tooltip_contents\":[\"cluster\"],\"color\":\"#dd318d\"},\"data\":\"615c22b4-81f1-4413-80fa-fc406e6c1efb\"},{\"type\":\"scatter\",\"options\":{\"x\":\"x\",\"y\":\"y\",\"tooltip_contents\":[\"cluster\"],\"color\":\"#242148\"},\"data\":\"d0725452-aa45-4ebb-ac99-361f36b0bb10\"}],\"options\":{\"width\":300,\"height\":400,\"zoom\":true,\"rotate_x_label\":-60,\"xrange\":[-73.98633571101189,-73.98616953062057],\"yrange\":[40.73553787497709,40.73571995781772]}}],\"data\":{\"06607b62-1b6a-4ccb-a030-0da9f8e3f7ac\":[{\"x\":-73.98627072572708,\"y\":40.735547874977094,\"cluster\":0},{\"x\":-73.9862660318613,\"y\":40.735547874977094,\"cluster\":0},{\"x\":-73.98627005517483,\"y\":40.73554889117169,\"cluster\":0},{\"x\":-73.98627139627934,\"y\":40.735547874977094,\"cluster\":0},{\"x\":-73.98626938462257,\"y\":40.73554889117167,\"cluster\":0},{\"x\":-73.98627005517483,\"y\":40.735550923560815,\"cluster\":0},{\"x\":-73.98627005517483,\"y\":40.73554990736624,\"cluster\":0},{\"x\":-73.98626938462257,\"y\":40.735550415463514,\"cluster\":0},{\"x\":-73.98627005517483,\"y\":40.73554939926897,\"cluster\":0}],\"484d65e9-0b8c-4ed4-98d2-965306dd5bda\":[{\"x\":-73.98632504045963,\"y\":40.73557226364293,\"cluster\":1},{\"x\":-73.98632504
045963,\"y\":40.735570739351566,\"cluster\":1},{\"x\":-73.98632369935513,\"y\":40.735570739351566,\"cluster\":1},{\"x\":-73.98632436990738,\"y\":40.735571755545806,\"cluster\":1},{\"x\":-73.98632369935513,\"y\":40.735572771740024,\"cluster\":1},{\"x\":-73.98632571101189,\"y\":40.735571755545806,\"cluster\":1},{\"x\":-73.98632369935513,\"y\":40.735571755545806,\"cluster\":1},{\"x\":-73.98632235825062,\"y\":40.73557124744871,\"cluster\":1},{\"x\":-73.98632302880287,\"y\":40.73557023125444,\"cluster\":1}],\"97316b13-e1ec-417f-824a-4a1c084c10c3\":[{\"x\":-73.98622445762157,\"y\":40.73570995781772,\"cluster\":2},{\"x\":-73.98622579872608,\"y\":40.73570944972167,\"cluster\":2},{\"x\":-73.98622512817383,\"y\":40.73570944972167,\"cluster\":2},{\"x\":-73.98622579872608,\"y\":40.73570995781772,\"cluster\":2},{\"x\":-73.98622445762157,\"y\":40.73570894162559,\"cluster\":2},{\"x\":-73.98622378706932,\"y\":40.73570995781772,\"cluster\":2},{\"x\":-73.98622360456956,\"y\":40.73570641325812,\"cluster\":2},{\"x\":-73.98622360456956,\"y\":40.73570641325812,\"cluster\":2},{\"x\":-73.98622360456956,\"y\":40.73570641325812,\"cluster\":2}],\"a89cda52-f106-4b44-95aa-aaa6defc129b\":[{\"x\":-73.9861835539341,\"y\":40.73569268254945,\"cluster\":3},{\"x\":-73.98618154227734,\"y\":40.73569217445325,\"cluster\":3},{\"x\":-73.98618154227734,\"y\":40.73569014206842,\"cluster\":3},{\"x\":-73.98618087172508,\"y\":40.735689633972214,\"cluster\":3},{\"x\":-73.98618154227734,\"y\":40.73569065016463,\"cluster\":3},{\"x\":-73.98618288338184,\"y\":40.73569268254945,\"cluster\":3},{\"x\":-73.9861848950386,\"y\":40.735689633972214,\"cluster\":3},{\"x\":-73.98618768252459,\"y\":40.73568957578454,\"cluster\":3},{\"x\":-73.98617953062057,\"y\":40.735689633972214,\"cluster\":3}],\"615c22b4-81f1-4413-80fa-fc406e6c1efb\":[{\"x\":-73.98621775209902,\"y\":40.735640856717666,\"cluster\":4},{\"x\":-73.98621775209902,\"y\":40.73563933242788,\"cluster\":4},{\"x\":-73.98621909320354,\"y\":40.735640856717666,\"cluster\"
:4},{\"x\":-73.98621842265129,\"y\":40.7356423810074,\"cluster\":4},{\"x\":-73.98621775209902,\"y\":40.73564034862108,\"cluster\":4},{\"x\":-73.98621842265129,\"y\":40.735640856717666,\"cluster\":4},{\"x\":-73.98621775209902,\"y\":40.73563984052446,\"cluster\":4},{\"x\":-73.98621909320354,\"y\":40.735638316234656,\"cluster\":4},{\"x\":-73.98621775209902,\"y\":40.735640856717666,\"cluster\":4}],\"d0725452-aa45-4ebb-ac99-361f36b0bb10\":[{\"x\":-73.98620970547199,\"y\":40.7356342514617,\"cluster\":5},{\"x\":-73.98620769381522,\"y\":40.73563526765495,\"cluster\":5},{\"x\":-73.98620970547199,\"y\":40.73563526765495,\"cluster\":5},{\"x\":-73.98620903491974,\"y\":40.73563577575159,\"cluster\":5},{\"x\":-73.98620836436749,\"y\":40.735634251461676,\"cluster\":5},{\"x\":-73.9862110465765,\"y\":40.7356362838482,\"cluster\":5},{\"x\":-73.98620970547199,\"y\":40.73563475955834,\"cluster\":5},{\"x\":-73.98620836436749,\"y\":40.73563272717173,\"cluster\":5},{\"x\":-73.98620836436749,\"y\":40.7356362838482,\"cluster\":5},{\"x\":-73.98620769381522,\"y\":40.73563577575159,\"cluster\":5}]},\"extension\":[]}\n",
" Nyaplot.core.parse(model, '#vis-6a777312-113c-4916-9476-4438cd56ec78');\n",
" };\n",
" if(window['Nyaplot']==undefined){\n",
" window.addEventListener('load_nyaplot', render, false);\n",
"\treturn;\n",
" }\n",
" render();\n",
"})();\n",
"</script>\n"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 29,
"text": [
"\"<div id='vis-6a777312-113c-4916-9476-4438cd56ec78'></div>\\n<script>\\n(function(){\\n var render = function(){\\n var model = {\\\"panes\\\":[{\\\"diagrams\\\":[{\\\"type\\\":\\\"scatter\\\",\\\"options\\\":{\\\"x\\\":\\\"x\\\",\\\"y\\\":\\\"y\\\",\\\"tooltip_contents\\\":[\\\"cluster\\\"],\\\"color\\\":\\\"#d10d45\\\"},\\\"data\\\":\\\"06607b62-1b6a-4ccb-a030-0da9f8e3f7ac\\\"},{\\\"type\\\":\\\"scatter\\\",\\\"options\\\":{\\\"x\\\":\\\"x\\\",\\\"y\\\":\\\"y\\\",\\\"tooltip_contents\\\":[\\\"cluster\\\"],\\\"color\\\":\\\"#12c935\\\"},\\\"data\\\":\\\"484d65e9-0b8c-4ed4-98d2-965306dd5bda\\\"},{\\\"type\\\":\\\"scatter\\\",\\\"options\\\":{\\\"x\\\":\\\"x\\\",\\\"y\\\":\\\"y\\\",\\\"tooltip_contents\\\":[\\\"cluster\\\"],\\\"color\\\":\\\"#de6ea2\\\"},\\\"data\\\":\\\"97316b13-e1ec-417f-824a-4a1c084c10c3\\\"},{\\\"type\\\":\\\"scatter\\\",\\\"options\\\":{\\\"x\\\":\\\"x\\\",\\\"y\\\":\\\"y\\\",\\\"tooltip_contents\\\":[\\\"cluster\\\"],\\\"color\\\":\\\"#0c0dcd\\\"},\\\"data\\\":\\\"a89cda52-f106-4b44-95aa-aaa6defc129b\\\"},{\\\"type\\\":\\\"scatter\\\",\\\"options\\\":{\\\"x\\\":\\\"x\\\",\\\"y\\\":\\\"y\\\",\\\"tooltip_contents\\\":[\\\"cluster\\\"],\\\"color\\\":\\\"#dd318d\\\"},\\\"data\\\":\\\"615c22b4-81f1-4413-80fa-fc406e6c1efb\\\"},{\\\"type\\\":\\\"scatter\\\",\\\"options\\\":{\\\"x\\\":\\\"x\\\",\\\"y\\\":\\\"y\\\",\\\"tooltip_contents\\\":[\\\"cluster\\\"],\\\"color\\\":\\\"#242148\\\"},\\\"data\\\":\\\"d0725452-aa45-4ebb-ac99-361f36b0bb10\\\"}],\\\"options\\\":{\\\"width\\\":300,\\\"height\\\":400,\\\"zoom\\\":true,\\\"rotate_x_label\\\":-60,\\\"xrange\\\":[-73.98633571101189,-73.98616953062057],\\\"yrange\\\":[40.73553787497709,40.73571995781772]}}],\\\"data\\\":{\\\"06607b62-1b6a-4ccb-a030-0da9f8e3f7ac\\\":[{\\\"x\\\":-73.98627072572708,\\\"y\\\":40.735547874977094,\\\"cluster\\\":0},{\\\"x\\\":-73.9862660318613,\\\"y\\\":40.735547874977094,\\\"cluster\\\":0},{\\\"x\\\":-73.98627005517483,\\\"y\\\":40.73554889117169,\\\"cluster\\\":0},{\\\"x\\\":-
73.98627139627934,\\\"y\\\":40.735547874977094,\\\"cluster\\\":0},{\\\"x\\\":-73.98626938462257,\\\"y\\\":40.73554889117167,\\\"cluster\\\":0},{\\\"x\\\":-73.98627005517483,\\\"y\\\":40.735550923560815,\\\"cluster\\\":0},{\\\"x\\\":-73.98627005517483,\\\"y\\\":40.73554990736624,\\\"cluster\\\":0},{\\\"x\\\":-73.98626938462257,\\\"y\\\":40.735550415463514,\\\"cluster\\\":0},{\\\"x\\\":-73.98627005517483,\\\"y\\\":40.73554939926897,\\\"cluster\\\":0}],\\\"484d65e9-0b8c-4ed4-98d2-965306dd5bda\\\":[{\\\"x\\\":-73.98632504045963,\\\"y\\\":40.73557226364293,\\\"cluster\\\":1},{\\\"x\\\":-73.98632504045963,\\\"y\\\":40.735570739351566,\\\"cluster\\\":1},{\\\"x\\\":-73.98632369935513,\\\"y\\\":40.735570739351566,\\\"cluster\\\":1},{\\\"x\\\":-73.98632436990738,\\\"y\\\":40.735571755545806,\\\"cluster\\\":1},{\\\"x\\\":-73.98632369935513,\\\"y\\\":40.735572771740024,\\\"cluster\\\":1},{\\\"x\\\":-73.98632571101189,\\\"y\\\":40.735571755545806,\\\"cluster\\\":1},{\\\"x\\\":-73.98632369935513,\\\"y\\\":40.735571755545806,\\\"cluster\\\":1},{\\\"x\\\":-73.98632235825062,\\\"y\\\":40.73557124744871,\\\"cluster\\\":1},{\\\"x\\\":-73.98632302880287,\\\"y\\\":40.73557023125444,\\\"cluster\\\":1}],\\\"97316b13-e1ec-417f-824a-4a1c084c10c3\\\":[{\\\"x\\\":-73.98622445762157,\\\"y\\\":40.73570995781772,\\\"cluster\\\":2},{\\\"x\\\":-73.98622579872608,\\\"y\\\":40.73570944972167,\\\"cluster\\\":2},{\\\"x\\\":-73.98622512817383,\\\"y\\\":40.73570944972167,\\\"cluster\\\":2},{\\\"x\\\":-73.98622579872608,\\\"y\\\":40.73570995781772,\\\"cluster\\\":2},{\\\"x\\\":-73.98622445762157,\\\"y\\\":40.73570894162559,\\\"cluster\\\":2},{\\\"x\\\":-73.98622378706932,\\\"y\\\":40.73570995781772,\\\"cluster\\\":2},{\\\"x\\\":-73.98622360456956,\\\"y\\\":40.73570641325812,\\\"cluster\\\":2},{\\\"x\\\":-73.98622360456956,\\\"y\\\":40.73570641325812,\\\"cluster\\\":2},{\\\"x\\\":-73.98622360456956,\\\"y\\\":40.73570641325812,\\\"cluster\\\":2}],\\\"a89cda52-f106-4b44-95aa-aaa6defc129b\\\":[{\\\"x\\\":-73
.9861835539341,\\\"y\\\":40.73569268254945,\\\"cluster\\\":3},{\\\"x\\\":-73.98618154227734,\\\"y\\\":40.73569217445325,\\\"cluster\\\":3},{\\\"x\\\":-73.98618154227734,\\\"y\\\":40.73569014206842,\\\"cluster\\\":3},{\\\"x\\\":-73.98618087172508,\\\"y\\\":40.735689633972214,\\\"cluster\\\":3},{\\\"x\\\":-73.98618154227734,\\\"y\\\":40.73569065016463,\\\"cluster\\\":3},{\\\"x\\\":-73.98618288338184,\\\"y\\\":40.73569268254945,\\\"cluster\\\":3},{\\\"x\\\":-73.9861848950386,\\\"y\\\":40.735689633972214,\\\"cluster\\\":3},{\\\"x\\\":-73.98618768252459,\\\"y\\\":40.73568957578454,\\\"cluster\\\":3},{\\\"x\\\":-73.98617953062057,\\\"y\\\":40.735689633972214,\\\"cluster\\\":3}],\\\"615c22b4-81f1-4413-80fa-fc406e6c1efb\\\":[{\\\"x\\\":-73.98621775209902,\\\"y\\\":40.735640856717666,\\\"cluster\\\":4},{\\\"x\\\":-73.98621775209902,\\\"y\\\":40.73563933242788,\\\"cluster\\\":4},{\\\"x\\\":-73.98621909320354,\\\"y\\\":40.735640856717666,\\\"cluster\\\":4},{\\\"x\\\":-73.98621842265129,\\\"y\\\":40.7356423810074,\\\"cluster\\\":4},{\\\"x\\\":-73.98621775209902,\\\"y\\\":40.73564034862108,\\\"cluster\\\":4},{\\\"x\\\":-73.98621842265129,\\\"y\\\":40.735640856717666,\\\"cluster\\\":4},{\\\"x\\\":-73.98621775209902,\\\"y\\\":40.73563984052446,\\\"cluster\\\":4},{\\\"x\\\":-73.98621909320354,\\\"y\\\":40.735638316234656,\\\"cluster\\\":4},{\\\"x\\\":-73.98621775209902,\\\"y\\\":40.735640856717666,\\\"cluster\\\":4}],\\\"d0725452-aa45-4ebb-ac99-361f36b0bb10\\\":[{\\\"x\\\":-73.98620970547199,\\\"y\\\":40.7356342514617,\\\"cluster\\\":5},{\\\"x\\\":-73.98620769381522,\\\"y\\\":40.73563526765495,\\\"cluster\\\":5},{\\\"x\\\":-73.98620970547199,\\\"y\\\":40.73563526765495,\\\"cluster\\\":5},{\\\"x\\\":-73.98620903491974,\\\"y\\\":40.73563577575159,\\\"cluster\\\":5},{\\\"x\\\":-73.98620836436749,\\\"y\\\":40.735634251461676,\\\"cluster\\\":5},{\\\"x\\\":-73.9862110465765,\\\"y\\\":40.7356362838482,\\\"cluster\\\":5},{\\\"x\\\":-73.98620970547199,\\\"y\\\":40.73563475955834,\\\"cluster
\\\":5},{\\\"x\\\":-73.98620836436749,\\\"y\\\":40.73563272717173,\\\"cluster\\\":5},{\\\"x\\\":-73.98620836436749,\\\"y\\\":40.7356362838482,\\\"cluster\\\":5},{\\\"x\\\":-73.98620769381522,\\\"y\\\":40.73563577575159,\\\"cluster\\\":5}]},\\\"extension\\\":[]}\\n Nyaplot.core.parse(model, '#vis-6a777312-113c-4916-9476-4438cd56ec78');\\n };\\n if(window['Nyaplot']==undefined){\\n window.addEventListener('load_nyaplot', render, false);\\n\\treturn;\\n }\\n render();\\n})();\\n</script>\\n\""
]
}
],
"prompt_number": 29
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Finding the mean polygon\n",
"\n",
"Now we:\n",
"\n",
"1. find the mean vertex of each vertex cluster\n",
"1. connect the mean vertices into a mean polygon\n",
"\n",
"For this we need some extra methods on the `Array` class to find the mean value:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"class Array\n",
" def sum\n",
" inject(0.0) { |result, el| result + el }\n",
" end\n",
"\n",
" def mean \n",
" sum / size\n",
" end\n",
"end"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 30,
"text": [
":mean"
]
}
],
"prompt_number": 30
},
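{
"cell_type": "markdown",
"metadata": {},
"source": [
"Reopening `Array` works, but note that Ruby 2.4 and later already ship a built-in `Array#sum`, which the definition above overrides. A minimal alternative sketch that leaves the core class untouched is a standalone helper; the name `mean_of` is hypothetical and is not used elsewhere in this notebook.\n",
"\n",
"```ruby\n",
"# Mean of a list of numbers without reopening Array.\n",
"# Starting the fold at 0.0 forces floating-point arithmetic.\n",
"def mean_of(values)\n",
"  raise ArgumentError, 'cannot take the mean of an empty list' if values.empty?\n",
"  values.inject(0.0) { |acc, v| acc + v } / values.size\n",
"end\n",
"\n",
"lons = [-73.98627072572708, -73.9862660318613, -73.98627005517483]\n",
"puts mean_of(lons)\n",
"```\n",
"\n",
"The explicit check on empty input avoids silently returning `NaN`, which is what dividing `0.0` by a size of zero would produce."
]
},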
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we need a function that receives the vertex clusters and returns the average vertex for each cluster:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def get_mean_poly(clusters)\n",
"  means = {}\n",
"  # each entry is a cluster id paired with its list of [lon, lat] points\n",
"  clusters.each do |id, points|\n",
"    next if id == -1 # ignore cluster -1\n",
"    lon = points.map { |pt| pt[0] }.mean\n",
"    lat = points.map { |pt| pt[1] }.mean\n",
"    means[id] = [lon, lat]\n",
"  end\n",
"  return means\n",
"end"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 31,
"text": [
":get_mean_poly"
]
}
],
"prompt_number": 31
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We test this function with our vertex clusters and plot both (mean vertices as yellow diamonds):"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"mean_poly = get_mean_poly(vertex_clusters)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 32,
"text": [
"{0=>[-73.9862696826458, 40.73554911699269], 1=>[-73.98632407188416, 40.73557147326963], 2=>[-73.98622447129412, 40.73570855047738], 3=>[-73.98618267156186, 40.73569075660959], 4=>[-73.98621819913387, 40.73564040507624], 5=>[-73.98620896786451, 40.735635064416286]}"
]
}
],
"prompt_number": 32
},
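{
"cell_type": "markdown",
"metadata": {},
"source": [
"To connect the mean vertices into a polygon, one option is to take them in cluster order and repeat the first point at the end, as GeoJSON requires. The sketch below assumes that the cluster ids happen to follow the order in which the vertices were drawn, which should be checked against the plot before relying on it; `mean_poly_to_geojson` is a hypothetical helper, not part of the original notebook.\n",
"\n",
"```ruby\n",
"require 'json'\n",
"\n",
"# Builds a GeoJSON Polygon geometry from a hash of cluster id => [lon, lat].\n",
"def mean_poly_to_geojson(mean_poly)\n",
"  # take the vertices in ascending cluster-id order (assumed to be ring order)\n",
"  ring = mean_poly.sort_by { |id, _| id }.map { |_, point| point }\n",
"  # GeoJSON rings must start and end with the same position\n",
"  ring << ring.first unless ring.first == ring.last\n",
"  { 'type' => 'Polygon', 'coordinates' => [ring] }\n",
"end\n",
"\n",
"sample = {\n",
"  0 => [-73.9862696826458, 40.73554911699269],\n",
"  1 => [-73.98632407188416, 40.73557147326963],\n",
"  2 => [-73.98622447129412, 40.73570855047738]\n",
"}\n",
"puts JSON.generate(mean_poly_to_geojson(sample))\n",
"```\n",
"\n",
"If the cluster ids do not follow the drawing order, the ring would zigzag across the shape, so the vertex order would need to be derived from the original polygons instead."
]
},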
{
"cell_type": "code",
"collapsed": false,
"input": [
"# plot clusters with overlaid (yellow) mean points\n",
"plot = plot_clusters(vertex_clusters)\n",
"# add means\n",
"m_x = mean_poly.map { |m| m[1][0] }\n",
"m_y = mean_poly.map { |m| m[1][1] }\n",
"sc = plot.add(:scatter, m_x, m_y)\n",
"color = \"#ffff00\"\n",
"sc.color(color)\n",
"sc.shape('diamond')\n",
"plot.show"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"<div id='vis-2c7602f7-cc7d-4d94-becc-fd61a3798004'></div>\n",
"<script>\n",
"(function(){\n",
" var render = function(){\n",
" var model = {\"panes\":[{\"diagrams\":[{\"type\":\"scatter\",\"options\":{\"x\":\"x\",\"y\":\"y\",\"tooltip_contents\":[\"cluster\"],\"color\":\"#e469de\"},\"data\":\"68ce990c-89db-490b-8361-a6602e98fafa\"},{\"type\":\"scatter\",\"options\":{\"x\":\"x\",\"y\":\"y\",\"tooltip_contents\":[\"cluster\"],\"color\":\"#003a5f\"},\"data\":\"f6612040-9d54-4dbb-a260-889ee7f8bf18\"},{\"type\":\"scatter\",\"options\":{\"x\":\"x\",\"y\":\"y\",\"tooltip_contents\":[\"cluster\"],\"color\":\"#d44634\"},\"data\":\"f8cca8ed-3965-4a6d-bcd0-724c6117ebe1\"},{\"type\":\"scatter\",\"options\":{\"x\":\"x\",\"y\":\"y\",\"tooltip_contents\":[\"cluster\"],\"color\":\"#3bf97f\"},\"data\":\"c9ec7987-1a82-4c01-8eb4-89a433e66050\"},{\"type\":\"scatter\",\"options\":{\"x\":\"x\",\"y\":\"y\",\"tooltip_contents\":[\"cluster\"],\"color\":\"#b45cb3\"},\"data\":\"01d68342-a813-4672-a1bb-aff5a45f8aaa\"},{\"type\":\"scatter\",\"options\":{\"x\":\"x\",\"y\":\"y\",\"tooltip_contents\":[\"cluster\"],\"color\":\"#8413c0\"},\"data\":\"c3189c33-d563-40f4-a651-ea8fa7bc0d7f\"},{\"type\":\"scatter\",\"options\":{\"x\":\"data0\",\"y\":\"data1\",\"color\":\"#ffff00\",\"shape\":\"diamond\"},\"data\":\"d666d9ca-db2d-47b5-832b-3565a182a39e\"}],\"options\":{\"width\":300,\"height\":400,\"zoom\":true,\"rotate_x_label\":-60,\"xrange\":[-73.98633571101189,-73.98616953062057],\"yrange\":[40.73553787497709,40.73571995781772]}}],\"data\":{\"68ce990c-89db-490b-8361-a6602e98fafa\":[{\"x\":-73.98627072572708,\"y\":40.735547874977094,\"cluster\":0},{\"x\":-73.9862660318613,\"y\":40.735547874977094,\"cluster\":0},{\"x\":-73.98627005517483,\"y\":40.73554889117169,\"cluster\":0},{\"x\":-73.98627139627934,\"y\":40.735547874977094,\"cluster\":0},{\"x\":-73.98626938462257,\"y\":40.73554889117167,\"cluster\":0},{\"x\":-73.98627005517483,\"y\":40.735550923560815,\"cluster\":0},{\"x\":-73.98627005517483,\"y\":40.73554990736624,\"cluster\":0},{\"x\":-73.98626938462257,\"y\":40.735550415463514,\"cluster\":0},{\"x\":-73.98627005517483,\"y
\":40.73554939926897,\"cluster\":0}],\"f6612040-9d54-4dbb-a260-889ee7f8bf18\":[{\"x\":-73.98632504045963,\"y\":40.73557226364293,\"cluster\":1},{\"x\":-73.98632504045963,\"y\":40.735570739351566,\"cluster\":1},{\"x\":-73.98632369935513,\"y\":40.735570739351566,\"cluster\":1},{\"x\":-73.98632436990738,\"y\":40.735571755545806,\"cluster\":1},{\"x\":-73.98632369935513,\"y\":40.735572771740024,\"cluster\":1},{\"x\":-73.98632571101189,\"y\":40.735571755545806,\"cluster\":1},{\"x\":-73.98632369935513,\"y\":40.735571755545806,\"cluster\":1},{\"x\":-73.98632235825062,\"y\":40.73557124744871,\"cluster\":1},{\"x\":-73.98632302880287,\"y\":40.73557023125444,\"cluster\":1}],\"f8cca8ed-3965-4a6d-bcd0-724c6117ebe1\":[{\"x\":-73.98622445762157,\"y\":40.73570995781772,\"cluster\":2},{\"x\":-73.98622579872608,\"y\":40.73570944972167,\"cluster\":2},{\"x\":-73.98622512817383,\"y\":40.73570944972167,\"cluster\":2},{\"x\":-73.98622579872608,\"y\":40.73570995781772,\"cluster\":2},{\"x\":-73.98622445762157,\"y\":40.73570894162559,\"cluster\":2},{\"x\":-73.98622378706932,\"y\":40.73570995781772,\"cluster\":2},{\"x\":-73.98622360456956,\"y\":40.73570641325812,\"cluster\":2},{\"x\":-73.98622360456956,\"y\":40.73570641325812,\"cluster\":2},{\"x\":-73.98622360456956,\"y\":40.73570641325812,\"cluster\":2}],\"c9ec7987-1a82-4c01-8eb4-89a433e66050\":[{\"x\":-73.9861835539341,\"y\":40.73569268254945,\"cluster\":3},{\"x\":-73.98618154227734,\"y\":40.73569217445325,\"cluster\":3},{\"x\":-73.98618154227734,\"y\":40.73569014206842,\"cluster\":3},{\"x\":-73.98618087172508,\"y\":40.735689633972214,\"cluster\":3},{\"x\":-73.98618154227734,\"y\":40.73569065016463,\"cluster\":3},{\"x\":-73.98618288338184,\"y\":40.73569268254945,\"cluster\":3},{\"x\":-73.9861848950386,\"y\":40.735689633972214,\"cluster\":3},{\"x\":-73.98618768252459,\"y\":40.73568957578454,\"cluster\":3},{\"x\":-73.98617953062057,\"y\":40.735689633972214,\"cluster\":3}],\"01d68342-a813-4672-a1bb-aff5a45f8aaa\":[{\"x\":-73.98621775209902,\"y\
":40.735640856717666,\"cluster\":4},{\"x\":-73.98621775209902,\"y\":40.73563933242788,\"cluster\":4},{\"x\":-73.98621909320354,\"y\":40.735640856717666,\"cluster\":4},{\"x\":-73.98621842265129,\"y\":40.7356423810074,\"cluster\":4},{\"x\":-73.98621775209902,\"y\":40.73564034862108,\"cluster\":4},{\"x\":-73.98621842265129,\"y\":40.735640856717666,\"cluster\":4},{\"x\":-73.98621775209902,\"y\":40.73563984052446,\"cluster\":4},{\"x\":-73.98621909320354,\"y\":40.735638316234656,\"cluster\":4},{\"x\":-73.98621775209902,\"y\":40.735640856717666,\"cluster\":4}],\"c3189c33-d563-40f4-a651-ea8fa7bc0d7f\":[{\"x\":-73.98620970547199,\"y\":40.7356342514617,\"cluster\":5},{\"x\":-73.98620769381522,\"y\":40.73563526765495,\"cluster\":5},{\"x\":-73.98620970547199,\"y\":40.73563526765495,\"cluster\":5},{\"x\":-73.98620903491974,\"y\":40.73563577575159,\"cluster\":5},{\"x\":-73.98620836436749,\"y\":40.735634251461676,\"cluster\":5},{\"x\":-73.9862110465765,\"y\":40.7356362838482,\"cluster\":5},{\"x\":-73.98620970547199,\"y\":40.73563475955834,\"cluster\":5},{\"x\":-73.98620836436749,\"y\":40.73563272717173,\"cluster\":5},{\"x\":-73.98620836436749,\"y\":40.7356362838482,\"cluster\":5},{\"x\":-73.98620769381522,\"y\":40.73563577575159,\"cluster\":5}],\"d666d9ca-db2d-47b5-832b-3565a182a39e\":[{\"data0\":-73.9862696826458,\"data1\":40.73554911699269},{\"data0\":-73.98632407188416,\"data1\":40.73557147326963},{\"data0\":-73.98622447129412,\"data1\":40.73570855047738},{\"data0\":-73.98618267156186,\"data1\":40.73569075660959},{\"data0\":-73.98621819913387,\"data1\":40.73564040507624},{\"data0\":-73.98620896786451,\"data1\":40.735635064416286}]},\"extension\":[]}\n",
" Nyaplot.core.parse(model, '#vis-2c7602f7-cc7d-4d94-becc-fd61a3798004');\n",
" };\n",
" if(window['Nyaplot']==undefined){\n",
" window.addEventListener('load_nyaplot', render, false);\n",
"\treturn;\n",
" }\n",
" render();\n",
"})();\n",
"</script>\n"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 33,
"text": [
"\"<div id='vis-2c7602f7-cc7d-4d94-becc-fd61a3798004'></div>\\n<script>\\n(function(){\\n var render = function(){\\n var model = {\\\"panes\\\":[{\\\"diagrams\\\":[{\\\"type\\\":\\\"scatter\\\",\\\"options\\\":{\\\"x\\\":\\\"x\\\",\\\"y\\\":\\\"y\\\",\\\"tooltip_contents\\\":[\\\"cluster\\\"],\\\"color\\\":\\\"#e469de\\\"},\\\"data\\\":\\\"68ce990c-89db-490b-8361-a6602e98fafa\\\"},{\\\"type\\\":\\\"scatter\\\",\\\"options\\\":{\\\"x\\\":\\\"x\\\",\\\"y\\\":\\\"y\\\",\\\"tooltip_contents\\\":[\\\"cluster\\\"],\\\"color\\\":\\\"#003a5f\\\"},\\\"data\\\":\\\"f6612040-9d54-4dbb-a260-889ee7f8bf18\\\"},{\\\"type\\\":\\\"scatter\\\",\\\"options\\\":{\\\"x\\\":\\\"x\\\",\\\"y\\\":\\\"y\\\",\\\"tooltip_contents\\\":[\\\"cluster\\\"],\\\"color\\\":\\\"#d44634\\\"},\\\"data\\\":\\\"f8cca8ed-3965-4a6d-bcd0-724c6117ebe1\\\"},{\\\"type\\\":\\\"scatter\\\",\\\"options\\\":{\\\"x\\\":\\\"x\\\",\\\"y\\\":\\\"y\\\",\\\"tooltip_contents\\\":[\\\"cluster\\\"],\\\"color\\\":\\\"#3bf97f\\\"},\\\"data\\\":\\\"c9ec7987-1a82-4c01-8eb4-89a433e66050\\\"},{\\\"type\\\":\\\"scatter\\\",\\\"options\\\":{\\\"x\\\":\\\"x\\\",\\\"y\\\":\\\"y\\\",\\\"tooltip_contents\\\":[\\\"cluster\\\"],\\\"color\\\":\\\"#b45cb3\\\"},\\\"data\\\":\\\"01d68342-a813-4672-a1bb-aff5a45f8aaa\\\"},{\\\"type\\\":\\\"scatter\\\",\\\"options\\\":{\\\"x\\\":\\\"x\\\",\\\"y\\\":\\\"y\\\",\\\"tooltip_contents\\\":[\\\"cluster\\\"],\\\"color\\\":\\\"#8413c0\\\"},\\\"data\\\":\\\"c3189c33-d563-40f4-a651-ea8fa7bc0d7f\\\"},{\\\"type\\\":\\\"scatter\\\",\\\"options\\\":{\\\"x\\\":\\\"data0\\\",\\\"y\\\":\\\"data1\\\",\\\"color\\\":\\\"#ffff00\\\",\\\"shape\\\":\\\"diamond\\\"},\\\"data\\\":\\\"d666d9ca-db2d-47b5-832b-3565a182a39e\\\"}],\\\"options\\\":{\\\"width\\\":300,\\\"height\\\":400,\\\"zoom\\\":true,\\\"rotate_x_label\\\":-60,\\\"xrange\\\":[-73.98633571101189,-73.98616953062057],\\\"yrange\\\":[40.73553787497709,40.73571995781772]}}],\\\"data\\\":{\\\"68ce990c-89db-490b-8361-a6602e98fafa\\\":[{\\\"x\\\":-73.98627072572708
,\\\"y\\\":40.735547874977094,\\\"cluster\\\":0},{\\\"x\\\":-73.9862660318613,\\\"y\\\":40.735547874977094,\\\"cluster\\\":0},{\\\"x\\\":-73.98627005517483,\\\"y\\\":40.73554889117169,\\\"cluster\\\":0},{\\\"x\\\":-73.98627139627934,\\\"y\\\":40.735547874977094,\\\"cluster\\\":0},{\\\"x\\\":-73.98626938462257,\\\"y\\\":40.73554889117167,\\\"cluster\\\":0},{\\\"x\\\":-73.98627005517483,\\\"y\\\":40.735550923560815,\\\"cluster\\\":0},{\\\"x\\\":-73.98627005517483,\\\"y\\\":40.73554990736624,\\\"cluster\\\":0},{\\\"x\\\":-73.98626938462257,\\\"y\\\":40.735550415463514,\\\"cluster\\\":0},{\\\"x\\\":-73.98627005517483,\\\"y\\\":40.73554939926897,\\\"cluster\\\":0}],\\\"f6612040-9d54-4dbb-a260-889ee7f8bf18\\\":[{\\\"x\\\":-73.98632504045963,\\\"y\\\":40.73557226364293,\\\"cluster\\\":1},{\\\"x\\\":-73.98632504045963,\\\"y\\\":40.735570739351566,\\\"cluster\\\":1},{\\\"x\\\":-73.98632369935513,\\\"y\\\":40.735570739351566,\\\"cluster\\\":1},{\\\"x\\\":-73.98632436990738,\\\"y\\\":40.735571755545806,\\\"cluster\\\":1},{\\\"x\\\":-73.98632369935513,\\\"y\\\":40.735572771740024,\\\"cluster\\\":1},{\\\"x\\\":-73.98632571101189,\\\"y\\\":40.735571755545806,\\\"cluster\\\":1},{\\\"x\\\":-73.98632369935513,\\\"y\\\":40.735571755545806,\\\"cluster\\\":1},{\\\"x\\\":-73.98632235825062,\\\"y\\\":40.73557124744871,\\\"cluster\\\":1},{\\\"x\\\":-73.98632302880287,\\\"y\\\":40.73557023125444,\\\"cluster\\\":1}],\\\"f8cca8ed-3965-4a6d-bcd0-724c6117ebe1\\\":[{\\\"x\\\":-73.98622445762157,\\\"y\\\":40.73570995781772,\\\"cluster\\\":2},{\\\"x\\\":-73.98622579872608,\\\"y\\\":40.73570944972167,\\\"cluster\\\":2},{\\\"x\\\":-73.98622512817383,\\\"y\\\":40.73570944972167,\\\"cluster\\\":2},{\\\"x\\\":-73.98622579872608,\\\"y\\\":40.73570995781772,\\\"cluster\\\":2},{\\\"x\\\":-73.98622445762157,\\\"y\\\":40.73570894162559,\\\"cluster\\\":2},{\\\"x\\\":-73.98622378706932,\\\"y\\\":40.73570995781772,\\\"cluster\\\":2},{\\\"x\\\":-73.98622360456956,\\\"y\\\":40.73570641325812,\\\"cluster\\\":2},
{\\\"x\\\":-73.98622360456956,\\\"y\\\":40.73570641325812,\\\"cluster\\\":2},{\\\"x\\\":-73.98622360456956,\\\"y\\\":40.73570641325812,\\\"cluster\\\":2}],\\\"c9ec7987-1a82-4c01-8eb4-89a433e66050\\\":[{\\\"x\\\":-73.9861835539341,\\\"y\\\":40.73569268254945,\\\"cluster\\\":3},{\\\"x\\\":-73.98618154227734,\\\"y\\\":40.73569217445325,\\\"cluster\\\":3},{\\\"x\\\":-73.98618154227734,\\\"y\\\":40.73569014206842,\\\"cluster\\\":3},{\\\"x\\\":-73.98618087172508,\\\"y\\\":40.735689633972214,\\\"cluster\\\":3},{\\\"x\\\":-73.98618154227734,\\\"y\\\":40.73569065016463,\\\"cluster\\\":3},{\\\"x\\\":-73.98618288338184,\\\"y\\\":40.73569268254945,\\\"cluster\\\":3},{\\\"x\\\":-73.9861848950386,\\\"y\\\":40.735689633972214,\\\"cluster\\\":3},{\\\"x\\\":-73.98618768252459,\\\"y\\\":40.73568957578454,\\\"cluster\\\":3},{\\\"x\\\":-73.98617953062057,\\\"y\\\":40.735689633972214,\\\"cluster\\\":3}],\\\"01d68342-a813-4672-a1bb-aff5a45f8aaa\\\":[{\\\"x\\\":-73.98621775209902,\\\"y\\\":40.735640856717666,\\\"cluster\\\":4},{\\\"x\\\":-73.98621775209902,\\\"y\\\":40.73563933242788,\\\"cluster\\\":4},{\\\"x\\\":-73.98621909320354,\\\"y\\\":40.735640856717666,\\\"cluster\\\":4},{\\\"x\\\":-73.98621842265129,\\\"y\\\":40.7356423810074,\\\"cluster\\\":4},{\\\"x\\\":-73.98621775209902,\\\"y\\\":40.73564034862108,\\\"cluster\\\":4},{\\\"x\\\":-73.98621842265129,\\\"y\\\":40.735640856717666,\\\"cluster\\\":4},{\\\"x\\\":-73.98621775209902,\\\"y\\\":40.73563984052446,\\\"cluster\\\":4},{\\\"x\\\":-73.98621909320354,\\\"y\\\":40.735638316234656,\\\"cluster\\\":4},{\\\"x\\\":-73.98621775209902,\\\"y\\\":40.735640856717666,\\\"cluster\\\":4}],\\\"c3189c33-d563-40f4-a651-ea8fa7bc0d7f\\\":[{\\\"x\\\":-73.98620970547199,\\\"y\\\":40.7356342514617,\\\"cluster\\\":5},{\\\"x\\\":-73.98620769381522,\\\"y\\\":40.73563526765495,\\\"cluster\\\":5},{\\\"x\\\":-73.98620970547199,\\\"y\\\":40.73563526765495,\\\"cluster\\\":5},{\\\"x\\\":-73.98620903491974,\\\"y\\\":40.73563577575159,\\\"cluster\\\":5},{\\\"x\
\\":-73.98620836436749,\\\"y\\\":40.735634251461676,\\\"cluster\\\":5},{\\\"x\\\":-73.9862110465765,\\\"y\\\":40.7356362838482,\\\"cluster\\\":5},{\\\"x\\\":-73.98620970547199,\\\"y\\\":40.73563475955834,\\\"cluster\\\":5},{\\\"x\\\":-73.98620836436749,\\\"y\\\":40.73563272717173,\\\"cluster\\\":5},{\\\"x\\\":-73.98620836436749,\\\"y\\\":40.7356362838482,\\\"cluster\\\":5},{\\\"x\\\":-73.98620769381522,\\\"y\\\":40.73563577575159,\\\"cluster\\\":5}],\\\"d666d9ca-db2d-47b5-832b-3565a182a39e\\\":[{\\\"data0\\\":-73.9862696826458,\\\"data1\\\":40.73554911699269},{\\\"data0\\\":-73.98632407188416,\\\"data1\\\":40.73557147326963},{\\\"data0\\\":-73.98622447129412,\\\"data1\\\":40.73570855047738},{\\\"data0\\\":-73.98618267156186,\\\"data1\\\":40.73569075660959},{\\\"data0\\\":-73.98621819913387,\\\"data1\\\":40.73564040507624},{\\\"data0\\\":-73.98620896786451,\\\"data1\\\":40.735635064416286}]},\\\"extension\\\":[]}\\n Nyaplot.core.parse(model, '#vis-2c7602f7-cc7d-4d94-becc-fd61a3798004');\\n };\\n if(window['Nyaplot']==undefined){\\n window.addEventListener('load_nyaplot', render, false);\\n\\treturn;\\n }\\n render();\\n})();\\n</script>\\n\""
]
}
],
"prompt_number": 33
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. Connecting it all\n",
"\n",
"So far we have a set of points that seem to be the most likely vertices of the mean polygon drawn by our contributors. However, there are **many ways in which these points could be connected to each other**.\n",
"\n",
"**DISCLAIMER**:\n",
"\n",
"What follows is a _very_ primitive process that I used to determine the most likely connection between those points. This process is the best I could come up with given my limited math knowledge and time. If you have a better idea of how to do this in Ruby please tweet me at [@mgiraldo](https://twitter.com/mgiraldo).\n",
"\n",
"**/DISCLAIMER**\n",
"\n",
"Before going through with connections we need to validate that we have a reasonable amount of clusters to work with: some vertices may be drawn far away enough for them to not cluster properly and therefore no cluster will be produced. We do this by determining the mean vertices in each polygon ($\\bar{m}$) and comparing it with the cluster count ($\\sum c$). Right now: $\\bar{m}\\leq\\sum c$ , so we should have at least _as many_ clusters as we have average points per polygon.\n",
"\n",
"Not perfect but works most of the time:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def validate_clusters(clusters, unique_points)\n",
" average = (unique_points.flatten.count.to_f / (unique_points.size * 2).to_f).round\n",
" return clusters.select{|k,v| k!=-1}.size >= average\n",
"end"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 34,
"text": [
":validate_clusters"
]
}
],
"prompt_number": 34
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"validate_clusters(vertex_clusters, unique_points)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 35,
"text": [
"true"
]
}
],
"prompt_number": 35
},
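{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the check concrete, here is a minimal sketch of the same arithmetic on made-up data. The names `toy_polygons` and `toy_clusters` and all their values are hypothetical; the cluster layout (a hash keyed by cluster id, with `-1` holding unclustered points) is an assumption based on how the functions in this notebook use it.\n",
"\n",
"```ruby\n",
"# Hypothetical data: three contributors, each drawing four [lon, lat] vertices.\n",
"toy_polygons = [\n",
"  [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]],\n",
"  [[0.1, 0.0], [1.1, 0.0], [1.0, 1.1], [0.0, 0.9]],\n",
"  [[0.0, 0.1], [0.9, 0.0], [1.0, 0.9], [0.1, 1.0]]\n",
"]\n",
"\n",
"# flatten yields two numbers per vertex, so divide by (polygons * 2).\n",
"average = (toy_polygons.flatten.count.to_f / (toy_polygons.size * 2)).round\n",
"\n",
"# Assumed layout: cluster id => points, with -1 reserved for noise.\n",
"toy_clusters = { 0 => [:a], 1 => [:b], 2 => [:c], -1 => [:d] }\n",
"valid_count = toy_clusters.reject { |id, _| id == -1 }.size\n",
"\n",
"puts average                   # 4\n",
"puts(valid_count >= average)   # false: 3 valid clusters for 4 expected vertices\n",
"```\n",
"\n",
"With one corner failing to cluster, the validation fails and that group of polygons would be skipped."
]
},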
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that this has been verified we proceed to connect.\n",
"\n",
"The general process to connect mean vertices to each other is:\n",
"\n",
"1. for each mean vertex:\n",
" 1. find the cluster of vertices it represents (from_vertices)\n",
" 1. for each vertex in from_vertices:\n",
" 1. find the vertex it is connected to (to_vertex)\n",
" 1. find the cluster to_vertex belongs to (to_cluster)\n",
" 1. add a \"vote\" for to_cluster\n",
" 1. tally the votes\n",
" 1. the to_cluster with most votes is the connected cluster\n",
"1. connect the clusters\n",
"1. validate that the connection makes sense (eg: is a [directed cycle graph](http://en.wikipedia.org/wiki/Cycle_graph))\n",
"\n",
"Below all the corresponding functions:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def find_connected_point(point, original_points)\n",
" original_points.each do |poly|\n",
" poly.each_with_index do |p,index|\n",
" return poly[index+1] if point === p\n",
" end\n",
" end\n",
" return\n",
"end"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 36,
"text": [
":find_connected_point"
]
}
],
"prompt_number": 36
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def find_cluster_for_point(point, clusters)\n",
" clusters.each do |cluster|\n",
" cluster[1].each do |p|\n",
" return cluster[0] if point === p && cluster[0] != -1\n",
" end\n",
" end\n",
" return\n",
"end"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 37,
"text": [
":find_cluster_for_point"
]
}
],
"prompt_number": 37
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def connect_clusters(clusters, original_points)\n",
" connections = {}\n",
" # for each cluster\n",
" clusters.each do |cluster|\n",
" # for each point in cluster\n",
" if cluster[0] != -1 # exclude invalid cluster\n",
" cluster_votes = {} # to weigh connection popularity (diff pts might be connected to diff clusters)\n",
" cluster[1].each do |point|\n",
" # find original point connected to it\n",
" connection = find_connected_point(point, original_points)\n",
" connected_cluster = find_cluster_for_point(connection, clusters)\n",
" # if original point belongs to another cluster\n",
" if connected_cluster != nil && connected_cluster != cluster[0]\n",
" # vote for the cluster\n",
" cluster_votes[connected_cluster] = 0 if cluster_votes[connected_cluster] == nil\n",
" cluster_votes[connected_cluster] += 1\n",
" end\n",
" end\n",
" connections[cluster[0]] = cluster_votes.sort_by{|k, v| v}\n",
" next if connections[cluster[0]].size == 0\n",
" connections[cluster[0]] = connections[cluster[0]].reverse[0][0]\n",
" end\n",
" end\n",
" return connections\n",
"end"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 38,
"text": [
":connect_clusters"
]
}
],
"prompt_number": 38
},
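{
"cell_type": "markdown",
"metadata": {},
"source": [
"The voting step inside `connect_clusters` can be sketched in isolation. This is only an illustration, not the code used above: `successor_clusters` is hypothetical data, and `Hash.new(0)` with `max_by` is a swapped-in way to count votes and pick the winner instead of the explicit nil check and `sort_by`.\n",
"\n",
"```ruby\n",
"# Hypothetical input: for each point in one cluster, the id of the cluster\n",
"# its successor landed in (nil when the successor is in no valid cluster).\n",
"successor_clusters = [1, 1, 3, 1, nil, 1]\n",
"\n",
"# Count one vote per successor, ignoring the nil entries.\n",
"votes = Hash.new(0)\n",
"successor_clusters.compact.each { |id| votes[id] += 1 }\n",
"\n",
"# The cluster with the most votes is taken as the connected cluster.\n",
"winner, count = votes.max_by { |_, n| n }\n",
"\n",
"puts winner  # 1\n",
"puts count   # 4\n",
"```\n",
"\n",
"Neither this sketch nor the function above handles ties explicitly."
]
},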
{
"cell_type": "code",
"collapsed": false,
"input": [
"connections = connect_clusters(vertex_clusters, cluster_poly_points)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 39,
"text": [
"{0=>1, 1=>2, 2=>3, 3=>4, 4=>5, 5=>0}"
]
}
],
"prompt_number": 39
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As can be seen above this is a directed cycle graph and the end result is a clean path from the first vertex to the last one.\n",
"\n",
"The fact that the points are sorted (0 to 1, 1 to 2, 2 to 3, and so on) is somewhat coincidential. Below is a basic function that checks the graph and returns a sorted list of clusters (the order we need to follow to draw the mean polygon):"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def sort_connections(connections)\n",
" # does some simple check for non-circularity \n",
" sorted = []\n",
" seen = {}\n",
" as_list = connections.select{|k,v| k}\n",
" done = false\n",
" first = as_list.first[0]\n",
" from = first\n",
" while !done do\n",
" to = connections[from]\n",
" done = true if seen[to] || to == nil || to.size == 0\n",
" seen[to] = true\n",
" from = to\n",
" sorted.push(to)\n",
" done = true if seen.size == connections.size\n",
" end\n",
" return nil if seen.size != connections.size\n",
" return sorted\n",
"end"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 40,
"text": [
":sort_connections"
]
}
],
"prompt_number": 40
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# testing sort function\n",
"sort_connections(connections)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 41,
"text": [
"[1, 2, 3, 4, 5, 0]"
]
}
],
"prompt_number": 41
},
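{
"cell_type": "markdown",
"metadata": {},
"source": [
"The cycle check can also be sketched as a standalone walk over connections that are not already in order. `cycle_order` is a hypothetical helper, not part of this notebook, and both sample hashes are made up. Unlike `sort_connections`, it starts the returned order at the first key rather than at its successor.\n",
"\n",
"```ruby\n",
"# Hypothetical helper: follow successor links from the first key and return\n",
"# the visiting order, or nil if the links do not form one cycle that covers\n",
"# every cluster.\n",
"def cycle_order(connections)\n",
"  return nil if connections.empty?\n",
"  start = connections.keys.first\n",
"  order = []\n",
"  current = start\n",
"  connections.size.times do\n",
"    # a repeated or unknown cluster means the walk cannot cover everything\n",
"    return nil if order.include?(current) || !connections.key?(current)\n",
"    order.push(current)\n",
"    current = connections[current]\n",
"  end\n",
"  current == start ? order : nil\n",
"end\n",
"\n",
"unordered = { 0 => 3, 3 => 1, 1 => 2, 2 => 0 }\n",
"broken = { 0 => 1, 1 => 0, 2 => 3, 3 => 2 } # two separate loops\n",
"\n",
"p cycle_order(unordered)  # [0, 3, 1, 2]\n",
"p cycle_order(broken)     # nil\n",
"```"
]
},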
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we can proceed to build our final mean polygon:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def connect_mean_poly(mean_poly, connections)\n",
" connected = []\n",
" sorted = sort_connections(connections)\n",
" return nil if sorted == nil\n",
" sorted.each do |c|\n",
" connected.push([mean_poly[c][0], mean_poly[c][1]])\n",
" end\n",
" # for GeoJSON, last == first\n",
" first = sorted[0]\n",
" connected.push([mean_poly[first][0], mean_poly[first][1]])\n",
" return connected\n",
"end"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 42,
"text": [
":connect_mean_poly"
]
}
],
"prompt_number": 42
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"final_polygon = connect_mean_poly(mean_poly, connections)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 43,
"text": [
"[[-73.98632407188416, 40.73557147326963], [-73.98622447129412, 40.73570855047738], [-73.98618267156186, 40.73569075660959], [-73.98621819913387, 40.73564040507624], [-73.98620896786451, 40.735635064416286], [-73.9862696826458, 40.73554911699269], [-73.98632407188416, 40.73557147326963]]"
]
}
],
"prompt_number": 43
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's see how all this looks like:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"plot = plot_clusters(vertex_clusters)\n",
"m_x = final_polygon.map { |m| m[0] }\n",
"m_y = final_polygon.map { |m| m[1] }\n",
"sc = plot.add(:scatter, m_x, m_y)\n",
"color = \"#ffff00\"\n",
"sc.color(color)\n",
"sc.shape('diamond')\n",
"# add the MEAN POLYGON\n",
"final_polygon.each_with_index do |c, i|\n",
" next if i >= final_polygon.size-1\n",
" from = [ final_polygon[i][0], final_polygon[i+1][0] ]\n",
" to = [ final_polygon[i][1], final_polygon[i+1][1] ]\n",
" plot.add(:line, from, to)\n",
"end\n",
"plot.show"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"<div id='vis-63b28041-b4a5-46ca-a180-7b81d6e30777'></div>\n",
"<script>\n",
"(function(){\n",
" var render = function(){\n",
" var model = {\"panes\":[{\"diagrams\":[{\"type\":\"scatter\",\"options\":{\"x\":\"x\",\"y\":\"y\",\"tooltip_contents\":[\"cluster\"],\"color\":\"#7315a3\"},\"data\":\"4ab43641-706f-4da1-8813-55d5a79a9440\"},{\"type\":\"scatter\",\"options\":{\"x\":\"x\",\"y\":\"y\",\"tooltip_contents\":[\"cluster\"],\"color\":\"#c15e82\"},\"data\":\"3d2d1699-9cc2-4c56-b1d4-2f73bd2d7992\"},{\"type\":\"scatter\",\"options\":{\"x\":\"x\",\"y\":\"y\",\"tooltip_contents\":[\"cluster\"],\"color\":\"#635e90\"},\"data\":\"56d9819e-9b16-4edd-9c4d-be56208b9aaf\"},{\"type\":\"scatter\",\"options\":{\"x\":\"x\",\"y\":\"y\",\"tooltip_contents\":[\"cluster\"],\"color\":\"#0dd3f0\"},\"data\":\"ee09c06f-593e-455c-b6b3-71ec7ee40662\"},{\"type\":\"scatter\",\"options\":{\"x\":\"x\",\"y\":\"y\",\"tooltip_contents\":[\"cluster\"],\"color\":\"#c3a664\"},\"data\":\"a4bb4d61-07fc-4384-9425-e7fb1b2226bb\"},{\"type\":\"scatter\",\"options\":{\"x\":\"x\",\"y\":\"y\",\"tooltip_contents\":[\"cluster\"],\"color\":\"#4f0528\"},\"data\":\"c2c2ad5e-ef28-46e1-9c06-ba95b054c073\"},{\"type\":\"scatter\",\"options\":{\"x\":\"data0\",\"y\":\"data1\",\"color\":\"#ffff00\",\"shape\":\"diamond\"},\"data\":\"0c10be2f-3343-4f0b-82bf-017e9935aabb\"},{\"type\":\"line\",\"options\":{\"x\":\"data0\",\"y\":\"data1\"},\"data\":\"8adc9db5-013d-4e38-b6ec-7de039499e2c\"},{\"type\":\"line\",\"options\":{\"x\":\"data0\",\"y\":\"data1\"},\"data\":\"7b9f79b0-75f3-4447-80a1-356335986f5d\"},{\"type\":\"line\",\"options\":{\"x\":\"data0\",\"y\":\"data1\"},\"data\":\"3711473d-1dc0-48ef-b83f-e0abff3dc992\"},{\"type\":\"line\",\"options\":{\"x\":\"data0\",\"y\":\"data1\"},\"data\":\"4e4c2b7f-dead-432c-bd59-c9032a81753a\"},{\"type\":\"line\",\"options\":{\"x\":\"data0\",\"y\":\"data1\"},\"data\":\"08f22f21-80bf-409d-98a5-51507c927907\"},{\"type\":\"line\",\"options\":{\"x\":\"data0\",\"y\":\"data1\"},\"data\":\"c104f1a8-fdd8-486e-a43e-7fea2cbc3b5e\"}],\"options\":{\"width\":300,\"height\":400,\"zoom\":true,\"rotate_x_label\":-60,\"xrange\":[
-73.98633571101189,-73.98616953062057],\"yrange\":[40.73553787497709,40.73571995781772]}}],\"data\":{\"4ab43641-706f-4da1-8813-55d5a79a9440\":[{\"x\":-73.98627072572708,\"y\":40.735547874977094,\"cluster\":0},{\"x\":-73.9862660318613,\"y\":40.735547874977094,\"cluster\":0},{\"x\":-73.98627005517483,\"y\":40.73554889117169,\"cluster\":0},{\"x\":-73.98627139627934,\"y\":40.735547874977094,\"cluster\":0},{\"x\":-73.98626938462257,\"y\":40.73554889117167,\"cluster\":0},{\"x\":-73.98627005517483,\"y\":40.735550923560815,\"cluster\":0},{\"x\":-73.98627005517483,\"y\":40.73554990736624,\"cluster\":0},{\"x\":-73.98626938462257,\"y\":40.735550415463514,\"cluster\":0},{\"x\":-73.98627005517483,\"y\":40.73554939926897,\"cluster\":0}],\"3d2d1699-9cc2-4c56-b1d4-2f73bd2d7992\":[{\"x\":-73.98632504045963,\"y\":40.73557226364293,\"cluster\":1},{\"x\":-73.98632504045963,\"y\":40.735570739351566,\"cluster\":1},{\"x\":-73.98632369935513,\"y\":40.735570739351566,\"cluster\":1},{\"x\":-73.98632436990738,\"y\":40.735571755545806,\"cluster\":1},{\"x\":-73.98632369935513,\"y\":40.735572771740024,\"cluster\":1},{\"x\":-73.98632571101189,\"y\":40.735571755545806,\"cluster\":1},{\"x\":-73.98632369935513,\"y\":40.735571755545806,\"cluster\":1},{\"x\":-73.98632235825062,\"y\":40.73557124744871,\"cluster\":1},{\"x\":-73.98632302880287,\"y\":40.73557023125444,\"cluster\":1}],\"56d9819e-9b16-4edd-9c4d-be56208b9aaf\":[{\"x\":-73.98622445762157,\"y\":40.73570995781772,\"cluster\":2},{\"x\":-73.98622579872608,\"y\":40.73570944972167,\"cluster\":2},{\"x\":-73.98622512817383,\"y\":40.73570944972167,\"cluster\":2},{\"x\":-73.98622579872608,\"y\":40.73570995781772,\"cluster\":2},{\"x\":-73.98622445762157,\"y\":40.73570894162559,\"cluster\":2},{\"x\":-73.98622378706932,\"y\":40.73570995781772,\"cluster\":2},{\"x\":-73.98622360456956,\"y\":40.73570641325812,\"cluster\":2},{\"x\":-73.98622360456956,\"y\":40.73570641325812,\"cluster\":2},{\"x\":-73.98622360456956,\"y\":40.73570641325812,\"cluster\":2}],\"ee0
9c06f-593e-455c-b6b3-71ec7ee40662\":[{\"x\":-73.9861835539341,\"y\":40.73569268254945,\"cluster\":3},{\"x\":-73.98618154227734,\"y\":40.73569217445325,\"cluster\":3},{\"x\":-73.98618154227734,\"y\":40.73569014206842,\"cluster\":3},{\"x\":-73.98618087172508,\"y\":40.735689633972214,\"cluster\":3},{\"x\":-73.98618154227734,\"y\":40.73569065016463,\"cluster\":3},{\"x\":-73.98618288338184,\"y\":40.73569268254945,\"cluster\":3},{\"x\":-73.9861848950386,\"y\":40.735689633972214,\"cluster\":3},{\"x\":-73.98618768252459,\"y\":40.73568957578454,\"cluster\":3},{\"x\":-73.98617953062057,\"y\":40.735689633972214,\"cluster\":3}],\"a4bb4d61-07fc-4384-9425-e7fb1b2226bb\":[{\"x\":-73.98621775209902,\"y\":40.735640856717666,\"cluster\":4},{\"x\":-73.98621775209902,\"y\":40.73563933242788,\"cluster\":4},{\"x\":-73.98621909320354,\"y\":40.735640856717666,\"cluster\":4},{\"x\":-73.98621842265129,\"y\":40.7356423810074,\"cluster\":4},{\"x\":-73.98621775209902,\"y\":40.73564034862108,\"cluster\":4},{\"x\":-73.98621842265129,\"y\":40.735640856717666,\"cluster\":4},{\"x\":-73.98621775209902,\"y\":40.73563984052446,\"cluster\":4},{\"x\":-73.98621909320354,\"y\":40.735638316234656,\"cluster\":4},{\"x\":-73.98621775209902,\"y\":40.735640856717666,\"cluster\":4}],\"c2c2ad5e-ef28-46e1-9c06-ba95b054c073\":[{\"x\":-73.98620970547199,\"y\":40.7356342514617,\"cluster\":5},{\"x\":-73.98620769381522,\"y\":40.73563526765495,\"cluster\":5},{\"x\":-73.98620970547199,\"y\":40.73563526765495,\"cluster\":5},{\"x\":-73.98620903491974,\"y\":40.73563577575159,\"cluster\":5},{\"x\":-73.98620836436749,\"y\":40.735634251461676,\"cluster\":5},{\"x\":-73.9862110465765,\"y\":40.7356362838482,\"cluster\":5},{\"x\":-73.98620970547199,\"y\":40.73563475955834,\"cluster\":5},{\"x\":-73.98620836436749,\"y\":40.73563272717173,\"cluster\":5},{\"x\":-73.98620836436749,\"y\":40.7356362838482,\"cluster\":5},{\"x\":-73.98620769381522,\"y\":40.73563577575159,\"cluster\":5}],\"0c10be2f-3343-4f0b-82bf-017e9935aabb\":[{\"data0\":-
73.98632407188416,\"data1\":40.73557147326963},{\"data0\":-73.98622447129412,\"data1\":40.73570855047738},{\"data0\":-73.98618267156186,\"data1\":40.73569075660959},{\"data0\":-73.98621819913387,\"data1\":40.73564040507624},{\"data0\":-73.98620896786451,\"data1\":40.735635064416286},{\"data0\":-73.9862696826458,\"data1\":40.73554911699269},{\"data0\":-73.98632407188416,\"data1\":40.73557147326963}],\"8adc9db5-013d-4e38-b6ec-7de039499e2c\":[{\"data0\":-73.98632407188416,\"data1\":40.73557147326963},{\"data0\":-73.98622447129412,\"data1\":40.73570855047738}],\"7b9f79b0-75f3-4447-80a1-356335986f5d\":[{\"data0\":-73.98622447129412,\"data1\":40.73570855047738},{\"data0\":-73.98618267156186,\"data1\":40.73569075660959}],\"3711473d-1dc0-48ef-b83f-e0abff3dc992\":[{\"data0\":-73.98618267156186,\"data1\":40.73569075660959},{\"data0\":-73.98621819913387,\"data1\":40.73564040507624}],\"4e4c2b7f-dead-432c-bd59-c9032a81753a\":[{\"data0\":-73.98621819913387,\"data1\":40.73564040507624},{\"data0\":-73.98620896786451,\"data1\":40.735635064416286}],\"08f22f21-80bf-409d-98a5-51507c927907\":[{\"data0\":-73.98620896786451,\"data1\":40.735635064416286},{\"data0\":-73.9862696826458,\"data1\":40.73554911699269}],\"c104f1a8-fdd8-486e-a43e-7fea2cbc3b5e\":[{\"data0\":-73.9862696826458,\"data1\":40.73554911699269},{\"data0\":-73.98632407188416,\"data1\":40.73557147326963}]},\"extension\":[]}\n",
" Nyaplot.core.parse(model, '#vis-63b28041-b4a5-46ca-a180-7b81d6e30777');\n",
" };\n",
" if(window['Nyaplot']==undefined){\n",
" window.addEventListener('load_nyaplot', render, false);\n",
"\treturn;\n",
" }\n",
" render();\n",
"})();\n",
"</script>\n"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 44,
"text": [
"\"<div id='vis-63b28041-b4a5-46ca-a180-7b81d6e30777'></div>\\n<script>\\n(function(){\\n var render = function(){\\n var model = {\\\"panes\\\":[{\\\"diagrams\\\":[{\\\"type\\\":\\\"scatter\\\",\\\"options\\\":{\\\"x\\\":\\\"x\\\",\\\"y\\\":\\\"y\\\",\\\"tooltip_contents\\\":[\\\"cluster\\\"],\\\"color\\\":\\\"#7315a3\\\"},\\\"data\\\":\\\"4ab43641-706f-4da1-8813-55d5a79a9440\\\"},{\\\"type\\\":\\\"scatter\\\",\\\"options\\\":{\\\"x\\\":\\\"x\\\",\\\"y\\\":\\\"y\\\",\\\"tooltip_contents\\\":[\\\"cluster\\\"],\\\"color\\\":\\\"#c15e82\\\"},\\\"data\\\":\\\"3d2d1699-9cc2-4c56-b1d4-2f73bd2d7992\\\"},{\\\"type\\\":\\\"scatter\\\",\\\"options\\\":{\\\"x\\\":\\\"x\\\",\\\"y\\\":\\\"y\\\",\\\"tooltip_contents\\\":[\\\"cluster\\\"],\\\"color\\\":\\\"#635e90\\\"},\\\"data\\\":\\\"56d9819e-9b16-4edd-9c4d-be56208b9aaf\\\"},{\\\"type\\\":\\\"scatter\\\",\\\"options\\\":{\\\"x\\\":\\\"x\\\",\\\"y\\\":\\\"y\\\",\\\"tooltip_contents\\\":[\\\"cluster\\\"],\\\"color\\\":\\\"#0dd3f0\\\"},\\\"data\\\":\\\"ee09c06f-593e-455c-b6b3-71ec7ee40662\\\"},{\\\"type\\\":\\\"scatter\\\",\\\"options\\\":{\\\"x\\\":\\\"x\\\",\\\"y\\\":\\\"y\\\",\\\"tooltip_contents\\\":[\\\"cluster\\\"],\\\"color\\\":\\\"#c3a664\\\"},\\\"data\\\":\\\"a4bb4d61-07fc-4384-9425-e7fb1b2226bb\\\"},{\\\"type\\\":\\\"scatter\\\",\\\"options\\\":{\\\"x\\\":\\\"x\\\",\\\"y\\\":\\\"y\\\",\\\"tooltip_contents\\\":[\\\"cluster\\\"],\\\"color\\\":\\\"#4f0528\\\"},\\\"data\\\":\\\"c2c2ad5e-ef28-46e1-9c06-ba95b054c073\\\"},{\\\"type\\\":\\\"scatter\\\",\\\"options\\\":{\\\"x\\\":\\\"data0\\\",\\\"y\\\":\\\"data1\\\",\\\"color\\\":\\\"#ffff00\\\",\\\"shape\\\":\\\"diamond\\\"},\\\"data\\\":\\\"0c10be2f-3343-4f0b-82bf-017e9935aabb\\\"},{\\\"type\\\":\\\"line\\\",\\\"options\\\":{\\\"x\\\":\\\"data0\\\",\\\"y\\\":\\\"data1\\\"},\\\"data\\\":\\\"8adc9db5-013d-4e38-b6ec-7de039499e2c\\\"},{\\\"type\\\":\\\"line\\\",\\\"options\\\":{\\\"x\\\":\\\"data0\\\",\\\"y\\\":\\\"data1\\\"},\\\"data\\\":\\\"7b9f79b0-75f3-4447-80a1-356335986f5d\\
\"},{\\\"type\\\":\\\"line\\\",\\\"options\\\":{\\\"x\\\":\\\"data0\\\",\\\"y\\\":\\\"data1\\\"},\\\"data\\\":\\\"3711473d-1dc0-48ef-b83f-e0abff3dc992\\\"},{\\\"type\\\":\\\"line\\\",\\\"options\\\":{\\\"x\\\":\\\"data0\\\",\\\"y\\\":\\\"data1\\\"},\\\"data\\\":\\\"4e4c2b7f-dead-432c-bd59-c9032a81753a\\\"},{\\\"type\\\":\\\"line\\\",\\\"options\\\":{\\\"x\\\":\\\"data0\\\",\\\"y\\\":\\\"data1\\\"},\\\"data\\\":\\\"08f22f21-80bf-409d-98a5-51507c927907\\\"},{\\\"type\\\":\\\"line\\\",\\\"options\\\":{\\\"x\\\":\\\"data0\\\",\\\"y\\\":\\\"data1\\\"},\\\"data\\\":\\\"c104f1a8-fdd8-486e-a43e-7fea2cbc3b5e\\\"}],\\\"options\\\":{\\\"width\\\":300,\\\"height\\\":400,\\\"zoom\\\":true,\\\"rotate_x_label\\\":-60,\\\"xrange\\\":[-73.98633571101189,-73.98616953062057],\\\"yrange\\\":[40.73553787497709,40.73571995781772]}}],\\\"data\\\":{\\\"4ab43641-706f-4da1-8813-55d5a79a9440\\\":[{\\\"x\\\":-73.98627072572708,\\\"y\\\":40.735547874977094,\\\"cluster\\\":0},{\\\"x\\\":-73.9862660318613,\\\"y\\\":40.735547874977094,\\\"cluster\\\":0},{\\\"x\\\":-73.98627005517483,\\\"y\\\":40.73554889117169,\\\"cluster\\\":0},{\\\"x\\\":-73.98627139627934,\\\"y\\\":40.735547874977094,\\\"cluster\\\":0},{\\\"x\\\":-73.98626938462257,\\\"y\\\":40.73554889117167,\\\"cluster\\\":0},{\\\"x\\\":-73.98627005517483,\\\"y\\\":40.735550923560815,\\\"cluster\\\":0},{\\\"x\\\":-73.98627005517483,\\\"y\\\":40.73554990736624,\\\"cluster\\\":0},{\\\"x\\\":-73.98626938462257,\\\"y\\\":40.735550415463514,\\\"cluster\\\":0},{\\\"x\\\":-73.98627005517483,\\\"y\\\":40.73554939926897,\\\"cluster\\\":0}],\\\"3d2d1699-9cc2-4c56-b1d4-2f73bd2d7992\\\":[{\\\"x\\\":-73.98632504045963,\\\"y\\\":40.73557226364293,\\\"cluster\\\":1},{\\\"x\\\":-73.98632504045963,\\\"y\\\":40.735570739351566,\\\"cluster\\\":1},{\\\"x\\\":-73.98632369935513,\\\"y\\\":40.735570739351566,\\\"cluster\\\":1},{\\\"x\\\":-73.98632436990738,\\\"y\\\":40.735571755545806,\\\"cluster\\\":1},{\\\"x\\\":-73.98632369935513,\\\"y\\\":40.735572771740024,\\\
"cluster\\\":1},{\\\"x\\\":-73.98632571101189,\\\"y\\\":40.735571755545806,\\\"cluster\\\":1},{\\\"x\\\":-73.98632369935513,\\\"y\\\":40.735571755545806,\\\"cluster\\\":1},{\\\"x\\\":-73.98632235825062,\\\"y\\\":40.73557124744871,\\\"cluster\\\":1},{\\\"x\\\":-73.98632302880287,\\\"y\\\":40.73557023125444,\\\"cluster\\\":1}],\\\"56d9819e-9b16-4edd-9c4d-be56208b9aaf\\\":[{\\\"x\\\":-73.98622445762157,\\\"y\\\":40.73570995781772,\\\"cluster\\\":2},{\\\"x\\\":-73.98622579872608,\\\"y\\\":40.73570944972167,\\\"cluster\\\":2},{\\\"x\\\":-73.98622512817383,\\\"y\\\":40.73570944972167,\\\"cluster\\\":2},{\\\"x\\\":-73.98622579872608,\\\"y\\\":40.73570995781772,\\\"cluster\\\":2},{\\\"x\\\":-73.98622445762157,\\\"y\\\":40.73570894162559,\\\"cluster\\\":2},{\\\"x\\\":-73.98622378706932,\\\"y\\\":40.73570995781772,\\\"cluster\\\":2},{\\\"x\\\":-73.98622360456956,\\\"y\\\":40.73570641325812,\\\"cluster\\\":2},{\\\"x\\\":-73.98622360456956,\\\"y\\\":40.73570641325812,\\\"cluster\\\":2},{\\\"x\\\":-73.98622360456956,\\\"y\\\":40.73570641325812,\\\"cluster\\\":2}],\\\"ee09c06f-593e-455c-b6b3-71ec7ee40662\\\":[{\\\"x\\\":-73.9861835539341,\\\"y\\\":40.73569268254945,\\\"cluster\\\":3},{\\\"x\\\":-73.98618154227734,\\\"y\\\":40.73569217445325,\\\"cluster\\\":3},{\\\"x\\\":-73.98618154227734,\\\"y\\\":40.73569014206842,\\\"cluster\\\":3},{\\\"x\\\":-73.98618087172508,\\\"y\\\":40.735689633972214,\\\"cluster\\\":3},{\\\"x\\\":-73.98618154227734,\\\"y\\\":40.73569065016463,\\\"cluster\\\":3},{\\\"x\\\":-73.98618288338184,\\\"y\\\":40.73569268254945,\\\"cluster\\\":3},{\\\"x\\\":-73.9861848950386,\\\"y\\\":40.735689633972214,\\\"cluster\\\":3},{\\\"x\\\":-73.98618768252459,\\\"y\\\":40.73568957578454,\\\"cluster\\\":3},{\\\"x\\\":-73.98617953062057,\\\"y\\\":40.735689633972214,\\\"cluster\\\":3}],\\\"a4bb4d61-07fc-4384-9425-e7fb1b2226bb\\\":[{\\\"x\\\":-73.98621775209902,\\\"y\\\":40.735640856717666,\\\"cluster\\\":4},{\\\"x\\\":-73.98621775209902,\\\"y\\\":40.73563933242788,\\\"cluste
r\\\":4},{\\\"x\\\":-73.98621909320354,\\\"y\\\":40.735640856717666,\\\"cluster\\\":4},{\\\"x\\\":-73.98621842265129,\\\"y\\\":40.7356423810074,\\\"cluster\\\":4},{\\\"x\\\":-73.98621775209902,\\\"y\\\":40.73564034862108,\\\"cluster\\\":4},{\\\"x\\\":-73.98621842265129,\\\"y\\\":40.735640856717666,\\\"cluster\\\":4},{\\\"x\\\":-73.98621775209902,\\\"y\\\":40.73563984052446,\\\"cluster\\\":4},{\\\"x\\\":-73.98621909320354,\\\"y\\\":40.735638316234656,\\\"cluster\\\":4},{\\\"x\\\":-73.98621775209902,\\\"y\\\":40.735640856717666,\\\"cluster\\\":4}],\\\"c2c2ad5e-ef28-46e1-9c06-ba95b054c073\\\":[{\\\"x\\\":-73.98620970547199,\\\"y\\\":40.7356342514617,\\\"cluster\\\":5},{\\\"x\\\":-73.98620769381522,\\\"y\\\":40.73563526765495,\\\"cluster\\\":5},{\\\"x\\\":-73.98620970547199,\\\"y\\\":40.73563526765495,\\\"cluster\\\":5},{\\\"x\\\":-73.98620903491974,\\\"y\\\":40.73563577575159,\\\"cluster\\\":5},{\\\"x\\\":-73.98620836436749,\\\"y\\\":40.735634251461676,\\\"cluster\\\":5},{\\\"x\\\":-73.9862110465765,\\\"y\\\":40.7356362838482,\\\"cluster\\\":5},{\\\"x\\\":-73.98620970547199,\\\"y\\\":40.73563475955834,\\\"cluster\\\":5},{\\\"x\\\":-73.98620836436749,\\\"y\\\":40.73563272717173,\\\"cluster\\\":5},{\\\"x\\\":-73.98620836436749,\\\"y\\\":40.7356362838482,\\\"cluster\\\":5},{\\\"x\\\":-73.98620769381522,\\\"y\\\":40.73563577575159,\\\"cluster\\\":5}],\\\"0c10be2f-3343-4f0b-82bf-017e9935aabb\\\":[{\\\"data0\\\":-73.98632407188416,\\\"data1\\\":40.73557147326963},{\\\"data0\\\":-73.98622447129412,\\\"data1\\\":40.73570855047738},{\\\"data0\\\":-73.98618267156186,\\\"data1\\\":40.73569075660959},{\\\"data0\\\":-73.98621819913387,\\\"data1\\\":40.73564040507624},{\\\"data0\\\":-73.98620896786451,\\\"data1\\\":40.735635064416286},{\\\"data0\\\":-73.9862696826458,\\\"data1\\\":40.73554911699269},{\\\"data0\\\":-73.98632407188416,\\\"data1\\\":40.73557147326963}],\\\"8adc9db5-013d-4e38-b6ec-7de039499e2c\\\":[{\\\"data0\\\":-73.98632407188416,\\\"data1\\\":40.73557147326963},{\\\"
data0\\\":-73.98622447129412,\\\"data1\\\":40.73570855047738}],\\\"7b9f79b0-75f3-4447-80a1-356335986f5d\\\":[{\\\"data0\\\":-73.98622447129412,\\\"data1\\\":40.73570855047738},{\\\"data0\\\":-73.98618267156186,\\\"data1\\\":40.73569075660959}],\\\"3711473d-1dc0-48ef-b83f-e0abff3dc992\\\":[{\\\"data0\\\":-73.98618267156186,\\\"data1\\\":40.73569075660959},{\\\"data0\\\":-73.98621819913387,\\\"data1\\\":40.73564040507624}],\\\"4e4c2b7f-dead-432c-bd59-c9032a81753a\\\":[{\\\"data0\\\":-73.98621819913387,\\\"data1\\\":40.73564040507624},{\\\"data0\\\":-73.98620896786451,\\\"data1\\\":40.735635064416286}],\\\"08f22f21-80bf-409d-98a5-51507c927907\\\":[{\\\"data0\\\":-73.98620896786451,\\\"data1\\\":40.735635064416286},{\\\"data0\\\":-73.9862696826458,\\\"data1\\\":40.73554911699269}],\\\"c104f1a8-fdd8-486e-a43e-7fea2cbc3b5e\\\":[{\\\"data0\\\":-73.9862696826458,\\\"data1\\\":40.73554911699269},{\\\"data0\\\":-73.98632407188416,\\\"data1\\\":40.73557147326963}]},\\\"extension\\\":[]}\\n Nyaplot.core.parse(model, '#vis-63b28041-b4a5-46ca-a180-7b81d6e30777');\\n };\\n if(window['Nyaplot']==undefined){\\n window.addEventListener('load_nyaplot', render, false);\\n\\treturn;\\n }\\n render();\\n})();\\n</script>\\n\""
]
}
],
"prompt_number": 44
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To wrap it all up we create a single consensus function that receives a GeoJSON string and returns a list of mean polygons:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def calculate_polygonfix_consensus(geojson)\n",
" output = []\n",
" geom = parse_geojson(geojson)\n",
" centroids = get_all_centroids(geom)\n",
" centroid_clusters = cluster_centroids(centroids)\n",
" centroid_clusters.each do |ccluster|\n",
" next if ccluster[0] == -1\n",
" cluster = ccluster[1] # only the set of latlons\n",
" sub_geom = get_polys_for_centroid_cluster(cluster, centroids, geom)\n",
" next if sub_geom.size == 0\n",
" original_points = get_all_poly_points(sub_geom)\n",
" next if original_points == nil\n",
" unique_points = original_points.map{|poly| poly[1..-1]}\n",
" vertex_clusters = cluster_points(unique_points)\n",
" next if !validate_clusters(vertex_clusters, unique_points)\n",
" mean_poly = get_mean_poly(vertex_clusters)\n",
" next if mean_poly == {}\n",
" connections = connect_clusters(vertex_clusters, original_points)\n",
" next if connections == nil || connections == {}\n",
" poly = connect_mean_poly(mean_poly, connections)\n",
" next if poly == nil || poly.count == 0\n",
" output.push(poly)\n",
" end\n",
" return output\n",
"end"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 45,
"text": [
":calculate_polygonfix_consensus"
]
}
],
"prompt_number": 45
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"consensus = calculate_polygonfix_consensus(geomstr)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 46,
"text": [
"[[[-73.98632407188416, 40.73557147326963], [-73.98622447129412, 40.73570855047738], [-73.98618267156186, 40.73569075660959], [-73.98621819913387, 40.73564040507624], [-73.98620896786451, 40.735635064416286], [-73.9862696826458, 40.73554911699269], [-73.98632407188416, 40.73557147326963]]]"
]
}
],
"prompt_number": 46
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The GeoJSON of all this might look something like:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"geo_json = {:type => \"FeatureCollection\", :features => consensus.map { |f| {:type => \"Feature\", :properties => { :id => 1 }, :geometry => { :type => \"Polygon\", :coordinates =>[f] } } } }.to_json\n",
"puts geo_json"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"{\"type\":\"FeatureCollection\",\"features\":[{\"type\":\"Feature\",\"properties\":{\"id\":1},\"geometry\":{\"type\":\"Polygon\",\"coordinates\":[[[-73.98632407188416,40.73557147326963],[-73.98622447129412,40.73570855047738],[-73.98618267156186,40.73569075660959],[-73.98621819913387,40.73564040507624],[-73.98620896786451,40.735635064416286],[-73.9862696826458,40.73554911699269],[-73.98632407188416,40.73557147326963]]]}}]}\n"
]
}
],
"prompt_number": 47
},
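{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before handing the result to a map it can help to confirm that every ring is closed. The following is a sketch, assuming the FeatureCollection layout printed above; `closed_rings?` is a hypothetical helper and the sample ring is made up. The rule it checks (a linear ring has four or more positions, and its first and last positions are identical) comes from the GeoJSON specification.\n",
"\n",
"```ruby\n",
"require 'json'\n",
"\n",
"# Hypothetical helper: true when every ring of every feature has at least\n",
"# four positions and ends where it starts.\n",
"def closed_rings?(feature_collection)\n",
"  feature_collection['features'].all? do |feature|\n",
"    feature['geometry']['coordinates'].all? do |ring|\n",
"      ring.size >= 4 && ring.first == ring.last\n",
"    end\n",
"  end\n",
"end\n",
"\n",
"# Made-up triangle, closed by repeating the first position.\n",
"ring = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 0.0]]\n",
"sample = {\n",
"  'type' => 'FeatureCollection',\n",
"  'features' => [\n",
"    { 'type' => 'Feature', 'properties' => { 'id' => 1 },\n",
"      'geometry' => { 'type' => 'Polygon', 'coordinates' => [ring] } }\n",
"  ]\n",
"}\n",
"\n",
"# Round-trip through JSON, as a consumer of the string would.\n",
"parsed = JSON.parse(sample.to_json)\n",
"puts closed_rings?(parsed)  # true\n",
"```"
]
},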
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now let's plots the resulting GeoJSON on the original map (purple):"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"IRuby.html '<iframe src=\"http://jsfiddle.net/mgiraldo/m4XeU/1/embedded/result/\" width=500 height=400></iframe>'"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"<iframe src=\"http://jsfiddle.net/mgiraldo/m4XeU/1/embedded/result/\" width=500 height=400></iframe>"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 48,
"text": [
"\"<iframe src=\\\"http://jsfiddle.net/mgiraldo/m4XeU/1/embedded/result/\\\" width=500 height=400></iframe>\""
]
}
],
"prompt_number": 48
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Voil\u00e0! The mean polygon looks good!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Conclusion\n",
"\n",
"This is a first step towards finding geometric consensus from a list of user contributions to a given starting geometry and a map. It is a work in progress and hopefully other ideas can be added to improve this algorithm.\n",
"\n",
"This code is part of NYPL Labs' [Building Inspector](http://buildinginspector.nypl.org/). Explore and fork the [GitHub repository](https://github.com/NYPL/building-inspector).\n",
"\n",
"This notebook was created by [Mauricio Giraldo Arteaga](https://twitter.com/mgiraldo)."
]
}
],
"metadata": {}
}
]
}
@domitry

domitry commented Jul 25, 2014

Hi, current version of nyaplot can rotate labels on x and y-axis. Try it if you don't like over-lapping labels :)

example : http://nbviewer.ipython.org/urls/gist.githubusercontent.com/domitry/e087d69315075bebe3b1/raw/5110b04d5591c91b2bc269ed41d647bdec682f00/polygonfix%20writeup.ipynb

@mgiraldo
Author

Done :)