trevormunoz/finding-cluster-breaks.ipynb

## finding-cluster-breaks.ipynb
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Binning Horizontal Page Position Data\n",
    "\n",
    "22 July 2017"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "const dsv = require('d3-dsv');\n",
    "const arr = require('d3-array');\n",
    "const stats = require('simple-statistics');\n",
    "const fs = require('fs');\n",
    "const path = require('path');"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "let __dirname = path.resolve();\n",
    "let filePath = path.join(__dirname, '..', '/data/modified/xpositions_years_dataset.csv')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Read in the contents of the CSV file synchronously …"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "let dataString = fs.readFileSync(filePath, {encoding: 'utf-8'});"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Parse the contents into an array of objects …"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "let data = dsv.csvParse(dataString)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[Array] [\"dish_id\",\"year\",\"scaled_xpos\"]"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data.columns"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### D3\n",
    "\n",
    "Look at a couple of different ways of dividing the data into bins. First using the histogram generator function from D3. Got the idea from [this StackOverflow question](https://stackoverflow.com/questions/37445495/binning-an-array-in-javascript-for-a-histogram).\n",
    "\n",
    "The value function in each case below lets us tell the histogram generator to use the `scaled_xpos` value without having to modify the original objects."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Preset number of bins (here: 4)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "let histogram = arr.histogram()\n",
    "                    .value(function(d,i,array) { return d['scaled_xpos']; })\n",
    "                    .thresholds(4);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Freedman-Diaconis threshold algorithm"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "let histogram1 = arr.histogram()\n",
    "                    .value(function(d,i,array) { return d['scaled_xpos']; })\n",
    "                    .thresholds(arr.thresholdFreedmanDiaconis);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Scott threshold algorithm"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "let histogram2 = arr.histogram()\n",
    "                    .value(function(d,i,array) { return d['scaled_xpos']; })\n",
    "                    .thresholds(arr.thresholdScott);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Sturges threshold algorithm (d3 default)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "let histogram3 = arr.histogram()\n",
    "                    .value(function(d,i,array) { return d['scaled_xpos']; })\n",
    "                    .thresholds(arr.thresholdSturges);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Run the histogram generators on the data to get the bins …"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [],
   "source": [
    "// preset\n",
    "let bins = histogram(data)\n",
    "\n",
    "//Freedman-Diaconis\n",
    "let bins1 = histogram1(data)\n",
    "\n",
    "// Scott\n",
    "let bins2 = histogram2(data)\n",
    "\n",
    "//Sturges\n",
    "let bins3 = histogram3(data)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "4"
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "bins.length // Sanity check!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "197"
      ]
     },
     "execution_count": 36,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "//Freedman-Diaconis\n",
    "bins1.length"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "197"
      ]
     },
     "execution_count": 37,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "//Scott\n",
    "bins2.length"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "19"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "// Sturges\n",
    "bins3.length"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If we wanted to report the number of values and the lower and upper bounds of each bin for a given binning method, we could run the following:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
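    "// x0 and x1 are the bin's bounds in scaled_xpos units, not array indexes\n",
    "for (let b = 0; b < bins1.length; b++) {\n",
    "    console.log(\"Lower bound (inclusive): \" + bins1[b].x0);\n",
    "    console.log(\"Upper bound (exclusive, except last bin): \" + bins1[b].x1);\n",
    "    console.log(\"Bin size: \" + bins1[b].length);\n",
    "    console.log(\"====\");\n",
    "}"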
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Simple Statistics\n",
    "\n",
    "None of the above is particularly satisfying. Let's try a method from another library …"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Have to explicitly pull out the position values …"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "let positions = data.map((d) => { return d['scaled_xpos']; })"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "\"111.429\""
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "positions[0] // Sanity check"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Some Googling led me to [this Cross Validated (Stack Exchange) discussion](https://stats.stackexchange.com/questions/34242/how-to-intelligently-bin-a-collection-of-sorted-data). First, I followed the suggestion to look at using the [Jenks natural breaks optimization](http://en.wikipedia.org/wiki/Jenks_natural_breaks_optimization) algorithm. I remembered seeing an implementation of this in an impressive-looking JavaScript statistics library I'd perused before called [`simple-statistics`](https://www.npmjs.com/package/simple-statistics).\n",
    "\n",
    "When I went looking for the algorithm in the current version of the library, I saw that it had been superseded by an implementation of [ckmeans clustering](https://simplestatistics.org/docs/#ckmeans). So, let's try that. For the sake of this experiment, I picked 4 clusters, thinking of our previous visualizations.\n",
    "\n",
    "(But of course, I don't really know what I am doing with this algorithm!)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "// This takes a couple hours to run!\n",
    "let clusters = stats.ckmeans(positions, 4)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As promised, this generates four clusters …"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "4"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "clusters.length // sanity check"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Out of curiosity, how many values are in each cluster?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "436568"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "clusters[0].length"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "333537"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "clusters[1].length"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "306477"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "clusters[2].length"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "255927"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "clusters[3].length"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "At the moment, I just want to know the values at the \"breaks\" between the clusters, so grab the last value from each array (each array represents one cluster, and its last value is the largest position in that cluster)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [],
   "source": [
    "let breaks = clusters.map(function(item, index, array) {\n",
    "    return item.slice(-1)[0]\n",
    "});"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[Array] [\"238.667\",\"433.333\",\"620.0\",\"985.333\"]"
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "breaks"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So, I'm interpreting this as four bins of x-position data. If the position is at most 238.667, consider it part of \"column 1\". If the position is greater than 238.667 and at most 433.333, consider it part of \"column 2\", etc. …"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next: combine these break point positions with our dot scatter graphs and see if they make any sense."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "NodeJS v6.9.0",
   "language": "javascript",
   "name": "nodejs"
  },
  "language_info": {
   "codemirror_mode": "javascript",
   "file_extension": ".js",
   "mimetype": "text/javascript",
   "name": "nodejs",
   "pygments_lexer": "javascript",
   "version": "0.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
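
The notebook's closing interpretation (each break is the upper edge of a column) can be sketched as a small lookup. This is only an illustration under stated assumptions: `columnFor` is a hypothetical helper that does not appear in the notebook, the sample rows are made up, and each break is treated as an inclusive upper bound. Because `csvParse` returns every field as a string, the sketch converts positions and breaks to numbers before comparing them.

```javascript
// Hypothetical helper (not part of the notebook): map an x-position to a
// 1-based column number, treating each break as an inclusive upper bound.
function columnFor(xpos, breaks) {
  const x = Number(xpos);            // CSV fields arrive as strings, so coerce
  const upper = breaks.map(Number);  // the breaks were recorded as strings too
  if (Number.isNaN(x)) return null;  // position could not be parsed
  for (let i = 0; i < upper.length; i++) {
    if (x <= upper[i]) return i + 1; // first break at or above the position
  }
  return null;                       // position lies beyond the last break
}

// Break values copied from the notebook's `breaks` output.
const breaks = ["238.667", "433.333", "620.0", "985.333"];

// Illustrative rows only; these dish_id and year values are invented.
const sample = [
  { dish_id: "a", year: "1900", scaled_xpos: "111.429" },
  { dish_id: "b", year: "1900", scaled_xpos: "238.667" },
  { dish_id: "c", year: "1901", scaled_xpos: "300" },
  { dish_id: "d", year: "1901", scaled_xpos: "620.0" },
  { dish_id: "e", year: "1902", scaled_xpos: "700.5" },
  { dish_id: "f", year: "1902", scaled_xpos: "990" },
];

const columns = sample.map((d) => columnFor(d.scaled_xpos, breaks));
console.log(columns); // [ 1, 1, 2, 3, 4, null ]
```

Returning `null` for positions beyond the last break keeps out-of-range values visible instead of silently folding them into column 4. Whether that is the right choice depends on how the breaks line up with the scatter plots.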