@empet
Created January 5, 2019 14:58
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Hexbin plot"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Hexagonal binning is a method for visualizing bivariate distributions. It is recommended \n",
"for identifying patterns in large 2D data sets.\n",
"\n",
"The underlying idea is as follows: a rectangular region containing the data set is tessellated with regular hexagons.\n",
"The number (or proportion) of points falling in each cell is counted and mapped to a color via a colormap.\n",
"The resulting chart is called a hexbin plot.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Matplotlib provides the function [`pyplot.hexbin`](http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.hexbin),\n",
"which returns an instance of `PolyCollection`. We call a few methods of this instance to get the data in a form suitable for a Plotly plot."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\n",
"%matplotlib inline\n",
"import numpy as np\n",
"import matplotlib.cm as cm\n",
"import cmocean  # http://matplotlib.org/cmocean/\n",
"\n",
"import plotly.graph_objs as go"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Our hexagonal tessellation consists of Plotly [shapes](https://plot.ly/python/reference/#layout-shapes) bounded by regular hexagons. The color of each cell is the corresponding matplotlib facecolor of the `PolyCollection`, converted to a Plotly color by a function defined below (`pl_cell_color`).\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Read data from a file:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"points = np.load('hexbin-data.npy')  # https://github.com/empet/Datasets/blob/master/hexbin-data.npy\n",
"x, y = points.T"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Call the matplotlib `hexbin` function on our data set. Since we only need an instance of the `PolyCollection` class, not its plot, we set a very small figure size.\n",
"\n",
"To have all attributes of this instance initialized, it is important to use `%matplotlib inline`, because some attributes are set only when the figure is drawn."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"plt.figure(figsize=(0.05, 0.05))\n",
"plt.axis('off')\n",
"HB = plt.hexbin(x, y, gridsize=25, cmap=cmocean.cm.algae, mincnt=1)  # cmocean.cm.algae is a cmocean colormap"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`gridsize` is the number of hexagons in the x direction. By default it is 100.\n",
"\n",
"`mincnt` is the minimum number of points a hexagon must contain in order to be plotted: only cells containing at least `mincnt` data points are drawn. With the default setting, empty cells are drawn as well, so we set `mincnt=1` to avoid plotting hexagons with no points."
]
},
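{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a minimal sketch of the effect of `mincnt`, the cell below bins synthetic data twice and compares the number of hexagons kept in each collection. The synthetic data and the names `xs`, `ys`, `hb_all` and `hb_nonempty` are assumptions made for illustration only; they are not part of the data set used in this notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"# synthetic data, used only for this illustration\n",
"rng = np.random.default_rng(0)\n",
"xs, ys = rng.normal(size=(2, 500))\n",
"\n",
"demo_fig, demo_ax = plt.subplots(figsize=(2, 2))\n",
"hb_all = demo_ax.hexbin(xs, ys, gridsize=10)                 # mincnt not set\n",
"hb_nonempty = demo_ax.hexbin(xs, ys, gridsize=10, mincnt=1)  # empty cells dropped\n",
"plt.close(demo_fig)  # only the collections are needed, not the plot\n",
"\n",
"# get_array() holds one count per plotted hexagon\n",
"print(len(hb_all.get_array()), len(hb_nonempty.get_array()))"
]
},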
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We define below the function `get_hexbin_attributes`, which returns the attributes of a hexbin-type `PolyCollection` object, namely:\n",
" \n",
"- the coordinates of the vertices $V_0, V_1, V_2, V_3, V_4, V_5, V_0$ of a prototypical hexagon of the tessellation, as an array-like of shape (7, 2). This hexagon is\n",
"symmetric with respect to the origin, $O(0,0)$, has two vertices on $Oy$, and is scaled such that `gridsize` hexagons fill a row of the tessellation. It is then translated to the corresponding positions in the rectangular region of the data, in order to get a hexagonal lattice;\n",
"- the `offsets` of these translations, as a `numpy.array` of shape `(no_hexagons, 2)`;\n",
"- the matplotlib color codes (facecolors) of each hexagon;\n",
"- the array of hexagonal bin counts.\n",
"\n",
"The offsets, facecolors and counts have the same length, equal to the number of hexagons containing at least `mincnt` points.\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def get_hexbin_attributes(hexbin):\n",
"    paths = hexbin.get_paths()\n",
"    points_codes = list(paths[0].iter_segments())  # paths[0].iter_segments() is a generator\n",
"    prototypical_hexagon = [item[0] for item in points_codes]\n",
"    return prototypical_hexagon, hexbin.get_offsets(), hexbin.get_facecolors(), hexbin.get_array()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The following function converts matplotlib facecolors to Plotly color codes:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def pl_cell_color(mpl_facecolors):\n",
"    # matplotlib RGBA tuples have components in [0, 1]; the alpha channel is dropped\n",
"    return [f'rgb({int(R*255)}, {int(G*255)}, {int(B*255)})' for (R, G, B, A) in mpl_facecolors]"
]
},
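{
"cell_type": "markdown",
"metadata": {},
"source": [
"The conversion above discards the alpha channel and truncates each scaled component with `int`. If transparency matters, a variant that emits `rgba(...)` strings could look like the sketch below. The function name `pl_cell_color_rgba` and the sample tuples are hypothetical and are not used elsewhere in this notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def pl_cell_color_rgba(mpl_facecolors):\n",
"    # hypothetical variant of pl_cell_color that keeps the alpha channel\n",
"    return [f'rgba({int(R*255)}, {int(G*255)}, {int(B*255)}, {A})'\n",
"            for (R, G, B, A) in mpl_facecolors]\n",
"\n",
"# two hand-written RGBA tuples standing in for matplotlib facecolors\n",
"sample_facecolors = [(0.0, 0.5, 1.0, 1.0), (1.0, 1.0, 0.0, 0.25)]\n",
"pl_cell_color_rgba(sample_facecolors)"
]
},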
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Define a function that takes the prototypical hexagon and an offset, and returns a closed hexagonal path\n",
"filled with the corresponding Plotly color. It also computes the hexagon center:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def make_hexagon(prototypical_hex, offset, fillcolor, linecolor=None):\n",
"    # translate the prototypical hexagon by the given offset\n",
"    new_hex_vertices = [vertex + offset for vertex in prototypical_hex]\n",
"    # drop the repeated closing vertex before averaging\n",
"    vertices = np.asarray(new_hex_vertices[:-1])\n",
"    # hexagon center\n",
"    center = np.mean(vertices, axis=0)\n",
"    if linecolor is None:\n",
"        linecolor = fillcolor\n",
"    # define the SVG-type path:\n",
"    path = 'M '\n",
"    for vert in new_hex_vertices:\n",
"        path += f'{vert[0]}, {vert[1]} L'\n",
"    return dict(type='path',\n",
"                line=dict(color=linecolor,\n",
"                          width=0.5),\n",
"                path=path[:-2],  # strip the trailing ' L'\n",
"                fillcolor=fillcolor,\n",
"                ), center"
]
},
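{
"cell_type": "markdown",
"metadata": {},
"source": [
"Because the prototypical hexagon is symmetric with respect to the origin, the center of each translated hexagon coincides with its offset. The sketch below checks this on a hand-written hexagon and builds the same SVG-type path with `str.join`. The vertex coordinates and the names `proto_hex`, `demo_offset`, `demo_center` and `svg_path` are hypothetical; the hexagon returned by matplotlib is scaled to the data and will generally differ."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# hypothetical hexagon: symmetric about the origin, two vertices on Oy,\n",
"# with the first vertex repeated at the end to close the polygon\n",
"proto_hex = np.array([[0.5, -0.5], [0.5, 0.5], [0.0, 1.0], [-0.5, 0.5],\n",
"                      [-0.5, -0.5], [0.0, -1.0], [0.5, -0.5]])\n",
"demo_offset = np.array([3.0, 2.0])  # hypothetical translation\n",
"\n",
"translated = proto_hex + demo_offset         # translate all vertices at once\n",
"demo_center = translated[:-1].mean(axis=0)   # skip the repeated closing vertex\n",
"\n",
"# same path format as in make_hexagon, without trimming a trailing ' L'\n",
"svg_path = 'M ' + ' L'.join(f'{vx}, {vy}' for vx, vy in translated)\n",
"print(demo_center, np.allclose(demo_center, demo_offset))\n",
"print(svg_path)"
]
},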
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we can transform the hexbin, HB, to a Plotly 2D hexagonal histogram:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"hexagon_vertices, offsets, mpl_facecolors, counts = get_hexbin_attributes(HB)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The prototypical hexagon has the vertices:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"hexagon_vertices[:-1]  # the last vertex coincides with the first one"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"cell_color = pl_cell_color(mpl_facecolors)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"shapes = []\n",
"centers = []\n",
"for k in range(len(offsets)):\n",
"    shape, center = make_hexagon(hexagon_vertices, offsets[k], cell_color[k])\n",
"    shapes.append(shape)\n",
"    centers.append(center)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In order to associate a colorbar to the hexbin plot, we define a dummy `Scatter` trace representing the hexagon centers.\n",
"The `color` attribute is the list of counts, and the colorscale is the Plotly colorscale corresponding to the matplotlib\n",
"colormap passed in the call of `plt.hexbin()` above."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A matplotlib colormap is converted into a Plotly colorscale with N entries by the following function:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def mpl_to_plotly(cmap, N):\n",
"    h = 1.0/(N-1)  # step between consecutive colorscale entries\n",
"    pl_colorscale = []\n",
"    for k in range(N):\n",
"        C = list(map(np.uint8, np.array(cmap(k*h)[:3])*255))\n",
"        pl_colorscale.append([round(k*h, 2), f'rgb({C[0]}, {C[1]}, {C[2]})'])\n",
"    return pl_colorscale"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"pl_algae = mpl_to_plotly(cmocean.cm.algae, 11)\n",
"pl_algae"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Get data for the Plotly Scatter trace of hexagon centers:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"X, Y = zip(*centers)\n",
"\n",
"# define the text to be displayed when hovering over the cells\n",
"text = [f'x: {round(X[k],2)}<br>y: {round(Y[k],2)}<br>counts: {int(counts[k])}' for k in range(len(X))]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"trace = go.Scatter(\n",
"    x=list(X),\n",
"    y=list(Y),\n",
"    mode='markers',\n",
"    marker=dict(size=0.5,\n",
"                color=counts,\n",
"                colorscale=pl_algae,\n",
"                showscale=True,\n",
"                colorbar=dict(\n",
"                    thickness=20,\n",
"                    ticklen=4\n",
"                )),\n",
"    text=text,\n",
"    hoverinfo='text'\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"axis = dict(showgrid=False,\n",
"            showline=False,\n",
"            zeroline=False,\n",
"            ticklen=4\n",
"            )\n",
"\n",
"layout = go.Layout(title='Hexbin plot',\n",
"                   width=530, height=550,\n",
"                   xaxis=axis,\n",
"                   yaxis=axis,\n",
"                   hovermode='closest',\n",
"                   shapes=shapes,\n",
"                   plot_bgcolor='black')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"fig = go.FigureWidget(data=[trace], layout=layout)\n",
"fig"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from IPython.display import HTML\n",
"\n",
"def css_styling():\n",
"    # load the notebook's custom stylesheet (expects ./custom.css next to the notebook)\n",
"    with open('./custom.css', 'r') as f:\n",
"        styles = f.read()\n",
"    return HTML(styles)\n",
"\n",
"css_styling()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.4"
}
},
"nbformat": 4,
"nbformat_minor": 1
}