@michaelmalak
Created May 11, 2014 19:19
GeoSparkGram
{
"metadata": {
"name": "",
"signature": "sha256:10fff447c215331555586b57dc5e85aceba196cc5df2037db7c6d1cf2757b55c"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This notebook requires IPython Notebook 2.0 to run the inline d3.js JavaScript. If you install IPython Notebook through <a href=\"http://continuum.io/downloads\" target=\"_blank\">Anaconda</a>, note that as of this writing (May 11, 2014) Anaconda still ships IPython 1.x, even though IPython 2.0 was released on April 1, 2014. After installing Anaconda, you can update it to IPython 2.x by typing the following into a command prompt:\n",
"<PRE>\n",
"conda update conda\n",
"conda update ipython\n",
"</PRE>\n",
"On Windows at least, this may remove the convenient launch icon for IPython Notebook. If so, you can launch it manually from a command prompt with:\n",
"<PRE>\n",
"ipython notebook\n",
"</PRE>\n",
"<p>On Windows, the notebook shells out to wget, unzip, and rm, so install the following GnuWin32 packages and add C:\\Program Files (x86)\\GnuWin32\\bin to your PATH:</p>\n",
"http://gnuwin32.sourceforge.net/packages/wget.htm<br />\n",
"http://gnuwin32.sourceforge.net/packages/unzip.htm<br />\n",
"http://gnuwin32.sourceforge.net/packages/coreutils.htm<br />"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import math\n",
"import os\n",
"import datetime\n",
"import numpy\n",
"import pandas\n",
"import matplotlib.pyplot as plt\n",
"%matplotlib inline"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 584
},
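{
"cell_type": "code",
"collapsed": false,
"input": [
"def tofloat(x):\n",
"    # Blank or non-numeric readings become NaN so they can be filtered out later\n",
"    try:\n",
"        return float(x)\n",
"    except (TypeError, ValueError):\n",
"        return float('nan')"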
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 585
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Cities were hand-selected. Each city's WBAN station identifier was looked up manually at http://cdo.ncdc.noaa.gov/qclcd/QCLCD?prior=N and its county INCITS code was looked up manually at http://en.wikipedia.org/wiki/List_of_United_States_counties_and_county_equivalents#Table (the TopoJSON map also identifies counties by INCITS code)."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"dfcities = pandas.DataFrame([{'City':'Centennial', 'WBAN':93067, 'INCITS':8005},\n",
" {'City':'San Diego', 'WBAN':3131, 'INCITS':6073},\n",
" {'City':'Washington, DC', 'WBAN':13743, 'INCITS':11001},\n",
" {'City':'San Francisco', 'WBAN':23234, 'INCITS':6075},\n",
" {'City':'New York City', 'WBAN':94728, 'INCITS':36061},\n",
" {'City':'Atlanta', 'WBAN':13874, 'INCITS':13121},\n",
" {'City':'Phoenix', 'WBAN':23183, 'INCITS':4013},\n",
" {'City':'Dallas', 'WBAN':3927, 'INCITS':48113},\n",
" {'City':'Seattle', 'WBAN':24233, 'INCITS':53033},\n",
" {'City':'Kansas City', 'WBAN':3947, 'INCITS':29165},\n",
" {'City':'Minneapolis', 'WBAN':14922, 'INCITS':27053},\n",
" {'City':'New Orleans', 'WBAN':12916, 'INCITS':22051},\n",
" {'City':'Chicago', 'WBAN':94846, 'INCITS':17031},\n",
" {'City':'Anchorage', 'WBAN':26451, 'INCITS':2020},\n",
" {'City':'Honolulu', 'WBAN':22521, 'INCITS':15003},\n",
" {'City':'Boston', 'WBAN':14739, 'INCITS':25025},\n",
" {'City':'Miami', 'WBAN':12839, 'INCITS':12086},\n",
" {'City':'Detroit', 'WBAN':94847, 'INCITS':26163},\n",
" {'City':'Pittsburgh', 'WBAN':94823, 'INCITS':42003},\n",
" {'City':'Las Vegas', 'WBAN':23169, 'INCITS':32003},\n",
" {'City':'Houston', 'WBAN':12960, 'INCITS':48201}])"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 586
},
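{
"cell_type": "code",
"collapsed": false,
"input": [
"# Work in a scratch directory; skip mkdir if a previous run left it behind\n",
"if not os.path.isdir(\"TempBarometerFiles\"):\n",
"    os.mkdir(\"TempBarometerFiles\")\n",
"os.chdir(\"TempBarometerFiles\")"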
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 587
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"processingyear = datetime.date.today().year\n",
"processingmonth = datetime.date.today().month\n",
"dfdiff = pandas.DataFrame(numpy.zeros(0, dtype=[('INCITS', 'S10'), ('Range', 'f8')]))\n",
"# Step back one month per iteration to cover the 12 most recent complete months\n",
"for x in range(0, 12):\n",
"    # The day before the 1st of a month always falls in the previous month\n",
"    dt = datetime.datetime(processingyear, processingmonth, 1) - datetime.timedelta(days=1)\n",
"    processingyear = dt.year\n",
"    processingmonth = dt.month\n",
"    yearmonth = str(processingyear) + str(processingmonth).zfill(2)\n",
"    os.system(\"wget -q http://cdo.ncdc.noaa.gov/qclcd_ascii/QCLCD\" + yearmonth + \".zip\")\n",
"    os.system(\"unzip QCLCD\" + yearmonth + \".zip\")\n",
"    df = pandas.read_csv(yearmonth + \"hourly.txt\", low_memory=False)\n",
"    # Keep only the hand-picked stations and the columns needed below\n",
"    dfsp = df.merge(dfcities, on=\"WBAN\").loc[:, [\"INCITS\", \"Date\", \"StationPressure\"]]\n",
"    dfsp[\"StationPressureFloat\"] = dfsp[\"StationPressure\"].apply(tofloat)\n",
"    del dfsp[\"StationPressure\"]\n",
"    dfsp = dfsp[dfsp[\"StationPressureFloat\"].notnull()]\n",
"    # Daily range = maximum minus minimum station pressure per county per day\n",
"    gb = dfsp.groupby([\"INCITS\", \"Date\"])\n",
"    dfminmax = gb.min().join(gb.max(), lsuffix=\"Min\", rsuffix=\"Max\")\n",
"    dfdiffcur = (dfminmax[\"StationPressureFloatMax\"] - dfminmax[\"StationPressureFloatMin\"]).to_frame(\"Range\")\n",
"    dfdiffcur.reset_index(level=0, inplace=True)\n",
"    dfdiff = pandas.concat([dfdiff, dfdiffcur])\n",
"    os.system(\"rm *.txt\")"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 588
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"arrhist = []\n",
"# One 10-bin histogram of daily pressure ranges (0 to 0.6 inches of mercury) per county\n",
"for incits in dfcities[\"INCITS\"].values:\n",
"    arrhist.append({'id':incits, 'hist':numpy.histogram(dfdiff.loc[dfdiff[\"INCITS\"]==incits,\"Range\"],range=(0,0.6))[0]})"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 589
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Since about January 2014, d3.js attempts to cooperate with an AMD loader if one is present. IPython Notebook 2.0 includes one (require.js), so the notebook has to load d3.js through require.js instead of directly."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%%html\n",
"<style type=\"text/css\">\n",
".land {\n",
" fill: silver;\n",
"}\n",
".states {\n",
" fill: none;\n",
" stroke: black;\n",
" stroke-linejoin: round;\n",
"}\n",
"</style>\n",
"\n",
"<div id=\"county_map\" style=\"height:600px; width:100%\"></div>"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"<style type=\"text/css\">\n",
".land {\n",
" fill: silver;\n",
"}\n",
".states {\n",
" fill: none;\n",
" stroke: black;\n",
" stroke-linejoin: round;\n",
"}\n",
"</style>\n",
"\n",
"<div id=\"county_map\" style=\"height:600px; width:100%\"></div>"
],
"metadata": {},
"output_type": "display_data",
"text": [
"<IPython.core.display.HTML at 0xec7e160>"
]
}
],
"prompt_number": 590
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<center><h2>GeoSparkGrams of Daily Barometric Volatility</h2></center>\n",
"<p>Daily variation of barometric pressure (maximum minus minimum for each day) in inches of mercury, for the past 12 months. For each of the hand-picked major cities, the roughly 365 daily ranges for that city are histogrammed.</p>\n",
"<p>Each histogram has 10 bins spanning daily ranges from 0.00 to 0.60 inches of mercury (horizontal axis). The vertical axis runs from 0 to 150 days.</p>"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from IPython.display import Javascript, display\n",
"# Embed the histogram data as a JSON literal, then draw the map with d3.js\n",
"display(Javascript(\"var histdata = \" + pandas.DataFrame(arrhist).to_json(orient='records') + \";\" + \"\"\"\n",
"// https://github.com/mbostock/d3/issues/1693\n",
"require.config({\n",
"  paths: {\n",
"    d3: \"http://d3js.org/d3.v3.min\",\n",
"    topojson: \"http://d3js.org/topojson.v1.min\"\n",
"  }\n",
"});\n",
"\n",
"require([\"d3\", \"topojson\"], function(d3, topojson) {\n",
"\n",
"  var square = 40;     // side length of each sparkgram, in pixels\n",
"  var ydaysmax = 150;  // top of the vertical axis, in days\n",
"\n",
"  var path = d3.geo.path();\n",
"\n",
"  var svg = d3.select('#county_map').append(\"svg\")\n",
"      .attr(\"width\", 960)\n",
"      .attr(\"height\", 500);\n",
"\n",
"  d3.json(\"http://mashupguide.net/wwod14/us.json\", function(error, us) {\n",
"\n",
"    var countylist = histdata.map(function(x) {return x.id});\n",
"\n",
"    svg.insert(\"path\", \".graticule\")\n",
"        .datum(topojson.feature(us, us.objects.land))\n",
"        .attr(\"class\", \"land\")\n",
"        .attr(\"d\", path);\n",
"\n",
"    // The filter belongs to topojson.mesh: keep only borders shared by two different states\n",
"    svg.append(\"path\")\n",
"        .datum(topojson.mesh(us, us.objects.states, function(a, b) { return a !== b; }))\n",
"        .attr(\"class\", \"states\")\n",
"        .attr(\"d\", path);\n",
"\n",
"    // One nested svg per selected county, centered on the county centroid\n",
"    var percountysvg = svg.append(\"g\")\n",
"        .selectAll(\"svg\")\n",
"        .data(topojson.feature(us,us.objects.counties).features.filter(function(x) {return countylist.indexOf(x.id) >= 0;}))\n",
"        .enter()\n",
"        .append(\"svg\")\n",
"        .attr(\"x\", function(d) {return path.centroid(d)[0] - square/2})\n",
"        .attr(\"y\", function(d) {return path.centroid(d)[1] - square/2});\n",
"\n",
"    percountysvg.append(\"rect\")\n",
"        .attr(\"width\", square)\n",
"        .attr(\"height\", square)\n",
"        .attr(\"fill\", \"white\")\n",
"        .attr(\"stroke\", \"black\");\n",
"\n",
"    var xscale = d3.scale.linear().domain([0, histdata[0].hist.length]).range([0, square]);\n",
"    var yscale = d3.scale.linear().domain([0, ydaysmax]).range([square, 0]);\n",
"\n",
"    var areapath = d3.svg.area()\n",
"        .x(function(d) { return xscale(d.x); })\n",
"        .y0(square)\n",
"        .y1(function(d) { return yscale(d.y); })\n",
"        .interpolate(\"linear\");\n",
"\n",
"    // Look up this county's histogram by id and draw it as a filled area\n",
"    percountysvg.append(\"path\")\n",
"        .attr(\"d\", function(d) {return areapath($.grep(histdata, function(x){ return x.id==d.id; })\n",
"            [0].hist.map(function(y,i) { return {x:i,y:y}; }));})\n",
"        .attr(\"fill\", \"blue\");\n",
"  });\n",
"});\n",
"\"\"\"))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"javascript": [
"var histdata = eval('[{\"hist\":[12,128,90,65,35,20,8,4,2,1],\"id\":8005},{\"hist\":[57,246,57,5,0,0,0,0,0,0],\"id\":6073},{\"hist\":[6,93,105,60,32,25,15,10,5,7],\"id\":11001},{\"hist\":[27,224,84,18,7,4,1,0,0,0],\"id\":6075},{\"hist\":[7,80,98,57,40,27,16,6,8,6],\"id\":36061},{\"hist\":[7,164,126,34,13,12,5,2,0,2],\"id\":13121},{\"hist\":[2,87,221,48,6,0,0,1,0,0],\"id\":4013},{\"hist\":[1,73,141,51,34,18,7,2,4,3],\"id\":48113},{\"hist\":[29,123,105,45,21,19,5,10,4,2],\"id\":53033},{\"hist\":[11,99,87,62,33,30,16,14,8,1],\"id\":29165},{\"hist\":[25,81,92,50,35,24,12,15,15,5],\"id\":27053},{\"hist\":[12,197,101,20,22,8,2,3,0,0],\"id\":22051},{\"hist\":[20,97,81,61,27,38,15,5,11,4],\"id\":17031},{\"hist\":[34,96,82,48,37,25,13,8,6,4],\"id\":2020},{\"hist\":[8,316,41,0,0,0,0,0,0,0],\"id\":15003},{\"hist\":[13,74,80,56,53,28,20,11,7,5],\"id\":25025},{\"hist\":[11,285,53,11,3,2,0,0,0,0],\"id\":12086},{\"hist\":[14,89,97,58,36,25,15,10,9,6],\"id\":26163},{\"hist\":[12,116,87,62,29,25,13,9,2,2],\"id\":42003},{\"hist\":[1,94,185,51,16,11,4,2,1,0],\"id\":32003},{\"hist\":[3,137,140,38,29,8,4,1,4,1],\"id\":48201}]');\n",
"// https://github.com/mbostock/d3/issues/1693\n",
"require.config({\n",
" paths: {\n",
" d3: \"http://d3js.org/d3.v3.min\",\n",
" topojson: \"http://d3js.org/topojson.v1.min\"\n",
" }\n",
"});\n",
"\n",
"require([\"d3\", \"topojson\"], function(d3, topojson) {\n",
"\n",
" var square = 40;\n",
" var ydaysmax = 150;\n",
" \n",
" var path = d3.geo.path();\n",
"\n",
" var svg = d3.select('#county_map').append(\"svg\")\n",
" .attr(\"width\", 960)\n",
" .attr(\"height\", 500);\n",
"\n",
" d3.json(\"http://mashupguide.net/wwod14/us.json\", function(error, us) {\n",
" \n",
" var countylist = histdata.map(function(x) {return x.id});\n",
"\n",
" svg.insert(\"path\", \".graticule\")\n",
" .datum(topojson.feature(us, us.objects.land))\n",
" .attr(\"class\", \"land\")\n",
" .attr(\"d\", path);\n",
"\n",
" svg.append(\"path\")\n",
" .datum(topojson.mesh(us, us.objects.states), function(a, b) { return a !== b; })\n",
" .attr(\"class\", \"states\")\n",
" .attr(\"d\", path);\n",
"\n",
" var percountysvg = svg.append(\"g\")\n",
" .selectAll(\"svg\")\n",
" .data(topojson.feature(us,us.objects.counties).features.filter(function(x) {return countylist.indexOf(x.id) >= 0;}))\n",
" .enter()\n",
" .append(\"svg\")\n",
" .attr(\"x\", function(d) {return d3.geo.path().centroid(d)[0] - square/2})\n",
" .attr(\"y\", function(d) {return d3.geo.path().centroid(d)[1] - square/2}) \n",
" \n",
" percountysvg.append(\"rect\")\n",
" .attr(\"width\", square)\n",
" .attr(\"height\", square)\n",
" .attr(\"fill\", \"white\")\n",
" .attr(\"stroke\", \"black\")\n",
"\n",
" var xscale = d3.scale.linear().domain([0, histdata[0].hist.length]).range([0, square]);\n",
" var yscale = d3.scale.linear().domain([0, ydaysmax]).range([square, 0]);\n",
"\n",
" var areapath = d3.svg.area()\n",
" .x(function(d) { return xscale(d.x); })\n",
" .y0(square)\n",
" .y1(function(d) { return yscale(d.y); })\n",
" .interpolate(\"linear\");\n",
" \n",
" percountysvg.append(\"path\")\n",
" .attr(\"d\", function(d) {return areapath($.grep(histdata, function(x){ return x.id==d.id; })\n",
" [0].hist.map(function(y,i) { return {x:i,y:y}; }));})\n",
" .attr(\"fill\", \"blue\");\n",
" });\n",
"});\n"
],
"metadata": {},
"output_type": "display_data",
"text": [
"<IPython.core.display.Javascript at 0xec698d0>"
]
}
],
"prompt_number": 591
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"os.chdir(\"..\")"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 592
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"!rm -rf TempBarometerFiles"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 593
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 593
}
],
"metadata": {}
}
]
}
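
The per-county histogram step in the notebook above can be tried on its own, without downloading any NOAA files. The following is a minimal sketch, assuming a small hand-made table in place of the real QCLCD hourly data. The names `to_float` and `daily_range_histogram`, the `M` placeholder for a missing reading, and the sample values are illustrative assumptions and are not part of the original notebook. It computes the same quantity (daily maximum minus daily minimum station pressure, binned into 10 bins from 0 to 0.6) using `groupby` directly on the pressure column.

```python
import numpy as np
import pandas as pd


def to_float(value):
    """Parse a pressure reading; blank or non-numeric values become NaN."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return float("nan")


def daily_range_histogram(hourly, bins=10, value_range=(0.0, 0.6)):
    """Histogram of (daily max - daily min) station pressure per county.

    `hourly` needs the columns INCITS, Date and StationPressure.
    Returns a dict mapping each INCITS code to a list of bin counts.
    """
    frame = hourly.copy()
    frame["Pressure"] = frame["StationPressure"].apply(to_float)
    frame = frame.dropna(subset=["Pressure"])

    # One max and one min per (county, day); their difference is the daily range.
    grouped = frame.groupby(["INCITS", "Date"])["Pressure"]
    daily_range = (grouped.max() - grouped.min()).reset_index(name="Range")

    result = {}
    for incits, group in daily_range.groupby("INCITS"):
        counts, _ = np.histogram(group["Range"], bins=bins, range=value_range)
        result[int(incits)] = counts.tolist()
    return result


# Synthetic readings (not real QCLCD data): two days for a single county.
# "M" is a hypothetical placeholder standing in for an unparseable reading.
sample = pd.DataFrame({
    "INCITS": [8005, 8005, 8005, 8005, 8005],
    "Date": [20140501, 20140501, 20140501, 20140502, 20140502],
    "StationPressure": ["24.10", "24.25", "M", "24.00", "24.40"],
})
histograms = daily_range_histogram(sample)
print(histograms)
```

With the sample above, the two daily ranges are about 0.15 and 0.40, so one day lands in the third bin and one in the seventh. Returning plain lists keeps the result easy to serialize to JSON for the d3.js step.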