hui-tony-zk/SportsHack15.ipynb

## SportsHack15.ipynb
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h2 align = center> Vancouver | Toronto | Halifax </h2>\n",
    "<br>\n",
    "<a href = \"http://sportshackweekend.org/\"><img src = \"http://sportshackweekend.org/ca/2015/img/sh_logo2.png\"></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h1 align=center> #SportsHack15 </h1>\n",
    "<h1 align=center>November 27-29, 2015 </h1>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# Please run this cell, by pressing Shift+Enter. Will auto-hide when run.\n",
    "\n",
    "from IPython.display import HTML\n",
    "import warnings\n",
    "warnings.simplefilter(action = \"ignore\", category = FutureWarning)\n",
    "\n",
    "\n",
    "from IPython.html import widgets\n",
    "from IPython.display import display, clear_output\n",
    "\n",
    "def downloaddata(x):\n",
    "    clear_output()\n",
    "    if checkbox.value == True:\n",
    "        print \"Downloading to [/resources/ticats_users.csv] (676 Kb) ...\"\n",
    "        !wget --quiet --output-document ticats_users.csv https://ibm.box.com/shared/static/9smn57rcdd9a99wmkyqoj27zvg7igu2t.csv\n",
    "        print \"...DONE.\"\n",
    "        \n",
    "        print \"Downloading to [/resources/ticats_useractions.csv] (31.1 Mb) ...\"\n",
    "        !wget --quiet --output-document ticats_useractions.csv https://ibm.box.com/shared/static/cga07fxyp6xslxbyut7qr0ybb7u71r2i.csv\n",
    "        print \"...DONE.\"\n",
    "        print \"Download complete. Check the files under 'Recent Data'. \"\n",
    "    else:\n",
    "        print \"You did not accept the terms.\"\n",
    "        \n",
    "checkbox = widgets.Checkbox(description = \"I Accept\", value = False)\n",
    "\n",
    "button = widgets.Button(description = 'Click me to download to DSWB')\n",
    "button.on_click(downloaddata)\n",
    "\n",
    "hide_me = ''\n",
    "HTML('''<script>\n",
    "code_show=true; \n",
    "function code_toggle() {\n",
    "  if (code_show) {\n",
    "    $('div.input').each(function(id) {\n",
    "      el = $(this).find('.cm-variable:first');\n",
    "      if (id == 0 || el.text() == 'hide_me') {\n",
    "        $(this).hide();\n",
    "      }\n",
    "    });\n",
    "    $('div.output_prompt').css('opacity', 0);\n",
    "  } else {\n",
    "    $('div.input').each(function(id) {\n",
    "      $(this).show();\n",
    "    });\n",
    "    $('div.output_prompt').css('opacity', 1);\n",
    "  }\n",
    "  code_show = !code_show\n",
    "} \n",
    "$( document ).ready(code_toggle);\n",
    "</script>\n",
    "<form action=\"javascript:code_toggle()\"><input style=\"opacity:0\" type=\"submit\" value=\"Click here to toggle on/off the raw code.\"></form>''')\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a href = https://twitter.com/intent/tweet?text=Aiming+for+the+first+prize+of+%244500+at+%23SportsHack15%21+%40BigDataU+%40RyersonRC4+%40lighthouse_labs+%40CFL+%40voltaeffect+sportshackweekend.org><img src = \"https://ibm.box.com/shared/static/n9d6z6rsw5fv5sn8txs3xhbch97f9hpp.png\" width = 700  style=\"padding:1px;border:thin solid black;\"></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<hr>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h1, align=center>Quickstart Guide to Data Scientist Workbench"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "from IPython.display import IFrame\n",
    "IFrame(\"https://www.youtube.com/embed/3oI9z0Wq5u4\", width = 640, height = 480)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## In Data Scientist Workbench, you can:\n",
    "- use R, Python, Scala notebooks\n",
    "- use [RStudio IDE](https://datascientistworkbench.com/rstudio)\n",
    "- use [OpenRefine](https://datascientistworkbench.com/openrefine)\n",
    "- use Apache Spark"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br><br>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h2 align = center> How to Upload/Import Data into DSWB </h2>\n",
    "<img src = https://ibm.box.com/shared/static/cqjnps0dxflroy69jmujgslh6j5kksob.png width = 640>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Example of `wget`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "#Example. Click in this cell and press Shift+Enter to run. Dummy URL used.\n",
    "!wget --output-document destination_file.txt https://0.0.0.0/source_file.txt"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h2 align = center> How to Upload Notebooks into DSWB </h2>\n",
    "<img src = https://ibm.box.com/shared/static/04iujuo353e4v0qlgke243nwepaqvjzo.png width = 640>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h2 align = center> How to Share Notebooks with Another User </h2>\n",
    "<img src = https://ibm.box.com/shared/static/cf5wjnacs30n0jhq3sldnkelw0vxp3kn.png width = 640>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h2 align = center> How to Create a New Notebook (Python, R or Scala) </h2>\n",
    "<img src = https://ibm.box.com/shared/static/xv5qpwm5fkkxbibgyz8rdfq3vietfrtz.png width = 640>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br><br><hr>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h1 align = center id=\"toc\">Table of Contents</h1>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##1. [CFL Data](#CFL-Data)      \n",
    "##2. [TiCats: All Access Data](#ticats)  \n",
    "##3. [Weather Data](#weather)           \n",
    "##4. [Open Data](#opendata) \n",
    "##5. [CFL Videos](#CFL-Video) \n",
    "##6. [Resources](#resources)           "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br><br>\n",
    "<hr>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id = \"CFL-Data\"></a>\n",
    "<h1 align = center>CFL Data</h1>\n",
    "<br>\n",
    "<a href = http://http://cfl.ca/><img width=300 src=\"http://cfl.uploads.mrx.ca/league/images/en/newser/2009/01/72373.jpg\"></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The [Canadian Football League](http://cfl.ca) (CFL) is a professional football league in Canada. As of 2015, the CFL is comprised of nine teams, divided into the East and West divisions:\n",
    "\n",
    "|Team | Division | Twitter\n",
    "|-|-|\n",
    "|[BC Lions](http://www.bclions.com/) | West | [@BCLions](https://twitter.com/BCLions)\n",
    "|[Calgary Stampeders](http://www.stampeders.com/) | West | [@calstampeders](https://twitter.com/calstampeders)\n",
    "|[Edmonton Eskimos](http://www.esks.com/) | West | [@EdmontonEsks](https://twitter.com/edmontonesks)\n",
    "|[Saskatchewan Roughriders](http://www.riderville.com/) | West | [@sskroughriders](https://twitter.com/sskroughriders)\n",
    "|[Winnipeg Blue Bombers](http://www.bluebombers.com) | West | [@Wpg_BlueBombers](https://twitter.com/Wpg_BlueBombers)\n",
    "|[Toronto Argonauts](http://www.argonauts.ca/) | East | [@TorontoArgos](https://twitter.com/TorontoArgos)\n",
    "|[Hamilton Tiger-Cats](http://ticats.ca/) | East | [@Ticats](https://twitter.com/Ticats)\n",
    "|[Montreal Alouettes](http://en.montrealalouettes.com/) | East | [@MTLAlouettes](https://twitter.com/MTLAlouettes)\n",
    "|[Ottawa Redblacks](http://www.ottawaredblacks.com/) | East | [@REDBLACKS](https://twitter.com/REDBLACKS)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a href = https://twitter.com/intent/tweet?text=Thank+you+%40CFL+for+providing+%23CFL+data+for+%23SportsHack15+in+%23Toronto+%23Vancouver+%23Halifax.+sportshackweekend.org><img src = \"https://ibm.box.com/shared/static/xgyi42mbqbwgbmwlkjg4tys61fhwt9jd.png\", width = 700  style=\"padding:1px;border:thin solid black;\">\n",
    "\n",
    "<Thank%20you%20%40CFL%20for%20providing%20%23CFL%20data%20for%20the%20Sports%20Hackathon%20in%20%23Toronto%20%23Vancouver%20%23Halifax.%20sportshackweekend.org>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h2, align=center> There are 5 CFL datasets available for SportHack  </h2>\n",
    "\n",
    "|**#**| **Dataset** | **Filename**\n",
    "|-|-|-|\n",
    "|1| [Game_data](#gamedata) <font color=\"red\"> *NEW!*</font> | cfl_games.csv\n",
    "|2| [Team stats data](#teamstats)  | cfl_team_stats.csv\n",
    "|3| [Roster data](#rosterdata)  | cfl_roster.csv <br> cfl_roster_stats.csv\n",
    "|4| [Play-by-Play data](#playbyplay)  | cfl_play_by_play.csv\n",
    "|5| [Draft data](#draftdata)<font color=\"red\"> *NEW!*</font>  | cfl_draft.csv"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### <a id=\"gamedata\"></a>\n",
    "## 1) Game data\n",
    "\n",
    "- There are '5798' gamess in the cfl_games.csv\n",
    "- The cfl_game.csv dataset is about historical games from 1990 to 2015\n",
    "- This dataset has aggregated fields about the team names (home/away), game date, week, score, and attendence of each game.\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Download the data into Data Scientist Workbench"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Downloaded [/resources/cfl_games.csv]. Check under 'Recent Data' in the sidebar.\n"
     ]
    }
   ],
   "source": [
    "!wget --quiet --output-document cfl_games.csv https://ibm.box.com/shared/static/m83uz7a0i4anpzlk29j7v5pko2ttx2fh.csv\n",
    "print \"Downloaded [/resources/cfl_games.csv]. Check under 'Recent Data' in the sidebar.\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Import the data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "import numpy as np\n",
    "\n",
    "games = pd.read_csv('/resources/cfl_games.csv')\n",
    "games.head(5)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "games.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "<a id=\"teamstats\"></a>\n",
    "## 2) Team stats\n",
    "\n",
    "- There are 9 teams in the cfl_team_stats.csv\n",
    "- The cfl_team_stats.csv dataset includes 398 records for 199 games. That is, two rocords per game, which is for two teams participating in the game\n",
    "- The cfl_team_stats.csv dataset is about games from 13-06-12 to 15-07-26\n",
    "- This dataset has aggregated fields about the stats of each game, including 95 columns.\n",
    "- More info about the rules and terms in [CFL Rule Book (pdf)](http://www.cfl.ca/page/game_rule_rule1)  \n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Download the data into Data Scientist Workbench"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Downloaded [/resources/cfl_team_stats.csv]. Check under 'Recent Data' in the sidebar.\n"
     ]
    }
   ],
   "source": [
    "!wget --quiet --output-document cfl_team_stats.csv https://ibm.box.com/shared/static/kzrxqeg75t9nwjsjoormmwg2dbjzp948.csv\n",
    "print \"Downloaded [/resources/cfl_team_stats.csv]. Check under 'Recent Data' in the sidebar.\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Import the data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "teamstats = pd.read_csv('/resources/cfl_team_stats.csv')\n",
    "teamstats.head(5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Shape of dataframe"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "teamstats.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "####Columns\n",
    "Print the column headers:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "for i in teamstats.columns: print str(i)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### What does the data tell us about a particular game?\n",
    "- There are two records for each game which shows stats about each team playing in that game.\n",
    "- Records have data about team, game, results and stadium"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "teamstats[teamstats.game_id == 10549]"
   ]
  },
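  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A quick sanity check of the two-records-per-game claim, as a minimal sketch: it assumes `teamstats` has been loaded as above and relies only on the `game_id` column. The name `rows_per_game` is introduced here for illustration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# Count how many rows each game_id has, then tabulate those counts.\n",
    "# If every game has exactly two records, the only value listed is 2.\n",
    "rows_per_game = teamstats.groupby('game_id').size()\n",
    "rows_per_game.value_counts()"
   ]
  },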
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "When did **game_id = 10549** occur?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "teamstats[teamstats.game_id == 10549][\"game_date\"]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Some stats about each team and their games"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "groupedgame = teamstats.groupby(['city','name', 'team_id'], as_index=False)\n",
    "groupedgame.first()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Wins and Losses"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "g = teamstats.groupby(['city','name'])\n",
    "result = g.agg({'id' : 'count', \n",
    "                'first_downs_number' : 'sum', \n",
    "                'win': {'win_count':'sum',\n",
    "                        'loss_count' : lambda x: np.sum(x==0)}\n",
    "               })\n",
    "result"
   ]
  },
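  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The nested-dictionary form of `agg` used above depends on the installed pandas version: it was deprecated and later removed from pandas. As a hedged alternative, the sketch below computes the win and loss totals with flat column names. It assumes, as the cell above does, that `win` is 1 for a win and 0 for a loss; `flat` and `flat_result` are names introduced here, and `result` is left untouched."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# Work on a copy so the original teamstats dataframe is not modified.\n",
    "flat = teamstats[['city', 'name', 'win']].copy()\n",
    "\n",
    "# Mark each record as a loss when win == 0 (assumption: win is coded 0/1).\n",
    "flat['loss'] = (flat['win'] == 0).astype(int)\n",
    "\n",
    "# Sum wins and losses per team.\n",
    "flat_result = flat.groupby(['city', 'name'])[['win', 'loss']].sum()\n",
    "flat_result"
   ]
  },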
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Plotting wins and losses by city using Bokeh"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Install [bokeh](http://bokeh.pydata.org/en/latest/) package for plotting:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "%%bash\n",
    "pip install bokeh"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "from bokeh.io import output_notebook\n",
    "output_notebook()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "from collections import OrderedDict\n",
    "\n",
    "import pandas as pd\n",
    "\n",
    "from bokeh._legacy_charts import Bar, show\n",
    "\n",
    "\n",
    "# get the countries and we group the data by medal type\n",
    "labels = list(result.index.get_level_values('city').values)\n",
    "loss = result['win']['loss_count'].astype(float).values\n",
    "win = result['win']['win_count'].astype(float).values\n",
    "\n",
    "# group the data into a dictionary\n",
    "winslosses = OrderedDict(win=win, loss=loss)\n",
    "\n",
    "bar = Bar(winslosses, labels, title=\"Wins and losses by CFL team in 2013\", stacked=True)\n",
    "show(bar)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id = rosterdata></a>\n",
    "## 3) Roster data\n",
    "\n",
    "There are two sets of data about players in each team.  \n",
    "\n",
    "1. **cfl_roster.csv** contains demographic information about each player.  \n",
    "E.g., \n",
    "  - first name\n",
    "  - last name\n",
    "  - team id\n",
    "  - number\n",
    "  - position\n",
    "  - height\n",
    "  - weight\n",
    "  - birthdate\n",
    "  - birthplace\n",
    "  - college\n",
    "  - years in the team\n",
    "  - years in the league\n",
    "\n",
    "2. **cfl_roster_stats.csv** contains historical gameplay statistics about each player.\n",
    "  B) Some stats about history of each player\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Download files directly to Data Scientist Workbench"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Downloaded [/resources/cfl_roster.csv]. Check under 'Recent Data' in the sidebar.\n",
      "Downloaded [/resources/cfl_roster_stats.csv]. Check under 'Recent Data' in the sidebar.\n"
     ]
    }
   ],
   "source": [
    "!wget --quiet --output-document cfl_roster.csv https://ibm.box.com/shared/static/61rubyhnam20rq3dttnhq3cd5x5z1fbm.csv\n",
    "!wget --quiet --output-document cfl_roster_stats.csv https://ibm.box.com/shared/static/cf1buai5bvnjsx5s4bpq9c2iap0qswow.csv\n",
    "print \"Downloaded [/resources/cfl_roster.csv]. Check under 'Recent Data' in the sidebar.\"\n",
    "print \"Downloaded [/resources/cfl_roster_stats.csv]. Check under 'Recent Data' in the sidebar.\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Import the data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "roster = pd.read_csv('/resources/cfl_roster.csv')\n",
    "roster_stats = pd.read_csv('/resources/cfl_roster_stats.csv')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### First few rows of cfl_roster.csv (player demographics)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "roster.head(3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "roster.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### First few rows of cfl_roster_stats.csv (player stats)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "roster_stats.head(3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "roster_stats.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Let's look at a specific player"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "roster[roster[\"id\"] == 31]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Which team does Anthony Calvillo play for, in this dataset?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "teamstats[teamstats[\"team_id\"] == 9]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Let's look at Anthony Calvillo's stats "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "roster_stats[roster_stats[\"roster_id\"] == 31]"
   ]
  },
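  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The two roster files can also be joined so that each stats row carries the player's demographic fields. This is a minimal sketch, assuming `roster['id']` and `roster_stats['roster_id']` identify the same player, as the lookups above suggest; `player_history` is a name introduced here."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# Inner join: keep only players that appear in both files.\n",
    "# Columns that exist in both files keep their roster name; the stats copy gets a '_stats' suffix.\n",
    "player_history = roster.merge(roster_stats, left_on='id', right_on='roster_id',\n",
    "                              suffixes=('', '_stats'))\n",
    "\n",
    "# Same player as above (roster id 31), now with demographics and stats side by side.\n",
    "player_history[player_history['id'] == 31]"
   ]
  },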
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"playbyplay\"></a>\n",
    "## 4) Play-by-Play\n",
    "\n",
    "\n",
    "- cfl_play_by_play.csv contains  play by play record of each game."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Download the data directly into Data Scientist Workbench"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Downloaded [/resources/cfl_play_by_play.csv]. Check under 'Recent Data' in the sidebar.\n"
     ]
    }
   ],
   "source": [
    "!wget --quiet --output-document cfl_play_by_play.csv  https://ibm.box.com/shared/static/nx8ciulv66lsyj4xnjregmh3elfpqkx3.csv\n",
    "print \"Downloaded [/resources/cfl_play_by_play.csv]. Check under 'Recent Data' in the sidebar.\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Import the data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "playData = pd.read_csv('/resources/cfl_play_by_play.csv')\n",
    "print playData.head(5)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "playData.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Understanding the play-by-play data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Reference to actual CFL page:\n",
    "\n",
    ">**Wed, Jun 12, 2013.**  \n",
    ">**Toronto Argonauts vs. Winnipeg Blue Bombers.**  \n",
    ">**Final score: 24-6**  \n",
    ">http://cfl.ca/statistics/statsGame/id/10545"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "game10545 = playData[playData.game_id == 10545 ]\n",
    "for row in range(len(game10545)): \n",
    "    print \"Quarter\", game10545.iloc[row][\"quarter\"], game10545.iloc[row][\"time\"], game10545.iloc[row][\"details\"]"
   ]
  },
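  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To summarise the same game rather than list every play, the plays can be counted per quarter. This is a minimal sketch that reuses `game10545` and the `quarter` column from the cell above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# Number of play-by-play rows recorded in each quarter of game 10545.\n",
    "game10545.groupby('quarter').size()"
   ]
  },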
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br><br>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"draftdata\"></a>\n",
    "## 5) Draft Data <font color=\"red\">  *NEW!*</font>\n",
    "\n",
    "A draft is a process used to allocate certain players to sports teams. In a draft, teams take turns selecting from a pool of eligible players. When a team selects a player, the team receives exclusive rights to sign that player to a contract, and no other team in the league may sign the player (wikipedia).\n",
    "\n",
    "More details about draft in CFL: <a href=http://www.cfl.ca/draft>CFL Draft</a>\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Download the data into Data Scientist Workbench"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "!wget --quiet --output-document cfl_draft.csv  https://ibm.box.com/shared/static/6re9o22surhqy6bkyu8266t5nqf4x4qv.csv\n",
    "print \"Downloaded [/resources/cfl_draft.csv]. Check under 'Recent Data' in the sidebar.\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Import the data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "draftdata = pd.read_csv('/resources/cfl_draft.csv')\n",
    "draftdata.head(5)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "draftdata.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## How can I analyze the #CFL data?\n",
    "Can I answer any of the following questions?\n",
    "- What do the **top teams** do that is different from the worst teams?\n",
    "- Can we *predict* the contribution of each player to their team's chances of winning a game?\n",
    "- Based on play-by-play information, can we create a **recommender system** that can help coaches decide what to do next? Or help a novice fan understand what typical plays would be expected next at a given moment during a game? \n",
    "- Is it possible to create a model that **optimizes** player choices or strategies in a particular scenario?\n",
    "- Can we create interesting **visualizations** based on play-by-play data?\n",
    "- Which **colleges and cities** are the top players coming from, for each position?\n",
    "- How can we best **rank** each player? Can we create an interesting visualization with regards to rankings?\n",
    "- What can be told about the **ages, heights, weights** of the players in each position? Based on his age, weight, and other characteristics, which position would you assign a person to?\n",
    "- Can you find **another dataset** on player injuries and group them by player position?\n",
    "- How do CFL athletes compare to **NFL** athletes?\n",
    "- How do CFL athletes compare to athletes in **other sports**?\n",
    "- Pairing the #CFL datasets with **other datasets** - what interesting insights can you come up with? Health, education, salary, life expectancy, Twitter followers, crime, news articles?\n",
    "- **Explore** the datasets more. Can you come up with other interesting insights?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br><br><hr>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"ticats\"></a>\n",
    "<h1 align = center>TiCats: All Access Data</h1>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The [**Hamilton Tiger Cats**](https://ticats.ca) (#ticats) is a CFL team and [**All Access**](https://allaccess.ticats.ca/landing) is a website for their fans. Members of All Access can gain points (called \"yards\") while performing certain actions on the All Access site:\n",
    "- clicking links about past/upcoming games\n",
    "- clicking on special discounts and offers\n",
    "- watching videos\n",
    "- participating in a fantasy quiz (ten questions about what they think will happen in the next game)\n",
    "- visiting the [**All Access**](https://allaccess.ticats.ca/landing) page while on wifi at the Tim Hortons Stadium in Hamilton, ON."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a href = https://twitter.com/intent/tweet?text=Thank+you+%40Ticats+for+providing+%23Ticats+data+for+%23SportsHack15+in+%23Toronto+%23Vancouver+%23Halifax.+sportshackweekend.org><img src = https://ibm.box.com/shared/static/5hc0my8c0mq38n6ky6i477npwinma1st.png, width = 700, style=\"padding:1px;border:thin solid black;\">\n",
    "<Thank%20you%20%40Ticats%20for%20providing%20%23Ticats%20data%20for%20the%20Sports%20Hackathon%20held%20in%20%23Toronto%20%23Vancouver%20%23Halifax.%20sportshackweekend.org>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "### Getting the data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Ticats Data: Terms of Agreement\n",
    "> `ticats_users.csv`    \n",
    "> `ticats_useractions.csv`  \n",
    ">- You have indicated you wish to access the two datasets listed above, regarding Hamilton Tiger Cats  (\"TiCats Data\").   \n",
    ">- You will use TiCats Data only for your participation in the SportsHack.  \n",
    ">- You will not disclose or send TiCats Data to any third party.  \n",
    ">- When the SportsHack ends, you will immediately **destroy** all whole and partial copies of TiCats Data in your possession or custody or under your control.    \n",
    "\n",
    "**If you accept the above terms, check the \"I Accept\" box below, and click the button to download the files directly into Data Scientist Workbench. If you do not agree to the terms, do not download the files.**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "#Run this cell\n",
    "display(checkbox)\n",
    "button"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Import data using pandas:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "users_ticats = pd.read_csv(\"/resources/ticats_users.csv\") #users information\n",
    "actions_ticats = pd.read_csv(\"/resources/ticats_useractions.csv\") #user behaviours online"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## ticats_users.csv\n",
    "This file contains data on each user (e.g., demographics)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "| Column | Description| Value |\n",
    "|-|:-:|:-:|\n",
    "| id | Unique User ID | 6455.0 \n",
    "| created_at | Date account was created | e.g., 5/21/2015 15:30:32\n",
    "| hh_size | Size of household as a range | e.g., 3-4\n",
    "| gender | Gender | Male/Female\n",
    "| birth_year | Year of user's birth | e.g., 1960.0\n",
    "| city | City | e.g., Guelph\n",
    "| province | User's Province | e.g., Ontario\n",
    "| fsa | First three digits of postal code | e.g., L5N\n",
    "| newsletter | Willing to receive Monthly Newsletters | True/False\n",
    "| partners_emails | Willing to receive Partner Emails? | True/False\n",
    "| account_type | Type of Season Seat Holder | 'Season Seat Account Holder', 'Casual Ticket Purchaser', 'No Ticketing Accounts', OR nan |\n",
    "| account_add_year | If account_type is not nan, when account was created | e.g., 2005 |\n",
    "| attended_thf | Has Been To [Tim Hortons Field](http://ticats.ca/tim-hortons-field/) | True/False\n",
    "| consider_ticket_package | Has considered a package of season tickets. If yes, how many. | e.g., Yes - 11 Game\n",
    "| events_attended | Number of events user attends in a year | e.g., 3-4\n",
    "| attend_companion | Who they are likely to bring to an event, if any | e.g., Children\n",
    "| entertainment | Entertainment events user attends | (A list.) e.g., [\"Movies\", \"Concerts / Music Festivals\"]\n",
    "| signup_ip | IP used when signing up, if available | e.g., 184.149.14.20\n",
    "| signup_agent | Details on User's Connection | e.g., Mozilla/5.0 ...\n",
    "| signup_browser | Web Browser used at sign-up | e.g., Chrome\n",
    "| signup_platfrom | User's Operating System | e.g., Windows\n",
    "| signup_method | Method used for account sign-up | e.g., facebook\n",
    "| signin_count\n",
    "| current_signin_date | Date and Time of current sign-in | e.g., 15-11-01 12:55\n",
    "| current_signin_ip | IP address of current sign-in | e.g., 184.151.63.143\n",
    "| current_signin_lat | Latitude of current sign-in, if available | (latitude)\n",
    "| current_signin_long | Longitude of current sign-in, if available | (longitude)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "users_ticats.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## ticats_useractions.csv"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This file contains data on the **actions** the users have performed on the All Access site."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Column | Description | Value\n",
    "-|-|-\n",
    "action_num | Unique ID of User's Action| e.g., 162539\n",
    "id | Unique User ID | e.g., 6468\n",
    "points | Points accumulated for action | e.g., 10\n",
    "created_at | Date and Time of action | e.g., 15-10-01 13:00\n",
    "action | User action | e.g., watched_video\n",
    "description\t| Description of User action | e.g., Watched video: sept30_mp4_360p.mp4\n",
    "content\t| Content ID number | e.g., 26651\n",
    "url\t| URL accessed | (url) or NaN\n",
    "gameday\t| Game number, if action falls on a game day | e.g., 9\n",
    "ip | IP address when action was performed | e.g., 184.151.36.228\n",
    "user_agent\t| Details on User's Connection | e.g., Mozilla/5.0 ...\n",
    "user_agent_browser | Web Browser used | e.g., chrome\n",
    "user_agent_version | User Agent Version number | e.g., 45.0.2454.101\n",
    "user_agent_platform | User's Operating System | e.g., Windows"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "actions_ticats.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## How can I analyze the #ticats data?\n",
    "Can I answer any of the following questions?\n",
    "- Can we identify **clusters** of users based on their actions and demographics? What kind of person is most likely to click on offers/links and gain points? Can we visualize this in an interesting way (e.g., a [radar chart](https://en.wikipedia.org/wiki/Radar_chart))?\n",
    "- Can we **predict the total points** accumulated by a user based on their demographics, actions, or other variables?\n",
    "- Can we **predict the next action** a user will take, based on their previous actions (e.g., with a Markov model or Bayesian probabilities)?\n",
    "- Can we **visualize** the behaviours of All Access members leading up to, and during, a CFL game?\n",
    "- Can we create **personalized offers** based on the pattern of clicking behaviour (i.e., a recommender system)? (Offers are currently not personalized.)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br><br><hr>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id = \"weather\"></a>\n",
    "<h1 align = center>Weather Data </h1>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a href = https://twitter.com/intent/tweet?text=Thank+you+%40weathercompany+%40wundeground+for+providing+weather+data+for+%23SportsHack15.+sportshackweekend.org>\n",
    "<img src = https://ibm.box.com/shared/static/s1vaxpcfazoesaif943vsonu8q9bemdb.pn, width = 700, style=\"padding:1px;border:thin solid black;\"></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### The Weather Company & Weather Underground"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[**The Weather Company**](http://www.theweathercompany.com/) and [**Weather Underground**](http://www.wunderground.com/) have kindly offered access to their weather data API. Through this API, you will have access to historical, current, and forecast conditions around the world.\n",
    "\n",
    "You have access to multiple weather observations, including:\n",
    "- temperature\n",
    "- humidity\n",
    "- precipitation\n",
    "- weather alerts\n",
    "- air pressure\n",
    "- wind speed, wind direction\n",
    "- air quality\n",
    "- satellite\n",
    "- for more info, see the full documentation below:\n",
    "<h3 align=center> **[API keys and full documentation](http://bit.ly/sportsweatherAPI):**</h3>\n",
    "<h4 align=center>Click below</h4>\n",
    "<a href = http://bit.ly/sportsweatherAPI><img src=https://ibm.box.com/shared/static/38ygmffepvrvmukfdov72xc0bzkygmbk.png, width = 600, style=\"padding:1px;border:thin solid black;\"></a>\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Quickstart: Retrieve a dataframe of historical weather"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import requests\n",
    "import time\n",
    "import pandas as pd\n",
    "def getWeatherDataframe(latlon, daterange, apikey):\n",
    "    \"\"\"\n",
    "    latlon: a tuple/list for latitude and longitude. Example: (lat, lon)\n",
    "    daterange: a tuple/list for the start to end date. Example: [20150101, 20150105]\n",
    "        Note that the date range must span at most 30 days.\n",
    "    apikey: the SUN API key in https://sites.google.com/a/weather.com/sports-hackathon/\n",
    "    Returns a pandas dataframe of observations, or None if the request fails.\n",
    "    \"\"\"\n",
    "    #To check input:\n",
    "    print \"latlon: \" + str(latlon)\n",
    "    print \"daterange: \" + str(daterange)\n",
    "    \n",
    "    # Create the URL for GET request (uses the latlon argument for both coordinates)\n",
    "    url = \"http://api.weather.com/v1/geocode/\"+str(latlon[0])+\"/\"+str(latlon[1])+\"/observations/historical.json?apiKey=\"+apikey+\"&units=m&startDate=\"+str(daterange[0])+\"&endDate=\"+str(daterange[1])\n",
    "    print url\n",
    "    \n",
    "    # API request\n",
    "    r = requests.get(url)\n",
    "    \n",
    "    # Check if the API request returned an error\n",
    "    if r.status_code != 200: \n",
    "        print \"Request failed with HTTP status \" + str(r.status_code)\n",
    "        print r.text\n",
    "        return\n",
    "    \n",
    "    j = r.json() # returns the API result as JSON (dict type)\n",
    "    \n",
    "    # Convert JSON to pandas df\n",
    "    obs = j[\"observations\"][0]\n",
    "    weather = pd.DataFrame([[i[key] for key in obs.keys()] for i in j[\"observations\"]])\n",
    "    weather.columns = obs.keys()\n",
    "    weather = weather.drop(\"expire_time_gmt\", axis=1)\n",
    "    #weather[\"expire_time_gmt\"] = weather[\"expire_time_gmt\"].map(lambda x: time.strftime('%Y-%m-%d %H:%M:%S',time.gmtime(x)))\n",
    "    weather[\"Year\"] = weather[\"valid_time_gmt\"].map(lambda x: time.strftime('%Y',time.gmtime(x)))\n",
    "    weather[\"Month\"] = weather[\"valid_time_gmt\"].map(lambda x: time.strftime('%m',time.gmtime(x)))\n",
    "    weather[\"Day\"] = weather[\"valid_time_gmt\"].map(lambda x: time.strftime('%d',time.gmtime(x)))\n",
    "    weather[\"Time\"] = weather[\"valid_time_gmt\"].map(lambda x: time.strftime('%H:%M:%S',time.gmtime(x)))\n",
    "    weather[\"valid_time_gmt\"] = weather[\"valid_time_gmt\"].map(lambda x: time.strftime('%Y-%m-%d %H:%M:%S',time.gmtime(x)))\n",
    "    cols = weather.columns.tolist()\n",
    "    cols = cols[-4:] + cols[:-4]\n",
    "    weather = weather[cols]\n",
    "    return weather"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Set the API key and choose a date range"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Set the API key below"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "apikey = \"myAPIkey\" #SUN API key. See http://bit.ly/sportsweatherAPI\n",
    "daterange = (20151101, 20151110) #Must not be >30 days between the two dates"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Sample latitudes and longitudes"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "tor_latlon = [\"43.653226\",\"-79.3831843\"] #Toronto Latitude, Longitude\n",
    "van_latlon = [\"49.246292\",\"-123.116226\"] #Vancouver Latitude, Longitude\n",
    "hal_latlon = [\"44.648881\",\"-63.575312\"] #Halifax Latitude, Longitude"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Retrieve the data from the API as a pandas dataframe"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "weatherDF = getWeatherDataframe(hal_latlon,  #Halifax lat/lon\n",
    "                                daterange, \n",
    "                                apikey = apikey)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Print the headers:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "for i in weatherDF.columns:\n",
    "    print str(i)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Print the first few rows:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "weatherDF.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Basic weather variables"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "#Look only at some weather variables\n",
    "weather_basic = weatherDF[[\"Year\",\"Month\",\"Day\",\"Time\",\n",
    "           \"temp\", #temperature\n",
    "           \"rh\", #relative humidity\n",
    "           \"wspd\", #wind speed\n",
    "           \"gust\", #wind gust speed\n",
    "           \"wc\", #wind chill (\"feels like\" temperature)\n",
    "           \"wdir\", #wind direction\n",
    "           \"pressure\", #air pressure\n",
    "           \"wx_phrase\", #text description of weather conditions\n",
    "          ]]\n",
    "\n",
    "weather_basic.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Groupby dates using `mean`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "#Average weather conditions by day\n",
    "weather_basic.groupby([\"Year\",\"Month\",\"Day\"], as_index=False).agg(\"mean\").head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## How can I analyze weather data?\n",
    "Can I answer any of the following questions?\n",
    "- How does weather affect team performance? Do certain teams play better in certain weather conditions?\n",
    "- How does weather affect player performance? Can coaches justify choosing players based on how they play in certain weather conditions?\n",
    "- Which weather variables are correlated with success/failure in which plays (e.g., punting)?\n",
    "- How is car traffic affected on a CFL game day, and how does weather add to traffic?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br><br><hr>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id = \"opendata\"></a>\n",
    "<h1 align = center>Open Data: \n",
    "<br>City of Toronto/Vancouver/Halifax, and Canada</h1>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a href = https://twitter.com/intent/tweet?text=Thank+you+%40thinkdataworks+%40namaraio+for+providing+%23toronto+%23vancouver+%23halifax+data+for+%23SportsHack15%21+sportshackweekend.org><img src = https://ibm.box.com/shared/static/a1n4lwxi54op06sqyzicvhd1eq250n2m.png, width = 700, style=\"padding:1px;border:thin solid black;\"></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### [Namara.io](https://namara.io) has compiled all relevant Open Data from Canada, Toronto, Vancouver, and Halifax on:\n",
    " - traffic\n",
    " - public transportation\n",
    " - parking\n",
    " - bikeways\n",
    " - crime\n",
    " - and more!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h2 align =center> How to access the collection of relevant data: </h2>\n",
    "### Step 1: [Register](https://namara.io/) for Namara's SportsHack Collection at: [https://namara.io/](https://namara.io/)\n",
    "### Step 2: Click on the icon below to access the [SportsHack Collection](http://bit.ly/namaraSH)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h4 align = center> Click below: </h4>\n",
    "<a href = http://bit.ly/namaraSH><img src = https://ibm.box.com/shared/static/nzwjzwv0fwz87wuq6l2nmj8mln430hfm.png, width = 600, style=\"padding:1px;border:thin solid black;\"></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br><br><hr>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id = \"CFL-Video\"></a>\n",
    "<h1 align = center>CFL Videos <font color = \"red\"> *(New!)* </font></h1>\n",
    "\n",
    "The CFL is also kindly providing some gameplay video footage.\n",
    "\n",
    "#### You can download the videos of some games [**here**](https://ibm.box.com/s/fpo6jjf539dzrlebid5ni45chu1op4dn)\n",
    "\n",
    "These videos cover the following games:\n",
    "\n",
    "|**#**| **Date** | **Game** | **Video**| **CFL** |\n",
    "|-|-|-|-|-|\n",
    "|1| Thu Jun 18 | Toronto vs. Montreal | [GAME 06 TOR@MTL](https://ibm.box.com/s/rb9er5y686b2nfrm6p9yq6iruu57vu4e) | [view Play_by_Play](http://cfl.ca/statistics/statsGame/id/12843) |\n",
    "|2| Sat Oct 10 | Winnipeg vs. BC | [GAME 87 WIN@BC](https://ibm.box.com/s/ik0qxqi94c68566wht4mw5tw22t2a1pa) | [view Play_by_Play](http://cfl.ca/statistics/statsGame/id/12975) |\n",
    "|3| Sat Oct 17 | Calgary vs. Toronto | [GAME 89 CAL@TOR](https://ibm.box.com/s/pytpq0xaj45ozh8nuh8u576kuca54i73) | [view Play_by_Play](http://cfl.ca/statistics/statsGame/id/12981) |\n",
    "|4| Sat Oct 24 | Edmonton vs. Saskatchewan | [GAME 82 EDM@SSK](https://ibm.box.com/s/4dp9421fukg0e2xuat9hh1q0wpe974hf) | [view Play_by_Play](http://cfl.ca/statistics/statsGame/id/12993) |"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h2> More full-game replays on [**CFL's YouTube page**](https://www.youtube.com/playlist?list=PLe1fb5CeiWnOarVCMGZ9TvyjxflYHUBH1)</h2>\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### How can I analyze video data?\n",
    "Can I answer any of the following questions?\n",
    "\n",
    "- How can the CFL improve their live broadcast by displaying data insights in real time?\n",
    "- How can the CFL improve the recorded playback (slow motion or highlights)?\n",
    "- Is there a way to predict which time intervals, clips or highlights will most likely be shared on social media?\n",
    "- How can we create tools to help fans manipulate a live broadcast?\n",
    "- If we break apart the audio and visual components of the broadcast, can we use them as two separate sets of media in new and interesting ways?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br><br><hr>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"resources\"></a>\n",
    "<h1 align=center>Resources:</h1>\n",
    "<br>\n",
    "<h2 align=left>(1) More details about CFL terms? </h2>\n",
    "<h3 align=left>\n",
    "- Watch the recorded video from this event: \n",
    "<br><a href=http://bigdatauniversity.com/events/lighthouse-labs-sportshack-2015-prep-event-lighthouse-labs-toronto/>SportsHack 2015 Prep event Lighthouse Labs (Toronto)</a>\n",
    "</h3>\n",
    "<h3 align=left>\n",
    "- You can download the PDF here: \n",
    "<br>\n",
    "<a href=https://ibm.box.com/shared/static/3qh72t8f3m1xcx9qh6j016442sdh7h91.pdf>All you need to know about football</a></h3>\n",
    "\n",
    "<br>\n",
    "<h2 align=left>(2) More details about the datasets/notebook? </h2>\n",
    "<h3 align=left>- Watch the recorded video from this event: \n",
    "<br><a href=http://bigdatauniversity.com/events/sportshack-2015-halifax-prep-event/>SportsHack 2015 Halifax Prep Event</a></h3>\n",
    "<br>\n",
    "\n",
    "<h2 align=left>(3) Do you want to use text mining on the Social Web (e.g., sports news, blogs, tweets)?</h2>\n",
    "<h3 align=left>- Beginner's Guide: \n",
    "<br><a href=https://rawgit.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition/master/ipynb/html/Chapter%205%20-%20Mining%20Web%20Pages.html>Mining the Social Web</a></h3>\n",
    "<br>\n",
    "\n",
    "<h2 align=left>(4) Do you need some links, commentary, a searchable rulebook, answers to Frequently Asked Questions and more resources? </h2>\n",
    "<h3 align=left>- CFLdb is a collection of Canadian Football League information that aims to be the resource for attributable and indexed information on the CFL : \n",
    "<br><a href=https://cfldb.ca/>The Canadian Football League Database</a></h3>\n",
    "<br>\n",
    "\n",
    "<h2 align=center>Need more help? </h2>\n",
    "<h3 align=center>\n",
    "Ask us on Slack at <a href=https://sportshack15.slack.com/>sportshack15.slack.com</a></h3>\n",
    "<font align=center>_Don't have a Slack account? E-mail us at [admin@sportshackweekend.org](mailto:admin@sportshackweekend.org) and we can create an account for you_</font>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h2 align=center> Send questions to @BigDataU</h2><br>\n",
    "<a href = https://twitter.com/intent/tweet?text=%40BigDataU%3A%20How%20do%20i%20...><img src=\"https://ibm.box.com/shared/static/r6wp1lp3ohnfy1gfusj7sawzr0hgmfo3.png\", width=600, style=\"padding:1px;border:thin solid black;\"></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "<h1 align=center>Need a refresher on data science?</h1>\n",
    "<h3 align=center> Check out Big Data University's [**courses**](https://www.bigdatauniversity.com) and [**recorded events**](https://www.bigdatauniversity.com/events):</h3>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "<a href = https://bigdatauniversity.com/><img src = http://sportshackweekend.org/ca/2015/img/logos/bdu.png width = 200></a>"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}