tmthyjames/SQLCell.ipynb

## SQLCell.ipynb
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "I love Jupyter Notebooks and the many ways you can customize your notebooks. I try to out-source any work I do in other apps to Jupyter simply because I like the interface, it's intuitive, easy to navigate, and a joy to use. One example of getting rid of an app that I can use Jupyter for is pgAdmin. I absolutely hate pgAdmin, a postgresql interface. The main function I use it for, like anyone else I presume, is retrieving data (not necessarily the admin tasks, which it's great for). But the main reason for using Jupyter for the primary pgAdmin tasks is because I don't like opening two apps when one will do. Plus I was bored.\n",
    "\n",
    "To start, I knew I needed to create a custom \"<a href=\"http://ipython.readthedocs.io/en/stable/\">magic function</a>\" in Jupyter. To do this, you need to import IPython's `register_line_cell_magic` decorator from its core library and wrap a function with it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "from IPython.core.magic import (register_line_cell_magic)\n",
    "\n",
    "@register_line_cell_magic\n",
    "def magic_test(line, cell=None):\n",
    "    return line, cell"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As you can see, we now have access to the `line` argument, which is the text directly to the right of the magic function, and the `cell` argument, the text directly under it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(u'hello, line.', u'hello, cell.')"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%magic_test hello, line.\n",
    "hello, cell."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "Let's try a slightly more complicated example, using actual SQL. For this, we need to `import` SQLAlchemy's `create_engine` function to establish a connection to the database. See <a href\"http://docs.sqlalchemy.org/en/latest/core/engines.html\">SQLAlchemy's docs</a> for more information about engine configuration. You will also need <a href=\"http://pandas.pydata.org/\">pandas</a> in order to view the data in a nice table. Later, we will write our own function for viewing an HTML table in case you don't have pandas installed or don't want to install pandas."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "from sqlalchemy import create_engine\n",
    "import pandas as pd\n",
    "\n",
    "# choose an existing databae you have access to\n",
    "engine = create_engine('postgresql://tdobbins:tdobbins@localhost:5432/bls')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "@register_line_cell_magic\n",
    "def sql(path, cell=None):\n",
    "    data = engine.execute(cell)\n",
    "    columns = data.keys()\n",
    "    df = pd.DataFrame([i for i in data])\n",
    "    if df.empty:\n",
    "        return 'No data available'\n",
    "\n",
    "    df.columns = columns\n",
    "        \n",
    "    return df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>series_id</th>\n",
       "      <th>year</th>\n",
       "      <th>period</th>\n",
       "      <th>value</th>\n",
       "      <th>footnote_codes</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>LASST470000000000003</td>\n",
       "      <td>1976</td>\n",
       "      <td>M01</td>\n",
       "      <td>6.2</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>LASST470000000000003</td>\n",
       "      <td>1976</td>\n",
       "      <td>M02</td>\n",
       "      <td>6.1</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>LASST470000000000003</td>\n",
       "      <td>1976</td>\n",
       "      <td>M03</td>\n",
       "      <td>6.0</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>LASST470000000000003</td>\n",
       "      <td>1976</td>\n",
       "      <td>M04</td>\n",
       "      <td>5.9</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>LASST470000000000003</td>\n",
       "      <td>1976</td>\n",
       "      <td>M05</td>\n",
       "      <td>5.8</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "              series_id  year period  value footnote_codes\n",
       "0  LASST470000000000003  1976    M01    6.2           None\n",
       "1  LASST470000000000003  1976    M02    6.1           None\n",
       "2  LASST470000000000003  1976    M03    6.0           None\n",
       "3  LASST470000000000003  1976    M04    5.9           None\n",
       "4  LASST470000000000003  1976    M05    5.8           None"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%sql\n",
    "select *\n",
    "from la_unemployment \n",
    "limit 5"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "BOOM! But this is FAR from perfect. Let's continue adding functionality to it so it becomes easier to use and more flexible. After all, we'd like to contribute this to open source and allow other people to use it! So what's missing? We need to allow our magic function to take arguments. But how? Using python's builtin `exec` function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "@register_line_cell_magic\n",
    "def sql(args, cell=None):\n",
    "    \n",
    "    exec(args) # executes a string as python code\n",
    "    print PATH # notice that this gets printed and was the argument we passed\n",
    "        \n",
    "    return cell"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "/Users/tdobbins/Desktop/data.csv\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "u'select *\\nfrom la_unemployment\\nlimit 10'"
      ]
     },
     "execution_count": 49,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%sql PATH='/Users/tdobbins/Desktop/data.csv'\n",
    "select *\n",
    "from la_unemployment\n",
    "limit 10"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, let's build this into our example and put the argument practical use by writing the resulting query results to a CSV."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "@register_line_cell_magic\n",
    "def sql(args, cell=None):\n",
    "    \n",
    "    exec(args)\n",
    "    \n",
    "    data = engine.execute(cell)\n",
    "    columns = data.keys()\n",
    "    df = pd.DataFrame([i for i in data])\n",
    "    if df.empty:\n",
    "        return 'No data available'\n",
    "    \n",
    "    df.columns = columns\n",
    "    \n",
    "    df.to_csv(PATH)\n",
    "        \n",
    "    return df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>series_id</th>\n",
       "      <th>year</th>\n",
       "      <th>period</th>\n",
       "      <th>value</th>\n",
       "      <th>footnote_codes</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>LASST470000000000003</td>\n",
       "      <td>1976</td>\n",
       "      <td>M01</td>\n",
       "      <td>6.2</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>LASST470000000000003</td>\n",
       "      <td>1976</td>\n",
       "      <td>M02</td>\n",
       "      <td>6.1</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>LASST470000000000003</td>\n",
       "      <td>1976</td>\n",
       "      <td>M03</td>\n",
       "      <td>6.0</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>LASST470000000000003</td>\n",
       "      <td>1976</td>\n",
       "      <td>M04</td>\n",
       "      <td>5.9</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>LASST470000000000003</td>\n",
       "      <td>1976</td>\n",
       "      <td>M05</td>\n",
       "      <td>5.8</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "              series_id  year period  value footnote_codes\n",
       "0  LASST470000000000003  1976    M01    6.2           None\n",
       "1  LASST470000000000003  1976    M02    6.1           None\n",
       "2  LASST470000000000003  1976    M03    6.0           None\n",
       "3  LASST470000000000003  1976    M04    5.9           None\n",
       "4  LASST470000000000003  1976    M05    5.8           None"
      ]
     },
     "execution_count": 55,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%sql PATH='/Users/tdobbins/Desktop/data.csv'\n",
    "select * \n",
    "from la_unemployment\n",
    "limit 5"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You should now have data where ever you specified! Cool right? But we still have a long way to go to make this user-friendly. We need more functionality and ways to parse the arguments. And I'd like to not have to import libraries upon starting a new Jupyter notebook so that this is useful out of the box. Also, how about making the DataFrame available after the call? Let's start with making the DataFrame available after the query results are returned."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 69,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "@register_line_cell_magic\n",
    "def sql(path, cell=None):\n",
    "    \n",
    "    args = path.split(' ')\n",
    "    for i in args:\n",
    "        if i.startswith('MAKE_GLOBAL'):\n",
    "            glovar = i.split('=')\n",
    "            exec(glovar[0]+'='+glovar[1]+'=None')\n",
    "\n",
    "    data = engine.execute(cell)\n",
    "    columns = data.keys()\n",
    "    table_data = [i for i in data]\n",
    "    df = pd.DataFrame(table_data)\n",
    "    \n",
    "    if df.empty:\n",
    "        return 'No data available'\n",
    "    \n",
    "    df.columns = columns\n",
    "\n",
    "    if 'PATH' in locals():\n",
    "        df.to_csv(PATH)\n",
    "\n",
    "    if 'MAKE_GLOBAL' in locals():\n",
    "        exec('global ' + glovar[1] + '\\n' + glovar[1] + '=df')\n",
    "        \n",
    "    return df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 70,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>series_id</th>\n",
       "      <th>year</th>\n",
       "      <th>period</th>\n",
       "      <th>value</th>\n",
       "      <th>footnote_codes</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>LASST470000000000003</td>\n",
       "      <td>1976</td>\n",
       "      <td>M01</td>\n",
       "      <td>6.2</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>LASST470000000000003</td>\n",
       "      <td>1976</td>\n",
       "      <td>M02</td>\n",
       "      <td>6.1</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>LASST470000000000003</td>\n",
       "      <td>1976</td>\n",
       "      <td>M03</td>\n",
       "      <td>6.0</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>LASST470000000000003</td>\n",
       "      <td>1976</td>\n",
       "      <td>M04</td>\n",
       "      <td>5.9</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>LASST470000000000003</td>\n",
       "      <td>1976</td>\n",
       "      <td>M05</td>\n",
       "      <td>5.8</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "              series_id  year period  value footnote_codes\n",
       "0  LASST470000000000003  1976    M01    6.2           None\n",
       "1  LASST470000000000003  1976    M02    6.1           None\n",
       "2  LASST470000000000003  1976    M03    6.0           None\n",
       "3  LASST470000000000003  1976    M04    5.9           None\n",
       "4  LASST470000000000003  1976    M05    5.8           None"
      ]
     },
     "execution_count": 70,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%sql MAKE_GLOBAL=WHATEVER_VARIABLE_NAME_YOU_WANT\n",
    "select *\n",
    "from la_unemployment\n",
    "limit 5"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 71,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>series_id</th>\n",
       "      <th>year</th>\n",
       "      <th>period</th>\n",
       "      <th>value</th>\n",
       "      <th>footnote_codes</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>LASST470000000000003</td>\n",
       "      <td>1976</td>\n",
       "      <td>M01</td>\n",
       "      <td>6.2</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>LASST470000000000003</td>\n",
       "      <td>1976</td>\n",
       "      <td>M02</td>\n",
       "      <td>6.1</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>LASST470000000000003</td>\n",
       "      <td>1976</td>\n",
       "      <td>M03</td>\n",
       "      <td>6.0</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>LASST470000000000003</td>\n",
       "      <td>1976</td>\n",
       "      <td>M04</td>\n",
       "      <td>5.9</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>LASST470000000000003</td>\n",
       "      <td>1976</td>\n",
       "      <td>M05</td>\n",
       "      <td>5.8</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "              series_id  year period  value footnote_codes\n",
       "0  LASST470000000000003  1976    M01    6.2           None\n",
       "1  LASST470000000000003  1976    M02    6.1           None\n",
       "2  LASST470000000000003  1976    M03    6.0           None\n",
       "3  LASST470000000000003  1976    M04    5.9           None\n",
       "4  LASST470000000000003  1976    M05    5.8           None"
      ]
     },
     "execution_count": 71,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "WHATEVER_VARIABLE_NAME_YOU_WANT"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "COOOOOOOL! YOU CAN PASS IT ANY VARIABLE NAME (sorry, forgot about my caps), and it will pass the data to that variable!\n",
    "\n",
    "Another cool feature would be to allow the user to switch databases without reconfiguring the engine. Let's try it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 91,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "@register_line_cell_magic\n",
    "def sql(path, cell=None):\n",
    "    \n",
    "    args = path.split(' ')\n",
    "    for i in args:\n",
    "        if i.startswith('MAKE_GLOBAL'):\n",
    "            glovar = i.split('=')\n",
    "            exec(glovar[0]+'='+glovar[1]+'=None')\n",
    "        elif i.startswith('DB'):\n",
    "            from sqlalchemy import create_engine\n",
    "            db = i.replace('DB=', '')\n",
    "            exec(\"global engine\\nengine=engine = create_engine('postgresql://tdobbins:tdobbins@localhost:5432/\"+db+\"')\")\n",
    "            exec('global DB\\nDB=db')\n",
    "        else:\n",
    "            exec(i)\n",
    "\n",
    "    data = engine.execute(cell)\n",
    "    columns = data.keys()\n",
    "    table_data = [i for i in data]\n",
    "    df = pd.DataFrame(table_data)\n",
    "    \n",
    "    if df.empty:\n",
    "        return 'No data available'\n",
    "    \n",
    "    df.columns = columns\n",
    "\n",
    "    if 'PATH' in locals():\n",
    "        df.to_csv(PATH)\n",
    "\n",
    "    if 'MAKE_GLOBAL' in locals():\n",
    "        exec('global ' + glovar[1] + '\\n' + glovar[1] + '=df')\n",
    "        \n",
    "    return df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 97,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>id</th>\n",
       "      <th>dateof</th>\n",
       "      <th>team</th>\n",
       "      <th>opp</th>\n",
       "      <th>points</th>\n",
       "      <th>fg</th>\n",
       "      <th>fgatt</th>\n",
       "      <th>ft</th>\n",
       "      <th>ftatt</th>\n",
       "      <th>fg3</th>\n",
       "      <th>fg3att</th>\n",
       "      <th>offrebounds</th>\n",
       "      <th>defrebounds</th>\n",
       "      <th>totalrebounds</th>\n",
       "      <th>assists</th>\n",
       "      <th>blocks</th>\n",
       "      <th>fouls</th>\n",
       "      <th>steals</th>\n",
       "      <th>turnovers</th>\n",
       "      <th>side</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1369</td>\n",
       "      <td>2015-10-27</td>\n",
       "      <td>DET</td>\n",
       "      <td>ATL</td>\n",
       "      <td>106</td>\n",
       "      <td>37</td>\n",
       "      <td>96</td>\n",
       "      <td>20</td>\n",
       "      <td>26</td>\n",
       "      <td>12</td>\n",
       "      <td>29</td>\n",
       "      <td>23</td>\n",
       "      <td>36</td>\n",
       "      <td>59</td>\n",
       "      <td>23</td>\n",
       "      <td>3</td>\n",
       "      <td>15</td>\n",
       "      <td>5</td>\n",
       "      <td>15</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>1370</td>\n",
       "      <td>2015-10-27</td>\n",
       "      <td>ATL</td>\n",
       "      <td>DET</td>\n",
       "      <td>94</td>\n",
       "      <td>37</td>\n",
       "      <td>82</td>\n",
       "      <td>12</td>\n",
       "      <td>15</td>\n",
       "      <td>8</td>\n",
       "      <td>27</td>\n",
       "      <td>7</td>\n",
       "      <td>33</td>\n",
       "      <td>40</td>\n",
       "      <td>22</td>\n",
       "      <td>4</td>\n",
       "      <td>25</td>\n",
       "      <td>9</td>\n",
       "      <td>15</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>1371</td>\n",
       "      <td>2015-10-27</td>\n",
       "      <td>CLE</td>\n",
       "      <td>CHI</td>\n",
       "      <td>95</td>\n",
       "      <td>38</td>\n",
       "      <td>94</td>\n",
       "      <td>10</td>\n",
       "      <td>17</td>\n",
       "      <td>9</td>\n",
       "      <td>29</td>\n",
       "      <td>11</td>\n",
       "      <td>39</td>\n",
       "      <td>50</td>\n",
       "      <td>26</td>\n",
       "      <td>7</td>\n",
       "      <td>21</td>\n",
       "      <td>5</td>\n",
       "      <td>11</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>1372</td>\n",
       "      <td>2015-10-27</td>\n",
       "      <td>CHI</td>\n",
       "      <td>CLE</td>\n",
       "      <td>97</td>\n",
       "      <td>37</td>\n",
       "      <td>87</td>\n",
       "      <td>16</td>\n",
       "      <td>23</td>\n",
       "      <td>7</td>\n",
       "      <td>19</td>\n",
       "      <td>7</td>\n",
       "      <td>40</td>\n",
       "      <td>47</td>\n",
       "      <td>13</td>\n",
       "      <td>10</td>\n",
       "      <td>22</td>\n",
       "      <td>6</td>\n",
       "      <td>13</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>1373</td>\n",
       "      <td>2015-10-27</td>\n",
       "      <td>NO</td>\n",
       "      <td>GS</td>\n",
       "      <td>95</td>\n",
       "      <td>35</td>\n",
       "      <td>83</td>\n",
       "      <td>19</td>\n",
       "      <td>27</td>\n",
       "      <td>6</td>\n",
       "      <td>18</td>\n",
       "      <td>8</td>\n",
       "      <td>25</td>\n",
       "      <td>33</td>\n",
       "      <td>21</td>\n",
       "      <td>3</td>\n",
       "      <td>26</td>\n",
       "      <td>9</td>\n",
       "      <td>19</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     id      dateof team  opp  points  fg  fgatt  ft  ftatt  fg3  fg3att  \\\n",
       "0  1369  2015-10-27  DET  ATL     106  37     96  20     26   12      29   \n",
       "1  1370  2015-10-27  ATL  DET      94  37     82  12     15    8      27   \n",
       "2  1371  2015-10-27  CLE  CHI      95  38     94  10     17    9      29   \n",
       "3  1372  2015-10-27  CHI  CLE      97  37     87  16     23    7      19   \n",
       "4  1373  2015-10-27   NO   GS      95  35     83  19     27    6      18   \n",
       "\n",
       "   offrebounds  defrebounds  totalrebounds  assists  blocks  fouls  steals  \\\n",
       "0           23           36             59       23       3     15       5   \n",
       "1            7           33             40       22       4     25       9   \n",
       "2           11           39             50       26       7     21       5   \n",
       "3            7           40             47       13      10     22       6   \n",
       "4            8           25             33       21       3     26       9   \n",
       "\n",
       "   turnovers  side  \n",
       "0         15  None  \n",
       "1         15  None  \n",
       "2         11  None  \n",
       "3         13  None  \n",
       "4         19  None  "
      ]
     },
     "execution_count": 97,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%sql DB=sports\n",
    "select *\n",
    "from nbateamstats\n",
    "limit 5"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 98,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>series_id</th>\n",
       "      <th>year</th>\n",
       "      <th>period</th>\n",
       "      <th>value</th>\n",
       "      <th>footnote_codes</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>LASST470000000000003</td>\n",
       "      <td>1976</td>\n",
       "      <td>M01</td>\n",
       "      <td>6.2</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>LASST470000000000003</td>\n",
       "      <td>1976</td>\n",
       "      <td>M02</td>\n",
       "      <td>6.1</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>LASST470000000000003</td>\n",
       "      <td>1976</td>\n",
       "      <td>M03</td>\n",
       "      <td>6.0</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>LASST470000000000003</td>\n",
       "      <td>1976</td>\n",
       "      <td>M04</td>\n",
       "      <td>5.9</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>LASST470000000000003</td>\n",
       "      <td>1976</td>\n",
       "      <td>M05</td>\n",
       "      <td>5.8</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "              series_id  year period  value footnote_codes\n",
       "0  LASST470000000000003  1976    M01    6.2           None\n",
       "1  LASST470000000000003  1976    M02    6.1           None\n",
       "2  LASST470000000000003  1976    M03    6.0           None\n",
       "3  LASST470000000000003  1976    M04    5.9           None\n",
       "4  LASST470000000000003  1976    M05    5.8           None"
      ]
     },
     "execution_count": 98,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%sql DB=bls PATH='/Users/tdobbins/Desktop/bls.csv'\n",
    "select *\n",
    "from la_unemployment\n",
    "limit 5"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And the last specified database name will maintain the connection. So you don't need to specify which database until you want to change databases."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 99,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>series_id</th>\n",
       "      <th>year</th>\n",
       "      <th>period</th>\n",
       "      <th>value</th>\n",
       "      <th>footnote_codes</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>LASST470000000000003</td>\n",
       "      <td>1976</td>\n",
       "      <td>M01</td>\n",
       "      <td>6.2</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>LASST470000000000003</td>\n",
       "      <td>1976</td>\n",
       "      <td>M02</td>\n",
       "      <td>6.1</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>LASST470000000000003</td>\n",
       "      <td>1976</td>\n",
       "      <td>M03</td>\n",
       "      <td>6.0</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>LASST470000000000003</td>\n",
       "      <td>1976</td>\n",
       "      <td>M04</td>\n",
       "      <td>5.9</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>LASST470000000000003</td>\n",
       "      <td>1976</td>\n",
       "      <td>M05</td>\n",
       "      <td>5.8</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "              series_id  year period  value footnote_codes\n",
       "0  LASST470000000000003  1976    M01    6.2           None\n",
       "1  LASST470000000000003  1976    M02    6.1           None\n",
       "2  LASST470000000000003  1976    M03    6.0           None\n",
       "3  LASST470000000000003  1976    M04    5.9           None\n",
       "4  LASST470000000000003  1976    M05    5.8           None"
      ]
     },
     "execution_count": 99,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%sql \n",
    "select *\n",
    "from la_unemployment\n",
    "limit 5"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And just like that we can switch databases. But I think we can make this even more user friendly. Let's add SQLAlchemy parameters and a way to parse the arguments."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "@register_line_cell_magic\n",
    "def sql(path, cell=None):\n",
    "\n",
    "    args = path.split(' ')\n",
    "    for i in args:\n",
    "        if i.startswith('MAKE_GLOBAL'):\n",
    "            glovar = i.split('=')\n",
    "            exec(glovar[0]+'='+glovar[1]+'=None')\n",
    "        elif i.startswith('DB'):\n",
    "            from sqlalchemy import create_engine\n",
    "            db = i.replace('DB=', '')\n",
    "            exec(\"global engine\\nengine=create_engine('\"+driver+\"://\"+username+\":\"+password+\"@\"+host+\":\"+port+\"/\"+db+\"')\")\n",
    "            exec('global DB\\nDB=db')\n",
    "        else:\n",
    "            exec(i)\n",
    "            \n",
    "    if 'PARAMS' in locals():\n",
    "        for i in PARAMS.keys():\n",
    "            ref = '%('+i+')s'\n",
    "            if ref in cell:\n",
    "                cell = cell.replace(ref, (\"'\"+(str(eval(i)))+\"'\" if eval(i) != 'null' else 'null'))\n",
    "\n",
    "    data = engine.execute(cell)\n",
    "    columns = data.keys()\n",
    "    table_data = [i for i in data] if 'pd' in globals() else [columns] + [i for i in data]\n",
    "    df = pd.DataFrame(table_data)\n",
    "    \n",
    "    if df.empty:\n",
    "        return 'No data available'\n",
    "    \n",
    "    df.columns = columns\n",
    "\n",
    "    if 'PATH' in locals():\n",
    "        df.to_csv(PATH)\n",
    "\n",
    "    if 'MAKE_GLOBAL' in locals():\n",
    "        exec('global ' + glovar[1] + '\\n' + glovar[1] + '=df')\n",
    "        \n",
    "    return df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This part is cool. I use SQLAlchemy named parameters all the time in production, but when I copy and paste my queries into pgAdmin, I have to go through my query and delete the SQLAlchemy placeholders and put static values in there, and when I switch back to my development environment, I have to delete the static values and use the named parameters again. Annoying!!!!!\n",
    "\n",
    "It shouldn't have to be like this."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# define your parameters here\n",
    "year = '1976'\n",
    "period = 'M01'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>series_id</th>\n",
       "      <th>year</th>\n",
       "      <th>period</th>\n",
       "      <th>value</th>\n",
       "      <th>footnote_codes</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>LASST470000000000003</td>\n",
       "      <td>1976</td>\n",
       "      <td>M01</td>\n",
       "      <td>6.2</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>LASST470000000000004</td>\n",
       "      <td>1976</td>\n",
       "      <td>M01</td>\n",
       "      <td>111152.0</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>LASST470000000000005</td>\n",
       "      <td>1976</td>\n",
       "      <td>M01</td>\n",
       "      <td>1691780.0</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>LASST470000000000006</td>\n",
       "      <td>1976</td>\n",
       "      <td>M01</td>\n",
       "      <td>1802930.0</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>LAUST470000000000003</td>\n",
       "      <td>1976</td>\n",
       "      <td>M01</td>\n",
       "      <td>6.9</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "              series_id  year period      value footnote_codes\n",
       "0  LASST470000000000003  1976    M01        6.2           None\n",
       "1  LASST470000000000004  1976    M01   111152.0           None\n",
       "2  LASST470000000000005  1976    M01  1691780.0           None\n",
       "3  LASST470000000000006  1976    M01  1802930.0           None\n",
       "4  LAUST470000000000003  1976    M01        6.9           None"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%sql PARAMS={'year':year,'period':period}\n",
    "select *\n",
    "from la_unemployment\n",
    "where year = %(year)s\n",
    "    and period = %(period)s\n",
    "limit 5"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Very userful! Let's finish this script. We'll add a way to display the HTML table for those who don't have pandas and a way to write to a CSV. This example contains a `class` that converts the data into an HTML table and a `class` method that writes the data to a CSV. If the pandas import fails, then we invoke the class."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "from IPython.core.magic import (register_line_magic, register_cell_magic,\n",
    "                                register_line_cell_magic)\n",
    "\n",
    "class HTMLTable(list):\n",
    "    \"\"\"\n",
    "    Creates an HTML table if pandas isn't installed.\n",
    "    The .empty attribute takes the place of df.empty,\n",
    "    and to_csv takes the place of df.to_csv.\n",
    "    \"\"\"\n",
    "    \n",
    "    empty = []\n",
    "    \n",
    "    def _repr_html_(self):\n",
    "        table = '<table width=100%>'\n",
    "        thead = '<thead><tr>'\n",
    "        tbody = '<tbody><tr>'\n",
    "        for n,row in enumerate(self):\n",
    "            if n == 0:\n",
    "                thead += ''.join([('<th>' + str(r) + '</th>') for r in row])\n",
    "            else:\n",
    "                tbody += '<tr>' + ''.join([('<td>' + str(r) + '</td>') for r in row]) + '</tr>'\n",
    "        thead += '</tr></thead>'\n",
    "        tbody += '</tbody>'\n",
    "        table += thead + tbody\n",
    "        return table\n",
    "\n",
    "    def to_csv(self, path):\n",
    "        import csv\n",
    "        with open(path, 'w') as fp:\n",
    "            a = csv.writer(fp, delimiter=',')\n",
    "            a.writerows(self)\n",
    "    \n",
    "try:\n",
    "    import pandas as pd\n",
    "    pd.options.display.max_columns = None\n",
    "    to_table = pd.DataFrame\n",
    "except ImportError as e:\n",
    "    to_table = HTMLTable\n",
    "    \n",
    "# default connection string info here\n",
    "driver = 'postgresql'\n",
    "username = 'tdobbins'\n",
    "password = 'tdobbins'\n",
    "host = 'localhost'\n",
    "port = '5432'\n",
    "\n",
    "@register_line_cell_magic\n",
    "def sql(path, cell=None):\n",
    "    \"\"\"\n",
    "    Create magic cell function to treat cell text as SQL\n",
    "    to remove the need of third party SQL interfaces. The \n",
    "    args are split on spaces so don't use spaces except to \n",
    "    input a new argument.\n",
    "    Args:\n",
    "        PATH (str): path to write dataframe to in csv.\n",
    "        PARAMS (dict): allows SQLAlchemy named parameters.\n",
    "        MAKE_GLOBAL: make dataframe available globally.\n",
    "        DB: name of database to connect to.\n",
    "    Returns:\n",
    "        DataFrame:\n",
    "    \"\"\"\n",
    "    args = path.split(' ')\n",
    "    for i in args:\n",
    "        if i.startswith('MAKE_GLOBAL'):\n",
    "            glovar = i.split('=')\n",
    "            exec(glovar[0]+'='+glovar[1]+'=None')\n",
    "        elif i.startswith('DB'):\n",
    "            from sqlalchemy import create_engine\n",
    "            db = i.replace('DB=', '')\n",
    "            exec(\"global engine\\nengine=create_engine('\"+driver+\"://\"+username+\":\"+password+\"@\"+host+\":\"+port+\"/\"+db+\"')\")\n",
    "            exec('global DB\\nDB=db')\n",
    "        else:\n",
    "            exec(i)\n",
    "            \n",
    "    if 'PARAMS' in locals():\n",
    "        for i in PARAMS.keys():\n",
    "            ref = '%('+i+')s'\n",
    "            if ref in cell:\n",
    "                cell = cell.replace(ref, (\"'\"+(str(eval(i)))+\"'\" if eval(i) != 'null' else 'null'))\n",
    "\n",
    "    data = engine.execute(cell)\n",
    "    columns = data.keys()\n",
    "    table_data = [i for i in data] if 'pd' in globals() else [columns] + [i for i in data]\n",
    "    df = to_table(table_data)\n",
    "    \n",
    "    if df.empty:\n",
    "        return 'No data available'\n",
    "    \n",
    "    df.columns = columns\n",
    "\n",
    "    if 'PATH' in locals():\n",
    "        df.to_csv(PATH)\n",
    "\n",
    "    if 'MAKE_GLOBAL' in locals():\n",
    "        exec('global ' + glovar[1] + '\\n' + glovar[1] + '=df')\n",
    "        \n",
    "    return df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "With pandas:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>series_id</th>\n",
       "      <th>year</th>\n",
       "      <th>period</th>\n",
       "      <th>value</th>\n",
       "      <th>footnote_codes</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>LASST470000000000003</td>\n",
       "      <td>1976</td>\n",
       "      <td>M01</td>\n",
       "      <td>6.2</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>LASST470000000000004</td>\n",
       "      <td>1976</td>\n",
       "      <td>M01</td>\n",
       "      <td>111152.0</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>LASST470000000000005</td>\n",
       "      <td>1976</td>\n",
       "      <td>M01</td>\n",
       "      <td>1691780.0</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>LASST470000000000006</td>\n",
       "      <td>1976</td>\n",
       "      <td>M01</td>\n",
       "      <td>1802930.0</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>LAUST470000000000003</td>\n",
       "      <td>1976</td>\n",
       "      <td>M01</td>\n",
       "      <td>6.9</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "              series_id  year period      value footnote_codes\n",
       "0  LASST470000000000003  1976    M01        6.2           None\n",
       "1  LASST470000000000004  1976    M01   111152.0           None\n",
       "2  LASST470000000000005  1976    M01  1691780.0           None\n",
       "3  LASST470000000000006  1976    M01  1802930.0           None\n",
       "4  LAUST470000000000003  1976    M01        6.9           None"
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%sql PATH='/Users/tdobbins/Desktop/data.csv' PARAMS={'year':year,'period':period} MAKE_GLOBAL=DF DB=bls\n",
    "select *\n",
    "from la_unemployment\n",
    "where year = %(year)s\n",
    "    and period = %(period)s\n",
    "limit 5"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Without pandas:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<table width=100%><thead><tr><th>series_id</th><th>year</th><th>period</th><th>value</th><th>footnote_codes</th></tr></thead><tbody><tr><tr><td>LASST470000000000003</td><td>1976</td><td>M01</td><td>6.2</td><td>None</td></tr><tr><td>LASST470000000000004</td><td>1976</td><td>M01</td><td>111152.0</td><td>None</td></tr><tr><td>LASST470000000000005</td><td>1976</td><td>M01</td><td>1691780.0</td><td>None</td></tr><tr><td>LASST470000000000006</td><td>1976</td><td>M01</td><td>1802930.0</td><td>None</td></tr><tr><td>LAUST470000000000003</td><td>1976</td><td>M01</td><td>6.9</td><td>None</td></tr></tbody>"
      ],
      "text/plain": [
       "[[u'series_id', u'year', u'period', u'value', u'footnote_codes'],\n",
       " (u'LASST470000000000003', 1976, u'M01', 6.2, None),\n",
       " (u'LASST470000000000004', 1976, u'M01', 111152.0, None),\n",
       " (u'LASST470000000000005', 1976, u'M01', 1691780.0, None),\n",
       " (u'LASST470000000000006', 1976, u'M01', 1802930.0, None),\n",
       " (u'LAUST470000000000003', 1976, u'M01', 6.9, None)]"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%sql PATH='/Users/tdobbins/Desktop/data.csv' PARAMS={'year':year,'period':period} MAKE_GLOBAL=DF DB=bls\n",
    "select *\n",
    "from la_unemployment\n",
    "where year = %(year)s\n",
    "    and period = %(period)s\n",
    "limit 5"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If you put this script in your ~/.ipython/profile_default/startup directory, it will load the magic function at the start of every Jupyter notebook."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.10"
  },
  "widgets": {
   "state": {},
   "version": "1.1.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}