Skip to content

Instantly share code, notes, and snippets.

@psychemedia
Last active October 22, 2020 06:36
Show Gist options
  • Star 2 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save psychemedia/3187bce18ffdd79bf258d6011ec301b3 to your computer and use it in GitHub Desktop.
Save psychemedia/3187bce18ffdd79bf258d6011ec301b3 to your computer and use it in GitHub Desktop.
Proof of concept sqlite_utils magic
Display the source blob
Display the rendered blob
Raw
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%pip install --upgrade sqlite_utils\n",
"%pip install --upgrade csvs-to-sqlite"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Quick Sketch - `sqlite_utils` magic\n",
"\n",
"Wondering: what might `sqlite_utils` magic look like, and why should we bother?\n",
"\n",
"One way for it to look would be to start to replicate the CLI.\n",
"\n",
"As to why bother? Maybe folk are using a notebook but they aren't a Python programmer, or maybe they just want some shortcut, bespoke magic to do common operations in a CLI rather than a Python function way.\n",
"\n",
"Along the way, we might as well make use of other tools in @simonw's `-to-sqlite` universe."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Base magic — WIP\n",
"\n",
"Let's have a quick hack at some magic.."
]
},
{
"cell_type": "code",
"execution_count": 159,
"metadata": {},
"outputs": [],
"source": [
"from IPython.core.magic import magics_class, line_magic, cell_magic, line_cell_magic, Magics\n",
"from IPython.core import magic_arguments\n",
"\n",
"from sqlite_utils import Database\n",
"from io import StringIO\n",
"import pandas as pd\n",
"import json\n",
"import requests\n",
"\n",
"@magics_class\n",
"class SQLite_Utils(Magics):\n",
"\n",
" def clean_string(self, string):\n",
" \"\"\"Clean string argument.\"\"\"\n",
" if isinstance(string, str):\n",
" string = string.strip(\"\\\"\\'\")\n",
" return f'{string}'\n",
" return string\n",
"\n",
"\n",
" def download_file(self, url, filename=None, chunk_size=8192):\n",
" \"\"\"Download file.\"\"\"\n",
" # We stream in case the file is a large one...\n",
" # Should we download to a temporary file?\n",
" # Or is persistence good for audit / if anything goes wrong?\n",
" fn = url.split('/')[-1] if not filename else filename\n",
" with requests.get(url, stream=True) as r:\n",
" #r.raise_for_status() # Error reporting\n",
" with open(fn, 'wb') as f:\n",
" for chunk in r.iter_content(chunk_size=chunk_size): \n",
" f.write(chunk)\n",
" return fn\n",
"\n",
"\n",
" # We'll simplify things for now by saying you MUST call this first\n",
" @line_magic\n",
" @magic_arguments.magic_arguments()\n",
" @magic_arguments.argument(\"--dbvar\", \"-d\", default='_sqlite_utils_db', help=\"DB var.\")\n",
" @magic_arguments.argument(\"--memory\", \"-m\", action='store_true', help=\"DB var.\")\n",
" @magic_arguments.argument(\"--recreate\", \"-r\", action='store_true', help=\"DB var.\")\n",
" @magic_arguments.argument(\n",
" \"--filename\", \"-f\", default=\"sqlite_utils_magic.db\", help=\"SQLite database pathname.\")\n",
" def sql_utils_create(self, line):\n",
" \"Create a database.\"\n",
" args = magic_arguments.parse_argstring(self.sql_utils_create, line)\n",
" \n",
" # TO DO - the --recreate flag seems to create a read only database?\n",
" if args.memory:\n",
" # This is a bit of trickery - keep the db reference in a variable\n",
" # in the parent Python environment\n",
" self.shell.user_ns[args.dbvar] = Database(memory=True, recreate=args.recreate)\n",
" else:\n",
" self.shell.user_ns[args.dbvar] = Database(args.filename, recreate=args.recreate)\n",
" self.shell.user_ns['_sql_utils_magic_db_path'] =args.filename\n",
"\n",
"\n",
" @line_magic\n",
" @magic_arguments.magic_arguments()\n",
" @magic_arguments.argument(\"--dbvar\", \"-d\", default='_sqlite_utils_db', help=\"DB var.\")\n",
" @magic_arguments.argument(\"--table\", \"-t\", default=None, help=\"Table name.\")\n",
" @magic_arguments.argument(\"--columns\", \"-c\", action='store_true', help=\"Show columns.\")\n",
" def sql_utils_tables(self, line):\n",
" \"\"\"Show tables.\"\"\"\n",
" args = magic_arguments.parse_argstring(self.sql_utils_tables, line)\n",
" db = self.shell.user_ns[args.dbvar]\n",
" if args.table and args.columns:\n",
" return db[args.table].columns\n",
" return db.table_names()\n",
"\n",
"\n",
" @line_magic\n",
" @magic_arguments.magic_arguments()\n",
" @magic_arguments.argument(\"--dbvar\", \"-d\", default='_sqlite_utils_db', help=\"DB var.\")\n",
" def sql_utils_views(self, line):\n",
" \"\"\"Show views.\"\"\"\n",
" args = magic_arguments.parse_argstring(self.sql_utils_views, line)\n",
" db = self.shell.user_ns[args.dbvar]\n",
" return db.view_names()\n",
"\n",
"\n",
" # Remember for now we have simplified by requirig the db has been set up\n",
" @line_cell_magic\n",
" @magic_arguments.magic_arguments()\n",
" @magic_arguments.argument(\"--csv\", \"-c\", default=None, help=\"CSV path.\")\n",
" @magic_arguments.argument(\"--filename\", \"-f\", default=None, help=\"Filename.\")\n",
" def sql_utils_import_csv(self, line, cell=None):\n",
" \"\"\"Import CSV file.\"\"\"\n",
" args = magic_arguments.parse_argstring(self.sql_utils_import_csv, line)\n",
" if not args.csv or '_sql_utils_magic_db_path' not in self.shell.user_ns:\n",
" return\n",
" if args.csv.startswith('http'):\n",
" # It's a URL... so download it...\n",
" fn = self.download_file(args.csv, args.filename)\n",
" # Given we're in magic land!\n",
" db_path = self.shell.user_ns['_sql_utils_magic_db_path']\n",
" !csvs-to-sqlite {fn} {db_path}\n",
"\n",
"\n",
" @cell_magic\n",
" @magic_arguments.magic_arguments()\n",
" @magic_arguments.argument(\"--dbvar\", \"-d\", default='_sqlite_utils_db', help=\"DB var.\")\n",
" @magic_arguments.argument(\"--table\", \"-t\", default=None, help=\"Table name.\")\n",
" def sql_utils_add_rows(self, line, cell):\n",
" \"\"\"Add rows from JSON.\"\"\"\n",
" args = magic_arguments.parse_argstring(self.sql_utils_add_rows, line)\n",
" if not args.table:\n",
" return\n",
" db = self.shell.user_ns[args.dbvar]\n",
" df = pd.DataFrame(StringIO(cell))\n",
" df = pd.json_normalize(df[0].str.strip().apply(json.loads))\n",
" db[args.table].insert_all(df.to_dict(orient='records'))\n",
"\n",
" \n",
" @line_cell_magic\n",
" @magic_arguments.magic_arguments()\n",
" @magic_arguments.argument(\"--dbvar\", \"-d\", default='_sqlite_utils_db', help=\"DB var.\")\n",
" @magic_arguments.argument(\"--create\", \"-k\", action='store_true', help=\"Create.\")\n",
" @magic_arguments.argument(\"--table\", \"-t\", default=None, help=\"Table name.\")\n",
" @magic_arguments.argument(\"--tabledef\", \"-T\", default=None, help=\"Table definition.\")\n",
" @magic_arguments.argument(\"--pk\", \"-p\", default=None, help=\"Primary key.\")\n",
" @magic_arguments.argument(\"--fk\", \"-f\", default=None, help=\"Foreign keys.\")\n",
" @magic_arguments.argument(\"--columnorder\", \"-c\", default=None, help=\"Column order.\")\n",
" @magic_arguments.argument(\"--notnull\", \"-n\", default=None, help=\"Not null.\")\n",
" def sql_utils_table(self, line, cell=None):\n",
" args = magic_arguments.parse_argstring(self.sql_utils_table, line)\n",
" db = self.shell.user_ns[args.dbvar]\n",
" tabledef = self.clean_string(args.tabledef)\n",
" pk = self.clean_string(args.pk)\n",
" fk = self.clean_string(args.fk)\n",
" columnorder = self.clean_string(args.columnorder)\n",
" if columnorder:\n",
" columnorder = tuple([f\"{c.strip()}\" for c in columnorder.split(',')])\n",
" notnull = self.clean_string(args.notnull)\n",
" if notnull:\n",
" notnull = set([f\"{n.strip()}\" for n in notnull.split(',')])\n",
" if cell is None and args.table:\n",
" if args.create:\n",
" db[args.table].create(tabledef,\n",
" pk=pk,\n",
" foreign_keys=fk,\n",
" column_order=columnorder,\n",
" not_null=notnull\n",
" )\n",
" else:\n",
" db.table(args.table,\n",
" pk=pk,\n",
" foreign_keys=fk,\n",
" column_order=columnorder,\n",
" not_null=notnull\n",
" )\n",
" else:\n",
" db[args.table].create(eval(cell),\n",
" pk=pk,\n",
" foreign_keys=fk,\n",
" column_order=columnorder,\n",
" not_null=notnull\n",
" )\n",
"\n",
" \n",
" @line_cell_magic\n",
" @magic_arguments.magic_arguments()\n",
" @magic_arguments.argument(\"--dbvar\", \"-d\", default='_sqlite_utils_db', help=\"DB var.\")\n",
" @magic_arguments.argument(\"--table\", \"-t\", default=None, help=\"Table name.\")\n",
" @magic_arguments.argument(\"--select\", \"-s\", default=None, help=\"Query.\")\n",
" @magic_arguments.argument(\"--where\", \"-w\", default=None, help=\"Where.\")\n",
" @magic_arguments.argument(\"--orderby\", \"-o\", default=None, help=\"Order by.\")\n",
" @magic_arguments.argument(\"--generator\", \"-G\", default=None, help=\"Return generator.\")\n",
" @magic_arguments.argument(\"--outvar\", \"-O\", default=None, help=\"Output variable.\")\n",
" @magic_arguments.argument(\"--query\", \"-q\", default=None, help=\"Open query.\")\n",
" @magic_arguments.argument(\"--get\", \"-g\", default=None, help=\"Get record by primary key.\")\n",
" def sql_utils_query(self, line, cell=None):\n",
" args = magic_arguments.parse_argstring(self.sql_utils_query, line)\n",
" db = self.shell.user_ns[args.dbvar]\n",
" if cell is None:\n",
" # Allow a quoted string to be passed via -s\n",
" if args.table:\n",
" if args.get:\n",
" # Does this need heuristic typecasting?\n",
" # Or can we pass vars that retain type?\n",
" return db[args.table].get(args.get)\n",
" else:\n",
" select = self.clean_string(args.select)\n",
" where = self.clean_string(args.where)\n",
" orderby = self.clean_string(args.orderby)\n",
" resp = db[args.table].rows_where(where=where,\n",
" select=select,\n",
" order_by=orderby)\n",
" if args.generator:\n",
" return resp\n",
" df = pd.DataFrame(resp)\n",
" if args.outvar:\n",
" self.shell.user_ns[args.outvar] = df\n",
" return pd.DataFrame(df)\n",
" elif args.query:\n",
" # TO DO - DRY - we should be able to abstract most\n",
" # of this into a separate utility function\n",
" query = self.clean_string(args.query)\n",
" df = pd.DataFrame(db.execute(query).fetchall())\n",
" if args.outvar:\n",
" self.shell.user_ns[args.outvar] = df\n",
" return pd.DataFrame(df)\n",
" else:\n",
" # For now assume this is just a query\n",
" df = pd.DataFrame(db.execute(cell).fetchall())\n",
" if args.outvar:\n",
" self.shell.user_ns[args.outvar] = df\n",
" return pd.DataFrame(df)\n",
"\n",
"\n",
"ip = get_ipython()\n",
"ip.register_magics(SQLite_Utils)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Start off by creating a db - at the moment this is a MUST first step.\n",
"\n",
"Known issues:\n",
"\n",
"- `recreate` seems to result in a read-only db?"
]
},
{
"cell_type": "code",
"execution_count": 129,
"metadata": {},
"outputs": [],
"source": [
"#%load_ext SQLite_Utils\n",
"%sql_utils_create --filename mydemo1.db"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We actually have a reference to the database:"
]
},
{
"cell_type": "code",
"execution_count": 130,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Database <sqlite3.Connection object at 0x116cd38f0>>"
]
},
"execution_count": 130,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"_sqlite_utils_db"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can add some rows to a table:"
]
},
{
"cell_type": "code",
"execution_count": 131,
"metadata": {},
"outputs": [],
"source": [
"%%sql_utils_add_rows -t test1\n",
"{\"id\": 1, \"name\": \"Cleo\", \"twitter\": \"cleopaws\", \"age\": 3,\"is_good_dog\": true}\n",
"{\"id\": 2, \"name\": \"Marnie\", \"twitter\": \"MarnieTheDog\", \"age\": 16,\"is_good_dog\": true}\n",
"{\"id\": 3, \"name\": \"Cleo2\", \"twitter\": \"cleopaws2\", \"age\": 3,\"is_good_dog\": true}\n",
"{\"id\": 4, \"name\": \"Marnie2\", \"twitter\": \"MarnieTheDog2\", \"age\": 16,\"is_good_dog\": true}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can list the tables:"
]
},
{
"cell_type": "code",
"execution_count": 132,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['test1', '?f=csv', 'mps']"
]
},
"execution_count": 132,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%sql_utils_tables"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can query tables:"
]
},
{
"cell_type": "code",
"execution_count": 135,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>name</th>\n",
" <th>age</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Cleo</td>\n",
" <td>3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Marnie</td>\n",
" <td>16</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Cleo2</td>\n",
" <td>3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Marnie2</td>\n",
" <td>16</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Cleo</td>\n",
" <td>3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>Marnie</td>\n",
" <td>16</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>Cleo2</td>\n",
" <td>3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>Marnie2</td>\n",
" <td>16</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>Cleo</td>\n",
" <td>3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>Marnie</td>\n",
" <td>16</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10</th>\n",
" <td>Cleo2</td>\n",
" <td>3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>11</th>\n",
" <td>Marnie2</td>\n",
" <td>16</td>\n",
" </tr>\n",
" <tr>\n",
" <th>12</th>\n",
" <td>Cleo</td>\n",
" <td>3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>13</th>\n",
" <td>Marnie</td>\n",
" <td>16</td>\n",
" </tr>\n",
" <tr>\n",
" <th>14</th>\n",
" <td>Cleo2</td>\n",
" <td>3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>15</th>\n",
" <td>Marnie2</td>\n",
" <td>16</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" name age\n",
"0 Cleo 3\n",
"1 Marnie 16\n",
"2 Cleo2 3\n",
"3 Marnie2 16\n",
"4 Cleo 3\n",
"5 Marnie 16\n",
"6 Cleo2 3\n",
"7 Marnie2 16\n",
"8 Cleo 3\n",
"9 Marnie 16\n",
"10 Cleo2 3\n",
"11 Marnie2 16\n",
"12 Cleo 3\n",
"13 Marnie 16\n",
"14 Cleo2 3\n",
"15 Marnie2 16"
]
},
"execution_count": 135,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%sql_utils_query -t test1 -s name,age"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can have more elaborate queries:"
]
},
{
"cell_type": "code",
"execution_count": 136,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>name</th>\n",
" <th>age</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Cleo</td>\n",
" <td>3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Marnie</td>\n",
" <td>16</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Cleo2</td>\n",
" <td>3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Marnie2</td>\n",
" <td>16</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Cleo</td>\n",
" <td>3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>Marnie</td>\n",
" <td>16</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>Cleo2</td>\n",
" <td>3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>Marnie2</td>\n",
" <td>16</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>Cleo</td>\n",
" <td>3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>Marnie</td>\n",
" <td>16</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10</th>\n",
" <td>Cleo2</td>\n",
" <td>3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>11</th>\n",
" <td>Marnie2</td>\n",
" <td>16</td>\n",
" </tr>\n",
" <tr>\n",
" <th>12</th>\n",
" <td>Cleo</td>\n",
" <td>3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>13</th>\n",
" <td>Marnie</td>\n",
" <td>16</td>\n",
" </tr>\n",
" <tr>\n",
" <th>14</th>\n",
" <td>Cleo2</td>\n",
" <td>3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>15</th>\n",
" <td>Marnie2</td>\n",
" <td>16</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" name age\n",
"0 Cleo 3\n",
"1 Marnie 16\n",
"2 Cleo2 3\n",
"3 Marnie2 16\n",
"4 Cleo 3\n",
"5 Marnie 16\n",
"6 Cleo2 3\n",
"7 Marnie2 16\n",
"8 Cleo 3\n",
"9 Marnie 16\n",
"10 Cleo2 3\n",
"11 Marnie2 16\n",
"12 Cleo 3\n",
"13 Marnie 16\n",
"14 Cleo2 3\n",
"15 Marnie2 16"
]
},
"execution_count": 136,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%sql_utils_query -t test1 -s 'name, age'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can import data from a CSV file downloaded from a URL:"
]
},
{
"cell_type": "code",
"execution_count": 94,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Loaded 1 dataframes\n",
"/usr/local/lib/python3.7/site-packages/pandas/core/generic.py:2612: UserWarning: The spaces in these column names will not be changed. In pandas versions < 0.14, spaces were converted to underscores.\n",
" method=method,\n",
"Added 1 CSV file to mydemo1.db\n"
]
}
],
"source": [
"%sql_utils_import_csv --csv https://www.theyworkforyou.com/mps/?f=csv --filename mps.csv"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Recheck the tables:"
]
},
{
"cell_type": "code",
"execution_count": 137,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['test1', '?f=csv', 'mps']"
]
},
"execution_count": 137,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%sql_utils_tables"
]
},
{
"cell_type": "code",
"execution_count": 138,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Person ID</th>\n",
" <th>First name</th>\n",
" <th>Last name</th>\n",
" <th>Party</th>\n",
" <th>Constituency</th>\n",
" <th>URI</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>10001</td>\n",
" <td>Diane</td>\n",
" <td>Abbott</td>\n",
" <td>Labour</td>\n",
" <td>Hackney North and Stoke Newington</td>\n",
" <td>https://www.theyworkforyou.com/mp/10001/diane_...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>25034</td>\n",
" <td>Debbie</td>\n",
" <td>Abrahams</td>\n",
" <td>Labour</td>\n",
" <td>Oldham East and Saddleworth</td>\n",
" <td>https://www.theyworkforyou.com/mp/25034/debbie...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>24878</td>\n",
" <td>Nigel</td>\n",
" <td>Adams</td>\n",
" <td>Conservative</td>\n",
" <td>Selby and Ainsty</td>\n",
" <td>https://www.theyworkforyou.com/mp/24878/nigel_...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>25661</td>\n",
" <td>Bim</td>\n",
" <td>Afolami</td>\n",
" <td>Conservative</td>\n",
" <td>Hitchin and Harpenden</td>\n",
" <td>https://www.theyworkforyou.com/mp/25661/bim_af...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>11929</td>\n",
" <td>Adam</td>\n",
" <td>Afriyie</td>\n",
" <td>Conservative</td>\n",
" <td>Windsor</td>\n",
" <td>https://www.theyworkforyou.com/mp/11929/adam_a...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>...</th>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>645</th>\n",
" <td>11791</td>\n",
" <td>Jeremy</td>\n",
" <td>Wright</td>\n",
" <td>Conservative</td>\n",
" <td>Kenilworth and Southam</td>\n",
" <td>https://www.theyworkforyou.com/mp/11791/jeremy...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>646</th>\n",
" <td>25649</td>\n",
" <td>Mohammad</td>\n",
" <td>Yasin</td>\n",
" <td>Labour</td>\n",
" <td>Bedford</td>\n",
" <td>https://www.theyworkforyou.com/mp/25649/mohamm...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>647</th>\n",
" <td>25806</td>\n",
" <td>Jacob</td>\n",
" <td>Young</td>\n",
" <td>Conservative</td>\n",
" <td>Redcar</td>\n",
" <td>https://www.theyworkforyou.com/mp/25806/jacob_...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>648</th>\n",
" <td>24822</td>\n",
" <td>Nadhim</td>\n",
" <td>Zahawi</td>\n",
" <td>Conservative</td>\n",
" <td>Stratford-on-Avon</td>\n",
" <td>https://www.theyworkforyou.com/mp/24822/nadhim...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>649</th>\n",
" <td>25386</td>\n",
" <td>Daniel</td>\n",
" <td>Zeichner</td>\n",
" <td>Labour</td>\n",
" <td>Cambridge</td>\n",
" <td>https://www.theyworkforyou.com/mp/25386/daniel...</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>650 rows × 6 columns</p>\n",
"</div>"
],
"text/plain": [
" Person ID First name Last name Party \\\n",
"0 10001 Diane Abbott Labour \n",
"1 25034 Debbie Abrahams Labour \n",
"2 24878 Nigel Adams Conservative \n",
"3 25661 Bim Afolami Conservative \n",
"4 11929 Adam Afriyie Conservative \n",
".. ... ... ... ... \n",
"645 11791 Jeremy Wright Conservative \n",
"646 25649 Mohammad Yasin Labour \n",
"647 25806 Jacob Young Conservative \n",
"648 24822 Nadhim Zahawi Conservative \n",
"649 25386 Daniel Zeichner Labour \n",
"\n",
" Constituency \\\n",
"0 Hackney North and Stoke Newington \n",
"1 Oldham East and Saddleworth \n",
"2 Selby and Ainsty \n",
"3 Hitchin and Harpenden \n",
"4 Windsor \n",
".. ... \n",
"645 Kenilworth and Southam \n",
"646 Bedford \n",
"647 Redcar \n",
"648 Stratford-on-Avon \n",
"649 Cambridge \n",
"\n",
" URI \n",
"0 https://www.theyworkforyou.com/mp/10001/diane_... \n",
"1 https://www.theyworkforyou.com/mp/25034/debbie... \n",
"2 https://www.theyworkforyou.com/mp/24878/nigel_... \n",
"3 https://www.theyworkforyou.com/mp/25661/bim_af... \n",
"4 https://www.theyworkforyou.com/mp/11929/adam_a... \n",
".. ... \n",
"645 https://www.theyworkforyou.com/mp/11791/jeremy... \n",
"646 https://www.theyworkforyou.com/mp/25649/mohamm... \n",
"647 https://www.theyworkforyou.com/mp/25806/jacob_... \n",
"648 https://www.theyworkforyou.com/mp/24822/nadhim... \n",
"649 https://www.theyworkforyou.com/mp/25386/daniel... \n",
"\n",
"[650 rows x 6 columns]"
]
},
"execution_count": 138,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%sql_utils_query -t mps -s '*'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Run an arbitrary query:"
]
},
{
"cell_type": "code",
"execution_count": 139,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>0</th>\n",
" <th>1</th>\n",
" <th>2</th>\n",
" <th>3</th>\n",
" <th>4</th>\n",
" <th>5</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>10001</td>\n",
" <td>Diane</td>\n",
" <td>Abbott</td>\n",
" <td>Labour</td>\n",
" <td>Hackney North and Stoke Newington</td>\n",
" <td>https://www.theyworkforyou.com/mp/10001/diane_...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>25034</td>\n",
" <td>Debbie</td>\n",
" <td>Abrahams</td>\n",
" <td>Labour</td>\n",
" <td>Oldham East and Saddleworth</td>\n",
" <td>https://www.theyworkforyou.com/mp/25034/debbie...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>24878</td>\n",
" <td>Nigel</td>\n",
" <td>Adams</td>\n",
" <td>Conservative</td>\n",
" <td>Selby and Ainsty</td>\n",
" <td>https://www.theyworkforyou.com/mp/24878/nigel_...</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" 0 1 2 3 4 \\\n",
"0 10001 Diane Abbott Labour Hackney North and Stoke Newington \n",
"1 25034 Debbie Abrahams Labour Oldham East and Saddleworth \n",
"2 24878 Nigel Adams Conservative Selby and Ainsty \n",
"\n",
" 5 \n",
"0 https://www.theyworkforyou.com/mp/10001/diane_... \n",
"1 https://www.theyworkforyou.com/mp/25034/debbie... \n",
"2 https://www.theyworkforyou.com/mp/24878/nigel_... "
]
},
"execution_count": 139,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df = %sql_utils_query -q \"SELECT * FROM mps LIMIT 3;\"\n",
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If we run the block magic we assume it's an open query:"
]
},
{
"cell_type": "code",
"execution_count": 163,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>0</th>\n",
" <th>1</th>\n",
" <th>2</th>\n",
" <th>3</th>\n",
" <th>4</th>\n",
" <th>5</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>10001</td>\n",
" <td>Diane</td>\n",
" <td>Abbott</td>\n",
" <td>Labour</td>\n",
" <td>Hackney North and Stoke Newington</td>\n",
" <td>https://www.theyworkforyou.com/mp/10001/diane_...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>25034</td>\n",
" <td>Debbie</td>\n",
" <td>Abrahams</td>\n",
" <td>Labour</td>\n",
" <td>Oldham East and Saddleworth</td>\n",
" <td>https://www.theyworkforyou.com/mp/25034/debbie...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>24878</td>\n",
" <td>Nigel</td>\n",
" <td>Adams</td>\n",
" <td>Conservative</td>\n",
" <td>Selby and Ainsty</td>\n",
" <td>https://www.theyworkforyou.com/mp/24878/nigel_...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>25661</td>\n",
" <td>Bim</td>\n",
" <td>Afolami</td>\n",
" <td>Conservative</td>\n",
" <td>Hitchin and Harpenden</td>\n",
" <td>https://www.theyworkforyou.com/mp/25661/bim_af...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>11929</td>\n",
" <td>Adam</td>\n",
" <td>Afriyie</td>\n",
" <td>Conservative</td>\n",
" <td>Windsor</td>\n",
" <td>https://www.theyworkforyou.com/mp/11929/adam_a...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>25817</td>\n",
" <td>Nickie</td>\n",
" <td>Aiken</td>\n",
" <td>Conservative</td>\n",
" <td>Cities of London and Westminster</td>\n",
" <td>https://www.theyworkforyou.com/mp/25817/nickie...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>24904</td>\n",
" <td>Peter</td>\n",
" <td>Aldous</td>\n",
" <td>Conservative</td>\n",
" <td>Waveney</td>\n",
" <td>https://www.theyworkforyou.com/mp/24904/peter_...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>24958</td>\n",
" <td>Rushanara</td>\n",
" <td>Ali</td>\n",
" <td>Labour</td>\n",
" <td>Bethnal Green and Bow</td>\n",
" <td>https://www.theyworkforyou.com/mp/24958/rushan...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>25888</td>\n",
" <td>Tahir</td>\n",
" <td>Ali</td>\n",
" <td>Labour</td>\n",
" <td>Birmingham, Hall Green</td>\n",
" <td>https://www.theyworkforyou.com/mp/25888/tahir_...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>25337</td>\n",
" <td>Lucy</td>\n",
" <td>Allan</td>\n",
" <td>Conservative</td>\n",
" <td>Telford</td>\n",
" <td>https://www.theyworkforyou.com/mp/25337/lucy_a...</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" 0 1 2 3 \\\n",
"0 10001 Diane Abbott Labour \n",
"1 25034 Debbie Abrahams Labour \n",
"2 24878 Nigel Adams Conservative \n",
"3 25661 Bim Afolami Conservative \n",
"4 11929 Adam Afriyie Conservative \n",
"5 25817 Nickie Aiken Conservative \n",
"6 24904 Peter Aldous Conservative \n",
"7 24958 Rushanara Ali Labour \n",
"8 25888 Tahir Ali Labour \n",
"9 25337 Lucy Allan Conservative \n",
"\n",
" 4 \\\n",
"0 Hackney North and Stoke Newington \n",
"1 Oldham East and Saddleworth \n",
"2 Selby and Ainsty \n",
"3 Hitchin and Harpenden \n",
"4 Windsor \n",
"5 Cities of London and Westminster \n",
"6 Waveney \n",
"7 Bethnal Green and Bow \n",
"8 Birmingham, Hall Green \n",
"9 Telford \n",
"\n",
" 5 \n",
"0 https://www.theyworkforyou.com/mp/10001/diane_... \n",
"1 https://www.theyworkforyou.com/mp/25034/debbie... \n",
"2 https://www.theyworkforyou.com/mp/24878/nigel_... \n",
"3 https://www.theyworkforyou.com/mp/25661/bim_af... \n",
"4 https://www.theyworkforyou.com/mp/11929/adam_a... \n",
"5 https://www.theyworkforyou.com/mp/25817/nickie... \n",
"6 https://www.theyworkforyou.com/mp/24904/peter_... \n",
"7 https://www.theyworkforyou.com/mp/24958/rushan... \n",
"8 https://www.theyworkforyou.com/mp/25888/tahir_... \n",
"9 https://www.theyworkforyou.com/mp/25337/lucy_a... "
]
},
"execution_count": 163,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql_utils_query --outvar mydf\n",
"SELECT *\n",
"-- with a comment\n",
"FROM mps -- and another comment\n",
"LIMIT 10"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `--outvar` switch lets us assign the result to a variable:"
]
},
{
"cell_type": "code",
"execution_count": 119,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>0</th>\n",
" <th>1</th>\n",
" <th>2</th>\n",
" <th>3</th>\n",
" <th>4</th>\n",
" <th>5</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>10001</td>\n",
" <td>Diane</td>\n",
" <td>Abbott</td>\n",
" <td>Labour</td>\n",
" <td>Hackney North and Stoke Newington</td>\n",
" <td>https://www.theyworkforyou.com/mp/10001/diane_...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>25034</td>\n",
" <td>Debbie</td>\n",
" <td>Abrahams</td>\n",
" <td>Labour</td>\n",
" <td>Oldham East and Saddleworth</td>\n",
" <td>https://www.theyworkforyou.com/mp/25034/debbie...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>24878</td>\n",
" <td>Nigel</td>\n",
" <td>Adams</td>\n",
" <td>Conservative</td>\n",
" <td>Selby and Ainsty</td>\n",
" <td>https://www.theyworkforyou.com/mp/24878/nigel_...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>25661</td>\n",
" <td>Bim</td>\n",
" <td>Afolami</td>\n",
" <td>Conservative</td>\n",
" <td>Hitchin and Harpenden</td>\n",
" <td>https://www.theyworkforyou.com/mp/25661/bim_af...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>11929</td>\n",
" <td>Adam</td>\n",
" <td>Afriyie</td>\n",
" <td>Conservative</td>\n",
" <td>Windsor</td>\n",
" <td>https://www.theyworkforyou.com/mp/11929/adam_a...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>25817</td>\n",
" <td>Nickie</td>\n",
" <td>Aiken</td>\n",
" <td>Conservative</td>\n",
" <td>Cities of London and Westminster</td>\n",
" <td>https://www.theyworkforyou.com/mp/25817/nickie...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>24904</td>\n",
" <td>Peter</td>\n",
" <td>Aldous</td>\n",
" <td>Conservative</td>\n",
" <td>Waveney</td>\n",
" <td>https://www.theyworkforyou.com/mp/24904/peter_...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>24958</td>\n",
" <td>Rushanara</td>\n",
" <td>Ali</td>\n",
" <td>Labour</td>\n",
" <td>Bethnal Green and Bow</td>\n",
" <td>https://www.theyworkforyou.com/mp/24958/rushan...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>25888</td>\n",
" <td>Tahir</td>\n",
" <td>Ali</td>\n",
" <td>Labour</td>\n",
" <td>Birmingham, Hall Green</td>\n",
" <td>https://www.theyworkforyou.com/mp/25888/tahir_...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>25337</td>\n",
" <td>Lucy</td>\n",
" <td>Allan</td>\n",
" <td>Conservative</td>\n",
" <td>Telford</td>\n",
" <td>https://www.theyworkforyou.com/mp/25337/lucy_a...</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" 0 1 2 3 \\\n",
"0 10001 Diane Abbott Labour \n",
"1 25034 Debbie Abrahams Labour \n",
"2 24878 Nigel Adams Conservative \n",
"3 25661 Bim Afolami Conservative \n",
"4 11929 Adam Afriyie Conservative \n",
"5 25817 Nickie Aiken Conservative \n",
"6 24904 Peter Aldous Conservative \n",
"7 24958 Rushanara Ali Labour \n",
"8 25888 Tahir Ali Labour \n",
"9 25337 Lucy Allan Conservative \n",
"\n",
" 4 \\\n",
"0 Hackney North and Stoke Newington \n",
"1 Oldham East and Saddleworth \n",
"2 Selby and Ainsty \n",
"3 Hitchin and Harpenden \n",
"4 Windsor \n",
"5 Cities of London and Westminster \n",
"6 Waveney \n",
"7 Bethnal Green and Bow \n",
"8 Birmingham, Hall Green \n",
"9 Telford \n",
"\n",
" 5 \n",
"0 https://www.theyworkforyou.com/mp/10001/diane_... \n",
"1 https://www.theyworkforyou.com/mp/25034/debbie... \n",
"2 https://www.theyworkforyou.com/mp/24878/nigel_... \n",
"3 https://www.theyworkforyou.com/mp/25661/bim_af... \n",
"4 https://www.theyworkforyou.com/mp/11929/adam_a... \n",
"5 https://www.theyworkforyou.com/mp/25817/nickie... \n",
"6 https://www.theyworkforyou.com/mp/24904/peter_... \n",
"7 https://www.theyworkforyou.com/mp/24958/rushan... \n",
"8 https://www.theyworkforyou.com/mp/25888/tahir_... \n",
"9 https://www.theyworkforyou.com/mp/25337/lucy_a... "
]
},
"execution_count": 119,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"mydf"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can preconfigure table attributes for when a table is created by an insert:"
]
},
{
"cell_type": "code",
"execution_count": 152,
"metadata": {},
"outputs": [],
"source": [
"%sql_utils_table --table \"authors\" --pk \"id\" --notnull \"name,score\" -c \"id,name,score,url\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can explictly create a table:"
]
},
{
"cell_type": "code",
"execution_count": 160,
"metadata": {},
"outputs": [],
"source": [
"%%sql_utils_table --table cats --pk \"id\"\n",
"{\n",
" \"id\": int,\n",
" \"name\": str,\n",
" \"weight\": float,\n",
"}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can view table columns:"
]
},
{
"cell_type": "code",
"execution_count": 161,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Column(cid=0, name='id', type='INTEGER', notnull=0, default_value=None, is_pk=1),\n",
" Column(cid=1, name='name', type='TEXT', notnull=0, default_value=None, is_pk=0),\n",
" Column(cid=2, name='weight', type='FLOAT', notnull=0, default_value=None, is_pk=0)]"
]
},
"execution_count": 161,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%sql_utils_tables --table cats -c"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment