Created
July 24, 2019 22:20
-
-
Save ScienceTony/32051bcd4a2789ecd622055d930000bb to your computer and use it in GitHub Desktop.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# Using SQL (postresql and magic) to examine earthquake dataset." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"[Tony McDonald](https://twitter.com/ScienceTony) " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 1, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"'24-07-2019'" | |
] | |
}, | |
"execution_count": 1, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"from datetime import datetime\n", | |
"datetime.today().strftime('%d-%m-%Y')" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"This is taken from the Socratica SQL youtube video for [SELECT](https://www.youtube.com/watch?v=VkabxQgtGsA&list=PLi01XoE8jYojRqM4qGBF1U90Ee1Ecb5tt&index=7). I have also added in a bit more<br>\n", | |
"License for the dataset: MIT License with Copyright (c) 2019 Socratica https://github.com/socratica/data/blob/master/LICENSE\n", | |
"<br>\n", | |
"The license for the dataset allows me to manipulate and use the data. " | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"SELECT is for choosing <b>columns</b><br>\n", | |
"FROM is for choosing a <b>database</b> <br>\n", | |
"WHERE is for choosing the <b>rows</b><br>\n", | |
"ORDER BY is for <b>sorting</b> and sorts on a <i>specified column</i> " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 3, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"# Load in the sql extensions:\n", | |
"%load_ext sql" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 4, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"#% sql sqlite://" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 5, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"postgres (PostgreSQL) 11.2\r\n" | |
] | |
} | |
], | |
"source": [ | |
"! postgres -V" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## I used pgAdmin to prepare a database and import the csv file. The following code allows me to connect to the database. " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 6, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"DB_ENGINE='postgresql' # tells ipython-sql that we will use postgreSQL as our database engine\n", | |
"DB_USER='postgres' # id of the user who is logging in\n", | |
"DB_PWD='postgres' # the user's password\n", | |
"DB_ADDR='localhost:5432' # the host and port on which the database engine is listening\n", | |
"DB_NAME='earthquake' # the name of the database" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 7, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"'postgresql://postgres:postgres@localhost:5432/earthquake'" | |
] | |
}, | |
"execution_count": 7, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"#Construct the connection string\n", | |
"DB_CONNECTION = '{engine}://{user}:{pwd}@{addr}/{name}'.format(engine=DB_ENGINE,\n", | |
" user=DB_USER,\n", | |
" pwd=DB_PWD,\n", | |
" addr=DB_ADDR,\n", | |
" name=DB_NAME)\n", | |
"\n", | |
"#Preview the connection string\n", | |
"DB_CONNECTION" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 4, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"'Connected: postgres@earthquake'" | |
] | |
}, | |
"execution_count": 4, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"%sql $DB_CONNECTION" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 49, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
" * postgresql://postgres:***@localhost:5432/earthquake\n", | |
"8 rows affected.\n" | |
] | |
}, | |
{ | |
"data": { | |
"text/html": [ | |
"<table>\n", | |
" <tr>\n", | |
" <th>earthquake_id</th>\n", | |
" <th>occurred_on</th>\n", | |
" <th>latitude</th>\n", | |
" <th>longitude</th>\n", | |
" <th>depth</th>\n", | |
" <th>magnitude</th>\n", | |
" <th>calculation_method</th>\n", | |
" <th>network_id</th>\n", | |
" <th>place</th>\n", | |
" <th>cause</th>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>1</td>\n", | |
" <td>1969-01-01 09:07:06</td>\n", | |
" <td>51.096</td>\n", | |
" <td>-179.392</td>\n", | |
" <td>45</td>\n", | |
" <td>5.6</td>\n", | |
" <td>mw</td>\n", | |
" <td>iscgem812771</td>\n", | |
" <td>Andreanof Islands, Aleutian Islands, Alaska</td>\n", | |
" <td>earthquake</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>2</td>\n", | |
" <td>1969-01-02 17:50:48</td>\n", | |
" <td>-56.096</td>\n", | |
" <td>-27.842</td>\n", | |
" <td>80.1</td>\n", | |
" <td>6</td>\n", | |
" <td>mw</td>\n", | |
" <td>iscgemsup812819</td>\n", | |
" <td>South Sandwich Islands region</td>\n", | |
" <td>earthquake</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>3</td>\n", | |
" <td>1969-01-03 03:16:40</td>\n", | |
" <td>37.14</td>\n", | |
" <td>57.899</td>\n", | |
" <td>10</td>\n", | |
" <td>5.5</td>\n", | |
" <td>mw</td>\n", | |
" <td>iscgem812826</td>\n", | |
" <td>Turkmenistan-Iran border region</td>\n", | |
" <td>earthquake</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>4</td>\n", | |
" <td>1969-01-03 13:28:12</td>\n", | |
" <td>51.132</td>\n", | |
" <td>-179.306</td>\n", | |
" <td>15</td>\n", | |
" <td>5.9</td>\n", | |
" <td>mw</td>\n", | |
" <td>iscgem812841</td>\n", | |
" <td>Andreanof Islands, Aleutian Islands, Alaska</td>\n", | |
" <td>earthquake</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>5</td>\n", | |
" <td>1969-01-04 22:36:48</td>\n", | |
" <td>-6.85</td>\n", | |
" <td>129.821</td>\n", | |
" <td>105</td>\n", | |
" <td>5.8</td>\n", | |
" <td>mw</td>\n", | |
" <td>iscgem812879</td>\n", | |
" <td>Banda Sea</td>\n", | |
" <td>earthquake</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>6</td>\n", | |
" <td>1969-01-05 13:26:43</td>\n", | |
" <td>-7.97</td>\n", | |
" <td>158.888</td>\n", | |
" <td>60</td>\n", | |
" <td>7</td>\n", | |
" <td>mw</td>\n", | |
" <td>iscgem812898</td>\n", | |
" <td>Solomon Islands</td>\n", | |
" <td>earthquake</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>7</td>\n", | |
" <td>1969-01-06 15:39:00</td>\n", | |
" <td>-10.407</td>\n", | |
" <td>164.479</td>\n", | |
" <td>10</td>\n", | |
" <td>6.9</td>\n", | |
" <td>mw</td>\n", | |
" <td>iscgem812954</td>\n", | |
" <td>Santa Cruz Islands region</td>\n", | |
" <td>earthquake</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>8</td>\n", | |
" <td>1969-01-08 21:55:51</td>\n", | |
" <td>11.823</td>\n", | |
" <td>143.419</td>\n", | |
" <td>45</td>\n", | |
" <td>5.6</td>\n", | |
" <td>mw</td>\n", | |
" <td>iscgem813024</td>\n", | |
" <td>south of the Mariana Islands</td>\n", | |
" <td>earthquake</td>\n", | |
" </tr>\n", | |
"</table>" | |
], | |
"text/plain": [ | |
"[(1, datetime.datetime(1969, 1, 1, 9, 7, 6), Decimal('51.096'), Decimal('-179.392'), Decimal('45'), Decimal('5.6'), 'mw', 'iscgem812771', 'Andreanof Islands, Aleutian Islands, Alaska', 'earthquake'),\n", | |
" (2, datetime.datetime(1969, 1, 2, 17, 50, 48), Decimal('-56.096'), Decimal('-27.842'), Decimal('80.1'), Decimal('6'), 'mw', 'iscgemsup812819', 'South Sandwich Islands region', 'earthquake'),\n", | |
" (3, datetime.datetime(1969, 1, 3, 3, 16, 40), Decimal('37.14'), Decimal('57.899'), Decimal('10'), Decimal('5.5'), 'mw', 'iscgem812826', 'Turkmenistan-Iran border region', 'earthquake'),\n", | |
" (4, datetime.datetime(1969, 1, 3, 13, 28, 12), Decimal('51.132'), Decimal('-179.306'), Decimal('15'), Decimal('5.9'), 'mw', 'iscgem812841', 'Andreanof Islands, Aleutian Islands, Alaska', 'earthquake'),\n", | |
" (5, datetime.datetime(1969, 1, 4, 22, 36, 48), Decimal('-6.85'), Decimal('129.821'), Decimal('105'), Decimal('5.8'), 'mw', 'iscgem812879', 'Banda Sea', 'earthquake'),\n", | |
" (6, datetime.datetime(1969, 1, 5, 13, 26, 43), Decimal('-7.97'), Decimal('158.888'), Decimal('60'), Decimal('7'), 'mw', 'iscgem812898', 'Solomon Islands', 'earthquake'),\n", | |
" (7, datetime.datetime(1969, 1, 6, 15, 39), Decimal('-10.407'), Decimal('164.479'), Decimal('10'), Decimal('6.9'), 'mw', 'iscgem812954', 'Santa Cruz Islands region', 'earthquake'),\n", | |
" (8, datetime.datetime(1969, 1, 8, 21, 55, 51), Decimal('11.823'), Decimal('143.419'), Decimal('45'), Decimal('5.6'), 'mw', 'iscgem813024', 'south of the Mariana Islands', 'earthquake')]" | |
] | |
}, | |
"execution_count": 49, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"%%sql\n", | |
"SELECT * FROM earthquake\n", | |
"LIMIT 8;" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 50, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
" * postgresql://postgres:***@localhost:5432/earthquake\n", | |
"1 rows affected.\n" | |
] | |
}, | |
{ | |
"data": { | |
"text/html": [ | |
"<table>\n", | |
" <tr>\n", | |
" <th>count</th>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>23119</td>\n", | |
" </tr>\n", | |
"</table>" | |
], | |
"text/plain": [ | |
"[(23119,)]" | |
] | |
}, | |
"execution_count": 50, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"%%sql\n", | |
"SELECT COUNT(*) FROM earthquake;" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 51, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
" * postgresql://postgres:***@localhost:5432/earthquake\n", | |
"8 rows affected.\n" | |
] | |
}, | |
{ | |
"data": { | |
"text/html": [ | |
"<table>\n", | |
" <tr>\n", | |
" <th>magnitude</th>\n", | |
" <th>place</th>\n", | |
" <th>occurred_on</th>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>5.6</td>\n", | |
" <td>Andreanof Islands, Aleutian Islands, Alaska</td>\n", | |
" <td>1969-01-01 09:07:06</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>6</td>\n", | |
" <td>South Sandwich Islands region</td>\n", | |
" <td>1969-01-02 17:50:48</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>5.5</td>\n", | |
" <td>Turkmenistan-Iran border region</td>\n", | |
" <td>1969-01-03 03:16:40</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>5.9</td>\n", | |
" <td>Andreanof Islands, Aleutian Islands, Alaska</td>\n", | |
" <td>1969-01-03 13:28:12</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>5.8</td>\n", | |
" <td>Banda Sea</td>\n", | |
" <td>1969-01-04 22:36:48</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>7</td>\n", | |
" <td>Solomon Islands</td>\n", | |
" <td>1969-01-05 13:26:43</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>6.9</td>\n", | |
" <td>Santa Cruz Islands region</td>\n", | |
" <td>1969-01-06 15:39:00</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>5.6</td>\n", | |
" <td>south of the Mariana Islands</td>\n", | |
" <td>1969-01-08 21:55:51</td>\n", | |
" </tr>\n", | |
"</table>" | |
], | |
"text/plain": [ | |
"[(Decimal('5.6'), 'Andreanof Islands, Aleutian Islands, Alaska', datetime.datetime(1969, 1, 1, 9, 7, 6)),\n", | |
" (Decimal('6'), 'South Sandwich Islands region', datetime.datetime(1969, 1, 2, 17, 50, 48)),\n", | |
" (Decimal('5.5'), 'Turkmenistan-Iran border region', datetime.datetime(1969, 1, 3, 3, 16, 40)),\n", | |
" (Decimal('5.9'), 'Andreanof Islands, Aleutian Islands, Alaska', datetime.datetime(1969, 1, 3, 13, 28, 12)),\n", | |
" (Decimal('5.8'), 'Banda Sea', datetime.datetime(1969, 1, 4, 22, 36, 48)),\n", | |
" (Decimal('7'), 'Solomon Islands', datetime.datetime(1969, 1, 5, 13, 26, 43)),\n", | |
" (Decimal('6.9'), 'Santa Cruz Islands region', datetime.datetime(1969, 1, 6, 15, 39)),\n", | |
" (Decimal('5.6'), 'south of the Mariana Islands', datetime.datetime(1969, 1, 8, 21, 55, 51))]" | |
] | |
}, | |
"execution_count": 51, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"%%sql\n", | |
"SELECT magnitude, place, occurred_on FROM earthquake\n", | |
"LIMIT 8;" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# Using a condition for WHERE" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 52, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
" * postgresql://postgres:***@localhost:5432/earthquake\n", | |
"1 rows affected.\n" | |
] | |
}, | |
{ | |
"data": { | |
"text/html": [ | |
"<table>\n", | |
" <tr>\n", | |
" <th>earthquake_id</th>\n", | |
" <th>occurred_on</th>\n", | |
" <th>latitude</th>\n", | |
" <th>longitude</th>\n", | |
" <th>depth</th>\n", | |
" <th>magnitude</th>\n", | |
" <th>calculation_method</th>\n", | |
" <th>network_id</th>\n", | |
" <th>place</th>\n", | |
" <th>cause</th>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>18756</td>\n", | |
" <td>2010-02-27 06:34:12</td>\n", | |
" <td>-36.122</td>\n", | |
" <td>-72.898</td>\n", | |
" <td>22.9</td>\n", | |
" <td>8.8</td>\n", | |
" <td>mww</td>\n", | |
" <td>official20100227063411530_30</td>\n", | |
" <td>offshore Bio-Bio, Chile</td>\n", | |
" <td>earthquake</td>\n", | |
" </tr>\n", | |
"</table>" | |
], | |
"text/plain": [ | |
"[(18756, datetime.datetime(2010, 2, 27, 6, 34, 12), Decimal('-36.122'), Decimal('-72.898'), Decimal('22.9'), Decimal('8.8'), 'mww', 'official20100227063411530_30', 'offshore Bio-Bio, Chile', 'earthquake')]" | |
] | |
}, | |
"execution_count": 52, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"%%sql\n", | |
"SELECT * \n", | |
"FROM earthquake\n", | |
"WHERE occurred_on >= '2010-1-1' AND occurred_on <= '2010-12-31'\n", | |
"ORDER BY magnitude DESC -- The DESC means descending order \n", | |
"LIMIT 1;" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# Aggregation " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 53, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
" * postgresql://postgres:***@localhost:5432/earthquake\n", | |
"1 rows affected.\n" | |
] | |
}, | |
{ | |
"data": { | |
"text/html": [ | |
"<table>\n", | |
" <tr>\n", | |
" <th>min</th>\n", | |
" <th>max</th>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>1969-01-01 09:07:06</td>\n", | |
" <td>2018-12-31 02:35:38</td>\n", | |
" </tr>\n", | |
"</table>" | |
], | |
"text/plain": [ | |
"[(datetime.datetime(1969, 1, 1, 9, 7, 6), datetime.datetime(2018, 12, 31, 2, 35, 38))]" | |
] | |
}, | |
"execution_count": 53, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"%%sql\n", | |
"SELECT MIN(occurred_on), MAX(occurred_on) \n", | |
"FROM earthquake;" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 54, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
" * postgresql://postgres:***@localhost:5432/earthquake\n", | |
"1 rows affected.\n" | |
] | |
}, | |
{ | |
"data": { | |
"text/html": [ | |
"<table>\n", | |
" <tr>\n", | |
" <th>min</th>\n", | |
" <th>max</th>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>5.5</td>\n", | |
" <td>9.1</td>\n", | |
" </tr>\n", | |
"</table>" | |
], | |
"text/plain": [ | |
"[(Decimal('5.5'), Decimal('9.1'))]" | |
] | |
}, | |
"execution_count": 54, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"%%sql\n", | |
"SELECT MIN(magnitude), MAX(magnitude) \n", | |
"FROM earthquake;" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 83, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
" * postgresql://postgres:***@localhost:5432/earthquake\n", | |
"1 rows affected.\n" | |
] | |
}, | |
{ | |
"data": { | |
"text/html": [ | |
"<table>\n", | |
" <tr>\n", | |
" <th>avg</th>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>5.8744322851334400</td>\n", | |
" </tr>\n", | |
"</table>" | |
], | |
"text/plain": [ | |
"[(Decimal('5.8744322851334400'),)]" | |
] | |
}, | |
"execution_count": 83, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"%%sql\n", | |
"SELECT AVG(magnitude)\n", | |
"FROM earthquake;" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# The DISTINCT keyword\n", | |
"# Show the distinct values for cause" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 56, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
" * postgresql://postgres:***@localhost:5432/earthquake\n", | |
"3 rows affected.\n" | |
] | |
}, | |
{ | |
"data": { | |
"text/html": [ | |
"<table>\n", | |
" <tr>\n", | |
" <th>cause</th>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>explosion</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>nuclear explosion</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>earthquake</td>\n", | |
" </tr>\n", | |
"</table>" | |
], | |
"text/plain": [ | |
"[('explosion',), ('nuclear explosion',), ('earthquake',)]" | |
] | |
}, | |
"execution_count": 56, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"%%sql\n", | |
"SELECT DISTINCT cause\n", | |
"FROM earthquake;" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# Counting the entries for these causes" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 57, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
" * postgresql://postgres:***@localhost:5432/earthquake\n", | |
"1 rows affected.\n" | |
] | |
}, | |
{ | |
"data": { | |
"text/html": [ | |
"<table>\n", | |
" <tr>\n", | |
" <th>count</th>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>22942</td>\n", | |
" </tr>\n", | |
"</table>" | |
], | |
"text/plain": [ | |
"[(22942,)]" | |
] | |
}, | |
"execution_count": 57, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"%%sql\n", | |
"SELECT COUNT(*)\n", | |
"FROM earthquake \n", | |
"WHERE cause = 'earthquake';" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 58, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
" * postgresql://postgres:***@localhost:5432/earthquake\n", | |
"1 rows affected.\n" | |
] | |
}, | |
{ | |
"data": { | |
"text/html": [ | |
"<table>\n", | |
" <tr>\n", | |
" <th>count</th>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>173</td>\n", | |
" </tr>\n", | |
"</table>" | |
], | |
"text/plain": [ | |
"[(173,)]" | |
] | |
}, | |
"execution_count": 58, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"%%sql\n", | |
"SELECT COUNT(*)\n", | |
"FROM earthquake \n", | |
"WHERE cause = 'nuclear explosion';" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# Count the number of earthquakes that have explosion as the cause" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 59, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
" * postgresql://postgres:***@localhost:5432/earthquake\n", | |
"1 rows affected.\n" | |
] | |
}, | |
{ | |
"data": { | |
"text/html": [ | |
"<table>\n", | |
" <tr>\n", | |
" <th>count</th>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>4</td>\n", | |
" </tr>\n", | |
"</table>" | |
], | |
"text/plain": [ | |
"[(4,)]" | |
] | |
}, | |
"execution_count": 59, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"%%sql\n", | |
"SELECT COUNT(*)\n", | |
"FROM earthquake \n", | |
"WHERE cause = 'explosion';" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# Show the 4 rows with explosion as cause" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 60, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
" * postgresql://postgres:***@localhost:5432/earthquake\n", | |
"4 rows affected.\n" | |
] | |
}, | |
{ | |
"data": { | |
"text/html": [ | |
"<table>\n", | |
" <tr>\n", | |
" <th>earthquake_id</th>\n", | |
" <th>occurred_on</th>\n", | |
" <th>latitude</th>\n", | |
" <th>longitude</th>\n", | |
" <th>depth</th>\n", | |
" <th>magnitude</th>\n", | |
" <th>calculation_method</th>\n", | |
" <th>network_id</th>\n", | |
" <th>place</th>\n", | |
" <th>cause</th>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>1483</td>\n", | |
" <td>1973-02-16 05:02:58</td>\n", | |
" <td>49.835</td>\n", | |
" <td>78.232</td>\n", | |
" <td>0</td>\n", | |
" <td>5.6</td>\n", | |
" <td>mb</td>\n", | |
" <td>usp00000jc</td>\n", | |
" <td>eastern Kazakhstan</td>\n", | |
" <td>explosion</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>2461</td>\n", | |
" <td>1975-08-23 08:59:58</td>\n", | |
" <td>73.369</td>\n", | |
" <td>54.641</td>\n", | |
" <td>0</td>\n", | |
" <td>6.4</td>\n", | |
" <td>mb</td>\n", | |
" <td>usp0000cjx</td>\n", | |
" <td>Novaya Zemlya, Russia</td>\n", | |
" <td>explosion</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>6107</td>\n", | |
" <td>1984-11-02 20:44:59</td>\n", | |
" <td>-21.883</td>\n", | |
" <td>-138.994</td>\n", | |
" <td>0</td>\n", | |
" <td>5.7</td>\n", | |
" <td>mb</td>\n", | |
" <td>usp00028r7</td>\n", | |
" <td>Tuamotu Archipelago, French Polynesia region</td>\n", | |
" <td>explosion</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>6380</td>\n", | |
" <td>1985-05-08 20:27:59</td>\n", | |
" <td>-21.823</td>\n", | |
" <td>-139.048</td>\n", | |
" <td>0</td>\n", | |
" <td>5.7</td>\n", | |
" <td>mb</td>\n", | |
" <td>usp0002ejj</td>\n", | |
" <td>Tuamotu Archipelago, French Polynesia region</td>\n", | |
" <td>explosion</td>\n", | |
" </tr>\n", | |
"</table>" | |
], | |
"text/plain": [ | |
"[(1483, datetime.datetime(1973, 2, 16, 5, 2, 58), Decimal('49.835'), Decimal('78.232'), Decimal('0'), Decimal('5.6'), 'mb', 'usp00000jc', 'eastern Kazakhstan', 'explosion'),\n", | |
" (2461, datetime.datetime(1975, 8, 23, 8, 59, 58), Decimal('73.369'), Decimal('54.641'), Decimal('0'), Decimal('6.4'), 'mb', 'usp0000cjx', 'Novaya Zemlya, Russia', 'explosion'),\n", | |
" (6107, datetime.datetime(1984, 11, 2, 20, 44, 59), Decimal('-21.883'), Decimal('-138.994'), Decimal('0'), Decimal('5.7'), 'mb', 'usp00028r7', 'Tuamotu Archipelago, French Polynesia region', 'explosion'),\n", | |
" (6380, datetime.datetime(1985, 5, 8, 20, 27, 59), Decimal('-21.823'), Decimal('-139.048'), Decimal('0'), Decimal('5.7'), 'mb', 'usp0002ejj', 'Tuamotu Archipelago, French Polynesia region', 'explosion')]" | |
] | |
}, | |
"execution_count": 60, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"%%sql\n", | |
"SELECT *\n", | |
"FROM earthquake \n", | |
"WHERE cause = 'explosion';" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"The above table is actually nuclear explosions and not explosion. This is an example of messy data and needs to be cleaned. Evidence for the explosions being nuclear explosions<br>\n", | |
"<b>eastern Kazakhstan<b> https://en.wikipedia.org/wiki/1973_Soviet_nuclear_tests <br>\n", | |
"<b>Novaya Zemlya, Russia<b> https://en.wikipedia.org/wiki/1975_Soviet_nuclear_tests <br>\n", | |
"<b>Tuamotu Archipelago, French Polynesia region, 1984 and 1985<b> https://en.wikipedia.org/wiki/1983%E2%80%9385_French_nuclear_tests <br>\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# The DESC keyword and the 10 strongest earthquakes for the dataset\n", | |
"`DESC` arranges the column in numerical <b>descending<b> order" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 69, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
" * postgresql://postgres:***@localhost:5432/earthquake\n", | |
"10 rows affected.\n" | |
] | |
}, | |
{ | |
"data": { | |
"text/html": [ | |
"<table>\n", | |
" <tr>\n", | |
" <th>place</th>\n", | |
" <th>magnitude</th>\n", | |
" <th>occurred_on</th>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>off the west coast of northern Sumatra</td>\n", | |
" <td>9.1</td>\n", | |
" <td>2004-12-26 00:58:53</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>near the east coast of Honshu, Japan</td>\n", | |
" <td>9.1</td>\n", | |
" <td>2011-03-11 05:46:24</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>offshore Bio-Bio, Chile</td>\n", | |
" <td>8.8</td>\n", | |
" <td>2010-02-27 06:34:12</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>off the west coast of northern Sumatra</td>\n", | |
" <td>8.6</td>\n", | |
" <td>2012-04-11 08:38:37</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>northern Sumatra, Indonesia</td>\n", | |
" <td>8.6</td>\n", | |
" <td>2005-03-28 16:09:37</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>near the coast of southern Peru</td>\n", | |
" <td>8.4</td>\n", | |
" <td>2001-06-23 20:33:14</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>southern Sumatra, Indonesia</td>\n", | |
" <td>8.4</td>\n", | |
" <td>2007-09-12 11:10:27</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>south of Sumbawa, Indonesia</td>\n", | |
" <td>8.3</td>\n", | |
" <td>1977-08-19 06:08:55</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>Kuril Islands</td>\n", | |
" <td>8.3</td>\n", | |
" <td>2006-11-15 11:14:14</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>Hokkaido, Japan region</td>\n", | |
" <td>8.3</td>\n", | |
" <td>2003-09-25 19:50:06</td>\n", | |
" </tr>\n", | |
"</table>" | |
], | |
"text/plain": [ | |
"[('off the west coast of northern Sumatra', Decimal('9.1'), datetime.datetime(2004, 12, 26, 0, 58, 53)),\n", | |
" ('near the east coast of Honshu, Japan', Decimal('9.1'), datetime.datetime(2011, 3, 11, 5, 46, 24)),\n", | |
" ('offshore Bio-Bio, Chile', Decimal('8.8'), datetime.datetime(2010, 2, 27, 6, 34, 12)),\n", | |
" ('off the west coast of northern Sumatra', Decimal('8.6'), datetime.datetime(2012, 4, 11, 8, 38, 37)),\n", | |
" ('northern Sumatra, Indonesia', Decimal('8.6'), datetime.datetime(2005, 3, 28, 16, 9, 37)),\n", | |
" ('near the coast of southern Peru', Decimal('8.4'), datetime.datetime(2001, 6, 23, 20, 33, 14)),\n", | |
" ('southern Sumatra, Indonesia', Decimal('8.4'), datetime.datetime(2007, 9, 12, 11, 10, 27)),\n", | |
" ('south of Sumbawa, Indonesia', Decimal('8.3'), datetime.datetime(1977, 8, 19, 6, 8, 55)),\n", | |
" ('Kuril Islands', Decimal('8.3'), datetime.datetime(2006, 11, 15, 11, 14, 14)),\n", | |
" ('Hokkaido, Japan region', Decimal('8.3'), datetime.datetime(2003, 9, 25, 19, 50, 6))]" | |
] | |
}, | |
"execution_count": 69, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"%%sql \n", | |
"SELECT place, magnitude, occurred_on \n", | |
"FROM earthquake\n", | |
"ORDER BY magnitude DESC\n", | |
"LIMIT 10;" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# The LIKE and BETWEEN keywords\n", | |
"# How many aftershocks were there after the <I>'near the east coast of Honshu, Japan'<i> earthquake with a magnitude of 6 or greater? " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 9, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
" * postgresql://postgres:***@localhost:5432/earthquake\n", | |
"1 rows affected.\n" | |
] | |
}, | |
{ | |
"data": { | |
"text/html": [ | |
"<table>\n", | |
" <tr>\n", | |
" <th>Aftershocks</th>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>51</td>\n", | |
" </tr>\n", | |
"</table>" | |
], | |
"text/plain": [ | |
"[(51,)]" | |
] | |
}, | |
"execution_count": 9, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"%%sql\n", | |
"SELECT COUNT(*) AS \"Aftershocks\"\n", | |
"FROM earthquake\n", | |
"WHERE place LIKE '%Honshu%Japan%'\n", | |
" AND occurred_on BETWEEN '2011-03-11' AND '2011-03-18'\n", | |
" AND magnitude >= 6;" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# Notice i used the AS keyword to rename the column header from count to Aftershocks" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"[View Full Report](https://www.sciencetony.com/2019/07/nuclear-test-map-for-1969-to-2017.html)<br>\n", | |
"[Sciencetony.com](https://www.sciencetony.com/)<br>\n", | |
"[View My LinkedIn CV](https://www.linkedin.com/in/tony-mcdonald-sciencetony/)<br>\n", | |
"[My Twitter](https://twitter.com/ScienceTony)<br>\n", | |
"[My YouTube](https://www.youtube.com/Dreamazium)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": {}, | |
"outputs": [], | |
"source": [] | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"display_name": "Python 3", | |
"language": "python", | |
"name": "python3" | |
}, | |
"language_info": { | |
"codemirror_mode": { | |
"name": "ipython", | |
"version": 3 | |
}, | |
"file_extension": ".py", | |
"mimetype": "text/x-python", | |
"name": "python", | |
"nbconvert_exporter": "python", | |
"pygments_lexer": "ipython3", | |
"version": "3.7.2" | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 2 | |
} |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment