Skip to content

Instantly share code, notes, and snippets.

Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save ScienceTony/32051bcd4a2789ecd622055d930000bb to your computer and use it in GitHub Desktop.
Save ScienceTony/32051bcd4a2789ecd622055d930000bb to your computer and use it in GitHub Desktop.
Display the source blob
Display the rendered blob
Raw
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Using SQL (postresql and magic) to examine earthquake dataset."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[Tony McDonald](https://twitter.com/ScienceTony) "
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'24-07-2019'"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from datetime import datetime\n",
"datetime.today().strftime('%d-%m-%Y')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This is taken from the Socratica SQL youtube video for [SELECT](https://www.youtube.com/watch?v=VkabxQgtGsA&list=PLi01XoE8jYojRqM4qGBF1U90Ee1Ecb5tt&index=7). I have also added in a bit more<br>\n",
"License for the dataset: MIT License with Copyright (c) 2019 Socratica https://github.com/socratica/data/blob/master/LICENSE\n",
"<br>\n",
"The license for the dataset allows me to manipulate and use the data. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"SELECT is for choosing <b>columns</b><br>\n",
"FROM is for choosing a <b>database</b> <br>\n",
"WHERE is for choosing the <b>rows</b><br>\n",
"ORDER BY is for <b>sorting</b> and sorts on a <i>specified column</i> "
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"# Load in the sql extensions:\n",
"%load_ext sql"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"#% sql sqlite://"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"postgres (PostgreSQL) 11.2\r\n"
]
}
],
"source": [
"! postgres -V"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## I used pgAdmin to prepare a database and import the csv file. The following code allows me to connect to the database. "
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"DB_ENGINE='postgresql' # tells ipython-sql that we will use postgreSQL as our database engine\n",
"DB_USER='postgres' # id of the user who is logging in\n",
"DB_PWD='postgres' # the user's password\n",
"DB_ADDR='localhost:5432' # the host and port on which the database engine is listening\n",
"DB_NAME='earthquake' # the name of the database"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'postgresql://postgres:postgres@localhost:5432/earthquake'"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#Construct the connection string\n",
"DB_CONNECTION = '{engine}://{user}:{pwd}@{addr}/{name}'.format(engine=DB_ENGINE,\n",
" user=DB_USER,\n",
" pwd=DB_PWD,\n",
" addr=DB_ADDR,\n",
" name=DB_NAME)\n",
"\n",
"#Preview the connection string\n",
"DB_CONNECTION"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Connected: postgres@earthquake'"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%sql $DB_CONNECTION"
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" * postgresql://postgres:***@localhost:5432/earthquake\n",
"8 rows affected.\n"
]
},
{
"data": {
"text/html": [
"<table>\n",
" <tr>\n",
" <th>earthquake_id</th>\n",
" <th>occurred_on</th>\n",
" <th>latitude</th>\n",
" <th>longitude</th>\n",
" <th>depth</th>\n",
" <th>magnitude</th>\n",
" <th>calculation_method</th>\n",
" <th>network_id</th>\n",
" <th>place</th>\n",
" <th>cause</th>\n",
" </tr>\n",
" <tr>\n",
" <td>1</td>\n",
" <td>1969-01-01 09:07:06</td>\n",
" <td>51.096</td>\n",
" <td>-179.392</td>\n",
" <td>45</td>\n",
" <td>5.6</td>\n",
" <td>mw</td>\n",
" <td>iscgem812771</td>\n",
" <td>Andreanof Islands, Aleutian Islands, Alaska</td>\n",
" <td>earthquake</td>\n",
" </tr>\n",
" <tr>\n",
" <td>2</td>\n",
" <td>1969-01-02 17:50:48</td>\n",
" <td>-56.096</td>\n",
" <td>-27.842</td>\n",
" <td>80.1</td>\n",
" <td>6</td>\n",
" <td>mw</td>\n",
" <td>iscgemsup812819</td>\n",
" <td>South Sandwich Islands region</td>\n",
" <td>earthquake</td>\n",
" </tr>\n",
" <tr>\n",
" <td>3</td>\n",
" <td>1969-01-03 03:16:40</td>\n",
" <td>37.14</td>\n",
" <td>57.899</td>\n",
" <td>10</td>\n",
" <td>5.5</td>\n",
" <td>mw</td>\n",
" <td>iscgem812826</td>\n",
" <td>Turkmenistan-Iran border region</td>\n",
" <td>earthquake</td>\n",
" </tr>\n",
" <tr>\n",
" <td>4</td>\n",
" <td>1969-01-03 13:28:12</td>\n",
" <td>51.132</td>\n",
" <td>-179.306</td>\n",
" <td>15</td>\n",
" <td>5.9</td>\n",
" <td>mw</td>\n",
" <td>iscgem812841</td>\n",
" <td>Andreanof Islands, Aleutian Islands, Alaska</td>\n",
" <td>earthquake</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5</td>\n",
" <td>1969-01-04 22:36:48</td>\n",
" <td>-6.85</td>\n",
" <td>129.821</td>\n",
" <td>105</td>\n",
" <td>5.8</td>\n",
" <td>mw</td>\n",
" <td>iscgem812879</td>\n",
" <td>Banda Sea</td>\n",
" <td>earthquake</td>\n",
" </tr>\n",
" <tr>\n",
" <td>6</td>\n",
" <td>1969-01-05 13:26:43</td>\n",
" <td>-7.97</td>\n",
" <td>158.888</td>\n",
" <td>60</td>\n",
" <td>7</td>\n",
" <td>mw</td>\n",
" <td>iscgem812898</td>\n",
" <td>Solomon Islands</td>\n",
" <td>earthquake</td>\n",
" </tr>\n",
" <tr>\n",
" <td>7</td>\n",
" <td>1969-01-06 15:39:00</td>\n",
" <td>-10.407</td>\n",
" <td>164.479</td>\n",
" <td>10</td>\n",
" <td>6.9</td>\n",
" <td>mw</td>\n",
" <td>iscgem812954</td>\n",
" <td>Santa Cruz Islands region</td>\n",
" <td>earthquake</td>\n",
" </tr>\n",
" <tr>\n",
" <td>8</td>\n",
" <td>1969-01-08 21:55:51</td>\n",
" <td>11.823</td>\n",
" <td>143.419</td>\n",
" <td>45</td>\n",
" <td>5.6</td>\n",
" <td>mw</td>\n",
" <td>iscgem813024</td>\n",
" <td>south of the Mariana Islands</td>\n",
" <td>earthquake</td>\n",
" </tr>\n",
"</table>"
],
"text/plain": [
"[(1, datetime.datetime(1969, 1, 1, 9, 7, 6), Decimal('51.096'), Decimal('-179.392'), Decimal('45'), Decimal('5.6'), 'mw', 'iscgem812771', 'Andreanof Islands, Aleutian Islands, Alaska', 'earthquake'),\n",
" (2, datetime.datetime(1969, 1, 2, 17, 50, 48), Decimal('-56.096'), Decimal('-27.842'), Decimal('80.1'), Decimal('6'), 'mw', 'iscgemsup812819', 'South Sandwich Islands region', 'earthquake'),\n",
" (3, datetime.datetime(1969, 1, 3, 3, 16, 40), Decimal('37.14'), Decimal('57.899'), Decimal('10'), Decimal('5.5'), 'mw', 'iscgem812826', 'Turkmenistan-Iran border region', 'earthquake'),\n",
" (4, datetime.datetime(1969, 1, 3, 13, 28, 12), Decimal('51.132'), Decimal('-179.306'), Decimal('15'), Decimal('5.9'), 'mw', 'iscgem812841', 'Andreanof Islands, Aleutian Islands, Alaska', 'earthquake'),\n",
" (5, datetime.datetime(1969, 1, 4, 22, 36, 48), Decimal('-6.85'), Decimal('129.821'), Decimal('105'), Decimal('5.8'), 'mw', 'iscgem812879', 'Banda Sea', 'earthquake'),\n",
" (6, datetime.datetime(1969, 1, 5, 13, 26, 43), Decimal('-7.97'), Decimal('158.888'), Decimal('60'), Decimal('7'), 'mw', 'iscgem812898', 'Solomon Islands', 'earthquake'),\n",
" (7, datetime.datetime(1969, 1, 6, 15, 39), Decimal('-10.407'), Decimal('164.479'), Decimal('10'), Decimal('6.9'), 'mw', 'iscgem812954', 'Santa Cruz Islands region', 'earthquake'),\n",
" (8, datetime.datetime(1969, 1, 8, 21, 55, 51), Decimal('11.823'), Decimal('143.419'), Decimal('45'), Decimal('5.6'), 'mw', 'iscgem813024', 'south of the Mariana Islands', 'earthquake')]"
]
},
"execution_count": 49,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"SELECT * FROM earthquake\n",
"LIMIT 8;"
]
},
{
"cell_type": "code",
"execution_count": 50,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" * postgresql://postgres:***@localhost:5432/earthquake\n",
"1 rows affected.\n"
]
},
{
"data": {
"text/html": [
"<table>\n",
" <tr>\n",
" <th>count</th>\n",
" </tr>\n",
" <tr>\n",
" <td>23119</td>\n",
" </tr>\n",
"</table>"
],
"text/plain": [
"[(23119,)]"
]
},
"execution_count": 50,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"SELECT COUNT(*) FROM earthquake;"
]
},
{
"cell_type": "code",
"execution_count": 51,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" * postgresql://postgres:***@localhost:5432/earthquake\n",
"8 rows affected.\n"
]
},
{
"data": {
"text/html": [
"<table>\n",
" <tr>\n",
" <th>magnitude</th>\n",
" <th>place</th>\n",
" <th>occurred_on</th>\n",
" </tr>\n",
" <tr>\n",
" <td>5.6</td>\n",
" <td>Andreanof Islands, Aleutian Islands, Alaska</td>\n",
" <td>1969-01-01 09:07:06</td>\n",
" </tr>\n",
" <tr>\n",
" <td>6</td>\n",
" <td>South Sandwich Islands region</td>\n",
" <td>1969-01-02 17:50:48</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5.5</td>\n",
" <td>Turkmenistan-Iran border region</td>\n",
" <td>1969-01-03 03:16:40</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5.9</td>\n",
" <td>Andreanof Islands, Aleutian Islands, Alaska</td>\n",
" <td>1969-01-03 13:28:12</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5.8</td>\n",
" <td>Banda Sea</td>\n",
" <td>1969-01-04 22:36:48</td>\n",
" </tr>\n",
" <tr>\n",
" <td>7</td>\n",
" <td>Solomon Islands</td>\n",
" <td>1969-01-05 13:26:43</td>\n",
" </tr>\n",
" <tr>\n",
" <td>6.9</td>\n",
" <td>Santa Cruz Islands region</td>\n",
" <td>1969-01-06 15:39:00</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5.6</td>\n",
" <td>south of the Mariana Islands</td>\n",
" <td>1969-01-08 21:55:51</td>\n",
" </tr>\n",
"</table>"
],
"text/plain": [
"[(Decimal('5.6'), 'Andreanof Islands, Aleutian Islands, Alaska', datetime.datetime(1969, 1, 1, 9, 7, 6)),\n",
" (Decimal('6'), 'South Sandwich Islands region', datetime.datetime(1969, 1, 2, 17, 50, 48)),\n",
" (Decimal('5.5'), 'Turkmenistan-Iran border region', datetime.datetime(1969, 1, 3, 3, 16, 40)),\n",
" (Decimal('5.9'), 'Andreanof Islands, Aleutian Islands, Alaska', datetime.datetime(1969, 1, 3, 13, 28, 12)),\n",
" (Decimal('5.8'), 'Banda Sea', datetime.datetime(1969, 1, 4, 22, 36, 48)),\n",
" (Decimal('7'), 'Solomon Islands', datetime.datetime(1969, 1, 5, 13, 26, 43)),\n",
" (Decimal('6.9'), 'Santa Cruz Islands region', datetime.datetime(1969, 1, 6, 15, 39)),\n",
" (Decimal('5.6'), 'south of the Mariana Islands', datetime.datetime(1969, 1, 8, 21, 55, 51))]"
]
},
"execution_count": 51,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"SELECT magnitude, place, occurred_on FROM earthquake\n",
"LIMIT 8;"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Using a condition for WHERE"
]
},
{
"cell_type": "code",
"execution_count": 52,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" * postgresql://postgres:***@localhost:5432/earthquake\n",
"1 rows affected.\n"
]
},
{
"data": {
"text/html": [
"<table>\n",
" <tr>\n",
" <th>earthquake_id</th>\n",
" <th>occurred_on</th>\n",
" <th>latitude</th>\n",
" <th>longitude</th>\n",
" <th>depth</th>\n",
" <th>magnitude</th>\n",
" <th>calculation_method</th>\n",
" <th>network_id</th>\n",
" <th>place</th>\n",
" <th>cause</th>\n",
" </tr>\n",
" <tr>\n",
" <td>18756</td>\n",
" <td>2010-02-27 06:34:12</td>\n",
" <td>-36.122</td>\n",
" <td>-72.898</td>\n",
" <td>22.9</td>\n",
" <td>8.8</td>\n",
" <td>mww</td>\n",
" <td>official20100227063411530_30</td>\n",
" <td>offshore Bio-Bio, Chile</td>\n",
" <td>earthquake</td>\n",
" </tr>\n",
"</table>"
],
"text/plain": [
"[(18756, datetime.datetime(2010, 2, 27, 6, 34, 12), Decimal('-36.122'), Decimal('-72.898'), Decimal('22.9'), Decimal('8.8'), 'mww', 'official20100227063411530_30', 'offshore Bio-Bio, Chile', 'earthquake')]"
]
},
"execution_count": 52,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"SELECT * \n",
"FROM earthquake\n",
"WHERE occurred_on >= '2010-1-1' AND occurred_on <= '2010-12-31'\n",
"ORDER BY magnitude DESC -- The DESC means descending order \n",
"LIMIT 1;"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Aggregation "
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" * postgresql://postgres:***@localhost:5432/earthquake\n",
"1 rows affected.\n"
]
},
{
"data": {
"text/html": [
"<table>\n",
" <tr>\n",
" <th>min</th>\n",
" <th>max</th>\n",
" </tr>\n",
" <tr>\n",
" <td>1969-01-01 09:07:06</td>\n",
" <td>2018-12-31 02:35:38</td>\n",
" </tr>\n",
"</table>"
],
"text/plain": [
"[(datetime.datetime(1969, 1, 1, 9, 7, 6), datetime.datetime(2018, 12, 31, 2, 35, 38))]"
]
},
"execution_count": 53,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"SELECT MIN(occurred_on), MAX(occurred_on) \n",
"FROM earthquake;"
]
},
{
"cell_type": "code",
"execution_count": 54,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" * postgresql://postgres:***@localhost:5432/earthquake\n",
"1 rows affected.\n"
]
},
{
"data": {
"text/html": [
"<table>\n",
" <tr>\n",
" <th>min</th>\n",
" <th>max</th>\n",
" </tr>\n",
" <tr>\n",
" <td>5.5</td>\n",
" <td>9.1</td>\n",
" </tr>\n",
"</table>"
],
"text/plain": [
"[(Decimal('5.5'), Decimal('9.1'))]"
]
},
"execution_count": 54,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"SELECT MIN(magnitude), MAX(magnitude) \n",
"FROM earthquake;"
]
},
{
"cell_type": "code",
"execution_count": 83,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" * postgresql://postgres:***@localhost:5432/earthquake\n",
"1 rows affected.\n"
]
},
{
"data": {
"text/html": [
"<table>\n",
" <tr>\n",
" <th>avg</th>\n",
" </tr>\n",
" <tr>\n",
" <td>5.8744322851334400</td>\n",
" </tr>\n",
"</table>"
],
"text/plain": [
"[(Decimal('5.8744322851334400'),)]"
]
},
"execution_count": 83,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"SELECT AVG(magnitude)\n",
"FROM earthquake;"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# The DISTINCT keyword\n",
"# Show the distinct values for cause"
]
},
{
"cell_type": "code",
"execution_count": 56,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" * postgresql://postgres:***@localhost:5432/earthquake\n",
"3 rows affected.\n"
]
},
{
"data": {
"text/html": [
"<table>\n",
" <tr>\n",
" <th>cause</th>\n",
" </tr>\n",
" <tr>\n",
" <td>explosion</td>\n",
" </tr>\n",
" <tr>\n",
" <td>nuclear explosion</td>\n",
" </tr>\n",
" <tr>\n",
" <td>earthquake</td>\n",
" </tr>\n",
"</table>"
],
"text/plain": [
"[('explosion',), ('nuclear explosion',), ('earthquake',)]"
]
},
"execution_count": 56,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"SELECT DISTINCT cause\n",
"FROM earthquake;"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Counting the entries for these causes"
]
},
{
"cell_type": "code",
"execution_count": 57,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" * postgresql://postgres:***@localhost:5432/earthquake\n",
"1 rows affected.\n"
]
},
{
"data": {
"text/html": [
"<table>\n",
" <tr>\n",
" <th>count</th>\n",
" </tr>\n",
" <tr>\n",
" <td>22942</td>\n",
" </tr>\n",
"</table>"
],
"text/plain": [
"[(22942,)]"
]
},
"execution_count": 57,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"SELECT COUNT(*)\n",
"FROM earthquake \n",
"WHERE cause = 'earthquake';"
]
},
{
"cell_type": "code",
"execution_count": 58,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" * postgresql://postgres:***@localhost:5432/earthquake\n",
"1 rows affected.\n"
]
},
{
"data": {
"text/html": [
"<table>\n",
" <tr>\n",
" <th>count</th>\n",
" </tr>\n",
" <tr>\n",
" <td>173</td>\n",
" </tr>\n",
"</table>"
],
"text/plain": [
"[(173,)]"
]
},
"execution_count": 58,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"SELECT COUNT(*)\n",
"FROM earthquake \n",
"WHERE cause = 'nuclear explosion';"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Count the number of earthquakes that have explosion as the cause"
]
},
{
"cell_type": "code",
"execution_count": 59,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" * postgresql://postgres:***@localhost:5432/earthquake\n",
"1 rows affected.\n"
]
},
{
"data": {
"text/html": [
"<table>\n",
" <tr>\n",
" <th>count</th>\n",
" </tr>\n",
" <tr>\n",
" <td>4</td>\n",
" </tr>\n",
"</table>"
],
"text/plain": [
"[(4,)]"
]
},
"execution_count": 59,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"SELECT COUNT(*)\n",
"FROM earthquake \n",
"WHERE cause = 'explosion';"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Show the 4 rows with explosion as cause"
]
},
{
"cell_type": "code",
"execution_count": 60,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" * postgresql://postgres:***@localhost:5432/earthquake\n",
"4 rows affected.\n"
]
},
{
"data": {
"text/html": [
"<table>\n",
" <tr>\n",
" <th>earthquake_id</th>\n",
" <th>occurred_on</th>\n",
" <th>latitude</th>\n",
" <th>longitude</th>\n",
" <th>depth</th>\n",
" <th>magnitude</th>\n",
" <th>calculation_method</th>\n",
" <th>network_id</th>\n",
" <th>place</th>\n",
" <th>cause</th>\n",
" </tr>\n",
" <tr>\n",
" <td>1483</td>\n",
" <td>1973-02-16 05:02:58</td>\n",
" <td>49.835</td>\n",
" <td>78.232</td>\n",
" <td>0</td>\n",
" <td>5.6</td>\n",
" <td>mb</td>\n",
" <td>usp00000jc</td>\n",
" <td>eastern Kazakhstan</td>\n",
" <td>explosion</td>\n",
" </tr>\n",
" <tr>\n",
" <td>2461</td>\n",
" <td>1975-08-23 08:59:58</td>\n",
" <td>73.369</td>\n",
" <td>54.641</td>\n",
" <td>0</td>\n",
" <td>6.4</td>\n",
" <td>mb</td>\n",
" <td>usp0000cjx</td>\n",
" <td>Novaya Zemlya, Russia</td>\n",
" <td>explosion</td>\n",
" </tr>\n",
" <tr>\n",
" <td>6107</td>\n",
" <td>1984-11-02 20:44:59</td>\n",
" <td>-21.883</td>\n",
" <td>-138.994</td>\n",
" <td>0</td>\n",
" <td>5.7</td>\n",
" <td>mb</td>\n",
" <td>usp00028r7</td>\n",
" <td>Tuamotu Archipelago, French Polynesia region</td>\n",
" <td>explosion</td>\n",
" </tr>\n",
" <tr>\n",
" <td>6380</td>\n",
" <td>1985-05-08 20:27:59</td>\n",
" <td>-21.823</td>\n",
" <td>-139.048</td>\n",
" <td>0</td>\n",
" <td>5.7</td>\n",
" <td>mb</td>\n",
" <td>usp0002ejj</td>\n",
" <td>Tuamotu Archipelago, French Polynesia region</td>\n",
" <td>explosion</td>\n",
" </tr>\n",
"</table>"
],
"text/plain": [
"[(1483, datetime.datetime(1973, 2, 16, 5, 2, 58), Decimal('49.835'), Decimal('78.232'), Decimal('0'), Decimal('5.6'), 'mb', 'usp00000jc', 'eastern Kazakhstan', 'explosion'),\n",
" (2461, datetime.datetime(1975, 8, 23, 8, 59, 58), Decimal('73.369'), Decimal('54.641'), Decimal('0'), Decimal('6.4'), 'mb', 'usp0000cjx', 'Novaya Zemlya, Russia', 'explosion'),\n",
" (6107, datetime.datetime(1984, 11, 2, 20, 44, 59), Decimal('-21.883'), Decimal('-138.994'), Decimal('0'), Decimal('5.7'), 'mb', 'usp00028r7', 'Tuamotu Archipelago, French Polynesia region', 'explosion'),\n",
" (6380, datetime.datetime(1985, 5, 8, 20, 27, 59), Decimal('-21.823'), Decimal('-139.048'), Decimal('0'), Decimal('5.7'), 'mb', 'usp0002ejj', 'Tuamotu Archipelago, French Polynesia region', 'explosion')]"
]
},
"execution_count": 60,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"SELECT *\n",
"FROM earthquake \n",
"WHERE cause = 'explosion';"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The above table is actually nuclear explosions and not explosion. This is an example of messy data and needs to be cleaned. Evidence for the explosions being nuclear explosions<br>\n",
"<b>eastern Kazakhstan<b> https://en.wikipedia.org/wiki/1973_Soviet_nuclear_tests <br>\n",
"<b>Novaya Zemlya, Russia<b> https://en.wikipedia.org/wiki/1975_Soviet_nuclear_tests <br>\n",
"<b>Tuamotu Archipelago, French Polynesia region, 1984 and 1985<b> https://en.wikipedia.org/wiki/1983%E2%80%9385_French_nuclear_tests <br>\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# The DESC keyword and the 10 strongest earthquakes for the dataset\n",
"`DESC` arranges the column in numerical <b>descending<b> order"
]
},
{
"cell_type": "code",
"execution_count": 69,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" * postgresql://postgres:***@localhost:5432/earthquake\n",
"10 rows affected.\n"
]
},
{
"data": {
"text/html": [
"<table>\n",
" <tr>\n",
" <th>place</th>\n",
" <th>magnitude</th>\n",
" <th>occurred_on</th>\n",
" </tr>\n",
" <tr>\n",
" <td>off the west coast of northern Sumatra</td>\n",
" <td>9.1</td>\n",
" <td>2004-12-26 00:58:53</td>\n",
" </tr>\n",
" <tr>\n",
" <td>near the east coast of Honshu, Japan</td>\n",
" <td>9.1</td>\n",
" <td>2011-03-11 05:46:24</td>\n",
" </tr>\n",
" <tr>\n",
" <td>offshore Bio-Bio, Chile</td>\n",
" <td>8.8</td>\n",
" <td>2010-02-27 06:34:12</td>\n",
" </tr>\n",
" <tr>\n",
" <td>off the west coast of northern Sumatra</td>\n",
" <td>8.6</td>\n",
" <td>2012-04-11 08:38:37</td>\n",
" </tr>\n",
" <tr>\n",
" <td>northern Sumatra, Indonesia</td>\n",
" <td>8.6</td>\n",
" <td>2005-03-28 16:09:37</td>\n",
" </tr>\n",
" <tr>\n",
" <td>near the coast of southern Peru</td>\n",
" <td>8.4</td>\n",
" <td>2001-06-23 20:33:14</td>\n",
" </tr>\n",
" <tr>\n",
" <td>southern Sumatra, Indonesia</td>\n",
" <td>8.4</td>\n",
" <td>2007-09-12 11:10:27</td>\n",
" </tr>\n",
" <tr>\n",
" <td>south of Sumbawa, Indonesia</td>\n",
" <td>8.3</td>\n",
" <td>1977-08-19 06:08:55</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Kuril Islands</td>\n",
" <td>8.3</td>\n",
" <td>2006-11-15 11:14:14</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Hokkaido, Japan region</td>\n",
" <td>8.3</td>\n",
" <td>2003-09-25 19:50:06</td>\n",
" </tr>\n",
"</table>"
],
"text/plain": [
"[('off the west coast of northern Sumatra', Decimal('9.1'), datetime.datetime(2004, 12, 26, 0, 58, 53)),\n",
" ('near the east coast of Honshu, Japan', Decimal('9.1'), datetime.datetime(2011, 3, 11, 5, 46, 24)),\n",
" ('offshore Bio-Bio, Chile', Decimal('8.8'), datetime.datetime(2010, 2, 27, 6, 34, 12)),\n",
" ('off the west coast of northern Sumatra', Decimal('8.6'), datetime.datetime(2012, 4, 11, 8, 38, 37)),\n",
" ('northern Sumatra, Indonesia', Decimal('8.6'), datetime.datetime(2005, 3, 28, 16, 9, 37)),\n",
" ('near the coast of southern Peru', Decimal('8.4'), datetime.datetime(2001, 6, 23, 20, 33, 14)),\n",
" ('southern Sumatra, Indonesia', Decimal('8.4'), datetime.datetime(2007, 9, 12, 11, 10, 27)),\n",
" ('south of Sumbawa, Indonesia', Decimal('8.3'), datetime.datetime(1977, 8, 19, 6, 8, 55)),\n",
" ('Kuril Islands', Decimal('8.3'), datetime.datetime(2006, 11, 15, 11, 14, 14)),\n",
" ('Hokkaido, Japan region', Decimal('8.3'), datetime.datetime(2003, 9, 25, 19, 50, 6))]"
]
},
"execution_count": 69,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql \n",
"SELECT place, magnitude, occurred_on \n",
"FROM earthquake\n",
"ORDER BY magnitude DESC\n",
"LIMIT 10;"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# The LIKE and BETWEEN keywords\n",
"# How many aftershocks were there after the <I>'near the east coast of Honshu, Japan'<i> earthquake with a magnitude of 6 or greater? "
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" * postgresql://postgres:***@localhost:5432/earthquake\n",
"1 rows affected.\n"
]
},
{
"data": {
"text/html": [
"<table>\n",
" <tr>\n",
" <th>Aftershocks</th>\n",
" </tr>\n",
" <tr>\n",
" <td>51</td>\n",
" </tr>\n",
"</table>"
],
"text/plain": [
"[(51,)]"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"SELECT COUNT(*) AS \"Aftershocks\"\n",
"FROM earthquake\n",
"WHERE place LIKE '%Honshu%Japan%'\n",
" AND occurred_on BETWEEN '2011-03-11' AND '2011-03-18'\n",
" AND magnitude >= 6;"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Notice i used the AS keyword to rename the column header from count to Aftershocks"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[View Full Report](https://www.sciencetony.com/2019/07/nuclear-test-map-for-1969-to-2017.html)<br>\n",
"[Sciencetony.com](https://www.sciencetony.com/)<br>\n",
"[View My LinkedIn CV](https://www.linkedin.com/in/tony-mcdonald-sciencetony/)<br>\n",
"[My Twitter](https://twitter.com/ScienceTony)<br>\n",
"[My YouTube](https://www.youtube.com/Dreamazium)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.2"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment