Created on Skills Network Labs
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a href=\"https://cognitiveclass.ai\"><img src = \"https://ibm.box.com/shared/static/ugcqz6ohbvff804xp84y4kqnvvk3bq1g.png\" width = 300, align = \"center\"></a>\n",
"\n",
"<h1 align=center><font size = 5>Assignment: Notebook for Peer Assignment</font></h1>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Introduction\n",
"\n",
"Using this Python notebook you will:\n",
"1. Understand 3 Chicago datasets \n",
"1. Load the 3 datasets into 3 tables in a Db2 database\n",
"1. Execute SQL queries to answer assignment questions "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Understand the datasets \n",
"To complete the assignment problems in this notebook you will be using three datasets that are available on the city of Chicago's Data Portal:\n",
"1. <a href=\"https://data.cityofchicago.org/Health-Human-Services/Census-Data-Selected-socioeconomic-indicators-in-C/kn9c-c2s2\">Socioeconomic Indicators in Chicago</a>\n",
"1. <a href=\"https://data.cityofchicago.org/Education/Chicago-Public-Schools-Progress-Report-Cards-2011-/9xs2-f89t\">Chicago Public Schools</a>\n",
"1. <a href=\"https://data.cityofchicago.org/Public-Safety/Crimes-2001-to-present/ijzp-q8t2\">Chicago Crime Data</a>\n",
"\n",
"### 1. Socioeconomic Indicators in Chicago\n",
"This dataset contains a selection of six socioeconomic indicators of public health significance and a “hardship index,” for each Chicago community area, for the years 2008 – 2012.\n",
"\n",
"For this assignment you will use a snapshot of this dataset which can be downloaded from:\n",
"https://ibm.box.com/shared/static/05c3415cbfbtfnr2fx4atenb2sd361ze.csv\n",
"\n",
"A detailed description of this dataset and the original dataset can be obtained from the Chicago Data Portal at:\n",
"https://data.cityofchicago.org/Health-Human-Services/Census-Data-Selected-socioeconomic-indicators-in-C/kn9c-c2s2\n",
"\n",
"\n",
"\n",
"### 2. Chicago Public Schools\n",
"\n",
"This dataset shows all school level performance data used to create CPS School Report Cards for the 2011-2012 school year. This dataset is provided by the city of Chicago's Data Portal.\n",
"\n",
"For this assignment you will use a snapshot of this dataset which can be downloaded from:\n",
"https://ibm.box.com/shared/static/f9gjvj1gjmxxzycdhplzt01qtz0s7ew7.csv\n",
"\n",
"A detailed description of this dataset and the original dataset can be obtained from the Chicago Data Portal at:\n",
"https://data.cityofchicago.org/Education/Chicago-Public-Schools-Progress-Report-Cards-2011-/9xs2-f89t\n",
"\n",
"\n",
"\n",
"\n",
"### 3. Chicago Crime Data \n",
"\n",
"This dataset reflects reported incidents of crime (with the exception of murders where data exists for each victim) that occurred in the City of Chicago from 2001 to present, minus the most recent seven days. \n",
"\n",
"This dataset is quite large - over 1.5GB in size with over 6.5 million rows. For the purposes of this assignment we will use a much smaller sample of this dataset which can be downloaded from:\n",
"https://ibm.box.com/shared/static/svflyugsr9zbqy5bmowgswqemfpm1x7f.csv\n",
"\n",
"A detailed description of this dataset and the original dataset can be obtained from the Chicago Data Portal at:\n",
"https://data.cityofchicago.org/Public-Safety/Crimes-2001-to-present/ijzp-q8t2\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Download the datasets\n",
"In many cases the dataset to be analyzed is available as a .CSV (comma separated values) file, perhaps on the internet. Click on the links below to download and save the datasets (.CSV files):\n",
"1. __CENSUS_DATA:__ https://ibm.box.com/shared/static/05c3415cbfbtfnr2fx4atenb2sd361ze.csv\n",
"1. __CHICAGO_PUBLIC_SCHOOLS__ https://ibm.box.com/shared/static/f9gjvj1gjmxxzycdhplzt01qtz0s7ew7.csv\n",
"1. __CHICAGO_CRIME_DATA:__ https://ibm.box.com/shared/static/svflyugsr9zbqy5bmowgswqemfpm1x7f.csv\n",
"\n",
"__NOTE:__ Ensure you have downloaded the datasets using the links above instead of directly from the Chicago Data Portal. The versions linked here are subsets of the original datasets and have some of the column names modified to be more database friendly which will make it easier to complete this assignment."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Store the datasets in database tables\n",
"To analyze the data using SQL, it first needs to be stored in the database.\n",
"\n",
"While it is easier to read the dataset into a Pandas dataframe and then PERSIST it into the database as we saw in Week 3 Lab 3, it results in mapping to default datatypes which may not be optimal for SQL querying. For example a long textual field may map to a CLOB instead of a VARCHAR. \n",
"\n",
"Therefore, __it is highly recommended to manually load the table using the database console LOAD tool, as indicated in Week 2 Lab 1 Part II__. The only difference with that lab is that in Step 5 of the instructions you will need to click on create \"(+) New Table\" and specify the name of the table you want to create and then click \"Next\". \n",
"\n",
"<img src = \"https://ibm.box.com/shared/static/uc4xjh1uxcc78ks1i18v668simioz4es.jpg\">\n",
"\n",
"##### Now open the Db2 console, open the LOAD tool, Select / Drag the .CSV file for the first dataset, Next create a New Table, and then follow the steps on-screen instructions to load the data. Name the new tables as folows:\n",
"1. __CENSUS_DATA__\n",
"1. __CHICAGO_PUBLIC_SCHOOLS__\n",
"1. __CHICAGO_CRIME_DATA__"
]
},
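{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you do take the Pandas route, one way to avoid unsuitable default datatypes is to pass an explicit column-type mapping when writing the table. The cell below is a minimal sketch, assuming pandas is installed. It uses an in-memory SQLite database as a stand-in for Db2 so that it runs without credentials, and the two-row dataframe and its column types are made up for illustration. With a SQLAlchemy connection to Db2, the `dtype` values would be SQLAlchemy types such as `sqlalchemy.types.VARCHAR(100)` rather than strings."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import sqlite3\n",
"import pandas as pd\n",
"\n",
"# Made-up sample rows; the real data comes from the downloaded .CSV files\n",
"df = pd.DataFrame({\n",
"    'COMMUNITY_AREA_NUMBER': [1, 2],\n",
"    'COMMUNITY_AREA_NAME': ['Example Area A', 'Example Area B'],\n",
"})\n",
"\n",
"con = sqlite3.connect(':memory:')  # stand-in for the Db2 connection\n",
"\n",
"# dtype maps column names to SQL types; plain strings are accepted for sqlite3 connections\n",
"df.to_sql('CENSUS_DATA', con, index=False,\n",
"          dtype={'COMMUNITY_AREA_NUMBER': 'INTEGER', 'COMMUNITY_AREA_NAME': 'VARCHAR(100)'})\n",
"\n",
"# Inspect the column types that were actually created\n",
"print(con.execute('PRAGMA table_info(CENSUS_DATA)').fetchall())\n",
"con.close()"
]
},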
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Connect to the database \n",
"Let us first load the SQL extension and establish a connection with the database"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"%load_ext sql"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In the next cell enter your db2 connection string. Recall you created Service Credentials for your Db2 instance in first lab in Week 3. From the __uri__ field of your Db2 service credentials copy everything after db2:// (except the double quote at the end) and paste it in the cell below after ibm_db_sa://\n",
"\n",
"<img src =\"https://ibm.box.com/shared/static/hzhkvdyinpupm2wfx49lkr71q9swbpec.jpg\">"
]
},
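{
"cell_type": "markdown",
"metadata": {},
"source": [
"The copy-and-paste step described above amounts to swapping the `db2://` scheme for `ibm_db_sa://`. The cell below is a minimal sketch of that swap. The helper name `to_sqlalchemy_url` and the sample URI are hypothetical placeholders, not values from the course labs. A password that contains reserved characters such as `@` has to be percent-encoded (for example as `%40`) before it goes into the URL, and `urllib.parse.quote` from the standard library can do that."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from urllib.parse import quote\n",
"\n",
"def to_sqlalchemy_url(db2_uri):\n",
"    # Swap the db2:// scheme of a service-credentials URI for ibm_db_sa://\n",
"    prefix = 'db2://'\n",
"    if not db2_uri.startswith(prefix):\n",
"        raise ValueError('expected a URI that starts with db2://')\n",
"    return 'ibm_db_sa://' + db2_uri[len(prefix):]\n",
"\n",
"# Hypothetical credentials, for illustration only\n",
"example_uri = 'db2://my-username:my-password@my-hostname:50000/BLUDB'\n",
"print(to_sqlalchemy_url(example_uri))\n",
"\n",
"# Percent-encode a password that contains a reserved character such as '@'\n",
"print(quote('p@ss', safe=''))"
]
},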
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Connected: wjj67061@BLUDB'"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Remember the connection string is of the format:\n",
"# %sql ibm_db_sa://my-username:my-password@my-hostname:my-port/my-db-name\n",
"# Enter the connection string for your Db2 on Cloud database instance below\n",
"%sql ibm_db_sa://wjj67061:m7f1lz0zkqdrpd%406@dashdb-txn-sbox-yp-dal09-08.services.dal.bluemix.net:50000/BLUDB"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Problems\n",
"Now write and execute SQL queries to solve assignment problems\n",
"\n",
"### Problem 1\n",
"\n",
"##### Find the total number of crimes recorded in the CRIME table"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" * ibm_db_sa://wjj67061:***@dashdb-txn-sbox-yp-dal09-08.services.dal.bluemix.net:50000/BLUDB\n",
"Done.\n"
]
},
{
"data": {
"text/html": [
"<table>\n",
" <tr>\n",
" <th>1</th>\n",
" </tr>\n",
" <tr>\n",
" <td>533</td>\n",
" </tr>\n",
"</table>"
],
"text/plain": [
"[(Decimal('533'),)]"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Rows in Crime table\n",
"%sql SELECT COUNT(*) FROM CRIME"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Problem 2\n",
"\n",
"##### Retrieve first 10 rows from the CRIME table\n"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" * ibm_db_sa://wjj67061:***@dashdb-txn-sbox-yp-dal09-08.services.dal.bluemix.net:50000/BLUDB\n",
"Done.\n"
]
},
{
"data": {
"text/html": [
"<table>\n",
" <tr>\n",
" <th>id</th>\n",
" <th>case_number</th>\n",
" <th>DATE</th>\n",
" <th>block</th>\n",
" <th>iucr</th>\n",
" <th>primary_type</th>\n",
" <th>description</th>\n",
" <th>location_description</th>\n",
" <th>arrest</th>\n",
" <th>domestic</th>\n",
" <th>beat</th>\n",
" <th>district</th>\n",
" <th>ward</th>\n",
" <th>community_area_number</th>\n",
" <th>fbicode</th>\n",
" <th>x_coordinate</th>\n",
" <th>y_coordinate</th>\n",
" <th>YEAR</th>\n",
" <th>updatedon</th>\n",
" <th>latitude</th>\n",
" <th>longitude</th>\n",
" <th>location</th>\n",
" </tr>\n",
" <tr>\n",
" <td>3512276</td>\n",
" <td>HK587712</td>\n",
" <td>2004-08-28 17:50:56</td>\n",
" <td>047XX S KEDZIE AVE</td>\n",
" <td>890</td>\n",
" <td>THEFT</td>\n",
" <td>FROM BUILDING</td>\n",
" <td>SMALL RETAIL STORE</td>\n",
" <td>FALSE</td>\n",
" <td>FALSE</td>\n",
" <td>911</td>\n",
" <td>9</td>\n",
" <td>14</td>\n",
" <td>58</td>\n",
" <td>6</td>\n",
" <td>1155838</td>\n",
" <td>1873050</td>\n",
" <td>2004</td>\n",
" <td>2018-02-10 15:50:01</td>\n",
" <td>41.80744050</td>\n",
" <td>-87.70395585</td>\n",
" <td>(41.8074405, -87.703955849)</td>\n",
" </tr>\n",
" <tr>\n",
" <td>3406613</td>\n",
" <td>HK456306</td>\n",
" <td>2004-06-26 12:40:00</td>\n",
" <td>009XX N CENTRAL PARK AVE</td>\n",
" <td>820</td>\n",
" <td>THEFT</td>\n",
" <td>$500 AND UNDER</td>\n",
" <td>OTHER</td>\n",
" <td>FALSE</td>\n",
" <td>FALSE</td>\n",
" <td>1112</td>\n",
" <td>11</td>\n",
" <td>27</td>\n",
" <td>23</td>\n",
" <td>6</td>\n",
" <td>1152206</td>\n",
" <td>1906127</td>\n",
" <td>2004</td>\n",
" <td>2018-02-28 15:56:25</td>\n",
" <td>41.89827996</td>\n",
" <td>-87.71640551</td>\n",
" <td>(41.898279962, -87.716405505)</td>\n",
" </tr>\n",
" <tr>\n",
" <td>8002131</td>\n",
" <td>HT233595</td>\n",
" <td>2011-04-04 05:45:00</td>\n",
" <td>043XX S WABASH AVE</td>\n",
" <td>820</td>\n",
" <td>THEFT</td>\n",
" <td>$500 AND UNDER</td>\n",
" <td>NURSING HOME/RETIREMENT HOME</td>\n",
" <td>FALSE</td>\n",
" <td>FALSE</td>\n",
" <td>221</td>\n",
" <td>2</td>\n",
" <td>3</td>\n",
" <td>38</td>\n",
" <td>6</td>\n",
" <td>1177436</td>\n",
" <td>1876313</td>\n",
" <td>2011</td>\n",
" <td>2018-02-10 15:50:01</td>\n",
" <td>41.81593313</td>\n",
" <td>-87.62464213</td>\n",
" <td>(41.815933131, -87.624642127)</td>\n",
" </tr>\n",
" <tr>\n",
" <td>7903289</td>\n",
" <td>HT133522</td>\n",
" <td>2010-12-30 16:30:00</td>\n",
" <td>083XX S KINGSTON AVE</td>\n",
" <td>840</td>\n",
" <td>THEFT</td>\n",
" <td>FINANCIAL ID THEFT: OVER $300</td>\n",
" <td>RESIDENCE</td>\n",
" <td>FALSE</td>\n",
" <td>FALSE</td>\n",
" <td>423</td>\n",
" <td>4</td>\n",
" <td>7</td>\n",
" <td>46</td>\n",
" <td>6</td>\n",
" <td>1194622</td>\n",
" <td>1850125</td>\n",
" <td>2010</td>\n",
" <td>2018-02-10 15:50:01</td>\n",
" <td>41.74366532</td>\n",
" <td>-87.56246276</td>\n",
" <td>(41.743665322, -87.562462756)</td>\n",
" </tr>\n",
" <tr>\n",
" <td>10402076</td>\n",
" <td>HZ138551</td>\n",
" <td>2016-02-02 19:30:00</td>\n",
" <td>033XX W 66TH ST</td>\n",
" <td>820</td>\n",
" <td>THEFT</td>\n",
" <td>$500 AND UNDER</td>\n",
" <td>ALLEY</td>\n",
" <td>FALSE</td>\n",
" <td>FALSE</td>\n",
" <td>831</td>\n",
" <td>8</td>\n",
" <td>15</td>\n",
" <td>66</td>\n",
" <td>6</td>\n",
" <td>1155240</td>\n",
" <td>1860661</td>\n",
" <td>2016</td>\n",
" <td>2018-02-10 15:50:01</td>\n",
" <td>41.77345530</td>\n",
" <td>-87.70648047</td>\n",
" <td>(41.773455295, -87.706480471)</td>\n",
" </tr>\n",
" <tr>\n",
" <td>7732712</td>\n",
" <td>HS540106</td>\n",
" <td>2010-09-29 07:59:00</td>\n",
" <td>006XX W CHICAGO AVE</td>\n",
" <td>810</td>\n",
" <td>THEFT</td>\n",
" <td>OVER $500</td>\n",
" <td>PARKING LOT/GARAGE(NON.RESID.)</td>\n",
" <td>FALSE</td>\n",
" <td>FALSE</td>\n",
" <td>1323</td>\n",
" <td>12</td>\n",
" <td>27</td>\n",
" <td>24</td>\n",
" <td>6</td>\n",
" <td>1171668</td>\n",
" <td>1905607</td>\n",
" <td>2010</td>\n",
" <td>2018-02-10 15:50:01</td>\n",
" <td>41.89644677</td>\n",
" <td>-87.64493868</td>\n",
" <td>(41.896446772, -87.644938678)</td>\n",
" </tr>\n",
" <tr>\n",
" <td>10769475</td>\n",
" <td>HZ534771</td>\n",
" <td>2016-11-30 01:15:00</td>\n",
" <td>050XX N KEDZIE AVE</td>\n",
" <td>810</td>\n",
" <td>THEFT</td>\n",
" <td>OVER $500</td>\n",
" <td>STREET</td>\n",
" <td>FALSE</td>\n",
" <td>FALSE</td>\n",
" <td>1713</td>\n",
" <td>17</td>\n",
" <td>33</td>\n",
" <td>14</td>\n",
" <td>6</td>\n",
" <td>1154133</td>\n",
" <td>1933314</td>\n",
" <td>2016</td>\n",
" <td>2018-02-10 15:50:01</td>\n",
" <td>41.97284491</td>\n",
" <td>-87.70860008</td>\n",
" <td>(41.972844913, -87.708600079)</td>\n",
" </tr>\n",
" <tr>\n",
" <td>4494340</td>\n",
" <td>HL793243</td>\n",
" <td>2005-12-16 16:45:00</td>\n",
" <td>005XX E PERSHING RD</td>\n",
" <td>860</td>\n",
" <td>THEFT</td>\n",
" <td>RETAIL THEFT</td>\n",
" <td>GROCERY FOOD STORE</td>\n",
" <td>TRUE</td>\n",
" <td>FALSE</td>\n",
" <td>213</td>\n",
" <td>2</td>\n",
" <td>3</td>\n",
" <td>38</td>\n",
" <td>6</td>\n",
" <td>1180448</td>\n",
" <td>1879234</td>\n",
" <td>2005</td>\n",
" <td>2018-02-28 15:56:25</td>\n",
" <td>41.82387989</td>\n",
" <td>-87.61350386</td>\n",
" <td>(41.823879885, -87.613503857)</td>\n",
" </tr>\n",
" <tr>\n",
" <td>3778925</td>\n",
" <td>HL149610</td>\n",
" <td>2005-01-28 17:00:00</td>\n",
" <td>100XX S WASHTENAW AVE</td>\n",
" <td>810</td>\n",
" <td>THEFT</td>\n",
" <td>OVER $500</td>\n",
" <td>STREET</td>\n",
" <td>FALSE</td>\n",
" <td>FALSE</td>\n",
" <td>2211</td>\n",
" <td>22</td>\n",
" <td>19</td>\n",
" <td>72</td>\n",
" <td>6</td>\n",
" <td>1160129</td>\n",
" <td>1838040</td>\n",
" <td>2005</td>\n",
" <td>2018-02-28 15:56:25</td>\n",
" <td>41.71128051</td>\n",
" <td>-87.68917910</td>\n",
" <td>(41.711280513, -87.689179097)</td>\n",
" </tr>\n",
" <tr>\n",
" <td>3324217</td>\n",
" <td>HK361551</td>\n",
" <td>2004-05-13 14:15:00</td>\n",
" <td>033XX W BELMONT AVE</td>\n",
" <td>820</td>\n",
" <td>THEFT</td>\n",
" <td>$500 AND UNDER</td>\n",
" <td>SMALL RETAIL STORE</td>\n",
" <td>FALSE</td>\n",
" <td>FALSE</td>\n",
" <td>1733</td>\n",
" <td>17</td>\n",
" <td>35</td>\n",
" <td>21</td>\n",
" <td>6</td>\n",
" <td>1153590</td>\n",
" <td>1921084</td>\n",
" <td>2004</td>\n",
" <td>2018-02-28 15:56:25</td>\n",
" <td>41.93929582</td>\n",
" <td>-87.71092344</td>\n",
" <td>(41.939295821, -87.710923442)</td>\n",
" </tr>\n",
"</table>"
],
"text/plain": [
"[(3512276, 'HK587712', datetime.datetime(2004, 8, 28, 17, 50, 56), '047XX S KEDZIE AVE', '890', 'THEFT', 'FROM BUILDING', 'SMALL RETAIL STORE', 'FALSE', 'FALSE', 911, 9, 14, 58, '6', 1155838, 1873050, 2004, datetime.datetime(2018, 2, 10, 15, 50, 1), Decimal('41.80744050'), Decimal('-87.70395585'), '(41.8074405, -87.703955849)'),\n",
" (3406613, 'HK456306', datetime.datetime(2004, 6, 26, 12, 40), '009XX N CENTRAL PARK AVE', '820', 'THEFT', '$500 AND UNDER', 'OTHER', 'FALSE', 'FALSE', 1112, 11, 27, 23, '6', 1152206, 1906127, 2004, datetime.datetime(2018, 2, 28, 15, 56, 25), Decimal('41.89827996'), Decimal('-87.71640551'), '(41.898279962, -87.716405505)'),\n",
" (8002131, 'HT233595', datetime.datetime(2011, 4, 4, 5, 45), '043XX S WABASH AVE', '820', 'THEFT', '$500 AND UNDER', 'NURSING HOME/RETIREMENT HOME', 'FALSE', 'FALSE', 221, 2, 3, 38, '6', 1177436, 1876313, 2011, datetime.datetime(2018, 2, 10, 15, 50, 1), Decimal('41.81593313'), Decimal('-87.62464213'), '(41.815933131, -87.624642127)'),\n",
" (7903289, 'HT133522', datetime.datetime(2010, 12, 30, 16, 30), '083XX S KINGSTON AVE', '840', 'THEFT', 'FINANCIAL ID THEFT: OVER $300', 'RESIDENCE', 'FALSE', 'FALSE', 423, 4, 7, 46, '6', 1194622, 1850125, 2010, datetime.datetime(2018, 2, 10, 15, 50, 1), Decimal('41.74366532'), Decimal('-87.56246276'), '(41.743665322, -87.562462756)'),\n",
" (10402076, 'HZ138551', datetime.datetime(2016, 2, 2, 19, 30), '033XX W 66TH ST', '820', 'THEFT', '$500 AND UNDER', 'ALLEY', 'FALSE', 'FALSE', 831, 8, 15, 66, '6', 1155240, 1860661, 2016, datetime.datetime(2018, 2, 10, 15, 50, 1), Decimal('41.77345530'), Decimal('-87.70648047'), '(41.773455295, -87.706480471)'),\n",
" (7732712, 'HS540106', datetime.datetime(2010, 9, 29, 7, 59), '006XX W CHICAGO AVE', '810', 'THEFT', 'OVER $500', 'PARKING LOT/GARAGE(NON.RESID.)', 'FALSE', 'FALSE', 1323, 12, 27, 24, '6', 1171668, 1905607, 2010, datetime.datetime(2018, 2, 10, 15, 50, 1), Decimal('41.89644677'), Decimal('-87.64493868'), '(41.896446772, -87.644938678)'),\n",
" (10769475, 'HZ534771', datetime.datetime(2016, 11, 30, 1, 15), '050XX N KEDZIE AVE', '810', 'THEFT', 'OVER $500', 'STREET', 'FALSE', 'FALSE', 1713, 17, 33, 14, '6', 1154133, 1933314, 2016, datetime.datetime(2018, 2, 10, 15, 50, 1), Decimal('41.97284491'), Decimal('-87.70860008'), '(41.972844913, -87.708600079)'),\n",
" (4494340, 'HL793243', datetime.datetime(2005, 12, 16, 16, 45), '005XX E PERSHING RD', '860', 'THEFT', 'RETAIL THEFT', 'GROCERY FOOD STORE', 'TRUE', 'FALSE', 213, 2, 3, 38, '6', 1180448, 1879234, 2005, datetime.datetime(2018, 2, 28, 15, 56, 25), Decimal('41.82387989'), Decimal('-87.61350386'), '(41.823879885, -87.613503857)'),\n",
" (3778925, 'HL149610', datetime.datetime(2005, 1, 28, 17, 0), '100XX S WASHTENAW AVE', '810', 'THEFT', 'OVER $500', 'STREET', 'FALSE', 'FALSE', 2211, 22, 19, 72, '6', 1160129, 1838040, 2005, datetime.datetime(2018, 2, 28, 15, 56, 25), Decimal('41.71128051'), Decimal('-87.68917910'), '(41.711280513, -87.689179097)'),\n",
" (3324217, 'HK361551', datetime.datetime(2004, 5, 13, 14, 15), '033XX W BELMONT AVE', '820', 'THEFT', '$500 AND UNDER', 'SMALL RETAIL STORE', 'FALSE', 'FALSE', 1733, 17, 35, 21, '6', 1153590, 1921084, 2004, datetime.datetime(2018, 2, 28, 15, 56, 25), Decimal('41.93929582'), Decimal('-87.71092344'), '(41.939295821, -87.710923442)')]"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%sql SELECT * FROM CRIME LIMIT 10"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Problem 3\n",
"\n",
"##### How many crimes involve an arrest?"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" * ibm_db_sa://wjj67061:***@dashdb-txn-sbox-yp-dal09-08.services.dal.bluemix.net:50000/BLUDB\n",
"Done.\n"
]
},
{
"data": {
"text/html": [
"<table>\n",
" <tr>\n",
" <th>1</th>\n",
" </tr>\n",
" <tr>\n",
" <td>163</td>\n",
" </tr>\n",
"</table>"
],
"text/plain": [
"[(Decimal('163'),)]"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%sql SELECT COUNT(*) FROM CRIME WHERE arrest = TRUE"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Problem 4\n",
"\n",
"##### Which unique types of crimes have been recorded at GAS STATION locations?\n"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" * ibm_db_sa://wjj67061:***@dashdb-txn-sbox-yp-dal09-08.services.dal.bluemix.net:50000/BLUDB\n",
"Done.\n"
]
},
{
"data": {
"text/html": [
"<table>\n",
" <tr>\n",
" <th>primary_type</th>\n",
" </tr>\n",
" <tr>\n",
" <td>THEFT</td>\n",
" </tr>\n",
" <tr>\n",
" <td>NARCOTICS</td>\n",
" </tr>\n",
" <tr>\n",
" <td>ROBBERY</td>\n",
" </tr>\n",
" <tr>\n",
" <td>CRIMINAL TRESPASS</td>\n",
" </tr>\n",
"</table>"
],
"text/plain": [
"[('THEFT',),\n",
" ('NARCOTICS',),\n",
" ('ROBBERY',),\n",
" ('CRIMINAL TRESPASS',)]"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%sql SELECT DISTINCT primary_type FROM CRIME WHERE location_description LIKE '%GAS%'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Hint: Which column lists types of crimes e.g. THEFT?"
]
},
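{
"cell_type": "markdown",
"metadata": {},
"source": [
"The hint points at the `primary_type` column. Several incidents can share a type, so `SELECT DISTINCT` is one way to list each type only once. The cell below is a minimal sketch on made-up rows in an in-memory SQLite database, used here as a stand-in for Db2. The table name `CRIME_DEMO` and its rows are hypothetical."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import sqlite3\n",
"\n",
"con = sqlite3.connect(':memory:')\n",
"con.execute('CREATE TABLE CRIME_DEMO (primary_type TEXT, location_description TEXT)')\n",
"\n",
"# Hypothetical rows: two thefts and one robbery at a gas station, one theft elsewhere\n",
"con.executemany(\n",
"    'INSERT INTO CRIME_DEMO VALUES (?, ?)',\n",
"    [('THEFT', 'GAS STATION'), ('THEFT', 'GAS STATION'),\n",
"     ('ROBBERY', 'GAS STATION'), ('THEFT', 'STREET')],\n",
")\n",
"\n",
"# DISTINCT collapses repeated values, so each crime type is returned once\n",
"rows = con.execute(\n",
"    'SELECT DISTINCT primary_type FROM CRIME_DEMO '\n",
"    'WHERE location_description = ? ORDER BY primary_type',\n",
"    ('GAS STATION',),\n",
").fetchall()\n",
"print(rows)\n",
"con.close()"
]
},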
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Problem 5\n",
"\n",
"##### In the CENUS_DATA table list all Community Areas whose names start with the letter ‘B’."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" * ibm_db_sa://wjj67061:***@dashdb-txn-sbox-yp-dal09-08.services.dal.bluemix.net:50000/BLUDB\n",
"Done.\n"
]
},
{
"data": {
"text/html": [
"<table>\n",
" <tr>\n",
" <th>community_area_name</th>\n",
" </tr>\n",
" <tr>\n",
" <td>Belmont Cragin</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Burnside</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Brighton Park</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Bridgeport</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Beverly</td>\n",
" </tr>\n",
"</table>"
],
"text/plain": [
"[('Belmont Cragin',),\n",
" ('Burnside',),\n",
" ('Brighton Park',),\n",
" ('Bridgeport',),\n",
" ('Beverly',)]"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%sql SELECT community_area_name FROM CENSUS WHERE community_area_name LIKE 'B%'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Problem 6\n",
"\n",
"##### Which schools in Community Areas 10 to 15 are healthy school certified?"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" * ibm_db_sa://wjj67061:***@dashdb-txn-sbox-yp-dal09-08.services.dal.bluemix.net:50000/BLUDB\n",
"Done.\n"
]
},
{
"data": {
"text/html": [
"<table>\n",
" <tr>\n",
" <th>name_of_school</th>\n",
" <th>community_area_number</th>\n",
" <th>healthy_school_certified</th>\n",
" </tr>\n",
" <tr>\n",
" <td>Rufus M Hitch Elementary School</td>\n",
" <td>10</td>\n",
" <td>Yes</td>\n",
" </tr>\n",
"</table>"
],
"text/plain": [
"[('Rufus M Hitch Elementary School', 10, 'Yes')]"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%sql SELECT name_of_school, community_area_number, healthy_school_certified FROM PROGRESS WHERE community_area_number BETWEEN 10 AND 15 AND healthy_school_certified = 'Yes'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Problem 7\n",
"\n",
"##### What is the average school Safety Score? "
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" * ibm_db_sa://wjj67061:***@dashdb-txn-sbox-yp-dal09-08.services.dal.bluemix.net:50000/BLUDB\n",
"Done.\n"
]
},
{
"data": {
"text/html": [
"<table>\n",
" <tr>\n",
" <th>1</th>\n",
" </tr>\n",
" <tr>\n",
" <td>49.504873</td>\n",
" </tr>\n",
"</table>"
],
"text/plain": [
"[(Decimal('49.504873'),)]"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%sql SELECT AVG(SAFETY_SCORE) FROM PROGRESS"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Problem 8\n",
"\n",
"##### List the top 5 Community Areas by average College Enrollment [number of students] "
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" * ibm_db_sa://wjj67061:***@dashdb-txn-sbox-yp-dal09-08.services.dal.bluemix.net:50000/BLUDB\n",
"Done.\n"
]
},
{
"data": {
"text/html": [
"<table>\n",
" <tr>\n",
" <th>community_area_name</th>\n",
" <th>average_college_enrollment</th>\n",
" </tr>\n",
" <tr>\n",
" <td>ARCHER HEIGHTS</td>\n",
" <td>2411.500000</td>\n",
" </tr>\n",
" <tr>\n",
" <td>MONTCLARE</td>\n",
" <td>1317.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <td>WEST ELSDON</td>\n",
" <td>1233.333333</td>\n",
" </tr>\n",
" <tr>\n",
" <td>BRIGHTON PARK</td>\n",
" <td>1205.875000</td>\n",
" </tr>\n",
" <tr>\n",
" <td>BELMONT CRAGIN</td>\n",
" <td>1198.833333</td>\n",
" </tr>\n",
"</table>"
],
"text/plain": [
"[('ARCHER HEIGHTS', Decimal('2411.500000')),\n",
" ('MONTCLARE', Decimal('1317.000000')),\n",
" ('WEST ELSDON', Decimal('1233.333333')),\n",
" ('BRIGHTON PARK', Decimal('1205.875000')),\n",
" ('BELMONT CRAGIN', Decimal('1198.833333'))]"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%sql SELECT COMMUNITY_AREA_NAME, AVG(COLLEGE_ENROLLMENT) AS AVERAGE_COLLEGE_ENROLLMENT FROM PROGRESS GROUP BY COMMUNITY_AREA_NAME ORDER BY 2 DESC LIMIT 5"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Problem 9\n",
"\n",
"##### Use a sub-query to determine which Community Area has the least value for school Safety Score? "
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" * ibm_db_sa://wjj67061:***@dashdb-txn-sbox-yp-dal09-08.services.dal.bluemix.net:50000/BLUDB\n",
"Done.\n"
]
},
{
"data": {
"text/html": [
"<table>\n",
" <tr>\n",
" <th>community_area_name</th>\n",
" <th>safety_score</th>\n",
" </tr>\n",
" <tr>\n",
" <td>WASHINGTON PARK</td>\n",
" <td>1</td>\n",
" </tr>\n",
"</table>"
],
"text/plain": [
"[('WASHINGTON PARK', 1)]"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%sql SELECT COMMUNITY_AREA_NAME, SAFETY_SCORE FROM PROGRESS WHERE SAFETY_SCORE = (SELECT SAFETY_SCORE FROM PROGRESS ORDER BY 1 LIMIT 1)"
]
},
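{
"cell_type": "markdown",
"metadata": {},
"source": [
"One way to express the lowest score in a sub-query is the `MIN` aggregate. It skips NULL scores, and the outer query then returns every area that ties for the minimum. The cell below is a minimal sketch on made-up rows in an in-memory SQLite database, used here as a stand-in for Db2. The table name `PROGRESS_DEMO` and its rows are hypothetical."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import sqlite3\n",
"\n",
"con = sqlite3.connect(':memory:')\n",
"con.execute('CREATE TABLE PROGRESS_DEMO (community_area_name TEXT, safety_score INTEGER)')\n",
"\n",
"# Hypothetical rows: two areas tie for the lowest score and one score is missing\n",
"con.executemany(\n",
"    'INSERT INTO PROGRESS_DEMO VALUES (?, ?)',\n",
"    [('AREA A', 40), ('AREA B', 5), ('AREA C', None), ('AREA D', 5)],\n",
")\n",
"\n",
"# The sub-query finds the minimum; the outer query keeps every row that matches it\n",
"rows = con.execute(\n",
"    'SELECT community_area_name, safety_score FROM PROGRESS_DEMO '\n",
"    'WHERE safety_score = (SELECT MIN(safety_score) FROM PROGRESS_DEMO) '\n",
"    'ORDER BY community_area_name'\n",
").fetchall()\n",
"print(rows)\n",
"con.close()"
]
},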
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Problem 10\n",
"\n",
"##### [Without using an explicit JOIN operator] Find the Per Capita Income of the Community Area which has a school Safety Score of 1."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Census names are mixed case (e.g. 'Belmont Cragin') while the school table stores them in upper case,\n",
"# so the census name is upper-cased before it is matched against the sub-query result.\n",
"%sql SELECT C.COMMUNITY_AREA_NAME, C.PER_CAPITA_INCOME FROM CENSUS C WHERE UPPER(C.COMMUNITY_AREA_NAME) IN (SELECT P.COMMUNITY_AREA_NAME FROM PROGRESS P WHERE P.SAFETY_SCORE = 1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Copyright &copy; 2018 [cognitiveclass.ai](cognitiveclass.ai?utm_source=bducopyrightlink&utm_medium=dswb&utm_campaign=bdu). This notebook and its source code are released under the terms of the [MIT License](https://bigdatauniversity.com/mit-license/).\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python",
"language": "python",
"name": "conda-env-python-py"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.10"
},
"widgets": {
"state": {},
"version": "1.1.2"
}
},
"nbformat": 4,
"nbformat_minor": 4
}