@Heidi75
Created April 14, 2020 16:49
Created on Skills Network Labs
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# SEGMENTING AND CLUSTERING NEIGHBORHOODS IN TORONTO"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this assignment, you will be required to explore, segment, and cluster the neighborhoods in the city of Toronto. However, unlike New York, the neighborhood data is not readily available on the internet. What is interesting about the field of data science is that each project can be challenging in its unique way, so you need to learn to be agile and refine the skill to learn new libraries and tools quickly depending on the project.\n",
"\n",
"For the Toronto neighborhood data, a Wikipedia page exists that has all the information we need to explore and cluster the neighborhoods in Toronto. You will be required to scrape the Wikipedia page and wrangle the data, clean it, and then read it into a pandas dataframe so that it is in a structured format like the New York dataset.\n",
"\n",
"Once the data is in a structured format, you can replicate the analysis that we did on the New York City dataset to explore and cluster the neighborhoods in the city of Toronto.\n",
"\n",
"Your submission will be a link to your Jupyter Notebook in your GitHub repository.\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Requirement already satisfied: lxml in /home/jupyterlab/conda/envs/python/lib/python3.6/site-packages (4.5.0)\n",
"Requirement already satisfied: bs4 in /home/jupyterlab/conda/envs/python/lib/python3.6/site-packages (0.0.1)\n",
"Requirement already satisfied: beautifulsoup4 in /home/jupyterlab/conda/envs/python/lib/python3.6/site-packages (from bs4) (4.9.0)\n",
"Requirement already satisfied: soupsieve>1.2 in /home/jupyterlab/conda/envs/python/lib/python3.6/site-packages (from beautifulsoup4->bs4) (2.0)\n",
"Requirement already satisfied: html5lib in /home/jupyterlab/conda/envs/python/lib/python3.6/site-packages (0.9999999)\n",
"Requirement already satisfied: six in /home/jupyterlab/conda/envs/python/lib/python3.6/site-packages (from html5lib) (1.14.0)\n",
"Collecting package metadata (current_repodata.json): done\n",
"Solving environment: done\n",
"\n",
"# All requested packages already installed.\n",
"\n",
"Collecting package metadata (current_repodata.json): done\n",
"Solving environment: done\n",
"\n",
"# All requested packages already installed.\n",
"\n",
"Folium installed\n",
"Libraries imported.\n"
]
}
],
"source": [
"import requests # library to handle requests\n",
"import pandas as pd # library for data analysis\n",
"import numpy as np # library to handle data in a vectorized manner\n",
"import json # library to handle JSON files\n",
"import random # library for random number generation\n",
"\n",
"# scraping the wikitable\n",
"!pip install lxml\n",
"import lxml\n",
"!pip install bs4\n",
"!pip install html5lib\n",
"from pandas.io.html import read_html\n",
"\n",
"!conda install -c conda-forge geopy --yes\n",
"from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values\n",
"\n",
"# matplotlib and associated plotting modules\n",
"import matplotlib.cm as cm\n",
"import matplotlib.colors as colors\n",
"import matplotlib.pyplot as plt\n",
"\n",
"\n",
"# import k-means for clustering\n",
"from sklearn.cluster import KMeans\n",
"\n",
"# libraries for displaying images\n",
"from IPython.display import Image\n",
"from IPython.core.display import HTML\n",
"\n",
"# function for transforming a JSON file into a pandas dataframe\n",
"from pandas.io.json import json_normalize\n",
"\n",
"!conda install -c conda-forge folium=0.5.0 --yes\n",
"import folium # plotting library\n",
"\n",
"print('Folium installed')\n",
"print('Libraries imported.')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Download and Explore Dataset "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Scrape the Wikipedia page and wrangle the data"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Extracted 1 wikitables\n"
]
}
],
"source": [
"URL = 'https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M'\n",
"wikitables = read_html(URL, attrs={\"class\":\"wikitable\"})\n",
"\n",
"print(\"Extracted {num} wikitables\".format(num=len(wikitables)))"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Postal code</th>\n",
" <th>Borough</th>\n",
" <th>Neighborhood</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>M1A</td>\n",
" <td>Not assigned</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>M2A</td>\n",
" <td>Not assigned</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>M3A</td>\n",
" <td>North York</td>\n",
" <td>Parkwoods</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>M4A</td>\n",
" <td>North York</td>\n",
" <td>Victoria Village</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>M5A</td>\n",
" <td>Downtown Toronto</td>\n",
" <td>Regent Park / Harbourfront</td>\n",
" </tr>\n",
" <tr>\n",
" <th>...</th>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>175</th>\n",
" <td>M5Z</td>\n",
" <td>Not assigned</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>176</th>\n",
" <td>M6Z</td>\n",
" <td>Not assigned</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>177</th>\n",
" <td>M7Z</td>\n",
" <td>Not assigned</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>178</th>\n",
" <td>M8Z</td>\n",
" <td>Etobicoke</td>\n",
" <td>Mimico NW / The Queensway West / South of Bloo...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>179</th>\n",
" <td>M9Z</td>\n",
" <td>Not assigned</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>180 rows × 3 columns</p>\n",
"</div>"
],
"text/plain": [
" Postal code Borough \\\n",
"0 M1A Not assigned \n",
"1 M2A Not assigned \n",
"2 M3A North York \n",
"3 M4A North York \n",
"4 M5A Downtown Toronto \n",
".. ... ... \n",
"175 M5Z Not assigned \n",
"176 M6Z Not assigned \n",
"177 M7Z Not assigned \n",
"178 M8Z Etobicoke \n",
"179 M9Z Not assigned \n",
"\n",
" Neighborhood \n",
"0 NaN \n",
"1 NaN \n",
"2 Parkwoods \n",
"3 Victoria Village \n",
"4 Regent Park / Harbourfront \n",
".. ... \n",
"175 NaN \n",
"176 NaN \n",
"177 NaN \n",
"178 Mimico NW / The Queensway West / South of Bloo... \n",
"179 NaN \n",
"\n",
"[180 rows x 3 columns]"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"wikitables[0]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Convert the wikitable to a pandas DataFrame"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"df = pd.DataFrame(wikitables[0])\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Remove the rows whose borough is not assigned\n"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Postal code</th>\n",
" <th>Borough</th>\n",
" <th>Neighborhood</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>M3A</td>\n",
" <td>North York</td>\n",
" <td>Parkwoods</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>M4A</td>\n",
" <td>North York</td>\n",
" <td>Victoria Village</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>M5A</td>\n",
" <td>Downtown Toronto</td>\n",
" <td>Regent Park / Harbourfront</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>M6A</td>\n",
" <td>North York</td>\n",
" <td>Lawrence Manor / Lawrence Heights</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>M7A</td>\n",
" <td>Downtown Toronto</td>\n",
" <td>Queen's Park / Ontario Provincial Government</td>\n",
" </tr>\n",
" <tr>\n",
" <th>...</th>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>160</th>\n",
" <td>M8X</td>\n",
" <td>Etobicoke</td>\n",
" <td>The Kingsway / Montgomery Road / Old Mill North</td>\n",
" </tr>\n",
" <tr>\n",
" <th>165</th>\n",
" <td>M4Y</td>\n",
" <td>Downtown Toronto</td>\n",
" <td>Church and Wellesley</td>\n",
" </tr>\n",
" <tr>\n",
" <th>168</th>\n",
" <td>M7Y</td>\n",
" <td>East Toronto</td>\n",
" <td>Business reply mail Processing CentrE</td>\n",
" </tr>\n",
" <tr>\n",
" <th>169</th>\n",
" <td>M8Y</td>\n",
" <td>Etobicoke</td>\n",
" <td>Old Mill South / King's Mill Park / Sunnylea /...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>178</th>\n",
" <td>M8Z</td>\n",
" <td>Etobicoke</td>\n",
" <td>Mimico NW / The Queensway West / South of Bloo...</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>103 rows × 3 columns</p>\n",
"</div>"
],
"text/plain": [
" Postal code Borough \\\n",
"2 M3A North York \n",
"3 M4A North York \n",
"4 M5A Downtown Toronto \n",
"5 M6A North York \n",
"6 M7A Downtown Toronto \n",
".. ... ... \n",
"160 M8X Etobicoke \n",
"165 M4Y Downtown Toronto \n",
"168 M7Y East Toronto \n",
"169 M8Y Etobicoke \n",
"178 M8Z Etobicoke \n",
"\n",
" Neighborhood \n",
"2 Parkwoods \n",
"3 Victoria Village \n",
"4 Regent Park / Harbourfront \n",
"5 Lawrence Manor / Lawrence Heights \n",
"6 Queen's Park / Ontario Provincial Government \n",
".. ... \n",
"160 The Kingsway / Montgomery Road / Old Mill North \n",
"165 Church and Wellesley \n",
"168 Business reply mail Processing CentrE \n",
"169 Old Mill South / King's Mill Park / Sunnylea /... \n",
"178 Mimico NW / The Queensway West / South of Bloo... \n",
"\n",
"[103 rows x 3 columns]"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# get the index labels of the rows whose borough is not assigned\n",
"indexnames = df[df['Borough'] == 'Not assigned'].index\n",
"# delete those rows\n",
"df.drop(indexnames, inplace=True)\n",
"df"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(103, 3)"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.shape"
]
},
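{
"cell_type": "markdown",
"metadata": {},
"source": [
"The in-place `drop` above mutates `df`. As a minimal sketch, the same filter can be written as a boolean mask that returns a new frame and leaves the original untouched. The three sample rows are copied from the table above; the names `sample` and `assigned` are illustrative and not part of the original notebook.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# Three rows copied from the scraped table above.\n",
"sample = pd.DataFrame({\n",
"    'Postal code': ['M1A', 'M3A', 'M4A'],\n",
"    'Borough': ['Not assigned', 'North York', 'North York'],\n",
"    'Neighborhood': [None, 'Parkwoods', 'Victoria Village'],\n",
"})\n",
"\n",
"# Keep only rows whose borough is assigned; `sample` itself is not modified.\n",
"assigned = sample[sample['Borough'] != 'Not assigned'].reset_index(drop=True)\n",
"print(assigned)\n",
"```\n",
"\n",
"Resetting the index also avoids the gaps in the row labels (2, 3, 4, ..., 178) visible in the output above."
]
},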
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Load the latitude and longitude of each postal code from the geospatial CSV\n"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Postal Code</th>\n",
" <th>Latitude</th>\n",
" <th>Longitude</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>M1B</td>\n",
" <td>43.806686</td>\n",
" <td>-79.194353</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>M1C</td>\n",
" <td>43.784535</td>\n",
" <td>-79.160497</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>M1E</td>\n",
" <td>43.763573</td>\n",
" <td>-79.188711</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>M1G</td>\n",
" <td>43.770992</td>\n",
" <td>-79.216917</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>M1H</td>\n",
" <td>43.773136</td>\n",
" <td>-79.239476</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Postal Code Latitude Longitude\n",
"0 M1B 43.806686 -79.194353\n",
"1 M1C 43.784535 -79.160497\n",
"2 M1E 43.763573 -79.188711\n",
"3 M1G 43.770992 -79.216917\n",
"4 M1H 43.773136 -79.239476"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"lat_lon = pd.read_csv('https://cocl.us/Geospatial_data')\n",
"lat_lon.head()"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
"# rename the columns so that the postal code column matches the one in df for the merge\n",
"lat_lon.columns=['Postal code','Latitude','Longitude']"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Postal code</th>\n",
" <th>Borough</th>\n",
" <th>Neighborhood</th>\n",
" <th>Latitude</th>\n",
" <th>Longitude</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>M3A</td>\n",
" <td>North York</td>\n",
" <td>Parkwoods</td>\n",
" <td>43.753259</td>\n",
" <td>-79.329656</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>M4A</td>\n",
" <td>North York</td>\n",
" <td>Victoria Village</td>\n",
" <td>43.725882</td>\n",
" <td>-79.315572</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>M5A</td>\n",
" <td>Downtown Toronto</td>\n",
" <td>Regent Park / Harbourfront</td>\n",
" <td>43.654260</td>\n",
" <td>-79.360636</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>M6A</td>\n",
" <td>North York</td>\n",
" <td>Lawrence Manor / Lawrence Heights</td>\n",
" <td>43.718518</td>\n",
" <td>-79.464763</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>M7A</td>\n",
" <td>Downtown Toronto</td>\n",
" <td>Queen's Park / Ontario Provincial Government</td>\n",
" <td>43.662301</td>\n",
" <td>-79.389494</td>\n",
" </tr>\n",
" <tr>\n",
" <th>...</th>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>98</th>\n",
" <td>M8X</td>\n",
" <td>Etobicoke</td>\n",
" <td>The Kingsway / Montgomery Road / Old Mill North</td>\n",
" <td>43.653654</td>\n",
" <td>-79.506944</td>\n",
" </tr>\n",
" <tr>\n",
" <th>99</th>\n",
" <td>M4Y</td>\n",
" <td>Downtown Toronto</td>\n",
" <td>Church and Wellesley</td>\n",
" <td>43.665860</td>\n",
" <td>-79.383160</td>\n",
" </tr>\n",
" <tr>\n",
" <th>100</th>\n",
" <td>M7Y</td>\n",
" <td>East Toronto</td>\n",
" <td>Business reply mail Processing CentrE</td>\n",
" <td>43.662744</td>\n",
" <td>-79.321558</td>\n",
" </tr>\n",
" <tr>\n",
" <th>101</th>\n",
" <td>M8Y</td>\n",
" <td>Etobicoke</td>\n",
" <td>Old Mill South / King's Mill Park / Sunnylea /...</td>\n",
" <td>43.636258</td>\n",
" <td>-79.498509</td>\n",
" </tr>\n",
" <tr>\n",
" <th>102</th>\n",
" <td>M8Z</td>\n",
" <td>Etobicoke</td>\n",
" <td>Mimico NW / The Queensway West / South of Bloo...</td>\n",
" <td>43.628841</td>\n",
" <td>-79.520999</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>103 rows × 5 columns</p>\n",
"</div>"
],
"text/plain": [
" Postal code Borough \\\n",
"0 M3A North York \n",
"1 M4A North York \n",
"2 M5A Downtown Toronto \n",
"3 M6A North York \n",
"4 M7A Downtown Toronto \n",
".. ... ... \n",
"98 M8X Etobicoke \n",
"99 M4Y Downtown Toronto \n",
"100 M7Y East Toronto \n",
"101 M8Y Etobicoke \n",
"102 M8Z Etobicoke \n",
"\n",
" Neighborhood Latitude Longitude \n",
"0 Parkwoods 43.753259 -79.329656 \n",
"1 Victoria Village 43.725882 -79.315572 \n",
"2 Regent Park / Harbourfront 43.654260 -79.360636 \n",
"3 Lawrence Manor / Lawrence Heights 43.718518 -79.464763 \n",
"4 Queen's Park / Ontario Provincial Government 43.662301 -79.389494 \n",
".. ... ... ... \n",
"98 The Kingsway / Montgomery Road / Old Mill North 43.653654 -79.506944 \n",
"99 Church and Wellesley 43.665860 -79.383160 \n",
"100 Business reply mail Processing CentrE 43.662744 -79.321558 \n",
"101 Old Mill South / King's Mill Park / Sunnylea /... 43.636258 -79.498509 \n",
"102 Mimico NW / The Queensway West / South of Bloo... 43.628841 -79.520999 \n",
"\n",
"[103 rows x 5 columns]"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# merge the two dataframes on the postal code\n",
"Canada_df = pd.merge(df, lat_lon, on='Postal code')\n",
"Canada_df"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(103, 5)"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"Canada_df.shape"
]
},
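{
"cell_type": "markdown",
"metadata": {},
"source": [
"The merge above is an inner join, so a postal code missing from either frame would be dropped without warning. A hedged sketch of one way to check for that: `how='left'` with `indicator=True` marks unmatched rows, and `validate='one_to_one'` raises an error if a postal code is duplicated. The frames `left` and `right` and the made-up code `M0X` are assumptions for illustration only.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"left = pd.DataFrame({\n",
"    'Postal code': ['M3A', 'M4A', 'M0X'],  # 'M0X' is a made-up code with no coordinates\n",
"    'Borough': ['North York', 'North York', 'Example'],\n",
"})\n",
"right = pd.DataFrame({\n",
"    'Postal code': ['M3A', 'M4A'],\n",
"    'Latitude': [43.753259, 43.725882],\n",
"    'Longitude': [-79.329656, -79.315572],\n",
"})\n",
"\n",
"# Keep every row of `left` and record where each row found a match.\n",
"merged = pd.merge(left, right, on='Postal code', how='left',\n",
"                  indicator=True, validate='one_to_one')\n",
"unmatched = merged.loc[merged['_merge'] == 'left_only', 'Postal code'].tolist()\n",
"print(unmatched)\n",
"```\n",
"\n",
"In this notebook the row count stays at 103 before and after the merge, which suggests that every postal code found a match."
]
},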
{
"cell_type": "markdown",
"metadata": {},
"source": [
"How many boroughs and neighborhoods does the dataframe contain?"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The dataframe has 10 boroughs and 103 neighborhoods.\n"
]
}
],
"source": [
"print('The dataframe has {} boroughs and {} neighborhoods.'.format(\n",
" len(Canada_df['Borough'].unique()),\n",
" Canada_df.shape[0]\n",
" )\n",
")"
]
},
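{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that `Canada_df.shape[0]` counts rows, that is postal codes, and a single row can list several neighborhoods separated by ` / `. A minimal sketch of counting individual neighborhood names instead, assuming ` / ` is the only separator (at least one row in the full table uses a comma, so treat the result as approximate). The sample rows are copied from the output above; the variable names are illustrative.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"sample = pd.DataFrame({\n",
"    'Borough': ['North York', 'Downtown Toronto', 'North York'],\n",
"    'Neighborhood': ['Parkwoods',\n",
"                     'Regent Park / Harbourfront',\n",
"                     'Lawrence Manor / Lawrence Heights'],\n",
"})\n",
"\n",
"# Split each cell on ' / ' and put every name on its own row.\n",
"names = sample['Neighborhood'].str.split(' / ').explode().str.strip()\n",
"\n",
"n_boroughs = sample['Borough'].nunique()\n",
"n_neighborhoods = names.nunique()\n",
"print(n_boroughs, n_neighborhoods)\n",
"```"
]
},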
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Use the geopy library to get the latitude and longitude of Canada"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To create a geocoder instance, we need to define a user agent; we name it Canada_explorer, as shown below."
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
},
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The geographical coordinates of Canada are 61.0666922, -107.9917071.\n"
]
}
],
"source": [
"address = 'Canada'\n",
"\n",
"geolocator = Nominatim(user_agent=\"Canada_explorer\")\n",
"location = geolocator.geocode(address)\n",
"latitude = location.latitude\n",
"longitude = location.longitude\n",
"print('The geographical coordinates of Canada are {}, {}.'.format(latitude, longitude))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### New Dataframe with Only Toronto Data"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Postal code</th>\n",
" <th>Borough</th>\n",
" <th>Neighborhood</th>\n",
" <th>Latitude</th>\n",
" <th>Longitude</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>M5A</td>\n",
" <td>Downtown Toronto</td>\n",
" <td>Regent Park / Harbourfront</td>\n",
" <td>43.654260</td>\n",
" <td>-79.360636</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>M7A</td>\n",
" <td>Downtown Toronto</td>\n",
" <td>Queen's Park / Ontario Provincial Government</td>\n",
" <td>43.662301</td>\n",
" <td>-79.389494</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>M5B</td>\n",
" <td>Downtown Toronto</td>\n",
" <td>Garden District, Ryerson</td>\n",
" <td>43.657162</td>\n",
" <td>-79.378937</td>\n",
" </tr>\n",
" <tr>\n",
" <th>15</th>\n",
" <td>M5C</td>\n",
" <td>Downtown Toronto</td>\n",
" <td>St. James Town</td>\n",
" <td>43.651494</td>\n",
" <td>-79.375418</td>\n",
" </tr>\n",
" <tr>\n",
" <th>19</th>\n",
" <td>M4E</td>\n",
" <td>East Toronto</td>\n",
" <td>The Beaches</td>\n",
" <td>43.676357</td>\n",
" <td>-79.293031</td>\n",
" </tr>\n",
" <tr>\n",
" <th>20</th>\n",
" <td>M5E</td>\n",
" <td>Downtown Toronto</td>\n",
" <td>Berczy Park</td>\n",
" <td>43.644771</td>\n",
" <td>-79.373306</td>\n",
" </tr>\n",
" <tr>\n",
" <th>24</th>\n",
" <td>M5G</td>\n",
" <td>Downtown Toronto</td>\n",
" <td>Central Bay Street</td>\n",
" <td>43.657952</td>\n",
" <td>-79.387383</td>\n",
" </tr>\n",
" <tr>\n",
" <th>25</th>\n",
" <td>M6G</td>\n",
" <td>Downtown Toronto</td>\n",
" <td>Christie</td>\n",
" <td>43.669542</td>\n",
" <td>-79.422564</td>\n",
" </tr>\n",
" <tr>\n",
" <th>30</th>\n",
" <td>M5H</td>\n",
" <td>Downtown Toronto</td>\n",
" <td>Richmond / Adelaide / King</td>\n",
" <td>43.650571</td>\n",
" <td>-79.384568</td>\n",
" </tr>\n",
" <tr>\n",
" <th>31</th>\n",
" <td>M6H</td>\n",
" <td>West Toronto</td>\n",
" <td>Dufferin / Dovercourt Village</td>\n",
" <td>43.669005</td>\n",
" <td>-79.442259</td>\n",
" </tr>\n",
" <tr>\n",
" <th>36</th>\n",
" <td>M5J</td>\n",
" <td>Downtown Toronto</td>\n",
" <td>Harbourfront East / Union Station / Toronto Is...</td>\n",
" <td>43.640816</td>\n",
" <td>-79.381752</td>\n",
" </tr>\n",
" <tr>\n",
" <th>37</th>\n",
" <td>M6J</td>\n",
" <td>West Toronto</td>\n",
" <td>Little Portugal / Trinity</td>\n",
" <td>43.647927</td>\n",
" <td>-79.419750</td>\n",
" </tr>\n",
" <tr>\n",
" <th>41</th>\n",
" <td>M4K</td>\n",
" <td>East Toronto</td>\n",
" <td>The Danforth West / Riverdale</td>\n",
" <td>43.679557</td>\n",
" <td>-79.352188</td>\n",
" </tr>\n",
" <tr>\n",
" <th>42</th>\n",
" <td>M5K</td>\n",
" <td>Downtown Toronto</td>\n",
" <td>Toronto Dominion Centre / Design Exchange</td>\n",
" <td>43.647177</td>\n",
" <td>-79.381576</td>\n",
" </tr>\n",
" <tr>\n",
" <th>43</th>\n",
" <td>M6K</td>\n",
" <td>West Toronto</td>\n",
" <td>Brockton / Parkdale Village / Exhibition Place</td>\n",
" <td>43.636847</td>\n",
" <td>-79.428191</td>\n",
" </tr>\n",
" <tr>\n",
" <th>47</th>\n",
" <td>M4L</td>\n",
" <td>East Toronto</td>\n",
" <td>India Bazaar / The Beaches West</td>\n",
" <td>43.668999</td>\n",
" <td>-79.315572</td>\n",
" </tr>\n",
" <tr>\n",
" <th>48</th>\n",
" <td>M5L</td>\n",
" <td>Downtown Toronto</td>\n",
" <td>Commerce Court / Victoria Hotel</td>\n",
" <td>43.648198</td>\n",
" <td>-79.379817</td>\n",
" </tr>\n",
" <tr>\n",
" <th>54</th>\n",
" <td>M4M</td>\n",
" <td>East Toronto</td>\n",
" <td>Studio District</td>\n",
" <td>43.659526</td>\n",
" <td>-79.340923</td>\n",
" </tr>\n",
" <tr>\n",
" <th>61</th>\n",
" <td>M4N</td>\n",
" <td>Central Toronto</td>\n",
" <td>Lawrence Park</td>\n",
" <td>43.728020</td>\n",
" <td>-79.388790</td>\n",
" </tr>\n",
" <tr>\n",
" <th>62</th>\n",
" <td>M5N</td>\n",
" <td>Central Toronto</td>\n",
" <td>Roselawn</td>\n",
" <td>43.711695</td>\n",
" <td>-79.416936</td>\n",
" </tr>\n",
" <tr>\n",
" <th>67</th>\n",
" <td>M4P</td>\n",
" <td>Central Toronto</td>\n",
" <td>Davisville North</td>\n",
" <td>43.712751</td>\n",
" <td>-79.390197</td>\n",
" </tr>\n",
" <tr>\n",
" <th>68</th>\n",
" <td>M5P</td>\n",
" <td>Central Toronto</td>\n",
" <td>Forest Hill North &amp; West</td>\n",
" <td>43.696948</td>\n",
" <td>-79.411307</td>\n",
" </tr>\n",
" <tr>\n",
" <th>69</th>\n",
" <td>M6P</td>\n",
" <td>West Toronto</td>\n",
" <td>High Park / The Junction South</td>\n",
" <td>43.661608</td>\n",
" <td>-79.464763</td>\n",
" </tr>\n",
" <tr>\n",
" <th>73</th>\n",
" <td>M4R</td>\n",
" <td>Central Toronto</td>\n",
" <td>North Toronto West</td>\n",
" <td>43.715383</td>\n",
" <td>-79.405678</td>\n",
" </tr>\n",
" <tr>\n",
" <th>74</th>\n",
" <td>M5R</td>\n",
" <td>Central Toronto</td>\n",
" <td>The Annex / North Midtown / Yorkville</td>\n",
" <td>43.672710</td>\n",
" <td>-79.405678</td>\n",
" </tr>\n",
" <tr>\n",
" <th>75</th>\n",
" <td>M6R</td>\n",
" <td>West Toronto</td>\n",
" <td>Parkdale / Roncesvalles</td>\n",
" <td>43.648960</td>\n",
" <td>-79.456325</td>\n",
" </tr>\n",
" <tr>\n",
" <th>79</th>\n",
" <td>M4S</td>\n",
" <td>Central Toronto</td>\n",
" <td>Davisville</td>\n",
" <td>43.704324</td>\n",
" <td>-79.388790</td>\n",
" </tr>\n",
" <tr>\n",
" <th>80</th>\n",
" <td>M5S</td>\n",
" <td>Downtown Toronto</td>\n",
" <td>University of Toronto / Harbord</td>\n",
" <td>43.662696</td>\n",
" <td>-79.400049</td>\n",
" </tr>\n",
" <tr>\n",
" <th>81</th>\n",
" <td>M6S</td>\n",
" <td>West Toronto</td>\n",
" <td>Runnymede / Swansea</td>\n",
" <td>43.651571</td>\n",
" <td>-79.484450</td>\n",
" </tr>\n",
" <tr>\n",
" <th>83</th>\n",
" <td>M4T</td>\n",
" <td>Central Toronto</td>\n",
" <td>Moore Park / Summerhill East</td>\n",
" <td>43.689574</td>\n",
" <td>-79.383160</td>\n",
" </tr>\n",
" <tr>\n",
" <th>84</th>\n",
" <td>M5T</td>\n",
" <td>Downtown Toronto</td>\n",
" <td>Kensington Market / Chinatown / Grange Park</td>\n",
" <td>43.653206</td>\n",
" <td>-79.400049</td>\n",
" </tr>\n",
" <tr>\n",
" <th>86</th>\n",
" <td>M4V</td>\n",
" <td>Central Toronto</td>\n",
" <td>Summerhill West / Rathnelly / South Hill / For...</td>\n",
" <td>43.686412</td>\n",
" <td>-79.400049</td>\n",
" </tr>\n",
" <tr>\n",
" <th>87</th>\n",
" <td>M5V</td>\n",
" <td>Downtown Toronto</td>\n",
" <td>CN Tower / King and Spadina / Railway Lands / ...</td>\n",
" <td>43.628947</td>\n",
" <td>-79.394420</td>\n",
" </tr>\n",
" <tr>\n",
" <th>91</th>\n",
" <td>M4W</td>\n",
" <td>Downtown Toronto</td>\n",
" <td>Rosedale</td>\n",
" <td>43.679563</td>\n",
" <td>-79.377529</td>\n",
" </tr>\n",
" <tr>\n",
" <th>92</th>\n",
" <td>M5W</td>\n",
" <td>Downtown Toronto</td>\n",
" <td>Stn A PO Boxes</td>\n",
" <td>43.646435</td>\n",
" <td>-79.374846</td>\n",
" </tr>\n",
" <tr>\n",
" <th>96</th>\n",
" <td>M4X</td>\n",
" <td>Downtown Toronto</td>\n",
" <td>St. James Town / Cabbagetown</td>\n",
" <td>43.667967</td>\n",
" <td>-79.367675</td>\n",
" </tr>\n",
" <tr>\n",
" <th>97</th>\n",
" <td>M5X</td>\n",
" <td>Downtown Toronto</td>\n",
" <td>First Canadian Place / Underground city</td>\n",
" <td>43.648429</td>\n",
" <td>-79.382280</td>\n",
" </tr>\n",
" <tr>\n",
" <th>99</th>\n",
" <td>M4Y</td>\n",
" <td>Downtown Toronto</td>\n",
" <td>Church and Wellesley</td>\n",
" <td>43.665860</td>\n",
" <td>-79.383160</td>\n",
" </tr>\n",
" <tr>\n",
" <th>100</th>\n",
" <td>M7Y</td>\n",
" <td>East Toronto</td>\n",
" <td>Business reply mail Processing CentrE</td>\n",
" <td>43.662744</td>\n",
" <td>-79.321558</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Postal code Borough \\\n",
"2 M5A Downtown Toronto \n",
"4 M7A Downtown Toronto \n",
"9 M5B Downtown Toronto \n",
"15 M5C Downtown Toronto \n",
"19 M4E East Toronto \n",
"20 M5E Downtown Toronto \n",
"24 M5G Downtown Toronto \n",
"25 M6G Downtown Toronto \n",
"30 M5H Downtown Toronto \n",
"31 M6H West Toronto \n",
"36 M5J Downtown Toronto \n",
"37 M6J West Toronto \n",
"41 M4K East Toronto \n",
"42 M5K Downtown Toronto \n",
"43 M6K West Toronto \n",
"47 M4L East Toronto \n",
"48 M5L Downtown Toronto \n",
"54 M4M East Toronto \n",
"61 M4N Central Toronto \n",
"62 M5N Central Toronto \n",
"67 M4P Central Toronto \n",
"68 M5P Central Toronto \n",
"69 M6P West Toronto \n",
"73 M4R Central Toronto \n",
"74 M5R Central Toronto \n",
"75 M6R West Toronto \n",
"79 M4S Central Toronto \n",
"80 M5S Downtown Toronto \n",
"81 M6S West Toronto \n",
"83 M4T Central Toronto \n",
"84 M5T Downtown Toronto \n",
"86 M4V Central Toronto \n",
"87 M5V Downtown Toronto \n",
"91 M4W Downtown Toronto \n",
"92 M5W Downtown Toronto \n",
"96 M4X Downtown Toronto \n",
"97 M5X Downtown Toronto \n",
"99 M4Y Downtown Toronto \n",
"100 M7Y East Toronto \n",
"\n",
" Neighborhood Latitude Longitude \n",
"2 Regent Park / Harbourfront 43.654260 -79.360636 \n",
"4 Queen's Park / Ontario Provincial Government 43.662301 -79.389494 \n",
"9 Garden District, Ryerson 43.657162 -79.378937 \n",
"15 St. James Town 43.651494 -79.375418 \n",
"19 The Beaches 43.676357 -79.293031 \n",
"20 Berczy Park 43.644771 -79.373306 \n",
"24 Central Bay Street 43.657952 -79.387383 \n",
"25 Christie 43.669542 -79.422564 \n",
"30 Richmond / Adelaide / King 43.650571 -79.384568 \n",
"31 Dufferin / Dovercourt Village 43.669005 -79.442259 \n",
"36 Harbourfront East / Union Station / Toronto Is... 43.640816 -79.381752 \n",
"37 Little Portugal / Trinity 43.647927 -79.419750 \n",
"41 The Danforth West / Riverdale 43.679557 -79.352188 \n",
"42 Toronto Dominion Centre / Design Exchange 43.647177 -79.381576 \n",
"43 Brockton / Parkdale Village / Exhibition Place 43.636847 -79.428191 \n",
"47 India Bazaar / The Beaches West 43.668999 -79.315572 \n",
"48 Commerce Court / Victoria Hotel 43.648198 -79.379817 \n",
"54 Studio District 43.659526 -79.340923 \n",
"61 Lawrence Park 43.728020 -79.388790 \n",
"62 Roselawn 43.711695 -79.416936 \n",
"67 Davisville North 43.712751 -79.390197 \n",
"68 Forest Hill North & West 43.696948 -79.411307 \n",
"69 High Park / The Junction South 43.661608 -79.464763 \n",
"73 North Toronto West 43.715383 -79.405678 \n",
"74 The Annex / North Midtown / Yorkville 43.672710 -79.405678 \n",
"75 Parkdale / Roncesvalles 43.648960 -79.456325 \n",
"79 Davisville 43.704324 -79.388790 \n",
"80 University of Toronto / Harbord 43.662696 -79.400049 \n",
"81 Runnymede / Swansea 43.651571 -79.484450 \n",
"83 Moore Park / Summerhill East 43.689574 -79.383160 \n",
"84 Kensington Market / Chinatown / Grange Park 43.653206 -79.400049 \n",
"86 Summerhill West / Rathnelly / South Hill / For... 43.686412 -79.400049 \n",
"87 CN Tower / King and Spadina / Railway Lands / ... 43.628947 -79.394420 \n",
"91 Rosedale 43.679563 -79.377529 \n",
"92 Stn A PO Boxes 43.646435 -79.374846 \n",
"96 St. James Town / Cabbagetown 43.667967 -79.367675 \n",
"97 First Canadian Place / Underground city 43.648429 -79.382280 \n",
"99 Church and Wellesley 43.665860 -79.383160 \n",
"100 Business reply mail Processing CentrE 43.662744 -79.321558 "
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"Toronto_df = Canada_df[Canada_df['Borough'].str.contains('Toronto', regex =False)]\n",
"Toronto_df"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(39, 5)"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"Toronto_df.shape"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's find the latitude and longitude of Toronto"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The geographical coordinates of Toronto are 43.6534817, -79.3839347.\n"
]
}
],
"source": [
"address = 'Toronto'\n",
"\n",
"geolocator = Nominatim(user_agent=\"toronto_explorer\")\n",
"location = geolocator.geocode(address)\n",
"latitude = location.latitude\n",
"longitude = location.longitude\n",
"print('The geographical coordinates of Toronto are {}, {}.'.format(latitude, longitude))"
]
},
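{
"cell_type": "markdown",
"metadata": {},
"source": [
"`geolocator.geocode` returns `None` when no match is found, in which case `location.latitude` raises an `AttributeError`. A small sketch of a guard against that case: `lookup_coordinates` is a hypothetical helper, and `fake_geocode` stands in for `geolocator.geocode` so that the example runs without network access.\n",
"\n",
"```python\n",
"from collections import namedtuple\n",
"\n",
"# Minimal stand-in for the object returned by a geocoder.\n",
"Location = namedtuple('Location', ['latitude', 'longitude'])\n",
"\n",
"def lookup_coordinates(geocode, address):\n",
"    '''Return (latitude, longitude) for address, or None if there is no match.'''\n",
"    location = geocode(address)\n",
"    if location is None:\n",
"        return None\n",
"    return location.latitude, location.longitude\n",
"\n",
"def fake_geocode(address):\n",
"    # Stand-in for geolocator.geocode; the Toronto values come from the output above.\n",
"    known = {'Toronto': Location(43.6534817, -79.3839347)}\n",
"    return known.get(address)\n",
"\n",
"toronto = lookup_coordinates(fake_geocode, 'Toronto')\n",
"missing = lookup_coordinates(fake_geocode, 'Nowhere')\n",
"print(toronto, missing)\n",
"```\n",
"\n",
"With the real geocoder, the call would be `lookup_coordinates(geolocator.geocode, 'Toronto')`."
]
},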
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Explore and cluster the neighborhoods in Toronto"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"Explore a Map of Toronto \n"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"button": false,
"collapsed": false,
"jupyter": {
"outputs_hidden": false
},
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/html": [
"<div style=\"width:100%;\"><div style=\"position:relative;width:100%;height:0;padding-bottom:60%;\"><iframe src=\"about:blank\" style=\"position:absolute;width:100%;height:100%;left:0;top:0;border:none !important;\" data-html= onload=\"this.contentDocument.open();this.contentDocument.write(atob(this.getAttribute('data-html')));this.contentDocument.close();\" allowfullscreen webkitallowfullscreen mozallowfullscreen></iframe></div></div>"
],
"text/plain": [
"<folium.folium.Map at 0x7fa747f26a58>"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# center the map on the coordinates geocoded above\n",
"map_toronto = folium.Map(location=[latitude, longitude], zoom_start=12)\n",
"\n",
"# add one marker per neighborhood\n",
"for lat, lng, borough, neighborhood in zip(Toronto_df['Latitude'], Toronto_df['Longitude'], Toronto_df['Borough'], Toronto_df['Neighborhood']):\n",
"    label = '{}, {}'.format(neighborhood, borough)\n",
"    label = folium.Popup(label, parse_html=True)\n",
"    folium.CircleMarker(\n",
"        [lat, lng],\n",
"        radius=2,\n",
"        popup=label,\n",
"        color='blue',\n",
"        fill=True,\n",
"        fill_color='#3186cc',\n",
"        fill_opacity=0.7).add_to(map_toronto)\n",
"\n",
"map_toronto"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Generate maps to visualize the neighborhoods and how they cluster together"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, we are going to start utilizing the Foursquare API to explore the neighborhoods and segment them."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Define Foursquare Credentials and Version"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"button": false,
"collapsed": false,
"jupyter": {
"outputs_hidden": false
},
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Your credentials:\n",
"CLIENT_ID: LAUAGY5VQH2DJ4VUXN4OXSNGEGCKT0TLXSDASO4FL1XB4SES\n",
"CLIENT_SECRET:D3CTWFSGB2D5XW0DQ02ZB2VGVOI2IZMI0ISACKDSVCL0MEV2\n"
]
}
],
"source": [
"CLIENT_ID = 'LAUAGY5VQH2DJ4VUXN4OXSNGEGCKT0TLXSDASO4FL1XB4SES' # your Foursquare ID\n",
"CLIENT_SECRET = 'D3CTWFSGB2D5XW0DQ02ZB2VGVOI2IZMI0ISACKDSVCL0MEV2' # your Foursquare Secret\n",
"VERSION = '20180605' # Foursquare API version\n",
"\n",
"print('Your credentials:')\n",
"print('CLIENT_ID: ' + CLIENT_ID)\n",
"print('CLIENT_SECRET:' + CLIENT_SECRET)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Let's explore the venues around the center of Toronto"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Get the top 100 venues within a radius of 500 meters of the center of Toronto."
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [],
"source": [
"LIMIT = 100 # limit of number of venues returned by Foursquare API\n",
"radius = 500 # define radius"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'https://api.foursquare.com/v2/venues/explore?&client_id=LAUAGY5VQH2DJ4VUXN4OXSNGEGCKT0TLXSDASO4FL1XB4SES&client_secret=D3CTWFSGB2D5XW0DQ02ZB2VGVOI2IZMI0ISACKDSVCL0MEV2&v=20180605&ll=43.6534817,-79.3839347&radius=500&limit=100'"
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# create URL\n",
"url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(\n",
" CLIENT_ID, \n",
" CLIENT_SECRET, \n",
" VERSION, \n",
" latitude, \n",
" longitude, \n",
" radius, \n",
" LIMIT)\n",
"url # display URL\n",
" "
]
},
{
"cell_type": "markdown",
"metadata": {
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"Send the GET request and examine the results."
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
"results = requests.get(url).json()\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"From the Foursquare lab in the previous example, we know that all the venue information is stored under the *items* key. Let's borrow the get_category_type function from that lab."
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [],
"source": [
"# function that extracts the category of the venue\n",
"def get_category_type(row):\n",
"    try:\n",
"        categories_list = row['categories']\n",
"    except KeyError:\n",
" categories_list = row['venue.categories']\n",
" \n",
" if len(categories_list) == 0:\n",
" return None\n",
" else:\n",
" return categories_list[0]['name']"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Clean the JSON and structure it into a pandas dataframe"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/jupyterlab/conda/envs/python/lib/python3.6/site-packages/ipykernel_launcher.py:3: FutureWarning: pandas.io.json.json_normalize is deprecated, use pandas.json_normalize instead\n",
" This is separate from the ipykernel package so we can avoid doing imports until\n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>name</th>\n",
" <th>categories</th>\n",
" <th>lat</th>\n",
" <th>lng</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Downtown Toronto</td>\n",
" <td>Neighborhood</td>\n",
" <td>43.653232</td>\n",
" <td>-79.385296</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Nathan Phillips Square</td>\n",
" <td>Plaza</td>\n",
" <td>43.652270</td>\n",
" <td>-79.383516</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Eggspectation Bell Trinity Square</td>\n",
" <td>Breakfast Spot</td>\n",
" <td>43.653144</td>\n",
" <td>-79.381980</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Japango</td>\n",
" <td>Sushi Restaurant</td>\n",
" <td>43.655268</td>\n",
" <td>-79.385165</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Indigo</td>\n",
" <td>Bookstore</td>\n",
" <td>43.653515</td>\n",
" <td>-79.380696</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" name categories lat lng\n",
"0 Downtown Toronto Neighborhood 43.653232 -79.385296\n",
"1 Nathan Phillips Square Plaza 43.652270 -79.383516\n",
"2 Eggspectation Bell Trinity Square Breakfast Spot 43.653144 -79.381980\n",
"3 Japango Sushi Restaurant 43.655268 -79.385165\n",
"4 Indigo Bookstore 43.653515 -79.380696"
]
},
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"venues = results['response']['groups'][0]['items']\n",
"\n",
"nearby_venues = pd.json_normalize(venues) # flatten JSON\n",
"\n",
"# filter columns\n",
"filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']\n",
"nearby_venues = nearby_venues.loc[:, filtered_columns]\n",
"\n",
"# filter the category for each row\n",
"nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)\n",
"\n",
"# clean columns\n",
"nearby_venues.columns = [col.split(\".\")[-1] for col in nearby_venues.columns]\n",
"\n",
"nearby_venues.head()"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"78 venues were returned by Foursquare.\n"
]
}
],
"source": [
"print('{} venues were returned by Foursquare.'.format(nearby_venues.shape[0]))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The cell above reports the size of the resulting dataframe: one row for each venue returned by Foursquare."
]
},
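{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's create a function that repeats the same process for every neighborhood in Toronto and collects the venues in a new dataframe."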
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [],
"source": [
"def getNearbyVenues(names, latitudes, longitudes, radius=500):\n",
"    \"\"\"Return a dataframe with the venues Foursquare finds within `radius` meters of each location.\"\"\"\n",
"    venues_list = []\n",
"    for name, lat, lng in zip(names, latitudes, longitudes):\n",
"        print(name)\n",
"\n",
"        # create the API request URL\n",
"        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(\n",
"            CLIENT_ID,\n",
"            CLIENT_SECRET,\n",
"            VERSION,\n",
"            lat,\n",
"            lng,\n",
"            radius,\n",
"            LIMIT)\n",
"\n",
"        # make the GET request\n",
"        results = requests.get(url).json()['response']['groups'][0]['items']\n",
"\n",
"        # keep only the relevant information for each nearby venue;\n",
"        # a venue without categories gets None instead of raising an IndexError\n",
"        venues_list.extend([(\n",
"            name,\n",
"            lat,\n",
"            lng,\n",
"            v['venue']['name'],\n",
"            v['venue']['location']['lat'],\n",
"            v['venue']['location']['lng'],\n",
"            v['venue']['categories'][0]['name'] if v['venue']['categories'] else None) for v in results])\n",
"\n",
"    nearby_venues = pd.DataFrame(venues_list, columns=[\n",
"        'Neighborhood',\n",
"        'Neighborhood Latitude',\n",
"        'Neighborhood Longitude',\n",
"        'Venue',\n",
"        'Venue Latitude',\n",
"        'Venue Longitude',\n",
"        'Venue Category'])\n",
"\n",
"    return nearby_venues"
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
},
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Regent Park / Harbourfront\n",
"Queen's Park / Ontario Provincial Government\n",
"Garden District, Ryerson\n",
"St. James Town\n",
"The Beaches\n",
"Berczy Park\n",
"Central Bay Street\n",
"Christie\n",
"Richmond / Adelaide / King\n",
"Dufferin / Dovercourt Village\n",
"Harbourfront East / Union Station / Toronto Islands\n",
"Little Portugal / Trinity\n",
"The Danforth West / Riverdale\n",
"Toronto Dominion Centre / Design Exchange\n",
"Brockton / Parkdale Village / Exhibition Place\n",
"India Bazaar / The Beaches West\n",
"Commerce Court / Victoria Hotel\n",
"Studio District\n",
"Lawrence Park\n",
"Roselawn\n",
"Davisville North\n",
"Forest Hill North & West\n",
"High Park / The Junction South\n",
"North Toronto West\n",
"The Annex / North Midtown / Yorkville\n",
"Parkdale / Roncesvalles\n",
"Davisville\n",
"University of Toronto / Harbord\n",
"Runnymede / Swansea\n",
"Moore Park / Summerhill East\n",
"Kensington Market / Chinatown / Grange Park\n",
"Summerhill West / Rathnelly / South Hill / Forest Hill SE / Deer Park\n",
"CN Tower / King and Spadina / Railway Lands / Harbourfront West / Bathurst Quay / South Niagara / Island airport\n",
"Rosedale\n",
"Stn A PO Boxes\n",
"St. James Town / Cabbagetown\n",
"First Canadian Place / Underground city\n",
"Church and Wellesley\n",
"Business reply mail Processing CentrE\n"
]
}
],
"source": [
"Toronto_venues = getNearbyVenues(names=Toronto_df['Neighborhood'],\n",
" latitudes=Toronto_df['Latitude'],\n",
" longitudes=Toronto_df['Longitude']\n",
" )"
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Neighborhood</th>\n",
" <th>Neighborhood Latitude</th>\n",
" <th>Neighborhood Longitude</th>\n",
" <th>Venue</th>\n",
" <th>Venue Latitude</th>\n",
" <th>Venue Longitude</th>\n",
" <th>Venue Category</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Regent Park / Harbourfront</td>\n",
" <td>43.65426</td>\n",
" <td>-79.360636</td>\n",
" <td>Roselle Desserts</td>\n",
" <td>43.653447</td>\n",
" <td>-79.362017</td>\n",
" <td>Bakery</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Regent Park / Harbourfront</td>\n",
" <td>43.65426</td>\n",
" <td>-79.360636</td>\n",
" <td>Tandem Coffee</td>\n",
" <td>43.653559</td>\n",
" <td>-79.361809</td>\n",
" <td>Coffee Shop</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Regent Park / Harbourfront</td>\n",
" <td>43.65426</td>\n",
" <td>-79.360636</td>\n",
" <td>Cooper Koo Family YMCA</td>\n",
" <td>43.653249</td>\n",
" <td>-79.358008</td>\n",
" <td>Distribution Center</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Regent Park / Harbourfront</td>\n",
" <td>43.65426</td>\n",
" <td>-79.360636</td>\n",
" <td>Body Blitz Spa East</td>\n",
" <td>43.654735</td>\n",
" <td>-79.359874</td>\n",
" <td>Spa</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Regent Park / Harbourfront</td>\n",
" <td>43.65426</td>\n",
" <td>-79.360636</td>\n",
" <td>Morning Glory Cafe</td>\n",
" <td>43.653947</td>\n",
" <td>-79.361149</td>\n",
" <td>Breakfast Spot</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Neighborhood Neighborhood Latitude Neighborhood Longitude \\\n",
"0 Regent Park / Harbourfront 43.65426 -79.360636 \n",
"1 Regent Park / Harbourfront 43.65426 -79.360636 \n",
"2 Regent Park / Harbourfront 43.65426 -79.360636 \n",
"3 Regent Park / Harbourfront 43.65426 -79.360636 \n",
"4 Regent Park / Harbourfront 43.65426 -79.360636 \n",
"\n",
" Venue Venue Latitude Venue Longitude \\\n",
"0 Roselle Desserts 43.653447 -79.362017 \n",
"1 Tandem Coffee 43.653559 -79.361809 \n",
"2 Cooper Koo Family YMCA 43.653249 -79.358008 \n",
"3 Body Blitz Spa East 43.654735 -79.359874 \n",
"4 Morning Glory Cafe 43.653947 -79.361149 \n",
"\n",
" Venue Category \n",
"0 Bakery \n",
"1 Coffee Shop \n",
"2 Distribution Center \n",
"3 Spa \n",
"4 Breakfast Spot "
]
},
"execution_count": 41,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"Toronto_venues.head()"
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Neighborhood Latitude</th>\n",
" <th>Neighborhood Longitude</th>\n",
" <th>Venue</th>\n",
" <th>Venue Latitude</th>\n",
" <th>Venue Longitude</th>\n",
" <th>Venue Category</th>\n",
" </tr>\n",
" <tr>\n",
" <th>Neighborhood</th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>Berczy Park</th>\n",
" <td>55</td>\n",
" <td>55</td>\n",
" <td>55</td>\n",
" <td>55</td>\n",
" <td>55</td>\n",
" <td>55</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Brockton / Parkdale Village / Exhibition Place</th>\n",
" <td>24</td>\n",
" <td>24</td>\n",
" <td>24</td>\n",
" <td>24</td>\n",
" <td>24</td>\n",
" <td>24</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Business reply mail Processing CentrE</th>\n",
" <td>19</td>\n",
" <td>19</td>\n",
" <td>19</td>\n",
" <td>19</td>\n",
" <td>19</td>\n",
" <td>19</td>\n",
" </tr>\n",
" <tr>\n",
" <th>CN Tower / King and Spadina / Railway Lands / Harbourfront West / Bathurst Quay / South Niagara / Island airport</th>\n",
" <td>17</td>\n",
" <td>17</td>\n",
" <td>17</td>\n",
" <td>17</td>\n",
" <td>17</td>\n",
" <td>17</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Central Bay Street</th>\n",
" <td>65</td>\n",
" <td>65</td>\n",
" <td>65</td>\n",
" <td>65</td>\n",
" <td>65</td>\n",
" <td>65</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Christie</th>\n",
" <td>17</td>\n",
" <td>17</td>\n",
" <td>17</td>\n",
" <td>17</td>\n",
" <td>17</td>\n",
" <td>17</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Church and Wellesley</th>\n",
" <td>74</td>\n",
" <td>74</td>\n",
" <td>74</td>\n",
" <td>74</td>\n",
" <td>74</td>\n",
" <td>74</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Commerce Court / Victoria Hotel</th>\n",
" <td>100</td>\n",
" <td>100</td>\n",
" <td>100</td>\n",
" <td>100</td>\n",
" <td>100</td>\n",
" <td>100</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Davisville</th>\n",
" <td>34</td>\n",
" <td>34</td>\n",
" <td>34</td>\n",
" <td>34</td>\n",
" <td>34</td>\n",
" <td>34</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Davisville North</th>\n",
" <td>11</td>\n",
" <td>11</td>\n",
" <td>11</td>\n",
" <td>11</td>\n",
" <td>11</td>\n",
" <td>11</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Dufferin / Dovercourt Village</th>\n",
" <td>15</td>\n",
" <td>15</td>\n",
" <td>15</td>\n",
" <td>15</td>\n",
" <td>15</td>\n",
" <td>15</td>\n",
" </tr>\n",
" <tr>\n",
" <th>First Canadian Place / Underground city</th>\n",
" <td>100</td>\n",
" <td>100</td>\n",
" <td>100</td>\n",
" <td>100</td>\n",
" <td>100</td>\n",
" <td>100</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Forest Hill North &amp; West</th>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Garden District, Ryerson</th>\n",
" <td>100</td>\n",
" <td>100</td>\n",
" <td>100</td>\n",
" <td>100</td>\n",
" <td>100</td>\n",
" <td>100</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Harbourfront East / Union Station / Toronto Islands</th>\n",
" <td>100</td>\n",
" <td>100</td>\n",
" <td>100</td>\n",
" <td>100</td>\n",
" <td>100</td>\n",
" <td>100</td>\n",
" </tr>\n",
" <tr>\n",
" <th>High Park / The Junction South</th>\n",
" <td>25</td>\n",
" <td>25</td>\n",
" <td>25</td>\n",
" <td>25</td>\n",
" <td>25</td>\n",
" <td>25</td>\n",
" </tr>\n",
" <tr>\n",
" <th>India Bazaar / The Beaches West</th>\n",
" <td>20</td>\n",
" <td>20</td>\n",
" <td>20</td>\n",
" <td>20</td>\n",
" <td>20</td>\n",
" <td>20</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Kensington Market / Chinatown / Grange Park</th>\n",
" <td>62</td>\n",
" <td>62</td>\n",
" <td>62</td>\n",
" <td>62</td>\n",
" <td>62</td>\n",
" <td>62</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Lawrence Park</th>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Little Portugal / Trinity</th>\n",
" <td>43</td>\n",
" <td>43</td>\n",
" <td>43</td>\n",
" <td>43</td>\n",
" <td>43</td>\n",
" <td>43</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Moore Park / Summerhill East</th>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>North Toronto West</th>\n",
" <td>21</td>\n",
" <td>21</td>\n",
" <td>21</td>\n",
" <td>21</td>\n",
" <td>21</td>\n",
" <td>21</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Parkdale / Roncesvalles</th>\n",
" <td>14</td>\n",
" <td>14</td>\n",
" <td>14</td>\n",
" <td>14</td>\n",
" <td>14</td>\n",
" <td>14</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Queen's Park / Ontario Provincial Government</th>\n",
" <td>31</td>\n",
" <td>31</td>\n",
" <td>31</td>\n",
" <td>31</td>\n",
" <td>31</td>\n",
" <td>31</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Regent Park / Harbourfront</th>\n",
" <td>47</td>\n",
" <td>47</td>\n",
" <td>47</td>\n",
" <td>47</td>\n",
" <td>47</td>\n",
" <td>47</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Richmond / Adelaide / King</th>\n",
" <td>97</td>\n",
" <td>97</td>\n",
" <td>97</td>\n",
" <td>97</td>\n",
" <td>97</td>\n",
" <td>97</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Rosedale</th>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Roselawn</th>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Runnymede / Swansea</th>\n",
" <td>41</td>\n",
" <td>41</td>\n",
" <td>41</td>\n",
" <td>41</td>\n",
" <td>41</td>\n",
" <td>41</td>\n",
" </tr>\n",
" <tr>\n",
" <th>St. James Town</th>\n",
" <td>86</td>\n",
" <td>86</td>\n",
" <td>86</td>\n",
" <td>86</td>\n",
" <td>86</td>\n",
" <td>86</td>\n",
" </tr>\n",
" <tr>\n",
" <th>St. James Town / Cabbagetown</th>\n",
" <td>44</td>\n",
" <td>44</td>\n",
" <td>44</td>\n",
" <td>44</td>\n",
" <td>44</td>\n",
" <td>44</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Stn A PO Boxes</th>\n",
" <td>95</td>\n",
" <td>95</td>\n",
" <td>95</td>\n",
" <td>95</td>\n",
" <td>95</td>\n",
" <td>95</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Studio District</th>\n",
" <td>41</td>\n",
" <td>41</td>\n",
" <td>41</td>\n",
" <td>41</td>\n",
" <td>41</td>\n",
" <td>41</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Summerhill West / Rathnelly / South Hill / Forest Hill SE / Deer Park</th>\n",
" <td>16</td>\n",
" <td>16</td>\n",
" <td>16</td>\n",
" <td>16</td>\n",
" <td>16</td>\n",
" <td>16</td>\n",
" </tr>\n",
" <tr>\n",
" <th>The Annex / North Midtown / Yorkville</th>\n",
" <td>22</td>\n",
" <td>22</td>\n",
" <td>22</td>\n",
" <td>22</td>\n",
" <td>22</td>\n",
" <td>22</td>\n",
" </tr>\n",
" <tr>\n",
" <th>The Beaches</th>\n",
" <td>5</td>\n",
" <td>5</td>\n",
" <td>5</td>\n",
" <td>5</td>\n",
" <td>5</td>\n",
" <td>5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>The Danforth West / Riverdale</th>\n",
" <td>43</td>\n",
" <td>43</td>\n",
" <td>43</td>\n",
" <td>43</td>\n",
" <td>43</td>\n",
" <td>43</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Toronto Dominion Centre / Design Exchange</th>\n",
" <td>100</td>\n",
" <td>100</td>\n",
" <td>100</td>\n",
" <td>100</td>\n",
" <td>100</td>\n",
" <td>100</td>\n",
" </tr>\n",
" <tr>\n",
" <th>University of Toronto / Harbord</th>\n",
" <td>35</td>\n",
" <td>35</td>\n",
" <td>35</td>\n",
" <td>35</td>\n",
" <td>35</td>\n",
" <td>35</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Neighborhood Latitude \\\n",
"Neighborhood \n",
"Berczy Park 55 \n",
"Brockton / Parkdale Village / Exhibition Place 24 \n",
"Business reply mail Processing CentrE 19 \n",
"CN Tower / King and Spadina / Railway Lands / H... 17 \n",
"Central Bay Street 65 \n",
"Christie 17 \n",
"Church and Wellesley 74 \n",
"Commerce Court / Victoria Hotel 100 \n",
"Davisville 34 \n",
"Davisville North 11 \n",
"Dufferin / Dovercourt Village 15 \n",
"First Canadian Place / Underground city 100 \n",
"Forest Hill North & West 4 \n",
"Garden District, Ryerson 100 \n",
"Harbourfront East / Union Station / Toronto Isl... 100 \n",
"High Park / The Junction South 25 \n",
"India Bazaar / The Beaches West 20 \n",
"Kensington Market / Chinatown / Grange Park 62 \n",
"Lawrence Park 3 \n",
"Little Portugal / Trinity 43 \n",
"Moore Park / Summerhill East 3 \n",
"North Toronto West 21 \n",
"Parkdale / Roncesvalles 14 \n",
"Queen's Park / Ontario Provincial Government 31 \n",
"Regent Park / Harbourfront 47 \n",
"Richmond / Adelaide / King 97 \n",
"Rosedale 4 \n",
"Roselawn 2 \n",
"Runnymede / Swansea 41 \n",
"St. James Town 86 \n",
"St. James Town / Cabbagetown 44 \n",
"Stn A PO Boxes 95 \n",
"Studio District 41 \n",
"Summerhill West / Rathnelly / South Hill / Fore... 16 \n",
"The Annex / North Midtown / Yorkville 22 \n",
"The Beaches 5 \n",
"The Danforth West / Riverdale 43 \n",
"Toronto Dominion Centre / Design Exchange 100 \n",
"University of Toronto / Harbord 35 \n",
"\n",
" Neighborhood Longitude \\\n",
"Neighborhood \n",
"Berczy Park 55 \n",
"Brockton / Parkdale Village / Exhibition Place 24 \n",
"Business reply mail Processing CentrE 19 \n",
"CN Tower / King and Spadina / Railway Lands / H... 17 \n",
"Central Bay Street 65 \n",
"Christie 17 \n",
"Church and Wellesley 74 \n",
"Commerce Court / Victoria Hotel 100 \n",
"Davisville 34 \n",
"Davisville North 11 \n",
"Dufferin / Dovercourt Village 15 \n",
"First Canadian Place / Underground city 100 \n",
"Forest Hill North & West 4 \n",
"Garden District, Ryerson 100 \n",
"Harbourfront East / Union Station / Toronto Isl... 100 \n",
"High Park / The Junction South 25 \n",
"India Bazaar / The Beaches West 20 \n",
"Kensington Market / Chinatown / Grange Park 62 \n",
"Lawrence Park 3 \n",
"Little Portugal / Trinity 43 \n",
"Moore Park / Summerhill East 3 \n",
"North Toronto West 21 \n",
"Parkdale / Roncesvalles 14 \n",
"Queen's Park / Ontario Provincial Government 31 \n",
"Regent Park / Harbourfront 47 \n",
"Richmond / Adelaide / King 97 \n",
"Rosedale 4 \n",
"Roselawn 2 \n",
"Runnymede / Swansea 41 \n",
"St. James Town 86 \n",
"St. James Town / Cabbagetown 44 \n",
"Stn A PO Boxes 95 \n",
"Studio District 41 \n",
"Summerhill West / Rathnelly / South Hill / Fore... 16 \n",
"The Annex / North Midtown / Yorkville 22 \n",
"The Beaches 5 \n",
"The Danforth West / Riverdale 43 \n",
"Toronto Dominion Centre / Design Exchange 100 \n",
"University of Toronto / Harbord 35 \n",
"\n",
" Venue Venue Latitude \\\n",
"Neighborhood \n",
"Berczy Park 55 55 \n",
"Brockton / Parkdale Village / Exhibition Place 24 24 \n",
"Business reply mail Processing CentrE 19 19 \n",
"CN Tower / King and Spadina / Railway Lands / H... 17 17 \n",
"Central Bay Street 65 65 \n",
"Christie 17 17 \n",
"Church and Wellesley 74 74 \n",
"Commerce Court / Victoria Hotel 100 100 \n",
"Davisville 34 34 \n",
"Davisville North 11 11 \n",
"Dufferin / Dovercourt Village 15 15 \n",
"First Canadian Place / Underground city 100 100 \n",
"Forest Hill North & West 4 4 \n",
"Garden District, Ryerson 100 100 \n",
"Harbourfront East / Union Station / Toronto Isl... 100 100 \n",
"High Park / The Junction South 25 25 \n",
"India Bazaar / The Beaches West 20 20 \n",
"Kensington Market / Chinatown / Grange Park 62 62 \n",
"Lawrence Park 3 3 \n",
"Little Portugal / Trinity 43 43 \n",
"Moore Park / Summerhill East 3 3 \n",
"North Toronto West 21 21 \n",
"Parkdale / Roncesvalles 14 14 \n",
"Queen's Park / Ontario Provincial Government 31 31 \n",
"Regent Park / Harbourfront 47 47 \n",
"Richmond / Adelaide / King 97 97 \n",
"Rosedale 4 4 \n",
"Roselawn 2 2 \n",
"Runnymede / Swansea 41 41 \n",
"St. James Town 86 86 \n",
"St. James Town / Cabbagetown 44 44 \n",
"Stn A PO Boxes 95 95 \n",
"Studio District 41 41 \n",
"Summerhill West / Rathnelly / South Hill / Fore... 16 16 \n",
"The Annex / North Midtown / Yorkville 22 22 \n",
"The Beaches 5 5 \n",
"The Danforth West / Riverdale 43 43 \n",
"Toronto Dominion Centre / Design Exchange 100 100 \n",
"University of Toronto / Harbord 35 35 \n",
"\n",
" Venue Longitude \\\n",
"Neighborhood \n",
"Berczy Park 55 \n",
"Brockton / Parkdale Village / Exhibition Place 24 \n",
"Business reply mail Processing CentrE 19 \n",
"CN Tower / King and Spadina / Railway Lands / H... 17 \n",
"Central Bay Street 65 \n",
"Christie 17 \n",
"Church and Wellesley 74 \n",
"Commerce Court / Victoria Hotel 100 \n",
"Davisville 34 \n",
"Davisville North 11 \n",
"Dufferin / Dovercourt Village 15 \n",
"First Canadian Place / Underground city 100 \n",
"Forest Hill North & West 4 \n",
"Garden District, Ryerson 100 \n",
"Harbourfront East / Union Station / Toronto Isl... 100 \n",
"High Park / The Junction South 25 \n",
"India Bazaar / The Beaches West 20 \n",
"Kensington Market / Chinatown / Grange Park 62 \n",
"Lawrence Park 3 \n",
"Little Portugal / Trinity 43 \n",
"Moore Park / Summerhill East 3 \n",
"North Toronto West 21 \n",
"Parkdale / Roncesvalles 14 \n",
"Queen's Park / Ontario Provincial Government 31 \n",
"Regent Park / Harbourfront 47 \n",
"Richmond / Adelaide / King 97 \n",
"Rosedale 4 \n",
"Roselawn 2 \n",
"Runnymede / Swansea 41 \n",
"St. James Town 86 \n",
"St. James Town / Cabbagetown 44 \n",
"Stn A PO Boxes 95 \n",
"Studio District 41 \n",
"Summerhill West / Rathnelly / South Hill / Fore... 16 \n",
"The Annex / North Midtown / Yorkville 22 \n",
"The Beaches 5 \n",
"The Danforth West / Riverdale 43 \n",
"Toronto Dominion Centre / Design Exchange 100 \n",
"University of Toronto / Harbord 35 \n",
"\n",
" Venue Category \n",
"Neighborhood \n",
"Berczy Park 55 \n",
"Brockton / Parkdale Village / Exhibition Place 24 \n",
"Business reply mail Processing CentrE 19 \n",
"CN Tower / King and Spadina / Railway Lands / H... 17 \n",
"Central Bay Street 65 \n",
"Christie 17 \n",
"Church and Wellesley 74 \n",
"Commerce Court / Victoria Hotel 100 \n",
"Davisville 34 \n",
"Davisville North 11 \n",
"Dufferin / Dovercourt Village 15 \n",
"First Canadian Place / Underground city 100 \n",
"Forest Hill North & West 4 \n",
"Garden District, Ryerson 100 \n",
"Harbourfront East / Union Station / Toronto Isl... 100 \n",
"High Park / The Junction South 25 \n",
"India Bazaar / The Beaches West 20 \n",
"Kensington Market / Chinatown / Grange Park 62 \n",
"Lawrence Park 3 \n",
"Little Portugal / Trinity 43 \n",
"Moore Park / Summerhill East 3 \n",
"North Toronto West 21 \n",
"Parkdale / Roncesvalles 14 \n",
"Queen's Park / Ontario Provincial Government 31 \n",
"Regent Park / Harbourfront 47 \n",
"Richmond / Adelaide / King 97 \n",
"Rosedale 4 \n",
"Roselawn 2 \n",
"Runnymede / Swansea 41 \n",
"St. James Town 86 \n",
"St. James Town / Cabbagetown 44 \n",
"Stn A PO Boxes 95 \n",
"Studio District 41 \n",
"Summerhill West / Rathnelly / South Hill / Fore... 16 \n",
"The Annex / North Midtown / Yorkville 22 \n",
"The Beaches 5 \n",
"The Danforth West / Riverdale 43 \n",
"Toronto Dominion Centre / Design Exchange 100 \n",
"University of Toronto / Harbord 35 "
]
},
"execution_count": 44,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"Toronto_venues.groupby('Neighborhood').count()"
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"There are 226 unique categories.\n"
]
}
],
"source": [
"print('There are {} unique categories.'.format(len(Toronto_venues['Venue Category'].unique())))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Analyze Each Neighborhood"
]
},
{
"cell_type": "code",
"execution_count": 48,
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Yoga Studio</th>\n",
" <th>Airport</th>\n",
" <th>Airport Food Court</th>\n",
" <th>Airport Gate</th>\n",
" <th>Airport Lounge</th>\n",
" <th>Airport Service</th>\n",
" <th>Airport Terminal</th>\n",
" <th>American Restaurant</th>\n",
" <th>Antique Shop</th>\n",
" <th>Aquarium</th>\n",
" <th>...</th>\n",
" <th>Theater</th>\n",
" <th>Theme Restaurant</th>\n",
" <th>Toy / Game Store</th>\n",
" <th>Trail</th>\n",
" <th>Train Station</th>\n",
" <th>Vegetarian / Vegan Restaurant</th>\n",
" <th>Video Game Store</th>\n",
" <th>Vietnamese Restaurant</th>\n",
" <th>Wine Bar</th>\n",
" <th>Women's Store</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>5 rows × 226 columns</p>\n",
"</div>"
],
"text/plain": [
" Yoga Studio Airport Airport Food Court Airport Gate Airport Lounge \\\n",
"0 0 0 0 0 0 \n",
"1 0 0 0 0 0 \n",
"2 0 0 0 0 0 \n",
"3 0 0 0 0 0 \n",
"4 0 0 0 0 0 \n",
"\n",
" Airport Service Airport Terminal American Restaurant Antique Shop \\\n",
"0 0 0 0 0 \n",
"1 0 0 0 0 \n",
"2 0 0 0 0 \n",
"3 0 0 0 0 \n",
"4 0 0 0 0 \n",
"\n",
" Aquarium ... Theater Theme Restaurant Toy / Game Store Trail \\\n",
"0 0 ... 0 0 0 0 \n",
"1 0 ... 0 0 0 0 \n",
"2 0 ... 0 0 0 0 \n",
"3 0 ... 0 0 0 0 \n",
"4 0 ... 0 0 0 0 \n",
"\n",
" Train Station Vegetarian / Vegan Restaurant Video Game Store \\\n",
"0 0 0 0 \n",
"1 0 0 0 \n",
"2 0 0 0 \n",
"3 0 0 0 \n",
"4 0 0 0 \n",
"\n",
" Vietnamese Restaurant Wine Bar Women's Store \n",
"0 0 0 0 \n",
"1 0 0 0 \n",
"2 0 0 0 \n",
"3 0 0 0 \n",
"4 0 0 0 \n",
"\n",
"[5 rows x 226 columns]"
]
},
"execution_count": 48,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# one hot encoding\n",
"Toronto_onehot = pd.get_dummies(Toronto_venues[['Venue Category']], prefix=\"\", prefix_sep=\"\")\n",
"\n",
"# add neighborhood column back to dataframe\n",
"Toronto_onehot['Neighborhood'] = Toronto_venues['Neighborhood'] \n",
"\n",
"# move neighborhood column to the first column\n",
"fixed_columns = [Toronto_onehot.columns[-1]] + list(Toronto_onehot.columns[:-1])\n",
"Toronto_onehot = Toronto_onehot[fixed_columns]\n",
"\n",
"Toronto_onehot.head()"
]
},
{
"cell_type": "code",
"execution_count": 50,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(1635, 226)"
]
},
"execution_count": 50,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
    "# examine the shape of the new dataframe\n",
"Toronto_onehot.shape"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
    "#### Next, let's group rows by neighborhood and take the mean of the frequency of occurrence of each category"
]
},
{
"cell_type": "code",
"execution_count": 51,
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Neighborhood</th>\n",
" <th>Yoga Studio</th>\n",
" <th>Airport</th>\n",
" <th>Airport Food Court</th>\n",
" <th>Airport Gate</th>\n",
" <th>Airport Lounge</th>\n",
" <th>Airport Service</th>\n",
" <th>Airport Terminal</th>\n",
" <th>American Restaurant</th>\n",
" <th>Antique Shop</th>\n",
" <th>...</th>\n",
" <th>Theater</th>\n",
" <th>Theme Restaurant</th>\n",
" <th>Toy / Game Store</th>\n",
" <th>Trail</th>\n",
" <th>Train Station</th>\n",
" <th>Vegetarian / Vegan Restaurant</th>\n",
" <th>Video Game Store</th>\n",
" <th>Vietnamese Restaurant</th>\n",
" <th>Wine Bar</th>\n",
" <th>Women's Store</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Berczy Park</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>...</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.018182</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Brockton / Parkdale Village / Exhibition Place</td>\n",
" <td>0.041667</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>...</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Business reply mail Processing CentrE</td>\n",
" <td>0.052632</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>...</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>CN Tower / King and Spadina / Railway Lands / ...</td>\n",
" <td>0.000000</td>\n",
" <td>0.058824</td>\n",
" <td>0.058824</td>\n",
" <td>0.058824</td>\n",
" <td>0.117647</td>\n",
" <td>0.176471</td>\n",
" <td>0.117647</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>...</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Central Bay Street</td>\n",
" <td>0.015385</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>...</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.015385</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.015385</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>5 rows × 226 columns</p>\n",
"</div>"
],
"text/plain": [
" Neighborhood Yoga Studio Airport \\\n",
"0 Berczy Park 0.000000 0.000000 \n",
"1 Brockton / Parkdale Village / Exhibition Place 0.041667 0.000000 \n",
"2 Business reply mail Processing CentrE 0.052632 0.000000 \n",
"3 CN Tower / King and Spadina / Railway Lands / ... 0.000000 0.058824 \n",
"4 Central Bay Street 0.015385 0.000000 \n",
"\n",
" Airport Food Court Airport Gate Airport Lounge Airport Service \\\n",
"0 0.000000 0.000000 0.000000 0.000000 \n",
"1 0.000000 0.000000 0.000000 0.000000 \n",
"2 0.000000 0.000000 0.000000 0.000000 \n",
"3 0.058824 0.058824 0.117647 0.176471 \n",
"4 0.000000 0.000000 0.000000 0.000000 \n",
"\n",
" Airport Terminal American Restaurant Antique Shop ... Theater \\\n",
"0 0.000000 0.0 0.0 ... 0.0 \n",
"1 0.000000 0.0 0.0 ... 0.0 \n",
"2 0.000000 0.0 0.0 ... 0.0 \n",
"3 0.117647 0.0 0.0 ... 0.0 \n",
"4 0.000000 0.0 0.0 ... 0.0 \n",
"\n",
" Theme Restaurant Toy / Game Store Trail Train Station \\\n",
"0 0.0 0.0 0.0 0.0 \n",
"1 0.0 0.0 0.0 0.0 \n",
"2 0.0 0.0 0.0 0.0 \n",
"3 0.0 0.0 0.0 0.0 \n",
"4 0.0 0.0 0.0 0.0 \n",
"\n",
" Vegetarian / Vegan Restaurant Video Game Store Vietnamese Restaurant \\\n",
"0 0.018182 0.0 0.0 \n",
"1 0.000000 0.0 0.0 \n",
"2 0.000000 0.0 0.0 \n",
"3 0.000000 0.0 0.0 \n",
"4 0.015385 0.0 0.0 \n",
"\n",
" Wine Bar Women's Store \n",
"0 0.000000 0.0 \n",
"1 0.000000 0.0 \n",
"2 0.000000 0.0 \n",
"3 0.000000 0.0 \n",
"4 0.015385 0.0 \n",
"\n",
"[5 rows x 226 columns]"
]
},
"execution_count": 51,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"Toronto_grouped = Toronto_onehot.groupby('Neighborhood').mean().reset_index()\n",
"Toronto_grouped.head()"
]
},
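  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The grouped mean works because every one-hot row holds a single 1, so averaging the rows of a neighborhood gives the share of its venues that fall in each category. A minimal sketch on made-up data (the frame `toy_venues` and its values are hypothetical and not part of the Toronto dataset):\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Hypothetical toy data: three venues in neighborhood A, one in B\n",
    "toy_venues = pd.DataFrame({\n",
    "    'Neighborhood': ['A', 'A', 'A', 'B'],\n",
    "    'Venue Category': ['Cafe', 'Cafe', 'Park', 'Park'],\n",
    "})\n",
    "\n",
    "# One-hot encode the category, then put the neighborhood back as the first column\n",
    "toy_onehot = pd.get_dummies(toy_venues['Venue Category'], dtype=float)\n",
    "toy_onehot.insert(0, 'Neighborhood', toy_venues['Neighborhood'])\n",
    "\n",
    "# The mean of 0/1 indicators per group is the relative frequency of each category\n",
    "toy_grouped = toy_onehot.groupby('Neighborhood').mean().reset_index()\n",
    "print(toy_grouped)\n",
    "```\n",
    "\n",
    "In this toy frame, two of the three venues in A are cafes, so A's `Cafe` value is 2/3. Each row of the grouped frame sums to 1 across the category columns."
   ]
  },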
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Let's confirm the new size"
]
},
{
"cell_type": "code",
"execution_count": 52,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(39, 226)"
]
},
"execution_count": 52,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"Toronto_grouped.shape"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Let's print each neighborhood along with the top 5 most common venues"
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"----Berczy Park----\n",
" venue freq\n",
"0 Coffee Shop 0.05\n",
"1 Farmers Market 0.04\n",
"2 Italian Restaurant 0.04\n",
"3 Seafood Restaurant 0.04\n",
"4 Cheese Shop 0.04\n",
"\n",
"\n",
"----Brockton / Parkdale Village / Exhibition Place----\n",
" venue freq\n",
"0 Café 0.12\n",
"1 Coffee Shop 0.08\n",
"2 Nightclub 0.08\n",
"3 Breakfast Spot 0.08\n",
"4 Yoga Studio 0.04\n",
"\n",
"\n",
"----Business reply mail Processing CentrE----\n",
" venue freq\n",
"0 Light Rail Station 0.11\n",
"1 Yoga Studio 0.05\n",
"2 Spa 0.05\n",
"3 Garden Center 0.05\n",
"4 Garden 0.05\n",
"\n",
"\n",
"----CN Tower / King and Spadina / Railway Lands / Harbourfront West / Bathurst Quay / South Niagara / Island airport----\n",
" venue freq\n",
"0 Airport Service 0.18\n",
"1 Airport Lounge 0.12\n",
"2 Airport Terminal 0.12\n",
"3 Harbor / Marina 0.06\n",
"4 Boutique 0.06\n",
"\n",
"\n",
"----Central Bay Street----\n",
" venue freq\n",
"0 Coffee Shop 0.18\n",
"1 Italian Restaurant 0.06\n",
"2 Café 0.05\n",
"3 Sandwich Place 0.05\n",
"4 Bubble Tea Shop 0.03\n",
"\n",
"\n",
"----Christie----\n",
" venue freq\n",
"0 Grocery Store 0.24\n",
"1 Café 0.18\n",
"2 Park 0.12\n",
"3 Diner 0.06\n",
"4 Italian Restaurant 0.06\n",
"\n",
"\n",
"----Church and Wellesley----\n",
" venue freq\n",
"0 Coffee Shop 0.07\n",
"1 Japanese Restaurant 0.05\n",
"2 Gay Bar 0.05\n",
"3 Sushi Restaurant 0.04\n",
"4 Restaurant 0.04\n",
"\n",
"\n",
"----Commerce Court / Victoria Hotel----\n",
" venue freq\n",
"0 Coffee Shop 0.10\n",
"1 Café 0.07\n",
"2 Restaurant 0.07\n",
"3 Hotel 0.06\n",
"4 American Restaurant 0.04\n",
"\n",
"\n",
"----Davisville----\n",
" venue freq\n",
"0 Sandwich Place 0.09\n",
"1 Dessert Shop 0.09\n",
"2 Coffee Shop 0.06\n",
"3 Gym 0.06\n",
"4 Café 0.06\n",
"\n",
"\n",
"----Davisville North----\n",
" venue freq\n",
"0 Gym 0.09\n",
"1 Hotel 0.09\n",
"2 Convenience Store 0.09\n",
"3 Pizza Place 0.09\n",
"4 Dance Studio 0.09\n",
"\n",
"\n",
"----Dufferin / Dovercourt Village----\n",
" venue freq\n",
"0 Pharmacy 0.13\n",
"1 Bakery 0.13\n",
"2 Pizza Place 0.07\n",
"3 Park 0.07\n",
"4 Gym / Fitness Center 0.07\n",
"\n",
"\n",
"----First Canadian Place / Underground city----\n",
" venue freq\n",
"0 Coffee Shop 0.10\n",
"1 Café 0.07\n",
"2 Restaurant 0.06\n",
"3 Hotel 0.04\n",
"4 Asian Restaurant 0.03\n",
"\n",
"\n",
"----Forest Hill North & West----\n",
" venue freq\n",
"0 Park 0.25\n",
"1 Jewelry Store 0.25\n",
"2 Trail 0.25\n",
"3 Sushi Restaurant 0.25\n",
"4 Yoga Studio 0.00\n",
"\n",
"\n",
"----Garden District, Ryerson----\n",
" venue freq\n",
"0 Coffee Shop 0.09\n",
"1 Clothing Store 0.08\n",
"2 Café 0.04\n",
"3 Cosmetics Shop 0.03\n",
"4 Bubble Tea Shop 0.03\n",
"\n",
"\n",
"----Harbourfront East / Union Station / Toronto Islands----\n",
" venue freq\n",
"0 Coffee Shop 0.12\n",
"1 Aquarium 0.05\n",
"2 Hotel 0.04\n",
"3 Café 0.04\n",
"4 Restaurant 0.04\n",
"\n",
"\n",
"----High Park / The Junction South----\n",
" venue freq\n",
"0 Bar 0.08\n",
"1 Mexican Restaurant 0.08\n",
"2 Café 0.08\n",
"3 Thai Restaurant 0.08\n",
"4 Bakery 0.04\n",
"\n",
"\n",
"----India Bazaar / The Beaches West----\n",
" venue freq\n",
"0 Fast Food Restaurant 0.10\n",
"1 Gym 0.05\n",
"2 Italian Restaurant 0.05\n",
"3 Pet Store 0.05\n",
"4 Pizza Place 0.05\n",
"\n",
"\n",
"----Kensington Market / Chinatown / Grange Park----\n",
" venue freq\n",
"0 Café 0.08\n",
"1 Coffee Shop 0.06\n",
"2 Mexican Restaurant 0.05\n",
"3 Vietnamese Restaurant 0.05\n",
"4 Vegetarian / Vegan Restaurant 0.05\n",
"\n",
"\n",
"----Lawrence Park----\n",
" venue freq\n",
"0 Park 0.33\n",
"1 Bus Line 0.33\n",
"2 Swim School 0.33\n",
"3 Yoga Studio 0.00\n",
"4 Monument / Landmark 0.00\n",
"\n",
"\n",
"----Little Portugal / Trinity----\n",
" venue freq\n",
"0 Bar 0.12\n",
"1 Restaurant 0.07\n",
"2 Café 0.05\n",
"3 Vegetarian / Vegan Restaurant 0.05\n",
"4 Coffee Shop 0.05\n",
"\n",
"\n",
"----Moore Park / Summerhill East----\n",
" venue freq\n",
"0 Park 0.33\n",
"1 Playground 0.33\n",
"2 Summer Camp 0.33\n",
"3 Yoga Studio 0.00\n",
"4 Monument / Landmark 0.00\n",
"\n",
"\n",
"----North Toronto West----\n",
" venue freq\n",
"0 Clothing Store 0.14\n",
"1 Coffee Shop 0.10\n",
"2 Yoga Studio 0.05\n",
"3 Seafood Restaurant 0.05\n",
"4 Salon / Barbershop 0.05\n",
"\n",
"\n",
"----Parkdale / Roncesvalles----\n",
" venue freq\n",
"0 Gift Shop 0.14\n",
"1 Bookstore 0.07\n",
"2 Dessert Shop 0.07\n",
"3 Eastern European Restaurant 0.07\n",
"4 Movie Theater 0.07\n",
"\n",
"\n",
"----Queen's Park / Ontario Provincial Government----\n",
" venue freq\n",
"0 Coffee Shop 0.26\n",
"1 Diner 0.06\n",
"2 Yoga Studio 0.03\n",
"3 Burrito Place 0.03\n",
"4 Juice Bar 0.03\n",
"\n",
"\n",
"----Regent Park / Harbourfront----\n",
" venue freq\n",
"0 Coffee Shop 0.17\n",
"1 Park 0.06\n",
"2 Bakery 0.06\n",
"3 Pub 0.06\n",
"4 Theater 0.04\n",
"\n",
"\n",
"----Richmond / Adelaide / King----\n",
" venue freq\n",
"0 Coffee Shop 0.08\n",
"1 Café 0.05\n",
"2 Gym 0.04\n",
"3 Restaurant 0.04\n",
"4 Deli / Bodega 0.03\n",
"\n",
"\n",
"----Rosedale----\n",
" venue freq\n",
"0 Park 0.50\n",
"1 Playground 0.25\n",
"2 Trail 0.25\n",
"3 Yoga Studio 0.00\n",
"4 Moroccan Restaurant 0.00\n",
"\n",
"\n",
"----Roselawn----\n",
" venue freq\n",
"0 Garden 0.5\n",
"1 Music Venue 0.5\n",
"2 Yoga Studio 0.0\n",
"3 Moroccan Restaurant 0.0\n",
"4 Liquor Store 0.0\n",
"\n",
"\n",
"----Runnymede / Swansea----\n",
" venue freq\n",
"0 Café 0.07\n",
"1 Pizza Place 0.07\n",
"2 Coffee Shop 0.07\n",
"3 Pub 0.05\n",
"4 Italian Restaurant 0.05\n",
"\n",
"\n",
"----St. James Town----\n",
" venue freq\n",
"0 Café 0.06\n",
"1 Coffee Shop 0.06\n",
"2 Cocktail Bar 0.05\n",
"3 American Restaurant 0.03\n",
"4 Restaurant 0.03\n",
"\n",
"\n",
"----St. James Town / Cabbagetown----\n",
" venue freq\n",
"0 Coffee Shop 0.07\n",
"1 Pub 0.05\n",
"2 Bakery 0.05\n",
"3 Pizza Place 0.05\n",
"4 Restaurant 0.05\n",
"\n",
"\n",
"----Stn A PO Boxes----\n",
" venue freq\n",
"0 Coffee Shop 0.09\n",
"1 Italian Restaurant 0.04\n",
"2 Café 0.04\n",
"3 Restaurant 0.04\n",
"4 Hotel 0.03\n",
"\n",
"\n",
"----Studio District----\n",
" venue freq\n",
"0 Café 0.10\n",
"1 Coffee Shop 0.07\n",
"2 Brewery 0.05\n",
"3 Bakery 0.05\n",
"4 Gastropub 0.05\n",
"\n",
"\n",
"----Summerhill West / Rathnelly / South Hill / Forest Hill SE / Deer Park----\n",
" venue freq\n",
"0 Pub 0.12\n",
"1 Coffee Shop 0.12\n",
"2 Sports Bar 0.06\n",
"3 Fried Chicken Joint 0.06\n",
"4 Restaurant 0.06\n",
"\n",
"\n",
"----The Annex / North Midtown / Yorkville----\n",
" venue freq\n",
"0 Sandwich Place 0.14\n",
"1 Café 0.14\n",
"2 Coffee Shop 0.09\n",
"3 Cosmetics Shop 0.05\n",
"4 Indian Restaurant 0.05\n",
"\n",
"\n",
"----The Beaches----\n",
" venue freq\n",
"0 Asian Restaurant 0.2\n",
"1 Health Food Store 0.2\n",
"2 Trail 0.2\n",
"3 Pub 0.2\n",
"4 Yoga Studio 0.0\n",
"\n",
"\n",
"----The Danforth West / Riverdale----\n",
" venue freq\n",
"0 Greek Restaurant 0.19\n",
"1 Italian Restaurant 0.07\n",
"2 Coffee Shop 0.07\n",
"3 Bookstore 0.05\n",
"4 Restaurant 0.05\n",
"\n",
"\n",
"----Toronto Dominion Centre / Design Exchange----\n",
" venue freq\n",
"0 Coffee Shop 0.12\n",
"1 Hotel 0.08\n",
"2 Café 0.07\n",
"3 Restaurant 0.05\n",
"4 Italian Restaurant 0.03\n",
"\n",
"\n",
"----University of Toronto / Harbord----\n",
" venue freq\n",
"0 Café 0.14\n",
"1 Restaurant 0.06\n",
"2 Italian Restaurant 0.06\n",
"3 Japanese Restaurant 0.06\n",
"4 Bar 0.06\n",
"\n",
"\n"
]
}
],
"source": [
"num_top_venues = 5\n",
"\n",
"for hood in Toronto_grouped['Neighborhood']:\n",
" print(\"----\"+hood+\"----\")\n",
" temp = Toronto_grouped[Toronto_grouped['Neighborhood'] == hood].T.reset_index()\n",
" temp.columns = ['venue','freq']\n",
" temp = temp.iloc[1:]\n",
" temp['freq'] = temp['freq'].astype(float)\n",
" temp = temp.round({'freq': 2})\n",
" print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))\n",
" print('\\n')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Let's put that into a *pandas* dataframe"
]
},
{
"cell_type": "code",
"execution_count": 54,
"metadata": {},
"outputs": [],
"source": [
    "# return the names of the most common venue categories in a row, sorted in descending order of frequency\n",
"def return_most_common_venues(row, num_top_venues):\n",
" row_categories = row.iloc[1:]\n",
" row_categories_sorted = row_categories.sort_values(ascending=False)\n",
" \n",
" return row_categories_sorted.index.values[0:num_top_venues]"
]
},
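  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To see what the helper returns, it can be useful to try the same idea on a single hand-built row. The sketch below uses a hypothetical variant, `top_categories`, built on `Series.nlargest`; it is an illustration of the technique and not the function used in the rest of this notebook. The row `toy_row` is made up.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "def top_categories(row, num_top_venues):\n",
    "    # Hypothetical variant of return_most_common_venues: skip the leading\n",
    "    # 'Neighborhood' entry, coerce the frequencies to float, keep the largest\n",
    "    freqs = row.iloc[1:].astype(float)\n",
    "    return freqs.nlargest(num_top_venues).index.values\n",
    "\n",
    "# Hypothetical row shaped like one row of the grouped frame\n",
    "toy_row = pd.Series({'Neighborhood': 'A', 'Cafe': 0.5, 'Park': 0.3, 'Pub': 0.2})\n",
    "print(top_categories(toy_row, 2))\n",
    "```\n",
    "\n",
    "Many neighborhoods have several categories with identical frequencies, so the order among tied categories depends on how the sort breaks ties. By default `nlargest` keeps tied values in their order of appearance."
   ]
  },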
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now let's create the new dataframe and display the top 10 venues for each neighborhood."
]
},
{
"cell_type": "code",
"execution_count": 55,
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Neighborhood</th>\n",
" <th>1st Most Common Venue</th>\n",
" <th>2nd Most Common Venue</th>\n",
" <th>3rd Most Common Venue</th>\n",
" <th>4th Most Common Venue</th>\n",
" <th>5th Most Common Venue</th>\n",
" <th>6th Most Common Venue</th>\n",
" <th>7th Most Common Venue</th>\n",
" <th>8th Most Common Venue</th>\n",
" <th>9th Most Common Venue</th>\n",
" <th>10th Most Common Venue</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Berczy Park</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Cocktail Bar</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Restaurant</td>\n",
" <td>Beer Bar</td>\n",
" <td>Café</td>\n",
" <td>Cheese Shop</td>\n",
" <td>Bakery</td>\n",
" <td>Seafood Restaurant</td>\n",
" <td>Farmers Market</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Brockton / Parkdale Village / Exhibition Place</td>\n",
" <td>Café</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Breakfast Spot</td>\n",
" <td>Nightclub</td>\n",
" <td>Pet Store</td>\n",
" <td>Stadium</td>\n",
" <td>Burrito Place</td>\n",
" <td>Restaurant</td>\n",
" <td>Climbing Gym</td>\n",
" <td>Performing Arts Venue</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Business reply mail Processing CentrE</td>\n",
" <td>Light Rail Station</td>\n",
" <td>Yoga Studio</td>\n",
" <td>Spa</td>\n",
" <td>Garden Center</td>\n",
" <td>Garden</td>\n",
" <td>Gym / Fitness Center</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Farmers Market</td>\n",
" <td>Comic Shop</td>\n",
" <td>Pizza Place</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>CN Tower / King and Spadina / Railway Lands / ...</td>\n",
" <td>Airport Service</td>\n",
" <td>Airport Lounge</td>\n",
" <td>Airport Terminal</td>\n",
" <td>Airport</td>\n",
" <td>Airport Food Court</td>\n",
" <td>Airport Gate</td>\n",
" <td>Bar</td>\n",
" <td>Boutique</td>\n",
" <td>Rental Car Location</td>\n",
" <td>Boat or Ferry</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Central Bay Street</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Sandwich Place</td>\n",
" <td>Café</td>\n",
" <td>Salad Place</td>\n",
" <td>Ice Cream Shop</td>\n",
" <td>Japanese Restaurant</td>\n",
" <td>Middle Eastern Restaurant</td>\n",
" <td>Sushi Restaurant</td>\n",
" <td>Spa</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Neighborhood 1st Most Common Venue \\\n",
"0 Berczy Park Coffee Shop \n",
"1 Brockton / Parkdale Village / Exhibition Place Café \n",
"2 Business reply mail Processing CentrE Light Rail Station \n",
"3 CN Tower / King and Spadina / Railway Lands / ... Airport Service \n",
"4 Central Bay Street Coffee Shop \n",
"\n",
" 2nd Most Common Venue 3rd Most Common Venue 4th Most Common Venue \\\n",
"0 Cocktail Bar Italian Restaurant Restaurant \n",
"1 Coffee Shop Breakfast Spot Nightclub \n",
"2 Yoga Studio Spa Garden Center \n",
"3 Airport Lounge Airport Terminal Airport \n",
"4 Italian Restaurant Sandwich Place Café \n",
"\n",
" 5th Most Common Venue 6th Most Common Venue 7th Most Common Venue \\\n",
"0 Beer Bar Café Cheese Shop \n",
"1 Pet Store Stadium Burrito Place \n",
"2 Garden Gym / Fitness Center Fast Food Restaurant \n",
"3 Airport Food Court Airport Gate Bar \n",
"4 Salad Place Ice Cream Shop Japanese Restaurant \n",
"\n",
" 8th Most Common Venue 9th Most Common Venue 10th Most Common Venue \n",
"0 Bakery Seafood Restaurant Farmers Market \n",
"1 Restaurant Climbing Gym Performing Arts Venue \n",
"2 Farmers Market Comic Shop Pizza Place \n",
"3 Boutique Rental Car Location Boat or Ferry \n",
"4 Middle Eastern Restaurant Sushi Restaurant Spa "
]
},
"execution_count": 55,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"num_top_venues = 10\n",
"\n",
"indicators = ['st', 'nd', 'rd']\n",
"\n",
"# create columns according to number of top venues\n",
"columns = ['Neighborhood']\n",
"for ind in np.arange(num_top_venues):\n",
" try:\n",
" columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))\n",
    "    except IndexError:\n",
" columns.append('{}th Most Common Venue'.format(ind+1))\n",
"\n",
"# create a new dataframe\n",
"neighborhoods_venues_sorted = pd.DataFrame(columns=columns)\n",
"neighborhoods_venues_sorted['Neighborhood'] = Toronto_grouped['Neighborhood']\n",
"\n",
"for ind in np.arange(Toronto_grouped.shape[0]):\n",
" neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(Toronto_grouped.iloc[ind, :], num_top_venues)\n",
"\n",
"neighborhoods_venues_sorted.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Cluster Neighborhoods"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
    "Run *k*-means to cluster the neighborhoods into 5 clusters."
]
},
{
"cell_type": "code",
"execution_count": 56,
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=int32)"
]
},
"execution_count": 56,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# set number of clusters\n",
"kclusters = 5\n",
"\n",
    "Toronto_grouped_clustering = Toronto_grouped.drop('Neighborhood', axis=1)\n",
"\n",
"# run k-means clustering\n",
"kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(Toronto_grouped_clustering)\n",
"\n",
"# check cluster labels generated for each row in the dataframe\n",
"kmeans.labels_[0:10] "
]
},
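  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The first ten labels above are all 0, which suggests that one cluster may hold most neighborhoods. Counting the rows per cluster is a quick way to check. The sketch below shows the idea on made-up points; `toy_features` and `toy_kmeans` are hypothetical names, and the same `np.bincount` call could be applied to `kmeans.labels_`.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "from sklearn.cluster import KMeans\n",
    "\n",
    "# Hypothetical toy features: two well-separated groups of 20 and 5 points\n",
    "rng = np.random.RandomState(0)\n",
    "toy_features = np.vstack([\n",
    "    rng.normal(0.0, 0.1, size=(20, 2)),\n",
    "    rng.normal(5.0, 0.1, size=(5, 2)),\n",
    "])\n",
    "\n",
    "toy_kmeans = KMeans(n_clusters=2, random_state=0, n_init=10).fit(toy_features)\n",
    "\n",
    "# Count how many rows landed in each cluster\n",
    "cluster_sizes = np.bincount(toy_kmeans.labels_, minlength=2)\n",
    "print(cluster_sizes)\n",
    "```\n",
    "\n",
    "A very uneven split is not necessarily wrong, but it is worth knowing about before interpreting the map."
   ]
  },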
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood."
]
},
{
"cell_type": "code",
"execution_count": 57,
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Postal code</th>\n",
" <th>Borough</th>\n",
" <th>Neighborhood</th>\n",
" <th>Latitude</th>\n",
" <th>Longitude</th>\n",
" <th>Cluster Labels</th>\n",
" <th>1st Most Common Venue</th>\n",
" <th>2nd Most Common Venue</th>\n",
" <th>3rd Most Common Venue</th>\n",
" <th>4th Most Common Venue</th>\n",
" <th>5th Most Common Venue</th>\n",
" <th>6th Most Common Venue</th>\n",
" <th>7th Most Common Venue</th>\n",
" <th>8th Most Common Venue</th>\n",
" <th>9th Most Common Venue</th>\n",
" <th>10th Most Common Venue</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>M5A</td>\n",
" <td>Downtown Toronto</td>\n",
" <td>Regent Park / Harbourfront</td>\n",
" <td>43.654260</td>\n",
" <td>-79.360636</td>\n",
" <td>0</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Bakery</td>\n",
" <td>Park</td>\n",
" <td>Pub</td>\n",
" <td>Theater</td>\n",
" <td>Mexican Restaurant</td>\n",
" <td>Breakfast Spot</td>\n",
" <td>Restaurant</td>\n",
" <td>Café</td>\n",
" <td>Shoe Store</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>M7A</td>\n",
" <td>Downtown Toronto</td>\n",
" <td>Queen's Park / Ontario Provincial Government</td>\n",
" <td>43.662301</td>\n",
" <td>-79.389494</td>\n",
" <td>0</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Diner</td>\n",
" <td>Yoga Studio</td>\n",
" <td>Creperie</td>\n",
" <td>Beer Bar</td>\n",
" <td>Sandwich Place</td>\n",
" <td>Burger Joint</td>\n",
" <td>Burrito Place</td>\n",
" <td>Café</td>\n",
" <td>College Auditorium</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>M5B</td>\n",
" <td>Downtown Toronto</td>\n",
" <td>Garden District, Ryerson</td>\n",
" <td>43.657162</td>\n",
" <td>-79.378937</td>\n",
" <td>0</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Clothing Store</td>\n",
" <td>Café</td>\n",
" <td>Japanese Restaurant</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Middle Eastern Restaurant</td>\n",
" <td>Cosmetics Shop</td>\n",
" <td>Bubble Tea Shop</td>\n",
" <td>Electronics Store</td>\n",
" <td>Lingerie Store</td>\n",
" </tr>\n",
" <tr>\n",
" <th>15</th>\n",
" <td>M5C</td>\n",
" <td>Downtown Toronto</td>\n",
" <td>St. James Town</td>\n",
" <td>43.651494</td>\n",
" <td>-79.375418</td>\n",
" <td>0</td>\n",
" <td>Café</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Cocktail Bar</td>\n",
" <td>Beer Bar</td>\n",
" <td>Restaurant</td>\n",
" <td>American Restaurant</td>\n",
" <td>Hotel</td>\n",
" <td>Diner</td>\n",
" <td>Art Gallery</td>\n",
" <td>Cosmetics Shop</td>\n",
" </tr>\n",
" <tr>\n",
" <th>19</th>\n",
" <td>M4E</td>\n",
" <td>East Toronto</td>\n",
" <td>The Beaches</td>\n",
" <td>43.676357</td>\n",
" <td>-79.293031</td>\n",
" <td>0</td>\n",
" <td>Asian Restaurant</td>\n",
" <td>Trail</td>\n",
" <td>Pub</td>\n",
" <td>Health Food Store</td>\n",
" <td>Women's Store</td>\n",
" <td>Distribution Center</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Diner</td>\n",
" <td>Discount Store</td>\n",
" <td>Dog Run</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Postal code Borough \\\n",
"2 M5A Downtown Toronto \n",
"4 M7A Downtown Toronto \n",
"9 M5B Downtown Toronto \n",
"15 M5C Downtown Toronto \n",
"19 M4E East Toronto \n",
"\n",
" Neighborhood Latitude Longitude \\\n",
"2 Regent Park / Harbourfront 43.654260 -79.360636 \n",
"4 Queen's Park / Ontario Provincial Government 43.662301 -79.389494 \n",
"9 Garden District, Ryerson 43.657162 -79.378937 \n",
"15 St. James Town 43.651494 -79.375418 \n",
"19 The Beaches 43.676357 -79.293031 \n",
"\n",
" Cluster Labels 1st Most Common Venue 2nd Most Common Venue \\\n",
"2 0 Coffee Shop Bakery \n",
"4 0 Coffee Shop Diner \n",
"9 0 Coffee Shop Clothing Store \n",
"15 0 Café Coffee Shop \n",
"19 0 Asian Restaurant Trail \n",
"\n",
" 3rd Most Common Venue 4th Most Common Venue 5th Most Common Venue \\\n",
"2 Park Pub Theater \n",
"4 Yoga Studio Creperie Beer Bar \n",
"9 Café Japanese Restaurant Italian Restaurant \n",
"15 Cocktail Bar Beer Bar Restaurant \n",
"19 Pub Health Food Store Women's Store \n",
"\n",
" 6th Most Common Venue 7th Most Common Venue 8th Most Common Venue \\\n",
"2 Mexican Restaurant Breakfast Spot Restaurant \n",
"4 Sandwich Place Burger Joint Burrito Place \n",
"9 Middle Eastern Restaurant Cosmetics Shop Bubble Tea Shop \n",
"15 American Restaurant Hotel Diner \n",
"19 Distribution Center Dessert Shop Diner \n",
"\n",
" 9th Most Common Venue 10th Most Common Venue \n",
"2 Café Shoe Store \n",
"4 Café College Auditorium \n",
"9 Electronics Store Lingerie Store \n",
"15 Art Gallery Cosmetics Shop \n",
"19 Discount Store Dog Run "
]
},
"execution_count": 57,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# add clustering labels\n",
"neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)\n",
"\n",
"Toronto_merged = Toronto_df\n",
"\n",
    "# join Toronto_df (which holds latitude/longitude) with neighborhoods_venues_sorted to add the cluster label and top venues for each neighborhood\n",
"Toronto_merged = Toronto_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')\n",
"\n",
"Toronto_merged.head() # check the last columns!"
]
},
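  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The join is a left join on the neighborhood name, so any neighborhood that returned no venues would end up with `NaN` in `Cluster Labels`, and the column would become float. That does not appear to happen in the rows shown above, but it would break the color lookup on the map. A minimal sketch of one way to guard against it, on hypothetical toy frames (`toy_df` and `toy_sorted` are made up):\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Hypothetical toy frames: neighborhood C has no venue rows, so no cluster label\n",
    "toy_df = pd.DataFrame({'Neighborhood': ['A', 'B', 'C'],\n",
    "                       'Latitude': [43.65, 43.66, 43.67]})\n",
    "toy_sorted = pd.DataFrame({'Neighborhood': ['A', 'B'],\n",
    "                           'Cluster Labels': [0, 1]})\n",
    "\n",
    "# Left join on the neighborhood name; unmatched rows get NaN\n",
    "toy_merged = toy_df.join(toy_sorted.set_index('Neighborhood'), on='Neighborhood')\n",
    "\n",
    "# Drop unlabeled rows and restore integer labels so they can index a color list\n",
    "toy_merged = toy_merged.dropna(subset=['Cluster Labels']).astype({'Cluster Labels': int})\n",
    "print(toy_merged)\n",
    "```\n",
    "\n",
    "Dropping unlabeled rows is only one option; they could also be kept and drawn in a neutral color."
   ]
  },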
{
"cell_type": "markdown",
"metadata": {},
"source": [
    "Finally, let's visualize the resulting clusters on a map."
]
},
{
"cell_type": "code",
"execution_count": 60,
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [
{
"data": {
"text/html": [
"<div style=\"width:100%;\"><div style=\"position:relative;width:100%;height:0;padding-bottom:60%;\"><iframe src=\"about:blank\" style=\"position:absolute;width:100%;height:100%;left:0;top:0;border:none !important;\" data-html= onload=\"this.contentDocument.open();this.contentDocument.write(atob(this.getAttribute('data-html')));this.contentDocument.close();\" allowfullscreen webkitallowfullscreen mozallowfullscreen></iframe></div></div>"
],
"text/plain": [
"<folium.folium.Map at 0x7fa746ee97b8>"
]
},
"execution_count": 60,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# create map\n",
"map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)\n",
"\n",
"# set color scheme for the clusters\n",
"x = np.arange(kclusters)\n",
"ys = [i + x + (i*x)**2 for i in range(kclusters)]\n",
"colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))\n",
"rainbow = [colors.rgb2hex(i) for i in colors_array]\n",
"\n",
"# add markers to the map\n",
"markers_colors = []\n",
"for lat, lon, poi, cluster in zip(Toronto_merged['Latitude'], Toronto_merged['Longitude'], Toronto_merged['Neighborhood'], Toronto_merged['Cluster Labels']):\n",
" label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)\n",
" folium.CircleMarker(\n",
" [lat, lon],\n",
" radius=5,\n",
" popup=label,\n",
" color=rainbow[cluster-1],\n",
" fill=True,\n",
" fill_color=rainbow[cluster-1],\n",
" fill_opacity=0.7).add_to(map_clusters)\n",
" \n",
"map_clusters"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python",
"language": "python",
"name": "conda-env-python-py"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.10"
}
},
"nbformat": 4,
"nbformat_minor": 4
}