Skip to content

Instantly share code, notes, and snippets.

@bertrandobi
Created April 4, 2019 12:43
Show Gist options
  • Save bertrandobi/075062cf2d9f02742b280ae6d5fa98c8 to your computer and use it in GitHub Desktop.
Save bertrandobi/075062cf2d9f02742b280ae6d5fa98c8 to your computer and use it in GitHub Desktop.
Created on Cognitive Class Labs
Display the source blob
Display the rendered blob
Raw
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## BUSINESS DENSITY ANALYSIS IN PARIS FRANCE"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Problem Statement"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Even when Entrepreneurs posses the capital required in setting up a business. Choosing the right location that can attract potential customers for the business is a major decision to make. Even when a city is chosen, it requires further business density analysis or neighbourhood analysis in order to choose a suitable neigbourhood for the business."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Background"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This project will be implemented in Paris - France, owing to the fact that its the Capital city of France and a historic touristic city. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Target Audience"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This project could be use by the following individuals:\n",
" 1. Business persons with interest in setting up small businesses in Paris\n",
" 2. Customers looking for where to obtain particular services in Paris\n",
" 3. Tourists interested in visiting and spending quality time in Paris for the first time\n",
" 4. Paris city administration interested in channelling balance business set ups in particular neighbourhoods in the city\n",
" 5. Government interested in balance developments in the city of Paris."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Project Objectives:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Build a dataframe of neighborhoods in Paris - France by web scraping the data from Wikipedia page\n",
"- Get the geographical coordinates of the neighborhoods using the geocoder function\n",
"- Obtain the venue data for the neighborhoods from Foursquare API\n",
"- Explore and cluster the neighborhoods\n",
"- Select the best cluster to open a new business"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Import Libraries necessary for the project"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Libraries imported!\n"
]
}
],
"source": [
"import urllib.request # open and read URLs\n",
"import json # handle JSON files\n",
"from pandas.io.json import json_normalize # tranform JSON file into a pandas dataframe\n",
"import requests # handle requests\n",
"import pandas as pd # process data as dataframes with Pandas\n",
"import numpy as np # handle data in a vectorized manner with NumPy\n",
"# !conda install -c conda-forge geopy --yes # uncomment this line if you haven't installed the GeoPy geocoding library yet\n",
"from geopy.geocoders import Nominatim # convert an address into latitude and longitude values\n",
"# !conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't installed the Folium library yet\n",
"import folium # map rendering library\n",
"# Matplolib plotting library and associated modules\n",
"import matplotlib.pyplot as plt \n",
"import matplotlib.cm as cm\n",
"import matplotlib.colors as colors\n",
"from sklearn.cluster import KMeans # for K-Means clustering with Scikit-Learn\n",
"print(\"Libraries imported!\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Connect to a csv file containing open data on Paris Neighbourhoods(Arrondissements) and transform to a Dataframe"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Arrondissement</th>\n",
" <th>Neighborhood</th>\n",
" <th>Latitude</th>\n",
" <th>Longitude</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1</td>\n",
" <td>Louvre</td>\n",
" <td>48.862563</td>\n",
" <td>2.336443</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>2</td>\n",
" <td>Bourse</td>\n",
" <td>48.868279</td>\n",
" <td>2.342803</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>3</td>\n",
" <td>Temple</td>\n",
" <td>48.862872</td>\n",
" <td>2.360001</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>4</td>\n",
" <td>Hôtel-de-Ville</td>\n",
" <td>48.854341</td>\n",
" <td>2.357630</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>5</td>\n",
" <td>Panthéon</td>\n",
" <td>48.844443</td>\n",
" <td>2.350715</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>6</td>\n",
" <td>Luxembourg</td>\n",
" <td>48.849130</td>\n",
" <td>2.332898</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>7</td>\n",
" <td>Palais-Bourbon</td>\n",
" <td>48.856174</td>\n",
" <td>2.312188</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>8</td>\n",
" <td>Elysée</td>\n",
" <td>48.872721</td>\n",
" <td>2.312554</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>9</td>\n",
" <td>Opéra</td>\n",
" <td>48.877164</td>\n",
" <td>2.337458</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>10</td>\n",
" <td>Entrepôt</td>\n",
" <td>48.876130</td>\n",
" <td>2.360728</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10</th>\n",
" <td>11</td>\n",
" <td>Popincourt</td>\n",
" <td>48.859059</td>\n",
" <td>2.380058</td>\n",
" </tr>\n",
" <tr>\n",
" <th>11</th>\n",
" <td>12</td>\n",
" <td>Reuilly</td>\n",
" <td>48.834974</td>\n",
" <td>2.421325</td>\n",
" </tr>\n",
" <tr>\n",
" <th>12</th>\n",
" <td>13</td>\n",
" <td>Gobelins</td>\n",
" <td>48.828388</td>\n",
" <td>2.362272</td>\n",
" </tr>\n",
" <tr>\n",
" <th>13</th>\n",
" <td>14</td>\n",
" <td>Observatoire</td>\n",
" <td>48.829245</td>\n",
" <td>2.326542</td>\n",
" </tr>\n",
" <tr>\n",
" <th>14</th>\n",
" <td>15</td>\n",
" <td>Vaugirard</td>\n",
" <td>48.840085</td>\n",
" <td>2.292826</td>\n",
" </tr>\n",
" <tr>\n",
" <th>15</th>\n",
" <td>16</td>\n",
" <td>Passy</td>\n",
" <td>48.860392</td>\n",
" <td>2.261971</td>\n",
" </tr>\n",
" <tr>\n",
" <th>16</th>\n",
" <td>17</td>\n",
" <td>Batignolles-Monceau</td>\n",
" <td>48.887327</td>\n",
" <td>2.306777</td>\n",
" </tr>\n",
" <tr>\n",
" <th>17</th>\n",
" <td>18</td>\n",
" <td>Buttes-Montmartre</td>\n",
" <td>48.892569</td>\n",
" <td>2.348161</td>\n",
" </tr>\n",
" <tr>\n",
" <th>18</th>\n",
" <td>19</td>\n",
" <td>Buttes-Chaumont</td>\n",
" <td>48.887076</td>\n",
" <td>2.384821</td>\n",
" </tr>\n",
" <tr>\n",
" <th>19</th>\n",
" <td>20</td>\n",
" <td>Ménilmontant</td>\n",
" <td>48.863461</td>\n",
" <td>2.401188</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Arrondissement Neighborhood Latitude Longitude\n",
"0 1 Louvre 48.862563 2.336443\n",
"1 2 Bourse 48.868279 2.342803\n",
"2 3 Temple 48.862872 2.360001\n",
"3 4 Hôtel-de-Ville 48.854341 2.357630\n",
"4 5 Panthéon 48.844443 2.350715\n",
"5 6 Luxembourg 48.849130 2.332898\n",
"6 7 Palais-Bourbon 48.856174 2.312188\n",
"7 8 Elysée 48.872721 2.312554\n",
"8 9 Opéra 48.877164 2.337458\n",
"9 10 Entrepôt 48.876130 2.360728\n",
"10 11 Popincourt 48.859059 2.380058\n",
"11 12 Reuilly 48.834974 2.421325\n",
"12 13 Gobelins 48.828388 2.362272\n",
"13 14 Observatoire 48.829245 2.326542\n",
"14 15 Vaugirard 48.840085 2.292826\n",
"15 16 Passy 48.860392 2.261971\n",
"16 17 Batignolles-Monceau 48.887327 2.306777\n",
"17 18 Buttes-Montmartre 48.892569 2.348161\n",
"18 19 Buttes-Chaumont 48.887076 2.384821\n",
"19 20 Ménilmontant 48.863461 2.401188"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dfs = pd.read_csv('arrondissements.csv', sep=';')\n",
"df=dfs[['C_AR', 'L_AR', 'L_AROFF', 'Geometry X Y']]\n",
"df[['Latitude','Longitude']]=df['Geometry X Y'].str.split(\",\",expand=True,)\n",
"df[['Latitude','Longitude']] = df[['Latitude','Longitude']].astype(float)\n",
"df1 = df.drop(df[['L_AR', 'Geometry X Y']], axis=1)\n",
"df1.columns = ['Arrondissement', 'Neighborhood', 'Latitude', 'Longitude']\n",
"paris_data = df1.sort_values([\"Arrondissement\"]).reset_index(drop=True)\n",
"paris_data.loc[paris_data.Neighborhood == \"Élysée\", [\"Neighborhood\"]] = \"Elysée\"\n",
"paris_data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Use GeoPy library to get the geographical coordinates of Paris"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The geograpical coordinates of Paris are 48.8566101, 2.3514992.\n"
]
}
],
"source": [
"address = \"Paris, FR\"\n",
"geolocator = Nominatim(user_agent=\"my-application\")\n",
"location_par = geolocator.geocode(address)\n",
"latitude_par = location_par.latitude\n",
"longitude_par = location_par.longitude\n",
"print(\"The geograpical coordinates of Paris are {}, {}.\".format(latitude_par, longitude_par))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Visualise Paris on a Map"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div style=\"width:100%;\"><div style=\"position:relative;width:100%;height:0;padding-bottom:60%;\"><iframe src=\"data:text/html;charset=utf-8;base64,\" style=\"position:absolute;width:100%;height:100%;left:0;top:0;border:none !important;\" allowfullscreen webkitallowfullscreen mozallowfullscreen></iframe></div></div>"
],
"text/plain": [
"<folium.folium.Map at 0x7f750fc03f98>"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"map_paris = folium.Map(location=[latitude_par, longitude_par], zoom_start=12)\n",
"# add markers to the map\n",
"for lat, lng, arrondissement, neighborhood in zip(paris_data['Latitude'], paris_data['Longitude'], paris_data['Arrondissement'], paris_data['Neighborhood']):\n",
" label = \"{}, {}\".format(arrondissement,neighborhood)\n",
" label = folium.Popup(label, parse_html=True)\n",
" folium.CircleMarker(\n",
" [lat, lng],\n",
" radius = 5,\n",
" popup = label,\n",
" color = \"blue\",\n",
" fill = True,\n",
" fill_color = \"#3186cc\",\n",
" fill_opacity = 0.7,\n",
" parse_html = False).add_to(map_paris) \n",
"map_paris.save('Paris.png')\n",
"map_paris"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Use Foursquare API to explore Neighbourhoods and segnment the data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Foursquare credentials"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Your credentails:\n",
"CLIENT_ID: IZJNDNMGN2UG0QC4VTSCO0X3FIJCUWLEOMDME03AZXXKSFK1\n",
"CLIENT_SECRET:2VRK4XTUNZGWEC5FU3ITX45S3KRWYECKYHE10VK0EJLMJHGJ\n"
]
}
],
"source": [
"CLIENT_ID = 'IZJNDNMGN2UG0QC4VTSCO0X3FIJCUWLEOMDME03AZXXKSFK1' # your Foursquare ID\n",
"CLIENT_SECRET = '2VRK4XTUNZGWEC5FU3ITX45S3KRWYECKYHE10VK0EJLMJHGJ' # your Foursquare Secret\n",
"VERSION = '20180605' # Foursquare API version\n",
"print('Your credentails:')\n",
"print('CLIENT_ID: ' + CLIENT_ID)\n",
"print('CLIENT_SECRET:' + CLIENT_SECRET)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Function to get venues"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"def getNearbyVenues_paris(arrondissements, names, latitudes, longitudes, radius=750, limit=100):\n",
" \n",
" venues_list = []\n",
" for arrondissement, name, lat, lng in zip(arrondissements, names, latitudes, longitudes):\n",
" print(name)\n",
" \n",
" # create the API request URL\n",
" url = \"https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}\".format(\n",
" CLIENT_ID, \n",
" CLIENT_SECRET, \n",
" VERSION, \n",
" lat, \n",
" lng, \n",
" radius, \n",
" limit)\n",
" \n",
" # make the GET request\n",
" results = requests.get(url).json()[\"response\"][\"groups\"][0][\"items\"]\n",
" \n",
" # return only relevant information for each nearby venue\n",
" venues_list.append([(\n",
" arrondissement,\n",
" name, \n",
" lat, \n",
" lng, \n",
" v[\"venue\"][\"name\"], \n",
" v[\"venue\"][\"location\"][\"lat\"], \n",
" v[\"venue\"][\"location\"][\"lng\"], \n",
" v[\"venue\"][\"categories\"][0][\"name\"]) for v in results])\n",
" \n",
" nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])\n",
" nearby_venues.columns = [\"Arrondissement\",\n",
" \"Neighborhood\", \n",
" \"Neighborhood Latitude\", \n",
" \"Neighborhood Longitude\", \n",
" \"Venue\", \n",
" \"Venue Latitude\", \n",
" \"Venue Longitude\", \n",
" \"Venue Category\"] \n",
" return(nearby_venues)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Louvre\n",
"Bourse\n",
"Temple\n",
"Hôtel-de-Ville\n",
"Panthéon\n",
"Luxembourg\n",
"Palais-Bourbon\n",
"Elysée\n",
"Opéra\n",
"Entrepôt\n",
"Popincourt\n",
"Reuilly\n",
"Gobelins\n",
"Observatoire\n",
"Vaugirard\n",
"Passy\n",
"Batignolles-Monceau\n",
"Buttes-Montmartre\n",
"Buttes-Chaumont\n",
"Ménilmontant\n"
]
}
],
"source": [
"paris_venues = getNearbyVenues_paris(arrondissements=paris_data[\"Arrondissement\"],\n",
" names = paris_data[\"Neighborhood\"],\n",
" latitudes = paris_data[\"Latitude\"],\n",
" longitudes = paris_data[\"Longitude\"]\n",
" )"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(1815, 8)\n"
]
}
],
"source": [
"print(paris_venues.shape)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"There are 222 unique categories.\n"
]
}
],
"source": [
"print(\"There are {} unique categories.\".format(len(paris_venues[\"Venue Category\"].unique())))"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Arrondissement</th>\n",
" <th>Neighborhood</th>\n",
" <th>Neighborhood Latitude</th>\n",
" <th>Neighborhood Longitude</th>\n",
" <th>Venue</th>\n",
" <th>Venue Latitude</th>\n",
" <th>Venue Longitude</th>\n",
" <th>Venue Category</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1</td>\n",
" <td>Louvre</td>\n",
" <td>48.862563</td>\n",
" <td>2.336443</td>\n",
" <td>Musée du Louvre</td>\n",
" <td>48.860847</td>\n",
" <td>2.336440</td>\n",
" <td>Art Museum</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>1</td>\n",
" <td>Louvre</td>\n",
" <td>48.862563</td>\n",
" <td>2.336443</td>\n",
" <td>Comédie-Française</td>\n",
" <td>48.863088</td>\n",
" <td>2.336612</td>\n",
" <td>Theater</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>1</td>\n",
" <td>Louvre</td>\n",
" <td>48.862563</td>\n",
" <td>2.336443</td>\n",
" <td>Palais Royal</td>\n",
" <td>48.863758</td>\n",
" <td>2.337121</td>\n",
" <td>Historic Site</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>1</td>\n",
" <td>Louvre</td>\n",
" <td>48.862563</td>\n",
" <td>2.336443</td>\n",
" <td>Place du Palais Royal</td>\n",
" <td>48.862523</td>\n",
" <td>2.336688</td>\n",
" <td>Plaza</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>1</td>\n",
" <td>Louvre</td>\n",
" <td>48.862563</td>\n",
" <td>2.336443</td>\n",
" <td>La Clef Louvre Paris</td>\n",
" <td>48.863977</td>\n",
" <td>2.336140</td>\n",
" <td>Hotel</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Arrondissement Neighborhood Neighborhood Latitude Neighborhood Longitude \\\n",
"0 1 Louvre 48.862563 2.336443 \n",
"1 1 Louvre 48.862563 2.336443 \n",
"2 1 Louvre 48.862563 2.336443 \n",
"3 1 Louvre 48.862563 2.336443 \n",
"4 1 Louvre 48.862563 2.336443 \n",
"\n",
" Venue Venue Latitude Venue Longitude Venue Category \n",
"0 Musée du Louvre 48.860847 2.336440 Art Museum \n",
"1 Comédie-Française 48.863088 2.336612 Theater \n",
"2 Palais Royal 48.863758 2.337121 Historic Site \n",
"3 Place du Palais Royal 48.862523 2.336688 Plaza \n",
"4 La Clef Louvre Paris   48.863977 2.336140 Hotel "
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"paris_venues.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Use Onehot encoding to explore Neighbourhoods"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Arrondissement</th>\n",
" <th>Neighborhood</th>\n",
" <th>Afghan Restaurant</th>\n",
" <th>African Restaurant</th>\n",
" <th>American Restaurant</th>\n",
" <th>Antique Shop</th>\n",
" <th>Arepa Restaurant</th>\n",
" <th>Argentinian Restaurant</th>\n",
" <th>Art Gallery</th>\n",
" <th>Art Museum</th>\n",
" <th>...</th>\n",
" <th>Udon Restaurant</th>\n",
" <th>Vegetarian / Vegan Restaurant</th>\n",
" <th>Venezuelan Restaurant</th>\n",
" <th>Video Game Store</th>\n",
" <th>Vietnamese Restaurant</th>\n",
" <th>Wine Bar</th>\n",
" <th>Wine Shop</th>\n",
" <th>Women's Store</th>\n",
" <th>Yoga Studio</th>\n",
" <th>Zoo</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1</td>\n",
" <td>Louvre</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>1</td>\n",
" <td>Louvre</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>1</td>\n",
" <td>Louvre</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>1</td>\n",
" <td>Louvre</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>1</td>\n",
" <td>Louvre</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>5 rows × 224 columns</p>\n",
"</div>"
],
"text/plain": [
" Arrondissement Neighborhood Afghan Restaurant African Restaurant \\\n",
"0 1 Louvre 0 0 \n",
"1 1 Louvre 0 0 \n",
"2 1 Louvre 0 0 \n",
"3 1 Louvre 0 0 \n",
"4 1 Louvre 0 0 \n",
"\n",
" American Restaurant Antique Shop Arepa Restaurant \\\n",
"0 0 0 0 \n",
"1 0 0 0 \n",
"2 0 0 0 \n",
"3 0 0 0 \n",
"4 0 0 0 \n",
"\n",
" Argentinian Restaurant Art Gallery Art Museum ... Udon Restaurant \\\n",
"0 0 0 1 ... 0 \n",
"1 0 0 0 ... 0 \n",
"2 0 0 0 ... 0 \n",
"3 0 0 0 ... 0 \n",
"4 0 0 0 ... 0 \n",
"\n",
" Vegetarian / Vegan Restaurant Venezuelan Restaurant Video Game Store \\\n",
"0 0 0 0 \n",
"1 0 0 0 \n",
"2 0 0 0 \n",
"3 0 0 0 \n",
"4 0 0 0 \n",
"\n",
" Vietnamese Restaurant Wine Bar Wine Shop Women's Store Yoga Studio Zoo \n",
"0 0 0 0 0 0 0 \n",
"1 0 0 0 0 0 0 \n",
"2 0 0 0 0 0 0 \n",
"3 0 0 0 0 0 0 \n",
"4 0 0 0 0 0 0 \n",
"\n",
"[5 rows x 224 columns]"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"paris_onehot = pd.get_dummies(paris_venues[[\"Venue Category\"]], prefix=\"\", prefix_sep=\"\")\n",
"# add arrondissement and neighborhood column back to dataframe\n",
"paris_onehot[\"Arrondissement\"] = paris_venues[\"Arrondissement\"] \n",
"paris_onehot[\"Neighborhood\"] = paris_venues[\"Neighborhood\"] \n",
"# move arrondissement and neighborhood columns to the first columns\n",
"fixed_columns = [paris_onehot.columns[-2], paris_onehot.columns[-1]] + list(paris_onehot.columns[:-2])\n",
"paris_onehot = paris_onehot[fixed_columns]\n",
"paris_onehot.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Group by neighborhood and by the mean frequency for each category"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Neighborhood</th>\n",
" <th>Arrondissement</th>\n",
" <th>Afghan Restaurant</th>\n",
" <th>African Restaurant</th>\n",
" <th>American Restaurant</th>\n",
" <th>Antique Shop</th>\n",
" <th>Arepa Restaurant</th>\n",
" <th>Argentinian Restaurant</th>\n",
" <th>Art Gallery</th>\n",
" <th>Art Museum</th>\n",
" <th>...</th>\n",
" <th>Udon Restaurant</th>\n",
" <th>Vegetarian / Vegan Restaurant</th>\n",
" <th>Venezuelan Restaurant</th>\n",
" <th>Video Game Store</th>\n",
" <th>Vietnamese Restaurant</th>\n",
" <th>Wine Bar</th>\n",
" <th>Wine Shop</th>\n",
" <th>Women's Store</th>\n",
" <th>Yoga Studio</th>\n",
" <th>Zoo</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Batignolles-Monceau</td>\n",
" <td>17</td>\n",
" <td>0.0</td>\n",
" <td>0.00</td>\n",
" <td>0.010417</td>\n",
" <td>0.0</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>...</td>\n",
" <td>0.0</td>\n",
" <td>0.00</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.010417</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Bourse</td>\n",
" <td>2</td>\n",
" <td>0.0</td>\n",
" <td>0.00</td>\n",
" <td>0.010000</td>\n",
" <td>0.0</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>...</td>\n",
" <td>0.0</td>\n",
" <td>0.00</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.00</td>\n",
" <td>0.05</td>\n",
" <td>0.01</td>\n",
" <td>0.01</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Buttes-Chaumont</td>\n",
" <td>19</td>\n",
" <td>0.0</td>\n",
" <td>0.00</td>\n",
" <td>0.010000</td>\n",
" <td>0.0</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>...</td>\n",
" <td>0.0</td>\n",
" <td>0.00</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.01</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Buttes-Montmartre</td>\n",
" <td>18</td>\n",
" <td>0.0</td>\n",
" <td>0.01</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.01</td>\n",
" <td>0.01</td>\n",
" <td>0.02</td>\n",
" <td>0.00</td>\n",
" <td>...</td>\n",
" <td>0.0</td>\n",
" <td>0.02</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.02</td>\n",
" <td>0.02</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Elysée</td>\n",
" <td>8</td>\n",
" <td>0.0</td>\n",
" <td>0.00</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.04</td>\n",
" <td>0.01</td>\n",
" <td>...</td>\n",
" <td>0.0</td>\n",
" <td>0.00</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>5 rows × 224 columns</p>\n",
"</div>"
],
"text/plain": [
" Neighborhood Arrondissement Afghan Restaurant African Restaurant \\\n",
"0 Batignolles-Monceau 17 0.0 0.00 \n",
"1 Bourse 2 0.0 0.00 \n",
"2 Buttes-Chaumont 19 0.0 0.00 \n",
"3 Buttes-Montmartre 18 0.0 0.01 \n",
"4 Elysée 8 0.0 0.00 \n",
"\n",
" American Restaurant Antique Shop Arepa Restaurant \\\n",
"0 0.010417 0.0 0.00 \n",
"1 0.010000 0.0 0.00 \n",
"2 0.010000 0.0 0.00 \n",
"3 0.000000 0.0 0.01 \n",
"4 0.000000 0.0 0.00 \n",
"\n",
" Argentinian Restaurant Art Gallery Art Museum ... Udon Restaurant \\\n",
"0 0.00 0.00 0.00 ... 0.0 \n",
"1 0.00 0.00 0.00 ... 0.0 \n",
"2 0.00 0.00 0.00 ... 0.0 \n",
"3 0.01 0.02 0.00 ... 0.0 \n",
"4 0.00 0.04 0.01 ... 0.0 \n",
"\n",
" Vegetarian / Vegan Restaurant Venezuelan Restaurant Video Game Store \\\n",
"0 0.00 0.0 0.0 \n",
"1 0.00 0.0 0.0 \n",
"2 0.00 0.0 0.0 \n",
"3 0.02 0.0 0.0 \n",
"4 0.00 0.0 0.0 \n",
"\n",
" Vietnamese Restaurant Wine Bar Wine Shop Women's Store Yoga Studio Zoo \n",
"0 0.00 0.00 0.00 0.00 0.010417 0.0 \n",
"1 0.00 0.05 0.01 0.01 0.000000 0.0 \n",
"2 0.01 0.00 0.00 0.00 0.000000 0.0 \n",
"3 0.02 0.02 0.00 0.00 0.000000 0.0 \n",
"4 0.00 0.00 0.00 0.00 0.000000 0.0 \n",
"\n",
"[5 rows x 224 columns]"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"paris_grouped = paris_onehot.groupby(\"Neighborhood\").mean().reset_index()\n",
"paris_grouped.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Top 5 most common categories of venues for each neighborhood "
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"----Batignolles-Monceau----\n",
" venue freq\n",
"0 Italian Restaurant 0.14\n",
"1 French Restaurant 0.14\n",
"2 Hotel 0.12\n",
"3 Japanese Restaurant 0.05\n",
"4 Bakery 0.05\n",
"\n",
"\n",
"----Bourse----\n",
" venue freq\n",
"0 French Restaurant 0.15\n",
"1 Bistro 0.06\n",
"2 Cocktail Bar 0.05\n",
"3 Wine Bar 0.05\n",
"4 Italian Restaurant 0.04\n",
"\n",
"\n",
"----Buttes-Chaumont----\n",
" venue freq\n",
"0 Bar 0.13\n",
"1 French Restaurant 0.08\n",
"2 Hotel 0.05\n",
"3 Pizza Place 0.05\n",
"4 Supermarket 0.04\n",
"\n",
"\n",
"----Buttes-Montmartre----\n",
" venue freq\n",
"0 French Restaurant 0.20\n",
"1 Bar 0.15\n",
"2 Pizza Place 0.06\n",
"3 Italian Restaurant 0.05\n",
"4 Bistro 0.05\n",
"\n",
"\n",
"----Elysée----\n",
" venue freq\n",
"0 French Restaurant 0.17\n",
"1 Hotel 0.17\n",
"2 Art Gallery 0.04\n",
"3 Bakery 0.04\n",
"4 Italian Restaurant 0.04\n",
"\n",
"\n",
"----Entrepôt----\n",
" venue freq\n",
"0 French Restaurant 0.08\n",
"1 Coffee Shop 0.08\n",
"2 Bistro 0.06\n",
"3 Pizza Place 0.05\n",
"4 Italian Restaurant 0.04\n",
"\n",
"\n",
"----Gobelins----\n",
" venue freq\n",
"0 Vietnamese Restaurant 0.18\n",
"1 Thai Restaurant 0.11\n",
"2 Asian Restaurant 0.11\n",
"3 Chinese Restaurant 0.07\n",
"4 French Restaurant 0.07\n",
"\n",
"\n",
"----Hôtel-de-Ville----\n",
" venue freq\n",
"0 French Restaurant 0.15\n",
"1 Ice Cream Shop 0.06\n",
"2 Plaza 0.04\n",
"3 Pastry Shop 0.03\n",
"4 Bakery 0.03\n",
"\n",
"\n",
"----Louvre----\n",
" venue freq\n",
"0 French Restaurant 0.10\n",
"1 Plaza 0.07\n",
"2 Hotel 0.07\n",
"3 Japanese Restaurant 0.06\n",
"4 Café 0.06\n",
"\n",
"\n",
"----Luxembourg----\n",
" venue freq\n",
"0 French Restaurant 0.09\n",
"1 Hotel 0.06\n",
"2 Italian Restaurant 0.05\n",
"3 Bistro 0.04\n",
"4 Chocolate Shop 0.04\n",
"\n",
"\n",
"----Ménilmontant----\n",
" venue freq\n",
"0 French Restaurant 0.14\n",
"1 Bar 0.09\n",
"2 Bakery 0.07\n",
"3 Bistro 0.06\n",
"4 Plaza 0.06\n",
"\n",
"\n",
"----Observatoire----\n",
" venue freq\n",
"0 French Restaurant 0.16\n",
"1 Hotel 0.10\n",
"2 Italian Restaurant 0.07\n",
"3 Café 0.04\n",
"4 Bistro 0.04\n",
"\n",
"\n",
"----Opéra----\n",
" venue freq\n",
"0 French Restaurant 0.17\n",
"1 Hotel 0.10\n",
"2 Cocktail Bar 0.06\n",
"3 Italian Restaurant 0.05\n",
"4 Bistro 0.04\n",
"\n",
"\n",
"----Palais-Bourbon----\n",
" venue freq\n",
"0 French Restaurant 0.28\n",
"1 Hotel 0.14\n",
"2 Plaza 0.06\n",
"3 Italian Restaurant 0.04\n",
"4 Cocktail Bar 0.03\n",
"\n",
"\n",
"----Panthéon----\n",
" venue freq\n",
"0 French Restaurant 0.18\n",
"1 Bar 0.06\n",
"2 Bakery 0.04\n",
"3 Hotel 0.04\n",
"4 Café 0.04\n",
"\n",
"\n",
"----Passy----\n",
" venue freq\n",
"0 Plaza 0.12\n",
"1 Garden 0.08\n",
"2 Lake 0.08\n",
"3 French Restaurant 0.08\n",
"4 Pool 0.08\n",
"\n",
"\n",
"----Popincourt----\n",
" venue freq\n",
"0 Bar 0.08\n",
"1 French Restaurant 0.08\n",
"2 Bistro 0.06\n",
"3 Cocktail Bar 0.06\n",
"4 Italian Restaurant 0.04\n",
"\n",
"\n",
"----Reuilly----\n",
" venue freq\n",
"0 Lake 0.22\n",
"1 Zoo 0.11\n",
"2 Bus Stop 0.11\n",
"3 Monument / Landmark 0.11\n",
"4 French Restaurant 0.11\n",
"\n",
"\n",
"----Temple----\n",
" venue freq\n",
"0 French Restaurant 0.07\n",
"1 Coffee Shop 0.05\n",
"2 Art Gallery 0.05\n",
"3 Italian Restaurant 0.04\n",
"4 Wine Bar 0.04\n",
"\n",
"\n",
"----Vaugirard----\n",
" venue freq\n",
"0 French Restaurant 0.18\n",
"1 Hotel 0.09\n",
"2 Italian Restaurant 0.05\n",
"3 Coffee Shop 0.05\n",
"4 Bistro 0.04\n",
"\n",
"\n"
]
}
],
"source": [
"num_top_venues = 5\n",
"for hood in paris_grouped[\"Neighborhood\"]:\n",
" print(\"----\"+hood+\"----\")\n",
" temp = paris_grouped[paris_grouped[\"Neighborhood\"]==hood].T.reset_index()\n",
" temp.columns = [\"venue\",\"freq\"]\n",
" temp = temp.iloc[2:]\n",
" temp[\"freq\"] = temp[\"freq\"].astype(float)\n",
" temp = temp.round({\"freq\": 2})\n",
" print(temp.sort_values(\"freq\", ascending=False).reset_index(drop=True).head(num_top_venues))\n",
" print(\"\\n\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Considering that Paris is a touristic city, Hotels, restaurants, Coffee shops, Bakery and Bars dominates most neighbourhoods.\n",
"- In Reuilly neigbourhood, despite enormous touristic potentials, restaurants and hotels seem to absent. This could be a great investment opportunity for an investor\n",
"- Banks also seem not to be very common in most communities this also represent a great investment opportunities."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Define a function to sort the venues in descending order"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
"def return_most_common_venues(row, num_top_venues):\n",
" row_categories = row.iloc[1:]\n",
" row_categories_sorted = row_categories.sort_values(ascending=False)\n",
" return row_categories_sorted.index.values[0:num_top_venues]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# The top 10 venues for each neighborhood"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Arrondissement</th>\n",
" <th>Neighborhood</th>\n",
" <th>1st Most Common Venue</th>\n",
" <th>2nd Most Common Venue</th>\n",
" <th>3rd Most Common Venue</th>\n",
" <th>4th Most Common Venue</th>\n",
" <th>5th Most Common Venue</th>\n",
" <th>6th Most Common Venue</th>\n",
" <th>7th Most Common Venue</th>\n",
" <th>8th Most Common Venue</th>\n",
" <th>9th Most Common Venue</th>\n",
" <th>10th Most Common Venue</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>17</td>\n",
" <td>Batignolles-Monceau</td>\n",
" <td>French Restaurant</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Hotel</td>\n",
" <td>Japanese Restaurant</td>\n",
" <td>Bakery</td>\n",
" <td>Restaurant</td>\n",
" <td>Café</td>\n",
" <td>Bistro</td>\n",
" <td>Plaza</td>\n",
" <td>Bar</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>2</td>\n",
" <td>Bourse</td>\n",
" <td>French Restaurant</td>\n",
" <td>Bistro</td>\n",
" <td>Wine Bar</td>\n",
" <td>Cocktail Bar</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Boutique</td>\n",
" <td>Japanese Restaurant</td>\n",
" <td>Bakery</td>\n",
" <td>Creperie</td>\n",
" <td>Hotel</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>19</td>\n",
" <td>Buttes-Chaumont</td>\n",
" <td>Bar</td>\n",
" <td>French Restaurant</td>\n",
" <td>Pizza Place</td>\n",
" <td>Hotel</td>\n",
" <td>Supermarket</td>\n",
" <td>Smoke Shop</td>\n",
" <td>Café</td>\n",
" <td>Bistro</td>\n",
" <td>Restaurant</td>\n",
" <td>Seafood Restaurant</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>18</td>\n",
" <td>Buttes-Montmartre</td>\n",
" <td>French Restaurant</td>\n",
" <td>Bar</td>\n",
" <td>Pizza Place</td>\n",
" <td>Bistro</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Café</td>\n",
" <td>Art Gallery</td>\n",
" <td>Middle Eastern Restaurant</td>\n",
" <td>Convenience Store</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>8</td>\n",
" <td>Elysée</td>\n",
" <td>French Restaurant</td>\n",
" <td>Hotel</td>\n",
" <td>Art Gallery</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Bakery</td>\n",
" <td>Theater</td>\n",
" <td>Clothing Store</td>\n",
" <td>Japanese Restaurant</td>\n",
" <td>Thai Restaurant</td>\n",
" <td>Boutique</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>10</td>\n",
" <td>Entrepôt</td>\n",
" <td>French Restaurant</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Bistro</td>\n",
" <td>Pizza Place</td>\n",
" <td>Japanese Restaurant</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Bakery</td>\n",
" <td>Breakfast Spot</td>\n",
" <td>Indian Restaurant</td>\n",
" <td>Cocktail Bar</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>13</td>\n",
" <td>Gobelins</td>\n",
" <td>Vietnamese Restaurant</td>\n",
" <td>Thai Restaurant</td>\n",
" <td>Asian Restaurant</td>\n",
" <td>French Restaurant</td>\n",
" <td>Chinese Restaurant</td>\n",
" <td>Bakery</td>\n",
" <td>Hotel</td>\n",
" <td>Cantonese Restaurant</td>\n",
" <td>Japanese Restaurant</td>\n",
" <td>Cambodian Restaurant</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>4</td>\n",
" <td>Hôtel-de-Ville</td>\n",
" <td>French Restaurant</td>\n",
" <td>Ice Cream Shop</td>\n",
" <td>Plaza</td>\n",
" <td>Wine Bar</td>\n",
" <td>Tapas Restaurant</td>\n",
" <td>Pastry Shop</td>\n",
" <td>Bakery</td>\n",
" <td>Cocktail Bar</td>\n",
" <td>Clothing Store</td>\n",
" <td>Gastropub</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>1</td>\n",
" <td>Louvre</td>\n",
" <td>French Restaurant</td>\n",
" <td>Plaza</td>\n",
" <td>Hotel</td>\n",
" <td>Café</td>\n",
" <td>Japanese Restaurant</td>\n",
" <td>Exhibit</td>\n",
" <td>Historic Site</td>\n",
" <td>Udon Restaurant</td>\n",
" <td>Art Museum</td>\n",
" <td>Cosmetics Shop</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>6</td>\n",
" <td>Luxembourg</td>\n",
" <td>French Restaurant</td>\n",
" <td>Hotel</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Wine Bar</td>\n",
" <td>Chocolate Shop</td>\n",
" <td>Bistro</td>\n",
" <td>Seafood Restaurant</td>\n",
" <td>Tea Room</td>\n",
" <td>Plaza</td>\n",
" <td>Pastry Shop</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10</th>\n",
" <td>20</td>\n",
" <td>Ménilmontant</td>\n",
" <td>French Restaurant</td>\n",
" <td>Bar</td>\n",
" <td>Bakery</td>\n",
" <td>Plaza</td>\n",
" <td>Bistro</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Café</td>\n",
" <td>Park</td>\n",
" <td>Bookstore</td>\n",
" <td>Supermarket</td>\n",
" </tr>\n",
" <tr>\n",
" <th>11</th>\n",
" <td>14</td>\n",
" <td>Observatoire</td>\n",
" <td>French Restaurant</td>\n",
" <td>Hotel</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Bistro</td>\n",
" <td>Café</td>\n",
" <td>Bar</td>\n",
" <td>Vietnamese Restaurant</td>\n",
" <td>Plaza</td>\n",
" <td>Bakery</td>\n",
" <td>Pizza Place</td>\n",
" </tr>\n",
" <tr>\n",
" <th>12</th>\n",
" <td>9</td>\n",
" <td>Opéra</td>\n",
" <td>French Restaurant</td>\n",
" <td>Hotel</td>\n",
" <td>Cocktail Bar</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Bistro</td>\n",
" <td>Bar</td>\n",
" <td>Bakery</td>\n",
" <td>Wine Bar</td>\n",
" <td>Lounge</td>\n",
" <td>Pizza Place</td>\n",
" </tr>\n",
" <tr>\n",
" <th>13</th>\n",
" <td>7</td>\n",
" <td>Palais-Bourbon</td>\n",
" <td>French Restaurant</td>\n",
" <td>Hotel</td>\n",
" <td>Plaza</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Café</td>\n",
" <td>Cocktail Bar</td>\n",
" <td>History Museum</td>\n",
" <td>Historic Site</td>\n",
" <td>Ice Cream Shop</td>\n",
" <td>Bar</td>\n",
" </tr>\n",
" <tr>\n",
" <th>14</th>\n",
" <td>5</td>\n",
" <td>Panthéon</td>\n",
" <td>French Restaurant</td>\n",
" <td>Bar</td>\n",
" <td>Wine Bar</td>\n",
" <td>Café</td>\n",
" <td>Hotel</td>\n",
" <td>Bakery</td>\n",
" <td>Plaza</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Greek Restaurant</td>\n",
" <td>Museum</td>\n",
" </tr>\n",
" <tr>\n",
" <th>15</th>\n",
" <td>16</td>\n",
" <td>Passy</td>\n",
" <td>Plaza</td>\n",
" <td>Lake</td>\n",
" <td>Pool</td>\n",
" <td>French Restaurant</td>\n",
" <td>Garden</td>\n",
" <td>Diner</td>\n",
" <td>Skate Park</td>\n",
" <td>Bus Station</td>\n",
" <td>Bus Stop</td>\n",
" <td>Cafeteria</td>\n",
" </tr>\n",
" <tr>\n",
" <th>16</th>\n",
" <td>11</td>\n",
" <td>Popincourt</td>\n",
" <td>Bar</td>\n",
" <td>French Restaurant</td>\n",
" <td>Cocktail Bar</td>\n",
" <td>Bistro</td>\n",
" <td>Pizza Place</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Restaurant</td>\n",
" <td>Vegetarian / Vegan Restaurant</td>\n",
" <td>Beer Bar</td>\n",
" <td>Wine Bar</td>\n",
" </tr>\n",
" <tr>\n",
" <th>17</th>\n",
" <td>12</td>\n",
" <td>Reuilly</td>\n",
" <td>Lake</td>\n",
" <td>Zoo</td>\n",
" <td>Diner</td>\n",
" <td>Bus Stop</td>\n",
" <td>Exhibit</td>\n",
" <td>French Restaurant</td>\n",
" <td>Monument / Landmark</td>\n",
" <td>Japanese Restaurant</td>\n",
" <td>Hot Dog Joint</td>\n",
" <td>Fast Food Restaurant</td>\n",
" </tr>\n",
" <tr>\n",
" <th>18</th>\n",
" <td>3</td>\n",
" <td>Temple</td>\n",
" <td>French Restaurant</td>\n",
" <td>Art Gallery</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Gourmet Shop</td>\n",
" <td>Wine Bar</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Chinese Restaurant</td>\n",
" <td>Moroccan Restaurant</td>\n",
" <td>Sandwich Place</td>\n",
" <td>Cocktail Bar</td>\n",
" </tr>\n",
" <tr>\n",
" <th>19</th>\n",
" <td>15</td>\n",
" <td>Vaugirard</td>\n",
" <td>French Restaurant</td>\n",
" <td>Hotel</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Bistro</td>\n",
" <td>Persian Restaurant</td>\n",
" <td>Lebanese Restaurant</td>\n",
" <td>Park</td>\n",
" <td>Restaurant</td>\n",
" <td>Japanese Restaurant</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Arrondissement Neighborhood 1st Most Common Venue \\\n",
"0 17 Batignolles-Monceau French Restaurant \n",
"1 2 Bourse French Restaurant \n",
"2 19 Buttes-Chaumont Bar \n",
"3 18 Buttes-Montmartre French Restaurant \n",
"4 8 Elysée French Restaurant \n",
"5 10 Entrepôt French Restaurant \n",
"6 13 Gobelins Vietnamese Restaurant \n",
"7 4 Hôtel-de-Ville French Restaurant \n",
"8 1 Louvre French Restaurant \n",
"9 6 Luxembourg French Restaurant \n",
"10 20 Ménilmontant French Restaurant \n",
"11 14 Observatoire French Restaurant \n",
"12 9 Opéra French Restaurant \n",
"13 7 Palais-Bourbon French Restaurant \n",
"14 5 Panthéon French Restaurant \n",
"15 16 Passy Plaza \n",
"16 11 Popincourt Bar \n",
"17 12 Reuilly Lake \n",
"18 3 Temple French Restaurant \n",
"19 15 Vaugirard French Restaurant \n",
"\n",
" 2nd Most Common Venue 3rd Most Common Venue 4th Most Common Venue \\\n",
"0 Italian Restaurant Hotel Japanese Restaurant \n",
"1 Bistro Wine Bar Cocktail Bar \n",
"2 French Restaurant Pizza Place Hotel \n",
"3 Bar Pizza Place Bistro \n",
"4 Hotel Art Gallery Italian Restaurant \n",
"5 Coffee Shop Bistro Pizza Place \n",
"6 Thai Restaurant Asian Restaurant French Restaurant \n",
"7 Ice Cream Shop Plaza Wine Bar \n",
"8 Plaza Hotel Café \n",
"9 Hotel Italian Restaurant Wine Bar \n",
"10 Bar Bakery Plaza \n",
"11 Hotel Italian Restaurant Bistro \n",
"12 Hotel Cocktail Bar Italian Restaurant \n",
"13 Hotel Plaza Italian Restaurant \n",
"14 Bar Wine Bar Café \n",
"15 Lake Pool French Restaurant \n",
"16 French Restaurant Cocktail Bar Bistro \n",
"17 Zoo Diner Bus Stop \n",
"18 Art Gallery Coffee Shop Gourmet Shop \n",
"19 Hotel Italian Restaurant Coffee Shop \n",
"\n",
" 5th Most Common Venue 6th Most Common Venue 7th Most Common Venue \\\n",
"0 Bakery Restaurant Café \n",
"1 Italian Restaurant Boutique Japanese Restaurant \n",
"2 Supermarket Smoke Shop Café \n",
"3 Italian Restaurant Coffee Shop Café \n",
"4 Bakery Theater Clothing Store \n",
"5 Japanese Restaurant Italian Restaurant Bakery \n",
"6 Chinese Restaurant Bakery Hotel \n",
"7 Tapas Restaurant Pastry Shop Bakery \n",
"8 Japanese Restaurant Exhibit Historic Site \n",
"9 Chocolate Shop Bistro Seafood Restaurant \n",
"10 Bistro Italian Restaurant Café \n",
"11 Café Bar Vietnamese Restaurant \n",
"12 Bistro Bar Bakery \n",
"13 Café Cocktail Bar History Museum \n",
"14 Hotel Bakery Plaza \n",
"15 Garden Diner Skate Park \n",
"16 Pizza Place Italian Restaurant Restaurant \n",
"17 Exhibit French Restaurant Monument / Landmark \n",
"18 Wine Bar Italian Restaurant Chinese Restaurant \n",
"19 Bistro Persian Restaurant Lebanese Restaurant \n",
"\n",
" 8th Most Common Venue 9th Most Common Venue \\\n",
"0 Bistro Plaza \n",
"1 Bakery Creperie \n",
"2 Bistro Restaurant \n",
"3 Art Gallery Middle Eastern Restaurant \n",
"4 Japanese Restaurant Thai Restaurant \n",
"5 Breakfast Spot Indian Restaurant \n",
"6 Cantonese Restaurant Japanese Restaurant \n",
"7 Cocktail Bar Clothing Store \n",
"8 Udon Restaurant Art Museum \n",
"9 Tea Room Plaza \n",
"10 Park Bookstore \n",
"11 Plaza Bakery \n",
"12 Wine Bar Lounge \n",
"13 Historic Site Ice Cream Shop \n",
"14 Coffee Shop Greek Restaurant \n",
"15 Bus Station Bus Stop \n",
"16 Vegetarian / Vegan Restaurant Beer Bar \n",
"17 Japanese Restaurant Hot Dog Joint \n",
"18 Moroccan Restaurant Sandwich Place \n",
"19 Park Restaurant \n",
"\n",
" 10th Most Common Venue \n",
"0 Bar \n",
"1 Hotel \n",
"2 Seafood Restaurant \n",
"3 Convenience Store \n",
"4 Boutique \n",
"5 Cocktail Bar \n",
"6 Cambodian Restaurant \n",
"7 Gastropub \n",
"8 Cosmetics Shop \n",
"9 Pastry Shop \n",
"10 Supermarket \n",
"11 Pizza Place \n",
"12 Pizza Place \n",
"13 Bar \n",
"14 Museum \n",
"15 Cafeteria \n",
"16 Wine Bar \n",
"17 Fast Food Restaurant \n",
"18 Cocktail Bar \n",
"19 Japanese Restaurant "
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"num_top_venues = 10\n",
"indicators = [\"st\", \"nd\", \"rd\"]\n",
"# create columns according to number of top venues\n",
"columns = [\"Arrondissement\", \"Neighborhood\"]\n",
"for ind in np.arange(num_top_venues):\n",
" try:\n",
" columns.append(\"{}{} Most Common Venue\".format(ind+1, indicators[ind]))\n",
" except:\n",
" columns.append(\"{}th Most Common Venue\".format(ind+1))\n",
"# create a new dataframe\n",
"paris_neighborhoods_venues_sorted = pd.DataFrame(columns=columns)\n",
"paris_neighborhoods_venues_sorted[[ \"Arrondissement\",\"Neighborhood\"]] = paris_grouped[[\"Arrondissement\",\"Neighborhood\"]]\n",
"for ind in np.arange(paris_grouped.shape[0]):\n",
" paris_neighborhoods_venues_sorted.iloc[ind, 2:] = return_most_common_venues(paris_grouped.iloc[ind, 1:], num_top_venues)\n",
"paris_neighborhoods_venues_sorted"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Cluster the venues and visualize them on a map"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Create the metric (same squared distances) to identify the best k"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"columns = [\"Neighborhood\", \"Arrondissement\"]\n",
"paris_grouped_clustering = paris_grouped.drop(columns, axis = 1)\n",
"Sum_of_squared_distances = []\n",
"ks = range(1,11)\n",
"for k in ks:\n",
" kmeans = KMeans(n_clusters=k, random_state=0).fit(paris_grouped_clustering)\n",
" Sum_of_squared_distances.append(kmeans.inertia_)\n",
"\n",
"# Plot of sum of squared distances\n",
"plt.plot(ks, Sum_of_squared_distances, \"bx-\")\n",
"plt.xlabel(\"k number of clusters\")\n",
"plt.ylabel(\"Sum_of_squared_distances\")\n",
"plt.title(\"Elbow method for optimal k\")\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"According to the figure above, the correct value for k could be 4 or 5. Here, we choose to cluster the data points using k=5."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Cluster the venues of Paris into 5 clusters"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([3, 1, 1, 1, 3, 1, 4, 1, 1, 1], dtype=int32)"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"kclusters = 5\n",
"# run k-means clustering\n",
"kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(paris_grouped_clustering)\n",
"# check cluster labels generated for each row in the dataframe\n",
"kmeans.labels_[0:10]"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Arrondissement</th>\n",
" <th>Neighborhood</th>\n",
" <th>Latitude</th>\n",
" <th>Longitude</th>\n",
" <th>Cluster Labels</th>\n",
" <th>1st Most Common Venue</th>\n",
" <th>2nd Most Common Venue</th>\n",
" <th>3rd Most Common Venue</th>\n",
" <th>4th Most Common Venue</th>\n",
" <th>5th Most Common Venue</th>\n",
" <th>6th Most Common Venue</th>\n",
" <th>7th Most Common Venue</th>\n",
" <th>8th Most Common Venue</th>\n",
" <th>9th Most Common Venue</th>\n",
" <th>10th Most Common Venue</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1</td>\n",
" <td>Louvre</td>\n",
" <td>48.862563</td>\n",
" <td>2.336443</td>\n",
" <td>3</td>\n",
" <td>French Restaurant</td>\n",
" <td>Plaza</td>\n",
" <td>Hotel</td>\n",
" <td>Café</td>\n",
" <td>Japanese Restaurant</td>\n",
" <td>Exhibit</td>\n",
" <td>Historic Site</td>\n",
" <td>Udon Restaurant</td>\n",
" <td>Art Museum</td>\n",
" <td>Cosmetics Shop</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>2</td>\n",
" <td>Bourse</td>\n",
" <td>48.868279</td>\n",
" <td>2.342803</td>\n",
" <td>1</td>\n",
" <td>French Restaurant</td>\n",
" <td>Bistro</td>\n",
" <td>Wine Bar</td>\n",
" <td>Cocktail Bar</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Boutique</td>\n",
" <td>Japanese Restaurant</td>\n",
" <td>Bakery</td>\n",
" <td>Creperie</td>\n",
" <td>Hotel</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>3</td>\n",
" <td>Temple</td>\n",
" <td>48.862872</td>\n",
" <td>2.360001</td>\n",
" <td>1</td>\n",
" <td>French Restaurant</td>\n",
" <td>Art Gallery</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Gourmet Shop</td>\n",
" <td>Wine Bar</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Chinese Restaurant</td>\n",
" <td>Moroccan Restaurant</td>\n",
" <td>Sandwich Place</td>\n",
" <td>Cocktail Bar</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>4</td>\n",
" <td>Hôtel-de-Ville</td>\n",
" <td>48.854341</td>\n",
" <td>2.357630</td>\n",
" <td>1</td>\n",
" <td>French Restaurant</td>\n",
" <td>Ice Cream Shop</td>\n",
" <td>Plaza</td>\n",
" <td>Wine Bar</td>\n",
" <td>Tapas Restaurant</td>\n",
" <td>Pastry Shop</td>\n",
" <td>Bakery</td>\n",
" <td>Cocktail Bar</td>\n",
" <td>Clothing Store</td>\n",
" <td>Gastropub</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>5</td>\n",
" <td>Panthéon</td>\n",
" <td>48.844443</td>\n",
" <td>2.350715</td>\n",
" <td>3</td>\n",
" <td>French Restaurant</td>\n",
" <td>Bar</td>\n",
" <td>Wine Bar</td>\n",
" <td>Café</td>\n",
" <td>Hotel</td>\n",
" <td>Bakery</td>\n",
" <td>Plaza</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Greek Restaurant</td>\n",
" <td>Museum</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>6</td>\n",
" <td>Luxembourg</td>\n",
" <td>48.849130</td>\n",
" <td>2.332898</td>\n",
" <td>1</td>\n",
" <td>French Restaurant</td>\n",
" <td>Hotel</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Wine Bar</td>\n",
" <td>Chocolate Shop</td>\n",
" <td>Bistro</td>\n",
" <td>Seafood Restaurant</td>\n",
" <td>Tea Room</td>\n",
" <td>Plaza</td>\n",
" <td>Pastry Shop</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>7</td>\n",
" <td>Palais-Bourbon</td>\n",
" <td>48.856174</td>\n",
" <td>2.312188</td>\n",
" <td>4</td>\n",
" <td>French Restaurant</td>\n",
" <td>Hotel</td>\n",
" <td>Plaza</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Café</td>\n",
" <td>Cocktail Bar</td>\n",
" <td>History Museum</td>\n",
" <td>Historic Site</td>\n",
" <td>Ice Cream Shop</td>\n",
" <td>Bar</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>8</td>\n",
" <td>Elysée</td>\n",
" <td>48.872721</td>\n",
" <td>2.312554</td>\n",
" <td>1</td>\n",
" <td>French Restaurant</td>\n",
" <td>Hotel</td>\n",
" <td>Art Gallery</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Bakery</td>\n",
" <td>Theater</td>\n",
" <td>Clothing Store</td>\n",
" <td>Japanese Restaurant</td>\n",
" <td>Thai Restaurant</td>\n",
" <td>Boutique</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>9</td>\n",
" <td>Opéra</td>\n",
" <td>48.877164</td>\n",
" <td>2.337458</td>\n",
" <td>1</td>\n",
" <td>French Restaurant</td>\n",
" <td>Hotel</td>\n",
" <td>Cocktail Bar</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Bistro</td>\n",
" <td>Bar</td>\n",
" <td>Bakery</td>\n",
" <td>Wine Bar</td>\n",
" <td>Lounge</td>\n",
" <td>Pizza Place</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>10</td>\n",
" <td>Entrepôt</td>\n",
" <td>48.876130</td>\n",
" <td>2.360728</td>\n",
" <td>1</td>\n",
" <td>French Restaurant</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Bistro</td>\n",
" <td>Pizza Place</td>\n",
" <td>Japanese Restaurant</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Bakery</td>\n",
" <td>Breakfast Spot</td>\n",
" <td>Indian Restaurant</td>\n",
" <td>Cocktail Bar</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10</th>\n",
" <td>11</td>\n",
" <td>Popincourt</td>\n",
" <td>48.859059</td>\n",
" <td>2.380058</td>\n",
" <td>1</td>\n",
" <td>Bar</td>\n",
" <td>French Restaurant</td>\n",
" <td>Cocktail Bar</td>\n",
" <td>Bistro</td>\n",
" <td>Pizza Place</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Restaurant</td>\n",
" <td>Vegetarian / Vegan Restaurant</td>\n",
" <td>Beer Bar</td>\n",
" <td>Wine Bar</td>\n",
" </tr>\n",
" <tr>\n",
" <th>11</th>\n",
" <td>12</td>\n",
" <td>Reuilly</td>\n",
" <td>48.834974</td>\n",
" <td>2.421325</td>\n",
" <td>3</td>\n",
" <td>Lake</td>\n",
" <td>Zoo</td>\n",
" <td>Diner</td>\n",
" <td>Bus Stop</td>\n",
" <td>Exhibit</td>\n",
" <td>French Restaurant</td>\n",
" <td>Monument / Landmark</td>\n",
" <td>Japanese Restaurant</td>\n",
" <td>Hot Dog Joint</td>\n",
" <td>Fast Food Restaurant</td>\n",
" </tr>\n",
" <tr>\n",
" <th>12</th>\n",
" <td>13</td>\n",
" <td>Gobelins</td>\n",
" <td>48.828388</td>\n",
" <td>2.362272</td>\n",
" <td>3</td>\n",
" <td>Vietnamese Restaurant</td>\n",
" <td>Thai Restaurant</td>\n",
" <td>Asian Restaurant</td>\n",
" <td>French Restaurant</td>\n",
" <td>Chinese Restaurant</td>\n",
" <td>Bakery</td>\n",
" <td>Hotel</td>\n",
" <td>Cantonese Restaurant</td>\n",
" <td>Japanese Restaurant</td>\n",
" <td>Cambodian Restaurant</td>\n",
" </tr>\n",
" <tr>\n",
" <th>13</th>\n",
" <td>14</td>\n",
" <td>Observatoire</td>\n",
" <td>48.829245</td>\n",
" <td>2.326542</td>\n",
" <td>3</td>\n",
" <td>French Restaurant</td>\n",
" <td>Hotel</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Bistro</td>\n",
" <td>Café</td>\n",
" <td>Bar</td>\n",
" <td>Vietnamese Restaurant</td>\n",
" <td>Plaza</td>\n",
" <td>Bakery</td>\n",
" <td>Pizza Place</td>\n",
" </tr>\n",
" <tr>\n",
" <th>14</th>\n",
" <td>15</td>\n",
" <td>Vaugirard</td>\n",
" <td>48.840085</td>\n",
" <td>2.292826</td>\n",
" <td>1</td>\n",
" <td>French Restaurant</td>\n",
" <td>Hotel</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Bistro</td>\n",
" <td>Persian Restaurant</td>\n",
" <td>Lebanese Restaurant</td>\n",
" <td>Park</td>\n",
" <td>Restaurant</td>\n",
" <td>Japanese Restaurant</td>\n",
" </tr>\n",
" <tr>\n",
" <th>15</th>\n",
" <td>16</td>\n",
" <td>Passy</td>\n",
" <td>48.860392</td>\n",
" <td>2.261971</td>\n",
" <td>2</td>\n",
" <td>Plaza</td>\n",
" <td>Lake</td>\n",
" <td>Pool</td>\n",
" <td>French Restaurant</td>\n",
" <td>Garden</td>\n",
" <td>Diner</td>\n",
" <td>Skate Park</td>\n",
" <td>Bus Station</td>\n",
" <td>Bus Stop</td>\n",
" <td>Cafeteria</td>\n",
" </tr>\n",
" <tr>\n",
" <th>16</th>\n",
" <td>17</td>\n",
" <td>Batignolles-Monceau</td>\n",
" <td>48.887327</td>\n",
" <td>2.306777</td>\n",
" <td>1</td>\n",
" <td>French Restaurant</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Hotel</td>\n",
" <td>Japanese Restaurant</td>\n",
" <td>Bakery</td>\n",
" <td>Restaurant</td>\n",
" <td>Café</td>\n",
" <td>Bistro</td>\n",
" <td>Plaza</td>\n",
" <td>Bar</td>\n",
" </tr>\n",
" <tr>\n",
" <th>17</th>\n",
" <td>18</td>\n",
" <td>Buttes-Montmartre</td>\n",
" <td>48.892569</td>\n",
" <td>2.348161</td>\n",
" <td>0</td>\n",
" <td>French Restaurant</td>\n",
" <td>Bar</td>\n",
" <td>Pizza Place</td>\n",
" <td>Bistro</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Café</td>\n",
" <td>Art Gallery</td>\n",
" <td>Middle Eastern Restaurant</td>\n",
" <td>Convenience Store</td>\n",
" </tr>\n",
" <tr>\n",
" <th>18</th>\n",
" <td>19</td>\n",
" <td>Buttes-Chaumont</td>\n",
" <td>48.887076</td>\n",
" <td>2.384821</td>\n",
" <td>1</td>\n",
" <td>Bar</td>\n",
" <td>French Restaurant</td>\n",
" <td>Pizza Place</td>\n",
" <td>Hotel</td>\n",
" <td>Supermarket</td>\n",
" <td>Smoke Shop</td>\n",
" <td>Café</td>\n",
" <td>Bistro</td>\n",
" <td>Restaurant</td>\n",
" <td>Seafood Restaurant</td>\n",
" </tr>\n",
" <tr>\n",
" <th>19</th>\n",
" <td>20</td>\n",
" <td>Ménilmontant</td>\n",
" <td>48.863461</td>\n",
" <td>2.401188</td>\n",
" <td>3</td>\n",
" <td>French Restaurant</td>\n",
" <td>Bar</td>\n",
" <td>Bakery</td>\n",
" <td>Plaza</td>\n",
" <td>Bistro</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Café</td>\n",
" <td>Park</td>\n",
" <td>Bookstore</td>\n",
" <td>Supermarket</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Arrondissement Neighborhood Latitude Longitude Cluster Labels \\\n",
"0 1 Louvre 48.862563 2.336443 3 \n",
"1 2 Bourse 48.868279 2.342803 1 \n",
"2 3 Temple 48.862872 2.360001 1 \n",
"3 4 Hôtel-de-Ville 48.854341 2.357630 1 \n",
"4 5 Panthéon 48.844443 2.350715 3 \n",
"5 6 Luxembourg 48.849130 2.332898 1 \n",
"6 7 Palais-Bourbon 48.856174 2.312188 4 \n",
"7 8 Elysée 48.872721 2.312554 1 \n",
"8 9 Opéra 48.877164 2.337458 1 \n",
"9 10 Entrepôt 48.876130 2.360728 1 \n",
"10 11 Popincourt 48.859059 2.380058 1 \n",
"11 12 Reuilly 48.834974 2.421325 3 \n",
"12 13 Gobelins 48.828388 2.362272 3 \n",
"13 14 Observatoire 48.829245 2.326542 3 \n",
"14 15 Vaugirard 48.840085 2.292826 1 \n",
"15 16 Passy 48.860392 2.261971 2 \n",
"16 17 Batignolles-Monceau 48.887327 2.306777 1 \n",
"17 18 Buttes-Montmartre 48.892569 2.348161 0 \n",
"18 19 Buttes-Chaumont 48.887076 2.384821 1 \n",
"19 20 Ménilmontant 48.863461 2.401188 3 \n",
"\n",
" 1st Most Common Venue 2nd Most Common Venue 3rd Most Common Venue \\\n",
"0 French Restaurant Plaza Hotel \n",
"1 French Restaurant Bistro Wine Bar \n",
"2 French Restaurant Art Gallery Coffee Shop \n",
"3 French Restaurant Ice Cream Shop Plaza \n",
"4 French Restaurant Bar Wine Bar \n",
"5 French Restaurant Hotel Italian Restaurant \n",
"6 French Restaurant Hotel Plaza \n",
"7 French Restaurant Hotel Art Gallery \n",
"8 French Restaurant Hotel Cocktail Bar \n",
"9 French Restaurant Coffee Shop Bistro \n",
"10 Bar French Restaurant Cocktail Bar \n",
"11 Lake Zoo Diner \n",
"12 Vietnamese Restaurant Thai Restaurant Asian Restaurant \n",
"13 French Restaurant Hotel Italian Restaurant \n",
"14 French Restaurant Hotel Italian Restaurant \n",
"15 Plaza Lake Pool \n",
"16 French Restaurant Italian Restaurant Hotel \n",
"17 French Restaurant Bar Pizza Place \n",
"18 Bar French Restaurant Pizza Place \n",
"19 French Restaurant Bar Bakery \n",
"\n",
" 4th Most Common Venue 5th Most Common Venue 6th Most Common Venue \\\n",
"0 Café Japanese Restaurant Exhibit \n",
"1 Cocktail Bar Italian Restaurant Boutique \n",
"2 Gourmet Shop Wine Bar Italian Restaurant \n",
"3 Wine Bar Tapas Restaurant Pastry Shop \n",
"4 Café Hotel Bakery \n",
"5 Wine Bar Chocolate Shop Bistro \n",
"6 Italian Restaurant Café Cocktail Bar \n",
"7 Italian Restaurant Bakery Theater \n",
"8 Italian Restaurant Bistro Bar \n",
"9 Pizza Place Japanese Restaurant Italian Restaurant \n",
"10 Bistro Pizza Place Italian Restaurant \n",
"11 Bus Stop Exhibit French Restaurant \n",
"12 French Restaurant Chinese Restaurant Bakery \n",
"13 Bistro Café Bar \n",
"14 Coffee Shop Bistro Persian Restaurant \n",
"15 French Restaurant Garden Diner \n",
"16 Japanese Restaurant Bakery Restaurant \n",
"17 Bistro Italian Restaurant Coffee Shop \n",
"18 Hotel Supermarket Smoke Shop \n",
"19 Plaza Bistro Italian Restaurant \n",
"\n",
" 7th Most Common Venue 8th Most Common Venue \\\n",
"0 Historic Site Udon Restaurant \n",
"1 Japanese Restaurant Bakery \n",
"2 Chinese Restaurant Moroccan Restaurant \n",
"3 Bakery Cocktail Bar \n",
"4 Plaza Coffee Shop \n",
"5 Seafood Restaurant Tea Room \n",
"6 History Museum Historic Site \n",
"7 Clothing Store Japanese Restaurant \n",
"8 Bakery Wine Bar \n",
"9 Bakery Breakfast Spot \n",
"10 Restaurant Vegetarian / Vegan Restaurant \n",
"11 Monument / Landmark Japanese Restaurant \n",
"12 Hotel Cantonese Restaurant \n",
"13 Vietnamese Restaurant Plaza \n",
"14 Lebanese Restaurant Park \n",
"15 Skate Park Bus Station \n",
"16 Café Bistro \n",
"17 Café Art Gallery \n",
"18 Café Bistro \n",
"19 Café Park \n",
"\n",
" 9th Most Common Venue 10th Most Common Venue \n",
"0 Art Museum Cosmetics Shop \n",
"1 Creperie Hotel \n",
"2 Sandwich Place Cocktail Bar \n",
"3 Clothing Store Gastropub \n",
"4 Greek Restaurant Museum \n",
"5 Plaza Pastry Shop \n",
"6 Ice Cream Shop Bar \n",
"7 Thai Restaurant Boutique \n",
"8 Lounge Pizza Place \n",
"9 Indian Restaurant Cocktail Bar \n",
"10 Beer Bar Wine Bar \n",
"11 Hot Dog Joint Fast Food Restaurant \n",
"12 Japanese Restaurant Cambodian Restaurant \n",
"13 Bakery Pizza Place \n",
"14 Restaurant Japanese Restaurant \n",
"15 Bus Stop Cafeteria \n",
"16 Plaza Bar \n",
"17 Middle Eastern Restaurant Convenience Store \n",
"18 Restaurant Seafood Restaurant \n",
"19 Bookstore Supermarket "
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"paris_merged = paris_data\n",
"# add clustering labels\n",
"paris_merged[\"Cluster Labels\"] = kmeans.labels_\n",
"paris_merged = pd.merge(paris_merged, paris_neighborhoods_venues_sorted, how = \"left\", left_on = [\"Arrondissement\",\"Neighborhood\"], right_on = [\"Arrondissement\",\"Neighborhood\"])\n",
"paris_merged"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Represent Clusters on a map"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div style=\"width:100%;\"><div style=\"position:relative;width:100%;height:0;padding-bottom:60%;\"><iframe src=\"data:text/html;charset=utf-8;base64,\" style=\"position:absolute;width:100%;height:100%;left:0;top:0;border:none !important;\" allowfullscreen webkitallowfullscreen mozallowfullscreen></iframe></div></div>"
],
"text/plain": [
"<folium.folium.Map at 0x7f750f66b6d8>"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"map_clusters = folium.Map(location=[latitude_par, longitude_par], zoom_start=12)\n",
"# set color scheme for the clusters\n",
"x = np.arange(kclusters)\n",
"ys = [i+x+(i*x)**2 for i in range(kclusters)]\n",
"colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))\n",
"rainbow = [colors.rgb2hex(i) for i in colors_array]\n",
"# add markers to the map\n",
"markers_colors = []\n",
"for lat, lon, poi, cluster in zip(paris_merged[\"Latitude\"], paris_merged[\"Longitude\"], paris_merged[\"Neighborhood\"], paris_merged[\"Cluster Labels\"]):\n",
" label = folium.Popup(str(poi) + \"Cluster\" + str(cluster), parse_html=True)\n",
" folium.CircleMarker(\n",
" [lat, lon],\n",
" radius = 5,\n",
" popup = label,\n",
" color = rainbow[cluster-1],\n",
" fill = True,\n",
" fill_color = rainbow[cluster-1],\n",
" fill_opacity = 0.7).add_to(map_clusters) \n",
"map_clusters"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Cluster 1: Red"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Neighborhood</th>\n",
" <th>1st Most Common Venue</th>\n",
" <th>2nd Most Common Venue</th>\n",
" <th>3rd Most Common Venue</th>\n",
" <th>4th Most Common Venue</th>\n",
" <th>5th Most Common Venue</th>\n",
" <th>6th Most Common Venue</th>\n",
" <th>7th Most Common Venue</th>\n",
" <th>8th Most Common Venue</th>\n",
" <th>9th Most Common Venue</th>\n",
" <th>10th Most Common Venue</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>17</th>\n",
" <td>Buttes-Montmartre</td>\n",
" <td>French Restaurant</td>\n",
" <td>Bar</td>\n",
" <td>Pizza Place</td>\n",
" <td>Bistro</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Café</td>\n",
" <td>Art Gallery</td>\n",
" <td>Middle Eastern Restaurant</td>\n",
" <td>Convenience Store</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Neighborhood 1st Most Common Venue 2nd Most Common Venue \\\n",
"17 Buttes-Montmartre French Restaurant Bar \n",
"\n",
" 3rd Most Common Venue 4th Most Common Venue 5th Most Common Venue \\\n",
"17 Pizza Place Bistro Italian Restaurant \n",
"\n",
" 6th Most Common Venue 7th Most Common Venue 8th Most Common Venue \\\n",
"17 Coffee Shop Café Art Gallery \n",
"\n",
" 9th Most Common Venue 10th Most Common Venue \n",
"17 Middle Eastern Restaurant Convenience Store "
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"paris_merged.loc[paris_merged[\"Cluster Labels\"] == 0, paris_merged.columns[[1] + list(range(5, paris_merged.shape[1]))]]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Most common venues in Cluster 1 are restauration services consisting of restaurants, Bars and Café shops. This represents 80% of the venues. \n",
"Opportunities exists here for the establishment of other businesses like Hotels, banks, book stores, education services, Insurance, Shopping Centers, etc."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Cluster 2: Purple"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Neighborhood</th>\n",
" <th>1st Most Common Venue</th>\n",
" <th>2nd Most Common Venue</th>\n",
" <th>3rd Most Common Venue</th>\n",
" <th>4th Most Common Venue</th>\n",
" <th>5th Most Common Venue</th>\n",
" <th>6th Most Common Venue</th>\n",
" <th>7th Most Common Venue</th>\n",
" <th>8th Most Common Venue</th>\n",
" <th>9th Most Common Venue</th>\n",
" <th>10th Most Common Venue</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Bourse</td>\n",
" <td>French Restaurant</td>\n",
" <td>Bistro</td>\n",
" <td>Wine Bar</td>\n",
" <td>Cocktail Bar</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Boutique</td>\n",
" <td>Japanese Restaurant</td>\n",
" <td>Bakery</td>\n",
" <td>Creperie</td>\n",
" <td>Hotel</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Temple</td>\n",
" <td>French Restaurant</td>\n",
" <td>Art Gallery</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Gourmet Shop</td>\n",
" <td>Wine Bar</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Chinese Restaurant</td>\n",
" <td>Moroccan Restaurant</td>\n",
" <td>Sandwich Place</td>\n",
" <td>Cocktail Bar</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Hôtel-de-Ville</td>\n",
" <td>French Restaurant</td>\n",
" <td>Ice Cream Shop</td>\n",
" <td>Plaza</td>\n",
" <td>Wine Bar</td>\n",
" <td>Tapas Restaurant</td>\n",
" <td>Pastry Shop</td>\n",
" <td>Bakery</td>\n",
" <td>Cocktail Bar</td>\n",
" <td>Clothing Store</td>\n",
" <td>Gastropub</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>Luxembourg</td>\n",
" <td>French Restaurant</td>\n",
" <td>Hotel</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Wine Bar</td>\n",
" <td>Chocolate Shop</td>\n",
" <td>Bistro</td>\n",
" <td>Seafood Restaurant</td>\n",
" <td>Tea Room</td>\n",
" <td>Plaza</td>\n",
" <td>Pastry Shop</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>Elysée</td>\n",
" <td>French Restaurant</td>\n",
" <td>Hotel</td>\n",
" <td>Art Gallery</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Bakery</td>\n",
" <td>Theater</td>\n",
" <td>Clothing Store</td>\n",
" <td>Japanese Restaurant</td>\n",
" <td>Thai Restaurant</td>\n",
" <td>Boutique</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>Opéra</td>\n",
" <td>French Restaurant</td>\n",
" <td>Hotel</td>\n",
" <td>Cocktail Bar</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Bistro</td>\n",
" <td>Bar</td>\n",
" <td>Bakery</td>\n",
" <td>Wine Bar</td>\n",
" <td>Lounge</td>\n",
" <td>Pizza Place</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>Entrepôt</td>\n",
" <td>French Restaurant</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Bistro</td>\n",
" <td>Pizza Place</td>\n",
" <td>Japanese Restaurant</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Bakery</td>\n",
" <td>Breakfast Spot</td>\n",
" <td>Indian Restaurant</td>\n",
" <td>Cocktail Bar</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10</th>\n",
" <td>Popincourt</td>\n",
" <td>Bar</td>\n",
" <td>French Restaurant</td>\n",
" <td>Cocktail Bar</td>\n",
" <td>Bistro</td>\n",
" <td>Pizza Place</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Restaurant</td>\n",
" <td>Vegetarian / Vegan Restaurant</td>\n",
" <td>Beer Bar</td>\n",
" <td>Wine Bar</td>\n",
" </tr>\n",
" <tr>\n",
" <th>14</th>\n",
" <td>Vaugirard</td>\n",
" <td>French Restaurant</td>\n",
" <td>Hotel</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Bistro</td>\n",
" <td>Persian Restaurant</td>\n",
" <td>Lebanese Restaurant</td>\n",
" <td>Park</td>\n",
" <td>Restaurant</td>\n",
" <td>Japanese Restaurant</td>\n",
" </tr>\n",
" <tr>\n",
" <th>16</th>\n",
" <td>Batignolles-Monceau</td>\n",
" <td>French Restaurant</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Hotel</td>\n",
" <td>Japanese Restaurant</td>\n",
" <td>Bakery</td>\n",
" <td>Restaurant</td>\n",
" <td>Café</td>\n",
" <td>Bistro</td>\n",
" <td>Plaza</td>\n",
" <td>Bar</td>\n",
" </tr>\n",
" <tr>\n",
" <th>18</th>\n",
" <td>Buttes-Chaumont</td>\n",
" <td>Bar</td>\n",
" <td>French Restaurant</td>\n",
" <td>Pizza Place</td>\n",
" <td>Hotel</td>\n",
" <td>Supermarket</td>\n",
" <td>Smoke Shop</td>\n",
" <td>Café</td>\n",
" <td>Bistro</td>\n",
" <td>Restaurant</td>\n",
" <td>Seafood Restaurant</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Neighborhood 1st Most Common Venue 2nd Most Common Venue \\\n",
"1 Bourse French Restaurant Bistro \n",
"2 Temple French Restaurant Art Gallery \n",
"3 Hôtel-de-Ville French Restaurant Ice Cream Shop \n",
"5 Luxembourg French Restaurant Hotel \n",
"7 Elysée French Restaurant Hotel \n",
"8 Opéra French Restaurant Hotel \n",
"9 Entrepôt French Restaurant Coffee Shop \n",
"10 Popincourt Bar French Restaurant \n",
"14 Vaugirard French Restaurant Hotel \n",
"16 Batignolles-Monceau French Restaurant Italian Restaurant \n",
"18 Buttes-Chaumont Bar French Restaurant \n",
"\n",
" 3rd Most Common Venue 4th Most Common Venue 5th Most Common Venue \\\n",
"1 Wine Bar Cocktail Bar Italian Restaurant \n",
"2 Coffee Shop Gourmet Shop Wine Bar \n",
"3 Plaza Wine Bar Tapas Restaurant \n",
"5 Italian Restaurant Wine Bar Chocolate Shop \n",
"7 Art Gallery Italian Restaurant Bakery \n",
"8 Cocktail Bar Italian Restaurant Bistro \n",
"9 Bistro Pizza Place Japanese Restaurant \n",
"10 Cocktail Bar Bistro Pizza Place \n",
"14 Italian Restaurant Coffee Shop Bistro \n",
"16 Hotel Japanese Restaurant Bakery \n",
"18 Pizza Place Hotel Supermarket \n",
"\n",
" 6th Most Common Venue 7th Most Common Venue 8th Most Common Venue \\\n",
"1 Boutique Japanese Restaurant Bakery \n",
"2 Italian Restaurant Chinese Restaurant Moroccan Restaurant \n",
"3 Pastry Shop Bakery Cocktail Bar \n",
"5 Bistro Seafood Restaurant Tea Room \n",
"7 Theater Clothing Store Japanese Restaurant \n",
"8 Bar Bakery Wine Bar \n",
"9 Italian Restaurant Bakery Breakfast Spot \n",
"10 Italian Restaurant Restaurant Vegetarian / Vegan Restaurant \n",
"14 Persian Restaurant Lebanese Restaurant Park \n",
"16 Restaurant Café Bistro \n",
"18 Smoke Shop Café Bistro \n",
"\n",
" 9th Most Common Venue 10th Most Common Venue \n",
"1 Creperie Hotel \n",
"2 Sandwich Place Cocktail Bar \n",
"3 Clothing Store Gastropub \n",
"5 Plaza Pastry Shop \n",
"7 Thai Restaurant Boutique \n",
"8 Lounge Pizza Place \n",
"9 Indian Restaurant Cocktail Bar \n",
"10 Beer Bar Wine Bar \n",
"14 Restaurant Japanese Restaurant \n",
"16 Plaza Bar \n",
"18 Restaurant Seafood Restaurant "
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"paris_merged.loc[paris_merged[\"Cluster Labels\"] == 1, paris_merged.columns[[1] + list(range(5, paris_merged.shape[1]))]]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Cluster 2 has eleven neighbourhood consisting of mainly restauration services and hotels. This could also be a great location for the establishment of shpping centers, bakery and a banks"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Cluster 3: Blue "
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Neighborhood</th>\n",
" <th>1st Most Common Venue</th>\n",
" <th>2nd Most Common Venue</th>\n",
" <th>3rd Most Common Venue</th>\n",
" <th>4th Most Common Venue</th>\n",
" <th>5th Most Common Venue</th>\n",
" <th>6th Most Common Venue</th>\n",
" <th>7th Most Common Venue</th>\n",
" <th>8th Most Common Venue</th>\n",
" <th>9th Most Common Venue</th>\n",
" <th>10th Most Common Venue</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>15</th>\n",
" <td>Passy</td>\n",
" <td>Plaza</td>\n",
" <td>Lake</td>\n",
" <td>Pool</td>\n",
" <td>French Restaurant</td>\n",
" <td>Garden</td>\n",
" <td>Diner</td>\n",
" <td>Skate Park</td>\n",
" <td>Bus Station</td>\n",
" <td>Bus Stop</td>\n",
" <td>Cafeteria</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Neighborhood 1st Most Common Venue 2nd Most Common Venue \\\n",
"15 Passy Plaza Lake \n",
"\n",
" 3rd Most Common Venue 4th Most Common Venue 5th Most Common Venue \\\n",
"15 Pool French Restaurant Garden \n",
"\n",
" 6th Most Common Venue 7th Most Common Venue 8th Most Common Venue \\\n",
"15 Diner Skate Park Bus Station \n",
"\n",
" 9th Most Common Venue 10th Most Common Venue \n",
"15 Bus Stop Cafeteria "
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"paris_merged.loc[paris_merged[\"Cluster Labels\"] == 2, paris_merged.columns[[1] + list(range(5, paris_merged.shape[1]))]]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Cluster 3 also has one neighbourhood consisting of mainly restauration services and touristic sites. Business like banks or payment counters and hotels could also be established here"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Cluster 4: Green"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Neighborhood</th>\n",
" <th>1st Most Common Venue</th>\n",
" <th>2nd Most Common Venue</th>\n",
" <th>3rd Most Common Venue</th>\n",
" <th>4th Most Common Venue</th>\n",
" <th>5th Most Common Venue</th>\n",
" <th>6th Most Common Venue</th>\n",
" <th>7th Most Common Venue</th>\n",
" <th>8th Most Common Venue</th>\n",
" <th>9th Most Common Venue</th>\n",
" <th>10th Most Common Venue</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Louvre</td>\n",
" <td>French Restaurant</td>\n",
" <td>Plaza</td>\n",
" <td>Hotel</td>\n",
" <td>Café</td>\n",
" <td>Japanese Restaurant</td>\n",
" <td>Exhibit</td>\n",
" <td>Historic Site</td>\n",
" <td>Udon Restaurant</td>\n",
" <td>Art Museum</td>\n",
" <td>Cosmetics Shop</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Panthéon</td>\n",
" <td>French Restaurant</td>\n",
" <td>Bar</td>\n",
" <td>Wine Bar</td>\n",
" <td>Café</td>\n",
" <td>Hotel</td>\n",
" <td>Bakery</td>\n",
" <td>Plaza</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Greek Restaurant</td>\n",
" <td>Museum</td>\n",
" </tr>\n",
" <tr>\n",
" <th>11</th>\n",
" <td>Reuilly</td>\n",
" <td>Lake</td>\n",
" <td>Zoo</td>\n",
" <td>Diner</td>\n",
" <td>Bus Stop</td>\n",
" <td>Exhibit</td>\n",
" <td>French Restaurant</td>\n",
" <td>Monument / Landmark</td>\n",
" <td>Japanese Restaurant</td>\n",
" <td>Hot Dog Joint</td>\n",
" <td>Fast Food Restaurant</td>\n",
" </tr>\n",
" <tr>\n",
" <th>12</th>\n",
" <td>Gobelins</td>\n",
" <td>Vietnamese Restaurant</td>\n",
" <td>Thai Restaurant</td>\n",
" <td>Asian Restaurant</td>\n",
" <td>French Restaurant</td>\n",
" <td>Chinese Restaurant</td>\n",
" <td>Bakery</td>\n",
" <td>Hotel</td>\n",
" <td>Cantonese Restaurant</td>\n",
" <td>Japanese Restaurant</td>\n",
" <td>Cambodian Restaurant</td>\n",
" </tr>\n",
" <tr>\n",
" <th>13</th>\n",
" <td>Observatoire</td>\n",
" <td>French Restaurant</td>\n",
" <td>Hotel</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Bistro</td>\n",
" <td>Café</td>\n",
" <td>Bar</td>\n",
" <td>Vietnamese Restaurant</td>\n",
" <td>Plaza</td>\n",
" <td>Bakery</td>\n",
" <td>Pizza Place</td>\n",
" </tr>\n",
" <tr>\n",
" <th>19</th>\n",
" <td>Ménilmontant</td>\n",
" <td>French Restaurant</td>\n",
" <td>Bar</td>\n",
" <td>Bakery</td>\n",
" <td>Plaza</td>\n",
" <td>Bistro</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Café</td>\n",
" <td>Park</td>\n",
" <td>Bookstore</td>\n",
" <td>Supermarket</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Neighborhood 1st Most Common Venue 2nd Most Common Venue \\\n",
"0 Louvre French Restaurant Plaza \n",
"4 Panthéon French Restaurant Bar \n",
"11 Reuilly Lake Zoo \n",
"12 Gobelins Vietnamese Restaurant Thai Restaurant \n",
"13 Observatoire French Restaurant Hotel \n",
"19 Ménilmontant French Restaurant Bar \n",
"\n",
" 3rd Most Common Venue 4th Most Common Venue 5th Most Common Venue \\\n",
"0 Hotel Café Japanese Restaurant \n",
"4 Wine Bar Café Hotel \n",
"11 Diner Bus Stop Exhibit \n",
"12 Asian Restaurant French Restaurant Chinese Restaurant \n",
"13 Italian Restaurant Bistro Café \n",
"19 Bakery Plaza Bistro \n",
"\n",
" 6th Most Common Venue 7th Most Common Venue 8th Most Common Venue \\\n",
"0 Exhibit Historic Site Udon Restaurant \n",
"4 Bakery Plaza Coffee Shop \n",
"11 French Restaurant Monument / Landmark Japanese Restaurant \n",
"12 Bakery Hotel Cantonese Restaurant \n",
"13 Bar Vietnamese Restaurant Plaza \n",
"19 Italian Restaurant Café Park \n",
"\n",
" 9th Most Common Venue 10th Most Common Venue \n",
"0 Art Museum Cosmetics Shop \n",
"4 Greek Restaurant Museum \n",
"11 Hot Dog Joint Fast Food Restaurant \n",
"12 Japanese Restaurant Cambodian Restaurant \n",
"13 Bakery Pizza Place \n",
"19 Bookstore Supermarket "
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"paris_merged.loc[paris_merged[\"Cluster Labels\"] == 3, paris_merged.columns[[1] + list(range(5, paris_merged.shape[1]))]]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Cluster 4 has six Neighborhoods consisting of mainly restauration services, touristic sites and few hotels. Hotels could be a great investment in this cluster. other posiible investments are shopping centers, retails shops, banks, etc."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Cluster 5: Orange (west of Paris)"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Neighborhood</th>\n",
" <th>1st Most Common Venue</th>\n",
" <th>2nd Most Common Venue</th>\n",
" <th>3rd Most Common Venue</th>\n",
" <th>4th Most Common Venue</th>\n",
" <th>5th Most Common Venue</th>\n",
" <th>6th Most Common Venue</th>\n",
" <th>7th Most Common Venue</th>\n",
" <th>8th Most Common Venue</th>\n",
" <th>9th Most Common Venue</th>\n",
" <th>10th Most Common Venue</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>Palais-Bourbon</td>\n",
" <td>French Restaurant</td>\n",
" <td>Hotel</td>\n",
" <td>Plaza</td>\n",
" <td>Italian Restaurant</td>\n",
" <td>Café</td>\n",
" <td>Cocktail Bar</td>\n",
" <td>History Museum</td>\n",
" <td>Historic Site</td>\n",
" <td>Ice Cream Shop</td>\n",
" <td>Bar</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Neighborhood 1st Most Common Venue 2nd Most Common Venue \\\n",
"6 Palais-Bourbon French Restaurant Hotel \n",
"\n",
" 3rd Most Common Venue 4th Most Common Venue 5th Most Common Venue \\\n",
"6 Plaza Italian Restaurant Café \n",
"\n",
" 6th Most Common Venue 7th Most Common Venue 8th Most Common Venue \\\n",
"6 Cocktail Bar History Museum Historic Site \n",
"\n",
" 9th Most Common Venue 10th Most Common Venue \n",
"6 Ice Cream Shop Bar "
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"paris_merged.loc[paris_merged[\"Cluster Labels\"] == 4, paris_merged.columns[[1] + list(range(5, paris_merged.shape[1]))]]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Cluster 5 has one Neighborhood mainly consisting of restauration, Hotels and touristic business venues. Additional restaurants, banks and retail shops could be established here."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Considering that Paris is a touristic city, Hotels, restaurants, Coffee shops, Bakery and Bars dominates most neighbourhoods.\n",
"- In Reuilly neigbourhood, despite enormous touristic potentials, restaurants and hotels seem to absent. This could be a great investment opportunity for an investor\n",
"- Banks also seem not to be very common in most communities this also represent a great investment opportunities."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Conclusion"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Using Business density analysis, lots of opportunities exists to established competitive businesses in various neighborhoods of the city of Paris. This analysis can also be conducted in other cities in the World with adequate venues data in Foursquare. The results of this analysis when combined with other factors will enable improved business locations for sustainability."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.8"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment