@focaalvarez
Created July 13, 2019 16:35
# Keep only the regional part of the postcode (the text before the first space)
pubs['Postcode']=pubs['zip_code'].str.split(' ').str[0]
# Count pubs per postcode and store the counts in a new DataFrame
pubs_by_postcode=pd.DataFrame(data=pubs['Postcode'].value_counts())
pubs_by_postcode.reset_index(inplace=True)
pubs_by_postcode.columns=['Postcode','Pubs']
# Append population and coordinates; calculate pubs per 1,000 people
pubs_by_postcode=pubs_by_postcode.merge(cities[['Postcode', 'Latitude', 'Longitude','Town/Area','Population']],how='left',on='Postcode')
pubs_by_postcode['ratio']=pubs_by_postcode['Pubs']/pubs_by_postcode['Population']*1000
pubs_by_postcode.sort_values(by='ratio',ascending=False,inplace=True)
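# Drop outliers: keep only ratios below 3, which removes roughly the top 5% of the ratios.
# Rows with no matched population have a NaN ratio and are dropped by this filter as well.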
pubs_by_postcode=pubs_by_postcode[pubs_by_postcode['ratio']<3]
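The snippet above depends on `pubs` and `cities` frames that are defined elsewhere, and it hardcodes the outlier cutoff at 3. The following is a minimal, self-contained sketch of the same steps on made-up toy data (the postcodes, coordinates, and populations are hypothetical, not taken from the original dataset). It derives the cutoff from the 95th percentile of the ratios instead of using a fixed value. This is one possible alternative, not necessarily how the author arrived at 3.

```python
import pandas as pd

# Hypothetical toy data standing in for the real `pubs` and `cities` frames.
pubs = pd.DataFrame({'zip_code': ['AB1 2CD', 'AB1 3EF', 'XY9 1ZZ', 'QQ7 4RS']})
cities = pd.DataFrame({
    'Postcode': ['AB1', 'XY9'],
    'Latitude': [57.1, 51.5],
    'Longitude': [-2.1, -0.1],
    'Town/Area': ['Town A', 'Town B'],
    'Population': [1000, 4000],
})

# Regional part of the postcode: everything before the first space.
pubs['Postcode'] = pubs['zip_code'].str.split(' ').str[0]

# Count pubs per postcode, naming both columns explicitly so the result
# does not depend on the default names chosen by value_counts().
pubs_by_postcode = (
    pubs['Postcode'].value_counts()
    .rename_axis('Postcode')
    .reset_index(name='Pubs')
)

# Left merge keeps postcodes that have no match in `cities` (NaN population).
pubs_by_postcode = pubs_by_postcode.merge(
    cities[['Postcode', 'Latitude', 'Longitude', 'Town/Area', 'Population']],
    how='left', on='Postcode')
pubs_by_postcode['ratio'] = (
    pubs_by_postcode['Pubs'] / pubs_by_postcode['Population'] * 1000)

# Postcodes without population data; worth inspecting before filtering.
unmatched = pubs_by_postcode[pubs_by_postcode['Population'].isna()]

# Data-driven cutoff: the 95th percentile of the ratios (NaN values are skipped).
cutoff = pubs_by_postcode['ratio'].quantile(0.95)
filtered = pubs_by_postcode[pubs_by_postcode['ratio'] < cutoff]
```

Inspecting `unmatched` before filtering makes it visible how many postcodes are lost because they are missing from `cities` rather than because they are genuine outliers.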