
# digitalWestie/clustering_scotland.md

Last active June 8, 2017 23:50
Clustering Scotland's constituencies

# Using clustering to group Scotland's constituencies

A few days ago I briefly got involved in The Bureau Local's Voterpower Hack. Participants were provided with demographic and political data sourced from the Office for National Statistics, the Cabinet Office and the British Election Study.

To get started, I decided to play with some Python machine learning libraries. This was partly a learning exercise, but I also wondered if I could surface any insights.

The first thing I did was run a k-means clustering algorithm on the constituency demographic & voting data.

You can see the results and the code below. In k-means clustering you tell the algorithm how many groups you want it to produce. Here I've provided the results for 2, 4, and 6 groups. A '1' against a constituency means that the constituency belongs to the group in that column.
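To make the role of the group count concrete, here is a minimal sketch of k-means written with the standard library only. It is not the scikit-learn implementation used for the results here: the starting centroids are simply the first k points, and the names and numbers are made up for illustration.

```python
import math


def kmeans(points, k, iterations=20):
    """Tiny k-means (Lloyd's algorithm) on a list of numeric tuples.

    Centroids start at the first k points, an arbitrary choice that keeps
    this sketch deterministic; scikit-learn seeds them more carefully.
    """
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return labels


def membership_table(names, labels, k):
    """One row per name: 1 in the column of its group, 0 elsewhere."""
    return {
        name: [1 if lab == c else 0 for c in range(k)]
        for name, lab in zip(names, labels)
    }


# Hypothetical rows: (population in thousands, median age). Not the hack data.
names = ["Rural A", "City A", "Town A", "Rural B", "City B", "Town B"]
points = [(20, 48), (90, 30), (60, 40), (22, 47), (88, 31), (62, 41)]

for k in (2, 3):
    print(f"k={k}")
    for name, row in membership_table(names, kmeans(points, k), k).items():
        print(name, *row)
```

With k=2 the two town rows are absorbed into the same group as the cities. With k=3 they get a group of their own. That is the kind of shift to look for when comparing the sheets for different group counts.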

Comparing the results across the different numbers of groups can be revealing. In most of the results you can see the highland and island constituencies end up in the same group. Split into just 2 groups, though, and you'll see Glasgow North grouped with more rural constituencies. This is a funny one, but if you inspect the data you'll see its 'usual_residents' figure is low for a city, possibly the result of a high student population.
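One possible reason a single raw count can pull a constituency into an unexpected group is that k-means compares rows by Euclidean distance, so a column measured in tens of thousands can outweigh one measured in years. Whether that explains Glasgow North has not been checked against the hack data. The sketch below uses made-up rows, and the helper names `standardise` and `nearest` are invented for this example.

```python
import math
import statistics

# Made-up rows for illustration only: (usual_residents, median_age).
rows = {
    "rural A": (33_000, 47),
    "rural B": (34_000, 46),
    "small city": (40_000, 29),
    "city A": (55_000, 31),
    "city B": (57_000, 30),
}


def standardise(rows):
    """Rescale every column to mean 0 and standard deviation 1 (z-scores)."""
    columns = list(zip(*rows.values()))
    means = [statistics.mean(col) for col in columns]
    spreads = [statistics.pstdev(col) for col in columns]
    return {
        name: tuple((v - m) / s for v, m, s in zip(values, means, spreads))
        for name, values in rows.items()
    }


def nearest(name, rows):
    """Return the other row closest to `name` by Euclidean distance."""
    others = (n for n in rows if n != name)
    return min(others, key=lambda n: math.dist(rows[name], rows[n]))


# On raw values the resident count dominates the distance.
print(nearest("small city", rows))               # rural B
# After standardising, median age counts as much as resident count.
print(nearest("small city", standardise(rows)))  # city A
```

If scale does turn out to matter, standardising the feature columns before clustering would be one thing to try. It would change the groupings, so the sheets below would need regenerating.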

Have a look and see if you can spot any other oddities. With a little more time it might be nice to visualise this on a map with a neat cluster count slider control.

Edit: I just noticed that in the 6 clusters sheet Glasgow Central appears in a group of its own. I'm a bit unsure, but it could be a result of Glasgow Central having the lowest median age (30) of all constituencies in Scotland.
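Groups with a single member can be picked out mechanically rather than by eye. The following is a small sketch using only the standard library. It assumes the membership sheets are plain CSV with a `constituency` column followed by one 0/1 column per group, and it reads a few rows copied from the 6-cluster output instead of the full file. The function name `members_by_group` is made up for this example.

```python
import csv
import io

# A handful of rows copied from the 6-cluster sheet, not the whole file.
excerpt = '''constituency,0,1,2,3,4,5
Glasgow Central,0,0,1,0,0,0
Glasgow North,0,0,0,0,1,0
Na H-Eileanan An Iar,0,0,0,1,0,0
Orkney & Shetland,0,0,0,1,0,0
"Ross, Skye & Lochaber",0,0,0,0,1,0
'''


def members_by_group(csv_file):
    """Map each group column to the constituencies flagged with a 1."""
    groups = {}
    for row in csv.DictReader(csv_file):
        name = row.pop("constituency")
        for group, flag in row.items():
            if flag == "1":
                groups.setdefault(group, []).append(name)
    return groups


groups = members_by_group(io.StringIO(excerpt))
# Keep only the groups that contain exactly one constituency.
singletons = {g: names[0] for g, names in groups.items() if len(names) == 1}
print(singletons)  # {'2': 'Glasgow Central'}
```

To run it on a real sheet, pass an open file handle in place of the `io.StringIO` wrapper.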

`nclusters_2.csv`

| constituency | 0 | 1 |
| --- | --- | --- |
| Aberdeen North | 1 | 0 |
| Aberdeen South | 1 | 0 |
| Aberdeenshire West & Kincardine | 1 | 0 |
| Airdrie & Shotts | 1 | 0 |
| Angus | 1 | 0 |
| Argyll & Bute | 1 | 0 |
| Ayr, Carrick & Cumnock | 1 | 0 |
| Ayrshire Central | 1 | 0 |
| Ayrshire North & Arran | 1 | 0 |
| Banff & Buchan | 1 | 0 |
| Berwickshire, Roxburgh & Selkirk | 1 | 0 |
| Caithness, Sutherland & Easter Ross | 0 | 1 |
| Coatbridge, Chryston & Bellshill | 1 | 0 |
| Cumbernauld, Kilsyth & Kirkintilloch East | 1 | 0 |
| Dumfries & Galloway | 1 | 0 |
| Dumfriesshire, Clydesdale & Tweeddale | 1 | 0 |
| Dunbartonshire East | 1 | 0 |
| Dunbartonshire West | 1 | 0 |
| Dundee East | 1 | 0 |
| Dundee West | 1 | 0 |
| Dunfermline & Fife West | 1 | 0 |
| East Kilbride, Strathaven & Lesmahagow | 1 | 0 |
| East Lothian | 1 | 0 |
| Edinburgh East | 1 | 0 |
| Edinburgh North & Leith | 1 | 0 |
| Edinburgh South | 1 | 0 |
| Edinburgh South West | 1 | 0 |
| Edinburgh West | 1 | 0 |
| Falkirk | 1 | 0 |
| Fife North East | 1 | 0 |
| Glasgow Central | 1 | 0 |
| Glasgow East | 1 | 0 |
| Glasgow North | 0 | 1 |
| Glasgow North East | 1 | 0 |
| Glasgow North West | 1 | 0 |
| Glasgow South | 1 | 0 |
| Glasgow South West | 1 | 0 |
| Glenrothes | 1 | 0 |
| Gordon | 1 | 0 |
| Inverclyde | 1 | 0 |
| Inverness, Nairn, Badenoch & Strathspey | 1 | 0 |
| Kilmarnock & Loudoun | 1 | 0 |
| Kirkcaldy & Cowdenbeath | 1 | 0 |
| Lanark & Hamilton East | 1 | 0 |
| Linlithgow & Falkirk East | 1 | 0 |
| Livingston | 1 | 0 |
| Midlothian | 1 | 0 |
| Moray | 1 | 0 |
| Motherwell & Wishaw | 1 | 0 |
| Na H-Eileanan An Iar | 0 | 1 |
| Ochil & Perthshire South | 1 | 0 |
| Orkney & Shetland | 0 | 1 |
| Paisley & Renfrewshire North | 1 | 0 |
| Paisley & Renfrewshire South | 1 | 0 |
| Perth & Perthshire North | 1 | 0 |
| Renfrewshire East | 1 | 0 |
| Ross, Skye & Lochaber | 0 | 1 |
| Rutherglen & Hamilton West | 1 | 0 |
| Stirling | 1 | 0 |


`nclusters_4.csv`

| constituency | 0 | 1 | 2 | 3 |
| --- | --- | --- | --- | --- |
| Aberdeen North | 0 | 1 | 0 | 0 |
| Aberdeen South | 0 | 1 | 0 | 0 |
| Aberdeenshire West & Kincardine | 0 | 0 | 1 | 0 |
| Airdrie & Shotts | 0 | 0 | 1 | 0 |
| Angus | 0 | 0 | 1 | 0 |
| Argyll & Bute | 0 | 0 | 1 | 0 |
| Ayr, Carrick & Cumnock | 0 | 0 | 1 | 0 |
| Ayrshire Central | 0 | 0 | 1 | 0 |
| Ayrshire North & Arran | 0 | 0 | 1 | 0 |
| Banff & Buchan | 0 | 0 | 1 | 0 |
| Berwickshire, Roxburgh & Selkirk | 0 | 0 | 1 | 0 |
| Caithness, Sutherland & Easter Ross | 1 | 0 | 0 | 0 |
| Coatbridge, Chryston & Bellshill | 0 | 0 | 1 | 0 |
| Cumbernauld, Kilsyth & Kirkintilloch East | 0 | 0 | 1 | 0 |
| Dumfries & Galloway | 0 | 0 | 1 | 0 |
| Dumfriesshire, Clydesdale & Tweeddale | 0 | 0 | 1 | 0 |
| Dunbartonshire East | 0 | 0 | 1 | 0 |
| Dunbartonshire West | 0 | 0 | 1 | 0 |
| Dundee East | 0 | 0 | 1 | 0 |
| Dundee West | 0 | 0 | 1 | 0 |
| Dunfermline & Fife West | 0 | 0 | 1 | 0 |
| East Kilbride, Strathaven & Lesmahagow | 0 | 0 | 1 | 0 |
| East Lothian | 0 | 0 | 1 | 0 |
| Edinburgh East | 0 | 1 | 0 | 0 |
| Edinburgh North & Leith | 0 | 1 | 0 | 0 |
| Edinburgh South | 0 | 0 | 1 | 0 |
| Edinburgh South West | 0 | 1 | 0 | 0 |
| Edinburgh West | 0 | 1 | 0 | 0 |
| Falkirk | 0 | 0 | 1 | 0 |
| Fife North East | 0 | 0 | 1 | 0 |
| Glasgow Central | 0 | 0 | 0 | 1 |
| Glasgow East | 0 | 0 | 1 | 0 |
| Glasgow North | 1 | 0 | 0 | 0 |
| Glasgow North East | 0 | 0 | 1 | 0 |
| Glasgow North West | 0 | 0 | 1 | 0 |
| Glasgow South | 0 | 0 | 1 | 0 |
| Glasgow South West | 0 | 0 | 1 | 0 |
| Glenrothes | 0 | 0 | 1 | 0 |
| Gordon | 0 | 1 | 0 | 0 |
| Inverclyde | 0 | 0 | 1 | 0 |
| Inverness, Nairn, Badenoch & Strathspey | 0 | 1 | 0 | 0 |
| Kilmarnock & Loudoun | 0 | 0 | 1 | 0 |
| Kirkcaldy & Cowdenbeath | 0 | 0 | 1 | 0 |
| Lanark & Hamilton East | 0 | 0 | 1 | 0 |
| Linlithgow & Falkirk East | 0 | 0 | 1 | 0 |
| Livingston | 0 | 1 | 0 | 0 |
| Midlothian | 0 | 0 | 1 | 0 |
| Moray | 0 | 0 | 1 | 0 |
| Motherwell & Wishaw | 0 | 0 | 1 | 0 |
| Na H-Eileanan An Iar | 1 | 0 | 0 | 0 |
| Ochil & Perthshire South | 0 | 0 | 1 | 0 |
| Orkney & Shetland | 1 | 0 | 0 | 0 |
| Paisley & Renfrewshire North | 0 | 0 | 1 | 0 |
| Paisley & Renfrewshire South | 0 | 0 | 1 | 0 |
| Perth & Perthshire North | 0 | 0 | 1 | 0 |
| Renfrewshire East | 0 | 0 | 1 | 0 |
| Ross, Skye & Lochaber | 1 | 0 | 0 | 0 |
| Rutherglen & Hamilton West | 0 | 0 | 1 | 0 |
| Stirling | 0 | 0 | 1 | 0 |


`nclusters_6.csv`

| constituency | 0 | 1 | 2 | 3 | 4 | 5 |
| --- | --- | --- | --- | --- | --- | --- |
| Aberdeen North | 0 | 0 | 0 | 0 | 0 | 1 |
| Aberdeen South | 0 | 0 | 0 | 0 | 0 | 1 |
| Aberdeenshire West & Kincardine | 0 | 1 | 0 | 0 | 0 | 0 |
| Airdrie & Shotts | 1 | 0 | 0 | 0 | 0 | 0 |
| Angus | 1 | 0 | 0 | 0 | 0 | 0 |
| Argyll & Bute | 1 | 0 | 0 | 0 | 0 | 0 |
| Ayr, Carrick & Cumnock | 0 | 1 | 0 | 0 | 0 | 0 |
| Ayrshire Central | 1 | 0 | 0 | 0 | 0 | 0 |
| Ayrshire North & Arran | 0 | 1 | 0 | 0 | 0 | 0 |
| Banff & Buchan | 1 | 0 | 0 | 0 | 0 | 0 |
| Berwickshire, Roxburgh & Selkirk | 0 | 1 | 0 | 0 | 0 | 0 |
| Caithness, Sutherland & Easter Ross | 0 | 0 | 0 | 0 | 1 | 0 |
| Coatbridge, Chryston & Bellshill | 0 | 1 | 0 | 0 | 0 | 0 |
| Cumbernauld, Kilsyth & Kirkintilloch East | 1 | 0 | 0 | 0 | 0 | 0 |
| Dumfries & Galloway | 0 | 1 | 0 | 0 | 0 | 0 |
| Dumfriesshire, Clydesdale & Tweeddale | 1 | 0 | 0 | 0 | 0 | 0 |
| Dunbartonshire East | 1 | 0 | 0 | 0 | 0 | 0 |
| Dunbartonshire West | 1 | 0 | 0 | 0 | 0 | 0 |
| Dundee East | 1 | 0 | 0 | 0 | 0 | 0 |
| Dundee West | 1 | 0 | 0 | 0 | 0 | 0 |
| Dunfermline & Fife West | 0 | 1 | 0 | 0 | 0 | 0 |
| East Kilbride, Strathaven & Lesmahagow | 0 | 1 | 0 | 0 | 0 | 0 |
| East Lothian | 0 | 1 | 0 | 0 | 0 | 0 |
| Edinburgh East | 0 | 0 | 0 | 0 | 0 | 1 |
| Edinburgh North & Leith | 0 | 0 | 0 | 0 | 0 | 1 |
| Edinburgh South | 1 | 0 | 0 | 0 | 0 | 0 |
| Edinburgh South West | 0 | 0 | 0 | 0 | 0 | 1 |
| Edinburgh West | 0 | 0 | 0 | 0 | 0 | 1 |
| Falkirk | 0 | 1 | 0 | 0 | 0 | 0 |
| Fife North East | 1 | 0 | 0 | 0 | 0 | 0 |
| Glasgow Central | 0 | 0 | 1 | 0 | 0 | 0 |
| Glasgow East | 1 | 0 | 0 | 0 | 0 | 0 |
| Glasgow North | 0 | 0 | 0 | 0 | 1 | 0 |
| Glasgow North East | 1 | 0 | 0 | 0 | 0 | 0 |
| Glasgow North West | 1 | 0 | 0 | 0 | 0 | 0 |
| Glasgow South | 1 | 0 | 0 | 0 | 0 | 0 |
| Glasgow South West | 1 | 0 | 0 | 0 | 0 | 0 |
| Glenrothes | 1 | 0 | 0 | 0 | 0 | 0 |
| Gordon | 0 | 0 | 0 | 0 | 0 | 1 |
| Inverclyde | 1 | 0 | 0 | 0 | 0 | 0 |
| Inverness, Nairn, Badenoch & Strathspey | 0 | 1 | 0 | 0 | 0 | 0 |
| Kilmarnock & Loudoun | 0 | 1 | 0 | 0 | 0 | 0 |
| Kirkcaldy & Cowdenbeath | 0 | 1 | 0 | 0 | 0 | 0 |
| Lanark & Hamilton East | 0 | 1 | 0 | 0 | 0 | 0 |
| Linlithgow & Falkirk East | 0 | 1 | 0 | 0 | 0 | 0 |
| Livingston | 0 | 1 | 0 | 0 | 0 | 0 |
| Midlothian | 1 | 0 | 0 | 0 | 0 | 0 |
| Moray | 0 | 1 | 0 | 0 | 0 | 0 |
| Motherwell & Wishaw | 1 | 0 | 0 | 0 | 0 | 0 |
| Na H-Eileanan An Iar | 0 | 0 | 0 | 1 | 0 | 0 |
| Ochil & Perthshire South | 0 | 1 | 0 | 0 | 0 | 0 |
| Orkney & Shetland | 0 | 0 | 0 | 1 | 0 | 0 |
| Paisley & Renfrewshire North | 1 | 0 | 0 | 0 | 0 | 0 |
| Paisley & Renfrewshire South | 1 | 0 | 0 | 0 | 0 | 0 |
| Perth & Perthshire North | 0 | 1 | 0 | 0 | 0 | 0 |
| Renfrewshire East | 1 | 0 | 0 | 0 | 0 | 0 |
| Ross, Skye & Lochaber | 0 | 0 | 0 | 0 | 1 | 0 |
| Rutherglen & Hamilton West | 0 | 1 | 0 | 0 | 0 | 0 |
| Stirling | 1 | 0 | 0 | 0 | 0 | 0 |


```python
import pandas as pd
from sklearn.cluster import KMeans

dataset = pd.read_csv('scotland_demographics.csv')
# print(dataset.shape)


def cluster_demographics(n_clusters):
    # Columns 3 to 103 hold the features fed to k-means.
    features = dataset.iloc[:, 3:104]
    kmeans_model = KMeans(n_clusters=n_clusters, random_state=1).fit(features)
    labels = kmeans_model.labels_
    # Cross-tabulate so each constituency gets a 1 under its cluster label.
    result = pd.crosstab(labels, dataset["constituency"])
    # Transpose to one row per constituency and write nclusters_<k>.csv.
    result.T.to_csv("nclusters_" + str(n_clusters) + '.csv')


cluster_demographics(2)
cluster_demographics(4)
cluster_demographics(6)
cluster_demographics(8)
```