Using clustering to group Scotland's constituencies
A few days ago I briefly got involved in The Bureau Local's Voterpower Hack. Participants were provided with demographic and political data sourced from the Office for National Statistics, the Cabinet Office and the British Election Study.
To get started, I decided to play with some Python machine learning libraries. This was partly a learning exercise, but I also wondered whether I could surface any insights.
The first thing I did was run a k-means clustering algorithm on the constituency demographic & voting data.
You can see the results and the code below. In k-means clustering you tell the algorithm how many groups you want it to produce. Here I've provided results for 2, 4 and 6 groups. A '1' against a constituency means that the constituency belongs to the group in that column.
Comparing the results across the different numbers of groups can be revealing. In most of the results you can see the highland and island constituencies end up in the same group. Split into just 2 groups, though, and you'll see Glasgow North grouped with the more rural constituencies. This is a funny one, but if you inspect the data you'll see that its 'usual_residents' figure is low for a city, possibly the result of a high student population.
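For anyone who wants to see the idea in miniature, here is a rough, dependency-free sketch of k-means and of the 0/1 membership columns described above. It is not the code from the hack: the constituency names and figures are invented placeholders, the helper names (standardise, kmeans, membership_table) are hypothetical, and scaling the features before clustering is an assumption of this sketch rather than something the original analysis is known to have done. In practice a library implementation such as scikit-learn's KMeans would do the same job.

```python
import random
from statistics import mean, pstdev


def standardise(rows):
    """Z-score each column so no single feature dominates the distances."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [tuple((v - m) / s for v, m, s in zip(row, mus, sds)) for row in rows]


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iter=100, seed=0):
    """Lloyd's algorithm: return a cluster label (0..k-1) for each point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k of the data points
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # nothing moved, so we have converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its old centroid
                centroids[c] = tuple(mean(col) for col in zip(*members))
    return labels


def membership_table(names, labels, k):
    """One row per constituency, with a 1 in the column of its group."""
    return {
        name: [1 if lab == c else 0 for c in range(k)]
        for name, lab in zip(names, labels)
    }


# Invented illustrative data: (usual_residents, median_age) per constituency.
names = ["Constituency A", "Constituency B", "Constituency C",
         "Constituency D", "Constituency E", "Constituency F"]
features = [(60000, 48), (62000, 47), (61000, 49),
            (95000, 31), (98000, 30), (97000, 33)]

scaled = standardise(features)
for k in (2, 4, 6):
    labels = kmeans(scaled, k)
    print(f"k = {k}")
    for name, row in membership_table(names, labels, k).items():
        print(f"  {name}: {row}")
```

The group numbers themselves are arbitrary; what matters is which constituencies share a column. Because the starting centroids are picked at random, a different seed can give different groupings on real data.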
Have a look and see if you can spot any other oddities. With a little more time it might be nice to visualise this on a map with a neat cluster count slider control.
Edit: I just noticed that in the 6 clusters sheet Glasgow Central appears in a group of its own. I'm not sure why, but it could be a result of Glasgow Central having the lowest median age (30) of all constituencies in Scotland.