@kjhealy
Last active March 24, 2021 15:19
# UK map v. quick example
## Libraries
library(tidyverse)
library(sf)
#> Linking to GEOS 3.8.1, GDAL 3.1.4, PROJ 6.3.1
## Get the map data
## An authoritative source for UK map files is the [ONS Open Geography Portal](https://geoportal.statistics.gov.uk).
# https://geoportal.statistics.gov.uk/search?collection=Dataset&sort=name&tags=all(BDY_LAD)
# Look under the `Data` tab for the link to the GeoJSON file. Here we read a boundary file of UK
# local authority districts (LADs) directly from its URL:
uk_lads <- read_sf("https://opendata.arcgis.com/datasets/69cd46d7d2664e02b30c2f8dcc2bfaf7_0.geojson")
uk_lads
#> Simple feature collection with 382 features and 10 fields
#> geometry type: MULTIPOLYGON
#> dimension: XY
#> bbox: xmin: -8.649996 ymin: 49.88234 xmax: 1.763571 ymax: 60.84575
#> geographic CRS: WGS 84
#> # A tibble: 382 x 11
#>    OBJECTID LAD19CD  LAD19NM     LAD19NMW  BNG_E  BNG_N   LONG   LAT Shape__Area
#>       <int> <chr>    <chr>       <chr>     <int>  <int>  <dbl> <dbl>       <dbl>
#>  1        1 E060000… Hartlepool  " "      447157 531476 -1.27   54.7   96512311.
#>  2        2 E060000… Middlesbro… " "      451141 516887 -1.21   54.5   55229150.
#>  3        3 E060000… Redcar and… " "      464359 519597 -1.01   54.6  248409004.
#>  4        4 E060000… Stockton-o… " "      444937 518183 -1.31   54.6  205231500.
#>  5        5 E060000… Darlington  " "      428029 515648 -1.57   54.5  198812771.
#>  6        6 E060000… Halton      " "      354246 382146 -2.69   53.3   82869023.
#>  7        7 E060000… Warrington  " "      362744 388456 -2.56   53.4  178742907.
#>  8        8 E060000… Blackburn … " "      369490 422806 -2.46   53.7  139386373.
#>  9        9 E060000… Blackpool   " "      332763 436633 -3.02   53.8   33677587.
#> 10       10 E060000… Kingston u… " "      511894 431716 -0.304  53.8   70882300.
#> # … with 372 more rows, and 2 more variables: Shape__Length <dbl>,
#> #   geometry <MULTIPOLYGON [°]>
## Get some sample COVID data from https://coronavirus.data.gov.uk
# https://api.coronavirus.data.gov.uk/v2/data?areaType=utla&metric=cumDeaths60DaysByDeathDate&format=csv
covid <- read_csv("https://api.coronavirus.data.gov.uk/v2/data?areaType=utla&metric=cumDeaths60DaysByDeathDate&format=csv")
#> ── Column specification ────────────────────────────────────────────────────────
#> cols(
#> date = col_date(format = ""),
#> areaType = col_character(),
#> areaCode = col_character(),
#> areaName = col_character(),
#> cumDeaths60DaysByDeathDate = col_double()
#> )
covid
#> # A tibble: 55,239 x 5
#>    date       areaType areaCode  areaName   cumDeaths60DaysByDeathDate
#>    <date>     <chr>    <chr>     <chr>                           <dbl>
#>  1 2021-03-22 utla     E10000007 Derbyshire                       2084
#>  2 2021-03-21 utla     E10000007 Derbyshire                       2084
#>  3 2021-03-20 utla     E10000007 Derbyshire                       2083
#>  4 2021-03-19 utla     E10000007 Derbyshire                       2080
#>  5 2021-03-18 utla     E10000007 Derbyshire                       2078
#>  6 2021-03-17 utla     E10000007 Derbyshire                       2075
#>  7 2021-03-16 utla     E10000007 Derbyshire                       2074
#>  8 2021-03-15 utla     E10000007 Derbyshire                       2072
#>  9 2021-03-14 utla     E10000007 Derbyshire                       2068
#> 10 2021-03-22 utla     E08000002 Bury                              587
#> # … with 55,229 more rows
### Rename `areaCode` to `LAD19CD` so that the data can be joined to the map file in a moment.
### Then filter the table down to a single day, i.e. just one observation per area.
covid_sm <- covid %>%
  rename(LAD19CD = areaCode) %>%
  filter(date == "2021-03-17")
covid_sm
#> # A tibble: 149 x 5
#>    date       areaType LAD19CD   areaName        cumDeaths60DaysByDeathDate
#>    <date>     <chr>    <chr>     <chr>                                <dbl>
#>  1 2021-03-17 utla     E10000007 Derbyshire                            2075
#>  2 2021-03-17 utla     E08000002 Bury                                   585
#>  3 2021-03-17 utla     E10000019 Lincolnshire                          1865
#>  4 2021-03-17 utla     E08000016 Barnsley                               903
#>  5 2021-03-17 utla     E08000008 Tameside                               757
#>  6 2021-03-17 utla     E08000019 Sheffield                             1288
#>  7 2021-03-17 utla     E06000030 Swindon                                303
#>  8 2021-03-17 utla     E06000033 Southend-on-Sea                        694
#>  9 2021-03-17 utla     E10000014 Hampshire                             2695
#> 10 2021-03-17 utla     E10000031 Warwickshire                          1230
#> # … with 139 more rows
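### A quick way to check the "one observation per area" claim is to count rows per code after
### filtering. This is only a sketch: `toy_covid` and its `deaths` column are made-up stand-ins
### for the real download, so it runs without network access.

```r
library(dplyr)

# Toy stand-in for `covid`: two areas, two dates each (made-up values).
toy_covid <- tibble(
  date     = as.Date(c("2021-03-17", "2021-03-16", "2021-03-17", "2021-03-16")),
  areaCode = c("A1", "A1", "B2", "B2"),
  deaths   = c(10, 9, 20, 18)
)

# Same steps as above: rename the key, keep a single day.
toy_sm <- toy_covid %>%
  rename(LAD19CD = areaCode) %>%
  filter(date == as.Date("2021-03-17"))

# Any code appearing more than once would duplicate map rows in a join.
dupes <- toy_sm %>%
  count(LAD19CD) %>%
  filter(n > 1)

# TRUE when each code appears exactly once.
nrow(dupes) == 0
```

### Running the same `count()` check on `covid_sm` should confirm it is safe to join.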
### There are only 149 unique places in this data, because it is reported at the UTLA level
### (`areaType` is `utla`), while the map has 382 LADs. So a bunch of LADs will be missing from the map.
### We merge this with our map data:
uk_lads_covid <- uk_lads %>%
  left_join(covid_sm, by = "LAD19CD")
# looks OK
uk_lads_covid %>%
  select(LAD19CD, LAD19NM, geometry, cumDeaths60DaysByDeathDate)
#> Simple feature collection with 382 features and 3 fields
#> geometry type: MULTIPOLYGON
#> dimension: XY
#> bbox: xmin: -8.649996 ymin: 49.88234 xmax: 1.763571 ymax: 60.84575
#> geographic CRS: WGS 84
#> # A tibble: 382 x 4
#>    LAD19CD  LAD19NM                                  geometry cumDeaths60DaysBy…
#>    <chr>    <chr>                          <MULTIPOLYGON [°]>              <dbl>
#>  1 E060000… Hartlepool   (((-1.177633 54.69919, -1.173981 54…                287
#>  2 E060000… Middlesbrou… (((-1.282626 54.56528, -1.262559 54…                393
#>  3 E060000… Redcar and … (((-1.149131 54.61433, -1.154624 54…                328
#>  4 E060000… Stockton-on… (((-1.282626 54.56528, -1.270612 54…                512
#>  5 E060000… Darlington   (((-1.696926 54.53601, -1.705274 54…                296
#>  6 E060000… Halton       (((-2.674641 53.35366, -2.630622 53…                304
#>  7 E060000… Warrington   (((-2.576743 53.44606, -2.57039 53.…                548
#>  8 E060000… Blackburn w… (((-2.551298 53.75639, -2.465808 53…                455
#>  9 E060000… Blackpool    (((-3.04795 53.87573, -3.01975 53.8…                515
#> 10 E060000… Kingston up… (((-0.2414035 53.75491, -0.2516817 …                714
#> # … with 372 more rows
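### The first ten rows look fine, but it is worth counting how many map units found no match.
### A minimal sketch with made-up tables: `toy_map` plays the role of `uk_lads`, `toy_data` the
### role of `covid_sm`, and the codes, names, and `deaths` column are all hypothetical.

```r
library(dplyr)

# Toy stand-ins (made-up codes and names).
toy_map  <- tibble(LAD19CD = c("E1", "E2", "E3"),
                   LAD19NM = c("Aton", "Bford", "Cwick"))
toy_data <- tibble(LAD19CD = c("E1", "E9"), deaths = c(5, 7))

# Map units with no matching row in the data (these will be NA on the map).
unmatched_map <- anti_join(toy_map, toy_data, by = "LAD19CD")

# Data rows whose code does not appear in the map at all.
unmatched_data <- anti_join(toy_data, toy_map, by = "LAD19CD")

# The same count, recovered after the join from the NA values.
joined    <- left_join(toy_map, toy_data, by = "LAD19CD")
n_missing <- sum(is.na(joined$deaths))
```

### On the real data the equivalent check would be `sum(is.na(uk_lads_covid$cumDeaths60DaysByDeathDate))`.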
### Now we can draw a rough and ready map
uk_lads_covid %>%
  ggplot(mapping = aes(fill = cumDeaths60DaysByDeathDate)) +
  geom_sf() +
  labs(fill = "60-Day COVID") +
  theme_void() +
  theme(legend.position = "top")
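### Unmatched areas have NA for the fill variable. One option, sketched here on a made-up
### two-polygon map (the `sq()` helper and the `deaths` column are hypothetical), is to set the
### NA colour explicitly with the `na.value` argument of a ggplot2 fill scale.

```r
library(ggplot2)
library(sf)

# Helper: a unit square whose lower-left corner is at (x0, 0).
sq <- function(x0) {
  st_polygon(list(rbind(c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1), c(x0, 1), c(x0, 0))))
}

# Two made-up squares standing in for local authority polygons.
toy <- st_sf(
  deaths   = c(12, NA),  # NA mimics a LAD with no matching data row
  geometry = st_sfc(sq(0), sq(1), crs = 4326)
)

p <- ggplot(toy, mapping = aes(fill = deaths)) +
  geom_sf() +
  scale_fill_viridis_c(na.value = "grey90") +  # unmatched areas in light grey
  labs(fill = "60-Day COVID") +
  theme_void() +
  theme(legend.position = "top")
```

### Printing `p` draws the map; adding the same `scale_fill_viridis_c()` line to the real plot
### should have the same effect.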
## The basic sequence will always be the same:
## 1. Find the boundaries you want and get the map file for them. Import it with read_sf().
## 2. Find the data you want. Make sure it is at the same level of observation as your map.
## 3. Make sure there is a column you can merge/join on, usually the unique official id of the spatial unit.
## 4. Join the tables. Usually this will be a left_join(), with the map data on the left and the data of interest joined to it.
## 5. Now you have a table with spatial data and your measures of interest as columns.
## 6. Draw your map with the `fill` aesthetic assigned to your statistic of interest.
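## The steps above can be sketched end to end on toy data. Everything here is made up for
## illustration: a two-square "map" stands in for a file read with read_sf(), and the names
## `code`, `area_id`, `value`, and `sq()` are hypothetical.

```r
library(dplyr)
library(sf)
library(ggplot2)

# Step 1 stand-in: build a tiny sf object instead of reading a boundary file.
sq <- function(x0) {
  st_polygon(list(rbind(c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1), c(x0, 1), c(x0, 0))))
}
toy_map <- st_sf(
  code     = c("X1", "X2"),
  geometry = st_sfc(sq(0), sq(1), crs = 4326)
)

# Step 2 stand-in: one row per spatial unit, keyed under a different column name.
toy_data <- tibble(area_id = c("X1", "X2"), value = c(3, 8))

# Steps 3-5: align the key name, then join with the map on the left so the
# result keeps its sf class and geometry column.
toy_joined <- toy_map %>%
  left_join(rename(toy_data, code = area_id), by = "code")

# Step 6: fill by the statistic of interest.
p <- ggplot(toy_joined, mapping = aes(fill = value)) +
  geom_sf() +
  theme_void()
```

## Keeping the map on the left of the join matters: the joined table should still be an sf
## object, which is what geom_sf() expects.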