Julie Gravier 2022-09-29
Librairies required:
library(tidyverse) # or at least tidyr, dplyr and ggplot2 librairies
library(sf)
library(maptools)
library(spData)
We have two datasets in spData package:
- the current regions of New Zeland,
- the top 101 heighest points in New Zealand (2017).
The question is whether these highest points have had an impact on the delimitation of the country's regions. For this purpose, we study the distances between these highest points and the regional limits that we will compare with a random situation.
Initial regional dataset
spData::nz
#> Name Island Land_area Population Median_income Sex_ratio
#> 1 Northland North 12500.5611 175500 23400 0.9424532
#> 2 Auckland North 4941.5726 1657200 29600 0.9442858
#> 3 Waikato North 23900.0364 460100 27900 0.9520500
#> 4 Bay of Plenty North 12071.1447 299900 26200 0.9280391
#> 5 Gisborne North 8385.8266 48500 24400 0.9349734
#> 6 Hawke's Bay North 14137.5244 164000 26100 0.9238375
#> 7 Taranaki North 7254.4804 118000 29100 0.9569363
#> 8 Manawatu-Wanganui North 22220.6084 234500 25000 0.9387734
#> 9 Wellington North 8048.5528 513900 32700 0.9335524
#> 10 West Coast South 23245.4559 32400 26900 1.0139072
Tranforming polygons as lines
nz_lines <- st_cast(x = spData::nz, to = "MULTILINESTRING")
Lines dataset
Simple feature collection with 16 features and 6 fields
Geometry type: MULTILINESTRING
Dimension: XY
Bounding box: xmin: 1090144 ymin: 4748537 xmax: 2089533 ymax: 6191874
Projected CRS: NZGD2000 / New Zealand Transverse Mercator 2000
First 10 features:
Name Island Land_area Population Median_income Sex_ratio geom
1 Northland North 12500.561 175500 23400 0.9424532 MULTILINESTRING ((1745493 6...
2 Auckland North 4941.573 1657200 29600 0.9442858 MULTILINESTRING ((1803822 5...
3 Waikato North 23900.036 460100 27900 0.9520500 MULTILINESTRING ((1860345 5...
4 Bay of Plenty North 12071.145 299900 26200 0.9280391 MULTILINESTRING ((2049387 5...
5 Gisborne North 8385.827 48500 24400 0.9349734 MULTILINESTRING ((2024489 5...
6 Hawke's Bay North 14137.524 164000 26100 0.9238375 MULTILINESTRING ((2024489 5...
7 Taranaki North 7254.480 118000 29100 0.9569363 MULTILINESTRING ((1740438 5...
8 Manawatu-Wanganui North 22220.608 234500 25000 0.9387734 MULTILINESTRING ((1866732 5...
9 Wellington North 8048.553 513900 32700 0.9335524 MULTILINESTRING ((1881590 5...
10 West Coast South 23245.456 32400 26900 1.0139072 MULTILINESTRING ((1557042 5...
Point pattern dataset
spData::nz_height
Simple feature collection with 101 features and 6 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 1204143 ymin: 5048309 xmax: 1822492 ymax: 5650492
Projected CRS: NZGD2000 / New Zealand Transverse Mercator 2000
First 10 features:
t50_fid elevation nearest_line_id snap_dist X Y geometry
1 2353944 2723 13 15439.8492 1188706 5050278 POINT (1204143 5049971)
2 2354404 2820 10 14414.9990 1233682 5062686 POINT (1234725 5048309)
3 2354405 2830 10 14066.4000 1234896 5062775 POINT (1235915 5048745)
4 2369113 3033 10 2760.7402 1261779 5074752 POINT (1259702 5076570)
5 2362630 2749 10 13102.4487 1365475 5161736 POINT (1378170 5158491)
6 2362814 2822 10 11830.1136 1389413 5180579 POINT (1389460 5168749)
7 2362817 2778 10 11138.4179 1389413 5180579 POINT (1390166 5169466)
8 2363991 3004 10 1433.0696 1371424 5173816 POINT (1372357 5172729)
9 2363993 3114 10 856.0622 1371505 5173886 POINT (1372062 5173236)
10 2363994 2882 10 1204.0424 1372026 5174333 POINT (1372810 5173419)
Mapping datasets
plot(nz_lines$geom)
plot(nz_height$geom, col = "darkorange", add = TRUE)
Add ID to each region
nz_lines <- nz_lines %>%
rowid_to_column(.) # id column name: rowid
Create sp datasets for snapping points to nearest lines with snapPointsToLines function
# sp data
nz_height_sp <- as(object = nz_height, Class = "Spatial")
nz_lines_sp <- as(object = nz_lines, Class = "Spatial")
# snapping points to nearest lines
tibble_distances <- maptools::snapPointsToLines(points = nz_height_sp,
lines = nz_lines_sp,
maxDist = 40000, # here in meters
idField = "rowid") %>%
as_tibble(.)
Dataset of snapped points
tibble_distances
# A tibble: 101 × 4
nearest_line_id snap_dist X Y
<int> <dbl> <dbl> <dbl>
1 13 15440. 1188706. 5050278.
2 10 14415. 1233682. 5062686.
3 10 14066. 1234896. 5062775.
4 10 2761. 1261779. 5074752.
5 10 13102. 1365475. 5161736.
6 10 11830. 1389413. 5180579.
7 10 11138. 1389413. 5180579.
8 10 1433. 1371424. 5173816.
9 10 856. 1371505. 5173886.
10 10 1204. 1372026. 5174333.
# … with 91 more rows
Order of the dataset is the same of the initial point pattern. It is then possible to bind nz_height with this result.
nz_height <- nz_height %>%
bind_cols(tibble_distances)
nz_height
# sf object
# where X and Y are the coordinates of Pi snap points
# and geometry the coordinates of Pi
Simple feature collection with 101 features and 6 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 1204143 ymin: 5048309 xmax: 1822492 ymax: 5650492
Projected CRS: NZGD2000 / New Zealand Transverse Mercator 2000
First 10 features:
t50_fid elevation nearest_line_id snap_dist X Y geometry
1 2353944 2723 13 15439.8492 1188706 5050278 POINT (1204143 5049971)
2 2354404 2820 10 14414.9990 1233682 5062686 POINT (1234725 5048309)
3 2354405 2830 10 14066.4000 1234896 5062775 POINT (1235915 5048745)
4 2369113 3033 10 2760.7402 1261779 5074752 POINT (1259702 5076570)
5 2362630 2749 10 13102.4487 1365475 5161736 POINT (1378170 5158491)
6 2362814 2822 10 11830.1136 1389413 5180579 POINT (1389460 5168749)
7 2362817 2778 10 11138.4179 1389413 5180579 POINT (1390166 5169466)
8 2363991 3004 10 1433.0696 1371424 5173816 POINT (1372357 5172729)
9 2363993 3114 10 856.0622 1371505 5173886 POINT (1372062 5173236)
10 2363994 2882 10 1204.0424 1372026 5174333 POINT (1372810 5173419)
We can then plot the distribution of the distances between the heighest points of New Zeland and the borders of the regions.
nz_height %>%
ggplot(mapping = aes(x = snap_dist)) +
geom_density()
In a second step we can simulate N random point patterns, then calculate the distances between the laters and the borders of the regions. N always is chosen by the user.
First, we create of an empty data table that will be filled with the simulation results:
compilrandom <- tibble(nearest_line_id = numeric(),
snap_dist = numeric(),
X = numeric(),
Y = numeric(),
sim = numeric())
Then, we create a loop to simulate random point patterns in New Zeland and calculate the distances.
# zone in which random points are created
new_zeland <- nz %>%
st_union(.)
plot(new_zeland)
Loop
for (i in 1:10) { # N simulation = 10
random_points <- st_sample(x = new_zeland, # random points IN New Zeland zone
size = nrow(x = nz_height)) # size of the sample is the same as the init. point pattern
snapping <- maptools::snapPointsToLines(points = as(object = random_points, Class = "Spatial"),
lines = as(object = nz_lines, Class = "Spatial"),
maxDist = 40000, # in meters
idField = "rowid") %>%
as_tibble(x = .) %>%
mutate(sim = i) # add a column containing the simulation number
print(i) # to know where we are in processing
compilrandom <- compilrandom %>%
bind_rows(snapping)
}
Dataset result
compilrandom
# A tibble: 941 × 5
nearest_line_id snap_dist X Y sim
<dbl> <dbl> <dbl> <dbl> <dbl>
1 11 39383. 1332211. 5071253. 1
2 12 28305. 1292366. 4977682. 1
3 12 13868. 1302700. 4856972. 1
4 3 5636. 1856544. 5892553. 1
5 11 18025. 1395149. 5004766. 1
6 12 11757. 1223251. 5058076. 1
7 8 36948. 1775204. 5571056. 1
8 1 7827. 1667504. 6122581. 1
9 8 22649. 1868865. 5492924. 1
10 6 35363. 1864128. 5568156. 1
# … with 931 more rows
We can plot the distances results as before for observed points
compilrandom %>%
ggplot(mapping = aes(x = snap_dist)) +
geom_density()
Construction of a dataset containing both distances datasets.
distances_tibble <- compilrandom %>%
bind_rows(tibble_distances %>%
mutate(sim = NA) # creating a new column where observed distances = NA simulation
) %>%
mutate(type_of_distance = if_else(
condition = is.na(sim),
true = "observed",
false = "simulated" # new column to distinguished simulated and observed distances
))
Two types of representation for comparison
# first representation
distances_tibble %>%
ggplot(mapping = aes(x = snap_dist)) +
geom_density(mapping = aes(color = as.character(sim))) +
scale_x_continuous(name = "distances (in m.)", trans = "log10")
# second representation
distances_tibble %>%
ggplot(mapping = aes(x = snap_dist)) +
geom_density(mapping = aes(color = type_of_distance)) +
scale_x_continuous(name = "distances (in m.)", trans = "log10")
Observed and simulated distances between points and the borders of the regions of New Zeland are very different. Indeed, simulated distances are on average 14 kilometers long, while the observed ones are 3 kms long.
The highest points are elements of the landscape that have influenced the determination of the boundaries of New Zealand's regions, knowing of course that they are not the only factors in the construction of boundaries.