library(sf) | |
library(dplyr) | |
library(mapview) | |
library(mapedit) | |
qk_mx <- data.matrix(quakes[,2:1]) | |
qk_mp <- st_multipoint(qk_mx) | |
qk_sf <- st_sf(st_cast(st_sfc(qk_mp), "POINT"), quakes, crs=4326) | |
grd <- st_sf(geom=st_make_grid(qk_sf), crs=4326) | |
grd_mut <- grd %>% | |
# add id | |
mutate(id = 1:n()) %>% | |
# calculate | |
# index of quakes contained in grid rect | |
# number of quakes in grid rect | |
# mean mag of quakes in grid rect | |
mutate( | |
qk_contained = lapply(st_contains(st_sf(geom), qk_sf), identity), | |
n_quakes = sapply(qk_contained,length), | |
mean_mag = sapply( | |
qk_contained, | |
function(x) { | |
mean(qk_sf[x,]$mag) | |
} | |
) | |
) #%>% | |
# make n_quakes NA if 0 | |
#mutate( | |
# n_quakes = ifelse(n_quakes==0, NA, n_quakes), | |
# mean_mag = ifelse(is.nan(mean_mag), 0, mean_mag) | |
#) | |
mapview(grd_mut, zcol="n_quakes", na.color="gray", legend=TRUE) | |
# as edzer highlights there is only na.color and not na.alpha | |
mapview(grd_mut, zcol="mean_mag", na.color="white", legend=TRUE) |
timelyportfolio
commented
Jun 6, 2017
•
@timelyportfolio this is great to see the dplyr flow applied to such a complex task with sf! I can see a pretty specific optimization though, if we leverage raster directly. (Whether that makes sense will always depend on the component parts of the workflow - this point-in-cell task is something we do a lot - your approach would already work for arbitrary polygon regions, and mine would not).
Taking things a little closer to the metal, we can directly index between the raster cells and the features. I use spex as well for some convenience with rounded alignment when creating the raster, and a quadmesh method for creating the feature-per-pixel sf object which is a bit faster than other methods I think.
A rep of 5000 gives us Tim's desired 5e6 points without data.table. I just shamelessly repeat the points with a jitter to get a lot of variability.
library(raster)
library(dplyr)
library(spex) ## devtools::install_github("mdsumner/spex")
repeater <- rep(seq_len(nrow(quakes)), 5000)
qk_mx <- jitter(as.matrix(quakes[,2:1])[repeater, ], 135)
#qk_mx <- data.matrix(quakes[,2:1])
hres_ras <- raster(spex::buffer_extent(extent(qk_mx), 0.5), res = 0.5, crs = "+init=epsg:4326")
library(tibble)
cells <- tibble(cell_ = cellFromXY(hres_ras, qk_mx)) %>%
mutate(mag = quakes$mag[repeater]) %>%
group_by(cell_) %>%
summarize(n_quakes = n(), mean_mag = mean(mag))
## raster to polygon (one poly per pixel via indexed quadmesh)
library(sf)
poly <- spex::qm_rasterToPolygons(hres_ras)
poly$n_quakes <- poly$mean_mag <- NA_real_
poly$layer <- NULL
poly$n_quakes[cells$cell_] <- cells$n_quakes
poly$mean_mag[cells$cell_] <- cells$mean_mag
library(mapview)
mapview(poly %>% filter(!is.na(n_quakes)), zcol="n_quakes", na.color="gray")
Notes:
- data.matrix is way too slow to take a data.frame to a matrix for this simple situation
- I can take this to 0.1 resolution raster and it still performs fast enough, and mapview kicks into a new mode but works well!
- I didn't put thought into the join, we could pipe this nice from the start I think
very nice @mdsumner!!! I knew raster
would be faster, but did not know how to implement. Thanks!
The mapview addLargeFeatures
mode does not work properly in this case as the png state is cropped at the dateline. A long standing issue with leaflet/mapview. In fact, the very first reported issue on mapview is still open :-( r-spatial/mapview#6