Skip to content

Instantly share code, notes, and snippets.

@etiennebr
Last active July 23, 2024 23:44
Show Gist options
  • Save etiennebr/9515738 to your computer and use it in GitHub Desktop.
Save etiennebr/9515738 to your computer and use it in GitHub Desktop.
Transform raster or terra object to data.table
#' Transform raster to data.table
#'
#' @param x Raster* object
#' @param row.names `NULL` or a character vector giving the row names for the data frame. Missing values are not allowed
#' @param optional logical. If `TRUE`, setting row names and converting column names (to syntactic names: see make.names) is optional
#' @param xy logical. If `TRUE`, also return the spatial coordinates
#' @param centroids logical. If TRUE return the centroids instead of all spatial coordinates (only relevant if xy=TRUE)
#' @param sepNA logical. If TRUE the parts of the spatial objects are separated by lines that are NA (only if xy=TRUE and, for polygons, if centroids=FALSE
#' @param ... Additional arguments (none) passed to `raster::as.data.frame`
#'
#' @value returns a data.table object
#' @examples
#' logo <- brick(system.file("external/rlogo.grd", package="raster"))
#' v <- as.data.table(logo)
#' @import
as.data.table.raster <- function(x, row.names = NULL, optional = FALSE, xy=FALSE, inmem = canProcessInMemory(x, 2), ...) {
stopifnot(require("data.table"))
if(inmem) {
v <- as.data.table(as.data.frame(x, row.names=row.names, optional=optional, xy=xy, ...))
} else {
tr <- blockSize(x, n=2)
l <- lapply(1:tr$n, function(i)
as.data.table(as.data.frame(getValues(x,
row=tr$row[i],
nrows=tr$nrows[i]),
row.names=row.names, optional=optional, xy=xy, ...)))
v <- rbindlist(l)
}
coln <- names(x)
if(xy) coln <- c("x", "y", coln)
setnames(v, coln)
v
}
#' @param xy logical. If TRUE, the coordinates of each raster cell are included
#' @param cells logical. If TRUE, the cell numbers of each raster cell are included
#' @param na.rm logical. If TRUE, cells that have a NA value in at least one layer are removed
#' @param ... Additional arguments (none) passed to `terra::as.data.frame`
#' @value returns a data.table object
#' @examples
#' r <- rast(ncols=2, nrows=2)
#' values(r) <- 1:ncell(r)
#' as.data.table(r, xy = TRUE)
#' @importFrom terra as.data.frame
#' @importFrom data.table as.data.table
as.data.table.SpatRaster <- function(x, optional = FALSE, xy = FALSE, ...) {
stopifnot(require("data.table"))
v <- as.data.table(as.data.frame(x, optional = optional, xy = xy, ...))
coln <- names(x)
if(xy) coln <- c("x", "y", coln)
setnames(v, coln)
v
}
if (!isGeneric("as.data.table")) {
setGeneric("as.data.table", function(x, ...)
standardGeneric("as.data.table"))
}
setMethod('as.data.table', signature(x='data.frame'), data.table::as.data.table)
# make sure you have terra or raster loaded (as needed) before
setMethod('as.data.table', signature(x='Raster'), as.data.table.raster)
setMethod('as.data.table', signature(x='SpatRaster'), as.data.table.SpatRaster)
@tylerhoecker
Copy link

tylerhoecker commented Nov 8, 2023

Thanks @etiennebr for this super useful function! I used it successfully with terra. I did encounter an error when applying this function over a list of SpatRasters using purrr::map, but not when using lapply. A non-repro example:

# WORKS
dt_list <- lapply(spatrast_list, FUN = as.data.table)

# ERROR
dt_list <- spatrast_list %>% 
  map(., as.data.table())

# Error message: Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function ‘as.data.table’ for signature ‘"missing"’

@etiennebr
Copy link
Author

Thanks @tylerhoecker, it seems that the as.data.table function should be called the same way than for lapply. Could you try:

dt_list <- spatrast_list %>% 
  map(as.data.table)
# or
dt_list <- map(spatrast_list, as.data.table)

@stephstewart02
Copy link

Hi @etiennebr, I am a bit confused about why the main body of code doesn't incorporate a solution to the issue Idemaz identified. Is there a particular reason? I am trying to work with a package that relied on the main code, understandably because they probably thought that it functioned correctly. It is causing issues because I am working with a large amount of raster data, and want the x and y columns. Is the solution provided in the comments the best solution?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment