Skip to content

Instantly share code, notes, and snippets.

@etiennebr
Last active July 23, 2024 23:44
Show Gist options
  • Save etiennebr/9515738 to your computer and use it in GitHub Desktop.
Save etiennebr/9515738 to your computer and use it in GitHub Desktop.
Transform raster or terra object to data.table
#' Transform raster to data.table
#'
#' @param x Raster* object
#' @param row.names `NULL` or a character vector giving the row names for the data frame. Missing values are not allowed
#' @param optional logical. If `TRUE`, setting row names and converting column names (to syntactic names: see make.names) is optional
#' @param xy logical. If `TRUE`, also return the spatial coordinates
#' @param centroids logical. If TRUE return the centroids instead of all spatial coordinates (only relevant if xy=TRUE)
#' @param sepNA logical. If TRUE the parts of the spatial objects are separated by lines that are NA (only if xy=TRUE and, for polygons, if centroids=FALSE
#' @param ... Additional arguments (none) passed to `raster::as.data.frame`
#'
#' @value returns a data.table object
#' @examples
#' logo <- brick(system.file("external/rlogo.grd", package="raster"))
#' v <- as.data.table(logo)
#' @import
as.data.table.raster <- function(x, row.names = NULL, optional = FALSE, xy=FALSE, inmem = canProcessInMemory(x, 2), ...) {
stopifnot(require("data.table"))
if(inmem) {
v <- as.data.table(as.data.frame(x, row.names=row.names, optional=optional, xy=xy, ...))
} else {
tr <- blockSize(x, n=2)
l <- lapply(1:tr$n, function(i)
as.data.table(as.data.frame(getValues(x,
row=tr$row[i],
nrows=tr$nrows[i]),
row.names=row.names, optional=optional, xy=xy, ...)))
v <- rbindlist(l)
}
coln <- names(x)
if(xy) coln <- c("x", "y", coln)
setnames(v, coln)
v
}
#' @param xy logical. If TRUE, the coordinates of each raster cell are included
#' @param cells logical. If TRUE, the cell numbers of each raster cell are included
#' @param na.rm logical. If TRUE, cells that have a NA value in at least one layer are removed
#' @param ... Additional arguments (none) passed to `terra::as.data.frame`
#' @value returns a data.table object
#' @examples
#' r <- rast(ncols=2, nrows=2)
#' values(r) <- 1:ncell(r)
#' as.data.table(r, xy = TRUE)
#' @importFrom terra as.data.frame
#' @importFrom data.table as.data.table
as.data.table.SpatRaster <- function(x, optional = FALSE, xy = FALSE, ...) {
stopifnot(require("data.table"))
v <- as.data.table(as.data.frame(x, optional = optional, xy = xy, ...))
coln <- names(x)
if(xy) coln <- c("x", "y", coln)
setnames(v, coln)
v
}
if (!isGeneric("as.data.table")) {
setGeneric("as.data.table", function(x, ...)
standardGeneric("as.data.table"))
}
setMethod('as.data.table', signature(x='data.frame'), data.table::as.data.table)
# make sure you have terra or raster loaded (as needed) before
setMethod('as.data.table', signature(x='Raster'), as.data.table.raster)
setMethod('as.data.table', signature(x='SpatRaster'), as.data.table.SpatRaster)
@philipshirk
Copy link

Great function! This reduces the processing time for counting the number of cells in my raster with a particular value from 175 seconds using raster::freq() down to 1.7 seconds!

@jacksonvoelkel
Copy link

Etienne, just dropping by to thank you for this great function!

@thiagoveloso
Copy link

Awesome function! I wonder why is it NOT shipped in the raster package?

@mxblsdl
Copy link

mxblsdl commented May 23, 2020

Recently came back to this function for an old workflow and keep running into memory allocation errors. Anyone else experience this? It is with large, but sparse rasters, lots of NA values over a large extent.

@etiennebr
Copy link
Author

I haven't used it in a while, so I can't say. If you can provide a reproducible example I could have a look.

@ptompalski
Copy link

@etiennebr fantastic tool!
Question - do you know if it is possible to adapt it to work with terra?
The terra package has terra::as.data.frame() function similar to raster::as.data.frame(), but the parameters are different. Perhaps you have looked into this already?

@etiennebr
Copy link
Author

Thanks @ptompalski! I haven't worked on this, but you're right it would be useful.

@thiagoveloso
Copy link

Maybe it's useful to take a look at Robert's feedback to an old request of mine here: rspatial/raster#55

@etiennebr
Copy link
Author

Thanks @thiagoveloso, @ptompalski I updated the gist. Let me know if it works for you!

@mikeshewring
Copy link

Updated to terra

as.data.table.raster <- function(x, row.names = NULL, optional = FALSE, xy=FALSE, inmem = terra::inMemory(x), ...) {
stopifnot(require("data.table"))
if(inmem) {
v <- as.data.table(as.data.frame(x, row.names=row.names, optional=optional, xy=xy, ...))
coln <- names(x)
if(xy) coln <- c("x", "y", coln)
setnames(v, coln)
} else {
tr <- blocks(x)
l <- lapply(1:tr$n, function(i) {
DT <- as.data.table(as.data.frame(terra::values(x, row = tr$row[i], nrows = tr$nrows[i]), ...))
if(xy == TRUE) {
cells <- terra::cellFromRowCol(x, c(tr$row[i], tr$row[i] + tr$nrows[i] - 1), c(1, ncol(x)))
coords <- terra::xyFromCell(x, cell = cells[1]:cells[2])
DT[, c("x", "y") := data.frame(terra::xyFromCell(x, cell = cells[1]:cells[2]))]
}
DT
})
v <- rbindlist(l)
coln <- names(x)
if(xy) {
coln <- c("x", "y", coln)
setcolorder(v, coln)
}
}
v
}

@etiennebr
Copy link
Author

Hi Mike, thanks for sharing! The current gist works for me with terra.

# after sourcing the gist
terra::rast(matrix(1, 2, 2)) %>% as.data.table(xy = TRUE)
#> x   y lyr.1
#> 1: 0.5 1.5     1
#> 2: 1.5 1.5     1
#> 3: 0.5 0.5     1
#> 4: 1.5 0.5     1

But maybe there are some edge cases that are not supported. Let me know if there anything missing from the actual gist that didn't work for you on terra.

@tylerhoecker
Copy link

tylerhoecker commented Nov 8, 2023

Thanks @etiennebr for this super useful function! I used it successfully with terra. I did encounter an error when applying this function over a list of SpatRasters using purrr::map, but not when using lapply. A non-repro example:

# WORKS
dt_list <- lapply(spatrast_list, FUN = as.data.table)

# ERROR
dt_list <- spatrast_list %>% 
  map(., as.data.table())

# Error message: Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function ‘as.data.table’ for signature ‘"missing"’

@etiennebr
Copy link
Author

Thanks @tylerhoecker, it seems that the as.data.table function should be called the same way than for lapply. Could you try:

dt_list <- spatrast_list %>% 
  map(as.data.table)
# or
dt_list <- map(spatrast_list, as.data.table)

@stephstewart02
Copy link

Hi @etiennebr, I am a bit confused about why the main body of code doesn't incorporate a solution to the issue Idemaz identified. Is there a particular reason? I am trying to work with a package that relied on the main code, understandably because they probably thought that it functioned correctly. It is causing issues because I am working with a large amount of raster data, and want the x and y columns. Is the solution provided in the comments the best solution?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment