Skip to content

Instantly share code, notes, and snippets.

@mdsumner
Last active July 13, 2017 12:27
Show Gist options
  • Star 1 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save mdsumner/c9bbe897fafcba9ddaa00d7554d0edff to your computer and use it in GitHub Desktop.
Save mdsumner/c9bbe897fafcba9ddaa00d7554d0edff to your computer and use it in GitHub Desktop.
#' Melt array
#'
#' Long form version of an array in a data frame with explict row, col, etc. dimension
#' index identifiers.
#'
#' Degenerate dimensions are 'dropped' by default, which means there won't be an
#' explicit column recording their presence. If `drop = FALSE` the column will be present
#' (filled with `1`).
#'
#' The names used for dimensions are taken from `c("row", "col", letters)`. If you have more
#' than 28 dimensions then I will accept patches.
#'
#' Originally implemented for a NetCDF project: https://github.com/hypertidy/tidync/blob/select-idiom/R/hyper_tibble.R#L58
#'
#' @param a array or matrix
#' @param drop
#'
#' @return data frame, `tbl_df` variant
#' @export
#' @importFrom tibble tibble
#' @examples
#' #reshape2::melt(volcano) %>% as_tibble() %>%
#' #ggplot(aes(x = Var1, y = Var2, fill = value)) + geom_raster()
#' arr_melt(matrix(1:10), drop = TRUE)
#' arr_melt(array(c(volcano, volcano * 2), c(dim(volcano), 3)))
#' arr_melt(volcano[,1])
#' arr_melt(volcano[,1, drop = FALSE])
#' #arr_melt(volcano) %>% ggplot(aes(x = row, y = col, fill = data)) + geom_raster()
arr_melt <- function(a, drop = FALSE) {
axes <- lapply(dim(a), function(n) seq_len(n))
if (drop) {
ones <- unlist(lapply(axes, length)) == 1L
if (sum(ones) < 1) stop("no non-degenerate dimensions, use drop = FALSE")
axes <- axes[!ones]
}
total_prod <- prod(dim(a))
nm <- c("row", "col", letters)[seq_along(axes)]
tib <- tibble::tibble(data = as.vector(a))
prod_dims <- 1
for (i in seq_along(axes)) {
nd <- dim(a)[i]
tib[[nm[i]]] <- rep(axes[[i]], each = prod_dims, length.out = total_prod)
prod_dims <- prod_dims * nd
}
tib
}
@mdsumner
Copy link
Author

mdsumner commented Jul 13, 2017

Untested variant with na.rm capability

arr_melt <- function(a, drop = FALSE, na.rm = FALSE) {
  axes <- lapply(dim(a), function(n) seq_len(n))
  if (drop) {
    ones <- unlist(lapply(axes, length)) == 1L
    if (sum(ones) < 1) stop("no non-degenerate dimensions, use drop = FALSE")
    axes <- axes[!ones]
  }
  
  total_prod <- prod(dim(a))
  nm <- c("row", "col", letters)[seq_along(axes)]

  v <- as.vector(a)
  keep <- rep(TRUE, length(v))
  if (na.rm) keep <- !is.na(v)
  tib <- tibble::tibble(data = v)
  prod_dims <- 1
  for (i in seq_along(axes)) {
    nd <- dim(a)[i]
    tib[[nm[i]]] <- rep(axes[[i]], each = prod_dims, length.out = total_prod)[keep]
    prod_dims <- prod_dims * nd
  }
  tib
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment