@mrdwab
Last active March 3, 2020 06:10
`reshape()` for "unbalanced" datasets.
uReshape <- function(data, id.vars, var.stubs, sep) {
  # Vectorized version of grep: one call per pattern, results kept as a list
  vGrep <- Vectorize(grep, "pattern", SIMPLIFY = FALSE)
  # Isolate the columns whose names match the var.stubs
  temp <- names(data)[names(data) %in% unlist(vGrep(var.stubs, names(data), value = TRUE))]
  # Split the names on `sep` (taken literally, as `reshape` does) and
  # reassemble the pieces into a data.frame
  x <- do.call(rbind.data.frame, strsplit(temp, split = sep, fixed = TRUE))
  names(x) <- c("VAR", paste(".time", seq_len(length(x) - 1), sep = "_"))
  # Prep to decide whether normal reshape or unbalanced reshape
  xS <- split(x$.time_1, x$VAR)   # times present for each stub
  xL <- unique(unlist(xS))        # all times seen across stubs
  if (isTRUE(all(sapply(xS, function(x) all(xL %in% x))))) {
    # Every stub has every time, so normal `reshape` will work
    reshape(data, direction = "long", idvar = id.vars,
            varying = lapply(vGrep(var.stubs, names(data), value = TRUE), sort),
            sep = sep, v.names = var.stubs)
  } else {
    # Padding required to "balance" the data
    # Find out which stub/time combinations are missing
    newVars <- unlist(lapply(names(xS), function(y) {
      missingTimes <- xL[!xL %in% xS[[y]]]
      if (length(missingTimes) == 0) {
        NULL
      } else {
        paste(y, missingTimes, sep = sep)
      }
    }))
    # Create a data.frame of NAs, one column per missing variable
    myMat <- setNames(data.frame(matrix(NA, nrow = nrow(data), ncol = length(newVars))), newVars)
    # Bind with original data.frame
    out <- cbind(data, myMat)
    # Use `reshape` as normal
    reshape(out, direction = "long", idvar = id.vars,
            varying = lapply(vGrep(var.stubs, names(out), value = TRUE), sort),
            sep = sep, v.names = var.stubs)
  }
}
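
As a rough illustration of the padding branch above, the following sketch uses a small made-up data frame (`df`, with hypothetical columns `x_1`, `x_2` and `y_1`) in which the `y` stub has no second time point. It adds the missing `y_2` column as `NA` by hand and then calls base `reshape()`, which is the step `uReshape()` is meant to automate. It does not call `uReshape()` itself.

```r
# Hypothetical unbalanced data: "x" has times 1 and 2, "y" only has time 1
df <- data.frame(id = 1:2,
                 x_1 = c(10, 20),
                 x_2 = c(11, 21),
                 y_1 = c("a", "b"),
                 stringsAsFactors = FALSE)

# Pad the missing stub/time combination with NA so every stub has the same times
padded <- cbind(df, y_2 = NA)

# With balanced columns, base reshape() can go from wide to long
long <- reshape(padded, direction = "long", idvar = "id",
                varying = list(c("x_1", "x_2"), c("y_1", "y_2")),
                v.names = c("x", "y"))

long
```

The result should have four rows (two ids by two times), with `y` set to `NA` wherever `time` is 2.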
@corynissen

Thanks man, just what I needed.

@simrinm

simrinm commented Mar 14, 2017

This is great. How could I modify this code to convert unbalanced long data wide?
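
One possible direction for this question, offered as a sketch rather than a modification of `uReshape()`: base `reshape()` with `direction = "wide"` generally tolerates unbalanced long data on its own, filling id/time combinations that are absent with `NA`. The data frame and column names below are made up for illustration.

```r
# Hypothetical unbalanced long data: id 2 has no observation at time 2
long <- data.frame(id = c(1, 1, 2),
                   time = c(1, 2, 1),
                   x = c(10, 11, 20))

# Spread `x` into one column per time point; missing combinations become NA
wide <- reshape(long, direction = "wide", idvar = "id",
                timevar = "time", sep = "_")

wide
```

Here `wide` should contain the columns `id`, `x_1` and `x_2`, with `x_2` missing for id 2.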
