@mrdwab
Last active March 8, 2018 17:07
Fast trimming of leading and trailing spaces for lists and character vectors. Only the space character is stripped (not tabs or newlines). Preserves `NA` values.
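trim_list <- function(x, relist = TRUE, convert = FALSE) {
  # Replace empty elements with NA so each one keeps a slot after unlist().
  x <- replace(x, lengths(x) == 0, NA_character_)
  y <- unlist(x, use.names = FALSE)
  # Treat empty strings as missing.
  y[!nzchar(y)] <- NA_character_
  out <- trim_vec(y, TRUE)
  # Shortcut: nothing needed trimming, no conversion, and a list is wanted.
  # identical() is used because the "test" attribute is absent (NULL) whenever
  # trim_vec() actually trimmed something.
  if (identical(attr(out, "test"), "clean") && !isTRUE(convert) && isTRUE(relist)) {
    return(x)
  }
  if (isTRUE(convert)) out <- type.convert(out, as.is = TRUE)
  if (isTRUE(relist)) {
    # Rebuild the list structure from the flattened vector.
    out <- split(out, factor(rep.int(seq_along(x), lengths(x))))
    if (is.null(names(x))) unname(out) else `names<-`(out, names(x))
  } else {
    # Drop the "test" marker and any other attributes.
    `attributes<-`(out, NULL)
  }
}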
trim_vec <- function(vec, attr = FALSE) {
  if (!is.character(vec)) stop("This function is for character vectors only")
  # Flag elements with a leading or trailing space; NA inputs give NA flags.
  sw <- startsWith(vec, " ")
  ew <- endsWith(vec, " ")
  if (!any(sw, na.rm = TRUE) && !any(ew, na.rm = TRUE)) {
    # Nothing to trim; optionally mark the result as "clean" for the caller.
    if (isTRUE(attr)) `attr<-`(vec, "test", "clean") else vec
  } else {
    # Run the regex only on the elements that need it; which() drops NAs.
    if (any(sw, na.rm = TRUE)) {
      vec[which(sw)] <- sub("^ +", "", vec[which(sw)])
    }
    if (any(ew, na.rm = TRUE)) {
      vec[which(ew)] <- sub(" +$", "", vec[which(ew)])
    }
    vec
  }
}
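As a rough illustration of the behaviour described at the top (spaces stripped from both ends, `NA` kept as `NA`), the sketch below uses a two-call `sub()` helper as a reference. `ref_trim` is a hypothetical name introduced here for comparison only and is not part of the gist. The final check runs only if `trim_vec()` has already been sourced.

```r
# Hypothetical reference helper (not part of the gist): strips leading and
# trailing runs of spaces with two base-R sub() calls on every element.
ref_trim <- function(v) sub(" +$", "", sub("^ +", "", v))

x <- c("  a", "b  ", " c d ", NA, "e", "\tf")
res <- ref_trim(x)

# Interior spaces are kept, NA stays NA, and the leading tab in "\tf" is
# left alone because only the space character is matched.
print(res)

# If trim_vec() from above is available, it should agree with the reference.
if (exists("trim_vec", mode = "function")) {
  stopifnot(identical(trim_vec(x), res))
}
```

The difference from the reference helper is in how much work is done: `trim_vec()` first checks `startsWith()` and `endsWith()` and only passes the flagged elements to `sub()`, which is presumably where the "fast" in the description comes from.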