@randrescastaneda
Created September 12, 2019 14:41
Calculates the size of directories. Displays the directory tree and returns the complete data frame.
# Requires the data.table and data.tree packages.
# path:   directory to scan
# limit:  maximum number of tree nodes to print
# sharet: share threshold; only directories whose size exceeds this
#         fraction of the largest directory are shown in the tree
dir_size <- function(path = ".",
                     limit = 25,
                     sharet = .1) {
  # List every file under path, including hidden ones
  a <- list.files(path,
                  all.files = TRUE,
                  recursive = TRUE,
                  full.names = TRUE)
  # Get file info (size in bytes, among other fields)
  b <- file.info(a)
  # Convert to data.table, keeping the file paths in column "name"
  data.table::setDT(b, keep.rownames = "name")
  # Get each file's directory and replace the original path for brevity;
  # fixed = TRUE stops path from being interpreted as a regular expression
  b$dir <- dirname(b$name)
  b$dir <- sub(path, "OrigPath", b$dir, fixed = TRUE)
  # Sum directory sizes, sort largest first, and drop missing sizes
  c <- b[, .(size = sum(size)), by = dir]
  c <- c[order(-size)][!is.na(size), ]
  # Get share with respect to the largest size
  c[, share := size / max(size)]
  e <- c[share > sharet, ]
  # Convert to data.tree
  d <- data.tree::as.Node(e, pathName = "dir")
  # Format output
  data.tree::SetFormat(d, "share", formatFun = data.tree::FormatPercent)
  data.tree::SetFormat(d, "size", formatFun = function(x)
    data.tree::FormatFixedDecimal(x, digits = 1))
  print(d, "size", "share", limit = limit)
  # Return the complete data frame, including directories below the threshold
  return(c)
}
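The core of the function is the per-directory aggregation and the share threshold. As a rough illustration only, the sketch below reproduces those two steps in base R, without data.table or data.tree, on a small throwaway tree in a temporary directory. The folder and file names (dir_size_demo, big, small, a.bin and so on) are hypothetical and not part of the gist.

```r
# Hypothetical demo tree in a temporary directory (names are illustrative only)
root <- file.path(tempdir(), "dir_size_demo")
unlink(root, recursive = TRUE)
dir.create(file.path(root, "big"), recursive = TRUE)
dir.create(file.path(root, "small"))
writeBin(raw(1000), file.path(root, "big", "a.bin"))
writeBin(raw(500),  file.path(root, "big", "b.bin"))
writeBin(raw(10),   file.path(root, "small", "c.bin"))

# List every file below root and look up its size in bytes
files <- list.files(root, all.files = TRUE, recursive = TRUE, full.names = TRUE)
info  <- file.info(files)

# Sum sizes per containing directory, largest first
sizes <- tapply(info$size, dirname(files), sum)
sizes <- sizes[order(-sizes)]

# Share relative to the largest directory, mirroring dir_size()
shares <- sizes / max(sizes)

# Keep only directories above the threshold (the role played by `sharet`)
kept <- names(shares)[shares > 0.1]

print(data.frame(dir   = basename(names(sizes)),
                 size  = as.numeric(sizes),
                 share = as.numeric(shares)))
```

With the default threshold of 0.1, a directory that is under a tenth of the size of the largest one is left out of the printed tree. It still appears in the data frame that dir_size() returns.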