@mtandon09
Last active November 21, 2023 22:29
Automatically install missing R packages

Motivation

When running Shiny apps in RStudio using runGitHub(), I often found that the app would crash if a required library couldn't be loaded. For complex apps this was frustrating, because it is hard to know which libraries are required unless the documentation and/or code layout is good.

But this is a common problem when sharing code in general. So this little script was my own attempt at a solution.

Note: The pacman package has a similar motivation. But as far as I can tell, it requires loading libraries with the p_load function. So you could probably just recursively replace library or require calls with p_load as an alternative to this solution.

Usage

You can run this script from RStudio or from the command line.

RStudio

You can simply paste this into any script in RStudio, and hit "Run current line" to look through R code in the folder where that script is located:

## Requires two libraries, install them first if you don't have them
#install.packages("rstudioapi")
#install.packages("BiocManager")
library("rstudioapi")
library("BiocManager")
source("https://gist.githubusercontent.com/mtandon09/4a870bf4addbe46e784059bce0e5d8d6/raw/dc2927aa3e6a09b34a39f8346b5ebcfd41ce2a6d/install_R_dependencies.R")

Or explicitly:

  1. Place the script in the directory where you want to search for R source code (e.g. the root directory of a Shiny app)
  2. Open the script in RStudio
  3. Run the script (click 'Source' or Cmd+Shift+S)

Command line

You can call the script like this: Rscript install_R_dependencies.R /path/to/R/code where /path/to/R/code is the directory you want to search in.

If that directory does not exist, or is not provided, the script will search in the folder in which it is located.

Features

This script will find *.R and *.Rmd files in a given directory (including its subdirectories), parse them to find the libraries used, and try to install them from CRAN and Bioconductor.

By default, it will look for R files in the directory where this script is located.

If called from the command line with 'Rscript', you can specify a directory to look in.

Detecting file path

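I wanted this to work from RStudio (for running Shiny apps) or the command line, so the function determine_self_path() handles both cases by examining the output of commandArgs. If the first element is "RStudio", the folder of the script open in the editor is used (via rstudioapi). Otherwise the folder is taken from the --file= argument that Rscript passes along. If an existing directory is given as a command-line argument, it is used instead; if the given path does not exist, the location of the script is used.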
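
To make the Rscript case more concrete, here is a minimal sketch of pulling the script's folder out of a full commandArgs() vector. The helper name script_dir_from_args and the sample argument vector are made up for illustration and are not part of the gist; the real vector will differ by platform and R version.

```r
# Hypothetical helper (not part of the gist): extract the script's folder
# from a full commandArgs() vector of the kind Rscript produces.
script_dir_from_args <- function(all_args) {
  file_arg <- grep("^--file=", all_args, value = TRUE)
  if (length(file_arg) == 0) {
    return(NA_character_)  # no --file= entry, so not launched through Rscript
  }
  dirname(sub("^--file=", "", file_arg[1]))
}

# Illustrative argument vector (assumed shape, paths are invented)
example_args <- c("/usr/lib/R/bin/exec/R", "--no-echo", "--no-restore",
                  "--file=/home/me/app/install_R_dependencies.R",
                  "--args", "/path/to/R/code")
print(script_dir_from_args(example_args))
```

Anchoring the pattern with ^--file= avoids accidentally matching another argument that merely contains that text.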

Identifying libraries used

This is done in the find_R_dependencies() function by grepping for non-comment lines that contain library( or require(, and extracting the text right after the opening parenthesis. Note that this will not work as intended when the library is loaded through a variable (the variable name will be captured, not the library name). That does not interfere with the installation function, though, because only names that are actually found in CRAN or Bioconductor get installed.
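
As a rough illustration of the idea, the sketch below runs a simplified pattern over a few sample lines. The sample lines and the exact regular expressions are assumptions chosen for the example, not a copy of what the script uses.

```r
# Sample lines standing in for the contents of an R file (illustrative only)
sample_lines <- c(
  "library(ggplot2)",
  "  require(dplyr, quietly = TRUE)",
  "# library(commented_out)",
  "x <- 1",
  "library(pkg_name_var, character.only = TRUE)"
)

# Keep lines where library( or require( appears before any '#'
hits <- grep("^[^#]*(library|require)\\(", sample_lines, value = TRUE)

# Take the text between the opening parenthesis and the first ',' or ')'
pkgs <- trimws(sub(".*(library|require)\\(([^,)]*).*", "\\2", hits))
print(pkgs)
```

The last sample line shows the variable case: pkg_name_var is captured even though it is a variable rather than a package name.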

Installing

The other annoyance I always had with R is that you never know if a package is in CRAN or Bioconductor, so it always requires Googling or trial and error. So the install_multi_source() function will check both and install what's available.
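
The "check both" step boils down to set intersections between the requested names and the list of names each source offers. Here is a minimal offline sketch of that idea. The helper split_by_source and the tiny package lists are hypothetical; in practice the lists would come from available.packages() and BiocManager::available().

```r
# Hypothetical helper (not part of the gist): split a package list by the
# source in which each name is available. The "universe" vectors are passed
# in as arguments so that the example runs without network access.
split_by_source <- function(pkgs, cran_universe, bioc_universe) {
  list(
    cran    = intersect(pkgs, cran_universe),
    bioc    = intersect(pkgs, bioc_universe),
    unknown = setdiff(pkgs, union(cran_universe, bioc_universe))
  )
}

# Tiny made-up universes for illustration
cran_universe <- c("ggplot2", "dplyr", "shiny")
bioc_universe <- c("DESeq2", "limma")

res <- split_by_source(c("ggplot2", "DESeq2", "my_var"), cran_universe, bioc_universe)
str(res)
```

Names that land in unknown (such as a captured variable name) are simply never passed to an installer.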

Caveats/Holes

  • If a library is being loaded using a variable, the library name will not be captured
  • I'm not really tracking/reporting what got installed and what didn't, that would be useful
  • The script will search recursively for R files, which could be a bad idea; prob should be optional/tune-able
  • The libraries will be installed to the default library location
  • Lol lots of other things, this code isn't really thought through or tested all that much!
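
For the tracking/reporting gap, one possible approach is to compare the requested names against installed.packages() after the installation step. This is only a sketch under that assumption; report_install_status is a hypothetical helper that the script does not contain.

```r
# Hypothetical helper (not part of the gist): report which of the requested
# packages are present in the library after installation has been attempted.
report_install_status <- function(pkgs) {
  installed <- rownames(installed.packages())
  data.frame(
    package   = pkgs,
    installed = pkgs %in% installed,
    stringsAsFactors = FALSE
  )
}

# "stats" ships with R; the second name is deliberately bogus
status <- report_install_status(c("stats", "notARealPackage12345"))
print(status)
```

Calling something like this on the detected package list at the end of the script would show at a glance what is still missing.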

Btw

I started off a long time ago with a bash one-liner that would print a string I could paste into R to run install.packages. Here it is for kicks and giggles.

grep "library(" *.R | cut -d':' -f2 | sort | uniq | sed -e 's/^.*library(\(.*\))/\1/' | awk '{print "\"" $0 "\""}' | sort | uniq | tr '\n' ',' | sed 's/,$//' | echo "all_pkgs<-c($(cat -))"

It'll output something like this: all_pkgs <- c("ggplot2",...

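### Some functions to do the work
# This one detects the path of the script
determine_self_path <- function(all_args) {
  if (all_args[1] == "RStudio") {
    # Running inside RStudio: use the folder of the script open in the editor
    mydir <- dirname(rstudioapi::getSourceEditorContext()$path)
  } else {
    # Running via Rscript: the script path is passed as --file=<path>
    mydir <- dirname(gsub("--file=", "", all_args[grep("--file", all_args)]))
    # An existing directory given as the first trailing argument overrides it
    args <- commandArgs(trailingOnly = TRUE)
    if (length(args) > 0) {
      if (dir.exists(args[1])) {
        mydir <- args[1]
      }
    }
  }
  return(mydir)
}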
# This one reads R code to identify libraries used
find_R_dependencies <- function(R_source_files) {
  ### Read R code, extract mentions of 'library()' or 'require()', unless it's commented out
  all_pkgs <- unique(unlist(lapply(R_source_files, function(currfile) {
    currsource <- readLines(currfile, warn = FALSE)
    matched_lines <- grep("^[^#]*(library|require)\\(", currsource, value = TRUE)
    # Keep the first argument of the first library()/require() call on each line
    libnames <- sub(".*?(library|require)\\(([^,)]*).*", "\\2", matched_lines, perl = TRUE)
    # Drop quotes and whitespace so library("pkg") and library(pkg) give the same name
    libnames <- gsub("[\"'[:space:]]", "", libnames)
    # Skip empty captures, e.g. from a bare library() call
    return(libnames[nzchar(libnames)])
  })))
  return(all_pkgs)
}
# This one installs packages from CRAN or Bioconductor as appropriate
install_multi_source <- function(pkglist, sources = c("cran", "bioconductor")) {
  ### Could potentially also check for git repos, but that's hard
  ### Also should generalize this function more better if adding more sources
  if (any(grepl("bioc", tolower(sources)))) {
    ### Figure out which ones are available in Bioconductor and install any new that are not already present
    bioc_universe <- BiocManager::available()
    bioc_packages <- intersect(bioc_universe, pkglist)
    print(paste0(length(bioc_packages), " of ", length(pkglist), " packages found in Bioconductor."))
    bioc_packages <- bioc_packages[!(bioc_packages %in% installed.packages()[, "Package"])]
    print(paste0("Installing ", length(bioc_packages), " new packages from Bioconductor..."))
    if (length(bioc_packages)) BiocManager::install(bioc_packages)
  }
  sources <- sources[!grepl("bioc", tolower(sources))]
  if (any(grepl("cran", tolower(sources)))) {
    ### Figure out which ones are available in CRAN and install any new that are not already present
    cran_universe <- available.packages(repos = "https://cloud.r-project.org")[, "Package"]
    cran_packages <- intersect(cran_universe, pkglist)
    print(paste0(length(cran_packages), " of ", length(pkglist), " packages found in CRAN."))
    cran_packages <- cran_packages[!(cran_packages %in% installed.packages()[, "Package"])]
    print(paste0("Installing ", length(cran_packages), " new packages from CRAN..."))
    if (length(cran_packages)) install.packages(cran_packages, repos = "https://cloud.r-project.org")
  }
  sources <- sources[!grepl("cran", tolower(sources))]
}
######################### EXECUTION STARTS HERE #########################
args <- commandArgs(trailingOnly = FALSE)
my_dir <- determine_self_path(args)
print(paste0("Searching in current path: ", my_dir))
# The pattern is a regular expression (not a glob): match files ending in .R or .Rmd
r_file_list <- list.files(my_dir, pattern = "\\.(R|Rmd)$", full.names = TRUE, recursive = TRUE)
required_pkgs <- find_R_dependencies(r_file_list)
install_multi_source(required_pkgs)