Skip to content

Instantly share code, notes, and snippets.

@dfalster
Last active February 19, 2023 00:29
Show Gist options
  • Save dfalster/5589956 to your computer and use it in GitHub Desktop.
Save dfalster/5589956 to your computer and use it in GitHub Desktop.
The function addNewData.R modifies a data frame with a lookup table. This is useful where you want to supplement data loaded from file with other data, e.g. to add details, change treatment names, or similar. The function readNewData is also included. This function runs some checks on the new table to ensure it has correct variable names and val…
##' Modifies 'data' by adding new values supplied in newDataFileName
##'
##' newDataFileName is expected to have columns
##' c(lookupVariable,lookupValue,newVariable,newValue,source)
##'
##' Within the column 'newVariable', replace values that
##' match 'lookupValue' within column 'lookupVariable' with the value
##' newValue'. If 'lookupVariable' is NA, then replace *all* elements
##' of 'newVariable' with the value 'newValue'.
##'
##' Note that lookupVariable can be the same as newVariable.
##'
##' @param newDataFileName name of lookup table
##' @param data existing data.frame
##' @param allowedVars vector of permissible variable names for newVariable
##' @return modified data.frame
addNewData <- function(newDataFileName, data, allowedVars){
import <- readNewData(newDataFileName, allowedVars)
if( !is.null(import)){
for(i in seq_len(nrow(import))){ #Make replacements
col.to <- import$newVariable[i]
col.from <- import$lookupVariable[i]
if(is.na(col.from)){ # apply to whole column
data[col.to] <- import$newValue[i]
} else { # apply to subset
rows <- data[[col.from]] == import$lookupValue[i]
data[rows,col.to] <- import$newValue[i]
}
}
}
data
}
##' Utility function to read/process newDataFileName for addNewData
##'
##' @param newDataFileName name of lookup table
##' @param allowedVars vector of permissible variable names for newVariable
##' @return data.frame with columns c(lookupVariable,lookupValue,newVariable,newValue,source)
readNewData <- function(newDataFileName, allowedVars){
if( file.exists(newDataFileName)){
import <- read.csv(newDataFileName, header=TRUE, stringsAsFactors=FALSE,
strip.white=TRUE)
if( nrow(import)> 0 ){
#Check columns names for import are right
expectedColumns<- c("lookupVariable","lookupValue","newVariable","newValue")
nameIsOK <- expectedColumns %in% names(import)
if(any(!nameIsOK))
stop("Incorrect name in lookup table for ",
newDataFileName, "--> ", paste(expectedColumns[!nameIsOK],
collapse=", "))
#Check values of newVariable are in list of allowed variables
import$lookupVariable[import$lookupVariable == ""] <- NA
nameIsOK <- import$newVariable %in% allowedVars
if(any(!nameIsOK))
stop("Incorrect name(s) in newVariable column of ",
newDataFileName, "--> ", paste(import$newVariable[!nameIsOK],
collapse=", "))
} else {
import <- NULL
}
} else {
import <- NULL
}
import
}
lookupVariable lookupValue newVariable newValue source
id 1 species Banksia oblongifolia Daniel Falster
id 2 species Banksia ericifolia Daniel Falster
id 3 species Banksia serrata Daniel Falster
id 4 species Banksia grandis Daniel Falster
family Proteaceae Daniel Falster
location NSW Daniel Falster
id 4 location WA Daniel Falster

To install the code, do one of the following:

  1. Download it then load with source(addNewData.R)
  2. Load it directly from github
library(devtools, quietly=TRUE)
source_gist("https://gist.github.com/dfalster/5589956")

To use it, simply type

addNewData("dataNew.csv", myData, allowedVars)

where allowedVars is vector of permissible variable names for the columns of myData.

@psychemistz
Copy link

Hi, for those who have problem in source_gist, try this.
devtools::source_gist("5589956", filename = "addNewData.R")

@martungel
Copy link

Hi there, thank you very much for this very useful script!
For the French users - it is important to have your dataNew.csv file comma delimited (sep = ",") and not semi-column delimited (sep = ";") which is the default for French System Excel.

Thank you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment