@johncolby
Created October 26, 2011 18:22
An example of how to residualize each column of a data frame with respect to a set of covariates
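# Download example data from: https://github.com/johncolby/SVM-RFE/zipball/master
library(plyr) # Provides colwise(), used below
setwd('/path/to/SVM-RFE/') # Change this to your setup
load('demo/input.Rdata')
# Columns 2 and 3 are used as the covariates
data = input[, 2:3]
# Drop the first three columns, leaving the columns to residualize
input = input[, -(1:3)]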
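# Function to residualize a vector x for the covariates in data
residualize <- function(x, fun=x~., data) {
  # Add x as the response column so the default formula x ~ . can find it
  data = cbind(x=x, data)
  # Spell out $residuals rather than relying on partial matching of $resid
  lm(fun, data=data)$residuals
}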
# Apply our residualize function to each column of input
input = colwise(residualize)(input, data=data)
################################################################################
# Modification for test data
# Reload the raw data, since input was overwritten above
load('demo/input.Rdata')
# Set up training data (first 100 rows)
input.train = input[1:100, ]
data.train = input.train[, 2:3]
input.train = input.train[, -(1:3)]
# Set up test data (remaining rows)
input.test = input[101:nrow(input), ]
data.test = input.test[, 2:3]
input.test = input.test[, -(1:3)]
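# Modified residualize function to residualize test data based on model fit to
# training data only
residualizeTest <- function(x, fun=x~., data, x.new, data.new) {
  # Fit the covariate model on the training data
  data = cbind(x=x, data)
  fit = lm(fun, data=data)
  # Predict the covariate effect for the test rows from the training fit
  data.new = cbind(x=x.new, data.new)
  pred = predict(fit, newdata=data.new)
  # Residual = observed test value minus the predicted covariate effect
  x.new - pred
}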
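# Apply the function to each column in our test data
# (more complicated now so can't simply use colwise)
input.test = sapply(seq_len(ncol(input.test)), function(i) {
  residualizeTest(input.train[, i], data=data.train,
                  x.new=input.test[, i], data.new=data.test)
})
# sapply over column indices drops the column names, so restore them
colnames(input.test) = colnames(input.train)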
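################################################################################
# Sanity check on simulated data. This is a minimal sketch and not part of the
# original example. The names sim.cov, sim.x, train, test and residualizeDemo
# are hypothetical; residualizeDemo just repeats the logic of residualizeTest
# above so that this block can be run on its own.
# Idea being checked: applying the training-set fit back to the training rows
# should reproduce the ordinary lm() residuals, while the test rows are adjusted
# using coefficients estimated from the training rows only.
set.seed(1)
n = 150
sim.cov = data.frame(age = rnorm(n, 50, 10),
                     sex = factor(sample(c('F', 'M'), n, replace = TRUE)))
sim.x = 0.5 * sim.cov$age + 2 * (sim.cov$sex == 'M') + rnorm(n)
train = 1:100
test = 101:n

residualizeDemo <- function(x, fun = x ~ ., data, x.new, data.new) {
  # Fit on the training rows, then subtract the predicted covariate effect
  fit = lm(fun, data = cbind(x = x, data))
  x.new - predict(fit, newdata = data.new)
}

# Training rows: should agree with residuals(lm(...)) up to floating point error
res.train = residualizeDemo(sim.x[train], data = sim.cov[train, ],
                            x.new = sim.x[train], data.new = sim.cov[train, ])
fit.train = lm(x ~ ., data = cbind(x = sim.x[train], sim.cov[train, ]))
stopifnot(isTRUE(all.equal(unname(res.train), unname(residuals(fit.train)))))

# Test rows: one residual per held-out observation, using the training fit
res.test = residualizeDemo(sim.x[train], data = sim.cov[train, ],
                           x.new = sim.x[test], data.new = sim.cov[test, ])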