@samclifford
Created May 2, 2017 06:35
samvif <- function(mod) {
  # mod is a fitted mgcv model object (e.g. the result of gam())
  # this function calculates the variance inflation factors for the parametric
  # (non-smooth) terms of a GAM, as no one else has written code to do it properly
  # the VIFs summarise how much collinearity inflates the variance of each estimate
  mod.sum <- summary(mod)
  s2 <- mod$sig2 # estimate of the residual variance (not the standard deviation)
  X <- model.matrix(mod) # design matrix; its columns are named after the coefficients
  n <- nrow(X) # how many observations were used in fitting?
  v <- -1 # omit the intercept term, it can't inflate variance
  varbeta <- mod.sum$p.table[v, 2]^2 # squared standard errors = variance of the estimates
  # variance of each parametric column; drop = FALSE keeps a matrix when only one term remains
  varXj <- apply(X = X[, row.names(mod.sum$p.table)[v], drop = FALSE], MARGIN = 2, var)
  # the variance inflation factor, obtained by rearranging
  # var(beta_j) = s^2/(n-1) * 1/var(X_j) * VIF_j
  VIF <- varbeta / (s2 / (n - 1) * 1 / varXj)
  VIF.df <- data.frame(variable = names(VIF),
                       vif = VIF,
                       row.names = NULL)
  return(VIF.df)
}
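
The rearrangement in the comment above can be checked against the textbook definition VIF_j = 1/(1 - R_j^2). The sketch below does this for an ordinary linear model fitted with `lm()`, where the identity holds exactly; the simulated data and all variable names are made up for illustration, and a GAM with smooth terms is not covered by this check.

```r
# Hypothetical simulated data with two deliberately correlated predictors
set.seed(42)
n  <- 200
x1 <- rnorm(n)
x2 <- 0.8 * x1 + rnorm(n, sd = 0.6)
y  <- 1 + 2 * x1 - x2 + rnorm(n)

fit <- lm(y ~ x1 + x2)

# Same ingredients as in samvif(), taken from an lm fit instead of a gam fit
s2      <- summary(fit)$sigma^2                      # residual variance
varbeta <- summary(fit)$coefficients[-1, 2]^2        # squared standard errors
varXj   <- apply(model.matrix(fit)[, -1, drop = FALSE], 2, var)
vif_rearranged <- varbeta / (s2 / (n - 1) * 1 / varXj)

# Textbook VIF: regress each predictor on the other one
r2_x1 <- summary(lm(x1 ~ x2))$r.squared
r2_x2 <- summary(lm(x2 ~ x1))$r.squared
vif_textbook <- c(x1 = 1 / (1 - r2_x1), x2 = 1 / (1 - r2_x2))

# The two computations should agree up to floating-point error
print(all.equal(vif_rearranged, vif_textbook))
```

With only two predictors both VIFs are equal, because each auxiliary regression has the same R^2.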
@sebsau
sebsau commented Jan 7, 2018

Thanks for writing this function! That was exactly what I was looking for.

However, I just tried it and it threw this error:

> samvif(mod = m_gamm)
Error in `[.data.frame`(X, , row.names(mod.sum$p.table)[v]) :
  undefined columns selected
Called from: `[.data.frame`(X, , row.names(mod.sum$p.table)[v])

(with mgcv_1.8-22 under R 3.4.2)

Do you know what the problem might be there?

@CarolineXGao
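The line X <- mod$model should be X <- model.matrix(mod). mod$model is the model frame, so its columns are named after the variables, not after the coefficients listed in p.table.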
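
A minimal sketch of why the column lookup can fail, assuming a model with a factor predictor (the toy data and the names `d`, `m` and `coef_names` are hypothetical, not taken from the thread above):

```r
library(mgcv)  # provides gam()

# Hypothetical toy data: one numeric predictor, one factor, one smooth term
set.seed(1)
d <- data.frame(x1 = rnorm(100),
                f  = factor(rep(c("a", "b"), 50)),
                z  = runif(100))
d$y <- 1 + 0.5 * d$x1 + (d$f == "b") + sin(2 * pi * d$z) + rnorm(100, sd = 0.3)

m <- gam(y ~ x1 + f + s(z), data = d)

# Names of the parametric coefficients, intercept dropped
coef_names <- row.names(summary(m)$p.table)[-1]
print(coef_names)

# The model frame is named after the variables, so the factor coefficient is missing
print(colnames(m$model))
print(coef_names %in% colnames(m$model))

# The design matrix is named after the coefficients, so every name is found
print(coef_names %in% colnames(model.matrix(m)))
```

Indexing `m$model` with a coefficient name that is not one of its columns is what produces "undefined columns selected"; whether this is the exact cause of the error reported above depends on the model that was fitted.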