Skip to content

Instantly share code, notes, and snippets.

@mathzero
Last active July 9, 2020 09:52
Show Gist options
  • Save mathzero/23799fe94336f008bbb136716bba24ec to your computer and use it in GitHub Desktop.
Save mathzero/23799fe94336f008bbb136716bba24ec to your computer and use it in GitHub Desktop.
Penalised regression stability analysis
### prep data
Xdata <- data %>% select(-outcome) %>% as.matrix()
Ydata <- as.matrix(data$outcome)
### definte stability function
LassoSub=function(k=1, Xdata, Ydata){
set.seed(k)
s=sample(nrow(data), size=0.8*nrow(data))
Xsub=Xdata[s, ]
Ysub=Ydata[s]
model.sub=cv.glmnet(x=Xsub, y=Ysub, alpha=1, family="gaussian")
coef.sub=coef(model.sub, s='lambda.min')[-1]
return(coef.sub)
}
### run function x1000
niter=1000
lasso.stab=sapply(1:niter, FUN=LassoSub, Xdata=as.matrix(Xdata), Ydata=as.matrix(Ydata))
### process results
df.results <- data.frame(question=colnames(Xdata), mean_beta=NA, sd_beta =NA, selection_prop=NA)
df.results$mean_beta <- rowMeans(lasso.stab,na.rm = TRUE)
df.results$sd_beta <- apply(lasso.stab,1,FUN=sd, na.rm = TRUE)
df.results$selection_prop <- rowSums(abs(lasso.stab) >0) / 1000
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment