@s-kganz
Last active January 15, 2021 20:09
Synthetic data included in the scutr package for benchmarking.
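set.seed(01152021) # R parses this literal as the number 1152021

col.num <- 10 # number of non-class columns
cls.num <- 20 # number of classes
obs.num <- 10 # number of observations for the first class
del.obs <- 10 # increase in observations per class
dis.sd <- 5 # SD of class centers around 0

mtx <- matrix(nrow = 0, ncol = col.num + 1) # extra column holds the class label
for (cls in 1:cls.num) {
  # Each column gets its own center drawn from N(0, dis.sd); the observations
  # for that column are then drawn around the center with unit SD.
  newmtx <- sapply(1:col.num, function(x) {
    center <- rnorm(1, 0, sd = dis.sd)
    rnorm(obs.num, mean = center)
  })
  newmtx <- cbind(newmtx, rep(cls, nrow(newmtx)))
  mtx <- rbind(mtx, newmtx)
  obs.num <- obs.num + del.obs # each later class is larger than the last
}

imbalance <- as.data.frame(mtx)
names(imbalance)[col.num + 1] <- "class"
imbalance$class <- as.factor(imbalance$class)
# use_data() is provided by the usethis package; it saves the object
# into the data/ directory of the active package.
usethis::use_data(imbalance, overwrite = TRUE)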
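
# A quick sanity check, sketched here as an assumption about how one might
# verify the result (it is not part of the original script). Class k should
# hold 10 + (k - 1) * 10 rows, i.e. 10, 20, ..., 200, for 2100 rows in total.
# `expected.sizes` is a hypothetical helper name. The literals 10 and 20 repeat
# the starting parameter values, because the loop modifies obs.num.
expected.sizes <- 10 + 10 * (seq_len(20) - 1)
stopifnot(sum(expected.sizes) == 2100)
if (exists("imbalance")) {
  # Factor levels built from the numeric labels sort as 1..20, so the
  # table order matches expected.sizes.
  stopifnot(all(as.vector(table(imbalance$class)) == expected.sizes))
  stopifnot(ncol(imbalance) == 11) # 10 feature columns plus the class column
}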