@xiaodaigh
Created October 15, 2017 23:51
ctree vs kmeans on the iris dataset
# data prep ---------------------------------------------------------------
library(data.table)
data(iris)
iris_copy <- copy(iris)
setDT(iris_copy)
iris_copy_ctree <- copy(iris_copy)
# ctree model -------------------------------------------------------------
library(partykit)
ctree_model <- ctree(Species ~ ., data = iris_copy_ctree)
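# use predict() for the fitted class; the `(response)` column of the fitted
# data holds the observed Species, so using it would make ctree look perfect
ctree_pred <- predict(ctree_model, type = "response")
iris_copy[, ctree_pred_species := ctree_pred]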
# kmeans ------------------------------------------------------------------
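# kmeans starts from random centers, so fix a seed (value is arbitrary) for reproducible clusters
set.seed(1)
a <- kmeans(iris_copy[, 1:4], centers = 3)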
iris_copy[,kmeans_prediction_n := a$cluster]
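# label each cluster with its majority Species; grouping by cluster (not by Species)
# guarantees exactly one label per cluster, so the merge below neither drops nor duplicates rows
tmp <- iris_copy[, .N, by = .(kmeans_prediction_n, Species)]
kmeans_prediction_fnl <- tmp[order(-N), .(kmeans_pred_species = Species[1]), by = kmeans_prediction_n]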
iris1 <- merge(iris_copy, kmeans_prediction_fnl, by = "kmeans_prediction_n")
# performance comparison --------------------------------------------------
iris1[, ctree_correct := ctree_pred_species == Species]
iris1[, kmeans_correct := kmeans_pred_species == Species]
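# cross-tabulate the correctness flags; ctree is supervised and scored in-sample,
# while kmeans never sees Species, so this comparison favors ctree
iris1[, .N, .(ctree_correct, kmeans_correct)]
message("ctree appears superior on this in-sample comparison")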
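# accuracy summary (illustrative sketch) -----------------------------------
# Not part of the original analysis: a minimal sketch that turns the two
# correctness flags into in-sample accuracy rates, assuming iris1 still has
# the ctree_correct and kmeans_correct columns created above.
# `accuracy_summary` is a name introduced here for illustration only.
accuracy_summary <- iris1[, .(
  ctree_accuracy  = mean(ctree_correct),   # share of rows ctree labels correctly
  kmeans_accuracy = mean(kmeans_correct)   # share of rows the cluster labels get right
)]
print(accuracy_summary)
# per-species view of where the cluster labels disagree with the true Species
print(table(actual = iris1$Species, kmeans = iris1$kmeans_pred_species))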