@tcibinan
Created March 18, 2018 22:28
K-means clustering with kmeans() and hierarchical clustering with agnes() in R.
library(dplyr)
library(cluster)
library(stringr)
library(caret)
set.seed(2342)
# Data preprocessing
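# Read the data, strip spaces from the class labels and turn them into a factor
raw_data <-
  read.csv('lab7_data.csv', stringsAsFactors = FALSE) %>%
  mutate(Class = str_replace_all(Class, " ", "")) %>%
  mutate(Class = factor(Class))
# Drop the known labels so that clustering only sees the features
unclassified_data <-
  raw_data %>%
  select(-Class)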
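# Computations
kmeans_cluster <- kmeans(unclassified_data, 3)
# confusionMatrix() expects two factors with identical levels
# (this assumes Class has exactly three levels).
# Cluster IDs are arbitrary, so the error depends on how they line up with the class codes.
clusterisation_error <-
  confusionMatrix(factor(kmeans_cluster$cluster, levels = 1:3),
                  factor(as.numeric(raw_data$Class), levels = 1:3)) %>%
  (function (matrix) 1 - matrix$overall[['Accuracy']])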
# Sepal length per observation, coloured by assigned k-means cluster
plot(unclassified_data$Sepala.length,
     col = kmeans_cluster$cluster,
     ylab = "Sepal length",
     xlab = "Observation",
     main = paste0("k-means clustering, error = ", round(clusterisation_error, 3)))
# Agglomerative hierarchical clustering (agnes) on the same features
hierarchy_cluster <- agnes(unclassified_data)
plot(hierarchy_cluster,
     xlab = "cluster",
     ylab = "height",
     main = "Hierarchical clustering (agnes)")
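
The error computed above compares raw k-means cluster IDs with the numeric class codes, but k-means numbers its clusters arbitrarily, so the result can change with the seed. One possible way to make the comparison stable is to relabel the clusters before scoring them. The sketch below is not part of the original script: `align_clusters` is a hypothetical helper, it assumes the number of clusters equals the number of classes, and it uses the built-in `iris` data only as a stand-in for `lab7_data.csv`.

```r
# Hypothetical helper: map each cluster ID to a class label by trying every
# one-to-one assignment and keeping the one with the most matches.
# Brute force over k! permutations, so only suitable for small k.
align_clusters <- function(clusters, classes) {
  classes <- factor(classes)
  k <- nlevels(classes)
  # Rows: cluster IDs 1..k, columns: class levels
  counts <- table(factor(clusters, levels = seq_len(k)), classes)

  # All permutations of a vector, built recursively
  perms <- function(v) {
    if (length(v) <= 1) return(list(v))
    out <- list()
    for (i in seq_along(v)) {
      for (p in perms(v[-i])) out[[length(out) + 1]] <- c(v[i], p)
    }
    out
  }

  best <- NULL
  best_hits <- -1
  for (p in perms(seq_len(k))) {
    # p[i] is the class index assigned to cluster i
    hits <- sum(counts[cbind(seq_len(k), p)])
    if (hits > best_hits) {
      best_hits <- hits
      best <- p
    }
  }
  factor(levels(classes)[best[clusters]], levels = levels(classes))
}

# Usage on the stand-in data set (assumption: iris instead of lab7_data.csv)
set.seed(2342)
fit <- kmeans(iris[, 1:4], 3)
aligned <- align_clusters(fit$cluster, iris$Species)
aligned_error <- 1 - mean(aligned == iris$Species)
print(aligned_error)
```

The factor returned by `align_clusters` has the same levels as the class factor, so it could also be passed to `confusionMatrix()` directly in place of the raw cluster IDs.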