@stephlocke
Created December 14, 2018 15:24
# Load R packages for use
library("dplyr")
library("recipes")
library("rsample")
library("broom")
library("jsonlite")
library("sessioninfo")
# Sample data
sm_iris = initial_split(iris)
train = training(sm_iris)
test = testing(sm_iris)
# Assign the train/test row counts to our output params
ntrain = nrow(train)
ntest = nrow(test)
# Feature engineering
transform = recipe(train, ~.)
transform = transform %>%
  step_rm("Species") %>%
  step_scale(all_numeric())
transform = prep(transform)
# Prepare our data for modelling
train_clean = bake(transform, train)
test_clean = bake(transform, test)
# Build model
km = kmeans(train_clean, centers = 3)
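# Score test data: re-run k-means on the test set, seeded with the training centers
# (note: this re-fits the centers on the test data rather than strictly predicting)
test_scored = kmeans(test_clean, centers = km$centers)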
# Generate output
OutputDataSet = cbind(test_clean, cluster = test_scored$cluster)
# Prepare quality measures
keymetric = "tot.withinss"
keymetricval = test_scored$tot.withinss
metric = toJSON(glance(test_scored))
# Info
inf = toJSON(package_info())
# Convert models for output
femodel = paste0(serialize(transform, NULL), collapse = "")
mlmodel = paste0(serialize(km, NULL), collapse = "")