@imjakedaniels
Last active April 15, 2019 16:22
Churn Analysis Project with Clustering & Decision Trees
```{r}
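# install.packages("tidyverse")  # run once if tidyverse is not installed
library(tidyverse)  # attaches readr (read_csv) and stringr (str_replace_all)
# pick the churn CSV interactively
churn <- read_csv(file.choose())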
```
```{r}
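#take numerics and remove high-correlation
#columns are selected by position: column 2 and columns 5 to 21
#data.frame() also makes the names syntactic (e.g. Int.l.Plan, Churn.)
ch <- data.frame(churn[, c(2, 5:21)])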
```
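The comment in the chunk above mentions removing highly correlated columns, but the gist does not show how they were found. A minimal sketch of one way to flag such pairs, assuming a plain correlation cutoff: the data frame `toy`, the 0.17 per-minute rate and the 0.95 cutoff are all made up for illustration and are not taken from the project.

```r
# Synthetic stand-in for the churn data. Column names mirror the gist;
# the values and the per-minute rate are invented for this example.
set.seed(42)
toy <- data.frame(
  Day.Mins       = runif(200, 0, 350),
  Eve.Mins       = runif(200, 0, 350),
  CustServ.Calls = rpois(200, 1.5)
)
toy$Day.Charge <- round(toy$Day.Mins * 0.17, 2)  # charge = fixed rate * minutes

# Absolute pairwise correlations between the numeric columns
cor_mat <- abs(cor(toy))

# Keep each pair once (upper triangle only), then flag pairs above the cutoff
cor_mat[lower.tri(cor_mat, diag = TRUE)] <- NA
high <- which(cor_mat > 0.95, arr.ind = TRUE)
high_pairs <- data.frame(
  var1 = rownames(cor_mat)[high[, "row"]],
  var2 = colnames(cor_mat)[high[, "col"]],
  r    = cor_mat[high]
)
high_pairs
```

In this toy data only the minutes/charge pair is flagged, which is the kind of redundancy that makes it reasonable to keep just one column of each such pair before clustering.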
```{r}
###categorical data to logical
ch$Int.l.Plan <- str_replace_all(ch$Int.l.Plan, "no", "F")
ch$Int.l.Plan <- str_replace_all(ch$Int.l.Plan, "yes", "T")
ch$VMail.Plan <- str_replace_all(ch$VMail.Plan, "no", "F")
ch$VMail.Plan <- str_replace_all(ch$VMail.Plan, "yes", "T")
#fixed() matches the trailing "." literally instead of as a regex wildcard
ch$Churn. <- str_replace_all(ch$Churn., fixed("False."), "F")
ch$Churn. <- str_replace_all(ch$Churn., fixed("True."), "T")
#logicals ("T"/"F" strings become TRUE/FALSE)
#assign back to Int.l.Plan, the column the later filters use
ch$Int.l.Plan <- as.logical(ch$Int.l.Plan)
ch$VMail.Plan <- as.logical(ch$VMail.Plan)
ch$Churn <- as.logical(ch$Churn.)
#combine day, evening and night usage into local totals
ch$Local.Mins <- ch$Day.Mins + ch$Eve.Mins + ch$Night.Mins
ch$Local.Charge <- ch$Day.Charge + ch$Eve.Charge + ch$Night.Charge
#remove old mins
ch$Day.Mins <- NULL
ch$Eve.Mins <- NULL
ch$Night.Mins <- NULL
#export for weka
# install.packages("RWeka")  # run once if RWeka is not installed
library(RWeka)
write.arff(ch, file = "clusteringresults.arff")
```
After identifying the three archetypes with the highest risk of churn, I generated a contact list of phone numbers to send to call centers so they could offer incentives.
```{r}
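library(tidyverse)  # already attached if the chunk above was run
#combine area code and phone numbers
#ch was built without these two columns, so take them from the full churn
#data; data.frame() gives them the same syntactic names used for ch
contact <- data.frame(churn)
ch$PhoneNumbers <- paste(contact$Area.Code, contact$Phone)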
#Customer1 - Heavy Mins
heavy_users <- which(ch$Local.Charge > 71.54 & ch$VMail.Plan == FALSE)
Customer1 <- ch$PhoneNumbers[heavy_users]
#Customer2 - Moderate, Low-Contact with Intl Plans
#as.logical() accepts either the "T"/"F" strings or an already-logical column
moderate_international_users <- which(
  ch$Local.Charge <= 71.54 &
    ch$CustServ.Calls <= 3 &
    as.logical(ch$Int.l.Plan) &
    (ch$Intl.Calls <= 2 | ch$Intl.Mins > 13.1)
)
Customer2 <- ch$PhoneNumbers[moderate_international_users]
#Customer3 - Light, Frequent-Contact
light_recurring <- which(ch$Local.Charge <= 54.12 & ch$CustServ.Calls > 3)
Customer3 <- ch$PhoneNumbers[light_recurring]
```
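The cutoffs in the chunk above (71.54, 3, 13.1 and so on) read like split points from a decision tree, but the gist does not show how the tree was built; the ARFF export suggests it may have been done in Weka. As a rough sketch of how comparable thresholds could be recovered in R, assuming the `rpart` package as a stand-in and a synthetic data set `toy` with a made-up labelling rule:

```r
library(rpart)  # "recommended" package that ships with standard R installs

# Synthetic data: the column names mirror the gist, the values are invented.
set.seed(1)
n <- 1000
toy <- data.frame(
  Local.Charge   = runif(n, 30, 90),
  CustServ.Calls = sample(0:6, n, replace = TRUE)
)

# Hypothetical labelling rule, used only so the toy tree has something to find
toy$Churn <- factor(
  toy$Local.Charge > 71.54 | toy$CustServ.Calls > 3,
  levels = c(FALSE, TRUE), labels = c("stay", "churn")
)

# Fit a classification tree; the printed splits are the candidate cutoffs
fit <- rpart(Churn ~ Local.Charge + CustServ.Calls, data = toy, method = "class")
print(fit)

# Variables the tree actually split on, and its accuracy on the toy data
split_vars <- setdiff(unique(as.character(fit$frame$var)), "<leaf>")
pred <- predict(fit, toy, type = "class")
accuracy <- mean(pred == toy$Churn)
```

Each root-to-leaf path in the printed tree is a rule of the same shape as the `which()` filters above, so a high-churn leaf translates directly into a customer segment.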

imjakedaniels commented Jan 27, 2018

More examples of the archetypes from the cluster analysis, which was my responsibility in the project.
[Image: clusteringarchetypes]

We performed K-Means clustering with Euclidean distance. From the resulting clusters, we worked out which attributes to investigate and created customer archetypes. On screen, we see two examples of the customers our decision tree revealed to us: the Customer 1 archetype, heavy users with no voicemail plan, and the Customer 3 archetype, light users with many complaints.

Once these clusters of customers with a high propensity to churn are exposed, we can improve our data collection around them so that, in the future, it reveals more of the attributes that explain why they churn. We can also adapt our current strategies to better handle sensitive customers, such as those with more than 3 customer service calls.
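
The clustering itself appears to have been run outside R (the data was exported to Weka), so the following is only an illustrative sketch of K-Means with Euclidean distance using base R's `kmeans()`. The data frame `toy`, its two groups and the choice of two clusters are assumptions made for the example, not the project's data or settings.

```r
# Synthetic customers in two loose groups (values invented for illustration):
# light users with many service calls, heavy users with few.
set.seed(123)
toy <- data.frame(
  Local.Charge   = c(rnorm(50, 45, 3), rnorm(50, 75, 3)),
  CustServ.Calls = c(rpois(50, 5), rpois(50, 1))
)

# Standardise first so both columns contribute comparably to the distance
scaled <- scale(toy)

# kmeans() minimises within-cluster squared Euclidean distance;
# nstart = 25 repeats the random initialisation and keeps the best fit
km <- kmeans(scaled, centers = 2, nstart = 25)

# Describe each cluster in the original units to read it as an archetype
centers <- aggregate(toy, by = list(cluster = km$cluster), FUN = mean)
centers
```

Reading the per-cluster means in original units is what turns a cluster into a describable archetype, such as "light users with frequent service calls".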
