Selva Prabhakaran selva86

Block or report user

Report or block selva86

Hide content and notifications from this user.

Learn more about blocking users

Contact Support about this user’s behavior.

Learn more about reporting abuse

Report abuse
View GitHub Profile
@selva86
selva86 / kolmogorov_smirnov_chart.R
Last active Oct 5, 2017
Function to reproduce the KS Chart in machinelearningplus.com/evaluation-metrics-classification-models
View kolmogorov_smirnov_chart.R
library(InformationValue)
library(ggplot2)
ks_plot <- function(actuals, predictedScores) {
  rank <- 0:10
  ks_table_out <- InformationValue:::ks_table(actuals = actuals, predictedScores = predictedScores)
  perc_positive <- c(0, ks_table_out$cum_perc_responders) * 100
  perc_negative <- c(0, ks_table_out$cum_perc_non_responders) * 100
  random_prediction <- seq(0, 100, 10)
  df <- data.frame(rank, random_prediction, perc_positive, perc_negative)
  df_stack <- stack(df, c(random_prediction, perc_positive, perc_negative))
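The quantity a KS chart visualises can also be sketched in base R, without the InformationValue package. The helper name `ks_sketch` and the toy data below are hypothetical. The sketch sorts observations by score, accumulates the share of positives and negatives captured so far, and returns the largest gap between the two curves. It illustrates the general idea only and is not the package's `ks_table` logic, which works on score deciles.

```r
# Hypothetical helper: largest gap between the cumulative capture
# rates of positives (1s) and negatives (0s), ordered by score.
ks_sketch <- function(actuals, predictedScores) {
  ord <- order(predictedScores, decreasing = TRUE)  # highest scores first
  a <- actuals[ord]
  cum_pos <- cumsum(a == 1) / sum(a == 1)           # cumulative share of positives
  cum_neg <- cumsum(a == 0) / sum(a == 0)           # cumulative share of negatives
  max(abs(cum_pos - cum_neg))
}

# Toy data, made up for illustration only
actuals <- c(1, 1, 0, 1, 0, 0, 1, 0, 0, 0)
scores  <- c(0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.15, 0.1)
ks_value <- ks_sketch(actuals, scores)
```

A value near 1 means the scores separate the two classes almost perfectly; a value near 0 means the model ranks no better than random.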
@selva86
selva86 / ks_plot_example.R
Created Oct 5, 2017
Reproducible example for ks_plot
View ks_plot_example.R
library(InformationValue)
library(ggplot2)
# 1. Import dataset
trainData <- read.csv('https://raw.githubusercontent.com/selva86/datasets/master/breastcancer_training.csv')
testData <- read.csv('https://raw.githubusercontent.com/selva86/datasets/master/breastcancer_test.csv')
# 2. Build Logistic Model
logitmod <- glm(Class ~ Cl.thickness + Cell.size + Cell.shape, family = "binomial", data=trainData)
# 3. Predict on testData
@selva86
selva86 / lasso_dataprep.R
Created Mar 25, 2017
Preparatory code for lasso regression lecture
View lasso_dataprep.R
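# prep training and test datasets
# Assumes `prostate` is already loaded. The two lines below are assumptions,
# not part of the original preview: caret supplies createDataPartition(),
# and `%ni%` is defined as the usual negation of `%in%`.
library(caret)
`%ni%` <- Negate(`%in%`)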
set.seed(100)
trainRows <- createDataPartition(prostate$lpsa, p=.75, list=FALSE)
trainData <- prostate[trainRows, ]
testData <- prostate[-trainRows, ]
# prepare X and Y matrices separately
train_x <- as.matrix(trainData[, colnames(trainData) %ni% c("lpsa", "train")])
train_y <- as.matrix(trainData[, "lpsa"])
test_x <- as.matrix(testData[, colnames(testData) %ni% c("lpsa", "train")])
@selva86
selva86 / final_test.R
Created Mar 24, 2017
Solutions for Final Test of Learn R By Intensive Practice
View final_test.R
## Solutions for Final Test of Learn R By Intensive Practice
Q1.
```{r}
#1
sqrt(729)
#2
1203 %% 22
#3
@selva86
selva86 / multilevel_ifelse.R
Last active Nov 10, 2016
How to write multi-level ifelse() in R?
View multilevel_ifelse.R
# How to write multi-level ifelse()
set.seed(100)
abc <- sample(letters[1:5], 1000, replace = TRUE)
df <- data.frame(v1 = abc, v2 = "blank", stringsAsFactors = FALSE)
head(df)
system.time({
  df$v2 <- ifelse(df$v1 == "a", "apple",
           ifelse(df$v1 == "b", "ball",
           ifelse(df$v1 == "c", "cat",
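As a point of comparison with the nested `ifelse()` chain, a named lookup vector does the same mapping in one indexing step. This is a different technique from the one in the gist, shown only as a sketch. The labels for "d" and "e" are made up here, since the preview is cut off before those branches.

```r
# Named lookup vector: names are the keys, values are the labels.
# The "d" and "e" labels are assumptions made for this sketch.
lookup <- c(a = "apple", b = "ball", c = "cat", d = "dog", e = "egg")

set.seed(100)
v1 <- sample(letters[1:5], 1000, replace = TRUE)

# Indexing by name translates every element in one vectorised step
v2 <- unname(lookup[v1])
head(data.frame(v1, v2))
```

Unlike a nested chain, adding another level only means adding one more entry to `lookup`.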
@selva86
selva86 / area_plot_in_base_graphics.R
Created Jul 20, 2016
area_plot_in_base_graphics.R
View area_plot_in_base_graphics.R
# How to fill area under the line in base graphics
library(xts)
library(data.table)
library(lubridate)
set.seed(100)
date_seq <- seq.POSIXt(from = ymd("2016-01-01", tz = "UTC"), length.out = 100, by = "day")
y <- round(runif(100), 2)
df <- data.table(date=date_seq, y)
head(df)
@selva86
selva86 / residual_analysis_heteroscedasticity.R
Created Nov 26, 2015
remove_heteroscedasticity_example.R
View residual_analysis_heteroscedasticity.R
.libPaths()
url <- "http://rstatistics.net/wp-content/uploads/2015/09/ozone.csv"
inputData <- read.csv(url)
# Replace outliers with missing values.
replace_outlier_with_missing <- function(x, na.rm = TRUE, ...) {
  qnt <- quantile(x, probs = c(.25, .75), na.rm = na.rm, ...)  # 25th and 75th percentiles
  H <- 1.5 * IQR(x, na.rm = na.rm)  # outlier limit threshold
  y <- x
@selva86
selva86 / prep_for_significant_variables.R
Created Nov 24, 2015
Ozone Data treated for outliers and missing values
View prep_for_significant_variables.R
# Code used in R Programming Course.
# Import Data
url <- "http://rstatistics.net/wp-content/uploads/2015/09/ozone.csv"
inputData <- read.csv(url)
# Replace outliers with missing values.
replace_outlier_with_missing <- function(x, na.rm = TRUE, ...) {
  qnt <- quantile(x, probs = c(.25, .75), na.rm = na.rm, ...)  # 25th and 75th percentiles
  H <- 1.5 * IQR(x, na.rm = na.rm)  # outlier limit threshold
  y <- x