@ko-lem
Created September 3, 2014 15:48
Scripts for Criteo Challenge
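# Assumes the competition files live under this directory; adjust the path as needed.
setwd('~/kaggle/criteo')
library(ggplot2)
library(scales)  # provides comma() for the y-axis labels

# Load a submission file; it must contain a Predicted column of probabilities.
probs <- read.csv('submit/submit-8.csv')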
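# Share of predictions below and above 0.5 (values exactly equal to 0.5 are counted in neither).
num_less <- sum(probs$Predicted < 0.5, na.rm = TRUE)
num_more <- sum(probs$Predicted > 0.5, na.rm = TRUE)
num_less / (num_less + num_more)
num_more / (num_less + num_more)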
# Plot a histogram of the Predicted column using bins of width 0.05.
plot.histogram <- function(data, title) {
  ggplot(data, aes(x = Predicted)) +
    ggtitle(title) +
    geom_histogram(binwidth = 1/20) +
    theme(plot.title = element_text(size = 17, vjust = 1)) +
    xlab("Predicted Probability") +
    ylab("Frequency") +
    scale_x_continuous(breaks = c(0, 0.25, 0.5, 0.75, 1)) +
    scale_y_continuous(labels = comma)
}
plot.histogram(probs, "Histogram of a Submit That Scored 0.47971")
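# Reflect predictions above 0.5 about 0.75 (p becomes 1.5 - p), flipping the upper half of the histogram.
ideal.probs <- probs
above <- probs$Predicted > 0.5
ideal.probs$Predicted[above] <- 1.5 - probs$Predicted[above]
plot.histogram(ideal.probs, "Ideal Histogram?")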