@datalove
Last active August 29, 2015 14:02
Kaggle Acquire Competition - Distribution of Scores 11 June 2014
library(XML)
library(ggplot2)
url <- "http://www.kaggle.com/c/acquire-valued-shoppers-challenge/leaderboard"
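# Parse the page into an internal document, which readHTMLTable() expects
pagetree <- htmlTreeParse(url, useInternalNodes = TRUE)
# Take the first table on the page (assumed to be the leaderboard)
tbl <- readHTMLTable(pagetree, stringsAsFactors = FALSE)[[1]]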
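# Strip spaces and punctuation from the column names
colnames(tbl) <- gsub("[^a-zA-Z0-9#]","", colnames(tbl))
# Scores are read as text; convert them to numbers
tbl$Score <- as.numeric(tbl$Score)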
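# Histogram of scores, with a red reference line at 0.5
ggplot(tbl, aes(Score)) + geom_histogram() + xlab("Score") + geom_vline(xintercept = 0.5, colour = "red")
# Empirical cumulative distribution of scores
ggplot(tbl, aes(Score)) + stat_ecdf()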