@hadley
Created October 5, 2011 20:11
benford-prices.r
library(ggplot2)
library(plyr)  # provides mutate() and count(); count() names its tally column "freq"
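# Read the tab-separated price list; a "Price" column is expected.
orig <- read.csv("prices.tsv.gz", sep = "\t")
# Round before taking the modulus so binary floating-point error in 100 * Price cannot shift a digit.
orig <- mutate(orig,
  cents = round(100 * Price) %% 100,  # cents part of the price, 0-99
  fd = cents %/% 10)                  # tens digit of the cents, 0-9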
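# Tally the digits, dropping 0: Benford's law covers digits 1-9, and log10(0) is -Inf.
df <- subset(count(orig, "fd"), fd > 0)
df <- mutate(df,
  prob = prop.table(freq),              # observed share of each digit among 1-9
  benford = log10(fd + 1) - log10(fd))  # expected share, log10(1 + 1/d)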
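# Sanity check, added as a sketch and not part of the original script: the
# Benford shares for digits 1-9 telescope to log10(10) - log10(1) = 1, so they
# can be compared with proportions that also sum to 1. The name benford_ref is
# introduced here for illustration only.
benford_ref <- log10(1:9 + 1) - log10(1:9)
stopifnot(isTRUE(all.equal(sum(benford_ref), 1)))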
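ggplot(df, aes(x = fd, y = prob)) +
  geom_bar(stat = "identity", fill = "blue") +
  # Constants go outside aes(); inside it they are mapped as data, not applied as settings.
  # linewidth needs ggplot2 >= 3.4.0; older versions use size for lines.
  geom_line(aes(y = benford), linewidth = 0.1) +
  geom_point(aes(y = benford), color = "red", size = 1) +
  theme_bw() +
  scale_x_continuous(breaks = 1:9)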
ggsave("benford.png")