@mhlr
Last active January 10, 2017 05:49
generate Zipf distributed data as described in http://arxiv.org/abs/1407.7135
# Generate Zipf-distributed data as described in
# http://arxiv.org/abs/1407.7135
# "Zipf's law arises naturally in structured, high-dimensional data"
# Laurence Aitchison, Nicola Corradi, Peter E. Latham
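n <- 2^22                  # number of samples
k <- 20                    # number of binary dimensions (bits) per sample
b <- rnorm(k, 1, 0.2)      # per-bit exponent
# Each sample draws one latent z, then sets bit i to 1 with probability p[i].
data <- replicate(n, {
  z <- runif(1)
  p <- (z^b) / (z^b + (1 - z)^b)
  sum((runif(k) < p) * 2^(0:(k - 1)))  # encode the k bits as one integer
})
# Rank-frequency plot on log-log axes; Zipf's law appears as a roughly straight line.
tbl <- sort(table(data), decreasing = TRUE)
plot(log2(seq_along(tbl)), log2(as.numeric(tbl)),
     xlab = "log2(rank)", ylab = "log2(count)")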