Skip to content

Instantly share code, notes, and snippets.

@mavam
Last active December 17, 2015 12:29
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save mavam/5610503 to your computer and use it in GitHub Desktop.
Save mavam/5610503 to your computer and use it in GitHub Desktop.
Plots the size of a Bloom filter as a function of the expected number of elements for various false-positive rates.
library(ggplot2)
library(reshape)
library(scales)
# Computes the number of kB a basic bloom filter requires.
# n: the number of elements to store
# fp: the desired false positive rate
space = function(n, fp) { -n * log(fp) / log(2)^2 / 8 / 1024 }
N = 10^(1:9)
FP = 10^-(1:6)
x = melt(sapply(N, function(i) sapply(FP, function(j) space(i, j))))
names(x) = c("fp", "n", "value")
g = ggplot(x) +
aes(x=n, y=value, group=factor(fp), color=fp) +
geom_point() +
geom_line() +
scale_x_continuous(breaks=log10(N), labels=math_format(10^.x)) +
scale_y_log10(breaks=10^(-3:6), labels=comma) +
scale_color_continuous(name="FP rate", labels=math_format(10^-.x)) +
labs(x="Number of unique elements", y="Size (kB)")
ggsave("bf-size.png", g)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment