brandt / bloom-filter-calculator.rb
Created February 17, 2015 17:19
Calculate the required bloom filter size and optimal number of hashes from the expected number of items in the collection and acceptable false-positive rate
# Optimal bloom filter size and number of hashes
# Tips:
# 1. One byte per item in the input set gives about a 2% false positive rate.
# 2. The optimal number of hash functions is ~0.7x the number of bits per item.
# 3. The number of hashes dominates performance.
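# The first two tips can be checked numerically. The sketch below is not part
# of the original script: the helper names optimal_hashes and
# false_positive_rate are hypothetical, and it assumes the standard
# approximation p = (1 - e^(-k*n/m))^k for the false-positive rate.

```ruby
# Hypothetical helpers for illustration only.

# Optimal number of hash functions for a given bits-per-item ratio:
# k = (m / n) * ln(2), which is roughly 0.7 * (m / n).
def optimal_hashes(bits_per_item)
  bits_per_item * Math.log(2)
end

# Approximate false-positive rate for a given bits-per-item ratio (m / n)
# and number of hashes k: p = (1 - e^(-k * n / m))^k.
def false_positive_rate(bits_per_item, hashes)
  (1 - Math.exp(-hashes.to_f / bits_per_item))**hashes
end

bits_per_item = 8                          # one byte per item (tip 1)
k = optimal_hashes(bits_per_item)          # fractional optimum (tip 2)
rate = false_positive_rate(bits_per_item, k.round)

puts format("optimal k = %.2f, false-positive rate = %.2f%%", k, rate * 100)
```

# With 8 bits per item the optimum falls between 5 and 6 hashes, and either
# choice gives a false-positive rate close to 2%.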
# Expected number of items in the collection
# n = (m * ln(2)) / k, where m is the filter size in bits and k is the number of hashes
n = 300_000