Skip to content

Instantly share code, notes, and snippets.

@xoebus
Created November 30, 2011 20:59
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save xoebus/1410792 to your computer and use it in GitHub Desktop.
Save xoebus/1410792 to your computer and use it in GitHub Desktop.
A Bloom Filter in Ruby
require "zlib"
class BloomFilter
def initialize(size=100, hash_count=3)
raise(ArgumentError, "negative or zero buffer size") if size <= 0
raise(ArgumentError, "negative or zero hash count") if hash_count <= 0
@size = size
@hash_count = hash_count
@buffer = Array.new(size, false)
end
def insert(element)
hash(element).each { |i| @buffer[i] = true}
end
def maybe_include?(element)
hash(element).map { |i| @buffer[i] }.inject(:&)
end
private :hash
def hash(element)
hashes = []
1.upto(@hash_count) do |i|
hashes << Zlib.crc32(element, i)
end
hashes.map { |h| h % @size }
end
end
b = BloomFilter.new(50, 5)
b.insert("hello")
puts b.maybe_include?("hello") # => true
puts b.maybe_include?("goodbye") # => false
@xoebus
Copy link
Author

xoebus commented Nov 30, 2011

I realise the hashing is stupid but IDGAF.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment