Class: Bloombroom::BloomFilter
- Inherits:
-
Object
- Object
- Bloombroom::BloomFilter
- Defined in:
- lib/bloombroom/filter/bloom_filter.rb
Overview
BloomFilter false positive probability rule of thumb: see www.igvita.com/2008/12/27/scalable-datasets-bloom-filters-in-ruby/ a Bloom filter with a 1% error rate and an optimal value for k only needs 9.6 bits per key, and each time we add 4.8 bits per element we decrease the error rate by ten times.
10000 elements, 1% error rate: m = 10000 * 10 bits -> 12k of memory, k = 0.7 * (10000 * 10 bits / 10000) = 7 hash functions 10000 elements, 0.1% error rate: m = 10000 * 15 bits -> 18k of memory, k = 0.7 * (10000 * 15 bits / 10000) = 11 hash functions
Bloombroom::BloomHelper.find_m_k can be used to compute optimal m & k values for a required capacity and error rate.
Instance Attribute Summary collapse
-
#bits ⇒ Object
readonly
Returns the value of attribute bits.
-
#k ⇒ Object
readonly
Returns the value of attribute k.
-
#m ⇒ Object
readonly
Returns the value of attribute m.
-
#size ⇒ Object
readonly
Returns the value of attribute size.
Instance Method Summary collapse
-
#add(key) ⇒ Fixnum
(also: #<<)
The total number of keys in the filter.
-
#include?(key) ⇒ Boolean
(also: #[])
True if given key is present in the filter.
-
#initialize(m, k) ⇒ BloomFilter
constructor
A new instance of BloomFilter.
Constructor Details
#initialize(m, k) ⇒ BloomFilter
Returns a new instance of BloomFilter.
21 22 23 24 25 26 |
# File 'lib/bloombroom/filter/bloom_filter.rb', line 21 def initialize(m, k) @bits = BitField.new(m) @m = m @k = k @size = 0 end |
Instance Attribute Details
#bits ⇒ Object (readonly)
Returns the value of attribute bits.
17 18 19 |
# File 'lib/bloombroom/filter/bloom_filter.rb', line 17 def bits @bits end |
#k ⇒ Object (readonly)
Returns the value of attribute k.
17 18 19 |
# File 'lib/bloombroom/filter/bloom_filter.rb', line 17 def k @k end |
#m ⇒ Object (readonly)
Returns the value of attribute m.
17 18 19 |
# File 'lib/bloombroom/filter/bloom_filter.rb', line 17 def m @m end |
#size ⇒ Object (readonly)
Returns the value of attribute size.
17 18 19 |
# File 'lib/bloombroom/filter/bloom_filter.rb', line 17 def size @size end |
Instance Method Details
#add(key) ⇒ Fixnum Also known as: <<
Returns the total number of keys in the filter.
30 31 32 33 |
# File 'lib/bloombroom/filter/bloom_filter.rb', line 30 def add(key) BloomHelper.multi_hash(key, @k).each{|position| @bits.set(position % @m)} @size += 1 end |
#include?(key) ⇒ Boolean Also known as: []
Returns true if given key is present in the filter. false positive are possible and dependant on the m and k filter parameters.
38 39 40 41 |
# File 'lib/bloombroom/filter/bloom_filter.rb', line 38 def include?(key) BloomHelper.multi_hash(key, @k).each{|position| return false unless @bits.include?(position % @m)} true end |