Class: Bloombroom::ContinuousBloomFilter
- Inherits:
-
Object
- Object
- Bloombroom::ContinuousBloomFilter
- Defined in:
- lib/bloombroom/filter/continuous_bloom_filter.rb
Overview
ContinuousBloomFilter is a bloom filter for unbounded stream of keys where keys are expired over a given period of time. The expected capacity of the bloom filter for the desired validity period must be known or estimated. For a given capacity and error rate, BloomHelper.find_m_k can be used to compute optimal m & k values.
4 bits per key (instead of 1 bit in a normal bloom filter) are used for keeping track of the keys ttl. the internal timer resolution is set to half of the ttl (resolution divisor of 2). using 4 bits gives us 15 usable time slots (slot 0 is for the unset state). basically the internal time bookeeping is similar to a ring buffer where the first timer tick will be time slot=1, slot=2, .. slot=15, slot=1 and so on. The total time of our internal clock will thus be 15 * (ttl / 2). We keep track of ttl by writing the current time slot in the key k buckets when first inserted in the filter. when doing a key lookup if any of the bucket contain the 0 value the key is not found. if the interval betweem the current time slot and any of the k buckets value is greater than 2 (resolution divisor) we know this key is expired and we reset the expired buckets to 0.
Constant Summary collapse
- RESOLUTION_DIVISOR =
2
- BITS_PER_BUCKET =
4
Instance Attribute Summary collapse
-
#buckets ⇒ Object
readonly
Returns the value of attribute buckets.
-
#k ⇒ Object
readonly
Returns the value of attribute k.
-
#m ⇒ Object
readonly
Returns the value of attribute m.
-
#ttl ⇒ Object
readonly
Returns the value of attribute ttl.
Instance Method Summary collapse
-
#add(key) ⇒ ContinuousBloomFilter
(also: #<<)
Self.
-
#inc_time_slot ⇒ Object
advance internal time slot.
-
#include?(key) ⇒ Boolean
(also: #[])
True if given key is present in the filter.
-
#initialize(m, k, ttl) ⇒ ContinuousBloomFilter
constructor
A new instance of ContinuousBloomFilter.
-
#start_timer ⇒ Object
start the internal timer thread for managing ttls.
Constructor Details
#initialize(m, k, ttl) ⇒ ContinuousBloomFilter
Returns a new instance of ContinuousBloomFilter.
30 31 32 33 34 35 36 37 38 39 40 41 |
# File 'lib/bloombroom/filter/continuous_bloom_filter.rb', line 30 def initialize(m, k, ttl) @m = m @k = k @ttl = ttl @buckets = BitBucketField.new(BITS_PER_BUCKET, m) # time management @increment_period = @ttl / RESOLUTION_DIVISOR @current_slot = 1 @max_slot = (2 ** BITS_PER_BUCKET) - 1 # ex. with 4 bits -> we want range 1..15 @lock = Mutex.new end |
Instance Attribute Details
#buckets ⇒ Object (readonly)
Returns the value of attribute buckets.
22 23 24 |
# File 'lib/bloombroom/filter/continuous_bloom_filter.rb', line 22 def buckets @buckets end |
#k ⇒ Object (readonly)
Returns the value of attribute k.
22 23 24 |
# File 'lib/bloombroom/filter/continuous_bloom_filter.rb', line 22 def k @k end |
#m ⇒ Object (readonly)
Returns the value of attribute m.
22 23 24 |
# File 'lib/bloombroom/filter/continuous_bloom_filter.rb', line 22 def m @m end |
#ttl ⇒ Object (readonly)
Returns the value of attribute ttl.
22 23 24 |
# File 'lib/bloombroom/filter/continuous_bloom_filter.rb', line 22 def ttl @ttl end |
Instance Method Details
#add(key) ⇒ ContinuousBloomFilter Also known as: <<
Returns self.
45 46 47 48 49 |
# File 'lib/bloombroom/filter/continuous_bloom_filter.rb', line 45 def add(key) current_slot = @lock.synchronize{@current_slot} BloomHelper.multi_hash(key, @k).each{|position| @buckets[position % @m] = current_slot} self end |
#inc_time_slot ⇒ Object
advance internal time slot. this is exposed primarily for spec’ing purposes. normally this is automatically called by the internal timer thread but if not using the internal timer thread it can be called explicitly when doing your own time management.
80 81 82 83 |
# File 'lib/bloombroom/filter/continuous_bloom_filter.rb', line 80 def inc_time_slot # ex. with 4 bits -> we want range 1..15, @lock.synchronize{@current_slot = (@current_slot % @max_slot) + 1} end |
#include?(key) ⇒ Boolean Also known as: []
Returns true if given key is present in the filter. false positive are possible and dependant on the m and k filter parameters.
54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 |
# File 'lib/bloombroom/filter/continuous_bloom_filter.rb', line 54 def include?(key) current_slot = @lock.synchronize{@current_slot} expired = false BloomHelper.multi_hash(key, @k).each do |position| start_slot = @buckets[position % @m] if start_slot == 0 expired = true elsif elapsed(start_slot, current_slot) > RESOLUTION_DIVISOR expired = true @buckets[position % @m] = 0 end end !expired end |
#start_timer ⇒ Object
start the internal timer thread for managing ttls. must be explicitely called
72 73 74 |
# File 'lib/bloombroom/filter/continuous_bloom_filter.rb', line 72 def start_timer @timer ||= detach_timer end |