Class: Hobix::Search::Simple::Vector
- Inherits: Object
- Defined in: lib/hobix/search/vector.rb
Instance Attribute Summary
- #at ⇒ Object
  Returns the value of attribute at.
- #bits ⇒ Object (readonly)
  Returns the value of attribute bits.
- #max_bit ⇒ Object (readonly)
  Returns the value of attribute max_bit.
- #num_bits ⇒ Object (readonly)
  Returns the value of attribute num_bits.
Instance Method Summary
- #add_word_index(index) ⇒ Object
- #dot(vector) ⇒ Object
- #initialize ⇒ Vector (constructor)
  A new instance of Vector.
- #score_against(must_match, must_not_match, general) ⇒ Object
  We’re a document’s vector, and we’re being matched against three other vectors: must match words, must not match words, and general words.
Constructor Details
#initialize ⇒ Vector
Returns a new instance of Vector.
# File 'lib/hobix/search/vector.rb', line 12
def initialize
  # @bits = []
  @bits = 0
  @max_bit = -1
  @num_bits = 0
end
Instance Attribute Details
#at ⇒ Object
Returns the value of attribute at.
# File 'lib/hobix/search/vector.rb', line 9
def at
  @at
end
#bits ⇒ Object (readonly)
Returns the value of attribute bits.
# File 'lib/hobix/search/vector.rb', line 10
def bits
  @bits
end
#max_bit ⇒ Object (readonly)
Returns the value of attribute max_bit.
# File 'lib/hobix/search/vector.rb', line 10
def max_bit
  @max_bit
end
#num_bits ⇒ Object (readonly)
Returns the value of attribute num_bits.
# File 'lib/hobix/search/vector.rb', line 10
def num_bits
  @num_bits
end
Instance Method Details
#add_word_index(index) ⇒ Object
# File 'lib/hobix/search/vector.rb', line 19
def add_word_index(index)
  if @bits[index].zero?
    @bits += (1 << index)
    @num_bits += 1
    @max_bit = index if @max_bit < index
  end
end
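The listing above keeps the whole set of word indexes in a single Integer and reads individual bits with Integer#[]. The following is a minimal standalone sketch of that bookkeeping; the class name `BitSetSketch` and the method name `add` are hypothetical and not part of Hobix.

```ruby
# Hypothetical stand-in that mirrors the bookkeeping in add_word_index.
class BitSetSketch
  attr_reader :bits, :max_bit, :num_bits

  def initialize
    @bits = 0      # one Integer used as an arbitrarily wide bit field
    @max_bit = -1  # highest index set so far (-1 means nothing is set)
    @num_bits = 0  # number of distinct indexes that are set
  end

  def add(index)
    return unless @bits[index].zero?  # Integer#[] reads a single bit
    @bits |= (1 << index)             # same effect as += when the bit is clear
    @num_bits += 1
    @max_bit = index if @max_bit < index
  end
end

set = BitSetSketch.new
[3, 5, 3].each { |i| set.add(i) }  # the repeated 3 is ignored
```

Because the bit is only added when it is currently clear, adding the same index twice leaves the counters unchanged.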
#dot(vector) ⇒ Object
# File 'lib/hobix/search/vector.rb', line 27
def dot(vector)
  # We only need to calculate up to the end of the shortest vector
  limit = @max_bit
  # Commenting out the next line makes this vector the dominant
  # one when doing the comparison
  limit = vector.max_bit if limit > vector.max_bit

  # because both vectors have just ones or zeros in them,
  # we can pre-calculate the AnBn component
  # The vector's magnitude is Sqrt(num set bits)
  factor = Math.sqrt(1.0/@num_bits) * Math.sqrt(1.0/vector.num_bits)

  count = 0
  (limit+1).times {|i| count += 1 if @bits[i] ==1 && vector.bits[i] == 1}

  factor * count
end
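Since each vector holds only ones and zeros, the value computed above works out to the cosine similarity: the number of shared set bits divided by the square roots of the two set-bit counts. A rough standalone sketch on plain Integers follows. The helper name `bit_cosine` is hypothetical, the popcount via `to_s(2)` replaces the per-bit loop, and the zero guard is an assumption added here (the listing has no such guard).

```ruby
# Hypothetical helper: cosine similarity of two non-negative Integers
# treated as bit sets.
def bit_cosine(a_bits, b_bits)
  a_count = a_bits.to_s(2).count("1")
  b_count = b_bits.to_s(2).count("1")
  # Assumed guard, not in the original listing: avoid dividing by zero.
  return 0.0 if a_count.zero? || b_count.zero?

  # Bits above the shorter vector's highest set bit are zero in that
  # vector, so AND-ing the full Integers counts the same overlap.
  shared = (a_bits & b_bits).to_s(2).count("1")
  shared / Math.sqrt(a_count * b_count)
end

a = 0b1011  # word indexes 0, 1, 3
b = 0b0011  # word indexes 0, 1
similarity = bit_cosine(a, b)  # 2 shared bits over sqrt(3 * 2)
```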
#score_against(must_match, must_not_match, general) ⇒ Object
We’re a document’s vector, and we’re being matched against three other vectors:
1. A list of must match words
2. A list of must not match words
3. A list of general words. The score we return is the number of these that we match, plus one when a non-empty must match list is fully satisfied.
# File 'lib/hobix/search/vector.rb', line 52
def score_against(must_match, must_not_match, general)
  # Eliminate if any _must_not_match_ words found
  unless must_not_match.num_bits.zero?
    return 0 unless (@bits & must_not_match.bits).zero?
  end

  # If the match was entirely negative, then we know we're passed at
  # this point
  if must_match.num_bits.zero? and general.num_bits.zero?
    return 1
  end

  count = 0

  # Eliminate unless all _must_match_ words found
  unless must_match.num_bits.zero?
    return 0 unless (@bits & must_match.bits) == must_match.bits
    count = 1
  end

  # finally score on the rest
  common = general.bits & @bits
  count += count_bits(common, @max_bit+1) unless common.zero?

  count
end
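The three-way scoring above can be traced on plain Integers. This is only a sketch under stated assumptions: `score_sketch` and `count_set_bits` are hypothetical names, and `count_set_bits` stands in for `count_bits`, whose definition does not appear in this listing.

```ruby
# Hypothetical popcount helper; the real count_bits is not shown here.
def count_set_bits(value)
  value.to_s(2).count("1")
end

# doc, must, must_not and general are non-negative Integers used as bit sets.
def score_sketch(doc, must, must_not, general)
  return 0 unless (doc & must_not).zero?   # any forbidden word rejects the document
  return 1 if must.zero? && general.zero?  # a purely negative query passes here
  return 0 unless (doc & must) == must     # every required word must be present

  score = must.zero? ? 0 : 1               # one point for satisfying must-match
  score + count_set_bits(doc & general)    # plus one per matched general word
end

doc = 0b10110  # document contains word indexes 1, 2, 4

full_match    = score_sketch(doc, 0b00010, 0b00001, 0b10100)  # 1 + 2 general hits
rejected      = score_sketch(doc, 0, 0b00100, 0b10000)        # contains a forbidden word
negative_only = score_sketch(doc, 0, 0b01000, 0)              # only a must-not list, not hit
```

An empty bit set is represented by 0 here, which plays the role of the `num_bits.zero?` checks in the listing.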