Class: Hobix::Search::Simple::Vector

Inherits:
Object
Defined in:
lib/hobix/search/vector.rb

Constructor Details

#initialize ⇒ Vector

Returns a new instance of Vector.



# File 'lib/hobix/search/vector.rb', line 12

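def initialize
  # Bit set packed into a single Integer (the Array form, @bits = [], is disabled)
  @bits = 0
  @max_bit = -1   # highest index set so far; -1 while the vector is empty
  @num_bits = 0   # number of distinct indices set
end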

Instance Attribute Details

#at ⇒ Object

Returns the value of attribute at.



# File 'lib/hobix/search/vector.rb', line 9

def at
  @at
end

#bits ⇒ Object (readonly)

Returns the value of attribute bits.



# File 'lib/hobix/search/vector.rb', line 10

def bits
  @bits
end

#max_bit ⇒ Object (readonly)

Returns the value of attribute max_bit.



# File 'lib/hobix/search/vector.rb', line 10

def max_bit
  @max_bit
end

#num_bits ⇒ Object (readonly)

Returns the value of attribute num_bits.



# File 'lib/hobix/search/vector.rb', line 10

def num_bits
  @num_bits
end

Instance Method Details

#add_word_index(index) ⇒ Object



# File 'lib/hobix/search/vector.rb', line 19

def add_word_index(index)
  # Integer#[] reads a single bit, so an index that is already set is skipped
  if @bits[index].zero?
    @bits += (1 << index)
    @num_bits += 1
    @max_bit = index if @max_bit < index
  end
end
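
The effect of repeated add_word_index calls can be sketched with a small stand-in class. SketchVector is a hypothetical name used only for this illustration; its body copies the initialize and add_word_index source shown above so that the example runs without Hobix installed.

```ruby
# Hypothetical stand-in that mirrors the initialize / add_word_index source above.
class SketchVector
  attr_reader :bits, :max_bit, :num_bits

  def initialize
    @bits = 0
    @max_bit = -1
    @num_bits = 0
  end

  def add_word_index(index)
    if @bits[index].zero?
      @bits += (1 << index)
      @num_bits += 1
      @max_bit = index if @max_bit < index
    end
  end
end

v = SketchVector.new
[0, 3, 3, 5].each { |i| v.add_word_index(i) }  # index 3 is added twice

puts v.bits.to_s(2)  # => 101001 (bits 0, 3 and 5 set)
puts v.num_bits      # => 3 (the duplicate index is ignored)
puts v.max_bit       # => 5
```

Because the zero check runs before the addition, a repeated index never carries into a neighbouring bit, so += behaves like a bitwise OR here.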

#dot(vector) ⇒ Object



# File 'lib/hobix/search/vector.rb', line 27

def dot(vector)
  # Bits above the shorter vector's highest set bit cannot match,
  # so we only need to compare up to the smaller max_bit
  limit = @max_bit
  # Commenting out the next line makes this vector the dominant
  # one when doing the comparison
  limit = vector.max_bit if limit > vector.max_bit

  # Because both vectors contain only ones and zeros, the
  # normalization factor 1 / (|A| * |B|) can be pre-calculated:
  # each vector's magnitude is Math.sqrt(number of set bits)
  factor = Math.sqrt(1.0 / @num_bits) * Math.sqrt(1.0 / vector.num_bits)

  count = 0
  (limit + 1).times { |i| count += 1 if @bits[i] == 1 && vector.bits[i] == 1 }

  factor * count
end
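
The value computed by dot is the cosine similarity of two 0/1 vectors: shared set bits divided by the square root of the product of the two bit counts. A minimal sketch of the same arithmetic on plain Integers follows. binary_cosine and popcount are hypothetical helpers, not part of Hobix, and the guard for an empty vector is an added assumption (the source above divides by num_bits without checking for zero).

```ruby
# Hypothetical helper: number of set bits in a non-negative Integer.
def popcount(n)
  n.to_s(2).count("1")
end

# Hypothetical helper: cosine similarity of two bit sets held as Integers.
# Equivalent to factor * count in #dot, because a bit above the shorter
# vector's max_bit is zero in that vector and can never survive the AND.
def binary_cosine(a_bits, b_bits)
  a_count = popcount(a_bits)
  b_count = popcount(b_bits)
  # Assumption: treat an empty vector as having no similarity to anything.
  return 0.0 if a_count.zero? || b_count.zero?

  shared = popcount(a_bits & b_bits)
  shared / Math.sqrt(a_count * b_count)
end

a = 0b00111  # words 0, 1, 2
b = 0b11110  # words 1, 2, 3, 4

puts binary_cosine(a, b)  # 2 shared words / Math.sqrt(3 * 4)
puts binary_cosine(a, a)  # => 1.0 (identical vectors)
```

Counting bits of the AND replaces the per-index loop; the loop in the source yields the same count, one bit at a time.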

#score_against(must_match, must_not_match, general) ⇒ Object

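We’re a document’s vector, and we’re being matched against three other vectors:

  1. A list of must-match words

  2. A list of must-not-match words

  3. A list of general words

We return 0 if any must-not-match word is present or any must-match word is missing. Otherwise the score is the number of general words we match, plus 1 if a must-match list was given. A query made up only of must-not-match words scores 1 for every document that passes.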



# File 'lib/hobix/search/vector.rb', line 52

def score_against(must_match, must_not_match, general)
  # Eliminate if any _must_not_match_ words found
  unless must_not_match.num_bits.zero?
    return 0 unless (@bits & must_not_match.bits).zero?
  end

  # If the match was entirely negative, then we know we've passed at
  # this point
  if must_match.num_bits.zero? && general.num_bits.zero?
    return 1
  end

  count = 0

  # Eliminate unless all _must_match_ words found
  unless must_match.num_bits.zero?
    return 0 unless (@bits & must_match.bits) == must_match.bits
    count = 1
  end

  # Finally, score on the rest: one point per matched general word
  common = general.bits & @bits
  count += count_bits(common, @max_bit + 1) unless common.zero?
  count
end
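
The scoring rules can be traced on plain Integers. The sketch below is not the Hobix implementation: score and popcount are hypothetical names, and because count_bits is not shown in this section, the bit count is done with a string-based stand-in.

```ruby
# Hypothetical stand-in for count_bits, which is not shown in this section.
def popcount(n)
  n.to_s(2).count("1")
end

# Hypothetical function following the same steps as #score_against,
# with every vector represented by its Integer bit set.
def score(doc, must_match, must_not_match, general)
  # Any forbidden word present eliminates the document.
  return 0 unless (doc & must_not_match).zero?
  # A purely negative query passes with a score of 1.
  return 1 if must_match.zero? && general.zero?
  # Every required word must be present.
  return 0 unless (doc & must_match) == must_match

  base = must_match.zero? ? 0 : 1
  base + popcount(doc & general)
end

doc = 0b101101  # document contains words 0, 2, 3, 5

puts score(doc, 0b000100, 0b010000, 0b001011)  # => 3 (1 for must-match, 2 general hits)
puts score(doc, 0b000100, 0b000001, 0b001011)  # => 0 (forbidden word 0 is present)
puts score(doc, 0b000010, 0, 0b001011)         # => 0 (required word 1 is missing)
puts score(doc, 0, 0b010000, 0)                # => 1 (purely negative query)
```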