Class: Search::Simple::Vector

Inherits:
Object
Defined in:
lib/search/simple/vector.rb

Constructor Details

#initialize ⇒ Vector

Returns a new instance of Vector.



# File 'lib/search/simple/vector.rb', line 10

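def initialize
  # Bits are packed into a single Integer rather than an Array (@bits = [])
  @bits = 0
  @max_bit = -1   # highest bit index set so far; -1 while the vector is empty
  @num_bits = 0   # number of bits currently set
end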

Instance Attribute Details

#bits ⇒ Object (readonly)

Returns the value of attribute bits.



# File 'lib/search/simple/vector.rb', line 8

def bits
  @bits
end

#max_bit ⇒ Object (readonly)

Returns the value of attribute max_bit.



# File 'lib/search/simple/vector.rb', line 8

def max_bit
  @max_bit
end

#num_bits ⇒ Object (readonly)

Returns the value of attribute num_bits.



# File 'lib/search/simple/vector.rb', line 8

def num_bits
  @num_bits
end

Instance Method Details

#add_word_index(index) ⇒ Object



# File 'lib/search/simple/vector.rb', line 17

def add_word_index(index)
  # Integer#[] reads a single bit; only update the counters for a new index
  if @bits[index].zero?
    @bits |= (1 << index)
    @num_bits += 1
    @max_bit = index if @max_bit < index
  end
end
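
The bookkeeping above relies on a Ruby Integer acting as an arbitrary-width bit set. The following is a minimal standalone sketch of that idea; the class name `BitSetSketch` and the method name `add` are hypothetical stand-ins, not part of the library.

```ruby
# Hypothetical stand-in that mirrors the three attributes shown above.
class BitSetSketch
  attr_reader :bits, :max_bit, :num_bits

  def initialize
    @bits = 0
    @max_bit = -1
    @num_bits = 0
  end

  # Set bit +index+ once; repeated indexes leave the counters unchanged.
  def add(index)
    return unless @bits[index].zero?   # Integer#[] returns 0 or 1

    @bits |= (1 << index)
    @num_bits += 1
    @max_bit = index if index > @max_bit
  end
end

set = BitSetSketch.new
[3, 70, 3].each { |i| set.add(i) }   # index 3 is added twice on purpose

puts set.num_bits   # prints 2
puts set.max_bit    # prints 70
```

Because Ruby Integers grow as needed, an index such as 70 works without any fixed word size.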

#dot(vector) ⇒ Object



# File 'lib/search/simple/vector.rb', line 25

def dot(vector)
  # An empty vector has zero magnitude; return early rather than divide by zero
  return 0.0 if @num_bits.zero? || vector.num_bits.zero?

  # We only need to compare bits up to the end of the shorter vector.
  # Commenting out the second assignment makes this vector the dominant
  # one when doing the comparison.
  limit = @max_bit
  limit = vector.max_bit if limit > vector.max_bit

  # Both vectors hold only ones and zeros, so each magnitude is
  # Sqrt(num set bits) and the 1/(|A||B|) factor can be pre-calculated
  factor = Math.sqrt(1.0 / @num_bits) * Math.sqrt(1.0 / vector.num_bits)

  # The dot product itself is the number of positions set in both vectors
  count = 0
  (limit + 1).times { |i| count += 1 if @bits[i] == 1 && vector.bits[i] == 1 }

  factor * count
end
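
For vectors of ones and zeros, the value computed by #dot is the cosine of the angle between them: the number of shared bits divided by the product of the square roots of each vector's bit count. Here is a small worked sketch on plain Integers. The helper names `popcount` and `binary_cosine` are hypothetical, and the sketch counts shared bits with a bitwise AND instead of the per-bit loop used above.

```ruby
# Hypothetical helper: number of set bits in a non-negative Integer.
def popcount(n)
  n.to_s(2).count('1')
end

# Cosine similarity of two bit sets held in Integers.
# Returns 0.0 when either set is empty (assumed convention for this sketch).
def binary_cosine(a, b)
  a_count = popcount(a)
  b_count = popcount(b)
  return 0.0 if a_count.zero? || b_count.zero?

  shared = popcount(a & b)
  shared / (Math.sqrt(a_count) * Math.sqrt(b_count))
end

a = 0b00111   # words 0, 1, 2
b = 0b11110   # words 1, 2, 3, 4

# Two shared words, magnitudes sqrt(3) and sqrt(4): 2 / (1.732 * 2)
puts binary_cosine(a, b).round(3)   # prints 0.577
puts binary_cosine(a, 0)            # prints 0.0
```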

#score_against(must_match, must_not_match, general) ⇒ Object

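We’re a document’s vector, and we’re being matched against three other vectors:

  1. A list of must match words

  2. A list of must not match words

  3. A list of general words

The score is 0 if any must-not-match word is present or any must-match word is missing. Otherwise it is the number of general words matched, plus 1 when a non-empty must-match list was fully satisfied. A query made up only of must-not-match words scores 1 for every document that passes.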



# File 'lib/search/simple/vector.rb', line 50

def score_against(must_match, must_not_match, general)
  # Eliminate if any _must_not_match_ words found
  unless must_not_match.num_bits.zero?
    return 0 unless (@bits & must_not_match.bits).zero?
  end

  # If the query was entirely negative, then we know we've passed at
  # this point
  return 1 if must_match.num_bits.zero? && general.num_bits.zero?

  count = 0

  # Eliminate unless all _must_match_ words found
  unless must_match.num_bits.zero?
    return 0 unless (@bits & must_match.bits) == must_match.bits
    count = 1
  end

  # Finally, score on the rest: one point per general word present
  common = general.bits & @bits
  count += count_bits(common, @max_bit + 1) unless common.zero?
  count
end
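
The scoring rules can be tried out on plain Integers with the sketch below. It is an illustration under assumptions, not the library's code: `score_sketch` and `popcount` are hypothetical names, `popcount` stands in for the `count_bits` helper (whose implementation is not shown on this page), and an empty word list is represented by the Integer 0.

```ruby
# Hypothetical stand-in for count_bits: number of set bits in an Integer.
def popcount(n)
  n.to_s(2).count('1')
end

# Same decision order as score_against, on raw bit sets.
def score_sketch(doc, must_match, must_not_match, general)
  # Any forbidden word present: eliminated
  return 0 unless (doc & must_not_match).zero?

  # Purely negative query: the document has passed
  return 1 if must_match.zero? && general.zero?

  count = 0
  unless must_match.zero?
    # Every required word must be present
    return 0 unless (doc & must_match) == must_match

    count = 1
  end

  # One point per general word found in the document
  count + popcount(doc & general)
end

doc = 0b10110   # document contains words 1, 2 and 4

puts score_sketch(doc, 0b00010, 0b01000, 0b10101)   # prints 3
puts score_sketch(doc, 0, 0b00100, 0)               # prints 0
puts score_sketch(doc, 0, 0b01000, 0)               # prints 1
puts score_sketch(doc, 0b00001, 0, 0b10110)         # prints 0
```

In the first call the required word 1 is present (1 point) and general words 2 and 4 match (2 points). The second call fails on a forbidden word, the third is a purely negative query that passes, and the fourth is missing required word 0.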