Module: Clusterer::DocumentSimilarity

Included in:
DocumentBase
Defined in:
lib/clusterer/similarity.rb

Instance Method Summary

Instance Method Details

#cosine_similarity(document) ⇒ Object

Computes the cosine similarity between this document and another document or cluster centroid. Returns 1.0 if either one is empty or if document is nil.



# File 'lib/clusterer/similarity.rb', line 8

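def cosine_similarity(document)
  # Degenerate case: an empty receiver or a nil/empty argument yields 1.0.
  return 1.0 if self.empty? || document.nil? || document.empty?
  # Dot product over the receiver's terms; terms absent from document count as 0.
  similarity = 0
  self.each do |w,value|
    similarity += (value * (document[w] || 0))
  end
  # Normalize by the product of the two vector lengths.
  similarity /= (self.vector_length * document.vector_length)
end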
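
A minimal usage sketch of the computation in the listing above. It does not use the library's own classes: SketchVector, SketchSimilarity, and sketch_vector are hypothetical names introduced only for illustration, and vector_length is assumed here to be the Euclidean norm of the term weights (its real definition is not shown on this page).

```ruby
# Hypothetical stand-in for a document: a Hash of term => weight.
# ASSUMPTION: vector_length is the Euclidean norm of the weights; the
# library's actual definition is not part of this page.
module SketchSimilarity
  # Mirrors the logic of cosine_similarity from the listing above.
  def cosine_similarity(document)
    return 1.0 if empty? || document.nil? || document.empty?
    dot = 0.0
    each { |term, weight| dot += weight * (document[term] || 0) }
    dot / (vector_length * document.vector_length)
  end
end

class SketchVector < Hash
  include SketchSimilarity

  # Assumed Euclidean norm of the weight vector.
  def vector_length
    Math.sqrt(values.sum { |weight| weight * weight })
  end
end

# Hypothetical helper that builds a SketchVector from term/weight pairs.
def sketch_vector(pairs)
  SketchVector.new.merge!(pairs)
end

a = sketch_vector("ruby" => 3.0, "cluster" => 4.0)
b = sketch_vector("ruby" => 3.0, "cluster" => 4.0)
c = sketch_vector("python" => 2.0)
d = sketch_vector("ruby" => 1.0)

puts a.cosine_similarity(b)   # identical vectors  → 1.0
puts a.cosine_similarity(c)   # no shared terms    → 0.0
puts a.cosine_similarity(d)   # partial overlap    → 0.6
puts a.cosine_similarity(nil) # nil argument       → 1.0
```

Note that, following the listing, an empty or nil argument produces 1.0 rather than 0.0, so callers that want empty documents to count as dissimilar would need to check for that case themselves.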