Module: Clusterer::DocumentSimilarity
- Included in:
- DocumentBase
- Defined in:
- lib/clusterer/similarity.rb
Instance Method Summary
- #cosine_similarity(document) ⇒ Object
Computes the cosine similarity between two documents or cluster centroids.
Instance Method Details
#cosine_similarity(document) ⇒ Object
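Computes the cosine similarity between this document (or cluster centroid) and the given one: the sum of the products of the weights of their shared terms, divided by the product of the two vector lengths. Returns 1.0 if the receiver is empty, or if the argument is nil or empty.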
    # File 'lib/clusterer/similarity.rb', line 8
    def cosine_similarity(document)
      return 1.0 if self.empty? || document.nil? || document.empty?
      similarity = 0
      self.each do |w,value|
        similarity += (value * (document[w] || 0))
      end
      similarity /= (self.vector_length * document.vector_length)
    end
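The following is a minimal, self-contained sketch of how the method behaves. It does not load the library itself. `DocumentSimilaritySketch` is a local copy of the method body shown above, and `SketchDocument` is a hypothetical Hash subclass standing in for the real document class. Its `vector_length` is assumed to be the Euclidean norm of the term weights; the library's own definition is not shown on this page and may differ.

```ruby
# Local copy of the documented method, so the sketch runs without the gem.
module DocumentSimilaritySketch
  def cosine_similarity(document)
    # Edge case taken from the source: empty or nil input yields 1.0.
    return 1.0 if empty? || document.nil? || document.empty?
    similarity = 0
    each do |w, value|
      # Terms missing from the other document contribute 0.
      similarity += value * (document[w] || 0)
    end
    similarity / (vector_length * document.vector_length)
  end
end

# Hypothetical document type: a Hash mapping terms to numeric weights.
class SketchDocument < Hash
  include DocumentSimilaritySketch

  # Assumption: vector length is the Euclidean norm of the weights.
  def vector_length
    Math.sqrt(values.sum { |v| v * v })
  end
end

a = SketchDocument.new.merge!("ruby" => 1.0, "cluster" => 2.0)
b = SketchDocument.new.merge!("ruby" => 2.0, "cluster" => 4.0)
c = SketchDocument.new.merge!("graph" => 3.0)

puts a.cosine_similarity(b)                   # same direction: approximately 1.0
puts a.cosine_similarity(c)                   # no shared terms: 0.0
puts a.cosine_similarity(SketchDocument.new)  # empty argument: 1.0
```

Note that the empty-input case returns 1.0 (maximal similarity) rather than 0.0, so callers that treat empty documents as dissimilar may want to check for emptiness before calling the method.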