Module: Edits::Jaro

Defined in:
lib/edits/jaro.rb

Overview

Implements the Jaro similarity algorithm.

Class Method Details

.distance(str1, str2) ⇒ Float

Calculate the Jaro distance between two sequences, the complement of the Jaro similarity.

Dj = 1 - Sj

Examples:

Edits::Jaro.distance("information", "informant") # => 0.09764309764309764

# File 'lib/edits/jaro.rb', line 43

def self.distance(str1, str2)
  1.0 - similarity(str1, str2)
end

.similarity(seq1, seq2) ⇒ Float

Calculate Jaro similarity

Sj = 1/3 * ((m / |A|) + (m / |B|) + ((m - t) / m))

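Where m is the number of matching elements, t is the number of transpositions (half the number of matched elements that are out of order), and |A| and |B| are the lengths of the two sequences.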

Examples:

Edits::Jaro.similarity("information", "informant") # => 0.9023569023569024
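
As a worked instance of the formula, the result for "information" and "informant" can be reproduced by hand. The counts used here (m = 9, t = 1) are assumptions obtained by applying the standard Jaro matching rules to these two strings; they are not read from the library. The variable names are illustrative only.

```ruby
# Assumed counts for "information" (|A| = 11) and "informant" (|B| = 9):
# all nine characters of "informant" find a match, and the trailing
# "n"/"t" pair is matched out of order, which counts as one transposition.
m = 9.0
t = 1
len_a = 11
len_b = 9

# Sj = 1/3 * ((m / |A|) + (m / |B|) + ((m - t) / m))
similarity = ((m / len_a) + (m / len_b) + ((m - t) / m)) / 3

# Dj = 1 - Sj
distance = 1.0 - similarity

puts similarity.round(4) # 0.9024
puts distance.round(4)   # 0.0976
```

Rounded to four places, these agree with the documented values for `.similarity` and `.distance`.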

# File 'lib/edits/jaro.rb', line 20

def self.similarity(seq1, seq2)
  return 1.0 if seq1 == seq2
  return 0.0 if seq1.empty? || seq2.empty?

  seq1 = seq1.codepoints if seq1.is_a? String
  seq2 = seq2.codepoints if seq2.is_a? String

  m, t = jaro_matches(seq1, seq2)
  return 0.0 if m.zero?

  m = m.to_f
  ((m / seq1.length) + (m / seq2.length) + ((m - t) / m)) / 3
end
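
The `jaro_matches` helper called above is not shown on this page. The following is a minimal sketch of how such a helper could count matches and transpositions under the standard Jaro definition. The name `jaro_matches_sketch` and everything inside it are hypothetical, and the library's own implementation may differ.

```ruby
# Hypothetical stand-in for the helper; NOT the library's code.
# Returns [m, t]: the match count and the transposition count.
def jaro_matches_sketch(seq1, seq2)
  # Scan from the shorter sequence into the longer one.
  seq1, seq2 = seq2, seq1 if seq1.length > seq2.length

  # Standard Jaro match window: floor(max_length / 2) - 1, never negative.
  range = [(seq2.length / 2) - 1, 0].max

  flags1 = Array.new(seq1.length, false)
  flags2 = Array.new(seq2.length, false)
  matches = 0

  seq1.each_with_index do |item, i|
    lo = [i - range, 0].max
    hi = [i + range, seq2.length - 1].min

    (lo..hi).each do |j|
      # Skip elements that are already matched or that differ.
      next if flags2[j] || seq2[j] != item

      flags1[i] = flags2[j] = true
      matches += 1
      break
    end
  end

  # Compare the matched elements in order; each mismatched pair is
  # half of one transposition.
  matched1 = seq1.select.with_index { |_, i| flags1[i] }
  matched2 = seq2.select.with_index { |_, j| flags2[j] }
  out_of_order = matched1.zip(matched2).count { |a, b| a != b }

  [matches, out_of_order / 2]
end

m, t = jaro_matches_sketch("information".codepoints, "informant".codepoints)
puts "m=#{m} t=#{t}"
```

Under these assumptions, the two strings yield nine matches and one transposition, which is consistent with the documented similarity of about 0.9024.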