Class: IndexedSearch::Match::Leet

Inherits:
Base
  • Object
Defined in:
lib/indexed_search/match/leet.rb

Overview

Performs a fast but rudimentary 1337 (Leet) match; see en.wikipedia.org/wiki/Leet. Non-alphanumeric characters are never included in the index, so matching is limited to letters and numbers. The default mapping covers only some basic ASCII, but it could be extended to include many Unicode characters. It is kept short by default so that the speed impact doesn't get out of hand. Regular expressions could yield more and better matches, and describe them more succinctly, but they are very slow with really large word lists, so they are impractical here.

Class Method Summary

  • .matches_for(term) ⇒ Object
  • .replacements_for(char) ⇒ Object

Instance Method Summary

  • #scope ⇒ Object
  • #term_map ⇒ Object

Methods inherited from Base

#find, find_attributes, #initialize, match_against_term?, #results, #term_matches, #term_non_matches

Constructor Details

This class inherits a constructor from IndexedSearch::Match::Base

Class Method Details

.matches_for(term) ⇒ Object

Given a string of characters, looks each one up in the leet replacement table and returns every possible combination of the original characters and their leet replacements. The loop tries to be efficient because the number of possible matches can be large.



# File 'lib/indexed_search/match/leet.rb', line 73

def self.matches_for(term)
  matches = []
  counts = [0] * term.length
  # cached in local var for speed increase, compared to method call in loops
  replacements = (0..term.length-1).collect { |pos| replacements_for(term[pos]) || [term[pos]] }
  # treating original string like a bunch of digits, this loop is the digit incrementer
  loop do
    # concatenate a match together (better speed by not using temporary arrays with a chained one-liner)
    match = ''
    (0..term.length-1).each { |pos| match << replacements[pos][counts[pos]] }
    matches << match
    # increment digit
    counts[0] += 1
    # loop for carrying over to next digit(s), when a digit reaches its maximum
    pos = 0
    while counts[pos] >= replacements[pos].length
      counts[pos] = 0
      pos += 1
      # return results when all digits reached max, we're done
      return matches if pos >= term.length
      counts[pos] += 1
    end
  end
end
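The result of matches_for is the Cartesian product of the per-character option lists, so its size is the product of the number of options at each position. As a minimal sketch of that idea, assuming a made-up replacement table (`LEET_TABLE` and `leet_variants` are hypothetical names, not part of this library), the same set of strings can be built with Ruby's `Array#product`. This is a different technique from the hand-rolled digit incrementer above, and it returns the combinations in a different order.

```ruby
# Hypothetical leet table for illustration only; the library's real
# default mapping may differ.
LEET_TABLE = {
  'e' => ['3'],
  'l' => ['1'],
  't' => ['7']
}.freeze

# Expand a term into every combination of original and replacement
# characters. An empty term yields no variants in this sketch.
def leet_variants(term, table = LEET_TABLE)
  return [] if term.empty?

  # One list of options per character, with the original character first.
  options = term.chars.map { |char| [char] + table.fetch(char, []) }

  # Array#product builds the Cartesian product of the option lists.
  options.first.product(*options.drop(1)).map(&:join)
end

variants = leet_variants('let')
puts variants.size     # 2 options at each of 3 positions: 2 * 2 * 2 = 8
puts variants.inspect
```

With this table, a three-letter term with two options per position expands to eight strings, which illustrates why longer tables and longer terms grow the result so quickly.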

.replacements_for(char) ⇒ Object



# File 'lib/indexed_search/match/leet.rb', line 98

def self.replacements_for(char)
  # cached version with original first, like a zero is used in math when you count
  (@@replacements_for ||= Hash[replacements.collect { |orig,repls| [orig, [orig] + repls] }])[char]
end
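The transformation in replacements_for can be tried in isolation. The sketch below assumes a small made-up mapping (`raw` and `with_originals` are hypothetical names) and shows how each original character is prepended to its own replacement list, so that index 0 always means "leave the character unchanged".

```ruby
# Build a lookup where each character maps to itself followed by its
# replacements, mirroring the Hash[...collect...] expression above.
def with_originals(raw)
  Hash[raw.collect { |orig, repls| [orig, [orig] + repls] }]
end

# Hypothetical mapping for illustration; not the library's defaults.
raw = { 'a' => ['4'], 'o' => ['0'] }
table = with_originals(raw)

puts table['a'].inspect  # ["a", "4"]
puts table['z'].inspect  # nil, since 'z' has no entry
```

A character with no entry returns nil, which is why matches_for falls back to a one-element list containing the character itself.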

Instance Method Details

#scope ⇒ Object



# File 'lib/indexed_search/match/leet.rb', line 59

def scope
  @scope.where(self.class.matcher_attribute => term_map.keys)
end

#term_map ⇒ Object

Maps each potential match back to the search query term(s) that produced it.



# File 'lib/indexed_search/match/leet.rb', line 64

def term_map
  @term_map ||= Hash.new { |hash,key| hash[key] = [] }.tap do |map|
    term_matches.each { |term| self.class.matches_for(term).each { |match| map[match] << term } }
  end
end
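The shape of the hash built by term_map can be illustrated without the rest of the class. This is a sketch under stated assumptions: `variants_for` is a hypothetical stand-in for matches_for that substitutes only two characters, and `build_term_map` is a hypothetical free-standing version of the method above.

```ruby
# Hypothetical stand-in for matches_for: returns the term plus one
# variant with 'e' => '3' and 'o' => '0' substituted everywhere.
def variants_for(term)
  [term, term.tr('eo', '30')].uniq
end

# Invert term => variants into variant => [terms], using a default
# block so each new key starts with its own empty array.
def build_term_map(terms)
  Hash.new { |hash, key| hash[key] = [] }.tap do |map|
    terms.each do |term|
      variants_for(term).each { |variant| map[variant] << term }
    end
  end
end

map = build_term_map(%w[leet l33t])
puts map.inspect
# 'l33t' is reachable from both query terms, so it lists both.
```

The keys of this hash are the candidate index entries (what scope passes to the where clause), and the values record which query terms each candidate should be credited to.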