Class: IndexedSearch::Match::Leet
Defined in: lib/indexed_search/match/leet.rb
Overview
Performs a fast but rudimentary 1337 (Leet) match; see en.wikipedia.org/wiki/Leet. Non-alphanumeric characters are never included in the index, so replacements are limited to letters and numbers. The default mappings cover only some basic ASCII, but could be extended to include many Unicode characters. The list is kept short by default so that the speed impact doesn’t get out of hand. Regular expressions could yield more and better matches, and would describe them more succinctly too, but they are very slow against really large word lists, which makes them impractical here.
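To make the speed concern concrete, here is a rough sketch of how the number of candidate strings grows with the size of the mapping. The table `SAMPLE_LEET` and the helper `candidate_count` are hypothetical and exist only for this illustration; the library's actual default mappings are not shown on this page.

```ruby
# Hypothetical replacement table, for illustration only; the library's
# real default mappings are not shown on this page.
SAMPLE_LEET = {
  'a' => ['4'],
  'e' => ['3'],
  'l' => ['1'],
  'o' => ['0'],
  's' => ['5', 'z'],
  't' => ['7']
}.freeze

# Number of candidate strings for a term: each character contributes
# 1 (itself) plus however many replacements the table lists for it,
# and the per-character choices multiply.
def candidate_count(term, table = SAMPLE_LEET)
  term.each_char.reduce(1) do |total, char|
    total * (1 + table.fetch(char, []).length)
  end
end

puts candidate_count('leet')   # 2 * 2 * 2 * 2     => 16
puts candidate_count('glass')  # 1 * 2 * 2 * 3 * 3 => 36
```

Because the counts multiply, adding even one extra replacement for a common letter increases the candidate list for every term containing that letter, which is presumably why the default table is kept small.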
Class Method Summary
- .matches_for(term) ⇒ Object
Given a string of characters, looks each one up in the leet replacement table and returns every combination of the original characters and their leet replacements. The loop tries to be efficient because the number of possible matches can be large.
- .replacements_for(char) ⇒ Object
Instance Method Summary
- #scope ⇒ Object
- #term_map ⇒ Object
Maps potential matches back to the search query term(s) that produce them.
Methods inherited from Base
#find, find_attributes, #initialize, match_against_term?, #results, #term_matches, #term_non_matches
Constructor Details
This class inherits a constructor from IndexedSearch::Match::Base
Class Method Details
.matches_for(term) ⇒ Object
Given a string of characters, looks each one up in the leet replacement table and returns every combination of the original characters and their leet replacements. The loop tries to be efficient because the number of possible matches can be large.
    # File 'lib/indexed_search/match/leet.rb', line 73
    def self.matches_for(term)
      matches = []
      counts = [0] * term.length
      # cached in local var for speed increase, compared to method call in loops
      replacements = (0..term.length-1).collect { |pos| replacements_for(term[pos]) || [term[pos]] }
      # treating original string like a bunch of digits, this loop is the digit incrementer
      loop do
        # concatenate a match together (better speed by not using temporary arrays with a chained one-liner)
        match = ''
        (0..term.length-1).each { |pos| match << replacements[pos][counts[pos]] }
        matches << match
        # increment digit
        counts[0] += 1
        # loop for carrying over to next digit(s), when a digit reaches its maximum
        pos = 0
        while counts[pos] >= replacements[pos].length
          counts[pos] = 0
          pos += 1
          # return results when all digits reached max, we're done
          return matches if pos >= term.length
          counts[pos] += 1
        end
      end
    end
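For comparison, the same set of strings can be built with Ruby's `Array#product`, which is roughly the kind of "chained one-liner" the source comment says it avoids for speed. This is only a sketch under stated assumptions: `SAMPLE_REPLACEMENTS` and `matches_via_product` are hypothetical names, and the table is made up rather than taken from the library.

```ruby
# Hypothetical lookup standing in for replacements_for: original character
# first, then its substitutes. Not the library's actual table.
SAMPLE_REPLACEMENTS = {
  'e' => ['e', '3'],
  'l' => ['l', '1'],
  't' => ['t', '7']
}.freeze

# Builds every combination with Array#product. Easier to read than the
# counter loop, but it allocates an intermediate array per combination
# before joining it into a string.
def matches_via_product(term, table = SAMPLE_REPLACEMENTS)
  return [] if term.empty?
  # Characters without an entry fall back to themselves, as in matches_for.
  choices = term.each_char.map { |char| table[char] || [char] }
  first, *rest = choices
  first.product(*rest).map(&:join)
end

p matches_via_product('let')
# => ["let", "le7", "l3t", "l37", "1et", "1e7", "13t", "137"]
```

The set of results is the same as the counter loop would produce for the same table, but the order differs: `Array#product` varies the last position fastest, whereas the loop above increments position 0 first.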
.replacements_for(char) ⇒ Object
    # File 'lib/indexed_search/match/leet.rb', line 98
    def self.replacements_for(char)
      # cached version with original first, like a zero is used in math when you count
      (@@replacements_for ||= Hash[replacements.collect { |orig,repls| [orig, [orig] + repls] }])[char]
    end
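The cached hash built above can be illustrated in isolation. In this sketch, `build_lookup` and `SAMPLE_LOOKUP` are hypothetical names and the raw table is invented for the example; only the `Hash[... collect ...]` transformation is taken from the source.

```ruby
# Same transformation as in replacements_for: prepend the original
# character so that index 0 always means "leave this character unchanged".
def build_lookup(raw)
  Hash[raw.collect { |orig, repls| [orig, [orig] + repls] }]
end

# Hypothetical raw table (original => substitutes); not the library's data.
SAMPLE_LOOKUP = build_lookup({ 'o' => ['0'], 's' => ['5', 'z'] })

p SAMPLE_LOOKUP['s']  # => ["s", "5", "z"]
p SAMPLE_LOOKUP['q']  # => nil
```

A character with no entry returns `nil`, which is why `matches_for` falls back to `[term[pos]]` for such characters.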
Instance Method Details
#scope ⇒ Object
    # File 'lib/indexed_search/match/leet.rb', line 59
    def scope
      @scope.where(self.class.matcher_attribute => term_map.keys)
    end
#term_map ⇒ Object
Maps potential matches back to the search query term(s) that produce them.
    # File 'lib/indexed_search/match/leet.rb', line 64
    def term_map
      @term_map ||= Hash.new { |hash,key| hash[key] = [] }.tap do |map|
        term_matches.each { |term| self.class.matches_for(term).each { |match| map[match] << term } }
      end
    end
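The inversion performed by `term_map` can be sketched without the rest of the class. Here `SAMPLE_EXPANSIONS` is a hypothetical stand-in for the output of `matches_for`, and `build_term_map` is a hypothetical helper; only the `Hash.new { ... }.tap` pattern mirrors the source.

```ruby
# Hypothetical stand-in for Leet.matches_for: a fixed expansion per term.
SAMPLE_EXPANSIONS = {
  'let' => ['let', '1et', 'l3t'],
  '1et' => ['1et']
}.freeze

# Inverts term => matches into match => terms. The default block creates
# an empty array the first time a match is seen, so << always has a target.
def build_term_map(terms, expansions = SAMPLE_EXPANSIONS)
  Hash.new { |hash, key| hash[key] = [] }.tap do |map|
    terms.each do |term|
      expansions.fetch(term).each { |match| map[match] << term }
    end
  end
end

map = build_term_map(['let', '1et'])
p map['1et']  # => ["let", "1et"]
p map['l3t']  # => ["let"]
```

One consequence of the default block is that reading a key that was never added stores an empty array under that key, so checks such as `map.key?(candidate)` are safer than `map[candidate]` when the map should stay unchanged.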