Class: KeywordSearcher
- Inherits: Object (Object > KeywordSearcher)
- Defined in: lib/ppr/keyword_searcher.rb
Overview
Tool for looking for keywords within a string.
Instance Method Summary
- #[](keyword) ⇒ Object
  Gets the object corresponding to keyword.
- #[]=(keyword, object) ⇒ Object
  Adds a keyword to the searcher, associated with an object.
- #each_in(text) ⇒ Object
  Searches for each keyword inside text and, for each one found, applies the block to the corresponding object and the range in the string where it was found.
- #find(text, skip = []) ⇒ Object
  Searches for a keyword inside text and, if one is found, returns the corresponding object with the range in the string where it was found.
- #initialize(separator = "") ⇒ KeywordSearcher (constructor)
  Creates a new keyword searcher, where words are delimited by the separator regular expression.
- #to_h ⇒ Object
  Converts to a hash: actually returns self.
Constructor Details
#initialize(separator = "") ⇒ KeywordSearcher
Creates a new keyword searcher, where words are delimited by the separator regular expression.
# File 'lib/ppr/keyword_searcher.rb', line 11
def initialize(separator = "")
    # Checks and set the separator.
    @separator = Regexp.new(separator).to_s

    # Initialize the inner map.
    @map = {}
    # Initialize the list of keywords
    @keywords = []
    # Initialize the keyword extraction regular expression
    @keyword_extract = //
end
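The constructor keeps `Regexp.new(separator).to_s` rather than the Regexp object. A minimal sketch of what that conversion produces in plain Ruby, assuming a `\W` separator chosen only for illustration (the variable name is not from the library):

```ruby
# Illustration only: the "\\W" separator is an assumed example value.
separator = Regexp.new("\\W").to_s

# Regexp#to_s wraps the source in a non-capturing group with explicit flags,
# so the string can be concatenated safely into a larger pattern.
puts separator   # prints (?-mix:\W)

# Building the Regexp first also rejects an invalid pattern early.
begin
  Regexp.new("(")
rescue RegexpError => e
  puts "rejected: #{e.class}"
end
```

With the default `""` separator the same conversion gives an empty group, which matches at any position.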
Instance Method Details
#[](keyword) ⇒ Object
Gets the object corresponding to keyword.
# File 'lib/ppr/keyword_searcher.rb', line 46
def [](keyword)
    return @map[keyword.to_s]
end
#[]=(keyword, object) ⇒ Object
Adds a keyword to the searcher, associated with an object.
# File 'lib/ppr/keyword_searcher.rb', line 30
def []=(keyword,object)
    # Ensure the keyword is a valid string.
    keyword = keyword.to_str
    unless /^[A-Za-z_]\w*$/.match(keyword)
        raise "Invalid string for a keyword: #{keyword}."
    end
    # Update the map.
    @map[keyword] = object
    # Get the keywords sorted in reverse order (used for building the
    # searching regular expressions).
    @keywords = @map.keys.sort!.reverse!
    # Update the searching regular expression.
    @keyword_extract = Regexp.new(@keywords.join("|"))
end
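The reverse sort matters because a Ruby alternation takes the first alternative that matches at a given position. The sketch below uses made-up keywords (`foo`, `foobar`, `bar`) to compare the reverse-sorted order with an ascending one; it shows the general regexp behaviour and is not code from the library.

```ruby
# Made-up keywords for illustration.
keywords = %w[foo foobar bar]

# Same ordering step as the setter: sort, then reverse.
ordered = keywords.sort.reverse            # ["foobar", "foo", "bar"]
longest_first = Regexp.new(ordered.join("|"))

# Ascending order, for comparison only.
shortest_first = Regexp.new(keywords.sort.join("|"))

text = "a foobar b"
puts longest_first.match(text)[0]    # prints foobar
puts shortest_first.match(text)[0]   # prints foo
```

Reverse lexicographic order places any keyword after its own prefixes' longer forms, so `foobar` is tried before `foo`.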
#each_in(text) ⇒ Object
Searches for each keyword inside text and, for each one found, applies the block to the corresponding object and the range in the string where it was found.
Returns an enumerator if no block is given.
NOTE: a keyword contained within a longer one is ignored.
# File 'lib/ppr/keyword_searcher.rb', line 89
def each_in(text)
    return to_enum(:each_in,text) unless block_given?
    # Check and clone the text to avoid side effects.
    text = text.to_s.clone
    # Look for a first keyword.
    macro,range = find(text)
    while macro do
        # Delete the range from the text.
        text[range] = " " * (range.last-range.first+1)
        # Apply the block
        yield(macro,range)
        # Look for the next macro if any
        # print "text = #{text}\n"
        macro,range = find(text)
    end
end
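The loop in `each_in` is a find, mask, repeat pattern: each match is overwritten with spaces of the same length so the ranges of later matches still refer to positions in the original text. A stand-alone sketch of that pattern follows; `keyword_ranges` is a hypothetical helper, it uses `Regexp.union` in place of the library's own `find`, and it returns the keyword strings rather than mapped objects.

```ruby
# Hypothetical helper for illustration; it is not part of the ppr library.
def keyword_ranges(text, keywords)
  text = text.to_s.dup                         # work on a copy, as each_in does
  rexp = Regexp.union(keywords.sort.reverse)   # longer keywords first
  found = []
  while (m = rexp.match(text))
    range = m.begin(0)..(m.end(0) - 1)
    # Overwrite the match with spaces so later offsets stay valid.
    text[range] = " " * range.size
    found << [m[0], range]
  end
  found
end

p keyword_ranges("foo bar foo", %w[foo bar])
# prints [["foo", 0..2], ["bar", 4..6], ["foo", 8..10]]
```

Masking with spaces instead of deleting the match keeps the string length constant, which is what lets every reported range index into the caller's unmodified text.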
#find(text, skip = []) ⇒ Object
Searches for a keyword inside text and, if one is found, returns the corresponding object with the range in the string where it was found.
If a keyword is in skip, it is ignored.
NOTE: only the first match is returned.
# File 'lib/ppr/keyword_searcher.rb', line 56
def find(text,skip = [])
    # print "skip=#{skip} @keywords=#{@keywords}\n"
    # Compute the regular expression for finding the keywords.
    rexp = Regexp.new( (@keywords - skip).map! do |k|
        @separator + k + @separator
    end.join("|") )
    # print "find with @rexp=#{@rexp}\n"
    # Look for the first keyword.
    matched = rexp.match(text)
    # Isolate the keyword from the separators.
    # found = @keywords.match(matched.to_s)
    found = @keyword_extract.match(matched.to_s)
    if found then
        found = found.to_s
        # A keyword is found, adjust the range and
        # return it with the corresponding object.
        range = matched.offset(0)
        range[0] += matched.to_s.index(found)
        range[1] = range[0] + found.size - 1
        return [ @map[found], range[0]..range[1] ]
    else
        # A keyword is not found.
        return nil
    end
end
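Because the search pattern wraps each keyword in the separator, the raw match can include separator characters, and `find` narrows the range down to the keyword itself. The sketch below replays that range arithmetic in plain Ruby, assuming a `\W` separator and a single keyword `foo`, both picked only for illustration.

```ruby
separator = Regexp.new("\\W").to_s   # assumed example separator
keyword   = "foo"                    # assumed example keyword
rexp      = Regexp.new(separator + keyword + separator)

text    = "x = foo + 1"
matched = rexp.match(text)

# offset(0) covers the separators as well: here it spans " foo ".
first, _last = matched.offset(0)
# Shift the start to where the keyword sits inside the raw match.
first += matched.to_s.index(keyword)
range = first..(first + keyword.size - 1)

p range        # prints 4..6
p text[range]  # prints "foo"

# A separator that must consume a character cannot match at the string edge.
p rexp.match("foo + 1")   # prints nil
```

The last line suggests that, with a consuming separator such as `\W`, a keyword at the very start or end of the text would go unmatched; the default empty separator does not have this limitation.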
#to_h ⇒ Object
Converts to a hash: actually returns self.
NOTE: for duck typing purposes.
# File 'lib/ppr/keyword_searcher.rb', line 25
def to_h
    return self
end