Class: KeywordLinker
- Inherits: Object
- Defined in: lib/keyword_linker.rb
Overview
Given a set of keywords and URLs, and optionally HTML attributes to set on the links, KeywordLinker takes text and adds hyperlinks from the specified keywords to their associated URLs. Example:
linker = KeywordLinker.new
linker.add_url('http://www.latimes.com', 'Los Angeles Times')
linker.link_text("Let's check out the Los Angeles Times!")
=> "Let's check out the <a href=\"http://www.latimes.com\">Los Angeles Times</a>!"
KeywordLinker depends on hpricot for parsing HTML. This is done to prevent hyperlinks from being added inside of other hyperlinks and inside of attribute text.
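To illustrate why the HTML is parsed before linking, here is a minimal sketch. It is not the library's implementation: it splits the input into tag and text tokens with a regular expression (a simplification standing in for hpricot, and it will mishandle a literal ">" inside an attribute value), then links a keyword only in text that sits outside existing anchors. The method name link_outside_anchors is hypothetical.

```ruby
# Hypothetical helper: links the first occurrence of +keyword+ that is
# neither inside an existing <a>...</a> element nor inside a tag's attributes.
def link_outside_anchors(html, keyword, url)
  depth = 0        # how many <a> elements we are currently inside
  linked = false   # link at most one occurrence

  # The capture group makes split keep the tags as separate tokens.
  html.split(/(<[^>]+>)/).map do |token|
    if token.start_with?('<')
      # Tag token: attribute text lives here, so it is never modified.
      depth += 1 if token =~ /\A<a[\s>]/i
      depth -= 1 if token =~ /\A<\/a\s*>/i && depth > 0
      token
    elsif depth.zero? && !linked && token.include?(keyword)
      linked = true
      token.sub(keyword) { %(<a href="#{url}">#{keyword}</a>) }
    else
      token
    end
  end.join
end

html = '<a href="/x" title="Los Angeles Times">Los Angeles Times</a> vs. the Los Angeles Times'
puts link_outside_anchors(html, 'Los Angeles Times', 'http://www.latimes.com')
```

Only the final, unlinked occurrence receives a new anchor; the title attribute and the existing link text are left alone.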
Constant Summary
- @@blacklist_strategy = Object.new
Instance Method Summary
- #add_url(url, keyword, html_attributes = {}) ⇒ Object
  Takes a URL and a keyword String or Array of keywords, and adds them to the tree of keywords in the KeywordLinker.
- #blacklist_keyword(keyword) ⇒ Object
  Blacklists this keyword or array of keywords.
- #init_tree ⇒ Object
  Initializes the tree after all URLs have been added.
- #initialize(*lookups) ⇒ KeywordLinker (constructor)
  Takes an optional array of lookup objects.
- #link_text(text) ⇒ Object
  Adds links to known URLs into the text provided.
- #process(text) ⇒ Object
  Returns an array of matches in the specified text.
Constructor Details
#initialize(*lookups) ⇒ KeywordLinker
Takes an optional array of lookup objects. A lookup object is anything that responds to the process method and returns an array of Match objects, including KeywordLinker, KeywordProspector, and LookupChain objects. If multiple objects are specified, a LookupChain is created that gives highest priority to matches from objects closer to the end of the array.
# File 'lib/keyword_linker.rb', line 38
def initialize(*lookups)
  @tree_initialized = true
  if lookups
    @lookup = LookupChain.new(lookups)
  end
end
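The lookup contract described above is duck-typed: any object with a process(text) method that returns match objects can participate. The sketch below shows one way priority toward the end of the array could work. Everything in it is hypothetical: the Match struct and its fields, StaticLookup, and SimpleChain are stand-ins written for illustration, not the library's Match or LookupChain classes, and the real chain may resolve priority differently.

```ruby
# Hypothetical stand-in for a match object (field names are assumptions).
Match = Struct.new(:keyword, :start_idx, :end_idx, :output)

# A lookup is anything that responds to process(text).
class StaticLookup
  def initialize(table) # keyword => output
    @table = table
  end

  def process(text)
    @table.flat_map do |keyword, output|
      matches = []
      offset = 0
      while (idx = text.index(keyword, offset))
        matches << Match.new(keyword, idx, idx + keyword.length, output)
        offset = idx + 1
      end
      matches
    end
  end
end

# Hypothetical chain: lookups later in the array win overlapping matches.
class SimpleChain
  def initialize(lookups)
    @lookups = lookups
  end

  def process(text)
    kept = []
    # Walk from highest priority (last) to lowest (first).
    @lookups.reverse_each do |lookup|
      higher = kept.dup
      lookup.process(text).each do |m|
        kept << m unless higher.any? { |h| overlap?(m, h) }
      end
    end
    kept.sort_by(&:start_idx)
  end

  private

  def overlap?(a, b)
    a.start_idx < b.end_idx && b.start_idx < a.end_idx
  end
end

general = StaticLookup.new('Times' => 'http://example.com/times')
local   = StaticLookup.new('Los Angeles Times' => 'http://www.latimes.com')
chain   = SimpleChain.new([general, local])
chain.process('Read the Los Angeles Times').each { |m| puts "#{m.keyword} -> #{m.output}" }
```

Because local comes last in the array, its match for "Los Angeles Times" displaces the overlapping "Times" match from general.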
Instance Method Details
#add_url(url, keyword, html_attributes = {}) ⇒ Object
Takes a URL and a keyword String or Array of keywords, and adds them to the tree of keywords in the KeywordLinker. Takes an optional hash of HTML attributes to be associated with this URL.
Each URL is linked at most once. If multiple keywords are specified, only the first occurrence of any of the keywords is linked to the target URL. That is, if multiple keywords match for this URL, only one instance of one keyword will be linked.
# File 'lib/keyword_linker.rb', line 54
def add_url(url, keyword, html_attributes={})
  init_lookup

  strategy = HyperlinkStrategy.new(url, html_attributes)
  strategy.keywords = keyword
  @dl.add(strategy)
end
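The "one link per URL" rule can be pictured with a small standalone sketch. It does not use KeywordLinker or HyperlinkStrategy; link_first is a hypothetical helper that assumes case-sensitive matching and writes attribute values without escaping, which may differ from what the library does.

```ruby
# Hypothetical helper: link only the first occurrence of any of the keywords.
def link_first(text, url, keywords, html_attributes = {})
  # Array() accepts a single String or an Array; longer keywords are tried first.
  pattern = Regexp.union(Array(keywords).sort_by { |k| -k.length })
  # Attribute values are inserted as-is (no escaping) to keep the sketch short.
  attrs = html_attributes.map { |name, value| %( #{name}="#{value}") }.join
  # String#sub replaces the first match only, so later occurrences stay plain.
  text.sub(pattern) { |match| %(<a href="#{url}"#{attrs}>#{match}</a>) }
end

puts link_first('The Times and the LA Times', 'http://www.latimes.com',
                ['LA Times', 'Times'], 'rel' => 'nofollow')
```

Both keywords point at the same URL, so once "Times" has been linked, the later "LA Times" is left untouched.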
#blacklist_keyword(keyword) ⇒ Object
Blacklists this keyword or array of keywords. A blacklisted keyword will not be linked. For example, if the “Los Angeles” part of “Los Angeles Times” is getting linked, you can blacklist “Los Angeles Times” to keep that phrase, including its “Los Angeles” part, from being linked.
# File 'lib/keyword_linker.rb', line 67
def blacklist_keyword(keyword)
  init_lookup

  @dl.add(keyword, @@blacklist_strategy)
end
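The interaction between blacklisting and longest-match selection can be sketched without the library. In this hypothetical example, a sentinel object plays a role similar in spirit to @@blacklist_strategy: a blacklisted phrase still matches, and because it is the longest match it wins the overlap, but it is emitted unchanged. BLACKLISTED and link_with_blacklist are assumptions made for illustration, and unlike KeywordLinker this sketch links every occurrence rather than only the first.

```ruby
# Sentinel marking a keyword as "match it, but do not link it".
BLACKLISTED = Object.new

# outputs maps each keyword to a URL or to the BLACKLISTED sentinel.
def link_with_blacklist(text, outputs)
  # Regexp alternation is ordered, so sorting longest-first makes the
  # longest keyword win at each position.
  pattern = Regexp.union(outputs.keys.sort_by { |k| -k.length })
  text.gsub(pattern) do |match|
    output = outputs[match]
    output.equal?(BLACKLISTED) ? match : %(<a href="#{output}">#{match}</a>)
  end
end

outputs = {
  'Los Angeles'       => 'http://example.com/la', # placeholder URL
  'Los Angeles Times' => BLACKLISTED
}
puts link_with_blacklist('Los Angeles Times in Los Angeles', outputs)
```

The first phrase is consumed by the blacklisted keyword and stays plain; the standalone "Los Angeles" at the end is still linked.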
#init_tree ⇒ Object
Initializes the tree after all URLs have been added. This needs to be called only once. If you don’t call init_tree, it will be called automatically on the first call to the process or link_text method. With a large set of links, that can be inconvenient if it happens on the first request to your application, so you may prefer to call init_tree explicitly beforehand. Adding URLs after calling init_tree, process, or link_text is not supported.
# File 'lib/keyword_linker.rb', line 79
def init_tree
  unless @tree_initialized
    @dl.construct_fail
    @tree_initialized = true
  end
end
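The build-once guard above can be illustrated with a small stand-in class. LazyIndex is hypothetical and uses a compiled regular expression in place of the library's keyword tree; raising when a keyword is added after the build is a choice made for this sketch, since the documentation only says that case is unsupported.

```ruby
# Hypothetical stand-in showing an idempotent, optionally eager build step.
class LazyIndex
  attr_reader :build_count

  def initialize
    @keywords = []
    @index = nil
    @build_count = 0
  end

  def add(keyword)
    raise 'cannot add keywords after the index is built' if @index
    @keywords << keyword
  end

  # Safe to call repeatedly: the build runs at most once.
  def init_tree
    return if @index
    @index = Regexp.union(@keywords.sort_by { |k| -k.length })
    @build_count += 1
  end

  def process(text)
    init_tree # lazy fallback if the caller never warmed up the index
    text.scan(@index)
  end
end

index = LazyIndex.new
index.add('Los Angeles Times')
index.add('Times')
index.init_tree # pay the build cost up front, e.g. at application start
p index.process('Los Angeles Times')
p index.build_count
```

Calling init_tree during startup moves the one-time cost away from the first request, and later calls to process skip the build.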
#link_text(text) ⇒ Object
Adds links to known URLs into the text provided. Only the first instance of each keyword or set of keywords associated with a URL is linked. In cases of overlap, the longest keyword is chosen to resolve the overlap.
# File 'lib/keyword_linker.rb', line 89
def link_text(text)
  init_tree unless @tree_initialized

  linked_outputs = Set.new

  htext = Hpricot(text)

  link_text_in_elem(htext, linked_outputs)

  return htext.to_s
end
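The two selection rules described for link_text (longest keyword wins an overlap, and each URL is linked once) can be sketched as a pure function over match positions. The Hit struct, its fields, and select_hits are hypothetical, and the order in which the library applies the two rules is an assumption here.

```ruby
require 'set'

# Hypothetical match record: half-open range [start_idx, end_idx) plus target URL.
Hit = Struct.new(:start_idx, :end_idx, :url)

def select_hits(hits)
  # Rule 1: consider longer hits first and drop anything overlapping a kept hit.
  chosen = []
  hits.sort_by { |h| [-(h.end_idx - h.start_idx), h.start_idx] }.each do |h|
    overlaps = chosen.any? { |c| h.start_idx < c.end_idx && c.start_idx < h.end_idx }
    chosen << h unless overlaps
  end

  # Rule 2: in reading order, keep only the first hit for each URL.
  # Set#add? returns nil when the URL was already seen.
  seen = Set.new
  chosen.sort_by(&:start_idx).select { |h| seen.add?(h.url) }
end

hits = [
  Hit.new(0, 11, '/la'),     # "Los Angeles"
  Hit.new(0, 17, '/lat'),    # "Los Angeles Times"
  Hit.new(12, 17, '/times'), # "Times" inside the phrase above
  Hit.new(30, 35, '/times'), # a later standalone "Times"
  Hit.new(40, 57, '/lat')    # second "Los Angeles Times"
]
select_hits(hits).each { |h| puts "#{h.start_idx}...#{h.end_idx} -> #{h.url}" }
```

The shorter hits nested inside the first phrase are discarded, the standalone "Times" survives, and the repeated phrase is skipped because its URL has already been linked.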
#process(text) ⇒ Object
Returns an array of matches in the specified text. Doesn’t filter overlaps or parse HTML to prevent matches in attribute text or inside of existing hyperlinks. Primarily for internal use.
# File 'lib/keyword_linker.rb', line 104
def process(text)
  init_tree unless @tree_initialized

  @lookup.process(text)
end