Class: Ubi::Aranea
- Inherits:
- Object
  - Object
  - Ubi::Aranea
- Defined in:
- lib/ubi/aranea.rb
Overview
Base for araneas (spiders)
Constant Summary
Instance Attribute Summary
- #datum ⇒ Object
  storage: MemoryStore.
- #thema ⇒ Object
  storage: MemoryStore.
- #url ⇒ Object
  storage: MemoryStore.
Instance Method Summary
- #crawl! ⇒ Object
- #initialize(thema, url, opts = {}) ⇒ Aranea (constructor)
  A new instance of Aranea.
- #parser(chunk) ⇒ Object
Constructor Details
#initialize(thema, url, opts = {}) ⇒ Aranea
Returns a new instance of Aranea.
    # File 'lib/ubi/aranea.rb', line 15
    def initialize(thema, url, opts = {})
      @thema = thema
      @url = url
      @opts = opts
    end
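A minimal usage sketch of the constructor shown above. `AraneaSketch` is a hypothetical stand-in that mirrors only the documented constructor and attributes, so the snippet runs without the library installed; the theme name, URL, and option key are illustrative assumptions, and the attributes are assumed to be plain accessors.

```ruby
# Hypothetical stand-in mirroring the documented constructor of Ubi::Aranea.
# It is not the library class; it only reproduces the signature shown above.
class AraneaSketch
  # Assumption: the documented attributes are ordinary accessors.
  attr_accessor :datum, :thema, :url

  def initialize(thema, url, opts = {})
    @thema = thema
    @url = url
    @opts = opts
  end
end

# Illustrative values only; the option key is an assumption.
spider = AraneaSketch.new('acme', 'http://example.com', { depth_limit: 1 })

spider.thema # the first argument, returned unchanged
spider.url   # the second argument, returned unchanged
spider.datum # nil, because the constructor never assigns @datum
```

Note that `datum` is not set by the constructor, so it stays `nil` until something else assigns it.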
Instance Attribute Details
#datum ⇒ Object
storage: MemoryStore
    # File 'lib/ubi/aranea.rb', line 13
    def datum
      @datum
    end
#thema ⇒ Object
storage: MemoryStore
    # File 'lib/ubi/aranea.rb', line 13
    def thema
      @thema
    end
#url ⇒ Object
storage: MemoryStore
    # File 'lib/ubi/aranea.rb', line 13
    def url
      @url
    end
Instance Method Details
#crawl! ⇒ Object
    # File 'lib/ubi/aranea.rb', line 23
    def crawl!
      Polipus.crawler(name, url, OPTIONS.merge(@opts)) do |crawler|
        # In-place page processing
        crawler.on_page_downloaded do |page|
          # A nokogiri object
          puts "'#{page.doc.css('title').text}' (#{page.url})"
        end
      end
    end
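The `OPTIONS.merge(@opts)` call in `crawl!` relies on `Hash#merge`, where keys in the argument take precedence over keys in the receiver, so per-instance options override the class-level defaults. The sketch below illustrates that precedence only; `DEFAULT_OPTIONS`, its keys, and `effective_options` are hypothetical, since the actual contents of `OPTIONS` are not shown on this page.

```ruby
# Hypothetical defaults standing in for the OPTIONS constant,
# whose real value is not shown in this documentation.
DEFAULT_OPTIONS = { workers: 1, depth_limit: 2 }.freeze

# Hypothetical helper: returns a new hash in which overrides win over defaults.
def effective_options(overrides = {})
  DEFAULT_OPTIONS.merge(overrides)
end

merged = effective_options({ depth_limit: 5 })
# merged keeps :workers from the defaults and takes :depth_limit from the
# overrides; DEFAULT_OPTIONS itself is left unchanged.
```

Because `merge` returns a new hash rather than mutating the receiver, the defaults can safely be frozen and shared across instances.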
#parser(chunk) ⇒ Object
    # File 'lib/ubi/aranea.rb', line 33
    def parser(chunk)
      Nokogiri::HTML(chunk)
    end