Class: Wgit::Base
Overview
A class to inherit from, as an alternative to using the Wgit::DSL directly.
All subclasses must define a #parse(doc, &block) method.
Constant Summary
Constants included from DSL
Class Method Summary
- .mode(method) ⇒ Object
  Sets the crawl/index method to call when Base.run is called.
- .run(&block) ⇒ Object
  Runs the crawl/index, passing each crawled Wgit::Document and the given block to the subclass's #parse method.
Instance Method Summary
- #setup ⇒ Object
  Runs once before the crawl/index is run.
- #teardown ⇒ Object
  Runs once after the crawl/index is complete.
Methods included from DSL
crawl, crawl_site, empty_db!, extract, follow, index, index_site, index_www, last_response, reset, search, start, use_crawler, use_database
Class Method Details
.mode(method) ⇒ Object
Sets the crawl/index method to call when Base.run is called.
The mode method must match one defined in the Wgit::Crawler or Wgit::Indexer class.
    # File 'lib/wgit/base.rb', line 35
    def self.mode(method)
      @method = method
    end
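The listing above stores the mode in `@method` inside a class method, which makes it an instance variable of the class object itself. The sketch below illustrates what that implies in plain Ruby. `ModeSketch` is a hypothetical stand-in, not Wgit::Base, and `selected_mode` is a reader invented for this demo that mirrors the `@method || :crawl` fallback used by .run. `:crawl_site` is taken from the DSL method list on this page.

```ruby
# Hypothetical stand-in for Wgit::Base; it only mimics how .mode stores its value.
class ModeSketch
  def self.mode(method)
    @method = method # stored on the class object that receives the call
  end

  # Invented for this demo: mirrors the `@method || :crawl` fallback in .run.
  def self.selected_mode
    @method || :crawl
  end
end

class SiteScraper < ModeSketch
  mode :crawl_site
end

class DefaultScraper < ModeSketch; end # never calls .mode
class ChildScraper < SiteScraper; end  # inherits the methods, not the ivar

puts SiteScraper.selected_mode    # prints "crawl_site"
puts DefaultScraper.selected_mode # prints "crawl"
puts ChildScraper.selected_mode   # prints "crawl"
```

Because class-level instance variables are not inherited in Ruby, a subclass of a subclass would presumably need to call `mode` again if the same behaviour holds for Wgit::Base.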
.run(&block) ⇒ Object
Runs the crawl/index, passing each crawled Wgit::Document and the given block to the subclass's #parse method.
    # File 'lib/wgit/base.rb', line 15
    def self.run(&block)
      crawl_method = @method || :crawl
      obj = new

      unless obj.respond_to?(:parse)
        raise "#{obj.class} must respond_to? #parse(doc, &block)"
      end

      obj.setup
      send(crawl_method) { |doc| obj.parse(doc, &block) }
      obj.teardown

      obj
    end
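To make the call order in .run concrete, here is a minimal, self-contained sketch that needs neither the wgit gem nor network access. `MiniBase` is a hypothetical stand-in that copies the body of .run from the listing above. Its `crawl` class method is a stub that yields fixed strings where the real DSL would yield crawled Wgit::Document objects. `TitleScraper`, `NoParse` and the `events` log are invented for illustration.

```ruby
# Hypothetical stand-in for Wgit::Base: same .run body, stubbed crawl.
class MiniBase
  def setup; end
  def teardown; end

  # Stub (an assumption): yields plain strings instead of Wgit::Document objects.
  def self.crawl
    %w[page-1 page-2].each { |doc| yield doc }
  end

  def self.run(&block)
    crawl_method = @method || :crawl
    obj = new

    unless obj.respond_to?(:parse)
      raise "#{obj.class} must respond_to? #parse(doc, &block)"
    end

    obj.setup
    send(crawl_method) { |doc| obj.parse(doc, &block) }
    obj.teardown

    obj
  end
end

class TitleScraper < MiniBase
  attr_reader :events

  def setup
    @events = [:setup] # runs once, before any document is parsed
  end

  def parse(doc)
    @events << doc
    yield doc.upcase # hands a result to the block given to .run
  end

  def teardown
    @events << :teardown # runs once, after the last document
  end
end

results = []
scraper = TitleScraper.run { |title| results << title }

p results        # prints ["PAGE-1", "PAGE-2"]
p scraper.events # prints [:setup, "page-1", "page-2", :teardown]

# A subclass without #parse fails before setup is ever called.
class NoParse < MiniBase; end

error_message = begin
  NoParse.run
rescue RuntimeError => e
  e.message
end
puts error_message # prints "NoParse must respond_to? #parse(doc, &block)"
```

Note that .run returns the subclass instance, so any state collected in `setup`, `parse` or `teardown` remains available to the caller afterwards.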
Instance Method Details
#setup ⇒ Object
Runs once before the crawl/index is run. Override as needed.
    # File 'lib/wgit/base.rb', line 8
    def setup; end
#teardown ⇒ Object
Runs once after the crawl/index is complete. Override as needed.
    # File 'lib/wgit/base.rb', line 11
    def teardown; end