Class: Grubby::Scraper
Inherits: Object
- Object
- Grubby::Scraper
Defined in: lib/grubby/scraper.rb
Direct Known Subclasses
Defined Under Namespace
Classes: Error
Instance Attribute Summary
- #source ⇒ Object (readonly)
  The source being scraped.
Class Method Summary
- .fields ⇒ Array<Symbol>
  The names of all scraped values, as defined by Scraper.scrapes.
- .scrapes(field, optional: false) { ... } ⇒ Object
  Defines an attribute reader method named by field.
Instance Method Summary
- #[](field) ⇒ Object
  Returns the scraped value named by field.
- #initialize(source) ⇒ Scraper (constructor)
  A new instance of Scraper.
- #to_h ⇒ Hash<Symbol, Object>
  Returns all scraped values as a Hash.
Constructor Details
#initialize(source) ⇒ Scraper
Returns a new instance of Scraper.
    # File 'lib/grubby/scraper.rb', line 60

    def initialize(source)
      @source = source
      @scraped = {}
      @errors = {}

      self.class.fields.each do |field|
        begin
          self.send(field)
        rescue RuntimeError
        end
      end

      unless @errors.empty?
        listing = @errors.map do |field, error|
          error_class = " (#{error.class})" unless error.class == RuntimeError
          error_trace = error.backtrace.join("\n").indent(2)
          "* #{field} -- #{error.message}#{error_class}\n#{error_trace}"
        end
        raise Error.new("Failed to scrape the following fields:\n#{listing.join("\n")}")
      end
    end
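The constructor listing above calls every field reader, records failures, and raises a single Error that lists all of them. The following is a minimal sketch of that behavior, not the library's code: SketchScraper is a hypothetical stand-in written so the example runs without the grubby gem, its Error is assumed to be a plain StandardError subclass, and the backtrace formatting from the real listing is omitted. ArticleScraper and its fields are also hypothetical, and a Hash stands in for the source.

```ruby
# Hypothetical stand-in that mirrors the listings on this page.
class SketchScraper
  class Error < StandardError; end # assumption: the real superclass is not shown here

  attr_reader :source

  def self.fields
    @fields ||= []
  end

  def self.scrapes(field, optional: false, &block)
    field = field.to_sym
    fields << field
    define_method(field) do
      return @scraped[field] if @scraped.key?(field)
      unless @errors.key?(field)
        begin
          value = instance_eval(&block)
          raise "`#{field}` cannot be nil" if value.nil? && !optional
          @scraped[field] = value
        rescue RuntimeError => e
          @errors[field] = e
        end
      end
      raise "`#{field}` raised a #{@errors[field].class}" if @errors.key?(field)
      @scraped[field]
    end
  end

  def initialize(source)
    @source = source
    @scraped = {}
    @errors = {}
    self.class.fields.each do |field|
      begin
        send(field)
      rescue RuntimeError
        # The failure is already recorded in @errors; keep going.
      end
    end
    return if @errors.empty?
    listing = @errors.map { |field, error| "* #{field} -- #{error.message}" }
    raise Error, "Failed to scrape the following fields:\n#{listing.join("\n")}"
  end
end

# Hypothetical scraper whose third field depends on the first two.
class ArticleScraper < SketchScraper
  scrapes(:title) { source[:title] }
  scrapes(:author) { source[:author] }
  scrapes(:byline) { "#{title} by #{author}" }
end

error = begin
  ArticleScraper.new({}) # empty source: every field fails
  nil
rescue SketchScraper::Error => e
  e
end
puts error.message
```

All three fields are reported in one exception, and the dependent field byline is reported with the message raised by the title reader rather than a nil error of its own. Note that only RuntimeError is rescued, so an exception of another class (for example a KeyError from Hash#fetch) would propagate out of the constructor immediately.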
Instance Attribute Details
#source ⇒ Object (readonly)
Returns the source being scraped. Typically a Mechanize pluggable parser such as Mechanize::Page.
    # File 'lib/grubby/scraper.rb', line 55

    def source
      @source
    end
Class Method Details
.fields ⇒ Array<Symbol>
Returns the names of all scraped values, as defined by scrapes.
    # File 'lib/grubby/scraper.rb', line 48

    def self.fields
      @fields ||= []
    end
.scrapes(field, optional: false) { ... } ⇒ Object
Defines an attribute reader method named by field. During #initialize, the given block is evaluated in the context of the scraper instance, and the attribute is set to the block’s return value. By default, if the block’s return value is nil, an exception is raised. To allow nil values, set optional to true.
    # File 'lib/grubby/scraper.rb', line 20

    def self.scrapes(field, optional: false, &block)
      field = field.to_sym
      self.fields << field

      define_method(field) do
        return @scraped[field] if @scraped.key?(field)

        unless @errors.key?(field)
          begin
            value = instance_eval(&block)
            if value.nil?
              raise "`#{field}` cannot be nil" unless optional
              $log.debug("Scraped nil value for #{self.class}##{field}")
            end
            @scraped[field] = value
          rescue RuntimeError => e
            @errors[field] = e
          end
        end

        raise "`#{field}` raised a #{@errors[field].class}" if @errors.key?(field)

        @scraped[field]
      end
    end
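As a rough illustration of how scrapes might be used, the sketch below declares two required fields and one optional field. SketchScraper is a hypothetical, trimmed stand-in for Grubby::Scraper (no error collection, no logging) so that the example runs without the gem; with the gem installed, the subclass would presumably inherit from Grubby::Scraper instead. ProductScraper, its field names, and the Hash used as the source are all assumptions for illustration.

```ruby
# Hypothetical stand-in: just enough of the listing above to run the example.
class SketchScraper
  attr_reader :source

  def self.fields
    @fields ||= []
  end

  def self.scrapes(field, optional: false, &block)
    field = field.to_sym
    fields << field
    define_method(field) do
      return @scraped[field] if @scraped.key?(field) # memoized value
      value = instance_eval(&block) # block runs with the instance as self
      raise "`#{field}` cannot be nil" if value.nil? && !optional
      @scraped[field] = value
    end
  end

  def initialize(source)
    @source = source
    @scraped = {}
    self.class.fields.each { |field| send(field) }
  end
end

# Hypothetical scraper; a Hash stands in for a Mechanize::Page source.
class ProductScraper < SketchScraper
  scrapes(:name) { source[:name].strip }
  scrapes(:slug) { name.downcase.tr(" ", "-") } # a block may call other fields
  scrapes(:discount, optional: true) { source[:discount] }
end

product = ProductScraper.new({ name: "  Red Widget " })
puts product.name       # prints "Red Widget"
puts product.slug       # prints "red-widget"
p product.discount      # prints nil (allowed because optional: true)
p ProductScraper.fields # prints [:name, :slug, :discount]
```

Because the block is run with instance_eval, it can refer to source and to other field readers directly. Each value is computed once and then served from @scraped on later calls.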
Instance Method Details
#[](field) ⇒ Object
Returns the scraped value named by field.
    # File 'lib/grubby/scraper.rb', line 88

    def [](field)
      @scraped.fetch(field.to_sym)
    end
#to_h ⇒ Hash<Symbol, Object>
Returns all scraped values as a Hash.
    # File 'lib/grubby/scraper.rb', line 95

    def to_h
      @scraped.dup
    end
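The #[] and #to_h listings above imply a few behaviors worth seeing in isolation: #[] accepts either a String or a Symbol, raises KeyError for a name that was never scraped (it uses Hash#fetch), and #to_h hands back a copy rather than the internal Hash. The sketch below uses a hypothetical SketchRecord class that copies only those two methods and takes its values directly, in place of values gathered by scraping.

```ruby
# Hypothetical stand-in holding pre-scraped values; only #[] and #to_h
# are copied from the listings above.
class SketchRecord
  def initialize(values)
    @scraped = values
  end

  def [](field)
    @scraped.fetch(field.to_sym)
  end

  def to_h
    @scraped.dup
  end
end

record = SketchRecord.new({ title: "Hello", author: nil })

puts record["title"] # a String name is converted with to_sym; prints "Hello"
p record[:author]    # a field scraped as nil is still present; prints nil

missing_raised = begin
  record[:missing]
  false
rescue KeyError
  true # unknown names raise instead of returning nil
end

copy = record.to_h
copy[:title] = "Changed" # mutating the copy leaves the record untouched
puts record[:title]      # prints "Hello"
```

The copy made by dup is shallow, so the Hash itself is independent but the values inside it are the same objects.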