Class: Zenrows::CssExtractor
- Inherits:
- Object (Object → Zenrows::CssExtractor)
- Defined in:
- lib/zenrows/css_extractor.rb
Overview
DSL for building CSS extraction rules
Provides a clean interface for defining CSS selectors to extract data from web pages using the ZenRows API.
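As a rough illustration of the DSL described above, the sketch below defines a small stand-in class, `ExtractorSketch` (a hypothetical name), that copies the method bodies listed on this page so it runs without the gem installed. The selectors and rule names are invented for the example.

```ruby
# Stand-in for Zenrows::CssExtractor, copied from the method bodies on this
# page so the example runs without the gem. The class name is hypothetical.
class ExtractorSketch
  attr_reader :rules

  def self.build(&block)
    new.tap { |e| e.instance_eval(&block) }
  end

  def initialize
    @rules = {}
  end

  def extract(name, selector, attribute: nil)
    @rules[name.to_sym] = attribute ? "#{selector} @#{attribute}" : selector
    self
  end

  def links(name, selector)
    extract(name, selector, attribute: "href")
  end

  def images(name, selector)
    extract(name, selector, attribute: "src")
  end

  def to_h
    @rules
  end
end

# Selectors and rule names below are made up for illustration.
extractor = ExtractorSketch.build do
  extract :title, "h1"
  links   :product_urls, "a.product"
  images  :thumbnails, "img.thumb"
end

extractor.to_h.each { |name, rule| puts "#{name}: #{rule}" }
# prints:
#   title: h1
#   product_urls: a.product @href
#   thumbnails: img.thumb @src
```

With the real gem, the same block would presumably be passed to `Zenrows::CssExtractor.build`.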
Instance Attribute Summary
- #rules ⇒ Hash{Symbol => String} (readonly)
  Extraction rules.
Class Method Summary
- .build {|extractor| ... } ⇒ CssExtractor
  Build extractor using DSL block.
Instance Method Summary
- #empty? ⇒ Boolean
  Check whether the extractor has no rules.
- #extract(name, selector, attribute: nil) ⇒ self
  Define extraction rule.
- #images(name, selector) ⇒ self
  Add rule for extracting src attributes.
- #initialize ⇒ CssExtractor (constructor)
  Initialize empty extractor.
- #links(name, selector) ⇒ self
  Add rule for extracting href attributes.
- #size ⇒ Integer
  Number of extraction rules.
- #to_h ⇒ Hash{Symbol => String}
  Convert to hash.
- #to_json ⇒ String
  Convert to JSON string for API.
Constructor Details
#initialize ⇒ CssExtractor
Initialize empty extractor
# File 'lib/zenrows/css_extractor.rb', line 44
def initialize
  @rules = {}
end
Instance Attribute Details
#rules ⇒ Hash{Symbol => String} (readonly)
Returns the extraction rules, keyed by rule name.
# File 'lib/zenrows/css_extractor.rb', line 33
def rules
  @rules
end
Class Method Details
.build {|extractor| ... } ⇒ CssExtractor
Build extractor using DSL block
# File 'lib/zenrows/css_extractor.rb', line 39
def self.build(&block)
  new.tap { |e| e.instance_eval(&block) }
end
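Because `.build` runs the block with `instance_eval`, `self` inside the block is the new extractor. The sketch below uses a trimmed stand-in class (`BuildSketch`, a hypothetical name) to illustrate two consequences in plain Ruby. First, rule methods can be called either without a receiver or through the block parameter. Second, local variables from the surrounding scope stay visible, while the caller's instance variables do not.

```ruby
# Trimmed stand-in for Zenrows::CssExtractor (hypothetical class name),
# reusing the .build and #extract bodies shown on this page.
class BuildSketch
  attr_reader :rules

  def self.build(&block)
    new.tap { |e| e.instance_eval(&block) }
  end

  def initialize
    @rules = {}
  end

  def extract(name, selector, attribute: nil)
    @rules[name.to_sym] = attribute ? "#{selector} @#{attribute}" : selector
    self
  end
end

# Receiver-less style: self is the extractor inside the block.
implicit = BuildSketch.build { extract :title, "h1" }

# Block-parameter style: instance_eval also yields the receiver to the block.
explicit = BuildSketch.build { |extractor| extractor.extract(:title, "h1") }

puts implicit.rules == explicit.rules # prints true

# Local variables are captured by the closure, but instance variables are
# looked up on the extractor, so the caller's @caller_selector reads as nil.
selector = "h2.subtitle"
@caller_selector = "h3"
mixed = BuildSketch.build do
  extract :subtitle, selector
  extract :from_ivar, @caller_selector.to_s # nil.to_s gives ""
end

puts mixed.rules[:subtitle]         # prints h2.subtitle
puts mixed.rules[:from_ivar].empty? # prints true
```

If a block needs values held in the caller's instance variables, copying them into locals before calling `.build` avoids the surprise.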
Instance Method Details
#empty? ⇒ Boolean
Check whether the extractor has no rules. Returns true when no rules are defined.
# File 'lib/zenrows/css_extractor.rb', line 100
def empty?
  @rules.empty?
end
#extract(name, selector, attribute: nil) ⇒ self
Define extraction rule
# File 'lib/zenrows/css_extractor.rb', line 60
def extract(name, selector, attribute: nil)
  @rules[name.to_sym] = attribute ? "#{selector} @#{attribute}" : selector
  self
end
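The stored rule format can be tried in isolation. The sketch below repeats the `#extract` body in a stand-in class (`RuleSketch`, a hypothetical name) to show three things that follow from the source above. Names are converted to symbols, a repeated name overwrites the earlier rule, and an attribute is appended to the selector as ` @attribute`.

```ruby
# Stand-in holding only the #extract logic from this page (hypothetical name).
class RuleSketch
  attr_reader :rules

  def initialize
    @rules = {}
  end

  def extract(name, selector, attribute: nil)
    @rules[name.to_sym] = attribute ? "#{selector} @#{attribute}" : selector
    self
  end
end

sketch = RuleSketch.new

# #extract returns self, so calls can be chained.
sketch.extract("title", "h1")                   # String name is stored as :title
sketch.extract(:title, "h1.headline")           # same key, so this replaces the rule
      .extract(:more, "a.more", attribute: "href")

puts sketch.rules.size    # prints 2
puts sketch.rules[:title] # prints h1.headline
puts sketch.rules[:more]  # prints a.more @href
```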
#images(name, selector) ⇒ self
Add rule for extracting src attributes
# File 'lib/zenrows/css_extractor.rb', line 79
def images(name, selector)
  extract(name, selector, attribute: "src")
end
#links(name, selector) ⇒ self
Add rule for extracting href attributes
# File 'lib/zenrows/css_extractor.rb', line 70
def links(name, selector)
  extract(name, selector, attribute: "href")
end
#size ⇒ Integer
Number of extraction rules
# File 'lib/zenrows/css_extractor.rb', line 107
def size
  @rules.size
end
#to_h ⇒ Hash{Symbol => String}
Convert to hash. Returns the internal rules hash itself, not a copy.
# File 'lib/zenrows/css_extractor.rb', line 86
def to_h
  @rules
end
#to_json ⇒ String
Convert to a JSON string for the API, with rule names as string keys
# File 'lib/zenrows/css_extractor.rb', line 93
def to_json(*)
  @rules.transform_keys(&:to_s).to_json
end
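A small sketch of the serialization step, using a stand-in class (`JsonSketch`, a hypothetical name) with the `#to_json` body from above. `Hash#to_json` comes from Ruby's `json` library, so the example requires it explicitly. How the gem hands the resulting string to the ZenRows API is not shown on this page.

```ruby
require "json"

# Stand-in with the #extract and #to_json bodies from this page
# (hypothetical class name).
class JsonSketch
  def initialize
    @rules = {}
  end

  def extract(name, selector, attribute: nil)
    @rules[name.to_sym] = attribute ? "#{selector} @#{attribute}" : selector
    self
  end

  # Symbol keys become strings before the hash is serialized.
  def to_json(*)
    @rules.transform_keys(&:to_s).to_json
  end
end

payload = JsonSketch.new
                    .extract(:title, "h1")
                    .extract(:links, "a", attribute: "href")
                    .to_json

puts payload # prints {"title":"h1","links":"a @href"}
```

Parsing the string back with `JSON.parse` yields string keys, so the result does not compare equal to the symbol-keyed hash from `#to_h`.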