Class: Webshaker::Scraper
- Inherits: Object
  - Object
  - Webshaker::Scraper
- Defined in: lib/webshaker/scraper.rb
Instance Attribute Summary
- #driver ⇒ Object (readonly): Returns the value of attribute driver.
- #options ⇒ Object (readonly): Returns the value of attribute options.
- #status_update ⇒ Object (readonly): Returns the value of attribute status_update.
- #url ⇒ Object (readonly): Returns the value of attribute url.
Class Method Summary
- .scrape(url, options = {}) ⇒ Object
Instance Method Summary
- #initialize(url, options = {}, status_update: ->(status) {}) ⇒ Scraper (constructor): A new instance of Scraper.
- #scrape ⇒ Object
Constructor Details
#initialize(url, options = {}, status_update: ->(status) {}) ⇒ Scraper
Returns a new instance of Scraper.
    # File 'lib/webshaker/scraper.rb', line 9

    def initialize(url, options = {}, status_update: ->(status) {})
      @url = url
      @options = options
      @status_update = status_update

      status_update.call(:scrape_init)

      @driver = Selenium::WebDriver.for(
        :chrome,
        options: Selenium::WebDriver::Chrome::Options.new.tap(&method(:configure))
      )
    end
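The constructor builds its Chrome options with `tap(&method(:configure))`. The `configure` method itself is not shown on this page, so the following is only a sketch of the idiom, not the library's implementation. `FakeOptions`, `Builder`, and the `--headless` argument are hypothetical stand-ins chosen for illustration.

```ruby
# Stand-in for Selenium::WebDriver::Chrome::Options (hypothetical, illustration only).
class FakeOptions
  attr_reader :args

  def initialize
    @args = []
  end

  def add_argument(arg)
    @args << arg
  end
end

# Hypothetical class that mirrors how the constructor wires up its options.
class Builder
  def build
    # tap yields the new object to the block, then returns that same object.
    # &method(:configure) converts the private instance method into the block.
    FakeOptions.new.tap(&method(:configure))
  end

  private

  # Assumed body; the real configure method is not documented here.
  def configure(opts)
    opts.add_argument("--headless")
  end
end

built = Builder.new.build
puts built.args.inspect
```

The benefit of this pattern is that the options object is created, configured, and passed along in a single expression, without a temporary local variable.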
Instance Attribute Details
#driver ⇒ Object (readonly)
Returns the value of attribute driver.
    # File 'lib/webshaker/scraper.rb', line 7

    def driver
      @driver
    end
#options ⇒ Object (readonly)
Returns the value of attribute options.
    # File 'lib/webshaker/scraper.rb', line 7

    def options
      @options
    end
#status_update ⇒ Object (readonly)
Returns the value of attribute status_update.
    # File 'lib/webshaker/scraper.rb', line 7

    def status_update
      @status_update
    end
#url ⇒ Object (readonly)
Returns the value of attribute url.
    # File 'lib/webshaker/scraper.rb', line 7

    def url
      @url
    end
Class Method Details
.scrape(url, options = {}) ⇒ Object
    # File 'lib/webshaker/scraper.rb', line 38

    def self.scrape(url, options = {})
      new(url, options).scrape
    end
Instance Method Details
#scrape ⇒ Object
    # File 'lib/webshaker/scraper.rb', line 22

    def scrape
      status_update.call(:scrape_start)

      driver.navigate.to url
      do_wait

      screenshot = driver.screenshot_as :base64
      html_content = clean_up(driver.page_source)

      driver.quit

      status_update.call(:scrape_done)

      ScrapeResult.new(screenshot, html_content)
    end
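Taken together, the constructor and #scrape call `status_update` with `:scrape_init`, `:scrape_start`, and `:scrape_done`, in that order. The sketch below reproduces only that callback sequence so it can run without a browser. `StatusDemo` is a hypothetical stand-in, not part of Webshaker, and it skips the navigation, waiting, and capture steps.

```ruby
# Hypothetical stand-in that mirrors only the status callbacks shown in the
# source listings above; it does not start Chrome or load any page.
class StatusDemo
  def initialize(status_update: ->(status) {})
    @status_update = status_update
    @status_update.call(:scrape_init)
  end

  def scrape
    @status_update.call(:scrape_start)
    # The real method navigates to the URL, waits, takes a screenshot,
    # reads the page source, and quits the driver at this point.
    @status_update.call(:scrape_done)
  end
end

# Collect every status symbol the callback receives.
events = []
StatusDemo.new(status_update: ->(status) { events << status }).scrape
puts events.inspect
```

A caller could pass a similar lambda to the real constructor to drive a progress indicator or a log line, assuming the callback contract is the one shown in the listings.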