Class: Html2rss::RequestService::PuppetCommander

Inherits:
Object
Defined in:
lib/html2rss/request_service/puppet_commander.rb

Overview

Directs the Puppeteer browser to the target website and builds the Response.

Constant Summary

BROWSER_UNSAFE_HEADERS =

%w[
  host connection content-length transfer-encoding
  sec-fetch-dest sec-fetch-mode sec-fetch-site sec-fetch-user
  upgrade-insecure-requests
].to_set.freeze
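
This section does not show where the set is consulted. As a minimal sketch, assuming it is used to drop caller-supplied headers that the browser has to control itself, the filtering could look like this. `safe_browser_headers` is a hypothetical helper, not a method of this class.

```ruby
require 'set'

BROWSER_UNSAFE_HEADERS = %w[
  host connection content-length transfer-encoding
  sec-fetch-dest sec-fetch-mode sec-fetch-site sec-fetch-user
  upgrade-insecure-requests
].to_set.freeze

# Hypothetical helper: keeps only headers whose lowercased name is not in the set.
def safe_browser_headers(headers)
  headers.reject { |name, _value| BROWSER_UNSAFE_HEADERS.include?(name.to_s.downcase) }
end

headers = { 'Host' => 'example.com', 'Accept-Language' => 'en', 'Connection' => 'keep-alive' }
p safe_browser_headers(headers)
```

Lowercasing the name before the lookup matters because the set stores lowercase names while HTTP header names are case-insensitive.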

Constructor Details

#initialize(ctx, browser, skip_request_resources: %w[stylesheet image media font].to_set, referer: [ctx.url.scheme, ctx.url.host].join('://')) ⇒ PuppetCommander

Returns a new instance of PuppetCommander.

Parameters:

  • ctx (Context)
  • browser (Puppeteer::Browser)
  • skip_request_resources (Set<String>) (defaults to: %w[stylesheet image media font].to_set)

    the resource types not to request

  • referer (String) (defaults to: [ctx.url.scheme, ctx.url.host].join('://'))

    the referer to use for the request



# File 'lib/html2rss/request_service/puppet_commander.rb', line 18

def initialize(ctx,
               browser,
               skip_request_resources: %w[stylesheet image media font].to_set,
               referer: [ctx.url.scheme, ctx.url.host].join('://'))
  @ctx = ctx
  @browser = browser
  @skip_request_resources = skip_request_resources
  @referer = referer
end
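
To illustrate the default `referer`, here is a small sketch that applies the same expression to a standard-library `URI`. `URI` is only a stand-in for the context's URL object, on the assumption that both respond to `scheme` and `host` the same way; the example URLs are made up.

```ruby
require 'uri'

# Same expression as the default argument, applied to a stand-in URL object.
def default_referer(url)
  [url.scheme, url.host].join('://')
end

puts default_referer(URI.parse('https://example.com/blog/posts?page=2'))
# → https://example.com

puts default_referer(URI.parse('http://localhost:3000/feed'))
# → http://localhost
```

The second call shows that the default keeps only scheme and host: path, query, and port are dropped. Pass `referer:` explicitly if the target site expects something more specific.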

Instance Method Details

#body(page) ⇒ String

Returns rendered HTML content.

Parameters:

  • page (Puppeteer::Page)

    browser page

Returns:

  • (String)

    rendered HTML content



# File 'lib/html2rss/request_service/puppet_commander.rb', line 94

def body(page) = page.content

#call ⇒ Response

Visits the request URL and normalizes the page into a response object.

Returns:

  • (Response)



# File 'lib/html2rss/request_service/puppet_commander.rb', line 32

def call
  page = new_page
  navigation_response = navigate_to_destination(page, ctx.url)
  perform_preload(page)
  raise_navigation_error_if_any
  final_navigation_response = latest_navigation_response || navigation_response
  validate_navigation_response!(final_navigation_response)
  build_response(page, final_navigation_response)
ensure
  page&.close
end
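
The `ensure` clause with `page&.close` closes the page on success and on failure, and does nothing when `new_page` itself raised before `page` was assigned. The pattern can be sketched in isolation; `FakePage` and `visit` are hypothetical stand-ins, not part of html2rss or Puppeteer.

```ruby
# Stand-in for a browser page that records whether it was closed.
class FakePage
  attr_reader :closed

  def initialize
    @closed = false
  end

  def close
    @closed = true
  end
end

# Mirrors the shape of #call: acquire a page, work with it, always clean up.
def visit(page_factory)
  page = page_factory.call
  yield page
ensure
  # page is still nil if page_factory raised, hence the safe navigation.
  page&.close
end

page = FakePage.new
begin
  visit(-> { page }) { raise 'navigation failed' }
rescue RuntimeError => e
  puts "#{e.message}; closed: #{page.closed}"
end
# → navigation failed; closed: true
```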

#configure_navigation_guards(page) ⇒ void

This method returns an undefined value.

Parameters:

  • page (Puppeteer::Page)


# File 'lib/html2rss/request_service/puppet_commander.rb', line 67

def configure_navigation_guards(page)
  page.request_interception = true
  page.on('request') do |request|
    handle_request(request)
  end
  page.on('response') { |response| handle_response(response) }
end
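
`handle_request` is not shown in this section. As a rough sketch, assuming it aborts requests whose resource type is listed in `skip_request_resources` and continues everything else, an interception handler might look like the following. `FakeRequest` and `SKIPPED` are hypothetical stand-ins, and the real handler may do more, such as recording navigation errors.

```ruby
require 'set'

# Mirrors the constructor's default for skip_request_resources.
SKIPPED = %w[stylesheet image media font].to_set

# Hypothetical stand-in for an intercepted browser request.
FakeRequest = Struct.new(:resource_type, :outcome) do
  def abort
    self.outcome = :aborted
  end

  def continue
    self.outcome = :continued
  end
end

# Assumed behavior: block skipped resource types, let everything else through.
def handle_request(request, skipped: SKIPPED)
  if skipped.include?(request.resource_type)
    request.abort
  else
    request.continue
  end
end

requests = [FakeRequest.new('image'), FakeRequest.new('document')]
requests.each { |request| handle_request(request) }
p requests.map(&:outcome)
# → [:aborted, :continued]
```

With request interception enabled, Puppeteer holds each request until it is continued, aborted, or responded to, so a handler along these lines has to settle every request it receives.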

#configure_page(page) ⇒ void

This method returns an undefined value.

Parameters:

  • page (Puppeteer::Page)


# File 'lib/html2rss/request_service/puppet_commander.rb', line 58

def configure_page(page)
  page.extra_http_headers = browser_headers
  page.default_navigation_timeout = navigation_timeout_ms
  page.default_timeout = navigation_timeout_ms
end

#navigate_to_destination(page, url) ⇒ Puppeteer::HTTPResponse, nil

Returns the navigation response if one was produced.

Parameters:

  • page (Puppeteer::Page)

    browser page

  • url (Html2rss::Url)

    target URL

Returns:

  • (Puppeteer::HTTPResponse, nil)

    the navigation response if one was produced



# File 'lib/html2rss/request_service/puppet_commander.rb', line 79

def navigate_to_destination(page, url)
  @navigation_error = nil
  @latest_navigation_response = nil
  page.goto(url, wait_until: 'networkidle0', referer:, timeout: navigation_timeout_ms).tap do
    raise_navigation_error_if_any
  end
rescue StandardError
  raise_navigation_error_if_any

  raise
end
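
The `rescue` branch gives a recorded navigation error priority over whatever `goto` raised, and falls back to re-raising the original exception. The sketch below isolates that control flow. It assumes `raise_navigation_error_if_any` raises the stored error when one is set (its body is not shown here); `Navigator` and `NavigationBlocked` are hypothetical names.

```ruby
class NavigationBlocked < StandardError; end

class Navigator
  attr_writer :navigation_error

  def navigate
    yield
  rescue StandardError
    raise_navigation_error_if_any # prefer the recorded, more specific error
    raise                         # otherwise re-raise the original exception
  end

  private

  # Assumed behavior: raise the recorded error, if there is one.
  def raise_navigation_error_if_any
    raise @navigation_error if @navigation_error
  end
end

navigator = Navigator.new
navigator.navigation_error = NavigationBlocked.new('blocked by guard')

begin
  navigator.navigate { raise IOError, 'timeout' }
rescue StandardError => e
  puts "#{e.class}: #{e.message}"
end
# → NavigationBlocked: blocked by guard
```

Without a recorded error, the same call surfaces the original `IOError` unchanged.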

#new_page ⇒ Puppeteer::Page



# File 'lib/html2rss/request_service/puppet_commander.rb', line 47

def new_page
  page = browser.new_page
  @main_frame = page.main_frame if page.respond_to?(:main_frame)
  configure_page(page)
  configure_navigation_guards(page)
  page
end