Class: Html2rss::RequestSession::RelNextPager

Inherits:
Object
Includes:
Enumerable
Defined in:
lib/html2rss/request_session/rel_next_pager.rb

Overview

Traverses a rel=next pagination chain for selector-driven extraction.

Instance Method Summary

Constructor Details

#initialize(session:, initial_response:, max_pages:, logger: Html2rss::Log) ⇒ RelNextPager

Returns a new instance of RelNextPager.

Parameters:

  • session (RequestSession)

    request session used to execute follow-ups

  • initial_response (RequestService::Response)

    first page response

  • max_pages (Integer)

    configured page budget, including the initial page

  • logger (Logger) (defaults to: Html2rss::Log)

    logger used for pagination stop reasons



# File 'lib/html2rss/request_session/rel_next_pager.rb', line 15

def initialize(session:, initial_response:, max_pages:, logger: Html2rss::Log)
  @session = session
  @initial_response = initial_response
  @max_pages = max_pages
  @logger = logger
end

Instance Method Details

#each {|RequestService::Response| ... } ⇒ Enumerator

Iterates over all paginated responses, beginning with the initial response.

Yields:

  • (RequestService::Response)

    each paginated response, beginning with the initial response

Returns:

  • (Enumerator)

    enumerator when no block is given
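The block-less form can be illustrated with a minimal sketch. It uses a hypothetical Countdown class rather than RelNextPager itself, and shows only the general Ruby pattern of returning enum_for(:each) when no block is given; nothing here is taken from the html2rss source.

```ruby
# Hypothetical class for illustration only; not part of html2rss.
class Countdown
  include Enumerable

  def initialize(from)
    @from = from
  end

  # Yields from..1 in descending order, or returns an Enumerator
  # when called without a block.
  def each
    return enum_for(:each) unless block_given?

    @from.downto(1) { |n| yield n }
  end
end

ENUM = Countdown.new(3).each      # no block, so an Enumerator is returned
FIRST_TWO = ENUM.first(2)         # pulls only the first two values
ALL = Countdown.new(3).to_a       # Enumerable methods are built on #each
```

Because the enumerator is evaluated on demand, a caller that takes only the first few elements never triggers the remaining iterations.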



# File 'lib/html2rss/request_session/rel_next_pager.rb', line 27

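def each
  # Without a block, hand back an Enumerator over the same sequence.
  return enum_for(:each) unless block_given?

  # The initial page is always yielded and counts against the budget.
  yield initial_response

  current_response = initial_response
  # One page is already spent, so follow at most (budget - 1) links.
  session.effective_page_budget(max_pages).pred.times do
    next_url = next_page_url(current_response)
    break unless follow_up_allowed?(next_url)

    # A falsy result means the follow-up was stopped; end pagination.
    current_response = fetch_follow_up_response_or_stop(next_url, current_response.url)
    break unless current_response

    yield current_response
  end
end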
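
The budgeted loop above can be sketched in isolation. The following is a simplified stand-in, not the html2rss implementation: Page, SketchPager, and PAGES are hypothetical names, network fetches are replaced by a Hash lookup, and the session's effective_page_budget, follow_up_allowed?, and logging are not modeled. It assumes only what the listing shows, namely that the budget includes the initial page and that iteration stops when there is no next page.

```ruby
# Hypothetical stand-in for a response; only #url and #next_url are modeled.
Page = Struct.new(:url, :next_url)

# Simplified sketch of a budgeted rel=next traversal.
class SketchPager
  include Enumerable

  # pages: Hash of URL => Page, standing in for real requests.
  def initialize(pages:, initial_url:, max_pages:)
    @pages = pages
    @initial = pages.fetch(initial_url)
    @max_pages = max_pages
  end

  def each
    return enum_for(:each) unless block_given?

    yield @initial

    current = @initial
    # The budget includes the initial page: at most max_pages - 1 follow-ups.
    (@max_pages - 1).times do
      next_url = current.next_url
      break if next_url.nil?

      current = @pages[next_url]
      break unless current

      yield current
    end
  end
end

PAGES = {
  '/p1' => Page.new('/p1', '/p2'),
  '/p2' => Page.new('/p2', '/p3'),
  '/p3' => Page.new('/p3', '/p4'),
  '/p4' => Page.new('/p4', nil)
}.freeze

# Four pages are linked, but a budget of 3 stops after the third.
VISITED = SketchPager.new(pages: PAGES, initial_url: '/p1', max_pages: 3).map(&:url)
# VISITED == ["/p1", "/p2", "/p3"]

# A generous budget ends when the chain runs out of next links.
FULL_CHAIN = SketchPager.new(pages: PAGES, initial_url: '/p1', max_pages: 10).map(&:url)
```

With max_pages set to 1 the loop body never runs, so only the initial page is yielded.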