Class: OpenGraphReader::Fetcher Private

Inherits:
Object
Defined in:
lib/open_graph_reader/fetcher.rb

Overview

This class is part of a private API. You should avoid using this class if possible, as it may be removed or be changed in the future.

Fetch a URI to retrieve its HTML body, if available.

Constant Summary

HEADERS =

This constant is part of a private API. You should avoid using this constant if possible, as it may be removed or be changed in the future.

{
  "Accept"     => "text/html",
  "User-Agent" => "OpenGraphReader/#{OpenGraphReader::VERSION} (+https://github.com/jhass/open_graph_reader)"
}.freeze

Constructor Details

#initialize(uri) ⇒ Fetcher

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Create a new fetcher.

Parameters:

  • uri (URI)

    the URI to fetch.

Raises:

  • (ArgumentError)

    if uri is not an instance of URI.


# File 'lib/open_graph_reader/fetcher.rb', line 26

def initialize uri
  raise ArgumentError, "url needs to be an instance of URI" unless uri.is_a? URI
  @uri = uri
  @fetch_failed = false
  @connection = Faraday.default_connection.dup
  @connection.headers.replace(HEADERS)
  @head_response = nil
  @get_response = nil

  prepend_middleware Faraday::CookieJar if defined? Faraday::CookieJar
  prepend_middleware FaradayMiddleware::FollowRedirects if defined? FaradayMiddleware
end
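
The type guard at the top of the constructor can be tried in isolation. The following is a minimal sketch using only Ruby's standard library; `ensure_uri` is a hypothetical helper written for this illustration and is not part of OpenGraphReader. It relies on the fact that objects returned by `URI.parse` include the `URI` module, while plain strings do not.

```ruby
require "uri"

# Hypothetical helper that mirrors the guard in Fetcher#initialize.
def ensure_uri(value)
  raise ArgumentError, "url needs to be an instance of URI" unless value.is_a? URI
  value
end

# A parsed URI passes, because URI::HTTPS includes the URI module.
parsed = URI.parse("https://example.com/article")
ensure_uri(parsed)

# A plain String is rejected, even if it contains a valid URL.
rejected_message = nil
begin
  ensure_uri("https://example.com/article")
rescue ArgumentError => e
  rejected_message = e.message
end
```

Callers holding a string would therefore need to parse it (for example with `URI.parse`) before constructing a fetcher.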

Instance Method Details

#body ⇒ String

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

TODO:

Custom error class

Retrieve the body

Returns:

  • (String)

Raises:

  • (NoOpenGraphDataError)

    No response was received, or the received content does not seem to be HTML.



# File 'lib/open_graph_reader/fetcher.rb', line 70

def body
  fetch_body unless fetched?
  raise NoOpenGraphDataError, "No response body received for #{@uri}" if fetch_failed?
  raise NoOpenGraphDataError, "Did not receive a HTML site at #{@uri}" unless html?
  @get_response.body
end

#fetch ⇒ Faraday::Response? Also known as: fetch_body

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Fetch the full page.

Returns:

  • (Faraday::Response, nil)


# File 'lib/open_graph_reader/fetcher.rb', line 49

def fetch
  @get_response = @connection.get(@uri)
rescue Faraday::Error
  @fetch_failed = true
end

#fetch_headers ⇒ Faraday::Response?

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Fetch just the headers

Returns:

  • (Faraday::Response, nil)


# File 'lib/open_graph_reader/fetcher.rb', line 59

def fetch_headers
  @head_response = @connection.head(@uri)
rescue Faraday::Error
  @fetch_failed = true
end

#fetched? ⇒ Bool Also known as: fetched_body?

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Whether the target URI was fetched.

Returns:

  • (Bool)


# File 'lib/open_graph_reader/fetcher.rb', line 93

def fetched?
  fetch_failed? || !@get_response.nil?
end
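
Because a failed attempt also counts as "fetched", a fetcher makes at most one request for the body. The following is a rough sketch of that state handling, not the library's implementation: `TinyFetcher` is a hypothetical miniature whose transport is a lambda standing in for the Faraday connection, so it runs without network access, and it raises a plain `RuntimeError` where the real class raises `NoOpenGraphDataError`.

```ruby
# Hypothetical miniature of the Fetcher state handling (illustration only).
class TinyFetcher
  attr_reader :attempts

  def initialize(transport)
    @transport = transport   # lambda standing in for a Faraday connection
    @response = nil
    @fetch_failed = false
    @attempts = 0
  end

  def fetch
    @attempts += 1
    @response = @transport.call
  rescue StandardError
    @fetch_failed = true     # remember the failure instead of re-raising
  end

  def fetch_failed?
    @fetch_failed
  end

  # A failed attempt counts as fetched, so callers do not retry.
  def fetched?
    fetch_failed? || !@response.nil?
  end

  def body
    fetch unless fetched?
    raise "No response body received" if fetch_failed?
    @response
  end
end

# The transport always fails; asking for the body twice triggers one attempt.
failing = TinyFetcher.new(-> { raise IOError, "connection refused" })
2.times do
  begin
    failing.body
  rescue RuntimeError
    # expected: the failure is reported on every call, without refetching
  end
end

# A working transport is also called only once; the response is reused.
working = TinyFetcher.new(-> { "<html></html>" })
2.times { working.body }
```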

#fetched_headers? ⇒ Bool

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Whether the headers of the target URI were fetched.

Returns:

  • (Bool)


# File 'lib/open_graph_reader/fetcher.rb', line 101

def fetched_headers?
  fetch_failed? || !@get_response.nil? || !@head_response.nil?
end

#html? ⇒ Bool

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Whether the target URI seems to return HTML

Returns:

  • (Bool)


# File 'lib/open_graph_reader/fetcher.rb', line 80

def html?
  fetch_headers unless fetched_headers?
  response = @get_response || @head_response
  return false if fetch_failed?
  return false unless response
  return false unless response.success?
  return false unless response["content-type"]
  response["content-type"].include? "text/html"
end
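
The guard sequence above can be exercised without a network connection. The sketch below is an illustration under stated assumptions: `FakeResponse` and `looks_like_html?` are hypothetical names introduced here, and the fake treats status codes 200 to 299 as successful and looks headers up by exact lowercase key, which is assumed to approximate how a `Faraday::Response` behaves.

```ruby
# Hypothetical stand-in for Faraday::Response, exposing only what the
# checks in #html? rely on: success? and header lookup via [].
class FakeResponse
  def initialize(status, headers)
    @status = status
    @headers = headers
  end

  # Assumption: any 2xx status counts as a success.
  def success?
    (200..299).cover?(@status)
  end

  # Assumption: headers are stored under lowercase keys.
  def [](name)
    @headers[name]
  end
end

# Mirrors the guards of #html? for a response that was already fetched.
def looks_like_html?(response)
  return false unless response
  return false unless response.success?
  return false unless response["content-type"]
  response["content-type"].include? "text/html"
end

html_page = FakeResponse.new(200, { "content-type" => "text/html; charset=utf-8" })
json_api  = FakeResponse.new(200, { "content-type" => "application/json" })
not_found = FakeResponse.new(404, { "content-type" => "text/html" })
no_type   = FakeResponse.new(200, {})

looks_like_html?(html_page) # true: the substring match tolerates the charset suffix
looks_like_html?(json_api)  # false: wrong content type
looks_like_html?(not_found) # false: unsuccessful status
looks_like_html?(no_type)   # false: no Content-Type header
```

Note that the check is a substring match, so any Content-Type value containing "text/html" is accepted.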

#url ⇒ String

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

The URL to fetch

Returns:

  • (String)


# File 'lib/open_graph_reader/fetcher.rb', line 42

def url
  @uri.to_s
end