Module: Html2rss::ItemExtractors

Defined in:
lib/html2rss/item_extractors.rb,
lib/html2rss/item_extractors/href.rb,
lib/html2rss/item_extractors/html.rb,
lib/html2rss/item_extractors/text.rb,
lib/html2rss/item_extractors/static.rb,
lib/html2rss/item_extractors/attribute.rb

Overview

Provides a namespace for item extractors.

Defined Under Namespace

Classes: Attribute, Href, Html, Static, Text, UnknownExtractorName

Constant Summary

NAME_TO_CLASS =

Maps the extractor name to the class implementing the extractor.

The key is the name to use in the feed config.

{
  attribute: Attribute,
  href: Href,
  html: Html,
  static: Static,
  text: Text
}.freeze

ITEM_OPTION_CLASSES =

Maps the extractor class to its corresponding options class.

Hash.new do |hash, klass|
  hash[klass] = klass.const_get(:Options)
end
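
The default block above means the hash resolves an options class on first lookup and stores it for later lookups. A minimal sketch of that behaviour, using a hypothetical `DemoExtractor` and a stand-in `OPTION_CLASSES` constant (neither is part of html2rss), and assuming the nested `Options` class is a `Struct`:

```ruby
module OptionCacheSketch
  # Hypothetical extractor class; only its nested Options constant matters here.
  class DemoExtractor
    Options = Struct.new(:selector, keyword_init: true)
  end

  # Same shape as ITEM_OPTION_CLASSES: the block runs only on a cache miss
  # and stores the resolved constant under the class key.
  OPTION_CLASSES = Hash.new do |hash, klass|
    hash[klass] = klass.const_get(:Options)
  end
end

cache = OptionCacheSketch::OPTION_CLASSES
puts cache.key?(OptionCacheSketch::DemoExtractor) # key is absent before the first lookup
puts cache[OptionCacheSketch::DemoExtractor]      # lookup resolves and stores the Options class
puts cache.key?(OptionCacheSketch::DemoExtractor) # key is present afterwards
```

Because the block writes into the hash, a hash built this way cannot be frozen, which may be why this constant lacks the `.freeze` that `NAME_TO_CLASS` carries.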

DEFAULT_EXTRACTOR =
:text

Class Method Details

.build_options_instance(extractor_class, attribute_options) ⇒ Object

Builds the options instance for the extractor class.

Parameters:

  • extractor_class (Class)

    the class implementing the extractor

  • attribute_options (Hash<Symbol, Object>)

    the attribute options

Returns:

  • (Object)

    an instance of the options class for the extractor



# File 'lib/html2rss/item_extractors.rb', line 72

def self.build_options_instance(extractor_class, attribute_options)
  options = attribute_options.slice(*extractor_class::Options.members)
  ITEM_OPTION_CLASSES[extractor_class].new(options)
end
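
The method above keeps only the keys that the extractor's `Options` class declares, so unrelated feed-config keys never reach the constructor. A rough sketch of that filtering step, assuming `Options` is a `Struct` created with `keyword_init: true`; `OptionsSketch`, `DemoAttribute`, and `build` are hypothetical names, and the sketch splats the hash into keywords rather than passing it positionally:

```ruby
module OptionsSketch
  # Hypothetical extractor with a Struct-based Options class (an assumption).
  class DemoAttribute
    Options = Struct.new(:selector, :attribute, keyword_init: true)
  end

  # Keep only the keys named by Options.members, then build the struct.
  def self.build(extractor_class, attribute_options)
    allowed = attribute_options.slice(*extractor_class::Options.members)
    extractor_class::Options.new(**allowed)
  end
end

options = OptionsSketch.build(
  OptionsSketch::DemoAttribute,
  { extractor: :attribute, selector: 'img', attribute: 'src', post_process: [] }
)
puts options.to_h # :extractor and :post_process have been dropped
```

Without the `slice`, a keyword-initialised struct would raise `ArgumentError` on keys such as `:extractor` that it does not declare.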

.create_extractor_instance(extractor_class, xml, options_instance) ⇒ Object

Creates an instance of the extractor class.

Parameters:

  • extractor_class (Class)

    the class implementing the extractor

  • xml (Nokogiri::XML::Document)

    the XML document

  • options_instance (Object)

    the options instance

Returns:

  • (Object)

    an instance of the extractor class



# File 'lib/html2rss/item_extractors.rb', line 84

def self.create_extractor_instance(extractor_class, xml, options_instance)
  extractor_class.new(xml, options_instance)
end

.element(xml, selector) ⇒ Nokogiri::XML::NodeSet

Selects the nodes matching the selector, or returns xml unchanged when the selector is nil.

Parameters:

  • xml (Nokogiri::XML::Document)
  • selector (String, nil)

Returns:

  • (Nokogiri::XML::NodeSet)

    the nodes matching the selector, or xml itself when selector is nil



# File 'lib/html2rss/item_extractors.rb', line 37

def self.element(xml, selector)
  selector ? xml.css(selector) : xml
end

.find_extractor_class(extractor_name) ⇒ Class

Finds the extractor class based on the name.

Parameters:

  • extractor_name (Symbol)

    the name of the extractor

Returns:

  • (Class)

    the class implementing the extractor

Raises:

  • (UnknownExtractorName)

    if extractor_name is not a key in NAME_TO_CLASS



# File 'lib/html2rss/item_extractors.rb', line 61

def self.find_extractor_class(extractor_name)
  NAME_TO_CLASS[extractor_name] || raise(UnknownExtractorName,
                                         "Unknown extractor name '#{extractor_name}' requested in NAME_TO_CLASS")
end
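
The lookup above relies on `Hash#[]` returning nil for a missing key, so the right-hand side of `||` raises. A small sketch of the same pattern with hypothetical stand-ins (`LookupSketch`, `REGISTRY`, `UnknownName`, `DemoText`), none of which are html2rss names:

```ruby
module LookupSketch
  class UnknownName < StandardError; end
  class DemoText; end

  # Symbol-keyed registry, mirroring the shape of NAME_TO_CLASS.
  REGISTRY = { text: DemoText }.freeze

  # Return the registered class, or raise when the key is missing.
  def self.find(name)
    REGISTRY[name] || raise(UnknownName, "Unknown extractor name '#{name}'")
  end
end

puts LookupSketch.find(:text)

begin
  LookupSketch.find('text') # a String key does not match the Symbol key
rescue LookupSketch::UnknownName => e
  puts e.message
end
```

Since the keys are symbols, a string name fails the lookup; the factory method in this module converts the configured name with `to_sym` before calling the finder.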

.item_extractor_factory(attribute_options, xml) ⇒ Object

Creates an instance of the requested item extractor.

Parameters:

  • attribute_options (Hash<Symbol, Object>)

    May contain `:extractor` (the extractor name; DEFAULT_EXTRACTOR is used when it is absent) and must contain the options that extractor requires.

  • xml (Nokogiri::XML::Document)

Returns:

  • (Object)

    instance of the specified item extractor class



# File 'lib/html2rss/item_extractors.rb', line 48

def self.item_extractor_factory(attribute_options, xml)
  extractor_name = attribute_options[:extractor]&.to_sym || DEFAULT_EXTRACTOR
  extractor_class = find_extractor_class(extractor_name)
  options_instance = build_options_instance(extractor_class, attribute_options)
  create_extractor_instance(extractor_class, xml, options_instance)
end
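
The first line of the factory decides which extractor name to use: a configured name is converted to a symbol, and a missing or nil name falls back to the default. A minimal sketch of just that step, with hypothetical names `DefaultNameSketch`, `DEFAULT_NAME`, and `resolve`:

```ruby
module DefaultNameSketch
  DEFAULT_NAME = :text # stands in for DEFAULT_EXTRACTOR

  # Safe navigation (&.) skips to_sym when the key is missing or nil,
  # so the || fallback supplies the default name.
  def self.resolve(attribute_options)
    attribute_options[:extractor]&.to_sym || DEFAULT_NAME
  end
end

p DefaultNameSketch.resolve({})                      # no :extractor key
p DefaultNameSketch.resolve({ extractor: 'href' })   # String name from a config file
p DefaultNameSketch.resolve({ extractor: :html })    # Symbol name
```

Accepting both strings and symbols here is convenient when the options come from a parsed config file, where names commonly arrive as strings.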