Module: Html2rss::ItemExtractors

Defined in: lib/html2rss/item_extractors.rb,
lib/html2rss/item_extractors/href.rb,
lib/html2rss/item_extractors/html.rb,
lib/html2rss/item_extractors/text.rb,
lib/html2rss/item_extractors/static.rb,
lib/html2rss/item_extractors/attribute.rb

Overview

Provides a namespace for item extractors.

Defined Under Namespace

Classes: Attribute, Href, Html, Static, Text, UnknownExtractorName

Constant Summary

NAME_TO_CLASS = Maps the extractor name to the class implementing the extractor. The key is the name to use in the feed config.

{
  attribute: Attribute,
  href: Href,
  html: Html,
  static: Static,
  text: Text
}.freeze

ITEM_OPTION_CLASSES = Maps the extractor class to its corresponding options class.

Hash.new do |hash, klass|
  hash[klass] = klass.const_get(:Options)
end
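
The default block above acts as a small cache: the first lookup for a class resolves its nested `Options` constant and stores it, and later lookups return the stored value. A minimal sketch of that behaviour, assuming a hypothetical stand-in class `DemoExtractor` (not part of html2rss):

```ruby
# Hypothetical stand-in for an extractor class with a nested Options struct.
class DemoExtractor
  Options = Struct.new(:selector, keyword_init: true)
end

# Same shape as ITEM_OPTION_CLASSES: the block runs only when the key is missing.
DEMO_OPTION_CLASSES = Hash.new do |hash, klass|
  hash[klass] = klass.const_get(:Options)
end

DEMO_KNOWN_BEFORE = DEMO_OPTION_CLASSES.key?(DemoExtractor) # nothing stored yet
DEMO_RESOLVED     = DEMO_OPTION_CLASSES[DemoExtractor]      # resolves and stores DemoExtractor::Options
DEMO_KNOWN_AFTER  = DEMO_OPTION_CLASSES.key?(DemoExtractor) # stored by the first lookup
```

Because the block writes back into the hash, a class without an `Options` constant raises `NameError` on first lookup instead of being cached.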

DEFAULT_EXTRACTOR = The extractor name used when the attribute options do not specify `:extractor`.

:text

Class Method Summary

.build_options_instance(extractor_class, attribute_options) ⇒ Object

Builds the options instance for the extractor class.

.create_extractor_instance(extractor_class, xml, options_instance) ⇒ Object

Creates an instance of the extractor class.

.element(xml, selector) ⇒ Nokogiri::XML::ElementSet

Returns the nodes matching the selector, or the given XML itself when the selector is nil.

.find_extractor_class(extractor_name) ⇒ Class

Finds the extractor class based on the name.

.item_extractor_factory(attribute_options, xml) ⇒ Object

Creates an instance of the requested item extractor.

Class Method Details

.build_options_instance(extractor_class, attribute_options) ⇒ `Object`

Builds the options instance for the extractor class.

Parameters:

extractor_class (Class) —

the class implementing the extractor
attribute_options (Hash<Symbol, Object>) —

the attribute options

Returns:

(Object) —

an instance of the options class for the extractor

# File 'lib/html2rss/item_extractors.rb', line 72

def self.build_options_instance(extractor_class, attribute_options)
  options = attribute_options.slice(*extractor_class::Options.members)
  ITEM_OPTION_CLASSES[extractor_class].new(options)
end
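
The `slice` call above filters the attribute options down to the keys the extractor's `Options` class declares, so unrelated keys such as `:extractor` never reach the constructor. A minimal sketch, assuming `Options` is a `Struct` created with `keyword_init: true` (an assumption; `DemoText` and `build_demo_options` are hypothetical names):

```ruby
# Hypothetical extractor whose Options struct declares a single member.
class DemoText
  Options = Struct.new(:selector, keyword_init: true)
end

def build_demo_options(extractor_class, attribute_options)
  # Keep only the declared members; a keyword_init struct rejects unknown keys.
  options = attribute_options.slice(*extractor_class::Options.members)
  extractor_class::Options.new(**options)
end

# :extractor and :post_process are dropped before the struct is built.
DEMO_TEXT_OPTIONS = build_demo_options(
  DemoText,
  { extractor: :text, selector: 'h1', post_process: [] }
)
```

Members missing from the hash are left as `nil` on the resulting struct.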

.create_extractor_instance(extractor_class, xml, options_instance) ⇒ `Object`

Creates an instance of the extractor class.

Parameters:

extractor_class (Class) —

the class implementing the extractor
xml (Nokogiri::XML::Document) —

the XML document
options_instance (Object) —

the options instance

Returns:

(Object) —

an instance of the extractor class



# File 'lib/html2rss/item_extractors.rb', line 84

def self.create_extractor_instance(extractor_class, xml, options_instance)
  extractor_class.new(xml, options_instance)
end

.element(xml, selector) ⇒ `Nokogiri::XML::ElementSet`

Returns the result of `xml.css(selector)`, or `xml` itself when `selector` is nil.

Parameters:

xml (Nokogiri::XML::Document)
selector (String, nil)

Returns:

(Nokogiri::XML::ElementSet) —

the nodes matching `selector`, or `xml` unchanged when `selector` is nil



# File 'lib/html2rss/item_extractors.rb', line 37

def self.element(xml, selector)
  selector ? xml.css(selector) : xml
end
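
The nil-selector branch is easy to miss: without a selector the method hands back its input untouched. A minimal sketch of both branches, using a hypothetical `FakeNode` that models only `#css` so the example runs without Nokogiri:

```ruby
# Hypothetical stand-in for a parsed node; only #css is modelled.
class FakeNode
  def initialize(matches)
    @matches = matches
  end

  # Returns the canned matches for a selector, or an empty list.
  def css(selector)
    @matches.fetch(selector, [])
  end
end

def demo_element(xml, selector)
  selector ? xml.css(selector) : xml
end

FAKE_ITEM = FakeNode.new('h1' => ['<h1>Title</h1>'])

DEMO_MATCHED     = demo_element(FAKE_ITEM, 'h1') # delegates to #css
DEMO_PASSTHROUGH = demo_element(FAKE_ITEM, nil)  # returns FAKE_ITEM itself
```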

.find_extractor_class(extractor_name) ⇒ `Class`

Finds the extractor class based on the name.

Parameters:

extractor_name (Symbol) —

the name of the extractor

Returns:

(Class) —

the class implementing the extractor

Raises:

(UnknownExtractorName) —

if the extractor class is not found

# File 'lib/html2rss/item_extractors.rb', line 61

def self.find_extractor_class(extractor_name)
  NAME_TO_CLASS[extractor_name] || raise(UnknownExtractorName,
                                         "Unknown extractor name '#{extractor_name}' requested in NAME_TO_CLASS")
end
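
The lookup is keyed by Symbol, so a String name misses the hash and raises unless the caller converts it first (the factory does this with `to_sym`). A minimal sketch of the lookup-or-raise pattern, with hypothetical stand-ins for the map and error class:

```ruby
# Hypothetical stand-ins; the real map and error class live in Html2rss::ItemExtractors.
class DemoUnknownName < StandardError; end

DemoHrefClass = Class.new
DemoTextClass = Class.new

DEMO_NAME_TO_CLASS = { href: DemoHrefClass, text: DemoTextClass }.freeze

def find_demo_class(name)
  # Hash#[] returns nil for a missing key, which triggers the raise.
  DEMO_NAME_TO_CLASS[name] || raise(DemoUnknownName, "Unknown extractor name '#{name}'")
end

DEMO_FOUND = find_demo_class(:text)

DEMO_ERROR =
  begin
    find_demo_class('text') # String key: not present in the Symbol-keyed map
  rescue DemoUnknownName => e
    e.message
  end
```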

.item_extractor_factory(attribute_options, xml) ⇒ `Object`

Creates an instance of the requested item extractor.

Parameters:

attribute_options (Hash<Symbol, Object>) —

May contain `:extractor` (the name; defaults to `:text` when absent) and must contain the options required by that extractor.
xml (Nokogiri::XML::Document)

Returns:

(Object) —

instance of the specified item extractor class

# File 'lib/html2rss/item_extractors.rb', line 48

def self.item_extractor_factory(attribute_options, xml)
  extractor_name = attribute_options[:extractor]&.to_sym || DEFAULT_EXTRACTOR
  extractor_class = find_extractor_class(extractor_name)
  options_instance = build_options_instance(extractor_class, attribute_options)
  create_extractor_instance(extractor_class, xml, options_instance)
end
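
The factory chains the three helpers: resolve the name (falling back to the default), build the options struct, then instantiate the extractor. The sketch below reproduces that flow in a self-contained module. `DemoExtractors`, its classes, and their option members are hypothetical stand-ins, not the html2rss implementations:

```ruby
module DemoExtractors
  class UnknownExtractorName < StandardError; end

  # Minimal extractor base: stores the document and the options struct.
  class Base
    attr_reader :xml, :options

    def initialize(xml, options)
      @xml = xml
      @options = options
    end
  end

  class Text < Base
    Options = Struct.new(:selector, keyword_init: true)
  end

  class Static < Base
    Options = Struct.new(:static, keyword_init: true)
  end

  NAME_TO_CLASS = { text: Text, static: Static }.freeze
  DEFAULT_EXTRACTOR = :text

  def self.item_extractor_factory(attribute_options, xml)
    # A String name is converted to a Symbol; a missing name falls back to :text.
    name = attribute_options[:extractor]&.to_sym || DEFAULT_EXTRACTOR
    klass = NAME_TO_CLASS[name] || raise(UnknownExtractorName, "Unknown extractor name '#{name}'")
    # Only the members declared by the Options struct are passed on.
    options = klass::Options.new(**attribute_options.slice(*klass::Options.members))
    klass.new(xml, options)
  end
end

# No :extractor key: the default (:text) is used. nil stands in for a document.
DEMO_DEFAULT = DemoExtractors.item_extractor_factory({ selector: 'h1' }, nil)

# A String name works because the factory calls to_sym before the lookup.
DEMO_STATIC = DemoExtractors.item_extractor_factory({ extractor: 'static', static: 'Hello' }, nil)
```

Keeping the steps as separate methods, as the module does, lets each one be tested or replaced on its own.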
