Class: Html2rss::Item

Inherits:
Object
Defined in:
lib/html2rss/item.rb

Overview

Takes the selected Nokogiri::HTML and responds to accessor names defined in the feed config.

Instances can only be created via `.from_url`, and each represents an internally used “RSS item”. Such an item exposes dynamically defined attributes as methods.

Defined Under Namespace

Classes: Context, Enclosure

Constructor Details

#initialize(xml, config) ⇒ Item

Returns a new instance of Item.

Parameters:

  • xml (Nokogiri::XML::Element)
  • config (Html2rss::Config)

# File 'lib/html2rss/item.rb', line 38

def initialize(xml, config)
  @xml = xml
  @config = config
end

Dynamic Method Handling

This class handles dynamic methods through the method_missing method.

#method_missing(method_name, *_args) ⇒ String

Dynamically extracts data based on the method name.

Parameters:

  • method_name (Symbol)
  • _args (Array)

Returns:

  • (String)

    extracted value for the selector.



# File 'lib/html2rss/item.rb', line 62

def method_missing(method_name, *_args)
  return super unless respond_to_missing?(method_name)

  extract(method_name)
end
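
The pairing of method_missing with respond_to_missing? can be illustrated in isolation. The following is a minimal sketch, not the library's implementation: `SketchConfig`, `DynamicItem`, and `value_for` are hypothetical stand-ins, and the canned lookup replaces the real call to #extract.

```ruby
# Hypothetical stand-in for Html2rss::Config: it only answers #selector?
# and hands back a canned value per selector name.
class SketchConfig
  def initialize(values)
    @values = values
  end

  def selector?(name)
    @values.key?(name.to_sym)
  end

  def value_for(name)
    @values.fetch(name.to_sym)
  end
end

# Hypothetical item class that mirrors the dispatch pattern shown above.
class DynamicItem
  def initialize(config)
    @config = config
  end

  # Report configured selector names as methods this object responds to.
  def respond_to_missing?(method_name, _include_private = false)
    @config.selector?(method_name) || super
  end

  # Unknown names fall through to super, which raises NoMethodError.
  def method_missing(method_name, *_args)
    return super unless respond_to_missing?(method_name)

    # The documented class calls #extract here; the sketch returns a canned value.
    @config.value_for(method_name)
  end
end

item = DynamicItem.new(SketchConfig.new({ title: 'Hello' }))
puts item.title               # prints "Hello"
puts item.respond_to?(:title) # prints "true"
puts item.respond_to?(:link)  # prints "false"
```

Defining respond_to_missing? alongside method_missing keeps `respond_to?` consistent with what the object actually answers, which matters for callers such as `public_send`-based helpers.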

Class Method Details

.from_url(url, config) ⇒ Array<Html2rss::Item>

Fetches items from a given URL using configuration settings.

Parameters:

  • url (Addressable::URI)

    URL to fetch items from.

  • config (Html2rss::Config)

    Configuration object.

Returns:

  • (Array<Html2rss::Item>)

    the items built from the matched nodes that pass #valid?.

# File 'lib/html2rss/item.rb', line 25

def self.from_url(url, config)
  body = Utils.request_url(url, headers: config.headers).body
  body = ObjectToXmlConverter.new(JSON.parse(body)).call if config.json?

  Nokogiri.HTML(body)
          .css(config.selector_string(Config::Selectors::ITEMS_SELECTOR_NAME))
          .map { |xml| new(xml, config) }
          .select(&:valid?)
end

Instance Method Details

#categories ⇒ Array<String>

Retrieves categories for the item based on configured category selectors.

Returns:

  • (Array<String>)

    list of categories.



# File 'lib/html2rss/item.rb', line 114

def categories
  config.category_selector_names
        .filter_map do |method_name|
    category = public_send(method_name)
    category.strip unless category.to_s.empty?
  end.uniq
end
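
The filtering logic above can be tried without the library. In this sketch the `values` hash and the selector names are made up for illustration and stand in for what `public_send(method_name)` would return.

```ruby
# Hypothetical extracted values keyed by selector name (not from html2rss).
values = { genre: ' News ', section: 'News', tag: '', author: nil }
category_selector_names = %i[genre section tag author]

categories = category_selector_names.filter_map do |name|
  category = values[name]
  # nil and '' yield nil here, and filter_map drops nil results.
  category.strip unless category.to_s.empty?
end.uniq

p categories # prints ["News"]
```

Because the emptiness check runs before `strip`, a whitespace-only value would pass the check and be kept as an empty string.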

#enclosure ⇒ Enclosure

Retrieves enclosure details for the item. Raises an error unless the enclosure URL is present and absolute.

Returns:

  • (Enclosure)

# File 'lib/html2rss/item.rb', line 134

def enclosure
  url = enclosure_url

  raise 'An item.enclosure requires an absolute URL' unless url&.absolute?

  Enclosure.new(
    type: Html2rss::Utils.guess_content_type_from_url(url),
    bits_length: 0,
    url: url.to_s
  )
end
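
The guard-then-build shape of this method can be sketched with the standard library alone. Everything named here is an assumption for illustration: `SketchEnclosure`, `CONTENT_TYPES`, and `build_enclosure` are hypothetical, the extension lookup is a crude substitute for `Html2rss::Utils.guess_content_type_from_url`, and stdlib `URI` stands in for whatever URL type the library passes around.

```ruby
require 'uri'

# Hypothetical stand-ins; the real Enclosure class lives in html2rss.
SketchEnclosure = Struct.new(:type, :bits_length, :url, keyword_init: true)
CONTENT_TYPES = { '.mp3' => 'audio/mpeg', '.jpg' => 'image/jpeg' }.freeze

def build_enclosure(url)
  # Safe navigation makes a nil URL fail the guard as well.
  raise 'An item.enclosure requires an absolute URL' unless url&.absolute?

  SketchEnclosure.new(
    type: CONTENT_TYPES.fetch(File.extname(url.path), 'application/octet-stream'),
    bits_length: 0,
    url: url.to_s
  )
end

enclosure = build_enclosure(URI('https://example.com/episode.mp3'))
puts enclosure.type # prints "audio/mpeg"
puts enclosure.url  # prints "https://example.com/episode.mp3"
```

Passing a relative URL such as `URI('/episode.mp3')`, or `nil`, raises a RuntimeError in this sketch, mirroring the guard in the method above.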

#enclosure? ⇒ true, false

Checks if the item has an enclosure based on configuration.

Returns:

  • (true, false)


# File 'lib/html2rss/item.rb', line 126

def enclosure?
  config.selector?(:enclosure)
end

#extract(tag) ⇒ String

Selects and processes data according to the selector name.

Parameters:

  • tag (Symbol)

Returns:

  • (String)

    the extracted value for the selector.



# File 'lib/html2rss/item.rb', line 73

def extract(tag)
  attribute_options = config.selector_attributes_with_channel(tag.to_sym)

  post_process(
    ItemExtractors.item_extractor_factory(attribute_options, xml).get,
    attribute_options.fetch(:post_process, false)
  )
end

#guid ⇒ String

Returns the item's GUID: a SHA1 hex digest of the joined values of the configured GUID selectors.

Returns:

  • (String)

    SHA1 hex digest used as the GUID.



# File 'lib/html2rss/item.rb', line 104

def guid
  content = config.guid_selector_names.flat_map { |method_name| public_send(method_name) }.join

  Digest::SHA1.hexdigest(content)
end
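
The GUID derivation is easy to reproduce outside the class. In this sketch the `values` hash and the selector names are hypothetical and replace the `public_send` calls.

```ruby
require 'digest/sha1'

# Hypothetical extracted values keyed by selector name (not from html2rss).
values = { title: 'Hello', url: 'https://example.com/1' }
guid_selector_names = %i[title url]

# Join the selected values into one string, then hash it.
content = guid_selector_names.flat_map { |name| values[name] }.join
guid = Digest::SHA1.hexdigest(content)

puts guid.length # prints 40
```

The same input values always produce the same GUID, so an item keeps its identity across feed refreshes as long as the selected values do not change.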

#respond_to_missing?(method_name, _include_private = false) ⇒ true, false

Checks if the object responds to a method dynamically based on the configuration.

Parameters:

  • method_name (Symbol)
  • _include_private (true, false) (defaults to: false)

Returns:

  • (true, false)


# File 'lib/html2rss/item.rb', line 52

def respond_to_missing?(method_name, _include_private = false)
  config.selector?(method_name) || super
end

#title_or_description ⇒ String?

Returns the title if a title selector is configured; otherwise returns the description if a description selector is configured, and nil if neither is.

Returns:

  • (String, nil)


# File 'lib/html2rss/item.rb', line 95

def title_or_description
  return title if config.selector?(:title)

  description if config.selector?(:description)
end

#valid? ⇒ true, false

Checks if the item is valid according to the RSS 2.0 spec by ensuring it has at least a title or a description.

Returns:

  • (true, false)


# File 'lib/html2rss/item.rb', line 87

def valid?
  title_or_description.to_s != ''
end
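
How #title_or_description and #valid? interact can be shown with a small stand-in. `ValidatedItem` is hypothetical: a hash key being present plays the role of `config.selector?`, and the hash value plays the role of the extracted text.

```ruby
# Hypothetical stand-in: key presence mimics a configured selector.
class ValidatedItem
  def initialize(values)
    @values = values
  end

  def title_or_description
    return @values[:title] if @values.key?(:title)

    @values[:description] if @values.key?(:description)
  end

  # nil.to_s is '', so a missing value and an empty value are both invalid.
  def valid?
    title_or_description.to_s != ''
  end
end

puts ValidatedItem.new({ title: 'Hello' }).valid?                 # prints "true"
puts ValidatedItem.new({ description: 'Body' }).valid?            # prints "true"
puts ValidatedItem.new({}).valid?                                 # prints "false"
puts ValidatedItem.new({ title: '', description: 'Body' }).valid? # prints "false"
```

The last case is worth noting: the choice between title and description depends on which selector is configured, not on which value is non-empty, so an empty title is not replaced by the description.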