Class: Html2rss::AutoSource::Scraper::Schema::Thing
- Inherits: Object
- Defined in: lib/html2rss/auto_source/scraper/schema/thing.rb
Overview
A Thing acts as the base class for Schema.org schema objects.
Constant Summary
- SUPPORTED_TYPES =
    %w[
      AdvertiserContentArticle AnalysisNewsArticle APIReference Article
      AskPublicNewsArticle BackgroundNewsArticle BlogPosting
      DiscussionForumPosting LiveBlogPosting NewsArticle OpinionNewsArticle
      Report ReportageNewsArticle ReviewNewsArticle SatiricalArticle
      ScholarlyArticle SocialMediaPosting TechArticle
    ].to_set.freeze
- DEFAULT_ATTRIBUTES =
    %i[id title description url image published_at].freeze
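Because SUPPORTED_TYPES is a frozen Set, checking a schema object's type is a constant-time membership lookup. The following is a minimal sketch of such a check, not the library's own code: `SUPPORTED_TYPES_SAMPLE` is an abbreviated copy of the constant, and `supported_schema_object?` is a hypothetical helper name.

```ruby
require "set"

# Abbreviated stand-in for SUPPORTED_TYPES; the real constant lists more types.
SUPPORTED_TYPES_SAMPLE = %w[Article BlogPosting NewsArticle].to_set.freeze

# Hypothetical helper (not part of html2rss): true when the object's
# :@type value is one of the supported type names.
def supported_schema_object?(schema_object)
  SUPPORTED_TYPES_SAMPLE.include?(schema_object[:@type])
end

supported_schema_object?({ :@type => "NewsArticle" }) # a supported type
supported_schema_object?({ :@type => "Recipe" })      # not in the set
```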
Instance Method Summary
- #call ⇒ Hash
  The scraped article hash with DEFAULT_ATTRIBUTES.
- #description ⇒ Object
- #id ⇒ Object
- #image ⇒ Object
- #initialize(schema_object, url:) ⇒ Thing (constructor)
  A new instance of Thing.
- #published_at ⇒ Object
- #title ⇒ Object
- #url ⇒ Addressable::URI?
  The URL of the schema object.
Constructor Details
#initialize(schema_object, url:) ⇒ Thing
Returns a new instance of Thing.
    # File 'lib/html2rss/auto_source/scraper/schema/thing.rb', line 37
    def initialize(schema_object, url:)
      @schema_object = schema_object
      @url = url
    end
Instance Method Details
#call ⇒ Hash
Returns the scraped article hash with DEFAULT_ATTRIBUTES.
    # File 'lib/html2rss/auto_source/scraper/schema/thing.rb', line 43
    def call
      DEFAULT_ATTRIBUTES.to_h do |attribute|
        [attribute, public_send(attribute)]
      end
    end
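The pattern in #call maps each attribute name to the return value of the public method with the same name. A minimal sketch of that pattern follows; `CallExample`, its `ATTRIBUTES` list, and its return values are hypothetical and only mirror the shape of the listing above.

```ruby
# Hypothetical stand-in class; only the #call pattern mirrors Thing#call.
class CallExample
  ATTRIBUTES = %i[id title].freeze

  def id
    "/posts/1"
  end

  def title
    "Hello"
  end

  # Builds a hash of attribute name => result of the same-named method.
  def call
    ATTRIBUTES.to_h { |attribute| [attribute, public_send(attribute)] }
  end
end

result = CallExample.new.call
```

Using `public_send` rather than `send` means only public methods can be reached this way, so every attribute in the list needs a public reader.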
#description ⇒ Object
    # File 'lib/html2rss/auto_source/scraper/schema/thing.rb', line 61
    def description
      schema_object.values_at(:description, :schema_object_body, :abstract)
                   .max_by { |string| string.to_s.size }
    end
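The #description listing picks the longest of several candidate values. The sketch below reproduces that selection on a plain Hash; the sample data is made up for illustration.

```ruby
# Made-up sample data; only :description and :abstract are present.
schema_object = {
  description: "Short teaser.",
  abstract: "A somewhat longer abstract of the article."
}

# values_at returns nil for missing keys. nil.to_s.size is 0, so a missing
# value never wins over a non-empty string.
longest = schema_object
  .values_at(:description, :schema_object_body, :abstract)
  .max_by { |string| string.to_s.size }

# With no candidate present at all, the result is nil.
nothing = {}.values_at(:description, :abstract).max_by { |string| string.to_s.size }
```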
#id ⇒ Object
    # File 'lib/html2rss/auto_source/scraper/schema/thing.rb', line 49
    def id
      return @id if defined?(@id)

      id = (schema_object[:@id] || url&.path).to_s

      return if id.empty?

      @id = id
    end
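The #id listing prefers the schema object's `@id`, falls back to a URL path, and memoizes the result. The sketch below illustrates that flow under two assumptions: `IdExample` is a hypothetical class, and it reads the path from the URL passed to the constructor (using the standard library's `URI` in place of `Addressable::URI`), whereas the real method takes the path from `#url`.

```ruby
require "uri"

# Hypothetical stand-in; simplified so the fallback path comes from @url.
class IdExample
  def initialize(schema_object, url:)
    @schema_object = schema_object
    @url = url
  end

  def id
    # defined? distinguishes "never computed" from a cached value.
    return @id if defined?(@id)

    id = (@schema_object[:@id] || @url&.path).to_s
    return if id.empty? # nothing is cached in this branch

    @id = id
  end
end

page = URI("https://example.com/posts/1")
with_id    = IdExample.new({ :@id => "urn:example:1" }, url: page).id
from_path  = IdExample.new({}, url: page).id
without_id = IdExample.new({}, url: nil).id
```

One consequence of the early `return` is that an empty result is not memoized, so a Thing without any id recomputes the lookup on every call.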
#image ⇒ Object
    # File 'lib/html2rss/auto_source/scraper/schema/thing.rb', line 77
    def image
      if (image_url = image_urls.first)
        Utils.build_absolute_url_from_relative(image_url, @url)
      end
    end
#published_at ⇒ Object
    # File 'lib/html2rss/auto_source/scraper/schema/thing.rb', line 83
    def published_at = schema_object[:datePublished]
#title ⇒ Object
    # File 'lib/html2rss/auto_source/scraper/schema/thing.rb', line 59
    def title = schema_object[:title]
#url ⇒ Addressable::URI?
Returns the URL of the schema object.
    # File 'lib/html2rss/auto_source/scraper/schema/thing.rb', line 67
    def url
      url = schema_object[:url]
      if url.to_s.empty?
        Log.debug("Schema#Thing.url: no url in schema_object: #{schema_object.inspect}")
        return
      end

      Utils.build_absolute_url_from_relative(url, @url)
    end
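The #url listing returns nil when the schema object carries no URL and otherwise resolves the value against the page URL. The internals of `Utils.build_absolute_url_from_relative` are not shown on this page, so the sketch below substitutes the standard library's `URI.join`; `absolute_url` is a hypothetical helper, and the real method returns an `Addressable::URI` and may behave differently in edge cases.

```ruby
require "uri"

# Hypothetical stand-in for Utils.build_absolute_url_from_relative,
# built on stdlib URI.join rather than Addressable.
def absolute_url(url, base_url)
  return if url.to_s.empty? # mirrors the early return for a missing URL

  URI.join(base_url.to_s, url.to_s)
end

base = "https://example.com/blog/"

relative = absolute_url("/posts/1", base)                   # resolved against the base host
absolute = absolute_url("https://cdn.example.org/a", base)  # already absolute, kept as-is
missing  = absolute_url(nil, base)                          # no URL in the schema object
```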