Class: Html2rss::HtmlExtractor::Extractors::Pdf
- Inherits: Object
- Object
- Html2rss::HtmlExtractor::Extractors::Pdf
- Defined in: lib/html2rss/html_extractor/enclosure_extractor.rb
Overview
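Extracts PDF enclosures from HTML tags. Each <a> element inside the given article tag whose href ends in ".pdf" yields one enclosure with an absolute URL and a guessed content type.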
Class Method Summary
Class Method Details
.call(article_tag, base_url:) ⇒ Object
# File 'lib/html2rss/html_extractor/enclosure_extractor.rb', line 54

def self.call(article_tag, base_url:)
  article_tag.css('a[href$=".pdf"]').filter_map do |link|
    href = link['href'].to_s
    next if href.empty?

    abs_url = Url.from_relative(href, base_url)
    {
      url: abs_url,
      type: RssBuilder::Enclosure.guess_content_type_from_url(abs_url)
    }
  end
end
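The filtering and URL resolution performed by .call can be sketched with the Ruby standard library alone. This is a minimal illustration, not the library's implementation: pdf_enclosures is a hypothetical helper that takes plain href strings instead of a parsed article tag, URI.join stands in for Url.from_relative, and the literal 'application/pdf' stands in for whatever RssBuilder::Enclosure.guess_content_type_from_url returns (assumed here, not confirmed by the source).

```ruby
require 'uri'

# Hypothetical stand-in for Pdf.call that works on plain href strings.
# Assumptions: URI.join replaces Url.from_relative, and the content type
# is hard-coded rather than guessed from the URL.
def pdf_enclosures(hrefs, base_url:)
  hrefs.filter_map do |href|
    href = href.to_s
    # Mirror the CSS selector a[href$=".pdf"]: only a literal ".pdf" suffix matches.
    next if href.empty? || !href.end_with?('.pdf')

    abs_url = URI.join(base_url, href).to_s
    { url: abs_url, type: 'application/pdf' }
  end
end

hrefs = [
  'files/report.pdf',                  # relative to the base URL
  '',                                  # skipped: empty
  '/about.html',                       # skipped: not a PDF link
  'https://cdn.example.org/slides.pdf' # already absolute
]

enclosures = pdf_enclosures(hrefs, base_url: 'https://example.com/blog/post')
puts enclosures.map { |enclosure| enclosure[:url] }
# https://example.com/blog/files/report.pdf
# https://cdn.example.org/slides.pdf
```

Because the selector's $= operator matches the end of the attribute value, a link such as report.pdf?download=1 is not picked up; the same holds for the end_with? check in this sketch.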