Class: Html2rss::HtmlExtractor::Extractors::Archive
- Inherits: Object
  - Object
  - Html2rss::HtmlExtractor::Extractors::Archive
- Defined in: lib/html2rss/html_extractor/enclosure_extractor.rb
Overview
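Extracts archive enclosures from anchor tags whose href ends in .zip, .tar.gz, or .tgz. Every match is reported with the type application/zip, regardless of which of the three suffixes matched.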
Class Method Summary
Class Method Details
.call(article_tag, base_url:) ⇒ Object
# File 'lib/html2rss/html_extractor/enclosure_extractor.rb', line 86
def self.call(article_tag, base_url:)
  article_tag.css('a[href$=".zip"], a[href$=".tar.gz"], a[href$=".tgz"]').filter_map do |link|
    href = link['href'].to_s
    next if href.empty?

    abs_url = Url.from_relative(href, base_url)

    {
      url: abs_url,
      type: 'application/zip'
    }
  end
end
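The selection and URL resolution performed by .call can be approximated with the standard library alone. The following is a minimal sketch under stated assumptions, not the library's implementation: archive_enclosures and ARCHIVE_SUFFIXES are hypothetical names, plain href strings stand in for the parsed anchor nodes, String#end_with? stands in for the href$= attribute selectors, and URI.join stands in for Url.from_relative, whose exact behavior is not shown in the source.

```ruby
require 'uri'

# Hypothetical stand-in for the extractor. It works on plain href strings
# instead of parsed HTML nodes, and uses URI.join where the original calls
# Url.from_relative (an assumption about equivalent behavior).
ARCHIVE_SUFFIXES = ['.zip', '.tar.gz', '.tgz'].freeze

def archive_enclosures(hrefs, base_url:)
  hrefs.filter_map do |href|
    href = href.to_s
    # Approximates the a[href$="..."] selectors with a plain suffix match.
    next if href.empty? || !href.end_with?(*ARCHIVE_SUFFIXES)

    # As in the original, the type is fixed regardless of the suffix.
    { url: URI.join(base_url, href).to_s, type: 'application/zip' }
  end
end

hrefs = ['/files/site.zip', 'backup.tar.gz', 'notes.pdf', nil,
         'https://cdn.example.com/a.tgz']
enclosures = archive_enclosures(hrefs, base_url: 'https://example.com/downloads/')
enclosures.each { |enclosure| puts "#{enclosure[:url]} (#{enclosure[:type]})" }
```

Because the match is on the end of the href, a link such as archive.zip?download=1 would not be selected by this sketch, and the same holds for an ends-with attribute selector.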