Class: Html2rss::Selectors::PostProcessors::SanitizeHtml
Defined in: lib/html2rss/selectors/post_processors/sanitize_html.rb
Overview
Returns sanitized HTML code as String.
It sanitizes by using the sanitize gem with Sanitize::Config::RELAXED.
Furthermore, it:
- adds rel="nofollow noopener noreferrer" to <a> tags
- adds referrerpolicy="no-referrer" to <img> tags
- wraps every <img> tag whose direct parent is not an <a> in an <a> linking to the <img>'s src.
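The image-wrapping rule can be illustrated with a minimal sketch. It does not use html2rss or the sanitize gem: `Node`, `wrap_in_anchor?`, and `render_img` are hypothetical names, and the toy struct only stands in for a real DOM node. No HTML escaping is performed, so treat it as an illustration of the decision logic only.

```ruby
# Toy node type standing in for a real DOM node (hypothetical, not html2rss API).
Node = Struct.new(:name, :attributes, :parent, keyword_init: true)

# True when the node is an <img> whose direct parent is not an <a>.
def wrap_in_anchor?(img)
  img.name == 'img' && img.parent&.name != 'a'
end

# Renders the image, wrapped in a link to its own src when required.
# Illustration only: attribute values are not escaped.
def render_img(img)
  src = img.attributes.fetch('src')
  tag = %(<img src="#{src}" referrerpolicy="no-referrer">)
  return tag unless wrap_in_anchor?(img)

  %(<a href="#{src}" rel="nofollow noopener noreferrer" target="_blank">#{tag}</a>)
end

section = Node.new(name: 'section', attributes: {}, parent: nil)
link    = Node.new(name: 'a', attributes: { 'href' => 'https://example.com' }, parent: section)

bare_img   = Node.new(name: 'img', attributes: { 'src' => 'https://example.com/a.png' }, parent: section)
linked_img = Node.new(name: 'img', attributes: { 'src' => 'https://example.com/b.png' }, parent: link)

puts render_img(bare_img)   # wrapped in an <a>
puts render_img(linked_img) # left as a plain <img>
```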
Imagine this HTML structure:
    <section>
      Lorem <b>ipsum</b> dolor...
      <iframe src="https://evil.corp/miner"></iframe>
      <script>alert();</script>
    </section>
YAML usage example:
    selectors:
      description:
        selector: 'section'
        extractor: html
        post_process:
          name: sanitize_html
Would return:
    '<p>Lorem <b>ipsum</b> dolor ...</p>'
Constant Summary
- TAG_ATTRIBUTES =
    {
      'a' => { 'rel' => 'nofollow noopener noreferrer', 'target' => '_blank' },
      'area' => { 'rel' => 'nofollow noopener noreferrer', 'target' => '_blank' },
      'img' => {
        'referrerpolicy' => 'no-referrer',
        'crossorigin' => 'anonymous',
        'loading' => 'lazy',
        'decoding' => 'async'
      },
      'iframe' => {
        'referrerpolicy' => 'no-referrer',
        'crossorigin' => 'anonymous',
        'loading' => 'lazy',
        'sandbox' => 'allow-same-origin',
        'src' => true,
        'width' => true,
        'height' => true
      },
      'video' => {
        'referrerpolicy' => 'no-referrer',
        'crossorigin' => 'anonymous',
        'preload' => 'none',
        'playsinline' => 'true',
        'controls' => 'true'
      },
      'audio' => {
        'referrerpolicy' => 'no-referrer',
        'crossorigin' => 'anonymous',
        'preload' => 'none'
      }
    }.freeze
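As a rough sketch of how a per-tag table like this can be applied, the snippet below merges the forced values over an element's existing attributes. It assumes the forced values win over whatever the source HTML carried. `TAG_ATTRIBUTES_SAMPLE` and `with_forced_attributes` are hypothetical names used only for this illustration, and the sample copies just the 'a' and 'img' entries.

```ruby
# Subset of the constant above, copied for a self-contained example.
TAG_ATTRIBUTES_SAMPLE = {
  'a' => { 'rel' => 'nofollow noopener noreferrer', 'target' => '_blank' },
  'img' => {
    'referrerpolicy' => 'no-referrer',
    'crossorigin' => 'anonymous',
    'loading' => 'lazy',
    'decoding' => 'async'
  }
}.freeze

# Hypothetical helper: returns the attributes with the forced values applied.
# Assumption: forced values override existing ones; unknown tags are untouched.
def with_forced_attributes(tag_name, attributes)
  attributes.merge(TAG_ATTRIBUTES_SAMPLE.fetch(tag_name, {}))
end

link_attrs = with_forced_attributes('a', { 'href' => 'https://example.com', 'rel' => 'author' })
span_attrs = with_forced_attributes('span', { 'title' => 'hello' })

p link_attrs
p span_attrs
```

In this sketch the original `rel="author"` is replaced and `href` is preserved; tags without an entry pass through unchanged.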
Instance Attribute Summary
Attributes inherited from Base
Class Method Summary
- .get(html, url) ⇒ String?
  Shorthand method to get the sanitized HTML.
- .validate_args!(value, context) ⇒ Object
Instance Method Summary
Methods inherited from Base
assert_type, expect_options, #initialize
Constructor Details
This class inherits a constructor from Html2rss::Selectors::PostProcessors::Base
Class Method Details
.get(html, url) ⇒ String?
Shorthand method to get the sanitized HTML.
    # File 'lib/html2rss/selectors/post_processors/sanitize_html.rb', line 95
    def self.get(html, url)
      return nil if String(html).empty?

      new(html, config: { channel: { url: } }).get
    end
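The guard at the top of `.get` relies on `String(html)` turning `nil` into an empty string. The sketch below isolates just that check without loading html2rss; `blank_input?` is a hypothetical name for illustration.

```ruby
# Stand-in for the guard in .get; no html2rss code is loaded here.
def blank_input?(html)
  String(html).empty?
end

puts blank_input?(nil)        # true, because String(nil) is ""
puts blank_input?('')         # true
puts blank_input?('<p>x</p>') # false
puts blank_input?(' ')        # false, whitespace is not empty
```

Whitespace-only input passes this guard; per the `#get` source, it is the instance method that strips the result and returns nil when nothing is left.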
.validate_args!(value, context) ⇒ Object
    # File 'lib/html2rss/selectors/post_processors/sanitize_html.rb', line 86
    def self.validate_args!(value, context)
      assert_type value, String, :value, context:
    end
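To make the intent of the type check concrete, here is a minimal sketch of what a helper like `assert_type` might do. This is an assumption, not the implementation inherited from Base: the error class, the message, and the handling of `context` are all invented for illustration.

```ruby
# Hypothetical stand-in for Base.assert_type; the real helper may differ
# in error class, message, and how it uses the context.
def assert_type(value, type, name, context:)
  return if value.is_a?(type)

  raise ArgumentError, "#{name} must be a #{type}, got #{value.class} (context: #{context.inspect})"
end

# A String value passes silently.
assert_type('<p>ok</p>', String, :value, context: {})

# Anything else is rejected before sanitizing starts.
begin
  assert_type(42, String, :value, context: {})
rescue ArgumentError => e
  puts e.message
end
```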
Instance Method Details
#get ⇒ String?
    # File 'lib/html2rss/selectors/post_processors/sanitize_html.rb', line 103
    def get
      sanitized_html = Sanitize.fragment(value, sanitize_config).to_s
      sanitized_html.gsub!(/\s+/, ' ')
      sanitized_html.strip!
      sanitized_html.empty? ? nil : sanitized_html
    end
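The whitespace handling after sanitizing can be tried on its own. The sketch below reproduces only the collapse, strip, and nil-on-empty steps from `#get`, using non-mutating `gsub` and `strip`; `normalize_whitespace` is a hypothetical name and no sanitizing is performed.

```ruby
# Collapses runs of whitespace to a single space, trims the ends,
# and returns nil when nothing is left (mirrors the tail of #get).
def normalize_whitespace(html)
  normalized = html.gsub(/\s+/, ' ').strip
  normalized.empty? ? nil : normalized
end

p normalize_whitespace("<p>Lorem\n  <b>ipsum</b>\t dolor</p>\n")
# => "<p>Lorem <b>ipsum</b> dolor</p>"
p normalize_whitespace(" \n\t ")
# => nil
```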