Class: Html2rss::AttributePostProcessors::HtmlToMarkdown

Inherits:
Base
  • Object
Defined in:
lib/html2rss/attribute_post_processors/html_to_markdown.rb

Overview

Converts HTML into a Markdown-formatted String. Before the conversion, the HTML is sanitized with SanitizeHtml. Given this HTML structure:

<section>
  Lorem <b>ipsum</b> dolor...
  <iframe src="https://evil.corp/miner"></iframe>
  <script>alert();</script>
</section>

YAML usage example:

selectors:
  description:
    selector: section
    extractor: html
    post_process:
      name: html_to_markdown

Would return:

'Lorem **ipsum** dolor'

Instance Attribute Summary

Attributes inherited from Base

#context, #value

Class Method Summary

Instance Method Summary

Methods inherited from Base

assert_type, expect_options, #initialize

Constructor Details

This class inherits a constructor from Html2rss::AttributePostProcessors::Base

Class Method Details

.validate_args!(value, context) ⇒ Object



# File 'lib/html2rss/attribute_post_processors/html_to_markdown.rb', line 30

def self.validate_args!(value, context)
  assert_type value, String, :value, context:
end
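
The type check above can be illustrated with a minimal sketch. The inherited `assert_type` helper is not shown on this page, so `assert_type_sketch`, `validate_args_sketch!`, and `InvalidTypeSketch` below are hypothetical stand-ins that only assume the helper raises when the value is not one of the allowed types.

```ruby
# Hypothetical error class; the error actually raised by the gem is not shown here.
class InvalidTypeSketch < StandardError; end

# Hypothetical stand-in for the inherited assert_type helper.
# Returns nil when value is an instance of one of the allowed types, raises otherwise.
def assert_type_sketch(value, types, name, context: nil)
  allowed = Array(types)
  return if allowed.any? { |type| value.is_a?(type) }

  raise InvalidTypeSketch,
        "#{name} must be #{allowed.join(' or ')}, got #{value.class} (context: #{context.inspect})"
end

# Mirrors the shape of validate_args!: the value must be a String.
def validate_args_sketch!(value, context)
  assert_type_sketch value, String, :value, context: context
end

validate_args_sketch!('<b>ok</b>', {}) # a String passes silently

begin
  validate_args_sketch!(42, {}) # an Integer is rejected
rescue InvalidTypeSketch => e
  puts e.message
end
```

Under this assumption, a non-String value fails at validation time rather than later during sanitizing or conversion.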

Instance Method Details

#get ⇒ String

Returns the sanitized value formatted as Markdown.

Returns:

  • (String)

    the sanitized value, formatted as Markdown



# File 'lib/html2rss/attribute_post_processors/html_to_markdown.rb', line 36

def get
  sanitized_value = SanitizeHtml.new(value, context).get

  ReverseMarkdown.convert(sanitized_value)
end
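
To make the two-step flow (sanitize first, then convert) concrete, here is a rough standard-library sketch. It does not use SanitizeHtml or ReverseMarkdown: `toy_sanitize` and `toy_to_markdown` are hypothetical, regex-based stand-ins that handle only the tags from the Overview example and are not suitable for real HTML.

```ruby
# Hypothetical stand-in for the sanitizing step:
# drops <script> and <iframe> elements together with their content.
def toy_sanitize(html)
  html.gsub(%r{<(script|iframe)\b[^>]*>.*?</\1>}mi, '')
end

# Hypothetical stand-in for the conversion step:
# turns <b>/<strong> into **bold**, strips every other tag, collapses whitespace.
def toy_to_markdown(html)
  html
    .gsub(%r{<(b|strong)>(.*?)</\1>}mi) { "**#{Regexp.last_match(2)}**" }
    .gsub(/<[^>]+>/, '')
    .gsub(/\s+/, ' ')
    .strip
end

# Same order as #get: sanitize, then convert.
def toy_html_to_markdown(html)
  toy_to_markdown(toy_sanitize(html))
end

html = <<~HTML
  <section>
    Lorem <b>ipsum</b> dolor...
    <iframe src="https://evil.corp/miner"></iframe>
    <script>alert();</script>
  </section>
HTML

puts toy_html_to_markdown(html)
# prints: Lorem **ipsum** dolor...
```

Sanitizing before converting means the converter never sees the `<iframe>` or `<script>` elements, so their content cannot end up in the Markdown output.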