Class: Html2rss::Selectors::PostProcessors::HtmlToMarkdown

Inherits:
Base
  • Object
Defined in:
lib/html2rss/selectors/post_processors/html_to_markdown.rb

Overview

Returns the HTML as a Markdown-formatted String. Before it is converted to Markdown, the HTML is sanitized with SanitizeHtml. Given this HTML structure:

<section>
  Lorem <b>ipsum</b> dolor...
  <iframe src="https://evil.corp/miner"></iframe>
  <script>alert();</script>
</section>

YAML usage example:

selectors:
 description:
   selector: section
   extractor: html
   post_process:
     name: html_to_markdown

Would return:

'Lorem **ipsum** dolor'
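
The sanitize-then-convert order described above can be illustrated with a small standard-library sketch. This is a toy stand-in, not the html2rss implementation: `ToyPipeline` and its methods are hypothetical names, the regular expressions only approximate what SanitizeHtml and ReverseMarkdown do, and the conversion step handles only `<b>` and `<strong>`.

```ruby
# Toy illustration only: hypothetical stand-ins for SanitizeHtml and
# ReverseMarkdown, built on regular expressions from the standard library.
module ToyPipeline
  DANGEROUS_TAGS = %w[script iframe].freeze

  # Step 1 (stand-in for SanitizeHtml): drop dangerous elements and their content.
  def self.sanitize(html)
    DANGEROUS_TAGS.reduce(html) do |memo, tag|
      memo.gsub(%r{<#{tag}\b[^>]*>.*?</#{tag}>}mi, '')
    end
  end

  # Step 2 (stand-in for ReverseMarkdown.convert): turn bold tags into **...**,
  # strip every remaining tag, and collapse whitespace.
  def self.to_markdown(html)
    html.gsub(%r{<(b|strong)>(.*?)</\1>}mi, '**\2**')
        .gsub(/<[^>]+>/, '')
        .gsub(/\s+/, ' ')
        .strip
  end

  # Sanitize first, then convert, mirroring the order described above.
  def self.call(html)
    to_markdown(sanitize(html))
  end
end

html = <<~HTML
  <section>
    Lorem <b>ipsum</b> dolor...
    <iframe src="https://evil.corp/miner"></iframe>
    <script>alert();</script>
  </section>
HTML

puts ToyPipeline.call(html)
```

In this sketch the iframe and script elements are removed before any conversion happens, so their content can never leak into the Markdown output.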

Instance Attribute Summary

Attributes inherited from Base

#context, #value

Class Method Summary

Instance Method Summary

Methods inherited from Base

assert_type, expect_options, #initialize

Constructor Details

This class inherits a constructor from Html2rss::Selectors::PostProcessors::Base

Class Method Details

.validate_args!(value, context) ⇒ Object



# File 'lib/html2rss/selectors/post_processors/html_to_markdown.rb', line 31

def self.validate_args!(value, context)
  assert_type value, String, :value, context:
end
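
As a rough picture of what such a type check does, here is a minimal sketch. `toy_assert_type` is a hypothetical stand-in for the inherited `assert_type`; the error class and message used by html2rss are assumptions here and may differ.

```ruby
# Hypothetical stand-in for Base.assert_type. The real method's error class
# and message are not shown in this page, so ArgumentError is an assumption.
def toy_assert_type(value, type, name, context:)
  return if value.is_a?(type)

  raise ArgumentError,
        "#{name} must be a #{type}, got #{value.class} (context: #{context.inspect})"
end

# A String value passes silently.
toy_assert_type('<p>ok</p>', String, :value, context: {})

# A non-String value is rejected before any conversion is attempted.
begin
  toy_assert_type(42, String, :value, context: {})
rescue ArgumentError => e
  puts e.message
end
```

Validating up front means a misconfigured selector fails with a message naming the offending option, rather than failing later inside the Markdown conversion.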

Instance Method Details

#get ⇒ String

Returns the sanitized value formatted as Markdown.

Returns:

  • (String)

    the sanitized value, formatted as Markdown



# File 'lib/html2rss/selectors/post_processors/html_to_markdown.rb', line 37

def get
  sanitized_value = SanitizeHtml.new(value, context).get

  ReverseMarkdown.convert(sanitized_value)
end