Class: TextRank::CharFilter::StripHtml
- Inherits: Nokogiri::XML::SAX::Document
  - Object
  - Nokogiri::XML::SAX::Document
  - TextRank::CharFilter::StripHtml
- Defined in: lib/text_rank/char_filter/strip_html.rb
Overview
Character filter to remove HTML tags and convert HTML entities to text.
Example
StripHtml.new.filter!("&quot;Optimism&quot;, said Cacambo, &quot;What is that?&quot;") => "\"Optimism\", said Cacambo, \"What is that?\""
StripHtml.new.filter!("Alas! It is the obstinacy of maintaining that everything is best when it is worst.") => "Alas! It is the obstinacy of maintaining that everything is best when it is worst."
Instance Method Summary
- #filter!(text) ⇒ String
  Perform the filter.
- #initialize ⇒ StripHtml (constructor)
  A new instance of StripHtml.
Constructor Details
#initialize ⇒ StripHtml
Returns a new instance of StripHtml.
# File 'lib/text_rank/char_filter/strip_html.rb', line 19
def initialize
  super
  @text = StringIO.new
end
Instance Method Details
#filter!(text) ⇒ String
Perform the filter.
# File 'lib/text_rank/char_filter/strip_html.rb', line 27
def filter!(text)
  @text.rewind
  Nokogiri::HTML::SAX::Parser.new(self).parse(text)
  @text.string
end
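The listing rewinds `@text` before parsing, and no truncation is visible in it. If the SAX callbacks (not shown on this page) append to that same buffer, a reused instance could return leftover characters from an earlier, longer input. The sketch below illustrates only the underlying StringIO behaviour. `BufferedCollector`, `collect`, and `collect_fresh` are hypothetical names, and the parser is replaced by a plain array of text chunks, so this is not the library's code.

```ruby
require "stringio"

# Hypothetical stand-in for a filter that reuses one StringIO across calls.
# An array of text chunks plays the role of the SAX character callbacks.
class BufferedCollector
  def initialize
    @text = StringIO.new
  end

  # Rewinds only, mirroring the pattern in the filter! listing.
  def collect(chunks)
    @text.rewind
    chunks.each { |chunk| @text << chunk }
    @text.string
  end

  # Rewinds and truncates, so earlier output cannot leak through.
  def collect_fresh(chunks)
    @text.rewind
    @text.truncate(0)
    chunks.each { |chunk| @text << chunk }
    @text.string
  end
end

collector = BufferedCollector.new

# StringIO#string returns the live buffer, so copy each result before reuse.
first  = collector.collect(["hello ", "world"]).dup  # => "hello world"
second = collector.collect(["bye"]).dup              # => "byelo world"
third  = collector.collect_fresh(["bye"]).dup        # => "bye"
```

Under these assumptions, creating a new filter instance per input, or truncating the buffer before each parse, avoids the stale tail seen in `second`.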