Class: Loofah::Scrubbers::Unprintable

Inherits:
Loofah::Scrubber
Defined in:
lib/loofah/scrubbers.rb

Overview

scrub!(:unprintable)

+:unprintable+ removes the unprintable Unicode characters U+2028 (LINE SEPARATOR) and
U+2029 (PARAGRAPH SEPARATOR) from text and CDATA nodes.

   markup = "<p>Some text with an unprintable character at the end\u2028</p>"
   Loofah.html5_fragment(markup).scrub!(:unprintable)
   => "<p>Some text with an unprintable character at the end</p>"

You may not be able to see the unprintable character in the above example, but there is a
U+2028 character right before the closing </p> tag. These characters can cause issues if
the content is ever parsed by JavaScript. More information here:

   http://timelessrepo.com/json-isnt-a-javascript-subset
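
To make the effect on the string itself concrete, here is a minimal sketch in plain Ruby that uses no Loofah or Nokogiri calls. The helper name `strip_line_separators` is hypothetical and is not part of Loofah; it only imitates the substitution that the scrubber applies to node content.

```ruby
# Hypothetical helper, not part of Loofah: drops U+2028 and U+2029 from a string.
def strip_line_separators(text)
  text.gsub(/[\u2028\u2029]/, "")
end

dirty = "line one\u2028line two\u2029"
clean = strip_line_separators(dirty)

puts dirty.length  # 18: sixteen visible characters plus two separators
puts clean.length  # 16: both separators removed
```

All other characters, including ordinary whitespace and newlines, are left untouched.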

Constant Summary

Constants inherited from Loofah::Scrubber

Loofah::Scrubber::CONTINUE, Loofah::Scrubber::STOP

Instance Attribute Summary

Attributes inherited from Loofah::Scrubber

#block, #direction

Instance Method Summary

Methods inherited from Loofah::Scrubber

#append_attribute, #traverse

Constructor Details

#initialize ⇒ Unprintable




# File 'lib/loofah/scrubbers.rb', line 339

def initialize # rubocop:disable Lint/MissingSuper
  @direction = :top_down
end

Instance Method Details

#scrub(node) ⇒ Object



# File 'lib/loofah/scrubbers.rb', line 343

def scrub(node)
  if node.type == Nokogiri::XML::Node::TEXT_NODE || node.type == Nokogiri::XML::Node::CDATA_SECTION_NODE
    node.content = node.content.gsub(/\u2028|\u2029/, "")
  end
  CONTINUE
end
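
The behavior of #scrub combined with the :top_down direction can be illustrated without Nokogiri. The sketch below is a stand-in under stated assumptions: nodes are plain hashes, the `:kind` and `:children` keys and the `scrub_tree` method are hypothetical names, and the recursion only approximates how Loofah walks a document. It shows that text and CDATA content is cleaned while other node kinds (here, a comment) are left alone.

```ruby
# Hypothetical stand-in for a parsed document: plain hashes instead of Nokogiri nodes.
def scrub_tree(node)
  # Only text-like nodes are cleaned, mirroring the TEXT_NODE / CDATA_SECTION_NODE check.
  if [:text, :cdata].include?(node[:kind])
    node[:content] = node[:content].gsub(/\u2028|\u2029/, "")
  end
  # Top-down: the node itself is handled before its children.
  node.fetch(:children, []).each { |child| scrub_tree(child) }
  node
end

tree = {
  kind: :element,
  children: [
    { kind: :text, content: "first\u2028" },
    { kind: :comment, content: "kept\u2029" },
    { kind: :cdata, content: "\u2029second" },
  ],
}

scrub_tree(tree)
tree[:children].each { |child| p child[:content] }
```

In the real scrubber, returning CONTINUE is what tells Loofah to keep descending into child nodes; the explicit recursion here plays that role.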