Class: Loofah::Scrubbers::Whitewash

Inherits:
Loofah::Scrubber
Defined in:
lib/loofah/scrubbers.rb

Overview

scrub!(:whitewash)

+:whitewash+ removes all comments, styling and attributes in
addition to doing markup-fixer-uppery and pruning unsafe tags. I
like to call this "whitewashing", since it's like putting a new
layer of paint on top of the HTML input to make it look nice.

   messy_markup = "ohai! <div id='foo' class='bar' style='margin: 10px'>div with attributes</div>"
   Loofah.fragment(messy_markup).scrub!(:whitewash)
   => "ohai! <div>div with attributes</div>"

One use case for this scrubber is to clean up HTML that was
cut-and-pasted from Microsoft Word into a WYSIWYG editor or a
rich text editor. Microsoft's software is famous for injecting
all kinds of cruft into its HTML output. Who needs that crap?
Certainly not me.

Constant Summary

Constants inherited from Loofah::Scrubber

Loofah::Scrubber::CONTINUE, Loofah::Scrubber::STOP

Instance Attribute Summary

Attributes inherited from Loofah::Scrubber

#block, #direction

Instance Method Summary

Methods inherited from Loofah::Scrubber

#traverse

Constructor Details

- (Whitewash) initialize

Returns a new instance of Whitewash, configured to traverse the document top-down.



# File 'lib/loofah/scrubbers.rb', line 145

def initialize
  @direction = :top_down
end

Instance Method Details

- (Object) scrub(node)



# File 'lib/loofah/scrubbers.rb', line 149

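def scrub(node)
  case node.type
  when Nokogiri::XML::Node::ELEMENT_NODE
    if HTML5::Scrub.allowed_element? node.name
      # Allowed elements are kept, but stripped of every attribute.
      node.attributes.each { |attr| node.remove_attribute(attr.first) }
      # Keep descending only when no namespaces apply to the element.
      return CONTINUE if node.namespaces.empty?
    end
  when Nokogiri::XML::Node::TEXT_NODE, Nokogiri::XML::Node::CDATA_SECTION_NODE
    # Text and CDATA content is kept as-is.
    return CONTINUE
  end
  # Anything else (comments, disallowed or namespaced elements) is removed
  # together with its subtree; STOP tells the traversal not to descend.
  node.remove
  STOP
end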
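
The interplay between the top-down direction and the CONTINUE/STOP return values can be easier to follow on a toy tree. The following is a minimal sketch in plain Ruby, not Loofah's or Nokogiri's implementation: the WhitewashSketch module, its Node class, the ALLOWED list, and the traverse helper are all hypothetical stand-ins invented for illustration. The sketch also leaves out the namespace check and CDATA handling that the real method performs.

```ruby
# Hypothetical, simplified model of a top-down scrubbing pass.
# None of these names come from Loofah or Nokogiri.
module WhitewashSketch
  CONTINUE = :continue
  STOP = :stop

  # Assumed allowlist for illustration; Loofah consults its own HTML5 allowlist.
  ALLOWED = %w[div p b i].freeze

  # A tiny tree node: kind is :element, :text, or :comment.
  class Node
    attr_reader :kind, :name, :attrs, :children, :text
    attr_accessor :parent

    def initialize(kind, name: nil, attrs: {}, children: [], text: nil)
      @kind = kind
      @name = name
      @attrs = attrs
      @text = text
      @children = children
      @children.each { |child| child.parent = self }
    end

    # Detach this node (and therefore its subtree) from its parent.
    def remove
      parent.children.delete(self) if parent
    end

    def to_s
      case kind
      when :text then text
      when :comment then "<!--#{text}-->"
      else
        attr_str = attrs.map { |k, v| " #{k}='#{v}'" }.join
        "<#{name}#{attr_str}>#{children.map(&:to_s).join}</#{name}>"
      end
    end
  end

  # Mirrors the shape of the scrub method above: keep allowed elements
  # (minus attributes) and text, remove everything else.
  def self.scrub(node)
    case node.kind
    when :element
      if ALLOWED.include?(node.name)
        node.attrs.clear
        return CONTINUE
      end
    when :text
      return CONTINUE
    end
    node.remove
    STOP
  end

  # Top-down: scrub the node first, then visit children unless told to STOP.
  def self.traverse(node)
    return if scrub(node) == STOP
    # Iterate over a copy, since scrub may remove children while we walk.
    node.children.dup.each { |child| traverse(child) }
  end
end

tree = WhitewashSketch::Node.new(:element, name: "div", attrs: { "id" => "foo" }, children: [
  WhitewashSketch::Node.new(:text, text: "ohai! "),
  WhitewashSketch::Node.new(:comment, text: " note "),
  WhitewashSketch::Node.new(:element, name: "script",
                            children: [WhitewashSketch::Node.new(:text, text: "alert(1)")]),
  WhitewashSketch::Node.new(:element, name: "p", attrs: { "style" => "margin: 10px" },
                            children: [WhitewashSketch::Node.new(:text, text: "kept")])
])

WhitewashSketch.traverse(tree)
puts tree.to_s
# prints: <div>ohai! <p>kept</p></div>
```

Because the pass is top-down, the script element is removed before its text child is ever visited, which is why a removed node returns STOP rather than CONTINUE.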