Module: Loofah::Scrubbers

Defined in: lib/loofah/scrubbers.rb

Overview

Loofah provides some built-in scrubbers for sanitizing with
HTML5lib's whitelist and for accomplishing some common
transformation tasks.

=== Loofah::Scrubbers::Strip / scrub!(:strip)

+:strip+ removes unknown/unsafe tags, but leaves behind the pristine contents:

   unsafe_html = "ohai! <div>div is safe</div> <foo>but foo is <b>not</b></foo>"
   Loofah.fragment(unsafe_html).scrub!(:strip)
   => "ohai! <div>div is safe</div> but foo is <b>not</b>"

=== Loofah::Scrubbers::Prune / scrub!(:prune)

+:prune+ removes unknown/unsafe tags and their contents (including their subtrees):

   unsafe_html = "ohai! <div>div is safe</div> <foo>but foo is <b>not</b></foo>"
   Loofah.fragment(unsafe_html).scrub!(:prune)
   => "ohai! <div>div is safe</div> "

=== Loofah::Scrubbers::Escape / scrub!(:escape)

+:escape+ performs HTML entity escaping on the unknown/unsafe tags:

   unsafe_html = "ohai! <div>div is safe</div> <foo>but foo is <b>not</b></foo>"
   Loofah.fragment(unsafe_html).scrub!(:escape)
   => "ohai! <div>div is safe</div> &lt;foo&gt;but foo is &lt;b&gt;not&lt;/b&gt;&lt;/foo&gt;"

=== Loofah::Scrubbers::Whitewash / scrub!(:whitewash)

+:whitewash+ removes all comments, styling and attributes in
addition to doing markup-fixer-uppery and pruning unsafe tags. I
like to call this "whitewashing", since it's like putting a new
layer of paint on top of the HTML input to make it look nice.

   messy_markup = "ohai! <div id='foo' class='bar' style='margin: 10px'>div with attributes</div>"
   Loofah.fragment(messy_markup).scrub!(:whitewash)
   => "ohai! <div>div with attributes</div>"

One use case for this scrubber is to clean up HTML that was
cut-and-pasted from Microsoft Word into a WYSIWYG editor or a
rich text editor. Microsoft's software is famous for injecting
all kinds of cruft into its HTML output. Who needs that crap?
Certainly not me.

=== Loofah::Scrubbers::NoFollow / scrub!(:nofollow)

+:nofollow+ adds a rel="nofollow" attribute to all links:

   link_farmers_markup = "ohai! <a href='http://www.myswarmysite.com/'>I like your blog post</a>"
   Loofah.fragment(link_farmers_markup).scrub!(:nofollow)
   => "ohai! <a href='http://www.myswarmysite.com/' rel="nofollow">I like your blog post</a>"

Defined Under Namespace

Classes: Escape, NewlineBlockElements, NoFollow, Prune, Strip, Whitewash

Constant Summary

A hash that maps a symbol (like :prune) to the appropriate Scrubber (Loofah::Scrubbers::Prune).

MAP = {
  :escape    => Escape,
  :prune     => Prune,
  :whitewash => Whitewash,
  :strip     => Strip,
  :nofollow  => NoFollow,
  :newline_block_elements => NewlineBlockElements
}
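
The symbol-to-class lookup that MAP describes can be sketched in plain Ruby. This is a minimal illustration, not Loofah's own code: the +LookupSketch+ module, its empty +Prune+ and +Strip+ classes, and the +resolve+ helper are all hypothetical stand-ins so that the snippet runs without Loofah installed.

```ruby
# Hypothetical stand-ins; these are NOT Loofah's scrubber classes.
module LookupSketch
  class Prune; end
  class Strip; end

  # Same shape as Loofah::Scrubbers::MAP: symbol => scrubber class.
  MAP = {
    :prune => Prune,
    :strip => Strip
  }.freeze

  # Hypothetical helper: turn a symbol into a scrubber instance and
  # raise a descriptive error when the symbol is not in the map.
  def self.resolve(name)
    klass = MAP.fetch(name) do
      raise ArgumentError,
            "unknown scrubber #{name.inspect}; expected one of #{MAP.keys.inspect}"
    end
    klass.new
  end
end

scrubber = LookupSketch.resolve(:prune)
puts scrubber.class   # prints "LookupSketch::Prune"
```

Using +fetch+ with a block rather than +[]+ means an unknown symbol fails loudly instead of returning +nil+.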

Class Method Details

+ (Object) scrubber_symbols

Returns an array of symbols representing the built-in scrubbers.

# File 'lib/loofah/scrubbers.rb', line 213

def self.scrubber_symbols
  MAP.keys
end
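
Because Ruby hashes (1.9 and later) preserve insertion order, +MAP.keys+ returns the symbols in the order the hash literal lists them. The sketch below shows this with a stand-in module: +SymbolsSketch+ is hypothetical, and its values are plain strings rather than Loofah's classes, so it runs without Loofah installed.

```ruby
# Hypothetical stand-in that mirrors the shape of Loofah::Scrubbers.
module SymbolsSketch
  MAP = {
    :escape    => "Escape",
    :prune     => "Prune",
    :whitewash => "Whitewash",
    :strip     => "Strip",
    :nofollow  => "NoFollow",
    :newline_block_elements => "NewlineBlockElements"
  }.freeze

  # Same body as the method documented above.
  def self.scrubber_symbols
    MAP.keys
  end
end

symbols = SymbolsSketch.scrubber_symbols
p symbols

# A caller could use the list to reject unknown names up front.
requested = :whitewash
puts symbols.include?(requested)   # prints "true"
```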