Module: Loofah::Scrubbers
- Defined in:
- lib/loofah/scrubbers.rb
Overview
Loofah provides some built-in scrubbers for sanitizing with
HTML5lib's safelist and for accomplishing some common
transformation tasks.
=== Loofah::Scrubbers::Strip / scrub!(:strip)
+:strip+ removes unknown/unsafe tags, but leaves behind the pristine contents:
unsafe_html = "ohai! <div>div is safe</div> <foo>but foo is <b>not</b></foo>"
Loofah.html5_fragment(unsafe_html).scrub!(:strip)
=> "ohai! <div>div is safe</div> but foo is <b>not</b>"
=== Loofah::Scrubbers::Prune / scrub!(:prune)
+:prune+ removes unknown/unsafe tags and their contents (including their subtrees):
unsafe_html = "ohai! <div>div is safe</div> <foo>but foo is <b>not</b></foo>"
Loofah.html5_fragment(unsafe_html).scrub!(:prune)
=> "ohai! <div>div is safe</div> "
=== Loofah::Scrubbers::Escape / scrub!(:escape)
+:escape+ performs HTML entity escaping on the unknown/unsafe tags:
unsafe_html = "ohai! <div>div is safe</div> <foo>but foo is <b>not</b></foo>"
Loofah.html5_fragment(unsafe_html).scrub!(:escape)
=> "ohai! <div>div is safe</div> &lt;foo&gt;but foo is &lt;b&gt;not&lt;/b&gt;&lt;/foo&gt;"
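To make "HTML entity escaping" concrete, here is a minimal pure-Ruby sketch of what happens to the markup characters of an unsafe tag. It is not Loofah's implementation: +escape_markup+ and +ENTITY_MAP+ are hypothetical names used only for illustration, and the sketch works on a plain string rather than a parsed document.

```ruby
# Hypothetical helper, not part of Loofah: replaces the characters that
# would otherwise be interpreted as markup with their HTML entities.
ENTITY_MAP = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;" }.freeze

def escape_markup(text)
  # String#gsub accepts a Hash and looks up each match in it.
  text.gsub(/[&<>]/, ENTITY_MAP)
end

unsafe_fragment = "<foo>but foo is <b>not</b></foo>"
puts escape_markup(unsafe_fragment)
# prints: &lt;foo&gt;but foo is &lt;b&gt;not&lt;/b&gt;&lt;/foo&gt;
```

The escaped text is rendered by a browser as literal characters instead of being parsed as tags, which is why the unsafe markup stays visible but inert.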
=== Loofah::Scrubbers::Whitewash / scrub!(:whitewash)
+:whitewash+ removes all comments, styling and attributes in
addition to doing markup-fixer-uppery and pruning unsafe tags. I
like to call this "whitewashing", since it's like putting a new
layer of paint on top of the HTML input to make it look nice.
messy_markup = "ohai! <div id='foo' class='bar' style='margin: 10px'>div with attributes</div>"
Loofah.html5_fragment(messy_markup).scrub!(:whitewash)
=> "ohai! <div>div with attributes</div>"
One use case for this scrubber is to clean up HTML that was
cut-and-pasted from Microsoft Word into a WYSIWYG editor or a
rich text editor. Microsoft's software is famous for injecting
all kinds of cruft into its HTML output. Who needs that crap?
Certainly not me.
=== Loofah::Scrubbers::NoFollow / scrub!(:nofollow)
+:nofollow+ adds a rel="nofollow" attribute to all links:
link_farmers_markup = "ohai! <a href='http://www.myswarmysite.com/'>I like your blog post</a>"
Loofah.html5_fragment(link_farmers_markup).scrub!(:nofollow)
=> "ohai! <a href='http://www.myswarmysite.com/' rel="nofollow">I like your blog post</a>"
=== Loofah::Scrubbers::TargetBlank / scrub!(:targetblank)
+:targetblank+ adds a target="_blank" attribute to all links:
link_farmers_markup = "ohai! <a href='http://www.myswarmysite.com/'>I like your blog post</a>"
Loofah.html5_fragment(link_farmers_markup).scrub!(:targetblank)
=> "ohai! <a href='http://www.myswarmysite.com/' target="_blank">I like your blog post</a>"
=== Loofah::Scrubbers::NoOpener / scrub!(:noopener)
+:noopener+ adds a rel="noopener" attribute to all links:
link_farmers_markup = "ohai! <a href='http://www.myswarmysite.com/'>I like your blog post</a>"
Loofah.html5_fragment(link_farmers_markup).scrub!(:noopener)
=> "ohai! <a href='http://www.myswarmysite.com/' rel="noopener">I like your blog post</a>"
=== Loofah::Scrubbers::NoReferrer / scrub!(:noreferrer)
+:noreferrer+ adds a rel="noreferrer" attribute to all links:
link_farmers_markup = "ohai! <a href='http://www.myswarmysite.com/'>I like your blog post</a>"
Loofah.html5_fragment(link_farmers_markup).scrub!(:noreferrer)
=> "ohai! <a href='http://www.myswarmysite.com/' rel="noreferrer">I like your blog post</a>"
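Since +:nofollow+, +:noopener+ and +:noreferrer+ all write to the same rel attribute, it can help to see how several tokens might share one value. This page does not say how Loofah merges them, so the following is only a sketch under the assumption that tokens are appended to a space-separated list without duplicates. +append_rel_token+ is a hypothetical helper, not a Loofah method.

```ruby
# Hypothetical helper, not part of Loofah: adds a token to a
# space-separated rel value, leaving existing tokens in place.
def append_rel_token(rel, token)
  tokens = rel.to_s.split            # nil or "" becomes an empty list
  tokens << token unless tokens.include?(token)
  tokens.join(" ")
end

rel = nil
rel = append_rel_token(rel, "nofollow")
rel = append_rel_token(rel, "noopener")
rel = append_rel_token(rel, "nofollow") # already present, no duplicate added
puts rel
# prints: nofollow noopener
```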
=== Loofah::Scrubbers::Unprintable / scrub!(:unprintable)
+:unprintable+ removes unprintable Unicode characters.
markup = "<p>Some text with an unprintable character at the end\u2028</p>"
Loofah.html5_fragment(markup).scrub!(:unprintable)
=> "<p>Some text with an unprintable character at the end</p>"
You may not be able to see the unprintable character in the above example, but there is a
U+2028 character right before the closing </p> tag. These characters can cause issues if
the content is ever parsed by JavaScript - more information here:
http://timelessrepo.com/json-isnt-a-javascript-subset
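As a rough illustration of the effect on a single piece of text, the sketch below removes the line separator (U+2028) from a string. Also removing the paragraph separator (U+2029) is an assumption here, since only U+2028 is mentioned above. +strip_unprintable+ and +UNPRINTABLE_SEPARATORS+ are hypothetical names, and the sketch operates on a raw string rather than on parsed text nodes.

```ruby
# Hypothetical helper, not part of Loofah. Treating U+2029 as unprintable
# alongside U+2028 is an assumption made for this sketch.
UNPRINTABLE_SEPARATORS = /[\u2028\u2029]/

def strip_unprintable(text)
  text.gsub(UNPRINTABLE_SEPARATORS, "")
end

markup  = "<p>Some text with an unprintable character at the end\u2028</p>"
cleaned = strip_unprintable(markup)

puts cleaned.include?("\u2028") # prints: false
puts cleaned
# prints: <p>Some text with an unprintable character at the end</p>
```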
Defined Under Namespace
Classes: DoubleBreakpoint, Escape, NewlineBlockElements, NoFollow, NoOpener, NoReferrer, Prune, Strip, TargetBlank, Unprintable, Whitewash
Constant Summary
- MAP =
A hash that maps a symbol (like :prune) to the appropriate Scrubber (Loofah::Scrubbers::Prune).
    {
      escape: Escape,
      prune: Prune,
      whitewash: Whitewash,
      strip: Strip,
      nofollow: NoFollow,
      noopener: NoOpener,
      noreferrer: NoReferrer,
      targetblank: TargetBlank,
      newline_block_elements: NewlineBlockElements,
      unprintable: Unprintable,
      double_breakpoint: DoubleBreakpoint,
    }
Class Method Summary
- .scrubber_symbols ⇒ Object
Returns an array of symbols representing the built-in scrubbers.
Class Method Details
.scrubber_symbols ⇒ Object
Returns an array of symbols representing the built-in scrubbers.
    # File 'lib/loofah/scrubbers.rb', line 425
    def scrubber_symbols
      MAP.keys
    end
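The relationship between MAP and scrubber_symbols can be sketched with a standalone registry. Everything below is a stand-in: +DemoScrubbers+, its empty classes, and the +resolve+ helper are hypothetical and exist only to show the symbol-to-class lookup pattern, not Loofah's actual classes or lookup code.

```ruby
# Stand-in module for illustration only; the classes are empty placeholders.
module DemoScrubbers
  class Prune; end
  class Strip; end

  # Symbol-to-class registry, mirroring the shape of MAP above.
  MAP = { prune: Prune, strip: Strip }.freeze

  # Hypothetical lookup helper: fails loudly on an unknown symbol.
  def self.resolve(name)
    MAP.fetch(name) { raise ArgumentError, "unknown scrubber: #{name.inspect}" }
  end

  # Same idea as scrubber_symbols: the registry's keys are the valid names.
  def self.scrubber_symbols
    MAP.keys
  end
end

p DemoScrubbers.scrubber_symbols # prints: [:prune, :strip]
p DemoScrubbers.resolve(:prune)  # prints: DemoScrubbers::Prune
```

Keeping the valid names and the classes in one hash means a new built-in scrubber only needs one new entry for both the lookup and the list of symbols to pick it up.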