Class: HTMLFilter
- Inherits: Object
- Defined in: lib/htmlfilter.rb
Overview
HTML Filter
The HTML Filter library can be used to sanitize and sterilize HTML, which is a good idea if you let users submit HTML in comments, for instance.
HtmlFilter is a port of lib_filter.php v1.15 by Cal Henderson <[email protected]>, licensed under a Creative Commons Attribution-ShareAlike 2.5 License (creativecommons.org/licenses/by-sa/3.0/).
Usage
hf = HTMLFilter.new
hf.filter("<b>Bold Action") #=> "<b>Bold Action</b>"
Reference
Issues
- The built-in option constants could use a fair bit of refinement.
- Eventually the old HtmlFilter name needs to be deprecated.
Constant Summary
- VERSION =
Library version.
"1.2.0"
- DEFAULT =
Default settings.
{
  'allowed' => {
    'a'   => ['href', 'target'],
    'img' => ['src', 'width', 'height', 'alt'],
    'b'   => [],
    'i'   => [],
    'em'  => [],
    'tt'  => [],
  },
  'no_close' => ['img', 'br', 'hr'],
  'always_close' => ['a', 'b'],
  'protocol_attributes' => ['src', 'href'],
  'allowed_protocols' => ['http', 'ftp', 'mailto'],
  'remove_blanks' => ['a', 'b'],
  'strip_comments' => true,
  'always_make_tags' => true,
  'allow_numbered_entities' => true,
  'allowed_entities' => ['amp', 'gt', 'lt', 'quot']
}
- BASIC =
Basic settings are similar to DEFAULT but do not allow any type of links, neither a href nor img.
{
  'allowed' => {
    'b'   => [],
    'i'   => [],
    'em'  => [],
    'tt'  => [],
  },
  'no_close' => ['img', 'br', 'hr'],
  'always_close' => ['a', 'b'],
  'protocol_attributes' => ['src', 'href'],
  'allowed_protocols' => ['http', 'ftp', 'mailto'],
  'remove_blanks' => ['a', 'b'],
  'strip_comments' => true,
  'always_make_tags' => true,
  'allow_numbered_entities' => true,
  'allowed_entities' => ['amp', 'gt', 'lt', 'quot']
}
- STRICT =
Strict settings do not allow any tags.
{
  'allowed' => {},
  'no_close' => ['img', 'br', 'hr'],
  'always_close' => ['a', 'b'],
  'protocol_attributes' => ['src', 'href'],
  'allowed_protocols' => ['http', 'ftp', 'mailto'],
  'remove_blanks' => ['a', 'b'],
  'strip_comments' => true,
  'always_make_tags' => true,
  'allow_numbered_entities' => true,
  'allowed_entities' => ['amp', 'gt', 'lt', 'quot']
}
- RELAXED =
Relaxed settings allow a great deal of the HTML spec.
TODO: Need to expand upon RELAXED options.
{
  'allowed' => {
    'a'    => ['class', 'href', 'target'],
    'b'    => ['class'],
    'i'    => ['class'],
    'img'  => ['class', 'src', 'width', 'height', 'alt'],
    'div'  => ['class'],
    'pre'  => ['class'],
    'code' => ['class'],
    'ul'   => ['class'],
    'ol'   => ['class'],
    'li'   => ['class']
  },
  'no_close' => ['img', 'br', 'hr'],
  'always_close' => ['a', 'b'],
  'protocol_attributes' => ['src', 'href'],
  'allowed_protocols' => ['http', 'ftp', 'mailto'],
  'remove_blanks' => ['a', 'b'],
  'strip_comments' => true,
  'always_make_tags' => true,
  'allow_numbered_entities' => true,
  'allowed_entities' => ['amp', 'gt', 'lt', 'quot']
}
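The option constants above are plain hashes, so a custom option set can be derived from one of them. The following is a minimal sketch of that idea, assuming you want to extend a preset without mutating it. `BASIC_PRESET` is a trimmed stand-in for `HTMLFilter::BASIC` so that the example runs without the library, and `with_allowed` is a hypothetical helper that is not part of HTMLFilter.

```ruby
# Stand-in for HTMLFilter::BASIC, trimmed to two keys (assumption: used only
# so this sketch runs without the library installed).
BASIC_PRESET = {
  'allowed'  => { 'b' => [], 'i' => [], 'em' => [], 'tt' => [] },
  'no_close' => ['img', 'br', 'hr']
}.freeze

# Hypothetical helper: returns a new option hash that allows extra tags,
# leaving the preset itself untouched.
def with_allowed(preset, extra)
  options = preset.dup                                # shallow, unfrozen copy of the top level
  options['allowed'] = preset['allowed'].merge(extra) # merge builds a new inner hash
  options
end

custom = with_allowed(BASIC_PRESET, { 'code' => ['class'] })
puts custom['allowed'].keys.inspect
puts BASIC_PRESET['allowed'].key?('code')
```

The inner `'allowed'` hash is rebuilt with `merge` because `dup` only copies the top level; writing into `options['allowed']` directly would otherwise change the shared preset as well.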
Instance Attribute Summary
- #allow_numbered_entities ⇒ Object
entity control option (true, false).
- #allowed ⇒ Object
tags and attributes that are allowed.
- #allowed_entities ⇒ Object
entity control option (amp, gt, lt, quot, etc.).
- #allowed_protocols ⇒ Object
protocols which are allowed (http, ftp, mailto).
- #always_close ⇒ Object
tags which must always have separate opening and closing tags (e.g. “<b></b>”).
- #always_make_tags ⇒ Object
should we try and make a <b> tag out of “b>” (true, false).
- #no_close ⇒ Object
tags which should always be self-closing (e.g. “<img />”).
- #protocol_attributes ⇒ Object
attributes which should be checked for valid protocols (src, href).
- #remove_blanks ⇒ Object
tags which should be removed if they contain no content (e.g. “<b></b>” or “<b />”).
- #strip_comments ⇒ Object
should we remove comments? (true, false).
Instance Method Summary
- #filter(html) ⇒ Object
Filter html string.
- #initialize(options = nil) ⇒ HTMLFilter
constructor
New html filter.
Constructor Details
#initialize(options = nil) ⇒ HTMLFilter
New html filter.
Provide custom options, or use one of the built-in options constants.
hf = HTMLFilter.new(HTMLFilter::RELAXED)
hf.filter(htmlstr)
  # File 'lib/htmlfilter.rb', line 174
  def initialize(options=nil)
    if options
      h = DEFAULT.dup
      options.each do |k,v|
        h[k.to_s] = v
      end
      options = h
    else
      options = DEFAULT.dup
    end
    options.each{ |k,v| send("#{k}=",v) }
  end
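The constructor copies DEFAULT, overlays the caller's options with their keys converted to strings, and then assigns each entry through its attribute writer. A small sketch of the overlay step may make the consequences clearer. `DEFAULTS` is a reduced stand-in for `HTMLFilter::DEFAULT`, and `merge_options` is a hypothetical helper written only for this illustration.

```ruby
# Stand-in for HTMLFilter::DEFAULT, reduced to two keys (assumption).
DEFAULTS = {
  'allowed'        => { 'b' => [], 'i' => [] },
  'strip_comments' => true
}

# Hypothetical helper mirroring the overlay step of the constructor:
# start from a copy of the defaults, then overwrite key by key.
def merge_options(options = nil)
  merged = DEFAULTS.dup
  (options || {}).each { |key, value| merged[key.to_s] = value }
  merged
end

merged = merge_options({ strip_comments: false, 'allowed' => { 'em' => [] } })
puts merged.inspect
```

Two things follow from this shape: symbol keys such as `:strip_comments` are accepted because of `to_s`, and a supplied `'allowed'` hash replaces the default one wholesale rather than being merged into it.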
Instance Attribute Details
#allow_numbered_entities ⇒ Object
entity control option (true, false)
  # File 'lib/htmlfilter.rb', line 77
  def allow_numbered_entities
    @allow_numbered_entities
  end
#allowed ⇒ Object
tags and attributes that are allowed
E.g.
{
'a' => ['href', 'target'],
'b' => [],
'img' => ['src', 'width', 'height', 'alt']
}
  # File 'lib/htmlfilter.rb', line 50
  def allowed
    @allowed
  end
#allowed_entities ⇒ Object
entity control option (amp, gt, lt, quot, etc.)
  # File 'lib/htmlfilter.rb', line 80
  def allowed_entities
    @allowed_entities
  end
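To illustrate how the two entity options could interact, here is a rough sketch of an entity check. It is not the library's implementation, which this page does not show. `escape_disallowed_entities` is a hypothetical function, and the rule that anything not allowed gets its ampersand escaped is an assumption made for the example.

```ruby
# Hypothetical helper (not part of HTMLFilter): keeps allowed entities and
# escapes the ampersand of everything else.
def escape_disallowed_entities(text, allowed_entities, allow_numbered_entities)
  text.gsub(/&(?:(#\d+|[a-zA-Z][a-zA-Z0-9]*);)?/) do |match|
    name = Regexp.last_match(1)
    if name.nil?
      '&amp;'                                            # bare ampersand
    elsif name.start_with?('#')
      allow_numbered_entities ? match : "&amp;#{name};"  # numbered entity
    elsif allowed_entities.include?(name)
      match                                              # named entity on the allow list
    else
      "&amp;#{name};"                                    # named entity not on the list
    end
  end
end

puts escape_disallowed_entities('AT&T &lt; &copy; &#169;', %w[amp gt lt quot], true)
```

With the default list (`amp`, `gt`, `lt`, `quot`) this leaves `&lt;` and the numbered entity alone while neutralizing `&copy;` and the stray ampersand.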
#allowed_protocols ⇒ Object
protocols which are allowed (http, ftp, mailto)
  # File 'lib/htmlfilter.rb', line 64
  def allowed_protocols
    @allowed_protocols
  end
#always_close ⇒ Object
tags which must always have separate opening and closing tags (e.g. “<b></b>”)
  # File 'lib/htmlfilter.rb', line 57
  def always_close
    @always_close
  end
#always_make_tags ⇒ Object
should we try and make a <b> tag out of “b>” (true, false)
  # File 'lib/htmlfilter.rb', line 74
  def always_make_tags
    @always_make_tags
  end
#no_close ⇒ Object
tags which should always be self-closing (e.g. “<img />”)
  # File 'lib/htmlfilter.rb', line 53
  def no_close
    @no_close
  end
#protocol_attributes ⇒ Object
attributes which should be checked for valid protocols (src,href)
  # File 'lib/htmlfilter.rb', line 61
  def protocol_attributes
    @protocol_attributes
  end
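The `protocol_attributes` and `allowed_protocols` options work as a pair: values of the listed attributes are checked against the listed protocols. The sketch below shows one simple way such a check could look. `safe_url?` is a hypothetical function, and treating scheme-less values as acceptable relative URLs is an assumption of this example, not documented library behaviour.

```ruby
# Hypothetical helper (not part of HTMLFilter): true when the value has no
# scheme at all, or when its scheme is on the allow list.
def safe_url?(value, allowed_protocols)
  scheme = value.strip[/\A([a-zA-Z][a-zA-Z0-9+.\-]*):/, 1]
  scheme.nil? || allowed_protocols.include?(scheme.downcase)
end

allowed = ['http', 'ftp', 'mailto']
puts safe_url?('http://example.com/', allowed)
puts safe_url?('javascript:alert(1)', allowed)
```

This only inspects a well-formed leading scheme, so it should be read as an illustration of the option semantics rather than a hardened check against obfuscated input.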
#remove_blanks ⇒ Object
tags which should be removed if they contain no content (e.g. “<b></b>” or “<b />”)
  # File 'lib/htmlfilter.rb', line 68
  def remove_blanks
    @remove_blanks
  end
#strip_comments ⇒ Object
should we remove comments? (true, false)
  # File 'lib/htmlfilter.rb', line 71
  def strip_comments
    @strip_comments
  end
Instance Method Details
#filter(html) ⇒ Object
Filter html string.
  # File 'lib/htmlfilter.rb', line 189
  def filter(html)
    @tag_counts = {}
    html = escape_comments(html)
    html = balance_html(html)
    html = (html)
    html = process_remove_blanks(html)
    html = validate_entities(html)
    #html = truncate_html(html)
    html
  end
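The method body shows the overall design: the string is passed through a fixed sequence of passes, each taking HTML and returning HTML. The following sketch reproduces that pattern in isolation. The two passes are simplified stand-ins invented for this example and do not reproduce the library's private methods; `run_passes` is likewise a hypothetical name.

```ruby
# Simplified stand-in passes (assumptions, not the library's own methods).
STRIP_COMMENTS = ->(html) { html.gsub(/<!--.*?-->/m, '') }
REMOVE_BLANK_B = ->(html) { html.gsub(%r{<b>\s*</b>|<b\s*/>}, '') }

# Feed the output of each pass into the next, in order.
def run_passes(html, passes)
  passes.reduce(html) { |text, pass| pass.call(text) }
end

puts run_passes('Hi<!-- note --> <b></b>there', [STRIP_COMMENTS, REMOVE_BLANK_B])
```

Because each pass sees the previous pass's output, the order matters: in the method above, for instance, blank-tag removal runs after tag balancing.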