Class: HTML::Pipeline::Filter
- Inherits: Object
- Defined in: lib/html/pipeline/filter.rb
Overview
Base class for user content HTML filters. Each filter takes an HTML string or Nokogiri::HTML::DocumentFragment, performs modifications and/or writes information to the result hash. Filters must return a DocumentFragment (typically the same instance provided to the call method) or a String with HTML markup.
Example filter that replaces all images with trollface:
class FuuuFilter < HTML::Pipeline::Filter
  def call
    doc.search('img').each do |img|
      img['src'] = "http://paradoxdgn.com/junk/avatars/trollface.jpg"
    end
    # Return the fragment; #each alone would return the NodeSet of images.
    doc
  end
end
The context Hash passes options to filters and should not be changed in place. A Result Hash allows filters to make extracted information available to the caller and is mutable.
Common context options:
:base_url - The site's base URL
:repository - A Repository providing context for the HTML being processed
Each filter may define additional options and output values. See the class docs for more info.
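The split between the read-only context and the mutable result can be sketched without the gem installed. This is a minimal sketch, not the library's implementation: SketchFilter is a stand-in that copies only the constructor defaults and the class-level call from the source listings on this page, and RootRelativeLinkFilter, the :rewritten_links key, and the example.com URL are hypothetical names used for illustration. It operates on a plain String rather than a Nokogiri fragment.

```ruby
# Stand-in for HTML::Pipeline::Filter so the sketch runs without the gem.
# It mirrors only the documented constructor defaults and Filter.call.
class SketchFilter
  attr_reader :context, :result

  def initialize(html, context = nil, result = nil)
    @html = html
    @context = context || {}
    @result = result || {}
  end

  def self.call(html, context = nil, result = nil)
    new(html, context, result).call
  end
end

# Hypothetical filter: prefixes root-relative hrefs with context[:base_url]
# and reports how many links it rewrote through the result hash.
class RootRelativeLinkFilter < SketchFilter
  def call
    base = (context[:base_url] || '/').chomp('/')
    rewritten = 0
    html = @html.gsub(/href="\/(?!\/)/) do
      rewritten += 1
      %(href="#{base}/)
    end
    result[:rewritten_links] = rewritten # the result hash is mutable
    html                                 # return a String with HTML markup
  end
end

context = { base_url: 'https://example.com/' }.freeze # options are only read
result = {}
output = RootRelativeLinkFilter.call('<a href="/docs">Docs</a>', context, result)

puts output                   # <a href="https://example.com/docs">Docs</a>
puts result[:rewritten_links] # 1
```

Freezing the context in the caller is one way to catch a filter that tries to change it in place; the caller reads extracted values back from the result hash it passed in.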
Direct Known Subclasses
AbsoluteSourceFilter, AutolinkFilter, CamoFilter, EmojiFilter, HttpsFilter, ImageMaxWidthFilter, MentionFilter, SanitizationFilter, SyntaxHighlightFilter, TableOfContentsFilter, TextFilter
Defined Under Namespace
Classes: InvalidDocumentException
Instance Attribute Summary
- #context ⇒ Object (readonly)
  Public: Returns the Hash used to pass options into the filter. Treat it as read-only; extracted information belongs in #result.
- #result ⇒ Object (readonly)
  Public: Returns a Hash used to allow filters to pass back information to callers of the various Pipelines.
Class Method Summary
- .call(doc, context = nil, result = nil) ⇒ Object
  Perform a filter on doc with the given context.
- .to_document(input, context = nil) ⇒ Object
  Like call but guarantees that a DocumentFragment is returned, even when the last filter returns a String.
- .to_html(input, context = nil) ⇒ Object
  Like call but guarantees that a string of HTML markup is returned.
Instance Method Summary
- #base_url ⇒ Object
  The site’s base URL provided in the context hash, or ‘/’ when no base URL was specified.
- #call ⇒ Object
  The main filter entry point.
- #current_user ⇒ Object
  The User object provided in the context hash, or nil when no user was specified.
- #doc ⇒ Object
  The Nokogiri::HTML::DocumentFragment to be manipulated.
- #has_ancestor?(node, tags) ⇒ Boolean
  Helper method for filter subclasses used to determine if any of a node’s ancestors have one of the tag names specified.
- #html ⇒ Object
  The String representation of the document.
- #initialize(doc, context = nil, result = nil) ⇒ Filter (constructor)
  A new instance of Filter.
- #needs(*keys) ⇒ Object
  Validator for required context.
- #parse_html(html) ⇒ Object
  Ensure the passed argument is a DocumentFragment.
- #repository ⇒ Object
  The Repository object provided in the context hash, or nil when no :repository was specified.
- #search_text_nodes(doc) ⇒ Object
  Searches a Nokogiri::HTML::DocumentFragment for text nodes.
- #validate ⇒ Object
  Make sure the context has everything we need.
Constructor Details
#initialize(doc, context = nil, result = nil) ⇒ Filter
Returns a new instance of Filter.
# File 'lib/html/pipeline/filter.rb', line 32
def initialize(doc, context = nil, result = nil)
  if doc.kind_of?(String)
    @html = doc.to_str
    @doc = nil
  else
    @doc = doc
    @html = nil
  end
  @context = context || {}
  @result = result || {}
  validate
end
Instance Attribute Details
#context ⇒ Object (readonly)
Public: Returns the Hash used to pass options into the filter. Treat it as read-only; extracted information belongs in #result.
# File 'lib/html/pipeline/filter.rb', line 48
def context
  @context
end
#result ⇒ Object (readonly)
Public: Returns a Hash used to allow filters to pass back information to callers of the various Pipelines. This can be used for #mentioned_users, for example.
# File 'lib/html/pipeline/filter.rb', line 53
def result
  @result
end
Class Method Details
.call(doc, context = nil, result = nil) ⇒ Object
Perform a filter on doc with the given context.
Returns a Nokogiri::HTML::DocumentFragment or a String containing HTML markup.
# File 'lib/html/pipeline/filter.rb', line 136
def self.call(doc, context = nil, result = nil)
  new(doc, context, result).call
end
.to_document(input, context = nil) ⇒ Object
Like call but guarantees that a DocumentFragment is returned, even when the last filter returns a String.
# File 'lib/html/pipeline/filter.rb', line 142
def self.to_document(input, context = nil)
  html = call(input, context)
  HTML::Pipeline.parse(html)
end
.to_html(input, context = nil) ⇒ Object
Like call but guarantees that a string of HTML markup is returned.
# File 'lib/html/pipeline/filter.rb', line 148
def self.to_html(input, context = nil)
  output = call(input, context)
  if output.respond_to?(:to_html)
    output.to_html
  else
    output.to_s
  end
end
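The branching in .to_html is plain duck typing: anything that responds to to_html is serialized, and everything else is converted with to_s. A minimal sketch of that rule follows. FragmentStandIn and normalize_to_html are hypothetical names introduced here so the code runs without Nokogiri; they are not part of the library.

```ruby
# Stand-in for a Nokogiri fragment: any object that responds to #to_html.
FragmentStandIn = Struct.new(:markup) do
  def to_html
    markup
  end
end

# Same branching as the .to_html listing, applied to an
# already-computed filter output instead of calling the filter.
def normalize_to_html(output)
  if output.respond_to?(:to_html)
    output.to_html
  else
    output.to_s
  end
end

from_fragment = normalize_to_html(FragmentStandIn.new('<p>a</p>'))
from_string   = normalize_to_html('<p>b</p>')

puts from_fragment # <p>a</p>
puts from_string   # <p>b</p>
```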
Instance Method Details
#base_url ⇒ Object
The site’s base URL provided in the context hash, or ‘/’ when no base URL was specified.
# File 'lib/html/pipeline/filter.rb', line 106
def base_url
  context[:base_url] || '/'
end
#call ⇒ Object
The main filter entry point. The doc attribute is guaranteed to be a Nokogiri::HTML::DocumentFragment when invoked. Subclasses should modify this document in place or extract information and add it to the result hash. The base implementation raises NotImplementedError.
# File 'lib/html/pipeline/filter.rb', line 81
def call
  raise NotImplementedError
end
#current_user ⇒ Object
The User object provided in the context hash, or nil when no user was specified.
# File 'lib/html/pipeline/filter.rb', line 100
def current_user
  context[:current_user]
end
#doc ⇒ Object
The Nokogiri::HTML::DocumentFragment to be manipulated. If the filter was provided a String, parse into a DocumentFragment the first time this method is called.
# File 'lib/html/pipeline/filter.rb', line 58
def doc
  @doc ||= parse_html(html)
end
#has_ancestor?(node, tags) ⇒ Boolean
Helper method for filter subclasses used to determine if any of a node’s ancestors have one of the tag names specified.
node - The Node object to check.
tags - An array of tag name strings to check. These should be lowercase.
Returns true when the node has a matching ancestor.
# File 'lib/html/pipeline/filter.rb', line 124
def has_ancestor?(node, tags)
  while node = node.parent
    if tags.include?(node.name.downcase)
      break true
    end
  end
end
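The ancestor walk can be tried in isolation, since the helper only needs each node to expose parent and name. In the sketch below, NodeStandIn is a hypothetical Struct used in place of a Nokogiri node so the code runs without the gem; the method body follows the listing above.

```ruby
# Stand-in node: the helper relies only on #parent and #name.
NodeStandIn = Struct.new(:name, :parent)

# Method body as in the #has_ancestor? source listing.
def has_ancestor?(node, tags)
  while node = node.parent
    if tags.include?(node.name.downcase)
      break true
    end
  end
end

pre  = NodeStandIn.new('PRE', nil)   # upper-case name to exercise downcase
code = NodeStandIn.new('code', pre)
text = NodeStandIn.new('text', code)

inside_pre  = has_ancestor?(text, %w[pre code])
inside_link = has_ancestor?(text, %w[a])

p inside_pre  # true
p inside_link # nil
```

As written, the loop yields nil rather than false when no ancestor matches, so callers should test the result for truthiness instead of comparing it with false.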
#html ⇒ Object
The String representation of the document. If a DocumentFragment was provided to the Filter, it is serialized into a String when this method is called.
# File 'lib/html/pipeline/filter.rb', line 72
def html
  raise InvalidDocumentException if @html.nil? && @doc.nil?
  @html || doc.to_html
end
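The interplay between the constructor, #doc, and #html (store whichever form was given, parse lazily, memoize the result) can be sketched with stand-ins. LazyDocSketch, FakeFragment, and parse_count are hypothetical names added for illustration; the real parser is HTML::Pipeline.parse, which is replaced here by a counter so that the single parse is observable.

```ruby
class InvalidDocumentException < StandardError; end

# Stand-in for a parsed fragment (the real one is a Nokogiri fragment).
FakeFragment = Struct.new(:source) do
  def to_html
    source
  end
end

class LazyDocSketch
  attr_reader :parse_count

  # Same branching as the documented constructor.
  def initialize(doc)
    @parse_count = 0
    if doc.kind_of?(String)
      @html = doc.to_str
      @doc = nil
    else
      @doc = doc
      @html = nil
    end
  end

  # Parsed at most once; later calls reuse the memoized value.
  def doc
    @doc ||= parse_html(html)
  end

  def html
    raise InvalidDocumentException if @html.nil? && @doc.nil?
    @html || doc.to_html
  end

  # Stand-in parser that counts how often it is invoked.
  def parse_html(html)
    @parse_count += 1
    FakeFragment.new(html)
  end
end

filter = LazyDocSketch.new('<p>hi</p>')
before = filter.parse_count  # nothing parsed yet
filter.doc
filter.doc
after = filter.parse_count   # parsed once despite two calls

puts before # 0
puts after  # 1
```

Note that in the listing, a filter built from a String keeps returning that original String from #html even after #doc has been parsed, because @html takes precedence over doc.to_html.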
#needs(*keys) ⇒ Object
Validator for required context. Checks that every key passed in exists in the context hash.
If any keys are missing, an ArgumentError is raised with a message naming the filter class and listing all the missing keys.
# File 'lib/html/pipeline/filter.rb', line 163
def needs(*keys)
  missing = keys.reject { |key| context.include? key }
  if missing.any?
    raise ArgumentError,
      "Missing context keys for #{self.class.name}: #{missing.map(&:inspect).join ', '}"
  end
end
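A typical use is to call needs from an overridden #validate, which the constructor runs at the end of initialization. The sketch below shows that pattern under stated assumptions: ContextCheckingSketch is a stand-in base class that copies needs and the validate hook from the listings on this page, while AssetFilter and the :asset_root key are hypothetical.

```ruby
# Stand-in base class: only the context handling, the validate hook,
# and the needs helper are reproduced from the listings.
class ContextCheckingSketch
  attr_reader :context

  def initialize(context = nil)
    @context = context || {}
    validate
  end

  # No-op by default; subclasses override it.
  def validate
  end

  def needs(*keys)
    missing = keys.reject { |key| context.include? key }
    if missing.any?
      raise ArgumentError,
        "Missing context keys for #{self.class.name}: #{missing.map(&:inspect).join ', '}"
    end
  end
end

# Hypothetical filter that requires two context keys.
class AssetFilter < ContextCheckingSketch
  def validate
    needs :asset_root, :base_url
  end
end

message =
  begin
    AssetFilter.new({ base_url: '/' })
    nil
  rescue ArgumentError => e
    e.message
  end

puts message # Missing context keys for AssetFilter: :asset_root
```

Because validation runs in the constructor, a misconfigured context fails when the filter is built rather than partway through processing.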
#parse_html(html) ⇒ Object
Ensure the passed argument is a DocumentFragment. When a string is provided, it is parsed and returned; otherwise, the DocumentFragment is returned unmodified.
# File 'lib/html/pipeline/filter.rb', line 113
def parse_html(html)
  HTML::Pipeline.parse(html)
end
#repository ⇒ Object
The Repository object provided in the context hash, or nil when no :repository was specified.
It’s assumed that the repository context has already been checked for permissions.
# File 'lib/html/pipeline/filter.rb', line 94
def repository
  context[:repository]
end
#search_text_nodes(doc) ⇒ Object
Searches a Nokogiri::HTML::DocumentFragment for text nodes. If none are found, a second search restricted to the fragment's direct child text nodes (text()) is run.
# File 'lib/html/pipeline/filter.rb', line 64
def search_text_nodes(doc)
  nodes = doc.xpath('.//text()')
  nodes.empty? ? doc.xpath('text()') : nodes
end
#validate ⇒ Object
Make sure the context has everything we need. No-op by default; subclasses can override.
# File 'lib/html/pipeline/filter.rb', line 86
def validate
end