Class: HTML::Pipeline::Filter

Inherits:
Object
Defined in:
lib/html/pipeline/filter.rb

Overview

Base class for user content HTML filters. Each filter takes an HTML string or Nokogiri::HTML::DocumentFragment, performs modifications and/or writes information to the result hash. Filters must return a DocumentFragment (typically the same instance provided to the call method) or a String with HTML markup.

Example filter that replaces all images with trollface:

class FuuuFilter < HTML::Pipeline::Filter
  def call
    doc.search('img').each do |img|
      img['src'] = "http://paradoxdgn.com/junk/avatars/trollface.jpg"
    end
    doc # return the modified fragment, as filters are required to
  end
end

The context Hash passes options to filters and should not be changed in place. A Result Hash allows filters to make extracted information available to the caller and is mutable.

Common context options:

:base_url   - The site's base URL
:repository - A Repository providing context for the HTML being processed

Each filter may define additional options and output values. See the class docs for more info.
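
The split between the read-only context and the mutable result can be sketched without any gems. The classes below are hypothetical stand-ins, not the library's: `ContextSketchFilter` only mirrors the documented constructor defaults, and `MentionCounter`, `:mention_prefix` and `:mention_count` are names invented for this illustration. The sketch works on a plain String instead of a Nokogiri fragment.

```ruby
# Hypothetical stand-in that mirrors the documented constructor defaults.
class ContextSketchFilter
  attr_reader :context, :result

  def initialize(html, context = nil, result = nil)
    @html = html
    @context = context || {}
    @result = result || {}
  end

  # Same shape as the documented class-level entry point.
  def self.call(html, context = nil, result = nil)
    new(html, context, result).call
  end
end

# Reads an option from the context and reports a finding through the result.
class MentionCounter < ContextSketchFilter
  def call
    prefix = context[:mention_prefix] || '@' # context is only read
    result[:mention_count] = @html.scan(/#{Regexp.escape(prefix)}\w+/).size
    @html # return the markup unchanged
  end
end

context = { mention_prefix: '@' }.freeze # frozen, so in-place changes would raise
result = {}
MentionCounter.call('<p>hi @alice and @bob</p>', context, result)
puts result[:mention_count] # prints 2
```

Freezing the context in the caller is one possible way to enforce the "do not change in place" rule; the source does not say the library does this itself.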

Defined Under Namespace

Classes: InvalidDocumentException

Constructor Details

#initialize(doc, context = nil, result = nil) ⇒ Filter

Returns a new instance of Filter.



# File 'lib/html/pipeline/filter.rb', line 32

def initialize(doc, context = nil, result = nil)
  if doc.kind_of?(String)
    @html = doc.to_str
    @doc = nil
  else
    @doc = doc
    @html = nil
  end
  @context = context || {}
  @result = result || {}
  validate
end

Instance Attribute Details

#context ⇒ Object (readonly)

Public: Returns a simple Hash used to pass extra information into filters. Filters should treat it as read-only and report extracted information through the result Hash.



# File 'lib/html/pipeline/filter.rb', line 48

def context
  @context
end

#result ⇒ Object (readonly)

Public: Returns a Hash used to allow filters to pass back information to callers of the various Pipelines. This can be used for #mentioned_users, for example.



# File 'lib/html/pipeline/filter.rb', line 53

def result
  @result
end

Class Method Details

.call(doc, context = nil, result = nil) ⇒ Object

Perform a filter on doc with the given context.

Returns a Nokogiri::HTML::DocumentFragment or a String containing HTML markup.



# File 'lib/html/pipeline/filter.rb', line 141

def self.call(doc, context = nil, result = nil)
  new(doc, context, result).call
end

.to_document(input, context = nil) ⇒ Object

Like call but guarantees that a DocumentFragment is returned, even when the last filter returns a String.



# File 'lib/html/pipeline/filter.rb', line 147

def self.to_document(input, context = nil)
  html = call(input, context)
  HTML::Pipeline.parse(html)
end

.to_html(input, context = nil) ⇒ Object

Like call but guarantees that a string of HTML markup is returned.



# File 'lib/html/pipeline/filter.rb', line 153

def self.to_html(input, context = nil)
  output = call(input, context)
  if output.respond_to?(:to_html)
    output.to_html
  else
    output.to_s
  end
end
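
The listing above relies on duck typing: anything that responds to `to_html` is serialized, and everything else is coerced with `to_s`. A minimal sketch of that dispatch, assuming a hypothetical `FakeFragment` stand-in for a parsed fragment and a free-standing helper named `normalize_to_html` (neither exists in the library):

```ruby
# Hypothetical stand-in for a parsed fragment: anything with #to_html.
FakeFragment = Struct.new(:markup) do
  def to_html
    markup
  end
end

# Same dispatch as the documented method body, as a free-standing helper.
def normalize_to_html(output)
  output.respond_to?(:to_html) ? output.to_html : output.to_s
end

puts normalize_to_html(FakeFragment.new('<b>hi</b>')) # prints <b>hi</b>
puts normalize_to_html('<i>plain string</i>')         # prints <i>plain string</i>
```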

Instance Method Details

#base_url ⇒ Object

The site’s base URL provided in the context hash, or ‘/’ when no base URL was specified.



# File 'lib/html/pipeline/filter.rb', line 111

def base_url
  context[:base_url] || '/'
end

#call ⇒ Object

The main filter entry point. The doc attribute is guaranteed to be a Nokogiri::HTML::DocumentFragment when invoked. Subclasses should modify this document in place or extract information and add it to the result hash.

Raises:

  • (NotImplementedError)


# File 'lib/html/pipeline/filter.rb', line 74

def call
  raise NotImplementedError
end
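
The template-method pattern used here can be sketched with two hypothetical classes, `AbstractSketch` and `UpcaseSketch`, which stand in for the base class and a subclass. One Ruby detail worth knowing: `NotImplementedError` descends from `ScriptError`, not `StandardError`, so a bare `rescue` will not catch a call on a subclass that forgot to override `call`.

```ruby
# Hypothetical stand-in for the abstract base class.
class AbstractSketch
  def initialize(text)
    @text = text
  end

  def call
    raise NotImplementedError
  end
end

# A subclass supplies the actual behaviour by overriding #call.
class UpcaseSketch < AbstractSketch
  def call
    @text.upcase
  end
end

begin
  AbstractSketch.new('hi').call
rescue NotImplementedError => e # must be named explicitly; it is a ScriptError
  puts "base class refused: #{e.class}"
end

puts UpcaseSketch.new('hi').call # prints HI
```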

#can_access_repo?(repo) ⇒ Boolean

Returns whether the filter can access a given repo while it is being applied.

A repo can only be accessed if it’s pullable by the user who submitted the content being filtered, or if it’s the same as the repository context in which the filter runs.

Returns:

  • (Boolean)


# File 'lib/html/pipeline/filter.rb', line 103

def can_access_repo?(repo)
  return false if repo.nil?
  return true if repo == repository
  repo.pullable_by?(current_user)
end
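
The three branches of this check can be exercised in isolation. The sketch below is an assumption-laden stand-in: `SketchRepo` is a hypothetical Struct that only provides `pullable_by?`, users are plain strings, and the method is copied into a free-standing helper named `repo_accessible?` that takes the filter's `repository` and `current_user` as keyword arguments.

```ruby
# Hypothetical repository stand-in; only #pullable_by? is assumed.
SketchRepo = Struct.new(:name, :readers) do
  def pullable_by?(user)
    readers.include?(user)
  end
end

# Free-standing copy of the documented check.
def repo_accessible?(repo, repository:, current_user:)
  return false if repo.nil?            # nothing to access
  return true if repo == repository    # the filter's own repository context
  repo.pullable_by?(current_user)      # otherwise defer to the repo's permissions
end

home         = SketchRepo.new('home', [])
shared       = SketchRepo.new('shared', ['alice'])
private_repo = SketchRepo.new('private', ['bob'])
env = { repository: home, current_user: 'alice' }

puts repo_accessible?(nil, **env)          # prints false
puts repo_accessible?(home, **env)         # prints true
puts repo_accessible?(shared, **env)       # prints true
puts repo_accessible?(private_repo, **env) # prints false
```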

#current_user ⇒ Object

The User object provided in the context hash, or nil when no user was specified.



# File 'lib/html/pipeline/filter.rb', line 93

def current_user
  context[:current_user]
end

#doc ⇒ Object

The Nokogiri::HTML::DocumentFragment to be manipulated. If the filter was provided a String, it is parsed into a DocumentFragment the first time this method is called.



# File 'lib/html/pipeline/filter.rb', line 58

def doc
  @doc ||= parse_html(html)
end

#has_ancestor?(node, tags) ⇒ Boolean

Helper method for filter subclasses used to determine if any of a node’s ancestors have one of the tag names specified.

node - The Node object to check.
tags - An array of tag name strings to check. These should be lowercase.

Returns true when the node has a matching ancestor.

Returns:

  • (Boolean)


# File 'lib/html/pipeline/filter.rb', line 129

def has_ancestor?(node, tags)
  while node = node.parent
    if tags.include?(node.name.downcase)
      break true
    end
  end
end
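
The ancestor walk can be tried out on a hand-built chain of nodes. `SketchNode` below is a hypothetical Struct with only `name` and `parent`, standing in for a Nokogiri node; the method body is copied from the listing above. As written, the loop yields `true` on a match and `nil` (not `false`) otherwise, which is still falsy in a conditional.

```ruby
# Hypothetical node stand-in with just the two members the helper touches.
SketchNode = Struct.new(:name, :parent)

# Body copied from the documented listing.
def has_ancestor?(node, tags)
  while node = node.parent
    if tags.include?(node.name.downcase)
      break true
    end
  end
end

pre  = SketchNode.new('PRE', nil)   # upper-case name, to exercise downcase
code = SketchNode.new('code', pre)
text = SketchNode.new('text', code)

p has_ancestor?(text, %w[pre]) # prints true
p has_ancestor?(text, %w[a])   # prints nil
p has_ancestor?(pre, %w[pre])  # prints nil, a node is not its own ancestor
```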

#html ⇒ Object

The String representation of the document. If a DocumentFragment was provided to the Filter, it is serialized into a String when this method is called.



# File 'lib/html/pipeline/filter.rb', line 65

def html
  raise InvalidDocumentException if @html.nil? && @doc.nil?
  @html || doc.to_html
end
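
How `doc` and `html` convert lazily between the two representations can be sketched with a stand-in class. Everything named here is hypothetical: `LazySketch` copies the documented constructor and accessors, `ParsedDoc` replaces a real Nokogiri fragment, `InvalidDocumentSketch` replaces InvalidDocumentException, and `parse_count` exists only to make the memoization visible. Reading the listing literally, `html` returns the original String whenever one was supplied, even after `doc` has been parsed.

```ruby
class InvalidDocumentSketch < StandardError; end

class LazySketch
  # Stand-in for a parsed fragment.
  ParsedDoc = Struct.new(:source) do
    def to_html
      source
    end
  end

  attr_reader :parse_count

  def initialize(doc)
    if doc.kind_of?(String)
      @html = doc.to_str
      @doc = nil
    else
      @doc = doc
      @html = nil
    end
    @parse_count = 0
  end

  # Parsed at most once, then memoized.
  def doc
    @doc ||= parse_html(html)
  end

  def html
    raise InvalidDocumentSketch if @html.nil? && @doc.nil?
    @html || doc.to_html
  end

  def parse_html(html)
    @parse_count += 1
    ParsedDoc.new(html)
  end
end

from_string = LazySketch.new('<p>hi</p>')
from_string.doc
from_string.doc
puts from_string.parse_count # prints 1

from_doc = LazySketch.new(LazySketch::ParsedDoc.new('<em>x</em>'))
puts from_doc.html           # prints <em>x</em>
puts from_doc.parse_count    # prints 0
```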

#needs(*keys) ⇒ Object

Validator for required context. Checks that every key passed in keys is present in the context Hash.

If any keys are missing, an ArgumentError is raised with a message naming the filter class and listing the missing keys.



# File 'lib/html/pipeline/filter.rb', line 168

def needs(*keys)
  missing = keys.reject { |key| context.include? key }

  if missing.any?
    raise ArgumentError,
      "Missing context keys for #{self.class.name}: #{missing.map(&:inspect).join ', '}"
  end
end
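
A typical use is to call `needs` from an overridden `validate`, which the constructor runs. The sketch below assumes a hypothetical stand-in base class, `NeedsSketch`, that copies only the relevant parts of the listings on this page; `AssetSketchFilter` and the `:asset_root` key are invented for the example.

```ruby
# Hypothetical stand-in carrying just the constructor hook and #needs.
class NeedsSketch
  attr_reader :context

  def initialize(context = nil)
    @context = context || {}
    validate
  end

  # No-op by default; subclasses override.
  def validate
  end

  # Body copied from the documented listing.
  def needs(*keys)
    missing = keys.reject { |key| context.include? key }

    if missing.any?
      raise ArgumentError,
        "Missing context keys for #{self.class.name}: #{missing.map(&:inspect).join ', '}"
    end
  end
end

class AssetSketchFilter < NeedsSketch
  def validate
    needs :asset_root, :base_url
  end
end

AssetSketchFilter.new({ asset_root: '/assets', base_url: '/' }) # passes validation

begin
  AssetSketchFilter.new({ base_url: '/' })
rescue ArgumentError => e
  puts e.message # prints Missing context keys for AssetSketchFilter: :asset_root
end
```

Because validation runs in the constructor, a missing key fails at construction time, before any markup is touched.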

#parse_html(html) ⇒ Object

Ensure the passed argument is a DocumentFragment. When a string is provided, it is parsed and returned; otherwise, the DocumentFragment is returned unmodified.



# File 'lib/html/pipeline/filter.rb', line 118

def parse_html(html)
  HTML::Pipeline.parse(html)
end

#repository ⇒ Object

The Repository object provided in the context hash, or nil when no :repository was specified.

It’s assumed that the repository context has already been checked for permissions.



# File 'lib/html/pipeline/filter.rb', line 87

def repository
  context[:repository]
end

#validate ⇒ Object

Make sure the context has everything we need. No-op by default; subclasses can override.



# File 'lib/html/pipeline/filter.rb', line 79

def validate
end