Class: HTML::Tag

Inherits:
Node
Defined in:
lib/action_controller/vendor/html-scanner/html/node.rb

Overview

A Tag is any node that represents markup. It may be an opening tag, a closing tag, or a self-closing tag. It has a name, and may have a hash of attributes.

Instance Attribute Summary

Attributes inherited from Node

#children, #line, #parent, #position

Instance Method Summary

Methods inherited from Node

#find_all, parse, #validate_conditions

Constructor Details

#initialize(parent, line, pos, name, attributes, closing) ⇒ Tag

Create a new tag node as a child of the given parent, at the given line and position, with the given name, attributes, and closing status.



# File 'lib/action_controller/vendor/html-scanner/html/node.rb', line 240

def initialize(parent, line, pos, name, attributes, closing)
  super(parent, line, pos)
  @name = name
  @attributes = attributes
  @closing = closing
end

Instance Attribute Details

#attributes ⇒ Object (readonly)

Either nil, or a hash of attributes for this node.



# File 'lib/action_controller/vendor/html-scanner/html/node.rb', line 232

def attributes
  @attributes
end

#closing ⇒ Object (readonly)

Either nil, :close, or :self.



# File 'lib/action_controller/vendor/html-scanner/html/node.rb', line 229

def closing
  @closing
end

#name ⇒ Object (readonly)

The name of this tag.



# File 'lib/action_controller/vendor/html-scanner/html/node.rb', line 235

def name
  @name
end

Instance Method Details

#[](attr) ⇒ Object

A convenience for obtaining an attribute of the node. Returns nil if the node has no attributes.



# File 'lib/action_controller/vendor/html-scanner/html/node.rb', line 249

def [](attr)
  @attributes ? @attributes[attr] : nil
end

#childless? ⇒ Boolean

Returns non-nil if this tag cannot contain child nodes (for example, img or br).

Returns:

  • (Boolean)


# File 'lib/action_controller/vendor/html-scanner/html/node.rb', line 254

def childless?
  @name =~ /^(img|br|hr|link|meta|area|base|basefont|col|frame|input|isindex|param)$/o
end
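
To make the check concrete, here is a minimal standalone sketch that applies the same pattern to a few tag names. The constant VOID_TAG_PATTERN and the helper childless_name? are hypothetical names used only for illustration; they are not part of HTML::Tag.

```ruby
# Hypothetical standalone helper; it reuses the pattern from Tag#childless?.
VOID_TAG_PATTERN = /^(img|br|hr|link|meta|area|base|basefont|col|frame|input|isindex|param)$/

def childless_name?(name)
  # =~ returns the match index (0 for these anchored names) or nil,
  # so the result is truthy or falsy rather than strictly true or false.
  name =~ VOID_TAG_PATTERN
end

puts childless_name?("br").inspect   # 0
puts childless_name?("div").inspect  # nil
puts childless_name?("BR").inspect   # nil (the pattern is case-sensitive)
```

Because the result is a match index or nil, callers should treat it as a truthy value and not compare it against true.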

#find(conditions) ⇒ Object

If either the node or any of its children meet the given conditions, the matching node is returned. Otherwise, nil is returned. (See the description of the valid conditions in the match method.)



# File 'lib/action_controller/vendor/html-scanner/html/node.rb', line 275

def find(conditions)
  match(conditions) && self || super
end

#match(conditions) ⇒ Object

Returns true if the node meets all of the given conditions. The conditions parameter must be a hash containing any of the following keys (all are optional):

  • :tag: the node name must match the corresponding value

  • :attributes: a hash. The node’s attribute values must match the corresponding values in the hash.

  • :parent: a hash. The node’s parent must match the corresponding hash.

  • :child: a hash. At least one of the node’s immediate children must meet the criteria described by the hash.

  • :ancestor: a hash. At least one of the node’s ancestors must meet the criteria described by the hash.

  • :descendant: a hash. At least one of the node’s descendants must meet the criteria described by the hash.

  • :children: a hash, for counting children of a node. Accepts the keys:

      • :count: either a number or a range which must equal (or include) the number of children that match.

      • :less_than: the number of matching children must be less than this number.

      • :greater_than: the number of matching children must be greater than this number.

      • :only: another hash consisting of the keys to use to match on the children; only matching children will be counted.
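
The counting rules for :children can be sketched as a small standalone function. This is an illustrative sketch, not the library's code: count_matches? is a hypothetical name, and it takes the number of children that already passed the :only filter instead of a node.

```ruby
# Hypothetical helper: checks a child count against the :children options.
# `matched` is the number of children left after applying :only.
def count_matches?(matched, opts)
  opts.each do |key, value|
    case key
    when :only
      next # filtering is assumed to have happened before this call
    when :count
      # An Integer must equal the count; anything else (such as a Range)
      # must include it.
      ok = value.is_a?(Integer) ? matched == value : value.include?(matched)
      return false unless ok
    when :less_than
      return false unless matched < value
    when :greater_than
      return false unless matched > value
    else
      raise ArgumentError, "unknown count condition #{key}"
    end
  end
  true
end

puts count_matches?(3, :count => 2..4)                       # true
puts count_matches?(3, :count => 2)                          # false
puts count_matches?(3, :greater_than => 1, :less_than => 3)  # false
```

All keys are checked together, so combining :greater_than and :less_than expresses an exclusive range.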

Conditions are matched using the following algorithm:

  • if the condition is a string, it must be a substring of the value.

  • if the condition is a regexp, it must match the value.

  • if the condition is a number, the value must match number.to_s.

  • if the condition is true, the value must not be nil.

  • if the condition is false or nil, the value must be nil.
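
As a rough illustration, the rules above could be written as a single case expression. This sketch follows the list literally; condition_met? is a hypothetical name, and the library's own private helper may differ in details.

```ruby
# Hypothetical standalone version of the matching rules listed above.
# `value` is the node's value (a String or nil); `condition` is what the
# caller supplied in the conditions hash.
def condition_met?(value, condition)
  case condition
  when String     then !value.nil? && value.include?(condition)   # substring
  when Regexp     then !value.nil? && !(value =~ condition).nil?  # pattern match
  when Numeric    then value == condition.to_s                    # compare as text
  when true       then !value.nil?                                # must be present
  when false, nil then value.nil?                                 # must be absent
  else false
  end
end

puts condition_met?("enum list", "enum")  # true
puts condition_met?("enum", /^en/)        # true
puts condition_met?("3", 3)               # true
puts condition_met?(nil, true)            # false
puts condition_met?(nil, false)           # true
```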

Usage:

# test if the node is a "span" tag
node.match :tag => "span"

# test if the node's parent is a "div"
node.match :parent => { :tag => "div" }

# test if any of the node's ancestors are "table" tags
node.match :ancestor => { :tag => "table" }

# test if any of the node's immediate children are "em" tags
node.match :child => { :tag => "em" }

# test if any of the node's descendants are "strong" tags
node.match :descendant => { :tag => "strong" }

# test if the node has between 2 and 4 span tags as immediate children
node.match :children => { :count => 2..4, :only => { :tag => "span" } } 

# get funky: test to see if the node is a "div", has a "ul" ancestor
# and an "li" parent (with "class" = "enum"), and whether or not it has
# a "span" descendant that contains text matching /hello world/:
node.match :tag => "div",
           :ancestor => { :tag => "ul" },
           :parent => { :tag => "li",
                        :attributes => { :class => "enum" } },
           :descendant => { :tag => "span",
                            :child => /hello world/ }


# File 'lib/action_controller/vendor/html-scanner/html/node.rb', line 348

def match(conditions)
  conditions = validate_conditions(conditions)
  
  # only Text nodes have content
  return false if conditions[:content]

  # test the name
  return false unless match_condition(@name, conditions[:tag]) if conditions[:tag]

  # test attributes
  (conditions[:attributes] || {}).each do |key, value|
    return false unless match_condition(self[key], value)
  end

  # test parent
  return false unless parent.match(conditions[:parent]) if conditions[:parent]

  # test children
  return false unless children.find { |child| child.match(conditions[:child]) } if conditions[:child]
   
  # test ancestors
  if conditions[:ancestor]
    return false unless catch :found do
      p = self
      throw :found, true if p.match(conditions[:ancestor]) while p = p.parent
    end
  end

  # test descendants
  if conditions[:descendant]
    return false unless children.find do |child|
      # test the child
      child.match(conditions[:descendant]) ||
      # test the child's descendants
      child.match(:descendant => conditions[:descendant])
    end
  end
  
  # count children
  if opts = conditions[:children]
    matches = children
    matches = matches.select { |c| c.match(opts[:only]) } if opts[:only]
    opts.each do |key, value|
      next if key == :only
      case key
        when :count
          if Integer === value
            return false if matches.length != value
          else
            return false unless value.include?(matches.length)
          end
        when :less_than
          return false unless matches.length < value
        when :greater_than
          return false unless matches.length > value
        else raise "unknown count condition #{key}"
      end
    end
  end
  
  true
end

#tag? ⇒ Boolean

Returns true, indicating that this node represents an HTML tag.

Returns:

  • (Boolean)


# File 'lib/action_controller/vendor/html-scanner/html/node.rb', line 280

def tag?
  true
end

#to_s ⇒ Object

Returns a textual representation of the node.



# File 'lib/action_controller/vendor/html-scanner/html/node.rb', line 259

def to_s
  if @closing == :close
    "</#{@name}>"
  else
    s = "<#{@name}"
    @attributes.each { |k,v| s << " #{k}='#{v.gsub(/'/,"\\\\'")}'" }
    s << " /" if @closing == :self
    s << ">"
    @children.each { |child| s << child.to_s }
    s
  end
end
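
The attribute quoting in to_s is easy to misread, so here is a minimal standalone sketch of how an opening tag is rendered. The helper render_open_tag is a hypothetical name, and the nil guard on attributes is an assumption added for this sketch; the listing above calls @attributes.each directly, which would raise if attributes were nil.

```ruby
# Hypothetical helper mirroring how Tag#to_s renders a non-closing tag
# (children are left out to keep the sketch short).
def render_open_tag(name, attributes, closing = nil)
  s = "<#{name}"
  # Assumption: treat nil attributes as an empty hash.
  (attributes || {}).each do |k, v|
    # "\\\\'" is backslash, backslash, quote. gsub reads the doubled
    # backslash as one literal backslash, so ' becomes \'.
    s << " #{k}='#{v.gsub(/'/, "\\\\'")}'"
  end
  s << " /" if closing == :self
  s << ">"
end

puts render_open_tag("img", { "alt" => "it's" }, :self)  # <img alt='it\'s' />
puts render_open_tag("p", nil)                           # <p>
```

The doubled escaping matters: a replacement of just a backslash followed by a quote would be read by gsub as a back-reference to the text after the match, not as a literal backslash and quote.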