Module: HTree::Container::Trav
- Includes:
- Traverse
- Included in:
- Doc::Trav, Elem::Trav
- Defined in:
- lib/htree/traverse.rb,
lib/htree/modules.rb
Instance Method Summary
- #each_child(&block) ⇒ Object
  each_child iterates over each child.
- #each_child_with_index(&block) ⇒ Object
  each_child_with_index iterates over each child.
- #each_hyperlink ⇒ Object
  each_hyperlink traverses hyperlinks such as HTML href attribute of A element.
- #each_hyperlink_uri(base_uri = nil) ⇒ Object
  each_hyperlink_uri traverses hyperlinks such as HTML href attribute of A element.
- #each_uri(base_uri = nil) ⇒ Object
  each_uri traverses hyperlinks such as HTML href attribute of A element.
- #filter(&block) ⇒ Object
  filter rebuilds the tree without some components.
- #find_element(*names) ⇒ Object
  find_element searches for an element whose universal name is specified by the arguments.
- #traverse_element(*names, &block) ⇒ Object
  traverse_element traverses elements in the tree.
- #traverse_text_internal(&block) ⇒ Object
Methods included from Traverse
#bogusetag?, #comment?, #doc?, #doctype?, #elem?, #get_subnode, #procins?, #text?, #traverse_text, #xmldecl?
Instance Method Details
#each_child(&block) ⇒ Object
each_child iterates over each child.
# File 'lib/htree/traverse.rb', line 29
def each_child(&block) # :yields: child_node
  children.each(&block)
  nil
end
#each_child_with_index(&block) ⇒ Object
each_child_with_index iterates over each child, yielding the child node and its index.
# File 'lib/htree/traverse.rb', line 35
def each_child_with_index(&block) # :yields: child_node, index
  children.each_with_index(&block)
  nil
end
#each_hyperlink ⇒ Object
each_hyperlink traverses hyperlinks such as the href attribute of the HTML A element.
It yields HTree::Text or HTree::Loc.
Note that each_hyperlink also yields the href attribute of the HTML BASE element.
# File 'lib/htree/traverse.rb', line 161
def each_hyperlink # :yields: text
  links = []
  each_hyperlink_attribute {|elem, attr, hyperlink|
    yield hyperlink
  }
end
#each_hyperlink_uri(base_uri = nil) ⇒ Object
each_hyperlink_uri traverses hyperlinks such as the href attribute of the HTML A element.
It yields HTree::Text (or HTree::Loc) and a URI for each hyperlink.
The URI objects are created with a base URI, which is given by the HTML BASE element or by the base_uri argument. each_hyperlink_uri doesn’t yield the href of the BASE element.
# File 'lib/htree/traverse.rb', line 138
def each_hyperlink_uri(base_uri=nil) # :yields: hyperlink, uri
  base_uri = URI.parse(base_uri) if String === base_uri
  links = []
  each_hyperlink_attribute {|elem, attr, hyperlink|
    if %r{\{http://www.w3.org/1999/xhtml\}(?:base)\z}i =~ elem.name
      base_uri = URI.parse(hyperlink.to_s)
    else
      links << hyperlink
    end
  }
  if base_uri
    links.each {|hyperlink| yield hyperlink, base_uri + hyperlink.to_s }
  else
    links.each {|hyperlink| yield hyperlink, URI.parse(hyperlink.to_s) }
  end
end
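The two branches at the end of the method differ only in how each href becomes a URI. The following is a minimal sketch of that difference using only Ruby's standard uri library; the href strings and the base URI are made-up values standing in for what a real document would supply, and no HTree call is involved.

```ruby
require 'uri'

# Hypothetical href values, standing in for the hyperlinks collected from a document.
hrefs = ['page.html', '../up.html', 'http://other.example/x']

# With a base URI, each href is joined onto it (URI#+ resolves relative references).
base_uri = URI.parse('http://example.com/dir/index.html')
resolved = hrefs.map { |href| base_uri + href }

# Without a base URI, each href is parsed on its own and may remain relative.
unresolved = hrefs.map { |href| URI.parse(href) }

puts resolved.map(&:to_s)
puts unresolved.map(&:to_s)
```

A caller that needs absolute URIs from a document without a BASE element would therefore likely want to pass base_uri explicitly.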
#each_uri(base_uri = nil) ⇒ Object
each_uri traverses hyperlinks such as the href attribute of the HTML A element.
It yields a URI for each hyperlink.
The URI objects are created with a base URI, which is given by the HTML BASE element or by the base_uri argument.
# File 'lib/htree/traverse.rb', line 175
def each_uri(base_uri=nil) # :yields: URI
  each_hyperlink_uri(base_uri) {|hyperlink, uri| yield uri }
end
#filter(&block) ⇒ Object
filter rebuilds the tree without some components.
node.filter {|descendant_node| predicate } -> node
loc.filter {|descendant_loc| predicate } -> node
filter yields each node except the top node. If the given block returns false, the corresponding node is dropped. If the given block returns true, the corresponding node is retained and its inner nodes are examined.
filter returns a node. It doesn’t return a location object even if self is a location object.
# File 'lib/htree/traverse.rb', line 259
def filter(&block)
  subst = {}
  each_child_with_index {|descendant, i|
    if yield descendant
      if descendant.elem?
        subst[i] = descendant.filter(&block)
      else
        subst[i] = descendant
      end
    else
      subst[i] = nil
    end
  }
  to_node.subst_subnode(subst)
end
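The drop-or-recurse behaviour described above can be illustrated without HTree itself. The sketch below uses a hypothetical Node struct (not an HTree class) in which plain strings play the role of text nodes; it only mirrors the semantics of filter and makes no claim about how HTree represents nodes internally.

```ruby
# Hypothetical stand-in for an element: a name plus children.
# Plain strings play the role of text nodes.
Node = Struct.new(:name, :children) do
  # Build a new node, dropping rejected children and recursing into kept elements.
  def filter(&block)
    kept = children.select { |child| block.call(child) }
    rebuilt = kept.map { |child| child.is_a?(self.class) ? child.filter(&block) : child }
    self.class.new(name, rebuilt)
  end
end

tree = Node.new('div', [
  Node.new('p', ['keep me', Node.new('script', ['alert(1)'])]),
  Node.new('script', ['alert(2)']),
  'tail text'
])

# Drop every script element; the top node itself is never passed to the block.
clean = tree.filter { |n| !(n.is_a?(Node) && n.name == 'script') }

p clean.children.map { |c| c.is_a?(Node) ? c.name : c }
```

As in the method above, a rejected node takes its whole subtree with it, and the original tree is left unchanged.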
#find_element(*names) ⇒ Object
find_element searches for an element whose universal name is specified by the arguments. It returns the first matching element, or nil if none is found.
# File 'lib/htree/traverse.rb', line 43
def find_element(*names)
  traverse_element(*names) {|e| return e }
  nil
end
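The method relies on a Ruby idiom: a return inside a block exits the method that encloses the block, so the traversal stops at the first element yielded. The sketch below shows the same shape with hypothetical helpers (traverse_names and find_name are illustrative names, not part of HTree) operating on a plain array of strings.

```ruby
# Hypothetical stand-in for traverse_element: yields matching names in order
# and records every name it visits so the early exit is observable.
def traverse_names(all, visited, *wanted)
  all.each do |name|
    visited << name
    yield name if wanted.empty? || wanted.include?(name)
  end
  nil
end

# Same shape as find_element: `return` inside the block leaves find_name
# itself, so the traversal stops at the first match.
def find_name(all, visited, *wanted)
  traverse_names(all, visited, *wanted) { |name| return name }
  nil
end

names = %w[html head body table]
visited = []
first_body = find_name(names, visited, 'body')
missing = find_name(names, [], 'form')

p first_body
p visited
p missing
```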
#traverse_element(*names, &block) ⇒ Object
traverse_element traverses elements in the tree. It yields elements in depth-first order.
If no names are given, it yields all elements. If names are given, they must be universal names, and only elements with those names are yielded.
Nested elements are yielded in depth-first order as follows.
t = HTree('<a id=0><b><a id=1 /></b><c id=2 /></a>')
t.traverse_element("a", "c") {|e| p e}
# =>
{elem <a id="0"> {elem <b> {emptyelem <a id="1">} </b>} {emptyelem <c id="2">} </a>}
{emptyelem <a id="1">}
{emptyelem <c id="2">}
Universal names are specified as follows.
t = HTree(<<'End')
<html>
<meta name="robots" content="index,nofollow">
<meta name="author" content="Who am I?">
</html>
End
t.traverse_element("{http://www.w3.org/1999/xhtml}meta") {|e| p e}
# =>
{emptyelem <{http://www.w3.org/1999/xhtml}meta name="robots" content="index,nofollow">}
{emptyelem <{http://www.w3.org/1999/xhtml}meta name="author" content="Who am I?">}
# File 'lib/htree/traverse.rb', line 76
def traverse_element(*names, &block) # :yields: element
  if names.empty?
    traverse_all_element(&block)
  else
    name_set = {}
    names.each {|n| name_set[n] = true }
    traverse_some_element(name_set, &block)
  end
  nil
end
#traverse_text_internal(&block) ⇒ Object
# File 'lib/htree/traverse.rb', line 228
def traverse_text_internal(&block)
  each_child {|c| c.traverse_text_internal(&block) }
end