Class: Tilia::Xml::Reader
- Inherits:
- Object
- Includes:
- ContextStackTrait
- Defined in:
- lib/tilia/xml/reader.rb
Overview
The Reader class expands upon LibXML's XML::Reader (the Ruby counterpart of PHP's built-in XMLReader).
The intended usage is to assign certain XML elements to classes. These need to be registered using the element_map attribute.
After this is done, a single call to parse will parse the entire document and delegate sub-sections of the document to element classes.
Instance Attribute Summary
Attributes included from ContextStackTrait
#class_map, #context_uri, #element_map, #namespace_map
Instance Method Summary
- #clark ⇒ String?
  Returns the current node name in clark-notation.
- #deserializer_for_element_name(name) ⇒ #call
  Returns the function that should be used to parse the element identified by its clark-notation name.
- #method_missing(name, *args) ⇒ void
  Delegates missing methods to the XML::Reader instance.
- #parse ⇒ Hash
  Reads the entire document.
- #parse_attributes ⇒ Hash
  Grabs all the attributes from the current element and returns them as a key-value hash.
- #parse_current_element ⇒ Hash
  Parses the current XML element.
- #parse_get_elements(element_map = nil) ⇒ Array
  Parses everything in the current sub-tree and returns an array of elements.
- #parse_inner_tree(element_map = nil) ⇒ Array, String
  Parses all elements below the current element.
- #read_text ⇒ String
  Reads all text below the current element and returns it as a string.
- #xml(input) ⇒ XML::Reader
  Fakes the PHP method xml.
Methods included from ContextStackTrait
#initialize, #pop_context, #push_context
Dynamic Method Handling
This class handles dynamic methods through the method_missing method
#method_missing(name, *args) ⇒ void
This method returns an undefined value.
Delegates missing methods to XML::Reader instance
# File 'lib/tilia/xml/reader.rb', line 238
def method_missing(name, *args)
  @reader.send(name, *args)
end
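The delegation above can be illustrated in isolation. The following is a minimal sketch, not the library's code: FakeBackend and DelegatingReader are hypothetical names, and the respond_to_missing? hook is an addition of this sketch that the source listing does not show.

```ruby
# Hypothetical stand-in for the wrapped reader; not part of tilia-xml.
class FakeBackend
  def depth
    2
  end

  def local_name
    'feed'
  end
end

# Minimal wrapper that forwards unknown calls to the wrapped object.
class DelegatingReader
  def initialize(backend)
    @reader = backend
  end

  def method_missing(name, *args, &block)
    # Fall back to the default NoMethodError for calls the backend lacks.
    return super unless @reader.respond_to?(name)

    @reader.public_send(name, *args, &block)
  end

  # Keeps respond_to? consistent with what method_missing accepts.
  def respond_to_missing?(name, include_private = false)
    @reader.respond_to?(name) || super
  end
end

reader = DelegatingReader.new(FakeBackend.new)
reader.local_name          # => "feed"
reader.respond_to?(:depth) # => true
```

Defining respond_to_missing? alongside method_missing is a common Ruby convention; whether the real class does so is not stated here.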
Instance Method Details
#clark ⇒ String?
Returns the current node name in clark-notation.
For example: “{http://www.w3.org/2005/Atom}feed”. Or if no namespace is defined: “{}feed”.
This method returns nil if we’re not currently on an element.
# File 'lib/tilia/xml/reader.rb', line 25
def clark
  return nil unless local_name

  "{#{namespace_uri}}#{local_name}"
end
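To make the notation concrete, here is a small sketch that builds and splits clark-notation names. The helpers to_clark and from_clark are hypothetical and not part of Tilia::Xml::Reader; only the "{namespace}local_name" format is taken from the description above.

```ruby
# Hypothetical helpers, not part of Tilia::Xml::Reader.

# Builds a clark-notation name from a namespace URI and a local name.
# A nil namespace interpolates as an empty string, giving "{}name".
def to_clark(namespace_uri, local_name)
  "{#{namespace_uri}}#{local_name}"
end

# Splits a clark-notation name into [namespace_uri, local_name].
# Returns nil when the string is not in clark-notation.
def from_clark(name)
  match = /\A\{([^}]*)\}(.+)\z/.match(name)
  return nil unless match

  [match[1], match[2]]
end

to_clark('http://www.w3.org/2005/Atom', 'feed') # => "{http://www.w3.org/2005/Atom}feed"
to_clark(nil, 'feed')                           # => "{}feed"
from_clark('{}feed')                            # => ["", "feed"]
```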
#deserializer_for_element_name(name) ⇒ #call
Returns the function that should be used to parse the element identified by its clark-notation name.
# File 'lib/tilia/xml/reader.rb', line 224
def deserializer_for_element_name(name)
  return Element::Base.method(:xml_deserialize) unless @element_map.key?(name)

  deserializer = @element_map[name]
  return deserializer if deserializer.respond_to?(:call)

  return deserializer.method(:xml_deserialize) if deserializer.include?(XmlDeserializable)

  raise "Could not use this type as a deserializer: #{deserializer.inspect} for element: #{name}"
end
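The lookup order in the listing above (fallback, then callable, then class with xml_deserialize) can be tried without an XML backend. This is a sketch under assumptions: SketchDeserializable, Link, FALLBACK and resolve_deserializer are hypothetical stand-ins, and the is_a?(Module) guard is an addition of the sketch.

```ruby
# Hypothetical marker module standing in for XmlDeserializable.
module SketchDeserializable
end

# Hypothetical element class; xml_deserialize receives the reader.
class Link
  include SketchDeserializable

  def self.xml_deserialize(reader)
    "link via #{reader}"
  end
end

# Hypothetical fallback used for names missing from the element map.
FALLBACK = ->(reader) { "fallback via #{reader}" }

def resolve_deserializer(element_map, name)
  return FALLBACK unless element_map.key?(name)

  candidate = element_map[name]
  # Anything callable (lambda, proc, Method object) is used as-is.
  return candidate if candidate.respond_to?(:call)
  # Classes that include the marker module provide xml_deserialize.
  if candidate.is_a?(Module) && candidate.include?(SketchDeserializable)
    return candidate.method(:xml_deserialize)
  end

  raise ArgumentError, "Could not use #{candidate.inspect} as a deserializer for #{name}"
end

element_map = {
  '{http://www.w3.org/2005/Atom}link'  => Link,
  '{http://www.w3.org/2005/Atom}title' => ->(reader) { "title via #{reader}" }
}

resolve_deserializer(element_map, '{http://www.w3.org/2005/Atom}link').call('r') # => "link via r"
resolve_deserializer(element_map, '{}unknown').call('r')                         # => "fallback via r"
```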
#parse ⇒ Hash
Reads the entire document.
This method returns a hash with the following three keys:
* name - The root element name.
* value - The value for the root element.
* attributes - A key-value hash of attributes.
This method also rescues any LibXML::XML::Error raised while reading and re-raises it as a Tilia::Xml::LibXmlException.
# File 'lib/tilia/xml/reader.rb', line 42
def parse
  begin
    nil while node_type != ::LibXML::XML::Reader::TYPE_ELEMENT && read # noop

    result = parse_current_element
  rescue ::LibXML::XML::Error => e
    raise Tilia::Xml::LibXmlException, e.to_s
  end

  result
end
#parse_attributes ⇒ Hash
Grabs all the attributes from the current element and returns them as a key-value hash.
Attributes without a namespace are keyed by their local name. Attributes defined on a namespace are keyed in clark-notation. Namespace declarations (xmlns) are skipped.
# File 'lib/tilia/xml/reader.rb', line 182
def parse_attributes
  attributes = {}

  while move_to_next_attribute != 0
    if namespace_uri
      # Ignoring 'xmlns', it doesn't make any sense.
      next if namespace_uri == 'http://www.w3.org/2000/xmlns/'

      name = clark
      attributes[name] = value
    else
      attributes[local_name] = value
    end
  end

  move_to_element

  attributes
end
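The keying rule in the listing above can be shown on plain data. In this sketch the input shape (a list of [namespace_uri, local_name, value] triples) and the helper name key_attributes are assumptions made for illustration; the real method walks the reader's attribute cursor instead.

```ruby
XMLNS_URI = 'http://www.w3.org/2000/xmlns/'

# attrs is a list of [namespace_uri, local_name, value] triples; this input
# shape is an assumption made for the sketch.
def key_attributes(attrs)
  attrs.each_with_object({}) do |(namespace_uri, local_name, value), result|
    # Namespace declarations are not treated as attributes.
    next if namespace_uri == XMLNS_URI

    key = namespace_uri ? "{#{namespace_uri}}#{local_name}" : local_name
    result[key] = value
  end
end

key_attributes([
  [nil, 'href', '/a'],
  ['http://www.w3.org/XML/1998/namespace', 'lang', 'en'],
  [XMLNS_URI, 'atom', 'http://www.w3.org/2005/Atom']
])
# => {"href"=>"/a", "{http://www.w3.org/XML/1998/namespace}lang"=>"en"}
```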
#parse_current_element ⇒ Hash
Parses the current XML element.
This method returns a hash with three keys:
* name - A clark-notation XML element name.
* value - The parsed value.
* attributes - A key-value hash of attributes.
# File 'lib/tilia/xml/reader.rb', line 158
def parse_current_element
  name = clark

  attributes = {}

  attributes = parse_attributes if has_attributes?

  value = deserializer_for_element_name(name).call(self)

  {
    'name' => name,
    'value' => value,
    'attributes' => attributes
  }
end
#parse_get_elements(element_map = nil) ⇒ Array
parse_get_elements parses everything in the current sub-tree and returns an array of elements.
Each element has a ‘name’, ‘value’ and ‘attributes’ key.
If the element didn’t contain sub-elements, an empty array is always returned. If there was any text inside the element, it will be discarded.
If the element_map argument is specified, the existing element_map will be overridden while parsing the tree, and restored after this process.
# File 'lib/tilia/xml/reader.rb', line 68
def parse_get_elements(element_map = nil)
  result = parse_inner_tree(element_map)
  return [] unless result.is_a?(Array)

  result
end
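The temporary element_map override described above relies on push_context and pop_context from ContextStackTrait. The sketch below imitates that behaviour with a hypothetical SketchContext class; the stack layout and the with_element_map helper are assumptions, and the use of ensure is a choice of this sketch rather than something the source shows.

```ruby
# Hypothetical stand-in for the context handling that ContextStackTrait provides.
class SketchContext
  attr_accessor :element_map

  def initialize
    @element_map = {}
    @stack = []
  end

  # Saves a copy of the current map so it can be restored later.
  def push_context
    @stack.push(@element_map.dup)
  end

  # Restores the most recently saved map.
  def pop_context
    @element_map = @stack.pop
  end

  # Runs the block with a temporary element map, then restores the old one,
  # even if the block raises.
  def with_element_map(temporary_map)
    push_context
    @element_map = temporary_map
    yield
  ensure
    pop_context
  end
end

context = SketchContext.new
context.element_map = { '{}a' => :outer }
context.with_element_map({ '{}a' => :inner }) do
  context.element_map['{}a'] # => :inner
end
context.element_map['{}a']   # => :outer
```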
#parse_inner_tree(element_map = nil) ⇒ Array, String
Parses all elements below the current element.
This method returns a string if the element contained only text, or an array if there were sub-elements. An empty element yields nil.
If there’s both text and sub-elements, the text will be discarded.
If the element_map argument is specified, the existing element_map will be overridden while parsing the tree, and restored after this process.
# File 'lib/tilia/xml/reader.rb', line 87
def parse_inner_tree(element_map = nil)
  text = nil
  elements = []

  if node_type == ::LibXML::XML::Reader::TYPE_ELEMENT && empty_element?
    # Easy!
    self.next
    return nil
  end

  unless element_map.nil?
    push_context
    @element_map = element_map
  end

  return false unless read

  loop do
    # RUBY: Skip is_valid block

    case node_type
    when ::LibXML::XML::Reader::TYPE_ELEMENT
      elements << parse_current_element
    when ::LibXML::XML::Reader::TYPE_TEXT,
         ::LibXML::XML::Reader::TYPE_CDATA
      text ||= ''
      text += value
      read
    when ::LibXML::XML::Reader::TYPE_END_ELEMENT
      # Ensuring we are moving the cursor after the end element.
      read
      break
    when ::LibXML::XML::Reader::TYPE_NONE
      raise Tilia::Xml::ParseException, 'We hit the end of the document prematurely. This likely means that some parser "eats" too many elements. Do not attempt to continue parsing.'
    else
      # Advance to the next element
      read
    end
  end

  pop_context unless element_map.nil?

  elements.any? ? elements : text
end
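The rule that sub-elements win over text can be seen on a simplified event list. This sketch is not the reader loop itself: the [type, payload] events and the collapse_children helper are hypothetical, standing in for the node types the listing above switches on.

```ruby
# Events are hypothetical [type, payload] pairs standing in for reader nodes.
def collapse_children(events)
  text = nil
  elements = []

  events.each do |type, payload|
    case type
    when :element
      elements << payload
    when :text, :cdata
      text = (text || '') + payload
    when :end_element
      break
    end
    # Any other node type (comments, whitespace, ...) is skipped.
  end

  # Text is only returned when no sub-elements were collected.
  elements.any? ? elements : text
end

collapse_children([[:text, 'hi'], [:end_element, nil]])
# => "hi"
collapse_children([[:text, ' '], [:element, { 'name' => '{}a' }], [:end_element, nil]])
# => [{"name"=>"{}a"}]
collapse_children([[:end_element, nil]])
# => nil
```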
#read_text ⇒ String
Reads all text below the current element, and returns this as a string.
# File 'lib/tilia/xml/reader.rb', line 135
def read_text
  result = ''
  previous_depth = depth

  while read && depth != previous_depth
    result += value if [
      ::LibXML::XML::Reader::TYPE_TEXT,
      ::LibXML::XML::Reader::TYPE_CDATA,
      ::LibXML::XML::Reader::TYPE_WHITESPACE
    ].include? node_type
  end

  result
end
#xml(input) ⇒ XML::Reader
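Fakes the PHP method xml.
Creates a new XML::Reader instance from a String, a File, or any other IO-like input. Raises an error if a document has already been loaded.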
# File 'lib/tilia/xml/reader.rb', line 207
def xml(input)
  raise 'XML document already loaded' if @reader

  if input.is_a?(String)
    @reader = ::LibXML::XML::Reader.string(input)
  elsif input.is_a?(File)
    @reader = ::LibXML::XML::Reader.file(input)
  else
    @reader = ::LibXML::XML::Reader.io(input)
  end
end
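The input dispatch in the listing above can be summarised without loading LibXML. The helper reader_constructor_for below is hypothetical and only reports which branch would be taken; it does not create a reader.

```ruby
require 'stringio'

# Reports which branch of the type check above a given input would take.
def reader_constructor_for(input)
  if input.is_a?(String)
    :string
  elsif input.is_a?(File)
    :file
  else
    :io
  end
end

reader_constructor_for('<root/>')               # => :string
reader_constructor_for(StringIO.new('<root/>')) # => :io
```

One consequence of this ordering is that a String is always treated as XML content, never as a file path.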