Module: TaliaCore::ActiveSourceParts::Xml::GenericReaderImportStatements
- Included in:
- GenericReader
- Defined in:
- lib/talia_core/active_source_parts/xml/generic_reader_import_statements.rb
Overview
These are the statements that are used to add handlers for elements, or to otherwise read data for an element.
What is an element handler?
The methods in the Handlers submodule create element handlers. Handlers are the starting point for the import operation and are the only statements at the top level of the import description.
Each handler will match a specific XML tag in the “current” XML content. At the beginning of the import the “current” content will be either the root element or all of its child elements (depending on whether the can_use_root flag is set).
The handlers will automatically attempt to match the element(s) at the starting level:
class SampleReader < GenericReader
  can_use_root CAN_BE_TRUE_OR_FALSE
  element :foo do
    # Code for foo tags
  end
  element :bar do
    # Code for bar tags
  end
  element :foobar do
    # Code for foobar tags
  end
end
This importer has three handlers, for “foo”, “bar” and “foobar” tags. Consider the following XML:
<foobar>
  <foo>Hello</foo>
  <bar>World</bar>
</foobar>
When the reader above is run on the sample XML, the following will happen:
-
If can_use_root has been set to true, the importer will start at the root element. In this case, the “Code for foobar tags” will be executed.
-
If can_use_root has not been set, or has been set to false, the importer will work on the elements inside the root tag. This means that it will first check the “foo” tag and call the “Code for foo tags”, and then check the “bar” tag and call the “Code for bar tags”.
An XML document can also contain the same element multiple times, in which case the handler will be called once for each occurrence.
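The two cases above can be sketched in plain Ruby. This is a minimal model and not TaliaCore code: Node, starting_elements, called_handlers and the HANDLED list are hypothetical names, and a small Struct tree stands in for the parsed XML.

```ruby
# Hypothetical stand-in for a parsed XML element (not a TaliaCore class).
Node = Struct.new(:name, :children)

# The sample document: <foobar><foo/><bar/></foobar>
ROOT = Node.new(:foobar, [Node.new(:foo, []), Node.new(:bar, [])])

# Tags for which the sample reader declares a handler.
HANDLED = [:foo, :bar, :foobar].freeze

# Returns the elements that handlers are matched against at the start
# of the import, depending on the can_use_root flag.
def starting_elements(root, can_use_root)
  can_use_root ? [root] : root.children
end

# Returns the names of the handlers that would be called, in document order.
def called_handlers(root, can_use_root)
  starting_elements(root, can_use_root)
    .map(&:name)
    .select { |name| HANDLED.include?(name) }
end

p called_handlers(ROOT, true)   # => [:foobar]
p called_handlers(ROOT, false)  # => [:foo, :bar]
```

The model only shows which handlers are selected at the starting level. What each handler does with its element is not modelled.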
What happens inside a handler?
When a handler is called, the “current” XML will be set to the inner part of the element that the handler matched. That is, in case of the “foobar” handler the “current” XML would consist of the “foo” and “bar” tags (and their content). For the “foo” and “bar” handlers, the “current” XML would be just the text nodes inside the respective tag.
The handler also has a “current” source that is being imported. In case of a handler that was declared with .element, a new, empty source is created whenever the handler is called. If the handler was declared with .plain_element, the handler “inherits” the current source that was active when it was called.
Inside the handler, the GenericReaderAddStatements are used in order to add data and properties to the current source.
All handlers are executed as instance methods of the current reader.
How are handlers called?
The handlers that are declared in the importer are matched against the “starting” tags in the XML and called automatically. Inside a handler, methods like #add_source can be used to call a handler on sub-elements. Example for the reader given above (with can_use_root set):
element :foobar do
  # At this point a new empty source has been created
  # and is set as the "current" source. The "current" XML
  # is the foo and bar tags
  add_source :foo # Takes the "foo" tag and calls the handler
  add_source :bar # Takes the "bar" tag and calls the handler
  # Alternatively: add_source :from_all_sources -> do both automatically
end
If the “foo” handler was defined as a .plain_element
plain_element(:foo) { }
then the “foo” handler would inherit the source from the “foobar” handler through which it was called.
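The difference between the two declarations can be sketched as follows. This is a minimal model under stated assumptions, not TaliaCore code: a plain Hash stands in for a source, and run_element_handler and run_plain_element_handler are hypothetical helpers that only model which source a handler writes to.

```ruby
# Models a handler declared with .element: it always starts with a new,
# empty source, regardless of the source that was active before.
def run_element_handler(_current_source)
  fresh = {}
  yield fresh
  fresh
end

# Models a handler declared with .plain_element: it keeps writing to the
# source that was active when it was called.
def run_plain_element_handler(current_source)
  yield current_source
  current_source
end

# Hypothetical source that the "foobar" handler is currently importing.
foobar_source = { 'title' => 'from foobar' }

separate  = run_element_handler(foobar_source) { |src| src['text'] = 'Hello' }
inherited = run_plain_element_handler(foobar_source) { |src| src['text'] = 'Hello' }

p separate.equal?(foobar_source)   # => false
p inherited.equal?(foobar_source)  # => true
p foobar_source.keys               # => ["title", "text"]
```

With the .element model the data from “foo” ends up in its own source. With the .plain_element model it is added to the source of the calling “foobar” handler.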
Accessing data within the handlers
While the handlers allow you to navigate through the XML structure, you will also have to read the data in order to construct the sources.
The from_* methods allow you to read data from the current XML:
element :foobar do
  the_thing = from_element :foo
end
In this case, inside the handler for the “foobar” tag, you attempt to read the text from the :foo element. With the XML given above, the result would be the string “Hello”.
Defined Under Namespace
Modules: Handlers
Instance Method Summary
-
#add_part(sub_element = nil, &block) ⇒ Object
Imports another source like add_source and also assigns the new source as a part of the current one.
-
#add_source(sub_element = nil, &block) ⇒ Object
Adds a source from the given sub-element.
-
#all_elements(elem) ⇒ Object
This works like #from_element, except that it will return an array with the values of all “elem” tags inside the current XML.
-
#from_attribute(attrib) ⇒ Object
Gets the data for an attribute of the current XML element.
-
#from_element(elem) ⇒ Object
Gets data from the XML tag “elem” inside the currently active XML.
-
#nested(sub_element, handler_method = nil) ⇒ Object
Adds a nested element.
Instance Method Details
#add_part(sub_element = nil, &block) ⇒ Object
Imports another source like add_source and also assigns the new source as a part of the current one.
# File 'lib/talia_core/active_source_parts/xml/generic_reader_import_statements.rb', line 223
def add_part(sub_element = nil, &block)
  raise(RuntimeError, "Cannot add child before having an uri to refer to.") unless(@current.attributes['uri'])
  @current.element.search("/#{sub_element}").each do |sub_elem|
    attribs = call_handler(sub_elem, &block)
    if(attribs)
      attribs[N::TALIA.part_of.to_s] ||= []
      attribs[N::TALIA.part_of.to_s] << "<#{@current.attributes['uri']}>"
      add_source_with_check(attribs)
    end
  end
end
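The way add_part links the new source to the current one can be sketched in isolation. This is a minimal model based on the source above: PART_OF is a hypothetical placeholder for the value of N::TALIA.part_of.to_s (the real predicate URI is not shown here), link_as_part is a hypothetical helper, and the example.org URIs are made up for illustration.

```ruby
# Hypothetical placeholder for N::TALIA.part_of.to_s.
PART_OF = 'talia:part_of'

# Adds a part_of reference to the parent's URI to the child's attributes.
# Like add_part, it refuses to work when the parent has no URI yet.
def link_as_part(attribs, parent_uri)
  raise 'Cannot add child before having an uri to refer to.' unless parent_uri
  attribs[PART_OF] ||= []
  attribs[PART_OF] << "<#{parent_uri}>"
  attribs
end

child = link_as_part({ 'uri' => 'http://example.org/page1' }, 'http://example.org/book')
p child[PART_OF]  # => ["<http://example.org/book>"]

begin
  link_as_part({}, nil)
rescue RuntimeError => e
  puts e.message  # prints "Cannot add child before having an uri to refer to."
end
```

In practice this means the “uri” of the current source should be set before add_part is called inside a handler.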
#add_source(sub_element = nil, &block) ⇒ Object
Adds a source from the given sub-element. You may either pass a block with the code to import or the name of an already registered element. If the special value :from_all_sources is given, it will read from all sub-elements for which there are registered handlers.
If the method is used with a block, it will call the block as a handler for the current element.
# File 'lib/talia_core/active_source_parts/xml/generic_reader_import_statements.rb', line 207
def add_source(sub_element = nil, &block)
  if(sub_element)
    if(sub_element == :from_all_sources)
      read_children_of(@current.element)
    else
      @current.element.search("/#{sub_element}").each { |sub_elem| read_source(sub_elem, &block) }
    end
  else
    raise(ArgumentError, "When adding elements on the fly, you must use a block") unless(block)
    attribs = call_handler(@current.element, &block)
    add_source_with_check(attribs) if(attribs)
  end
end
#all_elements(elem) ⇒ Object
This works like #from_element, except that it will return an array with the values of all “elem” tags inside the current XML.
# File 'lib/talia_core/active_source_parts/xml/generic_reader_import_statements.rb', line 176
def all_elements(elem)
  result = []
  @current.element.search("/#{elem}").each { |el| result << el.inner_text.strip }
  result
end
#from_attribute(attrib) ⇒ Object
Gets the data for an attribute of the current XML element. For example, given this XML
<foobar name="myself">
  <foo>Hello World</foo>
</foobar>
and this handler
element :foobar do
  my_attr = from_attribute :name
end
the my_attr variable will be set to “myself”.
# File 'lib/talia_core/active_source_parts/xml/generic_reader_import_statements.rb', line 156
def from_attribute(attrib)
  @current.element[attrib]
end
#from_element(elem) ⇒ Object
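Gets data from the XML tag “elem” inside the currently active XML. Given the XML
<foobar><foo>Hello World</foo></foobar>
calling from_element :foo inside the “foobar” handler would return “Hello World”. The special value :self returns the text of the current element itself. If there are several “elem” tags with different content, an ArgumentError is raised.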
# File 'lib/talia_core/active_source_parts/xml/generic_reader_import_statements.rb', line 166
def from_element(elem)
  return @current.element.inner_text.strip if(elem == :self)
  elements = all_elements(elem)
  elements = elements.uniq if(elements.size > 1) # Try to ignore dupes
  raise(ArgumentError, "More than one element of #{elem} in #{@current.element.inspect}") if(elements.size > 1)
  elements.first
end
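How from_element handles repeated tags can be sketched without any XML at all. The helper below is hypothetical and only mirrors the selection logic of the source above: values stands in for the array returned by all_elements(elem), and the error message is shortened.

```ruby
# Picks the single value for a tag, tolerating identical duplicates.
def single_value(values, elem)
  values = values.uniq if values.size > 1 # identical duplicates collapse
  raise ArgumentError, "More than one element of #{elem}" if values.size > 1
  values.first
end

p single_value(['Hello'], :foo)           # => "Hello"
p single_value(['Hello', 'Hello'], :foo)  # => "Hello"
p single_value([], :foo)                  # => nil

begin
  single_value(['Hello', 'World'], :foo)
rescue ArgumentError => e
  puts e.message                          # prints "More than one element of foo"
end
```

A missing tag therefore yields nil instead of an error, while several tags with different content are rejected. For the latter case #all_elements is the appropriate method.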
#nested(sub_element, handler_method = nil) ⇒ Object
Adds a nested element. This will not change the currently importing source, but it will set the currently active XML to the nested element. If a block is given, it will be executed for each of the nested elements that are found. Otherwise, a method name must be given as a Symbol, and that method will be executed instead of the block. The original element becomes the active XML again once all nested elements have been processed.
# File 'lib/talia_core/active_source_parts/xml/generic_reader_import_statements.rb', line 187
def nested(sub_element, handler_method = nil)
  original_element = @current.element
  begin
    @current.element.search("#{sub_element}").each do |sub_elem|
      @current.element = sub_elem
      assit(block_given? ^ (handler_method.is_a?(Symbol)), 'Must have either a handler (x)or a block.')
      block_given? ? yield : self.send(handler_method)
    end
  ensure
    @current.element = original_element
  end
end
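The save-and-restore behaviour of nested can be sketched with a small stand-alone class. CursorSketch is hypothetical and models only the “current” element. Symbols stand in for XML elements, and the sub-elements are passed in directly instead of being searched for.

```ruby
# Hypothetical reader state: only the "current" element is modelled.
class CursorSketch
  attr_reader :current

  def initialize(root)
    @current = root
  end

  # Temporarily makes each sub-element the current one, then restores
  # the original element, even if the block raises.
  def nested(sub_elements)
    original = @current
    begin
      sub_elements.each do |sub|
        @current = sub
        yield
      end
    ensure
      @current = original
    end
  end
end

cursor = CursorSketch.new(:foobar)
seen = []
cursor.nested([:foo, :bar]) { seen << cursor.current }
p seen            # => [:foo, :bar]
p cursor.current  # => :foobar

begin
  cursor.nested([:foo]) { raise 'boom' }
rescue RuntimeError
  # The error propagates, but the cursor has already been restored.
end
p cursor.current  # => :foobar
```

Because the restore happens in an ensure clause, a failing handler does not leave the reader pointing at the nested element.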