Module: MultiXml
Extended by: Helpers
Defined in:
- lib/multi_xml.rb
- lib/multi_xml/errors.rb
- lib/multi_xml/helpers.rb
- lib/multi_xml/version.rb
- lib/multi_xml/constants.rb
- lib/multi_xml/file_like.rb
- lib/multi_xml/parsers/ox.rb
- lib/multi_xml/parsers/oga.rb
- lib/multi_xml/parsers/rexml.rb
- lib/multi_xml/parsers/libxml.rb
- lib/multi_xml/parsers/nokogiri.rb
- lib/multi_xml/parsers/dom_parser.rb
- lib/multi_xml/parsers/libxml_sax.rb
- lib/multi_xml/parsers/sax_handler.rb
- lib/multi_xml/parsers/nokogiri_sax.rb
Overview
A generic swappable back-end for parsing XML
MultiXml provides a unified interface for XML parsing across different parser libraries. It automatically selects the best available parser (Ox, LibXML, Nokogiri, Oga, or REXML) and converts XML to Ruby hashes.
Defined Under Namespace
Modules: FileLike, Helpers, Parsers
Classes: DisallowedTypeError, NoParserError, ParseError
Constant Summary
- VERSION =
The current version of MultiXml
Gem::Version.create("0.8.1")
- TEXT_CONTENT_KEY =
Hash key for storing text content within element hashes
"__content__".freeze
- RUBY_TYPE_TO_XML =
Maps Ruby class names to XML type attribute values
{
  "Symbol" => "symbol",
  "Integer" => "integer",
  "BigDecimal" => "decimal",
  "Float" => "float",
  "TrueClass" => "boolean",
  "FalseClass" => "boolean",
  "Date" => "date",
  "DateTime" => "datetime",
  "Time" => "datetime",
  "Array" => "array",
  "Hash" => "hash"
}.freeze
- DISALLOWED_TYPES =
XML type attributes disallowed by default for security
These types are blocked to prevent code execution vulnerabilities.
%w[symbol yaml].freeze
- FALSE_BOOLEAN_VALUES =
Values that represent false in XML boolean attributes
Set.new(%w[0 false]).freeze
- DEFAULT_OPTIONS =
Default parsing options
{ typecast_xml_value: true, disallowed_types: DISALLOWED_TYPES, symbolize_keys: false }.freeze
- PARSER_PREFERENCE =
Parser libraries in preference order (fastest first)
[
  ["ox", :ox],
  ["libxml", :libxml],
  ["nokogiri", :nokogiri],
  ["rexml/document", :rexml],
  ["oga", :oga]
].freeze
- PARSE_DATETIME =
This constant is part of a private API. Avoid using it if possible, as it may be removed or changed in the future.
Parses datetime strings, trying Time first, then DateTime
lambda do |string|
  Time.parse(string).utc
rescue ArgumentError
  DateTime.parse(string).to_time.utc
end
- FILE_CONVERTER =
This constant is part of a private API. Avoid using it if possible, as it may be removed or changed in the future.
Creates a file-like StringIO from base64-encoded content
lambda do |content, entity|
  StringIO.new(content.unpack1("m")).tap do |io|
    io.extend(FileLike)
    file_io = io # : FileIO
    file_io.original_filename = entity["name"]
    file_io.content_type = entity["content_type"]
  end
end
- TYPE_CONVERTERS =
Type converters for XML type attributes
Maps type attribute values to callables (procs, lambdas, and method objects) that convert string content. Converters with arity 2 receive the content and the full entity hash.
{
  # Primitive types
  "symbol" => :to_sym.to_proc,
  "string" => :to_s.to_proc,
  "integer" => :to_i.to_proc,
  "float" => :to_f.to_proc,
  "double" => :to_f.to_proc,
  "decimal" => ->(s) { BigDecimal(s) },
  "boolean" => ->(s) { !FALSE_BOOLEAN_VALUES.include?(s.strip) },

  # Date and time types
  "date" => Date.method(:parse),
  "datetime" => PARSE_DATETIME,
  "dateTime" => PARSE_DATETIME,

  # Binary types
  "base64Binary" => ->(s) { s.unpack1("m") },
  "binary" => ->(s, entity) { (entity["encoding"] == "base64") ? s.unpack1("m") : s },
  "file" => FILE_CONVERTER,

  # Structured types
  "yaml" => lambda do |string|
    YAML.safe_load(string, permitted_classes: [Symbol, Date, Time])
  rescue ArgumentError, Psych::SyntaxError
    string
  end
}.freeze
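The arity rule described for TYPE_CONVERTERS can be illustrated with a minimal sketch. This is not the library's implementation: `CONVERTERS` is a trimmed local copy of the table, `FALSE_VALUES` mirrors FALSE_BOOLEAN_VALUES, and `apply` is a hypothetical helper name chosen for this example (the library lists its own helper as apply_converter, whose body is not shown on this page).

```ruby
require "set"
require "date"

# Local stand-in for FALSE_BOOLEAN_VALUES.
FALSE_VALUES = Set.new(%w[0 false]).freeze

# Trimmed copy of the converter table, for illustration only.
CONVERTERS = {
  "integer" => :to_i.to_proc,
  "boolean" => ->(s) { !FALSE_VALUES.include?(s.strip) },
  "date" => Date.method(:parse),
  # Arity-2 converter: also inspects the element's attributes.
  "binary" => ->(s, entity) { (entity["encoding"] == "base64") ? s.unpack1("m") : s }
}.freeze

# Hypothetical helper: hand the entity hash only to converters that
# declare exactly two parameters; unknown types pass through unchanged.
def apply(type, content, entity = {})
  converter = CONVERTERS[type]
  return content unless converter

  (converter.arity == 2) ? converter.call(content, entity) : converter.call(content)
end

apply("integer", "42")                                  # => 42
apply("boolean", " false ")                             # => false
apply("binary", "aGVsbG8=", {"encoding" => "base64"})   # => "hello"
apply("binary", "aGVsbG8=")                             # => "aGVsbG8="
```

Checking `arity == 2` works here because one-parameter lambdas report arity 1, while `Symbol#to_proc` and `Date.method(:parse)` report negative arities; only the explicitly two-parameter lambdas are passed the entity.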
Class Method Summary
- .parse(xml, options = {}) ⇒ Hash
  Parse XML into a Ruby Hash.
- .parser ⇒ Module
  Get the current XML parser module.
- .parser=(new_parser) ⇒ Module
  Set the XML parser to use.
Methods included from Helpers
apply_converter, convert_hash, convert_text_content, disallowed_type?, empty_value?, extract_array_entries, find_array_entries, symbolize_keys, transform_keys, typecast_array, typecast_children, typecast_hash, typecast_xml_value, undasherize_keys, unwrap_file_if_present, unwrap_if_simple, wrap_and_typecast
Class Method Details
.parse(xml, options = {}) ⇒ Hash
Parse XML into a Ruby Hash
# File 'lib/multi_xml.rb', line 74
def parse(xml, options = {})
  options = DEFAULT_OPTIONS.merge(options)
  xml_parser = options[:parser] ? resolve_parser(options.fetch(:parser)) : parser

  io = normalize_input(xml)
  return {} if io.eof?

  result = parse_with_error_handling(io, xml, xml_parser)
  result = typecast_xml_value(result, options.fetch(:disallowed_types)) if options.fetch(:typecast_xml_value)
  result = symbolize_keys(result) if options.fetch(:symbolize_keys)
  result
end
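The way .parse combines caller options with the defaults can be sketched in isolation. This is an illustrative sketch only: `DEFAULTS` is a local copy of the DEFAULT_OPTIONS value shown in the constant summary, and `effective_options` is a hypothetical helper that mirrors the first line of the method body.

```ruby
# Local copy of the documented defaults, for illustration only.
DEFAULTS = {
  typecast_xml_value: true,
  disallowed_types: %w[symbol yaml].freeze,
  symbolize_keys: false
}.freeze

# Hypothetical helper: caller-supplied keys win, missing keys fall back to defaults.
def effective_options(options = {})
  DEFAULTS.merge(options)
end

opts = effective_options(symbolize_keys: true)

opts.fetch(:symbolize_keys)      # => true  (caller override)
opts.fetch(:typecast_xml_value)  # => true  (default kept)
opts.fetch(:disallowed_types)    # => ["symbol", "yaml"]
opts[:parser]                    # => nil   (no per-call parser given)
```

Because `:parser` has no default, `options[:parser]` is nil unless the caller passes one, which is why the method falls back to the module-level parser in that case.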
.parser ⇒ Module
Get the current XML parser module
Returns the currently configured parser, auto-detecting one if none has been set. Detection prefers faster parsers; PARSER_PREFERENCE lists them as Ox, LibXML, Nokogiri, REXML, Oga.
# File 'lib/multi_xml.rb', line 37
def parser
  @parser ||= resolve_parser(detect_parser)
end
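The lazy auto-detection described for .parser can be sketched roughly as follows. Everything here is an assumption made for illustration: `ParserRegistry`, `PREFERENCE`, and `detect` are hypothetical names, the library entries are standard-library stand-ins so the sketch runs without any XML gems installed, and returning nil when nothing loads is a simplification (the body of the library's own detect_parser is not shown on this page).

```ruby
# Hypothetical stand-in for preference-ordered parser detection.
module ParserRegistry
  # [library to require, name to report]. The first entry is deliberately
  # missing so the fallback path is exercised.
  PREFERENCE = [
    ["no_such_xml_library", :missing],
    ["json", :json],
    ["set", :set]
  ].freeze

  class << self
    # Explicit override, analogous to assigning a parser by hand.
    attr_writer :parser

    # Memoized: detection runs only on first access.
    def parser
      @parser ||= detect
    end

    private

    # Return the name paired with the first library that can be required.
    def detect
      PREFERENCE.each do |library, name|
        begin
          require library
          return name
        rescue LoadError
          next # library not installed, try the next one
        end
      end
      nil
    end
  end
end

ParserRegistry.parser         # => :json
ParserRegistry.parser = :set
ParserRegistry.parser         # => :set
```

The `||=` memoization means an explicitly assigned parser is never overwritten by detection, and detection itself is paid for at most once.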
.parser=(new_parser) ⇒ Module
Set the XML parser to use
# File 'lib/multi_xml.rb', line 52
def parser=(new_parser)
  @parser = resolve_parser(new_parser)
end