Class: RDF::RDFXML::Reader

Inherits:
RDF::Reader
  • Object
Includes:
Util::Logger
Defined in:
lib/rdf/rdfxml/reader.rb,
lib/rdf/rdfxml/reader/rexml.rb,
lib/rdf/rdfxml/reader/nokogiri.rb

Overview

An RDF/XML parser in Ruby

Based on RDF/XML Syntax Specification: http://www.w3.org/TR/REC-rdf-syntax/

Extension: A nodeElement can also use the rdf:resource attribute, if none of the other standard attributes are defined.

Author:

Defined Under Namespace

Modules: Nokogiri, REXML

Constant Summary

CORE_SYNTAX_TERMS =
%w(RDF ID about parseType resource nodeID datatype).map {|n| "http://www.w3.org/1999/02/22-rdf-syntax-ns##{n}"}
OLD_TERMS =
%w(aboutEach aboutEachPrefix bagID).map {|n| "http://www.w3.org/1999/02/22-rdf-syntax-ns##{n}"}
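The doubled `#` in these definitions is easy to misread. The first `#` is a literal fragment separator and `#{n}` is ordinary string interpolation. A minimal sketch of the same expansion follows; the helper name `expand_rdf_terms` is hypothetical and not part of the library.

```ruby
# Hypothetical helper (not part of the library): expands local names into
# full IRIs in the RDF syntax namespace, the same way the constants do.
def expand_rdf_terms(names)
  # The first "#" is a literal character; "#{n}" interpolates the local name.
  names.map { |n| "http://www.w3.org/1999/02/22-rdf-syntax-ns##{n}" }
end

core = expand_rdf_terms(%w(RDF ID about parseType resource nodeID datatype))
old  = expand_rdf_terms(%w(aboutEach aboutEachPrefix bagID))

puts core.first
puts old.last
```

Each resulting string is the namespace IRI, a single `#`, and the local name.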

Instance Attribute Summary

Instance Method Summary

Constructor Details

#initialize(input = $stdin, **options) {|reader| ... } ⇒ reader

Initializes the RDF/XML reader instance.

Parameters:

  • input (Nokogiri::XML::Document, IO, File, String) (defaults to: $stdin)

    the input stream to read

  • options (Hash{Symbol => Object})

    any additional options

Options Hash (**options):

  • :library (Symbol)

    One of :nokogiri or :rexml. If nil/unspecified uses :nokogiri if available, :rexml otherwise.

  • :encoding (Encoding) — default: Encoding::UTF_8

    the encoding of the input stream (Ruby 1.9+)

  • :validate (Boolean) — default: false

    whether to validate the parsed statements and values

  • :canonicalize (Boolean) — default: false

    whether to canonicalize parsed literals

  • :intern (Boolean) — default: true

    whether to intern all parsed URIs

  • :prefixes (Hash) — default: Hash.new

    the prefix mappings to use (not supported by all readers)

  • :base_uri (#to_s) — default: nil

    the base URI to use when resolving relative URIs

Yields:

  • (reader)

    self

Yield Parameters:

  • reader (RDF::Reader)

Yield Returns:

  • (void)

    ignored

Raises:

  • (Error)

    Raises RDF::ReaderError if the :validate option is set and errors are encountered


# File 'lib/rdf/rdfxml/reader.rb', line 141

def initialize(input = $stdin, **options, &block)
  super do
    @library = case options[:library]
    when nil
      # Use Nokogiri when available, and REXML otherwise:
      defined?(::Nokogiri) ? :nokogiri : :rexml
    when :nokogiri, :rexml
      options[:library]
    else
      log_fatal("expected :rexml or :nokogiri, but got #{options[:library].inspect}", exception: ArgumentError)
    end

    require "rdf/rdfxml/reader/#{@library}"
    @implementation = case @library
      when :nokogiri then Nokogiri
      when :rexml    then REXML
    end
    self.extend(@implementation)

    input.rewind if input.respond_to?(:rewind)
    initialize_xml(input, **options) rescue log_fatal($!.message)

    if root.nil?
      log_info("Empty document")
    elsif !doc_errors.empty?
      log_error("Syntax errors") {doc_errors}
    end

    block.call(self) if block_given?
  end
end
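The constructor above picks an XML backend from the :library option, falls back to a default when the option is nil, and then mixes the chosen module into the one instance with `extend`. The following is a simplified, self-contained sketch of that pattern only. `BackendSketch`, `FastBackend`, `PortableBackend`, and the `fast_available:` flag are hypothetical stand-ins (the real reader checks `defined?(::Nokogiri)` and reports bad values through `log_fatal`).

```ruby
# Hypothetical stand-ins for the real backend modules.
module FastBackend
  def backend_name
    "fast"
  end
end

module PortableBackend
  def backend_name
    "portable"
  end
end

class BackendSketch
  BACKENDS = { fast: FastBackend, portable: PortableBackend }.freeze

  attr_reader :implementation

  # fast_available: is an assumption standing in for a defined?(...) check.
  def initialize(library: nil, fast_available: true)
    chosen = case library
             when nil
               # Prefer the fast backend when available, else fall back.
               fast_available ? :fast : :portable
             when :fast, :portable
               library
             else
               raise ArgumentError,
                     "expected :fast or :portable, but got #{library.inspect}"
             end

    @implementation = BACKENDS.fetch(chosen)
    # Mix the backend's methods into this instance only, not the class.
    extend(@implementation)
  end
end

puts BackendSketch.new.backend_name
puts BackendSketch.new(library: :portable).backend_name
```

Extending the instance rather than including the module in the class lets two readers in the same process use different backends.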

Instance Attribute Details

#implementation ⇒ Module (readonly)

Returns the XML implementation module for this reader instance.

Returns:

  • (Module)

# File 'lib/rdf/rdfxml/reader.rb', line 113

def implementation
  @implementation
end

Instance Method Details

#close ⇒ Object

No-op. The document is read and closed in initialize, so there is nothing left to close.


# File 'lib/rdf/rdfxml/reader.rb', line 177

def close; end

#each_statement {|statement| ... }

This method returns an undefined value.

Iterates the given block for each RDF statement in the input.

Yields:

  • (statement)

Yield Parameters:

  • statement (RDF::Statement)

# File 'lib/rdf/rdfxml/reader.rb', line 185

def each_statement(&block)
  if block_given?
    # Block called from add_statement
    @callback = block
    return unless root

    log_fatal "root must be a proxy not a #{root.class}" unless root.is_a?(@implementation::NodeProxy)

    add_debug(root, "base_uri: #{base_uri.inspect}")
  
    rdf_nodes = root.xpath("//rdf:RDF", "rdf" => RDF.to_uri.to_s)
    if rdf_nodes.size == 0
      # If none found, root element may be processed as an RDF Node

      ec = EvaluationContext.new(base_uri, root, @graph) do |prefix, value|
        prefix(prefix, value)
      end

      nodeElement(root, ec)
    else
      rdf_nodes.each do |node|
        log_fatal "node must be a proxy not a #{node.class}" unless node.is_a?(@implementation::NodeProxy)
        # XXX Skip this element if it's contained within another rdf:RDF element

        # Extract base, lang and namespaces from parents to create proper evaluation context
        ec = EvaluationContext.new(base_uri, nil, @graph)
        ec.extract_from_ancestors(node) do |prefix, value|
          prefix(prefix, value)
        end
        node.children.each {|el|
          next unless el.element?
          log_fatal "el must be a proxy not a #{el.class}" unless el.is_a?(@implementation::NodeProxy)
          new_ec = ec.clone(el) do |prefix, value|
            prefix(prefix, value)
          end
          nodeElement(el, new_ec)
        }
      end
    end

    if validate? && log_statistics[:error]
      raise RDF::ReaderError, "Errors found during processing"
    end
  end
  enum_for(:each_statement)
end
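The method above follows a common Ruby iteration pattern: do the work when a block is given, and return an enumerator built with `enum_for` so callers can also chain without a block. A minimal sketch of just that pattern is shown here. `StatementSource` is a hypothetical class that iterates an in-memory array and does no XML parsing.

```ruby
# Hypothetical class illustrating the block-or-enumerator pattern only.
class StatementSource
  def initialize(statements)
    @statements = statements
  end

  def each_statement(&block)
    if block_given?
      # With a block: hand each statement to the caller.
      @statements.each(&block)
    end
    # Without a block this enumerator is what the caller iterates;
    # it calls each_statement again, supplying a block of its own.
    enum_for(:each_statement)
  end
end

source = StatementSource.new([%w(s1 p1 o1), %w(s2 p2 o2)])

# Block form
source.each_statement { |statement| puts statement.inspect }

# Enumerator form
subjects = source.each_statement.map(&:first)
puts subjects.inspect
```

In this sketch, as in the listing above, the enumerator is returned even when a block was given.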

#each_triple {|subject, predicate, object| ... }

This method returns an undefined value.

Iterates the given block for each RDF triple in the input.

Yields:

  • (subject, predicate, object)

Yield Parameters:

  • subject (RDF::Resource)
  • predicate (RDF::URI)
  • object (RDF::Value)

# File 'lib/rdf/rdfxml/reader.rb', line 240

def each_triple(&block)
  if block_given?
    each_statement do |statement|
      block.call(*statement.to_triple)
    end
  end
  enum_for(:each_triple)
end
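`each_triple` is a thin wrapper: it iterates statements and splats each statement's `to_triple` array into three block arguments. A small sketch of that delegation follows, using hypothetical `SketchStatement` and `TripleSource` types in place of `RDF::Statement` and the reader.

```ruby
# Hypothetical stand-in for RDF::Statement.
SketchStatement = Struct.new(:subject, :predicate, :object) do
  def to_triple
    [subject, predicate, object]
  end
end

# Hypothetical stand-in for the reader.
class TripleSource
  def initialize(statements)
    @statements = statements
  end

  def each_statement(&block)
    @statements.each(&block) if block_given?
    enum_for(:each_statement)
  end

  def each_triple(&block)
    if block_given?
      each_statement do |statement|
        # Splat the three-element array into separate block arguments.
        block.call(*statement.to_triple)
      end
    end
    enum_for(:each_triple)
  end
end

source = TripleSource.new([SketchStatement.new("ex:a", "ex:knows", "ex:b")])
source.each_triple do |subject, predicate, object|
  puts "#{subject} #{predicate} #{object}"
end
```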

#rewind ⇒ Object

No need to rewind, as parsing is done in initialize


# File 'lib/rdf/rdfxml/reader.rb', line 174

def rewind; end