Class: TaliaCore::ActiveSourceParts::Rdf::RdfReader

Inherits:
Object
Extended by:
TaliaUtil::IoHelper
Includes:
TaliaUtil::IoHelper, TaliaUtil::Progressable
Defined in:
lib/talia_core/active_source_parts/rdf/rdf_reader.rb

Overview

Import class for RDF N-Triples files, built on RDF.rb (rdf.rubyforge.org/). See GenericReader for more information on Talia import classes in general.

Direct Known Subclasses

NtriplesReader, RdfxmlReader

Class Method Summary

Instance Method Summary

Methods included from TaliaUtil::IoHelper

base_for, file_url, open_from_url, open_generic

Methods included from TaliaUtil::Progressable

#progressor, #progressor=, #run_with_progress

Constructor Details

#initialize(source) ⇒ RdfReader

On initialization:

  • The Talia type of each source is determined with the Class.subclasses_of method. Because Rails autoloads class files lazily, every possible source type class must already be loaded when that method is called. TaliaUtil::Util::load_all_models takes care of this.

  • Works only with format=:ntriples for now.



# File 'lib/talia_core/active_source_parts/rdf/rdf_reader.rb', line 35

def initialize(source)
  TaliaUtil::Util.load_all_models
  source = StringIO.new(source) if(source.is_a? String)
  @reader = RDF::Reader.for(format).new(source)
end
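
The String check in the constructor means callers can pass either raw RDF text or an already opened IO object. A minimal sketch of that normalization in plain Ruby follows; the helper name `to_io` and the sample triple are illustrative only and are not part of Talia.

```ruby
require 'stringio'

# Hypothetical helper mirroring the String check in #initialize:
# raw text is wrapped in a StringIO, anything else is passed through.
def to_io(source)
  source.is_a?(String) ? StringIO.new(source) : source
end

ntriples = %(<http://example.org/a> <http://example.org/p> "hello" .\n)
io = to_io(ntriples)

puts io.class             # the String has been wrapped
puts io.read == ntriples  # the wrapped content is unchanged
```

Downstream code then only has to deal with IO-like objects, whichever form the caller supplied.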

Class Method Details

.sources_from(source, progressor = nil, base_url = nil) ⇒ Object

See TaliaCore::ActiveSourceParts::Xml::GenericReader#sources_from



# File 'lib/talia_core/active_source_parts/rdf/rdf_reader.rb', line 22

def sources_from(source, progressor=nil, base_url=nil)
  reader = self.new(source)
  reader.progressor = progressor
  reader.sources
end

.sources_from_url(url, options = nil, progressor = nil) ⇒ Object

See TaliaCore::ActiveSourceParts::Xml::GenericReader#sources_from_url



# File 'lib/talia_core/active_source_parts/rdf/rdf_reader.rb', line 17

def sources_from_url(url, options=nil, progressor=nil)
  open_generic(url, options) {|io| sources_from(io, progressor, url)}
end

Instance Method Details

#format ⇒ Object

Raises:

  • (NotImplementedError)


# File 'lib/talia_core/active_source_parts/rdf/rdf_reader.rb', line 92

def format
  raise NotImplementedError
end
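
Since #format raises NotImplementedError and the constructor calls it, the base class is presumably meant to be used only through subclasses that override it. The pattern can be sketched with stand-in classes; `SketchReader` and `SketchNtriplesReader` are hypothetical names, and the real subclasses may look different.

```ruby
# Stand-in base class: the constructor asks the subclass for its format.
class SketchReader
  attr_reader :format_used

  def initialize
    @format_used = format
  end

  # Subclasses are expected to override this.
  def format
    raise NotImplementedError
  end
end

# Hypothetical subclass, loosely analogous to NtriplesReader.
class SketchNtriplesReader < SketchReader
  def format
    :ntriples
  end
end

puts SketchNtriplesReader.new.format_used
```

Instantiating the stand-in base class directly raises NotImplementedError, because its constructor calls the unimplemented method.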

#rdf_to_talia_type(rdf_type) ⇒ Object

Tries to guess the Talia type of a source from its RDF type.



# File 'lib/talia_core/active_source_parts/rdf/rdf_reader.rb', line 86

def rdf_to_talia_type(rdf_type)
  Class.subclasses_of(TaliaCore::ActiveSource).detect do |c|
    c.additional_rdf_types.include? rdf_type
  end.try :name
end
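
The lookup above returns the name of the first source class that lists the given RDF type, or nil if none does. A minimal sketch of the same idea without Rails: the classes and the example.org URI are made up, and the candidate classes are passed in explicitly instead of being discovered through Class.subclasses_of.

```ruby
# Stand-in source classes; each advertises the RDF types it can represent.
class SketchSource
  def self.additional_rdf_types
    []
  end
end

class SketchBook < SketchSource
  def self.additional_rdf_types
    ['http://example.org/Book']
  end
end

# Returns the name of the first candidate class that lists rdf_type, or nil.
def rdf_to_type_name(rdf_type, candidates)
  match = candidates.detect { |c| c.additional_rdf_types.include?(rdf_type) }
  match && match.name
end

puts rdf_to_type_name('http://example.org/Book', [SketchSource, SketchBook])
```

An unknown RDF type yields nil, which is why the caller can safely combine the result with `||=`.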

#sources ⇒ Object

See TaliaCore::ActiveSourceParts::Xml::GenericReader#sources



# File 'lib/talia_core/active_source_parts/rdf/rdf_reader.rb', line 42

def sources
  return @sources if(@sources)
  @sources = {}
  run_with_progress('RdfRead', 0) do |progress|
    @reader.each_statement do |statement|
      source = (@sources[statement.subject.to_s] ||= {})
      source['uri'] ||= statement.subject.to_s
      update_source_type(source, statement)
      source[statement.predicate.to_s] ||= []
      object = if(statement.object.literal?)
                 parsed_string = PropertyString.parse(statement.object.value)
                 parsed_string.lang = statement.object.language.to_s if(statement.object.language)
                 parsed_string
               else
                 "<#{statement.object.to_s}>"
               end
      source[statement.predicate.to_s] << object
      progress.inc
    end
  end
  # Set all empty source types to ActiveSource, to prevent DummySource objects from
  # being created. (Reason: when we import RDF, we assume that all sources are "valid" and should
  # never be marked as DummySource.)
  @sources = @sources.values
  @sources.each { |s| s['type'] ||= 'TaliaCore::ActiveSource' }
  @sources
end
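
The core of #sources is a grouping step: statements are collected per subject URI, literal objects are kept as values, and resource objects are wrapped in angle brackets. The sketch below reproduces only that step under simplifying assumptions: `Triple` is a stand-in Struct rather than an RDF.rb statement, and language tags, PropertyString parsing, type detection, and progress reporting are left out.

```ruby
# Stand-in for an RDF statement; `literal` replaces the object-term API.
Triple = Struct.new(:subject, :predicate, :object, :literal)

# Groups triples into one attribute hash per subject.
def group_by_subject(triples)
  sources = {}
  triples.each do |t|
    source = (sources[t.subject] ||= { 'uri' => t.subject })
    # Literals are stored as-is, resources as "<uri>".
    value = t.literal ? t.object : "<#{t.object}>"
    (source[t.predicate] ||= []) << value
  end
  # Sources without an explicit type fall back to the default class name.
  sources.values.each { |s| s['type'] ||= 'TaliaCore::ActiveSource' }
end

sample = [
  Triple.new('http://example.org/a', 'http://example.org/title', 'Hello', true),
  Triple.new('http://example.org/a', 'http://example.org/rel', 'http://example.org/b', false)
]
p group_by_subject(sample)
```

Both sample triples share a subject, so the result contains a single hash holding the URI, one array per predicate, and the fallback type.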

#update_source_type(source, statement) ⇒ Object

Updates the Talia source type. The type can be given explicitly as the object of an N::TALIA.type predicate, or it can be inferred from the RDF type if an N::RDF.type predicate is present.

An N::TALIA.type statement always overwrites the type, while an N::RDF.type statement is used only if no type has been set on the source yet. Statements whose object is a literal are ignored.



# File 'lib/talia_core/active_source_parts/rdf/rdf_reader.rb', line 75

def update_source_type(source, statement)
  return if(statement.object.literal?)
  case(statement.predicate.to_s)
  when N::TALIA.type.to_s
    source['type'] = statement.object.to_s
  when N::RDF.type.to_s
    source['type'] ||= rdf_to_talia_type statement.object.to_s
  end
end
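
The precedence rule (an explicit Talia type always wins, an RDF type only fills a gap) can be illustrated in isolation. In this sketch `TALIA_TYPE` is a placeholder URI, not the real value of N::TALIA.type, and the `inferred` argument stands in for the result of #rdf_to_talia_type.

```ruby
TALIA_TYPE = 'http://example.org/talia#type' # placeholder, not the real N::TALIA.type
RDF_TYPE   = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type'

# Applies one type-bearing statement to the source hash.
def apply_type(source, predicate, object, inferred)
  case predicate
  when TALIA_TYPE then source['type'] = object      # always overwrites
  when RDF_TYPE   then source['type'] ||= inferred  # only if still unset
  end
  source
end

source = {}
apply_type(source, RDF_TYPE, 'http://example.org/Book', 'Book')
apply_type(source, TALIA_TYPE, 'Manuscript', nil)
apply_type(source, RDF_TYPE, 'http://example.org/Book', 'Book')
puts source['type']
```

The final RDF type statement has no effect because a type is already present, so the order of statements in the input does not let an inferred type displace an explicit one.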