Class: SXP::Reader::SPARQL

Inherits:
Extended
Defined in:
lib/sxp/reader/sparql.rb

Overview

A SPARQL Syntax Expressions (SSE) parser.

Requires [RDF.rb](https://rubygems.org/gems/rdf/).

See Also:

  • https://openjena.org/wiki/SSE

Constant Summary

A =

Alias for rdf:type

/^a$/
BASE =

Base token; causes the next URI to be treated as the `base_uri` for further URI expansion.

/^base$/i
PREFIX =

Prefix token; causes the following prefix and URI pairs to be used for transforming PNAME tokens into URIs.

/^prefix$/i
NIL =
/^nil$/i
FALSE =
/^false$/i
TRUE =
/^true$/i
EXPONENT =
/[eE][+-]?[0-9]+/
DECIMAL =
/^[+-]?(\d*)?\.\d*$/
DOUBLE =
/^[+-]?(\d*)?\.\d*#{EXPONENT}$/
BNODE_ID =

BNode with identifier

/^_:([^\s]*)/
BNODE_NEW =

Anonymous BNode

/^_:$/
VAR_ID =

Distinguished variable

/^\?(.*)/
ND_VAR =

Non-distinguished variable

/^\?(?:[\?\.])(.*)/
EVAR_ID =

Distinguished existential variable

/^\$(.*)/
ND_EVAR =

Non-distinguished existential variable

/^\$(?:[\$\.])(.*)/
PNAME =

A QName, subject to expansion to URIs using PREFIX

/([^:]*):(.*)/
RDF_TYPE =
(a = RDF.type.dup; a.lexical = 'a'; a).freeze
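
The numeric patterns above overlap, so the order in which they are tested matters. The following is a minimal sketch, not the library's code: it copies `EXPONENT`, `DECIMAL`, and `DOUBLE` from the summary, assumes a simple stand-in for the inherited `Basic::INTEGER`, and uses a hypothetical helper `kind` to show which pattern wins for a given token.

```ruby
# Local copies of the numeric patterns listed in the constant summary.
module NumericPatterns
  EXPONENT = /[eE][+-]?[0-9]+/
  DECIMAL  = /^[+-]?(\d*)?\.\d*$/
  DOUBLE   = /^[+-]?(\d*)?\.\d*#{EXPONENT}$/
  # Assumption: a stand-in for the inherited Basic::INTEGER pattern.
  INTEGER  = /^[+-]?\d+$/

  # Hypothetical helper: tests DOUBLE, then DECIMAL, then INTEGER,
  # which is the same order used by #read_atom.
  def self.kind(text)
    case text
    when DOUBLE  then :double
    when DECIMAL then :decimal
    when INTEGER then :integer
    else :other
    end
  end
end

NumericPatterns.kind("1.5e3")  # :double
NumericPatterns.kind("-0.25")  # :decimal
NumericPatterns.kind("42")     # :integer
NumericPatterns.kind("1e3")    # :other, because DOUBLE requires a decimal point
```

Note that a lone `.` also satisfies `DECIMAL`, which is presumably why `#read_atom` checks for `'.'` before any numeric pattern.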

Constants inherited from Extended

Extended::ATOM, Extended::LPARENS, Extended::RPARENS

Constants inherited from Basic

Basic::ATOM, Basic::INTEGER, Basic::LPARENS, Basic::RATIONAL, Basic::RPARENS

Instance Attribute Summary

Attributes inherited from SXP::Reader

#input, #options

Instance Method Summary

Methods inherited from Basic

#read_character, #read_literal, #read_string

Methods inherited from SXP::Reader

#each, read, #read, read_all, #read_all, #read_character, read_file, #read_files, #read_integer, #read_list, #read_literal, #read_sharp, #read_string, read_url

Constructor Details

#initialize(input, **options, &block) ⇒ SPARQL

Initializes the reader.

Parameters:



# File 'lib/sxp/reader/sparql.rb', line 74

def initialize(input, **options, &block)
  super { @prefixes = {}; @bnodes = {}; @list_depth = 0 }

  if block_given?
    case block.arity
      when 1 then block.call(self)
      else self.instance_eval(&block)
    end
  end
end
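
The constructor dispatches on the block's arity: a one-argument block receives the reader explicitly, while any other block is evaluated in the reader's own context. The sketch below reproduces only that dispatch in a hypothetical class (`BlockDispatchSketch` and its `note` method are illustrative names, not part of SXP).

```ruby
# Hypothetical class that mirrors the arity-based block handling.
class BlockDispatchSketch
  attr_reader :log

  def initialize(&block)
    @log = []
    return unless block_given?

    case block.arity
    when 1 then block.call(self)     # block takes the object as a parameter
    else instance_eval(&block)       # block runs with self set to the object
    end
  end

  # Illustrative method so the blocks have something to call.
  def note(message)
    @log << message
    self
  end
end

explicit = BlockDispatchSketch.new { |reader| reader.note("explicit receiver") }
implicit = BlockDispatchSketch.new { note("implicit receiver") }
```

Both forms end up calling `note` on the new object; the difference is only whether the caller names the receiver.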

Instance Attribute Details

#base_uri ⇒ Object

Base URI, either as specified or as set when parsing a BASE token from the immediately following token, which must be a URI.



# File 'lib/sxp/reader/sparql.rb', line 45

def base_uri
  @base_uri
end

#prefixes ⇒ Hash{Object => RDF::URI}

Prefixes defined while parsing

Returns:

  • (Hash{Object => RDF::URI})



# File 'lib/sxp/reader/sparql.rb', line 50

def prefixes
  @prefixes
end

Instance Method Details

#prefix(name, uri = nil) ⇒ RDF::URI

Defines the given named URI prefix for this parser.

Examples:

Defining a URI prefix

parser.prefix :dc, RDF::URI('http://purl.org/dc/terms/')

Returning a URI prefix

parser.prefix(:dc)    #=> RDF::URI('http://purl.org/dc/terms/')

Parameters:

  • name (Symbol, #to_s)
  • uri (RDF::URI, #to_s) (defaults to: nil)

Returns:

  • (RDF::URI)


# File 'lib/sxp/reader/sparql.rb', line 64

def prefix(name, uri = nil)
  name = name.to_s.empty? ? nil : (name.respond_to?(:to_sym) ? name.to_sym : name.to_s.to_sym)
  uri.nil? ? @prefixes[name] : @prefixes[name] = uri
end
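
The method doubles as getter and setter, and it normalizes the name so that an empty name maps to the `nil` key. A minimal sketch of that behavior, using a hypothetical `PrefixTable` class and plain strings in place of `RDF::URI`:

```ruby
# Hypothetical stand-in for the parser's prefix table.
class PrefixTable
  def initialize
    @prefixes = {}
  end

  # With a uri: store it. Without: look it up.
  def prefix(name, uri = nil)
    # An empty (or nil) name is stored under the nil key; others become symbols.
    name = name.to_s.empty? ? nil : name.to_s.to_sym
    uri.nil? ? @prefixes[name] : @prefixes[name] = uri
  end
end

table = PrefixTable.new
table.prefix(:dc, "http://purl.org/dc/terms/")  # define
table.prefix("dc")                              # look up by string or symbol
table.prefix("", "http://example/")             # empty prefix, stored under nil
```

Because strings are converted to symbols, `prefix("dc")` and `prefix(:dc)` refer to the same entry.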

#read_atom ⇒ Object

Reads an SSE Atom

Parsed atoms include `base`, `prefix`, `true`, `false`, numeric literals, BNodes, and variables.

Creates `RDF::Literal`, `RDF::Node`, or `RDF::Query::Variable` instances where appropriate.

Returns:

  • (Object)



# File 'lib/sxp/reader/sparql.rb', line 215

def read_atom
  case buffer = read_literal
    when '.'       then buffer.to_sym
    when A         then RDF_TYPE
    when BASE      then @parsed_base = true; buffer.to_sym
    when NIL       then nil
    when FALSE     then RDF::Literal(false)
    when TRUE      then RDF::Literal(true)
    when DOUBLE    then RDF::Literal::Double.new(buffer)
    when DECIMAL   then RDF::Literal::Decimal.new(buffer)
    when INTEGER   then RDF::Literal::Integer.new(buffer)
    when BNODE_ID  then @bnodes[$1] ||= RDF::Node($1)
    when BNODE_NEW then RDF::Node.new
    when ND_VAR    then variable($1, distinguished: false)
    when VAR_ID    then variable($1, distinguished: true)
    when ND_EVAR   then variable($1, existential: true, distinguished: false)
    when EVAR_ID   then variable($1, existential: true, distinguished: true)
    else buffer.to_sym
  end
end
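
In the `case` above, the non-distinguished patterns are tested before the distinguished ones because `VAR_ID` and `EVAR_ID` would also match `??x` and `$$x`. The sketch below copies the four variable patterns and returns plain hashes instead of `RDF::Query::Variable` objects; the module and method names are hypothetical.

```ruby
# Local copies of the variable patterns; results are plain hashes (assumption).
module VariableAtoms
  ND_VAR  = /^\?(?:[\?\.])(.*)/
  VAR_ID  = /^\?(.*)/
  ND_EVAR = /^\$(?:[\$\.])(.*)/
  EVAR_ID = /^\$(.*)/

  # Hypothetical helper: the more specific patterns must come first.
  def self.describe(buffer)
    case buffer
    when ND_VAR  then { name: $1, distinguished: false, existential: false }
    when VAR_ID  then { name: $1, distinguished: true,  existential: false }
    when ND_EVAR then { name: $1, distinguished: false, existential: true }
    when EVAR_ID then { name: $1, distinguished: true,  existential: true }
    else buffer.to_sym
    end
  end
end

VariableAtoms.describe("?x")   # distinguished variable named "x"
VariableAtoms.describe("??x")  # non-distinguished variable named "x"
VariableAtoms.describe("$$y")  # non-distinguished existential variable named "y"
VariableAtoms.describe("foo")  # falls through to a Symbol
```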

#read_rdf_literal ⇒ RDF::Literal

Reads literals corresponding to SPARQL/Turtle/Notation-3 syntax

Examples:

"a plain literal"
'another plain literal'
"a literal with a language"@en
"a typed literal"^^<http://example/>
"a typed literal with a PNAME"^^xsd:string

Returns:

  • (RDF::Literal)


# File 'lib/sxp/reader/sparql.rb', line 154

def read_rdf_literal
  value   = read_string
  options = case peek_char
    when ?@
      skip_char # '@'
      {language: read_atom.downcase}
    when ?^
      2.times { skip_char } # '^^'
      {datatype: read_token.last}
    else {}
  end
  RDF::Literal(value, **options)
end
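
The method reads the quoted string and then looks one character ahead to decide whether a language tag (`@`) or a datatype (`^^`) follows. The sketch below illustrates the same decision on a whole string with Ruby's standard `StringScanner`; it is not the reader's implementation, it ignores escape sequences, and `parse_literal` is a hypothetical name that returns a plain hash rather than an `RDF::Literal`.

```ruby
require 'strscan'

# Hypothetical helper: splits a literal into value plus optional qualifier.
def parse_literal(source)
  scanner = StringScanner.new(source)
  quote = scanner.getch
  raise ArgumentError, "expected a quote" unless quote == '"' || quote == "'"

  # Read up to and including the closing quote (no escape handling here).
  body = scanner.scan_until(/#{Regexp.escape(quote)}/)
  raise ArgumentError, "unterminated literal" if body.nil?
  value = body.chomp(quote)

  if scanner.scan(/@/)
    # Language tags are downcased, as in the method above.
    { value: value, language: scanner.scan(/[A-Za-z0-9-]+/)&.downcase }
  elsif scanner.scan(/\^\^/)
    # The datatype token is kept as raw text in this sketch.
    { value: value, datatype: scanner.rest }
  else
    { value: value }
  end
end

parse_literal(%q("a literal with a language"@EN))
parse_literal(%q("a typed literal"^^<http://example/>))
parse_literal(%q('another plain literal'))
```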

#read_rdf_uri ⇒ RDF::URI

Reads a URI in SPARQL/Turtle/Notation-3 syntax

Examples:

<http://example/>

Returns:

  • (RDF::URI)


# File 'lib/sxp/reader/sparql.rb', line 175

def read_rdf_uri
  buffer = ""
  skip_char # '<'
  return :< if (char = peek_char).nil? || char.chr !~ ATOM # FIXME: nasty special case for the '< symbol
  return :<= if peek_char.chr.eql?(?=.chr) && read_char    # FIXME: nasty special case for the '<= symbol
  until peek_char == ?>
    buffer << read_char # TODO: unescaping
  end
  skip_char # '>'

  # If we have a base URI, use that when constructing a new URI
  uri = if self.base_uri && RDF::URI(buffer).relative?
    self.base_uri.join(buffer)
  else
    RDF::URI(buffer)
  end
  
  # If we previously parsed a "BASE" element, then this URI is used to set that value
  if @parsed_base
    self.base_uri = uri
    @parsed_base = nil
  end
  
  # If we previously parsed a "PREFIX" element, associate this URI with the prefix
  if @parsed_prefix
    prefix(@parsed_prefix, uri)
    @parsed_prefix = nil
  end
  
  uri
end
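
Two things happen after the URI text is read: a relative URI is resolved against the current base, and a pending BASE token consumes the URI as the new base. The sketch below models just that state handling. It substitutes Ruby's standard `URI` library for `RDF::URI` (an assumption made so the example has no gem dependencies), and `UriSketch`, `saw_base!`, and `read_uri` are hypothetical names.

```ruby
require 'uri'

# Hypothetical model of the base-URI handling, using stdlib URI.
class UriSketch
  attr_accessor :base_uri

  def initialize
    @parsed_base = false
  end

  # Called when a BASE token has just been read.
  def saw_base!
    @parsed_base = true
  end

  def read_uri(buffer)
    uri = URI(buffer)
    # Resolve relative references against the current base, if any.
    uri = base_uri.merge(buffer) if base_uri && uri.relative?

    # A pending BASE token claims this URI, then the flag is cleared.
    if @parsed_base
      self.base_uri = uri
      @parsed_base = false
    end
    uri
  end
end

state = UriSketch.new
state.saw_base!
state.read_uri("http://example.org/base/")  # becomes the base URI
resolved = state.read_uri("doc")            # resolved against the base
```

The flag is one-shot: only the URI immediately following BASE changes the base, and later URIs are merely resolved against it.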

#read_token ⇒ Object

Reads SSE tokens, including `RDF::Literal`, `RDF::URI`, and `RDF::Node`.

Resolves forward references to prefix and base URI declarations and stores them in the #base_uri and #prefixes accessors.

Transforms tokens matching the PNAME pattern into `RDF::URI` instances when the prefix matches a previously declared PREFIX.

Returns:

  • (Object)



# File 'lib/sxp/reader/sparql.rb', line 94

def read_token
  case peek_char
  when ?" then [:atom, read_rdf_literal] # "
  when ?' then [:atom, read_rdf_literal] # '
  when ?< then [:atom, read_rdf_uri]
  else
    tok = super
    
    # If we just parsed "PREFIX", and this is an opening list, then
    # record list depth and process following as token, URI pairs
    #
    # Once we've closed the list, go out of prefix mode
    if tok.is_a?(Array) && tok[0] == :list
      if '(['.include?(tok[1])
        @list_depth += 1
      else
        @list_depth -= 1
        @prefix_depth = nil if @prefix_depth && @list_depth < @prefix_depth
      end
    end

    if tok.is_a?(Array) && tok[0] == :atom && tok[1].is_a?(Symbol)
      value = tok[1].to_s

      # We previously parsed a PREFIX, this will be the map value
      @parsed_prefix = value.chop if @prefix_depth && @prefix_depth > 0
      
      # If we just saw PREFIX, then this starts the parsing mode
      @prefix_depth = @list_depth + 1 if value =~ PREFIX
      
      # If the token is of the form 'prefix:suffix', create a URI and give it the
      # token as a QName
      if value.to_s =~ PNAME && base = prefix($1)
        suffix = $2
        #STDERR.puts "read_tok lexical: pfx: #{$1.inspect} => #{prefix($1).inspect}, sfx: #{suffix.inspect}"
        suffix = suffix.sub(/^\#/, "") if base.to_s.index("#")
        uri = RDF::URI(base.to_s + suffix)
        #STDERR.puts "read_tok lexical uri: #{uri.inspect}"

        [:atom, uri]
      else
        tok
      end
    else
      tok
    end
  end
end
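
The last branch of the method expands `prefix:suffix` tokens. As a minimal sketch of that step alone, the hypothetical function below takes a token and a prefix table (a plain Hash with string URIs, standing in for the reader's `@prefixes` and `RDF::URI`) and returns either the expanded URI string or the token unchanged.

```ruby
# Copy of the PNAME pattern from the constant summary.
PNAME_SKETCH = /([^:]*):(.*)/

# Hypothetical helper: expands "prefix:suffix" when the prefix is known.
def expand_pname(token, prefixes)
  match = PNAME_SKETCH.match(token)
  return token if match.nil?

  # The empty prefix is stored under the nil key, as in #prefix.
  key = match[1].empty? ? nil : match[1].to_sym
  base = prefixes[key]
  return token if base.nil?

  suffix = match[2]
  # Avoid a doubled '#' when the base already contains a fragment marker.
  suffix = suffix.sub(/^\#/, "") if base.include?("#")
  base + suffix
end

prefixes = { dc: "http://purl.org/dc/terms/", nil => "http://example/#" }
expand_pname("dc:title", prefixes)   # "http://purl.org/dc/terms/title"
expand_pname(":#frag", prefixes)     # "http://example/#frag"
expand_pname("unknown:x", prefixes)  # unchanged, prefix not declared
```

Tokens with an undeclared prefix pass through untouched, which matches the `else tok` branch above.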

#skip_comments ⇒ void

This method returns an undefined value.



# File 'lib/sxp/reader/sparql.rb', line 238

def skip_comments
  until eof?
    case (char = peek_char).chr
      when /\s+/ then skip_char
      when /;/   then skip_line
      when /#/   then skip_line
      else break
    end
  end
end
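
The loop treats whitespace as skippable and treats both `;` and `#` as starting a comment that runs to the end of the line. A rough equivalent over a standard `StringScanner` is sketched below; the function name is hypothetical, and the reader itself works on its own input stream via `peek_char`, `skip_char`, and `skip_line`.

```ruby
require 'strscan'

# Hypothetical helper: advances past whitespace and ;/# line comments.
def skip_blanks_and_comments(scanner)
  until scanner.eos?
    if scanner.skip(/\s+/)
      next                          # consumed a run of whitespace
    elsif scanner.check(/[;#]/)
      scanner.skip(/[^\n]*\n?/)     # consume the rest of the comment line
    else
      break                         # reached real content
    end
  end
  scanner
end

input = StringScanner.new("  ; lisp-style comment\n# shell-style comment\n (a b)")
skip_blanks_and_comments(input)
input.rest  # "(a b)"
```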

#variable(id, distinguished: true, existential: false) ⇒ RDF::Query::Variable

Returns the variable allocated to an ID. If no ID is provided, a new variable is allocated. Otherwise, any previous assignment is reused.

The variable has a #distinguished? method applied depending on whether it is a distinguished or non-distinguished variable. Non-distinguished variables are effectively the same as BNodes.

Returns:

  • (RDF::Query::Variable)


# File 'lib/sxp/reader/sparql.rb', line 258

def variable(id, distinguished: true, existential: false)
  id = nil if id.to_s.empty?
  
  if id
    @vars ||= {}
    @vars[id] ||= begin
      RDF::Query::Variable.new(id, distinguished: distinguished, existential: existential)
    end
  else
    RDF::Query::Variable.new(distinguished: distinguished, existential: existential)
  end
end
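
Named variables are memoized per ID, so repeated references yield the same object, while an empty ID always produces a fresh one. The sketch below shows that allocation pattern with a `Struct` standing in for `RDF::Query::Variable` (an assumption to keep the example dependency-free); `VariableTable` and `SketchVariable` are hypothetical names.

```ruby
# Stand-in for RDF::Query::Variable (assumption for this sketch).
SketchVariable = Struct.new(:name, :distinguished, :existential)

# Hypothetical class that mirrors the memoization in #variable.
class VariableTable
  def initialize
    @vars = {}
  end

  def variable(id, distinguished: true, existential: false)
    id = nil if id.to_s.empty?

    if id
      # The first allocation for an ID wins; later calls reuse it.
      @vars[id] ||= SketchVariable.new(id, distinguished, existential)
    else
      # No ID: always allocate a new, unnamed variable.
      SketchVariable.new(nil, distinguished, existential)
    end
  end
end

table = VariableTable.new
first  = table.variable("x")
second = table.variable("x", distinguished: false)  # same object as first
anon_a = table.variable("")
anon_b = table.variable("")                         # distinct from anon_a
```

One consequence visible here is that the flags passed on a later call for an existing ID are ignored, since the cached variable is returned as-is.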