Class: RDF::NTriples::Writer
- Includes: Util::Logger
- Defined in: lib/rdf/ntriples/writer.rb
Overview
N-Triples serializer.
Output is serialized as UTF-8. To serialize as ASCII with Unicode escapes, pass encoding: Encoding::ASCII as an option to #initialize.
Direct Known Subclasses
Constant Summary
- ESCAPE_PLAIN =
/\A[\x20-\x21\x23-\x26\x28#{Regexp.escape '['}#{Regexp.escape ']'}-\x7E]*\z/m.freeze
- ESCAPE_PLAIN_U =
/\A(?:#{Reader::IRI_RANGE}|#{Reader::UCHAR})*\z/.freeze
Constants included from Util::Logger
Instance Attribute Summary
Attributes inherited from Writer
Class Method Summary
- .escape(string, encoding = nil) ⇒ String
  Escape Literal and URI content.
- .escape_ascii(u, encoding) ⇒ String
  Standard ASCII escape sequences.
- .escape_unicode(u, encoding) ⇒ String
  Escape ASCII and Unicode characters.
- .escape_utf16(u) ⇒ String
- .escape_utf32(u) ⇒ String
- .serialize(value) ⇒ String
  Returns the serialized N-Triples representation of the given RDF value.
Instance Method Summary
- #escaped(string) ⇒ Object
- #format_literal(literal, **options) ⇒ String
  Returns the N-Triples representation of a literal.
- #format_node(node, unique_bnodes: false, **options) ⇒ String
  Returns the N-Triples representation of a blank node.
- #format_quotedTriple(statement, **options) ⇒ String
  Deprecated. Quoted triples are now deprecated.
- #format_statement(statement, **options) ⇒ String
  Returns the N-Triples representation of a statement.
- #format_triple(subject, predicate, object, **options) ⇒ String
  Returns the N-Triples representation of a triple.
- #format_tripleTerm(statement, **options) ⇒ String
  Returns the N-Triples representation of an RDF 1.2 triple term.
- #format_uri(uri, **options) ⇒ String
  Returns the N-Triples representation of a URI reference using write encoding.
- #initialize(output = $stdout, validate: true, **options) {|writer| ... } ⇒ Writer (constructor)
  Initializes the writer.
- #write_comment(text) ⇒ void
  Outputs an N-Triples comment line.
- #write_triple(subject, predicate, object) ⇒ void
  Outputs the N-Triples representation of a triple.
Methods included from Util::Logger
#log_debug, #log_depth, #log_error, #log_fatal, #log_info, #log_recover, #log_recovering?, #log_statistics, #log_warn, #logger
Methods inherited from Writer
accept?, #base_uri, buffer, #canonicalize?, dump, each, #encoding, #flush, for, format, #format_list, #format_term, open, options, #prefix, #prefixes, #prefixes=, #to_sym, to_sym, #validate?, #write_epilogue, #write_prologue, #write_statement, #write_triples
Methods included from Util::Aliasing::LateBound
Methods included from Writable
Constructor Details
#initialize(output = $stdout, validate: true, **options) {|writer| ... } ⇒ Writer
Initializes the writer.
    # File 'lib/rdf/ntriples/writer.rb', line 192
    def initialize(output = $stdout, validate: true, **, &block)
      super
    end
Class Method Details
.escape(string, encoding = nil) ⇒ String
Escape Literal and URI content. If encoding is ASCII or is not given, all non-ASCII characters are escaped; otherwise only the ASCII characters that must be escaped are escaped.
    # File 'lib/rdf/ntriples/writer.rb', line 57
    def self.escape(string, encoding = nil)
      ret = case
        when string.match?(ESCAPE_PLAIN) # a shortcut for the simple case
          string
        when string.ascii_only?
          StringIO.open do |buffer|
            buffer.set_encoding(Encoding::ASCII)
            string.each_byte { |u| buffer << escape_ascii(u, encoding) }
            buffer.string
          end
        when encoding && encoding != Encoding::ASCII
          # Not encoding UTF-8 characters
          StringIO.open do |buffer|
            buffer.set_encoding(encoding)
            string.each_char do |u|
              buffer << case u.ord
                when (0x00..0x7F)
                  escape_ascii(u, encoding)
                else
                  u
              end
            end
            buffer.string
          end
        else
          # Encode ASCII && UTF-8 characters
          StringIO.open do |buffer|
            buffer.set_encoding(Encoding::ASCII)
            string.each_codepoint { |u| buffer << escape_unicode(u, encoding) }
            buffer.string
          end
      end
      encoding ? ret.encode(encoding) : ret
    end
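The order of the four branches in .escape determines the output, so it can help to look at the selection logic in isolation. The following is a minimal standalone sketch, not the library's code: `escape_branch` is a hypothetical helper, and the regular expression inside it is an assumed, simplified stand-in for ESCAPE_PLAIN. It only reports which branch a given string and encoding would take.

```ruby
# Standalone sketch; does not require the rdf gem.

# Hypothetical helper: names the branch that .escape would take.
def escape_branch(string, encoding = nil)
  # Assumption: simplified stand-in for ESCAPE_PLAIN that accepts printable
  # ASCII except the double quote and the backslash.
  plain = /\A[\x20-\x21\x23-\x5B\x5D-\x7E]*\z/

  if string.match?(plain)
    :plain            # returned unchanged
  elsif string.ascii_only?
    :ascii_escapes    # each byte goes through escape_ascii
  elsif encoding && encoding != Encoding::ASCII
    :keep_unicode     # non-ASCII characters are written as-is
  else
    :escape_unicode   # non-ASCII characters become \u or \U escapes
  end
end

p escape_branch("hello")                       # :plain
p escape_branch("tab\there")                   # :ascii_escapes
p escape_branch("caf\u00E9", Encoding::UTF_8)  # :keep_unicode
p escape_branch("caf\u00E9")                   # :escape_unicode
p escape_branch("caf\u00E9", Encoding::ASCII)  # :escape_unicode
```

Note that omitting the encoding behaves like Encoding::ASCII in this selection, which matches the final `else` branch of the source above.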
.escape_ascii(u, encoding) ⇒ String
Standard ASCII escape sequences. As implemented, the result does not depend on encoding: the method emits the `\b`, `\t`, `\n`, `\f`, `\r`, `\"` and `\\` escapes (the N-Triples recommendation includes the `\b` and `\f` escape sequences), writes the remaining control characters and U+007F as `\uXXXX`, and returns other printable ASCII characters unchanged.
Within STRING_LITERAL_QUOTE, only the characters `U+0022`, `U+005C`, `U+000A`, `U+000D` are encoded using `ECHAR`. `ECHAR` must not be used for characters that are allowed directly in STRING_LITERAL_QUOTE.
    # File 'lib/rdf/ntriples/writer.rb', line 126
    def self.escape_ascii(u, encoding)
      case (u = u.ord)
        when (0x08) then "\\b"
        when (0x09) then "\\t"
        when (0x0A) then "\\n"
        when (0x0C) then "\\f"
        when (0x0D) then "\\r"
        when (0x22) then "\\\""
        when (0x5C) then "\\\\"
        when (0x00..0x1F) then escape_utf16(u)
        when (0x7F) then escape_utf16(u)
        when (0x20..0x7E) then u.chr
        else
          raise ArgumentError.new("expected an ASCII character in (0x00..0x7F), but got 0x#{u.to_s(16)}")
      end
    end
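To see the mapping at a glance, the same escapes can be expressed as a lookup table. This is a hedged standalone sketch rather than the library's implementation: `NAMED_ESCAPES_SKETCH` and `escape_ascii_sketch` are hypothetical names, and the encoding parameter is left out because the source above does not use it.

```ruby
# Standalone sketch; does not require the rdf gem.
# Named escapes keyed by code point (the same pairs as the case statement above).
NAMED_ESCAPES_SKETCH = {
  0x08 => "\\b", 0x09 => "\\t", 0x0A => "\\n", 0x0C => "\\f",
  0x0D => "\\r", 0x22 => "\\\"", 0x5C => "\\\\"
}.freeze

# Hypothetical helper: accepts a one-character String or an Integer code point.
def escape_ascii_sketch(u)
  code = u.ord
  unless (0x00..0x7F).cover?(code)
    raise ArgumentError, "expected an ASCII character, got 0x#{code.to_s(16)}"
  end
  return NAMED_ESCAPES_SKETCH[code] if NAMED_ESCAPES_SKETCH.key?(code)
  # Remaining control characters and DEL become \uXXXX.
  return format("\\u%04X", code) if code < 0x20 || code == 0x7F
  code.chr
end

escaped = "say \"hi\"\n".each_char.map { |c| escape_ascii_sketch(c) }.join
puts escaped  # say \"hi\"\n  (the backslashes are literal characters)
```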
.escape_unicode(u, encoding) ⇒ String
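Escape ASCII and Unicode characters. ASCII code points are passed to .escape_ascii, other code points up to U+FFFF are written as `\uXXXX`, and code points above U+FFFF as `\UXXXXXXXX`. When the encoding is UTF-8, .escape does not send non-ASCII characters through this method, so only ASCII characters are escaped.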
    # File 'lib/rdf/ntriples/writer.rb', line 101
    def self.escape_unicode(u, encoding)
      case (u = u.ord)
        when (0x00..0x7F)        # ASCII 7-bit
          escape_ascii(u, encoding)
        when (0x80..0xFFFF)      # Unicode BMP
          escape_utf16(u)
        when (0x10000..0x10FFFF) # Unicode
          escape_utf32(u)
        else
          raise ArgumentError.new("expected a Unicode codepoint in (0x00..0x10FFFF), but got 0x#{u.to_s(16)}")
      end
    end
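The range dispatch above can be tried without the library. The sketch below is an illustration under stated assumptions: `escape_codepoint_sketch` and `escape_string_sketch` are hypothetical names, and ASCII code points are returned unchanged here, whereas the documented method hands them to .escape_ascii.

```ruby
# Standalone sketch; does not require the rdf gem.
# Simplification (assumption): ASCII code points are returned unchanged.
def escape_codepoint_sketch(codepoint)
  case codepoint
  when 0x00..0x7F        then codepoint.chr
  when 0x80..0xFFFF      then format("\\u%04X", codepoint)  # four hex digits
  when 0x10000..0x10FFFF then format("\\U%08X", codepoint)  # eight hex digits
  else
    raise ArgumentError, "expected a code point in 0x00..0x10FFFF, got 0x#{codepoint.to_s(16)}"
  end
end

# Applies the per-code-point escape to a whole string.
def escape_string_sketch(string)
  string.each_codepoint.map { |cp| escape_codepoint_sketch(cp) }.join
end

puts escape_string_sketch("na\u00EFve \u{1F600}")  # na\u00EFve \U0001F600
```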
.escape_utf16(u) ⇒ String
    # File 'lib/rdf/ntriples/writer.rb', line 147
    def self.escape_utf16(u)
      sprintf("\\u%04X", u.ord)
    end
.escape_utf32(u) ⇒ String
    # File 'lib/rdf/ntriples/writer.rb', line 155
    def self.escape_utf32(u)
      sprintf("\\U%08X", u.ord)
    end
.serialize(value) ⇒ String
Returns the serialized N-Triples representation of the given RDF value.
    # File 'lib/rdf/ntriples/writer.rb', line 166
    def self.serialize(value)
      writer = (@serialize_writer_memo ||= self.new)
      case value
        when nil then nil
        when FalseClass then value.to_s
        when RDF::Statement
          writer.format_statement(value) + "\n"
        when RDF::Term
          writer.format_term(value)
        else
          raise ArgumentError, "expected an RDF::Statement or RDF::Term, but got #{value.inspect}"
      end
    end
Instance Method Details
#escaped(string) ⇒ Object
    # File 'lib/rdf/ntriples/writer.rb', line 338
    def escaped(string)
      self.class.escape(string, encoding)
    end
#format_literal(literal, **options) ⇒ String
Returns the N-Triples representation of a literal.
    # File 'lib/rdf/ntriples/writer.rb', line 322
    def format_literal(literal, **)
      case literal
        when RDF::Literal
          # Note, escaping here is more robust than in Term
          text = quoted(escaped(literal.value))
          text << "@#{literal.language}" if literal.language?
          text << "--#{literal.direction}" if literal.direction?
          text << "^^<#{uri_for(literal.datatype)}>" if literal.datatype?
          text
        else
          quoted(escaped(literal.to_s))
      end
    end
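The suffix order (language tag, then base direction, then datatype) is easier to see with plain strings. This is a minimal sketch, not the library's method: `format_literal_sketch` is a hypothetical helper that assumes the value is already escaped and that the datatype is given as a plain IRI string.

```ruby
# Standalone sketch; does not require the rdf gem.
# Assumptions: value is already escaped; datatype is a plain IRI string.
def format_literal_sketch(value, language: nil, direction: nil, datatype: nil)
  text = "\"#{value}\""
  text << "@#{language}" if language
  text << "--#{direction}" if direction
  text << "^^<#{datatype}>" if datatype
  text
end

puts format_literal_sketch("chat", language: "fr")
# "chat"@fr
puts format_literal_sketch("hello", language: "en", direction: "ltr")
# "hello"@en--ltr
puts format_literal_sketch("42", datatype: "http://www.w3.org/2001/XMLSchema#integer")
# "42"^^<http://www.w3.org/2001/XMLSchema#integer>
```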
#format_node(node, unique_bnodes: false, **options) ⇒ String
Returns the N-Triples representation of a blank node.
    # File 'lib/rdf/ntriples/writer.rb', line 268
    def format_node(node, unique_bnodes: false, **)
      unique_bnodes ? node.to_unique_base : node.to_s
    end
#format_quotedTriple(statement, **options) ⇒ String
Deprecated. Quoted triples are now deprecated.
Returns the N-Triples representation of an RDF-star quoted triple.
    # File 'lib/rdf/ntriples/writer.rb', line 243
    def format_quotedTriple(statement, **)
      # FIXME: quoted triples are now deprecated
      "<<%s %s %s>>" % statement.to_a.map { |value| format_term(value, **) }
    end
#format_statement(statement, **options) ⇒ String
Returns the N-Triples representation of a statement.
    # File 'lib/rdf/ntriples/writer.rb', line 222
    def format_statement(statement, **)
      format_triple(*statement.to_triple, **)
    end
#format_triple(subject, predicate, object, **options) ⇒ String
Returns the N-Triples representation of a triple.
    # File 'lib/rdf/ntriples/writer.rb', line 256
    def format_triple(subject, predicate, object, **)
      "%s %s %s ." % [subject, predicate, object].map { |value| format_term(value, **) }
    end
#format_tripleTerm(statement, **options) ⇒ String
Returns the N-Triples representation of an RDF 1.2 triple term.
    # File 'lib/rdf/ntriples/writer.rb', line 232
    def format_tripleTerm(statement, **)
      "<<(%s %s %s)>>" % statement.to_a.map { |value| format_term(value, **) }
    end
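The two format strings used by #format_triple and #format_tripleTerm can be compared side by side. The following standalone sketch uses hypothetical helpers (`triple_sketch`, `triple_term_sketch`) and assumes every term has already been formatted as an N-Triples string; the example.org IRIs are placeholders.

```ruby
# Standalone sketch; does not require the rdf gem.
# Assumption: every argument is already a formatted N-Triples term.
def triple_sketch(subject, predicate, object)
  "%s %s %s ." % [subject, predicate, object]
end

def triple_term_sketch(subject, predicate, object)
  "<<(%s %s %s)>>" % [subject, predicate, object]
end

# A triple term used as the object of an enclosing triple.
term = triple_term_sketch("<http://example.org/s>", "<http://example.org/p>", "\"o\"")
puts triple_sketch("_:b0", "<http://example.org/says>", term)
# _:b0 <http://example.org/says> <<(<http://example.org/s> <http://example.org/p> "o")>> .
```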
#format_uri(uri, **options) ⇒ String
Returns the N-Triples representation of a URI reference using write encoding.
    # File 'lib/rdf/ntriples/writer.rb', line 278
    def format_uri(uri, **)
      string = uri.to_s
      iriref = case
        when string.match?(ESCAPE_PLAIN_U) # a shortcut for the simple case
          string
        when string.ascii_only? || (encoding && encoding != Encoding::ASCII)
          StringIO.open do |buffer|
            buffer.set_encoding(encoding)
            string.each_char do |u|
              buffer << case u.ord
                when (0x00..0x20) then self.class.escape_utf16(u)
                when 0x22, 0x3c, 0x3e, 0x5c, 0x5e, 0x60, 0x7b, 0x7c, 0x7d # "<>\^`{|}
                  self.class.escape_utf16(u)
                else u
              end
            end
            buffer.string
          end
        else
          # Encode ASCII && UTF-8/16 characters
          StringIO.open do |buffer|
            buffer.set_encoding(Encoding::ASCII)
            string.each_byte do |u|
              buffer << case u
                when (0x00..0x20) then self.class.escape_utf16(u)
                when 0x22, 0x3c, 0x3e, 0x5c, 0x5e, 0x60, 0x7b, 0x7c, 0x7d # "<>\^`{|}
                  self.class.escape_utf16(u)
                when (0x80..0xFFFF) then self.class.escape_utf16(u)
                when (0x10000..0x10FFFF) then self.class.escape_utf32(u)
                else u
              end
            end
            buffer.string
          end
      end
      encoding ? "<#{iriref}>".encode(encoding) : "<#{iriref}>"
    end
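The character-level rule for IRIs differs from the one for literals: control characters, the space, and a fixed set of punctuation are always written as `\uXXXX`. Below is a minimal standalone sketch of that rule, assuming non-ASCII characters are kept as-is (as in the non-ASCII encoding branch above). `IRI_ESCAPED_SKETCH` and `format_iri_sketch` are hypothetical names.

```ruby
# Standalone sketch; does not require the rdf gem.
# Code points of the characters " < > \ ^ ` { | } taken from the source above.
IRI_ESCAPED_SKETCH = [0x22, 0x3c, 0x3e, 0x5c, 0x5e, 0x60, 0x7b, 0x7c, 0x7d].freeze

# Hypothetical helper: escapes an IRI string and wraps it in angle brackets.
# Assumption: non-ASCII characters are written unchanged.
def format_iri_sketch(iri)
  body = iri.each_char.map do |char|
    code = char.ord
    if code <= 0x20 || IRI_ESCAPED_SKETCH.include?(code)
      format("\\u%04X", code)
    else
      char
    end
  end.join
  "<#{body}>"
end

puts format_iri_sketch("http://example.org/a b")   # <http://example.org/a\u0020b>
puts format_iri_sketch("http://example.org/{id}")  # <http://example.org/\u007Bid\u007D>
```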
#write_comment(text) ⇒ void
This method returns an undefined value.
Outputs an N-Triples comment line.
    # File 'lib/rdf/ntriples/writer.rb', line 201
    def write_comment(text)
      puts "# #{text.chomp}" # TODO: correctly output multi-line comments
    end
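As the TODO in the source notes, a text with embedded newlines would produce continuation lines that do not start with `#`. One possible way to handle this, shown here only as a hedged sketch and not as part of the documented API, is to prefix every line separately. `write_multiline_comment` is a hypothetical helper that writes to any IO-like object.

```ruby
require 'stringio'

# Hypothetical helper, not part of the documented API: writes one "# " line
# per line of text to the given IO-like object.
def write_multiline_comment(io, text)
  text.each_line { |line| io.puts "# #{line.chomp}" }
end

out = StringIO.new
write_multiline_comment(out, "first line\nsecond line\n")
print out.string
# # first line
# # second line
```

Unlike the documented method, this sketch writes nothing at all for an empty string, because `each_line` yields no lines in that case.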
#write_triple(subject, predicate, object) ⇒ void
This method returns an undefined value.
Outputs the N-Triples representation of a triple.
    # File 'lib/rdf/ntriples/writer.rb', line 212
    def write_triple(subject, predicate, object)
      puts format_triple(subject, predicate, object, **@options)
    end