Class: Saxon::DocumentBuilder

Inherits:
Object
Defined in:
lib/saxon/document_builder.rb

Overview

Builds XDM objects from XML sources, for use in XSLT or for query and access.

Defined Under Namespace

Classes: ConfigurationDSL

Constructor Details

#initialize(s9_document_builder, &block) ⇒ DocumentBuilder

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Returns a new instance of DocumentBuilder.

Parameters:

  • s9_document_builder (net.sf.saxon.s9api.DocumentBuilder)

    The Saxon DocumentBuilder instance to wrap



# File 'lib/saxon/document_builder.rb', line 77

def initialize(s9_document_builder, &block)
  @s9_document_builder = s9_document_builder
  if block_given?
    ConfigurationDSL.define(self, block)
  end
end

Class Method Details

.create(processor) { ... } ⇒ Saxon::DocumentBuilder

Create a new DocumentBuilder that can be used to build new XML documents with the passed-in Processor. If a block is passed in it’s executed as a DSL for configuring the builder instance.

Parameters:

  • processor (Saxon::Processor)

    The Processor the new DocumentBuilder is created from

Yields:

  • A DocumentBuilder configuration DSL block

Returns:

  • (Saxon::DocumentBuilder)

    the new DocumentBuilder instance



# File 'lib/saxon/document_builder.rb', line 67

def self.create(processor, &block)
  new(processor.to_java.newDocumentBuilder, &block)
end
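
The block-configuration pattern described above can be sketched in plain Ruby. This is only an illustration of one common way to build such a DSL: StandInBuilder and StandInDSL are hypothetical names, the two DSL methods shown are assumptions, and the library's own ConfigurationDSL may expose different methods.

```ruby
# Stand-in sketch only: none of these classes are part of saxon-rb.
class StandInBuilder
  attr_accessor :line_numbering, :base_uri

  # Hypothetical DSL object: each method forwards to a setter on the builder.
  class StandInDSL
    def self.define(builder, block)
      # Run the block with the DSL object as self, so bare method calls
      # inside the block reach the methods defined below.
      new(builder).instance_exec(&block)
    end

    def initialize(builder)
      @builder = builder
    end

    def line_numbering(value)
      @builder.line_numbering = value
    end

    def base_uri(value)
      @builder.base_uri = value
    end
  end

  # Mirrors the shape of .create: build an instance, passing the block along.
  def self.create(&block)
    new(&block)
  end

  def initialize(&block)
    StandInDSL.define(self, block) if block_given?
  end
end

builder = StandInBuilder.create do
  line_numbering true
  base_uri "http://example.org/docs/"
end

puts builder.line_numbering
puts builder.base_uri
```

Using plain method calls rather than assignments inside the block avoids a Ruby pitfall: `line_numbering = true` inside a block would create a local variable instead of calling a setter.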

Instance Method Details

#base_uri ⇒ nil, ...

Return the default base URI to be used when building documents using this instance. This value will be ignored if the source being parsed has an intrinsic base URI (e.g. a File).

Returns nil if no URI is set (the default).

Returns:

  • (nil, URI::File, URI::HTTP)

    the default base URI (or nil)



# File 'lib/saxon/document_builder.rb', line 110

def base_uri
  uri = s9_document_builder.getBaseURI
  uri.nil? ? uri : URI(uri.to_s)
end
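
The nil-or-URI conversion in this method can be tried on its own with Ruby's standard library. In this minimal sketch, to_ruby_uri is a hypothetical helper, and its argument stands in for whatever the wrapped Java builder returns (nil, or an object whose to_s is an absolute URI string).

```ruby
require "uri"

# Hypothetical helper mirroring the conversion in #base_uri:
# nil passes through unchanged, anything else is parsed into a Ruby URI.
def to_ruby_uri(raw)
  raw.nil? ? nil : URI(raw.to_s)
end

http_uri = to_ruby_uri("http://example.org/docs/")
file_uri = to_ruby_uri("file:///tmp/input.xml")

puts http_uri.host              # the host part of the HTTP URI
puts file_uri.path              # the path part of the file URI
puts to_ruby_uri(nil).inspect   # nil when no base URI is set
```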

#base_uri=(uri) ⇒ Object

Set the default base URI for documents built using this instance. This value will be ignored if the source being parsed has an intrinsic base URI (e.g. a File).

Parameters:

  • uri (String, URI::File, URI::HTTP)

    The (absolute) base URI to use



# File 'lib/saxon/document_builder.rb', line 122

def base_uri=(uri)
  s9_document_builder.setBaseURI(java.net.URI.new(uri.to_s))
end

#build(source) ⇒ Saxon::XDM::Node

Builds a document from the given source and returns the Saxon::XDM::Node representing the root of the document tree.

Parameters:

  • source (Saxon::Source)

    The Saxon::Source containing the source IO/string

Returns:

  • (Saxon::XDM::Node)

    The Saxon::XDM::Node representing the root of the document tree



# File 'lib/saxon/document_builder.rb', line 224

def build(source)
  XDM::Node.new(s9_document_builder.build(source.to_java))
end

#dtd_validation=(on) ⇒ Object

Switches DTD validation on or off.

DTD validation only applies to documents that contain a <!DOCTYPE> declaration. Switching DTD validation off does not stop the XML parser Saxon uses from trying to retrieve the referenced DTD, which can mean network requests. By default, the SAX parser Saxon uses (Xerces) doesn’t make use of XML catalogs, which causes problems when a document references a DTD by relative path, as in:

<!DOCTYPE root-element SYSTEM "example.dtd">

This can be controlled through a configuration option, however.



# File 'lib/saxon/document_builder.rb', line 216

def dtd_validation=(on)
  s9_document_builder.setDTDValidation(on)
end

#dtd_validation? ⇒ Boolean

Returns whether DTD Validation is enabled.

Returns:

  • (Boolean)

    whether DTD Validation is enabled



# File 'lib/saxon/document_builder.rb', line 199

def dtd_validation?
  s9_document_builder.isDTDValidation
end

#line_numbering=(on_or_not) ⇒ Object

Switch tracking of line and column numbers for elements in documents created by this instance on or off.

Parameters:

  • on_or_not (Boolean)

    whether or not to track line numbering



# File 'lib/saxon/document_builder.rb', line 99

def line_numbering=(on_or_not)
  s9_document_builder.setLineNumbering(on_or_not)
end

#line_numbering? ⇒ Boolean

Report whether documents created using this instance will keep track of the line and column numbers of elements.

Returns:

  • (Boolean)

    whether line numbering will be tracked



# File 'lib/saxon/document_builder.rb', line 88

def line_numbering?
  s9_document_builder.isLineNumbering
end

#to_java ⇒ Java::NetSfSaxonS9api::DocumentBuilder

Returns the underlying Java Saxon DocumentBuilder instance.

Returns:

  • (Java::NetSfSaxonS9api::DocumentBuilder)

    The underlying Java Saxon DocumentBuilder instance



# File 'lib/saxon/document_builder.rb', line 230

def to_java
  s9_document_builder
end

#whitespace_stripping_policy ⇒ :all, ...

Return the whitespace stripping policy for this instance. Returns one of the standard policy names as a symbol, or the custom Java WhitespaceStrippingPolicy if one was defined using #whitespace_stripping_policy = ->(qname) { … }. (See #whitespace_stripping_policy= for more.)

:all: All whitespace-only nodes will be discarded

:none: No whitespace-only nodes will be discarded (the default if DTD or schema validation is not in effect)

:ignorable: Whitespace-only nodes inside elements defined as element-only in the DTD or schema being used will be discarded (the default if DTD or schema validation is in effect)

:unspecified: the default, which in practice means :ignorable if DTD or schema validation is in effect, and :none otherwise.

Returns:

  • (:all, :none, :ignorable, :unspecified, Saxon::S9API::WhitespaceStrippingPolicy)


# File 'lib/saxon/document_builder.rb', line 145

def whitespace_stripping_policy
  s9_policy = s9_document_builder.getWhitespaceStrippingPolicy
  case s9_policy
  when Saxon::S9API::WhitespaceStrippingPolicy::UNSPECIFIED
    :unspecified
  when Saxon::S9API::WhitespaceStrippingPolicy::NONE
    :none
  when Saxon::S9API::WhitespaceStrippingPolicy::IGNORABLE
    :ignorable
  when Saxon::S9API::WhitespaceStrippingPolicy::ALL
    :all
  else
    s9_policy
  end
end

#whitespace_stripping_policy=(policy) ⇒ Object

Set the whitespace stripping policy to be used for documents built with this instance.

Possible values are:

  • One of the standard policies, as a symbol (:all, :none, :ignorable, :unspecified, see #whitespace_stripping_policy).

  • A Java net.sf.saxon.s9api.WhitespaceStrippingPolicy instance

  • A Proc/lambda that is handed an element name as a QName, and should return true (if whitespace should be stripped for this element) or false (it should not).

Examples:

# builder is assumed to be an existing Saxon::DocumentBuilder instance
builder.whitespace_stripping_policy = ->(element_qname) {
  element_qname == Saxon::QName.clark("{http://example.org/}element-name")
}
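
The predicate itself is ordinary Ruby and can be exercised without Saxon. In this minimal sketch, QNameStandIn is a hypothetical substitute for Saxon::QName (a Struct, which compares by value), and STRIP_EXAMPLE_ELEMENT is an illustrative name.

```ruby
# Stand-in for Saxon::QName so the sketch runs without Saxon.
# A Struct compares by value, which is all the predicate relies on.
QNameStandIn = Struct.new(:namespace_uri, :local_name)

# Returns true only for {http://example.org/}element-name, meaning
# whitespace-only text nodes inside that element would be stripped.
STRIP_EXAMPLE_ELEMENT = ->(element_qname) {
  element_qname == QNameStandIn.new("http://example.org/", "element-name")
}

matching = QNameStandIn.new("http://example.org/", "element-name")
other    = QNameStandIn.new("http://example.org/", "other-element")

puts STRIP_EXAMPLE_ELEMENT.call(matching)
puts STRIP_EXAMPLE_ELEMENT.call(other)
```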

Parameters:

  • policy (Symbol, Proc, Saxon::S9API::WhitespaceStrippingPolicy)

    the policy to use



# File 'lib/saxon/document_builder.rb', line 181

def whitespace_stripping_policy=(policy)
  case policy
  when :unspecified, :none, :ignorable, :all
    s9_policy = Saxon::S9API::WhitespaceStrippingPolicy.const_get(policy.to_s.upcase.to_sym)
  when Proc
    wrapped_policy = ->(s9_qname) {
      policy.call(Saxon::QName.new(s9_qname))
    }
    s9_policy = Saxon::S9API::WhitespaceStrippingPolicy.makeCustomPolicy(wrapped_policy)
  when Saxon::S9API::WhitespaceStrippingPolicy
    s9_policy = policy
  else
    raise InvalidWhitespaceStrippingPolicyError, "#{policy.inspect} is not one of the allowed Symbols, or a custom policy"
  end
  s9_document_builder.setWhitespaceStrippingPolicy(s9_policy)
end
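
The symbol branch of the setter relies on const_get to turn a policy name into a constant. The lookup can be sketched in isolation as follows. StandInPolicy, StandInInvalidPolicyError and resolve_policy are hypothetical names used only so the example runs without Saxon. Unlike the real setter, this sketch returns a Proc unchanged rather than wrapping it in a custom Java policy.

```ruby
# Stand-ins for the Java policy constants; not part of saxon-rb.
module StandInPolicy
  UNSPECIFIED = "UNSPECIFIED".freeze
  NONE        = "NONE".freeze
  IGNORABLE   = "IGNORABLE".freeze
  ALL         = "ALL".freeze
end

class StandInInvalidPolicyError < StandardError; end

# Hypothetical helper following the same dispatch order as the setter.
def resolve_policy(policy)
  case policy
  when :unspecified, :none, :ignorable, :all
    # e.g. :ignorable becomes :IGNORABLE, then StandInPolicy::IGNORABLE
    StandInPolicy.const_get(policy.to_s.upcase.to_sym)
  when Proc
    policy
  else
    raise StandInInvalidPolicyError,
          "#{policy.inspect} is not one of the allowed Symbols, or a custom policy"
  end
end

puts resolve_policy(:ignorable)

begin
  resolve_policy("all") # a String, not a Symbol, so it is rejected
rescue StandInInvalidPolicyError => e
  puts e.message
end
```

Listing the allowed symbols in the when clause before calling const_get means arbitrary input never reaches the constant lookup.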