Class: Nokogiri::XML::ParseOptions

Inherits:
Object
Defined in:
lib/nokogiri/xml/parse_options.rb

Overview

Parse options for passing to Nokogiri.XML or Nokogiri.HTML.

Building combinations of parse options

You can build your own combinations of these parse options by using any of the following methods.

Note: All examples attempt to set the RECOVER & NOENT options. All examples use Ruby 2 optional parameter syntax.

Ruby’s bitwise operators

You can use the Ruby bitwise operators to set various combinations.

<code>Nokogiri.XML('<content>Chapter 1</content>', options: Nokogiri::XML::ParseOptions.new((1 << 0) | (1 << 1)))</code>

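The raw shifts above correspond to the named constants listed in the Constant Summary. As a minimal sketch that runs without Nokogiri installed, the following uses a stand-in module named Flags (hypothetical, not part of Nokogiri) whose values are copied from this page. With Nokogiri loaded, the same expression would presumably be written with Nokogiri::XML::ParseOptions::RECOVER and Nokogiri::XML::ParseOptions::NOENT.

```ruby
# Stand-in module for illustration only; the bit values are copied from
# the Constant Summary on this page.
module Flags
  RECOVER = 1 << 0
  NOENT   = 1 << 1
end

# The same combination, written with raw shifts and with named flags.
by_number = (1 << 0) | (1 << 1)
by_name   = Flags::RECOVER | Flags::NOENT

puts by_number == by_name # prints "true"
puts by_name              # prints "3"
```

Named constants read better than raw shifts and make the intent of each bit explicit.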
Method chaining

Every option has an equivalent method in lowercase. You can chain these methods together to set various combinations.

<code>Nokogiri.XML('<content>Chapter 1</content>', options: Nokogiri::XML::ParseOptions.new.recover.noent)</code>

Using Ruby Blocks

You can also set up parse combinations in the block passed to Nokogiri.XML or Nokogiri.HTML.

<code>Nokogiri.XML('<content>Chapter 1</content>') {|config| config.recover.noent}</code>

Removing particular parse options

You can also remove options from an instance of ParseOptions dynamically. Every option has an equivalent no{option} method in lowercase. You can call these methods on an instance of ParseOptions to remove the option. Note that this is not available for STRICT.

# Setting the RECOVER & NOENT options...
options = Nokogiri::XML::ParseOptions.new.recover.noent
# later...
options.norecover # Removes the Nokogiri::XML::ParseOptions::RECOVER option
options.nonoent # Removes the Nokogiri::XML::ParseOptions::NOENT option
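
To illustrate how chainable setters, removers, and predicates of this kind can sit on top of a single integer, here is a minimal sketch. MiniParseOptions is a hypothetical stand-in written for this example and is not Nokogiri's source; it assumes each setter ORs its bit in, each no{option} method clears the bit, and STRICT is skipped because it is 0 and has no bit to clear.

```ruby
# Hypothetical stand-in for illustration; not Nokogiri's implementation.
class MiniParseOptions
  STRICT  = 0
  RECOVER = 1 << 0
  NOENT   = 1 << 1

  attr_reader :options

  def initialize(options = STRICT)
    @options = options
  end

  # Generate option, nooption and option? methods for every flag
  # except STRICT, which has no bit of its own.
  constants.each do |name|
    next if name == :STRICT

    bit = const_get(name)
    method_name = name.to_s.downcase

    define_method(method_name) do
      @options |= bit   # set the bit
      self              # return self so calls can be chained
    end

    define_method("no#{method_name}") do
      @options &= ~bit  # clear the bit
      self
    end

    define_method("#{method_name}?") do
      @options & bit == bit
    end
  end
end

opts = MiniParseOptions.new.recover.noent
opts.norecover
puts opts.recover? # prints "false"
puts opts.noent?   # prints "true"
puts opts.options  # prints "2"
```

Returning self from every setter and remover is what makes the chaining style possible.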

Constant Summary

STRICT = 0
Strict parsing

RECOVER = 1 << 0
Recover from errors

NOENT = 1 << 1
Substitute entities

DTDLOAD = 1 << 2
Load external subsets

DTDATTR = 1 << 3
Default DTD attributes

DTDVALID = 1 << 4
Validate with the DTD

NOERROR = 1 << 5
Suppress error reports

NOWARNING = 1 << 6
Suppress warning reports

PEDANTIC = 1 << 7
Pedantic error reporting

NOBLANKS = 1 << 8
Remove blank nodes

SAX1 = 1 << 9
Use the SAX1 interface internally

XINCLUDE = 1 << 10
Implement XInclude substitution

NONET = 1 << 11
Forbid network access. Recommended for dealing with untrusted documents.

NODICT = 1 << 12
Do not reuse the context dictionary

NSCLEAN = 1 << 13
Remove redundant namespaces declarations

NOCDATA = 1 << 14
Merge CDATA as text nodes

NOXINCNODE = 1 << 15
Do not generate XINCLUDE START/END nodes

COMPACT = 1 << 16
Compact small text nodes; no modification of the tree allowed afterwards (will possibly crash if you try to modify the tree)

OLD10 = 1 << 17
Parse using XML-1.0 before update 5

NOBASEFIX = 1 << 18
Do not fixup XINCLUDE xml:base uris

HUGE = 1 << 19
Relax any hardcoded limit from the parser

DEFAULT_XML = RECOVER | NONET
The default options used for parsing XML documents

DEFAULT_HTML = RECOVER | NOERROR | NOWARNING | NONET
The default options used for parsing HTML documents
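
Because every flag is a distinct bit, the two defaults reduce to plain integers that can be decoded back into flag names. The sketch below works this out using a subset of the values listed above. FLAG_BITS and flag_names are hypothetical names introduced for this example and are not part of Nokogiri.

```ruby
# Bit values copied from the constant list above (subset only).
FLAG_BITS = {
  RECOVER:   1 << 0,
  NOERROR:   1 << 5,
  NOWARNING: 1 << 6,
  NONET:     1 << 11
}.freeze

# Hypothetical helper: names of the flags whose bit is set in value.
def flag_names(value)
  FLAG_BITS.select { |_name, bit| value & bit == bit }.keys
end

default_xml  = FLAG_BITS[:RECOVER] | FLAG_BITS[:NONET]
default_html = FLAG_BITS[:RECOVER] | FLAG_BITS[:NOERROR] |
               FLAG_BITS[:NOWARNING] | FLAG_BITS[:NONET]

puts default_xml           # prints "2049"
puts default_html          # prints "2145"
p flag_names(default_xml)  # prints "[:RECOVER, :NONET]"
p flag_names(default_html) # prints "[:RECOVER, :NOERROR, :NOWARNING, :NONET]"
```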

Constructor Details

#initialize(options = STRICT) ⇒ ParseOptions

Returns a new instance of ParseOptions.



# File 'lib/nokogiri/xml/parse_options.rb', line 77

def initialize options = STRICT
  @options = options
end

Instance Attribute Details

#options ⇒ Object Also known as: to_i

Returns the value of attribute options.



# File 'lib/nokogiri/xml/parse_options.rb', line 76

def options
  @options
end

Instance Method Details

#inspect ⇒ Object



# File 'lib/nokogiri/xml/parse_options.rb', line 111

def inspect
  options = []
  self.class.constants.each do |k|
    options << k.downcase if send(:"#{k.downcase}?")
  end
  super.sub(/>$/, " " + options.join(', ') + ">")
end

#strict ⇒ Object



# File 'lib/nokogiri/xml/parse_options.rb', line 100

def strict
  @options &= ~RECOVER
  self
end

#strict? ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/nokogiri/xml/parse_options.rb', line 105

def strict?
  @options & RECOVER == STRICT
end
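
Since STRICT is 0, it has no bit of its own: the source above treats "strict" as the absence of the RECOVER bit, and other flags may remain set. The following sketch replays that arithmetic on plain integers, using local variables that mirror the constant values on this page (they are stand-ins, not Nokogiri's constants). Note that in Ruby `&` binds more tightly than `==`, so `options & recover == strict` compares the masked value with 0.

```ruby
# Local stand-ins mirroring STRICT, RECOVER and NOENT from this page.
strict  = 0
recover = 1 << 0
noent   = 1 << 1

options = recover | noent

# Same test as strict? above: is the RECOVER bit clear?
strict_before = options & recover == strict

# Same operation as strict above: clear only the RECOVER bit.
options &= ~recover
strict_after = options & recover == strict

puts strict_before # prints "false"
puts strict_after  # prints "true"
puts options       # prints "2" (NOENT is still set)
```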