Class: Nokogiri::XML::SAX::Parser
- Inherits:
-
Object
- Object
- Nokogiri::XML::SAX::Parser
- Defined in:
- lib/nokogiri/xml/sax/parser.rb,
ext/nokogiri/xml_sax_parser.c
Overview
This parser is a SAX style parser that reads it’s input as it deems necessary. The parser takes a Nokogiri::XML::SAX::Document, an optional encoding, then given an XML input, sends messages to the Nokogiri::XML::SAX::Document.
Here is an example of using this parser:
# Create a subclass of Nokogiri::XML::SAX::Document and implement
# the events we care about:
class MyDoc < Nokogiri::XML::SAX::Document
def start_element name, attrs = []
puts "starting: #{name}"
end
def end_element name
puts "ending: #{name}"
end
end
# Create our parser
parser = Nokogiri::XML::SAX::Parser.new(MyDoc.new)
# Send some XML to the parser
parser.parse(File.open(ARGV[0]))
For more information about SAX parsers, see Nokogiri::XML::SAX. Also see Nokogiri::XML::SAX::Document for the available events.
Direct Known Subclasses
Defined Under Namespace
Classes: Attribute
Constant Summary collapse
- ENCODINGS =
Encodinds this parser supports
{ 'NONE' => 0, # No char encoding detected 'UTF-8' => 1, # UTF-8 'UTF16LE' => 2, # UTF-16 little endian 'UTF16BE' => 3, # UTF-16 big endian 'UCS4LE' => 4, # UCS-4 little endian 'UCS4BE' => 5, # UCS-4 big endian 'EBCDIC' => 6, # EBCDIC uh! 'UCS4-2143' => 7, # UCS-4 unusual ordering 'UCS4-3412' => 8, # UCS-4 unusual ordering 'UCS2' => 9, # UCS-2 'ISO-8859-1' => 10, # ISO-8859-1 ISO Latin 1 'ISO-8859-2' => 11, # ISO-8859-2 ISO Latin 2 'ISO-8859-3' => 12, # ISO-8859-3 'ISO-8859-4' => 13, # ISO-8859-4 'ISO-8859-5' => 14, # ISO-8859-5 'ISO-8859-6' => 15, # ISO-8859-6 'ISO-8859-7' => 16, # ISO-8859-7 'ISO-8859-8' => 17, # ISO-8859-8 'ISO-8859-9' => 18, # ISO-8859-9 'ISO-2022-JP' => 19, # ISO-2022-JP 'SHIFT-JIS' => 20, # Shift_JIS 'EUC-JP' => 21, # EUC-JP 'ASCII' => 22, # pure ASCII }
Instance Attribute Summary collapse
-
#document ⇒ Object
The Nokogiri::XML::SAX::Document where events will be sent.
-
#encoding ⇒ Object
The encoding beings used for this document.
Instance Method Summary collapse
-
#initialize(doc = Nokogiri::XML::SAX::Document.new, encoding = 'UTF-8') ⇒ Parser
constructor
Create a new Parser with
doc
andencoding
. -
#parse(thing, &block) ⇒ Object
Parse given
thing
which may be a string containing xml, or an IO object. -
#parse_file(filename) {|ctx| ... } ⇒ Object
Parse a file with
filename
. -
#parse_io(io, encoding = 'ASCII') {|ctx| ... } ⇒ Object
Parse given
io
. - #parse_memory(data) {|ctx| ... } ⇒ Object
Constructor Details
#initialize(doc = Nokogiri::XML::SAX::Document.new, encoding = 'UTF-8') ⇒ Parser
Create a new Parser with doc
and encoding
70 71 72 73 74 75 |
# File 'lib/nokogiri/xml/sax/parser.rb', line 70 def initialize doc = Nokogiri::XML::SAX::Document.new, encoding = 'UTF-8' check_encoding(encoding) @encoding = encoding @document = doc @warned = false end |
Instance Attribute Details
#document ⇒ Object
The Nokogiri::XML::SAX::Document where events will be sent.
64 65 66 |
# File 'lib/nokogiri/xml/sax/parser.rb', line 64 def document @document end |
#encoding ⇒ Object
The encoding beings used for this document.
67 68 69 |
# File 'lib/nokogiri/xml/sax/parser.rb', line 67 def encoding @encoding end |
Instance Method Details
#parse(thing, &block) ⇒ Object
Parse given thing
which may be a string containing xml, or an IO object.
80 81 82 83 84 85 86 |
# File 'lib/nokogiri/xml/sax/parser.rb', line 80 def parse thing, &block if thing.respond_to?(:read) && thing.respond_to?(:close) parse_io(thing, &block) else parse_memory(thing, &block) end end |
#parse_file(filename) {|ctx| ... } ⇒ Object
Parse a file with filename
100 101 102 103 104 105 106 107 |
# File 'lib/nokogiri/xml/sax/parser.rb', line 100 def parse_file filename raise ArgumentError unless filename raise Errno::ENOENT unless File.exist?(filename) raise Errno::EISDIR if File.directory?(filename) ctx = ParserContext.file filename yield ctx if block_given? ctx.parse_with self end |
#parse_io(io, encoding = 'ASCII') {|ctx| ... } ⇒ Object
Parse given io
90 91 92 93 94 95 96 |
# File 'lib/nokogiri/xml/sax/parser.rb', line 90 def parse_io io, encoding = 'ASCII' check_encoding(encoding) @encoding = encoding ctx = ParserContext.io(io, ENCODINGS[encoding]) yield ctx if block_given? ctx.parse_with self end |
#parse_memory(data) {|ctx| ... } ⇒ Object
109 110 111 112 113 |
# File 'lib/nokogiri/xml/sax/parser.rb', line 109 def parse_memory data ctx = ParserContext.memory data yield ctx if block_given? ctx.parse_with self end |