Class: Nokogiri::XML::SAX::Parser
- Inherits:
-
Object
- Object
- Nokogiri::XML::SAX::Parser
- Defined in:
- lib/nokogiri/xml/sax/parser.rb,
ext/nokogiri/xml_sax_parser.c
Overview
This parser is a SAX style parser that reads it’s input as it deems necessary. The parser takes a Nokogiri::XML::SAX::Document, an optional encoding, then given an XML input, sends messages to the Nokogiri::XML::SAX::Document.
Here is an example of using this parser:
# Create a subclass of Nokogiri::XML::SAX::Document and implement
# the events we care about:
class MyDoc < Nokogiri::XML::SAX::Document
def start_element name, attrs = []
puts "starting: #{name}"
end
def end_element name
puts "ending: #{name}"
end
end
# Create our parser
parser = Nokogiri::XML::SAX::Parser.new(MyDoc.new)
# Send some XML to the parser
parser.parse(File.open(ARGV[0]))
For more information about SAX parsers, see Nokogiri::XML::SAX. Also see Nokogiri::XML::SAX::Document for the available events.
Direct Known Subclasses
Defined Under Namespace
Classes: Attribute
Constant Summary collapse
- ENCODINGS =
Encodinds this parser supports
{ "NONE" => 0, # No char encoding detected "UTF-8" => 1, # UTF-8 "UTF16LE" => 2, # UTF-16 little endian "UTF16BE" => 3, # UTF-16 big endian "UCS4LE" => 4, # UCS-4 little endian "UCS4BE" => 5, # UCS-4 big endian "EBCDIC" => 6, # EBCDIC uh! "UCS4-2143" => 7, # UCS-4 unusual ordering "UCS4-3412" => 8, # UCS-4 unusual ordering "UCS2" => 9, # UCS-2 "ISO-8859-1" => 10, # ISO-8859-1 ISO Latin 1 "ISO-8859-2" => 11, # ISO-8859-2 ISO Latin 2 "ISO-8859-3" => 12, # ISO-8859-3 "ISO-8859-4" => 13, # ISO-8859-4 "ISO-8859-5" => 14, # ISO-8859-5 "ISO-8859-6" => 15, # ISO-8859-6 "ISO-8859-7" => 16, # ISO-8859-7 "ISO-8859-8" => 17, # ISO-8859-8 "ISO-8859-9" => 18, # ISO-8859-9 "ISO-2022-JP" => 19, # ISO-2022-JP "SHIFT-JIS" => 20, # Shift_JIS "EUC-JP" => 21, # EUC-JP "ASCII" => 22, # pure ASCII }
Instance Attribute Summary collapse
-
#document ⇒ Object
The Nokogiri::XML::SAX::Document where events will be sent.
-
#encoding ⇒ Object
The encoding beings used for this document.
Instance Method Summary collapse
-
#initialize(doc = Nokogiri::XML::SAX::Document.new, encoding = "UTF-8") ⇒ Parser
constructor
Create a new Parser with
doc
andencoding
. -
#parse(thing, &block) ⇒ Object
Parse given
thing
which may be a string containing xml, or an IO object. -
#parse_file(filename) {|ctx| ... } ⇒ Object
Parse a file with
filename
. -
#parse_io(io, encoding = "ASCII") {|ctx| ... } ⇒ Object
Parse given
io
. - #parse_memory(data) {|ctx| ... } ⇒ Object
Constructor Details
#initialize(doc = Nokogiri::XML::SAX::Document.new, encoding = "UTF-8") ⇒ Parser
Create a new Parser with doc
and encoding
72 73 74 75 76 |
# File 'lib/nokogiri/xml/sax/parser.rb', line 72 def initialize(doc = Nokogiri::XML::SAX::Document.new, encoding = "UTF-8") @encoding = check_encoding(encoding) @document = doc @warned = false end |
Instance Attribute Details
#document ⇒ Object
The Nokogiri::XML::SAX::Document where events will be sent.
66 67 68 |
# File 'lib/nokogiri/xml/sax/parser.rb', line 66 def document @document end |
#encoding ⇒ Object
The encoding beings used for this document.
69 70 71 |
# File 'lib/nokogiri/xml/sax/parser.rb', line 69 def encoding @encoding end |
Instance Method Details
#parse(thing, &block) ⇒ Object
Parse given thing
which may be a string containing xml, or an IO object.
81 82 83 84 85 86 87 |
# File 'lib/nokogiri/xml/sax/parser.rb', line 81 def parse(thing, &block) if thing.respond_to?(:read) && thing.respond_to?(:close) parse_io(thing, &block) else parse_memory(thing, &block) end end |
#parse_file(filename) {|ctx| ... } ⇒ Object
Parse a file with filename
100 101 102 103 104 105 106 107 108 |
# File 'lib/nokogiri/xml/sax/parser.rb', line 100 def parse_file(filename) raise ArgumentError unless filename raise Errno::ENOENT unless File.exist?(filename) raise Errno::EISDIR if File.directory?(filename) ctx = ParserContext.file(filename) yield ctx if block_given? ctx.parse_with(self) end |
#parse_io(io, encoding = "ASCII") {|ctx| ... } ⇒ Object
Parse given io
91 92 93 94 95 96 |
# File 'lib/nokogiri/xml/sax/parser.rb', line 91 def parse_io(io, encoding = "ASCII") @encoding = check_encoding(encoding) ctx = ParserContext.io(io, ENCODINGS[@encoding]) yield ctx if block_given? ctx.parse_with(self) end |
#parse_memory(data) {|ctx| ... } ⇒ Object
110 111 112 113 114 |
# File 'lib/nokogiri/xml/sax/parser.rb', line 110 def parse_memory(data) ctx = ParserContext.memory(data) yield ctx if block_given? ctx.parse_with(self) end |