Class: Nokogiri::HTML4::SAX::Parser
- Inherits: XML::SAX::Parser
  (hierarchy: Object, XML::SAX::Parser, Nokogiri::HTML4::SAX::Parser)
- Defined in: lib/nokogiri/html4/sax/parser.rb
Overview
This class lets you perform SAX-style parsing on HTML, with HTML error correction.
Here is a basic usage example:
    class MyDoc < Nokogiri::XML::SAX::Document
      def start_element name, attributes = []
        puts "found a #{name}"
      end
    end

    parser = Nokogiri::HTML4::SAX::Parser.new(MyDoc.new)
    parser.parse(File.read(ARGV[0], mode: 'rb'))
For more information on SAX parsers, see Nokogiri::XML::SAX.
Constant Summary
Constants inherited from XML::SAX::Parser
Instance Attribute Summary
Attributes inherited from XML::SAX::Parser
Instance Method Summary
- #parse_file(filename, encoding = "UTF-8") {|ctx| ... } ⇒ Object
  Parse a file with filename.
- #parse_io(io, encoding = "UTF-8") {|ctx| ... } ⇒ Object
  Parse given io.
- #parse_memory(data, encoding = "UTF-8") {|ctx| ... } ⇒ Object
  Parse HTML stored in data using encoding.
Methods inherited from XML::SAX::Parser
Constructor Details
This class inherits a constructor from Nokogiri::XML::SAX::Parser
Instance Method Details
#parse_file(filename, encoding = "UTF-8") {|ctx| ... } ⇒ Object
Parse a file with filename
    # File 'lib/nokogiri/html4/sax/parser.rb', line 51
    def parse_file(filename, encoding = "UTF-8")
      raise ArgumentError unless filename
      raise Errno::ENOENT unless File.exist?(filename)
      raise Errno::EISDIR if File.directory?(filename)

      ctx = ParserContext.file(filename, encoding)
      yield ctx if block_given?
      ctx.parse_with(self)
    end
#parse_io(io, encoding = "UTF-8") {|ctx| ... } ⇒ Object
Parse given io
    # File 'lib/nokogiri/html4/sax/parser.rb', line 41
    def parse_io(io, encoding = "UTF-8")
      check_encoding(encoding)
      @encoding = encoding
      ctx = ParserContext.io(io, ENCODINGS[encoding])
      yield ctx if block_given?
      ctx.parse_with(self)
    end
#parse_memory(data, encoding = "UTF-8") {|ctx| ... } ⇒ Object
Parse HTML stored in data using encoding
    # File 'lib/nokogiri/html4/sax/parser.rb', line 30
    def parse_memory(data, encoding = "UTF-8")
      raise TypeError unless String === data
      return if data.empty?

      ctx = ParserContext.memory(data, encoding)
      yield ctx if block_given?
      ctx.parse_with(self)
    end