Class: Mizuho::Parser
- Inherits:
- Object
  - Object
    - Mizuho::Parser
- Defined in:
- lib/mizuho/parser.rb
Overview
Parses the raw XHTML output produced by Asciidoc and extracts the title, the raw contents (without layout), the table of contents, and the individual chapters.
Instance Attribute Summary
- #contents ⇒ Object (readonly)
  The document’s raw contents, without any layout.
- #filename ⇒ Object (readonly)
  Returns the value of attribute filename.
- #table_of_contents ⇒ Object (readonly)
  The document’s table of contents, represented in a tree structure by Heading objects.
- #title ⇒ Object (readonly)
  The document’s title.
Instance Method Summary
- #chapters ⇒ Object
  Returns the individual chapters as an array of Chapter objects.
- #initialize(filename) ⇒ Parser (constructor)
  Parse the given file.
Constructor Details
#initialize(filename) ⇒ Parser
Parse the given file.
    # File 'lib/mizuho/parser.rb', line 21
    def initialize(filename)
      @filename = filename
      @contents = File.read(filename)

      # Extract the title.
      @contents =~ %r{<title>(.*?)</title>}
      @title = CGI::unescapeHTML($1)

      # Get rid of the Asciidoc layout and unwanted elements.
      if !@contents.sub!(/\A.*?(<div id="preamble">)/m, '\1')
        # There's no preamble, so strip everything until the
        # end of the TOC div.
        @contents.sub!(%r(\A.*?</noscript>[\r\n\s]*</div>[\r\n\s]*</div>)m, '')
      end
      @contents.sub!(/<div id="footer">.*/m, '')
      @contents.gsub!(%r{<div style="clear:left"></div>}, '')

      # Extract table of contents.
      parse_table_of_contents(@contents)
    end
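The constructor's title extraction and layout stripping can be tried in isolation. The following is a minimal sketch, not part of Mizuho: the `html` string is a hypothetical miniature of Asciidoc XHTML output, and only the three regular expressions for the title, preamble, footer, and spacer divs are taken from the listing above. It covers the case where a preamble div is present.

```ruby
require 'cgi'

# Hypothetical miniature of Asciidoc XHTML output; real output is much larger.
html = <<~HTML
  <html><head><title>Tips &amp; Tricks</title></head>
  <body>
  <div id="header"><h1>Tips &amp; Tricks</h1></div>
  <div id="preamble"><p>Intro text.</p></div>
  <h2 id="_first">First chapter</h2>
  <div style="clear:left"></div>
  <div id="footer">Last updated 2009</div>
  </body></html>
HTML

# Same title regex as the constructor; HTML entities are unescaped afterwards.
title = html =~ %r{<title>(.*?)</title>} ? CGI.unescapeHTML($1) : nil

contents = html.dup
# Drop everything before the preamble div, keeping the div itself.
contents.sub!(/\A.*?(<div id="preamble">)/m, '\1')
# Drop the footer and everything after it.
contents.sub!(/<div id="footer">.*/m, '')
# Remove the float-clearing spacer divs.
contents.gsub!(%r{<div style="clear:left"></div>}, '')

puts title      # prints "Tips & Tricks"
puts contents   # starts at <div id="preamble"> and has no footer
```

Unlike the constructor, this sketch guards against a missing `<title>` element; in the listing above, `$1` would be `nil` in that case.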
Instance Attribute Details
#contents ⇒ Object (readonly)
The document’s raw contents, without any layout.
    # File 'lib/mizuho/parser.rb', line 18
    def contents
      @contents
    end
#filename ⇒ Object (readonly)
Returns the value of attribute filename.
    # File 'lib/mizuho/parser.rb', line 10
    def filename
      @filename
    end
#table_of_contents ⇒ Object (readonly)
The document’s table of contents, represented in a tree structure by Heading objects.
    # File 'lib/mizuho/parser.rb', line 16
    def table_of_contents
      @table_of_contents
    end
#title ⇒ Object (readonly)
The document’s title.
    # File 'lib/mizuho/parser.rb', line 13
    def title
      @title
    end
Instance Method Details
#chapters ⇒ Object
Returns the individual chapters as an array of Chapter objects. The first Chapter object represents the preamble.
    # File 'lib/mizuho/parser.rb', line 45
    def chapters
      @chapters ||= parse_chapters(@contents)
    end
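The `||=` in `#chapters` means the chapters are parsed lazily on the first call and the cached array is returned afterwards. The sketch below illustrates that memoization pattern only. `LazyDocument`, `split_chapters`, and `parse_count` are hypothetical names, and splitting on `<h2` is a stand-in assumption, since the body of `parse_chapters` is not shown on this page.

```ruby
# Hypothetical class illustrating the memoization used by Parser#chapters.
class LazyDocument
  attr_reader :parse_count

  def initialize(contents)
    @contents = contents
    @parse_count = 0
  end

  # Parses on first call only; later calls return the cached array.
  def chapters
    @chapters ||= split_chapters(@contents)
  end

  private

  # Stand-in for parse_chapters: split before each <h2, so the first
  # element is whatever precedes the first heading (the preamble).
  def split_chapters(text)
    @parse_count += 1
    text.split(/(?=<h2)/)
  end
end

doc = LazyDocument.new('<p>intro</p><h2>A</h2><h2>B</h2>')
first = doc.chapters
second = doc.chapters
puts first.length          # prints 3
puts first.equal?(second)  # prints true: the same cached array
puts doc.parse_count       # prints 1: parsing ran once
```

One consequence of this pattern is that the cached array is shared between callers, so mutating the returned array affects later calls.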