Class: RDoc::Markup::InlineParser
- Inherits: Object
  - Object
    - RDoc::Markup::InlineParser
- Defined in: lib/rdoc/markup/inline_parser.rb
Overview
Parses inline markup in RDoc text. This parser handles em, bold, strike, tt, hard break, and tidylink. Block-level constructs are handled in RDoc::Markup::Parser.
Constant Summary
- WORD_PAIRS =
  TT, BOLD_WORD, EM_WORD: regexp handling (for example, crossref) is disabled.
    {
      '*' => :BOLD_WORD, '**' => :BOLD_WORD,
      '_' => :EM_WORD, '__' => :EM_WORD,
      '+' => :TT, '++' => :TT,
      '`' => :TT, '``' => :TT
    }
- TAGS =
  Other types: regexp handling (for example, crossref) is enabled.
    {
      'em' => :EM, 'i' => :EM,
      'b' => :BOLD,
      's' => :STRIKE, 'del' => :STRIKE,
    }
- STANDALONE_TAGS =
  :nodoc:
    { 'br' => :HARD_BREAK }
- CODEBLOCK_TAGS =
  :nodoc:
    %w[tt code]
- TOKENS =
  :nodoc:
    {
      **WORD_PAIRS.transform_values { [:word_pair, nil] },
      **TAGS.keys.to_h {|tag| ["<#{tag}>", [:open_tag, tag]] },
      **TAGS.keys.to_h {|tag| ["</#{tag}>", [:close_tag, tag]] },
      **CODEBLOCK_TAGS.to_h {|tag| ["<#{tag}>", [:code_start, tag]] },
      **STANDALONE_TAGS.keys.to_h {|tag| ["<#{tag}>", [:standalone_tag, tag]] },
      '{' => [:tidylink_start, nil],
      '}' => [:tidylink_mid, nil],
      '\\' => [:escape, nil],
      '[' => nil # To make `label[url]` scan as separate tokens
    }
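The way TOKENS is assembled from the other constants can be checked in isolation. The following is a minimal sketch that copies the constant definitions listed above into a standalone script (it does not load RDoc) and inspects a few entries. It assumes a recent Ruby (3.x), since the hash literal splats hashes with string keys.

```ruby
# Constants copied from the summary above so the script runs without RDoc.
WORD_PAIRS = {
  '*' => :BOLD_WORD, '**' => :BOLD_WORD,
  '_' => :EM_WORD, '__' => :EM_WORD,
  '+' => :TT, '++' => :TT,
  '`' => :TT, '``' => :TT
}
TAGS = { 'em' => :EM, 'i' => :EM, 'b' => :BOLD, 's' => :STRIKE, 'del' => :STRIKE }
STANDALONE_TAGS = { 'br' => :HARD_BREAK }
CODEBLOCK_TAGS = %w[tt code]

# Each key is a literal token; each value is a [kind, tag_name] pair.
TOKENS = {
  **WORD_PAIRS.transform_values { [:word_pair, nil] },
  **TAGS.keys.to_h { |tag| ["<#{tag}>", [:open_tag, tag]] },
  **TAGS.keys.to_h { |tag| ["</#{tag}>", [:close_tag, tag]] },
  **CODEBLOCK_TAGS.to_h { |tag| ["<#{tag}>", [:code_start, tag]] },
  **STANDALONE_TAGS.keys.to_h { |tag| ["<#{tag}>", [:standalone_tag, tag]] },
  '{' => [:tidylink_start, nil],
  '}' => [:tidylink_mid, nil],
  '\\' => [:escape, nil],
  '[' => nil
}

p TOKENS['<em>']    # open tag entry
p TOKENS['</del>']  # close tag entry
p TOKENS['<code>']  # code blocks only get a start token
p TOKENS.key?('[')  # present as a key, but mapped to nil
```

Note that `<tt>` and `<code>` have only a start entry here; judging by the constants on this page, their content and closing tag are matched separately by CODEBLOCK_REGEXPS.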
- SCANNER_REGEXP =
    /(?:
      #{multi_char_tokens_regexp}
      |[^#{token_starts_regexp}\sa-zA-Z0-9\.]+ # chunk of normal text
      |\s+|[a-zA-Z0-9\.]+|.
    )/x
- ESCAPING_CHARS =
  Characters that can be escaped with backslash.
    '\\*_+`{}[]<>'
- CODEBLOCK_REGEXPS =
  Pattern to match code block content until </tt> or </code>.
    CODEBLOCK_TAGS.to_h {|name| [name, /((?:\\.|[^\\])*?)<\/#{name}>/] }
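To see what the code block pattern captures, here is a small sketch that rebuilds CODEBLOCK_REGEXPS from the definition above and applies it with StringScanner. The helper `code_block_content` is hypothetical and exists only for this illustration; it assumes the pattern is applied to the text immediately following the opening `<tt>` or `<code>` tag.

```ruby
require 'strscan'

# Copied from the constant definitions above.
CODEBLOCK_TAGS = %w[tt code]
CODEBLOCK_REGEXPS = CODEBLOCK_TAGS.to_h { |name| [name, /((?:\\.|[^\\])*?)<\/#{name}>/] }

# Hypothetical helper: returns the captured content up to the closing tag,
# or nil when no closing tag is found.
def code_block_content(text, tag)
  scanner = StringScanner.new(text)
  scanner.scan(CODEBLOCK_REGEXPS.fetch(tag)) ? scanner[1] : nil
end

p code_block_content('a + b</tt> rest', 'tt')         # content before the first </tt>
p code_block_content('x \\</code> y</code>', 'code')  # a backslash-escaped </code> is skipped
p code_block_content('never closed', 'tt')            # no closing tag
```

Because a backslash is only ever consumed together with the character after it, an escaped closing tag cannot terminate the block.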
- WORD_REGEXPS =
  A word contains alphanumeric characters and the characters _./:[]-. A word may start with # and may end with any non-space character (e.g. #eql?). The underscore delimiters have special rules.
    {
      # Words including _, longest match.
      # Example: `_::A_` `_-42_` `_A::B::C.foo_bar[baz]_` `_kwarg:_`
      # Content must not include _ followed by non-alphanumeric character
      # Example: `_host_:_port_` will be `_host_` + `:` + `_port_`
      '_' => /#?([a-zA-Z0-9.\/:\[\]-]|_+[a-zA-Z0-9])+[^\s]?_(?=[^a-zA-Z0-9_]|\z)/,
      # Words allowing _ but not allowing __
      '__' => /#?[a-zA-Z0-9.\/:\[\]-]*(_[a-zA-Z0-9.\/:\[\]-]+)*[^\s]?__(?=[^a-zA-Z0-9]|\z)/,
      **%w[* ** + ++ ` ``].to_h do |s|
        # normal words that can be used within +word+ or *word*
        [s, /#?[a-zA-Z0-9_.\/:\[\]-]+[^\s]?#{Regexp.escape(s)}(?=[^a-zA-Z0-9]|\z)/]
      end
    }
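The word rules are easier to follow with a few inputs. The sketch below copies two of the patterns above (the `_` entry and the generic entry instantiated for `+`) and applies them with an anchored scan. It assumes each pattern is matched against the text right after the opening delimiter, which is how the examples in the comments read; `scan_word` is a hypothetical helper, not an RDoc method.

```ruby
require 'strscan'

# The '_' entry, copied from WORD_REGEXPS above.
UNDERSCORE_WORD = /#?([a-zA-Z0-9.\/:\[\]-]|_+[a-zA-Z0-9])+[^\s]?_(?=[^a-zA-Z0-9_]|\z)/
# The generic entry, instantiated for the '+' delimiter.
PLUS_WORD = /#?[a-zA-Z0-9_.\/:\[\]-]+[^\s]?#{Regexp.escape('+')}(?=[^a-zA-Z0-9]|\z)/

# Hypothetical helper: anchored match at the start of `rest`, where `rest`
# is the input that follows the opening delimiter. Returns the word plus
# its closing delimiter, or nil.
def scan_word(rest, regexp)
  StringScanner.new(rest).scan(regexp)
end

p scan_word('host_:_port_', UNDERSCORE_WORD)          # stops at the first closing _
p scan_word('A::B::C.foo_bar[baz]_', UNDERSCORE_WORD) # inner _ is kept, longest match
p scan_word('#eql?+ works', PLUS_WORD)                # leading # and trailing ? allowed
p scan_word('two words+', PLUS_WORD)                  # whitespace is not part of a word
```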
Instance Method Summary
- #current ⇒ Object
  Return the current parsing node on @stack.
- #initialize(string) ⇒ InlineParser (constructor)
  :nodoc:
- #parse ⇒ Object
  Parse and return an array of nodes.
Constructor Details
#initialize(string) ⇒ InlineParser
:nodoc:
    # File 'lib/rdoc/markup/inline_parser.rb', line 82

    def initialize(string)
      @scanner = StringScanner.new(string)
      @last_match = nil
      @scanner_negative_cache = Set.new
      @stack = []
      @delimiters = {}
    end
Instance Method Details
#current ⇒ Object
Return the current parsing node on @stack.
    # File 'lib/rdoc/markup/inline_parser.rb', line 92

    def current
      @stack.last
    end
#parse ⇒ Object
Parse and return an array of nodes. Node format:
{
type: :EM | :BOLD | :BOLD_WORD | :EM_WORD | :TT | :STRIKE | :HARD_BREAK | :TIDYLINK,
url: string, # only for :TIDYLINK
children: [string_or_node, ...]
}
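As an illustration of the node format, the following sketch walks a hand-written node array and renders it to an HTML-like string. Everything here is hypothetical: `render_nodes`, the `HTML_TAGS` mapping, and the sample data are not part of RDoc, and the sketch does no HTML escaping.

```ruby
# Hypothetical mapping from node type to an HTML tag name; not part of RDoc.
HTML_TAGS = {
  EM: 'em', EM_WORD: 'em',
  BOLD: 'b', BOLD_WORD: 'b',
  TT: 'code', STRIKE: 'del'
}.freeze

# Recursively render an array of strings and nodes in the format above.
def render_nodes(nodes)
  nodes.map do |node|
    next node if node.is_a?(String) # plain text leaf

    inner = render_nodes(node[:children] || [])
    case node[:type]
    when :HARD_BREAK then '<br>'
    when :TIDYLINK   then %(<a href="#{node[:url]}">#{inner}</a>)
    else
      tag = HTML_TAGS.fetch(node[:type])
      "<#{tag}>#{inner}</#{tag}>"
    end
  end.join
end

# Hand-written sample in the documented shape, not real parser output.
sample = [
  'See ',
  { type: :TIDYLINK, url: 'https://example.com',
    children: [{ type: :BOLD, children: ['the docs'] }] },
  ' for ',
  { type: :TT, children: ['parse'] },
  '.'
]
puts render_nodes(sample)
```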
    # File 'lib/rdoc/markup/inline_parser.rb', line 104

    def parse
      stack_push(:root, nil)
      while true
        type, token, value = scan_token
        close = nil
        tidylink_url = nil
        case type
        when :node
          current[:children] << value
          invalidate_open_tidylinks if value[:type] == :TIDYLINK
        when :eof
          close = :root
        when :tidylink_open
          stack_push(:tidylink, token)
        when :tidylink_close
          close = :tidylink
          if value
            tidylink_url = value
          else
            # Tidylink closing brace without URL part. Treat opening and closing braces as normal text
            # `{labelnodes}...` case.
            current[:children] << token
          end
        when :invalidated_tidylink_close
          # `{...{label}[url]...}` case. Nested tidylink invalidates outer one. The last `}` closes the invalidated tidylink.
          current[:children] << token
          close = :invalidated_tidylink
        when :text
          current[:children] << token
        when :open
          stack_push(value, token)
        when :close
          if @delimiters[value]
            close = value
          else
            # closing tag without matching opening tag. Treat as normal text.
            current[:children] << token
          end
        end

        next unless close

        while current[:delimiter] != close
          children = current[:children]
          open_token = current[:token]
          stack_pop
          current[:children] << open_token if open_token
          current[:children].concat(children)
        end

        token = current[:token]
        children = compact_string(current[:children])
        stack_pop

        return children if close == :root

        if close == :tidylink || close == :invalidated_tidylink
          if tidylink_url
            current[:children] << { type: :TIDYLINK, children: children, url: tidylink_url }
            invalidate_open_tidylinks
          else
            current[:children] << token
            current[:children].concat(children)
          end
        else
          current[:children] << { type: TAGS[close], children: children }
        end
      end
    end
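The inner `while current[:delimiter] != close` loop in #parse handles elements that were opened but never closed: each one is popped and folded back into its parent as plain text. The sketch below reproduces only that step on a plain array. It is a simplified stand-in under stated assumptions: `unwind_to` is a hypothetical name, the stack entries are assumed to be hashes with :delimiter, :token, and :children keys (as the method body suggests), and the real stack_pop may do additional bookkeeping that is not shown on this page.

```ruby
# Pop entries until the one opened by `close` is on top. Each popped entry
# was never closed, so its opening token is re-emitted as plain text ahead
# of its children. As in #parse, the caller must already know that `close`
# is open somewhere on the stack.
def unwind_to(stack, close)
  while stack.last[:delimiter] != close
    unclosed = stack.pop
    stack.last[:children] << unclosed[:token] if unclosed[:token]
    stack.last[:children].concat(unclosed[:children])
  end
  stack
end

# Assumed stack state after scanning `<b>bold <em>unclosed` and then
# reading the closing tag `</b>`.
stack = [
  { delimiter: :root, token: nil, children: [] },
  { delimiter: 'b', token: '<b>', children: ['bold '] },
  { delimiter: 'em', token: '<em>', children: ['unclosed'] }
]

unwind_to(stack, 'b')
p stack.size            # the <em> entry is gone
p stack.last[:children] # its token and children now belong to <b>
```

After this step, #parse compacts the children of the matching entry and wraps them in a node, so the stray `<em>` ends up as literal text inside the bold node.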