Class: Peggy::Parser
- Inherits:
- Object (ancestry: Object > Peggy::Parser)
- Defined in:
- lib/parser.rb
Overview
Packrat parser class. Note that all methods have a trailing exclamation mark (!) or question mark (?), or have long names with underscores (_). This is because productions are methods, and we need to avoid name collisions. To use this class you must subclass Parser and provide your productions as methods. Your productions must call match? or one of the protected convenience routines to perform parsing. Productions must never call another production directly: the results will not be memoized, the parse will slow down considerably, and you risk infinite recursion (until the stack overflows). As a convenience when writing productions, you can call any match? function multiple times in a sequence, passing each returned index to the next call, without checking the result of each production.
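The sequence idiom described above can be sketched without the library itself. The `SketchParser` class below is a hypothetical stand-in, not Peggy::Parser: it omits memoization and ignore handling, and it assumes `NO_MATCH` is `false` (the value is not stated in this page). The `Assignment` subclass and its productions are likewise invented for illustration.

```ruby
# Hypothetical stand-in for Peggy::Parser, just enough to show how
# productions are written. NO_MATCH = false is an assumption.
class SketchParser
  NO_MATCH = false

  attr_accessor :source_text

  # Dispatch to a production by name (the real match? also memoizes).
  def match?(goal, index)
    return NO_MATCH if index == NO_MATCH
    send(goal, index)
  end

  # Match a regular expression anchored at the given index.
  def regexp?(re, index)
    return NO_MATCH if index == NO_MATCH
    found = /\A(?:#{re.source})/.match(source_text[index..-1])
    found ? index + found.end(0) : NO_MATCH
  end
end

class Assignment < SketchParser
  # Each step feeds the previous result forward, so a failure anywhere
  # propagates to the end without explicit checks.
  def assignment(index)
    index = match?(:name, index)
    index = regexp?(/\s*=\s*/, index)
    match?(:number, index)
  end

  def name(index)
    regexp?(/[a-z]+/, index)
  end

  def number(index)
    regexp?(/\d+/, index)
  end
end

parser = Assignment.new
parser.source_text = "x = 42"
ok = parser.match?(:assignment, 0)      # end index of the whole match

parser.source_text = "x = y"
failed = parser.match?(:assignment, 0)  # NO_MATCH: "y" is not a number
```

Because every routine returns `NO_MATCH` immediately when handed `NO_MATCH`, the production body stays a flat list of steps.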
Instance Attribute Summary
-
#debug_flag ⇒ Object
Tells parser to print intermediate results if set.
-
#ignore_productions ⇒ Object
The productions to ignore.
-
#parse_results ⇒ Object
readonly
The results of the parse.
-
#source_text ⇒ Object
The source to parse; can be set prior to calling parse?().
Instance Method Summary
-
#[](range) ⇒ Object
Return a range (or character) of the source_text.
-
#allow?(goal, index) ⇒ Boolean
Try to match a production from the given index.
-
#check?(goal, index) ⇒ Boolean
Try to match a production from the given index then backtrack.
-
#correct_regexp!(re) ⇒ Object
Make sure a regular expression is anchored to the beginning of the string, which in practice is the part of source_text starting at the given index.
-
#dissallow?(goal, index) ⇒ Boolean
Try not to match a production from the given index then backtrack.
-
#eof(index) ⇒ Object
Special production that only matches the end of source_text.
-
#ignore?(index) ⇒ Boolean
Match tokens that should be ignored.
-
#literal?(value, index) ⇒ Boolean
Match a literal string or regular expression from the given index.
-
#match?(goal, index) ⇒ Boolean
Match a production from the given index.
-
#parse?(goal, source = nil, index = 0) ⇒ Boolean
Invokes the parser from the beginning of the source on the given production goal.
-
#query?(*args) ⇒ Boolean
Queries the parse results for a hierarchy of production matches.
-
#regexp?(value, index) ⇒ Boolean
Match a regular expression from the given index.
-
#string?(value, index) ⇒ Boolean
Match a string from the given index.
Instance Attribute Details
#debug_flag ⇒ Object
Tells parser to print intermediate results if set.
    # File 'lib/parser.rb', line 37
    def debug_flag
      @debug_flag
    end
#ignore_productions ⇒ Object
The productions to ignore.
    # File 'lib/parser.rb', line 47
    def ignore_productions
      @ignore_productions
    end
#parse_results ⇒ Object (readonly)
The results of the parse. A hash (keyed by index) of hashes (keyed by production symbol, with end indexes as values).
    # File 'lib/parser.rb', line 44
    def parse_results
      @parse_results
    end
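The shape of this structure can be illustrated with a plain hash. This is a sketch that only mirrors the `Hash.new` block used in parse? and the `:found_order` bookkeeping seen in match?; the index and production name are made up.

```ruby
# Outer hash that creates an inner hash on first access to an index.
parse_results = Hash.new { |h, k| h[k] = {} }

# Record that a (hypothetical) production :number matched from index 4
# up to end index 6, and note the order in which matches were found.
parse_results[4][:number] = 6
(parse_results[4][:found_order] ||= []) << :number

end_index = parse_results[4][:number]
```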
#source_text ⇒ Object
The source to parse; can be set prior to calling parse?().
    # File 'lib/parser.rb', line 40
    def source_text
      @source_text
    end
Instance Method Details
#[](range) ⇒ Object
Return a range (or character) of the source_text.
    # File 'lib/parser.rb', line 50
    def [] range
      raise "source_text not set" if source_text.nil?
      source_text[range]
    end
#allow?(goal, index) ⇒ Boolean
Try to match a production from the given index. Returns the end index if found or start index if not found.
    # File 'lib/parser.rb', line 83
    def allow? goal, index
      return NO_MATCH if index == NO_MATCH # allow users to not check results of a sequence
      found = match? goal, index
      found == NO_MATCH ? index : found
    end
#check?(goal, index) ⇒ Boolean
Try to match a production from the given index then backtrack. Returns index if found or NO_MATCH if not.
    # File 'lib/parser.rb', line 91
    def check? goal, index
      return NO_MATCH if index == NO_MATCH # allow users to not check results of a sequence
      found = match? goal, index
      found == NO_MATCH ? NO_MATCH : index
    end
#correct_regexp!(re) ⇒ Object
Make sure a regular expression is anchored to the beginning of the string, which in practice is the part of source_text starting at the given index.
    # File 'lib/parser.rb', line 185
    def correct_regexp! re
      source = re.source
      # NOTE: the listing is truncated after `re.`; `options` is assumed here
      source[0..1] == '\\A' ? re : Regexp.new("\\A(#{source})", re.options)
    end
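A stand-alone sketch of the anchoring step follows. The helper name `anchor` is hypothetical, and passing `re.options` as the second argument to `Regexp.new` is an assumption, since the listing above does not show the full call.

```ruby
# Hypothetical stand-alone version of the anchoring logic.
def anchor(re)
  source = re.source
  # Already anchored with \A: return the same object unchanged.
  return re if source[0..1] == '\\A'
  # Otherwise wrap the pattern in a group and prefix \A, keeping the flags
  # (assumed to be what the original passes as the second argument).
  Regexp.new("\\A(#{source})", re.options)
end

anchored  = anchor(/\d+/i)
original  = /\Afoo/
unchanged = anchor(original)

hit  = anchored.match("12x")   # matches "12" at the start
miss = anchored.match("x12")   # nil: digits are not at the start
```

Wrapping the source in a group keeps the anchor applied to the whole pattern even when it contains a top-level alternation.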
#dissallow?(goal, index) ⇒ Boolean
Try not to match a production from the given index then backtrack. Returns index if not found or NO_MATCH if found.
    # File 'lib/parser.rb', line 99
    def dissallow? goal, index
      return NO_MATCH if index == NO_MATCH # allow users to not check results of a sequence
      found = match? goal, index
      found == NO_MATCH ? index : NO_MATCH
    end
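The return values of allow?, check?, and dissallow? correspond to an optional match, a positive lookahead, and a negative lookahead. The sketch below shows this with free functions; `match_ab` is a hypothetical stand-in for match? that recognizes the literal "ab", and `NO_MATCH = false` is an assumption.

```ruby
NO_MATCH = false # assumed value; not stated in this page

# Hypothetical stand-in for match?: matches the literal "ab" at index.
def match_ab(text, index)
  text[index, 2] == "ab" ? index + 2 : NO_MATCH
end

# Optional: advance if matched, otherwise stay put.
def allow(text, index)
  found = match_ab(text, index)
  found == NO_MATCH ? index : found
end

# Positive lookahead: succeed without consuming input.
def check(text, index)
  match_ab(text, index) == NO_MATCH ? NO_MATCH : index
end

# Negative lookahead: succeed only when the match fails.
def dissallow(text, index)
  match_ab(text, index) == NO_MATCH ? index : NO_MATCH
end

text = "abc"
at_match    = [allow(text, 0), check(text, 0), dissallow(text, 0)]
at_no_match = [allow(text, 1), check(text, 1), dissallow(text, 1)]
```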
#eof(index) ⇒ Object
Special production that only matches the end of source_text. Note that this function does not end in (?) or (!) because it is meant to be used as a normal production.
    # File 'lib/parser.rb', line 107
    def eof index
      return NO_MATCH if index == NO_MATCH # allow users to not check results of a sequence
      index >= source_text.length ? index : NO_MATCH
    end
#ignore?(index) ⇒ Boolean
Match tokens that should be ignored. Used by match?(). Returns end index if found or start index if not found. Subclasses should override this method if they wish to ignore other text, such as comments.
    # File 'lib/parser.rb', line 136
    def ignore? index
      return NO_MATCH if index == NO_MATCH # allow users to not check results of a sequence
      return index if @ignoring || ignore_productions.nil?
      @ignoring = true
      ignore_productions.each do |prod|
        index = allow? prod, index
      end
      @ignoring = nil
      index
    end
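As listed, the loop tries each ignore production once, in order, and advances past whatever matches. The sketch below imitates that behavior with lambdas in place of real productions; the lambdas, the `skip_ignored` helper, and `NO_MATCH = false` are all assumptions made for illustration.

```ruby
NO_MATCH = false # assumed value

text = "  # note\n  x"

# Hypothetical ignore productions: each returns an end index or NO_MATCH.
whitespace = ->(i) { (m = /\A\s+/.match(text[i..-1])) ? i + m.end(0) : NO_MATCH }
comment    = ->(i) { (m = /\A#[^\n]*/.match(text[i..-1])) ? i + m.end(0) : NO_MATCH }

# Apply each production once, in order, keeping the index when it fails
# (the same effect as calling allow? for each one).
def skip_ignored(productions, index)
  productions.each do |prod|
    found = prod.call(index)
    index = found unless found == NO_MATCH
  end
  index
end

index = skip_ignored([whitespace, comment, whitespace], 0)
next_char = text[index]
```

Since each entry is tried only once per call in this sketch, whitespace is listed a second time to skip the blank run that follows the comment.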
#literal?(value, index) ⇒ Boolean
Match a literal string or regular expression from the given index. Returns the end index if found or NO_MATCH if not found.
    # File 'lib/parser.rb', line 149
    def literal? value, index
      return NO_MATCH if index == NO_MATCH # allow users to not check results of a sequence
      case value
      when String
        string? value, index
      when Regexp
        regexp? value, index
      else
        raise "Unknown literal: #{value.inspect}"
      end
    end
#match?(goal, index) ⇒ Boolean
Match a production from the given index. Returns the end index if found or NO_MATCH if not found.
    # File 'lib/parser.rb', line 114
    def match? goal, index
      return NO_MATCH if index == NO_MATCH # allow users to not check results of a sequence
      index = ignore? index unless @ignoring
      goal = goal.to_sym
      position = parse_results[index]
      found = position.fetch(goal) do
        position[goal] = IN_USE # used to prevent infinite recursion in case user attempts
                                # a left recursion
        if (result = send goal, index)
          position[:found_order] = [] unless position.has_key?(:found_order)
          position[:found_order] << goal
        end
        position[goal] = result
      end
      puts "found #{goal} at #{index}...#{found} #{source_text[index...found].inspect}" if found && debug_flag
      raise "Parser cannot handle infinite (left) recursions. Please rewrite usage of '#{goal}'." if found == IN_USE
      found
    end
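The memoization and the left-recursion guard can be sketched in isolation. The `MemoSketch` class is hypothetical and drops ignore handling and `:found_order` tracking; `NO_MATCH = false` and `IN_USE = true` are assumed values chosen to be consistent with the truthiness tests in the listing above.

```ruby
NO_MATCH = false # assumed value
IN_USE   = true  # assumed value

class MemoSketch
  attr_reader :calls

  def initialize(text)
    @text  = text
    @memo  = Hash.new { |h, k| h[k] = {} }
    @calls = 0
  end

  def match?(goal, index)
    return NO_MATCH if index == NO_MATCH
    position = @memo[index]
    found = position.fetch(goal) do
      position[goal] = IN_USE            # mark as "in progress"
      position[goal] = send(goal, index) # then store the real result
    end
    # Seeing IN_USE here means the production re-entered itself at the
    # same index before producing a result.
    raise "left recursion in '#{goal}'" if found == IN_USE
    found
  end

  def digits(index)
    @calls += 1
    m = /\A\d+/.match(@text[index..-1])
    m ? index + m.end(0) : NO_MATCH
  end

  # Deliberately left-recursive production.
  def expr(index)
    match?(:expr, index)
  end
end

parser = MemoSketch.new("123")
first  = parser.match?(:digits, 0)
second = parser.match?(:digits, 0) # served from the memo table
calls  = parser.calls

left_recursion_detected =
  begin
    parser.match?(:expr, 0)
    false
  rescue RuntimeError
    true
  end
```

The second lookup never runs the production body, which is why productions should go through match? rather than call each other directly.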
#parse?(goal, source = nil, index = 0) ⇒ Boolean
Invokes the parser from the beginning of the source on the given production goal. You should provide the source here, or you can set source_text prior to calling. If index is provided, the parser will ignore the characters before it.
    # File 'lib/parser.rb', line 58
    def parse? goal, source = nil, index = 0
      # `self.` is required: a bare `source_text = source` would only set a local variable
      self.source_text = source unless source.nil?
      # Hash of automatic hashes
      @parse_results = Hash.new {|h1, k1| h1[k1] = {}}
      @keys = nil
      index = match? goal, index
      puts pp(parse_results) if debug_flag
      index
    end
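One Ruby detail is worth keeping in mind when a method like this stores its source argument: inside an instance method, an assignment without an explicit `self.` receiver creates a local variable and never reaches the attribute writer. The `SetterSketch` class below is a hypothetical illustration of that rule, not part of the library.

```ruby
class SetterSketch
  attr_accessor :source_text

  # Bare assignment: Ruby treats source_text as a new local variable,
  # so the attribute is left untouched.
  def with_local(source)
    source_text = source
    source_text
  end

  # Explicit receiver: calls the source_text= writer.
  def with_self(source)
    self.source_text = source
  end
end

sketch = SetterSketch.new
sketch.with_local("abc")
after_local = sketch.source_text # still nil

sketch.with_self("abc")
after_self = sketch.source_text
```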
#query?(*args) ⇒ Boolean
Queries the parse results for a hierarchy of production matches. An array of index ranges is returned, or an empty array if none are found. This can only be called after parse_results have been set by a parse.
    # File 'lib/parser.rb', line 71
    def query? *args
      raise "You must first call parse!" unless parse_results
      @keys = @parse_results.keys.sort unless @keys
      found_list = []
      index = 0
      args.each do |arg|
        index = find? arg, index
      end
    end
#regexp?(value, index) ⇒ Boolean
Match a regular expression from the given index. Returns the end index if found or NO_MATCH if not found.
    # File 'lib/parser.rb', line 174
    def regexp? value, index
      return NO_MATCH if index == NO_MATCH # allow users to not check results of a sequence
      value = correct_regexp! value
      index = ignore? index unless @ignoring
      found = value.match source_text[index..-1]
      # puts "#{value.inspect} ~= #{found[0].inspect}" if found
      found ? found.end(0) + index : NO_MATCH
    end
#string?(value, index) ⇒ Boolean
Match a string from the given index. Returns the end index if found or NO_MATCH if not found.
    # File 'lib/parser.rb', line 163
    def string? value, index
      return NO_MATCH if index == NO_MATCH # allow users to not check results of a sequence
      value = value.to_s
      index = ignore? index unless @ignoring
      i2 = index + value.length
      # puts source_text[index...i2].inspect + ' ' + value.inspect
      source_text[index...i2] == value ? i2 : NO_MATCH
    end
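The index arithmetic used by string? and regexp? can be checked with two free functions. These helpers (`string_at`, `regexp_at`) are hypothetical and skip the ignore step; `NO_MATCH = false` is an assumption.

```ruby
NO_MATCH = false # assumed value

# Compare the slice starting at index with the expected string.
def string_at(text, value, index)
  i2 = index + value.length
  text[index...i2] == value ? i2 : NO_MATCH
end

# Match an anchored regexp against the rest of the text and convert the
# relative end offset back into an absolute index.
def regexp_at(text, re, index)
  found = re.match(text[index..-1])
  found ? found.end(0) + index : NO_MATCH
end

text = "let x = 10"
after_let = string_at(text, "let", 0)
after_num = regexp_at(text, /\A\d+/, 8)
miss      = string_at(text, "var", 0)
```

Both helpers return an absolute end index, so their results can be passed directly to the next step of a sequence.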