Class: ANTLR3::Recognizer
- Inherits: Object
  - Object
  - ANTLR3::Recognizer
- Extended by: ClassMacros
- Includes: Constants, Error, TokenFactory
- Defined in: lib/antlr3/recognizers.rb
Overview
Recognizer
As the base class of all ANTLR-generated recognizers, Recognizer provides much of the shared functionality and structure used in the recognition process. For all practical purposes, the class and its immediate subclasses Lexer, Parser, and TreeParser are abstract classes. They can be instantiated, but they are of little use on their own. Instead, to make useful code, you write an ANTLR grammar, and ANTLR generates classes which inherit from one of the recognizer base classes and provide the implementation of the grammar rules. The generated classes rely on this group of base classes to implement necessary tasks. Recognizer defines methods related to:
- token and character matching
- prediction and recognition strategy
- recovering from errors
- reporting errors
- memoization
- simple rule tracing and debugging
Direct Known Subclasses
Constant Summary
Constants included from Constants
Constants::BUILT_IN_TOKEN_NAMES, Constants::DEFAULT, Constants::DOWN, Constants::EOF, Constants::EOF_TOKEN, Constants::EOR_TOKEN_TYPE, Constants::HIDDEN, Constants::INVALID_TOKEN, Constants::INVALID_TOKEN_TYPE, Constants::MEMO_RULE_FAILED, Constants::MEMO_RULE_UNKNOWN, Constants::MIN_TOKEN_TYPE, Constants::SKIP_TOKEN, Constants::UP
Class Attribute Summary
-
+ (Object) antlr_version
readonly
Returns the value of attribute antlr_version.
-
+ (Object) antlr_version_string
readonly
Returns the value of attribute antlr_version_string.
-
+ (Object) default_rule
Returns the value of attribute default_rule.
-
+ (Object) grammar_file_name
readonly
Returns the value of attribute grammar_file_name.
-
+ (Object) grammar_home
readonly
Returns the value of attribute grammar_home.
-
+ (Object) library_version_string
readonly
Returns the value of attribute library_version_string.
-
+ (Object) token_scheme
Returns the value of attribute token_scheme.
Instance Attribute Summary
-
- (Object) input
Returns the value of attribute input.
-
- (Object) state
readonly
Returns the value of attribute state.
Attributes included from TokenFactory
Class Method Summary
- + (Boolean) debug?
-
+ (Object) define_return_scope(*members)
this method is used to generate return-value structures for rules with multiple return values.
-
+ (Object) generic_return_scope
sets up and returns the generic rule return scope for a recognizer.
- + (Object) imported_grammars
- + (Object) master
- + (Object) master_grammars
- + (Boolean) profile?
-
+ (Object) return_scope_members
used as a hook to add additional default members to default return value structures. For example, all AST-building parsers override this method to add an extra :tree field to all rule return structures.
- + (Object) rules
- + (Object) Scope(*declarations, &body)
- + (Object) token_class
Instance Method Summary
- - (Boolean) already_parsed_rule?(rule)
- - (Object) antlr_version
- - (Object) antlr_version_string
- - (Object) backtrack
-
- (Boolean) backtracking?
Returns true if the recognizer is currently in a decision for which backtracking has been enabled.
- - (Object) backtracking_level (also: #backtracking)
- - (Object) backtracking_level=(n) (also: #backtracking=)
-
- (Object) begin_resync
overridable hook method that is executed at the start of the resyncing procedure in recover.
- - (Object) combine_follows(exact)
-
- (Object) compute_context_sensitive_rule_follow
Compute the context-sensitive FOLLOW set for current rule.
-
- (Object) compute_error_recovery_set
(The following explanation has been lifted directly from the source code documentation of the ANTLR Java runtime library).
-
- (Object) consume_until(types)
Consume input symbols until one matches a type within types.
-
- (Object) current_symbol
Match needs to return the current input symbol, which gets put into the label for the associated token ref; e.g., x=ID.
-
- (Object) display_recognition_error(e = $!)
error reporting hook for presenting the information. The default implementation builds appropriate error message text using error_header and error_message, and calls emit_error_message to write the error message out to some source.
- - (Object) each_delegate
-
- (Object) emit_error_message(message)
Write the error report data out to some source.
-
- (Object) end_resync
overridable hook method that is executed after the resyncing procedure has completed.
-
- (Object) error_header(e = $!)
used to add a tag to the error message that indicates the location of the input stream when the error occurred.
-
- (Object) error_message(e = $!)
used to construct an appropriate error message based on the specific type of error and the error's attributes.
- - (Object) grammar_file_name
-
- (Recognizer) initialize(options = {})
constructor
Create a new recognizer.
-
- (Object) match(type, follow)
Attempt to match the current input symbol against the token type specified by type.
-
- (Object) match_any
match anything -- i.e. wildcard match.
- - (Object) memoize(rule, start_index, success)
- - (Boolean) mismatch_is_missing_token?(follow)
- - (Boolean) mismatch_is_unwanted_token?(type)
-
- (Object) missing_symbol(error, expected_token_type, follow)
Conjure up a missing token during error recovery.
-
- (Object) number_of_syntax_errors
factor out what to do upon token mismatch so tree parsers can behave differently.
-
- (Object) recover(error = $!)
Error recovery: resyncs the input by consuming symbols until one in the error recovery set is found.
- - (Object) recover_from_mismatched_element(e, follow)
- - (Object) recover_from_mismatched_set(e, follow)
- - (Object) recover_from_mismatched_token(type, follow)
-
- (Object) report_error(e = $!)
When a recognition error occurs, this method is the main hook for carrying out the error reporting process.
-
- (Object) reset
Resets the recognizer's state data to initial values.
- - (Object) resync
- - (Object) rule_memoization(rule, start_index)
- - (Boolean) syntactic_predicate?(name)
- - (Boolean) syntax_errors?
-
- (Object) token_error_display(token)
formats a token object appropriately for inspection within an error message.
- - (Object) trace_in(rule_name, rule_index, input_symbol)
- - (Object) trace_out(rule_name, rule_index, input_symbol)
Methods included from TokenFactory
Methods included from Error
#EarlyExit, #FailedPredicate, #MismatchedNotSet, #MismatchedRange, #MismatchedSet, #MismatchedToken, #MismatchedTreeNode, #MissingToken, #NoViableAlternative, #RewriteCardinalityError, #RewriteEarlyExit, #RewriteEmptyStream, #UnwantedToken
Constructor Details
- (Recognizer) initialize(options = {})
Create a new recognizer. The constructor simply ensures that all recognizers are initialized with a shared state object. See the main recognizer subclasses for more specific information about creating recognizer objects like lexers and parsers.
# File 'lib/antlr3/recognizers.rb', line 360
def initialize( options = {} )
  @state = options[ :state ] || RecognizerSharedState.new
  @error_output = options.fetch( :error_output, $stderr )
  defined?( @input ) or @input = nil
  initialize_dfas
end
Class Attribute Details
+ (Object) antlr_version (readonly)
Returns the value of attribute antlr_version
# File 'lib/antlr3/recognizers.rb', line 207
def antlr_version
  @antlr_version
end
+ (Object) antlr_version_string (readonly)
Returns the value of attribute antlr_version_string
# File 'lib/antlr3/recognizers.rb', line 207
def antlr_version_string
  @antlr_version_string
end
+ (Object) default_rule
Returns the value of attribute default_rule
# File 'lib/antlr3/recognizers.rb', line 213
def default_rule
  @default_rule ||= rules.first
end
+ (Object) grammar_file_name (readonly)
Returns the value of attribute grammar_file_name
# File 'lib/antlr3/recognizers.rb', line 207
def grammar_file_name
  @grammar_file_name
end
+ (Object) grammar_home (readonly)
Returns the value of attribute grammar_home
# File 'lib/antlr3/recognizers.rb', line 207
def grammar_home
  @grammar_home
end
+ (Object) library_version_string (readonly)
Returns the value of attribute library_version_string
# File 'lib/antlr3/recognizers.rb', line 207
def library_version_string
  @library_version_string
end
+ (Object) token_scheme
Returns the value of attribute token_scheme
# File 'lib/antlr3/recognizers.rb', line 213
def token_scheme
  @token_scheme
end
Instance Attribute Details
- (Object) input
Returns the value of attribute input
# File 'lib/antlr3/recognizers.rb', line 344
def input
  @input
end
- (Object) state (readonly)
Returns the value of attribute state
# File 'lib/antlr3/recognizers.rb', line 345
def state
  @state
end
Class Method Details
+ (Boolean) debug?
# File 'lib/antlr3/recognizers.rb', line 306
def debug?
  return false
end
+ (Object) define_return_scope(*members)
this method is used to generate return-value structures for rules with multiple return values. To avoid generating a special class for every rule in AST parsers and such (where most rules have the same default set of return values), each recognizer gets a default return value structure assigned to the constant Return. Rules which don't require additional custom members will have a rule-return name constant that just points to the generic return value.
# File 'lib/antlr3/recognizers.rb', line 241
def define_return_scope( *members )
  if members.empty? then generic_return_scope
  else
    members += return_scope_members
    Struct.new( *members )
  end
end
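The return-scope mechanism described above can be sketched in plain Ruby without loading the antlr3 library. The class names SketchRecognizer and SketchAstParser are hypothetical, and the sketch leaves out the step that assigns the generic structure to the constant Return; it only illustrates how custom members are combined with the defaults.

```ruby
# Hypothetical stand-in for a recognizer class; not the ANTLR3 runtime itself.
class SketchRecognizer
  # Default members of every rule return structure.
  def self.return_scope_members
    [ :start, :stop ]
  end

  # Lazily builds the structure shared by rules without custom members.
  def self.generic_return_scope
    @generic_return_scope ||= Struct.new( *return_scope_members )
  end

  # Custom members come first, followed by the default members.
  def self.define_return_scope( *members )
    return generic_return_scope if members.empty?
    Struct.new( *( members + return_scope_members ) )
  end
end

# Adds an extra :tree member, as the text says AST-building parsers do.
class SketchAstParser < SketchRecognizer
  def self.return_scope_members
    super + [ :tree ]
  end
end

generic = SketchRecognizer.define_return_scope
custom  = SketchAstParser.define_return_scope( :value )
```

Here generic is the shared structure with the members start and stop, while custom is a fresh Struct whose members are value, start, stop, and tree.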
+ (Object) generic_return_scope
sets up and returns the generic rule return scope for a recognizer
# File 'lib/antlr3/recognizers.rb', line 260
def generic_return_scope
  @generic_return_scope ||= begin
    struct = Struct.new( *return_scope_members )
    const_set( :Return, struct )
  end
end
+ (Object) imported_grammars
# File 'lib/antlr3/recognizers.rb', line 267
def imported_grammars
  @imported_grammars ||= Set.new
end
+ (Object) master
# File 'lib/antlr3/recognizers.rb', line 275
def master
  master_grammars.last
end
+ (Object) master_grammars
# File 'lib/antlr3/recognizers.rb', line 271
def master_grammars
  @master_grammars ||= []
end
+ (Boolean) profile?
# File 'lib/antlr3/recognizers.rb', line 310
def profile?
  return false
end
+ (Object) return_scope_members
used as a hook to add additional default members to default return value structures. For example, all AST-building parsers override this method to add an extra :tree field to all rule return structures.
# File 'lib/antlr3/recognizers.rb', line 254
def return_scope_members
  [ :start, :stop ]
end
+ (Object) rules
# File 'lib/antlr3/recognizers.rb', line 298
def rules
  self::RULE_METHODS.dup rescue []
end
+ (Object) Scope(*declarations, &body)
# File 'lib/antlr3/recognizers.rb', line 314
def Scope( *declarations, &body )
  Scope.new( *declarations, &body )
end
+ (Object) token_class
# File 'lib/antlr3/recognizers.rb', line 318
def token_class
  @token_class ||= begin
    self::Token rescue
    superclass.token_class rescue
    ANTLR3::CommonToken
  end
end
Instance Method Details
- (Boolean) already_parsed_rule?(rule)
# File 'lib/antlr3/recognizers.rb', line 866
def already_parsed_rule?( rule )
  stop_index = rule_memoization( rule, @input.index )
  case stop_index
  when MEMO_RULE_UNKNOWN then return false
  when MEMO_RULE_FAILED
    raise BacktrackingFailed
  else
    @input.seek( stop_index + 1 )
  end
  return true
end
- (Object) antlr_version
# File 'lib/antlr3/recognizers.rb', line 336
def antlr_version
  self.class.antlr_version
end
- (Object) antlr_version_string
# File 'lib/antlr3/recognizers.rb', line 340
def antlr_version_string
  self.class.antlr_version_string
end
- (Object) backtrack
# File 'lib/antlr3/recognizers.rb', line 839
def backtrack
  @state.backtracking += 1
  start = @input.mark
  success =
    begin yield
    rescue BacktrackingFailed then false
    else true
    end
  return success
ensure
  @input.rewind( start )
  @state.backtracking -= 1
end
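The mark-and-rewind pattern used by backtrack can be tried out with a small stand-alone sketch. SketchStream, SketchParser, and SketchBacktrackingFailed are hypothetical stand-ins for the runtime's input stream, recognizer, and BacktrackingFailed error. As a simplification, the sketch's match always raises on a mismatch, whereas the real match only does so while backtracking.

```ruby
# Hypothetical error class standing in for BacktrackingFailed.
class SketchBacktrackingFailed < StandardError; end

# Hypothetical input stream over an array of symbols.
class SketchStream
  attr_reader :index
  def initialize( symbols )
    @symbols = symbols
    @index = 0
  end
  def peek; @symbols[ @index ]; end
  def consume; @index += 1; end
  def mark; @index; end
  def rewind( marker ); @index = marker; end
end

class SketchParser
  attr_reader :input, :depth
  def initialize( input )
    @input = input
    @depth = 0
  end

  # Runs the block speculatively and reports whether it succeeded.
  # The input position and nesting depth are always restored afterwards.
  def backtrack
    @depth += 1
    start = @input.mark
    begin
      yield
      true
    rescue SketchBacktrackingFailed
      false
    end
  ensure
    @input.rewind( start )
    @depth -= 1
  end

  def match( symbol )
    raise SketchBacktrackingFailed unless @input.peek == symbol
    @input.consume
  end
end

parser = SketchParser.new( SketchStream.new( %w( a b c ) ) )
matched_ab = parser.backtrack { parser.match( 'a' ); parser.match( 'b' ) }
matched_ax = parser.backtrack { parser.match( 'a' ); parser.match( 'x' ) }
```

Whether the speculative block succeeds or fails, the stream ends up back at its starting position, which is what lets syntactic predicates test an alternative without committing to it.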
- (Boolean) backtracking?
Returns true if the recognizer is currently in a decision for which backtracking has been enabled
# File 'lib/antlr3/recognizers.rb', line 827
def backtracking?
  @state.backtracking > 0
end
- (Object) backtracking_level Also known as: backtracking
# File 'lib/antlr3/recognizers.rb', line 831
def backtracking_level
  @state.backtracking
end
- (Object) backtracking_level=(n) Also known as: backtracking=
# File 'lib/antlr3/recognizers.rb', line 835
def backtracking_level=( n )
  @state.backtracking = n
end
- (Object) begin_resync
overridable hook method that is executed at the start of the resyncing procedure in recover
by default, it does nothing
# File 'lib/antlr3/recognizers.rb', line 519
def begin_resync
  # do nothing
end
- (Object) combine_follows(exact)
# File 'lib/antlr3/recognizers.rb', line 779
def combine_follows( exact )
  follow_set = Set.new
  @state.following.each_with_index.reverse_each do |local_follow_set, index|
    follow_set |= local_follow_set
    if exact
      if local_follow_set.include?( EOR_TOKEN_TYPE )
        follow_set.delete( EOR_TOKEN_TYPE ) if index > 0
      else
        break
      end
    end
  end
  return follow_set
end
- (Object) compute_context_sensitive_rule_follow
Compute the context-sensitive FOLLOW set for current rule. This is set of token types that can follow a specific rule reference given a specific call chain. You get the set of viable tokens that can possibly come next (look depth 1) given the current call chain. Contrast this with the definition of plain FOLLOW for rule r:
FOLLOW(r)={x | S=>*alpha r beta in G and x in FIRST(beta)}
where x in T* and alpha, beta in V*; T is set of terminals and V is the set of terminals and nonterminals. In other words, FOLLOW(r) is the set of all tokens that can possibly follow references to r in any sentential form (context). At runtime, however, we know precisely which context applies as we have the call chain. We may compute the exact (rather than covering superset) set of following tokens.
For example, consider grammar:
stat : ID '=' expr ';' // FOLLOW(stat)=={EOF}
| "return" expr '.'
;
expr : atom ('+' atom)* ; // FOLLOW(expr)=={';','.',')'}
atom : INT // FOLLOW(atom)=={'+',')',';','.'}
| '(' expr ')'
;
The FOLLOW sets are all inclusive whereas context-sensitive FOLLOW sets are precisely what could follow a rule reference. For input "i=(3);", here is the derivation:
stat => ID '=' expr ';'
=> ID '=' atom ('+' atom)* ';'
=> ID '=' '(' expr ')' ('+' atom)* ';'
=> ID '=' '(' atom ')' ('+' atom)* ';'
=> ID '=' '(' INT ')' ('+' atom)* ';'
=> ID '=' '(' INT ')' ';'
At the "3" token, you'd have a call chain of
stat -> expr -> atom -> expr -> atom
What can follow that specific nested ref to atom? Exactly ')' as you can see by looking at the derivation of this specific input. Contrast this with FOLLOW(atom)={'+',')',';','.'}.
You want the exact viable token set when recovering from a token mismatch. Upon token mismatch, if LA(1) is member of the viable next token set, then you know there is most likely a missing token in the input stream. "Insert" one by just not throwing an exception.
# File 'lib/antlr3/recognizers.rb', line 775
def compute_context_sensitive_rule_follow
  combine_follows true
end
- (Object) compute_error_recovery_set
(The following explanation has been lifted directly from the
source code documentation of the ANTLR Java runtime library)
Compute the error recovery set for the current rule. During rule invocation, the parser pushes the set of tokens that can follow that rule reference on the stack; this amounts to computing FIRST of what follows the rule reference in the enclosing rule. This local follow set only includes tokens from within the rule; i.e., the FIRST computation done by ANTLR stops at the end of a rule.
EXAMPLE
When you find a "no viable alt exception", the input is not consistent with any of the alternatives for rule r. The best thing to do is to consume tokens until you see something that can legally follow a call to r or any rule that called r. You don't want the exact set of viable next tokens because the input might just be missing a token--you might consume the rest of the input looking for one of the missing tokens.
Consider grammar:
a : '[' b ']'
| '(' b ')'
;
b : c '^' INT ;
c : ID
| INT
;
At each rule invocation, the set of tokens that could follow that rule is pushed on a stack. Here are the various "local" follow sets:
FOLLOW( b1_in_a ) = FIRST( ']' ) = ']'
FOLLOW( b2_in_a ) = FIRST( ')' ) = ')'
FOLLOW( c_in_b ) = FIRST( '^' ) = '^'
Upon erroneous input "[]", the call chain is
a -> b -> c
and, hence, the follow context stack is:
depth  local follow set  after call to rule
0      <EOF>             a (from main( ))
1      ']'               b
2      '^'               c
Notice that ')' is not included, because b would have to have been called from a different context in rule a for ')' to be included.
For error recovery, we cannot consider FOLLOW(c) (context-sensitive or otherwise). We need the combined set of all context-sensitive FOLLOW sets--the set of all tokens that could follow any reference in the call chain. We need to resync to one of those tokens. Note that FOLLOW(c)='^' and if we resync'd to that token, we'd consume until EOF. We need to sync to context-sensitive FOLLOWs for a, b, and c: ']','^'. In this case, for input "[]", LA(1) is in this set so we would not consume anything and after printing an error rule c would return normally. It would not find the required '^' though. At this point, it gets a mismatched token error and throws an exception (since LA(1) is not in the viable following token set). The rule exception handler tries to recover, but finds the same recovery set and doesn't consume anything. Rule b exits normally returning to rule a. Now it finds the ']' (and with the successful match exits errorRecovery mode).
So, you can see that the parser walks up the call chain looking for the token that was a member of the recovery set.
Errors are not generated in errorRecovery mode.
ANTLR's error recovery mechanism is based upon original ideas:
"Algorithms + Data Structures = Programs" by Niklaus Wirth
and
"A note on error recovery in recursive descent parsers": portal.acm.org/citation.cfm?id=947902.947905
Later, Josef Grosch had some good ideas:
"Efficient and Comfortable Error Recovery in Recursive Descent Parsers": www.cocolab.com/products/cocktail/doca4.ps/ell.ps.zip
Like Grosch I implemented local FOLLOW sets that are combined at run-time upon error to avoid overhead during parsing.
# File 'lib/antlr3/recognizers.rb', line 623
def compute_error_recovery_set
  combine_follows( false )
end
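The follow-stack example for input "[]" can be replayed with a stand-alone sketch of the combining step. This version takes the stack as an argument instead of reading it from the recognizer state, and EOR is a hypothetical marker standing in for EOR_TOKEN_TYPE; tokens are represented by plain strings and symbols purely for illustration.

```ruby
require 'set'

EOR = :eor  # hypothetical end-of-rule marker, standing in for EOR_TOKEN_TYPE

# following: array of Sets of token types, outermost rule invocation first.
def combine_follows( following, exact )
  follow_set = Set.new
  following.each_with_index.reverse_each do |local_follow_set, index|
    follow_set |= local_follow_set
    next unless exact
    # In exact mode, keep walking outward only while the end of the
    # rule is reachable from the local follow set.
    break unless local_follow_set.include?( EOR )
    follow_set.delete( EOR ) if index > 0
  end
  follow_set
end

# Follow context stack for the call chain a -> b -> c.
stack = [ Set[ :EOF ], Set[ ']' ], Set[ '^' ] ]

recovery_set = combine_follows( stack, false )  # union of every level
exact_follow = combine_follows( stack, true )   # stops at the innermost level
```

With exact set to false the result is the union of all levels, which contains ']' and '^' along with the outermost EOF entry. With exact set to true only '^' is returned here, because the innermost set does not contain the end-of-rule marker.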
- (Object) consume_until(types)
Consume input symbols until one matches a type within types
types can be a single symbol type or a set of symbol types
# File 'lib/antlr3/recognizers.rb', line 813
def consume_until( types )
  types.is_a?( Set ) or types = Set[ *types ]
  type = @input.peek
  until type == EOF or types.include?( type )
    @input.consume
    type = @input.peek
  end
  return( type )
end
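A minimal sketch of the same loop over a toy stream, assuming a hypothetical TypeStream class and an EOF_TYPE marker in place of the runtime's input stream and EOF constant. It shows that a single symbol type and a Set of types are both accepted, and that consumption stops at end of input if no listed type is found.

```ruby
require 'set'

EOF_TYPE = -1  # hypothetical end-of-input marker

# Hypothetical stream over an array of token types.
class TypeStream
  attr_reader :index
  def initialize( types )
    @types = types
    @index = 0
  end
  def peek; @types.fetch( @index, EOF_TYPE ); end
  def consume; @index += 1; end
end

# Skips symbols until the current one is in types or the input ends,
# then returns the type the stream stopped on.
def consume_until( input, types )
  types = Set[ *types ] unless types.is_a?( Set )
  type = input.peek
  until type == EOF_TYPE || types.include?( type )
    input.consume
    type = input.peek
  end
  type
end

stream = TypeStream.new( [ :id, :plus, :semi, :id ] )
stopped_at = consume_until( stream, :semi )
```

The matching symbol itself is not consumed: after the call, the stream is positioned on the :semi entry.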
- (Object) current_symbol
Match needs to return the current input symbol, which gets put into the label for the associated token ref; e.g., x=ID. Token and tree parsers need to return different objects. Rather than test for input stream type or change the IntStream interface, I use a simple method to ask the recognizer to tell me what the current input symbol is.
This is ignored for lexers.
# File 'lib/antlr3/recognizers.rb', line 804
def current_symbol
  @input.look
end
- (Object) display_recognition_error(e = $!)
error reporting hook for presenting the information. The default implementation builds appropriate error message text using error_header and error_message, and calls emit_error_message to write the error message out to some source.
# File 'lib/antlr3/recognizers.rb', line 424
def display_recognition_error( e = $! )
  header = error_header( e )
  message = error_message( e )
  emit_error_message( "#{ header } #{ message }" )
end
- (Object) each_delegate
# File 'lib/antlr3/recognizers.rb', line 347
def each_delegate
  block_given? or return enum_for( __method__ )
  for grammar in self.class.imported_grammars
    del = __send__( Util.snake_case( grammar ) ) and
      yield( del )
  end
end
- (Object) emit_error_message(message)
Write the error report data out to some source. By default, the error message is written to $stderr.
# File 'lib/antlr3/recognizers.rb', line 491
def emit_error_message( message )
  @error_output.puts( message ) if @error_output
end
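Because the destination is taken from the :error_output option of the constructor, error text can be captured instead of printed. The following is a sketch only: SketchReporter is a hypothetical class that copies just the two relevant lines of behavior, and the message text is made up for illustration.

```ruby
require 'stringio'

# Hypothetical stand-in that mirrors how the recognizer stores and uses
# its error output destination.
class SketchReporter
  def initialize( options = {} )
    @error_output = options.fetch( :error_output, $stderr )
  end

  # Writes the message unless the destination was explicitly set to nil.
  def emit_error_message( message )
    @error_output.puts( message ) if @error_output
  end
end

buffer = StringIO.new
SketchReporter.new( :error_output => buffer ).emit_error_message( 'line 1:4 missing SEMI' )

# Passing nil silences the output entirely.
SketchReporter.new( :error_output => nil ).emit_error_message( 'discarded' )
```

Any object that responds to puts can serve as the destination, which makes it straightforward to collect messages in tests.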
- (Object) end_resync
overridable hook method that is executed after the resyncing procedure has completed
by default, it does nothing
# File 'lib/antlr3/recognizers.rb', line 526
def end_resync
  # do nothing
end
- (Object) error_header(e = $!)
used to add a tag to the error message that indicates the location of the input stream when the error occurred
# File 'lib/antlr3/recognizers.rb', line 466
def error_header( e = $! )
  e.location
end
- (Object) error_message(e = $!)
used to construct an appropriate error message based on the specific type of error and the error's attributes
# File 'lib/antlr3/recognizers.rb', line 433
def error_message( e = $! )
  case e
  when UnwantedToken
    token_name = token_name( e.expecting )
    "extraneous input #{ token_error_display( e.unexpected_token ) } expecting #{ token_name }"
  when MissingToken
    token_name = token_name( e.expecting )
    "missing #{ token_name } at #{ token_error_display( e.symbol ) }"
  when MismatchedToken
    token_name = token_name( e.expecting )
    "mismatched input #{ token_error_display( e.symbol ) } expecting #{ token_name }"
  when MismatchedTreeNode
    token_name = token_name( e.expecting )
    "mismatched tree node: #{ e.symbol } expecting #{ token_name }"
  when NoViableAlternative
    "no viable alternative at input " << token_error_display( e.symbol )
  when MismatchedSet
    "mismatched input %s expecting set %s" %
      [ token_error_display( e.symbol ), e.expecting.inspect ]
  when MismatchedNotSet
    "mismatched input %s expecting set %s" %
      [ token_error_display( e.symbol ), e.expecting.inspect ]
  when FailedPredicate
    "rule %s failed predicate: { %s }?" % [ e.rule_name, e.predicate_text ]
  else e.message
  end
end
- (Object) grammar_file_name
# File 'lib/antlr3/recognizers.rb', line 332
def grammar_file_name
  self.class.grammar_file_name
end
- (Object) match(type, follow)
Attempt to match the current input symbol against the token type specified by type. If the symbol matches the type, consume the current symbol and return its value. If the symbol doesn't match, attempt to use the follow-set data provided by follow to recover from the mismatched token.
# File 'lib/antlr3/recognizers.rb', line 385
def match( type, follow )
  matched_symbol = current_symbol
  if @input.peek == type
    @input.consume
    @state.error_recovery = false
    return matched_symbol
  end
  raise( BacktrackingFailed ) if @state.backtracking > 0
  return recover_from_mismatched_token( type, follow )
end
- (Object) match_any
match anything -- i.e. wildcard match. Simply consume the current symbol from the input stream.
# File 'lib/antlr3/recognizers.rb', line 398
def match_any
  @state.error_recovery = false
  @input.consume
end
- (Object) memoize(rule, start_index, success)
# File 'lib/antlr3/recognizers.rb', line 878
def memoize( rule, start_index, success )
  stop_index = success ? @input.index - 1 : MEMO_RULE_FAILED
  memo = @state.rule_memory[ rule ] and memo[ start_index ] = stop_index
end
- (Boolean) mismatch_is_missing_token?(follow)
# File 'lib/antlr3/recognizers.rb', line 692
def mismatch_is_missing_token?( follow )
  follow.nil? and return false
  if follow.include?( EOR_TOKEN_TYPE )
    viable_tokens = compute_context_sensitive_rule_follow
    follow = follow | viable_tokens

    follow.delete( EOR_TOKEN_TYPE ) unless @state.following.empty?
  end
  if follow.include?( @input.peek ) or follow.include?( EOR_TOKEN_TYPE )
    return true
  end
  return false
end
- (Boolean) mismatch_is_unwanted_token?(type)
# File 'lib/antlr3/recognizers.rb', line 688
def mismatch_is_unwanted_token?( type )
  @input.peek( 2 ) == type
end
- (Object) missing_symbol(error, expected_token_type, follow)
Conjure up a missing token during error recovery.
The recognizer attempts to recover from single missing symbols. But, actions might refer to that missing symbol. For example, x=ID f($x);. The action clearly assumes that there has been an identifier matched previously and that $x points at that token. If that token is missing, but the next token in the stream is what we want we assume that this token is missing and we keep going. Because we have to return some token to replace the missing token, we have to conjure one up. This method gives the user control over the tokens returned for missing tokens. Mostly, you will want to create something special for identifier tokens. For literals such as '{' and ',', the default action in the parser or tree parser works. It simply creates a CommonToken of the appropriate type. The text will be the token. If you change what tokens must be created by the lexer, override this method to create the appropriate tokens.
# File 'lib/antlr3/recognizers.rb', line 684
def missing_symbol( error, expected_token_type, follow )
  return nil
end
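Overriding this hook might look like the following sketch. SketchToken, SketchBaseParser, SketchIdParser, the ID token type value, and the placeholder text format are all hypothetical; the sketch only shows the shape of an override that conjures a replacement token, not how the runtime's own parsers build theirs.

```ruby
# Hypothetical minimal token with a type and text.
SketchToken = Struct.new( :type, :text )

# Mirrors the base behavior shown above: no token is conjured.
class SketchBaseParser
  def missing_symbol( error, expected_token_type, follow )
    nil
  end
end

# Conjures a placeholder token so that actions referring to the
# missing symbol still receive an object with a type and text.
class SketchIdParser < SketchBaseParser
  ID = 4  # hypothetical token type for identifiers

  def missing_symbol( error, expected_token_type, follow )
    text = expected_token_type == ID ? '<missing ID>' : "<missing #{ expected_token_type }>"
    SketchToken.new( expected_token_type, text )
  end
end

placeholder = SketchIdParser.new.missing_symbol( nil, SketchIdParser::ID, nil )
```

An action such as f($x) then sees a token whose text marks it as inserted, rather than nil.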
- (Object) number_of_syntax_errors
factor out what to do upon token mismatch so tree parsers can behave differently.
- override this method in your parser to do things like bailing out after the first error
- just raise the exception instead of calling the recovery method.
# File 'lib/antlr3/recognizers.rb', line 718
def number_of_syntax_errors
  @state.syntax_errors
end
- (Object) recover(error = $!)
Error recovery. If the previous error was recorded at the current input position, one symbol is consumed first. The recognizer then computes the error recovery set and resyncs by consuming input until a symbol in that set is found.
# File 'lib/antlr3/recognizers.rb', line 499
def recover( error = $! )
  @state.last_error_index == @input.index and @input.consume
  @state.last_error_index = @input.index

  follow_set = compute_error_recovery_set

  resync { consume_until( follow_set ) }
end
- (Object) recover_from_mismatched_element(e, follow)
# File 'lib/antlr3/recognizers.rb', line 653
def recover_from_mismatched_element( e, follow )
  follow.nil? and return false
  if follow.include?( EOR_TOKEN_TYPE )
    viable_tokens = compute_context_sensitive_rule_follow
    follow = ( follow | viable_tokens ) - Set[ EOR_TOKEN_TYPE ]
  end
  if follow.include?( @input.peek )
    report_error( e )
    return true
  end
  return false
end
- (Object) recover_from_mismatched_set(e, follow)
# File 'lib/antlr3/recognizers.rb', line 645
def recover_from_mismatched_set( e, follow )
  if mismatch_is_missing_token?( follow )
    report_error( e )
    return missing_symbol( e, INVALID_TOKEN_TYPE, follow )
  end
  raise e
end
- (Object) recover_from_mismatched_token(type, follow)
# File 'lib/antlr3/recognizers.rb', line 627
def recover_from_mismatched_token( type, follow )
  if mismatch_is_unwanted_token?( type )
    err = UnwantedToken( type )
    resync { @input.consume }
    report_error( err )

    return @input.consume
  end

  if mismatch_is_missing_token?( follow )
    inserted = missing_symbol( nil, type, follow )
    report_error( MissingToken( type, inserted ) )
    return inserted
  end

  raise MismatchedToken( type )
end
- (Object) report_error(e = $!)
When a recognition error occurs, this method is the main hook for carrying out the error reporting process. The default implementation calls display_recognition_error to display the error info on $stderr.
# File 'lib/antlr3/recognizers.rb', line 412
def report_error( e = $! )
  @state.error_recovery and return
  @state.syntax_errors += 1
  @state.error_recovery = true
  display_recognition_error( e )
end
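The error_recovery flag keeps a single mistake from producing a cascade of reports: further errors are ignored until a symbol is matched successfully again. A small sketch of that interplay, using a hypothetical SketchErrorTracker class that collects messages in an array instead of displaying them:

```ruby
# Hypothetical stand-in for the parts of the shared state involved in
# error reporting.
class SketchErrorTracker
  attr_reader :syntax_errors, :messages

  def initialize
    @syntax_errors = 0
    @error_recovery = false
    @messages = []
  end

  # Counts and records an error unless recovery is already in progress.
  def report_error( message )
    return if @error_recovery
    @syntax_errors += 1
    @error_recovery = true
    @messages << message
  end

  # Stands in for a successful match, which leaves recovery mode.
  def matched!
    @error_recovery = false
  end
end

tracker = SketchErrorTracker.new
tracker.report_error( 'first' )
tracker.report_error( 'suppressed' )  # ignored: still recovering
tracker.matched!
tracker.report_error( 'second' )
```

Only the first and third reports are counted, so syntax_errors reflects distinct problems rather than every failed attempt during recovery.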
- (Object) reset
Resets the recognizer's state data to initial values. As a result, all error tracking and error recovery data accumulated in the current state will be cleared. It will also attempt to reset the input stream via input.reset, but it ignores any errors received from doing so. Thus the input stream is not guaranteed to be rewound to its initial position.
# File 'lib/antlr3/recognizers.rb', line 374
def reset
  @state and @state.reset!
  @input and @input.reset rescue nil
end
- (Object) resync
# File 'lib/antlr3/recognizers.rb', line 508
def resync
  begin_resync
  return( yield )
ensure
  end_resync
end
- (Object) rule_memoization(rule, start_index)
# File 'lib/antlr3/recognizers.rb', line 860
def rule_memoization( rule, start_index )
  @state.rule_memory.fetch( rule ) do
    @state.rule_memory[ rule ] = Hash.new( MEMO_RULE_UNKNOWN )
  end[ start_index ]
end
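How rule_memoization and memoize cooperate can be seen in a reduced sketch. SketchMemo is hypothetical, the symbols :unknown and :failed stand in for MEMO_RULE_UNKNOWN and MEMO_RULE_FAILED (whose actual values are not shown here), and the sketch's memoize takes the stop index directly rather than deriving it from the input position.

```ruby
# Hypothetical sentinels standing in for the runtime's memo constants.
UNKNOWN = :unknown
FAILED  = :failed

class SketchMemo
  def initialize
    @rule_memory = {}
  end

  # Returns the recorded stop index, FAILED, or UNKNOWN. The per-rule
  # table is created on the first lookup for that rule.
  def rule_memoization( rule, start_index )
    @rule_memory.fetch( rule ) do
      @rule_memory[ rule ] = Hash.new( UNKNOWN )
    end[ start_index ]
  end

  # Records an outcome only if a table for the rule already exists.
  # A nil stop index marks a failed attempt.
  def memoize( rule, start_index, stop_index )
    memo = @rule_memory[ rule ] or return
    memo[ start_index ] = stop_index || FAILED
  end
end

memo = SketchMemo.new
first_lookup = memo.rule_memoization( :expr, 0 )
memo.memoize( :expr, 0, 5 )
memo.memoize( :expr, 7, nil )
```

After these calls, looking up :expr at index 0 yields the stop index 5, index 7 yields the failure marker, and any other start index still reports unknown.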
- (Boolean) syntactic_predicate?(name)
# File 'lib/antlr3/recognizers.rb', line 853
def syntactic_predicate?( name )
  backtrack { send name }
end
- (Boolean) syntax_errors?
# File 'lib/antlr3/recognizers.rb', line 706
def syntax_errors?
  ( error_count = @state.syntax_errors ) > 0 and return( error_count )
end
- (Object) token_error_display(token)
formats a token object appropriately for inspection within an error message
# File 'lib/antlr3/recognizers.rb', line 474
def token_error_display( token )
  unless text = token.text || ( token.source_text rescue nil )
    text =
      case
      when token.type == EOF then '<EOF>'
      when name = token_name( token.type ) rescue nil then "<#{ name }>"
      when token.respond_to?( :name ) then "<#{ token.name }>"
      else "<#{ token.type }>"
      end
  end
  return text.inspect
end
- (Object) trace_in(rule_name, rule_index, input_symbol)
# File 'lib/antlr3/recognizers.rb', line 883
def trace_in( rule_name, rule_index, input_symbol )
  @error_output.printf( "--> enter %s on %s", rule_name, input_symbol )
  @state.backtracking > 0 and @error_output.printf(
    " (in backtracking mode: depth = %s)", @state.backtracking
  )
  @error_output.print( "\n" )
end
- (Object) trace_out(rule_name, rule_index, input_symbol)
# File 'lib/antlr3/recognizers.rb', line 891
def trace_out( rule_name, rule_index, input_symbol )
  @error_output.printf( "<-- exit %s on %s", rule_name, input_symbol )
  @state.backtracking > 0 and @error_output.printf(
    " (in backtracking mode: depth = %s)", @state.backtracking
  )
  @error_output.print( "\n" )
end