1 What kinds of errors are there?
2 Improving error messages
3 Error recovery strategy
In the DSL development cycle, you first design the grammar, generate a translator from it, feed in language samples to test the translator, and then adjust the translation strategy according to the problems you find. Two kinds of errors can occur: errors in the grammar and errors in the language samples.
From the sample errors Parr walks through, and from his description of how well ANTLR handles error reporting and recovery, a few strategies and conventions can be observed:
(1) Error recovery is the process of recovering from syntax errors in a language sample, usually by modifying the input symbols or by consuming symbols until the parser reaches a state in which it can resume recognizing a rule;
(2) Cascading error messages should be avoided as far as possible; that is, each syntax error should produce only one error message.
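Convention (2) can be sketched with a single flag: report the first error, then stay quiet until a symbol is matched successfully again. As far as I can tell, ANTLR 3's BaseRecognizer keeps a comparable errorRecovery flag, but the ErrorReporter class and its method names below are a hypothetical toy for illustration, not ANTLR's API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy reporter (not ANTLR's class): it records the first error,
// then stays silent until a symbol has been matched successfully again.
class ErrorReporter {
    private boolean errorRecovery = false;   // true while we are still recovering
    final List<String> messages = new ArrayList<>();

    void reportError(String message) {
        if (errorRecovery) {
            return;                          // suppress the cascading message
        }
        errorRecovery = true;
        messages.add(message);
    }

    void matchedSymbol() {
        errorRecovery = false;               // back in sync with the input
    }
}

class CascadeDemo {
    public static void main(String[] args) {
        ErrorReporter r = new ErrorReporter();
        r.reportError("line 1:4 missing ';'");
        r.reportError("line 1:5 unexpected 'x'");  // suppressed: still recovering
        r.matchedSymbol();
        r.reportError("line 3:0 missing ')'");     // recorded: a new, separate error
        System.out.println(r.messages);
    }
}
```

Only the first and third messages are recorded; the second one is treated as a side effect of the first error.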
Turning on ANTLR's -dfa option lets you view the generated DFAs and understand the recognizer's decision and state information.
To improve the error messages, override BaseRecognizer's getErrorMessage() and getTokenErrorDisplay() methods in the grammar. (Some sources say to override displayRecognitionError() rather than getErrorMessage(); in fact, if you look at the generated code, the former calls the latter.)
Some built-in exception classes:
Exception: Description
RecognitionException: Recognition exception. The base class of the exceptions thrown by ANTLR-generated recognizers. Records input stream information at the time of the error: the index of the symbol the recognizer is currently looking at (character, token, or tree node), a pointer to the offending symbol, the current line, and the character position within the line.
MismatchedTokenException: Token mismatch exception. Indicates that the parser did not find the specific symbol it expected. Additionally records the expected token type.
MismatchedTreeNodeException: Tree node mismatch exception. Similar to the previous exception; indicates that a node of the expected token type was not found.
NoViableAltException: No viable alternative exception. The recognizer reached a decision point, but the lookahead symbols match none of the alternatives. Records the decision number and state number in the lookahead DFA, and also the grammar fragment for that decision.
EarlyExitException: Early exit exception. While recognizing a (...)+ EBNF subrule, which must match at least once, the subrule matched no symbols. Records the decision number in the DFA.
FailedPredicateException: Failed predicate exception. A semantic predicate evaluated to false. Records the rule name and the predicate text.
MismatchedRangeException: Range mismatch exception. For a range such as [a..z], no symbol in the range was matched. Records the smallest and largest elements of the range.
MismatchedSetException: Set mismatch exception. For a set such as {'a','b'}, no symbol in the set was matched. Records all the symbols in the set.
MismatchedNotSetException: Not-set mismatch exception. A not-set is the complement of a set, written with the ~ operator. Similar to the previous exception.
ANTLR recognizers do not build exception objects from message strings; the exceptions only track the fields needed to generate an error message.
Several error-reporting methods in the base recognizer class BaseRecognizer generate localized error messages, so don't expect the getMessage() method of the exception classes above to return any useful information.
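To make the split between "exception carries facts" and "recognizer builds the text" concrete, here is a minimal sketch. The classes ToyMismatchedTokenException and ToyMessages are hypothetical stand-ins that I made up; only the method names getErrorHeader() and getErrorMessage() are borrowed from BaseRecognizer, and their signatures here are simplified assumptions.

```java
// Hypothetical toy exception: it stores raw facts about the error, no text.
class ToyMismatchedTokenException extends Exception {
    final String expecting;          // token type the parser wanted
    final String found;              // text of the offending token
    final int line;
    final int charPositionInLine;

    ToyMismatchedTokenException(String expecting, String found,
                                int line, int charPositionInLine) {
        // Deliberately no message string is passed to the superclass.
        this.expecting = expecting;
        this.found = found;
        this.line = line;
        this.charPositionInLine = charPositionInLine;
    }
}

// Hypothetical stand-in for the recognizer side, which turns fields into text.
class ToyMessages {
    static String getErrorHeader(ToyMismatchedTokenException e) {
        return "line " + e.line + ":" + e.charPositionInLine;
    }

    static String getErrorMessage(ToyMismatchedTokenException e) {
        return "mismatched input '" + e.found + "' expecting " + e.expecting;
    }

    static String display(ToyMismatchedTokenException e) {
        return getErrorHeader(e) + " " + getErrorMessage(e);
    }
}
```

Because the wording lives in one place on the recognizer side, changing or translating the messages means overriding a method rather than touching every place an exception is thrown.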
Besides the exception information, you can use custom actions in the grammar to provide error information.
One useful technique is to report which parsing rule (from the rule stack) was active when the error occurred. The usual approach is to define a stack of messages in the grammar's @members action, push and pop it in the @init/@after actions of the rules that mean the most to users, and override getErrorMessage() in the same @members action so that it uses peek() to fetch the message for the current context.
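The message stack described above might look roughly like this in plain Java. ParaphraseParser, baseErrorMessage() and variableDeclaration() are hypothetical names for illustration; in a real grammar the push and pop would sit in @init/@after actions, and getErrorMessage() would have ANTLR's own signature rather than the simplified one assumed here.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for a generated parser; only the message stack is modeled.
class ParaphraseParser {
    // What the grammar would declare in its @members action.
    private final Deque<String> paraphrases = new ArrayDeque<>();

    // Stands in for the default message the base recognizer would produce.
    String baseErrorMessage(String found) {
        return "mismatched input '" + found + "'";
    }

    // The override: append the innermost user-visible context, if there is one.
    String getErrorMessage(String found) {
        String msg = baseErrorMessage(found);
        if (!paraphrases.isEmpty()) {
            msg += " " + paraphrases.peek();
        }
        return msg;
    }

    // A rule that is meaningful to users: push on entry, pop on exit.
    String variableDeclaration(String offendingToken) {
        paraphrases.push("in variable declaration");   // what @init would do
        try {
            // Pretend a mismatch happened while this rule was active.
            return getErrorMessage(offendingToken);
        } finally {
            paraphrases.pop();                          // what @after would do
        }
    }
}
```

An error raised inside variableDeclaration() gets the suffix "in variable declaration", while an error raised outside any such rule keeps the plain message.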
Zero tolerance for errors: exit immediately on the first error
To do this, override the mismatch() and recoverFromMismatchedSet() methods in the grammar's @members action, and override the default @rulecatch.
Overriding @rulecatch affects every rule; for finer-grained, rule-level control, use a catch action on an individual rule to perform manual recovery.
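The idea of replacing the recovery hook with one that throws can be sketched without ANTLR. ToyRecognizer and StrictRecognizer below are hypothetical classes; the mismatch() name echoes the method mentioned above, but the signature and the default behavior are assumptions made for this toy.

```java
import java.util.List;

// Hypothetical toy recognizer over a list of token strings (not ANTLR's API).
class ToyRecognizer {
    protected final List<String> tokens;
    protected int pos = 0;
    int errors = 0;

    ToyRecognizer(List<String> tokens) {
        this.tokens = tokens;
    }

    // Default policy: count the error and carry on as if the token had been there.
    protected void mismatch(String expected) {
        errors++;
    }

    void match(String expected) {
        if (pos < tokens.size() && tokens.get(pos).equals(expected)) {
            pos++;
        } else {
            mismatch(expected);
        }
    }

    // stat : 'x' '=' '1' ';' ;
    void stat() {
        match("x");
        match("=");
        match("1");
        match(";");
    }
}

// Zero-tolerance variant: the mismatch hook throws instead of recovering.
class StrictRecognizer extends ToyRecognizer {
    StrictRecognizer(List<String> tokens) {
        super(tokens);
    }

    @Override
    protected void mismatch(String expected) {
        String found = pos < tokens.size() ? tokens.get(pos) : "<EOF>";
        throw new IllegalStateException(
            "expected '" + expected + "' but found '" + found + "' at token " + pos);
    }
}
```

For the input x 1 ; the lenient recognizer finishes with one counted error, while the strict one stops at the first mismatch. In a real grammar the thrown exception must also not be swallowed by the rule's own catch block, which is why @rulecatch has to be overridden as well.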
Errors in lexer and tree grammars are handled much like those in ordinary parser grammars; the only difference is that they deal with characters and tree nodes rather than tokens.
3 Error recovery strategy
Standing on the shoulders of giants
ANTLR's error recovery mechanism draws on Niklaus Wirth's "Algorithms + Data Structures = Programs" (an aside: Wirth did win the Turing Award, in 1984, though it was for his programming language designs rather than for this book), Rodney Topor's "A Note on Error Recovery in Recursive Descent Parsers", and some of Josef Grosch's ideas from the CoCo parser generator.
The essential idea is this: when the recognizer hits a mismatched symbol, it first tries to recover by inserting or deleting a single symbol. If that does not work, it swallows symbols until the lookahead symbol belongs to a resynchronization set, and then exits the rule.
A resynchronization set is a set of input symbols that can legally follow a reference to the current rule, given the current rule invocation chain.
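The recovery order can be sketched with a toy recursive-descent recognizer. Everything here is hypothetical (the RecoveringRecognizer class, its tiny grammar, and the hand-written follow sets); ANTLR computes these sets from the grammar and the rule invocation stack, so treat this only as an illustration of the three steps under those simplifying assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical toy recognizer (not ANTLR's API) showing the recovery order:
// single-symbol deletion, single-symbol insertion, then resynchronization.
class RecoveringRecognizer {
    private final List<String> tokens;
    private int pos = 0;
    final List<String> log = new ArrayList<>();

    RecoveringRecognizer(String... tokens) {
        this.tokens = Arrays.asList(tokens);
    }

    private String la(int k) {                    // k-th lookahead symbol, 1-based
        int i = pos + k - 1;
        return i < tokens.size() ? tokens.get(i) : "<EOF>";
    }

    private static Set<String> set(String... s) {
        return new HashSet<>(Arrays.asList(s));
    }

    // Returns false when in-place repair fails and the rule must resynchronize.
    private boolean match(String expected, Set<String> following) {
        if (la(1).equals(expected)) {
            pos++;
            return true;
        }
        if (la(2).equals(expected)) {             // deletion: drop one extra symbol
            log.add("deleted " + la(1));
            pos += 2;
            return true;
        }
        if (following.contains(la(1))) {          // insertion: act as if it was there
            log.add("inserted " + expected);
            return true;
        }
        return false;
    }

    private void resync(Set<String> resyncSet) {  // swallow symbols until in sync
        while (!la(1).equals("<EOF>") && !resyncSet.contains(la(1))) {
            pos++;
        }
        log.add("resynced at " + la(1));
    }

    // stat : 'x' '=' '1' ';' ;   A stat may be followed by another stat or by EOF.
    void stat() {
        Set<String> afterStat = set("x", "<EOF>");
        boolean ok = match("x", set("="))
                  && match("=", set("1"))
                  && match("1", set(";"))
                  && match(";", afterStat);
        if (!ok) {
            resync(afterStat);                    // give up on this stat and exit the rule
        }
    }

    // prog : stat* ;
    void prog() {
        while (!la(1).equals("<EOF>")) {
            stat();
        }
    }
}
```

With the input x = = 1 ; the extra = is deleted; with x = 1 x = 1 ; the missing ; is inserted; with x = + + 1 ; x = 1 ; neither repair applies, so the recognizer skips ahead to the next x and continues with the second statement.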

