<note>
  * The official documentation is fairly terse; buying the book is recommended.
    * [[https://github.com/antlr/antlr4/blob/master/doc/index.md|ANTLR 4 Documentation]]
    * [[https://www.amazon.com/Definitive-ANTLR-4-Reference/dp/1934356999|The Definitive ANTLR 4 Reference]]
    * [[http://www.antlr.org/api/Java/index.html|ANTLR 4 Runtime 4.7 API]]
</note>

  * [[http://www.antlr.org/|ANTLR]]
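    * Besides embedding actions in the grammar, ANTLR offers the following two mechanisms. You can start using them right away and come back to the underlying design patterns once you are familiar with them.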
      * [[wp>Visitor pattern]]
      * [[wp>Observer pattern]]
====== Environment Setup ======
  * [[https://github.com/antlr/antlr4/blob/master/doc/getting-started.md|Getting Started with ANTLR v4]]

  - Download the JAR file and set up the environment.<code bash>
$ cd /usr/local/lib
$ curl -O http://www.antlr.org/download/antlr-4.7-complete.jar
$ export CLASSPATH=".:/usr/local/lib/antlr-4.7-complete.jar:$CLASSPATH"
$ alias antlr4='java -jar /usr/local/lib/antlr-4.7-complete.jar'
$ alias grun='java org.antlr.v4.gui.TestRig'
</code>
  - Download the examples.<code bash>
# cd; mkdir antlr; cd antlr
# curl -O http://media.pragprog.com/titles/tpantlr2/code/tpantlr2-code.tgz
# tar xvf tpantlr2-code.tgz
# cd code/install
$ git clone https://github.com/azru0512/antlr.git
$ cd install
$ antlr4 Hello.g4
$ javac *.java
# grun grammar_name start_rule
$ grun Hello r -tokens
$ grun Hello r -gui
</code>
====== Part I ======
  * Ch3. A Starter ANTLR Project
    * The parser and lexer generated by ANTLR are invoked as follows:<code java>
// import ANTLR's runtime libraries
import org.antlr.v4.runtime.*;
import org.antlr.v4.runtime.tree.*;

public class Test {
    public static void main(String[] args) throws Exception {
        // create a CharStream that reads from standard input
        // (ANTLRInputStream is deprecated as of ANTLR 4.7)
        CharStream input = CharStreams.fromStream(System.in);

        // create a lexer that feeds off of input CharStream
        ArrayInitLexer lexer = new ArrayInitLexer(input);

        // create a buffer of tokens pulled from the lexer
        CommonTokenStream tokens = new CommonTokenStream(lexer);

        // create a parser that feeds off the tokens buffer
        ArrayInitParser parser = new ArrayInitParser(tokens);

        ParseTree tree = parser.init(); // begin parsing at init rule
        
        // process the parse tree (tree) produced by the parser
    }
}
</code>
    * Typically you extend the Listener base class generated by ANTLR and override the callbacks you need.<code java>
/** Convert short array inits like {1,2,3} to "\u0001\u0002\u0003" */
public class ShortToUnicodeString extends ArrayInitBaseListener {
    /** Translate { to " */
    @Override
    public void enterInit(ArrayInitParser.InitContext ctx) {
        System.out.print('"');
    }

    /** Translate } to " */
    @Override
    public void exitInit(ArrayInitParser.InitContext ctx) {
        System.out.print('"');
    }

    /** Translate integers to 4-digit hexadecimal strings prefixed with \\u */
    @Override
    public void enterValue(ArrayInitParser.ValueContext ctx) {
        // Assumes no nested array initializers
        int value = Integer.valueOf(ctx.INT().getText());
        System.out.printf("\\u%04x", value);
    }
}
</code>
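      * As ANTLR's parse-tree walker traverses the nodes, it calls the corresponding callback on entering and on exiting each node, passing in the context object (ctx) for that node. Through ctx we can get the node's text, its child nodes, and so on.
    * The code that walks the parse tree and invokes the Listener looks basically like this:<code java>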
        // Create a generic parse tree walker that can trigger callbacks
        ParseTreeWalker walker = new ParseTreeWalker();
        // Walk the tree created during the parse, trigger callbacks
        walker.walk(new ShortToUnicodeString(), tree);
</code>
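The enter/exit callback order described above can be sketched without ANTLR at all. The following is a minimal illustration, assuming a hand-built tree: ''Node'', ''Listener'', ''walk'' and ''trace'' are hypothetical names invented for this sketch and are not ANTLR classes. It only shows that the enter callback fires before a node's children are visited and the exit callback fires after them.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for a parse-tree walker; none of these types are ANTLR classes. */
class WalkOrderSketch {
    /** Hypothetical tree node: a rule name plus child nodes. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name, Node... kids) {
            this.name = name;
            for (Node k : kids) children.add(k);
        }
    }

    /** Hypothetical listener with one enter and one exit callback. */
    interface Listener {
        void enter(Node ctx);
        void exit(Node ctx);
    }

    /** Depth-first walk: enter fires before the children, exit after them. */
    static void walk(Listener listener, Node node) {
        listener.enter(node);
        for (Node child : node.children) walk(listener, child);
        listener.exit(node);
    }

    /** Walks the tree and records the order in which the callbacks fire. */
    static String trace(Node root) {
        StringBuilder out = new StringBuilder();
        walk(new Listener() {
            @Override public void enter(Node ctx) { out.append("enter:").append(ctx.name).append(' '); }
            @Override public void exit(Node ctx) { out.append("exit:").append(ctx.name).append(' '); }
        }, root);
        return out.toString().trim();
    }

    public static void main(String[] args) {
        // Rough shape of the tree for the input {1,2}: init -> value, value
        Node tree = new Node("init", new Node("value"), new Node("value"));
        System.out.println(trace(tree));
    }
}
```

In the real generated code there is one enter/exit pair per rule (such as ''enterInit'' and ''exitInit''), and ''ParseTreeWalker'' plays the role of ''walk''.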
  * Ch4. A Quick Tour
    * Basic example.<code bash>
$ antlr4 Expr.g4
$ javac Expr*.java
$ grun Expr prog -gui t.expr
</code>
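      * A grammar file begins with the ''grammar'' keyword. Parser rule names start with a lowercase letter and lexer rule names start with an uppercase letter. Parser and lexer rules can live in the same file, or in separate files loaded via ''import''.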
    * Visitor<code bash>
$ antlr4 -no-listener -visitor LabeledExpr.g4
$ javac Calc.java LabeledExpr*.java
$ java Calc t.expr
</code>
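      * Expr.g4 needs a small modification to become LabeledExpr.g4: the alternatives are labeled (''# Label'') so that one visit method is generated per alternative.
      * Extend the Visitor base class generated by ANTLR; its type parameter is the return type of the visit methods. Usage:<code java>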
        ParseTree tree = parser.prog(); // parse

        EvalVisitor eval = new EvalVisitor();
        eval.visit(tree);
</code>
    * Listener<code bash>
$ antlr4 Java.g4
$ javac Java*.java Extract*.java
$ java ExtractInterfaceTool Demo.java
</code>
      * Extend the Listener base class generated by ANTLR. Usage:<code java>
        ParseTree tree = parser.compilationUnit(); // parse

        ParseTreeWalker walker = new ParseTreeWalker(); // create standard walker
        ExtractInterfaceListener extractor = new ExtractInterfaceListener(parser);
        walker.walk(extractor, tree); // initiate walk of tree with listener
</code>
      * Adding the following code to ExtractInterfaceListener.java prints the ''import'' statements of Demo.java.<code java>
    @Override
    public void enterImportDeclaration(JavaParser.ImportDeclarationContext ctx) {
        System.out.println("import "+parser.getTokenStream().getText(ctx));
    }
</code>
    * Embedded Actions<code bash>
$ antlr4 -no-listener Rows.g4
$ javac Rows*.java Col.java
$ java Col 1 < t.rows  
</code>
      * [[https://github.com/antlr/antlr4/blob/master/doc/predicates.md|Semantic Predicates]]
        * Decide the next parsing step based on what has been parsed so far.
    * Lexical
      * Island Grammars
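        * The language being parsed may have several sets of lexical rules; for example, code and comments follow different rules. ANTLR handles this with lexical modes, which serve the same purpose as Flex's [[http://westes.github.io/flex/manual/Start-Conditions.html#Start-Conditions|Start Conditions]].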
      * TokenStreamRewriter
        * Inside a Listener, TokenStreamRewriter offers a simple way to rewrite the input text.
      * Channel
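        * The lexer can send tokens to the parser on different channels. The parser reads only one channel (the default one), so tokens such as comments and whitespace can be kept on a hidden channel instead of being discarded.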
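The note above that the Visitor's type parameter is the return type of the visit methods can be illustrated with a hand-written visitor. This is only a sketch under stated assumptions: ''Visitor<T>'', ''Node'', ''IntNode'', ''AddNode'', ''MulNode'' and ''PrintVisitor'' are hypothetical names for this example, not ANTLR-generated classes, and the tree for ''1+2*3'' is built by hand rather than parsed.

```java
/** Toy visitor over a hand-built expression tree; not ANTLR-generated code. */
class VisitorSketch {
    /** Hypothetical visitor interface; T is the result type of every visit method. */
    interface Visitor<T> {
        T visitInt(IntNode n);
        T visitAdd(AddNode n);
        T visitMul(MulNode n);
    }

    interface Node {
        <T> T accept(Visitor<T> v);
    }

    static final class IntNode implements Node {
        final int value;
        IntNode(int value) { this.value = value; }
        @Override public <T> T accept(Visitor<T> v) { return v.visitInt(this); }
    }

    static final class AddNode implements Node {
        final Node left, right;
        AddNode(Node left, Node right) { this.left = left; this.right = right; }
        @Override public <T> T accept(Visitor<T> v) { return v.visitAdd(this); }
    }

    static final class MulNode implements Node {
        final Node left, right;
        MulNode(Node left, Node right) { this.left = left; this.right = right; }
        @Override public <T> T accept(Visitor<T> v) { return v.visitMul(this); }
    }

    /** T = Integer: each visit returns the value of its subtree. */
    static final class EvalVisitor implements Visitor<Integer> {
        @Override public Integer visitInt(IntNode n) { return n.value; }
        @Override public Integer visitAdd(AddNode n) { return n.left.accept(this) + n.right.accept(this); }
        @Override public Integer visitMul(MulNode n) { return n.left.accept(this) * n.right.accept(this); }
    }

    /** T = String: the same tree, but each visit returns printable text. */
    static final class PrintVisitor implements Visitor<String> {
        @Override public String visitInt(IntNode n) { return String.valueOf(n.value); }
        @Override public String visitAdd(AddNode n) { return "(" + n.left.accept(this) + " + " + n.right.accept(this) + ")"; }
        @Override public String visitMul(MulNode n) { return "(" + n.left.accept(this) + " * " + n.right.accept(this) + ")"; }
    }

    public static void main(String[] args) {
        // Hand-built tree for 1+2*3
        Node tree = new AddNode(new IntNode(1), new MulNode(new IntNode(2), new IntNode(3)));
        System.out.println(tree.accept(new PrintVisitor()) + " = " + tree.accept(new EvalVisitor()));
    }
}
```

Unlike a listener, the visitor decides itself which children to visit, because each visit method calls ''accept'' on the children explicitly.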
====== Part II ======
  * Ch5. Designing Grammars
    * [[https://github.com/antlr/grammars-v4|Grammars written for ANTLR v4]]
    * Grammars are basically built from the following four patterns:
      * Sequence
        * ''exprList: expr (',' expr)* ;''
      * Choice
        * ''type: 'float' | 'int' | 'void' ;''
      * Token dependence
        * ''vector: '[' INT+ ']' ;''
      * Nested phrase
        * ''expr: expr '+' expr ;''
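    * Operator precedence and associativity. In ANTLR, alternatives listed earlier have higher precedence; operators are left-associative by default, and ''<assoc=right>'' marks right-associative ones. Bison instead uses precedence declarations ([[https://www.gnu.org/software/bison/manual/bison.html#Precedence-Decl|3.7.3 Operator Precedence]]) to specify precedence and associativity.
    * Commonly used lexer rules are listed below. ANTLR's ''fragment'' rules serve the same purpose as Flex's name definitions ([[http://westes.github.io/flex/manual/Definitions-Section.html#Definitions-Section|5.1 Format of the Definitions Section]]).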
      * Identifier
        * ''ID: [a-zA-Z]+ ;''
      * Number
        * ''INT: [0-9]+ ;''
      * String Literal: ''.*'' matches any characters, but greedily, so it would also swallow the closing ''"''. ''.*?'' is a nongreedy subrule that matches any characters only up to the next ''"''.
        * ''STRING: '"' .*? '"' ;''
      * Comment and White Space: ''skip'' behaves like an empty action in Flex ([[http://westes.github.io/flex/manual/Actions.html#Actions|8 Actions]]).
        * ''WS: [ \t\r\n]+ -> skip ;''
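    * What belongs in lexer rules, and what in parser rules?
      * Anything the parser does not need goes into lexer rules, which match and discard it, e.g. whitespace and comments.
      * Common tokens go into lexer rules, e.g. identifiers.
      * Anything the parser does not need to tell apart goes into lexer rules, e.g. ''NUMBER: INT | FLOAT ;''. If parsing does not distinguish int from float, let the lexer handle both as one token type.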
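The difference between the greedy ''.*'' and the nongreedy ''.*?'' in the STRING rule can be tried out with ''java.util.regex'', whose quantifiers follow the same greedy/nongreedy idea. This is only an analogy and a minimal sketch: Java regular expressions are not the ANTLR lexer, and ''GreedySketch'' and ''findAll'' are names made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Greedy vs. nongreedy matching, shown with java.util.regex (an analogy, not the ANTLR lexer). */
class GreedySketch {
    /** Returns every match of regex in input, in order. */
    static List<String> findAll(String regex, String input) {
        List<String> hits = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) hits.add(m.group());
        return hits;
    }

    public static void main(String[] args) {
        String input = "\"a\" + \"b\"";   // the text: "a" + "b"

        // Greedy: .* runs to the LAST quote, so both literals end up in one match.
        System.out.println(findAll("\".*\"", input));

        // Nongreedy: .*? stops at the NEXT quote, so each literal is matched separately.
        System.out.println(findAll("\".*?\"", input));
    }
}
```

With the greedy pattern the whole input comes back as a single match; with the nongreedy pattern the two string literals are found separately, which is the behavior wanted for a STRING token.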
  * Ch6. Exploring Some Real Grammars
    * comma-separated values<code bash>
$ cd code/example
$ antlr4 CSV.g4
$ javac CSV*.java
$ grun CSV file -tokens data.csv
</code>
    * nested elements<code bash>
$ antlr4 JSON.g4
$ javac JSON*.java
$ grun JSON json -tokens
[1, "\u0049", 1.3e9]
</code>
  * Ch7. Decoupling Grammars from Application-Specific Code
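    * The most basic way to decouple a grammar from its actions is to wrap the action code in functions and replace the original action code with function calls. With Bison, the usual approach is to create a Compiler class whose methods implement the actions. ''%parse-param {Compiler* compiler}'' tells ''yyparse'' to accept a Compiler pointer, and the actions are then executed by calling methods through that pointer. Besides the methods, the Compiler class can keep all the information that has to be recorded during compilation as members.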
    * Listener
      * Let ANTLR walk the parse tree and call the user-defined listener.<code java>
        ParseTree tree = parser.file();

        // create a standard ANTLR parse tree walker
        ParseTreeWalker walker = new ParseTreeWalker();
        // create listener then feed to walker
        PropertyFileLoader loader = new PropertyFileLoader();
        walker.walk(loader, tree);        // walk parse tree
</code>
      * <code bash>
$ cd code/listeners
$ antlr4 LExpr.g4
$ javac LExpr*.java TestLEvaluator.java
$ java TestLEvaluator
1+2*3
</code>
        * Listener callbacks return ''void'', so if a listener needs return values they have to be simulated with a stack.
    * Visitor
      * If you need to control the tree walk yourself (for example, to skip some parse-tree nodes), use the Visitor pattern.<code java>
        ParseTree tree = parser.file();
        
        PropertyFileVisitor loader = new PropertyFileVisitor();
        loader.visit(tree);
</code>
      * <code bash>
$ cd code/listeners
$ antlr4 -visitor LExpr.g4
$ javac LExpr*.java TestLEvalVisitor.java
$ java TestLEvalVisitor
1+2*3
</code>
    * Annotated Parse Tree
      * Store the values on the parse-tree nodes.
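The remark that a listener has to simulate return values with a stack can be sketched as follows. This is a minimal illustration on a hand-built tree for ''1+2*3'', assuming hypothetical names (''Node'', ''EvalListener'', ''exitInt'', ''exitBinary'', ''walk''); the book's LExpr example uses the ANTLR-generated listener instead.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Listener-style evaluation where results travel on a stack; not ANTLR-generated code. */
class StackListenerSketch {
    /** Hypothetical node: either an integer leaf or a binary "+" / "*" operation. */
    static final class Node {
        final String op;      // "int", "+" or "*"
        final int value;      // only meaningful when op is "int"
        final Node left, right;

        Node(int value) { this.op = "int"; this.value = value; this.left = null; this.right = null; }
        Node(String op, Node left, Node right) { this.op = op; this.value = 0; this.left = left; this.right = right; }
    }

    /** Callbacks return void, so each one pushes its result for the parent to pop. */
    static final class EvalListener {
        final Deque<Integer> stack = new ArrayDeque<>();

        void exitInt(Node ctx) { stack.push(ctx.value); }

        void exitBinary(Node ctx) {
            int right = stack.pop();   // the right operand was pushed last
            int left = stack.pop();
            stack.push(ctx.op.equals("+") ? left + right : left * right);
        }
    }

    /** Stand-in for the tree walker: children first, then the node's exit callback. */
    static void walk(EvalListener listener, Node node) {
        if (node.op.equals("int")) { listener.exitInt(node); return; }
        walk(listener, node.left);
        walk(listener, node.right);
        listener.exitBinary(node);
    }

    static int eval(Node tree) {
        EvalListener listener = new EvalListener();
        walk(listener, tree);
        return listener.stack.pop();   // the final result is the only value left
    }

    public static void main(String[] args) {
        Node tree = new Node("+", new Node(1), new Node("*", new Node(2), new Node(3)));
        System.out.println(eval(tree));
    }
}
```

Annotating the tree is the alternative to the stack: instead of pushing, each callback stores its result in a map keyed by node, and the parent looks up its children's entries.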
  * Ch8. Building Some Real Language Applications
    * Loading CSV Data <code bash>
$ cd code/listeners
$ antlr4 CSV.g4 
$ javac CSV*.java LoadCSV.java 
$ java LoadCSV t.csv 
</code>
      * Decide up front where, and in what form, to store the data you need.
    * Translating JSON to XML <code bash>
$ antlr4 JSON.g4 
$ javac JSON*.java
$ java JSON2XML t.json
</code>
      * ''ParseTreeProperty<T>'' associates a value of type T with each node.
    * Generating a Call Graph <code bash>
$ antlr4 Cymbol.g4 
$ javac Cymbol*.java CallGraph.java 
$ java CallGraph t.cymbol
</code>
    * Validating Program Symbol Usage <code bash>
$ antlr4 Cymbol.g4 
$ javac Cymbol*.java CheckSymbols.java *Phase.java *Scope.java *Symbol.java
$ java CheckSymbols vars.cymbol
</code>
      * Uses a symbol table. The first tree walk defines symbols; the second tree walk resolves them.<code java>
        ParseTreeWalker walker = new ParseTreeWalker();
        DefPhase def = new DefPhase();
        walker.walk(def, tree);
        // create next phase and feed symbol table info from def to ref phase
        RefPhase ref = new RefPhase(def.globals, def.scopes);
        walker.walk(ref, tree);
</code>
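        * Typically both the global symbol table and the current symbol table are tracked.
        * Some symbols are also symbol tables (scopes). For example, a function is itself a symbol and also a scope that contains its parameter list.
        * Symbol tables and parse-tree nodes point to each other. For example, a function node points to its corresponding function symbol table.
        * Each symbol table points to its enclosing symbol table.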
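The define/resolve idea and the link from each symbol table to its enclosing one can be sketched in a few lines. This is a simplified illustration, not the book's ''Scope''/''Symbol'' classes: ''ScopeSketch'', ''Scope'', ''define'', ''resolve'' and ''buildExample'' are hypothetical names, and symbols are reduced to a name-to-type map.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Minimal nested-scope symbol table; a simplification, not the book's classes. */
class ScopeSketch {
    /** Hypothetical scope: a symbol table that points to its enclosing scope. */
    static final class Scope {
        final String name;
        final Scope enclosing;                                     // null for the global scope
        final Map<String, String> symbols = new LinkedHashMap<>(); // symbol name -> type

        Scope(String name, Scope enclosing) {
            this.name = name;
            this.enclosing = enclosing;
        }

        /** Define phase: record a symbol in this scope. */
        void define(String symbol, String type) { symbols.put(symbol, type); }

        /** Resolve phase: look here first, then outward; null if undefined. */
        String resolve(String symbol) {
            for (Scope s = this; s != null; s = s.enclosing) {
                String type = s.symbols.get(symbol);
                if (type != null) return type;
            }
            return null;
        }
    }

    /** What a define pass might record for: int x; void f(int y) { ... } */
    static Scope buildExample() {
        Scope globals = new Scope("globals", null);
        globals.define("x", "int");
        globals.define("f", "function");
        Scope f = new Scope("f", globals);   // the function is a symbol and also a scope
        f.define("y", "int");
        return f;
    }

    public static void main(String[] args) {
        Scope f = buildExample();
        System.out.println("y in f: " + f.resolve("y"));                   // found in f itself
        System.out.println("x in f: " + f.resolve("x"));                   // found via the enclosing scope
        System.out.println("z in f: " + f.resolve("z"));                   // undefined everywhere
        System.out.println("y in globals: " + f.enclosing.resolve("y"));   // not visible outside f
    }
}
```

Running the two phases as separate walks means a reference can be resolved even when its definition appears later in the source, since all definitions are recorded before any lookup happens.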
====== Part III ======
  * Ch9. Error Reporting and Recovery
    * A Parade of Errors<code bash>
$ antlr4 Simple.g4 
$ javac Simple*.java
$ grun Simple prog
class T { int i; }
</code>
      * Fix Simple.g4 according to the [[https://pragprog.com/titles/tpantlr2/errata|Errata for The Definitive ANTLR 4 Reference]].
      * With single-token deletion and insertion, the parser performs simple error recovery by ignoring an erroneous token or by conjuring up the expected token.
    * Altering and Redirecting ANTLR Error Messages<code bash>
$ javac TestE_Listener.java
TestE_Listener.java:10: error: cannot find symbol
import org.antlr.v4.runtime.Nullable;
                           ^
$ java TestE_Listener
class T T {
  int i;
}                          
</code>
      * Extend ''BaseErrorListener'' and override ''syntaxError''.
    * Automatic Error Recovery Strategy
      * The strategy is basically the same as Bison's ([[https://www.gnu.org/software/bison/manual/html_node/Error-Recovery.html|6 Error Recovery]]): find a suitable synchronization point.
      * [[http://www.antlr.org/api/Java/org/antlr/v4/runtime/DefaultErrorStrategy.html|DefaultErrorStrategy]]
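Single-token deletion and insertion can be illustrated with a toy ''match'' routine. This is only a sketch of the idea, not ANTLR's ''DefaultErrorStrategy'', which works with sets of expected and following tokens. Here each call is given exactly one expected token and one hypothetical follow token, and all names (''InlineRecoverySketch'', ''la'', ''match'', ''parseClassDef'') are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of single-token deletion/insertion; not ANTLR's DefaultErrorStrategy. */
class InlineRecoverySketch {
    final List<String> tokens;
    final List<String> errors = new ArrayList<>();
    int pos = 0;

    InlineRecoverySketch(List<String> tokens) { this.tokens = tokens; }

    /** k-th lookahead token (1-based), or "<EOF>" past the end of the input. */
    String la(int k) {
        int i = pos + k - 1;
        return i < tokens.size() ? tokens.get(i) : "<EOF>";
    }

    /** Match expected; follow is the token assumed to come right after it. */
    void match(String expected, String follow) {
        if (la(1).equals(expected)) { pos++; return; }
        if (la(2).equals(expected)) {
            // Single-token deletion: skip the stray token, then consume the expected one.
            errors.add("extraneous input '" + la(1) + "' expecting '" + expected + "'");
            pos += 2;
            return;
        }
        if (la(1).equals(follow)) {
            // Single-token insertion: pretend expected was there and consume nothing.
            errors.add("missing '" + expected + "' at '" + la(1) + "'");
            return;
        }
        throw new IllegalStateException("mismatched input '" + la(1) + "'");
    }

    /** Toy rule: classDef : 'class' 'T' '{' '}' ; returns the reported errors. */
    static List<String> parseClassDef(List<String> input) {
        InlineRecoverySketch p = new InlineRecoverySketch(input);
        p.match("class", "T");
        p.match("T", "{");
        p.match("{", "}");
        p.match("}", "<EOF>");
        return p.errors;
    }

    public static void main(String[] args) {
        // Extra T, as in "class T T { }": recovered by deleting one token.
        System.out.println(parseClassDef(Arrays.asList("class", "T", "T", "{", "}")));
        // Missing closing brace: recovered by inserting one token.
        System.out.println(parseClassDef(Arrays.asList("class", "T", "{")));
    }
}
```

In both cases the toy parser reports one error and still reaches the end of the rule, which is the point of inline recovery: one bad token should not abort the whole parse.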