Finding refs and defs in an Antlr grammar
An Antlr4 grammar has the form:
/** Optional javadoc style comment */
grammar Name;
options {...}
import ... ;
tokens {...}
channels {...} // lexer only
@actionName {...}
rule1 // parser and lexer rules, possibly intermingled
...
ruleN
where each rule is:
ruleName : alternative1 | ... | alternativeN ;
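As a concrete instance of the layout above, the following sketch writes a tiny combined grammar to a file. The grammar name Expr, its rule names, and the file name Expr.g4 are made up for illustration and do not come from the Java grammar used in the examples on this page.

```shell
# Write a small, hypothetical combined grammar to Expr.g4.
# The quoted 'EOF' keeps the shell from expanding anything inside the here-document.
cat > Expr.g4 <<'EOF'
grammar Expr;
expr : expr '+' term | term ;   // parser rule: name starts with a lowercase letter
term : INT ;
INT  : [0-9]+ ;                 // lexer rule: name starts with an uppercase letter
WS   : [ \t\r\n]+ -> skip ;
EOF
```

A file like this gives the scripts on this page a small input to experiment with, since they parse every *.g4 file in the current directory.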
Often we are interested in finding the locations of rule names in the grammar. Trash is ideally suited to finding defining and applied occurrences of a rule name.
Defining occurrences
A defining occurrence of a rule name is the symbol on the left-hand side of a rule.
Trash can find these with a simple XPath expression that looks for a parserRuleSpec/RULE_REF or a lexerRuleSpec/TOKEN_REF in the parse tree of a grammar.
#!/usr/bin/env bash
# The usage banner is printed on every run, with or without arguments.
cat - <<EOF
Usage: $0 symbol-name
Finds all definitions of a symbol.
EOF
if [[ $# -gt 0 ]]
then
    echo Arguments were provided.
    # Report only the definitions whose name equals the first argument.
    trparse -l -t ANTLRv4 *.g4 2> /dev/null | \
        trxgrep " //(parserRuleSpec/RULE_REF[text()='$1'] | lexerRuleSpec/TOKEN_REF[text()='$1'])" | \
        trcaret
else
    echo No arguments were provided.
    # Report every parser and lexer rule definition.
    trparse -l -t ANTLRv4 *.g4 2> /dev/null | \
        trxgrep " //(parserRuleSpec/RULE_REF | lexerRuleSpec/TOKEN_REF)" | \
        trcaret
fi
Example
Grammar
https://github.com/antlr/grammars-v4/tree/443916c7460a8f69e66666873c0088d2b3ec64c8/java/java
Command
$ bash /c/Users/Kenne/Documents/GitHub/g4-checks/find-defs.sh expression
Usage: /c/Users/Kenne/Documents/GitHub/g4-checks/find-defs.sh symbol-name
Finds all definitions of a symbol.
Arguments were provided.
JavaParser.g4:L588: expression
^
Applied occurrences
An applied occurrence of a rule name is a use of the symbol on the right-hand side of a rule. Often we are interested not only in these, but in the defining occurrences as well. Trash can find both easily.
#!/usr/bin/env bash
# The usage banner is printed on every run, with or without arguments.
cat - <<EOF
Usage: $0 symbol-name
Finds all references of a symbol.
EOF
if [[ $# -gt 0 ]]
then
    echo Arguments were provided.
    # Match uses of the symbol inside lexer and parser rule bodies, plus its definition.
    trparse -l -t ANTLRv4 *.g4 2> /dev/null | \
        trxgrep " //(lexerRuleSpec/lexerRuleBlock//(ruleref/RULE_REF[text()='$1'] | terminal/TOKEN_REF[text()='$1']) | parserRuleSpec/ruleBlock//(ruleref/RULE_REF[text()='$1'] | terminal/TOKEN_REF[text()='$1']) | parserRuleSpec/RULE_REF[text()='$1'] | lexerRuleSpec/TOKEN_REF[text()='$1'])" | \
        trcaret
else
    # No symbol given: report every applied and defining occurrence.
    trparse -l -t ANTLRv4 *.g4 2> /dev/null | \
        trxgrep " //(lexerRuleSpec/lexerRuleBlock//(ruleref/RULE_REF | terminal/TOKEN_REF) | parserRuleSpec/ruleBlock//(ruleref/RULE_REF | terminal/TOKEN_REF) | parserRuleSpec/RULE_REF | lexerRuleSpec/TOKEN_REF)" | \
        trcaret
fi
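The script above reports applied and defining occurrences together. If only applied occurrences are wanted, one option is to drop the two defining alternatives from its XPath expression. The sketch below does this with a helper named applied_xpath, which is a hypothetical name and not part of Trash. The resulting expression has not been checked against every grammar, so treat it as a starting point.

```shell
# Hypothetical helper: build an XPath that matches only applied occurrences of "$1".
# It is the expression from the script above, minus the alternatives
# parserRuleSpec/RULE_REF and lexerRuleSpec/TOKEN_REF that match definitions.
applied_xpath() {
  printf '%s' " //(lexerRuleSpec/lexerRuleBlock//(ruleref/RULE_REF[text()='$1'] | terminal/TOKEN_REF[text()='$1']) | parserRuleSpec/ruleBlock//(ruleref/RULE_REF[text()='$1'] | terminal/TOKEN_REF[text()='$1']))"
}

# Intended use (assumes the Trash tools are installed and *.g4 files are present):
# trparse -l -t ANTLRv4 *.g4 2> /dev/null | trxgrep "$(applied_xpath expression)" | trcaret
```

Keeping the expression in one function means the symbol name is substituted in a single place.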
Example
Grammar
https://github.com/antlr/grammars-v4/tree/443916c7460a8f69e66666873c0088d2b3ec64c8/java/java
Command
$ bash /c/Users/Kenne/Documents/GitHub/g4-checks/find-refs.sh expression
Usage: /c/Users/Kenne/Documents/GitHub/g4-checks/find-refs.sh symbol-name
Finds all references of a symbol.
Arguments were provided.
JavaParser.g4:L252: | expression
^
JavaParser.g4:L349: : expression
^
JavaParser.g4:L457: : variableModifier* (VAR identifier '=' expression | typeType variableDeclarators)
^
JavaParser.g4:L503: | ASSERT expression (':' expression)? ';'
^
JavaParser.g4:L503: | ASSERT expression (':' expression)? ';'
^
JavaParser.g4:L512: | RETURN expression? ';'
^
JavaParser.g4:L513: | THROW expression ';'
^
JavaParser.g4:L516: | YIELD expression ';' // Java17
^
JavaParser.g4:L518: | statementExpression=expression ';'
^
JavaParser.g4:L544: : variableModifier* ( classOrInterfaceType variableDeclaratorId | VAR identifier ) '=' expression
^
JavaParser.g4:L556: : CASE (constantExpression=expression | enumConstantName=IDENTIFIER | typeType varName=identifier) ':'
^
JavaParser.g4:L562: | forInit? ';' expression? ';' forUpdate=expressionList?
^
JavaParser.g4:L571: : variableModifier* (typeType | VAR) variableDeclaratorId ':' expression
^
JavaParser.g4:L577: : '(' expression ')'
^
JavaParser.g4:L581: : expression (',' expression)*
^
JavaParser.g4:L581: : expression (',' expression)*
^
JavaParser.g4:L592: | expression '[' expression ']'
^
JavaParser.g4:L592: | expression '[' expression ']'
^
JavaParser.g4:L593: | expression bop='.'
^
JavaParser.g4:L604: | expression '::' typeArguments? identifier
^
JavaParser.g4:L611: | expression postfix=('++' | '--')
^
JavaParser.g4:L614: | prefix=('+'|'-'|'++'|'--'|'~'|'!') expression
^
JavaParser.g4:L617: | '(' annotation* typeType ('&' typeType)* ')' expression
^
JavaParser.g4:L621: | expression bop=('*'|'/'|'%') expression // Level 12, Multiplicative operators
^
JavaParser.g4:L621: | expression bop=('*'|'/'|'%') expression // Level 12, Multiplicative operators
^
JavaParser.g4:L622: | expression bop=('+'|'-') expression // Level 11, Additive operators
^
JavaParser.g4:L622: | expression bop=('+'|'-') expression // Level 11, Additive operators
^
JavaParser.g4:L623: | expression ('<' '<' | '>' '>' '>' | '>' '>') expression // Level 10, Shift operators
^
JavaParser.g4:L623: | expression ('<' '<' | '>' '>' '>' | '>' '>') expression // Level 10, Shift operators
^
JavaParser.g4:L624: | expression bop=('<=' | '>=' | '>' | '<') expression // Level 9, Relational operators
^
JavaParser.g4:L624: | expression bop=('<=' | '>=' | '>' | '<') expression // Level 9, Relational operators
^
JavaParser.g4:L625: | expression bop=INSTANCEOF (typeType | pattern)
^
JavaParser.g4:L626: | expression bop=('==' | '!=') expression // Level 8, Equality Operators
^
JavaParser.g4:L626: | expression bop=('==' | '!=') expression // Level 8, Equality Operators
^
JavaParser.g4:L627: | expression bop='&' expression // Level 7, Bitwise AND
^
JavaParser.g4:L627: | expression bop='&' expression // Level 7, Bitwise AND
^
JavaParser.g4:L628: | expression bop='^' expression // Level 6, Bitwise XOR
^
JavaParser.g4:L628: | expression bop='^' expression // Level 6, Bitwise XOR
^
JavaParser.g4:L629: | expression bop='|' expression // Level 5, Bitwise OR
^
JavaParser.g4:L629: | expression bop='|' expression // Level 5, Bitwise OR
^
JavaParser.g4:L630: | expression bop='&&' expression // Level 4, Logic AND
^
JavaParser.g4:L630: | expression bop='&&' expression // Level 4, Logic AND
^
JavaParser.g4:L631: | expression bop='||' expression // Level 3, Logic OR
^
JavaParser.g4:L631: | expression bop='||' expression // Level 3, Logic OR
^
JavaParser.g4:L632: | <assoc=right> expression bop='?' expression ':' expression // Level 2, Ternary
^
JavaParser.g4:L632: | <assoc=right> expression bop='?' expression ':' expression // Level 2, Ternary
^
JavaParser.g4:L632: | <assoc=right> expression bop='?' expression ':' expression // Level 2, Ternary
^
JavaParser.g4:L634: | <assoc=right> expression
^
JavaParser.g4:L636: expression
^
JavaParser.g4:L588: expression
^
JavaParser.g4:L662: : expression
^
JavaParser.g4:L667: : '(' expression ')'
^
JavaParser.g4:L690: | variableModifier* typeType annotation* identifier ('&&' expression)*
^
JavaParser.g4:L691: | guardedPattern '&&' expression
^
JavaParser.g4:L720: | ('[' expression ']')+ ('[' ']')*
^
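Because each hit is printed as a location line followed by a caret line, a long listing like the one above can be condensed with standard text tools. The sketch below counts hits per grammar line. It assumes every hit starts with a file name, a colon, and an L-prefixed line number, as in the listing, and that file names contain no colon. The function name count_hits is hypothetical.

```shell
# Hypothetical helper: read trcaret-style output on stdin and print
# "count file:Lline" for each grammar line that has at least one hit.
count_hits() {
  # 1. Keep the location lines and drop the caret lines.
  # 2. Reduce each location line to "file:Lline".
  # 3. Count repeats of the same location.
  # 4. Strip the leading padding that uniq -c adds.
  grep -E '^[^:]+:L[0-9]+:' |
    sed -E 's/^([^:]+:L[0-9]+):.*/\1/' |
    sort | uniq -c |
    sed -E 's/^ *([0-9]+) +/\1 /'
}

# Example input: three hits copied from the listing above.
printf '%s\n' \
  "JavaParser.g4:L512: | RETURN expression? ';'" "^" \
  "JavaParser.g4:L581: : expression (',' expression)*" "^" \
  "JavaParser.g4:L581: : expression (',' expression)*" "^" | count_hits
```

For this sample input the function prints one line for L512 with a count of 1 and one line for L581 with a count of 2. In practice the output of the find-refs script would be piped into count_hits directly.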