Finding useless parentheses in Antlr4 grammars using XPath 2.0

Trash has been extended to parse and execute XPath 2.0 for-loops. It has also been extended with attributes for the line and column of a parse tree node. Together, these extensions make it possible to write a script that finds useless parentheses in an Antlr4 grammar and prints the location of each one.

find-useless.sh

#!/bin/bash
# Stop MSYS2 from rewriting the XPath argument as if it were a file path.
export MSYS2_ARG_CONV_EXCL="*"
# XPath 2.0 for-loop: select each useless parenthesized block and format its line, column, and text.
xpath='for $i in (//(altList | labeledAlt)/alternative/element[ebnf[not(child::blockSuffix)]/block/altList[not(@ChildCount > 1)]] | //(altList | labeledAlt)[not(@ChildCount > 1)]/alternative[not(@ChildCount > 1)]/element[ebnf[not(child::blockSuffix)]/block/altList[@ChildCount > 1]] | //ebnf/block[altList[@ChildCount = 1]/alternative[@ChildCount = 1]/element/atom]) return concat("line ", $i/@Line, " col ", $i/@Column, " """, $i/@Text,"""")'
# Run the query once and keep its output; sort -u drops duplicate reports.
matches=$(trparse "$1" | trxgrep -e "$xpath" | sort -u)
# The variable holds the matched lines, not a count, so test for emptiness.
if [ -n "$matches" ]
then
	echo "$matches"
else
	echo No useless parentheses.
fi

Input grammar t1.g4

grammar g1;
a : ('a');
b : ( 'b' ) | a;
c : b | (a);
d : (b | c);

Command

bash find-useless.sh t1.g4 | sort -u

Output

line 2 col 4 "('a')"
line 3 col 4 "('b')"
line 4 col 8 "(a)"
line 5 col 4 "(b|c)"

find-useless.sh combines several node sets with the union operator ('|'), so the same block can be reported more than once. I don’t know how to produce a list of unique nodes within XPath itself, so I pipe the output through sort -u.
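The de-duplication step can be tried without Trash installed. The sketch below is only an illustration: dedupe_locations is a hypothetical helper name, and the sample lines are made up to stand in for trxgrep output. It also assumes you want the reports ordered numerically by line and column, which plain sort -u does not do, because it compares lines as text.

```shell
#!/bin/bash
# Hypothetical helper (not part of Trash): de-duplicate "line N col M ..." reports
# and order them numerically instead of lexicographically.
dedupe_locations() {
    # Fields are separated by single spaces: field 2 is the line number,
    # field 4 is the column. With -u, only one report per (line, col) pair is kept.
    sort -t ' ' -k2,2n -k4,4n -u
}

# Made-up sample input standing in for trxgrep output; one report is duplicated.
dedupe_locations <<'EOF'
line 10 col 4 "(x)"
line 2 col 4 "('a')"
line 2 col 4 "('a')"
line 3 col 4 "('b')"
EOF
```

With plain sort -u, the report for line 10 would sort ahead of line 2. The numeric keys put line 2 first and still collapse the duplicate. Because uniqueness is then decided by line and column alone, this assumes that two reports at the same position always describe the same block.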

Note: For context, the Antlr grammar is parsed using this grammar. You can view the parse tree of the input grammar from a command prompt: trparse t1.g4 | trtree.