GraSs: A Taxonomy of Grammar Smells

Organisation
global problems

Navigation
problems with navigating through the grammar

Structure
harmful relationships among grammar components

Convention
violations of visual policies

Notation
metalanguage-related

Parsing
parsing techniques related smells

Duplication
the same fragment is repeated

Misformat
misleading formatting

Misnomer
problems with naming policies

SayMyName
a misspelling or improper use of the natural language

ZigZag
mixing horizontal and vertical styles

Splat
nonterminal fragments scattered over the grammar

Misnomer

Names used within a grammar can cause many problems, most of which concern nonterminal names and labels. Many grammar notations do not support labels (decorative names for production rules or right-hand side subexpressions) [SAC-2012-Zaytsev], but realistic metalanguages tend to have them in some form. Nonterminal names, on the other hand, are essential: they are optional only in notations for regular expressions, and are present in all grammar notations of the context-free kind and beyond.

Names can be uncommunicative, like the names from the last example: abc or pqr are much worse for the readability and maintainability of the grammar than if_statement, CompilationUnit or DIGIT, similarly to how this is a problem in programming in general [ICPC-2017-BeniaminiGOF]. One can also investigate whether naming policies are present and how well they are respected. For instance, if all nonterminals are camel-cased but one is lowercase with an underscore separator, it is probably a misspelling; cases like this were reported in a MediaWiki grammar which was created by several unrelated grammar engineers [MediaWiki2011]. It can also be the case that the naming policy carries semantic meaning: typically lexical nonterminals and/or preterminals are named in uppercase, to distinguish them visually when they are used next to others like this:

if_stmt ::= IF condition THEN expression ENDIF;
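A check of whether a naming policy is respected can be sketched as follows. This is only a minimal illustration, not part of the taxonomy: the style classes, the regular expressions and the function names naming_style and style_outliers are all assumptions made for this example. Uppercase names are ignored by default, since they often denote lexical nonterminals on purpose.

```python
import re
from collections import Counter

# Hypothetical style classes; a real grammar may need more (e.g. kebab-case).
STYLES = [
    ("UPPER", re.compile(r"^[A-Z][A-Z0-9_]*$")),
    ("snake_case", re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")),
    ("CamelCase", re.compile(r"^[A-Z][a-z0-9]+([A-Z][a-z0-9]*)*$")),
    ("mixedCase", re.compile(r"^[a-z][a-z0-9]*([A-Z][a-z0-9]*)+$")),
]


def naming_style(name):
    """Return the label of the first style whose pattern matches the name."""
    for label, pattern in STYLES:
        if pattern.match(name):
            return label
    return "other"


def style_outliers(names, ignore=("UPPER",)):
    """Return the names that deviate from the most common naming style.

    Styles listed in `ignore` are skipped entirely, on the assumption that
    they are used deliberately (e.g. uppercase for lexical nonterminals).
    """
    styles = {name: naming_style(name) for name in names}
    counted = Counter(s for s in styles.values() if s not in ignore)
    if not counted:
        return []
    dominant, _ = counted.most_common(1)[0]
    return sorted(
        name for name, s in styles.items() if s not in ignore and s != dominant
    )


if __name__ == "__main__":
    nonterminals = ["CompilationUnit", "IfStatement", "WhileStatement",
                    "var_def", "DIGIT"]
    print(style_outliers(nonterminals))
```

A reported outlier is only a candidate for inspection: a deviating name may also be a deliberate choice that carries meaning.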

Sometimes naming policies represent name scoping, where a shared prefix marks which names belong together. This is considered a bad smell in OOP but is much less so in grammars because all names are global (at least up to a module level, if we have modules). An example:

VarDef ::= VarName DefKeyword VarType;
ConstDef ::= ConstName DefKeyword ConstType;
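Such prefix-based scoping can be made visible by grouping nonterminals on their first word. The sketch below assumes camel-cased names; the helpers camel_words and prefix_groups are hypothetical names introduced only for this illustration.

```python
import re
from collections import defaultdict


def camel_words(name):
    """Split a camel-cased name into its words, e.g. VarDef -> Var, Def."""
    return re.findall(r"[A-Z][a-z0-9]*", name)


def prefix_groups(names):
    """Group multi-word names by their first word.

    Only prefixes shared by at least two names are kept, since a prefix
    used once does not act as a scope.
    """
    groups = defaultdict(list)
    for name in names:
        words = camel_words(name)
        if len(words) > 1:
            groups[words[0]].append(name)
    return {prefix: ns for prefix, ns in groups.items() if len(ns) > 1}


if __name__ == "__main__":
    names = ["VarDef", "VarName", "VarType",
             "ConstDef", "ConstName", "ConstType", "DefKeyword"]
    for prefix, members in prefix_groups(names).items():
        print(prefix, members)
```

On the names from the example above, this yields one group for Var and one for Const, while DefKeyword stays outside because its prefix is not shared.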

Finally, names can be misleading and contain words that contradict the definition of the named entity. For example:

WhileStatement ::= "while" Condition Block | "repeat" Block "until" Condition ;
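A rough heuristic for spotting such contradictions is to compare the words in a nonterminal name with the quoted terminals of each alternative. The sketch below is an assumption-laden illustration, not an established detection method: it expects camel-cased names and double-quoted terminals, splits alternatives naively on the bar symbol, and uses the hypothetical function name suspicious_alternatives.

```python
import re


def name_words(name):
    """Lowercased words of a camel-cased name, e.g. WhileStatement -> while, statement."""
    return {w.lower() for w in re.findall(r"[A-Z][a-z0-9]*", name)}


def quoted_terminals(alternative):
    """Lowercased contents of all double-quoted terminals in one alternative."""
    return {t.lower() for t in re.findall(r'"([^"]*)"', alternative)}


def suspicious_alternatives(name, rhs):
    """Return alternatives whose terminals share no word with the name.

    Alternatives are reported only if at least one other alternative does
    share a word with the name; otherwise the name is not keyword-based
    and the heuristic stays silent.
    Assumption: no bar symbol occurs inside quoted terminals or groups.
    """
    words = name_words(name)
    matching, others = [], []
    for alternative in (a.strip() for a in rhs.split("|")):
        terminals = quoted_terminals(alternative)
        if not terminals:
            continue  # nothing to compare against
        (matching if words & terminals else others).append(alternative)
    return others if matching else []


if __name__ == "__main__":
    rhs = '"while" Condition Block | "repeat" Block "until" Condition'
    print(suspicious_alternatives("WhileStatement", rhs))
```

For the rule above, the first alternative matches the name through the terminal "while", so the repeat-until alternative is reported as the one that contradicts it.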

The GraSs taxonomy is a joint effort maintained by Dr. Vadim Zaytsev a.k.a. @grammarware. Page last updated in March 2021.