Vadim Zaytsev aka @grammarware

GraSs: A Taxonomy of Grammar Smells


Organisation
global problems
Navigation
problems with navigating through the grammar
Structure
harmful relationships among grammar components

Convention
violations of visual policies
Notation
metalanguage-related
Parsing
parsing techniques related smells
Duplication
the same fragment is repeated

Misformat
misleading formatting
Misnomer
problems with naming policies
SayMyName
a misspelling or improper use of the natural language
ZigZag
mixing horizontal and vertical styles
Splat
nonterminal fragments scattered over the grammar

Misnomer

Related smells: Packages Not Clearly Named, Duplicate Abstraction, Inconsistent Naming Convention, Improper Quote Usage

Names used within a grammar can go wrong in many ways, mostly concerning nonterminal names and labels. Many grammar notations do not support labels (decorative names for production rules or right-hand side subexpressions) [SAC-2012-Zaytsev], but realistic metalanguages tend to have them in some form. Nonterminal names, on the other hand, are essential: they are optional only in notations for regular expressions, and are present in all grammar notations of the context-free kind and beyond.

Names can be uncommunicative, like the names from the last example: abc or pqr are much worse for the readability and maintainability of the grammar than if_statement, CompilationUnit or DIGIT, similarly to how this is a problem in programming in general [ICPC-2017-BeniaminiGOF]. One can also investigate whether naming policies are present and how well they are respected. For instance, if all nonterminals are camel-cased but one is lowercase with an underscore separator, it was probably introduced by mistake; cases like this were reported in a MediaWiki grammar which was created by several unrelated grammar engineers [MediaWiki2011]. It can also be the case that the naming policy carries semantic meaning: typically lexical nonterminals and/or preterminals are named in uppercase, to distinguish them visually when they are used next to others like this:


    if_stmt ::= IF condition THEN expression ENDIF;
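
A check for whether a naming policy is respected could be sketched as follows. This is only an illustration, not part of the GraSs taxonomy: the style patterns, the function names classify and naming_outliers, and the decision to exempt uppercase names (on the assumption that they denote lexical nonterminals or preterminals) are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical style patterns; real grammars may need other conventions.
STYLES = {
    "UPPERCASE": re.compile(r"^[A-Z][A-Z0-9_]*$"),
    "CamelCase": re.compile(r"^[A-Z][a-z0-9]+(?:[A-Z][a-z0-9]*)*$"),
    "snake_case": re.compile(r"^[a-z][a-z0-9]*(?:_[a-z0-9]+)*$"),
}

def classify(name):
    """Return the first style whose pattern matches the name, or 'other'."""
    for style, pattern in STYLES.items():
        if pattern.match(name):
            return style
    return "other"

def naming_outliers(nonterminals):
    """Find the dominant style and the names that deviate from it.

    UPPERCASE names are exempt, assuming they mark lexical nonterminals.
    """
    styles = {name: classify(name) for name in nonterminals}
    counts = Counter(s for s in styles.values() if s != "UPPERCASE")
    if not counts:
        return None, []
    dominant = counts.most_common(1)[0][0]
    outliers = [n for n, s in styles.items() if s not in (dominant, "UPPERCASE")]
    return dominant, outliers

names = ["CompilationUnit", "IfStatement", "WhileStatement", "var_def", "DIGIT"]
dominant, outliers = naming_outliers(names)
print(dominant, outliers)
```

With the sample names above, the camel-cased style dominates and var_def is the only name reported as deviating; DIGIT is skipped because of the uppercase exemption.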

Sometimes naming policies represent name scoping, which is considered a bad smell in OOP but much less so in grammars, because all names are global (at least up to the module level, if the notation has modules). An example:


    VarDef ::= VarName DefKeyword VarType;
    ConstDef ::= ConstName DefKeyword ConstType;
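
Prefix-based scoping of this kind can be made visible by grouping nonterminals on their first word. The sketch below assumes camel-cased names and treats any leading word shared by at least two multi-word names as a scope; the function name prefix_groups and that threshold are hypothetical choices for this illustration.

```python
import re
from collections import defaultdict

def prefix_groups(nonterminals):
    """Group camel-cased names by their first word; keep groups of two or more."""
    groups = defaultdict(list)
    for name in nonterminals:
        # Split "VarDef" into ["Var", "Def"]; single-word names carry no prefix.
        words = re.findall(r"[A-Z][a-z0-9]*", name)
        if len(words) >= 2:
            groups[words[0]].append(name)
    return {prefix: names for prefix, names in groups.items() if len(names) >= 2}

names = ["VarDef", "VarName", "DefKeyword", "VarType",
         "ConstDef", "ConstName", "ConstType"]
print(prefix_groups(names))
```

For the two production rules above this yields a Var group and a Const group, while DefKeyword stays ungrouped because no other name shares its prefix.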

Finally, names can be misleading and contain words that contradict the definition of the named entity. For example:


    WhileStatement ::= "while" Condition Block
                     | "repeat" Block "until" Condition ;
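
A rough heuristic for spotting such contradictions is to compare the words in a nonterminal name with the terminals of each alternative. The sketch below is an assumption-laden illustration rather than an established detection method: it represents each alternative as a list of symbols, treats double-quoted symbols as terminals, and flags alternatives whose terminals share no word with the name. The helper names name_words and misleading_alternatives are hypothetical.

```python
import re

def name_words(name):
    """Split a camel-cased or underscore-separated name into lowercase words."""
    return {w.lower() for w in re.findall(r"[A-Z]?[a-z0-9]+|[A-Z]+", name)}

def misleading_alternatives(name, alternatives):
    """Return alternatives whose quoted terminals share no word with the name.

    Alternatives without any terminals are never flagged.
    """
    words = name_words(name)
    flagged = []
    for alt in alternatives:
        terminals = {s.strip('"').lower() for s in alt if s.startswith('"')}
        if terminals and not (terminals & words):
            flagged.append(alt)
    return flagged

while_statement = [
    ['"while"', "Condition", "Block"],
    ['"repeat"', "Block", '"until"', "Condition"],
]
print(misleading_alternatives("WhileStatement", while_statement))
```

Here only the repeat/until alternative is flagged, since neither of its keywords occurs in the name WhileStatement. A heuristic like this will produce false positives for names that do not echo a keyword, so its output is at best a list of candidates for manual review.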


The GraSs taxonomy is a joint effort maintained by Dr. Vadim Zaytsev a.k.a. @grammarware. Page last updated in March 2021.