6.4. Ambiguity


A problem with our expression grammar

Our simple expression grammar has a problem. There are two different parse trees for expression n + n * n.

    E               E
   /|\             /|\
  / | \           / | \
 E  +  E         E  *  E
 |    /|\       /|\    |   
 n   / | \     / | \   n
    E  *  E   E  +  E
    |     |   |     |    
    n     n   n     n

The left-hand parse tree is built assuming that * has higher precedence than +. The right-hand one is built assuming that + has higher precedence than *.

Ideally, the grammar should be designed so that it tells you the structure; there should be at most one parse tree for any sequence of tokens.


Ambiguous grammars

What is wrong with the simple expression grammar is that it is ambiguous, meaning that it is possible to produce two different parse trees for the same sequence of tokens.

The usual definition of ambiguity is in terms of derivations. For any given parse tree, there is exactly one leftmost derivation.

Definition. A grammar is ambiguous if there exists a sequence of token that can be derived by two different leftmost derivations.

An obvious question is whether there is a better, unambiguous, grammar for simple expressions.