9.10. LALR(1) Parsers


Spurious conflicts

Let's create an SLR(1) parse table for the following augmented grammar, G, from page 255 of the Dragon Book.

0. S S
1. S L = R
2. S R
3. L * R
4. L id
5. R L

The LR(0) finite-state machine for G has the following states.

State 0:
S′ →  S
S →  L = R
S →  R
L →  * R
L →  id
R →  L

State 1:
S′ → S 

State 2:
S → L  = R
R → L 

State 3:
S → R 

State 4:
L → *  R
R →  L
L →  * R
L →  id

State 5:
L → id 

State 6:
S → L =  R
R →  L
L →  * R
L →  id

State 7:
L → * R 

State 8:
R → L 

State 9:
S → L = R 

The FIRST and FOLLOW sets are as follows.

  FIRST(S) = {*, id}
  FIRST(L) = {*, id}
  FIRST(R) = {*, id}

  FOLLOW(S) = {$}
  FOLLOW(L) = {$, =}
  FOLLOW(R) = {$, =} 

Look at state 2. LR(0) item SL = R calls for a shift on lookahead =. But LR(0) item RL calls for a reduce on lookahead =, since FOLLOW(L) contains =. That is a parsing conflict.

But, in reality, = can only follow L when L occurs in the context of production SL = R, never when L is derived using production RL.

So the shift reduce conflict in state 2 can be resolved by choosing the shift action without affecting any parses.


The source of spurious conflicts

Imagine you have an SLR(1) grammar G1. In one part of G1 you have production

RB

and all is well. Imagine that + is not in FOLLOW(B).

Now, you modify G1 slightly by adding production

AB + C

in an entirely different section of the grammar. Call the new grammar G2.

Now + is in FOLLOW(B). That can introduce a parsing conflict in a state of G2's machine containing LR(0) item RB , even though the two parts of the grammar have nothing to do with one another.

The problem is that FOLLOW sets are global, taking information from the entire grammar.


LALR(1) parsers

A LALR(1) parser uses the same LR(0) finite-state machine that an SLR(1) parser uses. But the LALR algorithm is more sensitive, and can remove spurious conflicts like the one above, by using a more local notion of FOLLOW sets.

Grammar G above is not an SLR(1) grammar, but it is a LALR(1) grammar.

Bison uses the LALR(1) algorithm.

The ideas behind LALR(1) parsers are difficult to understand without looking at an even more powerful kind of parser, a canonical LR(1) parser, discussed in the next page.