Suppose that you choose to describe a programming language by a context-free grammar at the character level instead of at the token level. That is, you do not use a lexical analyzer as the front-end of the compiler. Describe two important problems or issues that you think you might encounter.
Is the following context-free grammar ambiguous? Explain how you know.
S | -> | S a S |
S | -> | b |
S | -> | c |
Is the grammar of the preceding question SLR(1)?
Is every unambiguous grammar LL(1)?. Explain briefly.
Is every LL(1) grammar unambiguous? Explain briefly.
Give an example of a grammar that is SLR(1) but not LL(1).
Is the following grammar LL(1)? Explain briefly. The start symbol is S.
S | -> | A B |
A | -> | a |
A | -> | A B |
B | -> | b |
Suppose that an identifier in a particular language starts with a letter and can contain letters, digits and the underscore symbol. Using Flex (or Lex) notation, write a regular expression that describes identifiers of that form.
Let G be the following augmented grammar.
What are the FIRST and FOLLOW sets for S and C in grammar G?
Show the SLR(1) state transition diagram, with the correct set of LR(0)-items in each state, for grammar G. Give a number to each state.
Show the SLR(1) parsing table for grammar G, using your transition diagram from part (b). Are there any conflicts?