CSCI 4627
Spring 2003
Lexer for compiler project

Assigned: 1/29/03
Due: 2/5/03

The C- language

Appendix A contains a definition of a small programming language called C-. We will be designing a compiler for a subset of C-.

Write a lexer using Flex for the entire C- language. Also create a header file, tokens.h, that contains definitions of token numbers.

The tokens

See Appendix A.1 for a list of the tokens. Each reserved word is a separate token. Each of the symbols

  (  )  [  ]  {  }  ,  ;  =
is a token. Operators are combined into group tokens. Token RELOP can be any of <, >, <=, >=, ==, !=. Token ADDOP can be + or -. Token MULOP can be * or /.

Additional tokens are ID and NUM, standing for identifiers and numbers, respectively.

Comments are not tokens. They are skipped by the lexer.
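
As a rough illustration of what tokens.h might contain, here is a minimal sketch of the token numbers. It assumes the reserved words in Appendix A.1 are else, if, int, return, void and while; check that against the appendix. Every name beginning with TOK_ is hypothetical, and only ID, NUM, RELOP, ADDOP and MULOP come from this handout. Starting the numbering at 256 is a common convention, not a requirement.

```c
/* tokens.h (sketch): token numbers returned by the lexer.
   Names beginning with TOK_ are hypothetical; pick your own. */
#ifndef TOKENS_H
#define TOKENS_H

enum token_number
{
  /* Reserved words. Start at 256 so that no token number
     collides with a character code. */
  TOK_ELSE = 256, TOK_IF, TOK_INT, TOK_RETURN, TOK_VOID, TOK_WHILE,

  /* One token per punctuation symbol:  ( ) [ ] { } , ; =  */
  TOK_LPAREN, TOK_RPAREN, TOK_LBRACKET, TOK_RBRACKET,
  TOK_LBRACE, TOK_RBRACE, TOK_COMMA, TOK_SEMICOLON, TOK_ASSIGN,

  /* Group tokens and other tokens that carry attributes. */
  RELOP, ADDOP, MULOP, ID, NUM
};

#endif
```

An enum keeps the numbers distinct automatically. A list of #define lines would work equally well.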

Attributes

Tokens ID, NUM, RELOP, ADDOP and MULOP have attributes. For all except ID, the attribute is an integer. The attribute of a NUM token is the number that the lexeme stands for. The attribute of a RELOP token should be one of the values RELOP_LT, RELOP_LE, RELOP_GT, RELOP_GE, RELOP_EQ or RELOP_NE. Define those to be specific integers. The attribute of an ADDOP token should be either ADDOP_PLUS or ADDOP_MINUS. The attribute of a MULOP token should be either MULOP_TIMES or MULOP_DIVIDE.
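
The attribute values might be defined along the following lines. This is only a sketch: the particular integers are arbitrary, and the helper relop_text is a hypothetical debugging aid that is not required by the assignment.

```c
/* Attribute values for the group tokens. The integers are arbitrary;
   they only need to be distinct within each group. */
#define RELOP_LT 1
#define RELOP_LE 2
#define RELOP_GT 3
#define RELOP_GE 4
#define RELOP_EQ 5
#define RELOP_NE 6

#define ADDOP_PLUS  1
#define ADDOP_MINUS 2

#define MULOP_TIMES  1
#define MULOP_DIVIDE 2

/* Hypothetical helper: return the lexeme that a RELOP attribute
   stands for, or "?" if the value is not a RELOP attribute. */
const char* relop_text(int attr)
{
  switch (attr)
  {
    case RELOP_LT: return "<";
    case RELOP_LE: return "<=";
    case RELOP_GT: return ">";
    case RELOP_GE: return ">=";
    case RELOP_EQ: return "==";
    case RELOP_NE: return "!=";
    default:       return "?";
  }
}
```

A helper like this can make test output easier to read when you check that each operator lexeme produces the intended attribute.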

The attribute of an identifier is the lexeme (the name of the identifier).
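
Flex reuses the buffer that yytext points to each time it matches a new token, so the attribute of an ID should be a copy of the lexeme rather than yytext itself. One way to make that copy is sketched below. The function name copy_lexeme is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a freshly allocated, null-terminated copy
   of the first len characters of text, or NULL if memory runs out.
   The caller owns the result and should free it when done. */
char* copy_lexeme(const char* text, size_t len)
{
  char* copy = (char*) malloc(len + 1);
  if (copy != NULL)
  {
    memcpy(copy, text, len);
    copy[len] = '\0';
  }
  return copy;
}
```

In the action for identifiers, a call such as copy_lexeme(yytext, yyleng) would produce the string to store as the attribute.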

Create an attribute type called attribute_type. It can be defined as follows.

  typedef union
  {
    int i_attr;
    char* s_attr;
  }
  attribute_type;
Also create a global variable called lex_attr whose type is attribute_type. Your Flex rule for the <= lexeme will look like this.
"<=" 	{lex_attr.i_attr = RELOP_LE;
         return RELOP;
        }
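
Because lex_attr is a union, only one of its members is meaningful at a time, and the token number tells the caller which one to read. The sketch below shows how code that calls the lexer might use it. The function describe_token and the token numbers are hypothetical stand-ins; in your program the token numbers would come from tokens.h.

```c
#include <stdio.h>

/* Stand-ins for definitions from tokens.h; the values are hypothetical. */
enum { RELOP = 271, ADDOP, MULOP, ID, NUM };

typedef union
{
  int i_attr;
  char* s_attr;
}
attribute_type;

/* Set by the lexer just before it returns a token that has an attribute. */
attribute_type lex_attr;

/* Hypothetical helper: write a printable description of the most recent
   token into buf. Reads s_attr only for ID and i_attr only for tokens
   with integer attributes. Returns what snprintf returns. */
int describe_token(int token, char* buf, size_t size)
{
  switch (token)
  {
    case ID:
      return snprintf(buf, size, "ID(%s)", lex_attr.s_attr);
    case NUM:
      return snprintf(buf, size, "NUM(%d)", lex_attr.i_attr);
    case RELOP:
      return snprintf(buf, size, "RELOP(%d)", lex_attr.i_attr);
    case ADDOP:
      return snprintf(buf, size, "ADDOP(%d)", lex_attr.i_attr);
    case MULOP:
      return snprintf(buf, size, "MULOP(%d)", lex_attr.i_attr);
    default:
      /* Tokens without attributes: report the token number only. */
      return snprintf(buf, size, "token %d", token);
  }
}
```

A small test driver could call yylex() in a loop and print the description of each token until yylex() returns 0 at end of input.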

Look at the Flex example for more help. You can also look at the Flex manual.

What to turn in

Turn in your Flex lexer and your tokens.h file. Do not turn in any machine-generated files. Use the /export/stu/classes/csci4627/bin/handin command with assignment number 2. For example, your command might be

  /export/stu/classes/csci4627/bin/handin csci4627 2 lexer.lex tokens.h