It is possible to create a compiler that reads a variation of the BNF of a programming language and produces a table that allows another program, a generic bottom-up compiler, to compile programs written in the language that the BNF describes.
If our new language has the BNF production:
<Expression > ::= <Expression > + <Term>
The compiler that creates the table could receive either of the two the following pseudo-BNF representations.
E -> E '+' T E-> E "+" T
and create a table that a second bottom-up compiler could use to compile actual programs in the language described by the BFN. Nonterminal grammar items are identifiers and terminals are set off by single or double quotes. There are two different ways of distinguishing nonterminals because terminals may include either a single or double quote. The bottom-up compiler would compile the following expression:
a + b.
In Lab 1 you will create the first compiler of the pair, namely the compiler that takes the pseudo-BNF and produces a table for the bottom-up compiler. Lab 2 will be the bottom-up compiler and code generator for a language similar to Pascal (or Pl/0). The first compiler is a top-down compiler with a state transition lexical analyzer.
For clarity we will call the language describing pseudo-BNF PBNF. Non terminal grammar items of a language are given as identifiers in PBNF, while terminal grammar items are given as quoted strings (Single quotes (') are used if the item does not contain a single quote. Double quotes (") if the item does not contain a double quote. Either single or double quotes may be used if the string contains neither type of quote.) The two character lexeme -> is used to separate the left-hand side of a production from the right-hand side. All productions that share the same left-hand side are listed consecutively with only the first left-hand side listed. In all but the first of the production for a given left-hand side the first token is the arrow (->).
The syntax diagram for PBNF is given here.
The following is the Pl/0-like grammar that you use to test your PBNF compiler. Get a copy of this grammar by clicking here .
P1 -> P;
P -> B ".";
B -> V S1;
V -> "var" V1;
V1 -> "ident" ":" "type";
-> "ident" ":" "type" ";" V1;
S -> S1;
-> S1 ";" S;
S1 -> "ident" ":=" E;
-> "if" C "then" S1 "else" S1;
-> "if" C "then" S1;
-> "while" C "do" S1;
-> "begin" S "end";
-> ;
E -> E "+" T;
-> T;
T -> T "*" F;
-> F;
F -> "(" E ")";
-> "ident";
C -> E "relop" E;.
The grammar for PBNF is given in PBNF below.
Parser -> ProductList ".";
ProductList -> ParseRule;
-> ParseRule ProductList;
ParseRule -> OneRule;
-> OneRule OtherRules;
OneRule -> "Ident" "->" Expression;
OtherRules -> ShortRule;
-> ShortRule OtherRules;
ShortRule -> "->" Expression;
Expression -> Term";";
-> Term Expression";";
->";" ;
Term -> "Ident";
-> Factor;
Factor -> "(" Expression ")";
-> "String"; .
Part 1. Create a state-transition lexical analyzer (turn in diagrams). Click here for the shell for lab 1 parts 1 and 2.
Part 2. Create the parser for the PBNF.
Part 3. Generate the SLR table (for the bottom-up parser). Shell for this and the rest of the semester is now available (click here ). Add your own lex.h and lex.cpp files.
This compiler will be object-oriented and the general framework and some methods will be sent to you by E-mail during the semester. The following are the lexical conventions for your PBNF compiler. The complete list of states and the table are in the file you can get by clicking here .
Ident -> letter (letter | Digit)*
String ->' char (char)* ' where char not ' |
" char (char)* " where char not "
Part 1. Create a SLR parser and use this compiler with the table generated in Lab 1 part 3 to parse sample programs for the language described by the PBNF. Add the following functions to your TAGParser class.
LookUpTable getLookup(){return lookUp;}
CodeArray getCode(){return code;}
Part 2. Create a code generator for your SLR compiler and generate intermediate (three address code) for a program provide here later in the semester. Click here to get the sample code to test your compile and the correct output for this project.