Notes for week 10

Day 1 LookAhead L R Parsers (LALR)
  1. Comparison of the canonical LR table with the SLR table
    1. canonical has more states. (in a typical grammar 50-100 terminals 100 productions)(Aho,Sethi, Ullman pp236 and 244)
      1. Canonical has thousands of states
      2. SLR and LALR have several hundred states
    2. Many states in the canonical LR table are very similar.
    3. All of the work was done to remove one conflict in constructing the table
  2. LALR parser Brute-Force
    1. Find all of the canonical states.
    2. Find and merge all of the states haveing the same core sets. The merging process consists of taking the union of the lookahead sets of corresponding items; the result of each merger is one of the LALR states.
    3. The next-state function for LALR State Q is the state whose core set is the union of the core sets of the next-state functions of the merged items.
    4. The action-table entries are constructed as for the canonical LR parser.
  3. Example of constructing the LALR parser from the Canonical states.
    S' -> S
    S -> CC
    C -> cC | d
    
    1. Construct the canonical states
      state 0 V[e]	
      S1 -> @S; $
      S -> @CC; what can follow S in the first production? Only $
      C -> @cC; what can follow C in the production S ->@CC? First (C) = {c,d}
      C -> @d same follow {c,d}
      
      state 1	V[S]
      			S1 -> S@; $
      
      state 2	V[C]
      S -> C@C ;{$}
      C -> @CC; { $ } because this C is at the end of the production
      C -> @cC; {$}
      
      state 3	V[c]
      C -> c@C; {c,d}
      C -> @CC; {c,d} same things that follow the production because at end
      C -> @cC; {c,d}
      
      state 4	V[d]
      			C -> d@; {$}
      
      state 5 	V[CC]
      S -> CC@; {$}
      
      state 6	V[c] (from  state 2)
      C -> c@C; {$}
      C -> @cC; {$}
      C -> @d; {$}
      
      state 7	V[d] (from state 2)
      			C -> d@; {$}
      
      state 8	V[cC]  (from 3)
      			C -> cC@; {c, d}
      
      state 9	V[cC] (from 6)
      			C -> cC@; {$}
      
      
    2. Construct the unions of similar states
      States 3 and 6, form new state 36
      State 36	V[c]
      C -> c@C; {c,d,$}
      C -> @CC; {c,d,$} 
      C -> @cC; {c,d,$}
      State 47 V[d]
      C -> d@; {c,d,$}
      State 89 V[cC]
      			C -> cC@; {c,d,$}
      
      
    3. Construct the new table.
      action goto
    State c d $ S C
    0 s36 s47   1 2
    1     acc    
    2 s36 s47     5
    36 s36 s47     89
    47 r3 r3 r3    
    5     r1    
    89 r2 r2 r2    

Day 2

Intermediate code generation

  1. Productions contain syntax but also have meaning (sematics) so we can describe the sematics of a language at the same time we describe the syntax.
    1. Example:
      E1 -> E2 + E3 	{E1 := E2 + E3}
      E1 -> E2 * E3 	{E1 := E2 * E3}
      E1 -> (E2)	{E1 := E2}
      E-> i		{E = i.lexeme}
      
    2. Parse i+i*i for the expression a + b * c
      Stack Input Production Action
      $ i+i*i$ E1 -> i E1 = a
      $i +i*i$    

      parse so * is done properly
    3. Parse tree with semantic values inserted
    4. Storing intermediate code.
      1. Syntax tree most common way to store intermediate code. give tree for above.
      2. Directed Acyclic Graph (DAG)
      3. PostFix notation
      4. Three address code
      5. Intermediate languages
  2. Bottom-up translation.
    1. Semantic Stack
      1. Can be a separate stack synchronized with the parse stack
      2. Can be a descendent of parse stack that contains both syntax and semantics.
      // This is the parent stack node for a syntax stack in
      // an LR parser. Children of this node will store
      // semantic information.
      
      class  StackNode
      {
      public :
         StackNode(int s){state = s;}
         void setState(int s){state = s;}
         int seeState(){return state;} // See the state at the top of the stack 
      private: 
         int state;
      };
      
      // The parse stack is augmented to make it possible to store semantic
      // information. There are three types of node. A node for variables, types and
      // temporary storage (StoreNode). A node for Jump locations (JumpNode).
      // A node for operators (OperatorNode). 
      
      class  StoreNode :public  StackNode
      {
      public :
         StoreNode(int st, const  String &n, const String &t)
            :StackNode(st),name(n),type(t){}
         String seeName(){return  name;}
         String seeType(){return  type;}
         void setName(const  String &s){name = s;}
         void setType(const  String &t){type = t;}
      private :
         String name;// actual name of an identifier or temperary, if known 
         String type; //type of the data storage, if known 
      };
      
       // JumpNodes are used to store jump locations for both backfills and
      // forward locations. Since the locations must be integers when used 
      // as subscripts we need GetAddress to return an integer value. Code 
      // is stored as strings, so SeeAddress returns a string version of   
      // the value.  
      
      class  JumpNode:public  StackNode
      {
      public :
         JumpNode(int s,int ja):StackNode(s){jumpAddress = ja;}
         String addressAsText(){char num[10]; itoa(jumpAddress,num,10);return  String(num);}
         int seeAddress(){return  jumpAddress;}
      private :
         int jumpAddress; // Location of a jump or where jump to 
      };
      
      class  StateStack
      {
      public :
         StateStack(int size=0,int resize=1):data(size,resize){top = -1;}
         int insert(StackNode* t){data[++top]= t;return  top;}
         int seeTop(){return data[top]->seeState();}
         StackNode* getTop(){return  data[top];}
         StackNode* remove(){return  data[top--];}
         int empty(){return  top ==-1;}
      private :
         Array<StackNode*> data;
         int top;
      };
    2. The only action that generates code is the reduce action. So the action function for the Reduce action is suplemented by the code necessary to generate code. This is mainly a call to the generator function. This is a virtual function that each descendent of a base class Generator implements.
      int ReduceAction::action(StateStack &s,const ActionArray &a,
                  Lexical &lex,IdTable &ids,ThreeAddressCode &code)const
      {
      	Array<StackNode*> temps(5,1); // grab the stuff coming out to generate code 
      	if  (popNum >0)
      		for (int i = 0;i<popNum;i++)
      			temps[i] = s.remove();  //deleted as needed in generate 
      	else 
      		temps[0] = s.getTop();// must know conditional goto storage 
      	if  (actionNumber == 14)
      	{
      		StackNode *temp = new  StoreNode(temps[1]->seeState(),"+","");
      		delete  temps[1];
      		temps[1] = temp;
      	}
      	else if  (actionNumber == 16)  
      	{
      		StackNode *temp = new  StoreNode(temps[1]->seeState(),"*" ,"" );
      		delete  temps[1];
      		temps[1] = temp;
      	}
      	//add next line when generating code 
        	generators[actionNumber]->generate(ids,code,temps,popNum);
       
       	// temps[0] always comes back with an appropriate type of StackNode * 
      	int top = s.seeTop();
      	int newState = a.act(top,pushSym)->action(s,a,lex,ids,code);
      	temps[0]->setState(newState);// now we have the state - set it 
      	s.insert(temps[0]);// push the new node 
      	return  newState;
      }
    3. The Generator Class Hierarchy
      //Used to update the parse stack and generate code
      //Different children of Generator will represent
      //various actions on reduce 
      
      class  Generator
      {
      public :
         Generator(){}
         virtual void  generate(IdTable &idents,ThreeAddressCode &codeGen,
            Array<StackNode*> &s,int &num);
      };
      
      //Add variables to table in declaration section 
      class  NewVars : public  Generator
      {
      public :
         NewVars(){} //needed to bind the virtual functions
         virtual void  generate(IdTable &idents,ThreeAddressCode &codeGen,
            Array<StackNode*> &s,int &num);
      };
      
      //Transfer temporary address from nonterminal in stack to new nonterminal 
      class  ChangeNonTerm : public  Generator
      {
      public :
         ChangeNonTerm(){} //needed to bind the virtual functions
         virtual void  generate(IdTable &idents,ThreeAddressCode &codeGen,
            Array<StackNode*> &s,int &num);
      };
      
      //Test to see if variable in table in F->ident
      class  OldVars :public ChangeNonTerm
      {
      public :
         OldVars(){} //needed to bind the virtual functions 
         virtual void  generate(IdTable &idents,ThreeAddressCode &codeGen,
            Array<StackNode*> &s,int &num);
      };
      
      //Generates the code for S->ident := E. Must do type checking
      //and see if ident is defined to ensure the assignment is possible 
      class  AssignGen : public  Generator
      {
      public :
         AssignGen(){} //needed to bind the virtual functions 
         virtual void  generate(IdTable &idents,ThreeAddressCode &codeGen,
            Array<StackNode*> &s,int &num);
      };
      
      //Operator with two operands 
      class  StandardOp : public  Generator
      {
      public :
         StandardOp(){} //needed to bind the virtual functions 
         virtual  void generate(IdTable &idents,ThreeAddressCode &codeGen,
            Array<StackNode*> &s,int &num);
      };
      //Other Generator Classes 
      //******************* More of the same here followed by this list class. 
      class  Generators
      {
      public :
         Generators();
         ~Generators();
         Generator* operator [](int i){return gen[i];}
      private :
         Array<Generator*> gen;
         int size;
      };
    4. Initializing the array of generators for your grammar.
      //The array of generators has a generator that corresponds to each production.
      //The production number is used as subscript
      Generators::Generators():gen(24,10)
      {
         size = 24;
         for  (int i = 0;i<4;i++)
            gen[i] = new  Generator;
         // Generators for adding variables to table
         gen[4] = new  NewVars;//V1-> ident : type
         gen[5] = new  NewVars;//V1 -> ident : type;V1
      
         for (int i = 6;i<8;i++)
            gen[i] = new  Generator;
         // Generator for ident = E
         gen[8] = new  AssignGen;
         // Treat if thenelse and if then
         gen[9] = new  IfElseGen;
         gen[10] = new  IfGen;
         //Treat while loop
         gen[11] = new  WhileGen;
      
         for (int i = 12;i<14;i++) // S1->"begin" S "end" and S1-> epsilon
            gen[i] = new  Generator;
         // Treat the standard three address cases
         gen[14] = new  StandardOp; // E -> E+T
          gen[16] = new  StandardOp; // T -> T*F
         gen[20] = new  StandardOp; // C -> E relop E
      
         // Treat the simple Transfers
         gen[15] = new  ChangeNonTerm;// E -> T
         gen[17] = new  ChangeNonTerm;// T -> F
         gen[18] = new  ChangeNonTerm;// F -> (E)
      
         //Treat the use of a variable on the right-hand side
      
         gen[19] = new  OldVars;// F -> ident
      
      
         // Treat the marker nonterminals
      
         gen[21] = new  CondGotoGen;//CondGoToMark -> epsilon
         gen[22] = new   UnCondGoToGen; // GoToMark -> epsilon
         gen[23] = new    BackFillGen;// Mark -> epsilon 
      
      }
      
    5. The easiest generator to understand, simply destroys the nodes that have been popped and creates a new StackNode. Remember the reduce action will fill in the appropriate new state number.
      // disposes of unused nodes in case when they are not needed 
      void Generator::generate(IdTable &,ThreeAddressCode &,
                     Array<StackNode*> &s,int &num)
      {
         for (int i = 0;i<num;i++)
            delete  s[i];
         num = 1;
         s[0] = new  StackNode(0);
       }
    6. The production E -> E + E uses the StandardOp generate to create its code
      void StandardOp::generate(IdTable &idents,ThreeAddressCode &codeGen,
      								Array &s,int &num)
      {
      	// There are two operands and an operator, so pop three times
      	StoreNode *op2 = (StoreNode*)(s[0]);//first operand
      	StoreNode *op1 = (StoreNode*)(s[2]); //second operand
      	//Get the names of the operands and operator
      	String op1Name = op1->seeName();
      	String type1 = idents.getType(op1Name);
      	String op2Name = op2->seeName();
      	String type2 =  idents.getType(op2Name);
      	String opName;//holds operator depends on the actionNumber
      
      	StoreNode *op = (StoreNode*)(s[1]);
      	opName = op->seeName();
      
      	// Dispose of stack nodes since we wont need them again
      	for  (int i = 0;i<num;i++)
         		delete  s[i];
      
      	// if the operands are different types, convert the integer to real
      	if  (type1 != type2)
      		coerce(op1Name,op2Name,type1,type2,codeGen,idents);
      	//Generate the code for this operation
      	String leftSide = codeGen.genName();//Get a temperary storage location
      	idents.insert(leftSide,type1);
                  //Insert the code
      	codeGen.gen(leftSide + String(" := ")+op1Name + opName + op2Name);
      	//Update the parse stack with the new information
      	s[0] = new  StoreNode(0,leftSide,type1);//state is inserted in reduce
      	num = 1;
      }