Notes for week 9

Day 1

The Parser.

  1. The items stored in the parse stack need only hold the state number. The following class is a base class, but is all that is needed for the parser. When code is generated, descendents of this class will hold information needed to generate code.
    // This is the parent stack node for a syntax stack in
    // an LR parser. Children of this node will store
    // semantic information.
    
    class  StackNode
    {
    public :
       StackNode(int s){state = s;}
       void setState(int s){state = s;}
       int seeState(){return state;} // See the state at the top of the stack 
    private: 
       int state;
    };
    
  2. Each type of parser action (shift, reduce, goto, accept, and error) has its own class. Members of these classes are stored in the appropriate locations in the parse table. The actual parser (ActionArray::Execute), calls on each of these functions to perform the appropriate operations.
    void  ActionArray::Execute(ostream &out,const  CodeArray &code,
                         const  LookUpTable &l, String vals[],Array &sourceCode)
    {
       int state = 0;
       IdTable ids;
       ThreeAddressCode codeGen;
       StateStack stack;
       //Insert the 0 state into the stack. 
       stack.insert(new  StackNode(0));
       Lexical lex(sourceCode);
       int sym = lex.getNextSymbol();
       do
       {
          //Trace the parse 
          out<<" In state > "<<state<<endl;
          out<<" Symbol "<<vals[sym].c_str()<<" Number "<<sym<<endl;
          out<<" action "; act(state,sym)->print(out,code,vals,l)<<endl;
    
          //Do the parse action 
          state = act(state,sym)->action(stack,*this,lex,ids,codeGen);
          sym = lex.getSym();//check to see if symbol updated 
     
       }while  (state >=0);
    }
    
  3. An example of the action function is the reduce. First the number of items on the right-hand side are popped. Then the state at the top of the stack is retrieved. This is used in the goto action which follows. Finally, the goto action returns the new state that must be pushed into the stack. This is done by setting the state number in temps[0] and pushing the value. The newState is returned by the function.
    int ReduceAction::action(StateStack &s,const ActionArray &a,
                Lexical &lex,IdTable &ids,ThreeAddressCode &code)const
    {
    	Array<StackNode*> temps(5,1); // grab the stuff coming out  
    	if  (popNum >0)
    		for (int i = 0;i<popNum;i++)
    			temps[i] = s.remove();  //deleted as needed in generate 
    	else 
    		temps[0] = s.getTop();// must know conditional goto storage 
      	
     	// temps[0] always comes back with an appropriate type of StackNode * 
    	int top = s.seeTop();
    	//This processes the goto part of the reduce. Get the state. 
    	int newState = a.act(top,pushSym)->action(s,a,lex,ids,code);
    	temps[0]->setState(newState);// now we have the state - set it 
    	s.insert(temps[0]);// push the new node 
    	return  newState;
    }

Day 2

  1. LR Conflicts.
    1. Example if then else ambiguity
      1 S -> if E then S
      2 S -> if E then S else S
      3 S -> a
      4 S -> b
      5 E -> x
      6 E -> y
      
    2. Consider the state with
      {S -> if E then S@, S -> if E then S@else S}
      1. else is in the follow of S (why?)
      2. on entry of else should go to the state containing S -> if E then S else@S
      3. Shift overpowers the reduce in this case.
    3. There is not general technique for building LR tables from ambiguous grammars.
  2. Canonical LR Parsers
    1. SLR has problems if you have to decide between shift and reduce for a particular table entry.
    2. Follow tells us too much. In this case else is included in Follow(S) because of the S -> if E then S else S. This is exactly the case where follow does not indicate reduce.
    3. It is also possible to have two different productions in their terminal state and nonempty intersection of follows.
      0	Z  -> S
      1	S -> E = E
      2	S -> i
      3	E -> E + i
      4	E ->	i
      
      1. Follow(S) = {$}, Follow(E) = {$,+,=}
      2. One of the states (3) contains S -> i@ and E -> i@. The entry under $ for state 3 can be either r2 or r4
    4. Solve this problem by limiting the things allowed to follow each of these productions by taking into account the history of the parse.
      1. Carry along a list of things that can follow this production with this history and split some of the states. [A -> b@g. s], where s, the lookahead, is what can follow this production at this place in the parse.
      2. Computing these new states
        1. First new closure For every item [A -> a@Bg; s] in the core, and every production B ->b, create a new item [B -> @b; t] where t = U xe s First(gx)
        2. Second start with [Z -> @S; $]
        3. Third Change new state rule to use the new closure
        4. For every completed item [A -> a@; s] we put rn into those columns corresponding to the members of s, where n is the number of production A -> a. Thus we will use the lookahead set s where we used the Follow(A).
    5. The example.
      0	Z  -> S
      1	S -> E = E
      2	S -> i
      3	E -> E + i
      4	E ->	i
      

      Compute the First of expression "=E$", First(=E$) = { = } and first of "+i=", First(+i=)={I}

      • State 0. V[e] = {[Z -> @S; {$}], [S -> @E = E; {$}], [S -> @i; {$}],
        [E -> @E+i; {= }] [E -> @i {= }] from [S->@E=E; {$}]
        because [E->@E+i; { = } ] is in the state, add [E->@E+i; { + }]
        [E -> @E+i;{+}] [E ->@i {+}] from [E->@E+i; {= }] and [E -> @i {= }]
        merged to give
        [E -> @E+i; {=,+}], [E -> @i; {=,+}]}

      • State 1 V[S] = {[Z -> S@; {$}]}

      • State 2 V[E] = {[S -> E@=E; {$}], [E -> E@+i; {=,+}]}

      • State 3 V[i] = {[S -> i@; {$}], [E -> i@; {=,+}]}

      • State 4 (from 2)V[E=] = {[S -> E=@E; {$}],
        add [E -> @E + i;{$}] and [E -> @i;{$}]
        The first adds [E -> @E + i; {+}] and [E -> @i;{+]]

      • State 5 (from 2)V[E+] = {[S -> E+@i; {=,+}
    6. Build row 3.
 

Inputs

State i = + $ S E
3   r4 r4 r2