LR Parsers
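LR(k) parsing is the most powerful shift-reduce parsing method that is still efficient.
In LR(k), L stands for left-to-right scanning of the input, R for a right-most derivation (constructed in reverse), and k for the number of lookahead symbols (when k is omitted, it is 1).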
LR-Parsers
LR parsing covers a wide range of grammars. There are three kinds of LR parsers:
SLR: simple LR parser
LR: most general LR parser
LALR: intermediate LR parser (lookahead LR parser)
SLR, LR and LALR parsers work the same way (they use the same algorithm); only their parsing tables are different.
LR Parsing Algorithm
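[Figure: model of an LR parser. The input buffer holds a1 ... ai ... an $. The stack holds S0 X1 S1 ... Xm-1 Sm-1 Xm Sm with Sm on top, where each Si is a state and each Xi is a grammar symbol. The LR parsing algorithm consults the two tables below and produces the output.]
Action Table: rows are states, columns are terminals and $, and each entry is one of four different actions.
Goto Table: rows are states, columns are non-terminals, and each entry is a state number.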
Sm and ai decide the parser action by consulting the parsing action table. (The initial stack contains just S0.) A configuration of an LR parser, ( S0 X1 S1 ... Xm Sm, ai ai+1 ... an $ ), represents the right sentential form X1 ... Xm ai ai+1 ... an $
Actions of A LR-Parser
1. shift s -- shifts the next input symbol and the state s onto the stack
( S0 X1 S1 ... Xm Sm, ai ai+1 ... an $ ) ⊢ ( S0 X1 S1 ... Xm Sm ai s, ai+1 ... an $ )
2. reduce A → β (or rn where n is a production number) -- pop 2|β| (= 2r, where r = |β|) items from the stack; then push A and s where s = goto[Sm-r, A]
( S0 X1 S1 ... Xm Sm, ai ai+1 ... an $ ) ⊢ ( S0 X1 S1 ... Xm-r Sm-r A s, ai ... an $ )
The output is the reducing production A → β.
3. Accept -- parsing successfully completed
4. Error -- the parser detected an error (an empty entry in the action table)
Reduce Action
Pop 2|β| (= 2r) items from the stack; let us assume that β = Y1Y2...Yr. Then push A and s where s = goto[Sm-r, A].
( S0 X1 S1 ... Xm-r Sm-r Y1 Sm-r+1 ... Yr Sm, ai ai+1 ... an $ ) ⊢ ( S0 X1 S1 ... Xm-r Sm-r A s, ai ... an $ )
In fact, Y1Y2...Yr is a handle:
X1 ... Xm-r A ai ... an $ ⇒ X1 ... Xm-r Y1...Yr ai ai+1 ... an $
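The four actions can be sketched as a small table-driven loop. This is only an illustration under stated assumptions: the tables are hard-coded for the grammar S → C C, C → c C | d, the production numbers 1 to 3 and the names `ACTION`, `GOTO`, `PRODUCTIONS` and `lr_parse` are choices made for this sketch, and the stack holds grammar symbols and states alternately so that a reduction pops 2r items.

```python
# Production number -> (head, length of body). Numbering is an assumption of this sketch:
#   1: S -> C C    2: C -> c C    3: C -> d
PRODUCTIONS = {1: ("S", 2), 2: ("C", 2), 3: ("C", 1)}

# ACTION[(state, terminal)] = ("s", next_state) | ("r", production) | ("acc", None)
ACTION = {
    (0, "c"): ("s", 3), (0, "d"): ("s", 4),
    (1, "$"): ("acc", None),
    (2, "c"): ("s", 6), (2, "d"): ("s", 7),
    (3, "c"): ("s", 3), (3, "d"): ("s", 4),
    (4, "c"): ("r", 3), (4, "d"): ("r", 3),
    (5, "$"): ("r", 1),
    (6, "c"): ("s", 6), (6, "d"): ("s", 7),
    (7, "$"): ("r", 3),
    (8, "c"): ("r", 2), (8, "d"): ("r", 2),
    (9, "$"): ("r", 2),
}
# GOTO[(state, non-terminal)] = next_state
GOTO = {(0, "S"): 1, (0, "C"): 2, (2, "C"): 5, (3, "C"): 8, (6, "C"): 9}


def lr_parse(tokens):
    """Return the production numbers used by the reductions, in order."""
    stack = [0]                      # S0; afterwards symbols and states alternate
    tokens = list(tokens) + ["$"]    # append the end marker
    pos, output = 0, []
    while True:
        state, a = stack[-1], tokens[pos]
        kind, arg = ACTION.get((state, a), ("err", None))
        if kind == "s":              # shift: push the input symbol and the new state
            stack += [a, arg]
            pos += 1
        elif kind == "r":            # reduce A -> beta: pop 2*|beta| items
            head, r = PRODUCTIONS[arg]
            del stack[len(stack) - 2 * r:]
            stack += [head, GOTO[(stack[-1], head)]]
            output.append(arg)
        elif kind == "acc":          # accept
            return output
        else:                        # empty table entry: error
            raise SyntaxError(f"unexpected {a!r} in state {state}")


print(lr_parse("cdd"))
```

The slice `stack[len(stack) - 2 * r:]` is used instead of `stack[-2 * r:]` so that a production with an empty body (r = 0) would pop nothing.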
[Table: parsing table for the expression grammar. Only a fragment of the goto part survives: E 1, T 2, F 3.]
Sets of LR(0) items will be the states of the action and goto tables of the SLR parser. A collection of sets of LR(0) items (the canonical LR(0) collection) is the basis for constructing SLR parsers. Augmented Grammar: G' is G with a new production rule S' → S, where S' is the new starting symbol.
[Figure: closure example. For a production B → γ, the item B → .γ is added to the closure; the items that the closure starts from are labelled as kernel items.]
Computation of Closure
function closure ( I )
begin
  J := I;
  repeat
    for each item A → α.Bβ in J and each production B → γ of G such that B → .γ is not in J do
      add B → .γ to J
  until no more items can be added to J
end
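A minimal Python sketch of this closure computation, assuming an LR(0) item is represented as a triple (head, body, dot position) and using the augmented expression grammar as sample data. The names `GRAMMAR` and `closure` and the item encoding are assumptions of this sketch, not part of the slides.

```python
# Augmented expression grammar: non-terminal -> list of production bodies.
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def closure(items, grammar=GRAMMAR):
    """items: iterable of (head, body, dot) triples. Returns a frozenset."""
    result = set(items)
    worklist = list(result)
    while worklist:                              # until no more items can be added
        head, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in grammar:   # item A -> alpha . B beta
            B = body[dot]
            for gamma in grammar[B]:             # each production B -> gamma
                item = (B, gamma, 0)             # the item B -> . gamma
                if item not in result:
                    result.add(item)
                    worklist.append(item)
    return frozenset(result)


I0 = closure({("E'", ("E",), 0)})
for head, body, dot in sorted(I0):
    print(head, "->", " ".join(body[:dot]) + "." + " ".join(body[dot:]))
```

A worklist replaces the repeat/until loop of the pseudocode; both stop when no new item can be added.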
Goto Operation
If I is a set of LR(0) items and X is a grammar symbol (terminal or non-terminal), then goto(I,X) is defined as follows: if A → α.Xβ is in I, then every item in closure({A → αX.β}) will be in goto(I,X). If I is the set of items that are valid for some viable prefix γ, then goto(I,X) is the set of items that are valid for the viable prefix γX.
Example:
I = { E' → .E, E → .E+T, E → .T, T → .T*F, T → .F, F → .(E), F → .id }
goto(I,E) = { E' → E., E → E.+T }
goto(I,T) = { E → T., T → T.*F }
goto(I,F) = { T → F. }
goto(I,() = { F → (.E), E → .E+T, E → .T, T → .T*F, T → .F, F → .(E), F → .id }
goto(I,id) = { F → id. }
The canonical LR(0) collection C is constructed as follows:
C is { closure({S' → .S}) }
repeat the following until no more sets of LR(0) items can be added to C:
  for each I in C and each grammar symbol X
    if goto(I,X) is not empty and not in C
      add goto(I,X) to C
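The goto operation and the construction of the collection can be sketched together in Python. Items are again (head, body, dot) triples; `GRAMMAR`, `closure`, `goto` and `canonical_collection` are names chosen for this sketch, and the order in which states are discovered (and hence their numbering) is an artifact of the code, not necessarily the numbering used on the slides.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def closure(items):
    result, worklist = set(items), list(items)
    while worklist:
        _, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            for gamma in GRAMMAR[body[dot]]:
                item = (body[dot], gamma, 0)
                if item not in result:
                    result.add(item)
                    worklist.append(item)
    return frozenset(result)


def goto(items, X):
    """Move the dot over X in every item A -> alpha . X beta, then close."""
    moved = {(h, b, d + 1) for (h, b, d) in items if d < len(b) and b[d] == X}
    return closure(moved)


def canonical_collection(start_item):
    C = [closure({start_item})]
    for I in C:                       # C grows while it is being scanned
        symbols = {b[d] for (_, b, d) in I if d < len(b)}
        for X in sorted(symbols):     # only symbols after a dot give a non-empty goto
            J = goto(I, X)
            if J not in C:
                C.append(J)
    return C


states = canonical_collection(("E'", ("E",), 0))
print(len(states), "sets of LR(0) items")
print(len(goto(states[0], "(")), "items in goto(I0, '(')")
```

Iterating over a Python list while appending to it is deliberate here: the loop ends exactly when a full pass adds no new set.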
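[Figure: transition diagram (DFA) of the canonical LR(0) collection for the expression grammar, states I0 to I11; the last state is I11: F → (E). The transitions are:]
I0: E to I1, T to I2, F to I3, ( to I4, id to I5
I1: + to I6
I2: * to I7
I4: E to I8, T to I2, F to I3, ( to I4, id to I5
I6: T to I9, F to I3, ( to I4, id to I5
I7: F to I10, ( to I4, id to I5
I8: ) to I11, + to I6
I9: * to I7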
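Constructing the SLR parsing table for the augmented grammar G':
1. Construct the canonical collection of sets of LR(0) items for G': C = {I0,...,In}
2. Create the parsing action table as follows:
If a is a terminal, A → α.aβ is in Ii and goto(Ii,a) = Ij, then action[i,a] is shift j.
If A → α. is in Ii, then action[i,a] is reduce A → α for all a in FOLLOW(A), where A ≠ S'.
If S' → S. is in Ii, then action[i,$] is accept.
If any conflicting actions are generated by these rules, the grammar is not SLR(1).
3. Create the parsing goto table: for all non-terminals A, if goto(Ii,A) = Ij then goto[i,A] = j.
4. All entries not defined by (2) and (3) are errors.
5. The initial state of the parser is the one containing S' → .S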
SLR(1) Grammar
An LR parser using SLR(1) parsing tables for a grammar G is called the SLR(1) parser for G. If a grammar G has an SLR(1) parsing table, it is called an SLR(1) grammar (or SLR grammar for short). Every SLR grammar is unambiguous, but not every unambiguous grammar is an SLR grammar.
Conflict Example
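Grammar:
S → L=R
S → R
L → *R
L → id
R → L
Sets of LR(0) items:
I0: S' → .S, S → .L=R, S → .R, L → .*R, L → .id, R → .L
I1: S' → S.
I2: S → L.=R, R → L.
I3: S → R.
I4: L → *.R, R → .L, L → .*R, L → .id
I5: L → id.
I6: S → L=.R, R → .L, L → .*R, L → .id
I7: L → *R.
I8: R → L.
I9: S → L=R.
In I2, the item S → L.=R calls for a shift on =, while R → L. calls for a reduction on every symbol in FOLLOW(R) = {=, $}. So action[2,=] has a shift/reduce conflict, and the grammar is not SLR(1).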
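The shift/reduce conflict in this example comes from FOLLOW(R) containing =. The following sketch computes FOLLOW for this grammar and checks state I2. It is simplified on purpose: it assumes a grammar without ε-productions, and the function names and data layout are assumptions of this sketch.

```python
GRAMMAR = [
    ("S'", ("S",)),
    ("S", ("L", "=", "R")),
    ("S", ("R",)),
    ("L", ("*", "R")),
    ("L", ("id",)),
    ("R", ("L",)),
]
NONTERMINALS = {head for head, _ in GRAMMAR}


def first_sets():
    """FIRST for non-terminals; valid only because no body is empty."""
    first = {A: set() for A in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for head, body in GRAMMAR:
            X = body[0]
            new = first[X] if X in NONTERMINALS else {X}
            if not new <= first[head]:
                first[head] |= new
                changed = True
    return first


def follow_sets():
    first = first_sets()
    follow = {A: set() for A in NONTERMINALS}
    follow["S'"].add("$")                      # end marker follows the start symbol
    changed = True
    while changed:
        changed = False
        for head, body in GRAMMAR:
            for i, X in enumerate(body):
                if X not in NONTERMINALS:
                    continue
                if i + 1 < len(body):          # X is followed by a symbol Y
                    Y = body[i + 1]
                    new = first[Y] if Y in NONTERMINALS else {Y}
                else:                          # X ends the body: inherit FOLLOW(head)
                    new = follow[head]
                if not new <= follow[X]:
                    follow[X] |= new
                    changed = True
    return follow


follow = follow_sets()
shift_symbols = {"="}            # from the item S -> L.=R in I2
reduce_symbols = follow["R"]     # from the item R -> L. in I2
conflicts = shift_symbols & reduce_symbols
print("FOLLOW(R) =", sorted(follow["R"]), "conflicts on", sorted(conflicts))
```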
Conflict Example2
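Grammar:
S → AaAb
S → BbBa
A → ε
B → ε
I0: S' → .S, S → .AaAb, S → .BbBa, A → ., B → .
Since FOLLOW(A) = {a, b} and FOLLOW(B) = {a, b}, state I0 calls for both reduce A → ε and reduce B → ε on a and on b: a reduce/reduce conflict.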
LR(1) Item
To avoid some invalid reductions, the states need to carry more information. Extra information is put into a state by including a terminal symbol as a second component in an item.
An LR(1) item is:
A → α.β, a
where a is the lookahead of the LR(1) item (a is a terminal or the end marker).
The 1 refers to the length of the second component. The lookahead has no effect in an item of the form [A → α.β, a], where β is not ε. But an item of the form [A → α., a] calls for a reduction by A → α only if the next input symbol is a. The set of such a's will be a subset of FOLLOW(A), but it could be a proper subset.
goto operation
If I is a set of LR(1) items and X is a grammar symbol (terminal or non-terminal), then goto(I,X) is defined as follows: if [A → α.Xβ, a] is in I, then every item in closure({[A → αX.β, a]}) will be in goto(I,X).
[Figure: sets of LR(1) items for the grammar S → AaAb | BbBa. Only two states survive:]
I7: S → BbB.a, $
I9: S → BbBa., $
An Example
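0. S' → S
1. S → C C
2. C → c C
3. C → d
(The productions are numbered from 0 so that the entries r1, r2 and r3 of the parsing table refer to S → C C, C → c C and C → d.)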
I0: S' → .S, $
    S → .C C, $
    C → .c C, c/d
    C → .d, c/d
I1: goto(I0, S) = (S' → S., $)
I2: goto(I0, C) = (S → C.C, $) (C → .c C, $) (C → .d, $)
I3: goto(I0, c) = (C → c.C, c/d) (C → .c C, c/d) (C → .d, c/d)
I4: goto(I0, d) = (C → d., c/d)
I5: goto(I2, C) = (S → C C., $)
I6: goto(I2, c) = (C → c.C, $) (C → .c C, $) (C → .d, $)
I7: goto(I2, d) = (C → d., $)
I8: goto(I3, C) = (C → c C., c/d)
  : goto(I3, c) = I3
  : goto(I3, d) = I4
I9: goto(I6, C) = (C → c C., $)
  : goto(I6, c) = I6
  : goto(I6, d) = I7
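The sets of LR(1) items for this example can be reproduced with a short sketch. It assumes the usual LR(1) closure rule (for an item [A → α.Bβ, a], add [B → .γ, b] for every b in FIRST(βa)) and is simplified for grammars without ε-productions, where FIRST(βa) is FIRST of the first symbol of β, or {a} if β is empty. All names are assumptions of this sketch, and the discovery order of the states need not match the slide numbering.

```python
GRAMMAR = {"S'": [("S",)], "S": [("C", "C")], "C": [("c", "C"), ("d",)]}


def first_of(symbol, seen=()):
    """FIRST of a single symbol (no epsilon-productions assumed)."""
    if symbol not in GRAMMAR:
        return {symbol}
    out = set()
    for body in GRAMMAR[symbol]:
        if body[0] != symbol and body[0] not in seen:
            out |= first_of(body[0], seen + (symbol,))
    return out


def closure(items):
    """items: (head, body, dot, lookahead) tuples."""
    result, worklist = set(items), list(items)
    while worklist:
        _, body, dot, a = worklist.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            beta = body[dot + 1:]
            lookaheads = first_of(beta[0]) if beta else {a}   # FIRST(beta a)
            for gamma in GRAMMAR[body[dot]]:
                for b in lookaheads:
                    item = (body[dot], gamma, 0, b)
                    if item not in result:
                        result.add(item)
                        worklist.append(item)
    return frozenset(result)


def goto(items, X):
    moved = {(h, b, d + 1, a) for (h, b, d, a) in items if d < len(b) and b[d] == X}
    return closure(moved)


def canonical_collection():
    C = [closure({("S'", ("S",), 0, "$")})]
    for I in C:                                   # C grows during the scan
        for X in sorted({b[d] for (_, b, d, _) in I if d < len(b)}):
            J = goto(I, X)
            if J not in C:
                C.append(J)
    return C


states = canonical_collection()
cores = {frozenset((h, b, d) for (h, b, d, _) in I) for I in states}
print(len(states), "LR(1) states,", len(cores), "distinct cores")
```

The difference between the two counts is the number of states saved by merging sets of items that differ only in their lookaheads.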
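Canonical LR(1) parsing table for the example (sN is shift N; r1, r2 and r3 are reductions by S → C C, C → c C and C → d; acc is accept; the S and C columns are the goto part):
state   c    d    $     S   C
0       s3   s4         1   2
1                 acc
2       s6   s7             5
3       s3   s4             8
4       r3   r3
5                 r1
6       s6   s7             9
7                 r3
8       r2   r2
9                 r2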
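Construction of the LR(1) parsing tables:
1. Construct the canonical collection of sets of LR(1) items for G': C = {I0,...,In}
2. Create the parsing action table as follows:
If a is a terminal, [A → α.aβ, b] is in Ii and goto(Ii,a) = Ij, then action[i,a] is shift j.
If [A → α., a] is in Ii, then action[i,a] is reduce A → α, where A ≠ S'.
If [S' → S., $] is in Ii, then action[i,$] is accept.
If any conflicting actions are generated by these rules, the grammar is not LR(1).
3. Create the parsing goto table: for all non-terminals A, if goto(Ii,A) = Ij then goto[i,A] = j.
4. All entries not defined by (2) and (3) are errors.
5. The initial state of the parser is the one containing [S' → .S, $]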
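Core
The core of a set of LR(1) items is the set of their first components, that is, the items without their lookaheads. For example, a set of LR(1) items whose first components are
S → L.=R
R → L.
has exactly these two LR(0) items as its core.
We will find the states (sets of LR(1) items) in a canonical LR(1) parser that have the same cores. Then we will merge them into a single state.
I1: L → id., =
I2: L → id., $
These have the same core, so we merge them into a new state:
I12: L → id., =
     L → id., $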
We will do this for all states of a canonical LR(1) parser to get the states of the LALR parser. In fact, the number of states of the LALR parser for a grammar is equal to the number of states of the SLR parser for that grammar.
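Creation of the LALR parsing tables:
1. Create the canonical LR(1) collection of sets of LR(1) items for the given grammar.
2. Find each core; find all sets having that same core; replace those sets with a single set which is their union.
3. Create the parsing tables (action and goto tables) from the merged sets in the same way as for the canonical LR(1) parser.
4. If no conflict is introduced, the grammar is an LALR(1) grammar. (We may only introduce reduce/reduce conflicts; we cannot introduce a shift/reduce conflict.)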
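Merging states by core can be sketched as a grouping step. In this illustration each LR(1) item is written as a pair (LR(0) item as a string, lookahead), and a few states of the S → C C, C → c C | d example are entered by hand; the dictionary layout and the helper names `core` and `merge_by_core` are assumptions of this sketch.

```python
from collections import defaultdict

# A hand-entered subset of canonical LR(1) states: name -> set of (LR(0) item, lookahead).
STATES = {
    "I3": {("C -> c.C", "c"), ("C -> c.C", "d"), ("C -> .cC", "c"),
           ("C -> .cC", "d"), ("C -> .d", "c"), ("C -> .d", "d")},
    "I4": {("C -> d.", "c"), ("C -> d.", "d")},
    "I5": {("S -> CC.", "$")},
    "I6": {("C -> c.C", "$"), ("C -> .cC", "$"), ("C -> .d", "$")},
    "I7": {("C -> d.", "$")},
    "I8": {("C -> cC.", "c"), ("C -> cC.", "d")},
    "I9": {("C -> cC.", "$")},
}


def core(state):
    """The core of a state: its items with the lookaheads dropped."""
    return frozenset(lr0 for lr0, _ in state)


def merge_by_core(states):
    groups = defaultdict(list)
    for name in sorted(states):
        groups[core(states[name])].append(name)
    merged = {}
    for names in groups.values():
        label = "I" + "".join(n[1:] for n in names)     # e.g. I3 + I6 -> I36
        merged[label] = set().union(*(states[n] for n in names))
    return merged


merged = merge_by_core(STATES)
for label in sorted(merged):
    print(label, sorted(merged[label]))
```

After merging, a reduce/reduce check would compare the lookaheads of the completed items inside each merged state; that check is not part of this sketch.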
[Figures: the transition diagram of the canonical LR(1) parser for the example grammar, redrawn after each merge of two states with the same core.]
I8 and I9 have the same core; merge them:
I89: C → c C., c/d/$
I4 and I7 have the same core; merge them:
I47: C → d., c/d/$
I3 and I6 have the same core; merge them:
I36: C → c.C, c/d/$
     C → .c C, c/d/$
     C → .d, c/d/$
The remaining states are unchanged:
I0: S' → .S, $   S → .C C, $   C → .c C, c/d   C → .d, c/d
I1: S' → S., $
I2: S → C.C, $   C → .c C, $   C → .d, $
I5: S → C C., $
Transitions of the LALR parser:
I0: S to I1, C to I2, c to I36, d to I47
I2: C to I5, c to I36, d to I47
I36: C to I89, c to I36, d to I47
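LALR parsing table for the example (r1, r2 and r3 are reductions by S → C C, C → c C and C → d):
state   c     d     $     S   C
0       s36   s47         1   2
1                   acc
2       s36   s47             5
36      s36   s47             89
47      r3    r3    r3
5                   r1
89      r2    r2    r2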
Shift/Reduce Conflict
We said that we cannot introduce a shift/reduce conflict during the shrink process for the creation of the states of an LALR parser. Assume that we can introduce a shift/reduce conflict. In this case, a state of the LALR parser must have:
A → α., a and B → β.aγ, b
This means that a state of the canonical LR(1) parser must have:
A → α., a and B → β.aγ, c
But this state also has a shift/reduce conflict, i.e. the original canonical LR(1) parser has a conflict. (The reason is that the shift operation does not depend on lookaheads.)
Reduce/Reduce Conflict
But we may introduce a reduce/reduce conflict during the shrink process for the creation of the states of an LALR parser.
I1: A → α., a
    B → β., b
I2: A → α., b
    B → β., c
Merging I1 and I2 gives:
I12: A → α., a/b
     B → β., b/c
which has a reduce/reduce conflict (on the lookahead b).
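Canonical LALR(1) collection for the grammar S → L=R | R, L → *R | id, R → L:
I0: S' → .S, $   S → .L=R, $   S → .R, $   L → .*R, $/=   L → .id, $/=   R → .L, $
I1: S' → S., $
I2: S → L.=R, $   R → L., $
I3: S → R., $
I411: L → *.R, $/=   R → .L, $/=   L → .*R, $/=   L → .id, $/=
I512: L → id., $/=
I6: S → L=.R, $   R → .L, $   L → .*R, $   L → .id, $
I713: L → *R., $/=
I810: R → L., $/=
I9: S → L=R., $
Same cores (merged): I4 and I11, I5 and I12, I7 and I13, I8 and I10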
Ex.
The unambiguous expression grammar
E → E+T | T
T → T*F | F
F → (E) | id
can be replaced by the ambiguous grammar
E → E+E | E*E | (E) | id
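Sets of LR(0) items for the ambiguous grammar:
I0: E' → .E   E → .E+E   E → .E*E   E → .(E)   E → .id
I1: E' → E.   E → E.+E   E → E.*E
I2: E → (.E)   E → .E+E   E → .E*E   E → .(E)   E → .id
I3: E → id.
I4: E → E+.E   E → .E+E   E → .E*E   E → .(E)   E → .id
I5: E → E*.E   E → .E+E   E → .E*E   E → .(E)   E → .id
I6: E → (E.)   E → E.+E   E → E.*E
I7: E → E+E.   E → E.+E   E → E.*E
I8: E → E*E.   E → E.+E   E → E.*E
I9: E → (E).
Transitions:
I0: E to I1, ( to I2, id to I3
I1: + to I4, * to I5
I2: E to I6, ( to I2, id to I3
I4: E to I7, ( to I2, id to I3
I5: E to I8, ( to I2, id to I3
I6: ) to I9, + to I4, * to I5
I7: + to I4, * to I5
I8: + to I4, * to I5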
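State I7 (reached by I0, E to I1, + to I4, E to I7) has shift/reduce conflicts on + and *:
when the current token is +: shift means + is right-associative; reduce means + is left-associative
when the current token is *: shift means * has higher precedence than +; reduce means + has higher precedence than *
State I8 (reached by I0, E to I1, * to I5, E to I8) has shift/reduce conflicts on * and +:
when the current token is *: shift means * is right-associative; reduce means * is left-associative
when the current token is +: shift means + has higher precedence than *; reduce means * has higher precedence than +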
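These choices can be written as a small decision function. The sketch below assumes the usual conventions (* binds tighter than +, both left-associative); the tables `PRECEDENCE` and `ASSOCIATIVITY` and the function `resolve` are hypothetical names introduced for illustration.

```python
# Assumed conventions for this sketch: higher number = higher precedence.
PRECEDENCE = {"+": 1, "*": 2}
ASSOCIATIVITY = {"+": "left", "*": "left"}


def resolve(rule_operator, lookahead):
    """Resolve the conflict between reducing E -> E op E and shifting the lookahead.

    rule_operator: the operator of the completed item (E -> E op E.)
    lookahead:     the current input token (+ or *)
    """
    if PRECEDENCE[rule_operator] > PRECEDENCE[lookahead]:
        return "reduce"               # the operator on the stack binds tighter
    if PRECEDENCE[rule_operator] < PRECEDENCE[lookahead]:
        return "shift"                # the incoming operator binds tighter
    # Same precedence: associativity decides.
    return "reduce" if ASSOCIATIVITY[rule_operator] == "left" else "shift"


for state, op in (("I7", "+"), ("I8", "*")):
    for token in ("+", "*"):
        print(state, "on", token, "->", resolve(op, token))
```

With these conventions, I7 reduces on + and shifts on *, while I8 reduces on both tokens.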
The parser stacks the non-terminal A and the state goto[s,A], and it resumes normal parsing. This non-terminal A is normally a basic programming block such as stmt, expr or block (there can be more than one choice for A).
The End