
Bottom-Up Parsing

Goal of a parser: build a derivation.

A top-down parser builds a derivation by working from the start symbol towards the input:
- builds the parse tree from the root to the leaves
- builds a leftmost derivation

A bottom-up parser builds a derivation by working from the input back toward the start symbol:
- builds the parse tree from the leaves to the root
- builds a reverse rightmost derivation: a string is reduced to the start symbol

A general style of bottom-up syntax analysis is known as shift-reduce parsing.
Two types of bottom-up parsing:
- operator-precedence parsing
- LR parsing

Shift-Reduce Parsing

- Reduce a string to the start symbol of the grammar.
- At every step a particular substring is matched (in left-to-right fashion) to the right side of some production and replaced by the LHS; this is called a reduction.
- If the substring is chosen correctly at each step, the sequence of reductions is the trace of a rightmost derivation in reverse.

Consider the grammar:

S → aABe
A → Abc | b
B → d

Reductions, in reverse order of the derivation:

abbcde → aAbcde → aAde → aABe → S

Rightmost derivation: S ⇒ aABe ⇒ aAde ⇒ aAbcde ⇒ abbcde

Handle

A handle of a string is a substring that matches the RHS of some production and whose reduction represents one step of a rightmost derivation in reverse. So we scan tokens from left to right, find the handle, and replace it by the corresponding LHS.
Formally: a handle of a right-sentential form γ is a production A → β together with a position of β in γ that satisfies the above property. That is, A → β at the position immediately after the end of α is a handle of αβw if:

S ⇒* αAw ⇒ αβw    (rightmost derivation)

Handle

- A sentential form may have many different handles.
- Right-sentential forms of a non-ambiguous grammar have one unique handle.

An Example of Bottom-Up Parsing

S → aABe
A → Abc | b
B → d

Handle Pruning

The process of discovering a handle and reducing it to the appropriate left-hand side is called handle pruning. Handle pruning forms the basis for a bottom-up parsing method.
Two problems:
- locate the handle, and
- decide which production to use (if there is more than one candidate production).

Shift-Reduce Parser Using a Stack

General construction, using a stack:
- shift input symbols onto the stack until a handle appears on top of it;
- reduce the handle to the corresponding non-terminal;
- other operations: accept when the input is consumed and only the start symbol is on the stack; otherwise, error.
The initial stack contains only the end-marker $, and the end of the input string is also marked by $.
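The shift/reduce loop above can be sketched as a small Python recognizer for the example grammar used on the following slides (Expr → Expr Op Expr | (Expr) | -Expr | num, Op → + | *). It is an assumption of this sketch that greedily reducing whenever the stack top matches some right-hand side finds the handle; that suffices for this grammar as a recognizer, but a real parser consults a parse table.

```python
# Minimal stack-based shift-reduce recognizer (a sketch, not table-driven):
# shift tokens, and reduce whenever the stack top matches some RHS.
GRAMMAR = [                       # productions as (LHS, RHS-tuple)
    ("Expr", ("Expr", "Op", "Expr")),
    ("Expr", ("(", "Expr", ")")),
    ("Expr", ("-", "Expr")),
    ("Expr", ("num",)),
    ("Op", ("+",)),
    ("Op", ("*",)),
]

def shift_reduce_parse(tokens):
    stack = ["$"]                 # bottom-of-stack marker
    tokens = tokens + ["$"]       # end-of-input marker
    i = 0
    while True:
        # REDUCE: replace a matching handle on top of the stack by its LHS
        for lhs, rhs in GRAMMAR:
            if tuple(stack[-len(rhs):]) == rhs:
                stack[-len(rhs):] = [lhs]
                break
        else:
            if tokens[i] == "$":  # input consumed: ACCEPT or ERROR
                return stack == ["$", "Expr"]
            stack.append(tokens[i])   # SHIFT
            i += 1
```

For instance, shift_reduce_parse(["num", "+", "num"]) accepts, while shift_reduce_parse(["num", "+"]) does not.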

Shift-Reduce Parser Example

Grammar:

Expr → Expr Op Expr
Expr → (Expr)
Expr → - Expr
Expr → num
Op → +
Op → *

Parsing the input num * (num + num):

Stack                      Input                  Action
$                          num * (num + num) $    SHIFT
$ num                      * (num + num) $        REDUCE Expr → num
$ Expr                     * (num + num) $        SHIFT
$ Expr *                   (num + num) $          REDUCE Op → *
$ Expr Op                  (num + num) $          SHIFT
$ Expr Op (                num + num) $           SHIFT
$ Expr Op ( num            + num) $               REDUCE Expr → num
$ Expr Op ( Expr           + num) $               SHIFT
$ Expr Op ( Expr +         num) $                 REDUCE Op → +
$ Expr Op ( Expr Op        num) $                 SHIFT
$ Expr Op ( Expr Op num    ) $                    REDUCE Expr → num
$ Expr Op ( Expr Op Expr   ) $                    REDUCE Expr → Expr Op Expr
$ Expr Op ( Expr           ) $                    SHIFT
$ Expr Op ( Expr )         $                      REDUCE Expr → (Expr)
$ Expr Op Expr             $                      REDUCE Expr → Expr Op Expr
$ Expr                     $                      ACCEPT

(At each step the parse tree grows bottom-up: every REDUCE attaches the popped symbols as children of the new non-terminal.)

Basic Idea

Goal: construct a parse tree for the input string.
- Read the input from left to right.
- Build the tree in a bottom-up fashion.
- Use a stack to hold pending sequences of terminals and non-terminals.

Example, Corresponding Parse Tree

Grammar:

S → EXP
EXP → EXP + TERM | TERM
TERM → TERM * FACT | FACT
FACT → (EXP) | ID | NUM

[Figure: parse tree for the input <id,x> + <num,2> * <id,y>: S at the root over EXP, with TERM and FACT chains leading down to the leaves.]

1. Shift until the top of the stack is the right end of a handle.
2. Pop the right end of the handle and reduce.

Conflicts During Shift-Reduce Parsing

There are context-free grammars for which shift-reduce parsers cannot be used: the stack contents and the next input symbol may not decide the action.
- shift/reduce conflict: the parser cannot decide whether to make a shift operation or a reduction.
- reduce/reduce conflict: the parser cannot decide which of several reductions to make.
If a shift-reduce parser cannot be used for a grammar, that grammar is called a non-LR(k) grammar (LR(k): Left-to-right scanning, Rightmost derivation, k-symbol lookahead).
An ambiguous grammar can never be an LR grammar.

More on Shift-Reduce Parsing

Conflicts: shift/reduce or reduce/reduce.
Example (the dangling else):

stmt → if expr then stmt
     | if expr then stmt else stmt
     | other (any other statement)

Stack: ... if expr then stmt        Input: else ...

We can't tell whether "if expr then stmt" on top of the stack is a handle: a shift/reduce conflict.

Conflict Resolution

Conflicts can be resolved by adapting the parsing algorithm (e.g., in parser generators):
- shift/reduce conflict: resolve in favor of the shift;
- reduce/reduce conflict: use the production that appears earlier in the grammar.

Shift-Reduce Parsers

There are two main categories of shift-reduce parsers:

1. Operator-precedence parsers
   - simple, but handle only a small class of grammars
2. LR parsers
   - cover a wide range of grammars
   - SLR: simple LR parser
   - LR: most general LR parser
   - LALR: intermediate LR parser (lookahead LR parser)
   - SLR, LALR and LR parsers work the same way; only their parsing tables are different.

(The grammar classes nest: SLR ⊆ LALR ⊆ LR ⊆ CFG.)

Consider the Grammar

S' → S
S → (S)S | ε

Show the actions of a shift-reduce parser for the input string ( ) using the above grammar.

Operator-Precedence Parser

Operator grammars form a small but important class of grammars, for which we can build an efficient operator-precedence parser (a shift-reduce parser). In an operator grammar, no production rule can have:
- ε on the right side, or
- two adjacent non-terminals on the right side.

Examples:

E → AB, A → a, B → b            not an operator grammar (AB: adjacent non-terminals)
E → EOE, E → id, O → + | * | /  not an operator grammar (EOE: adjacent non-terminals)
E → E+E | E*E | E/E | id        an operator grammar

Precedence Relations

In operator-precedence parsing, we define three disjoint precedence relations between certain pairs of terminals:

a <. b    b has higher precedence than a
a =. b    b has the same precedence as a
a .> b    b has lower precedence than a

The correct precedence relations between terminals are determined from the traditional notions of associativity and precedence of operators. (Unary minus causes a problem.)

Using Operator-Precedence Relations

The intention of the precedence relations is to delimit the handle of a right-sentential form: <. marks the left end, =. appears in the interior of the handle, and .> marks the right end. In the input string $a1a2...an$, we insert the precedence relation between each pair of adjacent terminals (the relation that holds between the terminals in that pair).

Using Operator-Precedence Relations

E → E+E | E-E | E*E | E/E | E^E | (E) | -E | id

The partial operator-precedence table for this grammar:

      id    +     *     $
id          .>    .>    .>
+     <.    .>    <.    .>
*     <.    .>    .>    .>
$     <.    <.    <.

Then the input string id+id*id with the precedence relations inserted will be:

$ <. id .> + <. id .> * <. id .> $

To Find the Handles

1. Scan the string from the left end until the first .> is encountered.
2. Then scan backwards (to the left) over any =. until a <. is encountered.
3. The handle contains everything to the left of the first .> and to the right of the <. found in step 2.

$ <. id .> + <. id .> * <. id .> $      E → id      $ id + id * id $
$ <. + <. id .> * <. id .> $            E → id      $ E + id * id $
$ <. + <. * <. id .> $                  E → id      $ E + E * id $
$ <. + <. * .> $                        E → E*E     $ E + E * E $
$ <. + .> $                             E → E+E     $ E + E $
$ $                                                 $ E $

Operator-Precedence Parsing Algorithm

The input string is w$, the initial stack is $, and a table holds the precedence relations between certain terminals.

Algorithm:
set p to point to the first symbol of w$ ;
repeat forever
  if ( $ is on top of the stack and p points to $ ) then return
  else {
    let a be the topmost terminal symbol on the stack and let b be the symbol pointed to by p;
    if ( a <. b or a =. b ) then {      /* SHIFT */
      push b onto the stack;
      advance p to the next input symbol;
    }
    else if ( a .> b ) then             /* REDUCE */
      repeat pop stack
      until ( the top-of-stack terminal is related by <. to the terminal most recently popped );
    else error();
  }
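A minimal Python sketch of this driver, for the grammar E → E+E | E*E | id with the partial precedence table from the slides ("<" stands for <., ">" for .>). Following the bare algorithm, error case 2 (a popped handle that matches no production) is not detected here; this sketch only checks the relations.

```python
# Operator-precedence driver sketch for E -> E+E | E*E | id.
PREC = {
    ("id", "+"): ">", ("id", "*"): ">", ("id", "$"): ">",
    ("+", "id"): "<", ("+", "+"): ">", ("+", "*"): "<", ("+", "$"): ">",
    ("*", "id"): "<", ("*", "+"): ">", ("*", "*"): ">", ("*", "$"): ">",
    ("$", "id"): "<", ("$", "+"): "<", ("$", "*"): "<",
}

def op_precedence_parse(tokens):
    stack = ["$"]                    # only terminals matter for the relations
    tokens = tokens + ["$"]
    p = 0
    while True:
        a, b = stack[-1], tokens[p]
        if a == "$" and b == "$":
            return True              # accept
        rel = PREC.get((a, b))
        if rel in ("<", "="):        # shift
            stack.append(b)
            p += 1
        elif rel == ">":             # reduce: pop until top <. last popped
            while True:
                popped = stack.pop()
                if PREC.get((stack[-1], popped)) == "<":
                    break
        else:
            return False             # blank table entry: error
```

op_precedence_parse(["id", "+", "id", "*", "id"]) accepts, reproducing the example trace on the next slide; op_precedence_parse(["id", "id"]) hits the blank (id, id) entry and reports an error.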

Operator-Precedence Parsing Algorithm -- Example

Grammar: E → E+E | E*E | id, with the precedence table:

      id    +     *     $
id          .>    .>    .>
+     <.    .>    <.    .>
*     <.    .>    .>    .>
$     <.    <.    <.

Stack               Input        Action
$                   id+id*id$    shift
$ <. id             +id*id$      reduce E → id
$                   +id*id$      shift
$ <. +              id*id$       shift
$ <. + <. id        *id$         reduce E → id
$ <. +              *id$         shift
$ <. + <. *         id$          shift
$ <. + <. * <. id   $            reduce E → id
$ <. + <. *         $            reduce E → E*E
$ <. +              $            reduce E → E+E
$                   $            accept

How to Create Operator-Precedence Relations

We use associativity and precedence relations among operators.

1. If operator O1 has higher precedence than operator O2:
   O1 .> O2 and O2 <. O1
2. If operators O1 and O2 have equal precedence:
   - they are left-associative:  O1 .> O2 and O2 .> O1
   - they are right-associative: O1 <. O2 and O2 <. O1
3. For all operators O:
   O <. id,  id .> O,  O <. (,  ( <. O,  O .> ),  ) .> O,  O .> $,  and  $ <. O
4. Also:
   ( =. )    ( <. (    ( <. id    $ <. (    $ <. id
   id .> )   id .> $   ) .> $     ) .> )

Operator-Precedence Relations

For E → E+E | E-E | E*E | E/E | E^E | (E) | -E | id (blank entries are errors; the "-" row and column behave like "+"):

      +     *     /     ^     id    (     )     $
+     .>    <.    <.    <.    <.    <.    .>    .>
*     .>    .>    .>    <.    <.    <.    .>    .>
/     .>    .>    .>    <.    <.    <.    .>    .>
^     .>    .>    .>    <.    <.    <.    .>    .>
id    .>    .>    .>    .>                .>    .>
(     <.    <.    <.    <.    <.    <.    =.
)     .>    .>    .>    .>                .>    .>
$     <.    <.    <.    <.    <.    <.

Handling Unary Minus

Operator-precedence parsing cannot handle the unary minus when we also have the binary minus in our grammar. The best approach is to let the lexical analyzer handle the problem:
- the lexical analyzer returns two different tokens for the unary minus and the binary minus;
- the lexical analyzer needs a lookahead to distinguish the binary minus from the unary minus.
Then, for any operator O, we make:

O <. unary-minus
unary-minus .> O    if unary-minus has higher precedence than O
unary-minus <. O    if unary-minus has lower (or equal) precedence than O

Precedence Functions

Compilers using operator-precedence parsers do not need to store the table of precedence relations. The table can be encoded by two precedence functions f and g that map terminal symbols to integers such that, for symbols a and b:

f(a) < g(b)    whenever a <. b
f(a) = g(b)    whenever a =. b
f(a) > g(b)    whenever a .> b
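One standard way to compute f and g is as longest-path lengths in a graph with nodes ('f', a) and ('g', a): a .> b forces f(a) > g(b) (edge f_a → g_b), and a <. b forces g(b) > f(a) (edge g_b → f_a). A sketch, assuming the table admits precedence functions (no cycles); the =. case (which merges nodes) is omitted since the example table has no =. entries:

```python
# Derive precedence functions f, g from a relation table by longest paths.
PREC = {   # the id/+/*/$ table from the earlier slides
    ("id", "+"): ">", ("id", "*"): ">", ("id", "$"): ">",
    ("+", "id"): "<", ("+", "+"): ">", ("+", "*"): "<", ("+", "$"): ">",
    ("*", "id"): "<", ("*", "+"): ">", ("*", "*"): ">", ("*", "$"): ">",
    ("$", "id"): "<", ("$", "+"): "<", ("$", "*"): "<",
}
TERMINALS = ["id", "+", "*", "$"]

def precedence_functions(prec, terminals):
    succ = {(fg, t): set() for fg in "fg" for t in terminals}
    for (a, b), rel in prec.items():
        if rel == ">":
            succ[("f", a)].add(("g", b))   # f(a) must exceed g(b)
        elif rel == "<":
            succ[("g", b)].add(("f", a))   # g(b) must exceed f(a)
    memo = {}
    def longest(n):                        # longest path starting at node n
        if n not in memo:
            memo[n] = 0                    # placeholder (also guards cycles)
            memo[n] = max((longest(m) + 1 for m in succ[n]), default=0)
        return memo[n]
    f = {t: longest(("f", t)) for t in terminals}
    g = {t: longest(("g", t)) for t in terminals}
    return f, g
```

For this table the result is f(id)=4, f(+)=2, f(*)=4, f($)=0 and g(id)=5, g(+)=1, g(*)=3, g($)=0, which satisfies all fourteen relations.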

Disadvantages of Operator-Precedence Parsing

Disadvantages:
- It cannot handle the unary minus (the lexical analyzer must handle it).
- It works only for a small class of grammars.
- It is difficult to decide which language is recognized by the grammar.

Advantages:
- simple
- powerful enough for expressions in programming languages

Error Recovery in Operator-Precedence Parsing

Error cases:
1. No relation holds between the terminal on top of the stack and the next input symbol.
2. A handle is found (reduction step), but there is no production with this handle as a right side.
Error recovery:
1. Each empty entry is filled with a pointer to an error routine.
2. The error routine decides which right-hand side the popped handle most resembles, and tries to recover from that situation.

Handling Shift/Reduce Errors

When consulting the precedence matrix to decide whether to shift or reduce, we may find that no relation holds between the topmost stack terminal and the first input symbol. To recover, we must modify (insert/change):
1. the stack, or
2. the input, or
3. both.
We must be careful that we don't get into an infinite loop.

Example: Error Entries in the Precedence Matrix

      id    (     )     $
id    e3    e3    .>    .>
(     <.    <.    =.    e4
)     e3    e3    .>    .>
$     <.    <.    e2    e1

e1: called when the whole expression is missing
    insert id onto the input; issue diagnostic: "missing operand"
e2: called when the expression begins with a right parenthesis
    delete ) from the input; issue diagnostic: "unbalanced right parenthesis"
e3: called when id or ) is followed by id or (
    insert + onto the input; issue diagnostic: "missing operator"
e4: called when the expression ends with a left parenthesis
    pop ( from the stack; issue diagnostic: "missing right parenthesis"

Exercise

Construct the operator-precedence parsing table for the grammar

E → E+E | E*E | (E) | id

LR Parsers

The most powerful (yet efficient) shift-reduce parsing method is LR(k) parsing: Left-to-right scanning, Rightmost derivation, k-symbol lookahead (when k is omitted, it is 1).

LR parsing is attractive because:
- It is the most general non-backtracking shift-reduce parsing method, yet it is still efficient.
- The class of grammars that can be parsed using LR methods is a proper superset of the class of grammars that can be parsed with predictive parsers: LL(1) grammars ⊂ LR(1) grammars.
- An LR parser can detect a syntactic error as soon as it is possible to do so on a left-to-right scan of the input.

LL(k) vs. LR(k)

- LL(k): must predict which production to use having seen only the first k tokens of its RHS.
  - Works only with some grammars.
  - But the algorithm is simple (a parser can be constructed by hand).
- LR(k): more powerful.
  - Can postpone the decision until it has seen the tokens of the entire RHS of a production, and k more beyond.

More on LR(k)

- Can recognize virtually all programming language constructs (provided a CFG can be given).
- The most general non-backtracking shift-reduce method known, yet it can be implemented efficiently.
- The class of grammars that can be parsed is a superset of the grammars parsed by LL(k).
- Can detect syntax errors as soon as possible.

More on LR(k)

Main drawback: too tedious to construct by hand for typical programming-language grammars, so we need a parser generator. Many are available:
- Yacc (Yet Another Compiler-Compiler) or bison for the C/C++ environment
- CUP (Construction of Useful Parsers) for the Java environment; JavaCC is another example
We write the grammar, and the generator produces the parser for that grammar.

LR Parsers

LR parsers cover a wide range of grammars.
- SLR: simple LR parser
- LR: most general LR parser
- LALR: intermediate LR parser (lookahead LR parser)
SLR, LALR and LR parsers work the same way (they use the same algorithm); only their parsing tables are different.

LR Parsing Algorithm

[Figure: model of an LR parser. The input is a1 ... ai ... an followed by $. The stack holds alternating states and grammar symbols S0 X1 S1 ... Xm-1 Sm-1 Xm Sm, with Sm on top. The driver consults the Action table (rows: states; columns: terminals and $; each entry is one of four actions) and the Goto table (rows: states; columns: non-terminals; each entry is a state number), and produces the output.]

Key Idea

Deciding when to shift and when to reduce is based on a DFA applied to the stack:
- the edges of the DFA are labeled by the symbols that can appear on the stack (terminals and non-terminals);
- the transition table defines the transitions (and characterizes the type of LR parser).

Entries in the Transition Table

Entry    Meaning
sn       Shift into state n (advance the input pointer to the next token)
gn       Goto state n
rk       Reduce by rule (production) k; the corresponding gn entry gives the next state
a        Accept
(blank)  Error

How to Make the Parse Table?

- Use a DFA for building parse tables: each state summarizes how much we have seen so far and what we expect to see, which helps us decide what action to take.
- How to build the DFA, then? Analyze the grammar and its productions. We need a notation that shows how much of a given production we have seen so far: the LR(0) item.

LR(0) Item

An LR(0) item is a production with a position in its RHS marked by a dot (e.g., A → α.β). The dot tells how much of the RHS we have seen so far. For example, for a production S → XYZ (where X, Y, Z are grammar symbols):

S → .XYZ : we hope to see a string derivable from XYZ
S → X.YZ : we have just seen a string derivable from X and hope to see a string derivable from YZ
S → XY.Z : we have just seen a string derivable from XY and hope to see a string derivable from Z
S → XYZ. : we have seen a string derivable from XYZ and are going to reduce it to S

SLR Parsing

The central idea in the SLR method is first to construct from the grammar a DFA that recognizes viable prefixes. We group items into sets, which become the states of the SLR parser.
Viable prefixes: the set of prefixes of right-sentential forms that can appear on the stack of a shift-reduce parser. Example: a, aa, aab, and aabb are viable prefixes of aabbbbd.
One collection of sets of LR(0) items, called the canonical LR(0) collection, provides the basis for constructing SLR parsers.

Augmented Grammar

If G is a grammar with start symbol S, then G', the augmented grammar for G, is G with a new start symbol S' and the production S' → S. The purpose of the augmenting production is to indicate to the parser when it should stop parsing and accept the input: acceptance occurs only when the parser is about to reduce by the production S' → S.

Constructing Sets of LR(0) Items

1. Create a new nonterminal S' and a new production S' → S, where S is the start symbol.
2. Put the item S' → .S into a start state called state 0.
3. Closure: if A → α.Bβ is in state s, then add B → .γ to state s for every production B → γ in the grammar.
4. Creating a new state from an old state (the goto operation): look for an item of the form A → α.xβ, where x is a single terminal or nonterminal, and build a new state from A → αx.β. Include in the new state all items with .x in the old state. A new state is created for each different x.
5. Repeat steps 3 and 4 until no new states are created. A state is new only if it is not identical to an old state.
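Steps 3 and 4 can be sketched directly in Python for the expression grammar used on the following slides. Items are encoded as tuples (lhs, rhs, dot position); this is an illustrative encoding, not a fixed convention.

```python
# Closure and goto for LR(0) items, for the grammar
# E' -> E;  E -> E+T | T;  T -> T*F | F;  F -> (E) | id.
GRAMMAR = {
    "E'": [("E",)],
    "E":  [("E", "+", "T"), ("T",)],
    "T":  [("T", "*", "F"), ("F",)],
    "F":  [("(", "E", ")"), ("id",)],
}

def closure(items):
    items = set(items)
    changed = True
    while changed:
        changed = False
        for lhs, rhs, dot in list(items):
            if dot < len(rhs) and rhs[dot] in GRAMMAR:   # dot before non-terminal B
                for prod in GRAMMAR[rhs[dot]]:
                    new = (rhs[dot], prod, 0)            # add B -> .gamma
                    if new not in items:
                        items.add(new)
                        changed = True
    return frozenset(items)

def goto(items, x):
    moved = {(l, r, d + 1) for (l, r, d) in items
             if d < len(r) and r[d] == x}                # move the dot past x
    return closure(moved)
```

closure({("E'", ("E",), 0)}) yields the seven items of state I0 built on the next slides, and goto of that set on E yields the two items of I1.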

The Closure Operation (Example)

Grammar:
E' → E
E → E+T | T
T → T*F | F
F → (E) | id

closure({[E' → .E]}) is built in stages:

{ [E' → .E] }
{ [E' → .E], [E → .E+T], [E → .T] }          add the E-productions
{ ..., [T → .T*F], [T → .F] }                add the T-productions
{ ..., [F → .(E)], [F → .id] }               add the F-productions

Final result:
{ [E' → .E], [E → .E+T], [E → .T], [T → .T*F], [T → .F], [F → .(E)], [F → .id] }

Formal Definition of the GOTO Operation for Constructing LR(0) Items

1. For each item [A → α.Xβ] ∈ I, add the set of items closure({[A → αX.β]}) to goto(I,X) if not already there.
2. Repeat step 1 until no more items can be added to goto(I,X).

The Goto Operation (Example 1)

Grammar:
E → E+T | T
T → T*F | F
F → (E) | id

Suppose I = { [E' → .E], [E → .E+T], [E → .T], [T → .T*F], [T → .F], [F → .(E)], [F → .id] }.
Then goto(I,E) = closure({ [E' → E.], [E → E.+T] }) = { [E' → E.], [E → E.+T] }.

The Goto Operation (Example 2)

Suppose I = { [E' → E.], [E → E.+T] }.
Then goto(I,+) = closure({ [E → E+.T] })
              = { [E → E+.T], [T → .T*F], [T → .F], [F → .(E)], [F → .id] }.

State 0

We start by adding the item E' → .E to state 0. This item has a "." immediately to the left of a nonterminal; whenever this is the case, we must perform step 3 (closure) of the set-construction algorithm. We add the items E → .E+T and E → .T, giving
I0: { E' → .E, E → .E+T, E → .T }
Reapplying closure to E → .T, we must add the items T → .T*F and T → .F, giving
I0: { E' → .E, E → .E+T, E → .T, T → .T*F, T → .F }
Reapplying closure to T → .F, we must add the items F → .(E) and F → .id, giving the final version of state 0:
I0: { E' → .E, E → .E+T, E → .T, T → .T*F, T → .F, F → .(E), F → .id }

Creating State 1 From State 0 [goto(I0,E)]

Using step 4, we create state 1 from the items E' → .E and E → .E+T by moving the "." one grammar symbol (here E) to the right. Closure does not add any new items, so state 1 ends up with two items:
I1: { E' → E., E → E.+T }

Creating State 2 From State 0 [goto(I0,T)]

Using step 4, we create state 2 from the items E → .T and T → .T*F by moving the "." past the T. Closure does not add additional items:
I2: { E → T., T → T.*F }

Creating State 3 From State 0 [goto(I0,F)]

Using step 4, we create state 3 from the item T → .F:
I3: { T → F. }
Since the only item in state 3 is a complete item, there are no transitions out of state 3.
[Figure: the DFA of viable prefixes after the creation of state 3.]

Creating State 4 From State 0 [goto(I0,( )]

Using step 4, we create state 4 from the item F → .(E). State 4 begins with the item F → (.E). Applying closure, we add E → .E+T and E → .T; closure of E → .T adds T → .T*F and T → .F; and closure of T → .F adds F → .(E) and F → .id, giving the final set of items
I4: { F → (.E), E → .E+T, E → .T, T → .T*F, T → .F, F → .(E), F → .id }
[Figure: the DFA after the creation of state 4.]

Creating State 5 From State 0 [goto(I0,id)]

Finally, from the item F → .id in state 0, we create state 5, with the single item
I5: { F → id. }
Since this is a complete item, no new states can be produced from state 5.
[Figure: the DFA after the creation of state 5.]

Creating State 6 From State 1 [goto(I1,+)]

State 1 consists of the items E' → E. and E → E.+T. We create state 6 from E → E.+T, giving the item E → E+.T. Closure results in
I6: { E → E+.T, T → .T*F, T → .F, F → .(E), F → .id }
[Figure: the DFA after the creation of state 6.]

Creating State 7 From State 2 [goto(I2,*)]

State 2 has the items E → T. and T → T.*F. We create state 7 from T → T.*F, giving the initial item T → T*.F. Using closure, we end up with
I7: { T → T*.F, F → .(E), F → .id }
[Figure: the DFA after the creation of state 7.]

Creating State 8 From State 4 [goto(I4,E)]

We use the items F → (.E) and E → .E+T from state 4 to form state 8:
I8: { F → (E.), E → E.+T }
No further items can be added to state 8 through closure.

Other Transitions From State 4 [goto(I4,T), goto(I4,F), goto(I4,( ), goto(I4,id)]

If we use the items E → .T and T → .T*F from state 4 to start a new state, we begin with E → T. and T → T.*F; this set is identical to state 2. Similarly, the item T → .F produces state 3, F → .(E) produces state 4, and F → .id produces state 5. These transitions do not result in new states.
[Figure: the DFA after the creation of state 8.]

Creating State 9 From State 6 [goto(I6,T)]

We use the items E → E+.T and T → .T*F from state 6 to create state 9:
I9: { E → E+T., T → T.*F }
All other transitions from state 6 go to existing states.
[Figure: the DFA after the creation of state 9.]

Creating State 10 From State 7 [goto(I7,F)]

We use the item T → T*.F from state 7 to create state 10:
I10: { T → T*F. }
All other transitions from state 7 go to existing states.
[Figure: the DFA after the creation of state 10.]

Creating State 11 From State 8 [goto(I8, ) )]

We use the item F → (E.) from state 8 to create state 11:
I11: { F → (E). }
All other transitions from state 8 go to existing states. State 9 has one transition, to the existing state 7. No other new states can be added, so we are done.
[Figure: the final DFA for viable prefixes.]

(SLR) Parsing Table for the Expression Grammar

Productions:
1) E → E+T
2) E → T
3) T → T*F
4) T → F
5) F → (E)
6) F → id

             Action table                   Goto table
State   id    +     *     (     )     $     E    T    F
0       s5                s4                1    2    3
1             s6                      acc
2             r2    s7          r2    r2
3             r4    r4          r4    r4
4       s5                s4                8    2    3
5             r6    r6          r6    r6
6       s5                s4                     9    3
7       s5                s4                          10
8             s6                s11
9             r1    s7          r1    r1
10            r3    r3          r3    r3
11            r5    r5          r5    r5

Constructing the Parse Table

Construct the DFA (state graph) as for LR(0). Then:
- Action table:
  - If there is a transition from state i to state j on a terminal a, ACTION[i, a] = shift j.
  - If there is a reduce item A → γ. (for production #k) in state i, then for each a ∈ FOLLOW(A), ACTION[i, a] = reduce k.
  - If the item S' → S. is in state i, ACTION[i, $] = accept.
  - Otherwise, error.
- Goto table: write GOTO entries for the non-terminals; for terminals the information is already embedded in the action table.

Algorithm: Construction of the SLR Parsing Table

1. Construct the canonical collection of sets of LR(0) items for G': C = {I0, ..., In}.
2. Create the parsing action table as follows:
   - If a is a terminal, A → α.aβ is in Ii, and goto(Ii,a) = Ij, then action[i,a] = shift j.
   - If A → α. is in Ii, then action[i,a] = reduce A → α for all a in FOLLOW(A), where A ≠ S'.
   - If S' → S. is in Ii, then action[i,$] = accept.
   - If these rules generate any conflicting actions, the grammar is not SLR(1).
3. Create the parsing goto table: for all non-terminals A, if goto(Ii,A) = Ij then goto[i,A] = j.
   All entries not defined by steps 2 and 3 are errors.
4. The initial state of the parser is the one containing the item S' → .S.
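Step 2 relies on FOLLOW sets. As a sketch, here is how they can be computed for the expression grammar; the code exploits the fact that this particular grammar has no ε-productions, which simplifies both FIRST and FOLLOW.

```python
# FOLLOW sets for E' -> E; E -> E+T | T; T -> T*F | F; F -> (E) | id,
# as used by rule 2 of the SLR construction.
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")), ("E", ("T",)),
    ("T", ("T", "*", "F")), ("T", ("F",)),
    ("F", ("(", "E", ")")), ("F", ("id",)),
]
NONTERMS = {"E'", "E", "T", "F"}

def first(sym):
    # FIRST for this epsilon-free grammar (a terminal is its own FIRST)
    if sym not in NONTERMS:
        return {sym}
    out = set()
    for lhs, rhs in GRAMMAR:
        if lhs == sym and rhs[0] != sym:   # skip the left-recursive alternative
            out |= first(rhs[0])
    return out

def follow_sets():
    follow = {A: set() for A in NONTERMS}
    follow["E'"].add("$")                  # end-marker follows the start symbol
    changed = True
    while changed:                         # iterate to a fixed point
        changed = False
        for lhs, rhs in GRAMMAR:
            for i, x in enumerate(rhs):
                if x not in NONTERMS:
                    continue
                if i + 1 < len(rhs):       # A -> ... X beta: add FIRST(beta)
                    new = first(rhs[i + 1])
                else:                      # A -> ... X: FOLLOW(X) includes FOLLOW(A)
                    new = follow[lhs]
                before = len(follow[x])
                follow[x] |= new
                changed |= len(follow[x]) != before
    return follow
```

The result is FOLLOW(E) = {+, ), $} and FOLLOW(T) = FOLLOW(F) = {+, *, ), $}, matching the reduce entries in the SLR table above.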

We use the partial DFA to fill in row 0 of the parse table.
By rule 2a:
action[0, (] = shift 4
action[0, id] = shift 5
By rule 3:
goto[0, E] = 1
goto[0, T] = 2
goto[0, F] = 3

After filling row 0, the table is:

State   id    +     *     (     )     $     E    T    F
0       s5                s4                1    2    3
1-11    (not yet filled)

We use the partial DFA to fill in row 1 of the parse table.
By rule 2a: action[1, +] = shift 6.
By rule 2c: action[1, $] = accept.

After filling row 1:

State   id    +     *     (     )     $     E    T    F
0       s5                s4                1    2    3
1             s6                      acc
2-11    (not yet filled)

We use the partial DFA to fill in row 5 of the parse table.
By rule 2b, we set action[5, x] = reduce F → id for each x ∈ FOLLOW(F). Since FOLLOW(F) = { ), +, *, $ }, we have
action[5, )] = reduce F → id
action[5, +] = reduce F → id
action[5, *] = reduce F → id
action[5, $] = reduce F → id

After filling row 5:

State   id    +     *     (     )     $     E    T    F
0       s5                s4                1    2    3
1             s6                      acc
5             r6    r6          r6    r6
(other rows not yet filled)

Use the DFA to Finish the SLR Table

The complete SLR parse table for the expression grammar is given on the next slide.

Parse Table for the Expression Grammar

             action                         goto
State   id    +     *     (     )     $     E    T    F
0       s5                s4                1    2    3
1             s6                      acc
2             r2    s7          r2    r2
3             r4    r4          r4    r4
4       s5                s4                8    2    3
5             r6    r6          r6    r6
6       s5                s4                     9    3
7       s5                s4                          10
8             s6                s11
9             r1    s7          r1    r1
10            r3    r3          r3    r3
11            r5    r5          r5    r5

Rules:
1. E → E+T
2. E → T
3. T → T*F
4. T → F
5. F → (E)
6. F → id

Notation: s5 = shift 5; r2 = reduce by E → T.

Example SLR Grammar and LR(0) Items

Augmented grammar:
1. C' → C
2. C → A B
3. A → a
4. B → a

I0 = closure({[C' → .C]}):   C' → .C, C → .AB, A → .a
I1 = goto(I0,C):             C' → C.   (final state)
I2 = goto(I0,A):             C → A.B, B → .a
I3 = goto(I0,a):             A → a.
I4 = goto(I2,B):             C → AB.
I5 = goto(I2,a):             B → a.

Example SLR Parsing Table

States:
I0: C' → .C, C → .AB, A → .a
I1: C' → C.
I2: C → A.B, B → .a
I3: A → a.
I4: C → AB.
I5: B → a.

With FOLLOW(A) = {a} and FOLLOW(B) = FOLLOW(C) = {$}:

State   a     $     C    A    B
0       s3          1    2
1             acc
2       s5                    4
3       r3
4             r2
5             r4

Grammar:
1. C' → C
2. C → A B
3. A → a
4. B → a

Actions of an LR Parser

1. shift s: shift the next input symbol and the state s onto the stack:
   ( S0 X1 S1 ... Xm Sm, ai ai+1 ... an $ ) → ( S0 X1 S1 ... Xm Sm ai s, ai+1 ... an $ )
2. reduce A → β (or rn, where n is a production number): pop 2|β| items from the stack, then push A and the state s given by the goto entry; the output is the reducing production A → β.
3. accept: parsing successfully completed.
4. error: the parser detected an error (an empty entry in the action table).
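These four actions make up the whole driver loop. A sketch in Python, using the SLR table for the expression grammar from the earlier slides (state numbers as shown there); here the stack holds states only, which is equivalent since the grammar symbols are implied by the transitions:

```python
# Table-driven LR driver for E -> E+T | T; T -> T*F | F; F -> (E) | id.
PRODS = [None,                      # rule 0 unused (augmented production)
         ("E", 3), ("E", 1),        # 1: E -> E+T   2: E -> T
         ("T", 3), ("T", 1),        # 3: T -> T*F   4: T -> F
         ("F", 3), ("F", 1)]        # 5: F -> (E)   6: F -> id
ACTION = {
    (0, "id"): "s5", (0, "("): "s4",
    (1, "+"): "s6", (1, "$"): "acc",
    (2, "+"): "r2", (2, "*"): "s7", (2, ")"): "r2", (2, "$"): "r2",
    (3, "+"): "r4", (3, "*"): "r4", (3, ")"): "r4", (3, "$"): "r4",
    (4, "id"): "s5", (4, "("): "s4",
    (5, "+"): "r6", (5, "*"): "r6", (5, ")"): "r6", (5, "$"): "r6",
    (6, "id"): "s5", (6, "("): "s4",
    (7, "id"): "s5", (7, "("): "s4",
    (8, "+"): "s6", (8, ")"): "s11",
    (9, "+"): "r1", (9, "*"): "s7", (9, ")"): "r1", (9, "$"): "r1",
    (10, "+"): "r3", (10, "*"): "r3", (10, ")"): "r3", (10, "$"): "r3",
    (11, "+"): "r5", (11, "*"): "r5", (11, ")"): "r5", (11, "$"): "r5",
}
GOTO = {(0, "E"): 1, (0, "T"): 2, (0, "F"): 3, (4, "E"): 8, (4, "T"): 2,
        (4, "F"): 3, (6, "T"): 9, (6, "F"): 3, (7, "F"): 10}

def lr_parse(tokens):
    tokens = tokens + ["$"]
    stack, p, output = [0], 0, []       # stack of states; output = rule numbers
    while True:
        act = ACTION.get((stack[-1], tokens[p]))
        if act is None:
            return None                 # error: blank entry
        if act == "acc":
            return output               # the reductions, in order
        if act[0] == "s":
            stack.append(int(act[1:]))  # shift: push the next state
            p += 1
        else:                           # reduce by rule k
            lhs, n = PRODS[int(act[1:])]
            del stack[-n:]              # pop |RHS| states
            stack.append(GOTO[(stack[-1], lhs)])
            output.append(int(act[1:]))
```

lr_parse(["id", "*", "id", "+", "id"]) returns [6, 4, 6, 3, 2, 6, 4, 1], i.e. the reduction sequence F→id, T→F, F→id, T→T*F, E→T, F→id, T→F, E→E+T shown in the example trace on the following slide.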

LR Parsing Algorithm

Refer to the text: Compilers: Principles, Techniques, and Tools by Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman, pages 218-219.

Actions of an (S)LR Parser -- Example

Stack              Input        Action               Output
0                  id*id+id$    shift 5
0 id 5             *id+id$      reduce by F → id     F → id
0 F 3              *id+id$      reduce by T → F      T → F
0 T 2              *id+id$      shift 7
0 T 2 * 7          id+id$       shift 5
0 T 2 * 7 id 5     +id$         reduce by F → id     F → id
0 T 2 * 7 F 10     +id$         reduce by T → T*F    T → T*F
0 T 2              +id$         reduce by E → T      E → T
0 E 1              +id$         shift 6
0 E 1 + 6          id$          shift 5
0 E 1 + 6 id 5     $            reduce by F → id     F → id
0 E 1 + 6 F 3      $            reduce by T → F      T → F
0 E 1 + 6 T 9      $            reduce by E → E+T    E → E+T
0 E 1              $            accept

Exercise

Consider the following grammar of simplified statement sequences:

stmt_sequence → stmt_sequence ; stmt | stmt
stmt → s

a) Construct the DFA of LR(0) items of this grammar.
b) Construct the SLR parsing table.
c) Show the parsing stack and the actions of the SLR parser for the input string s;s;s.

Shift/Reduce and Reduce/Reduce Conflicts

- If a state does not know whether it will make a shift operation or a reduction for a terminal, we say that there is a shift/reduce conflict.
- If a state does not know whether it will make a reduction using production rule i or j for a terminal, we say that there is a reduce/reduce conflict.
- If the SLR parsing table of a grammar G has a conflict, we say that G is not an SLR grammar.

Conflict Example

Grammar:
S' → S
S → L=R | R
L → *R | id
R → L

I0: S' → .S        I1: S' → S.        I6: S → L=.R       I9: S → L=R.
    S → .L=R       I2: S → L.=R           R → .L
    S → .R             R → L.             L → .*R
    L → .*R        I3: S → R.             L → .id
    L → .id        I4: L → *.R        I7: L → *R.
    R → .L             R → .L          I8: R → L.
                       L → .*R
                       L → .id
I5: L → id.

Problem: FOLLOW(R) = {=, $}. In state I2, on the input symbol "=", the parser can either shift 6 or reduce by R → L: a shift/reduce conflict.

Conflict Example 2

Grammar:
S' → S
S → AaAb | BbBa
A → ε
B → ε

I0: S' → .S
    S → .AaAb
    S → .BbBa
    A → .
    B → .

Problem: FOLLOW(A) = {a, b} and FOLLOW(B) = {a, b}. In I0, on input "a" (and likewise on "b"), the parser can reduce by A → ε or by B → ε: a reduce/reduce conflict.

SLR(1)
There is an easy fix for some of the shift/reduce or reduce/reduce conflicts:
it requires looking one token ahead (called the lookahead token).

Steps to resolve the conflicts of an itemset:

1) for each shift item Y → β.γ, find FIRST(γ)
2) for each reduction item X → α., find FOLLOW(X)
3) if the FOLLOW(X) sets do not overlap with each other or with any of the
FIRST sets, you have resolved the conflict!

e.g., for the itemset with E → T. and

T → T.*F

FOLLOW(E) = { $, +, ) }
FIRST(*F) = { * }
no overlapping!

This is an SLR(1) grammar; SLR(1) is more powerful than LR(0).
125
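Steps 1-3 amount to a pairwise set-disjointness check. A minimal sketch (the function name slr_conflict_free is made up for illustration), applied to the two itemsets discussed on these slides:

```python
def slr_conflict_free(reduce_follows, shift_firsts):
    """True iff all FOLLOW sets (one per reduction item) are pairwise
    disjoint from each other and from every shift FIRST set."""
    sets = list(reduce_follows) + list(shift_firsts)
    return all(a.isdisjoint(b)
               for i, a in enumerate(sets) for b in sets[i + 1:])

# Itemset {E -> T. , T -> T.*F}: FOLLOW(E) vs FIRST(*F) -- resolved.
print(slr_conflict_free([{'$', '+', ')'}], [{'*'}]))       # True

# Itemset I2 = {S -> L.=R , R -> L.} of the earlier conflict example:
# '=' is in both FOLLOW(R) and FIRST(=R) -- not resolved by SLR(1).
print(slr_conflict_free([{'=', '$'}], [{'='}]))            # False
```

The second call shows why the grammar with S → L=R is not SLR(1), matching the shift/reduce conflict found on the earlier slide.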

General LR(1) Parsing

The SLR(1) trick doesn't always work.
The difficulty with the SLR(1) method is that it applies lookaheads after
the construction of the DFA of LR(0) items.
The power of the general LR(1) method is that it uses a new DFA that has
the lookaheads built into its construction from the start.
This DFA uses an extension of LR(0) items, i.e. LR(1) items.
A single lookahead token is attached to each item.
An LR(1) item is a pair consisting of an LR(0) item and a lookahead
token, of the form
[A → α.β, a]   where A → α.β is an LR(0) item and a is a token.

126

LR(1) items
An LR(1) item is:

A → α.β , a

where a is the lookahead of the LR(1) item
(a is a terminal or the end-marker $).

When β (in the LR(1) item A → α.β, a) is not empty, the lookahead
has no effect.

When β is empty (A → α., a), we do the reduction by A → α only if
the next input symbol is a (not for every terminal in FOLLOW(A)).

A state will contain  A → α., a1   where {a1,...,an} ⊆ FOLLOW(A)
                      ...
                      A → α., an
127

Canonical Collection of Sets of LR(1) Items

The construction of the canonical collection of the sets of LR(1) items
is similar to the construction of the canonical collection of the sets of
LR(0) items, except that the closure and goto operations work a little
differently.

closure(I) is: (where I is a set of LR(1) items)

every LR(1) item in I is in closure(I)

if A → α.Bβ, a is in closure(I) and B → γ is a production rule of G,
then B → .γ, b will be in closure(I) for each terminal b in
FIRST(βa).
128

goto operation
If I is a set of LR(1) items and X is a grammar symbol
(terminal or non-terminal), then goto(I,X) is defined as
follows:
If A → α.Xβ, a is in I
then every item in closure({A → αX.β, a}) will be in
goto(I,X).

129

Construction of The Canonical LR(1) Collection

Algorithm:
C is { closure({S' → .S, $}) }
repeat the following until no more sets of LR(1) items can be added to C:
for each I in C and each grammar symbol X
if goto(I,X) is not empty and not in C
add goto(I,X) to C

The goto function is a DFA on the sets in C.

130
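The closure/goto definitions and the collection loop above can be sketched directly. This is a minimal sketch for the example grammar of the next slides (S' → S, S → CC, C → cC | d); FIRST is hard-coded by hand, and since no symbol derives ε here, FIRST(βa) is just the FIRST set of the first symbol of βa:

```python
# Sketch of closure, goto and the collection loop for the example grammar
#     S' -> S,   S -> C C,   C -> c C | d
# Items are tuples (lhs, rhs, dot, lookahead).
GRAMMAR = {'S': [('C', 'C')], 'C': [('c', 'C'), ('d',)]}
FIRST = {'S': {'c', 'd'}, 'C': {'c', 'd'}, 'c': {'c'}, 'd': {'d'}, '$': {'$'}}

def closure(items):
    items, work = set(items), list(items)
    while work:
        lhs, rhs, dot, la = work.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:    # dot before a nonterminal B
            beta_a = rhs[dot + 1:] + (la,)
            for prod in GRAMMAR[rhs[dot]]:            # each rule B -> gamma
                for b in FIRST[beta_a[0]]:            # each b in FIRST(beta a)
                    it = (rhs[dot], prod, 0, b)       # add B -> .gamma, b
                    if it not in items:
                        items.add(it); work.append(it)
    return frozenset(items)

def goto(I, X):
    moved = {(l, r, d + 1, a) for (l, r, d, a) in I if d < len(r) and r[d] == X}
    return closure(moved) if moved else None

# The collection loop: start from closure({S' -> .S, $}) and saturate.
I0 = closure({("S'", ('S',), 0, '$')})
C, work = [I0], [I0]
while work:
    I = work.pop(0)
    for X in ('S', 'C', 'c', 'd'):
        J = goto(I, X)
        if J is not None and J not in C:
            C.append(J); work.append(J)

print(len(I0), len(C))   # 6 items in I0; 10 states, matching I0..I9 on the slides
```

The six items of I0 and the ten states reproduce exactly the sets built by hand on the following slides.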

A Short Notation for The Sets of LR(1) Items

A set of LR(1) items containing the following items

A → α.β , a1
...
A → α.β , an

can be written as

A → α.β , a1/a2/.../an
131

LR(1) Items - Example

S → CC
C → cC | d

Augmented Grammar:
S' → S
S → CC
C → cC | d

Start with closure({S' → .S, $})
I0: S' → .S, $
    S → .CC, $
    C → .cC, c/d
    C → .d, c/d
132

State 1 from State 0: goto(I0, S)

I1: { S' → S., $ }

I0 --S--> I1

133

State 2 from State 0: goto(I0, C)
State 3 from State 0: goto(I0, c)
State 4 from State 0: goto(I0, d)

I2: { S → C.C, $
      C → .cC, $
      C → .d, $ }

I3: { C → c.C, c/d
      C → .cC, c/d
      C → .d, c/d }

I4: { C → d., c/d }

The DFA up to this point is shown on the next slide.
134

I0: S' → .S, $        --S-->  I1: S' → S., $
    S → .CC, $
    C → .cC, c/d      --C-->  I2: S → C.C, $
    C → .d, c/d                   C → .cC, $
                                  C → .d, $

                      --c-->  I3: C → c.C, c/d
                                  C → .cC, c/d
                                  C → .d, c/d

                      --d-->  I4: C → d., c/d
135

New states from State 2: goto(I2, C), goto(I2, c), goto(I2, d)

I5: { S → CC., $ }

I6: { C → c.C, $
      C → .cC, $
      C → .d, $ }

I7: { C → d., $ }
136

[Slide content lost: the complete DFA of the sets of LR(1) items for the example grammar.]
137

Construction of LR(1) Parsing Tables

1. Construct the canonical collection of sets of LR(1) items for G'.
   C ← {I0,...,In}

2. Create the parsing action table as follows:
   If a is a terminal, A → α.aβ, b is in Ii and goto(Ii,a)=Ij, then
   action[i,a] is shift j.
   If A → α., a is in Ii , then action[i,a] is reduce A → α (where A ≠ S').
   If S' → S., $ is in Ii , then action[i,$] is accept.
   If any conflicting actions are generated by these rules, the grammar
   is not LR(1).

3. Create the parsing goto table:
   for all non-terminals A, if goto(Ii,A)=Ij then goto[i,A]=j

4. All entries not defined by (2) and (3) are errors.

5. The initial state of the parser is the one containing S' → .S, $.
138
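Steps 1-5 can be sketched end-to-end for the example grammar S' → S, S → CC, C → cC | d (closure and goto are the operations defined on the earlier slides; FIRST is hard-coded by hand). The resulting tables are then exercised by a small driver:

```python
# Sketch of steps 1-5 for the grammar  S' -> S,  S -> C C,  C -> c C | d.
# Items are (lhs, rhs, dot, lookahead).
GRAMMAR = {'S': [('C', 'C')], 'C': [('c', 'C'), ('d',)]}
FIRST = {'S': {'c', 'd'}, 'C': {'c', 'd'}, 'c': {'c'}, 'd': {'d'}, '$': {'$'}}

def closure(items):
    items, work = set(items), list(items)
    while work:
        lhs, rhs, dot, la = work.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            beta_a = rhs[dot + 1:] + (la,)
            for prod in GRAMMAR[rhs[dot]]:
                for b in FIRST[beta_a[0]]:       # FIRST(beta a); no epsilon here
                    it = (rhs[dot], prod, 0, b)
                    if it not in items:
                        items.add(it); work.append(it)
    return frozenset(items)

def goto(I, X):
    moved = {(l, r, d + 1, a) for (l, r, d, a) in I if d < len(r) and r[d] == X}
    return closure(moved) if moved else None

# Step 1: canonical collection; discovery order reproduces I0..I9 of the slides.
I0 = closure({("S'", ('S',), 0, '$')})
states, work = [I0], [I0]
while work:
    I = work.pop(0)
    for X in ('S', 'C', 'c', 'd'):
        J = goto(I, X)
        if J is not None and J not in states:
            states.append(J); work.append(J)

# Steps 2-4: fill the action and goto tables; missing entries are errors.
ACTION, GOTO_T = {}, {}
for i, I in enumerate(states):
    for (lhs, rhs, dot, la) in I:
        if dot < len(rhs) and rhs[dot] not in GRAMMAR:          # shift
            ACTION[i, rhs[dot]] = ('s', states.index(goto(I, rhs[dot])))
        elif dot == len(rhs) and lhs != "S'":                   # reduce A -> alpha
            ACTION[i, la] = ('r', lhs, len(rhs))
        elif dot == len(rhs):                                   # S' -> S. , $
            ACTION[i, '$'] = ('acc',)
    for X in GRAMMAR:
        if goto(I, X) is not None:
            GOTO_T[i, X] = states.index(goto(I, X))

def parse(s):
    """Run the tables over input s; True iff the parser reaches accept."""
    stack, toks, i = [0], list(s) + ['$'], 0
    while True:
        act = ACTION.get((stack[-1], toks[i]))
        if act is None:
            return False                                        # error entry
        if act[0] == 's':
            stack.append(act[1]); i += 1
        elif act[0] == 'r':
            del stack[len(stack) - act[2]:]                     # pop |rhs| states
            stack.append(GOTO_T[stack[-1], act[1]])
        else:
            return True

print(parse('cdd'), parse('cd'))    # True False
```

The language of the grammar is two C's, each matching c*d, so cdd is accepted and cd (a single C) is rejected with an empty action-table entry.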

Canonical LR Parsing Table

          ACTION              GOTO
state    c     d     $       S    C
  0     s3    s4             1    2
  1                 acc
  2     s6    s7                  5
  3     s3    s4                  8
  4     r3    r3
  5                  r1
  6     s6    s7                  9
  7                  r3
  8     r2    r2
  9                  r2

1. S → CC    2. C → cC    3. C → d
139

LALR(1)
If the lookaheads s1 and s2 are different, then the items A → α., s1
and A → α., s2 are different.
This leads to a large number of states, since the number of combinations
of expected lookahead symbols can be very large.

We can combine the two states into one by creating an item A → α., s3,
where s3 is the union of s1 and s2.
LALR(1) is weaker than LR(1) but more powerful than SLR(1).
An LALR(1) parser has the same number of states as an LR(0) parser.
Most parser generators are LALR(1), including CUP (Constructor of
Useful Parsers).

140

Practical Considerations
How to avoid reduce/reduce and shift/reduce conflicts:
left recursion is good, right recursion is bad

Most shift/reduce conflicts are easy to remove by assigning precedence
and associativity to operators:
+ and * are left-associative
* has higher precedence than +

141

LALR Parsing Tables


LALR stands for LookAhead LR.

LALR parsers are often used in practice because LALR parsing tables
are smaller than LR(1) parsing tables.
The number of states in the SLR and LALR parsing tables for a grammar G
is the same.
But LALR parsers recognize more grammars than SLR parsers.
yacc creates an LALR parser for the given grammar.
A state of an LALR parser is again a set of LR(1) items.

142

Creating LALR Parsing Tables

Canonical LR(1) Parser  --(shrink # of states)-->  LALR Parser

This shrinking process may introduce a reduce/reduce conflict in the
resulting LALR parser (in which case the grammar is NOT LALR).
But this shrinking process does not produce a shift/reduce conflict.

143

The Core of A Set of LR(1) Items

We find the states (sets of LR(1) items) in a canonical LR(1) parser
that have the same core (the same items ignoring lookaheads), and merge
them into a single state.

I4: C → d., c/d
I7: C → d., $        have the same core; merge them

A new state:

I47: C → d., c/d/$

We do this for all states of the canonical LR(1) parser to get the states
of the LALR parser.
In fact, the number of states of the LALR parser for a grammar is
equal to the number of states of the SLR parser for that grammar.

144

I3: C → c.C, c/d        I6: C → c.C, $
    C → .cC, c/d            C → .cC, $
    C → .d, c/d             C → .d, $

I3 and I6 have the same core; merge them into a new state:

I36: C → c.C, c/d/$
     C → .cC, c/d/$
     C → .d, c/d/$

145

I8 and I9 are replaced by their union:

I89: { C → cC., c/d/$ }

146

Creation of LALR Parsing Tables

Create the canonical LR(1) collection of the sets of LR(1) items for
the given grammar.
Find each core; find all sets having that same core; replace those sets
having the same core with a single set which is their union.
C = {I0,...,In}  ⇒  C' = {J1,...,Jm}  where m ≤ n
Create the parsing tables (action and goto tables) in the same way as for
an LR(1) parser.
Note that: if J = I1 ∪ ... ∪ Ik then, since I1,...,Ik have the same core,
the cores of goto(I1,X),...,goto(Ik,X) must also be the same.
So goto(J,X) = K, where K is the union of all sets of items having the
same core as goto(I1,X).

If no conflict is introduced, the grammar is an LALR(1) grammar.
(We may only introduce reduce/reduce conflicts; we cannot introduce
a shift/reduce conflict.)
147
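The core-merging step can be sketched on the states copied from the earlier slides (I3/I6 and I4/I7); states are represented as frozensets of (lhs, rhs, dot, lookahead) items:

```python
# Sketch of the LALR shrinking step: union all canonical-LR(1) states
# that share the same core (the same items with lookaheads stripped).
from collections import defaultdict

I3 = frozenset({('C', ('c', 'C'), 1, 'c'), ('C', ('c', 'C'), 1, 'd'),
                ('C', ('c', 'C'), 0, 'c'), ('C', ('c', 'C'), 0, 'd'),
                ('C', ('d',), 0, 'c'), ('C', ('d',), 0, 'd')})
I6 = frozenset({('C', ('c', 'C'), 1, '$'),
                ('C', ('c', 'C'), 0, '$'), ('C', ('d',), 0, '$')})
I4 = frozenset({('C', ('d',), 1, 'c'), ('C', ('d',), 1, 'd')})
I7 = frozenset({('C', ('d',), 1, '$')})

def core(state):
    """The LR(0) core of a state: its items with the lookaheads stripped."""
    return frozenset((l, r, d) for (l, r, d, a) in state)

def merge_by_core(states):
    """Union all states that share a core -- producing the LALR states."""
    groups = defaultdict(set)
    for s in states:
        groups[core(s)] |= s          # lookaheads of equal-core items pile up
    return [frozenset(s) for s in groups.values()]

merged = merge_by_core([I3, I6, I4, I7])
print(len(merged))                    # 2 states: I36 and I47
```

The two merged states carry the combined lookahead sets c/d/$, exactly the I36 and I47 shown on the preceding slides.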

LALR Parsing Table

          ACTION              GOTO
state    c     d     $       S    C
  0     s36   s47            1    2
  1                 acc
  2     s36   s47                 5
 36     s36   s47                 89
 47     r3    r3    r3
  5                  r1
 89     r2    r2    r2

1. S → CC    2. C → cC    3. C → d
148

Exercises
Q1. Show that the following grammar
S → Aa | bAc | dc | dba
A → d
is LALR(1) but not SLR(1).
Q2. Show that the following grammar
S → Aa | bAc | Bc | bBa
A → d
B → d
is LR(1) but not LALR(1).

149