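Regular Languages
[Figure: the regular languages, with examples $a^* b^*$ and $(a+b)^*$, drawn as a region that excludes $\{ a^n b^n : n \ge 0 \}$ and $\{ ww^R \}$.]
Context-Free Languages
[Figure: the context-free languages contain the regular languages and also $\{ a^n b^n \}$ and $\{ ww^R \}$.]
[Figure: context-free languages are described by context-free grammars and by pushdown automata (automata with a stack).]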
Grammars
Grammars express languages
Example: the English language
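$\langle\text{sentence}\rangle \to \langle\text{noun\_phrase}\rangle\ \langle\text{predicate}\rangle$
$\langle\text{noun\_phrase}\rangle \to \langle\text{article}\rangle\ \langle\text{noun}\rangle$
$\langle\text{predicate}\rangle \to \langle\text{verb}\rangle$
$\langle\text{article}\rangle \to \text{a}$
$\langle\text{article}\rangle \to \text{the}$
$\langle\text{noun}\rangle \to \text{cat}$
$\langle\text{noun}\rangle \to \text{dog}$
$\langle\text{verb}\rangle \to \text{runs}$
$\langle\text{verb}\rangle \to \text{walks}$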
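A derivation of "a cat runs":
$\langle\text{sentence}\rangle \Rightarrow \langle\text{noun\_phrase}\rangle\ \langle\text{predicate}\rangle$
$\Rightarrow \langle\text{noun\_phrase}\rangle\ \langle\text{verb}\rangle$
$\Rightarrow \langle\text{article}\rangle\ \langle\text{noun}\rangle\ \langle\text{verb}\rangle$
$\Rightarrow \text{a}\ \langle\text{noun}\rangle\ \langle\text{verb}\rangle$
$\Rightarrow \text{a cat}\ \langle\text{verb}\rangle$
$\Rightarrow \text{a cat runs}$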
Language of the grammar:
L = { a cat runs,
a cat walks,
the cat runs,
the cat walks,
a dog runs,
a dog walks,
the dog runs,
the dog walks }
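Because this grammar has no recursive rules, its language is finite and can be listed mechanically. The following is a minimal sketch, not part of the original slides; the dictionary `RULES` and the function `expand` are illustrative names chosen here.

```python
from itertools import product

# Productions of the toy English grammar; keys are variables,
# every other symbol is treated as a terminal.
RULES = {
    "sentence": [["noun_phrase", "predicate"]],
    "noun_phrase": [["article", "noun"]],
    "predicate": [["verb"]],
    "article": [["a"], ["the"]],
    "noun": [["cat"], ["dog"]],
    "verb": [["runs"], ["walks"]],
}

def expand(symbol):
    """Return every sequence of terminal words derivable from `symbol`."""
    if symbol not in RULES:              # a terminal derives only itself
        return [[symbol]]
    results = []
    for rhs in RULES[symbol]:
        # Expand each symbol of the right-hand side, then combine the choices.
        for parts in product(*(expand(s) for s in rhs)):
            results.append([word for part in parts for word in part])
    return results

sentences = sorted(" ".join(words) for words in expand("sentence"))
for sentence in sentences:
    print(sentence)
```

The recursion terminates only because no variable can derive itself; a recursive grammar would need a length bound.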
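Notation
Production rules:
$\langle\text{noun}\rangle \to \text{cat}$
$\langle\text{noun}\rangle \to \text{dog}$
In each rule, $\langle\text{noun}\rangle$ is a variable and the word on the right (cat, dog) is a terminal.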
Another Example
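Grammar:
$S \to aSb$
$S \to \lambda$ (where $\lambda$ denotes the empty string)
Derivation of the sentence $ab$:
$S \Rightarrow aSb \Rightarrow ab$
Derivation of the sentence $aabb$:
$S \Rightarrow aSb \Rightarrow aaSbb \Rightarrow aabb$
Other derivations:
$S \Rightarrow aSb \Rightarrow aaSbb \Rightarrow aaaSbbb \Rightarrow aaabbb$
$S \Rightarrow aSb \Rightarrow aaSbb \Rightarrow aaaSbbb \Rightarrow aaaaSbbbb \Rightarrow aaaabbbb$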
Language of the grammar
$S \to aSb$
$S \to \lambda$
$L = \{ a^n b^n : n \ge 0 \}$
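The derivations of this grammar all have the same shape: apply $S \to aSb$ some number of times, then finish with $S \to \lambda$. A small sketch of that process follows; the function names `derive` and `in_language` are hypothetical, and the empty Python string stands for $\lambda$.

```python
def derive(n):
    """Return the derivation of a^n b^n as a list of sentential forms."""
    form = "S"
    steps = [form]
    for _ in range(n):
        form = form.replace("S", "aSb")   # apply S -> aSb
        steps.append(form)
    form = form.replace("S", "")          # apply S -> lambda (empty string)
    steps.append(form)
    return steps

def in_language(w):
    """Check membership in { a^n b^n : n >= 0 } directly."""
    n = len(w) // 2
    return w == "a" * n + "b" * n

print(" => ".join(derive(2)))
```

Every string produced by `derive` passes `in_language`, which matches the claim that the grammar generates exactly the strings of the form $a^n b^n$.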
More Notation
Grammar $G = (V, T, S, P)$
$V$: Set of variables
$T$: Set of terminal symbols
$S$: Start variable
$P$: Set of production rules
Example
Grammar $G$:
$S \to aSb$
$S \to \lambda$
$G = (V, T, S, P)$
$V = \{ S \}$, $T = \{ a, b \}$
$P = \{ S \to aSb,\ S \to \lambda \}$
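The four components translate directly into a data structure. The sketch below is one possible encoding, assumed for illustration only: the class `Grammar` and the checker `is_well_formed` are names introduced here, and the empty string `""` stands for $\lambda$.

```python
from typing import NamedTuple

class Grammar(NamedTuple):
    V: frozenset   # set of variables
    T: frozenset   # set of terminal symbols
    S: str         # start variable
    P: tuple       # production rules as (left, right) pairs

def is_well_formed(g):
    """Check that S is a variable and every rule uses only known symbols."""
    if g.S not in g.V or (g.V & g.T):
        return False
    for left, right in g.P:
        if left not in g.V:
            return False
        if any(sym not in (g.V | g.T) for sym in right):
            return False
    return True

G = Grammar(
    V=frozenset({"S"}),
    T=frozenset({"a", "b"}),
    S="S",
    P=(("S", "aSb"), ("S", "")),   # "" stands for the empty string
)
print(is_well_formed(G))
```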
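More Notation
Sentential form: a string that contains variables and terminals
Example:
$S \Rightarrow aSb \Rightarrow aaSbb \Rightarrow aaaSbbb \Rightarrow aaabbb$
Here $S$, $aSb$, $aaSbb$ and $aaaSbbb$ are sentential forms, and $aaabbb$ is a sentence.
We write: $S \Rightarrow^* aaabbb$
Instead of: $S \Rightarrow aSb \Rightarrow aaSbb \Rightarrow aaaSbbb \Rightarrow aaabbb$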
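Example
Grammar:
$S \to aSb$
$S \to \lambda$
Derivations:
$S \Rightarrow^* \lambda$
$S \Rightarrow^* ab$
$S \Rightarrow^* aabb$
$S \Rightarrow^* aaabbb$
Example
Grammar:
$S \to aSb$
$S \to \lambda$
Derivations:
$S \Rightarrow^* aaSbb$
$aaSbb \Rightarrow^* aaaaaSbbbbb$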
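Another Grammar Example
Grammar $G$:
$S \to Ab$
$A \to aAb$
$A \to \lambda$
Derivations:
$S \Rightarrow Ab \Rightarrow b$
$S \Rightarrow Ab \Rightarrow aAbb \Rightarrow abb$
$S \Rightarrow Ab \Rightarrow aAbb \Rightarrow aaAbbb \Rightarrow aabbb$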
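More Derivations
$S \Rightarrow Ab \Rightarrow aAbb \Rightarrow aaAbbb \Rightarrow aaaAbbbb \Rightarrow aaaaAbbbbb \Rightarrow aaaabbbbb$
$S \Rightarrow^* aaaabbbbb$
$S \Rightarrow^* aaaaaabbbbbbb$
$S \Rightarrow^* a^n b^n b$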
Language of a Grammar
For a grammar $G$ with start variable $S$:
$L(G) = \{ w : S \Rightarrow^* w \}$
where $w$ is a string of terminals.
Example
For grammar $G$:
$S \to Ab$
$A \to aAb$
$A \to \lambda$
$L(G) = \{ a^n b^n b : n \ge 0 \}$
Since: $S \Rightarrow^* a^n b^n b$
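The definition of $L(G)$ can be explored by searching over sentential forms. The sketch below enumerates the short strings of this grammar; it is an illustration under stated assumptions, not a general algorithm: variables are single uppercase letters, terminals are lowercase, `""` stands for $\lambda$, and `language_up_to` is a name introduced here. The pruning rule is enough to terminate for this grammar, but a grammar with a rule such as $S \to SS$ would need an extra bound.

```python
from collections import deque

def language_up_to(rules, start, max_len):
    """Collect all terminal strings of length <= max_len derivable from start."""
    found, seen = set(), {start}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        # Locate the leftmost variable, if any.
        pos = next((i for i, c in enumerate(form) if c.isupper()), None)
        if pos is None:
            found.add(form)               # only terminals left: a sentence
            continue
        for rhs in rules[form[pos]]:
            new = form[:pos] + rhs + form[pos + 1:]
            # Prune forms whose terminals alone already exceed the bound.
            if sum(c.islower() for c in new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return sorted(found, key=lambda w: (len(w), w))

RULES = {"S": ["Ab"], "A": ["aAb", ""]}
print(language_up_to(RULES, "S", 7))
```

Each string found has the form $a^n b^n b$, in agreement with the language stated above.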
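A Convenient Notation
$A \to aAb$ and $A \to \lambda$ are written as $A \to aAb \mid \lambda$
$\langle\text{article}\rangle \to \text{a}$ and $\langle\text{article}\rangle \to \text{the}$ are written as $\langle\text{article}\rangle \to \text{a} \mid \text{the}$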
Example
A context-free grammar $G$:
$S \to aSb$
$S \to \lambda$
A derivation:
$S \Rightarrow aSb \Rightarrow aaSbb \Rightarrow aabb$
Another derivation:
$S \Rightarrow aSb \Rightarrow aaSbb \Rightarrow aaaSbbb \Rightarrow aaabbb$
$L(G) = \{ a^n b^n : n \ge 0 \}$
Describes parentheses: (((( ))))
Example
A context-free grammar $G$:
$S \to aSa$
$S \to bSb$
$S \to \lambda$
A derivation:
$S \Rightarrow aSa \Rightarrow abSba \Rightarrow abba$
Another derivation:
$S \Rightarrow aSa \Rightarrow abSba \Rightarrow abaSaba \Rightarrow abaaba$
$L(G) = \{ ww^R : w \in \{a, b\}^* \}$
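Membership in $\{ ww^R \}$ can be tested by running the productions backwards: strip a matching outer pair (undoing $S \to aSa$ or $S \to bSb$) until nothing is left. A minimal sketch, with the function name `is_even_palindrome` chosen here for illustration:

```python
def is_even_palindrome(s):
    """Return True if s = w + reversed(w) for some w over {a, b}."""
    if any(c not in "ab" for c in s):
        return False
    # Undo S -> aSa or S -> bSb by peeling matching outer symbols.
    while len(s) >= 2 and s[0] == s[-1]:
        s = s[1:-1]
    return s == ""      # only S -> lambda can finish the derivation

print(is_even_palindrome("abaaba"), is_even_palindrome("aba"))
```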
Example
A context-free grammar $G$:
$S \to aSb$
$S \to SS$
$S \to \lambda$
A derivation:
$S \Rightarrow SS \Rightarrow aSbS \Rightarrow abS \Rightarrow ab$
Another derivation:
$S \Rightarrow SS \Rightarrow aSbS \Rightarrow abS \Rightarrow abaSb \Rightarrow abab$
$L(G) = \{ w : n_a(w) = n_b(w), \text{ and } n_a(v) \ge n_b(v) \text{ in any prefix } v \}$
Describes matched parentheses: () ((( ))) (( ))
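The two conditions in this language definition can be checked in a single left-to-right pass by tracking $n_a(v) - n_b(v)$ for the prefix read so far. The sketch below does exactly that; `is_matched` is an illustrative name.

```python
def is_matched(w):
    """Check n_a(w) == n_b(w) and n_a(v) >= n_b(v) for every prefix v."""
    depth = 0                      # n_a(v) - n_b(v) for the prefix read so far
    for c in w:
        if c == "a":
            depth += 1
        elif c == "b":
            depth -= 1
        else:
            return False           # symbol outside {a, b}
        if depth < 0:              # some prefix has more b's than a's
            return False
    return depth == 0

print(is_matched("aabb"), is_matched("abab"), is_matched("ba"))
```

Reading `a` as "(" and `b` as ")", this is the usual balanced-parentheses check.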
Definition: Context-Free Grammars
Grammar $G = (V, T, S, P)$
$V$: variables
$T$: terminal symbols
$S$: start variable
$P$: productions of the form $A \to x$, where $A$ is a variable and $x$ is a string of variables and terminals
$L(G) = \{ w : S \Rightarrow^* w,\ w \in T^* \}$
Definition: Context-Free Languages
A language $L$ is context-free
if and only if
there is a context-free grammar $G$
with $L = L(G)$
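Derivation Order
1. $S \to AB$
2. $A \to aaA$
3. $A \to \lambda$
4. $B \to Bb$
5. $B \to \lambda$
Leftmost derivation (rule numbers in parentheses):
$S \Rightarrow AB$ (1) $\Rightarrow aaAB$ (2) $\Rightarrow aaB$ (3) $\Rightarrow aaBb$ (4) $\Rightarrow aab$ (5)
Rightmost derivation (rule numbers in parentheses):
$S \Rightarrow AB$ (1) $\Rightarrow ABb$ (4) $\Rightarrow Ab$ (5) $\Rightarrow aaAb$ (2) $\Rightarrow aab$ (3)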
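The difference between the two orders is only which variable gets rewritten at each step. The sketch below replays a sequence of rule numbers, always rewriting either the leftmost or the rightmost variable; the function `derive` and the encoding (uppercase letters as variables, `""` for $\lambda$) are assumptions made for this illustration.

```python
# The five numbered productions from the slide.
RULES = {1: ("S", "AB"), 2: ("A", "aaA"), 3: ("A", ""),
         4: ("B", "Bb"), 5: ("B", "")}

def derive(rule_numbers, leftmost=True):
    """Apply the rules in order to the leftmost (or rightmost) variable."""
    form = "S"
    steps = [form]
    for k in rule_numbers:
        left, right = RULES[k]
        positions = [i for i, c in enumerate(form) if c.isupper()]
        pos = positions[0] if leftmost else positions[-1]
        if form[pos] != left:
            raise ValueError(f"rule {k} does not rewrite variable {form[pos]}")
        form = form[:pos] + right + form[pos + 1:]
        steps.append(form)
    return steps

print(" => ".join(derive([1, 2, 3, 4, 5], leftmost=True)))
print(" => ".join(derive([1, 4, 5, 2, 3], leftmost=False)))
```

Both sequences end in the same string, which shows that the order of rule application can differ without changing the result.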
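$S \to aAB$
$A \to bBb$
$B \to A \mid \lambda$
Leftmost derivation:
$S \Rightarrow aAB \Rightarrow abBbB \Rightarrow abAbB \Rightarrow abbBbbB \Rightarrow abbbbB \Rightarrow abbbb$
Rightmost derivation:
$S \Rightarrow aAB \Rightarrow aA \Rightarrow abBb \Rightarrow abAb \Rightarrow abbBbb \Rightarrow abbbb$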
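Derivation Trees
Grammar:
$S \to AB$
$A \to aaA \mid \lambda$
$B \to Bb \mid \lambda$
The derivation tree grows by one production at each derivation step:
$S \Rightarrow AB$: the root $S$ gets children $A$ and $B$
$\Rightarrow aaAB$: $A$ gets children $a$, $a$, $A$
$\Rightarrow aaABb$: $B$ gets children $B$, $b$
$\Rightarrow aaBb$: the lower $A$ gets the child $\lambda$
$\Rightarrow aab$: the lower $B$ gets the child $\lambda$
[Figure: the complete derivation tree for $S \Rightarrow AB \Rightarrow aaAB \Rightarrow aaABb \Rightarrow aaBb \Rightarrow aab$.]
Yield: the leaves read from left to right, $aa\lambda\lambda b = aab$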
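The yield of a derivation tree is the concatenation of its leaves from left to right. A minimal sketch of the tree for $aab$ follows; the tuple encoding, the helper `leaf`, and the use of `""` for a $\lambda$ leaf are choices made here for illustration.

```python
# A node is (symbol, children); a leaf has an empty child list.
def leaf(symbol):
    return (symbol, [])

# Tree for S => AB => aaAB => aaABb => aaBb => aab; "" labels a lambda leaf.
tree = ("S", [
    ("A", [leaf("a"), leaf("a"), ("A", [leaf("")])]),
    ("B", [("B", [leaf("")]), leaf("b")]),
])

def tree_yield(node):
    """Concatenate the leaf labels from left to right."""
    symbol, children = node
    if not children:
        return symbol
    return "".join(tree_yield(child) for child in children)

print(tree_yield(tree))
```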
Ambiguity
Grammar: $E \to E + E \mid E * E \mid (E) \mid a$
String: $a + a * a$
A leftmost derivation:
$E \Rightarrow E + E \Rightarrow a + E \Rightarrow a + E * E \Rightarrow a + a * E \Rightarrow a + a * a$
[Figure: derivation tree with $+$ at the root; the left subtree is $a$ and the right subtree is $E * E$ with yield $a * a$.]
Another leftmost derivation:
$E \Rightarrow E * E \Rightarrow E + E * E \Rightarrow a + E * E \Rightarrow a + a * E \Rightarrow a + a * a$
[Figure: derivation tree with $*$ at the root; the left subtree is $E + E$ with yield $a + a$ and the right subtree is $a$.]
Two derivation trees
The grammar $E \to E + E \mid E * E \mid (E) \mid a$ is ambiguous:
the string $a + a * a$ has two derivation trees
the string $a + a * a$ has two leftmost derivations
Definition:
A context-free grammar $G$ is ambiguous
if some string $w \in L(G)$ has
two or more derivation trees
In other words:
A context-free grammar $G$ is ambiguous
if some string $w \in L(G)$ has
two or more leftmost derivations (or rightmost)
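For the expression grammar $E \to E + E \mid E * E \mid (E) \mid a$, the number of derivation trees of a string can be counted by trying every production at the root. The sketch below is specific to that grammar and is not a general ambiguity test (no algorithm decides ambiguity for arbitrary context-free grammars); `count_trees` is an illustrative name.

```python
from functools import lru_cache

def count_trees(s):
    """Count derivation trees of s under E -> E+E | E*E | (E) | a."""
    @lru_cache(maxsize=None)
    def count(i, j):               # number of trees for the substring s[i:j]
        if i >= j:
            return 0
        if j - i == 1:
            return 1 if s[i] == "a" else 0         # E -> a
        total = 0
        if s[i] == "(" and s[j - 1] == ")":
            total += count(i + 1, j - 1)           # E -> (E)
        for k in range(i + 1, j - 1):
            if s[k] in "+*":
                # E -> E+E or E -> E*E with the operator at position k
                total += count(i, k) * count(k + 1, j)
        return total
    return count(0, len(s))

print(count_trees("a+a*a"))
```

A count of two or more for some string is exactly what the definition calls ambiguity; adding parentheses, as in `(a+a)*a`, leaves a single tree.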
Why do we care about ambiguity?
Take $a = 2$ in the string $a + a * a$, which gives $2 + 2 * 2$.
[Figure: the two derivation trees of $a + a * a$, evaluated bottom-up with $a = 2$.]
Tree with $+$ at the root: $2 * 2 = 4$, then $2 + 4 = 6$, so $2 + 2 * 2 = 6$
Tree with $*$ at the root: $2 + 2 = 4$, then $4 * 2 = 8$, so $2 + 2 * 2 = 8$
Correct result: $2 + 2 * 2 = 6$ (the tree with $+$ at the root)
We want to remove ambiguity
Ambiguity is bad for programming languages
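The effect of the two trees on evaluation can be reproduced directly. In this sketch a tree is a nested tuple `(left, operator, right)` and the leaf `"a"` takes a numeric value; the encoding and the names `evaluate`, `tree_plus_at_root` and `tree_times_at_root` are assumptions made for illustration.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}

def evaluate(tree, a):
    """Evaluate a derivation tree bottom-up; the leaf 'a' has the value `a`."""
    if tree == "a":
        return a
    left, op, right = tree
    return OPS[op](evaluate(left, a), evaluate(right, a))

tree_plus_at_root = ("a", "+", ("a", "*", "a"))    # from E => E + E
tree_times_at_root = (("a", "+", "a"), "*", "a")   # from E => E * E

print(evaluate(tree_plus_at_root, 2), evaluate(tree_times_at_root, 2))
```

The same string therefore has two different values depending on which tree a compiler happens to build.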
Left Recursion & Right Recursion
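A recursive-descent parser can loop forever when the grammar has a
left-recursive production such as $A \to A\alpha$, because the procedure
for $A$ calls itself before consuming any input.
A left-recursive pair of productions $A \to A\alpha \mid \beta$ generates
$\beta$ followed by zero or more copies of $\alpha$. The same effect can be
achieved by rewriting the productions for $A$ in the following manner,
using a new nonterminal $R$:
$A \to \beta R$
$R \to \alpha R \mid \lambda$
The new production $R \to \alpha R$ is right recursive.
The left-recursion-elimination technique sketched above can also be
applied to productions containing semantic actions.
First, the technique extends to multiple productions for $A$.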
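The standard textbook rewriting of immediate left recursion, $A \to A\alpha \mid \beta$ into $A \to \beta R$ and $R \to \alpha R \mid \lambda$, can be sketched as follows for any number of productions for $A$. The function name, the list-of-symbols encoding, and the example nonterminals `expr`, `term` and `rest` are hypothetical; the sketch assumes there is no production $A \to A$ and does not handle indirect left recursion or semantic actions.

```python
def eliminate_left_recursion(var, productions, new_var):
    """Rewrite A -> A alpha | beta as A -> beta R, R -> alpha R | lambda.

    Each production body is a list of symbols; [] denotes the empty string.
    """
    # Bodies that start with A supply the alphas; the rest are the betas.
    alphas = [body[1:] for body in productions if body[:1] == [var]]
    betas = [body for body in productions if body[:1] != [var]]
    if not alphas:
        return {var: productions}          # no left recursion to remove
    return {
        var: [beta + [new_var] for beta in betas],
        new_var: [alpha + [new_var] for alpha in alphas] + [[]],
    }

rules = eliminate_left_recursion(
    "expr",
    [["expr", "+", "term"], ["expr", "-", "term"], ["term"]],
    "rest",
)
for left, bodies in rules.items():
    print(left, "->", " | ".join(" ".join(b) or "lambda" for b in bodies))
```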
Position of Parser
There are three general types of parsers for grammars:
UNIVERSAL
TOP-DOWN
BOTTOM-UP
Universal parsing methods, such as the Cocke-Younger-Kasami
algorithm and Earley's algorithm, can parse any grammar.
These general methods are, however, too inefficient to use in
production compilers.
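To give a feel for the cost, here is a compact sketch of the Cocke-Younger-Kasami algorithm. It assumes the grammar is already in Chomsky normal form (every rule is $A \to BC$ or $A \to a$) and ignores the empty string; the function `cyk`, the two rule dictionaries, and the example grammar for $\{ a^n b^n : n \ge 1 \}$ are choices made for this illustration.

```python
def cyk(word, unit_rules, pair_rules, start):
    """Return True if `word` is derivable from `start`.

    unit_rules: terminal -> set of variables A with a rule A -> terminal
    pair_rules: (B, C)   -> set of variables A with a rule A -> B C
    """
    n = len(word)
    if n == 0:
        return False                 # this sketch ignores the empty string
    # table[i][l - 1] holds the variables that derive word[i : i + l].
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(word):
        table[i][0] = set(unit_rules.get(ch, ()))
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):       # size of the left part
                for b in table[i][split - 1]:
                    for c in table[i + split][length - split - 1]:
                        table[i][length - 1] |= pair_rules.get((b, c), set())
    return start in table[0][n - 1]

# Chomsky-normal-form grammar: S -> A B | A X,  X -> S B,  A -> a,  B -> b
UNIT = {"a": {"A"}, "b": {"B"}}
PAIR = {("A", "B"): {"S"}, ("A", "X"): {"S"}, ("S", "B"): {"X"}}

print(cyk("aabb", UNIT, PAIR, "S"), cyk("abab", UNIT, PAIR, "S"))
```

The three nested loops over substring length, start position and split point give running time cubic in the input length, which is the inefficiency the text refers to.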
The methods commonly used in compilers can be classified as
being either top-down or bottom-up.
Top-down methods build parse trees from the top (root) to
the bottom (leaves), while Bottom-up methods start from
the leaves and work their way up to the root.
In either case, the input to the parser is scanned from left to
right, one symbol at a time.
The most efficient top-down and bottom-up methods work only for
subclasses of grammars, but several of these classes, particularly
LL and LR grammars, are expressive enough to describe most of the
syntactic constructs in modern programming languages.
Parsers implemented by hand often use LL grammars; for example,
the predictive-parsing approach constructs parsers for LL grammars.
Parsers for the larger class of LR grammars are usually constructed
using automated tools.
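As an illustration of a hand-written top-down parser, the sketch below is a predictive (recursive-descent) recognizer for the earlier grammar $S \to aSb \mid \lambda$, which is LL(1): a lookahead of `a` selects $S \to aSb$, anything else selects $S \to \lambda$. The function names `parse_S` and `accepts` are hypothetical.

```python
def parse_S(s, pos=0):
    """Parse one S starting at `pos`; return the position just after it."""
    if pos < len(s) and s[pos] == "a":    # lookahead 'a': use S -> a S b
        pos = parse_S(s, pos + 1)         # match 'a', then parse the inner S
        if pos < len(s) and s[pos] == "b":
            return pos + 1                # match the closing 'b'
        raise SyntaxError(f"expected 'b' at position {pos}")
    return pos                            # otherwise use S -> lambda

def accepts(s):
    """Return True if the whole input is derived from S."""
    try:
        return parse_S(s) == len(s)
    except SyntaxError:
        return False

print(accepts("aabb"), accepts("aab"))
```

The input is scanned from left to right, one symbol at a time, and the parse tree is implicitly built from the root down by the chain of recursive calls.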
Associativity of operators
Our grammar gives left associativity.
That is, if you traverse the parse tree in postorder and
perform the indicated arithmetic you will evaluate the
string left to right.
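The grammar referred to here is not shown, so the following sketch only illustrates the general idea with a hypothetical subtraction example: the same operands grouped to the left and to the right, each evaluated by a postorder traversal. The names `left_tree`, `right_tree` and `postorder_eval` are introduced for this illustration.

```python
def left_tree(operands, op):
    """Group to the left: ((x1 op x2) op x3) ..."""
    tree = operands[0]
    for x in operands[1:]:
        tree = (tree, op, x)
    return tree

def right_tree(operands, op):
    """Group to the right: x1 op (x2 op (x3 ...))"""
    tree = operands[-1]
    for x in reversed(operands[:-1]):
        tree = (x, op, tree)
    return tree

def postorder_eval(tree):
    """Evaluate both subtrees first, then apply the operator at the node."""
    if not isinstance(tree, tuple):
        return tree
    left, op, right = tree
    lval, rval = postorder_eval(left), postorder_eval(right)
    return lval - rval if op == "-" else lval + rval

print(postorder_eval(left_tree([9, 5, 2], "-")))    # (9 - 5) - 2
print(postorder_eval(right_tree([9, 5, 2], "-")))   # 9 - (5 - 2)
```

For subtraction the two groupings give different values, which is why the associativity encoded in the productions matters.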
If you wished to generate right associativity, you would
change the productions