
Pushdown Automata (PDA)

Informally:
A PDA is an NFA-ε with a stack. Transitions are modified to accommodate stack operations.

Questions:
What is a stack? How does a stack help?

A DFA can remember only a finite amount of information, whereas a PDA can remember an unbounded amount of (certain types of) information.

Example: {0^n1^n | 0 ≤ n} is not regular, but {0^n1^n | 0 ≤ n ≤ k} is regular for any fixed k.

For k=3: L = {ε, 01, 0011, 000111}

[DFA transition diagram for k = 3, with states q0 through q7; figure omitted.]

In a DFA, each state remembers a finite amount of information.

To get {0^n1^n | 0 ≤ n} with a DFA would require an infinite number of states using the preceding technique.

An infinite stack solves the problem for {0^n1^n | 0 ≤ n} as follows:


Read all 0s and place them on a stack.
Read all 1s and match them with the corresponding 0s on the stack.

Only two states are needed to do this in a PDA. Similarly for {0^n1^m0^(n+m) | n,m ≥ 0}.
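The two-state idea above can be sketched directly. This is a hypothetical helper, not part of the slides; the two PDA states are tracked with a flag, and acceptance is by empty stack:

```python
def accepts_0n1n(s: str) -> bool:
    """Accept {0^n 1^n | n >= 0}: push a symbol per 0, pop one per 1."""
    stack = []
    seen_one = False        # False = "pushing" state, True = "matching" state
    for ch in s:
        if ch == "0" and not seen_one:
            stack.append("0")          # state 1: push a symbol for each 0
        elif ch == "1" and stack:
            seen_one = True            # state 2: pop one symbol for each 1
            stack.pop()
        else:
            return False               # out-of-order symbol or extra 1
    return not stack                   # accept by empty stack
```

The flag prevents a 0 from being read after the first 1, which is exactly the role of the state change in the two-state PDA.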

Formal Definition of a PDA


A pushdown automaton (PDA) is a seven-tuple:

M = (Q, Σ, Γ, δ, q0, z0, F)

Q    A finite set of states
Σ    A finite input alphabet
Γ    A finite stack alphabet
q0   The initial/starting state; q0 is in Q
z0   A starting stack symbol; z0 is in Γ
F    A set of final/accepting states; F is a subset of Q
δ    A transition function, where

δ: Q × (Σ ∪ {ε}) × Γ → finite subsets of Q × Γ*



Consider the various parts of δ: Q × (Σ ∪ {ε}) × Γ → finite subsets of Q × Γ*

Q on the LHS means that at each step in a computation, a PDA must consider its current state.
Γ on the LHS means that at each step in a computation, a PDA must consider the symbol on top of its stack.
Σ ∪ {ε} on the LHS means that at each step in a computation, a PDA may or may not consider the current input symbol, i.e., it may have epsilon transitions.
Finite subsets on the RHS means that at each step in a computation, a PDA may have several options.
Q on the RHS means that each option specifies a new state.
Γ* on the RHS means that each option specifies zero or more stack symbols that will replace the top stack symbol.

Two types of PDA transitions:

δ(q, a, z) = {(p1, γ1), (p2, γ2), …, (pm, γm)}

Current state is q.
Current input symbol is a.
Symbol currently on top of the stack is z.
Move to state pi from q.
Replace z with γi on the stack (leftmost symbol of γi on top).
Move the input head to the next input symbol.

[Diagram: arcs from state q labeled a/z/γ1, a/z/γ2, …, a/z/γm leading to states p1, p2, …, pm; figure omitted.]

Two types of PDA transitions:

δ(q, ε, z) = {(p1, γ1), (p2, γ2), …, (pm, γm)}

Current state is q.
The current input symbol is not considered.
Symbol currently on top of the stack is z.
Move to state pi from q.
Replace z with γi on the stack (leftmost symbol of γi on top).
No input symbol is read.

[Diagram: arcs from state q labeled ε/z/γ1, ε/z/γ2, …, ε/z/γm leading to states p1, p2, …, pm; figure omitted.]

Example: (balanced parentheses)

M = ({q1}, {(, )}, {L, #}, δ, q1, #, ∅)

δ:
(1) δ(q1, (, #) = {(q1, L#)}
(2) δ(q1, ), #) = ∅
(3) δ(q1, (, L) = {(q1, LL)}
(4) δ(q1, ), L) = {(q1, ε)}
(5) δ(q1, ε, #) = {(q1, ε)}
(6) δ(q1, ε, L) = ∅

Goal: (acceptance)
Terminate in a non-null state.
Read the entire input string.
Terminate with an empty stack.

Informally, a string is accepted if there exists a computation that uses up all the input and leaves the stack empty.

Transition Diagram:

[A single state q1 with self-loops labeled (, # | L#   (, L | LL   ), L | ε   ε, # | ε]

Example Computation:

Current Input   Stack   Transition
(())            #       (1)
())             L#      (3)
))              LL#     (4)
)               L#      (4)
ε               #       (5)
ε               ε       -

Rule (5) could also have been applied whenever # was on top, but it would have done no good.
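As a rough check of rules (1)-(6), here is a small simulation of the balanced-parentheses PDA. The dictionary encoding and function name are mine; rules (2) and (6), whose right-hand side is the empty set, are represented by missing entries:

```python
# delta for the one-state machine: (input symbol, stack top) -> pushed string.
DELTA = {
    ("(", "#"): "L#",   # rule (1): push an L over the bottom marker
    ("(", "L"): "LL",   # rule (3): push another L
    (")", "L"): "",     # rule (4): pop an L
}

def balanced(s: str) -> bool:
    """Accept by empty stack; the leftmost character of `stack` is the top."""
    stack = "#"
    for ch in s:
        push = DELTA.get((ch, stack[0]))
        if push is None:              # rules (2) and (6): no transition
            return False
        stack = push + stack[1:]
    return stack == "#"               # rule (5) then erases '#': empty stack
```

The final comparison stands in for the ε-move of rule (5), which can only usefully fire once the input is exhausted.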

Example PDA #1: For the language {x | x = wcw^r and w in {0,1}*}, with Σ = {0, 1, c}

M = ({q1, q2}, {0, 1, c}, {R, B, G}, δ, q1, R, ∅)

δ:
(1) δ(q1, 0, R) = {(q1, BR)}    (9)  δ(q1, 1, R) = {(q1, GR)}
(2) δ(q1, 0, B) = {(q1, BB)}    (10) δ(q1, 1, B) = {(q1, GB)}
(3) δ(q1, 0, G) = {(q1, BG)}    (11) δ(q1, 1, G) = {(q1, GG)}
(4) δ(q1, c, R) = {(q2, R)}     (12) δ(q2, 1, G) = {(q2, ε)}
(5) δ(q1, c, B) = {(q2, B)}
(6) δ(q1, c, G) = {(q2, G)}
(7) δ(q2, 0, B) = {(q2, ε)}
(8) δ(q2, ε, R) = {(q2, ε)}

Notes:
Only rule #8 is non-deterministic.
Rule #8 is used to pop the final stack symbol off at the end of a computation.

Example Computation: (using the transition rules of PDA #1 above)

State   Input   Stack   Rule Applied   Rules Applicable
q1      01c10   R       (1)            (1)
q1      1c10    BR      (10)           (10)
q1      c10     GBR     (6)            (6)
q2      10      GBR     (12)           (12)
q2      0       BR      (7)            (7)
q2      ε       R       (8)            (8)
q2      ε       ε       -              -

Example Computation: (again using the transition rules of PDA #1)

State   Input   Stack   Rule Applied
q1      1c1     R       (9)
q1      c1      GR      (6)
q2      1       GR      (12)
q2      ε       R       (8)
q2      ε       ε       -

Questions:
Why isn't δ(q2, 0, G) defined?
Why isn't δ(q2, 1, B) defined?
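The computation tables above can be reproduced mechanically. This sketch simulates PDA #1 (identifiers are mine; the rule numbers in the comments refer to the δ table above). Since only the ε-rule #8 is left to the end, the simulation is deterministic:

```python
def accepts_wcwr(s: str) -> bool:
    """Simulate PDA #1 for {w c w^R}; accept by empty stack."""
    state, stack = "q1", ["R"]                   # top of stack is stack[-1]
    for ch in s:
        top = stack[-1]
        if state == "q1" and ch in "01":
            stack.append("B" if ch == "0" else "G")  # rules (1)-(3), (9)-(11)
        elif state == "q1" and ch == "c":
            state = "q2"                         # rules (4)-(6): keep the stack
        elif state == "q2" and ch == "0" and top == "B":
            stack.pop()                          # rule (7): match a 0
        elif state == "q2" and ch == "1" and top == "G":
            stack.pop()                          # rule (12): match a 1
        else:
            return False                         # no transition defined
    # rule (8): an epsilon move pops the bottom marker R at the end
    return state == "q2" and stack == ["R"]
```

The two undefined combinations from the questions above, δ(q2, 0, G) and δ(q2, 1, B), fall through to the `else` branch and reject, exactly as an empty transition set would.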

Example PDA #2: For the language {x | x = ww^r and w in {0,1}*}

M = ({q1, q2}, {0, 1}, {R, B, G}, δ, q1, R, ∅)

δ:
(1) δ(q1, 0, R) = {(q1, BR)}            (6)  δ(q1, 1, G) = {(q1, GG), (q2, ε)}
(2) δ(q1, 1, R) = {(q1, GR)}            (7)  δ(q2, 0, B) = {(q2, ε)}
(3) δ(q1, 0, B) = {(q1, BB), (q2, ε)}   (8)  δ(q2, 1, G) = {(q2, ε)}
(4) δ(q1, 0, G) = {(q1, BG)}            (9)  δ(q1, ε, R) = {(q2, ε)}
(5) δ(q1, 1, B) = {(q1, GB)}            (10) δ(q2, ε, R) = {(q2, ε)}

Notes:
Rules #3 and #6 are non-deterministic.
Rules #9 and #10 are used to pop the final stack symbol off at the end of a computation.

Example Computation: (using the transition rules of PDA #2 above)

State   Input    Stack   Rule Applied     Rules Applicable
q1      000000   R       (1)              (1), (9)
q1      00000    BR      (3) option #1    (3), both options
q1      0000     BBR     (3) option #1    (3), both options
q1      000      BBBR    (3) option #2    (3), both options
q2      00       BBR     (7)              (7)
q2      0        BR      (7)              (7)
q2      ε        R       (10)             (10)
q2      ε        ε       -                -

Questions:
What is rule #10 used for?
What is rule #9 used for?
Why do rules #3 and #6 have options?
Why don't rules #4 and #5 have similar options? [The matching transition is not possible if the previous input symbol was different.]

Example Computation: (again using the transition rules of PDA #2)

State   Input    Stack   Rule Applied
q1      010010   R       (1)
q1      10010    BR      (5)
q1      0010     GBR     (4)
q1      010      BGBR    (3) option #2
q2      10       GBR     (8)
q2      0        BR      (7)
q2      ε        R       (10)
q2      ε        ε       -

(From the start ID, both (1) and (9) were applicable.)

Exercises: trace a computation for each of:
0011001100
011110
0111
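Because rules #3 and #6 offer options, PDA #2 is genuinely nondeterministic. One standard way to run it is a breadth-first search over instantaneous descriptions (state, unread input, stack); the encoding below is mine, and acceptance is by empty stack:

```python
from collections import deque

# delta maps (state, input symbol or None for epsilon, stack top)
# to a list of (next state, string pushed in place of the top).
DELTA = {
    ("q1", "0", "R"): [("q1", "BR")],              # rule (1)
    ("q1", "1", "R"): [("q1", "GR")],              # rule (2)
    ("q1", "0", "B"): [("q1", "BB"), ("q2", "")],  # rule (3), two options
    ("q1", "0", "G"): [("q1", "BG")],              # rule (4)
    ("q1", "1", "B"): [("q1", "GB")],              # rule (5)
    ("q1", "1", "G"): [("q1", "GG"), ("q2", "")],  # rule (6), two options
    ("q2", "0", "B"): [("q2", "")],                # rule (7)
    ("q2", "1", "G"): [("q2", "")],                # rule (8)
    ("q1", None, "R"): [("q2", "")],               # rule (9)
    ("q2", None, "R"): [("q2", "")],               # rule (10)
}

def accepts_wwr(s: str) -> bool:
    """Breadth-first search over IDs (q, unread input, stack)."""
    seen = set()
    todo = deque([("q1", s, "R")])                 # stack top is index 0
    while todo:
        q, w, st = todo.popleft()
        if (q, w, st) in seen:
            continue
        seen.add((q, w, st))
        if w == "" and st == "":
            return True                            # accept by empty stack
        if st == "":
            continue                               # stuck: stack already empty
        # epsilon moves leave the input untouched
        succ = [(p, g, w) for p, g in DELTA.get((q, None, st[0]), [])]
        if w:                                      # moves that consume w[0]
            succ += [(p, g, w[1:]) for p, g in DELTA.get((q, w[0], st[0]), [])]
        for p, g, rest in succ:
            todo.append((p, rest, g + st[1:]))
    return False
```

The search terminates here because every input move consumes a symbol and the only ε-moves (rules #9 and #10) pop the stack, so the set of reachable IDs is finite.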

Formal Definitions for PDAs


Let M = (Q, Σ, Γ, δ, q0, z0, F) be a PDA.

Definition: An instantaneous description (ID) is a triple (q, w, γ), where q is in Q, w is in Σ*, and γ is in Γ*.

q is the current state
w is the unused input
γ is the current stack contents

Examples of IDs: (for PDA #2)

(q1, 111, GBR)
(q1, 11, GGBR)
(q1, 000, GR)
(q2, 11, BR)
(q2, 00, R)

Let M = (Q, Σ, Γ, δ, q0, z0, F) be a PDA.

Definition: Let a be in Σ ∪ {ε}, w be in Σ*, z be in Γ, and α and γ both be in Γ*. Then:

(q, aw, zα) ⊢M (p, w, γα)  if δ(q, a, z) contains (p, γ).

Intuitively, if I and J are instantaneous descriptions, then I ⊢ J means that J follows from I by one transition.


Examples: (PDA #2)

(q1, 111, GBR) ⊢ (q1, 11, GGBR)   by (6) option #1, with a=1, z=G, γ=GG, w=11, and α=BR

(q1, 111, GBR) ⊢ (q2, 11, BR)     by (6) option #2, with a=1, z=G, γ=ε, w=11, and α=BR

(q1, 000, GR) ⊢ (q2, 00, R)       is not true, for any a, z, γ, w and α

Example: (balanced-parentheses PDA)

(q1, (())), L#) ⊢ (q1, ())), LL#)   by (3)


Definition: ⊢* is the reflexive and transitive closure of ⊢.

I ⊢* I for each instantaneous description I
If I ⊢ J and J ⊢* K then I ⊢* K

Intuitively, if I and J are instantaneous descriptions, then I ⊢* J means that J follows from I by zero or more transitions.


Definition: Let M = (Q, Σ, Γ, δ, q0, z0, F) be a PDA. The language accepted by empty stack, denoted LE(M), is the set

{w | (q0, w, z0) ⊢* (p, ε, ε) for some p in Q}

Definition: Let M = (Q, Σ, Γ, δ, q0, z0, F) be a PDA. The language accepted by final state, denoted LF(M), is the set

{w | (q0, w, z0) ⊢* (p, ε, γ) for some p in F and γ in Γ*}

Definition: Let M = (Q, Σ, Γ, δ, q0, z0, F) be a PDA. The language accepted by empty stack and final state, denoted L(M), is the set

{w | (q0, w, z0) ⊢* (p, ε, ε) for some p in F}

Questions:
How does the formal definition of a PDA differ from that given in the book?
Does the book define string acceptance by empty stack, final state, both, or neither?

Lemma 1: Let L = LE(M1) for some PDA M1. Then there exists a PDA M2 such that L = LF(M2).

Lemma 2: Let L = LF(M1) for some PDA M1. Then there exists a PDA M2 such that L = LE(M2).

Theorem: Let L be a language. Then there exists a PDA M1 such that L = LF(M1) if and only if there exists a PDA M2 such that L = LE(M2).

Corollary: The PDAs that accept by empty stack and the PDAs that accept by final state define the same class of languages.

Note: Similar lemmas and theorems could be stated for PDAs that accept by both final state and empty stack.


Definition: Let G = (V, T, P, S) be a CFG. If every production in P is of the form

A → aα

where A is in V, a is in T, and α is in V*, then G is said to be in Greibach Normal Form (GNF).

Example:

S → aAB | bB
A → aA | a
B → bB | c
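The GNF condition is easy to check mechanically. Here is a tiny helper (my own, not from the slides), with productions given as (lhs, rhs) pairs and the rhs written as a string of one-character symbols:

```python
def is_gnf(productions, terminals, nonterminals):
    """True iff every production has the shape A -> a alpha,
    with a a terminal and alpha a (possibly empty) string of nonterminals."""
    return all(
        rhs                                      # rhs must be non-empty
        and rhs[0] in terminals                  # first symbol is a terminal
        and all(x in nonterminals for x in rhs[1:])  # rest are nonterminals
        for _, rhs in productions
    )
```

Applied to the example grammar above, every rhs starts with a, b, or c followed only by nonterminals, so the check passes; a production like S → AB would fail it.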

Theorem: Let L be a CFL. Then L − {ε} is a CFL.

Theorem: Let L be a CFL not containing ε. Then there exists a GNF grammar G such that L = L(G).


Lemma 1: Let L be a CFL. Then there exists a PDA M such that L = LE(M).

Proof: Assume without loss of generality that ε is not in L. (The construction can be modified to include ε later.) Let G = (V, T, P, S) be a CFG, and assume without loss of generality that G is in GNF. Construct M = (Q, Σ, Γ, δ, q, z, ∅) where:

Q = {q}
Σ = T
Γ = V
z = S

δ: for all a in Σ and A in Γ, δ(q, a, A) contains (q, γ) if A → aγ is in P

or rather:

δ(q, a, A) = {(q, γ) | A → aγ is in P and γ is in V*}, for all a in Σ and A in Γ

For a given string x in Σ*, M will attempt to simulate a leftmost derivation of x with G.



Example #1: Consider the following CFG in GNF.

S → aS
S → a

G is in GNF; L(G) = a+.

Construct M as:

Q = {q}
Σ = T = {a}
Γ = V = {S}
z = S

δ(q, a, S) = {(q, S), (q, ε)}
δ(q, ε, S) = ∅


Question: Is that all? Is δ complete? Recall that

δ: Q × (Σ ∪ {ε}) × Γ → finite subsets of Q × Γ*

Example #2: Consider the following CFG in GNF.

(1) S → aA
(2) S → aB
(3) A → aA
(4) A → aB
(5) B → bB
(6) B → b

G is in GNF; L(G) = a+b+.

Construct M as:

Q = {q}
Σ = T = {a, b}
Γ = V = {S, A, B}
z = S

(1) δ(q, a, S) = {(q, A), (q, B)}   From productions #1 and #2: S → aA, S → aB
(2) δ(q, a, A) = {(q, A), (q, B)}   From productions #3 and #4: A → aA, A → aB
(3) δ(q, a, B) = ∅
(4) δ(q, b, S) = ∅
(5) δ(q, b, A) = ∅
(6) δ(q, b, B) = {(q, B), (q, ε)}   From productions #5 and #6: B → bB, B → b
(7) δ(q, ε, S) = ∅
(8) δ(q, ε, A) = ∅
(9) δ(q, ε, B) = ∅

Recall δ: Q × (Σ ∪ {ε}) × Γ → finite subsets of Q × Γ*
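The Lemma-1 construction can be sketched in a few lines: each GNF production A → aγ becomes a transition δ(q, a, A) ∋ (q, γ). The function names and encoding below are mine; the grammar is the one from Example #2:

```python
def gnf_to_pda_delta(productions):
    """Build the one-state PDA's transition table from a GNF grammar.
    productions: (lhs, rhs) pairs, rhs = one terminal + nonterminals."""
    delta = {}
    for lhs, rhs in productions:
        a, gamma = rhs[0], rhs[1:]              # GNF: first symbol is terminal
        delta.setdefault(("q", a, lhs), []).append(("q", gamma))
    return delta

def pda_accepts(delta, w, start="S"):
    """Nondeterministic run; stack top is index 0; accept by empty stack."""
    def run(w, stack):
        if not stack:
            return w == ""                      # empty stack, input exhausted
        if not w:
            return False                        # a GNF PDA reads a symbol per move
        return any(run(w[1:], gamma + stack[1:])
                   for _, gamma in delta.get(("q", w[0], stack[0]), []))
    return run(w, start)

# Example #2 grammar: S -> aA | aB, A -> aA | aB, B -> bB | b
G2 = [("S", "aA"), ("S", "aB"), ("A", "aA"), ("A", "aB"),
      ("B", "bB"), ("B", "b")]
```

Empty entries such as δ(q, b, S) = ∅ simply never appear in the dictionary, and the recursion explores every option, mirroring the nondeterministic choice between productions with the same leading terminal.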

For a string w in L(G), the PDA M will simulate a leftmost derivation of w.

If w is in L(G) then (q, w, z0) ⊢* (q, ε, ε)
If (q, w, z0) ⊢* (q, ε, ε) then w is in L(G)

Consider generating a string using G. Since G is in GNF, each sentential form in a leftmost derivation has the form:

=> t1 t2 … ti A1 A2 … Am
   (terminals followed by non-terminals)

And each step in the derivation (i.e., each application of a production) adds a terminal and some non-terminals. Applying A1 → ti+1 α gives:

=> t1 t2 … ti ti+1 α A2 … Am

Each transition of the PDA simulates one derivation step. Thus, the ith step of the PDA's computation corresponds to the ith step in a corresponding leftmost derivation. After the ith step of the computation of the PDA, t1 t2 … ti are the symbols that have already been read by the PDA, and A1 A2 … Am are the stack contents.

For each leftmost derivation of a string generated by the grammar, there is an equivalent accepting computation of that string by the PDA. Each sentential form in the leftmost derivation corresponds to an instantaneous description in the PDA's corresponding computation.

For example, the PDA instantaneous description corresponding to the sentential form

=> t1 t2 … ti A1 A2 … Am

would be

(q, ti+1 ti+2 … tn, A1 A2 … Am)

Example: Using the grammar from example #2:

S => aA      (1)
  => aaA     (3)
  => aaaA    (3)
  => aaaaB   (4)
  => aaaabB  (5)
  => aaaabb  (6)

The corresponding computation of the PDA:

(q, aaaabb, S) ⊢ (q, aaabb, A)   (1)/1
               ⊢ (q, aabb, A)    (2)/1
               ⊢ (q, abb, A)     (2)/1
               ⊢ (q, bb, B)      (2)/2
               ⊢ (q, b, B)       (6)/1
               ⊢ (q, ε, ε)       (6)/2

The string is read and the stack is emptied; therefore the string is accepted by the PDA.

Another Example: Using the PDA from example #2:

(q, aabb, S) ⊢ (q, abb, A)   (1)/1
             ⊢ (q, bb, B)    (2)/2
             ⊢ (q, b, B)     (6)/1
             ⊢ (q, ε, ε)     (6)/2

The corresponding derivation using the grammar:

S => aA     (1)
  => aaB    (4)
  => aabB   (5)
  => aabb   (6)

Example #3: Consider the following CFG in GNF.

(1) S → aABC
(2) A → a
(3) B → b
(4) C → cAB
(5) C → cC

G is in GNF.

Construct M as:

Q = {q}
Σ = T = {a, b, c}
Γ = V = {S, A, B, C}
z = S

(1)  δ(q, a, S) = {(q, ABC)}         From S → aABC
(2)  δ(q, a, A) = {(q, ε)}           From A → a
(3)  δ(q, a, B) = ∅
(4)  δ(q, a, C) = ∅
(5)  δ(q, b, S) = ∅
(6)  δ(q, b, A) = ∅
(7)  δ(q, b, B) = {(q, ε)}           From B → b
(8)  δ(q, b, C) = ∅
(9)  δ(q, c, S) = ∅
(10) δ(q, c, A) = ∅
(11) δ(q, c, B) = ∅
(12) δ(q, c, C) = {(q, AB), (q, C)}  From C → cAB, C → cC
(13) δ(q, ε, S) = ∅
(14) δ(q, ε, A) = ∅
(15) δ(q, ε, B) = ∅
(16) δ(q, ε, C) = ∅

Notes:
Recall that the grammar G was required to be in GNF before the construction could be applied. As a result, it was assumed at the start that ε was not in the context-free language L. What if ε is in L?

Suppose ε is in L:

1) First, let L′ = L − {ε}.
   Fact: If L is a CFL, then L′ = L − {ε} is a CFL.
   By an earlier theorem, there is a GNF grammar G such that L′ = L(G).

2) Construct a PDA M such that L′ = LE(M).
   How do we modify M to accept ε? Add δ(q, ε, S) = {(q, ε)}? No!

Counter Example: Consider L = {ε, b, ab, aab, aaab, …}. Then L′ = {b, ab, aab, aaab, …}.

The GNF CFG for L′:

(1) S → aS
(2) S → b

The PDA M accepting L′:

Q = {q}
Σ = T = {a, b}
Γ = V = {S}
z = S

δ(q, a, S) = {(q, S)}
δ(q, b, S) = {(q, ε)}
δ(q, ε, S) = ∅

If δ(q, ε, S) = {(q, ε)} is added, then:

LE(M) = {ε, a, aa, aaa, …, b, ab, aab, aaab, …}

3) Instead, add a new start state q′ with transitions:

δ(q′, ε, S) = {(q′, ε), (q, S)}
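Step 3 can be sketched as a wrapper (names are mine): the new start state either erases the start symbol, accepting ε by empty stack, or hands control to the unchanged PDA for L′. Using the counter-example language, where L′ = a*b:

```python
def accepts_with_epsilon(accepts_l_prime, s):
    """Wrap a PDA for L' = L - {eps} so that epsilon is also accepted."""
    if s == "":
        return True          # (q', eps, S) -> (q', eps): erase S, empty stack
    return accepts_l_prime(s)  # (q', eps, S) -> (q, S): defer to M unchanged

def accepts_l_prime(s):
    """Stand-in for the PDA of the counter example: L' = a*b."""
    return s.endswith("b") and set(s[:-1]) <= {"a"}
```

Because the ε-move out of q′ either ends the computation or restores the original start configuration, no string like "a" or "aa" can sneak in the way it did when δ(q, ε, S) = {(q, ε)} was added directly.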

Lemma 1: Let L be a CFL. Then there exists a PDA M such that L = LE(M).

Lemma 2: Let M be a PDA. Then there exists a CFG G such that LE(M) = L(G).

Theorem: Let L be a language. Then there exists a CFG G such that L = L(G) iff there exists a PDA M such that L = LE(M).

Corollary: The PDAs define the CFLs.

