
Chapter 1.

Preliminaries
A preliminary step toward abstraction is to introduce proper notation that symbolically represents the objects in which we are interested. In particular, we will see that formal languages rely heavily on such symbolic representations. In this chapter we begin by introducing such notational conventions.
To follow the discussions presented in this book, we need the capacity for logical reasoning, which will in turn be enhanced through the course. The book often leaves out the routine and tedious details of a proof that can be completed by following one of the conventional proof techniques, such as proof by induction. This does not mean that proofs can be ignored. Thus, this chapter presents an overview of proof techniques with easy examples and helps the reader build confidence in validating and presenting his or her own proofs. Studying this chapter, the reader will see that "acquired proof-phobia" is curable.

1.1 Notation
    Symbols, strings, composite symbols
    Sets and set operations
1.2 Propositions and logical connectives
1.3 Proof techniques
    Proof by contrapositive, proof by cases, proof by contradiction
    Proof by example, proof by generalization
    Proof by induction, proof by the pigeonhole principle
    Proof by counting, diagonalization technique
Rumination
Exercises


1.1 Notation
In this book we shall use the following notation without further explanation.
• The leading letters of the English alphabet, a, b, c, …, will be used as symbols to denote an object.
• A string is a finite sequence of symbols, for which we shall use the trailing letters of the English alphabet, …, u, v, w, x, y, z.
• For two strings x and y, xy denotes the concatenation of x and y.
• Epsilon (ε) denotes the null string, which, for any string x, has the property xε = εx = x.
• For a string x, |x| denotes the length of x, i.e., the number of symbols in x.
• For a string x, x^0 = ε, x^{i+1} = xx^i, and x^R denotes the reverse of the string x.
  Example: (ab)^3 = ababab, (ab)^0 = ε, (abcd)^R = dcba.
• We shall use capital letters to denote sets.
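These conventions map directly onto Python strings, where the null string is simply "". A small sketch (the helper names power and reverse are our own, not from the text):

```python
EPSILON = ""  # the null string

def power(x: str, i: int) -> str:
    """x^i: x concatenated with itself i times (x^0 is the null string)."""
    return x * i

def reverse(x: str) -> str:
    """x^R: the reverse of x."""
    return x[::-1]

x, y = "ab", "cd"
assert x + y == "abcd"               # xy: concatenation
assert x + EPSILON == EPSILON + x == x
assert len("abab") == 4              # |x|: the length of x
assert power("ab", 3) == "ababab"    # (ab)^3
assert power("ab", 0) == EPSILON     # (ab)^0
assert reverse("abcd") == "dcba"     # (abcd)^R
print("all notation examples check out")
```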


• Composite symbols: To show the constituent entities that a symbol represents, composite symbols are used in addition to the conventional superscript, subscript, and primed notation. A composite symbol is represented by a finite number of symbols listed in square brackets as follows.
  [a, b], [1, A, 5], [a, qi, (c, d)]
Notice that we can always substitute a single symbol for a composite symbol, if needed.

• Set specification: There are two ways to specify a set in curly brackets: either by explicitly listing all the members or by describing the properties of the members.
(a) Explicit. Examples: A = {2, 8}, Σ = {a, b, c}
    In particular, the empty (or null) set is denoted by Φ.
(b) By set properties of the form {x | properties of x}.
    Examples: A = {n | n is an even number} = {2, 4, 6, …},
    B = {xx^R | x is a binary string} = {00, 11, 0110, 1001, 1111, …},
    C = {a^n b^n c^n | n ≥ 0} = {ε, abc, aabbcc, aaabbbccc, …}

Set operations
• Operations on sets of strings: Let A = {aa, ab, abc} and B = {aa, bc}.
(a) Union (∪): A ∪ B = {aa, ab, bc, abc}, A ∪ Φ = A
(b) Intersection (∩): A ∩ B = {aa}, A ∩ Φ = Φ
(c) Subtraction (−): A − B = {ab, abc}
(d) Exclusive or (⊕): A ⊕ B = {ab, abc, bc}
(e) Product: AB = {aaaa, aabc, abaa, abbc, abcaa, abcbc},
    AΦ = ΦA = Φ
(f) Complement: Let U be the universal set, i.e., the set of all possible elements. Then the complement of a set A, denoted by A̅, is A̅ = U − A.
(g) De Morgan's laws: the complement of A ∪ B is A̅ ∩ B̅, and the complement of A ∩ B is A̅ ∪ B̅.
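All of these operations except the product have direct counterparts on Python's built-in set type; the product (elementwise concatenation) takes one comprehension. A sketch using the example sets A and B from above, with a small hypothetical universal set U for the complement:

```python
A = {"aa", "ab", "abc"}
B = {"aa", "bc"}

assert A | B == {"aa", "ab", "bc", "abc"}   # union
assert A & B == {"aa"}                      # intersection
assert A - B == {"ab", "abc"}               # subtraction
assert A ^ B == {"ab", "abc", "bc"}         # exclusive or

# Set product: concatenate every string of A with every string of B.
product = {x + y for x in A for y in B}
assert product == {"aaaa", "aabc", "abaa", "abbc", "abcaa", "abcbc"}

# De Morgan's laws, checked against a small universal set U (our choice).
U = A | B | {"c"}
complement = lambda S: U - S
assert complement(A | B) == complement(A) & complement(B)
assert complement(A & B) == complement(A) | complement(B)
print("set identities verified")
```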


(h) Closure (star, or Kleene closure): For a set A, define A^0 = {ε}, and for i ≥ 1, A^i = AA^{i-1}. The closure of the set A is defined as follows.

A* = ∪_{i=0}^{∞} A^i = A^0 ∪ A^1 ∪ A^2 ∪ …

The closure A* is the set of all strings that we can get by picking an arbitrary number of strings in A (the same string may be picked an arbitrary number of times) and concatenating them in any order. For example, for the binary alphabet Σ = {0, 1}, the closure Σ* = {0, 1}* is the set of all binary strings, including the null string ε. If Σ is the English alphabet, Σ* is the set consisting of ε and all strings over Σ. Every closure contains the null string ε. Kleene plus, A^+, denotes the closure without ε, i.e., A^+ = A* − {ε}.
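Since A* is infinite whenever A is nonempty, code can only enumerate it up to a length bound. A sketch that builds A^0 ∪ A^1 ∪ A^2 ∪ … while new short-enough strings keep appearing (the function name closure_up_to is our own):

```python
def closure_up_to(A, max_len):
    """All strings of A* whose length is at most max_len.
    Iterates A^{i+1} = A A^i, discarding strings beyond the bound."""
    result = {""}            # A^0 = {epsilon}
    frontier = {""}
    while frontier:
        frontier = {a + w for a in A for w in frontier
                    if len(a + w) <= max_len} - result
        result |= frontier
    return result

# Sigma* over the binary alphabet, up to length 2:
assert closure_up_to({"0", "1"}, 2) == {"", "0", "1", "00", "01", "10", "11"}
# Kleene plus drops the null string:
assert closure_up_to({"0", "1"}, 1) - {""} == {"0", "1"}
print(sorted(closure_up_to({"ab"}, 6)))  # ['', 'ab', 'abab', 'ababab']
```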

(i) Power set: For a set A, the power set, denoted by 2^A, is the set of all subsets of A.
Example: Let A = {a, b, c}. Then
2^A = {Φ, {a}, {b}, {c}, {a, b}, {b, c}, {a, c}, {a, b, c}}
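A power set can be enumerated by taking all subsets of each size, which is what itertools.combinations provides. A sketch (power_set is our own helper name; frozensets are used because Python sets cannot contain plain sets):

```python
from itertools import combinations

def power_set(A):
    """The set of all subsets of A; 2^A has 2^|A| members."""
    items = list(A)
    return {frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)}

P = power_set({"a", "b", "c"})
assert len(P) == 2 ** 3
assert frozenset() in P                       # the empty set Φ
assert frozenset({"a", "b", "c"}) in P        # A itself
assert frozenset({"b", "c"}) in P
print(sorted(sorted(s) for s in P))
```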

1.2 Propositions and Logical Connectives
A proposition is a declarative statement that can be either true or false. By
combining propositions using the logical connectives, such as and, or, and not,
which are symbolically denoted by , , and , respectively, we can form another
proposition. Let p and q be two propositions. The three compound propositions p 
q, p  q, and p, respectively, have the truth values defined by the truth tables as
shown below.

p  q  p ∧ q        p  q  p ∨ q        p  ¬p
T  T    T          T  T    T          T   F
T  F    F          T  F    T          F   T
F  T    F          F  T    T
F  F    F          F  F    F
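Truth tables like these can be rebuilt mechanically by enumerating every assignment of truth values, a pattern worth keeping in mind for checking equivalences later in the chapter. A minimal sketch:

```python
from itertools import product

# Rebuild the truth tables for AND, OR and NOT by listing every
# assignment of truth values to p and q, exactly as the tables do.
print("p     q     p∧q   p∨q")
for p, q in product([True, False], repeat=2):
    print(f"{p!s:<5} {q!s:<5} {p and q!s:<5} {p or q!s}")

print("\np     ¬p")
for p in (True, False):
    print(f"{p!s:<5} {not p!s}")
```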


Another commonly used logical connective is the conditional (or if-then), denoted by the symbol →. The truth table for the proposition p → q (read "if p, then q") is defined as shown below. Notice that the truth tables for p → q and ¬p ∨ q are identical. Thus p → q is a short notation for the proposition ¬p ∨ q.
Notice that though we use the term conditional, implying an antecedent-consequent relation, this conditional connective does not necessarily carry such relational meaning.
p  q  p → q  ¬p ∨ q
T  T    T      T
T  F    F      F
F  T    T      T
F  F    T      T

For example, let p be "Elephants fly," and q be "The earth is flat." By the definition, the compound statement p → q is (vacuously) true, regardless of the truth value of statement q and without regard to the meanings of the two statements.


There are several ways of reading the conditional statement p → q. Among others are the following. (Notice that in (c) the order of p and q is reversed.)

(a) if p, then q  (b) p only if q  (c) q if p  (d) if p holds, q follows

The biconditional connective of two propositions p and q, denoted by p ↔ q, is the short notation for the compound statement (p → q) ∧ (q → p). Often this proposition is written as "p iff q", and read "p if and only if q" or "if p then q, and conversely."
We usually prove p ↔ q by proving each of the two parts: p → q (called the "only if" part) and q → p (called the "if" part).


Often we come across a problem that asks us to justify a statement of the form "p implies q" (denoted by p ⇒ q), meaning that q is true in each case in which p is true. Notice that the statement p ⇒ q refers, in particular, to the first row of the truth table of the proposition p → q. In other words, if p → q is true whenever p is true, we say p implies q. The implication p ⇒ q is not a proposition, which would have a truth value, but an assertion about the relationship of two propositions.
If p ⇒ q and q ⇒ p, we write p ⇔ q and say that p and q are equivalent. Consider the truth tables of p → q, ¬p ∨ q, and ¬q → ¬p, which are all identical, as shown below. Clearly, we have (p → q) ↔ (¬p ∨ q) ↔ (¬q → ¬p) for every pair of truth values of p and q. Since ¬p ∨ q and ¬q → ¬p are true in every case in which p → q is true, and vice versa, we can also say (p → q) ⇔ (¬p ∨ q) ⇔ (¬q → ¬p).

p  q  p → q  ¬p ∨ q  ¬q → ¬p
T  T    T      T        T
T  F    F      F        F
F  T    T      T        T
F  F    T      T        T


1.3 Proof Techniques


A proof is a convincing argument that a statement is true. Therefore, the statement to be proven must be a proposition that can be either true or false. Usually it involves a statement of the form p → q. In this case the proof must show that p → q is true. Since p → q is vacuously true whenever p is false, the proof is only concerned with the cases in which p is true. In other words, the proof justifies the implication p ⇒ q.
A theorem is a statement that has been proven true. However, we do not call every
statement proven true a theorem. We reserve the use of the word theorem only for
important statements of interest. Occasionally we prove statements to use in the
proof of other more important statements. We call such statements lemmas.
Sometimes when we have proven a theorem, we can easily show that a further
statement is true. Such statements are called corollaries of the theorem.
Now, we are ready for a quick overview of some important proof techniques.


1. Proof by contrapositive: To prove p ⇒ q, we use the following equivalence.

(p → q) ↔ (¬q → ¬p)

Example: Let p and q be, respectively, the statements "it rained" and "the lawn is wet." Then instead of proving that p → q (i.e., "if it rained, then the lawn is wet") is true, we can prove that ¬q → ¬p (i.e., "if the lawn is not wet, then it did not rain") is true.

2. Proof by cases: When p is a compound statement, we prove p ⇒ q by considering each component of p in turn.
Example: Suppose that p = p1 ∨ p2 ∨ p3. Using the equivalence
((p1 ∨ p2 ∨ p3) → q) ↔ ((p1 → q) ∧ (p2 → q) ∧ (p3 → q)),
we prove p ⇒ q by justifying each of (p1 ⇒ q), (p2 ⇒ q), and (p3 ⇒ q).


3. Proof by contradiction: This proof technique shows that if p ⇒ q does not hold, i.e., the statement p → q is false, then it contradicts some statement, possibly including p, that we know to be true. Recall that the implication p ⇒ q presupposes that p is true. So, if we can show that p is false under the supposition that p → q is false, we also have a contradiction.
Example: Suppose that the statement "everyone who studies automata and formal languages gets a good job" is a proven fact. Let p be the statement "Tom failed to get a good job," and let q be the statement "Tom did not study automata and formal languages." To justify the statement p ⇒ q (i.e., that p → q is true), we suppose that p → q is false and show that this contradicts the above fact.
Recall that (p → q) ↔ (¬p ∨ q). If p → q is false, ¬p ∨ q must be false. It follows that both ¬p and q must be false, i.e., (p ∧ ¬q) is true, implying that Tom failed to get a good job and he studied automata and formal languages. This contradicts the proven fact that everybody who studies automata and formal languages gets a good job.
The proof by contradiction is quite popular in computer science. We will see more interesting examples in other chapters.


4. Dealing with quantified statements: Occasionally a statement contains variables bound to a phrase "for all", "for every", or "there exists" in the following forms.
(1) "for all (or for every) x, …" (denoted by ∀x …)
(2) "there exists (or there is) x …" (denoted by ∃x …)
The phrases "for all" and "there exists" are, respectively, called the universal quantifier and the existential quantifier.

(a) Proof by example: Let ∃x [p(x)] denote a statement with the existential quantification bound to variable x. To prove such a statement is true, it is enough to show an example.
Example: We can justify the statement "Among the students who took this course, there exists a student who got a grade of A" by showing a student, say Mary, who got an A in the course.


(b) Proof by generalization: We apply this technique to justify a statement of the form ∀x [p(x)]. If the domain of the variable x is small, we may examine the statement for each value of x. Otherwise, we need to come up with a generalization, possibly using other proven facts.
Example: Suppose that we want to justify the statement "all the students who take the language course are from abroad." If possible, it is a good strategy to find some information, like the curriculum of the school, showing that only foreign students are admitted to the course, for example, to learn English as their second language.

Notice that to disprove a statement of the form ∀x [p(x)] (or of the form ∃x [p(x)]), you may use the same strategy as for proving a statement of the form ∃x [p(x)] (respectively, ∀x [p(x)]); in the first case, an example against the statement is called a counterexample.


5. Proof by induction: This is a quite effective proof technique for justifying a statement that claims a generalization over all integers n ≥ n0, i.e., a statement of the form ∀n ≥ n0 [p(n)]. The statement may explicitly describe the involvement of the natural numbers, as in the following example, or it may be implicit (as we shall see in other examples).

∀n ≥ 0 [ Σ_{i=0}^{n} i = n(n+1)/2 ]

The proof is carried out in three steps: (1) basis, (2) induction hypothesis, and (3) induction. In step (1), we show that p(n) is true for n0, the lower bound. In step (2), we suppose that p(m) is true for every natural number m with n0 ≤ m < n, and finally in step (3) we show that p(n) is true.

Break Time: Applicant Bloopers

Interviewer: "Do you think you can handle a variety of tasks?"
Applicant: "I should say so. I've had nine totally different jobs in the past five months."

The stern-faced personnel officer told an applicant that they needed an individual who is totally responsible.
"I sure qualify, then," replied the applicant. "Everywhere I've worked, whenever something went wrong, I was responsible."
- Jim -

Example: Prove that ∀n ≥ 0 [ Σ_{i=0}^{n} i = n(n+1)/2 ].

Proof. (1) Basis (n = 0): we show that for n = 0 the equality Σ_{i=0}^{n} i = n(n+1)/2 holds. The left side of the equation is Σ_{i=0}^{0} i = 0, and the right side is 0(0+1)/2 = 0. Obviously, the statement is true.

(2) Induction hypothesis: suppose that for every m < n, Σ_{i=0}^{m} i = m(m+1)/2.

(3) Induction: we show that for n = m+1 the equality Σ_{i=0}^{n} i = n(n+1)/2 holds.

The left side of the equation is Σ_{i=0}^{n} i = Σ_{i=0}^{m+1} i = Σ_{i=0}^{m} i + (m+1). Since by the induction hypothesis Σ_{i=0}^{m} i = m(m+1)/2, we get the right side as follows.

Σ_{i=0}^{m} i + (m+1) = m(m+1)/2 + (m+1) = (m+1)(m+2)/2 = n(n+1)/2

In the rumination section at the end of this chapter, we discuss this proof technique
further with another example.
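The induction proof guarantees the closed form for every n; a brute-force check over a finite range illustrates (but of course does not replace) the proof, and the algebra of the induction step can be spot-checked too:

```python
# Verify sum(0..n) == n(n+1)/2 for a finite range of n.
for n in range(0, 1000):
    assert sum(range(n + 1)) == n * (n + 1) // 2

# The induction step itself: if the formula holds for m,
# then adding (m+1) yields the formula for m+1.
m = 41
assert m * (m + 1) // 2 + (m + 1) == (m + 1) * (m + 2) // 2
print("formula verified for n = 0 .. 999")
```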


6. Constructive proof and non-constructive proof: Constructive proof is another name for proof by example, which is used for proving a statement of the form ∃x [p(x)]. The technique shows a specific value for x that makes p(x) true. A non-constructive proof shows indirectly that such a value for x must exist by using other facts.
Example: Prove that ∃n [n is a prime number and n > 2].
Constructive proof: 5 is a prime number greater than 2.
Non-constructive proof: It is proven that every natural number greater than 1 is a product of prime numbers. If there were no prime number greater than 2, then 2 would be the only prime number, and every natural number greater than 1 would have to equal 2^i for some i ≥ 1. However, there are natural numbers (e.g., 5) that are not a power of 2. This contradicts the proven fact that every natural number greater than 1 is a product of prime numbers.


7. Proof by the pigeonhole principle: The pigeonhole principle says that if m


letters are distributed among n < m pigeonholes, there must be at least one
pigeonhole that contains more than one letter.


The pigeonhole principle appears intuitively obvious. But an intuition by itself


cannot be a proof. We need a sound logical argument for the proof. This principle can
be proved by the proof by induction technique. We will show this in the rumination
section at the end of this chapter.


Application example: Using the pigeonhole principle, show that in a directed graph with n nodes, any path of length greater than or equal to n involves a cycle.

Proof: Let n be the number of nodes of the graph. On any path of the graph, the number of node visits is greater than the path length, i.e., than the number of edges. (Notice that we count the starting node of a path as visited.)
Let the nodes of the graph correspond to pigeonholes, and let visiting a node correspond to putting a letter in its pigeonhole. If the path length is greater than or equal to n, then more than n node visits occur. In terms of the pigeonholes, this implies that we are putting m (> n) letters into the n pigeonholes. According to the principle, there is a pigeonhole (i.e., a node) containing more than one letter (i.e., visited more than once). This implies that the path contains a cycle.
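The pigeonhole argument can be made concrete in a few lines: scan the sequence of visited nodes and report the first node seen twice, which marks the start of a cycle. A sketch (first_repeated_node is our own helper name):

```python
def first_repeated_node(path):
    """Return the first node visited twice on the path (the start of a
    cycle), or None if no node repeats."""
    seen = set()
    for node in path:
        if node in seen:
            return node
        seen.add(node)
    return None

# A path of length 4 (5 node visits) in a 4-node graph {0, 1, 2, 3}
# must revisit some node, just as the pigeonhole principle predicts.
path = [0, 2, 3, 1, 2]          # edges: 0->2, 2->3, 3->1, 1->2
assert first_repeated_node(path) == 2
# A shorter path need not repeat:
assert first_repeated_node([0, 1, 2, 3]) is None
print("cycle found at node", first_repeated_node(path))
```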


8. Proof by counting: Many theorems involve the number of entities that satisfy a certain property. One way to prove such a theorem is by figuring out that number, or a bound on it. (The technique does not necessarily count exactly; the name is somewhat misleading.)

Example: Prove that ∀n > 0 [it is impossible to put all the binary numbers of length n in the rows and columns of an n × n bit (0/1) matrix].


Proof: We prove the statement in two cases: (1) when n > 2, and (2) when n = 1 or 2.
Case (1), when n > 2: There are 2^n binary numbers of length n, while the total number of columns and rows is only 2n. Since 2^n > 2n for every n > 2, the rows and columns cannot accommodate all 2^n numbers, so the statement is true.
Case (2), when n = 1 or 2: We leave the proof as an exercise for the reader.
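Both cases can be checked mechanically: the counting inequality covers n > 2, and for n = 1, 2 an exhaustive search over all n × n bit matrices settles the small cases (the helper covers_all is our own; note this check illustrates the claim but is not the requested proof):

```python
from itertools import product

def covers_all(n):
    """Does some n-by-n bit matrix contain every binary string of
    length n among its rows and columns?"""
    strings = {"".join(bits) for bits in product("01", repeat=n)}
    for entries in product("01", repeat=n * n):
        m = [entries[i * n:(i + 1) * n] for i in range(n)]
        rows = {"".join(r) for r in m}
        cols = {"".join(m[i][j] for i in range(n)) for j in range(n)}
        if rows | cols == strings:
            return True
    return False

# Case n > 2: the counting argument, 2^n strings vs. only 2n lines.
for n in range(3, 16):
    assert 2 ** n > 2 * n

# Cases n = 1, 2: exhaustive search confirms impossibility.
print([covers_all(n) for n in (1, 2)])  # [False, False]
```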


9. Diagonalization technique: Given a set of lists, this technique shows how to construct a list that is not in the set. This is a powerful tool for proving certain properties of formal languages presented in Chapter 15.
Example: Given an n × n bit matrix M, for some n > 0, construct a binary number of length n which does not appear in any row or column of M.
Answer: Let mii be the entry at the i-th row and i-th column of M. Construct the following binary number v,

v = m̄11 m̄22 … m̄nn, where m̄ii is the complement of mii.

Clearly, v does not appear in any row or column of M. This is true because, for any i, the binary number appearing in the i-th row (or the i-th column) has mii as its i-th bit, which is the complement of the i-th bit of v.
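The diagonal construction takes one line of code per step: read the diagonal, complement each bit, and check that the result avoids every row and column. A sketch (diagonal_complement is our own helper name):

```python
def diagonal_complement(M):
    """Given an n-by-n bit matrix as a list of strings, build the binary
    number v whose i-th bit is the complement of M[i][i]."""
    n = len(M)
    return "".join("1" if M[i][i] == "0" else "0" for i in range(n))

M = ["110",
     "010",
     "101"]
v = diagonal_complement(M)
assert v == "000"   # the diagonal bits are 1, 1, 1

# v differs from row i (and column i) in the i-th bit, so it can
# appear nowhere in M.
rows = set(M)
cols = {"".join(M[i][j] for i in range(len(M))) for j in range(len(M))}
assert v not in rows and v not in cols
print("diagonal complement:", v)
```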

Rumination (1): Conditional connective and implication
Note that the truth table of the conditional connective (if-then) does not express an antecedent-consequent (or cause-effect) relationship; it is a logical definition. It is, however, a very useful template for deducing one proposition (the consequent) from another (the antecedent), but only if they are related. Otherwise, we end up with nonsense.
For example, let p be the statement "I watch TV" and q be "Our soccer team wins." Then, by simple manipulations, we get the following.
(¬p ∨ p) ∨ (¬q ∨ q) ≡ (¬p ∨ q) ∨ (¬q ∨ p)
Since (¬p ∨ p) ∨ (¬q ∨ q) is true, either (¬p ∨ q) or (¬q ∨ p) must be true, i.e.,
(p → q) is true, or (q → p) is true.
A verbal interpretation of this last symbolic statement would be as follows.
"It is true that if I watch TV, then our soccer team wins, or if our soccer team wins, then I watch TV."



Rumination (2): Proof by induction

Proof by induction is a powerful technique for proving that a statement is true for all integers n ≥ n0, i.e., a proposition with a universal quantification of the form ∀n ≥ n0 [p(n)]. The statement to be proven may clearly show the lower bound n0 and the integer variable n, as in the mathematical equation presented in this chapter as an example. However, occasionally we come across statements that do not explicitly describe those numbers. We should figure them out before applying the technique.
When the statement is presented as a mathematical formula, this proof technique works so routinely that we may fail to comprehend the underlying logical basis of the proof. To understand why the technique works, consider, as an analogy, how we walk up a stairway. To walk up to the n-th stair,
(1) we must be able to get to the bottom (i.e., the 0-th stair) of the stairway, and
(2) whenever we are on the i-th stair, for any i (0 ≤ i < n), we must be able to move up to the (i+1)-th stair.

The proof by induction logically works the same way. By the basis step, (1) it shows that the statement is true for the lower bound n0 of the quantified variable n. By the induction hypothesis, (2) we suppose that the statement is true for all m (n0 ≤ m < n), and by the induction step, (3) we show that the statement is true for n.
Notice that because the statement is true for n0 (by the basis step), it is also true for n0 + 1 by the induction step. Since it is true for n0 + 1, it is also true for n0 + 2, and so on. We can claim that the statement is true for all n ≥ n0. (Sometimes the proof is described in two steps, with the hypothesis included in the induction step.) Below, the "walking up a stairway" analogy and the proof by induction are shown side by side.

Rumination: Proof by induction (stairway illustration)

Prove that ∀n ≥ 0 [ Σ_{i=0}^{n} i = n(n+1)/2 ].

(1) Basis (n = 0, the bottom of the stairway): Σ_{i=0}^{0} i = 0 and n(n+1)/2 = 0, so the statement is true for n = 0.
(2) Hypothesis (standing on stair m): for every m < n, suppose that Σ_{i=0}^{m} i = m(m+1)/2.
(3) Induction step (climbing to stair m+1 = n): show that Σ_{i=0}^{n} i = n(n+1)/2. Indeed,
0 + 1 + 2 + … + m + (m+1) = m(m+1)/2 + (m+1) = (m+1)(m+2)/2 = n(n+1)/2.


24
Rumination: Proof by Induction
Preliminary

As another application example of the proof by induction technique, here we justify the pigeonhole principle, which is repeated below.
Pigeonhole Principle: Suppose that there are n pigeonholes and m (> n) letters. If we distribute all the letters (in any way) into the pigeonholes, there will be at least one pigeonhole that contains more than one letter.

Proof: (1) Basis: The pigeonhole principle makes sense when there is at least one pigeonhole. Hence the lower bound is n0 = 1. In this case, if we put all the m > 1 letters into the single pigeonhole, that pigeonhole clearly contains more than one letter. The principle is true.

(2) Hypothesis: Suppose that for every number of pigeonholes n′ < n, the principle is true.

(3) Induction: We prove that the principle is true for n′ + 1 pigeonholes and m (> n′ + 1) letters. We divide the problem into two cases depending on the number of letters contained in one particular pigeonhole, say hole n′ + 1.

(a) When pigeonhole n′ + 1 has one or no letter: At least m − 1 > n′ letters are distributed among the remaining n′ pigeonholes. According to the induction hypothesis, among these n′ pigeonholes there is at least one pigeonhole that contains more than one letter. The principle is true.
(b) When pigeonhole n′ + 1 has two or more letters: Obviously this pigeonhole has more than one letter. Again, the principle is true.


Exercises
1.1 What does each of the following expressions denote? If your answer is a finite set, show all the members. If it is infinite, use set property notation.

(a) a^5  (b) |aba^7b|  (c) |ε|  (d) x^R, where x = abab  (e) AB, where A = {a, b, c}, B = {aaaa, bbbb, cccc}
(f) A*BA*, where A and B are the sets shown in part (e) above.

1.2 Among the strings given in (1) – (7) below, identify all the members that belong to each of the following sets.

(a) {xyx^R | x ∈ {a, b, c}*, y = ccc}  (b) {xx | x ∈ {a, b, c}*}

(c) {x | x ∈ {a, b, c}* and in x the number of a's is strictly greater than the number of b's} ∩ {a^i b^j | i, j > 0}
(d) ({a, b, c}* − ({a^i b^j | i > j > 0} ∪ {a^i b^j | 0 < i < j})) ∩ {a^i b^j | i, j > 0}

(1) aaaabbbb (2) aaaa (3) aaaacccaaaa (4) bbbaaaa (5) abcccccba (6) aaaaab (7) abaaba

1.3 For the sets of strings given below, show the result of each of the following set operations. Write your answer in set property notation. Try to make it as simple as possible.

(a) L0 ∪ L4  (b) L0 ∩ L1  (c) L3 ∩ L5  (d) (L2 − L1) ∩ L4  (e) L0L5

L0 = {c^i b^j a^k | i, j, k > 0, j = k}   L1 = {c^i b^j a^k | i, j, k ≥ 0, i = j}
L2 = {a, b, c}*   L3 = {xx^R | x ∈ {a, b, c}*}
L4 = {c^i b^j a^k | i, j, k ≥ 0}   L5 = {c^i b^j c^k | i, j, k ≥ 0}

1.4 Show that the following statement is true for any pair of integers a and b.

a = b if and only if a ≥ b and a ≤ b.

1.5 Prove that for all integers n ≥ 2, it is impossible to put all binary numbers of length n in the n rows and n columns of an n × n bit matrix.

1.6 Let G be a directed graph with n ≥ 2 nodes, each node labeled with either 1 or 2. Let P12 be a path in G which starts at a node with label 1 and ends at a node with label 2. An edge is said to be "good" if it links two nodes with different labels (i.e., 1–2 or 2–1). For example, a path P12 visiting nodes labeled 1 2 2 1 2 1 2 2 1 2 has 7 good edges. Prove that in any path P12 the number of good edges is always odd.

1.7 Let's change the definition of good edges (in problem 1.6) the other way around, i.e., an edge is "good" if it links two nodes with the same label. Can you claim the same property for the number of good edges? Justify your answer.


1.8 Using the pigeonhole principle we proved that the following statement is true.

[In a directed graph with n nodes, any path of length greater than or equal to n involves a cycle.]

Here, we want to justify the statement using the proof-by-induction technique. Answer each of the following questions.
(1) What is the basis of the statement? In particular, identify the lower bound (n0). Show that the statement is true for the lower bound.
(2) Write the induction hypothesis.
(3) For the argument in the induction step, consider the typical cases of a path whose length is greater than or equal to n: the cases where there is some node which the path does not go through, and the case where the path goes through every node. Complete this step by answering the following questions.
(i) How can you argue that in the cases where some node is not visited, the path involves a cycle? (Hint: use the hypothesis.)
(ii) Why does the path involve a cycle in the case where every node is visited?

Models of Language Generation:
Grammars

Chapter 2. Formal Languages and Grammars
In Chapter 1 we learned how to represent a set. If the set is small, we write out its members. Otherwise, we describe it in terms of a set property representing the characteristics of the set elements. This approach is useful for sets whose members (i.e., strings) have properties that are simple to describe. For the formal languages in real applications, like programming languages, it is impractical to represent a language in terms of set properties, because in general the properties are too complex to identify and describe succinctly. Hence the major part of the book will be dedicated to languages represented by a set of rewriting rules, called a grammar, which derives the language.

In this chapter we will learn how to derive a language with a given set of rewriting rules and, conversely, given a language, how to find a set of rewriting rules that derives it. We begin with a couple of rewriting rules to see how they derive a language.

2.1 Languages
2.2 Deriving a language with rewriting rules: Examples
2.3 Definitions: Formal languages and grammars
    Type 0 (phrase structured) grammars
    Type 1 (context-sensitive) grammars
    Type 2 (context-free) grammars
    Type 3 (regular) grammars
2.4 Grammars and their languages: Examples
Rumination
Exercises

2.1 Languages
Webster's dictionary defines the word language as follows.

(1) The words, their pronunciation, and the methods of combining them used and understood by a community.
(2) A system of signs and symbols and rules for using them that is used to carry information.

In the above definitions we see two common factors: (a) language elements, such as words, their pronunciation, and signs and symbols, and (b) methods of combining (or rules for using) them. In natural languages the elements and the rules for using them have evolved; they are not defined. Each sentence is constructed from language elements combined according to the rules of the language. Since the language elements and the rules for using them give rise to all the sentences, we can simply define a natural language as a set of sentences.
In formal languages we usually deal with strings of symbols (not pronunciation) and the rules for constructing the strings, which correspond to the sentences of a natural language. Hence, in formal languages we simply define a language as a set of strings of symbols.

According to the above definition, the following sets are languages. Notice that set A is finite and the others are infinite. Set D is the C++ programming language.
A = {aaba, bba, aaabbb}
B = {xx^R | x ∈ {0, 1}^+} = {00, 11, 0110, 1001, 1111, …}
C = {a^n b^n c^n | n ≥ 0} = {ε, abc, aabbcc, aaabbbccc, …}
D = {x | x is a C++ program}
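Because a language is just a set of strings, asking whether a string belongs to a language is a yes/no function on strings. A sketch of membership tests for the sets B and C above (in_B and in_C are our own helper names; B's members are exactly the nonempty even-length binary palindromes):

```python
def in_B(w):
    """Is w in B = {x x^R | x a nonempty binary string}?"""
    return (len(w) > 0 and len(w) % 2 == 0
            and set(w) <= {"0", "1"} and w == w[::-1])

def in_C(w):
    """Is w in C = {a^n b^n c^n | n >= 0}?"""
    n = len(w) // 3
    return w == "a" * n + "b" * n + "c" * n

assert in_B("0110") and in_B("1001") and not in_B("010")
assert in_C("") and in_C("aabbcc") and not in_C("aabcc")
print("membership tests pass")
```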

Love Break Time

Never pretend to a love which you do not actually feel, for love is not ours to command. - Alan Watts -
We can only learn to love by loving. - Iris Murdoch -
Love is the triumph of imagination over intelligence. - H. L. Mencken -
There is no remedy for love but to love more. - H. D. Thoreau -

2.2 Deriving a language with rules
The following example shows a set of rewriting rules named spring-grammar.

spring-grammar:
<spring-sentence> → <spring-subject> <spring-verb>
<spring-subject> → birds | flowers | buds
<spring-verb> → sing | bloom | fly

A rewriting rule (or simply rule) has two sides, a left side and a right side, separated by an arrow. When there is more than one alternative on the right side, we separate the alternatives with a vertical bar (|). In the above example, every rule has one element on its left side. There can be more, as we shall see later. We will refer to each rule by the element on its left side.
In the example, the elements delimited by a pair of angle brackets < … > are used only for generating a "spring sentence," consisting of one of the "spring subjects" birds, flowers, and buds, followed by one of the "spring verbs" sing, bloom, and fly.


Let's see how we can derive a "spring sentence" with the above set of rules. We begin with a designated element, in this example <spring-sentence>, and rewrite it with one of the right-side alternatives of its rule. The choice is arbitrary if there is more than one alternative. In the example we have no choice except <spring-subject> <spring-verb>. So, we get the following.

<spring-sentence> ⇒ <spring-subject> <spring-verb>


We call the result of a rewriting step a sentential form. From the current sentential form, we pick an element that appears on the left side of a rule and rewrite it with one of the rule's right-side alternatives. We repeat this rewriting step until we get a sentential form to which no rule is applicable, as shown below.
This last sentential form ("birds sing" in the example) is a member (string or sentence) of the language generated by the set of rules.

<spring-sentence> ⇒ <spring-subject> <spring-verb> ⇒ birds <spring-verb> ⇒ birds sing

In the above derivation, if we had rewritten <spring-subject> with flowers instead of
birds, we would have gotten the spring sentence "flowers sing." It follows that the set
of rules in the spring-grammar can derive all nine sentences that can be formed with the
three subjects and three verbs. Denoting this set of sentences (i.e., the language) by
L(spring-grammar), we get

L(spring-grammar) = { birds sing, birds bloom, birds fly, flowers sing,
                      flowers bloom, flowers fly, buds sing, buds bloom, buds fly }
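Since the grammar is so small, this language can also be checked mechanically. The following Python sketch (the variable names are ours, not part of the grammar formalism) enumerates L(spring-grammar) as the cross product of the subject words and the verb words:

```python
from itertools import product

# <spring-sentence> always rewrites to <spring-subject> <spring-verb>,
# so the language is the cross product of the two word sets.
subjects = ["birds", "flowers", "buds"]
verbs = ["sing", "bloom", "fly"]

spring_language = {f"{s} {v}" for s, v in product(subjects, verbs)}

print(len(spring_language))             # 9
print("birds sing" in spring_language)  # True
```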


If we are not concerned with the meaning of the sentences (i.e., strings), we can
simplify the grammar by substituting a single symbol for each of the elements as
shown below. (Studying formal languages, we are mainly concerned with the
structural complexity of the languages, not the meaning.)
In this simplification, we used lower case letters for the words that appear in the
language, and upper case letters for the others. In particular, we used S to denote the
starting element for the language derivation. (Notice that in the example, # denotes
the blank between <spring-subject> and <spring-verb>.)
spring-grammar:
<spring-sentence> → <spring-subject> <spring-verb>      (S → A # B)
<spring-subject> → birds | flowers | buds               (A → a | b | c)
<spring-verb> → sing | bloom | fly                      (B → d | e | f)

G: S → A#B    A → a | b | c    B → d | e | f

L(G) = { a#d, a#e, a#f, b#d, b#e, b#f, c#d, c#e, c#f }


Conventionally, we use lower case letters and special characters to denote a word
in the language, and upper case letters only for the elements used for language
generation. We call those lower case letters terminal symbols, and the upper case
letters nonterminal symbols. In our example (repeated below), a through f and # are
terminal symbols, and S, A and B are nonterminal symbols. We call the set of
rewriting rules a grammar of the language.

G: S → A#B
   A → a | b | c        L(G) = { a#d, a#e, a#f, b#d, b#e, b#f, c#d, c#e, c#f }
   B → d | e | f

Break Time
Life
- Life is a long lesson in humility. - J. M. Barrie –
- Life is far too important a thing ever to talk seriously about. - Oscar Wilde –
- Life is like playing a violin in public and learning the instrument as one goes on. - Samuel Butler –
- Life is a foreign language; all men mispronounce it. - Christopher Morley –


The size of the language of the above grammar is finite. The following grammar
shows how it is possible to generate an infinite language.

Example 1. G1: S → dingdongA    A → dengS | deng

S → dingdongA → dingdongdengS → dingdongdengdingdongA → dingdongdengdingdongdeng

(applying rules S → dingdongA, A → dengS, S → dingdongA, and finally A → deng)

In the above derivation, rule A → dengS allows us to derive a string containing
"dingdongdeng" repeated an arbitrary number of times. To terminate the derivation
we must finally apply rule A → deng.


We can see that the language L(G1) is

L(G1) = { (dingdongdeng)i | i ≥ 1 }
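The derivation pattern of G1 is regular enough to script. This Python sketch (the helper name derive_g1 is ours) mimics a derivation: apply S → dingdongA, then A → dengS to keep going, or A → deng to stop:

```python
# Mimic a derivation in G1: S -> dingdongA, A -> dengS | deng.
def derive_g1(n):
    """Return the string with n >= 1 repetitions of "dingdongdeng"."""
    form = "S"
    for i in range(n):
        form = form.replace("S", "dingdongA")                    # S -> dingdongA
        # on the last step terminate with A -> deng, otherwise continue with A -> dengS
        form = form.replace("A", "deng" if i == n - 1 else "dengS")
    return form

print(derive_g1(2))   # dingdongdengdingdongdeng
```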

Here are more examples.

Example 2. G2: S → aaS | aa

S → aa                                  (rule S → aa)
S → aaS → aaaa                          (rules S → aaS, S → aa)
S → aaS → aaaaS → aaaaaa                (rules S → aaS, S → aaS, S → aa)
. . . . . .

L(G2) = {a2i | i > 0}


Example 3. G3: S → aSb | ab

S → ab

S → aSb → aabb

S → aSb → aaSbb → aaabbb

S → aSb → aaSbb → aaaSbbb → aaaabbbb

. . . . . .

L(G3) = { aibi | i ≥ 1 }
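Derivations like those of G2 and G3 can also be enumerated mechanically. The sketch below (a brute-force breadth-first search of our own, not an algorithm from the text) applies the rules at every position and collects the terminal strings up to a length bound; run on G3, it reproduces ab, aabb, aaabbb, . . . :

```python
from collections import deque

# Brute-force enumeration: apply each rule at every position, breadth-first,
# and collect strings with no nonterminal (upper case) symbol left.
def enumerate_language(rules, start="S", max_len=8):
    seen, terminal = {start}, set()
    queue = deque([start])
    while queue:
        form = queue.popleft()
        if form == form.lower():              # no nonterminal left: a member
            terminal.add(form)
            continue
        for lhs, alternatives in rules.items():
            i = form.find(lhs)
            while i != -1:
                for rhs in alternatives:
                    new = form[:i] + rhs + form[i + len(lhs):]
                    if len(new) <= max_len and new not in seen:
                        seen.add(new)
                        queue.append(new)
                i = form.find(lhs, i + 1)
    return terminal

# G3: S -> aSb | ab
print(sorted(enumerate_language({"S": ["aSb", "ab"]}), key=len))
# ['ab', 'aabb', 'aaabbb', 'aaaabbbb']
```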

The following grammar looks challenging: it has more rules, and some rules have
more than one symbol on their left side. However, once you understand how it
derives a couple of strings, you will appreciate it. Notice that the grammar
has only one terminal symbol, a.

Example 4. G4: S → ACaB | a    Ca → aaC    CB → DB | E
               aD → Da    AD → AC    aE → Ea    AE → ε

Using rule S → a, the grammar generates the string a, which is a member of the language.
Let's see what other members it can generate, starting with rule S → ACaB.


For convenience, the grammar is repeated below with an id number assigned to
each rule.

(1) S → ACaB    (2) S → a    (3) Ca → aaC    (4) CB → DB    (5) CB → E
(6) aD → Da     (7) AD → AC  (8) aE → Ea     (9) AE → ε

After applying rule (1), the only rule applicable next is rule (3). Notice that by
rule (3), C keeps jumping to the right over an a, adding one a, until it meets B,
where we can apply either rule (4) or (5). If we apply rule (4), then by rule (6)
D moves left until it meets A and converts to C, as follows. We will shortly see
what happens if we apply rule (5).
   (1)      (3)       (4)       (6)       (6)       (7)
S  →  ACaB  →  AaaCB  →  AaaDB  →  AaDaB  →  ADaaB  →  ACaaB
Again, by repeatedly applying rule (3) until C meets B, we can double the number of a's.

              (3)                  (3)         (3)
ACa . . . aB  →  AaaCa . . . aB  →  . . . .  →  Aaa . . . aaCB

If we want to derive a terminal string, i.e., a member of the language, we should
apply rule (5) (i.e., CB → E) and then, by applying rule (8) repeatedly, bring E
toward the left end until it meets A. Finally, applying rule (9) we get a string of
the terminal symbol a.
                (5)                (8)                 (8)
Aaa . . . aaCB  →  Aaa . . . aaE  →  Aaa . . . aEa  →  . . .
       (8)                 (9)
    →  AEaa . . . aa  →  aa . . . aa


              (3)                  (3)        (3)
ACa . . . aB  →  AaaCa . . . aB  →  . . .  →  Aaa . . . aaCB

If we want to double the number of a's again in the last sentential form above, we
apply rule (4), bring D toward the left end using rule (6) until it meets A, and then
change D to C by rule (7), as follows.

                (4)                 (6)                 (6)
Aaa . . . aaCB  →  Aaa . . . aaDB  →  Aaa . . . aDaB  →  . . .
       (6)                 (7)
    →  ADaa . . . aaB  →  ACaa . . . aaB

Now, by rule (3) again we are ready to double the number of a in the above final
sentential form.


Now, we are ready to figure out the language of the grammar. By rule (2) (S → a)
the string a can be derived. By applying rules (1), (3), (5), (8), (8) and (9) in this
order, we can derive string aa as follows.

   (1)      (3)       (5)      (8)      (8)     (9)
S  →  ACaB  →  AaaCB  →  AaaE  →  AaEa  →  AEaa  →  aa

We know that if, instead of rule (5) in the above derivation, we apply rule (4)
followed by rules (6), (6) and (7), we get the sentential form ACaaB. From this
sentential form we are ready to derive the string aaaa.

46
Deriving a language with rules Languages and Grammars

(1) (2) (3) (4) (5)


G4 : S  ACaB | a Ca  aaC CB  DB | E
(6) (7) (8) (9)
aD  Da AD  AC aE  Ea AE  

Now, we claim that the language of the grammar G4 is the set of strings of a's whose
length is a power of 2, i.e.,

L(G4) = { am | m = 2n, n ≥ 0 }

We can prove this claim by induction. We have just shown that the basis part is
true. We leave the formal proof to the reader.
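The claim can also be sanity-checked by brute force. The following Python sketch (our own breadth-first search, not part of the proof) applies the rules of G4 at every position, bounding the length of sentential forms, and collects the terminal strings that appear; only lengths that are powers of 2 show up:

```python
from collections import deque

# The nine rules of G4, each as a (left side, right side) pair; AE -> epsilon
# is encoded as an empty right side.
RULES = [("S", "ACaB"), ("S", "a"), ("Ca", "aaC"), ("CB", "DB"), ("CB", "E"),
         ("aD", "Da"), ("AD", "AC"), ("aE", "Ea"), ("AE", "")]

def g4_terminal_strings(max_form_len=12):
    seen, found = {"S"}, set()
    queue = deque(["S"])
    while queue:
        form = queue.popleft()
        if form and form == form.lower():    # only the terminal symbol a left
            found.add(form)
            continue
        for lhs, rhs in RULES:
            i = form.find(lhs)
            while i != -1:
                new = form[:i] + rhs + form[i + len(lhs):]
                if len(new) <= max_form_len and new not in seen:
                    seen.add(new)
                    queue.append(new)
                i = form.find(lhs, i + 1)
    return found

print(sorted(g4_terminal_strings(), key=len))
# ['a', 'aa', 'aaaa', 'aaaaaaaa']  -- lengths 1, 2, 4, 8: powers of 2
```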


2.3 Definitions: Formal Languages and Grammars

To define a grammar we need the following information.


(1) The language alphabet, i.e., the set of terminal symbols, which appear in the
language. Conventionally, we use the lower case English alphabet together with
special characters if needed.
(2) The set of nonterminal symbols (the upper case letters by convention) used to
construct the rules to derive the language.
(3) The set of rules.
(4) The start symbol (S by convention).

In the previous examples we saw various kinds of rules. The rules in the first three
examples have only one symbol on their left side. In general, the left side can be a
string of finite length. We will later see that such variations may affect the
characteristics of the language generated by the grammar.
In this book we will study the following four types of grammars that have been
extensively investigated.


Type 0 (phrase structured) grammar: a type 0 grammar G is the following quadruple,

G = < VT , VN , P , S > , where

• VT : the terminal alphabet (called morphemes by linguists). Following the
  convention, we will use the lower case letters.
• VN : the nonterminal alphabet (also called variables or syntactic categories).
  Again, following the convention, we will use the upper case letters.
• V = VT ∪ VN : the set of symbols used in the grammar. This set is
  called the total alphabet.
• S ∈ VN : the start symbol.
• P : a finite set of rules, each denoted by α → β and read "α generates β,"
  where α ∈ V*VNV* and β ∈ V*.


Notice that in a rule α → β, the left side must be a string α ∈ V*VNV* and
the right side must be a string β ∈ V*. More specifically, α is a string in V*
with at least one symbol in VN (i.e., a nonterminal symbol), and β can be
any string in V*, including the null string ε.

For two strings w1 and w2, by w1 ⇒ w2 we mean that w1 derives w2 by applying
a rule. We write w1 ⇒* w2 if w1 derives w2 by applying an arbitrary number
(i.e., zero or more) of rules. Any string w ∈ V* that can be derived starting with
the start symbol (i.e., S ⇒* w) is called a sentential form.
The language of a grammar G, denoted by L(G), is the set of all strings of
terminal symbols that can be derived by the rules of G. That is,

L(G) = { x | x ∈ (VT)* and S ⇒* x }

All the grammars in Examples 1 – 4 are type 0. The languages of type 0
grammars are also called recursively enumerable (R.E.) languages.


The following three types of grammars, types 1, 2 and 3, are defined with some
additional restrictions on the rule form of type 0. However, since those restrictions
do not violate the type 0 form, all these restricted grammar types are also type 0.

Type 1 (context-sensitive) grammars: This type of grammar has rules α → β,
where α ∈ V*VNV* and β ∈ V*, with the restriction that |α| ≤ |β|, i.e., the left side
of a rule cannot be longer than its right side. The only exception is the rule S → ε,
allowed under the condition that no rule in the grammar has S on its right side.
(Notice that the compositions of α and β are the same as those in type 0 rules.)

Type 2 (context-free) grammars: These grammars have rules α → β, where
α ∈ V*VNV*, β ∈ V*, with the restriction that |α| = 1. Since α must have at least
one nonterminal symbol, this restriction implies that the left side must be a single
nonterminal symbol (i.e., an upper case letter).


Type 3 (regular) grammars: these grammars have type 2 grammar rules with
the following restrictions on their right sides:
• The right side of each rule can have no more than one nonterminal symbol.
• The nonterminal symbol on the right side of a rule, if any, must be at the
  end of the string. That is, in a rule α → β it is required that β = xA or β = x,
  where A ∈ VN and x ∈ (VT)*.
For example, A → abB, B → aa | D, D → ε are all legal type 3 grammar
rules. However, rules A → aBa and B → aaAD cannot be in a type 3 grammar,
because in the former the nonterminal symbol B is not at the end of the right
side string, and in the latter there is more than one nonterminal symbol.
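The two restrictions are easy to check mechanically. Here is a small Python sketch of our own (the upper-case/lower-case encoding follows the text's convention; the function name is hypothetical):

```python
# Check whether lhs -> rhs is a legal type 3 (regular, right-linear) rule:
# upper case = nonterminal, lower case = terminal.
def is_regular_rule(lhs: str, rhs: str) -> bool:
    if len(lhs) != 1 or not lhs.isupper():       # left side: one nonterminal
        return False
    if rhs == "":                                # an epsilon right side is legal
        return True
    nonterminals = [c for c in rhs if c.isupper()]
    if len(nonterminals) > 1:                    # at most one nonterminal ...
        return False
    if nonterminals and not rhs[-1].isupper():   # ... and only at the very end
        return False
    return True

print(is_regular_rule("A", "abB"))   # True
print(is_regular_rule("A", "aBa"))   # False: B is not at the end
print(is_regular_rule("B", "aaAD"))  # False: two nonterminal symbols
```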


2.4 Grammars and their language: Examples

Here is a typical example for each of the four types of grammars and their
languages that we will often use for further discussions in the following chapters.

Type 0 (phrase structured): G = < {a}, {S,A,B,C,D,E}, P, S >

P = { S → ACaB | a    Ca → aaC    CB → DB | E
      aD → Da    AD → AC    aE → Ea    AE → ε }
L(G) = { am | m = 2n, n ≥ 0 }
The above grammar is the one that we examined in Section 2.1. If you understand
how this grammar generates its language, it would not be hard to see how the
following type 1 grammar generates its language shown below.

Type 1 (context-sensitive): G = < {a,b,c}, {S,B,C}, P, S >

P = { S → aSBC | aBC    CB → BC    bB → bb
      aB → ab    bC → bc    cC → cc }
L(G) = { aibici | i ≥ 1 }

Type 2 (context-free): G = < {0,1}, {S,A,B}, P, S >

P = { S → ASB | ε    A → 0    B → 1 }
L(G) = { 0i1i | i ≥ 0 }

Type 3 (regular): G = < {0,1}, {S,A}, P, S >

P = { S → 0S | A    A → 1A | ε }
L(G) = { 0i1j | i, j ≥ 0 }

From now on, to present a grammar we shall only show the set of rewriting (or
production) rules written according to the convention, i.e., lower case letters for
terminal symbols, upper case letters for nonterminal symbols, and S for the start
symbol.

Rumination (1): Grammars and production rules
The following remarks summarize subtle conceptual aspects concerning formal grammars and their languages that we
have defined. Let G = < VT , VN , P , S > be a grammar.
(a) Recall that the language L(G) is the set of terminal strings that can be generated by applying a finite sequence of
production rules. There is no fixed order of the grammar rules that must be observed when a string is derived. However,
depending on the order of the rules applied, we may end up with a string containing a nonterminal symbol from which no
terminal (or null) string is derivable. For example, consider the following type 1 grammar with four rules.

(1) S → ABC    (2) AB → ab    (3) BC → bc    (4) bC → bc

Clearly, only rules (1), (2), (4) applied in this order will derive the terminal string abc, which is the only member of the
language of the grammar. If you apply rule (1) followed by rule (3), you will be stuck with Abc as follows, which cannot be
a member of the language because the string contains a nonterminal symbol.

   (1)     (3)
S  →  ABC  →  Abc  →  ??
(b) Rule (3) of the grammar above is useless in the sense that it does not contribute to the generation of the language. We
can delete this rule from the grammar without affecting its language. In general, the decision problem of whether a rule in
an arbitrary grammar is useless or not is unsolvable. However, if we restrict the problem to the class of context-free
grammars (type 2), we can effectively clean up useless rules, if any. We will learn such an algorithm in Chapter 9.
(c) In this chapter we defined a type 0 (or phrase structured) grammar as a grammar having production rules of the form
α → β, where α ∈ V*VNV*, β ∈ V*. The left side of each rule must have at least one nonterminal symbol. In the literature
we see some variations of this definition, in particular on the left side of the rules. For example, unrestricted grammars
are defined as having rules with α ∈ V+, and type 0 grammars with α ∈ (VN)+. These variations are equivalent in the
sense that they generate the same class of languages as the phrase structured grammars. In this text we will use the
generic names and type numbers to refer to the grammars and their languages defined in this chapter.


(d) The grammars that we have defined are sequential in the sense that only one rule may be applied at a time. Notice that
with the grammar below, if we apply rules (2) AB → ab and (3) BC → bc simultaneously to the string ABC, which is derived
with rule (1), we get the terminal string abbc, which is not a member of the language according to our definition. There is a
class of grammars where more than one rule can be applied simultaneously. We call such rules parallel rewriting rules. (In
Chapter 3, we will see a class of grammars using parallel rewriting rules.) In general it is not easy to figure out the language
of a parallel rewriting grammar.

(1) S → ABC (2) AB → ab (3) BC → bc (4) bC → bc

(e) For context-free grammars we get the same language independent of the mode of rule application, either sequential or
parallel. Why? The answer is left for the reader.
(f) A context-sensitive grammar is defined as a phrase structured grammar with non-contracting rules, except for S → ε under
the condition that no rule has S on its right side. On account of the non-contracting rules, the sentential forms that we get along
a derivation never shrink, which is a typical property of context-sensitive grammars. However, we need the exceptional rule
S → ε, which is contracting, to include the null string in the language. Including this rule in the grammar, we need the condition
that S not appear on the right side of any rule because otherwise (e.g., S → SSS | ε) the non-contracting property of
derivations would be violated. Sometimes in a text we see context-sensitive grammars defined without this exceptional case,
thus excluding ε from the language.
(g) Context-free grammars are defined as type 0 (not type 1) grammars with the restriction |α| = 1. It follows that a
context-free grammar can have a contracting rule, like A → ε (called an ε-production rule), while type 1 grammars are not
allowed to have such rules except for S → ε. We will learn in Chapter 10 that all ε-production rules in a context-free
grammar, if any, can be eliminated, leaving only S → ε when the null string is in the language.


(h) A context-free grammar is called linear if each rule is restricted to the following form.
A → xBy or A → x, where A, B ∈ VN, x, y ∈ (VT)*
Recall that each rule of a regular grammar is restricted to the form A → xB or A → x. Such grammars are also called right
linear. By this definition, none of the following rules can be a regular grammar rule.
A → bBC    A → abBa    A → Ba
(i) Following the same notion, a left linear grammar can be defined as a context-free grammar having its rules restricted to the
form A → Bx or A → x. We will show in Chapter 9 that left linear grammars and right linear grammars are equivalent in the
sense that they both generate the same class of regular languages.

(j) Notice that a regular grammar cannot have both right linear and left linear rules. For example, the following linear grammar,
which mixes the two forms, generates the language L = { aibi | i ≥ 1 }. In Chapter 12 we will show that no regular grammar
can generate L.
S → aB    B → Sb | b
(k) Formal grammars were defined in terms of the quadruple G = < VT, VN, P, S >, which lists all the information that is
required to specify a grammar. This is a very general way of defining a system, not only in computer science but also in other
fields. We commonly define a system by first showing the list of the system's entities that are required for the definition, and
then specifying each entity. We will follow the same approach when we define automata in Chapter 4.

Defining a system S:
(1) S = < E1, E2, . . ., Ek >   // a list of system entities
(2) Specify each of the entities E1, E2, . . ., Ek.

Exercises

2.1 For each of the grammars G1 – G4 below, answer the following questions.
(a) Which type is the grammar (phrase structured, context-sensitive, context-free, or regular)? (Recall that a grammar
can be of more than one type.)
(b) What is the language?
G1: S → aS | abcB    B → bbB | ε
G2: S → ABC | ε    AB → ab    bC → bc    BC → cd    Ac → bc
G3: S → AAA    A → a | ε
G4: S → ABC    AB → a    aC → c    BC → b

2.2 Using the set property notation, show the language of each of the following grammars. (Review Example 4 in Section 2.1
before you figure out the answers for parts (g) and (h).)

(a) S → aS | bS | ε                    (b) S → aS | Saa | ε
(c) S → aSa | bSb | aa | bb            (d) S → aA | bB    A → Sa | a    B → Sb | b
(e) S → aA | ε    A → Sb               (f) S → aSb | A    A → aA | a
(g) S → DG    G → bAGc | bEc    bA → Ab    bE → Eb    DA → aaD    DE → aa
(h) S → DG    G → bbAGcc | bbEcc    bA → Ab    bE → Eb    DA → aaD    DE → aa


2.3 For each of the following languages, show a regular (type 3) grammar that generates it.

(a) L1 = {a}*    (b) L2 = {ab, abb, abbb}    (c) L3 = { axb | x ∈ {0, 1}* }
(d) L4 = L2 ∪ L3    (e) L5 = L2L3    (f) L6 = { xy | x ∈ {0, 1}+, y ∈ {a, b}+ }
(g) L7 = { x | x is a decimal number, i.e., x ∈ {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}+ }
(h) L8 = { x | x ∈ {a, b}* and x has at least one a }

2.4 For each of the following languages, construct a context-free (type 2) grammar that generates the language and briefly
explain how your grammar generates it.

(a) L1 = {aibj | i > j > 0}    (b) L2 = {aibj | i ≥ j > 0}    (c) L3 = {aibj | i ≥ j ≥ 0}
(d) L4 = {aibj | j > i > 0}    (e) L5 = {aibj | i, j > 0 and i ≠ j}    (f) L6 = {xxR | x ∈ {a, b}*}

2.5 For each of the following languages, construct a grammar (of any type) that generates the language, and briefly explain how
your grammar generates it. (Hint: try to apply the idea of the type 1 grammar given as an example in Section 2.3. )

(a) L1 = { a2nbnc2ndm | n, m ≥ 1 }    (b) L2 = { anbncndn | n ≥ 1 }

Models of Language Generation:
Grammars

Chapter 3. Other Models for Language
Representation
In Chapter 2 we learned how to define a language in terms of a grammar, and saw several examples. Numerous other
models for language representation have been introduced in the literature, among others graph grammars, picture language
grammars, Lindenmayer systems (L-systems), syntax flow graphs, regular expressions, etc. This chapter briefly introduces
L-systems, syntax flow graphs and regular expressions, and points to their applications.

3.1 L-systems 62
Definitions and examples
Application
3.2 Syntax flow graph 67
Definition and examples
3.3 Regular Expressions 70
Definition and examples
Algebraic properties of regular expressions
Rumination 74
Exercises 75

Other Language Models

3.1 L-systems
The L-system is a grammar model introduced by Lindenmayer to describe biological development, such as the growth of plants
and cells. L-systems are different from the grammars that we studied in the last chapter. In an L-system, to derive the next
string from the current string all applicable rules are applied simultaneously (i.e., in parallel), and every string that can be
derived belongs to the language of the system. There are several variations of L-systems. Zero-sided L-systems correspond to
the context-free grammars in the sense that the production rules are not context dependent, i.e., there is only one symbol on the
left side of each rule. Depending on the context-sensitivity of the rules to the left, to the right or both, the L-systems are called,
respectively, left-sided, right-sided and two-sided L-systems.
Because of the parallel rewriting rules, it is in general hard to identify the language of an L-system. However, because the
languages generated by L-systems have unique characteristics that can be used in computer imagery, L-systems have been
widely used as a graphical tool in computer science. The following book contains a rich collection of various L-systems and
their applications.
H. Ehrig, M. Nagl, G. Rozenberg, and A. Rosenfeld (Eds.), "Graph Grammars and Their Application to Computer Science,"
(Lecture Notes in Computer Science #291, Springer-Verlag, 1986)


In an L-system there is no difference between terminal symbols and nonterminal
symbols. Every string that can be derived by applying the rules belongs to the
language.

Definition: An L-system G is a triple G = ( Σ, h, ω ), where Σ is a finite alphabet, h is
a set of rewriting rules, and ω, called the axiom, is the start string. Conventionally the
rules are expressed as a function h. For example, h(α) = β and h(α) = {β, γ}, where α,
β, γ ∈ Σ*, imply, respectively, the rules α → β and α → β | γ in the formal grammar
notation.
Let hi(ω) denote a string derivable by applying the rules in h i times, i.e., h0(ω) = ω,
h1(ω) = h(ω), h2(ω) = h(h1(ω)), . . ., hi(ω) = h(hi-1(ω)). The language of an L-system
G = ( Σ, h, ω ) is defined as follows.

L(G) = { hi(ω) | i ≥ 0 }


Below are two simple L-systems. Notice that in both examples there is only
one symbol on the left side of the rules. That is, the rules are context-free. Such
L-systems are called 0L (zero-sided Lindenmayer) systems. We can also define
L-systems with rules whose left side is a string of length greater than 1.

Example 1. G1 = ( {a}, h, aa ), h(a) = {a, aa}.
L(G1) = { an | n ≥ 2 }.

Example 2. G2 = ( {a, b}, h, ab ), h(a) = aa, h(b) = ab.
L(G2) = { amb | m = 2n - 1, n ≥ 1 }.
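Parallel rewriting with single-symbol left sides is easy to simulate. The following Python sketch (a helper of our own, assuming a 0L-system given as a dictionary of rules) reproduces Example 2:

```python
# Simulate a 0L-system: all rules are applied to every symbol in parallel.
def l_system(axiom, rules, steps):
    derived = [axiom]
    for _ in range(steps):
        axiom = "".join(rules.get(c, c) for c in axiom)  # parallel rewriting
        derived.append(axiom)
    return derived

# Example 2: axiom ab, h(a) = aa, h(b) = ab
print(l_system("ab", {"a": "aa", "b": "ab"}, 3))
# ['ab', 'aaab', 'aaaaaaab', 'aaaaaaaaaaaaaaab']  -- each of the form a^(2^n - 1) b
```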

Break Time
When one door of happiness closes, another opens. But often we look so long at the closed door that we don‟t see
the one which has been open for us.

It‟s true that we don‟t know what we got until we lose it, but it‟s also true that we don‟t know what we‟ve been
missing until it arrives.
- Anonymous -


The following figures show how an L-system can be used to draw a plant, where the
lines correspond to the branches of a tree, and the matching brackets indicate the start
and the end points of a branch. Nested matching brackets indicate branching on the
"main branch" corresponding to the outer brackets. When we draw a tree in a two-
dimensional space with such an expression, we let the branches grow in alternate
directions, to the right and then to the left, with a certain angle, say 30°, from the
current direction.

(a) [ ]    (b) [ ] [ ]    (c) [ [ ] [ ] ] [ ]


To draw a tree we first let the L-system generate a string of symbols (each
corresponding to a line segment) with matching brackets, and then translate the string
into a tree using the same idea shown above. The figure below shows an example,
where every digit is translated into a line segment of the same length.

Drawing a branching plant using an L-system

L-system rules (1 is the axiom):
1 → 23    2 → 2    3 → 24    4 → 25    5 → 65
6 → 7     7 → 8    8 → 9[3]  9 → 9

Generating a string:
1
23
224
. . .
2229[229[24]9[3]8765]9[229[3]8765]9[228765]9[228765]9[2265]9[225]9[24]9[3]8765

The generated string is then translated into a drawing: every digit becomes a line
segment of the same length, and each bracketed substring becomes a branch.
3.2 Syntax Flow Graph
A syntax flow graph (also called syntax diagram) is a graphical representation of
context-free grammar rules. (This model is applicable only for context-free grammar
rules.) This model helps us understand how the grammar rules put symbols together to
construct a string. The figure below shows an example.
For each of the nonterminal symbols (S, A and B in the example below) in the
grammar, we draw a directed graph under the title of that nonterminal, with one entry
and one exit edge. For each right side of a rule of that nonterminal symbol, we draw a
path from the entry to the exit through a sequence of nodes, each labeled with a symbol
in the order of its appearance on the right side. Conventionally, terminal symbols are
put in a circle and nonterminal symbols in a rectangle. The null string is represented
by a plain line.

S → aS | A        A → bAc | B        B → d | f | ε

[Syntax flow graphs for S, A and B: in the graph for S, one path loops through a
circle a back into S, the other passes through the rectangle A to the exit; in the
graph for A, one path passes through b, A, c, the other through B; in the graph for
B, one path passes through d, another through f, and a plain line gives the null string.]
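One reason syntax flow graphs are useful is that each graph maps directly to code: a recursive-descent recognizer has one function per nonterminal that follows the paths of that nonterminal's graph. Below is a Python sketch of our own for the example grammar above (this particular grammar happens to be deterministic, so no backtracking is needed):

```python
# Recursive-descent recognizer for S -> aS | A, A -> bAc | B, B -> d | f | epsilon.
def parse(s: str) -> bool:
    pos = 0

    def S():
        nonlocal pos
        if pos < len(s) and s[pos] == "a":   # path: circle a, back into S
            pos += 1
            S()
        else:                                # path: rectangle A
            A()

    def A():
        nonlocal pos
        if pos < len(s) and s[pos] == "b":   # path: b, A, c
            pos += 1
            A()
            if pos < len(s) and s[pos] == "c":
                pos += 1
            else:
                raise ValueError("expected c")
        else:                                # path: rectangle B
            B()

    def B():
        nonlocal pos
        if pos < len(s) and s[pos] in "df":  # B -> d | f; otherwise the epsilon line
            pos += 1

    try:
        S()
    except ValueError:
        return False
    return pos == len(s)                     # accept only if all input is consumed

print(parse("aabdc"))   # True:  S => aS => aaS => aaA => aabAc => aabdc
print(parse("abca"))    # False: trailing a cannot be consumed
```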


Example 1. Here is a syntax flow graph (and the rewriting rules) corresponding to
the identifiers defined in the C programming language.

<letter> → a | b | . . . | z
<digit> → 0 | 1 | 2 | . . . | 9
<identifier> → <letter> | <identifier><letter> | <identifier><digit>

[Flow graphs: <letter> has one path per letter a through z, <digit> one path per
digit 0 through 9, and <identifier> passes through a letter node and then loops
back through either a letter or a digit node.]
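The identifier graph reads: one letter, followed by any number of letters or digits. In Python's re syntax (restricted to lower case letters, as in the figure) this is a one-liner:

```python
import re

# A letter, then any number of letters or digits (lower case only, as in the figure).
identifier = re.compile(r"[a-z][a-z0-9]*")

print(bool(identifier.fullmatch("x1")))   # True
print(bool(identifier.fullmatch("1x")))   # False: must start with a letter
```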


Example 2. The following syntax flow graph (and the rewriting rules) represents the
unsigned integers and unsigned doubles (i.e., real numbers) defined in C.

<unsigned integer> → <digit> | <unsigned integer><digit>

<unsigned double> → <unsigned integer> . <unsigned integer> |
                    <unsigned integer> . <unsigned integer><exponent> |
                    <unsigned integer><exponent> |
                    <unsigned integer>
<exponent> → E<sign><unsigned integer>
<sign> → + | - | ε

[Flow graphs: <unsigned integer> loops through a digit node; <unsigned double>
passes through an unsigned integer, an optional "." followed by an unsigned integer,
and an optional exponent part of E, a sign (+, - or the null string) and an
unsigned integer.]

3.3 Regular Expressions


Searching a file or browsing the Web, we enter a query in terms of strings (or a pattern) representing the information that
we want to find. We learned how to represent a set of strings (i.e., a language) in terms of a set property or a grammar.
However, neither the set property nor the grammar model is practical for interactive online search. We need a convenient
way to succinctly express and efficiently find what we are looking for.

This section introduces regular expressions, an algebraic model that denotes regular
languages. Since a regular expression can be transformed into a machine that recognizes the language it expresses (we will
learn how in Chapter 7), regular expressions are widely used in Web browsing, word processing, compilers, etc. (Unfortunately
no comparably practical model is available for denoting the other types of formal languages.)

Interesting Warning Labels Break Time

On a knife: Warning. Keep out of children.


On a string of Christmas lights: For indoor or outdoor only.
On a food processor: Not to be used for the other use.
On a chainsaw: Do not attempt to stop chain with your hands.
On a child‟s superman costume: Wearing of this garment does not enable you fly.

- Anonymous -


Definition: Regular expression

Let Σ be a finite alphabet. The regular expressions over Σ and the sets of strings (i.e.,
languages) they denote are defined as follows.
(1) Ø is a regular expression, which denotes the empty set.
(2) ε is a regular expression, which denotes the set {ε}.
(3) For each symbol a ∈ Σ, a is a regular expression, which denotes the set {a}.
(4) If r and s are regular expressions which denote the sets of strings R and S,
    respectively, then r + s, rs and r* are regular expressions that, respectively,
    denote the sets R ∪ S, RS and R*.

In the above definition boldface symbols are used to help the reader distinguish
regular expressions from strings. From now on, whenever it is clear from the context,
we shall use normal fonts in regular expressions.


In the definition of regular expressions, the operators +, concatenation and *
correspond, respectively, to set union, set product and set closure. The precedence
of these operators in a regular expression is, from the highest, *, concatenation, +.
However, we are free to use parentheses in a regular expression, as we do in an
arithmetic expression, wherever there is ambiguity.
For a regular expression r, by L(r) we denote the language expressed by r. Here are
some examples showing regular languages and their representations in terms of
regular expressions.

Language L(r)                            Regular expression r

{a, b}*                                  (a + b)*
{ai | i ≥ 0}                             a*
{aibj | i, j ≥ 0}                        a*b*
{xaaybbz | x, y, z ∈ {a, b}*}            (a+b)*aa(a+b)*bb(a+b)*
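The expressions above write union as +; most software tools, including Python's re module, write it as | and offer character classes. For example, the last expression in the table can be written and tested as follows:

```python
import re

# (a+b)*aa(a+b)*bb(a+b)* in Python re syntax: [ab] plays the role of (a+b).
pattern = re.compile(r"[ab]*aa[ab]*bb[ab]*")

print(bool(pattern.fullmatch("aaabb")))    # True
print(bool(pattern.fullmatch("ababab")))   # False: no aa followed later by bb
```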


Algebraic Properties of Regular Expressions

For two regular expressions E and F, we define the identity, written E = F, if and only
if their languages are identical, i.e., L(E) = L(F). Regular expressions with the
operators + and concatenation obey identity laws similar to those of arithmetic
algebra with the operators + (plus) and multiplication, as shown below. Notice that
for regular expressions the commutative law does not hold for concatenation.
Let r, s and t be arbitrary regular expressions.
• Commutative law: r + s = s + r. (But for concatenation, rs ≠ sr in general.)
• Associative law: r + (s + t) = (r + s) + t, (rs)t = r(st).
• Distributive law: r(s + t) = rs + rt.
• For the regular expressions Ø and ε, we have the following identities:
  Ø + r = r + Ø = r,   Ør = rØ = Ø,   εr = rε = r
Notice that Ø and ε, respectively, correspond to 0 and 1 in arithmetic algebra. We
leave the proofs for the reader.
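These identities can at least be spot-checked empirically. The sketch below (our own test, with r = a, s = b, t = c; agreement on finitely many sample strings is evidence, not a proof) compares r(s + t) with rs + rt on all strings over {a, b, c} up to length 3:

```python
import re
from itertools import product

# Distributive law r(s + t) = rs + rt with r = a, s = b, t = c.
# In Python re syntax, the union + is written |.
lhs = re.compile(r"a(b|c)")
rhs = re.compile(r"ab|ac")

samples = [""] + ["".join(p) for n in (1, 2, 3) for p in product("abc", repeat=n)]
agree = all(bool(lhs.fullmatch(w)) == bool(rhs.fullmatch(w)) for w in samples)
print(agree, len(samples))   # True 40
```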

Rumination (1): syntax flow graph
• Occasionally we hear that programming languages belong to the class of context-free languages. That is true except for
some special cases, such as input/output statements. However, in a textbook we hardly come across the whole grammar of a
programming language. Usually the grammar is informally described, focusing on how to use the statements. However, to
develop a system component such as a compiler or debugger for a programming language, we need a formal definition of the
language. Appendix A shows the whole set of syntax flow graphs for Pascal. (Pascal is a predecessor of C.)

Rumination (2): Regular Expression


• There are quite a few variations in the form of regular expressions. Here are some examples together with the equivalent standard regular expressions.

(a | b)* = (a+b)*,   (a | b)+ = (a+b)(a+b)*,   [ab] = (a+b),   [a-z] = a + b + c + . . . + z,
abc? = ab + abc,   (abc)? = abc + ε
With these variations, the integer and real number constants of C can be expressed as follows.
digits  [0-9]
int     {digits}+
real    {int}"."{int}([Ee][+-]?{int})?
• For a given language, there are infinitely many regular expressions that denote the same language. For example, r, r + r, r + r + r, etc. are all equivalent. As shown in the preceding section on algebraic properties of regular expressions, for such simple regular expressions it is possible to prove their equivalence. However, there is no practical algorithm available for proving the equivalence of two arbitrary regular expressions. For example, it is not trivial to prove the following equivalences. In Chapter 7, we will present an idea for how to prove them.
(a) (r*)* = r*   (b) r*(r + s)* = (r + s)*   (c) (r + s)* = (r*s*)*
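Although proving such equivalences is deferred to Chapter 7, they can at least be tested empirically by comparing the two languages restricted to strings up to some bound. The Python sketch below (the bound N and the sample languages for r and s are our own choices; agreement up to length N is evidence, not a proof) checks (a), (b), and (c):

```python
# Empirically comparing regular languages restricted to strings of length <= N.
# Agreement up to N is evidence of equivalence, not a proof.
N = 8

def union(L1, L2):
    return L1 | L2

def concat(L1, L2):
    return {x + y for x in L1 for y in L2 if len(x + y) <= N}

def star(L):
    result, frontier = {""}, {""}      # epsilon is always in L*
    while frontier:
        frontier = {x + y for x in frontier for y in L
                    if len(x + y) <= N} - result
        result |= frontier
    return result

r, s = {"a"}, {"b"}    # sample languages; try other finite samples as well

assert star(star(r)) == star(r)                                   # (a)
assert concat(star(r), star(union(r, s))) == star(union(r, s))    # (b)
assert star(union(r, s)) == star(concat(star(r), star(s)))        # (c)
print("(a), (b), (c) agree on all strings of length <=", N)
```

The `star` function computes the closure by repeated concatenation, cutting off at length N so that the sets stay finite.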

Exercises
3.1 What is the language of the L-system G = ({a, b, c}, h, acb), where the homomorphism h is defined as follows?
h(a) = aa,   h(b) = cb,   h(c) = a
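To experiment with this L-system, the homomorphism h can be iterated mechanically. The sketch below (Python; the helper name `step` is ours) prints the first few strings derived from the axiom acb, leaving the characterization of the language to the exercise:

```python
# Iterating the homomorphism h of Exercise 3.1 on the axiom acb.
h = {"a": "aa", "b": "cb", "c": "a"}

def step(w):
    # apply h to every symbol of w and concatenate the images
    return "".join(h[ch] for ch in w)

w = "acb"
for i in range(5):
    print(i, w)
    w = step(w)
```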

3.2 Draw a syntax flow graph for the following context-free grammar.
S → aSbB | A   A → bSa | ba | ab   B → bB | ε

3.3 The following syntax flow graph is a simplified version of the syntax flow graph for <expression> of Pascal programming
language.
(a) Write a grammar which generates the same language defined by this syntax flow graph.
(b) Show how expression a1+ (a + b)*a – a/b in Pascal can be derived with the rewriting rules of your grammar.

[Figure: simplified syntax flow graphs for <expression>, <term>, <factor>, <variable>, <letter> (a, b), and <digit> (0, 1), with the operators +, -, *, and /.]

3.4 Briefly describe the language expressed by each of the following regular expressions.

(a) (0 + 1)*00   (b) (1 + 0)*111(0 + 1)*   (c) (1 + 0)*(111 + 000)(0 + 1)*

Actual Signs
Break Time
Outside a muffler shop: "No appointment necessary. We'll hear you coming."
In a veterinarian's waiting room: "Be back in 5 minutes. Sit! Stay!"
At the electric company: "We would be delighted if you send in your bill. However, if you don't, you will be (de-lighted)."
In a beauty shop: "Dye now!"
In a cafeteria: "Shoes are required to eat in the cafeteria. Socks can eat any place else they want."

- Anonymous -

Models of Language Recognition:
Automata

4. Automata
In the preceding two chapters we learned several models for language representation: formal grammars, L-systems, syntax flow graphs, and regular expressions. Natural languages are communication media, and so are formal languages. A person who hears a spoken sentence should be able to understand the meaning (i.e., the semantics) of the sentence. In this chapter we will learn how to design machine models (i.e., automata) that recognize formal languages. We are not going to build a model which understands a formal language, but one that recognizes it. Here we use the word "recognize" in a much narrower sense than the word "understand," as follows.

For a given language L, we say a machine M recognizes L, if M, given a string x, can output “yes” or give
a positive indication if and only if x ∈ L.

What does it take for a machine M to recognize a language L? To answer this question it is better to think about a conceptual model, instead of the complex computer systems currently available. Since the 1930s, computer scientists have investigated many machine models, called automata, for language recognition. We will study four popular models (Turing machines, linear-bounded automata, pushdown automata, and finite automata), respectively recognizing the languages of types 0, 1, 2, and 3 that we introduced in Chapter 2. In Chapters 7, 12, and 14 we will study the so-called Chomsky hierarchy (after Noam Chomsky), which shows the hierarchical relationship among the languages and automata.
Automata can be classified into two types of models, deterministic and nondeterministic. In this chapter we will first study
the deterministic models, which naturally follow our intuition, and then the nondeterministic models in the following chapter.


4.1 Deterministic Turing machines (DTM) 80


Design example
State transition graph, State transition table, State transition function
Formal definition
4.2 DLBA (Deterministic Linear Bounded Automata) 98
Definition and Design example
4.3 DPDA (Deterministic Pushdown Automata) 100
Definition
Design example
4.4 DFA (Deterministic Finite Automata) 111
Definition
Design example
Rumination 114
Exercises 120

LOVE Break Time

Love is when you take away the feeling, the passion, the romance and you find out you still love the person.

Love comes to those who still hope even though they've been disappointed, to those who still believe even though they've been betrayed, and to those who still love even though they've been deeply wounded before.

- Anonymous -

4.1 Deterministic Turing Machines
The Turing machine (TM), introduced by Alan M. Turing in 1936, is a model that can recognize type 0 (phrase-structured) languages. In this section we will study the deterministic version of the Turing machine.
As illustrated below, a TM has a finite state control with a two-way read/write tape head and a one-dimensional tape of infinite length. The finite state control corresponds to the CPU of a modern computer, where the "state" represents the whole of the information stored in the main memory and registers.
The tape is divided into cells, each capable of holding one symbol. It is assumed that initially an input string over a given alphabet is written on the tape and the tape head is reading its leftmost symbol. All the remaining cells hold the blank symbol B.

[Figure: a TM consists of a read/write tape (here holding a b a b b), a two-way read/write head, and a finite state control.]


In one move (or step) a deterministic Turing machine (DTM) does the following:
(1) Read the content of the cell on which the tape head is located.
(2) Depending on the symbol read and the current state of the finite state control, write a symbol on the cell, erasing the original symbol.
(3) Change the state (or possibly keep it) and move the tape head left or right, or let it stay on the same cell.

[Figure: three successive configurations of a DTM, showing the tape contents changing as the machine moves between states p and q.]
We say the DTM accepts the input string if it enters a designated state (called
accepting state) after some finite number of moves.


It is unwieldy to draw the configuration (i.e., the tape and the finite state control) for every step of a DTM to illustrate how it works. Instead we will mainly use a directed graph, called the state transition graph, or occasionally the transition function δ, to represent each step, as shown in figures (a) and (b), respectively.

(a) In the graph, an edge from the current state to the next state is labeled (symbol read, symbol written, head direction).
(b) In functional form, δ(current state, symbol read) = (next state, symbol written, head direction).


Figures (a) and (b) below, respectively, show a graph and its functional
representation of two steps of a DTM. In the figures, R and L, respectively, denote the
tape head moving right and left. We shall use N to denote the head staying on the same
cell.

[Figure: two steps of a DTM. (a) State transition graph: an edge from p to q labeled (a, c, R) and an edge from q to p labeled (b, b, L). (b) Functional representation: δ(p, a) = (q, c, R), δ(q, b) = (p, b, L).]


Example 1. Construct a DTM which, given the word "mother," converts it to "father," and vice versa. After completing the conversion, the DTM must halt in an accepting state.
Figure (a) below shows the initial and final configurations of a DTM which does the conversion. (Intermediate configurations are omitted.) Figure (b) shows the state transition graph. In the state transition graph of a DTM, state labels are just for reference and hence can be omitted if not needed, as shown in figure (b).

[Figure: (a) the initial configuration with the tape holding mother in start state s, and the final configuration with the tape holding father in accepting state f. (b) The state transition graph: from the start state, the edge (m, f, R) leads to one branch and (f, m, R) to the other; the branches continue with (o, a, R) and (a, o, R), respectively, merge on the common transitions (t, t, R), (h, h, R), (e, e, R), (r, r, R), and finally (B, B, N) enters the accepting state f.]

Notice that depending on the first character (either 'm' or 'f') read, the machine takes a transition to a different state. After converting the first two characters, it does the same work (i.e., rewriting the "ther" part) by taking the four common transitions, and finally, reading the blank symbol B, enters the accepting state (the heavy circle labeled f).
Also notice that the machine cannot enter an accepting state right after changing the first two characters. It must make sure that the remaining part of the input is correct.
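The converter is small enough to try out in code. The sketch below encodes the transitions of the graph as a table (a simulation sketch; the state names s, sm, sf, c2, ..., acc are ours, since the text leaves the states unlabeled):

```python
# A tiny DTM simulator running the mother <-> father converter.
# State names (s, sm, sf, c2..c6, acc) are ours; the graph in the
# text leaves its states unlabeled.
B = "B"   # blank symbol

delta = {
    ("s", "m"): ("sm", "f", 1),  ("s", "f"): ("sf", "m", 1),
    ("sm", "o"): ("c2", "a", 1), ("sf", "a"): ("c2", "o", 1),
    ("c2", "t"): ("c3", "t", 1), ("c3", "h"): ("c4", "h", 1),
    ("c4", "e"): ("c5", "e", 1), ("c5", "r"): ("c6", "r", 1),
    ("c6", B): ("acc", B, 0),    # reading the blank, halt and accept
}

def run(word):
    tape = list(word) + [B]
    state, head = "s", 0
    while state != "acc":
        key = (state, tape[head])
        if key not in delta:      # undefined transition: reject
            return None
        state, write, move = delta[key]
        tape[head] = write
        head += move
    return "".join(tape).rstrip(B)

print(run("mother"), run("father"), run("mouse"))
```

Note that the simulator rejects (returns None) on any word other than mother or father, mirroring the observation that the machine must check the whole input before accepting.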



Example 2. Design a DTM which recognizes the language L = {0n1n | n ≥ 1}.

Suppose that the string 000111 is initially written on the tape as shown below. How can a DTM M recognize that this string belongs to L? We know that if a string x is in L, there must be the same number of 0's and 1's in x, and all the 0's must come first. We will let M use this property.
One idea is to let M move back and forth checking that, for each symbol 0 erased (i.e., rewritten) with a symbol, say X, there is a 1 that can be erased with Y. We let M enter an accepting state if it sees that every pair of 0 and 1 has been erased. Before we give a formal definition of M, let's figure out how the machine should move along the tape to implement this idea.

0 0 0 1 1 1


Let q0 denote the start state of M. We will use the following state names, in which the machine does the following:

q1: The state in which the machine, after erasing the leftmost 0 with X, moves right in search of the leftmost 1.
q2: The state in which the machine, after erasing the leftmost 1 with Y, moves left in search of X.
q3: The state in which, after reading X, the machine backs up right to read the leftmost 0, if any.

We will introduce additional states, if needed, while designing M's work further in detail.

0 0 0 1 1 1

q0


[Figure: tape snapshots illustrating steps (a) through (d).]
(a) In start state q0, M reads 0 and, erasing it with X, moves right in q1 in search of the leftmost 1.
(b) After erasing 1 with Y, M moves left in state q2 in search of X.
(c) Reading X in state q2, M backs up in state q3.
(d) M reads the next leftmost 0, if any, erases it with X, and moves right in state q1 in search of the next 1 to be erased.
M repeats the procedure (b), (c) and (d).


[Figure: tape snapshots (b) through (f), showing the second round of erasing and the final phase.]
(f) This is the final phase. In state q3, if M reads Y, not 0, it implies that all 0's have been erased with X. M must make sure that no 1's remain on the tape. For this, M needs an additional state q4 and an accepting state q5. Reading Y in state q3, M enters state q4 and moves right until it reads the blank symbol, and then enters the accepting state q5.


The following state transition graph is a formal representation of how the DTM recognizes the language L = {0n1n | n ≥ 1}.

[Figure: the state transition graph. Edges: q0 -(0,X,R)-> q1; q1 -(0,0,R)-> q1, -(Y,Y,R)-> q1, -(1,Y,L)-> q2; q2 -(0,0,L)-> q2, -(Y,Y,L)-> q2, -(X,X,R)-> q3; q3 -(0,X,R)-> q1, -(Y,Y,R)-> q4; q4 -(Y,Y,R)-> q4, -(B,B,N)-> q5. Annotations: in q1, after erasing the leftmost 0 with X, keep moving right in search of the leftmost 1; in q2, after erasing the leftmost 1 with Y, keep moving left in search of X; reading X, back up right in state q3; in q4, check that all 1's have been erased with Y.]

Only the transitions along a path defined on the graph are legal. Reading a symbol for which no transition is defined (e.g., reading 1 in state q0) implies that M rejects the input.
Notice that the sequence of moves along the cyclic transitions through q1, q2, and q3 erases any number (≥ 1) of matching pairs of 0 and 1. We can prove by induction that the DTM M enters the accepting state q5 if and only if the input string is in the language L = {0n1n | n ≥ 1}.


The functional representation of the DTM M that we have designed is given


below. In this book we will mainly use the state transition graph to show a DTM
and other automata, because it is easier to understand how they work.

State Transition Function  of M

(q0, 0) = (q1, X, R) (q1, 0) = (q1, 0, R) (q1, 1) = (q2, Y, L)


(q2, 0) = (q2, 0, L) (q2, X) = (q3, X, R) (q3, 0) = (q1, X, R)
(q1, Y) = (q1, Y, R) (q2, Y) = (q2, Y, L) (q3, Y) = (q4, Y, R)
(q4, Y) = (q4, Y, R) (q4, B) = (q5, B, N)
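The transition function above can be executed directly. The following Python sketch (the simulator loop is ours; the table is exactly the δ listed above) checks a few inputs against the language:

```python
# Running the transition function of the {0^n 1^n | n >= 1} DTM directly.
# The table is the delta given in the text; the simulator loop is ours.
B = "B"
MOVE = {"L": -1, "R": 1, "N": 0}

delta = {
    ("q0", "0"): ("q1", "X", "R"),
    ("q1", "0"): ("q1", "0", "R"), ("q1", "1"): ("q2", "Y", "L"),
    ("q1", "Y"): ("q1", "Y", "R"),
    ("q2", "0"): ("q2", "0", "L"), ("q2", "X"): ("q3", "X", "R"),
    ("q2", "Y"): ("q2", "Y", "L"),
    ("q3", "0"): ("q1", "X", "R"), ("q3", "Y"): ("q4", "Y", "R"),
    ("q4", "Y"): ("q4", "Y", "R"), ("q4", "B"): ("q5", "B", "N"),
}

def accepts(x):
    tape = list(x) + [B]           # input followed by one blank cell
    state, head = "q0", 0
    while state != "q5":           # q5 is the accepting state
        key = (state, tape[head])
        if key not in delta:       # undefined transition: reject
            return False
        state, write, move = delta[key]
        tape[head] = write
        head += MOVE[move]
    return True

for w in ["01", "000111", "0011", "0101", "00111", ""]:
    print(repr(w), accepts(w))
```

Tracing a rejected input such as 00111 shows the machine stalling in q4 on a leftover 1, exactly the "undefined transition" case described above.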


A DTM can also be represented with a table, called the state transition table. The
table below defines the DTM that we have designed. The states are listed on the first
column and the tape symbols on the first row. Table entries contain the triple (next
state, symbol written, direction). Blank entries are for the cases of rejecting the
input.

State transition table of M

        0            1            X            Y            B
q0      (q1, X, R)
q1      (q1, 0, R)   (q2, Y, L)                (q1, Y, R)
q2      (q2, 0, L)                (q3, X, R)   (q2, Y, L)
q3      (q1, X, R)                             (q4, Y, R)
q4                                             (q4, Y, R)   (q5, B, N)
q5


Formally, a DTM M is defined with a sextuple M = (Q, Σ, Γ, δ, q0, F), where

Q: The finite set of states.
q0: The start state (q0 ∈ Q).
Σ: The finite input alphabet.
Γ: The finite tape alphabet, i.e., the set of symbols that are allowed to be written on the tape, including the blank symbol B. Notice that Σ ⊆ Γ.
F: The set of accepting states (F ⊆ Q).
δ: The state transition function (δ: Q × Γ → Q × Γ × {L, N, R}), where R, L and N, respectively, denote the directions in which the head moves right, moves left, or stays.

Given an input string x, if M enters an accepting state in a finite number of moves, we say M accepts x. The language recognized by M, denoted by L(M), is the set of all strings x in Σ* that are accepted by M, i.e.,
L(M) = { x | x ∈ Σ*, M accepts x }.


Example: The DTM M which recognizes L = {0n1n | n ≥ 1} can be formally defined as follows.
M = (Q, Σ, Γ, δ, q0, F),
where Q = {q0, q1, q2, q3, q4, q5}, Σ = {0, 1}, Γ = {0, 1, X, Y, B}, F = {q5}, and the transition function δ is given by the state transition graph shown earlier.



The state transition graph of M shows only the transitions that lead to an accepting state. For example, the graph defines only one transition from the start state q0, for the case when the input symbol is 0. What will happen when the DTM reads 1 in state q0? We assume that in such undefined cases, the machine enters a "dead state" and rejects the input. For simplicity, we do not show such rejecting transitions.


Unless stated otherwise, the reader may assume the following.

• Only lower case letters are used for the input alphabet Σ.
• Initially, the input tape is written with a string x over the input alphabet, i.e., x ∈ Σ*, and the tape head is reading the leftmost symbol, if the input is not ε.

Notice that these assumptions do not affect the computational capability of the machines. We could assume that the machine's read/write head is initially positioned at an arbitrary location on the input string. Then we can simply modify the machine, as the following figure illustrates, so that it first moves onto the leftmost symbol of the input string and then starts the computation.

[Figure: a preamble added before the start state: on any input symbol X the head moves left, (X, X, L), until it reads the blank symbol; then (B, B, R) positions the head on the leftmost input symbol and the machine enters its original start state q0.]


4.2 Deterministic linear bounded automata (DLBA)

A DLBA is a DTM whose read/write tape range is bounded by the length of the input string. To indicate this range, the input string is delimited by a pair of boundary markers (the matching brackets) as shown below. During the computation, if the DLBA reads one of the markers, it must either halt or move back in the direction it came from. Note that the boundary markers '[' and ']' are in Γ.
Formally, a DLBA is defined with an octuple including the boundary markers as shown below. Except for the boundary markers, the other elements of the octuple and the language recognized by the automaton are specified as in the formal definition of a DTM.

M = (Q,  ,  ,  , q0, [, ], F) [ a b a a b ]

q0


With a simple modification of the DTM recognizing {0n1n | n ≥ 1} as shown below, we can make the state transition graph of a DLBA which recognizes the same language. Notice that in state q4, reading the closing boundary marker ']' (instead of the blank symbol B), the machine enters the accepting state. All the other moves are identical to those of the DTM.

[Figure: the same transition graph as for the DTM, except that the final transition from q4 into q5 is ( ], ], N ) instead of (B, B, N).]

4.3 Deterministic pushdown automata (DPDA)

A deterministic pushdown automaton (DPDA) is a restricted version of the DTM. The two-way read/write tape head is restricted to a one-way read-only head and, instead, the machine is equipped with a pushdown stack as shown below. In each move the machine does the following:
(1) Read the stack-top symbol.
(2) Depending on the stack-top symbol and the current state, either read the input or do not read it.
(3) Depending on the input symbol read (if it was allowed to read) and the stack-top, pop the stack-top symbol or push some symbols, possibly changing the stack-top.

[Figure: a DPDA with a one-way read-only input tape (a b b a a b), a finite state control, and a pushdown stack with Z0 at the bottom.]

Notice that depending on the state and the stack-top symbol, the input head either is or is not allowed to read. When it reads the input tape, the head moves right onto the next cell; otherwise, it does not move. The stack head is always positioned at the stack top. We assume that a designated bottom-of-stack symbol (Z0) is initially written at the bottom of the stack.


As we did for the DTM and DLBA, we shall mainly describe a DPDA with a state transition graph or the transition function δ, as shown below. To denote a move that does not read the input tape, we use the null character ε for the parameter "input symbol read." We call such a move an ε-move. If the parameter "stack-top content changed" is ε, it is a popping move, i.e., the stack-top symbol is erased.

(a) In the graph, an edge from the current state to the next state is labeled (input symbol read, stack-top symbol / stack-top content changed).
(b) In functional form, δ(current state, input symbol read, stack-top symbol) = (next state, stack-top content changed).

The following figures show a sequence of moves of a DPDA and their representations in terms of the transition function δ and the state transition graph. Notice that the second move is an ε-move.

[Figure: four successive configurations of a DPDA on input abc. The moves are δ(p, a, Z) = (q, YX), δ(q, ε, Y) = (r, V), and δ(r, b, V) = (q, ε); in the state transition graph they appear as the edges p -(a, Z/YX)-> q, q -(ε, Y/V)-> r, and r -(b, V/ε)-> q.]


Formally, a DPDA M is defined by the septuple
M = (Q, Σ, Γ, δ, q0, Z0, F),
where Q is the set of states, Σ is the input alphabet, Γ is the stack alphabet, δ is the transition function, q0 is the start state, Z0 is the bottom-of-stack symbol, and F is the set of accepting states.
The function δ, given the current state, the stack-top symbol, and the input symbol read or ε (for an ε-move), outputs the next state and a string of stack symbols with which the stack-top symbol is replaced, i.e.,
δ: Q × (Σ ∪ {ε}) × Γ → Q × Γ*

According to the above definition, M can rewrite the stack-top symbol with a string (of stack symbols) of finite length (including the null string). For convenience we can simplify the stack operation such that the machine either pops the stack-top or pushes one symbol on top of the stack, without affecting the language recognized by the machine. At the end of this chapter we will show how.

Starting with the initial configuration with the input head on the leftmost symbol
and the stack head at the bottom of the stack, a DPDA M accepts the input if and
only if it enters a configuration satisfying the following two conditions. (Notice that
the conditions are irrelevant to the stack contents.)

(1) The input head has read the input string up to the last symbol.
(2) The machine is in an accepting state (qf in figure (b)).

[Figure: (a) the initial configuration, with the input head on the leftmost symbol, the machine in the start state q0, and Z0 on the stack; (b) an accepting configuration, with the whole input read and the machine in an accepting state qf. The stack contents are irrelevant to acceptance.]
The language L(M) is the set of input strings accepted by M, i.e.,
L(M) = { x | x ∈ Σ*, M accepts x }.

Designing a DPDA Automata

Example 1. Design a DPDA M which recognizes L = {aibi | i > 0}.

Here is a simple idea. Reading a's, M pushes them onto the stack, and then, for each b read from the input tape, the machine erases one a by popping it from the stack top until it sees Z0 (i.e., the stack is empty). Seeing that the stack is empty, M enters an accepting state without reading the input. The state transition graph of M is shown below.
The best way to understand how this DPDA works is to trace its moves along the transition graph. See how M accepts the input string aaabbb and fails to accept aaabb.
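Hand tracing can be checked against a small simulator. The sketch below (Python; the state names q0, q1, q2 are ours, since the graph leaves its states unlabeled, and q2 is the accepting state) encodes the five transitions and applies the two accepting conditions:

```python
# A DPDA simulator for {a^i b^i | i > 0}. State names q0, q1, q2 are
# ours (the text's graph leaves them unlabeled); q2 is the accepting state.
Z0 = "Z"   # bottom-of-stack symbol

delta = {
    ("q0", "a", Z0): ("q0", "a" + Z0),   # push the first a
    ("q0", "a", "a"): ("q0", "aa"),      # keep pushing a's
    ("q0", "b", "a"): ("q1", ""),        # pop one a for the first b
    ("q1", "b", "a"): ("q1", ""),        # ... and for each further b
    ("q1", None, Z0): ("q2", Z0),        # eps-move: stack emptied, accept
}

def accepts(x):
    state, stack, i = "q0", Z0, 0        # stack[0] is the stack top
    while True:
        top = stack[0]
        if (state, None, top) in delta:                   # eps-move
            state, push = delta[(state, None, top)]
        elif i < len(x) and (state, x[i], top) in delta:  # reading move
            state, push = delta[(state, x[i], top)]
            i += 1
        else:
            break
        stack = push + stack[1:]         # replace the top with the pushed string
    return i == len(x) and state == "q2" # both accepting conditions

for w in ["aaabbb", "aaabb", "ab", "ba", ""]:
    print(repr(w), accepts(w))
```

On aaabb the loop halts in q1 with an a still on the stack, so the second accepting condition fails, matching the hand trace.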

[Figure: the state transition graph. From the start state, (a, Z0/aZ0) pushes the first a and (a, a/aa) pushes the rest; (b, a/ε) moves to a popping state, where (b, a/ε) pops one a per b; finally the ε-move (ε, Z0/Z0) enters the accepting state.]

Example 2. Design a DPDA which recognizes {aibj | j > i > 0}.
We will use the same idea of pushing a's and then, for each b read, popping one a from the stack top. Notice that by the time the stack is empty (i.e., Z0 appears at the stack top), the machine has read the same number of a's and b's. It should make sure that there are more b's on the tape by reading them up to the last one. To do this the machine reads one more symbol and, if it is b, enters an accepting state. In the accepting state the machine reads off all the remaining b's, if any.

[Figure: the state transition graph. From the start state, (a, Z0/aZ0) and then (a, a/aa) push the a's; (b, a/ε) pops one a per b; when Z0 surfaces, (b, Z0/Z0) reads one extra b and enters the accepting state, where (b, Z0/Z0) reads off the remaining b's.]


Example 3. Design a DPDA that recognizes the language {aibj | i > j > 0}.

Notice that the strings of this language have more a's than b's. The figure shown below is a DPDA that recognizes this language. Notice that this machine reads the first a and discards it. Then the machine pushes all the remaining a's onto the stack and, reading the first b, enters an accepting state. In the accepting state, it reads off all the remaining b's, popping one a from the stack top for each b read.
Because all the a's in the input except the first one have been pushed onto the stack, during the popping phase it is guaranteed that the total number of a's read from the input tape is strictly greater than the number of b's.
See how this DPDA works by tracing its moves for the input strings aaabb and aabbb.
[Figure: the state transition graph. From the start state, (a, Z0/Z0) discards the first a; then (a, Z0/aZ0) and (a, a/aa) push the remaining a's; (b, a/ε) enters the accepting state, where (b, a/ε) pops one a for each further b read.]


For comparison, here are the three state transition graphs that we have designed in the previous examples.

[Figure: the state transition graphs for {aibi | i > 0}, {aibj | i > j > 0}, and {aibj | j > i > 0}, repeated side by side.]

Example 4. Design a DPDA which recognizes the following language.

{aibkci | k, i > 0} ∪ {aibkdk | k, i > 0}
Notice that to recognize the first half of the language, the machine needs to push the a's onto the stack to match them with the c's. However, we cannot let the machine read off the b's and discard them, because the input string may belong to the second half of the language. The machine needs all the a's and b's (in this order) pushed onto the stack until it reads c or d from the input tape. Reading c, we let it pop out all the b's in the stack before it begins matching a's and c's.
(, b/)
(, a/) (c, a/)

(c, b/)
(a, a/aa) (, Z0/ Z0)
(b, a/ba)
(d, b/)

(a, Z0/aZ0) (b, b/bb)


(, a/a)
start
(d, b/)
4.4 Deterministic finite automata (DFA)
The deterministic finite automaton (DFA) is the simplest computational model, which can recognize only regular languages. A DFA is a DPDA with the pushdown stack removed. Thus a DFA M is defined with the quintuple M = (Q, Σ, δ, q0, F), where Q, Σ, q0, and F are the same as defined for a DPDA. The transition function δ is the mapping
δ: Q × Σ → Q.
In a move, the machine reads the input tape and, depending on the current state and the input symbol read, enters a state in Q and moves the head right. For example, δ(p, a) = q implies the move shown in figure (a) below, which can be represented with the state transition graph shown in figure (b).

[Figure: (a) in state p reading a, the one-way read-only head moves right and the machine enters state q; (b) the corresponding edge p -a-> q in the state transition graph.]

A DFA accepts the input string if and only if it satisfies the following two
conditions, which are the same for DPDA‟s.

(1) The input head has read the input string up to the last symbol.
(2) The machine is in an accepting state.

The language recognized by a DFA M, denoted by L(M), is

L(M) = { x | x ∈ Σ*, M accepts x }.

Example 1. Construct a DFA which recognizes the following language.

L = {x | x ∈ {0, 1}+, the number of 1's in x is odd}
Here is the transition graph of a DFA which recognizes the language L.

[Figure: two states, each with a 0 self-loop; the 1-transitions switch between them. The start state is non-accepting; the other state is accepting.]
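This DFA is small enough to run directly. In the sketch below (Python; the state names even and odd are ours, since the graph leaves its states unlabeled) the two states record the parity of the 1's read so far:

```python
# A two-state DFA tracking the parity of 1's read so far.
# State names "even" and "odd" are ours; "odd" is the accepting state.
delta = {("even", "0"): "even", ("even", "1"): "odd",
         ("odd", "0"): "odd", ("odd", "1"): "even"}

def accepts(x):
    state = "even"                 # start state
    for ch in x:
        state = delta[(state, ch)]
    return state == "odd"          # epsilon ends in "even", so it is rejected

for w in ["1", "0", "101", "0110", "0111"]:
    print(w, accepts(w))
```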

Designing a DFA Automata

Example 2. Design a DFA which recognizes the following language.

L = {x | x is a binary number, i.e., x ∈ {0, 1}+, and x mod 3 = 0}
Let M be a DFA which recognizes L. In addition to the start state, M has three states, each labeled with one of the three possible remainders 0, 1, and 2 (see the graph below). Reading the input binary string, M computes the current remainder on the fly and enters the state labeled with that remainder, as follows.
[Figure: the transition graph, with the states labeled 0, 1, and 2 for the current remainder; state 0 is accepting.]
For a binary string w, let dec(w) be the decimal value of w. If the first symbol read is 0, the current remainder is 0; if it is 1, the remainder is 1. So, depending on the first input symbol read, M enters either state 0 or state 1. Suppose that, reading up to a prefix v of the input string, M has entered state i ∈ {0, 1, 2}, meaning that i = dec(v) mod 3. Reading the next bit j ∈ {0, 1}, M enters the state labeled dec(vj) mod 3. Since dec(vj) mod 3 = (2·dec(v) + j) mod 3 = (2i + j) mod 3, M can easily identify the next state to enter, as the transition graph shows.
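The rule (2i + j) mod 3 translates directly into code. The sketch below (Python; the extra state name start is ours) simulates M on a few binary strings:

```python
# DFA for binary multiples of 3: after reading a prefix v the state is
# dec(v) mod 3, and reading bit j moves state i to (2*i + j) mod 3.
# The extra state name "start" is ours; it makes the empty string rejected.
def accepts(x):
    state = "start"
    for ch in x:
        j = int(ch)
        state = j if state == "start" else (2 * state + j) % 3
    return state == 0          # "start" != 0, so the empty string is rejected

for w in ["0", "11", "110", "101", "1001"]:
    print(w, "=", int(w, 2), "->", accepts(w))
```

Each state transition costs constant time, so the DFA decides divisibility by 3 in a single left-to-right pass, with no arithmetic on the whole number.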
Rumination (1): States
Every automaton has some states. We have introduced the word "state" without elaborating the deep conceptual issues involved. Actually, a state represents the whole of the information stored in the finite state control that the machine needs for its work. The DTM that we designed in Example 2 of Section 4.1 has six states. In each state qi the machine does a unique piece of work. Erasing the leftmost 0 with X, the machine keeps moving right in state q1 till it reads the leftmost 1. Rewriting this 1 with Y, the machine keeps moving left in state q2 until it reads X (see the top figures (a) through (d) below). Would it be possible to do this part of the work in one state? The answer is no. The figures at the bottom show why.

Suppose that the machine, erasing the leftmost 1 with Y, moved left in the same state q1. Then the machine would encounter the same symbol 0 that it came across moving right. Reading this 0 the machine should move right, because a DTM must take the same move (as in figure (a)) with the same input 0 in the same state q1, i.e., δ(q1, 0) = (q1, 0, R). Reading the same 0 a second time in state q1, the machine has no information in its finite control indicating that it should move left this time. The machine's finite control must hold some information different from what it has in state q1. Being in state q2, different from q1, provides exactly that.

[Figure: top row, tape snapshots (a) through (d) of the correct machine using states q1 and q2; bottom row, the same snapshots if only q1 were used, where the move on reading 0 the second time is undetermined (marked ?).]


The six states of the DTM in Example 2 (repeated below) can be implemented with a three-bit register, for example, letting its values 000, 001, 010, 011, 100, and 101 represent the states q0, q1, q2, q3, q4, and q5, respectively. The number of states is directly related to the hardware complexity of the machine. Unfortunately, given an arbitrary automaton, it is an unsolvable problem to minimize the number of states of the machine. However, for a DFA it is possible to minimize them. We will learn how in Chapter 8.

[Figure: the transition graph of Example 2 repeated, with the finite control shown as a three-bit register holding 010 (state q2).]

Rumination (2): Accepting conditions
Recall that by definition, both the DTM and the DLBA accept the input string by entering an accepting state. Hence, no transition is needed out of an accepting state. We may think of both the DTM and the DLBA as halting in an accepting state. Also recall that both the DTM and the DLBA can have more than one accepting state (i.e., F ⊆ Q). However, we can show that one accepting state is enough: if a machine has more than one accepting state, we simply merge them into one, as the figure below illustrates. Thus, we can design a user-friendly DTM or DLBA which, entering the accepting state, can kindly announce that the input has been accepted.
What will happen when the input string is not accepted? Would it be possible to design a DTM which can inform us of not accepting the input? What about the other machine models? We will deal with these intriguing questions and related ones in Chapters 7, 12 and 15, under the title of Hierarchy of Languages and Automata.

In contrast, for both DPDA and DFA the following two conditions should be satisfied simultaneously to accept the input string.

(1) The input head has read the string up to the last symbol.
(2) The finite state control is in an accepting state.

Notice that, with no unique marker appended to the end of the input string, it is impossible for a DPDA or a DFA to identify whether the symbol just read is the last one. Figures (a) and (b) below, respectively, show a DPDA and a DFA recognizing the language L = {ai | i > 0}. Suppose that the string aaaa is given as input. Reading the first a, the machines enter the accepting state. However, it is too early to say that the input string is accepted, because the last a has not been read yet (i.e., condition (1) is not yet satisfied). Since the machines keep reading a's in the accepting state, they will eventually satisfy both conditions.
[Figure: (a) a DPDA in which (a, Z0/Z0) leads from the start state to an accepting state with self-loop (a, Z0/Z0); (b) the analogous DFA, where an edge labeled a leads to an accepting state with an a self-loop.]
If the input string had a symbol different from a, the machine would fail to accept it because the transitions are defined only
for the symbols in the input alphabet. Since the blank symbols does not belong to the input alphabet, the machine cannot read
the blank symbol next to the last input symbol to ensure that the symbol read in the previous move is the last input symbol or
not. (Such an illegal move is quite tempting for a student designing a DPDA.)
From the above analysis we find that both DPDA and DFA are not autonomous computational models that can kindly tell us
whether the input is accepted or not. In other words, we cannot make them say “Yes, I accept it!” or “Sorry, I cannot.” Rather
we, the user, must decide, for a given input string, whether or not the machine satisfies the two conditions after some finite
number of moves.
Concerning these accepting conditions, we may raise the following question: why should we put the burden of this decision on
the user? How about appending an end-of-string marker to every input string? Consider the language L' = {a^i ; | i > 0},
with the marker ';' appended to every string in L. For such a language we don't need condition (1) above. The input string will be
accepted if and only if the machine enters an accepting state (see the figures below).
[Figure: the DPDA and the DFA above, each extended with a transition on ';' (for the DPDA, (;, Z0/Z0)) into the accepting state.]
Appending an end-of-string marker to every string appears to relieve the user of checking whether a DPDA or a DFA
accepts it. However, this approach introduces a new burden for the user in real applications. For example, the programming
language C uses such an approach (each statement must end with a semicolon), and we know why it is a burden, while FORTRAN does
not. In this book we mainly study automata whose languages have no end-of-string markers.
Recall that for both the DTM and the DLBA, the only accepting condition is to enter an accepting state. The user does not have to
check whether the last input symbol has been read. These machines can identify the last symbol by reading the neighboring symbol to its right, i.e., the
blank symbol for the DTM and the right boundary marker ']' for the DLBA. (Notice that the blank symbol belongs to the tape alphabet, which
the DTM can read.) Since the tape head of these models is bidirectional, they can accept the input string with the head located at
any position.
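The two accepting conditions can be made concrete with a small simulation. Below is a minimal sketch (our own, not from the text) of a checker for the DFA recognizing L = {a^i | i > 0}; the state names s and f are hypothetical. Condition (1) corresponds to the loop having consumed the whole input, and condition (2) to ending in the accepting state f.

```python
def dfa_accepts(input_string):
    """Simulate the two-state DFA for L = {a^i | i > 0}."""
    delta = {("s", "a"): "f", ("f", "a"): "f"}  # hypothetical state names
    state = "s"
    for symbol in input_string:
        if (state, symbol) not in delta:   # undefined transition: reject
            return False
        state = delta[(state, symbol)]
    # Condition (1) holds here: the whole input has been read.
    return state == "f"                    # condition (2): accepting state
```

Like the user described above, the simulator can decide acceptance only once the whole input has been supplied; after a proper prefix, merely being in state f is not yet sufficient.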
Rumination (3): Stack operation of DPDA
By definition, in every move a DPDA rewrites the stack-top symbol with a string of finite length (including the null string).
(Notice that rewriting the stack-top with the null string amounts to a popping operation.) This stack operation can be simplified
so that in each move the stack-top symbol is either popped or exactly one symbol is pushed on top of the stack, without affecting the
language recognized by the machine. (Notice that in this simplified version, there is no stack operation rewriting the stack-top
symbol with another.)
Suppose that, as illustrated in figure (a) below, there is a DPDA which has a stack operation rewriting the stack-top with a
string of length 1 (i.e., simply replacing the stack-top with another symbol) or of length at least 2 (i.e., pushing one or more symbols on
top of the stack). Figure (b) shows a simplified version of figure (a). Notice that the stack operation involved in the transition
with (b, A/B) is replaced by two transitions, a pushing operation (b, A/XB) followed by a popping operation (ε, X/ε),
where X is a newly introduced stack symbol. It is clear from the transformation that such simplification does not affect the
language recognized by the DPDA.
[Figure (a): a DPDA with transitions such as (b, A/B) and (a, Z0/AaBZ0), which rewrite or push several symbols in one move. Figure (b): the simplified version, in which every move pops or pushes exactly one symbol, using transitions such as (b, A/XB) followed by (ε, X/ε), and (a, Z0/BZ0) followed by (ε, B/aB) and (ε, a/Aa).]
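The simplification can be sketched as a small transformation on transition tables. The following is our own illustration, not the book's construction: it pops first and then pushes the replacement string one symbol at a time through fresh intermediate states, a variant of the push-then-pop decomposition in the figure; both preserve the language. The tuple encoding of transitions is hypothetical.

```python
def split_transition(src, inp, top, dst, push, fresh):
    """Split delta(src, inp, top) = (dst, push) into moves that each pop or
    push exactly one symbol. `fresh` yields unused state names."""
    if not push:  # rewriting the stack-top with the null string is a unit pop
        return [(src, inp, top, dst, ("pop",))]
    mid = next(fresh)
    # One unit move reads the input (if any) and pops the old stack-top.
    steps = [(src, inp, top, mid, ("pop",))]
    state = mid
    # Push the replacement string one symbol at a time, rightmost symbol first,
    # so the leftmost symbol of `push` ends up on top of the stack.
    for i, sym in enumerate(reversed(push)):
        nxt = dst if i == len(push) - 1 else next(fresh)
        steps.append((state, None, None, nxt, ("push", sym)))
        state = nxt
    return steps
```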
Rumination (4): Reading operation of a DPDA
Finally, we would like to ruminate on the reading operation of a DPDA and clarify some tricky concepts involved. Recall that
a DPDA does the following in a move.
(1) In the current state, read the stack-top symbol.
(2) Depending on the stack-top symbol and the current state, either read the input symbol or do not read it.
(3) Depending on the input symbol read (if reading was allowed in (2) above) and the stack-top symbol, pop the stack-top
symbol or push some symbols, possibly changing the stack-top.
For a state p, a stack symbol A, and an input symbol a, if the transition δ(p, a, A) is defined, then the above definition implies that
the transition δ(p, ε, A) should not be defined, and vice versa. On the contrary, suppose that both transitions δ(p, a, A) = (q, BA)
and δ(p, ε, A) = (r, CA) are defined (see the figure below). The transition δ(p, a, A) = (q, BA) implies the following move.
(1) In state p, the DPDA reads the stack-top symbol. (This part is always done in any state.)
(2) If the stack-top symbol is A, the machine reads the input, and
(3) if the input symbol read is a, pushes B on top of the stack and enters state q.
In contrast, the transition δ(p, ε, A) = (r, CA) implies the following move:
(1) In state p, the DPDA reads the stack-top symbol.
(2) If the stack-top is A, the machine does not read the input, and
(3) pushes C on top of the stack and enters state r.
[Figure: from state p, the transition (a, A/BA) leads to state q and the transition (ε, A/CA) leads to state r.]
Notice that in the two moves δ(p, a, A) = (q, BA) and δ(p, ε, A) = (r, CA), the machine takes two contradictory actions
(the boldfaced phrases) in the same state p with the same stack-top symbol A; in the former move, the machine reads the input
symbol, while in the latter it does not. This violates the determinism of the DPDA, which requires that for the same cause (i.e., the
same state p and stack-top symbol A) there be a unique response. Deterministic automata do not allow such moves. In the
following chapter we will study nondeterministic automata, where such contradictory moves are allowed.
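The determinism condition just described can be checked mechanically. Here is a sketch of our own, with the transition table encoded as a hypothetical Python dict; it flags exactly the two violations discussed: several outcomes for one configuration, and coexisting reading and non-reading moves on the same state and stack-top.

```python
EPS = ""  # we use the empty Python string for the null input epsilon

def is_deterministic(delta):
    """delta maps (state, input_symbol_or_EPS, stack_top) -> list of outcomes,
    where each outcome is a (next_state, push_string) pair."""
    for (p, a, A), outcomes in delta.items():
        if len(outcomes) > 1:
            return False  # one cause with several effects
        if a != EPS and (p, EPS, A) in delta:
            return False  # reading and non-reading moves from the same p and A
    return True
```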

Exercises

4.1 Design a DTM which, given a binary number x ∈ {0, 1}+ on the input tape, increments x by one and halts in an
accepting state.

4.2 What does the following DTM do? The input is a binary string. Describe the work in detail, referring to each state.
For your answer, it is a good strategy to trace the moves of the machine on a couple of short input strings.

[Transition graph for Exercise 4.2: a DTM with states 0 through 6 (start state 0), with transitions including (0,0,R), (1,1,R), (B,B,L), (0,B,L), (1,B,L), (1,0,L), (0,1,L), (0,1,R), (B,0,N), (B,1,N), (B,1,R), and (1,1,L).]

4.3 Design a DTM which, given a string x ∈ {a, b}*, does each of the following and then halts in an accepting state.
(a) Shift the string x one cell to the right. (b) Shift the string x two cells to the right.

4.4 Design a DTM which, given a string x ∈ {a, b}*, collects all a's in x to the left and all b's to the right. For example, given
the string baabbab as input, your machine should rearrange the symbols so that aaabbbb is the only string on the tape.

4.5 Design a DTM which, given a string x ∈ {a, b, c}+, moves all c's toward the left end of the string and halts. For example,
given the string baacbbccabc, the machine should leave only the string ccccbaabbab on the tape and halt. With the transition
graph of your machine and an input string, briefly explain how it works.
4.6 Design a DPDA to recognize each of the following languages. (Notice that L1, L2, and L3 have an end-of-string marker ';'.)
(a) L1 = {a^i b^j ; | i > j > 0 } (b) L2 = {a^i b^j ; | i ≥ j > 0 } (c) L3 = {a^i b^j ; | i ≥ j ≥ 0 }
(d) L4 = {a^i b^j | i > j > 0 } (e) L5 = {a^i b^j | i ≥ j > 0 } (f) L6 = {a^i b^j | i ≥ j ≥ 0 }
(g) L7 = { a^i b^k c^i | i, k > 0 } (h) L8 = { a^i b^k c^i | i, k ≥ 0 }

4.7 Design a DPDA to recognize the language generated by each of the following context-free grammars.
(a) G1: S → aSbb | abb (b) G2: S → aSbb | ε (c) G3: S → aSbb | ccc

4.8 Design a DPDA to recognize each of the following languages.

L1 = { x | x ∈ {a, b, c}+, in x the number of a's and the number of b's are the same }
L2 = { x | x ∈ {a, b, c}+, in x the number of a's is strictly greater than the number of c's }

4.9 Design a DFA to recognize each of the following languages.

(a) L1 = { babo } (b) L2 = { xbaboyobba | x, y ∈ {a, b, o}* }
(c) L3 = { x | x ∈ {a, b, c}+, string x begins and ends with the same letter }

The following problem asks for modeling a system by a DFA. The original problem is given as an exercise (page 53) in the book
"Introduction to Automata Theory, Languages, and Computation" by Hopcroft, Motwani, and Ullman (Addison-Wesley,
2001). Here we present it slightly modified.
4.10 Consider the network of ducts shown below. The network has two entries, labeled A and B, and two exits, C and D. There are two
levers: x1 at the branching section and x2 at the cross-section. If we drop a marble in at A or B, the marble rolls down along the
ducts and hits one of the levers, which causes it to fall either to the left or to the right. Whenever a lever is hit by a marble, the
lever changes its orientation by 90°, so that the next marble to encounter the lever will be directed in the opposite direction. The
initial lever positions are set as shown in the figure, indicating that both levers direct the marble to the right.
Model this network by a DFA whose input is a binary string, where 0 denotes a marble dropped in at A and 1 denotes a marble
dropped in at B. The automaton accepts the input if the last marble comes out of exit D.

[Figure: the duct network, with entries A and B at the top, lever x1 at the branching section, lever x2 at the cross-section below it, and exits C and D at the bottom.]

Models of Language Recognition: Automata
5. Nondeterministic Automata
The computational machine models that we studied in Chapter 4 are deterministic in the sense that the next move (i.e., the
effect) of the automaton is uniquely determined by the current state and the input (i.e., the cause), including the stack-top for
the case of the DPDA. Thus the state transition graph of a DFA, for example, has at most one outgoing edge for each input
symbol. For a state p and an input symbol a of a DFA, the transition function δ(p, a) gives no more than one next state, for
example, δ(p, a) = q. Deterministic machines follow our intuition that for a given cause the effect is uniquely determined.
In this chapter we will study computational models whose transitions do not go along with our deterministic intuition.
They are allowed to take more than one action in a move. More specifically, for the same state and input symbol (or
stack symbol, for the pushdown automata), we allow the automata to take more than one transition simultaneously. For
example, a nondeterministic finite state automaton (NFA) can have a nondeterministic transition like δ(p, a) = {q, r}, i.e.,
reading symbol a in state p, the machine enters the two different states q and r simultaneously. Likewise, nondeterministic
pushdown automata (NPDA) can have δ(p, a, A) = {(q, BA), (r, ε)}, i.e., they can push and pop simultaneously, possibly
entering different states. Nondeterministic linear bounded automata (NLBA) and nondeterministic Turing machines (NTM)
can have nondeterministic transitions like δ(p, a) = {(q, b, R), (q, b, L)}.
Nondeterministic automata are a conceptual model that cannot be implemented with current technology. However, we
will see that once we understand the underlying concept of these nondeterministic automata, we can use them as a powerful
tool for the design and analysis of deterministic machines. We will study these nondeterministic models beginning with the
NFA, and then the NPDA, NLBA, and NTM, in that order.


5.1 NFA (Nondeterministic Finite Automata) 126


Definition and examples
5.2 NPDA (Nondeterministic Pushdown Automata) 132
Definition and examples
5.3 NTM (Nondeterministic Turing Machines) and
NLBA (Nondeterministic Linear Bounded Automata) 139
5.4 Nondeterministic Computing 140
Nondeterministic computing
Example
5.5 NFA and ε-transitions 145
ε-transitions in an FA
Eliminating ε-transitions
Rumination 154
Exercises 156

Today's Quote (Break Time)

I pay very little regard to what any young person says on the subject of marriage. If they profess a disinclination for
it, I only set it down that they have not seen the right person.
- Jane Austen -
Love is the difficult realization that something other than oneself is real.
- Iris Murdoch -

5.1 Nondeterministic Finite Automata (NFA)

Figure (a) below shows the state transition graph of an NFA that has
nondeterministic transitions in states q0 and q2, i.e.,
δ(q0, b) = {q0, q1}, δ(q2, a) = {q0, q1}.
For the input string bba, this NFA takes multiple sequences of transitions
simultaneously, as shown in figure (b), which we shall call the state transition profile for
the given input. Notice that each root-to-leaf path corresponds to a full sequence of
transitions for the input. (In the figure, X denotes the dead state.)
[Figure (a): the NFA M, with states q0, q1, and the accepting state q2. Figure (b): the state transition profile of M for the input bba, a tree rooted at q0 whose root-to-leaf paths trace the possible transition sequences; X denotes the dead state.]

The accepting condition of an NFA is defined as follows.

An NFA accepts the input if and only if the transition profile has an accepting
leaf node (i.e., state). (All the other, non-accepting leaf nodes are ignored.)

In other words, the input string is accepted by an NFA if and only if there is a
sequence of transitions that finally enters an accepting state. According to this
definition, the input string bba is accepted by M.
[Figure: the NFA M and its transition profile for bba, repeated from the previous figure.]


Figure (b) below shows the transition profile for the input string bbb. This profile
has no root-to-leaf path ending in an accepting state. Hence, by definition, the input
is rejected by M.

a,b q0
b q1 b b
b q0 b q1 b
q0 a a
b q0 b q1 b X
a X
q2 q0 q1

(a) An NFA M (b) Transition profile of M for bbb.


Formally, an NFA M is defined by the quintuple

M = (Q, Σ, δ, q0, F),

where all the elements are the same as for DFA's, except for the state
transition function δ, which is defined as follows.

δ : Q × Σ → 2^Q

That is, the automaton, reading an input symbol in a state, takes transitions
simultaneously to every state in a subset of Q. (Recall that 2^Q denotes the set of
subsets of Q.)
For a string x, let δ(p, x) denote the set of states reachable from state p after
reading the input string x. Then the language L(M) is defined as follows.

L(M) = { x | x ∈ Σ*, δ(q0, x) = S, and S ∩ F ≠ Ø }

Notice that S ∩ F ≠ Ø implies that the transition profile for the string x has an
accepting leaf node, i.e., there is a sequence of transitions ending in an accepting
state.
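The definition of L(M) suggests a direct simulation: track the set S of states reached so far, and accept when S ∩ F ≠ Ø. A minimal sketch of our own follows; the dict encoding of δ and the small NFA in the usage check are hypothetical.

```python
def nfa_accepts(delta, start, accepting, x):
    """delta maps (state, symbol) -> set of next states; missing keys play the
    role of the dead state X. Returns True iff delta(start, x) intersects
    the accepting set."""
    S = {start}
    for symbol in x:
        S = {q for p in S for q in delta.get((p, symbol), set())}
    return bool(S & accepting)
```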

Designing an NFA
Example 1. Construct an NFA which recognizes L = {xaa | x ∈ {a, b}*}.
This language is the set of strings in {a, b}* ending with aa. The figure below shows
an NFA which recognizes L. This NFA, reading b, stays in state 1, while reading a, it
takes nondeterministic transitions to states 1 and 2.
Because of this nondeterministic transition, it is guaranteed that whenever the input string
contains aa, the machine enters the accepting state 3 on reading the second a. Hence,
if that second a is the last symbol of the input string, then by the definition the string is
accepted.
Notice that together with this accepting sequence of transitions, there are other, non-
accepting sequences of transitions ending in a non-accepting state, which we ignore by
the definition.
[Figure: an NFA with states 1, 2, 3. State 1 has a self-loop on a, b; on a, state 1 also moves to state 2; on a, state 2 moves to the accepting state 3.]
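To gain confidence in the design, the NFA above can be simulated and compared against the direct test "ends with aa" on every short string over {a, b}. The check below is our own sketch; the numeric state names follow the figure.

```python
from itertools import product

DELTA = {(1, "a"): {1, 2}, (1, "b"): {1}, (2, "a"): {3}}  # state 3 accepting

def accepts(x):
    """Subset simulation of the three-state NFA for L = {xaa}."""
    states = {1}
    for c in x:
        states = {q for p in states for q in DELTA.get((p, c), set())}
    return 3 in states

def matches_spec(max_len=6):
    """Compare the NFA against x.endswith('aa') on all strings up to max_len."""
    for n in range(max_len + 1):
        for chars in product("ab", repeat=n):
            x = "".join(chars)
            if accepts(x) != x.endswith("aa"):
                return False
    return True
```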


Example 2. Construct an NFA to recognize the following language.

L = {xaaybbz | x, y, z ∈ {a, b}*}

This language is the set of strings in {a, b}* each containing an occurrence of aa
followed, not necessarily immediately, by an occurrence of bb. We use the same idea
as in Example 1. We connect two transition graphs, one recognizing the part xaa
followed by another for ybb, and then let the automaton read off the remaining part z
in an accepting state.

[Figure: an NFA built by chaining the graph for xaa (self-loop on a, b, then a, a) with the graph for ybb (self-loop on a, b, then b, b), ending in an accepting state with a self-loop on a, b.]
Nondeterministic transitions do not give any additional power for recognizing a
language, i.e., NFA's and DFA's recognize the same class of languages. We will learn
why in Chapter 8, where we will study how to transform an NFA into a DFA which
recognizes the same language.

5.2 Nondeterministic Pushdown Automata (NPDA)


Since PDA's have a stack, they can do more work in a move than NFA's.
Therefore they can take more diverse nondeterministic moves at a time. For example,
in a move they can enter different states simultaneously as well as replace the stack-
top with different contents, as the following examples show.

δ(p, a, Z) = {(q, aZ), (r, aZ)}

δ(s, b, Z) = {(t, ε), (t, bZ), (t, cZ)}

The power of the nondeterministic transitions of an NPDA is best shown by the
following simple example.


Example 1. Construct an NPDA to recognize the following language.

L = {a^i b^j c^k | i, j, k > 0, and i = j or j = k }

This language is the union of the following two languages.
L1 = {a^i b^j c^k | i, j, k > 0, i = j }    L2 = {a^i b^j c^k | i, j, k > 0, j = k }
We first construct DPDA's M1 and M2, respectively recognizing L1 and L2.

[Figure (a): the DPDA M1 for L1, which pushes an a for each a read ((a, Z0/aZ0), (a, a/aa)), pops one a for each b ((b, a/ε)), and then reads the c's with (c, Z0/Z0). Figure (b): the DPDA M2 for L2, which reads off the a's ((a, Z0/Z0)), pushes a b for each b ((b, Z0/bZ0), (b, b/bb)), pops one b for each c ((c, b/ε)), and accepts with (ε, Z0/Z0).]

L = {a^i b^j c^k | i, j, k > 0, and i = j or j = k }

It is proven that no DPDA can recognize L. However, as the following figure
shows, we can easily construct an NPDA recognizing L by combining
the two DPDA's M1 and M2 with a nondeterministic transition.

[Figure: the NPDA obtained by merging M1 and M2 at a common start state 1, from which the machine nondeterministically proceeds either as M1 (through state 2) or as M2 (through state 3).]

A simple idea for identifying whether an automaton is deterministic is to
check whether the δ function of the automaton gives no more
than one value. However, this check alone is not sufficient for PDA's.
As we mentioned in Chapter 4 while studying the DPDA, if a PDA has two transitions from a
state with the same stack-top symbol, one reading the input and the other not reading,
then the PDA is an NPDA. For example, if the transition function δ of a PDA has the two
transitions shown below, then it is an NPDA. In state p, with B on the stack-top, the
machine takes two contradictory transitions: reading and not reading the input.

δ(p, a, B) = (q, AB)    δ(p, ε, B) = (q, AB)

Let's see how we can utilize such nondeterministic transitions to design an NPDA
that recognizes a language which no DPDA can recognize.


Example 2. Construct an NPDA which recognizes the following language.

L = {xx^R | x ∈ {a, b}+ }

This is a typical context-free language for which it is impossible to design a recognizing
DPDA (see Appendix D for a proof). If there were a marker, say #, at the center
of every string, as in the language L' below, it would be trivial to construct a
DPDA. This DPDA, reading the part x (until it reads #), pushes all the symbols onto the
stack, and then pops the stack-top for every matching symbol read from the second
half (i.e., x^R) until it sees the stack empty.

L' = {x#x^R | x ∈ {a, b}+ }

[Figure: a DPDA for L'. In the figure, X, Y ∈ {a, b}, and (Y, X/YX) is shorthand for (a, a/aa), (a, b/ab), (b, a/ba), (b, b/bb). The machine pushes with (X, Z0/XZ0) and (Y, X/YX), switches phases on (#, X/X), pops with (X, X/ε), and accepts with (ε, Z0/Z0).]
Now we will design an NPDA which uses the power of nondeterministic
transitions to recognize L (see the figure below). The NPDA, reading the input
symbols and pushing them onto the stack in state q1, guesses that the part x^R
begins with the next input symbol, and takes a nondeterministic transition to a
popping state (q2 in the figure) while also remaining in state q1. In the popping state
the machine keeps popping the stack-top for each matching input symbol.
Notice that in the same state q1 with the same stack-top X, the NPDA takes a
reading transition and a non-reading transition simultaneously, the one re-entering
q1 and the other entering q2.
[Figure: the NPDA for L = {xx^R | x ∈ {a, b}+ }, with states q0, q1, q2, q3. In the figure X, Y ∈ {a, b}; (Y, X/YX) is shorthand for (a, a/aa), (a, b/ab), (b, a/ba), (b, b/bb), and (X, X/ε) is shorthand for (a, a/ε), (b, b/ε). From q1 the machine either keeps pushing with (Y, X/YX) or spontaneously moves to the popping state q2 with (ε, X/X), and accepts in q3 with (ε, Z0/Z0).]

A palindrome is a word or phrase that reads the same forwards and backwards (case-
insensitively and with spaces ignored). For example, "Madam" and "sums are not set
as a test on Erasmus" are palindromes. The language L = {xx^R | x ∈ {a, b}+ } is a
language of palindromes.
Every language of palindromes is a context-free language. In Chapter 15 we will
learn that every context-free language can be recognized by an NPDA. However,
there are context-free languages that cannot be recognized by any DPDA, and the
language L above is an example. It is a challenging problem to prove, for a given
context-free language, whether or not it can be recognized by a DPDA. Appendix D
gives a proof that there is no DPDA that can recognize L.

Actual Signs (Break Time)

In a non-smoking area: "If we see you smoking, we will assume you are on fire and take appropriate action."
In the front yard of a funeral home: "Drive carefully. We'll wait."
On a maternity room door: "Push, Push, Push."
On an optometrist's office: "If you don't see what you're looking for, you've come to the right place."
On a butcher's window: "Let me meat your needs."
- Anonymous -


5.3 Nondeterministic Turing Machines (NTM) and Nondeterministic Linear Bounded Automata (NLBA)

The figure below shows a part of the state transition graph of an NTM (or an
NLBA). Notice the various nondeterministic transitions in states 1, 2 and 4. In a
move, the machine may enter more than one state (from state 1 in the figure), write
different symbols (in state 2), and/or move in different directions (in state 4).
Though it would be tedious, we can show that given an NTM M, we can build a
DTM M' which simulates M by systematically keeping track of every
nondeterministic transition of M. NTM's and DTM's are computationally equivalent,
i.e., they accept the same class of type 0 languages.
[Figure: part of the transition graph of an NTM with states 1 through 5. State 1 branches nondeterministically to states 2 and 3 on transitions (a, b, R) and (a, c, R); state 2 writes different symbols on the same read symbol; state 4 moves in different directions with (a, b, R) and (a, b, L).]
5.4 Nondeterministic Computing
Sometimes it helps to think that whenever a nondeterministic move occurs, the
original machine makes a copy of itself for each instance of the move. For example,
suppose that an NFA M, in state p, reads symbol a and takes a nondeterministic move
to q and r, i.e., δ(p, a) = {q, r}. Then, conceptually, two copies of
the machine are created, one in state q and the other in r, and they proceed independently.
The same concept applies to the other nondeterministic automata, and to nondeterministic algorithms.

[Figure: the NFA M in state p, reading a with δ(p, a) = {q, r}, forks into two copies M1 and M2, one in state q and the other in state r.]

For a finite set S, define a function nondet_choose(S), which returns every
element in S simultaneously. The figure below illustrates conceptually how a program
works with a nondeterministic function returning multiple values simultaneously.
Executing nondet_choose, for each of the function values returned, a
copy of the process (i.e., the current instance of the executing program) is created
(i.e., forked). Then all the process copies proceed independently, each with its own
function value assigned.
[Figure: executing x = nondet_choose(q, r) forks the process into two copies, one computing with x = q and the other computing with x = r.]


Using the function nondet_choose we can quickly search the solution space of a
problem by nondeterministically forking (i.e., making copies of the program and
letting them proceed independently). If any sequence of computation
gives the solution, we say the program solves the problem, ignoring the
unsuccessful computations. Otherwise, we say the program fails to find it.
There are many so-called NP-hard problems for which no polynomial-time
algorithm (i.e., one that gives an answer in O(n^k) time, for some constant k) is
available. The sum-of-subset problem, defined as follows, is one of them.

Given a set S of N integers and an integer M, is there a subset
S' ⊆ S such that the sum of the elements in S' is equal to M?

For example, let S = {8, 21, 13, 20, 7, 11, 5} and M = 34. Since 8 + 21 + 5 = 34,
the answer is "yes."


Here is a nondeterministic algorithm which solves the sum-of-subset problem in
time proportional to the problem size (i.e., in time linear in the number of given
integers), assuming that the nondeterministic function takes constant time.

// S is an integer array of size n. This algorithm tests if there is a
// subset of the numbers in S whose sum is equal to M.
1. Nondet-Sum-of-Subset (int S[ ], int n, int M) {
2.    int i, sum = 0;
3.    boolean selected;
4.    for ( i = 0; i < n; i++ ) {
5.       selected = nondet_choose(true, false);
6.       if (selected) sum += S[i];
      }
7.    if (sum == M) return "yes";
8.    else return "no";
   }
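A deterministic simulation of this algorithm explores both outcomes of nondet_choose at every step, i.e., walks the full decision tree over all 2^n subsets. The following sketch (our own, in Python) answers "yes" exactly when some computation path does.

```python
def sum_of_subset(S, M):
    """Depth-first simulation of the forked processes in the decision tree."""
    def walk(i, total):
        if i == len(S):
            return total == M            # line 7's decision, made at a leaf
        # Fork: one copy selects S[i], the other does not.
        return walk(i + 1, total + S[i]) or walk(i + 1, total)
    return "yes" if walk(0, 0) else "no"
```

Of course, this simulation takes exponential time in the worst case; the linear time bound above holds only for the conceptual nondeterministic machine, where all copies run at once.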


Notice that whenever the statement selected = nondet_choose(true, false) is executed,
two processes are created, one with selected = true and the other with selected = false.
Let 1 and 0, respectively, denote true and false, and suppose |S| = 3. Then the initial
process will be forked down three levels, summing up all the selected (= true)
elements by the statement in line 6.
The statement in line 7 makes the final decision. Notice that there are 8 leaf nodes
(processes) in the decision tree. Each leaf node of the tree corresponds to one of the 8
possible subsets of the set of size 3. If there is a subset whose sum is equal to the
given number M, one of the processes at the leaf nodes will output "yes." The figure
shows a successful selection sequence 011 and the leaf node which will output "yes"
for the given set S.
[Figure: the depth-3 decision tree for S = {34, 7, 12} and M = 19, branching on 0 (false) and 1 (true) at each level i = 0, 1, 2; the leaf labeled 011 selects 7 + 12 = 19 and outputs "yes".]

5.5 ε-transitions in an FA
Consider the part of a state transition graph of an FA shown in figure (a) below. In
state q1, if the input is a, the automaton remains in state q1, and if the input is b, it
enters state q2.
Now we assume that while the automaton does not read the input, it remains in the
current state. To denote such an idling transition to the current state, let's add a self-loop
labeled ε to every state, as shown in figure (b). Clearly, such a modification does not
affect the computational capability of an FA.


[Figure (a): state q1 with a self-loop on a and a transition on b to q2. Figure (b): the same graph with a self-loop labeled ε added to each state.]


Now we define a "spontaneous" transition between two different states of an FA.
For example, δ(p, ε) = q denotes a spontaneous transition from state p to q without
reading the input. We shall use the label ε on the state transition graph to denote
such spontaneous transitions (also called ε-transitions), as shown in figure (c) below.
In this figure, if we take into account the self-loop labeled ε that we have
introduced for the case of no input, we see that this machine has nondeterministic
transitions on input ε; that is, δ(q1, ε) = {q1, q2}. So, if an FA has an ε-transition
between two different states, it is an NFA. (As usual, we shall not explicitly show the
self-loops labeled ε. Only the ε-transitions between two different states will be
shown.)

[Figure (a): the FA from before. Figure (b): the same FA with ε self-loops. Figure (c): the FA with an ε-transition added from q1 to q2.]


Eliminating ε-transitions from an FA

An NFA with ε-transitions is a very convenient model for designing a DFA, which
can be implemented in real applications, or for solving other problems concerning
regular languages. However, the ε-transitions in an NFA need to be eliminated before
converting the NFA to a DFA. It is possible to remove all the ε-transitions, if any,
from an NFA without changing the language of the automaton. We will shortly learn
how.
Before we introduce a method for eliminating ε-transitions from an NFA, we need
to extend the transition function δ as follows.
For a subset A of the set of states Q of an NFA (i.e., A ⊆ Q) and an input string x, let
δ(A, x) denote the set of states reachable from the states in A by reading the string x. For
example, in the following FA, we get δ({1, 3}, ab) = {1, 2, 3}.

[Figure: a four-state FA (start state 1, states 1 through 4) used to illustrate the extended transition function.]

Example 1. Eliminate the ε-transition from the NFA shown below without affecting
the language recognized by the automaton.
This example is enough to understand the basic idea of eliminating ε-transitions
from an NFA.
For the method we need the following properties of the regular expressions
involving the null string ε, where t is an input symbol:
εε = ε = ε*,  t = tε = εt = εtε = ε*tε*.
We will also use the property that for a subset A ⊆ Q and two input strings x and y,
δ(A, xy) = δ(δ(A, x), y). Let M be the given NFA and M' the FA that we get by
eliminating the ε-transitions from M.

[Figure: the NFA M, with start state q0, an ε-transition from q0 to q1, a self-loop on a at q0, and a self-loop on b at the accepting state q1.]

For each state qi and for each input symbol t ∈ {a, b}, we compute the set of states
S reachable by reading t, i.e., δ(qi, ε*tε*) = S. (Notice that instead of t, we
use ε*tε*, because M has ε-transitions.) We let this S be the set of states reachable from qi
in M' by reading t, i.e., δ'(qi, t) = δ(qi, ε*tε*) = S.
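The computation δ'(qi, t) = δ(qi, ε*tε*) can be organized around the ε-closure of a set of states: closure, then one symbol step, then closure again. The sketch below is our own; the dict-based encoding of the transition functions is hypothetical.

```python
def eps_closure(states, eps_edges):
    """All states reachable from `states` via epsilon-transitions (eps*)."""
    result, stack = set(states), list(states)
    while stack:
        p = stack.pop()
        for q in eps_edges.get(p, set()):
            if q not in result:
                result.add(q)
                stack.append(q)
    return result

def eliminate_eps(delta, eps_edges, states, alphabet):
    """Return the epsilon-free transition function delta'(q, t) = delta(q, eps* t eps*)."""
    new_delta = {}
    for q in states:
        for t in alphabet:
            reach = eps_closure({q}, eps_edges)                       # eps*
            step = {r for p in reach for r in delta.get((p, t), set())}  # t
            new_delta[(q, t)] = eps_closure(step, eps_edges)          # eps*
    return new_delta
```

Separately, as in the construction of M', when the ε-closure of the start state contains an accepting state, the start state must also be made accepting.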

The figure below shows in detail how we build the transition function δ' of M'
from δ of M to eliminate the ε-transitions from state q0. (Recall that on each
state there is a self-loop labeled ε.)

[Figure (a): the NFA M. Figure (b): building M', where q0 now has transitions on a to both q0 and q1, and on b to q1.]

δ'(q0, a) = δ(q0, ε*aε*) = δ(δ(q0, ε*), aε*) = δ({q0, q1}, aε*)
         = δ(δ({q0, q1}, a), ε*) = δ({q0}, ε*) = {q0, q1}

δ'(q0, b) = δ(q0, ε*bε*) = δ(δ(q0, ε*), bε*) = δ({q0, q1}, bε*)
         = δ(δ({q0, q1}, b), ε*) = δ({q1}, ε*) = {q1}


This figure shows the part for eliminating ε-transitions from state q1 to construct δ'.

[Figure (a): M. Figure (b): building M'. Figure (c): the final M', with both q0 and q1 accepting.]

δ'(q1, a) = δ(q1, ε*aε*) = δ(δ(q1, ε*), aε*) = δ({q1}, aε*)
         = δ(δ({q1}, a), ε*) = δ(Ø, ε*) = Ø

δ'(q1, b) = δ(q1, ε*bε*) = δ(δ(q1, ε*), bε*) = δ({q1}, bε*)
         = δ(δ({q1}, b), ε*) = δ({q1}, ε*) = {q1}

Finally, if an accepting state is reachable from the start state by a sequence of ε-transitions
(i.e., ε is in the language of M), we let the start state (q0) be an accepting state, as shown in
figure (c) above.

Example 2. Eliminate the ε-transitions from the following NFA.

[Figure (a): the NFA M, with start state q0, ε-transitions from q0 to q1 and from q1 to q2, a self-loop on a at q0, a self-loop on b at q1, and a self-loop on c at the accepting state q2.]

Computing δ' for the state q0 and input symbol a:

δ'(q0, a) = δ(q0, ε*aε*) = δ(δ(q0, ε*), aε*) = δ({q0, q1, q2}, aε*)
         = δ(δ({q0, q1, q2}, a), ε*) = δ({q0}, ε*) = {q0, q1, q2}

[Figure (b): the partially built M', with transitions on a from q0 to each of q0, q1, and q2.]

[Figure (a): the NFA M, as before.]

Computing δ' for the state q0 and input symbols b and c:
δ'(q0, b) = δ(q0, ε*bε*) = {q1, q2}
δ'(q0, c) = δ(q0, ε*cε*) = {q2}

[Figure (b): the partially built M', now also with transitions on b from q0 to q1 and q2, and on c from q0 to q2.]

Computing δ' for the states q1 and q2, and input symbols a, b, and c:
δ'(q1, a) = Ø,  δ'(q1, b) = {q1, q2},  δ'(q1, c) = {q2},
δ'(q2, a) = Ø,  δ'(q2, b) = Ø,  δ'(q2, c) = {q2}.

Finally, since the accepting state q2 is reachable by ε-transitions from the start state,
we change the start state into an accepting state to get M' as shown in figure (b).

[Figure (a): the NFA M. Figure (b): the final ε-free NFA M', with q0 and q2 accepting.]

Rumination (1): Nondeterministic automata
• The only difference between DFA‟s and NFA‟s is in the definitions of , as shown below.
DFA  : Q    Q
NFA  : Q    2Q
In other words, NFA‟s take a transition to one or more states (i.e., a subset of Q), while DFA‟s take transition to no more
than one state. Hence, we can view a DFA as a special NFA, every transition of which goes to at most one state.
• It is not easy to identify the language accepted by a DFA directly if the state transition graph of the automaton is not
simple. It is more difficult to identify the language accepted by an NFA because for a given input string, there are
multiple number of transition paths, some leading to an accepting state, while others not. In Chapter 7, we will learn a
systematic way to transform the state transition graph of an FA (either DFA or NFA) to a regular expression which
expresses the language recognized by the FA.
• Since NFA's are allowed to take multiple transitions simultaneously, they seem to be more powerful than DFA's. In
particular, we may raise the following question: is there an NFA language that cannot be recognized by any DFA? In
Chapter 8 we will learn a method of transforming an NFA into a DFA that recognizes the same language. This implies that
for FA's, nondeterminism does not give any additional power for language recognition.
We may raise the same question for PDA's, TM's, and LBA's. For PDA's, nondeterminism does give additional
power (see Appendix D for the proof), and we will see a couple of examples. However, for TM's it does not, and for
LBA's the question is an open problem.
• We will see that, once we get used to nondeterminism, it is easier to design an NFA than a DFA. We will also see that
the NFA is a powerful tool for solving many problems concerning regular languages and related models.
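To make the equivalence claim concrete, here is an illustrative Python sketch of the subset construction covered in Chapter 8 (the encoding and the example NFA, for strings over {a, b} ending in ab, are ours, not the book's). Each reachable set of NFA states becomes a single DFA state:

```python
# Illustrative sketch of the subset construction: from an NFA, build a DFA
# recognizing the same language. DFA states are frozensets of NFA states.
from itertools import chain

def nfa_to_dfa(nfa_delta, start, accepting, alphabet):
    """Return (dfa_delta, dfa_start, dfa_accepting)."""
    dfa_start = frozenset([start])
    dfa_delta, dfa_accepting = {}, set()
    todo = [dfa_start]
    while todo:
        S = todo.pop()
        if S in dfa_delta:          # this subset is already constructed
            continue
        if S & accepting:           # a subset containing an accepting NFA
            dfa_accepting.add(S)    # state is an accepting DFA state
        dfa_delta[S] = {}
        for a in alphabet:
            T = frozenset(chain.from_iterable(
                nfa_delta.get(q, {}).get(a, set()) for q in S))
            dfa_delta[S][a] = T
            todo.append(T)
    return dfa_delta, dfa_start, dfa_accepting

# Example NFA (ours): strings over {a, b} that end in "ab".
nfa = {"p": {"a": {"p", "q"}, "b": {"p"}}, "q": {"b": {"r"}}}
dd, ds, da = nfa_to_dfa(nfa, "p", {"r"}, "ab")

def dfa_accepts(w):
    S = ds
    for c in w:
        S = dd[S][c]
    return S in da

print(dfa_accepts("aab"), dfa_accepts("aba"))  # True False
```

Since there are at most 2^|Q| subsets of Q, the loop always terminates, which is the heart of the proof that nondeterminism adds no power for FA's.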


• With nondeterministic automata, we can search the solution space of a problem by tracing diverse routes simultaneously,
as in a parallel computer. But they are different from parallel computers. While a parallel computer generates, out of the
parallel search, the final result for the user, a nondeterministic automaton does not "announce" a final result. The user
must make the final decision out of the many possible results. Nondeterministic automata cannot say "Yes, the input is
accepted" or "No, it is rejected." Such final decisions are up to the user.

• We learned that the stack operation can be restricted to either popping or pushing exactly one symbol in a move (see the
rumination section at the end of Chapter 4). We can apply the same restriction to NPDA's without affecting their capacity
for language recognition.
• Nondeterministic automata are different from probabilistic machines, in which a move takes place according to a given
probabilistic rule. For example, in an NFA, δ(p, a) = {q, r} implies that in state p, reading a symbol a, the machine certainly
enters both states q and r simultaneously. There is no coin-tossing.
• From now on, by TM, LBA, PDA, and FA we denote those automata either deterministic or nondeterministic.
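The remark that δ(p, a) = {q, r} means the machine is in both q and r can be illustrated by simulating an NFA deterministically: we carry the set of all currently occupied states and only at the end check whether some occupied state is accepting. The following Python sketch uses a made-up example NFA:

```python
# Sketch (made-up NFA) of simulating nondeterministic moves: on reading a
# in state p with delta(p, a) = {q, r}, the machine occupies BOTH q and r
# afterwards -- no coin is tossed.

def run_nfa(delta, start, accepting, w):
    current = {start}
    for a in w:
        current = set().union(
            *(delta.get(q, {}).get(a, set()) for q in current))
    # The machine only yields the set of occupied states; the acceptance
    # decision (does it contain an accepting state?) is the final check.
    return current, bool(current & accepting)

delta = {"p": {"a": {"q", "r"}}, "q": {"b": {"q"}}, "r": {"b": {"r"}}}
states, accepted = run_nfa(delta, "p", {"q"}, "ab")
print(sorted(states), accepted)  # ['q', 'r'] True
```

After reading ab the machine occupies {q, r}; it is the caller who declares the input accepted because q is an accepting state.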

Break Time: Ask God


A person asked God, “What surprises you most about mankind?” And God answered:
“That they lose their health to make money and then lose their money to restore their health.
That by thinking anxiously about the future, they forget the present, such that they live neither for the present nor
the future.
That they live as if they will never die, and they die as if they had never lived ….”
- Marty -

Exercises

5.1 For each of the strings given below, (i) show the state transition profile (the tree) of the FA shown below, and (ii) based
on the profile, check whether the string is accepted by the automaton or not.

(a) aaabaa (b) aabaa

[Figure: the state transition graph of an NFA over {a, b}, with edge labels a, a, b and several a-loops.]
5.2 What is the language recognized by the NFA above?
(i) Describe the language informally. Your answer should be clear.
(ii) Express the language with a regular expression.

5.3 (a) Construct the state transition graph of an NFA which recognizes { aaxaa | x ∈ {a, b, c}+ }. Your graph should have
no more than 6 transition edges. (Do not count the implicit edges going to the dead state.)
(b) Construct a DFA which recognizes the language given in part (a).

5.4 For each of the following languages, construct a PDA (either NPDA or DPDA, whichever possible) which recognizes the
language, and briefly explain how your PDA recognizes the language. For your PDA, it is enough to show the state transition
graph.

(a) L1 = { xyz | x, y, z ∈ {a, b}+, and y = zR } (b) L2 = { xyz | x, y, z ∈ {a, b}+, and x = yR }

5.5 Given a directed graph G and two nodes s and t in G, a Hamiltonian path from s to t is a path that starts at node s and
ends at node t, visiting all the other nodes exactly once. For example, for the graph below and the two nodes s = 4 and t = 7, a
Hamiltonian path is 4 → 2 → 1 → 8 → 9 → 10 → 6 → 5 → 3 → 7.

Given an arbitrary directed graph G and two node ids s and t, write a nondeterministic algorithm that checks if G has a
Hamiltonian path from s to t. Your algorithm should take no more than polynomial time, i.e., time O(n^k), for some
constant k. (This is a challenging problem. Review the nondeterministic algorithm for solving the sum-of-subset problem
that we studied in Section 5.4.)

[Figure: a directed graph on nodes 1 through 10, with s = 4 and t = 7.]

5.6 Using the technique for eliminating ε-transitions from an NFA presented in Section 5.6, construct an NFA with no
ε-transitions which recognizes the language of the following NFA. You should also show the procedure that you took to get
your answer.

[Figure: the state transition graph of a 5-state NFA with ε-transitions; start state 1, with edge labels a, b, a,b and ε.]
