
CS2 Language processing note 4

CS2Ah 22.10.04



Regular expressions and Kleene's Theorem
This note presents regular expressions, another method for describing regular
languages. We show how to convert a regular expression to an equivalent NFA
with $\varepsilon$-transitions, and, conversely, how to convert a DFA to an equivalent
regular expression. Taken together, these results yield Kleene's Theorem, which
states that a language is regular if, and only if, it can be described by a regular
expression.
Regular expressions

We have seen how sets of strings can be described by automata. A language $L$
over an alphabet $\Sigma$ is described by constructing a DFA or NFA that accepts
precisely the strings that are in $L$. This may be viewed as a dynamic description
of $L$, since the characterisation is via a notion of computation. In this lecture we
consider an alternative, static method for describing languages, using a simple
mechanism for describing how the strings in the language are built. The
mechanism is to define a language using a regular expression.

We begin by presenting the syntax for regular expressions. We then go on to
associate a language with each regular expression, thereby giving a meaning to
the expressions. This latter part is usually referred to as defining a semantics
for regular expressions.
Regular expressions over the alphabet $\Sigma$ are produced using the following
formation rules:

- $\varepsilon$ is a regular expression.
- $\emptyset$ is a regular expression.
- Every symbol $a$ in $\Sigma$ is a regular expression.
- If $\alpha$ and $\beta$ are regular expressions then so is $\alpha + \beta$.
- If $\alpha$ and $\beta$ are regular expressions then so is $\alpha\beta$.
- If $\alpha$ is a regular expression then so is $\alpha^*$.

To define the language $L(\alpha)$ described by a regular expression $\alpha$, we first define
the language associated to each of the basic expressions $\varepsilon$, $\emptyset$, $a$, and then describe
how to interpret the operations $+$, concatenation, and $^*$ (often referred to as
Kleene-star) used to build the whole class of regular expressions.


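$L(\varepsilon) = \{\varepsilon\}$.
$L(\emptyset) = \emptyset$.
$L(a) = \{a\}$.

Note that the first two languages are subtly different: $\{\varepsilon\}$ is the language
containing the empty string as its only string, and $\emptyset$ is the empty language, which
contains no strings. Also, $\{a\}$ is the language containing the string $a$ as its only
string.

Now let us suppose that $\alpha$ and $\beta$ are regular expressions and that we have
already defined the languages $L(\alpha)$ and $L(\beta)$. We define $L(\alpha + \beta)$,
$L(\alpha\beta)$, and $L(\alpha^*)$ as follows:

$L(\alpha + \beta) = L(\alpha) \cup L(\beta)$,
$L(\alpha\beta) = \{uv \mid u \in L(\alpha),\ v \in L(\beta)\}$,
$L(\alpha^*) = \{\varepsilon\} \cup \{u \mid u \in L(\alpha)\} \cup \{u_1 u_2 \mid u_1, u_2 \in L(\alpha)\} \cup \cdots$
$\phantom{L(\alpha^*)} = \{u_1 \cdots u_n \mid n \in \mathbb{N},\ u_1, \ldots, u_n \in L(\alpha)\}$
(the language consisting of all concatenations of finitely many strings from
$L(\alpha)$).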

It is important to keep in mind that a regular expression does not describe a
single string, but a language, that is, a set of strings.
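The semantic clauses can be read as a recursive function on the syntax of expressions. The following is a minimal Python sketch under stated assumptions: the class names (`Empty`, `Eps`, `Sym`, `Alt`, `Cat`, `Star`) and the function `language` are illustrative and not part of the note, and because $L(\alpha)$ is in general infinite, the function only returns the strings of length at most `max_len`.

```python
from dataclasses import dataclass


# Hypothetical syntax tree for regular expressions (names are illustrative).
@dataclass(frozen=True)
class Empty:          # the expression for the empty language
    pass


@dataclass(frozen=True)
class Eps:            # the expression for the empty string
    pass


@dataclass(frozen=True)
class Sym:            # a single alphabet symbol
    char: str


@dataclass(frozen=True)
class Alt:            # alpha + beta
    left: object
    right: object


@dataclass(frozen=True)
class Cat:            # alpha beta (concatenation)
    left: object
    right: object


@dataclass(frozen=True)
class Star:           # alpha* (Kleene star)
    inner: object


def language(alpha, max_len):
    """Return the strings of L(alpha) whose length is at most max_len (>= 0)."""
    if isinstance(alpha, Empty):
        return set()
    if isinstance(alpha, Eps):
        return {""}
    if isinstance(alpha, Sym):
        return {alpha.char} if max_len >= 1 else set()
    if isinstance(alpha, Alt):
        return language(alpha.left, max_len) | language(alpha.right, max_len)
    if isinstance(alpha, Cat):
        left = language(alpha.left, max_len)
        right = language(alpha.right, max_len)
        return {u + v for u in left for v in right if len(u + v) <= max_len}
    if isinstance(alpha, Star):
        # Concatenate non-empty strings of L(inner) until nothing new fits.
        base = language(alpha.inner, max_len) - {""}
        result, frontier = {""}, {""}
        while frontier:
            frontier = {u + v for u in frontier for v in base
                        if len(u + v) <= max_len} - result
            result |= frontier
        return result
    raise TypeError(f"not a regular expression: {alpha!r}")
```

For instance, `language(Star(Empty()), 3)` contains only the empty string, which mirrors the difference between $\{\varepsilon\}$ and $\emptyset$: the star of the empty language is not empty.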

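Example 4.1. Suppose that $\Sigma = \{a, b, c\}$. Then examples of regular expressions
over $\Sigma$ are $\alpha_1 = ab + abc$, $\alpha_2 = (a(b + c))^*$, and
$\alpha_3 = (aaa)^* + (aaaaa)^*$. These expressions define the following languages:

$L(\alpha_1) = \{ab, abc\}$.

$L(\alpha_2)$ consists of all strings of even length having an a at all even positions
and either a b or a c at all odd positions (counting positions from 0), that is,
$L(\alpha_2) = \{a u_1 a u_2 a u_3 \cdots a u_n \mid n \in \mathbb{N},\ u_1, \ldots, u_n \in \{b, c\}\}$
$\phantom{L(\alpha_2)} = \{\varepsilon, ab, ac, abab, abac, acab, acac, ababab, \ldots\}$.

$L(\alpha_3) = \{a^n \mid n \in \mathbb{N} \text{ divisible by 3 or 5}\}$.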
From regular expressions to NFAs with $\varepsilon$-transitions
Proposition 4.2. For every regular expression $\alpha$ there exists an NFA with
$\varepsilon$-transitions $N$ such that $L(N) = L(\alpha)$.


Proof: The proof is by induction on the structure of the regular expression
$\alpha$. This means that we define the NFA $N$ following the rules that were used to
generate $\alpha$.

We first show how to construct NFAs for the basic regular expressions $\varepsilon$, $\emptyset$
and $a$ for $a \in \Sigma$. Recall that $L(\varepsilon) = \{\varepsilon\}$, $L(\emptyset) = \emptyset$, and
$L(a) = \{a\}$. Figure 1 shows three simple NFAs recognising these three languages.

Figure 1: NFAs for the regular expressions $\varepsilon$, $\emptyset$ and $a$

Suppose next that $\alpha = \alpha_1 + \alpha_2$, where $\alpha_1$ and $\alpha_2$ are regular expressions. We
want to construct an NFA $N$ with $L(N) = L(\alpha)$. We assume that we already have
constructed NFAs $N_1$ and $N_2$ with $L(N_1) = L(\alpha_1)$ and $L(N_2) = L(\alpha_2)$. Let $q_1$
and $q_2$ be the starting states of $N_1$ and $N_2$, respectively. The desired NFA $N$ is
obtained by taking $N_1$ and $N_2$ together with a new starting state $q$ which is
connected to $q_1$ and $q_2$ by an $\varepsilon$-transition. The final states of $N$ are those of
$N_1$ and those of $N_2$. Figure 2 gives a schematic picture of $N$.

Figure 2: An NFA for the regular expression $\alpha_1 + \alpha_2$

Suppose next that $\alpha = \alpha_1\alpha_2$, where $\alpha_1$ and $\alpha_2$ are regular expressions. We
want to construct an NFA $N$ with $L(N) = L(\alpha)$. We assume that we already have
constructed NFAs $N_1$ and $N_2$ with $L(N_1) = L(\alpha_1)$ and $L(N_2) = L(\alpha_2)$. Let $q_1$
and $q_2$ be the starting states of $N_1$ and $N_2$, respectively, and let $F_1$ be the set
of final states in $N_1$. The desired NFA $N$ is obtained by taking $N_1$ and $N_2$ and
connecting all states in $F_1$ with $q_2$ by an $\varepsilon$-transition. The starting state of $N$
is $q_1$, and the final states are all final states of $N_2$. Figure 3 gives a schematic
picture of $N$.

Figure 3: An NFA for the regular expression $\alpha_1\alpha_2$

Finally, suppose that $\alpha = \alpha_1^*$, where $\alpha_1$ is a regular expression. We want
to construct an NFA $N$ with $L(N) = L(\alpha)$. We assume that we already have
constructed an NFA $N_1$ with $L(N_1) = L(\alpha_1)$. Let $q_1$ be the starting state of $N_1$
and let $F_1$ be the set of final states. The desired NFA $N$ is obtained from $N_1$ by
adding a new state $q$, connecting $q$ with $q_1$ by an $\varepsilon$-transition, and connecting
all states in $F_1$ with $q$ by an $\varepsilon$-transition. The start state of $N$ is $q$, and the
set of final states of $N$ is just $\{q\}$. Figure 4 gives a schematic picture of $N$.

Figure 4: An NFA for the regular expression $\alpha_1^*$
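The inductive construction in the proof of Proposition 4.2 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not code from the note: the tuple encoding of expressions, the use of `None` as the label of an $\varepsilon$-transition, and the names `build`, `closure` and `accepts` are all illustrative, and the three base-case automata are one natural choice rather than a transcription of Figure 1.

```python
from itertools import count

EPS = None  # label for epsilon-transitions (a representation choice)

# Expressions are tuples: ("eps",), ("empty",), ("sym", "a"),
# ("alt", r, s), ("cat", r, s), ("star", r).


def build(alpha, delta, fresh):
    """Add an epsilon-NFA for alpha to delta; return (start state, final states)."""
    kind = alpha[0]
    if kind == "eps":                       # one state, both start and final
        q = next(fresh)
        return q, {q}
    if kind == "empty":                     # one state, no final states
        return next(fresh), set()
    if kind == "sym":                       # two states joined by the symbol
        q, f = next(fresh), next(fresh)
        delta.setdefault((q, alpha[1]), set()).add(f)
        return q, {f}
    if kind == "alt":                       # new start, epsilon to q1 and q2
        q1, f1 = build(alpha[1], delta, fresh)
        q2, f2 = build(alpha[2], delta, fresh)
        q = next(fresh)
        delta.setdefault((q, EPS), set()).update({q1, q2})
        return q, f1 | f2
    if kind == "cat":                       # epsilon from every state of F1 to q2
        q1, f1 = build(alpha[1], delta, fresh)
        q2, f2 = build(alpha[2], delta, fresh)
        for f in f1:
            delta.setdefault((f, EPS), set()).add(q2)
        return q1, f2
    if kind == "star":                      # new state q: q -> q1 and F1 -> q
        q1, f1 = build(alpha[1], delta, fresh)
        q = next(fresh)
        delta.setdefault((q, EPS), set()).add(q1)
        for f in f1:
            delta.setdefault((f, EPS), set()).add(q)
        return q, {q}
    raise ValueError(f"unknown expression: {alpha!r}")


def closure(states, delta):
    """All states reachable from `states` using epsilon-transitions only."""
    seen, stack = set(states), list(states)
    while stack:
        s = stack.pop()
        for t in delta.get((s, EPS), ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def accepts(alpha, word):
    """Build the NFA for alpha and simulate it on word."""
    delta = {}
    start, finals = build(alpha, delta, count())
    current = closure({start}, delta)
    for ch in word:
        step = set()
        for s in current:
            step |= delta.get((s, ch), set())
        current = closure(step, delta)
    return bool(current & finals)
```

Each branch of `build` adds only the $\varepsilon$-transitions named in the corresponding case of the proof; the simulation in `accepts` tracks the set of reachable states, so no determinisation is needed to test membership.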

From DFAs to regular expressions

Proposition 4.3. For every DFA $M$ there exists a regular expression $\alpha$ such that
$L(\alpha) = L(M)$.


We do not give a proof here. Such a proof can be found in Introduction to
Automata Theory, Languages, and Computation (2nd Edition) by J. E. Hopcroft,
R. Motwani, and J. D. Ullman, Addison-Wesley, 2001. We just illustrate how to
construct a regular expression from an automaton with one example. The idea is
to eliminate the states of the automaton one by one. As we proceed, we replace
the labels on the transitions of the automaton, which are initially just letters
from the alphabet, by regular expressions. We end up with a simple automaton
with just two states; the transitions of this automaton will immediately give us
the desired regular expression. A slight problem with this construction is that
we cannot eliminate final states so easily, so in a first normalisation step we
replace our initial DFA by an automaton with just one final state, at the price of
introducing some $\varepsilon$-transitions.
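The elimination idea can be sketched in Python. This is a hedged illustration, not the note's own procedure spelled out: the function names (`dfa_to_regex`, `eliminate`, `dfa_accepts`) and the dictionary representation of the DFA are assumptions, and the output uses the syntax of Python's `re` module, where `|` plays the role of $+$ and `(?:...)` is plain grouping. An absent transition is `None` and $\varepsilon$ is the empty string.

```python
import re


def union(r, s):
    """Regex for r + s, where None stands for 'no transition'."""
    if r is None:
        return s
    if s is None or r == s:
        return r
    return f"(?:{r}|{s})"


def star(r):
    """Regex for r*; the star of 'no transition' or of epsilon is epsilon."""
    return "" if not r else f"(?:{r})*"


def eliminate(edge, k):
    """Remove state k, rerouting every path p -> k -> q as one labelled edge."""
    loop = star(edge.pop((k, k), None))
    ins = {p: r for (p, q), r in edge.items() if q == k}
    outs = {q: r for (p, q), r in edge.items() if p == k}
    for p in ins:
        del edge[p, k]
    for q in outs:
        del edge[k, q]
    for p, r_in in ins.items():
        for q, r_out in outs.items():
            edge[p, q] = union(edge.get((p, q)), r_in + loop + r_out)


def dfa_to_regex(states, delta, start, finals):
    """Return an `re` pattern for the DFA's language, or None if it is empty."""
    final = object()                      # the single new final state
    edge = {}
    for (p, a), q in delta.items():
        edge[p, q] = union(edge.get((p, q)), re.escape(a))
    for f in finals:                      # normalisation: epsilon-edges to `final`
        edge[f, final] = ""
    for k in states:
        if k != start:
            eliminate(edge, k)
    out = edge.get((start, final))        # two states remain: start and final
    return None if out is None else star(edge.get((start, start))) + out


def dfa_accepts(delta, start, finals, word):
    """Run the (total) DFA on word."""
    state = start
    for ch in word:
        state = delta[state, ch]
    return state in finals
```

As a usage example with a hypothetical two-state DFA for "even number of a's", `dfa_to_regex([0, 1], {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1}, 0, {0})` returns a pattern that can be checked against `dfa_accepts` with `re.fullmatch`. The pattern produced depends on the order in which states are eliminated; different orders give different but equivalent expressions.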
Example 4.4. Consider the DFA $M$ of Figure 5.

Figure 5: A DFA $M$

After the normalisation step, we obtain the NFA with $\varepsilon$-transitions displayed in
Figure 6, which obviously accepts the same language as $M$. Now we start
eliminating states and obtain the automata displayed in Figure 7. All these
automata recognise the same language. The language of the last automaton of
Figure 7 can easily be seen to be $L(\alpha)$ for the regular expression

$\alpha = ((a + ba)b + ((a + ba)a + bb)b^*a)^* ((a + ba) + b)$.
Kleene's Theorem
Putting the results of the previous sections together, we obtain Kleene's Theorem,
first proved by S.C. Kleene, one of the founders of modern mathematical logic.
Figure 6: Normalisation

Theorem 4.5. A language $L$ is regular if and only if there exists a regular
expression $\alpha$ such that $L = L(\alpha)$.

Proof: Proposition 4.3 shows that for every regular language $L$ there exists a
regular expression $\alpha$ such that $L = L(\alpha)$, and Proposition 4.2 together with
Theorem 3.6 shows the converse.
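The equivalence can be sanity-checked on the expression of Example 4.4 by brute force. The sketch below rests on an explicit assumption: the transition table `DELTA` and the final states `FINALS` are a hypothetical four-state DFA chosen to be consistent with the transition labels of Figure 7, not a transcription of Figure 5. The expression is written in the syntax of Python's `re` module, with `|` in place of $+$.

```python
import re
from itertools import product

# The regular expression of Example 4.4 in `re` syntax.
ALPHA = re.compile(r"((a|ba)b|((a|ba)a|bb)b*a)*((a|ba)|b)")

# Hypothetical DFA (an assumption, see the text above): states 0..3, start 0.
DELTA = {
    (0, "a"): 1, (0, "b"): 2,
    (1, "a"): 3, (1, "b"): 0,
    (2, "a"): 1, (2, "b"): 3,
    (3, "a"): 0, (3, "b"): 3,
}
START, FINALS = 0, {1, 2}


def dfa_accepts(word):
    """Run the hypothetical DFA on word."""
    state = START
    for ch in word:
        state = DELTA[state, ch]
    return state in FINALS


def agree_up_to(max_len):
    """True if the DFA and the expression agree on all words up to max_len."""
    return all(
        (ALPHA.fullmatch(word) is not None) == dfa_accepts(word)
        for n in range(max_len + 1)
        for word in map("".join, product("ab", repeat=n))
    )
```

Calling `agree_up_to(8)` compares the two descriptions on every word over $\{a, b\}$ of length at most 8. Agreement on a finite sample is evidence, not a proof; the proof is the one given by Propositions 4.2 and 4.3.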

Figure 7: State elimination. Transition labels of the successive automata, in reading order: a+ba, bb, a, (a+ba)b, (a+ba)+b, (a+ba)a+bb, b, (a+ba)b+((a+ba)a+bb)b*a, (a+ba)+b.

Exercises
1. Convert the following regular expressions to NFAs.

d    d N

(b)  d W>
M
   ef 
(c)   d  e    d
(a)

Consider other examples too.


2. Convert the following DFA into a regular expression.
1

1
0

0
0

3. Convert the DFAs from lecture notes 1 and 2 into regular expressions.

Don Sannella
