Chapter 3:
Regular Expressions and Languages
DR. NOR FAZLIDA MOHD SANI
DEPT. OF COMPUTER SCIENCE
FAC. OF COMPUTER SCIENCE AND INFORMATION
TECHNOLOGY, UPM.
Regular expressions
Introduction
Regular expressions are a notation for describing languages.
REs play an important role in computer science applications.
String concatenation
$s = a_1 \cdots a_n$, $t = b_1 \cdots b_m$
$st = a_1 \cdots a_n b_1 \cdots b_m$
s = 011
t = 101
st = 011101
ts = 101011
ss = 011011
sst = 011011101
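As a quick check of the concatenation examples above, Python's string `+` behaves the same way. This is only an illustrative sketch; the variable names mirror the slide.

```python
# Strings over the alphabet {0, 1}, as on the slide.
s = "011"
t = "101"

st = s + t        # concatenation: s followed by t
ts = t + s        # in general st and ts differ (not commutative)
ss = s + s
sst = s + s + t   # concatenation is associative, so no parentheses are needed

print(st, ts, ss, sst)  # 011101 101011 011011 011011101
```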
Operations on languages
$L_1 \cup L_2 = \{s : s \in L_1 \text{ or } s \in L_2\}$
Example
$L_1 = \{0, 01\}$
$L_1 L_2$
$L_1^2 = \{00, 001, 010, 0101\}$
$L_2^2 = L_2$
$L_2^n = L_2$ ($n \ge 1$)
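For finite languages, the operations above can be tried out directly with Python sets. The following is a minimal sketch: the helper names `union`, `concat`, and `power` are made up for this illustration, and `L3` is a hypothetical finite language that does not appear on the slides.

```python
def union(a, b):
    """Strings that are in a or in b."""
    return a | b

def concat(a, b):
    """Every string of a followed by every string of b."""
    return {s + t for s in a for t in b}

def power(lang, n):
    """Concatenation of n strings from lang; the 0th power is {""}."""
    result = {""}
    for _ in range(n):
        result = concat(result, lang)
    return result

L1 = {"0", "01"}
L3 = {"1", "11"}   # hypothetical finite language, not from the slides

print(sorted(power(L1, 2)))    # ['00', '001', '010', '0101']
print(sorted(concat(L1, L3)))  # ['01', '011', '0111']
```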
Operations on languages
The star of $L$ contains all strings made up of zero or more chunks from $L$:
$L^* = L^0 \cup L^1 \cup L^2 \cup \cdots$
Example
$L_1 = \{0, 01\}$
$L_1^*$
$L_2^2 = L_2$
$L_2^n = L_2$ ($n \ge 1$)
$L_2^* = L_2^0 \cup L_2^1 \cup L_2^2 \cup \cdots = L_2$
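Since the star of a language is usually infinite, a program can only list it up to some length bound. The sketch below does that for a finite language; `star_upto` and the bound `max_len` are names chosen for this illustration.

```python
def star_upto(lang, max_len):
    """All strings in the star of lang that have length <= max_len."""
    result = {""}            # the 0th power contributes the empty string
    frontier = {""}
    while frontier:
        # Extend every newly found string by one more chunk from lang.
        frontier = {s + t for s in frontier for t in lang
                    if len(s + t) <= max_len} - result
        result |= frontier
    return result

L1 = {"0", "01"}
print(sorted(star_upto(L1, 3), key=lambda s: (len(s), s)))
# ['', '0', '00', '01', '000', '001', '010']
```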
0(0+1)*
all strings that start with 0
$(\{0\}\{1\}^*) \cup (\{1\}\{0\}^*)$
01*+10*
0 followed by any number of 1s, or
1 followed by any number of 0s
Regular expressions
Examples
Σ = {0, 1}
01* = 0(1*) = {0, 01, 011, 0111, …}
Examples
0+1 = {0, 1}
strings of length 1
(0+1)*
any string
(0+1)*010
any string that ends in 010
(0+1)*01(0+1)*
any string that contains 01
Examples
((0+1)(0+1))*+((0+1)(0+1)(0+1))*
all strings whose length is even or a multiple of 3
= strings of length 0, 2, 3, 4, 6, 8, 9, 10, 12, ...
(0+1)(0+1)
strings of length 2
((0+1)(0+1))*
strings of even length
(0+1)(0+1)(0+1)
strings of length 3
((0+1)(0+1)(0+1))*
strings whose length is a multiple of 3
Examples
((0+1)(0+1)+(0+1)(0+1)(0+1))*
strings that can be broken into blocks,
where each block has length 2 or 3
(0+1)(0+1)+(0+1)(0+1)(0+1)
strings of length 2 or 3
(0+1)(0+1)
strings of length 2
(0+1)(0+1)(0+1)
strings of length 3
Examples
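((0+1)(0+1)+(0+1)(0+1)(0+1))*
strings that can be broken into blocks,
where each block has length 2 or 3
ε: yes (zero blocks)
1: no (a single symbol is shorter than any block)
10: yes
011: yes
00110: yes (00 + 110)
011010110: yes (011 + 010 + 110)
So the expression describes all strings except 0 and 1.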
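The claim that only the two strings of length 1 are excluded can be checked by brute force. The sketch below uses Python's `re` module as a stand-in for the slide notation (union `+` becomes `|`, and `(0+1)` becomes `[01]`); the helper `all_strings` and the length bound 8 are choices made for this illustration.

```python
import re
from itertools import product

# ((0+1)(0+1)+(0+1)(0+1)(0+1))* written in Python's re syntax.
BLOCKS = re.compile(r"(?:[01][01]|[01][01][01])*")

def all_strings(max_len):
    """Yield every string over {0, 1} of length 0..max_len."""
    for n in range(max_len + 1):
        for symbols in product("01", repeat=n):
            yield "".join(symbols)

# fullmatch requires the whole string to be described by the expression.
rejected = [s for s in all_strings(8) if not BLOCKS.fullmatch(s)]
print(rejected)  # ['0', '1']
```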
Examples
(1+01+001)*(ε+0+00)
strings that never contain three consecutive 0s, for example:
00
0110010110
0010010
Examples
Σ = {0, 1}
(0+1)*00(0+1)*
all strings that contain two consecutive 0s
Examples
Σ = {0, 1}
all strings that do not contain two consecutive 0s, e.g. 0110101101010
blocks ending in 1: (1 + 01)*
last block: (ε + 0)
(1 + 01)*(ε + 0)
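The two expressions (0+1)*00(0+1)* and (1 + 01)*(ε + 0) should describe complementary languages: every string either contains 00 or it does not. A brute-force sketch of that check, again using Python's `re` as a stand-in for the slide notation (`0?` plays the role of ε + 0, and the bound 10 is arbitrary):

```python
import re
from itertools import product

HAS_00 = re.compile(r"[01]*00[01]*")   # (0+1)*00(0+1)*
NO_00 = re.compile(r"(?:1|01)*0?")     # (1 + 01)*(ε + 0)

def all_strings(max_len):
    """Yield every string over {0, 1} of length 0..max_len."""
    for n in range(max_len + 1):
        for symbols in product("01", repeat=n):
            yield "".join(symbols)

# Each string must be matched by exactly one of the two expressions.
ok = all(bool(HAS_00.fullmatch(s)) != bool(NO_00.fullmatch(s))
         for s in all_strings(10))
print(ok)                                       # True
print(bool(NO_00.fullmatch("0110101101010")))   # True
```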
Examples
Σ = {0, 1}
[Figure: DFAs, NFAs, and regular expressions all describe the regular languages]
Road map
regular expression → NFA → NFA without ε → DFA
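R1 = 0
R2 = 0 + 1
R3 = (0 + 1)*
[Figure: ε-NFAs built for R1, R2, and R3, using states q0 to q5 and ε-transitions; the NFA for R3 reuses the machine M2 between states q0 and q1]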
Regular expressions
General method
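regular expr → NFA (MR and MS are the NFAs already built for R and S)
∅: start state q0 only, no accepting state
ε: start state q0 only, which is accepting
symbol a: q0 goes to q1 on a
RS: q0, MR, MS, q1 joined in sequence by ε-transitions
R+S: q0 branches by ε-transitions into MR and MS, which both lead by ε-transitions to q1
R*: q0, MR, q1 joined by ε-transitions that allow MR to be skipped or repeated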
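The general method can be sketched as a recursive function that builds an ε-NFA from a regular expression and then simulates it. This is an illustrative sketch under stated assumptions, not the slides' exact construction: expressions are given as nested tuples (a representation invented here), every sub-NFA gets its own fresh start and accept state (so ε uses two states rather than one), and the names `build`, `closure`, and `accepts` are hypothetical.

```python
from itertools import count

def build(expr, trans, fresh):
    """Return (start, accept) of an ε-NFA for expr, adding edges to trans.

    trans maps a state to a list of (label, target); label None means ε.
    """
    q0, q1 = next(fresh), next(fresh)
    trans.setdefault(q0, [])
    trans.setdefault(q1, [])
    kind = expr[0]
    if kind == "empty":                  # ∅: no path from q0 to q1
        pass
    elif kind == "eps":                  # ε
        trans[q0].append((None, q1))
    elif kind == "sym":                  # symbol a
        trans[q0].append((expr[1], q1))
    elif kind == "cat":                  # RS: q0 -ε-> MR -ε-> MS -ε-> q1
        rs, ra = build(expr[1], trans, fresh)
        ss, sa = build(expr[2], trans, fresh)
        trans[q0].append((None, rs))
        trans[ra].append((None, ss))
        trans[sa].append((None, q1))
    elif kind == "alt":                  # R+S: branch into MR and MS
        for sub in expr[1:]:
            s, a = build(sub, trans, fresh)
            trans[q0].append((None, s))
            trans[a].append((None, q1))
    elif kind == "star":                 # R*
        s, a = build(expr[1], trans, fresh)
        trans[q0] += [(None, s), (None, q1)]   # enter MR or skip it
        trans[a] += [(None, s), (None, q1)]    # repeat MR or leave
    else:
        raise ValueError(kind)
    return q0, q1

def closure(states, trans):
    """All states reachable from `states` using only ε-transitions."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for label, target in trans[q]:
            if label is None and target not in seen:
                seen.add(target)
                stack.append(target)
    return seen

def accepts(expr, text):
    """Build the ε-NFA for expr and run it on text."""
    trans = {}
    start, accept = build(expr, trans, count())
    current = closure({start}, trans)
    for ch in text:
        step = {t for q in current for label, t in trans[q] if label == ch}
        current = closure(step, trans)
    return accept in current

zero, one = ("sym", "0"), ("sym", "1")
R3 = ("star", ("alt", zero, one))        # (0 + 1)*
R01 = ("cat", zero, ("star", one))       # 01*
print(accepts(R3, "0110"), accepts(R01, "011"), accepts(R01, "10"))
# True True False
```

Giving every case its own pair of states keeps the recursion uniform at the cost of a few extra ε-transitions.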
NFAs, DFAs,
and regular expressions
Σ = {0, 1}
(0+1)*01
[Figure: an NFA with states q0, q1, q2 and a DFA with states qε, q0, q1, q00, q01, q10, q11 for (0+1)*01]
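The step from an NFA to a DFA can be illustrated with the subset construction. The sketch below assumes a particular three-state NFA for (0+1)*01, since the slide's diagram is not recoverable in detail: q0 loops on 0 and 1, q0 goes to q1 on 0, q1 goes to q2 on 1, and q2 is accepting. The resulting DFA need not coincide with the slide's DFA, whose state names suggest it tracks the last two symbols read.

```python
from collections import deque

# Assumed NFA for (0+1)*01 (an assumption of this sketch, see lead-in).
NFA = {
    ("q0", "0"): {"q0", "q1"},
    ("q0", "1"): {"q0"},
    ("q1", "1"): {"q2"},
}
START, ACCEPT, SIGMA = "q0", {"q2"}, "01"

def subset_construction():
    """Build the reachable part of the DFA whose states are sets of NFA states."""
    start = frozenset({START})
    dfa, queue = {}, deque([start])
    while queue:
        state = queue.popleft()
        if state in dfa:
            continue
        dfa[state] = {}
        for a in SIGMA:
            # The DFA moves to the set of all NFA states reachable on symbol a.
            target = frozenset(t for q in state for t in NFA.get((q, a), ()))
            dfa[state][a] = target
            queue.append(target)
    return start, dfa

def dfa_accepts(text):
    """Run the constructed DFA on text."""
    state, dfa = subset_construction()
    for ch in text:
        state = dfa[state][ch]
    return bool(state & ACCEPT)

start, dfa = subset_construction()
print(len(dfa))                                   # 3
print(dfa_accepts("1101"), dfa_accepts("0110"))   # True False
```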
[Figure: DFAs, NFAs, and regular expressions all describe the regular languages]
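R1 = 0
[Diagram: NFA for R1 with states q0, q1]
R2 = 01
[Diagram: NFA for R2 with states q0, q1, q2]
R3 = 0 + 01
[Diagram: NFA3 with states q0 to q6, joining the NFAs for R1 and R2 by ε-transitions]
R4 = (0 + 01)*
[Diagram: NFA for R4, built from NFA3 between states q0 and q1]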
General method
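regular expr → NFA (NFAR and NFAS are the NFAs already built for R and S)
∅: start state q0 only, no accepting state
ε: start state q0 only, which is accepting
symbol a ∈ Σ: q0 goes to q1 on a
RS: q0, NFAR, NFAS, q1 joined in sequence by ε-transitions
R+S: q0 branches by ε-transitions into NFAR and NFAS, which both lead by ε-transitions to q1
R*: q0, NFAR, q1 joined by ε-transitions that allow NFAR to be skipped or repeated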
Road map
regular expression → NFA → NFA without ε → DFA
NFA → GNFA → 2-state GNFA → regular expression
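Generalized NFAs
A generalized NFA (GNFA) is an NFA whose transitions are labeled by regular expressions, e.g.
q0 → q1 labeled ε+10*, q1 → q1 labeled 0*1, q1 → q2 labeled 0*11, q0 → q2 labeled 01
moreover
It has exactly one accept state, different from its start state
No arrows come into the start state
No arrows go out of the accept state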
[Diagram: NFA to GNFA conversion with states q0, q1, q3, q5, qf and ε-transitions]
Conversion example
[Diagram: example NFA with states q0, q1, q2, q3 and transitions labeled 0 and 1]
It has exactly one accept state, different from its start state
State elimination
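GNFA: q0 → q1 labeled ε+10*, q1 → q1 labeled 0*1, q1 → q2 labeled 0*11, q0 → q2 labeled 01
Eliminate q1: q0 → q2 labeled (ε+10*)(0*1)*0*11, q0 → q2 labeled 01
Merge the parallel arrows: q0 → q2 labeled (ε+10*)(0*1)*0*11 + 01
Replace
qi → qk labeled R1, qk → qk labeled R2, qk → qj labeled R3, qi → qj labeled R4
by
qi → qj labeled R1R2*R3 + R4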
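The replacement rule can be sketched as a function on a GNFA stored as a dictionary from (source, target) pairs to regular expressions written as strings. The representation and the helper names `group`, `star`, and `eliminate` are assumptions of this illustration; the expressions are treated purely as text, so no simplification is performed.

```python
def group(r):
    """Parenthesize r before concatenation if it contains a union (+)."""
    return "(" + r + ")" if "+" in r else r

def star(r):
    """Apply * to r, parenthesizing anything longer than one symbol."""
    return (r if len(r) == 1 else "(" + r + ")") + "*"

def eliminate(gnfa, qk):
    """Remove state qk, rerouting every path qi -> qk -> qj around it."""
    loop = gnfa.get((qk, qk))
    middle = star(loop) if loop is not None else ""          # R2*
    incoming = [(qi, r) for (qi, dst), r in gnfa.items()
                if dst == qk and qi != qk]                    # R1 edges
    outgoing = [(qj, r) for (src, qj), r in gnfa.items()
                if src == qk and qj != qk]                    # R3 edges
    result = {edge: r for edge, r in gnfa.items() if qk not in edge}
    for qi, r1 in incoming:
        for qj, r3 in outgoing:
            path = group(r1) + middle + group(r3)             # R1 R2* R3
            if (qi, qj) in result:
                path = path + " + " + result[(qi, qj)]        # ... + R4
            result[(qi, qj)] = path
    return result

# The GNFA from the state elimination slide.
gnfa = {
    ("q0", "q1"): "ε+10*",
    ("q1", "q1"): "0*1",
    ("q1", "q2"): "0*11",
    ("q0", "q2"): "01",
}
print(eliminate(gnfa, "q1"))
# {('q0', 'q2'): '(ε+10*)(0*1)*0*11 + 01'}
```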
Conversion example
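[Diagram: GNFA with states q0, q1, q2, q3 and transitions labeled 0 and 1]
Eliminate q1: q0 → q2 labeled 0*1, loop on q2 labeled 00*1+1
Eliminate q2: q0 → q3 labeled 0*1(00*1+1)*
Check: 0*1(00*1+1)* = 0*1(0*1)*, because 00*1+1 = 0*1
011001000101 splits into blocks 01 | 1 | 001 | 0001 | 01, each of the form 0*1
Yes!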
[Figure: DFAs, NFAs, and regular expressions all describe the regular languages]
Design
Analyze
Convert
Text search
expression   language
cat|12       {cat, 12} (union)
[abc]        {a, b, c}
[ab][12]     {a1, a2, b1, b2}
(ab)*        {ε, ab, abab, ...}
[ab]?        {ε, a, b}
(cat)+       {cat, catcat, ...}
[ab]{2}      {aa, ab, ba, bb}
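The table can be spot-checked by listing, for each expression, the short strings it describes. The sketch below uses Python's `re` module as a stand-in for grep's extended syntax (these particular constructs have the same meaning in both); the helper `language` and its length bound are assumptions of this illustration.

```python
import re
from itertools import product

def language(pattern, alphabet, max_len):
    """Strings over alphabet (length <= max_len) that the pattern describes."""
    regex = re.compile(pattern)
    found = []
    for n in range(max_len + 1):
        for symbols in product(alphabet, repeat=n):
            s = "".join(symbols)
            if regex.fullmatch(s):       # the whole string must match
                found.append(s)
    return found

print(language(r"[ab][12]", "ab12", 2))  # ['a1', 'a2', 'b1', 'b2']
print(language(r"[ab]?", "ab", 2))       # ['', 'a', 'b']
print(language(r"(ab)*", "ab", 4))       # ['', 'ab', 'abab']
print(language(r"[ab]{2}", "ab", 3))     # ['aa', 'ab', 'ba', 'bb']
```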
outsavor
savor
savored
savorer
savorily
savoriness
savoringly
savorless
savorous
savorsome
savory
savour
unsavored
unsavoredly
unsavoredness
unsavorily
unsavoriness
unsavory
grabbable
.      any symbol
[a-z]  anything in a range
^      beginning of line
\<     beginning of word
$      end of line
4 $10000000 = $10?
5 I study 5204 hard because it will make me ...
grep -E '\<.ff.u..t' words
affluent
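A search like the grep command above can be imitated in Python. This is a rough sketch with two assumptions: Python's `re` has no `\<` (start of word), so `\b` (word boundary) is used as a close substitute, and the list `words` is a hypothetical stand-in for the `words` file.

```python
import re

# \b approximates grep's \< here; this substitution is an assumption.
PATTERN = re.compile(r"\b.ff.u..t")

def grep(pattern, lines):
    """Return the lines that contain a match anywhere, as grep does."""
    return [line for line in lines if pattern.search(line)]

# Hypothetical stand-in for the `words` file.
words = ["affluent", "effluent", "buffet", "savory", "unaffluent"]
print(grep(PATTERN, words))  # ['affluent', 'effluent']
```

Unlike the automata in class, which accept or reject a whole input, `search` looks for the pattern anywhere inside each line.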
[Figure: regular expression → NFA → NFA without ε → DFA, applied to a text file]
differences       in class        in grep
                  not allowed     allowed
input handling
output            accept/reject   finds pattern
Implementation of grep
[ab]? → ()|[ab]        zero or one: R? → ε|R
(cat)+ → (cat)(cat)*   one or more: R+ → RR*
a{3} → aaa             {n} copies: R{n} → RR...R (n times)
[^aeiouy]              any symbol not contained in the list
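The rewriting of the extra operators into union, concatenation, and star can be sketched as plain string manipulation. The helpers `optional`, `one_or_more`, and `copies` are hypothetical names for this illustration, and they assume their argument is a single atom (a symbol, a bracket class, or a parenthesized group). Python's `re` is used only to compare each rewritten pattern with the original on a few sample strings.

```python
import re

def optional(r):
    """R? rewritten as ()|R, i.e. ε|R."""
    return "()|" + r

def one_or_more(r):
    """R+ rewritten as RR*."""
    return r + r + "*"

def copies(r, n):
    """R{n} rewritten as R repeated n times."""
    return r * n

def same_language(p, q, samples):
    """Compare two patterns on sample strings using whole-string matching."""
    return all(bool(re.fullmatch(p, s)) == bool(re.fullmatch(q, s))
               for s in samples)

samples = ["", "a", "b", "ab", "aaa", "cat", "catcat", "catca"]
print(optional("[ab]"), same_language("[ab]?", optional("[ab]"), samples))
print(one_or_more("(cat)"), same_language("(cat)+", one_or_more("(cat)"), samples))
print(copies("a", 3), same_language("a{3}", copies("a", 3), samples))
```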
Algebraic Laws
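Some algebraic laws for regular expressions are similar to those of arithmetic.
But there are also some laws that apply to REs but not to arithmetic.
An identity for an operator is a value such that when the operator is applied to the identity and some other value, the result is the other value; e.g. 0 is the identity for addition, since 0 + x = x + 0 = x.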
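Distributive Laws
$L(M + N) = LM + LN$
$(M + N)L = ML + NL$
Laws involving closures
$(L^*)^* = L^*$
$\emptyset^* = \varepsilon$
$L^+ = LL^* = L^*L$
$L^* = L^+ + \varepsilon$
$L? = \varepsilon + L$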
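These laws can be sanity-checked on a sample language by comparing both sides up to a length bound. The sketch below is a finite spot check, not a proof; the helpers `concat` and `star`, the bound `N`, and the sample language are all choices made for this illustration.

```python
def concat(a, b, n):
    """Concatenation of two languages, keeping only strings of length <= n."""
    return {s + t for s in a for t in b if len(s + t) <= n}

def star(lang, n):
    """Strings in the star of lang with length <= n."""
    result, frontier = {""}, {""}
    while frontier:
        frontier = concat(frontier, lang, n) - result
        result |= frontier
    return result

N = 6
L = {"0", "01"}               # sample language; any finite language works
Ls = star(L, N)               # L* up to length N
L_plus = concat(L, Ls, N)     # L+ computed as LL*

print(star(Ls, N) == Ls)              # (L*)* = L*   -> True
print(star(set(), N) == {""})         # ∅* = ε       -> True
print(L_plus == concat(Ls, L, N))     # LL* = L*L    -> True
print(Ls == L_plus | {""})            # L* = L+ + ε  -> True
```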