Vous êtes sur la page 1sur 58

ANALYSIS OF ALGORITHM

STRING
MATCHING WITH
FINITE AUTOMATA
SUBMITTED TO : MAM MAIMOONA
SUBMITTED BY : IQRA MUNIR
ANMOL HAMID

INTRODUCTIO
N

STRING MATCHING WITH


FINITE AUTOMATA:
String matching algorithm builds a finite
automaton
A simple machine for processing information
that scans the text string T for all occurrences of
the pattern P

FINITE
AUTOMATA

FINITE
AUTOMATA

STRING MATCHING
WITH
FINITE AUTOMATA

STRING
MATCHING
AUTOMATA
In order to specify the string-matching automaton
(SUFFIX
corresponding to a given pattern P[1m]
An auxiliary function , called the suffix function
FUNCTION):
corresponding to P
The function maps * to {0,1,..,m} such that
(x) is the length of the longest prefix of P that is
also a suffix of x:
(x) = max{k: Pk x}

ALGORITHMS

COMPUTE TRANSITION
COMPUTE-TRANSITION-FUNCTION(P,)
FUNCTION:
1.m-P.length
2.for q=0 to m
3.
for each character a
4.
k= min(m+1,q+2)
5.
repeat
6.
K=k-1
7.
until P k P q a
8.
(q ,a )=k
9.return

DRY RUNNING

COMPUTE
TRANSITION
TEXT=abababacaba
FUNCTION:
PATTERN=ababaca
={a,b,c}
Compute transition function(P,)
1 m length[P]
m 7
2 for q0 to m
for q0 to 7
3 do for each character a
// ={a,b,c}

1st
ITERATION:
q=0, a=a
3. for each character aa
4. kmin(m+1,q+2)
= min(7+1,0+2)=2
K2
7. Pk Pqa= P2P0a
(aba) FALSE
K1
7.Pk Pqa= P1P0a
(aa) TRUE
8. (q,a)k
(0,a)1

stat
e

P
a

CONT
q=0
3. for each character ab
4. K=2
7. Pk Pqa=P2P0a
(abb) FALSE
K=1
7.Pk Pqa=P1P0a
(ab) FALSE
K=0
Pk Pqa=P0P0b
( b) TRUE
8.(q,a)k
(0,b)0

stat
e

P
a

CONT
q=0
3. for each character ac
4. K=2
7. Pk Pqa=P2P0a
(abc) FALSE
K=1
7.Pk Pqa=P1P0a
(ac) FALSE
K=0
Pk Pqa=P0P0s
( c) TRUE
8.(q,a)k
(0,c)0

stat
e

2nd
2.for q1 to 7
ITERATION:

3.for each character a(a,b,c)


q=1,a=a
4.kmin(m+1,q+2)=min(7+1,1+
2)=3
K3
Pk Pqa=P3P1a=(abaaa)
FALSE
k2
7. Pk Pqa=P2P1a
(abaa) FALSE
K=1
7.Pk Pqa=P1P1a
(aaa) TRUE
8.(q,a)k
(1,a)1

Stat
e

CONT
q=1
for each character ab
4.K3
Pk
Pqa=P3P1a=(abaab)
FALSE
k2
7. Pk Pqa=P2P1a
(abab) TRUE
8.(1,b)2

Stat
e

CONT
q=1
for each character ac
4.K3
Pk Pqa=P3P1a=(abaac)
FALSE
k2
7. Pk Pqa=P2P1a
(abac) FALSE
k1
Pk Pqa=P1P1a
(aac) FALSE
k0
Pk Pqa=P0P1a
(ac) TRUE
8.(1,c)0

Stat
e

3rd
ITERATION:
2.for q2 to 7
3.for each character a(a,b,c)
q=2,a=a
4.kmin(m+1,q+2)=min(8,4)=4
K4
Pk Pqa=P4P2a=(abababa)
FALSE
Pk Pqa=P3P2a=(abaaba)
TRUE
8.(q,a)k
(2,a)3

Stat
e

CONT
q=2
for each character ab
K4
Pk Pqa=P4P2a=(abababb)
FALSE
K=3
Pk Pqa=P3P2a=(abaabb)
FALSE
k2
7. Pk Pqa=P2P2a =(ababb)
FALSE
K=1
7.Pk Pqa=P1P2a
(aabb) FALSE

K=0
7.Pk Pqa=P0P2a
(abb)TRUE
8.(q,a)k
(2,b)0
Stat
e

CONT
q=2
for each character ac
K4
Pk Pqa=P4P2a=(abababc)
FALSE
K=3
Pk Pqa=P3P2a=(abaabc)
FALSE
k2
7. Pk Pqa=P2P2a
=(ababc) FALSE
K=1
7.Pk Pqa=P1P2a
(aabc) FALSE

K=0
7.Pk Pqa=P0P2a
(abc)TRUE
8.(q,a)k
(2,c)0
Stat
e

TH

ITERATION:K1

2.for q3 to 7
3.for each character a(a,b,c)
q=3,a=a
4.kmin(m+1,q+2)=min(8,4)=4
K5
Pk Pqa=P5P3a=(ababaabaa)
FALSE
k4
Pk Pqa=P4P3a=(abababaa)
FALSE
K3
Pk Pqa=P3P3a=(abaabaa) FALSE
k2
Pk Pqa=P2P3a=(ababaa)
FALSE

Pk Pqa=P1P3a=(aabaa)
TRUE
8.(q,a)k
(3,a)1
Stat
e

CONT
2.for q3
3.for each character ab
K5
Pk Pqa=P5P3a=(ababaabab)
FALSE
k4
Pk Pqa=P4P3a=(abababab)
TRUE
8.(q,a)k
(3,b)4

Stat
e

K1
Pk Pqa=P1P3a=(aabac)
FALSE
K0
Pk Pqa=P0P3a=(abac)
TRUE
8.(q,a)k
(3,c)0

CONT
2.q=3
3.for each character ac
K5
Pk Pqa=P5P3a=(ababaabac)
FALSE
k4
Pk Pqa=P4P3a=(abababac)
FALSE
K3
Pk Pqa=P3P3a=(abaabac) FALSE
k2
Pk Pqa=P2P3a=(ababac)
FALSE

Stat
e

5
ITERATION:
th

2.for q4to 7
3.for each character a(a,b,c)
q=4,a=a
4.kmin(m+1,q+2)=min(8,6)=6
K6
Pk Pqa=P6P4a=(ababacababa)
FALSE
K5
Pk Pqa=P5P4a=(ababaabaBA) TRUE
8.(q,a)k
(4,a)5

Stat
e

CONT
2.q=4
3.for each character ab
K6
Pk Pqa=P6P4a=(ababacababb)
FALSE
K5
Pk Pqa=P5P4a=(ababaababb) FALSE
k4
Pk Pqa=P4P4a=(ababababb)
FALSE
K3
Pk Pqa=P3P4a=(abaababb) FALSE
k2
Pk Pqa=P2P4a=(abababb)
FALSE

k1
Pk Pqa=P2P4a=(aababb)
FALSE
k0
Pk Pqa=P2P4a=(ababb)
TRUE
8.(q,a)k
(4,b)0

Stat
e

CONT
2.q=4
3.for each character ac
K6
Pk Pqa=P6P4a=(ababacababc)
FALSE
K5
Pk Pqa=P5P4a=(ababaababc)
FALSE
k4
Pk Pqa=P4P4a=(ababababc)
FALSE
K3
Pk Pqa=P3P4a=(abaababc) FALSE
k2
Pk Pqa=P2P4a=(abababc)
FALSE

k1
Pk Pqa=P2P4a=(aababbc
FALSE
k0
Pk Pqa=P2P4a=(ababc)
TRUE
8.(q,a)k
(4,c)0

Stat
e

6 ITERATION:
th

2.for q5to 7
3.for each character a(a,b,c)
q=5,a=a
4.kmin(m+1,q+2)=min(8,7)=7
K7
Pk Pqa=P7P5a=(ababacaababaa)
FALSE
K6
Pk Pqa=P6P5a=(ababacababaa)
FALSE
K5
Pk Pqa=P5P5a=(ababaababaa) FALSE
k4
Pk Pqa=P4P5a=(ababababaa)
FALSE
K3
Pk Pqa=P3P5a=(abaababaa) FALSE

k2
Pk Pqa=P2P5a=(abababaa)
FALSE
k1
Pk Pqa=P2P4a=(aababbaa
TRUE
8.(q,a)k
(5,a)1
Stat
e

6
7

c
a

CONT
4.q5
3.for each character ab
K7
Pk Pqa=P7P5a=(ababacaababab)
FALSE
K6
Pk Pqa=P6P5a=(ababacababab)
FALSE
K5
Pk Pqa=P5P5a=(ababaababab) FALSE
k4
Pk Pqa=P4P5a=(ababababab)
TRUE
8.(q,a)k
(5,b)4

Stat
e

6
7

c
a

CONT
4.q5
3.for each character ac
K7
Pk Pqa=P7P5a=(ababacaababac)
FALSE
K6
PkPqa=P6P5a=(ababacababac)
TRUE
8.(q,a)k
(5,c)6

Stat
e

6
7

7TH ITERATION:
2.for q6to 7
3.for each character a(a,b,c)
q=6,a=a
4.kmin(m+1,q+2)=min(8,8)=8
K8
Pk Pqa=P8P6a=(ababacaaababaca)
FALSE
K7
Pk Pqa=P7P6a=(ababacaababaca)
TRUE
8.(q,a)k
(6,a)7

Stat
e

CONT

2.q6
3.for each character ab
K8
Pk Pqa=P8P6a=(ababacaaababacb) FALSE
K7
Pk Pqa=P7P6a=(ababacaababacb) FALSE
K6
Pk Pqa=P6P6a=(ababacababacb) FALSE
K5
Pk Pqa=P5P6a=(ababaababacb) FALSE
K4
Pk Pqa=P4P6a=(ababababacb) FALSE
K3
Pk Pqa=P3P6a=(abaababacb) FALSE
K2
Pk Pqa=P2P6a=(abababacb) FALSE

K1
Pk Pqa=P1P6a=(aababacb) FALSE
K0
Pk Pqa=P0P6a=(ababacb) TRUE
8.(q,a)k
(6,b)0

Stat
e

CONT
2.q6
3.for each character ac
K8
Pk Pqa=P8P6a=(ababacaaababac) FALSE
K7
Pk Pqa=P7P6a=(ababacaababac) FALSE
K6
Pk Pqa=P6P6a=(ababacababac) FALSE
K5
Pk Pqa=P5P6a=(ababaababac) FALSE
K4
Pk Pqa=P4P6a=(ababababac) FALSE
K3
Pk Pqa=P3P6a=(abaababac) FALSE
K2
Pk Pqa=P2P6a=(abababac) FALSE

K1
Pk Pqa=P1P6a=(aababac) FALSE
K0
Pk Pqa=P0P6a=(ababac) TRUE
8.(q,a)k
(6,c)0

Stat
e

8TH ITERATION:

2.q7 to 7
3.for each character a(a,b,c)
q=7,a=a
4.kmin(m+1,q+2)=min(8,9)=8
K8
Pk Pqa=P8P7a=(ababacaaababacaa) FALSE
K7
Pk Pqa=P7P7a=(ababacaababacaa) FALSE
K6
Pk Pqa=P6P7a=(ababacababacaa) FALSE
K5
Pk Pqa=P5P7a=(ababaababacaa) FALSE
K4
Pk Pqa=P4P7a=(ababababacaa) FALSE
K3
Pk Pqa=P3P7a=(abaababacaa) FALSE
K2
Pk Pqa=P2P7a=(abababacaa) FALSE

K1
Pk Pqa=P1P7a=(aababacaa) TRUE
8.(q,a)k
(7,a)1

Stat
e

CONT

2.q7
3.for each character ab
K8
Pk Pqa=P8P7a=(ababacaaababacab) FALSE
K7
Pk Pqa=P7P7a=(ababacaababacab) FALSE
K6
Pk Pqa=P6P7a=(ababacababacab) FALSE
K5
Pk Pqa=P5P7a=(ababaababacab) FALSE
K4
Pk Pqa=P4P7a=(ababababacab) FALSE
K3
Pk Pqa=P3P7a=(abaababacab) FALSE
K2
Pk Pqa=P2P7a=(abababacab) TRUE

8.(q,a)k
(7,b)2

Stat
e

CONT

2.q7
3.for each character ac
K8
Pk Pqa=P8P7a=(ababacaaababacac) FALSE
K7
Pk Pqa=P7P7a=(ababacaababacac) FALSE
K6
Pk Pqa=P6P7a=(ababacababacac) FALSE
K5
Pk Pqa=P5P7a=(ababaababacac) FALSE
K4
Pk Pqa=P4P7a=(ababababacac) FALSE
K3
Pk Pqa=P3P7a=(abaababacac) FALSE
K2
Pk Pqa=P2P7a=(abababacac) FALSE

K1
Pk Pqa=P1P7a=(aababacac)
FALSE
K0
Pk Pqa=P0P7a=(ababacac)
TRUE
8.(q,a)k
(7,c)0

Stat
e

REQUIRED
RESULT:
THE FINITE STATE
AUTOMATA IS:

Stat a
e

a
0

b
b

ALGORITHM

FINITE AUTOMATON
MATCHER
FINITE-AUTOMATON-MATCHER(T,,m)
1.n=T . length
2.q=0
3.for I =1 to n
4.
q= ( q, T[ i ] )
5.
If q==m
6.
Print Pattern occurs with shift i -m

DRY RUNNING

FINITE-AUTOMATON
MATCHER
i= 1 2 3 4 5 6 7 8 9 10 11
T= a b a b a b a c a b a
1.n length[T]
n11
2. q0
3. for i1 to n
for i1 to 11

1 ITERATION:
st

3. for i1 to 11
4. do q(q,T[i])
q(0,T[1])=(0,a)
q(0,a)=1
5. if q=m
if 1=7 FALSE

a
0

b
b

ND

ITERATION:

i=2,q=2
4. do q(1,T[2])
=(1,b)
q2
5 if 2=7 FALSE

a
0

b
b

RD

ITERATION:

i=3,q=2
4. do q(2,T[3])
=(2,a)
q3
5 if 3=7 FALSE

a
0

b
b

4 ITERATION:
th

i=4,q=3
4. do q(3,T[4]) =(3,b)
q4
5 if 4=7 FALSE

a
0

b
b

5 ITERATION:
th

i=5,q=4
4. do q(4,T[5])
=(4,a)=5
q5
5 if 5=7 FALSE

a
0

b
b

6TH
ITERATION:
i=6,q=5
4. do q(5,T[6])
=(5,b)=4
q5
5 if 4=7 FALSE

a
0

b
b

TH

ITERATION:

i=7,q=4
4. do q(4,T[7])
=(4,a)=5
q5
5 if 5=7 FALSE

a
0

b
b

TH

ITERATION:

i=8,q=5
4. do q(5,T[8])
=(5,c)=6
q6
5 if 6=7 FALSE

a
0

b
b

9
i=9,q=6
ITERATION:
TH

4. do q(6,T[9])
=(6,a)=7
q7
5 if 7=7 TRUE
Then print pattern occurs with
shifti-m
Print pattern occurs with
shift9-7=2
a

a
0

b
b

10th
ITEARTION:
i=10,q=7
4. do q(7,T[10])
=(7,b)=2
q2
5 if 2=7 FALSE

a
0

b
b

11

TH

ITEARTION:

i=11,q=2
4. do
q(2,T[11])=(2,a)=3
q3
5 if 3=7 FALSE

a
0

b
b

REQUIRED OUTPUT:
After removing useless edges:

a
a
0

b
1

a
2

a
4

5
b
b

a
6

CONT..
i= - 1 2 3 4 5 6 7 8 9 10 11
T[i] - a b a b a b a c a b a
state T[i] 0 1 2 3 4 5 4 5 6 7 2 3
Stat a
e

COMPLEXITY

COMPLEXITY
ANALYSIS:

The simple loop time of FINITE-AUTOMATON-MATCHER


implies that its matching time on the text string of length is
(n).This matching time does not include the preprocessing
time required to compute the transition function )
The running time of COMPUTE-TRANSITION-FUNCTION is
O(m^3),because the outer loops contribute a factor of
m,the inner repeat loop can run atmost m+1 times, and
the test Pk Pqa on line 7 can require computing upto m
characters.
Much faster procedures exist; the time required to compute
from P can be improved to O(m) by utilizing some
cleverly computed information about the pattern P. With
this improved procedure for computing from P to O(m||),
we can find all occurrences of a length-m pattern in a
length-n text over an alphabet with O(m)
preprocessing time and (n) matching time.

ADVANTAGES &
DISADVANTAGE

ADVANTAGES &
DISADVANTAGE:

String matching automaton are very efficient


Examine each text character exactly once taking
constant time per text character.
The time to build the automaton can be large if
is large

THANK YOU

Vous aimerez peut-être aussi