
PRESENTED BY

ADRIJA BHATTACHARAYA
DEBAYAN GANGULY
KINGSHUK CHATERJEE

A Markov process is a process that moves from state to state, where each move depends only on the previous n states; in the first-order Markov chains used in these slides, n = 1, so only the current state matters.
Set of states: {s_1, s_2, ..., s_N}.
The process moves from one state to another, generating a sequence of states s_i1, s_i2, ..., s_ik, ...
Markov chain property: the probability of each subsequent state depends only on the previous state:
P(s_ik | s_i1, s_i2, ..., s_ik-1) = P(s_ik | s_ik-1)

To define a Markov model, the following probabilities have to be specified: the transition probabilities a_ij = P(s_i | s_j) and the initial probabilities π_i = P(s_i).

Markov Models
(State diagram: two states, Rain and Dry, with the transition probabilities listed below.)
Two states : Rain and Dry.
Transition probabilities: P(Rain|Rain)=0.3 ,
P(Dry|Rain)=0.7 , P(Rain|Dry)=0.2, P(Dry|Dry)=0.8
Initial probabilities: say P(Rain)=0.4 , P(Dry)=0.6 .
Example of Markov Model
By the Markov chain property, the probability of a state sequence can be found by the formula:

P(s_i1, s_i2, ..., s_ik) = P(s_ik | s_i1, s_i2, ..., s_ik-1) P(s_i1, s_i2, ..., s_ik-1)
                         = P(s_ik | s_ik-1) P(s_i1, s_i2, ..., s_ik-1)
                         = ...
                         = P(s_ik | s_ik-1) P(s_ik-1 | s_ik-2) ... P(s_i2 | s_i1) P(s_i1)

Suppose we want to calculate the probability of a sequence of states in our example, {Dry, Dry, Rain, Rain}:
P({Dry,Dry,Rain,Rain}) = P(Rain|Rain) P(Rain|Dry) P(Dry|Dry) P(Dry)
                       = 0.3 * 0.2 * 0.8 * 0.6

Calculation of sequence probability
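The chain-rule formula above is easy to check in code. The following is a minimal Python sketch (not part of the original slides); the dictionary names `init` and `trans` are just illustrative containers for the Rain/Dry probabilities given earlier.

```python
# Minimal sketch: probability of a state sequence in a first-order Markov chain,
# using the Rain/Dry example probabilities from the slides.

init = {"Rain": 0.4, "Dry": 0.6}                  # pi_i = P(s_i)
trans = {                                         # trans[prev][cur] = P(cur | prev)
    "Rain": {"Rain": 0.3, "Dry": 0.7},
    "Dry":  {"Rain": 0.2, "Dry": 0.8},
}

def sequence_probability(states):
    """P(s_1, ..., s_k) = P(s_1) * product over k of P(s_k | s_k-1)."""
    p = init[states[0]]
    for prev, cur in zip(states, states[1:]):
        p *= trans[prev][cur]
    return p

print(sequence_probability(["Dry", "Dry", "Rain", "Rain"]))  # 0.6*0.8*0.2*0.3 = 0.0288
```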
Hidden Markov models.
A hidden Markov model (HMM) is a statistical model in which the system being modelled is assumed to be a Markov process.

Set of states: {s_1, s_2, ..., s_N}.
The process moves from one state to another, generating a sequence of states s_i1, s_i2, ..., s_ik, ...
Markov chain property: the probability of each subsequent state depends only on the previous state:
P(s_ik | s_i1, s_i2, ..., s_ik-1) = P(s_ik | s_ik-1)

The states are not visible, but each state randomly generates one of M observations (or visible symbols) {v_1, v_2, ..., v_M}.
Hidden Markov models.
To define a hidden Markov model, the following probabilities have to be specified:
- matrix of transition probabilities A = (a_ij), with a_ij = P(s_i | s_j)
- matrix of observation probabilities B = (b_i(v_m)), with b_i(v_m) = P(v_m | s_i)
- vector of initial probabilities π = (π_i), with π_i = P(s_i)
The model is represented by M = (A, B, π).
(State diagram: hidden states Low and High with the transition probabilities listed below; each state emits Rain or Dry with the observation probabilities listed below.)
Example of Hidden Markov Model
Two states : Low and High atmospheric pressure.
Two observations : Rain and Dry.
Transition probabilities: P(Low|Low)=0.3 ,
P(High|Low)=0.7 , P(Low|High)=0.2,
P(High|High)=0.8
Observation probabilities: P(Rain|Low)=0.6 ,
P(Dry|Low)=0.4 , P(Rain|High)=0.4 ,
P(Dry|High)=0.6 .
Initial probabilities: say P(Low)=0.4 , P(High)=0.6 .
Example of Hidden Markov Model
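One possible way to hold the example model M = (A, B, π) in code is shown below. This is only an illustrative sketch: the NumPy array layout (rows indexed by the "from" / emitting state) is an assumption of this illustration, not something prescribed by the slides.

```python
import numpy as np

# Hidden states and observation symbols of the example.
states = ["Low", "High"]
symbols = ["Rain", "Dry"]

# A[i, j] = P(next = states[j] | current = states[i])
A = np.array([[0.3, 0.7],    # from Low:  P(Low|Low)=0.3, P(High|Low)=0.7
              [0.2, 0.8]])   # from High: P(Low|High)=0.2, P(High|High)=0.8

# B[i, m] = P(symbols[m] | states[i])
B = np.array([[0.6, 0.4],    # Low:  P(Rain|Low)=0.6, P(Dry|Low)=0.4
              [0.4, 0.6]])   # High: P(Rain|High)=0.4, P(Dry|High)=0.6

# pi[i] = P(states[i]) at time 1
pi = np.array([0.4, 0.6])

# Each row of A and B should sum to 1.
assert np.allclose(A.sum(axis=1), 1.0) and np.allclose(B.sum(axis=1), 1.0)
```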
Suppose we want to calculate the probability of a sequence of observations in our example, {Dry, Rain}.
Consider all possible hidden state sequences:
P({Dry,Rain}) = P({Dry,Rain}, {Low,Low}) + P({Dry,Rain}, {Low,High}) + P({Dry,Rain}, {High,Low}) + P({Dry,Rain}, {High,High})

where the first term is:
P({Dry,Rain}, {Low,Low}) = P({Dry,Rain} | {Low,Low}) P({Low,Low})
                         = P(Dry|Low) P(Rain|Low) P(Low) P(Low|Low)
                         = 0.4 * 0.6 * 0.4 * 0.3
Calculation of observation sequence probability
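The brute-force sum over all hidden state sequences can be reproduced with a short Python sketch (again illustrative, not from the slides). It uses the example probabilities above, with P(Dry|High) = 0.6.

```python
from itertools import product

states = ["Low", "High"]
pi = {"Low": 0.4, "High": 0.6}
A = {"Low": {"Low": 0.3, "High": 0.7}, "High": {"Low": 0.2, "High": 0.8}}
B = {"Low": {"Rain": 0.6, "Dry": 0.4}, "High": {"Rain": 0.4, "Dry": 0.6}}

def joint(obs, hidden):
    """P(obs, hidden) = pi(q1) b_q1(o1) * product over k of a_{q_k-1 q_k} b_{q_k}(o_k)."""
    p = pi[hidden[0]] * B[hidden[0]][obs[0]]
    for k in range(1, len(obs)):
        p *= A[hidden[k - 1]][hidden[k]] * B[hidden[k]][obs[k]]
    return p

obs = ["Dry", "Rain"]
total = sum(joint(obs, h) for h in product(states, repeat=len(obs)))
print(total)   # 0.0288 + 0.0448 + 0.0432 + 0.1152 = 0.232
```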
Once a system is described as an HMM, there are three basic problems to be solved:
- finding the probability of an observed sequence given an HMM (evaluation),
- finding the sequence of hidden states that most probably generated an observed sequence (decoding),
- generating an HMM given a sequence of observations (learning).
Basic problems of HMM
Evaluation problem. Given the HMM M = (A, B, π) and the observation sequence O = o_1 o_2 ... o_K, calculate the probability that model M has generated sequence O. To solve this problem, a method known as the forward-backward algorithm is used.

Decoding problem. Given the HMM M = (A, B, π) and the observation sequence O = o_1 o_2 ... o_K, calculate the most likely sequence of hidden states s_i that produced this observation sequence O. The most popular method for solving this problem is the Viterbi algorithm, which finds the single best state sequence as a whole.

Main issues using HMMs:
Learning problem. Given some training observation sequences O = o_1 o_2 ... o_K and the general structure of the HMM (numbers of hidden and visible states), determine the HMM parameters M = (A, B, π) that best fit the training data.

Here O = o_1 ... o_K denotes a sequence of observations with o_k ∈ {v_1, ..., v_M}.
This is by far the most challenging of the three problems. Fortunately, the Baum-Welch algorithm provides an iterative procedure for locally maximizing this probability.

Main issues using HMMs:
Evaluation problem. Given the HMM M = (A, B, π) and the observation sequence O = o_1 o_2 ... o_K, calculate the probability that model M has generated sequence O.

Trying to find the probability of the observations O = o_1 o_2 ... o_K by considering all hidden state sequences (as was done in the example) is impractical: there are N^K hidden state sequences, giving exponential complexity.

Use the forward-backward HMM algorithms for efficient calculation.
Define the forward variable α_k(i) as the joint probability of the partial observation sequence o_1 o_2 ... o_k and of the hidden state at time k being s_i:
α_k(i) = P(o_1 o_2 ... o_k, q_k = s_i)
Evaluation Problem.
Initialization:
α_1(i) = P(o_1, q_1 = s_i) = π_i b_i(o_1),   1 <= i <= N.

Forward recursion:
α_{k+1}(j) = P(o_1 o_2 ... o_{k+1}, q_{k+1} = s_j)
           = Σ_i P(o_1 o_2 ... o_{k+1}, q_k = s_i, q_{k+1} = s_j)
           = Σ_i P(o_1 o_2 ... o_k, q_k = s_i) a_ij b_j(o_{k+1})
           = [ Σ_i α_k(i) a_ij ] b_j(o_{k+1}),   1 <= j <= N, 1 <= k <= K-1.

Termination:
P(o_1 o_2 ... o_K) = Σ_i P(o_1 o_2 ... o_K, q_K = s_i) = Σ_i α_K(i)

Complexity: N^2 K operations.
Forward recursion for HMM
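A compact NumPy sketch of the forward recursion is given below. It assumes the conventional array layout A[i, j] = P(s_j | s_i) (the transpose of the slides' a_ij = P(s_i | s_j) notation) and B[i, m] = P(v_m | s_i); observation symbols are integer indices.

```python
import numpy as np

def forward(A, B, pi, obs):
    """Forward algorithm: alpha[k, i] = P(o_1..o_k, q_k = s_i).

    A[i, j] = P(q_k+1 = j | q_k = i), B[i, m] = P(v_m | s_i), pi[i] = P(q_1 = i),
    obs is a list of symbol indices. Returns alpha and P(o_1..o_K). Cost is O(N^2 K).
    """
    N, K = len(pi), len(obs)
    alpha = np.zeros((K, N))
    alpha[0] = pi * B[:, obs[0]]                    # initialization
    for k in range(1, K):                           # recursion
        alpha[k] = (alpha[k - 1] @ A) * B[:, obs[k]]
    return alpha, alpha[-1].sum()                   # termination

# Example (Low/High model, observation symbols Rain=0, Dry=1):
A  = np.array([[0.3, 0.7], [0.2, 0.8]])
B  = np.array([[0.6, 0.4], [0.4, 0.6]])
pi = np.array([0.4, 0.6])
print(forward(A, B, pi, [1, 0])[1])  # P({Dry, Rain}) = 0.232, matching the brute-force sum
```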
Define the backward variable β_k(i) as the probability of the partial observation sequence o_{k+1} o_{k+2} ... o_K given that the hidden state at time k is s_i:
β_k(i) = P(o_{k+1} o_{k+2} ... o_K | q_k = s_i)

Initialization:
β_K(i) = 1,   1 <= i <= N.

Backward recursion:
β_k(j) = P(o_{k+1} o_{k+2} ... o_K | q_k = s_j)
       = Σ_i P(o_{k+1} o_{k+2} ... o_K, q_{k+1} = s_i | q_k = s_j)
       = Σ_i P(o_{k+2} o_{k+3} ... o_K | q_{k+1} = s_i) a_ji b_i(o_{k+1})
       = Σ_i β_{k+1}(i) a_ji b_i(o_{k+1}),   1 <= j <= N, 1 <= k <= K-1.

Termination:
P(o_1 o_2 ... o_K) = Σ_i P(o_1 o_2 ... o_K, q_1 = s_i)
                   = Σ_i P(o_1 o_2 ... o_K | q_1 = s_i) P(q_1 = s_i)
                   = Σ_i β_1(i) b_i(o_1) π_i
Backward recursion for HMM
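A matching sketch of the backward recursion, under the same assumed array conventions as the forward sketch, is shown below; the termination line checks that it reproduces the same P(O).

```python
import numpy as np

def backward(A, B, obs):
    """Backward algorithm: beta[k, i] = P(o_k+1..o_K | q_k = s_i).

    Same array conventions as the forward sketch (A[i, j] = P(s_j | s_i)),
    which is an assumption of this illustration.
    """
    N, K = A.shape[0], len(obs)
    beta = np.zeros((K, N))
    beta[-1] = 1.0                                   # initialization
    for k in range(K - 2, -1, -1):                   # backward recursion
        beta[k] = A @ (B[:, obs[k + 1]] * beta[k + 1])
    return beta

# Termination check: P(O) = sum_i beta_1(i) b_i(o_1) pi_i equals the forward result.
A  = np.array([[0.3, 0.7], [0.2, 0.8]])
B  = np.array([[0.6, 0.4], [0.4, 0.6]])
pi = np.array([0.4, 0.6])
obs = [1, 0]                                         # Dry, Rain
beta = backward(A, B, obs)
print((beta[0] * B[:, obs[0]] * pi).sum())           # 0.232 again
```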
Decoding problem. Given the HMM M = (A, B, π) and the observation sequence O = o_1 o_2 ... o_K, calculate the most likely sequence of hidden states s_i that produced this observation sequence.

We want to find the state sequence Q = q_1 ... q_K that maximizes P(Q | o_1 o_2 ... o_K), or equivalently P(Q, o_1 o_2 ... o_K).

Brute-force consideration of all paths takes exponential time. Use the efficient Viterbi algorithm instead.
Decoding problem
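A sketch of the Viterbi algorithm under the same assumed array conventions follows; delta and psi are the usual best-path score and back-pointer tables.

```python
import numpy as np

def viterbi(A, B, pi, obs):
    """Viterbi algorithm: most likely hidden state sequence for obs.

    delta[k, i] = max over paths ending in state i at time k of P(q_1..q_k, o_1..o_k);
    psi stores the argmax for backtracking. Array layout as in the forward sketch.
    """
    N, K = len(pi), len(obs)
    delta = np.zeros((K, N))
    psi = np.zeros((K, N), dtype=int)
    delta[0] = pi * B[:, obs[0]]
    for k in range(1, K):
        scores = delta[k - 1][:, None] * A          # scores[i, j] for transition i -> j
        psi[k] = scores.argmax(axis=0)
        delta[k] = scores.max(axis=0) * B[:, obs[k]]
    # Backtrack the single best path.
    path = [int(delta[-1].argmax())]
    for k in range(K - 1, 0, -1):
        path.append(int(psi[k][path[-1]]))
    return path[::-1], delta[-1].max()

A  = np.array([[0.3, 0.7], [0.2, 0.8]])
B  = np.array([[0.6, 0.4], [0.4, 0.6]])
pi = np.array([0.4, 0.6])
print(viterbi(A, B, pi, [1, 0]))   # ([1, 1], 0.1152): High, High is the best path
```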
Learning problem. Given some training observation sequences O = o_1 o_2 ... o_K and the general structure of the HMM (numbers of hidden and visible states), determine the HMM parameters M = (A, B, π) that best fit the training data, that is, that maximize P(O | M).

There is no algorithm producing globally optimal parameter values.

Use an iterative expectation-maximization algorithm to find a local maximum of P(O | M): the Baum-Welch algorithm.
Learning problem (1)
If the training data contains information about the sequence of hidden states (as in the word recognition example), then use maximum likelihood estimation of the parameters:

a_ij = P(s_i | s_j) = (number of transitions from state s_j to state s_i) / (number of transitions out of state s_j)

b_i(v_m) = P(v_m | s_i) = (number of times observation v_m occurs in state s_i) / (number of times in state s_i)
Learning problem (2)
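When the hidden state sequences are available in the training data, these estimates are just relative frequencies. The sketch below is illustrative only; it keys the transition table as A[from][to], so A[s_j][s_i] corresponds to the slides' a_ij = P(s_i | s_j).

```python
from collections import Counter

def ml_estimate(state_seqs, obs_seqs, states, symbols):
    """Count-based maximum likelihood estimates of A, B, pi from paired
    (state sequence, observation sequence) training data."""
    trans = Counter()   # (prev_state, next_state) -> count
    emit = Counter()    # (state, symbol) -> count
    start = Counter()   # first state -> count
    for q, o in zip(state_seqs, obs_seqs):
        start[q[0]] += 1
        for a, b in zip(q, q[1:]):
            trans[(a, b)] += 1
        for s, v in zip(q, o):
            emit[(s, v)] += 1
    A = {s: {t: trans[(s, t)] / max(sum(trans[(s, u)] for u in states), 1)
             for t in states} for s in states}
    B = {s: {v: emit[(s, v)] / max(sum(emit[(s, u)] for u in symbols), 1)
             for v in symbols} for s in states}
    pi = {s: start[s] / max(sum(start.values()), 1) for s in states}
    return A, B, pi
```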
General idea:

a_ij = P(s_i | s_j) = (expected number of transitions from state s_j to state s_i) / (expected number of transitions out of state s_j)

b_i(v_m) = P(v_m | s_i) = (expected number of times observation v_m occurs in state s_i) / (expected number of times in state s_i)

π_i = P(s_i) = expected frequency in state s_i at time k = 1.
Baum-Welch algorithm
Define the variable ξ_k(i,j) as the probability of being in state s_i at time k and in state s_j at time k+1, given the observation sequence o_1 o_2 ... o_K:

ξ_k(i,j) = P(q_k = s_i, q_{k+1} = s_j | o_1 o_2 ... o_K)

ξ_k(i,j) = P(q_k = s_i, q_{k+1} = s_j, o_1 o_2 ... o_K) / P(o_1 o_2 ... o_K)
         = P(q_k = s_i, o_1 o_2 ... o_k) a_ij b_j(o_{k+1}) P(o_{k+2} ... o_K | q_{k+1} = s_j) / P(o_1 o_2 ... o_K)
         = α_k(i) a_ij b_j(o_{k+1}) β_{k+1}(j) / [ Σ_i Σ_j α_k(i) a_ij b_j(o_{k+1}) β_{k+1}(j) ]
Baum-Welch algorithm: expectation step(1)
Define the variable γ_k(i) as the probability of being in state s_i at time k, given the observation sequence o_1 o_2 ... o_K:

γ_k(i) = P(q_k = s_i | o_1 o_2 ... o_K)

γ_k(i) = P(q_k = s_i, o_1 o_2 ... o_K) / P(o_1 o_2 ... o_K) = α_k(i) β_k(i) / [ Σ_i α_k(i) β_k(i) ]
Baum-Welch algorithm: expectation step(2)
We have calculated
ξ_k(i,j) = P(q_k = s_i, q_{k+1} = s_j | o_1 o_2 ... o_K)
and
γ_k(i) = P(q_k = s_i | o_1 o_2 ... o_K).

Expected number of transitions from state s_i to state s_j = Σ_k ξ_k(i,j)
Expected number of transitions out of state s_i = Σ_k γ_k(i)
Expected number of times observation v_m occurs in state s_i = Σ_k γ_k(i), where k is such that o_k = v_m
Expected frequency in state s_i at time k = 1: γ_1(i).
Baum-Welch algorithm: expectation step(3)
a_ij = (expected number of transitions from state s_j to state s_i) / (expected number of transitions out of state s_j) = Σ_k ξ_k(i,j) / Σ_k γ_k(i)

b_i(v_m) = (expected number of times observation v_m occurs in state s_i) / (expected number of times in state s_i) = Σ_{k: o_k = v_m} γ_k(i) / Σ_k γ_k(i)

π_i = (expected frequency in state s_i at time k = 1) = γ_1(i).
Baum-Welch algorithm: maximization step
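Putting the expectation and maximization steps together, one Baum-Welch re-estimation pass over a single observation sequence might look like the following sketch (illustrative; array conventions as in the earlier forward/backward sketches, and without the scaling normally needed for long sequences).

```python
import numpy as np

def baum_welch_step(A, B, pi, obs):
    """One Baum-Welch (EM) re-estimation step for a single observation sequence.

    Assumed layout: A[i, j] = P(s_j | s_i), B[i, m] = P(v_m | s_i),
    obs is a list of symbol indices of length K >= 2.
    """
    N, M, K = A.shape[0], B.shape[1], len(obs)

    # Forward and backward passes (same recursions as sketched earlier).
    alpha = np.zeros((K, N)); alpha[0] = pi * B[:, obs[0]]
    for k in range(1, K):
        alpha[k] = (alpha[k - 1] @ A) * B[:, obs[k]]
    beta = np.zeros((K, N)); beta[-1] = 1.0
    for k in range(K - 2, -1, -1):
        beta[k] = A @ (B[:, obs[k + 1]] * beta[k + 1])
    prob = alpha[-1].sum()                     # P(O | current model)

    # Expectation step: gamma_k(i) and xi_k(i, j).
    gamma = alpha * beta / prob
    xi = np.zeros((K - 1, N, N))
    for k in range(K - 1):
        xi[k] = alpha[k][:, None] * A * B[:, obs[k + 1]] * beta[k + 1] / prob

    # Maximization step: re-estimate A, B, pi from expected counts.
    A_new = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    B_new = np.zeros((N, M))
    for m in range(M):
        mask = (np.array(obs) == m)
        B_new[:, m] = gamma[mask].sum(axis=0) / gamma.sum(axis=0)
    pi_new = gamma[0]
    return A_new, B_new, pi_new
```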
Object Identification
To recognize observed symbol sequences, we create an HMM for each category and choose the model that best matches the observations of a particular category. This is done by using the Baum-Welch learning algorithm, in which the known observations for each category are used to estimate that category's HMM model M = (A, B, π).
Object Identification
To determine which category an unknown observation sequence belongs to, we use the forward algorithm: for each HMM generated in the learning step, we compute the probability of the observation sequence under that model.
The model that gives the highest probability indicates the category to which the observation belongs; in this way we identify the objects.
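A sketch of this classification step: each category keeps its trained (A, B, π), the forward recursion scores the unknown sequence under each model, and the highest-scoring category is returned. Names such as `classify` and the example category labels are hypothetical.

```python
import numpy as np

def sequence_likelihood(model, obs):
    """P(obs | model) via the forward recursion, as sketched earlier."""
    A, B, pi = model
    alpha = pi * B[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
    return alpha.sum()

def classify(models, obs):
    """Pick the category whose trained HMM gives the observation sequence
    the highest probability. `models` maps category name -> (A, B, pi)."""
    return max(models, key=lambda c: sequence_likelihood(models[c], obs))

# Usage sketch (hypothetical categories):
# models = {"car": (A1, B1, pi1), "person": (A2, B2, pi2)}
# category = classify(models, observed_symbol_sequence)
```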
Event recognition
Once tracking of an object is achieved, its trajectory information is obtained for every point it has visited. This consists of the position of the centroid and the object's instantaneous velocity at each point, which are used to construct a flow vector
f = (x, y, vx, vy),
and the trajectory of each object is given by a sequence of flow vectors
T = f_1 f_2 ... f_N
Event recognition
The idea is to observe the usual trajectories of the moving targets in the scene and train the HMMs on these normal trajectories. In this context, abnormal means any type of motion that has no previous occurrences (i.e., does not resemble normal motion).
To train the hidden Markov models, a suitable discrete observation sequence is required; this is obtained by clustering (x, y) and (vx, vy) separately.
The method used for clustering the data is the K-means algorithm.
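One way this discretization might be coded is sketched below: cluster (x, y) and (vx, vy) separately with K-means and combine the two cluster indices into a single observation symbol. The use of scikit-learn's KMeans and the cluster counts n_pos/n_vel are assumptions of this sketch, not choices made in the slides.

```python
import numpy as np
from sklearn.cluster import KMeans   # assumed dependency for K-means clustering

def trajectory_to_symbols(flow_vectors, n_pos=16, n_vel=8, seed=0):
    """Cluster (x, y) and (vx, vy) separately, as described above, and
    combine the two cluster indices into one discrete observation symbol.
    The cluster counts n_pos and n_vel are illustrative choices."""
    F = np.asarray(flow_vectors)            # shape (N, 4): x, y, vx, vy
    pos_labels = KMeans(n_clusters=n_pos, n_init=10, random_state=seed).fit_predict(F[:, :2])
    vel_labels = KMeans(n_clusters=n_vel, n_init=10, random_state=seed).fit_predict(F[:, 2:])
    return pos_labels * n_vel + vel_labels  # symbol index in {0, ..., n_pos*n_vel - 1}
```

In practice the clusterers would be fitted on the training trajectories and then reused to quantize new trajectories, so that training and test sequences share the same symbol alphabet.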
Event recognition
Selection of the number of models: there should be a mechanism to identify distinct motion patterns, and each pattern is modeled separately.
Example:
Event recognition
If the images are obtained from a highway (Figure (a)), it can be deduced that there are mainly two types of motion: one going from bottom to top (right lane) and the other from top to bottom (left lane).
Using the learning algorithm, two HMMs are generated for the highway sequence.
When an unknown trajectory is obtained, its probability is evaluated under each HMM. If it is normal motion, one of the two HMMs will give a high probability for the trajectory, whereas if it is abnormal motion both HMMs will give a low probability, thus enabling us to detect abnormal events.
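A sketch of this decision rule: score the trajectory's symbol sequence under each normal-motion HMM and flag it as abnormal when even the best model's log-likelihood falls below a threshold. The threshold value and function names are hypothetical.

```python
import numpy as np

def best_log_likelihood(models, obs):
    """Log-probability of obs under the best-matching trained HMM.
    `models` is a list of (A, B, pi) triples (e.g. the two highway models)."""
    scores = []
    for A, B, pi in models:
        alpha = pi * B[:, obs[0]]
        for o in obs[1:]:
            alpha = (alpha @ A) * B[:, o]
        scores.append(np.log(alpha.sum() + 1e-300))   # guard against log(0)
    return max(scores)

def is_abnormal(models, obs, threshold):
    """Flag a trajectory as abnormal when even the best normal-motion model
    assigns it a low probability. The threshold is an assumed, tuned value."""
    return best_log_likelihood(models, obs) < threshold
```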
Combining Event recognition and Object identification
If abnormal motion occurs in a surveillance camera's field of view, it can be detected using event recognition, and the object responsible for it can be identified using object identification.
REFERENCES
Moving Object Identification and Event Recognition in Video Surveillance Systems. M.Sc. thesis, Graduate School of Natural and Applied Sciences, Middle East Technical University, by Burkay Birant Örten.

A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition. Lawrence R. Rabiner, Fellow, IEEE.

Recognizing Human Action in Time-Sequential Images Using Hidden Markov Model. Junji Yamato, Jun Ohya, Kenichiro Ishii. NTT Human Interface Laboratories, Yokosuka, Japan.

An Introduction to Hidden Markov Models. L.R. Rabiner and B.H. Juang.
