
State Space Models and Filtering

Jesús Fernández-Villaverde
University of Pennsylvania

State Space Form


What is a state space representation?
States versus observables.
Why is it useful?
Relation with filtering.
Relation with optimal control.
Linear versus nonlinear, Gaussian versus non-Gaussian.

State Space Representation

Consider the following system:

Transition equation:
x_{t+1} = F x_t + G ω_{t+1},  ω_{t+1} ~ N(0, Q)

Measurement equation:
z_t = H' x_t + υ_t,  υ_t ~ N(0, R)

where x_t are the states and z_t are the observables.

Assume we want to write the likelihood function of z^T = {z_t}_{t=1}^T.

The State Space Representation is Not Unique

Take the previous state space representation.

Let B be a non-singular square matrix conforming with F.

Then, if x*_t = B x_t, F* = B F B^{-1}, G* = B G, and H*' = H' B^{-1}, we can write a new, equivalent representation:

Transition equation:
x*_{t+1} = F* x*_t + G* ω_{t+1},  ω_{t+1} ~ N(0, Q)

Measurement equation:
z_t = H*' x*_t + υ_t,  υ_t ~ N(0, R)

Example I

Assume the following AR(2) process:

z_t = ρ_1 z_{t-1} + ρ_2 z_{t-2} + ε_t,  ε_t ~ N(0, σ²)

The model is apparently not Markovian.

Can we write this model in different state space forms?

Yes!

State Space Representation I

Transition equation:

x_t = [ρ_1, 1; ρ_2, 0] x_{t-1} + [1; 0] ε_t

where x_t = [y_t, ρ_2 y_{t-1}]'.

Measurement equation:

z_t = [1, 0] x_t

State Space Representation II

Transition equation:

x_t = [ρ_1, ρ_2; 1, 0] x_{t-1} + [1; 0] ε_t

where x_t = [y_t, y_{t-1}]'.

Measurement equation:

z_t = [1, 0] x_t

Try B = [1, 0; 0, ρ_2] on the second system to get the first system.
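A quick numeric check of the equivalence (a sketch, not from the slides; ρ_1 = 0.5 and ρ_2 = 0.3 are made-up values): simulating the AR(2) through both state space forms yields identical observables, and B = [1, 0; 0, ρ_2] maps the second system into the first.

```python
import numpy as np

rho1, rho2 = 0.5, 0.3
# Representation I: x_t = [z_t, rho2*z_{t-1}]'
F1 = np.array([[rho1, 1.0], [rho2, 0.0]])
# Representation II: x_t = [z_t, z_{t-1}]'
F2 = np.array([[rho1, rho2], [1.0, 0.0]])
G = np.array([1.0, 0.0])
H = np.array([1.0, 0.0])          # z_t = [1 0] x_t in both cases

rng = np.random.default_rng(0)
eps = rng.standard_normal(200)

def simulate(F):
    x = np.zeros(2)
    z = []
    for e in eps:
        x = F @ x + G * e          # transition equation
        z.append(H @ x)            # measurement equation
    return np.array(z)

z1, z2 = simulate(F1), simulate(F2)
print(np.allclose(z1, z2))         # both forms generate the same z_t

# The change of basis B maps representation II into representation I:
B = np.diag([1.0, rho2])
print(np.allclose(B @ F2 @ np.linalg.inv(B), F1))
```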

Example II

Assume the following MA(1) process:

z_t = ε_t + θ ε_{t-1},  ε_t ~ N(0, σ²), and E[ε_t ε_s] = 0 for s ≠ t.

Again, we have a more complicated structure than a simple Markovian process.

However, it will again be straightforward to write a state space representation.

State Space Representation I

Transition equation:

x_t = [0, 1; 0, 0] x_{t-1} + [1; θ] ε_t

where x_t = [y_t, θ ε_t]'.

Measurement equation:

z_t = [1, 0] x_t

State Space Representation II

Transition equation:

x_t = ε_{t-1}

where x_t = [ε_{t-1}]'.

Measurement equation:

z_t = θ x_t + ε_t

Again, both representations are equivalent!

Example III

Assume the following random walk plus drift process:

z_t = z_{t-1} + μ + ε_t,  ε_t ~ N(0, σ²)

This is even more interesting:
We have a unit root.
We have a constant parameter (the drift).

State Space Representation

Transition equation:

x_t = [1, 1; 0, 1] x_{t-1} + [1; 0] ε_t

where x_t = [y_t, μ]'.

Measurement equation:

z_t = [1, 0] x_t

Some Conditions on the State Space Representation

We only consider stable systems.

A system is stable if, for any initial state x_0, the vector of states, x_t, converges to some unique x̄.

A necessary and sufficient condition for the system to be stable is that:

|λ_i(F)| < 1

for all i, where λ_i(F) stands for the i-th eigenvalue of F.
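The eigenvalue condition is easy to check numerically; a minimal sketch (the example matrices are illustrative, and the second one is the random-walk-plus-drift system, which has unit roots):

```python
import numpy as np

def is_stable(F):
    # Stable iff every eigenvalue of F lies strictly inside the unit circle.
    return bool(np.all(np.abs(np.linalg.eigvals(F)) < 1.0))

F_ar2 = np.array([[0.5, 0.3], [1.0, 0.0]])   # stationary AR(2) companion matrix
F_rw  = np.array([[1.0, 1.0], [0.0, 1.0]])   # random walk with drift: unit roots
print(is_stable(F_ar2), is_stable(F_rw))     # True False
```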

Introducing the Kalman Filter

Developed by Kalman and Bucy.

Wide application in science.
Basic idea.
Prediction, smoothing, and control.
Why the name "filter"?

Some Definitions

Let x_{t|t-1} = E[x_t | z^{t-1}] be the best linear predictor of x_t given the history of observables until t-1, i.e. z^{t-1}.

Let z_{t|t-1} = E[z_t | z^{t-1}] = H' x_{t|t-1} be the best linear predictor of z_t given the history of observables until t-1, i.e. z^{t-1}.

Let x_{t|t} = E[x_t | z^t] be the best linear predictor of x_t given the history of observables until t, i.e. z^t.

What is the Kalman Filter trying to do?

Let us assume we have x_{t|t-1} and z_{t|t-1}.

We observe a new z_t.

We need to obtain x_{t|t}.

Note that x_{t+1|t} = F x_{t|t} and z_{t+1|t} = H' x_{t+1|t}, so we can go back to the first step and wait for z_{t+1}.

Therefore, the key question is how to obtain x_{t|t} from x_{t|t-1} and z_t.

A Minimization Approach to the Kalman Filter I

Assume we use the following equation to get x_{t|t} from z_t and x_{t|t-1}:

x_{t|t} = x_{t|t-1} + K_t (z_t − z_{t|t-1}) = x_{t|t-1} + K_t (z_t − H' x_{t|t-1})

This formula will have some probabilistic justification (to follow).

What is K_t?

A Minimization Approach to the Kalman Filter II

K_t is called the Kalman filter gain and it measures how much we update x_{t|t-1} as a function of our error in predicting z_t.

The question is how to find the optimal K_t.

The Kalman filter is about how to build K_t such that we optimally update x_{t|t} from x_{t|t-1} and z_t.

How do we find the optimal K_t?

Some Additional Definitions

Let Σ_{t|t-1} ≡ E[(x_t − x_{t|t-1})(x_t − x_{t|t-1})' | z^{t-1}] be the predicting error variance-covariance matrix of x_t given the history of observables until t-1, i.e. z^{t-1}.

Let Ω_{t|t-1} ≡ E[(z_t − z_{t|t-1})(z_t − z_{t|t-1})' | z^{t-1}] be the predicting error variance-covariance matrix of z_t given the history of observables until t-1, i.e. z^{t-1}.

Let Σ_{t|t} ≡ E[(x_t − x_{t|t})(x_t − x_{t|t})' | z^t] be the predicting error variance-covariance matrix of x_t given the history of observables until t, i.e. z^t.

Finding the optimal K_t

We want the K_t that minimizes Σ_{t|t}.

It can be shown that, if that is the case:

K_t = Σ_{t|t-1} H (H' Σ_{t|t-1} H + R)^{-1}

with the optimal update of x_{t|t} given z_t and x_{t|t-1} being:

x_{t|t} = x_{t|t-1} + K_t (z_t − H' x_{t|t-1})

We will provide some intuition later.

Example I

Assume the following model in state space form:

Transition equation:
x_t = μ + ε_t,  ε_t ~ N(0, σ_ε²)

Measurement equation:
z_t = x_t + ν_t,  ν_t ~ N(0, σ_ν²)

Let σ_ν² = q σ_ε².

Example II

Then, let Σ_{1|0} = σ_ε², which means that x_1 was drawn from the ergodic distribution of x_t.

We have:

K_1 = σ_ε² (σ_ε² + σ_ν²)^{-1} = 1/(1+q)

Therefore, the bigger σ_ν² relative to σ_ε² (the bigger q), the lower K_1 and the less we trust z_1.
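A numeric check of this example (a sketch with a made-up σ_ε²): the scalar version of the gain formula collapses to 1/(1+q).

```python
sigma_eps2 = 2.0                                  # assumed value of sigma_eps^2
for q in [0.1, 1.0, 10.0]:
    sigma_nu2 = q * sigma_eps2
    # scalar version of K_1 = Sigma H (H' Sigma H + R)^{-1} with H = 1:
    K1 = sigma_eps2 / (sigma_eps2 + sigma_nu2)
    assert abs(K1 - 1.0 / (1.0 + q)) < 1e-12
    print(q, K1)   # the larger q, the smaller K1: we trust z_1 less
```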

The Kalman Filter Algorithm I

Given Σ_{t|t-1}, z_t, and x_{t|t-1}, we can now set up the Kalman filter algorithm.

Given Σ_{t|t-1}, we compute:

Ω_{t|t-1} ≡ E[(z_t − z_{t|t-1})(z_t − z_{t|t-1})' | z^{t-1}]
= E[(H'(x_t − x_{t|t-1}) + υ_t)(H'(x_t − x_{t|t-1}) + υ_t)' | z^{t-1}]
= H' Σ_{t|t-1} H + R

where the cross terms H'(x_t − x_{t|t-1})υ_t' and υ_t(x_t − x_{t|t-1})'H vanish in expectation.

The Kalman Filter Algorithm II

Given Σ_{t|t-1}, we compute:

E[(z_t − z_{t|t-1})(x_t − x_{t|t-1})' | z^{t-1}]
= E[(H'(x_t − x_{t|t-1}) + υ_t)(x_t − x_{t|t-1})' | z^{t-1}] = H' Σ_{t|t-1}

Given Σ_{t|t-1}, we compute:

K_t = Σ_{t|t-1} H (H' Σ_{t|t-1} H + R)^{-1}

Given Σ_{t|t-1}, x_{t|t-1}, K_t, and z_t, we compute:

x_{t|t} = x_{t|t-1} + K_t (z_t − H' x_{t|t-1})

The Kalman Filter Algorithm III

Given Σ_{t|t-1}, x_{t|t-1}, K_t, and z_t, we compute:

Σ_{t|t} ≡ E[(x_t − x_{t|t})(x_t − x_{t|t})' | z^t]
= E[(x_t − x_{t|t-1} − K_t(z_t − H'x_{t|t-1}))(x_t − x_{t|t-1} − K_t(z_t − H'x_{t|t-1}))' | z^t]
= Σ_{t|t-1} − K_t H' Σ_{t|t-1}

where you have to notice that x_t − x_{t|t} = x_t − x_{t|t-1} − K_t (z_t − H' x_{t|t-1}).

The Kalman Filter Algorithm IV

Given Σ_{t|t} and x_{t|t}, we compute:

Σ_{t+1|t} = F Σ_{t|t} F' + G Q G'

1. x_{t+1|t} = F x_{t|t}
2. z_{t+1|t} = H' x_{t+1|t}

Therefore, from x_{t|t-1}, Σ_{t|t-1}, and z_t we compute x_{t|t} and Σ_{t|t}.

The Kalman Filter Algorithm V

We also compute z_{t|t-1} and Ω_{t|t-1}.

Why? To calculate the likelihood function of z^T = {z_t}_{t=1}^T (to follow).

The Kalman Filter Algorithm: A Review

We start with x_{t|t-1} and Σ_{t|t-1}.

Then, we observe z_t and compute:

Ω_{t|t-1} = H' Σ_{t|t-1} H + R
z_{t|t-1} = H' x_{t|t-1}
K_t = Σ_{t|t-1} H (H' Σ_{t|t-1} H + R)^{-1}
Σ_{t|t} = Σ_{t|t-1} − K_t H' Σ_{t|t-1}
x_{t|t} = x_{t|t-1} + K_t (z_t − H' x_{t|t-1})
Σ_{t+1|t} = F Σ_{t|t} F' + G Q G'
x_{t+1|t} = F x_{t|t}

We finish with x_{t+1|t} and Σ_{t+1|t}.
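The review above can be sketched as a small function (illustrative only; S stands for Σ, and the scalar AR(1) example at the bottom uses made-up numbers):

```python
import numpy as np

def kalman_filter(z, F, G, H, Q, R, x0, S0):
    """Run the recursion on data z of shape (T, k); returns the x_{t|t}."""
    x, S = x0, S0
    filtered = []
    for zt in z:
        Omega = H.T @ S @ H + R                 # Omega_{t|t-1}
        K = S @ H @ np.linalg.inv(Omega)        # Kalman gain K_t
        x = x + K @ (zt - H.T @ x)              # update: x_{t|t}
        S = S - K @ H.T @ S                     # update: Sigma_{t|t}
        filtered.append(x)
        x = F @ x                               # predict: x_{t+1|t}
        S = F @ S @ F.T + G @ Q @ G.T           # predict: Sigma_{t+1|t}
    return np.array(filtered)

# Scalar AR(1) state observed with noise (illustrative values):
F = np.array([[0.9]]); G = np.array([[1.0]]); H = np.array([[1.0]])
Q = np.array([[1.0]]); R = np.array([[0.5]])
```

With these matrices, `kalman_filter(z, F, G, H, Q, R, np.zeros(1), Sigma0)` produces filtered state estimates that are less noisy than the raw observations.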

Some Intuition about the optimal K_t

Remember: K_t = Σ_{t|t-1} H (H' Σ_{t|t-1} H + R)^{-1}

Notice that we can rewrite K_t in the following way:

K_t = Σ_{t|t-1} H Ω_{t|t-1}^{-1}

If we made a big mistake forecasting x_{t|t-1} using past information (Σ_{t|t-1} large), we give a lot of weight to the new information (K_t large).

If the new information is noise (R large), we give a lot of weight to the old prediction (K_t small).

A Probabilistic Approach to the Kalman Filter

Assume:

Z|w = [X'|w  Y'|w]' ~ N( [μ_x; μ_y], [Σ_xx, Σ_xy; Σ_yx, Σ_yy] )

Then:

X|y,w ~ N( μ_x + Σ_xy Σ_yy^{-1} (y − μ_y), Σ_xx − Σ_xy Σ_yy^{-1} Σ_yx )

Also x_{t|t-1} ≡ E[x_t | z^{t-1}] and:

Σ_{t|t-1} ≡ E[(x_t − x_{t|t-1})(x_t − x_{t|t-1})' | z^{t-1}]

Some Derivations I

If z_t|z^{t-1} is the random variable z_t (observable) conditional on z^{t-1}, then:

z_{t|t-1} ≡ E[z_t | z^{t-1}] = E[H'x_t + υ_t | z^{t-1}] = H' x_{t|t-1}

and:

Ω_{t|t-1} ≡ E[(z_t − z_{t|t-1})(z_t − z_{t|t-1})' | z^{t-1}]
= E[(H'(x_t − x_{t|t-1}) + υ_t)(H'(x_t − x_{t|t-1}) + υ_t)' | z^{t-1}]
= H' Σ_{t|t-1} H + R

Some Derivations II

Finally, let:

E[(z_t − z_{t|t-1})(x_t − x_{t|t-1})' | z^{t-1}]
= E[(H'(x_t − x_{t|t-1}) + υ_t)(x_t − x_{t|t-1})' | z^{t-1}]
= H' Σ_{t|t-1}

The Kalman Filter First Iteration I

Assume we know x_{1|0} and Σ_{1|0}. Then:

[x_1; z_1] | z^0 ~ N( [x_{1|0}; H'x_{1|0}], [Σ_{1|0}, Σ_{1|0}H; H'Σ_{1|0}, H'Σ_{1|0}H + R] )

Remember that:

X|y,w ~ N( μ_x + Σ_xy Σ_yy^{-1} (y − μ_y), Σ_xx − Σ_xy Σ_yy^{-1} Σ_yx )

The Kalman Filter First Iteration II

Then, we can write:

x_1 | z_1, z^0 = x_1 | z^1 ~ N( x_{1|1}, Σ_{1|1} )

where:

x_{1|1} = x_{1|0} + Σ_{1|0} H (H' Σ_{1|0} H + R)^{-1} (z_1 − H' x_{1|0})

and:

Σ_{1|1} = Σ_{1|0} − Σ_{1|0} H (H' Σ_{1|0} H + R)^{-1} H' Σ_{1|0}

Therefore, we have that:

z_{1|0} = H' x_{1|0}
Ω_{1|0} = H' Σ_{1|0} H + R
x_{1|1} = x_{1|0} + Σ_{1|0} H (H' Σ_{1|0} H + R)^{-1} (z_1 − H' x_{1|0})
Σ_{1|1} = Σ_{1|0} − Σ_{1|0} H (H' Σ_{1|0} H + R)^{-1} H' Σ_{1|0}

Also, since x_2 = F x_1 + G ω_2 and z_2 = H' x_2 + υ_2:

x_{2|1} = F x_{1|1}
Σ_{2|1} = F Σ_{1|1} F' + G Q G'

The Kalman Filter t-th Iteration I

Assume we know x_{t|t-1} and Σ_{t|t-1}. Then:

[x_t; z_t] | z^{t-1} ~ N( [x_{t|t-1}; H'x_{t|t-1}], [Σ_{t|t-1}, Σ_{t|t-1}H; H'Σ_{t|t-1}, H'Σ_{t|t-1}H + R] )

Remember that:

X|y,w ~ N( μ_x + Σ_xy Σ_yy^{-1} (y − μ_y), Σ_xx − Σ_xy Σ_yy^{-1} Σ_yx )

The Kalman Filter t-th Iteration II

Then, we can write:

x_t | z_t, z^{t-1} = x_t | z^t ~ N( x_{t|t}, Σ_{t|t} )

where:

x_{t|t} = x_{t|t-1} + Σ_{t|t-1} H (H' Σ_{t|t-1} H + R)^{-1} (z_t − H' x_{t|t-1})

and:

Σ_{t|t} = Σ_{t|t-1} − Σ_{t|t-1} H (H' Σ_{t|t-1} H + R)^{-1} H' Σ_{t|t-1}

The Kalman Filter Algorithm

Given x_{t|t-1}, Σ_{t|t-1}, and observation z_t:

Ω_{t|t-1} = H' Σ_{t|t-1} H + R
z_{t|t-1} = H' x_{t|t-1}
Σ_{t|t} = Σ_{t|t-1} − Σ_{t|t-1} H (H' Σ_{t|t-1} H + R)^{-1} H' Σ_{t|t-1}
x_{t|t} = x_{t|t-1} + Σ_{t|t-1} H (H' Σ_{t|t-1} H + R)^{-1} (z_t − H' x_{t|t-1})
Σ_{t+1|t} = F Σ_{t|t} F' + G Q G'
x_{t+1|t} = F x_{t|t}

Putting the Minimization and the Probabilistic Approaches Together

From the minimization approach we know that:

x_{t|t} = x_{t|t-1} + K_t (z_t − H' x_{t|t-1})

From the probabilistic approach we know that:

x_{t|t} = x_{t|t-1} + Σ_{t|t-1} H (H' Σ_{t|t-1} H + R)^{-1} (z_t − H' x_{t|t-1})

But since:

K_t = Σ_{t|t-1} H (H' Σ_{t|t-1} H + R)^{-1}

we can also write in the probabilistic approach:

x_{t|t} = x_{t|t-1} + Σ_{t|t-1} H (H' Σ_{t|t-1} H + R)^{-1} (z_t − H' x_{t|t-1})
       = x_{t|t-1} + K_t (z_t − H' x_{t|t-1})

Therefore, both approaches are equivalent.

Writing the Likelihood Function

We want to write the likelihood function of z^T = {z_t}_{t=1}^T:

log ℓ(z^T | F, G, H, Q, R) = Σ_{t=1}^T log ℓ(z_t | z^{t-1}; F, G, H, Q, R)
= − (TN/2) log 2π − (1/2) Σ_{t=1}^T log |Ω_{t|t-1}| − (1/2) Σ_{t=1}^T v_t' Ω_{t|t-1}^{-1} v_t

where N is the dimension of z_t and:

v_t = z_t − z_{t|t-1} = z_t − H' x_{t|t-1}
Ω_{t|t-1} = H' Σ_{t|t-1} H + R
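The likelihood can be accumulated inside the same recursion; a sketch (notation as in the slides, with the prediction step folded into the loop):

```python
import numpy as np

def kf_loglik(z, F, G, H, Q, R, x0, S0):
    """Gaussian log likelihood of z (shape (T, N)) via the Kalman filter."""
    x, S = x0, S0
    ll = 0.0
    N = z.shape[1]
    for zt in z:
        Omega = H.T @ S @ H + R                            # Omega_{t|t-1}
        v = zt - H.T @ x                                   # forecast error v_t
        ll += -0.5 * (N * np.log(2 * np.pi)
                      + np.log(np.linalg.det(Omega))
                      + v @ np.linalg.solve(Omega, v))
        K = S @ H @ np.linalg.inv(Omega)
        x = F @ (x + K @ v)                                # x_{t+1|t}
        S = F @ (S - K @ H.T @ S) @ F.T + G @ Q @ G.T      # Sigma_{t+1|t}
    return ll
```

Maximizing `kf_loglik` over the entries of (F, G, H, Q, R), or over the structural parameters that generate them, gives maximum likelihood estimates.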

Initial conditions for the Kalman Filter

An important step in the Kalman filter is to set the initial conditions:

1. x_{1|0}
2. Σ_{1|0}

Where do they come from?

Since we only consider stable systems, the standard approach is to set:

x_{1|0} = x̄
Σ_{1|0} = Σ̄

where x̄ solves x̄ = F x̄ and Σ̄ = F Σ̄ F' + G Q G'.

How do we find Σ̄?

vec(Σ̄) = [I − F ⊗ F]^{-1} vec(G Q G')
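The vec formula translates directly into code (a sketch; F, G, Q are made up but stable):

```python
import numpy as np

def stationary_cov(F, G, Q):
    """Solve Sigma = F Sigma F' + G Q G' via vec(Sigma) = (I - F kron F)^{-1} vec(GQG')."""
    n = F.shape[0]
    # column-major (Fortran-order) vec matches the identity vec(ABC) = (C' kron A) vec(B)
    vec_sig = np.linalg.solve(np.eye(n * n) - np.kron(F, F),
                              (G @ Q @ G.T).reshape(n * n, order="F"))
    return vec_sig.reshape(n, n, order="F")

F = np.array([[0.9, 0.1], [0.0, 0.5]])
G = np.eye(2)
Q = np.eye(2)
Sig = stationary_cov(F, G, Q)
print(np.allclose(Sig, F @ Sig @ F.T + G @ Q @ G.T))   # fixed point verified
```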

Initial conditions for the Kalman Filter II

Under the following conditions:

1. The system is stable, i.e. all eigenvalues of F are strictly less than one in absolute value.
2. G Q G' and R are p.s.d. symmetric.
3. Σ_{1|0} is p.s.d. symmetric.

Then Σ_{t+1|t} → Σ̄.
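A quick numeric illustration of the theorem (a scalar sketch with made-up values): iterating the Riccati recursion from two different p.s.d. starting points reaches the same limit.

```python
import numpy as np

def riccati_limit(F, G, H, Q, R, S0, iters=500):
    """Iterate Sigma_{t+1|t} = F(Sigma - K H' Sigma)F' + GQG' from S0."""
    S = S0
    for _ in range(iters):
        K = S @ H @ np.linalg.inv(H.T @ S @ H + R)
        S = F @ (S - K @ H.T @ S) @ F.T + G @ Q @ G.T
    return S

F = np.array([[0.8]]); G = np.eye(1); H = np.eye(1)
Q = np.eye(1); R = np.array([[0.5]])
lim_a = riccati_limit(F, G, H, Q, R, np.zeros((1, 1)))
lim_b = riccati_limit(F, G, H, Q, R, 100.0 * np.eye(1))
print(np.allclose(lim_a, lim_b))   # same limit from both starting points
```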

Remarks

1. There are more general theorems than the one just described.

2. Those theorems also cover non-stable systems.

3. Since we are going to work with stable systems, the former theorem is enough.

4. The last theorem gives us a way to find Σ̄ as the limit of Σ_{t+1|t} for any Σ_{1|0} we start with.

The Kalman Filter and DSGE models

Basic Real Business Cycle model:

max E_0 Σ_{t=0}^∞ β^t {θ log c_t + (1 − θ) log(1 − l_t)}

c_t + k_{t+1} = k_t^α (e^{z_t} l_t)^{1−α} + (1 − δ) k_t

z_t = ρ z_{t-1} + ε_t,  ε_t ~ N(0, σ)

Parameters: γ = {α, β, δ, θ, ρ, σ}

Equilibrium Conditions

1/c_t = β E_t [ (1/c_{t+1}) (1 + α e^{z_{t+1}} k_{t+1}^{α−1} l_{t+1}^{1−α} − δ) ]

(1 − θ)/(1 − l_t) = (θ/c_t) (1 − α) e^{z_t} k_t^α l_t^{−α}

c_t + k_{t+1} = e^{z_t} k_t^α l_t^{1−α} + (1 − δ) k_t

z_t = ρ z_{t-1} + ε_t

A Special Case

We set, unrealistically but rather usefully for our point, δ = 1.

In this case, the model has two important and useful features:

1. First, the income and the substitution effect from a productivity shock to labor supply exactly cancel each other. Consequently, l_t is constant and equal to:

l_t = l = θ(1 − α) / ((1 − θ)(1 − αβ) + θ(1 − α))

2. Second, the policy function for capital is k_{t+1} = αβ e^{z_t} k_t^α l^{1−α}.

A Special Case II

The definition of k_{t+1} implies that c_t = (1 − αβ) e^{z_t} k_t^α l^{1−α}.

Let us check whether the Euler equation holds:

1/c_t = β E_t [ (1/c_{t+1}) α e^{z_{t+1}} k_{t+1}^{α−1} l^{1−α} ]

1/((1 − αβ) e^{z_t} k_t^α l^{1−α}) = β E_t [ α e^{z_{t+1}} k_{t+1}^{α−1} l^{1−α} / ((1 − αβ) e^{z_{t+1}} k_{t+1}^α l^{1−α}) ]
= β E_t [ α / ((1 − αβ) k_{t+1}) ]
= αβ / ((1 − αβ) αβ e^{z_t} k_t^α l^{1−α})
= 1 / ((1 − αβ) e^{z_t} k_t^α l^{1−α})

so the Euler equation holds.

Let us check whether the intratemporal condition holds:

(1 − θ)/(1 − l) = (θ/c_t) (1 − α) e^{z_t} k_t^α l^{−α}
(1 − θ)/(1 − l) = θ (1 − α) e^{z_t} k_t^α l^{−α} / ((1 − αβ) e^{z_t} k_t^α l^{1−α})
(1 − θ)/(1 − l) = θ (1 − α) / ((1 − αβ) l)
(1 − θ)(1 − αβ) l = θ (1 − α)(1 − l)
((1 − θ)(1 − αβ) + θ (1 − α)) l = θ (1 − α)

which is exactly the constant l from the previous slide.

Finally, the budget constraint holds because of the definition of c_t.

Transition Equation

Since the policy function is linear in logs, we have the transition equation for the model:

[1; log k_{t+1}; z_t] = [1, 0, 0; log(αβ l^{1−α}), α, ρ; 0, 0, ρ] [1; log k_t; z_{t-1}] + [0; 1; 1] ε_t

Note the constant.

Alternative formulations.

Measurement Equation

As observables, we assume log y_t and log i_t, subject to a linearly additive measurement error V_t = (v_{1,t}, v_{2,t})'.

Let V_t ~ N(0, Λ), where Λ is a diagonal matrix with σ_1² and σ_2² as diagonal elements.

Why measurement error? Stochastic singularity.

Then, since with δ = 1 we have i_t = k_{t+1} and y_t = k_{t+1}/(αβ):

[log y_t; log i_t] = [−log(αβ), 1, 0; 0, 1, 0] [1; log k_{t+1}; z_t] + [v_{1,t}; v_{2,t}]

The Solution to the Model in State Space Form

x_t = [1; log k_t; z_{t-1}],  z_t = [log y_t; log i_t]

F = [1, 0, 0; log(αβ l^{1−α}), α, ρ; 0, 0, ρ],  G = [0; 1; 1]

H' = [−log(αβ), 1, 0; 0, 1, 0]

Q = σ²,  R = Λ
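As a sketch (using benchmark-style values α = 0.4, β = 0.99, ρ = 0.95, θ = 0.357 purely for illustration), the transition matrices can be assembled and checked against the exact policy function in logs:

```python
import numpy as np

alpha, beta, rho, theta = 0.4, 0.99, 0.95, 0.357
# constant labor from the delta = 1 special case
l = theta * (1 - alpha) / ((1 - theta) * (1 - alpha * beta) + theta * (1 - alpha))
mu = np.log(alpha * beta * l ** (1 - alpha))

F = np.array([[1.0,  0.0,   0.0],
              [mu,   alpha, rho],
              [0.0,  0.0,   rho]])
G = np.array([[0.0], [1.0], [1.0]])

# One transition step reproduces k_{t+1} = alpha*beta*e^{z_t} k_t^alpha l^{1-alpha} in logs:
logk, zlag, eps = np.log(2.0), 0.1, 0.02
x_next = F @ np.array([1.0, logk, zlag]) + G[:, 0] * eps
z = rho * zlag + eps
assert np.isclose(x_next[1],
                  np.log(alpha * beta) + z + alpha * logk + (1 - alpha) * np.log(l))
```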

The Solution to the Model in State Space Form III

Now, using z^T, F, G, H, Q, and R as defined in the last slide...

...we can use the Riccati equations to compute the likelihood function of the model:

log ℓ(z^T | F, G, H, Q, R)

Cross-equation restrictions implied by the equilibrium solution.

With the likelihood, we can do inference!

What do we Do if δ ≠ 1?

We have two options:

First, we could linearize or log-linearize the model and apply the Kalman filter.

Second, we could compute the likelihood function of the model using a non-linear filter (particle filter).

Advantages and disadvantages.

Fernández-Villaverde, Rubio-Ramírez, and Santos (2005).

The Kalman Filter and linearized DSGE Models

We linearize (or log-linearize) around the steady state.

We assume that we have data on log output (log y_t), log hours (log l_t), and log consumption (log c_t), subject to a linearly additive measurement error V_t = (v_{1,t}, v_{2,t}, v_{3,t})'.

We need to write the model in state space form. Remember that:

k̂_{t+1} = P k̂_t + Q z_t

and

l̂_t = R k̂_t + S z_t

Writing the Likelihood Function I

The transition equation:

[1; k̂_{t+1}; z_{t+1}] = [1, 0, 0; 0, P, Q; 0, 0, ρ] [1; k̂_t; z_t] + [0; 0; 1] ε_{t+1}

The measurement equation requires some care.

Writing the Likelihood Function II

Notice that ŷ_t = z_t + α k̂_t + (1 − α) l̂_t.

Therefore, using l̂_t = R k̂_t + S z_t:

ŷ_t = z_t + α k̂_t + (1 − α)(R k̂_t + S z_t) = (α + (1 − α)R) k̂_t + (1 + (1 − α)S) z_t

Also, since ĉ_t = z_t + α k̂_t − (α + l/(1 − l)) l̂_t, using again l̂_t = R k̂_t + S z_t:

ĉ_t = z_t + α k̂_t − (α + l/(1 − l))(R k̂_t + S z_t)
    = (α − (α + l/(1 − l))R) k̂_t + (1 − (α + l/(1 − l))S) z_t

Writing the Likelihood Function III

Therefore the measurement equation is:

[log y_t; log l_t; log c_t] =
[log y, α + (1 − α)R, 1 + (1 − α)S;
 log l, R, S;
 log c, α − (α + l/(1 − l))R, 1 − (α + l/(1 − l))S] [1; k̂_t; z_t] + [v_{1,t}; v_{2,t}; v_{3,t}]

The Likelihood Function of a General Dynamic Equilibrium Economy

Transition equation:
S_t = f(S_{t-1}, W_t; γ)

Measurement equation:
Y_t = g(S_t, V_t; γ)

Interpretation.

Some Assumptions

1. We can partition {W_t} into two independent sequences {W_{1,t}} and {W_{2,t}}, s.t. W_t = (W_{1,t}, W_{2,t}) and dim(W_{2,t}) + dim(V_t) ≥ dim(Y_t).

2. We can always evaluate the conditional densities p(y_t | W_1^t, y^{t-1}, S_0; γ). Lubik and Schorfheide (2003).

3. The model assigns positive probability to the data.

Our Goal: Likelihood Function

Evaluate the likelihood function of a sequence of realizations of the observable y^T at a particular parameter value γ:

p(y^T; γ)

We factorize it as:

p(y^T; γ) = Π_{t=1}^T p(y_t | y^{t-1}; γ)
= Π_{t=1}^T ∫∫ p(y_t | W_1^t, y^{t-1}, S_0; γ) p(W_1^t, S_0 | y^{t-1}; γ) dW_1^t dS_0

A Law of Large Numbers

If {{s_0^{t|t-1,i}, w_1^{t|t-1,i}}_{i=1}^N}_{t=1}^T are N i.i.d. draws from {p(W_1^t, S_0 | y^{t-1}; γ)}_{t=1}^T, then:

p(y^T; γ) ≃ Π_{t=1}^T (1/N) Σ_{i=1}^N p(y_t | w_1^{t|t-1,i}, y^{t-1}, s_0^{t|t-1,i}; γ)

...thus

The problem of evaluating the likelihood is equivalent to the problem of drawing from {p(W_1^t, S_0 | y^{t-1}; γ)}_{t=1}^T.

Introducing Particles

{s_0^{t-1,i}, w_1^{t-1,i}}_{i=1}^N are N i.i.d. draws from p(W_1^{t-1}, S_0 | y^{t-1}; γ).

Each (s_0^{t-1,i}, w_1^{t-1,i}) is a particle and {s_0^{t-1,i}, w_1^{t-1,i}}_{i=1}^N a swarm of particles.

{s_0^{t|t-1,i}, w_1^{t|t-1,i}}_{i=1}^N are N i.i.d. draws from p(W_1^t, S_0 | y^{t-1}; γ).

Each (s_0^{t|t-1,i}, w_1^{t|t-1,i}) is a proposed particle and {s_0^{t|t-1,i}, w_1^{t|t-1,i}}_{i=1}^N a swarm of proposed particles.

... and Weights

q_t^i = p(y_t | w_1^{t|t-1,i}, y^{t-1}, s_0^{t|t-1,i}; γ) / Σ_{i=1}^N p(y_t | w_1^{t|t-1,i}, y^{t-1}, s_0^{t|t-1,i}; γ)

A Proposition

Let {s̃_0^i, w̃_1^i}_{i=1}^N be a draw with replacement from {s_0^{t|t-1,i}, w_1^{t|t-1,i}}_{i=1}^N with probabilities q_t^i. Then {s̃_0^i, w̃_1^i}_{i=1}^N is a draw from p(W_1^t, S_0 | y^t; γ).

Importance of the Proposition

1. It shows how a draw {s_0^{t|t-1,i}, w_1^{t|t-1,i}}_{i=1}^N from p(W_1^t, S_0 | y^{t-1}; γ) can be used to draw {s_0^{t,i}, w_1^{t,i}}_{i=1}^N from p(W_1^t, S_0 | y^t; γ).

2. With a draw {s_0^{t,i}, w_1^{t,i}}_{i=1}^N from p(W_1^t, S_0 | y^t; γ) we can use p(W_{1,t+1}; γ) to get a draw {s_0^{t+1|t,i}, w_1^{t+1|t,i}}_{i=1}^N and iterate the procedure.

Sequential Monte Carlo I: Filtering

Step 0, Initialization: Set t = 1 and initialize p(W_1^{t-1}, S_0 | y^{t-1}; γ) = p(S_0; γ).

Step 1, Prediction: Sample N values {s_0^{t|t-1,i}, w_1^{t|t-1,i}}_{i=1}^N from the density p(W_1^t, S_0 | y^{t-1}; γ) = p(W_{1,t}; γ) p(W_1^{t-1}, S_0 | y^{t-1}; γ).

Step 2, Weighting: Assign to each draw (s_0^{t|t-1,i}, w_1^{t|t-1,i}) the weight q_t^i.

Step 3, Sampling: Draw {s_0^{t,i}, w_1^{t,i}}_{i=1}^N with replacement from {s_0^{t|t-1,i}, w_1^{t|t-1,i}}_{i=1}^N with probabilities {q_t^i}_{i=1}^N. If t < T, set t = t + 1 and go to step 1. Otherwise stop.
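The three steps can be sketched as a generic loop (illustrative; `transition` and `loglik_obs` are user-supplied model functions, and the loop also accumulates the likelihood estimate):

```python
import numpy as np

def particle_filter(y, transition, loglik_obs, init_particles, rng):
    """Bootstrap particle filter: prediction, weighting, resampling."""
    particles = init_particles.copy()
    N = len(particles)
    loglik = 0.0
    for yt in y:
        particles = transition(particles, rng)          # Step 1: prediction
        logw = loglik_obs(yt, particles)                # Step 2: weighting
        m = logw.max()                                  # log-sum-exp for stability
        w = np.exp(logw - m)
        loglik += m + np.log(w.mean())                  # contribution to p(y^T)
        idx = rng.choice(N, size=N, p=w / w.sum())      # Step 3: resampling
        particles = particles[idx]
    return loglik, particles
```

For a model with an i.i.d. Gaussian state and Gaussian measurement error, the likelihood estimate converges to the exact Gaussian likelihood as N grows.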

Sequential Monte Carlo II: Likelihood

Use {{s_0^{t|t-1,i}, w_1^{t|t-1,i}}_{i=1}^N}_{t=1}^T to compute:

p(y^T; γ) ≃ Π_{t=1}^T (1/N) Σ_{i=1}^N p(y_t | w_1^{t|t-1,i}, y^{t-1}, s_0^{t|t-1,i}; γ)

A Trivial Application

How do we evaluate the likelihood function p(y^T | α, β, σ) of the nonlinear, nonnormal process:

s_t = α + β s_{t-1}/(1 + s_{t-1}) + w_t
y_t = s_t + v_t

where w_t ~ N(0, σ) and v_t ~ t(2), given some observables y^T = {y_t}_{t=1}^T and s_0?

1. Let s_0^{0,i} = s_0 for all i.

2. Generate N i.i.d. draws {s_0^{1|0,i}, w^{1|0,i}}_{i=1}^N from N(0, σ).

3. Evaluate p(y_1 | w^{1|0,i}, y^0, s_0^{1|0,i}) = p_{t(2)}(y_1 − α − β s_0^{1|0,i}/(1 + s_0^{1|0,i}) − w^{1|0,i}).

4. Evaluate the relative weights:

q_1^i = p_{t(2)}(y_1 − α − β s_0^{1|0,i}/(1 + s_0^{1|0,i}) − w^{1|0,i}) / Σ_{i=1}^N p_{t(2)}(y_1 − α − β s_0^{1|0,i}/(1 + s_0^{1|0,i}) − w^{1|0,i})

5. Resample with replacement N values of {s_0^{1|0,i}, w^{1|0,i}}_{i=1}^N with relative weights q_1^i. Call those sampled values {s_0^{1,i}, w^{1,i}}_{i=1}^N.

6. Go to step 2 and iterate until the end of the sample.
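The six steps above, as code (a sketch; `t2_logpdf` is the Student-t density with 2 degrees of freedom written out by hand, and the parameter values in any call are illustrative):

```python
import numpy as np

def t2_logpdf(x):
    # log density of a Student-t with 2 degrees of freedom:
    # f(x) = (1 / (2*sqrt(2))) * (1 + x^2/2)^(-3/2)
    return np.log(1.0 / (2.0 * np.sqrt(2.0))) - 1.5 * np.log1p(x ** 2 / 2.0)

def toy_particle_filter(y, alpha, beta, sigma, s0, N, seed=0):
    rng = np.random.default_rng(seed)
    s = np.full(N, float(s0))                 # step 1: s_0^i = s_0 for all i
    loglik = 0.0
    for yt in y:
        w = sigma * rng.standard_normal(N)    # step 2: draw w_t^i from N(0, sigma)
        s = alpha + beta * s / (1.0 + s) + w  # propagate the state
        logq = t2_logpdf(yt - s)              # steps 3-4: evaluate the weights
        m = logq.max()
        q = np.exp(logq - m)
        loglik += m + np.log(q.mean())        # running likelihood estimate
        s = s[rng.choice(N, size=N, p=q / q.sum())]   # step 5: resample
    return loglik
```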

A Law of Large Numbers

A law of large numbers delivers:

p(y_1 | y^0, α, β, σ) ≃ (1/N) Σ_{i=1}^N p(y_1 | w^{1|0,i}, y^0, s_0^{1|0,i})

and consequently:

p(y^T | α, β, σ) ≃ Π_{t=1}^T (1/N) Σ_{i=1}^N p(y_t | w^{t|t-1,i}, y^{t-1}, s_0^{t|t-1,i})

Comparison with Alternative Schemes

Deterministic algorithms: Extended Kalman Filter and derivations (Jazwinski, 1973), Gaussian sum approximations (Alspach and Sorenson, 1972), grid-based filters (Bucy and Senne, 1974), Jacobian of the transform (Miranda and Rui, 1997). Tanizaki (1996).

Simulation algorithms: Kitagawa (1987), Gordon, Salmond and Smith (1993), Mariano and Tanizaki (1995) and Geweke and Tanizaki (1999).

A Real Application: the Stochastic Neoclassical Growth Model

Standard model.

Isn't the model nearly linear?

Yes, but:
1. Better to begin with something easy.
2. We will learn something nevertheless.

The Model

Representative agent with utility function U = E_0 Σ_{t=0}^∞ β^t (c_t^θ (1 − l_t)^{1−θ})^{1−τ} / (1 − τ).

One good produced according to y_t = e^{z_t} A k_t^α l_t^{1−α} with α ∈ (0, 1).

Productivity evolves z_t = ρ z_{t-1} + ε_t, |ρ| < 1 and ε_t ~ N(0, σ_ε).

Law of motion for capital: k_{t+1} = i_t + (1 − δ) k_t.

Resource constraint: c_t + i_t = y_t.

Solve for c(·, ·) and l(·, ·) given initial conditions.

Characterized by:

U_c(t) = β E_t [ U_c(t+1) (1 + α A e^{z_{t+1}} k_{t+1}^{α−1} l(k_{t+1}, z_{t+1})^{1−α} − δ) ]

(1 − θ)/θ · c(k_t, z_t)/(1 − l(k_t, z_t)) = (1 − α) e^{z_t} A k_t^α l(k_t, z_t)^{−α}

A system of functional equations with no known analytical solution.

Solving the Model

We need to use a numerical method to solve it.

Different nonlinear approximations: value function iteration, perturbation, projection methods.

We use a Finite Element Method. Why? Aruoba, Fernández-Villaverde and Rubio-Ramírez (2003):

1. Speed: sparse system.
2. Accuracy: flexible grid generation.
3. Scalable.

Building the Likelihood Function

Time series:
1. Quarterly real output, hours worked and investment.
2. Main series from the model and keep dimensionality low.

Measurement error. Why?

γ = (θ, ρ, τ, α, δ, β, σ_ε, σ_1, σ_2, σ_3)

State Space Representation

Let λ_t = tanh(z_t). Then, with S_t = (k_t, λ_t):

k_t = f_1(S_{t-1}, W_t; γ)
= e^{tanh^{-1}(λ_{t-1})} k_{t-1}^α l(k_{t-1}, tanh^{-1}(λ_{t-1}); γ)^{1−α} + (1 − δ) k_{t-1} − c(k_{t-1}, tanh^{-1}(λ_{t-1}); γ)

λ_t = f_2(S_{t-1}, W_t; γ) = tanh(ρ tanh^{-1}(λ_{t-1}) + ε_t)

gdp_t = g_1(S_t, V_t; γ) = e^{tanh^{-1}(λ_t)} k_t^α l(k_t, tanh^{-1}(λ_t); γ)^{1−α} + V_{1,t}

hours_t = g_2(S_t, V_t; γ) = l(k_t, tanh^{-1}(λ_t); γ) + V_{2,t}

inv_t = g_3(S_t, V_t; γ) = e^{tanh^{-1}(λ_t)} k_t^α l(k_t, tanh^{-1}(λ_t); γ)^{1−α} − c(k_t, tanh^{-1}(λ_t); γ) + V_{3,t}

Likelihood Function

Since our measurement equation implies that:

p(y_t | S_t; γ) = (2π)^{-3/2} |Λ|^{-1/2} e^{-ω(S_t; γ)/2}

where ω(S_t; γ) = (y_t − x(S_t; γ))' Λ^{-1} (y_t − x(S_t; γ)) for all t, we have:

p(y^T; γ) = (2π)^{-3T/2} |Λ|^{-T/2} Π_{t=1}^T ∫ e^{-ω(S_t; γ)/2} p(S_t | y^{t-1}, S_0; γ) dS_t
≃ (2π)^{-3T/2} |Λ|^{-T/2} Π_{t=1}^T (1/N) Σ_{i=1}^N e^{-ω(s_t^i; γ)/2}

Priors for the Parameters

Priors for the Parameters of the Model

Parameter   Distribution   Hyperparameters
θ           Uniform        0, 1
ρ           Uniform        0, 1
τ           Uniform        0, 100
α           Uniform        0, 1
δ           Uniform        0, 0.05
β           Uniform        0.75, 1
σ_ε         Uniform        0, 0.1
σ_1         Uniform        0, 0.1
σ_2         Uniform        0, 0.1
σ_3         Uniform        0, 0.1

Likelihood-Based Inference I: a Bayesian Perspective

Define priors over parameters: truncated uniforms.

Use a random-walk Metropolis-Hastings to draw from the posterior.

Find the marginal likelihood.

Likelihood-Based Inference II: a Maximum Likelihood Perspective

We only need to maximize the likelihood.

Difficulties in maximizing with Newton-type schemes.

Common problem in dynamic equilibrium economies.

We use a simulated annealing scheme.

An Exercise with Artificial Data

First simulate data with our model and use that data as sample.

Pick true parameter values: benchmark calibration values for the stochastic neoclassical growth model (Cooley and Prescott, 1995).

Calibrated Parameters
Parameter   θ       ρ      τ     α     δ      β      σ_ε     σ_1       σ_2      σ_3
Value       0.357   0.95   2.0   0.4   0.02   0.99   0.007   1.58e-4   0.0011   8.66e-4

Sensitivity: τ = 50 and σ_ε = 0.035.

Figure 5.1: Likelihood Function, Benchmark Calibration. [Figure: likelihood cuts at each parameter, nonlinear versus linear filter, with pseudotrue values marked.]

Figure 5.2: Posterior Distribution, Benchmark Calibration. [Figure: posterior histograms for each parameter.]

Figure 5.3: Likelihood Function, Extreme Calibration. [Figure: likelihood cuts at each parameter, nonlinear versus linear filter, with pseudotrue values marked.]

Figure 5.4: Posterior Distribution, Extreme Calibration. [Figure: posterior histograms for each parameter.]

Figure 5.5: Convergence of Posteriors, Extreme Calibration. [Figure: evolution of the posterior draws for each parameter along the chain.]

Figure 5.6: Posterior Distribution, Real Data. [Figure: posterior histograms for each parameter.]

Figure 6.1: Likelihood Function. [Figure: transversal cuts of the likelihood, comparing the exact evaluation with particle filter evaluations using 100, 1000, and 10000 particles.]

Figure 6.2: C.D.F., Benchmark Calibration. [Figure: empirical c.d.f.s of the likelihood estimates for 10000 through 60000 particles.]

Figure 6.3: C.D.F., Extreme Calibration. [Figure: empirical c.d.f.s of the likelihood estimates for 10000 through 60000 particles.]

Figure 6.4: C.D.F., Real Data. [Figure: empirical c.d.f.s of the likelihood estimates for 10000 through 60000 particles.]

Posterior Distributions, Benchmark Calibration

Parameter   Mean      s.d.
θ           0.357     6.72e-5
ρ           0.950     3.40e-4
τ           2.000     6.78e-4
α           0.400     8.60e-5
δ           0.020     1.34e-5
β           0.989     1.54e-5
σ_ε         0.007     9.29e-6
σ_1         1.58e-4   5.75e-8
σ_2         1.12e-3   6.44e-7
σ_3         8.64e-4   6.49e-7

Maximum Likelihood Estimates, Benchmark Calibration

Parameter   MLE       s.d.
θ           0.357     8.19e-6
ρ           0.950     0.001
τ           2.000     0.020
α           0.400     2.02e-6
δ           0.020     0.002
β           0.990     2.07e-5
σ_ε         0.007     1.00e-6
σ_1         1.58e-4   0.004
σ_2         1.12e-3   0.007
σ_3         8.63e-4   0.005

Posterior Distributions, Extreme Calibration

Parameter   Mean      s.d.
θ           0.357     7.19e-4
ρ           0.950     1.88e-4
τ           50.00     7.12e-3
α           0.400     4.80e-5
δ           0.020     3.52e-6
β           0.989     8.69e-6
σ_ε         0.035     4.47e-6
σ_1         1.58e-4   1.87e-8
σ_2         1.12e-3   2.14e-7
σ_3         8.65e-4   2.33e-7

Maximum Likelihood Estimates, Extreme Calibration

Parameter   MLE       s.d.
θ           0.357     2.42e-6
ρ           0.950     6.12e-3
τ           50.000    0.022
α           0.400     3.62e-7
δ           0.019     7.43e-6
β           0.990     1.00e-5
σ_ε         0.035     0.015
σ_1         1.58e-4   0.017
σ_2         1.12e-3   0.014
σ_3         8.66e-4   0.023

Convergence on Number of Particles

Convergence, Real Data
N       Mean       s.d.
10000   1014.558   0.3296
20000   1014.600   0.2595
30000   1014.653   0.1829
40000   1014.666   0.1604
50000   1014.688   0.1465
60000   1014.664   0.1347

Posterior Distributions, Real Data

Parameter   Mean    s.d.
θ           0.323   7.976e-4
ρ           0.969   0.008
τ           1.825   0.011
α           0.388   0.001
δ           0.006   3.557e-5
β           0.997   9.221e-5
σ_ε         0.023   2.702e-4
σ_1         0.039   5.346e-4
σ_2         0.018   4.723e-4
σ_3         0.034   6.300e-4

Maximum Likelihood Estimates, Real Data

Parameter   MLE     s.d.
θ           0.390   0.044
ρ           0.987   0.708
τ           1.781   1.398
α           0.324   0.019
δ           0.006   0.160
β           0.997   8.67e-3
σ_ε         0.023   0.224
σ_1         0.038   0.060
σ_2         0.016   0.061
σ_3         0.035   0.076

Log Marginal Likelihood Difference: Nonlinear minus Linear

p     Benchmark Calibration   Extreme Calibration   Real Data
0.1   73.631                  117.608               93.65
0.5   73.627                  117.592               93.55
0.9   73.603                  117.564               93.55

Nonlinear versus Linear Moments, Real Data

           Real Data        Nonlinear (SMC filter)   Linear (Kalman filter)
           Mean    s.d.     Mean    s.d.             Mean    s.d.
output     1.95    0.073    1.91    0.129            1.61    0.068
hours      0.36    0.014    0.36    0.023            0.34    0.004
inv        0.42    0.066    0.44    0.073            0.28    0.044

A Future Application: Good Luck or Good Policy?

The U.S. economy has become less volatile over the last 20 years (Stock and Watson, 2002).

Why?
1. Good luck: Sims (1999), Bernanke and Mihov (1998a and 1998b) and Stock and Watson (2002).
2. Good policy: Clarida, Gertler and Galí (2000), Cogley and Sargent (2001 and 2003), De Long (1997) and Romer and Romer (2002).
3. Long run trend: Blanchard and Simon (2001).

How Has the Literature Addressed this Question?

So far: mostly with reduced form models (usually VARs).

But:
1. Results difficult to interpret.
2. How to run counterfactuals?
3. Welfare analysis.

Why Not a Dynamic Equilibrium Model?

New generation equilibrium models: Christiano, Eichenbaum and Evans (2003) and Smets and Wouters (2003).

Linear and Normal.

But we can do it!

Environment

Discrete time t = 0, 1, ...

Stochastic process s_t ∈ S with history s^t = (s_0, ..., s_t) and probability μ(s^t).

The Final Good Producer

Perfectly competitive final good producer that solves:

max_{y_i(s^t)}  y(s^t) − ∫_0^1 p_i(s^t) y_i(s^t) di

where y(s^t) = (∫_0^1 y_i(s^t)^{1/(1+η)} di)^{1+η}.

Demand function for each input of the form:

y_i(s^t) = (p_i(s^t)/p(s^t))^{-(1+η)/η} y(s^t)

with price aggregator:

p(s^t) = (∫_0^1 p_i(s^t)^{-1/η} di)^{-η}

The Intermediate Good Producer

Continuum of intermediate good producers, each one behaving as a monopolistic competitor.

The producer of good i has access to the technology:

y_i(s^t) = max{ e^{z(s^t)} k_i(s^{t-1})^α l_i(s^t)^{1−α} − φ, 0 }

where φ is a fixed cost of production.

Productivity z(s^t) = ρ z(s^{t-1}) + ε_z(s^t).

Calvo pricing with indexing. Probability of changing prices (before observing current period shocks): 1 − θ_p.

Consumer's Problem

max E Σ_{t=0}^∞ Σ_{s^t} β^t [ (c(s^t) − d c(s^{t-1}))^{1−γ_c}/(1−γ_c) − ψ_l l(s^t)^{1+γ_l}/(1+γ_l) + ψ_m m(s^t)^{1−γ_m}/(1−γ_m) ]

subject to the budget constraint:

p(s^t)(c(s^t) + x(s^t)) + M(s^t) + ∫_{s^{t+1}} q(s^{t+1}|s^t) B(s^{t+1}) ds^{t+1}
= p(s^t)(w(s^t) l(s^t) + r(s^t) k(s^{t-1})) + M(s^{t-1}) + B(s^t) + Π(s^t) + T(s^t)

and the law of motion for capital with investment adjustment costs:

k(s^t) = (1 − δ) k(s^{t-1}) + (1 − V(x(s^t)/x(s^{t-1}))) x(s^t)

Government Policy

Monetary policy: Taylor rule

i(s^t) = r π_g(s^t) + a(s^t)(π(s^t) − π_g(s^t)) + b(s^t)(y(s^t) − y_g(s^t)) + ε_i(s^t)

π_g(s^t) = π_g(s^{t-1}) + ε_π(s^t)
a(s^t) = a(s^{t-1}) + ε_a(s^t)
b(s^t) = b(s^{t-1}) + ε_b(s^t)

Fiscal policy.

Stochastic Volatility I

We can stack all shocks in one vector:

ε(s^t) = (ε_z(s^t), ε_c(s^t), ε_l(s^t), ε_m(s^t), ε_i(s^t), ε_π(s^t), ε_a(s^t), ε_b(s^t))'

Stochastic volatility:

ε(s^t) = R(s^t)^{0.5} ϑ(s^t)

The matrix R(s^t) can be decomposed as:

R(s^t) = G(s^t)^{-1} H(s^t) G(s^t)^{-1}'

Stochastic Volatility II

H(s^t) (instantaneous shock variances) is diagonal with nonzero elements h_i(s^t) that evolve:

log h_i(s^t) = log h_i(s^{t-1}) + ς_i η_i(s^t)

G(s^t) (loading matrix) is lower triangular, with unit entries in the diagonal and entries ω_ij(s^t) that evolve:

ω_ij(s^t) = ω_ij(s^{t-1}) + ξ_ij η_ij(s^t)
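A minimal simulation of these two recursions (a sketch; every numeric value here is assumed for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
T, n = 200, 3
varsigma = 0.05                          # std of the volatility innovations (assumed)

# log h_i follows a driftless random walk:
log_h = np.zeros((T, n))
for t in range(1, T):
    log_h[t] = log_h[t - 1] + varsigma * rng.standard_normal(n)

# Build the period-T covariance R = G^{-1} H G^{-1}' with assumed loadings:
H = np.diag(np.exp(log_h[-1]))
Gm = np.eye(n)                           # unit lower triangular loading matrix
Gm[1, 0], Gm[2, 0], Gm[2, 1] = 0.2, -0.1, 0.3
Ginv = np.linalg.inv(Gm)
R = Ginv @ H @ Ginv.T
print(np.allclose(R, R.T))               # a valid symmetric covariance matrix
```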

Where Are We Now?

Solving the model: a problem with 45 state variables: physical capital, the aggregate price level, the 7 shocks, the 8 elements of the matrix H(s^t), and the 28 elements of the matrix G(s^t).

Perturbation.

We are making good progress.
