Fall Semester 05-06

Akila Weerapana
Lecture 22: Dynamic Optimization in Discrete Time
I. INTRODUCTION
The last lecture served as a general introduction to dynamic optimization problems. We looked at how we can use the Lagrange multiplier method for solving simple dynamic optimization problems.

Today's lecture focuses on solving dynamic optimization problems without resorting to the sometimes cumbersome Lagrange multiplier technique. Keep in mind that the Lagrange multiplier method can always be used to solve a dynamic problem; we are simply looking for a more tractable, and more powerful, approach.

The technique we use is known as the Bellman equation. The Bellman equation is a recursive representation of a maximization decision; in other words, it represents a maximization decision as a function of a smaller maximization decision.
II. BELLMAN EQUATIONS
Consider a very general T-period optimization decision faced by an agent today, which we call period 1:

$$\max_{x_1, x_2, \ldots, x_T} \sum_{t=1}^{T} \beta^{t-1} f(x_t, A_{t-1})$$

subject to the constraints $A_t = g(x_t, A_{t-1})$ for $t = 1, 2, \ldots, T$, where $A_0$ and $A_T$ are given to us.
In this type of optimization, we can identify several variables that will be useful in formulating the Bellman equation for the problem.

$f(x_t, A_{t-1})$ is the objective function: the present discounted value of the sum of the objective functions in each time period is our maximization objective.
$x_t$ is the choice variable(s): the variable(s) whose time path we are choosing in order to maximize the PDV of the sum of the objective functions.
$A_{t-1}$ is known as the state variable: a stock variable that the agent inherits from the past at time t, which is affected by her choice at time t, and which she passes on to the next period. Note that depending on the setup of the problem, the $A$ variable may have a time $t$ subscript instead of a time $t-1$ subscript; examples can be seen later.
$g(x_t, A_{t-1})$ is known as the transition equation; it describes how the choice variable at time t affects the state variable inherited from time $t-1$ to determine the value of the state variable we pass on to time $t+1$.
The centerpiece of the Bellman equation is a function known as the value function. The
value function denotes the maximized value of the objective function from time t onwards.
The value of V at any given period in time depends on the value of the state variable at that
time since that is what the decision maker inherits from the past.
So the definition of the value function in the general problem defined above is

$$V_1(A_0) = \max_{x_1, x_2, \ldots, x_T} \sum_{t=1}^{T} \beta^{t-1} f(x_t, A_{t-1}) \quad \text{subject to } A_t = g(x_t, A_{t-1}) \text{ for } t = 1, \ldots, T$$
We can think of formulating the Bellman equation in a recursive manner by thinking of the dynamic decision as follows. Picking a value for the choice variable today (say $x_1$), given an initial value for the stock variable $A_0$, has two effects: it directly affects today's objective function (through $f(x_1, A_0)$), but it also affects the optimal decisions for next period's choice variables (by changing $A_1$).
Thus, in picking $x_1$ we have to be concerned about two things: i) what the choice of $x_1$ means for the current period's objective function, and ii) how the choice of $x_1$ affects the best we can do after today if we pass on $A_1 = g(x_1, A_0)$.
Since we defined $V_1(A_0)$ as the maximized value of the objective function at time 1 given an initial stock $A_0$, $V_2(A_1)$ is the maximized value of the objective function at time 2 given an initial stock $A_1$. In other words, our maximization decision can be simplified greatly in recursive form using the Bellman equation as

$$V_1(A_0) = \max_{x_1} \left[ f(x_1, A_0) + \beta V_2(A_1) \right] \quad \text{where } A_1 = g(x_1, A_0)$$
The same recursive definition applies to $V_2$ as a function of $V_3$, and so on. In general, the decision of the individual at any point in time t can be written as

$$V_t(A_{t-1}) = \max_{x_t} \left[ f(x_t, A_{t-1}) + \beta V_{t+1}(A_t) \right] \quad \text{where } A_t = g(x_t, A_{t-1})$$
This recursive definition is enormously useful in solving discrete time dynamic optimization problems. We do not have to carry around many different choice variables and Lagrange multipliers; the problem is reduced to a simple equation that is solved in two steps.

We will write down Bellman equations for a variety of optimization problems and then demonstrate how to solve them.
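Before turning to specific applications, it helps to see the recursion at work numerically. The following is a minimal backward-induction sketch in Python: it starts from $V_{T+1} \equiv 0$ and works backwards through $V_t(A) = \max_x [f(x, A) + \beta V_{t+1}(g(x, A))]$ on a grid of state values. The functional forms $f(x, A) = 2\sqrt{x}$ and $g(x, A) = A - x$ (eating down a fixed stock), the grid, and the parameter values are illustrative assumptions, chosen to preview the cake-eating problem below.

    # Sketch: finite-horizon backward induction on a grid, with
    # illustrative choices f(x, A) = 2*sqrt(x) and g(x, A) = A - x.
    import numpy as np

    beta, T = 0.95, 10
    grid = np.linspace(1e-6, 1.0, 501)      # grid of state values A

    V_next = np.zeros_like(grid)            # V_{T+1} = 0: nothing after period T
    policies = []
    for t in range(T, 0, -1):               # work backwards from t = T to t = 1
        V = np.empty_like(grid)
        x_star = np.empty_like(grid)
        for i, A in enumerate(grid):
            x = grid[grid <= A]             # feasible choices: 0 < x <= A
            A_next = A - x                  # transition g(x, A)
            values = 2*np.sqrt(x) + beta*np.interp(A_next, grid, V_next)
            j = np.argmax(values)
            V[i], x_star[i] = values[j], x[j]
        V_next = V                          # this V_t is next pass's V_{t+1}
        policies.append(x_star)

    print("optimal period-1 choice starting from A_0 = 1:", policies[-1][-1])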
III. APPLICATIONS OF BELLMAN EQUATIONS
Utility Maximization
Let's return to the first problem, the multi-period consumer optimization decision
$$\max_{C_1, \ldots, C_T} \sum_{t=1}^{T} \beta^{t-1} U(C_t)$$

where $A_t = (1+r)A_{t-1} + Y_t - C_t$ for $t = 1, 2, \ldots, T$ and $A_0$, $A_T$ are given.
The choice variable here is consumption, $C_t$, while the state variable is assets, $A_t$. The objective function is the utility function; the transition equation simply states that income minus consumption either adds to or subtracts from the principal and interest on assets inherited from the past to determine next period's assets.
We will define the maximization problem faced at time t using the Bellman equation as

$$V_t(A_{t-1}) = \max_{C_t} \left[ U(C_t) + \beta V_{t+1}(A_t) \right] \quad \text{where } A_t = (1+r)A_{t-1} + Y_t - C_t$$
Since the form of the utility function does not depend on time, we can drop the $t$ subscript on the value function and write $V(A_t)$ instead of $V_t(A_t)$.
The next task is to find the solution using the Bellman equation. Finding the solution to a Bellman equation involves two steps. First, we take the traditional FOC with respect to the choice variable, in this case $C_t$. The FOC will be

$$U'(C_t) + \beta V'(A_t) \frac{\partial A_t}{\partial C_t} = 0 \implies U'(C_t) - \beta V'(A_t) = 0$$

since $\partial A_t / \partial C_t = -1$.
In order to solve this FOC, we need to know the value of $V'(A_t)$. We obtain it by taking the derivative of the value function with respect to the state variable, $A_{t-1}$. The envelope theorem says that we can ignore the impact of changing $A$ on our choice variable in calculating that derivative.
This results in the following equation:

$$V'(A_{t-1}) = \beta V'(A_t) \frac{\partial A_t}{\partial A_{t-1}} \implies V'(A_{t-1}) = \beta(1+r) V'(A_t)$$
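To see why the envelope theorem lets us ignore the indirect effect through the choice variable, here is the omitted step written out (a sketch, using $A_t = (1+r)A_{t-1} + Y_t - C_t^*$, where $C_t^* = C_t^*(A_{t-1})$ denotes the optimal choice):

$$V'(A_{t-1}) = U'(C_t^*)\frac{\partial C_t^*}{\partial A_{t-1}} + \beta V'(A_t)\left[(1+r) - \frac{\partial C_t^*}{\partial A_{t-1}}\right] = \underbrace{\left[U'(C_t^*) - \beta V'(A_t)\right]}_{=\,0 \text{ by the FOC}}\frac{\partial C_t^*}{\partial A_{t-1}} + \beta(1+r)V'(A_t)$$

The FOC zeroes out the term involving $\partial C_t^*/\partial A_{t-1}$, leaving exactly the envelope condition above.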
From the FOC we know that $V'(A_{t-1}) = U'(C_{t-1})/\beta$ and $\beta(1+r)V'(A_t) = (1+r)U'(C_t)$. Plugging these into the envelope condition we get

$$U'(C_{t-1}) = \beta(1+r) U'(C_t)$$
This equation is more commonly written, shifting everything one period forward, as

$$U'(C_t) = \beta(1+r) U'(C_{t+1})$$

This modified FOC, which relates consumption in one period to consumption in the next period (more broadly speaking, the choice variable in one period to the choice variable in subsequent periods), is known as the Euler equation. The Euler equation states that at the optimal choices, we cannot gain utility by making a feasible switch of consumption from one period to the next.
What is a feasible switch in consumption? For example, by consuming 1 unit less in period t we will have $(1+r)$ units more to consume in period $t+1$. If we lower our consumption by 1 unit at time t, the net impact on our maximized utility is $-U'(C_t)$. Increasing our consumption by $(1+r)$ units next period will have a net impact on our maximized utility of $(1+r)U'(C_{t+1})$. Since this gain is in the next period, we need to discount it, which gives us $\beta(1+r)U'(C_{t+1})$.
At the optimal point, the discounted gain from any feasible reallocation is zero, so we get

$$\beta(1+r)U'(C_{t+1}) - U'(C_t) = 0 \implies \beta(1+r)U'(C_{t+1}) = U'(C_t)$$
This holds for all $T-1$ pairs of consecutive periods. If we had a particular functional form, say $U(C_t) = 2\sqrt{C_t}$, we can show that the (lagged) Euler equation implies that $C_t = [\beta(1+r)]^2 C_{t-1}$. This difference equation, combined with the budget constraint $A_t = (1+r)A_{t-1} + Y_t - C_t$ and the initial and terminal conditions $A_0 = 0$ and $A_T = 0$, forms a two-variable system of difference equations which can be solved to find the time path of consumption.
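As a concrete illustration, here is a minimal Python sketch of that solution. The Euler equation gives $C_t = [\beta(1+r)]^2 C_{t-1}$, so the whole consumption path is pinned down by $C_1$, which we back out from the lifetime budget constraint implied by $A_0 = A_T = 0$. The values of $\beta$, $r$, $T$, and the income path are illustrative assumptions, not taken from the lecture.

    # Sketch: consumption path under U(C) = 2*sqrt(C), assuming
    # beta = 0.96, r = 0.05, T = 20, constant income (illustrative values).
    import numpy as np

    beta, r, T = 0.96, 0.05, 20
    Y = np.full(T, 10.0)                    # income stream Y_1, ..., Y_T
    g = (beta*(1 + r))**2                   # Euler growth factor: C_t = g*C_{t-1}

    t = np.arange(1, T + 1)
    disc = (1 + r)**(-t)                    # discount factors (1+r)^{-t}
    # Budget with A_0 = A_T = 0: PV of consumption = PV of income
    C1 = (disc @ Y) / (disc @ g**(t - 1))
    C = C1 * g**(t - 1)                     # full consumption path

    A = 0.0                                 # simulate assets forward from A_0 = 0
    for Yt, Ct in zip(Y, C):
        A = (1 + r)*A + Yt - Ct
    print("terminal assets A_T (should be ~0):", A)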
Cake Eating Problem
Let's consider a classic dynamic optimization problem, known as the cake-eating problem. This is a simplified version of the consumption problem defined earlier. Define $\Omega_t$ to be the size of a cake at time t and assume that the utility function is $U(C_t) = 2\sqrt{C_t}$. The problem is

$$\max_{C_1, \ldots, C_T} \sum_{t=1}^{T} \beta^{t-1} U(C_t)$$

where $\Omega_t = \Omega_{t-1} - C_t$ for $t = 1, 2, \ldots, T$ and $\Omega_0$, $\Omega_T$ are given.
The choice variable here is consumption, $C_t$, while the state variable is the size of the cake, $\Omega_t$. The objective function is the utility function; the transition equation simply states that consumption at time t reduces the size of the cake that is passed on to the next period.
We will define the maximization problem faced at time t using the Bellman equation as

$$V(\Omega_{t-1}) = \max_{C_t} \left[ 2\sqrt{C_t} + \beta V(\Omega_t) \right] \quad \text{where } \Omega_t = \Omega_{t-1} - C_t$$
The FOC will be

$$\frac{1}{\sqrt{C_t}} + \beta V'(\Omega_t)(-1) = 0 \implies \frac{1}{\sqrt{C_t}} = \beta V'(\Omega_t)$$
The envelope condition is

$$V'(\Omega_{t-1}) = \beta V'(\Omega_t)$$
From the FOC we know that $V'(\Omega_{t-1}) = \frac{1}{\beta\sqrt{C_{t-1}}}$ and $V'(\Omega_t) = \frac{1}{\beta\sqrt{C_t}}$. Plugging these into the envelope condition we get

$$\sqrt{C_t} = \beta \sqrt{C_{t-1}}$$

This equation is more commonly written as

$$\sqrt{C_{t+1}} = \beta \sqrt{C_t}$$
This is the Euler equation for the model. The Euler equation states that at the optimum
choices, we cannot gain utility by making a feasible switch of consumption from one period
to the next.
What is a feasible switch in consumption? For example, by consuming 1 unit less in period t we will have 1 unit more to consume in period $t+1$. If we lower our consumption by 1 unit at time t, the net impact on our maximized utility is $-U'(C_t) = -\frac{1}{\sqrt{C_t}}$. Increasing our consumption by 1 unit next period will have a net impact on our maximized utility of $U'(C_{t+1}) = \frac{1}{\sqrt{C_{t+1}}}$. Since this gain is in the next period, we need to discount it, which gives us $\frac{\beta}{\sqrt{C_{t+1}}}$.
At the optimal point, the discounted gain from any feasible reallocation is zero, so we get

$$\frac{\beta}{\sqrt{C_{t+1}}} - \frac{1}{\sqrt{C_t}} = 0$$

This difference equation (lagged one period), combined with the budget constraint $\Omega_t = \Omega_{t-1} - C_t$ and the initial and terminal conditions $\Omega_0$ and $\Omega_T$, forms a two-variable system of difference equations which can be solved to find the time path of consumption.
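A quick numerical sketch of this solution (with illustrative values of $\beta$, $T$, and the cake sizes, not taken from the lecture): the Euler equation implies $C_t = \beta^{2(t-1)} C_1$, and requiring total consumption to equal $\Omega_0 - \Omega_T$ pins down $C_1$ through a geometric sum.

    # Sketch: cake-eating path under U(C) = 2*sqrt(C) with no interest;
    # beta, T, Omega_0, Omega_T below are illustrative choices.
    import numpy as np

    beta, T = 0.95, 15
    Omega0, OmegaT = 1.0, 0.0               # start with a whole cake, end with none

    t = np.arange(1, T + 1)
    weights = beta**(2*(t - 1))             # C_t = beta^{2(t-1)} * C_1
    C1 = (Omega0 - OmegaT) / weights.sum()  # total consumption must use up the cake
    C = C1 * weights                        # declining consumption path

    Omega = Omega0 - np.cumsum(C)           # cake remaining after each period
    print("remaining cake Omega_T (should be ~0):", Omega[-1])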
Profit Maximization

The second problem was the multi-period firm profit maximization decision
$$\max_{I_1, \ldots, I_T, L_1, \ldots, L_T} \sum_{t=1}^{T} \left( \frac{1}{1+r} \right)^t \left[ F(K_t, L_t) - w_t L_t - p_t I_t \right]$$

subject to the constraints $K_{t+1} = (1-\delta)K_t + I_t$ for $t = 1, 2, \ldots, T$, where $K_1$ and $K_T$ are given.
A couple of differences from the previous example: the state variable here is $K_t$. It has a $t$ subscript because that is what we inherit from the past (i.e., it is the value of the variable at the beginning of time t). There are now TWO choice variables, $I_t$ and $L_t$.
The recursive Bellman equation is

$$V_t(K_t) = \max_{I_t, L_t} \left[ F(K_t, L_t) - w_t L_t - p_t I_t + \left( \frac{1}{1+r} \right) V_{t+1}(K_{t+1}) \right] \quad \text{where } K_{t+1} = (1-\delta)K_t + I_t$$
Once again, the maximized value of profits from time t onwards can be recursively defined as the optimal value(s) of the choice variables at time t, taking into account the direct impact of these variables on the objective function today and the indirect effect on the state variable, which affects the optimal choices that can be made next period. Here $w_t$ is the real wage and $p_t$ is the real price of investment goods.
Note that we have to keep the subscript t on the value function here because the profit function changes depending on the wage prevailing at that time. So two different time periods will have two different wages, and therefore two different levels of profits, even if K, L and I were the same. In the previous example, the value function did not have a t subscript because the utility function did not have any parameters that varied with time.
The FOCs of this model are

$$\frac{\partial F}{\partial L_t} - w_t = 0$$

$$-p_t + \left( \frac{1}{1+r} \right) V'_{t+1}(K_{t+1}) \frac{\partial K_{t+1}}{\partial I_t} = -p_t + \left( \frac{1}{1+r} \right) V'_{t+1}(K_{t+1}) = 0$$

where we use the fact that $\partial K_{t+1}/\partial I_t = 1$.
The first equation says that the marginal product of labor is set equal to its marginal cost. Since labor hiring is not dynamic, we get the same condition as in the static case.
The final step is to take the derivative with respect to the state variable so that we can eliminate the $V'$ terms. This results in the following equation:

$$V'_t(K_t) = \frac{\partial F}{\partial K_t} + \left( \frac{1}{1+r} \right) V'_{t+1}(K_{t+1}) \frac{\partial K_{t+1}}{\partial K_t} = \frac{\partial F}{\partial K_t} + \left( \frac{1}{1+r} \right)(1-\delta) V'_{t+1}(K_{t+1})$$
From the FOC we know that $\left( \frac{1}{1+r} \right) V'_{t+1}(K_{t+1}) = p_t$, i.e. $V'_{t+1}(K_{t+1}) = (1+r)p_t$, and hence, lagging one period, $V'_t(K_t) = (1+r)p_{t-1}$. Putting these two conditions together with the envelope condition, we get the Euler equation

$$(1+r)p_{t-1} = \frac{\partial F}{\partial K_t} + p_t(1-\delta)$$
We can move this one period forward and rearrange as

$$p_t = \frac{1}{1+r} \left[ \frac{\partial F}{\partial K_{t+1}} + p_{t+1}(1-\delta) \right]$$
This is the Euler equation for this problem: it relates investment in one period to investment in the next period. Recall that the Euler equation reflects the core intuition that at the optimum choices, we cannot gain any profits by making a feasible switch of investment from one period to the next.
What is a feasible switch in investment? If we invest 1 unit more in period t, we can invest $(1-\delta)$ units less in period $t+1$ and leave every other period's investment path unchanged. If we increase investment by 1 unit at time t, the net impact on our maximized profits is $-p_t + \left( \frac{1}{1+r} \right) \frac{\partial F}{\partial K_{t+1}}$. The first term is the cost of buying 1 more unit of the investment good; the second is the additional profit that we can make in the next period as a result of increasing our capital stock by 1 unit via investment.
If we decrease investment next period by $(1-\delta)$ units, it will have a net impact on our maximized profits of $(1-\delta)p_{t+1}$. Since this is in the next period, we need to discount it, which gives us $\left( \frac{1}{1+r} \right)(1-\delta)p_{t+1}$.
At the optimal point, the discounted gain from any feasible reallocation is zero, so we get

$$-p_t + \left( \frac{1}{1+r} \right) \frac{\partial F}{\partial K_{t+1}} + \left( \frac{1}{1+r} \right)(1-\delta)p_{t+1} = 0,$$

which is the Euler equation.
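As a check on this intuition, here is a small Python sketch. Under the illustrative assumptions $F(K, L) = K^\alpha L^{1-\alpha}$ with labor fixed at $L = 1$, a constant investment-goods price $p$, and constant $r$ and $\delta$ (none of these taken from the lecture), the Euler equation reduces to $F_K(K_{t+1}) = p(r+\delta)$; the code solves for $K_{t+1}$ and verifies that the feasible switch described above has no first-order effect on discounted profits.

    # Sketch: verify the investment Euler equation numerically under
    # illustrative assumptions (Cobb-Douglas F, L = 1, constant p, r, delta).
    alpha, r, delta, p = 0.3, 0.05, 0.1, 1.0

    def FK(K):
        # marginal product of capital for F(K, L) = K**alpha * L**(1-alpha), L = 1
        return alpha * K**(alpha - 1)

    # Euler with constant p: (1+r)*p = F_K(K_{t+1}) + (1-delta)*p,
    # so F_K(K_{t+1}) = p*(r + delta); invert for K_{t+1}
    K_next = (alpha / (p*(r + delta)))**(1/(1 - alpha))

    # feasible switch: invest eps more at t, (1-delta)*eps less at t+1
    eps = 1e-6
    gain = -p*eps + (FK(K_next)*eps + (1 - delta)*p*eps) / (1 + r)
    print("first-order change in discounted profits (should be ~0):", gain)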