PROGRAMMING
Dynamic Programming
It is a useful mathematical technique for making a
sequence of interrelated decisions.
Systematic procedure for determining the optimal
combinations of decisions.
There is no standard mathematical formulation of
the Dynamic Programming problem.
Knowing when to apply dynamic programming
depends largely on experience with its general
structure.
Prototype example
Stagecoach problem
A fortune seeker wants to travel from Missouri to
California in the mid-19th century.
The journey has 4 stages. A is Missouri and J is California.
The cost of a route is the cost of its life insurance policy, so the
lowest-cost route is also the safest trip.
Costs
Cost c_ij of going from state i to state j: [cost table not recovered]
Formulation
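Decision variables x_n (n = 1, 2, 3, 4) are the immediate
destination at stage n.
Route is A → x1 → x2 → x3 → x4, where x4 = J.
For a traveler in state s at stage n, f_n*(s) is the minimum of
f_n(s, x_n) over x_n, attained at x_n*:
$$f_n^*(s) = \min_{x_n} f_n(s, x_n) = f_n(s, x_n^*)$$
where
$$f_n(s, x_n) = c_{s x_n} + f_{n+1}^*(x_n)$$
is the immediate cost of stage n plus the minimum future cost of stages n + 1 onward.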
Solution procedure
When n = 4, the route is determined by its current
state s (H or I) and its final destination J.
Since f4*(s) = f4(s, J) = c_{s,J}, the solution for n = 4 is

s    f4*(s)   x4*
H    3        J
I    4        J
Stage n = 3
This stage needs a few calculations. If the fortune seeker is in state F,
he can go to either H or I, with costs cF,H = 6 and cF,I = 3.
Choosing H, the minimum additional cost is f4*(H) = 3.
Total cost is 6 + 3 = 9.
Choosing I, the total cost is 3 + 4 = 7. This is smaller, so
I is the optimal choice for state F.
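The comparison for state F can be sketched in a few lines of Python. This is only an illustration using the numbers quoted above; the variable names are hypothetical.

```python
# Stage-3 decision for state F, using only the numbers quoted in the text.
cost_from_F = {"H": 6, "I": 3}   # c_{F,H} and c_{F,I}
f4_star = {"H": 3, "I": 4}       # minimum remaining cost from H and from I

# f3(F, x3) = c_{F,x3} + f4*(x3) for each possible destination x3
f3_F = {x3: cost_from_F[x3] + f4_star[x3] for x3 in cost_from_F}
best_x3 = min(f3_F, key=f3_F.get)

print(f3_F)                     # {'H': 9, 'I': 7}
print(best_x3, f3_F[best_x3])   # I 7
```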
Stage n = 3
Similar calculations can be made for the two other possible
states, s = E and s = G. The resulting values for n = 3 are
f3*(E) = 4, f3*(F) = 7 (with x3* = I) and f3*(G) = 6.
Stage n = 2
In this case, f2(s, x2) = c_{s,x2} + f3*(x2).
Example for node C:
x2 = E: f2(C, E) = cC,E + f3*(E) = 3 + 4 = 7 (optimal).
x2 = F: f2(C, F) = cC,F + f3*(F) = 2 + 7 = 9.
x2 = G: f2(C, G) = cC,G + f3*(G) = 4 + 6 = 10.
Stage n = 2
Similar calculations can be made for the two possible
states s = B and s = D, resulting in the table for n = 2:
     f2(s, x2) = c_{s,x2} + f3*(x2)
s    x2 = E   x2 = F   x2 = G   f2*(s)   x2*
B    11       11       12       11       E or F
C    7        9        10       7        E
D    8        8        11       8        E or F
Stage n = 1
Just one possible starting state: A.
x1 = B: f1(A, B) = cA,B + f2*(B) = 2 + 11 = 13.
x1 = C: f1(A, C) = cA,C + f2*(C) = 4 + 7 = 11 (optimal).
x1 = D: f1(A, D) = cA,D + f2*(D) = 3 + 8 = 11 (optimal).

s    x1 = B   x1 = C   x1 = D   f1*(s)   x1*
A    13       11       11       11       C or D
Optimal solution
Three optimal solutions, all with f1*(A) = 11:
A → C → E → H → J
A → D → E → H → J
A → D → F → I → J
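The whole backward recursion can be sketched as follows. The link costs for A, B, C, D, F, H and I are the ones implied by the calculations and tables above. The four costs out of E and G do not appear in the text and are assumptions, chosen to agree with f3*(E) = 4 and f3*(G) = 6. All names are hypothetical.

```python
# Link costs c[i][j]. The costs out of E and G are ASSUMED (not in the text).
costs = {
    "A": {"B": 2, "C": 4, "D": 3},
    "B": {"E": 7, "F": 4, "G": 6},
    "C": {"E": 3, "F": 2, "G": 4},
    "D": {"E": 4, "F": 1, "G": 5},
    "E": {"H": 1, "I": 4},   # assumed
    "F": {"H": 6, "I": 3},
    "G": {"H": 3, "I": 3},   # assumed
    "H": {"J": 3},
    "I": {"J": 4},
}
stages = [["A"], ["B", "C", "D"], ["E", "F", "G"], ["H", "I"]]

f_star = {"J": 0}   # no cost remains once the destination is reached
best_next = {}      # all optimal immediate destinations for each state

# Work backward from the last stage to the first.
for states in reversed(stages):
    for s in states:
        totals = {x: c + f_star[x] for x, c in costs[s].items()}
        f_star[s] = min(totals.values())
        best_next[s] = [x for x, t in totals.items() if t == f_star[s]]

def routes(state):
    """Enumerate every optimal route from `state` to J."""
    if state == "J":
        return [["J"]]
    return [[state] + rest for x in best_next[state] for rest in routes(x)]

optimal_routes = routes("A")
print(f_star["A"])
for route in optimal_routes:
    print(" -> ".join(route))
```

Keeping every tied destination in `best_next` is what lets the sketch list all optimal routes rather than just one.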
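Characteristics of DP
1. The problem can be divided into stages, with a policy
decision required at each stage.
3. The policy decision transforms the current state to a state
associated with the beginning of the next stage.
4. The solution procedure finds an optimal policy for
the overall problem.
5. Given the current state, an optimal policy for the
remaining stages is independent of the policy decisions adopted in
previous stages (principle of optimality).
6. The solution procedure begins by finding the optimal
policy for the last stage.
7. A recursive relationship identifies the optimal policy for stage n,
given the optimal policy for stage n + 1. For the stagecoach problem:
$$f_n^*(s) = \min_{x_n} \left\{ c_{s x_n} + f_{n+1}^*(x_n) \right\}$$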
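7. (cont.) Notation:
N = number of stages.
n = label for current stage (n = 1, 2, ..., N).
s_n = current state for stage n.
x_n = decision variable for stage n.
x_n* = optimal value of x_n (given s_n).
f_n(s_n, x_n) = contribution of stages n, n + 1, ..., N to the objective function
if the system starts in state s_n at stage n, the immediate decision is x_n,
and optimal decisions are made thereafter.
f_n*(s_n) = f_n(s_n, x_n*).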
7. (cont.) Recursive relationship:
$$f_n^*(s_n) = \max_{x_n} f_n(s_n, x_n) \quad \text{or} \quad f_n^*(s_n) = \min_{x_n} f_n(s_n, x_n)$$
where f_n(s_n, x_n) is written in terms of s_n, x_n and f_{n+1}^*(s_{n+1}).
States to be considered
Thousands of additional person-years of life

Medical teams   Country 1   Country 2   Country 3
0               0           0           0
1               45          20          50
2               70          45          70
3               90          75          80
4               105         110         100
5               120         150         130
Overall problem
pi(xi): measure of performance for allocating xi
medical teams to country i.
$$\text{Maximize} \quad \sum_{i=1}^{3} p_i(x_i),$$
subject to
$$\sum_{i=1}^{3} x_i = 5,$$
with each x_i a nonnegative integer.
Policy
Recursive relationship relating the functions f_n*:
$$f_n^*(s_n) = \max_{x_n = 0, 1, \ldots, s_n} \left\{ p_n(x_n) + f_{n+1}^*(s_n - x_n) \right\}, \quad \text{for } n = 1, 2$$
$$f_3^*(s_3) = \max_{x_3 = 0, 1, \ldots, s_3} p_3(x_3), \quad \text{for } n = 3$$
Solution procedure
For last stage n = 3, values of p3(x3) are the last column
of table. Here, x3* = s3 and f3*(s3) = p3(s3).
n = 3:
s3   f3*(s3)   x3*
0    0         0
1    50        1
2    70        2
3    80        3
4    100       4
5    130       5
Stage n = 2
Here, finding x2* requires calculating f2(s2, x2) for the
values x2 = 0, 1, ..., s2. Example for s2 = 2:
x2 = 0: f2(2, 0) = p2(0) + f3*(2) = 0 + 70 = 70 (optimal).
x2 = 1: f2(2, 1) = p2(1) + f3*(1) = 20 + 50 = 70 (optimal).
x2 = 2: f2(2, 2) = p2(2) + f3*(0) = 45 + 0 = 45.
Stage n = 2
Similar calculations can be made for the other values of s2:
n = 2:
     f2(s2, x2) = p2(x2) + f3*(s2 - x2)
s2   x2 = 0   1     2     3     4     5     f2*(s2)   x2*
0    0                                      0         0
1    50       20                            50        0
2    70       70    45                      70        0 or 1
3    80       90    95    75                95        2
4    100      100   115   125   110         125       3
5    130      120   125   145   160   150   160       4
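The stage-2 rows can be generated mechanically from p2 and f3*. The sketch below assumes p2(0) = 0 and f3*(0) = 0, which is what the stage-2 values imply; the names are hypothetical.

```python
# Returns for country 2 and stage-3 values, indexed by number of teams.
p2 = [0, 20, 45, 75, 110, 150]        # p2(x2), with p2(0) = 0 assumed
f3_star = [0, 50, 70, 80, 100, 130]   # f3*(s3) = p3(s3)

stage2 = {}
for s2 in range(6):
    # f2(s2, x2) = p2(x2) + f3*(s2 - x2) for x2 = 0, 1, ..., s2
    row = [p2[x2] + f3_star[s2 - x2] for x2 in range(s2 + 1)]
    best = max(row)
    argbest = [x2 for x2, value in enumerate(row) if value == best]
    stage2[s2] = (row, best, argbest)
    print(s2, row, best, argbest)
```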
Stage n = 1
Only state is the starting state s1 = 5:

n = 1:
     f1(s1, x1) = p1(x1) + f2*(s1 - x1)
s1   x1 = 0   1     2     3     4     5     f1*(s1)   x1*
5    160      170   165   160   155   120   170       1
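Putting the three stages together, a minimal sketch of the full recursion might look like this. It assumes zero return for zero teams, and where two decisions tie it keeps the smaller one; the names are hypothetical.

```python
# p[n][x] = thousands of additional person-years when country n gets x teams.
p = {
    1: [0, 45, 70, 90, 105, 120],
    2: [0, 20, 45, 75, 110, 150],
    3: [0, 50, 70, 80, 100, 130],
}
TEAMS = 5
N = 3

f_star = {N + 1: [0] * (TEAMS + 1)}   # nothing left to gain after country 3
x_star = {}
for n in range(N, 0, -1):             # backward: n = 3, 2, 1
    f_star[n], x_star[n] = [], []
    for s in range(TEAMS + 1):        # s = teams still available
        values = [p[n][x] + f_star[n + 1][s - x] for x in range(s + 1)]
        best = max(values)
        f_star[n].append(best)
        x_star[n].append(values.index(best))   # first maximiser if tied

# Forward pass: read off one optimal allocation starting from s1 = 5.
s = TEAMS
allocation = []
for n in range(1, N + 1):
    x = x_star[n][s]
    allocation.append(x)
    s -= x

print(f_star[1][TEAMS], allocation)   # 170 [1, 3, 1]
```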
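Stage n = activity n (n = 1, 2, ..., N),
x_n = decision variable for stage n,
State s_n = amount of resource still available for the remaining activities (n, ..., N).
A decision x_n at stage n moves the system from state s_n to state s_n - x_n at stage n + 1.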
Example
Distributing scientists to research teams
Three teams are working on an engineering problem: how to fly
people safely to Mars.
Two extra scientists are available to reduce the probability of failure.

Probability of failure
New scientists   Team 1   Team 2   Team 3
0                0.40     0.60     0.80
1                0.20     0.40     0.50
2                0.15     0.20     0.30
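The text does not state the objective for this example. The sketch below assumes that the teams fail independently and that the goal is to minimize the probability that all three fail, that is, the product of the three failure probabilities. Under that assumption the recursion multiplies instead of adding. All names are hypothetical.

```python
# fail[n][x] = probability that team n fails when given x new scientists.
fail = {
    1: [0.40, 0.20, 0.15],
    2: [0.60, 0.40, 0.20],
    3: [0.80, 0.50, 0.30],
}
EXTRA = 2   # scientists to allocate
N = 3

# ASSUMPTION: objective = P(all teams fail) = product of failure probabilities.
f_star = {N + 1: [1.0] * (EXTRA + 1)}   # multiplicative identity after team 3
x_star = {}
for n in range(N, 0, -1):
    f_star[n], x_star[n] = [], []
    for s in range(EXTRA + 1):          # s = scientists still available
        values = [fail[n][x] * f_star[n + 1][s - x] for x in range(s + 1)]
        best = min(values)
        f_star[n].append(best)
        x_star[n].append(values.index(best))

# Forward pass to recover one optimal allocation.
s = EXTRA
allocation = []
for n in range(1, N + 1):
    x = x_star[n][s]
    allocation.append(x)
    s -= x

print(round(f_star[1][EXTRA], 3), allocation)   # 0.06 [1, 0, 1]
```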
Season         Spring   Summer   Autumn   Winter   Spring
Requirements   255      220      240      200      255
Formulation
From data, maximum employment should be 255
(spring). It is necessary to find the level of employment
for other seasons. Seasons are stages.
One cycle of four seasons, where stage 1 is summer
and stage 4 is spring.
xn = employment level for stage n (n =1,2,3,4); x4=255
rn = minimum employment level for stage n: r1=220,
r2=240, r3=200, r4=255. Thus:
$$r_n \le x_n \le 255$$
Formulation
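Cost for stage n:
$$200(x_n - x_{n-1})^2 + 2000(x_n - r_n)$$
State s_n is the employment level in the preceding season:
s_n = x_{n-1}
(n = 1: s_1 = x_0 = x_4 = 255)
Problem:
$$\text{minimize} \quad \sum_{i=1}^{4} \left[ 200(x_i - x_{i-1})^2 + 2000(x_i - r_i) \right],$$
$$\text{subject to} \quad r_i \le x_i \le 255, \quad i = 1, 2, 3, 4.$$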
Detailed formulation
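Choose x1, x2 and x3 so as to minimize the cost:

n   r_n   Feasible x_n     Possible s_n = x_{n-1}   Cost
1   220   220 ≤ x1 ≤ 255   s1 = 255                 200(x1 - 255)^2 + 2000(x1 - 220)
2   240   240 ≤ x2 ≤ 255   220 ≤ s2 ≤ 255           200(x2 - x1)^2 + 2000(x2 - 240)
3   200   200 ≤ x3 ≤ 255   240 ≤ s3 ≤ 255           200(x3 - x2)^2 + 2000(x3 - 200)
4   255   x4 = 255         200 ≤ s4 ≤ 255           200(255 - x3)^2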
Formulation
Recursive relationship:
$$f_n^*(s_n) = \min_{r_n \le x_n \le 255} \left\{ 200(x_n - s_n)^2 + 2000(x_n - r_n) + f_{n+1}^*(x_n) \right\}$$
Solution procedure
Stage 4: the solution is known to be x4* = 255.
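s4               f4*(s4)           x4*
200 ≤ s4 ≤ 255   200(255 - s4)^2   255

Stage 3:
$$f_3^*(s_3) = \min_{200 \le x_3 \le 255} \left\{ 200(x_3 - s_3)^2 + 2000(x_3 - 200) + f_4^*(x_3) \right\}$$
$$= \min_{200 \le x_3 \le 255} \left\{ 200(x_3 - s_3)^2 + 2000(x_3 - 200) + 200(255 - x_3)^2 \right\}$$
Graphical solution: [figure not recovered]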
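Using calculus:
$$\frac{\partial}{\partial x_3} f_3(s_3, x_3) = 400(x_3 - s_3) + 2000 - 400(255 - x_3) = 400(2x_3 - s_3 - 250) = 0$$
$$x_3^* = \frac{s_3 + 250}{2}$$

s3               f3*(s3)                                            x3*
240 ≤ s3 ≤ 255   50(250 - s3)^2 + 50(260 - s3)^2 + 1000(s3 - 150)   (s3 + 250)/2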
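The stage-3 result can be checked numerically. The sketch below minimizes f3(s3, x3) over a grid of x3 values and compares the result with the formulas x3* = (s3 + 250)/2 and f3*(s3) above. The grid step of 0.25 is an arbitrary choice for illustration, and the names are hypothetical.

```python
def f3(s3, x3):
    """Stage-3 cost plus the stage-4 cost 200(255 - x3)^2."""
    return 200 * (x3 - s3) ** 2 + 2000 * (x3 - 200) + 200 * (255 - x3) ** 2

def f3_star_formula(s3):
    """Closed form obtained by substituting x3* = (s3 + 250)/2 into f3."""
    return 50 * (250 - s3) ** 2 + 50 * (260 - s3) ** 2 + 1000 * (s3 - 150)

# Candidate employment levels 200, 200.25, ..., 255 (step is an assumption).
grid = [200 + 0.25 * k for k in range(221)]

check = {}
for s3 in (240, 245, 250, 255):
    x_num = min(grid, key=lambda x3: f3(s3, x3))
    check[s3] = (x_num, f3(s3, x_num))
    print(s3, x_num, (s3 + 250) / 2, f3(s3, x_num), f3_star_formula(s3))
```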
Stage 2
Solved in a similar fashion:
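$$f_2(s_2, x_2) = 200(x_2 - s_2)^2 + 2000(x_2 - r_2) + f_3^*(x_2)$$
$$= 200(x_2 - s_2)^2 + 2000(x_2 - 240) + 50(250 - x_2)^2 + 50(260 - x_2)^2 + 1000(x_2 - 150)$$
Setting the derivative with respect to x_2 to zero gives
$$x_2 = \frac{2 s_2 + 240}{3}$$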
Stage 2
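The solution has to be feasible for 220 ≤ s2 ≤ 255 (i.e.,
240 ≤ x2 ≤ 255 for 220 ≤ s2 ≤ 255)!
$$x_2^* = \frac{2 s_2 + 240}{3}$$
is feasible only for 240 ≤ s2 ≤ 255. For 220 ≤ s2 < 240,
$$\frac{\partial}{\partial x_2} f_2(s_2, x_2) > 0 \quad \text{for } 240 \le x_2 \le 255,$$
so x2* = 240.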
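s2               f2*(s2)                                                                x2*
220 ≤ s2 ≤ 240   200(240 - s2)^2 + 115,000                                              240
240 ≤ s2 ≤ 255   (200/9)[(240 - s2)^2 + (255 - s2)^2 + (270 - s2)^2] + 2000(s2 - 195)   (2 s2 + 240)/3

Stage 1:
s1    f1*(s1)   x1*
255   185,000   247.5

Solution:
x1* = 247.5; x2* = 245; x3* = 247.5; x4* = 255
Total cost of 185,000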
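As a cross-check on the calculus, the same recursion can be run numerically on a grid of employment levels. This is a sketch under one stated assumption: employment is restricted to multiples of 0.5, which happens to contain the continuous optimum. The names are hypothetical.

```python
# Minimum employment per stage: 1 = summer, 2 = autumn, 3 = winter, 4 = spring.
r = {1: 220, 2: 240, 3: 200, 4: 255}
PEAK = 255
STEP = 0.5   # ASSUMED grid resolution; the text treats x_n as continuous
levels = [200 + STEP * k for k in range(int((PEAK - 200) / STEP) + 1)]

def stage_cost(n, s, x):
    """200(x_n - s_n)^2 + 2000(x_n - r_n), with s_n = x_{n-1}."""
    return 200 * (x - s) ** 2 + 2000 * (x - r[n])

f_next = {x: 0.0 for x in levels}   # no cost after stage 4
policy = {}
for n in (4, 3, 2, 1):              # backward recursion
    f_cur, policy[n] = {}, {}
    feasible = [x for x in levels if r[n] <= x <= PEAK]
    for s in levels:
        best_x = min(feasible, key=lambda x: stage_cost(n, s, x) + f_next[x])
        policy[n][s] = best_x
        f_cur[s] = stage_cost(n, s, best_x) + f_next[best_x]
    f_next = f_cur                  # after the loop this holds f1*

# Forward pass from s1 = 255 (spring employment of the previous cycle).
s = 255.0
plan = []
for n in (1, 2, 3, 4):
    s = policy[n][s]
    plan.append(s)
total = f_next[255.0]

print(plan, total)   # [247.5, 245.0, 247.5, 255.0] 185000.0
```

A finer or coarser grid would only approximate the continuous solution in general; here the optimum lies exactly on the 0.5 grid, so the numbers agree with the analytical result.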
Basic structure
$$f_n(s_n, x_n) = \sum_{i=1}^{S} p_i \left[ C_i + f_{n+1}^*(i) \right]$$
with
$$f_{n+1}^*(i) = \min_{x_{n+1}} f_{n+1}(i, x_{n+1})$$
where S is the number of possible states at stage n + 1.
Formulation
Objective: determine policy regarding lot size
(1 + reject allowance) for required production run(s)
that minimizes total expected cost.
Stage n = production run n (n = 1,2,3),
xn = lot size for stage n,
State sn = number of acceptable items still needed (1
or 0) at the beginning of stage n.
At stage 1, state s1 = 1.
Formulation
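f_n(s_n, x_n) = total expected cost for stages n, ..., 3 if
optimal decisions are made thereafter:
$$f_n^*(s_n) = \min_{x_n = 0, 1, \ldots} f_n(s_n, x_n)$$
$$f_n^*(1) = \min_{x_n = 0, 1, 2, \ldots} \left\{ K(x_n) + x_n + \left(\tfrac{1}{2}\right)^{x_n} f_{n+1}^*(1) \right\}, \quad \text{for } n = 1, 2, 3$$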
Solution procedure
n = 3:
s3   x3 = 0   1    2   3   4   5       f3*(s3)   x3*
0    0                                 0         0
1    16       12   9   8   8   8 1/2   8         3 or 4

n = 2:
s2   x2 = 0   1   2   3   4       f2*(s2)   x2*
0    0                            0         0
1    8        8   7   7   7 1/2   7         2 or 3

n = 1:
s1   x1 = 0   1       2       3       4        f1*(s1)   x1*
1    7        7 1/2   6 3/4   6 7/8   7 7/16   6 3/4     2
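The three tables can be reproduced with exact fractions. The parameter values below are not stated in the text and are assumptions chosen to match the table entries: a setup cost K(x) = 3 for x > 0, a cost of 1 per item, a terminal cost f4*(1) = 16, and a 1/2 chance that each item is defective. The cap on lot size is also hypothetical.

```python
from fractions import Fraction

# ASSUMED parameters, chosen to be consistent with the tables above.
SETUP = 3                   # K(x) for x > 0; K(0) = 0
UNIT = 1                    # cost per item produced
PENALTY = 16                # f4*(1): cost if no acceptable item after run 3
P_DEFECT = Fraction(1, 2)   # probability that a single item is defective
MAX_LOT = 5                 # largest lot size examined (hypothetical cap)

def K(x):
    return SETUP if x > 0 else 0

f_next = Fraction(PENALTY)
results = {}
for n in (3, 2, 1):
    # f_n(1, x) = K(x) + x + (1/2)^x * f_{n+1}*(1)
    values = {x: K(x) + UNIT * x + P_DEFECT ** x * f_next
              for x in range(MAX_LOT + 1)}
    best = min(values.values())
    argbest = [x for x, v in values.items() if v == best]
    results[n] = (best, argbest)
    f_next = best

for n in (3, 2, 1):
    print(n, results[n])
```

Capping the lot size is harmless here because, under these assumptions, any lot of six or more items already costs more than the best value found at each stage.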
Solution
Optimal policy: produce two items on the first
production run;
if none is acceptable, then produce either two or three
items on the second production run;
if none is acceptable, then produce either three or four
items on the third production run.
The total expected cost for this policy is 675 (f1*(1) = 6 3/4, with costs
tabulated in hundreds).
Conclusions
Dynamic programming: very useful technique for
making a sequence of interrelated decisions. It
requires formulating an appropriate recursive
relationship for each individual problem.
Example: problem has 10 stages with 10 states and 10
possible decisions at each stage.
Exhaustive enumeration must consider up to 10 billion
combinations.
Dynamic programming needs no more than a
thousand calculations (10 for each state at each stage).
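The two counts in this comparison follow from simple arithmetic, sketched here with hypothetical variable names.

```python
stages, states, decisions = 10, 10, 10

# Exhaustive enumeration: one of 10 decisions at each of 10 stages.
enumeration = decisions ** stages

# Dynamic programming: 10 evaluations for each state at each stage.
dp_calculations = stages * states * decisions

print(enumeration)       # 10000000000 (10 billion)
print(dp_calculations)   # 1000
```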