
CLASSICAL OPTIMIZATION THEORY

Quadratic forms
Let X = (x1, x2, ..., xn)^T be an n-vector, and let A = (aij) be an n×n symmetric matrix.
We define the kth order principal minor as the k×k determinant

| a11  a12  ...  a1k |
| a21  a22  ...  a2k |
| ...            ... |
| ak1  ak2  ...  akk |

Then the quadratic form

Q(X) = X^T A X = Σi aii xi² + 2 Σ(1≤i<j≤n) aij xi xj
(or the matrix A) is called
positive semidefinite if X^T A X ≥ 0 for all X,
positive definite if X^T A X > 0 for all X ≠ 0,
negative semidefinite if X^T A X ≤ 0 for all X,
negative definite if X^T A X < 0 for all X ≠ 0.
A necessary and sufficient condition for A (or Q(X)) to be:
positive definite (positive semidefinite) is that all n principal minors of A are > 0 (≥ 0);
negative definite (negative semidefinite) is that the kth principal minor of A has the sign of (−1)^k, k = 1, 2, ..., n (the kth principal minor of A is zero or has the sign of (−1)^k, k = 1, 2, ..., n).
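The principal-minor test above is easy to apply numerically. The sketch below is an illustration only: the function name and tolerance are chosen here for convenience, and note that for semidefiniteness nonnegative leading minors are only a necessary condition, so the code reports a candidate rather than a verdict.

```python
import numpy as np

def classify_quadratic_form(A, tol=1e-12):
    """Classify the quadratic form X^T A X of a symmetric matrix A
    using the leading-principal-minor test described above."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    # k-th leading principal minor: determinant of the top-left k x k block
    minors = [np.linalg.det(A[:k, :k]) for k in range(1, n + 1)]
    if all(m > tol for m in minors):
        return "positive definite"
    if all((-1) ** k * minors[k - 1] > tol for k in range(1, n + 1)):
        return "negative definite"
    if all(m >= -tol for m in minors):
        return "positive semidefinite (candidate)"
    return "indefinite or semidefinite"

print(classify_quadratic_form([[2, 0], [0, 3]]))    # positive definite
print(classify_quadratic_form([[-2, 1], [1, -3]]))  # negative definite
```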
Let f(X) = f(x1, x2, ..., xn) be a real-valued function of the n variables x1, x2, ..., xn (we assume that f(X) is at least twice differentiable).
A point X0 is said to be a local maximum of f(X) if there exists an ε > 0 such that
f(X0 + h) ≤ f(X0) for all h with |hj| < ε.
Here h = (h1, h2, ..., hn), X0 = (x1⁰, x2⁰, ..., xn⁰), and X0 + h = (x1⁰ + h1, x2⁰ + h2, ..., xn⁰ + hn).
X0 is said to be a local minimum of f(X) if there exists an ε > 0 such that
f(X0 + h) ≥ f(X0) for all h with |hj| < ε.
X0 is called an absolute maximum or global maximum of f(X) if f(X) ≤ f(X0) for all X.
X0 is called an absolute minimum or global minimum of f(X) if f(X) ≥ f(X0) for all X.
Theorem
A necessary condition for X0 to be an optimum point of f(X) is that ∇f(X0) = 0 (that is, all the first-order partial derivatives ∂f/∂xi are zero at X0).
Definition
A point X0 for which ∇f(X0) = 0 is called a stationary point of f(X) (a potential candidate for a local maximum or local minimum).
Theorem
Let X0 be a stationary point of f(X).
A sufficient condition for X0 to be a local minimum of f(X) is that the Hessian matrix H(X0) is positive definite; for X0 to be a local maximum of f(X), that the Hessian matrix H(X0) is negative definite.
Here H(X) is the n×n matrix whose (i, j) entry is the second partial derivative ∂²f/∂xi∂xj (i, j = 1, 2, ..., n), i.e.

H(X) = | ∂²f/∂x1²    ∂²f/∂x1∂x2  ...  ∂²f/∂x1∂xn |
       | ∂²f/∂x2∂x1  ∂²f/∂x2²    ...  ∂²f/∂x2∂xn |
       | ...                           ...        |
       | ∂²f/∂xn∂x1  ∂²f/∂xn∂x2  ...  ∂²f/∂xn²   |
Problem 3 set 20.1A page 705
Find the stationary points of the function
f(x1, x2, x3) = 2x1x2x3 − 4x1x3 − 2x2x3 + x1² + x2² + x3² − 2x1 − 4x2 + 4x3
and hence find the extrema of f(X).
∂f/∂x1 = 2x2x3 − 4x3 + 2x1 − 2
∂f/∂x2 = 2x1x3 − 2x3 + 2x2 − 4
∂f/∂x3 = 2x1x2 − 4x1 − 2x2 + 2x3 + 4
Setting all three partial derivatives equal to zero gives
x2x3 − 2x3 + x1 = 1   (1)
x1x3 − x3 + x2 = 2   (2)
x1x2 − 2x1 − x2 + x3 = −2   (3)
(2) ⇒ x2 = 2 + x3 − x1x3
Substituting in (3) for x2, we get
2x1 + x1x3 − x1²x3 − 2x1 − 2 − x3 + x1x3 + x3 = −2
or x1x3(2 − x1) = 0.
Therefore x1 = 0 or x3 = 0 or x1 = 2.
Case (i): x1 = 0
(1) implies x2x3 − 2x3 = 1   (4)
(2) implies x2 − x3 = 2   (5)
(3) implies −x2 + x3 = −2, same as (5).
(4), using (5), gives x3(2 + x3) − 2x3 = 1,
i.e. x3² = 1, so x3 = ±1.
Sub-case (a): x3 = 1 gives (using (5)) x2 = 3.
Sub-case (b): x3 = −1 gives (using (5)) x2 = 1.
Therefore, we get two stationary points: (0, 3, 1) and (0, 1, −1).
Case (ii): x3 = 0
(1) gives x1 = 1, and (2) gives x2 = 2.
For these x1, x2, the LHS of (3) = x1x2 − 2x1 − x2 + x3 = −2 = RHS.
Therefore, we get the stationary point (1, 2, 0).
Case (iii): x1 = 2
(1) gives x2x3 − 2x3 = −1   (6)
(2) gives x2 + x3 = 2   (7)
(3) gives x2 + x3 = 2, same as (7).
(6), using (7), gives x3(2 − x3) − 2x3 = −1,
i.e. x3² = 1, so x3 = ±1.
Sub-case (a): x3 = 1 gives (using (7)) x2 = 1.
Sub-case (b): x3 = −1 gives (using (7)) x2 = 3.
Therefore, we get two stationary points: (2, 1, 1) and (2, 3, −1).
The Hessian matrix is

H = | 2        2x3      2x2 − 4 |
    | 2x3      2        2x1 − 2 |
    | 2x2 − 4  2x1 − 2  2       |
Point      Principal minors   Nature
(0,3,1)    2, 0, −32          saddle point
(0,1,−1)   2, 0, −32          saddle point
(1,2,0)    2, 4, 8            local minimum
(2,1,1)    2, 0, −32          saddle point
(2,3,−1)   2, 0, −32          saddle point
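These classifications can be cross-checked numerically. The sketch below (the helper names grad and hessian are chosen here) evaluates ∇f and the eigenvalues of H at each stationary point: all-positive eigenvalues confirm the local minimum at (1, 2, 0), and mixed signs confirm the saddle points.

```python
import numpy as np

def grad(x):
    # First-order partial derivatives of f, as derived above
    x1, x2, x3 = x
    return np.array([
        2*x2*x3 - 4*x3 + 2*x1 - 2,
        2*x1*x3 - 2*x3 + 2*x2 - 4,
        2*x1*x2 - 4*x1 - 2*x2 + 2*x3 + 4,
    ])

def hessian(x):
    # Matrix of second-order partial derivatives of f
    x1, x2, x3 = x
    return np.array([
        [2,        2*x3,     2*x2 - 4],
        [2*x3,     2,        2*x1 - 2],
        [2*x2 - 4, 2*x1 - 2, 2       ],
    ])

for p in [(0, 3, 1), (0, 1, -1), (1, 2, 0), (2, 1, 1), (2, 3, -1)]:
    eig = np.linalg.eigvalsh(hessian(p))
    nature = ("local min" if eig.min() > 0
              else "local max" if eig.max() < 0
              else "saddle point")
    print(p, "gradient:", np.round(grad(p), 10), "->", nature)
```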
Definition:
A function f(X)=f(x1,x2,…xn) of n variables is
said to be convex if for each pair of points X,Y
on the graph, the line segment joining these
two points lies entirely above or on the graph.
i.e. f((1-λ)X + λY) ≤ (1-λ)f(X)+ λf(Y)
for all λ such that 0 < λ < 1.
f is said to be strictly convex if for each pair
of points X,Y on the graph,
f ((1-λ) X + λ Y) < (1-λ) f(X)+ λ f(Y)
for all λ such that 0 < λ < 1.
f is called concave ( strictly concave) if
– f is convex ( strictly convex).
Convexity test for a function of one variable
A function of one variable f(x) is
convex if d²f/dx² ≥ 0 and
concave if d²f/dx² ≤ 0.
Convexity test for functions of 2 variables

quantity           convex   strictly convex   concave   strictly concave
fxx·fyy − (fxy)²   ≥ 0      > 0               ≥ 0       > 0
fxx                ≥ 0      > 0               ≤ 0       < 0
fyy                ≥ 0      > 0               ≤ 0       < 0
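This two-variable test is easy to apply numerically. The sketch below is an illustration only: the names second_partials and looks_convex are chosen here, and the second derivatives are approximated by finite differences rather than computed exactly.

```python
import numpy as np

def second_partials(f, x, y, h=1e-4):
    """Central finite-difference estimates of fxx, fyy, fxy at (x, y)."""
    fxx = (f(x + h, y) - 2*f(x, y) + f(x - h, y)) / h**2
    fyy = (f(x, y + h) - 2*f(x, y) + f(x, y - h)) / h**2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4*h**2)
    return fxx, fyy, fxy

def looks_convex(f, pts, tol=1e-4):
    """Apply the table's test (fxx*fyy - fxy^2 >= 0, fxx >= 0, fyy >= 0)
    at every sampled point; failure anywhere rules out convexity there."""
    for x, y in pts:
        fxx, fyy, fxy = second_partials(f, x, y)
        if fxx < -tol or fyy < -tol or fxx*fyy - fxy**2 < -tol:
            return False
    return True

grid = [(x, y) for x in np.linspace(-2, 2, 9) for y in np.linspace(-2, 2, 9)]
print(looks_convex(lambda x, y: x**2 + x*y + y**2, grid))  # True
print(looks_convex(lambda x, y: x**2 - y**2, grid))        # False
```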
Constrained optimization
Karush–Kuhn–Tucker (KKT) conditions:
Consider the problem
maximize z = f(X) = f(x1, x2,…, xn)
subject to g(X) ≤ 0, i.e. g1(X) ≤ 0, g2(X) ≤ 0, ..., gm(X) ≤ 0
(the non-negativity restrictions, if any, are included among these constraints).
Let S² = (s1², s2², ..., sm²), where s1², s2², ..., sm² are the nonnegative slack variables added to g1(X) ≤ 0, g2(X) ≤ 0, ..., gm(X) ≤ 0 to make them equalities.
We define the Lagrangean function
L(X, S, λ) = f(X) − λ[g(X) + S²]
           = f(X) − [λ1{g1(X) + s1²} + λ2{g2(X) + s2²} + ... + λm{gm(X) + sm²}]
The KKT necessary conditions for optimality are given by
λ ≥ 0
∂L/∂X = ∇f(X) − λ∇g(X) = 0
∂L/∂si = −2λi si = 0,  i = 1, 2, ..., m
∂L/∂λ = −(g(X) + S²) = 0
These are equivalent to the following conditions:
λ ≥ 0
∇f(X) − λ∇g(X) = 0
λi gi(X) = 0,  i = 1, 2, ..., m   (complementary slackness)
g(X) ≤ 0
We denote the Lagrangean L(X, λ) by
L(X, λ) = f(X) − Σ(i=1 to m) λi gi(X)
In scalar notation, this is given by
I.   λi ≥ 0,  i = 1, 2, ..., m
II.  ∂f/∂xj − λ1 ∂g1/∂xj − λ2 ∂g2/∂xj − ... − λm ∂gm/∂xj = 0,  j = 1, 2, ..., n
III. λi gi(X) = 0,  i = 1, 2, ..., m
IV.  gi(X) ≤ 0,  i = 1, 2, ..., m
The same conditions apply for a minimization problem, except that now we have λ ≤ 0.
Also, in both maximization and minimization problems, the Lagrange multipliers λi corresponding to equality constraints gi(X) = 0 must be unrestricted in sign (URS).
Sufficiency of the KKT conditions:

Sense of optimization   Objective function   Solution space
maximization            concave              convex set
minimization            convex               convex set

It is simpler to verify whether a function is concave or convex than to prove that the solution space is a convex set. We therefore give a set of sufficient conditions, stated in terms of the convexity or concavity of the constraint functions, that are easier to check.
Consider the general non-linear problem:

Maximize or minimize z = f(X)

Subject to gi(X) ≤ 0,  i = 1, 2, ..., p
           gi(X) ≥ 0,  i = p+1, p+2, ..., q
           gi(X) = 0,  i = q+1, q+2, ..., r
Sufficiency of the KKT conditions:

Sense of       Required conditions
optimization   f(X)      gi(X)                  λi
maximization   concave   convex  (1 ≤ i ≤ p)    ≥ 0
                         concave (p+1 ≤ i ≤ q)  ≤ 0
                         linear  (q+1 ≤ i ≤ r)  URS
minimization   convex    convex  (1 ≤ i ≤ p)    ≤ 0
                         concave (p+1 ≤ i ≤ q)  ≥ 0
                         linear  (q+1 ≤ i ≤ r)  URS
The conditions in the above table represent only a subset of the conditions given in the earlier table. The reason is that a solution space may be convex without the constraint functions satisfying the conditions given in the second table.
Problem Use the KKT conditions to derive
an optimal solution for the following
problem:
maximize f(x1, x2) = x1 + 2x2 − x2³
subject to x1 + x2 ≤ 1
x1 ≥ 0
x2 ≥ 0
Solution: Here there are three constraints, namely,
g1(x1, x2) = x1 + x2 − 1 ≤ 0
g2(x1, x2) = −x1 ≤ 0
g3(x1, x2) = −x2 ≤ 0
Hence the KKT conditions become
λ1 ≥ 0, λ2 ≥ 0, λ3 ≥ 0
∂f/∂x1 − λ1 ∂g1/∂x1 − λ2 ∂g2/∂x1 − λ3 ∂g3/∂x1 = 0
∂f/∂x2 − λ1 ∂g1/∂x2 − λ2 ∂g2/∂x2 − λ3 ∂g3/∂x2 = 0
λ1 g1(x1, x2) = 0
λ2 g2(x1, x2) = 0
λ3 g3(x1, x2) = 0
g1(x1, x2) ≤ 0, g2(x1, x2) ≤ 0, g3(x1, x2) ≤ 0
Note: f is concave, the gi are convex, and this is a maximization problem, so these KKT conditions are also sufficient at the optimum.
i.e. 1 − λ1 + λ2 = 0   (1)
2 − 3x2² − λ1 + λ3 = 0   (2)
λ1(x1 + x2 − 1) = 0   (3)
λ2 x1 = 0   (4)
λ3 x2 = 0   (5)
x1 + x2 − 1 ≤ 0   (6)
x1 ≥ 0   (7)
x2 ≥ 0   (8)
λ1 ≥ 0   (9)
λ2 ≥ 0   (10)
λ3 ≥ 0   (11)
(1) gives λ1 = 1 + λ2 ≥ 1 > 0 (using (10)).
Hence (3) gives x1 + x2 = 1   (12)
Thus x1 and x2 cannot both be zero. So let x1 > 0; then (4) gives λ2 = 0, and therefore λ1 = 1.
If now x2 = 0, then (2) gives 2 − 0 − 1 + λ3 = 0, i.e. λ3 = −1 < 0, which is not possible.
Therefore x2 > 0, hence (5) gives λ3 = 0, and then (2) gives x2² = 1/3, so x2 = 1/√3,
and so x1 = 1 − 1/√3.
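The KKT solution can be cross-checked by solving the problem directly with a numerical optimizer. The sketch below is a minimal illustration, assuming scipy is available: it minimizes the negated objective with scipy.optimize.minimize (SLSQP is used when constraints are supplied), with the starting point chosen arbitrarily.

```python
from scipy.optimize import minimize

# Maximize f(x1, x2) = x1 + 2*x2 - x2^3 subject to x1 + x2 <= 1, x1, x2 >= 0,
# by minimizing the negated objective.
neg_f = lambda x: -(x[0] + 2*x[1] - x[1]**3)
res = minimize(neg_f, x0=[0.5, 0.5],
               bounds=[(0, None), (0, None)],
               constraints=[{"type": "ineq", "fun": lambda x: 1 - x[0] - x[1]}])
print(res.x)    # approx [0.4226, 0.5774], i.e. (1 - 1/sqrt(3), 1/sqrt(3))
print(-res.fun)
```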
Maximize f(X) = 20x1 + 10x2
Subject to x1² + x2² ≤ 1
x1 + 2x2 ≤ 2
x1 ≥ 0, x2 ≥ 0
(From the accompanying figure: the two constraint boundaries intersect at (0, 1) and (4/5, 3/5).)
The KKT conditions become
20 − 2λ1x1 − λ2 + λ3 = 0
10 − 2λ1x2 − 2λ2 + λ4 = 0
λ1(x1² + x2² − 1) = 0
λ2(x1 + 2x2 − 2) = 0
λ3 x1 = 0
λ4 x2 = 0
x1² + x2² ≤ 1
x1 + 2x2 ≤ 2
x1 ≥ 0, x2 ≥ 0
λ1 ≥ 0, λ2 ≥ 0, λ3 ≥ 0, λ4 ≥ 0
From the figure it is clear that max f occurs at a point (x1, x2) with x1, x2 > 0, so λ3 = 0 and λ4 = 0.
Suppose x1 + 2x2 − 2 ≠ 0. Then λ2 = 0, and we get 20 − 2λ1x1 = 0 and 10 − 2λ1x2 = 0,
i.e. λ1x1 = 10 and λ1x2 = 5. Squaring and adding, and using x1² + x2² = 1 (which holds since λ1 ≠ 0), we get
λ1² = 125, so λ1 = 5√5.
Therefore x1 = 2/√5, x2 = 1/√5, and f = 50/√5 ≈ 22.36 > 22.
If instead λ2 ≠ 0, then x1 + 2x2 − 2 = 0, giving either x1 = 0, x2 = 1 with f = 10,
or x1 = 4/5, x2 = 3/5 with f = 22.
Therefore max f occurs at x1 = 2/√5, x2 = 1/√5.
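As before, this case analysis can be cross-checked numerically. A minimal sketch, assuming scipy is available (the starting point is chosen arbitrarily inside the feasible region):

```python
from scipy.optimize import minimize

# Maximize 20*x1 + 10*x2 subject to x1^2 + x2^2 <= 1, x1 + 2*x2 <= 2,
# x1, x2 >= 0, by minimizing the negated objective.
res = minimize(lambda x: -(20*x[0] + 10*x[1]), x0=[0.5, 0.5],
               bounds=[(0, None), (0, None)],
               constraints=[
                   {"type": "ineq", "fun": lambda x: 1 - x[0]**2 - x[1]**2},
                   {"type": "ineq", "fun": lambda x: 2 - x[0] - 2*x[1]},
               ])
print(res.x, -res.fun)  # approx [0.8944, 0.4472] and 22.36, i.e. (2/sqrt(5), 1/sqrt(5))
```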
Problem Use the KKT conditions to derive
an optimal solution for the following
problem:
minimize f(x1, x2) = x1² + x2
subject to x1² + x2² ≤ 9
x1 + x2 ≤ 1
Solution: Here there are two constraints, namely,
g1(x1, x2) = x1² + x2² − 9 ≤ 0
g2(x1, x2) = x1 + x2 − 1 ≤ 0
Thus the KKT conditions are:
1: λ1 ≤ 0, λ2 ≤ 0 (as it is a minimization problem)
2: 2x1 − 2λ1x1 − λ2 = 0
   1 − 2λ1x2 − λ2 = 0
3: λ1(x1² + x2² − 9) = 0
   λ2(x1 + x2 − 1) = 0
4: x1² + x2² ≤ 9
   x1 + x2 ≤ 1
Now λ1 = 0 (from the second equation of (2)) gives λ2 = 1 > 0, which is not possible.
Hence λ1 < 0, and so x1² + x2² = 9   (5)
Assume λ2 = 0. Then the first equation of (2) gives 2x1(1 − λ1) = 0; since λ1 < 0, we get x1 = 0.
From (5), we get x2 = ±3.
The second equation of (2) says (with λ1 < 0, λ2 = 0) that λ1 = 1/(2x2) < 0, so x2 = −3.
Thus the optimal solution is x1 = 0, x2 = −3, λ1 = −1/6, λ2 = 0, and the optimal value is z = −3.
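The claimed optimum can be verified by substituting it back into every KKT condition; a minimal sketch:

```python
# Direct check that (x1, x2) = (0, -3) with lambda1 = -1/6, lambda2 = 0
# satisfies every KKT condition of this minimization problem.
x1, x2 = 0.0, -3.0
l1, l2 = -1/6, 0.0

assert l1 <= 0 and l2 <= 0                    # multiplier signs (minimization)
assert abs(2*x1 - 2*l1*x1 - l2) < 1e-12       # stationarity in x1
assert abs(1 - 2*l1*x2 - l2) < 1e-12          # stationarity in x2
assert abs(l1 * (x1**2 + x2**2 - 9)) < 1e-12  # complementary slackness
assert abs(l2 * (x1 + x2 - 1)) < 1e-12
assert x1**2 + x2**2 <= 9 and x1 + x2 <= 1    # primal feasibility
print("KKT conditions verified, z =", x1**2 + x2)  # z = -3
```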
Starting tableau (R1, R2 are artificials with r = R1 + R2; S1, S2 are slacks):

Basic | r   x1   x2   λ1  λ2  μ1  μ2  R1  R2  S1  S2 | Sol
r     | 1   22   −8   2   5   −1  −1  0   0   0   0  | 70
R1    | 0   40   −18  1   1   −1  0   1   0   0   0  | 20
R2    | 0   −18  10   1   4   0   −1  0   1   0   0  | 50
S1    | 0   1    1    0   0   0   0   0   0   1   0  | 6
S2    | 0   1    4    0   0   0   0   0   0   0   1  | 18

After x1 enters the basis in place of R1 (pivot element 40):

Basic | r  x1  x2     λ1     λ2     μ1     μ2  R1      R2  S1  S2 | Sol
r     | 1  0   19/10  29/20  89/20  −9/20  −1  −11/20  0   0   0  | 59
x1    | 0  1   −9/20  1/40   1/40   −1/40  0   1/40    0   0   0  | 1/2
R2    | 0  0   19/10  29/20  89/20  −9/20  −1  9/20    1   0   0  | 59
S1    | 0  0   29/20  −1/40  −1/40  1/40   0   −1/40   0   1   0  | 11/2
S2    | 0  0   89/20  −1/40  −1/40  1/40   0   −1/40   0   0   1  | 35/2

The iterations continue in the same way, bringing x2 and then λ2 into the basis, until r is driven to zero.
Using Excel Solver, the optimum solution is x1 = 1.604651, x2 = 4.395349, with maximum z = 230.7209.
