Moritz Diehl,
Optimization in Engineering Center (OPTEC) & ESAT
K.U. Leuven, Belgium
(some slide material was provided by W. Bangerth, K. Mombaur,
and P. Riede)
Overview of presentation
[Figure: plots of f(x) and −f(x) over x; the maximum x* of f(x) is the minimum x* of −f(x), so maximization can always be posed as minimization]
Constrained optimization
h1(x) := x2 ≥ 0
h2(x) := 1 − x1² − x2² ≥ 0
Local and global optima
[Figure: f(x) over x, showing two local minima and the global minimum]
Derivatives
First and second derivatives of the objective function and of the constraints play an important role in optimization.
The vector of first-order derivatives is called the gradient (of the respective function), and the matrix of second-order derivatives is called the Hessian.
Optimality conditions (unconstrained)
min f(x), x ∈ R^n

necessary condition:
∇f(x*) = 0 (stationarity)

sufficient condition:
x* stationary and ∇²f(x*) positive definite
Types of stationary points
gradient vector
∇f(x) = (2x1, 2x2 + m)

unconstrained minimum:
0 = ∇f(x*)  ⟹  (x1*, x2*) = (0, −m/2)
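The stationarity condition above can be checked numerically. A minimal sketch, assuming an objective f(x) = x1² + x2² + m·x2 (any f with the slide's gradient ∇f(x) = (2x1, 2x2 + m) works):

```python
# Numeric check of stationarity for the slide's gradient.
# Assumption: f(x1, x2) = x1**2 + x2**2 + m*x2, which has exactly
# this gradient; its unique stationary point is (0, -m/2).
def grad_f(x1, x2, m):
    return (2 * x1, 2 * x2 + m)

m = 3.0
x_star = (0.0, -m / 2)       # candidate from setting the gradient to zero
print(grad_f(*x_star, m))    # -> (0.0, 0.0): x* is stationary
```

Since the Hessian of this f is 2·I (positive definite), x* is in fact a minimum, matching the sufficient condition.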
Sometimes there are many local minima
Convex problem if:
f(x) is convex and the feasible set is convex.
For a convex problem, every local minimum is also a global minimum.
Characteristics of optimization problems 2
Existence of constraints
Properties of constraints:
equalities / inequalities
type: simple bounds, linear, nonlinear,
dynamic equations (optimal control)
smoothness
Linear objective,
linear constraints:
Linear Optimization Problem
(convex)
min f(x), x ∈ R^n
necessary condition: ∇f(x*) = 0
Derivative based algorithms
Search direction:
choose a descent direction pk
(along which f decreases)
Step length:
solve the 1-d minimization approximately,
satisfying the Armijo condition
Computation of step length
Dream:
exact line search: αk = arg min_α f(xk + α·pk)
In practice:
inexact line search: αk ≈ arg min_α f(xk + α·pk)
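The inexact line search is usually implemented by backtracking: start from a trial step and halve it until the Armijo condition holds. A minimal sketch (function names and the toy problem are illustrative, not from the slides):

```python
def backtracking(f, x, p, g, alpha0=1.0, rho=0.5, c=1e-4):
    """Shrink alpha until the Armijo condition holds:
       f(x + alpha*p) <= f(x) + c * alpha * <grad f(x), p>."""
    alpha = alpha0
    fx = f(x)
    slope = sum(gi * pi for gi, pi in zip(g, p))  # directional derivative, < 0
    while f([xi + alpha * pi for xi, pi in zip(x, p)]) > fx + c * alpha * slope:
        alpha *= rho
    return alpha

# 1-d example: f(x) = x^2 at x = 1 with descent direction p = -grad f = [-2]
f = lambda v: v[0] ** 2
alpha = backtracking(f, [1.0], [-2.0], [2.0])
print(alpha)   # -> 0.5: alpha = 1 overshoots, alpha = 0.5 lands at the minimum
```

The small constant c makes the condition easy to satisfy while still guaranteeing sufficient decrease; rho controls how aggressively the step shrinks.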
Steepest descent method
The negative gradient is the direction of maximum local descent:
pk = −∇f(xk),  xk+1 = xk − αk·∇f(xk)
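The steepest descent iteration can be sketched in a few lines. For simplicity this version uses a fixed step length rather than a line search, and the ill-conditioned quadratic test problem is an assumption for illustration:

```python
def steepest_descent(grad_f, x0, alpha=0.05, tol=1e-10, max_iter=100000):
    """Iterate x_{k+1} = x_k - alpha * grad f(x_k) until the gradient
    is (numerically) zero.  A line search would normally choose alpha."""
    x = list(x0)
    for k in range(max_iter):
        g = grad_f(x)
        if max(abs(gi) for gi in g) < tol:
            break
        x = [xi - alpha * gi for xi, gi in zip(x, g)]
    return x, k

# Toy problem (illustrative): f(x, y) = x^2 + 10*y^2, minimum at (0, 0)
grad = lambda v: [2 * v[0], 20 * v[1]]
x_min, iters = steepest_descent(grad, [1.0, 1.0])
print(x_min)   # close to the minimizer (0, 0)
```

Even on this mildly ill-conditioned quadratic the method needs a few hundred iterations, previewing the slow linear convergence discussed next.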
Convergence of steepest descent method
i.e. ‖xk+1 − x*‖ ≤ C·‖xk − x*‖ with C < 1 (linear convergence)

f(x, y) = 4(x − y²)² + 1/100 + y²/100
(banana valley function,
global minimum at x = y = 0)
Example - steepest descent method
[Figure: ‖xk − x*‖ vs. iteration k]
λ1 = 0.1, λ2 = 268  ⟹  C = (268 − 0.1)/(268 + 0.1) ≈ 0.9993
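For a quadratic objective with Hessian eigenvalues λ1 ≤ λ2, the steepest-descent contraction constant is C = (λ2 − λ1)/(λ2 + λ1). Plugging in the slide's eigenvalues:

```python
# Steepest-descent contraction constant for Hessian eigenvalues 0.1 and 268
lam1, lam2 = 0.1, 268.0
C = (lam2 - lam1) / (lam2 + lam1)
print(round(C, 4))   # -> 0.9993: the error shrinks by less than 0.1% per step
```

With C this close to 1, thousands of iterations are needed per digit of accuracy, which is why the method stalls on badly conditioned problems.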
Algorithm 2: Newton's Method
Newton direction: pk = −∇²f(xk)⁻¹ ∇f(xk)
Visualization of Newton's method
Qk(p) = f(xk) + ∇f(xk)ᵀ p + ½ pᵀ ∇²f(xk) p
[Figure: level curves of f with the gradient direction and the Newton direction at xk]
If the quadratic model is good, take the full step with αk = 1.
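A Newton step solves ∇²f(xk)·p = −∇f(xk) and takes the full step αk = 1. A minimal 2-d sketch using the explicit 2×2 inverse (the quadratic test function is an assumption for illustration; for a quadratic f the model Qk is exact, so one step reaches the minimizer):

```python
def newton_step(grad, hess, x):
    """One Newton step for n = 2: solve hess(x) p = -grad(x) via the
    explicit 2x2 inverse, then take the full step (alpha = 1)."""
    g = grad(x)
    (a, b), (c, d) = hess(x)
    det = a * d - b * c
    p = [(-d * g[0] + b * g[1]) / det,   # p = -H^{-1} g
         (c * g[0] - a * g[1]) / det]
    return [x[0] + p[0], x[1] + p[1]]

# Illustrative quadratic: f(x, y) = (x - 1)^2 + 2*(y + 2)^2, minimum at (1, -2)
grad = lambda v: [2 * (v[0] - 1), 4 * (v[1] + 2)]
hess = lambda v: [[2, 0], [0, 4]]
x_new = newton_step(grad, hess, [5.0, 5.0])
print(x_new)   # -> [1.0, -2.0]: one full step hits the minimizer exactly
```

For non-quadratic f the step is only approximately right far from x*, which is why a line search (or trust region) is still needed in practice.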
Convergence of Newton's method
i.e. ‖xk+1 − x*‖ ≤ C·‖xk − x*‖²  (quadratic convergence)
f(x, y) = 4(x − y²)² + 1/100 + y²/100 (banana valley function, as before)
[Figure: ‖xk − x*‖ vs. iteration k for steepest descent and Newton's method]
Notation:
Steepest descent: pk = −∇f(xk)
Newton method: pk = −∇²f(xk)⁻¹ ∇f(xk)
Three methods:
steepest descent: intuitive, but slow (linear convergence)
Newton's method: very fast (quadratic convergence)
Newton-type methods: cheap, fast, and popular, e.g.
BFGS (superlinear convergence)
Gauss-Newton (fast linear convergence)
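Gauss-Newton illustrates the Newton-type idea: for least-squares objectives it approximates the Hessian by JᵀJ, needing only first derivatives. A minimal 1-parameter sketch (the exponential model y = exp(a·t) and the data are assumptions for illustration):

```python
import math

def gauss_newton_1d(ts, ys, a0, iters=20):
    """Minimize sum_i (ys[i] - exp(a*ts[i]))^2 over the scalar a.
    Gauss-Newton step: solve (J^T J) da = J^T r, dropping the second
    derivatives of the residuals -- hence fast linear convergence in
    general, and near-quadratic when the residual at the solution is zero."""
    a = a0
    for _ in range(iters):
        r = [y - math.exp(a * t) for t, y in zip(ts, ys)]  # residuals
        J = [t * math.exp(a * t) for t in ts]              # d(model)/da
        a += sum(j * ri for j, ri in zip(J, r)) / sum(j * j for j in J)
    return a

ts = [0.0, 1.0, 2.0]
ys = [math.exp(0.5 * t) for t in ts]   # noise-free data generated with a = 0.5
a_hat = gauss_newton_1d(ts, ys, a0=0.0)
print(a_hat)   # converges to 0.5
```

Because the data are noise-free the residual vanishes at the solution, so the iteration converges very fast here; with noisy data the rate degrades to linear.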
Attention: Nonlinear vs. Convex Optimization
For nonlinear problems, Newton-type algorithms find only local minima, and the
"optimal solution" found depends on the initialization!