
Non-Linear Programming (NLP): Single-Variable, Unconstrained Optimization

Benoît Chachuat <benoit@mcmaster.ca>

McMaster University
Department of Chemical Engineering

ChE 4G03: Optimization in Chemical Engineering


Solving Single-Variable, Unconstrained NLPs

Methods for Single-Variable Unconstrained Optimization

Prerequisites:
- Basic Concepts for Optimization – Part I
- Basic Concepts for Optimization – Part II


Solving Single-Variable, Unconstrained NLPs

"Aren't single-variable problems easy?" — Sometimes
"Won't the methods for multivariable problems work in the single-variable case?" — Yes

But,
1. A few important problems are single-variable — e.g., nonlinear regression
2. This will give us insight into multivariable solution techniques
3. Single-variable optimization is a subproblem for many nonlinear optimization methods and software! — e.g., linesearch in (un)constrained NLP

For additional details, see Rardin (1998), Chapter 13.2


Outline

Single-Variable Optimization
- Analytical Solution Methods
- Numerical Solution Methods (this lecture)
  - Region Elimination Methods
  - Interpolation Methods
  - Derivative-Based Methods

Region Elimination Methods (Minimize Case)

Iteratively consider the function value at 4 carefully spaced points:
- Assume a unimodal function
- Leftmost xℓ is always a lower bound on the optimal solution x∗
- Rightmost xu is always an upper bound on the optimal solution x∗
- Points x1 and x2 fall in between

Case 1: f(x1) < f(x2): the minimum lies in [xℓ, x2], so the rightmost part (x2, xu] is eliminated
Case 2: f(x1) > f(x2): the minimum lies in [x1, xu], so the leftmost part [xℓ, x1) is eliminated

[Figures: f(x) over xℓ, x1, x2, xu for each case, with the eliminated part shaded]


Golden Section Search (Minimize Case)

How do we choose the intermediate points x1, x2?

Golden section search proceeds by keeping both possible intervals [xℓ, x2] or [x1, xu] equal.

Golden Section Search
The points x1, x2 are taken as:
    x1 = xu − γ (xu − xℓ)
    x2 = xℓ + γ (xu − xℓ)
with γ = (√5 − 1)/2 ≈ 0.618, a reduction fraction known as the golden ratio.

What is the advantage of choosing this ratio γ? (A numerical check follows below.)
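As a quick numerical check (a sketch of my own, not from the slides; variable names are illustrative), the choice γ = (√5 − 1)/2 makes the surviving interior point of the shrunken interval coincide with one of the old points, so only one new function evaluation is needed per iteration:

    import math

    gamma = (math.sqrt(5.0) - 1.0) / 2.0     # ~0.618, satisfies gamma**2 == 1 - gamma
    xl, xu = 0.0, 40.0
    x1 = xu - gamma * (xu - xl)              # 15.28
    x2 = xl + gamma * (xu - xl)              # 24.72

    # Suppose f(x1) < f(x2), so the search keeps [xl, x2]. The new right
    # interior point of that interval coincides with the old x1, so its
    # function value can be reused: one new evaluation per iteration.
    new_x2 = xl + gamma * (x2 - xl)
    print(abs(new_x2 - x1) < 1e-9)           # True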



Golden Section Search Algorithm (Minimize Case)

Step 0: Initialization
  ◮ Choose lower and upper bounds, xℓ and xu, that bracket x∗, as well as stopping tolerance ε > 0
  ◮ Set x1 ← xu − γ (xu − xℓ) and x2 ← xℓ + γ (xu − xℓ)
Step 1: Stopping
  ◮ If xu − xℓ < ε, stop and report x∗ ← (xℓ + xu)/2 as an approximate solution
Step 2a: Case f(x1) < f(x2) (optimum left)
  ◮ Narrow the search by eliminating the rightmost part:
        xu ← x2,  x2 ← x1,  x1 ← xu − γ (xu − xℓ),
    and evaluate f(x1); return to step 1
Step 2b: Case f(x1) > f(x2) (optimum right)
  ◮ Narrow the search by eliminating the leftmost part:
        xℓ ← x1,  x1 ← x2,  x2 ← xℓ + γ (xu − xℓ),
    and evaluate f(x2); return to step 1


Pros and Cons of Golden Section Search

Pros:
- Objective function can be nonsmooth or even discontinuous
- Calculations are straightforward (only function evaluations)
Cons:
- Assumes a unimodal function and requires prior knowledge of an enclosure x∗ ∈ [xℓ, xu]
- Golden section search is slow! Considerable computation may be needed to get an accurate solution

Example: Consider the problem: min f(x) = (x − 20)⁴/500 − 2x

    it.   xℓ      x1      x2      xu      f(xℓ)    f(x1)    f(x2)    f(xu)    xu − xℓ
    0      0.00   15.28   24.72   40.00   320.00   -29.57   -48.45   240.00   40.00
    1     15.28   24.72   30.56   40.00   -29.57   -48.45   -36.27   240.00   24.72
    2     15.28   21.12   24.72   30.56   -29.57   -42.23   -48.45   -36.27   15.28
    3     21.12   24.72   26.95   30.56   -42.23   -48.45   -49.23   -36.27    9.44
    4
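A minimal Python sketch of the algorithm above (the function name golden_section_min and the driver at the end are my own, not from the slides). Note how the retained interior point is reused, so each pass through the loop costs a single new function evaluation:

    import math

    GAMMA = (math.sqrt(5.0) - 1.0) / 2.0       # reduction fraction, ~0.618

    def golden_section_min(f, xl, xu, eps=1e-4):
        """Minimize a unimodal f over the bracket [xl, xu]."""
        x1 = xu - GAMMA * (xu - xl)
        x2 = xl + GAMMA * (xu - xl)
        f1, f2 = f(x1), f(x2)
        while xu - xl >= eps:
            if f1 < f2:                        # optimum left: drop (x2, xu]
                xu, x2, f2 = x2, x1, f1
                x1 = xu - GAMMA * (xu - xl)
                f1 = f(x1)                     # only one new evaluation
            else:                              # optimum right: drop [xl, x1)
                xl, x1, f1 = x1, x2, f2
                x2 = xl + GAMMA * (xu - xl)
                f2 = f(x2)
        return 0.5 * (xl + xu)                 # midpoint of the final bracket

    # Example from the slides: min (x - 20)^4 / 500 - 2x over [0, 40]
    f = lambda x: (x - 20.0)**4 / 500.0 - 2.0 * x
    print(golden_section_min(f, 0.0, 40.0))    # ~26.30

With the bracket [0, 40] and ε = 1e-4, the interval shrinks by a factor γ ≈ 0.618 per iteration, so roughly 27 iterations are needed; this is the slow behaviour noted in the cons above.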
Interpolation Methods (Minimize Case)

Improve speed by taking full advantage of a 3-point pattern:
- Assume a unimodal, continuous function
- Fit a smooth curve through the points (xℓ, f(xℓ)), (xm, f(xm)), (xu, f(xu)); then, optimize the fitted curve

Case 1A: xq > xm and f(xq) < f(xm): keep the 3-point pattern (xm, xq, xu); the leftmost part [xℓ, xm) is eliminated
Case 1B: xq > xm and f(xq) > f(xm): keep the 3-point pattern (xℓ, xm, xq); the rightmost part (xq, xu] is eliminated

[Figures: f(x) over xℓ, xm, xq, xu for each case, with the eliminated part shaded]


Math Refresher: Lagrange Polynomials

1. For N + 1 data points, there is one and only one polynomial of order N, pN(x) = a0 + a1 x + ··· + aN x^N, passing through all the points
2. The Lagrange polynomials provide one way of computing this interpolation:

    pN(x) = Σ_{k=0..N} p(xk) Lk(x),   with   Lk(x) = Π_{i=0..N, i≠k} (x − xi) / (xk − xi)

[Figure: quadratic p2(x) and cubic p3(x) interpolants through sample points, plotted for 0 ≤ x ≤ 6]
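A short Python sketch of the Lagrange formula above (the helper name lagrange_interp and the sample data are mine, chosen purely for illustration):

    def lagrange_interp(xs, ys, x):
        """Evaluate the order-N Lagrange interpolating polynomial through
        the N+1 points (xs[k], ys[k]) at a query point x."""
        total = 0.0
        for k, (xk, yk) in enumerate(zip(xs, ys)):
            Lk = 1.0
            for i, xi in enumerate(xs):
                if i != k:
                    Lk *= (x - xi) / (xk - xi)    # basis polynomial L_k(x)
            total += yk * Lk
        return total

    # A quadratic (N = 2) through three sample points reproduces the data
    xs, ys = [1.0, 3.0, 5.0], [2.0, -1.0, 4.0]
    print(lagrange_interp(xs, ys, 3.0))           # -1.0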

Quadratic Fit Search (Minimize Case)

How do we fit the curve?

Quadratic fit search proceeds by fitting a 2nd-order polynomial through (xℓ, f(xℓ)), (xm, f(xm)), (xu, f(xu)):

    p2(x) = f(xℓ) (x − xm)(x − xu) / [(xℓ − xm)(xℓ − xu)]
          + f(xm) (x − xℓ)(x − xu) / [(xm − xℓ)(xm − xu)]
          + f(xu) (x − xℓ)(x − xm) / [(xu − xℓ)(xu − xm)]

Quadratic Fit Search
The unique optimum of the fitted quadratic function occurs at:

    xq = (1/2) [f(xℓ)(xm² − xu²) + f(xm)(xu² − xℓ²) + f(xu)(xℓ² − xm²)]
             / [f(xℓ)(xm − xu) + f(xm)(xu − xℓ) + f(xu)(xℓ − xm)]


Quadratic Fit Search Algorithm (Minimize Case)

Step 0: Initialization
  ◮ Choose a starting 3-point pattern (xℓ, xm, xu), as well as stopping tolerance ε > 0
Step 1: Stopping
  ◮ If xu − xℓ < ε, stop and report x∗ ← xm as an approximate solution
Step 2: Quadratic Fit
  ◮ Compute the quadratic fit optimum xq as

        xq ← (1/2) [f(xℓ)(xm² − xu²) + f(xm)(xu² − xℓ²) + f(xu)(xℓ² − xm²)]
                 / [f(xℓ)(xm − xu) + f(xm)(xu − xℓ) + f(xu)(xℓ − xm)]

Step 3a: Case xq < xm
  ◮ If f(xq) > f(xm), update xℓ ← xq
  ◮ Otherwise, update xu ← xm, xm ← xq
  ◮ Return to step 1
Step 3b: Case xq > xm
  ◮ If f(xq) > f(xm), update xu ← xq
  ◮ Otherwise, update xℓ ← xm, xm ← xq
  ◮ Return to step 1
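A minimal Python sketch of the quadratic fit search (the function name quadratic_fit_min, the iteration cap, and the driver are my own; there are no safeguards against a degenerate fit, e.g. three collinear points). The update rules follow Steps 1 to 3 above:

    def quadratic_fit_min(f, xl, xm, xu, eps=1e-4, max_iter=100):
        """Minimize a unimodal f given a 3-point pattern xl < xm < xu
        with f(xm) below f(xl) and f(xu); no degeneracy safeguards."""
        fl, fm, fu = f(xl), f(xm), f(xu)
        for _ in range(max_iter):
            if xu - xl < eps:                      # Step 1: stopping
                break
            # Step 2: stationary point of the parabola through the 3 points
            num = fl * (xm**2 - xu**2) + fm * (xu**2 - xl**2) + fu * (xl**2 - xm**2)
            den = fl * (xm - xu) + fm * (xu - xl) + fu * (xl - xm)
            xq = 0.5 * num / den
            fq = f(xq)
            # Step 3: keep a 3-point pattern that still brackets the minimum
            if xq < xm:
                if fq > fm:
                    xl, fl = xq, fq
                else:
                    xu, fu, xm, fm = xm, fm, xq, fq
            else:
                if fq > fm:
                    xu, fu = xq, fq
                else:
                    xl, fl, xm, fm = xm, fm, xq, fq
        return xm

    # Example from the slides, starting pattern (0, 32, 40): first fit gives xq ~ 20.92
    f = lambda x: (x - 20.0)**4 / 500.0 - 2.0 * x
    print(quadratic_fit_min(f, 0.0, 32.0, 40.0))   # ~26.3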
Pros and Cons of Quadratic Fit Search

Pros:
- Calculations are also straightforward (only function evaluations)
- Typically faster than golden section search (but not by much...)
Cons:
- Assumes a unimodal function and requires prior knowledge of an enclosure x∗ ∈ [xℓ, xu]
- Objective function has to be smooth

Example: Consider the problem: min f(x) = (x − 20)⁴/500 − 2x

    it.   xℓ      xm      xu      f(xℓ)    f(xm)    f(xu)    xu − xℓ   xq      f(xq)
    0      0.00   32.00   40.00   320.00   -22.53   240.00   40.00     20.92   -41.84
    1      0.00   20.92   32.00   320.00   -41.84   -22.53   32.00     25.00   -48.75
    2     20.92   25.00   32.00   -41.84   -48.75   -22.53   11.08     30.00   -40.03
    3     20.92   25.00   30.00   -41.84   -48.75   -40.03    9.08
    4


Derivative-Based Methods

Improve convergence speed by using first- and second-order derivatives:
- Assume a smooth function f
- Consider the quadratic approximation of f passing through the current iterate x^k
- Optimize this approximation to determine the next iterate, x^{k+1}

[Figure: f(x) with its quadratic approximation at x^k and the resulting next iterate x^{k+1}]

Newton's Method

How do we get a quadratic approximation at x^k?
Use the Taylor series approximation:

    f(x) ≈ f(x^k) + f′(x^k) [x − x^k] + (1/2) f″(x^k) [x − x^k]²

Calculate the optimum, x^{k+1}, of the quadratic approximation by setting its derivative to zero:

    0 = f′(x^k) + f″(x^k) [x^{k+1} − x^k]

Basic Newton's Search
The basic Newton's method proceeds iteratively as

    x^{k+1} = x^k − f′(x^k) / f″(x^k)

The term d^{k+1} = −f′(x^k) / f″(x^k) is called the Newton step.


Basic Newton Algorithm

Step 0: Initialization
  ◮ Choose an initial guess x^0, as well as stopping tolerance ε > 0
  ◮ Set k ← 0
Step 1: Derivatives
  ◮ Compute first- and second-order derivatives f′(x^k), f″(x^k)
Step 2: Stopping
  ◮ If |f′(x^k)| < ε, stop and report x∗ ← x^k as an approximate solution
Step 3: Newton Step
  ◮ Compute Newton step d^{k+1} ← −f′(x^k) / f″(x^k)
  ◮ Update iterate x^{k+1} ← x^k + d^{k+1}
  ◮ Increment k ← k + 1 and return to step 1
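A minimal Python sketch of the basic Newton algorithm above (the names newton_min, df, d2f and the driver are mine; the first and second derivatives are passed in as callables):

    def newton_min(df, d2f, x0, eps=1e-6, max_iter=50):
        """Basic Newton search for a stationary point of f, given
        callables df = f' and d2f = f''."""
        x = x0
        for _ in range(max_iter):
            g = df(x)
            if abs(g) < eps:               # Step 2: first-order optimality
                return x
            x = x - g / d2f(x)             # Step 3: x <- x + d, d = -f'(x)/f''(x)
        return x

    # Example from the slides, started at x0 = 30 as in the table below
    df  = lambda x: 4.0 * (x - 20.0)**3 / 500.0 - 2.0   # f'(x)
    d2f = lambda x: 12.0 * (x - 20.0)**2 / 500.0         # f''(x)
    print(newton_min(df, d2f, 30.0))                     # ~26.30 (steps -2.50, -1.02, ...)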

Variant: Quasi-Newton Algorithm

Idea: Approximate the second-order derivative using finite differences,

    f″(x^k) ≈ [f′(x^k) − f′(x^{k−1})] / [x^k − x^{k−1}]

Step 0: Initialization
  ◮ Choose initial points x^{−1} and x^0, as well as stopping tolerance ε > 0
  ◮ Set k ← 0
Step 1: Derivatives
  ◮ Compute the first-order derivative f′(x^k) and the approximate inverse of the second-order derivative B(x^k) = [x^k − x^{k−1}] / [f′(x^k) − f′(x^{k−1})]
Step 2: Stopping
  ◮ If |f′(x^k)| < ε, stop and report x∗ ← x^k as an approximate solution
Step 3: Newton Step
  ◮ Compute Newton step d^{k+1} ← −f′(x^k) B(x^k)
  ◮ Update iterate x^{k+1} ← x^k + d^{k+1}
  ◮ Increment k ← k + 1 and return to step 1


Pros and Cons of (Quasi-)Newton Search

Pros:
- Very fast convergence close to the optimal solution (quadratic convergence rate)
- No need to bound the optimum within a range
Cons:
- Requires a "good" initial guess, otherwise typically diverges
- No distinction between (local) minima and maxima!
- Objective function has to be smooth and first-order derivatives must be available (possibly second-order derivatives too)

Example: Consider the problem: min f(x) = (x − 20)⁴/500 − 2x

    k    x^k     f(x^k)   f′(x^k)   f″(x^k)   |f′(x^k)|   d^{k+1}
    0    30.00   -40.00   6.000     2.40      6.000       -2.50
    1    27.50   -48.67   1.375     1.35      1.375       -1.02
    2    26.48   -49.43   0.178     1.01      0.178
    3
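A matching Python sketch of the quasi-Newton (secant) variant (the names quasi_newton_min and df are mine; only f′ is required, with f″ replaced by the finite-difference formula above, and there is no safeguard against f′(x^k) = f′(x^{k−1})):

    def quasi_newton_min(df, x_prev, x0, eps=1e-6, max_iter=50):
        """Quasi-Newton (secant) search: Newton's method with f'' replaced
        by a finite-difference approximation built from f' alone."""
        g_prev = df(x_prev)
        x, g = x0, df(x0)
        for _ in range(max_iter):
            if abs(g) < eps:                      # Step 2: stopping
                return x
            B = (x - x_prev) / (g - g_prev)       # approximate 1/f''(x^k)
            x_prev, g_prev = x, g
            x = x - g * B                          # Step 3: d = -f'(x^k) * B
            g = df(x)
        return x

    df = lambda x: 4.0 * (x - 20.0)**3 / 500.0 - 2.0     # f' of the slide example
    print(quasi_newton_min(df, 29.0, 30.0))              # ~26.30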
