
Solution of Nonlinear Equations

1 Introduction
Finding solutions of nonlinear equations occurs often in scientific computing. For
example, consider the problem of finding the parameter λ in the curve y = λ cosh(x/λ)
such that the length of the arc between x = 0 and x = 5 is 10. Now

    10 = ∫_0^5 (ds/dx) dx = ∫_0^5 cosh(x/λ) dx = λ sinh(5/λ)

Hence, finding the appropriate curve needs the value of λ, which can be found from the
solution of the transcendental equation

    λ sinh(5/λ) = 10

In this chapter, we discuss a few methods along with their convergence properties. Let α be a
solution of f(x) = 0 and c_n an approximation of the root. Here the suffix n denotes an
iteration index that will be introduced later. Now the error at the n-th stage is
e_n = α − c_n. If

    |e_{n+1}| ≈ A |e_n|^p,

then p is the order of convergence and A is called the asymptotic error constant. Clearly,
for p = 1 we need A < 1 for convergence.

2 Bisection method
Suppose that f is a continuous function on the interval [a, b] and f(a)f(b) < 0. By the
intermediate value theorem, f has at least one zero in the interval [a, b]. We next calculate
c = (a + b)/2 and test f(c). If f(c) = 0, then c is the root and we are done. If not, then
either f(a)f(c) < 0 or f(b)f(c) < 0. In the former case, a root lies in [a, c], so we rename
c as b and repeat the process. In the latter case, we rename c as a and continue the same
process. The root now lies in an interval whose length is half the length of the original
interval. The process is repeated, and we stop the iteration when f(c) is very nearly zero
or the length of the interval [a, b] is very small.
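A minimal Python sketch of this loop (the function name, tolerance, and iteration cap are
illustrative choices, not from the text); as a usage example it solves the transcendental
equation λ sinh(5/λ) = 10 from the introduction:

```python
import math

def bisect(f, a, b, tol=1e-10, max_iter=200):
    """Bisection on [a, b], assuming f is continuous and f(a)*f(b) < 0."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        c = (a + b) / 2
        fc = f(c)
        if fc == 0 or (b - a) / 2 < tol:  # stop: exact root or tiny interval
            return c
        if fa * fc < 0:                   # root in [a, c]: rename c as b
            b, fb = c, fc
        else:                             # root in [c, b]: rename c as a
            a, fa = c, fc
    return (a + b) / 2

# The arc-length example: find lambda with lambda * sinh(5 / lambda) = 10.
lam = bisect(lambda t: t * math.sinh(5 / t) - 10, 1.0, 10.0)
```

The bracketing endpoints 1 and 10 were picked by checking that the function changes sign
between them.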

2.1 Convergence
Let a_0 = a, b_0 = b, and let [a_n, b_n] (n ≥ 0) be the successive intervals in the bisection
process. Clearly

    a = a_0 ≤ a_1 ≤ a_2 ≤ ⋯ ≤ b_0 = b

and

    b = b_0 ≥ b_1 ≥ b_2 ≥ ⋯ ≥ a_0 = a

Now the sequence {a_n} is monotonically increasing and bounded above, and the sequence {b_n}
is monotonically decreasing and bounded below. Hence both sequences converge. Further,

    b_n − a_n = (b_{n−1} − a_{n−1})/2 = ⋯ = (b − a)/2^n

Hence lim_{n→∞} a_n = lim_{n→∞} b_n = r. Further, taking the limit in f(a_n)f(b_n) ≤ 0, we
get [f(r)]^2 ≤ 0, and that implies f(r) = 0. Hence, both a_n and b_n converge to a root r of
f(x) = 0.
Let us apply the bisection method to the interval [a_n, b_n] and calculate the midpoint
c_n = (a_n + b_n)/2. Then the root lies either in [a_n, c_n] or in [c_n, b_n]. In either case

    |r − c_n| ≤ (b_n − a_n)/2 = (b − a)/2^{n+1}

Hence, c_n → r as n → ∞.
In this method, we can calculate the number of iterations n needed to achieve a specified
accuracy. Suppose we want relative accuracy ε in the root. Hence we want

    |r − c_n| / |r| ≤ ε

Suppose that the root lies in [a, b] where b > a > 0. Clearly |r| > a, and hence the above
relation holds if

    |r − c_n| / a ≤ ε

which is true if

    (b − a) / (2^{n+1} a) ≤ ε

Solving this, we can find the minimum number of iterations needed to obtain the desired
accuracy.
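Solving the last inequality for n gives n ≥ log2((b − a)/(aε)) − 1. A small sketch (the
function name is illustrative):

```python
import math

def bisection_iterations(a, b, rel_tol):
    """Smallest n with (b - a) / (2**(n + 1) * a) <= rel_tol, for b > a > 0."""
    return max(0, math.ceil(math.log2((b - a) / (a * rel_tol)) - 1))

n = bisection_iterations(1.0, 2.0, 1e-6)
```

For [a, b] = [1, 2] and ε = 10^{−6} this gives n = 19, since 2^{20} ≥ 10^6 > 2^{19}.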
Now

    |e_{n+1}| = |r − c_{n+1}| ≤ (b_{n+1} − a_{n+1})/2 = (b_n − a_n)/4

and

    |e_n| = |r − c_n| ≤ (b_n − a_n)/2

Thus we find

    |e_{n+1}| ≈ (1/2) |e_n|

Hence the bisection method converges linearly.

[Figure: graph of f with a root between a and b; the bisection midpoint is marked by a small
circle, while the chord through (a, f(a)) and (b, f(b)) cuts the x-axis near a.]

3 Regula falsi
Consider the figure, in which the root lies between a and b. In the first iteration of the
bisection method, the approximation lies at the small circle. However, since |f(a)| is small,
we expect the root to lie near a. This can be achieved if we join the points (a, f(a)) and
(b, f(b)) by a straight line and take its intersection c with the x-axis as the first
approximation. Hence the new approximation c satisfies

    (c − a)/(b − a) = (0 − f(a))/(f(b) − f(a))  ⟹  c = (a f(b) − b f(a))/(f(b) − f(a))

Since f(a) and f(b) are of opposite sign, the method is well defined. If f(c) = 0, then c
is the exact root. Otherwise we take b = c if f(a)f(c) < 0 and a = c if f(c)f(b) < 0.
This process is then repeated. Of course, the method does not work satisfactorily in all
cases, and certain modifications can be made.
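A Python sketch of the update (names and stopping rule are illustrative; since the bracket
width b_n − a_n need not shrink to zero, the loop stops on a small residual |f(c)| instead of
a small interval):

```python
def regula_falsi(f, a, b, tol=1e-12, max_iter=200):
    """Regula falsi: keep a bracket [a, b] with f(a)*f(b) < 0 and cut it
    at the x-intercept of the chord through (a, f(a)) and (b, f(b))."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    c = a
    for _ in range(max_iter):
        c = (a * fb - b * fa) / (fb - fa)   # chord meets the x-axis here
        fc = f(c)
        if abs(fc) < tol:                   # residual test, not interval test
            break
        if fa * fc < 0:
            b, fb = c, fc                   # root in [a, c]
        else:
            a, fa = c, fc                   # root in [c, b]
    return c

root = regula_falsi(lambda x: x ** 3 - 2, 1.0, 2.0)
```

With f(x) = x^3 − 2 on [1, 2] we have f″ > 0 and f(1) < 0 < f(2), exactly the "worst case"
analysed below, where the endpoint b never moves.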
Let [a_n, b_n] be the successive intervals of the regula falsi method. In this case
b_n − a_n may not go to zero as n → ∞. However, the method still converges to the root.
To show this, we consider the worst case. We assume that f″(x) exists and, for some
i, f″(x) ≥ 0 in [a_i, b_i]. The case f″(x) ≤ 0 can be treated similarly. Also, suppose
that f(a_i) < 0 and f(b_i) > 0. Let c_i be the new approximation, which is nothing but the
intersection of the straight line through (a_i, f(a_i)) and (b_i, f(b_i)) with the x-axis.
We claim that a_i < a_{i+1} = c_i < b_{i+1} = b_i. To show this, note that the straight line
is nothing but the degree one interpolating polynomial p(x) with p(c_i) = 0. Obviously
a_i < c_i < b_i. For x ∈ [a_i, b_i]

    f(x) − p(x) = (x − a_i)(x − b_i) f″(ξ)/2,   ξ ∈ (a_i, b_i)

Putting x = c_i and using the given conditions, we get

    f(c_i) ≤ 0

If f(c_i) ≠ 0, then a_{i+1} = c_i > a_i and b_{i+1} = b_i. Hence, if the condition holds,
then b_i = b_f (say) for i ≥ i_0. Now a_i is monotonically increasing and bounded above by
b_f. Hence lim_{i→∞} a_i exists and is equal to α (say). Since f is continuous,
f(α) = lim_{i→∞} f(a_i) ≤ 0 and f(b_f) > 0, and hence α ≠ b_f. Taking the limit in

    c_i = (a_i f(b_i) − b_i f(a_i))/(f(b_i) − f(a_i))

we find

    α = (α f(b_f) − b_f f(α))/(f(b_f) − f(α))

Hence we find

    (α − b_f) f(α) = 0

Since α ≠ b_f, we must have f(α) = 0, and a_i converges to the root α.
To find the order of convergence, let us consider the worst case discussed just above. Now,
writing x_i instead of a_i and b_i = b, the iteration can be written as

    x_{i+1} = (b f(x_i) − x_i f(b))/(f(x_i) − f(b)) = φ(x_i)

where

    φ(x) = (b f(x) − x f(b))/(f(x) − f(b))

Hence, for i ≥ i_0, the regula falsi method becomes a fixed point iteration method. This
method will converge if |φ′(α)| < 1. Now we can write

    φ′(α) = 1 − (b − α) f′(α)/(f(b) − f(α)) = 1 − f′(α)/f′(ξ_1),   α < ξ_1 < b,

using the mean value theorem in the denominator. Again by the mean value theorem, there
exists ξ_2 ∈ (x_i, α) such that

    f(x_i) − f(α) = (x_i − α) f′(ξ_2)

Since f″(x) ≥ 0 in [x_i, b], f′(x) is monotonically increasing in [x_i, b] and
f′(ξ_2) = f(x_i)/(x_i − α) > 0. This implies

    0 < f′(ξ_2) ≤ f′(α) ≤ f′(ξ_1)  ⟹  0 < f′(α)/f′(ξ_1) ≤ 1

Hence

    0 ≤ 1 − f′(α)/f′(ξ_1) < 1  ⟹  0 ≤ φ′(α) < 1.

Hence the fixed point iteration converges to α, i.e. α = φ(α). Now

    e_{n+1} = α − x_{n+1} = φ(α) − φ(x_n) = e_n φ′(ξ) ≈ K e_n,

where 0 ≤ K < 1. Hence the convergence is linear.

4 Secant method
Here we don't insist on bracketing of the root. Given two approximations
x_{n−1}, x_n, we take the next approximation x_{n+1} as the intersection of the line joining
(x_{n−1}, f(x_{n−1})) and (x_n, f(x_n)) with the x-axis. Thus x_{n+1} need not lie in the
interval [x_{n−1}, x_n]. If the root is α and α is a simple zero, then it can be proved that
the method converges for initial guesses in a sufficiently small neighbourhood of α. Now
x_{n+1} is given by

    x_{n+1} = (x_{n−1} f(x_n) − x_n f(x_{n−1}))/(f(x_n) − f(x_{n−1}))
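A Python sketch of the iteration (names and tolerances are illustrative); note that no sign
condition on f(x_0), f(x_1) is required:

```python
def secant(f, x0, x1, tol=1e-12, max_iter=60):
    """Secant method: x_{n+1} is where the line through the last two
    iterates crosses the x-axis; no bracket is maintained."""
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        if f1 == f0:                       # flat chord: cannot continue
            break
        x2 = (x0 * f1 - x1 * f0) / (f1 - f0)
        x0, f0 = x1, f1
        x1, f1 = x2, f(x2)
        if abs(x1 - x0) < tol:             # successive iterates agree
            break
    return x1

root = secant(lambda x: x ** 2 - 5, 2.0, 3.0)
```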

4.1 Convergence
Now

    e_{n+1} = α − x_{n+1} = α − (x_{n−1} f(x_n) − x_n f(x_{n−1}))/(f(x_n) − f(x_{n−1}))
            = α − x_n + f(x_n) (x_n − x_{n−1})/(f(x_n) − f(x_{n−1}))
            = α − x_n + (f(x_n) − f(α))/f[x_n, x_{n−1}]
            = −(α − x_n)(α − x_{n−1}) f[α, x_n, x_{n−1}]/f[x_n, x_{n−1}]
            = −e_n e_{n−1} f″(ξ_1)/(2 f′(ξ_2)),

where ξ_1 ∈ I(α, x_n, x_{n−1}) and ξ_2 ∈ I(x_n, x_{n−1}). Here, I(a, b, c) denotes the
interior of the smallest interval containing a, b and c. Since α is a simple zero,
f′(α) ≠ 0. Consider the interval J = {x : |x − α| ≤ ε} such that

    |f″(ξ_1)/(2 f′(ξ_2))| ≤ M,   ξ_1, ξ_2 ∈ J

Now we have

    |e_{n+1}| ≤ M |e_n| |e_{n−1}|

Let δ_n = M |e_n|. Then δ_{n+1} ≤ δ_n δ_{n−1}. Now choose initial guesses x_0 and x_1 such
that

    |x_i − α| < min{1/M, ε},   i = 0, 1

This implies δ_i = M |x_i − α| < min{1, Mε} for i = 0, 1. Now choose D with
δ_0, δ_1 ≤ D < min{1, Mε}, and thus 0 < D < 1. Now

    δ_2 ≤ δ_1 δ_0 ≤ D^2

Also, δ_3 ≤ δ_2 δ_1 ≤ D^3 etc. By induction, we can show that δ_n ≤ D^{q_n}, where
q_0 = q_1 = 1 and q_n = q_{n−1} + q_{n−2} for n ≥ 2. Solving this Fibonacci recurrence with
the trial solution q_n ∝ r^n, we find

    q_n = (1/√5) [((1+√5)/2)^{n+1} − ((1−√5)/2)^{n+1}] ≈ (1/√5) ((1+√5)/2)^{n+1}  as n → ∞

Since D < 1 and q_n → ∞, we get δ_n → 0 as n → ∞ and thus x_n → α.


Now, as x_n → α, we have ξ_1, ξ_2 → α. This implies

    |e_{n+1}| ≈ C |e_n| |e_{n−1}|,   C = |f″(α)/(2 f′(α))|

Let

    |e_{n+1}| ≈ C′ |e_n|^p  ⟹  |e_n| ≈ C′ |e_{n−1}|^p  ⟹  |e_{n−1}| ≈ C′^{−1/p} |e_n|^{1/p}

Now from |e_{n+1}| ≈ C |e_n| |e_{n−1}|, we get

    C′ |e_n|^p ≈ C |e_n| · C′^{−1/p} |e_n|^{1/p},

which is true provided

    p = 1 + 1/p   and   C′ = C^{p/(1+p)} = C^{p−1}

Taking the positive root of p^2 = p + 1, we find p = (1 + √5)/2 = r (the golden ratio).
Hence

    |e_{n+1}| ≈ C^{r−1} |e_n|^r

The order of convergence is non-integer and greater than one. Hence, the convergence is
superlinear.

5 Newton-Raphson method
Let x_0 be an initial guess to the root α of f(x) = 0. Let h be the correction, i.e.
α = x_0 + h. Then f(α) = 0 implies f(x_0 + h) = 0. Now, assuming h small and f twice
continuously differentiable, we find

    f(x_0) + h f′(x_0) + (h^2/2) f″(ξ) = 0,   ξ ∈ I(x_0, x_0 + h)

Neglecting the quadratic term and assuming that α is a simple root, we find that

    h ≈ −f(x_0)/f′(x_0)  ⟹  x_1 = x_0 − f(x_0)/f′(x_0)

might be a better approximation to α than x_0. We can continue this process with x_1.
Hence, the method is given by

    x_{n+1} = x_n − f(x_n)/f′(x_n),   n = 0, 1, 2, …

Geometrically, x_{n+1} is the intersection with the x-axis of the tangent to f at the point
(x_n, f(x_n)). This method can also be derived from the secant method by letting x_{n−1}
approach x_n. The method may or may not converge if the initial guess is too far from the
root.
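A Python sketch of the iteration (names and tolerances are illustrative; the stopping test
uses the difference of successive iterates, which the convergence discussion below shows
approximates the error):

```python
def newton(f, fprime, x0, tol=1e-12, max_iter=50):
    """Newton-Raphson iteration x_{n+1} = x_n - f(x_n)/f'(x_n)."""
    x = x0
    for _ in range(max_iter):
        fx, dfx = f(x), fprime(x)
        if dfx == 0:
            raise ZeroDivisionError("f'(x) vanished; pick another guess")
        step = fx / dfx
        x -= step
        if abs(step) < tol:   # |x_{n+1} - x_n| approximates the error
            break
    return x

root = newton(lambda x: x ** 2 - 5, lambda x: 2 * x, 2.0)
```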

5.1 Convergence
If α is a simple root, then f′(α) ≠ 0 and hence f′(x) ≠ 0 in a sufficiently small
neighbourhood of α. Consider the interval J = {x : |x − α| ≤ ε} in which f′(x) ≠ 0 and

    |f″(η_n)/(2 f′(x_n))| ≤ M,   x_n, η_n ∈ J

If e_n = α − x_n, then from f(x_n + e_n) = 0, we find

    f(x_n) + e_n f′(x_n) + e_n^2 f″(η_n)/2 = 0

Now we divide both sides by f′(x_n) and use the iteration formula for the Newton-Raphson
method to arrive at

    e_{n+1} = −(f″(η_n)/(2 f′(x_n))) e_n^2  ⟹  |e_{n+1}| ≤ M |e_n|^2

Let δ_n = M |e_n|. Then δ_{n+1} ≤ δ_n^2. Now choose the initial guess x_0 such that

    |x_0 − α| < min{1/M, ε}

This implies δ_0 = M |x_0 − α| < min{1, Mε}. Now choose D with δ_0 ≤ D < min{1, Mε}, and
thus 0 < D < 1. Now

    δ_n ≤ δ_{n−1}^2 ≤ δ_{n−2}^4 ≤ ⋯ ≤ (δ_0)^{2^n} ≤ D^{2^n}

Since D < 1, δ_n → 0 as n → ∞ and thus x_n → α as n → ∞.


Also, |e_{n+1}| ≈ C |e_n|^2, which implies quadratic convergence, and the asymptotic error
constant is C = |f″(α)/(2 f′(α))|.
Also, note that

    −f(x_n) = f(α) − f(x_n) = (α − x_n) f′(c_n) ≈ (α − x_n) f′(x_n)

This implies

    e_n = α − x_n ≈ −f(x_n)/f′(x_n) = x_{n+1} − x_n

Hence, the error is approximately the difference between two successive iteration values.
Thus the difference between successive iteration values can be used as a stopping criterion.
6 Fixed point iteration
In this method, one writes f(x) = 0 in the form x = g(x) so that any solution of x = g(x)
(called a fixed point of g) is a solution of f(x) = 0. This can be accomplished in many
ways. For example, with f(x) = x^2 − 5, we can write g(x) = (x + 5/x)/2, or
g(x) = x + 5 − x^2, or g(x) = x − (5 − x^2)/2, etc.
The function g(x) is also called an iteration function. Once a g(x) is chosen, we carry out
the iteration (starting from an initial guess x_0)

    x_{n+1} = g(x_n),   n = 0, 1, 2, …
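A Python sketch of the loop, applied to the first of the three iteration functions above for
f(x) = x^2 − 5 (names and tolerances are illustrative; one can check by differentiating that
this particular g has g′(√5) = 0, so it converges very quickly):

```python
def fixed_point(g, x0, tol=1e-12, max_iter=100):
    """Iterate x_{n+1} = g(x_n) until successive values agree to tol."""
    x = x0
    for _ in range(max_iter):
        x_new = g(x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

root = fixed_point(lambda x: (x + 5 / x) / 2, 2.0)   # converges to sqrt(5)
```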

Theorem: Let g be defined on an interval I = [a, b] such that g(x) ∈ I, i.e. g maps I into
itself. Further, suppose that g is differentiable in I and there exists a nonnegative
constant K < 1 such that |g′(x)| ≤ K for all x ∈ I. Then there exists a unique fixed point
α in I and x_n → α as n → ∞.
Proof: If g(a) = a or g(b) = b, then obviously g has a fixed point. Hence suppose that
g(a) ≠ a and g(b) ≠ b. Since g maps I into I, we must have g(a) > a and g(b) < b.
Now consider the function h(x) = x − g(x); we must have h(a) < 0 and h(b) > 0. By the
intermediate value theorem, there exists α such that h(α) = 0, and hence the existence of a
fixed point is proved. To prove uniqueness, suppose that α and β are distinct fixed points.
Then

    |α − β| = |g(α) − g(β)| = |g′(ξ)| |α − β| ≤ K |α − β| < |α − β|,   ξ ∈ I(α, β)

which is a contradiction. Hence the fixed point is unique.


To prove convergence, consider e_n = α − x_n. Then

    e_n = g(α) − g(x_{n−1}) = g′(c_n) e_{n−1},   c_n ∈ I(α, x_{n−1})

Hence

    |e_n| ≤ K |e_{n−1}| ≤ K^2 |e_{n−2}| ≤ ⋯ ≤ K^n |e_0|

Since 0 ≤ K < 1, we have K^n → 0 as n → ∞ and hence e_n → 0 as n → ∞.
Also, assuming that g is twice differentiable, we have

    e_{n+1} = α − x_{n+1} = g(α) − g(x_n) = g(α) − g(α − e_n)
            = e_n g′(α) − (e_n^2/2) g″(c_n),   c_n ∈ I(α, x_n)

If g′(α) ≠ 0, then

    |e_{n+1}| ≈ A |e_n|,   A = |g′(α)|

showing that the convergence is first order. On the other hand, when g′(α) = 0, then

    |e_{n+1}| ≈ C |e_n|^2,   C = |g″(α)|/2

showing that the convergence is second order. For example, Newton-Raphson is a special case
of fixed point iteration in which g(x) = x − f(x)/f′(x). If α is a simple root of f, then
g′(α) = 0, and the convergence of the Newton-Raphson method is second order.
It is often very difficult to verify the assumptions of the previous theorem. Hence, many
times we check the following condition: if g′ is continuous in some open interval J
containing α and |g′(α)| < 1, then there exists a δ > 0 such that the fixed point iteration
converges whenever we start with x_0 satisfying |x_0 − α| ≤ δ.
To show this, we take q = (1 + |g′(α)|)/2 < 1 and ε = (1 − |g′(α)|)/2 > 0. Since g′ is
continuous, there exists a δ > 0 such that

    |g′(x) − g′(α)| ≤ ε   whenever |x − α| ≤ δ.

Now

    |g′(x)| ≤ |g′(x) − g′(α)| + |g′(α)| ≤ ε + |g′(α)| = q

Consider I = [α − δ, α + δ]; we show that g maps I into itself. To see this, note that for
x ∈ I

    |g(x) − α| = |g(x) − g(α)| = |g′(ξ)(x − α)|

where ξ ∈ I(x, α) and hence ξ ∈ I = [α − δ, α + δ]. Thus

    |g(x) − α| ≤ q |x − α| < δ

showing that g maps I into I.

7 Roots of polynomial
Finding roots of a polynomial also involves complex roots. Also, sometimes we are interested
in finding all the roots of a polynomial. We know that a polynomial of degree n has n roots
(counting multiplicity) in the complex field. We first deal with a localization theorem.
Theorem: All roots of the polynomial

    p(z) = a_n z^n + a_{n−1} z^{n−1} + ⋯ + a_0,   a_n ≠ 0,

lie in the open disk whose centre is at the origin of the complex plane and whose radius is

    ρ = 1 + |a_n|^{−1} max_{0≤i<n} |a_i|
Proof: Let c = max_{0≤i<n} |a_i|. If c = 0, there is nothing to prove. Hence assume that
c > 0, and then ρ > 1. We now show that p(z) does not vanish in the region |z| ≥ ρ. To see
this, we find (noting that |z| ≥ ρ > 1 and c |a_n|^{−1} = ρ − 1)

    |p(z)| ≥ |a_n z^n| − |a_{n−1} z^{n−1} + a_{n−2} z^{n−2} + ⋯ + a_0|
           ≥ |a_n z^n| − c ∑_{i=0}^{n−1} |z|^i
           > |a_n z^n| − c |z|^n (|z| − 1)^{−1}
           = |a_n z^n| [1 − c |a_n|^{−1} (|z| − 1)^{−1}]
           ≥ |a_n z^n| [1 − c |a_n|^{−1} (ρ − 1)^{−1}] = 0

Here we have used |z| ≥ ρ ⟹ |z| − 1 ≥ ρ − 1.


Corollary: Note that if we consider s(z) = z^n p(1/z), then

    s(z) = a_0 z^n + a_1 z^{n−1} + ⋯ + a_n

Note that p(z_0) = 0 with z_0 ≠ 0 implies s(1/z_0) = 0. Hence, if all the roots of s lie
inside the disk |z| < ρ_s, then all the nonzero roots of p lie outside the disk |z| ≤ 1/ρ_s.
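The bound ρ is one line of code; a small sketch (the coefficient-list convention
[a_n, …, a_0] and the function name are my own choices):

```python
def root_radius(coeffs):
    """Cauchy-type bound: coeffs = [a_n, ..., a_1, a_0] with a_n != 0.
    All roots of the polynomial lie in the open disk |z| < rho."""
    an = abs(coeffs[0])
    rest = [abs(a) for a in coeffs[1:]]   # |a_{n-1}|, ..., |a_0|
    return 1 + max(rest, default=0) / an

rho = root_radius([1, 0, -5])   # p(z) = z**2 - 5, roots +-sqrt(5)
```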
8 Horner's algorithm
This is also known as nested multiplication and as synthetic division. For a polynomial
p(z) = a_n z^n + a_{n−1} z^{n−1} + ⋯ + a_0 and a number z_0, we can write
p(z) = (z − z_0) q(z) + p(z_0), where

    q(z) = b_{n−1} z^{n−1} + b_{n−2} z^{n−2} + ⋯ + b_0

is a polynomial of degree one less than that of p. Substituting q(z) and equating like
powers, we find

    b_{n−1} = a_n,   b_{n−2} = a_{n−1} + b_{n−1} z_0,   …,   b_0 = a_1 + b_1 z_0,
    p(z_0) = a_0 + b_0 z_0

Thus we can use Horner's algorithm to find the value of a polynomial at any point z_0. It
can also be used to deflate a polynomial if we know that z_0 is a root. The method can also
be used to find the Taylor expansion of a polynomial about any point. Suppose

    p(z) = a_n z^n + a_{n−1} z^{n−1} + ⋯ + a_0
         = c_n (z − z_0)^n + c_{n−1} (z − z_0)^{n−1} + ⋯ + c_0

Clearly c_k = p^(k)(z_0)/k!. We can use Horner's algorithm to find the c_k efficiently:
c_0 = p(z_0), which is obtained by applying nested multiplication to p(z). The method also
gives

    q(z) = (p(z) − p(z_0))/(z − z_0) = c_n (z − z_0)^{n−1} + c_{n−1} (z − z_0)^{n−2} + ⋯ + c_1

Hence we can obtain c_1 by applying nested multiplication to q(z). This process can be
repeated.
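A sketch of nested multiplication in Python, returning both p(z_0) and the deflated
coefficients b_{n−1}, …, b_0 (the list runs from a_n down to a_0, a convention of mine; the
polynomial is assumed to have degree at least one):

```python
def horner(coeffs, z0):
    """Nested multiplication: coeffs = [a_n, ..., a_1, a_0].
    Returns p(z0) and the coefficients of q, where p(z) = (z - z0) q(z) + p(z0)."""
    b = coeffs[0]                 # b_{n-1} = a_n
    q = [b]
    for a in coeffs[1:-1]:
        b = a + b * z0            # b_{k-1} = a_k + b_k * z0
        q.append(b)
    value = coeffs[-1] + b * z0   # p(z0) = a_0 + b_0 * z0
    return value, q

# p(z) = z**3 - 6z**2 + 11z - 6 = (z-1)(z-2)(z-3); deflate out the root z0 = 1.
val, q = horner([1, -6, 11, -6], 1)
```

Here val is 0 (confirming z_0 = 1 is a root) and q holds the coefficients of
(z − 2)(z − 3) = z^2 − 5z + 6.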

9 Newton-Raphson method
We use the iteration

    z_{k+1} = z_k − p(z_k)/p′(z_k),   k = 0, 1, 2, …

Note that p(z) = (z − z_k) q(z) + p(z_k) and p′(z_k) = q(z_k). Hence, both numerator and
denominator can be obtained by two steps of nested multiplication. To obtain complex roots,
we need a complex initial guess. Alternatively, we can write z = x + iy; substituting in
p(z) = 0, we get two equations F(x, y) = 0, G(x, y) = 0 that can be solved together using
the Newton-Raphson technique with real arithmetic.
Suppose we obtain a root α_1 by the Newton-Raphson method. Then, by deflation,
p(z) ≈ (z − α_1) q_1(z), where q_1 is a polynomial of degree one less than p. Now we apply
Newton-Raphson to q_1 and find another root α_2. We can proceed this way and find all the
roots α_1, α_2, …, α_n such that

    p(z) ≈ a_n (z − α_1)(z − α_2) ⋯ (z − α_n)

Of course, the error in the roots increases from α_2 to α_n, since error gradually builds up
in the deflation process. One remedy is to take α_2 to α_n as initial guesses and refine each
root by working with the full polynomial.
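A sketch combining the two nested multiplications inside a Newton loop (names and tolerances
are illustrative); a complex starting guess lets the iteration reach complex roots:

```python
def poly_newton(coeffs, z0, tol=1e-12, max_iter=100):
    """Newton for a polynomial with coeffs = [a_n, ..., a_0].
    p(z) and p'(z) come from one fused pass of nested multiplication."""
    z = z0
    for _ in range(max_iter):
        p, dp = coeffs[0], 0
        for a in coeffs[1:]:
            dp = dp * z + p     # running Horner for the derivative p'(z)
            p = p * z + a       # running Horner for p(z)
        if dp == 0:
            break
        step = p / dp
        z -= step
        if abs(step) < tol:
            break
    return z

# A complex initial guess finds a complex root: p(z) = z**2 + 1.
root = poly_newton([1, 0, 1], 1 + 1j)
```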
10 Muller's method
This is a generalization of the secant method. The method works well for simple and multiple
roots, and it may converge to a complex root even if we start with real initial guesses. It
works for non-polynomial equations too. Here we fit a polynomial of degree 2 through three
interpolation points x_{i−2}, x_{i−1}, x_i:

    p(x) = f[x_i] + f[x_i, x_{i−1}](x − x_i) + f[x_i, x_{i−1}, x_{i−2}](x − x_i)(x − x_{i−1})
         = f[x_i] + f[x_i, x_{i−1}](x − x_i) + f[x_i, x_{i−1}, x_{i−2}][(x − x_i)^2 + h_i (x − x_i)]
         = a_i (x − x_i)^2 + 2 b_i (x − x_i) + c_i,

where a_i = f[x_i, x_{i−1}, x_{i−2}], b_i = (h_i f[x_i, x_{i−1}, x_{i−2}] + f[x_i, x_{i−1}])/2,
c_i = f[x_i] and h_i = x_i − x_{i−1}.
Let λ_i be the root of smallest absolute value of the quadratic equation
a_i λ^2 + 2 b_i λ + c_i = 0. Then x_{i+1} = x_i + λ_i is the root of p(x) = 0 closest to x_i.
Note that

    λ_i = (−b_i ± √(b_i^2 − a_i c_i))/a_i = −c_i/(b_i ± √(b_i^2 − a_i c_i))

We need the sign that makes the denominator largest in absolute value. Hence

    λ_i = −c_i/(b_i + sgn(b_i) √(b_i^2 − a_i c_i))

Complex arithmetic has to be used since b_i^2 − a_i c_i might be negative. Once we get
x_{i+1}, we repeat the same procedure with the interpolation points x_{i−1}, x_i and x_{i+1}.
It can be shown that

    e_{i+1} ≈ (f‴(α)/(6 f′(α))) e_i e_{i−1} e_{i−2}

Further, if |e_{i+1}| ≈ |e_i|^p, then p ≈ 1.84, and hence the convergence is superlinear and
better than the secant method.
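A sketch in Python using cmath so a negative discriminant is handled automatically (names
are illustrative; the sign choice is implemented by comparing |b ± √(b² − ac)| directly,
which also covers complex b):

```python
import cmath

def muller(f, x0, x1, x2, tol=1e-12, max_iter=100):
    """Muller's method: fit a quadratic through the last three points and
    step to its nearest root."""
    for _ in range(max_iter):
        f0, f1, f2 = f(x0), f(x1), f(x2)
        h = x2 - x1
        d01 = (f1 - f0) / (x1 - x0)          # divided difference f[x1, x0]
        d12 = (f2 - f1) / (x2 - x1)          # divided difference f[x2, x1]
        a = (d12 - d01) / (x2 - x0)          # f[x2, x1, x0]
        b = (h * a + d12) / 2                # half the linear coefficient
        c = f2
        disc = cmath.sqrt(b * b - a * c)
        denom = b + disc if abs(b + disc) >= abs(b - disc) else b - disc
        if denom == 0:
            break
        step = -c / denom                    # root of smallest modulus
        x0, x1, x2 = x1, x2, x2 + step
        if abs(step) < tol:
            break
    return x2

# Real starting points can still reach a complex root of x**2 + 1.
root = muller(lambda x: x * x + 1, 0.0, 0.5, 1.0)
```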

11 Sensitivity of polynomial roots


Consider p(x) = x^2 − 2x + 1, which has the real roots x_1 = x_2 = 1. If we change the
coefficient −2 to −1.9999, the roots become complex. Hence, the character of the roots
changes from real to complex. Of course, the roots are equal here.
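The effect is easy to reproduce (a small sketch; cmath keeps the arithmetic complex-safe):

```python
import cmath

def quadratic_roots(a, b, c):
    """Roots of a*x**2 + b*x + c via the quadratic formula."""
    disc = cmath.sqrt(b * b - 4 * a * c)
    return (-b + disc) / (2 * a), (-b - disc) / (2 * a)

r1, r2 = quadratic_roots(1, -2.0, 1)       # double root x = 1
s1, s2 = quadratic_roots(1, -1.9999, 1)    # tiny change: complex roots
```

The perturbed discriminant is about −4 × 10^{−4}, so a 10^{−4} change in one coefficient
moves both roots off the real axis.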
However, even well separated roots of a polynomial can be sensitive. For example, consider
the Wilkinson polynomial

    p(x) = (x − 1)(x − 2) ⋯ (x − 20)

whose roots are simple, real and well separated. A small perturbation to the coefficient of
x^19, from −210 to −210 − 2^{−23}, changes the roots significantly, and some of them become
complex. Hence, care must be exercised while finding the roots of a polynomial.
