1 Introduction
Finding the solutions of nonlinear equations occurs often in scientific computing. For
example, let us consider the problem of finding the parameter λ in the catenary y = λ cosh(x/λ)
such that the length of the arc between x = 0 and x = 5 is 10. Now

    10 = ∫₀⁵ (ds/dx) dx = ∫₀⁵ cosh(x/λ) dx = λ sinh(5/λ)

Hence, finding the appropriate curve requires the value of λ, which can be found from the solution
of the transcendental equation

    λ sinh(5/λ) = 10
In this chapter, we discuss a few methods along with their convergence properties. Let α be a
solution of f(x) = 0 and let c_n be an approximation of the root. Here the suffix n denotes an
iteration index that will be introduced later. The error at the n-th stage is e_n = c_n − α. If

    |e_{n+1}| = A|e_n|^p,

then p is the order of the convergence and A is called the asymptotic rate constant. Clearly,
for p = 1 we need A < 1 for convergence.
2 Bisection method
Suppose that f is a continuous function on the interval [a, b] and f(a)f(b) < 0. By the intermediate
value theorem, f has at least one zero in the interval [a, b]. We next calculate c = (a + b)/2
and test f(c). If f(c) = 0, then c is the root and we are done. If not, then either f(a)f(c) < 0
or f(c)f(b) < 0. In the former case a root lies in [a, c], and we rename c as b and repeat the
process. In the latter case we rename c as a and continue the same process. The root now lies
in an interval whose length is half that of the original interval. The process is repeated,
and we stop the iteration when f(c) is very nearly zero or the length of the interval [a, b] is very
small.
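The procedure above fits in a few lines of Python (a minimal sketch; the function name `bisect` and the tolerances are my own choices). As a usage example it solves the catenary equation λ sinh(5/λ) = 10 from the introduction:

```python
import math

def bisect(f, a, b, tol=1e-10, max_iter=200):
    """Bisection: f continuous on [a, b] with f(a) f(b) < 0."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        c = (a + b) / 2
        fc = f(c)
        if fc == 0 or (b - a) / 2 < tol:
            return c
        if fa * fc < 0:          # root lies in [a, c]: rename c as b
            b, fb = c, fc
        else:                    # root lies in [c, b]: rename c as a
            a, fa = c, fc
    return (a + b) / 2

# The catenary problem from the introduction: find lam with lam*sinh(5/lam) = 10.
# f(1) > 0 and f(5) < 0, so [1, 5] brackets the root.
lam = bisect(lambda t: t * math.sinh(5 / t) - 10, 1.0, 5.0)
```

The returned λ is near 2.3 and satisfies the equation to roughly the interval tolerance times |f′| near the root.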
2.1 Convergence
Let a₀ = a, b₀ = b and let [a_n, b_n] (n ≥ 0) be the successive intervals in the bisection process.
Clearly

    a = a₀ ≤ a₁ ≤ a₂ ≤ ⋯ ≤ b₀ = b

and

    b = b₀ ≥ b₁ ≥ b₂ ≥ ⋯ ≥ a₀ = a

Now the sequence {a_n} is monotonically increasing and bounded above, and the sequence {b_n} is
monotonically decreasing and bounded below. Hence both sequences converge. Further,

    b_n − a_n = (b_{n−1} − a_{n−1})/2 = ⋯ = (b − a)/2^n

Hence lim_{n→∞} a_n = lim_{n→∞} b_n = r (say). Further, taking the limit in f(a_n)f(b_n) ≤ 0, we get [f(r)]² ≤ 0,
which implies f(r) = 0. Hence both a_n and b_n converge to a root r of f(x) = 0.
Let us apply the bisection method to the interval [a_n, b_n] and calculate the midpoint c_n =
(a_n + b_n)/2. Then the root lies either in [a_n, c_n] or in [c_n, b_n]. In either case

    |r − c_n| ≤ (b_n − a_n)/2 = (b − a)/2^{n+1}

Hence c_n → r as n → ∞.
In this method, we can calculate the number of iterations n needed to achieve
a specified accuracy. Suppose we want relative accuracy ε in the root, i.e.

    |r − c_n|/|r| ≤ ε

Suppose that the root lies in [a, b] where b > a > 0. Clearly |r| > a, and hence the above
relation holds if

    |r − c_n| ≤ εa

which in turn holds if

    (b − a)/2^{n+1} ≤ εa

Solving this, we can find the minimum number of iterations needed to obtain the desired accuracy.
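As a worked example (a sketch; the function name is mine), take [a, b] = [2, 3] and relative accuracy ε = 10⁻⁶. Solving (b − a)/2^{n+1} ≤ εa for n gives:

```python
import math

def bisection_steps(a, b, eps):
    # Smallest n with (b - a) / 2**(n + 1) <= eps * a,
    # i.e. n >= log2((b - a) / (eps * a)) - 1.
    return max(0, math.ceil(math.log2((b - a) / (eps * a)) - 1))

n = bisection_steps(2.0, 3.0, 1e-6)
```

Here n = 18 steps suffice, since 1/2¹⁹ ≈ 1.9 × 10⁻⁶ ≤ 2 × 10⁻⁶, while n = 17 is not enough.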
Now

    |e_{n+1}| = |r − c_{n+1}| ≤ (b_{n+1} − a_{n+1})/2 = (b_n − a_n)/4

and

    |e_n| = |r − c_n| ≤ (b_n − a_n)/2

Thus the bound on the error is halved in each iteration, i.e. loosely

    |e_{n+1}| ≈ (1/2)|e_n|

Hence the bisection method converges linearly.
[Figure: graph of a function f with a root between a and b; |f(a)| is small, the bisection midpoint is marked with a small circle, and the chord through (a, f(a)) and (b, f(b)) meets the x-axis near the root.]
3 Regula falsi
Consider the figure, in which the root lies between a and b. In the first iteration of the bisection
method, the approximation lies at the small circle. However, since |f(a)| is small, we expect
the root to lie near a. This can be achieved if we join the points (a, f(a)) and (b, f(b))
and take the intersection c of this line with the x-axis as the first approximation. Hence the
new approximation c satisfies

    (c − a)/(b − a) = (0 − f(a))/(f(b) − f(a))  ⟹  c = (a f(b) − b f(a))/(f(b) − f(a))

Since f(a) and f(b) are of opposite sign, the method is well defined. If f(c) = 0, then c
is the exact root. Otherwise we take b = c if f(a)f(c) < 0, and a = c if f(c)f(b) < 0.
This process is then repeated. Of course, the method does not work satisfactorily in all cases, and
certain modifications can be made.
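In code, the only change from bisection is how c is chosen (a minimal sketch; names and tolerances are my own):

```python
def regula_falsi(f, a, b, tol=1e-12, max_iter=200):
    """False position: bracket the root, but take c as the x-intercept
    of the chord through (a, f(a)) and (b, f(b))."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    c = a
    for _ in range(max_iter):
        c = (a * fb - b * fa) / (fb - fa)
        fc = f(c)
        if abs(fc) < tol:
            break
        if fa * fc < 0:      # root in [a, c]
            b, fb = c, fc
        else:                # root in [c, b]
            a, fa = c, fc
    return c

root = regula_falsi(lambda x: x**2 - 5, 1.0, 3.0)   # converges to sqrt(5)
```

Note the stopping test uses |f(c)|, since (as discussed below) the bracket length need not shrink to zero.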
Let [a_n, b_n] be the successive intervals of the regula falsi method. In this case b_n − a_n may
not go to zero as n → ∞. However, the method still converges to the root.
To show this, we consider the worst case. We assume that f″(x) exists and that, for some
i, f″(x) ≥ 0 in [a_i, b_i]. The case f″(x) ≤ 0 can be treated similarly. Also, suppose
that f(a_i) < 0 and f(b_i) > 0. Let c_i be the new approximation, which is nothing but the
intersection of the straight line through (a_i, f(a_i)) and (b_i, f(b_i)) with the x-axis. We claim
that a_i < a_{i+1} = c_i < b_{i+1} = b_i. To show this, note that the straight line is nothing but the
degree-one polynomial p(x) with p(c_i) = 0. Obviously a_i < c_i < b_i. Since f is convex, f(x) ≤ p(x) for x ∈ [a_i, b_i], and in particular

    f(c_i) ≤ p(c_i) = 0

If f(c_i) ≠ 0, then a_{i+1} = c_i > a_i and b_{i+1} = b_i. Hence, if the condition holds, then b_i = b_f
(say) for i ≥ i₀. Now a_i is monotonically increasing and bounded by b_f. Hence lim_{i→∞} a_i
exists and is equal to α (say). Since f is continuous, f(α) = lim_{i→∞} f(a_i) ≤ 0 and f(b_f) > 0,
and hence α ≠ b_f. Taking the limit in

    c_i = (a_i f(b_i) − b_i f(a_i))/(f(b_i) − f(a_i))

we find

    α = (α f(b_f) − b_f f(α))/(f(b_f) − f(α))

Hence we find

    (α − b_f) f(α) = 0

Since α ≠ b_f, we must have f(α) = 0, and a_i converges to a root α.
To find the order of convergence, let us consider the worst case discussed just above. Now,
writing x_i instead of a_i and b_i = b, the iteration can be written as

    x_{i+1} = (b f(x_i) − x_i f(b))/(f(x_i) − f(b)) = φ(x_i)

where

    φ(x) = (b f(x) − x f(b))/(f(x) − f(b))

Hence, for i ≥ i₀, the regula falsi method usually becomes a fixed point iteration method.
This method will converge if |φ′(α)| < 1. Now we can write

    φ′(α) = 1 − (b − α) f′(α)/(f(b) − f(α)) = 1 − f′(α)/f′(ξ₁),   α < ξ₁ < b,

using the mean value theorem on f(b) − f(α) in the last step. Again by the mean value theorem, there exists ξ₂ ∈ I(x_i, α) such that

    f(x_i) − f(α) = (x_i − α) f′(ξ₂)

Since f″(x) ≥ 0 in [x_i, b], f′(x) is monotonically increasing there and f′(ξ₂) = f(x_i)/(x_i − α) > 0.
This implies

    0 < f′(ξ₂) ≤ f′(α) ≤ f′(ξ₁)  ⟹  0 < f′(α)/f′(ξ₁) ≤ 1

Hence

    0 ≤ 1 − f′(α)/f′(ξ₁) < 1  ⟹  0 ≤ φ′(α) < 1.

Hence the fixed point iteration converges to α, i.e. α = φ(α), and the convergence of regula falsi in this case is linear.
4 Secant method
Here we don't insist on bracketing the root. Given two approximations
x_{n−1}, x_n, we take the next approximation x_{n+1} as the intersection of the line joining (x_{n−1}, f(x_{n−1}))
and (x_n, f(x_n)) with the x-axis. Thus x_{n+1} need not lie in the interval [x_{n−1}, x_n]. If the root
is α and α is a simple zero, then it can be proved that the method converges for initial guesses
in a sufficiently small neighbourhood of α. Now x_{n+1} is given by

    x_{n+1} = (x_{n−1} f(x_n) − x_n f(x_{n−1}))/(f(x_n) − f(x_{n−1})) = x_n − f(x_n)(x_n − x_{n−1})/(f(x_n) − f(x_{n−1}))
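A direct implementation (a sketch; names and tolerances are mine — the second form of the update is used, which avoids some cancellation):

```python
def secant(f, x0, x1, tol=1e-12, max_iter=100):
    """x_{n+1} = x_n - f(x_n) (x_n - x_{n-1}) / (f(x_n) - f(x_{n-1}))."""
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        if f1 == f0:                  # flat chord: cannot proceed
            break
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        x0, f0 = x1, f1
        x1, f1 = x2, f(x2)
        if abs(x1 - x0) < tol:
            break
    return x1

root = secant(lambda x: x**2 - 5, 1.0, 3.0)   # no bracketing required
```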
4.1 Convergence
Now

    e_{n+1} = α − x_{n+1} = α − (x_{n−1} f(x_n) − x_n f(x_{n−1}))/(f(x_n) − f(x_{n−1}))
            = (α − x_n) + f(x_n)(x_n − x_{n−1})/(f(x_n) − f(x_{n−1}))
            = (α − x_n) + (f(x_n) − f(α))/f[x_n, x_{n−1}]
            = −(α − x_n)(α − x_{n−1}) f[α, x_n, x_{n−1}]/f[x_n, x_{n−1}]
            = −e_n e_{n−1} f″(ξ₁)/(2f′(ξ₂)),

where ξ₁ ∈ I(α, x_{n−1}, x_n) and ξ₂ ∈ I(x_n, x_{n−1}). Here I(a, b, c) denotes the interior of the
interval formed by a, b and c. Since α is a simple zero, f′(α) ≠ 0. Consider the interval
J = {x : |x − α| ≤ δ} such that

    |f″(ξ₁)/(2f′(ξ₂))| ≤ M,   ξ₁, ξ₂ ∈ J

Now we have

    |e_{n+1}| ≤ M|e_n||e_{n−1}|

Let ε_n = M|e_n|. Then ε_{n+1} ≤ ε_n ε_{n−1}. Now choose initial guesses x₀ and x₁ such that
|x_i − α| < min{δ, 1/M} for i = 0, 1. This implies ε_i = M|x_i − α| < min{1, Mδ} for i = 0, 1. Now choose D with ε₀, ε₁ ≤ D < min{1, Mδ}, and
thus 0 < D < 1 and ε₀, ε₁ ≤ D < 1. Now

    ε₂ ≤ ε₁ ε₀ ≤ D²,  ε₃ ≤ ε₂ ε₁ ≤ D³,  ε₄ ≤ D⁵,  …

so ε_n ≤ D^{q_n}, where q_n satisfies the Fibonacci recurrence q_{n+1} = q_n + q_{n−1} with q₀ = q₁ = 1, and hence ε_n → 0. To identify the order, suppose |e_{n+1}| ≈ A|e_n|^p, so that |e_{n−1}| ≈ (|e_n|/A)^{1/p}. Substituting into |e_{n+1}| ≈ M|e_n||e_{n−1}| and matching exponents gives

    p = 1 + 1/p  ⟺  p² − p − 1 = 0

Taking the positive root, we find p = (1 + √5)/2 = r (the golden ratio), and matching the constants gives A = M^λ with λ = 1/p = p − 1 = r − 1. Hence

    |e_{n+1}| ≈ C^{r−1}|e_n|^r,   C = M.

The order of convergence is non-integer and greater than one; hence the convergence is
superlinear.
5 Newton-Raphson method
Let x₀ be an initial guess for the root α of f(x) = 0. Let h be the correction, i.e. α = x₀ + h.
Then f(α) = 0 implies f(x₀ + h) = 0. Now, assuming h small and f twice continuously
differentiable, we find

    f(x₀) + h f′(x₀) + (h²/2) f″(ξ) = 0,   ξ ∈ I(x₀, x₀ + h)

Neglecting the quadratic and higher order terms and assuming that α is a simple root, we find
h ≈ −f(x₀)/f′(x₀), which gives the Newton-Raphson iteration

    x_{n+1} = x_n − f(x_n)/f′(x_n),   n = 0, 1, 2, …
5.1 Convergence
If α is a simple root, then f′(α) ≠ 0 and hence f′(x) ≠ 0 in a sufficiently small neighbourhood of
α. Consider the interval J = {x : |x − α| ≤ δ} in which f′(x) ≠ 0 and

    |f″(ξ_n)/(2f′(x_n))| ≤ M,   x_n, ξ_n ∈ J

Now we write e_n = α − x_n, expand f(α) = 0 about x_n, divide both sides by f′(x_n) and use the iteration formula for the
Newton-Raphson method to arrive at

    e_{n+1} = −(f″(ξ_n)/(2f′(x_n))) e_n²  ⟹  |e_{n+1}| ≤ M|e_n|²

Let ε_n = M|e_n|. Then ε_{n+1} ≤ ε_n². Now choose the initial guess x₀ such that |x₀ − α| < min{δ, 1/M}.
This implies ε₀ = M|x₀ − α| < min{1, Mδ}. Now choose D with ε₀ ≤ D < min{1, Mδ}, and thus
0 < D < 1 and ε₀ ≤ D < 1. Now

    ε_n ≤ ε_{n−1}² ≤ ε_{n−2}⁴ ≤ ⋯ ≤ (ε₀)^{2^n} ≤ D^{2^n}

Hence ε_n → 0 and x_n → α as n → ∞, and the convergence is quadratic. Also, neglecting the second order term,

    e_n = α − x_n ≈ −f(x_n)/f′(x_n) = x_{n+1} − x_n

Hence, the error is approximately the difference between two successive iteration values.
Thus the difference between successive iteration values can be used as a stopping criterion.
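The iteration together with this stopping criterion fits in a few lines (a minimal sketch; names are mine, and the derivative is supplied by the caller):

```python
def newton(f, fprime, x0, tol=1e-12, max_iter=50):
    """Newton-Raphson; stops when the step |x_{n+1} - x_n|,
    which approximates the error e_n, falls below tol."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / fprime(x)
        x -= step
        if abs(step) < tol:
            break
    return x

root = newton(lambda x: x**2 - 5, lambda x: 2 * x, 3.0)   # converges to sqrt(5)
```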
6 Fixed point iteration
In this method, one writes f(x) = 0 in the form x = g(x), so that any solution of x = g(x)
(which is also called a fixed point of g) is a solution of f(x) = 0. This can be accomplished in many
ways. For example, with f(x) = x² − 5, we can write g(x) = (x + 5/x)/2, or g(x) = x + 5 − x²,
or g(x) = x − (5 − x²)/2, etc.
The function g(x) is also called an iteration function. Once a g(x) is chosen, we
carry out the iteration (starting from an initial guess x₀)

    x_{n+1} = g(x_n),   n = 0, 1, 2, …
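For example (a sketch; the function names are mine), iterating g(x) = (x + 5/x)/2 for f(x) = x² − 5:

```python
def fixed_point(g, x0, tol=1e-12, max_iter=100):
    """Iterate x_{n+1} = g(x_n) until successive iterates agree."""
    x = x0
    for _ in range(max_iter):
        x_new = g(x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# g(x) = (x + 5/x)/2 has g'(sqrt(5)) = 0, so this choice converges fast.
r = fixed_point(lambda x: (x + 5 / x) / 2, 2.0)
```

The choice of iteration function matters: g(x) = x + 5 − x² has |g′(√5)| = |1 − 2√5| > 1, so that iteration is repelled from the fixed point, which the theorem below explains.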
Theorem: Let g be defined on an interval I = [a, b] such that g(x) ∈ I, i.e. g maps I into
itself. Further, suppose that g is differentiable on I and there exists a nonnegative constant
K < 1 such that |g′(x)| ≤ K for all x ∈ I. Then there exists a unique fixed point α in I and
x_n → α as n → ∞.
Proof: If g(a) = a or g(b) = b, then obviously g has a fixed point. Hence suppose that
g(a) ≠ a and g(b) ≠ b. Since g maps I into I, we must have g(a) > a and g(b) < b.
Now consider the function h(x) = x − g(x); we must have h(a) < 0 and h(b) > 0. By the
intermediate value theorem, there exists α such that h(α) = 0, and hence the existence of a fixed
point is proved. To prove uniqueness, suppose that α and β are distinct fixed points. Then

    |α − β| = |g(α) − g(β)| = |g′(η)||α − β| ≤ K|α − β| < |α − β|,

a contradiction. With e_n = α − x_n, the mean value theorem gives |e_n| = |g(α) − g(x_{n−1})| ≤ K|e_{n−1}|, and hence

    |e_n| ≤ K|e_{n−1}| ≤ K²|e_{n−2}| ≤ ⋯ ≤ K^n|e₀|

Since 0 ≤ K < 1, we have K^n → 0 as n → ∞, and hence e_n → 0 as n → ∞.
Also, assuming that g is twice differentiable, we have

    e_{n+1} = α − x_{n+1} = g(α) − g(x_n) = g(α) − g(α − e_n) = e_n g′(α) − (e_n²/2) g″(c_n),   c_n ∈ I(α, x_n)

If g′(α) ≠ 0, then

    |e_{n+1}| ≈ A|e_n|,   A = |g′(α)|,

showing that the convergence is first order. On the other hand, when g′(α) = 0, then

    |e_{n+1}| = (1/2)|g″(c_n)||e_n|²,

showing that the convergence is second order. For example, Newton-Raphson is a special
case of fixed point iteration in which g(x) = x − f(x)/f′(x). If α is a simple root of f, then
g′(α) = 0 and the convergence of the Newton-Raphson method is second order.
It is often very difficult to verify the assumptions of the previous theorem. Hence many
times we check the following condition: if g is continuously differentiable in some open interval
J containing α and if |g′(α)| < 1, then there exists a δ > 0 such that the fixed point
iteration converges whenever we start with x₀ satisfying |x₀ − α| ≤ δ.
To show this, we take q = (1 + |g′(α)|)/2 < 1 and ε = (1 − |g′(α)|)/2 > 0. Since g′ is
continuous, there exists a δ > 0 such that

    |g′(x) − g′(α)| ≤ ε   whenever |x − α| ≤ δ.

Now

    |g′(x)| ≤ |g′(x) − g′(α)| + |g′(α)| ≤ ε + |g′(α)| = q

Consider I = [α − δ, α + δ]; we show that g maps I into itself. To show this, we note that
for x ∈ I

    |g(x) − α| = |g(x) − g(α)| = |g′(η)(x − α)| ≤ qδ < δ,

where η ∈ I(x, α), and hence g(x) ∈ I = [α − δ, α + δ]. Thus the previous theorem applies with K = q, and the iteration converges.
7 Roots of polynomials
Finding roots of polynomials also involves complex roots. Also, sometimes we are interested
in finding all the roots of a polynomial. We know that a polynomial of degree n has n roots
(counting multiplicity) in the complex field. We first deal with a localization theorem.
Theorem: All roots of the polynomial

    p(z) = a_n z^n + a_{n−1} z^{n−1} + ⋯ + a₀,   a_n ≠ 0,

lie in the open disk whose centre is at the origin of the complex plane and whose radius is

    ρ = 1 + c/|a_n|,   c = max_{0≤i≤n−1} |a_i|.

Proof: If c = 0, then there is nothing to prove. Hence assume that c > 0, and
then ρ > 1. Now we show that p(z) does not vanish in the region |z| ≥ ρ. To show this, we
find (noting that |z| ≥ ρ > 1 and c|a_n|⁻¹ = ρ − 1)

    |p(z)| ≥ |a_n||z|^n − c(|z|^{n−1} + ⋯ + |z| + 1) = |a_n||z|^n − c(|z|^n − 1)/(|z| − 1)
          > |a_n||z|^n (1 − (ρ − 1)/(|z| − 1)) ≥ 0,

since |z| − 1 ≥ ρ − 1. Hence p has no root with |z| ≥ ρ.
A companion bound keeps the roots away from the origin: consider

    s(z) = a₀ z^n + a₁ z^{n−1} + ⋯ + a_n

Note that p(z₀) = 0 implies s(1/z₀) = 0 for z₀ ≠ 0. Hence, if all the roots of s lie inside the disk |z| ≤ ρ_s,
then all the nonzero roots of p lie outside the disk |z| < 1/ρ_s.
8 Horner's algorithm
This is also known as nested multiplication and as synthetic division. For a polynomial
p(z) = a_n z^n + a_{n−1} z^{n−1} + ⋯ + a₀ and a number z₀, we can write p(z) = (z − z₀)q(z) + p(z₀),
where

    q(z) = b_{n−1} z^{n−1} + b_{n−2} z^{n−2} + ⋯ + b₀

is a polynomial of degree one less than that of p. Substituting q(z) and equating like powers, we
find

    b_{n−1} = a_n,   b_{n−2} = a_{n−1} + b_{n−1} z₀,   …,   b₀ = a₁ + b₁ z₀,   p(z₀) = a₀ + b₀ z₀
Thus we can use Horner's algorithm to find the value of a polynomial at any point z₀. This can
also be used to deflate a polynomial if we know that z₀ is a root. The method can also be
used to find the Taylor expansion of a polynomial about any point. Suppose

    p(z) = c_n(z − z₀)^n + c_{n−1}(z − z₀)^{n−1} + ⋯ + c₀

Clearly c_k = p^(k)(z₀)/k!. We can use Horner's algorithm to find the c_k efficiently: c₀ = p(z₀),
which is obtained by applying nested multiplication to p(z). The method also gives the deflated
polynomial q(z) with p(z) = (z − z₀)q(z) + c₀, and since

    q(z) = c_n(z − z₀)^{n−1} + ⋯ + c₂(z − z₀) + c₁,

we can obtain c₁ by applying nested multiplication to q(z). This process can be repeated.
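Both uses, evaluation/deflation and Taylor coefficients, are easy to code (a minimal sketch; the function names are mine):

```python
def horner(coeffs, z0):
    """coeffs = [a_n, ..., a_1, a_0].  Returns (p(z0), q) where
    p(z) = (z - z0) q(z) + p(z0) and q = [b_{n-1}, ..., b_0]."""
    b, q = coeffs[0], []
    for a in coeffs[1:]:
        q.append(b)
        b = a + b * z0
    return b, q

def taylor_coeffs(coeffs, z0):
    """Repeated deflation at z0 yields [c_0, c_1, ..., c_n] with
    p(z) = sum_k c_k (z - z0)^k."""
    cs = []
    while coeffs:
        c, coeffs = horner(coeffs, z0)
        cs.append(c)
    return cs

# Example: p(z) = z^3 - 6 z^2 + 11 z - 6 = (z - 1)(z - 2)(z - 3)
value, q = horner([1, -6, 11, -6], 2.0)          # p(2) = 0; q deflates out (z - 2)
coeffs_about_1 = taylor_coeffs([1, -6, 11, -6], 1.0)
```

Here p(2) = 0 and q represents z² − 4z + 3 = (z − 1)(z − 3), while the coefficients about z₀ = 1 come out as [0, 2, −3, 1], i.e. p(z) = (z − 1)³ − 3(z − 1)² + 2(z − 1).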
9 Newton-Raphson method for polynomials
We use the iteration

    z_{k+1} = z_k − p(z_k)/p′(z_k)

Note that p(z) = (z − z_k)q(z) + p(z_k) and hence p′(z_k) = q(z_k). Hence both numerator and
denominator can be obtained by two steps of nested multiplication. To obtain complex roots,
we need a complex initial guess. Alternatively, we can write z = α + iβ, and substituting
in p(z) = 0 we get two equations F(α, β) = 0, G(α, β) = 0 that can be solved together using the
Newton-Raphson technique with real arithmetic.
Suppose we obtain a root α₁ by the Newton-Raphson method. Then by the deflation method,
p(z) ≈ (z − α₁)q₁(z), where q₁ is a polynomial of degree one less than p. Now we apply
Newton-Raphson to q₁ and find another root α₂. We can proceed this way and find all the
roots α₁, α₂, …, α_n such that

    p(z) ≈ a_n(z − α₁)(z − α₂) ⋯ (z − α_n)

Of course, the error in the roots increases from α₂ to α_n, since the error gradually builds up in
the deflation method. One remedy is to take α₂, …, α_n as initial guesses and iterate again with the full
polynomial.
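The whole scheme can be sketched as follows (names are mine; the polishing step re-runs Newton on the full polynomial, as in the remedy just described):

```python
def horner(coeffs, z0):
    """coeffs = [a_n, ..., a_0].  Returns (p(z0), q) with p(z) = (z - z0) q(z) + p(z0)."""
    b, q = coeffs[0], []
    for a in coeffs[1:]:
        q.append(b)
        b = a + b * z0
    return b, q

def poly_newton(coeffs, z0, tol=1e-12, max_iter=200):
    """Newton iteration z <- z - p(z)/p'(z); since p'(z) = q(z), both
    values come from two nested multiplications."""
    z = z0
    for _ in range(max_iter):
        p, q = horner(coeffs, z)
        dp, _ = horner(q, z)
        if dp == 0:
            break
        step = p / dp
        z = z - step
        if abs(step) < tol:
            break
    return z

def all_roots(coeffs, guess=0.0):
    """Find roots one at a time, deflating after each, but polish every
    root against the full polynomial to limit the accumulated error.
    (Pass a complex guess to reach complex roots.)"""
    roots, work = [], list(coeffs)
    while len(work) > 2:
        r = poly_newton(work, guess)
        r = poly_newton(coeffs, r)       # polish with the full polynomial
        roots.append(r)
        _, work = horner(work, r)        # deflate
    roots.append(-work[1] / work[0])     # remaining linear factor
    return roots

rs = all_roots([1, -6, 11, -6])          # p(z) = (z - 1)(z - 2)(z - 3)
```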
10 Muller's method
This is a generalization of the secant method. The method works well for simple and multiple roots,
and it may converge to a complex root even if we start with real initial guesses. It
works for non-polynomial functions too. Here we fit a polynomial of degree 2 through the three interpolation
points x_{i−2}, x_{i−1}, x_i:

    p(x) = f[x_i] + f[x_i, x_{i−1}](x − x_i) + f[x_i, x_{i−1}, x_{i−2}](x − x_i)(x − x_{i−1})
         = f[x_i] + f[x_i, x_{i−1}](x − x_i) + f[x_i, x_{i−1}, x_{i−2}][(x − x_i)² + h_i(x − x_i)]
         = a_i(x − x_i)² + 2b_i(x − x_i) + c_i,

where a_i = f[x_i, x_{i−1}, x_{i−2}], b_i = (h_i f[x_i, x_{i−1}, x_{i−2}] + f[x_i, x_{i−1}])/2, c_i = f[x_i] and h_i = x_i − x_{i−1}.
Let λ_i be the root of smallest absolute value of the quadratic equation a_i λ² + 2b_i λ + c_i = 0.
Then x_{i+1} = x_i + λ_i is the root of p(x) = 0 closest to x_i. Note that

    λ_i = (−b_i ± √(b_i² − a_i c_i))/a_i = −c_i/(b_i ± √(b_i² − a_i c_i))

We need the sign that makes the denominator largest in absolute value. Hence

    λ_i = −c_i/(b_i + sgn(b_i)√(b_i² − a_i c_i))

Complex arithmetic has to be used, since b_i² − a_i c_i might be negative. Once we get x_{i+1},
we repeat the same procedure with the interpolation points x_{i−1}, x_i and x_{i+1}. It can be shown
that

    e_{i+1} ≈ −(f‴(ξ)/(6f′(α))) e_i e_{i−1} e_{i−2}

Further, if |e_{i+1}| ≈ |e_i|^p, then p ≈ 1.84, and hence the convergence is superlinear and better
than that of the secant method.
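A sketch in Python (names are mine); `cmath.sqrt` supplies the complex arithmetic mentioned above:

```python
import cmath

def muller(f, x0, x1, x2, tol=1e-12, max_iter=100):
    """Fit a quadratic through the last three points and step to its root
    nearest x2, choosing the sign that maximizes |denominator|."""
    for _ in range(max_iter):
        f0, f1, f2 = f(x0), f(x1), f(x2)
        d01 = (f1 - f0) / (x1 - x0)        # f[x_{i-1}, x_{i-2}]
        d12 = (f2 - f1) / (x2 - x1)        # f[x_i, x_{i-1}]
        a = (d12 - d01) / (x2 - x0)        # f[x_i, x_{i-1}, x_{i-2}]
        b = (d12 + a * (x2 - x1)) / 2      # b_i = (h_i a_i + f[x_i, x_{i-1}]) / 2
        c = f2                             # c_i = f[x_i]
        disc = cmath.sqrt(b * b - a * c)   # complex when b^2 - a c < 0
        denom = b + disc if abs(b + disc) >= abs(b - disc) else b - disc
        if denom == 0:
            break
        x3 = x2 - c / denom                # x_{i+1} = x_i - c_i / (b_i +/- sqrt(...))
        if abs(x3 - x2) < tol:
            return x3
        x0, x1, x2 = x1, x2, x3
    return x2

root = muller(lambda x: x * x - 5, 1.0, 2.0, 3.0)     # real root
croot = muller(lambda x: x * x + 1, 0.0, 1.0, 2.0)    # complex root from real guesses
```

In the second call all three starting points are real, yet the iteration lands on one of the complex roots ±i of x² + 1, illustrating the remark above.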
Finally, a word of caution. Consider Wilkinson's example

    p(x) = (x − 1)(x − 2) ⋯ (x − 20) = x²⁰ − 210x¹⁹ + ⋯,

whose roots are simple, real and well separated. A small perturbation of the coefficient of x¹⁹
from −210 to −(210 + 2⁻²³) changes the roots significantly, and some of them become complex.
Hence, care must be exercised while finding the roots of a polynomial.