
Note Set 1 The Basics

1.1 Overview

In this note set, we will cover the very basic tools of numerical methods.
Numerical methods are useful for both formal theory and methods because they free us
from having to employ restrictive assumptions in order to obtain solutions to important
problems.
Consider applications to formal theory first. Stylized formal models often show
that some result holds under a certain set of conditions. A good theorist, however, will
attempt to characterize all conditions under which that result holds. The ideal is an if-and-only-if
result. This may not be possible, but general conditions are more informative than
specific ones. When we go down this road, however, we may be able to derive certain
nice results, but lack specificity. Without restrictive assumptions, we may not be able to
solve our model analytically. When we do make restrictive assumptions, we would at least
like them to be realistic. The set of realistic assumptions will not always be the
one that leads to an easy solution. Here is where numerical methods provide a great
advantage: they free us from having to choose the set of assumptions that makes
our model easy to solve, and allow us to choose these assumptions based on other
concerns (e.g., realism, generality).

1.2 Numerical Differentiation

Differentiation differs from many of the operations we consider because it
transforms an analytical expression into another analytical expression. If our original
expression is made up of plusses, minuses, multiplication signs, division signs, powers,
roots, exponentials, logarithms, sines, cosines, etc., then the resulting derivative will have
a representation made up of the same set of expressions. If all we were required to do was
compute derivatives, we might never need to employ numerical methods. However, mixing
analytic and numerical methods can be hard. Because of this, we will often apply
numerical differentiation simply because it is required by other numerical procedures.
And even if we are able to compute a function's derivative analytically, it may still be
advantageous to apply numerical differentiation when studying the function's behavior.
There are many important applications of numerical differentiation. Many
methods for solving nonlinear systems of equations and for optimizing nonlinear
functions require that derivatives be supplied. It is sometimes possible to supply analytic
derivatives to these routines, but there are limitations to this approach. Computing
analytical derivatives can range from tedious to nearly impossible (imagine optimizing a
log-likelihood that depends on a variance matrix through its Cholesky decomposition). Other
applications of numerical derivatives include computing standard errors for econometric
models, computing marginal effects for econometric models, and solving games with
continuous strategy spaces.
Recall that the definition of the derivative of the function f(x) is,

f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}

We assume that we have access to a numerical function that computes f(x). For example,
we may have the C++ code,

double Func(double x)
{
    // f(x) = x^2 + 7
    return x * x + 7;
}

We would like to be able to take the function Func and the point x as inputs and return
the derivative of Func at x. The definition of the derivative suggests an approach: select a
small value h and compute,

f'(x) \approx \frac{f(x+h) - f(x)}{h}

The key question here is, how small should h be? For example, why not choose
h = 1.0 \times 10^{-80}? How small we can select h is limited by the precision of the computer.
Real numbers are stored on a computer using a finite number of bits representing the
significand, the sign, and the exponent. A float is represented by 4 bytes (or 32 bits)
and a double is represented by 8 bytes (or 64 bits). If we select h = 1.0 \times 10^{-80}, then the
computer will likely not be able to return a different value for f(x+h) and f(x), and
our estimated derivative will be (arbitrarily) evaluated as zero.
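As a quick illustration (a minimal sketch; the particular values of x and h are only for demonstration), the following program shows that a step of 10^{-80} is simply absorbed by double precision, so a difference quotient built from it would be exactly zero:

#include <cstdio>

int main()
{
    double x = 2.75;
    double h = 1.0e-80;
    double xph = x + h;   // rounds back to exactly x in double precision
    std::printf("x + h == x ? %s\n", (xph == x) ? "yes" : "no");
    std::printf("(x + h) - x = %.17g\n", xph - x);   // prints 0
    return 0;
}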

There are two major sources of error in computing numerical derivatives: truncation
error and round-off error. The first trick we employ is to make sure that x and
x + h differ exactly by a number that can be represented by the computer. This will reduce
one source of round-off error to zero. This can be accomplished using the following lines
of code,

double Temp = x + h;
h = Temp - x;

Let \epsilon_m denote machine epsilon, the smallest number such that the computer can
distinguish 1 + \epsilon_m from 1. For example, for a typical Intel/AMD PC, this number will be
2.22045 \times 10^{-16} for doubles. There exist routines that will compute this number for your
computer. The remaining round-off error will have size \sim \epsilon_m |f(x)/h|. To calculate the
truncation error, consider the Taylor expansion,

f(x+h) = f(x) + f'(x)h + \tfrac{1}{2} f''(x)h^2 + \ldots


The truncation error is given by \sim \tfrac{1}{2} f''(x)h. Now, imagine choosing the size of h to
minimize the total error

\tfrac{1}{2} f''(x)h + \epsilon_m |f(x)|/h

We will obtain,

h \approx \sqrt{\epsilon_m \frac{f}{f''}}

It is often good enough to choose \sqrt{f/f''} \approx |x| when x is not too close to zero. This leads to the
heuristic,

h \approx \begin{cases} \sqrt{\epsilon_m}\,|x|, & |x| \neq 0 \\ \sqrt{\epsilon_m}, & \text{otherwise} \end{cases}

An alternative that is sometimes used is h \approx \sqrt{\epsilon_m}(1 + |x|). This may seem a little ad hoc, but we will see later that this method is
quite good (and certainly a lot better than guessing!).
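As a minimal sketch of this recipe (the test function and the evaluation point are purely illustrative), a forward-difference routine using the step h = \sqrt{\epsilon_m}(1+|x|) and the representability trick might look like this in C++:

#include <cfloat>
#include <cmath>
#include <cstdio>

double Func(double x)
{
    return x * x + 7.0;
}

// Forward difference with the heuristic step h = sqrt(machine epsilon) * (1 + |x|).
double ForwardDiff(double (*f)(double), double x)
{
    double h = std::sqrt(DBL_EPSILON) * (1.0 + std::fabs(x));
    double temp = x + h;
    h = temp - x;   // make x and x + h differ by an exactly representable amount
    return (f(x + h) - f(x)) / h;
}

int main()
{
    std::printf("f'(2.0) approx %.10f\n", ForwardDiff(Func, 2.0));   // exact answer is 4
    return 0;
}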

The truncation error in the above calculation is O(h). We can do better than this
by employing a higher-order Taylor expansion,

f(x+h) = f(x) + f'(x)h + \tfrac{1}{2} f''(x)h^2 + \tfrac{1}{6} f'''(x)h^3 + \ldots

f(x-h) = f(x) - f'(x)h + \tfrac{1}{2} f''(x)h^2 - \tfrac{1}{6} f'''(x)h^3 + \ldots

Notice that,

f'(x) = \frac{f(x+h) - f(x-h)}{2h} - \tfrac{1}{6} f'''(x)h^2 + \ldots

This is the central (second-order) difference formula for computing numerical derivatives. Notice that
this yields a more accurate calculation, with an accuracy of O(h^2), but requires an
extra function evaluation (assuming that f(x) is already available). One can use the same
approach to calculate an optimal h. We get h \approx \epsilon_m^{1/3} |x|.
We may also want to obtain higher-order derivatives. We can obtain these
derivatives by iterating the approach suggested above. For example, let us combine,

f(x+h) = f(x) + f'(x)h + \tfrac{1}{2} f''(x)h^2 + \tfrac{1}{6} f'''(x)h^3 + \tfrac{1}{24} f^{(4)}(x)h^4 + \ldots

f(x-h) = f(x) - f'(x)h + \tfrac{1}{2} f''(x)h^2 - \tfrac{1}{6} f'''(x)h^3 + \tfrac{1}{24} f^{(4)}(x)h^4 - \ldots

to obtain approximations to f'(x) at two points,

\frac{f(x+h) - f(x)}{h} = f'(x) + \tfrac{1}{2} f''(x)h + \tfrac{1}{6} f'''(x)h^2 + \tfrac{1}{24} f^{(4)}(x)h^3 + \ldots

\frac{f(x) - f(x-h)}{h} = f'(x) - \tfrac{1}{2} f''(x)h + \tfrac{1}{6} f'''(x)h^2 - \tfrac{1}{24} f^{(4)}(x)h^3 + \ldots

We can then apply the first-difference principle again to obtain,

\frac{1}{h}\left[ \frac{f(x+h) - f(x)}{h} - \frac{f(x) - f(x-h)}{h} \right] = f''(x) + \tfrac{1}{12} f^{(4)}(x)h^2 + \ldots

We can write this as,

f''(x) = \frac{f(x+h) - 2f(x) + f(x-h)}{h^2} - \tfrac{1}{12} f^{(4)}(x)h^2 + \ldots

We can apply a similar principle to obtain h \approx \epsilon_m^{1/4} |x|, with an error rate of O(h^2).
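The same ideas translate directly into code. The sketch below (purely illustrative; the step-size constants follow the heuristics just described) computes the central first difference with h \approx \epsilon_m^{1/3}|x| and the second difference with h \approx \epsilon_m^{1/4}|x|:

#include <cfloat>
#include <cmath>

// Central difference for f'(x), accuracy O(h^2), with h ~ eps^(1/3) * |x|.
double CentralDiff(double (*f)(double), double x)
{
    double scale = (std::fabs(x) > 0.0) ? std::fabs(x) : 1.0;
    double h = std::cbrt(DBL_EPSILON) * scale;
    double temp = x + h;
    h = temp - x;   // keep the step exactly representable
    return (f(x + h) - f(x - h)) / (2.0 * h);
}

// Second difference for f''(x), accuracy O(h^2), with h ~ eps^(1/4) * |x|.
double SecondDiff(double (*f)(double), double x)
{
    double scale = (std::fabs(x) > 0.0) ? std::fabs(x) : 1.0;
    double h = std::pow(DBL_EPSILON, 0.25) * scale;
    double temp = x + h;
    h = temp - x;
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h);
}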


The question arises, how well do these methods work in practice? In practice, they
work quite well, as the results in Table 1.1 below show. We computed the first and second
derivatives of the function f(x) = 1.5x^{-1} at x = 2.75 for step sizes of various values, as
well as for the optimal values indicated by the formulas. We can see that the optimal values
do a good job of finding the best possible step size. C++ code for this example will be
available on the course website.

The discussion above assumed that the function f(x) could be computed at
machine precision. Depending on the function we are dealing with, this may or may not
be the case. For simple problems, it will be. Suppose, alternatively, that we are
optimizing a likelihood that involves an integral, which is computed using quadrature
methods (see the next section). This will involve error in the calculation and will likely
result in a bumpy objective function. Figure 1.2 plots a simulated method of moments
objective function. These can be quite messy at a small scale, so if we want to compute
derivatives, we should use step sizes on the scale where the function begins to be smooth.

Table 1.1: Errors in the First and Second Derivatives of f(x) = 1.5x^{-1} at x = 2.75

First derivatives:

h            Forward      Backward     Central
1.00E-01     6.96E-03    -7.48E-03    -2.63E-04
1.00E-02     7.19E-04    -7.24E-04    -2.62E-06
1.00E-03     7.21E-05    -7.22E-05    -2.62E-08
1.00E-04     7.21E-06    -7.21E-06    -2.62E-10
1.00E-05     7.21E-07    -7.21E-07     8.26E-13
1.00E-06     7.21E-08    -7.22E-08    -5.05E-11
1.00E-07     7.26E-09    -7.18E-09     4.13E-11
1.00E-08     8.26E-10    -1.03E-08    -4.73E-09
1.00E-09     7.34E-09     7.34E-09     7.34E-09
1.00E-10     2.29E-07    -8.81E-07    -3.26E-07
1.00E-11     5.78E-06    -5.32E-06     2.29E-07
1.00E-12     7.89E-05    -3.21E-05     2.34E-05
1.00E-13     5.69E-04    -5.42E-04     1.38E-05
1.00E-14     2.69E-03    -8.17E-03    -2.74E-03
1.00E-15     7.33E-02    -5.17E-02     1.08E-02
1.00E-16     -nan        -nan         -nan
1.00E-17     -nan        -nan         -nan
1.00E-18     -nan        -nan         -nan
1.00E-19     -nan        -nan         -nan
Optimal h    5.08E-09    -3.05E-09    -6.17E-12

Second derivatives:

h            Forward      Backward
1.00E-01    -1.45E-02     1.72E-02
1.00E-02    -1.56E-03     1.59E-03
1.00E-03    -1.57E-04     1.58E-04
1.00E-04    -1.58E-05     1.57E-05
1.00E-05    -3.38E-06     1.06E-06
1.00E-06     7.66E-05    -1.45E-04
1.00E-07     7.66E-05     7.66E-05
1.00E-08    -1.44E-01    -1.25E+00
1.00E-09    -1.44E-01    -1.44E-01
1.00E-10    -1.44E-01    -1.11E+04
1.00E-11    -1.11E+06    -1.11E+06
1.00E-12    -1.11E+08    -1.44E-01
1.00E-13    -1.11E+10    -1.11E+10
1.00E-14    -1.44E-01    -1.06E+12
1.00E-15    -1.41E+14    -1.44E-01
1.00E-16     -nan        -nan
1.00E-17     -nan        -nan
1.00E-18     -nan        -nan
1.00E-19     -nan        -nan
Optimal h   -5.28E-05     5.28E-05

Figure 1.2: Plot of a Simulated Method of Moments Objective Function

[Figure omitted. The plot shows the objective function (vertical axis, roughly 0 to 0.016) over a very narrow range of parameter values (horizontal axis, roughly 0.100000 to 0.100009), where it is visibly jagged.]

1.3 Numerical Integration

Unlike differentiation, most integration problems will not admit an analytical
solution. Hence, we often have no choice but to employ numerical methods. For example,
consider the integral \int (1+x)\, e^{x + x^2}\, dx. The integral clearly does not admit an analytical
solution. Like numerical differentiation, you can probably guess what the first approach
to numerical integration will be.
Recall the definition of the Riemann integral. For any partition
a = x_0 < x_1 < \ldots < x_n = b and points t_i \in [x_{i-1}, x_i], we consider the approximation,

\int_{x=a}^{b} f(x)\,dx \approx \sum_{i=1}^{n} f(t_i)\,(x_i - x_{i-1})

If we consider partitions such that the distance between x_{i-1} and x_i is arbitrarily small for
all i, we get the value of the integral. This suggests the following formula for numerical
integration,

\int_{x=a}^{b} f(x)\,dx \approx \sum_{i=1}^{n} f\!\left(\tfrac{1}{2}(x_{i-1} + x_i)\right)(x_i - x_{i-1})

If we set x_i = a + \tfrac{i}{n}(b-a), we get,

\int_{x=a}^{b} f(x)\,dx \approx (b-a)\,\tfrac{1}{n} \sum_{i=1}^{n} f\!\left(a + \tfrac{2i-1}{2n}(b-a)\right)

More precisely, let us consider integrating the function between x_i and x_{i+1}, where
h = x_{i+1} - x_i. Using a Taylor expansion, we can obtain,

F(x_{i+1}) = F(x_i) + f(x_i)h + \tfrac{1}{2} f'(x_i)h^2 + \ldots

where F denotes the indefinite integral of f. Hence, we have,

\int_{x=x_i}^{x_{i+1}} f(x)\,dx = f(x_i)h + \tfrac{1}{2} f'(x_i)h^2 + \ldots

Summing these terms, we obtain,

\int_{x=a}^{b} f(x)\,dx = \sum_{i=0}^{n-1} \int_{x=x_i}^{x_{i+1}} f(x)\,dx = h \sum_{i=0}^{n-1} f(x_i) + \tfrac{1}{2} h^2 \sum_{i=0}^{n-1} f'(x_i) + \ldots

We therefore have, with h = (b-a)/n,

\int_{x=a}^{b} f(x)\,dx = h \sum_{i=0}^{n-1} f(a + ih) + O(h)

This is almost exactly the same as the formula derived above. A higher-order expansion
will improve on the accuracy, but requires higher-order differentiability of the function
being integrated.
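As a minimal sketch of the midpoint-style rule above (the integrand and the interval in main are only illustrative), we could write:

#include <cmath>
#include <cstdio>

// Midpoint rule: integrate f over [a, b] using n equal subintervals.
double Midpoint(double (*f)(double), double a, double b, int n)
{
    double h = (b - a) / n;
    double sum = 0.0;
    for (int i = 0; i < n; ++i)
        sum += f(a + (i + 0.5) * h);   // evaluate f at the midpoint of each subinterval
    return h * sum;
}

double Integrand(double x)
{
    return std::exp(x);
}

int main()
{
    // The exact answer is e - 1 = 1.7183.
    std::printf("%.6f\n", Midpoint(Integrand, 0.0, 1.0, 100));
    return 0;
}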
The methods outlined above are sometimes useful, but only for problems that are
very messy or problems for which extremely high precision is desired. I'll elaborate on
this later. Perhaps the most widely used method is Gaussian quadrature. Gaussian
quadrature can be used for functions that are well approximated by a polynomial. In
particular, n-point quadrature will yield an exactly correct answer for functions that
are equal to a (2n-1)-degree polynomial multiplied by some known weighting function
W(x). In particular, the formula is,

\int_{x=a}^{b} W(x) f(x)\,dx \approx \sum_{i=1}^{n} w_i f(x_i)

This will produce an exactly correct answer when f(x) is a polynomial of degree 2n-1 or less. The trick then is
to find the weights w_i and the evaluation points x_i.


The following are the weighting functions that are typically used (each one is
given a name):

1. Gauss-Legendre quadrature: W(x) = 1 for -1 < x < 1

2. Gauss-Chebyshev quadrature: W(x) = (1 - x^2)^{-1/2} for -1 < x < 1

3. Gauss-Laguerre quadrature: W(x) = x^{\alpha} e^{-x} for 0 < x < \infty

4. Gauss-Hermite quadrature: W(x) = e^{-x^2} for -\infty < x < \infty

5. Gauss-Jacobi quadrature: W(x) = (1-x)^{\alpha} (1+x)^{\beta} for -1 < x < 1

How do we go about finding the weights and evaluation points, then? Consider
Gauss-Hermite quadrature and consider the polynomial \sum_{i=0}^{2n-1} a_i x^i. We have,

\int_{x=-\infty}^{\infty} \left( \sum_{i=0}^{2n-1} a_i x^i \right) e^{-x^2}\,dx = \sum_{i=0}^{2n-1} a_i \int_{x=-\infty}^{\infty} x^i e^{-x^2}\,dx

Integrating by parts, we can determine that,

\int_{x=-\infty}^{\infty} x^i e^{-x^2}\,dx = \tfrac{1}{2}(i-1) \int_{x=-\infty}^{\infty} x^{i-2} e^{-x^2}\,dx, \qquad
\int_{x=-\infty}^{\infty} x e^{-x^2}\,dx = 0, \qquad
\int_{x=-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}

Notice, we want to have,

\sum_{i=0}^{2n-1} a_i \int_{x=-\infty}^{\infty} x^i e^{-x^2}\,dx = \sum_{i=1}^{n} w_i \sum_{j=0}^{2n-1} a_j (x_i)^j

For example, when n = 2, we have,

a_0 \int_{x=-\infty}^{\infty} e^{-x^2}\,dx + a_1 \int_{x=-\infty}^{\infty} x e^{-x^2}\,dx + a_2 \int_{x=-\infty}^{\infty} x^2 e^{-x^2}\,dx + a_3 \int_{x=-\infty}^{\infty} x^3 e^{-x^2}\,dx
= a_0 \sqrt{\pi} + a_2 \tfrac{1}{2}\sqrt{\pi}
= (w_1 + w_2) a_0 + (w_1 x_1 + w_2 x_2) a_1 + (w_1 x_1^2 + w_2 x_2^2) a_2 + (w_1 x_1^3 + w_2 x_2^3) a_3


Notice that we need,

w_1 + w_2 = \sqrt{\pi}
w_1 x_1 + w_2 x_2 = 0
w_1 x_1^2 + w_2 x_2^2 = \tfrac{1}{2}\sqrt{\pi}
w_1 x_1^3 + w_2 x_2^3 = 0

One can show that the solution to this system satisfies,

w_1 = w_2 = \tfrac{\sqrt{\pi}}{2}, \quad x_1 = -\tfrac{1}{\sqrt{2}}, \quad x_2 = \tfrac{1}{\sqrt{2}}

This procedure works more generally, and for all of the quadrature formulas, except that
we don't solve the systems by hand! Instead, there are standard computer programs that are
designed to compute the solutions to such systems of equations.
How do we choose which formula to use? The most important concern is the
range of the function. For example, if we want to compute the integral

\int_{x=3}^{7} x e^{-\frac{1}{2}x^2}\,dx

we would see that the function has a finite range. Therefore, we would apply Legendre,
Chebyshev, or Jacobi quadrature. Gauss-Hermite would be a poor choice here because, even though
we can write this integral as

\int_{x=-\infty}^{\infty} 1\{3 \leq x \leq 7\}\, x e^{-\frac{1}{2}x^2}\,dx

the resulting function 1\{3 \leq x \leq 7\}\, x would not be well approximated by a polynomial.


Suppose we are estimating a Heckman selection model and have the equations,
y_n^* = \beta' x_n + \epsilon_n
r_n^* = \gamma' z_n + \nu_n

where \epsilon_n and \nu_n are standard normal random deviates with correlation \rho. Now, we
observe y_n = 1\{y_n^* \geq 0\} only if r_n = 1\{r_n^* \geq 0\} = 1. This means that, conditional on (x_n, z_n),
we observe three possible events: r_n = 0; y_n = 0, r_n = 1; and y_n = 1, r_n = 1. Computing the
probability of the first event is easy,

\Pr(r_n = 0 \mid x_n, z_n) = \Pr(r_n^* < 0 \mid x_n, z_n) = \Pr(\gamma' z_n + \nu_n < 0 \mid x_n, z_n)
= \Pr(\nu_n < -\gamma' z_n \mid x_n, z_n) = \Phi(-\gamma' z_n)

Consider one of the other events,

\Pr(y_n = 0, r_n = 1 \mid x_n, z_n) = \Pr(y_n^* < 0, r_n^* \geq 0 \mid x_n, z_n)
= \Pr(\beta' x_n + \epsilon_n < 0,\ \gamma' z_n + \nu_n \geq 0 \mid x_n, z_n)
= \Pr(\epsilon_n < -\beta' x_n,\ \nu_n \geq -\gamma' z_n \mid x_n, z_n)
= \int_{\nu=-\gamma' z_n}^{\infty} \int_{\epsilon=-\infty}^{-\beta' x_n} \frac{1}{2\pi\sqrt{1-\rho^2}} \exp\!\left( -\frac{\epsilon^2 - 2\rho\epsilon\nu + \nu^2}{2(1-\rho^2)} \right) d\epsilon\, d\nu

We can reduce the integrals by completing the square (or factoring the joint distribution
into a marginal and a conditional),

\Pr(y_n = 0, r_n = 1 \mid x_n, z_n) = \int_{\nu=-\gamma' z_n}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}\nu^2} \int_{\epsilon=-\infty}^{-\beta' x_n} \frac{1}{\sqrt{2\pi}\,(1-\rho^2)^{1/2}} e^{-\frac{(\epsilon - \rho\nu)^2}{2(1-\rho^2)}}\, d\epsilon\, d\nu

If we use the change of variables u = \frac{\epsilon - \rho\nu}{\sqrt{1-\rho^2}}, we obtain,

\Pr(y_n = 0, r_n = 1 \mid x_n, z_n)
= \int_{\nu=-\gamma' z_n}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}\nu^2} \int_{u=-\infty}^{\frac{-\beta' x_n - \rho\nu}{\sqrt{1-\rho^2}}} \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}u^2}\, du\, d\nu
= \int_{\nu=-\gamma' z_n}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}\nu^2}\, \Phi\!\left( \frac{-\beta' x_n - \rho\nu}{\sqrt{1-\rho^2}} \right) d\nu

There are standard algorithms for computing \Phi efficiently, so we have reduced our
problem to a one-dimensional integral. Since the bounds are half-infinite, Gauss-Laguerre
integration is a good choice here (with \alpha = 0). We need to transform the range, however.
Define v = \nu + \gamma' z_n. We have,

\Pr(y_n = 0, r_n = 1 \mid x_n, z_n) = \int_{v=0}^{\infty} e^{-v}\, \frac{1}{\sqrt{2\pi}}\, e^{\,v - \frac{1}{2}(v - \gamma' z_n)^2}\, \Phi\!\left( \frac{-\beta' x_n - \rho(v - \gamma' z_n)}{\sqrt{1-\rho^2}} \right) dv
Finally, we have the approximation,

\Pr(y_n = 0, r_n = 1 \mid x_n, z_n) \approx \sum_{i} w_i\, \frac{1}{\sqrt{2\pi}}\, e^{\,v_i - \frac{1}{2}(v_i - \gamma' z_n)^2}\, \Phi\!\left( \frac{-\beta' x_n - \rho(v_i - \gamma' z_n)}{\sqrt{1-\rho^2}} \right)

where the (w_i, v_i) are Gauss-Laguerre weights and evaluation points.

As an alternative example, suppose we wanted to compute the expectation of
\frac{1}{1+X^2}, where X is a normal random variable; then Gauss-Hermite would be the best
choice. We have,

E\!\left[ \frac{1}{1+X^2} \right] = \int_{x=-\infty}^{\infty} \frac{1}{1+x^2}\, \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\, dx

We would consider the transformation y = \frac{x - \mu}{\sqrt{2}\,\sigma}, which gives,

E\!\left[ \frac{1}{1+X^2} \right] = \frac{1}{\sqrt{\pi}} \int_{y=-\infty}^{\infty} \frac{e^{-y^2}}{1 + (\mu + \sqrt{2}\,\sigma y)^2}\, dy

This allows us to write,

E\!\left[ \frac{1}{1+X^2} \right] \approx \frac{1}{\sqrt{\pi}} \sum_{i} \frac{w_i}{1 + (\mu + \sqrt{2}\,\sigma y_i)^2}

where the (w_i, y_i) are Gauss-Hermite weights and evaluation points.
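As a small sketch of this formula (using only the 2-point Gauss-Hermite rule derived earlier, so the approximation is crude, and with illustrative values \mu = 0 and \sigma = 1), we could write:

#include <cmath>
#include <cstdio>

int main()
{
    // 2-point Gauss-Hermite rule: nodes -1/sqrt(2) and 1/sqrt(2), weights sqrt(pi)/2.
    const double pi = 3.14159265358979323846;
    const double y[2] = { -1.0 / std::sqrt(2.0), 1.0 / std::sqrt(2.0) };
    const double w[2] = { std::sqrt(pi) / 2.0, std::sqrt(pi) / 2.0 };

    double mu = 0.0, sigma = 1.0;   // illustrative parameter values
    double sum = 0.0;
    for (int i = 0; i < 2; ++i)
    {
        double x = mu + std::sqrt(2.0) * sigma * y[i];
        sum += w[i] / (1.0 + x * x);
    }

    // E[1/(1+X^2)] is approximated by (1/sqrt(pi)) times the weighted sum.
    std::printf("E[1/(1+X^2)] approx %.6f\n", sum / std::sqrt(pi));
    return 0;
}

A rule with more evaluation points would, of course, give a much more accurate answer; with only two points the polynomial approximation to 1/(1+x^2) is very rough.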


The chief advantage of applying Gaussian quadrature is that we can often obtain
extremely accurate results with very few function evaluations. The chief drawback is that
some functions will not be well approximated by a polynomial of any order. It is when
very high accuracy is desired and the function is poorly approximated by a polynomial
that we will rely on the trapezoid and related formulas (or, as we discuss later, on
simulation methods).

In Table 1.3, we consider several examples, taken from page 254 of Kenneth
Judd's textbook. We see that Gaussian quadrature sometimes performs extremely well,
even with only a few points. The trapezoid rule, alternatively, does well for a large
number of points.

Table 1.3: Some Simple Integrals

Rule            Number of   x^{1/4} on [0,1]   x^{-2} on [1,10]   e^x on [0,1]   (x+0.05)_+ on [-1,1]
                Points
Trapezoid       4           .7212              1.7637             1.7342         .6056
                7           .7664              1.1922             1.7223         .5583
                10          .7797              1.0448             1.72           .5562
                13          .7858              .9857              1.7193         .5542
Simpson         3           .6496              1.3008             1.4662         .4037
                7           .7816              1.0017             1.7183         .5426
                11          .7524              .9338              1.6232         .4844
                15          .7922              .9169              1.7183         .5528
Gauss-Legendre  4           .8023              .8563              1.7183         .5713
                7           .8006              .8985              1.7183         .5457
                10          .8003              .9                 1.7183         .5538
                13          .8001              .9                 1.7183         .5513
Truth                       .8                 .9                 1.7183         .55125


1.4 Numerical Solution of Nonlinear Equations

Consider the problem of solving the equation x + 3e^x = 4. You will not be
successful, because this equation simply does not admit an analytical solution. Equations
like this come up frequently in formal theory and methods applications. Fortunately,
solving this problem numerically is actually quite easy. We can write the general problem
as f(x) = 0. We will almost always assume that f is continuous. At a minimum, the
function should be continuous at all but a countable number of points. Otherwise,
knowing the function at one point will not provide any information about the function at
other points.
Now, suppose that we have points x_L and x_H such that x_L < x_H, f(x_L) < 0, and
f(x_H) > 0. In this case, the intermediate value theorem implies that if f is continuous on
[x_L, x_H], then there exists an x^* \in (x_L, x_H) such that f(x^*) = 0. Finding such points x_L and x_H is
called bracketing a root (alternatively, we can bracket a root with points x_L < x_H such that
f(x_L) > 0 and f(x_H) < 0).

Now consider evaluating the function at the point \tfrac{1}{2}(x_L + x_H). If f(\tfrac{1}{2}(x_L + x_H)) = 0,
then we have found a root. If f(\tfrac{1}{2}(x_L + x_H)) > 0, then we have bracketed a root in a
smaller interval, [x_L, \tfrac{1}{2}(x_L + x_H)]. If f(\tfrac{1}{2}(x_L + x_H)) < 0, then we have bracketed a root in a
smaller interval, [\tfrac{1}{2}(x_L + x_H), x_H]. By continuing this process, we can bracket a root in
smaller and smaller intervals. Eventually, we will converge to a root of the system. This
technique is called the bisection method.
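A minimal sketch of the bisection method in C++ (assuming a bracket with f(xlo) < 0 < f(xhi); the stopping tolerance is illustrative) is:

// Bisection: assumes f(xlo) < 0 < f(xhi). Halves the bracket until it is shorter than tol.
double Bisect(double (*f)(double), double xlo, double xhi, double tol)
{
    while (xhi - xlo > tol)
    {
        double mid = 0.5 * (xlo + xhi);
        double fmid = f(mid);
        if (fmid == 0.0)
            return mid;        // landed exactly on a root
        else if (fmid > 0.0)
            xhi = mid;         // root lies in [xlo, mid]
        else
            xlo = mid;         // root lies in [mid, xhi]
    }
    return 0.5 * (xlo + xhi);
}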


Suppose that, in addition, we know that f is strictly monotonic on [x_L, x_H]. Then
we also know that the root x^* is unique on that interval. If f is monotonic everywhere,
there exists a unique solution.
Now, we know that at each iteration, the root x * will be between the upper and
lower bracket. We know that the size of the bracket is being cut in half at each iteration.
Therefore, we know that the bisection method converges q-linearly.
Consider an alternative algorithm for solving the same problem. A first-order
Taylor approximation gives,

f(x^*) \approx f(x) + f'(x)(x^* - x)

Let us take x^* to be a root of f, so that,

f(x) \approx -f'(x)(x^* - x)

We have,

x^* \approx x - \frac{f(x)}{f'(x)}

This suggests the following algorithm. Given a current point x_k, compute a new point
using,

x_{k+1} \leftarrow x_k - \frac{f(x_k)}{f'(x_k)}
It is clear that this process will stop if f(x_k) = 0. If we start this procedure in a
neighborhood of the root, it is guaranteed to converge q-quadratically. Whether this
process will converge far from the root depends on the function f, and is a rather
complicated problem. Iteration is a rather complex branch of mathematics. For example,
it is well known that a differential equation cannot exhibit chaos in dimension less than 3.


A nonlinear difference equation, however, can exhibit chaos in a single dimension. These
problems do exist for Newton's method in practice.

In practice, Newton's method works as follows. Set,

x_{k+1} \leftarrow x_k - \lambda \frac{f(x_k)}{f'(x_k)}

for \lambda = 1. If |f(x_{k+1})| < |f(x_k)|, then we have successfully reduced the value of the
function, so accept x_{k+1}. Otherwise, reduce \lambda and try again. One can show
theoretically that, under these conditions, Newton's method will converge globally to a
local minimum of |f(x)| (which may not be a root).
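A minimal sketch of this damped Newton iteration (fp is the user-supplied derivative; the iteration cap and tolerances are illustrative) is:

#include <cmath>

// Newton's method with backtracking: halve the step until |f| decreases.
double Newton(double (*f)(double), double (*fp)(double), double x, int maxit, double tol)
{
    for (int k = 0; k < maxit; ++k)
    {
        double fx = f(x);
        if (std::fabs(fx) < tol)
            break;                                 // close enough to a root
        double step = -fx / fp(x);                 // full Newton step
        double lambda = 1.0;
        while (std::fabs(f(x + lambda * step)) >= std::fabs(fx) && lambda > 1e-10)
            lambda *= 0.5;                         // backtrack until the function value improves
        x += lambda * step;
    }
    return x;
}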
Newton's method has two problems. The first is that we need to be able to
compute f'. The second is that the naive version of Newton's method may not converge,
even when the bisection method will. While Newton's method with line search can be
quite effective for higher-dimensional problems, it is inefficient for one-dimensional
problems.
There are solutions to each of these problems. Suppose that f' is not supplied
directly. We can approximate it using numerical derivatives (see Section 1.2). The main
drawback of this approach is that we need to compute f twice per iteration rather than
once. Instead, we consider the following approximation for f'(x). We can use,

f'(x_k) \approx \frac{f(x_k) - f(x_{k-1})}{x_k - x_{k-1}}

This suggests the following algorithm,

x_{k+1} \leftarrow x_k - (x_k - x_{k-1}) \frac{f(x_k)}{f(x_k) - f(x_{k-1})}

This approach is called the secant method. It achieves q-superlinear convergence, which
is faster than the bisection method, but slower than Newton's method. Like Newton's
method, there is no guarantee that the secant method will converge. A variant of the
secant method is the false position method, which makes sure to keep one point on each
side of the root, but is otherwise similar to the secant method. This procedure retains the
q-superlinear convergence rate, but is guaranteed to converge. A picture will illustrate
this.
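A minimal sketch of the secant iteration (the two starting points, the iteration cap, and the tolerance are illustrative) is:

#include <cmath>

// Secant method: replace f' with the slope through the last two iterates.
double Secant(double (*f)(double), double x0, double x1, int maxit, double tol)
{
    for (int k = 0; k < maxit; ++k)
    {
        double f0 = f(x0);
        double f1 = f(x1);
        if (std::fabs(f1) < tol || f1 == f0)
            break;                                    // converged, or the slope is zero
        double x2 = x1 - f1 * (x1 - x0) / (f1 - f0);  // secant step
        x0 = x1;
        x1 = x2;
    }
    return x1;
}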
The second problem relates to the convergence of Newton's method, the secant
method, and the false position method. These methods all converge faster than the
bisection method as we get close to the root, but may have worse performance far from
the root. Brent's method follows the same lines as these other methods, but makes sure to
check the progress of the algorithm and reverts to the bisection method in cases of poor
performance. Brent's method is otherwise similar to the false position method, but uses
an inverse quadratic approximation rather than a linear one. This is the algorithm that
works best in practice, and the one that is hardest to illustrate with a simple figure. The
secant and Newton's methods are of interest because they are the only algorithms here that
extend to the multidimensional case.
Let us now consider an example where we can apply the one-dimensional root-finding
algorithms. Consider two countries that must divide a surplus (whose value is
normalized to one) among themselves. A country may choose to fight, or back down and
agree to a default settlement of \tfrac{1}{2} for each. Country k must pay a cost of c_k to fight. In
the event that only one country fights, that country gets the full surplus. In the event that
both countries fight, each country wins with probability one half (and the country that
wins gets the full surplus). The surplus is discounted at rate 0 < \beta < 1, however.

Each country knows its own cost, but not the cost of the other country. The costs
are known to be drawn from the common distribution F, where F admits a derivative
on [0, \tfrac{1}{2}]. We have the following utility functions,

                               Country 2
                    Fight                             Don't
Country 1   Fight   (\beta/2 - c_1, \beta/2 - c_2)    (1 - c_1, 0)
            Don't   (0, 1 - c_2)                      (1/2, 1/2)

We will assume that all equilibria have the following form: country 1 fights if its cost is lower
than c_1^* and country 2 fights if its cost is lower than c_2^*.

Country 1's expected utility from fighting, given country 2's strategy, is,

F(c_2^*)(\tfrac{\beta}{2} - c_1) + (1 - F(c_2^*))(1 - c_1) = 1 + F(c_2^*)(\tfrac{\beta}{2} - 1) - c_1

while its expected utility from not fighting is,

F(c_2^*) \cdot 0 + (1 - F(c_2^*)) \cdot \tfrac{1}{2} = \tfrac{1}{2} - \tfrac{1}{2} F(c_2^*)

Now, the cut point must be the point c_1 = c_1^* that equates these utilities,

c_1^* = \tfrac{1}{2} - \tfrac{1}{2} F(c_2^*)(1 - \beta)

A similar calculation for the other country shows that,

c_2^* = \tfrac{1}{2} - \tfrac{1}{2} F(c_1^*)(1 - \beta)


These two equations form a system of nonlinear equations. We can, however,
reduce this to a single nonlinear equation by noting that there is a c^* such that
c_1^* = c_2^* = c^* solves both equations. Such a point must satisfy,

g(c) = c - \tfrac{1}{2} + \tfrac{1}{2} F(c)(1 - \beta) = 0

Notice that,

g(\tfrac{1}{2} - \tfrac{1}{2}(1 - \beta)) = \tfrac{1}{2}(1 - \beta)\left( F(\tfrac{1}{2} - \tfrac{1}{2}(1 - \beta)) - 1 \right) < 0
g(\tfrac{1}{2}) = \tfrac{1}{2} F(\tfrac{1}{2})(1 - \beta) > 0
g is continuous
g'(c) = 1 + \tfrac{1}{2} f(c)(1 - \beta) > 0

We immediately have that there exists a unique solution to g(c) = 0, which implies that
there is a unique symmetric equilibrium to the game described above (if our hypothesis
that all equilibria involve cut-point strategies is correct). Furthermore, we see that we can
bracket the root, which tells us that the bisection method will converge if we start from
[\tfrac{1}{2} - \tfrac{1}{2}(1 - \beta), \tfrac{1}{2}].
Solving this problem is quite easy. The first thing we must do is supply the
function g(x). For example, let us consider the case where \beta = 0.7 and F(x) = 1 - e^{-x}.
In C/C++, we could write,

const double Beta = 0.7;
double Func(double x)
{
    return x - 0.5 - 0.5 * (x >= 0.0) * (1.0 - exp(-x)) * (1.0 - Beta);
}

In Numerical Recipes, we could call the bisection method using the code,

/* Include NR Code */
#include "nr3.h"
#include "roots.h"

/* Define Nonlinear Equation to Solve */
const double Beta = 0.7;
double Func(double x);
double Func(double x)
{
    return x - 0.5 - 0.5 * (x >= 0.0) * (1.0 - exp(-x)) * (1.0 - Beta);
}

int main()
{
    /* Bisection Method */
    try
    {
        cout << "rtbis, x: " << rtbis(Func, -10.0, 10.0, 0.00000001) << "\n\n";
    }
    catch (int i)
    {
        cout << "rtbis: FAILED\n\n";
    }
    return 0;
}

This produces the output,


rtbis, x: 0.564722

1.5 Numerical Optimization

The final basic algorithms we will discuss relate to numerical optimization.
Suppose that we want to find the minimum (or maximum) of some function f. Suppose
that f(b) < f(a) and f(b) < f(c) with a < b < c. Then if f is continuous on [a, c], it
must contain a local minimum in this region. We thus say we have bracketed a minimum.

The golden section search algorithm is very similar to the bisection method,
trying to enclose the local minimum in smaller and smaller brackets. We do this by
trying a new point in between a and c. If the function value at this point is less than f(b), then it becomes
the new middle point. Otherwise, it becomes one of the new boundary points. Continuing
this procedure, we will obtain smaller and smaller intervals.
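A minimal sketch of golden section search (assuming a valid initial bracket a < b < c with f(b) < f(a) and f(b) < f(c); the tolerance is illustrative) is:

// Golden section search for a local minimum bracketed by a < b < c.
double GoldenSection(double (*f)(double), double a, double b, double c, double tol)
{
    const double r = 0.618033988749895;   // golden ratio fraction
    while (c - a > tol)
    {
        // Try a new point d in the larger of the two subintervals [a, b] and [b, c].
        double d = (c - b > b - a) ? b + (1.0 - r) * (c - b)
                                   : b - (1.0 - r) * (b - a);
        if (f(d) < f(b))
        {
            // d becomes the new middle point; the far boundary is dropped.
            if (d > b) a = b; else c = b;
            b = d;
        }
        else
        {
            // d becomes a new boundary point; b stays in the middle.
            if (d > b) c = d; else a = d;
        }
    }
    return b;
}

Throughout, b tracks the best point found so far, so it is the natural value to return when the bracket becomes small enough.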

1.6 Suggested Reading

Numerical Recipes in C, sections 4.0-4.5, 5.7, 9.0-9.3, 10.0-10.3.

Dennis and Schnabel, Numerical Methods for Unconstrained Optimization and Nonlinear Equations.

Judd, Numerical Methods in Economics.

