
Normal approximation to Binomial distribution

We have seen that the Poisson distribution can be used to approximate the binomial
distribution for large values of n and small values of p. That approximation is of
practical use only if just a few terms of the Poisson distribution need to be calculated.
When many terms, sometimes several hundred, are needed, the arithmetic becomes very
tedious and we turn to the normal distribution for help.
It is possible, of course, to use high-speed computers to do the arithmetic, but the
normal approximation to the binomial distribution avoids that need in a fairly elegant way.
In the problems that follow this introduction, the normal distribution is used to avoid very
tedious arithmetic while still giving a very good approximate solution.
In this topic, we consider the normal approximation to the binomial distribution. Since
the binomial is a discrete distribution and the normal is continuous, this may seem
counterintuitive. However, a limiting process is involved, keeping p of the binomial
distribution fixed and letting n → ∞.
The approximation is known as the DeMoivre-Laplace approximation.
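We recall the binomial distribution as

$$P(X = x) = \binom{n}{x} p^{x} q^{n-x}, \qquad x = 0, 1, \ldots, n, \quad q = 1 - p.$$

Stirling's approximation to n! is

$$n! \approx \sqrt{2\pi}\, e^{-n}\, n^{n + 1/2}.$$

The error

$$\frac{\left| n! - \sqrt{2\pi}\, e^{-n}\, n^{n + 1/2} \right|}{n!} \to 0$$

as n → ∞. Using Stirling's formula to approximate the terms involving n! in the binomial
model, we eventually find that, for large n,

$$P(X = x) \approx \frac{1}{\sqrt{2\pi npq}} \exp\left[ -\frac{(x - np)^{2}}{2npq} \right],$$

so that

$$P(X \le x) \approx \Phi\left( \frac{x - np}{\sqrt{npq}} \right) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{(x - np)/\sqrt{npq}} e^{-z^{2}/2}\, dz.$$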
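The claim that the relative error of Stirling's formula tends to 0 can be checked numerically. The following is a minimal Python sketch, not part of the original notes; the function names `stirling` and `relative_error` are illustrative, and n is kept at 100 or below so that the floating-point power does not overflow.

```python
import math

def stirling(n: int) -> float:
    """Stirling's approximation sqrt(2*pi) * exp(-n) * n**(n + 1/2)."""
    return math.sqrt(2.0 * math.pi) * math.exp(-n) * n ** (n + 0.5)

def relative_error(n: int) -> float:
    """Relative error |n! - stirling(n)| / n! of the approximation."""
    exact = math.factorial(n)
    return abs(exact - stirling(n)) / exact

# The relative error shrinks as n grows (values of n chosen for illustration).
for n in (1, 5, 10, 50, 100):
    print(n, relative_error(n))
```

The printed relative errors decrease steadily with n, which is the behaviour the limit statement describes.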

This result makes sense in the light of the central limit theorem and the fact that X is the
sum of n independent Bernoulli trials. Thus, the quantity (X − np)/√(npq) has approximately
a N(0,1) distribution. If p is close to 0.5 and n > 10, the approximation is fairly good.
However, for other values of p, the value of n must be larger.
In general, experience indicates that the approximation is fairly good as long as np > 5 for
p ≤ 0.5, or nq > 5 for p > 0.5.
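To see how the quality of the approximation depends on n, the exact binomial cdf can be compared with Φ((x − np)/√(npq)). The sketch below is an illustration under stated assumptions rather than part of the original notes: the helper names are hypothetical, the values n = 20 and n = 200 with p = 0.5 are chosen only for demonstration, and the formula is applied exactly as written, with no further adjustment.

```python
import math

def binom_pmf(x: int, n: int, p: float) -> float:
    """Exact binomial probability P(X = x)."""
    return math.comb(n, x) * p**x * (1.0 - p) ** (n - x)

def normal_cdf(z: float) -> float:
    """Standard normal cdf, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def max_abs_error(n: int, p: float) -> float:
    """Largest |P(X <= x) - Phi((x - np)/sqrt(npq))| over x = 0..n."""
    mu = n * p
    sigma = math.sqrt(n * p * (1.0 - p))
    exact_cdf = 0.0
    worst = 0.0
    for x in range(n + 1):
        exact_cdf += binom_pmf(x, n, p)  # running sum gives the exact cdf
        worst = max(worst, abs(exact_cdf - normal_cdf((x - mu) / sigma)))
    return worst

# Illustrative comparison: the worst-case error is smaller for the larger n.
for n in (20, 200):
    print(n, round(max_abs_error(n, 0.5), 4))
```

With p fixed, the worst-case discrepancy falls as n increases, in line with the limiting argument above.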

Problem 1:
Steel bars are made to a nominal length of 4 cm, but in fact the length is a normally
distributed random variable with mean 4.01 cm and standard deviation 0.03 cm. Each steel
bar costs 6p to make and may be used immediately if its length lies between 3.98 cm and
4.02 cm. If its length is less than 3.98 cm the steel bar cannot be used but has a scrap
value of 1p. If the length exceeds 4.02 cm it can be shortened and used at a further cost of 2p.
Find the average cost per usable steel bar.
Solution:
Consider a batch of 100 steel bars, with length X ~ N(4.01, 0.03²).

The cost per usable steel bar has 2 possible values, 6p or 8p.

P(C=6) = P(3.98 < X < 4.02)
= P(0 < Z < (4.01 − 3.98)/0.03) + P(0 < Z < (4.02 − 4.01)/0.03)
= P(0 < Z < 1) + P(0 < Z < 0.333)
= 0.3413 + 0.1305
= 0.4718

Expected number of steel bars that cost 6p = 47.18

P(C=8) = P(X > 4.02)
= P(Z > 0.333)
= 0.5 − P(0 < Z < 0.333)
= 0.3695

Expected number of steel bars that cost 8p = 36.95

Cost:
Usable steel bars costing 8p each: 36.95 steel bars x 8 = 295.6
Usable steel bars costing 6p each: 47.18 steel bars x 6 = 283.08
Non-usable steel bars costing a net 5p each (6p to make less 1p scrap value):
100 − 36.95 − 47.18 = 15.87 steel bars x 5 = 79.35
Therefore,
average cost per usable steel bar
= (295.6 + 283.08 + 79.35) / (36.95 + 47.18)
= 658.03 / 84.13
= 7.82p
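The arithmetic of Problem 1 can be cross-checked without tables. This is a minimal Python sketch, not part of the original solution: it evaluates the standard normal cdf with `math.erf` instead of reading a table, so the probabilities differ from the tabulated values in the fourth decimal place, and the variable names are illustrative.

```python
import math

def normal_cdf(z: float) -> float:
    """Standard normal cdf, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

MEAN, SD = 4.01, 0.03    # length distribution, cm
LOW, HIGH = 3.98, 4.02   # usable range, cm

p_scrap = normal_cdf((LOW - MEAN) / SD)        # too short: net cost 6p - 1p = 5p
p_long = 1.0 - normal_cdf((HIGH - MEAN) / SD)  # too long: cost 6p + 2p = 8p
p_ok = 1.0 - p_scrap - p_long                  # usable immediately: cost 6p

# Expected total cost per bar made, divided by the fraction of bars that are usable.
expected_cost = 6 * p_ok + 8 * p_long + 5 * p_scrap
average_cost_per_usable = expected_cost / (p_ok + p_long)
print(round(average_cost_per_usable, 2))
```

Working with probabilities rather than a batch of 100 bars gives the same ratio, since the batch size cancels.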

Cumulative Distribution Function of a Discrete Random Variable


For a discrete or continuous random variable, the cumulative distribution function,
abbreviated as cdf and denoted by F_X(x), is the probability of nonexceedance of x;
this is sometimes referred to as the distribution function. That is,

$$F_X(x) = \Pr[X \le x].$$

We note that F_X(x) is a monotonic function, which, by definition, does not decrease as
x increases and, as previously defined,

$$0 \le F_X(x) \le 1,$$

for all possible x.

Definition and properties: Cumulative distribution function, cdf. For a discrete or
continuous random variable X the cdf is the probability of nonexceedance of the
value x. The cdf is a monotonic (non-decreasing) function that is bounded by 0 and 1.
In the discrete case it is a step function obtained by summing over values of the pmf.
In the case of a discrete random variable, F_X(x) is the sum of the probabilities of all
possible values of X that are less than or equal to the argument x. That is,

$$F_X(x) = \sum_{x_k \le x} p_X(x_k).$$

This is summed over all possible x_k less than or equal to x.
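The summation that defines a discrete cdf translates directly into code. The sketch below is an illustration only: the function name `cdf_from_pmf` and the fair-die pmf are assumptions introduced here, not taken from the original notes, and exact fractions are used so that no rounding enters.

```python
from fractions import Fraction

def cdf_from_pmf(pmf: dict, x) -> Fraction:
    """F_X(x): sum of pmf values p_X(x_k) over all x_k <= x."""
    return sum((prob for value, prob in pmf.items() if value <= x), Fraction(0))

# Hypothetical example: a fair six-sided die, each face with probability 1/6.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

print(cdf_from_pmf(die_pmf, 4))    # 2/3
print(cdf_from_pmf(die_pmf, 0))    # 0
print(cdf_from_pmf(die_pmf, 6))    # 1
```

Evaluating the function between two faces, for instance at 4.5, returns the same value as at 4, which is the step-function behaviour of a discrete cdf.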

Problem 2:
Maximum potential soil absorption capacity. The absorption capacity of a portion of terrain
can be described through its curve number. The curve number takes integer values in the
range 1-100 and depends on soil properties and land use. As a first approximation, values
taken by the random variable CN in a region may be assumed to be equally likely (that is to
say, uniformly distributed) with pmf:
$$p_{CN}(cn) = \frac{1}{100}, \qquad 1 \le cn \le 100.$$

The corresponding cdf is given by

$$F_{CN}(cn) = \sum_{i=1}^{cn} \frac{1}{100} = \frac{cn}{100}, \qquad 1 \le cn \le 100;$$

$$F_{CN}(cn) = 0, \qquad cn < 1;$$

$$F_{CN}(cn) = 1, \qquad cn > 100.$$

For example, F_CN(25) = Pr[CN ≤ 25] = 0.25.

The pmf and cdf are shown in Fig. 3.1.3a.

This is a step function that appears to be a curve because of numerous steps. The maximum
potential soil absorption capacity S is related to the CN as follows:

$$S = 25.4 \left[ \frac{1000}{CN} - 10 \right],$$

where S is measured in millimeters. Accordingly, S can take a value from 0 to 25146 mm.
For example, S = 762 mm for CN = 25, and p_S(762) = 0.01. The corresponding cdf of S is
then given by the sum of the probabilities of those outcomes of CN that yield a value of S
less than or equal to s. For s = 762 mm, this corresponds to CN ≥ 25. In general, S ≤ s
when CN ≥ 25400/(s + 254). Hence,

$$F_S(s) = \sum_{i=cn}^{100} \frac{1}{100} = \sum_{i=25400/(s+254)}^{100} \frac{1}{100} \approx 1 - \frac{254}{s + 254}, \qquad 0 \le s \le 25146;$$

$$F_S(s) = 0, \qquad s < 0;$$

$$F_S(s) = 1, \qquad s > 25146.$$

Counting both end points, the sum has 101 − cn terms, so at an attainable value of s it
equals 1.01 − 254/(s + 254); the closed form above neglects the single-outcome
probability 0.01.

This cdf is shown in Fig. 3.1.3b.

This is also a step function. The log scale allows a clearer definition for high and low
values of S.
It is often convenient to consider S as a continuous random variable that can take any real
value from 0 to 25146 mm.
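The cdf of S can also be obtained by direct enumeration over the 100 equally likely curve numbers, which avoids any closed-form manipulation. The sketch below is a minimal illustration, not part of the original problem; the function names `absorption_capacity` and `cdf_S` are hypothetical, and exact fractions are used so that the comparison S ≤ s is not affected by rounding.

```python
from fractions import Fraction

def absorption_capacity(cn: int) -> Fraction:
    """S = 25.4 * (1000/CN - 10), in millimeters, as an exact fraction."""
    return Fraction(254, 10) * (Fraction(1000, cn) - 10)

def cdf_S(s) -> Fraction:
    """F_S(s): total probability of the curve numbers 1..100 that give S <= s."""
    return sum(
        (Fraction(1, 100) for cn in range(1, 101) if absorption_capacity(cn) <= s),
        Fraction(0),
    )

print(absorption_capacity(25))   # 762
print(absorption_capacity(1))    # 25146
print(cdf_S(762))                # 19/25
```

The value 19/25 = 0.76 reflects the 76 curve numbers from 25 to 100, both end points included. Because S decreases as CN increases, small values of S correspond to large curve numbers, which is why the sum runs up to 100.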

Figure 3.1.3 (a) Probability mass function and cumulative distribution function of equally
likely values of curve number, CN. (b) Cumulative distribution function of maximum
potential soil retention S as obtained from equally likely values of the curve number.
