We have seen that the Poisson distribution can be used to approximate the binomial
distribution for large values of n and small values of p, provided that the correct conditions
exist. The approximation is only of practical use if just a few terms of the Poisson distribution
need to be calculated. When many terms, sometimes several hundred, need to be calculated,
the arithmetic becomes very tedious and we turn to the normal distribution for help.
It is possible, of course, to use high-speed computers to do the arithmetic, but the
normal approximation to the binomial distribution avoids the need for this in a fairly elegant
way. In the problems following this introduction, the normal distribution is used to avoid very
tedious arithmetic while still giving a very good approximate solution.
In this topic, we consider the normal approximation to the binomial distribution. Since
the binomial is a discrete probability distribution, this may seem to go against intuition.
However, a limiting process is involved, keeping p of the binomial distribution fixed and
letting n → ∞. The approximation is known as the DeMoivre-Laplace approximation.
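We recall the binomial distribution as
$$P(X = x) = \binom{n}{x} p^{x} q^{n-x} = \frac{n!}{x!\,(n-x)!}\, p^{x} q^{n-x}, \quad x = 0, 1, \ldots, n, \quad q = 1 - p.$$
Stirling's approximation to n! is
$$n! \approx \sqrt{2\pi}\, e^{-n}\, n^{n + 1/2}.$$
The error
$$\frac{\left| n! - \sqrt{2\pi}\, e^{-n}\, n^{n+1/2} \right|}{n!} \to 0 \quad \text{as } n \to \infty,$$
so that, for large n,
$$P(X = x) \approx \frac{1}{\sqrt{2\pi npq}} \exp\!\left[ -\frac{(x - np)^{2}}{2npq} \right], \qquad P(X \le x) \approx \Phi\!\left( \frac{x - np}{\sqrt{npq}} \right),$$
where $\Phi$ is the cumulative distribution function of the standard normal distribution.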
This result makes sense in the light of the central limit theorem and the fact that X is the sum
of independent Bernoulli trials. Thus, the quantity $(X - np)/\sqrt{npq}$ approximately has a
N(0,1) distribution. If p is close to 0.5 and n > 10, the approximation is fairly good. However,
for other values of p, the value of n must be larger.
In general, experience indicates that the approximation is fairly good as long as np > 5 for
p ≤ 0.5, or nq > 5 when p > 0.5.
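The quality of the approximation can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names and the values n = 100, p = 0.5 are illustrative assumptions, not part of the original notes. It compares the exact binomial probabilities with the normal approximation stated above, without any continuity correction.

```python
import math

def binom_pmf(x, n, p):
    """Exact P(X = x) for X ~ Binomial(n, p)."""
    return math.comb(n, x) * p**x * (1.0 - p)**(n - x)

def binom_cdf(x, n, p):
    """Exact P(X <= x), summed term by term (the tedious arithmetic)."""
    return sum(binom_pmf(k, n, p) for k in range(0, x + 1))

def std_normal_cdf(z):
    """Phi(z), the N(0,1) cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_approx_pmf(x, n, p):
    """Normal density with mean np and variance npq, evaluated at x."""
    var = n * p * (1.0 - p)
    return math.exp(-(x - n * p) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def normal_approx_cdf(x, n, p):
    """Phi((x - np) / sqrt(npq)), with no continuity correction."""
    return std_normal_cdf((x - n * p) / math.sqrt(n * p * (1.0 - p)))

# Illustrative values only (np = nq = 50, well above the rule-of-thumb limit of 5).
n, p = 100, 0.5
for x in (40, 50, 60):
    print(
        f"x={x}: pmf exact={binom_pmf(x, n, p):.4f} approx={normal_approx_pmf(x, n, p):.4f}; "
        f"cdf exact={binom_cdf(x, n, p):.4f} approx={normal_approx_cdf(x, n, p):.4f}"
    )
```

Rerunning the loop with a small p (say p = 0.02, so that np = 2) should show a visibly poorer agreement, which is what the np > 5 guideline is meant to warn against.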
Problem 1:
Steel bars are made to a nominal length of 4cm but in fact the length is a normally distributed
random variable with mean 4.01cm and standard deviation 0.03. Each steel bar costs 6p to
make and may be used immediately if its length lies between 3.98cm and 4.02cm. If its
length is less than 3.98cm the steel bar cannot be used but has a scrap value of 1p. If the
length exceeds 4.02cm it can be shortened and used at a further cost of 2p.
Find the average cost per usable steel bar.
Solution:
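Consider a batch of 100 steel bars. Let X be the length of a bar in cm, so that
$$X \sim N(4.01,\ 0.03^{2}).$$
The probabilities of the three possible outcomes are
$$P(X < 3.98) = \Phi\!\left( \frac{3.98 - 4.01}{0.03} \right) = \Phi(-1) = 0.1587,$$
$$P(X > 4.02) = 1 - \Phi\!\left( \frac{4.02 - 4.01}{0.03} \right) = 1 - \Phi(0.333) \approx 1 - 0.6305 = 0.3695,$$
$$P(3.98 < X < 4.02) = \Phi(0.333) - \Phi(-1) \approx 0.6305 - 0.1587 = 0.4718.$$
Cost has 2 possible values per usable steel bar: 6p if the bar is used immediately, 8p if it
has to be shortened first.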
Cost:
For usable steel bars costing 8p each (6p to make plus 2p to shorten): 36.95 steel bars x 8 = 295.6
For usable steel bars costing 6p each: 47.18 steel bars x 6 = 283.08
For non-usable steel bars costing 5p each (6p to make less 1p scrap value):
100 - 36.95 - 47.18 = 15.87 steel bars x 5 = 79.35
Therefore,
$$\text{average cost per usable steel bar} = \frac{295.6 + 283.08 + 79.35}{36.95 + 47.18} = \frac{658.03}{84.13} \approx 7.82\text{p}.$$
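The arithmetic of Problem 1 can be reproduced as follows. This is a minimal sketch, assuming Python 3.8 or later for `statistics.NormalDist`; the variable names are illustrative. It works with probabilities per bar rather than a batch of 100, which gives the same ratio.

```python
from statistics import NormalDist

# Bar length in cm, as given in the problem statement.
length = NormalDist(mu=4.01, sigma=0.03)

p_short = length.cdf(3.98)                    # cannot be used: net cost 6p - 1p = 5p
p_long = 1.0 - length.cdf(4.02)               # shortened, then used: cost 6p + 2p = 8p
p_ok = length.cdf(4.02) - length.cdf(3.98)    # used immediately: cost 6p

# Expected total cost per bar made, then divided by the fraction of bars that are usable.
expected_cost_per_bar = 6 * p_ok + 8 * p_long + 5 * p_short
p_usable = p_ok + p_long
average_cost_per_usable_bar = expected_cost_per_bar / p_usable

print(f"P(short)={p_short:.4f}  P(ok)={p_ok:.4f}  P(long)={p_long:.4f}")
print(f"average cost per usable bar = {average_cost_per_usable_bar:.2f}p")
```

Small differences in the fourth decimal place of the probabilities, compared with values read from printed normal tables, are to be expected and do not change the answer to two decimal places.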
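The cumulative distribution function (cdf) of a random variable X, written $F_X(x)$, is the
probability that X takes a value less than or equal to x; it is a nondecreasing function of x,
bounded by 0 and 1. In the discrete case it is obtained by summing over values of the
pmf.
In the case of a discrete random variable, the cdf at x is the sum of the pmf over all
possible values $x_k$ of X that are less than or equal to x. That is,
$$F_X(x) = \sum_{x_k \le x} p_X(x_k).$$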
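The definition of the cdf of a discrete random variable as a sum over the pmf can be sketched in a few lines. The function name `discrete_cdf` and the fair-die pmf are hypothetical and serve only as an illustration; exact fractions are used so that the sums are not affected by rounding.

```python
from fractions import Fraction

def discrete_cdf(pmf, x):
    """F_X(x): sum of p_X(x_k) over all possible values x_k <= x.

    `pmf` maps each possible value x_k to its probability p_X(x_k).
    """
    return sum((prob for value, prob in pmf.items() if value <= x), Fraction(0))

# Hypothetical example: a fair six-sided die.
die_pmf = {k: Fraction(1, 6) for k in range(1, 7)}

for x in (0, 1, 3, 3.5, 6):
    print(f"F({x}) = {discrete_cdf(die_pmf, x)}")
```

The value at x = 3.5 equals the value at x = 3, which shows the step-function character of a discrete cdf: it only jumps at the possible values of X.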
Problem 2:
Maximum potential soil absorption capacity. The absorption capacity of a portion of terrain
can be described through its curve number. The curve number takes integer values in the
range 1-100 and depends on soil properties and land use. As a first approximation, values
taken by the random variable CN in a region may be assumed to be equally likely (that is to
say, uniformly distributed) with pmf:
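$$p_{CN}(cn) = \frac{1}{100}, \quad \text{for } 1 \le cn \le 100.$$
The corresponding cdf is
$$F_{CN}(cn) = \sum_{i=1}^{cn} \frac{1}{100} = \frac{cn}{100}, \quad 1 \le cn \le 100.$$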
This is a step function that appears to be a curve because of numerous steps. The maximum
potential soil absorption capacity S is related to the CN as follows:
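$$S = 25.4 \left[ \frac{1000}{CN} - 10 \right],$$
where S is in mm. For example, $F_S(762)$, the probability that S is less than or equal to
762mm, is given by the sum of the probabilities of those outcomes of CN that yield a value of S
less than or equal to 762mm. This corresponds to CN ≥ 25. Hence,
$$F_S(762) = \sum_{i=25}^{100} \frac{1}{100} = \frac{76}{100} = 0.76.$$
In general, S ≤ s corresponds to CN ≥ 25400/(s + 254), so that
$$F_S(s) = 0, \quad s < 0,$$
$$F_S(s) = \sum_{i = 25400/(s+254)}^{100} \frac{1}{100} \approx 1 - \frac{254}{s + 254}, \quad 0 \le s \le 25146,$$
$$F_S(s) = 1, \quad s > 25146.$$
The closed form ignores the discreteness of CN; the exact sum exceeds it by at most 0.01
(for example, 0.76 against 0.75 at s = 762mm).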
This is also a step function. The log scale allows a clearer definition for high and low values
of S.
It is often convenient to consider S as a continuous variable that can take any
value from 0 to 25146mm.
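The cdf of S in Problem 2 can also be obtained by direct enumeration of the 100 equally likely curve numbers. The sketch below is illustrative only: the function names are assumptions, and exact fractions are used so that boundary cases such as S = 762mm are not affected by floating-point rounding. It also evaluates the closed-form expression 1 - 254/(s + 254) for comparison.

```python
from fractions import Fraction

CN_VALUES = range(1, 101)     # curve numbers 1..100
P_CN = Fraction(1, 100)       # equally likely values

def s_from_cn(cn):
    """Maximum potential absorption S in mm: S = 25.4 * (1000/CN - 10)."""
    return Fraction(254, 10) * (Fraction(1000, cn) - 10)

def cdf_cn(cn):
    """F_CN(cn): sum of the pmf over curve numbers i <= cn."""
    return sum((P_CN for i in CN_VALUES if i <= cn), Fraction(0))

def cdf_s(s):
    """F_S(s): sum of the pmf over curve numbers whose S does not exceed s."""
    return sum((P_CN for cn in CN_VALUES if s_from_cn(cn) <= s), Fraction(0))

def cdf_s_closed_form(s):
    """Closed-form expression 1 - 254/(s + 254), treating S as continuous."""
    return 1 - 254 / (s + 254)

for s in (0, 762, 5000, 25146):
    print(f"s={s}: exact F_S={float(cdf_s(s)):.4f}  closed form={cdf_s_closed_form(s):.4f}")
```

The enumeration makes the step-function character explicit: `cdf_s` changes only at the 100 values of S generated by integer curve numbers, while the closed form varies smoothly between them.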
Figure 3.1.3 (a) probability density function and cumulative distribution function of equally
likely values of curve number, CN. (b) Cumulative distribution function of maximum soil
potential retention S as obtained from equally likely values of the curve number.