
Chapter 1 The Normal Distribution

_________________________________________________________________
_
NORMAL DISTRIBUTION
The most important continuous probability distribution in the entire field
of statistics is the normal distribution. Its graph, called the normal curve, is the
bell-shaped curve of Figure 1.1, which approximately describes many phenomena.

Figure 1.1 Normal Curve

These include biological measurements such as height, weight, and life span, as
well as psychological measurements such as scores on intelligence tests (IQ tests).
In a normal distribution, most values fall near the average, with only a small
percentage of values falling far above or below the average. For example, in a
random sample of adults, most will measure between 120 cm (4 ft) and 210 cm (7
ft) tall, with very few heights outside this range. Normal distributions generally
develop when the sample size or number of observations is very large. In addition,
errors in scientific measurements are extremely well approximated by a normal
distribution. In 1733, Abraham de Moivre developed the mathematical equation of
the normal curve. It provided a basis on which much of the theory of inductive
statistics is founded. The normal distribution is often referred to as the Gaussian
distribution, in honor of Karl Friedrich Gauss (1777–1855), who also derived
its equation from a study of errors in repeated measurements of the same quantity.

The density function of the normal random variable X, with mean µ and
variance σ², is

$$ f(x) = \frac{1}{\sqrt{2\pi\sigma^{2}}} \, e^{-\frac{(x-\mu)^{2}}{2\sigma^{2}}}, \qquad -\infty < x < \infty $$

where µ and σ > 0 are the parameters known as the mean and the standard
deviation, respectively.
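The density formula can be evaluated directly. The following is a minimal Python sketch, assuming Python 3.8 or later; the function name `normal_pdf` and the parameter values are illustrative choices, not part of the text. It compares the hand-coded formula with the standard library's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Density of a normal random variable with mean mu and standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    coeff = 1.0 / math.sqrt(2.0 * math.pi * sigma ** 2)
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coeff * math.exp(exponent)


# Illustrative parameters (assumed, not taken from the text).
mu, sigma = 10.0, 2.0

peak = normal_pdf(mu, mu, sigma)              # height of the curve at the mean
library_peak = NormalDist(mu, sigma).pdf(mu)  # same value from the standard library

print(peak, library_peak)
```

The density is highest at x = µ, which is why the two printed values are the peak of the bell curve.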

Properties of the Normal Distribution

1. It is bell-shaped and has a single peak at the center of the distribution.
The arithmetic mean, median, and mode are equal and located in the
center of the distribution. Thus half the area under the normal curve is to
the right of this center point and the other half to the left of it.
2. It is symmetrical about its mean µ. If we cut the normal curve vertically
at the center value, the two halves will be mirror images.
3. It falls off smoothly in either direction from the central value. That is, the
distribution is asymptotic: the curve gets closer and closer to the x-axis
but never actually touches it. To put it another way, the tails of the curve
extend indefinitely in both directions.

4. The location of a normal distribution is determined by the mean. The
dispersion (or spread) of the distribution is determined by the standard
deviation.
5. The coefficient of skewness of a normal distribution is zero, while its
kurtosis is equal to 3.
6. The total area under the curve and above the horizontal axis is equal to 1.
7. For any real number x, P(X = x) = 0.
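Some of these properties can be checked numerically. The sketch below is only an illustration under assumed parameters (mean 50, standard deviation 8, chosen arbitrarily); it uses Python's `statistics.NormalDist` and a simple midpoint sum, not a method from the text.

```python
from statistics import NormalDist

# Illustrative parameters (assumed, not taken from the text).
mean, sd = 50.0, 8.0
dist = NormalDist(mean, sd)

# Symmetry about the mean: equal heights at equal distances on either side.
left_height = dist.pdf(mean - 5.0)
right_height = dist.pdf(mean + 5.0)

# Half of the area lies to the left of the mean.
area_left_of_mean = dist.cdf(mean)

# Total area under the curve, approximated by a midpoint sum over mean ± 8 sd.
step = 0.01
start = mean - 8.0 * sd
n_steps = int(round(16.0 * sd / step))
total_area = sum(dist.pdf(start + (i + 0.5) * step) for i in range(n_steps)) * step

print(left_height, right_height, area_left_of_mean, total_area)
```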

The Standard Normal Curve

Since each normally distributed variable has its own mean and standard
deviation, the shape and location of these curves will vary. Therefore, different
tables of values of areas under each curve will be needed for each variable. To
simplify the situation, statisticians use the standard normal distribution. The
standard normal distribution is a normal distribution with a mean of 0 and a
standard deviation of 1.

Any normal distribution can be converted into a standard normal. This can
be done by means of the z-transform:

$$ Z = \frac{X - \mu}{\sigma} $$

The above formula is also known as the z-statistic, z-values, the standard
normal deviates, or just the normal deviates. Thus, if X is normally distributed
with mean µ and standard deviation σ, then for any a and b,

$$ P(a < X < b) = P\left( \frac{a - \mu}{\sigma} < Z < \frac{b - \mu}{\sigma} \right) $$

Areas Under the Normal Curve

Probabilities of normal distributions (or areas under a normal curve) can
be obtained from a normal probability table. Most textbooks provide the
cumulative probability of a standard normal distribution. That is, we can get
P(Z ≤ z) = P(Z < z).

Figure 1.2 Area of P( Z < z )

Example 1 Determine the area under the standard normal curve between –1.15
and 0.94.

Solution For probabilities, a special notation is used. In this problem, the
probability that the z value lies between –1.15 and 0.94 is written
as P( -1.15 < Z < 0.94 ).
To find the area under the standard normal curve between z = -1.15
and z = 0.94, subtract the area to the left of z = -1.15 from the
area to the left of z = 0.94.

To use the standard normal probability table, first round the z value
to two decimal places. For example, to get P( Z < 0.94 ), scan down
the Z column until you locate 0.9. Read across this row until you reach
the column for the hundredths digit of the z value, which here is
0.04. The tabulated probability for z = 0.94 is therefore at the
intersection of the row z = 0.9 and the column 0.04. This
probability is 0.8264. Hence, P( Z < 0.94 ) = 0.8264.

P( -1.15 < Z < 0.94 ) = P( Z < 0.94 ) – P( Z < -1.15 )

= 0.8264 – 0.1251

= 0.7013

Therefore, the area under the standard normal curve between –1.15
and 0.94 is 0.7013.
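The same table lookup can be reproduced in code. This is a minimal sketch using Python's `statistics.NormalDist` (available from Python 3.8) as a stand-in for the printed table; the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

p_upper = Z.cdf(0.94)    # P(Z < 0.94), the area to the left of 0.94
p_lower = Z.cdf(-1.15)   # P(Z < -1.15), the area to the left of -1.15
area = p_upper - p_lower  # area between the two z values

print(round(p_upper, 4), round(p_lower, 4), round(area, 4))
```

Rounded to four decimal places, the values should agree with the table entries used in the solution.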

Producing Normal Probabilities Using Excel

Excel can be used to calculate normal probabilities and the z value corresponding
to a specific probability. For example, suppose we want to get P( Z < 0.94). The
following are the necessary Excel commands to produce the probability.

1. Select Insert and Function.
2. Select Statistical in the category dialog and then select NORMSDIST
in the “Function name” dialog. Press OK.
3. In the NORMSDIST dialog box, enter 0.94.
4. The result will appear in the dialog box. If you click OK, the answer
appears in the spreadsheet.

In this case, P( Z < 0.94) = 0.826391238.

Another way of finding P( Z < 0.94 ) is by using Megastat. Megastat is
an Excel add-in that performs statistical analysis within an
Excel workbook. After it is installed, it appears on the Excel menu and works like
any Excel option.

1. Click Megastat.
2. Select Probability and choose Continuous Probability Distributions.
3. Select Normal Distribution. In this case, we select calculate
probability given z.
4. In the dialog box, enter 0.94. The result will appear if you click
preview at p(lower) or click OK.


Megastat in Microsoft Excel

In this case, P( Z < 0.94) = 0.8264.

Example 2 Determine the area under the standard normal curve above 1.3.

Solution: Using Megastat, click Continuous Probability Distributions. Under
Normal Distribution, select calculate probability given z and in the
dialog box, enter 1.3. The result will appear if you click Preview at
p(upper) or click OK.

In this case, P( Z > 1.3 ) = 0.0968.
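An upper-tail area is the complement of the cumulative probability. A short Python sketch of that step, again assuming `statistics.NormalDist` in place of Megastat:

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

# P(Z > 1.3) = 1 - P(Z < 1.3), since the total area under the curve is 1.
p_above = 1.0 - Z.cdf(1.3)

print(round(p_above, 4))
```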

Example 3 The ages of subscribers to a certain newspaper are normally
distributed with a mean of 32.5 years and a standard deviation of 4.6 years.
What is the probability that the age of a random subscriber is
(a) more than 30.5 years; (b) between 25 and 40 years?
Solution:
(a) Calculate the z-transform by selecting Insert and
Function. Choose Statistical and STANDARDIZE, then press OK.
In the dialog box, enter 30.5 for X, 32.5 for Mean, and 4.6 for
Standard_dev. The result will appear in the dialog box. If you
click OK, the answer appears in the spreadsheet.

After standardizing the raw score 30.5, the probability P(X > 30.5)
is the same as the probability P(Z > -0.434782609).
By Megastat, we can determine P(Z > -0.4348):
P(Z > -0.4348) = 0.6681.

Therefore, the probability is 0.6681.

Try to answer part (b).
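Part (a) can be checked in a few lines. The sketch below, which assumes Python's `statistics.NormalDist` rather than Excel or Megastat, standardizes the raw score and also computes the same probability directly from the unstandardized distribution, to show that the two routes agree. Part (b) is left as the exercise intends.

```python
from statistics import NormalDist

mean_age, sd_age = 32.5, 4.6

# Route 1: standardize, then use the standard normal distribution.
z = (30.5 - mean_age) / sd_age       # z-transform of the raw score 30.5
p_via_z = 1.0 - NormalDist().cdf(z)  # P(Z > z)

# Route 2: work with the original normal distribution directly.
p_direct = 1.0 - NormalDist(mean_age, sd_age).cdf(30.5)  # P(X > 30.5)

print(round(z, 4), round(p_via_z, 4), round(p_direct, 4))
```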

Example 4 Find the z value for which the area under the standard normal
curve to the left of that value is 0.2583.

Solution: Using Megastat, click Continuous Probability Distributions. Under
Normal Distribution, select calculate z given probability, and in the
dialog box, enter 0.2583. The result will appear if you click
Preview, or it will appear on the spreadsheet when you click OK. In
this case, the result is -0.65. The value is negative because an area
of 0.2583 to the left is less than 0.5, so z lies below the mean of 0.

The z value is -0.65 when the probability to its left is 0.2583.
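Finding z from a left-tail area is the inverse of the cumulative probability. A minimal Python sketch, assuming `statistics.NormalDist.inv_cdf` as the inverse lookup: because the area 0.2583 is below 0.5, the z value falls to the left of the mean and is therefore negative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

left_area = 0.2583
z = Z.inv_cdf(left_area)  # z value with the given area to its left

print(round(z, 2))
```

Feeding the result back into `Z.cdf(z)` recovers the original area, which is a quick way to confirm the sign.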

The Central Limit Theorem

In practice, there are cases in which the normality assumption is not
satisfied, so computing the probabilities of certain events may be
cumbersome. But if we want to compute the probability that the sample mean

$$ \bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i $$

is between two fixed real numbers, then an appropriate model will be
the normal distribution. This is the content of the Central Limit Theorem, one of
the most important theorems in applied probability and statistics. The theorem
states that if $X_1, X_2, \ldots, X_n$ is a random sample from a population with mean µ
and standard deviation σ, then for large sample size n, the sample mean $\bar{X}$ has an
approximately normal distribution with mean µ and standard deviation $\sigma/\sqrt{n}$. The
above statement can also be expressed as follows: the sum $\sum_{i=1}^{n} X_i$ has an
approximately normal distribution with mean nµ and standard deviation $\sigma\sqrt{n}$.

To illustrate this theorem, consider a random sample $X_1, X_2, \ldots, X_n$ from a
distribution given by P(X = 1) = p, P(X = 0) = 1 − p, 0 < p < 1. Note that the
mean of X is µ = 1(p) + 0(1 − p) = p and the standard deviation is $\sqrt{p(1-p)}$. Now, the
sum $\sum_{i=1}^{n} X_i$ counts the number of 1’s in the sample of n. By the Central
Limit Theorem, it is approximately normally distributed with mean np and
standard deviation $\sqrt{np(1-p)}$. This result is illustrated by the figure given below
with n = 15 and p = 0.5.
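The comparison can also be tabulated. The sketch below is an illustration only, assuming Python 3.8 or later: it computes the exact probability of each possible count of 1's for n = 15 and p = 0.5, and sets it beside the height of the normal curve with mean np and standard deviation $\sqrt{np(1-p)}$. The helper name `binomial_pmf` is an illustrative choice.

```python
import math
from statistics import NormalDist

n, p = 15, 0.5
mean = n * p                     # np
sd = math.sqrt(n * p * (1 - p))  # sqrt(np(1 - p))
approx = NormalDist(mean, sd)    # approximating normal curve


def binomial_pmf(k: int) -> float:
    """Exact probability that the sum of n zero-one variables equals k."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


# Side-by-side table: exact probability versus normal density at each count.
for k in range(n + 1):
    print(f"{k:2d}  exact={binomial_pmf(k):.4f}  normal={approx.pdf(k):.4f}")
```

The two columns are close at every count, which is the numerical counterpart of the bell shape described in the text.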


The central limit theorem can be used to answer questions about sample
means in the same manner that the normal distribution can be used to answer
questions about individual values. The formula for computing the z value is

$$ Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} $$

where $\bar{X}$ is the sample mean,
µ is the population mean, and
σ/√n is the standard deviation of the sample mean.

The normal approximation for $\bar{X}$ will generally be good if n > 30. If
n < 30, the approximation is good only if the population is not too different from a
normal distribution. If the population is known to be normal, the sampling
distribution of $\bar{X}$ will follow a normal distribution exactly, no matter how small
the size of the samples.

Example 5 The average age of lawyers is 40 years, with a standard deviation
of 3 years. If a law firm employs 36 lawyers, find the probability
that the average age of the group is lower than 39 years.

Solution: The sampling distribution of $\bar{X}$ will be approximately normal, with
mean 40 and standard deviation 3/√36 = 0.5. The shaded region of the figure
below gives the desired probability.

[Figure: normal curve for the sample mean, centered at 40 with standard deviation 0.5; the region to the left of 39 is shaded]

Calculate the z-transform by selecting Insert and Function.
Choose Statistical and STANDARDIZE, then press OK. In the dialog
box, enter 39 for X, 40 for Mean, and 0.5 for Standard_dev. The
result is –2. Using Megastat, P(Z < -2) = 0.0228. Thus, the
probability that the average age of the 36 lawyers is lower than 39
years is 2.28%.
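The calculation for a sample mean follows the same pattern as for a single value, with σ/√n in place of σ. A minimal Python sketch of this example, assuming `statistics.NormalDist` in place of Excel and Megastat:

```python
import math
from statistics import NormalDist

mu, sigma, n = 40.0, 3.0, 36

se = sigma / math.sqrt(n)  # standard deviation of the sample mean
z = (39.0 - mu) / se       # z value for a sample mean of 39
p = NormalDist().cdf(z)    # P(sample mean < 39) = P(Z < z)

print(se, z, round(p, 4))
```

Note that the result is a statement about the mean age of the whole group of 36, not about the age of any single lawyer.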

Other Statistical Distributions

Aside from the normal distribution, other important distributions are
Student’s t distribution, the chi-square distribution, and the F distribution. Just as with the
standard normal distribution, probabilities and critical values can be obtained
from specialized tables. These tables are available in most elementary statistics books.
To generate the probabilities and critical values using Excel, click Megastat, then
select Probability and choose Continuous Probability Distributions.

Example 6 Find the critical chi-square value for n = 15 and α = 0.05 when the
test is right-tailed.

Solution: Since n = 15, the degrees of freedom = n – 1 = 15 – 1 = 14.
Using Megastat under Continuous Probability Distributions, select
chi-square distribution, then choose calculate chi-square given
probability. In the dialog box, enter 0.05 as the probability and 14
for degrees of freedom. Click Preview for the result. The solution
is 23.68.

[Figure: chi-square curve with an upper-tail area of 0.05 to the right of 23.68]

Therefore, the critical chi-square value for n = 15, α = 0.05,
and a right-tailed test is 23.68.
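For readers without Megastat, the critical value can be approximated with the Python standard library alone. The sketch below is not Megastat's method: it relies on a closed-form expression for the chi-square upper-tail area that holds only when the degrees of freedom are even (as with 14 here), and it locates the critical value by bisection. The function names are illustrative.

```python
import math


def chi2_upper_tail(x: float, df: int) -> float:
    """P(chi-square > x) for df degrees of freedom; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch handles positive even df only")
    half = x / 2.0
    term, total = 1.0, 1.0
    # Sum of (x/2)^k / k! for k = 0 .. df/2 - 1, built up term by term.
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


def chi2_critical(alpha: float, df: int) -> float:
    """Value with upper-tail area alpha, found by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_upper_tail(hi, df) > alpha:  # widen until the tail is small enough
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_upper_tail(mid, df) > alpha:
            lo = mid  # tail still too large: move right
        else:
            hi = mid  # tail small enough: move left
    return (lo + hi) / 2.0


critical = chi2_critical(0.05, 14)
print(round(critical, 2))
```

For odd degrees of freedom a statistical library or a printed table would be needed instead, since the closed form used here does not apply.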


Note: Megastat provides critical values for one-tailed tests. To find
the critical values for a two-tailed test, divide the probability (α)
by 2.

Name:_______________________________________ Score:______________
Section:______________________________________ Date: ______________

Exercise 1.1

I. Standardize the following:

No.   X      Mean     SD     Z
1.    15     17.6     4.2    ____
2.    100    98       5      ____
3.    40     37       2.9    ____
4.    721    689.23   60     ____
5.    500    820      270    ____

II. Find z given the probability and draw the figure.

6. P = 0.6289 to its right

7. P = 0.8096 to its left

8. P = 0.3780 to its left

9. P = 0.2009 to its right

10. P = 0.2777 to its left

III. Find the probability given z. Draw its figure.

11. z > 1.232

12. z < -2.39

13. z < 1.67

14. –1.87 < z < 2.85


15. –2.37 < z < 1.27

Name:_______________________________________ Score:______________
Section:______________________________________ Date: ______________

Exercise 1.2
Answer the following.

1. If the scores for the test have a mean of 90 and a standard deviation of 10,
find the percentage of scores that will fall below 80.

2. A certain type of storage battery lasts, on average, 5 years, with a standard
deviation of 0.2 years. Assuming that the battery lives are normally
distributed, find the probability that a given battery will last less than 4
years.

3. The average grade for an exam is 70, and the standard deviation is 5. If
20% of the class is given A’s, and the grades are curved to follow a normal
distribution, what are the lowest possible A and the highest possible B?

4. The loaves of raisin bread distributed to local stores by a certain bakery
have an average length of 29 centimeters and a standard deviation of 3
centimeters. Assuming that the lengths are normally distributed, what
percentage of the loaves are
a. longer than 31.2 centimeters?
b. shorter than 28.9 centimeters?
c. between 28.1 and 31.2 centimeters in length?

5. For a medical study, the researchers wish to select people in the middle
30% of the population based on blood pressure. If the mean systolic blood
pressure is 110 and the standard deviation is 20, find upper and lower
readings that would qualify people to participate in the study.

6. A certain university desires to accept only the top 15% of all graduating
seniors who took their entrance test. This test has a mean of 500 and a
standard deviation of 100. Assuming that the scores are normally
distributed, find the cutoff score for the test.

7. The IQs of 2000 applicants to a certain university are approximately
normally distributed with mean 300 and standard deviation of 40. If the
university requires an IQ of at least 95, how many of these students will be
rejected on this basis regardless of their other qualifications?

8. A jeepney arrives at a certain terminal every 8 minutes, with a standard
deviation of 0.9 minutes. It is assumed that the waiting time for a
particular individual is a random variable with a uniform distribution.
a. What is the probability that the individual waits more than 5
minutes?
b. What is the probability that the individual waits between 6 and 10
minutes?

Name:_______________________________________ Score:______________
Section:______________________________________ Date: ______________

Exercise 1.3
Answer the following.

1. The mean weight of 25-year-old males is 70 kilograms and the standard
deviation is 5.3 kilograms. If a sample of 625 males is selected, find the
probability that the mean of the sample will be greater than 75 kilograms.

2. A survey found that a family generates an average of 3.3 kilograms of
garbage daily, with a standard deviation of 0.7 kilograms. Assuming that the
amount is normally distributed, find the probability that the mean of 81
families will be between 2.5 and 3.8 kilograms.

3. At a large university, the mean age of graduate students who are majoring
in Mathematics is 34.6 years and the standard deviation is 1.2 years. If a
random sample of 64 individuals is selected, find the probability
that the mean age is
a. below 30 years old
b. above 36 years old
c. between 32 and 35 years old.


4. The time it takes for a certain pain reliever to begin reducing pain is 2
hours, with a standard deviation of 7 minutes. If a random sample of 100
patients who took the pain reliever is selected, find the probability
that the mean time for the medication to take effect is more than 100 minutes.

5. A certain company manufactures cell phones that have a length of life that
is approximately normally distributed, with mean equal to 100 months and
a standard deviation of 7 months. Find the probability that a random
sample of 25 cell phones will have an average life of less than 94 months.

6. The average height of basketball players in a professional league is 6 feet,
and the standard deviation is 3 inches. What is the probability of selecting 50
players taller than 6 feet 7 inches? Assume that the variable is normally
distributed.


Name:_______________________________________ Score:______________
Section:______________________________________ Date: ______________

Exercise 1.4
Answer the following.

1. Complete the table ( t – distribution )

Item   Degrees of freedom   Alpha (α)   Tail/s   t-value
a      28                   0.05        1        ____
b      17                   0.1         2        ____
c      19                   0.25        1        ____
d      27                   ____        2        2.977
e      39                   ____        1        2.086

2. Determine the f – value of the following:

a. f 0.05 ( 9,20 )

b. f 0.01 ( 27,10 )

c. f 0.025 ( 29, 17 )

3. Given the f – value and degrees of freedom, find the probability.

a. f – value = 1.9 ; ( 13, 19 )

b. f – value = 1.25 ; ( 185, 120 )

c. f – value = 4.56 ; ( 31, 26 )

4. Complete the table ( chi – square distribution )

Item   Degrees of freedom   Alpha (α)   χ²
a      29                   0.05        ____
b      21                   ____        35.718
c      35                   0.01        ____
d      20                   ____        36.781
e      28                   ____        37.697
