CH 09

Chapter 9: Sampling Distributions
9.1 Introduction
This chapter connects the material in Chapters 4 through 8 (numerical descriptive statistics, sampling, and probability distributions, in particular) with statistical inference, which is introduced in Chapter 10. At the completion of this chapter, you are expected to know the following:
1. How the sampling distribution of the mean is created, and the shape and parameters of the distribution.
2. How to calculate probabilities using the sampling distribution of the mean.
3. How the normal distribution can be used to approximate the binomial distribution.
4. How to calculate probabilities associated with a sample proportion.
5. How to calculate probabilities associated with the difference between two sample means.
9.2 Sampling Distribution of the Mean
The most important thing to learn from this section is that if we repeatedly draw samples from any population, the values of $\bar{x}$ calculated in each sample will vary. This new random variable created by sampling will have three important characteristics:
1. $\bar{x}$ is approximately normally distributed when the sample size n is sufficiently large (and exactly normally distributed if the population itself is normal).
2. The mean of $\bar{x}$ will equal the mean of the original random variable. That is, $\mu_{\bar{x}} = \mu_x$.
3. The variance of $\bar{x}$ will equal the variance of the original random variable divided by n. That is, $\sigma^2_{\bar{x}} = \sigma^2_x / n$.
The sampling distribution of $\bar{x}$ allows us to make probability statements about $\bar{x}$ based on knowing the values of the sample size n and the population parameters $\mu$ and $\sigma^2$.
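The second and third characteristics follow from the rules of expectation and variance. A brief sketch, assuming the n observations are drawn independently from a population with mean $\mu_x$ and variance $\sigma^2_x$ (the independence assumption and the subscripted notation $x_1, \dots, x_n$ are introduced here for illustration):

$$
\begin{aligned}
\mu_{\bar{x}} &= E\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right) = \frac{1}{n}\sum_{i=1}^{n} E(x_i) = \frac{1}{n}\,(n\,\mu_x) = \mu_x \\
\sigma^2_{\bar{x}} &= V\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right) = \frac{1}{n^2}\sum_{i=1}^{n} V(x_i) = \frac{1}{n^2}\,(n\,\sigma^2_x) = \frac{\sigma^2_x}{n}
\end{aligned}
$$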
Example 9.1
A random variable possesses the following probability distribution:

x       1    2    3
p(x)   .2   .5   .3

a) Find all possible samples of size 2 that can be drawn from this population.
b) Using the results in part a), find the sampling distribution of $\bar{x}$.
c) Confirm that $\mu_{\bar{x}} = \mu_x$ and $\sigma^2_{\bar{x}} = \sigma^2_x / n$.
Solution
a) There are nine possible samples of size 2. They are (1,1), (1,2), (1,3), (2,1), (2,2), (2,3), (3,1), (3,2), and (3,3).
b) The samples, the values of $\bar{x}$, and the probability of each sample outcome are shown below:

Sample    $\bar{x}$    Probability
(1,1)     1.0          (.2)(.2) = .04
(1,2)     1.5          (.2)(.5) = .10
(1,3)     2.0          (.2)(.3) = .06
(2,1)     1.5          (.5)(.2) = .10
(2,2)     2.0          (.5)(.5) = .25
(2,3)     2.5          (.5)(.3) = .15
(3,1)     2.0          (.3)(.2) = .06
(3,2)     2.5          (.3)(.5) = .15
(3,3)     3.0          (.3)(.3) = .09

The sampling distribution of $\bar{x}$ follows:

$\bar{x}$       1.0   1.5   2.0   2.5   3.0
$p(\bar{x})$    .04   .20   .37   .30   .09
c) Using our definitions of expected value and variance, we find the mean and variance of the random variable x:

$$\mu_x = E(x) = \sum x\,p(x) = 1(.2) + 2(.5) + 3(.3) = 2.1$$

$$\sigma^2_x = \sum (x - \mu_x)^2\,p(x) = (1 - 2.1)^2(.2) + (2 - 2.1)^2(.5) + (3 - 2.1)^2(.3) = 0.49$$
The mean and variance of the random variable $\bar{x}$ are computed as follows:

$$\mu_{\bar{x}} = E(\bar{x}) = \sum \bar{x}\,p(\bar{x}) = 1.0(.04) + 1.5(.20) + 2.0(.37) + 2.5(.30) + 3.0(.09) = 2.1$$

$$\sigma^2_{\bar{x}} = \sum (\bar{x} - \mu_{\bar{x}})^2\,p(\bar{x}) = (1.0 - 2.1)^2(.04) + (1.5 - 2.1)^2(.20) + (2.0 - 2.1)^2(.37) + (2.5 - 2.1)^2(.30) + (3.0 - 2.1)^2(.09) = 0.245$$
As you can see, $\mu_{\bar{x}} = \mu_x = 2.1$ and $\sigma^2_{\bar{x}} = \sigma^2_x / n = 0.49 / 2 = 0.245$.
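The enumeration in Example 9.1 can also be checked mechanically. The following is a minimal Python sketch, not part of the original text. The function names `mean_and_variance` and `sampling_distribution` are hypothetical helpers introduced for illustration, and exact fractions are used so that the two identities can be compared without rounding error.

```python
from fractions import Fraction
from itertools import product

# Population distribution from Example 9.1, stored as exact fractions
population = {1: Fraction(2, 10), 2: Fraction(5, 10), 3: Fraction(3, 10)}
n = 2  # sample size


def mean_and_variance(dist):
    """Return (mean, variance) of a discrete distribution {value: probability}."""
    mu = sum(v * p for v, p in dist.items())
    var = sum((v - mu) ** 2 * p for v, p in dist.items())
    return mu, var


def sampling_distribution(dist, n):
    """Enumerate every ordered sample of size n and accumulate P(sample mean)."""
    result = {}
    for sample in product(dist, repeat=n):
        prob = Fraction(1)
        for value in sample:
            prob *= dist[value]  # draws are independent, so probabilities multiply
        xbar = Fraction(sum(sample), n)
        result[xbar] = result.get(xbar, Fraction(0)) + prob
    return result


xbar_dist = sampling_distribution(population, n)
mu_x, var_x = mean_and_variance(population)
mu_xbar, var_xbar = mean_and_variance(xbar_dist)

# Print the sampling distribution, then the two means and two variances
for xbar in sorted(xbar_dist):
    print(float(xbar), float(xbar_dist[xbar]))
print(float(mu_x), float(var_x), float(mu_xbar), float(var_xbar))
```

Changing `n` to 3 and replacing `population` with the distribution in Exercise 9.3 gives a way to check a hand-computed answer to that exercise.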
Example 9.2
Suppose a random sample of 100 observations is drawn from a normal population whose mean is 600 and whose variance is 2,500. Find the following probabilities:
a) $P(590 < x < 610)$
b) $P(590 < \bar{x} < 610)$
c) $P(x > 650)$
d) $P(\bar{x} > 650)$
Solution
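a) x is normally distributed with mean $\mu_x = 600$ and variance $\sigma^2_x = 2{,}500$. We standardize x by subtracting $\mu_x = 600$ and dividing by $\sigma_x = 50$. Therefore,

$$P(590 < x < 610) = P\left(\frac{590 - 600}{50} < \frac{x - \mu_x}{\sigma_x} < \frac{610 - 600}{50}\right) = P(-.2 < z < .2) = .1586$$

b) We know that $\bar{x}$ is normally distributed with $\mu_{\bar{x}} = \mu_x = 600$ and $\sigma^2_{\bar{x}} = \sigma^2_x / n = 2{,}500 / 100 = 25$. Thus, $\sigma_{\bar{x}} = 5$. Hence,

$$P(590 < \bar{x} < 610) = P\left(\frac{590 - 600}{5} < \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} < \frac{610 - 600}{5}\right) = P(-2 < z < 2) = .9544$$

c)
$$P(x > 650) = P\left(\frac{x - \mu_x}{\sigma_x} > \frac{650 - 600}{50}\right) = P(z > 1) = .1587$$

d)
$$P(\bar{x} > 650) = P\left(\frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} > \frac{650 - 600}{5}\right) = P(z > 10) \approx 0$$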
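The four probabilities in Example 9.2 can be reproduced without a normal table. The sketch below is an illustration using `statistics.NormalDist` from the Python standard library (Python 3.8 or later), and it is not the method used in the text. The variable names are illustrative, and the results may differ from the table-based answers in the fourth decimal place because the table rounds z.

```python
from math import sqrt
from statistics import NormalDist

mu, variance, n = 600, 2500, 100

# Distribution of a single observation x: standard deviation is sqrt(2500) = 50
x_dist = NormalDist(mu=mu, sigma=sqrt(variance))
# Sampling distribution of the sample mean: standard deviation is sqrt(2500 / 100) = 5
xbar_dist = NormalDist(mu=mu, sigma=sqrt(variance / n))

p_a = x_dist.cdf(610) - x_dist.cdf(590)        # P(590 < x < 610)
p_b = xbar_dist.cdf(610) - xbar_dist.cdf(590)  # P(590 < xbar < 610)
p_c = 1 - x_dist.cdf(650)                      # P(x > 650)
p_d = 1 - xbar_dist.cdf(650)                   # P(xbar > 650)

for label, prob in [("a", p_a), ("b", p_b), ("c", p_c), ("d", p_d)]:
    print(f"{label}) {prob:.4f}")
```

Comparing parts a) and b) shows how much more tightly the sample mean clusters around 600 than a single observation does.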
Example 9.3
Refer to Example 9.2. Suppose a random sample of 100 observations produced a mean of $\bar{x} = 650$. What does this imply about the statement that $\mu = 600$ and $\sigma^2 = 2{,}500$?
Solution
From Example 9.2 part d), we found that $P(\bar{x} > 650)$ is essentially 0. Therefore, it is quite unlikely that we could observe a sample mean of 650 in a sample of 100 observations drawn from a population whose mean is 600 and whose variance is 2,500. Having observed such a sample mean, we have good reason to doubt the statement.
Question: What purpose does the sampling distribution serve? In particular, why do we need to calculate probabilities associated with the sample mean?
Answer: (in reverse order) We are not terribly interested in making probability statements about $\bar{x}$. Since knowledge of $\mu$ and $\sigma^2$ is required in order to compute the probability that $\bar{x}$ falls into some specific interval, we acknowledge that this procedure is quite unrealistic. However, the sampling distribution will eventually allow us to infer something about an unknown population mean from a sample mean. This process, called statistical inference, will be the main topic throughout the rest of the textbook.
EXERCISES
9.1 If 64 observations are taken from a population with $\mu = 100$ and $\sigma = 40$, find $P(102 < \bar{x} < 112)$.
9.2 A normally distributed random variable has a mean of 20 and a standard deviation of 10. If a random sample of 25 is drawn from this population, find $P(\bar{x} > 23)$.
9.3 Given the probability distribution of x below, find all samples of size 3, the sampling distribution of $\bar{x}$, and the mean and variance of $\bar{x}$.

x       0    1
p(x)   .7   .3
9.3 Creating the Sampling Distribution by Computer Simulation

In Section 9.2, we described how the sampling distribution was created theoretically. We also pointed out that sampling distributions can be created empirically, but that the effort can be extremely time-consuming. In this section, we use the computer and our two software packages to create several sampling distributions empirically.
The concept of the sampling distribution is critical to the development of statistical inference. It is important that you understand that the sampling distribution of any statistic is created theoretically or empirically by repeated sampling from a population. In each sample we calculate the statistic and thus create the distribution of that statistic. Throughout the book we will introduce about 20 different sampling distributions.
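To make the idea of repeated sampling concrete, here is a minimal simulation sketch in Python. Python is a stand-in chosen for illustration, not one of the software packages the text refers to. The sketch reuses the population from Example 9.1 with samples of size 2. The seed and the number of repetitions are arbitrary choices.

```python
import random
from collections import Counter

random.seed(1)  # fixed seed so the run is repeatable (arbitrary choice)

values = [1, 2, 3]       # population values from Example 9.1
probs = [0.2, 0.5, 0.3]  # their probabilities
n = 2                    # sample size
repetitions = 50_000     # number of simulated samples (arbitrary choice)

# Draw many samples and record the mean of each one
sample_means = [
    sum(random.choices(values, weights=probs, k=n)) / n
    for _ in range(repetitions)
]

# Empirical sampling distribution: relative frequency of each sample mean
frequencies = Counter(sample_means)
for xbar in sorted(frequencies):
    print(xbar, frequencies[xbar] / repetitions)

# Empirical mean and variance of the simulated sample means
sim_mean = sum(sample_means) / repetitions
sim_var = sum((m - sim_mean) ** 2 for m in sample_means) / repetitions
print(sim_mean, sim_var)
```

With enough repetitions, the relative frequencies should settle close to the theoretical sampling distribution, and the empirical mean and variance should be close to $\mu_x$ and $\sigma^2_x / n$.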
9.4 Sampling Distribution of a Proportion

The sampling distribution of a sample proportion is actually based on the binomial distribution. However, the primary purpose of creating the sampling distribution is for inference, and the binomial distribution, which is discrete, makes inference somewhat difficult. Consequently, we use the normal approximation to the binomial distribution. The details are not particularly important to the applied statistician (that's you). What is important is that you understand how the sampling distribution is used. The sampling distribution of $\hat{p}$ is approximately normal with mean $p$ and variance $p(1-p)/n$. Thus

$$z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}}$$

is approximately standard normally distributed.
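The mean and variance of the sample proportion follow from the binomial distribution. A short sketch, writing $X$ for the number of successes in $n$ trials (the symbol $X$ is introduced here for illustration), so that $\hat{p} = X/n$, $E(X) = np$ and $V(X) = np(1-p)$:

$$
\begin{aligned}
E(\hat{p}) &= \frac{E(X)}{n} = \frac{np}{n} = p \\
V(\hat{p}) &= \frac{V(X)}{n^2} = \frac{np(1-p)}{n^2} = \frac{p(1-p)}{n}
\end{aligned}
$$

Note that $np(1-p)$ is the variance of the count $X$, whereas $p(1-p)/n$ is the variance of the proportion $\hat{p}$, which is the quantity under the square root in the denominator of $z$.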
Example 9.4
A fair coin is flipped 400 times. Find the probability that the proportion of heads falls between .48 and .52.
Solution
We wish to find $P(.48 < \hat{p} < .52)$. We employ the approximate normal sampling distribution. Because the coin is fair, p = .5.

$$P(.48 < \hat{p} < .52) = P\left(\frac{.48 - .5}{\sqrt{(.5)(.5)/400}} < \frac{\hat{p} - p}{\sqrt{p(1-p)/n}} < \frac{.52 - .5}{\sqrt{(.5)(.5)/400}}\right) = P(-.8 < z < .8) = 2(.2881) = .5762$$
Example 9.5
Repeat Example 9.4, changing the number of flips to 1000.
Solution
$$P(.48 < \hat{p} < .52) = P\left(\frac{.48 - .5}{\sqrt{(.5)(.5)/1000}} < \frac{\hat{p} - p}{\sqrt{p(1-p)/n}} < \frac{.52 - .5}{\sqrt{(.5)(.5)/1000}}\right) = P(-1.26 < z < 1.26) = 2(.3962) = .7924$$
Question: Why don't we use the 1/2-correction factor?
Answer: When the sample size is large, the effect of the correction factor is negligible. Omitting it only slightly affects the approximation but simplifies our calculation.
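The calculations in Examples 9.4 and 9.5 differ only in the sample size, so they can be wrapped in a single function. The following Python sketch is an illustration only. The function name `prob_proportion_between` is hypothetical, and like the examples it applies the normal approximation without the 1/2-correction factor.

```python
from math import sqrt
from statistics import NormalDist


def prob_proportion_between(lower, upper, p, n):
    """Normal approximation to P(lower < p_hat < upper), no continuity correction."""
    std_error = sqrt(p * (1 - p) / n)  # standard deviation of the sample proportion
    z = NormalDist()                   # standard normal distribution
    return z.cdf((upper - p) / std_error) - z.cdf((lower - p) / std_error)


p_400 = prob_proportion_between(0.48, 0.52, p=0.5, n=400)    # Example 9.4
p_1000 = prob_proportion_between(0.48, 0.52, p=0.5, n=1000)  # Example 9.5
print(round(p_400, 4), round(p_1000, 4))
```

Because the standard deviation of the sample proportion shrinks as n grows, the probability of landing in a fixed interval around p increases with the number of flips.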
EXERCISES
9.4 The proportion of defective units coming off a production line is 5%. Find the probability that in a random sample of 100 units more than 10% are defective.
9.5 Repeat Exercise 9.4 with a sample of 400 units.
9.6 In the last election, a local councillor received 52% of the vote. If her popularity level is unchanged, what is the probability that in a random sample of 200 voters less than 50% would vote for her?
9.5 Sampling Distribution of the Difference between Two Means

The sampling distribution of the difference between two means is developed by extending the Central Limit Theorem. That is, for independent samples, the sampling distribution of $\bar{x}_1 - \bar{x}_2$ is approximately normally distributed (if the two variables are normal, then $\bar{x}_1 - \bar{x}_2$ is normally distributed as well) with mean $\mu_1 - \mu_2$ and standard deviation

$$\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

Thus,

$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

is either standard normally distributed or approximately standard normally distributed. We can use this sampling distribution in the same way we employed the sampling distribution of the sample mean, to make probability statements about the difference between two sample means.
Example 9.6
Suppose that we draw random samples of size 5 from two normal populations. The mean and standard deviation of population 1 are 100 and 25. The mean and standard deviation of population 2 are 90 and 40. Find the probability that the mean of sample 1 exceeds the mean of sample 2.
Solution
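We want to determine $P[(\bar{x}_1 - \bar{x}_2) > 0]$. The mean and standard deviation of the sampling distribution are

$$\mu_1 - \mu_2 = 100 - 90 = 10$$

$$\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = \sqrt{\frac{25^2}{5} + \frac{40^2}{5}} = 21.1$$

Thus,

$$P[(\bar{x}_1 - \bar{x}_2) > 0] = P\left(\frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} > \frac{0 - 10}{21.1}\right) = P(z > -.47) = .5 + .1808 = .6808$$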
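The same calculation can be sketched in Python. This is an illustration under the assumptions of Example 9.6 (two normal populations, independent samples). The function name `prob_mean1_exceeds_mean2` is hypothetical. The result may differ slightly from the table-based answer because the table rounds z to two decimals.

```python
from math import sqrt
from statistics import NormalDist


def prob_mean1_exceeds_mean2(mu1, sigma1, n1, mu2, sigma2, n2):
    """P(xbar1 - xbar2 > 0) for independent samples from two normal populations."""
    diff_mean = mu1 - mu2                                # mean of xbar1 - xbar2
    diff_sd = sqrt(sigma1**2 / n1 + sigma2**2 / n2)      # its standard deviation
    return 1 - NormalDist(mu=diff_mean, sigma=diff_sd).cdf(0)


# Example 9.6: samples of size 5 from N(100, 25^2) and N(90, 40^2)
p = prob_mean1_exceeds_mean2(100, 25, 5, 90, 40, 5)
print(round(p, 4))
```

The same function can be used to check hand-computed answers to Exercises 9.7 and 9.8 by substituting the means, standard deviations, and sample sizes given there.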
EXERCISES
9.7 The assistant dean of a business school claims that the number of job offers received by MBAs whose major is finance is normally distributed with a mean of 12 and a standard deviation of 2.5. Furthermore, he states that the number of job offers received by marketing majors is normally distributed with a mean of 10 and a standard deviation of 3. Find the probability that in a random sample of 10 finance and 10 marketing majors, the average finance major receives more job offers than the average marketing major.
9.8 Repeat Exercise 9.7 using sample sizes of 25.