
Copyright 2010 John Wiley & Sons, Inc.

Chapter 7

Sampling and
Sampling
Distributions
Learning Objectives
Determine when to use sampling instead of a census.
Distinguish between random and nonrandom
sampling.
Decide when and how to use various sampling
techniques.
Be aware of the different types of errors that can
occur in a study.
Understand the impact of the Central Limit Theorem
on statistical analysis.
Use the sampling distributions of $\bar{x}$ and $\hat{p}$.
Reasons for Sampling
Sampling: a means of gathering useful information
about a population
Information is gathered from the sample, and conclusions are drawn about the population
Sampling has advantages over a census
Sampling can save money.
Sampling can save time.
Reasons for Taking a Census
Eliminate the possibility that a random sample is
not representative of the population.
The person authorizing the study is uncomfortable
with sample information.
For the safety of the consumer.

Random Versus Nonrandom Sampling
Nonrandom sampling - Not every unit of the population
has the same probability of being included
in the sample
Random sampling - Every unit of the population has
the same probability of being included in the sample.
Population Frame
A list, map, directory, or other source used to represent the
population

Overregistration -- the frame contains all members of the target
population and some additional elements
Example: Voting more than once in the National Elections

Underregistration -- the frame does not contain all members of the
target population.
Example: Not voting despite registering as a voter
Random Sampling Techniques
Simple random sample: the basis for other random
sampling techniques
Each unit of the population is numbered from 1 to N
A random number generator can be used to select
n items from the population
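The selection step can be sketched in a few lines of Python. This is only an illustration, assuming the population units are numbered 1 to N; the function name `simple_random_sample` and the `seed` argument are hypothetical and not part of the slides.

```python
import random


def simple_random_sample(population_size, sample_size, seed=None):
    """Draw sample_size distinct unit numbers from 1..population_size."""
    rng = random.Random(seed)  # a seed only makes the draw repeatable
    # random.sample selects without replacement, so no unit repeats.
    return sorted(rng.sample(range(1, population_size + 1), sample_size))


# N = 30 units and n = 6 sample members, as in the example that follows.
sample = simple_random_sample(30, 6, seed=7)
print(sample)
```

Every unit number has the same chance of being drawn, which is what makes the sample random.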
N = 30
n = 6
01 Alaska Airlines
02 Alcoa
03 Ashland
04 Bank of America
05 BellSouth
06 Chevron
07 Citigroup
08 Clorox
09 Delta Air Lines
10 Disney
11 DuPont
12 Exxon Mobil
13 General Dynamics
14 General Electric
15 General Mills
16 Halliburton
17 IBM
18 Kellogg
19 KMart
20 Lowes
21 Lucent
22 Mattel
23 Mead
24 Microsoft
25 Occidental Petroleum
26 JCPenney
27 Procter & Gamble
28 Ryder
29 Sears
30 Time Warner
Simple Random Sample:
Sample Members
9 9 4 3 7 8 7 9 6 1 4 5 7 3 7 3 7 5 5 2 9 7 9 6 9 3 9 0 9 4 3 4 4 7 5 3 1 6 1 8
5 0 6 5 6 0 0 1 2 7 6 8 3 6 7 6 6 8 8 2 0 8 1 5 6 8 0 0 1 6 7 8 2 2 4 5 8 3 2 6
8 0 8 8 0 6 3 1 7 1 4 2 8 7 7 6 6 8 3 5 6 0 5 1 5 7 0 2 9 6 5 0 0 2 6 4 5 5 8 7
8 6 4 2 0 4 0 8 5 3 5 3 7 9 8 8 9 4 5 4 6 8 1 3 0 9 1 2 5 3 8 8 1 0 4 7 4 3 1 9
6 0 0 9 7 8 6 4 3 6 0 1 8 6 9 4 7 7 5 8 8 9 5 3 5 9 9 4 0 0 4 8 2 6 8 3 0 6 0 6
5 2 5 8 7 7 1 9 6 5 8 5 4 5 3 4 6 8 3 4 0 0 9 9 1 9 9 7 2 9 7 6 9 4 8 1 5 9 4 1
8 9 1 5 5 9 0 5 5 3 9 0 6 8 9 4 8 6 3 7 0 7 9 5 5 4 7 0 6 2 7 1 1 8 2 6 4 4 9 3
Simple Random Sampling:
Random Number Table
N = 30
n = 6
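One possible way to use the table is sketched below. The slides do not state the reading rule, so this assumes digits are read in pairs from left to right starting at the first row, and that pairs outside 01 to 30, or already chosen, are skipped. The helper `pick_from_table` is hypothetical.

```python
# First two rows of the random number table above, digits only.
ROWS = [
    "9943787961457373755297969390943447531618",
    "5065600127683676688208156800167822458326",
]


def pick_from_table(rows, population_size, sample_size):
    """Read two-digit numbers in order and keep the usable ones."""
    digits = "".join(rows)
    chosen = []
    for i in range(0, len(digits) - 1, 2):
        number = int(digits[i:i + 2])
        # Skip numbers with no matching unit and numbers already chosen.
        if 1 <= number <= population_size and number not in chosen:
            chosen.append(number)
        if len(chosen) == sample_size:
            break
    return chosen


selected = pick_from_table(ROWS, population_size=30, sample_size=6)
print(selected)  # [16, 18, 1, 27, 8, 15]
```

Under that assumed reading rule the sample would be units 16, 18, 01, 27, 08, and 15 of the numbered list. A different starting point or direction would give a different, equally valid sample.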
Stratified Random Sample
Stratified random sampling: the population is divided
into non-overlapping subpopulations called strata
The researcher extracts a simple random sample from each
subpopulation
Stratified random sampling has the potential for reducing
sampling error
Stratified Random Sample: Population of
FM Radio Listeners
[Figure: listeners stratified by age into three strata: 20 - 30, 30 - 40, and 40 - 50 years old. Listeners are homogeneous (alike) within each stratum and heterogeneous (different) between strata.]
Stratified Random Sample
Proportionate -- the percentage of the sample taken
from each stratum is proportionate to the percentage
that each stratum is within the population
Disproportionate -- the proportions of the strata within
the sample are different from the proportions of the
strata within the population
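Proportionate allocation can be illustrated with a short Python sketch. The stratum sizes and listener IDs below are made up for illustration, and the function `proportionate_stratified_sample` is a hypothetical helper, not something defined in the slides.

```python
import random


def proportionate_stratified_sample(strata, sample_size, seed=None):
    """Draw a simple random sample from each stratum, sized in
    proportion to the stratum's share of the population."""
    rng = random.Random(seed)
    population_size = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        # Rounding can make the total differ slightly from sample_size.
        stratum_n = round(sample_size * len(units) / population_size)
        sample[name] = rng.sample(units, stratum_n)
    return sample


# Hypothetical listener IDs grouped by age stratum (made-up sizes).
strata = {
    "20-30": list(range(0, 500)),
    "30-40": list(range(500, 800)),
    "40-50": list(range(800, 1000)),
}
stratified = proportionate_stratified_sample(strata, sample_size=100, seed=1)
print({name: len(units) for name, units in stratified.items()})
# {'20-30': 50, '30-40': 30, '40-50': 20}
```

A disproportionate design would replace the computed `stratum_n` with sample sizes chosen by the researcher for each stratum.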
Stratified Random Sample
Stratified random sampling has the potential to match the
sample closely to the population
Stratified sampling is more costly than simple random sampling
Each stratum should be relatively homogeneous; typical stratification
variables include race, gender, and religion
Systematic Sampling
Used because of its convenience and ease
of administration
Population elements are an ordered sequence
(at least, conceptually).
With systematic sampling, every k-th item is selected to
produce a sample of size n from a population of size N
$k = \frac{N}{n}$
where:
n = sample size
N = population size
k = size of selection interval
Systematic Sampling
The first sample element is selected randomly from the
first k elements of the frame.
Thereafter, sample elements are selected at a
constant interval, k, from the ordered sequence
frame.
Advantages of systematic sampling
Systematic sampling is evenly distributed across the frame
It is easy to determine whether the sampling plan has been followed
Systematic sampling is based on the assumption that the
source of population elements is random
Systematic Sampling: Example
Purchase orders for the previous fiscal year are
serialized 1 to 10,000 (N = 10,000).
A sample of fifty (n = 50) purchase orders is
needed for an audit.
k = 10,000/50 = 200
Systematic Sampling: Example
The first sample element is randomly selected from the
first 200 purchase orders. Assume the 45th
purchase order was selected.
Sample elements: 45, 245, 445, 645, . . .
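The purchase order example can be reproduced with a small sketch. It assumes N is an exact multiple of n, as it is here; the function `systematic_sample` and its `start` and `seed` arguments are hypothetical names.

```python
import random


def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Select every k-th serial number after a random start in 1..k."""
    k = population_size // sample_size  # selection interval, k = N / n
    if start is None:
        start = random.Random(seed).randint(1, k)
    return [start + i * k for i in range(sample_size)]


# N = 10,000 purchase orders, n = 50, first element assumed to be the 45th.
orders = systematic_sample(10_000, 50, start=45)
print(orders[:4], orders[-1])  # [45, 245, 445, 645] 9845
```

Leaving `start` unset draws the first element at random from the first k serial numbers, which is the step that makes the procedure a random sampling technique.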
Cluster Sampling
Cluster sampling involves dividing the population
into non-overlapping areas, or clusters
It identifies clusters that tend to be internally
heterogeneous
Each cluster is a microcosm of the population
If the clusters are too large, a second set of clusters is
taken from each original cluster
This is two-stage sampling
Cluster Sampling
[Figure: cities shown as example clusters: San Jose, Boise, Phoenix, Denver, Cedar Rapids, Buffalo, Louisville, Atlanta, Portland, Milwaukee, Kansas City, San Diego, Tucson, Grand Forks, Fargo, Sherman-Denison, Odessa-Midland, Cincinnati, Pittsfield]
Cluster Sampling
Advantages
More convenient for geographically dispersed populations
Reduced travel costs to contact sample elements
Simplified administration of the survey
Can be used when the unavailability of a sampling frame prohibits
other random sampling methods
Cluster Sampling
Disadvantages
Statistically less efficient when the cluster elements
are similar
Costs and problems of statistical analysis are greater
than for simple random sampling
Nonrandom Sampling
Nonrandom sampling: sampling techniques used
to select elements from the population by any
mechanism that does not involve a random selection
process
These techniques are not desirable for use in gathering
data to be analyzed by inferential statistics
Sampling error cannot be determined objectively for
these techniques
Nonrandom Sampling
Convenience Sampling: sample elements are
selected for the convenience of the researcher
Judgment Sampling: sample elements are
selected by the judgment of the researcher
Quota Sampling: sample elements are selected
until the quota controls are satisfied
Snowball Sampling: survey subjects are selected
based on referral from other survey respondents
Errors
Data from nonrandom samples are not appropriate
for analysis by inferential statistical methods.
Sampling error occurs when the sample is not
representative of the population
Nonsampling errors are all errors other than sampling
errors
Missing data, recording, data entry, and analysis errors
Poorly conceived concepts, unclear definitions, and
defective questionnaires
Response errors, which occur when people do not know, will not
say, or overstate in their answers
Sampling Distribution of $\bar{x}$
Process of Inferential Statistics
[Figure: start with the population mean $\mu$ (parameter); select a random sample; calculate the sample mean $\bar{x}$ (statistic) to estimate $\mu$.]
Proper analysis and interpretation of a sample
statistic requires knowledge of its distribution.
Central Limit Theorem
The central limit theorem allows one to study
populations with differently shaped distributions
The central limit theorem creates the potential for
applying the normal distribution to many problems
when the sample size is sufficiently large
Central Limit Theorem
An advantage of the central limit theorem is that sample
data drawn from populations that are not normally
distributed, or from populations of unknown shape, can
also be analyzed, because the sample means are
approximately normally distributed when the sample size is large
Central Limit Theorem
As sample size increases, the distribution of sample means narrows
This is due to the standard deviation of the mean
The standard deviation of the mean decreases as sample size increases
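The narrowing can be checked with a small simulation. The sketch below assumes an exponential population with mean 1 and standard deviation 1, chosen only because it is clearly non-normal; the function `sample_mean_spread`, the repetition count, and the seed are all hypothetical choices.

```python
import random
import statistics


def sample_mean_spread(sample_size, repetitions=2000, seed=0):
    """Standard deviation of many sample means from a skewed population."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.expovariate(1.0) for _ in range(sample_size))
        for _ in range(repetitions)
    ]
    return statistics.stdev(means)


# The spread of the sample means shrinks as the sample size grows.
for n in (2, 10, 40):
    print(n, round(sample_mean_spread(n), 3))
```

With a population standard deviation of 1, the printed values should fall near 1 divided by the square root of n, so quadrupling the sample size roughly halves the spread.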
Sampling from a Normal Population
The distribution of sample means is normal for
any sample size.
If $\bar{x}$ is the mean of a random sample of size n
from a normal population with mean $\mu$ and
standard deviation $\sigma$, the distribution of $\bar{x}$ is
a normal distribution with mean $\mu_{\bar{x}} = \mu$ and
standard deviation $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$.
Z Formula for Sample Means
$Z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
Tire Store Example
Suppose, for example, that the mean expenditure
per customer at a tire store is $85.00, with a standard
deviation of $9.00. If a random sample of 40 customers
is taken, what is the probability that the sample average
expenditure per customer for this sample will be $87.00
or more? Because the sample size is greater than 30, the
central limit theorem can be used, and the sample means
are approximately normally distributed. With $\mu = 85.00$ and
$\sigma = 9.00$ (in dollars) and the z formula for sample means,
z is computed as shown on the next slide.
Solution to Tire Store Example
Population parameters: $\mu = 85$, $\sigma = 9$
Sample size: $n = 40$
$P(\bar{x} \geq 87) = P\left(Z \geq \frac{87 - \mu_{\bar{x}}}{\sigma_{\bar{x}}}\right) = P\left(Z \geq \frac{87 - \mu}{\sigma / \sqrt{n}}\right) = P\left(Z \geq \frac{87 - 85}{9 / \sqrt{40}}\right)$
$= P(Z \geq 1.41) = .5 - P(0 \leq Z \leq 1.41) = .5 - .4207 = .0793$
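The same calculation can be checked numerically. This sketch uses Python's standard-library `statistics.NormalDist` in place of a z table; the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 85.0, 9.0, 40          # population mean, std dev, sample size
standard_error = sigma / sqrt(n)      # standard deviation of the sample mean
z = (87.0 - mu) / standard_error      # z value for a sample mean of $87.00
probability = 1.0 - NormalDist().cdf(z)   # upper-tail area, P(Z >= z)

print(round(z, 2), round(probability, 3))
```

The unrounded z gives a probability of about .080. The slide's .0793 comes from rounding z to 1.41 before the table lookup, so the two answers agree to within table precision.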
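Graphic Solution to Tire Store Example
$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{87 - 85}{9 / \sqrt{40}} = \frac{2}{1.42} = 1.41$
[Figure: two normal curves with equal shaded upper-tail areas of .0793. On the $\bar{x}$ scale ($\mu = 85$, $\sigma_{\bar{x}} = 9 / \sqrt{40} = 1.42$) the tail lies above 87; on the Z scale ($\sigma = 1$) it lies above 1.41. The area between the mean and the cutoff is .4207, and the area on each side of the mean is .5000.]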
Demonstration Problem 7.1
Suppose that during any hour in a large department
store, the average number of shoppers is 448, with
a standard deviation of 21 shoppers. What is the
probability that a random sample of 49 different
shopping hours will yield a sample mean between
441 and 446 shoppers?
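Graphic Solution for
Demonstration Problem 7.1
$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{441 - 448}{21 / \sqrt{49}} = -2.33$
$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{446 - 448}{21 / \sqrt{49}} = -0.67$
[Figure: two normal curves with the same shaded area. On the $\bar{x}$ scale ($\mu = 448$, $\sigma_{\bar{x}} = 3$) the area lies between 441 and 446; on the Z scale ($\sigma = 1$) it lies between -2.33 and -.67. The area from -2.33 to 0 is .4901 and the area from -.67 to 0 is .2486, so the shaded area is .4901 - .2486 = .2415.]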
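A numerical check of Demonstration Problem 7.1 is sketched below, again using the standard-library `statistics.NormalDist` rather than a z table. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 448.0, 21.0, 49
standard_error = sigma / sqrt(n)              # 21 / 7 = 3 shoppers
z_low = (441.0 - mu) / standard_error         # lower bound on the Z scale
z_high = (446.0 - mu) / standard_error        # upper bound on the Z scale

standard_normal = NormalDist()
# Area between the two z values, P(441 <= sample mean <= 446).
probability = standard_normal.cdf(z_high) - standard_normal.cdf(z_low)

print(round(z_low, 2), round(z_high, 2), round(probability, 3))
```

The result is about .243, slightly above the table-based .2415 because the z values are not rounded to two decimals before the areas are found.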
Sampling from a Finite Population
without Replacement
In this case, the standard deviation of the distribution of
sample means is smaller than when sampling from an
infinite population (or from a finite population with
replacement).
The correct value of this standard deviation is computed
by applying a finite correction factor to the standard
deviation for sampling from an infinite population.
If the sample size is less than 5% of the population size,
the adjustment is unnecessary.
Sampling from a Finite Population
Finite Correction Factor
$\sqrt{\frac{N - n}{N - 1}}$
Modified Z Formula
$Z = \frac{\bar{x} - \mu}{\frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}}$
Sampling from a Finite Population
A production company's 350 hourly employees
average 37.6 years of age with a standard deviation of 8.3
years. If a random sample of 45 hourly employees is
taken, what is the probability that the sample will
have an average age of less than 40 years?
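The slides pose this question without a worked answer, so the following is only a sketch of how the modified Z formula with the finite correction factor could be applied. The sample is 45/350, or about 13% of the population, which is above the 5% threshold, so the correction is used. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

N, n = 350, 45                 # population size and sample size
mu, sigma = 37.6, 8.3          # population mean and standard deviation (years)

correction = sqrt((N - n) / (N - 1))              # finite correction factor
standard_error = sigma / sqrt(n) * correction     # corrected std dev of the mean
z = (40.0 - mu) / standard_error
probability = NormalDist().cdf(z)                 # lower-tail area, P(Z < z)

print(round(correction, 3), round(z, 2), round(probability, 3))
```

Under these assumptions z is roughly 2.07 and the probability is roughly .98. Without the correction factor the standard error would be larger and z correspondingly smaller.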
Sampling Distribution of $\hat{p}$
Sample Proportion
$\hat{p} = \frac{X}{n}$
where:
X = number of items in a sample that possess the characteristic
n = number of items in the sample
Sampling Distribution
Approximately normal if nP > 5 and nQ > 5
(P is the population proportion and Q = 1 - P.)
The mean of the distribution is P.
The standard deviation of the distribution is $\sqrt{\frac{P \cdot Q}{n}}$
Sampling Distribution of $\hat{p}$
$\hat{p}$ is a sample proportion
Whereas the mean is computed by averaging a set
of values, the sample proportion is computed by
dividing the frequency with which a given
characteristic occurs in a sample by the number
of items in the sample (that is, $\hat{p} = X/n$)
Z Formula for Sample Proportions
$Z = \frac{\hat{p} - P}{\sqrt{\frac{P \cdot Q}{n}}}$
where:
$\hat{p}$ = sample proportion
n = sample size
P = population proportion
Q = 1 - P
nP > 5
nQ > 5
Demonstration Problem 7.3
If 10% of a population of parts is defective,
what is the probability of randomly selecting
80 parts and finding that 12 or more parts are
defective?
Solution for Demonstration Problem 7.3
Population parameters: P = 0.10, Q = 1 - P = 1 - .10 = .90
Sample: n = 80, X = 12, $\hat{p} = \frac{X}{n} = \frac{12}{80} = 0.15$
$P(\hat{p} \geq .15) = P\left(Z \geq \frac{.15 - P}{\sqrt{\frac{P \cdot Q}{n}}}\right) = P\left(Z \geq \frac{.15 - .10}{\sqrt{\frac{(.10)(.90)}{80}}}\right) = P\left(Z \geq \frac{0.05}{0.0335}\right)$
$= P(Z \geq 1.49) = .5 - P(0 \leq Z \leq 1.49) = .5 - .4319 = .0681$
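The proportion calculation can be checked the same way. This sketch uses the standard-library `statistics.NormalDist`; the variable names are illustrative, with `P` and `Q` following the slide notation.

```python
from math import sqrt
from statistics import NormalDist

P = 0.10                 # population proportion of defective parts
Q = 1.0 - P
n, X = 80, 12            # sample size and number of defective parts
p_hat = X / n            # sample proportion, 0.15

# Normal approximation is reasonable only if nP > 5 and nQ > 5.
approximation_ok = n * P > 5 and n * Q > 5

standard_error = sqrt(P * Q / n)          # std dev of the sample proportion
z = (p_hat - P) / standard_error
probability = 1.0 - NormalDist().cdf(z)   # upper-tail area, P(Z >= z)

print(approximation_ok, round(z, 2), round(probability, 3))
```

Here nP = 8 and nQ = 72, so the normal approximation applies, and the computed probability of about .068 matches the table-based .0681.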
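Graphic Solution for
Demonstration Problem 7.3
$Z = \frac{\hat{p} - P}{\sqrt{\frac{P \cdot Q}{n}}} = \frac{0.15 - 0.10}{\sqrt{\frac{(.10)(.90)}{80}}} = \frac{0.05}{0.0335} = 1.49$
[Figure: two normal curves with the same shaded upper tail. On the $\hat{p}$ scale (mean 0.10, $\sigma_{\hat{p}} = 0.0335$) the tail lies above 0.15; on the Z scale ($\sigma = 1$) it lies above 1.49. The area between the mean and the cutoff is .4319, and the area on each side of the mean is .5000.]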
Seatwork
If every 11th item is systematically sampled to
produce a sample of 75 items, approximately how large is the
population?
Suppose a subdivision on the southwest side of
Denver, Colorado contains 1,500 houses. A sample of
100 houses is selected randomly and evaluated by an
appraiser. If the mean appraised value is 1,800, with
a standard deviation of 8,500, what is the probability that
the sample average is greater than 2,400?

Seatwork
According to a survey, 48% of executives believe that
employees are most productive on Tuesdays.
Suppose 200 executives are sampled. What is the
probability that fewer than 90 of the executives
believe employees are most productive on Tuesdays?
