
Sampling Distribution and Confidence Interval
A. Ramesh
Department of Management Studies
Indian Institute of Technology Roorkee

Sampling Distribution and Confidence Interval
Context
Examples
Random vs Non-Random Samples
Central Limit Theorem
Confidence Interval

Reasons for Sampling


Sampling can save money and time.
Because the research process is sometimes
destructive, the sample can save product.
If accessing the population is impossible,
sampling is the only option.

Reasons for Taking a Census


Eliminate the possibility that a random
sample is not representative of the
population.
The person authorizing the study is
uncomfortable with sample information.

Random Versus Nonrandom Sampling


Random sampling
Every unit of the population has the same probability of
being included in the sample.
A chance mechanism is used in the selection process.
Eliminates bias in the selection process
Also known as probability sampling

Nonrandom Sampling
Every unit of the population does not have the same
probability of being included in the sample.
Open to selection bias
Not an appropriate data collection method for most statistical
methods
Also known as non-probability sampling

Random Sampling Techniques


Simple Random Sample
Stratified Random Sample
Proportionate
Disproportionate

Systematic Random Sample


Cluster (or Area) Sampling

Simple Random Sample


Number each frame unit from 1 to N.
Use a random number table or a random
number generator to select n distinct
numbers between 1 and N, inclusively.
Easier to perform for small populations
Cumbersome for large populations
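
The procedure above can be sketched in a few lines of Python. This is only an illustration: the function name `simple_random_sample` and the `seed` argument are assumptions made here for reproducibility, and the standard library's `random.sample` stands in for the random number table.

```python
import random


def simple_random_sample(population_size, sample_size, seed=None):
    """Return sample_size distinct frame numbers drawn from 1..population_size."""
    rng = random.Random(seed)  # seed is only here so a run can be repeated
    frame_numbers = range(1, population_size + 1)
    # random.sample draws without replacement, so the numbers are distinct
    return sorted(rng.sample(frame_numbers, sample_size))


chosen = simple_random_sample(population_size=20, sample_size=4, seed=1)
print(chosen)
```

Every frame unit has the same chance of appearing in `chosen`, which is the defining property of a simple random sample.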

Simple Random Sample:


Numbered Population Frame
01 Andhra Pradesh
02 Himachal Pradesh
03 Gujarat
04 Maharashtra
05 Nagaland
06 Goa
07 West Bengal
08 Haryana
09 Punjab
10 Delhi

11 Madhya Pradesh
12 Uttar Pradesh
13 Bihar
14 Rajasthan
15 J & K
16 Tamil Nadu
17 Karnataka
18 Kerala
19 Orissa
20 Manipur

Simple Random Sampling:


Random Number Table
99437 87961 45737 37552 97969 39094 34475 31618
50656 00127 68367 66882 08156 80016 78224 58326
80880 63171 42877 66835 60515 70296 50026 45587
86420 40853 53798 89454 68130 91253 88104 74319
60097 86436 01869 47758 89535 99400 48268 30606
52587 71965 85453 46834 00991 99729 76948 15941
89155 90553 90689 48637 07955 47062 71182 64493

Simple Random Sample:


Sample Members
01 Andhra Pradesh
02 Himachal Pradesh
03 Gujarat
04 Maharashtra
05 Nagaland
06 Goa
07 West Bengal
08 Haryana
09 Punjab
10 Delhi

N = 20
n=4

11 Madhya Pradesh
12 Uttar Pradesh
13 Bihar
14 Rajasthan
15 J & K
16 Tamil Nadu
17 Karnataka
18 Kerala
19 Orissa
20 Manipur

Stratified Random Sample


Population is divided into non-overlapping
subpopulations called strata
A random sample is selected from each stratum
Potential for reducing sampling error
Proportionate -- the percentage of the sample
taken from each stratum is proportionate to the
percentage that each stratum is within the
population
Disproportionate -- proportions of the strata within
the sample are different from the proportions of the
strata within the population

Stratified Random Sample:


Population of FM Radio Listeners, Stratified by Age
20 - 30 years old: homogeneous (alike) within
30 - 40 years old: homogeneous (alike) within
40 - 50 years old: homogeneous (alike) within
The strata are heterogeneous (different) between one another.

Systematic Sampling
Convenient and relatively easy to administer
Population elements are an ordered sequence (at least, conceptually).
The first sample element is selected randomly from the first k population elements.
Thereafter, sample elements are selected at a constant interval, k, from the ordered sequence frame.

$k = \frac{N}{n}$
where:
n = sample size
N = population size
k = size of selection interval

Systematic Sampling: Example


Purchase orders for the previous fiscal year
are serialized 1 to 10,000 (N = 10,000).
A sample of fifty (n = 50) purchase orders
is needed for an audit.
k = 10,000/50 = 200
First sample element randomly selected
from the first 200 purchase orders. Assume
the 45th purchase order was selected.
Subsequent sample elements: 245, 445,
645, . . .
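
The purchase-order example can be reproduced with a short Python sketch. The helper name `systematic_sample` and its `start` and `seed` parameters are illustrative assumptions; the interval is computed as k = N/n (integer division here), and the starting order 45 is the one assumed in the example.

```python
import random


def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Pick every k-th element, beginning at a random point in the first k."""
    k = population_size // sample_size  # selection interval k = N / n
    if start is None:
        # random start among the first k elements of the ordered frame
        start = random.Random(seed).randint(1, k)
    return [start + i * k for i in range(sample_size)]


orders = systematic_sample(population_size=10_000, sample_size=50, start=45)
print(orders[:4])   # first few selected purchase orders
print(len(orders))  # number of orders selected
```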

Cluster Sampling
Population is divided into non-overlapping
clusters or areas
Each cluster is a miniature of the
population.
A subset of the clusters is selected randomly
for the sample.
If the number of elements in the subset of
clusters is larger than the desired value of n,
these clusters may be subdivided to form a
new set of clusters and subjected to a
random selection process.

Cluster Sampling

Advantages
More convenient for geographically dispersed
populations
Reduced travel costs to contact sample elements
Simplified administration of the survey
Unavailability of sampling frame prohibits using
other random sampling methods
Disadvantages
Statistically less efficient when the cluster elements
are similar
Costs and problems of statistical analysis are
greater than for simple random sampling

Nonrandom Sampling
Convenience Sampling: Sample elements are selected for
the convenience of the researcher
Judgment Sampling: Sample elements are selected by
the judgment of the researcher
Quota Sampling: Sample elements are selected until the
quota controls are satisfied
Snowball Sampling: Survey subjects are selected based
on referral from other survey respondents

Errors
Data from nonrandom samples are not appropriate for analysis by inferential statistical methods.
Sampling Error occurs when the sample is not
representative of the population
Non-sampling Errors
Missing Data, Recording, Data Entry, and Analysis
Errors
Poorly conceived concepts, unclear definitions, and
defective questionnaires
Response errors occur when people do not know,
will not say, or overstate in their answers

Sampling Distribution of $\bar{x}$

Proper analysis and interpretation of a sample statistic requires knowledge of its distribution.
Process of Inferential Statistics:
Select a random sample from the population, whose mean $\mu$ is a parameter.
Calculate the sample mean $\bar{x}$ (a statistic) to estimate $\mu$.

Distribution of a Small Finite Population

Population (N = 8): 54, 55, 59, 63, 64, 68, 69, 70
[Figure: Population Histogram; y-axis: Frequency (0 to 3); x-axis class boundaries: 52.5, 57.5, 62.5, 67.5, 72.5]

Sample Space for n = 2 with Replacement

No. Sample   Mean | No. Sample   Mean | No. Sample   Mean | No. Sample   Mean
 1  (54,54)  54.0 | 17  (59,54)  56.5 | 33  (64,54)  59.0 | 49  (69,54)  61.5
 2  (54,55)  54.5 | 18  (59,55)  57.0 | 34  (64,55)  59.5 | 50  (69,55)  62.0
 3  (54,59)  56.5 | 19  (59,59)  59.0 | 35  (64,59)  61.5 | 51  (69,59)  64.0
 4  (54,63)  58.5 | 20  (59,63)  61.0 | 36  (64,63)  63.5 | 52  (69,63)  66.0
 5  (54,64)  59.0 | 21  (59,64)  61.5 | 37  (64,64)  64.0 | 53  (69,64)  66.5
 6  (54,68)  61.0 | 22  (59,68)  63.5 | 38  (64,68)  66.0 | 54  (69,68)  68.5
 7  (54,69)  61.5 | 23  (59,69)  64.0 | 39  (64,69)  66.5 | 55  (69,69)  69.0
 8  (54,70)  62.0 | 24  (59,70)  64.5 | 40  (64,70)  67.0 | 56  (69,70)  69.5
 9  (55,54)  54.5 | 25  (63,54)  58.5 | 41  (68,54)  61.0 | 57  (70,54)  62.0
10  (55,55)  55.0 | 26  (63,55)  59.0 | 42  (68,55)  61.5 | 58  (70,55)  62.5
11  (55,59)  57.0 | 27  (63,59)  61.0 | 43  (68,59)  63.5 | 59  (70,59)  64.5
12  (55,63)  59.0 | 28  (63,63)  63.0 | 44  (68,63)  65.5 | 60  (70,63)  66.5
13  (55,64)  59.5 | 29  (63,64)  63.5 | 45  (68,64)  66.0 | 61  (70,64)  67.0
14  (55,68)  61.5 | 30  (63,68)  65.5 | 46  (68,68)  68.0 | 62  (70,68)  69.0
15  (55,69)  62.0 | 31  (63,69)  66.0 | 47  (68,69)  68.5 | 63  (70,69)  69.5
16  (55,70)  62.5 | 32  (63,70)  66.5 | 48  (68,70)  69.0 | 64  (70,70)  70.0
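
As a rough check on the sample space, the following Python sketch enumerates all ordered samples of size 2 drawn with replacement and summarises their means. The population list is taken from the samples tabulated in this section; the variable names are illustrative only.

```python
from itertools import product
from statistics import mean, pvariance

# Population values as they appear in the listed samples
population = [54, 55, 59, 63, 64, 68, 69, 70]

# Every ordered pair drawn with replacement: 8 * 8 = 64 samples
samples = list(product(population, repeat=2))
sample_means = [mean(pair) for pair in samples]

print(len(samples))
# The mean of all sample means coincides with the population mean
print(mean(sample_means), mean(population))
# The variance of the sample means is the population variance divided by n = 2
print(pvariance(sample_means), pvariance(population) / 2)
```

The last two lines illustrate, on a complete enumeration, the two facts the central limit theorem slide states about the mean and spread of the sampling distribution.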

Distribution of the Sample Means
[Figure: Sampling Distribution Histogram; y-axis: Frequency (0 to 20); x-axis class boundaries from 53.75 to 71.25 in steps of 2.5]

1,800 Randomly Selected Values from an Exponential Distribution
[Figure: histogram; y-axis: Frequency (0 to 450); x-axis: 0 to 10 in steps of 0.5]

Means of 60 Samples (n = 2) from an Exponential Distribution
[Figure: histogram; y-axis: Frequency (0 to 9); x-axis: sample mean, 0.00 to 4.00 in steps of 0.25]

Means of 60 Samples (n = 5) from an Exponential Distribution
[Figure: histogram; y-axis: Frequency (0 to 10); x-axis: sample mean, 0.00 to 4.00 in steps of 0.25]

Means of 60 Samples (n = 30) from an Exponential Distribution
[Figure: histogram; y-axis: Frequency (0 to 16); x-axis: sample mean, 0.00 to 3.00 in steps of 0.25]

1,800 Randomly Selected Values from a Uniform Distribution
[Figure: histogram; y-axis: Frequency (0 to 250); x-axis: 0.0 to 5.0 in steps of 0.5]

Means of 60 Samples (n = 2) from a Uniform Distribution
[Figure: histogram; y-axis: Frequency (0 to 10); x-axis: sample mean, 1.00 to 4.25 in steps of 0.25]

Means of 60 Samples (n = 5) from a Uniform Distribution
[Figure: histogram; y-axis: Frequency (0 to 12); x-axis: sample mean, 1.00 to 4.25 in steps of 0.25]

Means of 60 Samples (n = 30) from a Uniform Distribution
[Figure: histogram; y-axis: Frequency (0 to 25); x-axis: sample mean, 1.00 to 4.25 in steps of 0.25]

Marquis de Laplace

Central Limit Theorem

For sufficiently large sample sizes ($n \ge 30$),
the distribution of sample means $\bar{x}$ is approximately normal;
the mean of this distribution is equal to $\mu$, the population mean; and
its standard deviation is $\frac{\sigma}{\sqrt{n}}$,
regardless of the shape of the population distribution.
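
A small simulation can make the theorem concrete. The sketch below is an assumption-laden illustration rather than part of the lecture: it draws from an exponential population with mean 1 and standard deviation 1 (rate 1.0), uses 2,000 samples of size n = 30, and fixes a seed so that a run can be repeated.

```python
import random
from math import sqrt
from statistics import mean, pstdev


def simulate_sample_means(draw, n, num_samples, seed=0):
    """Return the means of num_samples random samples, each of size n."""
    rng = random.Random(seed)
    return [mean(draw(rng) for _ in range(n)) for _ in range(num_samples)]


# Skewed population: exponential with mu = 1 and sigma = 1
means = simulate_sample_means(lambda rng: rng.expovariate(1.0), n=30, num_samples=2000)

print(mean(means))    # should be close to mu = 1
print(pstdev(means))  # should be close to sigma / sqrt(n)
print(1 / sqrt(30))   # theoretical standard deviation of the sample mean
```

Plotting a histogram of `means` would show a roughly bell-shaped curve even though the population itself is strongly skewed.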

Central Limit Theorem


The central limit theorem is one of the most remarkable
results of the theory of probability.
In its simplest form, the theorem states that the sum of a
large number of independent observations from the same
distribution has, under certain general conditions, an
approximate normal distribution.
Moreover, the approximation steadily improves as the
number of observations increases. The theorem is
considered the heart of probability theory, although a better
name would be normal convergence theorem.

Distributions of Samples
Sampling distributions drawn from a
uniformly distributed population start to
look like normal distributions even with a
sample size as small as 2.
If the sample size is large enough, they form
nearly perfect normal distributions.

Population: Uniform Distribution
Fig. 1) Histogram of Population - Uniform Distribution: population = 10,000; mean = 5.013; std dev = 2.897

Distribution of Samples: n = 2
Fig. 2) Sampling Distribution n = 2: number of samples = 2010; mean = 4.995; std dev = 2.011

Distribution of Samples: n = 10
Fig. 3) Sampling Distribution n = 10: number of samples = 2010; mean = 5.018; std dev = 0.906

Distribution of Samples: n = 50
Fig. 4) Sampling Distribution n = 50: number of samples = 2010; mean = 4.999; std dev = 0.411

Useful web site for Demo

http://www.statisticalengineering.com/central_limit_theorem.htm

Distribution of Sample Means for Various Sample Sizes
[Figure: sampling distributions of the mean for an exponential population and a uniform population, each at n = 2, n = 5, and n = 30]

Distribution of Sample Means for Various Sample Sizes
[Figure: sampling distributions of the mean for a U-shaped population and a normal population, each at n = 2, n = 5, and n = 30]

Sampling from a Normal Population

The distribution of sample means is normal for any sample size.
If $\bar{x}$ is the mean of a random sample of size n from a normal population with mean $\mu$ and standard deviation $\sigma$, the distribution of $\bar{x}$ is a normal distribution with mean $\mu_{\bar{x}} = \mu$ and standard deviation $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$.

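Z Formula for Sample Means

$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$

Example
Population Parameters: $\mu = 85$, $\sigma = 9$
Sample Size: $n = 40$

$P(\bar{X} > 87) = P\left(Z > \frac{87 - \mu}{\sigma / \sqrt{n}}\right) = P\left(Z > \frac{87 - 85}{9 / \sqrt{40}}\right) = P(Z > 1.41)$
$= .5 - P(0 \le Z \le 1.41) = .5 - .4207 = .0793$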
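
The same probability can be checked numerically. This is a minimal sketch using Python's standard `statistics.NormalDist`; the variable names are illustrative, and rounding z to two decimals is done only to mimic a lookup in a printed Z table.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 85, 9, 40
standard_error = sigma / sqrt(n)   # standard deviation of the sample mean
z = (87 - mu) / standard_error     # z score of a sample mean of 87

standard_normal = NormalDist()     # mean 0, standard deviation 1
p_exact = 1 - standard_normal.cdf(z)            # using the unrounded z
p_table = 1 - standard_normal.cdf(round(z, 2))  # using z rounded as in a table

print(round(z, 2), round(p_table, 4), round(p_exact, 4))
```

The small gap between the two probabilities comes only from rounding z before the lookup.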

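Graphic Solution to Example
[Figure: two normal curves. One shows the distribution of $\bar{X}$ centered at 85 with $\sigma_{\bar{X}} = 9/\sqrt{40} = 1.42$ and area .4207 between 85 and 87; the other shows the standard normal curve with area .4207 between 0 and 1.41. Each half of each curve has area .5000, and the two shaded upper tails are equal areas of .0793.]
$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{87 - 85}{9 / \sqrt{40}} = \frac{2}{1.42} = 1.41$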

Sampling from a Finite Population without Replacement

In this case, the standard deviation of the
distribution of sample means is smaller than
when sampling from an infinite population (or
from a finite population with replacement).
The correct value of this standard deviation is
computed by applying a finite correction factor
to the standard deviation for sampling from an
infinite population.

Sampling from a Finite Population

Finite Correction Factor: $\sqrt{\frac{N - n}{N - 1}}$

Modified Z Formula: $Z = \frac{\bar{X} - \mu}{\frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}}$

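Central Limit Theorem for Proportion

Mean of the sampling distribution of the proportion: $\mu_{\hat{p}} = P$
Standard error of the proportion: $\sigma_{\hat{p}} = \sqrt{\frac{PQ}{n}}$, where $Q = 1 - P$

$Z = \frac{\hat{p} - P}{\sqrt{\frac{PQ}{n}}}$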

Sampling Distribution of $\hat{p}$

Sample Proportion
$\hat{p} = \frac{X}{n}$
where:
X = number of items in a sample that possess the characteristic
n = number of items in the sample

Sampling Distribution
Approximately normal if nP > 5 and nQ > 5 (P is the
population proportion and Q = 1 - P.)
The mean of the distribution is P.
The standard deviation of the distribution is $\sigma_{\hat{p}} = \sqrt{\frac{PQ}{n}}$.

Estimation

Statistical Inference
Statistical inference is the process by which
we acquire information and draw conclusions
about populations from samples.
[Diagram: statistics turns data into information; a sample is drawn from the population, and inference runs from the sample statistic to the population parameter.]

In order to do inference, we require the skills and knowledge of descriptive statistics, probability distributions, and sampling distributions.

Estimation
There are two types of inference:
estimation and hypothesis testing;
estimation is introduced first.
The objective of estimation is to determine
the approximate value of a population
parameter on the basis of a sample
statistic.
E.g., the sample mean ($\bar{x}$) is employed to
estimate the population mean ($\mu$).

Estimation
The objective of estimation is to determine the
approximate value of a population parameter
on the basis of a sample statistic.
There are two types of estimators:
Point Estimator
Interval Estimator

Point Estimator
A point estimator draws inferences about a
population by estimating the value of an
unknown parameter using a single value or
point.

Point Estimator
Point probabilities in continuous distributions
are virtually zero.
A point estimator gets closer to the parameter
value as the sample size increases, but a point
estimate by itself does not reflect the effect of a
larger sample size.
Hence we will employ the interval estimator to
estimate population parameters.

Interval Estimator
An interval estimator draws inferences about a
population by estimating the value of an
unknown parameter using an interval.

That is, we say (with some ___% certainty) that
the population parameter of interest is between
some lower and upper bounds.

Point & Interval Estimation

For example, suppose we want to estimate the mean summer income of a class of business students. For n = 25 students, $\bar{x}$ is calculated to be 400/week. This is a point estimate.

An alternative statement is: the mean income is between 380 and 420/week. This is an interval estimate.

Estimator
An estimator of a population parameter is a sample statistic
used to estimate the parameter. The most commonly used
estimator of the:

Population Parameter                      Sample Statistic
Mean ($\mu$)                     is the   Mean ($\bar{X}$)
Variance ($\sigma^2$)            is the   Variance ($s^2$)
Standard Deviation ($\sigma$)    is the   Standard Deviation ($s$)
Proportion ($p$)                 is the   Proportion ($\hat{p}$)

Estimating $\mu$ when $\sigma$ is known

We can calculate an interval estimator from a sampling distribution by:
Drawing a sample of size n from the population
Calculating its mean, $\bar{x}$
And, by the central limit theorem, we know that $\bar{x}$ is normally (or approximately normally) distributed, so
$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
will have a standard normal (or approximately normal) distribution.

Estimating $\mu$ when $\sigma$ is known

Looking at $Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$ in more detail:
$Z$: known, i.e. standard normal distribution
$\sigma$: known, i.e. it's assumed we know the population standard deviation
$\bar{x}$: known, i.e. sample mean
$\mu$: unknown, i.e. we want to estimate the population mean
$n$: known, i.e. the number of items sampled

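Estimating $\mu$ when $\sigma$ is known

$P\left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$

The confidence interval is
$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$, i.e. $\left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$
and the sample mean is in the center of the interval.

Thus, the probability that the interval contains the population mean $\mu$ is $1 - \alpha$. This is a confidence interval estimator of $\mu$.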

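Confidence Interval Estimator for $\mu$

The probability $1 - \alpha$ is called the confidence level.
Usually represented with a plus/minus ($\pm$) sign:
$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = \left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$
The left endpoint is the lower confidence limit (LCL); the right endpoint is the upper confidence limit (UCL).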

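Graphically
[Figure: the confidence interval for $\mu$ is centered at $\bar{x}$ and runs from $\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$ to $\bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$, so its width is $2 z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$.]

Graphically
[Figure: the actual location of the population mean $\mu$ may be anywhere inside the interval, or possibly even outside it.]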

Four commonly used confidence levels

Confidence level ($1 - \alpha$)   $\alpha$   $\alpha/2$   $z_{\alpha/2}$
0.90                              0.10       0.05         1.645
0.95                              0.05       0.025        1.96
0.98                              0.02       0.01         2.33
0.99                              0.01       0.005        2.575
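
The critical values in this table can be recomputed rather than looked up. The sketch below assumes Python's standard `statistics.NormalDist`; the helper name `z_critical` is made up for illustration. Computed values differ slightly from the table in the last digit (for example 2.326 against 2.33) because the table is rounded.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided critical value z_(alpha/2) for a given confidence level."""
    alpha = 1 - confidence
    # inv_cdf returns the point with the given area to its left
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.98, 0.99):
    print(level, round(z_critical(level), 3))
```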

Interval Width
The width of the confidence interval estimate is a function of the confidence level, the population standard deviation, and the sample size:

$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$

A larger confidence level produces a wider confidence interval.
Larger values of $\sigma$ produce wider confidence intervals.
Increasing the sample size decreases the width of the confidence interval while the confidence level can remain unchanged.
Note: this also increases the cost of obtaining additional data.

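Selecting the Sample Size

We can control the width of the interval by determining the sample size necessary to produce narrow intervals.
Suppose we want to estimate the mean demand to within 5 units; i.e. we want the interval estimate to be $\bar{x} \pm 5$.
Since the interval estimate is $\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$, it follows that
$z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 5$
Solve for n to get the requisite sample size: $n = \left(\frac{z_{\alpha/2}\,\sigma}{5}\right)^2$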

Estimation with small samples: using the t distribution
If:
The sample size is small (<25 or so), and
The true variance $\sigma^2$ is unknown

Then the t distribution should be used instead of the standard Normal.

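Distribution of Sample Means for $(1 - \alpha)$% Confidence
[Figure: normal curve of sample means; the central area between $-Z_{\alpha/2}$ and $Z_{\alpha/2}$ is $1 - \alpha$, with area $\alpha/2$ in each tail.]

Probability Interpretation of the Level of Confidence

$\text{Prob}\left[\bar{X} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right] = 1 - \alpha$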

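Distribution of Sample Means for 95% Confidence
[Figure: normal curve; the central 95% lies between Z = -1.96 and Z = 1.96, made up of .4750 on each side of the mean, with .025 in each tail.]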

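95% Confidence Interval for $\mu$

With $\bar{X} = 153$, $\sigma = 46$, and $n = 85$:
$\bar{X} - Z\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z\frac{\sigma}{\sqrt{n}}$
$153 - 1.96\frac{46}{\sqrt{85}} \le \mu \le 153 + 1.96\frac{46}{\sqrt{85}}$
$153 - 9.78 \le \mu \le 153 + 9.78$
$143.22 \le \mu \le 162.78$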
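
The arithmetic of this interval can be sketched in Python. The function name `z_interval` is an illustrative assumption, and the critical value 1.96 is passed in by hand, as it would be read from a Z table.

```python
from math import sqrt


def z_interval(x_bar, sigma, n, z):
    """Confidence interval for the mean when sigma is known."""
    margin = z * sigma / sqrt(n)  # half-width of the interval
    return x_bar - margin, x_bar + margin


lower, upper = z_interval(x_bar=153, sigma=46, n=85, z=1.96)
print(round(lower, 2), round(upper, 2))
```

Passing a larger `z` (a higher confidence level) or a larger `sigma` widens the interval, while a larger `n` narrows it.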

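95% Confidence Intervals for $\mu$
[Figure: sampling distribution of $\bar{X}$ with the central 95% marked, and confidence intervals drawn around several different sample means $\bar{X}$.]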

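Example
$\bar{X} = 10.455$, $\sigma = 7.7$, and $n = 44$.
90% confidence $\Rightarrow Z = 1.645$
$\bar{X} - Z\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z\frac{\sigma}{\sqrt{n}}$
$10.455 - 1.645\frac{7.7}{\sqrt{44}} \le \mu \le 10.455 + 1.645\frac{7.7}{\sqrt{44}}$
$10.455 - 1.91 \le \mu \le 10.455 + 1.91$
$8.545 \le \mu \le 12.365$

$\text{Prob}[8.545 \le \mu \le 12.365] = 0.90$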

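Example
$\bar{X} = 34.3$, $\sigma = 8$, $N = 800$, and $n = 50$.
98% confidence $\Rightarrow Z = 2.33$
$\bar{X} - Z\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N - n}{N - 1}} \le \mu \le \bar{X} + Z\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N - n}{N - 1}}$
$34.3 - 2.33\frac{8}{\sqrt{50}}\sqrt{\frac{800 - 50}{800 - 1}} \le \mu \le 34.3 + 2.33\frac{8}{\sqrt{50}}\sqrt{\frac{800 - 50}{800 - 1}}$
$34.3 - 2.554 \le \mu \le 34.3 + 2.554$
$31.75 \le \mu \le 36.85$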
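
For a finite population sampled without replacement, the margin of error is shrunk by the finite correction factor. The following Python sketch reproduces the example; the function name `z_interval_finite` is an illustrative assumption, and Z = 2.33 is the table value used in the example.

```python
from math import sqrt


def z_interval_finite(x_bar, sigma, n, population_size, z):
    """Confidence interval for the mean with the finite correction factor."""
    correction = sqrt((population_size - n) / (population_size - 1))
    margin = z * sigma / sqrt(n) * correction
    return x_bar - margin, x_bar + margin


lower, upper = z_interval_finite(x_bar=34.3, sigma=8, n=50, population_size=800, z=2.33)
print(round(lower, 2), round(upper, 2))
```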

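Confidence Interval to Estimate $\mu$ when n is Large and $\sigma$ is Unknown

$\bar{X} \pm Z_{\alpha/2}\frac{S}{\sqrt{n}}$
or
$\bar{X} - Z_{\alpha/2}\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + Z_{\alpha/2}\frac{S}{\sqrt{n}}$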

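Example
$\bar{X} = 85.5$, $S = 19.3$, and $n = 110$.
99% confidence $\Rightarrow Z = 2.575$
$\bar{X} - Z\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + Z\frac{S}{\sqrt{n}}$
$85.5 - 2.575\frac{19.3}{\sqrt{110}} \le \mu \le 85.5 + 2.575\frac{19.3}{\sqrt{110}}$
$85.5 - 4.7 \le \mu \le 85.5 + 4.7$
$80.8 \le \mu \le 90.2$

$\text{Prob}[80.8 \le \mu \le 90.2] = 0.99$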

Sampling from a normal distribution
SBI of IITR Branch calculates that its individual savings accounts are normally distributed with a mean of 2,000 and a standard deviation of 600. If the bank takes a random sample of 100 accounts, what is the probability that the sample mean will lie between 1,900 and 2,050?
Answer: 0.7492
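
One way to check this answer is to model the sampling distribution of the mean directly. The sketch below assumes Python's standard `statistics.NormalDist`; the variable names are illustrative. The result differs from the table-based answer in the third decimal place because a printed table rounds the z scores to two decimals.

```python
from math import sqrt
from statistics import NormalDist

# Sample mean of 100 accounts: mean 2000, standard error 600 / sqrt(100) = 60
sampling_distribution = NormalDist(mu=2000, sigma=600 / sqrt(100))

# P(1900 < sample mean < 2050) as a difference of two cumulative areas
probability = sampling_distribution.cdf(2050) - sampling_distribution.cdf(1900)
print(round(probability, 4))
```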

Sampling from a non-normal distribution
The distribution of annual earnings of all bank tellers with five years' experience is negatively skewed, with a mean of 19,000 and a standard deviation of 2,000.
If we draw a random sample of 30 tellers, what is the probability that their earnings will average more than 19,750 annually?
Answer: 0.0202

t-distribution

Estimating the Mean of a Normal Population: Small n and Unknown $\sigma$
The population has a normal distribution.
The value of the population standard deviation is unknown.
The sample size is small, n < 30.
The Z distribution is not appropriate for these conditions.
The t distribution is appropriate.

The t Distribution
Developed by British statistician William Gosset
A family of distributions -- a unique distribution for each value of its parameter, degrees of freedom (d.f.)
Symmetric, unimodal, mean = 0, flatter than a Z
t formula: $t = \frac{\bar{X} - \mu}{S / \sqrt{n}}$

Degrees of freedom
The number of values we can choose freely.

Comparison of Selected t Distributions to the Standard Normal
[Figure: density curves of the standard normal and of t with d.f. = 25, d.f. = 5, and d.f. = 1]

T-table

The t table is more compact.
It shows areas and t values only for a few percentages (1, 2, 5, 10, ...).
The t table does not focus on the chance that the population parameter being estimated will fall within our confidence interval.
Instead, it gives the chance that the population parameter we are estimating will not be within our confidence interval (that is, that it will lie outside it).

Table of Critical Values of t

df     t0.100   t0.050   t0.025   t0.010   t0.005
1      3.078    6.314    12.706   31.821   63.656
2      1.886    2.920    4.303    6.965    9.925
3      1.638    2.353    3.182    4.541    5.841
4      1.533    2.132    2.776    3.747    4.604
5      1.476    2.015    2.571    3.365    4.032

23     1.319    1.714    2.069    2.500    2.807
24     1.318    1.711    2.064    2.492    2.797
25     1.316    1.708    2.060    2.485    2.787

29     1.311    1.699    2.045    2.462    2.756
30     1.310    1.697    2.042    2.457    2.750

40     1.303    1.684    2.021    2.423    2.704
60     1.296    1.671    2.000    2.390    2.660
120    1.289    1.658    1.980    2.358    2.617
∞      1.282    1.645    1.960    2.327    2.576

With df = 24 and $\alpha$ = 0.05, $t_{\alpha}$ = 1.711.

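Confidence Intervals for $\mu$ of a Normal Population: Small n and Unknown $\sigma$

$\bar{X} \pm t\frac{S}{\sqrt{n}}$
or
$\bar{X} - t\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t\frac{S}{\sqrt{n}}$
$df = n - 1$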

Example
$\bar{X} = 2.14$, $S = 1.29$, $n = 14$, $df = n - 1 = 13$
$\frac{\alpha}{2} = \frac{1 - .99}{2} = 0.005$
$t_{.005,13} = 3.012$

$\bar{X} - t\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t\frac{S}{\sqrt{n}}$
$2.14 - 3.012\frac{1.29}{\sqrt{14}} \le \mu \le 2.14 + 3.012\frac{1.29}{\sqrt{14}}$
$2.14 - 1.04 \le \mu \le 2.14 + 1.04$
$1.10 \le \mu \le 3.18$

$\text{Prob}[1.10 \le \mu \le 3.18] = 0.99$
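
The small-sample interval follows the same pattern as the Z interval, with a t critical value in place of Z. In this Python sketch the function name `t_interval` is an illustrative assumption, and the critical value 3.012 is supplied by hand from a t table, since the Python standard library has no inverse t distribution.

```python
from math import sqrt


def t_interval(x_bar, s, n, t_critical):
    """Confidence interval for the mean using a t critical value (df = n - 1)."""
    margin = t_critical * s / sqrt(n)
    return x_bar - margin, x_bar + margin


# t critical value for alpha/2 = 0.005 and df = 13, taken from the example
lower, upper = t_interval(x_bar=2.14, s=1.29, n=14, t_critical=3.012)
print(round(lower, 2), round(upper, 2))
```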

Chi-Square distribution

Population Variance
Variance is an inverse measure of the group's
homogeneity.
Variance is an important indicator of total quality in
standardized products and services.
Managers improve processes to reduce variance.
Variance is a measure of financial risk. Variance of
rates of return helps managers assess financial and
capital investment alternatives.
Variability is a reality in global markets. Productivity,
wages, and costs of living vary between regions and
nations.

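Estimating the Population Variance

Population parameter: $\sigma^2$
Estimator of $\sigma^2$: $S^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$
$\chi^2$ formula for single variance: $\chi^2 = \frac{(n - 1) S^2}{\sigma^2}$
degrees of freedom = n - 1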

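Confidence Interval for $\sigma^2$

$\frac{(n - 1) S^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n - 1) S^2}{\chi^2_{1 - \alpha/2}}$
$df = n - 1$
$\alpha = 1 - \text{level of confidence}$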

Selected $\chi^2$ Distributions
[Figure: $\chi^2$ density curves for df = 3, df = 5, and df = 10]

$\chi^2$ Table

df     0.975         0.950         0.100      0.050      0.025
1      9.82068E-04   3.93219E-03   2.70554    3.84146    5.02390
2      0.0506357     0.102586      4.60518    5.99148    7.37778
3      0.2157949     0.351846      6.25139    7.81472    9.34840
4      0.484419      0.710724      7.77943    9.48773    11.14326
5      0.831209      1.145477      9.23635    11.07048   12.83249
6      1.237342      1.63538       10.6446    12.5916    14.4494
7      1.689864      2.16735       12.0170    14.0671    16.0128
8      2.179725      2.73263       13.3616    15.5073    17.5345
9      2.700389      3.32512       14.6837    16.9190    19.0228
10     3.24696       3.94030       15.9872    18.3070    20.4832

20     9.59077       10.8508       28.4120    31.4104    34.1696
21     10.28291      11.5913       29.6151    32.6706    35.4789
22     10.9823       12.3380       30.8133    33.9245    36.7807
23     11.6885       13.0905       32.0069    35.1725    38.0756
24     12.4011       13.8484       33.1962    36.4150    39.3641
25     13.1197       14.6114       34.3816    37.6525    40.6465

70     48.7575       51.7393       85.5270    90.5313    95.0231
80     57.1532       60.3915       96.5782    101.8795   106.6285
90     65.6466       69.1260       107.5650   113.1452   118.1359
100    74.2219       77.9294       118.4980   124.3421   129.5613

[Figure: $\chi^2$ density curve for df = 5 with upper-tail area 0.10 to the right of 9.23635]
With df = 5 and $\alpha$ = 0.10, $\chi^2$ = 9.23635

Two Table Values of $\chi^2$

[Figure: $\chi^2$ density curve for df = 7 with area .05 in each tail; the area to the right of the lower value is .95]
With df = 7: $\chi^2_{.95} = 2.16735$ and $\chi^2_{.05} = 14.0671$ (from the 0.950 and 0.050 columns of the $\chi^2$ table).

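90% Confidence Interval for $\sigma^2$

$S^2 = .0022125$, $n = 8$, $df = n - 1 = 7$, $\alpha = .10$
$\chi^2_{\alpha/2} = \chi^2_{.1/2} = \chi^2_{.05} = 14.0671$
$\chi^2_{1 - \alpha/2} = \chi^2_{1 - .1/2} = \chi^2_{.95} = 2.16735$

$\frac{(n - 1) S^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n - 1) S^2}{\chi^2_{1 - \alpha/2}}$
$\frac{(8 - 1)(.0022125)}{14.0671} \le \sigma^2 \le \frac{(8 - 1)(.0022125)}{2.16735}$
$.001101 \le \sigma^2 \le .007146$

$\text{Prob}[0.001101 \le \sigma^2 \le 0.007146] = 0.90$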
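
The variance interval can be sketched in Python as follows. The function name `variance_interval` and its parameter names are illustrative assumptions; the two chi-square values are entered by hand from the table (df = 7), since the Python standard library has no inverse chi-square distribution.

```python
def variance_interval(s_squared, n, chi2_upper_tail, chi2_lower_tail):
    """Confidence interval for the population variance.

    chi2_upper_tail is the table value with alpha/2 to its right;
    chi2_lower_tail is the table value with 1 - alpha/2 to its right.
    """
    df = n - 1
    lower = df * s_squared / chi2_upper_tail  # larger divisor gives lower limit
    upper = df * s_squared / chi2_lower_tail  # smaller divisor gives upper limit
    return lower, upper


lower, upper = variance_interval(0.0022125, 8, chi2_upper_tail=14.0671, chi2_lower_tail=2.16735)
print(round(lower, 6), round(upper, 6))
```

Note that the interval is not symmetric around the sample variance, because the chi-square distribution is skewed.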

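Solution for Demonstration Problem

$S^2 = 1.2544$, $n = 25$, $df = n - 1 = 24$, $\alpha = .05$
$\chi^2_{\alpha/2} = \chi^2_{.05/2} = \chi^2_{.025} = 39.3641$
$\chi^2_{1 - \alpha/2} = \chi^2_{1 - .05/2} = \chi^2_{.975} = 12.4011$

$\frac{(n - 1) S^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n - 1) S^2}{\chi^2_{1 - \alpha/2}}$
$\frac{(25 - 1)(1.2544)}{39.3641} \le \sigma^2 \le \frac{(25 - 1)(1.2544)}{12.4011}$
$0.7648 \le \sigma^2 \le 2.4277$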

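Determining Sample Size when Estimating $\mu$

Z formula: $Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
Error of estimation (tolerable error): $E = \bar{X} - \mu$
Estimated sample size: $n = \frac{Z_{\alpha/2}^2 \sigma^2}{E^2} = \left(\frac{Z_{\alpha/2}\,\sigma}{E}\right)^2$
Estimated $\sigma$: $\sigma \approx \frac{1}{4}\,\text{range}$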

Sample Size When Estimating $\mu$: Example

$E = 1$, $\sigma = 4$
90% confidence $\Rightarrow Z = 1.645$
$n = \frac{Z_{\alpha/2}^2 \sigma^2}{E^2} = \frac{(1.645)^2 (4)^2}{1^2} = 43.30$ or 44

Solution for Demonstration Problem

$E = 2$, range $= 25$
95% confidence $\Rightarrow Z = 1.96$
estimated $\sigma$: $\frac{1}{4}\,\text{range} = \frac{1}{4}(25) = 6.25$
$n = \frac{Z^2 \sigma^2}{E^2} = \frac{(1.96)^2 (6.25)^2}{2^2} = 37.52$ or 38
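
Both sample-size calculations for the mean can be reproduced with a short Python sketch. The function name `sample_size_for_mean` is an illustrative assumption; the result is rounded up because a fractional observation cannot be collected, and rounding down would exceed the tolerable error.

```python
from math import ceil


def sample_size_for_mean(z, sigma, error):
    """Smallest n for which z * sigma / sqrt(n) does not exceed the tolerable error."""
    return ceil((z * sigma / error) ** 2)


# E = 1, sigma = 4, 90% confidence
print(sample_size_for_mean(z=1.645, sigma=4, error=1))

# E = 2, sigma estimated as range / 4 = 25 / 4, 95% confidence
print(sample_size_for_mean(z=1.96, sigma=25 / 4, error=2))
```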

Determining Sample Size when Estimating P

Z formula: $Z = \frac{\hat{p} - P}{\sqrt{\frac{PQ}{n}}}$
Error of estimation (tolerable error): $E = \hat{p} - P$
Estimated sample size: $n = \frac{Z^2 P Q}{E^2}$

Solution for Demonstration Problem

$E = 0.03$
98% confidence $\Rightarrow Z = 2.33$
estimated $P = 0.40$
$Q = 1 - P = 0.60$
$n = \frac{Z^2 P Q}{E^2} = \frac{(2.33)^2 (0.40)(0.60)}{(0.03)^2} = 1{,}447.7$ or 1,448

Example: Determining n when Estimating P with No Prior Information

$E = 0.05$
90% confidence $\Rightarrow Z = 1.645$
with no prior estimate of P, use $P = 0.50$
$Q = 1 - P = 0.50$
$n = \frac{Z^2 P Q}{E^2} = \frac{(1.645)^2 (0.50)(0.50)}{(0.05)^2} = 270.6$ or 271
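
The sample-size formula for a proportion can be sketched the same way. The function name `sample_size_for_proportion` is an illustrative assumption; P = 0.50 is the conservative choice used above when no prior estimate is available, because it maximises the product PQ.

```python
from math import ceil


def sample_size_for_proportion(z, p, error):
    """Smallest n for which z * sqrt(p * q / n) does not exceed the tolerable error."""
    q = 1 - p
    return ceil(z ** 2 * p * q / error ** 2)


# E = 0.03, estimated P = 0.40, 98% confidence
print(sample_size_for_proportion(z=2.33, p=0.40, error=0.03))

# E = 0.05, no prior estimate so P = 0.50, 90% confidence
print(sample_size_for_proportion(z=1.645, p=0.50, error=0.05))
```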
