5 - sampling-Dsitributions-Confidence Interval - 15-09-14 (Compatibility Mode)

Sampling Distribution and
Confidence Interval
A. Ramesh
Department of Management Studies
Indian Institute of Technology Roorkee
Sampling Distribution and Confidence Interval
Context
Examples
Random vs Non-Random Samples
Central Limit Theorem
Confidence Interval
Reasons for Sampling

Sampling can save money and time.
Because the research process is sometimes
destructive, the sample can save product.
If accessing the population is impossible,
sampling is the only option.
Reasons for Taking a Census

Eliminate the possibility that a random
sample is not representative of the
population.
The person authorizing the study is
uncomfortable with sample information.
Random Versus Nonrandom Sampling

Random sampling
Every unit of the population has the same probability of
being included in the sample.
A chance mechanism is used in the selection process.
Eliminates bias in the selection process
Also known as probability sampling
Nonrandom Sampling
Not every unit of the population has the same
probability of being included in the sample.
Open to selection bias
Not an appropriate data collection method for most statistical
methods
Also known as non-probability sampling
Random Sampling Techniques

Simple Random Sample
Stratified Random Sample
Proportionate
Disproportionate
Systematic Random Sample

Cluster (or Area) Sampling
Simple Random Sample

Number each frame unit from 1 to N.
Use a random number table or a random
number generator to select n distinct
numbers between 1 and N, inclusively.
Easier to perform for small populations
Cumbersome for large populations
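The selection step above can be sketched in Python. This is a minimal illustration, not the procedure used on the slides: the function name `simple_random_sample` and the `seed` argument are assumptions made here, and the standard library generator stands in for a random number table.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Pick sample_size distinct frame numbers between 1 and population_size, inclusive."""
    rng = random.Random(seed)  # seed is optional; it only makes the draw repeatable
    # random.sample draws without replacement, so the numbers are distinct
    return sorted(rng.sample(range(1, population_size + 1), sample_size))

# N = 20 frame units, n = 4 sample members
chosen = simple_random_sample(20, 4, seed=1)
print(chosen)
```

Each number returned is then looked up in the numbered population frame to identify the sampled unit.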
Simple Random Sample:

Numbered Population Frame
01 Andhra Pradesh
02 Himachal Pradesh
03 Gujarat
04 Maharashtra
05 Nagaland
06 Goa
07 West Bengal
08 Haryana
09 Punjab
10 Delhi
11 Madhya Pradesh
12 Uttar Pradesh
13 Bihar
14 Rajasthan
15 J & K
16 Tamil Nadu
17 Karnataka
18 Kerala
19 Orissa
20 Manipur
Simple Random Sampling:

Random Number Table
9 5 8 8 6 5 8 9 0 0 6 0 2 9 4 6 8 4 0 5
1 3 5 8 2 9 8 5 7 6 0 0 7 7 5 8 0 6 4 8
7 9 7 0 3 0 6 1 0 9 1 1 8 4 9 5 6 2 7 5
3 6 5 1 7 1 3 6 5 3 4 6 4 5 0 8 9 5 8 2
3 1 5 0 7 3 8 7 8 4 6 3 6 7 9 6 5 8 7 7
7 8 9 3 9 3 6 6 8 4 4 4 7 6 6 9 7 6 8 5
8 8 4 7 8 6 5 8 3 5 5 3 3 2 2 5 4 8 4 7
9 0 6 6 8 0 0 7 8 0 8 9 0 7 9 1 5 1 5 9
9 6 5 1 3 3 9 5 9 6 5 0 5 1 5 3 8 7 9 9
9 4 9 0 0 1 9 9 7 0 0 2 2 4 7 0 9 1 9 5
0 2 6 4 6 6 3 0 9 2 3 7 5 8 4 7 7 4 8 0
8 8 6 1 4 2 0 1 2 9 1 7 2 2 0 6 4 8 5 4
6 4 8 8 2 3 5 4 7 3 1 6 1 8 5 4 0 5 4 6
3 5 3 6 9 4 1 2 8 1 0 4 9 8 6 7 9 6 1 3
Simple Random Sample:

Sample Members
N = 20
n = 4
01 Andhra Pradesh
02 Himachal Pradesh
03 Gujarat
04 Maharashtra
05 Nagaland
06 Goa
07 West Bengal
08 Haryana
09 Punjab
10 Delhi
11 Madhya Pradesh
12 Uttar Pradesh
13 Bihar
14 Rajasthan
15 J & K
16 Tamil Nadu
17 Karnataka
18 Kerala
19 Orissa
20 Manipur
Stratified Random Sample

Population is divided into non-overlapping
subpopulations called strata
A random sample is selected from each stratum
Potential for reducing sampling error
Proportionate -- the percentage of the sample
taken from each stratum is proportionate to the
percentage that each stratum is within the
population
Disproportionate -- proportions of the strata within
the sample are different from the proportions of the
strata within the population
Stratified Random Sample:

Population of FM Radio Listeners
Stratified by Age
20 - 30 years old
30 - 40 years old
40 - 50 years old
Each stratum is homogeneous within (alike).
The strata are heterogeneous between (different).
Systematic Sampling
Convenient and relatively
easy to administer
Population elements are an
ordered sequence (at least,
conceptually).
The first sample element is
selected randomly from the
first k population elements.
Thereafter, sample elements
are selected at a constant
interval, k, from the ordered
sequence frame.
$k = \frac{N}{n}$
where:
n = sample size
N = population size
k = size of selection interval
Systematic Sampling: Example

Purchase orders for the previous fiscal year
are serialized 1 to 10,000 (N = 10,000).
A sample of fifty (n = 50) purchase orders
is needed for an audit.
k = 10,000/50 = 200
First sample element randomly selected
from the first 200 purchase orders. Assume
the 45th purchase order was selected.
Subsequent sample elements: 245, 445,
645, . . .
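The purchase order example can be reproduced with a short Python sketch. The function name `systematic_sample` and its arguments are hypothetical; the random start is drawn from the first k elements as the slide describes, and it can be fixed at 45 to match the example.

```python
import random

def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Return the serial numbers picked by systematic sampling."""
    k = population_size // sample_size  # selection interval, k = N / n
    if start is None:
        # first element chosen at random from the first k population elements
        start = random.Random(seed).randint(1, k)
    return [start + i * k for i in range(sample_size)]

# N = 10,000 purchase orders, n = 50, random start assumed to be 45
orders = systematic_sample(10_000, 50, start=45)
print(orders[:4])  # [45, 245, 445, 645]
```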
Cluster Sampling
Population is divided into non-overlapping
clusters or areas
Each cluster is a miniature of the
population.
A subset of the clusters is selected randomly
for the sample.
If the number of elements in the subset of
clusters is larger than the desired value of n,
these clusters may be subdivided to form a
new set of clusters and subjected to a
random selection process.
Cluster Sampling
Advantages
More convenient for geographically dispersed
populations
Reduced travel costs to contact sample elements
Simplified administration of the survey
Unavailability of sampling frame prohibits using
other random sampling methods
Disadvantages
Statistically less efficient when the cluster elements
are similar
Costs and problems of statistical analysis are
greater than for simple random sampling
Nonrandom Sampling
Convenience Sampling: Sample elements are selected for
the convenience of the researcher
Judgment Sampling: Sample elements are selected by
the judgment of the researcher
Quota Sampling: Sample elements are selected until the
quota controls are satisfied
Snowball Sampling: Survey subjects are selected based
on referral from other survey respondents
Errors
Data from nonrandom samples are not appropriate for
analysis by inferential statistical methods.
Sampling Error occurs when the sample is not
representative of the population
Non-sampling Errors
Missing Data, Recording, Data Entry, and Analysis
Errors
Poorly conceived concepts, unclear definitions, and
defective questionnaires
Response errors occur when people do not know,
will not say, or overstate in their answers
Sampling Distribution of $\bar{x}$
Proper analysis and interpretation of a sample
statistic requires knowledge of its distribution.
Process of Inferential Statistics
Select a random sample from the population.
Calculate $\bar{x}$ (statistic) from the sample.
Use $\bar{x}$ to estimate $\mu$ (parameter) of the population.
Distribution of a Small Finite Population
Population (N = 8): 54, 55, 59, 63, 64, 68, 69, 70
[Figure: population histogram; x-axis bin edges 52.5, 57.5, 62.5, 67.5, 72.5; y-axis frequency 0 to 3]
Sample Space for n = 2 with Replacement

No.  Sample   Mean | No.  Sample   Mean | No.  Sample   Mean | No.  Sample   Mean
 1   (54,54)  54.0 | 17   (59,54)  56.5 | 33   (64,54)  59.0 | 49   (69,54)  61.5
 2   (54,55)  54.5 | 18   (59,55)  57.0 | 34   (64,55)  59.5 | 50   (69,55)  62.0
 3   (54,59)  56.5 | 19   (59,59)  59.0 | 35   (64,59)  61.5 | 51   (69,59)  64.0
 4   (54,63)  58.5 | 20   (59,63)  61.0 | 36   (64,63)  63.5 | 52   (69,63)  66.0
 5   (54,64)  59.0 | 21   (59,64)  61.5 | 37   (64,64)  64.0 | 53   (69,64)  66.5
 6   (54,68)  61.0 | 22   (59,68)  63.5 | 38   (64,68)  66.0 | 54   (69,68)  68.5
 7   (54,69)  61.5 | 23   (59,69)  64.0 | 39   (64,69)  66.5 | 55   (69,69)  69.0
 8   (54,70)  62.0 | 24   (59,70)  64.5 | 40   (64,70)  67.0 | 56   (69,70)  69.5
 9   (55,54)  54.5 | 25   (63,54)  58.5 | 41   (68,54)  61.0 | 57   (70,54)  62.0
10   (55,55)  55.0 | 26   (63,55)  59.0 | 42   (68,55)  61.5 | 58   (70,55)  62.5
11   (55,59)  57.0 | 27   (63,59)  61.0 | 43   (68,59)  63.5 | 59   (70,59)  64.5
12   (55,63)  59.0 | 28   (63,63)  63.0 | 44   (68,63)  65.5 | 60   (70,63)  66.5
13   (55,64)  59.5 | 29   (63,64)  63.5 | 45   (68,64)  66.0 | 61   (70,64)  67.0
14   (55,68)  61.5 | 30   (63,68)  65.5 | 46   (68,68)  68.0 | 62   (70,68)  69.0
15   (55,69)  62.0 | 31   (63,69)  66.0 | 47   (68,69)  68.5 | 63   (70,69)  69.5
16   (55,70)  62.5 | 32   (63,70)  66.5 | 48   (68,70)  69.0 | 64   (70,70)  70.0
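The 64 samples and their means can be regenerated with a few lines of Python. This is a sketch for checking the table, assuming the eight population values that appear in the sample pairs (54, 55, 59, 63, 64, 68, 69, 70); the variable names are illustrative.

```python
from collections import Counter
from itertools import product

# The eight values that appear in the sample pairs listed above
population = [54, 55, 59, 63, 64, 68, 69, 70]

# All ordered samples of size n = 2 drawn with replacement: 8 * 8 = 64
samples = list(product(population, repeat=2))
sample_means = [(a + b) / 2 for a, b in samples]

# Frequency of each distinct sample mean (the sampling distribution)
distribution = Counter(sample_means)

population_mean = sum(population) / len(population)
mean_of_means = sum(sample_means) / len(sample_means)
print(len(samples), population_mean, mean_of_means)
```

The mean of the 64 sample means equals the population mean, which is the first property of the sampling distribution of the mean stated later in the deck.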
Distribution of the Sample Means

[Figure: sampling distribution histogram of the sample means; x-axis bin edges 53.75 to 71.25 in steps of 2.5; y-axis frequency 0 to 20]
1,800 Randomly Selected Values from an Exponential Distribution
[Figure: histogram; x-axis 0 to 10 in steps of 0.5; y-axis frequency 0 to 450]
Means of 60 Samples (n = 2)
[Figure: histogram of sample means; x-axis 0.00 to 4.00; y-axis frequency 0 to 9]
[Figure: histogram of sample means, panel title lost in extraction; x-axis 0.00 to 4.00; y-axis frequency 0 to 10]
[Figure: histogram of sample means, panel title lost in extraction; x-axis 0.00 to 3.00; y-axis frequency 0 to 16]
1,800 Randomly Selected Values from a Uniform Distribution
[Figure: histogram; x-axis 0.0 to 5.0; y-axis frequency 0 to 250]
[Figure: histogram of sample means (X-bar), panel title lost in extraction; x-axis 1.00 to 4.25; y-axis frequency 0 to 10]
[Figure: histogram of sample means, panel title lost in extraction; x-axis 1.00 to 4.25; y-axis frequency 0 to 12]
[Figure: histogram of sample means, panel title lost in extraction; x-axis 1.00 to 4.25; y-axis frequency 0 to 25]
Marquis de Laplace
For sufficiently large sample sizes ($n \ge 30$),
the distribution of sample means $\bar{x}$ is approximately normal;
the mean of this distribution is equal to $\mu$, the population mean; and
its standard deviation is $\frac{\sigma}{\sqrt{n}}$,
regardless of the shape of the population distribution.
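A small simulation can make the statement above concrete. The sketch below is an illustration under stated assumptions, not part of the original slides: it draws samples from an exponential population with mean 1 and standard deviation 1 (chosen here because it is strongly skewed), and the function name, number of samples, and seed are all arbitrary choices.

```python
import random
from math import sqrt
from statistics import mean, pstdev

def simulate_sample_means(n, num_samples=2000, seed=0):
    """Draw num_samples samples of size n and return their means."""
    rng = random.Random(seed)
    # expovariate(1.0) is an exponential population with mean 1 and std dev 1
    return [mean(rng.expovariate(1.0) for _ in range(n))
            for _ in range(num_samples)]

n = 30
means = simulate_sample_means(n)
observed_mean = mean(means)        # should be close to mu = 1
observed_sd = pstdev(means)        # should be close to sigma / sqrt(n)
expected_sd = 1.0 / sqrt(n)
print(round(observed_mean, 3), round(observed_sd, 3), round(expected_sd, 3))
```

Rerunning with a smaller n, such as 2 or 5, shows a wider and more skewed distribution of means, in line with the histograms earlier in the deck.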

The central limit theorem is one of the most remarkable
results of the theory of probability.
In its simplest form, the theorem states that the sum of a
large number of independent observations from the same
distribution has, under certain general conditions, an
approximate normal distribution.
Moreover, the approximation steadily improves as the
number of observations increases. The theorem is
considered the heart of probability theory, although a better
name would be normal convergence theorem.
Distributions of Samples
Sampling distributions drawn from a
uniformly distributed population start to
look like normal distributions even with a
sample size as small as 2.
If the sample size is large enough, they form
nearly perfect normal distributions.
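Population: Uniform Distribution
Fig. 1) Histogram of Population - Uniform Distribution:
population = 10,000; mean = 5.013; std dev = 2.897
Distribution of Samples: n=2
Fig. 2) Sampling Distribution n = 2:
number of samples = 2010; mean = 4.995; std dev = 2.011
Distribution of Samples: n=10
Fig. 3) Sampling Distribution n = 10:
std dev = 0.906
Distribution of Samples: n=50
Fig. 4) Sampling Distribution n = 50:
std dev = 0.411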
Useful web site for Demo
http://www.statisticalengineering.com/central_limit_theorem.htm
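Distribution of Sample Means for Various Sample Sizes
[Figure: two columns of panels, Exponential Population and Uniform Population; each column shows the population followed by the distribution of sample means for n = 2, n = 5, and n = 30]
Distribution of Sample Means for Various Sample Sizes
[Figure: two columns of panels, U Shaped Population and Normal Population; each column shows the population followed by the distribution of sample means for n = 2, n = 5, and n = 30]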
Sampling from a Normal Population

The distribution of sample means is normal
for any sample size.
If $\bar{x}$ is the mean of a random sample of size n
from a normal population with mean $\mu$ and
standard deviation $\sigma$, the distribution of $\bar{x}$ is
a normal distribution with mean $\mu_{\bar{x}} = \mu$ and
standard deviation $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$.
Z Formula for Sample Means

$Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$
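Example
Population parameters: $\mu = 85$, $\sigma = 9$
Sample size: $n = 40$
$P(\bar{X} > 87) = P\left(Z > \frac{87 - \mu}{\sigma/\sqrt{n}}\right) = P\left(Z > \frac{87 - 85}{9/\sqrt{40}}\right) = P(Z > 1.41)$
$= .5 - P(0 \le Z \le 1.41) = .5 - .4207 = .0793$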
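Graphic Solution to Example
[Figure: two normal curves with equal shaded upper-tail areas of .0793. Left: distribution of $\bar{X}$ centered at 85 with $\sigma_{\bar{X}} = \frac{9}{\sqrt{40}} = 1.42$, area .5000 on each side of the mean, area .4207 between 85 and 87. Right: standard normal curve with $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{87 - 85}{1.42} = 1.41$]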
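The example can be checked numerically with the standard library. This is a sketch; the variable names are illustrative, and the exact tail probability differs slightly from the table answer because the table rounds Z to two decimals.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 85, 9, 40
standard_error = sigma / sqrt(n)          # sigma / sqrt(n), about 1.42
z = (87 - mu) / standard_error            # about 1.41

# Tail probability with Z rounded to two decimals, as a printed Z table would use
p_table = 1 - NormalDist().cdf(round(z, 2))
# Tail probability without rounding Z
p_exact = 1 - NormalDist().cdf(z)

print(round(standard_error, 2), round(z, 2), round(p_table, 4), round(p_exact, 4))
```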
Sampling from a Finite Population

without Replacement
In this case, the standard deviation of the
distribution of sample means is smaller than
when sampling from an infinite population (or
from a finite population with replacement).
The correct value of this standard deviation is
computed by applying a finite correction factor
to the standard deviation for sampling from an
infinite population.
Sampling from a Finite Population

Finite Correction Factor
$\sqrt{\frac{N - n}{N - 1}}$
Modified Z Formula
$Z = \frac{\bar{X} - \mu}{\frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}}$
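Central Limit Theorem for Proportion

Mean of the sampling distribution of the
proportion: $\mu_{\hat{p}} = p$
Standard error of the proportion: $\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}}$, where $q = 1 - p$
$Z = \frac{\hat{p} - p}{\sqrt{\frac{pq}{n}}}$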
Sampling Distribution of
Sample Proportion
$\hat{p} = \frac{X}{n}$
where:
X = number of items in a sample that possess the characteristic
n = number of items in the sample
Sampling Distribution
Approximately normal if nP > 5 and nQ > 5 (P is the
population proportion and Q = 1 - P).
The mean of the distribution is P.
The standard deviation of the distribution is
$\sigma_{\hat{p}} = \sqrt{\frac{PQ}{n}}$
Estimation
Statistical Inference
Statistical inference is the process by which
we acquire information and draw conclusions
about populations from samples.
[Diagram: Data, Statistics, Information. A Sample is drawn from the Population; Inference runs from the sample Statistic to the population Parameter]
In order to do inference, we require the skills and knowledge of
descriptive statistics, probability distributions, and sampling
distributions.
Estimation
There are two types of inference:
estimation and hypothesis testing;
estimation is introduced first.
The objective of estimation is to determine
the approximate value of a population
parameter on the basis of a sample
statistic.
E.g., the sample mean ($\bar{x}$) is employed to
estimate the population mean ($\mu$)
Estimation
The objective of estimation is to determine the
approximate value of a population parameter
on the basis of a sample statistic.
There are two types of estimators:
Point Estimator
Interval Estimator
Point Estimator
A point estimator draws inferences about a
population by estimating the value of an
unknown parameter using a single value or
point.
Point Estimator
Point probabilities in continuous distributions
are virtually zero.
We expect a point estimator to get closer to the parameter
value with an increased sample size, but point
estimators don't reflect the effects of larger
sample sizes.
Hence we will employ the interval estimator to
estimate population parameters
Interval Estimator
An interval estimator draws inferences about a
population by estimating the value of an
unknown parameter using an interval.
That is, we say (with some ___% certainty) that
the population parameter of interest is between
some lower and upper bounds.
Point & Interval Estimation

For example, suppose we want to estimate the
mean summer income of a class of business
students. For n = 25 students,
$\bar{X}$ is calculated to be
400/week (point estimate).
An alternative statement is:
The mean income is between 380 and 420
/week (interval estimate).
Estimator
An estimator of a population parameter is a sample statistic
used to estimate the parameter. The most commonly-used
estimator of the:
Population Parameter                        Sample Statistic
Mean ($\mu$)                       is the   Mean ($\bar{X}$)
Variance ($\sigma^2$)              is the   Variance ($s^2$)
Standard Deviation ($\sigma$)      is the   Standard Deviation ($s$)
Proportion ($p$)                   is the   Proportion ($\hat{p}$)
Estimating $\mu$ when $\sigma$ is known

We can calculate an interval estimator from a
sampling distribution, by:
Drawing a sample of size n from the population
Calculating its mean, $\bar{X}$
And, by the central limit theorem, we know that
$\bar{X}$ is normally (or approximately normally)
distributed, so
$Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
will have a standard normal (or
approximately normal) distribution.
Estimating $\mu$ when $\sigma$ is known

Looking at $Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$ in more detail:
$Z$: known, i.e. standard normal distribution
$\sigma$: known, i.e. it's assumed we know the population standard deviation
$\bar{x}$: known, i.e. sample mean
$\mu$: unknown, i.e. we want to estimate the population mean
$n$: known, i.e. the number of items sampled
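Estimating $\mu$ when $\sigma$ is known

The confidence interval:
$P\left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$
The sample mean is in the center of the interval:
$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$, i.e. $\left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$
Thus, the probability that the interval contains
the population mean $\mu$ is $1 - \alpha$. This is a
confidence interval estimator of $\mu$.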
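Confidence Interval Estimator for $\mu$

The probability $1 - \alpha$ is called the
confidence level.
The interval is usually represented
with a plus/minus ($\pm$) sign:
$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$, i.e. $\left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$
Upper confidence limit (UCL): $\bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
Lower confidence limit (LCL): $\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$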
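Graphically
Here is the confidence interval for $\mu$: it runs from $\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$ to $\bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$, is centered on $\bar{x}$, and has width $2 z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$.
Graphically
[Figure: the same interval with three marked positions] The actual location of the population mean $\mu$
may be here,
or here,
or possibly even here.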
Four commonly used confidence levels
Confidence level ($1 - \alpha$)   $\alpha$   $\alpha/2$   $z_{\alpha/2}$
0.90                              0.10       0.05         1.645
0.95                              0.05       0.025        1.96
0.98                              0.02       0.01         2.33
0.99                              0.01       0.005        2.575
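The critical values in this table can be recomputed from the standard normal distribution. The helper name `z_critical` is an assumption made for this sketch. Computed to three decimals, the last two values come out as 2.326 and 2.576, so the slide's 2.33 and 2.575 are table roundings.

```python
from statistics import NormalDist

def z_critical(confidence_level):
    """Return z such that the central area under the standard normal is confidence_level."""
    alpha = 1 - confidence_level
    # area alpha / 2 is left in the upper tail
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.98, 0.99):
    print(level, round(z_critical(level), 3))
```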
Interval Width
The width of the confidence interval estimate is a
function of the confidence level, the population
standard deviation, and the sample size:
$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
A larger confidence level
produces a wider
confidence interval.
Larger values of $\sigma$
produce wider
confidence intervals.
Increasing the sample size decreases the width
of the confidence interval while the confidence
level can remain unchanged.
Note: a larger sample also increases the cost of obtaining
the data.
Selecting the Sample Size

We can control the width of the interval by
determining the sample size necessary to produce
narrow intervals.
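Suppose we want to estimate the mean demand to
within 5 units; i.e. we want the interval estimate to
be: $\bar{x} \pm 5$
Since the interval is $\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$,
it follows that
$z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 5$
Solve for n to get the requisite sample size!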
Estimation with small samples:

using the t distribution
If:
The sample size is small (<25 or so), and
The true variance $\sigma^2$ is unknown
Then the t distribution should be used instead of
the standard Normal.

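[Figure: normal curve for $(1 - \alpha)$% confidence; the area labels were lost in extraction]
Probability Interpretation
of the Level of Confidence
$\text{Prob}\left[\bar{X} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right] = 1 - \alpha$

[Figure: standard normal curve for 95% confidence; area .4750 on each side of the mean (95% in total), area .025 in each tail, critical values Z = -1.96 and Z = 1.96]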
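95% Confidence Interval for $\mu$

$\bar{X} - Z\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z\frac{\sigma}{\sqrt{n}}$
$153 - 1.96\frac{46}{\sqrt{85}} \le \mu \le 153 + 1.96\frac{46}{\sqrt{85}}$
$153 - 9.78 \le \mu \le 153 + 9.78$
$143.22 \le \mu \le 162.78$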
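The arithmetic of this interval can be written as a small function. This is a sketch with a hypothetical name, `z_interval`, taking the Z value from the table as an input rather than computing it.

```python
from math import sqrt

def z_interval(sample_mean, sigma, n, z):
    """Confidence interval for the mean when the population sigma is known."""
    margin = z * sigma / sqrt(n)   # half-width of the interval
    return sample_mean - margin, sample_mean + margin

# Values from the example: X-bar = 153, sigma = 46, n = 85, Z = 1.96
lower, upper = z_interval(153, 46, 85, 1.96)
print(round(lower, 2), round(upper, 2))
```

The same function reproduces the 90% example further on when called with 10.455, 7.7, 44, and 1.645.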
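95% Confidence Intervals for $\mu$
[Figure: intervals centered on seven different sample means $\bar{X}$, drawn against the sampling distribution with the central 95% marked]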
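Example
$\bar{X} = 10.455$, $\sigma = 7.7$, and $n = 44$.
90% confidence: $Z = 1.645$
$\bar{X} - Z\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z\frac{\sigma}{\sqrt{n}}$
$10.455 - 1.645\frac{7.7}{\sqrt{44}} \le \mu \le 10.455 + 1.645\frac{7.7}{\sqrt{44}}$
$10.455 - 1.91 \le \mu \le 10.455 + 1.91$
$8.545 \le \mu \le 12.365$
$\text{Prob}[8.545 \le \mu \le 12.365] = 0.90$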
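Example
$\bar{X} = 34.3$, $\sigma = 8$, $N = 800$, and $n = 50$.
98% confidence: $Z = 2.33$
$\bar{X} - Z\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N - n}{N - 1}} \le \mu \le \bar{X} + Z\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N - n}{N - 1}}$
$34.3 - 2.33\frac{8}{\sqrt{50}}\sqrt{\frac{800 - 50}{800 - 1}} \le \mu \le 34.3 + 2.33\frac{8}{\sqrt{50}}\sqrt{\frac{800 - 50}{800 - 1}}$
$34.3 - 2.554 \le \mu \le 34.3 + 2.554$
$31.75 \le \mu \le 36.85$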
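The finite population version only adds the correction factor to the margin. The sketch below uses a hypothetical function name, `finite_population_interval`, and the values from this example.

```python
from math import sqrt

def finite_population_interval(sample_mean, sigma, n, population_size, z):
    """Z interval for the mean with the finite correction factor applied."""
    correction = sqrt((population_size - n) / (population_size - 1))
    margin = z * (sigma / sqrt(n)) * correction
    return sample_mean - margin, sample_mean + margin

# X-bar = 34.3, sigma = 8, n = 50, N = 800, Z = 2.33 (98% confidence)
fp_lower, fp_upper = finite_population_interval(34.3, 8, 50, 800, 2.33)
print(round(fp_lower, 2), round(fp_upper, 2))
```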
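Confidence Interval to Estimate $\mu$
when n is Large and $\sigma$ is Unknown
$\bar{X} \pm Z_{\alpha/2}\frac{S}{\sqrt{n}}$
or
$\bar{X} - Z_{\alpha/2}\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + Z_{\alpha/2}\frac{S}{\sqrt{n}}$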
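Example
$\bar{X} = 85.5$, $S = 19.3$, and $n = 110$.
99% confidence: $Z = 2.575$
$\bar{X} - Z\frac{S}{\sqrt{n}} = 85.5 - 2.575\frac{19.3}{\sqrt{110}} = 85.5 - 4.7 = 80.8$
$\bar{X} + Z\frac{S}{\sqrt{n}} = 85.5 + 2.575\frac{19.3}{\sqrt{110}} = 85.5 + 4.7 = 90.2$
$\text{Prob}[80.8 \le \mu \le 90.2] = 0.99$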
Sampling from normal

distribution
SBI of IITR Branch calculates that its
individual savings accounts are normally
distributed with a mean of 2,000 and a
standard deviation of 600. If the bank
takes a random sample of 100 accounts,
what is the probability that the sample
mean will lie between 1,900 and 2,050?
Answer: 0.7492
Sampling from non-normal

distribution
The distribution of annual earnings of all
bank tellers with five years' experience is
negatively skewed, with a mean of 19,000 and a
standard deviation of 2,000.
If we draw a random sample of 30 tellers,
what is the probability that their earnings
will average more than 19,750 annually?
Answer: 0.0202
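Both exercises can be checked with the sampling distribution of the mean. This is a sketch under two assumptions: the threshold in the teller exercise is taken to be 19,750, the value consistent with the stated answer, and the helper name `sample_mean_distribution` is hypothetical. The results differ from the slide answers in the third or fourth decimal because the slides round Z to two decimals before using the table.

```python
from math import sqrt
from statistics import NormalDist

def sample_mean_distribution(mu, sigma, n):
    """Normal model for the sample mean: mean mu, standard error sigma / sqrt(n)."""
    return NormalDist(mu, sigma / sqrt(n))

# Savings accounts: mu = 2000, sigma = 600, n = 100
savings = sample_mean_distribution(2000, 600, 100)
p_between = savings.cdf(2050) - savings.cdf(1900)

# Bank tellers: mu = 19000, sigma = 2000, n = 30 (normality by the central limit theorem)
tellers = sample_mean_distribution(19_000, 2000, 30)
p_above = 1 - tellers.cdf(19_750)

print(round(p_between, 4), round(p_above, 4))
```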
t-distribution
Estimating the Mean of a Normal
Population: Small n and Unknown $\sigma$
The population has a normal distribution.
The value of the population standard
deviation is unknown.
The sample size is small, n < 30.
Z distribution is not appropriate for these
conditions
t distribution is appropriate
The t Distribution
Developed by the British statistician William
Gosset
A family of distributions -- a unique
distribution for each value of its parameter,
degrees of freedom (d.f.)
Symmetric, Unimodal, Mean = 0, Flatter
than a Z
t formula: $t = \frac{\bar{X} - \mu}{S/\sqrt{n}}$
Degrees of freedom
Example
No. of values we can choose freely.
Comparison of Selected t Distributions

to the Standard Normal
[Figure: density curves of the Standard Normal and of t (d.f. = 25), t (d.f. = 5), and t (d.f. = 1) on a common axis]
T-table
The t table is more compact than the Z table.
It shows areas and t values for only a few
percentages (1, 2, 5, 10, ...).
The t table does not focus on the chance that the
population parameter being estimated will
fall within our confidence interval.
Instead, it gives the chance that the population
parameter we are estimating will not be
within our confidence interval (that is, that
it will lie outside it).
Table of Critical Values of t
df     t0.100   t0.050   t0.025   t0.010   t0.005
1      3.078    6.314    12.706   31.821   63.656
2      1.886    2.920    4.303    6.965    9.925
3      1.638    2.353    3.182    4.541    5.841
4      1.533    2.132    2.776    3.747    4.604
5      1.476    2.015    2.571    3.365    4.032
23     1.319    1.714    2.069    2.500    2.807
24     1.318    1.711    2.064    2.492    2.797
25     1.316    1.708    2.060    2.485    2.787
29     1.311    1.699    2.045    2.462    2.756
30     1.310    1.697    2.042    2.457    2.750
40     1.303    1.684    2.021    2.423    2.704
60     1.296    1.671    2.000    2.390    2.660
120    1.289    1.658    1.980    2.358    2.617
∞      1.282    1.645    1.960    2.327    2.576
With df = 24 and $\alpha$ = 0.05,
t = 1.711.
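Confidence Intervals for $\mu$ of a Normal
Population: Small n and Unknown $\sigma$
$\bar{X} \pm t_{\alpha/2,\,n-1}\frac{S}{\sqrt{n}}$
or
$\bar{X} - t_{\alpha/2,\,n-1}\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1}\frac{S}{\sqrt{n}}$
$df = n - 1$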
Example
$\bar{X} = 2.14$, $S = 1.29$, $n = 14$, $df = n - 1 = 13$
$\frac{\alpha}{2} = \frac{1 - .99}{2} = 0.005$
$t_{.005,13} = 3.012$
Solution for Demonstration Problem
$\bar{X} - t\frac{S}{\sqrt{n}} = 2.14 - 3.012\frac{1.29}{\sqrt{14}} = 2.14 - 1.04 = 1.10$
$\bar{X} + t\frac{S}{\sqrt{n}} = 2.14 + 3.012\frac{1.29}{\sqrt{14}} = 2.14 + 1.04 = 3.18$
$\text{Prob}[1.10 \le \mu \le 3.18] = 0.99$
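The t interval follows the same pattern as the Z interval, with the critical value read from the t table. In this sketch the function name `t_interval` is hypothetical and the critical value 3.012 (df = 13, upper-tail area .005) is passed in as given in the example, since the Python standard library has no t quantile function.

```python
from math import sqrt

def t_interval(sample_mean, sample_sd, n, t_critical):
    """Confidence interval for the mean using a t critical value with n - 1 d.f."""
    margin = t_critical * sample_sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin

# X-bar = 2.14, S = 1.29, n = 14, t(.005, 13) = 3.012 taken from the table
t_lower, t_upper = t_interval(2.14, 1.29, 14, 3.012)
print(round(t_lower, 2), round(t_upper, 2))
```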
Chi-Square distribution
Population Variance
Variance is an inverse measure of the group's
homogeneity.
Variance is an important indicator of total quality in
standardized products and services.
Managers improve processes to reduce variance.
Variance is a measure of financial risk. Variance of
rates of return helps managers assess financial and
capital investment alternatives.
Variability is a reality in global markets. Productivity,
wages, and costs of living vary between regions and
nations.
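Estimating the Population Variance

Population Parameter: $\sigma^2$
Estimator of $\sigma^2$: $S^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$
$\chi^2$ formula for Single Variance: $\chi^2 = \frac{(n - 1) S^2}{\sigma^2}$
degrees of freedom = n - 1
Confidence Interval for $\sigma^2$
$\frac{(n - 1) S^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n - 1) S^2}{\chi^2_{1 - \alpha/2}}$
$df = n - 1$
$\alpha = 1 -$ level of confidence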
Selected $\chi^2$ Distributions
[Figure: $\chi^2$ density curves for df = 3, df = 5, and df = 10]
$\chi^2$ Table
df     0.975         0.950         0.100      0.050      0.025
1      9.82068E-04   3.93219E-03   2.70554    3.84146    5.02390
2      0.0506357     0.102586      4.60518    5.99148    7.37778
3      0.2157949     0.351846      6.25139    7.81472    9.34840
4      0.484419      0.710724      7.77943    9.48773    11.14326
5      0.831209      1.145477      9.23635    11.07048   12.83249
6      1.237342      1.63538       10.6446    12.5916    14.4494
7      1.689864      2.16735       12.0170    14.0671    16.0128
8      2.179725      2.73263       13.3616    15.5073    17.5345
9      2.700389      3.32512       14.6837    16.9190    19.0228
10     3.24696       3.94030       15.9872    18.3070    20.4832
20     9.59077       10.8508       28.4120    31.4104    34.1696
21     10.28291      11.5913       29.6151    32.6706    35.4789
22     10.9823       12.3380       30.8133    33.9245    36.7807
23     11.6885       13.0905       32.0069    35.1725    38.0756
24     12.4011       13.8484       33.1962    36.4150    39.3641
25     13.1197       14.6114       34.3816    37.6525    40.6465
70     48.7575       51.7393       85.5270    90.5313    95.0231
80     57.1532       60.3915       96.5782    101.8795   106.6285
90     65.6466       69.1260       107.5650   113.1452   118.1359
100    74.2219       77.9294       118.4980   124.3421   129.5613
[Figure: $\chi^2$ density with df = 5; upper-tail area 0.10 to the right of 9.23635]
With df = 5 and $\alpha$ = 0.10, $\chi^2$ = 9.23635
Two Table Values of $\chi^2$

[Figure: $\chi^2$ density with df = 7, x-axis 0 to 20; area .05 in each tail and .95 in between, with cut points 2.16735 and 14.0671]
From the $\chi^2$ table with df = 7: $\chi^2_{0.950} = 2.16735$ and $\chi^2_{0.050} = 14.0671$.
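90% Confidence Interval for $\sigma^2$

$S^2 = .0022125$, $n = 8$, $df = n - 1 = 7$, $\alpha = .10$
$\chi^2_{\alpha/2} = \chi^2_{.1/2} = \chi^2_{.05} = 14.0671$
$\chi^2_{1 - \alpha/2} = \chi^2_{1 - .1/2} = \chi^2_{.95} = 2.16735$
$\frac{(n - 1) S^2}{\chi^2_{\alpha/2}} = \frac{(8 - 1)(.0022125)}{14.0671} = .001101$
$\frac{(n - 1) S^2}{\chi^2_{1 - \alpha/2}} = \frac{(8 - 1)(.0022125)}{2.16735} = .007146$
$\text{Prob}[0.001101 \le \sigma^2 \le 0.007146] = 0.90$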

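95% Confidence Interval for $\sigma^2$

$S^2 = 1.2544$, $n = 25$, $df = n - 1 = 24$, $\alpha = .05$
$\chi^2_{\alpha/2} = \chi^2_{.05/2} = \chi^2_{.025} = 39.3641$
$\chi^2_{1 - \alpha/2} = \chi^2_{1 - .05/2} = \chi^2_{.975} = 12.4011$
$\frac{(n - 1) S^2}{\chi^2_{\alpha/2}} = \frac{(25 - 1)(1.2544)}{39.3641} = 0.7648$
$\frac{(n - 1) S^2}{\chi^2_{1 - \alpha/2}} = \frac{(25 - 1)(1.2544)}{12.4011} = 2.4277$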
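Both variance intervals use the same two divisions, which the sketch below wraps in a function. The name `variance_interval` and its argument names are hypothetical, and the chi-square values are passed in from the table because the Python standard library does not provide chi-square quantiles.

```python
def variance_interval(sample_variance, n, chi2_upper_tail, chi2_lower_tail):
    """Confidence interval for the population variance.

    chi2_upper_tail is the table value with area alpha/2 to its right;
    chi2_lower_tail is the table value with area 1 - alpha/2 to its right.
    """
    numerator = (n - 1) * sample_variance
    return numerator / chi2_upper_tail, numerator / chi2_lower_tail

# 90% interval: S^2 = .0022125, n = 8, table values for df = 7
lo_90, hi_90 = variance_interval(0.0022125, 8, 14.0671, 2.16735)
# 95% interval: S^2 = 1.2544, n = 25, table values for df = 24
lo_95, hi_95 = variance_interval(1.2544, 25, 39.3641, 12.4011)

print(round(lo_90, 6), round(hi_90, 6))
print(round(lo_95, 4), round(hi_95, 4))
```

The larger table value goes in the lower limit, so the interval is not symmetric around the sample variance.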
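Determining Sample Size
when Estimating $\mu$
Z formula: $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$
Error of Estimation (tolerable error): $E = \bar{X} - \mu$
Estimated Sample Size: $n = \frac{Z_{\alpha/2}^2 \sigma^2}{E^2} = \left(\frac{Z_{\alpha/2}\,\sigma}{E}\right)^2$
Estimated $\sigma$: $\sigma \approx \frac{1}{4}\,\text{range}$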
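Sample Size When Estimating $\mu$: Example

$E = 1$, $\sigma = 4$, $Z = 1.645$
$n = \frac{Z_{\alpha/2}^2 \sigma^2}{E^2} = \frac{(1.645)^2 (4)^2}{1^2} = 43.30$ or 44

$E = 2$, range $= 25$, $Z = 1.96$
estimated $\sigma$: $\frac{1}{4}\,\text{range} = \frac{1}{4}(25) = 6.25$
$n = \frac{Z^2 \sigma^2}{E^2} = \frac{(1.96)^2 (6.25)^2}{2^2} = 37.52$ or 38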
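The sample size formula, with rounding up to the next whole unit, can be sketched as follows. The function name `sample_size_for_mean` is an assumption made for illustration.

```python
from math import ceil

def sample_size_for_mean(z, sigma, error):
    """Smallest n for which z * sigma / sqrt(n) does not exceed the tolerable error."""
    return ceil((z * sigma / error) ** 2)

# First example: E = 1, sigma = 4, Z = 1.645
n_first = sample_size_for_mean(1.645, 4, 1)
# Second example: E = 2, sigma estimated as range / 4 = 25 / 4, Z = 1.96
n_second = sample_size_for_mean(1.96, 25 / 4, 2)
print(n_first, n_second)
```

Rounding up rather than to the nearest integer keeps the margin of error at or below E.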
Determining Sample Size
when Estimating P
Z formula: $Z = \frac{\hat{p} - P}{\sqrt{\frac{P \cdot Q}{n}}}$
Error of Estimation (tolerable error): $E = \hat{p} - P$
Estimated Sample Size: $n = \frac{Z^2 P Q}{E^2}$

$E = 0.03$
98% Confidence: $Z = 2.33$
estimated $P = 0.40$
$Q = 1 - P = 0.60$
$n = \frac{Z^2 P Q}{E^2} = \frac{(2.33)^2 (0.40)(0.60)}{(.03)^2} = 1{,}447.7$ or 1,448
Example: Determining n when
Estimating P with No Prior Information
$E = 0.05$
90% Confidence: $Z = 1.645$
with no prior estimate of P, use $P = 0.50$
$Q = 1 - P = 0.50$
$n = \frac{Z^2 P Q}{E^2} = \frac{(1.645)^2 (0.50)(0.50)}{(.05)^2} = 270.6$ or 271
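The proportion version of the sample size formula can be sketched the same way. The function name `sample_size_for_proportion` is hypothetical; the default of 0.5 for the planning proportion follows the rule stated above for the case with no prior estimate of P.

```python
from math import ceil

def sample_size_for_proportion(z, error, p=0.5):
    """Smallest n for which z * sqrt(p * q / n) does not exceed the tolerable error."""
    q = 1 - p
    return ceil(z ** 2 * p * q / error ** 2)

# 98% confidence, E = 0.03, estimated P = 0.40
n_with_estimate = sample_size_for_proportion(2.33, 0.03, p=0.40)
# 90% confidence, E = 0.05, no prior estimate of P
n_no_prior = sample_size_for_proportion(1.645, 0.05)
print(n_with_estimate, n_no_prior)
```

Using P = 0.50 maximizes the product PQ, so it gives the most conservative sample size.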
