Académique Documents
Professionnel Documents
Culture Documents
Statistics
1
Frequentist Approach to Statistics
Sample Mean
A random sample is a set of independently, identically distributed or
i.i.d. observations X1, X2, … , Xn (when sampling from a large
population or with replacement)
Assume that the population has
mean µ = E ( X i ) and variance σ 2 = Var ( X i )
1 n
How does thesample mean X = ∑ Xi
n i =1
vary on repeated random samples of size n?
2
Mean and Variance of a Die Toss
3
Rolling
Two
Dice
Each one of
these 36
outcomes are
equally likely,
i.e., each one
occurs with 1/36
probability.
4
Homework To be Done Right Away
Use the Sampling Distribution simulation Java applet at the
Rice Virtual Lab in Statistics to do the following.
• Draw 10,000 random samples of size N=5 from the normal distribution provided.
– Construct the histogram of the sampling distribution of the sample mean.
– Construct the histogram of the sampling distribution of the sample variance
– Turn in this output with the rest of the homework for Unit 5.
• Draw 10,000 random samples of size N=20 from the normal distribution provided.
– Construct the histogram of the sampling distribution of the sample mean.
– Construct the histogram of the sampling distribution of the sample variance
• Draw 10,000 random samples of size N=5 from a uniform distribution on [0,32].
– Construct the histogram of the sampling distribution of the sample mean.
– Construct the histogram of the sampling distribution of the sample variance
• Draw 10,000 random samples of size N=20 from a uniform distribution on [0,32].
– Construct the histogram of the sampling distribution of the sample mean.
– Construct the histogram of the sampling distribution of the sample variance
• Draw 10,000 random samples of size N=5 from the skewed distribution provided.
– Construct the histogram of the sampling distribution of the sample mean.
– Construct the histogram of the sampling distribution of the sample variance
– Construct the histogram of the sampling distribution of the sample median
• Draw 10,000 random samples of size N=20 from the skewed distribution provided.
– Construct the histogram of the sampling distribution of the sample mean.
– Construct the histogram of the sampling distribution of the sample variance
– Construct the histogram of the sampling distribution of the sample median
5
Central Limit Theorem
Let X1, X2, … , Xn be a random sample drawn from an arbitrary
distribution with a finite mean µ and variance σ2. Then if n is
sufficiently large
X −µ
≈ N (0,1)
σ n
∑X
j =1
i − nµ
≈ N (0,1)
σ n
Central
Limit
Theorem
Illustration
0
10
6
Screen Shots of the Output of the Sampling
Distribution Simulation Java Applet
σ 6.22
= = 2.78 2.81
n 5
X − µ converges to 0 as n → ∞
•CLT says that as n goes to infinity
X −µ
converges to N (0,1) as n → ∞
σ n
7
Central Limit Theorem
Let X1, X2, … , Xn be a random sample drawn from an arbitrary
distribution with a finite mean µ and variance σ2. Then if n is
sufficiently large
X −µ
≈ N (0,1)
σ n
∑X
j =1
i − nµ
≈ N (0,1)
σ n
X − np ∑ Zi − np ∑ Z i − nE ( Z )
= i =1
= i =1
≈ N (0,1)
np (1 − p) np (1 − p) Var ( Z ) × n
How large of a sample, n, do we need for the approximation to
be good?
Rule of Thumb: np ≥ 10 and n(1 − p ) ≥ 10
6/12/2004 Unit 5 - Stat 571 - Ramon V. Leon 16
8
CLT Approximation to the Binomial
When p is Close to 0.5
CLT
Approximation
to the Binomial
When p is Not
Close to 0.5
np = n(.1)
should be at least 10.
So n should be at least
100
9
Continuity
Correction
8.5 − np
P ( X ≤ 8) Φ
np (1 − p )
Similarly:
7.5 − np
P ( X ≥ 8) 1 − Φ
np (1 − p )
6/12/2004 Unit 5 - Stat 571 - Ramon V. Leon 19
10
Why the Normal Approximation to the
Binomial Distribution Works in Pictures
Green area is
approximately
the same as the
red area
11
Example: CLT Approximation to the Binomial
Rolling
Two
Dice
Each one of these
36 outcomes are
equally likely, i.e.,
each one occurs
with 1/36
probability.
Now we pay
attention to the
sample variance.
6/12/2004 Unit 5 - Stat 571 - Ramon V. Leon 24
12
Sampling Distribution of the Sample
Variance: Two Dice Example
Chi-Square Distribution
13
Using JMP to Simulate a Chi-Square
Random Sample with 5 d.f.
The number of
rows is the size
of the random
sample
14
Fitted Chi-Square Based on the Sample
0 10 20
15
Critical Values
for the
Chi-Square
16
Application of the Distribution of Sample
Variance – Measurement Precision
0.05
χ 9,0.05
2
= 16.92
17
Student’s t-Distribution
Consider a random sample, X 1 , X 2 ,..., X n drawn from a N ( µ , σ 2 )
(X − µ)
It is known that is exactly distributed as N(0,1) for any n.
σ n
But T = ( X − µ ) is not longer distributed as N(0,1).
S n
t-Distribution
Table
18
Application of the t-Distribution
Calculation –Process Control
0.005
=3.250
19
F-Distribution
Consider two independent random samples,
X 1 , X 2 ,..., X n1 from an N ( µ1 , σ 12 ), Y1 , Y2 ,..., Yn2 from an N ( µ 2 , σ 22 ).
S12 σ 12
Then has an F distribution with ν1 = n1 - 1 d.f.
S22 σ 22
in the numerator and ν2= n2-1 d.f. in the denominator.
F-Distribution Table
20