004: Statistics
Lecture 9
INFERENCE (CHAPTER 7)
SINGLE SAMPLE TESTS, P-VALUES AND POWERS
Term 5, 2018
Exam 1:
A normal distribution table, as well as all required values from other distributions will be
provided.
- H0 : Null hypothesis (in general, the status quo; the hypothesis we are seeking evidence against).
- HA : Alternative hypothesis (the new belief; the hypothesis for which we require evidence).
Ideally, a test should have low level and high power. It is not easy to achieve both unless
sample size is increased (more on this later).
Instead of fixing a level α, we often report the p-value, which is the probability, under
H0 , of obtaining a test statistic value at least as extreme as the one observed.
1 H0 : µ = µ0 vs. HA : µ ≠ µ0 .
2 H0 : µ = µ0 vs. HA : µ = µ1 .
3 H0 : µ = µ0 vs. HA : µ > µ0 .
4 H0 : µ ≤ µ0 vs. HA : µ > µ0 .
For a one-sided alternative hypothesis, such as HA : µ > µ0 , it does not matter if we use
H0 : µ = µ0 or H0 : µ ≤ µ0 ,
for computing the p-value of the test statistic.
If we use the latter, then the maximum p-value is still obtained at the boundary, when
µ = µ0 .
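The boundary claim can be checked numerically. The sketch below assumes σ is known and an upper-tailed test; the function name `upper_tail_p_value` and all numbers are hypothetical, chosen only for illustration. It evaluates the tail probability for several means allowed by H0 : µ ≤ µ0 and shows that the largest value occurs at µ = µ0.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_p_value(xbar, mu, sigma, n):
    """P(sample mean >= xbar) when the true mean is mu and sigma is known."""
    z = (xbar - mu) / (sigma / sqrt(n))
    return 1.0 - NormalDist().cdf(z)


# Hypothetical numbers, not taken from the lecture.
mu0, sigma, n, xbar = 50.0, 10.0, 25, 53.0

# Candidate true means that are all consistent with H0: mu <= mu0.
null_means = [46.0, 48.0, 49.0, 50.0]
p_values = [upper_tail_p_value(xbar, mu, sigma, n) for mu in null_means]

for mu, p in zip(null_means, p_values):
    print(f"mu = {mu:5.1f}  ->  P(Xbar >= {xbar}) = {p:.4f}")
```

The tail probability grows as µ approaches µ0 from below, so reporting the value at µ = µ0 covers every mean in the composite null.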
One-sided tests are used when the deviation is expected to be in a particular direction.
They should not be used as a device to make a statistically non-significant result
significant.
For large samples (n > 30) we assume the sample variance s² ≈ σ² and we can use the CLT.
A few equivalent ways to perform a hypothesis test for the mean µ (with σ known).
Calculate the z-statistic,
$$ z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}, $$
and compare it with the appropriate critical value (which may be $z_\alpha$ or $z_{\alpha/2}$).
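The critical-value approach can be sketched in a few lines. This is a minimal illustration that assumes σ is known; the function name `z_test`, its `alternative` argument, and the example numbers are hypothetical and not part of the lecture.

```python
from math import sqrt
from statistics import NormalDist


def z_test(xbar, mu0, sigma, n, alpha, alternative="two-sided"):
    """Return (z, critical value, reject H0?) for a one-sample z-test."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()
    if alternative == "two-sided":
        critical = std_normal.inv_cdf(1 - alpha / 2)  # z_{alpha/2}
        reject = abs(z) > critical
    elif alternative == "greater":
        critical = std_normal.inv_cdf(1 - alpha)      # z_alpha
        reject = z > critical
    elif alternative == "less":
        critical = std_normal.inv_cdf(1 - alpha)      # compare z with -z_alpha
        reject = z < -critical
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return z, critical, reject


# Hypothetical data: H0: mu = 100 vs HA: mu != 100 at the 5% level.
z, critical, reject = z_test(xbar=106.0, mu0=100.0, sigma=15.0, n=36, alpha=0.05)
print(f"z = {z:.2f}, critical value = {critical:.3f}, reject H0: {reject}")
```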
When σ is unknown and we have a small sample, it is customary to assume that the data
is normally distributed and resort to a t-statistic:
$$ t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}. $$
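As a small illustration, the t-statistic can be computed directly from raw data. The sample below and the value µ0 = 20 are hypothetical. The critical value 1.833 is the usual table entry for a one-sided 5% test with 9 degrees of freedom and is entered by hand, since Python's standard library has no t-distribution.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical small sample (n = 10), assumed to come from a normal population.
sample = [21, 19, 23, 20, 22, 18, 24, 20, 21, 22]
mu0 = 20.0                    # hypothesised mean under H0 (assumed)

n = len(sample)
xbar = mean(sample)
s = stdev(sample)             # sample standard deviation (divides by n - 1)

t = (xbar - mu0) / (s / sqrt(n))

# Table value t_{0.05, 9} for HA: mu > mu0, typed in from a t-table.
t_critical = 1.833
reject = t > t_critical

print(f"xbar = {xbar:.2f}, s = {s:.4f}, t = {t:.4f}, reject H0: {reject}")
```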
Exercise: Suppose that you selected a random sample of 36 SUTD students, and found that on
average, they spend 20.0 hours on homework per week, with a sample standard deviation of
3.0 hours (assume normality, you may use CLT with σ approximated accurately by s).
Exercise: Previous research has shown that the amount of time children spend watching TV per
week had µ = 22.6h and σ = 6.1h. A market research firm believes that the stated mean is now
too low. A random sample of 60 children is taken to measure the number of hours they watch
TV. A hypothesis test at the α = 0.01 level is carried out.
1 State H0 and HA .
4 Suppose the true mean for this population is 25 hours. What is β, and what is the power in
this case? (Draw a picture!)
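The numbers behind the picture can be checked with a short calculation. This sketch assumes the alternative is HA : µ > 22.6 (the firm believes the stated mean is too low) and that σ = 6.1 is known; the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

# Values from the exercise.
mu0, sigma, n, alpha = 22.6, 6.1, 60, 0.01
mu_true = 25.0

std_error = sigma / sqrt(n)
z_alpha = NormalDist().inv_cdf(1 - alpha)

# Reject H0 when the sample mean exceeds this cutoff.
xbar_critical = mu0 + z_alpha * std_error

# Under the true mean, the sample mean is approximately N(mu_true, std_error^2).
sampling_dist = NormalDist(mu=mu_true, sigma=std_error)
beta = sampling_dist.cdf(xbar_critical)   # P(fail to reject | mu = 25)
power = 1 - beta

print(f"cutoff = {xbar_critical:.3f}, beta = {beta:.3f}, power = {power:.3f}")
```

Under these assumptions β comes out at roughly 0.24, so the power is roughly 0.76.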
Note: α and β cannot be reduced simultaneously, unless we increase the sample size.
https://shiny.rit.albany.edu/stat/betaprob/
Assume that σ is known, and that n is large so we may use the z-distribution.
$$ \pi(\mu) = 1 - \beta = \Phi\left( \frac{(\mu - \mu_0)\sqrt{n}}{\sigma} - z_\alpha \right). $$
Proof: generalize from Exercise 1. You should also figure out the corresponding formulas
for HA : µ < µ0 and HA : µ ≠ µ0 .
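The power function for the upper-tailed case can be evaluated directly. The sketch below assumes HA : µ > µ0 with σ known; the function name `power_upper_tail` and the numbers are hypothetical. It also shows that the power at µ = µ0 equals the level α.

```python
from math import sqrt
from statistics import NormalDist


def power_upper_tail(mu, mu0, sigma, n, alpha):
    """Power of the level-alpha z-test of H0: mu = mu0 vs HA: mu > mu0."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    return NormalDist().cdf((mu - mu0) * sqrt(n) / sigma - z_alpha)


# Hypothetical setting, chosen only for illustration.
mu0, sigma, n, alpha = 0.0, 1.0, 25, 0.05

true_means = [0.0, 0.2, 0.4, 0.6]
powers = [power_upper_tail(mu, mu0, sigma, n, alpha) for mu in true_means]

for mu, p in zip(true_means, powers):
    print(f"mu = {mu:.1f}  ->  power = {p:.4f}")
```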
Note: in situations where we need to use the t-distribution, the power calculation is less
straightforward.
With the assumptions on the previous slides, the minimum sample size required for an
α-level hypothesis test with power of (1 − β) is
$$ n = \left( \frac{(z_\alpha + z_\beta)\,\sigma}{\mu - \mu_0} \right)^2 , $$
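A minimal sketch of this sample size formula, rounding up to the next whole observation. The function name `min_sample_size` is hypothetical. The example reuses the TV exercise values (µ0 = 22.6, σ = 6.1, α = 0.01, true mean 25); the 90% power target is an assumption added here for illustration.

```python
from math import ceil
from statistics import NormalDist


def min_sample_size(mu, mu0, sigma, alpha, power):
    """Smallest n giving at least the requested power for a one-sided z-test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(power)   # z_beta, where beta = 1 - power
    return ceil(((z_alpha + z_beta) * sigma / (mu - mu0)) ** 2)


# TV exercise values with an assumed power target of 0.90.
n_required = min_sample_size(mu=25.0, mu0=22.6, sigma=6.1, alpha=0.01, power=0.90)
print(f"required sample size: {n_required}")
```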
Consider a (1 − α) two-sided confidence interval for µ using the z-distribution. What is the
relationship between the width of the interval and the sample size?
If the width of the CI is 2E, then we require the minimal sample size to be
$$ n = \left( \frac{z_{\alpha/2}\,\sigma}{E} \right)^2 , $$
Exercise: Find the required sample size for a 95% CI, whose width is σ/4.
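One way to check an answer to this exercise is sketched below. A width of σ/4 means 2E = σ/4, so E = σ/8 and σ cancels from the formula. The function name `ci_sample_size` is hypothetical, and the value of σ used is arbitrary.

```python
from math import ceil
from statistics import NormalDist


def ci_sample_size(sigma, half_width, confidence):
    """Smallest n so that a two-sided z-interval has half-width <= half_width."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    return ceil((z * sigma / half_width) ** 2)


sigma = 1.0                 # arbitrary: sigma cancels because E is a multiple of sigma
width = sigma / 4           # full width of the interval, 2E
n_required = ci_sample_size(sigma, half_width=width / 2, confidence=0.95)
print(f"required sample size: {n_required}")
```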
Exercise: Changes in test scores for students retaking the SAT without coaching have µ = 15
and σ = 40. The changes in the scores are roughly normally distributed. A coaching program
claims that on average it can improve the mean score by at least 35 points. A 0.01-level test of
H0 : µ = 15 vs HA : µ > 15 is to be conducted. Find the number of students that must be tested
in order to have at least 90% power for detecting an increase of 35 points or more.
Answer: 53 students.