Biostatistics For Public Health: Chapter 11 - Inference About A Mean

Biostatistics for Public Health
Chapter 11 - Inference About a Mean
Kevin Brooks MSc., PhD.

Objective
Perform and interpret one-sample, two-sample, and paired t hypothesis tests on means.
✓ Estimated Standard Error of the Mean
✓ Student’s t Distribution
✓ One-Sample t Test
✓ Confidence Interval for μ
✓ Paired Samples
✓ Conditions for Inference

Estimated Standard Error of the Mean
• We rarely know the population standard deviation σ
⇒ instead, we calculate the sample standard deviation s and use it as an estimate of σ
• We then use s to calculate the estimated standard error of the mean:
$$SE_{\bar{x}} = \frac{s}{\sqrt{n}}$$
• Using s instead of σ adds a source of uncertainty
⇒ z procedures no longer apply  
⇒ use t procedures instead
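
The formula above can be sketched in a few lines of Python. The function name `estimated_standard_error` and the input values are illustrative assumptions, not part of the slides.

```python
import math

def estimated_standard_error(s: float, n: int) -> float:
    """Estimated standard error of the mean: s / sqrt(n)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return s / math.sqrt(n)

# Illustrative values (assumed, not from the slides): s = 12.0, n = 36
se = estimated_standard_error(12.0, 36)
print(se)  # 2.0
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.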

Student’s t distributions
• A family of distributions identified by “Student” (William Sealy Gosset) in 1908
• t family members are identified by their degrees of
freedom, df.
• t distributions are similar to z distributions but with
broader tails
• As df increases → t tails get thinner → t becomes more like z
Figure: t probability density functions with 1, 9, and ∞ degrees of freedom.

t table (Table C)
• Use Table C to look up t values and probabilities

– Entries ⇒ t values
– Rows ⇒ df
– Columns ⇒ probabilities

Understanding Table C
Let $t_{df,p}$ ≡ a t value with df degrees of freedom and cumulative probability p. For example, $t_{9,0.90} = 1.383$.

Table C. Traditional t table

Cumulative p   0.75    0.80    0.85    0.90    0.95    0.975
Upper-tail p   0.25    0.20    0.15    0.10    0.05    0.025
df = 9         0.703   0.883   1.100   1.383   1.833   2.262

The 10th and 90th percentiles on $t_9$.
Left tail: $\Pr(T_9 < -1.383) = 0.10$
Right tail: $\Pr(T_9 > 1.383) = 0.10$
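
These tail areas can be checked without a table. The sketch below integrates the t density numerically with Simpson's rule, using only the Python standard library. The helper names `t_pdf` and `t_upper_tail` are hypothetical, and a statistics package would normally be used instead.

```python
import math

def t_pdf(t: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def t_upper_tail(t: float, df: int, steps: int = 2000) -> float:
    """Pr(T_df > t), via Simpson's rule on [0, |t|] plus symmetry about 0."""
    a = abs(t)
    h = a / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3            # Pr(0 < T < |t|)
    upper = 0.5 - central              # Pr(T > |t|)
    return upper if t >= 0 else 1 - upper

right = t_upper_tail(1.383, 9)         # Pr(T9 > 1.383)
left = 1 - t_upper_tail(-1.383, 9)     # Pr(T9 < -1.383)
print(round(right, 3), round(left, 3)) # both are about 0.10
```

The two tails agree because every t distribution is symmetric about 0.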

One-Sample t Test
A. Hypotheses. $H_0: \mu = \mu_0$ vs. $H_a: \mu \ne \mu_0$ (two-sided) [or $H_a: \mu < \mu_0$ (left-sided) or $H_a: \mu > \mu_0$ (right-sided)]
B. Test statistic.
$$t_{\text{stat}} = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} \quad \text{with } df = n - 1$$
C. P-value. Convert $t_{\text{stat}}$ to a P-value [Table C or software]. Small P ⇒ strong evidence against $H_0$
D. Significance level (optional). See Ch 9 for guidelines.

One-Sample t Test: Statement of the Problem
• Do SIDS babies have lower than average birth weights?
• We know from prior research that the mean birth weight of the non-SIDS babies in this population is 3300 grams
• We study n = 10 SIDS babies, determine their birth weights, and calculate $\bar{x} = 2890.5$ and $s = 720$.
• Do these data provide significant evidence that SIDS babies have different birth weights than the rest of the population?

One-Sample t Test: Example
A. $H_0: \mu = 3300$ versus $H_a: \mu \ne 3300$ (two-sided)
B. Test statistic
$$t_{\text{stat}} = \frac{\bar{x} - \mu_0}{SE_{\bar{x}}} = \frac{2890.5 - 3300}{720 / \sqrt{10}} = -1.80$$
$df = n - 1 = 10 - 1 = 9$
C. P = 0.1054 [next slide]
Weak evidence against $H_0$
D. Significance level (optional). Data are not significant at α = 0.10
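
The arithmetic in step B can be reproduced from the summary statistics alone. This is a minimal sketch; the function name `one_sample_t` is an assumption made for illustration.

```python
import math

def one_sample_t(xbar: float, mu0: float, s: float, n: int) -> tuple[float, int]:
    """Return (t_stat, df) for a one-sample t test from summary statistics."""
    se = s / math.sqrt(n)              # estimated standard error of the mean
    return (xbar - mu0) / se, n - 1

# SIDS example: xbar = 2890.5, mu0 = 3300, s = 720, n = 10
t_stat, df = one_sample_t(xbar=2890.5, mu0=3300, s=720, n=10)
print(round(t_stat, 2), df)  # -1.8 9
```

The negative sign only says that the sample mean lies below 3300; the two-sided P-value depends on the absolute value of the statistic.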

Converting the $t_{\text{stat}}$ to a P-value
$t_{\text{stat}}$ ⇒ P-value via Table C. Wedge $|t_{\text{stat}}|$ between critical value landmarks on Table C. With $|t_{\text{stat}}| = 1.80$ falling between 1.383 and 1.833, one-tailed 0.05 < P < 0.10 and two-tailed 0.10 < P < 0.20.

Table C. Traditional t table

Cumulative p   0.75    0.80    0.85    0.90    0.95    0.975
Upper-tail p   0.25    0.20    0.15    0.10    0.05    0.025
df = 9         0.703   0.883   1.100   1.383   1.833   2.262

$t_{\text{stat}}$ ⇒ P-value via software. Use a software utility to determine that a t of −1.80 with 9 df has a two-tailed P-value of 0.1054.
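
The "wedging" step can be written as a small table lookup. The sketch below hard-codes the df = 9 row of Table C as printed above. The function name `bracket_one_tailed_p` is hypothetical, and the approach only bounds P rather than computing it exactly.

```python
# Row of Table C for df = 9: (upper-tail p, critical t), as printed on the slide
TABLE_C_DF9 = [(0.25, 0.703), (0.20, 0.883), (0.15, 1.100),
               (0.10, 1.383), (0.05, 1.833), (0.025, 2.262)]

def bracket_one_tailed_p(t_stat, row):
    """Return (low, high) bounds on the one-tailed P-value.

    None for a bound means the table row does not extend that far.
    """
    t_abs = abs(t_stat)
    high, low = None, None
    for p, crit in row:                # row is ordered by increasing critical value
        if crit <= t_abs:
            high = p                   # |t| reaches this landmark, so P <= p
        else:
            low = p                    # first landmark |t| fails to reach, so P > p
            break
    return low, high

low, high = bracket_one_tailed_p(-1.80, TABLE_C_DF9)
print(f"one-tailed: {low} < P < {high}")          # one-tailed: 0.05 < P < 0.1
print(f"two-tailed: {2 * low} < P < {2 * high}")  # two-tailed: 0.1 < P < 0.2
```

Doubling the one-tailed bounds gives the two-tailed bounds because the t distribution is symmetric.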
Figure: Two-tailed P-value, SIDS illustrative example

Confidence Interval for µ
$$(1 - \alpha)100\%\ \text{CI for } \mu = \bar{x} \pm t_{n-1,\,1-\alpha/2} \cdot \frac{s}{\sqrt{n}}$$
• Typical “point estimate ± margin of error” formula
• $t_{n-1,\,1-\alpha/2}$ is from the t table (see bottom row for conf. level)
• Similar to z procedure except uses s instead of σ
• Similar to z procedure except uses t instead of z
• Alternative formula:
$$\bar{x} \pm t_{n-1,\,1-\alpha/2} \cdot SE_{\bar{x}} \quad \text{where } SE_{\bar{x}} = \frac{s}{\sqrt{n}}$$

Confidence Interval: Example 1
Let us calculate a 95% confidence interval for μ for the birth weight of SIDS babies.
$\bar{x} = 2890.5 \quad s = 720.0 \quad n = 10$
$$95\%\ \text{CI for } \mu = \bar{x} \pm t_{10-1,\,1-.05/2} \cdot \frac{s}{\sqrt{n}}$$
$$= 2890.5 \pm 2.262 \cdot \frac{720}{\sqrt{10}}$$
$$= 2890.5 \pm 515.1$$
$$= (2375.4 \text{ to } 3405.6) \text{ grams}$$
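
The same interval can be computed from the summary statistics. In this sketch the critical value is passed in by hand from Table C, and the function name `t_confidence_interval` is an assumption. Because the tabled t value is rounded to three decimals, the margin may differ from the slide in the last digit.

```python
import math

def t_confidence_interval(xbar, s, n, t_crit):
    """Return (lower, upper, margin) for xbar ± t_crit * s / sqrt(n)."""
    margin = t_crit * s / math.sqrt(n)
    return xbar - margin, xbar + margin, margin

# t_crit = 2.262 is t_{9, .975} from Table C
lower, upper, margin = t_confidence_interval(2890.5, 720.0, 10, 2.262)
print(f"{lower:.0f} to {upper:.0f} grams")  # 2375 to 3406 grams
```

The interval includes 3300, which agrees with the two-sided test not reaching significance at α = 0.05.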

Confidence Interval: Example 2
Data are “% of ideal body weight” in 18 diabetics:
{107, 119, 99, 114, 120, 104, 88, 114, 124, 116, 101, 121, 152, 100, 125, 114, 95, 117}.
Based on these data we calculate a 95% CI for µ.
$\bar{x} = 112.778 \quad s = 14.424 \quad n = 18$
$$SE_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{14.424}{\sqrt{18}} = 3.400$$
$$t_{n-1,\,1-\alpha/2} = t_{18-1,\,1-.05/2} = t_{17,.975} = 2.110 \text{ (from t table)}$$
$$\bar{x} \pm (t_{n-1,\,1-\alpha/2})(SE_{\bar{x}}) = 112.778 \pm (2.110)(3.400) = 112.778 \pm 7.17 = (105.6, 120.0)$$
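
Starting from the raw data, the summary statistics and the interval can be reproduced with the Python standard library. This is a sketch under the assumption that the critical value 2.110 is read from the t table, as in the slide; the variable names are illustrative.

```python
import math
import statistics

pct_ideal_weight = [107, 119, 99, 114, 120, 104, 88, 114, 124, 116,
                    101, 121, 152, 100, 125, 114, 95, 117]

n = len(pct_ideal_weight)
xbar = statistics.mean(pct_ideal_weight)
s = statistics.stdev(pct_ideal_weight)      # sample SD (n - 1 denominator)
se = s / math.sqrt(n)
t_crit = 2.110                              # t_{17, .975} from the t table
margin = t_crit * se

print(f"xbar = {xbar:.3f}, s = {s:.3f}, SE = {se:.3f}")
# xbar = 112.778, s = 14.424, SE = 3.400
print(f"95% CI: ({xbar - margin:.1f}, {xbar + margin:.1f})")
# 95% CI: (105.6, 120.0)
```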

Paired Samples
• Paired samples: Each point in one sample is matched to a unique point in the other sample
• Pairs can be achieved via sequential samples within individuals (e.g., pre-test/post-test), cross-over trials, and matching procedures
• Also called “matched-pairs” and “dependent samples”

Example: Paired Samples
• A study uses a cross-over design to address whether oat bran reduces LDL cholesterol.
• Subjects “cross over” from a cornflake diet to an oat bran diet.
– Half of the subjects start on CORNFLK, half on OATBRAN
– Two weeks on diet 1
– Measure LDL cholesterol
– Washout period
– Switch diet
– Two weeks on diet 2
– Measure LDL cholesterol

Example, Data
Subject  CORNFLK  OATBRAN
-------  -------  -------
 1       4.61     3.84
 2       6.42     5.57
 3       5.40     5.85
 4       4.54     4.80
 5       3.98     3.68
 6       3.82     2.96
 7       5.01     4.41
 8       4.34     3.72
 9       3.80     3.49
10       4.56     3.84
11       5.35     5.26
12       3.89     3.73

Calculate Difference Variable “DELTA”
• Step 1 is to create difference variable “DELTA”
• Let DELTA = CORNFLK − OATBRAN
• Order of subtraction does not materially affect results (but does change the sign of the differences)
• Here are the first three observations (positive values represent lower LDL on oat bran):
ID  CORNFLK  OATBRAN  DELTA
--  -------  -------  -----
1   4.61     3.84      0.77
2   6.42     5.57      0.85
3   5.40     5.85     -0.45
↓   ↓        ↓         ↓
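
As a sketch of this step, the difference variable can be built from the two columns of the data table. The list names `cornflk`, `oatbran`, and `delta` are illustrative choices that mirror the column names.

```python
cornflk = [4.61, 6.42, 5.40, 4.54, 3.98, 3.82, 5.01, 4.34, 3.80, 4.56, 5.35, 3.89]
oatbran = [3.84, 5.57, 5.85, 4.80, 3.68, 2.96, 4.41, 3.72, 3.49, 3.84, 5.26, 3.73]

# DELTA = CORNFLK - OATBRAN; rounding removes floating-point noise
delta = [round(c - o, 2) for c, o in zip(cornflk, oatbran)]
print(delta[:3])           # [0.77, 0.85, -0.45]

# Reversing the order of subtraction only flips the signs
reversed_delta = [round(o - c, 2) for c, o in zip(cornflk, oatbran)]
print(reversed_delta[:3])  # [-0.77, -0.85, 0.45]
```

Pairing by position with `zip` relies on both lists being in the same subject order, which is the whole point of a paired design.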

Explore DELTA Values
Here are all twelve paired differences (DELTAs):
0.77, 0.85, −0.45, −0.26, 0.30, 0.86, 0.60, 0.62, 0.31, 0.72, 0.09, 0.16
Figure: plot of the DELTA values (axis from −0.5 to 1.5)
EDA shows a slight negative skew, a median of about 0.45, with results varying from about −0.45 to 0.86.

Descriptive stats for DELTA
• Data (DELTAs): 0.77, 0.85, −0.45, −0.26, 0.30, 0.86, 0.60, 0.62, 0.31, 0.72, 0.09, 0.16
• The subscript d will be used to denote statistics for difference variable DELTA
$n = 12$
$\bar{x}_d = 0.3808$
$s_d = 0.4335$
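
These descriptive statistics can be checked with Python's `statistics` module. The sketch below also computes the median of the differences; the variable names are illustrative.

```python
import statistics

delta = [0.77, 0.85, -0.45, -0.26, 0.30, 0.86, 0.60, 0.62, 0.31, 0.72, 0.09, 0.16]

n = len(delta)
mean_d = statistics.mean(delta)
sd_d = statistics.stdev(delta)       # sample SD, n - 1 in the denominator
median_d = statistics.median(delta)

print(n, round(mean_d, 4), round(sd_d, 4), round(median_d, 3))
# 12 0.3808 0.4335 0.455
```

The median (0.455) sits above the mean (0.3808), which is consistent with a slight negative skew.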

95% Confidence Interval for µd
• A t procedure directed toward the DELTA variable calculates the confidence interval for the mean difference.
$$(1 - \alpha)100\%\ \text{CI for } \mu_d = \bar{x}_d \pm t_{n-1,\,1-\alpha/2} \cdot \frac{s_d}{\sqrt{n}}$$
• “Oat bran” data:
For 95% confidence use $t_{12-1,\,1-.05/2} = t_{11,.975} = 2.201$ (from Table C)
$$95\%\ \text{CI for } \mu_d = 0.3808 \pm 2.201 \cdot \frac{0.4335}{\sqrt{12}}$$
$$= 0.3808 \pm 0.2754$$
$$= (0.105 \text{ to } 0.656)$$

Paired t Test
• Similar to one-sample t test
• $\mu_0$ is usually set to 0, representing “no mean difference”, i.e., $H_0: \mu_d = 0$
• Test statistic:
$$t_{\text{stat}} = \frac{\bar{x}_d - \mu_0}{s_d / \sqrt{n}} \quad \text{with } df = n - 1$$

Paired t Test: Example, “Oat bran” data
A. Hypotheses. $H_0: \mu_d = 0$ vs. $H_a: \mu_d \ne 0$
B. Test statistic.
$$t_{\text{stat}} = \frac{\bar{x}_d - \mu_0}{s_d / \sqrt{n}} = \frac{0.38083 - 0}{0.4335 / \sqrt{12}} = 3.043$$
$df = n - 1 = 12 - 1 = 11$
C. P-value. P = 0.011 (via computer). The evidence against $H_0$ is statistically significant.
D. Significance level (optional). The evidence against $H_0$ is significant at α = 0.05 but is not significant at α = 0.01
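
The paired test is simply a one-sample t test on the differences, which the following sketch makes explicit. The critical value 2.201 is quoted in the slides; 3.106 ($t_{11,.995}$) is an assumption taken from a standard t table and does not appear in the slides.

```python
import math
import statistics

delta = [0.77, 0.85, -0.45, -0.26, 0.30, 0.86, 0.60, 0.62, 0.31, 0.72, 0.09, 0.16]

n = len(delta)
se_d = statistics.stdev(delta) / math.sqrt(n)
t_stat = (statistics.mean(delta) - 0) / se_d   # mu_0 = 0 under H0
df = n - 1

# Two-sided critical values for df = 11 (3.106 is from a standard table, not the slides)
crit_05, crit_01 = 2.201, 3.106

print(round(t_stat, 3), df)       # 3.043 11
print(abs(t_stat) > crit_05)      # True  -> significant at alpha = 0.05
print(abs(t_stat) > crit_01)      # False -> not significant at alpha = 0.01
```

Comparing against critical values reproduces step D, and it is consistent with the reported P = 0.011 lying between 0.01 and 0.05.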
SPSS Output: Oat Bran data
● USE SAS TO PRODUCE THIS EXAMPLE LIVE

Conditions for Inference
t procedures require these conditions:
• Simple random sample, SRS (individual observations or DELTAs)
• Valid information (no information bias)
• Normal population or large sample (central limit theorem)

The Normality Condition
• The Normality condition applies to the sampling distribution of the mean, not the population.
• Therefore, it is OK to use t procedures when:
– The population is Normal
– The population is not Normal but is symmetrical and n is at least 5 to 10
– The population is skewed and n is at least 30 to 100 (depending on the extent of the skew)

Can t procedures be used?
• If the dataset is skewed and small: avoid t procedures
• If the dataset has a mild skew and is moderate in size: use t procedures
• If the dataset is highly skewed and is small: avoid t procedures

Thank You
For Viewing

MPH Program,
Division of Public Health
College of Human Medicine
Michigan State University
brooks52@msu.edu
