
Biostatistics

Lecture 10
Lecture 9 Review
Measures of association



Risk difference
Risk ratio
Odds ratio
Calculation & interpretation of confidence interval for each measure of association
2×2 table - Measures of association
Outcome - binary
Measure of effect | Formula
Risk difference | $p_1 - p_0$
Risk ratio | $p_1 / p_0$
Odds ratio | $(d_1/h_1) / (d_0/h_0)$
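The three formulas can be sketched in a few lines of Python. This is only an illustration: the function name `measures_of_association` is hypothetical, the notation (d = number with the outcome, h = number without, 1 = exposed, 0 = unexposed) follows the slides, and the counts are those of the TBM trial 2×2 table used in this lecture.

```python
def measures_of_association(d1, h1, d0, h0):
    """Return (risk difference, risk ratio, odds ratio) for a 2x2 table.

    d = number with the outcome, h = number without it;
    subscript 1 = exposed group, subscript 0 = unexposed group.
    """
    n1, n0 = d1 + h1, d0 + h0      # group totals
    p1, p0 = d1 / n1, d0 / n0      # risks in each group
    risk_difference = p1 - p0
    risk_ratio = p1 / p0
    odds_ratio = (d1 / h1) / (d0 / h0)
    return risk_difference, risk_ratio, odds_ratio


# Counts from the TBM trial table in these slides (dexamethasone vs placebo)
rd, rr, odds_ratio = measures_of_association(d1=87, h1=187, d0=112, h0=159)
print(f"RD = {rd:.3f}, RR = {rr:.2f}, OR = {odds_ratio:.2f}")
```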
Differences in measures of association
When there is no association between exposure and outcome,



risk difference = 0
risk ratio (RR) = 1
odds ratio (OR) = 1


Risk difference can be negative or positive
RR & OR are always positive
For rare outcomes, OR ~ RR
OR is always further from 1 than corresponding RR
If RR > 1 then OR > RR
If RR < 1 then OR < RR
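A short numerical check of these relationships, as a sketch only. The risks below are hypothetical values chosen for illustration (they do not come from the slides), and `odds_ratio_from_risks` is a made-up helper name.

```python
def odds_ratio_from_risks(p1, p0):
    """Odds ratio computed from the two risks: odds = p / (1 - p)."""
    return (p1 / (1 - p1)) / (p0 / (1 - p0))


# Hypothetical risks, for illustration only (not from the slides)
scenarios = [
    ("rare outcome", 0.002, 0.001),          # OR should be close to RR
    ("common outcome, RR > 1", 0.40, 0.20),  # OR should exceed RR
    ("common outcome, RR < 1", 0.20, 0.40),  # OR should be below RR
]
for label, p1, p0 in scenarios:
    rr = p1 / p0
    o_r = odds_ratio_from_risks(p1, p0)
    print(f"{label}: RR = {rr:.3f}, OR = {o_r:.3f}")
```

In each common-outcome scenario the OR lies further from 1 than the RR, while for the rare outcome the two are nearly equal.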
Interpretation of measures of association
RR & OR < 1: associated with a reduced risk / odds (may be protective)
RR = 0.8 (reduced risk of 20%)

RR & OR > 1: associated with an increased risk / odds
RR = 1.2 (increased risk of 20%)
The further the RR or OR is from 1, the stronger the association
between exposure and outcome (e.g. RR = 2 versus RR = 3).
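Comparing the outcome measure of two exposure groups
(groups 1 & 0)
Outcome variable data type: categorical

Population parameter: population risk ratio
Estimate of population parameter from sample: $p_1 / p_0$
Standard error of $\log_e$(parameter): $\text{s.e.}(\log_e RR) = \sqrt{\frac{1}{d_1} - \frac{1}{n_1} + \frac{1}{d_0} - \frac{1}{n_0}}$
95% confidence interval of $\log_e$(population parameter): $\log_e RR \pm 1.96 \times \text{s.e.}(\log_e RR)$

Population parameter: population odds ratio
Estimate of population parameter from sample: $(d_1/h_1) / (d_0/h_0)$
Standard error of $\log_e$(parameter): $\text{s.e.}(\log_e OR) = \sqrt{\frac{1}{d_1} + \frac{1}{h_1} + \frac{1}{d_0} + \frac{1}{h_0}}$
95% confidence interval of $\log_e$(population parameter): $\log_e OR \pm 1.96 \times \text{s.e.}(\log_e OR)$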
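As a rough sketch of how these intervals are computed in practice: the confidence limits are formed on the log scale and then exponentiated. The function names are hypothetical, the standard errors are the usual log-scale formulas for the risk ratio and odds ratio, and the counts are taken from the TBM trial table in these slides. Depending on rounding or the exact method used, the last digit may differ slightly from the intervals reported in the slides.

```python
import math


def log_scale_ci(estimate, se_log, z=1.96):
    """95% CI for a ratio: build the interval for log(estimate), then exponentiate."""
    log_est = math.log(estimate)
    return math.exp(log_est - z * se_log), math.exp(log_est + z * se_log)


def risk_ratio_ci(d1, n1, d0, n0):
    rr = (d1 / n1) / (d0 / n0)
    se = math.sqrt(1 / d1 - 1 / n1 + 1 / d0 - 1 / n0)  # s.e. of log(RR)
    return rr, log_scale_ci(rr, se)


def odds_ratio_ci(d1, h1, d0, h0):
    o_r = (d1 / h1) / (d0 / h0)
    se = math.sqrt(1 / d1 + 1 / h1 + 1 / d0 + 1 / h0)  # s.e. of log(OR)
    return o_r, log_scale_ci(o_r, se)


# TBM trial counts: dexamethasone (group 1) vs placebo (group 0)
rr, (rr_lo, rr_hi) = risk_ratio_ci(d1=87, n1=274, d0=112, n0=271)
o_r, (or_lo, or_hi) = odds_ratio_ci(d1=87, h1=187, d0=112, h0=159)
print(f"RR = {rr:.2f} (95% CI {rr_lo:.2f} to {rr_hi:.2f})")
print(f"OR = {o_r:.2f} (95% CI {or_lo:.2f} to {or_hi:.2f})")
```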
Calculation of p-values for comparing two groups
Outcome variable data type: categorical

Population parameter | Population parameter under null hypothesis | Test statistic
Population risk difference, $\pi_1 - \pi_0$ | $\pi_1 - \pi_0 = 0$ | $z = \frac{p_1 - p_0}{\text{s.e.}(p_1 - p_0)}$
Population risk ratio | Population risk ratio = 1 | $z = \frac{\log_e(RR)}{\text{s.e.}(\log_e(RR))}$
Population odds ratio | Population odds ratio = 1 | $z = \frac{\log_e(OR)}{\text{s.e.}(\log_e(OR))}$
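A minimal sketch of turning these z statistics into two-sided p-values, using only the Python standard library. Two assumptions are worth flagging: the standard error of the risk difference is not given on this slide, so the common unpooled formula sqrt(p1(1-p1)/n1 + p0(1-p0)/n0) is assumed here, and the helper name `two_sided_p` is hypothetical. The counts are from the TBM trial table in these slides.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic z."""
    return math.erfc(abs(z) / math.sqrt(2))


# TBM trial counts: dexamethasone (group 1) vs placebo (group 0)
d1, h1, n1 = 87, 187, 274
d0, h0, n0 = 112, 159, 271
p1, p0 = d1 / n1, d0 / n0

# Risk difference (ASSUMPTION: unpooled standard error, not stated on the slide)
se_rd = math.sqrt(p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0)
z_rd = (p1 - p0) / se_rd
p_rd = two_sided_p(z_rd)

# Odds ratio, tested on the log scale
log_or = math.log((d1 / h1) / (d0 / h0))
se_log_or = math.sqrt(1 / d1 + 1 / h1 + 1 / d0 + 1 / h0)
z_or = log_or / se_log_or
p_or = two_sided_p(z_or)

print(f"risk difference: z = {z_rd:.2f}, p = {p_rd:.3f}")
print(f"odds ratio:      z = {z_or:.2f}, p = {p_or:.3f}")
```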
Comparing the outcome measure of two exposure groups
(TBM trial: dexamethasone versus placebo)
Outcome variable data type: categorical

Population parameter under null hypothesis | Estimate of population parameter from sample | 95% confidence interval for population parameter | Two-sided p-value
Population risk difference = 0 | $p_1 - p_0 = -0.095$ | -0.175, -0.015 | 0.020
Population risk ratio = 1 | $p_1 / p_0 = 0.77$ | 0.62, 0.96 | 0.016
Population odds ratio = 1 | $(d_1/h_1) / (d_0/h_0) = 0.66$ | 0.46, 0.93 | 0.021
2×2 table TBM trial example
Odds ratio for death = $(d_1/h_1) / (d_0/h_0)$ = 0.465 / 0.704 = 0.66
Odds ratio for exposure to dexamethasone = $(d_1/d_0) / (h_1/h_0)$ = 0.777 / 1.176 = 0.66
Odds ratio for not dying = $(h_1/d_1) / (h_0/d_0)$ = 2.149 / 1.420 = 1.51 = (1/0.66)
Odds ratio for exposure to placebo = $(d_0/d_1) / (h_0/h_1)$ = 1.287 / 0.850 = 1.51 = (1/0.66)

Death during 9 months post start of treatment
Treatment group | Yes | No | Total
Dexamethasone (group 1) | 87 ($d_1$) | 187 ($h_1$) | 274 ($n_1$)
Placebo (group 0) | 112 ($d_0$) | 159 ($h_0$) | 271 ($n_0$)
Total | 199 | 346 | 545
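The symmetry of the odds ratio shown in this example can be checked directly from the four cell counts. The sketch below uses the TBM trial counts from the table; the variable names are illustrative only.

```python
import math

# TBM trial cell counts: d = died, h = survived; 1 = dexamethasone, 0 = placebo
d1, h1, d0, h0 = 87, 187, 112, 159

or_death = (d1 / h1) / (d0 / h0)          # odds ratio for death
or_dexamethasone = (d1 / d0) / (h1 / h0)  # odds ratio for exposure to dexamethasone
or_not_dying = (h1 / d1) / (h0 / d0)      # odds ratio for not dying
or_placebo = (d0 / d1) / (h0 / h1)        # odds ratio for exposure to placebo

# The first two agree, and the last two are their reciprocal
print(math.isclose(or_death, or_dexamethasone))
print(math.isclose(or_not_dying, 1 / or_death))
print(math.isclose(or_placebo, 1 / or_death))
```

All three comparisons hold because each version is a rearrangement of the same cross-product ratio $(d_1 h_0) / (d_0 h_1)$ or its inverse.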
Measure of association
Study design | Risk difference | Risk ratio | Odds ratio
Randomised controlled trial | | |
Cohort study | | |
Case-control study | | |
Lecture 10 Controlling for confounding:
stratification and regression
A description of confounding
How to control for confounding in statistical analysis by
Stratification
Regression modelling
A brief description of the role of multiple
linear or logistic regression in adjusting for
confounding
Outcome and exposure variables
(RECAP)
Outcomes are variables of interest (population
health relevance) whose patterns and
determinants we wish to learn about from data

Exposures are the variables we think might
explain observed variation in the outcomes
Statistical analysis can be used to quantify the
association between outcomes and exposures
What is confounding?
A confounding variable
1) is associated with the outcome variable;
2) is associated with the exposure variable;
3) does not lie on the causal pathway.
[Diagram: exposure variable, outcome variable and confounding variable]
Failing to control for confounding may result in a
biased estimate of the magnitude of the association
between exposure and outcome
Example of confounding
Exposure variable: Alcohol intake
Outcome variable: Heart disease
Confounding variable: Cigarette smoking
Control of confounding
Design of Study
Randomisation
(randomised controlled trial: e.g. TBM trial)
Restriction
(only include those with one value of confounder)
Matching
Control of confounding
Statistical analysis
Stratification
Regression modelling
Hypothetical example of a case-control study
Association between energy intake and heart disease
Odds of heart disease in high energy intake group = 730/600 = 1.22
Odds of heart disease in low energy intake group = 700/540 = 1.30
Odds ratio = 1.22 / 1.30 = 0.94
95% confidence interval: 0.80 up to 1.10

Energy intake | Heart disease: Yes | Heart disease: No | Total
High (group 1) | 730 ($d_1$) | 600 ($h_1$) | 1330 ($n_1$)
Low (group 0) | 700 ($d_0$) | 540 ($h_0$) | 1240 ($n_0$)
Total | 1430 | 1140 | 2570
Is this association confounded by physical activity?
Exposure variable: Energy intake
Outcome variable: Heart disease
Confounding variable: Physical activity
Stratify by physical activity..
Calculate the stratum specific odds ratios

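Energy intake | High physical activity, heart disease: Yes | High physical activity, heart disease: No | Low physical activity, heart disease: Yes | Low physical activity, heart disease: No
High (group 1) | 500 | 510 | 230 | 90
Low (group 0) | 100 | 150 | 600 | 390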
Stratify by physical activity..
For high physical activity group:
OR (95% CI) = 1.47 (1.11, 1.95)
For low physical activity group:
OR (95% CI) = 1.66 (1.26, 2.19)
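The contrast between the crude and the stratum-specific odds ratios can be reproduced from the cell counts. A small sketch, with a hypothetical helper `odds_ratio` and the counts from the stratified table of the energy intake example:

```python
def odds_ratio(d1, h1, d0, h0):
    """Odds ratio for a 2x2 table: (d1/h1) / (d0/h0)."""
    return (d1 / h1) / (d0 / h0)


# Cell counts (d1, h1, d0, h0) within each physical activity stratum
strata = {
    "high physical activity": (500, 510, 100, 150),
    "low physical activity": (230, 90, 600, 390),
}

for name, cells in strata.items():
    print(f"{name}: OR = {odds_ratio(*cells):.2f}")
# high physical activity: OR = 1.47
# low physical activity: OR = 1.66

# Collapsing over the strata (adding cell by cell) recovers the crude table
crude_cells = tuple(sum(cell) for cell in zip(*strata.values()))
print(f"crude: OR = {odds_ratio(*crude_cells):.2f}")
# crude: OR = 0.94
```

Both stratum-specific odds ratios are above 1 even though the crude odds ratio is below 1, which is the pattern that prompts the check for confounding.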

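Is this association confounded by physical activity?
Exposure variable: Energy intake
Outcome variable: Heart disease
Confounding variable: Physical activity
Association between physical activity and heart disease: ???
Association between physical activity and energy intake: ???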
Confounding condition 1
Association between physical activity and heart disease
** Look particularly in those who are not exposed to the factor of interest**
For low energy intake group:
OR (95% CI) = 0.43 (0.33, 0.58)
For high energy intake group:
OR (95% CI) = 0.38 (0.29, 0.50)

Physical activity | High energy intake, heart disease: Yes | High energy intake, heart disease: No | Low energy intake, heart disease: Yes | Low energy intake, heart disease: No
High (group 1) | 500 | 510 | 100 | 150
Low (group 0) | 230 | 90 | 600 | 390
Confounding condition 2
Association between energy intake and physical activity
In a case-control study: examine the association in the controls
In a cohort study: use the whole cohort
Confounding condition 2
Association between energy intake and physical activity for those
without heart disease (n=1140)
Proportion in high energy intake group who report high physical activity =
510/600 = 0.85 (85%)
Proportion in low energy intake group who report high physical activity =
150/540 = 0.28 (28%)
Odds Ratio = (510/90) / (150/390) = 14.7; 95% CI: 11.0 up to 19.7
Energy intake | Physical activity: High | Physical activity: Low | Total
High (group 1) | 510 | 90 | 600
Low (group 0) | 150 | 390 | 540
Is this association confounded by physical activity?
Exposure variable: Energy intake
Outcome variable: Heart disease
Confounding variable: Physical activity
Association between physical activity and heart disease:
High energy intake: OR = 0.38 (95% CI: 0.29, 0.50)
Low energy intake: OR = 0.43 (95% CI: 0.33, 0.58)
Association between physical activity and energy intake:
High energy intake associated with high physical activity
So physical activity is a potential confounder
Control for confounding - Stratified analyses
1) Start with stratum-specific estimates of odds ratios, risk ratios, risk differences, rate ratios
2) Calculate a weighted average of the stratum-specific estimates (pooled estimate)

Usual method is Mantel-Haenszel method
Weights assigned according to amount of information in each stratum
Calculate a pooled OR
For low physical activity:
OR = 1.66
$w = (d_0 \times h_1)/n = (600 \times 90)/1310 = 41.2$

For high physical activity:
OR = 1.47
$w = (d_0 \times h_1)/n = (100 \times 510)/1260 = 40.5$

Energy intake | High physical activity (n=1260), heart disease: Yes | High physical activity (n=1260), heart disease: No | Low physical activity (n=1310), heart disease: Yes | Low physical activity (n=1310), heart disease: No
High (group 1) | 500 ($d_1$) | 510 ($h_1$) | 230 ($d_1$) | 90 ($h_1$)
Low (group 0) | 100 ($d_0$) | 150 ($h_0$) | 600 ($d_0$) | 390 ($h_0$)
Calculate a pooled OR
Mantel-Haenszel estimate of pooled odds ratio:
$OR_{MH} = \frac{\sum_i (w_i \times OR_i)}{\sum_i w_i}$, where the sums run over the strata $i$
For low physical activity: OR = 1.66, $w = (d_0 \times h_1)/n = (600 \times 90)/1310 = 41.2$
For high physical activity: OR = 1.47, $w = (d_0 \times h_1)/n = (100 \times 510)/1260 = 40.5$
Calculate a pooled OR
Mantel-Haenszel estimate of pooled odds ratio:
$OR_{MH} = \frac{(40.5 \times 1.47) + (41.2 \times 1.66)}{40.5 + 41.2} = 1.57$
95% CI: 1.29 up to 1.91
Recall that the crude OR was 0.94 (95% CI 0.80-1.10)
Is there a difference between crude and adjusted measures of effect?
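The weighted average can be written as a short function. This is a sketch only: `mantel_haenszel_or` is a hypothetical name, the weights are $w = (d_0 \times h_1)/n$ as on the slides, and the data are the two physical activity strata of the energy intake example. The confidence interval needs a separate variance formula that the slides do not show, so it is not computed here.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio.

    strata: iterable of (d1, h1, d0, h0) cell counts, one tuple per stratum.
    Each stratum-specific OR is weighted by w = d0 * h1 / n.
    """
    numerator = 0.0
    denominator = 0.0
    for d1, h1, d0, h0 in strata:
        n = d1 + h1 + d0 + h0
        weight = d0 * h1 / n
        stratum_or = (d1 * h0) / (d0 * h1)
        numerator += weight * stratum_or
        denominator += weight
    return numerator / denominator


strata = [
    (500, 510, 100, 150),  # high physical activity (n = 1260)
    (230, 90, 600, 390),   # low physical activity (n = 1310)
]
pooled = mantel_haenszel_or(strata)
print(f"Mantel-Haenszel OR = {pooled:.2f}")
# Mantel-Haenszel OR = 1.57
```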
Association between energy intake & heart disease adjusting for physical activity
$OR_{MH}$ = 1.57
95% CI: 1.29, 1.91
Exposure variable: Energy intake
Outcome variable: Heart disease
Confounding variable: Physical activity
Association between physical activity and heart disease:
High energy intake: OR = 0.38 (95% CI: 0.29, 0.50)
Low energy intake: OR = 0.43 (95% CI: 0.33, 0.58)
Association between physical activity and energy intake:
High energy intake associated with high physical activity
Multiple logistic regression
Outcome variable (y-variable): binary
e.g. dead or alive; treatment failure or success; disease or no disease
Measure of association: odds ratio
Multiple logistic regression model
$\log_e(\text{odds of outcome}) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \dots + \beta_k X_k$
$\beta_1, \dots, \beta_k$: $\log_e$(odds ratios)
$X_1, \dots, X_k$: k different exposure variables (do not need to be binary but can be categorical with more than 2 categories or numerical)
Useful when there are many confounding variables
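Why the coefficients are log odds ratios can be seen with a short derivation. Writing the coefficients of the model as $\beta_0, \beta_1, \dots, \beta_k$ and taking $X_1$ to be a binary exposure (an assumption made here only to keep the example simple), compare two people who differ in $X_1$ but share the same values of $X_2, \dots, X_k$:

```latex
\begin{align*}
\log_e(\text{odds} \mid X_1 = 1) &= \beta_0 + \beta_1 + \beta_2 X_2 + \dots + \beta_k X_k \\
\log_e(\text{odds} \mid X_1 = 0) &= \beta_0 + \beta_2 X_2 + \dots + \beta_k X_k \\
\log_e(\text{odds} \mid X_1 = 1) - \log_e(\text{odds} \mid X_1 = 0) &= \beta_1 \\
OR_{X_1} = \frac{\text{odds} \mid X_1 = 1}{\text{odds} \mid X_1 = 0} &= e^{\beta_1}
\end{align*}
```

Because the other variables cancel in the subtraction, $e^{\beta_1}$ is the odds ratio for $X_1$ adjusted for $X_2, \dots, X_k$.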
Logistic regression
Example: Association between energy intake and heart disease
Outcome variable (y-variable): heart disease (coded as yes = 1 & no = 0)
Logistic regression model
$\log_e(\text{odds of outcome}) = \beta_0 + \beta_1 X_1$
$\beta_1$: $\log_e$(odds ratio)
$X_1$: energy intake (high versus low)

Exposure | Odds ratio ($\exp \beta_i$) | 95% confidence interval
Energy intake (high vs low) | 0.94 | 0.80, 1.10
Multiple logistic regression
Example: Association between energy intake and heart disease
Outcome variable (y-variable): heart disease (coded as yes = 1 & no = 0)
Multiple logistic regression model
$\log_e(\text{odds of outcome}) = \beta_0 + \beta_1 X_1 + \beta_2 X_2$
$\beta_1, \beta_2$: $\log_e$(odds ratios)
$X_1$: energy intake (high versus low)
$X_2$: physical activity (high versus low)

Exposure | Odds ratio ($\exp \beta_i$) | 95% confidence interval
Energy intake (high vs low) | 1.57 | 1.29, 1.91
Physical activity (high vs low) | 0.41 | 0.33, 0.49
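For readers who want to see where such odds ratios come from, here is a minimal pure-Python sketch that fits the logistic model by Newton-Raphson on the grouped counts of the energy intake example. It is an illustration under stated assumptions, not the software used for the slides: the function names `solve` and `fit_logistic` are hypothetical, the fit uses a fixed number of iterations instead of a convergence test, and no confidence intervals are computed.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_logistic(rows, iterations=25):
    """Fit log(odds) = b0 + b1*x1 + ... to grouped binary data.

    rows: list of (covariates, cases, total) with covariates a tuple of numbers.
    Returns the coefficients [b0, b1, ...] on the log-odds scale.
    """
    k = len(rows[0][0]) + 1
    beta = [0.0] * k
    for _ in range(iterations):
        score = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for x, cases, total in rows:
            xs = (1.0,) + tuple(x)  # prepend the intercept term
            eta = sum(b * v for b, v in zip(beta, xs))
            p = 1.0 / (1.0 + math.exp(-eta))
            for i in range(k):
                score[i] += xs[i] * (cases - total * p)
                for j in range(k):
                    info[i][j] += xs[i] * xs[j] * total * p * (1.0 - p)
        step = solve(info, score)  # Newton-Raphson update
        beta = [b + s for b, s in zip(beta, step)]
    return beta


# Crude model: X1 = energy intake (1 = high, 0 = low)
crude = fit_logistic([((1,), 730, 1330), ((0,), 700, 1240)])

# Adjusted model: (X1 = energy intake, X2 = physical activity), 1 = high
adjusted = fit_logistic([
    ((1, 1), 500, 1010),
    ((1, 0), 230, 320),
    ((0, 1), 100, 250),
    ((0, 0), 600, 990),
])

print(f"crude OR, energy intake: {math.exp(crude[1]):.2f}")
print(f"adjusted OR, energy intake: {math.exp(adjusted[1]):.2f}")
print(f"adjusted OR, physical activity: {math.exp(adjusted[2]):.2f}")
```

With a single binary exposure the fitted odds ratio equals the crude odds ratio from the 2×2 table; adding physical activity to the model gives an adjusted estimate that can be compared with the Mantel-Haenszel pooled odds ratio.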
Multiple linear regression
Outcome variable (y-variable): numerical
e.g. blood pressure, forced expiratory volume in 1 sec (FEV$_1$)

Linear regression model
$y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \dots + \beta_k X_k$
$y$: numerical outcome variable
$\beta_1, \dots, \beta_k$: increase in y for every unit increase in the corresponding x
$X_1, \dots, X_k$: k different exposure variables (can be numerical or categorical with 2+ categories)
Useful when there are many confounding variables
Lecture 10 - Objectives
Understand confounding
Calculate the Mantel-Haenszel estimate of the pooled odds ratio
Understand the difference between linear and
logistic regression
Thank You
www.HelpWithAssignment.com
