
American Journal of

EPIDEMIOLOGY
Volume 160
Number 4
August 15, 2004

Copyright 2004 by The Johns Hopkins Bloomberg School of Public Health
Sponsored by the Society for Epidemiologic Research
Published by Oxford University Press

SPECIAL ARTICLE
Model-based Estimation of Relative Risks and Other Epidemiologic Measures in
Studies of Common Outcomes and in Case-Control Studies

Sander Greenland
From the Departments of Epidemiology and Statistics, University of California, Los Angeles, Los Angeles, CA.

Received for publication November 7, 2003; accepted for publication May 27, 2004.

Some recent articles have discussed biased methods for estimating risk ratios from adjusted odds ratios when
the outcome is common, and the problem of setting confidence limits for risk ratios. These articles have
overlooked the extensive literature on valid estimation of risks, risk ratios, and risk differences from logistic and
other models, including methods that remain valid when the outcome is common, and methods for risk and rate
estimation from case-control studies. The present article describes how most of these methods can be subsumed
under a general formulation that also encompasses traditional standardization methods and methods for
projecting the impact of partially successful interventions. Approximate variance formulas for the resulting
estimates allow interval estimation; these intervals can be closely approximated by rapid simulation procedures
that require only standard software functions.
absolute risk; case-control studies; clinical trials; cohort studies; logistic regression; odds ratio; relative risk; risk
assessment

Abbreviation: GEE, generalized estimating equations.

Recently, McNutt et al. (1) noted a bias in a popular method by Zhang and Yu (2) for converting odds ratios to
risk ratios. Both articles overlooked the extensive literature
on estimating relative risks and other measures from fitted
models. This literature addresses the problems they noted
and provides valid methods for all study designs, including
case-control studies, cohort studies, and clinical trials.
BACKGROUND

In 1989, Holland (3) proposed an adjusted risk difference using the same biased risk-ratio formula
rediscovered by Zhang and Yu (2). Greenland and Holland (4) described the biases in that method and
gave valid formulas for converting odds-ratio estimates into risk-difference estimates. There are now
many model-based estimates and confidence intervals for risks (incidence proportions), rates, and
their ratios and differences (5, pp. 414–415; 6–14) and for attributable fractions (15, 16). These
methods can make use of input from logistic and other models, and most require no rare-disease
assumption. One can also get valid confidence intervals for risk ratios by using Poisson regression
with robust (generalized estimating equations (GEE) or "sandwich") variance estimates (10), which
avoid the overly wide intervals noted elsewhere (1).

Correspondence to Dr. Sander Greenland, Departments of Epidemiology and Statistics, University of California, Los Angeles, Los Angeles, CA
90095-1772 (e-mail: lesdomes@ucla.edu).

Am J Epidemiol 2004;160:301–305

To describe the general ideas, suppose that r(x) is the risk
or rate at level x of a regressor vector X. X is a function of all
exposures, confounders, and modifiers in the model; it may
contain powers, product terms, splines, and so forth. For
example, a study of cannabis smoking and lung cancer might
use X = (cannabis grams/year, pack-years cigarettes, age,
$\text{age}^2$, female, $\text{age} \times \text{female}$), where female = 1 for women, 0
for men. Suppose we want to compare average risks or rates
when the distribution of X in a target population is p1(x)
versus p0(x). These distributions usually correspond to
everyone exposed versus everyone unexposed to some risk
factor. In policy applications, p1(x) and p0(x) may represent
the population distribution without versus with the application of some intervention program. Some X components
(e.g., age, sex) may have the same distribution in p1(x) and
p0(x); only those components affected by the exposure will
differ. For example, p1(x) could represent the existing joint
distribution of cannabis smoking, cigarette smoking, age,
and sex in the target, while p0(x) could represent the same
distribution of cigarette smoking, age, and sex, with zero
cannabis assigned to everyone (for everyone, the first entry
in X is shifted to zero, and the rest are unchanged).
The standardized (population-averaged) risk or rate under exposure or intervention $j$ is
$R_j = \sum_x p_j(x)\,r(x)$, where the sum is over the range of $X$ in the target. The adjusted risk
or rate ratio and difference are then $RR_{10} = R_1/R_0$ and $RD_{10} = R_1 - R_0$, respectively.
The adjusted attributable fraction is $(R_1 - R_0)/R_1 = RD_{10}/R_1 = (RR_{10} - 1)/RR_{10}$; the
population attributable fraction is the special case in which $p_1(x)$ is the current distribution
and $p_0(x)$ is the distribution after exposure removal (15, 16). The covariate-specific RR formula
given by McNutt et al. (1, p. 941) is a special case in which the distributions $p_1(x)$ and
$p_0(x)$ are concentrated at single values $x_1$ and $x_0$ of $X$, and the exposure differs between
$x_1$ and $x_0$ but the other covariates do not. For example, to compare the risk from use of
200 g/year of cannabis with that of nonuse among males aged 50 years with 10 cigarette pack-years of
smoking, if X = (cannabis grams/year, cigarette pack-years, age, $\text{age}^2$, female,
$\text{age} \times \text{female}$), one would take $x_1 = (200, 10, 50, 50^2, 0, 50 \times 0)$ and
$x_0 = (0, 10, 50, 50^2, 0, 50 \times 0)$.
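The definitions above can be sketched in a few lines of Python. This is an illustrative sketch, not code from the article: the function names, the logistic form chosen for r(x), and above all the coefficient values are hypothetical placeholders picked only so that the example runs. The example builds the two regressor vectors from the cannabis illustration and treats p1(x) and p0(x) as point masses at x1 and x0.

```python
import math

def make_x(cannabis_g_per_year, pack_years, age, female):
    """Regressor vector X = (cannabis, pack-years, age, age^2, female, age*female)."""
    return (cannabis_g_per_year, pack_years, age, age ** 2, female, age * female)

# HYPOTHETICAL coefficients, for illustration only (not estimates from any study).
ALPHA = -9.0
BETA = (0.002, 0.03, 0.05, 0.0002, -0.3, 0.001)

def risk(x):
    """Assumed logistic risk model r(x) = expit(ALPHA + x . BETA)."""
    u = ALPHA + sum(b * xi for b, xi in zip(BETA, x))
    return 1.0 / (1.0 + math.exp(-u))

def standardized_risk(p, r):
    """R_j = sum over x of p_j(x) * r(x); p maps each vector x to its probability."""
    return sum(prob * r(x) for x, prob in p.items())

def contrasts(p1, p0, r):
    """Return (R1, R0, RR10, RD10, AF) for comparison distributions p1 and p0."""
    r1 = standardized_risk(p1, r)
    r0 = standardized_risk(p0, r)
    return r1, r0, r1 / r0, r1 - r0, (r1 - r0) / r1

# Covariate-specific special case: all probability mass at single points x1 and x0.
x1 = make_x(200, 10, 50, 0)  # 200 g/year cannabis, male aged 50, 10 pack-years
x0 = make_x(0, 10, 50, 0)    # same man with no cannabis use
R1, R0, RR10, RD10, AF = contrasts({x1: 1.0}, {x0: 1.0}, risk)
```

With point-mass distributions the standardized ratio reduces to r(x1)/r(x0); passing dictionaries with several entries gives the general population-averaged contrast.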
ESTIMATION

Model-based confidence intervals for the above quantities can be obtained from variance formulas
(8–14). To avoid programming these formulas, intervals can instead be obtained by simulation or
other resampling methods such as bootstrapping (16–18). If the comparison distributions p1(x) and
p0(x) are also estimated, their estimates should also be resampled. There are many ways to make use
of the resampling distribution of estimates; one should avoid naive use of the percentiles of the
bootstrap distribution to set confidence limits, however (16, 17). Resampling methods allow use of
any model to fit the risks or rates r(x), and they may also be applied by replacing r(x) with other
outcome measures such as expected years of life lost or with clinical measures such as blood
pressure, CD4 count, and so forth.
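One way to picture the resampling idea is a nonparametric bootstrap of a standardized risk ratio, sketched below in Python. This is an assumption-laden illustration rather than the procedure of references 16 to 18: it uses the cohort counts of table 1 (deaths taken as observed risk times stratum total), stage-specific observed proportions as r(x), and limits formed from the bootstrap standard deviation of the log risk ratio instead of raw bootstrap percentiles. Because the standard (the stage distribution) is estimated from the same data, it is recomputed in every resample.

```python
import math
import random

# (deaths, total) by (receptor level, stage); deaths = observed risk x total.
COUNTS = {
    ("low", 1): (2, 12), ("high", 1): (5, 55),
    ("low", 2): (9, 22), ("high", 2): (17, 74),
    ("low", 3): (12, 14), ("high", 3): (9, 15),
}

def expand(counts):
    """One record (receptor, stage, died) per cohort member."""
    records = []
    for (receptor, stage), (deaths, total) in counts.items():
        records += [(receptor, stage, 1)] * deaths
        records += [(receptor, stage, 0)] * (total - deaths)
    return records

def log_rr(records):
    """Log standardized RR (low vs. high receptor), standardized to the stage
    distribution of the records supplied; None if the contrast is undefined."""
    n = len(records)
    totals, deaths, stage_n = {}, {}, {}
    for receptor, stage, died in records:
        key = (receptor, stage)
        totals[key] = totals.get(key, 0) + 1
        deaths[key] = deaths.get(key, 0) + died
        stage_n[stage] = stage_n.get(stage, 0) + 1
    r1 = r0 = 0.0
    for stage, m in stage_n.items():
        n_low = totals.get(("low", stage), 0)
        n_high = totals.get(("high", stage), 0)
        if n_low == 0 or n_high == 0:
            return None  # an empty cell makes the stage-specific risk undefined
        weight = m / n   # estimated p(x) for this stage
        r1 += weight * deaths[("low", stage)] / n_low
        r0 += weight * deaths[("high", stage)] / n_high
    if r1 <= 0 or r0 <= 0:
        return None
    return math.log(r1 / r0)

def bootstrap_limits(records, n_boot=2000, seed=1):
    """Point estimate and approximate 95% limits from the bootstrap SD of log RR."""
    rng = random.Random(seed)
    estimate = log_rr(records)
    draws = []
    for _ in range(n_boot):
        value = log_rr(rng.choices(records, k=len(records)))
        if value is not None:
            draws.append(value)
    mean = sum(draws) / len(draws)
    sd = math.sqrt(sum((d - mean) ** 2 for d in draws) / (len(draws) - 1))
    return (math.exp(estimate),
            math.exp(estimate - 1.96 * sd),
            math.exp(estimate + 1.96 * sd))

rr, lower, upper = bootstrap_limits(expand(COUNTS))
```

Skipping undefined resamples is a simplification of this sketch; with sparser data, that choice would itself need scrutiny.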

A key advantage of model-based estimates is that they do not require large numbers at each
regressor level; the regressor values may even be unique to each individual (as would be expected
when some covariates are continuous) (6–14). To avoid sparse-data artifacts, they do require that
the numbers of cases and noncases be adequate relative to the number of model parameters, although
this restriction can be reduced by using penalized estimation (shrinkage) or Bayesian methods to
fit the model (19–21).
Model-based estimates of r(x) can be sensitive to influential data points and to model misspecification, but they nonetheless tend to have a smaller mean-squared error than do
raw covariate-specific estimates (which become wildly
unstable in sparse data) if the model fits well (22, chap. 12).
When one standardizes over distributions similar to those in
the data, the resulting summary estimates will be far less
sensitive than the specific r(x) estimates; this robustness
derives from the tendency of residual errors to average to
zero over the data distribution.
EXAMPLE

Table 1 presents data on the relation of receptor level and staging to survival in a cohort of
women with breast cancer (23, table 5.3), along with risk estimates from use of several methods.
Let $x = (x_1, x_2, x_3)$, with $x_1$ an indicator of low-receptor status and $x_2$ and $x_3$
indicators of stage II and III. The first set of estimates is the observed proportion of women in
each column who died. The second set is from maximum-likelihood logistic regression using the model
$r(x) = \operatorname{expit}(\alpha + x\beta)$, where $\operatorname{expit}(u) = e^u/(1 + e^u)$;
the model fits well (e.g., likelihood-ratio p = 0.8) and the fitted risks are
$\operatorname{expit}(a + xb)$, where $a$ and $b$ are the $\alpha$ and $\beta$ estimates. The third
set is from a log-linear model $r(x) = e^{\alpha + x\beta}$ fit by binomial maximum likelihood (23);
this model also fits well (e.g., p = 0.8) and the fitted risks are $e^{a + xb}$. The fourth set is
from this log-linear model fit using the incorrect Poisson likelihood, as one would obtain by
entering the observed column totals as person-years in a Poisson regression program (1, 10, 14).
To obtain a standardized risk ratio comparing the low and high receptor group using the total group
as the standard, take p1(x) and p0(x) shown in the final two rows of table 1, multiply them against
a row of estimated risks, and sum the results to get the $R_1$ and $R_0$ estimates. Under the
log-linear model, the $RR_{10}$ estimate simplifies to $\exp(b_1)$, where $b_1$ is the estimated
$x_1$ coefficient. The estimated standardized risks, ratios, and differences are
$\hat{R}_1 = 0.392$, $\hat{R}_0 = 0.237$, $\widehat{RR}_{10} = 1.65$, and
$\widehat{RD}_{10} = 0.156$ if the observed proportions are used; $\hat{R}_1 = 0.401$,
$\hat{R}_0 = 0.239$, $\widehat{RR}_{10} = 1.68$, and $\widehat{RD}_{10} = 0.162$ if the logistic
model is used; $\hat{R}_1 = 0.371$, $\hat{R}_0 = 0.238$, $\widehat{RR}_{10} = 1.56$, and
$\widehat{RD}_{10} = 0.133$ if the log-linear model fit by binomial maximum likelihood is used; and
$\hat{R}_1 = 0.383$, $\hat{R}_0 = 0.235$, $\widehat{RR}_{10} = 1.63$, and
$\widehat{RD}_{10} = 0.148$ if log-linear Poisson regression is used. The Mantel-Haenszel estimates
(5, p. 271) are $\widehat{RR}_{10} = 1.62$ and $\widehat{RD}_{10} = 0.166$. Thus, all the valid
approximations yield similar results, as expected given the sample size and good fit of the models.
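The standardization step just described is a weighted sum, which the following Python sketch carries out for each row of risk estimates in table 1. The variable and function names are illustrative. Because the tabulated risks and weights are rounded, the third decimal of a result can differ slightly from the figures quoted in the text.

```python
# Estimated 5-year mortality risks per 1,000 from table 1, ordered as
# (stage I low, stage I high, stage II low, stage II high, stage III low, stage III high).
RISKS_PER_1000 = {
    "observed": (167, 91, 409, 230, 857, 600),
    "logistic": (190, 86, 422, 226, 816, 639),
    "binomial log-linear": (148, 95, 376, 242, 870, 558),
    "Poisson log-linear": (153, 94, 386, 237, 905, 555),
}

# Total-cohort stage distribution (stages I, II, III) from table 1.
STAGE_WEIGHTS = (0.349, 0.500, 0.151)

def standardize(risks_per_1000, weights=STAGE_WEIGHTS):
    """Return (R1, R0, RR10, RD10) with everyone set to low (R1) or high (R0) receptor level."""
    low = risks_per_1000[0::2]   # low-receptor risks by stage
    high = risks_per_1000[1::2]  # high-receptor risks by stage
    r1 = sum(w * r / 1000 for w, r in zip(weights, low))
    r0 = sum(w * r / 1000 for w, r in zip(weights, high))
    return r1, r0, r1 / r0, r1 - r0

results = {name: standardize(row) for name, row in RISKS_PER_1000.items()}
for name, (r1, r0, rr, rd) in results.items():
    print(f"{name:20s} R1={r1:.3f} R0={r0:.3f} RR={rr:.2f} RD={rd:.3f}")
```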
TABLE 1. Data relating receptor level (low, high) and stage (I, II, III) to 5-year breast cancer
mortality (23), observed and model-based estimates of average risk (incidence proportion) by
receptor level and stage, and distributions for standardizing receptor-level comparisons to
total-cohort stage distribution

                          Stage I         Stage II       Stage III
                           Low    High     Low    High     Low    High
Deaths                       2       5       9      17      12       9
Survivors                   10      50      13      57       2       6
Total                       12      55      22      74      14      15

Estimates of 5-year mortality risk (per 1,000)
  Observed*                167      91     409     230     857     600
  Logistic                 190      86     422     226     816     639
  Binomial†                148      95     376     242     870     558
  Poisson‡                 153      94     386     237     905     555

Comparison distributions§
  p1(x)                  0.349       0   0.500       0   0.151       0
  p0(x)                      0   0.349       0   0.500       0   0.151

* Deaths/total (nonparametric risk estimate).
† Log-linear risk model with receptor level and stage, fit by binomial maximum likelihood.
‡ Log-linear risk model fit by incorrect Poisson maximum likelihood.
§ p1(x) puts the total cohort at a low receptor level; p0(x) puts the total cohort at a high
receptor level.

In contrast, the odds-ratio estimate exp(b1) from the logistic model is 2.51, and the Zhang-Yu
risk-ratio estimate (2) is 1.89; both overestimate the risk ratio, as expected given that the
outcome is not rare (over a quarter of the patients died). Another invalid model-based adjustment
predicts an expected number of exposed cases E from a model without exposure, then divides E into
the observed number of exposed cases to get a standardized mortality ratio (22, sec. 4.3). This
approach underestimates risk ratios (24); using a logistic model with only x2 and x3 yields
E = 17.35 and a standardized mortality ratio of 23/17.35 = 1.33.
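For readers who want to see where the two invalid figures come from, the Python sketch below applies the conversion published by Zhang and Yu (2), RR = OR / (1 - P0 + P0 × OR), which this article does not restate. Taking P0 to be the crude risk among high-receptor women is an assumption of this sketch; the high-receptor death counts (5, 17, 9) are the table 1 observed risks multiplied by the stratum totals (55, 74, 15).

```python
def zhang_yu_rr(odds_ratio, p0):
    """Zhang-Yu conversion of an odds ratio, given baseline (unexposed) risk p0."""
    return odds_ratio / (1.0 - p0 + p0 * odds_ratio)

# High-receptor (unexposed) group in table 1, summed over stages I, II, III.
deaths_high = 5 + 17 + 9
total_high = 55 + 74 + 15
p0_crude = deaths_high / total_high  # assumed choice of P0: crude unexposed risk

rr_zhang_yu = zhang_yu_rr(2.51, p0_crude)  # 2.51 is the logistic odds ratio exp(b1)

# Standardized mortality ratio: observed exposed deaths / expected number E.
smr = 23 / 17.35

print(f"Zhang-Yu RR = {rr_zhang_yu:.2f}, SMR = {smr:.2f}")
```

Both numbers bracket the valid standardized estimates of roughly 1.6 to 1.7 from opposite sides, which is the pattern the text describes.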
If the observed proportions are used, 95 percent confidence limits for RR10 are 1.06, 2.58 and for
RD10 are 0.006, 0.304 (5, p. 263); if the logistic model is used, the limits for RR10 are 1.09,
2.57 and for RD10 are 0.013, 0.303 (8, 9); and, if the binomial log-linear model is used, the
limits for RR10 are $\exp(b_1 \pm 1.96\,v_1^{1/2})$ = 1.05, 2.30, where $v_1$ is the estimated
variance of $b_1$, and for RD10 are 0.023, 0.312 (9). Standard Poisson regression overestimates the
variance of b1, yielding limits for RR10 of 0.93, 2.87; nonetheless, GEE Poisson regression with
the robust variance estimate (available in the Stata command xtgee and in SAS PROC GENMOD (25))
yields limits for RR10 of 1.07, 2.48. The Mantel-Haenszel limits for RR10 are 1.09, 2.39 and for
RD10 are 0.016, 0.316 (5, p. 271). The log-linear model fit by binomial maximum likelihood supplies
the narrowest risk-ratio interval because it is the only one of these methods that is fully
efficient under the model.
Simulated 95 percent confidence limits (16) from 400,001 coefficient resamplings (which avoid
complex variance formulas) were nearly the same: 1.09, 2.54 for RR10 and 0.023, 0.312 for RD10
using logistic regression; 1.05, 2.30 for RR10 and 0.023, 0.312 for RD10 using binomial log-linear
regression; and 1.07, 2.47 for RR10 and 0.021, 0.299 for RD10 using Poisson regression.
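A rough Python sketch of coefficient resampling for the simplest case, the binomial log-linear model in which RR10 = exp(b1), is given below. Several things here are assumptions rather than material from the article: b1 and its standard error are back-calculated from the reported estimate (1.56) and limits (1.05, 2.30); the coefficient is drawn from a normal distribution centered at b1; and the limits are read off as the 2.5th and 97.5th percentiles of the simulated risk ratios. The number of draws is also smaller than the 400,001 used in the text.

```python
import math
import random

# Inputs back-calculated from the reported binomial log-linear results
# (RR = 1.56, limits 1.05 and 2.30); treat them as assumptions of this sketch.
b1 = math.log(1.56)
se_b1 = (math.log(2.30) - math.log(1.05)) / (2 * 1.96)

def wald_limits(b, se, z=1.96):
    """Limits exp(b -/+ z * se) from the variance formula."""
    return math.exp(b - z * se), math.exp(b + z * se)

def simulated_limits(b, se, n_draws=100_000, seed=2004):
    """Percentile limits for exp(b*) with b* drawn from Normal(b, se^2)."""
    rng = random.Random(seed)
    draws = sorted(math.exp(rng.gauss(b, se)) for _ in range(n_draws))
    lower = draws[int(0.025 * n_draws)]
    upper = draws[int(0.975 * n_draws)]
    return lower, upper

wald_lo, wald_hi = wald_limits(b1, se_b1)
sim_lo, sim_hi = simulated_limits(b1, se_b1)
print(f"Wald:      {wald_lo:.2f}, {wald_hi:.2f}")
print(f"Simulated: {sim_lo:.2f}, {sim_hi:.2f}")
```

For measures that depend on several coefficients, such as a standardized risk difference from the logistic model, the same idea would require joint draws of the whole coefficient vector using its estimated covariance matrix, followed by recomputing the measure for each draw.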
CASE-CONTROL STUDIES

Cumulative case-control studies sample cases and controls from cohort members who do and do not
get disease by the end of follow-up (5, pp. 110–111). Given a valid estimate of the crude (overall)
risk rc in the target population or of the ratio of case-control sampling fractions rf, one can
estimate the covariate-specific risks in the target (and hence their differences and ratios) even
if the disease is common. If the data are not sparse, one can use results from case-control
modeling to estimate risks or rates and their contrasts (5, pp. 418–419; 26–32). For models (such
as the logistic) in which the baseline odds is a multiplicative factor, ln(rc) or ln(rf) becomes a
simple adjustment term to the model intercept (5, pp. 417–419; 26, 27); other models can be used,
however (28, 31). Similar methods can be used to estimate risks from case-cohort studies, in which
controls are sampled from all cohort members, not just noncases (5, pp. 417, 419; 33).
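The intercept adjustment for the logistic model can be illustrated with a short Python sketch. It assumes that cases are sampled with fraction f_case and noncases with fraction f_control, so that rf = f_case/f_control and every odds in the case-control sample equals rf times the corresponding population odds; subtracting ln(rf) from the case-control intercept then restores the population risks. All numerical values and names below are hypothetical.

```python
import math

def expit(u):
    return 1.0 / (1.0 + math.exp(-u))

def population_risk(x, a_cc, b, rf):
    """Risk at covariate vector x after shifting the case-control intercept a_cc
    by -ln(rf), where rf = (case sampling fraction) / (control sampling fraction)."""
    a_pop = a_cc - math.log(rf)
    return expit(a_pop + sum(bi * xi for bi, xi in zip(b, x)))

# HYPOTHETICAL population model and sampling fractions, for checking the algebra.
a_true, b_true = -1.5, (0.9,)
f_case, f_control = 1.0, 0.1
rf = f_case / f_control

# Intercept that a logistic model fitted to the case-control sample would target.
a_cc = a_true + math.log(rf)

for x_value in (0.0, 1.0):
    true_risk = expit(a_true + b_true[0] * x_value)
    recovered = population_risk((x_value,), a_cc, b_true, rf)
    print(f"x={x_value}: true risk {true_risk:.4f}, recovered {recovered:.4f}")
```

The slope coefficients need no adjustment in this setting, which is why only the intercept appears in the correction.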
In density case-control studies, controls are sampled longitudinally from those at risk, in
proportion to person-time (5, pp. 93–96). No adjustments are then needed to estimate rate ratios
from the fitted logistic model (5, pp. 416–417; 34, 35), and intercept adjustments analogous to the
cumulative formulas can be used to estimate rates and rate differences (5, p. 417; 27–30, 32).
If the analysis strata are small (sparse), as in matched analyses, special summary methods may be needed to estimate
exposure-specific risks (36). To avoid sparse data, one often
sees matching factors entered as simple terms in an
unmatched analysis. Unfortunately, this strategy can
produce bias if the matching factors are not ignorable and are
modeled as continuous (e.g., age is entered directly despite
being matched within 5-year categories), because case-control matching creates discontinuities in the sample factor-outcome relation at matching-category boundaries (37).
DISCUSSION

Traditional impact measures such as attributable fractions take p1(x) to be the current population
distribution and p0(x) the distribution after complete exposure removal (5, p. 58; 13, 15, 18).
These measures can be very misleading for policy projections: Feasible interventions can rarely
achieve anything near complete exposure removal, may have untoward side effects (including adverse
effects on quality of life or resources available for other purposes), and may affect the size of
the population at risk. Hence, intelligent policy input requires consideration of the full spectrum
of intervention limitations and side effects, rather than just traditional estimates (38–40). It
further requires quantitative assessments of bias, as well as of random error (41–46); simulation
confidence intervals are easily extended to subsume this task (16). Finally, because epidemiologic
textbooks persist in erroneous claims otherwise (e.g., 47, p. 201), it is worth noting that
attributable fractions do not approximate the etiologic fraction (fraction of cases caused by
exposure) or the probability of causation, even if the disease is rare (48–50).
Rates are often substituted for risks when estimating
impact measures. This substitution overstates impact on the
study outcome when the exposure at issue strongly affects
person-time at risk, as can occur when exposure affects other
outcomes (5, p. 63; 51). One can reduce this problem by
converting rates to risk estimates before standardizing, for
example, by stratifying on follow-up time and then applying
the exponential formula (5, p. 40): If the fitted rate in period
$k$ is $r_k(x)$ and the length of period $k$ is $t_k$, an estimate of the
risk over periods 1 through $K$ is $1 - \exp[-\sum_k r_k(x)\,t_k]$, where
the sum is from $k = 1$ to $k = K$. One can also estimate risks
from rates via survival models, which allow use of continuous time (52).
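As a small worked illustration of the exponential formula, the Python sketch below converts period-specific rates into a cumulative risk. The function name and the example rates are hypothetical; in practice the rates r_k(x) would come from a fitted model evaluated at a given x.

```python
import math

def risk_from_rates(rates, lengths):
    """Exponential formula: risk = 1 - exp(-sum_k r_k * t_k) over periods k = 1..K."""
    if len(rates) != len(lengths):
        raise ValueError("rates and lengths must have the same number of periods")
    cumulative_hazard = sum(r * t for r, t in zip(rates, lengths))
    return 1.0 - math.exp(-cumulative_hazard)

# Hypothetical fitted rates (per person-year) in two consecutive 5-year periods.
ten_year_risk = risk_from_rates([0.01, 0.02], [5, 5])
print(f"10-year risk = {ten_year_risk:.4f}")
```

Risks computed this way for each x can then be standardized over p1(x) and p0(x) in place of the rates themselves.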

ACKNOWLEDGMENTS

The author thanks Katherine Hoggatt for helpful comments.

REFERENCES
1. McNutt LA, Wu C, Xue X, et al. Estimating the relative risk in cohort studies and clinical
trials of common outcomes. Am J Epidemiol 2003;157:940–3.
2. Zhang J, Yu KF. What's the relative risk? A method of correcting the odds ratio in cohort
studies of common outcomes. JAMA 1998;280:1690–1.
3. Holland PW. A note on the covariance of the Mantel-Haenszel log-odds-ratio estimator and the
sample marginal rates. Biometrics 1989;45:1009–16.
4. Greenland S, Holland PW. Estimating standardized risk differences from odds ratios. Biometrics
1991;47:319–22.
5. Rothman KJ, Greenland S, eds. Modern epidemiology. 2nd ed. Philadelphia, PA: Lippincott-Raven,
1998.
6. Lee J. Covariance adjustment of rates based on the multiple logistic regression model. J Chronic
Dis 1981;34:415–26.
7. Lane PW, Nelder JA. Analysis of covariance and standardization as instances of prediction.
Biometrics 1982;38:613–21.
8. Flanders WD, Rhodes PH. Large-sample confidence intervals for regression standardized risks,
risk ratios, and risk differences. J Chronic Dis 1987;40:697–704.
9. Greenland S. Estimating standardized parameters from generalized linear models. Stat Med
1991;10:1069–74.
10. Stijnen T, van Houwelingen HC. Relative risk, risk difference and rate difference models for
sparse stratified data: a pseudolikelihood approach. Stat Med 1993;12:2285–303.
11. Greenland S. Modeling risk ratios from matched cohort data: an estimating equation approach.
Appl Stat 1994;43:223–32.
12. Joffe MM, Greenland S. Estimation of standardized parameters from categorical regression
models. Stat Med 1995;14:2131–41.
13. Bruzzi P, Green SB, Byar DP, et al. Estimating the population attributable risk for multiple
risk factors using case-control data. Am J Epidemiol 1985;122:904–14.
14. Cummings P, McKnight B, Greenland S. Matched cohort methods for injury research. Epidemiol Rev
2003;25:43–50.
15. Greenland S, Drescher K. Maximum likelihood estimation of attributable fractions from logistic
models. Biometrics 1993;49:865–72.
16. Greenland S. Interval estimation by simulation as an alternative to and extension of confidence
intervals. Int J Epidemiol (in press).
17. Carpenter J, Bithell J. Bootstrap confidence intervals: when, which, and what? Stat Med
2000;19:1141–64.
18. Greenland S. Estimating population attributable fractions from fitted incidence ratios and
exposure survey data, with an application to electromagnetic fields and childhood leukemia.
Biometrics 2001;57:182–8.
19. Greenland S, Schwartzbaum JA, Finkle WD. Problems due to small samples and sparse data in
conditional logistic regression analysis. Am J Epidemiol 2000;151:531–9.
20. Greenland S. When should epidemiologic regressions use random coefficients? Biometrics
2000;56:915–21.
21. Greenland S. Putting background information about relative risks into conjugate priors.
Biometrics 2001;57:663–70.
22. Bishop YMM, Fienberg SE, Holland PW. Discrete multivariate analysis. Cambridge, MA: MIT Press,
1975.
23. Newman SC. Biostatistical methods in epidemiology. New York, NY: Wiley, 2001.
24. Greenland S. Bias in methods for deriving standardized morbidity ratios and attributable
fraction estimates. Stat Med 1984;3:131–41.
25. Zou G. A modified Poisson regression approach to prospective studies with binary data. Am J
Epidemiol 2004;159:702–6.
26. Anderson JA. Separate-sample logistic discrimination. Biometrika 1972;59:19–35.
27. Greenland S. Multivariate estimation of exposure-specific incidence from case-control studies.
J Chronic Dis 1981;34:445–53.


28. Nurminen M. Assessment of excess risks in case-base studies. J Clin Epidemiol 1992;45:1081–92.
29. Benichou J, Wacholder S. A comparison of three approaches to estimate exposure-specific
incidence rates from population-based case-control data. Stat Med 1994;13:651–61.
30. Benichou J, Gail MH. Methods of inference for estimates of absolute risk derived from
population-based case-control studies. Biometrics 1995;51:182–94.
31. Wacholder S. The case-control study as data missing by design: estimating risk differences.
Epidemiology 1996;7:144–50.
32. King G, Zeng L. Estimating risk and rate levels, ratios and differences in case-control
studies. Stat Med 2002;21:1409–27.
33. Schouten EG, Dekker JM, Kok FJ, et al. Risk ratio and rate estimation in case-cohort designs.
Stat Med 1993;12:1733–45.
34. Sheehe PR. Dynamic risk analysis in retrospective matched-pair studies of disease. Biometrics
1962;18:323–41.
35. Prentice RL, Breslow NE. Retrospective studies and failure-time models. Biometrika
1978;65:153–8.
36. Greenland S. Estimation of exposure-specific rates from sparse case-control data. J Chronic Dis
1987;40:1087–94.
37. Greenland S. Partial and marginal matching in case-control studies. In: Moolgavkar SH, Prentice
RL, eds. Modern statistical methods in chronic disease epidemiology. New York, NY: Wiley,
1986:35–49.
38. Morgenstern H, Bursic ES. A method for using epidemiologic data to estimate the potential
impact of an intervention on the health status of a target population. J Community Health
1982;7:292–309.
39. Greenland S. Causality theory for policy uses of epidemiologic measures. In: Murray CJL,
Salomon JA, Mathers CD, et al, eds. Summary measures of population health. Geneva, Switzerland:
World Health Organization, 2002:291–302.
40. Poole C. Generalized effect estimation: An antidote to utopian preventive fantasies.
(Abstract). Am J Epidemiol 2003;157:S59.
41. Eddy DM, Hasselblad V, Schachter R. Meta-analysis by the confidence profile method. New York,
NY: Academic Press, 1992.
42. Lash TL, Fink AK. Semi-automated sensitivity analysis to assess systematic errors in
observational epidemiologic data. Epidemiology 2003;14:451–8.
43. Phillips CV. Quantifying and reporting uncertainty from systematic errors. Epidemiology
2003;14:459–66.
44. Greenland S. The impact of prior distributions for uncontrolled confounding and response bias.
J Am Stat Assoc 2003;98:47–54.
45. Greenland S. Multiple-bias modeling for observational studies (with discussion). J R Stat Soc
(A) (in press).
46. Steenland K, Greenland S. Monte Carlo sensitivity analysis and Bayesian analysis of smoking as
an unmeasured confounder in a study of silica and lung cancer. Am J Epidemiol 2004;160:384–92.
47. Koepsell TD, Weiss NS. Epidemiologic methods. New York, NY: Oxford University Press, 2003.
48. Greenland S, Robins JM. Conceptual problems in the definition and interpretation of
attributable fractions. Am J Epidemiol 1988;128:1185–97.
49. Greenland S. The relation of the probability of causation to the relative risk and the doubling
dose: a methodologic error that has become a social problem. Am J Public Health 1999;89:1166–9.
50. Greenland S, Robins JM. Epidemiology, justice, and the probability of causation. Jurimetrics
2000;40:321–40.
51. Greenland S. Absence of confounding does not correspond to collapsibility of the rate ratio or
rate difference. Epidemiology 1996;7:498–501.
52. Kalbfleisch JD, Prentice RL. The statistical analysis of failure time data. 2nd ed. New York,
NY: Wiley, 2002.