
STANDARDIZATION AS A METHOD TO CONTROL CONFOUNDING

Stratification vs. Standardization

In analytical epidemiology, the purpose is to investigate
etiologic, or causal, associations between disease and exposure.
In controlling for confounding, we attempt to obtain an
undistorted estimate of the strength of the hypothesized
relationship.

When the purpose is to compare health status in different
populations, i.e., membership in the population is the
exposure, standardization of summary rates is carried out in
order to take into account differences in characteristics
between the two populations, such as age, gender, etc.
Standardized rates are useful in a public health context, and
are used for descriptive as well as analytical purposes.

The goal in either case is to account for any mixing of a third
factor (or multiple other factors) with the primary association
of interest.
Standardization of rates is commonly used when comparing
mortality rates or incidence rates across populations.

Examples of populations compared:

Geography: Nations, states, counties, cities, community areas,
census tracts, etc.

Groups: Occupational groups (typically blue collar, more
recently white collar occupation studies), members of HMOs,
groups categorized by race/ethnicity, etc.

Summary adjusted rates provide a kind of snapshot of the
overall risk of disease or death, which can be compared across
populations. Similar to stratification procedures, however, a
summary rate sometimes masks differences across strata
that may be relevant for public health or etiologic purposes.
An age adjusted summary rate, for example, will not reveal whether
the age specific relative risks for two populations are
homogeneous.
Since age is positively correlated with most chronic diseases
and risk of death, and since most populations have different
age structures, age meets the definition of a confounder, and
age adjusted rates are generated in order to make comparisons
across different populations.

Example: CHD rates increase with age.
Population A has 12% of population over age of 65
Population B has 6% of population over age of 65
Higher crude death rates would be expected in
population A on the basis of age difference alone.

From a public health perspective, it is more relevant
to know if there are factors other than age that
contribute to the difference in CHD rates that are
amenable to interventions.
What distinguishes standardization of rates from other
stratified methods of controlling for confounding is the use
of an external standard as the basis for comparison.
For example, when adjusting for age using direct
standardization, the external standard is an age
distribution. This can be the World age distribution,
state or nation age distribution, or can be one of the
populations being compared, or a combination of the
two populations being compared.

In indirect age standardization, the external standard is
a set of age specific rates applied to the study age
distribution.
Direct standardization applies the stratum specific rates
(i.e. age specific rates) of each study population to the
number of individuals in the corresponding stratum in
the standard population. An expected number of
cases/deaths is generated for each stratum, and the total
expected is used in the numerator of the adjusted rate.
The denominator is the total number of individuals in
the standard population. Adjusted rates can then be
compared for different populations using Relative
Risks, or Attributable Risk differences.

Adjusted Relative Risk: Ratio of two adjusted rates

Adjusted Attributable Risk: Difference between two
adjusted rates

The adjusted rates are the rates the populations would
have experienced if they had the same distribution on
the confounding factor. Age adjusted rates, for
example, are the rates the different populations would
have had if they all had the same age distribution. Age
adjusted rates mean nothing by themselves; they are
only used in comparison with other age adjusted rates.
Assumptions

If the purpose of adjustment is to compare rates across
populations, adjusted RR or AR assume that the effects
are homogeneous across strata of the confounding
variable. The adjusted RR or AR is a weighted average
that should reasonably represent stratum specific RR
or AR.

Homogeneity can be assessed by either an additive
model (AR), or multiplicative model (RR). If there is
no additive interaction, it is appropriate to use adjusted
AR. If there is no multiplicative interaction, it is
appropriate to use adjusted RR.
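As a rough illustration of the homogeneity check described above, the sketch below computes the stratum specific AR and RR for two age strata. The rates (20% vs 10% and 50% vs 40%) are illustrative values, and the function name `stratum_measures` is hypothetical.

```python
def stratum_measures(rate_a, rate_b):
    """Return (AR, RR) for one stratum, given the rate in each group."""
    return rate_a - rate_b, rate_a / rate_b

# Illustrative stratum specific rates (as proportions): (group A, group B).
strata = {"<40": (0.20, 0.10), ">=40": (0.50, 0.40)}

measures = {age: stratum_measures(a, b) for age, (a, b) in strata.items()}
for age, (ar, rr) in measures.items():
    # Equal AR across strata suggests no additive interaction;
    # equal RR across strata suggests no multiplicative interaction.
    print(f"{age}: AR={ar:.2f}, RR={rr:.2f}")
```

With these values the AR is the same in both strata while the RR is not, so an adjusted AR would be a reasonable summary but an adjusted RR would depend on the standard chosen.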
EXAMPLE
Table 7-4 in Szklo

Example when stratum specific attributable risks are
homogeneous

            Study Group A              Study Group B
Age       N    Cases  Rate (%)       N    Cases  Rate (%)
<40      100     20     20          400     40     10
>=40     200    100     50          200     80     40
Total    300    120     40          600    120     20

Attributable Risk by age:        Relative Risk by age:
<40     20-10=10%                <40     20/10=2.00
>=40    50-40=10%                >=40    50/40=1.25

Calculation of Age adjusted Estimates:

         Younger Standard Population       Older Standard Population
Age       N    Expected    Expected         N    Expected    Expected
               cases using cases using           cases using cases using
               A rates     B rates               A rates     B rates
<40      500     100          50           100      20          10
>=40     100      50          40           500     250         200
Total    600     150          90           600     270         210

Adjusted rate A: 150/600=25%            Adjusted rate A: 270/600=45%
Adjusted rate B: 90/600=15%             Adjusted rate B: 210/600=35%

AR=25-15=10%                            AR=45-35=10%
RR A/B=.25/.15=1.67                     RR A/B=.45/.35=1.29

The relative risk is higher in the younger stratum (2.0) than in the
older stratum (1.25), so using the younger population as the standard
results in a higher adjusted relative risk.
Adjusted AR are the same, regardless of standard
population, because AR are homogeneous across
strata. Adjusted RR are different due to lack of
homogeneity across strata.
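The direct adjustment above can be sketched in a few lines. This is a minimal illustration using the stratum specific rates and the two standard populations from the example; the function name `direct_adjusted_rate` is hypothetical.

```python
def direct_adjusted_rate(stratum_rates, standard_counts):
    """Apply stratum specific rates to a standard population.

    Expected cases are summed over strata and divided by the
    total size of the standard population.
    """
    expected = [r * n for r, n in zip(stratum_rates, standard_counts)]
    return sum(expected) / sum(standard_counts)

# Stratum specific rates (<40, >=40) for each study group.
rates_a = [20 / 100, 100 / 200]
rates_b = [40 / 400, 80 / 200]

# Standard populations (<40, >=40).
standards = {"younger": [500, 100], "older": [100, 500]}

results = {}
for name, std in standards.items():
    adj_a = direct_adjusted_rate(rates_a, std)
    adj_b = direct_adjusted_rate(rates_b, std)
    results[name] = (adj_a, adj_b, adj_a - adj_b, adj_a / adj_b)
    print(f"{name}: A={adj_a:.2f} B={adj_b:.2f} "
          f"AR={adj_a - adj_b:.2f} RR={adj_a / adj_b:.2f}")
```

The adjusted AR is 0.10 under both standards, while the adjusted RR changes from about 1.67 to about 1.29.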

If stratum specific AR or RR are not homogeneous,
choice of standard population will affect the adjusted
AR or RR. It may be preferable, in this situation, to use
stratum specific AR or RR.
Example when stratum specific RR are homogeneous,
but AR are not homogeneous.

Table 7-8 in Szklo


            Study Group A              Study Group B
Age       N    Cases  Rate (%)       N    Cases  Rate (%)
<40      100      6      6          400     12      3
>=40     200     60     30          200     30     15
Total    300     66     22          600     42      7

Attributable Risk by age:        Relative Risk by age:
<40     6-3=3%                   <40     6/3=2.00
>=40    30-15=15%                >=40    30/15=2.00

Calculation of Age adjusted Estimates:

         Younger Standard Population       Older Standard Population
Age       N    Expected    Expected         N    Expected    Expected
               cases using cases using           cases using cases using
               A rates     B rates               A rates     B rates
<40      500      30          15           100       6           3
>=40     100      30          15           500     150          75
Total    600      60          30           600     156          78

Adjusted rate A: 60/600=10%             Adjusted rate A: 156/600=26%
Adjusted rate B: 30/600=5%              Adjusted rate B: 78/600=13%

AR=10-5=5%                              AR=26-13=13%
RR A/B=.10/.05=2.00                     RR A/B=.26/.13=2.00

Higher attributable risk using older standard population, because
attributable risk is greater in the older than the younger stratum (15% vs 3%)
Issues in Direct Adjustment

Used when rates are to be compared. The absolute
value of an adjusted rate will vary depending on the
standard population, so the absolute rate is not usually
of interest.

Choice of Standard Population:

Depends on populations being compared

Standard populations based on geography:

If rates in countries are being compared, can use World
Standard population. World standard is often used in
comparison of mortality rates by country.

State rates: can use US population as standard.

For local rates (e.g. counties or cities) can use the state
standard population. For example, Illinois State
Cancer Registry publishes age adjusted rates by county
using Illinois standard population.

If calculating rates for an occupational group within a
metropolitan area, can use metropolitan area as
standard.
Other standard populations:

1. Artificial populations (e.g. 1000 subjects in each
stratum)

2. Combined study group populations (e.g. for
comparison of two counties, can add number of subjects
in each stratum from each county)

3. Use stratum specific numbers from one of the study
groups. Eliminates the need to calculate age adjusted
rate for that group, because the crude=adjusted for that
group. If one study group is relatively small, use that
population as the standard, because its age specific
rates will be unstable due to small numbers.

4. Minimum variance method. Useful when sample
sizes are small: produces statistically stable adjusted
estimates using population sizes from both study
samples. (Alternatively can use indirect method if
sample sizes are small).
Can adjust for more than one confounder in direct or
indirect adjustment

Rates are often calculated adjusted for age, race, sex.
However, for public health purposes, rates are also
often calculated separately by sex and race. Illinois State
Cancer Registry calculates rates separately by
race/ethnicity and sex, so that health disparity issues can be
addressed. Calendar time is also treated as a confounder,
because rates can vary over time.

If rates vary across different groups, summary adjusted
rate may not be appropriate, particularly if intervention
is planned.

If there are too many strata, some cell sizes will be
sparse and rates will be unstable. Indirect adjustment can
be used instead.
Adjustment for more than one confounder:

Rates by age, race, gender, calendar time:

                        Male
                White                   Black
Age       1990-1994  1995-1999    1990-1994  1995-1999
0-4
5-9
10-14
.
.
.
.
85+

Apply each stratum specific rate to a standard
population, can get age, race, gender, calendar year
adjusted rate.

Or, can calculate age, calendar adjusted rates
separately by race and gender.
EXAMPLE USING DIRST in PEPI

Enter weights or population

Enter rates or numerator and denominator for each
population

If only rates are entered, can only calculate adjusted rate
(no SE or confidence interval calculated)

Standard population used: US, 1970

1970 Mortality rates by age: California and Maine,
8 age groups
DIRST - Direct Standardization
Thursday, 19th September 2002.

DATA
Weights expressed as proportions:
Stratum Weight
1 0.28492
2 0.17440
3 0.12257
4 0.11362
5 0.11426
6 0.09148
7 0.06120
8 0.03755

California:
1 Numerator = 8751 Denominator = 5524000
2 Numerator = 4747 Denominator = 3558000
3 Numerator = 4036 Denominator = 2677000
4 Numerator = 6701 Denominator = 2359000
5 Numerator = 15675 Denominator = 2330000
6 Numerator = 26276 Denominator = 1704000
7 Numerator = 36259 Denominator = 1105000
8 Numerator = 63840 Denominator = 696000

Standardized rate = 8.823 per 1000

Standard error of standardized rate = 0.021 per 1000

90% confidence interval = 8.788 to 8.858 per 1000


95% confidence interval = 8.782 to 8.864 per 1000
99% confidence interval = 8.769 to 8.877 per 1000

Total denominator = 19953000


DATA
Maine:
1 Numerator = 535 Denominator = 286000
2 Numerator = 192 Denominator = 168000
3 Numerator = 152 Denominator = 110000
4 Numerator = 313 Denominator = 109000
5 Numerator = 759 Denominator = 110000
6 Numerator = 1622 Denominator = 94000
7 Numerator = 2690 Denominator = 69000
8 Numerator = 4788 Denominator = 46000

Standardized rate = 9.889 per 1000

Standard error of standardized rate = 0.092 per 1000

90% confidence interval = 9.737 to 10.040 per 1000


95% confidence interval = 9.708 to 10.069 per 1000
99% confidence interval = 9.652 to 10.126 per 1000

Total denominator = 992000


Crude rate for California: 8.3 per 1000
Crude rate for Maine: 11.1 per 1000
Crude relative risk, Maine/California: 1.34
Crude AR, Maine-California: 2.8 per 1000

Age adjusted rate for California: 8.82 per 1000
Age adjusted rate for Maine: 9.89 per 1000
RR adjusted for age: 1.12
AR adjusted for age: 1.07 per 1000

The crude RR is inflated by the older age distribution in Maine.
Maine still has a higher mortality rate than California after
adjusting for age, so some factor other than age contributes to this difference.
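The DIRST results can be approximately reproduced with a short sketch. Two assumptions are made here that the output does not state: the stratum 7 weight is taken as 0.06120 (the value that makes the weights sum to 1), and the standard error uses a binomial variance for each stratum rate, which may or may not be the exact formula PEPI uses. The function name `standardized_rate` is hypothetical.

```python
import math

# Standard weights (US 1970, as proportions); stratum 7 is an assumed value.
weights = [0.28492, 0.17440, 0.12257, 0.11362,
           0.11426, 0.09148, 0.06120, 0.03755]

# (deaths, population) per age stratum, taken from the DIRST data listing.
california = [(8751, 5524000), (4747, 3558000), (4036, 2677000),
              (6701, 2359000), (15675, 2330000), (26276, 1704000),
              (36259, 1105000), (63840, 696000)]
maine = [(535, 286000), (192, 168000), (152, 110000), (313, 109000),
         (759, 110000), (1622, 94000), (2690, 69000), (4788, 46000)]

def standardized_rate(strata, weights):
    """Return (directly standardized rate, standard error) as proportions."""
    rate = 0.0
    variance = 0.0
    for (deaths, pop), w in zip(strata, weights):
        p = deaths / pop
        rate += w * p
        # Assumed binomial variance of each stratum specific rate.
        variance += w ** 2 * p * (1 - p) / pop
    return rate, math.sqrt(variance)

ca_rate, ca_se = standardized_rate(california, weights)
me_rate, me_se = standardized_rate(maine, weights)

print(f"California: {ca_rate * 1000:.3f} per 1000 (SE {ca_se * 1000:.3f})")
print(f"Maine:      {me_rate * 1000:.3f} per 1000 (SE {me_se * 1000:.3f})")
print(f"Adjusted RR: {me_rate / ca_rate:.2f}")
```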
SE of rates

Can calculate a Standard Error for directly adjusted
rates. However, if populations are large (e.g. states,
countries, cities, etc), statistical or sampling stability is
less of an issue than other potential errors: data
collection errors, estimation of population
denominators, coding of cause of death, etc.

If you have a large sample in each group, you can find a
significant difference between rates even for trivial
differences.

In small samples, sampling variability becomes more
important. Can calculate SE to get a confidence
interval around adjusted rate. (Chiang 1961, other
estimates)

Can also get variance of difference between two
adjusted rates, and use it to get a z score for significant
differences between rates (Kahn and Sempos, 1989)
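One common way to form such a z score, assuming the two adjusted rates are independent, is to divide their difference by the square root of the sum of their squared standard errors. This is a generic normal approximation and not necessarily the exact Kahn and Sempos procedure; the function name is hypothetical, and the inputs are the adjusted rates and standard errors (per 1000) from the California and Maine output.

```python
import math

def z_for_rate_difference(rate1, se1, rate2, se2):
    """z score for the difference of two independent adjusted rates."""
    return (rate1 - rate2) / math.sqrt(se1 ** 2 + se2 ** 2)

# Maine vs California, per 1000, from the DIRST output.
z = z_for_rate_difference(9.889, 0.092, 8.823, 0.021)
print(f"z = {z:.2f}")
```

With populations this large the z score is far beyond conventional cutoffs, which illustrates the point that large samples make even modest differences statistically significant.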
INDIRECTLY STANDARDIZED RATES

Two main reasons why we would use indirectly standardized
rates.

1. Stratum specific rates are unavailable in the study
population(s).

2. Small sample sizes render stratum specific rates unstable.
Ex. 2 deaths, 20 people in one stratum, rate=.10
If only 1 death instead of 2, rate is 1/20=.05
Difference of only one death changes rate by 50%!
Indirect Adjustment: Steps

Requires external reference rates for each stratum

Study population:

Need population size for each stratum

Need observed number of events in each stratum

Multiply external rates times population in each study
stratum to get expected number of events.

Sum total number of expected events for all strata.

Divide observed events by expected events to get:

Standardized Incidence Ratio (SIR) for incidence

Standardized Mortality Ratio (SMR) for mortality

Standardized Prevalence Ratio (SPR) for
prevalence (not used very often)

Sometimes SIR/SMR multiplied by 100

SIR/SMR over 1 (100): Observed greater than expected
SIR/SMR less than 1 (100): Observed less than expected
SIR/SMR: the comparison is always to the external
reference population from which you obtained the rates

SIRs/SMRs cannot be compared to each other unless
either:

(1) the study populations have the same distribution according to the
stratification variable, e.g. age, OR

(2) the stratum specific SIRs/SMRs are similar within
your study population (i.e. SIRs/SMRs for each age
group are similar)

The latter is not likely in occupational cohort studies,
because the healthy worker effect declines with age (as
workers age, their mortality experience mirrors that of
the general population, thus SMRs tend to get larger
with age)

If either of these is true, you can compare SIRs/SMRs
to each other.

Otherwise you can only compare them to the external
reference rates.
Hypothetical Example with two study groups with
identical age specific rates, but different age
distributions (Table 7-8 in Szklo)

            Study Group A            Study Group B          External
Age       N    Deaths  Rate        N    Deaths  Rate        Reference Rates
<40      100     10    10%        500     50    10%         12%
>=40     500    100    20%        100     20    20%         50%
Total    600    110    18.3%      600     70    11.7%

Expected number of deaths: apply reference rates to stratum
specific populations

Age               Study Group A        Study Group B
<40               .12 x 100=12         .12 x 500=60
>=40              .50 x 500=250        .50 x 100=50
Total Expected    262                  110

SMR (group A)=110/262=.42        SMR (group B)=70/110=.64

Even though they have the same age specific rates, they have
different SMRs because they have different age distributions.

Also, the age specific SMRs differ between the age strata within each group:

Group A: <40 SMR=10/12=.83, >=40 SMR=100/250=.4
Group B: <40 SMR=50/60=.83, >=40 SMR=20/50=.4

We can compare them to the external standard but not to each
other for these reasons.

Group A has 42% of the deaths expected from the external
standard (58% fewer than expected).

Group B has 64% of the deaths expected from the external
standard (36% fewer than expected).

If we are only comparing one study group to an external
standard, it is similar to the direct method, in that the study
population is the standard population (we are using the age
distribution of the study population). But we cannot compare
SMRs from different study populations to each other
unless the populations have the same distribution across strata or
homogeneous stratum specific SMRs.
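The indirect adjustment steps can be sketched with the hypothetical two-group example. This is a minimal illustration; the function name `smr` is hypothetical.

```python
def smr(observed_by_stratum, population_by_stratum, reference_rates):
    """Return (observed, expected, SMR) for one study group.

    Expected events are the external reference rates applied to the
    study group's own stratum specific population sizes.
    """
    expected = sum(n * r for n, r in
                   zip(population_by_stratum, reference_rates))
    observed = sum(observed_by_stratum)
    return observed, expected, observed / expected

# External reference rates for the <40 and >=40 strata.
reference = [0.12, 0.50]

obs_a, exp_a, smr_a = smr([10, 100], [100, 500], reference)
obs_b, exp_b, smr_b = smr([50, 20], [500, 100], reference)

print(f"Group A: observed={obs_a}, expected={exp_a:.0f}, SMR={smr_a:.2f}")
print(f"Group B: observed={obs_b}, expected={exp_b:.0f}, SMR={smr_b:.2f}")
```

Both groups have the same age specific rates, yet the SMRs differ because each group's own age distribution supplies the weights.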
EXAMPLES FROM OCCUPATIONAL COHORT STUDIES

Indirect standardization has been used most often in
occupational cohort studies.

Usually one study population (i.e. occupational cohort) is
compared to external rates

The overall SMR from all causes is not usually of prime
interest, particularly because of the healthy worker effect
(overall mortality in a working population will be less than
that of the general population because working people are
usually healthier than non-working people)

However, the healthy worker effect will not usually affect
diseases such as cancers, which have a long latency period.
The healthy worker effect also declines with age.

Usually SMRs are calculated for different causes of death, and
adjusted for age, time worked, exposure, calendar year, etc.
Thus, we have not just one SMR, but often a number of SMRs
for different categories.

Stratification on many variables can result in small cell sizes,
and small observed number of events, particularly for rare
diseases (e.g. cancers). Can combine some cells if this is a
problem (e.g. can use larger age groups)

Overall SMRs may also mask heterogeneity across strata.
Stratum specific SMRs may provide more information
regarding associations between disease and exposure in this
case.
Example:

White female primary liver cancer Standardized Mortality Ratios (SMR) by
number of years employed, adjusted for age and calendar year

Number of years    SMR (number      95% Confidence
employed           of deaths)       Interval
Less than one      2.27 (4)         .62-5.82
1-4                0 (0)
5-9                0 (0)
10 or more         6.22* (4)        1.70-15.92
TOTAL              2.27* (9)        1.04-4.31
*p < .01

20 or more years latency, 10 or more years worked: SMR: 6.33* (4), 1.73-16.20

Cannot compare these SMRs unless they have similar
distributions by age and calendar year, or similar SMRs across
these categories.
Primary Liver/Biliary Cancer Standardized Mortality Ratios, White Females,
by Calendar Year and Duration of Employment

Calendar year,             Observed    SMR       95% Confidence
Duration of Employment     Deaths                Intervals

Number of years worked, 1944-1951
1-3 quarters               3           2.28      .47-6.7
1-4 years                  1           .91       .02-5.06
5 years or more            4           9.80**    2.67-25.07

Number of years worked, 1952-1956
1-3 quarters               0           -         -
1-3 years                  0           -         -
3 years or more            4           7.30**    1.99-18.67

Number of years worked, 1957-1970
1-3 quarters               2           5.86      .71-21.17
1-3 years                  0           -         -
3 years or more            3           4.55      .94-13.30

Number of years worked, 1970-1977
1-3 quarters               0           -         -
1-3 years                  0           -         -
3 years or more            0           -         -
PERSON YEARS ARE TABULATED FOR EACH
STRATUM, EXPECTED EVENTS ARE CALCULATED AND
SUMMED OVER ALL STRATA OR A SUBGROUP
OF STRATA

Example:

White Males
Years worked=less than one
                      Calendar Years
Age*      1970-74        1975-1979      ...    1995-99
18-24     Number of      Number of      etc.
          person yrs     person yrs
25-29
...
85+
*Working population; excludes under 18

Years worked=1-4
                      Calendar Years
Age*      1970-74        1975-1979      ...    1995-99
18-24     Person yrs     Person yrs
25-29
...
85+

ETC. Repeat for different years worked


NIOSH LTAS Software: generates SMRs for 99 or 92
causes of death using US or State Rates
Adjusted for age, sex, race, calendar year; user provides
employment info (years worked, calendar years worked,
etc).

MONSON: original SMR software using Fortran, not
used much anymore

Or you can use PEPI INDIRST

Need to enter stratum specific rates, stratum
specific population information
CHOICE OF REFERENCE POPULATION

Same issues as for direct standardization

For occupational cohorts, ideal reference rates would be from
another working population (NIOSH has been developing such
a population) to avoid Healthy Worker Effect

For urban study populations, use urban reference rates, rural
populations use rural reference rates, etc. This is a way to
control for exposures that may vary by geography.

For example, urban breast cancer rates should be used for
urban study groups looking at breast cancer. (Late age at first
pregnancy is a major risk factor for breast cancer, and it
varies according to urban/rural residence)

Local rates should be used if the reference population is large
enough. Ex. Study of an occupational cohort in Chicago
could use Cook County incidence/mortality rates as the
reference.

If national rates are different from local rates (whether they be
state or urban rates), use rates local to your population. This is
a way of controlling for exposures that may vary by
geography. For example, skin cancer rates are higher in the
Southern US, so if skin cancer is an outcome of interest in your
study population, which is in the northern US, use reference
rates from northern US. Otherwise, you will overestimate the
number of expected cases and underestimate the SMR/SIR.
REFERENCE RATES, Continued

Can also use an internal comparison group. For
occupational cohorts, can use a non-exposed group, IF a
non-exposed group exists and the size of the non-exposed group
is large enough (not always true). Can use Cox Proportional
Hazard Models, other methods.
Confidence Intervals and Statistical Tests for
Indirectly Standardized Rates

Approximate Confidence Intervals:

Because SMRs/SIRs are not symmetrical, there are two
different formulas for the Confidence Interval
(Rothman and Boice, 1979)

\[
\text{Lower Limit} = \frac{\text{Obs}\left(1 - \dfrac{1}{9\,\text{Obs}} - \dfrac{z}{3\sqrt{\text{Obs}}}\right)^{3}}{\text{Exp}}
\]

\[
\text{Upper Limit} = \frac{(\text{Obs}+1)\left(1 - \dfrac{1}{9\,(\text{Obs}+1)} + \dfrac{z}{3\sqrt{\text{Obs}+1}}\right)^{3}}{\text{Exp}}
\]

Obs=Numerator of SMR/SIR
Exp=Denominator of SMR/SIR
z=Standard Normal Deviate (1.96 for 95% Confidence
Interval)
If expected number is less than 5, use exact confidence
intervals (uses iterative procedures)

Example: Obs=15, Exp=12.9, z=1.96

\[
\text{Lower Limit} = \frac{15\left(1 - \dfrac{1}{9 \times 15} - \dfrac{1.96}{3\sqrt{15}}\right)^{3}}{12.9} = .65
\]

\[
\text{Upper Limit} = \frac{(15+1)\left(1 - \dfrac{1}{9\,(15+1)} + \dfrac{1.96}{3\sqrt{15+1}}\right)^{3}}{12.9} = 1.92
\]

EXACT LIMITS CAN ALSO BE OBTAINED FROM TABLE
in SECTION A.2 in Szklo (see section A.7) for 95%
Confidence Limits

Use Observed Number of Events, find limits for number of
events on p. 435, multiply by observed number, divide each
result by expected number

For 15 events, limits for number of events are .560 and 1.65

Lower Limit: 15 x .560=8.4      8.4/12.9=.65
Upper Limit: 15 x 1.65=24.75    24.75/12.9=1.92

95% Confidence Limits for SMR: .65-1.92, in agreement with the
approximate limits of .65-1.92
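The approximate limits can be checked numerically. The sketch below evaluates the Rothman and Boice approximation for Obs=15, Exp=12.9; the function name `smr_approx_ci` is hypothetical.

```python
import math

def smr_approx_ci(obs, exp, z=1.96):
    """Approximate confidence limits for an SMR/SIR (obs / exp)."""
    lower = obs * (1 - 1 / (9 * obs)
                   - z / (3 * math.sqrt(obs))) ** 3 / exp
    upper = (obs + 1) * (1 - 1 / (9 * (obs + 1))
                         + z / (3 * math.sqrt(obs + 1))) ** 3 / exp
    return lower, upper

lower, upper = smr_approx_ci(15, 12.9)
print(f"SMR = {15 / 12.9:.2f}, approx. 95% CI = {lower:.2f} to {upper:.2f}")
```

The interval is not symmetric around the SMR of 1.16, which is why separate formulas are needed for the two limits.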
Statistical Tests for the SMR

If number of expected events is 5 or greater, can use Chi
Square Test (1 d.f.) (some authors use 2 or greater)

H0: SMR/SIR = 1
Ha: SMR/SIR ≠ 1

\[
\chi^2 = \frac{(\text{Obs} - \text{Exp})^2}{\text{Exp}}
\]

or with continuity correction:

\[
\chi^2 = \frac{(|\text{Obs} - \text{Exp}| - .5)^2}{\text{Exp}}
\]

Previous example:

\[
\chi^2 = \frac{(|15 - 12.9| - .5)^2}{12.9} = .20, \quad p > .05
\]

Cannot reject Null Hypothesis, SMR not significantly
different from 1

For expected of 2 or less, use exact tests (PEPI Poisson)
For this example:

POISSON - Poisson Probability: Observed vs Expected Events


Tuesday, 24th September 2002.

Observed number = 15 Expected number = 12.9

Exact P = 0.314 by the usual (Fisher) method.


Exact P = 0.270 by the mid-P method.

Ratio (observed:expected) = 1.163

Number of events Ratio (obs:exp)

Exact Fisher:
90% conf. interval = 9.25 to 23.10 0.72 to 1.79
95% conf. interval = 8.40 to 24.74 0.65 to 1.92
99% conf. interval = 6.89 to 28.16 0.53 to 2.18
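Both tests can be sketched with the standard library alone. The chi-square part follows the continuity corrected formula; the exact part assumes that the reported exact P corresponds to the upper tail Poisson probability P(X >= Obs), and that the mid-P subtracts half the probability of the observed count. Function names are hypothetical.

```python
import math

def chi_square_smr(obs, exp, continuity=True):
    """Chi-square statistic (1 d.f.) for H0: SMR = 1."""
    diff = abs(obs - exp) - (0.5 if continuity else 0.0)
    return diff ** 2 / exp

def chi_square_p_1df(chi2):
    """Upper tail P value for a chi-square statistic with 1 d.f."""
    return math.erfc(math.sqrt(chi2 / 2))

def poisson_pmf(k, mu):
    """Poisson probability of exactly k events with mean mu."""
    return math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))

def poisson_upper_tail(obs, exp):
    """P(X >= obs) when X is Poisson with mean exp."""
    return 1.0 - sum(poisson_pmf(k, exp) for k in range(obs))

chi2 = chi_square_smr(15, 12.9)
p_chi2 = chi_square_p_1df(chi2)
p_exact = poisson_upper_tail(15, 12.9)
p_mid = p_exact - 0.5 * poisson_pmf(15, 12.9)

print(f"chi-square = {chi2:.2f}, P = {p_chi2:.2f}")
print(f"exact P = {p_exact:.3f}, mid-P = {p_mid:.3f}")
```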
Another example:
USING PEPI INDIRST --4 age groups
INDIRST - Indirect Standardization
Thursday, 26th September 2002.

DATA
STANDARD RATES per 1000
Stratum 1 2.5
Stratum 2 6.1
Stratum 3 12.4
Stratum 4 25.0

DATA
No correction factor.

NOTE: if your standard rates are annual rates, but your study
population observed events are for more than one year, you need to
enter the correction factor for the number of years followed, otherwise
the number expected will be underestimated because it will be based on
only one year of follow up.

DATA
STUDY POPULATION
Stratum 1: Numerator: 6 Denominator: 1200
Stratum 2: Numerator: 27 Denominator: 2340
Stratum 3: Numerator: 98 Denominator: 3750
Stratum 4: Numerator: 48 Denominator: 975

NOTE: Numerator is number of events in each stratum, denominator is
person years in each stratum

Observed cases = 179


Expected cases = 88.149

SMR = 203.07%
Standard error of SMR = 14.95%
Approx. 90% conf. interval = 178.8 to 229.9%
Approx. 95% conf. interval = 174.4 to 235.1%
Approx. 99% conf. interval = 166.1 to 245.5%

Z (SMR different from 100%?) = 8.42 P = 0.000 [ 3.88E-17 ]

REJECT Null Hypothesis that SMR=1 (z=square root of chi square)
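The expected count and SMR in this output can be reproduced from the standard rates and the study denominators. In the sketch below the function name and the `years_factor` parameter are hypothetical; the parameter stands in for the correction factor mentioned in the note and stays at 1 here because the denominators are already person years.

```python
def expected_events(rates_per_1000, denominators, years_factor=1.0):
    """Expected events from standard rates (per 1000) and study denominators.

    years_factor rescales annual rates when the denominators are persons
    followed for several years rather than person years (assumed usage).
    """
    total = sum(r / 1000 * n for r, n in zip(rates_per_1000, denominators))
    return total * years_factor

standard_rates = [2.5, 6.1, 12.4, 25.0]   # per 1000, strata 1-4
observed = [6, 27, 98, 48]                # events per stratum
person_years = [1200, 2340, 3750, 975]    # person years per stratum

expected = expected_events(standard_rates, person_years)
smr_percent = 100 * sum(observed) / expected

print(f"Observed = {sum(observed)}, Expected = {expected:.3f}")
print(f"SMR = {smr_percent:.2f}%")
```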