DISEASE MODELS
Is it a non-recurrent disease?
  Alzheimer's, osteoarthritis, suicide, homicide, SIDS
Is it a recurrent disease?
  Depression, UTI, low back pain
Are all people at risk? Who is not at risk?
  Diseased people are not at risk
  Immunized people are not at risk
Does it have a susceptible state?
Does it have an immune state? Is immunity lifelong?
Does it have fluctuating at-risk periods? E.g. occupational injury
Is the duration of the disease event negligible?
Use a line diagram
POPULATION
Defined vs undefined population
Why define the population?
  To know its size
  Generalizability
Population definition
  Does it specify characteristics shared by all members of the population being defined? Personal attributes such as age, gender, membership; geographic scope; time or time period
  Does it specify what distinguishes them from others outside the population?
Link between cases and the population at risk
  If each of the cases had not developed the disease, would he or she still have been included in the population?
  If each of the non-cases in the population had developed the disease, would he or she have been included as a case?
Defined population observed over time
Is it a closed population?
  All members initially at risk
  No gains or losses in membership during the period of observation (except due to the disease itself)
  Able to identify and count all new disease cases over a fixed time period
Is it an open population?
  The population at risk can gain or lose members during the time period of interest, e.g. births and deaths, migration, occurrence of new cases and recovery of old ones
Change in criteria for case definition
Proportional incidence = Number of cases of a certain disease / Number of cases in a larger category that contains it
Fetal death ratio = Number of fetal deaths / Number of live births
OTHER MEASURES OF DISEASE FREQUENCY
Period prevalence
  Hybrid of prevalence and cumulative incidence
  Period prevalence = P + (1 - P) × CI
What is the main limitation?
  Point prevalence and cumulative incidence convey very different kinds of information about disease frequency. Those distinctions are lost when they are combined, which limits the usefulness of period prevalence as a summary measure.
Years of potential life lost
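The period prevalence formula can be checked with a short sketch. The function name and the input values (10% point prevalence, 5% cumulative incidence) are hypothetical and chosen only for illustration.

```python
def period_prevalence(point_prevalence: float, cumulative_incidence: float) -> float:
    """Period prevalence = P + (1 - P) * CI.

    P counts people who already have the disease at the start of the period;
    (1 - P) * CI adds those initially disease-free who develop it during the period.
    """
    return point_prevalence + (1.0 - point_prevalence) * cumulative_incidence


# Hypothetical inputs: P = 0.10, CI = 0.05 over the period.
pp = period_prevalence(0.10, 0.05)
print(round(pp, 3))
```

Because the result blends existing and new cases, the same period prevalence can arise from quite different combinations of P and CI, which is the limitation noted above.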
Fatality
Case fatality = Number of fatal cases / Total number of cases
Proxy measure of incidence
Proportional mortality = Deaths from disease / Deaths from all causes
Be aware of the pitfall!
When can proportional mortality be valid for comparison?
  If the denominators are equal and the person-time of follow-up is equal between the two populations
What can an increase in proportional mortality mean?
  Increase in the disease
  Decrease in other diseases
  Increase in the relative size of that segment of the population
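The pitfall can be made concrete with a small sketch. The death counts are hypothetical: deaths from the disease of interest stay fixed while deaths from other causes fall, and proportional mortality still rises.

```python
def proportional_mortality(deaths_from_disease: float, deaths_all_causes: float) -> float:
    """Deaths from the disease divided by deaths from all causes."""
    return deaths_from_disease / deaths_all_causes


# Hypothetical counts: 50 deaths from the disease in both years,
# while all-cause deaths drop from 500 to 400.
year_1 = proportional_mortality(50, 500)
year_2 = proportional_mortality(50, 400)
print(year_1, year_2)
```

The measure increased although the number of deaths from the disease did not change, so a rise in proportional mortality alone does not show that the disease became more frequent.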
Are all data from the same year? If not, consider that they may not have reached equilibrium yet.
Pharmacy records
Foetal Death Records Disease report Cancer, Birth defects, Trauma Laboratory records National Health and Nutrition Examination Survey
Disease Registries
USES OF MULTIPLE DATA SOURCES
  Excluding ineligible cases
  Validating using two data sources
  Estimating completeness of data sources
Capture-recapture sampling: data layout
Behavioral Risk Factor Surveillance System (BRFSS)
Can estimate x if it can be assumed that capture by each data source is independent of capture by the other
Independence assumption implies that 50/250 = 10/x
Hence x = 10 × 250/50 = 50
Estimating total (known + unknown) cases
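The capture-recapture arithmetic can be sketched as follows. The counts 50, 250 and 10 come from the notes; the labels for which cell is which (`both`, `first_only`, `second_only`) are assumptions, since the data layout itself is not shown here.

```python
def estimate_missed_cases(both: float, first_only: float, second_only: float) -> float:
    """Estimate cases missed by both sources under independent capture.

    Independence implies both / first_only = second_only / x,
    so x = first_only * second_only / both.
    """
    return first_only * second_only / both


# Counts from the notes; the cell labels are assumed.
both, first_only, second_only = 50, 250, 10

missed = estimate_missed_cases(both, first_only, second_only)
known = both + first_only + second_only
total = known + missed
print(missed, known, total)
```

The estimate is only as good as the independence assumption: if being captured by one source makes capture by the other more likely, the number of missed cases is underestimated.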
DENOMINATOR DATA
Common sources of denominator data
  U.S. Census
  Administrative records
    HMO enrollment records
    Employment or labor union records
    Alumni rosters
    Etc.
  Birth certificates for perinatal epidemiology
U.S. census data
Estimated total cases (known + unknown) = 360, about 15% higher than the simple tally of 310 known cases
What is the pitfall of the capture-recapture method?
  The assumption of independent capture is not easily tested and may not always be plausible
Reducing misclassification
  Verifying using two data sources
In general, women along a diagonal belong to the same birth cohort
Why consider birth cohort?
  Shared experiences at an earlier age can affect future disease risk
  Can provide a simple explanation for an otherwise puzzling pattern of variation in rates by age
Use the OR in case-control studies to approximate the RR only when the outcome is rare in the source population from which cases and controls were drawn
Measures of excess risk:
  AR
  PAR
  PAR % = (It - Io)/It × 100%
  1/AR
RR is equal, AR is higher in population A compared to B
For a relative risk of a given size, the RD or AR associated with a given exposure will be larger for a common illness than for a rare illness
What does PAR% depend on?
  RR
  Proportion of population exposed
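A minimal sketch of these relationships follows. The risks and the exposed proportion are hypothetical; populations A and B share RR = 2 but differ tenfold in baseline risk. It denotes the risk in the total population and Io the risk in the unexposed, as in the PAR % formula.

```python
def excess_risk_measures(risk_exposed: float, risk_unexposed: float, prop_exposed: float) -> dict:
    """Return RR, AR, PAR and PAR% for one population."""
    # Risk in the total population (It) is a weighted average of the two groups.
    risk_total = prop_exposed * risk_exposed + (1 - prop_exposed) * risk_unexposed
    par = risk_total - risk_unexposed
    return {
        "rr": risk_exposed / risk_unexposed,
        "ar": risk_exposed - risk_unexposed,
        "par": par,
        "par_pct": par / risk_total * 100,
    }


# Hypothetical populations: same RR, common illness in A, rare illness in B.
pop_a = excess_risk_measures(risk_exposed=0.20, risk_unexposed=0.10, prop_exposed=0.3)
pop_b = excess_risk_measures(risk_exposed=0.02, risk_unexposed=0.01, prop_exposed=0.3)

print(pop_a)
print(pop_b)
```

With the same RR and the same proportion exposed, PAR% is identical in both populations, while AR is ten times larger where the illness is common.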
Interpretation Guidelines
What are the properties of sensitivity and specificity?
  Not dependent on the frequency or prevalence of the trait among persons tested
  Relatively stable characteristics of a test, but may vary slightly according to who performs the test, test setting, and disease severity
ROC curve
  Plot sensitivity against 1 - specificity
  The more accurate a test is, the farther toward the upper left its curve falls in an ROC plot. This rule applies even if the two tests yield results in entirely different units on totally different scales
AUC (area under the curve)
  Summary measure of test accuracy based on the ROC curve
  0.5 is a useless test, 1 is a perfect test
CONSEQUENCE OF MEASUREMENT ERROR
Differential misclassification
  Is ascertainment of exposure status influenced by the presence or absence of disease?
  Is ascertainment of disease status influenced by the presence of exposure?
  Are exposed and unexposed groups followed for the same period of time while ascertaining disease?
What is the impact of differential misclassification?
  Can falsely exaggerate or falsely minimize an association
Non-differential misclassification
What is the impact of non-differential misclassification?
  Bias towards the null
What are strategies for minimizing misclassification?
  Sharpen the tool: use tools with the highest reliability and validity
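The ROC and AUC ideas above can be illustrated with a small sketch. The test scores are hypothetical, higher scores are assumed to indicate disease, and the area is computed with the trapezoid rule.

```python
def roc_points(scores_diseased, scores_healthy):
    """Return (1 - specificity, sensitivity) pairs, one per cutoff, from strictest to laxest."""
    cutoffs = sorted(set(scores_diseased) | set(scores_healthy), reverse=True)
    points = [(0.0, 0.0)]  # cutoff above every score: nobody tests positive
    for cutoff in cutoffs:
        sensitivity = sum(s >= cutoff for s in scores_diseased) / len(scores_diseased)
        false_positive_rate = sum(s >= cutoff for s in scores_healthy) / len(scores_healthy)
        points.append((false_positive_rate, sensitivity))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical scores: a test that separates the groups perfectly, and one that does not separate at all.
perfect = auc(roc_points([3, 4], [1, 2]))
useless = auc(roc_points([1, 2], [1, 2]))
print(perfect, useless)
```

Because only the ranking of scores matters, the same calculation applies to tests reported in different units.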
What is the limitation of Kappa?
  Kappa declines as prevalence approaches 0 or 1. This property should be kept in mind when comparing Kappas among populations in which the prevalence of the characteristic under study differs substantially
What is the impact of a low Kappa?
  As Kappa approaches 0, attenuation of the OR becomes severe, and a true exposure-disease association may go undetected due to measurement error
Intraclass Correlation Coefficient
What is the limitation of concordance?
  It fails to account for agreement that chance alone could produce
Kappa
In Stata:
  . set obs 4
  . input clinc self count
    0 0 1117
    1 0 132
    0 1 128
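As a sketch of what Kappa measures, the calculation for a 2×2 agreement table can be written out directly. The counts below are hypothetical and are not the clinic versus self-report data above, whose fourth cell is not shown.

```python
def cohen_kappa(both_yes: int, first_yes_only: int, second_yes_only: int, both_no: int) -> float:
    """Cohen's kappa for two binary ratings of the same subjects."""
    n = both_yes + first_yes_only + second_yes_only + both_no
    observed = (both_yes + both_no) / n  # simple concordance
    # Agreement expected by chance, from the marginal totals of each rating.
    first_yes = both_yes + first_yes_only
    second_yes = both_yes + second_yes_only
    expected = (first_yes * second_yes + (n - first_yes) * (n - second_yes)) / n**2
    return (observed - expected) / (1 - expected)


# Hypothetical table: 40 both yes, 10 and 5 discordant, 45 both no.
kappa = cohen_kappa(40, 10, 5, 45)
print(round(kappa, 2))
```

Here concordance is 0.85, but half of that agreement would be expected by chance alone, so Kappa is lower than the raw concordance.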
What is the limitation of restriction?
  Enhances ability to make statistical inference, because we lose some precision when controlling for confounding
  It reduces generalizability: conclusions apply only to those to whom the study was restricted
  Eliminates ability to assess effect modification
  May be difficult (or expensive) to find sufficient subjects
3. Matching
  Cohort study: each exposed subject is matched to one or more non-exposed subjects on a potential confounder
  Case-control study: each case is matched to one or more controls on a potential confounder
What is the attraction of matching?
  Allows control for a few well-known strong risk factors (e.g. age)
  Increases efficiency of a case-control study
  Precludes examining matching factor(s) as risk factors
What is the limitation of matching?
  Differential loss to follow-up may result in imbalance in matching factors
  Matching can create bias in a case-control study (if matched on a factor that is associated only with exposure)
What is the main purpose of matching in case-control studies?
  Increase efficiency of the study
CONFOUNDING CONTROL IN ANALYSIS PHASE
Standardized (adjusted) rates
  Summary rate that enables comparison of two or more groups that differ in their distribution of an important factor
  Subgroup differences are hidden
When is a standardized rate not influenced by the choice of standard population?
  When the difference in rates between two communities is constant across age categories, the size of the difference between the adjusted rates will not be influenced by the choice of the standard population
  The ratio will be the same no matter what age distribution is chosen to assign the weights
Example 1: Categorical variable
Step 1: Choose a standard or reference population with a known confounder distribution
  One of the two groups to be compared
  Combination of the two groups to be compared
  2000 US standard population (age)
  IARC world standard population (age)
Step 2: Apply the confounder-category-specific rates for each population to the number in the standard population in that category
CONFOUNDING CONTROL IN DESIGN PHASE
1. Randomization
What is the attraction of randomization?
  Removes association between exposure and potential confounders (usually)
  Controls confounding by unknown or unmeasurable confounders
What is the limitation of randomization?
  Confounding may still occur due to accident of randomization
What is the remedy for accident of randomization?
  Increase the size of the study, use stratified randomization methods, or handle as for an observational study and do multivariate analysis
2. Restriction
  Requires all members of the study population to have the same status on potential confounder(s)
When is restriction most useful?
  When most potential subjects have the same status on the potential confounder (e.g. look only at singletons in prenatal studies)
Step 3: Add up the total number of hypothetical deaths in each population, and divide by the total in the standard population to determine each population's adjusted rate
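The three steps of direct standardization can be sketched as follows. The standard population and the category-specific rates are hypothetical values used only to show the mechanics.

```python
def directly_standardized_rate(category_rates: dict, standard_population: dict) -> float:
    """Apply each category-specific rate to the standard population and pool the result."""
    # Step 2: hypothetical events if the standard population experienced these rates.
    hypothetical_events = sum(
        category_rates[category] * standard_population[category]
        for category in standard_population
    )
    # Step 3: divide by the total standard population.
    return hypothetical_events / sum(standard_population.values())


# Step 1: choose a standard population (hypothetical counts per age category).
standard = {"young": 1000, "old": 1000}

# Hypothetical category-specific death rates for two populations.
rates_a = {"young": 0.01, "old": 0.05}
rates_b = {"young": 0.02, "old": 0.06}

adjusted_a = directly_standardized_rate(rates_a, standard)
adjusted_b = directly_standardized_rate(rates_b, standard)
print(adjusted_a, adjusted_b)
```

In this example the rate difference between A and B is the same in each category, so the difference between the adjusted rates would not change with a different choice of standard population.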
What happens when you fail to adjust for confounding by severity score?
  Mixing of the effect of hospital (B vs A, the exposure) with the effect of the severity score distribution (potential confounder)
  The observed mortality rate of Hospital B may be due in part to the lower severity of illness of Hospital B patients (compared to that of Hospital A)
  The observed mortality rate in Hospital B (vs A) will be lower than the adjusted or unconfounded rate, because patients at Hospital B also tend to have a factor that is known to decrease mortality (low severity)
Example 2: Continuous variable categories
SMR: Miners and tuberculosis mortality
Step 3: Add up the total number of hypothetical deaths in the unexposed (general) population
  50.55 + 58.32 + 31.96 = 140.83 (among a hypothetical population of 294,013)
Step 4: Compare with the number of TB deaths actually observed in the miners (among a real population of 294,013): 384
  O = Observed number of deaths in the miners
  E = Expected number of deaths in the miners if general population rates applied
  O/E = 384/140.83 = 2.73
  SMR = 2.73
Crude vs stratum-specific vs standardized rates
Crude rates
  Represent reality
  Useful for health services needs assessment
Stratum-specific rates
  Represent reality
  Detailed information useful
  Appropriate when stratum-specific effects differ
Standardized rates
  Weights for each stratum defined by analyst
  Facilitate comparison with other data and studies, particularly when the standard population is known
  Appropriate when stratum-specific effects are similar
POOLING USING MANTEL-HAENSZEL
ADJUSTED ODDS RATIO
CASE CONTROL STUDIES: ODDS RATIO
Is working as a miner a risk factor for tuberculosis mortality?
  TB mortality rate among miners = 384/294,013 = 130.6/100,000
  TB mortality rate among all 35-64 year old men = 54.1/100,000 person-years
  RR = 130.6/54.1 = 2.4
Age adjusted by standardization:
Step 1: Choose a standard population: the miners (exposed group). The general population is the unexposed group
Standardized Incidence Ratio (SIR) and Standardized Mortality Ratio (SMR)
  SIR and SMR are standardized rate ratios calculated using the exposed group as the standard population
  Ratio of the total number of deaths observed in the exposed group to the number expected in the exposed group if the rates among the unexposed prevailed within each age group
Step 2: Apply the age-group-specific TB death rates of the unexposed population (general population) to the standard population (here the number of miners) in that category, to get the hypothetical number of TB deaths in the unexposed population if it had the age distribution (and size) of the miners population
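The arithmetic of the miners example can be checked with a short sketch. All figures are taken from the notes; the age-specific rates themselves are not shown here, so the code starts from the three expected death counts.

```python
# Expected TB deaths per age group if miners had general-population rates (from the notes).
expected_by_age_group = [50.55, 58.32, 31.96]
observed_deaths = 384          # TB deaths actually observed among miners
miners = 294_013               # size of the miners population
general_rate_per_100k = 54.1   # crude TB mortality in all 35-64 year old men

expected_deaths = sum(expected_by_age_group)
smr = observed_deaths / expected_deaths

# Crude comparison, before age adjustment.
miners_rate_per_100k = observed_deaths / miners * 100_000
crude_rr = miners_rate_per_100k / general_rate_per_100k

print(round(expected_deaths, 2), round(smr, 2))
print(round(miners_rate_per_100k, 1), round(crude_rr, 1))
```

The age-adjusted SMR (2.73) is somewhat higher than the crude rate ratio (2.4), so age was masking part of the excess among miners.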
COHORT STUDIES, PERSON-TIME DATA: RELATIVE RISK
COHORT STUDIES, PERSON-TIME DATA: RATE DIFFERENCE
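For the Mantel-Haenszel pooling of case-control data, a minimal sketch of the adjusted odds ratio is given below. It uses the standard Mantel-Haenszel estimator, sum(a×d/n) / sum(b×c/n) over strata; the cell labels and the two strata of counts are hypothetical.

```python
def mantel_haenszel_or(strata):
    """Pooled odds ratio across strata.

    Each stratum is (a, b, c, d):
      a = exposed cases, b = unexposed cases,
      c = exposed controls, d = unexposed controls.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


# Hypothetical strata of a confounder (e.g. two age groups).
strata = [(10, 20, 5, 40), (30, 10, 20, 20)]
adjusted_or = mantel_haenszel_or(strata)

# Crude odds ratio from the collapsed table, for comparison.
a, b, c, d = (sum(cell) for cell in zip(*strata))
crude_or = (a * d) / (b * c)
print(round(adjusted_or, 2), round(crude_or, 2))
```

A gap between the crude and pooled estimates suggests confounding by the stratification factor; pooling is appropriate only when the stratum-specific odds ratios are similar.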
RESIDUAL CONFOUNDING
Is the confounder measured?
Is there incomplete control of confounding?
  Does the measurement define categories improperly?
  Does the measurement correctly capture the attribute?
  Is the measurement an imperfect surrogate for the confounder?
What is the effect of residual confounding?
  The adjusted effect measure stays closer to the crude effect measure, if the measurement error is non-differential
  The more precisely we measure the confounder, the more its effect is reduced
COHORT STUDIES, CUMULATIVE INCIDENCE: RELATIVE RISK
DIRECTION OF CONFOUNDING
1. Positive-positive or Negative-negative
CONFOUNDING BY INDICATION (OR SEVERITY)
  Non-randomized pharmaco-epidemiology studies
  Comparison of takers of a specific drug vs non-takers
  Drug treatment is a marker for a characteristic or condition that triggers use of that treatment (and increases risk of the outcome)
  May attenuate the beneficial effect of a new drug
  Determine risk factors for disease complication/progression
  Adjust for prognostic differences
  Stratify on the basis of severity of illness
2. Positive-Negative
When exposure frequency varies substantially between populations but not very much within a population
If the exposure is subject to a high degree of measurement error or short-term biological variation at the individual level
Estimating Attributable Risk and Relative Risk
1. Apply regression analysis to the group-level data, modeling disease rate as a function of exposure prevalence. Several forms of regression can be used for this purpose, such as y = mx + c
2. Use the fitted regression model to predict the disease rate for a population in which everyone is exposed, i.e. when x = 1. Call that rate R1. Similarly, predict the rate for a population in which nobody is exposed (x = 0), and call that rate R0
3. Estimate relative risk as R1/R0 and attributable risk as R1 - R0
A theoretically preferable analysis would give greater weight to data from larger countries, which are subject to less sampling error.
What can be major problems?
  R1 or R0 can be negative, which is, of course, impossible for a rate
  Exposure prevalences of 0 or 1 often fall well beyond the range of observed exposure prevalences among the populations studied, leading to large extrapolation errors
  Because the number of population data points in an ecological study is often small, there is very limited power to determine whether one model form fits significantly better than another
  Results are highly model dependent; the sample size may be too small to determine which model fits the data set
What are the pitfalls?
  The associations at the population level need not reflect associations of similar magnitude, or even similar direction, at the individual level (ecological fallacy)
  Cross-level bias occurs when an association at one level of aggregation is assumed to represent the association at another level, when in fact the associations at the two levels are unequal
What can lead to cross-level bias?
Group-level association between exposure prevalence and baseline disease rate (rate in non-exposed persons), as when the group (e.g. country) itself is a group-level confounder: it is associated with both outcome and exposure, for the following reasons:
  a. The groups may differ in the distribution of one or more extraneous individual-level risk factors, such as age and gender
  b. An intrinsically group-level factor may be a confounder, e.g. lax (negligent) laws
  c. The exposure itself may have effects at the group level above and beyond its effects at the individual level, e.g. homicide risk to a gun non-owner may be greater in a country where owning a gun is common than where it is rare. An example in infectious disease epidemiology is herd immunity.
Unequal distribution of an effect modifier among the groups
Model misspecification
  For many graded exposures, the relationship between exposure and risk at the individual level is nonlinear. The only available data may be the mean exposure level for each group, which cannot capture information about the distribution of individuals among different exposure levels. The same mean exposure level could result from most individuals falling near the mean, or from two subgroups at opposite ends of the exposure range. These two patterns could correspond to quite different expected overall disease rates.
  The number of groups available for study may be small. As a result, a simple linear or log-linear model between disease rate and mean exposure level may appear to fit the ecological data adequately, even though it is actually a poor reflection of the individual-level relationship of real interest.
What is the effect of non-differential measurement error?
  Non-differential misclassification can cause estimates of excess risk to be biased away from the null.
What can be done about non-differential measurement error?
  If sensitivity and specificity data for the measure of exposure are available, the size of this non-conservative bias can be estimated and a correction made.
How can confounding operate?
  Confounders can operate at either the individual or the group level.
What can be done about confounding?
  The possibility of nonlinear associations motivates using more finely detailed information about the distribution of the confounder in each group, if this information is available. For example, rather than including just mean age in a group-level regression analysis in an attempt to remove confounding by age, better control may be gained by including several age-related variables, each of which reflects the proportion of group members falling into a particular age group.
  Rate standardization can also be used to control confounding in ecological studies; when doing so, also standardize the prevalence of exposure and of other covariates to the same reference population.
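Returning to the regression approach for estimating relative risk and attributable risk from group-level data, the three steps can be sketched with an ordinary least-squares line. The exposure prevalences and disease rates are hypothetical, and the fit is unweighted for simplicity.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = m*x + c; returns (m, c)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    m = sxy / sxx
    return m, mean_y - m * mean_x


# Hypothetical group-level data: exposure prevalence and disease rate per 1,000.
prevalence = [0.1, 0.2, 0.3, 0.4]
rate = [12.0, 14.0, 16.0, 18.0]

m, c = fit_line(prevalence, rate)
r1 = m * 1 + c   # predicted rate if everyone were exposed (x = 1)
r0 = m * 0 + c   # predicted rate if nobody were exposed (x = 0)

relative_risk = r1 / r0
attributable_risk = r1 - r0
print(round(relative_risk, 2), round(attributable_risk, 2))
```

Note that both predictions lie outside the observed prevalence range of 0.1 to 0.4, which is exactly the extrapolation problem described above.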
When are ecological studies less biased?
  When within-group variation in exposure is small but between-group variation in exposure prevalence is large
What is the drawback of this?
  Confounding at the group level
STUDYING EFFECTS OF GROUP LEVEL EXPOSURES
  Used to evaluate programs and policies that apply to entire populations: an intrinsically group-level characteristic
  Cross-level bias is also of less concern, because the target level of inference is the group level, the level at which such an exposure would be potentially modifiable
What is the drawback?
  Individual or group-level confounding factors can bias the observed group-level association
How can potential bias be addressed?
  Cross-classifying the study population by age, gender, race, state and calendar time
STUDYING EXPOSURES AT TWO OR MORE LEVELS AT ONCE
  Individual-level studies may be carried out in only a single setting, or they may deliberately match or stratify study subjects on area of residence, thus controlling for neighborhood-level influences.
  Having information at more than one level can permit a richer and more complete conceptualization of how disease occurs, leading in turn to a wider range of opportunities for prevention.
If such a group-level association is present, what might it represent?
  The threat of methodological artifact, such as measurement error or residual confounding, always lurks in the background
  Shared environmental exposures
  Selection effects, e.g. people with asthma may move to a cleaner place
  Contagion: the prevalence of illness itself can affect the level of risk to susceptibles by influencing their chance of exposure
Drop out
Statistical power: probability of experiencing a key outcome
Number of study subjects
For a parallel-groups trial with two equal-sized groups and a binary outcome:
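One widely used normal-approximation formula for this setting is n per group = (z_alpha/2 + z_beta)^2 × [p1(1 - p1) + p2(1 - p2)] / (p1 - p2)^2. The sketch below assumes that formula, since the notes do not show their own, and the outcome proportions, alpha and power in the example are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(p1: float, p2: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Subjects needed in each of two equal groups to detect p1 vs p2 (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance_sum / (p1 - p2) ** 2
    return math.ceil(n)


# Hypothetical planning values: 20% outcome risk in controls, 10% with intervention.
print(n_per_group(0.20, 0.10))
```

The required number grows quickly as the expected difference shrinks, and it should be inflated further to allow for the anticipated dropout.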
analyses and possible early termination
Small trial (n < 50 per group) and strong prognostic factor(s) known a priori
Unusual, specialized situation
2. Allocation concealment
Recommended
  Central randomization
  Serially numbered, opaque, sealed envelopes (SNOSE)
  Numbered and coded containers (in drug trials)
  Specially programmed portable computer/PDA
Discouraged methods
  Any form of public posting
  Alternate allocation
  Allocation based on an identifier
  Allocation by date of birth or date of entry
What are the pitfalls of basing the treatment effect on a pilot study?
  Underestimating the intervention effect can cause a worthwhile intervention to be abandoned prematurely
  Overestimating the intervention effect can cause the main study to be underpowered
INFORMED CONSENT
What are the necessary elements of informed consent?
  Awareness of participation in research
  Procedures to be followed
  Risks and discomfort
  Potential benefits to self and others
  Alternative treatments or procedures available
  Confidentiality, data-retention provisions
  Compensation should injury occur (if more than minimal risk)
  Whom to contact if questions
  Voluntary nature of participation and right to withdraw without penalty or loss of benefits
RANDOMIZATION
Why randomize?
  Protection against known and unknown confounding
  Not costly, time consuming or difficult to do properly
  The assignment list can usually be made up and checked in advance, before any participants are enrolled, provided it is kept adequately concealed
What are the three issues in randomization?
1. Sequence generation: Simple, Blocked, Stratified
Suggestions on choice of randomization approach
  Method: Good choice when
  Simple: Expected total n > 200 and no interim analyses planned
  Single block: Total sample size known in advance
  Many small blocks: Wish to keep group sizes balanced throughout trial to facilitate interim
3. Implementation
Blinding
Who is blinded?
  Participants*
  Staff who assess outcomes*
  Clinicians responsible for care of participants
  Statisticians (!)
  * Usually meant by double-blind
DATA COLLECTION
Baseline data
Why collect baseline data?
  To verify eligibility
  To assess potential confounding
  To assess frequency of potential effect modifiers
  To identify planned subgroups
  To enhance study power
Outcomes
Primary outcome vs secondary outcome
What are the problems with secondary outcomes?
  The trial may not have good statistical power
Intermediate vs final outcome
What is the advantage of an intermediate outcome?
  Provides confirmatory evidence of mechanism
  If no benefit on the final outcome is found, the intermediate outcome can help distinguish between an incorrect causal model and a failure in implementation
What is the pitfall of a surrogate outcome?
  If another pathway acts on the end outcome and alters the benefit of the intervention, relying only on the surrogate outcome can mislead about the benefit of the intervention
ANALYSIS
How well did randomization work?
  Look at Table 1 and see the differences
What to do if randomization failed?
  Stratification or multivariate analysis can be used, as in observational studies
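For the sequence-generation step, a permuted-block scheme can be sketched as follows. The arm labels, block size and seed are hypothetical, and a real trial would generate and conceal the list centrally rather than in an ad hoc script.

```python
import random


def blocked_sequence(n_blocks: int, block_size: int = 4, arms=("A", "B"), seed=None):
    """Generate an allocation list in which every block contains each arm equally often."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)  # seeded so the list can be reproduced and checked
    sequence = []
    for _ in range(n_blocks):
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)  # random order within the block
        sequence.extend(block)
    return sequence


allocation = blocked_sequence(n_blocks=5, block_size=4, seed=2024)
print(allocation)
```

Group sizes can never differ by more than half a block at any point, which keeps the arms balanced at each interim analysis; small fixed blocks do make the last assignments in a block predictable if allocation is not concealed.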
Control arm: Best practical alternative
Eligibility: Often broad, to represent potential target population
Outcomes: Those most salient to key decision makers, e.g. patients, clinicians, public health policy makers
What are the treatment arms?
  Experimental
  Control: nothing, placebo, active alternative, usual care
SELECTION OF STUDY SUBJECTS
What are the selection criteria?
  Eligibility
  Internal validity
  Generalizability
  Risks and benefits to subjects
What affects internal validity?
  Subject retention
  Data quality
  Compliance
What is the intent-to-treat principle?
  The primary analysis should compare outcomes between the groups formed by randomization
What situations tempt a departure from intent-to-treat?
  Non-compliance
  Cross-over
  Late exclusion: participant dropped from analysis after randomization
What to do when these things happen?
  Intent-to-treat should nearly always define the primary comparison
  In pragmatic trials, discrepancies between intended and received treatments may reflect real life
  In explanatory trials, the price of intent-to-treat can be higher
    Interferes with estimating efficacy
    Still, the direction of bias is known and conservative
    Design features in explanatory trials often aim at minimizing discrepancies, e.g. tight eligibility criteria, run-in phase, randomization as late as possible
ESTIMATING EFFICACY INDIRECTLY
Example
  Each study subject serves as his/her own control
  Completely prevents confounding from individual-level factors such as age, gender, comorbidity
  Increased statistical power or smaller sample size requirement
N-of-1 trials: randomization of intervals of time
Group randomized trial
What is the attraction?
  When, by its nature, the intervention applies to an entire group, e.g. laws/policies, mass media campaigns, environmental modifications
  Intervention has spillover effects to others through social interaction
  Intervention effects are thought to be transmissible from person to person
What can be the drawback?
  Unacceptable risk of contamination within group if smaller units are randomized
  Complexity, cost
  The number of groups randomized is often small, so there is greater risk of imbalance in the groups formed by randomization
  Statistical power is usually much lower than in an individually randomized study of the same size
  Requires more complex statistical analysis to obtain valid confidence limits and p-values
Counterfactual view of the situation
Suppose controls had been allocated to the experimental treatment. By virtue of randomization:
  Would expect a similar proportion not to have received active treatment
  Would expect incidence among them to be similar to that actually observed among non-recipients in the experimental group
  Can then estimate, by subtraction, the experience of controls who would have received experimental treatment, had they been assigned to it
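The subtraction argument can be worked through with a small sketch. All trial counts are hypothetical, and the calculation assumes that non-recipients would have behaved and fared the same in either arm.

```python
def efficacy_by_subtraction(n_exp, n_exp_treated, events_exp_treated,
                            events_exp_untreated, n_ctl, events_ctl):
    """Return (intent-to-treat risk ratio, risk ratio among would-be recipients)."""
    n_exp_untreated = n_exp - n_exp_treated
    # Same proportion of controls assumed to be would-be non-recipients.
    n_ctl_untreated = n_ctl * n_exp_untreated / n_exp
    # Their incidence assumed equal to that of observed non-recipients.
    events_ctl_untreated = n_ctl_untreated * events_exp_untreated / n_exp_untreated
    # Subtraction leaves the controls who would have received treatment.
    n_ctl_would_treat = n_ctl - n_ctl_untreated
    events_ctl_would_treat = events_ctl - events_ctl_untreated

    risk_treated = events_exp_treated / n_exp_treated
    risk_ctl_would_treat = events_ctl_would_treat / n_ctl_would_treat
    itt_rr = ((events_exp_treated + events_exp_untreated) / n_exp) / (events_ctl / n_ctl)
    return itt_rr, risk_treated / risk_ctl_would_treat


# Hypothetical trial: 800 of 1,000 assigned to treatment actually received it.
itt_rr, efficacy_rr = efficacy_by_subtraction(
    n_exp=1000, n_exp_treated=800, events_exp_treated=40,
    events_exp_untreated=30, n_ctl=1000, events_ctl=100,
)
print(round(itt_rr, 2), round(efficacy_rr, 2))
```

The comparison stays within groups defined by randomization, unlike a naive comparison of recipients with all controls, and the intent-to-treat estimate is the more conservative of the two.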
Subgroup Analysis
What is proper subgroup analysis based on?
  Inherent participant characteristics that treatment group could not affect
  Other characteristics measured before randomization
  Limit the number of subgroup hypotheses
  Use a test of interaction to reduce the multiple comparison problem
  Interpret post-hoc subgroup differences with great caution
What is improper subgroup analysis?
  Characteristics measured after randomization that could be affected by treatment group. Examples: compliance, response to treatment
What is the pitfall of subgroup analysis?
  The more ways one looks for subgroup differences, the more likely it is that some statistically significant ones will be found, even if they reflect only the play of chance
  Because each subgroup is smaller than the full study population, statistical tests for a treatment effect within a subgroup have less power
What can be done if subgroup analysis is important to do?
  Increase trial size accordingly during the planning phase
DESIGN VARIATIONS
Factorial design
What are the attractions of a factorial design?
  Can tease apart two or more interventions
  If interventions are synergistic or antagonistic when used in combination, can find out
  For overall effects, get two (or more) studies for not much more than the price of one
Sequential trials
What is the attraction of a sequential trial?
  Allows termination of the trial if one arm emerges as clearly superior
What is the drawback? And the remedy?
  Multiple comparisons can inflate alpha (probability of type I error); the remedy is to use biostatistical methods to deal with this
Randomization within individual, of body parts
  Attraction? Minimizes confounding
Cross-over trial: randomize order of exposure
  Attraction?
beyond differences in those demographic characteristics that are measured in both groups and for which statistical adjustment can be performed? (e.g. soldiers vs general population) (healthy worker bias; sick retiree bias)
What should be considered when comparing rates of illness or death between patients who have received a specific medical intervention and the population as a whole?
  Could the condition that necessitated the treatment itself have an impact on the incidence of the disease under study? (confounding by indication)
  At the time treatment was being considered, were members of the treated group evaluated for the presence of a condition, with only those not having the condition allowed to receive the treatment? (healthy screenee bias)
What can be done about this?
  Omit from the analysis the part of the follow-up experience of exposed individuals that is most susceptible to these biases, i.e. that which accrues relatively soon after exposure status is defined (e.g. breast cancer counted only after 3 years among those with breast implants)
OUTCOME DEFINITION AND ASSESSMENT
  Are there standard criteria to define the outcome?
  Is it assessed similarly in the cohort and the comparison group?
  Is the outcome measured over the same period of time?
FOLLOW UP OF COHORT MEMBERS
When is the validity of the result threatened?
  If there is under-ascertainment of disease, especially in just the exposed group
What are the eligibility criteria?
  Reachable throughout the study period
  Stable, to maximize the likelihood of successful follow-up
  Restriction of unstable subgroups
NATURE OF THE ILLNESS OUTCOME: INCIDENCE VS PREVALENCE
  If prevalence is measured, it is called a cross-sectional study
ISSUES IN ANALYSIS AND INTERPRETATION
For purposes of analysis, how soon after exposure should outcome events that occur in cohort members begin to be counted?
Generally, immediately
Exceptions
  Healthy worker bias
  Healthy screenee bias
  Some persons in whom the diagnosis was made early in the follow-up period had the disease present in hidden form before exposure commenced (the presence of disease in those persons could not have been affected by the exposure; it could possibly have influenced the likelihood of receipt of exposure, or the presence or level of a characteristic under study)
Always consider whether reverse causality could be possible
In pharmacological research, what bias can be encountered?
Immortal time bias
How are changes in exposure status of cohort members handled?
  Data are taken to reduce exposure misclassification
  Person-years contribute to the denominator of the exposed group after the change of status
  When duration is important, cohort members cannot be permitted to contribute events to the numerator or person-years to the denominator until they meet the criteria for a particular category of duration
When is counting stopped?
  Get the full range of consequences as far as possible
Is information restricted to the point at which the case is diagnosed? Is information restricted to the same point for controls?
3. Physical and laboratory measurement
What are the limitations of laboratory measurement?
  Post-diagnosis exposure levels may not reflect pre-disease levels
  Pre-diagnosis exposure may not reflect the etiologically relevant time period
Are levels measured following identification of cases and controls?
  Can rely on: lead in dentine, BCG scar
  Cannot rely on: hormone status
Which records are to be excluded from analysis?
  Those obtained within the period prior to diagnosis that might correspond to the duration of the preclinical stage of disease
  Exception: genetically determined characteristics
CASE DEFINITION
Are all (or a representative sample) of the members of a defined population who developed a given health outcome included?
Can some persons with disease go undiagnosed? If yes, is there reason to believe that exposed persons are relatively less likely to go undiagnosed?
Is there a chance that exposure status may influence a case's likelihood of diagnosis and therefore selection for study? If yes:
  Define objectively
  Focus on more seriously ill cases
What are the criteria to identify and select cases for study?
  Objective
  Sensitive and specific
  Specificity is of particular concern, because inadvertent inclusion of persons without disease in the case group will generally obscure any true association with the exposure
  E.g. in the study of Reye's syndrome and aspirin, only severe cases were included, so that non-cases and misclassification could be avoided, especially when there was a general notion among physicians regarding the association with exposure
What are the sources of cases?
  Geographic area, members of a health plan, occupational group, registry
What is the challenge of case ascertainment in population-based case selection?
  Complete ascertainment
  Use capture-recapture methods / out-of-area health events
Are they drawn in an unselected manner with regard to exposure status?
  E.g. including all eligible cases
Are they incident or prevalent cases?
  For etiologic studies, the goal is to have incident cases
ISSUES IN CASE SELECTION
Inclusion of prevalent cases
Under what circumstances may it be necessary to enroll prevalent cases?
For some conditions, the date of occurrence is unknown, eg: HIV infection
For an uncommon disease of long duration, an incident series may yield too few cases
What is the disadvantage of adding prevalent cases?
Problems of accurate exposure ascertainment
If the date of occurrence is known, exposure information must be obtained for more distant points in the past, on average, than would be necessary for an incident series
If the date of occurrence is unknown, there will be uncertainty about the best point in time before which one should elicit (produce) exposure information
By studying persons remaining alive with a given condition, one is studying not only etiologic factors but also factors that influence the duration of the condition, including those associated with survival
Length-biased sampling: cases with long-lasting disease are more likely to be sampled, so associations may be with disease duration, not etiology
Inclusion of diagnosed cases without disease
What is the impact of including cases without disease?
Diagnosis may depend on the presence of exposure
Overdiagnosis may threaten internal validity more than underdiagnosis
Inclusion of cases from only a portion of the population
What is the impact?
Missing cases are not missed systematically
Missing due to death: those with the longest survival are preferentially included
Characteristics of cases at tertiary care sites (clinic-based studies)
Asymptomatic and symptomatic undiagnosed cases (population-based studies)
Influence of exposure on likelihood of diagnosis among truly diseased persons
Would controls have been diagnosed had they become ill? They should have similar access to diagnostic services and be willing to undergo diagnostic procedures
CONTROL DEFINITION
When is a control group not needed?
Occasionally, the proportion of ill persons who have had a specific exposure is so high, unequivocally more than would be expected in the population they were derived from, that the presence of an association (though not its magnitude) can be surmised from a case series alone.
Eg: Pneumonia due to ingestion of adulterated rapeseed oil in Spain in 1981
Ideal control group
Are controls at risk for developing the disease?
Are the controls selected from a population whose distribution of exposure is that of the population the cases arose from? If not, selection bias
Are they identical to the cases with respect to their distribution of all characteristics that influence the likelihood and/or degree of exposure, and that, independent of their relationship to exposure, are also related to the occurrence of the illness under study or to its recognition? If not, confounding
Can the presence of exposure be measured accurately and in a manner identical to that used for cases? If not, information bias
Minimizing selection bias
Population-based controls
Are controls selected from the same population as cases?
Geographically defined population: random digit dialing of telephone numbers, area sampling, neighborhood sampling, voters lists, population registers, motor vehicle licenses, birth certificates, etc.
Prepaid health care plan: persons who were members of the same health plan when the illness or injury occurred
Employed population: same group of employees
What are the drawbacks of random digit dialing?
Household identification: change of telephone number, households with only cell phones
Enumeration: answering machine screening of calls, inaccurate responses about eligibility
What are the drawbacks of population-based controls?
Not known to be free from disease
Response rate may be low, and responders may not be an unbiased sample of the population
Hard to identify if no list exists
Non-responding population-based controls have been shown to include more smokers and to be less educated and younger
What is the effect of including diseased subjects in the control group?
The benefit of population-based controls outweighs the misclassification of some (see lecture notes)
Examine for disease (after selection and if feasible)
Estimate the amount of undiagnosed disease
Estimate the resulting bias
Clinic-based controls
Are cases selected from a few hospitals or clinics?
If yes, are controls chosen from persons who, had they developed the illness under study, would have received care at these hospitals or clinics? If yes, no selection bias
Do persons who do and do not receive care from these sources differ with regard to their frequency or level of exposure? If yes, selection bias
What is the drawback of having other ill people as controls?
Hospitalized or clinic-based controls may not be typical of the population from which the cases arose in terms of the exposure of interest (they do not represent the population the cases come from)
Ill or recently deceased persons tend to have been cigarette smokers more often than other people. Because the smoking history of ill persons overstates the cigarette consumption of the population from which the cases arose, the odds ratio associated with smoking based on the use of ill persons as controls will be spuriously low.
How to remove selection bias when taking ill controls?
Omit potential controls with conditions known to be related (positively or negatively) to exposure. Eg: in a study of bladder cancer and prior use of sweeteners, controls who were hospitalized for obesity-related disease were excluded
This is successful if it can be judged correctly which conditions truly are exposure related, and if the presence of those conditions can be determined accurately. But for cigarette smoking and alcohol drinking, it has been shown that admitting diagnoses or statements of cause of death are incapable of identifying the persons with illnesses related to these exposures.
What is the advantage of selecting controls from individuals who were tested for the presence of the disease and found not to have it?
Inexpensive to find
Comparability with regard to the choice of health care provider
This will increase the study's validity if the disease being investigated is generally asymptomatic and so would not be detected in the absence of testing
In situ cancer example: oral contraceptives and in situ cancer of the cervix. Women who used oral contraceptives were more likely to be screened; in situ cancer can be asymptomatic and is then discovered only through screening. If controls were chosen from the general population, who may or may not have received cervical screening, an apparent excess of oral contraceptive users would be present among cases of in situ cancer even if no true association was present
What is the drawback of having controls that are test negative?
Those with a diagnostic evaluation who are confirmed not to have the disease may not be typical of the population from which the cases arose (being in hospital, they have some health problem)
This will detract from the study's validity if the large majority of persons who develop the disease would soon be diagnosed whether or not the test was administered
Eg: Endometrial cancer and postmenopausal estrogen. Controls were chosen from those who underwent biopsy and were found negative, because of hidden cases in the population. However, a group of scientists believed that there were no such hidden cases, and that estrogen use predisposes to bleeding, which leads to biopsy. So they claimed that the risk estimate was spuriously high.
How is selection bias introduced if exposure information is not received from all participants of the study?
It depends on the frequency of missing data and the degree to which exposure frequencies or levels differ between study subjects for whom exposure status is and is not known.
Are the questions asked identically of both cases and controls?
Are past exposures or events more salient to persons with an illness? Recall bias
Are these socially undesirable questions?
Eg: in a prenatal study of malformation, controls were taken with other types of malformation; in a study of anal intercourse and anal cancer, controls had colon cancer
Are the questions very subjective?
Eg: stress- or shock-producing events and Down syndrome: with mothers in general as controls, OR = 17; with mothers of children with other intellectual disabilities as controls, OR = 4.3
If questions for fatal diseases are asked of surrogates of cases, then who should be asked for controls?
Though controls themselves might give more accurate answers, it is better to ask their surrogates for the purpose of comparability
What is the drawback of getting information from surrogates?
Misclassification of the exposure, especially by surrogates of controls, e.g. study on radiation exposure and cancer
What is a way to minimize information bias?
Blinding of those who are collecting information
Example of information bias in records
Records of endometriosis are more frequent among women with infertility. However, women with infertility undergo laparoscopy as a diagnostic tool to investigate the possible presence of conditions such as endometriosis.
CONTROL OF CONFOUNDING IN CASE CONTROL STUDIES
Does the proportion of cases and controls vary across levels or categories of the potential confounding factor?
Means of controlling confounding?
Restriction: restrict cases and controls to a single category or level of the potentially confounding factor, e.g. in a study of physical activity and cardiac arrest, exclude people with clinically recognized heart disease, which could both predispose to cardiac arrest and affect physical activity
What is the drawback of restriction?
Shrinks the pool of available subjects, especially because case-control studies are done for rare diseases
Limits generalization of results
Cannot see effect modification
Adjustment: for the potentially confounding factor in the analysis phase
Matching
Individual matching
Frequency matching
Is matching alone sufficient to control for confounding?
No, it should be considered in the analysis as well
Is it appropriate to match? Consider:
Is the variable one of the exposures of interest?
Is the variable strongly associated with disease?
Is it inexpensive to do matching?
Is ascertainment of exposure expensive?
What is the drawback of matching?
Loss of a possibly large fraction of cases
May be overmatching
Is it a surrogate for exposure measurement?
Matching can induce confounding
Is the factor associated only with exposure?
CASE CONTROL STUDIES THAT DIRECTLY COMPARE DISEASE OR EXPOSURE SUBGROUPS
Compare between different types of disease
Compare among different levels of exposure
What are the possible interpretations?
e.g. OR > 1 for alcohol with HPV +ve oropharyngeal cancer: is alcohol a risk factor for HPV +ve cancer, or is alcohol a protective factor for HPV -ve cancer?
Case-control study superior to other study designs
Is the disease too rare for prospective studies?
Is the induction period too short? Eg: alcohol and injury
Is the period from exposure to disease very long?
Does it allow studying multiple exposures?
Allows information to be obtained when exposure records do not exist
What proportion of TSS in women who used Rely tampons was due to their Rely tampon use?
OR = 6.3
AR% = (6.3 - 1)/6.3 * 100 = 84.1%
PAR%
Requires inference of causality
Requires rare disease assumption
Requires that exposure frequency in study cases approximates that of all cases in the population
Does not require known disease incidence in the population
PAR% = AR% * Pc (Pc = proportion of cases in the population who are exposed)
If cases under study are similar to all cases in the population with respect to frequency of exposure, then Pc = a/(a+c)
Pc = a/(a+c) = 0.60
PAR% = 84.1% * 0.60 = 50.5%
AR
Estimates the amount of disease in exposed persons that is due to exposure
Requires inference of causality
Requires rare disease assumption
Requires that the frequency of exposure in study controls reflects that of the population from which the cases arose
Requires known overall incidence of the disease in the population (or in unexposed or exposed persons)
Example: What is the incidence of TSS in Rely tampon users that is due to their tampon use?
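The AR% and PAR% arithmetic in the Rely tampon example can be checked with a short Python sketch. The function names are illustrative only and do not come from the notes; the inputs (OR = 6.3, Pc = 0.60) are the values given above, and the odds ratio is used in place of the relative risk under the rare disease assumption.

```python
def attributable_risk_percent(odds_ratio: float) -> float:
    """AR% = (OR - 1) / OR * 100, with OR standing in for RR (rare disease)."""
    return (odds_ratio - 1) / odds_ratio * 100


def population_attributable_risk_percent(ar_percent: float,
                                         proportion_cases_exposed: float) -> float:
    """PAR% = AR% * Pc, where Pc is the proportion of cases who are exposed."""
    return ar_percent * proportion_cases_exposed


# Values from the notes: OR = 6.3, Pc = a / (a + c) = 0.60
ar_pct = attributable_risk_percent(6.3)
par_pct = population_attributable_risk_percent(ar_pct, 0.60)
print(f"AR%  = {ar_pct:.1f}")
print(f"PAR% = {par_pct:.1f}")
```

Both results agree with the hand calculation to one decimal place (84.1% and 50.5%).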
How many controls are needed?
4 controls per case is enough for maximizing power
Depends on cost
ANALYSIS OF CATEGORICAL EXPOSURE
Odds ratio
OR is an estimate of RR in case-control studies
Under the assumption that the disease is rare in both exposed and unexposed persons (<5-19%)
Calculate OR in Stata: . cci 43 33 45 56
One-to-one matching
Matched analysis: data analyzed as matched sets (e.g. pairs)
With a 1:1 matching ratio there are four possible types of case-control pairs
Calculate adjusted OR in Stata: . mcci 144 41 19 23
AR%: estimates the proportion of disease in exposed persons that is due to exposure
Requires inference of causality, rare disease assumption
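As a rough cross-check of the two Stata commands above, the same odds ratios can be computed by hand. This Python sketch assumes the usual argument order for `cci` (exposed cases, unexposed cases, exposed controls, unexposed controls) and for `mcci` (both exposed, case only exposed, control only exposed, neither exposed); that ordering is an assumption here, not something stated in the notes, and the function names are illustrative.

```python
def odds_ratio(exposed_cases: int, unexposed_cases: int,
               exposed_controls: int, unexposed_controls: int) -> float:
    """Unmatched OR = (exposed cases * unexposed controls) /
    (unexposed cases * exposed controls)."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


def matched_odds_ratio(case_only_exposed_pairs: int,
                       control_only_exposed_pairs: int) -> float:
    """Matched-pair OR uses only the discordant pairs."""
    return case_only_exposed_pairs / control_only_exposed_pairs


# Counts taken from the commands in the notes (cell order is assumed).
unmatched = odds_ratio(43, 33, 45, 56)
matched = matched_odds_ratio(41, 19)  # 144 and 23 are the concordant pairs
print(f"Unmatched OR = {unmatched:.2f}")
print(f"Matched OR   = {matched:.2f}")
```

The concordant pairs (144 and 23) carry no information about the association, which is why the matched estimate ignores them.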
16. INDUCTION PERIOD AND LATENT PERIOD
Is the time period relevant for etiology? What happens if the period is not relevant?
Induction period
Interval between presence of exposure and initial presence of disease
Latent period
Interval between initial presence of disease and its recognition
How to measure induction/latent time?
The distribution of the length of time required for an exposure to give rise to disease can be estimated by examining the relative risk associated with that exposure over successive periods of time after it was sustained. E.g. leukemia in Hiroshima and Nagasaki
Enumerating the times when cases occurred following the exposure, when nearly all exposed cases are due to the exposure. Eg: DES
Examination of variations in disease occurrence across populations, or within a population over time
Could exposures other than the one under investigation vary over time in the same way?
Confounders that vary over time, eg: alcohol, smoking
Long induction/latent period
What are the problems?
Current exposures with future follow-up will take a long time to complete
Exposure status may change, necessitating future exposure measurement
Often hindered by absent or imprecise measures of exposure status
What can be done?
Get exposure from records if possible
Memory can be used for specific exposures
Have patience
Invariant exposures can be studied with case-control studies
Use surrogate outcomes and surrogate exposures where possible
What can be a good research design?
Nested case-control study
What are the strategies to enhance sensitivity, irrespective of the size of the available study population?
Disaggregation of categories of the exposure of concern that are heterogeneous with respect to their impact on disease occurrence
Disaggregation of disease entities that are heterogeneous with respect to their association with the exposure of concern
Disaggregation of study subjects who, because of the presence of one or more other exposures or characteristics, are not affected to the same degree by the exposure of concern
2. Magnitude of increased risk is theoretically not too small, but
a. There is insufficient variation among individuals within a population regarding presence/level of the factor
b. We are unable to distinguish the effect of the factor from that of other correlated factors
Identify a population within which there is variation
What happens when two exposures act through separate means to produce disease?
The relative impact of either of them is greater in that segment of the population in which the other exposure is absent
What happens when two factors have the capacity to act together in a single causal pathway leading to disease?
The incidence of that disease in persons in whom both factors are present would be more than the sum of the two rates produced by either factor's presence alone.
VARIATION IN SIZE OF RELATIVE RISK ACROSS SUBGROUPS
Look for variation in RR or AR
Conduct the study in a population in which the confounding factor is not so highly correlated with the exposure in question
Identify exposure records or stored samples (as in a nested case-control study)
c. Practical problems:
i) No valid measure of past presence or past levels of the factor
ii) Lengthy induction/latent period
Can we compare mean (or median) ages of cases who do or do not have a particular exposure or characteristic?
No. It can be misleading, as the same difference in mean age at diagnosis can be produced by completely different patterns of effect modification (Eg, pg 430)
What does the prevalence of the disease affect?
Predictive value (especially positive predictive value)
What is the implication of the fact that the PV+ of a screening test can be quite low in screened populations with low disease prevalence?
It can affect how a positive screening test result should be interpreted and perhaps how this information is communicated to the screenee
Persons with a positive screening test result must usually be evaluated further to determine whether the result was a true positive or a false positive
It affects the choice of a target population for screening. Subgroups in which prevalence is highest can yield both more cases per screening test and more true positives per positive screening test
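The dependence of PV+ on prevalence follows directly from Bayes' rule and can be sketched in Python. The sensitivity, specificity, and prevalence values below are hypothetical, chosen only for illustration, and the function name is not from the notes.

```python
def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> tuple[float, float]:
    """Return (PV+, PV-) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical test: 90% sensitive, 90% specific, at three prevalences.
for prevalence in (0.01, 0.10, 0.50):
    ppv, npv = predictive_values(0.90, 0.90, prevalence)
    print(f"prevalence={prevalence:.2f}  PV+={ppv:.3f}  PV-={npv:.3f}")
```

With the same test, PV+ rises steeply as prevalence rises, which is the reasoning behind targeting screening at high-prevalence subgroups.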
EVALUATING THE EFFECTIVENESS OF SCREENING
Does treatment given at early detection lead to a more favorable outcome than treatment given when the cancer is clinically manifest?
Randomized trials and cohort (follow-up) studies
In a non-randomized trial, is there potential confounding such that the true benefit (or lack of benefit) associated with use of the test is distorted?
Evidence to indicate that, in the absence of screening, the mortality rates would not have fallen
19. OUTBREAK INVESTIGATION
What are the purposes of outbreak investigation?
Limit the scope and severity of the immediate threat to public health
Prevent future outbreaks
Identify new vehicles of infection
Monitor the success of intervention programs
STEPS IN AN OUTBREAK INVESTIGATION
1. Verify the accuracy of disease reports
Confirm the diagnosis
2. Determine existence of an outbreak
Compare observed vs. expected in a preliminary investigation
3. Establish a case definition
May need to be modified as more information is available
When appropriate, classify by confirmed, probable, or possible
4. Identify additional cases
5. Conduct descriptive epidemiology
6. Generate and test hypotheses (e.g., disease causation, risk factors, transmission)
7. Monitor course of the outbreak and reassess strategies
8. Carry out lab and environmental investigations
9. Implement disease control measures
10. Communicate findings
Detection: How are outbreaks identified?
Diagnostic tests used and their characteristics (esp. PPV)
Size of population
Step 2: Determine the Existence of an Outbreak
Compare observed vs. expected number of cases
Observed: number of cases reported during this event
Expected: number of cases you would normally expect (in a comparable period of time)
Background rate: typical rate of disease among the affected population; consult historical surveillance data, scientific literature, and disease registries
Use rates to make comparisons
Frequency of cases relative to population size
Is there a real increase in the rate of observed cases beyond what is expected?
Is an outbreak investigation necessary?
When should a potential outbreak be investigated? Considerations include:
Severity of illness
Communicability
Potential ongoing health threat
Need to learn more about a new or novel agent
Public concern and political considerations
Available resources
Step 3: Establish a Case Definition
Require a standardized case definition
Case definition should include criteria for
o Person, place, time
o Clinical criteria (should be simple and objective)
Use a CDC or CSTE case definition when possible
Do not include a potential risk factor in the case definition
Classify cases
Can have definite, probable and possible cases
o Useful for tracking cases
o Useful in estimating burden of illness
In larger outbreaks, it is not necessary to confirm every case
Step 4: Identify Additional Cases
Enhanced surveillance:
Active
Health departments actively solicit reports from:
Health care providers and health care facilities
Clinical and public health laboratories
Discrete populations (e.g., exposed persons)
Passive
Non-direct way of increasing awareness
Targeted communications
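The Step 2 comparison of observed and expected counts can be sketched numerically. The notes do not name a statistical test; the Poisson tail probability below is one common choice and is an assumption here, as are the function names and the example counts.

```python
import math


def rate_per_100k(cases: int, population: int) -> float:
    """Frequency of cases relative to population size, per 100,000 persons."""
    return cases / population * 100_000


def poisson_tail(observed: int, expected: float) -> float:
    """P(X >= observed) for X ~ Poisson(expected).

    A small value suggests the observed count exceeds the background level.
    """
    if observed <= 0:
        return 1.0
    cumulative = 0.0
    term = math.exp(-expected)  # P(X = 0)
    for k in range(observed):
        cumulative += term            # add P(X = k)
        term *= expected / (k + 1)    # advance to P(X = k + 1)
    return max(0.0, 1.0 - cumulative)


# Hypothetical example: 12 reported cases where about 4 would be expected
# from historical surveillance data for a comparable period.
observed, expected = 12, 4.0
print(f"observed/expected ratio = {observed / expected:.1f}")
print(f"P(X >= {observed} | expected = {expected}) = {poisson_tail(observed, expected):.5f}")
```

A statistical excess still has to be checked against the pseudo-outbreak explanations listed in Step 1, such as changes in reporting procedures or case definitions.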
Use descriptive epidemiology to characterize the outbreak by person, place, and time.
For new conditions, you may need to produce a description before creating a case definition.
Data can be used to refine the case definition.
New clinical features?
What population is being affected?
Review of Descriptive Epidemiology Terms
Incubation period
Time between exposure to the infectious agent and the first signs/symptoms of clinical disease
Index case
Initial case/patient who may have become the source of exposure for other cases, or the first affected case
Primary cases
Cases who were exposed to the source (agent)
Secondary cases
Cases who were exposed by a primary case in person-to-person spread
Descriptive Epidemiology: Time
The epi curve displays the distribution of cases over time (and can display more). Can be used to:
Estimate magnitude and time trend
Determine exposure period
Help predict course of epidemic
Suggest the type of epidemic
Point source (exposure at one point in time)
Common (continuous) source (exposures continue over time)
Propagated (exposure to the source by initial cases, followed by secondary cases infected through person-to-person spread)
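The data behind an epi curve are simply case counts per unit of time. The Python sketch below tabulates cases by day of symptom onset and prints a text histogram; the onset dates are invented for illustration, and the daily bin width is an assumption (in practice the interval is chosen relative to the incubation period).

```python
from collections import Counter
from datetime import date, timedelta


def epi_curve(onset_dates: list[date]) -> list[tuple[date, int]]:
    """Count cases per day of onset, including days with zero cases."""
    counts = Counter(onset_dates)
    day, last = min(onset_dates), max(onset_dates)
    curve = []
    while day <= last:
        curve.append((day, counts.get(day, 0)))
        day += timedelta(days=1)
    return curve


# Hypothetical onset dates for a small outbreak.
onsets = [date(2024, 3, 1), date(2024, 3, 2), date(2024, 3, 2),
          date(2024, 3, 2), date(2024, 3, 3), date(2024, 3, 5)]

for day, n in epi_curve(onsets):
    print(f"{day.isoformat()}  {'#' * n}")
```

Keeping the zero-count days matters: gaps between clusters of cases are part of what distinguishes a point source from propagated spread.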
Step 1: Verify the Accuracy of Disease Reports
Establish the accuracy of the data (report)
Know your data sources
Confirm the diagnosis
Review clinical findings: do they make sense?
Review laboratory results and methods
Interview cases and potential cases
Consult with subject matter experts
Is it an outbreak?
Rule out a pseudo-outbreak. Consider other reasons for an increase in reports, for example, changes in:
Reporting procedures
Case definitions
Awareness among reporters
Habits of reporters (referral bias)
How to create an Epi Curve
Step 5: Conduct Descriptive Epidemiology
Who, where and when?
Use the data
Immunization policy
Step 10: Communicate Findings
Provide ongoing, current and accurate information to:
Staff within your team and agency
Environmental health officer, public information officer, department administration
Other health agencies
Local and state health departments, CDC, and Indian Health Service
Governmental agencies and jurisdictions
Health care providers and facilities
The public: media, schools, businesses
Plot locations of exposure
Descriptive Epidemiology: Person
Define population at risk
Age
Gender
Occupation
Social features
Medical history
Travel history
Step 6 (a): Generate Hypotheses
Step 6 (b): Test Hypotheses
Analytical epidemiology
Different methods (study designs) for comparing groups
Two study designs used in outbreak investigation
1. Cohort studies
Well-defined groups of exposed and non-exposed individuals
Track and compare disease (or outcome) among exposed and non-exposed individuals
2. Case-control studies
Compare individuals with a disease (cases) to those without the disease (controls)
Examine differences in exposures or risk factors
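For the cohort design, comparing disease among exposed and non-exposed individuals usually means comparing risks (often called attack rates in outbreak work, a term not used in the notes). A minimal Python sketch, with hypothetical counts and illustrative function names:

```python
def attack_rate(ill: int, total: int) -> float:
    """Proportion of a group that became ill."""
    return ill / total


def risk_ratio(exposed_ill: int, exposed_well: int,
               unexposed_ill: int, unexposed_well: int) -> float:
    """Risk in the exposed divided by risk in the unexposed."""
    risk_exposed = attack_rate(exposed_ill, exposed_ill + exposed_well)
    risk_unexposed = attack_rate(unexposed_ill, unexposed_ill + unexposed_well)
    return risk_exposed / risk_unexposed


# Hypothetical example: 50 people ate a suspect food item and 30 became ill;
# 50 did not eat it and 10 became ill.
print(f"Attack rate, exposed   = {attack_rate(30, 50):.2f}")
print(f"Attack rate, unexposed = {attack_rate(10, 50):.2f}")
print(f"Risk ratio             = {risk_ratio(30, 20, 10, 40):.2f}")
```

In a case-control study the group sizes are fixed by the investigator, so risks cannot be computed directly and the odds ratio is used instead.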
Case-Control Studies: Control Selection
Controls should be similar to cases with respect to opportunities for exposure
Cannot have the disease in any form
Must represent the population from which cases came (e.g., same age group)
Strategies for control selection
Random sample
Friend or neighbor controls
Meal companions
Step 7: Monitor Outbreak and Reassess Strategies
Refine hypotheses if necessary
Have other potential explanations been overlooked?
Sequential case-control studies
Narrow down exposures to identify risk factor
Surveillance data needs during outbreaks
Step 8: Environmental and Lab Investigations
Complement epidemiological investigations
Environmental investigations
Examine and sample food, water sources, buildings, materials, or environmental surfaces
Provide information about:
Exposure to agent
Contamination during food preparation or manufacturing
Exposure during recreational activities
Document contaminated environment
Do trace-back investigations
Step 9: Control and Prevention Measures
Implement control and prevention policies
Policy development and implementation
Food safety
Guidance on procedures
Guidance on food handling
Policies about food preparation
Shellfish harvesting
Exclusion of ill children from daycare settings
Petting zoos; pet turtle bans; salmonella and psittacosis warnings at pet shops, etc.
Isolation and quarantine
Communicate with the Public
Establish reliable communication
Be available to the media
Issue frequent updates
Be prepared and anticipate questions
Who is at risk?
How can the disease be prevented?
Provide sufficient detail to meet PH needs and address public concern
Control rumors; ensure correct information is publicized
Don't over-reassure
Acknowledge what you don't know
Assign a credible spokesperson
Communicate with Public Health and Health Care Professionals
Communicate so that others can learn about new diseases and strategies
CDC-sponsored tools
Morbidity and Mortality Weekly Report (MMWR)
Internet-based information exchange: EPI-X
Local and national listservs
Peer-reviewed publications
Local communication with healthcare providers
Evaluate Your Outbreak Response
Review each step in your outbreak response
Identify what worked well and what didn't work
Incorporate lessons learned
Include outbreak response partners
Best Practices for Outbreak Investigation
Best practices apply to all investigations.
Establish clear and concise policies and procedures.
Record-keeping and careful documentation are crucial.
Use good communication skills.
Evaluate your response.