
BMJ Publishing Group

Why We Need Observational Studies To Evaluate The Effectiveness Of Health Care


Author(s): Nick Black
Source: BMJ: British Medical Journal, Vol. 312, No. 7040 (May 11, 1996), pp. 1215-1218
Published by: BMJ Publishing Group
Stable URL: http://www.jstor.org/stable/29731602
Accessed: 24/09/2010 09:04


Digitization of the British Medical Journal and its forerunners (1840-1996) was completed by the U.S. National
Library of Medicine (NLM) in partnership with The Wellcome Trust and the Joint Information Systems
Committee (JISC) in the UK. This content is also freely available on PubMed Central.

Why we need observational studies to evaluate the effectiveness of health care

Nick Black

Health Services Research Unit, Department of Public Health and Policy, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT
Nick Black, professor of health services research

BMJ 1996;312:1215-8

The view is widely held that experimental methods (randomised controlled trials) are the "gold standard" for evaluation and that observational methods (cohort and case control studies) have little or no value. This ignores the limitations of randomised trials, which may prove unnecessary, inappropriate, impossible, or inadequate. Many of the problems of conducting randomised trials could often, in theory, be overcome, but the practical implications for researchers and funding bodies mean that this is often not possible. The false conflict between those who advocate randomised trials in all situations and those who believe observational data provide sufficient evidence needs to be replaced with mutual recognition of the complementary roles of the two approaches. Researchers should be united in their quest for scientific rigour in evaluation, regardless of the method used.

Despite the essential role of observational methods in shedding light on the effectiveness of many aspects of health care, some scientists believe such methods have little or even nothing to contribute. In his summing up at a major conference held in 1993, the eminent medical epidemiologist Richard Doll concluded that observational methods "provide no useful means of assessing the value of a therapy."[1] The widely held view that experimental methods (randomised controlled trials) are the "gold standard" for evaluation has led to the denigration of non-experimental methods, to the extent that research funding bodies and journal editors automatically reject them. I suggest that such attitudes limit our potential to evaluate health care and hence to improve the scientific basis of how to treat individuals and how to organise services.

My main contention is that those who are opposed to the use of observational methods have assumed that they represent an alternative to experimentation rather than a set of complementary approaches. This in turn stems from a misguided notion that everything can be investigated using a randomised controlled trial. In response I want to outline the limitations of randomised trials and show that observational methods are needed both to evaluate the parts randomised trials cannot reach and to help design and interpret correctly the results obtained from these trials.

Before doing so I must clarify what I mean by "observational" in this context. I am referring exclusively to quantitative, epidemiological methods and not qualitative, sociological methods in which data are collected through observation. The principal observational epidemiological methods are non-randomised trials, cohort studies (prospective and retrospective), and case-control methods, though relatively little use has been made of the latter beyond evaluating preventive measures.

The limitations of randomised trials can be seen as deriving from either the inherent nature of the method (a limitation in principle) or from the way trials are conducted (a limitation in procedure). The importance of this distinction is that while little can be done about the former, improvements in the conduct of randomised trials could, in theory, overcome some or all of the latter. I will return to this distinction later, but first it is necessary to document the reasons why observational methods are needed. There are four main reasons: experimentation may be unnecessary, inappropriate, impossible, or inadequate.

Experimentation may be unnecessary

When the effect of an intervention is dramatic, the likelihood of unknown confounding factors being important is so small that they can be ignored. There are many well known examples of such interventions: penicillin for bacterial infections; smallpox vaccination; thyroxine in hypothyroidism; vitamin B12 replacement; insulin in insulin dependent diabetes; anaesthesia for surgical operations; immobilisation of fractured bones. In all these examples observational studies were adequate to demonstrate effectiveness.

Experimentation may be inappropriate

There are four situations in which randomised trials may be inappropriate. The first is that they are rarely large enough to measure accurately infrequent adverse outcomes. This limitation has been addressed by the establishment, in many countries, of postmarketing surveillance schemes to detect rare adverse effects of drugs. The use of such observational data can be illustrated by the case of benoxaprofen (Opren), a drug launched in 1980. Despite preceding clinical trials on over 3000 patients, the drug had to be withdrawn two years after its launch because of reports of serious side effects, including 61 deaths.[2] Similar surveillance schemes are needed for non-pharmaceutical interventions. The lack of such schemes means that there is still uncertainty as to whether or not laparoscopic techniques are associated with an increased risk of injuries, such as bile duct damage during cholecystectomy.[3] Huge observational datasets are the only practical means of acquiring such vital information.

A second limitation, also arising from study size, is the difficulty of evaluating interventions designed to prevent rare events. Examples include accident prevention schemes and placing infants supine or on their side to sleep to prevent sudden infant death syndrome. A randomised trial would have needed a few hundred thousand babies.

A third limitation of trials is when the outcomes of interest are far in the future. Three well known examples are the long term consequences of oral contraceptives, which may not be manifest for decades; the use of hormone replacement therapy to prevent femoral fractures; and the loosening of artificial hip joints, for which a 10 to 15 year follow up is needed. The practical difficulties in maintaining such prolonged prospective studies (whether experimental or observational) are considerable, as are their costs. With luck, there will occasionally be times when a randomised trial addressing the question of current interest has already been established decades before and patients from it can then be followed up. Unfortunately such serendipity is all too rare. As a practical alternative to doing nothing, retrospective observational studies
can be used to obtain some information on long term outcomes.[4]

[Figure: What works well in pharmacological research may not work in the messier world of clinical care]

SELF DEFEATING

Finally, a randomised trial may be inappropriate because the very act of random allocation may reduce the effectiveness of the intervention. This arises when the effectiveness of the intervention depends on the subject's active participation, which, in turn, depends on the subject's beliefs and preferences. As a consequence, the lack of any subsequent difference in outcome between comparison groups may underestimate the benefits of the intervention. For example, it is well recognised that clinical audit is successful in improving the quality of health care only if the clinicians participating have a sense of ownership of the process.[5] Such a "bottom up" approach is in stark contrast to experimentation, in which the investigator seeks to impose as much control on the subjects in the study as possible, that is, a "top down" approach. As a consequence, randomised trials of audit might find less benefit than observational studies. The same may be true for many interventions for which clinicians, or patients, or both, have a preference (despite agreeing to random allocation), and where patients need to participate in the intervention, psychotherapy for example.[6] Many interventions to promote health or prevent disease fall into this category, particularly those based on community development. It is at least as plausible to assume that experimentation reduces the effectiveness of such interventions as to assume, as most researchers have done, that the results of observational studies are wrong.

Experimentation may be impossible

There are some people who believe that any and every intervention can be subjected to a randomised trial, and that those who challenge this have simply not made sufficient effort and are methodologically incompetent. Such a view minimises the impact of seven serious obstacles that researchers have to face all too often. The exact nature of the obstacles will depend on the cultural, political, and social characteristics of the situation and, clearly, therefore, will vary over time.

The first, and most familiar, is the reluctance and refusal of clinicians and other key people to participate. Just because clinical uncertainty, manifest by variation in practice, may exist, this does not mean that each individual clinician is uncertain about how to practise. In 1991 most gynaecologists and urologists in the North Thames region agreed that a randomised trial was needed to investigate the effectiveness of surgery for stress incontinence, but none was prepared to participate as each believed in the correctness of their own practice style. In other words, although "collective equipoise" existed, "individual equipoise" was absent.[7] Even when clinicians purport to participate, randomisation may be subverted by clinicians deciphering the assignment sequence.[8]

Ethical objections are a second potential obstacle. It is most unlikely that any ethics committee in an industrialised country would sanction the random allocation of patients to intensive care versus ward care, or cardiac transplantation versus medical management. Observational studies provide an alternative to leaving the question of the effectiveness of these expensive services unevaluated. Furthermore, the results of such studies may generate sufficient uncertainty as to make an experimental study acceptable. This happened in the case of surgery for benign prostatic hyperplasia.[9,10]

POLITICAL AND LEGAL OBSTACLES

Thirdly, there may be political obstacles if those who fund and manage health services do not want their policies studied. In the United Kingdom this was true for general practitioner fundholding and the introduction of an internal market. As a result, researchers have been able to perform only a few observational studies, mostly with retrospective controls.[11,12]

Researchers may also meet legal obstacles to performing a randomised trial. The classic example is the attempt to subject radial keratotomy (an operation to correct short sightedness) to a randomised trial in the United States.[13] The researchers were blocked by private sector ophthalmologists who faced a major loss of income if the procedure was declared "experimental" because this would have meant that health insurance companies would no longer reimburse them. As a result of legal action, the academic ophthalmologists were forced to declare the operation safe and effective and abandon any attempt at evaluation.

Fortunately, legal obstacles are rare, but a common problem is that some interventions simply cannot be allocated on a random basis. These tend to be questions of how best to organise and deliver an intervention. For example, a current consensus is that clinicians and hospitals treating a high volume of patients achieve better results than those treating a low volume.[14] If true, the policy implications for the way health services are organised are immense. While experimental methods could, in theory, help resolve this, randomisation is unlikely to be acceptable to patients, clinicians, or managers. The spectre of transporting patients to more distant facilities on a randomly allocated basis would find little support from any of the interested parties. Careful observational methods provide a means of investigating the value of regionalising services.[15]

CONTAMINATION AND SCALE

The sixth problem is that of contamination. This can take several forms. If in a trial a clinician is expected to provide care in more than one way, it is possible that each approach will influence the way they provide care to patients in the other arms of the study. Consider, for example, a randomised trial to see if explaining treatments fully to patients, rather than telling them the bare minimum, would achieve better outcomes. This would rely on clinicians being able to change character repeatedly and convincingly. Fortunately there are few Dr Jekylls in clinical practice. Randomisation of clinicians (rather than patients) may sometimes help, though contamination between colleagues may occur, and randomisation of centres
requires a much larger study at far greater cost.

The seventh and final reason why it will not always be possible to conduct randomised trials is simply the scale of the task confronting the research community. There are an immense number of health care interventions in use, added to which, most interventions have many components. Consider a simple surgical operation: this entails preoperative tests, anaesthesia, the surgical approach, wound management, postoperative nursing, and discharge practice. And these are just the principal components. It will only ever be practical to subject a limited number of items to experimental evaluation.[16] We therefore need to take advantage of other methods to try and fill in the huge gaps that are always likely to exist in the experimental published findings.

Experimentation may be inadequate

The external validity, or "generalisability," of the results of randomised trials is often low.[17] The extent to which the results of a trial are generalisable depends on the extent to which the outcome of the intervention is determined by the particular person providing the care. At one extreme the outcome of pharmaceutical treatment is, to a large extent, not affected by the characteristics of the prescribing doctor. The results of drug trials can, in the main, be generalised to other doctors and settings. In contrast, the outcome of activities such as surgery, physiotherapy, psychotherapy, and community nursing may be highly dependent on the characteristics of the provider, setting, and patients. As a consequence, unless care is taken in the design and conduct of a randomised trial, the results may not be generalisable.

There are three reasons why randomised trials in many areas of health care may have low external validity. The first is that the health care professionals who participate may be unrepresentative. They may have a particular interest in the topic or be enthusiasts and innovators. The setting may also be atypical, a teaching hospital for example. In one of the few randomised trials of surgery for glue ear undertaken in the United Kingdom, all the outpatient and surgical care was performed by a highly experienced consultant surgeon; in real life most such work is performed by relatively inexperienced junior surgeons.[18]

Secondly, the patients who participate may be atypical. All trials exclude certain categories of patients. Often the exclusion criteria are so restrictive that the patients who are eligible for inclusion represent only a small proportion of the patients being treated in normal practice. Only 4% of patients currently undergoing coronary revascularisation in the United States would have been eligible for inclusion in the trials that were conducted in the 1970s.[19] It has been suggested that the same problem will limit the usefulness of the current randomised controlled trials comparing coronary artery surgery and angioplasty.[20] Similar problems occur in trials of cancer treatment.[21] Another facet of this problem is the absence of privately funded patients from almost all randomised trials in the United Kingdom.

The problem of eligibility may be exacerbated by a poor recruitment rate. Although most trials fail to report their recruitment rate,[4] those that do suggest rates are often very low. As little is yet known about the sort of people who are prepared to have their treatment allocated on a random basis, it seems wise to assume that they may differ in important ways from those who decline to take part.

And the third and final problem in generalising the results of randomised trials is that treatment may be atypical. Patients who participate may receive better care, regardless of which arm of the trial they are in.[22]

As a result of these problems, randomised trials generally offer an indication of the efficacy of an intervention rather than its effectiveness in everyday practice. While the latter can be achieved through "pragmatic" trials which evaluate normal clinical practice, these are rarely undertaken.[23] Most randomised trials are "explanatory", that is, they provide evidence of what can be achieved in the most favourable circumstances.

The question of external validity has received little attention from those who promote randomised trials as the gold standard. None of the 25 instruments that have been developed to judge the methodological quality of trials includes any consideration of this aspect.[24] The same is true for the guidance provided by the Cochrane Collaboration.[25]

Discussion

Randomised controlled trials occupy a special place in the pantheon of methods for assessing the effectiveness of health care interventions. When appropriate, practical, and ethical, a randomised trial design should be used. I have tried to show that, for all their well known methodological strengths, trials cannot meet all our needs as patients, practitioners, managers, and policy makers. There are situations in which the use of randomised trials is limited either because of problems that derive from their inherent nature or from practical obstacles. While nothing can be done to remedy the former, improvement in the design and execution of trials could, in theory at least, overcome the latter.

PRINCIPLES VERSUS PRACTICE

The problems that could in theory be overcome (and how that could be achieved) include:
• Failure to assess rare outcomes (by mounting large trials with thousands of patients)
• Failure to assess long term outcomes (by continuing to follow up patients for many years)
• Elimination of clinicians' and patients' preferences (by introducing preference arms[26])
• Refusal by clinicians to participate (by using more acceptable methods of randomisation[27])
• Ethical objections to randomisation (by exploring alternative less demanding methods of obtaining informed consent[28])
• Political and legal obstacles (by persuasion)
• The daunting size of the task (by vastly expanding the available funds for experimental studies)
• Overrestrictive patient eligibility criteria (by undertaking pragmatic rather than explanatory trials[23])

While all the proposed solutions could work in theory, few of them are realistic in practice, presenting as they do enormous problems for researchers and, more importantly, for research funding bodies. For example, it is feasible to randomise tens of thousands of people in a drug trial in which death is the only outcome of interest, but it is unrealistic if more complex and sophisticated outcomes are the relevant endpoints. In many ways the problems that randomised trials encounter arise from a largely uncritical transfer of a well developed scientific method in pharmacological research to the evaluation of other health technologies and to health services.

Several of the other limitations cannot be polarised between principle and practice but are a complex mix of the two. These include:
• Contamination between treatment groups
• The unrepresentativeness of clinicians who volunteer to participate
• Poor patient recruitment rates
• The better care that trial participants receive

In theory all of these could be overcome, although in practice it is hard to see how without the cost of the study becoming astronomical.

Assuming procedural problems could be overcome, two problems of principle inherent in the method would remain. Firstly, the artificiality of a randomised trial probably reduces the placebo element of any intervention. Given that the placebo effect accounts for a large proportion of the effect of many interventions, the results of a trial will inevitably reflect the minimum level of benefit that can be expected. This may be one reason (along with confounding) why experimental studies often yield smaller estimates of treatment effects than studies using observational methods.[29]

Secondly, a randomised trial provides information on the value of an intervention shorn of all context, such as patients' beliefs and wishes and clinicians' attitudes and beliefs, despite the fact that such aspects may be crucial to determining the success of the intervention.[30] In contrast, observational methods maintain the integrity of the context in which care is provided. For these two reasons, the notion that information from randomised trials represents a gold standard, while that derived from observational studies is viewed as wrong, may be too simplistic. An alternative perspective is that randomised trials provide an indication of the minimum effect of an intervention whereas observational studies offer an estimate of the maximum effect. If this is so then policy makers need data from both approaches when making decisions about health services, and neither should reign supreme.

REDRESSING THE BALANCE

My intention in focusing on the limitations of trials is not to suggest that observational methods are unproblematic but to redress the balance. The shortcomings of non-experimental approaches have been widely and frequently aired. The principal problem is that their internal validity may be undermined by previously unrecognised confounding factors which may not be evenly distributed between intervention groups. It is currently unclear how serious and how insurmountable a methodological problem this is in practice. While some investigations of this issue have been undertaken,[19] more studies comparing experimental and observational designs are urgently needed.

For too long a false conflict has been created between those who advocate randomised trials in all situations and those who believe observational data provide sufficient evidence. Neither position is helpful. There is no such thing as a perfect method; each method has its strengths and weaknesses. The two approaches should be seen as complementary. After all, experimental methods depend on observational ones to generate clinical uncertainty; generate hypotheses; identify the structures, processes, and outcomes that should be measured in a trial; and help to establish the appropriate sample size for a randomised trial. When trials cannot be conducted, well designed observational methods offer an alternative to doing nothing. They also offer the opportunity to establish high external validity, something that is difficult to achieve in randomised trials.

Instead of advocates of each approach criticising the other method, everyone should be striving for greater rigour in the execution of research, regardless of the method used.

"Every research strategy within a discipline contributes importantly relevant and complementary information to a totality of evidence upon which rational clinical decision-making and public policy can be reliably based. In this context, observational evidence has provided and will continue to make unique and important contributions to this totality of evidence upon which to support a judgment of proof beyond a reasonable doubt in the evaluation of interventions."[31]

I thank Nicholas Mays, Martin McKee, Colin Sanderson, and the reviewer for their comments, but I take full responsibility for the views expressed in this article.

Funding: None.
Conflict of interest: None.

1 Doll R. Summation of conference. Doing more good than harm: the evaluation of health care interventions. Ann N Y Acad Sci 1994;703:313.
2 Opren scandal. Lancet 1983;i:219-20.
3 Downs SH, Black NA, Devlin HB, Royston C, Russell C. A systematic review of the safety and effectiveness of laparoscopic cholecystectomy. Ann R Coll Surg Engl 1996;78:211-23.
4 Stauffer RN. Ten-year follow-up study of total hip replacement. J Bone Joint Surg Am 1982;64:983-90.
5 Black NA. The relationship between evaluative research and audit. J Public Health Med 1992;14:361-6.
6 Brewin CR, Bradley C. Patient preferences and randomised clinical trials. BMJ 1989;299:313-15.
7 Lilford R, Jackson J. Equipoise and the ethics of randomization. J R Soc Med 1995;88:552-9.
8 Schulz KF. Subverting randomization in controlled trials. JAMA 1995;274:1456-8.
9 Fowler FJ, Wennberg JE, Timothy RP, Barry MJ, Mulley AG, Hanley D. Symptom status and quality of life following prostatectomy. JAMA 1988;259:3018-22.
10 Wasson JH, Reda DJ, Bruskewitz RC, Elinson J, Keller AM, Henderson WG. A comparison of transurethral surgery with watchful waiting for moderate symptoms of benign prostatic hyperplasia. N Engl J Med 1995;332:75-9.
11 Dixon J, Glennerster H. What do we know about fundholding in general practice? BMJ 1995;311:727-30.
12 Clinical Standards Advisory Group. Access to and availability of coronary artery bypass grafting and coronary angioplasty. London: HMSO, 1993.
13 Chalmers I. Minimizing harm and maximizing benefit during innovation in health care: controlled or uncontrolled experimentation? Birth 1986;13:155-64.
14 Black NA, Johnston A. Volume and outcome in hospital care: evidence, explanations and implications. Health Service Management Research 1990;3:108-14.
15 Sowden AJ, Deeks JJ, Sheldon TA. Volume and outcome in coronary artery bypass graft surgery: true association or artefact? BMJ 1995;311:115-18.
16 Dorey F, Grigoris P, Amstutz H. Making do without randomised trials. J Bone Joint Surg Br 1994;76-B:1-3.
17 Cross design synthesis: a new strategy for studying medical outcomes? Lancet 1992;340:944-6.
18 Maw R, Bawden R. Spontaneous resolution of severe chronic glue ear in children and the effect of adenoidectomy, tonsillectomy, and insertion of ventilation tubes (grommets). BMJ 1993;306:756-60.
19 Hlatky MA, Califf RM, Harrell FE, Lee KL, Mark DB, Pryor D. Comparison of predictions based on observational data with the results of randomised controlled trials of coronary artery bypass surgery. J Am Coll Cardiol 1988;11:237-45.
20 White HD. Angioplasty versus bypass surgery. Lancet 1995;346:1174-5.
21 Ward LC, Fielding JWL, Dunn JA, Kelly KA. The selection of cases for randomised trials: a registry of concurrent trial and non-trial participants. Br J Cancer 1992;66:943-50.
22 Stiller CA. Centralised treatment, entry to trials, and survival. Br J Cancer 1994;70:352-62.
23 Schwartz D, Lellouch J. Explanatory and pragmatic attitudes in clinical trials. J Chron Dis 1967;20:637-48.
24 Moher D, Jadad AR, Nichol G, Penman M, Tugwell P, Walsh S. Assessing the quality of randomized controlled trials: an annotated bibliography of scales and checklists. Controlled Clin Trials 1995;16:62-73.
25 Oxman AD, ed. Section VI: Preparing and maintaining systematic reviews. Cochrane Collaboration handbook. Oxford: Cochrane Collaboration, 1994.
26 Wennberg JE, Barry MJ, Fowler FJ, Mulley A. Outcomes research, PORTs, and health care reform. Doing more good than harm: the evaluation of health care interventions. Ann N Y Acad Sci 1994;703:56.
27 Korn EL, Baumrind S. Randomised clinical trials with clinician-preferred treatment. Lancet 1991;337:149-52.
28 Zelen M. A new design for randomised clinical trials. N Engl J Med 1979;300:1242-5.
29 Schulz KF, Chalmers I, Hayes RJ, Altman DG. Empirical evidence of bias. Dimensions of methodological quality associated with estimates of treatment effects in controlled trials. JAMA 1995;273:408-12.
30 Beecher HK. Surgery as placebo. JAMA 1961;176:1102-7.
31 Hennekens CH, Buring JE. Observational evidence. Doing more good than harm: the evaluation of health care interventions. Ann N Y Acad Sci 1994;703:22.

(Accepted 7 March 1996)
