Académique Documents
Professionnel Documents
Culture Documents
In other words, there is a threshold (or between pt and B/H can be expressed
Summary likelihood) at which a particular policy as (see Appendix, Equation A1):
Ioannidis estimated that most becomes socially acceptable.
In the case of medical research, we (1)
published research findings are false
[1], but he did not indicate when, if at can similarly try to define a threshold
all, potentially false research results by asking: “When should potentially
false research findings become We define net benefit as the
may be considered as acceptable to
acceptable to society?” In other words, difference between the values of the
society. We combined our two previously
at what probability are research outcomes of the action taken under
published models [2,3] to calculate
findings determined to be sufficiently the research hypothesis and the null
the probability above which research
true and when should we be willing to hypothesis, respectively (when in fact
findings may become acceptable. A new
accept the results of this research? the research hypothesis is true). Net
model indicates that the probability
harms are defined as the difference
above which research results should
Defining the “Threshold between the values of the outcomes of
be accepted depends on the expected
Probability” the action taken under the null and the
payback from the research (the benefits)
As in most investment strategies, research hypotheses, respectively (when
and the inadvertent consequences (the
our willingness to accept particular in fact the null hypothesis is true) [3].
harms). This probability may dramatically
research findings will depend on the It follows that if the PPV is above pt
change depending on our willingness to
expected payback (the benefits) and we can rationally accept the results
tolerate error in accepting false research
the inadvertent consequences (the of the research findings. Similarly, if
findings. Our acceptance of research
harms) of the research. We begin by the PPV is below pt we should accept
findings changes as a function of what
defining a “positive” finding in research the null hypothesis. Note that the
we call “acceptable regret,” i.e., our
in the same way that Ioannidis defined research payoffs (the benefits) and
tolerance of making a wrong decision
it [1]. A positive finding occurs when the inadvertent consequences (harms)
in accepting the research hypothesis.
We illustrate our findings by providing a the claim for an alternative hypothesis
new framework for early stopping rules (instead of the null hypothesis) can be
in clinical research (i.e., when should accepted at a particular, pre-specified Funding: This work was funded by the Department
of Interdisciplinary Oncology of the H. Lee Moffitt
we accept early findings from a clinical statistical significance. The probability Cancer Center and Research Institute at the
trial indicating the benefits as true?). that a research result is true (the University of South Florida.
Obtaining absolute “truth” in research is posterior probability; PPV) depends
Competing Interests: The authors have declared
impossible, and so society has to decide on: (1) the probability of it being true that no competing interests exist.
when less-than-perfect results may before the study is undertaken (the
Citation: Djulbegovic B, Hozo I (2007) When should
become acceptable. prior probability), (2) the statistical potentially false research findings be considered
power of the study, and (3) the acceptable? PLoS Med 4(2): e26. doi:10.1371/journal.
statistical significance of the research pmed.0040026
A
s society pours more resources result. The PPV may also be influenced Copyright: © 2007 Djulbegovic and Hozo. This is
into medical research, it will by bias [1,4], i.e., by systematic an open-access article distributed under the terms
misrepresentation of the research due of the Creative Commons Attribution License,
increasingly realize that the which permits unrestricted use, distribution, and
research “payback” always represents a to inadequacies in the design, conduct, reproduction in any medium, provided the original
mixture of false and true findings. This or analysis [1]. author and source are credited.
tradeoff is similar to the tradeoff seen However, the calculation of PPV Abbreviations: B/H, net benefits/harms; CI,
with other societal investments—for tells us nothing about whether a confidence interval; combined Rx, radiotherapy plus
particular research result is acceptable chemotherapy; EUT, expected utility theory; PPV,
example, economic development can posterior probability; pr, acceptable regret threshold
lead to environmental harms while to researchers or not. Nevertheless, probability; pt, threshold probability; R0, acceptable
measures to increase national security it can be shown that there is some regret; RT, radiotherapy alone
can erode civil liberties. In most of the probability (the “threshold probability,”
Benjamin Djulbegovic is in the Department of
enterprises that define modern society, pt) above which the results of a study Interdisciplinary Oncology, H. Lee Moffitt Cancer
we are willing to accept these tradeoffs. will be sufficient for researchers Center and Research Institute, University of South
Florida, Tampa, Florida, United States of America.
to accept them as “true” [3]. The Iztok Hozo is in the Department of Mathematics,
threshold probability will depend Indiana University Northwest, Gary, Indiana, United
on the ratio of net benefits/harms States of America.
The Essay section contains opinion pieces on topics
of broad interest to a general medical audience.
(B/H) that is generated by the study * To whom correspondence should be addressed:
[3,5,6]. Mathematically the relationship Benjamin.Djulbegovic@moffitt.org
or
(2)
doi:10.1371/journal.pmed.0040026.g001
Calculation of the Threshold
Probability of “Accepted Truth” Figure 1. The Threshold Probability Above (Pt in Red) Which We Should Accept Findings of
Research Hypothesis as Being True
Figure 1 shows the threshold
The horizontal yellow line indicates the actual conditional probability that the research hypothesis
probability of “truth” (i.e., the is true in the case of positive findings. This means that for benefit/harm ratios above the threshold
probability above which the research (1.5 in this example), the research hypothesis can be accepted.
findings may be accepted) as a function
of B/H associated with the research researchers and/or data safety (B/H = 13.5), and (2) the worst-case
results. The graph shows that as long as monitoring committees have to make scenario (B/H = 1.4). The probability
the probability of “accepted truth” (a a judgment as to whether to accept that the research finding is true [16,17]
horizontal line) is above the threshold early promising results and terminate (i.e., that combined treatment is truly
probability curve, the research findings a trial or whether the trial should better than radiotherapy alone) under
may be accepted. The higher the B/H continue [13,14]. If the interim the best-case scenario is 95% [95%
ratio, the less certain we need to be of analysis shows significant benefit in confidence interval (CI), 89%–99.9%].
the truthfulness of the research results efficacy for the new treatment over Under the worst-case scenario, the
in order to accept them. the standard treatment, continuing to probability that combined treatment is
Note that we are following the classic enroll patients into the trial may mean better than radiotherapy alone is 80%
decision theory approach to the results that many patients will receive the [95% CI, 61%–99%]. The threshold
of clinical trials, which states that a inferior standard treatment [13,14]. probability above which these findings
rational decision maker should select The first randomized controlled should be accepted is 7% [95% CI,
the research versus the null hypothesis trial of circumcision for preventing 0%–30%] if we assume that B/H =
depending on which one maximizes heterosexual transmission of HIV, 13.5, or 41% [95% CI, 11%–72%] if we
the value of consequences [7–9]. In the for example, was terminated early assume B/H = 1.4 (Table 1).
parlance of expected utility decision after the interim analysis showed that The results indicate that in the
theory, this means that we should circumcised men were less likely to best-case scenario, the probability
choose the option with the higher be infected with HIV [15]. However, that the research findings are true far
expected utility [3,5,7–12]. (Expected if a study is wrongly terminated for exceeds the threshold above which the
utility is the average of all possible presumed benefits, this could result results should be accepted (i.e., PPV is
results weighted by their corresponding in adoption of a new therapy of greater than pt). Therefore, rationally,
probabilities—see Appendix). In questionable efficacy [13,14]. in this case we should not hesitate to
other words, the results of the research We now illustrate these issues accept the findings from this study as
hypothesis should be accepted when the by considering a clinical research truthful. However, in the worst-case
benefit of the action outweigh its harms. hypothesis: is radiotherapy plus scenario, the lower limit of the PPV’s
chemotherapy (combined Rx) 95% confidence interval intersects
A Practical Example: When Should superior to radiotherapy alone (RT) with the upper limit of the threshold’s
We Stop a Clinical Trial? in the management of cancer of the 95% confidence interval, indicating
Interim analyses of clinical trials esophagus? (see Box 1). We consider that under these circumstances the
are challenging exercises in which two scenarios: (1) the best-case scenario research hypothesis may not be
Table 1. How True Is the Research Hypothesis that Combined Chemotherapy Is Superior To Radiotherapy Alone in the
Management of Esophageal Cancer?
Net Benefits Net Harms Benefit/Harms Type I (α) Type II (β) The Threshold Probability Probability that
(Survival; %)a (Treatment-Related Ratio Error (%) Error (%) Above Which Research Research Hypothesis
Mortality; %)a Hypothesis Should Be Is True (%)
Accepted as True Findings
(%)
20 0.75
30 0.5
40 0.38
60 0.25
80 0.19
Confirmatory meta-analysis of good 85 0.18 1 15
quality RCTs (β = 5%)
20 0.75
30 0.5
40 0.38
60 0.25
80 0.19
Meta-analysis of small inconclusive 41 1.44 1 59
studies (β = 20%)
20 2.95
30 1.97
40 1.48
60 0.98
80 0.74
Underpowered, but well-performed 23 3.35 1 77
phase I/II RCT (β = 80%)
20 3.85
30 2.57
40 1.93
60 1.28
80 0.96
Underpowered, poorly performed 17 4.88 1 83
phase I/II RCT (β = 80%)
20 4.15
30 2.77
40 2.08
60 1.38
80 1.04
Adequately powered exploratory 20 4 1 80
epidemiological study (β = 20%)
20 4
30 2.67
40 2
60 1.33
80 1
Underpowered exploratory 12 7.33 1 88
epidemiological study (β = 80%)
20 4.4
30 2.93
40 2.2
60 1.47
80 1.1
Discovery-oriented exploratory 0.10 999 1 99
research with massive testing (β = 80%)
20 5
30 3.33
40 2.5
60 1.67
80 1.25
Figures in red: applies only if a decision maker is willing to violate precepts of rational decision making under expected utility theory; otherwise under these circumstances research
hypothesis never becomes acceptable. (Research may become acceptable if regret is smaller than the values pre-specified in the table; see text of article (Equation 4) and Text S1 for a
longer version of the paper.)
a
Data from [1].
b
Expressed as r = 1%, 20%, etc. of benefits (i.e., percentage of benefits that we can tolerate losing in case we wrongly accept research findings).
RCT, randomized controlled trial
doi:10.1371/journal.pmed.0040026.t002