[Figure: Probability density function for H|E_m, with p_0 marked on the horizontal axis]
5 Copyright 2012 by ASME
interval known to the authors with this property.
Fortunately, it is easy to calculate the endpoints of a
(1 − 2α)100% Clopper-Pearson confidence interval (p_L, p_U)
in Excel using the worksheet function for the inverse
beta distribution:

p_L = BETAINV(α, m, n − m + 1) for 0 < m
    = 0 for m = 0

p_U = BETAINV(1 − α, m + 1, n − m) for m < n
    = 1 for m = n
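For readers without Excel at hand, the same endpoints can be computed with a short script. The sketch below (ours, not from API 1163 or the references) avoids an inverse-beta routine by using the fact that the beta CDF with integer parameters equals a binomial tail probability, and inverts that tail by bisection:

```python
from math import comb

def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

def clopper_pearson(m, n, alpha):
    """(1 - 2*alpha)100% Clopper-Pearson interval for m successes in n trials.

    BETAINV(q, a, b) with integer a, b is equivalent to inverting a binomial
    tail, so no inverse-beta routine is needed."""
    def invert(target, k):
        # Solve binom_tail(n, k, p) = target for p by bisection
        # (the tail is increasing in p).
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if binom_tail(n, k, mid) < target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2
    # p_L = BETAINV(alpha, m, n-m+1): I_p(m, n-m+1) = P(Bin(n, p) >= m)
    p_lower = 0.0 if m == 0 else invert(alpha, m)
    # p_U = BETAINV(1-alpha, m+1, n-m): I_p(m+1, n-m) = P(Bin(n, p) >= m+1)
    p_upper = 1.0 if m == n else invert(1 - alpha, m + 1)
    return p_lower, p_upper
```

For m = 0 and m = n the endpoints are pinned to 0 and 1, exactly as in the worksheet formulas above.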
If (p_L, p_U) is a Clopper-Pearson (1 − 2α)100% confidence
interval for an ILI tool's certainty p, then p_L < p with
(1 − α)100% confidence level and p < p_U with (1 − α)100%
confidence level. A tool's performance is accepted with
(1 − α)100% confidence level if p_0 < p_L and rejected with
(1 − α)100% confidence level if p_U < p_0. In particular, we
use the endpoints of a 90% (2α = 0.10) confidence
interval to accept or reject ILI tool performance at a 95%
confidence level. This means that acceptance results
for a 90% confidence interval are roughly comparable
to an acceptance threshold of 0.95 for the Bayesian
method. Straightforward calculations with 2α = 0.10
show that p_U < 0.8 when m ≤ 16 and 0.8 < p_L when 23 ≤ m.
Thus we reject tool performance if m ≤ 16 and accept tool
performance if 23 ≤ m. For this example, rejection of tool
performance by this confidence interval method and the
Bayesian method are the same, but acceptance of tool
performance by the Bayesian method occurs for two more
values of m than for the confidence interval method.
Thoughtless application of either confidence intervals or
the Bayesian method can lead to acceptance of tool
performance based on very small sample sizes. When
applying either technique, the probability of having all
measurements within tolerance needs to be considered.
For example, suppose the certainty is 0.8. If the
confidence level is 95%, then 5% of all runs do not meet
the certainty specification. The probability of a sample of
size n having all of its measurements within tolerance is
(0.8)^n. The sample size n needs to be large enough that the
probability of having all measurements within tolerance is
less than the probability of the certainty specification not
being met. That is, n needs to be large enough that (0.8)^n is
less than 0.05 (the equivalent of 5%). This means 14 ≤ n.
Consequently, for p_0 = 0.8 and a 95% confidence level we
need a sample size greater than or equal to 14.
Table 2 compares the number of successful
measurements required for rejection and acceptance of tool
performance by the Bayesian method (rejection and
acceptance thresholds of 0.5 and 0.95, respectively) and a
method based on the Clopper-Pearson 90% confidence
interval, for the standard tool specifications of p_0 = 0.8 and
95% confidence level.

Table 2 Number of successful measurements for rejection and
acceptance of tool performance
(rejection and acceptance thresholds of 0.5 and 0.95,
respectively, for the Bayesian method; p_0 = 0.8, 95% confidence level)
The minimum number of successful measurements in
Table 2 for acceptance of tool performance by the
Bayesian method ranges from 1 to 4 fewer than for the
confidence interval method, and the difference roughly
increases with sample size. The maximum number of
successful measurements for rejection by the Bayesian
method with a rejection threshold of 0.5 is the same as, or
one smaller than, for the confidence interval method.
Thus, rejection criteria are roughly the same for the two
methods. The primary difference comes with the minimum
number of successful measurements for acceptance of tool
performance. The condition for acceptance of tool
performance by the Bayesian method in this example is
always less demanding than for the confidence interval
method. In fact, if the acceptance threshold for the
Bayesian method is increased to 0.99 in this example, the
number of required successful measurements for
acceptance is still the same as, or one fewer than, the number
for the confidence interval method.
Confidence intervals in API 1163

API 1163 gives no details as to how to construct
confidence intervals as a method for assessing certainty.
API 1163 only gives a table (Table 9 on page 37) of 95%
confidence intervals for
Table 2 (data):

      Bayesian method    Clopper-Pearson 90%        Bayesian method    Clopper-Pearson 90%
                         confidence interval                           confidence interval
  n   Reject   Accept    Reject   Accept       n    Reject   Accept    Reject   Accept
      if m <=  if m >=   if m <=  if m >=           if m <=  if m >=   if m <=  if m >=
 14     8       12         8       14         33      22       27        22       30
 15     9       13         9       15         34      23       28        23       31
 16    10       14        10       16         35      24       29        24       32
 17    10       15        10       17         36      24       30        25       33
 18    11       15        11       17         37      25       31        25       34
 19    12       16        12       18         38      26       31        26       34
 20    13       17        13       19         39      27       32        27       35
 21    13       18        13       20         40      27       33        28       36
 22    14       19        14       21         41      28       34        28       37
 23    15       19        15       22         42      29       35        29       38
 24    16       20        16       23         43      30       35        30       39
 25    16       21        16       23         44      30       36        31       40
 26    17       22        17       24         45      31       37        32       40
 27    18       23        18       25         46      32       38        32       41
 28    19       23        19       26         47      33       39        33       42
 29    19       24        19       27         48      33       39        34       43
 30    20       25        20       28         49      34       40        35       44
 31    21       26        21       29         50      35       41        35       45
 32    21       27        22       29         51      36       42        36       45
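The Bayesian columns of Table 2 can be reproduced from the posterior probability P(H|E_m) = P(E_m|H)P(H) / [P(E_m|H)P(H) + P(E_m|H^c)P(H^c)] with P(H) = 0.95, using the evaluation of P(E_m|H^c) described in Appendix 1. A minimal Python sketch (function names are ours; only the standard library is needed, since the beta CDF with integer parameters is a binomial tail):

```python
from math import comb

def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

def posterior(m, n, p0=0.8, conf=0.95):
    """P(H|E_m): probability the certainty is at least p0, given m successes in n."""
    # P(E_m|H^c) = BETADIST(p0, m+1, n-m+1) = P(Bin(n+1, p0) >= m+1)  (Appendix 1)
    p_e_given_hc = binom_tail(n + 1, m + 1, p0)
    p_e_given_h = 1 - p_e_given_hc
    ph, phc = conf, 1 - conf
    return p_e_given_h * ph / (p_e_given_h * ph + p_e_given_hc * phc)

def reject_accept(n, p0=0.8, conf=0.95, reject_thr=0.5, accept_thr=0.95):
    """Return (m_r, m_a): reject if m <= m_r, accept if m >= m_a."""
    m_r = max(m for m in range(n + 1) if posterior(m, n, p0, conf) < reject_thr)
    m_a = min(m for m in range(n + 1) if posterior(m, n, p0, conf) >= accept_thr)
    return m_r, m_a
```

For example, reject_accept(25) returns (16, 21), matching the n = 25 row of Table 2.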
sample size 25. Fortunately, it is possible to determine
how most of this table could have been calculated. Given
a random sample of size n with x successes, the standard
textbook (1 − α)100% confidence interval (p_L, p_U) for a
population proportion has endpoints

p_L = p̂ − z_{α/2} √(p̂(1 − p̂)/n)
p_U = p̂ + z_{α/2} √(p̂(1 − p̂)/n)      (1)

where p̂ = x/n and z_{α/2} is the (1 − α/2)100-th percentile of
the standard normal distribution. In particular, z_{0.025} =
1.96 when α = 0.05. In order to prevent absurdities, p_L
and p_U are restricted to be non-negative and at most 1. It
is often overlooked that eq. (1) contains an assumption
that n is large relative to the true proportion p being
estimated and (1 − p). A common requirement is

5 ≤ np and 5 ≤ n(1 − p)      (2)

In practice p is often replaced with p̂ in eq. (2) because p
is usually unknown. Also, some authors replace 5 with
10. Eq. (2) is merely a rule of thumb, not an
analytically determined condition. Ref. [3] cites five
other conditions that have also appeared in textbooks to
indicate that n must be large.
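Eq. (1) and the rule of thumb in eq. (2) are easy to express directly. A sketch (ours; it substitutes p̂ for p in eq. (2), as noted above, and clips the endpoints to [0, 1]):

```python
from math import sqrt

def wald_interval(x, n, z=1.96):
    """Textbook interval of eq. (1), clipped to [0, 1]."""
    p_hat = x / n
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)

def rule_of_thumb_ok(x, n, limit=5):
    """Eq. (2) with p replaced by p-hat: note n*p_hat = x and n*(1-p_hat) = n - x."""
    return limit <= x and limit <= n - x
```

wald_interval(20, 25) gives (0.64, 0.96) after rounding to two decimals, matching the m = 20 row of Table A1 in Appendix 2.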
Eq. (1) gives the endpoints of the confidence intervals in
Table 9 of API 1163, except when p̂ = 0 and p̂ = 1. Eq.
(1) clearly has a problem when p̂ = 0 or 1 because the
interval degenerates to a single point. Table 9 of API
1163 avoids this problem by some unmentioned
procedure.
Refs. [3] and [4] show that the coverage of the confidence
intervals whose endpoints are given in eq. (1), and by
implication those in API 1163, can be significantly less
than the nominal coverage of (1 − α)100%. In particular,
the coverage of the 95% confidence interval given by eq. (1)
when n = 25 is only 88%, not the nominal 95%, for p =
0.8. Details of this calculation are given in Appendix 2.
Probabilities of false acceptance and false rejection

The probability of false rejection (Type 1 error) of tool
performance, P_FalseRejection, is the probability that we reject
tool performance when in fact we should accept tool
performance. That is, P_FalseRejection is the probability that
there are m_r or fewer successes and p_0 ≤ p.

The probability of false acceptance (Type 2 error) of tool
performance, P_FalseAcceptance, is the probability that we
accept tool performance when in fact we should reject
tool performance. That is, P_FalseAcceptance is the probability
that there are m_a or more successes and p < p_0. Appendix
3 shows

P_FalseRejection = [Σ_{m=0}^{m_r} P(E_m|H)] / [(1 − p_0)(n + 1)]

P_FalseAcceptance = [Σ_{m=m_a}^{n} P(E_m|H^c)] / [p_0(n + 1)]
Table 3 gives values of P_FalseRejection and P_FalseAcceptance for
various values of n in the typical setting p_0 = 0.8 with
95% confidence level and rejection and acceptance thresholds
of 0.5 and 0.95, respectively. P_FalseRejection does not
decrease monotonically because the values are very small
and m_r does not decrease monotonically. The important
observation is that P_FalseRejection is less than 0.004. This
means that for practical purposes the likelihood of
rejecting tool performance when it should be accepted is
negligible. The case for P_FalseAcceptance is slightly different.
Values of P_FalseAcceptance are small, but not necessarily
negligible.
Table 3 Probabilities of False Acceptance and False Rejection
(p_0 = 0.8, 95% confidence level, rejection and acceptance thresholds of
0.5 and 0.95, respectively)
Why is sample size important?

Table 3 seems to imply that sample size is not an
important consideration in assessing tool performance.
This is a completely incorrect conclusion. We will justify
this statement by considering the typical setting p_0 = 0.8
with 95% confidence level and rejection and acceptance
thresholds of 0.5 and 0.95, respectively. Appendix 5
gives a formula for the probability of m_r or more
successful measurements when p_0 ≤ p. Table 4 gives
these probabilities for various values of n. That is, Table
4 gives the probability that we accept a tool's performance
when the performance is acceptable.
Table 4 Probability that the number of successful
measurements is m_r or more when p_0 ≤ p
(p_0 = 0.8, 95% confidence level, rejection and
acceptance thresholds of 0.5 and 0.95, respectively)
Thus when n = 15 we will fail to accept about 16%
(roughly 1 in 6) of acceptable performances, while when
n = 50 we fail to accept only 10% (1 in 10). Thus,
acceptance is considerably more likely with the larger
sample size; in general, the larger the sample, the more
accurate the results.

The Bayesian method is also optimal for acceptance, in
the following sense. This discussion applies only to the typical
situation with p_0 = 0.8, 95% confidence level, and
acceptance threshold of 0.95; however, something
similar may hold in general. Intuitively, it is not possible
to conclude that the certainty is at least 0.8 from a sample unless
more than 80% of its measurements are successes. That
is, we cannot expect to be able to accept tool performance
using a sample of size n with fewer than 0.8n + 1
successful measurements. It is easily verified that the
acceptance number of successes, m_a, in Table 2 equals
0.8*n + 1 rounded to the nearest integer. In Excel,

m_a = ROUND(0.8*n + 1, 0)

This formula has been verified for values of n up to 100
and for the larger values 500, 1000, and 10,000. Thus, the
acceptance number of successes for the typical situation
appears to be optimal in the sense that no smaller value
will suffice.
Advantages of the Bayesian method

The Bayesian method
1. is less restrictive on accepting tool performance
than a comparable confidence interval method
based on the Clopper-Pearson confidence
interval, at least for the situations depicted in
Table 2.
2. allows the calculation of the probability density
function for H|E_m, which could be used for
probabilistic assessments of failure.
3. allows the calculation of the probabilities of false
acceptance and false rejection.
4. gives an optimal acceptance criterion, at least for
the situation with p_0 = 0.8, 95% confidence level,
and acceptance threshold of 0.95.
Disadvantage of the Bayesian method

The primary disadvantage of the Bayesian method is that it
relies heavily on the certainty and confidence level in the
sizing accuracy specification. If these are erroneous, then
the Bayesian method is also likely to be erroneous. This
amounts to restating the proverb "Garbage in, garbage
out."
Minimum Sample Size

Generalizing the argument in the second paragraph before
Table 2 to arbitrary certainties p_0 and confidence levels
(1 − α)100% requires that the sample size n satisfies
(p_0)^n < α. That is, n > ln(α)/ln(p_0).
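This minimum is straightforward to compute. A small helper (ours) guards the boundary case where the logarithm ratio happens to be an integer, since the inequality is strict:

```python
from math import ceil, log

def minimum_sample_size(p0, alpha):
    """Smallest n with (p0)**n < alpha, i.e. n > ln(alpha) / ln(p0)."""
    n = ceil(log(alpha) / log(p0))
    # ceil() can land exactly on the boundary; the inequality is strict.
    if p0 ** n >= alpha:
        n += 1
    return n
```

For p_0 = 0.8 and α = 0.05 this returns 14, as derived in the discussion before Table 2.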
The Bayesian method is optimal for acceptance

Intuitively, it is not possible to conclude that the certainty is
at least 0.8 from a sample unless more than 80% of its
measurements are successes. That is, we cannot expect to
be able to accept tool performance using a sample of size
n with fewer than 0.8n + 1 successful measurements. It is
easily verified that the acceptance number of successes,
m_a, in Table 2 equals 0.8*n + 1 rounded to the nearest
integer. In Excel,

m_a = ROUND(0.8*n + 1, 0)

for the data in Table 2. It appears that in general m_a is
very close to p_0*n + 1 rounded to the nearest integer.
Actually, rounding p_0*n + 0.99999 works better in Excel
due to the way fractional parts of 0.5 are rounded. There
are still inaccuracies when the fractional part of p_0*n + 1
is close to 0.5, but these are not excessive for the cases
considered below. We compared m_a with
ROUND(p_0*n + 0.99999, 0) in Excel for cases with

p_0: 0.750, 0.775, 0.800, 0.825, 0.850
confidence levels: 90%, 92.5%, 95%, 97.5%
sample sizes ranging from the minimal acceptable
value to 100
acceptance thresholds determined by the
confidence level (a (1 − α)100% confidence level
determines an acceptance threshold of 1 − α)
Table 5 describes the cases for which m_a and
ROUND(p_0*n + 0.99999, 0) are not equal. For a fixed p_0,
the comparisons of m_a with ROUND(p_0*n + 0.99999, 0)
are independent of the confidence level, even though the
minimum sample size changes. For the 95% confidence level,
only 15 of 433 cases (roughly 3%) had ROUND(p_0*n + 1,
0) > m_a, and in each case ROUND(p_0*n + 1, 0) is only one
larger than m_a. ROUND(p_0*n + 1, 0) is also at most one
larger than m_a for all other confidence levels considered.
In short, m_a is as small as can reasonably be expected, at
least for the cases considered.
Table 5 Comparison of m_a with ROUND(p_0*n + 0.99999, 0)
(confidence levels, sample sizes, and acceptance thresholds described in text)

The optimality of m_a in these calculations justifies using
the confidence level to determine the acceptance
threshold.
REFERENCES
1. Haldar, A. and Mahadevan, S., Probability, Reliability,
and Statistical Methods in Engineering Design, John
Wiley & Sons, New York, 2000, p. 26.
2. McCann, R., McNealy, R., and Gao, M., In-Line-
Inspection Performance Verification, II, Validation
Sampling, NACE, Corrosion 2008, Paper No. 08151.
3. Brown, L.D., Cai, T.T., and DasGupta, A., Interval
Estimation for a Binomial Proportion, Statistical
Science, Vol. 16, No. 2, 2001, 101-133.
(http://correio.cc.fc.ul.pt/~mcg/aulas/dinpop/Mod7/Brown_et_al.pdf)
4. Brown, L.D., Cai, T.T., and DasGupta, A.,
Confidence Intervals for a Binomial Proportion and
Asymptotic Expansions, Annals of Statistics, Vol. 30,
No. 1, 2002, 160-201.
(http://wwwstat.wharton.upenn.edu/~tcai/paper/Binomial-Annals.pdf)
5. API Standard 1163, In-Line Inspection Systems
Qualification Standard, First Edition, August 2005,
American Petroleum Institute, Washington, D.C.
APPENDIX 1
Evaluation of P(E|H^c), P(E|H), P(H^c) and P(H)

The probability of exactly m successes in n measurements
for a given proportion p of successes is given by the
binomial distribution:

B(m, n, p) = C(n, m) p^m (1 − p)^(n−m)

P(E_m|H^c) is the "sum" of all these probabilities for p < p_0
divided by the "sum" of all possible probabilities (0 < p <
1):

P(E_m|H^c) = [∫_0^{p_0} B(m, n, p) dp] / [∫_0^1 B(m, n, p) dp]

The right side of this identity is the beta distribution,
which is easily evaluated in Excel using a worksheet
function:

P(E_m|H^c) = BETADIST(p_0, m + 1, n − m + 1)

The beta distribution is commonly encountered in
applications of Bayes' Theorem. Since H and H^c are
mutually exclusive and exhaustive, we have P(E_m|H) +
P(E_m|H^c) = 1, so that

P(E_m|H) = 1 − P(E_m|H^c).
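As a numerical sanity check on the BETADIST identity above, the ratio of integrals can be evaluated directly and compared with the closed form. A sketch (ours; the closed form uses the fact that the beta CDF with integer parameters equals a binomial tail, and the numeric side uses simple midpoint integration):

```python
from math import comb

def B(m, n, p):
    """Binomial probability of exactly m successes in n trials."""
    return comb(n, m) * p**m * (1 - p)**(n - m)

def p_e_given_hc(m, n, p0):
    """Closed form: BETADIST(p0, m+1, n-m+1) = P(Bin(n+1, p0) >= m+1)."""
    return sum(comb(n + 1, j) * p0**j * (1 - p0)**(n + 1 - j)
               for j in range(m + 1, n + 2))

def p_e_given_hc_numeric(m, n, p0, steps=20000):
    """Midpoint-rule evaluation of the ratio of integrals defining P(E_m|H^c)."""
    num = sum(B(m, n, (i + 0.5) * p0 / steps) for i in range(steps)) * p0 / steps
    den = sum(B(m, n, (i + 0.5) / steps) for i in range(steps)) / steps
    return num / den
```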
P(H) and P(H^c)

A (1 − α)100% confidence level implies that (1 − α)100%
of the ILI runs satisfy H, so that the probability of a
random ILI run satisfying H is 1 − α. That is, P(H) = 1 −
α. Since P(H) + P(H^c) = 1, we have P(H^c) = 1 − P(H) =
α.

We are now able to evaluate all the terms in the formula
for P(H|E_m), the probability that p_0 ≤ p given a
sample of tool run data.
Table 5 (data):

p_0                                              0.750  0.775  0.800  0.825  0.850
Number of samples (95% confidence level)            90     89     87     85     82
Samples with ROUND(p_0*n + 0.99999, 0) > m_a         0     11      0      4      0
Samples with ROUND(p_0*n + 0.99999, 0) < m_a         0      0      0      0      0
APPENDIX 2
Calculation of Coverage of the API 1163
95% Confidence Interval when n = 25, p = 0.8

The probability P(m) of exactly m successes in a random
sample of size n for population proportion p is given by
the binomial distribution, which was discussed at the
beginning of Appendix 1.

Table A1 gives P(m) for all possible choices of m when n
= 25 and p = 0.8, along with the endpoints (converted
from percentages to decimals) of the corresponding
confidence intervals in Table 9 of API 1163 for each
possible number of successes. Note that p_L and p_U are
independent of the population proportion. The
chosen population proportion (0.8) is only used to
calculate the probabilities in Table A1.
Table A1 Probabilities and 95% confidence interval endpoints
from API 1163, Table 9 (changed from percent to decimal)
Suppose we take a random sample of size 25, determine
the number of successes, and construct a confidence
interval according to Table A1. Notice that there are only
26 possible confidence intervals. If we repeat this
method for all possible samples of size 25, the proportion
of times any given confidence interval is constructed is
the same as the proportion of times its corresponding
number of successes occurs. For example, the proportion
of times we construct the confidence interval (0.64, 0.96)
equals the proportion of times there are 20 successes.
Since we constructed all possible confidence intervals by
this method, the proportion of times there are m successes
is exactly P(m). Then, the proportion of confidence
intervals that contains 0.8 equals the sum of all P(m) with
0.8 in the confidence interval corresponding to m
successes. Since 0.8 is only in the confidence intervals
corresponding to m from 16 through 22, the proportion of
confidence intervals that contain 0.8 is given by

Σ_{m=16}^{22} P(m) = 0.88

Thus, the true coverage of this method for obtaining
confidence intervals is 0.88, not the nominal value of
0.95. This means that if the true certainty of an ILI tool
were 0.8 and we used Table 9 to determine a confidence
interval for the certainty, the true confidence level would
be 88%, not 95%. Inadequacies of eq. (1) for determining
confidence intervals with the nominal coverage, even
with large sample sizes, are well documented in the
literature. The interested reader is directed to Ref. [3] as
a starting point.
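The coverage calculation can be repeated in a few lines using the eq. (1) endpoints, which match the Table A1 endpoints for every m that matters here (only the extreme rows differ, and those intervals never contain 0.8). A sketch (ours):

```python
from math import comb, sqrt

def wald_interval(m, n, z=1.96):
    """Eq. (1) endpoints, clipped to [0, 1]."""
    p_hat = m / n
    hw = z * sqrt(p_hat * (1 - p_hat) / n)
    return max(0.0, p_hat - hw), min(1.0, p_hat + hw)

def coverage(p, n, z=1.96):
    """Probability that the eq. (1) interval contains the true proportion p."""
    total = 0.0
    for m in range(n + 1):
        lo, hi = wald_interval(m, n, z)
        if lo < p < hi:
            total += comb(n, m) * p**m * (1 - p)**(n - m)
    return total
```

coverage(0.8, 25) comes out near 0.885, confirming the roughly 88% figure, and the values of m whose intervals contain 0.8 are exactly 16 through 22.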
m Prob(m) pL pU
0 3.36E-18 0.00 0.11
1 3.36E-16 0.00 0.12
2 1.61E-14 0.00 0.19
3 4.94E-13 0.00 0.25
4 1.09E-11 0.02 0.30
5 1.83E-10 0.04 0.36
6 2.43E-09 0.07 0.41
7 2.64E-08 0.10 0.46
8 2.38E-07 0.14 0.50
9 1.80E-06 0.17 0.55
10 1.15E-05 0.21 0.59
11 6.27E-05 0.25 0.63
12 2.93E-04 0.28 0.68
13 1.17E-03 0.32 0.72
14 4.01E-03 0.37 0.75
15 1.18E-02 0.41 0.79
16 2.94E-02 0.45 0.83
17 6.23E-02 0.50 0.86
18 1.11E-01 0.54 0.90
19 1.63E-01 0.59 0.93
20 1.96E-01 0.64 0.96
21 1.87E-01 0.70 0.98
22 1.36E-01 0.75 1.00
23 7.08E-02 0.81 1.00
24 2.36E-02 0.88 1.00
25 3.78E-03 0.89 1.00
APPENDIX 3
Probabilities of false acceptance and false rejection
(Type 1 and Type 2 errors)

Let A and B denote two events. The event in which both
A and B occur is denoted by AB. The multiplication rule
for probability can be stated as

P(AB) = P(A|B)*P(B)

Let the sample size n be fixed and consider the maximum
number of successes m_r for which we reject tool
performance and the minimum number of successes m_a
for which we accept tool performance. Values for m_r and
m_a are given in Table 2 for 14 ≤ n ≤ 51.
The probability of false rejection (Type 1 error) of tool
performance, P_FalseRejection, is the probability that we reject
tool performance when in fact we should accept tool
performance. That is, P_FalseRejection is the probability that
there are m_r or fewer successes and p_0 ≤ p. That is,

P_FalseRejection = P(m_r or fewer successes and p_0 ≤ p)

= [Σ_{m=0}^{m_r} P(exactly m successes and p_0 ≤ p)] / [Σ_{m=0}^{n} P(exactly m successes and p_0 ≤ p)]

= [Σ_{m=0}^{m_r} P(E_m H)] / [Σ_{m=0}^{n} P(E_m H)]

= [Σ_{m=0}^{m_r} P(E_m|H)] / [Σ_{m=0}^{n} P(E_m|H)]
The probability of false acceptance (Type 2 error) of tool
performance, P_FalseAcceptance, is the probability that we
accept tool performance when in fact we should reject
tool performance. That is, P_FalseAcceptance is the probability
that there are m_a or more successes and p < p_0. That is,

P_FalseAcceptance = P(m_a or more successes and p < p_0)

= [Σ_{m=m_a}^{n} P(exactly m successes and p < p_0)] / [Σ_{m=0}^{n} P(exactly m successes and p < p_0)]

= [Σ_{m=m_a}^{n} P(E_m H^c)] / [Σ_{m=0}^{n} P(E_m H^c)]

= [Σ_{m=m_a}^{n} P(E_m|H^c)] / [Σ_{m=0}^{n} P(E_m|H^c)]

Appendix 1 describes how to calculate P(E_m|H^c) and
P(H^c). Appendix 4 shows

Σ_{m=0}^{n} P(E_m|H) = (1 − p_0)(n + 1) and Σ_{m=0}^{n} P(E_m|H^c) = p_0(n + 1)

Consequently,

P_FalseRejection = [Σ_{m=0}^{m_r} P(E_m|H)] / [(1 − p_0)(n + 1)]

P_FalseAcceptance = [Σ_{m=m_a}^{n} P(E_m|H^c)] / [p_0(n + 1)]
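These two formulas translate directly into code. A sketch (ours), again using the integer-parameter identity BETADIST(p_0, m+1, n−m+1) = P(Bin(n+1, p_0) ≥ m+1) from Appendix 1:

```python
from math import comb

def p_e_given_hc(m, n, p0):
    """P(E_m|H^c) = BETADIST(p0, m+1, n-m+1) = P(Bin(n+1, p0) >= m+1)."""
    return sum(comb(n + 1, j) * p0**j * (1 - p0)**(n + 1 - j)
               for j in range(m + 1, n + 2))

def false_rejection(n, m_r, p0=0.8):
    """P_FalseRejection = sum_{m=0}^{m_r} P(E_m|H) / ((1 - p0)(n + 1))."""
    return sum(1 - p_e_given_hc(m, n, p0)
               for m in range(m_r + 1)) / ((1 - p0) * (n + 1))

def false_acceptance(n, m_a, p0=0.8):
    """P_FalseAcceptance = sum_{m=m_a}^{n} P(E_m|H^c) / (p0(n + 1))."""
    return sum(p_e_given_hc(m, n, p0)
               for m in range(m_a, n + 1)) / (p0 * (n + 1))
```

With the n = 25 row of Table 2 (m_r = 16, m_a = 21), both probabilities come out small, consistent with the discussion of Table 3.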
APPENDIX 4
Evaluation of Σ_{m=0}^{n} P(E_m|H^c) and Σ_{m=0}^{n} P(E_m|H)

The probability of exactly m successes in n measurements
for a given proportion p of successes is given by the
binomial distribution:

B(m, n, p) = C(n, m) p^m (1 − p)^(n−m)

We have

Σ_{m=0}^{n} B(m, n, p) = 1

Consequently

Σ_{m=0}^{n} ∫_0^{p_0} B(m, n, p) dp = ∫_0^{p_0} Σ_{m=0}^{n} B(m, n, p) dp = ∫_0^{p_0} dp = p_0
According to eq. 1 in Section 8.384 of Ref. [6] we have

∫_0^1 B(m, n, p) dp = C(n, m) ∫_0^1 p^m (1 − p)^(n−m) dp
= C(n, m) · m!(n − m)!/(n + 1)!
= [n!/(m!(n − m)!)] · [m!(n − m)!/(n + 1)!]
= 1/(n + 1)

so that ∫_0^1 B(m, n, p) dp is independent of m and equals
1/(n + 1). Consequently

Σ_{m=0}^{n} P(E_m|H^c) = Σ_{m=0}^{n} [∫_0^{p_0} B(m, n, p) dp / ∫_0^1 B(m, n, p) dp] = p_0 (n + 1)

Similarly

Σ_{m=0}^{n} P(E_m|H) = Σ_{m=0}^{n} [∫_{p_0}^1 B(m, n, p) dp / ∫_0^1 B(m, n, p) dp] = (1 − p_0)(n + 1)
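Both identities can also be checked numerically: with P(E_m|H^c) evaluated as the binomial tail from Appendix 1, the sum over m is the expected value of a Binomial(n+1, p_0) variable, which is exactly p_0(n + 1). A quick check (ours):

```python
from math import comb

def p_e_given_hc(m, n, p0):
    """P(E_m|H^c) = BETADIST(p0, m+1, n-m+1) = P(Bin(n+1, p0) >= m+1)."""
    return sum(comb(n + 1, j) * p0**j * (1 - p0)**(n + 1 - j)
               for j in range(m + 1, n + 2))

def check_appendix4(n, p0):
    """Return the two sums; they should equal p0*(n+1) and (1-p0)*(n+1)."""
    s_hc = sum(p_e_given_hc(m, n, p0) for m in range(n + 1))
    s_h = sum(1 - p_e_given_hc(m, n, p0) for m in range(n + 1))
    return s_hc, s_h
```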
APPENDIX 5
Probability of m_a or more successful measurements
when p_0 ≤ p

P(m_a or more successes and p_0 ≤ p)

= [Σ_{m=m_a}^{n} P(exactly m successes and p_0 ≤ p)] / [Σ_{m=0}^{n} P(exactly m successes and p_0 ≤ p)]

= [Σ_{m=m_a}^{n} P(E_m H)] / [Σ_{m=0}^{n} P(E_m H)]

= [Σ_{m=m_a}^{n} P(E_m|H)] / [Σ_{m=0}^{n} P(E_m|H)]

= [Σ_{m=m_a}^{n} P(E_m|H)] / [(1 − p_0)(n + 1)]