Empirical Likelihood For Right Censored Lifetime Data

Journal of the American Statistical Association
ISSN: 0162-1459 (Print) 1537-274X (Online) Journal homepage: http://www.tandfonline.com/loi/uasa20
Shuyuan He, Wei Liang, Junshan Shen & Grace Yang
To cite this article: Shuyuan He, Wei Liang, Junshan Shen & Grace Yang (2016) Empirical
Likelihood for Right Censored Lifetime Data, Journal of the American Statistical Association,
111:514, 646-655, DOI: 10.1080/01621459.2015.1024058
To link to this article: http://dx.doi.org/10.1080/01621459.2015.1024058
Accepted author version posted online: 01 Apr 2015.
Published online: 18 Aug 2016.
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, Theory and Methods
ABSTRACT
When the empirical likelihood (EL) of a parameter $\theta$ is constructed with right-censored data, the literature shows that $-2\log(\text{empirical likelihood ratio})$ typically has an asymptotic scaled chi-squared distribution, where the scale parameter is a function of some unknown asymptotic variances. Therefore, the EL construction of confidence intervals for $\theta$ requires an additional estimation of the scale parameter. This additional estimation would reduce the coverage accuracy for $\theta$. By using a special influence function as an estimating function, we prove that under very general conditions, $-2\log(\text{empirical likelihood ratio})$ has an asymptotic standard chi-squared distribution with one degree of freedom. This eliminates the need for estimating the scale parameter and eases some of the often demanding computations of the EL method. Our estimating function yields a smaller asymptotic variance than those of Wang and Jing (2001) and Qin and Zhao (2007). Thus, it is not surprising that confidence intervals using the special influence functions give better coverage accuracy, as demonstrated by simulations.
1. Introduction
Let $Y$ and $C$ be two independent, nonnegative random variables with respective distribution functions $F(y) = P[Y \le y]$ and $G(s) = P[C \le s]$. $Y$ represents a lifetime whose observation is subject to right censoring by $C$, such that instead of $Y$ we observe $Z = \min(Y, C)$ and the indicator $\delta = I[Y \le C]$ of the event $[Y \le C]$.
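The observation scheme just described can be sketched in a few lines of Python. This is only an illustration: the function name `simulate_censored` and the exponential choices for $F$ and $G$ are assumptions made here, not taken from the article.

```python
import random

def simulate_censored(n, draw_lifetime, draw_censor, seed=0):
    """Return n pairs (Z_i, delta_i) with Z = min(Y, C) and delta = I[Y <= C]."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        y = draw_lifetime(rng)   # latent lifetime Y ~ F
        c = draw_censor(rng)     # latent censoring time C ~ G, independent of Y
        sample.append((min(y, c), 1 if y <= c else 0))
    return sample

# Illustrative choice (an assumption, not from the article):
# Y ~ Exp(rate 1) and C ~ Exp(rate 0.5).
sample = simulate_censored(
    200,
    draw_lifetime=lambda rng: rng.expovariate(1.0),
    draw_censor=lambda rng: rng.expovariate(0.5),
    seed=1,
)
censored_fraction = 1.0 - sum(d for _, d in sample) / len(sample)
print(f"censored fraction: {censored_fraction:.3f}")
```

Only the pairs $(Z_i, \delta_i)$ are passed on to any estimator; the latent $Y_i$ and $C_i$ are discarded, which mirrors the fact that both $F$ and $G$ are unknown to the analyst.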
This article is concerned with using the empirical likelihood (EL) method with right-censored data to construct confidence intervals for a parameter $\theta$, a functional of the distribution $F$, defined by

$$E\,g(Y, \theta) = \int g(s, \theta)\, dF(s) = 0, \tag{1.1}$$

for a specified function $g(s, \theta)$. Both $F$ and $G$ are unknown.

A simple example is $g(s, \theta) = \xi(s) - \theta$ for some function $\xi(s)$. Then $\theta = E\,\xi(Y)$. Many other examples of $g(s, \theta)$ are given in Section 2.
KEYWORDS: Confidence intervals; Empirical likelihood; Influence function; Parameter estimation; Right-censored lifetimes.

ARTICLE HISTORY: Received October; Revised December.

CONTACT: Grace Yang, gly@math.umd.edu. American Statistical Association.

The problem of estimating $\theta = E\,\xi(Y)$ with a sample of $n$ iid observations of $(Z, \delta)$ has been studied by many authors using the estimator

$$\theta_n = \int \xi(s)\, dF_n(s), \tag{1.2}$$

where $F_n$ is the Kaplan-Meier (KM) or product-limit estimator of $F$ defined in (2.3) below. For continuous $F$ and $G$, under the finite second moment condition for $\xi(s)$,

$$\int \frac{\xi^2(s)}{1 - G(s)}\, dF(s) < \infty, \tag{1.3}$$

the asymptotic distribution of $\sqrt{n}(\theta_n - \theta)$, as $n \to \infty$, is normal $N(0, \sigma^2)$, see, for example, Yang (1994), where

$$\sigma^2 = \int_0^\infty \frac{\big(\bar F(s)\,\xi(s) - \bar\xi(s)\big)^2}{\bar F^2(s)\,\bar G(s)}\, dF(s), \qquad \bar\xi(s) = \int_s^\infty \xi(y)\, dF(y), \quad s \ge 0, \tag{1.4}$$

where $\bar F = 1 - F$ and $\bar G = 1 - G$.
Confidence intervals for $\theta$ can be constructed using the asymptotic normal distribution $N(0, \sigma^2)$. To use $N(0, \sigma^2)$, it is necessary to estimate the unknown variance $\sigma^2$ in (1.4). For instance, Stute (1996) proposes a jackknife estimator for $\sigma^2$. Although any consistent estimator $\sigma_n^2$ of $\sigma^2$ can be employed, the convergence rate of $\sigma_n^2$ is generally unknown. Substitution of $\sigma^2$ by $\sigma_n^2$ tends to reduce the coverage accuracy for $\theta$ as compared to the case of known $\sigma^2$. In the example of $\theta = F(y)$ for a fixed $y$, the coverage accuracy was studied by Thomas and Grunkemeier (1975), who proposed the EL method and illustrated by simulation its superiority over the method of normal approximation. The ensuing theoretical work on the EL method was carried out by Owen (1988) and followed by a rapid growth of literature. The usefulness of the EL method for constructing confidence intervals/regions has been well established in a wide variety of situations; see, for example, DiCiccio, Hall, and Romano (1991), Chen (1994), and a standard reference, Owen (2001), and references therein. An article by Zheng, Zhao, and Yu (2012) contains an excellent review of some recent developments.
Professor Emerita, Department of Mathematics, University of Maryland, College Park, MD.

Let $R(\theta)$ denote the EL ratio function of $\theta$ and $l(\theta) = -2 \log R(\theta)$. If $R(\theta)$ is computed from a complete sample of $n$ iid observations, then in many cases, $l(\theta)$ has an asymptotic $\chi^2_1$ distribution. Confidence intervals for $\theta$ can be constructed using the asymptotic $\chi^2_1$ distribution. However, if $R(\theta)$ is obtained from a right-censored sample, except for a few special cases (see, e.g., Owen 2001, Chap. 6), the literature shows that the asymptotic distribution of $l(\theta)$ is typically a scaled $\chi^2_1$, where the scale parameter $r$ is a function of unknown asymptotic variances. See, for example, Wang and Jing (2001) and Qin and Zhao (2007). This is often also the case in regression analysis (e.g., Li and Wang 2003), in weighted EL analysis of a variety of censoring models, and in nonparametric problems involving infinite-dimensional nuisance parameters, in which the asymptotic distribution of the corresponding $l(\theta)$ involves more complicated distributions of sums of weighted chi-squared random variables with unknown weights (Hjort, McKeague, and van Keilegom 2009). For example, when applying Theorems 2.1 and 2.2 in Hjort, McKeague, and van Keilegom (2009) with parameter $\theta = \int \xi(t)\, dF(t)$, it can be seen from their Remark 2.2 that we will have

$$-2 \log EL_n(\theta_0, \hat h) \xrightarrow{d} (V_1/V_2)\, \chi^2_1,$$

where $V_1$ is the variance of $U$ in their Condition A2 and $V_2$ in Condition A3. This example shows that the unknown scale $V_1/V_2$ remains in the asymptotic distribution. In their Section 3.3 on functionals of survival distributions, applying their Theorems 2.1 and 2.2 one gets the same result obtained by Wang and Jing (2001). In view of the complicated formula for the scale parameter $r$ (see Remark 3.1), which is computationally demanding, a scale-free chi-squared distribution would be desirable.

For the EL interval estimation of $\theta$ in (1.1) we propose a method of eliminating the estimation of the scale parameter $r$. The elimination simplifies some of the intensive computations often required of the EL method for censored data. This approach also unifies the proof of the asymptotic distribution of the corresponding $l(\theta)$. In addition, simulation shows that the method also gives a larger coverage probability of the true $\theta$ as compared to the results of Wang and Jing (2001) and Qin and Zhao (2007).
The empirical likelihood ratio function $R(\theta)$ is obtained by utilizing auxiliary information on $\theta$ through a set of estimating equations. In our investigation, the estimating equation is $E g(Y, \theta) = 0$, where the estimating function $g(y, \theta)$ is a real-valued function. In the case of a complete sample, $Y_1, \ldots, Y_n$, $R(\theta)$ is formulated as

$$R(\theta) = \sup_{\{p_i\}} \Big\{ \prod_{i=1}^n n p_i \;\Big|\; \sum_{i=1}^n p_i = 1,\ \sum_{i=1}^n p_i\, g(y_i, \theta) = 0,\ p_i \ge 0,\ i = 1, 2, \ldots, n \Big\}, \tag{1.5}$$

where $\{p_i, i = 1, \ldots, n\}$ is a multinomial distribution which puts weight $p_i$ on the observation $Y_i$, and the constraint $\sum_{i=1}^n p_i\, g(y_i, \theta) = 0$ mimics the estimating equation $E g(Y, \theta) = 0$; see, for example, Owen (2001). In the case of right censoring, we observe not $Y_1, \ldots, Y_n$, but a right-censored sample of iid random vectors $(Z_1, \delta_1), \ldots, (Z_n, \delta_n)$. Our approach is to use the $(Z, \delta)$-sample to construct another set of $n$ random variables, $\{W_{ni}, i = 1, \ldots, n\}$, and formulate a modified EL ratio function $R(\theta)$ as follows,

$$R(\theta) = \sup_{\{p_i\}} \Big\{ \prod_{i=1}^n n p_i \;\Big|\; \sum_{i=1}^n p_i = 1,\ \sum_{i=1}^n p_i W_{ni} = 0,\ p_i \ge 0,\ i = 1, 2, \ldots, n \Big\}. \tag{1.6}$$

The $W_{ni}$ given in (3.7) are constructed in a special way such that the asymptotic distribution of the corresponding $l(\theta)$, that is, $l(\theta) = -2 \log R(\theta)$, is the standard $\chi^2_1$ without involving any unknown scale parameter. Our proof of the asymptotic results requires some hard analysis due in part to the fact that the $W_{ni}$ are dependent random variables.
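For the complete-sample ratio (1.5) with $g(y, \theta) = y - \theta$, the constrained maximization can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the authors' code: the helper names `solve_lambda` and `el_statistic`, the bisection solver, and the toy data are all introduced here. It uses the standard Lagrange-multiplier form of the maximizer, $p_i = n^{-1}(1 + \lambda w_i)^{-1}$ with $w_i = g(y_i, \theta)$.

```python
import math

def solve_lambda(w, iters=200):
    """Root of sum_i w_i / (1 + lam * w_i) = 0 with all 1 + lam * w_i > 0."""
    if min(w) >= 0.0 or max(w) <= 0.0:
        raise ValueError("0 is not strictly inside the range of the w_i")
    # The score is strictly decreasing in lam on (-1/max(w), -1/min(w)).
    lo, hi = -1.0 / max(w), -1.0 / min(w)
    pad = 1e-10 * (hi - lo)
    lo, hi = lo + pad, hi - pad
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        score = sum(wi / (1.0 + mid * wi) for wi in w)
        if score > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def el_statistic(y, theta):
    """Return (-2 log R(theta), weights p_i) for the mean, g(y, theta) = y - theta."""
    w = [yi - theta for yi in y]
    lam = solve_lambda(w)
    weights = [1.0 / (len(w) * (1.0 + lam * wi)) for wi in w]
    stat = 2.0 * sum(math.log1p(lam * wi) for wi in w)
    return stat, weights

y = [1.2, 0.7, 3.1, 2.4, 0.9, 1.8]          # toy complete sample (hypothetical)
stat, weights = el_statistic(y, theta=2.0)
print(f"-2 log R(2.0) = {stat:.4f}")
```

The same solver applies unchanged to the modified ratio (1.6) once the values $g(y_i, \theta)$ are replaced by the $W_{ni}$; only the construction of the inputs differs.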
Our work is motivated by that of Wang and Jing (2001) and Qin and Zhao (2007). They used a modified $R(\theta)$ similar to (1.6) but with different $W_{ni}$, say $V_{ni}$. Their $V_{ni}$ (given in (3.13)) would result in a scaled $\chi^2_1$ distribution for the asymptotic distribution of the corresponding $l(\theta)$. The $W_{ni}$ in (1.6) is based on a special influence function $W(Z_i, \delta_i, \theta)$ of $\tilde\theta_n = \int_0^\infty g(y, \theta)\, dF_n(y)$ for given $\theta$. Our results generalize those of Wang and Jing (2001) and Qin and Zhao (2007) under a more relaxed assumption. Namely, throughout the article the only assumption we use is (1.3), which is stated as Assumption A in Section 3.
In comparison with several other recent results, we have the following observations. Faced with the same challenging or prohibitive computational problem of estimating the scale (or asymptotic variances/covariances) using other existing procedures, Zhou and Jeong (2011) find a way to avoid the estimation of asymptotic variances/covariances. Their method is different from ours. Their parameter of interest is also not identical to ours.

To test the null hypothesis $H_0: \theta = E g(Y)$ for a given $g$, Zhou and Jeong (2011) put the probability mass, $p_i$, on the $Z_i$, which is $Z_i = \min(Y_i, C_i)$ in our notation, and define the maximum EL, EL(KM), to be the maximum value of

$$EL = \prod_{i=1}^n \{p_i\}^{\delta_i} \Big\{ \sum_{Z_j > Z_i} p_j \Big\}^{1 - \delta_i},$$

where EL(KM) is the maximum over all $p_i$ and is attained at the KM estimator for right-censored data. Let maxEL be the maximum value of EL under the extra constraint $\sum_{i=1}^n g(Z_i)\, p_i = \theta$. These authors formulated the likelihood ratio maxEL/EL(KM) and proved that $W(\theta) = -2 \log\{\text{maxEL}/\text{EL(KM)}\}$ converges to the standard $\chi^2_1$. Pan and Zhou (2002) formulated a different EL ratio $ALR(\theta)$ and proved that $-2 \log ALR(\theta)$ converges in distribution to the standard $\chi^2_1$, but for the parameter $\theta = \int g(t)\, d\Lambda(t)$ defined in terms of the cumulative hazard function $\Lambda(t)$ of $F$. There are some recent results in regression analysis using the EL method. For right-censored observations and a $p$-dimensional regression parameter $\beta$ in the context of accelerated failure time models, Zhou, Kim, and Bathke (2012) formulated certain likelihood ratios as $R_{xy}(\beta)$. Under $H_0: \beta = \beta_0$, they showed that $-2 \log R_{xy}(\beta_0)$ converges in distribution to the standard $\chi^2_p$, and their method is applicable to the related censored quantile regression. In the Bayesian framework, Yang and He (2012) used the EL analysis for quantile regression. The application of our method to censored regression problems will be reported in a future publication.
In our article we use $R(\theta)$ defined in (1.6). The probability $p_i$ is put on $W_{ni}$. Maximization shows that $R(\theta)$ equals (3.11). In addition to the consideration of efficient estimation for which the $W_{ni}$ are responsible, (3.11) seems to be easier to investigate than $W(\theta)$ of Zhou and Jeong (2011), in which the KM estimator is used. We prove the weak convergence of $-2 \log R(\theta)$ in Theorem 3.2 for the parameter $\theta$ defined by $\int_0^\infty g(s, \theta)\, dF(s) = 0$. Our method yields an efficient estimating equation in the sense that $W_i$ in (3.2) has the minimum variance $\sigma^2$, as shown in Remark 3.1. The class of parameters considered in this article is more general than that of Pan and Zhou (2002) and Zhou and Jeong (2011). Our method can be used to test and construct confidence intervals for a parameter $\theta$ defined by $E g(Y, \theta) = 0$, while the method of Zhou and Jeong (2011) cannot.

The EL method based on influence functions was studied by Zheng, Zhao, and Yu (2012). They obtained a chi-squared limit for finite-dimensional nuisance parameters, and no proofs were given for infinite-dimensional nuisance parameters. Our nuisance parameters $F$ and $G$ are infinite dimensional. For right-censored data, the infinite-dimensional nuisance parameters $F$ and $G$ involved in the estimating functions cannot be dealt with by their profile EL. The proposed influence-function-based EL methods provide a way to retain the nonparametric Wilks property of a chi-squared limit.

The article is organized as follows. Preliminaries, notations, and examples of $\theta$ and $g(y, \theta)$ are given in Section 2. In Section 3, the construction of $W_{ni}$ from $W(Z_i, \delta_i, \theta)$ is carried out. It is shown in Theorem 3.1 that asymptotically $\int_0^\infty g(y, \theta)\, dF_n(y)$ is a partial sum of $n$ independent influence functions $W(Z_i, \delta_i, \theta)$, or is asymptotically linear. The weak convergence of $-2 \log R(\theta)$ to the standard $\chi^2_1$ distribution without any scale parameter is established in Theorem 3.2. Theorem 3.2 is used for the EL construction of confidence intervals for right-censored data. In Section 4, simulation comparisons of the new method with that of the scaled $\chi^2_1$ distribution are presented. If $\theta$ is the survival function $\bar F(y)$ at a fixed $y$ or the expected value $EY$, the coverage ratios computed from Wang and Jing (2001), Qin and Zhao (2007), and our EL method are about the same. For more complicated $\theta$, the proposed EL procedure performs better. The amount of improvement depends on the form of $\theta$. In Section 5, an example with real data is provided. The data are taken from Andrews and Herzberg (1985). The proofs of Theorems 3.1, 3.2, and Lemma 3.1 are relegated to the Appendix.
2. Preliminaries and Examples

For $Z = \min(Y, C)$, let

$$H(x) = P(Z \le x), \qquad b_H = \sup\{x : H(x) < 1\}, \tag{2.1}$$

so that $[0, b_H]$ is the range of $H$. The symbols $b_F$ and $b_G$ are similarly defined for $F$ and $G$.

Given a sample of $n$ iid random vectors $(Z_i, \delta_i)$, $i = 1, 2, \ldots, n$, of $(Z, \delta)$, their empirical distribution functions are given by

$$H_n^1(x) = \frac{1}{n} \sum_{j=1}^n I[Z_j \le x,\ \delta_j = 1], \qquad H_n^0(x) = \frac{1}{n} \sum_{j=1}^n I[Z_j \le x,\ \delta_j = 0],$$

$$H_n(x) = H_n^0(x) + H_n^1(x) = \frac{1}{n} \sum_{j=1}^n I[Z_j \le x]. \tag{2.2}$$

For any right continuous monotone function $\psi(x)$, let $\psi(x-)$ or $\psi_-(x)$ denote the left continuous version of $\psi(x)$, and let the curly brackets $\psi\{x\}$ denote the difference $\psi(x) - \psi(x-)$. Then $\psi\{x\} = d\psi(x)$. For any cumulative distribution function $F$, let $\bar F = 1 - F$.
Asymptotically optimal nonparametric estimators of $F(x)$ and $G(x)$ are the well-known KM estimators given by

$$F_n(x) = 1 - \prod_{s \le x} \Big( 1 - \frac{H_n^1\{s\}}{\bar H_n(s-)} \Big) \quad \text{and} \quad G_n(x) = 1 - \prod_{s \le x} \Big( 1 - \frac{H_n^0\{s\}}{\bar H_n(s-)} \Big), \tag{2.3}$$

respectively, where an empty product is set equal to one. It can be checked that for all $x$,

$$\bar H_n(x) = \bar F_n(x)\, \bar G_n(x). \tag{2.4}$$
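The product-limit formulas (2.3) and the factorization (2.4) can be checked numerically. The sketch below is an illustration under the assumption of distinct observed times (which holds with probability one for continuous $F$ and $G$); the function name `kaplan_meier` and the toy sample are introduced here for the example.

```python
def kaplan_meier(sample):
    """Return rows (z, Fbar_n(z), Gbar_n(z)) at the sorted observed times.

    sample: list of (z, delta) pairs with distinct z values.
    """
    data = sorted(sample)
    n = len(data)
    f_bar, g_bar = 1.0, 1.0
    rows = []
    for k, (z, delta) in enumerate(data):
        at_risk = n - k                        # equals n * Hbar_n(z-)
        f_bar *= 1.0 - delta / at_risk         # factor for F_n: jumps at deaths
        g_bar *= 1.0 - (1 - delta) / at_risk   # factor for G_n: jumps at censorings
        rows.append((z, f_bar, g_bar))
    return rows

sample = [(0.5, 1), (1.2, 0), (0.8, 1), (2.0, 1), (1.5, 0)]  # toy data (hypothetical)
rows = kaplan_meier(sample)
n = len(sample)
for k, (z, f_bar, g_bar) in enumerate(rows):
    h_bar = (n - k - 1) / n                    # Hbar_n(z) = fraction of Z_j > z
    print(f"z={z:.1f}  Fbar*Gbar={f_bar * g_bar:.4f}  Hbar={h_bar:.4f}")
```

Each observed time contributes a factor $1 - 1/(\text{number at risk})$ to exactly one of the two products, which is why their product telescopes to the empirical survival function of $Z$.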
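Applying (2.3) and (2.4), we get

$$dF_n(x) = \bar F_n(x-)\, \frac{dH_n^1(x)}{\bar F_n(x-)\, \bar G_n(x-)}. \tag{2.5}$$

It follows that

$$dH_n^1(x) = \bar G_n(x-)\, dF_n(x), \qquad dH_n^0(x) = \bar F_n(x-)\, dG_n(x). \tag{2.6}$$

Put

$$H^0(x) = P(Z \le x,\ \delta = 0), \qquad H^1(x) = P(Z \le x,\ \delta = 1). \tag{2.7}$$

Then

$$H^0(x) = \int_0^x \bar F(s)\, dG(s), \qquad H^1(x) = \int_0^x \bar G(s)\, dF(s). \tag{2.8}$$

Here and after, for simplicity, the integral sign $\int_a^b$ stands for $\int_{(a,b]}$ and $\int$ stands for $\int_0^\infty$.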
Examples of $\theta$ and $g(x, \theta)$

1. $g(x, \theta) = I[x > y] - \theta$ with $y$ fixed. Then solving the equation $E g(Y, \theta) = 0$ for $\theta$ (cf. (1.1)) yields $\theta = \bar F(y)$, the survival function of $Y$.
2. $g(x, \theta) = x^k - \theta$. Then $\theta = E Y^k$, the $k$th moment of $Y$.
3. $g(x, \theta) = (x - t_0 - \theta)\, I[x \ge t_0]$ with $t_0$ fixed. Then
$$\theta = E(Y - t_0 \mid Y \ge t_0) = \frac{E\{(Y - t_0)\, I[Y \ge t_0]\}}{P(Y \ge t_0)},$$
the mean residual life of $Y$.
4. $g(x, \theta) = x\,(I[x > y] - \theta)$. Then
$$\theta = \frac{1}{EY} \int_y^\infty s\, dF(s),$$
the length-biased survival function of $Y$. See Vardi (1982), for example.
5. $g(x, \theta) = x^2 - \theta x$. Then
$$\theta = \frac{1}{EY} \int x^2\, dF(x),$$
the mean of the length-biased lifetime.
6. $g(x, \theta) = x^2 - 2\theta x$. Then
$$\theta = \frac{1}{2\,EY} \int_0^\infty x^2\, dF(x),$$
the mean of the length-biased residual lifetime.
7. $g(x, \theta) = I[x \le \theta] - p$ with $p \in (0, 1)$. Then $\theta = F^{-1}(p)$, the $p$th quantile of $Y$.

Examples (4)-(6) often appear in renewal processes and their applications.

3. Empirical Likelihood Ratios and Confidence Intervals for $\theta$

The following assumption will be used throughout the article.

Assumption A. $F$ and $G$ are continuous, and $g^2(Y, \theta)/\bar G(Y)$ has finite moment

$$\int_0^\infty \frac{g^2(s, \theta)}{\bar G(s)}\, dF(s) < \infty.$$

To develop an EL inference procedure, for a specific $\theta$ and $g(x, \theta)$, let us define

$$\bar\xi(s) = \int_{x \ge s} g(x, \theta)\, dF(x), \qquad \tilde\theta = \int g(x, \theta)\, dF(x). \tag{3.1}$$

To obtain an estimating equation for the modified EL ratio in (1.6), we utilize the iid random functions in Theorem 6 of Akritas (2000) or He and Huang (2003),

$$W_i = W(Z_i, \delta_i, \theta) = \frac{g(Z_i, \theta)\, \delta_i}{\bar G(Z_i)} + \frac{1 - \delta_i}{\bar H(Z_i)}\, \bar\xi(Z_i) - \int \bar\xi(s)\, \frac{I[Z_i \ge s]}{\bar H^2(s)}\, dH^0(s). \tag{3.2}$$

The $W_i$ are functions of the random variables $Z_i$, $\delta_i$, and the parameter $\theta$. It can be calculated that

$$E\, W_i = E\, \frac{g(Z_i, \theta)\, \delta_i}{\bar G(Z_i)} = \tilde\theta, \tag{3.3}$$

$$\sigma^2 = \operatorname{var}(W_i) = \int \frac{g^2(s, \theta)}{\bar G(s)}\, dF(s) - \tilde\theta^2 - \int \frac{\bar\xi^2(s)}{\bar F(s)\, \bar G^2(s)}\, dG(s) = \int \frac{\big(\bar F(s)\, g(s, \theta) - \bar\xi(s)\big)^2}{\bar F^2(s)\, \bar G(s)}\, dF(s). \tag{3.4}$$

If the true parameter $\theta_0$ is the solution of the equation $E g(Y, \theta) = 0$, then

$$E\, W(Z_i, \delta_i, \theta_0) = \int g(x, \theta_0)\, dF(x) = 0. \tag{3.5}$$

Regarding $W_i$ for $i = 1, \ldots, n$ as a complete random sample, one could formulate an EL ratio $R(\theta_0)$ with multinomial probability $p_i$ assigned to $W_i$ and the constraint $\sum_{i=1}^n p_i W_i = 0$. To proceed, we shall replace the unknown distribution functions $F$, $G$, $H$, $H^0$ in (3.2) by their respective estimates $F_n$, $G_n$, $H_n$, $H_n^0$ given by (2.3) and (2.2). Likewise, $\bar\xi(x)$ will be replaced by the estimate,

$$\bar\xi_n(x) = \int_{s \ge x} g(s, \theta)\, dF_n(s). \tag{3.6}$$

We arrive at an approximation of $W_i$ in (3.2) as

$$W_{ni} = \frac{g(Z_i, \theta)\, \delta_i}{\bar G_n(Z_i)} + \frac{1 - \delta_i}{\bar H_n(Z_i-)}\, \bar\xi_n(Z_i) - \int \bar\xi_n(s)\, \frac{I[Z_i \ge s]}{\bar H_n^2(s-)}\, dH_n^0(s). \tag{3.7}$$

The price to pay for the approximation is that the $W_{ni}$ are not stochastically independent, which complicates the ensuing analysis.

The following theorem indicates the possibility of using $W_{ni}$ to construct the EL ratio and to obtain asymptotically a standard $\chi^2$ distribution.

Theorem 3.1. Let $W_{ni}$ be given by (3.7) and $E g(Y, \theta) = 0$. Then under Assumption A, as $n \to \infty$, we have

$$\frac{1}{\sqrt n} \sum_{i=1}^n W_{ni} = \sqrt n \int g(y, \theta)\, dF_n(y) = \frac{1}{\sqrt n} \sum_{i=1}^n W_i + o_p(1). \tag{3.8}$$

Following the literature, we call $W(Z_i, \delta_i, \theta)$ the $i$th influence function of $\int g(s, \theta)\, dF_n(s)$ for given $\theta$.

We define the modified EL ratio of $\theta$ by a multinomial likelihood subject to constraints as

$$R(\theta) = \sup_{\{p_i\}} \Big\{ \prod_{i=1}^n n p_i \;\Big|\; \sum_{i=1}^n p_i = 1,\ \sum_{i=1}^n p_i W_{ni} = 0,\ p_i \ge 0,\ i = 1, 2, \ldots, n \Big\}. \tag{3.9}$$

To determine $R(\theta)$, we solve, as usual, for the Lagrange multipliers $\lambda$ and $\gamma$ in

$$A = \sum_{i=1}^n \log(n p_i) - n \lambda \sum_{i=1}^n p_i W_{ni} - \gamma \Big( 1 - \sum_{i=1}^n p_i \Big).$$

Then $\gamma = -n$ and $p_i = \frac{1}{n} (1 + \lambda W_{ni})^{-1}$, $i = 1, 2, \ldots, n$, where $\lambda$ is the solution of the equation

$$h(\lambda) \equiv \frac{1}{n} \sum_{i=1}^n \frac{W_{ni}}{1 + \lambda W_{ni}} = 0. \tag{3.10}$$

Lemma 3.1 shows that with probability 1, for large $n$ the set $\{W_{ni}\}$ contains a positive and a negative value. It follows that for large $n$, (3.10) has a unique solution $\lambda_n$ such that $\lambda_n W_{ni} > -1$. Now, the EL ratio of $\theta$ can be written as

$$R(\theta) = \prod_{i=1}^n (n p_i) = \prod_{i=1}^n (1 + \lambda_n W_{ni})^{-1}. \tag{3.11}$$
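The construction of the $W_{ni}$, the first identity in Theorem 3.1, and the statistic $-2 \log R(\theta)$ from (3.11) can be sketched numerically as follows. This is a hedged illustration, not the authors' implementation. It assumes distinct observed times, evaluates $\bar H_n$ at left limits (number at risk divided by $n$) so that no denominator vanishes, and takes $g(x, \theta) = x - \theta$; the names `influence_values` and `solve_lambda`, the bisection solver, and the toy data are assumptions of this example.

```python
import math

def influence_values(sample, g):
    """Return ([W_n1, ..., W_nn] in sorted-Z order, integral of g dF_n)."""
    data = sorted(sample)                     # distinct Z values assumed
    n = len(data)
    f_left, g_left = 1.0, 1.0                 # Fbar_n(z-), Gbar_n(z-)
    jumps, g_lefts = [], []
    for k, (z, d) in enumerate(data):
        at_risk = n - k                       # n * Hbar_n(z-)
        g_lefts.append(g_left)
        jumps.append(f_left * d / at_risk)    # KM mass F_n{z}
        f_left *= 1.0 - d / at_risk
        g_left *= 1.0 - (1 - d) / at_risk
    g_vals = [g(z) for z, _ in data]
    # Tail integrals of g dF_n over [Z_(k), infinity).
    tail_int, acc = [0.0] * n, 0.0
    for k in range(n - 1, -1, -1):
        acc += g_vals[k] * jumps[k]
        tail_int[k] = acc
    w, cum = [], 0.0
    for k, (z, d) in enumerate(data):
        h_left = (n - k) / n                  # Hbar_n(z-)
        if d == 0:                            # dH_n^0 puts mass 1/n at censored times
            cum += tail_int[k] / (n * h_left ** 2)
        w.append(g_vals[k] * d / g_lefts[k] + (1 - d) * tail_int[k] / h_left - cum)
    return w, tail_int[0]

def solve_lambda(w, iters=200):
    """Bisection for the root of sum_i w_i / (1 + lam * w_i) = 0."""
    lo, hi = -1.0 / max(w), -1.0 / min(w)     # requires min(w) < 0 < max(w)
    pad = 1e-10 * (hi - lo)
    lo, hi = lo + pad, hi - pad
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if sum(wi / (1.0 + mid * wi) for wi in w) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Toy right-censored sample (hypothetical), tested at theta = 2.0.
sample = [(0.5, 1), (0.8, 1), (1.2, 0), (1.5, 0),
          (2.0, 1), (2.6, 1), (3.1, 0), (3.9, 1)]
theta = 2.0
w, integral = influence_values(sample, lambda z: z - theta)
lam = solve_lambda(w)
stat = 2.0 * sum(math.log1p(lam * wi) for wi in w)   # -2 log R(theta)
print(f"mean of W_ni = {sum(w) / len(w):.6f}, integral g dF_n = {integral:.6f}")
print(f"-2 log R({theta}) = {stat:.4f}")
```

In this sketch the sample mean of the $W_{ni}$ reproduces $\int g\, dF_n$ exactly (up to rounding), because the second and third terms of the $W_{ni}$ cancel when averaged over $i$.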
Theorem 3.2. Suppose that $\theta_0$ is the unique solution of $E g(Y, \theta) = 0$. Then under Assumption A, $l(\theta_0) = -2 \log R(\theta_0)$ converges in distribution to a $\chi^2_1$ random variable with one degree of freedom, as $n \to \infty$.

Applying Theorem 3.2, confidence intervals for $\theta$ can be constructed as

$$I_1 = \{\theta : l(\theta) \le c_{1-\alpha}\}, \tag{3.12}$$

where $c_{1-\alpha}$ is the $(1 - \alpha)$th quantile of the $\chi^2_1$ distribution. $I_1$ has asymptotic coverage probability of $1 - \alpha$, as $n \to \infty$.
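The cutoff in (3.12) is easy to obtain without special tables, since a chi-squared variable with one degree of freedom is the square of a standard normal variable. The sketch below uses only the Python standard library; the helper names `chi2_1_quantile` and `in_interval` are introduced here for illustration.

```python
from statistics import NormalDist

def chi2_1_quantile(level):
    """Quantile of chi-squared(1): squared standard normal quantile at (1 + level) / 2."""
    return NormalDist().inv_cdf(0.5 * (1.0 + level)) ** 2

def in_interval(l_theta, alpha=0.05):
    """True if theta belongs to the interval (3.12), given l(theta) = -2 log R(theta)."""
    return l_theta <= chi2_1_quantile(1.0 - alpha)

print(f"c_0.95 = {chi2_1_quantile(0.95):.4f}")
print(f"c_0.90 = {chi2_1_quantile(0.90):.4f}")
```

A confidence interval is then traced out by evaluating $l(\theta)$ over a grid or by root finding on $l(\theta) - c_{1-\alpha}$ on either side of the point estimate.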
The following lemma is needed for proving Theorem 3.2.

Lemma 3.1. Under the conditions of Theorem 3.2 and $\theta = \theta_0$, as $n \to \infty$,

(1) $\frac{1}{n} \sum_{i=1}^n (W_{ni} - W_i)^2 = o_p(1)$,

(2) $\max_{1 \le i \le n} |W_{ni}| = o_p(\sqrt n)$, $\ \frac{1}{n} \sum_{i=1}^n W_{ni}^2 = \sigma^2 + o_p(1)$.
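Remark 3.1. We are able to obtain the standard asymptotic $\chi^2_1$ distribution for $-2 \log R(\theta_0)$. This is because, according to Lemma 3.1, the asymptotic variance, $\sigma^2$, of $\frac{1}{\sqrt n} \sum_{i=1}^n W_{ni}$ is equal to the limit of $\frac{1}{n} \sum_{i=1}^n W_{ni}^2$.

In Qin and Zhao (2007), the constraint in their EL ratio is $\sum_{i=1}^n p_i V_{ni} = 0$, where

$$V_{ni} = \frac{g(Z_i, \theta)\, \delta_i}{1 - G_n(Z_i)} \tag{3.13}$$

is the estimate of $V = g(Z, \theta)\, \delta / (1 - G(Z))$. Theorem 3.1 remains true for the weights $V_{ni}$, that is,

$$\frac{1}{\sqrt n} \sum_{i=1}^n V_{ni} = \frac{1}{\sqrt n} \sum_{i=1}^n W_i + o_p(1),$$

and, therefore, the asymptotic variance of $\frac{1}{\sqrt n} \sum_{i=1}^n V_{ni}$ is $\sigma^2$ as given in (3.4). However, Lemma 3.1(1) does not hold for $\{V_{ni}\}$. The limit of $\frac{1}{n} \sum_{i=1}^n V_{ni}^2$ or $\frac{1}{n} \sum_{i=1}^n \big(V_{ni} - \frac{1}{n} \sum_{j=1}^n V_{nj}\big)^2$ is

$$\sigma_1^2 = \int \frac{g^2(s, \theta)}{\bar G(s)}\, dF(s) - \tilde\theta^2 > \sigma^2, \qquad \tilde\theta = E g(Y, \theta).$$

Therefore, a scale parameter $r = \sigma_1^2 / \sigma^2$ must be introduced to obtain an asymptotic distribution for $-2 \log(\text{EL ratio})$.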
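4. Simulation

Simulations are carried out to study and compare the finite-sample performance of confidence intervals $I_1$ in (3.12) derived from Theorem 3.2 and $I_2$ from the scaled $\chi^2_1$ distribution given by Wang and Jing (2001) and Qin and Zhao (2007).

Confidence intervals $I_2$ are calculated as follows. Let $F_n$ and $G_n$ be the KM estimators defined by (2.3). Suppose $\hat\theta$ is the unique solution of $\int g(s, \theta)\, dF_n(s) = 0$. Set

$$\xi_i = g(Z_i, \theta), \quad \hat\xi_i = g(Z_i, \hat\theta), \quad V_{ni} = \frac{\xi_i \delta_i}{1 - G_n(Z_i)}, \quad \hat V_{ni} = \frac{\hat\xi_i \delta_i}{1 - G_n(Z_i)},$$

$$\bar V_n = \frac{1}{n} \sum_{i=1}^n \hat V_{ni}, \quad \hat\sigma_1^2 = \frac{1}{n} \sum_{i=1}^n (\hat V_{ni} - \bar V_n)^2, \quad \hat r = \frac{\hat\sigma_1^2}{n\, \widehat{\operatorname{var}}(\text{jack})}, \tag{4.1}$$

where $n\, \widehat{\operatorname{var}}(\text{jack})$ is the modified jackknife estimator of the asymptotic variance of $\hat\theta$ given in Stute (1996). Then, the confidence interval for $\theta$ is

$$I_2 = \Big\{ \theta : 2 \hat r \sum_{i=1}^n \log(1 + \lambda V_{ni}) \le c_{1-\alpha} \Big\}, \tag{4.2}$$

where $\lambda$ is the solution of $\sum_{i=1}^n V_{ni} / (1 + \lambda V_{ni}) = 0$.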
Simulations were performed in two scenarios. In Scenario I, the parameter of interest is $\theta_0 = EY$, and in Scenario II, the mean residual lifetime.

Scenario I: The parameter of interest is $\theta_0 = EY$, and $g(x, \theta) = x - \theta$ is used for calculating $I_1$. Two cases (i) and (ii) were simulated:

(i) The lifetime $Y$ is uniformly distributed on (0, 1) and the censoring time $C$ is uniformly distributed on $(0, c)$. We selected $c = 2.5$ and $c = 1.3$, which correspond, respectively, to 20% and 30% censoring proportions.
(ii) $Y$ has a Weibull(1, 10) distribution and $C$ has an Exp($\eta$) distribution. Then for $\eta = 4.3$ and $\eta = 2.7$, the corresponding censoring proportions are 20% and 30%.

The simulated samples are $n$ iid copies of $Z = \min(Y, C)$, $\delta = I[Y \le C]$. Based on the simulated observations, confidence intervals $I_1$ derived from Theorem 3.2 and $I_2$ from (4.2) were calculated. The simulation of a sample of size $n$ was repeated $N$ times for $N = 2 \times 10^4$. The coverage proportions and the average width of the $N$ confidence intervals were calculated using the $N$ datasets. The results are summarized in Tables 1 and 2. The following are noted.

1. As the sample size $n$ increases from 20 to 80, all of the coverage proportions converge to the nominal level $1 - \alpha$.
2. For the Uniform(0, 1) distribution, $I_1$ has better coverage proportions. In 8/16 of the cases, the average width of $I_2$ is slightly shorter than that of $I_1$. In 8/16 of the cases, $I_2$ and $I_1$ have the same average width.
3. For the Weibull(1, 10) distribution, $I_1$ has better coverage proportion and width.

Scenario II: Let $g(x, \theta) = (x - t_0 - \theta)\, I[x \ge t_0]$. Then the parameter of interest is the mean residual life of $Y$, $\theta = E(Y - t_0 \mid Y \ge t_0)$, as considered in Qin and Zhao (2007) and in Example 3 of Section 2.

We used the same Weibull(1, 10) distribution for $Y$ and the same Exp($\eta$) distribution for the censoring variable $C$ as in Scenario I. As in Scenario I, a sample of $W_{ni}$, $i = 1, \ldots, n$, was simulated repeatedly $N$ times for $N = 2 \times 10^4$. The coverage proportions and the average width of the $N$ confidence intervals were calculated using the $N$ datasets. The results are summarized in Table 3 and Table 4. The following are noted.

1. All of the coverage proportions increase with the sample size $n$ and are close to the nominal levels. In fact, when the sample size is 80 (which is moderate), the coverage proportions are about the nominal level except when $P[Y \ge t_0] = 0.3$.
2. The coverage proportions of $I_1$ are much better than those of $I_2$.
3. In 15/32 of the samples, the average width of $I_2$ is slightly shorter than that of $I_1$. In 17/32 of the cases, $I_2$ and $I_1$ have the same average width.
Table 1. The coverage proportions for the true $\theta_0 = EY$. Layout: rows by censoring proportion (20%, 30%) and sample size $n$; columns give $I_2$ and $I_1$ under Uniform(0, 1) and Weibull(1, 10) lifetimes for each nominal value $1 - \alpha$. [Table entries not recoverable.]

Table 2. The average width of confidence intervals for $\theta_0 = EY$. Layout: rows by censoring proportion and sample size $n$; columns give $I_2$ and $I_1$ under Uniform(0, 1) and Weibull(1, 10) lifetimes for each nominal value $1 - \alpha$. [Table entries not recoverable.]

Table 3. The coverage proportion and average width of confidence intervals for $\theta_0 = E(Y - t_0 \mid Y \ge t_0)$ under the assumptions of $Y \sim$ Weibull(1, 10), a fixed censoring proportion, $1 - \alpha = 0.90$, and specified value for $P[Y \ge t_0]$. Layout: rows by sample size $n$ and CI ($I_2$, $I_1$); columns give the coverage proportion and the average width at $P(Y \ge t_0) = 0.90, 0.70, 0.50, 0.30$. [Table entries not recoverable.]

Table 4. The coverage ratio and average width of confidence intervals for $\theta_0 = E(Y - t_0 \mid Y \ge t_0)$, under the assumptions of $Y \sim$ Weibull(1, 10), a fixed censoring proportion, $1 - \alpha = 0.90$, and specified value for $P[Y \ge t_0]$. Layout: rows by sample size $n$ and CI ($I_2$, $I_1$); columns give the coverage proportion and the average width at $P(Y \ge t_0) = 0.90, 0.70, 0.50, 0.30$. [Table entries not recoverable.]
5. Example with Real Data

In this section, we apply our method to a well-known dataset taken from Andrews and Herzberg (1985). The data are the survival times of heart transplant patients in the Stanford Heart Transplant Program. The survival data from the Program have been analyzed and reanalyzed by many authors using different methods. A comparison of performances of several different methods was carried out in Miller and Halpern (1982) using the survival times of 184 patients who received a transplant among all of those patients admitted to the program during the period from October 1967 to February 1980. For each of the 184 patients, his/her survival time in days after transplant and an indicator of whether the patient is dead or alive were recorded. We are interested in the mean survival time after transplant, $\theta$. We shall use the same dataset to compute the confidence intervals for $\theta$.

Based on our method described in the simulations, the 95% confidence interval for $\theta$ is [1038, 1478]. The 95% confidence interval computed using the method of Wang and Jing (2001) is [1045, 1443]. A modified EM algorithm to calculate the EL for censored data is proposed by Zhou (2005). Using Zhou's method, the 95% confidence interval is [1033, 1507]. These results are quite similar.

For the sake of comparison, we computed the MLE of $\theta$, which is taken to be the maximizer of $R(\theta)$ defined in (1.6). The MLE of $\theta$ is $\hat\theta = 1232$ days. But keep in mind that the largest observation in the dataset equals 3695 days and is a censored lifetime. We treated it as an uncensored lifetime to compute the MLE.
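Appendix

Throughout this Appendix, $\theta$ is fixed and such that $E g(Y, \theta) = 0$. Then it is convenient to suppress $\theta$ in the exposition, by setting $\xi(y) = g(y, \theta)$. It follows that

$$E\, \xi(Y) = \int \xi(y)\, dF(y) = 0.$$

The following lemma will be used repeatedly.

Lemma A.1. For $b < b_H$, let $\{h_n(b)\}$ be a random sequence such that $h_n(b) \to h(b)$ in distribution as $n \to \infty$, and $h(b) = o_p(1)$ as $b \to b_H$. As $n \to \infty$, if $V_n = O_p(1)$ and the random sequence $\{S_n\}$ can be written as $S_n = o_p(1) + V_n h_n(b)$ for any $b < b_H$, then $S_n = o_p(1)$.

Proof. Put $\eta_n = o_p(1)$. By assumptions, for any $\varepsilon > 0$ and $\delta > 0$, there exist $M > 0$, $b < b_H$, and $n_0 > 1$ such that for $n \ge n_0$, $P(|V_n| \ge M) \le \varepsilon$, $P(|h_n(b)| \ge \delta/M) \le P(|h(b)| \ge \delta/M) + \varepsilon/2 \le \varepsilon$, and $P(|\eta_n| \ge \delta) < \varepsilon$. It follows that for $n \ge n_0$,

$$P(|S_n| \ge 2\delta) \le P(|\eta_n| \ge \delta) + P(|V_n h_n(b)| \ge \delta) \le \varepsilon + P(|V_n h_n(b)| \ge \delta,\ |h_n(b)| < \delta/M) + \varepsilon \le P(|V_n| \ge M) + 2\varepsilon \le 3\varepsilon. \tag{A.2}$$

As an example of Lemma A.1, put

$$h_n(b) = \int_b^{b_H} \frac{\xi^2(s)}{\bar G^2(s)}\, dH_n^1(s), \qquad b < b_H.$$

Then, by the SLLN and (2.8),

$$\lim_{n \to \infty} h_n(b) = \int_b^{b_H} \frac{\xi^2(s)}{\bar G^2(s)}\, dH^1(s) = h(b) \to 0, \quad \text{as } b \to b_H. \tag{A.1}$$

Proof of Theorem 3.1. The first equation follows from

$$\frac{1}{n} \sum_{i=1}^n W_{ni} = \frac{1}{n} \sum_{i=1}^n \Big[ \frac{\xi(Z_i)\, \delta_i}{\bar G_n(Z_i)} + \frac{1 - \delta_i}{\bar H_n(Z_i-)}\, \bar\xi_n(Z_i) - \int \bar\xi_n(s)\, \frac{I[Z_i \ge s]}{\bar H_n^2(s-)}\, dH_n^0(s) \Big]$$

$$= \int \frac{\xi(s)}{\bar G_n(s)}\, dH_n^1(s) + \int \frac{\bar\xi_n(s)}{\bar H_n(s-)}\, dH_n^0(s) - \int \bar\xi_n(s)\, \frac{\bar H_n(s-)}{\bar H_n^2(s-)}\, dH_n^0(s)$$

$$= \int \frac{\xi(s)}{\bar G_n(s)}\, dH_n^1(s) = \int \xi(s)\, dF_n(s).$$

The second equation follows from Theorem 6 in Akritas (2000) and the fact that $W_i$ equals $Z_i$ defined by (12) in Akritas (2000).

Proof of Lemma 3.1. The differences $W_{ni} - W_i$ in (3.7) and (3.2) can be expressed in terms of

$$\alpha_i = \frac{\xi(Z_i)\, \delta_i}{\bar G_n(Z_i)} - \frac{\xi(Z_i)\, \delta_i}{\bar G(Z_i)},$$

$$\beta_i = \frac{1 - \delta_i}{\bar H_n(Z_i-)}\, \bar\xi_n(Z_i) - \frac{1 - \delta_i}{\bar H(Z_i)}\, \bar\xi(Z_i),$$

$$\gamma_i = \int \bar\xi_n(s)\, \frac{I[Z_i \ge s]}{\bar H_n^2(s-)}\, dH_n^0(s) - \int \bar\xi(s)\, \frac{I[Z_i \ge s]}{\bar H^2(s)}\, dH^0(s),$$

as

$$W_{ni} - W_i = \alpha_i + \beta_i - \gamma_i.$$

Applying the elementary inequality $(a + b + c)^2 \le 3(a^2 + b^2 + c^2)$, we obtain

$$(W_{ni} - W_i)^2 = (\alpha_i + \beta_i - \gamma_i)^2 \le 3(\alpha_i^2 + \beta_i^2 + \gamma_i^2).$$

The lemma will be proven by showing that the sample means of $\alpha_i^2$, $\beta_i^2$, and $\gamma_i^2$ tend to zero in probability. The proofs will be presented in (A), (B), and (C) below.

(A) The sample mean of $\alpha_i^2$ is $o_p(1)$.

Proof: Let $G_n(x)$ be the KM estimator defined in (2.3) and $b < b_H$. Then as $n \to \infty$,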
Un sup
sb
|Gn (s) G(s)|

Gn (s)
= o p (1),
Vn
|Gn (s) G(s)|
sup
Gn (s)
smax{Zi }
= O p (1).
(A.3)
See Zhou (1992). To apply this result, we shall split the first integral in the following equation into two intervals [0, b] and (b, bH ],
accordingly. For Hn1 defined in (2.2), we have

(s)
1 2
=
n i=1 i
n
Gn (s)
b 2
(s)
2
Un
(s) 2
G(s)
dHn1 (s)
dHn1 (s)
+ Vn2
2 (s)
bH
where + and are the positive and negative part of . Define

monotone functions:

n (x) =
(s) dFn (s), (x) =
(s) dF (s).
sx
sx
The monotone n (x) converges to continuous (x) almost

surely for x [0, bH ] as shown by Stute and Wang (1993). The convergence is uniform on [0, bH ]. From these we conclude
sup |n (x) (x)| = o(1), a.s.
(A.7)
0xbH
dHn1 (s)
b
G (s)
G (s)
= o p (1)O p (1) + O p (1)hn (b) = o p (1), as n ,
0
and for b < bH ,
(A.4)
sup
sb
(s)
(s) 2
n
0, a.s.
H n (s) H(s)
where
Applying (a + b)2 2a2 + 2b2 and Lemma A.1, we have
bH
hn (b) =
(s)
2
G (s)
dHn1 (s).
(A.5)
By (A.1), hn (b) satisfies Lemma A.1. The proof of (A) follows.

(B) The sample mean of i2 is o p (1).

To split the integral that defines i i2 /n, let us define
n2 (s)
Tn (b, t] =
Sn (b, t] =
b
n2 (s) =
2 (u) dFn (u),
D(a, t] =
us

dGn (s)
2
2 (u) dFn (u) F n (s) 2
b
us
H n (s)
bH
1
2 (u) dFn (u) d
Gn (s)
b
us
bH
bH 2
1
(s)
dFn (s)
lim
2 (u) dFn (u) +
sbH Gn (s) s
Gn (s)
b
bH
1
1
1 2 2
+
(s) dHn1 (s)
2
Gn (s) G(s)
G(s)
b
(A.6)
o p (1) + O p (1)hn (b).
The first inequality follows from (2.6), the second and the third
from integration by parts, and the last equality from (a + b)2
2(a2 + b2 ), (A.3), and (A.1).
By the same token, we conclude that Sn (b, bH ] = o p (1) +
O p (1)hn (b).
Write
= + , n (x) =
+ (s) dFn (s)

sx
(s) 2
n (s)
dHn0 (s)
H n (s) H(s)
and Fn and Gn have no common jumps. It follows that
b
n (s) dHn0 (s)
H n (s)
(s) dH 0 (s)
2
H (s)
Tn (b, bH ]
(s)
(s) 2
n
dHn0 (s)
H n (s) H(s)
(C) The sample mean of i2 is o p (1).

Proof: To split the integral that defines i i2 /n, let us define
Bn (a, t] =

2
(u) dFn (u) F n (s)
bH
bH
= o p (1) + O p (1)hn (b) + O p (1)hn (b) = o p (1).(A.8)

+
+2Tn (b, bH ] + 2Sn (b, bH ]

dHn0 (s),
H n (s)
2 (s)
dHn0 (s), b < bH .
2
H (s)
us
Observe that

1 2
=
n i=1 i

sx
(s) dFn (s),
, Bn (a, t] =
|n (s)| dHn0 (s)

2
H n (s)
Then, for b < bH , we have

2n
bH

B2n (b, t] dHn (t )
H n (b)B2n (b, b] + 2
0+2

b
bH
bH
bH
B2n (b, t] d(H n (t ))
H n (t)Bn (b, t]
1/2
2
Bn (b, t] dHn (t )
= n 2[Tn (b, bH )]
bH
1/2
|n (t )|
2
H n (t)
n2 (t )
2
H n (t)
dHn0 (t )
1/2
dHn0 (t )
(A.9)
The second term in the first inequality is obtained using the

LebesgueStieltjes integration by parts and Bn (t) Bn (t ).
Applying (A.9), we get
bH
B2n (b, t] dHn (t ) = 2n 4Tn (b, bH ] = o p (1) + O p (1)hn (b).
Similarly, for S(b, bH ] = ESn (b, bH ], we have

bH
b
D2 (b, t] dHn (t ) 4S(b, bH ] + o p (1) = o p (1) + O p (1)hn (b).
Applying (A.7), we conclude that for any b < bH , with probability 1,

Bn (0, t] D(0, t] uniformly on [0, b]. It follows that Bn (0, b]
D(0, b], and
1
n
n

i2 = o p (1) +
i=1
bH
2 log R(0 ) = 2
Bn (0, b] D(0, b] + Bn (b, t]
D(b, t] dHn (t )
bH

B2n (b, t] dHn (t ) + 4
o p (1) + 4
b
=
bH
D2 (b, t ) dHn (t )
= o p (1) + O p (1)hn (b) + O p (1)hn (b) = o p (1).

Finally, we prove the second assertion of Lemma 3.1.
Since the $W_i$ are iid random variables with zero mean and finite variance $\sigma^2$, we have $\max_{1\le i\le n}|W_i| = o_p(\sqrt{n})$. It follows from assertion (1)
of the lemma that
$$\max_{1\le i\le n}|W_{ni}| \le \Bigl(\max_{1\le i\le n}|W_{ni} - W_i|^2\Bigr)^{1/2} + \max_{1\le i\le n}|W_i| = o_p(\sqrt{n}). \tag{A.10}$$
Note that $W_{ni}^2$ is bounded by $W_i^2 + (W_i - W_{ni})^2 \pm 2|W_i(W_i - W_{ni})|$. By assertion (1) of the lemma, we get
$$\frac{1}{n}\sum_{i=1}^{n} W_{ni}^2 = \sigma^2 + o_p(1).$$
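Proof of Theorem 3.2. The proof is similar to that of Theorem 3.2
in Chapter 3 of Owen (2001). Setting
$$X_i = \lambda_n W_{ni}, \qquad \bar W_n = \frac{1}{n}\sum_{i=1}^{n} W_{ni}, \qquad S_n^2 = \frac{1}{n}\sum_{i=1}^{n} W_{ni}^2,$$
we have $S_n^2 = \sigma^2 + o_p(1)$, $\bar W_n = O_p(n^{-1/2})$,
$$\frac{1}{n}\sum_{i=1}^{n} \frac{X_i^2}{1 + X_i} = \bar X_n - \lambda_n h(\lambda_n) = \lambda_n \bar W_n,$$
and
$$\lambda_n^2 S_n^2 = \frac{1}{n}\sum_{i=1}^{n} X_i^2 \le \frac{1}{n}\sum_{i=1}^{n} \frac{X_i^2}{1 + X_i}\Bigl(1 + \max_{1\le j\le n}|X_j|\Bigr) = \lambda_n \bar W_n + \lambda_n^2\, o_p(1). \tag{A.11}$$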
It follows that
$$\lambda_n = \frac{\bar W_n}{\sigma^2 + o_p(1)} = O_p(n^{-1/2})$$
and
$$\bar W_n = \lambda_n \sigma^2 + o_p(n^{-1/2}).$$
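The passage from (A.11) to the rate of the multiplier is compressed, so a short worked sketch may help. It assumes the multiplier is written $\lambda_n$, that $1 + X_i > 0$ for every $i$ (positivity of the EL weights), and that $\max_{1\le j\le n}|W_{nj}| = o_p(n^{1/2})$ and $\bar W_n = O_p(n^{-1/2})$ as stated earlier in the proof; it is a reading of the argument, not a quotation of it.

```latex
% Sketch: why (A.11) yields \lambda_n = O_p(n^{-1/2}).
\begin{align*}
\lambda_n \bar W_n
  &= \frac{1}{n}\sum_{i=1}^{n} \frac{X_i^2}{1+X_i} \;\ge\; 0
  && \text{(so } \lambda_n \bar W_n = |\lambda_n|\,|\bar W_n| \text{)}, \\
\lambda_n \bar W_n \max_{1\le j\le n}|X_j|
  &= \lambda_n^2\,|\bar W_n|\,\max_{1\le j\le n}|W_{nj}|
   = \lambda_n^2\, O_p(n^{-1/2})\, o_p(n^{1/2})
   = \lambda_n^2\, o_p(1), \\
\lambda_n^2\bigl(\sigma^2 + o_p(1)\bigr)
  &\le |\lambda_n|\,|\bar W_n|
  \;\Longrightarrow\;
  |\lambda_n| \le \frac{|\bar W_n|}{\sigma^2 + o_p(1)} = O_p(n^{-1/2}).
\end{align*}
```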
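Applying Lemma 3.1, we have
$$\sum_{i=1}^{n} |X_i|^3 \le |\lambda_n|^3 \sum_{i=1}^{n} |W_{ni}|^2 \max_{1\le j\le n}|W_{nj}| = o_p(1). \tag{A.12}$$
Now, using Lemma 3.1, Taylor's expansion and Theorem 3.1, we
can write
$$-2\log R(\theta_0) = 2\sum_{i=1}^{n} \ln(1 + \lambda_n W_{ni}) = 2\sum_{i=1}^{n}\Bigl(\lambda_n W_{ni} - \tfrac{1}{2}\lambda_n^2 W_{ni}^2 + \eta_i\Bigr) \qquad (\text{here } |\eta_i| \le |X_i|^3)$$
$$= 2n\Bigl[\lambda_n \bar W_n - \tfrac{1}{2}\lambda_n^2\bigl(\sigma^2 + o_p(1)\bigr)\Bigr] + o_p(1) = \frac{n \bar W_n^2}{\sigma^2} + o_p(1) \to \chi_1^2, \quad \text{in distribution}.$$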
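The expansion in this proof can be checked numerically. The sketch below is only an illustration: it assumes the estimating values $W_{n1}, \dots, W_{nn}$ have already been computed and are passed in as a plain list, the function names `el_multiplier`, `neg2_log_el_ratio` and `quadratic_approx` are hypothetical, and bisection is used for the Lagrange multiplier purely for simplicity (the article does not prescribe a solver). It compares the exact statistic $2\sum_i \ln(1 + \lambda_n W_{ni})$ with the leading term $n\bar W_n^2 / S_n^2$.

```python
import math


def el_multiplier(w, tol=1e-12, max_iter=200):
    """Solve h(lam) = sum(w_i / (1 + lam * w_i)) = 0 by bisection (assumed solver)."""
    w_min, w_max = min(w), max(w)
    if not (w_min < 0.0 < w_max):
        raise ValueError("zero must lie strictly between min(w) and max(w)")
    # All weights stay positive iff 1 + lam * w_i > 0, i.e. lam in (-1/w_max, -1/w_min).
    lo, hi = -1.0 / w_max, -1.0 / w_min
    pad = 1e-10 * (hi - lo)
    lo, hi = lo + pad, hi - pad

    def h(lam):
        return sum(x / (1.0 + lam * x) for x in w)

    # h is strictly decreasing on the interval: positive near lo, negative near hi.
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if h(mid) > 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def neg2_log_el_ratio(w):
    """Exact statistic 2 * sum(log(1 + lam * w_i)) at the solved multiplier."""
    lam = el_multiplier(w)
    return 2.0 * sum(math.log1p(lam * x) for x in w)


def quadratic_approx(w):
    """Leading term n * mean(w)^2 / mean(w^2) from the Taylor expansion."""
    n = len(w)
    w_bar = sum(w) / n
    s2 = sum(x * x for x in w) / n
    return n * w_bar ** 2 / s2


# Deterministic toy values standing in for W_n1, ..., W_nn (not data from the article).
n = 400
w_demo = [-1.0 + 2.0 * (i + 0.5) / n + 0.05 for i in range(n)]
exact = neg2_log_el_ratio(w_demo)
approx = quadratic_approx(w_demo)
print(f"-2 log R = {exact:.4f}, n * mean^2 / S^2 = {approx:.4f}")
```

The two printed numbers should be close, and the gap is controlled by the cubic remainder bounded in the proof. In practice a Newton-type solver would usually be faster than bisection; bisection is chosen here because the sign change of `h` on the feasible interval is guaranteed.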
Acknowledgments
The authors are grateful to the associate editor and referees for their valuable
comments and for providing missing references, which led to an improvement
of the presentation of our work and a simplification of the proof of Theorem
3.1.
Funding
This work was done when Yang was visiting the School of Mathematical Sciences, Xiamen University. Research of He, Liang, and Shen was supported
by the National Natural Science Foundation of China (11171230, 11231010).
Research of Yang was supported in part by NSF grant DMS-1209111.
References
Akritas, M. G. (2000), "The Central Limit Theorem Under Censoring," Bernoulli, 6, 1109–1120. [649,652]
Andrews, D. F., and Herzberg, A. M. (1985), DATA: A Collection of Problems from Many Fields for the Student and Research Worker, Berlin: Springer, pp. 45–49. [648,652]
Chen, S. X. (1994), "Empirical Likelihood Confidence Intervals for Linear Regression Coefficients," Journal of Multivariate Analysis, 49, 24–40. [646]
DiCiccio, T. J., Hall, P., and Romano, J. P. (1991), "Empirical Likelihood is Bartlett-Correctable," Annals of Statistics, 19, 1053–1061. [646]
He, S. Y., and Huang, X. (2003), "Central Limit Theorem of Linear Regression Model Under Right Censorship," Science in China, 46, 600–610. [649]
Hjort, N. L., McKeague, I. W., and van Keilegom, I. (2009), "Extending the Scope of Empirical Likelihood," Annals of Statistics, 37, 1079–1111. [647]
Li, G., and Wang, Q. H. (2003), "Empirical Likelihood Methods for Linear Regression Analysis of Right Censored Data," Statistica Sinica, 13, 51–68. [647]
Miller, R. G., and Halpern, J. W. (1982), "Regression With Censored Data," Biometrika, 69, 521–531. [652]
Owen, A. B. (1988), "Empirical Likelihood Ratio Confidence Intervals for a Single Functional," Biometrika, 75, 237–249. [646]
Owen, A. B. (2001), Empirical Likelihood, London: Chapman and Hall. [646,647,654]
Pan, X. R., and Zhou, M. (2002), "Empirical Likelihood Ratio in Terms of Cumulative Hazard Function for Right Censored Data," Journal of Multivariate Analysis, 80, 166–188. [647]
Qin, G. S., and Zhao, Y. C. (2007), "Empirical Likelihood Inference for the Mean Residual Life Under Random Censorship," Statistics and Probability Letters, 77, 549–557. [647,648,650]
Stute, W. (1996), "The Jackknife Estimate of Variance of a Kaplan-Meier Integral," Annals of Statistics, 24, 2679–2704. [646,650]
Stute, W., and Wang, J. L. (1993), "The Strong Law Under Random Censorship," Annals of Statistics, 21, 1591–1607. [653]
Thomas, D. R., and Grunkemeier, G. L. (1975), "Confidence Interval Estimation of Survival Probabilities for Censored Data," Journal of the American Statistical Association, 70, 865–871. [646]
Vardi, Y. (1982), "Nonparametric Estimation in the Presence of Length Bias," Annals of Statistics, 10, 616–620. [648]
Wang, Q. H., and Jing, B. Y. (2001), "Empirical Likelihood for a Class of Functions of Survival Distribution With Censored Data," Annals of the Institute of Statistical Mathematics, 53, 517–527. [647,648,650,652]
Yang, S. (1994), "A Central Limit Theorem for Functionals of the Kaplan-Meier Estimator," Statistics and Probability Letters, 21, 337–345. [646]
Yang, Y., and He, X. M. (2012), "Bayesian Empirical Likelihood for Quantile Regression," Annals of Statistics, 40, 1102–1131. [647]
Zheng, M., Zhao, A. Q., and Yu, W. (2012), "Empirical Likelihood Methods Based on Influence Functions," Statistics and Its Interface, 5, 355–366. [646,648]
Zhou, M. (1992), "Asymptotic Normality of the Synthetic Data Regression Estimator for Censored Survival Data," Annals of Statistics, 20, 1002–1021. [653]
Zhou, M. (2005), "Empirical Likelihood Ratio With Arbitrarily Censored/Truncated Data by EM Algorithm," Journal of Computational and Graphical Statistics, 14, 643–656. [652]
Zhou, M., and Jeong, J. H. (2011), "Empirical Likelihood Ratio for Median and Mean Residual Lifetime," Statistics in Medicine, 30, 152–159. [647]
Zhou, M., Kim, M. O., and Bathke, A. C. (2012), "Empirical Likelihood Analysis for the Heteroscedastic Accelerated Failure Time Model," Statistica Sinica, 22, 295–316. [647]

