Académique Documents
Professionnel Documents
Culture Documents
COMPARISON OF SURVIVAL
CURVES
Why nonparametric?
• fairly robust
• efficient relative to parametric tests
• often simple and intuitive
1
1.0
0.8
0.6
Survival
0.4
0.2
0.0
0 5 10 15 20 25 30 35
Time
2
There are many ways to form a basis for compar-
ison:
3
General Framework for Survival Analysis
(for right-censored data)
4
More generally though, it is useful to build a model that
characterizes the relationship between survival and all of the
covariates of interest.
5
Two sample tests
The logrank test is the most well known and widely used.
6
2 × 2 Tables
Event
Group Yes No Total
0 d0 n0 − d0 n0
1 d1 n1 − d1 n1
Total d n−d n
7
Therefore, under H0:
[d0 − n0 d/n]2 approx. 2
χ2M H = n n d(n−d) ∼ χ1
0 1
n2 (n−1)
Note: Pearson’s χ2 test was derived when only the row mar-
gins were considered fixed, and thus the variance in the de-
nominator was replaced by:
n0 n1 d(n − d)
V ar(d0) =
n3
8
Example: Leukemia data, just counting the number of
relapses in each treatment group.
Fail
Group Yes No Total
0 21 0 21
1 9 12 21
Total 30 12 42
But, this doesn’t account for the length of survival time itself.
9
K × 2 × 2 Tables
Event
Group Yes No Total
0 d0j n0j − d0j n0j
1 d1j n1j − d1j n1j
Total dj nj − dj nj
PK
{ j=1 (d0j − n0j ∗ dj /nj )}2
χ2CM H = PK
j=1 n1j n0j dj (nj − dj )/[n2j (nj − 1)]
10
How does this apply in survival analysis?
Note that we listed Group 1 for times 1-5 even though there were
no events or censorings at those times.
11
Logrank Test: Formal Definition
Die/Fail
Group Yes No Total
0 d0j r0j − d0j r0j
Total dj rj − dj rj
where d0j and d1j are the number of failures in group 0 and
1, respectively at the j-th failure time, and r0j and r1j are
the number at risk at that time, in groups 0 and 1.
12
First several tables of leukemia data
Frequency| Frequency|
Expected | 0| 1| Total Expected | 0| 1| Total
---------+--------+--------+ ---------+--------+--------+
0 | 19 | 2 | 21 0 | 16 | 1 | 17
| 20 | 1 | | 16.553 | 0.4474 |
---------+--------+--------+ ---------+--------+--------+
1 | 21 | 0 | 21 1 | 21 | 0 | 21
| 20 | 1 | | 20.447 | 0.5526 |
---------+--------+--------+ ---------+--------+--------+
Total 40 2 42 Total 37 1 38
Frequency| Frequency|
Expected | 0| 1| Total Expected | 0| 1| Total
---------+--------+--------+ ---------+--------+--------+
0 | 17 | 2 | 19 0 | 14 | 2 | 16
| 18.05 | 0.95 | | 15.135 | 0.8649 |
---------+--------+--------+ ---------+--------+--------+
1 | 21 | 0 | 21 1 | 21 | 0 | 21
| 19.95 | 1.05 | | 19.865 | 1.1351 |
---------+--------+--------+ ---------+--------+--------+
Total 38 2 40 Total 35 2 37
13
The logrank test is:
PK 2
[ j=1 (d0j − r0j ∗ dj /rj )]
χ2logrank = PK r1j r0j dj (rj −dj )
j=1 [r2 (rj −1)]
j
14
Calculating logrank statistic step-by-step
Leukemia Example:
oj = d0j
ej = dj r0j /rj
vj = r1j r0j dj (rj − dj )/[rj2(rj − 1)]
2 (10.251)2
χlogrank = = 16.793
6.257
15
Notes about logrank test:
16
Analogous to the CMH test for a series of tables, the logrank
test is most powerful when the “odds ratios” are constant
over time points. That is, it is most powerful for propor-
tional hazards:
λ1(t) = αλ0(t).
This is also equivalent to S0(t) = S1(t)α . (Why?)
17
Left Truncated Logrank Test
That is, the risk set now consists of subjects who have
entered the study, and have not failed or been censored by a
certain time.
Still let r0j be the number at risk from group 0, r1j be the
number at risk from group 1, and everything else remains
the same.
18
Gehan’s Generalized Wilcoxon Test
where
n(m + n + 1)
E(R1) =
2
mn(m + n + 1)
Var(R1) =
12
19
The Mann-Whitney form of the Wilcoxon is defined as:
+1 if Xi > Yj
U (Xi, Yj ) = Uij = 0 if Xi = Yj
−1 if Xi < Yj
and m
n X
X
U= Uij .
i=1 j=1
so U = 2R1 − n(m + n + 1)
Therefore,
E(U ) = 0
20
Extending Wilcoxon to censored data
Then define n X
m
X
W = Uij
i=1 j=1
21
[Reading:] First, pool the sample of (n + m) observations
into a single group, then compare each individual with the
remaining n + m − 1:
Then
m+n
X
Ui = Uij
j=1
= W
This puts it in the form of a linear rank test, from which
the log-rank test got its name.
22
Under H0, U = W has mean 0 and variance estimated by
m+n
mn X
Var(U ) =
c Ui2
(m + n)(m + n − 1) i=1
23
Example: (leukemia data)
| Events Sum of
trt | observed expected ranks
------+--------------------------------------
0 | 21 10.75 271
1 | 9 19.25 -271
------+--------------------------------------
Total | 30 30.00 0
chi2(1) = 13.46
Pr>chi2 = 0.0002
where wj = rj .
24
Weighted Log-rank Tests
This general class of tests is like the logrank test, but with
weights wj . The logrank test, Gehan’s Wilcoxon test, Peto-
Prentice’s Wilcoxon and more are included as special cases:
PK 2
{ j=1 w j (d1j − r1j · dj /r j )}
χ2w = PK wj2r1j r0j dj (rj −dj )
l=1 rj2 (rj −1)
Test Weight wj
Logrank wj = 1
Gehan’s Wilcoxon wj = rj
Peto/Prentice wj = Ŝ(tj )
25
Which test should we use?
• All the tests have the correct Type I error for testing the
null hypothesis of equal survival, Ho : S1(t) = S2(t)
• The choice of which test may therefore depend on the
alternative hypothesis, which will drive the power of the
test.
• The Wilcoxon is sensitive to early differences be-
tween survival, while the logrank is most powerful under
proportional hazards. Compare the relative weights they
assign to the test statistic:
X
LOGRANK numerator = (oj − ej )
j
X
WILCOXON numerator = rj (oj − ej )
j
X
or = Ŝ(tj )(oj − ej )
j
• The Wilcoxon also has high power when the failure times
are log-normally distributed, with equal variance in both
groups but a different mean. It will turn out that this
is one special case of the accelerated failure time (AFT)
model.
• The Wilcoxon with weights Ŝ(tj ) is most powerful under
the alternative hypothesis of log-logistic model.
• All the above weighted tests will lack power if the haz-
ards “cross”.
26
Counting Process Notation
27
For the two-sample problem: l = 1, 2,
Pnl Pnl
let Nl (t) = i=1 Nli(t), Yl (t) = i=1 Yli(t).
28
Asymptotic Relative Efficiency (ARE)
29
Example: recall Wald test based on MLE (TSH section 12.4)
√ D.
Under H0, n(θ̂n − θ0) → N (0, I(θ0)−1), where I(θ0) is the
Fisher information, and is estimated by σ̂n−2.
30
Now assume two sample survival data with sizes n1 and n2.
Let n = n1 +n2. Assume that limn→∞ nl /n = al , 0 < al < 1
for l = 1, 2.
Let H0 : S1 = S2 = S.
31
µρ/σρ is called the Pitman efficiency.
Let
r
n2
θ1n = θ0 + ,
n1(n1 + n2)
r
n1
θ2n = θ0 − .
n2(n1 + n2)
√
Then θln → θ0 at 1/ n rate.
32
Theorem 2.3 (H+F): √ For θln (l = 1, 2) defined above,
the test based on Gρ/ V has maximum (Pitman) efficiency
against the contiguous alternatives Sln(t) (also defined above,
and depends on ρ) among all tests of the form
Z ∞
dN1(u) dN2(u)
K(u) − ,
0 Y 1 (u) Y2 (u)
where K() is any stochastic process satisfying regularity con-
ditions.
33
Corollary: From eq.(*)
ρ = 0 =⇒ Log-rank test is the most efficient when λ2(t)=λ1(t)e∆;
ρ = 1 =⇒ Wilcoxon (Peto’s) log-rank test is the most effi-
cient under logistic shift alternative.
34
[Reading:] P -sample and stratified logrank tests
P -sample logrank
35
Suppose we observe data from P different groups, and the
data from group p (p = 1, ..., P ) are:
(Xp1, δp1) . . . (Xpnp , δpnp )
Die/Fail
Group Yes No Total
1 d1j r1l − d1j r1j
. . . .
P dP j rP j − dP j rP j
Total dj rj − dj rj
36
Some formal calculations:
v11j v12j ... v1(P −1)j
v22j ... v2(P −1)j
Vj =
... ... ...
v(P −1)(P −1)j
37
The resulting χ2 test for a single (P × 2) table would have
(P − 1) degrees of freedom and is constructed as follows:
(Oj − Ej )T Vj−1 (Oj − Ej )
Generalizing to K tables
(O − E)T V−1 (O − E)
38
Example:
Noise Level
Group Group Group
1 2 3
9.0 10.0 12.0
9.5 12.0 12+
9.0 12+ 12+
8.5 11.0 12+
10.0 12.0 12+
10.5 10.5 12+
39
Log-rank test for equality of survivor functions
------------------------------------------------
| Events
group | observed expected
------+-------------------------
1 | 6 1.57
2 | 5 4.53
3 | 1 5.90
------+-------------------------
Total | 12 12.00
chi2(2) = 20.38
Pr>chi2 = 0.0000
| Events Sum of
group | observed expected ranks
------+--------------------------------------
1 | 6 1.57 68
2 | 5 4.53 -5
3 | 1 5.90 -63
------+--------------------------------------
Total | 12 12.00 0
chi2(2) = 18.33
Pr>chi2 = 0.0001
40
Stratified Logrank
41
The approach behind stratified logrank:
Die/Fail
X Yes No Total
1 ds1j rs1j − ds1j rs1j
42
Let Os be the sum of the “o”s obtained by applying the
logrank calculations in the usual way to the data from group
s. Similarly, let Es be the sum of the “e”s, and Vs be the
sum of the “v”s.
PS
s=1 (Os − Es )
Z= q PS
s=1 (Vs )
43
Nursing home example (using Stata):
. use nurshome
. gen age1=0
| Events
age1 | observed expected(*)
------+-------------------------
0 | 795 764.36
1 | 474 504.64
------+-------------------------
Total | 1269 1269.00
chi2(1) = 3.22
Pr>chi2 = 0.0728
44