
CHAPTER 4: Bootstrap Methods


MAST90083 Computational Statistics & Data Mining
Guoqi Qian
School of Mathematics & Statistics
The University of Melbourne


Outline
4.1 The bootstrap principle
4.2 Basic methods
4.2.1 Nonparametric and parametric bootstrap
4.2.2 Bootstrapping samples in regression
4.2.3 Bootstrap estimation of bias($\hat\theta$) and se($\hat\theta$)


4.3 Bootstrap inference
4.3.1 Computing bootstrap confidence intervals
4.3.2 Percentile and pivoting methods
4.3.3 Bootstrap hypothesis testing
4.4 Reducing Monte Carlo error
4.4.1 Balanced bootstrap
4.4.2 Antithetic bootstrap

Distributions of correlation coefficient (1)


- As a motivating example, consider the correlation coefficient $\rho$ of a bivariate random vector $(X, Y)$, which is defined as
  $$\rho = \frac{E(XY) - E(X)E(Y)}{\sigma_X \sigma_Y}.$$
- The sample correlation coefficient is
  $$r = \frac{\sum_{i=1}^n X_i Y_i - n\bar X \bar Y}{\sqrt{\sum_{i=1}^n (X_i - \bar X)^2}\,\sqrt{\sum_{i=1}^n (Y_i - \bar Y)^2}},$$
  based on i.i.d. samples $(X_1, Y_1), \dots, (X_n, Y_n)$.
- One can use $r$ to estimate $\rho$. Evaluating the variability in this estimation would require the sampling distribution of $r$.
- When $(X, Y)$ is bivariate normal and $\rho = 0$, the distribution of $r$ is quite simple. It is extremely complicated otherwise.


Distributions of correlation coefficient (2)


Theorem (Distribution of r; Fisher, 1915)
Let $(X, Y)$ be bivariate normal with correlation coefficient $\rho$. Then the sample correlation coefficient $r$ has the pdf
$$f_R(r) = \frac{(n-2)\,\Gamma(n-1)\,(1-\rho^2)^{\frac{n-1}{2}}\,(1-r^2)^{\frac{n-4}{2}}\,(1-\rho r)^{\frac{3}{2}-n}}{\sqrt{2\pi}\,\Gamma(n-\tfrac{1}{2})}\; {}_2F_1\!\left(\tfrac{1}{2}, \tfrac{1}{2};\, n-\tfrac{1}{2};\, \tfrac{1+\rho r}{2}\right), \qquad (1)$$
where ${}_2F_1$ denotes the ordinary hypergeometric function.

- In particular, if $X$ and $Y$ are independent, then $\dfrac{\sqrt{n-2}\,r}{\sqrt{1-r^2}} \sim t_{n-2}$.
- Various methods, including asymptotic and Monte Carlo ones, have been developed to approximate the distribution of $r$ when it is intractable. Bootstrap is a Monte Carlo method using resampling.

Bootstrap principle (1)


- Let $\theta = T(F)$ be an unknown parameter of a distribution function $F$ with pdf $f(x) = F'(x)$. We want to estimate $\theta$ and investigate the sampling distribution of the estimator of $\theta$, from which we can make statistical inference about $\theta$.
- Note the parameter is expressed as a functional of $F$. E.g.
  - $T_1(F) = \int x\,dF(x) = E_F(X) = \mu$ is the mean of $X \sim F$.
  - $T_2(F) = \int x^2\,dF(x) - \left[\int x\,dF(x)\right]^2 = Var_F(X) = \sigma^2$ is the variance of $X \sim F$.
  - $T_3(F) = \iint x_1 x_2\,dF(x_1, x_2) - \iint x_1\,dF(x_1, x_2)\,\iint x_2\,dF(x_1, x_2) = Cov_F(X_1, X_2)$ is the covariance of $(X_1, X_2) \sim F$.
- Let $x_n = \{x_1, \dots, x_n\}$ be data observed as a realisation of the random variables $X_1, \dots, X_n \overset{iid}{\sim} F$. Let $\mathcal{X} = \{X_1, \dots, X_n\}$ denote the entire dataset.

Bootstrap principle (2)


- Let $\hat F$ be the empirical cdf of the data $x_1, \dots, x_n$, or of $\mathcal{X}$, i.e.
  $$\hat F(x) = \frac{\#\{x_i \le x\}}{n}, \quad \text{or} \quad \hat F(x) = \frac{\#\{X_i \le x\}}{n}.$$
- $\hat F(x)$ can be regarded as an estimator of $F$. Consequently, $\hat\theta = T(\hat F)$ can be regarded as an estimator of $\theta$, called the plug-in estimator of $\theta$.
- As an example, $\hat\mu = T_1(\hat F) = \int x\,d\hat F(x) = \frac{1}{n}\sum_{i=1}^n X_i$. So the sample mean is a plug-in estimator of the population mean $\mu$.
- As another example, $\hat\sigma^2 = T_2(\hat F) = \frac{\sum X_i^2}{n} - \left[\frac{\sum X_i}{n}\right]^2$ is a plug-in estimator of $\sigma^2$.
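For a concrete feel (an assumed mini-example, not from the slides), the two plug-in estimators can be computed directly in R:

x <- c(1, 2, 6)                       # observed data
mu.hat <- mean(x)                     # T1(F.hat): plug-in estimate of the mean
sigma2.hat <- mean(x^2) - mean(x)^2   # T2(F.hat): plug-in estimate of the variance
# note the divisor is n (plug-in), not n-1 (the unbiased sample variance)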

Bootstrap principle (3)


- The sampling distribution of the estimator $T(\hat F)$ is required for statistical inference on $\theta$.
- Sometimes, the distribution of a related random quantity $R(\mathcal{X}, F)$ is also required, which may provide better inference on $\theta$.
- For example, $R(\mathcal{X}, F) = \dfrac{T_1(\hat F) - T_1(F)}{\sqrt{T_2(\hat F)/(n-1)}}$ (i.e. $\sqrt{n}(\bar X - \mu)/S$) is a t-test statistic, and its distribution is required for the one-sample t-test.
- The distribution of $R(\mathcal{X}, F)$ often depends on the unknown $F$ and is mostly intractable.
- The motivation of bootstrap is to find an approximation to the distribution of $R(\mathcal{X}, F)$ or $T(\hat F)$ through a sophisticated use of the empirical cdf $\hat F$.


Bootstrap principle (4)


- A sample of size n randomly drawn from the empirical cdf $\hat F$ is called a bootstrap sample, denoted as $x_n^* = \{x_1^*, \dots, x_n^*\}$, or as $\mathcal{X}^* = \{X_1^*, \dots, X_n^*\}$ if regarded as yet to be drawn.
- By default, $x_1^*, \dots, x_n^*$ are n elements drawn with replacement from $x_1, \dots, x_n$; and $X_1^*, \dots, X_n^*$ are i.i.d. random variables with cdf $\hat F$.
- The bootstrap strategy is to examine the distribution of $R(\mathcal{X}^*, \hat F)$ (called the ideal bootstrap distribution).
- For example, if $R(\mathcal{X}, F) = \dfrac{T_1(\hat F) - T_1(F)}{\sqrt{T_2(\hat F)/(n-1)}}$, then
  $$R(\mathcal{X}^*, \hat F) = \frac{T_1(\hat F^*) - T_1(\hat F)}{\sqrt{T_2(\hat F^*)/(n-1)}},$$
  where $\hat F^*$ is the empirical cdf of the bootstrap sample $\mathcal{X}^*$.


Bootstrap principle (5)


- In some special cases it is possible to derive the ideal bootstrap distribution of $R(\mathcal{X}^*, \hat F)$ through analytical means. However, in most cases it can only be done by simulation.
- The bootstrap principle says $R(\mathcal{X}^*, \hat F) \approx R(\mathcal{X}, F)$ in distribution.
- This principle has been justified by many research results:
  - $P\left[\,\left|P_*[R(\mathcal{X}^*, \hat F) \le q] - P[R(\mathcal{X}, F) \le q]\right| > \epsilon\,\right] \to 0$ for any $\epsilon > 0$ and any $q$ as $n \to \infty$. Here $P_*(\cdot)$ is the probability measure determined by the empirical cdf $\hat F$.
  - If $R(\mathcal{X}, F)$ is asymptotically pivotal with standard normal distribution,
    $$P_*[R(\mathcal{X}^*, \hat F) \le q] - P[R(\mathcal{X}, F) \le q] = O_p(n^{-1}),$$
    better than the usual normal approximation rate $O_p(n^{-1/2})$. The rate can be improved to $O_p(n^{-2})$ when more advanced bootstrap is implemented.


Bootstrap principle (6)


- Theory underlying the above results is the Edgeworth expansion:
  $$P[R(\mathcal{X}, F) \le q] = \Phi(q) + n^{-\frac{1}{2}} p_1(q)\phi(q) + O(n^{-1})$$
  $$P_*[R(\mathcal{X}^*, \hat F) \le q] = P[R(\mathcal{X}^*, \hat F) \le q \mid \mathcal{X}] = \Phi(q) + n^{-\frac{1}{2}} \hat p_1(q)\phi(q) + O_p(n^{-1})$$
  where $p_1(q)$ is related to the Hermite polynomials and involves up to the 3rd moments of $F$; and $\hat p_1(q)$ is the plug-in estimator of $p_1(q)$ obtained by substituting the moments of $\hat F$. $\Phi(q)$ and $\phi(q)$ are the cdf and pdf of $N(0, 1)$.
- One can show that $\hat p_1(q) - p_1(q) = O_p(n^{-\frac{1}{2}})$. Thus
  $$P_*[R(\mathcal{X}^*, \hat F) \le q] - P[R(\mathcal{X}, F) \le q] = O_p(n^{-1}),$$
  in comparison with $\Phi(q) - P[R(\mathcal{X}, F) \le q] = O(n^{-\frac{1}{2}})$.



Bootstrap principle: Example 4.1 (1)


Example 4.1 Suppose the data $x_3 = \{x_1, x_2, x_3\} = \{1, 2, 6\}$ are $n = 3$ i.i.d. observations from a cdf $F$ that has mean $\theta$. Then the empirical cdf $\hat F$ is determined by the empirical pdf/pmf $\hat P(x) = \frac{1}{3}$ for $x = x_1, x_2, x_3$.
Suppose $\hat\theta = T(\hat F) = \frac{1}{3}(X_1 + X_2 + X_3)$ is used to estimate $\theta$. Our objective is to bootstrap the distribution of $R(\mathcal{X}, F) = \hat\theta - \theta$.
Note that $\mathcal{X} = \{X_1, X_2, X_3\}$; and $\mathcal{X}^* = \{X_1^*, X_2^*, X_3^*\}$ is a bootstrap sample consisting of elements drawn from $\hat F$. There are $n^n = 27$ possible outcomes for $\mathcal{X}^*$, but they comprise only $\binom{2n-1}{n} = 10$ distinct ones. Let $\hat F^*$ denote the empirical cdf of $\mathcal{X}^*$ and $P_*(\cdot)$ the corresponding empirical pdf/pmf. Then $\hat\theta^* = T(\hat F^*) = \frac{1}{3}(X_1^* + X_2^* + X_3^*)$ is a bootstrap replicate of $\hat\theta$ based on $\mathcal{X}^*$.

Bootstrap principle: Example 4.1 (2)


Possible outcomes of the bootstrap sample $\mathcal{X}^*$ from $\{1, 2, 6\}$ (ignoring order), the resultant values of $\hat\theta^* - \hat\theta$, the empirical pmf $P_*(\hat\theta^* - \hat\theta)$ and the observed relative frequency in 1000 bootstrap iterations:

  $\mathcal{X}^*$   $\hat\theta^*$   $\hat\theta^* - \hat\theta$   $P_*(\hat\theta^* - \hat\theta)$   obs. frequency
  1, 1, 1              3/3              3/3 - 3                      1/27 (0.037)                      38/1000
  1, 1, 2              4/3              4/3 - 3                      3/27 (0.111)                     100/1000
  1, 1, 6              8/3              8/3 - 3                      3/27 (0.111)                     116/1000
  1, 2, 2              5/3              5/3 - 3                      3/27 (0.111)                     112/1000
  1, 2, 6              9/3              9/3 - 3                      6/27 (0.222)                     245/1000
  1, 6, 6             13/3             13/3 - 3                      3/27 (0.111)                     105/1000
  2, 2, 2              6/3              6/3 - 3                      1/27 (0.037)                      38/1000
  2, 2, 6             10/3             10/3 - 3                      3/27 (0.111)                     104/1000
  2, 6, 6             14/3             14/3 - 3                      3/27 (0.111)                     108/1000
  6, 6, 6             18/3             18/3 - 3                      1/27 (0.037)                      34/1000

Bootstrap principle: Example 4.1 (3)


The R code for the above table:

x=c(1,2,6); X=matrix(0,3^3,3)
for(i in 1:3){for(j in 1:3){ #Find all possible bootstrap samples
for(k in 1:3)X[(i-1)*3^2+(j-1)*3+k,]=sort(x[c(i,j,k)])}}
X.unique=unique(X, MARGIN=1)  #Find all distinct bootstrap samples
#Find frequency of each distinct bootstrap sample
nd=nrow(X.unique)
freq=rep(0,nd); obs.freq=rep(0,nd)
for(j in 1:3^3){for(i in 1:nd)freq[i]=freq[i]+(sum((X[j,]-X.unique[i,])^2)==0)}
set.seed(123) #Find the observed frequency from 1000 bootstrap samples:
for(j in 1:1000){x.bs=sort(sample(x, size=3, rep=T))
for(i in 1:nd){obs.freq[i]=obs.freq[i]+(sum((x.bs-X.unique[i,])^2)==0)}}
cbind(X.unique,freq,obs.freq)
            freq obs.freq
 [1,] 1 1 1    1       38
 [2,] 1 1 2    3      100
 [3,] 1 1 6    3      116
 [4,] 1 2 2    3      112
 [5,] 1 2 6    6      245
 [6,] 1 6 6    3      105
 [7,] 2 2 2    1       38
 [8,] 2 2 6    3      104


Bootstrap principle: Example 4.1 (4)


- From the previous table we see the ideal bootstrap distribution of $R(\mathcal{X}^*, \hat F) = \hat\theta^* - \hat\theta$ is given by $P_*(\cdot)$, which provides an estimate for the distribution of $R(\mathcal{X}, F) = \hat\theta - \theta$. This estimate is further estimated by the bootstrap distribution of $R(\mathcal{X}^*, \hat F)$, which is given by the "obs. frequency" column.
- A 92.6% ideal bootstrap C.I. for $\theta$ can be found to be $(4/3, 14/3)$ using the quantiles of $\hat\theta^*$.
- The confidence level for this interval is 92.8% based on the "obs. frequency" column.
- By the bootstrap principle, this interval is an approximate 92.6% C.I. for $\theta$.
- The point estimate of $\theta$ is still calculated from the observed data, which is $\hat\theta = 9/3 = 3$.


4.2.1 Nonparametric and parametric bootstrap

Nonparametric bootstrap
- Finding the ideal bootstrap distribution of $R(\mathcal{X}^*, \hat F)$ requires complete enumeration of $\hat F^*$ or $P_*(\cdot)$, which is not practical even when the sample size n is moderate.
- Instead, B i.i.d. samples, each of size n, are drawn from $\hat F$, producing B nonparametric bootstrap samples. Denote them as $\mathcal{X}_i^* = \{X_{i1}^*, \dots, X_{in}^*\} \overset{iid}{\sim} \hat F$ for $i = 1, \dots, B$ (see the sketch below).
- The empirical cdf of $\{R(\mathcal{X}_i^*, \hat F),\ i = 1, \dots, B\}$ is used to approximate the ideal bootstrap cdf of $R(\mathcal{X}^*, \hat F)$, which further approximates the cdf of $R(\mathcal{X}, F)$, allowing inference.
- The simulation error in approximating the ideal bootstrap cdf of $R(\mathcal{X}^*, \hat F)$ can be made arbitrarily small by increasing B. C.f. the last 2 columns of the table in Example 4.1.
- A key requirement of bootstrapping is that the data to be resampled must be an i.i.d. sample.
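To make the procedure concrete, here is a minimal base-R sketch (an assumed illustration, not from the slides) of the nonparametric bootstrap for the sample correlation coefficient of the motivating example; the data xy are simulated purely for illustration:

set.seed(1)
n <- 30
xy <- cbind(rnorm(n), rnorm(n))           # assumed illustrative bivariate data
r.hat <- cor(xy[,1], xy[,2])              # sample correlation from the data
B <- 1999
r.star <- numeric(B)
for (b in 1:B) {
  idx <- sample(n, n, replace = TRUE)     # a bootstrap sample drawn from F.hat
  r.star[b] <- cor(xy[idx,1], xy[idx,2])  # bootstrap replicate of r
}
quantile(r.star, c(0.025, 0.975))         # the empirical cdf of the replicates
                                          # approximates the ideal bootstrap cdf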


Parametric bootstrap
- When a parametric model is assumed for the data, namely $X_1, \dots, X_n \overset{iid}{\sim} F(x|\theta)$, the cdf $F(x|\theta)$ can be parametrically estimated by $F(x|\hat\theta)$ instead of being estimated by the empirical cdf $\hat F$.
- To estimate the distribution of $R(\mathcal{X}, F(x|\theta))$, one can draw B i.i.d. samples, each of size n, from $F(x|\hat\theta)$, producing B parametric bootstrap samples. Denote them as $\mathcal{X}_i^* = \{X_{i1}^*, \dots, X_{in}^*\} \overset{iid}{\sim} F(x|\hat\theta)$ for $i = 1, \dots, B$ (see the sketch below).
- The empirical cdf of $\{R(\mathcal{X}_i^*, F(x|\hat\theta)),\ i = 1, \dots, B\}$ is then used to approximate the ideal bootstrap cdf of $R(\mathcal{X}^*, F(x|\hat\theta))$, and further the cdf of $R(\mathcal{X}, F(x|\theta))$.
- If the parametric model is not good, the parametric bootstrap can give misleading inference.
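A minimal sketch of the parametric analogue (an assumed example: a normal model, so that $N(\hat\mu, \hat\sigma^2)$ plays the role of $F(x|\hat\theta)$):

set.seed(1)
x <- rnorm(25, mean = 5, sd = 2)                 # assumed observed data
mu.hat <- mean(x); sigma.hat <- sd(x)            # fit the parametric model
B <- 1999
theta.star <- numeric(B)
for (b in 1:B) {
  x.star <- rnorm(length(x), mu.hat, sigma.hat)  # draw from F(x|theta.hat)
  theta.star[b] <- mean(x.star)                  # parametric bootstrap replicate
}
sd(theta.star)    # parametric bootstrap estimate of se(mu.hat)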


4.2.2 Bootstrapping samples in regression

Bootstrapping samples in regression (1)


- Consider a multiple regression model, $Y_i = x_i^T\beta + \epsilon_i$, for $i = 1, \dots, n$, where $\epsilon_1, \dots, \epsilon_n \overset{iid}{\sim} F$ with $E_F(\epsilon_i) = 0$ and $Var_F(\epsilon_i) = \sigma^2$.
- The observed data are $\{z_1 = (x_1, y_1), \dots, z_n = (x_n, y_n)\}$.
- It is wrong to generate bootstrap samples from $\{y_1, \dots, y_n\}$ and from $\{x_1, \dots, x_n\}$ independently, because $\{y_1, \dots, y_n\}$ are not i.i.d. samples.
- Two appropriate ways to construct bootstrap samples from the observed data are "bootstrap the residuals" and "bootstrap the cases".


Bootstrapping samples in regression (2)


Bootstrap the residuals
1. Fit the regression model to the observed data. Obtain the fitted responses $\hat y_i = x_i^T\hat\beta$ and residuals $\hat\epsilon_i = y_i - \hat y_i$.
2. Bootstrap residuals from $\{\hat\epsilon_1, \dots, \hat\epsilon_n\}$ to get $\{\hat\epsilon_1^*, \dots, \hat\epsilon_n^*\}$. Note $\{\hat\epsilon_1, \dots, \hat\epsilon_n\}$ are not i.i.d., but roughly so if the regression model is correct.
3. Create a bootstrap sample of responses: $Y_i^* = \hat y_i + \hat\epsilon_i^*$ for $i = 1, \dots, n$.
4. Fit the regression model to $\{(x_1, Y_1^*), \dots, (x_n, Y_n^*)\}$ to get a bootstrap estimate $(\hat\beta^*, \hat\sigma^*)$ of $(\beta, \sigma)$.
5. Repeat this process B times to obtain $\{(\hat\beta_1^*, \hat\sigma_1^*), \dots, (\hat\beta_B^*, \hat\sigma_B^*)\}$, from which an empirical cdf can be built for inference. (A base-R sketch of these steps follows.)
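A minimal base-R sketch of steps 1-5 (assumed illustrative data; Example 4.2 later carries this out via boot()):

set.seed(1)
n <- 20
x <- runif(n); y <- 1 + 2*x + rnorm(n, sd = 0.5)  # assumed data
fit <- lm(y ~ x)
y.hat <- fitted(fit); e.hat <- resid(fit)         # step 1
B <- 1999
beta.star <- matrix(NA, B, 2)
for (b in 1:B) {
  e.star <- sample(e.hat, n, replace = TRUE)      # step 2: resample residuals
  y.star <- y.hat + e.star                        # step 3: bootstrap responses
  beta.star[b, ] <- coef(lm(y.star ~ x))          # step 4: refit the model
}
apply(beta.star, 2, sd)                           # step 5: bootstrap se of beta.hat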

Bootstrapping samples in regression (3)


Bootstrap the cases (also called the paired bootstrap)
1. Treat the observed data $\{z_1 = (x_1, y_1), \dots, z_n = (x_n, y_n)\}$ as i.i.d. from a cdf $F(x, y)$.
2. Create a bootstrap sample $\{Z_1^*, \dots, Z_n^*\}$ by sampling with replacement from $\{z_1, \dots, z_n\}$.
3. Fit the regression model to $\{Z_1^*, \dots, Z_n^*\}$ to get a bootstrap estimate $(\hat\beta^*, \hat\sigma^*)$ of $(\beta, \sigma)$.
4. Repeat this process B times to obtain $\{(\hat\beta_1^*, \hat\sigma_1^*), \dots, (\hat\beta_B^*, \hat\sigma_B^*)\}$, from which an empirical cdf can be built for inference.
- Bootstrapping the cases is less sensitive to violations of the regression model assumptions (i.e. adequacy of the model and constancy of $\sigma^2$) than bootstrapping the residuals.

4.2.3 Bootstrap estimation of bias($\hat\theta$) and se($\hat\theta$)

Bootstrap estimation of bias($\hat\theta$) and se($\hat\theta$)


- bias($\hat\theta$) $= E_F(\hat\theta) - \theta$ and se($\hat\theta$) $= \sqrt{Var_F(\hat\theta)}$ are the two basic attributes of the estimator $\hat\theta$ that we can use bootstrap analysis to estimate.
- Suppose $\theta = T(F)$ and $\hat\theta = T(\hat F)$ or $\hat\theta = T(F(\cdot|\hat\theta))$ for some functional $T$.
- Let $R(\mathcal{X}, F) = T(\hat F) - T(F) = \hat\theta - \theta$, or $T(F(\cdot|\hat\theta)) - \theta$.
- Then bias($\hat\theta$) $= E_F[R(\mathcal{X}, F)]$ and Var($\hat\theta$) $= Var_F[R(\mathcal{X}, F)]$ are population moments of $R(\mathcal{X}, F)$, which can be estimated by the population moments of the ideal bootstrap distribution of $R(\mathcal{X}^*, \hat F)$ or $R(\mathcal{X}^*, F(\cdot|\hat\theta))$, per the bootstrap principle.
- They can be further estimated by the sample moments of $R(\mathcal{X}^*, \hat F)$ or $R(\mathcal{X}^*, F(\cdot|\hat\theta))$, calculated from the bootstrap samples.


Nonparametric bootstrap estimation of bias($\hat\theta$) and se($\hat\theta$)


Computing steps for obtaining nonparametric bootstrap estimates of bias($\hat\theta$) and se($\hat\theta$) are as follows:
1. Compute $\hat\theta$ from the observed sample $x_n = (x_1, \dots, x_n)$.
2. Generate B (typically B $\ge$ 999) nonparametric bootstrap samples of size n from the observed sample.
3. For each bootstrap sample, compute an estimate of $\theta$ in the same way as estimating $\theta$ by $\hat\theta$. The new estimates of $\theta$ are called the bootstrap replicates of $\hat\theta$ and are denoted as $\hat\theta_1^*, \dots, \hat\theta_B^*$.
4. Compute $\bar\theta^* = B^{-1}\sum_{r=1}^B \hat\theta_r^*$ and estimate bias($\hat\theta$) by $b_B(\hat\theta) = \bar\theta^* - \hat\theta$; compute $se_B(\hat\theta) = \sqrt{\frac{1}{B-1}\sum_{r=1}^B (\hat\theta_r^* - \bar\theta^*)^2}$ and estimate se($\hat\theta$) by $se_B(\hat\theta)$. (A minimal R sketch follows.)
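The four steps amount to only a few lines of R (a minimal sketch, assuming $\theta$ is the population mean and using the data of Example 4.1):

set.seed(1)
x <- c(1, 2, 6)                            # observed sample
theta.hat <- mean(x)                       # step 1
B <- 1999
theta.star <- replicate(B, mean(sample(x, replace = TRUE)))  # steps 2-3
b.B  <- mean(theta.star) - theta.hat       # step 4: bootstrap estimate of bias
se.B <- sd(theta.star)                     # step 4: bootstrap estimate of se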



Parametric bootstrap estimation of bias($\hat\theta$) and se($\hat\theta$)

- Parametric bootstrap estimation proceeds in the same way as nonparametric bootstrap estimation, except in step 2, where the bootstrap samples of size n are generated from $F(x|\hat\theta)$.

Remark:
- A bootstrap estimate of MSE($\hat\theta$) $= E_F[(\hat\theta - \theta)^2]$ may be obtained as MSE$_B(\hat\theta) = \frac{1}{B}\sum_{r=1}^B (\hat\theta_r^* - \hat\theta)^2$.


R package boot
The package boot in R contains many functions for implementing
bootstrap methods. The function boot() is the one for generating
bootstrap samples and various bootstrap estimates about .
library(boot)
boot(data, statistic, R, sim="ordinary", stype="i",
strata=rep(1,n), L=NULL, m=0, weights=NULL,
ran.gen=function(d, p) d, mle=NULL, ...)
The argument statistic is a function that computes $\hat\theta$; see the examples later for illustration.
The functions boot.array() and freq.array() are useful for finding
which original observations and how many times they are included
in each bootstrap sample.

Example 4.2 Bootstrapping on copper-nickel alloy data (1)


Example 4.2 The table below gives 13 measurements of corrosion loss ($y_i$) in copper-nickel alloys, each with a specific iron content ($x_i$) (Draper & Smith 1966).

  x_i:   0.01  0.48  0.71  0.95  1.19  0.01  0.48  1.44  0.71  1.96  0.01  1.44  1.96
  y_i:  127.6 124.0 110.8 103.9 101.5 130.1 122.0  92.3 113.1  83.7 128.0  91.4  86.2

Of interest is the change in corrosion in the alloys as the iron content increases, relative to the corrosion loss when there is no iron. Thus $\theta = \beta_1/\beta_0$ is the quantity we want to estimate in a simple linear regression model $y_i = \beta_0 + \beta_1 x_i + \epsilon_i$.

Example 4.2 Bootstrapping on copper-nickel alloy data (2)


The LS estimate or MLE of $\theta$ is $\hat\theta = \hat\beta_1/\hat\beta_0 = -0.185$.

> z=matrix(0,13,2)
> z[,1]=c(0.01,0.48,0.71,0.95,1.19,0.01,0.48,1.44,0.71,1.96,0.01,1.44,1.96)
> z[,2]=c(127.6,124.0,110.8,103.9,101.5,130.1,122.0,92.3,113.1,83.7,128.0,91.4,86.2)
> temp=lm(z[,2]~z[,1])
> temp$coef[2]/temp$coef[1]
-0.1850722

1. Use "bootstrap the cases" to estimate bias($\hat\theta$) and sd($\hat\theta$).
2. Use "bootstrap the residuals" to estimate bias($\hat\theta$) and sd($\hat\theta$).
3. Assuming the normal linear regression model, use a parametric bootstrap approach to estimate bias($\hat\theta$) and sd($\hat\theta$).
4. Let $\rho = corr(X, Y)$. Perform a nonparametric bootstrap analysis for bias($\hat\rho$) and sd($\hat\rho$).

Example 4.2 Bootstrapping on copper-nickel alloy data (3)


1. Use "bootstrap the cases" to estimate bias($\hat\theta$) and sd($\hat\theta$).
First run a pilot study involving only 5 bootstrap samples.

> library(boot)
# Step 1. Write a function to specify the statistic/estimator for which
# we want to find bootstrap replicates.
# lm1.bt() uses the "bootstrap the cases" approach.
lm1.bt=function(x,i){temp=lm(x[i,2]~x[i,1])$coef
ratio=temp[2]/temp[1]
return(ratio)}
# Step 2. Use boot() to perform the bootstrap. Need to specify the data, the
# statistic and the # of bootstrap samples (R) to be generated.
set.seed(1234); boot1=boot(data=z, statistic=lm1.bt, R=5)


Example 4.2 Bootstrapping on copper-nickel alloy data (4)


> boot1
ORDINARY NONPARAMETRIC BOOTSTRAP
Call:
boot(data = z, statistic = lm1.bt, R = 5)
Bootstrap Statistics :
      original         bias    std. error
t1* -0.1850722 -0.001099678  0.01011771
> attributes(boot1)
$names
 [1] "t0"        "t"         "R"         "data"      "seed"      "statistic"
 [7] "sim"       "call"      "stype"     "strata"    "weights"
> boot1$t0       #gives \hat{\theta} estimate
-0.1850722
> t(boot1$t)     #the 5 bootstrap replicates of \hat{\theta}
[1,] -0.1871027 -0.1963776 -0.1835215 -0.1704960 -0.1933616
> mean(boot1$t)-boot1$t0    #gives the bias estimate
[1] -0.001099678
> sd(boot1$t)               #gives the sd. estimate
[1] 0.01011771

Example 4.2 Bootstrapping on copper-nickel alloy data (5)


> boot.array(boot1, indices=T)   #indices of obs used in bootstrap samples.
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
[1,]    2    9   10   11    5   11    6   10    8     7     1     7    12
[2,]    9    1    8    4    4    7    4    3    9     9     5     7     1
[3,]    8    4    4    4    3   12    4    4    5     7    10    10     5
[4,]    9    9   13    3    1   11    7   13    9     4     7     3     1
[5,]   12    7    4    4    3    1    3   11    5    10     2    12     4

For example, the 1st bootstrap sample is $\{(x_2, y_2), (x_9, y_9), \dots, (x_7, y_7), (x_{12}, y_{12})\}$.

> boot.array(boot1, indices=F)   #frequencies of obs used in boot. samples.
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
[1,]    1    1    0    0    1    1    2    1    1     2     2     1     0
[2,]    2    0    1    3    1    0    2    1    3     0     0     0     0
[3,]    0    0    1    5    2    0    1    1    0     2     0     1     0
[4,]    2    0    2    1    0    0    2    0    3     0     1     0     2
[5,]    1    1    2    3    1    0    1    0    0     1     1     2     0

E.g. in bootstrap sample 1, $(x_1, y_1)$ appeared once, $(x_7, y_7)$ appeared twice, etc.
> freq.array(boot.array(boot1,indices=T)) #same as boot.array(boot1, indices=F)

Example 4.2 Bootstrapping on copper-nickel alloy data (6)


B = R = 5 is too small. Now run with B = R = 1999. It follows that $b_B(\hat\theta) = -0.00148788$ and $sd_B(\hat\theta) = 0.008483298$.

> set.seed(1234); boot1=boot(data=z, statistic=lm1.bt, R=1999); boot1
ORDINARY NONPARAMETRIC BOOTSTRAP
Call: boot(data = z, statistic = lm1.bt, R = 1999)
Bootstrap Statistics :
      original         bias    std. error
t1* -0.1850722 -0.001487880 0.008483298
> plot(density(boot1$t),lwd=2); hist(boot1$t, breaks=50, freq=F, add=T)

[Figure: density and histogram of the 1999 bootstrap replicates of $\hat\theta$; density.default(x = boot1$t), N = 1999, Bandwidth = 0.001472.]


Example 4.2 Bootstrapping on copper-nickel alloy data (7)


2. Use "bootstrap the residuals" to estimate bias($\hat\theta$) and sd($\hat\theta$).

> library(boot)
# Requires lm2.bt() for the "bootstrap the residuals" approach.
lm2.bt=function(x,i){temp=lm(x[,2]~x[,1])
y.star=temp$fitted + temp$residual[i]
temp1=lm(y.star~x[,1])$coef
ratio=temp1[2]/temp1[1]
return(ratio)}
> set.seed(1234); boot2=boot(data=z, statistic=lm2.bt, R=1999); boot2
ORDINARY NONPARAMETRIC BOOTSTRAP
Call: boot(data = z, statistic = lm2.bt, R = 1999)
Bootstrap Statistics :
      original         bias    std. error
t1* -0.1850722 0.0002736776 0.007615583

It follows that $b_B(\hat\theta) = 0.0002737$ and $sd_B(\hat\theta) = 0.007616$.

Example 4.2 Bootstrapping on copper-nickel alloy data (8)


The "bootstrap the residuals" method gives smaller estimates of bias($\hat\theta$) and sd($\hat\theta$) than the "bootstrap the cases" method. The bootstrap pdf of $\hat\theta^*$ is more symmetric.

> plot(density(boot2$t), ylim=c(0,60), lwd=2)
> hist(boot2$t, breaks=50, freq=F, add=T)
> par(mfrow=c(2,2)); plot(lm(z[,2]~z[,1])) #regression diagnosis plots.

[Figure: density and histogram of the boot2 replicates (density.default(x = boot2$t), N = 1999, Bandwidth = 0.001499), together with the four regression diagnostic plots: Residuals vs Fitted, Normal Q-Q, Scale-Location, and Residuals vs Leverage (with Cook's distance contours).]


Example 4.2 Bootstrapping on copper-nickel alloy data (9)


3. Assuming the normal linear regression model, use a parametric bootstrap approach to estimate bias($\hat\theta$) and sd($\hat\theta$).

> library(boot)
# Parametric bootstrap analysis.
# lm3.bt() uses parametric bootstrap samples generated from lm.gen().
lm3.bt=function(x){temp=lm(x[,2]~x[,1])$coef; ratio=temp[2]/temp[1]; ratio}
# lm.gen() generates a parametric bootstrap sample
lm.gen=function(x, mle){n=nrow(x); err=rnorm(n, mean=0, sd=mle$sigma)
y=mle$beta0+x[,1]*mle$beta1+err; return(cbind(x[,1],y))}
temp=lm(z[,2]~z[,1])
mle.list=list(beta0=temp$coef[1], beta1=temp$coef[2])
mle.list$sigma=sqrt(sum(temp$resid^2)/temp$df.resid)
> set.seed(1234)
> boot3=boot(data=z, stat=lm3.bt, R=1999, sim="parametric",
             ran.gen=lm.gen, mle=mle.list)

Example 4.2 Bootstrapping on copper-nickel alloy data (10)


> boot3
PARAMETRIC BOOTSTRAP
boot(data = z, statistic = lm3.bt, R = 1999, sim = "parametric",
     ran.gen = lm.gen, mle = mle.list)
Bootstrap Statistics :
      original         bias    std. error
t1* -0.1850722 -7.34372e-05 0.008344886
> plot(density(boot3$t), ylim=c(0,60), lwd=2)
> hist(boot3$t, breaks=50, freq=F, add=T)

It follows that $b_{par}(\hat\theta) = -7.34 \times 10^{-5}$ and $sd_{par}(\hat\theta) = 0.00834$.

[Figure: density and histogram of the parametric bootstrap replicates (density.default(x = boot3$t), N = 1999, Bandwidth = 0.001642), shown alongside the boot4 replicates of part 4 (density.default(x = boot4$t), N = 1999, Bandwidth = 0.001271).]


Example 4.2 Bootstrapping on copper-nickel alloy data (11)


4. Let $\rho = corr(X, Y)$. Perform a nonparametric bootstrap analysis for bias($\hat\rho$) and sd($\hat\rho$).
It follows that $\hat\rho = -0.9847435$, $b_B(\hat\rho) = -0.0001254529$ and $sd_B(\hat\rho) = 0.007203108$.

# A bootstrap correlation function:
cor.bt=function(x,i){cor(x[i,1],x[i,2])}
> set.seed(1234); boot4=boot(data=z, stat=cor.bt, R=1999); boot4
ORDINARY NONPARAMETRIC BOOTSTRAP
Call: boot(data = z, statistic = cor.bt, R = 1999)
Bootstrap Statistics :
      original          bias    std. error
t1* -0.9847435 -0.0001254529 0.007203108
> plot(density(boot4$t), ylim=c(0,65), lwd=2)
> hist(boot4$t, breaks=50, freq=F, add=T)

Bootstrap inference contents

This section will include:


1. how to use boot.ci() in package boot in R to compute
bootstrap confidence intervals;
2. percentile and pivoting methods for deriving various bootstrap
confidence intervals;
3. use of bootstrap and permutation in hypothesis testing.


4.3.1 Computing bootstrap confidence intervals

Computing bootstrap confidence intervals


Bootstrap replicates of $\hat\theta$, generated by the boot() function in the boot package, can be used to construct CIs for $\theta$.
The boot package computes 6 types of bootstrap CIs for $\theta$:
1. Percentile (or basic percentile)
2. Normal approximation
3. Basic (or residual)
4. Studentized
5. BCa (bias corrected and accelerated)
6. ABC (approximate bias corrected)
All of them except ABC are computed using the boot.ci() function. The ABC CIs are computed using the abc.ci() function. We will not discuss the ABC CIs here.

boot.ci() function for computing bootstrap CIs

boot.ci(boot.out, conf = 0.95, type = "all",


index = 1:min(2,length(boot.out$t0)), var.t0 = NULL,
var.t = NULL, t0 = NULL, t = NULL, L = NULL, h = function(t) t,
hdot = function(t) rep(1,length(t)), hinv = function(t) t, ...)

The argument boot.out is the result returned from executing


boot().
Type help(boot.ci) for details on other arguments.


Example 4.3 Bootstrap CIs on copper-nickel alloy data (1)


Example 4.3 (Example 4.2 continued)
1. Use the bootstrap replicates of $\hat\theta$ saved in object boot1 to find 95% bootstrap CIs for $\theta$.
2. Use the bootstrap replicates of $\hat\theta$ saved in object boot2 to find 95% bootstrap CIs for $\theta$.
3. Use the parametric bootstrap replicates of $\hat\theta$ saved in object boot3 to find 95% bootstrap CIs for $\theta$.
4. Use the bootstrap replicates of $\hat\rho$ saved in object boot4 to find 95% bootstrap CIs for $\rho$.


Example 4.3 Bootstrap CIs on copper-nickel alloy data (2)


1. Use the bootstrap replicates of $\hat\theta$ saved in object boot1 to find 95% bootstrap CIs for $\theta$.

> library(boot); boot.ci(boot1)
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 1999 bootstrap replicates
CALL : boot.ci(boot.out = boot1)
Intervals :
Level      Normal              Basic
95%   (-0.2002, -0.1670 )  (-0.1963, -0.1626 )
Level     Percentile            BCa
95%   (-0.2076, -0.1738 )  (-0.2047, -0.1731 )
Calculations and Intervals on Original Scale
Warning message:
In boot.ci(boot1) : bootstrap variances needed for studentized intervals

Example 4.3 Bootstrap CIs on copper-nickel alloy data (3)


2. Use the bootstrap replicates of $\hat\theta$ saved in object boot2 to find 95% bootstrap CIs for $\theta$.

> boot.ci(boot2)
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 1999 bootstrap replicates
CALL : boot.ci(boot.out = boot2)
Intervals :
Level      Normal              Basic
95%   (-0.2003, -0.1704 )  (-0.2006, -0.1710 )
Level     Percentile            BCa
95%   (-0.1992, -0.1695 )  (-0.1978, -0.1677 )
Calculations and Intervals on Original Scale
Warning message:
In boot.ci(boot2) : bootstrap variances needed for studentized intervals

Example 4.3 Bootstrap CIs on copper-nickel alloy data (4)


3. Use the parametric bootstrap replicates of $\hat\theta$ saved in object boot3 to find 95% bootstrap CIs for $\theta$.

> boot.ci(boot3)
Error in empinf(boot.out, index = index, t = t.o, ...) :
  influence values cannot be found from a parametric bootstrap
In addition: Warning message:
In boot.ci(boot3) : bootstrap variances needed for studentized intervals
> boot.ci(boot3, type=c("norm","basic", "perc"))
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 1999 bootstrap replicates
CALL : boot.ci(boot.out = boot3, type = c("norm", "basic", "perc"))
Intervals :
Level      Normal              Basic               Percentile
95%   (-0.2014, -0.1686 )  (-0.2021, -0.1692 )  (-0.2010, -0.1681 )
Calculations and Intervals on Original Scale


Example 4.3 Bootstrap CIs on copper-nickel alloy data (5)


4. Use the bootstrap replicates of $\hat\rho$ saved in object boot4 to find 95% bootstrap CIs for $\rho$.

> boot.ci(boot4)
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 1999 bootstrap replicates
CALL : boot.ci(boot.out = boot4)
Intervals :
Level      Normal              Basic
95%   (-0.9987, -0.9705 )  (-1.0010, -0.9740 )
Level     Percentile            BCa
95%   (-0.9954, -0.9685 )  (-0.9935, -0.9539 )
Calculations and Intervals on Original Scale
Warning message:
In boot.ci(boot4) : bootstrap variances needed for studentized intervals

Example 4.3 Bootstrap CIs on copper-nickel alloy data (6)


Remarks:
1. Across parts 1 to 4, studentized CIs cannot be computed because boot.ci() requires bootstrap replicates of the estimated variance of $\hat\theta^*$, which are not available. See Section 4.3.2 for solutions.
2. Part 3 reveals that empirical influence function (EIF) values are required for computing the BCa CIs, but parametric bootstrap does not provide the EIF values.
3. The EIF of an estimator $\hat\theta$ based on a sample $\{Z_1, \dots, Z_n\}$ is defined as a sequence $\{EIF_1, \dots, EIF_n\}$, where $EIF_i = (n-1)[\hat\theta - \hat\theta_{(i)}]$ for $i = 1, \dots, n$, with $\hat\theta_{(i)}$ being the estimate of $\theta$ from $\{Z_1, \dots, Z_{i-1}, Z_{i+1}, \dots, Z_n\}$. (A small computational sketch follows.)
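The jackknife-based definition in remark 3 is easy to compute directly; here is a small sketch (an assumed helper, not part of the boot package) for a generic estimator applied to the rows of a data matrix:

eif.jack <- function(data, theta.fn) {
  n <- nrow(data)
  theta.hat <- theta.fn(data)   # estimate from the full sample
  # (n-1)*(theta.hat - leave-one-out estimate), for each i
  sapply(1:n, function(i) (n - 1) * (theta.hat - theta.fn(data[-i, , drop = FALSE])))
}
# e.g. with the copper-nickel data z and the ratio estimator:
# eif.jack(z, function(d){b <- coef(lm(d[,2] ~ d[,1])); b[2]/b[1]})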

Example 4.3 Bootstrap CIs on copper-nickel alloy data (7)


Remarks (continued):
4. The empinf() function computes the EIF. The BCa CI for $\theta$ based on parametric bootstrap replicates can now be calculated:

> boot3$L=empinf(data=z, stat=lm1.bt, stype="i")
> boot.ci(boot3)
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 1999 bootstrap replicates
CALL : boot.ci(boot.out = boot3)
Intervals :
Level      Normal              Basic
95%   (-0.2014, -0.1686 )  (-0.2021, -0.1692 )
Level     Percentile            BCa
95%   (-0.2010, -0.1681 )  (-0.1987, -0.1653 )
Calculations and Intervals on Original Scale
Warning message:
In boot.ci(boot3) : bootstrap variances needed for studentized intervals

Normal approximation based bootstrap confidence intervals


- In many cases, $\frac{\hat\theta - \theta}{se(\hat\theta)} \xrightarrow{d} N(0, 1)$, e.g., when $\hat\theta$ is the MLE. Then an approximate $100(1-\alpha)\%$ CI for $\theta$ would be $\hat\theta \pm z_{1-\frac{\alpha}{2}}\,se(\hat\theta)$, where $z_{1-\frac{\alpha}{2}} = \Phi^{-1}(1-\frac{\alpha}{2})$.
- If bootstrap replicates are available, we use $se_B(\hat\theta)$ to estimate $se(\hat\theta)$ (if it is otherwise difficult to estimate); and estimate $\theta$ by $\hat\theta - b_B(\hat\theta) = 2\hat\theta - \bar\theta^*$ (note it is an unbiased estimator of $\theta$). This suggests the following $100(1-\alpha)\%$ normal approximation based bootstrap CI for $\theta$:
  $$\left[(2\hat\theta - \bar\theta^*) - z_{1-\frac{\alpha}{2}}\,se_B(\hat\theta),\;\; (2\hat\theta - \bar\theta^*) + z_{1-\frac{\alpha}{2}}\,se_B(\hat\theta)\right].$$
- This formula is used in boot.ci() to compute the Normal CI.
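As a quick hand check (continuing Example 4.3's boot1 object), the Normal CI can be reproduced from the bootstrap replicates; this should closely match the "Normal" interval reported by boot.ci(boot1):

theta.hat <- boot1$t0
bias.B <- mean(boot1$t) - theta.hat          # b_B(theta.hat)
se.B   <- sd(boot1$t)                        # se_B(theta.hat)
c((theta.hat - bias.B) - qnorm(0.975)*se.B,  # lower limit
  (theta.hat - bias.B) + qnorm(0.975)*se.B)  # upper limit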


4.3.2 Percentile and pivoting methods

Percentile bootstrap confidence intervals (1)


- The 2.5th and 97.5th percentiles, say, of the bootstrap replicates of $\hat\theta$ provide a 95% prediction interval for $\hat\theta^*$, and accordingly a 95% PI for $\hat\theta$ by the bootstrap principle.
- The above $100(1-\alpha)\%$ PI is used as the $100(1-\alpha)\%$ Percentile CI for $\theta$ in boot.ci(). Recall Example 4.3 (1):

> boot.ci(boot1, type="perc")
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Intervals based on 1999 bootstrap replicates:
Level     Percentile
95%   (-0.2076, -0.1738 )
> quantile(boot1$t, prob=c(0.025,0.975), type=6)
      2.5%      97.5%
-0.2075644 -0.1738358

- The percentile method bootstrap CI is prone to bias and inaccurate coverage probabilities. It works better when $\hat\theta$ is essentially a location parameter.


Percentile bootstrap confidence intervals (2)


A justification of the percentile method bootstrap CI for $\theta$:
- Assume the existence of a continuous and strictly increasing transformation $\phi$, and a continuous cdf $H$ with symmetric pdf (implying $H(z) = 1 - H(-z)$) such that $\phi(\hat\theta) - \phi(\theta) \sim H$.
- This assumption is likely to be reasonable, although it may be difficult to find such $\phi$ and $H$. However, it turns out that we do not need an explicit specification of $\phi$ and $H$. On the other hand, when such $\phi$ and $H$ exist, we can even assume $H$ to be $N(0, 1)$ (why?).
- Now we know
  $$P\left[h_{\alpha/2} \le \phi(\hat\theta) - \phi(\theta) \le h_{1-\alpha/2}\right] = 1 - \alpha \qquad (2)$$
  where $h_\gamma$ is the $\gamma$ quantile of $H$.

Percentile bootstrap confidence intervals (3)


A justification of the percentile method CI for $\theta$ (continued):
- Applying the bootstrap principle to (2), we have
  $$1 - \alpha \approx P_*\left[h_{\alpha/2} \le \phi(\hat\theta^*) - \phi(\hat\theta) \le h_{1-\alpha/2}\right]$$
  $$= P_*\left[h_{\alpha/2} + \phi(\hat\theta) \le \phi(\hat\theta^*) \le h_{1-\alpha/2} + \phi(\hat\theta)\right]$$
  $$= P_*\left[\phi^{-1}\!\big(h_{\alpha/2} + \phi(\hat\theta)\big) \le \hat\theta^* \le \phi^{-1}\!\big(h_{1-\alpha/2} + \phi(\hat\theta)\big)\right]. \qquad (3)$$
- Hence $\phi^{-1}\!\big(h_{\alpha/2} + \phi(\hat\theta)\big) \approx \hat\theta^*_{\alpha/2}$ and $\phi^{-1}\!\big(h_{1-\alpha/2} + \phi(\hat\theta)\big) \approx \hat\theta^*_{1-\alpha/2}$, with $\hat\theta^*_\gamma$ being the $\gamma$ quantile of the ideal bootstrap distribution $P_*(\cdot)$ of $\hat\theta^*$, which can be estimated by $\hat\theta^*_{([B+1]\gamma)}$, the sample quantile (order statistic) from the B bootstrap replicates of $\hat\theta$.


Percentile bootstrap confidence intervals (4)


A justification of the percentile method CI for $\theta$ (continued):
- On the other hand, (2) can be rewritten as
  $$P\left[\phi^{-1}\!\big(h_{\alpha/2} + \phi(\hat\theta)\big) \le \theta \le \phi^{-1}\!\big(h_{1-\alpha/2} + \phi(\hat\theta)\big)\right] = 1 - \alpha \qquad (4)$$
  noting that $H$ has a symmetric pdf so that $h_{\alpha/2} = -h_{1-\alpha/2}$.
- Therefore, by comparing (3) and (4) we know
  $$\left[\hat\theta^*_{\alpha/2},\; \hat\theta^*_{1-\alpha/2}\right] \approx \left[\hat\theta^*_{([B+1]\alpha/2)},\; \hat\theta^*_{([B+1](1-\alpha/2))}\right]$$
  can serve as an approximate $100(1-\alpha)\%$ C.I. for $\theta$, which is called the (basic) percentile bootstrap CI.

Basic (or residual) bootstrap confidence intervals


- Taking $\phi$ to be the identity transformation, eq. (2) becomes
  $$P\left[h_{\alpha/2} \le \hat\theta - \theta \le h_{1-\alpha/2}\right] = 1 - \alpha. \qquad (5)$$
  We call $\hat\theta - \theta$ the residual of the estimator $\hat\theta$.
- By the bootstrap principle, $h_\gamma \approx (\hat\theta^* - \hat\theta)_\gamma = \hat\theta^*_\gamma - \hat\theta$, where $(\hat\theta^* - \hat\theta)_\gamma$ is the $\gamma$ sample quantile of $\hat\theta^* - \hat\theta$. Using this approximation, (5) becomes $P\left[\hat\theta^*_{\alpha/2} - \hat\theta \le \hat\theta - \theta \le \hat\theta^*_{1-\alpha/2} - \hat\theta\right] \approx 1 - \alpha$, which is
  $$P\left[2\hat\theta - \hat\theta^*_{1-\alpha/2} \le \theta \le 2\hat\theta - \hat\theta^*_{\alpha/2}\right] \approx 1 - \alpha. \qquad (6)$$
- This suggests the following approximate $100(1-\alpha)\%$ basic (or residual) bootstrap CI for $\theta$:
  $$\left[2\hat\theta - \hat\theta^*_{1-\alpha/2},\; 2\hat\theta - \hat\theta^*_{\alpha/2}\right] \approx \left[2\hat\theta - \hat\theta^*_{([B+1](1-\alpha/2))},\; 2\hat\theta - \hat\theta^*_{([B+1]\alpha/2)}\right].$$
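Again using Example 4.3's boot1 object, the basic interval can be reproduced by hand:

q <- quantile(boot1$t, prob = c(0.025, 0.975), type = 6)
c(2*boot1$t0 - q[2], 2*boot1$t0 - q[1])   # compare with boot.ci's Basic interval
                                          # (-0.1963, -0.1626)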


BCa bootstrap confidence intervals (1)


- The basic (residual) bootstrap CI tends to suffer from the same defects as the basic percentile bootstrap CI does. Namely, it is prone to bias and inaccurate coverage probabilities.
- For these two CIs to work well, the cdf $H$ there is required to be free of $\theta$. This implies that a stronger transformation is needed to get a pivotal quantity for $\hat\theta$, and to find the CI based on the pivot.
- The bias corrected and accelerated percentile method, or BCa, is motivated by this finding, and has been used to derive CIs for $\theta$ with substantial improvement over the previous two percentile methods.


BCa bootstrap confidence intervals (2)

- The idea behind the BCa method is to assume the existence of a transformation of $\hat\theta$ whose distribution is (asymptotically) normal and whose mean and standard deviation depend in a particular way on $\theta$, so that an (asymptotic) pivot can be easily constructed.
- A CI is made for the transformed parameter and then the interval is inverted to obtain an interval for $\theta$. By using the bootstrap method, the inversion can be done without knowledge of the explicit form of the transformation.


BCa bootstrap confidence intervals (3)


- Suppose there is a strictly increasing transformation $\phi$ such that $\phi(\hat\theta)$ has a normal distribution with
  $$E[\phi(\hat\theta)] = \phi(\theta) - c_0[1 + a\phi(\theta)] \quad\text{and}\quad Var[\phi(\hat\theta)] = [1 + a\phi(\theta)]^2.$$
  Namely,
  $$\frac{\phi(\hat\theta) - \phi(\theta)}{1 + a\phi(\theta)} + c_0 \;\sim\; N(0, 1). \qquad (7)$$
- If $z_p$ is the 100p-th percentile of $N(0, 1)$ with $p = 1 - \frac{\alpha}{2}$, then
  $$P\left(-z_p \le \frac{\phi(\hat\theta) - \phi(\theta)}{1 + a\phi(\theta)} + c_0 \le z_p\right) = p - (1 - p) = 1 - \alpha$$
  $$\Longleftrightarrow\quad P\left(\frac{\phi(\hat\theta) + c_0 - z_p}{1 - a(c_0 - z_p)} \le \phi(\theta) \le \frac{\phi(\hat\theta) + c_0 + z_p}{1 - a(c_0 + z_p)}\right) = 1 - \alpha,$$


BCa bootstrap confidence intervals (4)


which suggests a $100(1-\alpha)\%$ CI for $\phi(\theta)$ as
$$[L,\; U] = \left[\frac{\phi(\hat\theta) + c_0 - z_p}{1 - a(c_0 - z_p)},\;\; \frac{\phi(\hat\theta) + c_0 + z_p}{1 - a(c_0 + z_p)}\right],$$
not computable since $\phi$ is unknown. And a $100(1-\alpha)\%$ CI for $\theta$ would be $[\phi^{-1}(L), \phi^{-1}(U)]$.

- By the bootstrap principle and (7), $\dfrac{\phi(\hat\theta^*) - \phi(\hat\theta)}{1 + a\phi(\hat\theta)} + c_0$ is approximately $N(0, 1)$.
- Thus it can be verified that (note $p = 1 - \frac{\alpha}{2}$)
  $$P_*\left[\phi(\hat\theta^*) \le U\right] = P_*\left(\frac{\phi(\hat\theta^*) - \phi(\hat\theta)}{1 + a\phi(\hat\theta)} + c_0 \le \frac{c_0 + z_p}{1 - a(c_0 + z_p)} + c_0\right) \approx \Phi\left(\frac{c_0 + z_p}{1 - a(c_0 + z_p)} + c_0\right) \overset{\text{denoted}}{=} p_U.$$


BCa bootstrap confidence intervals (5)


- Hence U is approximately the $p_U$ quantile of the cdf of $\phi(\hat\theta^*)$.
- $\phi^{-1}(U)$ is the UCL of the $100(1-\alpha)\%$ CI for $\theta$ since $\phi$ is strictly increasing. It is also approximately the $p_U$ quantile of the cdf of $\phi^{-1}(\phi(\hat\theta^*)) = \hat\theta^*$, which can be estimated by $\hat\theta^*_{p_U}$, the $p_U$ sample quantile of the bootstrap replicates of $\hat\theta$.
- Similarly (note $p = 1 - \frac{\alpha}{2}$),
  $$P_*\left[\phi(\hat\theta^*) \le L\right] = P_*\left(\frac{\phi(\hat\theta^*) - \phi(\hat\theta)}{1 + a\phi(\hat\theta)} + c_0 \le \frac{c_0 - z_p}{1 - a(c_0 - z_p)} + c_0\right) \approx \Phi\left(\frac{c_0 - z_p}{1 - a(c_0 - z_p)} + c_0\right) \overset{\text{denoted}}{=} p_L.$$
- So the LCL of the $100(1-\alpha)\%$ CI for $\theta$ can be estimated by $\hat\theta^*_{p_L}$, the $p_L$ sample quantile of the bootstrap replicates of $\hat\theta$.


BCa bootstrap confidence intervals (6)


- Given values of $c_0$ and $a$, we can compute $p_U$ and $p_L$, and accordingly a $100(1-\alpha)\%$ BCa bootstrap CI of $\theta$ as
  $$\left[\hat\theta^*_{p_L},\; \hat\theta^*_{p_U}\right] \approx \left[\hat\theta^*_{([B+1]p_L)},\; \hat\theta^*_{([B+1]p_U)}\right].$$
- E.g., given $c_0 = 0.20$, $a = 0.01$ and $\alpha = 0.05$ (so $p = 0.975$), a 95% BCa CI of $\theta$ is
  $$\left[\hat\theta^*_{0.063},\; \hat\theta^*_{0.992}\right] = \left[\hat\theta^*_{(0.063[B+1])},\; \hat\theta^*_{(0.992[B+1])}\right],$$
  which can be read off from the B bootstrap replicates of $\hat\theta$.
- The value of $c_0$ is determined by the relative position of $\hat\theta$ among the bootstrap replicates of $\hat\theta$, while the value of $a$ is determined by the skewness of the bootstrap replicates of $\hat\theta$.


BCa bootstrap confidence intervals (7)



- Specifically, let $p_0 = \frac{\#\{\hat\theta^*_r < \hat\theta\}}{B} = \hat F^*(\hat\theta)$, the proportion of the replicates that are $< \hat\theta$; and let $\hat\theta_{(i)}$ be the estimate of $\theta$ based on the data $x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n$. We call the $\hat\theta_{(i)}$'s the jackknife replicates of $\hat\theta$. Then
  - $c_0 = \Phi^{-1}(p_0) = \Phi^{-1}(\hat F^*(\hat\theta))$, the $p_0$ quantile of $N(0, 1)$;
  - $a = \dfrac{\sum_{i=1}^n (\bar\theta_J - \hat\theta_{(i)})^3}{6\left[\sum_{i=1}^n (\bar\theta_J - \hat\theta_{(i)})^2\right]^{3/2}}$, where $\bar\theta_J = \frac{1}{n}\sum_{i=1}^n \hat\theta_{(i)}$.
- Note the $(n-1)(\bar\theta_J - \hat\theta_{(i)})$ are related to the EIF (empirical influence function). In boot.ci(), the $EIF_i$ are used in place of the $(\bar\theta_J - \hat\theta_{(i)})$.
- Sometimes $a = 0$ is set. Then the resultant interval is called the BC (bias corrected) CI of $\theta$.

Computing a $100(1-\alpha)\%$ BCa bootstrap CI of $\theta$


In summary, compute a BCa CI for $\theta$ as follows:
1. Generate B replicates $\hat\theta_1^*, \dots, \hat\theta_B^*$ by a bootstrap method.
2. Compute $c_0 = \Phi^{-1}(p_0)$, where $p_0 = \frac{1}{B}\sum_{r=1}^B I(\hat\theta_r^* < \hat\theta)$. Also compute $a = \dfrac{\sum_{i=1}^n (\bar\theta_J - \hat\theta_{(i)})^3}{6\left[\sum_{i=1}^n (\bar\theta_J - \hat\theta_{(i)})^2\right]^{3/2}}$, where $\bar\theta_J = \frac{1}{n}\sum_{i=1}^n \hat\theta_{(i)}$.
3. Compute $p_U = \Phi\left(\dfrac{c_0 + z_{1-\frac{\alpha}{2}}}{1 - a(c_0 + z_{1-\frac{\alpha}{2}})} + c_0\right)$ and $p_L = \Phi\left(\dfrac{c_0 - z_{1-\frac{\alpha}{2}}}{1 - a(c_0 - z_{1-\frac{\alpha}{2}})} + c_0\right)$.
4. Find the order statistics $\hat\theta^*_{([B+1]p_L)}$ and $\hat\theta^*_{([B+1]p_U)}$ from the B bootstrap replicates. Then a $100(1-\alpha)\%$ BCa bootstrap CI of $\theta$ is $\left[\hat\theta^*_{([B+1]p_L)},\; \hat\theta^*_{([B+1]p_U)}\right]$ or $\left[\hat\theta^*_{p_L},\; \hat\theta^*_{p_U}\right]$.

Example 4.4 BCa CI involving copper-nickel alloy data


Example 4.4 (Example 4.3 continued) In Example 4.3 part 1 we found a 95% BCa CI as $(-0.2047, -0.1731)$. This can be verified following the procedure on the previous page.

> boot.ci(boot1)$bca
     conf
[1,] 0.95 72.56 1968.28 -0.2047184 -0.1730628
> p0=sum(boot1$t < boot1$t0)/boot1$R; c0=qnorm(p0)
> eif=empinf(boot1); a=sum(eif^3)/(6*(sum(eif^2))^(3/2))
> pu=pnorm((c0+qnorm(0.975))/(1-a*(c0+qnorm(0.975)))+c0)
> pl=pnorm((c0-qnorm(0.975))/(1-a*(c0-qnorm(0.975)))+c0)
> quantile(boot1$t, prob=c(pl,pu), type=6)
 3.627772%  98.41409%
-0.2047184 -0.1730626
> p0; c0; a; pl; pu; pl*2000; pu*2000
[1] 0.5087544; [1] 0.02194573; [1] 0.03419632;
[1] 0.03627772; [1] 0.9841409; [1] 72.55545; [1] 1968.282

Studentized bootstrap confidence intervals (1)


- A more intuitive way to construct an appropriate pivot for bootstrap is the studentized bootstrap, or bootstrap t, method.
- Suppose $\theta = T(F)$ is to be estimated using $\hat\theta = T(\hat F)$, with $V(\hat F)$ estimating the variance of $\hat\theta$.
- Then it is reasonable to expect that $R(\mathcal{X}, F) = \dfrac{T(\hat F) - T(F)}{\sqrt{V(\hat F)}}$ will be roughly pivotal. Bootstrapping $R(\mathcal{X}, F)$ yields a collection of $R(\mathcal{X}^*, \hat F)$.
- Denote by $G$ and $G^*$ the distributions of $R(\mathcal{X}, F)$ and $R(\mathcal{X}^*, \hat F)$ respectively.


Studentized bootstrap confidence intervals (2)


- Theoretically a $100(1-\alpha)\%$ CI for $\theta$ can be obtained using
  $$P\left[\xi_{\alpha/2}(G) \le R(\mathcal{X}, F) \le \xi_{1-\alpha/2}(G)\right]
    = P\left[\hat\theta - \xi_{1-\alpha/2}(G)\sqrt{V(\hat F)} \le \theta \le \hat\theta - \xi_{\alpha/2}(G)\sqrt{V(\hat F)}\right] = 1 - \alpha,$$
  where $\xi_\gamma(G)$ denotes the $\gamma$ quantile of $G$. These quantiles are unknown but can be estimated under the bootstrap principle, so $\xi_\gamma(G) \approx \xi_\gamma(G^*)$.
- This gives the $100(1-\alpha)\%$ studentized bootstrap CI of $\theta$:
  $$\left[T(\hat F) - \xi_{1-\alpha/2}(G^*)\sqrt{V(\hat F)},\;\; T(\hat F) - \xi_{\alpha/2}(G^*)\sqrt{V(\hat F)}\right]
    = \left[\hat\theta - \xi_{1-\alpha/2}(G^*)\sqrt{\widehat{Var}(\hat\theta)},\;\; \hat\theta - \xi_{\alpha/2}(G^*)\sqrt{\widehat{Var}(\hat\theta)}\right],$$
  where $\xi_\gamma(G^*)$ is the $\gamma$ quantile of $G^*$.

Studentized bootstrap confidence intervals (3)


- To calculate the studentized bootstrap CI for $\theta$, we need the estimated variance $V(\hat F)$, which can be approximated by the bootstrap estimate $sd_B^2(\hat\theta)$ or by using a delta method.
- A more difficult problem in calculating the studentized bootstrap CI for $\theta$ is finding the $\xi_\gamma(G^*)$ values. Note $\xi_\gamma(G^*)$ is the $\gamma$ quantile of $R(\mathcal{X}^*, \hat F) = \dfrac{T(\hat F^*) - T(\hat F)}{\sqrt{V(\hat F^*)}}$ w.r.t. the cdf $G^*$.
- The bootstrap replicates of $R(\mathcal{X}^*, \hat F)$ for B given bootstrap samples are $\dfrac{\hat\theta_1^* - \hat\theta}{\sqrt{\widehat{Var}(\hat\theta_1^*)}}, \dots, \dfrac{\hat\theta_B^* - \hat\theta}{\sqrt{\widehat{Var}(\hat\theta_B^*)}}$. Using $sd_B^2(\hat\theta)$ to replace all the $\widehat{Var}(\hat\theta_j^*)$'s ignores their variation, which reduces to the basic (or residual) bootstrap CI method.
- An approximation method is to calculate $V(\hat F^*) = \widehat{Var}(\hat\theta^*)$ by a delta method for each bootstrap sample.
- One can use a double bootstrap, but it is computationally intensive.


Studentized bootstrap confidence intervals (4)

- The coverage probability of the studentized bootstrap CI closely approximates the nominal confidence level in general.
- The approximation is most reliable when $T(\hat F)$ is a location statistic, in the sense that a constant shift in all the data values will induce the same shift in $T(\hat F)$.
- It is also reliable for variance-stabilized estimators.
- It is however sensitive to the presence of outliers in the dataset, so use the studentized bootstrap CI with caution in such cases.
- Unlike the percentile-based methods, the studentized bootstrap CI is not transformation-respecting.


Example 4.5 Bootstrap t CI for copper-nickel alloy data (1)


Example 4.5 In Example 4.3 part 1 we could not get a 95% studentized bootstrap CI. We can get it now if we incorporate an estimate of Var($\hat\theta$) into the bootstrap procedure.
- Using the delta method, an estimated variance is
  $$\widehat{Var}(\hat\theta) = \left(\frac{\hat\beta_1}{\hat\beta_0}\right)^2 \left[\frac{\widehat{Var}(\hat\beta_1)}{\hat\beta_1^2} + \frac{\widehat{Var}(\hat\beta_0)}{\hat\beta_0^2} - \frac{2\widehat{Cov}(\hat\beta_0, \hat\beta_1)}{\hat\beta_0\hat\beta_1}\right].$$
- We need a statistic function for boot() to generate bootstrap replicates for both $\hat\theta$ and $\widehat{Var}(\hat\theta)$:

lm5.bt=function(x,i){tem=summary(lm(x[i,2]~x[i,1]))
beta=tem$coef[,1]
ratio=beta[2]/beta[1]
v.ratio=ratio^2*(tem$cov[1,1]/beta[1]^2+tem$cov[2,2]/beta[2]^2
                 -2*tem$cov[1,2]/(beta[1]*beta[2]))
return(c(ratio, v.ratio))}


Example 4.5 Bootstrap t CI for copper-nickel alloy data (2)


Using the lm5.bt() function, run boot(), boot.ci() and other functions:

library(boot); set.seed(1234)
boot5=boot(data=z, statistic=lm5.bt, R=1999); boot5
boot.ci(boot5); boot.ci(boot5)$stu
Q=(boot5$t[,1]-boot5$t0[1])/sqrt(boot5$t[,2])
sort(Q)[(1999+1)*0.975]; sort(Q)[(1999+1)*0.025]
quant=quantile(Q,prob=c(0.025,0.975),type=6)
c(boot5$t0[1]-sqrt(boot5$t0[2])*quant[2],boot5$t0[1]-sqrt(boot5$t0[2])*quant[1])
c(boot5$t0[1]-sqrt(boot5$t0[2])*1.96, boot5$t0[1]+sqrt(boot5$t0[2])*1.96)
par(mfrow=c(1,3))
plot(density(boot5$t[,1]), ylim=c(0,60),lwd=2)
hist(boot5$t[,1], breaks=50, freq=F, add=T)
plot(density(boot5$t[,2]^0.5), ylim=c(0,1000),lwd=2)
hist(boot5$t[,2]^0.5, breaks=50, freq=F, add=T)
plot(density(Q),ylim=c(0,0.18),lwd=2)
hist(Q, breaks=50, freq=F, add=T)

Example 4.5 Bootstrap t CI for copper-nickel alloy data (3)


From the following R outputs we see:
- $\hat\theta = -0.185$; and $V(\hat F) = \widehat{Var}(\hat\theta) = 7.4663 \times 10^{-6}$ by the delta method;
- the 2.5 and 97.5 percentiles of $G^*$ are $-6.129$ and $4.360$ respectively;
- so the 95% studentized bootstrap CI for $\theta$ is
  $$\left[-0.185 - 4.360\sqrt{7.4663 \times 10^{-6}},\;\; -0.185 - (-6.129)\sqrt{7.4663 \times 10^{-6}}\right] = [-0.1970,\; -0.1683];$$
- the empirical pdfs of the bootstrap replicates of $\hat\theta$, $\sqrt{V(\hat F^*)}$ and $R(\mathcal{X}^*, \hat F)$ are non-symmetric.


Example 4.5 Bootstrap t CI for copper-nickel alloy data (4)


> boot5
ORDINARY NONPARAMETRIC BOOTSTRAP
Call:
boot(data = z, statistic = lm5.bt, R = 1999)
Bootstrap Statistics :
         original          bias    std. error
t1* -0.1850722153 -1.487880e-03 8.483298e-03
t2*  0.0000074663  1.494497e-06 3.879474e-06
> boot.ci(boot5)
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 1999 bootstrap replicates
CALL : boot.ci(boot.out = boot5)
Intervals :
Level      Normal              Basic              Studentized
95%   (-0.2002, -0.1670 )  (-0.1963, -0.1626 )  (-0.1970, -0.1683 )
Level     Percentile            BCa
95%   (-0.2076, -0.1738 )  (-0.2047, -0.1731 )
Calculations and Intervals on Original Scale

Example 4.5 Bootstrap t CI for copper-nickel alloy data (5)


> boot.ci(boot5)$stu
     conf
[1,] 0.95 1950 50 -0.1969873 -0.1683238
> Q=(boot5$t[,1]-boot5$t0[1])/sqrt(boot5$t[,2])
> sort(Q)[(1999+1)*0.025]; sort(Q)[(1999+1)*0.975]
[1] -6.129442
[1] 4.360584
> quant=quantile(Q,prob=c(0.025,0.975),type=6); quant
     2.5%     97.5%
-6.129442  4.360584
> c(boot5$t0[1]-sqrt(boot5$t0[2])*quant[2],
    boot5$t0[1]-sqrt(boot5$t0[2])*quant[1])
-0.1969873 -0.1683238
#verifying the results given by boot.ci(boot5)$studen
> c(boot5$t0[1]-sqrt(boot5$t0[2])*1.96, boot5$t0[1]+sqrt(boot5$t0[2])*1.96)
-0.1904278 -0.1797166

Example 4.5 Bootstrap t CI for copper-nickel alloy data (6)


> par(mfrow=c(1,3))
> plot(density(boot5$t[,1]), ylim=c(0,60),lwd=2)
> hist(boot5$t[,1], breaks=50, freq=F, add=T)
> plot(density(boot5$t[,2]^0.5), ylim=c(0,1000),lwd=2)
> hist(boot5$t[,2]^0.5, breaks=50, freq=F, add=T)
> plot(density(Q),ylim=c(0,0.18),lwd=2)
> hist(Q, breaks=50, freq=F, add=T)

[Figure: three density/histogram panels of the 1999 bootstrap replicates: density.default(x = boot5$t[, 1]) with Bandwidth 0.001472; density.default(x = boot5$t[, 2]^0.5) with Bandwidth 8.967e-05; density.default(x = Q) with Bandwidth 0.5123.]


Empirical variance stabilization (1)


- A variance-stabilization transformation of an estimator is one for which the sampling variance of the transformed estimator does not depend on $\theta$. It is often the basis for a good pivot.
- The mostly unknown variance-stabilization transformation can be estimated using the (double) bootstrap.
- Let $Z$ be a r.v. with mean $\theta$ and standard deviation $s(\theta)$. By the delta method, $Var[g(Z)] \approx g'(\theta)^2 s^2(\theta)$.
- For $Var[g(Z)]$ to be constant, we require $g(z) = \int_a^z s^{-1}(u)\,du$, where $a$ is any value such that $s^{-1}(u)$ is continuous on $[a, z]$.
- Given a sequence of $(u, s(u))$ values, $g(z)$ or $g(\hat\theta)$ can be estimated using numerical integration.

Empirical variance stabilization (2)


- The sequence of $(u, s(u))$ values is generated using the bootstrap:
  1. Draw $B_1$ bootstrap samples $\mathcal{X}_j^*$ for $j = 1, \dots, B_1$ from the original data $\mathcal{X}$. Calculate the bootstrap replicates $\hat\theta_j^*$, $j = 1, \dots, B_1$.
  2. From each $\mathcal{X}_j^*$, draw $B_2$ bootstrap samples $\mathcal{X}_{j1}^{**}, \dots, \mathcal{X}_{jB_2}^{**}$, and calculate $\hat\theta_{j1}^{**}, \dots, \hat\theta_{jB_2}^{**}$.
  3. For each $j = 1, \dots, B_1$, calculate $s^{*2}(\hat\theta_j^*) = \frac{1}{B_2-1}\sum_{k=1}^{B_2} (\hat\theta_{jk}^{**} - \bar\theta_j^{**})^2$ with $\bar\theta_j^{**} = \frac{1}{B_2}\sum_{k=1}^{B_2} \hat\theta_{jk}^{**}$.
  4. Return the sequence $(\hat\theta_1^*, s^*(\hat\theta_1^*)), \dots, (\hat\theta_{B_1}^*, s^*(\hat\theta_{B_1}^*))$.
- Once the variance-stabilization transformation $g(z)$ is estimated (as $\tilde g(z)$), we can apply a further bootstrap procedure to find a (either percentile-based or studentized) CI for $\tilde g(\theta)$, and then invert the CI back to one for $\theta$.
- The procedure is computing-intensive. Details are skipped here, but a sketch of steps 1-4 follows.
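A minimal sketch of steps 1-4 (an assumed example with $\theta$ the population mean and simulated data; $B_1$, $B_2$ kept small for illustration):

set.seed(1)
x <- rnorm(30); n <- length(x)             # assumed data
B1 <- 100; B2 <- 50
theta.star <- s.star <- numeric(B1)
for (j in 1:B1) {
  xj <- sample(x, n, replace = TRUE)                        # step 1
  theta.star[j] <- mean(xj)
  tt <- replicate(B2, mean(sample(xj, n, replace = TRUE)))  # step 2
  s.star[j] <- sd(tt)                                       # step 3
}
# step 4: the (theta.star, s.star) pairs; one would then smooth s(u) and
# numerically integrate 1/s(u) to estimate the variance-stabilizing g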


Nested bootstrap and prepivoting (1)


- The strategy of drawing further bootstrap samples from each bootstrap sample of the original data can be used to provide another approach to pivoting in finding bootstrap CIs. The approach is called the nested bootstrap, iterated bootstrap or double bootstrap.
- Suppose a statistic $R_0(\mathcal{X}, F)$, involving the parameter $\theta = T(F)$, can be used to construct a CI for $\theta$ if the distribution of $R_0(\mathcal{X}, F)$ is known. Suppose the data $\{x_1, \dots, x_n\}$ are observed from the model $X_1, \dots, X_n \overset{iid}{\sim} F$.
- Let $F_0(q, F) = P[R_0(\mathcal{X}, F) \le q]$ be the cdf of $R_0(\mathcal{X}, F)$, where we make explicit its dependence on $F$. Now a CI for $\theta$ could be derived based on the statement
  $$P[F_0^{-1}(\alpha/2, F) \le R_0(\mathcal{X}, F) \le F_0^{-1}(1 - \alpha/2, F)] = 1 - \alpha.$$


Nested bootstrap and prepivoting (2)


- Of course $F_0(q, F)$ is unknown, so what we have been doing is to use the bootstrap to approximate $F_0(q, F)$ and its quantiles. As approximation is involved, the CI constructed will not have coverage probability exactly equal to $1 - \alpha$. The error in the approximation can be quite bad if $R_0(\mathcal{X}, F)$ is not a pivot.
- However, the random variable $R_1(\mathcal{X}, F) = F_0(R_0(\mathcal{X}, F), F) \sim U(0, 1)$ is a pivot. This means, for a bootstrap estimate $F_0(q, \hat F)$ of $F_0(q, F)$, the difference between $U(0, 1)$ and the distribution of $\hat R_1(\mathcal{X}, F) = F_0(R_0(\mathcal{X}, F), \hat F)$ should be smaller than that between $F_0(q, \hat F)$ and $F_0(q, F)$.
- This suggests we can use the bootstrap distribution of $\hat R_1(\mathcal{X}, F)$ to construct a CI for $\theta$, instead of using the bootstrap distribution of $R_0(\mathcal{X}, F)$.


Nested bootstrap and prepivoting (3)


I Let F1(q, F) = P[R̂1(X, F) ≤ q] be the cdf of R̂1(X, F). The
  100(1 - α)% CI for θ based on the bootstrap distribution of
  R̂1(X, F) is fashioned after the statement
  P[F1^{-1}(α/2, F) ≤ R̂1(X, F) ≤ F1^{-1}(1 - α/2, F)] = 1 - α.

I Note the randomness in R̂1(X, F) comes from two sources:
  1. the random observations {x1, ..., xn} from F, which determine F̂;
  2. R̂1(X, F) = F̂0(R0(X, F), F̂) is calculated from random sampling
     from F̂.

I These two sources of randomness are captured in the following
  nested/iterated/double bootstrap algorithm, which gives a
  double bootstrap CI for θ.


Nested bootstrap and prepivoting (4)


Nested/iterated/double bootstrap algorithm:
1 Generate B0 bootstrap samples X1*, ..., XB0* from {x1, ..., xn}.
2 Compute R0(Xj*, F̂) for j = 1, ..., B0.
3 For j = 1, ..., B0:
  (a) Let F̂j* be the empirical cdf of Xj*. Draw B1 bootstrap samples
      Xj1**, ..., XjB1** from F̂j*.
  (b) Compute R0(Xjk**, F̂j*) for k = 1, ..., B1.
  (c) Compute
      R̂1(Xj*, F̂) = F̂0(R0(Xj*, F̂), F̂)
                  = (1/B1) Σ_{k=1}^{B1} I[R0(Xjk**, F̂j*) ≤ R0(Xj*, F̂)].
4 Denote by F̂1 the empirical cdf of R̂1(X1*, F̂), ..., R̂1(XB0*, F̂).
5 Use R̂1({x1, ..., xn}, F̂) = F̂0(R0({x1, ..., xn}, F̂), F̂) and the
  quantiles of F̂1 to construct the CI for θ, following the statement
  P[F̂1^{-1}(α/2) ≤ R̂1(X, F) ≤ F̂1^{-1}(1 - α/2)] ≈ 1 - α.
1

Nested bootstrap and prepivoting (5)


Remarks:
1. Steps 1 and 2 of the algorithm aim to capture the first source
   of randomness, by applying the bootstrap principle to approximate
   R0(X, F) by R0(X*, F̂).
2. Step 3 aims to capture the second source of randomness,
   introduced in R̂1 when R0 is bootstrapped conditional on F̂.
3. The double bootstrap is much more computing-intensive than the
   usual bootstrap, because B0 × B1 bootstrap samples need to be
   generated.
4. As it needs to capture two sources of randomness, the double
   bootstrap may not be as good as the two pivoting methods, BCa and
   studentized t, when the assumptions involved in the latter are
   satisfied. But the former can be applied in situations where the
   assumptions for the latter are not satisfied.

Example 4.6 Double bootstrap CI for copper-nickel alloy data (1)


Example 4.6 Continuing the analysis of the copper-nickel alloy data,
we want to find a 95% CI for θ = β1/β0 by double bootstrap. We will
use the bootstrap cases approach to generate each bootstrap sample.
(The bootstrap residuals approach is left as an exercise.)

I First define R0({x1, ..., xn}, F) = θ̂ - θ = β̂1/β̂0 - β1/β0.
  R̂1(X, F̂) and F̂1 are determined accordingly.

I The boot package does not have a function implementing the double
  bootstrap, so we write our own lm6.dbt() in R.

I Then set B0 = B1 = 1000 and execute
  lm6.dbt(z, B0=1000, B1=1000, conf.lev=0.95).


Example 4.6 Double bootstrap CI for copper-nickel alloy data (2)


I The histogram of R̂1 shows that F̂1 differs noticeably from the
  uniform. The double bootstrap gives the 2.5 and 97.5 percentiles
  of R̂1 as 0.01925 and 0.997, respectively. The 0.01925 and 0.997
  quantiles (i.e. the 1.925 and 99.7 percentiles) of
  R0(X, F̂) = θ̂* - θ̂ are then found to be -0.02195345 and
  0.02156509, respectively. Hence a 95% double bootstrap CI for θ is
  [θ̂ - 0.02156509, θ̂ - (-0.02195345)] = [-0.2066373, -0.1631188],
  knowing θ̂ = -0.1850722.


Example 4.6 Double bootstrap CI for copper-nickel alloy data (3)


lm6.dbt = function(x, B0, B1, conf.lev=0.95){
  # Double bootstrap CI for theta = beta1/beta0 in simple linear
  # regression, using the bootstrap cases approach.
  n = nrow(x); R0.star = rep(0,B0); R0.2star = rep(0,B1); R1.hat = rep(0,B0)
  tem0 = lm(x[,2] ~ x[,1])
  ratio0 = tem0$coef[2]/tem0$coef[1]            # theta.hat from original data
  for(j in 1:B0){
    i1 = sample.int(n, size=n, replace=TRUE)    # outer bootstrap sample
    x1 = x[i1,]
    tem1 = lm(x1[,2] ~ x1[,1])
    ratio1 = tem1$coef[2]/tem1$coef[1]
    R0.star[j] = ratio1 - ratio0                # R0(Xj*, F.hat)
    for(k in 1:B1){
      i2 = sample.int(n, size=n, replace=TRUE)  # inner bootstrap sample
      tem2 = lm(x1[i2,2] ~ x1[i2,1])
      ratio2 = tem2$coef[2]/tem2$coef[1]
      R0.2star[k] = ratio2 - ratio1             # R0(Xjk**, F.hat_j*)
    } # end loop k
    R1.hat[j] = mean(R0.2star <= R0.star[j])    # R1.hat(Xj*, F.hat)
  } # end loop j
  qL = quantile(R1.hat, prob=(1-conf.lev)/2, type=6, na.rm=TRUE)
  qU = quantile(R1.hat, prob=1-(1-conf.lev)/2, type=6, na.rm=TRUE)
  # qL and qU are the alpha/2 and (1-alpha/2) quantiles of R1.hat.
  # The qL and qU quantiles of R0.star are used to find the CI of ratio.
  L = ratio0 - quantile(R0.star, prob=qU, type=6, na.rm=TRUE)
  U = ratio0 - quantile(R0.star, prob=qL, type=6, na.rm=TRUE)
  list(theta=ratio0, qL=qL, qU=qU, L=L, U=U, R1.hat=R1.hat, R0.star=R0.star)
}

Example 4.6 Double bootstrap CI for copper-nickel alloy data (4)


> ptm=proc.time(); set.seed(1234); res6=lm6.dbt(x=z,B0=1000,B1=1000,conf=0.95)
> proc.time()-ptm
   user  system elapsed
2264.53    0.12 2273.27
> res6
$theta
-0.1850722
$qL
   2.5%
0.01925
$qU
  97.5%
  0.997
$L
-0.2066373
$U
-0.1631188
$R1.hat
[1] 0.174 0.864 0.851 0.278 NA 0.509 0.358 0.836 ......
$R0.star
[1] -1.376886e-02 4.653109e-03 6.934917e-03 ......

Example 4.6 Double bootstrap CI for copper-nickel alloy data (5)


> quantile(res6$R0.star, prob=c(0.01925, 0.997), type=6)
     1.925%       99.7%
-0.02195345  0.02156509
> c(res6$theta-0.02156509, res6$theta-(-0.02195345))
-0.2066373 -0.1631188
> par(mfrow=c(1,2))
> hist(res6$R1.hat, breaks=30, freq=F)
> plot(density(res6$R0.star, na.rm=T),ylim=c(0,60), lwd=2)
> hist(res6$R0.star, breaks=30, freq=F, add=T)
[Figure: left panel, histogram of res6$R1.hat (clearly non-uniform);
right panel, density.default(x = res6$R0.star, na.rm = T) overlaid with
the histogram of res6$R0.star; N = 1000, bandwidth = 0.001716.]


4.3.3 Bootstrap hypothesis testing

Bootstrap hypothesis testing (1)

I Hypothesis testing (HT) can be performed using the bootstrap.

I For example, HT for H0: θ = θ0 vs. H1: θ ≠ θ0 can be done simply
  based on a (1 - α)100% bootstrap CI for θ: H0 is rejected at
  significance level α if the CI does not cover θ0.

I However, caution should be exercised when bootstrapping HT. In
  particular, be careful about the selection of the (approximate)
  pivot R(X, F), and of the bootstrap replicates being used to
  estimate its sampling distribution.


Bootstrap hypothesis testing (2)


I For example, let the test statistic be R(X, F) = θ̂ - θ0. The
  distribution of R(X, F) under H0: θ = θ0 is required for HT.

I There is a temptation to generate values of R(X*, F̂) = θ̂* - θ0,
  with the null value θ0 being used, via the bootstrap, to
  approximate the pdf of R(X, F) under H0. However, the bootstrap
  distribution of R(X*, F̂) actually approximates that of R(X, F)
  under the true value of θ, because the sample X is observed from
  F with the true value.

I Therefore, the p-value obtained by comparing R({x1, ..., xn}, F)
  with the bootstrap pdf of R(X*, F̂) is unlikely to be significant,
  whether or not θ0 is significantly different from the true value
  of θ. A small illustration follows.
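A small simulation illustrating this pitfall, under assumed data
x ~ N(1, 1) and H0: μ = 0 (the setup and names are ours):

# Naive (incorrect) bootstrap test: compare theta.hat - mu0 with the
# bootstrap distribution of theta.hat* - mu0. Even though H0 is false
# here, the p-value is typically near 0.5, i.e. never significant.
set.seed(1)
x = rnorm(30, mean = 1)            # true mean is 1, but H0: mu = 0
mu0 = 0
r.obs = mean(x) - mu0
r.star = replicate(2000, mean(sample(x, replace = TRUE)) - mu0)
mean(abs(r.star) >= abs(r.obs))    # about 0.5: fails to reject a false H0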


Bootstrap hypothesis testing (3)


I The fact is that it is not possible to bootstrap the distribution
  of R(X, F) under H0. So the distribution of θ̂ - θ under the true
  value of θ is actually used as the reference distribution for
  R(X, F) in bootstrap HT. The bootstrap distribution of
  R(X*, F̂) = θ̂* - θ̂ is used to approximate this reference
  distribution. It is easy to see that, if θ0 is significantly
  different from the true value, R(X, F) = θ̂ - θ0 will look very
  unusual compared with the bootstrap distribution of
  R(X*, F̂) = θ̂* - θ̂, a significant p-value will be returned, and
  hence H0 will be rejected.

I We have seen that the paradigm behind bootstrap HT can be quite
  different from that of traditional HT. Hall and Wilson (1991) have
  addressed these issues and provided advice to improve the power
  and accuracy of bootstrap HT.


Bootstrap hypothesis testing (4)


I Using an appropriate pivot is still important in bootstrap HT.

I It is often best to base HT on the bootstrap distribution of
  (θ̂* - θ̂)/σ̂*, where σ̂* is a good estimator of sd(θ̂*). This pivot
  usually gives better results than θ̂* - θ̂, θ̂* - θ0, or
  (θ̂* - θ0)/σ̂, where σ̂ estimates sd(θ̂) from the original dataset.
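A minimal R sketch of a studentized bootstrap test of H0: μ = μ0 for
a population mean, using this recommended pivot (the function name
boot.t.test and the choice of the sample mean as θ̂ are illustrative):

boot.t.test = function(x, mu0, B = 2000) {
  # Compare the observed (theta.hat - mu0)/se with the bootstrap
  # distribution of (theta.hat* - theta.hat)/se*.
  n = length(x)
  t.obs = (mean(x) - mu0) / (sd(x)/sqrt(n))
  t.star = replicate(B, {
    xs = sample(x, n, replace = TRUE)
    (mean(xs) - mean(x)) / (sd(xs)/sqrt(n))   # centred at mean(x), not mu0
  })
  mean(abs(t.star) >= abs(t.obs))             # two-sided bootstrap p-value
}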

I Finally, note that permutation testing (or randomization testing)
  is another important HT method using a resampling approach, like
  bootstrap HT. Permutation tests can provide exact p-values if all
  possible permutations are considered, which bootstrap HT cannot.
  Permutation tests are often more powerful than their bootstrap
  counterparts; however, bootstrap HT requires less stringent
  assumptions and provides greater flexibility. Permutation tests
  will not be detailed here.


4.4.1 Balanced bootstrap

Balanced bootstrap (1)


I Balanced bootstrap is an approach to reducing the Monte Carlo
  error induced by bootstrap sampling.

I Consider a bootstrap bias correction of the sample mean. We know
  the bias of the sample mean X̄ in estimating the population mean μ
  is 0.

I Let R(X, F) = X̄ - μ be the bias quantity, and R(X*, F̂) be its
  bootstrap replicate. Then E_F[R(X, F)] = 0 as X̄ is unbiased.
  However, the bootstrap estimate of the bias,
  b̂B(X̄) = (1/B) Σ_{j=1}^B R(Xj*, F̂) = (1/B) Σ_{j=1}^B [X̄j* - X̄],
  is unlikely to be 0 in the ordinary bootstrap. This is caused by
  the Monte Carlo variation in generating bootstrap samples.

I However, b̂B(X̄) = 0 exactly if each value occurs in the combined
  collection of bootstrap samples with the same relative frequency
  as it does in the observed sample.

Balanced bootstrap (2)


I Hence, by balancing the bootstrap samples in this manner, a
  source of potential Monte Carlo error is eliminated.

I This motivates the use of balanced bootstrap samples in
  bootstrap methods.

I The simplest way to get B balanced bootstrap samples is to
  concatenate B copies of the observed sample of size n, randomly
  permute this series, and then read off B blocks of size n
  sequentially; the jth block becomes the jth bootstrap sample Xj*.
  Because of the permutation involved, the balanced bootstrap is
  also called the permutation bootstrap (see the sketch below).

I More elaborate balancing algorithms are possible, but will not be
  discussed here.
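A minimal R sketch of the permutation (balanced) bootstrap, in which
every observation appears exactly B times across the B samples (the
name balanced.boot is ours):

balanced.boot = function(x, B) {
  n = length(x)
  s = sample(rep(x, B))                         # concatenate B copies, permute
  matrix(s, nrow = B, ncol = n, byrow = TRUE)   # row j = jth bootstrap sample
}

# Check: the bootstrap bias estimate of the sample mean is exactly 0.
x = rnorm(20); Xstar = balanced.boot(x, B = 100)
mean(rowMeans(Xstar)) - mean(x)                 # 0 up to floating-point error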


4.4.2 Antithetic bootstrap

Antithetic bootstrap (1)


I For a sample of univariate data x1, ..., xn, denote the ordered
  data by x(1), ..., x(n). Let π(i) = n - i + 1 be the permutation
  operator that reverses the order statistics.

I Then, for each bootstrap sample X* = {X1*, ..., Xn*}, let
  X** = {X1**, ..., Xn**} denote the sample obtained by substituting
  X(π(i)) for every instance of X(i) in X*. Thus, for example, if X*
  has an unrepresentative predominance of the larger observed data
  values, the smaller observed values will predominate in X**.

I Using this strategy, each bootstrap draw provides two estimators:
  R(X*, F̂) and R(X**, F̂). The two estimators are often negatively
  correlated.


Antithetic bootstrap (2)


I Let Ra(X, F) = (1/2)[R(X*, F̂) + R(X**, F̂)]. Then Ra has the
  following desirable property:
  Var[Ra(X, F)] = (1/4){Var[R(X*, F̂)] + Var[R(X**, F̂)]
                        + 2 Cov[R(X*, F̂), R(X**, F̂)]}
                ≤ Var[R(X*, F̂)]
  if the covariance is negative (noting Var[R(X**, F̂)] = Var[R(X*, F̂)]
  by symmetry of the construction).

I The above strategy for reducing Monte Carlo error in the bootstrap
  is referred to as the antithetic bootstrap. A sketch follows.

I It is also possible to establish an ordering of multivariate data
  to permit an antithetic bootstrap strategy.
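A minimal R sketch of the antithetic bootstrap for the bias quantity
R = mean(X*) - mean(x) (the name antithetic.boot is ours):

antithetic.boot = function(x, B = 1000) {
  n = length(x)
  xs = sort(x)
  replicate(B, {
    i = sample.int(n, n, replace = TRUE)   # ranks drawn for the bootstrap sample
    r1 = mean(xs[i]) - mean(x)             # R(X*, F.hat)
    r2 = mean(xs[n - i + 1]) - mean(x)     # R(X**, F.hat): reversed ranks
    (r1 + r2) / 2                          # the antithetic average Ra
  })
}

# var(antithetic.boot(x)) is typically much smaller than the variance of
# the ordinary bootstrap replicates mean(X*) - mean(x), since r1 and r2
# are negatively correlated.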


Questions?