
Bootstrapping

Husnain Raza
PMS-017

Bootstrapping

A quick view of bootstrapping
Bootstrap distribution
Cases where bootstrap does not apply
How many bootstraps?
Is it reliable?

A quick view of bootstrapping

Introduced by Bradley Efron in 1979

Popularized in the 1980s due to the introduction of computers in statistical practice.

It has a strong mathematical background.

While it is a method for improving estimators, it is best known as a method for estimating standard errors and bias, and for constructing confidence intervals for parameters.

A quick view of bootstrapping (cont.)

The bootstrap method attempts to determine the probability distribution from the data itself.

The bootstrap method is not a way of reducing the error. It only tries to estimate it.

It makes minimal assumptions. It relies only on the assumption that the sample is a good representation of the unknown population.
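
The idea of estimating the error from the data itself can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function names (bootstrap_statistics, bootstrap_standard_error), the default of 1000 resamples, and the data values are all assumptions made for the example. Each resample draws n values from the original sample with replacement, and the standard deviation of the recomputed statistic serves as the standard error estimate.

```python
import random
import statistics


def bootstrap_statistics(sample, statistic, n_resamples=1000, seed=0):
    """Compute `statistic` on n_resamples resamples drawn with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    # Each resample has the same size as the original sample.
    return [statistic(rng.choices(sample, k=n)) for _ in range(n_resamples)]


def bootstrap_standard_error(sample, statistic, n_resamples=1000, seed=0):
    """Bootstrap standard error: spread of the statistic across resamples."""
    return statistics.stdev(
        bootstrap_statistics(sample, statistic, n_resamples, seed)
    )


# Hypothetical data, for illustration only.
sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9, 3.1, 3.6]

se_mean = bootstrap_standard_error(sample, statistics.mean)
print(f"bootstrap standard error of the mean: {se_mean:.3f}")
```

Note that the original sample is never modified. The resamples only estimate how much the statistic would vary, which matches the point that the bootstrap estimates the error rather than reducing it.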

A quick view of bootstrapping (cont.)

In practice, it is computationally demanding, but progress in computer speed makes it easily available in everyday practice.

The population: population distribution (unknown)

Original sample: sampling distribution

Resamples: bootstrap distribution

Bootstrap distribution

The bootstrap does not replace or add to the original data.

We use the bootstrap distribution as a way to estimate the variation in a statistic based on the original data.

Bootstrap distributions usually approximate the shape, spread, and bias of the actual sampling distribution.

Bootstrap distributions are centered at the value of the statistic from the original data plus any bias.
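
The statement about the center of the bootstrap distribution can be checked numerically. The sketch below is an illustration under stated assumptions: the helper name bootstrap_distribution, the choice of 2000 resamples, and the data are hypothetical. It uses the plug-in variance (dividing by n), a statistic known to underestimate the population variance, so the bootstrap bias estimate is expected to come out negative.

```python
import random
import statistics


def bootstrap_distribution(sample, statistic, n_resamples=2000, seed=0):
    """Values of `statistic` over resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(sample)
    return [statistic(rng.choices(sample, k=n)) for _ in range(n_resamples)]


# Hypothetical data, for illustration only.
sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9, 3.1, 3.6]

# Statistic from the original data: plug-in variance (divides by n).
theta_hat = statistics.pvariance(sample)

boot = bootstrap_distribution(sample, statistics.pvariance)

centre = statistics.mean(boot)      # center of the bootstrap distribution
bias_estimate = centre - theta_hat  # bootstrap estimate of the bias
spread = statistics.stdev(boot)     # bootstrap estimate of the standard error

print(f"statistic on original data: {theta_hat:.3f}")
print(f"bootstrap center:           {centre:.3f}")
print(f"estimated bias:             {bias_estimate:.3f}")
print(f"estimated standard error:   {spread:.3f}")
```

By construction the center equals the original statistic plus the estimated bias, which is the sense in which the bootstrap distribution is "centered at the value of the statistic from the original data plus any bias".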

Cases where bootstrap does not apply

Small data sets: the original sample is not a good approximation of the population.

Dirty data: outliers add variability in our estimates.

Dependence structures (e.g., time series, spatial problems): the bootstrap is based on the assumption of independence.

How many bootstrap samples are needed?

Choice depends on
1. Computer availability
2. Type of the problem: standard errors, confidence intervals
3. Complexity of the problem

How many bootstraps?

Rule of thumb: try it 100 times, then 1000 times, and see if your answers have changed by much.

In any case, a sample of size N has only N^N possible resamples.
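
The rule of thumb can be tried directly. The following is a sketch under assumptions: the helper name bootstrap_se, the seeds, and the data are hypothetical. It estimates the standard error of the mean with 100 and then 1000 resamples and reports the relative change. It also counts the N^N ordered resamples and, as an added fact not on the slide, the number of distinct resamples when order is ignored, which is the binomial coefficient C(2N-1, N).

```python
import math
import random
import statistics


def bootstrap_se(sample, statistic, n_resamples, seed):
    """Bootstrap standard error of `statistic` from n_resamples resamples."""
    rng = random.Random(seed)
    n = len(sample)
    values = [statistic(rng.choices(sample, k=n)) for _ in range(n_resamples)]
    return statistics.stdev(values)


# Hypothetical data, for illustration only.
sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9, 3.1, 3.6]

# Independent runs with 100 and 1000 resamples (different seeds).
se_100 = bootstrap_se(sample, statistics.mean, n_resamples=100, seed=1)
se_1000 = bootstrap_se(sample, statistics.mean, n_resamples=1000, seed=2)
relative_change = abs(se_1000 - se_100) / se_1000

print(f"SE with  100 resamples: {se_100:.3f}")
print(f"SE with 1000 resamples: {se_1000:.3f}")
print(f"relative change:        {relative_change:.1%}")

# Upper bound on the number of resamples for a sample of size N.
n = len(sample)
ordered_resamples = n ** n                     # N^N ordered resamples
distinct_resamples = math.comb(2 * n - 1, n)   # distinct resamples, order ignored
print(f"N = {n}: {ordered_resamples} ordered, {distinct_resamples} distinct")
```

If the relative change is small for the quantity of interest, more resamples are unlikely to alter the answer much. Quantities that depend on the tails, such as confidence interval endpoints, typically need more resamples than a standard error does.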

Is it reliable?

Jury still out on how far it can be applied, but for now
nobody is going to shoot you down for using it.

Good agreement for Normal (Gaussian) distributions; skewed distributions tend to be more problematic, particularly for the tails (the bootstrap underestimates the errors).
