
A Brief Introduction to AdaBoost
Hongbo Deng
6 Feb, 2007

Some of the slides are borrowed from Derek Hoiem & Jan Sochman.
Outline
Background

Adaboost Algorithm

Theory/Interpretations

What's So Good About AdaBoost
Can be used with many different classifiers

Improves classification accuracy

Commonly used in many areas

Simple to implement

Empirically not prone to overfitting

A Brief History
Resampling for estimating a statistic:
Bootstrapping
Bagging
Resampling for classifier design:
Boosting (Schapire 1989)
AdaBoost (Freund & Schapire 1995)

Bootstrap Estimation
Repeatedly draw n samples from D with replacement
For each set of samples, estimate a statistic
The bootstrap estimate is the mean of the individual estimates
Used to estimate a statistic (parameter) and its variance
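A minimal sketch of the procedure in Python (NumPy); the function and dataset names are illustrative:

import numpy as np

def bootstrap_estimate(data, statistic, n_boot=1000, seed=0):
    # Estimate a statistic and its variance by bootstrap resampling.
    rng = np.random.default_rng(seed)
    n = len(data)
    # Repeatedly draw n samples from the data with replacement
    # and apply the statistic to each resample.
    estimates = np.array([statistic(rng.choice(data, size=n, replace=True))
                          for _ in range(n_boot)])
    # The bootstrap estimate is the mean of the individual estimates;
    # the spread of the estimates approximates the statistic's variance.
    return estimates.mean(), estimates.var()

# Example: bootstrap the median of a small sample.
data = np.array([2.1, 3.5, 1.8, 4.2, 2.9, 3.1])
est, var = bootstrap_estimate(data, np.median)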

Bagging - Aggregate Bootstrapping
For i = 1 .. M
Draw n' < n samples from D with replacement
Learn classifier Ci
Final classifier is a vote of C1 .. CM
Increases classifier stability / reduces variance
[Figure: bootstrap samples D1, D2, D3 drawn from D]
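A minimal Python sketch of bagging, assuming labels in {-1, +1} and decision trees as the base classifier (names are illustrative):

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def bagging_fit(X, y, M=25, frac=0.8, seed=0):
    # Train M classifiers, each on a bootstrap sample of the training set.
    rng = np.random.default_rng(seed)
    k = int(frac * len(X))   # draw n' < n samples, as on the slide
    models = []
    for _ in range(M):
        idx = rng.choice(len(X), size=k, replace=True)  # with replacement
        models.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return models

def bagging_predict(models, X):
    # Final classifier: majority vote of C1 .. CM (ties resolve to 0).
    votes = np.stack([m.predict(X) for m in models])
    return np.sign(votes.sum(axis=0))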
Boosting (Schapire 1989)
Consider creating three component classifiers for a two-category problem through boosting:
Randomly select n1 < n samples from D without replacement to obtain D1; train weak learner C1
Select n2 < n samples from D, with half of the samples misclassified by C1, to obtain D2; train weak learner C2
Select all remaining samples from D on which C1 and C2 disagree; train weak learner C3
Final classifier is a vote of the three weak learners
[Figure: partition of D into D1, D2, D3]
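A rough Python sketch of the three-stage procedure, under the same assumptions as above (labels in {-1, +1}, stumps as weak learners); it glosses over edge cases such as one of the subsets coming up empty:

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def boost_three(X, y, n1, seed=0):
    rng = np.random.default_rng(seed)
    n = len(X)
    # D1: n1 < n samples without replacement; train C1.
    i1 = rng.choice(n, size=n1, replace=False)
    c1 = DecisionTreeClassifier(max_depth=1).fit(X[i1], y[i1])
    # D2: half misclassified by C1, half correctly classified; train C2.
    rest = np.setdiff1d(np.arange(n), i1)
    p1 = c1.predict(X[rest])
    wrong = rest[p1 != y[rest]]
    right = rest[p1 == y[rest]]
    k = min(len(wrong), len(right))
    i2 = np.concatenate([wrong[:k], right[:k]])
    c2 = DecisionTreeClassifier(max_depth=1).fit(X[i2], y[i2])
    # D3: remaining samples on which C1 and C2 disagree; train C3.
    rem = np.setdiff1d(rest, i2)
    i3 = rem[c1.predict(X[rem]) != c2.predict(X[rem])]
    c3 = DecisionTreeClassifier(max_depth=1).fit(X[i3], y[i3])
    return c1, c2, c3

def boost_three_predict(classifiers, X):
    # Final classifier: majority vote of C1, C2, C3.
    return np.sign(sum(c.predict(X) for c in classifiers))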

AdaBoost - Adaptive Boosting
Instead of resampling, uses training-set re-weighting
Each training sample is given a weight that determines its probability of being selected for a training set
AdaBoost is an algorithm for constructing a strong classifier as a linear combination of simple weak classifiers
Final classification is based on a weighted vote of the weak classifiers

AdaBoost Terminology
ht(x): weak or basis classifier (classifier = learner = hypothesis)
H(x): strong or final classifier
Weak classifier: error < 50% over any distribution
Strong classifier: thresholded linear combination of the weak classifier outputs
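In symbols (the standard formulation, since the slide gives only the terminology):

$$ f(x) = \sum_{t=1}^{T} \alpha_t h_t(x), \qquad H(x) = \operatorname{sign}\big(f(x)\big) $$

where each weak classifier need only satisfy $\epsilon_t < 1/2$, i.e. do slightly better than chance on its weighted distribution.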

Discrete AdaBoost Algorithm
Each training sample has a weight, which determines the probability of being selected for training the component classifier
[Algorithm box shown as an image in the original slides]
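Below is a minimal Python sketch of standard discrete AdaBoost, assuming labels in {-1, +1} and decision stumps as the weak learner:

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, T=50):
    n = len(X)
    w = np.full(n, 1.0 / n)            # start with uniform sample weights
    learners, alphas = [], []
    for _ in range(T):
        # Train a weak classifier (a decision stump) on the weighted sample.
        h = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
        pred = h.predict(X)
        eps = w[pred != y].sum()       # weighted training error
        if eps >= 0.5:                 # weak-learning assumption violated
            break
        alpha = 0.5 * np.log((1 - eps) / max(eps, 1e-12))
        # Re-weight: misclassified samples gain weight, correct ones lose it.
        w *= np.exp(-alpha * y * pred)
        w /= w.sum()                   # renormalize to a distribution
        learners.append(h)
        alphas.append(alpha)
    return learners, alphas

def adaboost_predict(learners, alphas, X):
    # Strong classifier: sign of the weighted vote of the weak classifiers.
    f = sum(a * h.predict(X) for h, a in zip(learners, alphas))
    return np.sign(f)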

Find the Weak Classifier
[Equations shown as images in the original slides]
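In the standard discrete AdaBoost formulation, round t picks the weak classifier with the smallest weighted error, and its vote weight follows from that error:

$$ \epsilon_t = \sum_{i=1}^{n} w_i^{(t)} \, \mathbf{1}\big[ h_t(x_i) \neq y_i \big], \qquad \alpha_t = \frac{1}{2} \ln \frac{1 - \epsilon_t}{\epsilon_t} $$

Note that $\alpha_t > 0$ exactly when $\epsilon_t < 1/2$, so better weak classifiers receive larger votes.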
The algorithm core
[Derivation shown as an image in the original slides]
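The core of the algorithm, in the standard formulation, is the multiplicative weight update with normalization factor $Z_t$:

$$ w_i^{(t+1)} = \frac{w_i^{(t)} \exp\!\big(-\alpha_t \, y_i \, h_t(x_i)\big)}{Z_t}, \qquad Z_t = \sum_{j=1}^{n} w_j^{(t)} \exp\!\big(-\alpha_t \, y_j \, h_t(x_j)\big) $$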
Reweighting
If y · ht(x) = 1 (correctly classified), the sample weight is multiplied by e^(-αt) and decreases
If y · ht(x) = -1 (misclassified), the sample weight is multiplied by e^(αt) and increases

Reweighting
In this way, AdaBoost focuses on the informative or difficult examples.
Algorithm recapitulation
[Step-by-step illustration of the algorithm across eight slides, shown as images in the original]
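Since the step-by-step images are missing, a small runnable demo of the earlier sketch can stand in (the toy dataset is illustrative; it reuses adaboost_fit and adaboost_predict from above):

# Toy demo of the AdaBoost sketch on an XOR-like 2-D dataset.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.where(X[:, 0] * X[:, 1] > 0, 1, -1)   # labels in {-1, +1}

learners, alphas = adaboost_fit(X, y, T=100)
acc = (adaboost_predict(learners, alphas, X) == y).mean()
print(f"training accuracy: {acc:.2f}")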
Pros and cons of AdaBoost
Advantages
Very simple to implement
Does feature selection, resulting in a relatively simple classifier
Fairly good generalization
Disadvantages
Suboptimal solution (greedy, stage-wise optimization)
Sensitive to noisy data and outliers

References
Duda, Hart, Stork, Pattern Classification
Freund, An adaptive version of the boost by majority algorithm
Freund, Experiments with a new boosting algorithm
Freund, Schapire, A decision-theoretic generalization of on-line learning and an application to boosting
Friedman, Hastie, Tibshirani, Additive Logistic Regression: A Statistical View of Boosting
Jin, Liu, et al. (CMU), A New Boosting Algorithm Using Input-Dependent Regularizer
Li, Zhang, et al., FloatBoost Learning for Classification
Opitz, Maclin, Popular Ensemble Methods: An Empirical Study
Rätsch, Warmuth, Efficient Margin Maximization with Boosting
Schapire, Freund, et al., Boosting the Margin: A New Explanation for the Effectiveness of Voting Methods
Schapire, Singer, Improved Boosting Algorithms Using Confidence-Weighted Predictions
Schapire, The Boosting Approach to Machine Learning: An Overview
Zhang, Li, et al., Multi-view Face Detection with FloatBoost
Appendix
Bound on training error
AdaBoost Variants

Bound on Training Error (Schapire)
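The bound itself was shown as an image; in its standard form, the training error of the final classifier H is bounded by the product of the normalization factors:

$$ \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\big[ H(x_i) \neq y_i \big] \;\le\; \prod_{t=1}^{T} Z_t \;=\; \prod_{t=1}^{T} 2\sqrt{\epsilon_t (1 - \epsilon_t)} \;\le\; \exp\!\Big( -2 \sum_{t=1}^{T} \gamma_t^2 \Big) $$

where $\gamma_t = 1/2 - \epsilon_t$ is the edge of the t-th weak classifier; the training error drops exponentially as long as every weak classifier beats chance.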

Discrete AdaBoost (DiscreteAB)
(Friedman's wording)
[Algorithm box shown as an image in the original slides]

Discrete AdaBoost (DiscreteAB)
(Freund and Schapire's wording)
[Algorithm box shown as an image in the original slides]

AdaBoost with Confidence-Weighted Predictions (RealAB)
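The algorithm box was an image; in the standard real AdaBoost formulation, each round fits a class-probability estimate $p_t(x)$ on the weighted data and outputs a confidence-rated (real-valued) classifier:

$$ h_t(x) = \frac{1}{2} \ln \frac{p_t(x)}{1 - p_t(x)}, \qquad p_t(x) = \hat{P}_w\big(y = 1 \mid x\big) $$

with the same exponential re-weighting $w_i \leftarrow w_i \, e^{-y_i h_t(x_i)}$ followed by normalization.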

AdaBoost Variants Proposed by Friedman
LogitBoost
Solves the binomial log-likelihood by adaptive Newton steps
Requires care to avoid numerical problems
GentleBoost
Update is fm(x) = Pw(y=1 | x) - Pw(y=-1 | x) instead of the log-ratio used by RealAB
Bounded in [-1, 1]
AdaBoost Variants Proposed by Friedman
LogitBoost
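The algorithm box was an image; in Friedman et al.'s formulation (with $y^* \in \{0, 1\}$ and $p(x) = e^{F(x)} / (e^{F(x)} + e^{-F(x)})$), each LogitBoost round is a weighted least-squares Newton step:

$$ z_i = \frac{y_i^* - p(x_i)}{p(x_i)\,\big(1 - p(x_i)\big)}, \qquad w_i = p(x_i)\,\big(1 - p(x_i)\big) $$

Fit $f_m(x)$ to the targets $z_i$ by weighted least squares, then update $F(x) \leftarrow F(x) + \frac{1}{2} f_m(x)$. The division by $p(1-p)$ is where the numerical care mentioned above is needed.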

AdaBoost Variants Proposed by Friedman
GentleBoost
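Likewise from the same paper, each GentleBoost round fits $f_m(x)$ to $y$ by weighted least squares (an adaptive Newton step) and updates:

$$ F(x) \leftarrow F(x) + f_m(x), \qquad w_i \leftarrow w_i \, e^{-y_i f_m(x_i)} $$

Because $f_m$ estimates $E_w[y \mid x] = P_w(y{=}1 \mid x) - P_w(y{=}{-}1 \mid x)$, it stays bounded in $[-1, 1]$, avoiding LogitBoost's numerical issues.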

Thanks!!!
Any comments or questions?

