Why and how to use random forest variable importance measures
(and how you shouldn't)

Carolin Strobl (LMU München) and Achim Zeileis (WU Wien)
carolin.strobl@stat.uni-muenchen.de
useR! 2008, Dortmund

Introduction
Construction
R functions
Variable importance
Tests for variable importance
Conditional importance
Summary
References

Introduction

Random forests

- have become increasingly popular in, e.g., genetics and the neurosciences [imagine a long list of references here]
- can deal with "small n, large p" problems, high-order interactions, and correlated predictor variables
- are used not only for prediction, but also to assess variable importance

(Small) random forest

Figure: a collection of individual classification trees from one forest. Each tree splits on the variables Start, Age, and Number (p < 0.001 at every split), with terminal nodes showing the node size n and the class proportions y.
Construction of a random forest

- draw ntree bootstrap samples from the original sample
- fit a classification tree to each bootstrap sample ⇒ ntree trees

creates a diverse set of trees, because

- trees are unstable w.r.t. changes in the learning data ⇒ ntree different-looking trees (bagging)
- randomly preselect mtry splitting variables in each split ⇒ ntree even more different-looking trees (random forest)
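The construction steps above can be sketched in a few lines. This is a minimal toy illustration in Python (the deck's actual tools are the R packages randomForest and party); it uses one-split trees in place of full classification trees, and all function names are hypothetical:

```python
import numpy as np

def gini(y):
    """Gini impurity of a class-label vector."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def fit_stump(X, y, mtry, rng):
    """Fit a one-split tree; the split variable is chosen among
    mtry randomly preselected candidates (the random-forest step)."""
    n, p = X.shape
    candidates = rng.choice(p, size=mtry, replace=False)
    best = None
    for j in candidates:
        for cut in np.unique(X[:, j])[:-1]:
            left = X[:, j] <= cut
            impurity = (left.sum() * gini(y[left])
                        + (~left).sum() * gini(y[~left])) / n
            if best is None or impurity < best[0]:
                best = (impurity, j, cut)
    _, j, cut = best
    left = X[:, j] <= cut
    majority = lambda v: np.bincount(v).argmax()
    return j, cut, majority(y[left]), majority(y[~left])

def random_forest(X, y, ntree=30, mtry=2, seed=0):
    """Draw ntree bootstrap samples and fit one tree to each (bagging)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    forest = []
    for _ in range(ntree):
        idx = rng.integers(0, n, size=n)   # bootstrap sample
        forest.append(fit_stump(X[idx], y[idx], mtry, rng))
    return forest

def predict(forest, X):
    """Aggregate the trees' predictions by majority vote."""
    votes = np.array([np.where(X[:, j] <= cut, yl, yr)
                      for j, cut, yl, yr in forest])
    return (votes.mean(axis=0) > 0.5).astype(int)
```

The two sources of diversity from the slide are visible directly: the bootstrap index `idx` (bagging) and the `mtry`-sized candidate set inside `fit_stump` (random forest).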

Random forests in R

randomForest (pkg: randomForest)
- reference implementation based on CART trees (Breiman, 2001; Liaw and Wiener, 2008)
- − for variables of different types: biased in favor of continuous variables and variables with many categories (Strobl, Boulesteix, Zeileis, and Hothorn, 2007)

cforest (pkg: party)
- based on unbiased conditional inference trees (Hothorn, Hornik, and Zeileis, 2006)
- + for variables of different types: unbiased when subsampling, instead of bootstrap sampling, is used (Strobl, Boulesteix, Zeileis, and Hothorn, 2007)

Measuring variable importance

Gini importance: mean Gini gain produced by Xj over all trees

obj <- randomForest(..., importance=TRUE)
obj$importance, column: MeanDecreaseGini
importance(obj, type=2)

− for variables of different types: biased in favor of continuous variables and variables with many categories
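As a sketch of what "Gini gain" means here, assuming the standard impurity definition (Python, hypothetical helper names). The Gini importance of Xj is then the average of such gains over all splits on Xj in all trees:

```python
import numpy as np

def gini(y):
    """Gini impurity 1 - sum_k p_k^2 of a class-label vector."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def gini_gain(y, left_mask):
    """Impurity reduction achieved by one binary split of a node
    (assumes both children are non-empty)."""
    n = len(y)
    n_left = left_mask.sum()
    children = (n_left * gini(y[left_mask])
                + (n - n_left) * gini(y[~left_mask])) / n
    return gini(y) - children
```

A split that separates the classes perfectly yields the maximal gain; a split that leaves both children as mixed as the parent yields a gain of zero.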

Measuring variable importance

permutation importance: mean decrease in classification accuracy after permuting Xj over all trees

obj <- randomForest(..., importance=TRUE)
obj$importance, column: MeanDecreaseAccuracy
importance(obj, type=1)

obj <- cforest(...)
varimp(obj)

+ for variables of different types: unbiased only when subsampling is used as in cforest(..., controls = cforest_unbiased())

The permutation importance

within each tree t:

VI^{(t)}(x_j) = \sum_{i \in \bar{B}^{(t)}} I(y_i = \hat{y}_i^{(t)}) / |\bar{B}^{(t)}| - \sum_{i \in \bar{B}^{(t)}} I(y_i = \hat{y}_{i,j}^{(t)}) / |\bar{B}^{(t)}|

where
\bar{B}^{(t)} = out-of-bag sample of tree t
\hat{y}_i^{(t)} = f^{(t)}(x_i) = predicted class before permuting
\hat{y}_{i,j}^{(t)} = f^{(t)}(x_{i,j}) = predicted class after permuting X_j
x_{i,j} = (x_{i,1}, \ldots, x_{i,j-1}, x_{\pi_j(i),j}, x_{i,j+1}, \ldots, x_{i,p})

Note: VI^{(t)}(x_j) = 0 by definition, if X_j is not in tree t
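A minimal sketch of the per-tree computation above (Python, hypothetical names): the tree's prediction function is evaluated on its out-of-bag sample before and after permuting column j, and the accuracy difference is returned:

```python
import numpy as np

def permutation_importance_tree(predict_fn, X_oob, y_oob, j, rng=None):
    """VI^(t)(x_j): out-of-bag accuracy of one tree before minus after
    randomly permuting the values of variable j."""
    if rng is None:
        rng = np.random.default_rng(0)
    acc_before = np.mean(predict_fn(X_oob) == y_oob)
    X_perm = X_oob.copy()
    X_perm[:, j] = rng.permutation(X_perm[:, j])  # break the X_j association
    acc_after = np.mean(predict_fn(X_perm) == y_oob)
    return acc_before - acc_after
```

If the tree never uses variable j, its predictions are unchanged by the permutation and the importance is exactly zero, matching the note on the slide.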

The permutation importance

over all trees:

1. raw importance

VI(x_j) = \sum_{t=1}^{ntree} VI^{(t)}(x_j) / ntree

obj <- randomForest(..., importance=TRUE)
importance(obj, type=1, scale=FALSE)

The permutation importance

over all trees:

2. scaled importance (z-score)

VI(x_j) / (\hat{\sigma} / \sqrt{ntree}) = z_j

obj <- randomForest(..., importance=TRUE)
importance(obj, type=1, scale=TRUE) (default)
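A sketch of the scaling step (Python, hypothetical names). It also makes the problem with this z-score concrete: because the denominator shrinks with sqrt(ntree), replicating the very same per-tree importances 25 times inflates the z-score by roughly a factor of 5, although nothing about the variable has changed:

```python
import numpy as np

def scaled_importance(vi_per_tree):
    """z_j = VI(x_j) / (sigma_hat / sqrt(ntree)), computed from the
    per-tree permutation importances."""
    vi = np.asarray(vi_per_tree, dtype=float)
    return vi.mean() / (vi.std(ddof=1) / np.sqrt(len(vi)))

rng = np.random.default_rng(42)
per_tree = rng.normal(0.05, 0.05, size=100)   # simulated VI^(t) values
z_100 = scaled_importance(per_tree)
z_2500 = scaled_importance(np.tile(per_tree, 25))  # same trees, 25x ntree
```

This is exactly the ntree dependence reported in the Findings below: the user-chosen number of trees, not the data, drives the size of the z-score.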

Tests for variable importance

for variable selection purposes:

- Breiman and Cutler (2008): simple significance test based on normality of the z-score; randomForest, scale=TRUE + (1 − α)-quantile of N(0,1)
- Diaz-Uriarte and Alvarez de Andres (2006): backward elimination (throw out the least important variables until the out-of-bag prediction accuracy drops); varSelRF (pkg: varSelRF), dep. on randomForest
- Diaz-Uriarte (2007) and Rodenburg et al. (2008): plots and significance test (randomly permute the response values to mimic the overall null hypothesis that none of the predictor variables is relevant ⇒ baseline)

Tests for variable importance

problems of these approaches:

- (at least) Breiman and Cutler (2008): strange statistical properties (Strobl and Zeileis, 2008)
- all: preference for correlated predictor variables (see also Nicodemus and Shugart, 2007; Archer and Kimes, 2008)

Breiman and Cutler's test

under the null hypothesis of zero importance:

z_j ~ N(0, 1) (asymptotically)

if z_j exceeds the (1 − α)-quantile of N(0,1) ⇒ reject the null hypothesis of zero importance for variable X_j
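A sketch of this test (Python, hypothetical names), shown only to make the construction concrete before its problems are discussed; the normal quantile comes from the stdlib NormalDist:

```python
from statistics import NormalDist
import numpy as np

def breiman_cutler_test(vi_per_tree, alpha=0.05):
    """Simple z-score test: reject 'zero importance' when z_j exceeds
    the (1 - alpha)-quantile of N(0, 1)."""
    vi = np.asarray(vi_per_tree, dtype=float)
    z = vi.mean() / (vi.std(ddof=1) / np.sqrt(len(vi)))
    return z, z > NormalDist().inv_cdf(1 - alpha)
```

Note that `len(vi)` is ntree, so the rejection decision inherits the ntree dependence of the z-score criticized on the previous slides.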

Raw importance

Figure: mean raw permutation importance plotted against the relevance of the variable, in panels for sample sizes 100, 200, 500 and ntree = 100, 200, 500.

z-score and power

Figure: z-score and power plotted against the relevance of the variable, in panels for sample sizes 100, 200, 500 and ntree = 100, 200, 500.

Findings

z-score and power:
- increase with ntree
- decrease with sample size

⇒ rather use the raw, unscaled permutation importance!
importance(obj, type=1, scale=FALSE)
varimp(obj)

What null hypothesis were we testing in the first place?

obs    Y      X_j (permuted)      Z
1      y_1    x_{\pi_j(1),j}      z_1
⋮      ⋮      ⋮                   ⋮
i      y_i    x_{\pi_j(i),j}      z_i
⋮      ⋮      ⋮                   ⋮
n      y_n    x_{\pi_j(n),j}      z_n

H_0: X_j ⊥ Y, Z   or   X_j ⊥ Y ∧ X_j ⊥ Z

P(Y, X_j, Z) \overset{H_0}{=} P(Y, Z) \cdot P(X_j)

- the current null hypothesis reflects independence of X_j from both Y and the remaining predictor variables Z
- a high variable importance can result from a violation of either one!

Suggestion: conditional permutation scheme

obs    Y       X_j (permuted within Z)    Z
1      y_1     x_{\pi_{j|Z=a}(1),j}       z_1 = a
3      y_3     x_{\pi_{j|Z=a}(3),j}       z_3 = a
27     y_27    x_{\pi_{j|Z=a}(27),j}      z_27 = a
6      y_6     x_{\pi_{j|Z=b}(6),j}       z_6 = b
14     y_14    x_{\pi_{j|Z=b}(14),j}      z_14 = b
33     y_33    x_{\pi_{j|Z=b}(33),j}      z_33 = b
⋮      ⋮       ⋮                          ⋮

H_0: X_j ⊥ Y | Z

P(Y, X_j | Z) \overset{H_0}{=} P(Y | Z) \cdot P(X_j | Z)   or   P(Y | X_j, Z) \overset{H_0}{=} P(Y | Z)
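The conditional scheme can be sketched for a discrete conditioning variable (Python, hypothetical names): X_j is permuted only within the groups defined by Z, which preserves the association between X_j and Z while breaking the association between X_j and Y given Z:

```python
import numpy as np

def conditional_permutation(xj, z_groups, rng=None):
    """Permute x_j only within the groups defined by the conditioning
    variable(s) Z (here: a discrete grouping vector)."""
    if rng is None:
        rng = np.random.default_rng(0)
    xj_perm = np.asarray(xj, dtype=float).copy()
    for g in np.unique(z_groups):
        idx = np.where(z_groups == g)[0]
        xj_perm[idx] = rng.permutation(xj_perm[idx])
    return xj_perm
```

In the actual proposal the grouping is not a single discrete Z but the partition of the feature space learned by the tree, as described on the next slide.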

Technically

- use any partition of the feature space for conditioning
- here: use the binary partition already learned by the tree (use cutpoints as bisectors of the feature space)
- condition on all correlated variables, or select some

Strobl et al. (2008)

available in cforest from version 0.9-994:
varimp(obj, conditional = TRUE)

Simulation study

- dgp: y_i = \beta_1 x_{i,1} + \cdots + \beta_{12} x_{i,12} + \varepsilon_i, with \varepsilon_i i.i.d. N(0, 0.5)
- X_1, \ldots, X_{12} ~ N(0, \Sigma): unit variances, pairwise correlation 0.9 within the block X_1, \ldots, X_4, all other correlations 0

X_j       X_1  X_2  X_3  X_4  X_5  X_6  X_7  X_8  ...  X_12
\beta_j    5    5    2    0   −5   −5   −2    0   ...    0
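A sketch of this data-generating process (Python; the original study was run in R). The coefficient vector and the reading of N(0, 0.5) as variance 0.5 follow Strobl et al. (2008):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 500, 12

# 12 jointly normal predictors: X1..X4 pairwise correlated at 0.9,
# the remaining predictors independent standard normal.
Sigma = np.eye(p)
Sigma[:4, :4] = 0.9
np.fill_diagonal(Sigma, 1.0)
X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)

# Linear dgp: only X1, X2, X3 and X5, X6, X7 are truly relevant;
# X4 is irrelevant but correlated with the relevant X1..X3.
beta = np.array([5, 5, 2, 0, -5, -5, -2, 0, 0, 0, 0, 0], dtype=float)
y = X @ beta + rng.normal(0.0, np.sqrt(0.5), size=n)
```

The design deliberately includes X4: it carries no signal of its own, so any importance it receives under the unconditional scheme is inherited purely from its correlation with X1 through X3.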

Results

Figure: unconditional permutation importance of the twelve predictor variables, in panels for mtry = 1, 3, and 8.

Peptide-binding data

Figure: conditional vs. unconditional permutation importance for the peptide-binding predictors (including h2y8, flex8, and pol3).

Summary

- if your predictor variables are of different types: use cforest (pkg: party) with the default option controls = cforest_unbiased(), with the permutation importance varimp(obj)
- otherwise: feel free to use cforest (pkg: party) with the permutation importance varimp(obj), or randomForest (pkg: randomForest) with the permutation importance importance(obj, type=1) or the Gini importance importance(obj, type=2); but don't fall for the z-score! (i.e. set scale=FALSE)
- if your predictor variables are highly correlated: use the conditional importance in cforest (pkg: party): varimp(obj, conditional = TRUE)

References

Archer, K. J. and R. V. Kimes (2008). Empirical characterization of random forest variable importance measures. Computational Statistics & Data Analysis 52(4), 2249–2260.

Breiman, L. (2001). Random forests. Machine Learning 45(1), 5–32.

Breiman, L. and A. Cutler (2008). Random Forests Classification Manual. Website accessed in 1/2008; http://www.math.usu.edu/adele/forests.

Breiman, L., A. Cutler, A. Liaw, and M. Wiener (2006). Breiman and Cutler's Random Forests for Classification and Regression. R package version 4.5-16.

Diaz-Uriarte, R. (2007). GeneSrF and varSelRF: A web-based tool and R package for gene selection and classification using random forest. BMC Bioinformatics 8:328.

Hothorn, T., K. Hornik, and A. Zeileis (2006). Unbiased recursive partitioning: A conditional inference framework. Journal of Computational and Graphical Statistics 15(3), 651–674.

Strobl, C., A.-L. Boulesteix, A. Zeileis, and T. Hothorn (2007). Bias in random forest variable importance measures: Illustrations, sources and a solution. BMC Bioinformatics 8:25.

Strobl, C. and A. Zeileis (2008). Danger: High power! Exploring the statistical properties of a test for random forest variable importance. In Proceedings of the 18th International Conference on Computational Statistics, Porto, Portugal.

Strobl, C., A.-L. Boulesteix, T. Kneib, T. Augustin, and A. Zeileis (2008). Conditional variable importance for random forests. BMC Bioinformatics 9:307.
