Contents
Recursive Partitioning
Classification
Regression/Decision
Bagging
Random Forest
Boosting
Gradient Boosting
Questions
Basics
Supervised Learning:
Called supervised because the outcome variable is present to guide the learning process
The task is to build a learner (model) that predicts the outcome for new, unseen objects.
Alternatively,
Unsupervised Learning:
We observe only the features and have no measurements of the outcome
The task is rather to describe how the data are organized or clustered
Machine learning may use probability models, and when it does, it overlaps with statistics.
It is not as committed to probability, however, and may use other approaches to problem solving that are not based on probability.
The basic optimization concept for trees is the same as for parametric techniques: minimize an error metric. Instead of a squared-error function or MLE, tree-based machine learning minimizes measures such as entropy or node impurity.
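As a concrete sketch of those metrics, here is a small Python example; the function names and NumPy implementation are illustrative assumptions, not from the slides:

# A minimal sketch of two common node-impurity metrics (illustrative only).
import numpy as np

def gini(labels):
    # Gini impurity: 1 minus the sum of squared class proportions
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def entropy(labels):
    # Shannon entropy: -sum of p * log2(p) over class proportions
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

print(gini(np.array([0, 0, 1, 1])))     # 0.5, a maximally impure binary node
print(entropy(np.array([0, 0, 1, 1])))  # 1.0 bit, likewise maximally impure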
An application -> Trees
The segments are also called nodes, and the final segments are called leaf nodes or leaves
A node that survives all partitioning without further splits is called a terminal node
[Figure: example decision tree for predicting on-time payment. The root splits on income (< $30k vs. >= $30k); subsequent splits on age (< 25 vs. >= 25) and credit score (< 600 vs. >= 600) lead to "on-time" and "not on-time" leaves.]
Main Idea: form a binary tree that minimizes the error in each leaf
Given a dataset, a decision tree chooses a sequence of binary splits of the data
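A minimal sketch of fitting such a binary tree, assuming scikit-learn and a hypothetical toy dataset echoing the figure above:

# Fit a small binary tree on hypothetical loan-repayment data
# (feature names and values are made up for illustration).
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

X = np.array([[25_000, 22, 550],    # income, age, credit score
              [25_000, 30, 650],
              [45_000, 40, 580],
              [45_000, 35, 700]])
y = np.array([0, 1, 0, 1])          # 0 = not on-time, 1 = on-time

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
print(export_text(tree, feature_names=["income", "age", "credit_score"]))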
If an input variable is interval, a splitting value is used to classify the data into two
segments
For example, if household income is interval and there are 100 possible incomes in the
data set, then there are 100 possible splitting values
For example, income < $30k, and income >= $30k
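A sketch of that search over candidate splitting values, scoring each candidate with Gini impurity; the scoring choice and toy data are assumptions for illustration:

# Exhaustive search for the best splitting value of one interval input,
# scoring candidates by the weighted Gini impurity of the two segments.
import numpy as np

def gini(labels):
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def best_split(x, y):
    best_value, best_score = None, np.inf
    for value in np.unique(x):                  # each observed value is a candidate
        left, right = y[x < value], y[x >= value]
        if len(left) == 0 or len(right) == 0:
            continue
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        if score < best_score:
            best_value, best_score = value, score
    return best_value, best_score

income = np.array([12_000, 25_000, 31_000, 48_000, 90_000])
on_time = np.array([0, 0, 1, 1, 1])
print(best_split(income, on_time))              # (31000, 0.0): split at income >= $31k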
Contents
Recursive Partitioning
Classification
Regression/Decision
Bagging
Random Forest
Boosting
Gradient Boosting
Bagging
Ensemble models: combine the results from different models
Bagging builds an ensemble classifier from many decision tree models
Bagging: Stanford
Suppose C(S, x) is a classifier, such as a tree, based on our training data S, producing a predicted class label at input point x.
To bag C, we draw bootstrap samples S_1, ..., S_B, each of size N, with replacement from the training data.
Then
C_bag(x) = Majority Vote{ C(S_b, x) }, b = 1, ..., B
Bagging can dramatically reduce the variance of unstable procedures (like trees), leading to improved prediction.
However, any simple structure in C (e.g. a tree) is lost.
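A runnable sketch of this recipe; scikit-learn and the synthetic dataset are assumptions, since the slide specifies only the bootstrap-and-vote procedure:

# Bagging by hand: B bootstrap samples S_1..S_B of size N, one tree per
# sample, and C_bag(x) taken as the majority vote over the B trees.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=500, random_state=0)
B, N = 25, len(X)

trees = []
for b in range(B):
    idx = rng.integers(0, N, size=N)              # draw S_b with replacement
    trees.append(DecisionTreeClassifier().fit(X[idx], y[idx]))

votes = np.stack([t.predict(X) for t in trees])   # shape (B, N)
c_bag = (votes.mean(axis=0) >= 0.5).astype(int)   # majority vote (binary labels)
print("agreement with labels:", (c_bag == y).mean())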
Bootstrapped samples
Contents
Recursive Partitioning
Classification
Regression/Decision
Bagging
Random Forest
Boosting
Gradient Boosting
Boosting
Make copies of the data
Boosting idea: based on the "strength of weak learnability" principle
Example of a weak rule:
IF Gender = MALE AND Age <= 25 THEN claim_freq = high
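A minimal sketch of boosting such weak rules into a strong learner, assuming scikit-learn's AdaBoost with depth-1 stumps; the library and data are illustrative choices:

# Boosting: combine many weak, stump-like rules into a strong classifier.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)
stump = DecisionTreeClassifier(max_depth=1)   # one "IF feature <= t THEN class" rule
# Note: the keyword is `base_estimator` in scikit-learn versions before 1.2.
boosted = AdaBoostClassifier(estimator=stump, n_estimators=100).fit(X, y)
print("training accuracy:", boosted.score(X, y))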
GBM (Gradient Boosting Machine)
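A sketch of fitting a GBM, assuming scikit-learn's GradientBoostingRegressor; the slides do not name an implementation:

# GBM: fit small trees sequentially, each to the residuals (negative
# gradient of the loss) left by the ensemble so far.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

X, y = make_regression(n_samples=500, noise=10.0, random_state=0)
gbm = GradientBoostingRegressor(n_estimators=200, learning_rate=0.05,
                                max_depth=3, random_state=0).fit(X, y)
print("training R^2:", gbm.score(X, y))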
Questions
Concept / Interpretation
Application
Parth Khare
https://www.linkedin.com/profile/view?id=43877647&trk=nav_responsive_tab_profile