
# Logistics and Reminders

## Lab 1 due Friday, August 19th, 11.55pm

Enrolment key: csl603_201620171

## Tomorrow (Monday's) Schedule

Class: 9.55-10.45am

## Quiz 3

Thursday, August 25th is a holiday, so Quiz 3 will be held on Wednesday, August 24th during the regular class time (11.45am-12.35pm).

# Instance Based Learning

CSL465/603 Machine Learning, Fall 2016
Narayanan C Krishnan
ckn@iitrpr.ac.in

## Outline

- k-nearest neighbor
- Other forms of instance-based learning (IBL)
- Nonparametric methods


## Key Ideas

- Training: store all the training examples (no explicit learning)
- Testing: compute the target function only locally, around the query

Advantages:
- Can learn very complex target functions
- Training is very fast
- No loss of information

Disadvantages:
- Slow during testing
- Easily fooled by irrelevant attributes

## Example

## K-Nearest Neighbor Learning

- Just store all the training examples $\langle \mathbf{x}_i, f(\mathbf{x}_i) \rangle$
- Nearest neighbor
  - Given a query instance $\mathbf{x}_q$, first locate the nearest training example $\mathbf{x}_n$, then estimate $\hat{f}(\mathbf{x}_q) = f(\mathbf{x}_n)$
- k-nearest neighbor
  - Given a query instance $\mathbf{x}_q$ (see the sketch after this list):
    - take a vote among its $k$ nearest neighbors (if $f$ is discrete-valued)
    - take the mean of the $f$ values of its $k$ nearest neighbors (if $f$ is real-valued):
      $$\hat{f}(\mathbf{x}_q) = \frac{1}{k} \sum_{i=1}^{k} f(\mathbf{x}_i)$$
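A minimal sketch of this prediction rule in Python/NumPy (the function name and toy data are illustrative, not from the slides):

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_query, k=3, classification=True):
    """Predict f(x_query) from the k nearest stored training examples."""
    # Euclidean distance from the query to every stored training example
    dists = np.linalg.norm(X_train - x_query, axis=1)
    # Indices of the k closest examples
    nn_idx = np.argsort(dists)[:k]
    nn_values = y_train[nn_idx]
    if classification:
        # Discrete-valued target: majority vote among the k neighbors
        return Counter(nn_values.tolist()).most_common(1)[0][0]
    # Real-valued target: mean of the neighbors' values
    return nn_values.mean()

# Tiny usage example with made-up 2-D data
X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.1], [0.9, 1.0]])
y = np.array([0, 0, 1, 1])
print(knn_predict(X, y, np.array([0.95, 1.05]), k=3))  # likely predicts class 1
```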


## Distance Measures

- Numeric features (see the sketch after this list)
  - Manhattan, Euclidean, $L_p$ norm:
    $$L_p(\mathbf{x}_1, \mathbf{x}_2) = \left( \sum_{d=1}^{D} |x_{1d} - x_{2d}|^p \right)^{1/p}$$
- Symbolic (categorical) features
  - Hamming / overlap distance
  - Value difference measure (VDM): for two values $v_1, v_2$ of a feature,
    $$d(v_1, v_2) = \sum_{c=1}^{C} \left| P(c \mid v_1) - P(c \mid v_2) \right|$$
- In general, the distance measure encodes domain knowledge
- Metric learning: learn the distance measure from data
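A small sketch of the basic distance computations, assuming numeric vectors for the $L_p$ norm and equal-length symbolic tuples for Hamming/overlap (helper names are ours):

```python
import numpy as np

def lp_distance(x1, x2, p=2):
    """L_p norm between two numeric feature vectors (p=1: Manhattan, p=2: Euclidean)."""
    return np.sum(np.abs(x1 - x2) ** p) ** (1.0 / p)

def hamming_distance(v1, v2):
    """Overlap distance for symbolic feature vectors: number of mismatching positions."""
    return sum(a != b for a, b in zip(v1, v2))

print(lp_distance(np.array([0.0, 3.0]), np.array([4.0, 0.0]), p=2))  # 5.0
print(hamming_distance(["red", "small"], ["red", "large"]))          # 1
```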


## Illustrating k-NN


## Voronoi Diagram

- Voronoi cell of $\mathbf{x}_i$: all points in the instance space that are closer to $\mathbf{x}_i$ than to any other training instance
- What is the target concept?

## Behavior in the Limit

- $\varepsilon^*(\mathbf{x})$: error of the optimal (Bayes) prediction
- $\varepsilon_{NN}(\mathbf{x})$: error of the nearest-neighbor prediction
- Theorem: $\lim_{N \to \infty} \varepsilon_{NN} \le 2\varepsilon^*$

Proof sketch (2-class case): let $p_1 = P(y = 1 \mid \mathbf{x})$ and $p_0 = 1 - p_1$, so that $\varepsilon^* = \min(p_0, p_1)$. The nearest-neighbor error is
$$\varepsilon_{NN} = p_1 \, P(\text{NN label} = 0) + p_0 \, P(\text{NN label} = 1).$$
As $N \to \infty$, the nearest neighbor of $\mathbf{x}$ lies arbitrarily close to $\mathbf{x}$, so its label is a draw from the same class posterior, i.e. $P(\text{NN label} = c) \to p_c$. Hence
$$\lim_{N \to \infty} \varepsilon_{NN} = p_1 p_0 + p_0 p_1 = 2 p_1 (1 - p_1) \le 2 \varepsilon^*.$$

- $\lim_{N \to \infty}$ nearest neighbor = Gibbs classifier (predicts a label sampled from the posterior)
- $\lim_{N \to \infty,\ k \to \infty,\ k/N \to 0}$ k-nearest neighbor = Bayes optimal classifier


## Distance-Weighted k-NN

- A simple refinement over k-NN
- We might want to weight nearer neighbors more heavily (see the sketch after this list):
  $$\hat{f}(\mathbf{x}_q) = \frac{\sum_{i=1}^{k} w_i \, f(\mathbf{x}_i)}{\sum_{i=1}^{k} w_i}, \qquad w_i = \frac{1}{d(\mathbf{x}_q, \mathbf{x}_i)^2}$$
  where $d(\mathbf{x}_q, \mathbf{x}_i)$ is the distance between $\mathbf{x}_q$ and $\mathbf{x}_i$
- With distance weighting it makes sense to use all the training examples
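A hedged sketch of the distance-weighted rule above (the small `eps` guards against a query that coincides with a stored example; names and toy data are ours):

```python
import numpy as np

def distance_weighted_knn(X_train, y_train, x_query, k=5, eps=1e-12):
    """Distance-weighted k-NN prediction with w_i = 1 / d(x_q, x_i)^2."""
    dists = np.linalg.norm(X_train - x_query, axis=1)
    nn_idx = np.argsort(dists)[:k]
    # Nearer neighbors get larger weights; eps avoids division by zero
    weights = 1.0 / (dists[nn_idx] ** 2 + eps)
    # Weighted average of the neighbors' target values
    return np.sum(weights * y_train[nn_idx]) / np.sum(weights)

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0.0, 1.0, 4.0, 9.0])
print(distance_weighted_knn(X, y, np.array([1.5]), k=2))  # between 1.0 and 4.0
```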


## Issues with k-NN (1)

- Inductive bias: the classification of an instance will be most similar to the classification of the instances close to it
- Distance computation depends on all attributes
  - Imagine instances described by 20 attributes, of which only 2 are relevant to the target: the many irrelevant attributes can dominate the distance


## Possible Solutions

- Feature selection
  - Filter approach: pre-select features individually using some relevance measure
  - Wrapper approach: experiment with different combinations of features using the learner itself (see the sketch after this list)
    - Forward selection
    - Backward elimination
- Feature weighting
  - Stretch the $j^{\text{th}}$ attribute by a weight $z_j$, where $z_1, \dots, z_D$ are chosen to minimize prediction error
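A rough sketch of the wrapper approach as greedy forward selection, scored by k-NN cross-validation accuracy. It assumes scikit-learn is available; the function name and the Iris data are ours, not from the slides:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

def forward_selection(X, y, k=3, cv=5):
    """Greedily add the feature that most improves k-NN CV accuracy; stop when none helps."""
    n_features = X.shape[1]
    selected, best_score = [], -np.inf
    improved = True
    while improved:
        improved = False
        for j in set(range(n_features)) - set(selected):
            candidate = selected + [j]
            score = cross_val_score(KNeighborsClassifier(n_neighbors=k),
                                    X[:, candidate], y, cv=cv).mean()
            if score > best_score:
                best_score, best_candidate = score, candidate
                improved = True
        if improved:
            selected = best_candidate
    return selected, best_score

X, y = load_iris(return_X_y=True)
print(forward_selection(X, y))   # indices of the selected features and their CV accuracy
```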


## Issues with k-NN (2)


In addition to the issues above:

- Curse of dimensionality
  - k-NN is sensitive to the dimensionality of the data
  - Low-dimensional intuitions do not apply in high dimensions


## Curse of Dimensionality

Examples (see the worked sketch after this list):
- A normal distribution: in high dimensions, most of the probability mass lies in a thin shell far from the mode
- Points on a hyper-grid: the number of grid points grows exponentially with the dimension
- Approximation of a sphere by a cube: the inscribed sphere occupies a vanishing fraction of the cube's volume as the dimension grows
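A small worked illustration (our own, not from the slides) of the sphere-inside-a-cube example: the unit ball inscribed in the cube $[-1, 1]^d$ occupies a vanishing fraction of its volume as $d$ grows.

```python
import math

# V_ball(d) / V_cube(d) = (pi^(d/2) / Gamma(d/2 + 1)) / 2^d
for d in (1, 2, 3, 5, 10, 20):
    ball = math.pi ** (d / 2) / math.gamma(d / 2 + 1)
    cube = 2.0 ** d
    print(d, ball / cube)
# d=2 -> ~0.785, d=10 -> ~0.0025, d=20 -> ~2.5e-8: almost all of the cube's volume
# sits in its "corners", so high-dimensional nearest neighbors are rarely "near".
```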


## Issues with k-NN (3)


In addition to the issues above:

- Computational cost
  - All the distance computation happens at test time!


## Reducing the Computational Cost

- Efficient retrieval: k-d trees (effective in lower dimensions); see the sketch after this list
- Efficient similarity computation
  - Use a cheap approximation to weed out most of the instances
  - Use the expensive measure only on the remainder
- Forming prototypes
- Edited k-NN
  - Remove instances that do not affect the decision
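A minimal sketch of efficient retrieval with a k-d tree, assuming SciPy is available (illustrative, not code from the slides):

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
X = rng.random((10000, 3))          # stored training instances (3-D)
tree = cKDTree(X)                   # build the k-d tree once, at "training" time

x_query = np.array([0.5, 0.5, 0.5])
dists, idx = tree.query(x_query, k=5)   # retrieve the 5 nearest neighbors efficiently
print(idx, dists)
```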


## Overfitting

- What parameter of the model can indicate overfitting? For k-NN, the number of neighbors $k$
- Set this parameter through validation experiments (see the sketch below)
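A hedged sketch of choosing $k$ by validation, assuming scikit-learn and using the Iris data purely as an example:

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
# Cross-validated accuracy for a range of k values
scores = {k: cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5).mean()
          for k in range(1, 16)}
best_k = max(scores, key=scores.get)
print(best_k, scores[best_k])   # pick the k with the best validation accuracy
```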

## Remove Noisy Instances

- Remove $\mathbf{x}_i$ if all of $\mathbf{x}_i$'s $k$ nearest neighbors belong to a different class (a sketch follows)
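A minimal one-pass sketch of that noise-removal rule (the helper name is ours):

```python
import numpy as np

def remove_noisy_instances(X, y, k=3):
    """Drop instance i if all of its k nearest neighbors carry a different label."""
    keep = []
    for i in range(len(X)):
        dists = np.linalg.norm(X - X[i], axis=1)
        dists[i] = np.inf                      # exclude the instance itself
        nn_idx = np.argsort(dists)[:k]
        if np.any(y[nn_idx] == y[i]):          # at least one neighbor agrees
            keep.append(i)
    return X[keep], y[keep]
```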


## Nonparametric Methods

- The form of the underlying distributions is unknown
- We still want to perform classification (or regression)

## Nonparametric Density Estimation (1)

- Given a training set $\mathcal{X} = \{x_i\}_{i=1}^{N}$ drawn i.i.d. from $p(x)$
- Histogram: divide the data into bins of width $h$ and estimate
  $$\hat{p}(x) = \frac{\#\{x_i \text{ in the same bin as } x\}}{N h}$$
- Naive estimator: center a bin of width $h$ at the query point $x$ itself (a sketch follows)
  $$\hat{p}(x) = \frac{\#\{x - h/2 < x_i \le x + h/2\}}{N h} = \frac{1}{N h} \sum_{i=1}^{N} w\!\left(\frac{x - x_i}{h}\right), \qquad w(u) = \begin{cases} 1 & \text{if } |u| < 1/2 \\ 0 & \text{otherwise} \end{cases}$$
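A small sketch of the naive estimator in 1-D (names and toy data are ours):

```python
import numpy as np

def naive_estimator(x_grid, data, h=0.5):
    """Naive density estimate: count samples within +/- h/2 of each query, divided by N*h."""
    N = len(data)
    # For each query x, count the x_i inside a box of width h centred at x
    counts = np.sum(np.abs(x_grid[:, None] - data[None, :]) < h / 2, axis=1)
    return counts / (N * h)

data = np.random.default_rng(0).normal(size=200)
x_grid = np.linspace(-3, 3, 7)
print(naive_estimator(x_grid, data, h=0.5))
```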


## Nonparametric Density Estimation (2)

- Smoothing the estimate: use a kernel function, e.g., a radial basis function / Gaussian kernel
  $$K(u) = \frac{1}{\sqrt{2\pi}} \exp\!\left( -\frac{u^2}{2} \right)$$
- Parzen windows (kernel estimator); a sketch follows:
  $$\hat{p}(x) = \frac{1}{N h} \sum_{i=1}^{N} K\!\left( \frac{x - x_i}{h} \right)$$
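A minimal sketch of the Parzen-window estimator with a Gaussian kernel (names and toy data are ours):

```python
import numpy as np

def parzen_kde(x_grid, data, h=0.3):
    """Kernel (Parzen window) density estimate with a Gaussian kernel."""
    u = (x_grid[:, None] - data[None, :]) / h           # (queries x samples)
    K = np.exp(-0.5 * u ** 2) / np.sqrt(2 * np.pi)      # Gaussian kernel values
    return K.sum(axis=1) / (len(data) * h)              # average, scaled by 1/h

data = np.random.default_rng(1).normal(size=500)
x_grid = np.linspace(-4, 4, 9)
print(parzen_kde(x_grid, data, h=0.3))   # roughly tracks the standard normal pdf
```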


## K-Nearest Neighbor Estimator (1)

- Instead of fixing the bin width $h$ and counting the number of instances, fix the number of instances (neighbors) $k$ and let the data determine the bin width (a sketch follows):
  $$\hat{p}(x) = \frac{k}{2 N d_k(x)}$$
  where $d_k(x)$ is the distance from $x$ to its $k^{\text{th}}$ closest training instance
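A small sketch of the k-NN density estimate in 1-D (names and toy data are ours):

```python
import numpy as np

def knn_density(x_grid, data, k=10):
    """k-NN density estimate: p_hat(x) = k / (2 * N * d_k(x))."""
    N = len(data)
    dists = np.abs(x_grid[:, None] - data[None, :])   # distances to every sample
    d_k = np.sort(dists, axis=1)[:, k - 1]            # distance to the k-th nearest sample
    return k / (2.0 * N * d_k)

data = np.random.default_rng(2).normal(size=500)
x_grid = np.linspace(-4, 4, 9)
print(knn_density(x_grid, data, k=10))
```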


## Summary

- Lazy learning
- k-NN for classification
- Issues with k-NN and potential solutions
  - Curse of dimensionality
- Density estimation
  - Nonparametric methods
