
# Chapter II: Pattern Recognition (Rischan Mafrur / 138173)

Exercise 2.3.1
Consider the setup of Example 2.3.3, where now the means of the three classes 1, 2, and 3 are m1 = [0, 0, 0]^T, m2 = [1, 2, 2]^T, and m3 = [3, 3, 4]^T, respectively. Apply the SSErr MATLAB function on the data set X1 to estimate the parameter vectors w1, w2, and w3 of the three linear discriminant functions, in the extended 4-dimensional space. Use the set X2 to compute the error probability. Compute the classification error of the (optimal) Bayesian classifier on X2 and compare it with that resulting from the LS classifier in step 1.

In this exercise we use Example 2.3.3, but we change the means of the three classes to m1 = [0 0 0]^T, m2 = [1 2 2]^T, m3 = [3 3 4]^T.
```matlab
close('all'); clear
% m = [1 1 1; 5 3 2; 3 3 4]'   % means used in Example 2.3.3
m = [0 0 0; 1 2 2; 3 3 4]'     % means used in this exercise
[l,c] = size(m)
S1 = [0.8 0.2 0.1; 0.2 0.8 0.2; 0.1 0.2 0.8]
S(:,:,1) = S1; S(:,:,2) = S1; S(:,:,3) = S1;
P = [1/3 1/3 1/3]
% 1. Generate X1
N1 = 1000;
randn('seed',0)
[X1,y1] = generate_gauss_classes(m,S,P,N1);
[l,N1] = size(X1);
X1 = [X1; ones(1,N1)];   % extended 4-dimensional space
```

```matlab
% Plot X1 using different colors for points of different classes
figure(1), plot3(X1(1,y1==1),X1(2,y1==1),X1(3,y1==1),'r.', ...
    X1(1,y1==2),X1(2,y1==2),X1(3,y1==2),'g.', ...
    X1(1,y1==3),X1(2,y1==3),X1(3,y1==3),'b.')
% Next, we define matrix z1, each column of which
% corresponds to a training point
z1 = zeros(c,N1);
for i = 1:N1
    z1(y1(i),i) = 1;
end
% Generate X2
N2 = 10000;
randn('seed',100)
[X2,y2] = generate_gauss_classes(m,S,P,N2);
[l,N2] = size(X2);
X2 = [X2; ones(1,N2)];
% Plot X2 using different colors for points of different classes
figure(2), plot3(X2(1,y2==1),X2(2,y2==1),X2(3,y2==1),'r.', ...
    X2(1,y2==2),X2(2,y2==2),X2(3,y2==2),'g.', ...
    X2(1,y2==3),X2(2,y2==3),X2(3,y2==3),'b.')
% Define matrix z2
z2 = zeros(c,N2);
for i = 1:N2
    z2(y2(i),i) = 1;
end
% Estimate the parameter vectors of the three discriminant functions
w_all = [];
for i = 1:c
    w = SSErr(X1,z1(i,:),0);
    w_all = [w_all w];
end
% Note: in w_all, the i-th column corresponds to the parameter
% vector of the i-th discriminant function.
% Compute the classification error using the set X2
[vali,class_est] = max(w_all'*X2);
err = sum(class_est~=y2)/N2
% 2. Compute the estimates of the a posteriori probabilities,
% as they result in the framework of the LS classifier
aposte_est = w_all'*X2;
% Compute the true a posteriori probabilities
aposte = [];
for i = 1:N2
    t = zeros(c,1);
    for j = 1:c
        t(j) = comp_gauss_dens_val(m(:,j),S(:,:,j),X2(1:l,i))*P(j);
    end
    tot_t = sum(t);
    aposte = [aposte t/tot_t];
end
% Compute the average square error of the estimation of the posterior probabilities
approx_err = sum(sum((aposte-aposte_est).^2))/(N2*c)
% 3. Compute the optimal Bayesian classification error
% (the true a posteriori probabilities are known)
[vali,class] = max(aposte);
err_ba = sum(class~=y2)/N2
```

The result:

```
>> exercise231
err = 0.1836
approx_err = 0.0652
err_ba = 0.0990
```

In Example 2.3.3 the classification error is 5.11% for the LS classifier and 4.82% for the Bayesian classifier, but in this case (this exercise) the resulting errors are 18.36% for LS and 9.90% for Bayes. The question is why the errors are so much larger here.

Figure: the X2 data set of Example 2.3.3 (m = [1 1 1; 5 3 2; 3 3 4]') and of Exercise 2.3.1 (m = [0 0 0; 1 2 2; 3 3 4]').

From my perspective, the different positions of the class means influence the classification. Looking at the X2 figures, especially at the second class (green), in Example 2.3.3 the green class with mean [5 3 2] lies beside the other two classes (blue and red), but in Exercise 2.3.1 the green class lies in the middle, between the blue and red classes, so the classes overlap more.
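This overlap argument can be illustrated with a small pure-Python sketch (a deliberate simplification: identity covariance instead of the S1 used above, and densities compared only at the midpoint of the first two class means). The closer the means, the larger the density overlap, and the higher the attainable error.

```python
import math

def gauss_density(x, m):
    # N(x; m, I) up to the common (2*pi)^(-3/2) factor
    d2 = sum((xi - mi) ** 2 for xi, mi in zip(x, m))
    return math.exp(-0.5 * d2)

def densities_at_midpoint(means):
    # evaluate all class densities at the midpoint of the first two means
    mid = [(a + b) / 2 for a, b in zip(means[0], means[1])]
    return [gauss_density(mid, m) for m in means]

means_ex233 = [[1, 1, 1], [5, 3, 2], [3, 3, 4]]  # Example 2.3.3
means_ex231 = [[0, 0, 0], [1, 2, 2], [3, 3, 4]]  # Exercise 2.3.1

# Overlap between classes 1 and 2 is much heavier with the new means:
print(densities_at_midpoint(means_ex233))  # ~[0.072, 0.072, 0.027]
print(densities_at_midpoint(means_ex231))  # ~[0.325, 0.325, 0.000]
```

The midpoint density is roughly 4.5 times larger with the new means, consistent with the jump in both the LS and the Bayesian error.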

Exercise 2.4.1
Repeat Example 2.4.1, now with the covariance matrices of the Gaussian distributions S1 = S2 = 0.3I. Comment on the results.

In this exercise we use Example 2.4.1 with a different covariance matrix value.
```matlab
close('all'); clear
% Generate and plot X1
randn('seed',50)
m = [0 0; 1.2 1.2]'              % mean vectors
S = 0.3*eye(2)                   % covariance matrix
points_per_class = [200 200];
X1 = mvnrnd(m(:,1),S,points_per_class(1))';
X1 = [X1 mvnrnd(m(:,2),S,points_per_class(2))'];
y1 = [ones(1,points_per_class(1)) -ones(1,points_per_class(2))];
figure(1), plot(X1(1,y1==1),X1(2,y1==1),'r.', ...
    X1(1,y1==-1),X1(2,y1==-1),'bo')
```

```matlab
% Compute the classification error on the training set
Pe_tr = sum((2*(w*X1-w0>0)-1).*y1<0)/length(y1)
% Compute the classification error on the test set
Pe_te = sum((2*(w*X2-w0>0)-1).*y2<0)/length(y2)
% Plot the classifier hyperplane
global figt4
figt4 = 2;
svcplot_book(X1',y1',kernel,kpar1,kpar2,alpha,w0)
% Count the support vectors
sup_vec = sum(alpha>0)
% Compute the margin
marg = 2/sqrt(sum(w.^2))
```

Table: training and testing classification error, number of support vectors, and margin for different covariance matrices (C = 0.1).

| S | Training error | Testing error | Support vectors | Margin |
|---|---|---|---|---|
| 0.1I | 0.0050 | 0.0025 | 53 | 1.0440 |
| 0.2I | 0.0225 | 0.0325 | 82 | 0.9405 |
| 0.3I | 0.0525 | 0.0675 | 104 | 0.9562 |
| 0.4I | 0.0650 | 0.1125 | 123 | 1.0468 |

When we increase S (here S1 = S2), the two classes move closer together and mix, so the data set is no longer clearly separated and it becomes difficult to draw a good decision boundary. As the table above shows, when we increase S the training error, the testing error, and the number of support vectors all increase. Let's see the plots below:

Plots of the decision boundaries for S = 0.1I, 0.2I, 0.3I, and 0.4I.

Table: results for S = 0.3I with a different value of C.

| C | Training error | Testing error | Support vectors | Margin |
|---|---|---|---|---|
| 0.1 | 0.0525 | 0.0675 | 104 | 0.9562 |
| 20 | 0.0500 | 0.0725 | 47 | 0.5913 |

Here the parameter C controls the margin: the margin increases as C decreases, since a small C tolerates more margin violations. Accordingly, at C = 20 the margin shrinks to 0.5913 (from 0.9562 at C = 0.1) and the number of support vectors drops from 104 to 47.
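The relation the script exploits, margin = 2/||w|| (the distance between the hyperplanes w'x + w0 = +1 and w'x + w0 = -1), matches `marg = 2/sqrt(sum(w.^2))` above and is easy to check in plain Python (the weight vectors below are hypothetical, not the ones from this run):

```python
import math

def margin(w):
    # distance between the hyperplanes w'x + w0 = +1 and w'x + w0 = -1
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

print(margin([1.0, 1.0]))  # 2/sqrt(2) ~ 1.414
print(margin([3.0, 1.0]))  # a larger ||w|| gives a smaller margin
```

A larger C penalizes margin violations more heavily, which typically drives the optimizer toward a larger ||w|| and hence a smaller margin, consistent with the C = 20 result above.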

Exercise 2.6.1 (not included in the assignment)
Consider the data sets X1 (training set) and X2 (test set) from Example 2.5.2. Run the kernel perceptron algorithm using X1 as the training set, where the kernel functions are (a) linear, (b) radial basis with σ = 0.1, 0.5, 1, 1.5, 2, 5, and (c) polynomial of the form (x^T y + 1)^n for n = 3, 5, 15, 18, 20, 22. For all three cases, count the training and test errors, the number of misclassifications during training, and the number of iterations the algorithm runs. Use 30,000 as the maximum number of allowable iterations. For each case plot in the same figure the training set X1, the test set X2, and the decision boundary between the two classes. Use different colors and symbols for each class.

In this exercise we use the data sets X1 (training set) and X2 (test set) from Example 2.5.2 and run the kernel perceptron algorithm on X1 with three kernel functions: linear, radial basis with different σ values, and polynomial with different n values. We then count the training and test errors, the number of misclassifications during training, and the number of iterations the algorithm runs (with a maximum of 30,000 iterations). Finally, we plot the results.

MATLAB source code:

```matlab
close('all'); clear
% 1. Generate X1
l = 2;                       % Dimensionality
poi_per_square = 30;         % Points per square
N = 9*poi_per_square;        % Total no. of points
rand('seed',0)
X1 = []; y1 = [];
for i = 0:2
    for j = 0:2
        X1 = [X1 rand(l,poi_per_square) + ...
            [i j]'*ones(1,poi_per_square)];
        if (mod(i+j,2) == 0)
            y1 = [y1 ones(1,poi_per_square)];
        else
            y1 = [y1 -ones(1,poi_per_square)];
        end
    end
end
% Plot X1
figure(1), plot(X1(1,y1==1),X1(2,y1==1),'r.', ...
    X1(1,y1==-1),X1(2,y1==-1),'bo')
figure(1), axis equal
% Generate X2
rand('seed',100)
X2 = []; y2 = [];
for i = 0:2
    for j = 0:2
        X2 = [X2 rand(l,poi_per_square) + ...
            [i j]'*ones(1,poi_per_square)];
        if (mod(i+j,2) == 0)
            y2 = [y2 ones(1,poi_per_square)];
        else
            y2 = [y2 -ones(1,poi_per_square)];
        end
    end
end
```

```matlab
% add/remove comments as appropriate
% Run the kernel perceptron algorithm for the linear kernel
kernel = 'linear'; kpar1 = 0; kpar2 = 0; max_iter = 30000;
[a,iter,count_misclas] = kernel_perce(X1,y1,kernel,kpar1,kpar2,max_iter)
% Run the algorithm using the radial basis kernel function
%{
kernel = 'rbf'; kpar1 = 0.1; kpar2 = 0; max_iter = 30000;
[a,iter,count_misclas] = kernel_perce(X1,y1,kernel,kpar1,kpar2,max_iter)
%}
% Run the algorithm using the polynomial kernel function
%{
kernel = 'poly'; kpar1 = 1; kpar2 = 3; max_iter = 30000;
[a,iter,count_misclas] = kernel_perce(X1,y1,kernel,kpar1,kpar2,max_iter)
%}
% Compute the training error
trainiteration = 0;
for i = 1:N    % where N is the number of training vectors
    K = CalcKernel(X1',X1(:,i)',kernel,kpar1,kpar2)';
    out_train(i) = sum((a.*y1).*K) + sum(a.*y1);
    trainiteration = trainiteration + 1;
end
err_train = sum(out_train.*y1<0)/length(y1)
fprintf('Training iterations: %d\n', trainiteration)
% Compute the test error
testiteration = 0;
for i = 1:N
```

```matlab
    K = CalcKernel(X1',X2(:,i)',kernel,kpar1,kpar2)';
    out_test(i) = sum((a.*y1).*K) + sum(a.*y1);
    testiteration = testiteration + 1;
end
err_test = sum(out_test.*y2<0)/length(y2)
fprintf('Testing iterations: %d\n', testiteration)
% Count the number of misclassifications during training
sum_pos_a = sum(a>0)
% 2. Plot the training set (see book Figures 2.7 and 2.8)
figure(1), hold on
figure(1), plot(X1(1,y1==1),X1(2,y1==1),'ro', ...
    X1(1,y1==-1),X1(2,y1==-1),'b+')
figure(1), axis equal
% Note: the vectors from class -1 are marked by '+'
% Plot the testing set (see book Figures 2.7 and 2.8)
figure(2), hold on
figure(2), plot(X2(1,y2==1),X2(2,y2==1),'ro', ...
    X2(1,y2==-1),X2(2,y2==-1),'b+')
figure(2), axis equal
% Plot the decision boundary in the same figure
bou_x = [0 3]; bou_y = [0 3]; resolu = .05; fig_num = 1;
plot_kernel_perce_reg(X1,y1,a,kernel,kpar1,kpar2,bou_x,bou_y,resolu,fig_num)
bou_x = [0 3]; bou_y = [0 3]; resolu = .05; fig_num = 2;
plot_kernel_perce_reg(X2,y2,a,kernel,kpar1,kpar2,bou_x,bou_y,resolu,fig_num)
```
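For reference, the three kernel functions handed to `kernel_perce` can be sketched in plain Python. The RBF form below, exp(−||x − y||²/σ²), is one common convention; the exact parameterization of `kpar1` inside `CalcKernel` may differ.

```python
import math

def k_linear(x, y):
    # plain inner product
    return sum(a * b for a, b in zip(x, y))

def k_rbf(x, y, sigma):
    # assumed convention: exp(-||x - y||^2 / sigma^2)
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / sigma ** 2)

def k_poly(x, y, n):
    # (x'y + 1)^n, as in the exercise statement
    return (k_linear(x, y) + 1) ** n

print(k_linear([1, 2], [3, 4]))    # 11
print(k_poly([1, 2], [3, 4], 3))   # (11 + 1)^3 = 1728
print(k_rbf([1, 2], [1, 2], 0.5))  # 1.0 at x == y for any sigma
```

Small σ or large n makes the kernel very local or very high-degree, which explains the overfitting pattern visible in the results table.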

The results are given in the table below:

| Kernel | Training error | Testing error | Misclassifications during training | Iterations |
|---|---|---|---|---|
| Linear | 0.4778 | 0.4815 | 48 | 270 |
| RBF σ=0.1 | 0 | 0.1111 | 95 | 270 |
| RBF σ=0.5 | 0 | 0.0519 | 72 | 270 |
| RBF σ=1 | 0 | 0.0185 | 81 | 270 |
| RBF σ=1.5 | 0.0259 | 0.0222 | 85 | 270 |
| RBF σ=2 | 0.1111 | 0.1148 | 83 | 270 |
| RBF σ=3 | 0.2815 | 0.2741 | 78 | 270 |
| RBF σ=4 | 0.4259 | 0.4000 | 72 | 270 |
| RBF σ=5 | 0.4407 | 0.4407 | 62 | 270 |
| Polynomial n=0 | 0.4444 | 0.4444 | 17 | 270 |
| Polynomial n=1 | 0.5148 | 0.5000 | 46 | 270 |
| Polynomial n=3 | 0.4296 | 0.4148 | 89 | 270 |
| Polynomial n=5 | 0.1778 | 0.1852 | 150 | 270 |
| Polynomial n=10 | 0.2333 | 0.2407 | 193 | 270 |
| Polynomial n=15 | 0.1444 | 0.2037 | 234 | 270 |
| Polynomial n=18 | 0.1444 | 0.1926 | 241 | 270 |
| Polynomial n=20 | 0.1519 | 0.1889 | 244 | 270 |
| Polynomial n=22 | 0.2000 | 0.2333 | 245 | 270 |

We also plot the training set X1, the test set X2, and the decision boundary between the two classes.

Plot of the X1 and X2 data sets using the linear kernel function (X1 left, X2 right).


RBF σ = 0.5


RBF σ = 1

RBF σ = 1.5

RBF σ = 2


RBF σ = 5

Polynomial kernel (x^T y + 1)^n for n = 3, 5, 15, 18, 20, 22 (X1 training set left, X2 test set right)

n=3


n=5

n=10

n=15


n=18

n=20

n=22

Exercise 2.7.1
Consider a 2-class, 2-dimensional classification task, where the classes are described by normal distributions with means [1, 1]^T (class +1) and [s, s]^T (class −1) and identity covariance matrices. Set s = 2. Generate a data set X consisting of 50 points from the first class and 50 points from the second class.
1. Repeat steps 1, 2, and 3 of Example 2.7.1 and draw conclusions.

2. Repeat step 1 for s = 3, 4, 6.

This exercise uses the setup of Example 2.7.1 with mean [1 1]^T for class +1 and mean [2 2]^T for class −1. The problem asks us to plot the classification error on the training and test sets versus the number of base classifiers, and then to repeat with different means for class −1.

MATLAB code:

```matlab
% Generate the subset X1 of X, which contains the data points from the first class
m11 = [1 1]'; m12 = [1 1]'; m13 = [1 1]';
m1 = [m11 m12 m13];
S1(:,:,1) = 0.1*eye(2); S1(:,:,2) = 0.2*eye(2); S1(:,:,3) = 0.3*eye(2);
P1 = [0.4 0.4 0.2];
N1 = 50; sed = 0;
[X1,y1] = mixt_model(m1,S1,P1,N1,sed);
% The subset X2 of X, with the points from the second class,
% is generated similarly (use again sed = 0)
m21 = [2 2]'; m22 = [2 2]'; m23 = [2 2]';
m2 = [m21 m22 m23];
S2(:,:,1) = 0.1*eye(2); S2(:,:,2) = 0.2*eye(2); S2(:,:,3) = 0.3*eye(2);
P2 = [0.2 0.3 0.5];
N2 = 50; sed = 0;
[X2,y2] = mixt_model(m2,S2,P2,N2,sed);
X = [X1 X2];
y = [ones(1,N1) -ones(1,N2)];
% Plot X
figure(1), hold on
figure(1), plot(X(1,y==1),X(2,y==1),'r.',X(1,y==-1),X(2,y==-1),'bx')
% 1.
T_max = 3000;   % max number of base classifiers
[pos_tot,thres_tot,sleft_tot,a_tot,P_tot,K] = boost_clas_coord(X,y,T_max);
```

```matlab
% 2.
[y_out,P_error] = boost_clas_coord_out(pos_tot,thres_tot,sleft_tot,a_tot,P_tot,K,X,y);
figure(2), plot(P_error)
% 3.
% Data set Z is also generated in two steps, i.e., 50 points are first
% generated from the first class (data set Z1)
mZ11 = [1 1]'; mZ12 = [1 1]'; mZ13 = [1 1]';
mZ1 = [mZ11 mZ12 mZ13];
SZ1(:,:,1) = 0.1*eye(2); SZ1(:,:,2) = 0.2*eye(2); SZ1(:,:,3) = 0.3*eye(2);
wZ1 = [0.4 0.4 0.2];
NZ1 = 50; sed = 100;
[Z1,yz1] = mixt_model(mZ1,SZ1,wZ1,NZ1,sed);
% The remaining 50 points from the second class (data set Z2) are
% generated similarly (use again sed = 100)
mZ21 = [2 2]'; mZ22 = [2 2]'; mZ23 = [2 2]';
mZ2 = [mZ21 mZ22 mZ23];
SZ2(:,:,1) = 0.1*eye(2); SZ2(:,:,2) = 0.2*eye(2); SZ2(:,:,3) = 0.3*eye(2);
wZ2 = [0.2 0.3 0.5];
NZ2 = 50; sed = 100;
[Z2,yz2] = mixt_model(mZ2,SZ2,wZ2,NZ2,sed);
% Form Z and respective labels
Z = [Z1 Z2];
y = [ones(1,NZ1) -ones(1,NZ2)];
% Classify the vectors of Z
[y_out_Z,P_error_Z] = boost_clas_coord_out(pos_tot,thres_tot,sleft_tot,a_tot,P_tot,K,Z,y);
figure(3), plot(P_error_Z)
```

Table of results with s = 2:


Distributed data set

Training set X error vs. number of base classifiers (left); testing set Z error vs. number of base classifiers (right).

For the training set X, the classification error tends to 0 as the number of base classifiers increases, but for the testing set Z the classification error tends to a positive limit as the number of base classifiers increases. I do not understand question 2, since it just repeats step 1 for s = 3, 4, and 6.

Plots for s = 3, s = 4, and s = 6:

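The convergence behavior above comes from boosting's reweighting: after each round, misclassified points gain weight so the next base classifier concentrates on them. Below is a one-round sketch of the standard AdaBoost update in plain Python (the numbers are hypothetical, and the internals of the book's `boost_clas_coord` may differ in detail).

```python
import math

# four training points, equal initial weights; the last one is misclassified
w = [0.25, 0.25, 0.25, 0.25]
correct = [True, True, True, False]

err = sum(wi for wi, c in zip(w, correct) if not c)  # weighted error = 0.25
alpha = 0.5 * math.log((1 - err) / err)              # base-classifier weight

# misclassified points are multiplied by exp(+alpha), correct ones by exp(-alpha)
w = [wi * math.exp(-alpha if c else alpha) for wi, c in zip(w, correct)]
Z = sum(w)
w = [wi / Z for wi in w]

print(w)  # the misclassified point now carries half of the total weight
```

Repeating this round after round drives the training error to zero, while on fresh data (the set Z) the error settles at a positive limit, as seen in the plots.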

Exercise 2.8.1
Repeat Example 2.8.1, now with the covariance matrix of the involved Gaussian distributions being σ²I, where σ² = 4.

In this exercise we use Example 2.8.1 but with a different variance value, σ² = 4. In Example 2.8.1 the training and testing errors are 0 with variance 1, learning rate 0.01, four hidden nodes, and 9000 epochs; but when we try the same data and setup with only the variance changed, the result is very different. With variance 4 the data set is not clearly separated, so the BP algorithm cannot form a perfect decision boundary. Let's see this table:

| Epochs / learning rate | Training error | Testing error |
|---|---|---|
| 9000 / 0.01 | 0.0357 | 0.0571 |
| 9000 / 0.0001 | 0.1929 | 0.2000 |
| 12000 / 0.0001 | 0.1929 | 0.2000 |

I also tried many different maximum iteration counts (epochs), hidden-layer sizes, and learning rates, but I think it is very difficult to drive the training and testing errors to zero because the classes are not clearly separated. Let's compare the distributed data sets of Example 2.8.1 and of this exercise.


Distributed data set in Example 2.8.1

Result of classification in Example 2.8.1

Distributed data set in Exercise 2.8.1

Result of classification in Exercise 2.8.1

Within the region marked by the green line, I think it is very complicated to draw a clear decision boundary because the data from class +1 and class −1 are mixed.


Exercise 2.8.2
Consider a 2-class, 2-dimensional classification problem. The points of the first (second) class, denoted +1 (−1), stem from one out of eight Gaussian distributions with means [10,0]^T, [0,10]^T, [10,0]^T, [0,10]^T, [10,20]^T, [10,20]^T, [20,10]^T, [20,10]^T ([10,10]^T, [0,0]^T, [10,10]^T, [10,10]^T, [10,10]^T, [20,20]^T, [20,0]^T, [0,20]^T) with equal probability. The covariance matrix for each distribution is σ²I, where σ² = 1 and I is the 2×2 identity matrix. Take the following steps:
1. Generate and plot a data set X1 (training set) containing 160 points from class +1 (20 from each distribution) and another 160 points from class −1 (20 points from each distribution). Use the same prescription to generate a data set X2 (test set).
2. Run the adaptive BP algorithm with learning rate 0.01 and for 10,000 iterations, to train 2-layer FNNs with 7, 8, 10, 14, 16, 20, 32, and 40 hidden-layer nodes (the values of the rest of the parameters for the adaptive BP algorithm are chosen as in Example 2.8.1).
3. Repeat step 2 for σ² = 2, 3, 4 and draw conclusions.

```matlab
for i = 1:c1
    S1(:,:,i) = s*eye(l);
end
sed = 0;   % Random generator seed
[class1_X,class1_y] = mixt_model(m1,S1,P1,N1,sed);
% Generate the training data from the second class
N2 = 160;  % Number of second class data points
for i = 1:c2
    S2(:,:,i) = s*eye(l);
end
sed = 0;
[class2_X,class2_y] = mixt_model(m2,S2,P2,N2,sed);
% Form X1
X1 = [class1_X class2_X];        % Data vectors
y1 = [ones(1,N1) -ones(1,N2)];   % Class labels
figure(1), hold on
figure(1), plot(X1(1,y1==1),X1(2,y1==1),'r.',X1(1,y1==-1),X1(2,y1==-1),'bx')
% Generate test set X2
% Data of the first class
sed = 100;  % Random generator seed. This time we set this value to 100
[class1_X,class1_y] = mixt_model(m1,S1,P1,N1,sed);
% Data of the second class
sed = 100;
[class2_X,class2_y] = mixt_model(m2,S2,P2,N2,sed);
% Production of the unified data set
X2 = [class1_X class2_X];        % Data vectors
y2 = [ones(1,N1) -ones(1,N2)];   % Class labels
rand('seed',100)
randn('seed',100)
```

The original script repeats an identical training block for each hidden-layer size (7, 8, 10, 14, 16, 20, 32, 40 nodes), changing only `k` and the names of the result variables (`pe_train7`, `pe_test7`, and so on). Written idiomatically as a loop:

```matlab
iter = 10000;                    % Number of iterations
code = 3;                        % Code for the chosen training algorithm
lr = .01;                        % learning rate
par_vec = [lr 0 1.05 0.7 1.04];  % Parameter vector
k_values = [7 8 10 14 16 20 32 40];
for idx = 1:length(k_values)
    k = k_values(idx);           % number of hidden layer nodes
    [net,tr] = NN_training(X1,y1,k,code,iter,par_vec);
    % Compute the training and the test errors
    pe_train(idx) = NN_evaluation(net,X1,y1);
    pe_test(idx) = NN_evaluation(net,X2,y2);
end
```

Plot of the data set:

σ² = 1

| Hidden-layer nodes | Training error | Test error |
|---|---|---|
| 7 | 0.1875 | 0.2000 |
| 8 | 0.0625 | 0.0688 |
| 10 | 0.0500 | 0.0719 |
| 14 | 0 | 0.0094 |
| 16 | 0.1719 | 0.2000 |
| 20 | 0 | 0.0094 |
| 32 | 0 | 0.0094 |
| 40 | 0 | 0.0156 |


σ² = 2

| Hidden-layer nodes | Training error | Test error |
|---|---|---|
| 7 | 0.1187 | 0.1219 |
| 8 | 0.1406 | 0.1406 |
| 10 | 0.0594 | 0.1000 |
| 14 | 0.0125 | 0.0437 |
| 16 | 0.0125 | 0.0219 |
| 20 | 0.0094 | 0.0094 |
| 32 | 0.0031 | 0.0187 |
| 40 | 0.0125 | 0.0219 |


σ² = 3

| Hidden-layer nodes | Training error | Test error |
|---|---|---|
| 7 | 0.1344 | 0.1344 |
| 8 | 0.1437 | 0.1469 |
| 10 | 0.0781 | 0.1094 |
| 14 | 0.0094 | 0.0469 |
| 16 | 0.0813 | 0.1031 |
| 20 | 0.0187 | 0.0469 |
| 32 | 0.0125 | 0.0563 |
| 40 | 0.0187 | 0.0781 |

σ² = 4

| Hidden-layer nodes | Training error | Test error |
|---|---|---|
| 7 | 0.2125 | 0.2562 |
| 8 | 0.1562 | 0.1812 |
| 10 | 0.3688 | 0.4094 |
| 14 | 0.0344 | 0.0906 |
| 16 | 0.1031 | 0.1156 |
| 20 | 0.0406 | 0.1062 |
| 32 | 0.0312 | 0.0813 |
| 40 | 0.0312 | 0.1031 |


After examining the tables and the plots, we can draw the following conclusions. The number of hidden-layer nodes is not linearly related to classifier performance; there is a certain optimum number. When the variance σ² is small, the data stay close to their means; as the variance increases, the data spread further from the means, and since there are many component distributions, the classes mix with each other, making classification more difficult, so the training and testing errors increase.
