
Introduction Problem Proposal Experiments Results Conclusions and Future Work

IEEE Autumn Meeting on Power, Electronics and Computing (ROPEC15)

Towards the Full Model Selection in Temporal Databases by Using Micro-Differential Evolution. An Empirical Study
(Hacia el Modelo Completo de Selección en bases de datos temporales a través de Micro-Evolución Diferencial. Un estudio empírico)

Nancy Pérez-Castro, Héctor-Gabriel Acosta-Mesa, Efrén Mezura-Montes and Nicandro Cruz Ramírez
Artificial Intelligence Research Center

Xalapa, Veracruz, 91000, MEXICO


perez.castro.nancy@gmail.com, heacosta@uv.mx
emezura@uv.mx, ncruz@uv.mx

November 6, 2015
University of Veracruz


Outline
1 Introduction
2 Problem
3 Proposal
4 Experiments
5 Results
6 Conclusions and Future Work


Introduction
Practitioners and researchers in Machine Learning and other disciplines have faced the dilemma of choosing an appropriate technique to treat a specific instance (database). This dilemma has been named the algorithm selection problem or the full model selection problem.


Full Model Selection (FMS)

FMS
The term FMS was introduced by Escalante et al. [a]. FMS involves the selection of an appropriate set of methods and their respective parameter values to maximize or minimize a performance measure.

[a] Escalante et al., Particle Swarm Model Selection. JMLR (2009).
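
To make the idea of selecting "a set of methods" concrete, the following minimal Python sketch pictures FMS as a search over one method choice per stage. It is only an illustration under stated assumptions: `SEARCH_SPACE` and `decode` are hypothetical names, the method lists contain only the methods that appear in the results of this presentation (the authors' full search space may be larger), and the parameter values of each method are left out.

```python
# Hypothetical search space: only methods named in the results slides are listed.
SEARCH_SPACE = {
    "smoothing": ["Moving Average", "Sgolay"],
    "representation": ["PCA", "PAA", "SAX"],
    "classifier": ["KNN-MD", "KNN-CD", "KNN-ED1", "KNN-DTW1", "KNN-DTW2"],
}


def decode(vector):
    """Map a vector of reals in [0, 1) to one method per stage (assumed encoding)."""
    model = {}
    for gene, (stage, options) in zip(vector, SEARCH_SPACE.items()):
        # Scale the gene to an index and guard against gene == 1.0.
        index = min(int(gene * len(options)), len(options) - 1)
        model[stage] = options[index]
    return model


if __name__ == "__main__":
    print(decode([0.0, 0.5, 0.99]))
```

A real encoding would also carry the parameters of each selected method (for example the span of the moving average), so that a single vector describes a complete model.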


Time-Series
Definition
Time-series can be defined as a collection of chronologically made observations, which is characterized by its numerical and continuous nature [b].

[b] P. Esling and C. Agon, Time-series data mining, ACM Comput. Surv., vol. 45, no. 1, pp. 12:1-12:34, Dec. 2012.


Related work FMS


Problem


How will the search space be explored?

Proposal
μ-DEMS


μ-Differential Evolution (μ-DE)

begin
  Randomly generate an initial population of vectors
  Compute fitness
  repeat
    if count == R then
      Reinitialize the N worst individuals
    end if
    For each target vector, generate a mutant vector
    Generate trial vectors through the crossover operation
    Select vectors for the next generation
    count++
  until maximum number of generations
end
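
The pseudocode above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: it uses the rand/1/bin variant, a sphere function stands in for the real model-evaluation fitness, the function and argument names are hypothetical, and resetting the counter after each reinitialization is an assumption (the pseudocode does not say). The default parameter values are the ones reported in the experimental setup.

```python
import random


def sphere(x):
    """Stand-in fitness to minimise (the slides minimise a model's error instead)."""
    return sum(v * v for v in x)


def micro_de(fitness, bounds, np_=6, cr=0.1, f=0.9, n_worst=2, restart=10,
             max_gen=200, seed=0):
    """Minimise `fitness` inside box `bounds` with a very small DE population."""
    rng = random.Random(seed)
    dim = len(bounds)

    def random_vector():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def clip(x):
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, bounds)]

    pop = [random_vector() for _ in range(np_)]
    fit = [fitness(x) for x in pop]
    count = 0
    for _ in range(max_gen):
        if count == restart:
            # Reinitialise the n_worst individuals (largest fitness values).
            worst = sorted(range(np_), key=lambda i: fit[i], reverse=True)[:n_worst]
            for i in worst:
                pop[i] = random_vector()
                fit[i] = fitness(pop[i])
            count = 0  # assumption: the counter restarts after each reinitialisation
        new_pop, new_fit = [], []
        for i in range(np_):
            # rand/1 mutation: three distinct individuals, all different from the target.
            r1, r2, r3 = rng.sample([j for j in range(np_) if j != i], 3)
            mutant = [pop[r1][d] + f * (pop[r2][d] - pop[r3][d]) for d in range(dim)]
            # Binomial crossover; position j_rand always comes from the mutant.
            j_rand = rng.randrange(dim)
            trial = clip([mutant[d] if (rng.random() < cr or d == j_rand) else pop[i][d]
                          for d in range(dim)])
            trial_fit = fitness(trial)
            # Greedy selection between target and trial.
            if trial_fit <= fit[i]:
                new_pop.append(trial)
                new_fit.append(trial_fit)
            else:
                new_pop.append(pop[i])
                new_fit.append(fit[i])
        pop, fit = new_pop, new_fit
        count += 1
    best = min(range(np_), key=lambda i: fit[i])
    return pop[best], fit[best]


if __name__ == "__main__":
    print(micro_de(sphere, [(-5.0, 5.0)] * 3, max_gen=50, seed=1))
```

Because only the worst individuals are replaced, the best vector found so far survives every reinitialization, which lets the tiny population keep exploring without losing progress.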

Encoding

\[
\mathrm{CVER} = \frac{1}{k} \sum_{i=1}^{k} \mathrm{ER}_i \tag{1}
\]
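
Equation (1) averages an error rate over k folds. A minimal Python sketch of that computation is given below, assuming that ER_i is the misclassification rate on the i-th held-out fold. The names `cver`, `train_and_predict` and `nearest_neighbour` are hypothetical, the interleaved fold assignment is an assumption, and the 1-NN predictor on scalar samples is only a toy stand-in for a full model.

```python
def cver(samples, labels, train_and_predict, k=5):
    """Mean misclassification rate over k folds, as in Eq. (1)."""
    n = len(samples)
    # Assumption: interleaved folds (sample i goes to fold i mod k); requires n >= k.
    folds = [list(range(i, n, k)) for i in range(k)]
    error_rates = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(n) if i not in held_out]
        predictions = train_and_predict(
            [samples[i] for i in train_idx],
            [labels[i] for i in train_idx],
            [samples[i] for i in test_idx],
        )
        wrong = sum(p != labels[i] for p, i in zip(predictions, test_idx))
        error_rates.append(wrong / len(test_idx))  # ER_i for this fold
    return sum(error_rates) / k


def nearest_neighbour(train_x, train_y, test_x):
    """Toy 1-NN on scalar samples, used only to exercise cver()."""
    predictions = []
    for q in test_x:
        closest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - q))
        predictions.append(train_y[closest])
    return predictions


if __name__ == "__main__":
    xs = [0.0, 0.1, 0.2, 0.3, 0.4, 5.0, 5.1, 5.2, 5.3, 5.4]
    ys = [0] * 5 + [1] * 5
    print(cver(xs, ys, nearest_neighbour, k=5))
```

In a search such as the one described here, a value like this would serve as the fitness of one candidate model, so every fitness evaluation costs k training and testing rounds.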


Experiments

Goal of experiments
To evaluate four μ-Differential Evolution variants (rand/1/bin, rand/1/exp, best/1/bin, best/1/exp) for searching models that involve a suitable combination of smoothing, time-series representation and classification methods with their respective parameter values.
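
The four variant names follow the usual DE notation base/differences/crossover. The Python sketch below shows, under that standard reading, how the two base-vector choices (rand, best) and the two crossover types (bin, exp) differ. It is a generic illustration with hypothetical function names, not code from the study, and it assumes minimisation.

```python
import random


def mutate(pop, fit, i, f, base, rng):
    """DE/<base>/1 mutation for target index i; `base` is "rand" or "best"."""
    others = [j for j in range(len(pop)) if j != i]
    if base == "best":
        b = min(range(len(pop)), key=lambda j: fit[j])  # best = lowest fitness
        r1, r2 = rng.sample([j for j in others if j != b], 2)
    else:
        b, r1, r2 = rng.sample(others, 3)
    return [pop[b][d] + f * (pop[r1][d] - pop[r2][d]) for d in range(len(pop[i]))]


def crossover_bin(target, mutant, cr, rng):
    """Binomial crossover: each position is taken from the mutant with probability cr."""
    j_rand = rng.randrange(len(target))  # guarantees at least one mutant position
    return [m if (rng.random() < cr or d == j_rand) else t
            for d, (t, m) in enumerate(zip(target, mutant))]


def crossover_exp(target, mutant, cr, rng):
    """Exponential crossover: copy a contiguous (circular) run from the mutant."""
    dim = len(target)
    trial = list(target)
    d = rng.randrange(dim)
    copied = 0
    while True:
        trial[d] = mutant[d]
        d = (d + 1) % dim
        copied += 1
        if copied >= dim or rng.random() >= cr:
            break
    return trial


if __name__ == "__main__":
    rng = random.Random(0)
    print(crossover_bin([0] * 5, [1] * 5, 0.1, rng))
    print(crossover_exp([0] * 5, [1] * 5, 0.1, rng))
```

With a low crossover rate such as CR = 0.1, both operators usually change only one or two positions of the target, so the practical difference between bin and exp lies in whether the changed positions are scattered or contiguous.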


Experiments description
Three experiments were performed:
1 A comparison of the final statistical results.
2 Convergence behavior.
3 An analysis of the best final solution obtained.


Experimental setup
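Five independent runs were performed for each μ-DEMS variant.
Six time-series databases were used.
Parameter values used:
NP = 6 (population size)
CR = 0.1 (crossover rate)
F = 0.9 (scale factor)
N = 2 (worst individuals reinitialized)
R = 10 (generations between reinitializations)

5-fold cross-validation.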


Results
Experiment 1. Final results
Database   Stat  rand/1/bin  rand/1/exp  best/1/bin  best/1/exp
Beef       B     0.3667      0.3667      0.3333      0.3667
           M     0.3800      0.3900      0.3700      0.3833
           W     0.4000      0.4167      0.4000      0.4000
           SD    0.0139      0.0224      0.0298      0.0167
           p     0.6751
Coffee     B     0.0000      0.0000      0.0000      0.0000
           M     0.0000      0.0100      0.0067      0.0100
           W     0.0000      0.0333      0.0167      0.0167
           SD    0.0000      0.0149      0.0091      0.0091
           p     0.2945
ECG200     B     0.0650      0.0600      0.0600      0.0600
           M     0.0670      0.0630      0.0610      0.0650
           W     0.0700      0.0650      0.0650      0.0700
           SD    0.0027      0.0027      0.0022      0.0050
           p     0.0637
OliveOil   B     0.0667      0.0500      0.0667      0.0833
           M     0.0670      0.0630      0.0610      0.0650
           W     0.0700      0.0650      0.0650      0.0700
           SD    0.0027      0.0027      0.0022      0.0050
           p     0.4670
FaceFour   B     0.0265      0.0265      0.0261      0.0261
           M     0.0353      0.0338      0.0261      0.0265
           W     0.0704      0.0447      0.0261      0.0269
           SD    0.0196      0.0076      0.0000      0.0003
           p     0.1021
GunPoint   B     0.0050      0.0100      0.0000      0.0050
           M     0.0100      0.0100      0.0030      0.0100
           W     0.0150      0.0100      0.0050      0.0150
           SD    0.0035      0.0000      0.0027      0.0071
           p     0.0712

(One p-value is reported per database.)

Experiment 1. Post-hoc Bonferroni test

[Figure: post-hoc Bonferroni test. Panels: (a) Best values, (b) Median values]

Experiment 2. Convergence plots.

[Figures: convergence plots for the Beef, Coffee, ECG200, FaceFour, GunPoint and OliveOil databases]

Experiment 3. Analysis of the final solutions

Database   Variant     Smoothing                  Representation     Classifier/Similarity Measure  CVER    Time (m)
Beef       rand/1/bin  Moving Average{span(105)}  PCA{red(0.6)}      KNN-MD{k(6)}                   0.3667  752.55
Beef       rand/1/exp  Sgolay{k(1),f(115)}        PCA{red(0.4)}      KNN-MD{k(7)}                   0.3667  1046.23
Beef       best/1/bin  Sgolay{k(2),f(153)}        PCA{red(0.8)}      KNN-MD{k(8)}                   0.3333  752.97
Beef       best/1/exp  Moving Average{span(33)}   PAA{ns(190)}       KNN-MD{k(9)}                   0.3667  1089.15
Coffee     rand/1/bin  Sgolay{k(2),f(11)}         PAA{ns(103)}       KNN-CD{k(1)}                   0.0000  174.67
Coffee     rand/1/exp  Sgolay{k(5),f(33)}         SAX{w(86),a(18)}   KNN-ED1{k(13)}                 0.0000  119.04
Coffee     best/1/bin  Sgolay{k(3),f(25)}         SAX{w(93),a(18)}   KNN-ED1{k(4)}                  0.0000  190.78
Coffee     best/1/exp  Sgolay{k(2),f(23)}         SAX{w(139),a(20)}  KNN-ED1{k(10)}                 0.0000  107.20
ECG200     rand/1/bin  Sgolay{k(3),f(5)}          PCA{red(0.5)}      KNN-MD{k(2)}                   0.0650  260.35
ECG200     rand/1/exp  Sgolay{k(5),f(5)}          PCA{red(0.4)}      KNN-MD{k(2)}                   0.0600  281.97
ECG200     best/1/bin  Sgolay{k(2),f(3)}          PCA{red(0.7)}      KNN-DTW1{k(1),r(8)}            0.0600  306.41
ECG200     best/1/exp  Sgolay{k(4),f(9)}          PCA{red(0.7)}      KNN-MD{k(3)}                   0.0600  169.75
OliveOil   rand/1/bin  Moving Average{span(15)}   PCA{red(0.8)}      KNN-MD{k(1)}                   0.0667  637.96
OliveOil   rand/1/exp  Sgolay{k(4),f(39)}         PCA{red(0.3)}      KNN-MD{k(1)}                   0.0500  700.18
OliveOil   best/1/bin  Moving Average{span(15)}   PCA{red(0.9)}      KNN-MD{k(2)}                   0.0667  739.35
OliveOil   best/1/exp  Sgolay{k(2),f(19)}         PCA{red(0.8)}      KNN-MD{k(3)}                   0.0833  813.28
FaceFour   rand/1/bin  Sgolay{k(4),f(23)}         PCA{red(0.6)}      KNN-MD{k(1)}                   0.0265  1136.19
FaceFour   rand/1/exp  Sgolay{k(3),f(9)}          SAX{w(93),a(19)}   KNN-ED1{k(1)}                  0.0265  551.81
FaceFour   best/1/bin  Sgolay{k(5),f(23)}         SAX{w(146),a(9)}   KNN-ED1{k(1)}                  0.0261  1672.13
FaceFour   best/1/exp  Sgolay{k(3),f(13)}         PCA{red(0.4)}      KNN-MD{k(1)}                   0.0261  882.48
GunPoint   rand/1/bin  Moving Average{span(11)}   PAA{ns(48)}        KNN-DTW2{k(1),r(3)}            0.0050  1820.59
GunPoint   rand/1/exp  Sgolay{k(3),f(19)}         PAA{ns(74)}        KNN-DTW2{k(1),r(14)}           0.0000  3518.24
GunPoint   best/1/bin  Sgolay{k(2),f(15)}         PAA{ns(74)}        KNN-DTW2{k(1),r(15)}           0.0000  6870.46
GunPoint   best/1/exp  Sgolay{k(4),f(25)}         PAA{ns(75)}        KNN-DTW2{k(3),r(14)}           0.0050  4105.02
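
Each model listed above chains three stages: smoothing, representation and classification. The Python sketch below illustrates one such chain, a moving average followed by PAA and a k-nearest-neighbour vote. It is an assumption-laden illustration rather than the authors' code: the function names are hypothetical, the moving-average window shrinks at the series ends, and plain Euclidean distance is used in place of the MD, CD, ED1 and DTW measures named in the table.

```python
import math
from collections import Counter


def moving_average(series, span):
    """Centred moving average; the window shrinks near the ends (an assumption)."""
    half = span // 2
    smoothed = []
    for i in range(len(series)):
        window = series[max(0, i - half): i + half + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


def paa(series, n_segments):
    """Piecewise Aggregate Approximation: the mean of each of n_segments chunks.

    Requires n_segments <= len(series) so that no chunk is empty.
    """
    n = len(series)
    reduced = []
    for s in range(n_segments):
        chunk = series[s * n // n_segments: (s + 1) * n // n_segments]
        reduced.append(sum(chunk) / len(chunk))
    return reduced


def knn_predict(train_x, train_y, query, k=1):
    """Majority vote among the k training vectors closest in Euclidean distance."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def transform(series, span, n_segments):
    """Smoothing followed by representation, applied before classification."""
    return paa(moving_average(series, span), n_segments)


if __name__ == "__main__":
    train = [[0, 0, 1, 1, 0, 0], [5, 5, 6, 6, 5, 5]]
    labels = ["low", "high"]
    train_x = [transform(s, span=3, n_segments=3) for s in train]
    query = transform([5, 6, 5, 6, 5, 6], span=3, n_segments=3)
    print(knn_predict(train_x, labels, query, k=1))
```

The parameters in braces in the table (for example `span`, `ns` and `k`) correspond to arguments of this kind, which is why the search has to tune them together with the choice of methods.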



Conclusions

The overall assessment indicates that μ-DE is a viable option for finding suitable models.
Using the best vector in the population as the base vector, coupled with bin crossover, is a good option for reaching better final results.
If the temporal database requires significant computational time to evaluate models, replacing the base vector with one chosen at random while keeping bin crossover may be a good choice for getting a competitive result faster.


In the convergence behavior, the rand variants achieved competitive results within a range of 150 to 200 generations.
Finally, the empirical comparison in this work showed that Sgolay, SAX with PCA, and KNN with Euclidean distance, all of them with suitable parameter values found by the μ-DEMS variants, were the most competitive methods for smoothing, representation and classification, respectively.


Future work

As future work, post-processing methods will be added to the encoding of the μ-DEMS variants.
Comparisons against other μ-EAs will be carried out, and other objectives (e.g., complexity of the model) will be considered.


Thank you!
