
Expert Systems with Applications 31 (2006) 652–660

www.elsevier.com/locate/eswa

Hybrid genetic algorithms and support vector machines for bankruptcy prediction

Sung-Hwan Min a,*, Jumin Lee b, Ingoo Han b

a Department of Business Administration, Hallym University, 39 Hallymdaehak-gil, Chuncheon Gangwon-do, 200-702, Korea
b Graduate School of Management, KAIST, 207-43 Cheongryangri-Dong, Dongdaemoon-Gu, Seoul 130-012, Korea

Abstract
Bankruptcy prediction is an important and widely studied topic since it can have significant impact on bank lending decisions and profitability.
Recently, the support vector machine (SVM) has been applied to the problem of bankruptcy prediction. The SVM-based method has been
compared with other methods such as the neural network (NN) and logistic regression, and has shown good results. The genetic algorithm (GA)
has been increasingly applied in conjunction with other AI techniques such as NN and Case-based reasoning (CBR). However, few studies have
dealt with the integration of GA and SVM, though there is a great potential for useful applications in this area. This study proposes methods for
improving SVM performance in two aspects: feature subset selection and parameter optimization. GA is used to optimize both a feature subset and
parameters of SVM simultaneously for bankruptcy prediction.
© 2005 Elsevier Ltd. All rights reserved.
Keywords: Support vector machines; Bankruptcy prediction; Genetic algorithms

1. Introduction
Bankruptcy prediction is an important and widely studied
topic since it can have significant impact on bank lending
decisions and profitability. Statistical methods and data mining
techniques have been used for developing more accurate
bankruptcy prediction models. The statistical methods include
regression, discriminant analysis, logistic models, factor analysis, etc. The data mining techniques include decision trees, neural networks (NNs), fuzzy logic, genetic algorithms (GA), support vector machines (SVM), etc.
Bankruptcy has been an important issue in accounting and finance, and it has been forecast by prediction models. The prediction is a kind of binary decision, in terms of two-class pattern recognition. Beaver (1966) originally proposed univariate analysis of financial ratios to predict bankruptcy, and many studies have followed to improve the decision with a variety of statistical methodologies. Linear discriminant analysis (Altman, 1968; Altman, Haldeman, & Narayanan, 1977), multiple regression (Meyer & Pifer, 1970),
* Corresponding author. Address: Department of Business Administration,
Hallym University, 39 Hallymdaehak-gil, Chuncheon Gangwon-do, 200-702,
Korea. Tel.: +82-33-248-1841; fax: +82-33-256-3424.
E-mail address: shmin@hallym.ac.kr (S.-H. Min).

0957-4174/$ - see front matter © 2005 Elsevier Ltd. All rights reserved.
doi:10.1016/j.eswa.2005.09.070

and logistic regression (Dimitras, Zanakis, & Zopounidis,


1996; Ohlson, 1980; Pantalone & Platt, 1987) have been
typically used for this purpose. However, the strict assumptions of traditional statistics, such as linearity, normality, independence among predictor variables, and a pre-existing functional form relating the criterion variable to the predictor variables, limit their application in the real world.
Recent AI approaches include inductive learning (Han, Chandler, & Liang, 1996; Shaw & Gentry, 1998), case-based reasoning (CBR) (Bryant, 1997; Buta, 1994), and neural networks (NNs) (Bortiz & Kennedy, 1995; Coakley & Brown, 2000; Jo & Han, 1996; Zhang, Hu, Patuwo, & Indro, 1999). In
the AI approach, NNs are powerful tools for pattern recognition
and pattern classification due to their nonlinear non-parametric
adaptive-learning properties. NNs have been used successfully
for many financial problems. Moreover, hybrid NN models for predicting bankruptcy with statistical and inductive learning methods (Liang, Chandler, & Hart, 1990) and SOFM (Lee, Han, & Kwon, 1996) have shown great results.
Recently, the SVM developed by Vapnik (1995) has become one of the methods receiving increasing attention, with remarkable results. The main difference between NN and SVM is the principle of risk minimization. While NN implements empirical risk minimization to minimize the error on the training data, SVM implements the principle of structural risk minimization by constructing an optimal separating hyperplane in the hidden feature space, using quadratic programming to find a unique solution. Originally, SVM was developed for pattern recognition problems, and it has been used for isolated handwritten digit recognition (Schölkopf, Burges, & Vapnik, 1995), text categorization (Joachims, 1997), speaker identification (Schmidt, 1996) and mechanical systems (Jack & Nandi, 2000). SVM has yielded excellent generalization performance that is significantly better than that of competing methods. In financial applications, time series prediction such as stock price index forecasting (Cao & Tay, 2001; Kim, 2004; Tay & Cao, 2001; Tay & Cao, 2002) and classification such as credit rating (Huang, Chen, Hsu, Chen, & Wu, 2004) and bankruptcy (Fan & Palaniswami, 2000; Van Gestel et al., 2003) are the main application areas of SVM.
On the other side, hybrid models have also advanced along with these single prediction models. One of the popular hybrid models uses GA. GA has been increasingly applied in conjunction with other AI techniques such as NN and CBR. However, few studies have dealt with the integration of GA and SVM, although there is great potential for useful applications in this area. This paper focuses on the improvement of the SVM-based model by means of the integration of GA and SVM.
This study presents methods for improving SVM performance in two aspects: feature subset selection and parameter optimization. GA is used to optimize both the feature subset and parameters of SVM simultaneously for bankruptcy prediction. This paper applies the proposed GA-SVM model to the bankruptcy prediction problem using a real data set from Korean companies.
The rest of this paper is organized as follows: The next
section describes the background. Section 3 presents the
proposed model. Sections 4 and 5 explain the experimental
design and the results of the evaluation experiment. The final
section presents the summary and future research issues.




2. Research background


2.1. Support vector machine (SVM)


The SVM developed by Vapnik (1995) implements the principle of structural risk minimization by constructing an optimal separating hyperplane $w \cdot x + b = 0$. To find the optimal hyperplane $\{x : (w \cdot x) + b = 0\}$, the norm of the vector $w$ needs to be minimized; on the other hand, the margin $1/\|w\|$ between the two classes should be maximized, subject to the canonical normalization

$$\min_{i=1,\ldots,n} \left| (w \cdot x_i) + b \right| = 1.$$

The solution for the typical two-class linear problem has the form shown in Fig. 1. The circled points are called support vectors, for which $y_i(x_i \cdot w + b) = 1$ holds, and they confine the margin; moving any of them would change the hyperplane normal vector $w$.

Fig. 1. Linear separating hyperplanes for the separable case (the support vectors are circled).
In the non-linear case, we first map the data to some other Euclidean space $H$ using a mapping $\Phi: \mathbb{R}^d \to H$. Then, instead of the dot product, a kernel function $K$ is used such that $K(x_i, x_j) = \Phi(x_i) \cdot \Phi(x_j)$. There are several kernel functions:

Simple dot product: $K(x, y) = x \cdot y$

Vovk's polynomial: $K(x, y) = (x \cdot y + 1)^p$

Radial basis function (RBF): $K(x, y) = e^{-\|x - y\|^2 / 2\sigma^2}$

Two-layer neural network: $K(x, y) = \tanh(\kappa\, x \cdot y - \delta)$

Using the dual problem, the quadratic programming problem can be re-written as

$$Q(\alpha) = \sum_{i=1}^{l} \alpha_i - \frac{1}{2}\sum_{i,j=1}^{l} \alpha_i \alpha_j y_i y_j K(x_i, x_j)$$

subject to $0 \le \alpha_i \le C$ and $\sum_{i=1}^{l} \alpha_i y_i = 0$, with the decision function

$$f(x) = \mathrm{sgn}\Big(\sum_{i=1}^{l} y_i \alpha_i K(x, x_i) + b\Big).$$

In this paper, we define the bankruptcy problem as a non-linear problem and use the RBF kernel to optimize the hyperplane.
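As a rough illustration of this setup (not the authors' code), the following Python sketch trains an RBF-kernel SVM with scikit-learn on synthetic data. scikit-learn's `gamma` plays the role of $1/(2\sigma^2)$ in the RBF kernel above; the data and the parameter values C and σ² are hypothetical.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 6))      # hypothetical financial ratios
y_train = rng.integers(0, 2, size=200)   # 1 = bankrupt, 0 = non-bankrupt

C, sigma2 = 50.0, 10.0                   # illustrative parameter values
clf = SVC(C=C, kernel="rbf", gamma=1.0 / (2.0 * sigma2))
clf.fit(X_train, y_train)
print("training accuracy:", clf.score(X_train, y_train))
```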

2.2. Genetic algorithm (GA)


The genetic algorithm (GA) is an artificial intelligence procedure based on the theory of natural selection and evolution. GA uses the idea of survival of the fittest by progressively accepting better solutions to the problem. It is inspired by and named after the biological processes of inheritance, mutation, natural selection, and the genetic crossover that occurs when parents mate to produce offspring (Goldberg, 1989). GA differs from conventional non-linear optimization techniques in that it searches by maintaining a population (or database) of solutions from which better solutions are created, rather than making incremental changes to a single solution to the problem. GA simultaneously maintains a large number of candidate solutions to a problem, called a population.
The key feature of GA is the manipulation of a population
whose individuals are characterized by possessing a
chromosome.


Two important issues in GA are the genetic coding used to define the problem and the evaluation function, called the fitness function. Each individual solution in GA is represented by a string called the chromosome. The initial solution population can be generated randomly; it evolves into the next generation through genetic operators such as selection, crossover and mutation. The solutions coded by strings are evaluated by the fitness function. The selection operator allows strings with higher fitness to appear with higher probability in the next generation (Holland, 1975; Mitchell, 1996). Crossover is performed between two selected individuals, called parents, by exchanging parts of their strings, starting from a randomly chosen crossover point. This operator tends to enable the evolutionary process to move toward promising regions of the search space. Mutation is used to search the problem space further and to avoid local convergence of GA (Tang, Man, & Kwong, 1996).
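The following Python sketch illustrates, under our own simplifying assumptions, the fitness-proportional (roulette-wheel) selection, one-point crossover and bit-flip mutation operators described above; the probabilities `pc` and `pm` are illustrative, not the paper's settings.

```python
import random

def roulette_select(population, fitnesses):
    """Select one chromosome with probability proportional to fitness."""
    total = sum(fitnesses)
    r = random.uniform(0, total)
    acc = 0.0
    for chrom, fit in zip(population, fitnesses):
        acc += fit
        if r <= acc:
            return chrom
    return population[-1]

def one_point_crossover(parent1, parent2, pc=0.7):
    """Exchange string tails after a random cut point with probability pc."""
    if random.random() < pc:
        cut = random.randint(1, len(parent1) - 1)
        return parent1[:cut] + parent2[cut:], parent2[:cut] + parent1[cut:]
    return parent1[:], parent2[:]

def mutate(chrom, pm=0.01):
    """Flip each bit with probability pm."""
    return [1 - b if random.random() < pm else b for b in chrom]
```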
GA has been extensively researched and applied to many
combinatorial optimization problems. Furthermore GA has
been increasingly applied in conjunction with other AI
techniques such as NN and CBR. Various problems of neural
network design have been optimized using GA. GA has also
been used in conjunction with CBR to select relevant input
variables and to tune the parameters of CBR (Bishop, Bushnell,
Usher, & Westland, 1993). However, few studies have dealt with the integration of GA and SVM, though there is great potential for useful applications in this area.


3. Hybrid GA-SVM model


This study presents methods for improving the performance
of SVM in two aspects: feature subset selection and parameter
optimization. GA is used to optimize both the feature subset
and parameters of SVM simultaneously for bankruptcy
prediction.
3.1. Optimizing the feature subset
Feature subset selection is essentially an optimization problem that involves searching the space of possible features to find a subset that is optimal or near-optimal with respect to a certain performance measure, such as accuracy. In a classification problem, the selection of features is important for many reasons: good generalization performance, running-time requirements, and constraints imposed by the problem itself.
In the literature there are two known general approaches to the feature selection problem: the filter approach and the wrapper approach (Sun, Bebis, & Miller, 2004). The distinction is made depending on whether feature subset selection is done independently of the learning algorithm used to construct the classifier (filter) or not (wrapper). In the filter approach, feature selection is performed before applying the classifier to the selected feature subset; it is computationally more efficient than the wrapper approach. The wrapper approach trains the classifier system with a given feature subset as an input and estimates the classification error using a validation set. Although this is a slower procedure, the features selected are usually better suited to the classifier employed.
In the bankruptcy prediction problem, feature subset selection plays an important role in the performance of prediction, and its importance increases when the number of features is large. This paper seeks to improve the SVM-based bankruptcy prediction model: we propose GA as the method of feature subset selection in the SVM system, using the wrapper approach to select the optimal feature subset of the SVM model.

3.2. Optimizing the parameters of SVM

One of the big problems in SVM is the selection of parameter values that will allow good performance. Selecting appropriate values for the parameters of SVM plays an important role in its performance, but it is not known beforehand which values are best for a given problem, so optimizing the parameters of SVM is crucial for the best prediction performance.
This paper proposes GA as the method of optimizing the parameters of SVM. In this paper, the radial basis function (RBF) is used as the kernel function for bankruptcy prediction. There are two parameters when using RBF kernels: C and d². These two parameters play an important role in the performance of SVMs (Tay & Cao, 2001). In this study, C and d² are encoded as binary strings and optimized by GA.
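The paper does not give the decoding rule or the search ranges for these binary strings, so the following Python sketch is only one plausible scheme: an 8-bit string mapped linearly onto an assumed range.

```python
def decode_bits(bits, lo, hi):
    """Map a binary string (list of 0/1 bits) linearly onto [lo, hi]."""
    value = int("".join(map(str, bits)), 2)
    return lo + (hi - lo) * value / (2 ** len(bits) - 1)

# 8 bits each for C and d² (sigma²); the search ranges are illustrative only.
c_bits = [0, 0, 1, 1, 0, 0, 1, 0]
d_bits = [0, 0, 0, 0, 1, 0, 1, 0]
C  = decode_bits(c_bits, 1.0, 250.0)   # ~49.8
d2 = decode_bits(d_bits, 1.0, 200.0)   # ~8.8
```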

3.3. Simultaneous optimization of SVM using GA


In general, the choice of the feature subset has an influence
on the appropriate kernel parameters and vice versa. Therefore
the feature subset and parameters of SVM need to be optimized
simultaneously for the best prediction performance.
Fig. 2 shows the overall procedure of the proposed model
which optimizes both the feature subset and parameters of
SVM simultaneously for bankruptcy prediction. The procedure starts with randomly generated chromosomes, which represent the feature subset and the parameter values of SVM. Each new chromosome is evaluated by sending it to the SVM model, which uses that feature subset and those parameter values to obtain the performance measure (e.g. hit ratio). This performance measure is used as the fitness function and is evolved by GA.
The chromosomes for the feature subset are encoded as binary strings standing for a subset of the original feature set. Each bit of the chromosome represents whether the corresponding feature is selected or not: a 1 in a bit means the corresponding feature is selected, whereas a 0 means it is not. The chromosomes for the parameters of SVM are encoded as a 16-bit string consisting of 8 bits standing for C and 8 bits standing for d². Fig. 3 shows examples of encoding for GA.

Fig. 3. Examples of encoding for GA.
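As an illustration only (the paper does not publish its implementation), the following sketch decodes such a combined chromosome and computes the fitness as the hit ratio of an SVM trained on one portion of the training data and scored on a held-out portion (called T_1 and T_2 later in this section). It reuses the hypothetical `decode_bits` helper from Section 3.2; the search ranges and the zero fitness assigned to an empty feature subset are our assumptions.

```python
import numpy as np
from sklearn.svm import SVC

def fitness(chromosome, X_t1, y_t1, X_t2, y_t2, n_features):
    """Hit ratio on T_2 of an SVM trained on T_1 with the encoded
    feature subset and parameters (a sketch of the paper's fitness)."""
    feat_bits = chromosome[:n_features]
    c_bits = chromosome[n_features:n_features + 8]
    d_bits = chromosome[n_features + 8:n_features + 16]
    mask = np.array(feat_bits, dtype=bool)
    if not mask.any():                        # empty subset: worst fitness
        return 0.0
    C  = decode_bits(c_bits, 1.0, 250.0)      # assumed ranges
    d2 = decode_bits(d_bits, 1.0, 200.0)
    clf = SVC(C=C, kernel="rbf", gamma=1.0 / (2.0 * d2))
    clf.fit(X_t1[:, mask], y_t1)
    return clf.score(X_t2[:, mask], y_t2)     # average prediction accuracy
```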
Each of the selected feature subsets and parameter values is evaluated using SVM. This process is iterated until the best feature subset and parameter values are found.

Fig. 2. Overall procedure of GA-SVM.

The data set is divided into a training set and a validation portion. The training set (T) consists of both T_1 and T_2. GA evolves a number of populations. Each population consists of sets of features of a given size and the values of the parameters. The fitness of an individual of the population is based on the performance of SVM: SVM is trained on T_1 using only the features and the parameter values of the individual, and the fitness is the average prediction accuracy of SVM over T_2. At each generation

new individuals are created and inserted into the population by selecting fit parents, which are mutated and recombined.

Table 1
Steps of GA-SVM

Step 1. Define the string (or chromosome): V1i = (s, t, …, r) (features of SVM are encoded into chromosomes); V2i (parameters of SVM are encoded into chromosomes).
Step 2. Define the population size (Npop), probability of crossover (Pc) and probability of mutation (Pm).
Step 3. Generate a binary-coded initial population of Npop chromosomes randomly.
Step 4. While the stopping condition is false, do Steps 4–8.
Step 5. Decode the jth chromosome (j = 1, 2, …, Npop) to obtain the corresponding feature subset V1j and parameters V2j.
Step 6. Apply V1j and V2j to the SVM model to compute the output, Ok.
Step 7. Evaluate the fitness Fj of the jth chromosome using Ok (fitness function: average predictive accuracy).
Step 8. Calculate the total fitness of the population: $TF = \sum_{i=1}^{N_{pop}} F_i(V1_i, V2_i)$.
Step 9. Reproduction:
9.1 Compute $q_i = F_i(V1_i, V2_i)/TF$.
9.2 Calculate the cumulative probability.
9.3 Generate a random number r in [0, 1]. If r < q1, select the first string (V11, V21); otherwise, select the jth string such that $q_{j-1} < r \le q_j$.
Step 10. Generate the offspring population by performing crossover and mutation on parent pairs:
10.1 Crossover: generate a random number r1 in [0, 1] for a new string. If r1 < Pc, operate crossover.
10.2 Mutation: generate a random number r2 in [0, 1] and select the bit for mutation randomly. If r2 < Pm, operate mutation for the bit.
Step 11. Stop the iterative steps when the terminal condition is reached.
The fitness function is represented mathematically as follows:

$$\mathrm{Fitness} = \frac{1}{n}\sum_{i=1}^{n} H_i$$

where $H_i$ is 1 if the actual output equals the predicted value of the SVM model, and 0 otherwise.
During the evolution, the simple crossover operator (traditional one-point crossover) is used, and the mutation operator simply flips a specific bit. With the elite survival strategy, we preserve the elite set not only between generations but also in the crossover and mutation operations, so that we obtain the full benefit of the GA operation. The details of the proposed model in algorithmic form are given in Table 1.
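The following sketch shows one plausible reading of this elite survival strategy, reusing the operator sketches from Section 2.2; the elite size `n_elite` and the rates `pc` and `pm` are illustrative, as the paper does not report its GA settings.

```python
def next_generation(population, fitnesses, n_elite=2, pc=0.7, pm=0.01):
    """One generation with elite survival: the best chromosomes are copied
    unchanged; the rest are produced by selection, crossover and mutation."""
    ranked = sorted(zip(fitnesses, population), key=lambda t: t[0], reverse=True)
    elites = [chrom[:] for _, chrom in ranked[:n_elite]]
    offspring = list(elites)
    while len(offspring) < len(population):
        p1 = roulette_select(population, fitnesses)
        p2 = roulette_select(population, fitnesses)
        c1, c2 = one_point_crossover(p1, p2, pc)
        offspring.extend([mutate(c1, pm), mutate(c2, pm)])
    return offspring[:len(population)]
```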

4. Experimental design
The research data used in this study is obtained from a commercial bank in Korea. The data set contains 614 externally non-audited medium-size light-industry firms. Among the cases, 307 companies went bankrupt and filed for bankruptcy between 1999 and 2002. Initially, 32 financial ratios categorized as stability, profitability, growth, activity and cash flow are investigated through a literature review and basic statistical methods.
Out of the total of 32 financial ratios, four feature subsets are selected for the experiment. The selected variables and feature subsets are shown in Table 2. In Table 2, 32FS represents all financial ratios. 30FS denotes the 30 financial ratios selected by an independent-sample t-test between each financial ratio as an input variable and bankruptcy or non-bankruptcy as the output variable. 12FS and 6FS represent the feature subsets selected by the stepwise logistic regression and stepwise MDA methods, respectively.

Table 2
Variables and feature subsets

(The original table lists the 32 candidate financial ratios by category: stability, profitability, growth, activity and cash flow, and marks each variable's membership in the feature subsets 6FS, 12FS, 30FS and 32FS as well as its selection by GA-SVM.)

Variables: Quick ratio; Debt/total asset; Debt repayment coefficient; Debt ratio; Equity capital ratio; Debt/total asset; Cash ratio; Financial expenses to sales; Operating income/net interest expenses; Financial expenses to debt ratio; Net financing cost/sales; Time interest earned (interest cover); Ordinary income of total asset; Return on total asset; (Operation profit + non-operation profit)/capital; Net income/capital; EBIT/interest cost; EBITDA/interest cost; Sales increase ratio; Growth rate of sales; Net profit increase rate; Inventory change/sales; Account receivable change/sales; Working capital change/sales; Operating asset change/sales; Cash operational income/debt; Cash operational income/interest expenses; Debt service coverage ratio; Cash flow from operating activity/debt; Cash flow from operating activity/interest expenses; Cash flow after interest payment/debt; Cash flow after interest payment/interest expenses.

Table 3
Classification accuracies (%) of various parameters in pure SVM using various feature subsets

(The original table reports training (Tr) and validation (Val) accuracies for each feature subset (6FS, 12FS, 30FS, 32FS) over the parameter grid C ∈ {1, 10, 30, 50, 70, 90, 100, 150, 200, 250} and d² ∈ {1, 10, 30, 50, 100, 200}. Tr, training data set; Val, validation data set.)


We use the term GA-SVM for the proposed model, which performs simultaneous optimization of SVM using GA. The data set for GA-SVM is separated into two parts, the training set and the validation set, with ratios of about 0.7 and 0.3. The training data for NN and GA-SVM is divided into two portions again: one for training the model and the other for avoiding overfitting. Additionally, to evaluate the effectiveness of the proposed model, we compare three different models with arbitrarily selected parameter values and a given feature subset: the first model, labeled LR, uses logistic regression; the second model, labeled NN, uses a neural network; and the third model, labeled Pure SVM, uses a standalone SVM.
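A minimal sketch of this data split, assuming scikit-learn's `train_test_split` with stratified sampling; the outer 0.7/0.3 ratio is from the paper, while the inner split of the training set into T_1 and T_2 and the data shapes are assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(614, 32))      # placeholder for the 32 financial ratios
y = np.array([0, 1] * 307)          # placeholder bankruptcy labels (307 each)

# Outer split: ~0.7 training (T) and ~0.3 validation, as in the paper.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)

# Inner split of T into T_1 (model fitting) and T_2 (GA fitness);
# the inner ratio is an assumption, not stated in the paper.
X_t1, X_t2, y_t1, y_t2 = train_test_split(
    X_train, y_train, test_size=0.3, stratify=y_train, random_state=42)
```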
Fig. 4. Accuracy of the validation set when d² = 10.

5. Experimental results
5.1. Sensitivity of pure SVM to feature sets and parameters

Fig. 5. Accuracy of 30FS's training set (Tr) and validation set (Val) when d² = 30.

Fig. 6. Accuracy of 12FS's training set (Tr) and validation set (Val) when C = 70.

Table 3 shows the classification accuracies for various parameters of SVM using various feature subsets. In Table 3, the parameter values giving the best prediction using 6FS are d² = 10 and C = 50. However, with those parameter values, the prediction using 12FS is poor. Fig. 4 shows the sensitivity of SVM to various feature subsets and parameters. The experimental results show that the prediction performance of SVM is sensitive not only to the feature subset but also to the parameters. Thus, this result shows that simultaneous optimization of the feature subset and parameters is needed for the best prediction.
Fig. 5 shows one of the results of SVM with 30FS where d² is fixed at 30 as C increases. As Tay and Cao (2001) mentioned, we can observe that a large value of C would over-fit the training data.
Fig. 6 shows the result of SVM with 12FS where C is fixed at 70 and d² increases. We can observe that a small value of d² would over-fit the training data, and that d² plays an important role in the generalization performance of SVM. These results also support Tay and Cao (2001).


5.2. Results of GA-SVM


Table 2 shows the feature subset selected by GA. Table 4 describes the average prediction accuracy of each model. For Pure SVM, we use the parameters that performed best on the validation set among the results in Table 3. As shown in Table 4, the proposed model performs better than the other models. McNemar tests are used to examine whether the proposed model significantly outperforms the other models. This is a nonparametric test for two related samples using the chi-square distribution. The McNemar test assesses the significance of the difference between two dependent samples when the variable of interest is dichotomous; it is useful for detecting changes in responses due to experimental intervention in before-and-after designs (Siegel, 1956).
Table 4
Average prediction accuracy

        LR                     NN                     Pure SVM               GA-SVM
        Training  Validation   Training  Validation   Training  Validation   Training  Validation
32FS    78.13     68.18        79.57     68.18        82.45     72.22        86.53     80.30
30FS    80.53     67.68        78.85     69.19        84.86     71.72
12FS    66.83     68.69        79.81     69.19        81.73     72.73
6FS     76.92     70.71        75.48     71.72        81.01     74.75

All values in %. GA-SVM selects its own feature subset, so a single training/validation result is reported for it.

Table 5
p-values for the validation data

            NN       Pure SVM    GA-SVM
LR          0.727    0.115       0.002**
NN                   0.263       0.004**
Pure SVM                         0.082*

* Significant at the 10% level; ** significant at the 1% level.

Table 5 shows the results of the McNemar test. As shown in Table 5, GA-SVM outperforms LR and NN at the 1% statistical significance level, and Pure SVM at the 10% significance level. The other models do not significantly outperform one another.
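As an illustration (not the authors' code), the following sketch computes the McNemar p-value from the paired validation predictions of two classifiers, assuming the standard continuity-corrected chi-square form; whether the original study applied the correction is not stated.

```python
import numpy as np
from scipy.stats import chi2

def mcnemar_p(y_true, pred_a, pred_b):
    """p-value of the McNemar chi-square test (continuity-corrected) on
    the discordant pairs of two classifiers scored on the same cases."""
    a_right = pred_a == y_true
    b_right = pred_b == y_true
    n01 = int(np.sum(a_right & ~b_right))   # A correct, B wrong
    n10 = int(np.sum(~a_right & b_right))   # A wrong, B correct
    if n01 + n10 == 0:                      # no discordant pairs
        return 1.0
    stat = (abs(n01 - n10) - 1) ** 2 / (n01 + n10)
    return float(chi2.sf(stat, df=1))       # one degree of freedom
```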

6. Conclusion
Bankruptcy prediction is an important and widely studied
topic since it can have significant impact on bank lending
decisions and profitability. Recently, the support vector
machine (SVM) has been applied to the problem of bankruptcy
prediction. The SVM-based model has been compared with
other methods such as the neural network (NN) and logistic
regression, and has shown good results. However, few studies have dealt with the integration of GA and SVM, although there is great potential for useful applications in this area. This paper focuses on the improvement of the SVM-based model by means of the integration of GA and SVM.
This study presents methods for improving SVM performance in two aspects: feature subset selection and parameter optimization. GA is used to optimize both the feature subset and parameters of SVM simultaneously for bankruptcy prediction. This paper applies the proposed GA-SVM model to the bankruptcy prediction problem using a real data set from Korean companies.
We evaluated the proposed model using the real data set
and compared it with other models. The results showed that
the proposed model was effective in finding the optimal
feature subset and parameters of SVM, and that it improved
the prediction of bankruptcy. The results also demonstrate
that the choice of the feature subset has an influence on the
appropriate kernel parameters and vice versa.
For future work, we intend to optimize the kernel function,
parameters and feature subset simultaneously. We would also
like to expand this model to apply to instance selection
problems.
Acknowledgements
This work was supported by the Research Grant from
Hallym University, Korea.
References

Altman, E. I. (1968). Financial ratios, discriminant analysis and the prediction of corporate bankruptcy. The Journal of Finance, 23(3), 589–609.
Altman, E. I., Haldeman, R., & Narayanan, P. (1977). A new model to identify bankruptcy risk of corporations. Journal of Banking and Finance, 1, 29–54.
Beaver, W. (1966). Financial ratios as predictors of failure, empirical research in accounting: Selected studies. Journal of Accounting Research, 71–111.
Bishop, J. M., Bushnell, M. J., Usher, A., & Westland, S. (1993). Genetic optimization of neural network architectures for color recipe prediction. In Artificial neural networks and genetic algorithms (pp. 719–725). New York: Springer.
Bortiz, J. E., & Kennedy, D. B. (1995). Effectiveness of neural network types for prediction of business failure. Expert Systems with Applications, 9(4), 503–512.
Bryant, S. M. (1997). A case-based reasoning approach to bankruptcy prediction modeling. International Journal of Intelligent Systems in Accounting, Finance and Management, 6(3), 195–214.
Buta, P. (1994). Mining for financial knowledge with CBR. AI Expert, 9(10), 34–41.
Cao, L., & Tay, F. E. H. (2001). Financial forecasting using support vector machines. Neural Computing & Applications, 10, 184–192.
Coakley, J. R., & Brown, C. E. (2000). Artificial neural networks in accounting and finance: Modeling issues. International Journal of Intelligent Systems in Accounting, Finance and Management, 9(2), 119–144.
Dimitras, A. I., Zanakis, S. H., & Zopounidis, C. (1996). A survey of business failure with an emphasis on prediction methods and industrial applications. European Journal of Operational Research, 90(3), 487–513.
Fan, A., & Palaniswami, M. (2000). Selecting bankruptcy predictors using a support vector machine approach. In Proceedings of the international joint conference on neural networks (Vol. 6, pp. 354–359).
Goldberg, D. E. (1989). Genetic algorithms in search, optimization and machine learning. New York: Addison-Wesley.
Han, I., Chandler, J. S., & Liang, T. P. (1996). The impact of measurement scale and correlation structure on classification performance of inductive learning and statistical methods. Expert Systems with Applications, 10(2), 209–221.
Holland, J. H. (1975). Adaptation in natural and artificial systems. Ann Arbor, MI: The University of Michigan Press.
Huang, Z., Chen, H., Hsu, C.-J., Chen, W.-H., & Wu, S. (2004). Credit rating analysis with support vector machines and neural networks: A market comparative study. Decision Support Systems, 37, 543–558.
Jack, L. B., & Nandi, A. K. (2000). Support vector machines for detection and characterization of rolling element bearing faults. Proceedings of the Institution of Mechanical Engineers, Part C: Journal of Mechanical Engineering Science, 215, 1065–1074.
Jo, H., & Han, I. (1996). Integration of case-based forecasting, neural network, and discriminant analysis for bankruptcy prediction. Expert Systems with Applications, 11(4), 415–422.
Joachims, T. (1997). Text categorization with support vector machines. Technical report, LS VIII Number 23, University of Dortmund.
Kim, K. (2004). Financial time series forecasting using support vector machines. Neurocomputing, 55, 307–319.
Lee, K. C., Han, I., & Kwon, Y. (1996). Hybrid neural network models for bankruptcy predictions. Decision Support Systems, 18, 63–72.
Liang, T., Chandler, J., & Hart, I. (1990). Integrating statistical and inductive learning methods for knowledge acquisition. Expert Systems with Applications, 1, 391–401.
Meyer, P. A., & Pifer, H. (1970). Prediction of bank failures. The Journal of Finance, 25, 853–868.
Mitchell, M. (1996). An introduction to genetic algorithms. Cambridge, MA: MIT Press.
Ohlson, J. (1980). Financial ratios and the probabilistic prediction of bankruptcy. Journal of Accounting Research, 18(1), 109–131.
Pantalone, C., & Platt, M. B. (1987). Predicting commercial bank failure since deregulation. New England Economic Review, 37–47.
Schmidt, M. S. (1996). Identifying speakers with support vector networks. In Interface '96 proceedings, Sydney.
Schölkopf, B., Burges, C., & Vapnik, V. (1995). Extracting support data for a given task. In U. M. Fayyad & R. Uthurusamy (Eds.), Proceedings, first international conference on knowledge discovery & data mining. Menlo Park, CA: AAAI Press.
Shaw, M., & Gentry, J. (1998). Using an expert system with inductive learning to evaluate business loans. Financial Management, 17(3), 45–56.
Siegel, S. (1956). Nonparametric statistics for the behavioral sciences. New York: McGraw-Hill.
Sun, Z., Bebis, G., & Miller, R. (2004). Object detection using feature subset selection. Pattern Recognition, 37, 2165–2176.
Tang, K. S., Man, K. F., Kwong, S., & He, Q. (1996). Genetic algorithms and their applications. IEEE Signal Processing Magazine, 13, 22–37.
Tay, F. E. H., & Cao, L. (2001). Application of support vector machines in financial time series forecasting. Omega, 29, 309–317.
Tay, F. E. H., & Cao, L. (2002). Modified support vector machines in financial time series forecasting. Neurocomputing, 48, 847–861.
Van Gestel, T., Baesens, B., Suykens, J., Espinoza, M., Baestaens, D.-E., Vanthienen, J., et al. (2003). Bankruptcy prediction with least squares support vector machine classifiers. In Computational intelligence for financial engineering: Proceedings of the 2003 IEEE international conference (pp. 1–8).
Vapnik, V. N. (1995). The nature of statistical learning theory. New York: Springer.
Zhang, G., Hu, M. Y., Patuwo, B. E., & Indro, D. C. (1999). Artificial neural networks in bankruptcy prediction: General framework and cross-validation analysis. European Journal of Operational Research, 116(1), 16–32.
