
Adaptation, Learning, and Optimization 18

Iztok Fister
Iztok Fister Jr. Editors

Adaptation and
Hybridization in
Computational
Intelligence
Adaptation, Learning, and Optimization

Volume 18

Series editors
Meng-Hiot Lim, Nanyang Technological University, Singapore
e-mail: emhlim@ntu.edu.sg
Yew-Soon Ong, Nanyang Technological University, Singapore
e-mail: asysong@ntu.edu.sg
About this Series

The roles of adaptation, learning and optimization are becoming increasingly essential and intertwined. The capability of a system to adapt either through modification
of its physiological structure or via some revalidation process of internal mecha-
nisms that directly dictate the response or behavior is crucial in many real world
applications. Optimization lies at the heart of most machine learning approaches
while learning and optimization are two primary means to effect adaptation in var-
ious forms. They usually involve computational processes incorporated within the
system that trigger parametric updating and knowledge or model enhancement, giv-
ing rise to progressive improvement. This book series serves as a channel to con-
solidate work related to topics linked to adaptation, learning and optimization in
systems and structures. Topics covered under this series include:
complex adaptive systems including evolutionary computation, memetic computing, swarm intelligence, neural networks, fuzzy systems, tabu search, simulated annealing, etc.
machine learning, data mining & mathematical programming
hybridization of techniques that span across artificial intelligence and computational intelligence for synergistic alliance of strategies for problem-solving
aspects of adaptation in robotics
agent-based computing
autonomic/pervasive computing
dynamic optimization/learning in noisy and uncertain environments
systemic alliance of stochastic and conventional search techniques
all aspects of adaptation in man-machine systems.

This book series bridges the dichotomy of modern and conventional mathematical
and heuristic/meta-heuristic approaches to bring about effective adaptation, learning and optimization. It propels the maxim that the old and the new can come together and be combined synergistically to scale new heights in problem-solving. To reach such a level, numerous research issues will emerge, and researchers will find the book series a convenient medium to track the progress made.

More information about this series at http://www.springer.com/series/8335


Iztok Fister · Iztok Fister Jr.
Editors

Adaptation and Hybridization
in Computational Intelligence
Editors

Iztok Fister
Faculty of Electrical Engineering and Computer Science
University of Maribor
Maribor, Slovenia

Iztok Fister Jr.
Faculty of Electrical Engineering and Computer Science
University of Maribor
Maribor, Slovenia

ISSN 1867-4534 ISSN 1867-4542 (electronic)


Adaptation, Learning, and Optimization
ISBN 978-3-319-14399-6 ISBN 978-3-319-14400-9 (eBook)
DOI 10.1007/978-3-319-14400-9
Library of Congress Control Number: 2014958757

Springer Cham Heidelberg New York Dordrecht London


© Springer International Publishing Switzerland 2015
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of
the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology
now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, express or implied, with respect to the material contained herein or for any
errors or omissions that may have been made.

Printed on acid-free paper

Springer International Publishing AG Switzerland is part of Springer Science+Business Media


(www.springer.com)
Preface

The rapid development of digital computers has given computer science new momentum with the emergence of computational intelligence (CI). In line with this, a number of algorithms have been developed, competing to fulfill an eternal human desire: to create an algorithm capable of solving all the problems that humans are confronted with. Unfortunately, this desire was reduced to ashes by the No-Free-Lunch (NFL) theorem. In place of general problem solvers, specific nature-inspired algorithms incorporate domain-specific knowledge to solve problems with sufficient accuracy in real time. Problems that earlier could not be tackled because of the time and space limitations of digital computers can nowadays be solved thanks to rapid advances in hardware.
The most powerful algorithms today are based on the following inspirations from nature:
the human brain,
Darwinian evolution, and
the behavior of some species of social insects (e.g., bees, ants, termites, etc.) and animals (birds, dolphins, bats, etc.).
The first inspiration led to the origin of artificial neural networks (ANNs), the second to evolutionary algorithms (EAs), and the third to swarm intelligence (SI). Unfortunately, even with all of the mentioned algorithms we are nowhere near our ideal of finding the general problem solver. For instance, despite the initial success of ANNs in the eighties of the past millennium, it has been shown that equating the capacity of the human brain with the capacity of computer memory leads to a dead end. Nowadays, computer capacity exceeds the capacity of the human brain, but computers remain a long way from its intelligence. In line with this, the question is how intelligent these nature-inspired algorithms really are.
However, to answer this we must first ask what intelligence actually is. An answer can be found in Piaget's book "The Psychology of Intelligence", which argues that intelligence is not passive but arises when the organism interacts with its environment. Intelligence is adaptive in nature, where adaptation is described as an equilibrium between the action of the organism on the environment and vice versa.

Claparède and Stern view intelligence as a mental adaptation to new circumstances. They divide the mental structures of intelligence into instincts, trial-and-error, and habits. Thus, the origin of intelligence is the most elementary empirical trial-and-error, which in essence characterizes a search for a hypothesis. The hypothesis, together with a problem and control, represents the marks of intelligence.
Trial-and-error is a method of problem solving characterized by repeated attempts until a valid solution is found. Typically, this method is characteristic of children discovering the elementary principles of the world. During this process they acquire new knowledge based on their own experience, i.e., they learn from their own mistakes. This method is the fundamental metaphor of computational intelligence as well. In the sense that CI algorithms search for hypotheses while solving problems, and thus adapt to the demands of the problem to be solved, these algorithms are reasonably referred to as computationally intelligent.
In fact, adaptation is a common characteristic of the nature-inspired algorithms considered in this book. However, adaptation is treated differently in each of these algorithms. For instance, ANNs employ learning, minimizing the error rate obtained as feedback from the learning process. On the other hand, EAs and SI-based algorithms adapt to the dynamic demands of the problems either by searching for proper parameter settings or even by using different strategies for exploring the search space at different stages of the search process.
However, these algorithms are too general for solving all the real-world problems with which humans are confronted. In order to bring the solution as near as possible to the global optimum, domain-specific knowledge must be incorporated into the algorithm's structures. Usually, domain-specific knowledge is conveyed to the nature-inspired algorithm via traditional construction heuristics or local search. In general, such a hybridized algorithm solves the specific problem better than the traditional one. In such so-called meta-heuristic algorithms, the nature-inspired algorithm plays the role of generating new solutions, while the incorporated heuristic solves the problem in the traditional way. This way of solving problems is also known as generate-and-test.
In summary, the book walks through recent advances in adaptation and hybridization in the CI domain. It consists of ten chapters divided into three parts. The first part (Chapter 1) provides background information and theoretical foundations of the CI domain, the second part (Chapters 2-6) deals with adaptation in CI algorithms, while the third part (Chapters 7-10) focuses on hybridization in CI. The emphasis of the book is on nature-inspired CI algorithms, i.e., ANNs, EAs, and SI-based algorithms. All chapters in the second and third parts are ordered according to the classification adopted in Chapter 1. Short descriptions of the chapters' contents follow.
Chapter 1 presents the background of adaptation and hybridization in the CI domain. It focuses especially on the three already mentioned kinds of nature-inspired algorithms. In line with this, the biological foundations of adaptation, as the basis for speciation (i.e., the formation of new species), are illustrated by the example of the adaptation of Darwin's finches. Then, the foundations of nature-inspired algorithms are presented in terms of the natural phenomena whose behavior the corresponding nature-inspired algorithms imitate. However, the emphasis is on the adaptation and hybridization of these CI algorithms. Finally, recent advances captured in the literature tackling adaptation and hybridization are briefly surveyed.
Chapter 2 gives an overview of adaptive and self-adaptive mechanisms within DE algorithms. This review shows that these methods are mainly based on controlling the mutation and crossover parameters, and less on the population size. Additionally, the chapter proposes a new self-adaptive jDE algorithm using two strategies, selected with the same probability during the run.
Chapter 3 takes a closer look at the self-adaptation of control parameters in evolution strategies (ES). In line with this, an analysis of classical mutation operators, like uncorrelated mutation with one step size and uncorrelated mutation with n step sizes, is performed. Additionally, an uncorrelated mutation with n 4-dimensional step vectors is proposed, where each 4-dimensional vector consists of a problem variable, a mutation strength, a shift angle moving the location of the normal distribution, and a sign reversing the change. This means that changing the position of each problem variable in the search space depends on three control parameters modified in each generation.
Chapter 4 reviews the most relevant adaptive techniques proposed in papers tackling cooperative co-evolution (CC) in EAs. CC divides the whole population into sub-populations that explore the search space in different directions. These sub-populations cooperate by exchanging information in order to evaluate the fitness of individuals. The chapter finishes with the presentation of a new adaptive CC firefly algorithm (FA).
Chapter 5 presents parameter tuning of a novel SI-based optimization algorithm inspired by krill herds. In the original krill herd (KH) algorithm, the parameter setting is based on real data found in the biological literature. Unfortunately, this setting does not match the parameter values best suited to solving a specific problem. Therefore, this chapter proposes the best parameter setting found during a manual tuning process on high-dimensional benchmark problems.
Chapter 6 focuses on the SI-based algorithm inspired by the behavior of natural bats, which employ a physical phenomenon called echolocation for orientation in space and for hunting prey. In this chapter, this algorithm is used to solve economic dispatch (ED) problems. Manual parameter tuning is proposed in order to find the best parameter setting.
Chapter 7 deals with the automatic tuning of ANN parameters, where the original ANN for the fire analysis of steel frames is hybridized with a real-valued meta-GA that searches for the optimal values of the ANN parameters. The meta-GA uses the genetic operators of crossover and mutation, and evaluates the quality of the parameters obtained after applying the ANN.
Chapter 8 presents a memetic approach hybridizing differential evolution (DE) with a variable neighborhood search (VNS) heuristic. As identified in the chapter, the performance of the proposed DE_VNS algorithm depends on the mutation strategies, the crossover operator, and the standard DE control parameters. In line with this, multiple mutation operators can be employed within the VNS. In order to preserve population diversity, inversion and injection operators are proposed in the chapter.

Chapter 9 introduces a new memetic approach based on the DE algorithm hybridized with VNS in order to increase the exploitation ability of the algorithm. The algorithm is applied to solving the Probabilistic Traveling Salesman Problem (PTSP) and the Vehicle Routing Problem (VRP) with stochastic demands.
Chapter 10 introduces a multi-agent system consisting of self-assembled nanorobots that operate as artificial platelets for repairing wounds in a simulated human small vessel, which may be used to treat platelet diseases. These nanorobots exhibit only simple behavior and cooperate with one another. Particle swarm optimization (PSO) is employed to control the locomotion of the nanorobots so that they can self-assemble into a structure in a simulation system. In fact, the PSO algorithm acts as a meta-heuristic that guides the nanorobots operating at the lower level to assemble into a structure that assists in repairing wounds.
This book can serve as an ideal reference for both undergraduate and graduate students of computer science, electrical and civil engineering, and economics, as well as all other students of the natural sciences who are confronted with solving optimization, modeling, and simulation problems. In line with this, it covers recent advances in CI encompassing the nature-inspired algorithms, i.e., ANNs, EAs, and SI-based algorithms. On the other hand, the purpose of this book is to encourage the developers of new nature-inspired algorithms, especially in the SI domain, to apply the tested methods of adaptation and hybridization to existing nature-inspired algorithms rather than to search for a new algorithm for every newly arising problem. There are countless opportunities to realize this; thus, the applicability of the existing algorithms can be increased.
I would like to thank the editors at Springer, Dr. Thomas Ditzinger and Dr. Dieter Merkle, the series editors Dr. Yew-Soon Ong and Dr. Meng-Hiot Lim, and the Springer technical staff for their help and support in publishing this book. Special thanks go to the authors of the contributions in this book. Finally, I would like to thank my family for their patience, encouragement, and support.

Maribor, October 2014                                                    Iztok Fister
Contents

Part I: Background Information and Theoretical Foundations


of Computational Intelligence
Adaptation and Hybridization in Nature-Inspired Algorithms . . . . . . . . . . . . 3
Iztok Fister, Damjan Strnad, Xin-She Yang, Iztok Fister Jr.

Part II: Adaptation in Computational Intelligence


Adaptation in the Differential Evolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
Janez Brest, Aleš Zamuda, Borko Bošković
On the Mutation Operators in Evolution Strategies . . . . . . . . . . . . . . . . . . . . . 69
Iztok Fister Jr., Iztok Fister
Adaptation in Cooperative Coevolutionary Optimization . . . . . . . . . . . . . . . . 91
Giuseppe A. Trunfio
Study of Lagrangian and Evolutionary Parameters in Krill Herd
Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
Gai-Ge Wang, Amir H. Gandomi, Amir H. Alavi
Solutions of Non-smooth Economic Dispatch Problems by Swarm
Intelligence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
Seyyed Soheil Sadat Hosseini, Xin-She Yang, Amir H. Gandomi,
Alireza Nemati

Part III: Hybridization in Computational Intelligence


Hybrid Artificial Neural Network for Fire Analysis of Steel Frames . . . . . . . 149
Tomaž Hozjan, Goran Turk, Iztok Fister
A Differential Evolution Algorithm with a Variable Neighborhood Search
for Constrained Function Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
M. Fatih Tasgetiren, P.N. Suganthan, Sel Ozcan, Damla Kizilay

A Memetic Differential Evolution Algorithm for the Vehicle Routing


Problem with Stochastic Demands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
Yannis Marinakis, Magdalene Marinaki, Paraskevi Spanou
Modeling Nanorobot Control Using Swarm Intelligence for Blood Vessel
Repair: A Rigid-Tube Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
Boonserm Kaewkamnerdpong, Pinfa Boonrong, Supatchaya Trihirun,
Tiranee Achalakul

Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237


Part I
Background Information
and Theoretical Foundations
of Computational Intelligence
Adaptation and Hybridization
in Nature-Inspired Algorithms

Iztok Fister1,*, Damjan Strnad1, Xin-She Yang2, and Iztok Fister Jr.1
1
University of Maribor, Faculty of Electrical Engineering and Computer Science
Smetanova ul. 17, 2000 Maribor, Slovenia
{iztok.fister,damjan.strnad,iztok.fister1}@um.si
2
School of Science and Technology, Middlesex University, London NW4 4BT, UK
x.yang@mdx.ac.uk

Abstract. The aim of this chapter is to familiarize readers with the basics of adaptation and hybridization in nature-inspired algorithms as necessary for understanding the main contents of this book. Adaptation is a metaphor for flexible autonomous systems that respond to external changing factors (mostly environmental) by adapting their well-established behavior. Adaptation emerges in practically all areas of human activity as well. Such adaptation mechanisms can be used as a general problem-solving approach, though it may suffer from a lack of problem-specific knowledge. To solve specific problems with additional performance improvements, hybridization can be used in order to incorporate problem-specific knowledge from a problem domain. In order to discuss the relevant issues as generally as possible, a classification of problems is identified at first. Additionally, we focus on the biological foundations of adaptation that constitute the basis for the formulation of nature-inspired algorithms. This book highlights three types of inspiration from nature: the human brain, Darwinian natural selection, and the behavior of social insects (e.g., ants, bees, etc.) and animals (e.g., swarms of birds, shoals of fish, etc.), which influence the development of artificial neural networks, evolutionary algorithms, and swarm intelligence, respectively. The mentioned algorithms, which can be placed under the umbrella of computational intelligence, are described from the viewpoint of adaptation and hybridization so as to show that these mechanisms are simple to develop and yet very efficient. Finally, a brief review of recently developed applications is presented.

Keywords: Computational intelligence, evolutionary algorithms, swarm intelligence, artificial neural networks, adaptation, nature-inspired algorithms.

1 Introduction

The noun adaptation originates from the Latin word adaptare, which means to fit to. This word emerged primarily in biology and was later extended to other

* Corresponding author.

© Springer International Publishing Switzerland 2015
I. Fister and I. Fister Jr. (eds.), Adaptation and Hybridization in Computational Intelligence,
Adaptation, Learning, and Optimization 18, DOI: 10.1007/978-3-319-14400-9_1

areas as well. It designates a collective name for problems arising within different areas, e.g., genetics, artificial intelligence, economics, game theory, etc., encompassing optimization problems of different difficulties regarding complexity and uncertainty [41]. Complexity refers to how much effort must be invested in order to solve a specific problem. Uncertainty denotes the environment in which a problem arises, which typically changes over time. In general, real-world problems are embedded within environments which are typically dynamic, noisy, and mostly unpredictable.
An adaptive system is subject to operators acting on its structure. That means such systems adapt to the changing conditions of the environment by modifying their structure. In fact, each system prepares itself for changes using a so-called adaptive plan, i.e., the set of factors controlling these changes [41]. The adaptive plan determines how the structures are changed in order to best fit the changing environment. Typically, adaptive plans are realized by developing operators that determine how the changes of structures are performed. There are several plans (operators) that can be used for adapting to the environment. Which of these is the best depends on the performance measure on which the estimation of a plan is based. Selecting the proper performance measure depends on the domain from which the specific problem arises. On the other hand, the performance measure estimates the quality of the modified structure. Many natural as well as artificial systems arising within different domains are adaptive in nature. Some of these systems, with their structures and performance measures, are illustrated in Table 1.
In genetics, the structure undergoing adaptation is a chromosome, which is acted on by the operators of crossover, mutation, and inversion. The quality of an individual is measured by its fitness. The fitter the individual, the more chances it has to survive. Artificial intelligence looks for a program that imitates the behavior of the human brain, which should be able to learn, while
Table 1. Domains and corresponding structures, operators, and performance measures

Domain                    Structures       Operators                      Performance Measure
genetics                  chromosome       mutation, crossover,           fitness
                                           inversion
artificial intelligence   program          cleavage, learning             error function
production                goods/services   production activities          utility
game theory               strategies       rules                          payoff
supramolecular chemistry  supermolecules   recognition, transcription,    amount of energy
                                           transformation                 and information
memetic computation       memes            transmission, selection,       payoff
                                           replication, variation

its performance is normally measured by an error function. The smaller the value of the error function, the better the program is adapted to its environment. Production is a process of combining various material and immaterial inputs (plans, know-how) in order to make something for consumption (the output). It is the act of creating outputs, goods or services, which have value and contribute to the utility of individuals [35]. The higher the utility, the more the production process is optimized.
In game theory, a game is a mathematical model of a situation of interactive decision making, in which every decision maker (or player) strives to attain the best possible outcome [42]. Indeed, each player makes a move according to the strategy that maximizes its payoff. The payoff matrix provides a quantitative representation of the players' preference relations over the possible outcomes of the game. A strategy for player A is a winning strategy if, for every move of player B, player A wins. A combination of moves must obey the rules of the game for all players.
Supramolecular chemistry may be defined as chemistry beyond the molecule, where two molecules (i.e., a receptor and a substrate) are assembled into supermolecules using intermolecular bonds [43]. Supermolecules undergo actions such as molecular recognition, transformation, and translocation that may lead to the development of molecular and supramolecular species and can provide very complex functions. These species are capable of self-organizing, self-assembling, and replicating by using molecular information. Here, the amount of energy and information is employed as the performance measure.
In memetic computation (MC), a meme represents a building block of information acquired by autonomous software agents either through learning or through interacting with surrounding agents acting within a complex dynamic environment [24]. Indeed, memes can represent the agents' ideas and knowledge captured as memory items and abstractions (e.g., perceptions, beliefs, minds) [29]. The primary memetic operator is imitation [61], which takes place when memes are transmitted, replicated, or modified. These intelligent agents are also subject to selection, where agents with higher payoffs in previous generations have better chances of survival.
Although adaptation has emerged within different domains of human activity, it shares similar characteristics across them: each adaptive system has a structure on which operators are applied according to an adaptive plan, while the modified structure is estimated using a suitable performance measure. The higher the performance measure, the better the system adapts to its environment. As a result, only the best adapted structures can continue to develop and improve their characteristics; the less adapted ones are condemned to disappear. In this sense, adaptation can also be viewed as an optimization process.
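The common pattern just described, a structure, operators applied according to an adaptive plan, and a performance measure that keeps only improving modifications, can be sketched in a few lines (an illustrative sketch of ours, not code from this chapter; the `adapt` and `mutate` names are invented):

```python
import random

def adapt(structure, operators, performance, steps=200, seed=1):
    """Generic adaptation loop: apply operators drawn from an adaptive plan
    to a structure, and keep a modification whenever the performance
    measure improves."""
    rng = random.Random(seed)
    best, best_perf = structure, performance(structure)
    for _ in range(steps):
        op = rng.choice(operators)           # the plan selects an operator
        candidate = op(best, rng)            # modify the structure
        cand_perf = performance(candidate)   # estimate the modified structure
        if cand_perf > best_perf:            # keep only improvements
            best, best_perf = candidate, cand_perf
    return best, best_perf

# Example in the genetic domain of Table 1: the structure is a chromosome
# (a bit list), the operator is mutation, and the performance measure is
# fitness (here simply the number of ones).
def mutate(bits, rng):
    child = list(bits)
    child[rng.randrange(len(child))] ^= 1    # flip one randomly chosen bit
    return child

chromosome = [0] * 16
best, fitness = adapt(chromosome, [mutate], performance=sum)
```

The loop is deliberately minimal; real adaptive plans may also modify the operators themselves, as later sections of the chapter discuss.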
Obviously, most real-world problems are hard to solve. This means that such problems cannot be solved exactly by an algorithm enumerating all the possible solutions; they are too complex in terms of both the time and space necessary for obtaining solutions [40]. Therefore, these problems are usually solved approximately by using heuristic methods that guess the solution of the problem in some (ideally smart) way. Although such a solution is not exact, it is good enough to be used in practice.
Nowadays, algorithm developers often try to imitate the operation of natural processes when attempting to solve harder real-world problems. From the algorithm-development point of view, there are three sources of inspiration from nature:
the human brain,
natural selection,
the behavior of some social insects and animals.
The first source of inspiration has led to the emergence of artificial intelligence, where an algorithm tries to mimic the operation of the human brain in order to solve problems; the main example is artificial neural networks (ANNs) [39]. The second source of inspiration has led to the foundations of evolutionary algorithms (EAs) [36] using Darwinian natural selection [37], where the fittest individuals in a population survive the struggle for existence. The third source of inspiration is closely related to the development of swarm intelligence (SI) [1,173], which mimics the social behavior of some insects and animals [38]. Although such systems tend to obey simple rules, and simple creatures such as ants are capable of only simple autonomous actions, they can still achieve great things, e.g., building magnificent anthills, when acting together within a group. All three mentioned kinds of nature-inspired algorithms can be placed under the umbrella of computational intelligence (CI). The algorithms belonging to this family share the same characteristic, i.e., they are capable of solving problems in a sophisticated, intelligent way.
On the other hand, the behavior of an optimization algorithm is controlled by its parameters (also called strategy or control parameters). These parameters mostly stay fixed during the algorithm's run. However, this is in contrast with the real world, where good starting values of parameters may become bad during the run. As a result, a need has emerged to modify them during the run. Here, the adaptation of control parameters can be used as well, where the values of the control parameters are modified during the run in order to best suit the demands of the search process.
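As a concrete illustration of such on-line parameter control (our own sketch, not an algorithm from this chapter), the classical 1/5 success rule from evolution strategies adapts the mutation step size of a simple (1+1)-style search while it runs; all names below are ours:

```python
import random

def one_plus_one(f, x, sigma=1.0, iters=600, seed=7):
    """(1+1)-style minimization with the 1/5 success rule: the mutation
    step size sigma grows when mutations succeed often (the search can be
    bolder) and shrinks otherwise (the search must become more careful)."""
    rng = random.Random(seed)
    fx, successes = f(x), 0
    for t in range(1, iters + 1):
        y = [xi + rng.gauss(0.0, sigma) for xi in x]
        fy = f(y)
        if fy < fx:                      # successful mutation: keep it
            x, fx = y, fy
            successes += 1
        if t % 20 == 0:                  # adapt sigma every 20 trials
            if successes / 20 > 0.2:
                sigma *= 1.5
            else:
                sigma *= 0.82
            successes = 0
    return x, fx

sphere = lambda v: sum(vi * vi for vi in v)
x_best, f_best = one_plus_one(sphere, [5.0, -3.0])
```

A fixed sigma would either stall early (too large near the optimum) or crawl (too small far from it); adapting it during the run sidesteps both.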
In addition, many traditional algorithms exist, especially gradient-based methods, that contain a lot of domain-specific knowledge within their structures. In contrast, general problem-solving methods, especially nature-inspired population-based algorithms like EAs and SI, are capable of obtaining moderate results on all classes of optimization problems. In order to connect general problem-solving methods with traditional heuristics, hybridization of nature-inspired population-based algorithms with traditional heuristic algorithms has been performed. Such hybridized algorithms incorporate problem-specific knowledge into the algorithm's structures and are therefore more suitable for solving specific problems. Using more problem-specific knowledge, these algorithms may overcome the limitation imposed by the No-Free-Lunch (NFL) theorem [18], which states that any two algorithms are equivalent when their performance is compared across all classes of problems. According to Chen et al. [24], hybridized algorithms evolved from

simple hybrids, via adaptive hybrids to memetic automation. Simple hybrids of-
ten represent a hybridization of population-based CI algorithms with local search
heuristics. The result of connecting the adaptation with hybridization has led to
adaptive hybrids. The last step in the integration of adaptation with hybridiza-
tion forms a part of memetic computing, where, in addition to the parameters,
other algorithmic structures can also be adapted.
The remainder of this chapter is organized as follows. Section 2 deals with optimization problems and their complexity. The origin of adaptation within natural systems is the subject of Section 3. Section 4 analyzes the nature-inspired algorithms; in line with this, ANN, EA, and SI-based algorithms are taken into account. Section 5 highlights key characteristics of adaptation and diversity in CI. Section 6 describes hybridization methods in CI. A brief review of recent applications arising in CI is given in Section 7. Finally, some conclusions are drawn in Section 8.

2 Classification of Problems
From a system analysis point of view, problem-solving can be seen as a system
consisted of three components: input, output, and model (Fig. 1). The model
transforms input data to output data. If the model is known, the output data
can be determined by each set of input data. The problem can also be placed
dierently, i.e., which input data produces specic output data by a known
model. Finally, knowing the input and output data, the problem is how to nd
a model that transforms the specic input data to the output data.

Fig. 1. Problems and System Analysis

In line with this, three classes of problems can be defined with regard to which
of the components within system analysis is unknown, as follows:
optimization: the input data that satisfy a criterion of optimality are
searched for, given a known model and known output data,
simulation: a set of known input data is applied to the known model in
order to simulate the output data,
modeling: a search for a (mathematical) model is performed that can
transform the known input data into the known output data.
The optimization and simulation/modeling problems are described in more detail
in the next subsections.
8 I. Fister et al.

2.1 Optimization Problems and Their Complexity


When solving optimization problems, the output value needs to be determined
from a set of input data, a model for transforming the input data into output, and
a goal prescribing the optimal solutions. Optimal solutions are feasible solutions
whose values are either minimal or maximal. These solutions can be written
as y*, and their optimal values as f(y*). Only one set of input data can
be placed on the input. This set is therefore known under the name instance. The set
of all instances that can appear on the input constitutes an optimization problem
P. Formally, the optimization problem is defined as a quadruple P = (I, S, f, goal),
where
I is a set of instances of problem P,
S is a function assigning to each instance x ∈ I a set of feasible solutions
S(x), where x = {x_i} for i = 1 ... n and n determines the dimensionality of
the problem P,
f is an objective function assigning a value f(y) ∈ R to each feasible solution
y ∈ S(x),
the goal determines whether a feasible solution with the minimum or the maximum
value is to be searched for.
In computational intelligence, a fitness function is often employed in place of
the objective function, because using the equality min(f(y)) = -max(-f(y)),
searching for the maximal values of the objective function can be transformed
into searching for the minimal values of the fitness function.
Optimization problems may emerge in one of three possible forms,
as follows:
constructed form, where the optimal solution y* ∈ S(x) and the corresponding
value of the objective function f(y*) need to be searched for, for a given
instance x,
non-constructed form, where only the optimal value of the objective function
f(y*) needs to be searched for, for a given instance x,
decision form, where the problem is to identify whether the optimal value of
the objective function is better than some prescribed constant K, i.e., either
f(y) <= K when goal = min, or f(y) >= K when goal = max.
Optimization problems can be divided into three categories, i.e., problems
using: continuous variables, discrete (also combinatorial) variables, and
mixed variables. The first category of problems searches for the optimum
value in an infinite set of real numbers R. In discrete problems, the variables
are taken from a finite set, while in mixed problems they may be either discrete
or continuous.
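The quadruple definition above can be made concrete with a few lines of code. The following is a minimal sketch in Python of a continuous minimization problem (the sphere function, box bounds and all names are illustrative choices, not from the chapter):

```python
# Sketch: an optimization problem P = (I, S, f, goal) for the sphere
# function, a standard continuous benchmark. Names are illustrative.

def f(y):
    """Objective function: assigns a real value to a feasible solution y."""
    return sum(v * v for v in y)

def S(x):
    """Feasibility check for an instance x of dimensionality n:
    here every real vector within the box [-5, 5]^n is feasible."""
    return all(-5.0 <= v <= 5.0 for v in x)

goal = "min"  # search for the feasible solution with the minimal value

# One instance (n = 3) and a candidate solution from S(x):
y = (1.0, -2.0, 0.5)
assert S(y)
print(f(y))  # 5.25
```

For this convex objective the optimum is y* = (0, 0, 0) with f(y*) = 0; a nature-inspired algorithm would search S(x) for it.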
In general, algorithms are procedures for solving problems according to certain
prescribed steps [2]. Usually, these procedures are written in some programming
language. If a certain algorithm solves all instances I of a specific problem
P, then it can be said that the algorithm solves this problem completely. Here,
the algorithm which solves the problem most efficiently is typically searched
for. The efficiency of algorithms is normally estimated according to the time
and space occupied by the algorithm during a run. Generally, the more efficient
algorithms are those that solve problems the fastest.
Time complexity is not measured by the real time required for solving
the problem on a concrete computer, because this measure would not be fair:
algorithms can be run on different hardware or even on different operating
systems. In general, the problem or instance size is therefore measured in some
informal way which is independent of the platform on which the algorithm runs.
Time complexity is then expressed as a relation that determines how the
running time increases with increasing problem size. Here, we are not
interested in the problem size itself, but in how the instance size influences the
time complexity. If the algorithm solves a problem of size n, for example, with
a time complexity C * n^2 for some constant C, this means that the time complexity
of the algorithm is O(n^2) (read: of order n^2). The function O(n^2) determines the
asymptotic time complexity of the algorithm and limits its upper bound.
If the time complexity of the algorithm is exponential, i.e., O(2^n), it can be
argued that the problem is hard. As a result, these kinds of problems belong to
the class of nondeterministic-polynomial hard problems (i.e., NP-hard) [40]. Classical
combinatorial problems like the Traveling Salesman Problem (TSP) [44],
the Graph Coloring Problem (GCP) [45], etc. are members of this class.
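The hardness of such problems is easy to feel in code. The sketch below (the distance matrix is an invented toy example) solves the TSP by exhaustive search; enumerating all (n-1)! tours is feasible only for tiny n, which is exactly why heuristic and nature-inspired methods are used for larger instances:

```python
# Sketch: exhaustive search for the Traveling Salesman Problem (TSP).
# Enumerating all (n-1)! tours explodes factorially with n, which
# illustrates why the problem is considered hard. Illustrative data.
from itertools import permutations

def tour_length(tour, dist):
    """Total length of a closed tour under the distance matrix dist."""
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]]
               for i in range(len(tour)))

def brute_force_tsp(dist):
    n = len(dist)
    cities = range(1, n)          # fix city 0 to remove rotations
    best = min(permutations(cities),
               key=lambda p: tour_length((0,) + p, dist))
    return (0,) + best, tour_length((0,) + best, dist)

dist = [[0, 2, 9, 10],
        [2, 0, 6, 4],
        [9, 6, 0, 3],
        [10, 4, 3, 0]]
print(brute_force_tsp(dist))  # tour (0, 1, 3, 2) with length 18
```

Already at n = 15 this enumeration would visit over 87 billion tours, so exact search quickly becomes impractical.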

2.2 Simulation/Modeling Problems


The behavior of real-world facilities or processes (also called systems) can be
described in the form of mathematical or logical relationships. In general, these
real-world systems are too complex for their behavior to be expressed with exact
mathematical methods. Therefore, analytical solutions of such a system's behavior
are not possible. As a result, the system is studied by simulation, where a
mathematical model of the system is built on a digital computer. The task of
simulation is to evaluate the model numerically for known input variables in order
to obtain output variables matching the expected real-world values as closely as possible.
In this chapter, modeling problems (in the narrow sense) refer to supervised
learning, where, on the basis of observing some examples of input-output pairs,
the system learns a model that maps input data to output data. Supervised
learning can be defined formally as follows. Let a training set be given with N
instances of input-output pairs in the form (x_1, y_1), ..., (x_N, y_N), where each
y_i is generated by an unknown function y = f(x). The task is to discover a
function h that approximates the true function f [39].
The function h represents a hypothesis that is validated throughout all input-output
pairs during the learning process. The learning process is finished when
the search space of all possible hypotheses has been searched and none of the
remaining hypotheses is rejected. Moreover, the learned model h must also perform
well on the so-called test set of input-output pairs, which is distinct from the training set.
When the elements of the output vector y belong to a finite set of values, such
a learning problem becomes a classification problem. On the other hand, when
these elements are real values, the learning problem is known as regression.
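The supervised-learning setting above can be sketched in a few lines. Here the hypothesis space is restricted to lines h(x) = a*x + b fitted by least squares, and the data are an invented noiseless example; both choices are ours, for illustration only:

```python
# Sketch: supervised learning as hypothesis search. The training set
# consists of input-output pairs (x_i, y_i) generated by an unknown
# function f; we search for a hypothesis h (here a line a*x + b)
# that approximates f and also performs well on a separate test set.

def fit_line(pairs):
    """Least-squares fit of h(x) = a*x + b to the training pairs."""
    n = len(pairs)
    sx = sum(x for x, _ in pairs)
    sy = sum(y for _, y in pairs)
    sxx = sum(x * x for x, _ in pairs)
    sxy = sum(x * y for x, y in pairs)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return lambda x: a * x + b

train = [(0, 1), (1, 3), (2, 5), (3, 7)]   # generated by f(x) = 2x + 1
test = [(4, 9), (5, 11)]                   # held-out pairs
h = fit_line(train)
print(max(abs(h(x) - y) for x, y in test))  # 0.0
```

Because the outputs here are real values, this is a regression task; mapping the outputs to a finite label set would turn it into classification.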
3 Biological Foundations of Natural Adaptation

In natural evolution, adaptation indicates a genetic as well as non-genetic
modification of individuals over the course of generations. Moreover, this term is
usually used as a synonym for a measure of fitness, i.e., a characteristic that
increases over generations. What does an individual adapt to? Most frequently,
it adapts to the conditions of its environment or ecological niche, i.e.,
an area that is occupied by individuals living in a particular community
because of common exploitation of resources in the environment [46]. Too specific
an adaptation to a particular ecological niche can lead to speciation [37].
Darwin's finches (also Galapagos finches) are one of the most famous examples
of speciation through adaptation, where a group of about fifteen finch species with
common ancestors occupied specific ecological niches and adapted to different
food sources with different body sizes and beak shapes. Indeed, only the best
adapted individuals survived. The process of so-called adaptive radiation [3],
in which individuals diversify rapidly into a multitude of new forms, started
when finch ancestors originating from South America occupied the island of the
Galapagos archipelago closest to the continent.
The adaptive radiation as an origin of evolutionary diversity opens up the
question as to when and why speciation occurs. Darwin in 1859 [37] answered
with an allopatric model of speciation, whereby the evolutionary diversity
was caused by geographical separation of the population.

Fig. 2. Galapagos archipelago

Speciation and formation of new Galapagos finches were carried out over three
phases (Fig. 2):
The population of finches colonized the island closest to the continent. This
population underwent the rules of natural selection.
Part of the population separated from the group and colonized the next
island. They adapted themselves to new environmental conditions, because the
distribution of food sources on the next island was different. As a result, only
those best adapted to the new conditions with their body size and the shape of
their beaks could survive. Additionally, the geographically separated populations
underwent changes of their reproduction material through mutation.
The process of colonizing the other islands of the Galapagos archipelago
repeated until, finally, the conquering population recolonized the island
from which the adaptive radiation started. As a result, the new population
met its ancestor population.
The meeting of these two populations may have caused the individuals of both
populations:
to mate among themselves, with the offspring becoming more successful than
their parents,
to mate among themselves, with the offspring becoming less successful than
their parents,
not to mate among themselves.
In the first case, both populations merge into a single one, while in the third
case the individuals of both populations are so different that mating is
impossible. In this worst case, reproductive isolation happens and prevents
mating between the individuals of the two different populations. However, the most
interesting is the second case, which represents a process of adaptive radiation
that could cause population isolation over a longer period of time. However, this
isolation is just a precondition for speciation.
More recent views on the adaptive radiation and speciation of Darwin's
finches have cast doubt on the correctness of the allopatric model [37]. Indeed, it
seems that the proximity of the Galapagos islands might prevent the existence
of geographical isolation, and therefore the finches could travel freely between
islands. This fact also suggests that several populations needed to live in the same
place at the same time.
Today, a sympatric model has been established that explains speciation without
geographical isolation [3]. In this model, new species appear as a result of
adaptation to ecological niches. When individuals of a sympatric species mate
between themselves, the fitness of their offspring usually decreases. Natural
selection quickly eliminates such individuals from the population. On the
other hand, differences in the reproductive materials changed by mutations can
also cause a reproduction barrier, whereby individuals of different populations
do not mate between themselves and thus speciation can occur.
Differences in reproduction material represent a reproduction barrier once
mating has been performed. Usually, however, the reproduction barrier emerges before
mating takes place. Interestingly, each male finch uses a similar kind of
courtship. Thus, it is not so important how males behave as how they look.
Usually, males differ between themselves according to the size and the shape of
their beaks rather than the birds' plumage. As a result, the size and the shape
of the beaks, adapted to the local food sources, can cause a reproduction barrier
between individuals of sympatric populations.
Furthermore, the reproductive isolation can also be caused by differences in
the acquired characteristics of individuals (i.e., ecological isolation), e.g., sounds
that have been learned by males from their parents and to which females of the
same population are susceptible. The sound is independent of the reproduction
material, although morphological characteristics of individuals that are written in
genes (e.g., the size and the shape of beaks) can have an impact on the volume
and pitch of the sound articulated by the bird.
Interestingly, Wright's concept of the adaptive landscape [4] can be used to
illustrate the morphological characteristics of Darwin's finches according to the
various food sources on the Galapagos islands. The two different morphological
characteristics, i.e., the body size and the shape of beaks, are represented as two
coordinate axes in a 3-dimensional coordinate system, while the third axis represents
selective advantages or disadvantages of the morphological characteristics of a
specific individual with regard to the food sources.
The adaptive landscape of morphological characteristics versus body sizes
and beak shapes can change over a longer period of time. Therefore, such a
landscape is also named a dynamic adaptive landscape. As the conditions
in the environment change over time, the heights and positions of
hills in the adaptive landscape change as well. For instance, the height of a hill is
lowered, a valley between two hills is deepened, or two hills move closer to each
other or move away from each other. Various populations of Darwin's finches
adapt to these changes in the environment. If, for example, two hills move
closer to each other because of frequent earthquakes on the Galapagos archipelago,
two or more populations of Darwin's finches come together, while if the hills
move away, the groups of finches become separated. Speciation appears
when a specific population colonizes the peak of a hill. Each hill is occupied by
exactly one finch population with the body size and the shape of beaks adapted
to the specific food source. As a result, fifteen ecological niches can be identified
on the Galapagos archipelago, on which exactly the same number of finch species
have appeared.
In computational intelligence, the adaptive landscape is known as the fitness
landscape. Furthermore, speciation is most frequently used when solving
multimodal problems, where several equivalent problem solutions (i.e., several peaks
within the fitness landscape) are maintained during the algorithm run. In fact,
each peak represents an ecological niche appropriate for speciation [47].
Therefore, different landscapes (from different problems) may pose different
challenges to different algorithms. It is not possible in general to adapt to all
landscapes at the same time. As a result, different algorithms may perform
differently on different problems. In order to solve such a broad spectrum of
various problems, developers of new algorithms draw inspiration from different
natural systems. Nature-inspired algorithms is the most general term, and
we will discuss nature-inspired algorithms in greater detail in the next section.

4 Nature-Inspired Algorithms

Nature-inspired algorithms are very diverse. Loosely speaking, we can put nature-
inspired algorithms into three categories: artificial neural networks, evolutionary
algorithms and swarm intelligence. It is worth pointing out that such categoriza-
tion here is not rigorous. However, it is mainly for the convenience of discussions
in this chapter.

4.1 Algorithm as an Iterative Process


Mathematically speaking, an algorithm A is an iterative process, which aims to
generate a new and better solution x^(t+1) to a given problem from the current
solution x^(t) at iteration or (pseudo)time t. It can be written as

x^(t+1) = A(x^(t), p),    (1)

where p is an algorithm-dependent parameter. For example, the Newton-Raphson
method to find the optimal value of f(x) is equivalent to finding the critical
points or roots of f'(x) = 0 in a d-dimensional space. That is,

x^(t+1) = x^(t) - f'(x^(t)) / f''(x^(t)) = A(x^(t)).    (2)

Obviously, the convergence rate may become very slow near the optimal point
where f'(x) → 0. Sometimes, the true convergence rate may not be as quick as
it should be. A simple way to improve the convergence is to modify the above
formula slightly by introducing a parameter p as follows:

x^(t+1) = x^(t) - p f'(x^(t)) / f''(x^(t)),    p = 1 / (1 - A'(x*)).    (3)

Here, x* is the optimal solution, or a fixed point of the iterative formula.


The above formula is mainly valid for a trajectory-based, single-agent system.
For population-based algorithms with a swarm of n solutions (x_1^(t), x_2^(t), ..., x_n^(t)),
we can extend the above iterative formula to a more general form

[x_1, x_2, ..., x_n]^(t+1) = A((x_1^(t), x_2^(t), ..., x_n^(t)); (p_1, p_2, ..., p_k); (ε_1, ε_2, ..., ε_m)) [x_1, x_2, ..., x_n]^(t),    (4)

where p_1, ..., p_k are k algorithm-dependent parameters and ε_1, ..., ε_m are m
random variables. An algorithm can be viewed as a dynamical system, Markov
chains and iterative maps [173], and it can also be viewed as a self-organized
system [174].
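The iterative scheme of Eqs. (1)-(3) can be sketched in a few lines of Python. The test function f(x) = x^4/4 - x (whose optimum satisfies f'(x) = x^3 - 1 = 0, i.e. x* = 1) and the fixed damping parameter p are illustrative choices of ours, not prescriptions from the chapter:

```python
# Sketch of the iterative scheme x_{t+1} = A(x_t, p): damped
# Newton-Raphson minimizing f(x) = x**4/4 - x, with optimum x* = 1.

def newton_step(x, p=1.0):
    fp = x**3 - 1.0        # f'(x)
    fpp = 3.0 * x**2       # f''(x)
    return x - p * fp / fpp

x = 3.0                    # initial solution x^(0)
for t in range(50):
    x_next = newton_step(x)        # p = 1 recovers Eq. (2)
    if abs(x_next - x) < 1e-12:    # stop at a fixed point
        break
    x = x_next
print(x)  # approximately 1.0
```

A population-based method in the spirit of Eq. (4) would apply such an update rule to n solutions at once, with random variables perturbing the moves.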

4.2 Artificial Neural Networks


The human brain consists of a network of interconnected nerve cells (also called
neurons) which communicate using electrochemical signaling mechanisms. The
main part of a human neuron (Fig. 3a) is the cell body that contains a cell
nucleus [39]. The cell body branches out with a number of fibers (dendrites) and
a single long fiber named an axon. The neuron accepts the incoming signals from
its neighbors' axons through dendrite tips at junctions called synapses, which
inhibit or amplify the signal strength. After the processing of the accumulated inputs
inside the nucleus, the output signal is propagated through the axon to the neurons
down the communication line. The brain function evolves through short-term
and long-term changes in the connectivity of the neurons, which is considered
as learning.

Fig. 3. Human and artificial neuron: (a) biological neuron; (b) artificial neuron

There is a natural desire to compare the performance of the human brain
with the performance of a digital computer. Like the brain, today's computers
are capable of highly parallel processing of signals and data. Interestingly, today's
capacity of a digital computer is comparable to the capacity of the human brain.
On the other hand, the human brain does not use all of its neurons simultaneously.
If it is further assumed that, according to Moore's law [14], the memory capacity
of digital computers doubles approximately every two years, and if this trend
continues, it is possible that the singularity point [15], at which the performance
of digital computers becomes greater than that of the human brain, will be
reached. Although computer intelligence has virtually unlimited capacity, this
does not mean that true intelligence will emerge automatically. It is still a
challenging, unresolved task to figure out how to use such resources to produce
any useful intelligence.
The artificial neural network (ANN) is a simplified and inherently adaptive
mathematical model of the human brain. The elementary part of every ANN
is the artificial neuron (Fig. 3b), which is modeled after the biological brain
cell. In an ANN the neurons communicate through weighted connections that
simulate the electrochemical transfer of signals in the brain. Many different ANN
topologies and neuron models have been presented in the past, each developed
for a specific type of machine learning task like classification, regression (i.e.,
function approximation), or clustering. By far the most practically employed
type of ANN is the multi-layered feed-forward neural network that consists of
McCulloch-Pitts types of artificial neurons [16].
The structure of a classical feedforward multi-layered neural network, commonly
known as a multi-layer perceptron (MLP), is shown in Figure 4. The
external input signals x_i, 1 <= i <= n, enter the network on the left and flow
through multiple layers of neurons towards the outputs o_i, 1 <= i <= m, on the
right. The neuron connectivity exists only from the previous layer to the next
one, so the outputs of neurons in layer l - 1 serve as inputs to the neurons of
layer l. There is no interconnection of neurons within a layer, no backward
connections, and no connections that bypass layers. In an MLP network with L
layers, the first L - 1 are called hidden layers and the last one is called the output
layer. Two hidden layers are enough for most practical purposes. We shall use
h_i to denote the number of neurons in the i-th hidden layer and m to denote
the number of neurons in the output layer (i.e., the number of network outputs).
We will use the compact notation n/h_1/h_2/.../h_{L-1}/m to describe such an MLP
network with n external inputs.

Fig. 4. Multi-layer feed-forward neural network

Every connection within the MLP network is assigned a real-valued weight
that amplifies or inhibits the signal traveling over the connection. We will use the
notation w_ij^(l) to denote the weight on the j-th input to the i-th neuron in layer
l. The function of an MLP network with a fixed structure is determined by the set of
weights on all of its connections.
Neurons in an MLP function as simple processors that gather the weighted
signals on their input lines and transform them into a single numerical output.
In the McCulloch-Pitts neuron model shown in Fig. 3b this is performed in
two steps. The summation unit adds the weighted inputs and shifts the result by an
additional intercept parameter called the threshold or bias to produce the neuron
activation value v in the following way:

v = Σ_{i=0}^{n} w_i x_i,    (5)

where x = {x_0, ..., x_n} is the augmented input vector with x_0 = -1 and w =
{w_0, ..., w_n} is the corresponding augmented weight vector with w_0 = θ.
In the second step the activation value is injected into the transfer function φ
to obtain the neuron output y:

y = φ(v).    (6)

The Heaviside step function is used in place of φ for classification tasks, while
for regression tasks the popular choice for φ is the logistic function:

φ(v) = 1 / (1 + e^(-v/ρ)).    (7)

Here, ρ is the sigmoid slope parameter with a default value of 1. Fig. 5 shows the step
function on the left and the logistic function for various values of ρ on the right-hand
side. When a signed version of the sigmoid transfer function is required,
the common choice is the hyperbolic tangent.

Fig. 5. The step (left) and the sigmoid (right) activation function
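A single McCulloch-Pitts neuron per Eqs. (5)-(7) can be sketched directly; the weights, threshold, and AND example below are illustrative choices, not from the chapter:

```python
# Sketch: a McCulloch-Pitts neuron following Eqs. (5)-(7). The
# augmented input x_0 = -1 carries the threshold as weight w_0 = theta.
import math

def neuron(inputs, weights, theta, rho=1.0, step=False):
    x = [-1.0] + list(inputs)      # augmented input, x_0 = -1
    w = [theta] + list(weights)    # augmented weights, w_0 = theta
    v = sum(wi * xi for wi, xi in zip(w, x))      # activation, Eq. (5)
    if step:                       # Heaviside step for classification
        return 1.0 if v >= 0 else 0.0
    return 1.0 / (1.0 + math.exp(-v / rho))       # logistic, Eq. (7)

# A neuron computing logical AND with the step transfer function:
print([neuron((a, b), (1.0, 1.0), theta=1.5, step=True)
       for a in (0, 1) for b in (0, 1)])  # [0.0, 0.0, 0.0, 1.0]
```

With weights (1, 1) and threshold 1.5 the activation is positive only when both inputs are 1, which is why the step output realizes AND.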

The flow of signals in an MLP network with structure n/h_1/h_2/.../h_{L-1}/m
can be described in a unified form as:

y_i^(l) = φ( Σ_{j=0}^{h_{l-1}} w_ij^(l) y_j^(l-1) ),    1 <= i <= h_l; 1 <= l <= L,    (8)

where h_0 = n, y_i^(0) = x_i, h_L = m, and o_i = y_i^(L).
The weights represent the programmable part of a neural network. In order to
perform a specific task, we need to train the MLP, i.e., adjust the weights using a set
of training samples with known input-output mappings. This is an example of
supervised learning, which is used in classification tasks with existing records of
correctly labeled patterns, or in regression tasks with known values of an unknown
nonlinear map in a given set of points.
The weight adaptation in neural networks is achieved by iterative training
algorithms, in which the input parts of the training samples are presented to the
network in succession. A cycle in which all of the training samples are introduced
on the network input is called an epoch. The best known supervised training
method for the MLP is the error back-propagation algorithm. For each presented
input, the computed network output o is compared with the target vector d to
obtain the prediction error. The usual error measure E in back-propagation
training is the mean squared error (MSE) of the output neurons:

E = (1/m) (d - o)^T (d - o).    (9)

The weights are then updated in the direction of the negative gradient -∂E/∂w
to reduce the error in the next iteration.
Training continues until the maximum number of epochs is reached or the
average MSE error for the epoch falls below some prescribed tolerance ε. General
methods like cross-validation to prevent over-fitting can also be used for
premature training termination. The complete back-propagation training algorithm is
summarized in Algorithm 1.

Algorithm 1. Pseudo-code of back-propagation ANN

1: initialize weights
2: repeat
3: for all examples (x, y) do
4: propagate the inputs forward to obtain the outputs
5: propagate deltas backwards from output layer to input layer
6: update every weight in network with deltas
7: end for
8: until termination criteria met
9: return artificial neural network
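The gradient step behind back-propagation can be illustrated in its simplest setting: a single sigmoid output neuron, where the gradient of the squared error can be written out directly. The task (logical OR), learning rate, and epoch count below are illustrative choices of ours:

```python
# Sketch: gradient-descent weight updates in the spirit of
# back-propagation, reduced to one sigmoid output neuron so the
# gradient of the squared error can be written out explicitly.
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def train(samples, epochs=5000, eta=0.5):
    w = [0.0, 0.0, 0.0]                    # [theta, w1, w2]
    for _ in range(epochs):
        for inputs, d in samples:          # one pass = one epoch
            x = [-1.0] + list(inputs)      # augmented input
            o = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            delta = (d - o) * o * (1.0 - o)        # error signal
            w = [wi + eta * delta * xi for wi, xi in zip(w, x)]
    return w

# Learn logical OR from labeled patterns:
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
w = train(data)
out = [round(sigmoid(sum(wi * xi for wi, xi in zip(w, [-1.0] + list(i)))))
       for i, _ in data]
print(out)  # [0, 1, 1, 1]
```

In a full MLP, the same delta quantities are propagated backwards through the hidden layers, which is where the algorithm gets its name.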


4.3 Evolutionary Algorithms


EAs take their basic operations from the Darwinian evolutionary
theory of the survival of the fittest [37], where the fitter individuals in nature have
more chances to survive in the struggle for survival. Thus, the fitter individuals
are able to adapt better to the changing conditions of the environment. The less
fit individuals are gradually eliminated from the population by natural selection.
The Darwinian theory of survival of the fittest refers to a macroscopic view of
natural evolution [36]. Today, it is known that all characteristic traits that define
the behavior of an individual are written in genes as the fundamental carriers of
heredity. An individual's outer characteristics (the phenotype) are determined
by its genes (the genotype). The view on these inner characteristics of individuals
is also known as the microscopic view of natural evolution. As a matter of fact,
the phenotypic characteristics are encoded in genotypes. Unfortunately, this
encoding is not one-to-one, i.e., the genotype-phenotype mapping is not injective,
because one phenotypic trait can be determined by more than one gene. On the other
hand, genetic material is passed on to the new generation through the process of
reproduction. Reproduction consists of two phases: crossover and mutation. In
the former phase, the genetic material from two parents is combined in order to
generate offspring with new traits, while in the latter phase, the genetic material
of the offspring may be randomly modified.
In order to introduce this Darwinian natural evolution into EAs, some links
between the concepts of both domains should be established [36]. Natural evolution
is handled by a population of individuals living in an environment that changes
over time (i.e., it is dynamic). Correspondingly, EAs use a population of
candidate solutions, and the environment can be taken as the problem space.
Similarly, the natural reproduction process is simulated by the operators of crossover
and mutation in EAs. Finally, the fitness of the individual in natural evolution
represents the quality of the candidate solution in EAs. A pseudo-code of an EA is
presented in Algorithm 2, where two selection operators are supported in EAs. In
the first selection (function select parents), two parents are selected for crossover,
while in the second (function select candidate solutions for the next generation),
the candidate solutions are determined for the next generation. When the
generational model of the population is selected, the whole population is replaced in
each generation, while using the steady-state model only the worst part of the
population is replaced by the best offspring.

Algorithm 2. Pseudo-code of evolutionary algorithm


1: initialize population with random candidate solutions
2: evaluate each candidate solution
3: while termination criteria not met do
4: select parents
5: recombine pairs of parents
6: mutate the resulting offspring
7: evaluate each candidate solution
8: select candidate solution for the next generation
9: end while
10: return best agent
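Algorithm 2 can be sketched concretely as a minimal generational EA. The task (OneMax: maximize the number of ones in a binary string), the tournament selection, the one-point crossover, and all parameter values are illustrative choices, not prescriptions from the chapter:

```python
# Sketch of Algorithm 2: a minimal generational EA solving OneMax.
import random

random.seed(1)
N, POP, GENS, PM = 20, 30, 60, 0.05   # genome length, sizes, mutation rate

def fitness(c):
    """Quality of a candidate solution: number of ones."""
    return sum(c)

def select_parent(pop):
    """Binary tournament selection."""
    a, b = random.sample(pop, 2)
    return a if fitness(a) >= fitness(b) else b

# initialize population with random candidate solutions
pop = [[random.randint(0, 1) for _ in range(N)] for _ in range(POP)]
for _ in range(GENS):
    offspring = []
    while len(offspring) < POP:
        p1, p2 = select_parent(pop), select_parent(pop)
        cut = random.randrange(1, N)                 # one-point crossover
        child = p1[:cut] + p2[cut:]
        child = [1 - g if random.random() < PM else g
                 for g in child]                     # bit-flip mutation
        offspring.append(child)
    pop = offspring                                  # generational model
best = max(pop, key=fitness)
print(fitness(best))  # typically N or close to it
```

Replacing only the worst individuals instead of the whole population would turn this generational model into the steady-state model mentioned above.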

Evolutionary computation (EC) was inspired by the Darwinian theory of natural
evolution. EC is a contemporary term that captures all the algorithms arising
from the principle of natural selection. Consequently, all algorithms that have
emerged within this EC domain are known under the name EAs. Loosely
speaking, EAs can be divided into the following types (Fig. 6):
Genetic Algorithms (GA) [47],
Genetic Programming (GP) [49],
Evolution Strategies (ES) [46],
Evolutionary Programming (EP) [48],
Differential Evolution (DE) [13].

Fig. 6. Primarily, EAs differ from each other in terms of the representation of solutions.
For example, GAs operate with a population of mainly binary represented solutions,
ESs use real-valued elements of solutions, GPs represent solutions as trees implemented
in the Lisp programming language, while EPs employ solutions represented as finite
state automata.

EAs have been successfully applied to different areas of optimization, modeling
and simulation, where problems cannot be solved sufficiently well using traditional
methods such as gradient-based methods.

4.4 Swarm Intelligence


Swarm intelligence concerns the studies of the collective behavior of multi-agent
and decentralized systems, which may be self-organized and evolving. This term
was probably first used by Beni in 1989 [1], when he developed cellular robots
consisting of simple agents communicating by interactions with other agents
within the neighborhood.
In nature, some social insects (e.g., ants, bees, termites, etc.) and animals
(e.g., flocks of birds, schools of fishes, etc.) may show some characteristics
that can be classified as swarm intelligence (Fig. 7). Though individual agents
such as ants and bees may follow simple rules, they can carry out complex tasks
collectively. In other words, their decision making is decentralized, while they
are self-organized and act consistently with the intentions of the group. Such
interactions between individuals (such as particles) are local and rule-based.
Interactions between particles in a swarm can be direct or indirect. In the
indirect case, two particles are not in physical contact with each other, because
Fig. 7. Nature-inspired SI-based algorithms. The picture presents the sources of
inspiration from nature for developing the following SI-based algorithms, in clockwise
direction: natural immune systems, particle swarm optimization, flower pollination
algorithm, bat algorithm (echolocation), cuckoo search (laying own eggs into other
birds' nests), fireflies (bioluminescence), bees (foraging for nectar) and ant
colonies (pheromone)

communication is performed via modulation of the environment [38]. For example,
ants deposit pheromones on their way back from a profitable food source, and
other ants will follow paths marked with pheromone. In that way, information is
simply broadcast without controlling who receives it. In the direct case, information
is transferred directly without modulation of the environment. A good example
of such an interaction mechanism is the honeybee's waggle dance to encode
spatial information: the direction and the distance to the nectar source. The
quality of a new food source is assessed by the forager, based on the sugar
content of the nectar, the distance from the colony and the difficulty with which
the nectar can be collected.
SI-based algorithms are population-based, i.e., they use multiple interacting agents or particles. Each particle has a position and a velocity, where the position usually represents a solution to the problem of interest. The interaction of particles may be described by mathematical equations, based on the idealized characteristics of the collective behavior of the imitated insects or animals (e.g., swarms of birds, fireflies, etc.). In most SI-based algorithms, all solutions are moved towards the best candidate solution and thus new, better solutions can be obtained. Sometimes, problems arise when the best solution cannot be improved anymore. In this case, stagnation emerges. However, stagnation may be avoided using additional mechanisms like local search heuristics, though there is no guarantee that these will solve the stagnation issue. The pseudo-code of the generic SI-based algorithm is shown in Algorithm 3.
Adaptation and Hybridization in Nature-Inspired Algorithms 21
Algorithm 3. Pseudo-code of swarm intelligence algorithm

1: initialize swarm within bounds
2: evaluate all particles
3: while termination criteria not met do
4:   move all particles
5:   evaluate all particles
6:   find the best particle
7: end while
8: return best particle
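In Python, the generic loop of Algorithm 3 may be sketched as follows; the drift and noise coefficients (0.5 and 0.1) and the sphere test function are illustrative choices, not part of the pseudo-code:

```python
import random

def swarm_optimize(f, dim, n_particles=20, bounds=(-5.0, 5.0), iters=100, seed=1):
    """Generic SI loop of Algorithm 3: repeatedly move all particles, track the best."""
    rng = random.Random(seed)
    lo, hi = bounds
    # 1-2: initialize the swarm within bounds and evaluate all particles
    swarm = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    best = min(swarm, key=f)[:]          # copy so later moves do not alias it
    # 3-7: main loop until the termination criterion (here: a fixed iteration budget)
    for _ in range(iters):
        for p in swarm:                  # 4: move all particles (idealized drift + noise)
            for d in range(dim):
                p[d] += 0.5 * (best[d] - p[d]) + 0.1 * rng.uniform(-1.0, 1.0)
        cand = min(swarm, key=f)         # 5-6: evaluate all, find the best particle
        if f(cand) < f(best):
            best = cand[:]
    return best                          # 8: return the best particle

sphere = lambda x: sum(v * v for v in x)  # simple test objective
best = swarm_optimize(sphere, dim=2)
```

Because every solution drifts towards the current best, the swarm contracts around it, which illustrates both the fast convergence and the stagnation risk discussed above.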
The main characteristics of SI-based algorithms are as follows [38]:

- decentralization via rule-based models,
- interaction among particles is carried out locally (collective behavior),
- particle behavior is subordinated to the system behavior (self-organization),
- adapting to changes in the landscape (reasonably robust and flexible).
Some representative SI-based algorithms are as follows:

Articial Immune Systems (AIS) [5],


Particle Swarm Optimization (PSO) [8],
Flower Pollination (FPA) [11],
Bat Algorithm (BA) [9],
Cuckoo Search (CS) [12],
Firey Algorithm (FA) [10],
Articial Bee Colony (ABC) [7],
Ant Colony Optimization (ACO) [6].

It is worth pointing out that we can only cover and discuss less than 10% of all different SI-based algorithms in this brief review. However, the development of new types of SI-based algorithms is not finished yet; almost every day, new SI-based algorithms emerge. There is thus no doubt that this area will become even more active in the near future.

5 Adaptation and Diversity in Computational Intelligence

Adaptation in nature-inspired algorithms can take many forms. For example, the ways to balance exploration and exploitation are the key form of adaptation [175]. As diversity can be intrinsically linked with adaptation, it is better not to discuss these two features separately. If exploitation is strong, the search process will use problem-specific information (or landscape-specific information) obtained during the iterative process to guide the new search moves; this may lead to a focused search and thus reduce the diversity of the population, which
may help to speed up the convergence of the search procedure. However, if exploitation is too strong, it can result in a quick loss of diversity in the population and thus may lead to premature convergence. On the other hand, if new search moves are not guided by local landscape information, this typically increases the exploration capability and generates new solutions with higher diversity. However, too much diversity and exploration may result in meandering search paths and thus lead to slow convergence. Therefore, adapting the search moves so as to balance exploration and exploitation is crucial. Consequently, maintaining balanced diversity in a population is also important.
Diversity in meta-heuristic algorithms can also appear in many forms. The simplest form of diversity is to allow variations of the solutions in the population by randomization. For example, solution diversity in genetic algorithms is mainly controlled by the mutation rate and crossover mechanisms, while in simulated annealing, diversity is achieved by random walks. In most SI-based algorithms, new solutions are generated according to a set of deterministic equations, which also include some random variables. Diversity is represented by the variations, often in terms of the population variance. Once the population variance is getting smaller (approaching zero), diversity also decreases, leading to converged solution sets. However, if diversity is reduced too quickly, premature convergence may occur. Therefore, the right amount of randomness and the right form of randomization can be crucial.
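As a sketch, the population variance mentioned above can be computed as the mean per-dimension variance of the particle positions (Python, illustrative helper name):

```python
def population_variance(pop):
    """Mean per-dimension variance of a population of positions; values
    approaching zero indicate a converged, low-diversity population."""
    n, dim = len(pop), len(pop[0])
    means = [sum(p[d] for p in pop) / n for d in range(dim)]
    per_dim = [sum((p[d] - means[d]) ** 2 for p in pop) / n for d in range(dim)]
    return sum(per_dim) / dim
```

Monitoring such a quantity during the run is one simple way to detect when diversity is being lost too quickly.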
In summary, adaptation and diversity in meta-heuristic algorithms can mainly take the following forms:

- balance of exploration and exploitation,
- generation of new solutions,
- right amount of randomness,
- parameter setting, and
- other subtle forms.
In the remainder of this chapter, we discuss the role of adaptation and diversity
in these cases.

5.1 Exploration and Exploitation

The efficiency of a search process in all population-based nature-inspired algorithms depends on two components: exploration and exploitation [21]. The first component is connected with the generation of new, undiscovered regions of the search space, while the second with directing the search towards the known good solutions. Both components must be balanced during the search, because too much exploration can lead to an inefficient search, while too much exploitation can lead to a loss of population diversity that may cause premature convergence. Exploitation and exploration are also referred to as intensification and diversification [59,176,10].
Exploitation uses any information obtained from the problem of interest so as to help to generate new solutions that are better than the existing solutions. However, this process is typically local, and information (such as gradients) is
also local. Actually, it is suited for a local search. For example, hill-climbing is a method that uses derivative information to guide the search procedure; in fact, new steps always try to climb up the local gradient. The advantage of exploitation is that it usually leads to very high convergence rates, but its disadvantage is that it can get stuck in a local optimum, because the final solution point largely depends on the starting point.
On the other hand, exploration makes it possible to explore the search space more efficiently, and it can generate solutions with enough diversity and far from the current solutions. Therefore, the search is typically on a global scale. The advantage of exploration is that it is less likely to get stuck in a local mode, and the global optimality can be more accessible. However, its disadvantages are slow convergence and the waste of a lot of computational effort, because many new solutions can be far from the global optimality.
As a result, a fine balance is required so that an algorithm can achieve the best performance. Too much exploitation and too little exploration mean the system may converge more quickly, but the probability of finding the true global optimality may be low. On the other hand, too little exploitation and too much exploration can cause the search path to meander with very slow convergence. The optimal balance should mean the right amount of exploration and exploitation, which may lead to the optimal performance of an algorithm. Therefore, a proper balance is crucially important.
However, how to achieve such a balance is still an open problem. In fact, no algorithm in the current literature can claim to have achieved such an optimal balance. In essence, the balance itself is a hyper-optimization problem, because it is the optimization of an optimization algorithm. In addition, such a balance may depend on many factors, such as the working mechanism of an algorithm, its setting of parameters, the tuning and control of these parameters, and even the problem to be considered. Furthermore, such a balance may not exist universally [18], and it may vary from problem to problem, thus requiring an adaptive strategy.
These unresolved problems and mysteries can motivate more research in this area, and it can be expected that the relevant literature will increase in the near future.

Attraction and Diffusion. The novel idea of attraction via light intensity as an exploitation mechanism was first used by Yang in the firefly algorithm (FA) in 2007 and 2008. It is simple, flexible and easy to implement. This algorithm is based on the flashing patterns and behavior of tropical fireflies, and can naturally deal with nonlinear multimodal optimization problems.
The movement of firefly i, attracted to another, more attractive (brighter) firefly j, is determined by

    x_i^(t+1) = x_i^(t) + β0 exp(−γ r_ij^2) (x_j^(t) − x_i^(t)) + α ε_i^(t),    (10)

where the second term is due to the attraction, and β0 is the attractiveness at r = 0. The third term is randomization, with α being the randomization parameter, and ε_i^(t) is a vector of random numbers drawn from a Gaussian distribution
at time t. Other studies also use the randomization in terms of ε_i^(t), which can easily be extended to other distributions such as Lévy flights. A comprehensive review of the firefly algorithm and its variants has been carried out by Fister et al. [74,79,75].
In FA, the attractiveness (and light intensity) is intrinsically linked with the inverse-square law of light intensity variations and the absorption coefficient. As a result, there is a novel but nonlinear term of β0 exp[−γ r^2], where β0 is the attractiveness at the distance r = 0, and γ > 0 is the absorption coefficient for light [10].
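A sketch of the single move of Eq. (10) in Python; the function name and parameter defaults are illustrative, not a reference implementation:

```python
import math, random

def firefly_move(xi, xj, beta0=1.0, gamma=1.0, alpha=0.2, rng=random):
    """One move of firefly i towards a brighter firefly j, following Eq. (10)."""
    r2 = sum((a - b) ** 2 for a, b in zip(xi, xj))   # squared distance r_ij^2
    beta = beta0 * math.exp(-gamma * r2)             # attractiveness beta0 * exp(-gamma r^2)
    return [a + beta * (b - a) + alpha * rng.gauss(0.0, 1.0)  # attraction + randomization
            for a, b in zip(xi, xj)]
```

With gamma = 0 and alpha = 0, the attractiveness stays at beta0 and firefly i lands exactly on j; with gamma > 0, the pull weakens with distance, which is what localizes the exploitation.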
The main function of such attraction is to enable an algorithm to converge quickly, because these multi-agent systems evolve, interact and attract, leading to some self-organized behavior and attractors. As the swarming agents evolve, it is possible that their attractor states will move towards the true global optimality.
This novel attraction mechanism in FA is the first of its kind in the literature of nature-inspired computation and computational intelligence. It also motivated and inspired others to design similar or other kinds of attraction mechanisms. Other algorithms that were developed later also used inverse-square laws derived from nature. For example, the charged system search (CSS) used Coulomb's law, while the gravitational search algorithm (GSA) used Newton's law of gravitation. Whatever the attraction mechanism may be, from the meta-heuristic point of view the fundamental principles are the same: they allow the swarming agents to interact with one another and provide a forcing term to guide the convergence of the population.
Attraction mainly provides the mechanisms for exploitation but, with proper randomization, it is also possible to carry out some degree of exploration. However, exploration is better analyzed in the framework of random walks and diffusive randomization. From the Markov chain point of view, random walks and diffusion are both Markov chains. In fact, Brownian diffusion, such as the dispersion of an ink drop in water, is a random walk. Lévy flights can be more effective than standard random walks. Therefore, different randomization techniques may lead to different efficiency in terms of diffusive moves. In fact, it is not clear what amount of randomness is needed for a given algorithm.

5.2 Generation of New Solutions
The ways of generating new solutions affect the performance of an algorithm. There are as many ways of generating solutions as there are variants or algorithms. For example, according to Yang [173], the three major ways of generating new solutions in SI-based algorithms are:

- Uniform random generation between a lower bound L and an upper bound U. Thus, the new solution often takes the form

      x = L + ε (U − L),    (11)

  where ε ∈ [0, 1].
- Local random walks around a current solution (often the best solution), which gives

      x^(t+1) = x^(t) + w,    (12)

  where w is drawn from a Gaussian normal distribution.
- Global Lévy flights provide an efficient way of generating long-jump solutions:

      x^(t+1) = x^(t) + L(λ),    (13)

  where L(λ) obeys a Lévy distribution with the exponent λ.

However, it is very rare for an algorithm to use only one of the above methods. In fact, most algorithms use a combination of the above methods, together with other ways of solution generation.
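The three generation mechanisms of Eqs. (11)-(13) can be sketched in Python; the Lévy step below uses Mantegna's algorithm, which is one common implementation choice rather than something prescribed by Eq. (13):

```python
import math, random

rng = random.Random(7)

def uniform_init(L, U):
    """Eq. (11): x = L + eps * (U - L), with eps uniform in [0, 1]."""
    return L + rng.random() * (U - L)

def local_walk(x, alpha=0.1):
    """Eq. (12): a small Gaussian step around the current (often best) solution."""
    return x + alpha * rng.gauss(0.0, 1.0)

def levy_step(x, alpha=0.01, lam=1.5):
    """Eq. (13): a heavy-tailed jump; step length drawn via Mantegna's algorithm."""
    sigma = (math.gamma(1 + lam) * math.sin(math.pi * lam / 2)
             / (math.gamma((1 + lam) / 2) * lam * 2 ** ((lam - 1) / 2))) ** (1 / lam)
    u, v = rng.gauss(0.0, sigma), rng.gauss(0.0, 1.0)
    return x + alpha * u / abs(v) ** (1 / lam)
```

The first generator is typically used for initialization, the second for exploitation around good solutions, and the third for occasional long exploratory jumps.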

5.3 Right Amount of Diversity via Randomization

As we mentioned earlier, all meta-heuristic algorithms have to use stochastic components (i.e., randomization) to a certain degree. Randomness increases the diversity of the solutions and thus gives an algorithm the ability to jump out of any local optimum. However, too much randomness may slow down the convergence of the algorithm and thus waste a lot of computational effort. Therefore, there is some tradeoff between the deterministic and stochastic components, though it is difficult to gauge what the right amount of randomness in an algorithm is. In essence, this question is related to the optimal balance of exploration and exploitation, which still remains an open problem.
As random walks are widely used for randomization and local search in meta-heuristic algorithms [10,9], a proper step size is very important. As different algorithms use different forms of randomization techniques, it is not possible to provide a general analysis for assessing randomness.
One of the simplest randomization techniques is probably the so-called random walk, which can be represented by the following generic equation

    x^(t+1) = x^(t) + s ε^(t),    (14)

where ε^(t) is drawn from a standard normal distribution with zero mean and unity standard deviation. Here, the step size s determines how far a random walker (e.g., an agent or a particle in meta-heuristics) can go for a fixed number of iterations. Obviously, if s is too large, then the new solution x^(t+1) generated will be too far away from the old solution (or, more often, the current best); such a move is then unlikely to be accepted. If s is too small, the change is too small to be significant, and consequently such a search is not efficient. So a proper step size is important to keep the search as efficient as possible. However, what size is proper may depend on the type of the problem and can also change during the iterations. Therefore, the step sizes, and thus the amount of randomness, may have to be adaptive.
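One simple way to make the step size s of Eq. (14) adaptive is to shrink it geometrically over the iterations, so that early moves explore and late moves refine; the greedy acceptance and the decay schedule below are illustrative choices:

```python
import random

def annealed_walk(f, x0, s0=1.0, decay=0.97, iters=200, seed=3):
    """Random walk of Eq. (14) whose step size s shrinks every iteration;
    only improving moves are accepted (a greedy variant, for illustration)."""
    rng = random.Random(seed)
    best, fbest, s = list(x0), f(x0), s0
    for _ in range(iters):
        cand = [xi + s * rng.gauss(0.0, 1.0) for xi in best]  # x^(t+1) = x^(t) + s eps^(t)
        fc = f(cand)
        if fc < fbest:
            best, fbest = cand, fc
        s *= decay  # adapt the amount of randomness: large steps early, fine steps late
    return best, fbest

sphere = lambda v: sum(u * u for u in v)
best, fbest = annealed_walk(sphere, [3.0, -2.0])
```

More sophisticated schemes adapt s from feedback (e.g., acceptance ratios) rather than from a fixed schedule.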
5.4 Parameter Settings in Computational Intelligence

Biological species live in a dynamic environment. When the environment changes, these changes are followed by living beings changing their behavior, as determined by the corresponding genetic material written in their chromosomes. Those beings that do not follow these changes are eliminated by natural selection. The extinction of mammoths is a well-known example of animals that were not capable of adapting to the new environmental conditions that occurred after the recent Ice Age.
On the other hand, the changing environment of the Galápagos archipelago essentially influenced the adaptive radiation of Darwin's finches. At that time, some islands had disappeared, while some new ones had emerged because of volcanic activity within that region. The tropical climate from before the onset of the recent Ice Age had been changed by global cooling, which crucially influenced the vegetation. Consequently, the ancestral finches acquired longer and narrower beaks better suited to exploring for nectar and insects [3], thus changing their habitat from the trees to living on the ground. In line with this, the ground finches also changed their feeding habits, i.e., in place of nectar and insects they fed on seeds. Those ground finches with shorter beaks were more suitable for this living space and therefore had more chances of surviving and reproducing their genetic material in the next generations. Additionally, mutations ensured the modification of this material, where only successful mutations ensured that individuals survived.
In summary, it can be concluded that the finches adapted to a changing environment with their body size and the shape of their beaks. Both characteristics are written in chromosomes that were changed via crossover and mutation. As a matter of fact, the adaptation process can be viewed from three aspects: when to adapt (environment), what to adapt (chromosomes), and how to adapt (crossover and mutation).
How can we use this adaptation metaphor from biology in computational intelligence (CI)? As stated previously, the problem in EAs relates to the environment in nature. However, this formulation can also be widened to other population-based CI algorithms. If a problem is solved by an algorithm, then its behavior is determined by the algorithm parameters. In other words, the algorithm parameters (also called strategic parameters) control the behavior of the algorithm.
For instance, EAs have several parameters, like the probability of crossover, the probability of mutation, etc. [36]. The former regulates the probability that the crossover operator will be applied to two or more parents, while the latter the probability that mutation will change a generated offspring. The parameters CR and F are used in DE for the same purposes. The other SI-based and ANN algorithms use specific algorithm parameters depending on the biological, physical, chemical, and all other rules that inspired the developers of the new algorithms [30].
An instance of parameter values set during the run is also called a parameter setting. Obviously, different values of parameters, i.e., different parameter settings, can lead to different results and indirectly to different behavior of an algorithm.
Therefore, it can be concluded that CI algorithms adapt their parameters (what?) to the problem to be solved (when?) by changing the algorithm parameters (how?). The link between natural adaptation and adaptation in CI is made in Table 2, where the adaptation domains are analyzed according to three different aspects, i.e., when to adapt, what to adapt and how to adapt.

Table 2. Adaptation in natural and artificial systems

Adaptation   When?        What?        How?
Natural      Environment  Structures   Operators
ANN          Problem      Perceptrons  Learning
EAs and SI   Problem      Parameters   Changing parameter settings

The adaptation in ANNs is embedded into the algorithm's structures, where perceptrons learn how to minimize the error rate. On the other hand, the population-based CI search algorithms improve the fitness by changing the parameter settings. According to Eiben and Smith [36], the algorithm parameters can be changed:

- deterministically,
- adaptively,
- self-adaptively.
Deterministic parameter control takes place when the strategy parameters are changed by some deterministic rule. This deterministic rule is predefined, and therefore no feedback from the search process is necessary. For instance, parameters can be changed according to a time-varying schedule, i.e., when a predefined number of generations has elapsed [36].
Adaptive parameter control means that the strategy parameters are changed according to some form of feedback from the search process. An example of this parameter control is the well-known 1/5 success rule of Rechenberg [51], where the mutation strength (probability of mutation) is increased when the ratio of successful mutations is greater than 1/5, and decreased when the ratio of successful mutations is less than 1/5. In the first case, the search process focuses on exploring the search space, while in the second case on searching around the current solution, i.e., exploiting the search space.
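A sketch of the 1/5 success rule in Python; the adjustment factor c = 0.85 is a commonly used value in the literature, not something fixed by the rule itself:

```python
def one_fifth_rule(sigma, success_ratio, c=0.85):
    """Rechenberg's 1/5 success rule for adapting the mutation strength sigma."""
    if success_ratio > 1 / 5:
        return sigma / c   # many successes: enlarge steps, explore the search space
    if success_ratio < 1 / 5:
        return sigma * c   # few successes: shrink steps, exploit around the current solution
    return sigma           # exactly 1/5: leave the mutation strength unchanged
```

In practice, the success ratio is measured over a window of recent generations before each update.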
With self-adaptive parameter control, the control parameters are encoded into the chromosomes and undergo actions by the variation operators (e.g., crossover and mutation). The better values of the problem variables and control parameters have more chances to survive and reproduce their genetic material in the next generations. This phenomenon makes EAs more flexible and closer to natural evolution [53]. This feature was first introduced in ES by Schwefel [52].
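In an ES-style sketch (Python), each individual carries its own mutation strength, which is itself mutated log-normally before being used; the learning rate tau = 1/sqrt(n) is a typical choice, and the helper name is illustrative:

```python
import math, random

def self_adaptive_mutate(x, sigma, rng=random):
    """Self-adaptive mutation: the control parameter sigma travels in the
    chromosome and undergoes variation before mutating the solution itself."""
    tau = 1.0 / math.sqrt(len(x))                            # typical learning rate
    new_sigma = sigma * math.exp(tau * rng.gauss(0.0, 1.0))  # log-normal sigma update
    new_x = [xi + new_sigma * rng.gauss(0.0, 1.0) for xi in x]
    return new_x, new_sigma

r = random.Random(0)
child, child_sigma = self_adaptive_mutate([0.0, 0.0, 0.0, 0.0], 0.5, rng=r)
```

Selection then acts on the pair (solution, sigma) together, which is how good step sizes propagate without any external feedback rule.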
Parameter control addresses only one side of parameter setting, where the strategic parameters are changed during the run. In contrast, when the parameters are fixed during the run, an optimal parameter setting needs to be found by
the algorithm's developer. Typically, these optimal parameters are searched for during tuning. In general, the taxonomy of parameter setting according to Eiben and Smith [36] is as illustrated in Fig. 8.

Fig. 8. Parameter setting in CI algorithms

Obviously, different values of the strategic parameters may lead to different results, i.e., the results obtained with one parameter setting can be better than with another, and vice versa. In order to find the best parameter setting, tuning of the parameters is performed, which demands extensive experimental work. This work can increase enormously when the algorithm has more parameters to be tuned, and when an analysis of how the individual parameters combine must also be taken into consideration [17].

6 Hybridization in Computational Intelligence
This section deals with hybridization in CI. Here, we focus on the nature-inspired CI algorithms. According to their characteristics, two types of hybridization in CI can be considered: hybridization in ANNs and hybridization in population-based CI search algorithms. Actually, it is hard to treat both types of hybridization separately, because the hybridization becomes a powerful bond that connects the individual algorithms under the same umbrella. In line with this, the boundaries between the individual algorithms composing such a hybrid algorithm are erased, while the hybrid algorithm operates as a homogeneous unit when solving the hardest real-world problems.
In the remainder of the chapter, hybridizations of ANNs and population-based CI search algorithms are presented in detail.

6.1 Hybridization in Neural Networks
The hybridization of ANNs with EAs and SI-based algorithms is aimed at solving two optimization problems arising during the application of ANNs. The first
problem arises because gradient-based methods for ANN training are susceptible to getting stuck in local optima on complex error surfaces. For such cases, global search methods like EAs and SI-based algorithms can provide a robust and efficient approach for weight optimization. The second problem arises because the optimal network structure for a specific task is rarely known in advance and is usually determined by an expert through a tedious experimentation process. When using an EA or SI-based algorithm, the network topology can be dynamically adapted to the problem at hand by the insertion and removal of neurons or the connections between them.
The field of neuro-evolution provides a unified framework for the adaptive evolution and training of neural networks. In neuro-evolution, the ANN structure and weights are adaptively developed using one of the nature-inspired optimization methods with a problem-specific fitness function. We can distinguish three groups of neuro-evolutionary methods, depending on whether the network parameters (i.e., weights), the topology, or both are evolved. Further, because the concept of applying either EAs or SI-based methods to the training and evolution of ANNs is very similar, we regard them all under the term neuro-evolution in this text (Fig. 9).

Fig. 9. Hybridization in ANNs

6.2 Hybridization in Population-Based CI Search Algorithms
EAs and SI-based algorithms belong to the class of population-based CI search algorithms. This means that these algorithms maintain a population of solutions in place of a single-point solution during the run. While the single-point search algorithms deal with single points within a fitness landscape, population-based algorithms investigate sub-regions of points within the same landscape. Beside this inherent parallelism, the population-based search algorithms are more likely to provide a better balance between the simultaneous exploration of these sub-regions and the exploitation of the knowledge accumulated in the representation of the solutions.
As a result, the population-based search algorithms, like EAs and SI-based algorithms, rely on balancing exploration and exploitation within the search process [36]. The former is connected with discovering new solutions, while the latter with directing the search process into the vicinity of good solutions. Both components of the search process are controlled indirectly by the algorithm's parameters. Therefore, suitable parameter settings can have a great impact on the performance of the population-based search algorithms. Actually, these algorithms operate correctly when sufficient population diversity is present. The population diversity can be measured as: the number of different fitness values, the number of different phenotypes, entropy, and others [21]. The higher the population diversity, the better the exploration of the search space. Losing population diversity leads to premature convergence. In SI, stagnation can also occur, where the current best solution can no longer be improved [23].
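Two of the mentioned diversity measures can be sketched in Python (the helper names are illustrative):

```python
import math
from collections import Counter

def fitness_diversity(fitnesses):
    """Diversity as the number of different fitness values in the population."""
    return len(set(fitnesses))

def fitness_entropy(fitnesses):
    """Diversity as the Shannon entropy of the fitness distribution: zero when
    all individuals share one fitness value, larger for more varied populations."""
    n = len(fitnesses)
    return -sum((c / n) * math.log2(c / n) for c in Counter(fitnesses).values())
```

Such measures are cheap to evaluate each generation and can trigger diversity-restoring actions when they fall below a threshold.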
In general, the population-based search algorithms can be considered as general problem solvers that can be successfully applied to the many NP-hard problems occurring in practice. Unfortunately, the metaphor "general problem solver" does not mean that they obtain the best solution for each of our problems. In this sense, they act similarly to a Swiss Army knife [54] that can be used to address a variety of tasks. Definitely, the majority of tasks can be performed better using specialized tools but, in the absence of these tools, the Swiss Army knife may be a suitable replacement for them. For instance, when slicing a piece of bread, the kitchen knife is more suitable, but when traveling the Swiss Army knife is fine.
Although population-based CI algorithms provide adequate solutions for most real-world problems and therefore can even be applied to domains where problem-specific knowledge is absent, they perform worse when solving problems from domains where a lot of problem-specific knowledge has to be explored. This is consistent with the so-called No-Free-Lunch theorem [18], arguing that any two algorithms are equivalent when their average performances are compared across all classes of problems. This theorem, which in fact destroys our dreams about developing a general problem solver, can fortunately be circumvented for a specific problem by hybridizing, i.e., incorporating problem-specific knowledge into the algorithm. However, no exact solutions of the problems are needed in practice, and therefore the primary task is to find an efficient tool for solving a specific class of problems effectively.
On the one hand, the integration of population-based search algorithms with one or more refinement methods, in order to conduct problem-specific knowledge within the stochastic search process, represents a synergistic combination that often enhances the performance of the population-based search algorithms [24]. On the other hand, this synergistic combination of population-based search and refinement methods is capable of better balancing between exploration and exploitation within the stochastic search process. Obviously, the population-based search is more explorative, while the refinement methods act more exploitatively. Mostly, the refinement methods address the following elements of the population-based search algorithms [55]:
- initial population,
- genotype-phenotype mapping,
- evaluation function, and
- variation and selection operators.
This chapter has focused on population-based CI search algorithms composed within the evolutionary framework. In line with this, the typical refinement methods applied within this class of algorithms are as follows:

- automatic parameter tuning,
- hybridization of components,
- construction heuristics,
- local search heuristics (also memetic algorithms [19,20]).
In the remainder of the chapter, these refinement methods are illustrated in detail. This section concludes with a case study that presents how hybridization can be performed in typical EAs.

Automatic Parameter Tuning. As an algorithm is a set of interacting Markov chains, we can in general write an algorithm as

    (x1, ..., xn)^(t+1) = A[x1, ..., xn; p1, ..., pk; ε1, ..., εm] (x1, ..., xn)^(t),    (15)
which generates a set of new solutions (x1, ..., xn)^(t+1) from the current population of n solutions. The behavior of an algorithm is largely determined by the eigenvalues of the matrix A, which are in turn controlled by the parameters p = (p1, ..., pk) and the randomness vector ε = (ε1, ..., εm). From Markovian theory, we know that the first largest eigenvalue is typically 1, and therefore the convergence rate of an algorithm is mainly controlled by the second largest eigenvalue 0 ≤ λ2 < 1 of A. However, it is extremely difficult to find this eigenvalue in general. Therefore, the tuning of parameters becomes a very challenging task.
Parameter tuning can be defined as an optimization problem that searches for those values of the strategic parameters that optimize the performance of the population-based CI search algorithm [17]. In fact, parameter tuning, or tuning of parameters, is an important topic under active research [17,177]. The aim of parameter tuning is to find the best parameter setting so that an algorithm can perform most efficiently for a wider range of problems. At the moment, parameter tuning is mainly carried out by detailed, extensive parametric studies, and there is no efficient method in general. In essence, parameter tuning itself is an optimization problem which requires higher-level optimization methods to tackle. However, a recent study has shown that a framework for self-tuning algorithms can be established with promising results [177].
In summary, studying how the algorithm depends on its parameters is often of interest to the algorithm's designer. However, both mentioned tasks occur in parameter tuning, which can be conducted either manually by a designer or automatically by an algorithm. Because manual parameter setting is time consuming, automatic parameter tuning is increasingly prevailing. Here, a traditional population-based CI search algorithm can be used for automatic tuning. In this approach, one population-based CI search algorithm controls the performance of another by changing its parameter setting, while the other algorithm solves the original problem and therefore works within the corresponding problem space. The control algorithm operates in the parameter space of the controlled algorithm, i.e., at the higher level. Therefore, this approach is also named meta-heuristic and was introduced by Grefenstette in 1986 [34]. Recently, the word meta-heuristic (meaning "higher-level" [9]) has become used for any combination of population-based CI search algorithms and appropriate refinement methods.
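The control/controlled split can be sketched in Python with a simple random-search tuner at the higher level; a population-based tuner would replace the random sampling, and the toy objective below is purely illustrative:

```python
import random

def tune(run_algorithm, param_grid, trials=20, seed=9):
    """Higher-level tuner operating in the parameter space of a lower-level
    algorithm: sample parameter settings and keep the best-scoring one."""
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    for _ in range(trials):
        params = {name: rng.choice(values) for name, values in param_grid.items()}
        score = run_algorithm(**params)  # the controlled algorithm solves the problem here
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score

# toy controlled "algorithm" whose result quality depends on its own parameters
toy = lambda step, decay: abs(step - 0.3) + abs(decay - 0.9)
params, score = tune(toy, {"step": [0.1, 0.3, 0.5], "decay": [0.8, 0.9, 0.99]})
```

The key point is the two search spaces: `tune` works in the parameter space, while `run_algorithm` works in the problem space.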

Hybridization of Components. The EA domain has matured over more than 50 years of development. A small number of problems in science, as well as in practice, remain untouched by the evolutionary approach. In line with this, many prominent experts have emerged within this domain, together with several original solutions developed by solving this huge range of problems. These original solutions were mostly aimed at developing new evolutionary operators, population models, elitism, etc.
Typically, SI-based algorithms borrow the DE operators of mutation and
crossover, which replace the original move operator in order to increase the effi-
ciency of the SI-based search process. Obviously, the DE variation operators
are effective because of their exploration and exploitation power. For instance,
Fister et al. in [31] hybridized the BA algorithm with the DE/rand/1/bin strategy
of applying mutation and crossover, and reported significant improvements
compared with the original BA algorithm, as well as other well-known algo-
rithms, like ABC, DE and FA.
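The DE/rand/1/bin operator itself is compact; the sketch below shows how such a trial vector could be generated for the i-th member of a swarm (an illustrative sketch, not the exact HBA implementation of [31]):

```python
import random

def de_rand_1_bin(population, i, F=0.5, CR=0.9):
    """DE/rand/1/bin: mutate using three random distinct members (rand/1),
    then binomially cross the mutant with the i-th member (bin)."""
    dim = len(population[i])
    a, b, c = random.sample([j for j in range(len(population)) if j != i], 3)
    j_rand = random.randrange(dim)  # guarantees at least one mutated component
    return [population[a][k] + F * (population[b][k] - population[c][k])
            if random.random() < CR or k == j_rand
            else population[i][k]
            for k in range(dim)]

# e.g., replacing a bat's random-walk move with a DE trial vector:
swarm = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(10)]
trial = de_rand_1_bin(swarm, i=0)
```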

Construction Heuristics. Usually, population-based CI search algorithms are
used for solving those problems where a lot of knowledge has to be accumulated
within different heuristic algorithms. Unfortunately, those algorithms operate
well on a limited number of problems. On the other hand, population-based CI
search algorithms are in general more mature and therefore prepared for solving
various classes of problems, although they suffer from a lack of problem-
specific knowledge. In order to combine the advantages of both, population-based
CI search algorithms are used for discovering new solutions within the search
space, and exploiting these for building new, possibly better solutions.
Adaptation and Hybridization in Nature-Inspired Algorithms 33

Construction heuristics build solutions incrementally, i.e., elements are added
to the solution step by step until the final solution is obtained (Algorithm 4).

Algorithm 4. Pseudo-code of construction heuristic

1: y = ∅
2: while solution y ∈ S not found do
3:   add element yi ∈ I to solution y heuristically
4:   move to the next element
5: end while

Greedy heuristics are the simplest type of construction heuristics; they add new
elements to a solution according to the value of the current heuristic function,
which maximizes (or minimizes) the current non-final set of elements during each
construction step. When stochastic construction heuristics [60] are used,
the result of construction may depend on chance. As a result, combining
population-based CI search algorithms, which are stochastic in their nature,
with stochastic construction heuristics forms a synergy suitable for solving
the hardest real-world problems.
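A classic concrete instance of a greedy construction heuristic is the nearest-neighbour rule for the traveling salesman problem; the sketch below builds a tour element by element, always adding the city that minimizes the current heuristic value (the distance from the last added city):

```python
import math

def nearest_neighbour_tour(cities):
    """Greedy construction: start from city 0 and repeatedly append the
    closest unvisited city until the final solution (a full tour) is built."""
    unvisited = set(range(1, len(cities)))
    tour = [0]                      # partial solution y
    while unvisited:                # while the solution is not complete
        last = cities[tour[-1]]
        nxt = min(unvisited, key=lambda j: math.dist(last, cities[j]))
        tour.append(nxt)            # add element y_i to solution y
        unvisited.remove(nxt)
    return tour

tour = nearest_neighbour_tour([(0, 0), (0, 1), (2, 0), (2, 1)])  # -> [0, 1, 3, 2]
```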

Memetic Algorithms. The hybridization of population-based CI search algo-
rithms with local search methods is also known as memetic algorithms (MA).
The term MA originated with Moscato in 1989 [56] and means: just as genes
form the instructions for building proteins in genetics, memes are instructions
for carrying out behavior, stored in brains [24]. The term meme was intro-
duced by Dawkins in his famous book The Selfish Gene [58]. In computer science
and engineering, a meme represents the smallest piece of knowledge that can be
replicated, modified and combined with other memes in order to generate a new
meme [22].
Interestingly, there is a difference between the evolution of memes and the evo-
lution of genes. While the former does not alter the memetic information at
this stage, the latter modifies the genetic information during the variation pro-
cess. However, both changes have their own metaphor in biology. The first can
be attributed to the Baldwinian model of evolution, arguing that behavioral char-
acteristics can also be learned during the lifetime of an individual and are therefore
not written in genes, while the second is inspired by the Lamarckian model of
evolution, stating that all behavioral characteristics are written in genes.
A local search [59] is an iterative process of investigating the set of points
in the neighborhood of the current solution and exchanging it when a better
solution is found [60]. The neighborhood of the current solution y is defined as
a set of solutions achieved by using the elementary operator N : S → 2^S. All
points in the neighborhood N are reached from the current solution y in k moves.
Therefore, this set of points is also named the k-opt neighborhood of point y.

Algorithm 5. Pseudo-code of local search

1: generate initial solution y ∈ S
2: repeat
3:   find the next neighbor y′ ∈ N(y)
4:   if f(y′) < f(y) then
5:     y = y′
6:   end if
7: until neighbor set is empty
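For a bit-string representation, the local search of Algorithm 5 with a 1-flip (k = 1) neighbourhood can be sketched as follows (a best-effort illustration, not code from the chapter's authors):

```python
def local_search(f, y):
    """Iteratively scan the 1-flip neighbourhood N(y); exchange the current
    solution whenever a better neighbour is found, stop at a local optimum."""
    fy = f(y)
    improved = True
    while improved:
        improved = False
        for i in range(len(y)):                       # enumerate N(y)
            neighbour = y[:i] + [1 - y[i]] + y[i + 1:]
            fn = f(neighbour)
            if fn < fy:                               # better solution found
                y, fy = neighbour, fn
                improved = True
    return y, fy

# minimising the number of ones drives the search to the all-zero string
y, fy = local_search(lambda bits: sum(bits), [1, 0, 1, 1])  # -> ([0, 0, 0, 0], 0)
```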

It should be noted that MAs represent the simplest class of so-called meme-
inspired computation (MC), also known as simple hybrids according to Chen et al.
in [24]. Recently, MAs have merged with the field of hybridization with adap-
tation. In line with this, several studies have emerged that extended the
concept of adaptation from parameters to operators [25], which represents
the next step in the evolution of MC, i.e., adaptive hybrids. In contrast to
simple hybrids, in which domain knowledge is captured and incorporated
only once by a human expert during the design of MAs, adaptive hybrids incorporate
adaptive strategies and adaptive parameters in order to be better suited to solving
the problem as the search process progresses [24,26]. To date, the furthest step in the
evolution of MC is represented by memetic automation, already described in Sec-
tion 1 [28]. In the context of MC, all mentioned refinement methods represent
an attempt to use memes as carriers of various kinds of knowledge [27].
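Putting the two ingredients together, a minimal memetic algorithm interleaves an EA generation loop with a local refinement of each offspring; the following sketch is a schematic simple hybrid in the sense of [24], with illustrative parameter choices:

```python
import random

def refine(f, y):
    """Meme: one pass of first-improvement 1-flip local search."""
    fy = f(y)
    for i in range(len(y)):
        n = y[:i] + [1 - y[i]] + y[i + 1:]
        fn = f(n)
        if fn < fy:
            y, fy = n, fn
    return y

def memetic_ga(f, dim=10, pop_size=8, gens=20):
    pop = [[random.randint(0, 1) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=f)
        parents = pop[:pop_size // 2]            # truncation selection
        children = []
        for _ in range(pop_size - len(parents)):
            p, q = random.sample(parents, 2)
            cut = random.randrange(1, dim)       # one-point crossover
            child = p[:cut] + q[cut:]
            j = random.randrange(dim)            # bit-flip mutation
            child[j] = 1 - child[j]
            children.append(refine(f, child))    # the memetic step
        pop = parents + children
    return min(pop, key=f)

best = memetic_ga(lambda bits: sum(bits))        # minimise the number of ones
```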

Case Study: Hybridization of EAs. Fig. 10 illustrates some possibilities of how
and where to hybridize EAs. In general, other population-based CI search
algorithms, e.g., SI-based ones, can be hybridized in a similar way.

Fig. 10. How to hybridize EAs



At first, the initial population can be generated by incorporating the solutions
of existing algorithms or by using heuristics, local search, etc. In addition,
local search can be applied to the population of offspring. Evolutionary oper-
ators (e.g., mutation, crossover, parent and survivor selection) can incorporate
problem-specific knowledge or apply operators taken from other algorithms.
Finally, the fitness function evaluation offers further possibilities for hybridization.
As a matter of fact, it can be used as a decoder that decodes an indirectly rep-
resented genotype into a feasible solution. By this mapping, various
kinds of problem-specific knowledge or even traditional heuristics can be
incorporated within the algorithm.
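The decoder idea can be made concrete with a small graph-coloring example: the genotype is a vector of vertex priorities, and a greedy heuristic inside the fitness function maps it to a feasible coloring (an illustrative sketch, not the algorithm of any cited work):

```python
def decode_and_evaluate(genotype, adjacency):
    """Fitness as decoder: order the vertices by genotype priority, colour each
    greedily with the smallest colour unused by its already-coloured
    neighbours, and return the number of colours as the fitness value."""
    order = sorted(range(len(genotype)), key=lambda v: -genotype[v])
    colour = {}
    for v in order:
        used = {colour[u] for u in adjacency[v] if u in colour}
        c = 0
        while c in used:
            c += 1
        colour[v] = c
    return max(colour.values()) + 1

# a 4-cycle is 2-colourable; the genotype only supplies the decoding order
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
fitness = decode_and_evaluate([0.9, 0.4, 0.8, 0.1], adj)  # -> 2
```

Every genotype decodes to a feasible colouring, so the EA can search the genotype space freely while the problem-specific knowledge lives entirely in the decoder.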

7 Applications in Computational Intelligence


Applications of the various stochastic population-based CI search algorithms are
very diverse, and therefore it is hard to review all the recent developments. In
this chapter, we briefly outline some interesting studies.

7.1 Adaptive EAs


EAs have usually been connected with parameter adaptation and self-adaptation. Dif-
ferent forms of adaptation and self-adaptation were also applied to the original
DE in order to improve its performance. For instance, Qin and Suganthan [105]
developed a self-adaptive DE (SaDE). In this version, the learning strategy and the DE
control parameters F and CR are not required to be known in advance. That
means the learning strategy and parameters are self-adapted during the run ac-
cording to the learning experience. Brest et al. [64] proposed a DE variant called
jDE. Here, control parameters are self-adaptively changed during the evolution-
ary process. Another variant of self-adaptive DE with neighborhood search
was proposed by Yang [112]. GAs also encompass an enormous amount of work in the
adaptation and self-adaptation domain. In line with this, a very interesting approach was
proposed by Hinterding et al. [89], which self-adapts mutation strengths and pop-
ulation size. Deb and Beyer [68] developed a self-adaptive GA with simulated
binary crossover (SBX). More complete reviews of the other works in this
domain can be found in [32,33,65,69,110].
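The jDE self-adaptation rule is simple enough to state in a few lines: each individual carries its own F and CR, which are regenerated before the variation step with small probabilities τ1 and τ2 (the constants below follow the published values of Brest et al. [64]):

```python
import random

def jde_update(F, CR, tau1=0.1, tau2=0.1, F_l=0.1, F_u=0.9):
    """jDE: occasionally resample an individual's own F and CR, so that
    good settings survive together with the individuals that carry them."""
    if random.random() < tau1:
        F = F_l + random.random() * F_u      # new F uniform in [0.1, 1.0]
    if random.random() < tau2:
        CR = random.random()                 # new CR uniform in [0.0, 1.0]
    return F, CR

F, CR = jde_update(0.5, 0.9)
```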

7.2 Adaptive SI-Based Algorithms


Adaptations in SI have been used less frequently than hybridizations. Usually, adapta-
tion is connected with the adaptation and self-adaptation of control parameters,
mutation strategies, learning, etc. Some adaptive forms of ABC were pro-
posed in order to improve the search ability of the algorithm, to avoid local optima,
and to speed up convergence. For instance, Liao et al. [97] developed an ABC
algorithm and applied it to long-term economic dispatch in cascaded hydro-
power systems. Furthermore, Pan et al. [103] added a self-adaptive strategy for
generating neighboring food sources based on insert and swap operators, which
allows the algorithm to work on discrete spaces. Alam and Islam proposed an
interesting ABC variant called artificial bee colony with self-adaptive mutation
(ABC-SAM), which tries to dynamically adapt the mutation step size with which
bees explore the problem search space. In line with this, small step sizes serve
the exploitation component, while large mutation steps serve the exploration
component of the ABC search process.
On the other hand, some interesting adaptations have also been applied to the
bat algorithm (BA). For example, Fister et al. [73] proposed a self-adaptive bat
algorithm (SABA), based on the self-adaptation mechanism borrowed from the
jDE algorithm. In contrast, adaptation or self-adaptation in cuckoo search (CS)
has yet to be developed.
However, there are some adaptive and self-adaptive variants of the FA. For
instance, Fister et al. extended the original FA with the self-adaptation of con-
trol parameters, also called MSA-FFA, and achieved a better balance between
exploration and exploitation of the search process. They tested their proposed
approach on graph coloring and showed very promising results [63]. This
MSA-FFA was modified by Gálvez and Iglesias and adapted for continuous op-
timization problems [81]. Yu et al. [113] proposed a self-adaptive step FA to
avoid falling into local optima and to reduce the impact of the maximum
number of generations. The authors' core idea was to vary the step of each firefly
with the iteration according to the current situation and also the historical informa-
tion of the fireflies. Roy et al. [106] developed a FA variant using self-adaptation of
the algorithm's control parameter values by learning from the fireflies' previous
experiences, which led to a more efficient algorithm.
Adaptations for improving PSO are widespread in many papers describing
many applications. Since there are many efficient PSO variants, readers can
refer to the following papers [99,109,115,90,114,95].

7.3 Hybrid ANN+EAs

There is a vast body of literature on the subject of combining EA and ANN,
which is nicely assembled in an indexed bibliography [117]. Early neuro-evolution
approaches focused on ANN training and demonstrated the superior efficiency of EA
methods over traditional back-propagation training in many domains [169]. The
shift towards the evolution of network topology required consideration of efficient
encoding schemes to resolve the problem of multi-way genotype-to-phenotype
maps and to avoid small genotypic mutations resulting in vastly different pheno-
types [140,164]. Most of the work in the last two decades has concentrated on the
simultaneous evolution of both weights and topology, where various paradigms
of EAs have been employed for the evolution of neural networks.
For example, Angeline et al. proposed an approach based on evolutionary
programming (EP) to build recurrent neural networks [119]. A similar EP-based
method for feed-forward ANNs was presented by Yao and Liu [165]. More re-
cently, Oong and Isa described a hybrid evolutionary ANN (HEANN) in which
both the weights and topology were evolved using an adaptive EP method [151].

The symbiotic adaptive neuro-evolution (SANE) method by Moriarty used cooperative
coevolution to evolve neural networks that adapt to input corruption [145,133].
NeuroEvolution of Augmenting Topologies (NEAT) is an approach that evolves
the network topology and adjusts the weights using the genetic algorithm [160,159].
Later, Stanley introduced HyperNEAT, which used a compositional pattern pro-
ducing network as a developmental encoding scheme and was aimed at evolving
large neural networks [158]. HyperNEAT is able to capture symmetries in the ge-
ometric representation of the task and was extended by Risi into Evolvable Sub-
strate HyperNEAT (ES-HyperNEAT), which added adaptive density of hidden
neurons [152]. The evolution of adaptive networks using an improved developmental en-
coding that outperformed HyperNEAT was proposed by Suchorzewski [161]. A
multi-objective approach to the evolution of ART networks with adaptive param-
eters for the genetic algorithm was proposed in a PhD thesis by Kaylani [135].
Hierarchical genetic algorithms, which used parametric and control genes to con-
struct the chromosome, were applied to neuroevolution by Elhachmi and Guen-
noun [126]. On the side of ANN training procedures, the focus in recent years has been
on novel combinations of GA with gradient-based or local optimization methods,
which were used to address the problem of stock market time-series prediction [120]
and to optimize multi-objective processes in material synthesis [128].
Evolution strategy (ES) was regarded as a driving mechanism of ANN evo-
lution by Matteucci [141]. In place of ES, Igel used the evolution strategy with
adaptive covariance matrix (CMA-ES) as the neuroevolutionary method in [131].
Kassahun and Sommer presented an improved method called Evolutionary Ac-
quisition of Neural Topologies (EANT), which used a more efficient encoding and
balanced exploration/exploitation of useful ANN structures [134].
Adaptive differential evolution (ADE) is among the most recent methods used to
train multi-layer ANNs, used by Silva [124], Slowik [157], and Sarangi et al. [153].
Memetic variants of DE were used to solve prediction problems in medicine and
biology [127,122]. Cartesian genetic programming was used by several authors
to efficiently encode evolvable ANNs [137,136,162].

7.4 Hybrid ANN+SI


More recently, ANNs have been coupled with SI-based algorithms. Particle
swarm optimization (PSO) was combined with the classical back-propagation
(BP) learning method for the training of feed-forward neural networks by Zhang
et al. [167]. Very recently, a similar hybridization of PSO with a simplex op-
timization method was proposed by Liao et al. [139]. A hybrid of PSO and the
gravitational search algorithm (GSA) outperformed each individual method in
ANN training benchmarks [144]. Sermpinis et al. have used the PSO method
with adaptive inertia, cognitive, and social factors to improve the performance
of a radial basis function (RBF) network in the task of exchange rate forecast-
ing [154]. A similar approach by Zhang and Wu uses adaptive chaotic PSO to
train the ANN in a crop classification task [168].
The successful application of PSO in ANN training was followed by the use
of other SI-based algorithms. A hybrid of BP and the ACO algorithm was used in
ANN for financial forecasting [129]. The domain of stock forecasting attracted
researchers who hybridized ANNs with the ABC algorithm [150] and the fish al-
gorithm [156]. A related application of ABC to earthquake time-series prediction
is due to Shah et al. [155].
Additionally, for the most recent SI-based algorithms, adaptive hybridizations
of ANNs with the FA [142,146], the BA [148], the CS [147], and a hunting algo-
rithm/harmony search combination [138] have also been carried out with good
results.
ANN training was also approached using population-based algorithms
which are not strictly nature-inspired, such as the magnetic optimization algorithm
[143], chemical reaction optimization [166], and artificial photosynthesis and pho-
totropism [123].
While the majority of hybrid ANN+SI-based approaches are concerned with
ANN training, the evolution of both weights and topology using PSO was pre-
sented by Garro et al. [130] and by Ali [118]. A version of PSO called jumping
PSO was recently used by Ismail and Jeng to obtain a self-evolving ANN [132].

7.5 Hybrid EAs

There are many hybrid variants of EAs. Most studies in this domain are based
on hybridization with local search, and recently also on borrowing some princi-
ples from SI. In line with this, Grimaccia et al. [84] combined properties of PSO
and GA, and tested the performance on the optimization of electromagnetic struc-
tures. Galinier and Hao in [80] proposed a hybrid EA (HEA) for graph coloring.
Their algorithm combined highly specialized crossover operators with Tabu
search [83]. GA-EDA [104] is a good example of a hybrid EA which uses genetic
and estimation of distribution algorithms. Niknam [102] developed a new EA
algorithm called DPSO-HBMO, which is based on the combination of honey bee
mating optimization [87] and discrete PSO [171]. Lin [98] proposed a new EA
combining DE with the real-valued GA.

7.6 Hybrid SI-Based Algorithms

In order to improve the original SI-based algorithms, researchers usually hybridize
these with other meta-heuristics, different local searches, fuzzy logic, machine
learning methods and other mathematical principles. This chapter briefly
summarizes some SI-based hybrids.
ACO has been hybridized in many applications. For instance, Chitty and Her-
nandez [67] developed a hybrid technique which added the principles of dynamic
programming to ACO for solving the problem of dynamic vehicle routing. On
the other hand, Wang et al. [107] proposed a hybrid routing algorithm for mobile
ad hoc networks based on ACO and the zone routing framework of border-
casting. Hybrid ACO was also applied to cope with the well-known job-
shop scheduling problem in the study [88]. Moreover, Duan and Yu [70] applied a hybrid
ACO using a memetic algorithm for solving the traveling salesman problem.

ABC has also been hybridized in many papers to enhance its performance and
efficiency. Duan et al. [71] proposed an ABC and quantum EA, where the ABC
was adopted to increase the local search capacity and also the randomness of the
populations. Data clustering was improved with a hybrid ABC (HABC) [111],
where the authors introduced the crossover operator of the genetic algorithm to ABC to
enhance the information exchange between bees. Large-scale global optimization was
tackled by the memetic ABC (MABC) algorithm [72], where the original ABC was
hybridized with two local search heuristics: the Nelder-Mead algorithm (NMA)
and the random walk with direction exploitation (RWDE), in order to obtain a
better balance between exploration and exploitation. Moreover, a hybrid simplex
ABC algorithm (HSABC), which combines NMA with ABC, was proposed and
applied to solving inverse analysis problems [92]. An interesting hybrid
variant of ABC was also applied to solve graph coloring problems [77].
Many hybrid variants of the BA have also been developed, which try to enhance its
efficiency, performance, quality of solutions, and convergence speed. A hybrid
BA with path relinking was proposed by Zhou et al. [116], where the authors in-
tegrated the greedy randomized adaptive search procedure (GRASP) and path
relinking into the BA, and applied it to the capacitated vehicle routing problem. Fister
et al. [76] created a hybrid BA (HBA) in order to combine the original BA with
DE strategies as a local search instead of the classic random walk. An extension of
the SABA was made by the same authors in [31], where they hybridized the SABA
(HSABA) with ensemble DE strategies that were used as a local search for
improving the current best solution, directing the swarm of solutions towards the
better regions within a search space. Wang and Guo developed a novel hybrid
BA with harmony search and applied it to global numerical optimization [85].
Chandrasekaran and Simon [66] proposed a hybrid CS (HCS) algorithm that
was integrated with a fuzzy system in order to cope with multi-objective unit
commitment problems. Layeb [94] developed a novel quantum-inspired CS that
connects the original CS with quantum computing principles. The main advan-
tage of this hybridization was a good balance between exploration and exploita-
tion during the search process. Li and Yin [96] created a new hybrid variant of
CS called the CS-based memetic algorithm and applied it to solving permutation
flow shop scheduling problems. Since the creation of CS, a diverse range of hy-
brid variants of this algorithm have emerged. Therefore, readers are invited to read
the review of these algorithms in the paper [91].
FA is another example of a very successful SI-based algorithm that has experienced
many promising hybridizations since its birth in 2008. Although a comprehen-
sive description of this algorithm was given in the papers [74,75], let us present
only some efficient and recent hybrid variants of the FA. Kavousi-Fard et al. [93]
combined a support vector machine (SVM) and a modified FA in order to get
a hybrid prediction algorithm, and applied it to short-term electrical load
forecasting. Guo et al. in [86] combined FA with harmony search. The result of this
hybridization was an effective algorithm for solving global numerical opti-
mization problems. On the other hand, Fister et al. [78] developed a memetic FA
(MFFA) and applied it to graph coloring problems. An interesting approach to
the distributed graph coloring problem, based on the calling behavior of Japanese
tree frogs, was accomplished by Hernández and Blum in [170].
PSO has undergone many hybridizations suitable for continuous and combinato-
rial optimization. For instance, Lovbjerg et al. [100] created a hybrid PSO that
borrowed some concepts from EAs. A very interesting method was proposed
by Marinakis and Marinaki [101], where the authors developed a new approach based
on PSO, the greedy randomized adaptive search procedure and expanding neigh-
borhood search. This algorithm was then tested on the probabilistic traveling
salesman problem. Zhang et al. proposed a DEPSO algorithm [172], which com-
bined PSO with DE operators, while Wang and Li [108] combined PSO with
simulated annealing (SA).
Obviously, there are other developments and applications, but the purpose of
this chapter is not to review all of them. Therefore, interested readers can refer
to more specialized literature.

8 Conclusion

Adaptation has become the metaphor for the reactions of a natural or artificial system
to the conditions of a changing environment. There is a lot of renewed inter-
est in this area. Therefore, this chapter starts from a definition of adaptive sys-
tems and identifies the human domains that already deal with this phenomenon.
Adaptation has also been encountered in the domain of problem-solving. In or-
der to solve these problems, developers usually try to develop new algorithms
imitating the main characteristics of natural processes. Interestingly, nature
does not only pose questions, but also provides answers on how to solve them.
These answers provide diverse sources of inspiration for scientists seeking
to solve their problems.
Researchers have always been trying to find a general problem solver suit-
able for solving all classes of real-world problems. However, this is usually not
possible, as constrained by the NFL theorem. Hybridization of nature-inspired
algorithms may partly overcome the limitations of the NFL theorem when solv-
ing a specific problem, by incorporating problem-specific knowledge in the
algorithm structures. In line with this, some popular hybridization methods have
been presented in this chapter, with emphasis on memetic algorithms. The
initial idea of hybridizing population-based CI nature-inspired algorithms
with local search has led to the emergence of a new area in CI, i.e., memetic
computation, which represents a class of new general problem solvers suitable
for solving the hardest real-world problems.
Here, we have identified the three main sources of inspiration that are most
commonly used nowadays for the development of new nature-inspired al-
gorithms, i.e., the human brain, Darwinian natural selection, and the behavior of
some social insects and animals. In line with this, three classes of nature-
inspired algorithms have emerged, in general: ANNs, EAs and SI-based algorithms.
All the mentioned classes of algorithms, placed under the umbrella of CI, are de-
scribed in detail throughout this chapter. The descriptions of these algorithms
are given with an emphasis on the adaptation and hybridization that can be applied in
order to increase their performance. At the end, the papers tackling recent
advances in these CI domains are reviewed shortly.
In summary, we hope that this chapter (and the chapters in the book) contains
sufficient information to inspire researchers to begin searching for solutions in
the beautiful dynamic world represented by adaptation and hybridization
in CI.

References
1. Beni, G., Wang, J.: Swarm Intelligence in Cellular Robotic Systems. In: Proceed-
ings of NATO Advanced Workshop on Robots and Biological Systems, Tuscany,
Italy, pp. 26–30 (1989)
2. Turing, A.M.: Computing machinery and intelligence. Mind, 433–460 (1950)
3. Grant, P.R., Grant, B.R.: Adaptive Radiation of Darwin's Finches. American
Scientist 90(2), 130–150 (2002)
4. Wright, S.A.: The roles of mutation, inbreeding, crossbreeding and selection in
evolution. In: Proceedings of the VI International Congress of Genetics, vol. (1),
pp. 356–366 (1932)
5. Dasgupta, D.: Information Processing in the Immune System. In: Corne, D.,
Dorigo, M., Glover, F. (eds.) New Ideas in Optimization, pp. 161–167. McGraw
Hill, New York (1999)
6. Dorigo, M., Di Caro, G.: The Ant Colony Optimization Meta-heuristic. In: Corne,
D., Dorigo, M., Glover, F. (eds.) New Ideas in Optimization, pp. 11–32. McGraw
Hill, London (1999)
7. Karaboga, D., Basturk, B.: A Powerful and Efficient Algorithm for Numerical
Function Optimization: Artificial Bee Colony (ABC) Algorithm. Journal of Global
Optimization 39(3), 459–471 (2007)
8. Kennedy, J., Eberhart, R.: The Particle Swarm Optimization; Social Adaptation
in Information Processing. In: Corne, D., Dorigo, M., Glover, F. (eds.) New Ideas
in Optimization, pp. 379–387. McGraw Hill, London (1999)
9. Yang, X.-S.: A New Metaheuristic Bat-Inspired Algorithm. In: González, J.R.,
Pelta, D.A., Cruz, C., Terrazas, G., Krasnogor, N. (eds.) NICSO 2010. SCI,
vol. 284, pp. 65–74. Springer, Heidelberg (2010)
10. Yang, X.-S.: Firefly Algorithm. In: Yang, X.-S. (ed.) Nature-Inspired Metaheuris-
tic Algorithms, pp. 79–90. Luniver Press, London (2008)
11. Yang, X.-S.: Flower Pollination Algorithm for Global Optimization. In: Durand-
Lose, J., Jonoska, N. (eds.) UCNC 2012. LNCS, vol. 7445, pp. 240–249. Springer,
Heidelberg (2012)
12. Yang, X.-S., Deb, S.: Cuckoo Search via Lévy Flights. In: World Congress on Nature &
Biologically Inspired Computing (NaBIC 2009), pp. 210–214. IEEE Publication
(2009)
13. Storn, R., Price, K.: Differential Evolution: A Simple and Efficient Heuristic
for Global Optimization over Continuous Spaces. Journal of Global Optimiza-
tion 11(4), 341–359 (1997)
14. Moore, G.E.: Cramming more components onto integrated circuits. Electron-
ics 38(8), 114–117 (1965)
15. Ulam, S.: Tribute to John von Neumann. Bulletin of the American Mathematical
Society 64(3), 50–56 (1958)

16. McCulloch, W.S., Pitts, W.: A logical calculus of the ideas immanent in nervous
activity. The Bulletin of Mathematical Biophysics 5(4), 115–133 (1943)
17. Eiben, A.E., Smith, S.K.: Parameter tuning for configuring and analyzing evolu-
tionary algorithms. Swarm and Evolutionary Computation 1(1), 19–31 (2011)
18. Wolpert, D., Macready, W.: No free lunch theorems for optimization. IEEE Trans-
actions on Evolutionary Computation 1(1), 67–82 (1997)
19. Moscato, P.: Memetic algorithms: A short introduction. In: Corne, D., Dorigo, M.,
Glover, F. (eds.) New Ideas in Optimization, pp. 219–234. McGraw Hill, London
(1999)
20. Wilfried, J.: A general cost-benefit-based adaptation framework for multimeme
algorithms. Memetic Computing 2, 201–218 (2010)
21. Črepinšek, M., Liu, S.-H., Mernik, M.: Exploration and exploitation in evolution-
ary algorithms: A survey. ACM Computing Surveys 45(3), 1–33 (2013)
22. Neri, F., Cotta, C.: Memetic algorithms and memetic computing optimization: A
literature review. Swarm and Evolutionary Computation 1(2), 1–14 (2011)
23. Neri, F.: Diversity Management in Memetic Algorithms. In: Neri, F., Cotta, C.,
Moscato, P. (eds.) Handbook of Memetic Algorithms, pp. 153–164. Springer,
Berlin (2012)
24. Chen, X., Ong, Y.-S., Lim, M.-H., Tan, K.C.: A Multi-Facet Survey on Memetic
Computation. Trans. Evol. Comp. 15(5), 591–607 (2011)
25. Ong, Y.-S., Lim, M.-H., Zhu, N., Wong, K.-W.: Classification of adaptive memetic
algorithms: a comparative study. IEEE Transactions on Systems, Man, and Cy-
bernetics, Part B: Cybernetics 36(1), 141–152 (2006)
26. Garcia, S., Cano, J.R., Herrera, F.: A memetic algorithm for evolutionary proto-
type selection: A scaling up approach. Pattern Recogn. 41(8), 2693–2709 (2008)
27. Iacca, G., Neri, F., Mininno, E., Ong, Y.-S., Lim, M.-H.: Ockham's Razor in
memetic computing: Three stage optimal memetic exploration. Inf. Sci. 188(4),
17–43 (2012)
28. Ong, Y.-S., Lim, M.H., Chen, X.: Research frontier: memetic computation-past,
present & future. Comp. Intell. Mag. 5(2), 24–31 (2010)
29. Lynch, A.: Thought as abstract evolution. J. Ideas 2(1), 3–10 (1991)
30. Fister Jr., I., Yang, X.-S., Fister, I., Brest, J., Fister, D.: A brief review
of nature-inspired algorithms for optimization. Electrotechnical Review 80(3),
116–122 (2013)
31. Fister, I., Fong, S., Brest, J., Fister Jr., I.: A novel hybrid self-adaptive bat algo-
rithm. The Scientific World Journal, 1–12 (2014)
32. Fister, I., Mernik, M., Filipič, B.: Graph 3-coloring with a hybrid self-adaptive
evolutionary algorithm. Comp. Opt. and Appl. 54(3), 741–770 (2013)
33. Fister, I., Mernik, M., Filipič, B.: A hybrid self-adaptive evolutionary algorithm
for marker optimization in the clothing industry. Appl. Soft Comput. 10(2),
409–422 (2010)
34. Grefenstette, J.: Optimization of control parameters for genetic algorithms. IEEE
Transactions on Systems, Man, and Cybernetics 16, 122–128 (1986)
35. Kotler, P., Armstrong, G., Brown, L., Adam, S.: Marketing, 7th edn. Pearson
Education Australia/Prentice Hall, Sydney (2006)
36. Eiben, A., Smith, J.: Introduction to Evolutionary Computing. Springer, Berlin
(2003)
37. Darwin, C.: On the Origin of Species. Harvard University Press, London (1859)
38. Blum, C., Merkle, D.: Swarm Intelligence. Springer, Berlin (2008)
39. Russell, S., Norvig, P.: Artificial Intelligence: A Modern Approach, 3rd edn. Pren-
tice Hall, New Jersey (2009)

40. Garey, M.R., Johnson, D.S.: Computers and Intractability: A Guide to the Theory
of NP-Completeness. W. H. Freeman & Co., New York (1979)
41. Holland, J.H.: Adaptation in Natural and Artificial Systems: An Introductory
Analysis with Applications to Biology, Control, and Artificial Intelligence. A Brad-
ford Book, Cambridge (1992)
42. Maschler, M., Solan, A., Zamir, S.: Game Theory. Cambridge University Press,
Cambridge (2013)
43. Lehn, J.M.: Supramolecular Chemistry: Concepts and Perspectives. VCH Ver-
lagsgesellschaft, Weinheim (1995)
44. Applegate, D.L., Bixby, R.E., Chvátal, V., Cook, W.: The Traveling Salesman
Problem. Princeton University Press, Princeton (2006)
45. Bondy, J.A., Murty, U.S.R.: Graph Theory. Springer, Berlin (2008)
46. Bäck, T.: Evolutionary Algorithms in Theory and Practice: Evolution Strategies,
Evolutionary Programming, Genetic Algorithms. Oxford University Press, Oxford
(1996)
47. Goldberg, D.E.: Genetic Algorithms in Search, Optimization and Machine Learn-
ing. Addison-Wesley Longman Publishing Co., Inc., Boston (1989)
48. Fogel, L., Owens, A., Walsh, M.: Artificial Intelligence through Simulated Evolu-
tion. John Wiley & Sons, Inc., New York (1966)
49. Koza, J.: Genetic Programming 2 – Automatic Discovery of Reusable Programs.
MIT Press, Cambridge (1994)
50. Searle, J.R.: The rediscovery of the mind. MIT Press, Cambridge (1992)
51. Rechenberg, I.: Evolutionsstrategie: Optimierung technischer Systeme nach
Prinzipien der biologischen Evolution. Frommann-Holzboog, Stuttgart (1973)
52. Schwefel, H.P.: Numerische Optimierung von Computer-Modellen mittels der
Evolutionsstrategie. Birkhäuser, Basel (1977)
53. Deb, K.: Multi-Objective Optimization Using Evolutionary Algorithms. John Wi-
ley & Sons, Inc., New York (2001)
54. Michalewicz, Z., Fogel, D.: How to Solve It: Modern Heuristics. Springer (2004)
55. Michalewicz, Z.: Genetic algorithms + data structures = evolution programs.
Springer, Berlin (1992)
56. Moscato, P.: On evolution, search, optimization, genetic algorithms and martial
arts: Towards memetic algorithms. Tech. Rep. 826. California Institute of Tech-
nology, Pasadena, CA (1989)
57. Yang, X.-S.: Nature-Inspired Metaheuristic Algorithms. Luniver Press (2010)
58. Dawkins, R.: The Selfish Gene. Oxford University Press, Oxford (1976)
59. Aarts, E., Lenstra, J.K.: Local Search in Combinatorial Optimization. Princeton
University Press, Princeton (1997)
60. Hoos, H.H., Stützle, T.: Stochastic Local Search: Foundations and Applications.
Elsevier, Oxford (2005)
61. Blackmore, S.: The Meme Machine. Oxford University Press, New York (1999)
62. Law, A.: Simulation Modeling and Analysis with Expertfit Software. McGraw-
Hill, New York (2006)
63. Fister, I., Fister Jr., I., Brest, J., Yang, X.-S.: Memetic firefly algorithm for com-
binatorial optimization. In: Filipič, B., Šilc, J. (eds.) Bioinspired Optimization
Methods and Their Applications: Proceedings of the Fifth International Confer-
ence on Bioinspired Optimization Methods and their Applications, BIOMA 2012,
pp. 75–86. Jožef Stefan Institute, Ljubljana (2012)
64. Brest, J., Greiner, S., Bošković, B., Mernik, M., Žumer, V.: Self-adapting control
parameters in differential evolution: A comparative study on numerical bench-
mark problems. IEEE Transactions on Evolutionary Computation 10(6), 646–657
(2006)
44 I. Fister et al.

65. Cai, Z., Peng, Z.: Cooperative coevolutionary adaptive genetic algorithm in path
planning of cooperative multi-mobile robot systems. Journal of Intelligent and
Robotic Systems 33(1), 6171 (2002)
66. Chandrasekaran, K., Simon, S.P.: Multi-objective scheduling problem: Hybrid ap-
proach using fuzzy assisted cuckoo search algorithm. Swarm and Evolutionary
Computation 5, 116 (2012)
67. Chitty, D.M., Hernandez, M.L.: A hybrid ant colony optimisation technique
for dynamic vehicle routing. In: Deb, K., Tari, Z. (eds.) GECCO 2004. LNCS,
vol. 3102, pp. 4859. Springer, Heidelberg (2004)
68. Deb, K., Beyer, H.-G.: Self-adaptive genetic algorithms with simulated binary
crossover. Evolutionary Computation 9(2), 197221 (2001)
69. Dilettoso, E., Salerno, N.: A self-adaptive niching genetic algorithm for multi-
modal optimization of electromagnetic devices. IEEE Transactions on Magnet-
ics 42(4), 12031206 (2006)
70. Duan, H., Yu, X.: Hybrid ant colony optimization using memetic algorithm for
traveling salesman problem. In: IEEE International Symposium on Approximate
Dynamic Programming and Reinforcement Learning, ADPRL 2007, pp. 9295.
IEEE (2007)
71. Duan, H.-B., Xu, C.-F., Xing, Z.-H.: A hybrid articial bee colony optimization
and quantum evolutionary algorithm for continuous optimization problems. In-
ternational Journal of Neural Systems 20(01), 3950 (2010)

72. Fister, I., Fister Jr., I., Zumer, V., Brest, J.: Memetic articial bee colony algo-
rithm for large-scale global optimization. In: 2012 IEEE Congress on Evolutionary
Computation (CEC), pp. 18. IEEE (2012)
73. Fister Jr, I., Fong, S., Brest, J., Fister, I: Towards the self-adaptation in the
bat algorithm. In: Proceedings of the 13th IASTED International Conference on
Articial Intelligence and Applications (2014)
74. Fister, I., Fister Jr., I., Yang, X.-S., Brest, J.: A comprehensive review of rey
algorithms. Swarm and Evolutionary Computation (2013)
75. Fister, I., Yang, X.-S., Fister, D., Fister Jr., I.: Firey algorithm: A brief review of
the expanding literature. In: Cuckoo Search and Firey Algorithm, pp. 347360.
Springer (2014)
76. Fister Jr., I., Fister, D., Yang, X.-S.: A hybrid bat algorithm. arXiv preprint
arXiv:1303.6310 (2013)
77. Fister Jr., I., Fister, I., Brest, J.: A hybrid articial bee colony algorithm for
graph 3-coloring. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz,
R., Zadeh, L.A., Zurada, J.M. (eds.) EC 2012 and SIDE 2012. LNCS, vol. 7269,
pp. 6674. Springer, Heidelberg (2012)
78. Fister Jr, I., Yang, X.-S., Fister, I., Brest, J.: Memetic rey algorithm for com-
binatorial optimization. arXiv preprint arXiv:1204.5165 (2012)
79. Fister, I., Yang, X.-S., Brest, J., Fister Jr., I.: Modied rey algorithm using
quaternion representation. Expert Syst. Appl. 40(18), 72207230 (2013)
80. Galinier, P., Hao, J.-K.: Hybrid evolutionary algorithms for graph coloring. Jour-
nal of Combinatorial Optimization 3(4), 379397 (1999)
81. Galvez, A., Iglesias, A.: New memetic self-adaptive rey algorithm for continuous
optimization. International Journal of Bio-Inspired Computation (2014)
82. Geem, Z.W., Kim, J.H., Loganathan, G.: A new heuristic optimization algorithm:
harmony search. Simulation 76(2), 6068 (2001)
83. Glover, F., Laguna, M.: Tabu search. Springer (1999)
Adaptation and Hybridization in Nature-Inspired Algorithms 45

84. Grimaccia, F., Mussetta, M., Zich, R.E.: Genetical swarm optimization: Self-
adaptive hybrid evolutionary algorithm for electromagnetics. IEEE Transactions
on Antennas and Propagation 55(3), 781785 (2007)
85. Guo, L.: A novel hybrid bat algorithm with harmony search for global numerical
optimization. Journal of Applied Mathematics 2013 (2013)
86. Guo, L., Wang, G.-G., Wang, H., Wang, D.: An eective hybrid rey algorithm
with harmony search for global numerical optimization. The Scientic World Jour-
nal 2013 (2013)
87. Haddad, O.B., Afshar, A., Marino, M.A.: Honey-bees mating optimization (hbmo)
algorithm: a new heuristic approach for water resources optimization. Water Re-
sources Management 20(5), 661680 (2006)
88. Heinonen, J., Pettersson, F.: Hybrid ant colony optimization and visibility studies
applied to a job-shop scheduling problem. Applied Mathematics and Computa-
tion 187(2), 989998 (2007)
89. Hinterding, R., Michalewicz, Z., Peachey, T.C.: Self-adaptive genetic algorithm for
numeric functions. In: Ebeling, W., Rechenberg, I., Voigt, H.-M., Schwefel, H.-P.
(eds.) PPSN 1996. LNCS, vol. 1141, pp. 420429. Springer, Heidelberg (1996)
90. Ismail, A., Engelbrecht, A.P.: The self-adaptive comprehensive learning parti-
cle swarm optimizer. In: Dorigo, M., Birattari, M., Blum, C., Christensen, A.L.,
Engelbrecht, A.P., Gro, R., St utzle, T. (eds.) ANTS 2012. LNCS, vol. 7461,
pp. 156167. Springer, Heidelberg (2012)
91. Fister Jr., I., Fister, D., Fister, I.: A comprehensive review of cuckoo search: vari-
ants and hybrids. International Journal of Mathematical Modelling and Numerical
Optimisation 4(4), 387409 (2013)
92. Kang, F., Li, J., Xu, Q.: Structural inverse analysis by hybrid simplex articial
bee colony algorithms. Computers & Structures 87(13), 861870 (2009)
93. Kavousi-Fard, A., Samet, H., Marzbani, F.: A new hybrid modied rey algo-
rithm and support vector regression model for accurate short term load forecast-
ing. Expert Systems with Applications 41(13), 60476056 (2014)
94. Layeb, A.: A novel quantum inspired cuckoo search for knapsack problems. Inter-
national Journal of Bio-Inspired Computation 3(5), 297305 (2011)
95. Li, C., Yang, S., Nguyen, T.T.: A self-learning particle swarm optimizer for global
optimization problems. IEEE Transactions on Systems, Man, and Cybernetics,
Part B: Cybernetics 42(3), 627646 (2012)
96. Li, X., Yin, M.: A hybrid cuckoo search via levy ights for the permutation ow
shop scheduling problem. International Journal of Production Research 51(16),
47324754 (2013)
97. Liao, X., Zhou, J., Zhang, R., Zhang, Y.: An adaptive articial bee colony algo-
rithm for long-term economic dispatch in cascaded hydropower systems. Interna-
tional Journal of Electrical Power & Energy Systems 43(1), 13401345 (2012)
98. Lin, W.-Y.: A gade hybrid evolutionary algorithm for path synthesis of four-bar
linkage. Mechanism and Machine Theory 45(8), 10961107 (2010)
99. Liu, S., Wang, J.: An improved self-adaptive particle swarm optimization
approach for short-term scheduling of hydro system. In: International Asia
Conference on Informatics in Control, Automation and Robotics, CAR 2009,
pp. 334338. IEEE (2009)
100. Lovbjerg, M., Rasmussen, T.K., Krink, T.: Hybrid particle swarm optimiser with
breeding and subpopulations. In: Proceedings of the Genetic and Evolutionary
Computation Conference, vol. 2001, pp. 469476. Citeseer (2001)
46 I. Fister et al.

101. Marinakis, Y., Marinaki, M.: A hybrid multi-swarm particle swarm optimization
algorithm for the probabilistic traveling salesman problem. Computers & Opera-
tions Research 37(3), 432442 (2010)
102. Niknam, T.: An ecient hybrid evolutionary algorithm based on pso and hbmo
algorithms for multi-objective distribution feeder reconguration. Energy Conver-
sion and Management 50(8), 20742082 (2009)
103. Pan, Q.-K., Fatih Tasgetiren, M., Suganthan, P.N., Chua, T.J.: A discrete arti-
cial bee colony algorithm for the lot-streaming ow shop scheduling problem.
Information Sciences 181(12), 24552468 (2011)
104. Pena, J.M., Robles, V., Larranaga, P., Herves, V., Rosales, F., Perez, M.S.: GA-
EDA: Hybrid evolutionary algorithm using genetic and estimation of distribu-
tion algorithms. In: Orchard, B., Yang, C., Ali, M. (eds.) IEA/AIE 2004. LNCS
(LNAI), vol. 3029, pp. 361371. Springer, Heidelberg (2004)
105. Qin, A.K., Suganthan, P.N.: Self-adaptive dierential evolution algorithm for nu-
merical optimization. In: The 2005 IEEE Congress on Evolutionary Computation,
vol. 2, pp. 17851791. IEEE (2005)
106. Roy, A.G., Rakshit, P., Konar, A., Bhattacharya, S., Kim, E., Nagar, A.K.: Adap-
tive rey algorithm for nonholonomic motion planning of car-like system. In:
2013 IEEE Congress on Evolutionary Computation (CEC), pp. 21622169. IEEE
(2013)
107. Wang, J., Osagie, E., Thulasiraman, P., Thulasiram, R.K.: Hopnet: A hybrid
ant colony optimization routing algorithm for mobile ad hoc network. Ad Hoc
Networks 7(4), 690705 (2009)
108. Wang, X.-H., Li, J.-J.: Hybrid particle swarm optimization with simulated an-
nealing. In: Proceedings of 2004 International Conference on Machine Learning
and Cybernetics, vol. 4, pp. 24022405. IEEE (2004)
109. Wang, Y., Li, B., Weise, T., Wang, J., Yuan, B., Tian, Q.: Self-adaptive learn-
ing based particle swarm optimization. Information Sciences 181(20), 45154538
(2011)
110. Wu, Q., Cao, Y., Wen, J.: Optimal reactive power dispatch using an adap-
tive genetic algorithm. International Journal of Electrical Power & Energy Sys-
tems 20(8), 563569 (1998)
111. Yan, X., Zhu, Y., Zou, W., Wang, L.: A new approach for data clustering using
hybrid articial bee colony algorithm. Neurocomputing 97, 241250 (2012)
112. Yang, Z., Tang, K., Yao, X.: Self-adaptive dierential evolution with neighborhood
search. In: IEEE Congress on Evolutionary Computation, CEC 2008 (IEEE World
Congress on Computational Intelligence), pp. 11101116. IEEE (2008)
113. Yu, S., Yang, S., Su, S.: Self-adaptive step rey algorithm. Journal of Applied
Mathematics 2013 (2013)
114. Zhan, Z.-H., Zhang, J., Li, Y., Chung, H.-H.: Adaptive particle swarm optimiza-
tion. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernet-
ics 39(6), 13621381 (2009)
115. Zhang, J., Ding, X.: A multi-swarm self-adaptive and cooperative particle swarm
optimization. Engineering Applications of Articial Intelligence 24(6), 958967
(2011)
116. Zhou, Y., Xie, J., Zheng, H.: A hybrid bat algorithm with path relinking for
capacitated vehicle routing problem. Mathematical Problems in Engineering 2013
(2013)
117. Alander, J.T.: An indexed bibliography of genetic algorithms and neural networks
118. Ali, Y.M.B.: Evolving multilayer feedforward neural network using adaptive par-
ticle swarm algorithm. Int. J. Hybrid Intell. Syst. 8(4), 185198 (2011)
Adaptation and Hybridization in Nature-Inspired Algorithms 47

119. Angeline, P.J., Saunders, G.M., Pollack, J.B.: An evolutionary algorithm that
constructs recurrent neural networks. IEEE Transactions on Neural Networks 5,
5465 (1994)
120. Asadi, S., Hadavandi, E., Mehmanpazir, F., Nakhostin, M.M.: Hybridization of
evolutionary levenberg-marquardt neural networks and data pre-processing for
stock market prediction. Knowl.-Based Syst. 35, 245258 (2012)
121. Caudell, T.P., Dolan, C.P.: Parametric connectivity: Training of constrained net-
works using genetic algorithms. In: David Schaer, J. (ed.) Proceedings of the
Third International Conference on Genetic Algorithms. Morgan Kaufmann Pub-
lishers (1989)
122. Cruz-Ramrez, M., Herv as-Martnez, C., Gutierrez, P.A., Perez-Ortiz, M.,
Briceno, J., de la Mata, M.: Memetic pareto dierential evolutionary neural
network used to solve an unbalanced liver transplantation problem. Soft. Com-
put. 17(2), 275284 (2013)
123. Cui, Z., Yang, C., Sanyal, S.: Training articial neural networks using appm.
IJWMC 5(2), 168174 (2012)
124. da Silva, A.J., Mineu, N.L., Ludermir, T.B.: Evolving articial neural networks
using adaptive dierential evolution. In: Kuri-Morales, A., Simari, G.R. (eds.)
IBERAMIA 2010. LNCS, vol. 6433, pp. 396405. Springer, Heidelberg (2010)
125. Delgado, M., Pegalajar, M.C., Cuellar, M.P.: Evolutionary training for dynami-
cal recurrent neural networks: an application in nantial time series prediction.
Mathware & Soft Computing 13(2), 89110 (2006)
126. Elhachmi, J., Guennoun, Z.: Evolutionary neural networks algorithm for the dy-
namic frequency assignment problem. International Journal of Computer Science
& Information Technology 3(3), 4961 (2011)
127. Fernandez, J.C., Hervas, C., Martnez-Estudillo, F.J., Gutierrez, P.A.: Memetic
pareto evolutionary articial neural networks to determine growth/no-growth in
predictive microbiology. Appl. Soft Comput. 11(1), 534550 (2011)
128. Furtuna, R., Curteanu, S., Leon, F.: Multi-objective optimization of a stacked
neural network using an evolutionary hyper-heuristic. Appl. Soft Comput. 12(1),
133144 (2012)
129. Gao, W.: Financial data forecasting by evolutionary neural network based on ant
colony algorithm. In: Deng, H., Miao, D., Lei, J., Wang, F.L. (eds.) AICI 2011,
Part III. LNCS, vol. 7004, pp. 262269. Springer, Heidelberg (2011)
130. Garro, B.A., Sossa, H., Vazquez, R.A.: Design of articial neural networks using
a modied particle swarm optimization algorithm. In: Proceedings of the 2009
International Joint Conference on Neural Networks, IJCNN 2009, pp. 23632370
(2009)
131. Igel, C.: Neuroevolution for reinforcement learning using evolution strategies. In:
Reynolds, R., Abbass, H., Tan, K.C., Mckay, B., Essam, D., Gedeon, T. (eds.)
Congress on Evolutionary Computation (CEC 2003), vol. 4, pp. 25882595. IEEE
(2003)
132. Ismail, A.Z., Jeng, D.S.: SEANN: A Self-evolving Neural Network based on PSO
and JPSO algorithms
133. Kala, R., Shukla, A., Tiwari, R.: Modular symbiotic adaptive neuro evolution for
high dimensionality classicatory problems. Intelligent Decision Technologies 5(4),
309319 (2011)
134. Kassahun, Y., Sommer, G.: Ecient reinforcement learning through evolutionary
acquisition of neural topologies. In: ESANN, pp. 259266 (2005)
135. Kaylani, A.: An Adaptive Multiobjective Evolutionary Approach to Optimize
Artmap Neural Networks. PhD thesis, Orlando, FL, USA (2008), AAI3335346
48 I. Fister et al.

136. Khan, M.M., Ahmad, A.M., Khan, G.M., Miller, J.F.: Fast learning neural net-
works using cartesian genetic programming. Neurocomputing 121, 274289 (2013)
137. Khan, M.M., Khan, G.M., Miller, J.F.: Evolution of neural networks using carte-
sian genetic programming. In: IEEE Congress on Evolutionary Computation, pp.
18. IEEE (2010)
138. Kulluk, S.: A novel hybrid algorithm combining hunting search with harmony
search algorithm for training neural networks. JORS 64(5), 748761 (2013)
139. Liao, S.-H., Hsieh, J.-G., Chang, J.-Y., Lin, C.-T.: Training neural networks via
simplied hybrid algorithm mixing neldermead and particle swarm optimization
methods. Soft Computing, 111 (2014)
140. Mandischer, M.: Representation and evolution of neural networks, pp. 643649.
Springer (1993)
141. Matteucci, M.: ELeaRNT: Evolutionary learning of rich neural network topolo-
gies. Technical report, Carnegie Mellon University (2002)
142. Lee, M.-C., Horng, M.-H., Lee, Y.-X., Liou, R.-J.: Firey Meta-Heuristic Algo-
rithm for Training the Radial Basis Function Network for Data Classication and
Disease Diagnosis. InTech (2012)
143. Mirjalili, S., Sadiq, A.S.: Magnetic optimization algorithm for training multi layer
perceptron. In: 2011 IEEE 3rd International Conference on Communication Soft-
ware and Networks (ICCSN), pp. 4246 (May 2011)
144. Mirjalili, S., Hashim, S.Z.M., Sardroudi, H.M.: Training feedforward neural net-
works using hybrid particle swarm optimization and gravitational search algo-
rithm. Applied Mathematics and Computation 218(22), 1112511137 (2012)
145. Moriarty, D., Miikkulainen, R.: Forming neural networks through ecient and
adaptive coevolution. Evolutionary Computation 5, 373399 (1998)
146. Nandy, S., Karmakar, M., Sarkar, P.P., Das, A., Abraham, A., Paul, D.: Agent
based adaptive rey back-propagation neural network training method for dy-
namic systems. In: 2012 12th International Conference on Hybrid Intelligent Sys-
tems (HIS), pp. 449454 (December 2012)
147. Nawi, N.M., Khan, A., Rehman, M.Z.: Csbprnn: A new hybridization technique
using cuckoo search to train back propagation recurrent neural network. In:
Herawan, T., Deris, M.M., Abawajy, J. (eds.) Proceedings of the First Interna-
tional Conference on Advanced Data and Information Engineering (DaEng-2013).
LNEE, vol. 285, pp. 111118. Springer, Heidelberg (2014)
148. Nawi, N.M., Rehman, M.Z., Khan, A.: A new bat based back-propagation (BAT-
BP) algorithm. In: Swiatek, J., Grzech, A., Swiatek,
 P., Tomczak, J.M. (eds.)
Advances in Systems Science. AISC, vol. 240, pp. 395404. Springer, Heidelberg
(2014)
149. Neruda, R., Slusn y, S.: Parameter genetic learning of perceptron networks. In:
Proceedings of the 10th WSEAS International Conference on Systems, ICS 2006,
pp. 9297 (2006)
150. Nourani, E., Rahmani, A.-M., Navin, A.H.: Forecasting stock prices using a hybrid
articial bee colony based neural network. In: 2012 International Conference on
Innovation Management and Technology Research (ICIMTR), pp. 486490 (May
2012)
151. Oong, T.H., Isa, N.A.M.: Adaptive evolutionary articial neural networks for
pattern classication. IEEE Transactions on Neural Networks 22(11), 18231836
(2011)
152. Risi, S., Stanley, K.O.: Enhancing es-hyperneat to evolve more complex regular
neural networks. In: Proceedings of the 13th Annual Conference on Genetic and
Evolutionary Computation, GECCO 2011, pp. 15391546 (2011)
Adaptation and Hybridization in Nature-Inspired Algorithms 49

153. Sarangi, P.P., Sahu, A., Panda, M.: Article: A hybrid dierential evolution and
back-propagation algorithm for feedforward neural network training. International
Journal of Computer Applications 84(14), 19 (2013); Published by Foundation
of Computer Science, New York, USA
154. Sermpinis, G., Theolatos, K.A., Karathanasopoulos, A.S., Georgopoulos, E.F.,
Dunis, C.L.: Forecasting foreign exchange rates with adaptive neural networks
using radial-basis functions and particle swarm optimization. European Journal
of Operational Research 225(3), 528540 (2013)
155. Shah, H., Ghazali, R., Nawi, N.M.: Using articial bee colony algorithm for mlp
training on earthquake time series data prediction. CoRR, abs/1112.4628 (2011)
156. Shen, W., Guo, X., Wu, C., Wu, D.: Forecasting stock indices using radial basis
function neural networks optimized by articial sh swarm algorithm. Knowl.-
Based Syst. 24(3), 378385 (2011)
157. Slowik, A.: Application of an adaptive dierential evolution algorithm with mul-
tiple trial vectors to articial neural network training. IEEE Transactions on
Industrial Electronics 58(8), 31603167 (2011)
158. Stanley, K.O., DAmbrosio, D.B., Gauci, J.: A hypercube-based encoding for
evolving large-scale neural networks. Artif. Life 15(2), 185212 (2009)
159. Stanley, K.O., Miikkulainen, R.: Ecient reinforcement learning through evolv-
ing neural network topologies. In: Proceedings of the Genetic and Evolutionary
Computation Conference, GECCO 2002, pp. 569577 (2002)
160. Stanley, K.O., Miikkulainen, R.: Evolving neural networks through augmenting
topologies. Evol. Comput. 10(2), 99127 (2002)
161. Suchorzewski, M.: Evolving scalable and modular adaptive networks with devel-
opmental symbolic encoding. Evolutionary Intelligence 4(3), 145163 (2011)
162. Turner, A.J., Miller, J.F.: Cartesian genetic programming encoded articial neural
networks: A comparison using three benchmarks. In: Proceedings of the 15th
Annual Conference on Genetic and Evolutionary Computation, GECCO 2013,
pp. 10051012 (2013)
163. Vogl, T.P., Mangis, J.K., Rigler, A.K., Zink, W.T., Alkon, D.L.: Accelerating
the convergence of the back-propagation method. Biological Cybernetics 59(4-5),
257263 (1988)
164. Whitley, D., Starkweather, T., Bogart, C.: Genetic algorithms and neural
networks: optimizing connections and connectivity. Parallel Computing 14(3),
347361 (1990)
165. Yao, X., Liu, Y.: A new evolutionary system for evolving articial neural networks.
IEEE Transactions on Neural Networks 8, 694713 (1996)
166. Yu, J.J.Q., Lam, A.Y.S., Li, V.O.K.: Evolutionary articial neural network based
on chemical reaction optimization. In: IEEE Congress on Evolutionary Compu-
tation, pp. 20832090. IEEE (2011)
167. Zhang, J.-R., Zhang, J., Lok, T.-M., Lyu, M.R.: A hybrid particle swarm
optimization-back-propagation algorithm for feedforward neural network train-
ing. Applied Mathematics and Computation 185(2), 10261037 (2007)
168. Zhang, Y., Wu, L.: Crop classication by forward neural network with adaptive
chaotic particle swarm optimization. Sensors 11(5), 47214743 (2011)
169. Montana, D.J., Davis, L.: Training feedforward neural networks using genetic al-
gorithms. In: Proceedings of the 11th International Joint Conference on Articial
intelligence (IJCAI 1989), vol. 1, pp. 762767. Morgan Kaufmann Publishers Inc.,
San Francisco (1989)
170. Hern andez, H., Blum, C.: Distributed graph coloring: an approach based on the
calling behavior of Japanese tree frogs. Swarm Intelligence, 117150 (2012)
50 I. Fister et al.

171. Chen, W.-N., Zhang, J., Chung, H.S.H., Zhong, W.-L., Wu, W.-G., Shi, Y.-H.:
A novel set-based particle swarm optimization method for discrete optimization
problems. Trans. Evol. Comp. 14, 278300 (2010)
172. Zhang, W.-J., Xie, X.-F.: DEPSO: Hybrid Particle Swarm with Dierential Evo-
lution Operator. IEEE International Conference on Systems, Man and Cybernet-
ics 4, 38163821 (2003)
173. Yang, X.S.: Nature-Inspired Optimization Algorithms. Elsevier, London (2014)
174. Ashby, W.R.: Princinples of the self-organizing sysem. In: Von Foerster, H., Zopf
Jr., G.W. (eds.) Pricinples of Self-Organization: Transactions of the University of
Illinois Symposium, pp. 255278. Pergamon Press, London (1962)
175. Booker, L., Forrest, S., Mitchell, M., Riolo, R.: Perspectives on Adaptation in
Natural and Articial Systems. Oxford University Press, Oxford (2005)
176. Blum, C., Roli, A.: Metaheuristics in combinatorial optimisation: Overview and
conceptural comparision. ACM Comput. Surv. 35, 268308 (2003)
177. Yang, X.S., Deb, S., Loomes, M., Karamanoglu, M.: A framework for self-tuning
optimization algorithm. Neural Computing and Applications 23(7-8), 20512057
(2013)
Part II
Adaptation in Computational
Intelligence
Adaptation in the Differential Evolution

Janez Brest*, Aleš Zamuda, and Borko Bošković

Institute of Computer Science,
Faculty of Electrical Engineering and Computer Science,
University of Maribor,
Smetanova ul. 17, 2000 Maribor, Slovenia
{janez.brest,ales.zamuda,borko.boskovic}@um.si
http://labraj.uni-mb.si/en/Janez_Brest

Abstract. This chapter gives an overview of Differential Evolution (DE) and then presents adaptive and self-adaptive mechanisms within the DE algorithm. These mechanisms can be used to make a DE solver more robust and more efficient, and to avoid parameter tuning, which is usually a time-consuming task that must be done before the actual optimization process starts. Literature overviews of adaptive and self-adaptive mechanisms mainly focus on the mutation and crossover DE operations, and less on population size adaptation. Some experiments have been performed on benchmark functions to present both the advantages and disadvantages of using self-adaptive mechanisms.

Keywords: continuous optimization, evolutionary algorithm, self-adaptation, parameter control.

1 Introduction
Population-based algorithms are suitable for solving continuous as well as discrete optimization problems. They include particle swarm algorithms, evolutionary algorithms, and other algorithms inspired by nature. These algorithms usually have several control parameters that are responsible for tuning the algorithm itself. Good parameter values improve an algorithm's performance during the optimization process.
Globally, we distinguish between two major forms of setting parameter values: parameter tuning and parameter control. Parameter tuning means that a user tries to find good values for the parameters before running the algorithm, and then runs the algorithm using these values, which remain fixed during the optimization process. A particular problem may prefer some parameter values at the early optimization stage, while other values are more suitable at the later stage. Questions then arise: how many stages should an optimization process have, and when should the parameter values be changed? If the values of the parameters are changed during the run, we call it parameter control.

* Corresponding author.

© Springer International Publishing Switzerland 2015
I. Fister and I. Fister Jr. (eds.), Adaptation and Hybridization in Computational Intelligence,
Adaptation, Learning, and Optimization 18, DOI: 10.1007/978-3-319-14400-9_2
Hence, it is seemingly natural to use a population-based algorithm (e.g. an evolutionary algorithm) not only for finding solutions to a problem but also for tuning the (same) algorithm to a particular problem. Technically speaking, we are trying to modify the values of the parameters during the run of the algorithm by taking the actual search progress into account.
Eiben et al. [14,15] categorized the change of parameters into three classes:

- Deterministic parameter control: the value of a parameter is altered by some deterministic rule.
- Adaptive parameter control: takes place when there is some form of feedback from the search that is used for determining the direction and/or the magnitude of the change to the parameter.
- Self-adaptive parameter control: the idea of the "evolution of evolution" can be used to implement the self-adaptation of parameters. Here the parameters to be adapted are encoded into the individuals and undergo the actions of some operators. The better values of these encoded parameters lead to better individuals which, in turn, are more likely to survive and produce offspring and, hence, propagate these better parameter values.
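As a concrete sketch of the third class, the well-known jDE rule (used as a case study later in this chapter) encodes F and CR into each individual and occasionally regenerates them before producing that individual's trial vector. The constants below (tau1 = tau2 = 0.1, F drawn from [0.1, 1.0]) follow the jDE literature, but the code is an illustrative sketch under those assumptions, not the authors' implementation:

```python
import random

# Self-adaptive parameter control in the style of jDE: each individual
# stores its own F and CR, which are regenerated with small
# probabilities tau1 and tau2; the surviving (better) individuals
# propagate their parameter values to the next generation.
TAU1, TAU2 = 0.1, 0.1   # regeneration probabilities (assumed jDE values)
F_L, F_U = 0.1, 0.9     # new F is drawn uniformly from [F_L, F_L + F_U]

def self_adapt(F, CR):
    """Return the (possibly regenerated) F and CR for one individual."""
    if random.random() < TAU1:
        F = F_L + random.random() * F_U   # new F in [0.1, 1.0]
    if random.random() < TAU2:
        CR = random.random()              # new CR in [0, 1)
    return F, CR
```

The pair returned here would be used for the individual's mutation and crossover in the current generation and stored back into the individual if its trial vector survives selection.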

Self-adaptation allows the solver to adapt itself to any problem from a general class of problems, to reconfigure itself accordingly, and to do this without any user interaction [4]. On the other hand, when solving a particular problem using tuning, it is possible to find very good parameter values, which are usually more competitive than a self-adaptive mechanism.
In this chapter we conduct an overview of adaptation in Differential Evolution (DE). The DE algorithm is a population-based evolutionary algorithm. Since it has only a few control parameters and is very efficient for solving real-world problems, it has become a very popular algorithm. Adaptive and self-adaptive DE variants have recently been proposed in order to avoid the need for problem-specific parameter tuning and also to improve the convergence characteristics of DE.
The chapter is structured as follows. Section 2 presents an overview of the DE algorithm. In Section 3 a survey of works related to adaptation and self-adaptation is given, and a case study of the self-adaptive mechanism in the jDE algorithm is presented as an example. Section 4 shows the experimental results of DE algorithms with and without self-adaptive mechanisms. Section 5 concludes this chapter.

2 Background

This section provides some background on the DE algorithm, introduced by R. Storn and K. Price in 1995 [26] and published in the Journal of Global Optimization in 1997 [27].
DE is a stochastic population-based algorithm. During an evolutionary process, a population is transformed into a new population. After some such transformations, the algorithm stops and returns the best found solution. The DE algorithm uses mutation, crossover, and selection operators to generate the next population from the current population.
The DE algorithm belongs to the evolutionary algorithms, but there are some differences between DE and a typical evolutionary algorithm (EA):

- an EA applies a different order of operators, i.e. the order in an EA is crossover, mutation, and selection;
- an important difference appears in mutation: while an EA usually uses mutation to introduce a very small change to an individual, DE applies a bigger change to an individual during mutation;
- the selection operation in an EA is more sophisticated compared to that used in DE, where a greedy selection is used.

Let us present the original DE algorithm [27]. It uses three operators within an evolutionary process. The population in generation G consists of NP vectors:

    x_i^(G) = (x_i,1^(G), x_i,2^(G), ..., x_i,D^(G)),   i = 1, 2, ..., NP.

In the EA community the vectors are called individuals.

Mutation. A mutant vector v_i^(G) is created by using one of the DE mutation strategies [13,23]. Currently there exist many mutation strategies; the more powerful ones are:

    rand/1:            v_i^(G) = x_r1^(G) + F (x_r2^(G) - x_r3^(G)),                               (1)
    best/1:            v_i^(G) = x_best^(G) + F (x_r1^(G) - x_r2^(G)),                             (2)
    current-to-best/1: v_i^(G) = x_i^(G) + F (x_best^(G) - x_i^(G)) + F (x_r1^(G) - x_r2^(G)),     (3)
    random-to-best/1:  v_i^(G) = x_r1^(G) + F (x_best^(G) - x_r1^(G)) + F (x_r2^(G) - x_r3^(G)),   (4)
    best/2:            v_i^(G) = x_best^(G) + F (x_r1^(G) - x_r2^(G)) + F (x_r3^(G) - x_r4^(G)),   (5)
    rand/2:            v_i^(G) = x_r1^(G) + F (x_r2^(G) - x_r3^(G)) + F (x_r4^(G) - x_r5^(G)),     (6)

where the indexes r1, ..., r5 represent random, mutually different integers generated within the set {1, ..., NP} that also differ from the index i. F is a mutation scale factor within the range [0, 2], usually less than 1. x_best^(G) denotes the best vector in generation G.
If some components of the mutant vector are out of bounds, the solutions for repairing the mutant vector proposed in the literature [27,24] are as follows: (1) they are reflected onto the bounds, (2) set on the bounds, (3) used as they are (out of bounds), or (4) randomly generated anew until they fall within the bounds. Which solution is the most appropriate depends on the problem we are solving, its characteristics, etc.
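The rand/1 strategy and the first two repair options can be written as a short sketch (the function names and the clamp-after-reflect detail are illustrative choices, not from the chapter):

```python
import random

def mutate_rand_1(pop, i, F):
    """DE/rand/1 mutation: v = x_r1 + F * (x_r2 - x_r3)."""
    # Pick three mutually different indexes, all different from i.
    r1, r2, r3 = random.sample([k for k in range(len(pop)) if k != i], 3)
    return [pop[r1][j] + F * (pop[r2][j] - pop[r3][j])
            for j in range(len(pop[i]))]

def repair(v, low, high, mode="set"):
    """Repair out-of-bounds components: reflect onto, or clamp to, bounds."""
    out = []
    for x in v:
        if mode == "reflect":
            if x < low:
                x = 2 * low - x
            elif x > high:
                x = 2 * high - x
        out.append(min(max(x, low), high))  # final clamp ("set on bounds")
    return out
```

With F = 0 the mutant vector collapses to x_r1, which is a quick way to sanity-check an implementation.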
Crossover. A crossover operator generates a trial vector u_i^(G) as follows:

    u_i,j^(G) = v_i,j^(G),  if rand(0, 1) <= CR or j = j_rand,
    u_i,j^(G) = x_i,j^(G),  otherwise,

for i = 1, 2, ..., NP and j = 1, 2, ..., D. The crossover parameter CR represents the probability of taking components of the trial vector from the mutant vector. The index j_rand in {1, ..., D} is a randomly chosen integer which ensures that the trial vector contains at least one component from the mutant vector. The value of CR is within the range [0, 1).
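This binomial crossover can be sketched as follows (a hedged illustration; the strict-versus-non-strict comparison with CR is immaterial in practice):

```python
import random

def crossover_bin(x, v, CR):
    """Binomial crossover: take component j from the mutant vector v with
    probability CR; position j_rand is always taken from v, so the trial
    vector differs from x in at least one component."""
    D = len(x)
    j_rand = random.randrange(D)
    return [v[j] if (random.random() < CR or j == j_rand) else x[j]
            for j in range(D)]
```

Setting CR = 0 yields a trial vector that inherits exactly one component (j_rand) from the mutant, while CR = 1 copies the whole mutant vector.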

Selection. The selection operator for a minimization problem is defined as follows:

    x_i^(G+1) = u_i^(G),  if f(u_i^(G)) <= f(x_i^(G)),
    x_i^(G+1) = x_i^(G),  otherwise.

1: {NP ... population size, F ... scale factor, CR ... crossover parameter}
2: {x_i ... i-th individual of population}
3: {MaxFEs ... maximum number of function evaluations}
4: {rand(0, 1) ... uniformly distributed random number in [0, 1)}
5: Initialization()
6: {** Generate a uniformly distributed random population within the search space **}
7: while stopping criteria is not met do
8:   for (i = 0; i < NP; i = i + 1) do
9:     {*** DE/rand/1/bin ***}
10:    Randomly select indexes r1, r2, and r3 that are mutually different and also different from index i.
11:    v_i^{(G)} = x_{r1}^{(G)} + F (x_{r2}^{(G)} - x_{r3}^{(G)})
12:    j_rand = rand{1, ..., D}
13:    for (j = 0; j < D; j++) do
14:      if (rand(0, 1) <= CR or j == j_rand) then
15:        u_{i,j}^{(G)} = v_{i,j}^{(G)}
16:      else
17:        u_{i,j}^{(G)} = x_{i,j}^{(G)}
18:      end if
19:    end for
20:    if (f(u_i^{(G)}) <= f(x_i^{(G)})) then
21:      x_i^{(G+1)} = u_i^{(G)}
22:    else
23:      x_i^{(G+1)} = x_i^{(G)}
24:    end if
25:  end for
26: end while
Algorithm 1: DE algorithm
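Algorithm 1 translates almost line-for-line into Python. The sketch below is ours (the names de_rand_1_bin and sphere are not from the chapter) and updates the population in place, a common implementation shortcut; the classical formulation builds generation G+1 separately:

```python
import random

def de_rand_1_bin(f, dim, lower, upper, np_=20, F=0.5, CR=0.9,
                  max_fes=6000, seed=1):
    """DE/rand/1/bin as in Algorithm 1; returns the best vector and its fitness."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lower, upper) for _ in range(dim)] for _ in range(np_)]
    fit = [f(x) for x in pop]
    fes = np_
    while fes + np_ <= max_fes:             # stopping criterion: evaluation budget
        for i in range(np_):
            # mutation: three mutually different indexes, all different from i
            r1, r2, r3 = rng.sample([k for k in range(np_) if k != i], 3)
            v = [pop[r1][j] + F * (pop[r2][j] - pop[r3][j]) for j in range(dim)]
            # binomial crossover; j_rand forces at least one mutant component
            j_rand = rng.randrange(dim)
            u = [v[j] if (rng.random() <= CR or j == j_rand) else pop[i][j]
                 for j in range(dim)]
            fu = f(u)
            fes += 1
            if fu <= fit[i]:                # greedy selection
                pop[i], fit[i] = u, fu
    best = min(range(np_), key=fit.__getitem__)
    return pop[best], fit[best]

def sphere(x):
    """Sphere-like test function with optimum 0 at the origin."""
    return sum(c * c for c in x)
```

For example, de_rand_1_bin(sphere, 5, -100.0, 100.0) drives the error close to zero within a few thousand evaluations.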
Adaptation in the Differential Evolution 57

The DE has a greedy selection, while other evolutionary algorithms have more sophisticated selection operations (truncation selection, rank-based selection, roulette-wheel selection, etc.). In DE, the fitness values of a population vector and its corresponding trial vector are compared, and the better vector becomes a member of the population for the next generation.
For the sake of clarity, a pseudo-code of the DE algorithm is shown in Algorithm 1, where the DE/rand/1/bin strategy is presented.
The prominence of the DE algorithm and its applications are shown in [24,16]. V. Feoktistov in his book ([16], p. 18) says that the concept of differential evolution is "a spontaneous self-adaptability to the function". In the rest of this chapter we will focus on the adaptation and self-adaptation of the DE control parameters.

3 Adaptation in the DE Algorithm

The DE algorithm [27] was proposed by Storn and Price in 1997, and since then it has been used in many practical cases. The original DE had no adaptive control parameters; their values were fixed during the evolutionary process.

3.1 Literature Overview

J. Tvrdík in [30] proposed a DE algorithm where a competition between different control parameter settings was used. The settings used fixed values for the control parameters. Adaptation by competitive setting is proposed in [31].
Ali and Törn in [1] proposed new versions of the DE algorithm, and also suggested some modifications to the classical DE, in order to improve its efficiency and robustness. They introduced an auxiliary population of NP individuals alongside the original population (as noted in [1], a set-based notation is used: population set-based methods). Next they proposed a rule for calculating the control parameter F automatically. Here we can see a need for changing the value of a control parameter in DE; a large amount of adaptation in DE is related to the control parameters F and CR, while the parameter NP obviously gets less attention than the other two DE parameters.
Liu and Lampinen [22] proposed a version of DE where the mutation and crossover control parameters are adaptive. The Fuzzy Adaptive Differential Evolution uses fuzzy logic controllers whose inputs incorporate the relative function values and individuals of successive generations for adapting the control parameters.
Teo in [29] made an attempt at self-adapting the population size parameter, in addition to self-adapting the crossover and mutation rates. Brest et al. in [7] proposed a DE algorithm using a self-adapting mechanism on the control parameters F and CR.
Qin and Suganthan in [25] proposed a Self-adaptive Differential Evolution algorithm (SaDE), where the choice of a learning strategy and the two control parameters (F and CR) are gradually self-adapted according to the learning experience. The parameter F in SaDE is approximated by a normal distribution
with a mean value of 0.5 and a standard deviation of 0.3, denoted by N(0.5, 0.3). A set of F values is randomly sampled from this normal distribution and applied to each target vector within the current population. SaDE gradually adjusts the range of CR values for a given problem according to previous CR values which have generated trial vectors that successfully entered the next generation. CR is approximated by a normal distribution with mean value CRm and standard deviation Std = 0.1, denoted by N(CRm, Std), where CRm is initialized to 0.5. SaDE combines two mutation strategies: DE/rand/1 and DE/current-to-best/1.
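The SaDE-style parameter sampling can be sketched as follows (a simplified illustration: redrawing out-of-range F values and clamping CR are our own conventions, not necessarily the exact SaDE rules, and the generational adaptation of CRm is omitted):

```python
import random

def sade_sample_parameters(np_, rng, crm=0.5):
    """Sample per-individual F ~ N(0.5, 0.3) and CR ~ N(CRm, 0.1), SaDE-style."""
    Fs, CRs = [], []
    for _ in range(np_):
        F = rng.gauss(0.5, 0.3)
        while F <= 0.0 or F > 2.0:          # redraw out-of-range values (our choice)
            F = rng.gauss(0.5, 0.3)
        CR = min(1.0, max(0.0, rng.gauss(crm, 0.1)))  # clamp CR into [0, 1]
        Fs.append(F)
        CRs.append(CR)
    return Fs, CRs
```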
Das et al. proposed a neighborhood concept for the population members of DE, called DEGL [7]. The idea is similar to the communities of the Particle Swarm Optimization (PSO) algorithms. The small neighborhoods are defined over the index-graph of the parameter vectors.
A self-adaptive differential evolution algorithm with opposition-based mechanisms is presented in [20]. This opposition-based mechanism can be used during the initialization of a population or later during the optimization process.
Zhang and Sanderson [37] proposed a self-adaptive DE (JADE) with the DE/current-to-pBest mutation strategy:

    v_i^{(G)} = x_i^{(G)} + F_i (x_{pBest}^{(G)} - x_i^{(G)}) + F_i (x_{r1}^{(G)} - x_{r2}^{(G)}),    (7)

where x_{pBest}^{(G)} is randomly chosen as one of the top 100p% individuals of the current population, with p ∈ (0, 1]. F_i is the scale factor associated with the i-th individual and is updated dynamically in each generation. Instead of adopting only the best individual, as the DE/current-to-best/1 strategy does, the DE/current-to-pBest/1 strategy utilizes information from other good solutions as well; it is thus a less greedy generalization of the DE/current-to-best/1 strategy. JADE also updates the control parameters in an adaptive manner along the generations. The algorithm uses an optional external archive to track the previous history of success and failure. x_{r2}^{(G)} is in this case selected at random from the union of the current population and the archive. The archive size is fixed, and if the size exceeds a certain threshold, some individuals from the archive are randomly eliminated.
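A sketch of mutation (7), under our own simplifying assumptions (the population is pre-sorted by fitness with the best first, and the exclusion rules for the archive indices are simplified relative to JADE proper):

```python
import random

def current_to_pbest_1(pop, archive, i, F_i, p=0.2, rng=random.Random(1)):
    """JADE-style mutation (7); pop is assumed sorted so that pop[0] is the best."""
    np_ = len(pop)
    # x_pbest: one of the top 100p% individuals, chosen at random
    pbest = pop[rng.randrange(max(1, int(p * np_)))]
    r1 = rng.choice([k for k in range(np_) if k != i])
    union = pop + archive            # x_r2 comes from population ∪ archive
    r2 = rng.choice([k for k in range(len(union)) if k not in (i, r1)])
    x_i = pop[i]
    return [x_i[j] + F_i * (pbest[j] - x_i[j]) + F_i * (pop[r1][j] - union[r2][j])
            for j in range(len(x_i))]
```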
A new mutation strategy based on the best of a group of randomly selected solutions from the current generation (called DE/current-to-gr_best/1) was proposed in [19].
Success-History based Adaptive DE (SHADE) [28] is an improved version of JADE [37]. It uses a historical memory in order to adapt the control parameters F and CR, the current-to-pBest/1 mutation strategy, and an external archive. SHADE performed excellently in the CEC 2013 competition on Real-Parameter Single Objective Optimization.
Wang et al. in [34] proposed a new DE which employs self-adapting control parameters and generalized opposition-based learning.
Distributed DE is another very prominent optimization technique. In [12] a distributed DE with several subpopulations and two migration selection approaches to maintaining a high diversity in the subpopulations is presented.

A wider overview of DE-related work and its applicability within various domains can be found in surveys about DE [23,13]. Recently, adaptive DE algorithms have been used for optimization within different domains [38,17,2,33,32,3,18], and many others, which clearly indicates the high usability of adaptive and self-adaptive mechanisms in DE algorithms.

3.2 Self-adaptation of F and CR: A Case Study of the jDE Algorithm

The self-adaptive jDE algorithm was introduced in 2006 [7]. The self-adapting mechanism uses the rand/1/bin strategy and is applied to the control parameters F and CR. The third control parameter, NP, remains unchanged. However, it seems that NP also plays an important role among the control parameters in DE [9,5].
jDE-based algorithms have been applied to solve large-scale single objective optimization problems: CEC 2008 [11], CEC 2010 [10,6], CEC 2012 [5], CEC 2013 [6], large-scale continuous optimization problems [9], dynamic optimization [8], and real problems [36,35].
In [7] a self-adaptive control mechanism was used to change the control parameters F and CR during a run. Each individual in the population was extended with the values of these two control parameters (see Figure 1). Both of them were applied at the individual level. Better values of these (encoded) control parameters lead to better individuals which, in turn, are more likely to survive and produce offspring and, hence, propagate these better parameter values.
In jDE [7], the new control parameters F_i^{(G+1)} and CR_i^{(G+1)} are calculated before the mutation operator as follows [7]:

    F_i^{(G+1)}  = F_l + rand_1 * F_u,  if rand_2 < τ_1,
                   F_i^{(G)},           otherwise,

    CR_i^{(G+1)} = rand_3,              if rand_4 < τ_2,
                   CR_i^{(G)},          otherwise,

where rand_j, for j ∈ {1, 2, 3, 4}, are uniform random values within the range [0, 1]. The jDE algorithm uses the DE/rand/1/bin strategy. The presented self-adaptive mechanism can also be used with other DE strategies [6].
In [7] the parameters τ_1, τ_2, F_l, F_u are fixed to the values 0.1, 0.1, 0.1, 0.9, respectively.
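One self-adaptation step of jDE can be sketched as follows (the function name jde_update is ours; the order in which the uniform random numbers are drawn does not matter):

```python
import random

def jde_update(F_old, CR_old, rng, tau1=0.1, tau2=0.1, Fl=0.1, Fu=0.9):
    """One jDE self-adaptation step: resample F or CR with probabilities tau1, tau2."""
    if rng.random() < tau1:                 # rand_2 < tau_1
        F_new = Fl + rng.random() * Fu      # F_new in [Fl, Fl + Fu], i.e. [0.1, 1.0]
    else:
        F_new = F_old
    if rng.random() < tau2:                 # rand_4 < tau_2
        CR_new = rng.random()               # CR_new in [0, 1)
    else:
        CR_new = CR_old
    return F_new, CR_new
```

With the default parameters, each individual's F stays in [0.1, 1.0] and is regenerated on average once every ten generations.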
Figure 1 presents the usage of one mutation strategy with two self-adaptive control parameters in each individual. As a particular strategy might exhibit different performance properties during stages of the evolutionary process, it is reasonable to apply two or more strategies in the DE algorithm. Each strategy can have its own control parameters (see Figure 2). The simple way of applying several strategies is to use them with the same probability. More sophisticated usages of many strategies are proposed in the literature [25,9], like adaptive mechanisms which usually utilize the rule that a better strategy should have a higher probability of being chosen during the mutation.

Table 1. Properties of the CEC 2013 benchmark functions [21]

No.  Function                                             f_i* = f_i(x*)
Unimodal Functions
  1  Sphere Function                                          -1400
  2  Rotated High Conditioned Elliptic Function               -1300
  3  Rotated Bent Cigar Function                              -1200
  4  Rotated Discus Function                                  -1100
  5  Different Powers Function                                -1000
Basic Multimodal Functions
  6  Rotated Rosenbrock's Function                             -900
  7  Rotated Schaffer's F7 Function                            -800
  8  Rotated Ackley's Function                                 -700
  9  Rotated Weierstrass Function                              -600
 10  Rotated Griewank's Function                               -500
 11  Rastrigin's Function                                      -400
 12  Rotated Rastrigin's Function                              -300
 13  Non-Continuous Rotated Rastrigin's Function               -200
 14  Schwefel's Function                                       -100
 15  Rotated Schwefel's Function                                100
 16  Rotated Katsuura Function                                  200
 17  Lunacek Bi-Rastrigin Function                              300
 18  Rotated Lunacek Bi-Rastrigin Function                      400
 19  Expanded Griewank's plus Rosenbrock's Function             500
 20  Expanded Schaffer's F6 Function                            600
Composition Functions
 21  Composition Function 1 (n=5, Rotated)                      700
 22  Composition Function 2 (n=3, Unrotated)                    800
 23  Composition Function 3 (n=3, Rotated)                      900
 24  Composition Function 4 (n=3, Rotated)                     1000
 25  Composition Function 5 (n=3, Rotated)                     1100
 26  Composition Function 6 (n=5, Rotated)                     1200
 27  Composition Function 7 (n=5, Rotated)                     1300
 28  Composition Function 8 (n=5, Rotated)                     1400
Search Range: [-100, 100]^D

x_{1,1}   x_{1,2}   ...  x_{1,D}    F_1    CR_1
x_{2,1}   x_{2,2}   ...  x_{2,D}    F_2    CR_2
  ...       ...     ...    ...      ...    ...
x_{NP,1}  x_{NP,2}  ...  x_{NP,D}   F_NP   CR_NP

Fig. 1. Population and control parameters within one generation. Each individual has
its own F and CR control parameters.

                                   strategy1      strategy2
x_{1,1}   x_{1,2}   ...  x_{1,D}   F_1    CR_1    F_1    CR_1
x_{2,1}   x_{2,2}   ...  x_{2,D}   F_2    CR_2    F_2    CR_2
  ...       ...     ...    ...     ...    ...     ...    ...
x_{NP,1}  x_{NP,2}  ...  x_{NP,D}  F_NP   CR_NP   F_NP   CR_NP

Fig. 2. Each individual has its own two pairs of F and CR control parameters; each pair belongs to one DE strategy

Let us demonstrate the usage of two strategies with the same probability:

    if rand(0, 1) < 0.5 then apply strategy1
    else apply strategy2

To keep it simple, strategy2 will be equal to strategy1, i.e. DE/rand/1/bin, and we name this variant jDE-2bin. Note that each strategy uses its own control parameters, which can be defined on different intervals.
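The per-individual choice between two strategies of Figure 2 can be sketched as follows (the function name pick_strategy and the tuple layout are ours):

```python
import random

def pick_strategy(ind_params, rng):
    """Choose strategy 1 or 2 with equal probability; each carries its own (F, CR).

    ind_params = ((F1, CR1), (F2, CR2)), stored with the individual as in Fig. 2.
    """
    s = 0 if rng.random() < 0.5 else 1
    F, CR = ind_params[s]
    return s, F, CR
```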

4 Experimental Results

In this section we present some experimental results of the original DE algorithm, jDE, and jDE-2bin. The experiments were conducted on the Congress on Evolutionary Computation (CEC 2013) benchmark functions for real parameter single objective optimization.
A set of 28 benchmark functions [21] was used, and the general features of the functions are presented in Table 1. The functions are divided into unimodal, multimodal, and composition functions. We used each algorithm as a black-box optimizer. Here we made experiments on benchmark functions with dimension

Table 2. Experimental results of the jDE algorithm with dimension D = 10

Func.     Best          Worst         Median        Mean          Std
  1   0.0000e+00    0.0000e+00    0.0000e+00    0.0000e+00    0.0000e+00
  2   1.4300e-05    5.0129e+02    9.1825e-02    1.3455e+01    7.0268e+01
  3   7.1865e-02    2.3999e+02    1.3621e+00    1.0323e+01    3.4629e+01
  4   1.2219e-07    2.7820e-02    1.3462e-04    2.6599e-03    6.5131e-03
  5   0.0000e+00    0.0000e+00    0.0000e+00    0.0000e+00    0.0000e+00
  6   5.5696e-02    9.8124e+00    9.8124e+00    9.6211e+00    1.3662e+00
  7   6.8186e-04    1.9680e-01    1.3616e-02    2.5876e-02    3.4604e-02
  8   2.0124e+01    2.0514e+01    2.0377e+01    2.0375e+01    7.6663e-02
  9   7.0932e-01    7.1391e+00    4.4930e+00    4.1174e+00    1.4883e+00
 10   1.7265e-02    2.8168e-01    9.5680e-02    1.1227e-01    4.7886e-02
 11   0.0000e+00    0.0000e+00    0.0000e+00    0.0000e+00    0.0000e+00
 12   7.3856e+00    1.9244e+01    1.2995e+01    1.2901e+01    2.7882e+00
 13   5.2645e+00    2.2631e+01    1.4529e+01    1.4394e+01    4.0652e+00
 14   0.0000e+00    6.2454e-02    0.0000e+00    1.2246e-03    8.7454e-03
 15   5.9701e+02    1.3980e+03    1.1144e+03    1.1154e+03    1.5679e+02
 16   6.6797e-01    1.5512e+00    1.0848e+00    1.0947e+00    1.7848e-01
 17   1.0122e+01    1.0122e+01    1.0122e+01    1.0122e+01    2.4162e-08
 18   2.3314e+01    3.9173e+01    3.4674e+01    3.3170e+01    3.9930e+00
 19   1.8320e-01    5.7525e-01    4.4065e-01    4.3237e-01    7.3646e-02
 20   2.4270e+00    3.4213e+00    3.1028e+00    3.0621e+00    2.1466e-01
 21   2.0000e+02    4.0019e+02    4.0019e+02    3.6487e+02    7.7077e+01
 22   1.2047e+01    1.0682e+02    5.0640e+01    6.4325e+01    3.8390e+01
 23   5.5666e+02    1.4865e+03    1.1907e+03    1.1494e+03    1.6935e+02
 24   1.3780e+02    2.1460e+02    2.0784e+02    2.0745e+02    1.0245e+01
 25   1.3693e+02    2.1597e+02    2.0650e+02    2.0705e+02    1.0610e+01
 26   1.1345e+02    2.0002e+02    2.0002e+02    1.8784e+02    2.7149e+01
 27   3.0002e+02    5.2152e+02    4.8435e+02    4.7721e+02    4.5661e+01
 28   1.0000e+02    3.0000e+02    3.0000e+02    2.8824e+02    4.7527e+01

D = 10, and 51 runs of the algorithm were executed for each function. The optimal values are known a priori for all benchmark functions, and therefore we can compute the error between the value obtained by our algorithm and the optimal value. Note that error values smaller than 10^-8 are taken as zero.
In the experiments, the parameters of the DE algorithm were set as follows:
F = 0.5, CR = 0.1,
F = 0.5, CR = 0.9,
F = 0.9, CR = 0.9, and
NP = 100.
The parameters of the jDE algorithm were set as follows:

F and CR were self-adaptive,


NP = 100,
Fl = 0.1, Fu = 0.9 (then F ∈ [0.1, 1.0]),
CR ∈ [0, 1].
The parameters of the jDE-2bin algorithm were set as follows:
F and CR for both strategies were self-adaptive,
NP = 100,
strategy1:
Fl = 0.1, Fu = 0.9 (then F ∈ [0.1, 1.0]),
CR ∈ [0, 1];
strategy2:
Fl = 0.3, Fu = 0.7 (then F ∈ [0.3, 1.0]),
CR ∈ [0.9, 1].
The strategy1 is the same as in jDE, while strategy2 differs only in using narrowed intervals for the F and CR control parameters.
The obtained results (error values f(x) - f(x*)) are presented in Tables 2 and 3. In Table 2 the best, worst, median, mean, and standard deviation (Std) values are shown. In Table 3 only the values of the mean and standard deviation are presented.
Table 3 shows the results of the original DE algorithm with different control parameter values. These values remained unchanged during the evolutionary process. The obtained results (Table 3) indicate that the performance of the DE algorithm is highly dependent on the values of the F and CR control parameters. It is obvious that particular control parameter values are more suitable than others.
At the bottom of Table 3, the symbols +, -, and = indicate how many times the DE variant with particular fixed values of F and CR was better, worse, or equal, respectively, when its mean value is compared against the jDE algorithm (see Table 2). From the obtained results one can see that DE with F = 0.5, CR = 0.1 performed very competitively compared with jDE, and that no algorithm performed best on all benchmark functions, nor did a single algorithm show superior performance on the whole set of unimodal, multimodal, and composition functions.
If we rank the algorithms for each function based on the mean value, the jDE obtained the best or second-best result (i.e., rank 1 or rank 2) 23 times, and was ranked only third or fourth 5 times.
Table 4 shows the results of the jDE-2bin algorithm. This algorithm performed slightly better than jDE based on the comparison of the mean values. The jDE-2bin algorithm illustrates the use of separate parameters for two strategies.
Our main objective in this chapter was to present those adaptive and/or self-adaptive mechanisms incorporated within DE that are useful most of the time. It is, however, not necessary that adaptive or self-adaptive variants of an algorithm perform best in all cases.
On the other hand, a reader can find very competitive DE-based algorithms, as well as other evolutionary algorithms for solving real parameter single objective optimization problems, at the CEC competitions web-page1.
1 http://www.ntu.edu.sg/home/epnsugan/index_files/cec-benchmarking.htm

Table 3. Experimental results of the DE algorithms with varying F and CR for dimension D = 10

        F = 0.5, CR = 0.1          F = 0.5, CR = 0.9          F = 0.9, CR = 0.9
Fun.      Mean        Std            Mean        Std            Mean        Std
  1   0.0000e+00  0.0000e+00     0.0000e+00  0.0000e+00     1.2023e-04  5.5479e-05
  2   2.3127e+06  9.7442e+05     0.0000e+00  0.0000e+00     5.6387e+04  2.4990e+04
  3   1.5106e+07  7.9035e+06     4.2452e-01  1.2253e+00     1.4479e+07  7.3957e+06
  4   1.2031e+04  3.0106e+03     0.0000e+00  0.0000e+00     6.6812e+02  2.8707e+02
  5   0.0000e+00  0.0000e+00     0.0000e+00  0.0000e+00     1.3551e-03  4.5954e-04
  6   8.7326e+00  2.0644e+00     2.5081e+00  4.3149e+00     3.1366e-02  1.2478e-02
  7   2.1478e+01  5.0289e+00     5.5674e-04  4.4470e-04     1.4942e+01  3.5674e+00
  8   2.0360e+01  8.2596e-02     2.0371e+01  7.4255e-02     2.0366e+01  8.0270e-02
  9   5.6915e+00  5.7842e-01     2.2847e+00  1.8598e+00     4.2039e+00  1.5583e+00
 10   2.3627e+00  6.9006e-01     3.7893e-01  1.4000e-01     7.6823e-01  8.8696e-02
 11   0.0000e+00  0.0000e+00     1.7712e+01  3.3736e+00     2.9337e+01  4.6283e+00
 12   1.9102e+01  3.5445e+00     2.5926e+01  3.7017e+00     3.9925e+01  5.2473e+00
 13   1.9281e+01  5.2840e+00     2.6128e+01  4.2030e+00     4.2800e+01  5.0643e+00
 14   1.4695e-02  2.6756e-02     1.1225e+03  1.4778e+02     1.3549e+03  1.7886e+02
 15   1.0937e+03  1.3169e+02     1.3605e+03  1.5519e+02     1.4594e+03  1.9185e+02
 16   1.0595e+00  1.6958e-01     1.1241e+00  2.0099e-01     1.1729e+00  1.7017e-01
 17   1.0115e+01  4.9699e-02     2.9473e+01  3.6393e+00     4.7494e+01  5.8385e+00
 18   3.5827e+01  4.3712e+00     3.6229e+01  3.8113e+00     5.5594e+01  5.5236e+00
 19   3.5892e-01  9.9331e-02     2.0207e+00  3.7185e-01     3.3147e+00  5.3829e-01
 20   3.2328e+00  2.0043e-01     2.7311e+00  2.2156e-01     3.3648e+00  1.5551e-01
 21   3.1605e+02  6.8357e+01     3.7272e+02  6.9575e+01     3.3165e+02  9.8850e+01
 22   4.9795e+01  1.4024e+01     6.7595e+02  1.3741e+02     9.2255e+02  1.5388e+02
 23   1.2305e+03  1.8380e+02     1.4020e+03  1.6336e+02     1.4290e+03  2.6961e+02
 24   1.5780e+02  1.1512e+01     2.0576e+02  1.0603e+01     2.1017e+02  1.6746e+00
 25   2.0580e+02  1.3972e+01     2.0714e+02  4.1909e+00     2.1011e+02  4.0935e+00
 26   1.4384e+02  7.6748e+00     1.8966e+02  2.6324e+01     2.0002e+02  9.0629e-04
 27   4.3025e+02  2.4863e+01     4.6000e+02  6.6187e+01     4.9254e+02  9.7315e+00
 28   2.9671e+02  2.0846e+01     2.8039e+02  6.0065e+01     2.8873e+02  4.7585e+01
      + 12                       + 11                       + 3
      - 13                       - 15                       - 25
      = 3                        = 3                        = 1
+, -, = means that the DE variant obtained better, worse, or
equal, respectively, mean values than the jDE algorithm.

Table 4. Experimental results of jDE-2bin with dimension D = 10 (in the last column, the jDE mean values are shown)

Func.     Best         Worst        Median        Mean         Std          jDE
  1   0.0000e+00   0.0000e+00   0.0000e+00   0.0000e+00   0.0000e+00   0.0000e+00
  2   0.0000e+00   0.0000e+00   0.0000e+00   0.0000e+00   0.0000e+00   1.3455e+01
  3   0.0000e+00   6.3150e+00   9.4812e-03   4.9932e-01   1.4939e+00   1.0323e+01
  4   0.0000e+00   0.0000e+00   0.0000e+00   0.0000e+00   0.0000e+00   2.6599e-03
  5   0.0000e+00   0.0000e+00   0.0000e+00   0.0000e+00   0.0000e+00   0.0000e+00
  6   0.0000e+00   9.8124e+00   9.8124e+00   5.7720e+00   4.8773e+00   9.6211e+00
  7   3.0599e-07   1.1855e-02   3.4670e-04   1.2833e-03   2.3572e-03   2.5876e-02
  8   2.0156e+01   2.0508e+01   2.0367e+01   2.0353e+01   7.7522e-02   2.0375e+01
  9   9.7457e-06   6.9339e+00   2.0218e+00   2.1233e+00   1.3887e+00   4.1174e+00
 10   0.0000e+00   8.8500e-02   3.6902e-02   3.8880e-02   2.2094e-02   1.1227e-01
 11   0.0000e+00   0.0000e+00   0.0000e+00   0.0000e+00   0.0000e+00   0.0000e+00
 12   1.9899e+00   2.0523e+01   1.0926e+01   1.1153e+01   4.2643e+00   1.2901e+01
 13   1.9899e+00   2.4023e+01   1.1866e+01   1.1438e+01   5.3269e+00   1.4394e+01
 14   0.0000e+00   6.2455e-02   1.8268e-07   1.3327e-03   8.7389e-03   1.2246e-03
 15   6.5552e+02   1.5800e+03   1.2127e+03   1.2142e+03   1.7457e+02   1.1154e+03
 16   4.7927e-01   1.5382e+00   1.1549e+00   1.1210e+00   2.0057e-01   1.0947e+00
 17   1.0122e+01   1.0177e+01   1.0122e+01   1.0125e+01   8.9654e-03   1.0122e+01
 18   2.2493e+01   4.6711e+01   3.4952e+01   3.4420e+01   4.4963e+00   3.3170e+01
 19   2.6684e-01   6.5008e-01   4.8083e-01   4.8599e-01   6.9329e-02   4.3237e-01
 20   2.1401e+00   3.3526e+00   2.8177e+00   2.8199e+00   2.8871e-01   3.0621e+00
 21   2.0000e+02   4.0019e+02   4.0019e+02   3.6879e+02   7.3529e+01   3.6487e+02
 22   2.1654e+01   1.5064e+02   1.0180e+02   1.0112e+02   1.6895e+01   6.4325e+01
 23   7.1354e+02   1.4959e+03   1.1863e+03   1.1762e+03   1.9219e+02   1.1494e+03
 24   1.1242e+02   2.1107e+02   2.0680e+02   2.0552e+02   1.3392e+01   2.0745e+02
 25   2.0454e+02   2.1781e+02   2.0611e+02   2.0721e+02   3.6675e+00   2.0705e+02
 26   1.0696e+02   2.0002e+02   2.0002e+02   1.9436e+02   2.0216e+01   1.8784e+02
 27   3.0000e+02   5.2502e+02   4.8169e+02   4.6092e+02   6.5859e+01   4.7721e+02
 28   1.0000e+02   3.0000e+02   3.0000e+02   2.9608e+02   2.8006e+01   2.8824e+02
      + 13
      - 12
      = 3
+, -, = means that the jDE-2bin variant obtained better, worse, or
equal, respectively, mean values than the jDE algorithm.

Here we conducted experiments with D = 10, while higher dimensions, e.g. D = 30, 50, or even higher, are furthermore challenging for DE-based and other evolutionary algorithms or algorithms inspired by nature.

5 Conclusion
This chapter presented the adaptive and self-adaptive mechanisms of control parameters in the Differential Evolution (DE) algorithm. More mutation strategies can also be applied in the algorithm. These strategies can have either common (adaptive or self-adaptive) control parameters, or each strategy can have its own control parameters. Using adaptation mechanisms of the control parameters, more mutation strategies, and also different crossover schemes shows that the DE algorithm can adapt to the particular problem being solved.

Acknowledgement. This work was supported in part by the Slovenian Research Agency under program P2-0041. The authors would like to thank the editors and reviewers for their constructive comments.

References
1. Ali, M.M., Törn, A.: Population Set-Based Global Optimization Algorithms: Some Modifications and Numerical Studies. Computers & Operations Research 31(10), 1703–1725 (2004)
2. Asafuddoula, M., Ray, T., Sarker, R.: An adaptive hybrid differential evolution algorithm for single objective optimization. Applied Mathematics and Computation 231, 601–618 (2014)
3. Baatar, N., Jeong, K.-Y., Koh, C.-S.: Adaptive Parameter Controlling Non-Dominated Ranking Differential Evolution for Multi-Objective Optimization of Electromagnetic Problems. IEEE Transactions on Magnetics 50(2) (February 2014)
4. Bäck, T.: Adaptive Business Intelligence Based on Evolution Strategies: Some Application Examples of Self-Adaptive Software. Information Sciences 148, 113–121 (2002)
5. Brest, J., Boskovic, B., Zamuda, A., Fister, I., Sepesy Maucec, M.: Self-Adaptive Differential Evolution Algorithm with a Small and Varying Population Size. In: IEEE World Congress on Computational Intelligence (IEEE WCCI 2012), Brisbane, Australia, pp. 2827–2834 (2012)
6. Brest, J., Boskovic, B., Zamuda, A., Fister, I., Mezura-Montes, E.: Real Parameter Single Objective Optimization using Self-Adaptive Differential Evolution Algorithm with more Strategies. In: IEEE Congress on Evolutionary Computation (CEC 2013), pp. 377–383 (2013)
7. Brest, J., Greiner, S., Boskovic, B., Mernik, M., Zumer, V.: Self-Adapting Control Parameters in Differential Evolution: A Comparative Study on Numerical Benchmark Problems. IEEE Transactions on Evolutionary Computation 10(6), 646–657 (2006)
8. Brest, J., Korosec, P., Silc, J., Zamuda, A., Boskovic, B., Maucec, M.S.: Differential evolution and differential ant-stigmergy on dynamic optimisation problems. International Journal of Systems Science 44, 663–679 (2013)
9. Brest, J., Maucec, M.S.: Self-adaptive differential evolution algorithm using population size reduction and three strategies. Soft Computing - A Fusion of Foundations, Methodologies and Applications 15(11), 2157–2174 (2011)
10. Brest, J., Zamuda, A., Boskovic, B., Fister, I., Maucec, M.S.: Large Scale Global Optimization using Self-adaptive Differential Evolution Algorithm. In: IEEE World Congress on Computational Intelligence, pp. 3097–3104 (2010)
11. Brest, J., Zamuda, A., Boskovic, B., Maucec, M.S., Zumer, V.: High-dimensional Real-parameter Optimization Using Self-adaptive Differential Evolution Algorithm with Population Size Reduction. In: 2008 IEEE World Congress on Computational Intelligence, pp. 2032–2039. IEEE Press (2008)
12. Cheng, J., Zhang, G., Neri, F.: Enhancing distributed differential evolution with multicultural migration for global numerical optimization. Information Sciences 247, 72–93 (2013)
13. Das, S., Suganthan, P.N.: Differential evolution: A survey of the state-of-the-art. IEEE Transactions on Evolutionary Computation 15(1), 27–54 (2011)
14. Eiben, A.E., Hinterding, R., Michalewicz, Z.: Parameter Control in Evolutionary Algorithms. IEEE Transactions on Evolutionary Computation 3(2), 124–141 (1999)
15. Eiben, A.E., Smith, J.E.: Introduction to Evolutionary Computing. In: Natural Computing. Springer, Berlin (2003)
16. Feoktistov, V.: Differential Evolution: In Search of Solutions. Springer Optimization and Its Applications. Springer-Verlag New York, Inc., Secaucus (2006)
17. Gong, W., Cai, Z., Yang, J., Li, X., Jian, L.: Parameter identification of an SOFC model with an efficient, adaptive differential evolution algorithm. International Journal of Hydrogen Energy 39(10), 5083–5096 (2014)
18. Hu, Z., Xiong, S., Fang, Z., Su, Q.: A Convergent Differential Evolution Algorithm with Hidden Adaptation Selection for Engineering Optimization. Mathematical Problems in Engineering (2014)
19. Islam, S.M., Das, S., Ghosh, S., Roy, S., Suganthan, P.N.: An adaptive differential evolution algorithm with novel mutation and crossover strategies for global numerical optimization. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics 42(2), 482–500 (2012)
20. Ku, J.H., Cai, Z.H., Zheng, B., Yun, D.W.: The research of self-adaptive differential evolution algorithm with opposition-based mechanisms. Applied Mechanics and Materials 543-547, 1706–1710 (2014)
21. Liang, J.J., Qu, B.-Y., Suganthan, P.N., Hernández-Díaz, A.G.: Problem Definitions and Evaluation Criteria for the CEC 2013 Special Session and Competition on Real-Parameter Optimization. Technical Report 201212, Computational Intelligence Laboratory, Zhengzhou University, Zhengzhou, China, and Technical Report, Nanyang Technological University, Singapore (2013)
22. Liu, J., Lampinen, J.: A Fuzzy Adaptive Differential Evolution Algorithm. Soft Computing - A Fusion of Foundations, Methodologies and Applications 9(6), 448–462 (2005)
23. Neri, F., Tirronen, V.: Recent advances in differential evolution: a survey and experimental analysis. Artificial Intelligence Review 33(1-2), 61–106 (2010)
24. Price, K.V., Storn, R.M., Lampinen, J.A.: Differential Evolution: A Practical Approach to Global Optimization. Springer (2005)
25. Qin, A.K., Huang, V.L., Suganthan, P.N.: Differential evolution algorithm with strategy adaptation for global numerical optimization. IEEE Transactions on Evolutionary Computation 13(2), 398–417 (2009)
26. Storn, R., Price, K.: Differential Evolution - a simple and efficient adaptive scheme for global optimization over continuous spaces. Technical Report TR-95-012, Berkeley, CA (1995)
27. Storn, R., Price, K.: Differential Evolution - A Simple and Efficient Heuristic for Global Optimization over Continuous Spaces. Journal of Global Optimization 11, 341–359 (1997)
28. Tanabe, R., Fukunaga, A.: Evaluating the performance of SHADE on CEC 2013 benchmark problems. In: 2013 IEEE Congress on Evolutionary Computation (CEC), pp. 1952–1959 (June 2013)
29. Teo, J.: Exploring dynamic self-adaptive populations in differential evolution. Soft Computing - A Fusion of Foundations, Methodologies and Applications 10(8), 673–686 (2006)
30. Tvrdík, J.: Competitive differential evolution. In: MENDEL 2006, 12th International Conference on Soft Computing, pp. 7–12 (2006)
31. Tvrdík, J.: Adaptation in differential evolution: A numerical comparison. Applied Soft Computing 9(3), 1149–1155 (2009)
32. Vasundhara, Mandal, D., Kar, R., Ghoshal, S.P.: Digital FIR filter design using fitness based hybrid adaptive differential evolution with particle swarm optimization. Natural Computing 13(1), 55–64 (2014)
33. Venske, S.M., Goncalves, R.A., Delgado, M.R.: ADEMO/D: Multiobjective optimization by an adaptive differential evolution algorithm. Neurocomputing 127, 65–77 (2014); 12th Brazilian Symposium on Neural Networks (SBRN) held as part of the 1st Brazilian Conference on Intelligent Systems (BRACIS), Curitiba, Brazil, October 20-25, 2012
34. Wang, H., Rahnamayan, S., Wu, Z.: Parallel differential evolution with self-adapting control parameters and generalized opposition-based learning for solving high-dimensional optimization problems. Journal of Parallel and Distributed Computing 73(1), 62–73 (2013)
35. Zamuda, A., Brest, J.: Vectorized Procedural Models for Animated Trees Reconstruction using Differential Evolution. Information Sciences 278, 1–21 (2014)
36. Zamuda, A., Brest, J., Boskovic, B., Zumer, V.: Differential Evolution for Parameterized Procedural Woody Plant Models Reconstruction. Applied Soft Computing 11, 4904–4912 (2011)
37. Zhang, J., Sanderson, A.C.: JADE: Adaptive Differential Evolution with Optional External Archive. IEEE Transactions on Evolutionary Computation 13(5), 945–958 (2009)
38. Zhong, Y., Zhao, L., Zhang, L.: An Adaptive Differential Evolution Endmember Extraction Algorithm for Hyperspectral Remote Sensing Imagery. IEEE Geoscience and Remote Sensing Letters 11(6), 1061–1065 (2014)
On the Mutation Operators
in Evolution Strategies

Iztok Fister Jr. and Iztok Fister

University of Maribor, Faculty of Electrical Engineering and Computer Science


Smetanova ul. 17, 2000 Maribor, Slovenia
{iztok.fister1,iztok.fister}@um.si
http://www.feri.um.si

Abstract. Self-adaptation of control parameters is realized in classical evolution strategies (ES) using appropriate mutation operators controlled by strategy parameters (i.e. mutation strengths) that are embedded into the representation of individuals. The mutation strengths determine the direction and the magnitude of the changes, on the basis of which the new position of the individuals in the search space is determined. This chapter analyzes the characteristics of classical mutation operators, like uncorrelated mutation with one step size and uncorrelated mutation with n step sizes. In line with this, an uncorrelated mutation with n 4-dimensional vectors is proposed that, beside the mutation strengths, utilizes two additional strategy parameters embedded into the 4-dimensional structure used for the definition of the change, i.e., shifting the location of the normal distribution by a small shift angle and reversing the sign of the change. The promising results obtained on a suite of ten benchmark functions taken from the publications show that ES, despite their maturity, remain an interesting area of future research.

Keywords: evolution strategies, uncorrelated mutations, covariance matrix adaptation, uncorrelated mutation with n 4D-vectors.

1 Introduction

This chapter focuses on self-adaptation in evolution strategies (ES). Primarily, self-adaptation has been gaining popularity due to its flexibility in adapting to different fitness landscapes [1]. This method enables an implicit learning of mutation strengths in real-valued search spaces. Self-adaptation is based on a mutation operator that modifies the problem variables using the strategy parameters in order to search the problem and parameter spaces simultaneously [2]. In line with this, the best values of problem variables and strategy parameters survive during the evolutionary process.
Evolutionary algorithms (EAs) are intrinsically dynamic, adaptive processes [3]. Therefore, keeping the strategy parameters that control the behavior of these algorithms fixed during the run is in contrast with the idea of evolution that underlies

Corresponding author.

© Springer International Publishing Switzerland 2015
I. Fister and I. Fister Jr. (eds.), Adaptation and Hybridization in Computational Intelligence,
Adaptation, Learning, and Optimization 18, DOI: 10.1007/978-3-319-14400-9_3
evolutionary computation (EC). In line with this, the evolution of evolution has been developed by Rechenberg [31] and Schwefel [32], where the strategy parameters are put into the representation of individuals and undergo the operations of the variation operators. As a result, those values of strategy parameters that modify the problem variables and are the best adapted to the fitness landscape, as determined by the fitness function of the problem to be solved, survive the evolutionary search process. This process of simultaneously evolving the strategy parameters together with the problem variables is also named self-adaptation in EC.
Although adaptation can tackle various elements of EAs, like the representation of individuals (e.g., Schaefer in [5]) or the operators (e.g., Schaffer et al. in [6,7]), it has especially been enforced by adapting the control parameters that regulate the mutation strengths by the mutation operators [8,9]. A comprehensive survey of self-adaptation in EAs can be found in [4,24,10].
In classical ES, three types of mutation operators have been applied: the uncorrelated mutation with one step size, the uncorrelated mutation with n step sizes, and the correlated mutation using a covariance matrix [28]. In the first mutation, the same distribution is used to mutate each problem variable; in the second, different step sizes are used for different dimensions; while in the last mutation, not only the magnitude in each dimension but also the direction (i.e., rotation angle) is taken into account [33,11,12].
This paper proposes a modified mutation operator for ES, where each problem variable in the representation of an individual is widened with three strategy parameters, i.e., a step size, a shift angle, and a reverse sign determining the sign of the change. In other words, if the sign is positive, the change is added to the corresponding problem variable; otherwise, the change is subtracted. Using these strategy parameters, the magnitude, direction, and sign of the change are determined. The motivation behind the mutation operator is that each element of the solution vector explores the search space independently of the other elements, but the way of exploring this space depends on the strategy parameters. In fact, this mutation is uncorrelated. On the other hand, each element of the solution vector is represented by a four-dimensional vector (also 4D-vector). Although mathematical structures for the description of four-dimensional vector spaces already exist, like quaternions [17] or 4-vectors [29], these structures refer to well-defined algebras and are therefore too complex for our needs. As a matter of fact, the modified mutation is referred to as an uncorrelated mutation with n 4D-vectors.
Consequently, the proposed self-adaptive evolution strategy using the uncorrelated mutation with n 4D-vectors (4SA-ES) was applied to a test suite consisting of ten well-known functions taken from the literature. The obtained results of 4SA-ES were compared with the results of the original ESs that implement the uncorrelated mutation with one step size, on the one hand, and the uncorrelated mutation with n step sizes, on the other. Additionally, other EA and SI algorithms, like differential evolution [13] (DE), self-adaptive differential evolution [14] (jDE), and the hybrid self-adaptive bat algorithm [18] (HSABA), are also included in this comparative study. The results of the 4SA-ES showed
that the ES, despite its maturity, is still not completely explored and nevertheless offers many directions for future development. Furthermore, this contribution of self-adaptation could also be applied in contemporary nature-inspired algorithms, i.e., a domain that usually forgets the valuable features of ES.
The structure of the remainder of this paper is as follows. Section 2 deals with the background information needed by the reader. In line with this, the mutation operators in ES are discussed in detail. Section 3 describes the proposed 4SA-ES. In Section 4, the experiments and results are discussed. The paper concludes with Section 5, where possible directions for further development of this algorithm are discussed.

2 Self-adaptive Evolution Strategies


Self-adaptive EAs accumulate all information about a problem explored up to a given moment in a population of solutions [34]. Their efficiency depends on the characteristics of the population, i.e., the accumulated information about the problem as written in the genotypes of individuals. The higher the population diversity, the higher the search power of self-adaptive EAs. On the basis of the solution quality, the algorithm decides how to progress with the search process. Obviously, this progress is affected by an appropriate setting of the strategy parameters [15].
There are different strategy parameters in self-adaptive EAs, e.g., the probability of mutation pm, the probability of crossover pc, the population size Np, etc. In order to better adapt the parameter setting to the fitness landscape of the problem, the self-adaptation of strategy parameters has emerged, which is tightly connected with the development of the so-named self-adaptive evolution strategies (SA-ES) [31,32,2]. SA-ES were especially useful for solving continuous optimization problems. Today, they are also successfully applied for solving discrete optimization problems [16].
Like each EA, the SA-ES also consists of the following components [28]:
- representation of individuals,
- evaluation function,
- mutation,
- crossover, and
- survivor selection.
In the remainder of this chapter, these components of the original SA-ES are described in detail. This section concludes with an outline of the original SA-ES.

2.1 Representation of Individuals

Historically, the original ESs were applied to continuous optimization problems. Therefore, the problem variables represent floating-point coefficients xi ∈ R for i = 1 . . . n of the objective function f(x) for which the optimum value is searched
for, where n denotes the number of problem variables. In order to implement the self-adaptation of strategy parameters in ES, these parameters are added to the representation of the problem variables and become a subject of the variation operators, crossover and mutation.
Typically, the strategy parameters of mutation strengths are self-adapted in the SA-ES. These parameters enable calculating the mutation step sizes that determine the magnitude and direction of the changes applied to the corresponding problem variables. The number of mutation strength variables depends on the mutation type and can vary from a single value σ, to nσ values σi dedicated to each problem variable xi, or even to the matrix of rotation angles αij when the correlated mutation is used. In general, the individual is represented as [28]:

⟨x1, . . . , xn, σ1, . . . , σ_nσ, α1, . . . , α_nα⟩, (1)

where n determines the number of problem variables xi, nσ the number of self-adaptive strategy parameters σj, and nα the size of the correlation matrix, usually determined by the expression nα = (n − nσ/2)(nσ − 1). Generally, the search space of candidate solutions is determined by S = R^n × R^nσ × R^nα.

2.2 Evaluation Function

The evaluation function (also fitness function) in EAs is connected with the problem to be solved and represents the requirements the algorithm has to adapt to [28]. Using this function, the quality of a solution in the problem space is estimated. Actually, this can be either a procedure or a function that assigns a quality measure to each candidate solution in the population.

2.3 Mutation

Typically, the strategy parameters are used by the mutation operator. This operator is based on the normal (also Gaussian) distribution that depends on two parameters, i.e., the mean value μ and the standard deviation σ(t). Mutation adds to each problem variable xi(t) a mutation step size Δxi(t) that is randomly drawn from the normal distribution N(μ, σ) with the corresponding probability density [36]:

p(Δxi(t)) = 1/(σ(t)·√(2π)) · exp(−(Δxi(t) − μ)²/(2σ²)). (2)

In practice, the mean value is set to zero (μ = 0), while the candidate solution ⟨x1(t), . . . , xn(t), σ(t)⟩ is mutated such that each value of variable xi(t) is modified as

xi(t+1) = xi(t) + N(0, σ(t+1)), (3)

where σ(t+1) denotes the updated value of the mutation strength and N(0, σ(t+1)) is a random value drawn from the normal distribution with a mean of zero and standard deviation σ(t+1). On the other hand, a specialty of mutation in SA-ES is that the mutation strengths determined by σ(t+1) are part of the selection and undergo the variation operators. A slightly simplified version of Eq. (3) can be employed, as

xi(t+1) = xi(t) + σ(t+1) · N(0, 1), (4)

where it is assumed that N(0, σ(t+1)) = σ(t+1) · N(0, 1), and N(0, 1) denotes a random value drawn from the normal distribution with a mean of zero and a standard deviation of one.

Uncorrelated Mutation with One Step Size. In the simplest case, a single mutation strength is applied to all problem variables xi(t), and thus the representation of individuals is reduced to ⟨x1(t), . . . , xn(t), σ(t)⟩. From the candidate solution, mutation generates the modified values of the offspring ⟨x1(t+1), . . . , xn(t+1), σ(t+1)⟩ according to the following equations:

σ(t+1) = σ(t) · exp(τ · N(0, 1)), (5)

xi(t+1) = xi(t) + σ(t+1) · Ni(0, 1). (6)

In Eq. (5), the parameter τ denotes a learning rate, similar to artificial neural networks (ANN) [30], set by the user. Usually, this parameter is set proportionally to the square root of the problem size, i.e., τ = 1/√n. Eq. (6) shows how the mutated values of the problem variables xi(t+1) are obtained. Actually, the term σ(t+1) · Ni(0, 1) in the equation denotes the mutation step size of the i-th problem variable. Thus, the mutation strengths are not explicitly controlled by the user, but are a part of the solution evolving during the evolutionary process. In order to prevent the standard deviation from coming too near to zero, the following limitation is used:

σ′ < ε0 ⇒ σ′ = ε0. (7)

This mutation makes variable step sizes possible. At the beginning of the evolutionary process, when the whole search space is explored, larger step sizes are needed. On the other hand, when the population is directed towards the optimal solutions, the step sizes can become smaller.
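Eqs. (5)-(7) can be sketched in a few lines of Python (the language, function name, and constant values are our own illustrative choices, not the chapter's):

```python
import math
import random

EPSILON_0 = 1e-5  # lower bound on sigma, the boundary rule of Eq. (7)

def mutate_one_step(x, sigma, tau):
    """Uncorrelated mutation with one step size, Eqs. (5)-(7):
    first update sigma log-normally, then perturb every variable
    with the same updated step size."""
    sigma_new = sigma * math.exp(tau * random.gauss(0.0, 1.0))   # Eq. (5)
    if sigma_new < EPSILON_0:                                    # Eq. (7)
        sigma_new = EPSILON_0
    x_new = [xi + sigma_new * random.gauss(0.0, 1.0) for xi in x]  # Eq. (6)
    return x_new, sigma_new

random.seed(1)
n = 5
tau = 1.0 / math.sqrt(n)     # learning rate tau = 1/sqrt(n)
x, sigma = [0.0] * n, 0.01
x_new, sigma_new = mutate_one_step(x, sigma, tau)
```

Note that sigma is updated before it is used, so the offspring is evaluated with the very step size it carries, which is what makes the strength a subject of selection.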

Uncorrelated Mutation with n Step Sizes. The motivation behind the uncorrelated mutation with n step sizes is dealing with each dimension of the solution x(t) differently. Therefore, different values of the mutation strengths are applied for each dimension of the candidate solution. A main reason for implementing this mutation lies in the fact that the fitness landscape is not flat. For instance, in a 3-dimensional search space, the gradient in the direction of the abscissa axis is not the same as in the ordinate axis. In this case, n mutation strengths are added to the problem variables x1(t), . . . , xn(t). As a result, the candidate solution is represented as ⟨x1(t), . . . , xn(t), σ1(t), . . . , σn(t)⟩, while the mutation operation is described by the following equations:

σi(t+1) = σi(t) · exp(τ′ · N(0, 1) + τ · Ni(0, 1)), (8)

xi(t+1) = xi(t) + σi(t+1) · Ni(0, 1), (9)

where τ′ ∝ 1/√(2n) and τ ∝ 1/√(2√n) denote the learning rates. Here, Eq. (7) also prevents too small values of the mutation strengths from terminating the evolutionary search process.
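The per-dimension variant of Eqs. (8)-(9) differs from the one-step-size case only in that each coordinate carries its own sigma, and one global Gaussian draw is shared across all of them. A minimal sketch, with our own naming:

```python
import math
import random

EPSILON_0 = 1e-5  # lower bound on every sigma, Eq. (7)

def mutate_n_steps(x, sigmas):
    """Uncorrelated mutation with n step sizes, Eqs. (8)-(9)."""
    n = len(x)
    tau_prime = 1.0 / math.sqrt(2.0 * n)            # global learning rate
    tau = 1.0 / math.sqrt(2.0 * math.sqrt(n))       # coordinate-wise learning rate
    g = random.gauss(0.0, 1.0)                      # one draw shared by all sigmas
    new_sigmas = [max(s * math.exp(tau_prime * g + tau * random.gauss(0.0, 1.0)),
                      EPSILON_0)
                  for s in sigmas]
    new_x = [xi + si * random.gauss(0.0, 1.0)       # Eq. (9)
             for xi, si in zip(x, new_sigmas)]
    return new_x, new_sigmas

random.seed(7)
x, sigmas = [0.0, 0.0, 0.0], [0.01, 0.01, 0.01]
x2, s2 = mutate_n_steps(x, sigmas)
```

The shared draw g lets the overall mutability grow or shrink as a whole, while the per-coordinate draws adapt each axis individually.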

Correlated Mutation. In the correlated mutation, in addition to the n problem variables and n mutation strengths, also at most n(n−1)/2 rotation angles αj, calculated from the covariance matrix C(t), are included in the representation of the candidate solution [28]. Consequently, the candidate solution is now represented as

⟨x1(t), . . . , xn(t), σ1(t), . . . , σn(t), α1(t), . . . , α_n(n−1)/2(t)⟩.

In this way, the size of each solution is increased to (n(n−1)/2 + 2n), where the (n(n−1)/2 + n) strategy parameters are self-adapted during the mutation [37]. The mutation operator obeys the following update rules [19]:

σi(t+1) = σi(t) · exp(τ′ · N(0, 1) + τ · Ni(0, 1)), (10)

αj(t+1) = αj(t) + β · Nj(0, 1), (11)

x(t+1) = x(t) + N(0, C(σ(t+1), α(t+1))), (12)

where N(0, C(σ(t+1), α(t+1))) is an implementation of a normally distributed correlated mutation vector with a zero mean vector and a covariance matrix C. The parameter β is fixed to 5° (i.e., 5π/180 ≈ 0.0873) according to Schwefel [20]. A limitation is used in order to prevent a rotation angle αj from crossing the border of the interval [−π, π]. The values of τ′ and τ are set similarly to the uncorrelated mutation with n step sizes. The rotation angles are initialized randomly in the interval αj ∈ [0, π].
In summary, the mutation in ES is an asexual operator symbolically described as

m{τ,τ′,β}: S → S, (13)

where τ, τ′, and β are predefined constants.
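Sampling the correlated mutation vector of Eq. (12) does not require building C explicitly: a common implementation draws an uncorrelated vector and then applies successive plane rotations by the angles αij. The following sketch illustrates that scheme; the dict-of-angles layout and all names are our own assumptions, not the chapter's:

```python
import math
import random

def correlated_sample(sigmas, alphas):
    """Draw one correlated mutation vector. alphas[(i, j)] holds the
    rotation angle for the coordinate plane (i, j); missing pairs
    default to zero (no correlation). The uncorrelated draws are
    rotated plane by plane, which realizes N(0, C(sigma, alpha))."""
    n = len(sigmas)
    z = [s * random.gauss(0.0, 1.0) for s in sigmas]   # uncorrelated part
    for i in range(n - 1):
        for j in range(i + 1, n):
            a = alphas.get((i, j), 0.0)
            zi, zj = z[i], z[j]
            z[i] = zi * math.cos(a) - zj * math.sin(a)  # rotate plane (i, j)
            z[j] = zi * math.sin(a) + zj * math.cos(a)
    return z

random.seed(3)
dz = correlated_sample([0.5, 0.1], {(0, 1): math.pi / 4})  # 45-degree tilt
```

With all angles at zero, the rotations are identities and the operator degenerates to the uncorrelated mutation with n step sizes.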
2.4 Crossover

Although the first ES employed only a mutation operator for modifying the candidate solution, contemporary ES employ the crossover operator as well. In contrast to classical EAs, crossovers in ES generate one offspring from two parents. There are two types of ES crossovers in general, i.e., discrete and arithmetic. The discrete crossover selects the value for the offspring randomly between the values lying at the same position in the parent chromosomes. The arithmetic crossover determines the value for the offspring from the values lying at the same position in the parent chromosomes according to the following equation [37]:

x′i = α · xi(1) + (1 − α) · xi(2), (14)

where the parameter α can occupy values from the interval [0 . . . 1]. When α = 0.5, the corresponding crossover becomes the uniform arithmetic crossover [36]. Symbolically, the ES crossover operator is presented as

c: S × S → S, (15)

which determines that one offspring is generated from two randomly selected candidate solutions.
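Both crossover types can be sketched as one-liners over paired genes (a minimal illustration in Python; the function names are ours):

```python
import random

def discrete_crossover(p1, p2):
    """Pick each gene from either parent with equal probability."""
    return [a if random.random() < 0.5 else b for a, b in zip(p1, p2)]

def arithmetic_crossover(p1, p2, alpha=0.5):
    """Weighted average of the parents, Eq. (14); alpha = 0.5 yields
    the uniform arithmetic crossover."""
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(p1, p2)]

random.seed(5)
child_d = discrete_crossover([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
child_a = arithmetic_crossover([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

In an ES the same recombination is applied to the strategy parameters as well, so sigmas (and angles) are inherited alongside the problem variables.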

2.5 Survivor Selection

In SA-ES, there are two survivor selections, i.e., the (μ, λ)-selection and the (μ + λ)-selection. In the former,

s(μ,λ): S^λ → S^μ, (16)

the μ best solutions are selected from the λ offspring, while in the latter,

s(μ+λ): S^(μ+λ) → S^μ, (17)

the μ best solutions are selected from the union of parents and offspring for the next generation. Although the (μ + λ)-selection preserves the best solutions in the current population, the (μ, λ)-selection is more recommended for self-adaptation because [36]:
- in dynamic environments, the (μ + λ)-selection preserves outdated solutions and prevents the current optimum from being tracked,
- the (μ, λ)-selection is able to forget good solutions and therefore is not sensitive to getting stuck in a local optimum,
- the (μ + λ)-selection enables mis-adapted strategy parameters to survive for a relatively large number of generations and thus hinders the self-adaptive mechanism, highlighting that both the problem variables as well as the strategy parameters need to improve during the evolutionary search process.
On the other hand, the selective pressure in SA-ES is very high because of the recommended ratio μ/λ ≈ 1/7, indicating that λ is much higher than μ [28].
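The two selection schemes of Eqs. (16)-(17) reduce to sorting by fitness over different pools, as this small sketch shows (names and the sphere objective are our own illustration):

```python
def comma_selection(offspring, fitness, mu):
    """(mu, lambda)-selection, Eq. (16): the mu best offspring survive;
    all parents are discarded, even if they were better."""
    return sorted(offspring, key=fitness)[:mu]

def plus_selection(parents, offspring, fitness, mu):
    """(mu + lambda)-selection, Eq. (17): the mu best out of the union
    of parents and offspring survive (elitist)."""
    return sorted(parents + offspring, key=fitness)[:mu]

sphere = lambda x: sum(v * v for v in x)   # minimization objective
parents = [[2.0], [3.0]]
offspring = [[1.0], [4.0], [5.0]]
survivors_comma = comma_selection(offspring, sphere, 2)
survivors_plus = plus_selection(parents, offspring, sphere, 2)
```

Note how the comma scheme keeps [4.0] while discarding the better parent [2.0]: exactly the "forgetting" behavior the bullet list above recommends for self-adaptation.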
2.6 Outline of SA-ES

In summary, when the components discussed in the previous topics are combined together, the outline of the original SA-ES is obtained, as presented in the pseudo-code of Algorithm 1.

Algorithm 1. Pseudo-code of the evolution strategy

1: t = 0;
2: initialize: P(0) = {a1(0), . . . , aλ(0)}, where ai(0) = {xi, σj, αk} for i = [1, n], j = [1, nσ], k = [1, nα];
3: evaluate: P(0): {f(x1(0)), . . . , f(xλ(0))};
4: while not terminate do
5:   crossover: a′l(t) = c(P(t)) for l = [1, λ];
6:   mutate: a″l(t) = m{τ,τ′,β}(a′l(t)) for l = [1, λ];
7:   evaluate: P″(t): {f(x″1(t)), . . . , f(x″λ(t))};
8:   selection: P(t+1) = ((μ, λ)-selection) ? s(μ,λ)(P″(t)) : s(μ+λ)(P″(t) ∪ P(t));
9:   t = t + 1;
10: end while

Note that this pseudo-code summarizes all strategies that self-adapt at least one strategy parameter. From an implementation point of view, it is important to find the so-named evolution window, where the evolutionary search can progress [36]. The window is connected with the proper choice of the order of magnitude for the strategy parameter σ within which reasonable performance is observed. The proper identification of this parameter is sometimes connected with extensive experimental work.
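The outline of Algorithm 1 can be condensed into a compact, runnable (μ, λ)-ES with one self-adapted step size per individual. The following is a minimal sketch under our own assumptions (sphere objective, crossover omitted for brevity, parameter values chosen for illustration), not the authors' implementation:

```python
import math
import random

def sa_es_one_step_size(f, n, mu=5, lam=35, max_evals=5000, seed=0):
    """Compact (mu, lambda)-ES following Algorithm 1: mutate lambda
    offspring from randomly chosen parents, then keep the mu best."""
    rng = random.Random(seed)
    tau, eps0 = 1.0 / math.sqrt(n), 1e-5
    # each individual is (x, sigma); sigma starts at 1.0
    pop = [([rng.uniform(-5.0, 5.0) for _ in range(n)], 1.0) for _ in range(mu)]
    evals, best = 0, float("inf")
    while evals < max_evals:
        offspring = []
        for _ in range(lam):
            x, sigma = rng.choice(pop)                         # parent choice
            s = max(sigma * math.exp(tau * rng.gauss(0, 1)), eps0)  # Eq. (5), (7)
            y = [xi + s * rng.gauss(0, 1) for xi in x]         # Eq. (6)
            offspring.append((y, s))
            evals += 1
        offspring.sort(key=lambda ind: f(ind[0]))              # (mu, lambda)-selection
        pop = offspring[:mu]
        best = min(best, f(pop[0][0]))
    return best

sphere = lambda x: sum(v * v for v in x)
best = sa_es_one_step_size(sphere, n=5)
```

Because the scheme is non-elitist, the best-so-far value is tracked outside the population, mirroring how the experiments below report the Best measure.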

3 The Uncorrelated Mutation with n 4D-Vectors

The phenomenon of self-adaptation in ES is realized by mutation operators based on the normal distribution. The normal (also Gaussian) distribution describes a family of continuous probability distributions having the same general shape, but differing in their location (also mean or average) and scale (also standard deviation) parameters [38]. The distribution can be symbolically described as N(μ, σ), where μ denotes shifting the mean value by a specific positive or negative value, and σ the corresponding standard deviation.
A typical mutation operator is based on the normal distribution N(0, σ) that has the following characteristics [36]:
- the mean value of the mutation steps is zero,
- adding a specific value of the mutation step to a problem variable xi occurs with the same probability as the subtraction of the same value,
- smaller changes of a problem variable xi occur more frequently than larger ones,
- the mutation step sizes decrease over the generations.
In summary, the larger mutation step sizes at the beginning of the evolutionary search enable exploring wide regions of the search space. When the evolutionary search matures and the promising regions are already located, the evolutionary search can direct itself to exploring those regions.
The motivation guiding our experimental work was to elaborate a modified mutation operator on the basis of self-adapting the parameters using the normal distribution. The aim of this experimentation was to develop candidate solutions independently of each other throughout the search space. In line with this, it is expected that the exploration power of the evolutionary search process is increased using this operator, especially for problems where the problem variables are uncorrelated. In these cases, the search process, armed with this operator, can be much better adapted to the fitness landscape determined by the problem to be solved.
The candidate solution is now represented as

⟨x1, x2, . . . , xn⟩, (18)

where each element is a 4D-vector of the form xi(t) = ⟨xi(t), σi(t), αi(t), ri(t)⟩, in which xi(t) denotes the problem variable, σi(t) the mutation strength, αi(t) the shift angle, and ri(t) a boolean variable denoting a reflection, which reverses the sign of the change when the value of this variable is set to true. The meanings of the other variables are similar to the correlated mutation.
The modified mutation mechanism is realized by the following equations:

ri(t+1) = rε(ri(t)), (19)

αi(t+1) = αi(t) + β · Ni(0, 1), (20)

σi(t+1) = σi(t) · exp(τ′ · N(0, 1) + τ · Ni(0, 1)), (21)

xi(t+1) = xi(t) + ri(t+1) · N(αi(t+1), σi(t+1)), (22)

where N(0, 1) is a normally distributed random number, β the maximum shift angle, rε the reverse sign function, and τ and τ′ the learning rates. Note that the location μ of the mean value of the corresponding normal distribution N(μ, σ) is denoted as a shift angle because this shift is normally small, and thus the approximation sin(α) ≈ α is valid. A typical shift value is therefore 5° (i.e., 5π/180 ≈ 0.0873), as proposed by [20].
The reverse sign function rε is defined as follows:

rε(ri(t)) = −ri(t) if U(0, 1) < ε, ri(t) otherwise, (23)

where U(0, 1) is a uniformly distributed random value drawn from the interval [0, 1] and ε ∈ [0, 1] is some prescribed constant value determining the probability
of reversing the sign (also a learning rate). The effects of this mutation mechanism in two dimensions are illustrated in Fig. 1.

Fig. 1. Uncorrelated mutation with n step size vectors
The uncorrelated mutation with n step sizes uses the normal distribution N(0, σ) (a bell curve with location zero). In contrast, the proposed mutation supports the general form of the normal distribution N(α, σ), i.e., it shifts the location of the normal distribution N(0, σ) by a shift angle α (a bell curve with location α). When α > 0, positive changes are preferred; when α < 0, negative ones. The standard deviation σ in the figure marks the inflection point, where the concavity changes its sign from plus to minus.
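The whole 4D-vector mechanism of Eqs. (19)-(23) can be sketched as follows; the Python naming is ours, and the reverse-sign probability of 0.25 is taken from the tuning reported later in the chapter for n = 10:

```python
import math
import random

BETA = 5 * math.pi / 180     # maximum shift angle, 5 degrees ~ 0.0873
EPS_REVERSE = 0.25           # probability epsilon of reversing the sign
EPSILON_0 = 1e-5             # lower bound on sigma, Eq. (7)

def mutate_4d(elements, tau, tau_prime):
    """Uncorrelated mutation with n 4D-vectors, Eqs. (19)-(23); each
    element is a tuple (x_i, sigma_i, alpha_i, r_i)."""
    g = random.gauss(0.0, 1.0)                        # shared global draw
    out = []
    for x, sigma, alpha, r in elements:
        if random.random() < EPS_REVERSE:             # reverse sign, Eqs. (19)/(23)
            r = -r
        alpha += BETA * random.gauss(0.0, 1.0)        # shift angle, Eq. (20)
        sigma = max(sigma * math.exp(tau_prime * g + tau * random.gauss(0.0, 1.0)),
                    EPSILON_0)                        # step size, Eq. (21)
        x += r * random.gauss(alpha, sigma)           # shifted normal, Eq. (22)
        out.append((x, sigma, alpha, r))
    return out

random.seed(11)
n = 4
tau = 1.0 / math.sqrt(2.0 * math.sqrt(n))
tau_prime = 1.0 / math.sqrt(2.0 * n)
elems = [(0.0, 0.01, 0.0, 1.0)] * n
mutated = mutate_4d(elems, tau, tau_prime)
```

Each coordinate evolves its own shifted, possibly sign-reversed Gaussian, which is what makes the operator uncorrelated despite carrying direction information.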

4 Experiments and Results

Our experimental work was guided by three aims: to identify the characteristics of the mutation operator with n 4D-vectors, to show that using this mutation operator can outperform the ES using the existing uncorrelated mutation operators, i.e., with one step size and with n step sizes, and to show that the results of the ES with the newly developed operator are comparable with the results obtained by other well-known algorithms. During the tests, the ES algorithm using the uncorrelated mutation with one step size was denoted as ES-1, the ES algorithm using the uncorrelated mutation with n step sizes as ES-2, and the ES algorithm using the uncorrelated mutation with n 4D-vectors as ES-3, respectively. The algorithms were applied for solving a test suite consisting of 10 well-known functions taken from the literature.
The function optimization problem is defined as follows. Let us assume an objective function f(x) is given, where x = (x1, . . . , xn) is a vector of n design variables in the decision space S. The decision variables xi ∈ [xlb, xub] are limited by their lower bound xlb ∈ R and upper bound xub ∈ R that determine the potential domain of their values. The task of the optimization is to find the minimum of the objective function.
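This problem statement can be made concrete with a small sketch: one benchmark objective (the Rastrigin function, f2 in the suite below) together with a helper that keeps a candidate inside [xlb, xub]^n. The names and the clamping strategy are ours, for illustration only:

```python
import math

def rastrigin(x):
    """Rastrigin function: n*10 + sum(x_i^2 - 10*cos(2*pi*x_i));
    global minimum f* = 0 at x* = (0, ..., 0)."""
    n = len(x)
    return 10.0 * n + sum(v * v - 10.0 * math.cos(2.0 * math.pi * v) for v in x)

def clamp(x, lb, ub):
    """Repair a candidate that left the decision space [lb, ub]^n."""
    return [min(max(v, lb), ub) for v in x]

value_at_optimum = rastrigin([0.0] * 30)
clamped = clamp([20.0, -20.0], -15.0, 15.0)
```

Widening [xlb, xub], as done below, enlarges this decision space and therefore makes the same function harder to optimize with a fixed evaluation budget.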
The parameter settings used during the experiments are presented in Table 1, where the following parameters, denoted by the column Parameter, are included: the mutation and crossover type, the population model, the number of fitness function evaluations to the solution (FEs), the probability of mutation (pm), the probability of crossover (pc), the starting value of the mutation strengths (σ0), and the minimum value of the mutation strengths (ε0), for each corresponding ES version.

Table 1. Parameter setting of the ES algorithms

Parameter ES-1 ES-2 ES-3
mutation uncorrelated, one step size uncorrelated, n step sizes uncorrelated, n 4D-vectors
crossover discrete discrete discrete
population model (50, 350) (50, 350) (50, 350)
FEs 1,000 · n 1,000 · n 1,000 · n
pm 1.0 1.0 1.0
pc 0.8 0.8 0.8
σ0 0.01 0.01 0.001
ε0 0.00001 0.00001 0.00001

From the table, it can be seen that the same values are used by almost all the algorithms, except for the mutation type and the starting value of the mutation strength. The learning rates were similar in all the mentioned ES versions, as follows: τ = 1/√(2√n) and τ′ = 1/√(2n). Setting the value of the learning rate ε is discussed later in this paper.
For each ES algorithm, 25 independent runs were conducted, while the minimum, maximum, average, median, and standard deviation values were accumulated. As a part of the experimental work, the following five tests were conducted:
- an impact of the probability of a reverse sign,
- an impact of the number of evaluations,
- an impact of the dimensionality of problems,
- convergence graphs, and
- a comparative study.

In the remainder of this chapter, the test suite is described first. Then, the results of the mentioned five tests are presented. The section concludes with a discussion of the performed work.

4.1 Benchmark Suite

The benchmark suite was composed of ten well-known functions, selected from various publications. The definitions of the benchmark functions are summarized in Table 2, where the function name and its corresponding definition can be seen. However, the reader is invited to check the deeper details about the test functions in the state-of-the-art reviews [21,22,23].
Each function in the table is tagged with its sequence number from f1 to f10. The properties of the benchmark functions can be seen in Table 3, consisting of five
Table 2. Definitions of benchmark functions

Function Definition
Griewangk: f1(x) = Σ_{i=1..n} xi²/4000 − Π_{i=1..n} cos(xi/√i) + 1
Rastrigin: f2(x) = n · 10 + Σ_{i=1..n} (xi² − 10 cos(2π xi))
Rosenbrock: f3(x) = Σ_{i=1..n−1} 100 (x_{i+1} − xi²)² + (xi − 1)²
Ackley: f4(x) = Σ_{i=1..n−1} (20 + e − 20 e^{−0.2 √(0.5 (x_{i+1}² + xi²))} − e^{0.5 (cos(2π x_{i+1}) + cos(2π xi))})
Schwefel: f5(x) = 418.9829 · D − Σ_{i=1..D} si sin(√|si|)
De Jong: f6(x) = Σ_{i=1..D} xi²
Easom: f7(x) = −(−1)^D (Π_{i=1..D} cos²(xi)) exp(−Σ_{i=1..D} (xi − π)²)
Michalewicz: f8(x) = −Σ_{i=1..D} sin(xi) (sin(i xi²/π))^{2·10}
Yang: f9(x) = (Σ_{i=1..D} |xi|) exp(−Σ_{i=1..D} sin(xi²))
Zakharov: f10(x) = Σ_{i=1..D} xi² + (½ Σ_{i=1..D} i xi)² + (½ Σ_{i=1..D} i xi)⁴

Table 3. Properties of benchmark functions

f f* x* Domain Characteristics
f1 0 (0, 0, . . . , 0) [−600, 600] Highly multi-modal
f2 0 (0, 0, . . . , 0) [−15, 15] Highly multi-modal
f3 0 (1, 1, . . . , 1) [−15, 15] Several local optima
f4 0 (0, 0, . . . , 0) [−32.768, 32.768] Highly multi-modal
f5 0 (0, 0, . . . , 0) [−500, 500] Highly multi-modal
f6 0 (0, 0, . . . , 0) [−600, 600] Uni-modal, convex
f7 −1 (π, π, . . . , π) [−2π, 2π] Several local optima
f8 −1.80131 (2.20319, 1.57049)¹ [0, π] Several local optima
f9 0 (0, 0, . . . , 0) [−2π, 2π] Several local optima
f10 0 (0, 0, . . . , 0) [−5, 10] Uni-modal

columns: the function tag f, the value of the optimal solution f*, the optimal solution x*, the parameter domains, and the function characteristics.
The parameter domains limit the values of the parameters to the interval between their lower and upper bounds. As a matter of fact, these determine the size of the search space. In order to make the problems harder to solve, the parameter domains were selected wider than those prescribed in the standard publications. Additionally, a problem also becomes harder to solve when the dimensionality of the benchmark function is increased. As a result, benchmark functions of more dimensions need to be optimized in the experimental work.
One of the more important characteristics of a function is the number of local and global optima. According to this characteristic, the functions are divided into either uni-modal or multi-modal. The former type of functions has only one global optimum, while the latter can have more local and global optima scattered across the whole search space.

¹ Valid for the 2-dimensional parameter space.
In general, functions of different dimensions can be observed. However, the functions of dimensions n = 10, n = 30, and n = 50 were taken into account in our tests.

4.2 An Impact of the Probability of a Reverse Sign

The aim of this experiment was to discover how the probability of a reverse sign affects the results of the ES-3 algorithm. In line with this, this parameter was varied in the interval [0.0, 0.5] in steps of 0.05. As a result, eleven instances were obtained for each of the three dimensions, i.e., n = 10, n = 30, and n = 50. The results of each instance according to five statistical measures (i.e., minimum, maximum, average, median, and standard deviation) were accumulated after 25 runs for each function and aggregated into statistical classifiers consisting of 10 × 5 = 50 variables to serve as input to the Friedman statistical tests. The Friedman tests [25] calculate the average rank for each of the test instances, while the results of these tests are illustrated in Fig. 2.

(a) Dimension n = 10 (b) Dimension n = 30 (c) Dimension n = 50

Fig. 2. Results of varying the probability of a reverse sign

The figure is divided into three diagrams according to the observed dimensions, where the differences in ranks between the ES-3 without the reverse sign feature and the instances of the ES-3 with the corresponding probability of the reverse sign are presented. As a result, each bar higher than zero indicates that the corresponding ES-3 using the reverse sign feature outperforms the ES-3 algorithm without this feature. The following conclusions can be drawn from Fig. 2:
- A reverse sign has an impact on the results of the ES-3 algorithm, because more than half of the instances are improved by considering this parameter. For example, eight instances of the ES-3 using the reverse sign feature outperformed the results obtained by the ES-3 without this feature for dimension n = 10, while six instances were better for dimension n = 30 and five for dimension n = 50.
- This parameter depends on the dimensionality of the problem. The lower the dimension, the higher the influence of this parameter. Finally, the results vary only a little at higher dimensions.
- The optimal value of this parameter is ε = 0.25 for n = 10 and n = 50, and ε = 0.40 for n = 30.
In summary, the probability of a reverse sign has an influence on the results of the optimization. However, the optimal value of this parameter depends on the dimensionality of the observed benchmark functions. In the following tests, the optimal values, as found during this experiment, were employed by the ES-3 algorithm, i.e., ε = 0.25 for n = 10 and n = 50, and ε = 0.40 for n = 30.

Table 4. Detailed results (n = 30)

Meas.   Evals      f1         f2         f3         f4         f5
Best    1.20E+03   3.01E+002  3.94E+003  7.23E+007  1.97E+001  6.49E+003
Best    6.00E+03   1.66E+001  4.47E+002  5.53E+005  9.31E+000  9.30E+002
Best    3.00E+04   8.15E+000  1.74E+002  7.55E+002  2.84E+000  2.40E+002
Worst   1.20E+03   5.77E+002  5.47E+003  2.74E+008  2.07E+001  9.59E+003
Worst   6.00E+03   4.55E+001  6.43E+002  4.75E+006  1.24E+001  2.37E+003
Worst   3.00E+04   3.02E+001  2.52E+002  1.97E+003  3.40E+000  1.23E+003
Mean    1.20E+03   4.21E+002  4.74E+003  1.62E+008  2.03E+001  8.08E+003
Mean    6.00E+03   2.81E+001  5.31E+002  1.32E+006  1.08E+001  1.83E+003
Mean    3.00E+04   1.64E+001  2.23E+002  1.35E+003  3.24E+000  8.04E+002
Median  1.20E+03   4.27E+002  4.77E+003  1.55E+008  2.03E+001  8.25E+003
Median  6.00E+03   2.24E+001  5.16E+002  9.91E+005  1.09E+001  1.88E+003
Median  3.00E+04   1.22E+001  2.19E+002  1.21E+003  3.28E+000  8.27E+002
StDev   1.20E+03   8.37E+001  4.48E+002  6.67E+007  3.33E-001  1.04E+003
StDev   6.00E+03   1.07E+001  5.26E+001  1.22E+006  1.08E+000  4.09E+002
StDev   3.00E+04   8.46E+000  2.42E+001  4.10E+002  1.74E-001  3.09E+002

Meas.   Evals      f6         f7         f8         f9         f10
Best    1.20E+03   1.22E+006  0.00E+000  8.79E-001  5.15E-004  4.00E+002
Best    6.00E+03   6.10E+004  0.00E+000  3.82E-001  3.34E-007  2.33E+002
Best    3.00E+04   2.46E+004  0.00E+000  6.95E-003  4.63E-009  3.98E+001
Worst   1.20E+03   1.86E+006  0.00E+000  3.00E+000  5.71E-001  1.34E+004
Worst   6.00E+03   1.39E+005  0.00E+000  1.15E+000  3.37E-005  6.13E+002
Worst   3.00E+04   7.49E+004  0.00E+000  2.88E-001  1.55E-007  8.41E+001
Mean    1.20E+03   1.57E+006  0.00E+000  1.83E+000  1.23E-001  2.85E+003
Mean    6.00E+03   9.91E+004  0.00E+000  6.99E-001  1.71E-005  3.42E+002
Mean    3.00E+04   5.21E+004  0.00E+000  2.08E-001  4.16E-008  5.82E+001
Median  1.20E+03   1.62E+006  0.00E+000  1.67E+000  5.30E-003  1.07E+003
Median  6.00E+03   9.46E+004  0.00E+000  5.27E-001  1.89E-005  2.87E+002
Median  3.00E+04   5.13E+004  0.00E+000  2.04E-001  3.20E-008  5.00E+001
StDev   1.20E+03   2.11E+005  0.00E+000  7.10E-001  2.07E-001  3.94E+003
StDev   6.00E+03   2.32E+004  0.00E+000  3.13E-001  1.27E-005  1.31E+002
StDev   3.00E+04   1.61E+004  0.00E+000  8.54E-002  4.20E-008  1.61E+001

4.3 An Impact of the Number of the Fitness Function Evaluations


The goal of this experiment was to show how the results according to five statistical
measures, i.e., the minimum, maximum, average, median and standard deviation,
depend on the number of fitness function evaluations. In line with this, the results
according to the observed measures were tracked at three different stages, i.e.,
after 1/25 and 1/5 of the fitness function evaluations and at the end of the run.
The results of this test for the benchmark functions of dimension n = 30 are presented
in Table 4, where the minimum values are presented in the rows Best, the maximum
in the rows Worst, the average in the rows Mean, the median in the rows Median,
and the standard deviation in the rows StDev.

On the Mutation Operators in Evolution Strategies 83
From the table, it can be seen that the ES-3 algorithm is not elitist, because
it does not preserve the best results. For instance, the mean values of most of the
observed functions decrease when the number of fitness function evaluations
increases. However, this is a characteristic of the (μ, λ) strategy used, which tries
to facilitate the extinction of ill-adapted solutions [36]. On the other hand, this
behavior of the mean values indicates that the number of fitness function
evaluations needed by the ES-3 algorithm was underestimated during the test.
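The bookkeeping behind Table 4 can be sketched in stdlib Python. Everything below is illustrative rather than the authors' code: the `sphere` objective stands in for the benchmark suite, plain random sampling stands in for ES-3, and the budget is scaled down from 30,000 to 3,000 evaluations for speed.

```python
import random
import statistics

def sphere(x):
    # placeholder objective (an assumption, not one of the paper's f1-f10)
    return sum(xi * xi for xi in x)

def run_once(f, n, max_fe, checkpoints):
    """Random search standing in for ES-3; records the best-so-far
    fitness whenever the evaluation counter hits a checkpoint."""
    best = float("inf")
    snapshots = {}
    for fe in range(1, max_fe + 1):
        x = [random.uniform(-5, 5) for _ in range(n)]
        best = min(best, f(x))
        if fe in checkpoints:
            snapshots[fe] = best
    return snapshots

random.seed(1)
n, max_fe = 30, 3_000
# 1/25 and 1/5 of the budget, plus the end of the run
checkpoints = {max_fe // 25, max_fe // 5, max_fe}   # {120, 600, 3000}
runs = [run_once(sphere, n, max_fe, checkpoints) for _ in range(25)]

for fe in sorted(checkpoints):
    vals = [r[fe] for r in runs]
    print(fe, min(vals), max(vals), statistics.mean(vals),
          statistics.median(vals), statistics.stdev(vals))
```

Each printed row corresponds to one Evals block of Table 4: Best, Worst, Mean, Median and StDev of the best-so-far fitness over 25 independent runs at that stage.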

4.4 An Impact of the Dimensionality of Problems


This experiment was guided by the question of how the dimensionality of the bench-
mark functions influences the results of the tests. In line with this, the results obtained
by optimizing benchmark functions of various dimensions, i.e., n = 10,
n = 30, and n = 50, were accumulated in Table 5. Note that the number of
fitness function evaluations was limited to 1,000 · n during the experiments. In
other words, 10,000 fitness function evaluations were used for n = 10, 30,000
for n = 30 and 50,000 for n = 50.
In summary, the results regarding the five statistical measures obtained by
optimizing the benchmark functions at the lower dimensions were better than the
results obtained at the higher dimensions. As a result, the benchmark functions
of dimension n = 50 were the hardest for the ES-3 algorithm to solve.

4.5 Convergence Graphs


In order to show how the results of optimizing the various benchmark functions
converge to their final values, convergence plots were drawn for the functions f2, f3, f4, f6,
f8 and f10 of dimensions n = 10, n = 30 and n = 50. They are illus-
trated in Fig. 3, which is divided into six diagrams, two for each dimension.
Each diagram was obtained over 25 independent runs and consists of three
graphs representing the convergence of the best, worst and average values.
While the graphs of the benchmark functions with dimensions n = 10 and n = 30
are smooth and descend steeply at the beginning, the graphs of the functions with
the higher dimension n = 50 are ridged, where the location of the optimum value
changes stepwise. Interestingly, the best values progress with small delays
connected with exploring new regions of the fitness landscape.
84 I. Fister Jr. and I. Fister

Table 5. The results according to various dimensions

D Meas. f1 f2 f3 f4 f5
Best 1.22E+000 3.90E+001 3.39E+001 1.69E+000 9.62E+001
Worst 7.11E+000 7.35E+001 8.34E+002 2.46E+000 4.09E+002
10 Mean 4.03E+000 5.52E+001 3.21E+002 2.12E+000 2.41E+002
Median 3.85E+000 5.60E+001 2.07E+002 2.09E+000 2.37E+002
StDev 1.73E+000 1.09E+001 2.89E+002 2.47E-001 1.01E+002
Best 8.15E+000 1.74E+002 7.55E+002 2.84E+000 2.40E+002
Worst 3.02E+001 2.52E+002 1.97E+003 3.40E+000 1.23E+003
30 Mean 1.64E+001 2.23E+002 1.35E+003 3.24E+000 8.04E+002
Median 1.22E+001 2.19E+002 1.21E+003 3.28E+000 8.27E+002
StDev 8.46E+000 2.42E+001 4.10E+002 1.74E-001 3.09E+002
Best 1.50E+001 3.81E+002 1.77E+003 3.51E+000 1.53E+003
Worst 4.13E+001 4.65E+002 6.21E+003 3.82E+000 2.75E+003
50 Mean 2.46E+001 4.19E+002 3.12E+003 3.69E+000 2.10E+003
Median 2.27E+001 4.10E+002 2.57E+003 3.70E+000 1.87E+003
StDev 7.44E+000 2.67E+001 1.35E+003 1.02E-001 4.58E+002
D Meas. f6 f7 f8 f9 f10
Best 3.79E+003 0.00E+000 0.00E+000 1.46E-003 1.42E+000
Worst 1.94E+004 0.00E+000 4.92E-016 2.94E-003 3.00E+000
10 Mean 9.22E+003 0.00E+000 4.95E-017 2.15E-003 2.07E+000
Median 7.79E+003 0.00E+000 1.62E-025 1.87E-003 1.81E+000
StDev 4.32E+003 0.00E+000 1.55E-016 5.53E-004 5.79E-001
Best 2.46E+004 0.00E+000 6.95E-003 4.63E-009 3.98E+001
Worst 7.49E+004 0.00E+000 2.88E-001 1.55E-007 8.41E+001
30 Mean 5.21E+004 0.00E+000 2.08E-001 4.16E-008 5.82E+001
Median 5.13E+004 0.00E+000 2.04E-001 3.20E-008 5.00E+001
StDev 1.61E+004 0.00E+000 8.54E-002 4.20E-008 1.61E+001
Best 8.32E+004 0.00E+000 6.97E-001 1.55E-012 2.68E+002
Worst 1.57E+005 0.00E+000 1.50E+000 3.91E-011 4.39E+002
50 Mean 1.15E+005 0.00E+000 9.48E-001 1.19E-011 3.36E+002
Median 9.76E+004 0.00E+000 8.36E-001 4.39E-012 3.16E+002
StDev 3.11E+004 0.00E+000 2.56E-001 1.29E-011 5.68E+001

4.6 A Comparative Study


The intention of this experiment was twofold: firstly, to show that the ES-3 al-
gorithm using the modified mutation operator with n 4D-vectors can improve on
the results of the original ES-1 and ES-2 algorithms when optimizing a suite of ten
benchmark functions; and secondly, to show how good the results obtained us-
ing the ES-3 are when compared with other well-known algorithms, like DE [13],
jDE [14] and HSABA [18]. In line with this, a similar experimental setup was
designed for each algorithm run on the function test suite, and the analysis of
the results was made in the sense of both pursued objectives. The analysis was
substantiated using Friedman statistical tests for evaluating the obtained re-
sults. The results obtained by optimizing the benchmark functions of dimension
n = 50 are presented in Table 6.

Fig. 3. Convergence graphs. Six diagrams plot the best, worst and average values of
the fitness function against the calculation progress [%]: (a) f2 of dimension n = 10,
(b) f3 of dimension n = 10, (c) f4 of dimension n = 30, (d) f6 of dimension n = 30,
(e) f8 of dimension n = 50, (f) f10 of dimension n = 50.

As can be seen from Table 6, the jDE and HSABA algorithms each reached the
best results four times. The former achieved the best results by optimizing the
functions f1, f4, f6 and f8, while the latter did so for the functions f2, f3, f5
and f9. The original DE algorithm outperformed the other algorithms
in the test by optimizing the function f10, while on the function f7 all
algorithms achieved similar results.
In order to evaluate these results statistically, Friedman non-parametric
tests were also performed according to the five measurements (minimum, maximum,
mean, median and standard deviation) obtained over 25 runs for each function.
Three Friedman non-parametric tests were conducted in order to capture the
behavior of the different algorithms according to the dimensions of the problems.

Table 6. Obtained results of algorithms (n = 50)

Fun  Meas.  ES-1       ES-2       ES-3       DE         jDE        HSABA
f1   Mean   4.64E+001  6.20E+001  2.46E+001  1.02E+000  1.32E-003  1.70E-001
f1   StDev  1.06E+001  1.45E+001  7.44E+000  5.16E-002  7.91E-004  2.29E-001
f2   Mean   1.09E+003  6.93E+002  4.19E+002  4.17E+002  1.08E+002  9.88E+001
f2   StDev  1.78E+002  1.26E+002  2.67E+001  1.71E+001  1.31E+001  1.27E+002
f3   Mean   4.74E+006  2.94E+006  3.12E+003  5.07E+002  1.31E+002  7.07E+001
f3   StDev  1.61E+006  1.66E+006  1.35E+003  2.47E+002  7.75E+001  1.89E+002
f4   Mean   1.20E+001  1.10E+001  3.69E+000  1.13E+000  6.57E-003  1.11E+001
f4   StDev  7.53E-001  8.15E-001  1.02E-001  2.63E-001  1.58E-003  8.59E+000
f5   Mean   4.23E+003  3.84E+003  2.10E+003  1.40E+004  5.14E+003  9.00E+002
f5   StDev  5.20E+002  7.23E+002  4.58E+002  3.42E+002  5.87E+002  2.12E+003
f6   Mean   2.48E+005  2.44E+005  1.15E+005  1.57E+002  3.69E-002  5.59E-001
f6   StDev  5.37E+004  5.57E+004  3.11E+004  5.65E+001  2.16E-002  1.97E+000
f7   Mean   0.00E+000  0.00E+000  0.00E+000  0.00E+000  0.00E+000  0.00E+000
f7   StDev  0.00E+000  0.00E+000  0.00E+000  0.00E+000  0.00E+000  0.00E+000
f8   Mean   1.03E-004  1.01E-004  9.48E-001  -1.47E+001 -3.40E+001 -1.95E+001
f8   StDev  9.70E-005  1.49E-004  2.56E-001  6.71E-001  1.17E+000  3.25E+000
f9   Mean   3.30E-019  2.98E-020  1.19E-011  1.36E-019  6.89E-020  2.44E-020
f9   StDev  1.40E-019  4.35E-021  1.29E-011  6.90E-021  6.70E-021  1.99E-020
f10  Mean   4.18E+002  3.90E+002  3.36E+002  1.84E+002  3.50E+002  2.34E+002
f10  StDev  8.02E+001  8.42E+001  5.68E+001  3.00E+001  8.61E+001  2.04E+002

The Friedman test [26,27] compares the average ranks of the algorithms. The null
hypothesis states that two algorithms are equivalent and, therefore, their ranks
should be equal. If the null hypothesis is rejected, i.e., the performance of the
algorithms is statistically different, the Bonferroni-Dunn test [25] is performed
to calculate the critical difference between the average ranks of those two algo-
rithms. When the statistical difference is higher than the critical difference, the
algorithms are significantly different. The equation for the calculation of the critical
difference can be found in [25].
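The rank computation and the Bonferroni-Dunn critical difference can be sketched as follows. The data matrix is hypothetical (not taken from the tables above), and the critical value q_0.05 = 2.394 for four algorithms is only indicative; the exact formula and the table of critical values are in [25].

```python
import math

def average_ranks(results):
    """results[f][a]: measurement of algorithm a on function f (lower is
    better). Returns the average rank of each algorithm; tied values
    share the mean of their rank positions."""
    n_funcs = len(results)
    n_algs = len(results[0])
    ranks = [0.0] * n_algs
    for row in results:
        order = sorted(range(n_algs), key=lambda a: row[a])
        pos = 0
        while pos < n_algs:
            # equal values are contiguous in sorted order
            tied = [a for a in order[pos:] if row[a] == row[order[pos]]]
            mean_rank = pos + (len(tied) + 1) / 2.0
            for a in tied:
                ranks[a] += mean_rank
            pos += len(tied)
    return [r / n_funcs for r in ranks]

def critical_difference(n_algs, n_funcs, q_alpha):
    # Bonferroni-Dunn critical difference (see [25]):
    # CD = q_alpha * sqrt(k * (k + 1) / (6 * N))
    return q_alpha * math.sqrt(n_algs * (n_algs + 1) / (6.0 * n_funcs))

# hypothetical mean values: 3 functions x 4 algorithms
data = [[4.6, 1.0, 0.1, 0.2],
        [1090.0, 417.0, 108.0, 98.8],
        [4.7e6, 507.0, 131.0, 70.7]]
print(average_ranks(data))               # [4.0, 3.0, 1.666..., 1.333...]
print(critical_difference(4, 3, 2.394))  # q_alpha for k = 4, indicative value
```

Two algorithms whose average ranks differ by more than the critical difference are declared significantly different, which is exactly what the non-overlapping intervals in Fig. 4 visualize.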
The Friedman tests were performed at the significance level 0.05. The results
of the Friedman non-parametric test are presented in Fig. 4, divided into
three diagrams showing the normalized ranks and confidence intervals (critical
differences) for the algorithms under consideration. The closer the normalized rank
is to one, the better the algorithm. The diagrams are organized according to
the dimensions of the functions. Two algorithms are significantly different if their
intervals in Fig. 4 do not overlap.
The first diagram in Fig. 4 shows that the HSABA algorithm outperforms the
other algorithms in the test. Moreover, the results of HSABA are significantly better
than the results of any other algorithm. Furthermore, the results of jDE, DE and
ES-3 are significantly better than the results of the original ES-1.

Fig. 4. Results of the Friedman non-parametric test. Three diagrams show the average
ranks and confidence intervals of the ES-1, ES-2, ES-3, DE, jDE and HSABA algorithms
for D = 10, D = 30 and D = 50.

Interestingly, the situation is different when taking the next two diagrams (for n = 30 and
n = 50) into consideration. In the first case, the HSABA and jDE significantly
improve on the results of the other ES algorithms in the test, while in the second
case, the jDE and HSABA outperform the results of ES-1 and ES-2 significantly.
Comparing the ES algorithms only, it can be seen that the ES-3 significantly
outperforms the ES-1, and substantially outperforms the ES-2, over the observed
dimensions.

5 Conclusion

The self-adaptation of strategy parameters is the most advanced mechanism in
ES, where the mutation strength strategy parameters control the magnitude and
the direction of the changes generated by the mutation operator. There are three
mutation operators in classical ES, as follows: the uncorrelated mutation with
one step size, the uncorrelated mutation with n step sizes, and the correlated
mutation using covariance matrix adaptation. This paper analyzed the char-
acteristics of these three mutation operators.
In addition, it proposed the modified mutation operator with n 4D-vectors that,
besides the mutation strengths, also embeds shift angle and reverse sign strategy
parameters into the representation of each solution element. The shift parameter de-
termines a small shift of the location of the normal distribution. That means, when the shift
angle is positive, more changes in the positive direction are preferred, and vice versa:
when the shift angle is negative, more changes in the negative direction are expected.
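One plausible reading of this shift mechanism is a location-shifted Gaussian mutation, sketched below. This is an illustration only: the exact ES-3 parameterization of the 4D-vectors is defined in the earlier sections of the paper, and `shifted_mutation` is a hypothetical helper, not the authors' operator.

```python
import math
import random

def shifted_mutation(x, sigma, phi, reverse):
    """Illustrative location-shifted Gaussian mutation. Each element
    carries a step size sigma_i, a shift angle phi_i that biases the
    location of the normal distribution, and a reverse-sign flag that
    flips the bias. A plausible sketch, not the exact ES-3 rule."""
    y = []
    for xi, si, ai, ri in zip(x, sigma, phi, reverse):
        shift = math.sin(ai) * si      # positive angle -> positive bias
        if ri:
            shift = -shift             # reverse-sign strategy parameter
        y.append(xi + random.gauss(shift, si))
    return y

random.seed(0)
# first element biased upward, second biased downward
child = shifted_mutation([0.0, 0.0], [0.1, 0.1],
                         [math.pi / 2, -math.pi / 2], [False, False])
```

Averaged over many samples, a positive shift angle produces offspring whose mean lies above the parent, matching the preference for changes in the positive direction described above.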
Extensive experimental work was conducted by solving the function opti-
mization of a benchmark suite consisting of ten well-known benchmark func-
tions taken from the literature. Analyzing the characteristics of the proposed
uncorrelated mutation operator showed that the ES using this operator out-
performed the classical ES using the other two uncorrelated mutation operators.
When comparing the results of the mentioned ES with
the other state-of-the-art algorithms, like DE, jDE and HSABA, the results of
the ES using the proposed uncorrelated mutation operator are comparable with
the results of the original DE and jDE on the benchmark functions of dimension
D = 10, while the results are slightly worse for the other observed
dimensions of the benchmark functions (D = 30 and D = 50). The results of the
HSABA were significantly better in all dimensions, except dimension D = 50.

Although the research domain of ES may seem already explored and, given
the daily flood of new nature-inspired algorithms, unattractive to
developers, the results of our experiments showed the opposite. Additionally,
hybridization adds new value to a classical self-adaptive ES. A first step
of our future work could be to extend this comparative study with an
ES using the CMA as well.

References
1. Back, T., Hammel, U., Schwefel, H.-P.: Evolutionary Computation: Comments on
the History and Current State. IEEE Trans. Evolutionary Computation 1(1), 3–17
(1997)
2. Beyer, H.-G., Deb, K.: On self-adaptive features in real-parameter evolutionary
algorithms. IEEE Trans. Evolutionary Computation 5(3), 250–270 (2001)
3. Eiben, A.E., Hinterding, R., Michalewicz, Z.: Parameter control in evolutionary
algorithms. Trans. Evol. Comp. 3(2), 124–141 (1999)
4. Hinterding, R., Michalewicz, Z., Eiben, A.E.: Adaptation in Evolutionary Com-
putation: A Survey. In: Proceedings of the Fourth International Conference on
Evolutionary Computation (ICEC 1997), pp. 65–69 (1997)
5. Shaefer, C.G., Grefenstette, J.J.: The ARGOT strategy: Adaptive representation
genetic optimizer technique. In: Proc. 2nd Int. Conf. Genetic Algorithms and Their
Applications, pp. 50–55 (1987)
6. Schaffer, J.D., Morishima, A., Grefenstette, J.J.: An adaptive crossover distribution
mechanism for genetic algorithms. In: Proc. 2nd Int. Conf. Genetic Algorithms and
Their Applications, pp. 36–40 (1987)
7. Spears, W.M.: Adapting crossover in evolutionary algorithms. In: McDonnell, J.R.,
Reynolds, R.G., Fogel, D.B. (eds.) Proc. 4th Annu. Conf. Evolutionary Programming,
pp. 367–384. MIT Press (1995)
8. Srinivas, M., Patnaik, L.M.: Adaptive probabilities of crossover and mutation in
genetic algorithms. IEEE Trans. Syst., Man, and Cybern. 24(4), 17–26 (1994)
9. Tuson, A., Ross, P.: Cost based operator rate adaptation: An investigation.
In: Voigt, H.-M., Ebeling, W., Rechenberg, I., Schwefel, H.-P. (eds.)
PPSN 1996. LNCS, vol. 1141, pp. 461–469. Springer, Heidelberg (1996)
10. Zhang, J., Chen, W.-N., Zhan, Z.-H., Yu, W.-J., Li, Y.-L., Chen, N., Zhou, Q.: A
survey on algorithm adaptation in evolutionary computation. Frontiers of Electrical
and Electronic Engineering 7(1), 16–31 (2012)
11. Auger, A., Hansen, N.: A restart CMA evolution strategy with increasing popu-
lation size. In: Proceedings of the 2005 Congress on Evolutionary Computation,
pp. 1769–1776 (2005)
12. Igel, C., Hansen, N., Roth, S.: Covariance matrix adaptation for multi-objective
optimization. Evolutionary Computation 15(1), 1–28 (2007)
13. Storn, R., Price, K.: Differential Evolution: A Simple and Efficient Heuristic
for Global Optimization over Continuous Spaces. Journal of Global Optimiza-
tion 11(4), 341–359 (1997)
14. Brest, J., Greiner, S., Boskovic, B., Mernik, M., Zumer, V.: Self-adapting control
parameters in differential evolution: A comparative study on numerical benchmark
problems. IEEE Transactions on Evolutionary Computation 10(6), 646–657 (2006)
15. Back, T.: An Overview of Parameter Control Methods by Self-Adaptation in Evo-
lutionary Algorithms. Fundam. Inf. 35(1-4), 51–66 (1998)
16. Fister, I., Mernik, M., Filipic, B.: Graph 3-coloring with a hybrid self-adaptive
evolutionary algorithm. Comp. Opt. and Appl. 54(3), 741–770 (2013)
17. Fister, I., Yang, X.-S., Brest, J., Fister Jr., I.: Modified firefly algorithm using
quaternion representation. Expert Systems with Applications 40(18), 7220–7230
(2013)
18. Fister, I., Fong, S., Brest, J., Fister Jr., I.: A novel hybrid self-adaptive bat algo-
rithm. The Scientific World Journal, 1–12 (2014)
19. Hansen, N., Ostermeier, A.: Completely derandomized self-adaptation in evolution
strategies. Evolutionary Computation 9(2), 159–195 (2001)
20. Schwefel, H.-P.: Collective intelligence in evolving systems. In: Wolff, W., Soeder,
C.J., Drepper, F. (eds.) Ecodynamics - Contributions to Theoretical Ecology,
pp. 95–100. Springer, Berlin (1987)
21. Jamil, M., Yang, X.-S.: A literature survey of benchmark functions for global opti-
misation problems. International Journal of Mathematical Modelling and Numer-
ical Optimisation 4(2), 150–194 (2013)
22. Yang, X.-S.: Appendix A: Test Problems in Optimization. In: Engineering Optimiza-
tion, pp. 261–266. John Wiley & Sons, Inc., Hoboken (2010)
23. Yang, X.-S.: Firefly algorithm, stochastic test functions and design optimisation.
International Journal of Bio-Inspired Computation 2(2), 78–84 (2010)
24. Hansen, N.: The CMA Evolution Strategy: A Comparing Review. In: Lozano, J.A.,
Larranaga, P., Inza, I., Bengoetxea, E. (eds.) Towards a New Evolutionary Compu-
tation. Advances on Estimation of Distribution Algorithms, pp. 75–102. Springer,
Berlin (2006)
25. Demsar, J.: Statistical Comparisons of Classifiers over Multiple Data Sets. Journal
of Machine Learning Research 7, 1–30 (2006)
26. Friedman, M.: The use of ranks to avoid the assumption of normality implicit in the
analysis of variance. Journal of the American Statistical Association 32, 675–701
(1937)
27. Friedman, M.: A comparison of alternative tests of significance for the problem of
m rankings. The Annals of Mathematical Statistics 11, 86–92 (1940)
28. Eiben, A., Smith, J.: Introduction to Evolutionary Computing. Springer, Berlin
(2003)
29. Wachter, A., Hoeber, H.: Compendium of Theoretical Physics. Springer, Berlin
(2006)
30. Russell, S., Norvig, P.: Artificial Intelligence: A Modern Approach, 3rd edn. Pren-
tice Hall, New Jersey (2009)
31. Rechenberg, I.: Evolutionsstrategie, Optimierung technischer Systeme nach
Prinzipien der biologischen Evolution. Frommann-Holzboog, Stuttgart (1973)
32. Schwefel, H.P.: Numerische Optimierung von Computer-Modellen mittels der Evo-
lutionsstrategie. Birkhauser, Basel (1977)
33. Hansen, N.: The CMA evolution strategy: A tutorial. Vu le 29 (2005)
34. Baeck, T., Fogel, D.B., Michalewicz, Z.: Handbook of Evolutionary Computation.
Taylor & Francis (1997)
35. Beyer, H.-G.: The Theory of Evolution Strategies. Springer, Heidelberg (2001)
36. Back, T.: Evolutionary Algorithms in Theory and Practice: Evolution Strategies,
Evolutionary Programming, Genetic Algorithms. Oxford University Press, Oxford
(1996)
37. Deb, K.: Multi-Objective Optimization Using Evolutionary Algorithms. John Wi-
ley & Sons, Inc., New York (2001)
38. Ahsanullah, M., Kibria, B.M.G., Shakil, M.: Normal and Student's t distributions
and their applications. Springer, Paris (2014)
Adaptation in Cooperative Coevolutionary
Optimization

Giuseppe A. Trunfio

DADU, University of Sassari, 07041 Alghero (SS), Italy

Abstract. Cooperative Coevolution (CC) is a typical divide-and-conquer
strategy for optimizing large scale problems with evolutionary algorithms.
In CC, the original search directions are grouped into a suitable number of
subcomponents. Then, different subpopulations are assigned to the sub-
components and evolved using an optimization metaheuristic. To evaluate
the fitness of individuals, the subpopulations cooperate by exchanging in-
formation. In this chapter we review some of the most relevant adaptive
techniques proposed in the literature to enhance the effectiveness of CC. In
addition, we present a preliminary version of a new adaptive CC algorithm
that addresses the problem of efficiently distributing the computational
effort between the different subcomponents.

Keywords: Cooperative coevolution, Problem decomposition, Evolu-
tionary optimization.

1 Introduction

The search for more effective and efficient optimization algorithms is an increas-
ingly important research topic, given the complexity of today's applications in
many fields of science and engineering. For this reason, in recent years a variety
of optimization metaheuristics have been developed, which have shown excellent
performance on many relevant real-world problems. However, most of these al-
gorithms are plagued by the so-called "curse of dimensionality", which consists
of a rapid deterioration of their optimization capability as the dimensionality of
the problem increases.
Among the different approaches proposed in the literature for dealing
with large-scale optimization problems is the Cooperative Coevolution-
ary (CC) strategy introduced in [1] and later developed by many researchers
[2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18]. In brief, the CC idea consists of de-
composing the original high-dimensional problem into a set of lower-dimensional
subproblems, which are easier to solve. Typically, to each subproblem is assigned
a subpopulation of candidate solutions, which is evolved according to the adopted
optimization metaheuristic. During the process, the only cooperation happens
in the evaluation of the fitness, through an exchange of information between
subpopulations.

Corresponding author.

© Springer International Publishing Switzerland 2015 91


I. Fister and I. Fister Jr. (eds.), Adaptation and Hybridization in Computational Intelligence,
Adaptation, Learning, and Optimization 18, DOI: 10.1007/978-3-319-14400-9_4
92 G.A. Trunfio

A large part of the research effort dedicated to the CC approach has focused on
the development of adaptive strategies to improve its effectiveness. In
fact, a CC optimization involves some problem-dependent design choices that
mainly relate to the decomposition issue and to the computational effort to be
attributed to the different subproblems.
After introducing the CC optimization approach, this chapter outlines some
of the main adaptive techniques presented in the literature to improve its effec-
tiveness and efficiency. In addition, the chapter proposes a preliminary version
of a new adaptive CC algorithm that addresses the problem of an efficient dis-
tribution of the computational effort between the different subproblems.

2 Cooperative Coevolution

The CC approach was first applied to a Genetic Algorithm by Potter and De Jong
in [1]. Subsequently, the idea has attracted a significant amount of research and
has been adopted in many search algorithms such as Ant Colony Optimization
[19,20], Particle Swarm Optimization (PSO) [21,5,22,15,14], Simulated Anneal-
ing [23,24], Differential Evolution (DE) [25,7], the Firefly Algorithm (FA) [26,27,18]
and many others. A CC optimization is based on partitioning the d-dimensional
set of search directions G = {1, 2, . . . , d} into k sets G1, . . . , Gk. Each group Gi
of directions defines a new search space S^(i) in which a standard optimization
algorithm is applied. In such an approach, the whole search procedure is then
decomposed into k subcomponents associated with different sub-spaces whose di-
mension can be significantly lower than d. For example, using a population-based
metaheuristic, a separate sub-population is assigned to each subcomponent gen-
erated by the groups Gi. By construction, a candidate solution in S^(i) contains
only some elements of the d-dimensional vector required for computing the cor-
responding fitness function f. For this reason, a common d-dimensional context
vector b is built using a representative individual (e.g. the best individual) pro-
vided by each subcomponent. Then, the candidate solutions are evaluated by
complementing them with the appropriate elements of the context vector. In
this framework, the cooperation between sub-populations emerges because the
common vector is used for the fitness evaluation of all individuals.
In their original paper, Potter and De Jong [1] proposed to decompose a
d-dimensional problem into d sub-populations (i.e. Gi = {i}). The fitness of
each individual was computed by evaluating the d-dimensional vector formed
by the individual itself and a selected member (e.g. the current best) from each
of the other sub-populations. The authors showed the effectiveness of the pro-
posed approach on several test functions, although the empirical evaluation was
conducted only on search spaces of up to 30 dimensions.
Later, Liu et al. [4] investigated the performance of the same cooperative
approach applied to an evolutionary programming [28] algorithm. The results
obtained on benchmark functions with 100 to 1000 dimensions were satisfactory,
and the authors showed that the CC approach can significantly improve the
scalability of the optimizer as the dimensionality of the problem increases.
Adaptation in Cooperative Coevolution 93

Fig. 1. Possible decomposition into subcomponents in the case of a population-based
optimization metaheuristic (d = 16 and k = 4). Each row corresponds to an individual. All
subcomponents operate on sub-populations with the same number ni of individuals.
The individuals used for contributing to the context vector are shaded.

Subsequently, the Potter and De Jong idea was applied to PSO [21] by
Van den Bergh and Engelbrecht in [5], where the authors introduced the
decomposition of the original d-dimensional search space into k subspaces S^(i)
of the same dimension d_k = d/k. In other words, in such an approach the groups
of dimensions were defined as:

G_i = {(i − 1) d_k + 1, . . . , i d_k}

and the context vector is:

b = (b_1^(1), . . . , b_dk^(1), b_1^(2), . . . , b_dk^(2), . . . , b_1^(k), . . . , b_dk^(k))^T

Algorithm 1. CC(f, n)
    G = {G1, . . . , Gk} ← grouping(n)
    pop ← initPopulation()
    contextVector ← initContextVector(pop)
    fitnessEvaluations ← 0
    while fitnessEvaluations < maxFE do
        foreach Gi ∈ G do
            popi ← extractPopulation(pop, Gi)
            besti ← optimizer(f, popi, contextVector, Gi, maxFESC)
            pop ← storePopulation(popi, Gi)
            fitnessEvaluations ← fitnessEvaluations + maxFESC
            contextVector ← updateContextVector(besti, Gi)
    return contextVector and f(contextVector)

where b^(i) is the d_k-dimensional vector representing the contribution of the i-th
sub-population (e.g., its current best position in the subspace S^(i)):

b^(i) = (b_1^(i), b_2^(i), . . . , b_dk^(i))^T

Given the j-th individual x^(i,j) ∈ S^(i) of the i-th sub-population:

x^(i,j) = (x_1^(i,j), x_2^(i,j), . . . , x_dk^(i,j))^T

its fitness value is given by f(b^(i,j)), where b^(i,j) is defined as:

b^(i,j) = (b_1^(1), . . . , b_dk^(1), . . . , x_1^(i,j), . . . , x_dk^(i,j), . . . , b_1^(k), . . . , b_dk^(k))^T

In other words, the fitness of x^(i,j) is evaluated on the vector obtained from b
by substituting the components provided by the i-th sub-population with the
corresponding components of x^(i,j).
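The context-vector evaluation of b^(i,j) can be sketched in a few lines; `make_groups`, `evaluate` and the sphere objective are illustrative names of our own, not from the chapter.

```python
def make_groups(d, k):
    """Static decomposition of the directions {0, ..., d-1} into k
    equal groups of size d_k = d/k (0-based indices)."""
    dk = d // k
    return [list(range(i * dk, (i + 1) * dk)) for i in range(k)]

def evaluate(f, context, group, candidate):
    """Fitness of a subcomponent candidate: plug its d_k elements into
    a copy of the context vector b, then evaluate f on the full vector."""
    b = list(context)
    for pos, gi in enumerate(group):
        b[gi] = candidate[pos]
    return f(b)

# sketch with an assumed sphere objective
sphere = lambda x: sum(v * v for v in x)
groups = make_groups(16, 4)          # d = 16, k = 4 as in Fig. 1
context = [1.0] * 16
print(evaluate(sphere, context, groups[1], [0.0, 0.0, 0.0, 0.0]))  # 12.0
```

Only the candidate's own slice of b is replaced; the remaining components come from the other sub-populations, which is exactly how the cooperation between subcomponents enters the fitness.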
Except for this way of evaluating the individuals, the CC proceeds by using the
standard optimizer in each subspace. Algorithm 1 outlines a possible basic CC
optimization process. First, a decomposition function creates the k groups of
directions. Then the population and the context vector are randomly initialized.
The optimization is organized in cycles. During each cycle, the optimizer is acti-
vated in a round-robin fashion for the different subcomponents, and the context
vector is updated using the current best individual of each sub-population. A
budget of maxFESC fitness evaluations is allocated to each subcomponent at
each cycle. The CC cycles terminate when the number of fitness evaluations
reaches the value maxFE. Note that several variants of this scheme are pos-
sible. For example, the context vector could be updated in a synchronous way
at the end of each cycle.
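Algorithm 1 can be rendered as a compact stdlib-Python skeleton. Everything here is illustrative: the sphere objective stands in for f, and the per-subcomponent optimizer is collapsed to a (1+1)-style random search on the context vector's slice rather than a real population-based metaheuristic.

```python
import random

def cc_optimize(f, d, k, max_fe, max_fe_sc):
    """Skeleton of a CC cycle: round-robin over k subcomponents, a
    per-cycle budget max_fe_sc each, asynchronous context updates."""
    dk = d // k
    groups = [list(range(i * dk, (i + 1) * dk)) for i in range(k)]
    context = [random.uniform(-5, 5) for _ in range(d)]
    fe = 0
    while fe < max_fe:
        for group in groups:                   # round-robin over subcomponents
            best = [context[g] for g in group]
            best_fit = f(context)
            for _ in range(max_fe_sc):         # budget maxFESC per cycle
                cand = [b + random.gauss(0, 0.2) for b in best]
                trial = list(context)
                for pos, g in enumerate(group):
                    trial[g] = cand[pos]
                fit = f(trial)                 # cooperation via the context vector
                if fit < best_fit:
                    best, best_fit = cand, fit
                fe += 1
            for pos, g in enumerate(group):    # asynchronous context update
                context[g] = best[pos]
    return context, f(context)

random.seed(2)
x, fx = cc_optimize(lambda v: sum(t * t for t in v),
                    d=8, k=4, max_fe=4000, max_fe_sc=50)
```

Because each subcomponent starts from the current context slice and only accepts improvements, f(context) never increases across cycles, mirroring the cooperative update in Algorithm 1.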
In the CC framework, many design aspects can have a significant impact on
the optimization performance. The main issue was recognized early in [1,3] and
later confirmed by several studies (e.g. [4]): when interdependent variables are as-
signed to different subcomponents, the search efficiency can decline significantly.
The interdependency between decision variables is a common condition in
real optimization problems and is referred to in the literature as non-separability
[29] or epistasis, that is, gene interaction. Basically, separability means that the
influence of a variable on the fitness value is independent of any other variable.
More formally, following [30], a function f : R^d → R is separable iff:

arg min_{(x1, ..., xd)} f(x1, . . . , xd) = ( arg min_{x1} f(x1, . . .), . . . , arg min_{xd} f(. . . , xd) )   (1)
otherwise the function f(x) is non-separable. However, real-world optimiza-
tion problems are often partially separable, that is, they generate objective functions
that are in between separable and fully non-separable. For example, a simple
class of partially separable problems is generated by m-separable objective func-
tions f : R^d → R, in which at most m variables xi are interdependent. Moreover,
other real-world problems may consist of k different groups of interdependent
variables with little or no interaction between the groups.
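Definition (1) can be probed numerically: greedy coordinate-wise minimization recovers the optimum of a separable function but stalls on a non-separable one. Both test functions and the grid search below are illustrative choices of our own, not examples from the chapter.

```python
def coordinate_minimize(f, d, grid, sweeps=1):
    """Greedy per-coordinate minimization over a fixed grid: each
    variable is optimized while all the others are held constant."""
    x = [0.5] * d
    for _ in range(sweeps):
        for i in range(d):
            x[i] = min(grid, key=lambda v: f(x[:i] + [v] + x[i + 1:]))
    return x

grid = [v / 10.0 for v in range(-20, 21)]        # -2.0 .. 2.0, step 0.1

# separable: each term involves a single variable
separable = lambda x: (x[0] - 1) ** 2 + (x[1] + 1) ** 2
# non-separable: the dominant term couples x0 and x1 (optimum at (1, 1))
nonsep = lambda x: (x[0] - x[1]) ** 2 + 0.01 * (x[0] + x[1] - 2) ** 2

print(coordinate_minimize(separable, 2, grid))   # [1.0, -1.0]: optimum found
print(coordinate_minimize(nonsep, 2, grid))      # stalls at [0.5, 0.5]
```

The non-separable case stalls because moving either variable alone away from the diagonal immediately increases the coupling term, which is precisely the mechanism that hurts a CC decomposition placing interacting variables in different groups.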
In [29], Salomon showed that the performance of a simple GA can decrease
significantly on non-separable problems. For this reason, the level of sep-
arability is considered a measure of the difficulty of an optimization problem.
In fact, even when the dimension d of the search space is high, in principle
a fully separable problem can be easily solved by decomposing it into d
sub-problems, each of which involves only one decision variable (i.e. each variable
can be optimized while keeping all the other variables constant).
As for the CC approach applied to non-separable problems, it is clear that with
the simple decomposition methods described above, interdependent variables are
likely to be located in different groups. This precludes an effective use of the
adopted optimization metaheuristic and can cause slow convergence.
In some applications (e.g. [31]), prior knowledge of the problem allows
the creation of a specific grouping structure in order to account for the interdepen-
dence among variables. However, in most relevant cases this is not possible. For
this reason, many approaches have been devised in the literature to cope with the
problem of interacting variables in CC. The rationale behind the proposed ap-
proaches is to automatically group, at least for a certain number of cycles, the
interdependent variables in the same subcomponent.
To such a purpose, a major attempt was the so-called Random Grouping (RG)
method, proposed in [6,7]. RG is a strategy in which the directions of the orig-
inal search space are periodically grouped in a random way to determine the
subspaces in which the cooperative search is carried out. Such an approach was
successfully applied to DE [25] on high-dimensional non-separable problems
with up to 1000 dimensions. Subsequently, the RG idea was integrated into sev-
eral cooperative optimizers. For example, in [14] the authors applied the idea
outlined above to PSO to solve large-scale optimization problems ranging from
100 to 2000 variables. In addition, they introduced the idea of dynamically changing
96 G.A. Trunfio
the sizes of the subspaces assigned to the different sub-populations. Such a
cooperative PSO outperformed some state-of-the-art evolutionary algorithms on
complex multi-modal problems. Also, in [15] the author used the same cooperative
approach with RG in a micro-PSO, showing that even with a small number
of individuals per sub-population (i.e., 5) the algorithm was very efficient on
high-dimensional problems. Recently, in [18] the RG approach has been applied
to a CC version of the FA [26,27].
Besides the RG strategy, further research efforts have been devoted to improving
the CC framework using adaptive approaches.
A relevant contribution in this direction was the automatic adaptation of the
subcomponent sizes, initially proposed in [6] and recently improved in [32].
Also, starting from the initial proposal of [2], many research works have devised
strategies for automatically decomposing the problem into subcomponents in
order to account for the interdependencies between variables (e.g. [9,10,13,16,17]).
Another aspect, strictly related to the problem decomposition, concerns the
possible imbalance between the contributions of the different subcomponents to
the improvement of the fitness at each cycle. A first proposal to cope with this
issue was presented in [12], where the authors proposed an adaptive approach
that determines the computational effort to spend on the subcomponents according to
their contributions to the fitness. A preliminary version of a new approach is
presented in Section 3.4 of this chapter.
In the following, we outline some of the above-mentioned adaptive techniques
for the enhancement of the CC approach.
3 Adaptation in CC
3.1 Random Grouping
Even though it cannot be classified as an example of adaptation, the RG strategy is
part of some adaptive CC methodologies and has contributed significantly to improving
the effectiveness of CC optimization. As outlined above, RG [7] consists of a
randomized periodic re-allocation of search directions to the subcomponents
during the optimization.
Compared with the simple linear decomposition described in Section 2, it
has been proved that RG increases the probability of having two interacting
variables in the same sub-population for at least some iterations of the search
algorithm [7,11]. Clearly, when a subcomponent directly operates on all the
interdependent variables, it has a better chance of adjusting their values towards
the optimal direction.
In more detail, in the linear decomposition proposed in [1,4,5] the i-th
sub-population operates on the group of directions G_i defined as the interval:

$$G_i = \left[ (i-1)\, d_k + 1,\ \ldots,\ i\, d_k \right]$$
In addition, the decomposition G = {G_1, ..., G_k} of Algorithm 1 is static, in the
sense that it is defined before the beginning of the optimization cycles. Instead, a
Adaptation in Cooperative Coevolution 97

Algorithm 2. CCRG(f, n)
 1  G = {G_1, ..., G_k} ← randomGrouping(n, k);
 2  pop ← initPopulation();
 3  contextVector ← initContextVector(pop);
 4  fitnessEvaluations ← 0;
 5  while fitnessEvaluations < MaxFE do
 6      foreach G_i ∈ G do
 7          pop_i ← extractPopulation(pop, G_i);
 8          best_i ← optimizer(f, pop_i, contextVector, G_i, maxFESC);
 9          pop ← storePopulation(pop_i, G_i);
10          fitnessEvaluations ← fitnessEvaluations + maxFESC;
11          contextVector ← updateContextVector(best_i, G_i);
12      G = {G_1, ..., G_k} ← randomGrouping(n, k);
13      // depending on the optimizer, further operations may be required after random grouping
14  return contextVector and f(contextVector);

RG approach assigns to the i-th group d_k = d/k directions q_j, with j randomly
selected without replacement from the set {1, 2, ..., d}.
The typical RG implementation is shown in Algorithm 2. Note that the function
randomGrouping can simply be based on an array Q that contains the
current map between the directions assigned to a subcomponent and the
corresponding directions of the original search space. Before each re-grouping, Q is
randomly shuffled and used to decompose the d-dimensional vector of directions
into d_k-dimensional groups G_i. With a population-based optimizer, this simple
implementation assumes that each subcomponent operates on the same number
of individuals.
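A minimal sketch of the shuffled-index scheme just described (the function name and signature are illustrative, not from [7]):

```python
import random

def random_grouping(d, k):
    """Shuffle the index array Q and slice it into k groups of d // k
    directions each; assumes, as in the text, that k divides d."""
    q = list(range(d))
    random.shuffle(q)
    dk = d // k
    return [q[i * dk:(i + 1) * dk] for i in range(k)]

# Three disjoint groups of four directions drawn from {0, ..., 11}
groups = random_grouping(d=12, k=3)
```

Each call produces a fresh random decomposition, which is exactly the per-cycle re-grouping invoked at line 12 of Algorithm 2.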
In Algorithm 2 it is worth noting that, depending on the particular optimizer,
further operations may be required after random grouping. For example, as
shown in [14], in the case of PSO particular care must be taken to preserve the
personal bests of the particles. Also, after each random grouping it is necessary to
re-evaluate the fitness of the personal bests in order to keep them consistent with the
new group arrangement. This obviously represents an additional computational
cost.
To motivate the RG approach, in [7] the authors derived a formula expressing
the probability of having two interdependent variables in the same subcomponent
for at least r of the N cycles performed. According to their results, RG has a
relatively high probability of optimizing two interacting variables in the same
subcomponent for at least some cycles. Later, in [11] it was shown that in the case of v
interacting variables, the probability of having all of them grouped in the same
subcomponent for at least r cycles is:

$$P_r = \sum_{i=r}^{N} \binom{N}{i} \left(\frac{1}{k^{\,v-1}}\right)^{i} \left(1 - \frac{1}{k^{\,v-1}}\right)^{N-i} \qquad (2)$$
Fig. 2. Probability of grouping the interacting variables in the same subcomponent at least once in N cycles

where k is the number of subcomponents and N is the number of cycles. According
to Eq. 2, when the number v of interacting variables is high, the probability
of having all of them in the same subcomponent for at least one cycle is very
low. This can also be seen in Fig. 2, where such a probability is plotted for v
between 2 and 10 and for different numbers of cycles N.
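Eq. 2 is easy to evaluate numerically; the sketch below (parameter values are illustrative) reproduces the qualitative behaviour shown in Fig. 2, with the probability for r = 1 collapsing as v grows:

```python
from math import comb

def prob_at_least_r(N, k, v, r):
    """Eq. (2): probability that v interacting variables share the same
    subcomponent for at least r of N random re-groupings into k groups."""
    p = 1.0 / k ** (v - 1)   # chance that one re-grouping co-locates all v
    return sum(comb(N, i) * p**i * (1.0 - p)**(N - i) for i in range(r, N + 1))

# With k = 10 groups and N = 50 cycles, a pair of variables almost surely
# meets at least once, while 5 or 10 interacting variables almost never do.
probs = {v: prob_at_least_r(N=50, k=10, v=v, r=1) for v in (2, 3, 5, 10)}
```

For r = 1 the sum reduces to 1 - (1 - k^{-(v-1)})^N, which makes the exponential decay in v explicit.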
To mitigate this problem, in [11] the authors suggested increasing the frequency
of RG. In practice, given a budget of fitness evaluations MaxFE, the
number of re-groupings can be maximized by keeping to a minimum the number of
fitness evaluations maxFESC allowed for each subcomponent at each
cycle. For example, the RG frequency can be maximized by executing only one
generation of the evolutionary optimizer per CC cycle. In [11], a higher frequency
of RG provided significant benefits on some non-separable high-dimensional
problems. However, depending also on the optimizer, a lower RG frequency
may be more efficient on different objective functions.
According to Fig. 2, grouping many interacting variables together at least once
would require an infeasible number of cycles under the RG approach. Nevertheless,
according to the literature, the RG strategy can be beneficial even in such cases.
This is likely because, even when only some of such variables are grouped together
in turn, a CC approach based on a suitable optimizer can operate effectively.

3.2 Adapting the Subcomponent Sizes

Another parameter that can significantly affect the optimization performance of
the CC framework is the number of directions assigned to each group G_i, that is,
in the frequent case of equal-sized groups, the value d_k = d/k. As noted in [6], it is
not easy to determine the appropriate value of d_k, which is indeed dependent on
both the problem and the optimizer. Small group sizes can be suitable for separable
problems, making the optimization of each subcomponent easier. On the other
hand, as can be seen from Eq. 2, large group sizes (i.e. low values of k) increase the
probability of grouping interacting variables together in non-separable problems.
It was also argued in [6] that the value of d_k should be adapted during the
optimization, from small group sizes at the beginning to large group sizes at the
end of the process.
A relevant CC with adaptive subcomponent size is the Multilevel Cooperative
Coevolution (MLCC) framework proposed in [6]. The idea of MLCC
is to define, before the beginning of the optimization process, a pool of
decomposers, that is, a set of group sizes D = {d_k1, d_k2, ..., d_kt}. Then, at the beginning
of each cycle, MLCC selects a decomposer d_ki from D on the basis of its
performance during the past cycles. To this purpose, the algorithm must attribute a
performance index r_i to each decomposer. This is done as follows: (i) initially,
all the r_i are set to 1; (ii) then, the r_i are updated on the basis of the fitness
gain associated with their use, according to the equation:

$$r_i = \frac{f_{prev} - f_{cur}}{|f_{prev}|} \qquad (3)$$

where f_prev is the best fitness at the end of the previous cycle and f_cur is the
best fitness achieved at the end of the current cycle, in which the decomposer
d_ki has been used. At the beginning of each cycle, the performance indexes are
converted into probabilities using a Boltzmann soft max distribution [33]:

$$p_i = \frac{e^{r_i/c}}{\sum_{j=1}^{t} e^{r_j/c}} \qquad (4)$$

where c is a suitable constant. The latter should be set in such a way as to associate
a high probability of being selected with the best decomposers (exploitation), while
still giving some chance to all the available decomposers (exploration). The above
mechanism allows the problem decomposition to self-adapt to the particular
objective problem and also to the evolution stages.
It is worth noting that the MLCC adaptive method can be seen from the
perspective of a reinforcement learning (RL) approach [33], where the increase of
fitness is the reinforcement signal and the actions consist in the choice of the
decomposer. However, instead of selecting actions on the basis of their long-term
utility, MLCC uses their immediate reward.
According to the RL literature, an alternative to the selection strategy
proposed in [6] could be the so-called ε-greedy approach: at each cycle the
decomposer with the highest performance index is selected with probability 1 − ε;
otherwise, a random decomposer is selected. The small parameter ε ∈ [0, 1] should
be set in such a way as to balance the exploitation of what has already been learned
and the exploration of alternative decomposers, with the chance to better adapt
the performance indexes to the evolution stages.
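The two selection rules can be sketched as follows; this is a standalone illustration, and the performance-index values are hypothetical, not taken from [6]:

```python
import math
import random

def softmax_select(perf, c):
    """Boltzmann soft max selection of a decomposer index (Eq. 4)."""
    weights = [math.exp(r / c) for r in perf]
    total = sum(weights)
    u, acc = random.random() * total, 0.0
    for i, w in enumerate(weights):
        acc += w
        if u <= acc:
            return i
    return len(weights) - 1

def eps_greedy_select(perf, eps):
    """Epsilon-greedy alternative: exploit the best index with prob. 1 - eps."""
    if random.random() < eps:
        return random.randrange(len(perf))
    return max(range(len(perf)), key=lambda i: perf[i])

perf = [1.0, 1.4, 0.2]            # hypothetical performance indexes r_i
chosen_soft = softmax_select(perf, c=0.5)
chosen_greedy = eps_greedy_select(perf, eps=0.1)
```

In both rules a smaller c (or ε) shifts the balance towards exploitation of the currently best decomposer.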
In [6], the MLCC adaptation method defined by Eqs. 3 and 4 was tested, using
a RG strategy, on a suite of benchmark functions. The authors found that in
several cases the self-adaptive strategy outperformed the corresponding methods
based on the static selection of d_k and on the random selection of the group sizes
at each cycle.
A simpler approach for dynamically setting the group size d_k was adopted in
the cooperative PSO (CCPSO2) proposed in [14]: given a set of decomposers D,
at the end of each cycle a new decomposer is selected uniformly at random whenever
the fitness does not improve; otherwise, the same decomposer is used for the next
cycle. A problem with this approach is that when the slope of the convergence
curve is very small, the method does not intervene and the current value of d_k
is maintained.
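The CCPSO2 rule reduces to a few lines; the pool of group sizes below is illustrative:

```python
import random

def next_group_size(pool, current, improved):
    """CCPSO2-style adaptation: keep the current group size while the
    fitness improves, otherwise draw a new one uniformly from the pool."""
    return current if improved else random.choice(pool)

pool = [2, 5, 10, 50, 100]                 # set D of decomposers (illustrative)
size = next_group_size(pool, current=10, improved=False)
```

Note that the rule only reacts to a strict lack of improvement, which is precisely the weakness mentioned above: a slowly improving run never triggers a re-draw.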
A similar strategy was adopted in [18] for a CC implementation of the Firefly
Algorithm (FA) [34,26,35]. However, besides the group size, the number of
individuals in the sub-populations was also adapted accordingly, using a predefined
look-up table. This further adaptation raised the problem of how to select the
individuals to be deleted from the sub-populations when it was necessary to
reduce their size. In [18] the choice was simply to eliminate the worst individuals
of each sub-population. Instead, when the size of the sub-populations had to be
increased, new individuals were initialized at random.
Recently, in [32] an improvement of the MLCC adaptive approach was
presented, namely the MLSoft algorithm. In particular, the authors proposed to
use a standard RL approach, replacing r_i in Eq. 4 with a value function V_i. The
latter, which is an estimate of the long-term utility associated with the use of a
decomposer, was defined as the arithmetic mean of all rewards r_i received by
the decomposer d_ki during the optimization process.
In [32], the authors showed empirically that for a given fully-separable
objective function f, there exists a value d* of the problem dimension d that yields
the best optimizer performance (defined as a sort of efficiency
in using the available budget of function evaluations). In terms of the CC framework,
this means that the subcomponents should be neither too small nor too large.
Thus, the objective of the adaptive MLSoft approach should be to discover
such an optimal value d* < d in order to decompose the
original d-dimensional problem in the optimal way. The MLSoft algorithm was tested on eight
fully-separable functions using a rich set D of decomposers and different values of the
parameter c in Eq. 4. According to the results, MLSoft outperformed MLCC.
However, the MLSoft algorithm was not able to outperform the corresponding
CC framework with a fixed and optimal subcomponent size. The authors argued
that there is therefore room for improvement in the way in which the value function
V_i is determined. Indeed, in standard RL the value function is typically
updated on the basis of the experienced state-action pairs, using the received
rewards and a learning rate α ∈ [0, 1]. In addition, older values of the reward
are usually weighted through a discount factor γ ∈ [0, 1]. Also in the MLSoft
approach, this might be more effective than computing the value function as an
arithmetic mean of the rewards.
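The difference between MLSoft's arithmetic-mean value function and the standard incremental RL update suggested here can be sketched as follows (the reward values are illustrative):

```python
def value_mean(rewards):
    """MLSoft's choice: V_i as the arithmetic mean of all received rewards."""
    return sum(rewards) / len(rewards)

def value_update(v, reward, alpha):
    """Standard RL alternative: V <- V + alpha * (r - V), an exponential
    recency-weighted average that discounts older rewards."""
    return v + alpha * (reward - v)

rewards = [0.5, 0.1, 0.0, 0.4]
v_mean = value_mean(rewards)       # weighs all cycles equally
v_rl = 0.0
for r in rewards:                  # weighs recent cycles more heavily
    v_rl = value_update(v_rl, r, alpha=0.5)
```

With a constant α, the incremental update tracks changes in the reward distribution across evolution stages, which the plain mean cannot do.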
3.3 Adaptive Grouping

As discussed above, having the interacting variables grouped in the same
subcomponent is a crucial factor for enabling efficient CC optimization of
non-separable problems. For this reason, a number of studies have addressed the
problem of automatic and adaptive decomposition into subcomponents. In contrast
to the RG approach, such automatic procedures try to discover the underlying
structure of the problem in order to devise and adapt a suitable decomposition.
The first attempt in this direction was carried out in [2], where the authors
proposed a technique to identify interacting variables in a CC framework. The
approach was based on the observation that if a candidate solution in which two
directions have been changed achieves a better fitness than the same solution
in which only one of the directions was changed, then this may indicate the presence
of an interdependency. The creation of groups was carried out during the
optimization process, exploiting some additional fitness evaluations for each
individual. The technique proved effective, although the approach was tested only
on a few functions with dimensionality up to 30.
Following the idea proposed in [2], which basically consists of observing the
changes of the objective function due to a perturbation of variables, more effective
methods have been developed for enhancing the CC approach. In most cases,
the decomposition stage is performed off-line, that is, the groups are created
before the optimization starts. Other approaches presented in the literature for
automatically grouping variables in CC are based on learning statistical models
of interdependencies [13] or on the correlation between variables [36]. However,
as noted in [10], two variables might be highly linearly correlated even when they
are completely separable. In other words, correlation coefficients are not a proper
measure of separability in the CC optimization context.
A step in the development of an automatic grouping strategy for CC
optimization has been the Delta Grouping (DG) approach proposed in [10]. The
DG algorithm is based on the concept of the improvement interval of a variable, that
is, the interval in which the fitness value could be improved while all the other
variables are kept constant [29,10]. It has been observed that in non-separable
functions, when a variable interacts with other variables, its improvement
interval tends to be smaller. Therefore, in the DG approach the identification of
interacting variables is based on measuring the amount of change (i.e. the
delta value) in each of the decision variables during the optimization process. In
particular, the DG algorithm sorts the directions according to the magnitude of
their delta values in order to group the variables with smaller delta values in the
same subcomponent.
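The core of delta grouping, sorting the directions by the magnitude of their deltas, can be sketched as follows (the population bests are illustrative values, and the function name is ours):

```python
import numpy as np

def delta_grouping(prev_best, cur_best, k):
    """Sort directions by |delta| between two cycles and slice the sorted
    order into k equal groups, so similar-magnitude deltas stay together."""
    delta = np.abs(np.asarray(cur_best) - np.asarray(prev_best))
    order = np.argsort(delta, kind="stable")    # smallest deltas first
    dk = len(order) // k
    return [order[i * dk:(i + 1) * dk].tolist() for i in range(k)]

prev = [0.9, 2.0, 0.5, 1.1, 3.0, 0.2]   # best solution at previous cycle
cur = [1.0, 0.0, 0.5, 1.0, 1.0, 0.2]    # best solution at current cycle
groups = delta_grouping(prev, cur, k=2)
```

Variables whose values barely moved (small improvement intervals, hence likely interacting) end up in the first groups, which is the heuristic DG relies on.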
Clearly, as noted in [10], a small improvement interval does not always imply
a variable interdependency. Nevertheless, when tested on both the CEC2008 [37]
and CEC2010 [38] benchmark functions, the DG method performed better
than other relevant CC methods. However, a drawback of DG is its low performance
when there is more than one non-separable subcomponent in the objective
function [10].
It is worth noting that, being an on-line adaptation technique, the DG
approach has the ability to adapt itself to the fitness landscape. Such a property
can be valuable when the degree of non-separability changes depending on the
current region of the search space explored by the individuals in the population.
A different grouping technique, recently proposed in [9], is Cooperative
Co-evolution with Variable Interaction Learning (CCVIL), which can be viewed
as a development of the method presented in [2]. In the CCVIL algorithm, the
optimization is carried out through two stages, namely learning and optimization,
in the first of which the grouping structure is discovered. According to [9], an
interaction between any two variables x_i and x_j is taken under consideration if
the following condition holds:

$$\exists\, \mathbf{x},\, x'_i,\, x'_j:\quad f(x_1, \ldots, x'_i, \ldots, x_j, \ldots, x_d) < f(x_1, \ldots, x_i, \ldots, x_j, \ldots, x_d)\ \wedge \qquad (5)$$
$$f(x_1, \ldots, x'_i, \ldots, x'_j, \ldots, x_d) > f(x_1, \ldots, x_i, \ldots, x'_j, \ldots, x_d)$$

The learning stage of CCVIL starts by placing each direction in a separate
subcomponent, that is, by separately optimizing the variables in sequence. During
this process, CCVIL tests whether the currently and the previously optimized
dimensions interact, using Eq. 5. The latter can be applied because only two
dimensions have changed. Before each learning cycle, the order of optimization of the
variables is randomly permuted, so that any two dimensions have the same chance
of being processed in sequence. After the convergence of the learning stage in terms
of grouping, CCVIL starts the optimization stage.
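Condition 5 can be checked directly on a candidate solution. The sketch below applies it to a hypothetical non-separable function and to a separable one; all function definitions and values are illustrative assumptions:

```python
def interact(f, x, i, j, xi_new, xj_new):
    """CCVIL-style test (Eq. 5): x_i and x_j are flagged as interacting if
    replacing x_i improves f for the current x_j but worsens it for xj_new."""
    def f_with(changes):
        y = list(x)
        for idx, v in changes:
            y[idx] = v
        return f(y)
    improves = f_with([(i, xi_new)]) < f_with([])
    worsens = f_with([(i, xi_new), (j, xj_new)]) > f_with([(j, xj_new)])
    return improves and worsens

f_nonsep = lambda x: (x[0] * x[1] - 1.0) ** 2     # x0 and x1 are coupled
f_sep = lambda x: x[0] ** 2 + x[1] ** 2           # fully separable
flag_nonsep = interact(f_nonsep, [2.0, 1.0], 0, 1, xi_new=1.0, xj_new=0.5)
flag_sep = interact(f_sep, [2.0, 1.0], 0, 1, xi_new=1.0, xj_new=0.5)
```

For the separable function the sign of the improvement cannot flip when x_j moves, so the test correctly returns no interaction.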
In [9], the authors tested the CCVIL approach using the CEC2010 benchmark
functions [38]. According to the results, CCVIL improved the underlying CC
algorithm on most of the benchmark functions. However, a significant issue to be
solved concerns the distribution of the computational effort between the learning and
optimization stages of CCVIL.
Another recent approach for adaptive grouping, named the Differential Grouping
Algorithm (DGA), has been proposed in [17] for additively separable (AS)
functions f : R^d → R, which can be expressed as the sum of k independent
non-separable functions. In this case, there exists an ideal problem decomposition
G_id composed of k groups of variables G_i such that if q ∈ G_i and r ∈ G_j, with
i ≠ j, then q and r are independent directions. However, it is worth noting that
G_id is not necessarily the best decomposition for a CC optimization algorithm,
as can be inferred from the results presented in [32].
The DGA approach is founded on the formal proof that, for AS functions,
if the forward differences along x_p

$$\Delta f_{x_p}(\mathbf{x}, \delta)\big|_{x_p=a,\,x_q=b} \quad \text{and} \quad \Delta f_{x_p}(\mathbf{x}, \delta)\big|_{x_p=a,\,x_q=c}$$

are not equal, with b ≠ c and δ ≠ 0, then x_p and x_q are non-separable. The
forward difference with interval δ, at a point x and along the direction x_p, is
defined as:

$$\Delta f_{x_p}(\mathbf{x}, \delta) = f(\ldots,\, x_p + \delta,\, \ldots) - f(\ldots,\, x_p,\, \ldots)$$
and requires two function evaluations to be estimated. The DGA presented in
[17] exploits the above property to create groups of interacting variables. The
algorithm operates by checking the interactions through pairwise comparisons
among variables. However, DGA does not necessarily require all the comparisons.
In fact, when an interaction is detected between two variables, one of the
two is placed in a group and excluded from further comparisons. According to
[17], when there are k = d/m non-separable subcomponents, each with m variables,
the maximum number of fitness evaluations required by DGA is O(d²/m).
However, the actual number of additional fitness evaluations may change significantly
depending on the problem. For example, for d = 1000, with m = 50 only
21000 function evaluations are required, while with m = 1 (i.e. a fully separable
problem) DGA requires 1001000 additional evaluations.
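The forward-difference test at the heart of DGA can be sketched as follows; the test function, evaluation point, and interval are illustrative choices, not taken from [17]:

```python
def delta_f(f, x, p, step):
    """Forward difference of f along direction p with interval step."""
    y = list(x)
    y[p] += step
    return f(y) - f(x)

def interacts(f, x, p, q, step=1.0, b=0.0, c=1.0, tol=1e-9):
    """x_p and x_q are flagged as non-separable if the forward difference
    along x_p changes when x_q is moved from b to c (with b != c)."""
    xb, xc = list(x), list(x)
    xb[q], xc[q] = b, c
    return abs(delta_f(f, xb, p, step) - delta_f(f, xc, p, step)) > tol

f_as = lambda x: x[0] * x[1] + x[2] ** 2   # additively separable into {x0, x1} and {x2}
flag_01 = interacts(f_as, [0.0, 0.0, 0.0], 0, 1)   # coupled pair
flag_02 = interacts(f_as, [0.0, 0.0, 0.0], 0, 2)   # independent pair
```

Each call to `interacts` costs four function evaluations, which is where the O(d²/m) bound on the total grouping cost comes from.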
In [17], DGA was tested in a CC optimizer using the CEC2010 benchmark
functions [38], showing a good grouping capability. Also, DGA outperformed the
CCVIL approach on most functions, both in terms of grouping accuracy and
computational cost.

3.4 Adaptive Computational Budget Allocation

Given the automatic decomposition capabilities described above, the authors
of [12] noted that there is often an imbalance between the contributions to the
fitness of the separable and non-separable portions of an optimization problem.
In particular, in CC there are situations in which the improvements in some of
the subcomponents are not apparent simply because they are negligible in
comparison to the fitness variation caused by other subcomponents. Thus, according
to [12], in most cases devoting the same amount of computational resources to
all subcomponents (i.e. the value maxFESC in Algorithms 1 and 2) in a
round-robin fashion can result in a waste of fitness evaluations. In order to mitigate
this issue, the Contribution-Based Cooperative Co-evolution (CBCC) algorithm
was proposed in [12], where:

1. the contribution F_i of each subcomponent is estimated by measuring the
changes in global fitness when it undergoes optimization; such contributions
are accumulated from the first cycle onwards during the optimization;
2. each cycle is composed of a round-robin testing phase, in which the contributions
F_i are updated, and a subsequent stage in which the subcomponent
with the greatest F_i is iteratively selected for further optimization;
3. when there is no improvement in the last phase, the algorithm starts a new
cycle with a new testing phase.
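After a testing phase has refreshed the accumulated contributions, the CBCC selection step amounts to a simple argmax; the contribution values below are hypothetical:

```python
def pick_subcomponent(contributions):
    """CBCC-style choice: optimize next the subcomponent whose accumulated
    fitness contribution is currently the largest."""
    return max(range(len(contributions)), key=lambda i: contributions[i])

# Hypothetical accumulated gains F_i from the round-robin testing phases
contributions = [120.0, 3.5, 40.2]
selected = pick_subcomponent(contributions)
```

Because the contributions accumulate from the first cycle, an early dominant subcomponent keeps being selected for a while even after it has converged, which is the historical-information bias discussed next.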

Clearly, the CBCC algorithm must be integrated with an effective grouping
strategy, able to decompose the problem into groups that are as independent
as possible. CBCC proved promising when tested on the
benchmark functions proposed for CEC2010 [38]. However, the experiments
showed that CBCC is too strongly influenced by historical information in the early
stages of evolution. For example, it may happen that
Algorithm 3. CCAOI(f, n, nMaxAddGen, minGen)
 1  G, fitnessEvaluations ← grouping(n, f);
 2  pop ← initPopulation();
 3  contextVector ← initContextVector(pop);
 4  foreach G_i ∈ G do
 5      φ_i ← 0;
 6  while fitnessEvaluations < MaxFE do
 7      foreach G_i ∈ G do
 8          gen_i ← minGen;
 9      φ_m, γ ← computeStatistics();
10      if φ_m > 0 then
11          nAddGen ← γ · nMaxAddGen;
12          foreach G_i ∈ G do
13              gen_i ← gen_i + (nAddGen · φ_i) / (|G| · φ_m);
14      foreach G_i ∈ G do
15          pop_i ← extractPopulation(pop, G_i);
16          best_i, fEvals ← optimizer(f, pop_i, contextVector, G_i, gen_i);
17          pop ← storePopulation(pop_i, G_i);
18          fitnessEvaluations ← fitnessEvaluations + fEvals;
19          contextVector ← updateContextVector(best_i, G_i);
20          prevBestFitness ← bestFitness;
21          bestFitness ← f(contextVector);
22          φ_i ← max((prevBestFitness − bestFitness) / gen_i, 0);
23  return contextVector and f(contextVector);

the subcomponent that is initially recognized as the major fitness contributor
reaches convergence very soon. In this case, the CBCC approach presented in
[12] does not switch immediately to the subcomponent with the largest
contribution, due to the influence of the initial assessment of contributions. From this
point of view, there is still room for developing an adaptive procedure that can
cope effectively with the problem of imbalance between the fitness contributions
of the different subcomponents.
Following the CBCC idea, we have developed a CC in which the computational
effort allocated to the different subcomponents is dynamically adapted to
the particular problem and to the different stages of evolution. The developed
procedure, named CC with Adaptive Optimizer Iterations (CCAOI), is outlined
in Algorithm 3. The idea of CCAOI is to determine the number of generations to
be executed by each subcomponent at each cycle on the basis of an indicator φ_i,
which is the contribution of the subcomponent to the global fitness, normalized
by the number gen_i of generations executed by the optimizer. At the beginning
of each cycle, a minimum number minGen of generations is attributed to each
subcomponent. Then, we determine the current mean value φ_m of all the φ_i, as well
as the Gini index γ [39], as:
Table 1. Results of CCAOI and CCFA on the adopted test functions. The better
average errors are highlighted in bold when there is a significant difference according
to the t-test.

Function   CCAOI Avg. Error (Std. dev)   CCFA Avg. Error (Std. dev)    p
f4         5.701E+011 (8.541E+010)       9.544E+012 (5.003E+012)       0.000
f5         1.377E+008 (2.316E+007)       1.371E+008 (1.936E+007)       0.921
f6         1.011E+006 (8.426E+005)       2.440E+006 (2.720E+005)       0.000
f7         2.118E+002 (2.599E+001)       5.972E+007 (3.185E+007)       0.000
f8         4.082E+007 (3.718E+006)       5.857E+007 (2.124E+007)       0.000
f9         5.463E+007 (5.872E+006)       8.547E+007 (1.348E+007)       0.000
f10        2.664E+003 (1.039E+002)       2.676E+003 (1.704E+002)       0.765
f11        1.969E+001 (1.109E+000)       1.961E+001 (1.439E+000)       0.827
f12        1.205E−001 (2.761E−002)       5.633E+002 (1.127E+002)       0.000
f13        5.943E+003 (3.205E+003)       9.905E+003 (5.634E+003)       0.004

$$\gamma = \frac{\sum_{i=1}^{k} \sum_{j=1}^{k} |\varphi_i - \varphi_j|}{2\, k^2\, \varphi_m}$$
where k = |G| is the number of subcomponents. The value of γ ∈ [0, 1] measures
the inequality between the contributions of the subcomponents. In particular,
γ = 0 when all the φ_i are the same, while γ = 1 when the normalized contributions
are characterized by the maximum inequality. Subsequently, if the average
contribution is greater than zero (i.e. no stagnation), an additional global budget of
generations nAddGen is determined as γ · nMaxAddGen, where nMaxAddGen
is the amount of generations corresponding to the maximum inequality between
subcomponents (line 11). This is justified by the fact that the main objective of
the adaptive procedure is to rapidly reduce the imbalance between the different
subcomponents; thus, the algorithm assigns a higher total amount of computational
effort for that cycle when the imbalance is greater. In lines 12-13, the nAddGen
additional generations are distributed to the subcomponents according
to their normalized contributions φ_i, so that each subcomponent is activated for a
number gen_i = minGen + nAddGen_i of generations. This value gen_i is later used
for normalizing the contribution of the i-th subcomponent to the global fitness.
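Lines 9-13 of Algorithm 3 can be sketched in a few lines of code; the contribution values are illustrative and the function name is ours:

```python
def allocate_generations(phi, min_gen, n_max_add_gen):
    """Compute the Gini index of the normalized contributions phi_i and
    distribute gamma * nMaxAddGen extra generations proportionally to them."""
    k = len(phi)
    phi_m = sum(phi) / k
    gens = [float(min_gen)] * k
    if phi_m > 0:   # no stagnation
        gamma = sum(abs(a - b) for a in phi for b in phi) / (2.0 * k * k * phi_m)
        n_add = gamma * n_max_add_gen
        gens = [min_gen + n_add * p / (k * phi_m) for p in phi]
    return gens

gens = allocate_generations([4.0, 0.0, 2.0], min_gen=10, n_max_add_gen=50)
```

Note that the extra budget sums exactly to γ · nMaxAddGen, since the sum of the φ_i equals k · φ_m; a subcomponent with zero contribution receives only the minimum minGen generations.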
It is worth noting that the method is based on the assumption that a suitable
decomposition algorithm is available. However, a preliminary version was tested
on some functions with the imbalance characteristics taken from the CEC2010
suite [38], for which we manually devised an ideal grouping G_id based on
knowledge of the functions (in the ideal grouping there is no interdependency
between any two subcomponents). Clearly, when using an automatic decomposition
procedure (see Section 3.3), an additional number of function evaluations
is required by the optimization.
Fig. 3. Some averaged convergence plots obtained on the benchmark functions

A first test of the proposed CCAOI approach was conducted using the FA
[34,26,35] as the optimizer. A CC version of the FA was already developed in [18],
where more details on the implementation can be found. The results for functions
f4-f13 are summarized in Table 1, and some averaged convergence plots are
shown in Fig. 3. In the experiments, the CCAOI algorithm was compared with
the corresponding CC approach in which all the subcomponents operate with
the same number minGen of generations. The adopted CCAOI parameters were
minGen = 10 and nMaxAddGen = 50. The results were averaged over 25
independent runs. In Table 1, for each function the better result is highlighted in
bold, according to a t-test conducted at the 0.05 significance level. As can
be seen, the proposed approach led to improved results on 70% of the test functions;
in the remaining 30%, the differences were not statistically significant. It
is worth noting that in functions f4-f8 the imbalance is relevant, while in the
remaining functions it is less significant (see [38] for the details). According to the
results, the proposed algorithm seems effective in addressing the issue of imbalance
between the fitness contributions provided by the different subcomponents.
Obviously, a more detailed investigation is needed for reliable conclusions on the
suitability of the method, including the use of CCAOI with other optimizers.

4 Conclusions

According to the literature, the CC approach has proved highly effective on
large-scale optimization problems. In addition, it has offered researchers several
opportunities to devise adaptive techniques for achieving greater optimization
efficiency. After introducing the CC approach, in this chapter we discussed some
of the most relevant proposals for CC enhancement that can be found in the
literature. In addition, we illustrated a preliminary version of an adaptive CC
algorithm that addresses the problem of distributing the computational effort
between subcomponents. The proposed method appears promising and deserves
to be investigated further. A suitable integration with other adaptive techniques
might also be effective and should be the object of future research work.

References

1. Potter, M.A., De Jong, K.A.: A cooperative coevolutionary approach to function
optimization. In: Davidor, Y., Manner, R., Schwefel, H.-P. (eds.) PPSN 1994.
LNCS, vol. 866, pp. 249-257. Springer, Heidelberg (1994)
2. Weicker, K., Weicker, N.: On the improvement of coevolutionary optimizers by
learning variable interdependencies. In: 1999 Congress on Evolutionary Computation,
pp. 1627-1632. IEEE Service Center, Piscataway (1999)
3. Potter, M.A., De Jong, K.A.: Cooperative coevolution: An architecture for evolving
coadapted subcomponents. Evolutionary Computation 8(1), 1-29 (2000)
4. Liu, Y., Yao, X., Zhao, Q.: Scaling up fast evolutionary programming with cooperative
coevolution. In: Proceedings of the 2001 Congress on Evolutionary Computation,
Seoul, Korea, pp. 1101-1108 (2001)
5. van den Bergh, F., Engelbrecht, A.P.: A cooperative approach to particle swarm
optimization. IEEE Trans. Evolutionary Computation 8(3), 225-239 (2004)
6. Yang, Z., Tang, K., Yao, X.: Multilevel cooperative coevolution for large scale
optimization. In: IEEE Congress on Evolutionary Computation, pp. 1663-1670.
IEEE (2008)
7. Yang, Z., Tang, K., Yao, X.: Large scale evolutionary optimization using cooperative
coevolution. Information Sciences 178(15), 2985-2999 (2008)
8. Parsopoulos, K.E.: Cooperative micro-particle swarm optimization. In: Proceedings
of the First ACM/SIGEVO Summit on Genetic and Evolutionary Computation,
GEC 2009, pp. 467-474 (2009)
9. Chen, W., Weise, T., Yang, Z., Tang, K.: Large-scale global optimization using
cooperative coevolution with variable interaction learning. In: Schaefer, R., Cotta,
C., Kolodziej, J., Rudolph, G. (eds.) PPSN XI. LNCS, vol. 6239, pp. 300-309.
Springer, Heidelberg (2010)
10. Omidvar, M.N., Li, X., Yao, X.: Cooperative co-evolution with delta grouping for
large scale non-separable function optimization. In: IEEE Congress on Evolutionary
Computation, pp. 1-8 (2010)
11. Omidvar, M.N., Li, X., Yang, Z., Yao, X.: Cooperative co-evolution for large scale
optimization through more frequent random grouping. In: Proceedings of the IEEE
Congress on Evolutionary Computation, pp. 1-8. IEEE (2010)
12. Omidvar, M.N., Li, X., Yao, X.: Smart use of computational resources based on
contribution for cooperative co-evolutionary algorithms. In: Proceedings of the 13th
Annual Conference on Genetic and Evolutionary Computation, GECCO 2011, pp.
1115-1122. ACM, New York (2011)
13. Sun, L., Yoshida, S., Cheng, X., Liang, Y.: A cooperative particle swarm optimizer
with statistical variable interdependence learning. Information Sciences 186(1),
20-39 (2012)
14. Li, X., Yao, X.: Cooperatively coevolving particle swarms for large scale optimization.
IEEE Trans. Evolutionary Computation 16(2), 210-224 (2012)
15. Parsopoulos, K.E.: Parallel cooperative micro-particle swarm optimization: A
master-slave model. Applied Soft Computing 12(11), 3552-3579 (2012)
16. Hasanzadeh, M., Meybodi, M., Ebadzadeh, M.: Adaptive cooperative particle
swarm optimizer. Applied Intelligence 39(2), 397-420 (2013)
17. Omidvar, M.N., Li, X., Mei, Y., Yao, X.: Cooperative co-evolution with differential
grouping for large scale optimization. IEEE Trans. Evolutionary Computation
18(3), 378-393 (2014)
18. Trunfio, G.A.: Enhancing the firefly algorithm through a cooperative coevolutionary
approach: an empirical study on benchmark optimisation problems. IJBIC 6(2),
108-125 (2014)
19. Dorigo, M., Maniezzo, V., Colorni, A.: Ant system: optimization by a colony of
cooperating agents. IEEE Transactions on Systems, Man, and Cybernetics, Part
B 26(1), 29-41 (1996)
20. Doerner, K., Hartl, R.F., Reimann, M.: Cooperative ant colonies for optimizing
resource allocation in transportation. In: Boers, E.J.W., Gottlieb, J., Lanzi, P.L.,
Smith, R.E., Cagnoni, S., Hart, E., Raidl, G.R., Tijink, H. (eds.) EvoWorkshop
2001. LNCS, vol. 2037, pp. 70-79. Springer, Heidelberg (2001)
21. Eberhart, R., Kennedy, J.: A new optimizer using particle swarm theory. In: Proceedings
of the Sixth International Symposium on Micro Machine and Human
Science, pp. 39-43. IEEE (1995)
22. El-Abd, M., Kamel, M.S.: A Taxonomy of Cooperative Particle Swarm Optimizers.
International Journal of Computational Intelligence Research 4 (2008)
23. Kirkpatrick, S., Gelatt, C.D., Vecchi, M.P.: Optimization by simulated annealing.
Science 220, 671680 (1983)
24. Sanchez-Ante, G., Ramos, F., Frausto, J.: Cooperative simulated annealing for
path planning in multi-robot systems. In: Cair o, O., Cant
u, F.J. (eds.) MICAI
2000. LNCS, vol. 1793, pp. 148157. Springer, Heidelberg (2000)
25. Storn, R., Price, K.: Dierential evolution a simple and ecient heuristic for
global optimization over continuous spaces. Journal of Global Optimization 11(4),
341359 (1997)
26. Yang, X.-S.: Firey algorithms for multimodal optimization. In: Watanabe, O.,
Zeugmann, T. (eds.) SAGA 2009. LNCS, vol. 5792, pp. 169178. Springer, Heidel-
berg (2009)
27. Fister, I., Fister Jr., I., Yang, X.S., Brest, J.: A comprehensive review of rey
algorithms. Swarm and Evolutionary Computation (2013)
28. Fogel, L., Owens, A., Walsh, M.: Articial intelligence through simulated evolution.
Wiley, Chichester (1966)
Adaptation in Cooperative Coevolution 109

29. Salomon, R.: Reevaluating genetic algorithm performance under coordinate rota-
tion of benchmark functions - a survey of some theoretical and practical aspects
of genetic algorithms. BioSystems 39, 263278 (1995)
30. Auger, A., Hansen, N., Mauny, N., Ros, R., Schoenauer, M.: Bio-inspired contin-
uous optimization: The coming of age. Invited talk at CEC 2007, Piscataway, NJ,
USA (2007)
31. Blecic, I., Cecchini, A., Truno, G.A.: Fast and accurate optimization of a GPU-
accelerated ca urban model through cooperative coevolutionary particle swarms.
Procedia Computer Science 29C, 16311643 (2014)
32. Omidvar, M.N., Mei, Y., Li, X.: Eective decomposition of large-scale separable
continuous functions for cooperative co-evolutionary algorithms. In: Proceedings
of the IEEE Congress on Evolutionary Computation. IEEE (2014)
33. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press
(1998)
34. Yang, X.S.: Nature-Inspired Metaheuristic Algorithms. Luniver Press (2008)
35. Yang, X.S.: Firey algorithm, stochastic test functions and design optimisation.
International Journal of Bio-Inspired Computation 2(2), 7884 (2010)
36. Ray, T., Yao, X.: A cooperative coevolutionary algorithm with correlation based
adaptive variable partitioning. In: Proceedings of the IEEE Congress on Evolu-
tionary Computation, pp. 983989. IEEE (2009)
37. Tang, K., Yao, X., Suganthan, P., MacNish, C., Chen, Y., Chen, C., Yang, Z.:
Benchmark functions for the CEC 2008 special session and competition on large
scale global optimization (2008)
38. Tang, K., Li, X., Suganthan, P.N., Yang, Z., Weise, T.: Benchmark functions for
the CEC 2010 special session and competition on large-scale global optimization
(2010)
39. Gini, C.: Measurement of Inequality of Incomes. The Economic Journal 31(121),
124126 (1921)
Study of Lagrangian and Evolutionary Parameters in Krill Herd Algorithm

Gai-Ge Wang 1,*, Amir H. Gandomi 2, and Amir H. Alavi 3

1 School of Computer Science and Technology, Jiangsu Normal University, Xuzhou, Jiangsu 221116, China
2 Department of Civil Engineering, The University of Akron, Akron, OH 44325, USA
3 Department of Civil and Environmental Engineering, Engineering Building, Michigan State University, East Lansing, MI 48824, USA

Abstract. Krill Herd (KH) is a novel swarm-based intelligent optimization method developed through the idealization of krill swarm behavior. In the basic KH method, all of the movement parameters originate from real nature-driven data found in the literature. A parameter setting based on such data is not necessarily the best selection. In this work, a systematic method is presented for selecting the best parameter setting for the KH algorithm through an extensive study on an array of high-dimensional benchmark problems. An important finding is that the best performance of KH is obtained by setting the effective coefficient of the krill individual (Cbest), the food coefficient (Cfood), the maximum diffusion speed (Dmax), the crossover probability (Cr), and the mutation probability (Mu) parameters to 4.00, 4.25, 0.014, 0.225, and 0.025, respectively. This finding eliminates concerns regarding the optimal tuning of the KH algorithm in most of its future applications.

Keywords: Swarm intelligence, Krill Herd, benchmark, parameter setting.

1 Introduction
With the development of real-life engineering techniques, the related optimization problems are becoming more and more complex. Traditional methods are often inefficient at solving such complicated problems. To cope with this limitation, a variety of modern nature-inspired intelligent algorithms have been put forward and applied to optimization problems. Some of them include: differential evolution
(DE) [1,2], artificial bee colony (ABC) [3-5], genetic programming (GP) [6], cuckoo
search (CS) [7-12], biogeography-based optimization (BBO) [13-16], animal migration
optimization (AMO) [17], grey wolf optimizer (GWO) [18], harmony search (HS)
[19,20], interior search algorithm (ISA) [25], particle swarm optimization (PSO)
[21-24], firefly algorithm (FA) [26-28], charged system search (CSS) [29], and bat
algorithm (BA) [30-33]. It has been proven that these methods are superior to the
traditional optimization techniques for solving several challenging problems such as
image segmentation [34], constrained optimization [35], knapsack problem [36], feature selection [37], marker optimization [38], parameter estimation [39], self-potential data [40], neural network training [41,42], scheduling [43], and water, geotechnical and transport engineering [44,45].

* Corresponding author.

© Springer International Publishing Switzerland 2015
I. Fister and I. Fister Jr. (eds.), Adaptation and Hybridization in Computational Intelligence, Adaptation, Learning, and Optimization 18, DOI: 10.1007/978-3-319-14400-9_5
Swarm intelligence-based optimization techniques are a well-known class of intelligent algorithms. They search for optimal solutions through all of the individuals acting simultaneously rather than through each individual alone. Krill herd (KH) is a well-known swarm intelligence technique proposed by Gandomi and Alavi [46], based on the idealization of the communicating and foraging behaviors of krill swarms. The KH method has been studied by several researchers because it provides a relatively simple but effective framework for function optimization [46].
Wang et al. [37, 38] and Saremi et al. [39] proposed the chaotic KH (CKH) method by introducing chaos theory into the KH optimization process [47-49]. By adding a stud selection and crossover (SSC) operator into the KH method, stud krill herd (SKH) was put forward for global optimization [50]. Guo et al. [41] developed an improved KH method (IKH) by adding the exchange of information between top krill during the motion calculation process to generate better candidate solutions. Furthermore, the IKH method uses a new Lévy flight distribution [51] and an elitism scheme to update the KH motion calculation [52]. For the purpose of improving population diversity, a krill migration (KM) operator originating from BBO was added to the KH method [43]. The KM operator emphasizes exploitation and lets the krill cluster around the best solutions in the latter part of the optimization process [53]. Further, mutation operators originally used in HS and DE were combined with the approach to form the HS/KH [54] and DE/KH [55] methods, respectively. Li et al. [46] analyzed a deficiency of KH, namely that it cannot achieve a good trade-off between exploration and exploitation in the search process, and proposed an improved KH with a linearly decreasing step (KHLD) [56]. In addition, some other improved versions of the KH method have been proposed [57,58]. Furthermore, the KH method has been investigated on various truss design optimization problems [59], and has been introduced for solving engineering optimization problems: it was subsequently applied to six design problems [60] and to structural optimization [61]. Sur et al. [52] proposed a discrete KH method and used it to solve a graph-based network route optimization problem.
However, the parameter settings used in all of the above studies come from the basic KH method [46], in which the parameters originate from real data found in the literature. Clearly, relying solely on such data for parameter settings cannot be considered the best choice. This work presents a systematic procedure for selecting the best parameter settings for the KH algorithm. Accordingly, optimal values are found for five major parameters of KH, which notably enhance the search ability of the algorithm.

2 KH Algorithm

Krill herd (KH) [46] is a swarm intelligence method for solving optimization problems. KH is idealized from the behavior of krill swarms. The krill position in a two-dimensional surface is determined by three actions:

i. motion induced by other krill;
ii. foraging motion; and
iii. physical diffusion.

In KH, the following Lagrangian formulation is used:

$$\frac{dX_i}{dt} = N_i + F_i + D_i \qquad (1)$$

where $N_i$, $F_i$ and $D_i$ are the three corresponding actions for krill $i$.

2.1 Motion Induced by Other Krill Individuals

The direction of induced motion, $\alpha_i$, is estimated from a target effect, a local effect, and a repulsive effect. In KH, the induced motion is defined as:

$$N_i^{new} = N^{max}\alpha_i + \omega_n N_i^{old} \qquad (2)$$

where

$$\alpha_i = \alpha_i^{local} + \alpha_i^{target} \qquad (3)$$

and $N^{max}$ is the maximum induced speed, $\omega_n$ is its inertia weight in [0, 1], $N_i^{old}$ is the last induced motion; $\alpha_i^{local}$ and $\alpha_i^{target}$ are the local effect and the target direction effect, provided by the neighbors and the best krill, respectively. As in the basic KH method, we set $N^{max}$ to 0.01 (m s⁻¹) in our study.

In KH, the effect of the neighbors is determined as:

$$\alpha_i^{local} = \sum_{j=1}^{NN} \hat{K}_{ij}\,\hat{X}_{ij} \qquad (4)$$

$$\hat{X}_{ij} = \frac{X_j - X_i}{\|X_j - X_i\| + \varepsilon} \qquad (5)$$

$$\hat{K}_{ij} = \frac{K_i - K_j}{K^{worst} - K^{best}} \qquad (6)$$

where $K^{worst}$ and $K^{best}$ are the fitness values of the worst and the best krill; $K_i$ is the fitness of the $i$-th krill; $K_j$ is the fitness of the $j$-th neighbor ($j = 1, 2, \ldots, NN$, where $NN$ is the number of neighbors); and $X$ represents the related positions.

For choosing the neighbors, different strategies can be used. In the KH method, a sensing distance ($d_s$) is defined around each individual.

The known target for each krill is the position with the lowest fitness, and its effect can be defined as:

$$\alpha_i^{target} = C^{best}\,\hat{K}_{i,best}\,\hat{X}_{i,best} \qquad (7)$$

where $C^{best}$ is the effective coefficient. Herein, the value of $C^{best}$ is defined as:

$$C^{best} = i \cdot rand \cdot \left(1 + \frac{I}{I_{max}}\right) \qquad (8)$$

where $rand$ is a random value, $i$ is a number between 0 and 5, $I$ is the current iteration number and $I_{max}$ is the maximum number of iterations.
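As an illustration, the induced-motion term of Eqs. (2)-(8) can be sketched in Python. This is a minimal sketch, not the authors' implementation: the neighbor list is assumed to be given externally (e.g., from the sensing distance $d_s$), the values of `omega_n`, `coeff_i` and `eps` are illustrative assumptions, and fitness is minimized.

```python
import numpy as np

def induced_motion(i, X, K, neighbors, N_old, it, it_max,
                   n_max=0.01, omega_n=0.5, coeff_i=4.0, eps=1e-10):
    """Motion induced by other krill, Eqs. (2)-(8) (illustrative sketch)."""
    k_best, k_worst = K.min(), K.max()          # minimization: best = lowest fitness
    spread = (k_worst - k_best) + eps

    # Local effect of the neighbors, Eqs. (4)-(6)
    alpha_local = np.zeros_like(X[i])
    for j in neighbors:
        x_hat = (X[j] - X[i]) / (np.linalg.norm(X[j] - X[i]) + eps)   # Eq. (5)
        k_hat = (K[i] - K[j]) / spread                                # Eq. (6)
        alpha_local += k_hat * x_hat                                  # Eq. (4)

    # Target effect of the best krill, Eqs. (7)-(8)
    best = np.argmin(K)
    c_best = coeff_i * np.random.rand() * (1 + it / it_max)           # Eq. (8)
    x_hat_b = (X[best] - X[i]) / (np.linalg.norm(X[best] - X[i]) + eps)
    k_hat_b = (K[i] - K[best]) / spread
    alpha_target = c_best * k_hat_b * x_hat_b                         # Eq. (7)

    alpha = alpha_local + alpha_target                                # Eq. (3)
    return n_max * alpha + omega_n * N_old                            # Eq. (2)
```

Note that the sign of $\hat{K}_{ij}$ makes a krill with worse fitness than its neighbor move towards that neighbor, and vice versa.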

2.2 Foraging Motion

The foraging motion is influenced by two main factors: the food location, and the previous experience of the food location. For the $i$-th krill individual, it can be expressed as:

$$F_i = V_f \beta_i + \omega_f F_i^{old} \qquad (9)$$

where

$$\beta_i = \beta_i^{food} + \beta_i^{best} \qquad (10)$$

and $V_f$ is the foraging speed, $\omega_f$ is its inertia weight, $F_i^{old}$ is the last foraging motion, $\beta_i^{food}$ is the food attraction and $\beta_i^{best}$ is the effect of the best position of the $i$-th krill so far. In our study, we set $V_f$ to 0.02 [46].

The food attraction for the $i$-th krill can be determined as:

$$\beta_i^{food} = C^{food}\,\hat{K}_{i,food}\,\hat{X}_{i,food} \qquad (11)$$

where $C^{food}$ is the food coefficient. In the current work, $C^{food}$ is defined as

$$C^{food} = j \cdot rand \cdot \left(1 - \frac{I}{I_{max}}\right) \qquad (12)$$

where $rand$ is a random value and $j$ is a number between 0 and 5.
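Eqs. (9)-(12) can be sketched under the same illustrative assumptions. The virtual food position `X_food` and its fitness `K_food` are taken as inputs here, and `beta_best` is a simplified stand-in (using the current population best) for the attraction of krill $i$'s best-visited position; `omega_f` and `coeff_j` are assumed values.

```python
import numpy as np

def foraging_motion(i, X, K, X_food, K_food, F_old, it, it_max,
                    v_f=0.02, omega_f=0.5, coeff_j=4.25, eps=1e-10):
    """Foraging motion, Eqs. (9)-(12) (illustrative sketch)."""
    k_best, k_worst = K.min(), K.max()
    spread = (k_worst - k_best) + eps

    # Food attraction, Eqs. (11)-(12)
    c_food = coeff_j * np.random.rand() * (1 - it / it_max)           # Eq. (12)
    x_hat_f = (X_food - X[i]) / (np.linalg.norm(X_food - X[i]) + eps)
    k_hat_f = (K[i] - K_food) / spread
    beta_food = c_food * k_hat_f * x_hat_f                            # Eq. (11)

    # Effect of the best krill position (simplified stand-in for beta_i^best)
    b = np.argmin(K)
    x_hat_b = (X[b] - X[i]) / (np.linalg.norm(X[b] - X[i]) + eps)
    beta_best = ((K[i] - K[b]) / spread) * x_hat_b

    beta = beta_food + beta_best                                      # Eq. (10)
    return v_f * beta + omega_f * F_old                               # Eq. (9)
```

Because of the $(1 - I/I_{max})$ factor in Eq. (12), the food attraction fades as the run progresses, shifting the balance from exploration to exploitation.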
2.3 Physical Diffusion

Physical diffusion is essentially a random process. It can be formulated as follows:

$$D_i = D^{max}\,\delta \qquad (13)$$

where $D^{max}$ is the maximum diffusion speed and $\delta$ is a random directional vector whose components are random values in [-1, 1]. In order to accelerate the search, another term similar to a geometrical annealing schedule is added to Eq. (13):

$$D_i = k \left(1 - \frac{I}{I_{max}}\right)\delta \qquad (14)$$

Therefore, $D^{max}$ can be given as

$$D^{max} = k \left(1 - \frac{I}{I_{max}}\right) \qquad (15)$$

where $k$ is a number between 0 and 0.02.
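The diffusion term of Eqs. (13)-(15) is a damped random walk and fits in a few lines; the default `k = 0.014` below is only an illustrative choice within the stated [0, 0.02] range.

```python
import numpy as np

def physical_diffusion(dim, it, it_max, k=0.014, rng=np.random):
    """Physical diffusion, Eqs. (13)-(15): a random walk damped over iterations."""
    delta = rng.uniform(-1.0, 1.0, size=dim)    # random directional vector, Eq. (13)
    d_max = k * (1 - it / it_max)               # annealing schedule, Eq. (15)
    return d_max * delta                        # Eq. (14)
```

At the final iteration the term vanishes, so late-stage moves are driven purely by the induced and foraging motions.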

2.4 Main Procedure of the KH Algorithm

In general, the three motions above move all of the krill individuals towards the best position. The krill position during the interval $t$ to $t + \Delta t$ is given as:

$$X_i(t + \Delta t) = X_i(t) + \Delta t\,\frac{dX_i}{dt} \qquad (16)$$

More details about the three main motions and the KH algorithm can be found in [46].
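Given the three motion terms, the Euler-style update of Eq. (16) is a one-liner; the function name and calling convention here are our own illustration, not the published code.

```python
import numpy as np

def kh_position_update(X_i, N_i, F_i, D_i, dt):
    """Eq. (16): Euler step of the Lagrangian model dX_i/dt = N_i + F_i + D_i."""
    return X_i + dt * (N_i + F_i + D_i)
```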

2.5 Genetic Operators

In the KH method, genetic reproduction mechanisms are combined with the three motions. The combined genetic operators are crossover and mutation, which have been widely used in classical EAs (evolutionary algorithms) such as DE and GA.

2.5.1 Crossover
In this study, a crossover probability, Cr, controls the crossover operator, which can be implemented in a binomial or exponential way. By generating a random number, the $m$-th component of $X_i$, $x_{i,m}$, is updated as:

$$x_{i,m} = \begin{cases} x_{r,m} & rand_{i,m} < Cr \\ x_{i,m} & \text{else} \end{cases} \qquad (17)$$

2.5.2 Mutation
Mutation is also widely used in EAs such as ES (evolution strategies) and DE, and is controlled by a mutation probability (Mu). The mutation operator used in the KH method is expressed as:

$$x_{i,m} = \begin{cases} x_{gbest,m} + \mu\,(x_{p,m} - x_{q,m}) & rand_{i,m} < Mu \\ x_{i,m} & \text{else} \end{cases} \qquad (18)$$

where $p, q \in \{1, 2, \ldots, i-1, i+1, \ldots, K\}$ and $\mu$ is a number in [0, 1].
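Eqs. (17) and (18) can be sketched as component-wise operators on a population matrix `X` (one row per krill). The random-partner choice and the scaling factor `mu` follow the description above; the helper names and default values are ours, for illustration only.

```python
import numpy as np

def crossover(X, i, Cr=0.225, rng=np.random):
    """Binomial crossover, Eq. (17): mix components of krill i with a random krill r."""
    n, dim = X.shape
    r = rng.choice([j for j in range(n) if j != i])
    mask = rng.rand(dim) < Cr          # which components come from krill r
    child = X[i].copy()
    child[mask] = X[r, mask]
    return child

def mutation(X, i, gbest, Mu=0.025, mu=0.5, rng=np.random):
    """Mutation, Eq. (18): replace components with gbest plus a scaled difference."""
    n, dim = X.shape
    others = [j for j in range(n) if j != i]
    p, q = rng.choice(others, size=2, replace=False)
    mask = rng.rand(dim) < Mu          # which components are mutated
    child = X[i].copy()
    child[mask] = gbest[mask] + mu * (X[p, mask] - X[q, mask])
    return child
```

With `Cr = 0` or `Mu = 0` the operators leave the krill unchanged, which is why the parametric study below sweeps both probabilities from 0 upwards.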

3 Parametric Study

The main goal of this study is to find optimal values for five main parameters of the KH algorithm (i.e., Cbest, Cfood, Dmax, Cr and Mu). In this section, the settings of these parameters are analyzed in detail through experiments conducted on the benchmark functions listed in Table 1. Unless otherwise noted, all the implementations are conducted under the same conditions as in [28]. More detailed descriptions of all the benchmarks can be found in [62,13,63]. Note that the dimension of the functions is twenty in the current work.

Table 1. Benchmark functions

No.  Name                Definition
F01  Ackley              $f(x) = 20 + e - 20\,e^{-0.2\sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2}} - e^{\frac{1}{n}\sum_{i=1}^{n}\cos(2\pi x_i)}$
F02  Alpine              $f(x) = \sum_{i=1}^{n} |x_i \sin(x_i) + 0.1\,x_i|$
F03  Brown               $f(x) = \sum_{i=1}^{n-1} \left[ (x_i^2)^{x_{i+1}^2+1} + (x_{i+1}^2)^{x_i^2+1} \right]$
F04  Dixon & Price       $f(x) = (x_1 - 1)^2 + \sum_{i=2}^{n} i\,(2x_i^2 - x_{i-1})^2$
F05  Fletcher-Powell     $f(x) = \sum_{i=1}^{n} (A_i - B_i)^2$, $A_i = \sum_{j=1}^{n} (a_{ij}\sin\alpha_j + b_{ij}\cos\alpha_j)$, $B_i = \sum_{j=1}^{n} (a_{ij}\sin x_j + b_{ij}\cos x_j)$
F06  Griewank            $f(x) = \sum_{i=1}^{n} \frac{x_i^2}{4000} - \prod_{i=1}^{n} \cos\left(\frac{x_i}{\sqrt{i}}\right) + 1$
F07  Holzman 2 function  $f(x) = \sum_{i=1}^{n} i\,x_i^4$
F08  Levy 8              $f(x) = \sin^2(\pi y_1) + \sum_{i=1}^{n-1} (y_i - 1)^2\left(1 + 10\sin^2(\pi y_i + 1)\right) + (y_n - 1)^2\left(1 + 10\sin^2(2\pi y_n)\right)$, $y_i = 1 + (x_i - 1)/4$
F09  Penalty #1          $f(x) = \frac{\pi}{30}\left\{10\sin^2(\pi y_1) + \sum_{i=1}^{n-1} (y_i - 1)^2\left[1 + 10\sin^2(\pi y_{i+1})\right] + (y_n - 1)^2\right\} + \sum_{i=1}^{n} u(x_i, 10, 100, 4)$, $y_i = 1 + 0.25(x_i + 1)$
F10  Penalty #2          $f(x) = 0.1\left\{\sin^2(3\pi x_1) + \sum_{i=1}^{n-1} (x_i - 1)^2\left[1 + \sin^2(3\pi x_{i+1})\right] + (x_n - 1)^2\left[1 + \sin^2(2\pi x_n)\right]\right\} + \sum_{i=1}^{n} u(x_i, 5, 100, 4)$
F11  Perm #1             $f(x) = \sum_{k=1}^{n}\left[\sum_{i=1}^{n} (i^k + 0.5)\left(\left(\frac{x_i}{i}\right)^k - 1\right)\right]^2$
F12  Powell              $f(x) = \sum_{i=1}^{n/4}\left[(x_{4i-3} + 10x_{4i-2})^2 + 5(x_{4i-1} - x_{4i})^2 + (x_{4i-2} - 2x_{4i-1})^4 + 10(x_{4i-3} - x_{4i})^4\right]$
F13  Quartic with noise  $f(x) = \sum_{i=1}^{n}\left(i\,x_i^4 + U(0,1)\right)$
F14  Rastrigin           $f(x) = 10n + \sum_{i=1}^{n}\left(x_i^2 - 10\cos(2\pi x_i)\right)$
F15  Rosenbrock          $f(x) = \sum_{i=1}^{n-1}\left[100\,(x_{i+1} - x_i^2)^2 + (x_i - 1)^2\right]$
F16  Schwefel 2.26       $f(x) = 418.9829\,D - \sum_{i=1}^{D} x_i \sin\left(|x_i|^{1/2}\right)$
F17  Schwefel 1.2        $f(x) = \sum_{i=1}^{n}\left(\sum_{j=1}^{i} x_j\right)^2$
F18  Schwefel 2.22       $f(x) = \sum_{i=1}^{n} |x_i| + \prod_{i=1}^{n} |x_i|$
F19  Schwefel 2.21       $f(x) = \max_i \{|x_i|,\ 1 \le i \le n\}$
F20  Sphere              $f(x) = \sum_{i=1}^{n} x_i^2$
F21  Step                $f(x) = 6n + \sum_{i=1}^{n} \lfloor x_i \rfloor$
F22  Sum function        $f(x) = \sum_{i=1}^{n} i\,x_i^2$
F23  Zakharov            $f(x) = \sum_{i=1}^{n} x_i^2 + \left(\sum_{i=1}^{n} 0.5\,i\,x_i\right)^2 + \left(\sum_{i=1}^{n} 0.5\,i\,x_i\right)^4$
F24  Wavy1               $f(x) = \sum_{i=1}^{n} \left|2(x_i - 24) + (x_i - 24)\sin(x_i - 24)\right|$
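As a concreteness check, two of the benchmarks above can be coded directly from their definitions (F01 Ackley and F14 Rastrigin; both have a minimum of 0 at the origin):

```python
import numpy as np

def ackley(x):
    """F01 Ackley: f(0) = 0."""
    n = len(x)
    return (20 + np.e
            - 20 * np.exp(-0.2 * np.sqrt(np.sum(x**2) / n))
            - np.exp(np.sum(np.cos(2 * np.pi * x)) / n))

def rastrigin(x):
    """F14 Rastrigin: f(0) = 0."""
    n = len(x)
    return 10 * n + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))
```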

3.1 The Parametric Study of Cbest, Cfood, and Dmax

The performance of KH with different Cbest, Cfood, and Dmax values is tested on the twenty-four optimization problems by studying the values of i, j, and k in Eqs. (8), (12) and (15), respectively (see Fig. 1). Gandomi and Alavi [46] proposed four different kinds of KH. Here, KH I (without genetic operators) is selected for the following experiments.
[Fig. 1 (flowchart): Start → Initialization → Fitness evaluation → Three motions (motion induced by other individuals with different i as in Eq. (8); foraging motion with different j as in Eq. (12); physical diffusion with different k as in Eq. (15)) → Update the krill individual positions → termination check (loop back if not met) → Output the best solution → End.]

Fig. 1. Flowchart of parameter study with different i, j, and k

As mentioned before, the ranges of i, j, and k are set to [0, 5], [0, 5], and [0, 0.02], sampled with intervals of 0.25, 0.25, and 0.002, respectively. The settings of the other parameters used in the KH method are as given in [46].

It is well known that the efficiency of metaheuristic methods in finding the best solutions depends on a stochastic distribution. Therefore, 100 trials are carried out for the KH method with each parameter combination on each test problem in order to determine the best Cbest, Cfood, and Dmax. The obtained function values are presented in Tables 2 and 3. In these tables, f, i, j, and k in the first row represent the function fitness and the values of i, j, and k.
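The sweep protocol described above amounts to a grid search. The sketch below assumes a hook `run_kh(objective, i, j, k)` that performs one independent run of KH I and returns its final fitness; this hook is our placeholder, not part of the published code.

```python
import itertools
import numpy as np

def parameter_sweep(run_kh, objective, trials=100):
    """Grid search over i, j, k; returns the (i, j, k) combination with the
    lowest mean final fitness over `trials` independent runs (sketch)."""
    i_grid = np.arange(0.0, 5.0 + 1e-9, 0.25)
    j_grid = np.arange(0.0, 5.0 + 1e-9, 0.25)
    k_grid = np.arange(0.0, 0.02 + 1e-9, 0.002)
    best_params, best_mean = None, float("inf")
    for i, j, k in itertools.product(i_grid, j_grid, k_grid):
        mean_f = np.mean([run_kh(objective, i, j, k) for _ in range(trials)])
        if mean_f < best_mean:
            best_params, best_mean = (i, j, k), mean_f
    return best_params, best_mean
```

The same protocol, repeated per benchmark, yields one winning (i, j, k) per function; Tables 2 and 3 report those winners and their averages.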
It can be seen in Table 2 that, on average, the KH method with i=3.9375, j=4.25, and k=0.0151 performs best among the various KH configurations. Similarly, it can be observed from Table 3 that, in terms of the best function values, KH performs best with i=3.8750, j=4.25, and k=0.0123.

Referring to Tables 2-3, it can be concluded that the best performance of KH is achieved by setting i, j, and k to 4.00, 4.25 and 0.014, respectively. Consequently, when studying the parameters Cr and Mu, i=4.00, j=4.25, and k=0.014 are adopted in Cbest, Cfood and Dmax, respectively.

Table 2. Mean function values with different i, j, and k

f i j k
F01 5.17 2.75 4.75 0.020
F02 3.63 4.25 5.00 0.014
F03 38.00 0.00 0.00 0.000
F04 66.31 4.50 5.00 0.020
F05 3.6E5 3.25 3.50 0.018
F06 2.74 3.75 5.00 0.020
F07 15.30 5.00 4.25 0.012
F08 1.63 4.75 4.50 0.012
F09 4.62 4.75 5.00 0.008
F10 24.69 4.00 4.50 0.020
F11 7.4E29 5.00 4.75 0.018
F12 27.04 4.50 5.00 0.020
F13 0.01 5.00 4.50 0.020
F14 80.62 3.00 5.00 0.018
F15 32.20 5.00 3.75 0.020
F16 3.3E3 4.75 3.00 0.016
F17 396.37 5.00 4.25 0.020
F18 57.71 1.50 4.00 0.008
F19 7.11 3.00 3.50 0.008
F20 0.81 4.00 4.25 0.016
F21 298.00 4.75 5.00 0.014
F22 22.24 5.00 4.00 0.020
F23 135.46 2.00 3.25 0.012
F24 406.85 5.00 4.50 0.008
Average -- 3.9375 4.25 0.0151

Table 3. Best function values with different i, j, and k

f i j k
F01 3.65 2.75 4.75 0.020
F02 1.01 3.50 4.75 0.010
F03 11.19 4.75 4.75 0.014
F04 16.88 4.25 4.50 0.014
F05 1.1E5 4.75 2.25 0.016
F06 1.84 4.00 5.00 0.018
F07 0.99 4.25 4.75 0.010
F08 0.73 3.25 3.25 0.014
F09 1.18 4.00 4.75 0.000
F10 7.37 5.00 4.50 0.010
F11 5.0E22 0.00 4.50 0.008
F12 8.39 4.00 4.50 0.020
F13 3.55E-4 3.50 4.50 0.016
F14 41.25 5.00 4.25 0.014
F15 24.47 4.50 4.00 0.012
F16 2.4E3 5.00 4.00 0.014
F17 108.30 4.50 4.50 0.016
F18 23.62 2.25 4.75 0.014
F19 4.67 5.00 4.00 0.012
F20 0.33 4.75 4.25 0.014
F21 95.00 4.75 5.00 0.014
F22 8.59 4.00 4.25 0.016
F23 48.30 4.00 2.25 0.006
F24 277.27 1.25 4.00 0.000
Average -- 3.8750 4.25 0.0123

3.2 The Parametric Study of Cr and Mu

The performance of KH with different Cr and Mu values is tested on the twenty-four optimization problems. As mentioned before, Gandomi and Alavi [46] proposed four different kinds of KH. For the parametric study of Cr and Mu, KH IV (with the two genetic operators, crossover and mutation) is selected. A simplified representation of the parameter study with different Cr and Mu is shown in Fig. 2.
[Fig. 2 (flowchart): Start → Initialization → Fitness evaluation → Implement the three motions with the best i, j, and k as in Eqs. (8), (12) and (15) → Implement the genetic operator(s) with different Cr and Mu as in Eqs. (17)-(18) → Update the krill individual positions → termination check (loop back if not met) → Output the best solution → End.]

Fig. 2. Flowchart of parameter study with different Cr and Mu

The ranges of Cr and Mu are both set to [0, 0.5], sampled with an interval of 0.025. Similarly, 100 trials are carried out for the KH method with each parameter combination on each test problem in order to determine the best Cr and Mu. The obtained function values are recorded in Tables 4 and 5. In these tables, f, Cr and Mu in the first row represent the function fitness and the values of Cr and Mu.

As seen in Table 4, KH performs best when the average Cr and Mu are 0.2375 and 0.0271, respectively. From Table 5, in terms of the best function values, KH performs best when the average Cr and Mu are 0.224 and 0.025, respectively.

Based on these experimental results, we conclude that Cr=0.225 and Mu=0.025 are the optimal values for the performance enhancement of KH.

Table 4. Mean function values with different Cr and Mu

f Cr Mu
F01 7.83 0.225 0.000
F02 4.38 0.200 0.000
F03 157.73 0.150 0.175
F04 356.31 0.250 0.000
F05 5.1E5 0.100 0.050
F06 7.24 0.275 0.000
F07 58.78 0.150 0.000
F08 3.63 0.375 0.000
F09 8.01 0.030 0.000
F10 4.3E3 0.500 0.000
F11 2.2E46 0.225 0.000
F12 85.76 0.100 0.000
F13 0.01 0.100 0.000
F14 110.92 0.175 0.000
F15 51.01 0.125 0.000
F16 4.3E3 0.275 0.425
F17 602.68 0.425 0.000
F18 35.52 0.225 0.000
F19 11.44 0.050 0.000
F20 1.69 0.450 0.000
F21 517 0.275 0.000
F22 65.61 0.250 0.000
F23 265.79 0.275 0.000
F24 560.69 0.500 0.000
Average -- 0.2375 0.0271

Table 5. Best function values with different Cr and Mu

f Cr Mu
F01 6.38 0.350 0.000
F02 3.71 0.175 0.000
F03 23.96 0.100 0.050
F04 184.92 0.075 0.000
F05 2.6E5 0.375 0.225
F06 5.49 0.125 0.000
F07 16.48 0.150 0.000
F08 1.60 0.025 0.000
F09 3.58 0.275 0.000
F10 23.78 0.300 0.000
F11 6.7E41 0.125 0.075
F12 28.69 0.300 0.000
F13 0.00 0.075 0.000
F14 77.34 0.150 0.000
F15 35.17 0.175 0.000
F16 3.5E3 0.425 0.450
F17 195.04 0.425 0.000
F18 26.72 0.225 0.000
F19 8.49 0.375 0.000
F20 0.96 0.325 0.000
F21 400 0.275 0.000
F22 23.99 0.225 0.000
F23 106.70 0.275 0.000
F24 424.59 0.050 0.000
Average -- 0.224 0.025

4 Conclusion

The parameters used in the basic KH method originate from experimental data gathered from the literature. Using such information is not always the best choice. The main goal of this study was to derive optimal values for the basic KH parameters for future implementations of the algorithm. To this aim, an extensive parametric analysis was carried out using an array of high-dimensional benchmark problems. The performance of KH with different Cbest, Cfood, Dmax, Cr and Mu values was studied on twenty-four optimization problems. The KH algorithm without genetic operators was selected to study the Cbest, Cfood and Dmax parameters, while the parametric analysis of Cr and Mu was done using the KH algorithm with the two genetic operators, crossover and mutation. The major finding is that KH achieves the best performance on most high-dimensional test functions by setting the Lagrangian parameters i, j and k in Cbest, Cfood and Dmax to 4.00, 4.25 and 0.014, respectively. The best genetic parameters, Cr and Mu, are found to be 0.225 and 0.025, respectively. This finding eliminates concerns regarding the optimal tuning of the KH algorithm in most of its future applications.

Acknowledgements. This work was supported by Research Fund for the Doctoral
Program of Jiangsu Normal University (No. 13XLR041).

References
1. Storn, R., Price, K.: Differential evolution-a simple and efficient heuristic for global
optimization over continuous spaces. J. Global Optim. 11(4), 341359 (1997),
doi:10.1023/A:1008202821328
2. Gandomi, A.H., Yang, X.-S., Talatahari, S., Deb, S.: Coupled eagle strategy and differential
evolution for unconstrained and constrained global optimization. Comput. Math.
Appl. 63(1), 191200 (2012), doi:10.1016/j.camwa.2011.11.010
3. Karaboga, D., Basturk, B.: A powerful and efficient algorithm for numerical function
optimization: artificial bee colony (ABC) algorithm. J. Global Optim. 39(3), 459471
(2007), doi:10.1007/s10898-007-9149-x
4. Li, X., Yin, M.: Self-adaptive constrained artificial bee colony for constrained numerical
optimization. Neural Comput. Appl. 24(3-4), 723734 (2012),
doi:10.1007/s00521-012-1285-7
5. Fister, I., Fister Jr., I., Zumer, J.B.: Memetic artificial bee colony algorithm for large-scale
global optimization. In: IEEE Congress on Evolutionary Computation (CEC 2012),
Brisbane, Australia, June 10-15, pp. 18. IEEE (2012), doi:10.1109/CEC.2012.6252938
6. Gandomi, A.H., Alavi, A.H.: Multi-stage genetic programming: A new strategy to nonlinear
system modeling. Inf. Sci. 181(23), 52275239 (2011), doi:10.1016/j.ins.2011.07.026
7. Gandomi, A.H., Yang, X.-S., Alavi, A.H.: Cuckoo search algorithm: a metaheuristic
approach to solve structural optimization problems. Eng. Comput. 29(1), 1735 (2013),
doi:10.1007/s00366-011-0241-y
8. Yang, X.S., Deb, S.: Cuckoo search via Lvy flights. In: Proceeding of World Congress on
Nature & Biologically Inspired Computing (NaBIC 2009), Coimbatore, India, pp. 210214.
IEEE Publications, USA (2009)
126 G.-G. Wang, A.H. Gandomi, and A.H. Alavi

9. Gandomi, A.H., Talatahari, S., Yang, X.-S., Deb, S.: Design optimization of truss structures
using cuckoo search algorithm. Struct. Des. Tall Spec. 22(17), 13301349 (2013),
doi:10.1002/tal.1033
10. Li, X., Wang, J., Yin, M.: Enhancing the performance of cuckoo search algorithm using
orthogonal learning method. Neural Comput. Appl. 24(6), 12331247 (2013),
doi:10.1007/s00521-013-1354-6
11. Fister Jr, I., Yang, X.-S., Fister, D., Fister, I.: Cuckoo Search: A Brief Literature Review. In:
Yang, X.-S. (ed.) Cuckoo Search and Firefly Algorithm. SCI, vol. 516, pp. 4962. Springer,
Heidelberg (2014)
12. Fister Jr, I., Fister, D., Fister, I.: A comprehensive review of cuckoo search: variants and
hybrids. Int. J. Math. Model. Numer. Optim. 4(4), 387409 (2013)
13. Simon, D.: Biogeography-based optimization. IEEE Trans. Evolut. Comput. 12(6),
702713 (2008), doi:10.1109/TEVC.2008.919004
14. Li, X., Wang, J., Zhou, J., Yin, M.: A perturb biogeography based optimization with
mutation for global numerical optimization. Appl. Math. Comput. 218(2), 598609 (2011),
doi:10.1016/j.amc.2011.05.110
15. Li, X., Yin, M.: Multi-operator based biogeography based optimization with mutation for
global numerical optimization. Comput. Math. Appl. 64(9), 28332844 (2012),
doi:10.1016/j.camwa.2012.04.015
16. Saremi, S., Mirjalili, S., Lewis, A.: Biogeography-based optimisation with chaos. Neural
Comput. Appl. (2014), doi:10.1007/s00521-014-1597-x
17. Li, X., Zhang, J., Yin, M.: Animal migration optimization: an optimization algorithm
inspired by animal migration behavior. Neural Comput. Appl. (2013),
doi:10.1007/s00521-013-1433-8
18. Mirjalili, S., Mirjalili, S.M., Lewis, A.: Grey wolf optimizer. Adv. Eng. Softw. 69, 4661
(2014), doi:10.1016/j.advengsoft.2013.12.007
19. Geem, Z.W., Kim, J.H., Loganathan, G.V.: A new heuristic optimization algorithm:
harmony search. Simulation 76(2), 6068 (2001), doi:10.1177/003754970107600201
20. Wang, G., Guo, L., Duan, H., Wang, H., Liu, L., Shao, M.: Hybridizing harmony search
with biogeography based optimization for global numerical optimization. J. Comput. Theor.
Nanos. 10(10), 23182328 (2013), doi:10.1166/jctn.2013.3207
21. Kennedy, J., Eberhart, R.: Particle swarm optimization. Paper presented at the Proceeding of
the IEEE International Conference on Neural Networks, Perth, Australia, November
27-December 1 (1995)
22. Talatahari, S., Kheirollahi, M., Farahmandpour, C., Gandomi, A.H.: A multi-stage particle
swarm for optimum design of truss structures. Neural Comput. Appl. 23(5), 12971309
(2013), doi:10.1007/s00521-012-1072-5
23. Mirjalili, S., Lewis, A.: S-shaped versus V-shaped transfer functions for binary particle
swarm optimization. Swarm Evol. Comput. 9, 114 (2013),
doi:10.1016/j.swevo.2012.09.002
24. Mirjalili, S., Wang, G.-G., Coelho, L.S.: Binary optimization using hybrid particle swarm
optimization and gravitational search algorithm. Neural Comput. Appl. (2014),
doi:10.1007/s00521-014-1629-6
25. Gandomi, A.H.: Interior Search Algorithm (ISA): A Novel Approach for Global
Optimization. ISA Trans. (2014), doi:10.1016/j.isatra.2014.03.018
26. Yang, X.S.: Firefly algorithm, stochastic test functions and design optimisation. Int J of
Bio-Inspired Computation 2(2), 7884 (2010)
27. Fister, I., Fister Jr., I., Yang, X.-S., Brest, J.: A comprehensive review of firefly algorithms.
Swarm Evol. Comput. 13, 3446 (2013), doi:10.1016/j.swevo.2013.06.001
28. Wang, G.-G., Guo, L., Duan, H., Wang, H.: A new improved firefly algorithm for global
numerical optimization. J. Comput. Theor. Nanos. 11(2), 477485 (2014),
doi:10.1166/jctn.2014.3383
Study of Lagrangian and Evolutionary Parameters in Krill Herd Algorithm 127

29. Kaveh, A., Talatahari, S.: A novel heuristic optimization method: charged system search. Acta Mech. 213(3-4), 267–289 (2010), doi:10.1007/s00707-009-0270-4
30. Gandomi, A.H., Yang, X.-S., Alavi, A.H., Talatahari, S.: Bat algorithm for constrained optimization tasks. Neural Comput. Appl. 22(6), 1239–1255 (2013), doi:10.1007/s00521-012-1028-9
31. Yang, X.S., Gandomi, A.H.: Bat algorithm: a novel approach for global engineering optimization. Eng. Computation 29(5), 464–483 (2012), doi:10.1108/02644401211235834
32. Fister Jr., I., Fong, S., Brest, J., Fister, I.: A Novel Hybrid Self-Adaptive Bat Algorithm. Sci. World J. 2014, 1–12 (2014), doi:10.1155/2014/709738
33. Mirjalili, S., Mirjalili, S.M., Yang, X.-S.: Binary bat algorithm. Neural Comput. Appl. (2013), doi:10.1007/s00521-013-1525-5
34. Zhang, Y., Huang, D., Ji, M., Xie, F.: Image segmentation using PSO and PCM with Mahalanobis distance. Expert Syst. Appl. 38(7), 9036–9040 (2011), doi:10.1016/j.eswa.2011.01.041
35. Chen, C.-H., Yang, S.-Y.: Neural fuzzy inference systems with knowledge-based cultural differential evolution for nonlinear system control. Inf. Sci. (2014), doi:10.1016/j.ins.2014.02.071
36. Mukherjee, R., Patra, G.R., Kundu, R., Das, S.: Cluster-based differential evolution with Crowding Archive for niching in dynamic environments. Inf. Sci. (2014), doi:10.1016/j.ins.2013.11.025
37. Li, X., Yin, M.: Multiobjective binary biogeography based optimization for feature selection using gene expression data. IEEE Trans. Nanobiosci. 12(4), 343–353 (2013), doi:10.1109/TNB.2013.2294716
38. Fister, I., Mernik, M., Filipič, B.: A hybrid self-adaptive evolutionary algorithm for marker optimization in the clothing industry. Appl. Soft Comput. 10(2), 409–422 (2010), doi:10.1016/j.asoc.2009.08.001
39. Li, X., Yin, M.: Parameter estimation for chaotic systems by hybrid differential evolution algorithm and artificial bee colony algorithm. Nonlinear Dynam. (2014), doi:10.1007/s11071-014-1273-9
40. Li, X., Yin, M.: Application of Differential Evolution Algorithm on Self-Potential Data. PLoS ONE 7(12), e51199 (2012), doi:10.1371/journal.pone.0051199
41. Mirjalili, S., Mohd Hashim, S.Z., Moradian Sardroudi, H.: Training feedforward neural networks using hybrid particle swarm optimization and gravitational search algorithm. Appl. Math. Comput. 218(22), 11125–11137 (2012), doi:10.1016/j.amc.2012.04.069
42. Mirjalili, S., Mirjalili, S.M., Lewis, A.: Let a biogeography-based optimizer train your Multi-Layer Perceptron. Inf. Sci. 269, 188–209 (2014), doi:10.1016/j.ins.2014.01.038
43. Li, X., Yin, M.: An opposition-based differential evolution algorithm for permutation flow shop scheduling based on diversity measure. Adv. Eng. Softw. 55, 10–31 (2013), doi:10.1016/j.advengsoft.2012.09.003
44. Yang, X.S., Gandomi, A.H., Talatahari, S., Alavi, A.H.: Metaheuristics in Water, Geotechnical and Transport Engineering. Elsevier, Waltham (2013)
45. Gandomi, A.H., Yang, X.S., Talatahari, S., Alavi, A.H.: Metaheuristic Applications in Structures and Infrastructures. Elsevier, Waltham (2013)
46. Gandomi, A.H., Alavi, A.H.: Krill herd: A new bio-inspired optimization algorithm. Commun. Nonlinear Sci. Numer. Simulat. 17(12), 4831–4845 (2012), doi:10.1016/j.cnsns.2012.05.010
47. Wang, G.-G., Guo, L., Gandomi, A.H., Hao, G.-S., Wang, H.: Chaotic krill herd algorithm. Inf. Sci. 274, 17–34 (2014), doi:10.1016/j.ins.2014.02.123
48. Wang, G.-G., Gandomi, A.H., Alavi, A.H.: A chaotic particle-swarm krill herd algorithm for global numerical optimization. Kybernetes 42(6), 962–978 (2013), doi:10.1108/K-11-2012-0108
128 G.-G. Wang, A.H. Gandomi, and A.H. Alavi
49. Saremi, S., Mirjalili, S.M., Mirjalili, S.: Chaotic Krill Herd Optimization Algorithm. Procedia Technology 12, 180–185 (2014), doi:10.1016/j.protcy.2013.12.473
50. Wang, G.-G., Gandomi, A.H., Alavi, A.H.: Stud krill herd algorithm. Neurocomputing 128, 363–370 (2014), doi:10.1016/j.neucom.2013.08.031
51. Wang, G., Guo, L., Gandomi, A.H., Cao, L., Alavi, A.H., Duan, H., Li, J.: Lévy-flight krill herd algorithm. Math. Probl. Eng. 2013, 1–14 (2013), doi:10.1155/2013/682073
52. Guo, L., Wang, G.-G., Gandomi, A.H., Alavi, A.H., Duan, H.: A new improved krill herd algorithm for global numerical optimization. Neurocomputing 138, 392–402 (2014), doi:10.1016/j.neucom.2014.01.023
53. Wang, G.-G., Gandomi, A.H., Alavi, A.H.: An effective krill herd algorithm with migration operator in biogeography-based optimization. Appl. Math. Model. 38(9-10), 2454–2462 (2014), doi:10.1016/j.apm.2013.10.052
54. Wang, G., Guo, L., Wang, H., Duan, H., Liu, L., Li, J.: Incorporating mutation scheme into krill herd algorithm for global numerical optimization. Neural Comput. Appl. 24(3-4), 853–871 (2014), doi:10.1007/s00521-012-1304-8
55. Wang, G.-G., Gandomi, A.H., Alavi, A.H., Hao, G.-S.: Hybrid krill herd algorithm with differential evolution for global numerical optimization. Neural Comput. Appl. 25(2), 297–308 (2014), doi:10.1007/s00521-013-1485-9
56. Li, J., Tang, Y., Hua, C., Guan, X.: An improved krill herd algorithm: Krill herd with linear decreasing step. Appl. Math. Comput. 234, 356–367 (2014), doi:10.1016/j.amc.2014.01.146
57. Wang, G.-G., Guo, L., Gandomi, A.H., Alavi, A.H., Duan, H.: Simulated annealing-based krill herd algorithm for global optimization. Abstr. Appl. Anal. 2013, 1–11 (2013), doi:10.1155/2013/213853
58. Wang, G.-G., Gandomi, A.H., Yang, X.-S., Alavi, A.H.: A new hybrid method based on krill herd and cuckoo search for global optimization tasks. Int. J. of Bio-Inspired Computation (2013)
59. Gandomi, A.H., Talatahari, S., Tadbiri, F., Alavi, A.H.: Krill herd algorithm for optimum design of truss structures. Int. J. of Bio-Inspired Computation 5(5), 281–288 (2013), doi:10.1504/IJBIC.2013.057191
60. Gandomi, A.H., Alavi, A.H.: An introduction of krill herd algorithm for engineering optimization. J. Civil Eng. Manag. (2013)
61. Gandomi, A.H., Alavi, A.H., Talatahari, S.: Structural Optimization using Krill Herd Algorithm. In: Swarm Intelligence and Bio-Inspired Computation: Theory and Applications, pp. 335–349. Elsevier (2013)
62. Yao, X., Liu, Y., Lin, G.: Evolutionary programming made faster. IEEE Trans. Evolut. Comput. 3(2), 82–102 (1999)
63. Yang, X.-S., Cui, Z., Xiao, R., Gandomi, A.H., Karamanoglu, M.: Swarm Intelligence and Bio-Inspired Computation. Elsevier, Waltham (2013)
Solutions of Non-smooth Economic Dispatch Problems by Swarm Intelligence

Seyyed Soheil Sadat Hosseini1,*, Xin-She Yang2, Amir H. Gandomi3, and Alireza Nemati1

1 Department of Electrical Engineering and Computer Science, University of Toledo, Toledo, OH 43606, USA
2 School of Science and Technology, Middlesex University, The Burroughs, London NW4 4BT, UK
3 Department of Civil Engineering, University of Akron, Akron, OH 44325, USA

Abstract. The increasing costs of fuels and operations of power generating
units necessitate the development of optimization methods for economic dispatch
(ED) problems. Classical optimization techniques such as direct search
and gradient methods often fail to find global optimum solutions. Modern opti-
mization techniques are often meta-heuristic, and they are very promising in
solving nonlinear programming problems. This chapter presents a novel method
to determine the feasible optimal solutions of the ED problems utilizing the
newly developed Bat Algorithm (BA). The proposed BA is based on the echo-
location behavior of bats. This technique is adapted to solve non-convex ED
problems under different nonlinear constraints such as transmission losses,
ramp rate limits, multi-fuel options and prohibited operating zones. Parameters
are tuned to give the best results for these problems. To describe the efficiency
and applicability of the proposed algorithm, we will use four ED test systems
with non-convexity. We will compare our results with some of the most recently
published ED solution methods. Compared with the other existing techniques,
the proposed approach can find better solutions.
This method can be deemed to be a promising alternative for solving the ED
problems in real systems.

Keywords: Economic dispatch, Valve loading effect, Bat Algorithm, Meta-heuristic algorithm.

1 Introduction
The ED problem is one of the main issues in power system operation and control. The
goal of an ED problem is to schedule the online generating units so as to satisfy the
load demand at minimum operating cost, while satisfying all the equality and inequality
constraints of the units [1]. Careful and intelligent scheduling of the
units can both decrease the operating cost significantly and assure higher reliability,
improving security with less environmental impact [2]. Therefore, the optimization of
ED problems is a challenging task and new algorithms such as meta-heuristics may be
promising for solving modern power system operations and control.
* Corresponding author.

© Springer International Publishing Switzerland 2015
I. Fister and I. Fister Jr. (eds.), Adaptation and Hybridization in Computational Intelligence,
Adaptation, Learning, and Optimization 18, DOI: 10.1007/978-3-319-14400-9_6
Traditionally, mathematical modeling of fuel costs for generating units often uses
approximate models in terms of a single quadratic cost function [3-4]. This type of prob-
lem can often be solved using several mathematical programming techniques e.g. the
lambda-iteration method, the base point and participation factors method, the interior
point method, dynamic programming, and the gradient algorithms [3, 5-8]. However,
none of these techniques can find an optimal solution satisfactorily, as they are local
search methods which can normally be trapped at a local optimum. Thus, to use the
right algorithm is very important. In addition, the effective implementation is equally
important, and even though variables are continuous. Basic ED problems consider the
power balance constraints apart from the generating capacity limits. However, a practic-
al ED model must include the prohibited operating zones, ramp rate limits, valve point
loading effects and multi-fuel options [9] so as to provide a complete formulation for the
ED problem. The resulting ED is a non-convex optimization problem, which is very
challenging to solve and cannot be resolved by the traditional approaches.
To overcome these deficiencies, evolutionary algorithms and meta-heuristics have
been utilized to solve the ED problems. These techniques include Genetic Algorithm
(GA) [10], real-coded genetic algorithm (RCGA) [11], Tabu Search (TS)
[12-13], Hopfield neural network [14], Differential Evolution [15], different types of
Evolutionary Programming (EP) [16-17], biogeography-based optimization (BBO)
[18], Evolutionary Strategy (ES) [19], Particle Swarm Optimization (PSO) [2, 20-26],
an improved coordinated aggregation-based particle swarm optimization (ICA-PSO)
[27-28], Bacterial Foraging (BF) [21], harmony search (HS) [29], Firefly Algorithm
(FA) [30], multiple tabu search (MTS) [31], and the Taguchi self-adaptive real-coded
genetic algorithm (TSARGA) [32].
Although many optimization methods have been developed for the ED problems,
the complexity of the task reveals the necessity for improvement in efficient tech-
niques to accurately find the global optimum solution. Recently, a new meta-heuristic
search algorithm, called Bat Algorithm (BA), has been developed [33-35]. BA is a
new search method based on the echolocation behavior of microbats. The capability
of echolocation of microbats is fascinating, as these bats can locate their prey and
discriminate different types of insects even in complete darkness. Preliminary studies
suggest that the BA can have superior performance over genetic algorithms and particle
swarm optimization [36], and it can solve real-world and engineering optimization
problems [30, 37-39]. In this chapter, we study BA further in detail and solve
ED problems. Two papers [40-41] have recently appeared in this field, but a more
detailed account of the theoretical and implementation features of the proposed
algorithm is provided in the following sections. To demonstrate the efficiency and
applicability of the proposed approach, several types of ED problems are studied and the
results are compared with those available in the literature.
The chapter is organized as follows: Section 2 illustrates the ED problems and their
formulation incorporating the valve-loading effect, multiple fuel options, prohibited
operating zone (POZ) constraints and ramp rate limits. Moreover, the proposed technique
for constraint handling is described in this section. In Section 3, the Bat Algorithm is
described. In Section 4, the simulation results are presented, showing the potential of
the proposed method. Finally, Section 5 concludes the chapter with some discussions.
2 Problem Formulations

2.1 ED with Smooth Cost Functions


The main goal of the ED problem is to find the optimal combination of power generations
that minimizes the total generation cost while satisfying an equality constraint
and inequality constraints. Cost efficiency is the most important sub-problem of power
system operations. Due to the highly nonlinear characteristics of power systems
and generators, ED belongs to a class of nonlinear programming problems under nonlinear
equality and inequality constraints. Generally speaking, the scheduled combined units for
each specific period of operation are listed from unit commitment, and the ED planning
must determine the optimal dispatch amongst the operating units to satisfy
the load demands and the practical constraints of the generators, which include maximum
and minimum limits, ramp rate limits, and prohibited operating zones. Generally, the
generation cost function can be stated as a quadratic polynomial. Mathematically,
the problem can be described as:
min F_T = Σ_{i=1}^{n} F_i(P_i)    (1)

where F_i(P_i) is the generation cost for generator unit i, which is defined
by the following equation:

F_i(P_i) = a_i P_i² + b_i P_i + c_i    (2)

where a_i, b_i and c_i are the cost coefficients of generator i.

The basic constraints are the real power balance and the real power operating limits:

Σ_{i=1}^{NG} P_i = P_D + P_L    (3)

P_i^min ≤ P_i ≤ P_i^max    (4)

where P_D is the total active power demand, P_L is the network loss, P_i^min is the
minimum operating limit of the i-th unit, and P_i^max is the maximum operating limit
of the i-th unit. In an ED problem, P_L can be approximated by a function of the unit
power outputs and the transmission loss matrix coefficients, called the B-matrix loss
formula [24, 42-43]. The other important constraints are as follows.
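The quadratic objective and the basic constraints (1)-(4) can be sketched in a few lines of Python; the coefficients, limits and demand below are made up for illustration and are not taken from the chapter's test systems:

```python
def total_cost(P, a, b, c):
    """Total generation cost F_T = sum_i (a_i P_i^2 + b_i P_i + c_i), Eqs. (1)-(2)."""
    return sum(ai * Pi ** 2 + bi * Pi + ci for Pi, ai, bi, ci in zip(P, a, b, c))

def feasible(P, P_min, P_max, P_D, P_L=0.0, tol=1e-6):
    """Check the power balance (3) and the operating limits (4)."""
    balance_ok = abs(sum(P) - (P_D + P_L)) <= tol
    limits_ok = all(lo <= Pi <= hi for Pi, lo, hi in zip(P, P_min, P_max))
    return balance_ok and limits_ok

# Hypothetical 3-unit system dispatching 850 MW with zero network loss.
P = [300.0, 150.0, 400.0]
cost = total_cost(P, [0.001, 0.002, 0.001], [10.0, 12.0, 11.0], [100.0, 120.0, 90.0])
```

Any candidate dispatch produced by a meta-heuristic can be screened with `feasible` before its cost is compared.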

2.1.1 Ramp Rate Limits


One of the unrealistic assumptions that prevailed for simplifying the problem in much
of the earlier research is that the adjustments of the power outputs are instantaneous.
However, under practical circumstances, the ramp rate limit restricts the operating
range of all the online units for tuning the generator operation between two operating
periods [44-45]. The generation may increase or decrease with corresponding upper
and lower ramp rate limits. Therefore, units are restricted because of these ramp rate
limits as mentioned below:
If power generation increases, we have

P_i − P_i^0 ≤ UR_i    (5)

while, if power generation decreases, we have

P_i^0 − P_i ≤ DR_i    (6)

where P_i^0 is the previous power generation of unit i. UR_i and DR_i are the up-ramp
and down-ramp limits of the i-th generator, respectively. The inclusion of ramp
rate limits changes the generator operating limits (4) as follows:

max(P_i^min, P_i^0 − DR_i) ≤ P_i ≤ min(P_i^max, P_i^0 + UR_i).    (7)
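Eq. (7) collapses the two ramp conditions into one effective operating range per unit; a minimal sketch (the numbers in the comment are illustrative, not system data):

```python
def ramp_limited_bounds(P_min, P_max, P_prev, UR, DR):
    """Effective range of Eq. (7): max(P_min, P_prev - DR) <= P <= min(P_max, P_prev + UR)."""
    return max(P_min, P_prev - DR), min(P_max, P_prev + UR)

# e.g. a unit with limits [50, 400] MW that produced 200 MW in the previous
# period, with UR = 80 MW and DR = 60 MW, may now operate only in [140, 280] MW.
```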

2.1.2 Prohibited Operating Zones


A generator with these characteristics has a discontinuous fuel-cost characteristic. The
idea of prohibited operating zones leads to the following constraints:

P_i^min ≤ P_i ≤ P_{i,1}^LB
P_{i,j−1}^UB ≤ P_i ≤ P_{i,j}^LB,  j = 2, 3, …, NP_i    (8)
P_{i,NP_i}^UB ≤ P_i ≤ P_i^max

where P_{i,j}^LB and P_{i,j}^UB are the lower and upper boundaries of prohibited operating
zone j of generator i, respectively; NP_i is the number of prohibited operating
zones of generator i.
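Checking the POZ constraints of Eq. (8) amounts to verifying that an output does not fall strictly inside any forbidden band; a small sketch with hypothetical zone boundaries:

```python
def in_prohibited_zone(P, zones):
    """True if output P lies strictly inside any prohibited zone of one generator.
    `zones` is a list of (lower, upper) boundary pairs, as in Eq. (8)."""
    return any(lb < P < ub for lb, ub in zones)
```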

2.2 ED with Valve-Point Loading Problem


The valve-opening process of multi-valve steam turbines produces a ripple-like effect
in the heat rate curve of the generators. This curve contains higher-order nonlinearity
because of the valve-point effect, and is modeled by a sine function. As a result, the
solution procedure can easily be trapped in local minima in the vicinity of the optimal
value. To account for the valve-point effects, sinusoidal terms are
added to the quadratic cost functions as below:

F_i(P_i) = a_i P_i² + b_i P_i + c_i + |g_i sin(h_i (P_i^min − P_i))|    (9)

where g_i and h_i are constants of the unit with valve-point effects.
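The rectified sine term of Eq. (9) can be appended to the quadratic cost as follows; the coefficient values used here are illustrative only:

```python
import math

def valve_point_cost(P, a, b, c, g, h, P_min):
    """Fuel cost with the valve-point ripple term, Eq. (9)."""
    return a * P ** 2 + b * P + c + abs(g * math.sin(h * (P_min - P)))

# With g = 0 the ripple vanishes and the cost reduces to the smooth quadratic (2);
# at P = P_min the ripple term is exactly zero.
```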


2.3 Nonsmooth Cost Functions with Multiple Fuels


Practically, the operating conditions of many generating units necessitate that the cost
function be segmented into piecewise quadratic functions. It is therefore realistic to describe
the generation cost function as a piecewise quadratic cost function [1], which in general
represents the input-output curve of a generator with multiple fuels [1]. The
piecewise quadratic function can be defined as follows:

F_i(P_i) =
  a_{i,1} P_i² + b_{i,1} P_i + c_{i,1},  if P_{i,min} ≤ P_i ≤ P_{i,1}
  a_{i,2} P_i² + b_{i,2} P_i + c_{i,2},  if P_{i,1} < P_i ≤ P_{i,2}
  ⋮
  a_{i,n} P_i² + b_{i,n} P_i + c_{i,n},  if P_{i,n−1} < P_i ≤ P_{i,max}    (10)

where a_{i,j}, b_{i,j}, and c_{i,j} are the cost coefficients of generator i with fuel type j,
respectively; P_{i,min} and P_{i,max} are the minimum and maximum power generation of
unit i.

2.4 Non-smooth Cost Functions with Valve-Point Effects and Multiple Fuel
Options
To acquire a precise and practical ED solution, the realistic operation of the ED problem
should consider both valve-point effects and multiple fuel options. The cost model in
this chapter integrates the valve-point loadings and the fuel changes in one framework. So
the cost function, combining (9) and (10), can realistically be written as

F_i(P_i) = a_{i,j} P_i² + b_{i,j} P_i + c_{i,j} + |g_{i,j} sin(h_{i,j} (P_{i,j}^min − P_i))|,
  if P_{i,j−1} ≤ P_i ≤ P_{i,j}, for fuel type j    (11)
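Eq. (11) first selects the fuel segment that contains the operating point and then applies that segment's quadratic-plus-ripple cost; a sketch for a hypothetical two-fuel unit (all numbers invented for the example):

```python
import math

def multifuel_valve_cost(P, segments):
    """Cost with multiple fuels and valve-point effects, in the spirit of Eq. (11).
    Each segment is (P_lo, P_hi, a, b, c, g, h) for one fuel type."""
    for P_lo, P_hi, a, b, c, g, h in segments:
        if P_lo <= P <= P_hi:
            return a * P ** 2 + b * P + c + abs(g * math.sin(h * (P_lo - P)))
    raise ValueError("P lies outside every fuel segment")

# Hypothetical unit: fuel 1 on [0, 100] MW, fuel 2 on (100, 200] MW.
SEGMENTS = [(0.0, 100.0, 0.01, 2.0, 5.0, 0.0, 0.1),
            (100.0, 200.0, 0.02, 1.0, 10.0, 0.0, 0.1)]
```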

3 Bat Algorithm
The bat-inspired meta-heuristic algorithm, namely the bat algorithm (BA), was recently
introduced by Xin-She Yang [33, 36], based on the echolocation of microbats. This algorithm
has been applied to many applications [30, 46-47]. In the real world, echolocating
microbats usually emit short pulses of a few thousandths of a second (up to about 8 to 10 ms)
with a varying frequency in the region of 25 kHz to 150 kHz, corresponding to wavelengths
of 2 mm to 14 mm in air. Microbats use a type of sonar called echolocation
to detect prey, avoid obstacles, and locate their roosting crevices in the dark, and the
bat algorithm was inspired by this echolocation behavior. These bats transmit a very loud
sound pulse and listen for the echo that bounces back from the surrounding objects. Their
pulses vary in properties and can be correlated with their hunting plans, depending on the
species. Most bats utilize short, frequency-modulated signals to sweep through about an
octave, while others more often use constant-frequency signals for echolocation. Their
signal bandwidth varies depending on the species, and is often increased by using more
harmonics. In the standard bat algorithm, the echolocation characteristics of microbats
can be idealized as the following three rules:
i. All bats utilize echolocation to sense distance, and they know the differ-
ence between food/prey and background barriers in some magical way;
ii. Bats fly randomly with velocity v_i at position x_i with a fixed frequency fr_min,
varying wavelength λ and loudness A_0 to search for prey. They can automatically
adjust the wavelength (or frequency) of their emitted pulses and adjust
the rate of pulse emission r ∈ [0, 1], depending on the proximity of their
target;
iii. Although the loudness can change in many ways, we assume that the loud-
ness changes from a large (positive) A0 to a minimum constant value Amin
[33, 36].
For simplicity, we do not utilize ray tracing in this technique, though it can produce
an interesting feature for further extension. Generally, ray tracing can be computa-
tionally extensive, but it can be a useful feature for computational geometry and other
applications.
The basic steps of BA can be summarized as the pseudo code shown in Fig. 1.

Bat Algorithm
Objective function F(x), x = (x_1, ..., x_d)^T
Initialize the bat population x_i (i = 1, 2, ..., n) and v_i
Define pulse frequency fr_i at x_i
Initialize pulse rates r_i and the loudness A_i
while (t < Max number of iterations)
    Generate new solutions by adjusting frequency,
    and updating velocities and locations/solutions [equations (12) to (14)]
    if (rand > r_i)
        Select a solution among the best solutions randomly
        Generate a local solution around the selected best solution by a local random walk
    end if
    Produce a new solution by flying randomly
    if (rand < A_i and F(x_i) < F(x*))
        Accept the new solutions
    end if
    Rank the bats and find the current best x*
end while
Postprocess results and visualization

Fig. 1. Pseudo code of the BA
For each i-th bat, its position x_i^t and velocity v_i^t in a d-dimensional search space
should be defined and updated during the iterations. The new solutions x_i^t and
velocities v_i^t at time step t can be calculated by

fr_i = fr_min + (fr_max − fr_min) β    (12)

v_i^t = v_i^{t−1} + (x_i^{t−1} − x*) fr_i    (13)

x_i^t = x_i^{t−1} + v_i^t    (14)

where β ∈ [0, 1] is a random vector drawn from a uniform distribution.
Here x* is the current global best location, which is found after comparing all
the solutions among all n bats at the current iteration. As the product λ fr_i is the
velocity in the medium, which is essentially fixed, we can use either fr_i (or λ) to
adjust the velocity change while fixing the other factor λ (or fr_i), depending on the
type of the problem of interest. For the applications here, we will use fr_min = 0 and
fr_max = 2, depending on the domain size of the problem of interest. Initially, each bat is
randomly assigned a frequency drawn uniformly from [fr_min, fr_max] [30].
For the local search part, once a solution is selected among the current best solutions,
a new solution for each bat is generated locally using a local random walk:

x_new = x_old + ε A^t    (15)

where the random number ε is drawn from [−1, 1], while A^t = ⟨A_i^t⟩ is the average
loudness of all the bats at this time step. In fact, this is essentially the main updating
equation of simulated annealing; for this reason, simulated annealing can be thought of
as a very special case of the BA.
Additionally, the loudness A_i and the rate of pulse emission r_i have to be updated
accordingly as the iterations proceed. As the loudness typically decreases once a bat has
found its prey, while the rate of pulse emission grows, the loudness can be chosen as
any value of convenience. For simplicity, we can also use A_0 = 1 and A_min = 0,
where A_min = 0 means that a bat has just found the prey and temporarily stops
transmitting any sound. Now we have

A_i^{t+1} = α A_i^t,  r_i^{t+1} = r_i^0 [1 − exp(−γ t)]    (16)

where α and γ are constants. In fact, α is analogous to the cooling factor of a cooling
schedule in simulated annealing. For any 0 < α < 1 and γ > 0, we have

A_i^t → 0,  r_i^t → r_i^0, as t → ∞    (17)

In the simplest case, we can use α = γ, and in the standard BA, α = γ = 0.9 to 0.975
can be used in most cases, though we have used α = γ = 0.9 in our simulations.
Initially, each bat should have different values of loudness and pulse emission rate, and
this can be achieved by randomization. For instance, the initial loudness A_i^0 can
usually be in [1, 2], while the initial emission rate r_i^0 can be around zero, or any value
in [0, 1].
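The update rules (12)-(16) and the pseudo code of Fig. 1 can be condensed into a compact Python sketch; this is an illustrative minimal implementation, not the authors' tuned code, and the parameter defaults (population size, iteration count, bounds handling) are assumptions:

```python
import math
import random

def bat_algorithm(F, dim, lb, ub, n=20, iters=200,
                  fr_min=0.0, fr_max=2.0, alpha=0.9, gamma=0.9, seed=1):
    """Minimize F over [lb, ub]^dim with a basic BA, Eqs. (12)-(16)."""
    rng = random.Random(seed)
    X = [[rng.uniform(lb, ub) for _ in range(dim)] for _ in range(n)]
    V = [[0.0] * dim for _ in range(n)]
    A = [rng.uniform(1.0, 2.0) for _ in range(n)]   # initial loudness A_i^0 in [1, 2]
    r0 = [rng.random() for _ in range(n)]           # initial pulse emission rate r_i^0
    r = list(r0)
    fit = [F(x) for x in X]
    best = min(range(n), key=lambda i: fit[i])
    x_star, f_star = list(X[best]), fit[best]
    for t in range(1, iters + 1):
        A_mean = sum(A) / n
        for i in range(n):
            fr = fr_min + (fr_max - fr_min) * rng.random()                      # Eq. (12)
            V[i] = [v + (x - xs) * fr for v, x, xs in zip(V[i], X[i], x_star)]  # Eq. (13)
            x_new = [min(max(x + v, lb), ub) for x, v in zip(X[i], V[i])]       # Eq. (14)
            if rng.random() > r[i]:
                # local random walk around the current best solution, Eq. (15)
                x_new = [min(max(xs + rng.uniform(-1.0, 1.0) * A_mean, lb), ub)
                         for xs in x_star]
            f_new = F(x_new)
            if rng.random() < A[i] and f_new < fit[i]:
                X[i], fit[i] = x_new, f_new
                A[i] *= alpha                                   # loudness decay, Eq. (16)
                r[i] = r0[i] * (1.0 - math.exp(-gamma * t))     # pulse rate growth
            if fit[i] < f_star:
                x_star, f_star = list(X[i]), fit[i]
    return x_star, f_star
```

On a smooth test function such as the sphere function, this sketch converges toward the origin within a few hundred iterations.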
3.1 Constraint Handling


An important issue in the application of optimization methods is how the technique
manages the constraints of the problem. The POZ constraints (8) can be expressed as
the following bounds or limits. If the generation of unit i falls in its j-th POZ,
i.e.:

P_{i,j}^LB ≤ P_i ≤ P_{i,j}^UB    (18)

then the generation is cut to the nearest boundary of the j-th POZ as below:

P_{i,j}^ave = (P_{i,j}^LB + P_{i,j}^UB) / 2    (19)

P_i = P_{i,j}^LB  if P_{i,j}^LB ≤ P_i ≤ P_{i,j}^ave
      P_{i,j}^UB  if P_{i,j}^ave < P_i ≤ P_{i,j}^UB    (20)
For a nonlinear optimization problem with equality and inequality constraints, a
widely-used way of applying constraints is the penalty approach. The idea is to
construct a penalty function from the constraints so that the constrained
problem can be transformed into an unconstrained problem. Now we can state

Π(x, μ_i, ν_j) = F(x) + Σ_{i=1}^{M} μ_i φ_i²(x) + Σ_{j=1}^{N} ν_j ψ_j²(x)    (21)

where φ_i and ψ_j are the equality and inequality constraints, respectively; μ_i ≥ 1 and
ν_j ≥ 0 are penalty coefficients which should be large enough, depending on the solution
quality needed. As we can see here, when an equality constraint is satisfied, its impact
or contribution to Π is zero. However, when it is violated, it is penalized heavily as Π
grows considerably. The same holds when an inequality constraint becomes tight or is
violated. It is worth pointing out that the generation and ramp rate limits are similar
types of constraints: they state the overall generation limits of the units.
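The static penalty transformation of Eq. (21) can be sketched as follows; the weights `mu` and `nu` are illustrative values, and the inequality constraints are assumed to be written in the form g_j(x) ≤ 0:

```python
def penalized(F, x, eq_cons, ineq_cons, mu=1e6, nu=1e6):
    """Unconstrained surrogate of Eq. (21): squared equality residuals and
    squared inequality violations are added to the objective F."""
    value = F(x)
    value += sum(mu * phi(x) ** 2 for phi in eq_cons)
    value += sum(nu * max(0.0, g(x)) ** 2 for g in ineq_cons)
    return value
```

A feasible point incurs no penalty, while any violation dominates the objective, steering the search back toward the feasible region.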

4 Numerical Results
Because of the random nature of the BA (and indeed of all meta-heuristic methods), its
performance cannot be assessed from the result of a single run. Many trials with independent
population initializations should be carried out to draw useful conclusions about the
performance of the method. Thus, the results should be evaluated using statistical measures
such as means and standard deviations. The best, worst and mean values obtained in 40 trials
are used to compare the performances of the different algorithms. To show the effectiveness of
the proposed BA, the test results are also compared with results already reported in
recent publications using the most recent algorithms for solving the ED problems.
The parameters of BA in our simulations are: n = 20, α = γ = 0.95, and the total number of
iterations t = 2,000. These were tuned to give the best results for the ED problems.
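The best/average/worst protocol used throughout the result tables reduces to a one-line summary over the recorded trial costs; a trivial sketch (the example numbers below are the BA row of Table 1):

```python
def trial_stats(costs):
    """Best, average and worst cost over independent trial runs."""
    return min(costs), sum(costs) / len(costs), max(costs)
```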

4.1 Case I: 3 Generating Units


Three generating units have been modeled using a quadratic cost function with the
effects of valve-point loading included. The load demand to be met by the
three generating units is 850 MW. The description of the system can be found in
[17]. It has been shown in [48] that the global minimum for the three-generator
system is 8,234.07 $/h. Based on the aforementioned parameters, the BA has been
executed for 40 trials with different starting points to verify its performance. The best,
average and worst cost values obtained by the various techniques are displayed in
Table 1. This experiment compares the performance of BA with the other
methods in terms of dispatching cost. All methods give a similar best solution, whereas
the average and worst costs differ. Table 2 shows that the proposed algorithm
has succeeded in finding the global optimal solution indicated in [48].

Table 1. The best, average and worst results of different ED solution methods for the 3 unit test
system

Generation Cost ($/h)


Methods
Best Average Worst
GAB [17] 8,234.08 NA NA
GAF [17] 8,234.07 NA NA
CEP [17] 8,234.07 8,235.97 8,241.83
FEP [17] 8,234.07 8,234.24 8,241.78
MFEP [17] 8,234.08 8,234.71 8,241.80
IFEP [17] 8,234.07 8,234.16 8,234.54
BA 8,234.07 8,335.27 8,562.40
NA: Not Available

Table 2. Output power of generators in the best result of the proposed BA for the 3 unit test
system

Unit Power (MW)


1 300.267
2 149.733
3 400.000

Total Generation (MW) 850

Generation Cost ($/h) 8,234.07


4.2 Case II: 13 Generating Units


In this test, there are thirteen generating units, while the quadratic cost functions combined
with the effects of valve-point loading have been utilized as before. The complexity
of the solution procedure is significantly greater. Since this is a larger system
with higher nonlinearity, it has more local minima and it is therefore difficult to reach
the global solution. The ability to deal with more complicated, highly nonlinear cases
is one of the principal goals of BA applications. The load demand of this test
system is 1,800 MW. Exactly the same data as given in [17] will be used in this case.
Table 3 shows the best, average and worst results of different ED solution algorithms
among 40 trial runs, in the same way as listed in Table 1. The outcomes of the other
techniques shown in Table 3 have been directly quoted from their corresponding references
(NA means the related result is not available in the corresponding reference). Of the
solutions presented in Table 3, the one obtained by BA is found to be the best. In this
case, BA escapes from local minima and reaches the global optimum. Due to the randomness
of heuristic methods, their performance cannot be judged from the result of a single
run; an algorithm is robust if it gives consistent results across all trials. The best
solution of the BA has the minimum cost of 17,963.83 $/h, which is smaller than those obtained
using CEP, PSO, MFEP, FEP, IFEP, EP-SQP, HDE, CGA_MU, PSO-SQP, HS,
IGA_MU, DEC(1)-SQP(1), TSARGA, and ST-HDE. The output power of the generators of
the 13 unit test system in the minimum solution of the BA is displayed in Table 4.
Table 3. The best, average and worst results of different ED solution algorithms for the 13 unit
test system

Generation Cost ($/h)


Methods
Best Average Worst

CEP [17] 18,048.21 18,190.32 18,404.04

PSO [49] 18,030.72 18,205.78 NA

MFEP [17] 18,028.09 18,192 18,416.89

FEP [17] 18,018 18,200.79 18,453.82

IFEP [17] 17,994.07 18,127.06 18,267.42

EPSQP [49] 17,991.03 18,106.93 NA

HDE [43] 17,975.73 18,134.8 NA

CGA_MU [10] 17,975.34 NA NA

PSOSQP [49] 17,969.93 18,029.99 NA

HS [50] 17,965.62 17,986.563 18,070.176

IGA_MU [10] 17,963.98 NA NA

DEC(1)-SQP(1) [54] 17,963.94 17,973.13 17,984.81

TSARGA [32] 17,963.94 17,974.31 18,089.61

ST-HDE [43] 17,963.89 18,046.38 NA

BA 17,963.83 18,085.06 18,288.00


NA: Not Available.
Table 4. Output power of generators in the best result of the proposed BA for the 13 unit test
system

Unit Power (MW)


1 628.31851
2 149.59964
3 222.75255
4 109.86395
5 109.86648
6 109.86654
7 109.86654
8 59.999999
9 109.86569
10 40.00000
11 40.00000
12 55.00000
13 55.00010

Total Generation (MW) 1,800

Generation Cost ($/h) 17,963.83080

4.3 Case III: 40 Generating Units


This test system has forty generating units with a non-convex fuel cost function,
incorporating valve loading effects. The load demand to be satisfied by all forty
generating units is 10,500 MW. The comprehensive information on the generating units of the
test system is given in [17]. The global solution for this case has not been discovered
yet. All forty generators have valve-point effects, and the solution space has
multiple minima. The optimal generation cost is hard to obtain, and the minimum
generation cost reported so far is 121,415.05 $/h [30]. Because of the random nature of
meta-heuristic methods, averaged performance over many trials with different
initialization populations should be used to obtain a meaningful statistical measure. The
Bat Algorithm has been run for 40 trials with various randomly generated starting
points. The results obtained by the proposed BA for the ED problem on this
test system are presented in Table 5, where detailed comparisons of the best, average
and worst solutions of the proposed BA and the most recently published ED solution
methods are displayed. As seen from Table 5, the best solution found by BA is better
than those of all the other methods, indicating BA's higher efficiency in solving the
ED problem compared with the other methods. Hence, for large problem sizes with
Table 5. The best, average and worst results of different ED solution methods for the 40 unit test system

Generation Cost ($/h)


Methods
Best Average Worst
HGPSO [26] 124,797.13 126,855.7 NA
SPSO [26] 124,350.4 126,074.4 NA
PSO [49] 123,930.45 124,154.49 NA
CEP [17] 123,488.29 124,793.48 126,902.89
HGAPSO [26] 122,780 124,575.7 NA
FEP [17] 122,679.71 124,119.37 127,245.59
MFEP [17] 122,647.57 123,489.74 124,356.47
IFEP [17] 122,624.35 123,382 125,740.63
TM [51] 122,477.78 123,078.21 124,693.81
EPSQP [49] 122,323.97 122,379.63 NA
MPSO [1] 122,252.26 NA NA
ESO [19] 122,122.16 122,558.45 123,143.07
HPSOM [26] 122,112.4 124,350.87 NA
PSOSQP [49] 122,094.67 122,245.25 NA
PSO-LRS [24] 122,035.79 122,558.45 123,461.67
ImprovedGA [52] 121,915.93 122,811.41 123,334
HPSOWM [26] 121,915.3 122,844.4 NA
IGAMU [53] 121,819.25 NA NA
HDE [43] 121,813.26 122,705.66 NA
DEC(2)-SQP(1) [55] 121,741.97 122,295.12 122,839.29
PSO [56] 121,735.47 122,513.91 123,467.4
APSO(1) [56] 121,704.73 122,221.36 122,995.09
ST-HDE [43] 121,698.51 122,304.3 NA
NPSO-LRS [24] 121,664.43 122,209.31 122,981.59
APSO(2) [56] 121,663.52 122,153.67 122,912.39
SOHPSO [57] 121,501.14 121,853.57 122,446.3
BBO [18] 121,479.50 121,512.05 121,688.66
TSARGA [32] 121,463.07 122,928.31 124,296.54
BF [21] 121,423.63 121,814.94 NA
GAPSSQP [58] 121,458 122,039 NA
PS [59] 121,415.14 122,332.65 125,486.29
FA [60] 121,415.05 121,416.57 121,424.56
BA 121,414.91 122,094.67 123,447.7

higher nonlinearity, having tested ED systems of various sizes, we found the proposed
method to be potentially the best approach among all the methods tested. The optimum
dispatch of each generator is also recorded in Table 6, where it can be verified that every
output lies within its permissible limits.
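As noted above, stochastic methods such as the BA should be judged on statistics over many independent runs rather than on a single run. A minimal sketch of this protocol follows; `run_ba_once` is a hypothetical stand-in for one complete BA optimization of the 40-unit system (here it merely returns a noisy cost near the best value reported in Table 5), not the authors' implementation:

```python
import random
import statistics

def run_ba_once(seed):
    """Hypothetical stand-in for one complete BA run on the 40-unit
    ED problem: returns a noisy cost near the best value in Table 5.
    A real run would execute the full Bat Algorithm."""
    rng = random.Random(seed)
    return 121414.91 + abs(rng.gauss(0.0, 500.0))

# 40 independent runs from different random starting points
costs = [run_ba_once(seed) for seed in range(40)]

best = min(costs)
average = statistics.mean(costs)
worst = max(costs)
print(f"best={best:.2f}  average={average:.2f}  worst={worst:.2f}")
```

Reporting the best, average and worst of such a sample is exactly what Tables 5 and 7 do for each method.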

Table 6. Output power of generators in the best result of the proposed BA for the 40 unit test
system

Unit Power (MW) Unit Power (MW)


1 110.7998 21 523.2812
2 110.8011 22 523.2783
3 97.4009 23 523.2802
4 179.7343 24 523.2789
5 92.6232 25 523.2788
6 140 26 523.2801
7 259.6000 27 10
8 284.6023 28 10
9 284.6013 29 10
10 130 30 87.8079
11 168.8008 31 190
12 168.8005 32 190
13 214.7598 33 190.0000
14 394.2781 34 164.8068
15 304.5200 35 164.8181
16 394.2789 36 164.8840
17 489.2800 37 110
18 489.2838 38 110
19 511.2780 39 110
20 511.2811 40 511.2818

Total Generation (MW) 10,500


Generation Cost ($/h) 121,414.91

4.4 Case IV: 15 Generating Units


In this case study, all the aforementioned practical constraints and nonlinear characteristics of
the ED problem are included. The ramp rate limits and POZs are considered for the units
of this test system, whose comprehensive data are given in [57]. The load demand of

Table 7. The best, average and worst result of different ED solution algorithms for the 15 unit
test system including POZ constraints, ramp rate limits and transmission losses

Generation Cost ($/h)


Methods
Best Average Worst
PSO [20] 32,858 33,039 33,331
GA [20] 33,113 33,228 33,337
SOH_PSO [57] 32,751 32,878 32,945
CPSO1 [20] 32,835 33,021 33,318
CPSO2 [20] 32,834 33,021 33,318
BF [21] 32,784.5 32,796.8 NA
BA 32,704.4 32,842.2 33,344.8
NA: Not Available.

Table 8. Output power of generators and transmission losses in the best result of the proposed
BA for the 15 unit test system

Unit Power (MW)


1 455.0000
2 380.0000
3 130.0000
4 130.0000
5 170.0000
6 460.0000
7 430.0000
8 71.7474
9 58.9140
10 160.0000
11 80.0000
12 80.0000
13 25.0000
14 15.0000
15 15.0000

Losses (MW) 30.6614

Generation Cost ($/h) 32,704.4500



the system is 2,630 MW. The best cost reported is 32,704.9 $/h [61]. The problem
has a number of local optima, so most algorithms are likely to be trapped in a local
optimum region. Prohibited operating zones are embedded in four units: units 2, 5, 6,
and 12. This problem is particularly challenging because these zones make the decision
space non-convex; 192 convex subspaces can be constituted for the dispatch problem.
The remaining units have simple operational zones. This challenging problem requires
not only the proper implementation of the constraints, but also an efficient search of
the different sub-regions without wasting too much time on the prohibited regions.
Thus, a fine balance between solution quality and computational effort is needed.
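The prohibited-zone handling discussed above can be illustrated with a minimal feasibility repair: a dispatch that falls inside a POZ is snapped to the nearer zone boundary. The zone data below are illustrative only, not the actual limits of the 15-unit system:

```python
def repair_poz(p, zones):
    """Move a power output p (MW) out of any prohibited zone [lo, hi]
    by snapping it to the nearer zone boundary."""
    for lo, hi in zones:
        if lo < p < hi:
            return lo if (p - lo) <= (hi - p) else hi
    return p  # already feasible

# Illustrative prohibited zones (MW) for one unit -- not real system data
zones = [(185.0, 225.0), (305.0, 335.0), (420.0, 450.0)]

print(repair_poz(200.0, zones))  # inside the first zone, snapped to 185.0
print(repair_poz(250.0, zones))  # feasible, returned unchanged
```

A full ED solver would apply such a repair (or a penalty) to every candidate solution before evaluating its cost.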
The comparison of the best, average and worst solutions found by the BA and the
best solutions obtained by other methods in the most recent literature is shown in
Table 7. Again, the BA offers an improved generation cost over the other algorithms,
clearly showing that the proposed approach is superior to the others at locating better
solutions. Detailed results of the optimal solution are shown in Table 8, where it can be
verified that all the system constraints are met.

5 Conclusions

In this chapter, we have presented a new approach to non-convex ED problems based
on the BA. This algorithm is adapted for solving non-convex ED problems under
different nonlinear constraints. Many of the nonlinear characteristics of power systems
are considered for practical generator operation in the proposed method, including
valve-point loadings, multiple fuel options, ramp rate limits, and prohibited
operating zones. Four test cases have been studied in detail, and the quality of the
solutions and the performance have been compared with the best solutions found
so far by several other methods published in the literature.
Additionally, the fine adaptation of the parameters α and γ can affect the convergence
rate of the BA. In fact, the parameter α plays a role similar to that of the cooling
schedule in simulated annealing. Although the implementation is more complicated than
that of many other meta-heuristic methods, the BA combines, in a balanced way, the
advantages of existing successful techniques with an innovative feature based on the
echolocation behavior of bats. New solutions are produced by modifying frequencies,
loudness and pulse emission rates, while whether a proposed solution is accepted
depends on the quality of the solutions, controlled or characterized by the loudness
and pulse rate, which are in turn related to the closeness (fitness) of the
locations/solutions to the global optimal solution.
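The loudness and pulse-rate dynamics described above follow the standard BA update rules of Yang [36]: once a bat's new solution is accepted, its loudness A decays by a constant factor α, while its pulse emission rate r grows toward its initial value r0 at rate γ. A minimal sketch of these two bookkeeping updates (the numeric values are arbitrary):

```python
import math

def update_loudness_pulse(A, r0, alpha, gamma, t):
    """Standard BA updates after an accepted solution:
    A^{t+1} = alpha * A^t         (loudness decays geometrically)
    r^{t+1} = r0 * (1 - e^{-gamma t})  (pulse rate rises toward r0)"""
    A_new = alpha * A
    r_new = r0 * (1.0 - math.exp(-gamma * t))
    return A_new, r_new

A, r0 = 1.0, 0.9
for t in range(1, 6):  # five accepted iterations
    A, r = update_loudness_pulse(A, r0, alpha=0.9, gamma=0.9, t=t)
print(round(A, 5), round(r, 5))
```

As the text notes, the decaying loudness plays the same role as the temperature in a simulated-annealing cooling schedule: acceptance of worse moves becomes rarer as the search proceeds.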
The capability and robustness of the proposed algorithm make it suitable for solving
complex optimization problems such as non-convex ED problems. It can be expected
that this potentially powerful optimization meta-heuristic can easily be extended to
study multi-objective optimization applications with various constraints, including
NP-hard problems and mixed-integer programming. Further studies can concentrate
on the sensitivity and parameter studies and their possible relationships with the
convergence rate of the algorithm. In addition, hybridization with other popular
algorithms will also be potentially fruitful.

References
1. Park, J.B., Lee, K.S., Shin, J.R., Lee, K.Y.: A particle swarm optimization for economic dispatch with nonsmooth cost functions. IEEE Trans. Power Syst. 20(1), 34–42 (2005)
2. Abido, M.A.: Multiobjective evolutionary algorithms for electric power dispatch problem. IEEE Trans. Evol. Comput. 10(3), 315–329 (2006)
3. Wood, A.J., Wollenberg, B.F.: Power generation, operation and control. John Wiley & Sons, New York (1984)
4. Arrillaga, J., Arnold, C.P.: Computer analysis of power systems. John Wiley & Sons, Great Britain (1990)
5. IEEE Committee Report: Present practices in the economic operation of power systems. IEEE Trans. Power Appar. Syst. PAS-90, 1768–1775 (1971)
6. Chowdhury, B.H., Rahman, S.: A review of recent advances in economic dispatch. IEEE Trans. Power Syst. 5(4), 1248–1259 (1990)
7. Liang, Z.X., Glover, J.D.: A zoom feature for a dynamic programming solution to economic dispatch including transmission losses. IEEE Trans. Power Syst. 7(2), 544–550 (1992)
8. Granville, S.: Optimal reactive dispatch through interior point methods. IEEE Trans. Power Syst. 9(1), 136–146 (1994)
9. Lin, C.E., Viviani, G.L.: Hierarchical economic dispatch for piecewise quadratic cost functions. IEEE Trans. Power Appar. Syst. PAS-103(6), 1170–1175 (1984)
10. Chiang, C.-L.: Improved genetic algorithm for power economic dispatch of units with valve-point effects and multiple fuels. IEEE Trans. Power Syst. 20(4), 1690–1699 (2005)
11. Kumar, S., Naresh, R.: Non-convex economic load dispatch using an efficient real-coded genetic algorithm. Appl. Soft Comput. 9(1), 321–329 (2009)
12. Pothiya, S., Ngamroo, I., Kongprawechnon, W.: Application of multiple tabu search algorithm to solve dynamic economic dispatch considering generator constraints. Energy Convers. Manage. 49(4), 506–516 (2008)
13. Khamsawang, S., Jiriwibhakorn, S.: DSPSO–TSA for economic dispatch problem with nonsmooth and noncontinuous cost functions. Energy Convers. Manage. 51, 365–375 (2010)
14. Lee, K.Y., Sode-Yome, A., Park, J.H.: Adaptive Hopfield neural network for economic load dispatch. IEEE Trans. Power Syst. 13(2), 519–526 (1998)
15. Noman, N., Iba, H.: Differential evolution for economic load dispatch problems. Electr. Power Syst. Res. 78(8), 1322–1331 (2008)
16. Park, Y.M., Won, J.R., Park, J.B.: A new approach to economic load dispatch based on improved evolutionary programming. Eng. Intell. Syst. Electr. Eng. Commun. 6(2), 103–110 (1998)
17. Sinha, N., Chakrabarti, R., Chattopadhyay, P.K.: Evolutionary programming techniques for economic load dispatch. IEEE Trans. Evol. Comput. 7(1), 83–94 (2003)
18. Bhattacharya, A., Chattopadhyay, P.K.: Solving complex economic load dispatch problems using biogeography-based optimization. Expert Syst. Appl. 37, 3605–3615 (2010)
19. Pereira-Neto, A., Unsihuay, C., Saavedra, O.R.: Efficient evolutionary strategy optimization procedure to solve the nonconvex economic dispatch problem with generator constraints. IEE Proc. Gener. Transm. Distrib. 152(5), 653–660 (2005)
20. Gaing, Z.-L.: Particle swarm optimization to solving the economic dispatch considering the generator constraints. IEEE Trans. Power Syst. 18(3), 1187–1195 (2003)
21. Panigrahi, B.K., Pandi, V.R.: Bacterial foraging optimisation: Nelder–Mead hybrid algorithm for economic load dispatch. IET Gener. Transm. Distrib. 2, 556–565 (2008)
22. Jiejin, C., Xiaoqian, M., Lixiang, L., Haipeng, P.: Chaotic particle swarm optimization for economic dispatch considering the generator constraints. Energy Convers. Manage. 48, 645–653 (2007)
23. Kuo, C.-C.: A novel coding scheme for practical economic dispatch by modified particle swarm approach. IEEE Trans. Power Syst. 23, 1825–1835 (2008)

24. Selvakumar, A.I., Thanushkodi, K.: A new particle swarm optimization solution to nonconvex economic dispatch problems. IEEE Trans. Power Syst. 22(1), 42–51 (2007)
25. Neyestani, M., Farsangi, M.M., Nezamabadi-pour, H.: A modified particle swarm optimization for economic dispatch with non-smooth cost functions. Eng. Appl. Artif. Intel. 23(7), 1121–1126 (2010)
26. Ling, S.H., Iu, H.H.C., Chan, K.Y., Lam, H.K., Yeung, B.C.W., Leung, F.H.: Hybrid particle swarm optimization with wavelet mutation and its industrial applications. IEEE Trans. Syst. Man Cyb. 38(3), 743–763 (2008)
27. Vlachogiannis, J.G., Lee, K.Y.: Economic load dispatch – a comparative study on heuristic optimization techniques with an improved coordinated aggregation-based PSO. IEEE Trans. Power Syst. 24(2), 991–1001 (2009)
28. Sadat Hosseini, S.S., Gandomi, A.H.: Discussion of "Economic load dispatch – a comparative study on heuristic optimization techniques with an improved coordinated aggregation-based PSO". IEEE Trans. Power Syst. 25(1), 590 (2010)
29. Fesanghary, M., Ardehali, M.M.: A novel meta-heuristic optimization methodology for solving various types of economic dispatch problem. Energy 34, 757–766 (2009)
30. Yang, X.S., Gandomi, A.H.: Bat algorithm: a novel approach for global engineering optimization. Eng. Computation 29(5), 464–483 (2012)
31. Sangiamvibool, W., Pothiya, S., Ngamroo, I.: Multiple tabu search algorithm for economic dispatch problem considering valve-point effects. Electrical Power and Energy Systems 33(4), 846–854 (2011)
32. Subbaraj, P., Rengaraj, R., Salivahanan, S.: Enhancement of self-adaptive real coded genetic algorithm using Taguchi method for economic dispatch problem. Appl. Soft Comput. 11, 83–92 (2011)
33. Yang, X.S.: Bat algorithm for multi-objective optimization. Int. J. Bio-Inspired Computation 3(5), 267–274 (2011)
34. Fister, I., Fong, S., Brest, J., Fister, I.: A novel hybrid self-adaptive bat algorithm. The Scientific World Journal (2014)
35. Fister, I. Jr., Fong, S., Brest, J., Fister, I.: Towards the self-adaptation in the bat algorithm. In: Proceedings of the 13th IASTED International Conference on Artificial Intelligence and Applications (2014)
36. Yang, X.-S.: A new metaheuristic bat-inspired algorithm. In: González, J.R., Pelta, D.A., Cruz, C., Terrazas, G., Krasnogor, N. (eds.) NICSO 2010. Studies in Computational Intelligence, vol. 284, pp. 65–74. Springer, Heidelberg (2010)
37. Gandomi, A.H., Yang, X.S., Talatahari, S., Alavi, A.H.: Bat algorithm for constrained optimization tasks. Neural Comput. Appl. 22(6), 1239–1255 (2013)
38. Gandomi, A.H., Yang, X.S.: Chaotic bat algorithm. J. Comput. Sci. 5(2), 224–232 (2014)
39. Yang, X.S., He, X.: Bat algorithm: literature review and applications. Int. J. Bio-Inspired Computation 5(3), 141–149 (2013)
40. Biswal, S., Barisal, A.K., Behera, A., Prakash, T.: Optimal power dispatch using BAT algorithm. In: International Conference on Energy Efficient Technologies for Sustainability (ICEETS), pp. 1018–1023 (2013)
41. Sakthivel, S., Natarajan, R., Gurusamy, P.: Application of bat optimization algorithm for economic load dispatch considering valve point effects. Int. J. Comput. Appl. 67(11) (2013)
42. Panigrahi, B.K., Pandi, V.R., Das, S.: Adaptive particle swarm optimization approach for static and dynamic economic load dispatch. Energy Convers. Manage. 49(6), 1407–1415 (2008)
43. Wang, S.-K., Chiou, J.-P., Liu, C.-W.: Non-smooth/non-convex economic dispatch by a novel hybrid differential evolution algorithm. IET Gener. Transm. Distrib. 1(5), 793–803 (2007)

44. Walters, D.C., Sheble, G.B.: Genetic algorithm solution of economic dispatch with valve-point loading. IEEE Trans. Power Syst. 8(3), 1325–1332 (1993)
45. Ma, H., El-Keib, A.A., Smith, R.E.: A genetic algorithm-based approach to economic dispatch of power systems. In: IEEE Conf. (1994)
46. Fister, I. Jr., Fister, D., Fister, I.: Differential evolution strategies with random forest regression in the bat algorithm. In: Proceedings of the Fifteenth Annual Conference Companion on Genetic and Evolutionary Computation, pp. 1703–1706. ACM (2013)
47. Fister, I. Jr., Fister, D., Yang, X.S.: A hybrid bat algorithm. Electrotechnical Review 80(1–2), 1–7 (2013)
48. Lin, W.M., Cheng, F.S., Tsay, M.T.: An improved tabu search for economic dispatch with multiple minima. IEEE Trans. Power Syst. 17(1), 108–112 (2002)
49. Victoire, T.A.A., Jeyakumar, A.E.: Hybrid PSO–SQP for economic dispatch with valve-point effect. Electric Power Syst. Res. 71(1), 51–59 (2004)
50. Coelho, L.D.S., Mariani, V.C.: An improved harmony search algorithm for power economic load dispatch. Energy Convers. Manage. 50, 2522–2526 (2009)
51. Liu, D., Cai, Y.: Taguchi method for solving the economic dispatch problem with nonsmooth cost functions. IET Gener. Transm. Distrib. 1(5), 793–803 (2007)
52. Ling, S.H., Leung, F.H.F.: An improved genetic algorithm with average-bound crossover and wavelet mutation operation. Soft Comput. 11(1), 7–31 (2007)
53. Chiang, C.-L.: Genetic-based algorithm for power economic load dispatch. IET Gener. Transm. Distrib. 1(2), 261–269 (2007)
54. Coelho, L.D.S., Mariani, V.C.: Correction to "Combining of chaotic differential evolution and quadratic programming for economic dispatch optimization with valve-point effect". IEEE Trans. Power Syst. 21(3), 1465 (2006)
55. Coelho, L.D.S., Mariani, V.C.: Combining of chaotic differential evolution and quadratic programming for economic dispatch optimization with valve-point effect. IEEE Trans. Power Syst. 21(2), 989–996 (2006)
56. Selvakumar, A.I., Thanushkodi, K.: Anti-predatory particle swarm optimization: solution to nonconvex economic dispatch problems. Electric Power Syst. Res. 78, 2–10 (2008)
57. Chaturvedi, K.T., Pandit, M., Srivastava, L.: Self-organizing hierarchical particle swarm optimization for nonconvex economic dispatch. IEEE Trans. Power Syst. 23(3), 1079–1087 (2008)
58. Al-Sumait, J.S., Sykulski, J.K., Al-Othman, A.K.: A hybrid GA–PS–SQP method to solve power system valve-point economic dispatch problems. Appl. Energ. 87, 1773–1781 (2010)
59. Al-Sumait, J.S., Al-Othman, A.K., Sykulski, J.K.: Application of pattern search method to power system valve-point economic load dispatch. Electr. Power Energy Syst. 29(10), 720–730 (2007)
60. Yang, X.S., Sadat Hosseini, S.S., Gandomi, A.H.: Firefly algorithm for solving nonconvex economic dispatch problems with valve loading effect. Appl. Soft Comput. 12, 1180–1186 (2012)
61. Amjady, N., Sharifzadeh, H.: Solution of non-convex economic dispatch problem considering valve loading effect by a new modified differential evolution algorithm. Int. J. Elec. Power 32(8), 893–903 (2010)
Part III
Hybridization in Computational
Intelligence
Hybrid Artificial Neural Network
for Fire Analysis of Steel Frames

Tomaž Hozjan¹,*, Goran Turk¹, and Iztok Fister²

¹ University of Ljubljana, Faculty of Civil and Geodetic Engineering,
Jamova cesta 2, p.p. 3422, 1000 Ljubljana, Slovenia
{tomaz.hozjan,goran.turk}@uni-lj.si
² University of Maribor, Faculty of Electrical Engineering and Computer Science,
Smetanova ul. 17, 2000 Maribor, Slovenia
iztok.fister@um.si

Abstract. Tuning the parameters of artificial neural networks (ANN) is
a very complex task that typically demands a lot of experimental work
performed by developers. In order to avoid this hard work, the automatic
tuning of these parameters is proposed. A real-coded genetic algorithm
(GA) was developed for this purpose. This so-called meta-GA algorithm
acts as a meta-heuristic that searches for the optimal values of the ANN
parameters using the genetic operators of crossover and mutation, and
evaluates the quality of solutions obtained by applying the ANN to the fire
analysis of steel frames. As a matter of fact, steel heated to temperatures
between 250 °C and 600 °C exhibits very unusual wavy behavior, which is
very difficult to model by closed-form empirical models.
Therefore, the use of an ANN was one of the possible solutions, and it proved
to be very promising. However, the results of this ANN with manual
parameter setting by an expert can be significantly improved by using
the meta-GA to automatically search for the optimal parameter setting of
the original ANN algorithm.

Keywords: genetic algorithms, artificial neural networks, tuning
parameters, steel beam, fire analysis.

1 Introduction
In the present analysis, an alternative approach to modeling the mechanical
behavior of material exposed to high temperatures, as expected in fires, is
presented. Based on a series of stress-strain curves obtained experimentally for
various temperature levels [1], an artificial neural network (ANN) is employed
in the material modeling of steel. In general, steel at high temperatures behaves
very non-linearly. In order to avoid the ambiguities described above, we have
recently witnessed various attempts to capture the inelastic part of the material
model just by a proper set of time-independent stress-strain curves at various
temperatures involving both plastic and viscous effects. Characteristic samples

* Corresponding author.

© Springer International Publishing Switzerland 2015 149


I. Fister and I. Fister Jr. (eds.), Adaptation and Hybridization in Computational Intelligence,
Adaptation, Learning, and Optimization 18, DOI: 10.1007/978-3-319-14400-9_7
150 T. Hozjan, G. Turk, and I. Fister

are the material models proposed by Eurocode 3 [2] and BS5950 [3], wherein the
parameters of temperature-dependent bilinear diagrams with elliptic intermediate
part are given. As has been stated by Huang and Tan [6], the heating rate
and the duration of elevated temperatures have considerable influence on the
development of strains and stresses over the structure. Consequently, the time-
independent material models are only suitable in cases when the temperature
of the steel does not exceed 450 °C. In a real fire, such a temperature regime can only
be expected (i) at heat-protected structures, when the exposure to the highest
temperatures does not last too long, or (ii) at a very low stock of combustion
material, not allowing the fire to develop to its full extent. In other, more realistic
cases, the experimental data obtained at a certain temperature-time curve, like
ISO 834, or at a constant heating rate are only of limited applicability.
A relatively large set of results of uniaxial tests on structural steel at a constant
specimen heating rate of 10 °C/min has been provided by Kirby and Preston [1]
for two steel qualities (grades 43A and 50B). The data are given in tabular form
as two series of stress-strain pairs at various temperature levels for strains
up to 2% in the temperature interval from 20 °C to 900 °C. In the temperature
interval from 250 °C to 600 °C, the irregular wavy shapes of the stress-strain
curves cannot be approximated by a bilinear model with either elliptic
or parabolic intermediate part. Therefore, the idea of employing an artificial
neural network (ANN) was introduced in order to describe the stress-strain-
temperature relations. Some difficulties appearing in the course of modeling
the properties of steel at elevated temperatures by the ANN have been solved
by combining the ANN results with linear regression in the linear elastic range
and with linear extrapolation in the hardening range.
The quality of the solutions obtained by the ANN depends on the setting of the ANN
parameters. However, the proper setting of the parameters is not known in
advance. Normally, it can only be found manually through extensive experimental
work. In order to avoid this time-consuming process, a genetic algorithm (GA) is
proposed for performing the tuning. The GA at the higher level acts
as a meta-heuristic that controls the parameters of the ANN acting at the lower level,
i.e., solving the problem. Thus, the optimal parameter setting is searched for,
and this GA is therefore also named a meta-GA.
The parameters of the ANN for fire analysis of steel frames are coupled into
a representation of candidate solutions within the meta-GA. These solutions
undergo the actions of crossover and mutation. The quality of a solution is evaluated
by mapping the parameters from its representation in the meta-GA to the ANN
and obtaining the error rate of the ANN algorithm run with the proposed
parameter setting. As a matter of fact, tuning the parameters with the meta-GA
significantly improved on the results of the manually tuned ANN reported by
Hozjan et al. in [16].
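The two-level scheme just described can be sketched as follows. Here `train_ann` is a hypothetical stand-in that maps a parameter vector (say, learning rate and number of hidden neurons) to a training error; a real meta-GA would train the network and return the error of Eq. (3). The operators shown (binary tournament selection, arithmetic crossover, uniform mutation) are one plausible real-coded variant, not necessarily the authors' exact choices:

```python
import random

def train_ann(params):
    """Hypothetical lower-level evaluation: maps [learning_rate,
    hidden_neurons] to an error. A real meta-GA would train the ANN
    here and return its training/testing error."""
    lr, hidden = params
    return (lr - 0.05) ** 2 + (hidden - 12.0) ** 2 / 100.0

def meta_ga(bounds, pop_size=20, generations=50, pm=0.1, seed=1):
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(generations):
        children = []
        for _ in range(pop_size):
            # binary tournament selection of two parents
            a = min(rng.sample(pop, 2), key=train_ann)
            b = min(rng.sample(pop, 2), key=train_ann)
            # arithmetic (blend) crossover of the real-coded parents
            w = rng.random()
            child = [w * x + (1.0 - w) * y for x, y in zip(a, b)]
            # uniform mutation within the parameter bounds
            for i, (lo, hi) in enumerate(bounds):
                if rng.random() < pm:
                    child[i] = rng.uniform(lo, hi)
            children.append(child)
        # elitist survival: keep the best pop_size of parents + children
        pop = sorted(pop + children, key=train_ann)[:pop_size]
    return pop[0]

best = meta_ga([(0.001, 0.5), (2.0, 30.0)])
print(best, train_ann(best))
```

Each fitness evaluation at the meta level corresponds to a full training run of the lower-level ANN, which is why such meta-optimization is computationally expensive.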
The structure of the chapter is as follows. Section 2 provides background
information: the fire analysis of steel frames is described and the theory of
ANNs and GAs is introduced. In Section 3, the proposed meta-GA algorithm is
presented in detail. Section 4 deals with experiments and results. The chapter
Hybrid ANN for Fire Analysis of Steel Frames 151

concludes with an overview of the performed work and an outline of possible
directions for further work.

2 Background Information
2.1 Fire Analysis of Steel Frames
In technical codes, the term fire resistance corresponds to the experimentally
verified endurance of individual structural elements or minor frames with regard
to a standardized heating mode in a test furnace. Even accurate results of
experiments in a test furnace do not provide an adequate explanation of the global
behavior of a structure as a whole in a real fire. Generally, the fire resistance of
whole structures is considerably greater than that of the individual structural
elements. Full-scale tests of structures involve high cost; numerical models, on the
other hand, offer a noticeably cheaper way to determine the fire resistance of a
structure, yet they have to be efficient and accurate enough to perform such a
complex analysis. Therefore, experimental data about the thermo-mechanical
properties of materials and structural elements are the necessary basis for the
development of any computational analysis, and their required reliability increases
with the efficiency and accuracy of the available computing tools.
Current numerical models for the mechanical analysis of structures exposed to fire
are mostly based on the finite element method with various levels of non-linearity
and are combined with non-linear material models calibrated with regard to
experimental results. However, there are several uncertainties related to the
influence of temperature gradients and the development of plastic and viscous
strains of steel at temperatures above 400 °C.
One of the most important components in the accuracy of a numerical model
for the fire analysis of steel frames is the material model. Here, an alternative
approach of constructing the material model of steel at high temperatures with the
help of an ANN, based on available experimental data [1], is presented in short.
A full description of this application is given in [16].

2.2 Artificial Neural Networks


The motivation and idea for the early development of artificial neural networks
(ANNs) come from studying the structure and processes of the human brain, which
is in several aspects similar to the ANN. The structure of an ANN is at the basic
level very similar: both have units called neurons, which are interconnected. Just as
the human brain first has to be taught to work efficiently, the ANN also has to be
taught, or trained. There are two types of learning procedures: (i) supervised
learning, in which questions and answers are known and the ANN has to learn the
correct answers; and (ii) unsupervised learning, where the answers are not known.
The structure of an ANN is such that neurons (simple units), which operate locally,
are connected by connections (weights), which may reduce or amplify the signal
from one unit to another. Each unit receives signals from other units, processes
them, and transmits them to other units.

Several types of ANN geometry can be found. A review of different ANNs
is given in several papers, books, and Internet sites (e.g. [4,5]). The multi-layer
feed-forward network is usually chosen if a functional approximation is sought.
Since our aim is to approximate the stress-strain relation of steel at elevated
temperatures, the multi-layer feed-forward network, trained by supervised
learning, is chosen.
In structural engineering, many successful applications of ANNs can be
found. For example, ANNs have been used for the modeling of fatigue crack growth
[7,8], the modeling of the mechanical properties of steels [9], the modeling
of the load-carrying capacity of steel structures [10], the modeling of confined
reinforced columns [11,12], the modeling of steel column strength under fire [13],
and many other interesting applications [14,15,28].

Multi-layer Feed-forward Network. In the multi-layer feed-forward neural
network, input units are connected to the first layer of hidden units, which are
then further connected to the units of the second hidden layer, and so on. Units
of the last hidden layer are connected to the output units. The geometry of
the multi-layer feed-forward network is shown in Figure 1. This type of neural
network is usually employed for the approximation of an unknown functional
relation.
The input and output units represent the input and output data, respectively.
The hidden layers and all the connections between the units may be considered
as a black box which performs the necessary transformations of the input data
to reach the target output data.
Each unit in the network is represented by its value $y_i^k$, and each connection
between units is represented by its weight $w_{ij}^k$. In Eq. (1), the index $i$
corresponds to the unit number of the $k$-th layer, while the index $j$ corresponds
to the unit number of the $(k-1)$-th layer. The input layer is denoted by zero and
the output layer by $n_l$ (Fig. 1). The signal travels in one direction only, i.e.
from the input layer towards the output layer. The value of a unit $y_j^{k-1}$ is
multiplied by the corresponding weight $w_{ij}^k$ and added to the value of the
signal in the unit of the next layer. In addition, the value of the bias neuron, or
threshold $\theta_i^k$, is added:

$$y_i^k = f\left( \sum_{j=1}^{n_{k-1}} w_{ij}^k \, y_j^{k-1} + \theta_i^k \right). \quad (1)$$

This equation is illustrated in Figure 2. The activation function $f(\cdot)$ enables the
modeling of an arbitrary non-linear relation between input and output variables.
Different functions can be used as an activation function, while the usual choice for
the activation function is the sigmoid function

Fig. 1. Multi-layer feed-forward artificial neural network

$$f(y) = \frac{1}{1 + e^{-y}}, \quad (2)$$

$\tanh y$, or a Gaussian (Figure 3). The behavior of the neural network also depends
on the values of the weights $w_{ij}^k$ and thresholds $\theta_i^k$, which have to be determined
by the learning (training) procedure. A set of known input and output values
is termed an input-output pair. All input-output pairs are often divided into
two sets. With the so-called learning or training set, the connection weights $w_{ij}^k$ and
thresholds $\theta_i^k$ are determined. Once the learning procedure ends, i.e. once
the neural network performs adequately for all input-output pairs in the learning
set, the neural network is assessed on the testing set of data.
The training procedure can in some cases be ill-conditioned if the input and/or
output data are not normalized (see e.g. [5]). Therefore, the values of the input
and output units have to be normalized. The normalization of the values of the
output units depends on the range of the activation function. Usually, a linear
transformation works well, although sometimes a non-linear transformation may
help if the data are clustered.
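The forward pass of Eqs. (1) and (2) can be sketched directly. This is a generic one-hidden-layer example with arbitrary weights and thresholds, not the network trained in [16]:

```python
import math

def sigmoid(y):
    """Eq. (2): f(y) = 1 / (1 + e^{-y})."""
    return 1.0 / (1.0 + math.exp(-y))

def layer(inputs, weights, thresholds):
    """Eq. (1): y_i^k = f( sum_j w_ij^k * y_j^{k-1} + theta_i^k )."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + th)
            for row, th in zip(weights, thresholds)]

def forward(x, network):
    """Propagate the input signal layer by layer, input to output."""
    for weights, thresholds in network:
        x = layer(x, weights, thresholds)
    return x

# Arbitrary 2-3-1 network: 2 inputs, 3 hidden neurons, 1 output
network = [
    ([[0.5, -0.2], [0.1, 0.8], [-0.7, 0.3]], [0.0, -0.1, 0.2]),  # hidden layer
    ([[1.0, -1.0, 0.5]], [0.1]),                                 # output layer
]
print(forward([0.6, 0.4], network))
```

Because the sigmoid maps into (0, 1), both the inputs and the targets must be normalized as the text describes before such a network can be trained.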

Fig. 2. Scheme of a neuron

Fig. 3. Different types of activation functions

The supervised learning is in fact a general optimization problem in which
the minimum of the error $E_p$ is sought:

$$E_p = \frac{1}{n_o} \sum_{i=1}^{n_o} \left( t_{pi} - y_{pi}^{n_l} \right)^2, \quad (3)$$

where $t_{pi}$ are the target output values, $y_{pi}^{n_l}$ are the values of the neurons in the
output layer $n_l$, i.e. the output values evaluated by the ANN, and $n_o$ is the number of
neurons in the output layer, i.e. the number of output variables. The performance
of the learning procedure of the ANN can also be expressed as

$$r^2 = 1.0 - \frac{\sum_{i=1}^{n_o} \left( t_{pi} - y_{pi}^{n_l} \right)^2}{\sum_{i=1}^{n_o} \left( t_{pi} - \bar{y}_{pi}^{n_l} \right)^2}, \quad (4)$$

where $r^2$ denotes a coefficient of correlation and $\bar{y}_{pi}^{n_l}$ is the mean of the $y_{pi}^{n_l}$ values.
The closer this coefficient is to the value 1.0, the higher is the performance of the
learning procedure of the ANN. As evident from Eqs. (3) and (4), both performance
measures $E_p$ and $r^2$ are correlated with each other.
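The two performance measures of Eqs. (3) and (4) can be computed directly. Note that, as the chapter defines it, the denominator of $r^2$ sums the squared deviations of the targets from the mean network output:

```python
def ep(targets, outputs):
    """Eq. (3): mean squared error over the n_o output values."""
    n_o = len(targets)
    return sum((t - y) ** 2 for t, y in zip(targets, outputs)) / n_o

def r_squared(targets, outputs):
    """Eq. (4): 1 - SSE / sum of squared deviations of the targets
    from the mean network output (as defined in the text)."""
    y_bar = sum(outputs) / len(outputs)
    sse = sum((t - y) ** 2 for t, y in zip(targets, outputs))
    sst = sum((t - y_bar) ** 2 for t in targets)
    return 1.0 - sse / sst

# Small illustrative target/output pair (arbitrary values)
targets = [1.0, 2.0, 3.0, 4.0]
outputs = [1.1, 1.9, 3.2, 3.8]
print(ep(targets, outputs), r_squared(targets, outputs))  # 0.025 and 0.98
```

As the text observes, a smaller $E_p$ goes hand in hand with an $r^2$ closer to 1.0, so either can be used to monitor the training.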

The learning procedure of the ANN mesh is numerically very demanding, since a
large number of local minima usually exists. For this reason, two essentially different
approaches can be used: one is the error back-propagation algorithm, which is
basically a gradient method, and the other is the genetic algorithm, which is in fact a
stochastic search.
The parameters of the optimal neural network, i.e. the number of hidden layers and
the number of hidden neurons, are problem dependent. If the number of
units is very large, the learning procedure may be very slow, since each forward
calculation takes a substantial computational effort. Although larger networks
are usually able to learn the sought relationship, this may sometimes be a
drawback: a large network may easily reproduce the training set of input-output
pairs but fail to generalize, yielding poor testing performance. Networks
with insufficient units may have problems learning properly during the learning
procedure.

2.3 Genetic Algorithms

Genetic Algorithms (GAs) belong to the family of evolutionary algorithms (EAs)
that have their origins in two mechanisms of Darwin's theory, i.e., natural selection
and genetics. Usually, these algorithms are population based [25,26,27]. This
means that they maintain a population of candidate solutions during the evolu-
tionary search. The solutions undergo the actions of mutation and crossover
during each generation, while the selection operator selects the solutions that
survive according to their fitness [17].
In general, there are three kinds of search methods, i.e., calculus-based,
enumerative, and random search [18]. Calculus-based methods search for solutions
according to an objective function, improving the current solution by hill climbing.
Thus, the best solution found depends on the starting position. Enumerative
search methods inspect every point in the search space. The optimal solution is
certainly found, but this search is, in general, ineffective. Stochastic random
search tries to overcome the deficiencies of the calculus-based and enumerative
search methods. However, there is a difference between stochastic random search
and pure random search, because stochastic random search takes the history of the
explored information into account, while pure random search explores the search
space each time anew.
GA consists of the following components:

representation of individuals,
crossover,
mutation,
parent selection,
survivor selection.

Additionally, an initialization and a termination condition are used in the GA in
order to complete the running cycle of the algorithm. In the remainder of this
chapter, the mentioned components of the GA are presented in detail.

Representation of Individuals. The GA is a population-based stochastic
search algorithm using either a binary or a real-coded representation of individuals
(i.e., solutions). In the first case, each element of the solution vector represents a
Boolean value (a gene), while in the second case, the elements are real-valued
problem variables. This chapter focuses on real-coded GAs, which are suitable
for continuous optimization problems, where the domains
of the variables are not discrete.

Crossover. Crossover operators in real-coded GAs build two offspring from two
parents. They are applied to a population of solutions according to the crossover
probability control parameter pc. Deb [19] described several crossover opera-
tors that can be applied in real-coded GAs. This chapter focuses on the
blend (BLX) and simulated binary (SBX) crossovers, which are referred to in the
remainder of the chapter.

Blend Crossover. Blend crossover was proposed by Eshelman and Schaffer in
1993 [21]. This operator selects two parents from a population of solutions and
randomly picks each element of the new solution from the range
[x_ik^(t) − α(x_jk^(t) − x_ik^(t)), x_jk^(t) + α(x_jk^(t) − x_ik^(t))],
where it is assumed that x_ik^(t) < x_jk^(t) (Fig. 4).

Fig. 4. BLX crossover

Offspring are obtained according to the equation:

    x_ik^(t+1) = (1 − γ_k) x_ik^(t) + γ_k x_jk^(t),   (5)

where γ_k = (1 + 2α) U(0,1) − α and U(0,1) denotes a random number drawn from the
uniform distribution on the interval [0,1]. If α = 0, the BLX crossover generates
a random value in the interval (x_ik^(t), x_jk^(t)). A value of α = 0.5 is typical for this
operator.
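A sketch of this operator in Python, producing one child per call (variable names are illustrative, not from the chapter):

```python
import random

def blx_crossover(parent1, parent2, alpha=0.5):
    """BLX-alpha crossover (Eq. 5): each child element is a random
    blend of the parent elements, extended by alpha on both sides."""
    child = []
    for x1, x2 in zip(parent1, parent2):
        # gamma in [-alpha, 1 + alpha]; gamma in [0, 1] would stay
        # strictly between the two parent values
        gamma = (1.0 + 2.0 * alpha) * random.random() - alpha
        child.append((1.0 - gamma) * x1 + gamma * x2)
    return child
```

With alpha = 0 every child element falls between the two parent values, as stated above.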

Simulated Binary Crossover. Simulated binary crossover (SBX) was developed
by Deb and his students [22,23]. It simulates the working principle of the single-
point crossover operator on binary strings and generates two offspring, x_i^(t+1)
and x_j^(t+1), from two parents, x_i^(t) and x_j^(t). This generation is described as follows.

First, a random number u_k is drawn from the uniform distribution on the interval
[0,1]. Then, the value β_qk is calculated according to the equation

    β_qk = (2u_k)^(1/(n_c+1)),             if u_k ≤ 0.5,
           (1/(2(1 − u_k)))^(1/(n_c+1)),   otherwise,      (6)

which depends on the distribution index n_c. When the value of this parameter is
large, values near the parents are expected to be generated with a higher probabil-
ity. In contrast, values more distant from the original parent values are expected
to be generated when the value of n_c is smaller.
Finally, the offspring are calculated as follows:

    x_ik^(t+1) = 0.5 [(1 + β_qk) x_ik^(t) + (1 − β_qk) x_jk^(t)],
    x_jk^(t+1) = 0.5 [(1 − β_qk) x_ik^(t) + (1 + β_qk) x_jk^(t)].   (7)

Note that the generated offspring are symmetric with respect to the parent solutions.
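Eqs. (6) and (7) can be sketched element-wise as follows (assuming the usual per-element application; names are illustrative):

```python
import random

def sbx_crossover(parent1, parent2, n_c=2.0):
    """Simulated binary crossover (Eqs. 6-7)."""
    child1, child2 = [], []
    for x1, x2 in zip(parent1, parent2):
        u = random.random()
        if u <= 0.5:
            beta = (2.0 * u) ** (1.0 / (n_c + 1.0))
        else:
            beta = (1.0 / (2.0 * (1.0 - u))) ** (1.0 / (n_c + 1.0))
        # the two children are symmetric about the parents' midpoint
        child1.append(0.5 * ((1 + beta) * x1 + (1 - beta) * x2))
        child2.append(0.5 * ((1 - beta) * x1 + (1 + beta) * x2))
    return child1, child2
```

The symmetry noted in the text is visible in the code: for every element, child1 + child2 equals parent1 + parent2.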

Mutation. Mutation is a unary operator that uses only one parent and creates
one offspring by applying some kind of randomized change to the solution [24].
In real-coded GAs, two mutations in particular are employed:
random mutation,
polynomial mutation.
The mutation operator is applied to the parent solution according to the mutation
probability control parameter pm. In the remainder of the chapter, these mutation
operators are described in detail.

Random Mutation. Random mutation is the simplest mutation scheme, where
each element of the solution is drawn randomly from the domain of feasible values
of the specific problem variable [17] according to the equation

    x_ij^(t+1) = x_j^(Lb) + (x_j^(Ub) − x_j^(Lb)) U(0,1),   (8)

where x_j^(Ub) and x_j^(Lb) denote the upper and lower bounds of the observed element
of the solution, and U(0,1) is a random variable drawn from the uniform distribution on the
interval [0,1]. This operator is equivalent to a random initialization, because it is
independent of the parent solution. In order to take the parent solution into account,
random mutation can instead select the new value of an element in the neighborhood
of the original element using a uniform probability distribution according to the
equation

    x_ij^(t+1) = x_ij^(t) + (U(0,1) − 0.5) Δ_j,   (9)

where Δ_j is the maximal allowed perturbation of the j-th problem variable.
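Both variants can be sketched in one function (the flag and parameter names are ours; clipping to the variable bounds after Eq. 9 is an added assumption):

```python
import random

def random_mutation(x, lower, upper, delta, p_m=0.1, neighborhood=True):
    """Random mutation: either re-draw an element anywhere in its
    domain (Eq. 8) or perturb it around the parent value (Eq. 9)."""
    child = list(x)
    for j in range(len(child)):
        if random.random() < p_m:
            if neighborhood:
                # Eq. 9: uniform perturbation of at most delta[j]/2
                child[j] += (random.random() - 0.5) * delta[j]
                child[j] = min(max(child[j], lower[j]), upper[j])
            else:
                # Eq. 8: re-initialization within the variable domain
                child[j] = lower[j] + (upper[j] - lower[j]) * random.random()
    return child
```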

Polynomial Mutation. Similarly to the SBX operator, the normal distribution can
be replaced by a polynomial function [19] according to the equation

    x_ij^(t+1) = x_ij^(t) + (x_j^(Ub) − x_j^(Lb)) δ_j,   (10)

where the parameter δ_j is calculated from the probability distribution P(δ) =
0.5 (n_m + 1)(1 − |δ|)^(n_m) according to the equation

    δ_j = (2U(0,1))^(1/(n_m+1)) − 1,          if U(0,1) < 0.5,
          1 − (2(1 − U(0,1)))^(1/(n_m+1)),    if U(0,1) ≥ 0.5.   (11)

Note that this probability distribution acts similarly to a uniform distribution in the
early generations, while in the later generations it focuses the search directly on the
neighborhood of the parent solution. It is controlled by the distribution
index n_m, typically set as n_m = 0.5.
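Eqs. (10) and (11) can be sketched as follows (clipping to the variable bounds is an added assumption; the default n_m = 20 here is a common choice in the wider literature, whereas the chapter uses n_m = 0.5):

```python
import random

def polynomial_mutation(x, lower, upper, n_m=20.0, p_m=0.1):
    """Polynomial mutation (Eqs. 10-11): delta is in [-1, 1] and
    concentrates near 0 for large distribution index n_m."""
    child = list(x)
    for j in range(len(child)):
        if random.random() < p_m:
            u = random.random()
            if u < 0.5:
                delta = (2.0 * u) ** (1.0 / (n_m + 1.0)) - 1.0
            else:
                delta = 1.0 - (2.0 * (1.0 - u)) ** (1.0 / (n_m + 1.0))
            child[j] += (upper[j] - lower[j]) * delta
            child[j] = min(max(child[j], lower[j]), upper[j])
    return child
```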

Parent Selection. The primary task of a parent selection operator is to multiply
the fitter solutions in a population and to eliminate the worse ones. This can be
achieved by an operator with the following characteristics [19]:
searching for the fitter solutions in the population,
increasing the number of copies of the fitter solutions,
eliminating the worse solutions in the population by incorporating the increas-
ing number of fitter solutions into the population.
There are many ways in which this can be achieved. As a result, different
parent selection operators have been proposed for GAs, e.g., tournament selection,
roulette-wheel selection, and stochastic universal sampling [20]. In tournament
selection, two parents in a population play a tournament. The better of the two
solutions is the winner of the tournament and therefore
becomes a parent. On average, each solution plays two tournaments in each gen-
eration. As a result, a better solution can win twice and thus occupy two
places in the new population, while the worse solution loses and is eliminated
from the population.
Roulette-wheel selection assigns a number of copies to each solution pro-
portional to its fitness [19]. Let us assume that the average fitness of the population is
denoted f_avg and the fitness value of the i-th solution f_i. Then, the expected
number of copies is expressed as f_i/f_avg. In this way, the selection operator
operates like a roulette wheel, where the wheel is divided into n sections
corresponding to the population size, while their surface areas are proportional
to the fitnesses of the solutions. The roulette wheel is then spun n times.
The classical roulette-wheel selection demands spinning the wheel n
times, where n denotes the number of individuals in the population. When
stochastic universal sampling is used, only the first wheel position r needs to be
determined randomly, while the others are calculated according to the sequence:

    R = {r, r + 1/n, r + 2/n, . . . , r + (n − 1)/n} mod 1.   (12)



Fig. 5. Roulette-wheel parent selection

The roulette wheel in this selection is also divided into n sections. The posi-
tion of the first solution is determined by a random number r, while the positions
of all other solutions are determined by moving in identical steps of 1/n along the
periphery of the roulette wheel.
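Stochastic universal sampling, following Eq. (12), can be sketched as follows (a minimal version assuming non-negative fitnesses; names are ours):

```python
import random

def stochastic_universal_sampling(fitnesses, n):
    """Select n parent indices with a single spin and n equally
    spaced pointers along the roulette wheel (Eq. 12)."""
    total = float(sum(fitnesses))
    r = random.random() / n                       # first pointer in [0, 1/n)
    pointers = [r + i / float(n) for i in range(n)]
    selected, cum, i = [], 0.0, 0
    for p in pointers:
        # advance along the wheel until individual i's share covers p
        while i < len(fitnesses) - 1 and cum + fitnesses[i] / total <= p:
            cum += fitnesses[i] / total
            i += 1
        selected.append(i)
    return selected
```

Because the pointers are equally spaced, a solution with expected copy count f_i/f_avg receives either the floor or the ceiling of that number, unlike repeated independent roulette-wheel spins.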

Survivor Selection. In general, real-coded GAs use generational sur-
vivor selection, where the best n solutions according to their fitness values are
selected from among the n parents and n offspring for the next generation. Obviously,
this model is also elitist, because the best parents are preserved into the
next evolutionary cycle. On the other hand, the worse solutions are eliminated
from the population.

3 The Meta-GA Algorithm

Nature-inspired algorithms depend on the proper setting of their control pa-
rameters. Typically, these parameter settings are found during extensive
experimental work, in which the best values of the parameters are explored during
the so-called tuning process [24]. Mostly, this process is performed by the developers
of these algorithms. In order to ease this hard work, automatic parameter
tuning can be used, where some meta-heuristic algorithm
controls the parameters of the tuned algorithm. This idea led us to
the development of the meta-GA for automatic parameter control of the ANN
for fire analysis of steel frames.
There are several control parameters that define the topology and behavior of the
ANN. The topology of the ANN is described by the number of input neurons, the number
of output neurons, and the number of hidden layers. While the numbers of input
and output neurons are determined by the user, the number of hidden layers can be
varied and is therefore subject to the tuning process. The same is also true for
the number of neurons in each hidden layer. On the other hand, some of the
behavior parameters are user dependent, like the error limit that determines how
well the ANN must be learned. Other parameters can be varied and therefore
also undergo the tuning process; for instance, the step size, the range of starting
weights, and the learning rate ratio belong to this class.
The meaning of these parameters is as follows: the step size controls the
convergence speed of the back-propagation network, the range of starting
weights determines the range of the initial weights that are tuned by the ANN
learning algorithm, and the learning rate ratio determines the size of the
weight and bias changes during learning.
The pseudo-code of the real-coded meta-GA controlling the parameters of the ANN
for fire analysis of steel frames is illustrated in Algorithm 1.

Algorithm 1. The proposed meta-GA algorithm


1: INITIALIZE the GA population
2: EVALUATE new candidate solutions
3: while not termination condition satisfied do
4: SELECT parents
5: CROSSOVER pair of parents
6: MUTATE offspring
7: EVALUATE new candidate solutions
8: SELECT individuals for the next generation
9: end while

As can be seen from Algorithm 1, two parents are selected to enter the crossover
operation in each generation. The obtained offspring undergo the mutation
operator. Then, the quality of each offspring is calculated using the
evaluation function. Finally, the best solutions survive and reproduce their best
characteristics in the next generation. In addition, two features are defined in
order to complete the evolutionary cycle of the meta-GA algorithm, i.e., initial-
ization and the termination condition. The former initializes the individuals of the
starting population randomly, while the latter terminates the evolutionary cycle
when the termination condition is satisfied.
The chromosome of the meta-GA consists of eight genes whose meanings
are presented in Table 1. In the optimization procedure, up to a maximum of four
hidden layers of the ANN were used in the analysis.
The corresponding topology in Table 1 is described by the geometry 2-
50-50-1, which defines an ANN with two hidden layers consisting of 50 neurons each.
The number of input neurons defined by the user is two, while there is only one
output neuron. The evaluation function of the meta-GA is obtained by
launching the ANN algorithm with the corresponding parameters mapped from
the representation of the solution and reading the error rate obtained after the
ANN has finished. The task of the meta-GA is to minimize the error rate E_p.

Table 1. Parameters and their domains in meta-GA

Parameter Domain Default value


step size [0.001, 5.0] 0.001
range of starting weights [0.001, 1.0] 0.1
learning range ratio [0.1, 10.0] 5.0
number of hidden layers [1, 4] 2
number of neurons - layer 1 [10, 200] 50
number of neurons - layer 2 [10, 200] 50
number of neurons - layer 3 [10, 200] 0
number of neurons - layer 4 [10, 200] 0

4 Experiments and Results

This study extends the paper of Hozjan et al. [16], which proposed an ANN for
modeling the mechanical behavior of steel frames exposed to fire. Thus, the
experiments were conducted on the experimental data obtained in [1]. In the
experiments, the ANN was taught to estimate the stress σ, while the strain ε and
the temperature T were used as input data. The calculation was carried out for the steel
strength f_y = 35.5 kN/cm².
The goal of this study was to show that the manual setting of the ANN param-
eters used in [16], done by eye, was far from optimal. In contrast, the parameter
setting tuned by the meta-GA can be near-optimal. Consequently, the results of
material modeling with an ANN using these parameters can improve on the results of
the material modeling performed by the ANN with the manual parameter
setting. During the experiments, the allowed relative error was set to 0.05, which
is a relatively low value in an ANN training procedure.
The parameters of the meta-GA were set as illustrated in Table 2. Note that
the number of fitness function evaluations was FEs = 100 × 10 = 1,000 in
each run of the meta-GA. On the other hand, the number of independent runs,
within which the best parameter setting was searched for, was limited to 500. The SBX
crossover and polynomial mutation operators were applied in the tests, because

Table 2. Parameters of meta-GA as used during the tests

Parameter Value
number of generations 100
population size 10
probability of crossover pc 0.9
probability of mutation pm 0.01
selection type tournament of size 2
crossover type SBX crossover
mutation type polynomial mutation
number of runs 500

they produced the best offspring, as shown during the extensive experimental
work. The same holds for the other parameters in Table 1.
This section is divided into three subsections. In the first subsection, the test
suite is described. In the second subsection, the search for the optimal parameter
setting of the ANN is discussed, while in the third subsection, comparisons of the
material models obtained with different parameter settings are presented.

4.1 Test Suite

The total number of input-output pairs as presented in [1] was 527, of which
435 randomly selected pairs were used for learning and the remaining 92 were
used as testing pairs. The input-output data of the test and learning sets were
assembled in such a way that the dispersions of the learning and testing data are
as similar as possible (Table 3).

Table 3. Statistics of input-output data for learning and testing pairs

set of input-output learning data, 435 pairs


input data output data
Temperature [°C] Strain [%] Stress [MPa]
min 20 0 0
max 800 2 355
average 399.26 0.498 166.49
s. deviation 242.83 0.565 126.64
median 400 0.2 144.5

set of input-output testing data, 92 pairs


input data output data
Temperature [°C] Strain [%] Stress [MPa]
min 0 0 0
max 800 2 355
average 410.22 0.444 149.83
s. deviation 246.65 0.576 123.80
median 400 0.150 112

4.2 Searching for the Optimal Parameter Setting of ANN

Plenty of calculations with different ANN topologies were carried out. For
instance, one of the test results obtained by the ANN with a manual parameter setting
gave a coefficient of correlation equal to r² = 0.8553. Here, the ANN topology
consisted of 3 hidden layers with the geometry 2-20-40-20-1. Clearly, this topology
is not good enough, as can be seen in Figure 6. A better solution was obtained
when the geometry 2-30-30-1 was employed; this means there were two hidden
layers, each of them including 30 neurons. The efficiency of the learning procedure
is shown in Figure 7, where actual and calculated values are compared. In this

Fig. 6. Actual and calculated values of stress obtained by the ANN with manual parameter setting (scatter plot of calculated vs. actual values for the ANN 2-20-40-20-1, r² = 0.8553)

case, the coefficient of correlation was higher, i.e., r² = 0.9975, but the stress-
strain relationship constructed with this ANN topology was still not good enough,
and therefore further improvements were essential. From these results, it can be
seen that the procedure of searching for the optimal ANN topology by guessing
is demanding. Moreover, the results of this procedure also depend on luck, since
the experimental data are highly non-linear.
In order to further improve the learning procedure of the ANN algorithm,
the meta-GA was introduced. As a result, the allowed relative error could also be
lowered to the very small value of 0.002 when using the meta-GA. The parameter values

Fig. 7. Actual and calculated values of stress obtained by the ANN with meta-GA (scatter plot of calculated vs. actual values for the ANN 2-50-50-1, r² = 0.9993)

Table 4. Manual and meta-GA suggested parameter setting for the ANN training procedure

parameter manual meta-GA

step size 0.5 3.091
number of hidden layers 2 4
number of neurons 50-50 165-103-73-79
range of starting weights 0.1 0.806
learning range ratio 0.08 1.891

manually set by an expert and the best parameter setting proposed by the
meta-GA are presented in Table 4.
The proposed ANN topology consists of 4 hidden layers with the geometry
2-165-103-73-79-1. Here, the coefficient of correlation was almost perfect for the
suggested topology, i.e., r² = 0.9999 (Fig. 8). This example presents a success-
ful application of the meta-GA and also shows the capability of this algorithm.
Moreover, it proves that using the meta-GA for optimizing the parameters of the
ANN is necessary due to all the non-linearities involved in this material modeling
problem.

4.3 Interpretation of the Results Obtained by the ANN


The results of the ANN are shown in Figure 9 and represent the stress-strain
relationship at different temperature levels T. The experimental stress data
are shown with diamond marks, while the calculated values are presented with a
continuous line. The agreement between the calculated values obtained by the

Fig. 8. Comparison between actual and calculated values of stress in the case of the optimized ANN geometry (scatter plot of calculated vs. actual values for the ANN 2-165-103-73-79-1, r² = 0.9999)

Fig. 9. Stress-strain curves at different temperatures T (stress [kN/cm²] vs. strain [%] for T = 20, 100, 200, 300, 400, 500, 600, 700 and 800 °C, together with the test data)

ANN and the experimental ones is very good along the entire curve for all
temperature levels (Fig. 9).
However, some difficulties appeared in the course of modeling the proper-
ties of steel at elevated temperatures with the ANN. Firstly, the yield points of
the particular stress-strain curves are not explicitly defined by the curve shape itself.
This problem was solved by plotting the first derivative of the experimentally ob-
tained stress-strain relations, where the yield limits are much better pronounced.
Secondly, due to the regression used in the ANN, the obtained approximations
of the stress-strain relations below the yield limit exhibit certain deviations
from a linear shape. Assuming ideal linear behavior of steel, a linear regression
based on the actual experimental data was used in this range. Lastly, since the
experimental data are given for values of strain up to 2%, the ANN model
is inadequate for strains higher than 2%, and a constant hardening
parameter is introduced in this strain range.
According to that, the presented material model is divided into three parts
(Figure 10). The first part is linear elastic, ε_M < ε_Y, and the stress is determined
by the linear elastic law

    σ(ε_M, T) = E_T ε_M = k_{E,T} E_20 ε_M,   (13)



Fig. 10. Stress-strain relationship at ambient temperature (stress σ [kN/cm²] vs. strain ε [%]; part 1: Hooke's law, part 2: ANN solution, part 3: hardening)

where k_{E,T} stands for the reduction factor of the elastic modulus, describing its
variation in dependence on the temperature T, referred to the elastic modulus
E_20 at the room temperature T = 20 °C.
The second part covers the plastic range, where the mechanical strain
exceeds the yield strain, ε_M > ε_Y. In this range, the stress is calculated by the
ANN, in accordance with the actual values of the mechanical strain ε_M and the
temperature T:

    σ(ε_M, T) = f_ANN(ε_M, T).   (14)

In the range where the mechanical strain exceeds the value ε_M > 1.85%, a
uniform strain-hardening parameter K is considered. The value of K is deter-
mined by the slope of the stress-strain curve at the strain ε_M = 1.85% and at
the corresponding temperature T (K = K(ε_M = 1.85%, T)).
It should be noted that the experimental data [1] were determined at the heating rate
of 10 °C/min and therefore the applicability of this material model is limited. The
authors mentioned that the results for this heating rate are quite similar to the
results obtained by other researchers, whose measurements were performed at
different heating rates, namely 2.5 °C/min, 5 °C/min and 20 °C/min. Given this
statement, the application of the presented model at different heating rates is
acceptable, provided that the heating rate does not differ excessively from
10 °C/min.

4.4 Discussion
Although the meta-GA parameters were initialized with relatively small values,
the time complexity of the meta-GA and ANN algorithms running sequentially one
after the other increased crucially because of the higher time complexity of the
ANN, especially when an ANN with more hidden layers must be learned. Fortu-
nately, the meta-GA algorithm does not demand any user interaction. Therefore,
the quality of the obtained results justifies the increased computational cost, especially
because here we deal with a real-world problem, where the results of the algo-
rithm can have a crucial impact on the behavior of the material structure during
a fire.

5 Conclusion
This chapter extends the results obtained by Hozjan et al. in [16], who used
an ANN to search for a material model of steel frames exposed to the high
temperatures caused by fire. Although the reported error rate obtained by the
ANN learning procedure was kept within the normal range, the goal of this study
was to show that a near-optimal parameter setting of the ANN can be found using
the meta-GA. In line with this, the proposed meta-GA acts as a meta-heuristic
operating at the higher level, controlling the parameters of the ANN that solves
the problem at the lower level.
The meta-GA was applied to the benchmark problem suite based on [1].
The results of the ANN with the parameter setting proposed by this algorithm im-
proved significantly on the results of the ANN with the parameter setting proposed
by an expert, and confirmed the fact that nature-inspired algorithms work
better when they are hybridized. The meta-GA algorithm together with the ANN pro-
vides a powerful tool which can be used in many engineering problems, especially
those where parameters behave non-linearly and irregularly.

References
1. Kirby, B.R., Preston, R.R.: High temperature properties of hot-rolled, structural
steels for use in fire engineering design studies. Fire Safety Journal 13, 27–37 (1988)
2. Eurocode 3: Design of steel structures, Part 1.2: General rules – Structural fire de-
sign. European Committee for Standardisation, Brussels (2001)
3. British Standards Institution: BS5950-8: Structural Use of Steelwork in Building –
Part 8: Code of Practice for Fire Resistance Design. British Standards Institution,
London (2003)
4. Lippmann, R.P.: An Introduction to Computing with Neural Nets. IEEE Magazine
on Acoustics, Signal and Speech Processing 4(2), 4–22 (1987)
5. Sarle, W.S.: Neural Network FAQ. Periodic posting to the Usenet newsgroup
comp.ai.neural-nets (2002), ftp://ftp.sas.com/pub/neural/FAQ.html
6. Huang, Z.F., Tan, K.H.: Effects of External Bending Moments and Heating
Schemes on the Responses of Thermally-restrained Steel Columns. Engineering
Structures 26(6), 769–780 (2004)
7. Seed, G.M., Murphy, G.S.: The applicability of neural networks in modelling the
growth of short fatigue cracks. Fatigue & Fracture of Engineering Materials &
Structures 21, 183–190 (1998)
8. Haque, M.E., Sudhakar, K.V.: ANN based prediction model for fatigue crack
growth in DP steel. Fatigue & Fracture of Engineering Materials & Structures 23,
63–68 (2001)
9. Sterjovski, Z., Nolan, D., Carpenter, K.R., Dunne, D.P., Norrish, J.: Artificial neu-
ral networks for modeling the mechanical properties of steels in various applica-
tions. Journal of Materials Processing Technology 170, 536–544 (2005)
10. Sakla, S.S.S.: Neural network modeling of the load-carrying capacity of
eccentrically-loaded single-angle struts. Journal of Constructional Steel Re-
search 60, 965–987 (2004)
11. Oreta, A., Kawashima, K.: Neural Network Modeling of Confined Compressive
Strength and Strain of Circular Concrete Columns. Journal of Structural Engi-
neering 129(4), 554–561 (2003)
12. Tang, C.W., Chen, H.J., Yen, T.: Modeling Confinement Efficiency of Reinforced
Concrete Columns with Rectilinear Transverse Steel Using Artificial Neural Net-
works. Journal of Structural Engineering 129(6), 775–783 (2003)
13. Zhao, Z.: Steel columns under fire – a neural network based strength model. Ad-
vances in Engineering Software 37(2), 97–105 (2004)
14. Mikami, I., Tanaka, S., Hiwatashi, T.: Neural Network System for Reasoning Resid-
ual Axial Forces of High-Strength Bolts in Steel Bridges. Computer-Aided Civil and
Infrastructure Engineering 13, 237–246 (1998)
15. Papadrakakis, M., Lagaros, N.D., Plevris, V.: Design optimization of steel struc-
tures considering uncertainties. Engineering Structures 27, 1408–1418 (2005)
16. Hozjan, T., Turk, G., Srpcic, S.: Fire analysis of steel frames with the use of artificial
neural networks. Journal of Constructional Steel Research 63, 1396–1403 (2007)
17. Michalewicz, Z.: Genetic algorithms + data structures = evolution programs.
Springer, Berlin (1992)
18. Goldberg, D.E.: Genetic Algorithms in Search, Optimization and Machine Learn-
ing. Addison-Wesley Longman Publishing Co., Inc., Boston (1989)
19. Deb, K.: Multi-Objective Optimization Using Evolutionary Algorithms. John Wi-
ley & Sons, Inc., New York (2001)
20. Goldberg, D.E., Deb, K.: A comparative analysis of selection schemes used in
genetic algorithms. In: Foundations of Genetic Algorithms 1 (FOGA-1), pp. 41–49
(1991)
21. Eshelman, L.J., Schaffer, J.D.: Real-coded genetic algorithms and interval-
schemata. In: Foundations of Genetic Algorithms 2 (FOGA-2), pp. 187–202 (1993)
22. Deb, K., Agrawal, R.B.: Simulated binary crossover for continuous search space.
Complex Systems 9(2), 115–148 (1995)
23. Deb, K., Kumar, A.: Real-coded genetic algorithms with simulated binary
crossover: Studies on multi-modal and multi-objective problems. Complex Sys-
tems 9(6), 431–454 (1995)
24. Eiben, A., Smith, J.: Introduction to Evolutionary Computing. Springer, Berlin
(2003)
25. Fister, I., Mernik, M., Filipic, B.: Graph 3-coloring with a hybrid self-adaptive
evolutionary algorithm. Computational Optimization and Applications 54(3),
741–770 (2013)
26. Fister, I., Mernik, M., Filipic, B.: A hybrid self-adaptive evolutionary algorithm
for marker optimization in the clothing industry. Applied Soft Computing 10(2),
409–422 (2010)
27. Fister Jr., I., Yang, X.-S., Fister, I., Brest, J., Fister, D.: A brief review of
nature-inspired algorithms for optimization. Electrotechnical Review 80(3),
116–122 (2013)
28. Fister Jr., I., Suganthan, P.N., Strnad, D., Brest, J., Fister, I.: Artificial neural
networks regression on ensemble strategies in differential evolution. In: 20th In-
ternational Conference on Soft Computing, Mendel 2014, pp. 65–70. University
of Technology, Faculty of Mechanical Engineering, Institute of Automation and
Computer Science, Brno (2014)
A Differential Evolution Algorithm with a Variable
Neighborhood Search for Constrained
Function Optimization

M. Fatih Tasgetiren¹*, P.N. Suganthan², Sel Ozcan³, and Damla Kizilay⁴

¹ Industrial Engineering Department, Yasar University, Selcuk Yasar Campus, Izmir, Turkey
fatih.tasgetiren@yasar.edu.tr
² School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore
epnsugan@ntu.edu.sg
³ Industrial Engineering Department, Yasar University, Selcuk Yasar Campus, Izmir, Turkey
sel.ozcan@yasar.edu.tr
⁴ Industrial Engineering Department, Yasar University, Selcuk Yasar Campus, Izmir, Turkey
damla.kizilay@yasar.edu.tr

Abstract. In this paper, a differential evolution algorithm based on a variable
neighborhood search algorithm (DE_VNS) is proposed in order to solve
constrained real-parameter optimization problems. The performance of the DE
algorithm depends on its mutation strategies, crossover operators and control
parameters. As a result, a DE_VNS algorithm that can employ multiple mutation
operators in its VNS loops is proposed in order to further enhance the solution
quality. We also present an idea of injecting some good dimensional values into
the trial individual through an injection procedure. In addition, we present
a diversification procedure that is based on the inversion of the target individu-
als and the injection of some good dimensional values from promising areas of the
population by tournament selection. The computational results show that the
simple DE_VNS algorithm is very competitive with some of the best perform-
ing algorithms from the literature.

1 Introduction

In general, a constrained optimization problem focuses on optimizing a vector x in
order to minimize the following problem:

    min f(x),   x = (x_1, x_2, ..., x_D) ∈ ℝ^D,   (1)

where x ∈ F ⊆ S. On the search space S ⊆ ℝ^D, the objective function of a vector
x is described as f(x) and the feasible region is given by the set F ⊆ S. Usually,

*
Corresponding author.

© Springer International Publishing Switzerland 2015


I. Fister and I. Fister Jr. (eds.), Adaptation and Hybridization in Computational Intelligence,
Adaptation, Learning, and Optimization 18, DOI: 10.1007/978-3-319-14400-9_8

S is described as a D-dimensional space in ℝ^D and the domains of the decision
variables are described by their search ranges as follows:

    x_k^min ≤ x_k ≤ x_k^max,   1 ≤ k ≤ D                                      (2)

By using a set of m additional constraints (m ≥ 0), the feasible region F is
described as follows:

    g_i(x) ≤ 0,   for i = 1, .., p, and                                       (3)

    h_j(x) = 0,   for j = p + 1, .., m.                                       (4)

In general, the equality constraints can be transformed into inequality form and can
be combined with the other inequality constraints as

    G_i(x) = max{g_i(x), 0},          i = 1, ..., p
    H_i(x) = max{|h_i(x)| − ε, 0},    i = p + 1, ..., m                       (5)

    φ(x) = ( Σ_{i=1}^{p} G_i(x) + Σ_{i=p+1}^{m} H_i(x) ) / m

where φ(x) is the average violation of the m constraints. In addition, ε is a tolerance
value for the equality constraints, which is in general taken as ε = 0.0001 in the
literature.
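The violation measure of Eq. (5) can be sketched in a few lines of Python (a hedged illustration; the function name and the toy constraints below are ours, not from the paper):

```python
def average_violation(x, ineq, eq, eps=1e-4):
    """Average constraint violation of Eq. (5).

    ineq: callables g_i(x) <= 0 (i = 1..p)
    eq:   callables h_j(x) = 0  (j = p+1..m)
    eps:  tolerance for the equality constraints
    """
    G = [max(g(x), 0.0) for g in ineq]            # inequality violations G_i(x)
    H = [max(abs(h(x)) - eps, 0.0) for h in eq]   # relaxed equality violations H_i(x)
    m = len(ineq) + len(eq)
    return (sum(G) + sum(H)) / m

# A feasible point has zero average violation.
v = average_violation([0.5, 0.5],
                      ineq=[lambda x: x[0] + x[1] - 2.0],   # x0 + x1 <= 2
                      eq=[lambda x: x[0] - x[1]])           # x0 = x1
```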
Differential evolution (DE) is one of the most sophisticated evolutionary
algorithms, which was proposed by Storn and Price [38,39]. Moreover, DE is a
population-based, stochastic global optimizer. So far, DE has been extensively
employed to solve many real-parameter optimization problems. Surveys of DE
can be found in Corne et al. [8], Lampinen [18], Babu and Onwubolu [4], Das and
Suganthan [43], and Price et al. [27]. Recently, Elsayed et al. [32] proposed an
algorithm framework with multiple search operators, where the performance of
evolutionary algorithms can be enhanced through employing a self-adaptive strategy
and multiple search operators. Furthermore, in Qin et al. [1], a self-adaptive DE
variant, called SaDE, was proposed, where parameter values were updated gradually
with a learning mechanism. Similarly, in the study of Zhang et al. [48], an adaptive
DE with an optional external archive (JADE), in which control parameters are updated
adaptively, was introduced. A composite DE (CoDE) was presented by Wang et al. [47]
through the use of efficient trial individual generation strategies and control
parameters. Mallipeddi et al. [31] introduced an ensemble idea in
DE by considering multiple mutation strategies and control parameters, the so-called
EPSDE. Moreover, this ensemble idea was extended to constrained optimization
problems to handle the constraints in Mallipeddi et al. [30]. Tasgetiren et al. [25]
presented an ensemble DE by assigning each individual to a different mutation
strategy or a variable parameter search (VPS). On the other hand, Zamuda and Brest

[3,13] introduced a variant of the DE algorithm with a population reduction methodology.
Elsayed et al. [34] recently developed a DE algorithm, the so-called SAS-DE, in which an
improvement method was adaptively employed, and tested it on the CEC2010
constrained optimization benchmark instances. In the study of Gong et al. [46], a DE
with a ranking-based mutation operator was presented and tested on various engineering
problems. Mohamed and Sabry [2] introduced a modified differential
evolution algorithm (COMDE) including a new mutation operator and a dynamic non-linear
increased crossover probability. On the other hand, Long et al. [45] presented a new
hybrid DE combined with an augmented Lagrangian multiplier method for
solving constrained optimization problems. Furthermore, various DE algorithms
designed to solve constrained optimization problems can also be found in [5, 6, 15,
16, 17, 19, 21, 35, 37], and a comprehensive survey of DE algorithms for
constrained optimization can be found in [7].
Motivated by the successful results of the vpsDE algorithm in [23] as well as the
ensemble concept in [24, 28, 29], this paper presents a DE_VNS algorithm to solve
the benchmark instances of CEC2006 [20].
This paper is organized as follows. In Section 2, a basic DE algorithm is explained,
whereas Section 3 outlines the proposed DE_VNS algorithm. The constraint handling
methods employed are given in Section 4. Furthermore, computational results are
given in Section 5 and, finally, Section 6 summarizes the conclusions.

2 Differential Evolution Algorithm

Due to the existence of several mutation strategies in basic DE algorithms, for a
general description we employ the DE/rand/1/bin variant of Storn and Price
[38,39]. In the basic DE algorithm, the initial target population is established by NP
individuals. A target individual in the population contains a D-dimensional
vector with parameter values. These parameter values are initially generated
uniformly within the predetermined search bounds x_ij^min and x_ij^max as follows:

    x_ij^t = x_ij^min + (x_ij^max − x_ij^min) × r                             (6)

where x_ij^t is the target individual at generation t, and r is a uniform random number
generated within the range U[0,1].
The mutant population is established as follows: two individuals are picked from the
target population, and their weighted difference is added to a third individual
in the target population. This can be achieved as follows:

    v_ij^t = x_aj^{t−1} + F × (x_bj^{t−1} − x_cj^{t−1})                       (7)

where a, b, and c are three randomly chosen individuals from the target population
such that a ≠ b ≠ c ≠ i ∈ (1, .., NP) and j = 1, 2, .., D. F > 0 is a mutation
scale factor affecting the differential variation between two individuals.
In the next step, the trial individual is obtained by making a uniform crossover
between the target and mutant individuals as follows:

    u_ij^t = { v_ij^t        if r_ij^t ≤ CR or j = D_j                        (8)
             { x_ij^{t−1}    otherwise

where D_j denotes a randomly selected dimension. It guarantees that at least one
parameter of each trial individual comes from the mutant individual. CR is a
crossover rate within the range [0,1], and r_ij^t is a random number generated from
U[0,1]. In case of any violation of the parameter values of the trial individual
during the evolution, they are restricted to:

    u_ij^t = x_ij^min + (x_ij^max − x_ij^min) × r,   j = 1, 2, .., D          (9)

Finally, a one-to-one comparison is made to select the better individual in terms of
their fitness values as follows:

    x_i^t = { u_i^t        if f(u_i^t) ≤ f(x_i^{t−1})                        (10)
            { x_i^{t−1}    otherwise

3 Differential Evolution with Variable Neighborhood Search


To develop a DE with a variable neighborhood search (DE_VNS), we draw inspiration
from the VNS algorithm [26]. We take advantage of variable mutation strategies that
affect the performance of DE algorithms [29]. We choose the following two mutation
strategies to be employed in the VNS loop:

    M_1 = DE/pbest/1/bin:   v_ij^t = x_pj^{t−1} + F × (x_bj^{t−1} − x_cj^{t−1})   (11)

    M_2 = DE/rand/1/bin:    v_ij^t = x_aj^{t−1} + F × (x_bj^{t−1} − x_cj^{t−1})   (12)

In the above mutation strategies, x_p is the individual chosen by tournament selection
with size 2. In other words, two individuals are randomly taken from the population,
and the one with the better fitness value is chosen. To generate the trial
individual in the DE_VNS algorithm, we define a neighborhood N_k for a temporary
individual π by a mutation strategy and a crossover operator together as follows:

    N_k(π) = ⟨M_k(v), CR(π, v)⟩                                              (13)

Equation (13) indicates that, in order to find a neighbor of an individual
x_i (i.e., implicitly a trial individual u_i), we use a mutation strategy M_k to generate a
mutant individual v first; then we recombine the mutant individual v with the individual
π through the crossover operator CR, which is the typical binomial crossover operator in
equation (8). We use the following two neighborhood structures in the
VNS algorithm to generate each trial individual:

    N_1(π) = ⟨M_1(v), CR(π, v)⟩   where Cr = 0.9, F = 0.9                    (14)

    N_2(π) = ⟨M_2(v), CR(π, v)⟩   where Cr = U(0,1), F = U(0,1)              (15)

In other words, in the first neighborhood structure we employ a very high mutation
scale factor F and a very high crossover rate Cr, whereas in the second neighborhood
structure we determine them uniformly at random in the range [0,1]. With the above
definitions and the temporary individuals π and x*, we develop a VNS algorithm to
generate a trial individual as shown in Fig. 1.

Procedure VNS(x_i)
    k_max = 2
    k = 1
    π = x_i
    do {
        x* = N_k(π)
        if f(x*) < f(π)
            π = x*
            k = 1
        else
            k = k + 1
    } while (k ≤ k_max)
    u_i = π
    return u_i
Endprocedure

Fig. 1. VNS Algorithm

The performance of VNS algorithms depends on which strategy is used in the first
neighborhood. As explained before, equation (11) is used as the first neighborhood
whereas equation (12) is used as the second neighborhood. Note that as long
as the first neighborhood improves the current solution, the neighborhood counter k
will remain 1, indicating that the first neighborhood will be employed. Otherwise, the
neighborhood counter k will be increased to 2, indicating that the second neighborhood
will be employed. If the second neighborhood improves the solution, the search gets
back to the first neighborhood again; if it fails, the loop terminates. It should be
noted that when we compare two solutions in the VNS algorithm, we use the penalized
fitness values obtained by the NFT method that will be explained in Section 4.
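A compact, runnable rendering of the loop in Fig. 1 (a sketch; the toy neighborhoods below are ours, whereas the real neighborhoods are the stochastic operators of Eqs. (14)–(15)):

```python
def vns_trial(x_i, neighborhoods, fitness):
    """VNS loop of Fig. 1: restart from the first neighborhood on any
    improvement, switch to the next one otherwise, stop after the last."""
    pi = x_i
    k = 0                                  # 0-based counter; k_max = len(neighborhoods)
    while k < len(neighborhoods):
        x_star = neighborhoods[k](pi)      # x* = N_k(pi)
        if fitness(x_star) < fitness(pi):
            pi, k = x_star, 0              # improvement: back to the first neighborhood
        else:
            k += 1                         # failure: try the next neighborhood
    return pi                              # becomes the trial individual u_i

# Toy 1-D demo: N1 halves a large value, N2 changes nothing.
n1 = lambda x: x / 2 if x > 0.1 else x
n2 = lambda x: x
result = vns_trial(1.0, [n1, n2], fitness=abs)   # 1.0 -> 0.5 -> 0.25 -> 0.125 -> 0.0625
```

With stochastic neighborhoods, the loop terminates once both neighborhoods fail consecutively, which happens with probability 1 as improvements become rare.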

3.1 Initial Population


The initial target population is randomly established as explained before. In other
words, NP individuals are established by equation (6). However, we also employ an
opposition-based learning algorithm to enrich the initial population. Opposition-based
learning (OBL) was proposed in [9]. It is a relatively new method in the computational
intelligence field and has been applied successfully to further improve various
heuristic optimization algorithms [40-42]. OBL is based on the idea that the opposite
of a solution offers a chance to obtain a new solution closer to the global optimum.
Inspired by OBL, a generalized OBL (GOBL) was introduced in [10-12]. Suppose that x is
the current solution with x ∈ [a, b]. Then its opposite solution is given by:

    x* = k(a + b) − x                                                        (16)

In GOBL, opposite solutions are generated using dynamically updated interval
boundaries in the population as follows:

    x_ij* = k(a_j + b_j) − x_ij                                              (17)

    a_j = min(x_ij),   b_j = max(x_ij)                                       (18)

    x_ij* = U(a_j, b_j)   if x_ij* < x_j^min or x_ij* > x_j^max,
    i = 1, .., NP,   j = 1, .., D,   k = U[0,1]                              (19)

After establishing and evaluating the target population, the above GOBL algorithm
is also used to obtain the opposite target individual. The better one is retained in the
target population.
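The GOBL step can be sketched as follows (a hedged illustration assuming the transformation x*_ij = k(a_j + b_j) − x_ij, the per-dimension analog of Eq. (16); function names are ours):

```python
import random

def gobl_opposite(pop, lo, hi):
    """Opposite population via GOBL, Eqs. (17)-(19): the boundaries a_j, b_j
    come from the current population; out-of-range values are re-sampled."""
    D = len(pop[0])
    a = [min(x[j] for x in pop) for j in range(D)]   # Eq. (18)
    b = [max(x[j] for x in pop) for j in range(D)]
    k = random.random()                              # k = U[0,1]
    opposite = []
    for x in pop:
        row = []
        for j in range(D):
            v = k * (a[j] + b[j]) - x[j]             # Eq. (17)
            if v < lo[j] or v > hi[j]:               # Eq. (19): re-sample in [a_j, b_j]
                v = random.uniform(a[j], b[j])
            row.append(v)
        opposite.append(row)
    return opposite
```

After evaluation, each opposite individual would compete with its original, and the better of the two is retained in the target population.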

3.2 Generation of Trial Population


Trial individuals are generated through the VNS algorithm explained before. Once
each trial individual is obtained from the VNS algorithm, we further apply an injection
procedure to it in order to diversify it and escape from local minima. In the
injection procedure, we select an individual from the target population by tournament
selection with size 2. Then, depending on the injection probability, we inject some
good dimensional values into the trial individual: for each dimension, if a uniform
random number r is less than the injection probability iP, that dimension is taken from
the individual x_a determined by the tournament selection procedure. Otherwise, the
dimension of the trial individual is retained. The injection procedure is given in Fig. 2.

for i = 1 to NP
    for j = 1 to D
        if (r < iP) then
            x_a = TournamentSelect()
            u_ij = x_aj
        else
            u_ij = u_ij
    endfor
endfor

Fig. 2. Injection Procedure
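Fig. 2 in runnable form (a sketch; the function and parameter names are ours):

```python
import random

def inject(trial_pop, target_pop, fitness, iP=0.005):
    """Injection of Fig. 2: with probability iP per dimension, copy the
    value from a target individual chosen by tournament selection (size 2)."""
    def tournament():
        a, b = random.sample(target_pop, 2)
        return a if fitness(a) < fitness(b) else b
    for u in trial_pop:
        for j in range(len(u)):
            if random.random() < iP:
                u[j] = tournament()[j]   # inject the dimension of x_a
    return trial_pop                     # other dimensions are retained
```

With the small default iP = 0.005 used in Section 5, on average only about one dimension in two hundred is replaced, so the procedure perturbs trial individuals gently.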

3.3 Selection

When the selection for the next generation is carried out, we employ the EC and SF
constraint handling methods that will be summarized in Section 4, as follows: for each
individual in the trial population, we check the ε(t) level. If the constraint violation is
less than the ε(t) level, we treat the trial individual as a feasible solution. Then we
employ the SF method to decide whether or not the trial individual will survive into
the next generation. In addition, we simply use the SF method to update the
best-so-far solution in the population.

3.4 Diversification
In order to further diversify the target population, we propose a diversification
mechanism based on the inversion of the dimensional values of the target individual
and the injection procedure explained above. For a small portion of the target
population, the following diversification procedure is applied to randomly selected
individuals, as shown in Fig. 3.

x_a = RandomlySelect()
x_a = invert(x_a)
for j = 1 to D
    if (r < iP) then
        x_b = TournamentSelect()
        x_aj = x_bj
    else
        x_aj = x_aj
endfor

Fig. 3. Diversification Procedure
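A runnable sketch of Fig. 3 (hedged: interpreting "inversion" as reversing the order of the dimensional values is our reading, and the function names are ours):

```python
import random

def diversify(pop, fitness, portion=0.05, iP=0.005):
    """Diversification of Fig. 3 (a sketch): pick a few random target
    individuals, invert their dimensional values (here: reverse the
    vector), then inject dimensions by tournament selection (size 2)."""
    def tournament():
        a, b = random.sample(pop, 2)
        return a if fitness(a) < fitness(b) else b
    for _ in range(max(1, int(portion * len(pop)))):
        x_a = random.choice(pop)
        x_a.reverse()                        # x_a = invert(x_a)
        for j in range(len(x_a)):
            if random.random() < iP:
                x_a[j] = tournament()[j]     # inject the dimension of x_b
    return pop
```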

4 Constraint Handling

Evolutionary algorithms can yield infeasible solutions. In this case, the general ten-
dency is to utilize some constraint handling approaches [7, 49]. In this paper, we used
the following constraint handling methods:

4.1 Superiority of Feasible Solutions (SF)


When using SF [14] for evaluating two solutions x_a and x_b, x_a is considered
to be better than x_b under the following conditions for a minimization problem: (i)
solution x_a is feasible and solution x_b is not; (ii) both solutions are feasible, but x_a
has a smaller objective function value than x_b; (iii) both solutions are infeasible, but
x_a has a smaller overall constraint violation amount φ(x), which can be calculated by
using Eq. (5).
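The three rules can be sketched as a single comparison function (an illustration; the function name is ours):

```python
def sf_better(f_a, v_a, f_b, v_b):
    """Superiority of feasible solutions: True if solution a, with
    objective f_a and overall violation v_a (0 = feasible), beats b."""
    if v_a == 0.0 and v_b == 0.0:
        return f_a < f_b        # (ii) both feasible: smaller objective wins
    if v_a == 0.0 or v_b == 0.0:
        return v_a == 0.0       # (i) feasible beats infeasible
    return v_a < v_b            # (iii) both infeasible: smaller violation wins
```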

4.2 The Adaptive Penalty Function (NFT)


In [36], an adaptive penalty approach is proposed. In the adaptive penalty function,
the idea of a near-feasibility threshold, the so-called NFT, is presented, whereby
solutions within the feasible region and the NFT-neighborhood of the infeasible
region are favored. Furthermore, an adaptive part is included in the penalty method
to exploit the gap between the best feasible value and the best infeasible value found
so far. The adaptive penalty function is given as follows:

    f_p(x) = f(x) + (f_feas − f_all) Σ_{i=1}^{m} ( v_i(x) / NFT_i )          (20)

where f_all is the unpenalized value of the best solution obtained so far, whereas
f_feas is the value of the best feasible solution obtained so far. As mentioned in [7],
the adaptive term may result in zero- or over-penalty. For this reason, we only take
the dynamic part of the above penalty function with the NFT threshold into account,
as follows:

    f_p(x) = f(x) + Σ_{i=1}^{p} G_i(x) / NFT_i + Σ_{j=p+1}^{m} H_j(x) / NFT_j   (21)

The basic form of the NFT method is NFT = NFT_0 / (1 + λt), where NFT_0 is
the initial value of the NFT method; λ and t are a user-defined positive value and the
generation counter, respectively, λ being a severity parameter. Because the equality
constraints are converted to inequality constraints by subtracting ε from the absolute
value of the constraint value, and ε is determined beforehand, NFT_0 is chosen as 1e-4.
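A sketch of the dynamic penalty of Eq. (21), simplified to a single NFT value for all constraints and an assumed severity λ = 1 (the function and parameter names are ours):

```python
def nft_penalized(f_x, G, H, t, nft0=1e-4, severity=1.0):
    """Penalized fitness with the dynamic NFT threshold.

    G: inequality violations G_i(x); H: relaxed equality violations H_j(x);
    t: generation counter.  NFT shrinks as NFT0 / (1 + severity * t),
    so violations are punished more heavily as the search progresses.
    """
    nft = nft0 / (1.0 + severity * t)
    return f_x + sum(G) / nft + sum(H) / nft
```

A feasible solution (all violations zero) keeps its raw objective value, while any violation is amplified by the ever-shrinking threshold.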

4.3 ε-Constraint (EC)

The ε-constraint handling method was proposed in [44], in which the constraint
relaxation is controlled by the ε parameter. A proper control of the ε parameter is
necessary for obtaining good feasible solutions for problems with equality constraints
[44]. The ε level is updated until the control generation t_C. After t exceeds
t_C, the ε level is set to zero so as to end up with feasible solutions. The main idea
behind the EC method is that solutions having violations less than ε(t) are considered
to be feasible solutions when making the selection for the next generation. The
general framework is given as follows:

    ε(0) = φ(x_θ)                                                            (22)

    ε(t) = { ε(0) (1 − t/t_C)^cp,   0 < t < t_C                              (23)
           { 0,                     t ≥ t_C

where x_θ is the top θ-th individual.
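The schedule of Eqs. (22)–(23) is straightforward to sketch (the function name is ours; cp = 2 and the initial ε come from Section 5):

```python
def eps_level(t, eps0, t_c, cp=2.0):
    """epsilon-level schedule of Eqs. (22)-(23): start from eps0, the
    violation of the top theta-th individual, and decay to 0 at t_c."""
    if t >= t_c:
        return 0.0
    return eps0 * (1.0 - t / t_c) ** cp

# The relaxation shrinks monotonically and vanishes after t_c.
levels = [eps_level(t, eps0=1.0, t_c=100, cp=2.0) for t in (0, 50, 100, 150)]
```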

5 Computational Results

The DE_VNS algorithm was coded in C++ and run on an Intel P4 1.33 GHz laptop
PC with 256 MB of memory. The population size is taken as NP = 60. NFT_0 is fixed
at 0.0001. The injection probability is taken as 0.005, whereas the diversification
probability is taken as 0.05. For the EC constraint handling method, the following
parameters are used: θ = 0.25 × NP, t_C = 0.4 × MaxGen and cp = 2. We carried out
30 replications for each benchmark problem, and the average, minimum and standard
deviation of the 30 replications are provided. Note that real numbers are rounded to
zero after 10 digits in the standard deviation calculations.
We compare our algorithm with the best performing algorithms from the literature,
namely MDE [22], ECHT-EP2 [30] and SAMO-DE [32]. The computational results
are given in Table 1. As seen from Table 1, the DE_VNS algorithm was able to find
the optimal solutions with zero standard deviations for 13 out of 22 benchmark
problems. The DE_VNS algorithm was slightly better than SAMO-DE, which was able
to find 12 optimal solutions with zero standard deviations. The performance of
ECHT-EP2 was slightly better than DE_VNS and SAMO-DE, since it was able to find
14 optimal solutions with zero standard deviations. The clear winner was the MDE
algorithm, due to the fact that it was able to find 19 optimal solutions with zero
standard deviations. However, the DE_VNS, SAMO-DE and ECHT-EP2 algorithms were
run for 240,000 function evaluations, whereas MDE was run for 500,000 function
evaluations. In 4 benchmark functions, the standard deviation of the DE_VNS algorithm
was smaller than those of both SAMO-DE and ECHT-EP2. As with all compared
algorithms, the DE_VNS algorithm was able to find feasible solutions
in all 30 replications; in other words, the feasibility rate was 100%. In summary, the
simple DE_VNS algorithm was competitive with the best performing algorithms from
the literature.

Table 1. Computational results of DE-VNS, SAMO-DE, MDE and ECHT-EP2 for the CEC2006
test problems

Problem      DE-VNS      SAMO-DE     MDE         ECHT-EP2
FEs          240,000     240,000     500,000     240,000
g01 Best -15.0000 -15.0000 -15.0000 -15.0000
Avg -15.0000 -15.0000 -15.0000 -15.0000
Std 0.00E-00 0.00E-00 0.00E-00 0.00E-00
g02 Best -0.8036191 -0.8036191 -0.8036191 -0.8036191
Avg -0.789822 -0.79873521 -0.78616 -0.7998220
Std 1.87E-02 8.80050E-03 1.26E-02 6.29E-03
g03 Best -1.0005 -1.0005 -1.0005 -1.0005
Avg -1.0005 -1.0005 -1.0005 -1.0005
Std 0.00E-00 0.00E-00 0.00E-00 0.00E-00
g04 Best -30665.5386 -30665.5386 -30665.539 -30665.539
Avg -30665.5386 -30665.5386 -30665.539 -30665.539
Std 0.00E-00 0.00E-00 0.00E-00 0.00E-00
g05 Best 5126.497 5126.497 5126.497 5126.497
Avg 5126.497 5126.497 5126.497 5126.497
Std 0.00E-00 0.00E-00 0.00E-00 0.00E-00
g06 Best -6961.813875 -6961.813875 -6961.814 -6961.814
Avg -6961.813875 -6961.813875 -6961.814 -6961.814
Std 0.00E-00 0.00E-00 0.00E-00 0.00E-00
g07 Best 24.3062 24.3062 24.3062 24.3062
Avg 24.306209 24.3096 24.3062 24.3063
Std 2.17E-07 1.58880E-03 0.00E-00 3.19E-05
g08 Best -0.095825 -0.095825 -0.095825 -0.095825
Avg -0.095825 -0.095825 -0.095825 -0.095825
Std 0.00E-00 0.00E-00 0.00E-00 0.0E-00
g09 Best 680.630 680.630 680.630 680.630
Avg 680.630 680.630 680.630 680.630
Std 0.00E-00 1.15670E-05 0.00E-00 0.00E-00
g10 Best 7049.24802 7049.24810 7049.24802 7049.2483
Avg 7049.24803 7059.81345 7049.24802 7049.2490
Std 4.02E-05 7.856E-00 0.00E-00 6.60E -04
g11 Best 0.7499 0.7499 0.7499 0.7499
Avg 0.7499 0.7499 0.7499 0.7499
Std 0.00E-00 0.00E-00 0.00E-00 0.00E-00
g12 Best -1.0000 -1.0000 -1.0000 -1.0000
Avg -1.0000 -1.0000 -1.0000 -1.0000
Std 0.00E-00 0.00E-00 0.00E-00 0.00E-00
g13 Best 0.053942 0.053942 0.053942 0.053942
Avg 0.053942 0.053942 0.053942 0.053942
Std 0.00E-00 0.00E-00 0.00E-00 0.00E-00

Table 1. (continued)

g14 Best -47.76489 -47.76489 -47.764887 -47.7649
Avg -47.76489 -47.68115 -47.764874 -47.7648
Std 4.64E-06 4.04300E-02 1.400E-05 2.72E-05
g15 Best 961.71502 961.71502 961.71502 961.71502
Avg 961.71502 961.71502 961.71502 961.71502
Std 0.00E-00 0.00E-00 0.00E-00 0.00E-00
g16 Best -1.905155 -1.905155 -1.905155 -1.905155
Avg -1.905155 -1.905155 -1.905155 -1.905155
Std 0.00E-00 0.00E-00 0.00E-00 0.00E-00
g17 Best 8853.5397 8853.5397 8853.5397 8853.5397
Avg 8877.3107 8853.5397 8853.5397 8853.5397
Std 3.94E+01 1.15E-05 0.00E-00 2.13E -08
g18 Best -0.866025 -0.866025 -0.866025 -0.866025
Avg -0.834185 -0.866024 -0.866025 -0.866025
Std 7.12E-02 7.04367E-07 0.00E-00 0.00E-00
g19 Best 32.656077 32.655593 32.655693 32.6591
Avg 32.685099 32.757340 33.34125 32.6623
Std 3.73E-02 6.145E-02 8.475E-01 3.4E -03
g21 Best 193.72451 193.72451 193.72451 193.7246
Avg 193.72456 193.771375 193.72451 193.7438
Std 2.84E-04 1.9643E-02 0.00E-00 1.65E-02
g23 Best -400.0527 -396.165732 -400.0551 -398.9731
Avg -372.9920 -360.817656 -400.0551 -373.2178
Std 5.75E+01 1.9623E+01 0.00E-00 3.37E+01
g24 Best -5.508013 -5.508013 -5.508013 -5.508013
Avg -5.508013 -5.508013 -5.508013 -5.508013
Std 0.00E-00 0.00E-00 0.00E-00 0.00E-00

6 Conclusions

In this paper, a differential evolution algorithm with a variable neighborhood search
algorithm (DE_VNS) was presented to solve constrained real-parameter optimization
problems. The performance of DE depends on the selection of mutation strategies and
crossover operators as well as control parameters. For this reason, we developed a
DE_VNS algorithm that can employ multiple mutation operators in its VNS loops to
further improve the solution quality. We also presented an idea of injecting some good
dimensional values into the trial individual from the population through the injection
procedure. In addition, we presented a diversification procedure that is based on the
inversion of the target individuals and the injection of some good dimensional values
from promising areas in the target population by tournament selection. The
computational results show that the simple DE_VNS algorithm was very competitive
with some of the best performing algorithms from the literature. For future work, we
will develop DE algorithms that take advantage of the neighborhood-change idea of
VNS for both constrained and unconstrained real-parameter optimization problems.

References
[1] Qin, A.K., Huang, V.L., Suganthan, P.N.: Differential Evolution Algorithm with strategy adaptation for global numerical optimization. IEEE Trans. Evol. Comput. 13, 398–417 (2009)
[2] Mohamed, A.W., Sabry, H.Z.: Constrained optimization based on modified differential evolution algorithm. Information Sciences 194, 171–208 (2012)
[3] Zamuda, A., Brest, J.: Population reduction differential evolution with multiple mutation strategies in real world industry challenges. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds.) EC 2012 and SIDE 2012. LNCS, vol. 7269, pp. 154–161. Springer, Heidelberg (2012)
[4] Babu, B.V., Onwubolu, G.C. (eds.): New Optimization Techniques in Engineering. STUDFUZZ, vol. 141. Springer, Heidelberg (2004)
[5] Becerra, R.L., Coello, C.C.A.: Cultural Differential Evolution for Constrained Optimization. Comput. Methods Appl. Mech. Engrg. (2005)
[6] Chiou, J.-P., Wang, F.-S.: Hybrid Method of Evolutionary Algorithms for Static and Dynamic Optimization Problems with Applications to a Fed-Batch Fermentation Process. Computers and Chemical Engineering 23, 1277–1291 (1999)
[7] Coello, C.C.A.: Theoretical and Numerical Constraint-Handling Techniques Used with Evolutionary Algorithms: A Survey of the State of the Art. Comput. Methods Appl. Mech. Engrg. 191(11-12), 1245–1287 (2002)
[8] Part Two: Differential Evolution. In: Corne, D., Dorigo, M., Glover, F. (eds.) New Ideas in Optimization, pp. 77–158. McGraw-Hill (1999)
[9] Tizhoosh, H.R.: Opposition-based learning: a new scheme for machine intelligence. In: Proceedings of International Conference on Computational Intelligence for Modeling Control and Automation, pp. 695–701 (2005)
[10] Wang, H., Wu, Z.J., Rahnamayan, S.: Enhanced opposition-based differential evolution for solving high-dimensional continuous optimization problems. Soft Comput. 15(11), 2127–2140 (2011)
[11] Wang, H., Wu, Z.J., Rahnamayan, S., Kang, L.S.: A scalability test for accelerated DE using generalized opposition-based learning. In: Proceedings of International Conference on Intelligent System Design and Applications, pp. 1090–1095 (2009)
[12] Wang, H., Wu, Z.J., Rahnamayan, S., Liu, Y., Ventresca, M.: Enhancing particle swarm optimization using generalized opposition-based learning. Inform. Sci. 181(20), 4699–4714 (2011)
[13] Brest, J., Sepesy Maučec, M.: Population size reduction for the differential evolution algorithm. Appl. Intell., 228–247 (2008)
[14] Deb, K.: An efficient constraint handling method for genetic algorithms. Computer Methods in Applied Mechanics and Engineering 186, 311–338 (2000)
[15] Koziel, S., Michalewicz, Z.: Evolutionary Algorithms, Homomorphous Mappings, and Constrained Parameter Optimization. Evol. Comput. 7(1), 19–44 (1999)
[16] Lampinen, J.: Multi-Constrained Optimization by the Differential Evolution. In: Proc. of the IASTED International Conference Artificial Intelligence Applications (AIA 2001), pp. 177–184 (2001)
[17] Lampinen, J.: Solving Problems Subject to Multiple Nonlinear Constraints by the Differential Evolution. In: Proc. of the 7th International Conference on Soft Computing, MENDEL 2001, pp. 50–57 (2001)
[18] Lampinen, J.: A Bibliography of Differential Evolution Algorithm. Technical Report, Lappeenranta University of Technology, Department of Information Technology, Laboratory of Information Processing (2001)
[19] Lampinen, J.: A Constraint Handling Approach for the Differential Evolution Algorithm. In: Proc. of the Congress on Evolutionary Computation (CEC 2002), pp. 1468–1473 (2002)
[20] Liang, J.J., Runarsson, T.P., Mezura-Montes, E., Clerc, M., Suganthan, P.N., Coello Coello, C.A., Deb, K.: Problem Definitions and Evaluation Criteria for the CEC 2006 Special Session on Constrained Real-Parameter Optimization. Technical Report, Nanyang Technological University, Singapore (2005)
[21] Lin, Y.-C., Hwang, K.-S., Wang, F.-S.: Hybrid Differential Evolution with Multiplier Updating Method for Nonlinear Constrained Optimization. In: Proc. of the Congress on Evolutionary Computation (CEC 2002), pp. 872–877 (2002)
[22] Mezura-Montes, E., Velazquez-Reyes, J., Coello, C.: Modified differential evolution for constrained optimization. In: IEEE Congress on Evolutionary Computation, pp. 25–32 (2006)
[23] Tasgetiren, M.F., Suganthan, P.N., Pan, Q.-K., Liang, Y.-C.: A Differential Evolution Algorithm with a Variable Parameter Search for Real-Parameter Continuous Function Optimization. In: Proceedings of the World Congress on Evolutionary Computation (CEC 2009), Norway, pp. 1247–1254 (2009)
[24] Tasgetiren, M.F., Suganthan, P.N., Pan, Q.-K.: An ensemble of discrete differential evolution algorithms for solving the generalized traveling salesman problem. Applied Mathematics and Computation 215(9), 3356–3368 (2010)
[25] Tasgetiren, M.F., Suganthan, P.N., Pan, Q.-K., Mallipeddi, R., Sarman, S.: An ensemble of differential evolution algorithms for constrained function optimization. In: Proceedings of IEEE Congress on Evolutionary Computation, pp. 1–8 (2010)
[26] Mladenovic, N., Hansen, P.: Variable neighborhood search. Computers and Operations Research 24, 1097–1100 (1997)
[27] Price, K., Storn, R., Lampinen, J.: Differential Evolution – A Practical Approach to Global Optimization. Springer (2005)
[28] Pan, Q.-K., Suganthan, P.N., Tasgetiren, M.F.: A Harmony Search Algorithm with Ensemble of Parameter Sets. In: IEEE Congress on Evolutionary Computation, CEC 2009, May 18-21, pp. 1815–1820 (2009)
[29] Gämperle, R., Müller, S.D., Koumoutsakos, P.: A parameter study for differential evolution. In: Proc. WSEAS Int. Conf. Advances Intell. Syst., Fuzzy Syst., Evol. Comput., pp. 293–298 (2002)
[30] Mallipeddi, R., Suganthan, P.N.: Ensemble of constraint handling techniques. IEEE Trans. Evol. Comput. 14, 561–579 (2010)
[31] Mallipeddi, R., Mallipeddi, S., Suganthan, P.N., Tasgetiren, M.F.: Differential Evolution Algorithm with ensemble of parameters and mutation strategies. Applied Soft Comput. 11, 1679–1696 (2011)
[32] Elsayed, S.M., Sarker, R.A., Essam, D.L.: Multi-operator based evolutionary algorithms for solving constrained optimization problems. Computers & Operations Research 38, 1877–1896 (2011)
[33] Elsayed, S.M., Sarker, R.A., Essam, D.L.: On an evolutionary approach for constrained optimization problem solving. Applied Soft Computing 12, 3208–3227 (2012)
[34] Elsayed, S.M., Sarker, R.A., Essam, D.L.: A self-adaptive combined strategies algorithm for constrained optimization using differential evolution. Applied Mathematics and Computation 241, 267–282 (2014)
[35] Sarimveis, H., Nikolakopoulos, A.: A Line Up Evolutionary Algorithm for Solving Nonlinear Constrained Optimization Problems. Computers & Operations Research 32, 1499–1514 (2005)
[36] Smith, A.E., Tate, D.M.: Genetic Optimization Using a Penalty Function. In: Forrest, S. (ed.) Proc. of the Fifth International Conference on Genetic Algorithms, pp. 499–503. Morgan Kaufmann (1993)
[37] Storn, R.: System Design by Constraint Adaptation and Differential Evolution. IEEE Transactions on Evolutionary Computation 3, 22–34 (1999)
[38] Storn, R., Price, K.: Differential Evolution – a Simple and Efficient Adaptive Scheme for Global Optimization over Continuous Spaces. Technical Report TR-95-012, ICSI (1995)
[39] Storn, R., Price, K.: Differential Evolution – A Simple and Efficient Heuristic for Global Optimization over Continuous Space. Journal of Global Optimization 11, 341–359 (1997)
[40] Rahnamayan, S., Tizhoosh, H.R., Salama, M.M.A.: Opposition-based differential evolution algorithms. In: Proceedings of IEEE Congress on Evolutionary Computation, pp. 2010–2017 (2006)
[41] Rahnamayan, S., Tizhoosh, H.R., Salama, M.M.A.: Opposition-based differential evolution for optimization of noisy problems. In: Proceedings of IEEE Congress on Evolutionary Computation (2006)
[42] Rahnamayan, S., Tizhoosh, H.R., Salama, M.M.A.: Opposition-based differential evolution. IEEE Trans. Evol. Comput. 12(1), 64–79 (2008)
[43] Das, S., Suganthan, P.N.: Differential Evolution: A Survey of the State-of-the-Art. IEEE Trans. Evolutionary Computation 15(1), 4–31 (2011)
[44] Takahama, T., Sakai, S.: Constrained Optimization by the ε Constrained Differential Evolution with Gradient-Based Mutation and Feasible Elites. In: IEEE Congress on Evolutionary Computation, Sheraton Vancouver Wall Centre Hotel, Vancouver, BC, Canada, pp. 1–8 (2006)
[45] Long, W., Liang, X., Huang, Y., Chen, Y.: A hybrid differential evolution augmented Lagrangian method for constrained numerical and engineering optimization. Computer-Aided Design 45, 1562–1574 (2013)
[46] Gong, W., Cai, Z., Liang, D.: Engineering Optimization by means of an improved constrained differential evolution. Comput. Methods Appl. Mech. Engrg. 268, 884–904 (2014)
[47] Wang, Y., Cai, Z., Zhang, Q.: Differential evolution with composite trial vector generation strategies and control parameters. IEEE Trans. Evol. Comput. 15, 55–66 (2011)
[48] Zhang, J., Sanderson, A.C.: JADE: adaptive differential evolution with optional external archive. IEEE Trans. Evol. Comput. 13, 945–958 (2009)
[49] Fister, I., Mernik, M., Filipič, B.: Graph 3-coloring with a hybrid self-adaptive evolutionary algorithm. Computational Optimization and Applications 54(3), 741–770 (2013)
A Memetic Differential Evolution Algorithm
for the Vehicle Routing Problem with Stochastic
Demands

Yannis Marinakis, Magdalene Marinaki, and Paraskevi Spanou

Technical University of Crete, School of Production Engineering and Management,
73100, Chania, Greece
marinakis@ergasya.tuc.gr, magda@dssl.tuc.gr

Abstract. This chapter introduces a new hybrid algorithmic approach
based on the Differential Evolution (DE) algorithm for successfully solv-
ing a number of routing problems with stochastic variables. More pre-
cisely, we solve one problem with stochastic customers, the Probabilistic
Traveling Salesman Problem, and one problem with stochastic demands,
the Vehicle Routing Problem with Stochastic Demands. The proposed
algorithm uses a Variable Neighborhood Search algorithm in order to in-
crease its exploitation abilities. The algorithm is tested
on a number of benchmark instances from the literature and it is com-
pared with a hybrid Genetic Algorithm.

Keywords: Differential Evolution, Memetic Algorithms, Vehicle Rout-
ing Problem with Stochastic Demands, Probabilistic Traveling Salesman
Problem.

1 Introduction

Storn and Price [38] proposed the population-based evolutionary algorithm de-
noted as Differential Evolution (DE). Analytical presentations and surveys
can be found in [9,10,36]. A Differential Evolution algorithm has two basic
steps, applied in a different order than in a classic evolutionary algorithm: a
mutation operator that generates a trial vector and a crossover operator that
produces an offspring.
The first paper devoted to the solution of the Vehicle Routing Prob-
lem was published by Dantzig and Ramser [8]. For a complete definition of the
problem and of its basic variants, the reader can find more details in the follow-
ing papers [4,5,11,17,18,26,25,31,39] and in the books [19,20,34,40]. The main
difference between the Stochastic Vehicle Routing Problems (SVRPs)
and the Classic Vehicle Routing Problem is that in the Stochastic Vehicle Rout-
ing Problems either the customers, or the customers' demands, or the customers'
service and travel times, or all of them are not determined in the beginning of
the process, as happens in the Classic Vehicle Routing Problem; instead, they are
stochastic variables that follow known (or unknown) probability distributions.
For an analytical description of the Stochastic Vehicle Routing Problems please
see [16].
In this chapter, a hybridized version of the Differential Evolution, the Memetic
Differential Evolution (MDE) algorithm, is applied, analyzed and used for
solving two different stochastic routing problems: one with the number of cus-
tomers as a stochastic variable, the Probabilistic Traveling Salesman Problem,
and one, the Vehicle Routing Problem with Stochastic Demands, where the
stochastic variable is the demand of each of the customers. In the proposed
algorithm, a local search phase is applied to each individual in order to effectively
explore the solution space. It should be noted that a memetic strategy usually
improves the performance of the algorithm [33]. In recent years a number of
new swarm intelligence and evolutionary algorithms have been proposed. Most
of these algorithms improve their effectiveness by hybridization with other algo-
rithms [12,13,14,15]. The proposed algorithm is compared with a hybrid Genetic
Algorithm in order to test its efficiency against
another evolutionary algorithm.
The rest of the chapter is organized as follows. In the next section, the two
problems, the Probabilistic Traveling Salesman Problem and the Vehicle Routing
Problem with Stochastic Demands, are presented and analyzed, and for each one
of them a formulation is given. In Section 3, the proposed algorithm is presented
and analyzed in detail. In Section 4, the computational results of the algorithm
are given. Also, comparisons with the Hybrid Genetic Algorithm are performed
and, for the problems where results from the literature are known, comparisons
with them are given. Finally, the conclusions and future research are given
in the last section.

2 Stochastic Routing Problems

2.1 Probabilistic Traveling Salesman Problem

The first problem studied in this chapter is the Probabilistic Traveling Sales-
man Problem (PTSP). A number of publications concerning the PTSP are
given in [1,23,24,35]. In this problem, a customer will be present (with proba-
bility p) or not (with probability 1 − p) in a specific route during a day. Thus,
while in the Traveling Salesman Problem a tour with minimum cost should
be calculated, in the PTSP the objective is the minimization of the expected
length of the a priori tour, where each customer requires a visit only with a given
probability [30]. The a priori tour is a template for the visiting sequence of all
customers. When an instance needs to be solved, initially the a priori tour
is calculated and, then, the customers are visited based on the se-
quence of the a priori tour, while the customers that do not need to be visited
are simply skipped [29]. The PTSP is an NP-hard problem [1]. The main formu-
lation of the Probabilistic Traveling Salesman Problem can be found in [2,23].
For an analytical presentation and analysis of the formulation that is used in this
chapter please see [30].
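For intuition, the PTSP objective can be checked numerically by sampling which customers are present and measuring the length of the resulting skip-tour. The sketch below is illustrative only (it is not the closed-form evaluation of [30]); it assumes a homogeneous presence probability p and a precomputed distance matrix d, and all names are ours.

```python
import random

def expected_ptsp_length(tour, d, p, samples=10000, seed=0):
    """Monte Carlo estimate of the expected length of an a priori tour.

    tour: the a priori visiting sequence (node indices)
    d:    d[i][j] is the distance between nodes i and j
    p:    probability that each customer requires a visit
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        # Keep each customer independently with probability p,
        # preserving the a priori visiting sequence.
        present = [v for v in tour if rng.random() < p]
        if len(present) > 1:
            total += sum(d[present[k]][present[(k + 1) % len(present)]]
                         for k in range(len(present)))
    return total / samples
```

With p = 1 the estimate reduces to the deterministic TSP tour length, which is a convenient sanity check.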

2.2 Vehicle Routing Problem with Stochastic Demands

In the Vehicle Routing Problem with Stochastic Demands (VRPSDs), a vehicle
leaves the depot with a full load and serves a set of customers. The difference
between the VRPSDs and the Capacitated VRP (CVRP) is that in the VRPSDs
the demands of the customers are known only when the vehicle arrives at them,
while in the CVRP the demands of the customers are known beforehand. The
problem is NP-hard; a route begins and ends at the depot and serves each
customer exactly once. As in the case of the PTSP, this is the a priori tour
[3] and it is a template for the visiting of all customers. The vehicle visits
the customers based on the sequence of the a priori tour; however, when the
vehicle needs replenishment it returns to the depot. The nodes from which
the vehicle returns to the depot are stochastic points [37].
The solution of the problem is a permutation of the customers starting from
the depot. An initial path is calculated and, depending on the demand of the
next customer in the a priori tour, the vehicle returns to the depot for restock-
ing or continues to the next customer. In a number of papers, including this
chapter, a different strategy is selected, where a vehicle may return to the depot for
replenishment earlier, even though the expected demand of the next customer is
less than the vehicle's load. This strategy is called preventive restocking. The
preventive restocking strategy is used in order to avoid a route failure, which
would happen if the vehicle went to the next customer without having enough
load to satisfy its demand. If the preventive restocking strategy is not used and a
route failure occurs, then the vehicle has to go back to the depot for restocking
and, then, to return to the same customer. The optimal choice of whether to
return to the depot or not is realized using a threshold value [41]. For an analytical
presentation and analysis of the formulation that is used in this chapter please
see [32].
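The threshold rule can be illustrated with a small cost simulation: after serving a customer, the vehicle restocks preventively whenever its residual load drops below a threshold h, and otherwise only when a route failure forces it to. This is a deliberately simplified sketch with illustrative names, not the dynamic-programming threshold computation of [41].

```python
def route_cost_with_restocking(tour, demands, d, capacity, threshold, depot=0):
    """Cost of serving `tour` (customer indices, depot excluded) in sequence,
    restocking preventively whenever the residual load < threshold."""
    load, cost, pos = capacity, 0.0, depot
    for c in tour:
        if load < demands[c]:            # route failure: detour via the depot
            cost += d[pos][depot]
            load, pos = capacity, depot
        cost += d[pos][c]                # travel to the customer and serve it
        load -= demands[c]
        pos = c
        if load < threshold:             # preventive restocking
            cost += d[pos][depot]
            load, pos = capacity, depot
    if pos != depot:                     # close the route
        cost += d[pos][depot]
    return cost
```

Raising the threshold trades extra depot trips now against the risk of a costly failure later, which is exactly the balance the preventive restocking strategy optimizes.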

3 Differential Evolution Algorithm

3.1 Memetic Differential Evolution Algorithm for Stochastic
Routing Problems

Initially, in the Memetic Differential Evolution (MDE) algorithm, a popu-
lation is created at random as in the classic Differential Evolution (DE) algo-
rithm. Each solution is mapped using a path representation of the route (Section
3.2) and, then, the fitness function (Section 2) of each member of the popula-
tion is calculated. In order to proceed to the mutation phase, each solution is
transformed into the continuous space as described in Section 3.2. Then, the
mutation operator produces a trial vector for each individual of the current po-
pulation by mutating a target vector with a weighted differential [9,10,36]. This
trial vector will, then, be used by the crossover operator to produce offspring.
The trial vector, ui(t), for each parent, xi(t), is generated as follows: a target
vector, xi1(t), is selected from the population, such that i ≠ i1. Then, two
individuals, xi2 and xi3, are selected randomly from the population such that
i ≠ i1 ≠ i2 ≠ i3. Using these individuals, the trial vector is calculated by
perturbing the target vector as follows:

ui(t) = xi1(t) + F (xi2(t) − xi3(t))     (1)

where F ∈ (0, ∞) is the scale factor.

The target vector xi1 in Equation (1) is a random member of the population.
After the completion of the mutation phase of the algorithm, the solutions are
transformed back into the discrete space as presented in Section 3.2 and a
crossover operator is applied (binomial crossover [9]). In this crossover opera-
tor, the points are selected randomly from the trial vector and from the parent.
Initially, a crossover operator number (Cr) is selected [36] that controls the
fraction of parameters that are selected from the trial vector. The Cr value is
compared with the output of a random number generator, randi(0, 1). If the
random number is less than or equal to Cr, the corresponding value is inherited
from the trial vector, otherwise, it is selected from the parent:

x′i(t) = ui(t) if randi(0, 1) ≤ Cr, and x′i(t) = xi(t) otherwise.     (2)

After the crossover operator, for each offspring a Variable Neighborhood
Search algorithm (see Section 3.3) is applied and, then, the fitness function of
the offspring x′i(t) is calculated; if it is better than the fitness function of
the parent, then the offspring is selected for the next generation, otherwise the
parent survives for at least one more generation [9].
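The mutation, crossover and selection steps described above can be sketched for a single parent as follows. This is an illustrative pure-Python sketch (the chapter's implementation is in Matlab); names, the minimization convention, and the omission of the VNS phase are our assumptions.

```python
import random

def de_step(pop, i, fitness, F=0.5, Cr=0.8, rng=random):
    """One DE step for parent pop[i] on continuous vectors.

    pop:     list of real-valued lists (the population)
    fitness: callable to minimize (lower is better)
    """
    n = len(pop)
    # three mutually distinct indices, all different from i
    i1, i2, i3 = rng.sample([k for k in range(n) if k != i], 3)
    dim = len(pop[i])
    # mutation (Eq. 1): trial vector u = x_i1 + F * (x_i2 - x_i3)
    u = [pop[i1][j] + F * (pop[i2][j] - pop[i3][j]) for j in range(dim)]
    # binomial crossover (Eq. 2): inherit from the trial vector with prob. Cr
    child = [u[j] if rng.random() <= Cr else pop[i][j] for j in range(dim)]
    # selection: the offspring replaces the parent only if it is at least as good
    return child if fitness(child) <= fitness(pop[i]) else pop[i]
```

Because selection is greedy, the survivor is never worse than the parent, so the best fitness in the population is non-increasing over generations.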
In the following, a pseudocode of the Memetic Differential Evolution algorithm
is presented.

Initialization
  Initialize the control parameters F, Cr
  Select the mutation operator
  Select the number of generations
  Generate the initial population
  Calculate the initial cost function value
  (fitness function) of each member of the population
Main Phase
  Do while maximum number of generations has not been reached
    Select the parent vector xi(t)
    Create the trial vector ui(t) by applying the mutation operator
    Create the offspring x′i(t) by applying the crossover operator
    Perform the VNS algorithm on each offspring
    Calculate the cost function (fitness) of the offspring
    if fitness(x′i(t)) ≤ fitness(xi(t))
      Replace the parent with the offspring in the next generation
    else
      Add the parent to the next generation
    endif
  Enddo
Return the best individual (the best solution).

3.2 Path Representation

Each individual is recorded via the path representation of the tour, that is, via
the specific sequence of the nodes. It is represented by a vector in problem space
and its performance is evaluated on the predefined fitness functions (the fitness
function is the objective function of each one of the problems described in Section
2). In case these routes do not start with node 1, we find and put node 1 at the
beginning of the route, as this is necessary for the calculation of the fitness function.
In the problems studied in this chapter, one issue that we have to deal with is
the fact that, as all the solutions are represented with the path representation of
the tour, they are not in a suitable form for the MDE algorithm. Each element
of the solution is transformed into a floating point number in the interval (0,1], the
trial vectors of all individuals are calculated and, then, a conversion back into
the integer domain is performed using relative position indexing [28]. After the
calculation of the trial vectors, the elements of the vectors are transformed back
into the integer domain by assigning the smallest floating-point value to the smallest
integer.
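The conversion between the two spaces can be sketched as follows: a permutation of 1..n maps to floats in (0, 1] by dividing by n, and a real vector maps back to a permutation by ranking its components (smallest float receives the smallest integer). This is an illustrative reading of relative position indexing [28]; the function names are ours.

```python
def to_continuous(perm):
    """Map a permutation of 1..n to floats in (0, 1] by relative position."""
    n = len(perm)
    return [v / n for v in perm]

def to_permutation(vec):
    """Map a real vector back to a permutation of 1..n: the smallest
    component receives label 1, the next smallest 2, and so on."""
    order = sorted(range(len(vec)), key=lambda j: vec[j])
    perm = [0] * len(vec)
    for rank, j in enumerate(order, start=1):
        perm[j] = rank
    return perm
```

For example, the trial vector (0.9, 0.1, 0.5) decodes to the tour (3, 1, 2), and encoding followed by decoding recovers the original permutation.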

3.3 Variable Neighborhood Search

A Variable Neighborhood Search (VNS) algorithm [22] is applied to each indi-
vidual. The basic idea of the method is the successive search in a number of
neighborhoods of a solution. Here, the term neighborhood refers to the moves of
different local search algorithms. The search is applied either in a random or
in a more systematic manner in order for the solution to escape from a local
optimum. This method takes advantage of the fact that different local search
algorithms lead to different local optima [22].
In this research, the VNS algorithm is used in the following way. Initially,
the 2-opt local search algorithm [27] is applied to each individual for a certain
number of iterations (lsiter). In the 2-opt heuristic, the neighborhood function
is defined as exchanging two edges of the current solution with two other edges.
Afterwards, if 2-opt is trapped in a local optimum for a number of iterations,
the 3-opt algorithm [27] is applied for the same number of iterations. The 3-opt
heuristic is quite similar to 2-opt. However, because it uses a larger neighborhood,
it introduces more flexibility in modifying the current tour. The tour breaks into
three parts instead of only two. There are eight ways to connect the resulting
three paths in order to form a tour. In the VNS metaheuristic, initially the
number of neighborhoods should be defined. In this chapter, instead of two
neighborhoods (2-opt and 3-opt), more neighborhoods are added based on the way that
the 3-opt is utilized. Thus, if the neighborhoods are denoted by Nl, l = 1, ..., lmax, the
neighborhood l = 1 refers to 2-opt and neighborhoods l = 2, ..., lmax refer to the
different ways to apply the 3-opt. A pseudocode of the VNS algorithm is presented in
the following.

Algorithm VNS
  Select the number of neighborhoods (Nl, l = 1, ..., lmax)
  Select an initial solution s
  l = 1
Main Phase
  repeat
    Create a solution s′ in the neighborhood Nl of s
    s′′ = LS(s′), apply a local search phase to s′
    if f(s′′) < f(s) then
      s = s′′
      l = 1
    else
      l = l + 1
    endif
  until l > lmax
Return the best solution.
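The l = 1 neighborhood (2-opt) removes two edges of the tour and reconnects it by reversing the intermediate segment. A minimal first-improvement sketch, assuming symmetric distances (illustrative names, not the chapter's implementation):

```python
def two_opt_pass(tour, d):
    """One first-improvement 2-opt pass; returns (tour, improved)."""
    n = len(tour)
    for i in range(n - 1):
        # for i == 0, stop one edge early so the two removed edges never share a node
        for j in range(i + 2, n if i > 0 else n - 1):
            a, b = tour[i], tour[i + 1]
            c, e = tour[j], tour[(j + 1) % n]
            # gain of replacing edges (a,b) and (c,e) by (a,c) and (b,e)
            if d[a][c] + d[b][e] < d[a][b] + d[c][e] - 1e-12:
                tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])
                return tour, True
    return tour, False
```

Calling the pass repeatedly until no improving exchange remains yields a 2-opt local optimum, which is the trapping condition that triggers the switch to the 3-opt neighborhoods.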

3.4 Hybrid Genetic Algorithm

A Hybrid Genetic Algorithm (HGA) is also used in order to compare the
results of the proposed algorithm. In the Hybrid Genetic Algorithm, an initial
population of solutions is created and each member of the population is mapped
as in the proposed MDE algorithm. Then, the fitness function of each member of
the population is calculated. To a specific percentage of the individuals, a crossover
phase is applied using the most classic crossover operator, the 1-point crossover.
To a specific percentage of the offspring, a mutation phase is applied. After the
mutation phase, a VNS algorithm is applied to each one of the offspring. In the
next generation, the fittest individuals from the whole population survive. With
the term whole population we mean the initial population and the offspring
from both mutation and crossover phases. Thus, the population is sorted based
on the fitness function of the individuals and in the next generation the fittest
individuals survive. It should be mentioned that the population size of each
generation is equal to the initial size of the population. The stopping criterion
is the maximum number of generations.
In the following, a pseudocode of the Hybrid Genetic Algorithm is presented.

Initialization
  Select the crossover operator
  Select the mutation operator
  Select the selection operator
  Select the percentage of the crossover operator
  Select the percentage of the mutation operator
  Generate the initial population
  Calculate the initial cost function value
  (fitness function) of each member of the population
Main Phase
  Do while maximum number of generations has not been reached
    Select individuals from the population to be parents
    Call the crossover operator to produce offspring
    Call the mutation operator
    Perform the VNS algorithm on each offspring
    Calculate the cost function (fitness) of the offspring
    Replace the population with the fittest of the whole population
  Enddo
Return the best individual (the best solution).

4 Results

4.1 Parameters Selection

The whole algorithmic approach was implemented in Matlab R2009a. The se-
lected parameters are presented in Table 1. It should be noted that for the
selection of the parameters we used two criteria: the quality of the solution and
the computational time needed to achieve this solution. Thus, we tried many
different alternative values for the parameters, and the final selected
parameters were those that gave the best computational
results concerning both criteria mentioned previously. After the selection of the
final parameters, 10 different runs with the selected parameters were performed
for each instance.

Table 1. Parameters for all algorithms

             MDE   HGA
individuals  200   150
iterations   3500  3500
lsiter       100   100
Cr           0.8   0.8
Mr           -     0.2
F            0.5   -

4.2 Probabilistic Traveling Salesman Problem

In Table 2, the results of the proposed algorithm on the Probabilistic Traveling
Salesman Problem are given. It should be noted that in the following, in all
tables, in addition to the results of the proposed Memetic Differential Evolution
algorithm, the results of the Hybrid Genetic Algorithm described in Section 3.4
are also given. The reason that we used the Hybrid Genetic Algorithm is that
we would like to compare the proposed algorithm with another evolutionary
algorithm in order to see if the Memetic Differential Evolution algorithm could
solve effectively a stochastic routing problem. In all tables where the notation
ω (i.e., quality of the solution) is presented, it means that a best known solution
(BKS) from the literature is known and the quality of the solution measures the
efficiency of the algorithm. The quality is given in terms of the relative deviation
from the best known solution, that is, ω = (cMDE − cBKS)/cBKS %, where cMDE denotes
the cost of the solution found by the proposed Memetic DE algorithm and cBKS
is the cost of the BKS solution. Similarly, the quality of the solution of the HGA
algorithm is given by ω = (cHGA − cBKS)/cBKS %, where cHGA denotes the cost of the
solution found by the HGA algorithm.

Table 2. Results in the Probabilistic Traveling Salesman Problem (Part A)

                        HGA              MDE
Instance  p    BKS      cost     ω       cost     ω
eil51 0.1 130.12 129.42 -0.54% 129.42 -0.54%
0.5 310.75 316.52 1.86% 312.52 0.57%
0.9 407.92 416.12 2.01% 414.97 1.73%
eil101 0.1 200.03 197.34 -1.35% 197.42 -1.31%
0.5 455.65 464.48 1.94% 464.06 1.85%
0.9 601.5 629.71 4.69% 623.71 3.69%
kroA100 0.1 9074.94 9175.32 1.11% 9051.77 -0.26%
0.5 16581.6 17135.1 3.34% 16569.7 -0.07%
0.9 20508.8 22590.6 10.15% 20511.3 0.01%
ch150 0.1 2510.11 2530.71 0.82% 2508.77 -0.05%
0.5 5016.85 5264.08 4.93% 5245.44 4.56%
0.9 6292.01 6617.5 5.17% 6527.73 3.75%
d198 0.1 7504.94 7770.59 3.54% 7532 0.36%
0.5 12527.6 12965.9 3.50% 12711 1.46%
0.9 15216.6 15857.9 4.21% 15568 2.31%

PTSP instances were generated starting from TSP instances and assigning
to each customer a probability p of requiring a visit. The test instances were
taken from the TSPLIB [43]. The algorithm was tested on a set of 5 Euclidean
benchmark instances with sizes ranging from 51 to 198 nodes. Each instance
is described by its TSPLIB name and size, e.g. in Table 2 the instance named
kroA100 has size equal to 100 nodes. For each PTSP instance tested, various ex-
periments were done by varying the value of the customer probability p. In Table
2, in the last four columns, the results of the HGA and the proposed algorithm
(the cost and the quality of the best solution found) for three probability values
(0.1, 0.5 and 0.9) are presented. The best known solutions are taken from [30].
As can be seen from Table 2, the proposed algorithm, compared to the best
known solutions published in [30], finds new best solutions in five out of fifteen
instances. However, in one of these instances the HGA finds an even better solution.
For the other 10 instances, the quality of the solutions of the proposed algorithm
varies between 0.01% and 4.56%. The HGA, as mentioned previously, finds
in one instance a new best solution and in another instance the same new best
solution as the MDE algorithm. In the other thirteen instances, the quality of
the solutions of the HGA varies between 0.82% and 10.15%.

Table 3. Results in the Probabilistic Traveling Salesman Problem (Part B)

HGA MDE
Instance p cost average stdev var median cost average stdev var median
eil51 0.1 129.42 130.95 1.22 1.49 131.16 129.42 130.45 0.81 0.66 130.38
0.5 316.52 318.40 0.81 0.66 318.57 312.52 313.98 1.02 1.04 313.89
0.9 416.12 417.62 1.09 1.19 417.57 414.97 415.89 0.82 0.67 415.83
eil101 0.1 197.34 198.90 0.86 0.75 199.05 197.42 198.57 0.87 0.76 198.53
0.5 464.48 466.23 0.92 0.84 466.41 464.06 465.45 0.79 0.63 465.48
0.9 629.71 631.01 0.87 0.75 630.86 623.71 625.19 1.18 1.40 624.77
kroA100 0.1 9175.32 9177.01 0.77 0.59 9177.23 9051.77 9053.18 1.10 1.20 9053.05
0.5 17135.1 17136.27 0.63 0.40 17136.30 16569.7 16571.46 1.17 1.37 16571.93
0.9 22590.6 22591.68 0.85 0.72 22591.49 20511.3 20512.56 0.88 0.77 20512.93
ch150 0.1 2530.71 2532.27 0.92 0.84 2532.21 2508.77 2510.32 0.85 0.72 2510.41
0.5 5264.08 5265.29 1.00 1.01 5265.20 5245.44 5246.96 1.09 1.18 5247.10
0.9 6617.5 6618.47 0.91 0.82 6618.20 6527.73 6528.71 0.80 0.63 6528.61
d198 0.1 7770.59 7771.81 0.78 0.61 7771.59 7532 7533.05 1.08 1.17 7532.59
0.5 12965.9 12966.95 1.06 1.13 12966.49 12711 12711.84 0.73 0.53 12711.70
0.9 15857.9 15859.13 0.94 0.88 15858.92 15568 15569.35 0.91 0.82 15569.43

In Table 3, a more analytical presentation of the results of Table 2 is given.
In the first two columns of Table 3, the name of the instance and the probability
of requiring a visit are presented, respectively. In columns 3 to 7 and 8 to 12, the
results (cost, average, standard deviation, variance and median) of the Hybrid
Genetic Algorithm (HGA) and of the proposed Memetic Differential Evolution
(MDE) algorithm are presented, respectively. In 13 instances the MDE algorithm
gives better results, in 1 the HGA gives better results and in 1 both algorithms
find the same solution. The improvement in the quality of the best run for the
MDE algorithm compared to the HGA, in the instances where the MDE performs
better than the HGA, is between 0.09% and 9.20%, while in the one instance where
the HGA performs better the improvement is 0.04%. In the average of
the ten runs, the results are a little different than previously, as the MDE algorithm
performs better in all instances. The improvement in the quality of the solutions
for the MDE algorithm compared to the HGA is between 0.16% and 9.20%. Both
algorithms in all runs give very good results with small differences between them, as
the variance for the MDE varies between 0.53 and 1.40 and the standard deviation
varies between 0.73 and 1.18, while for the HGA the variance varies between 0.40
and 1.49 and the standard deviation varies between 0.63 and 1.22.

4.3 Vehicle Routing Problem with Stochastic Demands

For the Vehicle Routing Problem with Stochastic Demands, there
are no commonly used benchmark instances shared by the researchers in the lit-
erature solving this specific problem. This issue makes the comparison of the
different algorithms difficult. Thus, we have tested the algorithm on two sets of
benchmark instances; the first was used initially by Christiansen et al. [6]. This set
of benchmark instances contains forty instances with numbers of nodes between
16 and 60. The second set contains 7 out of the 14 benchmark instances proposed
by Christofides [7] that have zero service time. These benchmark instances were
initially proposed and used for the Capacitated Vehicle Routing Prob-
lem but, due to the fact that every variant of the Vehicle Routing Problem is
a generalization of the Capacitated Vehicle Routing Problem, these benchmark
instances have also been used in other variants of the Vehicle Routing Problem.
Each instance of the set contains between 51 and 200 nodes including the de-
pot. Each instance includes capacity constraints without maximum route length
restrictions and with zero service time.
Another issue concerning the Vehicle Routing Problem with Stochastic De-
mands is the fact that there are, mainly, two different approaches for dealing
with route failure in this problem. Both approaches have as a goal the min-
imization of the expected cost. In the one approach [6,21], vehicles follow their
assigned routes until a route failure occurs; then, a replenishment of the capacity
is performed at the depot and, finally, a return of the vehicle to the customer
where the route failure occurred and a continuation of the service are performed.
In this approach a set of vehicles can be used. In the other approach, the one
that is also used in this chapter, there is a preventive restocking strategy [3,41]
and its main characteristic is that we would like to avoid the route failure. In
order to do that, a threshold value is used. If the residual load after serving a
customer is greater than or equal to this value, then it is better to move to the next
customer; otherwise a return to the depot is performed. In this case only one
vehicle is used. We use the same transformation approach as the one proposed
in [6,21] and, thus, we assume that customers' demands are independent Pois-
son random variables with the mean demand for each customer equal to the
deterministic value of the demand given in the corresponding VRP problem. We
tested the proposed algorithm with both approaches used for dealing with
route failure.
We also tested the algorithm using another approach concerning the demands
of the customers. In the beginning, the probability of the demand of each customer
taking a particular value is stored in a variable. This probability depends on the
value of the demand's deviation. For example, if the demand's deviation is r and
the real demand is R (where r ≤ R), the probability of each demand value is 1/(2r + 1),
because the demand can take 2r + 1 values (i.e., R − r, R − (r − 1), ..., R, ..., R + (r − 1),
R + r) and the probability of the demand taking any one of these values is the same.
The cost from the last node to the depot can be assessed directly, as it does not depend
on the customer demand. In this chapter the deviation of the customers' demand
takes the values r = 0, r = 1, r = 2. The first case (r = 0) denotes that there

Table 4. Results using the preventive restocking strategy in the VRPSDs in the rst
set of benchmark instances

                       HGA                                MDE
Instance      Q        cost    average stdev var  median  cost    average stdev var  median
A-n32-k5 100 836.07 837.13 0.62 0.39 837.31 820.50 822.66 0.99 0.98 822.27
A-n33-k5 100 693.40 694.55 0.74 0.55 694.73 684.20 688.47 0.91 0.84 688.83
A-n33-k6 100 762.40 763.77 0.98 0.95 763.75 762.60 770.52 0.64 0.41 770.59
A-n34-k5 100 812.30 813.24 0.77 0.59 813.18 788.70 790.80 0.86 0.73 790.51
A-n36-k5 100 833.30 834.15 0.87 0.77 833.93 835.10 837.10 0.75 0.57 837.12
A-n37-k5 100 707.65 708.12 0.23 0.05 708.22 702.00 694.21 0.81 0.66 694.56
A-n37-k6 100 1018.00 1018.79 0.67 0.45 1018.80 1008.20 1000.69 0.91 0.82 1000.54
A-n38-k5 100 755.50 756.55 0.96 0.93 756.39 752.20 757.49 0.73 0.54 757.43
A-n39-k5 100 858.70 859.96 0.87 0.75 860.14 862.60 854.53 1.06 1.12 854.79
A-n39-k6 100 867.12 868.03 0.93 0.86 867.95 845.70 848.82 0.94 0.89 848.61
A-n44-k6 100 1005.90 1006.78 0.53 0.28 1006.87 980.60 980.26 1.06 1.12 980.45
A-n45-k6 100 1007.90 1009.29 1.07 1.13 1008.88 996.86 998.38 0.83 0.70 998.13
A-n45-k7 100 1239.40 1240.19 0.53 0.28 1240.03 1213.10 1177.10 0.97 0.93 1177.29
A-n46-k7 100 976.84 978.49 0.92 0.84 978.83 979.70 985.74 0.68 0.46 985.76
A-n48-k7 100 1182.30 1183.84 1.09 1.19 1183.67 1146.70 1133.71 1.13 1.27 1134.27
A-n53-k7 100 1117.80 1119.21 0.84 0.71 1119.16 1100.20 1097.43 0.65 0.42 1097.18
A-n54-k7 100 1283.90 1285.05 0.86 0.73 1284.84 1279.50 1224.33 0.81 0.66 1224.21
A-n55-k9 100 1168.10 1168.99 0.73 0.54 1169.02 1150.90 1125.56 0.78 0.61 1125.58
A-n60-k9 100 1517.25 1518.18 0.83 0.69 1518.11 1483.20 1455.12 0.60 0.37 1454.93
E-n22-k4 6000 385.12 385.97 0.86 0.75 385.59 379.16 391.85 0.93 0.86 391.59
E-n33-k4 8000 849.35 849.96 0.29 0.09 850.12 848.25 848.30 0.58 0.34 848.41
E-n51-k5 160 550.15 551.22 0.66 0.43 551.30 549.18 546.00 0.88 0.77 545.89
P-n16-k8 35 443.98 444.81 0.69 0.48 444.56 444.55 456.92 1.11 1.24 457.45
P-n19-k2 160 216.66 217.87 0.96 0.92 217.73 215.04 214.38 0.64 0.41 214.41
P-n20-k2 160 225.89 227.14 0.92 0.84 227.19 224.25 228.11 0.93 0.87 228.24
P-n21-k2 160 218.38 218.62 0.23 0.05 218.52 218.52 218.53 0.28 0.08 218.48
P-n22-k2 160 223.06 224.15 0.92 0.84 223.95 229.45 230.51 0.70 0.49 230.45
P-n22-k8 3000 587.32 588.79 0.89 0.80 588.63 589.89 591.55 0.65 0.42 591.54
P-n23-k8 40 536.07 537.12 0.68 0.46 536.99 545.26 537.32 1.00 1.01 537.14
P-n40-k5 140 471.11 471.83 0.49 0.24 471.82 472.15 472.12 0.50 0.25 472.38
P-n45-k5 150 531.29 532.36 0.91 0.83 532.07 527.90 532.13 1.15 1.33 532.58
P-n50-k10 100 755.15 756.35 1.00 0.99 756.28 724.60 741.38 1.06 1.11 741.83
P-n50-k7 150 580.34 581.20 0.65 0.42 581.13 575.92 572.40 0.96 0.93 572.60
P-n50-k8 120 658.00 658.72 0.64 0.40 658.57 664.02 660.64 0.92 0.85 660.69
P-n51-k10 80 805.80 806.71 0.81 0.65 806.45 789.04 796.41 0.88 0.78 796.17
P-n55-k10 115 742.40 743.55 0.87 0.75 743.39 730.15 738.52 0.48 0.23 738.51
P-n55-k15 70 1002.60 1003.60 1.01 1.03 1003.37 1016.40 1009.59 0.92 0.85 1009.51
P-n55-k7 170 588.34 588.45 0.07 0.00 588.44 588.47 588.37 0.28 0.08 588.55
P-n60-k10 120 803.18 803.81 0.39 0.15 803.90 790.55 773.89 0.70 0.50 773.82
P-n60-k15 80 1068.60 1069.47 0.89 0.79 1069.22 1067.60 1022.63 0.96 0.91 1022.33

is not any deviation from the actual demand R, the second case (r = 1) denotes
that the deviation is equal to 1, while the third case (r = 2) denotes that the
deviation is equal to 2. If for a customer the demand becomes negative, then
this demand takes the value 0.
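Under this uniform-deviation assumption, the support of a customer's demand and its probabilities can be generated as follows (an illustrative sketch with our own names; clipping at 0 simply lumps the mass of negative values onto 0, as described above):

```python
def demand_distribution(R, r):
    """Support and probabilities of a demand with real value R and deviation r:
    2r + 1 equally likely values R-r, ..., R, ..., R+r, clipped at 0."""
    values = [max(0, R + k) for k in range(-r, r + 1)]
    prob = 1.0 / (2 * r + 1)
    return values, [prob] * len(values)
```

For example, R = 5 and r = 2 gives the five equally likely values 3, 4, 5, 6, 7, each with probability 1/5, while r = 0 recovers the deterministic demand.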

Based on the tness function the vehicle either returns to the depot for replen-
ishment or it proceeds to the next customer. In all Tables, the results of the pro-
posed algorithm for the Vehicle Routing Problem with Stochastic Demands are
presented in addition with the results of the Hybrid Genetic Algorithm (HGA)
as in the previous section.
In Table 4, the results of the proposed algorithm on the first set of benchmark instances with the preventive restocking strategy are presented. The algorithm was tested on forty instances, the same as those used in [6,21], with the number of nodes ranging from 16 to 60. In the first two columns of Table 4, the name of the instance (which includes the number of nodes and the number of vehicles; for example, instance A-n32-k5 has 32 nodes and 5 vehicles) and the capacity of the vehicles are presented, respectively. Columns 3 to 7 and 8 to 12 present the results of the Hybrid Genetic Algorithm (HGA) and of the proposed Memetic Differential Evolution (MDE) algorithm, respectively. More precisely, columns 3 and 8 present the cost of the best of 10 runs; columns 4 and 9 the average cost of the 10 runs; columns 5 and 10 the standard deviation; columns 6 and 11 the variance; and columns 7 and 12 the median values for both algorithms. In the forty instances, the MDE algorithm gives better results in 27 and the HGA algorithm in 13. The improvement in the quality of the best run for the MDE algorithm compared to the HGA, in the instances where the MDE gives better results, is between 0.09% and 4.04%, while the improvement in quality for the HGA algorithm compared to the MDE, in the instances where the HGA gives better results, is between 0.02% and 2.86%. In the average of the ten runs, the improvement in the quality of the solutions for the MDE algorithm compared to the HGA, in the instances where the MDE gives better results, is between 0.01% and 5.08%, while the improvement for the HGA algorithm compared to the MDE, in the instances where the HGA gives better results, is between 0.03% and 2.83%. Both algorithms give very good results in all runs, with small differences between them, as the variance for the MDE varies between 0.08 and 1.33 and the standard deviation between 0.28 and 1.15, while for the HGA the variance varies between 0.00 and 1.19 and the standard deviation between 0.07 and 1.19.
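The per-instance statistics reported in Tables 4 to 8 (best cost, average, standard deviation, variance and median over the 10 runs) can be computed as in this minimal sketch, assuming the sample (not population) standard deviation is meant:

```python
import statistics

def summarize(costs):
    """Summary statistics over the 10 runs of one instance: best cost,
    average, sample standard deviation, sample variance and median."""
    return {
        "cost": min(costs),
        "average": statistics.mean(costs),
        "stdev": statistics.stdev(costs),
        "var": statistics.variance(costs),
        "median": statistics.median(costs),
    }
```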

Table 5. Results using the preventive restocking strategy in the VRPSDs in the second
set of benchmark instances

                       HGA                                     MDE
Instance   n    Q      cost    average stdev var median        cost    average stdev var median
par1 51 160 550.15 551.82 0.88 0.78 551.82 549.18 550.27 0.92 0.84 550.30
par2 76 140 942.2357 943.75 1.03 1.06 944.07 941.54 942.52 1.01 1.01 942.22
par3 101 200 971.15 972.65 1.00 1.00 972.78 969.909 971.06 1.01 1.03 970.87
par4 151 200 1453.5 1454.92 0.90 0.80 1454.78 1418.14 1419.37 1.09 1.19 1419.23
par5 200 200 1975.37 1976.75 1.00 1.00 1976.94 1968.24 1969.35 0.89 0.78 1969.28
par11 121 200 1418.15 1419.52 1.08 1.17 1419.21 1412.11 1413.55 1.05 1.11 1413.53
par12 101 200 998.38 999.87 0.84 0.70 999.88 995.14 996.21 0.77 0.59 996.11
MDE Algorithm for the VRPSDs 197

In Table 5, a comparison of the proposed algorithm on a different set of benchmark instances is presented. The instances are 7 of the 14 benchmark instances proposed by Christofides [7] for the solution of the Capacitated Vehicle Routing Problem. The instances selected are those with infinite maximum route length and zero service times, in order to be analogous to the instances used to test the algorithms so far. In the first three columns of Table 5, the name of the instance, the number of nodes and the capacity of the vehicles are presented, respectively. Columns 4 to 8 and 9 to 13 present the results (cost, average, standard deviation, variance and median) of the Hybrid Genetic Algorithm (HGA) and of the proposed Memetic Differential Evolution (MDE) algorithm, respectively. In all instances the MDE algorithm gives better results than the HGA algorithm. The improvement in the quality of the best run for the MDE algorithm compared to the HGA is between 0.07% and 2.43%. In the average of the ten runs, the improvement in the quality of the solutions for the MDE algorithm compared to the HGA is between 0.13% and 2.44%. Both algorithms give very good results in all runs, with small differences between them, as the variance for the MDE varies between 0.59 and 1.19 and the standard deviation between 0.77 and 1.07, while for the HGA the variance varies between 0.70 and 1.17 and the standard deviation between 0.84 and 1.08.
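The percentage improvements quoted throughout appear to be relative differences between the two algorithms' costs. Under that reading, which is our assumption, the par4 row of Table 5 (HGA 1453.5 vs. MDE 1418.14) reproduces the reported maximum of 2.43%:

```python
def improvement_pct(better_cost, worse_cost):
    """Relative improvement of the better algorithm over the worse one,
    as a percentage of the worse algorithm's cost (assumed reading of
    how the chapter's percentages are derived)."""
    return 100.0 * (worse_cost - better_cost) / worse_cost
```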

Table 6. Results using the preventive restocking strategy and the second approach of
the demands deviation in the VRPSDs in the second set of benchmark instances

HGA MDE
Instance r Cost average stdev var median Cost average stdev var median
0 542.62 544.24 1.03 1.06 544.43 537.42 539.11 1.04 1.08 539.52
par1 1 543.8 545.05 1.13 1.28 544.73 538.35 539.97 0.96 0.93 540.23
2 540.59 541.77 1.09 1.19 541.37 536.69 537.89 1.03 1.07 537.40
0 867.86 869.12 0.86 0.74 868.96 862.39 863.98 0.94 0.89 863.83
par2 1 878.45 879.77 1.06 1.12 879.59 862.72 863.97 1.02 1.04 863.71
2 886.87 888.34 1.05 1.10 888.19 865.86 867.13 0.74 0.54 866.99
0 850.66 851.64 0.80 0.64 851.74 839.4 840.69 1.11 1.24 840.63
par3 1 854.51 855.97 0.88 0.77 856.02 866.15 867.45 0.71 0.51 867.53
2 854.18 855.74 1.01 1.02 855.68 851.52 853.11 0.99 0.98 853.21
0 1137.24 1138.84 1.12 1.27 1139.41 1095.18 1097.04 1.05 1.10 1097.42
par4 1 1128.21 1129.42 0.77 0.59 1129.30 1115.37 1116.50 1.23 1.51 1115.78
2 1141.18 1142.41 0.87 0.76 1142.45 1139 1140.27 0.71 0.51 1140.42
0 1511.21 1513.02 1.01 1.03 1513.09 1495.28 1496.52 1.09 1.19 1496.19
par5 1 1505.37 1506.80 0.77 0.59 1506.91 1499.17 1500.41 0.96 0.93 1500.61
2 1497.84 1499.59 1.10 1.22 1499.92 1489.5 1490.86 1.06 1.12 1490.91
0 1061.47 1062.92 1.22 1.49 1062.94 1055.87 1057.20 0.88 0.78 1057.16
par11 1 1085.49 1086.73 0.82 0.67 1086.78 1088.74 1089.91 0.92 0.85 1089.59
2 1117.11 1118.30 0.97 0.93 1118.29 1098.7 1100.21 1.00 1.00 1100.64
0 835.18 836.83 1.03 1.06 837.15 823.47 825.21 0.86 0.73 825.66
par12 1 862.49 863.79 1.07 1.15 863.74 859.79 861.43 1.03 1.06 861.46
2 864.46 865.73 0.87 0.75 865.57 861.28 862.52 1.07 1.15 862.45

In Table 6, the results on the second set of benchmark instances using the second approach of the demand deviations are presented. In the first two columns of Table 6, the name of the instance and the three different customer demand deviations (r = 0, r = 1, r = 2) for each instance are presented, respectively. Columns 3 to 7 and 8 to 12 present the results (cost, average, standard deviation, variance and median) of the Hybrid Genetic Algorithm (HGA) and of the proposed Memetic Differential Evolution (MDE) algorithm, respectively. In this case, using the three different deviations for the seven instances, we have 21 instances. In these instances, the MDE algorithm gives better results in 19 instances and the HGA algorithm in 2 instances. The improvement in the quality of the best run for the MDE algorithm compared to the HGA, in the instances where the MDE gives better results, is between 0.19% and 3.69%, while the improvement in quality for the HGA algorithm compared to the MDE, in the two instances where the HGA gives better results, is 0.29% and 1.36%, respectively. In the average of the ten runs, the improvement in the quality of the solutions for the MDE algorithm compared to the HGA, in the instances where the MDE gives better results, is between 0.18% and 3.66%, while the improvement for the HGA algorithm compared to the MDE, in the two instances where the HGA gives better results, is 0.23% and 1.34%, respectively. Both algorithms give very good results in all runs, with small differences between them, as the variance for the MDE varies between 0.51 and 1.51 and the standard deviation between 0.71 and 1.23, while for the HGA the variance varies between 0.59 and 1.49 and the standard deviation between 0.77 and 1.22.
In Table 7, the results of the proposed algorithm and the results from the literature without the preventive restocking strategy are presented. In the first column of Table 7, the name of the instance is presented. Column 2 presents the best known solution (BKS) without the preventive restocking strategy, while columns 3 to 10 present the results of Christiansen and Lysgaard (CL) [6] (columns 3 and 4, cost and quality of the solutions, respectively), the results of Goodson et al. (G) [21] (columns 5 and 6, cost and quality of the solutions, respectively), the results of the hybrid genetic algorithm (HGA) (columns 7 and 8, cost and quality of the solutions, respectively) and the results of the proposed algorithm (MDE) (columns 9 and 10, cost and quality of the solutions, respectively). The reason that we tested the proposed algorithm without the preventive restocking strategy is that in [6] the optimal solutions have been calculated with two different branching strategies in some of the instances (19 out of 40 instances, denoted in boldface in column 2 of Table 7), and it is very interesting to see how the proposed algorithm performs when the optimal values are known. As can be seen, in the 19 instances where the optimal values are known, the proposed algorithm succeeded in finding them in 13 and the HGA in 8. In total, the proposed algorithm finds the best solution in 13 instances and the HGA in 8, while for the other instances the quality of the solutions for the proposed algorithm varies between 0.02% and 0.80% and for the HGA between 0.04% and 1.26%. The average quality over all 40 instances is equal to 0.21% for the proposed algorithm and 0.38% for the HGA.
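The quality columns of Table 7 measure the percentage deviation of each algorithm's cost from the best known solution (BKS). A minimal sketch, which reproduces the 0.18 quality reported for the HGA on instance A-n32-k5 (855.1 vs. BKS 853.6):

```python
def quality_pct(cost, bks):
    """Deviation of a solution's cost from the best known solution,
    in percent, as reported in the quality columns of Table 7."""
    return 100.0 * (cost - bks) / bks
```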

Table 7. Results without using the preventive restocking strategy in the VRPSDs in
the first set of benchmark instances (Part A)

                        CL            G             HGA           MDE
Instance     BKS        cost  qual    cost  qual    cost  qual    cost  qual
A-n32-k5 853.6 853.6 0.00 853.6 0.00 855.1 0.18 853.6 0.00
A-n33-k5 704.2 704.2 0.00 704.2 0.00 705.8 0.23 704.2 0.00
A-n33-k6 793.9 793.9 0.00 793.9 0.00 794.8 0.11 794.1 0.03
A-n34-k5 826.87 827.87 0.12 826.87 0.00 828.28 0.17 827.95 0.13
A-n36-k5 858.71 - - 858.71 0.00 861.15 0.28 860.28 0.18
A-n37-k5 708.34 708.34 0.00 708.34 0.00 709.25 0.13 709.12 0.11
A-n37-k6 1030.73 1030.75 0.00 1030.73 0.00 1031.18 0.04 1030.95 0.02
A-n38-k5 775.14 778.09 0.38 775.14 0.00 781.25 0.79 778.29 0.41
A-n39-k5 869.18 869.18 0.00 869.18 0.00 872.27 0.36 869.18 0.00
A-n39-k6 876.6 876.6 0.00 876.6 0.00 877.75 0.13 876.98 0.04
A-n44-k6 1025.48 1025.48 0.00 1025.48 0.00 1027.19 0.17 1026.85 0.13
A-n45-k6 1026.73 - - 1026.73 0.00 1029.28 0.25 1028.35 0.16
A-n45-k7 1264.83 1264.83 0.00 1264.99 0.01 1267.98 0.25 1266.15 0.10
A-n46-k7 1002.22 1002.41 0.02 1002.22 0.00 1004.58 0.24 1003.95 0.17
A-n48-k7 1187.14 - - 1187.14 0.00 1191.68 0.38 1190.57 0.29
A-n53-k7 1124.27 - - 1124.27 0.00 1127.59 0.30 1126.87 0.23
A-n54-k7 1287.07 - - 1287.07 0.00 1292.58 0.43 1290.74 0.29
A-n55-k9 1179.11 - - 1179.11 0.00 1191.24 1.03 1185.57 0.55
A-n60-k9 1529.82 - - 1529.82 0.00 1542.98 0.86 1535.24 0.35
E-n22-k4 411.57 411.57 0.00 411.57 0.00 411.57 0.00 411.57 0.00
E-n33-k4 850.27 850.27 0.00 850.27 0.00 852.14 0.22 850.27 0.00
E-n51-k5 552.26 - - 552.26 0.00 559.24 1.26 555.84 0.65
P-n16-k8 512.82 512.82 0.00 512.82 0.00 512.82 0.00 512.82 0.00
P-n19-k2 224.06 224.06 0.00 224.06 0.00 224.06 0.00 224.06 0.00
P-n20-k2 233.05 233.05 0.00 233.05 0.00 233.05 0.00 233.05 0.00
P-n21-k2 218.96 218.96 0.00 218.96 0.00 218.96 0.00 218.96 0.00
P-n22-k2 231.26 231.26 0.00 231.26 0.00 231.26 0.00 231.26 0.00
P-n22-k8 681.06 681.06 0.00 681.06 0.00 681.06 0.00 681.06 0.00
P-n23-k8 619.52 619.52 0.00 619.53 0.00 619.52 0.00 619.52 0.00
P-n40-k5 472.5 472.5 0.00 472.5 0.00 473.8 0.28 472.5 0.00
P-n45-k5 533.52 - - 533.52 0.00 535.28 0.33 534.12 0.11
P-n50-k10 760.94 - - 760.94 0.00 764.18 0.43 762.14 0.16
P-n50-k7 582.37 - - 582.37 0.00 586.47 0.70 584.15 0.31
P-n50-k8 669.81 - - 669.81 0.00 674.41 0.69 673.18 0.50
P-n51-k10 809.7 809.7 0.00 812.74 0.38 817.28 0.94 816.25 0.81
P-n55-k10 745.7 - - 745.7 0.00 751.24 0.74 749.18 0.47
P-n55-k15 1068.05 1068.05 0.00 1068.05 0.00 1077.18 0.85 1074.24 0.58
P-n55-k7 588.56 - - 588.56 0.00 595.14 1.12 593.21 0.79
P-n60-k10 804.24 - - 804.24 0.00 810.41 0.77 808.57 0.54
P-n60-k15 1085.49 1085.49 0.00 1087.41 0.18 1094.71 0.85 1092.14 0.61

Table 8. Results without using the preventive restocking strategy in the VRPSDs in
the first set of benchmark instances (Part B)

                  HGA                                  MDE
Instance     cost average stdev var median        cost average stdev var median
A-n32-k5 855.1 856.61 0.83 0.68 856.76 853.6 854.53 0.69 0.47 854.52
A-n33-k5 705.8 707.18 0.99 0.98 707.35 704.2 705.58 0.76 0.58 705.68
A-n33-k6 794.8 796.19 0.96 0.93 796.13 794.1 795.12 0.74 0.54 795.01
A-n34-k5 828.28 829.66 1.12 1.26 829.88 827.95 829.04 0.69 0.48 829.28
A-n36-k5 861.15 862.71 1.09 1.20 862.84 860.28 861.78 0.78 0.61 861.93
A-n37-k5 709.25 710.85 0.88 0.78 710.85 709.12 710.46 0.84 0.71 710.54
A-n37-k6 1031.18 1032.85 0.99 0.98 1033.15 1030.95 1032.07 0.97 0.93 1031.85
A-n38-k5 781.25 782.42 0.90 0.82 782.13 778.29 779.61 0.82 0.68 779.96
A-n39-k5 872.27 873.92 0.82 0.67 873.80 869.18 870.18 0.92 0.84 870.11
A-n39-k6 877.75 879.10 0.85 0.72 879.09 876.98 878.14 0.75 0.56 878.05
A-n44-k6 1027.19 1028.56 1.02 1.05 1028.37 1026.85 1028.29 0.78 0.60 1028.43
A-n45-k6 1029.28 1030.89 1.01 1.02 1030.99 1028.35 1029.69 0.89 0.79 1029.91
A-n45-k7 1267.98 1269.06 1.03 1.07 1268.67 1266.15 1266.91 0.63 0.40 1266.81
A-n46-k7 1004.58 1005.94 1.05 1.10 1005.96 1003.95 1004.98 0.75 0.56 1004.95
A-n48-k7 1191.68 1192.51 0.71 0.51 1192.45 1190.57 1191.47 0.41 0.17 1191.52
A-n53-k7 1127.59 1128.80 0.76 0.57 1128.88 1126.87 1128.05 0.87 0.75 1127.85
A-n54-k7 1292.58 1294.05 1.03 1.07 1293.96 1290.74 1292.03 0.78 0.62 1292.25
A-n55-k9 1191.24 1192.65 1.05 1.10 1192.83 1185.57 1187.01 0.73 0.53 1187.11
A-n60-k9 1542.98 1544.56 0.94 0.88 1544.60 1535.24 1536.72 0.90 0.81 1536.95
E-n22-k4 411.57 412.86 0.91 0.83 412.88 411.57 412.32 0.57 0.32 412.21
E-n33-k4 852.14 853.89 1.05 1.10 854.14 850.27 851.51 0.97 0.95 851.80
E-n51-k5 559.24 560.86 1.00 1.01 560.77 555.84 557.12 0.85 0.73 557.42
P-n16-k8 512.82 514.22 0.91 0.83 514.24 512.82 512.82 0.00 0.00 512.82
P-n19-k2 224.06 225.38 1.00 1.01 225.20 224.06 224.06 0.00 0.00 224.06
P-n20-k2 233.05 234.30 1.05 1.11 234.36 233.05 233.11 0.04 0.00 233.10
P-n21-k2 218.96 220.19 0.82 0.68 220.12 218.96 218.96 0.00 0.00 218.96
P-n22-k2 231.26 232.48 1.00 1.00 232.54 231.26 231.31 0.03 0.00 231.32
P-n22-k8 681.06 682.59 0.99 0.97 682.92 681.06 681.06 0.00 0.00 681.06
P-n23-k8 619.52 621.10 0.93 0.87 621.29 619.52 619.52 0.00 0.00 619.52
P-n40-k5 473.8 475.17 0.89 0.79 474.97 472.5 473.48 0.63 0.40 473.53
P-n45-k5 535.28 537.11 0.93 0.86 537.25 534.12 535.43 0.75 0.56 535.52
P-n50-k10 764.18 765.40 0.91 0.83 765.51 762.14 763.37 0.80 0.63 763.41
P-n50-k7 586.47 587.41 0.93 0.86 587.06 584.15 584.73 0.49 0.24 584.61
P-n50-k8 674.41 675.57 0.94 0.89 675.53 673.18 674.18 0.72 0.52 674.09
P-n51-k10 817.28 818.45 1.16 1.35 817.99 816.25 817.51 0.87 0.75 817.41
P-n55-k10 751.24 752.50 0.97 0.95 752.41 749.18 750.67 0.86 0.74 751.10
P-n55-k15 1077.18 1078.36 1.00 0.99 1078.16 1074.24 1075.62 0.77 0.60 1075.66
P-n55-k7 595.14 595.98 0.99 0.98 595.61 593.21 594.17 0.54 0.29 594.25
P-n60-k10 810.41 811.48 0.97 0.94 811.20 808.57 809.89 0.81 0.65 809.87
P-n60-k15 1094.71 1095.91 0.87 0.76 1095.87 1092.14 1093.59 0.96 0.92 1093.68

In Table 8, a more analytical presentation of the results of Table 7 is given. In the first column of Table 8, the name of the instance is given. Columns 2 to 6 and 7 to 11 present the results (cost, average, standard deviation, variance and median) of the Hybrid Genetic Algorithm (HGA) and of the proposed Memetic Differential Evolution (MDE) algorithm, respectively. The MDE algorithm gives better results in 32 instances, and in the other 8 instances both algorithms found the same (in these specific instances, the optimum) solutions. The improvement in the quality of the best run for the MDE algorithm compared to the HGA is between 0.01% and 0.60%. In the average of the ten runs, the results differ slightly from the previous ones, as the MDE algorithm performs better in all instances. The improvement in the quality of the solutions for the MDE algorithm compared to the HGA is between 0.02% and 0.66%. Both algorithms give very good results in all runs, with small differences between them, as the variance for the MDE varies between 0.00 and 0.95 and the standard deviation between 0.00 and 0.97, while for the HGA the variance varies between 0.51 and 1.35 and the standard deviation between 0.71 and 1.16. For the proposed algorithm, a standard deviation and variance equal to 0.00 means that the algorithm found the same solution in all runs. The most important point is that these are some of the instances where the optimum solutions are known, and the proposed algorithm succeeded in finding the optimum solution in all 10 runs.

5 Conclusions and Future Research

In this chapter, we applied a hybridized version of the Differential Evolution algorithm, the Memetic Differential Evolution algorithm, to the solution of a number of stochastic routing problems. We tested the algorithm on two different problems, one with stochastic customers, the Probabilistic Traveling Salesman Problem, and the other with stochastic demands, the Vehicle Routing Problem with Stochastic Demands. For both problems, we also implemented a hybrid genetic algorithm for comparing the results and testing the efficiency of the proposed method. In the Probabilistic Traveling Salesman Problem, the results of the algorithm were compared with the best known solutions from the literature, and five new best solutions (in 15 instances) were found by the proposed algorithm. For the Vehicle Routing Problem with Stochastic Demands, there is a number of algorithms in the literature that have been used for solving this problem or a variant of it. The main difference between the algorithms is the way they treat route failure. Some researchers propose that the vehicles return to the depot when a route failure has occurred, while others use a preventive restocking strategy in which the vehicles return to the depot before a route failure. In this chapter, we tested the algorithm with both strategies. For the first one, we compared the proposed algorithm, in addition to the hybrid genetic algorithm, with two more algorithms from the literature. In this case, the proposed algorithm found the optimum solutions in 13 instances. For the second strategy, as there are no instances from the literature, we tested, initially, the algorithm using the same instances as in the previous case with two different approaches for the stochastic demands, and then we applied the algorithm to a second set of instances, those used for the Capacitated Vehicle Routing Problem. For the second strategy, we compared the algorithm only with the hybrid genetic algorithm and we presented and analyzed the results. Future research will focus on two different directions: the first is the solution of more difficult problems, such as the Vehicle Routing Problem with Stochastic Demands and Customers, where both demands and customers are stochastic variables, or the Vehicle Routing Problem with Stochastic Demands and Time Windows; the second is the application of different evolutionary algorithms to these problems.

References
1. Bertsimas, D.J.: Probabilistic Combinatorial Optimization Problems. Ph.D. thesis, MIT, Cambridge, MA, USA (1988)
2. Bianchi, L.: Ant Colony Optimization and Local Search for the Probabilistic Traveling Salesman Problem: A Case Study in Stochastic Combinatorial Optimization. Ph.D. thesis, Universite Libre de Bruxelles, Belgium (2006)
3. Bianchi, L., Birattari, M., Manfrin, M., Mastrolilli, M., Paquete, L., Rossi-Doria, O., Schiavinotto, T.: Hybrid metaheuristics for the vehicle routing problem with stochastic demands. Journal of Mathematical Modelling and Algorithms 5(1), 91–110 (2006)
4. Bodin, L., Golden, B.: Classification in vehicle routing and scheduling. Networks 11, 97–108 (1981)
5. Bodin, L., Golden, B., Assad, A., Ball, M.: The state of the art in the routing and scheduling of vehicles and crews. Computers and Operations Research 10, 63–212 (1983)
6. Christiansen, C.H., Lysgaard, J.: A branch-and-price algorithm for the capacitated vehicle routing problem with stochastic demands. Operations Research Letters 35, 773–781 (2007)
7. Christofides, N., Mingozzi, A., Toth, P.: The vehicle routing problem. In: Christofides, N., Mingozzi, A., Toth, P., Sandi, C. (eds.) Combinatorial Optimization. John Wiley, Chichester (1979)
8. Dantzig, G.B., Ramser, J.H.: The truck dispatching problem. Management Science 6(1), 80–91 (1959)
9. Engelbrecht, A.P.: Computational Intelligence: An Introduction. John Wiley and Sons, Chichester (2007)
10. Feoktistov, V.: Differential Evolution - In Search of Solutions. Springer, NY (2006)
11. Fisher, M.L.: Vehicle routing. In: Ball, M.O., Magnanti, T.L., Monma, C.L., Nemhauser, G.L. (eds.) Network Routing, Handbooks in Operations Research and Management Science, vol. 8, pp. 1–33. North Holland, Amsterdam (1995)
12. Fister, I., Mernik, M., Filipic, B.: A hybrid self-adaptive evolutionary algorithm for marker optimization in the clothing industry. Applied Soft Computing 10(2), 409–422 (2010)
13. Fister, I., Fister Jr., I., Brest, J., Zumer, V.: Memetic artificial bee colony algorithm for large-scale global optimization. In: 2012 IEEE Congress on Evolutionary Computation (CEC). IEEE (2012)
14. Fister Jr., I., Fister, D., Yang, X.S.: A Hybrid Bat Algorithm. Electrotechnical Review 80(1-2), 1–7 (2013)
15. Fister, I., Mernik, M., Filipic, B.: Graph 3-coloring with a hybrid self-adaptive evolutionary algorithm. Computational Optimization and Applications 54(3), 741–770 (2013)
16. Gendreau, M., Laporte, G., Seguin, R.: Stochastic vehicle routing. European Journal of Operational Research 88, 3–12 (1996)
17. Gendreau, M., Laporte, G., Potvin, J.Y.: Vehicle routing: modern heuristics. In: Aarts, E.H.L., Lenstra, J.K. (eds.) Local Search in Combinatorial Optimization, pp. 311–336. Wiley, Chichester (1997)
18. Gendreau, M., Laporte, G., Potvin, J.Y.: Metaheuristics for the capacitated VRP. In: Toth, P., Vigo, D. (eds.) The Vehicle Routing Problem, Monographs on Discrete Mathematics and Applications, pp. 129–154. SIAM, Philadelphia (2002)
19. Golden, B.L., Assad, A.A.: Vehicle Routing: Methods and Studies. North Holland, Amsterdam (1988)
20. Golden, B.L., Raghavan, S., Wasil, E. (eds.): The Vehicle Routing Problem: Latest Advances and New Challenges. Springer, NY (2008)
21. Goodson, J.C., Ohlmann, J.W., Thomas, B.W.: Cyclic-order neighborhoods with application to the vehicle routing problem with stochastic demand. European Journal of Operational Research 217, 312–323 (2012)
22. Hansen, P., Mladenovic, N.: Variable neighborhood search: Principles and applications. European Journal of Operational Research 130, 449–467 (2001)
23. Jaillet, P.: Probabilistic Traveling Salesman Problems. Ph.D. thesis, MIT, Cambridge, MA, USA (1985)
24. Jaillet, P.: A priori solution of a traveling salesman problem in which a random subset of the customers are visited. Operations Research 36(6), 929–936 (1988)
25. Laporte, G., Semet, F.: Classical heuristics for the capacitated VRP. In: Toth, P., Vigo, D. (eds.) The Vehicle Routing Problem, Monographs on Discrete Mathematics and Applications, pp. 109–128. SIAM, Philadelphia (2002)
26. Laporte, G., Gendreau, M., Potvin, J.Y., Semet, F.: Classical and modern heuristics for the vehicle routing problem. International Transactions on Operations Research 7, 285–300 (2000)
27. Lawler, E.L., Lenstra, J.K., Rinnooy Kan, A.H.G., Shmoys, D.B.: The Traveling Salesman Problem: A Guided Tour of Combinatorial Optimization. Wiley and Sons (1985)
28. Lichtblau, D.: Discrete optimization using Mathematica. In: Callaos, N., Ebisuzaki, T., Starr, B., Abe, J.M., Lichtblau, D. (eds.) World Multi-Conference on Systemics, Cybernetics and Informatics (SCI 2002). International Institute of Informatics and Systemics, vol. 16, pp. 169–174 (2002)
29. Liu, Y.-H.: A hybrid scatter search for the probabilistic traveling salesman problem. Computers and Operations Research 34(10), 2949–2963 (2007)
30. Marinakis, Y., Marinaki, M.: A hybrid multi-swarm particle swarm optimization algorithm for the probabilistic traveling salesman problem. Computers and Operations Research 37, 432–442 (2010)
31. Marinakis, Y., Migdalas, A.: Heuristic solutions of vehicle routing problems in supply chain management. In: Pardalos, P.M., Migdalas, A., Burkard, R. (eds.) Combinatorial and Global Optimization, pp. 205–236. World Scientific Publishing Co. (2002)
32. Marinakis, Y., Iordanidou, G.R., Marinaki, M.: Particle Swarm Optimization for the Vehicle Routing Problem with Stochastic Demands. Applied Soft Computing 13, 1693–1704 (2013)
33. Moscato, P., Cotta, C.: A gentle introduction to memetic algorithms. In: Glover, F., Kochenberger, G.A. (eds.) Handbooks of Metaheuristics, pp. 105–144. Kluwer Academic Publishers, Dordrecht (2003)
34. Pereira, F.B., Tavares, J.: Bio-inspired Algorithms for the Vehicle Routing Problem. SCI, vol. 161. Springer, Heidelberg (2008)
35. Powell, W.B., Jaillet, P., Odoni, A.: Stochastic and dynamic networks and routing. In: Ball, M.O., Magnanti, T.L., Monma, C.L., Nemhauser, G.L. (eds.) Network Routing, Handbooks in Operations Research and Management Science, vol. 8, pp. 141–295. Elsevier Science B.V., Amsterdam (1995)
36. Price, K.V., Storn, R.M., Lampinen, J.A.: Differential Evolution: A Practical Approach to Global Optimization. Springer, Berlin (2005)
37. Stewart, W.R., Golden, B.L.: Stochastic vehicle routing: A comprehensive approach. European Journal of Operational Research 14, 371–385 (1983)
38. Storn, R., Price, K.: Differential evolution - A simple and efficient heuristic for global optimization over continuous spaces. Journal of Global Optimization 11(4), 341–359 (1997)
39. Tarantilis, C.D.: Solving the vehicle routing problem with adaptive memory programming methodology. Computers and Operations Research 32, 2309–2327 (2005)
40. Toth, P., Vigo, D.: The Vehicle Routing Problem. Monographs on Discrete Mathematics and Applications. SIAM, Philadelphia (2002)
41. Yang, W.H., Mathur, K., Ballou, R.H.: Stochastic vehicle routing problem with restocking. Transportation Science 34, 99–112 (2000)
42. http://www.coin-or.org/SYMPHONY/branchandcut/VRP/data/Vrp-All.tgz
43. http://www.iwr.uni-heidelberg.de/groups/comopt/software/TSPLIB95
Modeling Nanorobot Control Using Swarm Intelligence
for Blood Vessel Repair: A Rigid-Tube Model

Boonserm Kaewkamnerdpong1,*, Pinfa Boonrong1,
Supatchaya Trihirun2, and Tiranee Achalakul2

1 Biological Engineering Program, Faculty of Engineering,
King Mongkut's University of Technology Thonburi, Bangkok, Thailand
boonserm.kae@kmutt.ac.th, pinfa21@gmail.com
2 Department of Computer Engineering, Faculty of Engineering,
King Mongkut's University of Technology Thonburi, Bangkok, Thailand
supatchaya.trihirun@gmail.com, tiranee@cpe.kmutt.ac.th

Abstract. The future nanorobots for diagnosis and treatment purposes in nanomedicine may exhibit only simple behaviors and work together in their early stage. After exploring existing swarm intelligence techniques, canonical particle swarm optimization was selected for adaptively controlling the locomotion of a swarm system of early-stage nanorobots with only essential characteristics for self-assembly into a structure in a simulation system. In this study, we demonstrated nanorobots operating as artificial platelets for repairing wounds in a simulated human small vessel, which may be used to treat platelet diseases. In a rigid-tube model, we investigated how artificial platelet capabilities, including the perception range, maximum velocity and response speed, impact wound healing effectiveness. It was found that canonical particle swarm optimization is an efficient algorithm for controlling early-stage nanorobots with essential characteristics in both Newtonian and non-Newtonian flow models. The demonstration could serve as a guideline on essential characteristics and useful locomotion control for the realization of nanorobots for medical applications in the future.

1 Introduction
Even though the advancement of medical technology provides more effective and efficient diagnosis, monitoring and treatment, there still exist some diseases that are difficult to diagnose at their early stages, and some treatments carry risks or side effects. For example, it is truly hard to diagnose a cancer in its early stage; traditional radiation treatment also damages healthy cells near the cancer cells. Robots have been introduced and used in medical applications, especially surgery, where they may carry fewer risks than open and laparoscopic surgery. Although the current development of medical robots does not yet allow us to operate at the cellular or molecular scale, which might dispel a disease at its source without harming healthy cells, it is anticipated that nano-scale robots, or nanorobots (a concept arising from the advances in nanotechnology introduced by Richard P. Feynman in 1959 [15]), may be used to improve the efficiency of medical technology. As Drexler introduced an idea of cooperative small
* Corresponding author.

© Springer International Publishing Switzerland 2015
I. Fister and I. Fister Jr. (eds.), Adaptation and Hybridization in Computational Intelligence,
Adaptation, Learning, and Optimization 18, DOI: 10.1007/978-3-319-14400-9_10
206 B. Kaewkamnerdpong et al.

robots or nanorobots that could manipulate substances inside the human blood vessels [12], these nanorobots may allow us to cure diseases by delivering drugs to specific positions, which could reduce the damage to normal cells as well as other side effects.
Although nanorobots have not yet been realized, some nanotechnology research and development might lead to the realization of nanorobots or nanomachines for medical and other applications in the near future. For instance, catalytic nano-wire motors [42] could be a great way to construct self-powered practical nanomachines and could provide the building blocks to realize future nanorobots. The artificial bacterial flagella [45], whose movement is controlled by a low-strength rotating magnetic field, may potentially be used as part of future nanorobots to control their movement. Apart from the structure and suitable materials of nanorobots, the control systems of nanostructures according to the concept of nanotechnology are also considered. Nowadays, there are many design concepts for medical nanorobots. For example, Freitas designed medical nanorobots such as respirocytes [17], or artificial mechanical red blood cells; microbivores [33], or artificial mechanical white blood cells; and clottocytes [32], or artificial platelets. In addition, the simulation of nanorobots is beneficial to identify the essential characteristics and vital functions of nanorobots and to investigate the effectiveness of control techniques for nanorobots to achieve their tasks; the findings can serve as guidelines for developing nanorobots in the future [21]. Examples of nanorobot simulations include the simulation of nanorobots transporting nutrients to organ inlets developed by Cavalcanti and Freitas [5], the simulation of DNA nanorobots for identifying cancer cells modeled by Sanchita [34], the simulation of a swarm of an early version of future nanorobots for self-assembly and self-repair applications modeled by Kaewkamnerdpong and Bentley [22], and the simulation of nanorobots traveling through blood vessels to detect aneurysms in the brain by Cavalcanti et al. [4]. These studies modeled how nanorobots might be in the future and envisaged ways to allow the nanorobots to operate effectively. The results from these studies could provide suggestions for the development of future nanorobots. In spite of that, the simulation of nanorobots should be as realistic as possible so that the findings can be truly beneficial for the realization of future nanorobots.
Based on the concept of bottom-up technology in nanotechnology, nano parts such as molecular motors and nano swimmers have been constructed [36]. Nevertheless, the nanorobots designed in most works are quite advanced and require complex nano parts. With the current state of nanotechnology, complex nano parts have not yet been accomplished, and such advanced nanorobots may not be available in the near term. Due to the very small size of nanorobots, it is more reasonable to expect that the early version of nanorobots will exhibit only simple behaviors. They may be able to move around the environment, interact with the individuals in the group and interact with their environment; external control over the nanorobots may not be available. With these limitations, the control mechanism of nanorobots should be suitably designed according to these potential behaviors and characteristics so that they can achieve their designated tasks in a dynamic environment. Such a nanorobot context, which includes simple capabilities and no external control, is similar to that of social insects. The individual behaviors of social animals such as ants and termites are usually simple, but their
Modeling Nanorobot Control Using SI for Blood Vessel Repair: A Rigid-Tube Model 207

collaborative behaviors enable them to achieve complex tasks for survival. The col-
lective intelligence of social animals, called swarm intelligence, has been modeled and
used as a problem-solving technique in various applications. Hence, swarm intelligence
may be a reasonable way to control the locomotion of a swarm of nanorobots with
adaptation to effectively accomplish designated tasks even without complex characteristics.
The objective of this study was to investigate the plausibility of controlling a swarm
system of nanorobots using a swarm intelligence technique for self-assembly tasks in
medical applications.
In this study, the performance of a swarm-intelligence-based system of nanorobots
acting as artificial platelets was investigated in a blood vessel repair application. The
reasons for selecting the blood vessel repair application are that an artificial platelet
model, the clottocyte [32], already exists among designed nanorobot models, and that
primary hemostasis is an example of self-assembly in nature which is simple enough
for early-stage nanorobots to achieve. In primary hemostasis, platelets play the most
important role, involving platelet adhesion and platelet aggregation to stop the bleeding
of a wound. However, some patients have abnormal platelets that cause defects in
primary hemostasis. Additionally, the treatments for platelet diseases have side effects.
Hence, it can be anticipated that biocompatible nanorobots that operate as artificial
platelets may be used to treat patients who have platelet diseases such as thrombo-
cytopenia. In this study, we demonstrated the performance of a swarm-intelligence-based
nanorobot system operating as artificial platelets to repair an injured blood vessel wall
in a rigid-tube model. The flow models used in this study were obtained from the
literature; improving the flow models was not in the scope of this study.
This chapter is organized as follows: Section 2 describes swarm intelligence tech-
niques and discusses how these techniques could be appropriate for controlling
nanorobots. The simulated model for nanorobots in blood vessel repair is described
in Section 3; this section includes both the model of the nanorobots and their
environment in the blood vessel. Section 4 demonstrates the performance of the
nanorobot system in the blood vessel repair application. The study is concluded with
an analysis of the control mechanism toward the realization of future nanorobots in
Section 5.

2 Swarm Intelligence for Nanorobot Control

Swarm intelligence techniques are inspired by the collaborative behaviors of social
animals such as ants, termites, flocking birds and schooling fish [2,14,20]. In social
insects and animals, each individual usually exhibits only simple actions, such as
moving around the environment, interacting with other individuals in the swarm, and
interacting with their
environment, interacting with other individuals in the swarm, and interacting with their
environment; through these simple activities, they can collaboratively work together to
achieve their goal without any leader. Such collaborative behaviors among individuals
that enable them to achieve their complex tasks such as foraging and nest building are
modeled into algorithms and employed to solve complex problems, mostly optimization
problems. To date, there are numerous swarm intelligence techniques in the literature;
however, the most prominent techniques remain ant colony optimization (ACO), artificial
bee colony (ABC) and particle swarm optimization (PSO). Toward the use of swarm
208 B. Kaewkamnerdpong et al.

intelligence techniques for nanorobot control, each of these techniques is discussed
as follows:
ACO is inspired by the foraging behavior of ants: they can find the shortest path be-
tween their nest and food sources through the deposition and evaporation of pheromone
[10]. ACO has been used to successfully solve combinatorial optimization problems
such as the traveling salesman problem (TSP) [11]. ACO has also been extended to
solve continuous optimization problems [2]. Pheromone, a chemical substance, is
crucial to the foraging behavior of ants. For solving optimization problems in com-
puters, the pheromone concentration on a path can be simulated with a function.
However, in physical applications such as nanorobot control, unless such a chemical
substance is available there must be a substitute strategy. This may limit the use of
ACO in physical applications. ACO has nevertheless been employed in physical
applications such as robotics; Payton et al. [31] simulated virtual pheromone using in-
frared transceivers mounted on physical robots called pheromone robots, or pherobots.
Instead of laying an actual chemical substance, a pherobot transmits a virtual pheromone
message, encoded as a single modulated signal, to its neighbors, which determine the
pheromone intensity from the message and the estimated distance from the sender.
Then, the pherobots relay the received message to their neighbors. The message is re-
layed a number of times specified by the sender. This method can crudely simulate
pheromone detection and its gradient: as ants lay pheromone on their path, only ants in
the proximity of the path can detect the pheromone trail and be influenced. Nevertheless,
such virtual pheromone requires message transmission, which may not be plausible in
the early stage of nanorobots. Although one may argue that some biocompatible chemical
substance may be available for use as pheromone in nanorobots, it would still be difficult
to lay such a substance in a dynamic environment such as inside a blood vessel.
ABC is inspired by the foraging behavior of honey bees [24]. In ABC, there are
three groups of bees: employed bees, onlooker bees and scout bees. The employed bees
fly out of the hive to find food sources (feasible solutions). They fly back to the hive
and share information on the quality of nectar from their food source (the fitness of
the solution) with the onlooker bees by performing a waggle dance. Each onlooker bee
selects one of the food sources depending on the obtained information and exploits that
food source. The recruitment of onlooker bees toward a food source creates a search in
the area adjacent to the selected food source; this can lead to finding the best food
source (optimal solution). The role of scout bees is to randomly locate new food
sources. If an employed bee cannot locate a better food source for a predefined
number of iterations, it will abandon its food source and pursue a new one located
by a scout bee. With this mechanism, local optima can be avoided. Although ABC
has been successfully employed in various applications and has been shown to perform
best when compared with other population-based techniques on benchmark optimization
problems [23,24,25], for physical applications the mechanism of information sharing
among bees at the hive may be difficult to realize, especially for early-version
nanorobots in a dynamic environment such as the bloodstream.
PSO, inspired by bird flocking and fish schooling [13], is another population-based
optimization algorithm. In PSO, each particle (a potential solution) moves around the
problem search space, interacting and exchanging information with others to find an
optimum or a good enough solution. Thus, the search for a good solution is influenced
by both a particle's own experience and its neighbors' experiences. PSO has very
few parameters to adjust; in addition, it uses little memory. Hence, PSO seems plausible
to apply to nanorobot locomotion control and other physical applications. Nevertheless,
the conventional PSO algorithm requires complex, direct communication among parti-
cles in the neighborhood to exchange detailed information on their performance, which
might not be possible for real nanorobots.
There has been an attempt to apply a PSO-based algorithm to nanorobot locomotion
control; Kaewkamnerdpong and Bentley [22] employed perceptive particle swarm
optimization (PPSO) [21], a modification of the conventional PSO, to control a swarm
nanorobot system in a surface coating application. The nanorobots must self-assemble
into a nanostructure on the desired surface. In [22], the nanorobot model was based on
an early version of nanorobots whose characteristics and behaviors are simple due to
the limitations of their size; no direct communication with information exchange may
be available. To substitute for the lack of direct communication, it is assumed that each
particle or nanorobot has a sensory unit to sense its neighbors and evaluate its
performance in the search space for interacting with other individuals and its
environment. PPSO takes this performance of particles at their positions in
n-dimensional space as additional information in the control; hence, PPSO operates in
(n + 1)-dimensional space. By adding the additional dimension and the ability to ob-
serve the search space, it was expected that particles could perceive their approximate
performance and reasonably perform their collaborative tasks. It was found that
although PPSO could be used to control nanorobots to achieve their task, the additional
dimension seemed unnecessary in a dynamic environment. For medical nanorobots
that operate in an extremely dynamic environment like the circulatory system, it may
not be necessary to keep the signified fitness of their current positions because their
positions would be undetectably altered by the blood flow and, in turn, their collected
performance would mislead the locomotion control.
In this study, we adopted the concept of an early-version nanorobot model but applied
the canonical PSO algorithm to control the swarm of medical nanorobots. The canonical
PSO algorithm is the constricted version of PSO proposed by Clerc and Kennedy [9].
The constriction coefficient, χ, is used to control the exploration versus exploitation
trade-off in the velocity equation to ensure the convergence of PSO [9]. Although
various modifications of PSO have been proposed over the years, the canonical PSO
algorithm was selected in this study because it is simple, which is analogous to
early-stage nanorobots, and requires only a few parameters to be adjusted, corresponding
to plausible characteristics of early-stage nanorobots.
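To illustrate how few moving parts the canonical PSO has, the following is a minimal
sketch of the constricted algorithm on a toy benchmark. This is not the authors'
implementation; the swarm size, iteration count, benchmark function and global-best
topology are arbitrary choices for demonstration, while χ = 0.729 and c1 = c2 = 2.05
follow the Clerc-Kennedy analysis cited in the text.

```python
import random

def canonical_pso(fitness, dim=2, swarm=20, iters=300, lo=-5.0, hi=5.0):
    """Minimize `fitness` with the constricted (canonical) PSO.

    chi = 0.729 and c1 = c2 = 2.05 follow the Clerc-Kennedy analysis:
    c1 + c2 = 4.1 > 4 is the condition associated with convergence.
    """
    chi, c1, c2 = 0.729, 2.05, 2.05
    x = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(swarm)]
    v = [[0.0] * dim for _ in range(swarm)]
    pbest = [p[:] for p in x]                     # personal best positions
    pbest_f = [fitness(p) for p in x]

    for _ in range(iters):
        gbest = min(pbest, key=fitness)           # global-best topology
        for i in range(swarm):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                v[i][d] = chi * (v[i][d]
                                 + c1 * r1 * (pbest[i][d] - x[i][d])
                                 + c2 * r2 * (gbest[d] - x[i][d]))
                x[i][d] += v[i][d]                # position update
            f = fitness(x[i])
            if f < pbest_f[i]:                    # update personal best
                pbest[i], pbest_f[i] = x[i][:], f
    return min(pbest, key=fitness)

# Example: the sphere function, whose optimum lies at the origin.
best = canonical_pso(lambda p: sum(c * c for c in p))
```

Only three numbers (χ, c1, c2) parameterize the search, which is the simplicity
argument made above for early-stage nanorobots.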

3 The Simulated Model for Nanorobots in Blood Vessel Repair


Platelets are crucial for our body's mechanism to stop bleeding when injured. In normal
conditions, each microliter of blood contains 150,000-400,000 platelets [43]. With throm-
bocytopenia [43], the condition in which the platelet count is less than 150,000 per
microliter, patients can bleed for a long time due to insufficient platelets and can go
into shock after losing too much blood; if this occurs in the brain (intracranial hem-
orrhage), patients can die or become paralyzed. Among the ways to treat thrombocytopenia,
platelet transfusion is well known. Platelets separated from donated blood are injected
into patients. After injection, platelets circulate in the body for ten days on average
before being destroyed through the body's natural mechanisms. Thus, patients usually
need platelet transfusions twice or thrice a month. However, frequent platelet transfusion
might induce resistance to platelets from others. For patients whose platelets are de-
stroyed by their own immune systems, immunosuppressants are used, but with some side
effects as well. Another futuristic idea is using nanorobots as artificial platelets to heal
wounds from inside the body. Toward this idea, this study explored the control mech-
anism for nanorobots operating as artificial platelets to repair blood vessels.
Platelets, or thrombocytes, originate from the cytoplasts of megakaryocytes, which
are the largest cells in bone marrow. The cytoplasts have many small pseudopods
that slip off and become platelets of 2-4 μm in diameter. When platelets are in
their inactivated state, they are discoid in shape, but they become spheroid when activated
[6]. They play an important role in hemostasis, which is the response mechanism of the
human body to stop bleeding. When a blood vessel is injured, the vessel constricts to
decrease the vessel lumen in order to slow the bleeding; in hemostasis, this step is
referred to as vasoconstriction. At the open wound, blood is exposed to the col-
lagen fibers underlying the endothelium in blood vessels. Platelets become activated
after prolonged exposure to high shear stress or when the shear stress rapidly increases,
such as when vasoconstriction occurs at a damaged vessel [39]. Through glycoprotein
Ia/IIa receptors on their membrane, platelets adhere to the exposed collagen and von
Willebrand factor (vWF) in the vessel wall [18]. Then, the adherent platelets release
adenosine diphosphate as well as thromboxane A2, which induces additional platelets
to become activated and adhere [1]. Prostacyclin, a substance released from the normal
endothelial cells in the blood vessel adjacent to the injured area, plays a role in preventing
the aggregation of platelets along the length of a normal vessel [1]. As thromboxane A2
promotes platelet aggregation whereas prostacyclin inhibits it, a balance between
platelet thromboxane A2 and prostacyclin is required to perform localized platelet
aggregation while preventing excessive clotting and maintaining blood flow around
the clot [18]. The adherent platelets at the site of injury form a temporary hemostatic
plug [18]; the mechanism that forms a platelet plug is referred to as primary hemostasis.
In secondary hemostasis, the hemostatic plug is then converted into the definitive clot:
platelets release clotting factors to convert fibrinogen, a protein found in the plasma,
into a dense network of fibrin strands in which blood cells and plasma are trapped [1,3].
In hemostasis, platelets take a crucial role in primary hemostasis to form a platelet
plug. Primary hemostasis is an example of a natural self-assembly task; hence, it is
appropriate for investigating the model of early-stage nanorobots. The nanorobots
acting as artificial platelets will need to contain all the necessary granules to allow
usual primary hemostasis and to promote secondary hemostasis in order to completely
heal the wound. While moving along blood vessels, the artificial platelets must be
able to detect whether there is any sign of a wound nearby and become activated
when such a sign is discovered. Then, the artificial platelets must work together to
prevent further blood loss and accommodate secondary hemostasis.
3.1 The Model of Nanorobots


There exists a model of an artificial mechanical platelet called the clottocyte envisioned
by Freitas in [32]. A clottocyte is a spherical nanorobot comprising features for

- carrying a folded fiber mesh which becomes sticky when the coated substance
  comes in contact with plasma,
- sensing its environment to detect the injury,
- communicating with other clottocytes to activate them with acoustic pulses when
  an injury is detected, and
- unleashing the mesh at the site of an injured blood vessel.

Instead of aggregating into a platelet plug, clottocytes are designed to use artificial
nets to trap blood cells and accelerate the clotting process. According to the study of
Freitas in [32], clottocytes could stop bleeding much faster than the natural process.
This would be a prominent contribution for people who have problems with the
hemostasis process, especially people with platelet dysfunction. Nevertheless, given
the current state of nanotechnology, this may seem too advanced for nanorobots in
their early stage.
Instead of adopting Freitas' version of artificial platelets with net packets onboard, this
study regarded nanorobots as biocompatible molecules self-assembling at the wound
site to form a temporary plug to trap blood components and let the hemostasis process
continue the formation of the clot. This early version of nanorobots represents artificial
platelets that are attracted and adhere to the exposed collagen in the injured vessel wall,
and release substances to recruit additional platelets as well as artificial platelets to
the area.

The Characteristics of Nanorobots. In order to provide guidelines that are truly ben-
eficial toward the realization of future nanorobots, this study used a nanorobot model
based on the current development of nanotechnology as well as existing characteristics
in biological systems that have the potential to be included in early-stage nanorobots. In
[22], apart from the energy that is fundamental to power the nanorobots, the essential
characteristics of nanorobots for self-assembly and self-repair tasks have been identified
as the following:

- moving around the environment,
- interacting with other nanorobots as well as the environment,
- carrying defined characteristics for the assembly task, and
- connecting to other nanorobots at a specific surface.
Kaewkamnerdpong and Bentley [22] used these characteristics to model nanorobots
for a surface coating application. These characteristics could also be used to achieve
the self-assembly task for artificial platelets in this study; with these characteristics,
the nanorobots as artificial platelets could move through the bloodstream, seek the
wound and form into a mass at the damage site to stop the bleeding. To support the
potential for these characteristics to be realized, Kaewkamnerdpong and Bentley
[22] explored the literature and discussed four features with examples from nanotechnology
development and examples in nature, including actuators, signal generators and sensors,
programmability, and connection, as summarized in Table 1. These features are adequate
to allow a swarm-intelligence-based control mechanism for nanorobots.

Table 1. The features of nanorobots and examples in biological systems

Feature           Function                            Examples in Biological System
Actuator          Convert energy into motion          Bacterial flagella;
                                                      ATP synthase
Signal generator  Perceive the environment;           Bioluminescence;
and sensor        generate and sense signals for      chemical release in activated
                  interaction with other nanorobots   platelets; sensing of nutrient
                                                      levels in bacteria
Programmability   Store necessary data;               DNA (genetic system)
                  compute for performing tasks;
                  control the nanorobot operation
Connection        Connect with other nanorobots       Atomic connection (covalent bonds,
                  to form into a structure            hydrogen bonds, dispersion-repulsion
                                                      forces); DNA sticky ends
                                                      (nanotechnology)

This study adopted the nanorobot model proposed in [22] for self-assembly. Nev-
ertheless, to function as artificial platelets, the nanorobots that assemble themselves to
repair the damaged blood vessel must be able to

- move around the environment,
- interact with other nanorobots and the environment,
- generate a signal that attracts other nanorobots and sense the attraction signal,
- carry defined characteristics for the assembly task,
- connect to other nanorobots at a specific surface, and
- operate inside the human body with biocompatibility.
In this study, each artificial platelet was spherical in shape, similar to the clottocyte
[32]. The size of the artificial platelets in this study was 2 μm in diameter, similar to
natural platelets and clottocytes [32,40]. For practical simulation, each artificial platelet
was limited to interacting with others and its environment within a defined perception
range. As this study focused on the use of swarm intelligence to control nanorobot
locomotion, the following assumptions were made in the simulation:

- The artificial platelets can move around the vessel model within a defined maximum
  velocity that allows them to move in the opposite direction to the blood flow to find
  the wound site.
- The artificial platelets can sense the changing environment within a defined percep-
  tion range for measuring the concentration of the chemical substance released from
  the wound in order to locate the wound site.
- The artificial platelets have connectors that can bind with vWF for adhesion to the
  exposed collagen on the vessel wall and can aggregate with other adhered artificial
  platelets and other adhered natural platelets only at the wound area for forming the
  structure.
- The forming of the artificial platelet structure is stopped when the vessel releases
  endothelium-derived relaxing factor and prostacyclin [40].
- The artificial platelets cannot connect with other artificial platelets and other blood
  cells while traveling in the bloodstream.
- The artificial platelets that have already adhered to the wound can release attraction
  signals into the environment to induce other artificial platelets toward the wound
  site, and release the substances involved in blood clotting (secondary hemostasis)
  after the formation of the platelet plug, such as calcium, fibronectin, fibrinogen, and
  coagulation factors FV and FVIII [40].
- When an artificial platelet is very close to an optimal artificial platelet, an attraction
  force applies; as a result, the artificial platelet is pulled to connect with that optimal
  artificial platelet via vWF, and then the new optimal artificial platelet releases
  attraction signals into the environment to induce other artificial platelets.
- Natural platelets are not included in this simulation.
- The human body will naturally dissolve the blood clot after the wound has healed;
  the artificial platelets then return to the bloodstream and move along the blood
  vessel to repair other wounds.

The Nanorobot Control. In terms of the control mechanism for artificial platelet lo-
comotion, in this study the canonical PSO algorithm [9] was chosen to regulate the
artificial platelets to collaboratively self-assemble into a mass at the wound site to stop
the bleeding. Each artificial platelet moves in a three-dimensional model of a blood
vessel. At each time step, the artificial platelets move to new positions according to
the velocity update of the canonical PSO algorithm. As the signaling and sensing
units of the artificial platelets have a limited perception range for interacting with others
and the environment, only neighbors located within the perception range can influence
the movement. The algorithm for artificial platelet locomotion is shown in Table 2.
When the first artificial platelet finds the wound site, it will adhere to the exposed
collagen at the wound and release an attraction signal to activate others. Other artificial
platelets that can sense the attraction signal will become activated and move toward the
wound to adhere to the exposed collagen or to optimal artificial platelets at the wound.
For a swarm of m artificial platelets, let x_i(t) refer to the position of artificial platelet
i in the [x, y, z] dimensions of the search space at time t; the initial positions of the
artificial platelets are uniformly randomized over the search space. The velocity of
artificial platelet i in the [x, y, z] dimensions at time t is denoted by v_i(t). For each
dimension, the velocity is initialized with a uniform random value between -VMAX
and +VMAX.
The fitness value, F(x_i(t)), or the performance of each artificial platelet at its
position, is the summation of the detected concentration of vWF molecules released

Table 2. The algorithm for artificial platelet locomotion control

Algorithm for Artificial Platelet Locomotion Control


initialize position and velocity of each artificial
platelet
initialize personal best position of each artificial
platelet
repeat
for each artificial platelet
observe environment and attraction signal
calculate the fitness value
update the personal best position
observe neighboring artificial platelets
update the local best position
modify velocity for the next time step
update position for the next time step
end
until the termination criterion is met

from the wound and the detected intensity of the attraction signal released from optimal
artificial platelets at the wound site. The concentration of vWF and the attraction signal
can be described by Fick's second law,

    \frac{\partial C_A(x,t)}{\partial t} = D_A \frac{\partial^2 C_A(x,t)}{\partial x^2}    (1)

where C_A is the concentration of solute A, x is the distance from the optimal artificial
platelets to the measurement position, t is time, and D_A is the diffusion coefficient of
solute A [41]. In this study, the diffusion coefficient of vWF was set to 4.5 × 10^{-12} m²/s,
which is similar to the diffusion coefficient of natural platelets analyzed by quasi-elastic
light scattering [29,37]. It is anticipated that the artificial platelets may use a chemical
substance as a means of communication; at present, artificial platelets have not yet been
realized, and the substances used to communicate cannot be identified. Hence, the
characteristics of the attraction signal in this study were assumed to be the same as vWF.
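The diffusion governed by Fick's second law (Eq. 1) can be simulated numerically. The
chapter does not state its numerical scheme; the sketch below uses an explicit
finite-difference discretization as one plausible choice, with the vWF diffusion
coefficient quoted above and an arbitrary 100 μm one-dimensional domain with a point
release at its center.

```python
def diffuse_1d(conc, D, dx, dt, steps):
    """Advance Fick's second law dC/dt = D * d2C/dx2 with an explicit
    finite-difference scheme and zero-flux (mirrored) boundaries.

    Stability of the explicit scheme requires D * dt / dx**2 <= 0.5.
    """
    assert D * dt / dx ** 2 <= 0.5, "explicit scheme would be unstable"
    c = list(conc)
    for _ in range(steps):
        new = c[:]
        for i in range(1, len(c) - 1):
            new[i] = c[i] + D * dt / dx ** 2 * (c[i + 1] - 2 * c[i] + c[i - 1])
        new[0], new[-1] = new[1], new[-2]    # zero-flux ends
        c = new
    return c

# Point release of a vWF-like substance, D = 4.5e-12 m^2/s (the value
# used in this chapter), on a 100-micrometre domain with 1-micrometre cells.
n, dx, dt = 101, 1e-6, 0.05                  # D*dt/dx^2 = 0.225 <= 0.5
c0 = [0.0] * n
c0[n // 2] = 1.0
c = diffuse_1d(c0, D=4.5e-12, dx=dx, dt=dt, steps=200)
```

A platelet's fitness at a position could then be read off as the local concentration,
summed over the wound's vWF and the adhered platelets' attraction signals.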
Each artificial platelet uses its fitness value to determine its personal best position,
x_{pbest,i}. At every iteration, the fitness value at the current position is checked
against that at the personal best position in order to update the personal best position.
However, as the early-stage nanorobots with only essential characteristics have no
knowledge of their location in space, their personal best positions can only be
calculated by accumulating their movements from their previous personal best positions,
as in Eq. 2. At initialization, the personal best positions are set to zero; this means
that each artificial platelet is initially located at its personal best position, which is
the initial position.

    x_{pbest,i}(t) =
      \begin{cases}
        0, & \text{if } F(x_i(t)) \geq F(x_{pbest,i}(t-1)); \\
        x_{pbest,i}(t-1) - v_i(t), & \text{otherwise.}
      \end{cases}    (2)
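The relative bookkeeping of Eq. 2 can be made concrete with a small sketch. The
function and variable names here are ours, not the chapter's: the personal best is
stored as an offset from the current position, reset to zero whenever the current
position is at least as fit, and otherwise shifted back by the movement just made.

```python
def update_relative_pbest(pbest_rel, velocity, f_current, f_pbest):
    """Eq. (2): keep the personal best as a displacement from the current
    position.  Returns the updated (offset, best fitness) pair.
    """
    if f_current >= f_pbest:
        # Current position becomes the new personal best: offset is zero.
        return [0.0] * len(pbest_rel), f_current
    # Otherwise the stored best point recedes by the move just made.
    return [p - v for p, v in zip(pbest_rel, velocity)], f_pbest

# A platelet moved by v = [1, 0, 0] and its fitness dropped from 0.8 to 0.5:
rel, f = update_relative_pbest([0.0, 0.0, 0.0], [1.0, 0.0, 0.0], 0.5, 0.8)
# rel is now [-1.0, 0.0, 0.0]: the best-seen point lies one unit behind.
```

This is how a nanorobot with no absolute positioning can still remember where its
best-sensed concentration was, purely in terms of its own accumulated motion.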
In canonical PSO, each particle observes its neighbors and uses the local best
position, i.e., the position with the best fitness value in the neighborhood according to
the network topology, to influence its movement toward a better position. In practice,
the signaling and sensing units of artificial platelets would only be able to operate
within a limited perception range; hence, each artificial platelet can only interact with
its neighbors and the environment within its perception range. Moreover, instead of
exchanging performance information with other individuals, the early-stage nanorobots
may only be able to sense the presence of other individuals within their perception
range. With such a limitation, a nanorobot cannot know whether any of its neighbors
is in a better position or not. Hence, the local best position, x_{lbest}(t), in this study
is determined by randomly selecting neighboring positions within the defined perception
range; the average position of all selected neighboring positions is used as the local
best position. Nevertheless, when an optimal artificial platelet is found, the local best
position is set to the optimal artificial platelet's position. If more than one optimal
artificial platelet is found, the local best position is the position of the nearest optimal
artificial platelet. In the case where neither neighbors nor optimal artificial platelets
are found, the local best position is the current position.
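The neighborhood rule just described can be sketched as a small selection function.
The names, the Euclidean-distance test, and the random subset size are our illustrative
assumptions; the chapter only specifies the three cases (nearest optimal platelet,
average of randomly sensed neighbors, or the current position).

```python
import math
import random

def local_best(current, neighbors, optimal_platelets, perception_range):
    """Choose x_lbest for one artificial platelet.

    - If adhered (optimal) platelets are in range, use the nearest one.
    - Else average a random subset of neighbors sensed in range.
    - Else fall back to the current position.
    """
    dist = lambda a, b: math.dist(a, b)
    in_range = [p for p in optimal_platelets
                if dist(current, p) <= perception_range]
    if in_range:
        return min(in_range, key=lambda p: dist(current, p))
    sensed = [p for p in neighbors if dist(current, p) <= perception_range]
    if sensed:
        chosen = random.sample(sensed, k=random.randint(1, len(sensed)))
        return [sum(coord) / len(chosen) for coord in zip(*chosen)]
    return list(current)
```

Note that only *presence* within the perception range is used; no fitness values are
exchanged, matching the sensing-only limitation of early-stage nanorobots.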
The velocity of each nanorobot according to the canonical PSO algorithm can be
expressed as

    v_i(t+1) = \chi \left[ v_i(t) + \phi_1 \left( x_{pbest,i} - x_i(t) \right)
                         + \phi_2 \left( x_{lbest,i} - x_i(t) \right) \right]    (3)

where \chi is the constriction coefficient, \phi_1 = c_1 r_1 and \phi_2 = c_2 r_2, c_1 and
c_2 are the acceleration constants, and r_1 and r_2 are random numbers between -1 and 1.
Usually, particles in a fluid exhibit Brownian motion [41,44]. Many studies involving
platelet simulation have modeled the motion of each platelet as Brownian motion
[16,28,29]. Brownian motion is the random walk of particles in a fluid, which is
closely related to the normal (or Gaussian) distribution. Therefore, in this study the
random numbers r_1 and r_2 in the velocity update equation were generated from a
Gaussian function with mean 0 and standard deviation 1. After the new nanorobot
velocity is calculated, the new position of each artificial platelet is updated as

    x_i(t+1) = x_i(t) + v_i(t+1).    (4)

The constriction coefficient (\chi) is used to control the exploration versus exploita-
tion trade-off in canonical PSO. A large constriction coefficient promotes exploration,
while a small constriction coefficient promotes exploitation. The constriction coef-
ficient has influence over the acceleration coefficients (c_1 and c_2) for both the
personal experience and social knowledge parts. The acceleration coefficients indicate
the level of confidence in personal experience and social knowledge contributed to the
velocity update. When c_1 > c_2, each particle trusts its own experience more than its
neighbors' experience. On the other hand, each particle trusts the neighbors' experience
more than its own when c_1 < c_2. The studies of Clerc and Kennedy [8,9] showed
that the convergence of the particle swarm to the optimal result is better when
\phi_1 + \phi_2 > 4. Hence, the limit of \phi_1 + \phi_2 is set to 4.1, and \chi is 0.729,
according to the analysis and suggestion in [9].
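Putting Eqs. (2)-(4) together, one update step for a single artificial platelet might look
as follows. This is an illustrative sketch, not the chapter's code: c_1 = c_2 = 2.05 is
an assumed split of the 4.1 limit, and the velocity clamp reflects the model assumption
of a defined maximum velocity. Because the personal best is stored relative to the
current position (Eq. 2), it is used directly in place of (x_{pbest,i} - x_i(t)).

```python
import random

CHI, C1, C2 = 0.729, 2.05, 2.05    # chi from [9]; c1 = c2 is our assumption

def step(x, v, pbest_rel, lbest, v_max):
    """One canonical-PSO step for an artificial platelet (Eqs. 3-4).

    pbest_rel is the personal best stored as an offset from x (Eq. 2),
    so it already equals the (x_pbest - x) attraction term.
    r1, r2 ~ N(0, 1) model Brownian-like motion, as in this chapter.
    """
    new_x, new_v = [], []
    for d in range(len(x)):
        r1, r2 = random.gauss(0, 1), random.gauss(0, 1)
        vd = CHI * (v[d] + C1 * r1 * pbest_rel[d]
                         + C2 * r2 * (lbest[d] - x[d]))
        vd = max(-v_max, min(v_max, vd))    # clamp to the defined maximum
        new_v.append(vd)
        new_x.append(x[d] + vd)             # Eq. (4)
    return new_x, new_v
```

With Gaussian rather than uniform random factors, the attraction terms can point away
from the targets on any given step, which is what gives the motion its Brownian flavor
inside the simulated bloodstream.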

3.2 Circulatory System Model


For the blood vessel repair task, the environment of the artificial platelets is the
bloodstream inside a blood vessel. As the main focus of this study was to investigate
the use of swarm intelligence as a control mechanism for nanorobots, we explored the
literature for an appropriate model of the circulatory system to implement in this study;
the development of an improved, more realistic model of the circulatory system or
blood vessel is out of the scope of this study.
Within the cardiovascular system, there is a network of blood vessels distributed in the
body to transport red blood cells, white blood cells, platelets, etc. throughout the body,
because every cell in the body needs oxygen and nutrients to perform its function nor-
mally. The heart pumps oxygenated blood from the aorta throughout the body through
arteries, arterioles and capillaries to deliver oxygen and nutrients to cells. Then, the
deoxygenated blood, after oxygen-carbon dioxide exchange and nutrient delivery, flows
back to the heart through venules, veins and the vena cava. These various types of blood
vessels, categorized by properties such as position and size, affect the blood velocity
profile differently. An arteriole, which is a small blood vessel that leads blood to the
capillaries where the exchange of nutrients and waste products between blood and
tissues takes place [38], was chosen for demonstration in this study. The arteriole is of
interest as it is the smallest and thinnest vessel in the arterial system; it needs to
handle relatively high pressure, so its thin wall has a greater chance of breaking [41].
The wall of blood vessels, excluding capillaries, consists of three layers [26]. The in-
tima, the innermost layer, is composed of endothelial cells that contain the blood
plasma inside the vessel and secrete various chemicals into the bloodstream. The me-
dia, the middle layer, provides strength and the ability to contract to the blood vessel.
Elastin in the media layer allows the blood vessel to expand to absorb the mechanical
energy from the heart pumping blood during systole, which in turn drives the blood
flow during diastole; collagen fibrils in the media layer prevent over-expansion of
the blood vessel. The outer layer, called the adventitia, allows the blood vessel to
loosely connect to the surrounding tissues. In the adventitia of most vessels, there
exist sympathetic fibers that can release vasoconstriction agents.
Although the elasticity of blood vessels is crucial for enabling and regulating blood flow in the blood vessel network of the body, we have not yet found an elastic model for blood vessels in the literature. To simplify the model, the blood vessel in this study is represented by a segment of rigid cylindrical tube. The ends of the tube are connected to simulate a closed system. Osmosis, the process of fluid passing through the membranes of blood vessels, and the ability of the vessel wall to contract and expand are disregarded. However, the blood flow inside the vessel is simulated as closely to recent findings in the literature as possible.
A fluid is called Newtonian if its viscosity is unaffected by the shear rate. When whole blood is considered, blood viscosity is non-Newtonian. This study investigated the performance of nanorobot control in both Newtonian and non-Newtonian models of blood flow in order to examine the effects of different flows on nanorobot control. The two models are described in the following subsections.
Modeling Nanorobot Control Using SI for Blood Vessel Repair: A Rigid-Tube Model 217

A Newtonian Model. In hemodynamics, blood flow exhibits three patterns: laminar flow, turbulent flow, and single-file or bolus flow [26]. Single-file flow occurs in small vessels such as capillaries. Turbulent flow occurs when blood flows with high velocity, as found in the ventricles and in stenosed arteries. In other cases, the flow in blood vessels, including normal arteries, arterioles, venules and veins, is laminar; the velocity profile of laminar flow is illustrated in Fig. 1. Consider a fluid as several thin layers (or laminae), each sliding past the others. Different layers move with different velocities. The velocity of the lamina in contact with the vessel wall is zero due to molecular cohesive forces. Because of the viscosity of the fluid, the adjacent layer is slowed down by this stationary layer [44]; each successive layer is similarly slowed down, but to a decreasing degree as the layer lies farther from the tube wall. Hence, the velocity of the blood at the center of the vessel is greater than that of blood near the wall.

Fig. 1. Velocity profile of laminar flow in a cylindrical tube

The blood flow along the tube is governed by Poiseuille's law. The amount of flow, $q$, for a given pressure difference, $\Delta p$, is

$$q = \frac{\pi R^4 \Delta p}{8 \mu \ell}, \qquad (5)$$

where $\mu$ is the viscosity of the fluid, $R$ is the tube radius, and $\ell$ is the length between the two points of the tube at which the pressure difference is measured [44]. When blood is simplified as a Newtonian fluid, the fluid viscosity, which represents the internal friction in the fluid, remains constant. Hence, the average velocity of the fluid, $\bar{u}$, is expressed as [44]

$$\bar{u} = \frac{q}{\pi R^2} = \frac{\Delta p\, R^2}{8 \mu \ell}. \qquad (6)$$

The velocity of the Poiseuille flow, or steady flow, at a cross-sectional location of the tube can be expressed as [44]

$$u_s(r) = -\frac{k_s}{4\mu}\left(r^2 - R^2\right), \qquad (7)$$
218 B. Kaewkamnerdpong et al.

where $r$ is the radial coordinate measured from the tube axis, and $k_s$ is the pressure gradient driving the flow. The pressure gradient is considered constant and equal to the pressure difference $\Delta p_s$ between two points of the tube divided by the length of tube $\ell$ between them,

$$k_s = \frac{dp}{dx} = -\frac{\Delta p_s}{\ell}. \qquad (8)$$

Therefore, the maximum velocity, $\hat{u}$, of the Poiseuille flow occurs at the centre of the tube; the maximum velocity is twice the average velocity, $\hat{u} = 2\bar{u}$.
As the heart pumps blood to circulate through the body, blood flows with a pulsatile motion according to the change of pressure. The driving pressure can be simulated with a periodic function of time in terms of the pressure gradient [27,30],

$$\frac{dp}{dx} = -A_0 - A_1 \cos(\omega t), \qquad (9)$$

where $A_0$ is the constant component of the pressure gradient, $A_1$ is the amplitude of the oscillating component that gives rise to the systolic and diastolic pressure, and $\omega = 2\pi f_p$, where $f_p$ is the pulse frequency. The velocity of the pulsatile flow is a function of radius, $r$, and time, $t$, that is [41,44]

$$u(r,t) = u_s(r) + u_\phi(r,t). \qquad (10)$$

The steady and oscillatory flows can be calculated separately. The steady flow velocity, $u_s(r)$, in a tube can be calculated by Eq. 7. The oscillatory flow velocity within the tube can be expressed as

$$u_\phi(r,t) = \frac{A_1}{i\rho\omega}\left[1 - \frac{J_0(\Lambda r)}{J_0(\Lambda R)}\right] e^{i\omega t}, \qquad (11)$$

where $\rho$ is the blood density, $J_0$ is the Bessel function of order zero of the first kind, and $\Lambda^2 = i^3\omega/\nu$, where $\nu$ is the kinematic viscosity.
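Eq. 11 can be evaluated numerically. The sketch below is our own construction: it approximates $J_0$ for complex arguments by its power series, which converges quickly for the small $|\Lambda r|$ values an arteriole-scale tube produces, and checks the no-slip condition at the wall using the Table 3 values:

```python
import numpy as np

def j0_series(z, terms=25):
    """Bessel J0 for complex z via the series sum_k (-1)^k (z/2)^(2k) / (k!)^2."""
    total = 1.0 + 0j
    term = 1.0 + 0j
    for k in range(1, terms):
        term = term * (-(z / 2.0) ** 2 / k ** 2)
        total = total + term
    return total

def oscillatory_velocity(r, t, R, A1, rho, omega, nu):
    """Real part of the oscillatory flow component, Eq. 11."""
    lam = np.sqrt(1j ** 3 * omega / nu)    # Lambda^2 = i^3 * omega / nu
    profile = 1.0 - j0_series(lam * np.asarray(r, complex)) / j0_series(lam * R)
    return (A1 / (1j * rho * omega) * profile * np.exp(1j * omega * t)).real

# Table 3 values: rho = 1050 kg/m^3, nu = 3.302e-6 m^2/s, f_p = 1 Hz, A1 = 4000
R, rho, nu, A1 = 15e-6, 1050.0, 3.302e-6, 4000.0
omega = 2.0 * np.pi * 1.0
u_wall = oscillatory_velocity(R, 0.0, R, A1, rho, omega, nu)   # ~0 (no slip)
```

A production code would use a library Bessel function instead of the truncated series; the series is only adequate here because $|\Lambda R|$ is small for this vessel size.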
In the blood vessels, a flow velocity higher than a critical velocity can cause turbulent flow, in which the flow velocity varies in direction rather than remaining laminar. Constriction of blood vessels decreases the lumen and can increase the probability of turbulent flow [18]. Nevertheless, as the blood vessel is modeled here as a rigid tube for simplicity, turbulent flow is excluded from the model.
When nanorobots move in the blood vessels, their movement is influenced by the blood flow. The new position of an artificial platelet in the simulation system becomes

$$x_i(t+1) = x_i(t) + v_i(t+1) + u_b(x_{z,i}(t), t), \qquad (12)$$

where $u_b(x_{z,i}(t), t)$ is the blood velocity at the $z$ position of the artificial platelet $x_i$ at time $t$. Note that the drag force, which acts on objects in fluid dynamics and affects the fluid velocity and flow direction, is ignored in this study. This is because the artificial platelets are much smaller and less numerous than other blood cells, so the drag force is negligible [16].
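Eq. 12 is a one-line position update; the sketch below (our own notation, with the blood velocity taken to act along the tube axis $z$, consistent with the model) illustrates it:

```python
import numpy as np

def update_position(x, v_new, u_blood_z):
    """Eq. 12: x(t+1) = x(t) + v(t+1) + blood velocity at the platelet's
    z position, with the blood flow acting along the tube axis (z)."""
    return x + v_new + np.array([0.0, 0.0, u_blood_z])

x = np.array([1.0, 2.0, 3.0])        # current position
v = np.array([0.1, 0.0, 0.2])        # PSO velocity for the next step
x_next = update_position(x, v, 0.5)  # the blood flow carries the platelet along z
```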
In the blood vessels, apart from the blood flow, the movement of future nanorobots can be affected by collisions with blood cells. In this model, a collision between nanorobots and blood cells is induced with probability p, which depends on the hematocrit value¹. To simulate a collision, the magnitude of the nanorobot velocity after being affected by the fluid flow is reduced by half, and the direction of the velocity after the collision is generated at random.
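The collision rule can be sketched as follows. This is our own minimal illustration: the chapter specifies only that the speed is halved and the direction randomized, so the RNG and the isotropic direction choice here are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def apply_collision(v, p_collision=0.4):
    """With probability p (tied to the 40% hematocrit), halve the speed
    and redirect the velocity in a uniformly random direction."""
    if rng.random() < p_collision:
        direction = rng.normal(size=v.shape)    # isotropic random direction
        direction /= np.linalg.norm(direction)
        return 0.5 * np.linalg.norm(v) * direction
    return v
```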
With all the influences from the blood flow, the process followed by the artificial platelets in this study is illustrated in Fig. 2.
A Non-Newtonian Model. In blood vessels larger than 100 μm, blood can be assumed to be a Newtonian fluid [35]; in a small blood vessel, blood behaves more like a non-Newtonian fluid. In a Newtonian fluid, the shear stress is linearly proportional to the shear rate, whereas in a non-Newtonian fluid the shear stress is nonlinearly proportional to the shear rate [7]. The viscosity of the blood changes with the vessel radius; the viscosity becomes lower in smaller tubes [26].
Non-Newtonian fluids can be divided into three types as follows [7]:
time-independent or purely viscous fluids, whose shear rate depends only on the current shear stress,
time-dependent fluids, in which the relation between shear rate and shear stress further depends on the history and duration of shearing and on the kinematics, and
visco-elastic fluids, which show both viscous fluid behavior and elastic solid-like behavior.
As the heart pumps, blood exhibits an oscillatory flow, which depends on time [6]. The mathematical analysis can be separated into a time-dependent part due to the heart pumping and a time-independent part due to the relation of shear stress and shear rate. There are three types of time-independent fluid [7]:
shear-thinning or pseudoplastic fluids, whose viscosity decreases as the shear rate increases,
visco-plastic fluids, with or without shear-thinning behavior, in which the shear stress must exceed the yield stress for the fluid to flow, and
shear-thickening or dilatant fluids, whose viscosity increases as the shear rate increases.
A visco-plastic fluid shows different behaviors depending on the shear stress [7]. When the shear stress is less than the yield stress, the fluid behaves like an elastic solid. However, when the shear stress exceeds the yield stress, the fluid can show various behaviors, such as Newtonian characteristics or shear-thinning behavior. Blood has been found to be categorizable as a yield-stress fluid [7].
In the literature, visco-plastic fluids are described by three models [7]:
the Bingham plastic fluid model, for fluids with a linear flow curve once the shear stress exceeds the yield stress,
the Herschel-Bulkley fluid model (H-B model), which represents a yield-pseudoplastic fluid exhibiting shear-thinning behavior when the shear stress exceeds the yield stress, and
the Casson model, for other fluids with a steady shear stress or shear rate.

¹ Hematocrit is the percentage of red blood cells in a blood sample [38].



[Flowchart of Fig. 2, rendered here as text: Start → initialize all parameters and variables → randomly initialize artificial platelet positions and velocities → check the simulation time for initiating a response → update the artificial platelet velocity in the z direction due to the blood velocity → observe the environment and the attraction signal → calculate fitness values at the current position → compare and update the record of the personal best position → observe neighboring artificial platelets and update the local best position → update the new velocity according to the situation → update the new position, handling collision with the vessel wall and the attraction force when a nearby optimal artificial platelet is found → test for termination → Stop.]

Fig. 2. The flowchart for artificial platelets



The average diameter of an arteriole is 30 μm [6], and Iida [19] reported that the velocity profiles of blood flowing in arterioles of diameter less than 0.1 mm can generally be described by both the Casson and Herschel-Bulkley (H-B) fluid models, while the velocity profiles in arterioles with diameters less than 0.065 mm can only be described by the H-B fluid model. Hence, the H-B model is the appropriate model here.
In the Herschel-Bulkley model, the shear rate, $\dot{\gamma}$, can be expressed as [7]

$$\dot{\gamma} = \begin{cases} 0, & \text{if } 0 \le r \le R_p; \\ -\dfrac{dV}{dr}, & \text{if } r > R_p; \end{cases} \qquad (13)$$

where $V$ is the blood velocity, $r$ is the radial coordinate position in the vessel, and $R_p$ is the plug core radius, which can be computed by

$$R_p = R\left(\frac{\tau_y}{\tau_w}\right). \qquad (14)$$

The shear stress, $\tau$, can be expressed as

$$\tau = \tau_y + m\dot{\gamma}^{\,n}, \qquad (15)$$

where $\tau_y$ is the yield stress, $m$ is the consistency index, and $n$ is the power-law index, with $n < 1$ for shear-thinning behavior. In addition, the shear stress can be represented as a function of the pressure gradient,

$$\tau = -\left(\frac{r}{2}\right)\frac{dP}{dx}, \qquad (16)$$

where $dP/dx$ is the pressure gradient. The shear stress at the wall can be computed by substituting $r = R$ in Eq. 16,

$$\tau_w = -\left(\frac{R}{2}\right)\frac{dP}{dx}. \qquad (17)$$
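With the numbers used later in Sect. 4.2 (R = 15 μm, yield stress 0.001 Pa, and a pressure-gradient magnitude of 20400), Eqs. 14 and 17 imply a very small plug core; a quick check (our own computation):

```python
R = 15e-6              # vessel inner radius (m)
dPdx = -20400.0        # pressure gradient dP/dx (Pa/m), negative in the flow direction
tau_y = 0.001          # yield stress (Pa)

tau_w = -(R / 2.0) * dPdx       # wall shear stress, Eq. 17
R_p = R * tau_y / tau_w         # plug-core radius, Eq. 14
# tau_w = 0.153 Pa, so R_p is below 0.1 micrometre, under 1% of the vessel radius
```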
The apparent viscosity can be expressed as

$$\eta = \frac{d\tau}{d\dot{\gamma}}. \qquad (18)$$
The velocity can be computed from Eq. 13 and Eqs. 15-17, followed by integration with respect to $r$, to obtain the steady flow velocity,

$$V_s(r) = \frac{nR}{n+1}\left(\frac{\tau_w}{m}\right)^{1/n}\left[\left(1 - \frac{\tau_y}{\tau_w}\right)^{\frac{n+1}{n}} - \left(\frac{r}{R} - \frac{\tau_y}{\tau_w}\right)^{\frac{n+1}{n}}\right]. \qquad (19)$$

When $r \le R_p$, i.e. in the plug core region, the velocity is constant and equal to $V_s$ at $R_p$. Hence, the plug core velocity is computed by substituting $r = R_p$ and using Eq. 14 in Eq. 19,

$$V_p = \frac{nR}{n+1}\left(\frac{\tau_w}{m}\right)^{1/n}\left(1 - \frac{\tau_y}{\tau_w}\right)^{\frac{n+1}{n}}. \qquad (20)$$
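Eqs. 13-20 combine into a piecewise steady profile. The sketch below is our own implementation using the H-B constants given later in Sect. 4.2 (n = 0.5, yield stress 0.001 Pa, m = 0.0019, with the wall shear stress from Eq. 17):

```python
import numpy as np

def hb_velocity(r, R, tau_w, tau_y, m, n):
    """Steady Herschel-Bulkley profile: Eq. 19 outside the plug core and
    the constant plug velocity of Eq. 20 for r <= R_p (Eq. 14)."""
    Rp = R * tau_y / tau_w                      # plug-core radius, Eq. 14
    a = n * R / (n + 1) * (tau_w / m) ** (1.0 / n)
    ratio = tau_y / tau_w
    r = np.asarray(r, dtype=float)
    core = np.clip(r / R - ratio, 0.0, None)    # zero inside the plug core
    v = a * ((1 - ratio) ** ((n + 1) / n) - core ** ((n + 1) / n))
    plug = a * (1 - ratio) ** ((n + 1) / n)     # plug-core velocity, Eq. 20
    return np.where(r <= Rp, plug, v)

R = 15e-6                                       # vessel inner radius (m)
tau_w = (R / 2.0) * 20400.0                     # wall shear stress, Eq. 17
V = hb_velocity(np.linspace(0.0, R, 50), R, tau_w, 0.001, 0.0019, 0.5)
# V is flat in the plug core, decreases toward the wall, and vanishes at r = R
```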

Comparing the velocity profiles of Newtonian and non-Newtonian fluids, the profile of the non-Newtonian fluid is flatter at the center of the flow due to the constant velocity in the plug core region, as illustrated in Fig. 3. In the plug core region, the blood velocity is greatest. The oscillatory velocity can be expressed as [41]

$$V_\phi(r,t) = \frac{A_1}{i\rho\omega}\left[1 - \frac{J_0(\Lambda r)}{J_0(\Lambda R)}\right] e^{i\omega t}. \qquad (21)$$

The total velocity is the sum of the steady flow velocity and the oscillatory flow velocity,

$$V(r,t) = V_s(r) + V_\phi(r,t). \qquad (22)$$

Fig. 3. The velocity profile of (a) Newtonian and (b) non-Newtonian fluid

4 The Demonstration

To investigate the performance of a swarm-intelligence-based system of nanorobots operating as artificial platelets to detect and repair injured blood vessels, the models of the nanorobots and the circulatory system described in Sect. 3 were implemented with the parameter settings in Table 3. These parameters were set according to the physiological information in [18,40]. The vessel length is 500 μm, ranging from −250 μm to 250 μm. The wound is modeled as a cylinder with a 5 μm radius. The wound center is located at [15, 270, 0] in cylindrical coordinates (r, θ, z). The vWF molecules are released from the center of the wound. The concentrations of vWF are determined by Eq. 1. The simulation time step is 0.001 second.
In the artificial platelet system, the number of artificial platelets was fixed at 142; this was set according to the normal platelet concentration in proportion to the size of the blood vessel in this simulation model. The size of the artificial platelets is assumed to be 2.0 μm in diameter, similar to the clottocytes of Freitas [32] and to natural platelets. Each artificial platelet can connect with 8 other artificial platelets in order to form a mass, as illustrated in Fig. 4. The attraction force with which an optimal artificial platelet pulls another artificial platelet to connect applies when the distance between them is less than 0.2 μm. The probability that artificial platelets collide with other blood cells is set to 0.4, corresponding to the 40% hematocrit. Using the canonical PSO algorithm to control nanorobot locomotion, the parameters controlling the behavior of the nanorobots include the constriction coefficient

Table 3. The parameter values of an arteriole as a rigid tube model

Parameter                  Value
Vessel thickness           20 μm
Vessel outer radius        35 μm
Vessel inner radius        15 μm
Endothelium layer          3 μm
Vessel length              500 μm
Pulse frequency, f_p       1 Hz
Blood density, ρ           1050 kg/m³
Dynamic viscosity, μ       0.00356 Pa·s
Kinematic viscosity, ν     3.302×10⁻⁶ m²/s
Pressure gradient          −20000 − 4000 cos(ωt)
Hematocrit                 40%

Fig. 4. Illustration of a nanorobot connecting with 8 other individuals; the nanorobot is presented in green, whereas the individuals connecting to it are presented in black

and the acceleration constants. In this study, the constriction coefficient, χ, was set according to the suggestion in [9].
As such nanorobots or artificial platelets have not been realized yet, it could be beneficial to study what level of capability early-version nanorobots would need to accomplish their tasks. Three parameters relate to this capability and affect the effectiveness and efficiency of artificial platelet control: the perception range, the maximum velocity (V MAX) and the response time. The perception range indicates the area within which an artificial platelet can interact with other individuals and its environment; it would be determined by the operating range of the signaling and sensing units in the artificial platelets. V MAX is the maximum velocity at which the artificial platelets are allowed to travel. In real artificial platelets, this value is defined by the actuator's ability. The larger the value, the faster the artificial platelets can move. However, how fast an artificial platelet can travel also depends on its response time to the control mechanism. In this study, the effects of all three parameters were investigated in both

the Newtonian and non-Newtonian models. The parameter settings for all three test parameters are summarized in Table 4.
The performance of self-assembly in blood vessel repair is indicated in terms of accuracy and efficiency by observing the percentage of wound coverage and the speed with which the goal is achieved, respectively. The resulting percentage of wound coverage is determined by Monte Carlo simulation. The wound coverage is the ratio of the number of test points occupied by optimal artificial platelets to the total number of randomly selected points in the wound area. Greater wound coverage indicates better self-assembly performance. The wound coverage is reported as the median value over a number of trials of the experiment. The median values were chosen to represent the results because the control mechanism uses randomness, so it is more important to investigate how reliably the control mechanism allows the nanorobots to achieve their task. The speed of the artificial platelet system is represented by the number of iterations used to form a mass; the lower the number of iterations used, the greater the speed of self-assembly.
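The Monte Carlo coverage estimate can be sketched as below. This is our own simplified stand-in: it treats the wound patch as a planar disc and an "optimal" platelet as covering a small disc of its own; the names and radii are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)

def wound_coverage(platelet_xy, wound_radius=5.0, n_samples=10_000,
                   platelet_radius=1.0):
    """Fraction (%) of random test points in the wound disc that lie
    within platelet_radius of some adhered platelet."""
    # uniform samples over a disc of radius wound_radius
    theta = rng.uniform(0.0, 2.0 * np.pi, n_samples)
    rad = wound_radius * np.sqrt(rng.uniform(0.0, 1.0, n_samples))
    pts = np.column_stack([rad * np.cos(theta), rad * np.sin(theta)])
    # distance from each test point to its nearest platelet
    d = np.linalg.norm(pts[:, None, :] - platelet_xy[None, :, :], axis=2)
    return (d.min(axis=1) <= platelet_radius).mean() * 100.0
```

A platelet layout that tiles the whole patch yields 100% coverage, while a single platelet at the wound centre covers only a few percent of the disc.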

4.1 Artificial Platelets in Newtonian Model


In the Newtonian model, artificial platelets operate with dynamics influenced by the blood flow. In order to control the artificial platelets effectively, the acceleration constants in the canonical PSO algorithm were set to accommodate the characteristics of the artificial platelets. Because each artificial platelet has no knowledge of its current location in the search space and cannot exchange performance information with others, it cannot keep track of an accurate personal best position or obtain the actual local best position. Each artificial platelet may have the same level of confidence in its own experience and in its neighbors' experience, so c1 and c2 are both set to 2.05. When neither neighbors nor optimal artificial platelets are found, the local best position of an artificial platelet becomes its current position; hence, there is no influence from neighbor experience. However, if an artificial platelet finds an optimal one that has already adhered at the wound site, it should have great confidence in that optimal artificial platelet, and its velocity may be reduced to prevent it from overstepping the wound; therefore, the constriction coefficient is reduced to 0.25. In this case, the influence of its own experience can be ignored, so c1 is set to 0. Table 5 shows the settings of the χ, c1 and c2 values for the different situations.
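The situation-dependent canonical PSO update can be sketched as follows (our own minimal illustration of the Table 5 settings; the situation labels are ours):

```python
import numpy as np

rng = np.random.default_rng(2)

# (chi, c1, c2) per situation, as in Table 5
PARAMS = {
    "nothing_found":  (0.729, 2.050, 2.050),
    "neighbors_only": (0.729, 2.050, 2.050),
    "optimal_found":  (0.250, 0.000, 0.000),
}

def update_velocity(v, x, pbest, lbest, situation):
    """Canonical PSO velocity update with constriction coefficient chi."""
    chi, c1, c2 = PARAMS[situation]
    r1, r2 = rng.random(x.shape), rng.random(x.shape)
    return chi * (v + c1 * r1 * (pbest - x) + c2 * r2 * (lbest - x))
```

With "optimal_found" both acceleration terms vanish, so the update reduces to damping the current velocity by χ = 0.25, matching the intent of preventing the platelet from overstepping the wound.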
Each experiment was run ten times. The initial position of each artificial platelet is set to a random location in the blood vessel model. The velocity of each artificial platelet is randomly initialized between −V MAX and +V MAX. At initialization, the personal best position of each artificial platelet is its initial position. The system terminates either when it reaches the maximum of 60,000 iterations, or 60 seconds of simulated time, to ensure sufficient time for the artificial platelets to find the wound and form into a structure, or when the wound coverage exceeds 80%, to ensure that the structure of artificial platelets covers most of the wound.
In order to examine how effective the performance of artificial platelets using canonical PSO would be, the results should be compared with artificial platelets without a collective control mechanism for locomotion, rather like natural platelets. As in many studies of platelet simulation, the motion of each platelet can be simulated by

Table 4. The settings of the test parameters: the perception range (PRANGE), maximum velocity (V MAX) and response time (NRTime)

Test Parameter   PRANGE* (μm)               V MAX**               NRTime (second)
PRANGE           1.875, 3.75, 7.5, 15, 30   1                     0.01
V MAX            7.5                        1/4, 1/2, 1, 2.5, 5   0.01
NRTime           7.5                        1                     0.01, 0.002, 0.001

* PRANGE is set to 1/8, 1/4, 1/2, 1 and 2 times the blood vessel radius.
** V MAX is set as a multiple of the maximum blood velocity.

Table 5. Parameter settings of χ, c1 and c2 according to different situations

Situation                                                          χ      c1     c2
Neither neighbors nor optimal artificial platelets are found       0.729  2.050  2.050
Neighbors are found, but no optimal artificial platelet is found   0.729  2.050  2.050
Optimal artificial platelets are found                             0.250  0.000  0.000

Brownian motion [16,29]. Brownian motion is the random walk of particles in a fluid. Thus, the motion of artificial platelets without a control mechanism is randomized with a Gaussian distribution. Both the system with artificial platelets using canonical PSO to control their operation and the system with artificial platelets using random movement have the attraction signaling and sensing unit and the other essential characteristics, but the movement in the latter case is completely random even when the attraction signal is found. However, the attraction force still applies when an artificial platelet is very close to an optimal artificial platelet in the system with random movement.
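The random-movement baseline can be sketched as a Gaussian step, clipped to the V MAX bound (our own construction; the choice of σ = vmax/3 is an assumption, not taken from the chapter):

```python
import numpy as np

rng = np.random.default_rng(3)

def random_step(x, vmax):
    """Brownian-motion-like baseline: a Gaussian random step,
    clipped component-wise to [-vmax, vmax]."""
    step = rng.normal(0.0, vmax / 3.0, size=x.shape)
    return x + np.clip(step, -vmax, vmax)
```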

Varying Perception Range. In the case where the perception range of the artificial platelets is varied, the results are illustrated in Fig. 5-a and 5-d. As the perception range increases, the level of wound coverage increases. When the perception range of the artificial platelets is 7.5, 15 or 30 μm, they reach 80% wound coverage.

Fig. 5. The comparison of the results from random movement and Canonical PSO: (a) the median
wound coverage for the variation of PRANGE, (b) V MAX and (c) NRTime; (d) the mean number
of iterations used for different settings of PRANGE, (e) V MAX and (f) NRTime

When the perception range is 1.875 μm, the artificial platelets cannot form the structure, or only a few artificial platelets adhere at the wound site. This may be because the artificial platelets can only observe the environment close to themselves and hence have a low chance of finding the wound site. In addition, they may not sense neighboring artificial platelets, so the velocity modification is influenced only by their own experience.
In terms of the speed to achieve the goal, the speed with which the artificial platelets form into a structure at the wound site increases as the perception range increases. This may be because, when they can observe a larger area of the environment, there is a greater chance that they will find the wound site as well as the attraction signal from optimal artificial platelets; consequently, they can move toward the wound site and form into a structure faster. In addition, a greater number of neighboring artificial platelets may be observed when the perception range is larger. This may lead artificial platelets to find the wound site, as the canonical PSO control mechanism for finding a good solution is influenced by their own experience and their neighbors' experiences.
At different perception ranges, the same level of wound coverage and the same speed of forming a structure were obtained for the system with random movement. This is because changing the perception range has no effect on its movements. The result indicates that the capability of the signaling and sensing units has no effect on the performance of the artificial platelet swarm system with random movement.

Varying Maximum Velocity. The results are illustrated in Fig. 5-b and 5-e. When the V MAX of the artificial platelets is 0.9416 μm, the median wound coverage percentage is the lowest at 11.32%. Because the artificial platelets can only move in small steps, they move to the wound site slowly. Under the influence of the blood flow, they are unable to move in the direction opposite to the blood flow, so they cannot reach a wound site located behind their current position. In all other cases, the systems reached 80% wound coverage with the same levels of median wound coverage. The speed of forming into a mass increases as the V MAX value increases, because a greater V MAX allows an artificial platelet to move in larger steps, so it can move faster and can move against the blood flow. Nevertheless, larger-step movement could lead to more collisions with the vessel wall or could overstep the wound. On the other hand, artificial platelets with a lower V MAX could not move against the blood flow; consequently, they could only move along with it, though the blood flow sometimes led a few artificial platelets to the damage site. In conclusion, the results indicate that a greater maximum velocity of the artificial platelets brings about better performance of the artificial platelet swarm in terms of both wound coverage and the speed of forming into a mass.
When the V MAX of the artificial platelets with random movement is 0.9416 or 1.8832 μm, the median wound coverage percentage is lower than 20%, because the artificial platelets move in steps so small that they move slowly or cannot reach the target area. The same levels of wound coverage are obtained when the V MAX of the artificial platelets is greater than 1.8832 μm. Although the artificial platelets move randomly

within the blood vessel, their movement steps are limited by their V MAX values. The speed of forming into a mass increases as V MAX increases, because a greater V MAX allows the artificial platelets to take larger steps, so they can move faster and can move against the blood flow. Moreover, the chance of finding the wound increases as well. The effects of the different V MAX values on the performance of the artificial platelet system with random movement are similar to those on the system with the canonical PSO-based control mechanism, for the same reasons.

Varying Response Time. When the response time of the artificial platelets is varied, the level of resulting wound coverage increases as the response time decreases (i.e., as the response speed of the artificial platelets increases). The results illustrated in Fig. 5-c and 5-f show an improving trend in the speed of forming into a mass as the response speed of the artificial platelets increases. Artificial platelets operating in an extremely dynamic environment need to readjust their positions quickly according to changes in the environment. With better response ability, artificial platelets can cope with the changing environment and achieve the designated task. In addition, this gives the artificial platelets a greater chance of finding the wound site and, in turn, more artificial platelets form into a structure at the wound site. Otherwise, the artificial platelets attempt to travel toward the desired position calculated by the control mechanism but may not be able to reach it due to the external influence of the blood flow. When the response time of the artificial platelets is greater than the simulation time step, the movement of the artificial platelets depends on the blood flow more than on the control mechanism; the blood flow can lead an artificial platelet either closer to or away from the wound site. Hence, a greater response speed allows the artificial platelets to perform their task with better performance in terms of both wound coverage and the speed of forming into the structure.
The same levels of wound coverage are also obtained when the artificial platelets with random movement have different response time settings. Nevertheless, the different response speeds of the artificial platelets did impact the speed of forming a structure; the speed of forming into a mass increases as the response speed increases, similar to the system with the canonical PSO-based control mechanism. This may be because artificial platelets with a faster response speed can exhibit more random movement in the same time than artificial platelets with a lower response speed. Hence, better response ability gives the artificial platelets a greater chance of moving to the desired position faster.

4.2 Artificial Platelets in Non-Newtonian Model

Although, compared to the system with random movement, the system with the canonical PSO-based control mechanism in the Newtonian model showed better results, as illustrated in Fig. 5, this does not yet assure that the artificial platelets would effectively complete their tasks in a real application. In computer simulation studies, the realism of the simulation is one of the main concerns. The closer the simulation model is to the real environment and situation, the greater the chance that the results of the study can be applied to the real application. In a small blood vessel,

a non-Newtonian model can simulate the blood flow better than a Newtonian model [7]. The region of high velocity in non-Newtonian blood is larger than in Newtonian blood. As artificial platelets under the influence of high blood velocity might move forward too fast and overstep the wound, it can be anticipated that artificial platelets in non-Newtonian blood have a higher probability of overstepping the wound and needing to move against the blood flow back to it.
For the time-independent part of the blood behavior, the H-B model is used to simulate the non-Newtonian blood flow. The flow is assumed to be steady, so the flow rate does not change over time. Additionally, the flow is assumed to be fully developed, which means that the velocity profile is stable. Normally, blood in human vessels has a Reynolds number less than 300, except in the aorta, so only laminar flow is used in this simulation. At the wall, the no-slip condition applies, which means that the velocity at the wall is zero. For the H-B model, some parameters are constant: n is set to 0.5, the yield stress is 0.001 Pa [6], the shear stress at the wall is computed from Eq. 17 with a pressure gradient of 20400, and m = 0.0019 is obtained from Eq. 15.
As illustrated in Fig. 3, the velocity profile of the non-Newtonian fluid exhibits a wider region of high velocity around the plug core region than that of the Newtonian fluid. Hence, in this experiment, the parameter settings for the canonical PSO-based control mechanism of the artificial platelets are set to cope with the non-Newtonian fluid, as summarized in Tables 6 and 7. Two settings for c1 and c2 are used for the purpose of comparison. First, in PSO1, the values of c1 and c2 are both set to 2.050, giving as much confidence to an artificial platelet's own knowledge as to social knowledge in all situations. Second, in PSO2, the values of c1 and c2 are set differently to balance the influence of the artificial platelets' own knowledge and social knowledge according to the state of the artificial platelets. Each experiment was run 40 times.
Fig. 6 illustrates the resulting wound coverage and number of iterations, respectively, in comparison with those from random movement. The results show that a higher PRANGE value gives the artificial platelets a greater chance to meet and get informa-
Table 6. Parameter settings of χ, c1 and c2 according to different situations for artificial platelets in the non-Newtonian model: PSO1

Situation                                                          χ      c1     c2
Neither neighbors nor optimal artificial platelets are found       0.729  2.050  2.050
Neighbors are found, but no optimal artificial platelet is found   0.729  2.050  2.050
Optimal artificial platelets are found                             0.729  2.050  2.050

Table 7. Parameter settings of χ, c1 and c2 according to different situations for artificial platelets in the non-Newtonian model: PSO2

Situation                                                          χ      c1     c2
Neither neighbors nor optimal artificial platelets are found       0.729  2.733  1.367
Neighbors are found, but no optimal artificial platelet is found   0.729  2.050  2.050
Optimal artificial platelets are found                             0.729  1.025  3.075

tion from neighboring artificial platelets as well as from the environment. With a low
PRANGE value, an artificial platelet may move past the wound even when it gets near,
because the wound does not fall within its perception range. For the maximum velocity,
a greater value allows artificial platelets to move more freely against the influence of
blood flow. On the other hand, a greater value could also let artificial platelets move
too fast along the blood flow and overstep the wound site; this can be observed from
the result that V MAX = 60 performs worse than V MAX = 30. Finally, the response
time denotes how fast the artificial platelets can respond to the situation in a dynamic
environment; usually, a smaller value should provide better performance. Nevertheless,
the results show that this is not necessarily true, as the case with NRTime = 0.002 gives
a better result than the one with NRTime = 0.001. It is observed that faster-responding
artificial platelets move more slowly in the environment, so the speed at which they
find the wound site is lower. On the other hand, when NRTime = 0.01 the artificial
platelets respond too slowly under the influence of blood flow. In most trials, the
artificial platelets could not fill the wound within 10,000 iterations.
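The roles of V MAX and PRANGE described above amount to two simple checks: clamp the speed to the maximum velocity, and test whether the wound lies inside the perception range. The function names and the vector representation here are illustrative assumptions, not code from the study.

```python
import math

def clamp_speed(velocity, vmax):
    """Scale the velocity vector down so its magnitude never exceeds vmax."""
    speed = math.hypot(*velocity)
    if speed <= vmax:
        return list(velocity)
    return [c * vmax / speed for c in velocity]

def senses_wound(position, wound, prange):
    """True if the wound centre falls within the platelet's perception range."""
    return math.dist(position, wound) <= prange
```

A platelet whose clamped step is longer than the diameter of its perception disc can jump clean over the wound, which is the oversteping behaviour observed at V MAX = 60.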
With the PSO-based control mechanism, the results show that the artificial platelets
using PSO1 and PSO2 can fill the wound faster than those that move randomly. In
terms of wound coverage, the system using PSO-based control performs slightly better.
Between the two PSO-based control mechanisms, PSO2 performs slightly better, as its
median wound coverage is a little higher than that of PSO1 in many experiments; this
also indicates that PSO2 gives the artificial platelets a greater chance to fill the wound.
The difference is more distinctly observed in terms of self-assembly speed: PSO2
results in faster self-assembly in several experiments. This indicates that balancing the
influence of individual and social knowledge according to the corresponding situation
can contribute to the self-assembly performance.
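Since each configuration was run 40 times, the quantities compared in Fig. 6 reduce to simple order statistics over the trials: the median of the wound coverage and the mean of the iteration counts. A minimal sketch (the tuple layout is an assumption for illustration):

```python
from statistics import mean, median

def summarize(trials):
    """Median wound coverage and mean iteration count over repeated runs.

    trials: list of (wound_coverage, iterations) pairs, one per run.
    """
    coverage, iterations = zip(*trials)
    return {"median_coverage": median(coverage),
            "mean_iterations": mean(iterations)}
```

The median is the natural choice for coverage because a few stalled runs would otherwise drag the mean down disproportionately.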

5 Analysis of Control Mechanism Toward the Realization of Future Nanorobots

This study investigated the possibility of using a swarm intelligence technique to
adaptively control a swarm system of nanorobots for medical applications. The
performance of swarm-intelligence-based nanorobot control was demonstrated in a
blood vessel repair application. After exploring different swarm intelligence techniques,
the canonical PSO algorithm was selected for the demonstration, because its control
concept fits the characteristics and tasks of early-stage future nanorobots better than
other techniques. Moreover, the control parameters of PSO also correspond to the
characteristics of early-stage nanorobots.
Through the demonstration, the effects of different artificial platelet capabilities,
including perception range, maximum velocity and response time, under canonical
PSO control were investigated in both Newtonian and non-Newtonian models of blood
flow. The results indicated that if early-stage nanorobots possess these capabilities and
use the control mechanism to adaptively adjust their locomotion based on partial
influences from their neighbors, they can effectively achieve their tasks as artificial
platelets. From varying the principal parameters of the swarm of artificial platelets
under canonical PSO-based control and comparing the performance with artificial platelets
Modeling Nanorobot Control Using SI for Blood Vessel Repair: A Rigid-Tube Model 231

Fig. 6. The comparison of experimental results between random movement, PSO1 and PSO2 in
non-Newtonian model: (a) the median wound coverage for the variation of PRANGE, (b) V MAX
and (c) NRTime; (d) the mean number of iterations used for different settings of PRANGE, (e)
V MAX and (f) NRTime.

randomly moving in a self-assembly application such as blood vessel repair, the results
from the Newtonian model can be summarized as follows:
A greater perception range increases the chance for artificial platelets under canonical
PSO-based control, which rely on their own and their neighbors' experience to guide
their movement, to find the target site. However, different levels of perception ability
have no impact on the motion of artificial platelets without control, because their
movements are random.
A greater maximum velocity, which allows the artificial platelets to take larger steps
and to move against the direction of the blood flow, increases the chance of finding
the target site.
A greater response ability of the artificial platelets provides better adjustment of their
position to the situation, both in the system of artificial platelets with canonical PSO
control and in the system of artificial platelets with random movement.
From the demonstration, the results from the non-Newtonian model exhibited a similar
trend, but the effects of maximum velocity and response time are distinct. The
maximum velocity had to be set to greater values in order to obtain the same trend as
in the Newtonian model. As the non-Newtonian environment produces a wider region
of high blood velocity inside the vessel, a higher maximum velocity setting may be
required to cope with the greater blood-flow influence. However, high velocity could
also induce overly fast movement, so that the artificial platelets overstep the wound.
In terms of response time, the artificial platelets may not need to respond so fast, as
the high-velocity area is in the middle of the vessel while their target wound is always
at the vessel wall, where the velocity is lower.
The natural platelets form into a platelet plug through the following activities: a
non-activated platelet becomes activated when it senses vWF and then adheres to the
exposed collagen fiber at the wound. The activated platelets synthesize and release
chemicals that activate nearby platelets to aggregate and form into a plug. Activated
platelets can link to each other before connecting with the adhered platelets at the
wound. Natural platelets can connect with one another in all directions because they
have many receptors on their surface [40]. Nevertheless, the artificial platelets in this
study have only 8 connectors. They are designed to connect only with other artificial
platelets adhered at the wound site. Whenever an artificial platelet comes within a very
close distance of the wound, it immediately adheres to the exposed collagen fiber and
then releases an attraction signal, which draws other artificial platelets to come nearer
and connect to form a structure covering the wound. Examples of the self-assembly
of artificial platelets into a mass at the wound site are illustrated in Fig. 7.
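The adhesion behaviour just described can be viewed as a small state machine per artificial platelet. The states, transition conditions and function below are an illustrative abstraction of the rules in the text, not code from the study.

```python
from enum import Enum, auto

class State(Enum):
    FREE = auto()       # circulating, not yet part of the plug
    ADHERED = auto()    # adhered to the exposed collagen fiber at the wound
    CONNECTED = auto()  # linked to an adhered platelet through a connector

MAX_CONNECTORS = 8  # the artificial platelets in this study have 8 connectors

def step(state, near_wound, attraction_sensed, connector_available):
    """One behavioural transition of an artificial platelet."""
    if state is State.FREE:
        if near_wound:
            return State.ADHERED     # adhere, then emit the attraction signal
        if attraction_sensed and connector_available:
            return State.CONNECTED   # join the structure covering the wound
    return state                     # adhered/connected platelets stay put
```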
Through simulation, the essential characteristics of nanorobots with a canonical-PSO-based
control mechanism for the self-assembly task were investigated; it was found that future
nanorobots with these characteristics could plausibly carry out self-assembly tasks in
medical applications under swarm-intelligence-based control. The experimental results
suggested that early-stage future nanorobots could perform the self-assembly task in
blood vessel repair when they have a perception range of around the diameter of the
blood vessel and a maximum velocity of about the average blood velocity inside the
vessel.

Fig. 7. Examples of the output structures at the wound site for repairing the blood vessel from
the artificial platelet system

Although the artificial platelets in this study have no knowledge of their current
locations in the environment and cannot share information with their neighbors,
artificial platelets equipped with an attraction signaling and sensing unit and the other
essential characteristics, controlled by canonical PSO, can achieve their goal. Moreover,
the movements of artificial platelets operating in a dynamic environment such as the
circulatory system are influenced by the blood flow in the vessel. In this study, the
artificial platelets with the essential characteristics identified in Section 3.1 cannot
perceive the changes in their movement caused by the blood flow. Hence, the canonical
PSO algorithm calculates a new movement, but the artificial platelets cannot actually
travel toward the intended location because of the external influence of the blood flow.
The information from the artificial platelets' personal best position, which is calculated
by accumulating their movement from the previous personal best position, can
therefore be misleading. Nevertheless, the results indicated that PSO-based control
allows the artificial platelets to perform better than random movement even with
misleading information.
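Why the accumulated personal best can mislead is easy to see in a one-dimensional toy sketch: the platelet integrates only its intended moves, while the unsensed flow drift displaces it at every step. The numbers and function names here are illustrative only.

```python
def believed_position(start, intended_moves):
    """Dead-reckoned position: the platelet sums only its intended moves."""
    x = start
    for m in intended_moves:
        x += m
    return x

def actual_position(start, intended_moves, drift_per_step):
    """True position: every step is perturbed by the unsensed blood flow."""
    x = start
    for m in intended_moves:
        x += m + drift_per_step
    return x

# After three unit steps against a drift of 0.5 per step, the platelet
# believes it is at 3.0 but is actually at 4.5, so a personal best recorded
# at the believed position no longer corresponds to any real location.
```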
Nonetheless, if nanorobots have knowledge of their current position in the problem
space, an accurate personal best position can be obtained. The ability to sense external
changes in the environment may allow nanorobots that know their coordinates to
achieve their goal with better performance, as they can obtain the accurate personal
best position and will be able to move effectively to the desired position according to
canonical PSO. This may allow nanorobots to perform self-assembly tasks at a higher
performance level and may allow them to perform more
complex tasks than the self-assembly task in this study. Another additional characteristic
that could be useful for nanorobots is the ability to communicate with other nanorobots
in the system. If nanorobots can communicate and share some information about their
performance with their neighbors, they can know whether their neighbors are in better
or worse positions, which enables collective behavior in the nanorobot swarm; this
could lead to improved system performance and the ability to perform more complex
tasks.
The findings in this study can serve as guidelines for the characteristics and behaviors
required of early-stage nanorobots toward the realization of medical nanorobots in the
near future. Nevertheless, this simulation study is based on a rigid-tube model owing
to our limited knowledge of the circulatory system. In the future, a study on an
elastic-tube model, which is a more realistic simulation of a blood vessel, should be
carried out. Moreover, when evidence of plausible additional characteristics appears,
a further simulation study should be conducted to identify suitable, effective control
mechanisms for more complex applications. This study on using swarm intelligence
for nanorobot locomotion control represents an attempt to move toward the realization
of nanorobots from the computer scientist's point of view. With collaboration across
research communities, it can be anticipated that nanorobots will not remain a vision
but could become real, with truly effective benefits in nanomedicine in the near future.

References

1. Berne, R.M., Levy, M.N., Koeppen, B.M., Stanton, B.A.: Physiology, 5th edn. Elsevier
(2004)
2. Bonabeau, E., Dorigo, M., Theraulaz, G.: Swarm Intelligence from Natural to Artificial Sys-
tems. Oxford University Press (1999)
3. Campbell, N.A., Reece, J.B., Simon, E.J.: Essential Biology with Physiology, 2nd edn. Pear-
son Benjamin Cummings (2007)
4. Cavalcanti, A., Shirinzadeh, B., Fukuda, T., Ikeda, S.: Nanorobot for brain aneurysm. Inter-
national Journal of Robotics Research 28, 558–570 (2009)
5. Cavalcanti, A., Freitas Jr., R.A.: Nanorobotics control design: A collective behavior approach
for medicine. IEEE Transactions on NanoBioScience 4(2), 133–140 (2005)
6. Chandran, K.B., Yoganathan, A.P., Rittgers, S.E.: Biofluid Mechanics: The Human Circula-
tion, 1st edn. Taylor & Francis (2007)
7. Chhabra, R.P., Richardson, J.F.: Non-Newtonian flow and applied rheology: engineering ap-
plications, 2nd edn. Elsevier, Amsterdam (2008)
8. Clerc, M.: The swarm and the queen: Towards a deterministic and adaptive particle swarm
optimization. In: Proceedings of the IEEE Congress on Evolutionary Computation, pp.
1951–1957 (1999)
9. Clerc, M., Kennedy, J.: The particle swarm-explosion, stability, and convergence in a mul-
tidimensional complex space. IEEE Transactions on Evolutionary Computation 6(1), 58–73
(2002)
10. Colorni, A., Dorigo, M., Maniezzo, V.: Distributed optimization by ant colonies. In: Pro-
ceedings of the First European Conference on Artificial Life, pp. 134–142 (1992)
11. Dorigo, M., Gambardella, L.M.: Ant colony system: A cooperative learning approach to the
traveling salesman problem. IEEE Transactions on Evolutionary Computation 1(1), 53–66
(1997)
12. Drexler, K.E.: Engines of Creation 2.0: The Coming Era of Nanotechnology. WOWIO,
20th anniversary edn. (2007)
13. Eberhart, R.C., Kennedy, J.: A new optimizer using particle swarm theory. In: Proceedings of
the Sixth International Symposium on Micro Machine and Human Science, Nagoya, Japan,
pp. 39–43 (1995)
14. Engelbrecht, A.P.: Computational Intelligence: An Introduction. John Wiley & Sons (2007)
15. Feynman, R.P.: There's plenty of room at the bottom: An invitation to enter a new field of
physics. Engineering and Science 23 (1960)
16. Fogelson, A.L., Guy, R.D.: Immersed-boundary-type models of intravascular platelet aggre-
gation. Computer Methods in Applied Mechanics and Engineering 197(25-28), 2087–2104
(2008)
17. Freitas Jr., R.A.: Exploratory design in medical nanotechnology: A mechanical artificial
red cell. Artificial Cells, Blood Substitutes, and Immobilization Biotechnology 26, 411–430 (1998),
http://www.foresight.org/Nanomedicine/Respirocytes.html
18. Ganong, W.F.: Review of Medical Physiology, 22nd edn. McGraw-Hill (2005)
19. Iida, N.: Influence of plasma layer on steady blood flow in micro vessels. Japanese Journal
of Applied Physics 17, 203–214 (1978)
20. Fister Jr., I., Yang, X.S., Fister, I., Brest, J., Fister, D.: A brief review of nature-inspired
algorithms for optimization. Elektrotehniški Vestnik 80(3), 116–122 (2013)
21. Kaewkamnerdpong, B., Bentley, P.J.: Computer science for nanotechnology: Needs and op-
portunities. In: Proceedings of the Fifth International Conference on Intelligent Processing
and Manufacturing of Materials (2005)
22. Kaewkamnerdpong, B., Bentley, P.J.: Modelling nanorobot control using swarm intelligence:
A pilot study. In: Lim, C.P., Jain, L.C., Dehuri, S. (eds.) Innovations in Swarm Intelligence.
SCI, vol. 248, pp. 175–214. Springer, Heidelberg (2009)
23. Karaboga, D., Akay, B.: A comparative study of artificial bee colony algorithm. Applied
Mathematics and Computation 214, 108–132 (2009)
24. Karaboga, D., Basturk, B.: A powerful and efficient algorithm for numerical function opti-
mization: Artificial bee colony (ABC) algorithm. Journal of Global Optimization 39(3) (2007)
25. Karaboga, D., Basturk, B.: On the performance of artificial bee colony (ABC) algorithm. Ap-
plied Soft Computing 8, 687–697 (2008)
26. Levick, J.R.: An Introduction to Cardiovascular Physiology, 5th edn. Hodder Arnold (2010)
27. Massoudi, M., Phuoc, T.X.: Pulsatile flow of blood using a modified second-grade fluid
model. Computers and Mathematics with Applications 56, 199–211 (2008)
28. Mody, N.A., King, M.R.: Influence of Brownian motion on blood platelet flow behavior and
adhesive dynamics near a planar wall. Langmuir 23(11), 6321–6328 (2007)
29. Mody, N.A., King, M.R.: Platelet adhesive dynamics. Part II: High shear-induced transient
aggregation via GPIbα-vWF-GPIbα bridging. Biophysical Journal 95(5), 2556–2574 (2008)
30. Pal, B., Misra, J.C.: A mathematical model for the study of the pulsatile flow of blood under
an externally imposed body acceleration. Mathematical and Computer Modelling 29, 89–106
(1999)
31. Payton, D., Daily, M., Estowski, R., Howard, M., Lee, C.: Pheromone robotics. Autonomous
Robots 11(3), 319–324 (2001)
32. Freitas Jr., R.A.: Clottocytes: Artificial mechanical platelets. Foresight Update 41 (2000)
33. Freitas Jr., R.A.: Microbivores: Artificial mechanical phagocytes using digest and discharge
protocol. Journal of Evolution and Technology 14, 152 (2005)
34. Sanchita, P., Abhimanyu, K.S.: Identification of metastatic tumors by using DNA nanorobot:
A fuzzy logic approach. International Journal of Computer Applications 1, 5–14 (2010)
35. Sankar, D.S., Lee, U.: Two-fluid Herschel-Bulkley model for blood flow in catheterized arter-
ies. Journal of Mechanical Science and Technology 22, 1008–1018 (2008)
36. Sharma, N.N., Mittal, R.K.: Nanorobot movement: Challenges and biologically inspired so-
lutions. International Journal on Smart Sensing and Intelligent Systems 1(1) (March 2008)
37. Slayter, H., Loscalzo, J., Bockenstedt, P., Handin, R.I.: Native conformation of human von
Willebrand protein. Analysis by electron microscopy and quasi-elastic light scattering. The
Journal of Biological Chemistry 260, 8559–8563 (1985)
38. Stanfield, C.L., Germann, W.J.: Principles of Human Physiology, 3rd edn. Pearson Benjamin
Cummings (2008)
39. Stanley, R.G., Tucker, K.L., Barrett, N.E., Gibbins, J.M.: Platelets and their role in throm-
botic and cardiovascular disease: the impact of proteomic analysis. In: Platelet Proteomics:
Principles, Analysis and Application, pp. 3–26. John Wiley & Sons (2011)
40. Tortora, G.J., Derrickson, B.H.: Principles of Anatomy and Physiology: Maintenance and
Continuity of the Human Body, 12th edn. John Wiley & Sons (2009)
41. Waite, L., Fine, J.: Applied Biofluid Mechanics. McGraw-Hill (2007)
42. Wang, J.: Can man-made nanomachines compete with nature biomotors? ACS Nano 3(1),
4–9 (2009)
43. Warkentin, T.E.: Thrombocytopenia due to platelet destruction and hypersplenism. In: Hema-
tology: Basic Principles and Practice, 5th edn., pp. 2113–2131. Churchill Livingstone Else-
vier (2009)
44. Zamir, M.: The Physics of Coronary Blood Flow. Springer (2005)
45. Zhang, L., Abbott, J.J., Dong, L.: Characterizing the swimming properties of artificial bacte-
rial flagella. Nano Letters 9(10), 3663–3667 (2009)
Author Index

Achalakul, Tiranee 205
Alavi, Amir H. 111
Boonrong, Pinfa 205
Bošković, Borko 53
Brest, Janez 53
Fister, Iztok 3, 69, 149
Fister Jr., Iztok 3, 69
Gandomi, Amir H. 111, 129
Hosseini, Seyyed Soheil Sadat 129
Hozjan, Tomaž 149
Kaewkamnerdpong, Boonserm 205
Kizilay, Damla 171
Marinaki, Magdalene 185
Marinakis, Yannis 185
Nemati, Alireza 129
Özcan, Sel 171
Spanou, Paraskevi 185
Strnad, Damjan 3
Suganthan, P.N. 171
Tasgetiren, M. Fatih 171
Trihirun, Supatchaya 205
Trunfio, Giuseppe A. 91
Turk, Goran 149
Wang, Gai-Ge 111
Yang, Xin-She 3, 129
Zamuda, Aleš 53