
Simulated Annealing

Introduction
Simulated annealing was created when researchers noticed the analogy between their search algorithms
and metallurgists' annealing algorithms. The idea is to achieve a goal state without reaching it too fast. In
metallurgy, for example, the process of hardening steel requires specially timed heating and cooling to
make the iron and carbon atoms settle just right. In mathematical search algorithms, we want to focus on
promising solutions without ignoring better solutions we might find later. In other words, we want to
reduce error toward the global minimum without getting stuck in less successful local minima.
The Algorithm

1. Create the first solution, and get its energy value.
2. While temperature > minimum temperature:
   3. Make a copy of the solution.
   4. Modify the copy.
   5. Get the copy's energy value.
   6. Keep the better of these two solutions.
   7. Reduce the temperature.
   8. Repeat.
The algorithm stops when the temperature reaches a preset minimum value.
The important thing to keep in mind is that "keeping the better solution" does NOT necessarily mean the one with the lower energy value. If the modified copy turns out to be better than the previous solution, great: we're making progress. But if it's not better, how would we know which way to go from here? If the best solution we can find still contains errors, chances are we've done the best we can within a local minimum. To get free of a local minimum, we have no choice but to accept worse solutions, scaling the walls of error, until we've risen out of the local minimum and can begin exploring neighboring possibilities. This is where a special acceptance algorithm comes in.
We accept a worse solution with probability P = exp(-delta / Temp),
where delta is the difference between the copy's energy and the previous solution's energy.
As the temperature gradually decreases, the difference in energy values has a profound effect on acceptance: at high temperatures the algorithm will accept almost anything, while at low temperatures it becomes very discriminating, accepting only small increases in energy.
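The loop and acceptance rule above can be sketched in a few lines of Python. This is a minimal illustration; the function names, cooling schedule, and parameter values are assumptions, not taken from the text:

```python
import math
import random

def simulated_annealing(energy, initial, neighbor,
                        temp=10.0, min_temp=1e-3, cooling=0.95):
    # `energy` returns a solution's energy value; `neighbor` returns a
    # modified copy of a solution.
    solution = initial
    e = energy(solution)
    while temp > min_temp:
        candidate = neighbor(solution)          # make and modify a copy
        e_new = energy(candidate)
        delta = e_new - e
        # Always keep improvements; accept worse solutions with
        # probability P = exp(-delta / temp)
        if delta < 0 or random.random() < math.exp(-delta / temp):
            solution, e = candidate, e_new
        temp *= cooling                         # reduce the temperature
    return solution, e

# Toy usage: minimize f(x) = x^2 starting from x = 10
random.seed(0)                                  # reproducible run
best, best_e = simulated_annealing(
    energy=lambda x: x * x,
    initial=10.0,
    neighbor=lambda x: x + random.uniform(-1.0, 1.0))
```

Note how the geometric cooling schedule (`temp *= cooling`) makes the acceptance test increasingly strict over time, exactly the behavior described above.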
Particle Swarm Optimization
Inspired by the flocking and schooling patterns of birds and fish, Particle Swarm Optimization (PSO) was
invented by Russell Eberhart and James Kennedy in 1995. Originally, these two started out developing
computer software simulations of birds flocking around food sources, then later realized how well their
algorithms worked on optimization problems.
Particle Swarm Optimization might sound complicated, but it's really a very simple algorithm. Over a
number of iterations, a group of variables have their values adjusted closer to the member whose value is
closest to the target at any given moment. Imagine a flock of birds circling over an area where they can
smell a hidden source of food. The one who is closest to the food chirps the loudest and the other birds
swing around in his direction. If any of the other circling birds comes closer to the target than the first, it
chirps louder and the others veer over toward him. This tightening pattern continues until one of the
birds happens upon the food. It's an algorithm that's simple and easy to implement.
The algorithm keeps track of three global variables:
- Target value or condition
- Global best (gBest) value indicating which particle's data is currently closest to the Target
- Stopping value indicating when the algorithm should stop if the Target isn't found
Each particle consists of:
- Data representing a possible solution
- A Velocity value indicating how much the Data can be changed
- A personal best (pBest) value indicating the closest the particle's Data has ever come to the Target
The particles' data could be anything. In the flocking birds example above, the data would be the X, Y, Z
coordinates of each bird. The individual coordinates of each bird would try to move closer to the
coordinates of the bird which is closer to the food's coordinates (gBest). If the data is a pattern or
sequence, then individual pieces of the data would be manipulated until the pattern matches the target
pattern.
The velocity value is calculated according to how far an individual's data is from the target. The further it
is, the larger the velocity value. In the birds example, the individuals furthest from the food would make
an effort to keep up with the others by flying faster toward the gBest bird. If the data is a pattern or
sequence, the velocity would describe how different the pattern is from the target, and thus, how much it
needs to be changed to match the target.
Each particle's pBest value only indicates the closest the data has ever come to the target since the
algorithm started.
The gBest value only changes when any particle's pBest value comes closer to the target than gBest.
Through each iteration of the algorithm, gBest gradually moves closer and closer to the target until one of
the particles reaches the target.
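The pBest/gBest bookkeeping described above can be sketched in Python. The inertia weight and acceleration coefficients in the velocity update are standard textbook values, assumed here since the text does not fix the update formula:

```python
import random

def pso(fitness, dim=2, n_particles=10, iters=100,
        w=0.72, c1=1.49, c2=1.49):
    # Minimizes `fitness`. Each particle keeps a position, a velocity,
    # and its personal best (pBest); the swarm shares a global best (gBest).
    pos = [[random.uniform(-5, 5) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                   # personal best positions
    pbest_val = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # global best

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                # Velocity pulls each particle toward its pBest and gBest
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = fitness(pos[i])
            if val < pbest_val[i]:                # new personal best
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:               # new global best
                    gbest, gbest_val = pos[i][:], val

    return gbest, gbest_val

# Toy usage: the "food" is the minimum of the sphere function at the origin
random.seed(0)
best_pos, best_val = pso(lambda p: sum(x * x for x in p))
```

As in the birds analogy, every particle is pulled partly toward its own best-known position and partly toward the swarm's current loudest "chirp", gBest.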
It's also common to see PSO algorithms using population topologies, or "neighborhoods", which can be
smaller, localized subsets of the global best value. These neighborhoods can involve two or more
particles which are predetermined to act together, or subsets of the search space that particles happen
into during testing. The use of neighborhoods often helps the algorithm avoid getting stuck in local minima.
Figure 1. A few common population topologies (neighborhoods). (A) Single-sighted, where individuals
only compare themselves to the next best. (B) Ring topology, where each individual compares only to
those to the left and right. (C) Fully connected topology, where everyone is compared together. (D)
Isolated, where individuals only compare to those within specified groups.

Neighborhood definitions and how they're used have different effects on the behavior of the algorithm.

What Is the Genetic Algorithm?

The genetic algorithm is a method for solving both constrained and unconstrained optimization problems
that is based on natural selection, the process that drives biological evolution. The genetic algorithm
repeatedly modifies a population of individual solutions. At each step, the genetic algorithm selects
individuals at random from the current population to be parents and uses them to produce the children for
the next generation. Over successive generations, the population "evolves" toward an optimal solution. You
can apply the genetic algorithm to solve a variety of optimization problems that are not well suited for
standard optimization algorithms, including problems in which the objective function is discontinuous,
nondifferentiable, stochastic, or highly nonlinear. The genetic algorithm can address problems of mixed
integer programming, where some components are restricted to be integer-valued.
The genetic algorithm uses three main types of rules at each step to create the next generation from the
current population:
- Selection rules select the individuals, called parents, that contribute to the population at the next generation.
- Crossover rules combine two parents to form children for the next generation.
- Mutation rules apply random changes to individual parents to form children.
The genetic algorithm differs from a classical, derivative-based, optimization algorithm in two main ways,
as summarized in the following table.
Classical Algorithm:
- Generates a single point at each iteration. The sequence of points approaches an optimal solution.
- Selects the next point in the sequence by a deterministic computation.

Genetic Algorithm:
- Generates a population of points at each iteration. The best point in the population approaches an optimal solution.
- Selects the next population by computation which uses random number generators.
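The three rule types can be sketched as a minimal "one-max" example in Python. Tournament selection and one-point crossover are assumed choices here; the text above does not prescribe specific selection or crossover rules:

```python
import random

def genetic_algorithm(fitness, length=16, pop_size=30,
                      generations=60, mutation_rate=0.02):
    # Maximizes `fitness` over bit strings of the given length.
    pop = [[random.randint(0, 1) for _ in range(length)]
           for _ in range(pop_size)]

    def select():                                  # selection rule
        a, b = random.sample(pop, 2)               # 2-way tournament
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        next_pop = []
        while len(next_pop) < pop_size:
            p1, p2 = select(), select()
            cut = random.randrange(1, length)      # crossover rule
            child = p1[:cut] + p2[cut:]            # one-point crossover
            child = [bit ^ 1 if random.random() < mutation_rate else bit
                     for bit in child]             # mutation rule
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness)

# Toy usage: evolve toward the all-ones string (the "one-max" problem)
random.seed(0)
best = genetic_algorithm(fitness=sum)
```

Each generation applies the three rules in order: parents are selected stochastically, crossover mixes their genes, and mutation injects random variation, matching the table's "random number generators" column.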

Ant colony optimization algorithms

When a colony of ants is confronted with the choice of reaching their food via two different routes of which
one is much shorter than the other, their choice is entirely random. However, those who use the shorter
route move faster and therefore go back and forth more often between the anthill and the food.[1]
In computer science and operations research, the ant colony optimization algorithm (ACO) is
a probabilistic technique for solving computational problems which can be reduced to finding good paths
through graphs. Artificial Ants stand for multi-agent methods inspired by the behavior of real ants. The
pheromone-based communication of biological ants is often the predominant paradigm
used.[2] Combinations of Artificial Ants and local search algorithms have become a method of choice for
numerous optimization tasks involving some sort of graph, e.g., vehicle routing and internet routing. The
burgeoning activity in this field has led to conferences dedicated solely to Artificial Ants, and to numerous
commercial applications by specialized companies such as AntOptima.
As an example, Ant colony optimization[3] is a class of optimization algorithms modeled on the actions of
an ant colony. Artificial 'ants' (e.g. simulation agents) locate optimal solutions by moving through
a parameter space representing all possible solutions. Real ants lay down pheromones directing each other
to resources while exploring their environment. The simulated 'ants' similarly record their positions and
the quality of their solutions, so that in later simulation iterations more ants locate better solutions. [4] One
variation on this approach is the bees algorithm, which is more analogous to the foraging patterns of
the honey bee, another social insect.
This algorithm is a member of the ant colony algorithms family, in swarm intelligence methods, and it constitutes a metaheuristic optimization. Initially proposed by Marco Dorigo in 1992 in his PhD
thesis,[5][6] the first algorithm was aiming to search for an optimal path in a graph, based on the behavior
of ants seeking a path between their colony and a source of food. The original idea has since diversified to
solve a wider class of numerical problems, and as a result, several problems have emerged, drawing on
various aspects of the behavior of ants. From a broader perspective, ACO performs a model-based
search[7] and shares some similarities with estimation of distribution algorithms.
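The two-route experiment quoted at the start of this section can be sketched as a tiny simulation. All parameter values here are illustrative assumptions:

```python
import random

def two_route_colony(n_ants=20, iters=50, evaporation=0.5):
    # Each ant picks a route with probability proportional to its
    # pheromone level, then deposits pheromone inversely proportional
    # to the route's length, so the shorter route is reinforced faster.
    lengths = {"short": 1.0, "long": 2.0}
    pheromone = {"short": 1.0, "long": 1.0}
    for _ in range(iters):
        deposits = {"short": 0.0, "long": 0.0}
        for _ in range(n_ants):
            p_short = pheromone["short"] / (pheromone["short"]
                                            + pheromone["long"])
            route = "short" if random.random() < p_short else "long"
            deposits[route] += 1.0 / lengths[route]  # shorter -> more pheromone
        for r in pheromone:                          # evaporate, then deposit
            pheromone[r] = (1 - evaporation) * pheromone[r] + deposits[r]
    return pheromone

random.seed(0)
trail = two_route_colony()
```

The positive feedback is visible in the result: because ants on the shorter route complete more trips per unit length, its pheromone level quickly dominates, which is the core mechanism ACO exploits on general graphs.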

Teaching-Learning-Based Optimization with Learning Enthusiasm Mechanism and Its Application in Chemical Engineering

1. Introduction

In recent years, many real-world problems have become extremely complex and are difficult to solve using
classic analytical optimization algorithms. Metaheuristic search (MS) algorithms have shown more
favorable performance on nonconvex and nondifferentiable problems, resulting in the development of
various MS algorithms for difficult real-world problems. Most of these MS algorithms are nature-inspired,
and several of the prominent algorithms include genetic algorithms (GA) [1], evolution strategies (ES) [2],
differential evolution (DE) [3], particle swarm optimization (PSO) [4, 5], harmony search (HS) [6], and
biogeography-based optimization (BBO) [7, 8]. However, the “No Free Lunch” theorem suggests that no
single algorithm is suitable for all problems [9]; therefore, more research is required to develop novel
algorithms for different optimization problems with high efficiency [10].
2.1. Basic TLBO

TLBO is a population-based MS algorithm which mimics the teaching and learning process of a typical class
[11]. In TLBO, a group of learners is considered as the population of solutions, and the fitness of each learner represents the quality of its solution. Learners improve the mean result of the class by learning from the teacher and through interaction between the learners. The TLBO process is carried out through two basic operations: the teacher phase and the learner phase.
In the teacher phase, the best solution in the entire population is considered as the teacher, and the teacher shares his or her knowledge with the learners to increase the mean result of the class. Assume X_i is the position of the ith learner, the learner with the best fitness is identified as the teacher X_teacher, and the mean position of a class with NP learners is X_mean. The position of each learner is updated by the following equation:

X_i_new = X_i_old + r * (X_teacher - TF * X_mean)

where X_i_new and X_i_old are the ith learner's new and old positions, respectively, r is a random vector uniformly distributed within [0, 1], and TF is a teacher factor whose value is heuristically set to either 1 or 2. If X_i_new is better than X_i_old, X_i_new is accepted; otherwise X_i is unchanged.
In the learner phase, a learner randomly interacts with other, different learners to further improve his or her performance. Learner X_i randomly selects another learner X_j (j ≠ i), and the learning process can be expressed by the following equation:

X_i_new = X_i_old + r * (X_i - X_j)    if f(X_i) is better than f(X_j)
X_i_new = X_i_old + r * (X_j - X_i)    otherwise

where f is the objective function with D-dimensional variables and X_j is the old position of the jth learner. If X_i_new is better than X_i_old, X_i_new is used to replace X_i_old. The pseudocode for TLBO is shown in Algorithm 1.
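The two phases above can be sketched in Python. The bounds, population size, and iteration count are illustrative assumptions:

```python
import random

def tlbo(f, dim=2, NP=20, iters=100, lb=-5.0, ub=5.0):
    # Minimizes f using the teacher phase and learner phase.
    X = [[random.uniform(lb, ub) for _ in range(dim)] for _ in range(NP)]
    for _ in range(iters):
        teacher = min(X, key=f)                 # best learner is the teacher
        mean = [sum(x[d] for x in X) / NP for d in range(dim)]
        for i in range(NP):
            # Teacher phase: Xi_new = Xi_old + r * (X_teacher - TF * X_mean)
            TF = random.choice([1, 2])          # teacher factor, 1 or 2
            new = [X[i][d] + random.random() * (teacher[d] - TF * mean[d])
                   for d in range(dim)]
            if f(new) < f(X[i]):                # greedy acceptance
                X[i] = new
            # Learner phase: interact with a randomly chosen learner j != i
            j = random.choice([k for k in range(NP) if k != i])
            if f(X[i]) < f(X[j]):
                new = [X[i][d] + random.random() * (X[i][d] - X[j][d])
                       for d in range(dim)]
            else:
                new = [X[i][d] + random.random() * (X[j][d] - X[i][d])
                       for d in range(dim)]
            if f(new) < f(X[i]):
                X[i] = new
    return min(X, key=f)

# Toy usage: minimize the sphere function
random.seed(0)
best = tlbo(lambda x: sum(v * v for v in x))
```

Both phases use the same greedy acceptance rule from the text: a new position replaces the old one only if it is better.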

Artificial neural network

Introduction to Neural Networks, Advantages and Applications

Artificial Neural Network (ANN) uses the processing of the brain as a basis to develop algorithms that can
be used to model complex patterns and prediction problems.

Let's begin by first understanding how our brain processes information:

In our brain, there are billions of cells called neurons, which process information in the form of electric signals. External information or stimuli are received by the dendrites of a neuron, processed in the neuron's cell body, converted to an output, and passed through the axon to the next neuron. The next neuron can choose to either accept or reject the signal depending on its strength.
Here, w1, w2, and w3 give the strengths of the input signals.
As you can see from the above, an ANN is a very simplistic representation of how a brain neuron works.
To make things clearer, let's understand ANN using a simple example: a bank wants to assess whether to approve a loan application from a customer, so it wants to predict whether the customer is likely to default on the loan.
Key Points related to the architecture:
1. The network architecture has an input layer, hidden layer (there can be more than 1) and the output layer.
It is also called MLP (Multi Layer Perceptron) because of the multiple layers.
2. The hidden layer can be seen as a "distillation layer" that distills some of the important patterns from the inputs and passes them on to the next layer. It makes the network faster and more efficient by identifying only the important information from the inputs, leaving out the redundant information.
3. The activation function serves two notable purposes:
- It captures non-linear relationships between the inputs
- It helps convert the input into a more useful output.
In the above example, the activation function used is the sigmoid:
O1 = 1 / (1 + exp(-F)), where F = W1*X1 + W2*X2 + W3*X3
The sigmoid activation function creates an output with values between 0 and 1. There can be other activation functions, such as tanh, softmax, and ReLU.
4. Similarly, the hidden layer leads to the final prediction at the output layer:
O3 = 1 / (1 + exp(-F1)), where F1 = W7*H1 + W8*H2
Here, the output value (O3) is between 0 and 1. A value closer to 1 (e.g. 0.75) indicates a higher likelihood of the customer defaulting.
5. The weights W are the importance associated with the inputs. If W1 is 0.56 and W2 is 0.92, then there is
higher importance attached to X2: Debt Ratio than X1: Age, in predicting H1.
6. The above network architecture is called a "feed-forward network", as the input signals flow in only one direction (from inputs to outputs). We can also create "feedback networks", where signals flow in both directions.
7. A good model with high accuracy gives predictions that are very close to the actual values. So, in the table above, the values in column X should be very close to the values in column W. The error in prediction is the difference between column W and column X.
8. The key to getting a good model with accurate predictions is to find the optimal values of the weights W that minimize the prediction error. This is achieved by the backpropagation algorithm, and it is what makes an ANN a learning algorithm: by learning from its errors, the model improves.
9. The most common optimization algorithm is gradient descent, in which different values of W are tried iteratively and the prediction errors assessed. To find the optimal W, the values of W are changed in small amounts and the impact on prediction error is assessed. Those values of W are chosen as optimal for which further changes in W no longer reduce the error.
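The feed-forward pass and the gradient-descent weight updates described above can be illustrated with a tiny one-hidden-layer network. This is a minimal sketch; the layer sizes, learning rate, and toy OR data are assumptions, not the bank example's real data:

```python
import math
import random

def sigmoid(f):
    return 1.0 / (1.0 + math.exp(-f))       # O = 1 / (1 + exp(-F))

def predict(x, w_hidden, w_out):
    # Feed-forward pass: weighted sums into the hidden layer, then into
    # a single output node, with sigmoid activations throughout.
    h = [sigmoid(sum(w * xi for w, xi in zip(ws, x))) for ws in w_hidden]
    o = sigmoid(sum(w * hi for w, hi in zip(w_out, h)))
    return o, h

def train(data, n_hidden=2, lr=0.5, epochs=2000):
    # Gradient descent with backpropagation on the squared prediction
    # error: weights are nudged in small amounts against the error gradient.
    n_in = len(data[0][0])
    w_hidden = [[random.uniform(-1, 1) for _ in range(n_in)]
                for _ in range(n_hidden)]
    w_out = [random.uniform(-1, 1) for _ in range(n_hidden)]
    for _ in range(epochs):
        for x, target in data:
            o, h = predict(x, w_hidden, w_out)
            d_out = (o - target) * o * (1 - o)      # output-layer delta
            for k in range(n_hidden):
                d_hid = d_out * w_out[k] * h[k] * (1 - h[k])
                w_out[k] -= lr * d_out * h[k]       # small weight updates
                for j in range(n_in):
                    w_hidden[k][j] -= lr * d_hid * x[j]
    return w_hidden, w_out

# Toy usage: learn the OR function; the constant third input acts as a bias
random.seed(0)
data = [((0, 0, 1), 0), ((0, 1, 1), 1), ((1, 0, 1), 1), ((1, 1, 1), 1)]
w_hidden, w_out = train(data)
error = sum((predict(x, w_hidden, w_out)[0] - t) ** 2 for x, t in data)
```

After training, the total squared error over the toy data is small: the backpropagated gradients have found weights W for which further small changes no longer reduce the error much.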
ANNs have some key advantages that make them most suitable for certain problems and situations:
1. ANNs have the ability to learn and model non-linear and complex relationships, which is really important
because in real-life, many of the relationships between inputs and outputs are non-linear as well as complex.
2. ANNs can generalize: after learning from the initial inputs and their relationships, they can infer unseen relationships from unseen data as well, enabling the model to generalize and predict on data it has not encountered.
3. Unlike many other prediction techniques, ANNs do not impose any restrictions on the input variables (such as how they should be distributed). Additionally, many studies have shown that ANNs can better model heteroskedasticity, i.e. data with high volatility and non-constant variance, given their ability to learn hidden relationships in the data without imposing fixed relationships. This is very useful in financial time series forecasting (e.g. stock prices), where data volatility is very high.
A few applications:
ANNs, due to some of their wonderful properties, have many applications:
1. Image processing and character recognition: given ANNs' ability to take in many inputs and process them to infer hidden as well as complex, non-linear relationships, ANNs are playing a big role in image and character recognition. Character recognition, such as handwriting recognition, has many applications in fraud detection (e.g. bank fraud) and even national security assessments. Image recognition is an ever-growing field with widespread applications, from facial recognition in social media and cancer detection in medicine to satellite imagery processing for agricultural and defense use. Research on ANNs has paved the way for deep neural networks, which form the basis of "deep learning" and have opened up exciting and transformational innovations in computer vision, speech recognition, and natural language processing; famous examples include self-driving cars.
2. Forecasting: forecasting is required extensively in everyday business decisions (e.g. sales, financial allocation between products, capacity utilization), in economic and monetary policy, and in finance and the stock market. Forecasting problems are often complex; for example, predicting stock prices is a complex problem with many underlying factors (some known, some unseen). Traditional forecasting models have limitations in taking these complex, non-linear relationships into account. ANNs, applied in the right way, can provide a robust alternative, given their ability to model and extract unseen features and relationships. Also, unlike those traditional models, ANNs do not impose any restrictions on the input and residual distributions.
More research is going on in the field, for example — recent advances in the usage of LSTM and Recurrent
Neural Networks for forecasting.
ANNs are powerful models that have a wide range of applications. Above, I have listed a few prominent ones,
but they have far-reaching applications across many different fields in medicine, security, banking/finance
as well as government, agriculture and defense.