
REPORT OF MAJOR PROJECT

ON

EVOLUTIONARY DESIGN OF DIGITAL CIRCUITS USING GENETIC ALGORITHM

UNDER THE GUIDANCE OF


MR. ATUL KUMAR SRIVASTAVA

BACHELOR OF TECHNOLOGY
ELECTRONICS AND COMMUNICATION

JAYPEE INSTITUTE OF INFORMATION AND TECHNOLOGY


SECTOR 62 NOIDA
(2011-2015)

SUBMITTED BY
JYOTTSNA GUPTA (11102228)

RIA RUSTAGI (11102305)

CONTENTS

Certificate
Acknowledgement

1. Introduction
1.1.Basic Definition

2. Evolutionary Strategy
2.1.Genetic Algorithm
2.2.Evolutionary Operators

3. Fitness Landscape
3.1.Landscape Analysis

4. Fitness Distance Correlation

5. NK Fitness Landscape
5.1.Landscape Distribution

6. Circuit Evolution and Structure of Landscape


6.1.Circuit Representation

7. Future Work Proposed

8. Outputs

References

CERTIFICATE

This is to certify that the work titled EVOLUTIONARY DESIGN OF DIGITAL CIRCUITS USING GENETIC
ALGORITHM, submitted by JYOTTSNA GUPTA (11102228) and RIA RUSTAGI (11102305) in partial
fulfillment of the requirements for the award of the degree of BTECH of Jaypee Institute of
Information and Technology, Noida, has been carried out under my supervision. This work has not
been submitted partially or wholly to any other University or Institute for the award of this or
any other degree or diploma.

NAME OF SUPERVISOR - MR ATUL KUMAR SRIVASTAVA

SIGNATURE OF SUPERVISOR -

DATE -

ACKNOWLEDGEMENT

We consider it our profound privilege to express our deep sense of gratitude to our project
supervisor, Mr Atul Kumar Srivastava, without whose proficient guidance and keen, enthusiastic
interest it would have been impossible for us to develop the idea of this project and execute it
successfully.

We would like to extend our gratitude to the E.C.E department for all the help and resources
made available to us and also to the people directly and indirectly related in the completion of
the project.

We also want to thank our parents for their support and assistance throughout the course of the
development of the project.

NAME OF STUDENTS - JYOTTSNA GUPTA (11102228) RIA RUSTAGI(11102305)

SIGNATURE OF STUDENTS -

DATE -

Chapter-1 Introduction
An evolutionary algorithm manipulates a population of individuals where each individual
describes how to construct a candidate circuit. Each circuit is assigned a fitness, which indicates
how well a candidate circuit satisfies the design specification. The evolutionary algorithm uses
genetic operators to evolve new circuit configurations from existing ones. Done properly, over
time the evolutionary algorithm will evolve a circuit configuration that exhibits desirable
behaviour. It has been shown that the structure of fitness landscape affects the ability of the
evolutionary algorithms to search.
Each candidate circuit can either be simulated or physically implemented in a reconfigurable
device. Typical reconfigurable devices are field-programmable gate arrays (for digital designs)
or field-programmable analog arrays (for analog designs).

Our objective is to evolve a combinational circuit using the principles of evolutionary design.
The work covers the following topics:

1. Genetic Algorithm
2. Evolutionary Strategy
3. Fitness Landscape
4. Implementation on FPGA, Transistors
5. Evolvable Hardware

1.1 Basic Definition
1) Chromosome: Each individual in the population.

2) Genes: A set of parameters by which an individual is characterized.

3) Genotype: A string of chromosomes.

4) Phenotype: The characteristics of genotype.

5) Optimization: Choosing the best element from some set of available alternatives.

6) Evolutionary Algorithm: An algorithm that mimics natural evolutionary principles to constitute
search and optimization procedures.

7) Isotropic: An object or substance having a physical property which has the same value
when measured in different directions.

8) Epistatic: the interaction of genes that are not alleles, in particular the suppression of the
effect of one such gene by another.

9) Entropic Measure: (in statistical mechanics) a measure of the randomness of the microscopic
constituents of a thermodynamic system. A closed system evolves toward a state of maximum
entropy.

10) Meta-heuristic: In computer science and mathematical optimization, a meta-heuristic is
a higher-level procedure or heuristic designed to find, generate, or select a lower-level
procedure or heuristic that may provide a sufficiently good solution to an optimization
problem, especially with incomplete or imperfect information or limited

11) Stochastic process: having a random probability distribution or pattern that may be
analysed statistically but may not be predicted precisely.

Chapter-2 Evolutionary Strategy

An evolutionary algorithm (EA) is a subset of evolutionary computation, a generic
population-based meta-heuristic optimization algorithm. An EA uses mechanisms inspired by biological
evolution such as reproduction, mutation, recombination, and selection. Candidate solutions to
the optimization problem play the role of individuals in a population, and the fitness function
determines the quality of the solutions. Evolution of the population then takes place after the
repeated application of the above operators.

ASSEMBLE AND TEST: the concept of assembling a larger system from a number of component parts
and then testing the resulting organism in the environment in which it finds itself.

The concept of assemble and test, combined with an evolutionary algorithm that gradually improves
the quality of a design, has largely been adopted in evolvable hardware, where the task is to
build an electronic circuit.

There are two main categories of the evolvable hardware:


1) Intrinsic evolution: in this each individual is tested out in hardware. Although the
evolution process is still implemented in software, assessment of design quality is based
on an actual implementation. When evolution is complete, the resulting design is already
implemented in hardware.

2) Extrinsic evolution: the evolution process and the resulting evaluations are implemented
in software. Each individual, design instance generated from the evolution process, is
evaluated by a software simulation of the design described by the individual. When
evolution is complete, the resulting design needs to be implemented in hardware.

Reed Muller and Exclusive-OR Logic: When a Boolean logic function is expressed using XOR
gates and uncomplemented variables, it is called a Reed Muller canonical form.

FIGURE 1

In order to build an evolutionary algorithm there are a number of steps that we have to perform:
Design a representation
Decide how to initialize a population
Design a way of mapping a genotype to a phenotype
Design a way of evaluating an individual

Designing a Representation
Representation of an individual can be using discrete values (binary, integer, or any other
system with a discrete set of values).
Following is an example of binary representation.

2.1 Genetic Algorithm
It is a type of Evolutionary Algorithm inspired by genetics and the survival of the fittest
(Darwin's theory). The steps involved in a Genetic Algorithm are:
1) Represent the problem in the form of chromosomes.
2) Generate an initial population of many random chromosomes.
3) Evaluate each chromosome using a predefined function. This function is called the fitness
function.
4) A proportion of the chromosomes with higher fitness is accepted and the rest are rejected.
5) Accepted chromosomes exchange their genes (attributes) and produce new offspring. This
step favours chromosomes with higher fitness. It is known as crossover and is
repeated until the lost population is regenerated.
6) Some genes are selected and randomly changed. This step is known as mutation. It
saves the population from premature convergence. Usually the probability of mutation is
very low; increasing the probability of mutation results in a random search.
Steps 3 to 6 are repeated for many generations until acceptable results are achieved.
Genetic Algorithm Flowchart

FIGURE 2
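
The six steps listed for the genetic algorithm can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the project: the fitness function `one_max` (the number of 1 bits), the parameter values, and the names `run_ga`, `keep_fraction` and `p_mutation` are all hypothetical. For brevity, parents are drawn uniformly from the accepted chromosomes rather than in proportion to their fitness.

```python
import random

def one_max(chromosome):
    """Illustrative fitness function: the number of 1 bits."""
    return sum(chromosome)

def run_ga(fitness, length=20, pop_size=30, generations=60,
           keep_fraction=0.5, p_mutation=0.01, seed=0):
    rng = random.Random(seed)
    # Steps 1-2: binary chromosomes, random initial population.
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Step 3: evaluate. Step 4: accept the fitter proportion.
        population.sort(key=fitness, reverse=True)
        survivors = population[:max(2, int(keep_fraction * pop_size))]
        # Step 5: one-point crossover until the population is regenerated.
        children = []
        while len(survivors) + len(children) < pop_size:
            mother, father = rng.sample(survivors, 2)
            point = rng.randint(1, length - 1)
            children.append(mother[:point] + father[point:])
        # Step 6: flip each gene of the new offspring with low probability.
        for child in children:
            for i in range(length):
                if rng.random() < p_mutation:
                    child[i] = 1 - child[i]
        population = survivors + children
    return max(population, key=fitness)

best = run_ga(one_max)
print(best, one_max(best))
```

Because the accepted chromosomes are carried over unchanged, the best fitness in the population never decreases from one generation to the next.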

2.2 EVOLUTIONARY OPERATORS

There are three Evolutionary / Genetic Operators:


1. Selection: chromosomes are selected from the population to be parents for crossover. The
problem is how to select these chromosomes. According to Darwin's theory of evolution, the
best ones should survive and create new offspring. There are many methods for selecting
the best chromosomes, for example roulette wheel selection, Boltzmann selection and
many others.

2. Crossover: After the selection process, the crossover operator is used to generate two
offspring. In one- and two-point crossover, one or two chromosome positions are
randomly selected between 1 and (L-1), where L is the chromosome length, and the
two parents are crossed at those points.

In one-point crossover, the first child is identical to the first parent up to the crossing
point and identical to the second parent after the crossing point. In uniform crossover,
each chromosome position is crossed with some probability, typically one-half.

Parent 1:    101101101 | 111001100
Parent 2:    001101100 | 100101000
Offspring 1: 101101101 | 100101000
Offspring 2: 001101100 | 111001100

The amount of crossover is controlled by the crossover probability, which is defined as the
ratio of the number of offspring produced in each generation to the population size.

3. Mutation: In a binary-coded Genetic Algorithm, mutation may be done by flipping a bit,
while in a non-binary-coded Genetic Algorithm, mutation involves randomly generating a
new character in a specified position. Mutation produces incremental random changes in
the offspring generated through crossover.

Mutation is equivalent to a random search, consisting of incremental random modification
of the existing solution, and acceptance if there is improvement.

Example: Before Mutation: 110100010011
         After Mutation:  110000010011

The mutation probability PM is defined as the probability of mutating each gene. It
controls the rate at which new gene values are introduced into the population.
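
The crossover and mutation examples given for these operators can be reproduced with two short functions. This is a sketch only: the names `one_point_crossover` and `flip_bit` are illustrative, and the crossing point and mutated position are fixed here so that the output matches the examples, whereas in a real run both would be chosen at random.

```python
def one_point_crossover(parent1, parent2, point):
    """Swap the tails of two equal-length bit strings after `point`."""
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2

def flip_bit(chromosome, position):
    """Mutate a binary chromosome by flipping the bit at `position`."""
    flipped = "1" if chromosome[position] == "0" else "0"
    return chromosome[:position] + flipped + chromosome[position + 1:]

offspring1, offspring2 = one_point_crossover(
    "101101101111001100", "001101100100101000", point=9)
print(offspring1)  # 101101101100101000
print(offspring2)  # 001101100111001100
print(flip_bit("110100010011", 3))  # 110000010011
```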

Chapter -3 FITNESS LANDSCAPES

One needs to examine about 50 million genotypes for a circuit to achieve a high probability of
success. Thus it becomes essential to understand more about the nature of the fitness landscapes.

Fitness landscape: In evolutionary biology, fitness landscapes or adaptive landscapes (types of
evolutionary landscapes) are used to visualize the relationship between genotypes and
reproductive success. It is assumed that every genotype has a well-defined replication rate (often
referred to as fitness). The landscape expresses the idea that evolution can be considered as a
population flow on a surface in which the altitude of a point quantifies how well the
corresponding organism is adapted to an environment.
The landscape model is employed to investigate the structure of circuit evolution landscapes in
terms of the interplay between smoothness, ruggedness and neutrality. The smoothness and
ruggedness are related to the fitness differences between neighbouring points whereas the
neutrality refers to the flat landscape areas. The study of the characteristics of these landscapes is
an important concern in digital circuit evolution both for their scalability and in the importance
of choosing appropriate sets of logic functions used in the assembly of the digital circuits.

A fitness value is assigned to each genotype and the evolutionary algorithm refers to these values
when deciding which phenotype should survive and reproduce. The fitness value of a genotype is
evaluated by a fitness function, f, which measures how good the encoded phenotype is.
The evolutionary design of digital circuits can be considered as a search on a fitness landscape. It
has been shown that the structure of fitness landscapes affects the ability of the evolutionary
algorithms to search. The evolutionary search is easier when the landscapes are smoother.
However, the search becomes more difficult on more rugged landscapes since the population can
be trapped in local optima.

Information analysis
The landscape structure can be studied by quantifying the time series that is
obtained by sampling values along a random walk on the landscape, when
the landscape is statistically isotropic. A landscape is statistically isotropic
when the sequence of fitness values, obtained by a random walk on the
landscape, forms a stationary random process for the assumed joint
distribution of fitness values.

Weinberger has investigated how the autocorrelation function of the fitness
values of points along the steps of a random walk relates to the ruggedness
of the examined landscape. The autocorrelation function as a measure of
landscapes has been also studied by Stadler who suggested that the
autocorrelations can be used to estimate the amplitude spectra derived from
the Fourier transforms of the landscapes.

FIGURE 3: A sequence of fitness values as an ensemble of objects.

From the above figure, we can infer that a fitness landscape, L, can be defined on a graph,
$G_f = (V, E)$, whose vertices are genotypes labelled with fitness values and whose edges are
defined by the evolutionary operator, in agreement with the concept of "one operator, one
landscape". The genotype representation, the neighbourhood relation, and the fitness function
define the structure of a fitness landscape. The structure can be specified in terms of three
characteristics of fitness landscapes: smoothness, ruggedness and neutrality.
Information characteristics
Consider a sequence of fitness values $\{f_t\}_{t=0}^{n}$, real numbers from the interval I, that
are obtained by a walk on a landscape, L. The sequence is a time series that represents a path in
L, and contains information about the structure of the landscape. The aim is to extract this
information by representing the time series as an ensemble of objects. The ensemble can be defined
as a string, $S(\varepsilon) = s_1 s_2 s_3 \ldots s_n$, of symbols $s_i \in \{-1, 0, 1\}$, which
are obtained by a function of the fitness values and the parameter $\varepsilon$.

The parameter $\varepsilon$ is a real number from the interval $[0, l_I]$, where $l_I$ is the
length of the interval I. The parameter determines the accuracy of calculation of the string
$S(\varepsilon)$. If $\varepsilon = 0$, the function will be very sensitive to the differences
between the fitness values and $S(\varepsilon)$ will be determined as precisely as possible. When
the parameter is $l_I$, $S(\varepsilon)$ will be a string of 0s.
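
The function that produces the symbols is not reproduced in the text. One rule that is consistent with the description (most sensitive when the accuracy parameter is 0, all zeros when it equals the length of the interval) is to threshold the difference between consecutive fitness values. The sketch below assumes that rule; the name `symbol_string`, the argument `eps` and the example walk are illustrative only.

```python
def symbol_string(fitness_values, eps):
    """Encode a fitness time series as symbols from {-1, 0, 1}.

    Assumed rule: -1 if the fitness drops by more than eps, 1 if it
    rises by more than eps, and 0 otherwise.
    """
    symbols = []
    for previous, current in zip(fitness_values, fitness_values[1:]):
        difference = current - previous
        if difference < -eps:
            symbols.append(-1)
        elif difference > eps:
            symbols.append(1)
        else:
            symbols.append(0)
    return symbols

walk = [10, 12, 12, 7, 9]
print(symbol_string(walk, 0))  # [1, 0, -1, 1]
print(symbol_string(walk, 2))  # [0, 0, -1, 0]
print(symbol_string(walk, 5))  # [0, 0, 0, 0]
```

The example walk spans the interval [7, 12], whose length is 5, so the last call corresponds to the largest admissible parameter value and yields a string of 0s.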

Entropic Measures: There are two entropic measures of the ensemble of the sub-blocks of length
two of $S(\varepsilon)$. These are:

1) FEM: It is an estimate of the ruggedness of a landscape with respect to the landscape
neutrality. It is denoted by $H(\varepsilon)$.

$$H(\varepsilon) = -\sum_{p \neq q} P_{[pq]} \log_6 P_{[pq]} \qquad (2)$$

2) SEM: It is an estimate of the smoothness of a landscape with respect to the landscape
neutrality. It is denoted by $h(\varepsilon)$.

$$h(\varepsilon) = -\sum_{p = q} P_{[pq]} \log_3 P_{[pq]} \qquad (3)$$

The probabilities $P_{[pq]}$ are the frequencies of the possible blocks pq of elements from the
set {-1, 0, 1}. They are defined as

$$P_{[pq]} = \frac{n_{[pq]}}{n} \qquad (4)$$

where $n_{[pq]}$ is the number of sub-blocks pq in the string $S(\varepsilon)$.

The FEM and the SEM characterise the time series $\{f_t\}_{t=0}^{n}$ with a certain accuracy. The
accuracy of the estimations can be varied by the parameter $\varepsilon$, which in turn defines
the entropic measures as functions of the accuracy.
For small values of $\varepsilon$, the function will be very sensitive to the differences between
the fitness values. If $\varepsilon$ is zero then the accuracy of the estimations obtained by
$H(\varepsilon)$ and $h(\varepsilon)$ is high. In contrast, for $\varepsilon = l_I$, the FEM and
the SEM of $S(\varepsilon)$ are 0, i.e. for such $\varepsilon$ the landscape path will be
determined as relatively flat.
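
As an illustration, the two entropic measures can be computed from a symbol string by counting its blocks of length two. The sketch makes one assumption that departs slightly from the definition of the block probabilities: counts are divided by the number of blocks actually present (one fewer than the string length) so that the frequencies sum to one. For long walks the difference is negligible. The name `entropic_measures` is illustrative.

```python
import math
from collections import Counter

def entropic_measures(symbols):
    """Return (FEM, SEM) for a symbol string over {-1, 0, 1}."""
    pairs = list(zip(symbols, symbols[1:]))
    counts = Counter(pairs)
    total = len(pairs)
    fem = 0.0  # blocks pq with p != q: six possible blocks, base 6
    sem = 0.0  # blocks pq with p == q: three possible blocks, base 3
    for (p, q), count in counts.items():
        probability = count / total
        if p != q:
            fem -= probability * math.log(probability, 6)
        else:
            sem -= probability * math.log(probability, 3)
    return fem, sem

print(entropic_measures([0, 0, 0, 0]))  # (0.0, 0.0)
fem, sem = entropic_measures([1, -1, 1, -1, 1])
print(fem, sem)
```

A string of 0s gives zero for both measures, in line with the flat-path case described in the text. The alternating string contains only the two blocks (1, -1) and (-1, 1), each with frequency one half, so its FEM equals the base-6 logarithm of 2 and its SEM is zero.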

Regular Walk: the FEM and the SEM of a time series generated by a regular walk on the
landscape are positive constants for each $\varepsilon \in [0, \varepsilon^*)$.
The term regularity is defined as follows: a time series, $\{f_t\}_{t=0}^{n}$, is generated by a
regular walk on a landscape when the time series obeys

$$f_{t+1} = f_t + kc \qquad (5)$$

where c is a constant and k is a variable which can be 0, 1, or -1. If this equation is not
fulfilled, the landscape path is generated by an irregular walk.

In practice regular walks on a landscape are a rare occurrence. Consider the above equation in
the form

$$f_{t+1} = f_t + k_i c_i \qquad (6)$$

where the $c_i$ are different constants.

The degree of regularity of a landscape is the number of different $k_i c_i$, that is to say the
number of all possible differences of fitness values. For instance, the degree of regularity of NK
landscapes generated by one-point mutation is low, and it decreases as K increases from 0 to N-1
since the number of possible fitness values increases with a higher than linear rate.

Circuit Evolution Landscapes: The genotype is a composition of three different parts which are
responsible for
1) the gates functionality
2) the array internal connectivity, and
3) the array outputs.

The reason for splitting the genotype into parts is the difference in the purposes of the parts.
The chromosomes have different lengths and they are defined over two different alphabets.
The gate functionality chromosomes are strings over an alphabet of size l, with length equal to
the number of gates. The alphabet size l is the number of allowed logic functions used in the
circuit design.
The internal connectivity and array outputs chromosomes are defined over a second alphabet, and
they are strings with length equal to the number of gates and the number of array outputs,
respectively. The second alphabet is related to the size of the neighbourhood of the cells and
array outputs, which is dependent upon the levels-back parameter.

To avoid the vagueness in the definition of circuit evolution landscapes, the genotype space is
split into three partitions as was done in Stadler and Grunter.

Since each genotype consists of three chromosomes it is assumed that the original landscape for
a given evolutionary operator is a superposition of three configuration spaces defined over
these two alphabets.

3.1 Landscape Analysis


The analysis is applied to time series obtained by random walks on the three subspaces of the
landscapes. The random walks are performed by one-point mutation applied only to the studied
subspace. The one-point mutation is an operator which changes the allele of one gene at each
time step.
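
A random walk by one-point mutation can be sketched as follows. The circuit fitness function is not given in this section, so the sketch takes the fitness as an argument and uses the sum of the genes as a stand-in. The name `random_walk`, the alphabet `[0, 1, 2]` and the genotype length are hypothetical.

```python
import random

def random_walk(start, fitness, alphabet, steps, seed=0):
    """Return the fitness time series of a one-point-mutation walk."""
    rng = random.Random(seed)
    genotype = list(start)
    series = [fitness(genotype)]
    for _ in range(steps):
        position = rng.randrange(len(genotype))
        # Pick a different allele so that exactly one gene changes.
        choices = [a for a in alphabet if a != genotype[position]]
        genotype[position] = rng.choice(choices)
        series.append(fitness(genotype))
    return series

series = random_walk([0] * 8, fitness=sum, alphabet=[0, 1, 2], steps=20)
print(series)
```

To study one subspace at a time, as described above, the walk would be restricted to the positions of the chromosome belonging to that subspace.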

We have carried out landscape analysis on three-bit multipliers of 24 gates and 21 gates, to
determine which has the landscape structure better suited to evolutionary search.

Figure 4 (24 gates 3 bit-multiplier) Figure 5 (21 gates 3 bit-multiplier)

In the above graphs:
(1) stands for Gate Functionality
(2) stands for Internal Connectivity
(3) stands for Output Connectivity

Interpretation:
1) $H(\varepsilon)$ is an increasing function for small values of $\varepsilon$. Neutrality
prevails over the ruggedness in a time series if $H(\varepsilon)$ is a decreasing function.
2) The subspaces are characterized by vast neutrality since the FEM $H(0)$ of each
subspace is significantly higher than $\log_6 2$.
3) The information functions $H(\varepsilon)$ of the functionality and output connectivity
subspaces increase as $\varepsilon$ increases from 0 to approximately 0.0157, while
$H(\varepsilon)$ of the internal connectivity subspace decreases as $\varepsilon$ increases.
4) By increasing the scale, the regularity of multiplier landscapes decreases, which is to be
expected since the truth table of these arithmetic functions grows at a higher than linear
rate. Consequently, by increasing the scale, the corresponding landscape
becomes continuous and perhaps easier for evolutionary search.
More gates -> more neutrality -> more scalability -> less regular.
From these points we conclude that the 3-bit multiplier with 21 gates is better
than the 3-bit multiplier with 24 gates: the multiplier with 24 gates is less regular,
which implies that its landscape is more rugged, and hence it will be more difficult to
search the fitness landscape of that multiplier.
5) The degree of regularity of digital circuit evolution landscapes is also related to the set of
logic functions of the gates. The circuit can easily be designed by a simple random
search when only XNOR gates are allowed.

Figure 6: Only XNOR gate Figure 7: Using gates {6, 9, 12, 15}
From the above graphs we can observe that the landscape obtained with only XNOR gates is more
regular than the one obtained with the other gates.

Chapter - 4 Fitness Distance Correlation

Fitness distance correlation (FDC) has been offered as a summary statistic with apparent success
in predicting the performance of genetic algorithms for global optimization. Fitness is a function
that declines with the number of switches between 0 and 1 along the bit string. The test function
is GA-easy, in that a GA using only single-point crossover can find the global optimum with a
sample on the order of $10^{-3}$ to $10^{-9}$ of the points in the search space, an efficiency
which increases with the size of the search space.

Jones (1995) and Jones and Forrest (1995) have proposed fitness distance correlation (FDC) as
a candidate property for predicting the performance of a genetic algorithm in global
optimization. In this approach, the Hamming distances between sets of bitstrings and the global
optimum bitstring are compared with their fitness. Large negative correlations between
Hamming distance and fitness are taken to be indicators that the system is easy to optimize with
a GA. Large positive correlations indicate the problem is misleading and selection will guide
the population away from the global maximum. Near-zero correlations indicate that the GA does
not have guidance toward or away from the optimum, and thus faces the same difficulties as
random search.

Fitness Distance Correlation (FDC) is able to:


1. Predict GA behaviour on a number of well-studied problems,
2. Illuminate problems whose behaviour had been seen as surprising using other analytical
frameworks,
3. Account for the performance of different problem encodings and representations.

These properties of FDC are both encouraging and alarming: encouraging since FDC
appears to work, but alarming since distance is defined without reference to the genetic
operators, the representation of the search space, or any of the dynamics of the genetic algorithm.
It is therefore suggested that a stronger predictor of GA optimization performance would be FDC
analysis using a distance measure based on the genetic operators themselves. The existing
results, however, come from Hamming-distance based FDC.
Hamming distance is strongly related to the mutation operator in classical bitstring genetic
algorithms. The number of times that the mutation operator must be applied to transform a given
string to the global optimum is monotonic with Hamming distance. However, it has been
classically argued that the main role of mutation in genetic algorithms is to prevent premature
convergence, and that recombination is the operator most important for GA performance.
Therefore, one interpretation of Jones's results would be that mutation is a much more important
determinant of GA performance than classically assumed, either generally or in the specific
examples he examined.

Another interpretation is that there is a deep relationship between Hamming distance and the
recombination operator. Recombination does not fit easily into an FDC framework, because it
involves pairs of bitstrings, so distance cannot be defined simply between individual bitstrings.
And the formation of pairs of bitstrings on which recombination operates depends on the
distribution of bitstrings at each generation of the GA; recombination can thus be considered to
be a frequency-dependent operator. Finally, recombination, including single-point crossover, can
create offspring that are a great Hamming distance from their parents and from each other. Thus
recombination would seem to destroy any relationship between Hamming distance and GA
dynamics.

FDC analysis consists of several conjectures:

1) If there is a large positive fitness distance correlation ($r \geq 0.15$), then the problem is
misleading and the GA will be led away from the global optimum;
2) If there is a large negative fitness distance correlation ($r \leq -0.15$), then the problem is
straightforward and the GA will find the global optimum with relatively good performance;
3) If the fitness distance correlation is near zero ($-0.15 < r < 0.15$), the prediction is
indeterminate:
(a) If the fitness-distance scatter plot shows no relationship between fitness and Hamming
distance, the problem is GA-difficult;
(b) Certain structures that appear in the scatter plot will indicate that the problem is
straightforward, or misleading, as the case may be.
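
The correlation coefficient r used in these conjectures can be computed directly from a sample of bitstrings. The sketch below is illustrative: it uses the number of 1 bits as a stand-in fitness function and enumerates all 6-bit strings, neither of which comes from the report. The names `hamming` and `fitness_distance_correlation` are hypothetical.

```python
import math
from itertools import product

def hamming(a, b):
    """Number of positions at which two bit strings differ."""
    return sum(x != y for x, y in zip(a, b))

def fitness_distance_correlation(samples, fitness, optimum):
    """Pearson correlation between fitness and distance to the optimum."""
    fitnesses = [fitness(s) for s in samples]
    distances = [hamming(s, optimum) for s in samples]
    n = len(samples)
    mean_f = sum(fitnesses) / n
    mean_d = sum(distances) / n
    covariance = sum((f - mean_f) * (d - mean_d)
                     for f, d in zip(fitnesses, distances)) / n
    std_f = math.sqrt(sum((f - mean_f) ** 2 for f in fitnesses) / n)
    std_d = math.sqrt(sum((d - mean_d) ** 2 for d in distances) / n)
    return covariance / (std_f * std_d)

samples = list(product([0, 1], repeat=6))
r = fitness_distance_correlation(samples, fitness=sum, optimum=(1,) * 6)
print(round(r, 6))  # -1.0
```

For this stand-in function the Hamming distance to the all-ones optimum is exactly the number of 0 bits, so fitness and distance are perfectly negatively correlated. Under the second conjecture that corresponds to a straightforward problem.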

Crossover Distance: Single-point crossover can transform any pair of complementary bitstrings
into any other pair of complementary bitstrings through its repeated application. The number of
crossovers needed to transform a complementary pair into the global optimum and its
complement can thus be used to define a crossover distance.
Distance Measure: To make a fitness function that is straightforward to optimize, we would like
each bitstring to have a path to the global optimum (through repeated application of crossover) in
which fitness is monotonically increasing. The expectation is that this will allow the genetic
algorithm to produce the next fittest bitstring along this path using crossover, amplify this
bitstring through selection, and subsequently produce the next fittest bitstring through crossover,
and so forth. The number of crossovers it takes to reach the global optimum can serve as the
distance measure. With this definition of distance, the fitness function should produce a large
negative fitness distance correlation coefficient.

FIGURE 8

Interpretation: In the above figure we can see a sequence of crossover events on
complementary bitstrings that produce a path to the optimal bitstring (set to be the bitstring of all
1s). We notice that as one moves farther from the optimum along this path, the number of
discontinuities between 0 and 1 increases by one with each step. So, we can let the number of
discontinuities be the candidate measure of distance. In order for selection to guide the search
along the path, the fitnesses need to be monotonically increasing along the path toward the
optimum. A fitness function which decreases with the number of discontinuities between 0s and
1s will have that property.
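
A minimal sketch of such a function is given below. The exact fitness function is not stated in this section, so the linear form used here is an assumption, and the names `discontinuities` and `discontinuity_fitness` are illustrative. The sketch also does not distinguish a bitstring from its complement, since both have the same number of discontinuities.

```python
def discontinuities(bits):
    """Count the switches between 0 and 1 along a bit string."""
    return sum(a != b for a, b in zip(bits, bits[1:]))

def discontinuity_fitness(bits):
    """Assumed fitness: decreases by one for every discontinuity."""
    return len(bits) - 1 - discontinuities(bits)

for bits in ["11111111", "11110000", "11001100", "10101010"]:
    print(bits, discontinuities(bits), discontinuity_fitness(bits))
```

The four strings have 0, 1, 3 and 7 discontinuities respectively, so their fitness falls from 7 to 0 as the number of discontinuities grows.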

Chapter 5 NK Fitness Landscape

NK fitness landscapes are stochastically generated fitness functions on bit strings, parameterized
(with N genes and K interactions between genes) so as to make them tunably rugged.
The specific fitness interaction is epistasis, where the effect on fitness from altering one gene
depends on the allelic state of other genes. Epistasis makes it possible for the population to
evolve toward different combinations of alleles, depending on its initial genetic composition.
Stuart Kauffman devised the NK fitness landscape model to explore the way that epistasis
controls the ruggedness of an adaptive landscape.

An NK landscape is defined by two parameters: N, the number of components in a system, and
K, the number of epistatic linkages between components. Each component has two possible
states (0 and 1) and makes a contribution toward the total fitness of the system that depends on
its own state and the states of the K components to which it is linked. There are $2^{K+1}$
combinations at each locus since each of the K + 1 linked components can be in one of two states.
The total fitness of any length-N bit-string is given by the average fitness contribution made by
each of its loci. Each point in a landscape can therefore be described by a length-N bit-string
with an associated height corresponding to the total fitness value.

Two complementary approaches have been used to explore the properties of a fitness landscape:
1) Empirical simulation, and
2) Mathematical analysis.

The simplest and most common empirical technique used to explore landscapes is an adaptive
walk performed by a hill-climbing algorithm. A hill-climber starts at an arbitrary point on the
landscape and progresses across the landscape until an optimum is reached, by changing the state
of a single component at each step such that the fitness at step t+1 is greater than the fitness at
step t. When a large number of adaptive walks are performed on a single landscape, an estimate
can be made of the number of local optima in the landscape and the probability of reaching them

from an arbitrary starting point. The set of starting points that can climb to a given optimum is
known as its basin of attraction. In general, a larger basin of attraction makes an optimum easier
to find.
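
An adaptive walk of this kind can be sketched as a simple hill-climber. The sketch is illustrative rather than the procedure used in the project: it picks one of the improving single-bit flips at random, and it uses the number of 1 bits as a stand-in fitness function, which has a single optimum. The name `adaptive_walk` is hypothetical.

```python
import random

def adaptive_walk(start, fitness, seed=0):
    """Climb by single-bit flips until no flip improves the fitness."""
    rng = random.Random(seed)
    current = list(start)
    while True:
        improving = []
        for i in range(len(current)):
            neighbour = current[:]
            neighbour[i] = 1 - neighbour[i]
            if fitness(neighbour) > fitness(current):
                improving.append(neighbour)
        if not improving:
            return current  # a local (possibly global) optimum
        current = rng.choice(improving)

peak = adaptive_walk([0, 1, 0, 0, 1, 0], fitness=sum)
print(peak)  # [1, 1, 1, 1, 1, 1]
```

Running the walk from many random starting points and counting how often each end point is reached gives an empirical estimate of the number of local optima and of the relative sizes of their basins of attraction.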

The landscapes generated using the NK model vary in size according to N and in correlation
according to K. For fixed N, when K = 0, there are no interactions between components, and the
resulting fitness landscape is smooth and contains a single global optimum. All points in the
landscape lie in the basin of attraction of the global optimum, making it trivially easy to find. As
the number of interactions between components increases, the number of local optima increases
and the resulting landscape becomes more 'rugged'. The basin of attraction of each optimum
(including the global optimum) decreases in size, making the landscape increasingly difficult to
search. When the landscape is 'maximally rugged' (i.e., when K = N - 1), the expected number of
optima is $2^N / (N + 1)$.
Furthermore, as K increases, the level of interaction between loci increases, until it is impossible
to maximise the fitness contributions of all loci simultaneously. The average heights of local
optima decrease as K increases. The highest fitness values are found on landscapes with a low
level of ruggedness (around K = 2). For K < 2, there are fewer potential fitness values for each
component. For K > 2, the increased interaction between components begins to overwhelm the
advantage of having a greater number of potential values at each locus and again, the average
height of the optima decreases.

Performance Measure:

$$W = \frac{1}{N} \sum_{i=1}^{N} w_i \qquad (7)$$

where $w_i$, the contribution of component i to the overall performance of the system, depends on
its own state and the states of K neighbouring components.

Example
N = 4 and K = 2

$$W(0, 1, 1, 0) = \tfrac{1}{4} \left[ w_0(0, 1, 1) + w_1(1, 1, 0) + w_2(1, 0, 0) + w_3(0, 0, 1) \right]$$
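
The performance measure and the worked example can be reproduced with a small NK landscape generator. The sketch assumes that the K neighbours of a locus are the next K loci, taken cyclically, which is the arrangement that matches the arguments of w0 to w3 in the example. The fitness contributions are drawn uniformly at random, and the name `make_nk_landscape` is illustrative.

```python
import random
from itertools import product

def make_nk_landscape(n, k, seed=0):
    """Return the fitness function W and its lookup tables."""
    rng = random.Random(seed)
    # One table per locus: a U(0, 1) value for each of the 2**(k + 1)
    # joint states of the locus and its k neighbours.
    tables = [{state: rng.random()
               for state in product((0, 1), repeat=k + 1)}
              for _ in range(n)]

    def total_fitness(bits):
        contributions = []
        for i in range(n):
            # Assumption: neighbours are the next k loci, cyclically.
            state = tuple(bits[(i + j) % n] for j in range(k + 1))
            contributions.append(tables[i][state])
        return sum(contributions) / n

    return total_fitness, tables

W, tables = make_nk_landscape(n=4, k=2)
bits = (0, 1, 1, 0)
expected = (tables[0][(0, 1, 1)] + tables[1][(1, 1, 0)]
            + tables[2][(1, 0, 0)] + tables[3][(0, 0, 1)]) / 4
print(abs(W(bits) - expected) < 1e-12)  # True
```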
5.1 Landscape Distribution

A total fitness, W, is the average of N fitness contributions, $w_i$, each independently
distributed as U(0, 1) random variables,

$$W = \frac{1}{N} \sum_{i=1}^{N} w_i$$

By the central limit theorem, the total fitness, W, converges in distribution to a standard normal
random variable when appropriately scaled:

$$\frac{W - 1/2}{\sqrt{1/(12N)}} \xrightarrow{D} N(0, 1) \quad \text{as } N \to \infty \qquad (8)$$

For suitably large values of N, we can therefore approximate the total fitness of any single point
on the landscape using a random variable with a Gaussian distribution, with mean 1/2 and
variance 1/ (12N).
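
The mean and variance quoted here follow from the moments of a U(0, 1) random variable and the independence of the fitness contributions. A short worked derivation, using the same symbols as the text:

```latex
\begin{align*}
E[w_i] &= \int_0^1 x \, dx = \tfrac{1}{2}, &
\operatorname{Var}(w_i) &= \int_0^1 x^2 \, dx - \tfrac{1}{4} = \tfrac{1}{12}, \\
E[W] &= \frac{1}{N} \sum_{i=1}^{N} E[w_i] = \tfrac{1}{2}, &
\operatorname{Var}(W) &= \frac{1}{N^2} \sum_{i=1}^{N} \operatorname{Var}(w_i) = \frac{1}{12N}.
\end{align*}
```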

Epistatic interactions between loci are a critical feature of NK models, and have the effect of
inducing correlations between the fitnesses of different points on a landscape. When the level of
epistasis, K, is low, each possible fitness contribution appears in a large proportion of the total
fitnesses, up to half, when K = 0. Thus, any fitness has a non-zero correlation with up to half of
the other fitnesses in a landscape. The greater the level of interaction, K, the lesser the correlation
between two nearby fitnesses, such that when K = N - 1, there is no correlation between any
collection of fitnesses, and in fact the $2^N$ fitnesses are all independent of each other. All
correlations must be positive because dependencies between total fitnesses are due to their
sharing some of the same fitness contributions. Thus, if N is sufficiently large and K = N - 1,
then the landscape consists of $2^N$ independent random variables, each with mean 0.5 and
variance 1/(12N). Because this variance is proportional to 1/N, as N increases the variance
of a fitness value decreases, and the landscape becomes flatter as values are clustered more
closely around the mean value of 0.5. For K < N - 1, the fitnesses are distributed over the entire
landscape.

Suppose that Wmax(K) and Wmin(K) are the global maximum and minimum, respectively, for
an NK landscape with parameter K defining the level of epistasis. For K = N - 1, the global
maximum, Wmax(N - 1), is the largest of 2^N independent random quantities. Similarly,
the global minimum, Wmin(N - 1), is the smallest of the 2^N independent values.
Because these values are independent and identically distributed, the probability density
functions of Wmax(N - 1) and Wmin(N - 1) are

$$f_{W_{\max}(N-1)}(z) = 2^N f_W(z) \, (F_W(z))^{2^N - 1} \tag{9}$$

$$f_{W_{\min}(N-1)}(z) = 2^N f_W(z) \, (1 - F_W(z))^{2^N - 1} \tag{10}$$

where fW(z) is the probability density function of an individual fitness, W, and FW(z) is the
cumulative distribution function of W.
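
As an illustration, the two densities can be evaluated numerically under the Gaussian approximation for a single fitness (mean 1/2, variance 1/(12N)). The Python sketch below is only a check on the formulas for the largest and smallest of 2^N independent values; the function names and the trapezoidal integration helper are assumptions made for this example.

```python
import math

def fitness_pdf(z, n):
    """Gaussian approximation to the density of one fitness."""
    var = 1.0 / (12 * n)
    return math.exp(-((z - 0.5) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def fitness_cdf(z, n):
    """Gaussian approximation to the distribution function of one fitness."""
    var = 1.0 / (12 * n)
    return 0.5 * (1.0 + math.erf((z - 0.5) / math.sqrt(2 * var)))

def max_density(z, n):
    """Density of the largest of 2**n independent fitnesses."""
    m = 2 ** n
    return m * fitness_pdf(z, n) * fitness_cdf(z, n) ** (m - 1)

def min_density(z, n):
    """Density of the smallest of 2**n independent fitnesses."""
    m = 2 ** n
    return m * fitness_pdf(z, n) * (1.0 - fitness_cdf(z, n)) ** (m - 1)

def integrate(func, a, b, steps=4000):
    """Trapezoidal rule on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for i in range(1, steps):
        total += func(a + i * h)
    return total * h

# A density must integrate to (approximately) one.
area_under_max = integrate(lambda z: max_density(z, 8), -0.5, 1.5)
```

Because the Gaussian approximation is symmetric about 1/2, min_density(z, n) equals max_density(1 - z, n) up to rounding.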

FIGURE 9

Interpretation: The above figure shows Gaussian approximations to the probability density
functions of Wmax(N - 1) and Wmax(0) for N = 8, that is, for K = N - 1 and for K = 0. The
global maximum for K = N - 1 clearly has a higher mean (and a lower variance) than the
global maximum for K = 0.

FIGURE 10

Interpretation: The above figure gives the distributions of global maxima for N = 8 and
K = 0, 1, ..., 7, and indicates that the global maximum is indeed stochastically increasing in K.

Points for Graphical Interpretation


1) The global maximum for a landscape with K = N - 1 is strictly stochastically greater than
the global maximum of any landscape with K < N - 1.

2) Similarly, the global minimum of a landscape with K = N - 1 is strictly stochastically less
than the global minimum of any landscape with K < N - 1.

3) The mean of the global maximum is largest, and the mean of the global minimum
smallest, in the maximally rugged case, K = N - 1.

4) The global maximum is stochastically non-decreasing in K for fixed N.

Evans and Steinsaltz derive the limiting value of the global maximum for K = 1 in the
case where the fitness contributions have the exponential distribution but, more importantly, also
demonstrate that such limiting values exist for all K for general distributions of fitness
contributions. Since the global maximum converges in probability to a constant for fixed K, the
variance of the global maximum goes to zero as N gets large, and so the constant is the limiting
value of the mode of the global maximum. For K = N - 1, the mode of the global maximum is the
critical point, z*, in the range (0, 1) of the density function

$$f_{W_{\max}(N-1)}(z) = 2^N f_W(z) \, (F_W(z))^{2^N - 1}$$


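Using the fact that lim_{N → ∞} FW(z*) = 1 for z* > 0.5, the above equation can be solved to find

$$z^* = \frac{1}{2} + \sqrt{\frac{1}{12N} \, \omega\!\left(\frac{(2^N - 1)^2}{2\pi}\right)} \tag{11}$$

where ω is Lambert's W function.

Thus, as N → ∞,

$$z^* \to \frac{1}{2} + \sqrt{\frac{\ln 2}{6}} \approx 0.84 \tag{12}$$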
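
The limiting value can be checked numerically. The Python sketch below evaluates z* = 1/2 + sqrt(ω((2^N - 1)^2 / (2π)) / (12N)), where ω is Lambert's W function, and compares it with 1/2 + sqrt(ln 2 / 6). Lambert's W is computed here with a simple Newton iteration on the logarithmic form w + ln w = ln x, which is an implementation choice for this example (valid for x >= 1, that is, N >= 2) and not part of the original derivation.

```python
import math

def lambert_w_from_log(log_x, tol=1e-12):
    """Solve w * exp(w) = x for w > 0, given log_x = ln(x) with x >= 1."""
    if log_x < 0:
        raise ValueError("this sketch only handles x >= 1")
    w = max(log_x, 1.0)  # starting guess
    for _ in range(100):
        # Newton step on g(w) = w + ln(w) - ln(x), with g'(w) = 1 + 1/w.
        step = (w + math.log(w) - log_x) / (1.0 + 1.0 / w)
        w -= step
        if abs(step) < tol:
            break
    return w

def mode_of_global_max(n):
    """Approximate mode z* of the global maximum for K = N - 1 (requires n >= 2)."""
    # Work with ln(x) directly so that large n does not overflow a float.
    log_x = 2.0 * math.log(2 ** n - 1) - math.log(2.0 * math.pi)
    return 0.5 + math.sqrt(lambert_w_from_log(log_x) / (12.0 * n))

limit = 0.5 + math.sqrt(math.log(2.0) / 6.0)   # limiting value as N grows
```

Evaluating mode_of_global_max for increasing N shows that the approach to the limit is slow: the mode rises towards the limit from below.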

As N → ∞, the variance of the distribution fWmax(N-1) decreases while the peak of the distribution
approaches z*, so that in the maximally rugged case the highest fitness value is increasingly
likely to be close to 0.84 as N gets large. The (strong) law of large numbers implies that the
average of N independent random numbers, uniformly distributed on [0, 1], converges to 0.5 as
N → ∞. However, the maximum of 2^N such averages does not converge to 0.5; rather, it
approaches about 0.84.
Implications for Search

On average, hill-climbers on an NK landscape find their highest values for low values of K
(typically K = 2 to K = 4). As K increases, the number of local optima increases while their
average height and the size of their basins of attraction decrease, including the basin of the global
optimum. With increasing ruggedness the global optimum becomes exceedingly difficult to find
using evolutionary search algorithms. At the same time, the expected height of this global
optimum increases with K, and is largest when K = N - 1.
The basins of attraction scale in direct proportion to the relative height of the optima. For K = 0
there is only one optimum, with a basin of size 2^N. As K increases, the number of optima
increases, up to an expected 2^N / (N + 1) optima when K = N - 1. Thus, about one of every N + 1
fitnesses is locally optimal, and these local optima are dispersed throughout the entire landscape
(the global optimum is just one of them).
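
The two extreme cases can be illustrated by counting local optima on small random landscapes. The Python sketch below assumes a circular adjacent neighbourhood (locus i reads bits i to i + K) and treats a genotype as locally optimal when it is strictly fitter than all N of its one-bit neighbours. The helper names are illustrative only.

```python
import itertools
import random

def build_landscape(n, k, rng):
    """Map every genotype (tuple of n bits) to its NK fitness."""
    tables = [dict() for _ in range(n)]   # contribution tables, filled on demand

    def contribution(i, bits):
        if bits not in tables[i]:
            tables[i][bits] = rng.random()
        return tables[i][bits]

    landscape = {}
    for g in itertools.product((0, 1), repeat=n):
        parts = (contribution(i, tuple(g[(i + j) % n] for j in range(k + 1)))
                 for i in range(n))
        landscape[g] = sum(parts) / n
    return landscape

def neighbours(g):
    """All genotypes that differ from g in exactly one bit."""
    for i in range(len(g)):
        yield g[:i] + (1 - g[i],) + g[i + 1:]

def count_local_optima(landscape):
    """Number of genotypes strictly fitter than every one-bit neighbour."""
    return sum(all(f > landscape[h] for h in neighbours(g))
               for g, f in landscape.items())

rng = random.Random(2)   # fixed seed so that the run is repeatable
N = 8
smooth_count = count_local_optima(build_landscape(N, 0, rng))
rugged_count = count_local_optima(build_landscape(N, N - 1, rng))
expected_rugged = 2 ** N / (N + 1)   # about 28.4 for N = 8
```

A single rugged landscape gives a count that fluctuates around expected_rugged; averaging over several random landscapes brings the figure closer to it.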

FIGURE 11

Interpretation: From the above figure we can infer that as K increases, the global optimum
increases (higher curve) while the average fitness of the local optima decreases (lower curve). The
average optima actually found by hill-climbing tend to be larger than the average of all local
optima because larger local optima tend to have larger basins of attraction. For N = 8, the largest
expected optima are found when K = 2.

The height of the global optimum increases with K, while the average height of the local optima
decreases after about K = 2. The average optimum found by a single hill-climber attains its highest
value when K = 2; as the ruggedness increases beyond this, the average fitness of the optima
found decreases. The expected height of the global optimum, in contrast, keeps increasing with K
and achieves its maximum value when K = N - 1. As a result, higher fitnesses can be found in more
rugged landscapes, but only with a radically increased search effort. In particular, when
K = N - 1, the independence of all fitnesses means that there is no guiding structure in the
landscape, and an exhaustive search is required to find the global optimum. Thus, there is a higher
payoff to be had by optimising over more rugged landscapes, provided they can be searched.
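
The trade-off described above can be explored with a small experiment. The following Python sketch is not the project's MATLAB code; it assumes a circular adjacent neighbourhood, encodes each genotype as an integer whose bit i is locus i, and uses steepest-ascent hill-climbing with one-bit flips. It prints, for each K, the average fitness reached by 50 climbs next to the true global maximum.

```python
import random

def build_landscape(n, k, rng):
    """Fitness of every genotype 0 .. 2**n - 1 (bit i of the integer is locus i)."""
    tables = [[rng.random() for _ in range(2 ** (k + 1))] for _ in range(n)]
    fitness = []
    for g in range(2 ** n):
        total = 0.0
        for i in range(n):
            # Pack bits i, i+1, ..., i+k (circular) into a table index.
            index = 0
            for j in range(k + 1):
                index = (index << 1) | ((g >> ((i + j) % n)) & 1)
            total += tables[i][index]
        fitness.append(total / n)
    return fitness

def hill_climb(fitness, n, start):
    """Steepest ascent over one-bit flips; returns the local optimum reached."""
    current = start
    while True:
        best = max((current ^ (1 << i) for i in range(n)), key=lambda g: fitness[g])
        if fitness[best] <= fitness[current]:
            return current
        current = best

rng = random.Random(3)   # fixed seed so that the run is repeatable
N = 8
for K in (0, 2, N - 1):
    fitness = build_landscape(N, K, rng)
    found = [fitness[hill_climb(fitness, N, rng.randrange(2 ** N))] for _ in range(50)]
    print(K, round(sum(found) / len(found), 3), round(max(fitness), 3))
```

Results from a single landscape per K are noisy, so averaging over many random landscapes is needed before comparing with the curves discussed in the text.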

The positive correlation between fitnesses for K < N - 1 means that a single point from a low-K
landscape is likely to be more representative of the landscape from which it is sampled than a
similarly sampled point from a high-K landscape. We therefore conclude that the very highest
peaks are found in the most rugged landscapes, scattered among the masses of low-lying peaks.

Finally, we implemented the NK landscape model in MATLAB to plot the fitness landscape
curves.

Output of MATLAB Codes:

[Figure: Landscape Fitness Curve (axis: Fitness)]

[Figure: Landscape Fitness at K = 0 and K = 9 (axis: Fitness)]

[Figure: NK Fitness Landscape for different values of N and K (axis: Fitness)]

Chapter 6: Circuit Evolution and Structure of Landscape

Digital circuits are evolved using an evolutionary algorithm. The genotypes of a population of
digital electronic circuits are initially generated at random. The fitness value of each genotype is
evaluated by calculating the percentage of total correct outputs of the encoded electronic circuit
in response to appropriate inputs.
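
A minimal Python sketch of this fitness measure is given below. It assumes that a circuit can be modelled as a function from input bits to a tuple of output bits, and it uses a half adder as a purely illustrative target; neither the function names nor the example circuit come from the project.

```python
import itertools

def percent_correct(candidate, target, n_inputs):
    """Percentage of output bits that match the target over the whole truth table."""
    correct = 0
    total = 0
    for inputs in itertools.product((0, 1), repeat=n_inputs):
        for got, wanted in zip(candidate(*inputs), target(*inputs)):
            correct += (got == wanted)
            total += 1
    return 100.0 * correct / total

def half_adder(a, b):
    """Target behaviour: (sum, carry)."""
    return (a ^ b, a & b)

def faulty_half_adder(a, b):
    """Candidate circuit whose carry output uses OR instead of AND."""
    return (a ^ b, a | b)

score = percent_correct(faulty_half_adder, half_adder, 2)   # 6 of 8 output bits match
```

The faulty candidate produces a wrong carry bit for the inputs (0, 1) and (1, 0), so it scores 75 percent, while a candidate identical to the target scores 100 percent.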
The encoding of a digital combinational circuit into a genotype is described by a computational
model called Cartesian Genetic Programming (CGP). To encode a digital electronic circuit
into a genotype, a genotype/phenotype mapping is defined over a rectangular array of cells,
each of which is an atomic two-input logic gate or a multiplexer. The genotype is a linear
string of integers, and it consists of two different types of genes, which are responsible for the
functionality and the routing of the evolved array.
The genotype is characterized by four parameters of the array of cells: the number of allowed
logic functions, the number of rows, the number of columns and levels-back. The numbers of rows
and columns are merely the dimensions of the rectangular array, while levels-back controls the
internal connectivity: it determines how many columns of cells to the left of a particular cell may
have their outputs connected to the inputs of that cell. The cells and outputs are maximally
connectable when the number of rows is one and levels-back is equal to the number of columns.
If, however, the number of rows is one and levels-back is one, then each cell must be connected
to its immediate neighbour on the left.
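
The effect of the levels-back parameter can be shown with a few lines of Python. The sketch assumes 0-based column numbers and considers only cell-to-cell connections; the helper name connectable_columns is hypothetical.

```python
def connectable_columns(column, levels_back):
    """Columns (0-based) whose cell outputs may feed a cell in `column`."""
    first = max(0, column - levels_back)
    return list(range(first, column))

# With levels-back = 1, a cell in column 3 sees only column 2.
nearest_only = connectable_columns(3, 1)
# With levels-back equal to the number of columns (here 4), it sees every earlier column.
fully_connectable = connectable_columns(3, 4)
```

A cell in the first column has no earlier columns to draw on, so the function returns an empty list for it.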

FIGURE 12

The nI primary circuit inputs X1, X2, ..., XnI are allowed to be connected to the input of any cell
or to any of the nO primary circuit outputs Y1, Y2, ..., YnO. The cells cij may implement any of the
binary functions.

6.1 Circuit Representation


1) A digital circuit is represented as a matrix of interconnected cells. A gate along with its
inputs is known as a cell. Cells describe logic operations at gate level.
Format of a cell: {Input 1, ..., Input n, Gate Type}

Input n is an input to the gate from the immediately preceding column, specified by the row
number within that column. If the cell is in the first column, then Input n is given a
negative number to specify an external input.
Example:
{-1, -2, 3} The cell is in the first column and directly receives external inputs.
Gate Type is a digital gate or a wire. A look-up table is used to identify the gate in
programming languages.

2) The matrix representation of a circuit can be transformed into a linear format by writing out
each cell in turn (starting from the first row of the first column and traversing each column from
top to bottom). This representation is known as the chromosome in GA terminology.
The atomic building blocks of chromosomes are known as genes. Genes represent the various
attributes of a chromosome. Different permutations of genes generate different circuits with
different behaviour (truth table, delay, power) and different topology (area, gates, etc.).
The representation in chromosome format is also known as the genotypic representation,
and the schematic representation of the circuit as the phenotypic representation.

3) Difference between matrix format and chromosome representation:

Both formats are indispensable for an efficient GA. The matrix format is helpful for
various mathematical and logical operations on the circuit, including generating its
truth table. The chromosome format is used for implementing the GA operators
(crossover, mutation, etc.).
Each format is easily converted into the other in programming languages such as MATLAB,
Mathematica, and C.

4) In order to simplify our GA, we further break the matrix representation of the
circuit into three different matrices, as described below:
a) Matrix describing the first input of each cell.
b) Matrix describing the second input of each cell.
c) Matrix describing the gate type of each cell using the look-up table.

5) An important point to notice in this representation of combinational circuits is that cells
in any column can take their inputs from the immediately preceding column only. This is to
maintain uniformity and reduce complexity in the circuit representation.
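
The conversions described in points 2) to 4) can be sketched in Python as follows. The sketch assumes two-input cells of the form (input 1, input 2, gate type), stored in a rows x columns matrix and linearised column by column from top to bottom. The function names and the small 2 x 2 example circuit are illustrative assumptions, not taken from the project.

```python
def matrix_to_chromosome(cells):
    """Flatten a rows x cols matrix of cells, column by column, top to bottom."""
    rows, cols = len(cells), len(cells[0])
    chromosome = []
    for c in range(cols):
        for r in range(rows):
            chromosome.extend(cells[r][c])
    return chromosome

def chromosome_to_matrix(chromosome, rows, cols, genes_per_cell=3):
    """Inverse of matrix_to_chromosome."""
    cells = [[None] * cols for _ in range(rows)]
    for index in range(rows * cols):
        c, r = divmod(index, rows)          # cells are stored column by column
        start = index * genes_per_cell
        cells[r][c] = tuple(chromosome[start:start + genes_per_cell])
    return cells

def split_matrix(cells):
    """Three matrices: first inputs, second inputs and gate types."""
    first = [[cell[0] for cell in row] for row in cells]
    second = [[cell[1] for cell in row] for row in cells]
    gates = [[cell[2] for cell in row] for row in cells]
    return first, second, gates

# Hypothetical 2 x 2 circuit: negative inputs are external, gate codes 10 = XOR, 6 = AND.
example = [[(-1, -2, 10), (1, 2, 6)],
           [(-2, -3, 6), (2, 1, 10)]]
chromosome = matrix_to_chromosome(example)
first_inputs, second_inputs, gate_types = split_matrix(example)
```

Converting the chromosome back with chromosome_to_matrix(chromosome, 2, 2) recovers the original matrix, which is the property the GA relies on when it switches between the two formats.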

Look-Up Table for Digital Logic Gates

Gate Type     Integer Representation
XOR Gate      10
AND Gate      6
MUX Gate      16

[Figure: Circuit Representation]

[Figure: Linear Indexing of Matrix]

Chromosome Representation:

{-0, -1, -0, 10, -0, -0, -2, 6, 3, 2, 1, 10, 0, 2, 3, 16}


{CELL 1, CELL 2, CELL 3, CELL 4}

Cell       Genes       Values
CELL 1     1 to 4      -0, -1, -0, 10
CELL 2     5 to 8      -0, -0, -2, 6
CELL 3     9 to 12     3, 2, 1, 10
CELL 4     13 to 16    0, 2, 3, 16
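
The grouping of genes into cells can be reproduced with a short Python sketch. It assumes four genes per cell (three inputs followed by the gate type), as the grouping above implies, and it keeps the genes as text so that the external input "-0" stays distinct from the cell reference "0". The function name and the output format are illustrative.

```python
GATE_NAMES = {10: "XOR", 6: "AND", 16: "MUX"}   # look-up table from the report

def decode_chromosome(text, genes_per_cell=4):
    """Split a chromosome string into cells with labelled inputs and a gate name."""
    tokens = [t.strip() for t in text.strip("{} ").split(",")]
    cells = []
    for start in range(0, len(tokens), genes_per_cell):
        *inputs, gate = tokens[start:start + genes_per_cell]
        cells.append({
            # A leading minus sign marks an external input; otherwise a cell reference.
            "inputs": [("external" if t.startswith("-") else "cell", int(t.lstrip("-")))
                       for t in inputs],
            "gate": GATE_NAMES[int(gate)],
        })
    return cells

cells = decode_chromosome("{-0, -1, -0, 10, -0, -0, -2, 6, 3, 2, 1, 10, 0, 2, 3, 16}")
gate_sequence = [cell["gate"] for cell in cells]
```

For the chromosome above this yields four cells, and gate_sequence lists the gate types in cell order: XOR, AND, XOR, MUX.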

Chapter 7: Future Work Proposed

So far, we have studied the following:
1) Genetic algorithms are used as the evolutionary strategy for digital circuit evolution.
2) Fitness landscape analysis is an effective way to examine the genotypes reachable using the
one-point mutation operator.
3) We have performed landscape analysis on two circuits for a 3-bit multiplier, using 21 gates
and 24 gates, and concluded which one is fitter using the landscape characteristics.
4) We have implemented the landscape fitness curve in MATLAB.

In future, we plan to implement our research work in hardware and to realise a circuit with
better performance, selected on the basis of the landscape fitness results.

References
1. Julian F. Miller, Dominic Job, Vesselin K. Vassilev, Principles in the Evolutionary
Design of Digital Circuits - Part I, 1999.

2. Julian F. Miller, Dominic Job, Vesselin K. Vassilev, Principles in the Evolutionary
Design of Digital Circuits - Part II, 1999.

3. Pinaki Mazumder, Elizabeth M. Rudnick, Genetic Algorithms for VLSI Design, Layout
and Test Automation, pp. 2-35.

4. Lee Altenberg, Fitness Distance Correlation Analysis: An Instructive Counterexample,
Hawaii Institute of Geophysics and Planetology, University of Hawaii at Manoa, 1997.

5. Lee Altenberg, NK Fitness Landscapes, Hawaii Institute of Geophysics and
Planetology, University of Hawaii at Manoa, November 27, 1997.

6. The Kauffman NK Model - A Stochastic Combinatorial Optimization Model for
Complex Systems (ppt).

7. Benjamin Skellett, Benjamin Cairns, Bradley Tonkes, Nicholas Geard, and Janet Wiles,
Rugged NK Landscapes Contain the Highest Peaks.
