Genetic Algorithms

Dr. Hamid Nemati

Table of Contents
History
General Description - What are GAs and when should they be used?
Genetic Algorithms vs. Traditional Methods
    Calculus-Based Search
    Dynamic Programming
    Random Search
    Gradient Methods
    Combinations of Traditional Methods
    Simulated Annealing
    Genetic Algorithms - The Answer?
A Step by Step Explanation of the GA Heuristic
Theory in Practice
    Function Definition
    Gene Definition
    Mutation
        Mutation Type
        Mutation Characteristics
    Exit Conditions
Generator - An example using Generator with Problems defined in Excel
Bibliography

History
Genetic Algorithms were first developed by computer scientist John Holland in the 1970s as an experiment to see whether computer programs could evolve in the Darwinian sense. The approach is based on the theory of natural selection: it takes a population of 'solutions' to a problem and uses them to 'breed' new solutions that inherit the best 'genes', or characteristics, of their parents. Instead of being selected for their ability to survive, parent solutions are allowed to mate if they are among the best in the population at solving the problem. Since a microcomputer can cycle through a generation in a split second, millions of generations of good breeding can be compacted into a short period of time, and the best offspring can be chosen as the solution to the problem.

One of the first major commercial applications was General Electric's computer-aided design (CAD) system EnGENEous. This system was designed to be a domain-independent tool, combining the speedy local search of a number of traditional (and local) numerical optimization tools, the convenience of expert systems for specifying design constraints and control information, and the more global perspective of a genetic algorithm. The hybrid (or "interdigitized" in GE lingo) system can be interfaced to coordinate the activities of one or more domain-specific simulation or modeling codes, things as diverse as finite-element models, computational fluid dynamics codes, and discrete-event simulators. (Goldberg)

Genetic Algorithms are beginning to be used widely to handle a diverse range of applications. To give you an idea, I will briefly describe a few.

GE's EnGENEous, described above, is continually undergoing improvement. The gas turbine application has gone on to make the new Boeing 777 jet engine more efficient, and applications in electric-utility planning, hydroelectric generator design, and steam turbine design have been paying off handsomely.

A system called Faceprints, developed in the psychology department of New Mexico State University (NMSU), is being used to help police identify suspects. The NMSU system taps into the mind's eye by having a GA generate 20 faces on a computer screen. The witness rates each face on a 10-point subjective scale, and the GA takes that information and, through the normal selection, crossover, and mutation operators, generates additional faces. The faces are generated from an underlying binary chromosome that maps subcodes for each of five facial features (mouth, hair, eyes, nose, chin) into their pictorial representation, and the picture is assembled and displayed. Early experiments with the system attempted to define beauty from questionnaire-style input.

A startup in Santa Fe, N.M., called the Prediction Company, has developed a set of time-series prediction and trading tools for currency trading in which GAs play an important role. Tests with the Prediction Company's technique demonstrated ratios as good as the best of the known currency traders.

Interoffice fiber-optic networks are already a big business at US WEST, but an evolutionary algorithm developed in the Operations Research Modeling group promises to make network additions faster and cheaper. The tool was first tried in May 1992, and network design time has been cut from two person-months to roughly two person-days. Cost savings are estimated in the range from $1 million to $10 million per design, and with 20 designs required over the next six to eight years, total savings could top the $100 million mark.

General Description - What are GAs and when should they be used?
A genetic algorithm is a heuristic technique used to solve optimization problems. Optimization problems attempt to find the best solution(s) for a given problem that has several parameters (goals or resources) with associated constraints. The most basic tools for solving optimization problems are complete enumeration of all possible choices, calculus, and linear optimization using the simplex algorithm, as implemented in tools such as Lindo or Excel's Solver. However, as I will discuss in the next section, these traditional methods break down when the problem gets very large or complicated. Some of the same issues that affect these tools affect genetic algorithms, but for the most part, GAs are far more robust at handling very complex and non-linear problems.

For example, a problem involving selecting the best shipping route for a company that needs to make ten shipments may be solved using an 'exhaustive search' technique. The problem could be defined in the computer, which could go through the 3.6 million different orderings of the ten cities in a reasonable amount of time. However, add many more cities, and you have a problem with what is known as 'combinatorial explosion.' To give you an idea of how unmanageable such a problem can get, review the chart below, which lists the number of permutations associated with up to 25 elements. Problems of this kind, of which the traveling salesman problem is the classic example, are NP-hard (NP stands for non-deterministic polynomial time): no algorithm is known that solves them exactly in time that grows only polynomially with the number of elements.

Elements   Permutations
1          1
2          2
3          6
4          24
5          120
6          720
7          5,040
8          40,320
9          362,880
10         3,628,800
11         39,916,800
12         479,001,600
13         6,227,020,800
14         87,178,291,200
15         1,307,674,368,000
16         20,922,789,888,000
17         355,687,428,096,000
18         6,402,373,705,728,000
19         121,645,100,408,832,000
20         2,432,902,008,176,640,000
21         51,090,942,171,709,440,000
22         1,124,000,727,777,607,680,000
23         25,852,016,738,884,976,640,000
24         620,448,401,733,239,439,360,000
25         15,511,210,043,330,985,984,000,000

A genetic algorithm is a heuristic technique that avoids complete enumeration of the solution space by using 'rules of thumb' to find a good solution. What this means is that it does not guarantee the very best solution. Below are some of the benefits and disadvantages of using GAs.

Benefits

- The solution time is very predictable, and is not radically affected as the problem gets larger.
- GAs handle non-linear and discontinuous functions as well as linear and continuous ones.
- You need only be able to describe a good solution; you do not need to know how to build it. Thus, a GA does not require heavy use of expert knowledge.
- GAs can produce novel results among a set of good solutions. "I would have never thought of that one!"
- GAs tend to be compact, containing only the fitness function and a little code to handle the GA functions.
- GAs can usually be embedded easily, and are easy to hybridize.

Disadvantages

- A GA is a heuristic: it does not guarantee the optimal solution.
- Since GAs only drive toward the optimal solution using the fitness function, they offer no explanation of how one might logically arrive at the solution.
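To make the exhaustive search and the combinatorial explosion described earlier in this section concrete, here is a minimal Python sketch. The distance matrix `DIST` and the function names are hypothetical and invented for illustration; the point is only that enumerating every ordering is trivial for a handful of stops and hopeless for a few dozen.

```python
from itertools import permutations
from math import factorial

def route_length(route, dist):
    """Total length of visiting the stops in the given order (open path)."""
    return sum(dist[a][b] for a, b in zip(route, route[1:]))

def exhaustive_best_route(dist):
    """Try every ordering of the stops and keep the shortest one."""
    stops = range(len(dist))
    return min(permutations(stops), key=lambda r: route_length(r, dist))

# Hypothetical symmetric distance matrix for 4 stops (made-up numbers).
DIST = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]

best = exhaustive_best_route(DIST)
print(best, route_length(best, DIST))

# The number of orderings grows as n!, so enumeration quickly stops scaling.
for n in (4, 10, 15, 20):
    print(n, factorial(n))
```

With four stops the search inspects 24 orderings; with ten it already inspects 3,628,800, the "3.6 million" quoted in the text.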

Genetic Algorithms vs. Traditional Methods


So how do GAs stack up against traditional methods? Below I cover calculus-based search, dynamic programming, random search, gradient methods, combinations of these traditional approaches, and simulated annealing.

Calculus-Based Search
Calculus-based search has two main disadvantages. First, the search tends to get trapped on a local maximum: although a better solution may exist, every move away from the local maximum seems to decrease the fitness of the solution. GAs jump around the solution space, and so are not as prone to fall into this trap. Second, such searches depend on the existence of derivatives, which require continuous functions, and placing constraints on an optimization problem tends to create a discontinuous solution space.

Dynamic Programming
This is a method for solving multi-step control problems, but it can only be used where the overall fitness function is the sum of the fitness functions for each stage of the problem. Because the stages are assumed not to interact, the process cannot use prior stages to improve the solution.

Random Search
This was the traditional approach to solving difficult functions where complete enumeration was prohibitively time-consuming. Without a mechanism to evaluate the solutions and systematically improve them, this method is largely hit or miss, and can take an unpredictable amount of time to produce acceptable results.

Gradient Methods
Commonly referred to as hill-climbing methods, these perform well on functions with only one peak. However, on functions with many peaks, they also suffer from the trap of local maxima.

Combinations of Traditional Methods


Random search and gradient search may be combined to give an "iterated hill-climbing" search. Once one peak has been located, the hill climb is started again from another randomly chosen starting point. However, since each random trial is performed in isolation, no overall picture of the shape of the domain is obtained, and there is no method for eliminating trials with a low probability of success. No attempt is made to improve good solutions, and all solutions are treated identically.
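The sketch below illustrates both the local-maximum trap of a single hill climb and the iterated variant just described. It is a minimal Python illustration, not a reference implementation: the test function `f`, the fixed step size, and the number of restarts are all assumptions chosen for the example. On the interval [0, 10] this `f` has a lower peak near x = 1.67 and a higher one near x = 7.95.

```python
import math
import random

def f(x):
    """Hypothetical two-peak test function on [0, 10]."""
    return math.sin(x) + 0.1 * x

def hill_climb(f, x, lo, hi, step=0.01):
    """Greedy ascent: move to the better neighbour until neither improves."""
    while True:
        candidates = [c for c in (x - step, x + step) if lo <= c <= hi]
        best = max(candidates, key=f)
        if f(best) <= f(x):
            return x            # no neighbour is better: a (local) peak
        x = best

def iterated_hill_climb(f, lo, hi, restarts=20, seed=0):
    """Restart the climb from random points and keep the best peak found."""
    rng = random.Random(seed)
    starts = [rng.uniform(lo, hi) for _ in range(restarts)]
    return max((hill_climb(f, s, lo, hi) for s in starts), key=f)

single = hill_climb(f, 0.5, 0.0, 10.0)          # starts in the lower basin
restarted = iterated_hill_climb(f, 0.0, 10.0)
print(round(single, 2), round(restarted, 2))
```

A climb that starts at 0.5 stops on the lower peak, because every step away from it reduces `f`. The restarts find the higher peak only because some starting point happens to land in its basin. Each climb learns nothing from the others, which is the weakness the paragraph above points out.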

Simulated Annealing
Simulated annealing was introduced by Kirkpatrick and colleagues in the early 1980s, and is essentially a modified version of the random search and hill-climbing combination. Starting from a random point in the search space, a random move is made. If this move takes us to a higher point, it is accepted; otherwise it is accepted only with probability p(t), where t is time. The function p(t) begins close to 1 but gradually reduces towards zero, in analogy with the cooling of a solid. Initially, therefore, almost any move is accepted, but as the "temperature" falls, the probability of accepting a negative move is lowered. Negative moves are essential if local maxima are to be escaped, but too many negative moves would lead the search away from the maxima. This method basically introduced the idea of 'momentum' to allow the search to escape local maxima.
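The acceptance rule described above can be sketched in a few lines of Python. This follows the simplified description in the text, where the acceptance probability p(t) depends only on time. The linear schedule, the test function `f`, the move size, and the bookkeeping of the best point visited are all assumptions made for the example. Many practical implementations also make the probability depend on how much worse the move is, which this sketch deliberately leaves out.

```python
import math
import random

def f(x):
    """Hypothetical two-peak test function; the higher peak is near x = 7.95."""
    return math.sin(x) + 0.1 * x

def accept_probability(t, total_steps, p0=0.95):
    """p(t): starts close to 1 and falls linearly to 0 (assumed schedule)."""
    return p0 * (1.0 - t / total_steps)

def simulated_annealing(f, lo, hi, start=None, total_steps=5000,
                        max_move=0.5, seed=1):
    rng = random.Random(seed)
    x = rng.uniform(lo, hi) if start is None else start
    best = x
    for t in range(total_steps):
        # Random move, clamped so the search stays inside [lo, hi].
        candidate = min(hi, max(lo, x + rng.uniform(-max_move, max_move)))
        uphill = f(candidate) > f(x)
        # Uphill moves are always taken; others only with probability p(t).
        if uphill or rng.random() < accept_probability(t, total_steps):
            x = candidate
        if f(x) > f(best):
            best = x            # remember the best point visited so far
    return best

result = simulated_annealing(f, 0.0, 10.0)
print(round(result, 3), round(f(result), 3))
```

Early in the run the search behaves almost like a random walk; late in the run it behaves like a hill climb, because downhill moves are almost never accepted.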

Genetic Algorithms - The Answer?


Genetic Algorithms start with an initial random population, and subsequently allocate more trials to regions of the search space found to have high fitness. They can be combined with hill-climbing techniques to speed up the search process, but are able to 'jump' away from local maxima because the elements that promote a good solution are recombined in subsequent iterations. The idea of 'mutation,' discussed in the next section, also makes sure that elements that were left out of the initial population have a chance of being incorporated later.

Holland invented the notion of a GA 'schema' to explain why the GA is so effective. Simply stated, schemata are similarity subsets (sets of strings that have one or more features in common), and building blocks are those schemata that are 1) consistently emphasized by selection and 2) respected and exchanged by the genetic operators. (Goldberg) Essentially, providing the GA with a full solution allows it to 'know' something about all of the dimensions of the problem. The GA takes these into account when selecting which solutions will exchange elements.

A Step by Step Explanation of the GA Heuristic


Like other optimization techniques, a GA works with the following elements:

1. Variables - Resources or other parameters that are elements of the decision.
2. Constraints - Limitations associated with the variables. From a mathematical perspective, constraints slice up the solution space to further restrict the range of objective values.
3. Objective - The goal of the problem, or an evaluation of how well the problem has been solved.

The real power of a GA is the way it uses the variables and constraints to find a solution that optimizes the objective. Rather than spell out each element in isolation, I will describe the flow of the GA and explain each activity in its proper sequence.

A Sequential View of the Genetic Algorithm

1. Create Initial Population - The GA randomly creates an initial chromosome population of a specified size. In other words, it fills in the input variables randomly with acceptable values. A full set of input variables is a chromosome, and the GA makes as many of these as you specify. For example, if the user has requested 20 population members in the problem definition, they are created by assigning random values to genes, based on the range set when the genes were defined. This provides an initial group of 20 population members for generation 0.

2. Decode the Chromosome - The GA then evaluates the 'fitness' of each chromosome by finding out how well it meets the fitness function. That is, how good an answer does this set of input values produce?

3. Order the Chromosomes - At the beginning of each generation, the population members are evaluated and then ordered according to their fitness.

4. Choose Which Chromosomes Will Mate - In order for crossover to occur, we must pair two population members so that genes can be exchanged. Mate selection is carried out using evolutionary principles. That is, members with the best fitness are given a higher likelihood of mating, which increases the chances of superior offspring. Mate selection is accomplished by using a "graduated" roulette wheel. For instance, if there are four population members, we could put 10 slots in the roulette wheel. We could then assign four slots to Member A, three slots to Member B, two to Member C, and one to Member D. Each member is allowed to mate once. Its mate is selected by "spinning" the roulette wheel, with the added stipulation that a member of the population is not allowed to mate with itself.

5. Perform Crossover - Once a member of the population and its mate have been chosen, it must be determined which genes will be exchanged. This is done by randomly selecting two "cut points" in the string of genes. Genes in between the two cut points will be swapped between the population member and its selected mate. (Note: Permutation Crossover is different, and is designed to guarantee that each offspring is a valid permutation.)

6. Store Offspring - Each member of the original population will be given a chance to mate according to the "roulette" selection outlined above. This will result in a new population which is the same size as the original population.

7. Mutate Selected Chromosomes - After the new population has been created, randomly selected members of the population will undergo mutation based on the settings made by the user. For a random mutation, the GA randomly selects the population members to undergo mutation according to a specified mutation probability. (As in nature, most mutations will probably produce poor results. For this reason, the mutation probability should be kept relatively low.) The GA then randomly selects the genes which will be mutated according to the specified probability. Next, each gene selected will be mutated using a random number and a specified range. Once the mutation amount has been calculated, it is either added to or subtracted from the original gene. The new value of the gene is checked to make sure that it does not go outside the specified range.

8. Replace Parts of the Population with Superior Mutations and Superior Prior-Generation Members - The fitness for each member will be recalculated after all the genes have been mutated. The population members are then ordered according to their fitness. In some algorithms the best members of the old population may be added to the new population, unless they are not as good as the worst members of the new population. A genetic algorithm that keeps one or more of the best members from each generation is said to incorporate "elitism". This keeps the best members of the population from getting worse from one generation to the next, and ensures that the fitness of the best member can only improve or stay the same. It also gives the "elite", highest-fitness individuals further opportunities to produce offspring in subsequent generations.

9. Create New Generation - Once this has been done, the population is ready to create another generation. The population will cycle through generations until the Exit Condition is met.
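The nine steps above can be sketched as a short Python function. This is one illustrative reading of the steps, not the implementation of any particular product. The function name `run_ga`, its parameter names and defaults, and the example fitness function are all assumptions. It assumes real-valued genes, a population of at least two, and a fitness to be maximized. It builds one child per member by taking the genes between the cut points from the mate, and it stops after a fixed number of generations.

```python
import random

def run_ga(fitness, bounds, pop_size=20, generations=100,
           p_member=0.2, p_gene=0.5, mut_size=0.1, elite=1, seed=0):
    """Maximise `fitness` over real-valued genes constrained by `bounds`."""
    rng = random.Random(seed)

    # Step 1: random initial population, each gene drawn from its range.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]

    for _ in range(generations):
        # Steps 2-3: evaluate every chromosome and order them best-first.
        pop.sort(key=fitness, reverse=True)

        # Step 4: graduated roulette wheel; rank r (0 = best) gets
        # pop_size - r slots, e.g. 4, 3, 2, 1 for four members.
        slots = [pop_size - r for r in range(pop_size)]

        offspring = []
        for i, parent in enumerate(pop):
            # A member may not mate with itself: remove its own slots.
            wheel = slots[:i] + [0] + slots[i + 1:]
            mate = rng.choices(pop, weights=wheel, k=1)[0]

            # Step 5: two-point crossover; genes between the cut points
            # come from the mate, the rest from the parent.
            a, b = sorted(rng.sample(range(len(bounds) + 1), 2))
            # Step 6: one child per member keeps the population size fixed.
            offspring.append(parent[:a] + mate[a:b] + parent[b:])

        # Step 7: mutate randomly selected members, then selected genes,
        # by a random amount limited to mut_size of the gene's range.
        for child in offspring:
            if rng.random() < p_member:
                for g, (lo, hi) in enumerate(bounds):
                    if rng.random() < p_gene:
                        delta = rng.uniform(-mut_size, mut_size) * (hi - lo)
                        child[g] = min(hi, max(lo, child[g] + delta))

        # Step 8: elitism; the best old members compete with the offspring
        # and survive only if they are not among the worst.
        merged = pop[:elite] + offspring
        pop = sorted(merged, key=fitness, reverse=True)[:pop_size]
        # Step 9: loop again with the new generation.

    return max(pop, key=fitness)

def example_fitness(genes):
    """Hypothetical objective with its maximum at x = 3, y = -1."""
    x, y = genes
    return -((x - 3) ** 2 + (y + 1) ** 2)

best = run_ga(example_fitness, [(-10.0, 10.0), (-10.0, 10.0)])
print([round(g, 2) for g in best], round(example_fitness(best), 4))
```

Because the elite member always competes for a place in the next generation, the best fitness in the population can never decrease from one generation to the next.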

Theory in Practice
In practice, the theory of Genetic Algorithms is broad enough to take on many flavors. Much development has been done, and it is easy to find shell applications or pieces of code that can be used in applications. For example, many Java and C++ classes are now available on the Web that are essentially Genetic Algorithm objects built for reuse. They may incorporate the decoding function, the mutation function, or even the crossover function in a general fashion. No matter how the theory is applied, however, some essential elements will be the same, and it is important to understand these elements if you wish to use a GA. Below I have compiled a list of the parameters that need to be set in a typical Genetic Algorithm, with an explanation of what they mean and how to set them.

Function Definition
Fitness or Evaluation Function - In a spreadsheet, this will contain the formula that adds everything up and presents the result. In a program, it is likely a function that takes the input variables as arguments (e.g. Function Genetic(X, Y, Z)).

Function Target - As with any other optimization technique, you must specify whether you want to Minimize the function value, Maximize it, or come as close as possible to matching a value you specify.

Gene Groups - The variable parameters in your optimization function. A chromosome, or member of the population, is made up of an array containing values for each of the variable parameters.

Population Size - The number of chromosomes you will work with per generation.

Eligible Chromosome Reproduction Number - The number of chromosomes you will keep and allow to reproduce after each generation.
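The three Function Target options can be reduced to a single "higher is better" score, which is the form a GA loop typically expects. The Python sketch below shows one way to do this. The names `make_score` and `total_cost` and the target labels are hypothetical and chosen for illustration.

```python
def make_score(evaluate, target="maximize", value=None):
    """Wrap an evaluation function so that a higher score is always better."""
    if target == "maximize":
        return lambda genes: evaluate(genes)
    if target == "minimize":
        return lambda genes: -evaluate(genes)
    if target == "match":
        if value is None:
            raise ValueError("target='match' needs a value to aim for")
        # The closer the result is to the requested value, the higher the score.
        return lambda genes: -abs(evaluate(genes) - value)
    raise ValueError(f"unknown target: {target!r}")

def total_cost(genes):
    """Hypothetical evaluation function of three input variables."""
    x, y, z = genes
    return 2 * x + 3 * y + z

minimize_cost = make_score(total_cost, "minimize")
hit_ten = make_score(total_cost, "match", value=10)
print(minimize_cost([1, 1, 1]), hit_ten([1, 2, 2]))
```

Negating the function for minimization, and using the negative distance to the requested value for matching, lets the same selection and ordering code serve all three targets.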

Gene Definition
Variables - The changeable parameters you want the GA to use to solve the problem.

Type of Gene - Three gene types are common. Genes may be defined as Real Numbers, Integers, or Permutations. Real numbers are "normal" numbers, which can include positive and negative numbers with decimal places. Integers do not include fractional values; in tools that store them as 16-bit signed values they are limited to roughly -32,760 to 32,760 (the exact 16-bit range is -32,768 to 32,767). Permutations are a set of cells whose values do not change; only the order of the values changes. Permutations are usually used for a list of things 1 to n which can be reordered (as in a traveling salesman problem).

Crossover Operator - Crossover refers to the process of creating a new offspring trial solution by combining gene values from two parents selected from the current generation. There are typically at least three basic choices for this option: No Crossover, Two-point Crossover, and Permutation Crossover. No Crossover can be selected regardless of the type of genes, and the population will then be improved by mutation alone. Two-point Crossover is used with Integer and Real Number genes. Permutations have their own form of crossover.

Constraints - You must specify a high and low value for Real Number or Integer genes. When genes are mutated, their values will not be allowed outside of the range. Permutation genes do not need a range because their values do not change.
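The text does not say which permutation crossover it has in mind. The Python sketch below uses a simple variant of order crossover (OX), one common operator that always yields a valid permutation, purely as an illustration. The function name and the example parents are assumptions. The first lines show why plain two-point crossover cannot be used on permutation genes.

```python
import random

def order_crossover(parent, mate, rng=random):
    """Simple order crossover: the child is always a valid permutation."""
    n = len(parent)
    a, b = sorted(rng.sample(range(n + 1), 2))
    segment = parent[a:b]                       # kept in place from the parent
    kept = set(segment)
    fill = [g for g in mate if g not in kept]   # the rest, in the mate's order
    return fill[:a] + segment + fill[a:]

parent = [1, 2, 3, 4, 5, 6, 7, 8]
mate = [8, 7, 6, 5, 4, 3, 2, 1]

# Plain two-point crossover with cut points 2 and 5 repeats one stop
# and drops another, so the result is not a valid tour.
naive = parent[:2] + mate[2:5] + parent[5:]
print(naive)

child = order_crossover(parent, mate, random.Random(3))
print(child)
```

The child keeps one contiguous slice of the first parent in its original position and takes the remaining values in the order in which they appear in the mate, so every value appears exactly once.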

Mutation

Mutation Type
No Mutation - If No Mutation is selected, population members will only be changed from one generation to the next by crossover processes. When No Mutation is selected, it is necessary to have a large enough population to ensure that all possible gene values are present at the outset.

Random Mutation - If Random Mutation is selected, a number of population members will have the values of some of their genes changed every generation.

Random Mutation Hill Climb - This is just like Random Mutation, except that the mutation operation is repeated several times and only beneficial mutations are retained.

Directional Hill Climb - When a member of the population mutates, if the fitness has improved, the difference between the new gene values and old gene values will be calculated. The change is then reapplied to the new gene values. The number of times this process is repeated is determined by a finite number of Hill Climb Steps which you must specify. If no improvement is found, the process is terminated.
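The Directional Hill Climb can be sketched as follows in Python. The function name, parameters, and defaults are hypothetical. One detail is an assumption rather than something the text states: when the initial mutation does not improve the fitness, this sketch keeps the original genes.

```python
import random

def directional_hill_climb(genes, fitness, bounds, size=0.1, steps=5,
                           rng=random):
    """Mutate once; while fitness keeps improving, reapply the same change."""
    def clamp(values):
        return [min(hi, max(lo, v)) for v, (lo, hi) in zip(values, bounds)]

    # Initial random mutation, limited to `size` of each gene's range.
    mutated = clamp([g + rng.uniform(-size, size) * (hi - lo)
                     for g, (lo, hi) in zip(genes, bounds)])
    if fitness(mutated) <= fitness(genes):
        return genes                # assumption: discard a non-improving mutation

    # The mutation helped: keep moving in the same direction.
    delta = [m - g for m, g in zip(mutated, genes)]
    current = mutated
    for _ in range(steps):          # at most `steps` Hill Climb Steps
        candidate = clamp([c + d for c, d in zip(current, delta)])
        if fitness(candidate) <= fitness(current):
            break                   # no further improvement: terminate
        current = candidate
    return current

def closeness_to_origin(genes):
    """Hypothetical fitness: higher when the genes are nearer to zero."""
    return -sum(v * v for v in genes)

limits = [(-5.0, 5.0)] * 3
print(directional_hill_climb([4.0, -3.0, 2.0], closeness_to_origin, limits,
                             rng=random.Random(0)))
```

The input list is never modified, so the caller can compare the returned chromosome with the original.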

Mutation Characteristics
% of Population - This sets the percentage of population members which will have a chance to mutate every generation.

% of Genes - Once a population member has been selected for mutation, this number determines the likelihood that each gene will mutate.

Size of Mutation (% of Range) - This number sets a limit on the amount a single gene will be allowed to mutate. Small mutations are not likely to produce dramatic results for continuous variables.

Hill Climb Steps - If a Directional Hill Climb type of mutation is used, this determines how many times the Hill Climb is repeated before control returns to the main genetic algorithm function.

Exit Conditions
Max Generations - If this exit condition is selected, Generator will run for the requested number of generations.

Max Minutes - Select the length of time in minutes that you want Generator to run.

Improvement Threshold (Change in last N Generations is < X) - This feature allows the user to stop the run when the rate of improvement in the fitness has leveled off. The number of generations should be large enough (about 100 generations) and the change small enough to ensure that Generator will probably find an acceptable solution.
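The three exit conditions can be combined in one small helper, sketched below in Python. This is not Generator's interface: the class name `ExitCondition`, its parameters, and the way the improvement threshold is measured (the absolute change in best fitness over the last N generations) are assumptions made for illustration.

```python
import time

class ExitCondition:
    """Stops a run on max generations, max minutes, or stalled improvement."""

    def __init__(self, max_generations=None, max_minutes=None,
                 window=100, min_change=None):
        self.max_generations = max_generations
        self.max_minutes = max_minutes
        self.window = window            # N: generations to look back
        self.min_change = min_change    # X: smallest change worth continuing
        self.started = time.monotonic()
        self.history = []               # best fitness of each generation

    def should_stop(self, best_fitness):
        """Record this generation's best fitness and test every condition."""
        self.history.append(best_fitness)
        generation = len(self.history)
        if (self.max_generations is not None
                and generation >= self.max_generations):
            return True
        if self.max_minutes is not None:
            if time.monotonic() - self.started >= self.max_minutes * 60:
                return True
        if self.min_change is not None and generation > self.window:
            change = abs(self.history[-1] - self.history[-1 - self.window])
            if change < self.min_change:
                return True
        return False

# Stop when the best fitness has changed by less than 0.01 over 2 generations.
stalled = ExitCondition(window=2, min_change=0.01)
for value in (1.0, 2.0, 2.5, 2.5, 2.5):
    print(value, stalled.should_stop(value))
```

A GA loop would call `should_stop` once per generation with the current best fitness and leave the loop as soon as it returns True.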

Generator - An example using Generator with Problems defined in Excel

Bibliography
Holland, J.H., "Genetic Algorithms," Scientific American. July 1992, 66-72.
Holland, J.H., Adaptation in Natural and Artificial Systems. MIT Press, Cambridge, MA: 1992.
Davis, L., ed., Handbook of Genetic Algorithms. Van Nostrand Reinhold, NY: 1991.
Goldberg, David E., Communications of the ACM. v37n3, p. 113-119. Mar 1994.
Goldberg, David E., Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley, 1989.
