Online Book Reader

Home Category

Choose a category
All
Classic-Fiction

Data Mining - Mehmed Kantardzic [226]

By Root 921 0

we can summarize that the major components of GA include encoding schemata, fitness evaluation, parent selection, and application of crossover operators and mutation operators. These phases are performed iteratively, as presented in Figure 13.3.

Figure 13.3. Major phases of a genetic algorithm.

It is relatively easy to keep track of the best individual chromosomes in the evolution process. It is customary in GA implementations to store “the best ever” individual at a separate location. In that way, the algorithm would report the best value found during the whole process, just in the final population.

Optimization under constraints is also the class of problems for which GAs are an appropriate solution. The constraint-handling techniques for genetic algorithms can be grouped into different categories. One typical way of dealing with GA candidates that violate the constraints is to generate potential solutions without considering the constraints, and then to penalize them by decreasing the “goodness” of the evaluation function. In other words, a constrained problem is transformed to an unconstrained one by associating a penalty with all constraint violations. These penalties are included in the function evaluation, and there are different kinds of implementations. Some penalty functions assign a constant as a penalty measure. Other penalty functions depend on the degree of violation: the larger the violation, the greater the penalty. The growth of the penalty function can be logarithmic, linear, quadratic, exponential, and so on, depending upon the size of the violation. Several implementations of GA optimization under constraints are given in the texts recommended for further study (Section 13.9).

13.3 A SIMPLE ILLUSTRATION OF A GA

To apply a GA for a particular problem, we have to define or to select the following five components:

1. a genetic representation or encoding schema for potential solutions to the problem;

2. a way to create an initial population of potential solutions;

3. an evaluation function that plays the role of the environment, rating solutions in terms of their “fitness”;

4. Genetic operators that alter the composition of the offspring; and

5. values for the various parameters that the GA uses (population size, rate of applied operators, etc.).

We discuss the main features of GAs by presenting a simple example. Suppose that the problem is the optimization of a simple function of one variable. The function is defined as

The task is to find x from the range [0,31], which maximizes the function f(x). We selected this problem because it is relatively easy to analyze optimization of the function f(x) analytically, to compare the results of the analytic optimization with a GA, and to find the approximate optimal solution.

13.3.1 Representation

The first step in the GA is to represent the solution alternative (a value for the input feature) in a coded-string format. Typically, the string is a series of features with their values; each feature’s value can be coded with one from a set of discrete values called the allele set. The allele set is defined according to the needs of the problem, and finding the appropriate coding method is a part of the art of using GAs. The coding method must be minimal but completely expressive. We will use a binary vector as a chromosome to represent real values of the single variable x. The length of the vector depends on the required precision, which, in this example, is selected as 1. Therefore, we need a minimum five-bit code (string) to accommodate the range with required precision:

For this example, the mapping from a real number to a binary code is defined by the relation (because a = 0):

Opposite mapping, from the binary code to the real value of the argument, is also unique:

and it will be used only for checking the intermediate results of optimization. For example, if we want to transform the value x = 11 into a binary string, the corresponding code will be 01011. On the other hand, code 11001 represents the decimal value x = 25.

13.3.2

Online Book Reader

Data Mining - Mehmed Kantardzic [226]

®Online Book Reader