Redundant Genetic Encodings May Not Be Harmful Bryant A. Julstrom Department of Computer Science St. Cloud State University St. Cloud, MN 56301 USA ------------------------------------------------------------------------ In /Proceedings of the Genetic and Evolutionary Computation Conference/ (GECCO'99) (Wolfgang Banzhaf, Jason Daida, Agoston E. Eiben, Max H. Garzon, Vasant Honavar, Mark Jakiela, and Robert E. Smith, Eds.). San Francisco, CA: Morgan Kaufmann Publishers, 1999, p.791. ------------------------------------------------------------------------ Summary In a redundant genetic encoding, several distinct chromosomes represent each candidate solution to the target problem. Such an encoding would seem to hinder genetic search by allowing competing representations of the same information. Tests using a GA for the 3-cycle problem (3CP), which seeks to partition /n/ = 3/k/ points in the plane into 3-cycles of minimum total length, indicate that this is not necessarily so. Three encodings of the problem's solutions have much, some, and no redundancy. All number the target instance's points and represent sets of 3-cycles as permutations of those numbers. In the most redundant encoding, the first three points a chromosome lists form a cycle, the next three points form another, and so on. In this encoding, 6^/n//3 (/n//3)! permutations represent each set of 3-cycles on /n/ points. A less redundant encoding sorts the integers within each triple, so that (/n//3)! chromosomes represent each set of 3-cycles. Finally, a non-redundant encoding also sorts the triples by their first values, so that the mapping from chromosomes to solutions is one-to-one. A reasonable crossover for these encodings is Partially Mapped Crossover (PMX) (Goldberg and Lingle, 1985) which randomly chooses a segment within one parent, then swaps symbols within that parent until the segment is identical to the same positions in the second parent. Mutation can swap the symbols in two random positions of the parent chromosome. The three encodings were compared in a generational genetic algorithm that initializes its population randomly, selects parents in 2-tournaments, and applies PMX and mutation independently. It is 1-elitist and runs through a fixed number of generations. This GA was run, with each encoding, on six instances of 3CP. Three of the data sets are well known traveling salesman problem instances. The others consist of points randomly placed on a 100 by 100 grid. The number of points used was always a multiple of three. In these tests, the GA's population contained 2/n/ chromosomes. The probability that crossover would produce an offspring was 70%; that of mutation therefore 30%. The GA ran through 10/n/ generations, and it was run 50 times with each encoding on each test instance. Table 1 shows the results of these trials. Contradicting conventional expectations, the GA performed best with the two redundant encodings and worst with the non-redundant encoding. *Table 1.* Averages over 50 trials with each encoding. Problem (/n/) Most redundant Less redundant Non- redundant rand42 (42) 603.25 608.62 613.68 eil51 (51) 501.63 498.70 503.37 rand60 (60) 679.17 685.80 694.89 eil76 (76) 629.22 629.16 653.45 rand90 (90) 887.95 895.20 927.42 kroA100 (99) 23981.87 24097.53 26796.28 There are several possible explanations for these results. First, the redundant encodings allow more locality. Second, cross-competition between building blocks may not hinder the GA's search as much as is often suggested. Third, with the redundant encodings, the population may form subpopulations around particular representations of good sub-solutions; one of these subpopulations eventually controls the population's evolution. Finally, under PMX, many pairs of redundant chromosomes are effectively the same. Reference D. E. Goldberg and R. Lingle, Jr. (1985). Alleles, loci, and the traveling salesman problem. In J. J. Greffenstette, ed., /Proceedings of the First ICGA/, pp.154-159. Hillsdale, NJ: Lawrence Erlbaum.