1 Introduction

Genetic Programming (GP), as popularized by Koza [1,2,3], uses syntax trees as its program representation. Cartesian Genetic Programming (CGP), as introduced by Miller et al. [4], offers a graph-based representation which, in addition to standard GP problem domains, can readily be applied to many graph-based applications such as electronic circuits, image processing, and neural networks. CGP is mainly used with mutation as the only genetic operator. The reason for this is that previous work on crossover in CGP has produced mixed results, and comparative studies on the use of crossover are missing.

Tree-based GP was originally introduced with a sub-tree crossover technique which swaps randomly chosen sub-branches of the parent trees to produce new offspring. Koza considered crossover to be the dominant genetic operator as a result of his experiments [2, 3]. However, later research with more comprehensive and detailed experiments found that the beneficial effects of crossover cannot be generalized in GP [5,6,7].

In contrast to the well-established knowledge about crossover in tree-based GP, the state of knowledge in CGP appears to be comparatively weak. The potential of crossover in CGP therefore remains an open question. In this paper, we present the results of a first comparative study on crossover in CGP, which includes a comparison of previously proposed crossover techniques. Furthermore, we introduce a new method of crossover for CGP, called Block crossover, which is also investigated in our study.

Section 2 of this paper describes CGP briefly and surveys previous work on crossover in CGP. This section also surveys former attempts at comparative crossover studies in tree-based GP and reviews their contribution to the understanding of GP. In Sect. 3 we introduce our new form of crossover for CGP. Section 4 describes our experiments and presents their results. In Sect. 5 we discuss the results of our experiments. Finally, Sect. 6 gives a conclusion and outlines future work.

2 Related Work

2.1 Cartesian Genetic Programming

In contrast to tree-based GP, CGP represents a genetic program via genotype-phenotype mapping as an indexed, acyclic and directed graph. Originally the structure of the graphs was a rectangular grid of \(N_ r \) rows and \(N_ c \) columns, but later work also focused on a representation with just one row. The genes in the genotype are grouped, and each group refers to a node of the graph, except the last group which represents the outputs of the phenotype. Each node is represented by two types of genes which index the function number in the GP function set and the node inputs. These nodes are called function nodes and execute functions on the input values. The number of input genes depends on the maximum arity \(N_ a \) of the function set. The last group in the genotype represents the indexes of the nodes which lead to the outputs.

A backward search is used to decode the corresponding phenotype. The backward search starts from the outputs and processes the linked nodes in the genotype. In this way, only active nodes are processed during the evaluation. The number of inputs \(N_ i \), the number of outputs \(N_ o \) and the length of the genotype are fixed. Every candidate program is represented with \(N_ r * N_ c * (N_ a +1) + N_ o \) integers. Even though the length of the genotype is fixed for every candidate program, the length of the corresponding phenotype in CGP is variable, which can be considered a significant advantage of the CGP representation.
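The backward search can be sketched as follows. This is a minimal illustration for the one-row case, not the implementation used in the experiments; the function names, the gene layout (function gene first, then the connection genes) and the example genotype are assumptions made for this sketch. As a simplification, every connection gene of an active node is followed, regardless of the arity of the selected function.

```python
from typing import List, Sequence


def genotype_length(n_rows: int, n_cols: int, arity: int, n_outputs: int) -> int:
    """Number of integers in a genotype: N_r * N_c * (N_a + 1) + N_o."""
    return n_rows * n_cols * (arity + 1) + n_outputs


def active_nodes(genotype: Sequence[int], n_inputs: int, n_nodes: int,
                 arity: int) -> List[int]:
    """Return the indices of all function nodes reachable from the outputs.

    Assumed layout: each node occupies (arity + 1) genes, one function gene
    followed by `arity` connection genes; the output genes come last.
    Indices 0..n_inputs-1 denote primary inputs, higher indices denote nodes.
    """
    genes_per_node = arity + 1
    output_genes = genotype[n_nodes * genes_per_node:]
    active = set()
    # Start from the nodes referenced by the outputs and walk backwards.
    stack = [g for g in output_genes if g >= n_inputs]
    while stack:
        index = stack.pop()
        if index in active:
            continue
        active.add(index)
        start = (index - n_inputs) * genes_per_node
        # Skip the function gene, follow the connection genes.
        for connection in genotype[start + 1:start + genes_per_node]:
            if connection >= n_inputs:
                stack.append(connection)
    return sorted(active)


# Hypothetical genotype: 2 inputs, 3 nodes (indices 2, 3, 4), arity 2, 1 output.
example = [0, 0, 1,   1, 0, 0,   0, 2, 1,   4]
```

In the example, the output refers to node 4, which in turn reads node 2 and primary input 1, so node 3 is never visited and stays inactive.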

CGP traditionally operates with a \((1+\lambda )\) evolutionary algorithm (EA) in which \(\lambda \) is often set to four. The new population in each generation consists of the best individual of the previous population and the \(\lambda \) created offspring. Breeding is mostly done by point mutation, which creates offspring by changing a small number of randomly selected genes of the parent genotype to random values within the permissible range. One of the most important techniques is a special rule for the selection of the new parent. When two or more individuals can serve as the parent, an individual which has not served as the parent in the previous generation is selected as the new parent. This strategy is important because it maintains the diversity of the population and has been found highly beneficial for the search performance of CGP.
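A minimal sketch of this scheme is given below. It assumes a one-row genotype in which a node may connect to any primary input or any earlier node, and a fitness that is minimized; the function names, the per-gene mutation probability and the evaluation budget are illustrative assumptions. The parent selection rule is realized by the `<=` comparison, so that an offspring that ties with the parent replaces it.

```python
import random


def point_mutation(genotype, rate, n_inputs, n_nodes, arity, n_functions, rng):
    """Resample each gene with probability `rate` within its permissible range."""
    child = list(genotype)
    genes_per_node = arity + 1
    for pos in range(len(child)):
        if rng.random() >= rate:
            continue
        node, offset = divmod(pos, genes_per_node)
        if node >= n_nodes:
            # Output gene: may point to any input or node.
            child[pos] = rng.randrange(n_inputs + n_nodes)
        elif offset == 0:
            # Function gene: index into the function set.
            child[pos] = rng.randrange(n_functions)
        else:
            # Connection gene: any input or any earlier node (keeps the graph acyclic).
            child[pos] = rng.randrange(n_inputs + node)
    return child


def one_plus_lambda(init, mutate, fitness, lam=4, max_evals=1000, rng=None):
    """(1+lambda) EA for minimization with a fixed evaluation budget."""
    rng = rng or random.Random()
    parent = init(rng)
    parent_fit = fitness(parent)
    evals = 1
    while evals + lam <= max_evals:
        offspring = [mutate(parent, rng) for _ in range(lam)]
        fits = [fitness(o) for o in offspring]
        evals += lam
        best = min(range(lam), key=fits.__getitem__)
        # "<=" prefers an equally fit offspring over the current parent.
        if fits[best] <= parent_fit:
            parent, parent_fit = offspring[best], fits[best]
    return parent, parent_fit
```

Because ties are resolved in favour of the offspring, the search can drift across genotypes of equal fitness instead of remaining at a single parent.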

2.2 Previous Work on Crossover in CGP

Some of the first experiments on crossover in CGP included the investigation of four variations of crossover which were tested on the simple regression problem \(x^2 + 2x + 1\). Clegg et al. [8] reported that all tested variations of crossover techniques influenced the convergence of CGP negatively. In comparison to the mutation-only CGP algorithm, the addition of the crossover techniques hindered the performance of CGP. The crossover techniques were applied to the standard integer-based representation of CGP.

For instance, the genetic material was recombined by swapping parts of the genotypes of the parent individuals or randomly exchanging selected nodes. Clegg et al. [8] stressed that merely swapping the integers (in whatever manner) on a genotypic level in CGP disrupts the performance.

This was the motivation for a new form of crossover which has been introduced by Clegg et al. [8] and is based on a real-valued representation. This variation of CGP represents the graph as a fixed-length list of real-valued numbers in the interval [0,1]. The genes are decoded to the integer-based representation with the help of normalization values (e.g. the number of functions or the maximum input range). The recombination of two genotypes is performed with a standard Arithmetic crossover operation, which uses a random weighting factor and is also known from the field of real-valued Genetic Algorithms. The experiments of Clegg et al. showed that the new representation in combination with crossover improves the convergence behavior of CGP. However, for the later generations, Clegg et al. showed that the use of crossover in real-valued CGP leads to disruptive effects on one of the two tested problems. The improved convergence of the Arithmetic crossover was evaluated in the domain of symbolic regression and has been found useful in this problem domain [8].
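The idea can be illustrated with a short sketch. It assumes that a real-valued gene is decoded by scaling it with the number of admissible choices and truncating, and that both children are complementary weighted averages of the parents with a single random weight; the exact decoding and weighting used in [8] may differ, and the function names are hypothetical.

```python
import random


def decode_gene(value, n_choices):
    """Map a gene in [0, 1] to an integer in 0..n_choices-1 (assumed decoding)."""
    return min(int(value * n_choices), n_choices - 1)


def arithmetic_crossover(parent1, parent2, rng):
    """Create two children as complementary weighted averages of the parents."""
    r = rng.random()  # random weighting factor, shared by all genes
    child1 = [r * a + (1.0 - r) * b for a, b in zip(parent1, parent2)]
    child2 = [(1.0 - r) * a + r * b for a, b in zip(parent1, parent2)]
    return child1, child2
```

Since each child gene is a convex combination of two values from [0,1], the children stay inside the valid interval and can be decoded without repair.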

Slaný et al. [9] analyzed the fitness landscapes of functional-level CGP on image operator design problems, considering single-point and multi-point crossover operators. It was demonstrated that the mutation operator and the single-point crossover operator generate the smoothest landscapes for the tested problems.

For a multi-chromosome approach to CGP, Walker et al. [10] investigated a multi-chromosome crossover operator which joins the best chromosome parts from all individuals. This crossover technique was found useful for problems with multiple outputs and independent fitness assignment.

A beneficial effect of crossover in CGP was obtained by the use of an implicit context representation for CGP in which recombination is useful for the Even Parity-3 problem [11].

CGP has been extended for the automatic definition and reuse of functions by Walker et al. [12] and Kaufmann et al. [13]. Kaufmann et al. adopted the module creation mechanisms for a cone- and age-based CGP crossover [13]. Cone-based crossover showed good results for functions with repetitive inner patterns, while age-based crossover excels for randomized inner structures.

Recently, a new form of crossover has been introduced by Kalkreuth et al. [14]. The subgraph crossover recombines random parts of the CGP phenotypes of two previously selected individuals. This crossover technique has been found beneficial for the performance of CGP on symbolic regression, Boolean functions, and image operator design problems.

To the best of our knowledge, the most recent work on crossover in CGP has been done by Kalkreuth et al. However, while some crossover operators for standard CGP have been introduced and investigated, comprehensive comparative studies are still missing. This has been the motivation for our work.

3 The Block Crossover

The Block crossover is a new method of crossover for standard CGP. The method is mainly inspired by the cone-based crossover of Kaufmann et al. [13] for Embedded CGP, which integrates selected modules of a donor genotype into a receptor genotype. Since Kaufmann et al. have been successful with this crossover technique for specific Boolean functions, our motivation for proposing the Block crossover is to adapt this mechanism to standard CGP. The Block crossover is also inspired by the subgraph crossover introduced by Kalkreuth et al. [14]. Since CGP lacks a diverse and effective set of crossover techniques, the introduction and investigation of new crossover techniques is important.

The Block crossover technique focuses on the one-dimensional representation of CGP, where the number of rows is limited to one. Given the previously selected genotypes of two individuals serving as parents, the Block crossover generates, for each parent, a list of all blocks of nodes that meet the following criteria:

  • The block contains a desired number of nodes.

  • All nodes in the block are directly linked through their inputs or outputs.

  • All nodes in the block are part of the genotype’s active path.

In our implementation, we have chosen to use blocks consisting of three nodes. To fulfill the other criteria, we have constructed the blocks by evaluating the genotype’s active path and selecting active nodes whose inputs were two distinct nodes and not primary inputs of the genotype. The time complexity of this simple method is linear, and it is performed along with the standard determination of the genotype’s active path that precedes its evaluation.
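The construction of three-node blocks can be sketched as follows. The sketch assumes a one-row genotype with arity two, a layout of one function gene followed by two connection genes per node, and a precomputed list of active node indices; the function name and the example genotype are hypothetical and the code only illustrates the selection criterion described above, not the actual implementation.

```python
def find_blocks(genotype, n_inputs, active):
    """Collect three-node blocks (input_a, input_b, node) from the active path.

    A block is formed by an active node whose two inputs are distinct
    function nodes, i.e. neither of them is a primary input.
    """
    genes_per_node = 3  # assumed: function gene + two connection genes
    blocks = []
    for index in active:
        start = (index - n_inputs) * genes_per_node
        a, b = genotype[start + 1], genotype[start + 2]
        if a != b and a >= n_inputs and b >= n_inputs:
            # Keep the nodes in genotype order so their relative positions are known.
            low, high = sorted((a, b))
            blocks.append((low, high, index))
    return blocks


# Hypothetical genotype: 2 inputs, 4 nodes (indices 2..5), 1 output.
example = [0, 0, 1,   1, 0, 1,   0, 2, 3,   0, 4, 1,   5]
example_blocks = find_blocks(example, n_inputs=2, active=[2, 3, 4, 5])
```

In the example, node 4 reads the two distinct function nodes 2 and 3 and therefore yields the block (2, 3, 4), whereas node 5 is rejected because one of its inputs is a primary input. The inputs of an active node are active themselves, so the third criterion holds for all nodes of a block.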

The Block crossover then randomly selects one block from each list and swaps them between the genotypes. The position of the nodes transferred as part of the block may change inside the new genotype. However, their mutual links are preserved and the function performed by the block stays the same. Therefore, the created offspring retain the same active path but perform a new operation. If either parent contains no swappable blocks, no crossover operation is performed and the offspring are simply cloned from their parents. The crossover operation is then followed by the standard point mutation.

Figure 1 illustrates the crossover procedure. First, the active paths are determined, and the swappable blocks are stored in the lists \(M_1\) and \(M_2\). Then, two blocks \(N_1\) and \(N_2\) are chosen from their respective lists. In order to produce the first offspring \(O_1\), the first parent \(P_1\) is cloned, and the function nodes inside the selected block \(N_1\) are replaced by the nodes from block \(N_2\). Nodes (2, 5, 6) have been moved to positions (2, 3, 4), but by maintaining them in the same order within the one-row genotype, we can ensure that their mutual connections and their logical function stay the same. The second offspring \(O_2\) is produced in the same way, but the roles of the parents \(P_1\) and \(P_2\) are reversed.

Fig. 1. The block crossover technique.

4 Experiments

4.1 Experimental Setup

We have performed experiments on problems from the symbolic regression and Boolean function domains. To evaluate the search performance, we measured the best fitness value found after a predefined number of fitness function evaluations (best-fitness-of-run). For all problems, the fitness was to be minimized. Our comparison has focused on four crossover operators: standard One-point crossover, Subgraph crossover, Arithmetic crossover and our newly proposed Block crossover.

The evolution used a generational model. The initial population was randomly generated. Parent genomes for the next generation are picked using two separate tournaments, which allows the same individual to be picked multiple times. The parents and a crossover operator are used to create two offspring, which are then mutated. This process is repeated until a sufficient number of offspring has been created. The next generation consists of the offspring and a certain percentage of the best individuals (elites) from the previous generation.

In addition, two more evolutionary setups were added for comparison. The None crossover uses the same evolutionary scheme, but the offspring it creates are identical clones of their parents, leaving mutation as the only active genetic operator. The \((1+\lambda )\) setup forgoes the scheme described above and implements the traditional CGP algorithm.
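One generation of this model can be sketched as follows. This is a simplified illustration, not the ECJ configuration used in the experiments: the function names are hypothetical, tournament contestants are assumed to be drawn with replacement, the crossover rate is omitted, and the fitness is minimized.

```python
import random


def tournament(population, fitnesses, size, rng):
    """Return the fittest of `size` randomly drawn individuals (minimization)."""
    contestants = [rng.randrange(len(population)) for _ in range(size)]
    return population[min(contestants, key=fitnesses.__getitem__)]


def next_generation(population, fitnesses, crossover, mutate,
                    elite_rate, tournament_size, rng):
    """Build the next generation from elites and mutated crossover offspring."""
    n = len(population)
    n_elites = int(round(elite_rate * n))
    ranked = sorted(range(n), key=fitnesses.__getitem__)
    # Elites are copied unchanged from the previous generation.
    new_population = [population[i] for i in ranked[:n_elites]]
    while len(new_population) < n:
        # Two separate tournaments; both may return the same individual.
        parent1 = tournament(population, fitnesses, tournament_size, rng)
        parent2 = tournament(population, fitnesses, tournament_size, rng)
        for child in crossover(parent1, parent2, rng):
            if len(new_population) < n:
                new_population.append(mutate(child, rng))
    return new_population
```

Passing a `crossover` function that merely copies both parents turns the sketch into a mutation-only generational scheme.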

Our experiments have focused on examining the following hypothesis.

Hypothesis 1

The \((1+\lambda )\) CGP algorithm performs better than the crossover operators in all domains.

In order to test this hypothesis, we first performed two rounds of meta-evolutionary experiments to determine which evolutionary parameters were critical, so that all crossover operators could use their optimal settings and be compared in a fair way. The two most important parameters were then subject to a parameter sweep, and for every crossover operator the best-performing combination of parameters has been selected for comparison. To assess the significance of our results, we have used the Mann-Whitney U test to compare the standard \((1+\lambda )\)-CGP with all other crossover operators.
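For readers unfamiliar with the test, the following sketch computes the U statistic and a two-sided p-value for two samples of best-fitness-of-run values. It is a textbook version that uses mid-ranks for ties and the tie-corrected normal approximation without continuity correction; it is not the statistical routine used for the reported results, and statistical packages may return slightly different p-values, especially for small samples.

```python
import math


def mann_whitney_u(sample_x, sample_y):
    """Return (U, two-sided p-value) using the normal approximation."""
    n1, n2 = len(sample_x), len(sample_y)
    n = n1 + n2
    # Pool both samples and remember which sample each value came from.
    pooled = sorted([(v, 0) for v in sample_x] + [(v, 1) for v in sample_y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Positions i..j-1 hold equal values and share the average rank.
        midrank = (i + 1 + j) / 2.0
        group_size = j - i
        tie_term += group_size ** 3 - group_size
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u1 = rank_sum_x - n1 * (n1 + 1) / 2.0
    u2 = n1 * n2 - u1
    mean = n1 * n2 / 2.0
    variance = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0.0:
        return min(u1, u2), 1.0  # all values identical
    z = (u1 - mean) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return min(u1, u2), p_value
```

The test only uses the ranks of the observations, which makes it suitable for fitness distributions that are skewed or contain outliers.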

The implementation was done in Java, using the ECJ Evolutionary Computation Research System. All experiments were performed on a computing cluster with the following hardware configuration: 2 x Intel Xeon E5-2680v3 processor, 2.5 GHz, 12 cores; 128 GB RAM, 5.3 GB cache per core, DDR4@2133 MHz; InfiniBand FDR56 network connection.

4.2 Meta-evolution

For the meta-level, we used a basic canonical GA to tune five parameters we considered most important to the evolutionary process. Meta-evolution is very costly in terms of the computational effort necessary to find an optimal parameter setting. Furthermore, since GP benchmark problems can be very noisy in terms of finding the ideal solution, the evaluation of the evolved individuals is repeated multiple times, with fitness defined as the mean result.

During the first round of meta-evolution, all problems used the same setting, and the evolved parameters have been limited to discrete values, as seen in Table 1. During the second round, the granularity and range were modified to better fit each individual problem. Because the \((1+\lambda )\) scheme uses neither tournament selection nor elitism, these two parameters have been ignored during its meta-evolution.

Table 1. Configuration of the first round of the meta-evolutionary GA.

Results of the first round of meta-evolution have revealed that the tournament size parameter behaves wildly and does not converge to any specific value for any problem nor type of crossover. In some cases, it even significantly outgrew the population size. This caused the tournaments to include the entire population, resulting in a crossover of the best individual with itself, and wholly defeated the purpose of the crossover operator. To prevent this from happening, the tournament size has not been included in the second round of meta-evolution and its value has been fixed to four.

Table 2. Results of the second round of the meta-evolutionary GA. The table shows the best-performing combination of the four tuned parameters.

Table 2 shows the results of the second round of meta-evolution which were used to set up the ensuing parameter sweep. Because the computational effort required to perform a parameter sweep grows exponentially with the number of parameters, only the two most important parameters, mutation rate and population size, were included in the sweep.

The ideal elitism rate was similar across all problems and types of crossover. For the sweep, it has been set to the overall average of 15%. Combined with the fixed tournament size of four, this means that during the sweep, there was a 52.2% chance that none of the individuals in a tournament would be elites from the previous generation. The ideal genotype length was highly variable and depended largely on the problem rather than on the type of crossover used. For the sweep, the genotype length was set individually for each problem.
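The 52.2% figure can be reproduced with a short calculation, assuming that the tournament contestants are drawn independently and that each draw hits an elite with a probability equal to the elitism rate. The function name is chosen for this sketch only.

```python
def prob_no_elite(elite_rate, tournament_size):
    """Probability that no contestant of a tournament is an elite individual."""
    return (1.0 - elite_rate) ** tournament_size


# Elitism rate of 15% and tournament size of four, as used in the sweep.
chance = prob_no_elite(0.15, 4)  # 0.85 ** 4, roughly 0.522
```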

4.3 Boolean Functions

We have chosen to evolve both single-output and multiple-output Boolean functions. A 2-bit digital adder and a 2-bit multiplier were used as our multiple-output problems. Former work by White et al. [15] proposed these as suitable alternatives to the overused parity problems. Their fitness was defined as the Hamming distance between the resulting truth table and the ideal solution. To increase the speed of the evaluation, we have used compressed truth tables.
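One way to realize such a compressed evaluation is to pack every truth-table column into a single integer, so that one bitwise operation evaluates a gate for all input combinations at once. The sketch below illustrates this for the 2-bit adder; the bit ordering of the inputs and all function names are assumptions made for this illustration.

```python
def input_columns(n_inputs):
    """Return one packed truth-table column per primary input.

    Bit r of column i holds the value of input i in row r of the truth table.
    """
    rows = 1 << n_inputs
    columns = []
    for i in range(n_inputs):
        column = 0
        for r in range(rows):
            if (r >> i) & 1:
                column |= 1 << r
        columns.append(column)
    return columns


def adder2_targets():
    """Packed target columns (sum bit 0, sum bit 1, carry) of a 2-bit adder."""
    targets = [0, 0, 0]
    for r in range(16):
        a = r & 3          # inputs 0 and 1 form operand a
        b = (r >> 2) & 3   # inputs 2 and 3 form operand b
        total = a + b
        for k in range(3):
            if (total >> k) & 1:
                targets[k] |= 1 << r
    return targets


def hamming_fitness(outputs, targets):
    """Sum of differing truth-table bits over all outputs (0 is ideal)."""
    return sum(bin(o ^ t).count("1") for o, t in zip(outputs, targets))


# A hand-built ripple-carry adder evaluated on packed columns.
a0, a1, b0, b1 = input_columns(4)
s0 = a0 ^ b0
c0 = a0 & b0
s1 = a1 ^ b1 ^ c0
c1 = (a1 & b1) | (c0 & (a1 ^ b1))
adder_fitness = hamming_fitness([s0, s1, c1], adder2_targets())
```

The hand-built circuit reproduces the target truth table exactly, so its fitness is zero; a candidate with wrong output bits is charged one unit per differing bit.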

For single output problems, we used 8-bit bent and 1-resilient Boolean functions. These functions find their use in cryptography, where they can provide an LFSR based key-stream generator of a stream cipher with resistance to linear and correlation attacks [16].

Table 3. Configuration of the Boolean function parameter sweep.

Bent Boolean functions possess the maximum possible degree of nonlinearity, defined as the minimum Hamming distance between the truth table of a given function and the truth tables of all linear functions and their negations. For an 8-bit function, the maximum degree of nonlinearity is 120 [17]. We defined the fitness of a candidate as the difference between its actual degree of nonlinearity and the optimal value.
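The degree of nonlinearity is commonly computed with the fast Walsh-Hadamard transform, since for an n-bit function it equals \(2^{n-1}\) minus half of the largest absolute Walsh coefficient. The following sketch uses this standard identity; the function names are chosen for illustration, and it is not claimed that the experiments computed the fitness in exactly this way.

```python
def walsh_spectrum(truth_table):
    """Fast Walsh-Hadamard transform of a truth table given as a list of 0/1."""
    spectrum = [1 - 2 * bit for bit in truth_table]  # map 0 -> +1, 1 -> -1
    n = len(spectrum)
    half = 1
    while half < n:
        for start in range(0, n, 2 * half):
            for j in range(start, start + half):
                a, b = spectrum[j], spectrum[j + half]
                spectrum[j], spectrum[j + half] = a + b, a - b
        half *= 2
    return spectrum


def nonlinearity(truth_table):
    """Minimum Hamming distance to all linear functions and their negations."""
    largest = max(abs(v) for v in walsh_spectrum(truth_table))
    return (len(truth_table) - largest) // 2


def bent_fitness(truth_table, optimum=120):
    """Fitness to be minimized: gap to the optimal nonlinearity (8-bit case)."""
    return optimum - nonlinearity(truth_table)


def bit(row, i):
    return (row >> i) & 1


# A well-known 8-bit bent function: x0*x1 + x2*x3 + x4*x5 + x6*x7 (mod 2).
bent8 = [(bit(r, 0) & bit(r, 1)) ^ (bit(r, 2) & bit(r, 3)) ^
         (bit(r, 4) & bit(r, 5)) ^ (bit(r, 6) & bit(r, 7)) for r in range(256)]
```

For the example function, all Walsh coefficients have the absolute value 16, which gives a nonlinearity of (256 - 16) / 2 = 120 and thus a fitness of zero.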

1-resilient functions are highly nonlinear functions that are balanced and have correlation immunity of the first degree. Balancedness means that the function’s truth table contains the same number of ones and zeros. Correlation immunity means that if the truth table were split in half based on the value of a specific input, the two halves of the truth table would each remain balanced. To the best of our knowledge, the maximum possible nonlinearity of an 8-bit 1-resilient function is not known, but it cannot be higher than 116 [18]. We defined the fitness as the difference between the actual degree of nonlinearity and the optimal value, and if the evolved function was not resilient, its fitness was further penalized by 58, half the known limit.
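The resilience check and the penalized fitness can be sketched directly from this description. The sketch assumes that the upper bound of 116 serves as the optimal value and takes the nonlinearity as a precomputed argument; the function names are hypothetical.

```python
def is_balanced(truth_table):
    """True if the truth table contains as many ones as zeros."""
    return 2 * sum(truth_table) == len(truth_table)


def is_one_resilient(truth_table, n_vars):
    """Balanced, and still balanced after fixing any single input to 0 or 1."""
    if not is_balanced(truth_table):
        return False
    for i in range(n_vars):
        for value in (0, 1):
            half = [truth_table[r] for r in range(len(truth_table))
                    if ((r >> i) & 1) == value]
            if not is_balanced(half):
                return False
    return True


def resilient_fitness(truth_table, n_vars, nonlinearity, bound=116, penalty=58):
    """Fitness to be minimized; non-resilient functions receive a penalty."""
    fitness = bound - nonlinearity
    if not is_one_resilient(truth_table, n_vars):
        fitness += penalty
    return fitness


# Small 3-bit examples: x0 XOR x1 is 1-resilient, x0 alone is only balanced.
tt_xor = [(r & 1) ^ ((r >> 1) & 1) for r in range(8)]
tt_x0 = [r & 1 for r in range(8)]
```

The function x0 is balanced, but fixing x0 yields a constant half of the truth table, so it fails the correlation immunity criterion and receives the penalty.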

Table 4. Results of the parameter sweep for Boolean functions.

Table 3 shows the setting used for the parameter sweep of Boolean functions. Each setting was run one hundred times, for every problem and type of crossover. All problems used the following function set {AND, OR, XOR, AND with one input inverted}. Because the best performing population size was usually very small, we have reduced the tournament size to two, to avoid repeating the issue from the first round of meta-evolution. For problems where the optimized setting was routinely able to find the ideal solution, we have also reduced the number of fitness function evaluations to get more telling results.

Table 4 shows the results of the parameter sweep. For each problem and crossover operator, we have selected the combination of mutation rate and population size which provided the best mean fitness over the hundred runs. Operators that performed significantly differently from \((1+\lambda )\) have their mean values marked. The table also shows the standard deviation (SD) and three quantiles.

Fig. 2. Comparison of crossover operators for Boolean functions.

Figure 2 provides visual comparison using box plots. The Arithmetic crossover, originally intended for use in symbolic regression, performs the worst when used for Boolean function design. For adder and multiplier problems, the \((1+\lambda )\) strategy has significantly surpassed all other approaches. However, for the bent function, there was no statistically significant difference, and for the 1-resilient function, the \((1+\lambda )\) has performed significantly worse than the other options. Here, even with an optimal setting, some of the runs failed to produce a resilient function, resulting in significant deterioration of the average fitness.

4.4 Symbolic Regression

For symbolic regression, we have chosen four problems from the work of Clegg et al. [8] and McDermott et al. [19] on better GP benchmarks, and the Pagie-1 problem, which has been proposed by White et al. [15] as an alternative to the heavily overused Koza-1 (“quartic”) problem. The analytic functions of the problems are shown in Table 5. The training data set U[a, b, c] refers to c uniform random samples drawn from a to b inclusive, and E[a, b, c] refers to a grid of points evenly spaced with an interval of c, from a to b inclusive.

Table 5. Symbolic regression problems used in the experiment.
Table 6. Configuration of the symbolic regression parameter sweep.
Table 7. Results of the parameter sweep for symbolic regression.

The fitness of the individuals was represented by a cost function value, defined as the sum of the absolute differences between the correct function values and the values of an evaluated individual. The configuration of the experiment is shown in Table 6. All problems used the following set of mathematical functions {\(+\), −, \(*\),  / , \(\sin \), \(\cos \), \(\ln (|n|)\), \(e^n\)}.
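The two kinds of training sets and the cost function can be sketched as follows for the one-dimensional case. The function names are chosen for illustration, and the simple regression target \(x^2 + 2x + 1\) mentioned in Sect. 2.2 serves only as an example; it is not implied that this target is one of the problems in Table 5.

```python
import random


def uniform_samples(a, b, count, rng):
    """`count` uniform random samples drawn between a and b."""
    return [rng.uniform(a, b) for _ in range(count)]


def even_grid(a, b, step):
    """Evenly spaced points from a to b inclusive with the given interval."""
    n_steps = int(round((b - a) / step))
    return [a + k * step for k in range(n_steps + 1)]


def cost(candidate, target, points):
    """Sum of absolute differences between target and candidate (0 is ideal)."""
    return sum(abs(target(x) - candidate(x)) for x in points)


def target(x):
    return x * x + 2 * x + 1


points = [-2, -1, 0, 1, 2]
exact = cost(lambda x: (x + 1) ** 2, target, points)      # algebraically identical
shifted = cost(lambda x: x * x + 2 * x, target, points)   # off by one at every point
```

A candidate that is algebraically identical to the target has a cost of zero, while the second candidate misses every point by one and therefore has a cost equal to the number of training points. For two-dimensional problems, the grid would be formed as the Cartesian product of two such point lists.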

Table 7 shows the parameter sweep results. As before, the primary selection criterion was the best average fitness over one hundred runs. Crossover operators that performed significantly differently from \((1+\lambda )\) have their mean values marked. As can be seen in Fig. 3, the Arithmetic crossover performs very well when used for symbolic regression, as originally designed.

Fig. 3. Comparison of crossover operators for symbolic regression.

5 Discussion

In our meta-evolutionary experiments, we faced considerable difficulties in making a fair comparison. We were able to determine optimal parameter settings for the \((1+\lambda )\)-CGP, as the tuning involves only three parameters: population size, mutation rate, and genotype length. However, determining optimal parameter settings for the canonical crossover algorithms is more complex. There are three additional parameters to contend with: crossover rate, elitism rate, and tournament size, which makes obtaining an optimal parameter setting for the respective problems significantly more difficult.

Furthermore, former studies on the traditional \((1+\lambda )\)-CGP algorithm have shown that large genotypes are very effective for the performance of CGP for certain problems. Consequently, we have to deal with a big parameter space in CGP in order to determine the optimal parameters and to make a fair comparison.

For this paper, we only used the meta-evolution framework of the Java Evolutionary Computation Toolkit (ECJ). However, we think that including other state-of-the-art methods for parameter tuning of evolutionary algorithms, like Iterated Race for Automatic Algorithm Configuration (IRace) or Sequential Parameter Optimization (SPOT), can provide more insight into well-performing algorithm settings in CGP, and help to provide fair and profound comparisons.

Another point which should be discussed is the observation that each type of crossover works best with different settings. Our findings indicate that there exists no general parameterization pattern for CGP when crossover is in use. We think it should be investigated whether the operators share common behaviors, such as exploration abilities, that could be identified through fitness landscape and search space analyses.

The results of our study show that the parameter settings vary for different problems in the respective problem domain, and indicate that there is no general pattern to parametrize the \((1+\lambda )\)-CGP in a well-performing way. These findings also open up a new question, namely which conditions or types of problems call for bigger or smaller population sizes. A preliminary assumption could be that the fitness landscape of certain problems requires more exploration abilities in order to overcome local optima.

Our results indicate that bigger populations perform well in the symbolic regression domain. This finding is consistent with a recent study on mutation-only CGP by Kaufmann et al. [20] which also indicates that bigger populations perform best in the symbolic regression domain.

Since our experiments are consistent with the results of Kaufmann et al., this behavior should be investigated through more detailed experiments. Furthermore, we think that these findings offer a good opportunity to better understand how CGP works in detail and can significantly contribute to the overall knowledge of fitness landscape analysis in CGP.

Specifically, Kaufmann et al. show that a mutational \((\mu +\lambda )\) evolutionary algorithm with big population size can be very effective. Therefore, we think it should be investigated whether the Block crossover can be used with a (\(\mu +\lambda \)) evolutionary algorithm, as a part of our attempts to proceed towards more precise comparative studies in CGP.

5.1 Analysis of Hypothesis

The results of our comparative study show that the traditional \((1+\lambda )\)-CGP algorithm cannot be regarded as the universally predominant algorithm for CGP. While it is often a good choice, the outcome of our study gives significant evidence that the \((1+\lambda )\)-CGP cannot be considered the most efficient CGP algorithm in the Boolean function domain. The experiments on the 1-resilient Boolean function show that the \((1+\lambda )\)-CGP may indeed be significantly inferior to the other CGP algorithms.

6 Conclusion and Future Work

The first comprehensive comparative study on crossover in CGP has been presented. We also proposed a new Block crossover technique, inspired by Embedded CGP, for use in standard CGP. We have performed a comparative study using our new crossover technique, two evolutionary methods that only use mutation, and three other crossover operators that have been suggested in the literature: simple One-point crossover, Arithmetic crossover, which is used in the field of real-valued Genetic Algorithms, and Subgraph crossover, which recombines parts of the parent phenotypes.

We have formulated a hypothesis that the traditional \((1+\lambda )\)-CGP algorithm would not perform significantly worse than the crossover operators. We performed a comparison on eight selected tasks from the areas of Boolean function design and symbolic regression. We have used meta-evolution to determine the most important evolutionary parameters and find common values for the parameters of lower importance.

Next, we have performed a series of parameter sweeps to determine the settings most suitable for every type of crossover and every task, and performed a comparison. Finally, we have used a non-parametric statistical test to refute our hypothesis, showing that the \((1+\lambda )\)-CGP is significantly outperformed by all other approaches when designing 1-resilient Boolean functions.

Our results show that it is possible for crossover operators to outperform the standard \((1+\lambda )\) strategy. However, if both methods have their parameters fine-tuned, the \((1+\lambda )\) strategy often remains the overall best strategy. The question of finding a universal crossover operator in CGP therefore remains open.

Our study opens a new perspective on comparative studies on the use of crossover in CGP and its challenges. The experiments with meta-evolution in CGP have shown that it is difficult to obtain well-performing parameter settings for crossover algorithms in CGP.

These results are the first step toward a fair comparison and a clearer understanding of the function of crossover in CGP. Our future work will focus on exploring ways to make comparisons between crossover techniques and algorithms in CGP fairer, including the investigation of suitable parameter optimization techniques for CGP, widening the spectrum of problem domains on which the comparison is made, and using crossover operators from other areas. We will especially focus on investigating the possibility of combining the Block crossover with the \((\mu +\lambda )\) evolutionary algorithm, and on exploring the domain of cryptographically significant Boolean functions, where the \((1+\lambda )\) algorithm faces great difficulty.