Created by W.Langdon from gp-bibliography.bib Revision:1.8051
Considering the large amount of books, papers and articles on GP, over 5000 items in the official GP Bibliography, relatively few have addressed the problem of understanding the very basic biases of GP operators, i.e., how they sample program spaces.
This thesis begins to address this lack of knowledge by considering GPs defining variation operator, sub-tree swapping crossover. It first analyses crossovers bias with regard to program sampling in terms of program length, providing a number of empirically verified theoretical models.
With this knowledge in hand, the thesis investigates how length bias affects GP runs, particularly with regard to the sampling of unique programs and bloat. From this work a new bloat theory is presented, Crossover-Bias, and a method, Sampling Parsimony, to directly alter the rate of resampling and hence control bloat is created.
To counteract the length bias of crossover a new technique is introduced, Operator Equalisation, which enables length classes to be sampled according to predetermined probability distributions. This provides essential information regarding GP runs and can be shown to improve GP performance.
We then turn our attentions to the sampling of programs within length classes and its implications for structural convergence within GP. From this work we show that subtree swapping crossover will sample programs with a frequency determined by arity proportions, our length work being a specialisation of this process. A new theoretical model based on arity histograms is then provided.",
Supervisor: Riccardo Poli",
Genetic Programming entries for Stephen Dignum