abstract = "A key to the success of any genetic programming
process is the use of a good alphabet of atomic
building blocks from which solutions can be evolved
efficiently. An alphabet that is too granular may
generate an unnecessarily large search space; an
inappropriately coarse-grained alphabet may bias or
prevent finding optimal solutions. Here we introduce a
method that automatically identifies a small alphabet
for a problem domain. We process solutions on the
complexity-optimality Pareto front of a number of
sample systems and identify terms that appear
significantly more frequently than merited by their
size. These terms are then used as basic building
blocks to solve new problems in the same problem
domain. We demonstrate this process on symbolic
regression for a variety of physics problems. The
method discovers key terms relating to concepts such as
energy and momentum. A significant performance
enhancement is demonstrated when these terms are then
used as basic building blocks on new physics problems.
We suggest that identifying a problem-specific alphabet
is key to scaling evolutionary methods to
higher-complexity systems.",
notes = "GECCO-2009 A joint meeting of the eighteenth
international conference on genetic algorithms
(ICGA-2009) and the fourteenth annual genetic
programming conference (GP-2009).