abstract = "Algorithms that learn through environmental
interaction and delayed rewards, or reinforcement
learning (RL), increasingly face the challenge of scaling to
dynamic, high-dimensional, and partially observable
environments. Significant attention is being paid to
frameworks from deep learning, which scale to
high-dimensional data by decomposing the task through
multi-layered neural networks. While effective, the
representation is complex and computationally
demanding. In this work we propose a framework based on
Genetic Programming, which adaptively complexifies
policies through interaction with the task. We make a
direct comparison with several deep reinforcement
learning frameworks in the challenging Atari video game
environment, as well as with more traditional reinforcement
learning frameworks based on a priori engineered
features. Results indicate that the proposed approach
matches the quality of deep learning while being a
minimum of three orders of magnitude simpler with
respect to model complexity. This results in real-time
operation of the champion RL agent without recourse to
specialized hardware support. Moreover, the approach is
capable of evolving solutions to multiple game titles
simultaneously with no additional computational cost.
In this case, agent behaviours for an individual game
as well as single agents capable of playing all games
emerge from the same evolutionary run.",