Evolving hierarchical memory-prediction machines in multi-task reinforcement learning
Created by W.Langdon from
gp-bibliography.bib Revision:1.7970
@Article{Kelly:GPEM,
author = "Stephen Kelly and Tatiana Voegerl and
Wolfgang Banzhaf and Cedric Gondro",
title = "Evolving hierarchical memory-prediction machines in
multi-task reinforcement learning",
journal = "Genetic Programming and Evolvable Machines",
year = "2021",
volume = "22",
number = "4",
pages = "573--605",
month = dec,
note = "Special Issue: Highlights of Genetic Programming 2020
Events",
keywords = "genetic algorithms, genetic programming, Tangled
Program Graph, Reinforcement learning, Temporal memory,
Multi-task, MTRL, Evolving team hierarchies, Run-time
complexity, Dynamic memory access",
ISSN = "1389-2576",
DOI = "10.1007/s10710-021-09418-4",
size = "33 pages",
abstract = "A fundamental aspect of intelligent agent behaviour is
the ability to encode salient features of experience in
memory and use these memories, in combination with
current sensory information, to predict the best action
for each situation such that long-term objectives are
maximized. The world is highly dynamic, and behavioural
agents must generalize across a variety of environments
and objectives over time. This scenario can be modeled
as a partially-observable multi-task reinforcement
learning problem. We use genetic programming to evolve
highly-generalized agents capable of operating in six
unique environments from the control literature,
including OpenAI's entire Classic Control suite. This
requires the agent to support discrete and continuous
actions simultaneously. No task-identification sensor
inputs are provided, thus agents must identify tasks
from the dynamics of state variables alone and define
control policies for each task. We show that emergent
hierarchical structure in the evolving programs leads
to multi-task agents that succeed by performing a
temporal decomposition and encoding of the problem
environments in memory. The resulting agents are
competitive with task-specific agents in all six
environments. Furthermore, the hierarchical structure
of programs allows for dynamic run-time complexity,
which results in relatively efficient operation.",
notes = "cart-pole, acrobot, cartcentering, pendulum,
mountaincar. BEACON Center for the Study of Evolution in
Action, Michigan State University, East Lansing, MI,
USA",
}
Genetic Programming entries for
Stephen Kelly
Tatiana Voegerl
Wolfgang Banzhaf
Cedric Gondro