Temporal Memory Sharing in Visual Reinforcement Learning
Created by W.Langdon from
gp-bibliography.bib Revision:1.8120
@InProceedings{Kelly:2019:GPTP,
author = "Stephen Kelly and Wolfgang Banzhaf",
title = "Temporal Memory Sharing in Visual Reinforcement
Learning",
booktitle = "Genetic Programming Theory and Practice XVII",
year = "2019",
editor = "Wolfgang Banzhaf and Erik Goodman and
Leigh Sheneman and Leonardo Trujillo and Bill Worzel",
pages = "101--119",
address = "East Lansing, MI, USA",
month = "16-19 " # may,
publisher = "Springer",
keywords = "genetic algorithms, genetic programming",
isbn13 = "978-3-030-39957-3",
DOI = "doi:10.1007/978-3-030-39958-0_6",
abstract = "Video games provide a well-defined study ground for
the development of behavioural agents that learn
through trial-and-error interaction with their
environment, or reinforcement learning (RL). They cover
a diverse range of environments that are designed to be
challenging for humans, all through a high-dimensional
visual interface. Tangled Program Graphs (TPG) is a
recently proposed genetic programming algorithm that
emphasizes emergent modularity (i.e. automatic
construction of multi-agent organisms) in order to
build successful RL agents more efficiently than
state-of-the-art solutions from other sub-fields of
artificial intelligence, e.g. deep neural networks.
However, TPG organisms represent a direct mapping from
input to output with no mechanism to integrate past
experience (previous inputs). This is a limitation in
environments with partial observability. For example,
TPG performed poorly in video games that explicitly
require the player to predict the trajectory of a
moving object. In order to make these calculations,
players must identify, store, and reuse important parts
of past experience. In this work, we describe an
approach to supporting this type of short-term temporal
memory in TPG, and show that shared memory among
subsets of agents within the same organism seems
particularly important. In addition, we introduce
heterogeneous TPG organisms composed of agents with
distinct types of representation that collaborate
through shared memory. In this study, heterogeneous
organisms provide a parsimonious approach to supporting
agents with task-specific functionality, image
processing capabilities in the case of this work. Taken
together, these extensions allow TPG to discover
high-scoring behaviours for the Atari game Breakout,
which is an environment it failed to make significant
progress on previously.",
notes = "Part of \cite{Banzhaf:2019:GPTP}, published after the
workshop",
}