Symbolic method for deriving policy in reinforcement learning
Created by W.Langdon from gp-bibliography.bib Revision:1.8194
@InProceedings{Alibekov:2016:CDC,
  author    = "Eduard Alibekov and Jiri Kubalik and Robert Babuska",
  booktitle = "2016 IEEE 55th Conference on Decision and Control (CDC)",
  title     = "Symbolic method for deriving policy in reinforcement learning",
  year      = "2016",
  pages     = "2789--2795",
  abstract  = "This paper addresses the problem of deriving a policy
               from the value function in the context of reinforcement
               learning in continuous state and input spaces. We
               propose a novel method based on genetic programming to
               construct a symbolic function, which serves as a proxy
               to the value function and from which a continuous
               policy is derived. The symbolic proxy function is
               constructed such that it maximizes the number of
               correct choices of the control input for a set of
               selected states. Maximization methods can then be used
               to derive a control policy that performs better than
               the policy derived from the original approximate value
               function. The method was experimentally evaluated on
               two control problems with continuous spaces, pendulum
               swing-up and magnetic manipulation, and compared to a
               standard policy derivation method using the value
               function approximation. The results show that the
               proposed method and its variants outperform the
               standard method.",
  keywords  = "genetic algorithms, genetic programming",
  DOI       = "doi:10.1109/CDC.2016.7798684",
  month     = dec,
  notes     = "Also known as \cite{7798684}",
}
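The abstract describes deriving a continuous policy by maximizing a symbolic proxy of the value function over the control input. As a rough, hypothetical illustration (not the authors' code), the following Python sketch shows how a greedy policy can be read off a value proxy by searching over candidate inputs. The dynamics model, stage reward, proxy V_proxy, discount GAMMA, and input grid U_GRID are all assumed stand-ins for a toy pendulum.

```python
import numpy as np

GAMMA = 0.99                          # assumed discount factor
U_GRID = np.linspace(-2.0, 2.0, 201)  # assumed discretised torque range

def dynamics(x, u, dt=0.05):
    """Assumed toy pendulum model; state x = (angle, angular velocity)."""
    theta, omega = x
    omega_new = omega + dt * (9.81 * np.sin(theta) + u)
    theta_new = theta + dt * omega_new
    return np.array([theta_new, omega_new])

def V_proxy(x):
    """Hypothetical stand-in for the GP-evolved symbolic value proxy."""
    theta, omega = x
    return -((theta % (2 * np.pi)) - np.pi) ** 2 - 0.1 * omega ** 2

def reward(x, u):
    """Assumed stage reward penalising distance from the upright position."""
    theta, omega = x
    return -(((theta % (2 * np.pi)) - np.pi) ** 2
             + 0.1 * omega ** 2 + 0.01 * u ** 2)

def greedy_policy(x):
    """Pick the input maximising reward plus discounted proxy value of the
    successor state; grid search stands in for continuous maximisation."""
    scores = [reward(x, u) + GAMMA * V_proxy(dynamics(x, u)) for u in U_GRID]
    return U_GRID[int(np.argmax(scores))]

if __name__ == "__main__":
    x = np.array([0.1, 0.0])          # start near the hanging-down position
    for _ in range(5):
        u = greedy_policy(x)
        x = dynamics(x, u)
        print(f"u = {u:+.2f}  state = {x}")
```

In the paper the proxy is a symbolic expression evolved by genetic programming and the policy comes from maximizing it over a continuous input space; the grid search above merely stands in for that maximization step.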
Genetic Programming entries for Eduard Alibekov, Jiri Kubalik, Robert Babuska