Evolutionary Development of Hierarchical Learning Structures
@Article{Elfwing:2007:tec,
  author   = "Stefan Elfwing and Eiji Uchibe and Kenji Doya and
              Henrik I. Christensen",
  title    = "Evolutionary Development of Hierarchical Learning
              Structures",
  journal  = "IEEE Transactions on Evolutionary Computation",
  year     = "2007",
  volume   = "11",
  number   = "2",
  pages    = "249--264",
  month    = apr,
  keywords = "genetic algorithms, genetic programming, learning
              (artificial intelligence), Lamarckian evolutionary
              development, MAXQ hierarchical RL method, foraging
              task, hierarchical learning structures, hierarchical
              reinforcement learning, task decomposition",
  DOI      = "10.1109/TEVC.2006.890270",
  ISSN     = "1089-778X",
  abstract = "Hierarchical reinforcement learning (RL) algorithms
              can learn a policy faster than standard RL algorithms.
              However, the applicability of hierarchical RL
              algorithms is limited by the fact that the task
              decomposition has to be performed in advance by the
              human designer. We propose a Lamarckian evolutionary
              approach for automatic development of the learning
              structure in hierarchical RL. The proposed method
              combines the MAXQ hierarchical RL method and genetic
              programming (GP). In the MAXQ framework, a subtask can
              optimise its policy independently of its parent task's
              policy, which makes it possible to reuse learned
              policies of the subtasks. In the proposed method, the
              MAXQ method learns the policy based on the task
              hierarchies obtained by GP, while the GP explores
              appropriate hierarchies using the result of the MAXQ
              method. To show the validity of the proposed method, we
              have performed simulation experiments for a foraging
              task in three different environmental settings. The
              results show a strong interconnection between the
              obtained learning structures and the given task
              environments. The main conclusion of the experiments is
              that the GP can find a minimal strategy, i.e., a
              hierarchy that minimises the number of primitive
              subtasks that can be executed for each type of
              situation. The experimental results for the most
              challenging environment also show that the policies of
              the subtasks can continue to improve, even after the
              structure of the hierarchy has been evolutionarily
              stabilised, as an effect of Lamarckian mechanisms.",
}
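The abstract describes an interleaved loop: GP proposes task hierarchies, the MAXQ method learns policies on those hierarchies, and learned policies persist through reproduction (the Lamarckian mechanism). The Python sketch below illustrates only the mechanics of that loop, under stated assumptions; it is not the authors' implementation. MAXQ is stood in for by a toy tabular learner, and the environment, fitness function, and all identifiers (Task, PRIMITIVES, learn_and_evaluate, mutate) are illustrative inventions.

    # Minimal sketch of a Lamarckian GP + hierarchical-RL loop, in the spirit
    # of the abstract. MAXQ itself is replaced by a toy tabular learner so the
    # evolutionary mechanics stay visible; all names and the fitness function
    # are illustrative assumptions, not the paper's method.
    import random

    PRIMITIVES = ["forage", "avoid", "recharge"]   # hypothetical primitive subtasks


    class Task:
        """A node in a task hierarchy: leaves are primitive subtasks, internal
        nodes are composite tasks. Each node carries its own value table so
        that, as in MAXQ, a subtask's policy can be reused independently of
        its parent and inherited by offspring (the Lamarckian mechanism)."""

        def __init__(self, name, children=()):
            self.name = name
            self.children = list(children)
            self.values = {}                       # state -> learned value (toy stand-in for MAXQ tables)

        def leaves(self):
            return [self] if not self.children else [l for c in self.children for l in c.leaves()]

        def copy(self):
            """Deep copy that keeps the learned tables: Lamarckian transfer."""
            t = Task(self.name, [c.copy() for c in self.children])
            t.values = dict(self.values)
            return t


    def random_hierarchy(depth=2):
        """Grow a random task hierarchy, GP-initialisation style."""
        if depth == 0 or random.random() < 0.3:
            return Task(random.choice(PRIMITIVES))
        return Task("composite", [random_hierarchy(depth - 1) for _ in range(random.randint(2, 3))])


    def learn_and_evaluate(root, episodes=30):
        """Toy stand-in for running MAXQ on a hierarchy: each leaf's value
        table is updated on fake states, and fitness rewards hierarchies whose
        leaves cover the primitives with as few subtasks as possible (a crude
        proxy for the 'minimal strategy' the abstract mentions)."""
        for _ in range(episodes):
            for leaf in root.leaves():
                s = random.randrange(5)            # fake state
                r = 1.0 if leaf.name == "forage" else 0.2
                leaf.values[s] = leaf.values.get(s, 0.0) + 0.1 * (r - leaf.values.get(s, 0.0))
        coverage = len({l.name for l in root.leaves()})
        return coverage - 0.1 * len(root.leaves())


    def mutate(root):
        """Replace one child subtree; untouched subtrees keep their learned
        tables, so evolution does not discard acquired policies."""
        child = root.copy()
        if not child.children:                     # a bare primitive: regrow entirely
            return random_hierarchy()
        child.children[random.randrange(len(child.children))] = random_hierarchy(depth=1)
        return child


    if __name__ == "__main__":
        population = [random_hierarchy() for _ in range(20)]
        for gen in range(10):
            scored = sorted(population, key=learn_and_evaluate, reverse=True)
            survivors = scored[:10]                # truncation selection, for brevity
            population = survivors + [mutate(random.choice(survivors)) for _ in range(10)]
        best = max(population, key=learn_and_evaluate)
        print("best hierarchy leaves:", [l.name for l in best.leaves()])

The one design point the sketch tries to preserve is that copy() carries each subtask's learned table into offspring, so structural evolution and within-lifetime learning proceed together; this is the Lamarckian effect the abstract credits for continued policy improvement after the hierarchy has stabilised.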