Exploring the Predictable
Created by W.Langdon from
gp-bibliography.bib Revision:1.8178
@InCollection{Schmidhuber:2002:AEC,
author = "Juergen Schmidhuber",
title = "Exploring the Predictable",
booktitle = "Advances in Evolutionary Computing",
publisher = "Springer",
year = "2002",
editor = "Ashish Ghosh and Shigeyoshi Tsutsui",
series = "Natural Computing Series",
pages = "579--612",
keywords = "genetic algorithms, genetic programming",
isbn13 = "978-3-642-62386-8",
URL = "ftp://ftp.idsia.ch/pub/juergen/explorepredictable.pdf",
URL = "http://www.idsia.ch/~juergen/explorepredictable/",
DOI = "doi:10.1007/978-3-642-18965-4_23",
abstract = "Details of complex event sequences are often not
predictable, but their reduced abstract representations
are. I study an embedded active learner that can limit
its predictions to almost arbitrary computable aspects
of spatio-temporal events. It constructs probabilistic
algorithms that (1) control interaction with the world,
(2) map event sequences to abstract internal
representations (IRs), (3) predict IRs from IRs
computed earlier. Its goal is to create novel
algorithms generating IRs useful for correct IR
predictions, without wasting time on those learned
before. This requires an adaptive novelty measure which
is implemented by a co-evolutionary scheme involving
two competing modules collectively designing (initially
random) algorithms representing experiments. Using
special instructions, the modules can bet on the
outcome of IR predictions computed by algorithms they
have agreed upon. If their opinions differ then the
system checks who's right, punishes the loser (the
surprised one), and rewards the winner. An evolutionary
or reinforcement learning algorithm forces each module
to maximise reward. This motivates both modules to lure
each other into agreeing upon experiments involving
predictions that surprise it. Since each module
essentially can veto experiments it does not consider
profitable, the system is motivated to focus on those
computable aspects of the environment where both
modules still have confident but different opinions.
Once both share the same opinion on a particular issue
(via the loser's learning process, e.g., the winner is
simply copied onto the loser), the winner loses a
source of reward -- an incentive to shift the focus of
interest onto novel experiments. My simulations include
an example where surprise-generation of this kind helps
to speed up external reward.",
}