abstract = "From 1960s, Evolutionary Algorithms (EA), which is an
important subtopic of Artificial Intelligence (AI), has
been studied a lot and great progresses have been made
continuously to improve the existed algorithms or
propose novel methods. For example, the studies on many
classical methods such as Genetic Algorithm (GA),
Genetic Programming (GP), Evolutionary Strategies (ES),
etc. have made significant contribution to the research
of EA. In the past decade, a new evolutionary approach
named Genetic Network Programming (GNP) was proposed
and attracted more and more attention. GNP which is
based on the idea of Genetic Algorithm, also can evolve
itself and search in the solution domain of large scale
and finally find the (approximate) optimal solutions.
The unique character of GNP which make it very
different from other methods of EA is the use of the
data structure of directed graphs. Many research has
demonstrated that GNP can deal with complex problems in
the dynamical environments very efficiently and
effectively due to its graph based structure. As a
result, recently, GNP is being used in many different
areas such as data mining, extracting trading rules of
stock markets, elevator supervised control systems,
etc. and GNP has obtained outstanding results in all
the above fields. On the other hand, many research
shows that classical EAs such as GA, usually fail to
solve problems in dynamical environments. So, scholars
devote themselves to the research on the enhancement of
the architecture of EAs. For example, different memory
schemes storing historical information during evolution
and reusing them later are designed for EAs to solve
complex problems in dynamical environments.
So, the motivation of this research is designing memory
schemes for GNP in order to improve its performance
further in the dynamical environments. So, four
different memory schemes are proposed: GNP with rules,
GNP with reconstructed individuals, GNP with route
nodes and adaptive mutation in SARSA learning of GNP.
GNP with rules stores first-order information on GNP
rules and uses them to generate new individuals. GNP
with reconstructed individuals will stores the complete
node transitions which can guide the agent with much
more effectiveness and uses them to enhance the gene
structures of the worst individuals. GNP with route
nodes employs an indirect memory scheme which uses the
stored information associated with current
environments. The adaptive mutation using Q values to
evaluate node branches adjusts the mutation rates and
mutation directions for node branches and achieves the
balance between exploration and exploitation. In order
to measure the performance of the proposed
architectures, the benchmark of tile-world was used as
the simulation environments. The simulation results
show some improvements brought by the memory schemes to
conventional GNPs.",