A variable size mechanism of distributed graph programs and its performance evaluation in agent control problems

https://doi.org/10.1016/j.eswa.2013.08.063

Highlights

  • Distributed structures are used to create graph-based programs.

  • A variable size mechanism of the distributed structures is proposed.

  • The sizes of the structures are optimized by the proposed evolutionary algorithm.

  • Simulations in multiagent environments show better fitness and generalization than the fixed-size method.

Abstract

Genetic Algorithm (GA) and Genetic Programming (GP) are typical evolutionary algorithms using string and tree structures, respectively, and there have been many studies on extensions of GA and GP. How to represent solutions, e.g., as strings, trees, or graphs, is one of the important research topics, and Genetic Network Programming (GNP) has been proposed as a graph-based evolutionary algorithm. GNP represents its solutions using directed graph structures and has been applied to many applications. However, when GNP is applied to complex real-world systems, large programs are needed to represent the various kinds of control rules. In this case, the efficiency of evolution and the performance of the systems may decrease because of the huge structures. Therefore, we have studied distributed GNP based on the idea of divide and conquer, where a program is divided into several subprograms that cooperatively control the whole task. However, because the previous work divided a program into subprograms of the same size, it cannot adjust the sizes of the subprograms depending on the problem. Therefore, in this paper, an efficient evolutionary algorithm for variable size distributed GNP is proposed, and its performance is evaluated on the Tileworld problem, one of the benchmark problems of multiagent systems in dynamic environments. The simulation results show that the proposed method obtains better fitness and generalization abilities than the method without the variable size mechanism.

Introduction

Genetic Algorithm (GA) (Holland, 1975) and Genetic Programming (GP) (Koza, 1992, Koza, 1994) are typical evolutionary algorithms that have been widely studied. A large number of real-world applications have also been studied, such as robot programming (Kamio & Iba, 2005), financial problems (Alfaro-Cid et al., 2008, Iba and Sasaki, 1999, Ruiz-Torrubiano and Suárez, 2010) and network security systems (Banković et al., 2007, Folino et al., 2005).

In order to create reliable systems using evolutionary algorithms, the program structures (phenotype and genotype representations) and how to evolve them efficiently are important issues. Therefore, as an extension of GA and GP, Genetic Network Programming (GNP) and its variant using reinforcement learning (GNP-RL) (Mabu et al., 2007, Hirasawa et al., 2001) have been proposed and applied to many applications (Mabu et al., 2011, Hirasawa et al., 2008). A GNP program is represented by a directed graph structure and evolved by crossover and mutation. Originally, GNP was proposed because graph structures may have better representation abilities than strings and trees. In addition, the human brain also has a graph (network) structure, so some inherent abilities may be embedded in graph structures. In fact, the graph structure has advantages such as (1) reusability of nodes and (2) applicability to dynamic environments. Concretely, GNP has the following features. (1) The directed graph structure automatically generates repetitive processes like subroutines and reuses nodes repeatedly during the node transition, which contributes to creating programs with compact structures. (2) Once GNP starts its node transition from the start node, it executes judgment nodes (if–then functions) and processing nodes (action functions) according to the connections between nodes, without any terminal nodes. Therefore, the node transition implicitly memorizes the history of judgments and actions, which contributes to decision making in dynamic environments because GNP can make decisions based not only on the current information but also on past information.
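The node transition described above can be illustrated with a short sketch. The following Python code is a minimal, hypothetical illustration of how a GNP program might be executed; the Node class, the run_gnp function, and the branch-selection convention are assumptions made for clarity, not the authors' actual data structures.

```python
# Minimal sketch of GNP-style node transition (illustrative assumptions only).
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Node:
    kind: str               # "judgment" (if-then function) or "processing" (action function)
    func: Callable          # judgment: env -> branch index; processing: env -> None
    next_nodes: List[int]   # outgoing connections (one per branch for judgment nodes)

def run_gnp(nodes: List[Node], start: int, env, max_steps: int = 100) -> None:
    """Execute node transitions from the start node; GNP has no terminal nodes,
    so execution simply stops after a fixed number of steps (or one episode)."""
    current = start
    for _ in range(max_steps):
        node = nodes[current]
        if node.kind == "judgment":
            branch = node.func(env)            # judge the environment, pick a branch
            current = node.next_nodes[branch]
        else:
            node.func(env)                     # execute the action on the environment
            current = node.next_nodes[0]       # processing nodes have a single successor
```

Because the transition keeps moving over the same graph, nodes are revisited like subroutine calls, which is what keeps the program compact.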

Evolutionary Programming (EP) (Fogel et al., 1966, Fogel, 1994) is a graph-based evolutionary algorithm that creates finite state machines (FSMs) automatically, but the characteristics of EP and GNP are different. Generally, an FSM must define state transition rules for all combinations of states and possible inputs, so the FSM program becomes large and complex when the number of states and inputs is large. On the other hand, the evolution of GNP selects only the necessary nodes by changing the connections between nodes, which means that GNP judges only the inputs essential for making decisions in the current situation. As a result, GNP does not have to consider all combinations of inputs and actions, which keeps the program structures compact.

When GNP is applied to complicated real-world systems, large graph structures are needed to represent the various kinds of rules required to adapt to various situations. However, huge structures may decrease the efficiency of evolution and, as a result, the performance of the systems. Therefore, we have studied distributed GNP (Yang, He, Mabu, & Hirasawa, 2012) based on the idea of divide and conquer, which builds complicated systems from relatively small programs (Li, Tian, & Sclaroff, 2012). In Yang et al. (2012), the program is divided into several subprograms that cooperatively control the whole task, and this method is applied to a stock trading model, showing better performance than the method without distributed structures. However, because the previous work divided a program into subprograms of the same size, it cannot adjust the sizes of the subprograms depending on the problem. The best size of the structure differs depending on the difficulty of the task, so a fixed size limits the representation ability of the solutions. In order to solve this problem, an efficient evolutionary algorithm for variable size distributed GNP (VS-DGNP) is proposed, and its performance is evaluated on the Tileworld problem (Pollack & Ringuette, 1990), one of the benchmark problems of multiagent systems in dynamic environments. The features of the proposed method are as follows.

  • The complicated structure can be divided into several substructures with different sizes.

  • Genetic operations can be executed both inside each substructure and between substructures, which makes the optimization efficient.

  • Migration of nodes from one substructure to another is executed to optimize the sizes of the substructures appropriately (a sketch of this operation appears after this list).
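As a rough illustration of the migration feature, the following Python sketch moves a single node from a donor substructure to a receiver, so the sizes of the substructures can drift over generations. The list-of-lists genotype, the migrate_node name, and the min_size guard are assumptions made only for this sketch; the paper's actual genotype encoding and operator details may differ.

```python
# Hedged sketch of node migration between substructures (illustrative only).
import random
from typing import List

def migrate_node(substructures: List[List[int]], min_size: int = 2) -> None:
    """Move one node index from a randomly chosen donor substructure to a
    randomly chosen receiver, changing the two substructure sizes by one."""
    if len(substructures) < 2:
        return
    donors = [i for i, s in enumerate(substructures) if len(s) > min_size]
    if not donors:
        return  # every substructure is already at the minimum size
    src = random.choice(donors)
    dst = random.choice([i for i in range(len(substructures)) if i != src])
    node = substructures[src].pop(random.randrange(len(substructures[src])))
    substructures[dst].append(node)
```

Combined with crossover and mutation applied inside and between substructures, repeated migrations of this kind allow evolution to allocate more nodes to the subtasks that need them.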

The rest of this paper is organized as follows. In Section 2, the basic structures of GNP and GNP with reinforcement learning (GNP-RL) are reviewed. In Section 3, the structure of distributed GNP and how to realize the variable size structure are explained. In Section 4, after the simulation environments and conditions are explained, the results and analysis are described. Section 5 is devoted to conclusions.

Section snippets

Review of genetic network programming with reinforcement learning

Because the proposed Variable Size Distributed GNP (VS-DGNP) is based on GNP with Reinforcement Learning (GNP-RL), the structure of GNP-RL and its learning and evolution mechanisms are first reviewed. An individual of GNP-RL is represented by a directed graph structure, evolved by crossover and mutation, and the node transition rules are learned by reinforcement learning.
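The learning part of GNP-RL is reported to use a Sarsa-type temporal-difference update over the candidate functions of each node (Mabu et al., 2007). The snippet below is a minimal sketch under that assumption; the Q-table layout, the (node, subnode) keys, and the parameter values are illustrative assumptions, not the exact formulation in the paper.

```python
# Minimal Sarsa-style update, sketching how node-transition rules could be
# learned in GNP-RL; variable names and the Q-table layout are assumptions.
from collections import defaultdict

Q = defaultdict(float)  # maps hypothetical (node, subnode) pairs to estimated values

def sarsa_update(state, action, reward, next_state, next_action,
                 alpha: float = 0.1, gamma: float = 0.9) -> None:
    """Move Q[(state, action)] toward the one-step return observed when the
    node transition continues with (next_state, next_action)."""
    td_target = reward + gamma * Q[(next_state, next_action)]
    Q[(state, action)] += alpha * (td_target - Q[(state, action)])
```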

Proposed method: variable size mechanism of distributed GNP-RL

In Section 2, the basic structure of GNP-RL is explained, where one individual consists of one graph structure. In this section, the distributed graph structure is first introduced, and then a variable size mechanism of the distributed structures is proposed.

Simulations

To confirm the effectiveness of the proposed method, simulations of agent behavior in the Tileworld problem (Pollack & Ringuette, 1990) are described in this section.

Conclusions

In this paper, a variable size mechanism of distributed GNP is proposed. In order to realize this mechanism, new crossover and mutation operators that consider the distributed structures are introduced. The simulation results on the Tileworld problem show that the proposed method obtains better results than distributed GNP without the variable size mechanism in both training and testing environments. In future work, the contribution of the distributed structure and the variable size mechanism is …

References (19)

  • Banković, Z., et al. (2007). Improving network security using genetic algorithm approach. Computers & Electrical Engineering.
  • Alfaro-Cid, E., Castillo, P. A., Esparcia, A., Sharman, K., Merelo, J., Prieto, A., Mora, A., & Laredo, J. (2008). ...
  • Fogel, D. B. (1994). An introduction to simulated evolutionary optimization. IEEE Transactions on Neural Networks.
  • Fogel, L. J., et al. (1966). Artificial intelligence through simulated evolution.
  • Folino, G., et al. (2005). GP ensemble for distributed intrusion detection systems.
  • Hirasawa, K., et al. (2008). A double-deck elevator group supervisory control system using genetic network programming. IEEE Transactions on Systems, Man, and Cybernetics, Part C.
  • Hirasawa, K., Okubo, M., Katagiri, H., Hu, J., & Murata, J. (2001). Comparison between genetic network programming (GNP) ...
  • Holland, J. H. (1975). Adaptation in natural and artificial systems.
  • Iba, H., & Sasaki, T. (1999). Using genetic programming to predict financial data. In Proceedings of the congress on ...
There are more references available in the full text version of this article.
