Function approximations by coupling neural networks and genetic programming trees with oblique decision trees

https://doi.org/10.1016/S0954-1810(98)00015-6

Abstract

This paper concerns the development of a hybrid system of neural networks and genetic programming (GP) trees for problem domains in which a complete input space can be decomposed into several different subregions, and these are well represented in the form of an oblique decision tree. The overall architecture of this system, called federated agents, consists of a facilitator, local agents, and boundary agents. Neural networks are used as local agents, each of which is an expert on a different subregion. GP trees serve as boundary agents. A boundary agent specializes only in the borders of subregions, where discontinuities or a few different patterns may coexist. The facilitator is responsible for choosing the local agent suitable for given input data, using information obtained from the oblique decision tree. However, when input data lies close to the boundaries, there is a high probability of selecting an invalid local agent as a result of an incorrect prediction by the decision tree. Such a situation can lead the federated agents to produce a higher prediction error than that of a single neural network trained over the whole input space. To deal with this, the approach taken in this paper is for the facilitator to select the boundary agent instead of the local agent when input data lies close to a border of subregions. In this way, even if the decision tree yields an incorrect prediction, the performance of the system is less affected by it. The validity of our approach is examined by applying federated agents to the approximation of a function with discontinuities and to the configuration of the midship section of bulk cargo ships.

Introduction

The typical task of function approximation is to find an appropriate generalizer that correctly maps all possible inputs onto their respective outputs using only the information contained in a learning set. Various methods and algorithms have been developed to construct approximated functions from learning samples, including the simple nearest-neighbor method, statistical models, case-based reasoning, and feedforward neural networks. In real application domains, it is well acknowledged that generalizers trained by only one method or algorithm over an entire input space do not produce prediction results of the desired accuracy. Numerous suggestions have been made to overcome this problem; these research activities fall into two categories.

First, many researchers have pursued ways of fusing the results of different generalizers based on different principles. Because different algorithms operate on different principles and can generalize in different ways, combined generalizers can potentially give better predictions than individual ones. A typical example is a system consisting of different modules, such as a neural network module, a statistical module, and a memory-based reasoning module [1]. During the training phase, these modules independently learn the mapping between the input space and the output space.

Second, modular or hierarchical models can be adopted that exploit modularity in their structure by combining generalizers, each of which takes up only part of an input space or problem task. The major issues of this approach are how to decompose an input space or problem task, and how to combine generalizers to obtain optimal results. In the neural network community, a number of schemes for selecting individual networks and combining architectures have already been suggested [2], [3], [4], [5], [6], [7].

A modular neural network system can be further classified into two types [6]: one that explicitly decomposes an input space, and one that does not. This paper is concerned with the former and considers only the "re-composition" problem; that is, how to combine the results of subnetworks. In other words, a complete input space can be decomposed into several different regions, and these are already known prior to training the neural networks. To limit complexity, this paper is restricted to neural networks with a single output node, and all elements of a learning set are real numbers. We adopt a typical multiple-neural-network system composed of several subnetworks, called local agents, each of which is an expert on a different region of the input space. When a test instance is given, it is necessary to identify which local agent must be used. To do so, we use the classification system OC1 [8] (OC1 is public domain software, available at ftp.cs.jhu.edu), which can generate oblique decision trees, under the assumption that the given input space can be easily decomposed into several different regions that are well represented in the form of a decision tree. The decision tree predicts the subregion of the input space in which the input data is expected to lie, so the corresponding local agent can easily be selected. Although this approach is conceptually simple and straightforward to implement, it has a serious drawback. If a test instance lies close to a boundary of the divided input space, there is a large probability that the prediction of the decision tree is incorrect, which causes the system to yield a much higher prediction error than that of a single neural network trained over the whole input space, as a result of selecting an inappropriate local agent. This motivates us to introduce a boundary agent, which is an expert only on regions close to the boundaries of subregions.
When input data is given, the decision tree determines the subregion to which the input data belongs and checks whether it lies close enough to the boundaries of this subregion. If so, the corresponding boundary agent is invoked instead of the local agent. In this way, even if the result of the decision tree is incorrect, the outputs of the system are less affected by it.

Neural networks could also serve as boundary agents. Feedforward networks are capable of approximating any Borel measurable function to any degree of accuracy, provided certain conditions are satisfied [9]. In practice, however, obtaining neural networks of the desired accuracy is sometimes very difficult. In particular, the borders of subregions are likely to exhibit a few different patterns or discontinuities, which can make training neural networks difficult. When trained to approximate functions with discontinuities, neural networks typically attempt to attain the best continuous approximation over the whole input space, which yields a high local error in the small regions where the function is discontinuous. A functional tree of genetic programming (GP) [10], [11] is a useful alternative candidate for the boundary agent. GP uses genetic operators such as reproduction, crossover, and mutation to evolve individuals in the form of tree-structured programs whose nodes are functions and whose leaves are constants or variables. Many successful applications of GP in diverse engineering fields have been reported [12], [13], [14], [15], [16], [17], [18], [19], but standard GP has difficulty finding appropriate numerical parameters for the trees, which are essential in the approximation of real-valued functions. Unlike other work [12], [13], [14], [15], [16], [17] in which nonlinear optimization techniques have been applied to seek appropriate numerical parameters, the approach taken in this paper is to adopt linear associative memories [20] to reduce computational cost and time.

The overall architecture of the hybrid system, called federated agents, consists of a facilitator, local agents, and boundary agents. The facilitator is responsible for selecting an appropriate local agent or boundary agent, using information from the decision tree, when an input instance is given. To demonstrate the performance of the federated agents, an example of approximating a discontinuous function is presented. In addition, as a practical application, a system that can determine the configuration of a midship section for bulk cargo ships is developed using these federated agents.

In section 2, the architecture of federated agents is presented. A detailed description of GP with linear associative memories, together with a method for avoiding overfitting, is given in section 3. Constructing boundary agents requires a systematic way of extracting boundary data from a training set, which is presented in section 4. The performance of federated agents is then verified through case studies of approximating a function with discontinuities and configuring a midship section in section 5. Finally, conclusions and a summary are presented in section 6.

Federated agents of neural networks and GP trees

The overall structure of the federated agents, which are implemented in an object-oriented style in C++, is schematically presented in Fig. 1. Neural networks, GP trees, and an oblique decision tree are the major components constituting the federated agents. As local agents, neural networks are responsible for subregions of the input space, and as boundary agents, GP trees are in charge of the boundaries of subregions. The decision tree, which is contained in the facilitator, is used for representing a divided

Overview of genetic programming

The typical task of function approximation involves minimizing the following cost function by applying various optimization techniques: $C=\sum_{i=1}^{p}E(f(X_i,w),y_i)$, where $p$ is the number of samples contained in a learning set $L=\{(X_1,y_1),(X_2,y_2),\ldots,(X_p,y_p)\}$, $y_i$ is the desired output value corresponding to an input vector $X_i$, $w$ is the system parameter or weight vector of $\hat{y}_i=f(X_i,w)$, and $E$ is an error measure such as the squared error between $\hat{y}_i$ and $y_i$. For instance, feed-forward neural networks

Generation of boundary data by oblique decision tree

An oblique decision tree [8] is a special case of a multivariate tree that uses a linear combination of the variables at some internal nodes. Although the most widely used classification systems, such as C4.5 [29], adopt univariate decision trees, where the test at each node uses a single variable, we believe that a multivariate decision tree is better suited to selecting the appropriate variables in an input space, which can give optimal results for the federated agents. This issue will be discussed

Approximating the function with discontinuities

To demonstrate the performance of the federated agents, we prepared 961 training data and 2500 test data for the function shown in Fig. 9a, which clearly displays three different patterns together with discontinuities. Fig. 9b shows three training sets for OC1. To construct the facilitator and boundary agents, the oblique decision tree shown in Fig. 9c is produced using the OC1 system. 30% of the training data are randomly selected to prune the decision tree. Classified regions for a test set are shown

Conclusions and summaries

In this paper, we have dealt with developing federated agents coupled with a decision tree. As an alternative to the strategy of fusing the results of neural networks in typical multiple-neural-network systems, we have introduced a GP tree as a boundary agent that is an expert only on the boundaries of regions. When given input data is close enough to the boundaries of the divided input space, boundary agents are always used instead of local agents, which may give very poor results at the borders of subregions. Also, we

Acknowledgements

This work was supported in part by the Korea Science and Engineering Foundation (96-0200-01-01-3). The authors would like to thank Han, S.M. for his participation in the implementation of the system. The authors are also grateful to Dr. Urm, H.S. for his generous support.

References (35)

  • G. Cybenko

    Approximation by superpositions of a sigmoidal function

    Math Control Signal Systems

    (1989)
  • J.R. Koza

    Genetic programming: on the programming of computers by means of natural selection

    (1992)
  • J.R. Koza

    Genetic programming II: automatic discovery of reusable programs

    (1994)
  • Sharman KC, Esparcia-Alcazar AI, Li Y. Evolving signal processing algorithms by genetic programming. In: Proc of 1st....
  • G.J. Gray et al.

    Structural system identification using genetic programming and a block diagram oriented simulation tool

    Electronics Letters

    (1996)
  • Bettenhausen KD, Marenbach P, Freyer S, Rettenmaier H, Nieken U. Self-organizing structured modeling of a...
  • B. McKay et al.

    Identification of industrial processes using genetic programming

    Identification in Engineering Systems

    (1996)