An adaptive knowledge-acquisition system using generic genetic programming

https://doi.org/10.1016/S0957-4174(98)00010-4

Abstract

The knowledge-acquisition bottleneck greatly obstructs the development of knowledge-based systems. One popular approach to knowledge acquisition uses inductive concept learning to derive knowledge from examples stored in databases. However, existing learning systems cannot improve themselves automatically. This paper describes an adaptive knowledge-acquisition system that can learn first-order logical relations and improve itself automatically. The system is composed of an external interface, a biases base, a knowledge base of background knowledge, an example database, an empirical ILP learner, a meta-level learner, and a learning controller. In this system, the empirical ILP learner performs top-down search in the hypothesis space defined by the concept description language, the language bias, and the background knowledge. The search is directed by search biases which can be induced and refined by the meta-level learner based on generic genetic programming.

It has been demonstrated that the adaptive knowledge-acquisition system performs better than FOIL at inducing logical relations from both perfect and noisy training examples. The result implies that the search bias evolved by evolutionary learning is superior to the hand-crafted bias of FOIL, which was designed by a leading researcher in the field. Consequently, generic genetic programming is a promising technique for implementing a meta-level learning system. The result is encouraging because it suggests that the process of natural selection and evolution can evolve a high-performance learning system.

Introduction

The knowledge-acquisition bottleneck greatly obstructs the development of knowledge-based systems. One popular approach to knowledge acquisition uses inductive concept learning to derive knowledge from examples stored in databases. The knowledge acquired can be expressed in different knowledge representations such as first-order logical relations, decision trees, decision lists, and production rules. Existing learning systems such as CART (Breiman et al., 1984), C4.5 (Quinlan, 1992), ASSISTANT (Cestnik et al., 1987), AQ15 (Michalski et al., 1986), and CN2 (Clark and Niblett, 1989) use an attribute-value language for representing the training examples and the induced knowledge, and assume a finite number of objects in the universe of discourse. This representation limits them to learning propositional descriptions, in which concepts are described in terms of the values of a fixed number of attributes.

Dzeroski and Lavrac show that Inductive Logic Programming (ILP) can be used to induce knowledge represented as first-order logical relations (Dzeroski and Lavrac, 1993; Dzeroski, 1996). ILP is more powerful than traditional inductive learning methods because it uses an expressive first-order logic framework and facilitates the application of background knowledge. In this formalism, domain knowledge represented in the form of relations can be used in the induced relational descriptions of concepts. Moreover, ILP has a strong theoretical foundation from logic programming and computational learning theory.
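
As a concrete illustration of this formalism, consider a standard family example from the ILP literature: given background relations parent/2 and female/1 together with positive and negative examples of the target relation daughter/2, an ILP system can induce the clause daughter(X, Y) ← female(X), parent(Y, X). The Python sketch below is purely illustrative (the data and names are invented here, not taken from the paper); it shows how background facts and a candidate clause relate to the training examples.

    # Hypothetical ILP setting: background knowledge as ground facts,
    # training examples for a target relation, and one candidate clause.
    parent = {("ann", "mary"), ("ann", "tom"), ("tom", "eve"), ("tom", "ian")}
    female = {"ann", "mary", "eve"}

    # Positive and negative examples of the target relation daughter(X, Y).
    positives = {("mary", "ann"), ("eve", "tom")}
    negatives = {("tom", "ann"), ("eve", "ann"), ("ian", "tom")}

    def candidate_clause(x, y):
        """Candidate hypothesis: daughter(X, Y) :- female(X), parent(Y, X)."""
        return x in female and (y, x) in parent

    # A hypothesis is complete if it covers every positive example and
    # consistent if it covers no negative example.
    complete = all(candidate_clause(x, y) for x, y in positives)
    consistent = not any(candidate_clause(x, y) for x, y in negatives)
    print(complete, consistent)   # -> True True on this toy data

An attribute-value learner cannot express this clause directly, because the description refers to a relation between objects (a person and her parent) rather than to the values of a fixed set of attributes.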

The task of inducing first-order logical relations can be formulated as a search problem (Mitchell, 1982) in a hypothesis space of logical relations. Various approaches (Quinlan, 1990; Muggleton and Feng, 1990) differ mainly in the search strategy and the heuristics used to guide the search. Because the search space is extremely large, strong heuristics are required to keep the problem tractable. Most systems rely on a greedy search strategy: they generate a sequence of logical relations from general to specific (or from specific to general) until a consistent relation is found, each relation being obtained by specializing (or generalizing) the previous one. For example, FOIL (Quinlan, 1990) applies a hill-climbing search guided by an information-gain heuristic to search for relations from general to specific. However, such greedy strategies and heuristics are not always adequate because the systems may become trapped in local maxima. To overcome this problem, non-greedy strategies should be adopted. Moreover, existing ILP systems cannot improve themselves automatically.
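
To make the greedy, general-to-specific search concrete, the sketch below computes FOIL's information-gain heuristic and uses it to select a single literal at each specialization step. The formula follows Quinlan (1990); the function names and the way candidate statistics are supplied are illustrative assumptions rather than code from FOIL itself.

    import math

    def foil_gain(pos_before, neg_before, pos_after, neg_after, t):
        """FOIL's information gain for adding one literal to a clause (Quinlan, 1990).
        pos_before/neg_before: positive/negative tuples covered by the clause so far;
        pos_after/neg_after:   tuples still covered after adding the candidate literal;
        t: positive tuples covered before that remain covered afterwards."""
        info_before = -math.log2(pos_before / (pos_before + neg_before))
        info_after = -math.log2(pos_after / (pos_after + neg_after))
        return t * (info_before - info_after)

    def best_literal(pos_before, neg_before, candidates):
        """Hill-climbing step: keep only the single literal with the highest gain.
        candidates maps each candidate literal to (pos_after, neg_after, t).
        Committing to one literal per step is what can trap the search in a
        local maximum."""
        return max(candidates,
                   key=lambda lit: foil_gain(pos_before, neg_before, *candidates[lit]))

    # Hypothetical coverage statistics for two candidate literals
    print(best_literal(100, 80, {"parent(Y,X)": (60, 10, 60), "female(X)": (70, 40, 70)}))

Because only the highest-scoring literal survives each step, a literal that looks poor locally but enables a much better clause later is never explored; this is the local-maximum problem mentioned above.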

In this paper, we describe an adaptive knowledge-acquisition system that induces first-order logical relations and improves itself during learning. We formulate the definitions of inductive concept learning and adaptive knowledge acquisition in the next section. The system is based on a generic genetic programming approach that is presented in Section 3. A generic top-down first-order learning algorithm is described in Section 4. Section 5 contains a description of a meta-level learner that induces search bias. Experiments and an evaluation of the system are reported in Section 6. Finally, conclusions are presented in Section 7.

Section snippets

Inductive concept learning and adaptive knowledge acquisition

The goal of machine learning is to develop techniques and tools for building intelligent learning machines. Machine learning paradigms include inductive, deductive, genetic-based, and connectionist learning. Multi-strategy learning integrates several learning paradigms. This section focuses on supervised inductive concept learning. If U is a universal set of observations, a concept C is formalized as a subset of observations in U. Inductive concept learning finds descriptions for various target
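
The snippet above breaks off in mid-definition. Stated in full (paraphrasing the standard formulation, e.g. Dzeroski and Lavrac, 1993), the task is: given positive examples $E^{+} \subseteq C$ and negative examples $E^{-} \subseteq U \setminus C$, find a hypothesis $H$ in the chosen description language such that

    $\forall e \in E^{+}: H \text{ covers } e \quad (\text{completeness})$
    $\forall e \in E^{-}: H \text{ does not cover } e \quad (\text{consistency})$

When background knowledge $B$ is available, as in ILP, coverage is judged with respect to $B$ together with $H$.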

Generic genetic programming (GGP)

Generic Genetic Programming (GGP) is a novel approach that combines genetic programming (Koza, 1992, Koza, 1994; Kinnear, 1994) and inductive logic programming (Quinlan, 1990; Muggleton, 1992). Using GGP, programs in various programming languages can be evolved. The approach is also powerful enough to handle context-sensitive information and domain-dependent knowledge, which can be used to accelerate learning and/or improve the quality of the induced programs.

GGP can induce programs in various
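
The key representational idea in GGP is that each individual is a derivation tree of a user-supplied grammar, so every program produced by initialization, crossover, or mutation is syntactically valid, and the grammar itself can carry domain knowledge. The sketch below illustrates only grammar-constrained generation with a toy context-free grammar; GGP itself uses logic grammars, which are strictly more expressive, and all names here are assumptions made for the illustration.

    import random

    # A toy grammar for arithmetic scoring expressions over two tuple counts p and n.
    GRAMMAR = {
        "<expr>": [["<expr>", "<op>", "<expr>"], ["<term>"]],
        "<op>":   [["+"], ["-"], ["*"]],
        "<term>": [["p"], ["n"], ["log2(p / (p + n))"]],
    }

    def derive(symbol, rng, depth=0, max_depth=4):
        """Expand a grammar symbol into a sentence by random derivation.
        Grammar-based GP evolves such derivation trees: crossover and mutation
        swap or regrow subtrees rooted at the same grammar symbol, so offspring
        always remain legal sentences of the grammar."""
        if symbol not in GRAMMAR:
            return symbol                        # terminal symbol
        productions = GRAMMAR[symbol]
        if depth >= max_depth:
            productions = [productions[-1]]      # force a non-recursive production
        body = rng.choice(productions)
        return " ".join(derive(s, rng, depth + 1, max_depth) for s in body)

    rng = random.Random(0)
    print(derive("<expr>", rng))   # a randomly generated, grammatically valid expression

Because the grammar fixes what can be generated, it acts as a declarative language bias: tightening or extending its productions changes the space of reachable programs without touching the search procedure.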

A generic top-down first-order learning algorithm

This section presents a generic top-down first-order learning algorithm based on FOIL (Quinlan, 1990). The algorithm, depicted in Table 4, consists of three steps. In the pre-processing step, missing argument values in training examples are handled by assigning default or random values to them, and a training example is removed if it has too many missing values. If there are no negative examples in the training set, or too few, they can be generated. Different ways of
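
Table 4 itself is not reproduced in this excerpt, but the overall shape of a FOIL-style top-down covering algorithm, with the scoring function left as a pluggable parameter, can be sketched as follows. The sketch is an assumption-laden simplification: the literal representation, helper names, and toy data are invented here, and the real algorithm also performs the pre- and post-processing steps described in the text. The point of interest is that the search bias enters only through the score argument, which is the component the meta-level learner is later allowed to change.

    def learn_relation(positives, negatives, candidate_literals, score):
        """Top-down (general-to-specific) covering algorithm in the style of FOIL.
        Literals are (name, predicate) pairs; score is the procedural search bias
        that ranks candidate specializations.  All names are placeholders."""
        def covers(clause, example):
            return all(pred(example) for _, pred in clause)

        theory, remaining = [], set(positives)
        while remaining:                                   # outer covering loop
            clause, candidates = [], list(candidate_literals)
            pos_cov, neg_cov = set(remaining), set(negatives)
            while neg_cov and candidates:                  # inner specialization loop
                best = max(candidates, key=lambda lit: score(pos_cov, neg_cov, lit))
                candidates.remove(best)
                clause.append(best)
                pos_cov = {e for e in pos_cov if covers(clause, e)}
                neg_cov = {e for e in neg_cov if covers(clause, e)}
            newly_covered = {e for e in remaining if covers(clause, e)}
            if not newly_covered:
                break                                      # no progress; give up
            theory.append([name for name, _ in clause])
            remaining -= newly_covered
        return theory

    # Toy usage with the family data from the earlier sketch (hypothetical).
    parent = {("ann", "mary"), ("ann", "tom"), ("tom", "eve")}
    female = {"ann", "mary", "eve"}
    literals = [("female(X)", lambda e: e[0] in female),
                ("parent(Y,X)", lambda e: (e[1], e[0]) in parent)]

    def naive_score(pos_cov, neg_cov, lit):
        """A deliberately naive search bias: positives kept minus negatives kept."""
        _, pred = lit
        return sum(pred(e) for e in pos_cov) - sum(pred(e) for e in neg_cov)

    print(learn_relation({("mary", "ann"), ("eve", "tom")},
                         {("tom", "ann"), ("ann", "mary")},
                         literals, naive_score))
    # -> [['female(X)', 'parent(Y,X)']]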

Inducing procedural search biases

In this section, GGP is used in the meta-level learner to induce procedural search biases (i.e., the 'scoring' function). To employ GGP, a logic grammar must be defined (Table 6).

In the grammar, the terminal symbols n-pos-i-plus-1, n-neg-i-plus-1, and n-pos-i represent respectively $n^{+}_{i+1}$, $n^{-}_{i+1}$, and $n^{+}_{i}$. With reference to the algorithms in Tables 4 and 5, assume that $E_i$ is the extension of the current training examples $E_{current}$ by the current clause $C_i$, and that $n^{+}_{i}$ and $n^{-}_{i}$ are respectively the
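
A procedural search bias, in this setting, is simply an expression built from these quantities. FOIL's own information-gain heuristic is one such expression, so it is one sentence the grammar can derive; the meta-level learner searches for other sentences that perform better. The sketch below is an illustration under that reading, not the grammar of Table 6 itself: the Python names for the terminal symbols and the second, alternative expression are invented, and for simplicity the FOIL-like expression weights by $n^{+}_{i+1}$ rather than by the number of positive tuples that remain covered.

    import math

    # Bind the grammar's terminal symbols for one candidate specialization.
    def make_bindings(n_pos_i, n_neg_i, n_pos_i_plus_1, n_neg_i_plus_1):
        return {"n_pos_i": n_pos_i, "n_neg_i": n_neg_i,
                "n_pos_i_plus_1": n_pos_i_plus_1, "n_neg_i_plus_1": n_neg_i_plus_1,
                "log2": math.log2}

    # Two scoring expressions the grammar might derive (illustrative only):
    foil_like = ("n_pos_i_plus_1 * (log2(n_pos_i_plus_1 / (n_pos_i_plus_1 + n_neg_i_plus_1))"
                 " - log2(n_pos_i / (n_pos_i + n_neg_i)))")
    alternative = "n_pos_i_plus_1 - 2 * n_neg_i_plus_1"

    # Evaluating a derived sentence turns it into a usable procedural search bias.
    for expr in (foil_like, alternative):
        score = eval(expr, {"__builtins__": {}}, make_bindings(100, 80, 60, 10))
        print(f"{expr[:40]}... -> {score:.2f}")

The fitness of such an expression is then judged not by the number it returns on a single clause, but by how well the empirical ILP learner performs when the expression is plugged in as its scoring function.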

Experimentation and evaluations

This section compares the performance of our system with that of FOIL (Quinlan, 1990). Standard learning tasks in the literature are used in these experiments (Quinlan, 1990; Muggleton and Feng, 1990).

Conclusion

In this paper, we formulate an adaptive knowledge-acquisition system which is composed of an external interface, a biases base, a knowledge base of background knowledge, an example database, an empirical ILP learner, a meta-level learner, and a learning controller. An implementation of the adaptive knowledge-acquisition system has been developed. In the implementation, the empirical ILP learner performs top-down search in the hypothesis space defined by the concept description language, the

Acknowledgements

This work was partially supported by Hong Kong Baptist University FRG Grant (FRG/96-97/4-28) and Hong Kong RGC CERG Grant CUHK 486/95E.

References (27)

  • Mitchell, T. M. (1982). Generalization as search. Artificial Intelligence.
  • Pereira, F. C. N., et al. (1980). Definite clause grammars for language analysis — a survey of the formalism and a comparison with augmented transition networks. Artificial Intelligence.
  • Abramson, H., & Dahl, V. (1989). Logic grammars. Berlin:...
  • Breiman, L. et al. (1984). Classification and regression trees. Belmont, CA:...
  • Cestnik, B. et al. (1987). ASSISTANT 86: A knowledge elicitation tool for sophisticated users. In I. Bratko & N. Lavrac...
  • Clark, P., et al. (1989). The CN2 induction algorithm. Machine Learning.
  • Cohen, W. (1992). Compiling prior knowledge into an explicit bias. In Proceedings of the Ninth International Workshop...
  • Colmerauer, A. (1978). Metamorphosis grammars. In L. Bolc (Ed.), Natural language communication with computers. Berlin:...
  • Dzeroski, S. (1996). Inductive logic programming and knowledge discovery in databases. In U. M. Fayyad, G....
  • Dzeroski, S., et al. (1993). Inductive learning in deductive databases. IEEE Transactions on Knowledge and Data Engineering.
  • Gruau, F. (1996). On using syntactic constraints with genetic programming. In P. J. Angeline, & K. E. Kinnear Jr....
  • Hopcroft, J. E., & Ullman, J. D. (1979). Introduction to automata theory, languages, and computation. MA:...
  • Kinnear Jr., K. E. (Ed.) (1994). Advances in genetic programming. Cambridge, MA: MIT...