abstract = "We are interested in learning concepts expressed in a
fragment of first-order logic (FOL). This subject is
known as Inductive Logic Programming (ILP), where the
knowledge to be learn is expressed by Horn clauses,
which are used in programming languages based on logic
programming like Prolog. Learning systems that use a
representation based on first-order logic have been
successfully applied to relevant real life problems,
e.g., learning a specific property related to
carcinogenicity. Learning first-order hypotheses is a
hard task, due to the huge search space one has to deal
with. The approach used by the majority of ILP systems
tries to overcome this problem by using specific search
strategies, like the top-down and the inverse
resolution mechanism (see chapter 2). However, the
greedy selection strategies adopted for reducing the
computational effort, render techniques based on this
approach often incapable of escaping from local optima.
An alternative approach is offered by genetic
algorithms (GAs). GAs have proved to be successful in
solving comparatively hard optimization problems, as
well as problems like ICL. GAs represents a good
approach when the problems to solve are characterised
by a high number of variables, when there is
interaction among variables, when there are mixed types
of variables, e.g., numerical and nominal, and when the
search space presents many local optima. Moreover it is
easy to hybridise GAs with other techniques that are
known to be good for solving some classes of problems.
Another appealing feature of GAs is represented by
their intrinsic parallelism, and their use of
exploration operators, which give them the possibility
of escaping from local optima. However this latter
characteristic of GAs is also responsible for their
rather poor performance on learning tasks which are
easy to tackle by algorithms that use specific search
strategies. These observations suggest that the two
approaches above described, i.e., standard ILP
strategies and GAs, are applicable to partly
complementary classes of learning problems. More
important, they indicate that a system incorporating
features from both approaches could profit from the
different benefits of the approaches. This motivates
the aim of this thesis, which is to develop a system
based on GAs for ILP that incorporates search
strategies used in successful ILP systems. Our approach
is inspired by memetic algorithms (Moscato, 1989), a
population based search method for combinatorial
optimization problems. In evolutionary computation
memetic algorithms are GAs in which individuals can be
refined during their lifetime.
In particular the thesis introduces a hybrid
evolutionary system called ECL (Evolutionary Concept
Learner). ECL uses four intelligent mutation operators
and an optimization phase that follows each
mutation.
Two mutation operators are used for generalisation of
rules, and the other two for specialisation of rules.
The optimisation phase consists of the repeated
application of mutation operators until the fitness of
the individual being optimised increases.
A high level representation of rules is adopted, in
order to enable the use of these mutation operators.
Rules are represented as a list of predicates,
variables and constants. In this way at each time of
the evolutionary process ECL can distinguish between
the various part of the rule.
A selection mechanisms, called EWUS, is used in order
to select individuals and to promote diversity in the
population. This last aspect is very important in all
EAs system of ICL.
A method for handling numerical values is used, which
evolves discretization intervals along with rules, so
that each rule can have a discretization intervals that
is good for itself.
ECL proved to be competitive with other state of the
art systems for ICL, both in the relational and in the
propositional settings.
You can obtain a copy by clicking on the picture below.
Would you prefer a printed copy of the thesis, request
it with an email.",