Combining GP operators with SA search to evolve fuzzy rule based classifiers
Introduction
A linguistically understandable fuzzy classifier system is defined by a base of fuzzy rules. Following Zadeh [17], a fuzzy rule base can be described at two levels of detail. The surface structure of the base, the so-called "linguistic rule base", comprises the linguistic rules; the deep structure comprises this set of rules plus the definitions of the linguistic partitions of all features used in the classification system. Every linguistic term is tied to a fuzzy set defined over one of the features.
The algorithms used for inducing a fuzzy classifier from a classified sample can operate in two different manners. Some of them learn the fuzzy partitions and the surface structure separately, while others integrate both parts in the same process. In the first case it is common to assume an equidistant partition and to learn the rule base iteratively [8], [13], [15] or by means of genetic algorithms [4], [16]. The membership functions can also be approximated with clustering techniques followed by projections [1], [2]. The latter case relies on optimization or search techniques [6], in many cases genetic algorithms, with a linear [3] or tree-shaped [14] genotype.
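The equidistant partition mentioned above can be made concrete with a small sketch. The function names and the five-term granularity below are illustrative assumptions, not taken from the paper; triangular membership functions with evenly spaced peaks are one common choice.

```python
# Sketch: an equidistant triangular fuzzy partition of a feature range,
# as assumed by methods that learn only the surface structure.
# Names and the 5-term granularity are illustrative, not from the paper.

def triangular(x, a, b, c):
    """Triangular membership function with support (a, c) and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def equidistant_partition(lo, hi, n_terms):
    """Return n_terms triangular sets with evenly spaced peaks over [lo, hi]."""
    step = (hi - lo) / (n_terms - 1)
    return [(lo + i * step - step, lo + i * step, lo + i * step + step)
            for i in range(n_terms)]

# Membership degrees of x = 0.3 in a 5-term partition of [0, 1];
# adjacent memberships sum to 1, a usual property of such partitions.
terms = equidistant_partition(0.0, 1.0, 5)
degrees = [triangular(0.3, *t) for t in terms]
```

With this layout a crisp value always belongs to at most two adjacent linguistic terms, which keeps the linguistic interpretation of the partition simple.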
Two different codifications of the base have been used when applying GAs to this problem: linear genotypes and tree-shaped genotypes, the latter giving rise to genetic programming. Linear-genotype methods almost always use rules whose antecedents comprise linguistic terms, linguistic modifiers and the logical connective "and". With tree-shaped genotypes, whole rule bases are regarded as single strings of a context-free grammar and are represented by means of their parse tree [4] or by a pair composed of a syntactic tree and a string of parameters [14]. We will refer to both approaches as "grammar-based GP". These genotypes should allow us to represent rule bases more compactly than a linear representation can; hence we are mainly concerned with these representations.
In grammar-based GP, only subtrees generated by the same production rule can be interchanged, so as to preserve the syntactic correctness of the offspring. Mutation can be defined in different ways. We will cross the individual being mutated with a randomly generated individual (a concept similar to that of "headless chicken" crossover [9]) and select one of the two offspring. We will show that this mutation operator can be used within an SA-based search to learn simultaneously the fuzzy partitions and the surface structure of a rule base. By comparing GP to SA solutions for a fixed number of fitness evaluations, we will also show that SA is not worse than GP for this problem; thus SA can be a good choice when the rule bases are large, since it suffices to keep two individuals in memory at a time, while a GA may need hundreds or thousands of them.
This paper is organized as follows: first, we define how a fuzzy classification system is represented and state the objective of the learning algorithm. Then we propose an SA algorithm able to search for a solution in the space of all tree-shaped genotypes of fuzzy rule bases. Finally, numerical results comparing the behavior of SA and GA are shown.
Section snippets
Representation of an individual. Grammar of a valid fuzzy rule base
A classifier system is a decision rule that assigns a class to every point in the feature space [5]. We will regard fuzzy classifiers as linguistic representations of these decision rules, and say that two different linguistic expressions that represent the same decision surfaces are two equivalent fuzzy classifiers.
Not all decision surfaces can be represented by linguistic rules. Linguistic classifier induction has two goals: (1) finding the most precise representable decision…
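The view of a fuzzy classifier as a decision rule can be illustrated with a minimal sketch. The "single winner" reasoning method below (the rule with the highest firing strength labels the point) is one common choice in fuzzy rule-based classification; the rule format, membership functions and data are placeholders, not the paper's own definitions.

```python
# Sketch: a fuzzy rule base acting as a decision rule over the feature
# space, using single-winner reasoning. All definitions are illustrative.

membership = {
    0: {"low":  lambda v: max(0.0, 1.0 - 2.0 * v),
        "high": lambda v: max(0.0, 2.0 * v - 1.0)},
}

rule_base = [
    {"if": [(0, "low")],  "then": "c0"},   # IF x0 is low  THEN class c0
    {"if": [(0, "high")], "then": "c1"},   # IF x0 is high THEN class c1
]

def firing_strength(rule, x):
    """Min-conjunction of the memberships of the antecedent terms."""
    return min(membership[f][t](x[f]) for f, t in rule["if"])

def classify(x):
    """Assign the class of the rule with the highest firing strength."""
    return max(rule_base, key=lambda r: firing_strength(r, x))["then"]

label = classify({0: 0.9})   # "high" fires at 0.8, "low" at 0.0
```

Two rule bases that produce the same label for every point of the feature space realize the same decision surface, and are therefore equivalent classifiers in the sense defined above, even if their linguistic expressions differ.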
Genetic programming and simulated annealing
In this section we incorporate the macromutation operator, and the representation of a fuzzy rule base developed before, into the SA (Metropolis) algorithm. There are small differences between GA mutation and the operation needed in SA: the "adjacent" operation in SA must produce a random individual that is, in some sense, near the individual being mutated. This is immediate in real-valued optimization, but not so evident when optimizing a function defined over the strings of a grammar. We have used an
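The Metropolis loop into which such an "adjacent" operation is plugged can be sketched as follows. The parameter names, the geometric cooling schedule, and the toy objective are assumptions for illustration; in the setting above, `neighbour` would be the grammar-based macromutation and `error` the classifier's training error.

```python
import math
import random

# Sketch of a Metropolis/SA loop; `neighbour` stands in for the
# grammar-based macromutation and `error` for the classifier's error.

def anneal(initial, error, neighbour,
           t0=0.1, alpha=0.95, steps_per_t=50, t_min=1e-4):
    current = best = initial
    t = t0
    while t > t_min:
        for _ in range(steps_per_t):
            cand = neighbour(current)
            delta = error(cand) - error(current)
            # Accept improvements always; accept worse moves with
            # Boltzmann probability exp(-delta / t).
            if delta <= 0 or random.random() < math.exp(-delta / t):
                current = cand
                if error(current) < error(best):
                    best = current
        t *= alpha                      # geometric cooling
    return best

# Toy usage: minimise (v - 3)^2 over the reals.
random.seed(1)
result = anneal(0.0,
                lambda v: (v - 3.0) ** 2,
                lambda v: v + random.uniform(-1.0, 1.0))
```

Note that only two individuals (the current one and the candidate, plus the best seen so far) need to be held in memory, which is the storage advantage over a population-based GA mentioned in the introduction.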
Numerical results
All values defining the SA execution parameters are displayed in Table 1. Every experiment was repeated 10 times, starting from different random candidates. SA was rather more sensitive to the initial temperature and cooling schedule than GA is to its own execution parameters, so tuning the learning was more difficult in SA than it was in GA. As a rule of thumb, we obtained good results with an initial temperature approximately equal to the expected percent error of the classifier (i.e., T
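The rule of thumb for the initial temperature can be made concrete with a short calculation. The 15% error figure below is a hypothetical example, not a result from the paper.

```python
import math

# Sketch of the initial-temperature rule of thumb: set t0 close to the
# expected error, so early moves that worsen the error by roughly that
# amount are still frequently accepted. The 15% figure is hypothetical.

def accept_probability(delta, t):
    """Metropolis acceptance probability for an error increase of delta."""
    return 1.0 if delta <= 0 else math.exp(-delta / t)

expected_error = 0.15        # hypothetical expected misclassification rate
t0 = expected_error          # rule-of-thumb initial temperature

# At t0, a move that worsens the error by the full expected error is
# still accepted with probability exp(-1), about 0.37:
p = accept_probability(expected_error, t0)
```

Setting the temperature on the same scale as typical error differences is what makes the early search exploratory without being a pure random walk.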
Concluding remarks and future work
In this work we have shown that the research devoted to new representations of fuzzy rule bases and to application-dependent crossover operators can be applied to different search schemes, allowing them to be applied to new fields. It remains to be studied whether this technique is also useful for inducing rules in other problems. Preliminary results in fuzzy modeling point in this direction. The ability of this method to perform feature selection is also interesting, and it should be studied in
References (17)
- et al., Induction of fuzzy rules and membership functions from training examples, Fuzzy Sets and Systems (1996)
- Distributed representation of fuzzy rules and its application to pattern classification, Fuzzy Sets and Systems (1992)
- et al., Linguistic recognition system based in approximate reasoning, Information Sciences (1992)
- et al., A genetic algorithm for generating fuzzy classification rules, Fuzzy Sets and Systems (1996)
- et al., A method for fuzzy rule extraction directly from numerical data and its application to pattern classification, IEEE Transactions on Fuzzy Systems (1995)
- et al., A fuzzy classifier with ellipsoidal regions, IEEE Transactions on Fuzzy Systems (1997)
- et al., A proposal on reasoning methods in fuzzy rule-based classification systems, International Journal of Approximate Reasoning (1991)
- Fuzzy Rule-Based Expert Systems and Genetic Machine Learning (1997)