Elsevier

Neurocomputing

Volume 116, 20 September 2013, Pages 311-316
Two-stage learning for multi-class classification using genetic programming

https://doi.org/10.1016/j.neucom.2012.01.048

Abstract

This paper introduces a two-stage strategy for multi-class classification problems. The proposed technique is an advancement of the traditional binary decomposition method. In the first stage, the classifiers are trained for each class versus the remaining classes. A modified fitness value is used to select good discriminators for the imbalanced data. In the second stage, the classifiers are integrated and treated as a single chromosome that can classify any of the classes from the dataset. A population of such classifier-chromosomes is created from good classifiers (for individual classes) of the first stage. This population is evolved further, with a fitness that combines accuracy and conflicts. The proposed method encourages classifier combinations with good discrimination among all classes and fewer conflicts. The two-stage learning has been tested on several benchmark datasets and the results are encouraging.

Introduction

Data classification finds application in many real-world problems, such as fraud detection, face recognition, speech recognition and knowledge extraction from databases. The field of data classification is receiving increased attention due to the unpredictability and complexity of real-world data. Evolutionary algorithms have demonstrated good performance on classification tasks. Genetic Programming (GP) is an evolutionary algorithm introduced by Koza [1] for the automatic evolution of computer programs (including classifiers). GP has been successfully used for the evolution of classifier-programs such as decision trees [2]. Other GP based classification approaches include the evolution of neural networks [3], [4], [5], autonomous classification systems [6], rule induction algorithms [7], fuzzy rule based systems and fuzzy Petri nets [5], [8]. Most of these methods involve defining a grammar that is used to create and evolve classification algorithms using GP.

Various researchers [9], [10], [11], [12], [13] have used GP for the evolution of classification rules. The rule based systems include atomic representations proposed by Eggermont [14], [15] and SQL based representations proposed by Freitas et al. [12]. Tunstel and Jamshidi [16], Berlanga et al. [17] and Mendes et al. [18] introduced the evolution of fuzzy rules using GP. Chien et al. [19] used a fuzzy discrimination function for classification. Falco et al. [20] discovered comprehensive classification rules that use continuous value attributes. Bojarczuk et al. [21], [22] used different sets of functions applicable to different types of attributes, representing rules in disjunctive normal form. This type of GP is also referred to as constrained syntax GP. Tsakonas et al. [23] introduced two GP based systems for the medical domain and achieved noticeable performance. Lin et al. [24] proposed a layered GP, where different layers correspond to different populations that perform feature extraction and classification. Another method is the evolution of arithmetic expressions for classification. The arithmetic expressions can be used for numerical data and they output a real value. This real value is translated into the classification decision using different thresholds. These include static thresholds [25], [26], dynamic thresholds [26], [27] and slotted thresholds [28].

Multi-class classification problems are common in real-world applications, for tasks such as object recognition, character recognition, person recognition, disease diagnosis and several others. Many classification algorithms are binary in nature and must be extended for multi-class classification. These include neural networks, decision trees, k-nearest neighbor, naive Bayes classifiers, and support vector machines [29]. GP also needs to be extended for multi-class classification problems. Several methods have been presented to use GP for multi-class classification problems. The most notable among them is the one-versus-all method, also known as the binary decomposition method. This method has been used widely in GP based multi-class classification. In this method, one classifier is evolved for each class, discriminating that particular class from the other classes present in the data. The final decision is made by presenting the input vector to the classifiers of all classes. The classifier with a positive or the highest output is declared the winner. This method has been explored by many researchers [30], [31], [32], [33], [34]. A relatively different method, proposed by Muni et al. [35], uses a multi-tree representation, where a single classifier is an integrated version of the individual classifiers for all classes. This amalgamated classifier is evolved in search of the best classifier that has the ability to classify any of the classes in a single evolution.
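The one-versus-all decision rule described above can be sketched as follows. This is only a minimal illustration: the discriminators are hand-written stand-ins for evolved GP expressions, and the names `classify_one_vs_all` and `discriminators` are hypothetical, not taken from the paper. The sketch uses the "highest output wins" variant of the rule.

```python
from typing import Callable, Dict, Sequence

# A discriminator maps an input vector to a real-valued output.
Discriminator = Callable[[Sequence[float]], float]


def classify_one_vs_all(x: Sequence[float],
                        discriminators: Dict[str, Discriminator]) -> str:
    """Present x to the classifier of every class; the highest output wins."""
    outputs = {label: f(x) for label, f in discriminators.items()}
    return max(outputs, key=outputs.get)


# Hypothetical stand-ins for evolved arithmetic expressions (assumption).
discriminators: Dict[str, Discriminator] = {
    "a": lambda x: x[0] - x[1],
    "b": lambda x: x[1] - x[0],
    "c": lambda x: 0.5 - abs(x[0] - x[1]),
}

prediction = classify_one_vs_all([3.0, 1.0], discriminators)
```

Taking the maximum always yields a decision, but it hides the cases where several classifiers, or none, produce a positive output.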

Several other methods, such as ‘all versus all’ [36], error correcting output codes [37], and generalized error correcting output codes [38], have also been used to tackle multi-class classification problems with binary classification algorithms. However, none of them has been used in GP due to the large number of computations they require.

A drawback of the binary decomposition method is the occurrence of conflicting situations, where more than one classifier outputs a positive signal or none of the classifiers outputs a belong-to signal. This situation degrades the classification accuracy. Several conflict resolution methods have been devised for this problem, but they require extra processing during the training and classification steps. Another problem is the presence of skewed data. The data appears unbalanced when a single class is classified against the remaining classes. This problem has been addressed by increasing the number of training instances to make them appear balanced for each class [30], [36]. This is named the ‘interleaved data format’, where the samples belonging to the class under consideration are repeated and alternately placed between samples belonging to other classes. This strategy increases the training data as well as the training time.
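To make the interleaved data format concrete, the sketch below builds a balanced binary training set for one class by cycling through its samples and placing one of them before every sample of the other classes. The function name `interleave_for_class` and the exact ordering are assumptions made for illustration; the cited works may arrange the repeats differently.

```python
from itertools import cycle
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]        # (feature vector, class label)
BinarySample = Tuple[Sequence[float], int]  # (feature vector, 1 = target class)


def interleave_for_class(samples: List[Sample], target: str) -> List[BinarySample]:
    """Alternate repeated target-class samples with samples of other classes."""
    positives = [s for s in samples if s[1] == target]
    negatives = [s for s in samples if s[1] != target]
    if not positives:
        raise ValueError(f"no samples found for class {target!r}")
    repeated = cycle(positives)  # target samples are reused as often as needed
    interleaved: List[BinarySample] = []
    for features, _ in negatives:
        interleaved.append((next(repeated)[0], 1))
        interleaved.append((features, 0))
    # Assumption: negatives outnumber positives, as in one-versus-rest data;
    # otherwise some positives would never be emitted by this simple sketch.
    return interleaved


samples = [([1.0], "a"), ([2.0], "b"), ([3.0], "b"), ([4.0], "c")]
balanced = interleave_for_class(samples, "a")
```

With four original samples, the set built for class "a" already holds six entries, which illustrates why this strategy increases training time.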

The proposed staged approach overcomes these two problems. It evolves the classifiers in two different stages that perform discrimination and integration, and incorporates a discriminative fitness function that handles skewed data without increasing the computation. The integrated evolution eliminates the conflicting situations, reducing the evaluation time otherwise required for conflict resolution. The proposed algorithm is detailed in the next section.
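The exact fitness functions are not given in this excerpt, so the following is only a sketch of the general idea under stated assumptions. For the first stage it uses the mean of the hit rates on target and non-target samples as a skew-tolerant stand-in, and for the second stage it scores a chromosome (one discriminator per class) by accuracy minus a weighted conflict rate. The names `balanced_fitness`, `chromosome_fitness` and the parameter `conflict_weight` are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Discriminator = Callable[[Sequence[float]], float]
Sample = Tuple[Sequence[float], str]


def balanced_fitness(f: Discriminator, samples: List[Sample], target: str) -> float:
    """Stage-one stand-in: mean hit rate over target and non-target samples."""
    pos_hits = pos_total = neg_hits = neg_total = 0
    for x, label in samples:
        fires = f(x) > 0
        if label == target:
            pos_total += 1
            pos_hits += fires
        else:
            neg_total += 1
            neg_hits += not fires
    tpr = pos_hits / pos_total if pos_total else 0.0
    tnr = neg_hits / neg_total if neg_total else 0.0
    return (tpr + tnr) / 2


def chromosome_fitness(chromosome: Dict[str, Discriminator],
                       samples: List[Sample],
                       conflict_weight: float = 0.5) -> float:
    """Stage-two stand-in: accuracy minus a penalty on conflicting outputs."""
    if not samples:
        raise ValueError("samples must not be empty")
    correct = conflicts = 0
    for x, label in samples:
        positive = [c for c, f in chromosome.items() if f(x) > 0]
        if len(positive) != 1:      # several classifiers fire, or none does
            conflicts += 1
        elif positive[0] == label:  # exactly one fires and it is the right one
            correct += 1
    n = len(samples)
    return correct / n - conflict_weight * conflicts / n


# Hand-written stand-ins for evolved expressions on one feature (assumption).
good: Dict[str, Discriminator] = {
    "low": lambda x: 1.0 - x[0],
    "mid": lambda x: (x[0] - 1.0) * (2.0 - x[0]),
    "high": lambda x: x[0] - 2.0,
}
conflicting = dict(good, high=lambda x: x[0] - 1.2)  # overlaps with "mid"
samples = [([0.5], "low"), ([1.5], "mid"), ([2.5], "high"), ([0.2], "low")]
```

In this toy data a discriminator that never fires still reaches 75% plain accuracy on the "mid" class, whereas the balanced measure rates it at 0.5, and the overlapping chromosome is penalized for the one sample on which two classifiers fire.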

Proposed methodology

Many attempts have been made to develop general approaches to multi-class classification. One of the well-known methods in the machine learning community is the one-versus-all method. It involves learning a discriminator for each class label against all the remaining labels. The proposed classification mechanism uses the same principle but divides the training process into two stages. The first stage resembles the traditional binary decomposition method. The output of this stage is a set of classifier populations for

Results

Five benchmark multi-class classification problems have been selected from the UCI ML repository [41] for the performance evaluation of this work. We selected the datasets based on the following properties:

  • (1) Dataset should be real or numerical valued.
  • (2) Problem should be multi-class classification.
  • (3) There should be no missing values.

The datasets have been chosen from various application domains to show the applicability of GP classification as well as the generality of our proposed optimization technique.
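The three selection properties listed above can be checked mechanically. The sketch below is an illustrative assumption about how such a check might look for a dataset already loaded into Python lists; the function name `meets_selection_criteria` is hypothetical and the paper does not describe any such tool.

```python
import math
from numbers import Real
from typing import Any, List, Sequence


def meets_selection_criteria(features: List[Sequence[Any]],
                             labels: Sequence[Any]) -> bool:
    """Return True if the data is numeric, multi-class and has no missing values."""
    values = [v for row in features for v in row]
    # (3) No missing values: neither None nor NaN may appear.
    if any(v is None or (isinstance(v, float) and math.isnan(v)) for v in values):
        return False
    # (1) Real or numerical valued attributes (booleans are not counted).
    all_numeric = all(isinstance(v, Real) and not isinstance(v, bool) for v in values)
    # (2) Multi-class problem: more than two distinct class labels.
    multi_class = len(set(labels)) > 2
    return all_numeric and multi_class


eligible = meets_selection_criteria([[1.0, 2], [3.5, 4], [0.1, 7]], ["a", "b", "c"])
```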

Conclusions

The proposed two-stage learning mechanism for multi-class classification using Genetic Programming has yielded better results than the one-versus-all (binary decomposition) method. This is because the binary decomposition method suffers from conflicting situations, whereas we have used a fitness measure that favors accurate classifiers and less conflicting outputs. The proposed method reduces the computation required to perform the conflict resolution during the

Hajira Jabeen has been working as an assistant professor at Iqra University, Islamabad, Pakistan, since 2009. Her fields of expertise include evolutionary computation, swarm intelligence and data classification.

References (48)

  • G.A. Pappa et al., Evolving rule induction algorithms with multiobjective grammar based genetic programming, Knowl. Inf. Syst. (2008)
  • J. Eggermont, Evolving Fuzzy Decision Trees for Data Classification, Proceedings of the 14th Belgium Netherlands...
  • R. Konig et al., Genetic programming - a tool for flexible rule extraction, IEEE Cong. Evol. Comput. (2007)
  • A.P. Engelbrecht et al., A building block approach to genetic programming for rule discovery, in: Data Mining: A Heuristic Approach
  • E. Carreno et al., Evolution of classification rules for comprehensible knowledge discovery, IEEE Cong. Evol. Comput. (2007)
  • A.A. Freitas, A Genetic Programming Framework for Two Data Mining Tasks: Classification and Generalized Rule Induction (1997)
  • C.S. Kuo et al., Applying genetic programming technique in classification trees, Soft Computing (2007)
  • J. Eggermont, A.E. Eiben, J.I. Hemert, A comparison of genetic programming variants for data classification....
  • J. Eggermont, J.N. Kok, W.A. Kosters, GP For Data Classification, Partitioning The Search Space, Proceedings of the...
  • E. Tunstel et al., On genetic programming of fuzzy rule-based systems for intelligent control, Int. J. Intelligent Autom. Soft Comput. (1996)
  • F.J. Berlanga et al., A genetic-programming-based approach for the learning of compact fuzzy rule-based classification...
  • R.R.F. Mendes et al., Discovering fuzzy classification rules with genetic programming and co-evolution, Principles of...
  • I.D. Falco, A.D. Cioppa, E. Tarantino, Discovering interesting classification rules with GP, Appl. Soft Comput. 4 (2002)...
  • C.C. Bojarczuk et al., An innovative application of a constrained-syntax genetic programming system to the problem of predicting survival of patients (2003)
Abdul Rauf Baig has been associated with National University of Computing and Emerging Technologies, NU-FAST, Islamabad, Pakistan, since 2004. His fields of expertise include Artificial Intelligence, Data Mining and swarm intelligence.
