Elsevier

Neurocomputing

Volume 173, Part 3, 15 January 2016, Pages 1885-1897
Designing efficient discriminant functions for multi-category classification using evolutionary methods

https://doi.org/10.1016/j.neucom.2015.08.093

Highlights

  • We propose two approaches for designing accurate multi-category classifiers.

  • Our approaches consist of two phases in the training stage.

  • We provide a special modification box to cope with conflicting situations.

  • Several special functions are employed in our binary tree function nodes.

  • We use both negative and positive constants in our binary tree terminal nodes.

Abstract

In this paper, we propose two approaches for obtaining accurate classifiers for the multi-category classification problem. Our work is based on the one-vs-all strategy, in which we try to reduce conflicting situations. In the first phase of both approaches, we employ Genetic Programming to find populations of the best discriminant functions (one population for each class). In addition to the traditional function set, {+, −, ×, /}, we utilize other special functions in our binary trees. We also use both negative and positive constants in the terminal nodes of the trees. In the second phase, we employ Ant Colony in our first approach, called GP-Ant, and Genetic Algorithm in the second one, called GP–GA, to find the best combination of the discriminant functions found in the previous phase. We also provide a special modification box that modifies the decision of our final integrated classifiers when conflicting situations occur. To cope with conflicting situations, we also utilize an appropriate fitness function in the second phase. We compare our methods with both state-of-the-art and basic multi-category classification methods on eight well-known, publicly available data sets. Our experimental results show that our methods are statistically significantly better than all the other classification methods used.

Introduction

Data classification undoubtedly plays a vital role in a wide range of applications, such as Automatic Speech Recognition, Knowledge Extraction and Discovery, Remote Sensing, Face Recognition, Computer Vision, Business Decision-Making and other Pattern Recognition problems.

Despite extensive research in data classification, multi-category classification remains one of the main challenges. One-vs-one [1], one-vs-all [2], all-vs-all [3], error correcting output codes [4], and generalized error correcting output codes [5] are the most common strategies used to deal with the multi-category problem. In the one-vs-one strategy, one classifier is designed to discriminate between the data of each pair of classes, while in the one-vs-all strategy, one classifier is learned for each class, which distinguishes the data of that class from all other data.
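
To make the difference between the two strategies concrete, the following minimal Python sketch lists the binary tasks each one creates. The function names and the four class labels are illustrative assumptions and do not come from the paper.

```python
from itertools import combinations

def one_vs_one_tasks(classes):
    """One binary task per unordered pair of classes."""
    return list(combinations(classes, 2))

def one_vs_all_tasks(classes):
    """One binary task per class: that class against all remaining classes."""
    return [(c, [other for other in classes if other != c]) for c in classes]

# Hypothetical class labels, used only for illustration.
classes = ["A", "B", "C", "D"]
ovo = one_vs_one_tasks(classes)
ova = one_vs_all_tasks(classes)

print(len(ovo))  # number of pairwise classifiers
print(len(ova))  # number of one-vs-all classifiers
```

For CN classes, one-vs-one needs CN(CN-1)/2 classifiers while one-vs-all needs only CN, which is one reason the one-vs-all strategy stays compact as the number of classes grows.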

Several well-known methods exist for solving data classification problems, such as Decision Trees [6], Discriminant Analysis [7], Maximum Likelihood [8], Bayes classifier [8], Artificial Neural Networks [9], Support Vector Machines [10], and Evolutionary Approaches [11], [12], [13], [14], [15]. Recently, evolutionary approaches have been extensively employed for classification tasks, especially multi-category ones. The evolutionary approaches include Genetic Algorithm (GA) [16], Genetic Programming (GP) [16], Ant Colony System (ACS) [17], Particle Swarm Optimization (PSO) [18], Bee Colony System (BCS), etc.

GP is a search methodology based on the Darwinian principle of natural selection and was introduced by Koza [16]. This method can represent fundamental relationships among data by mathematical or logical expressions, thereby resulting in an interpretable classifier. The ability to discover discriminant features, the lack of any need for prior knowledge about the data distribution, and the flexibility to design a suitable classifier are some of the main advantages of GP [16]. GP is an iterative approach in which a population of individuals is evaluated in each iteration. Each individual is represented by a binary tree that encodes a candidate solution to the given problem. In this approach, biologically inspired operators such as reproduction, recombination and mutation are employed to achieve efficient solutions to the problem.
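
As a rough illustration of how a GP individual can encode a discriminant function, the sketch below evaluates a small binary expression tree. It assumes the four arithmetic operators as the function set, feature names such as `x0` and `x1`, and a protected division that returns 1.0 for a zero denominator (a common GP convention). None of these details are taken from the paper's implementation.

```python
import operator

def protected_div(a, b):
    # Assumption: return 1.0 on division by zero, a common GP convention.
    return a / b if b != 0 else 1.0

# Assumed function set for internal (function) nodes.
FUNCTIONS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": protected_div,
}

class Node:
    """Binary tree node: a function symbol, a feature name, or a constant."""

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

    def evaluate(self, features):
        if self.value in FUNCTIONS:
            # Function node: combine the values of both subtrees.
            lhs = self.left.evaluate(features)
            rhs = self.right.evaluate(features)
            return FUNCTIONS[self.value](lhs, rhs)
        if isinstance(self.value, str):
            # Terminal node holding a feature name.
            return features[self.value]
        # Terminal node holding a numeric constant (may be negative).
        return float(self.value)

# Hypothetical individual encoding (x0 * x1) + (-2.5).
tree = Node("+", Node("*", Node("x0"), Node("x1")), Node(-2.5))
print(tree.evaluate({"x0": 2.0, "x1": 3.0}))
```

Because the tree is an ordinary arithmetic expression, it can be printed and inspected, which is the interpretability property mentioned above.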

Some other evolutionary approaches, such as ACS and BCS, are based on collective intelligence. In these approaches, indirect communication among members of the colony is used to solve complex problems. Specifically, in ACS, artificial ants (candidate solutions) use pheromone trails as a communication mechanism to search for the best solution to a given problem [17].
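
The pheromone mechanism can be sketched in a few lines of Python. This is a generic ant-system-style rule (uniform evaporation followed by a deposit on the chosen option), shown only to illustrate indirect communication. The evaporation rate, the reward value, and the function names are assumptions, and the rule is not the exact ACS update nor the one used in the paper.

```python
def choice_probabilities(pheromone):
    """Probability of picking each option, proportional to its pheromone."""
    total = sum(pheromone)
    return [p / total for p in pheromone]

def update_pheromone(pheromone, chosen, reward, evaporation=0.1):
    """Evaporate every trail, then deposit `reward` on the chosen option."""
    updated = [(1.0 - evaporation) * p for p in pheromone]
    updated[chosen] += reward
    return updated

# Four options with equal initial pheromone (illustrative values).
trails = [1.0, 1.0, 1.0, 1.0]
trails = update_pheromone(trails, chosen=2, reward=0.5)
probs = choice_probabilities(trails)
print(probs)
```

After the update, option 2 carries more pheromone than the others, so later ants are more likely to select it. This is how good partial solutions are reinforced without direct communication.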

The main contribution of this paper is the design of accurate classifiers for the multi-category classification task by reducing conflicting situations. We propose two approaches, namely GP-Ant and GP–GA, to design data classifiers whose high accuracy is mainly due to the following ideas:

  • Our approaches consist of a two-phase training process to decrease conflicting situations in a multi-category classification task.

  • To improve the accuracy of the proposed classifiers, we provide a special modification box that modifies the final decision of our integrated classifiers when conflicting situations occur. This modification box tries to reduce conflicting situations in both the training and test stages.

  • In addition to the traditional function set in the binary trees of GP, that is {+, −, ×, /}, we employ some other special functions that have a favorable effect on the accuracy of our classifiers.

  • While previous works used only positive constants in the terminal nodes of their binary trees, we utilize both negative and positive constants in such situations.
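
To illustrate what a conflicting situation can look like under the one-vs-all strategy, the sketch below assumes that each class has one discriminant function and that a positive output means "this sample belongs to my class". Under that assumption, a sample is unambiguous only when exactly one output is positive. The sign convention, the function names, and the sample values are illustrative assumptions, and the paper's modification box is not reproduced here.

```python
def one_vs_all_decision(outputs):
    """Return the winning class index, or None when the outputs conflict."""
    positive = [i for i, value in enumerate(outputs) if value > 0]
    if len(positive) == 1:
        return positive[0]
    # Zero or several discriminant functions claimed the sample.
    return None

def conflict_rate(all_outputs):
    """Fraction of samples for which no unique class can be decided."""
    decisions = [one_vs_all_decision(outputs) for outputs in all_outputs]
    return sum(d is None for d in decisions) / len(decisions)

# Hypothetical discriminant outputs for four samples and three classes.
samples = [
    [1.2, -0.4, -3.0],   # only class 0 claims the sample
    [0.7, 0.1, -2.0],    # classes 0 and 1 both claim it: conflict
    [-0.5, -0.2, -1.0],  # no class claims it: conflict
    [-1.0, -0.3, 2.2],   # only class 2 claims the sample
]
print([one_vs_all_decision(s) for s in samples])
print(conflict_rate(samples))
```

A measure such as `conflict_rate` is the kind of quantity a two-phase training process or a decision-modifying step would aim to drive down.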

The rest of this paper is organized as follows. Section 2 presents some main related works. In Section 3, we propose our approaches for multi-category classification task. Our experimental results are reported and compared with those of some important related works in Section 4. Finally, Section 5 provides the conclusion and remarks.

Related works

Some previous works, such as [6], [16], [19], [20], [9], [11], have successfully employed GP and GA for data classification. Two leading works that design a decision tree for data classification with GP are [6], [16]. In [6], the population of individuals consists of hierarchical compositions of arguments and functions. In addition, a traditional GA and a multilayer perceptron are merged in [19] to specify a proper network architecture with suitable parameters for solving pattern recognition and classification

Proposed methods

This paper focuses on providing accurate classifiers for the multi-category classification task by reducing conflicting situations. To achieve this goal, we present two approaches, called GP-Ant and GP–GA, both of which contain two phases in the training stage. Note that, throughout this section, CN and Y refer to the number of classes present in our data set and the number of individuals in one population of our GP, respectively.
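
A short sketch can show why the second phase is a search problem. It assumes that the second phase selects exactly one discriminant function per class from that class's population of Y individuals, which gives Y^CN possible combinations. That reading, the brute-force baseline, and the toy fitness function are assumptions made for illustration. The paper itself uses Ant Colony or a Genetic Algorithm for this step.

```python
from itertools import product

def search_space_size(cn, y):
    """Number of ways to pick one of y candidates for each of cn classes."""
    return y ** cn

def exhaustive_best(cn, y, fitness):
    """Brute-force baseline: feasible only for very small cn and y."""
    return max(product(range(y), repeat=cn), key=fitness)

def toy_fitness(combo):
    # Hypothetical fitness: rewards choosing candidate 1 for every class.
    return sum(1 for index in combo if index == 1)

print(search_space_size(3, 2))
print(exhaustive_best(3, 2, toy_fitness))
# With, say, 10 classes and 50 candidates each, enumeration is hopeless.
print(search_space_size(10, 50))
```

The exponential growth in CN is what makes a guided search attractive compared with enumeration.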

Our main motivations for employing evolutionary search

Experiments and results

To evaluate the performance of the proposed GP-Ant and GP–GA, we carried out several experiments. We implemented our proposed algorithms in Matlab. In our experiments, we use eight well-known data sets from UCI, which are publicly available through [38]. Detailed information on these data sets is provided in Table 2. As shown in this table, all chosen data sets contain more than two classes, have no missing values, and have different number of

Conclusion

In this paper, we proposed two approaches, namely GP-Ant and GP–GA, to deal accurately with the multi-category classification problem. Our experimental results showed that our approaches outperformed many widely used classification methods, with statistically significantly better results.

Our approaches consisted of a two-phase training process to reduce conflicting situations in the multi-category classification task. Moreover, we provided special modification boxes to

References (43)

  • P. Clark, R. Boswell, Rule induction with CN2: some recent improvements, in: Machine Learning EWSL-91, vol. 482, 1991,...
  • T. Hastie et al., Classification by pairwise coupling, Ann. Stat. (1998)
  • T.G. Dietterich et al., Solving multiclass learning problems via error-correcting output codes, J. Artif. Intell. Res. (1995)
  • E.L. Allwein et al., Reducing multiclass to binary: a unifying approach for margin classifiers, J. Mach. Learn. Res. (2001)
  • J.R. Koza, Concept formation and decision tree induction using the genetic programming paradigm, in: Parallel Problem...
  • R.O. Duda et al., Pattern Classification (2012)
  • D. Rivero et al., Modifying genetic programming for artificial neural network development for data mining, Soft Comput. (2009)
  • V.N. Vapnik, V. Vapnik, Statistical Learning Theory, vol. 1, Wiley, New York,...
  • M. Kang et al., Audio classification using GA-based fuzzy c-means, Front. Innov. Future Comput. Commun. (2014)
  • P. GaneshKumar et al., Hybrid ant bee algorithm for fuzzy expert system based sample classification, IEEE/ACM Trans. Comput. Biol. Bioinform. (2014)
  • E. Alfaro-Cid et al., Genetic programming and serial processing for time series classification, IEEE Trans. Evol. Comput. (2014)

    Abolfazl Soltani was born in Iran on October 6, 1990. He received the B.Sc. degree in 2013 from the Electrical Engineering Department, Amirkabir University of Technology, Tehran, Iran, majoring in electronics. Currently, he is pursuing the M.Sc. degree in the Department of Electrical Engineering, Amirkabir University of Technology, Tehran, Iran. He is a member of the Speech Processing Research Laboratory, Department of Electrical Engineering, Amirkabir University of Technology. His main research interests include the design and implementation of embedded systems and FPGA programming. In addition, his research interests include cryptography and digital signal processing theories and their applications, in particular audio processing.

    Seyed Mohammad Ahadi received the B.Sc. and M.Sc. degrees in electronics from the Electrical Engineering Department, Amirkabir University of Technology, Tehran, Iran, in 1984 and 1987, respectively, and the Ph.D. degree in engineering from the University of Cambridge, Cambridge, UK, in 1996, working in the field of speech recognition. He was appointed faculty member at the Electrical Engineering Department, Amirkabir University of Technology, in 1988, where he began his teaching profession as well as involvement in research projects. Since receiving the Ph.D. degree, he has been with the same department, where he is currently an associate professor involved in teaching and research in electronics and communications.

    Neda Faraji received the B.Sc. and M.Sc. degrees in electrical engineering, respectively from Iran University of Science and Technology (IUST), and Amirkabir University of Technology, Tehran, Iran. Also she received the Ph.D. degree from Amirkabir University of Technology, in 2013. She was a visiting student for 9 months during her Ph.D. at the Delft University of Technology, Delft, the Netherlands. Currently, she is an assistant professor at the engineering faculty of Imam Khomeini International University (IKIU), Qazvin, Iran. Her main research interests are speech processing, statistical signal processing and pattern recognition.

    Saeed Sharifian received his B.Sc. degree in electrical engineering from KNT University of Technology, Tehran, Iran, in 2000, and M.Sc. and Ph.D. degrees in digital electronic engineering from Amirkabir University of Technology, Tehran, Iran, in 2002 and 2008, respectively. He is currently a member of Iranian High Performance Computing Research Center (HPCRC) as vice chancellor for research and development. He is currently a faculty member in the Department of Electrical Engineering, Amirkabir University of Technology, Tehran, Iran. His research interests include embedded systems, high-performance and parallel computing, cloud and pervasive computing, soft computing, machine vision, biomedical engineering and HW design.
