Designing efficient discriminant functions for multi-category classification using evolutionary methods
Introduction
Data classification undoubtedly plays a vital role in a wide range of applications, such as Automatic Speech Recognition, Knowledge Extraction and Discovery, Remote Sensing, Face Recognition, Computer Vision, Business Decision-Making and other Pattern Recognition problems.
Despite extensive research on data classification, multi-category classification remains one of the main open challenges. One-vs-one [1], one-vs-all [2], all-vs-all [3], error correcting output codes [4], and generalized error correcting output codes [5] are the most common strategies used to deal with the multi-category problem. In the one-vs-one strategy, one classifier is designed to discriminate between the data of each pair of classes, while in the one-vs-all strategy, one classifier is learned for each class, which distinguishes the data of that class from all other data.
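To make the difference between the two decompositions concrete, the following minimal Python sketch lists the binary sub-problems each strategy produces for a toy label vector. The function names and the toy labels are illustrative assumptions and do not come from the cited works.

```python
from itertools import combinations


def one_vs_one_tasks(labels):
    """Return one (class_a, class_b) pair per binary classifier to be trained."""
    classes = sorted(set(labels))
    return list(combinations(classes, 2))


def one_vs_all_targets(labels):
    """Map each class to a binary target vector: 1 for that class, 0 otherwise."""
    classes = sorted(set(labels))
    return {c: [1 if y == c else 0 for y in labels] for c in classes}


# Toy label vector with three classes (hypothetical data).
labels = ["a", "b", "c", "a", "c"]
pairs = one_vs_one_tasks(labels)      # one classifier per pair of classes
targets = one_vs_all_targets(labels)  # one classifier per class
```

For a problem with k classes, one-vs-one therefore needs k(k-1)/2 classifiers, whereas one-vs-all needs only k, each trained on all of the data.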
There are several well-known methods for solving data classification problems, such as Decision Trees [6], Discriminant Analysis [7], Maximum Likelihood [8], Bayes classifier [8], Artificial Neural Networks [9], Support Vector Machines [10], and Evolutionary Approaches [11], [12], [13], [14], [15]. Recently, evolutionary approaches have been extensively employed for classification tasks, especially multi-category ones. The evolutionary approaches include Genetic Algorithm (GA) [16], Genetic Programming (GP) [16], Ant Colony System (ACS) [17], Particle Swarm Optimization (PSO) [18], and Bee Colony System (BCS), among others.
GP is a search methodology based on the Darwinian principle of natural selection and was introduced by Koza [16]. It can represent fundamental relationships among data as mathematical or logical expressions, which results in an interpretable classifier. Its main advantages include discovering discriminant features, requiring no prior knowledge of the data distribution, and offering flexibility in designing a suitable classifier [16]. GP is an iterative approach in which a population of individuals is evaluated in each iteration. Each individual is represented by a binary tree that encodes a candidate solution to the given problem. Biologically inspired operators such as reproduction, recombination and mutation are applied to obtain efficient solutions to the problem.
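As an illustration of the tree representation, the sketch below encodes an individual as a binary expression tree and evaluates it on a feature vector. The function set, the `Node` class, the `random_tree` helper and the constant range are assumptions made for this example; they are not the configuration used in this paper.

```python
import operator
import random

# Hypothetical function set for illustration; the paper's own set is not listed here.
FUNCTIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


class Node:
    """A node of a binary expression tree: an operator, a feature, or a constant."""

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

    def evaluate(self, x):
        if self.left is None:                    # terminal node
            if isinstance(self.value, str):      # feature reference such as "x1"
                return x[int(self.value[1:])]
            return self.value                    # numeric constant
        f = FUNCTIONS[self.value]
        return f(self.left.evaluate(x), self.right.evaluate(x))


def random_tree(depth, n_features, rng):
    """Grow a random tree; terminals are features or constants in [-1, 1]."""
    if depth == 0 or rng.random() < 0.3:
        if rng.random() < 0.5:
            return Node("x%d" % rng.randrange(n_features))
        return Node(rng.uniform(-1.0, 1.0))
    op = rng.choice(sorted(FUNCTIONS))
    return Node(op,
                random_tree(depth - 1, n_features, rng),
                random_tree(depth - 1, n_features, rng))


# Hand-built individual encoding the expression 2*x0 - x1.
tree = Node("-", Node("*", Node(2.0), Node("x0")), Node("x1"))
score = tree.evaluate([3.0, 1.0])  # 2*3 - 1

# A small initial population of random individuals (size chosen arbitrarily).
rng = random.Random(0)
population = [random_tree(3, 2, rng) for _ in range(5)]
```

Because every individual is an explicit expression over the input features, the evolved classifier can be read directly, which is the interpretability property mentioned above. Recombination and mutation would then operate by swapping or regrowing subtrees.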
Some other evolutionary approaches, such as ACS and BCS, are based on collective intelligence. In these approaches, indirect communication among members of the colony is used to solve complex problems. Specifically, in ACS, artificial ants (candidate solutions) use pheromone trails as a communication mechanism to search for the best solution to a given problem [17].
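The pheromone mechanism can be sketched generically as follows. This is a simplified evaporate-and-deposit rule in the spirit of ant colony methods, not the exact ACS update of [17]; the option names, quality values and evaporation rate are hypothetical.

```python
import random


def choose(pheromone, rng):
    """Pick an option with probability proportional to its pheromone level."""
    options = list(pheromone)
    weights = [pheromone[o] for o in options]
    return rng.choices(options, weights=weights, k=1)[0]


def update(pheromone, chosen, quality, evaporation=0.1):
    """Evaporate every trail, then reinforce the chosen one by its quality."""
    for option in pheromone:
        pheromone[option] *= (1.0 - evaporation)
    pheromone[chosen] += quality


rng = random.Random(0)
pheromone = {"A": 1.0, "B": 1.0, "C": 1.0}   # equal trails at the start
quality = {"A": 0.2, "B": 0.9, "C": 0.4}     # hypothetical solution qualities

for _ in range(200):
    pick = choose(pheromone, rng)            # an ant follows the trails
    update(pheromone, pick, quality[pick])   # and leaves pheromone behind
```

Ants never exchange information directly: each one only reads and modifies the shared trail values, so options that yield better solutions tend to accumulate more pheromone and attract more ants over time.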
The main contribution of this paper is the design of accurate classifiers for the multi-category classification task by reducing conflicting situations. We propose two approaches, namely GP-Ant and GP–GA, to design data classifiers whose high accuracy stems mainly from the following ideas:
- Our approaches consist of a two-phase training process to decrease conflicting situations in a multi-category classification task.
- To improve accuracy of the proposed classifiers, we provide a special modification box to modify the final decision of our integrated classifiers when conflicting situations happen. In fact, this modification box tries to decrease conflicting situations, in both training and test stages.
- In addition to the traditional function set in binary trees of GP, that is , we employ some other special functions that show appropriate effects on the accuracy of our classifiers.
- While previous works used only positive constants in the terminal nodes of their binary trees, we utilize both negative and positive constants in such situations.
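The notion of a conflicting situation can be illustrated with a short sketch. It assumes a common reading for classifiers built from one discriminant function per class: a sample is claimed by a class when that class's function is positive, and a conflict arises when no class or more than one class claims it. Both this definition and the largest-output fallback are assumptions for illustration; the fallback merely stands in for the modification box, whose actual rule is not described in this section.

```python
def classify(outputs):
    """Return (label, conflict) from a dict mapping class -> discriminant value."""
    claimed = [c for c, value in outputs.items() if value > 0]
    if len(claimed) == 1:
        return claimed[0], False
    # Conflict: zero or several classes claim the sample.
    # Hypothetical fallback standing in for the paper's modification box.
    return max(outputs, key=outputs.get), True


clear = classify({"a": 1.2, "b": -0.3, "c": -2.0})      # only "a" is positive
overlap = classify({"a": 0.5, "b": 0.8, "c": -1.0})     # "a" and "b" both claim
unclaimed = classify({"a": -0.5, "b": -0.1})            # no class claims
```

Under this reading, reducing conflicts means training the discriminant functions so that exactly one of them is positive for as many samples as possible, and resolving the remaining cases with a dedicated rule.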
The rest of this paper is organized as follows. Section 2 reviews the main related works. In Section 3, we propose our approaches for the multi-category classification task. Our experimental results are reported and compared with those of some important related works in Section 4. Finally, Section 5 provides the conclusion and remarks.
Related works
Some previous works, such as [6], [16], [19], [20], [9], [11], successfully employed GP and GA for the data classification task. Two leading works on designing a decision tree for data classification by GP are [6], [16]. In [6], the population of individuals consists of hierarchical compositions of arguments and functions. In addition, a traditional GA and a multilayer perceptron are merged in [19] to specify a proper network architecture with suitable parameters for solving pattern recognition and classification
Proposed methods
This paper focuses on providing accurate classifiers for the multi-category classification task by decreasing conflicting situations. To achieve this goal, we present two approaches, called GP-Ant and GP–GA, both of which contain two phases in the training stage. Note that, throughout this section, CN and Y refer to the number of classes present in our data set and the number of individuals in one population of our GP, respectively.
Our main motivations for employing evolutionary search
Experiments and results
To evaluate the performance of the proposed GP-Ant and GP–GA, we carried out a set of experiments, which are reported in this section. We implemented our proposed algorithms in Matlab. In our experiments, eight well-known data sets from UCI are employed, which are publicly available through [38]. Detailed information on these data sets is provided in Table 2. As depicted in this table, all chosen data sets contain more than two classes, have no missing values, and have different number of
Conclusion
In this paper, we proposed two approaches, namely GP-Ant and GP–GA, to deal accurately with the multi-category classification problem. Our experimental results showed that our approaches outperformed many widely used classification methods and yielded statistically significantly better results.
Our approaches consisted of a two-phase training process to reduce conflicting situations in the multi-category classification task. Moreover, we provided special modification boxes to
References (43)
- et al. Classifier design with feature selection and feature extraction using layered genetic programming. Expert Syst. Appl. (2008)
- et al. Two-stage learning for multi-class classification using genetic programming. Neurocomputing (2013)
- et al. Pattern recognition using multilayer neural-genetic algorithm. Neurocomputing (2003)
- et al. A genetic algorithm for solving the inverse problem of support vector machines. Neurocomputing (2005)
- et al. A multi-objective evolutionary algorithm-based ensemble optimizer for feature selection and classification with neural network models. Neurocomputing (2014)
- et al. A two-stage genetic algorithm for automatic clustering. Neurocomputing (2012)
- et al. Multiobjective genetic programming for maximizing ROC performance. Neurocomputing (2014)
- et al. Utilizing multiple pheromones in an ant-based algorithm for continuous-attribute classification rule discovery. Appl. Soft Comput. (2013)
- et al. An overview of ensemble methods for binary classifiers in multi-class problems: experimental study on one-vs-one and one-vs-all schemes. Pattern Recognit. (2011)
- et al. Single-layer learning revisited: a stepwise procedure for building and training a neural network. Neurocomputing (1990)
- Classification by pairwise coupling. Ann. Stat.
- Solving multiclass learning problems via error-correcting output codes. J. Artif. Intell. Res.
- Reducing multiclass to binary: a unifying approach for margin classifiers. J. Mach. Learn. Res.
- Pattern Classification
- Modifying genetic programming for artificial neural network development for data mining. Soft Comput.
- Audio classification using GA-based fuzzy c-means. Front. Innov. Future Comput. Commun.
- Hybrid ant bee algorithm for fuzzy expert system based sample classification. IEEE/ACM Trans. Comput. Biol. Bioinform.
- Genetic programming and serial processing for time series classification. IEEE Trans. Evol. Comput.
Abolfazl Soltani was born in Iran on October 6, 1990. He received the B.Sc. degree in 2013 from the Electrical Engineering Department, Amirkabir University of Technology, Tehran, Iran, majoring in electronics. Currently, he is pursuing the M.Sc. degree in the Department of Electrical Engineering, Amirkabir University of Technology, Tehran, Iran. He is a member of the Speech Processing Research Laboratory, Department of Electrical Engineering, Amirkabir University of Technology. His main research interests include design and implementation of embedded systems and FPGA programming. In addition, his research interests include cryptography and digital signal processing theory and its applications, in particular audio processing.
Seyed Mohammad Ahadi received the B.Sc. and M.Sc. degrees in electronics from the Electrical Engineering Department, Amirkabir University of Technology, Tehran, Iran, in 1984 and 1987, respectively, and the Ph.D. degree in engineering from the University of Cambridge, Cambridge, UK, in 1996, working in the field of speech recognition. He was appointed faculty member at the Electrical Engineering Department, Amirkabir University of Technology, in 1988, where he began his teaching career as well as his involvement in research projects. Since receiving the Ph.D. degree, he has been with the same department, where he is currently an associate professor involved in teaching and research in electronics and communications.
Neda Faraji received the B.Sc. and M.Sc. degrees in electrical engineering from Iran University of Science and Technology (IUST) and Amirkabir University of Technology, Tehran, Iran, respectively. She also received the Ph.D. degree from Amirkabir University of Technology in 2013. She was a visiting student for 9 months during her Ph.D. at the Delft University of Technology, Delft, the Netherlands. Currently, she is an assistant professor at the engineering faculty of Imam Khomeini International University (IKIU), Qazvin, Iran. Her main research interests are speech processing, statistical signal processing and pattern recognition.
Saeed Sharifian received his B.Sc. degree in electrical engineering from KNT University of Technology, Tehran, Iran, in 2000, and M.Sc. and Ph.D. degrees in digital electronic engineering from Amirkabir University of Technology, Tehran, Iran, in 2002 and 2008, respectively. He is currently a member of Iranian High Performance Computing Research Center (HPCRC) as vice chancellor for research and development. He is currently a faculty member in the Department of Electrical Engineering, Amirkabir University of Technology, Tehran, Iran. His research interests include embedded systems, high-performance and parallel computing, cloud and pervasive computing, soft computing, machine vision, biomedical engineering and HW design.