Evolving multi-dimensional wavelet neural networks for classification using Cartesian Genetic Programming
Introduction
The wavelet transform has been used in pattern recognition, signal processing and compression applications for its ability to extract information from signals at either high time or frequency resolutions [1], [2], [3]. Wavelet neural networks (WNNs) utilize the concept of the wavelet transform in neural networks.
A combined model of wavelets and neural networks is suitable for function approximation and can be used for prediction and classification. WNNs have been successfully applied in many areas, including signal denoising [4], signal classification and compression [5], short-term electricity load forecasting [6], speech segmentation [7] and speaker recognition [8]. WNNs can provide better function approximation ability than standard multilayer perceptrons (MLPs) and radial basis function (RBF) neural networks over a wide range of applications [9], [10].
A WNN is determined by five key parameters. Three of them refer to the activation function (scale, translation, rotation) and two to the architecture of the network (weights and number of neurons). The behavior of these parameters is discussed in more detail in Section 2.1.4. The standard training procedure of a WNN employs a gradient descent algorithm, which can suffer from slow convergence and local optima [11]. In order to optimize the performance of WNNs, a number of research studies have utilized evolutionary algorithms and evolutionary programming techniques [12], [13].
Prediction of air and ground traffic flow [14], [15], energy consumption [16], large scale function estimation [17], function approximation [18], [19], [20], power transformer monitoring [21] and centrifugal compressor performance prediction [22] are a few of the many applications of WNNs which utilize genetic algorithms (GA). WNN evolution via differential evolution (DE) has also been quite successful and includes applications such as load forecasting [23] and bankruptcy prediction [24]. This variety of applications illustrates the adaptability of WNNs to different data domains.
The most common strategy to optimize combinations of WNN parameters is to evolve only the activation function parameters [14], [15], [16], [17]. Awad [25] evolved the translation and scale parameters using a GA and trained the weights using the Levenberg-Marquardt algorithm. Jinru et al. [26] used a two-stage approach, in which a GA first performs a global search of the parameters and, in the second stage, the optimized parameters are further fine-tuned using local search algorithms such as gradient descent. In [27], the translation parameter adapts to the network input and its response to a non-linear function, while the remaining attributes are evolved and optimized using particle swarm optimization. Simultaneous evolution of activation function parameters and network structure has also been studied and applied in various domains such as function approximation, Parkinson's disease detection and prediction of hydro-turbine machine condition [20], [28], [29].
Apart from the existing methods of training wavelet parameters, there are a number of optimization algorithms, including self-adaptive differential evolution [30] and a social emotional algorithm which uses local search function [31] that can be used to optimize the WNN parameters.
In the present paper, a novel algorithm based on the concept of Cartesian Genetic Programming (CGP) is used to evolve a multi-dimensional wavelet neural network, so that its potential application to classification tasks can be evaluated. The paper also aims to contribute to a better understanding of the behaviour of WNNs when their parameters are adjusted.
CGP is an evolutionary programming technique developed by Miller et al. [32]. The concept of CGP has also been used to evolve artificial neural networks [33]. The motivation behind using CGP for evolving parameters is, firstly, that CGP does not suffer from bloat [34]: its genotype instead accumulates redundant genes that have a neutral effect [35], [36], [37] on performance. Secondly, most of the applications evolved via CGP are generic, robust and present good accuracy compared to other methods [38], [39], [40], [41].
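The neutrality property can be illustrated with a minimal sketch (an assumption for illustration, not the paper's encoding): in a CGP genotype, a node gene is active only if it is reachable from the output, so genes outside that set can mutate freely without changing the program's behavior.

```python
def active_nodes(nodes, n_inputs, output):
    """Trace back from the output to find the active CGP nodes.

    nodes[i] encodes node index n_inputs + i as (func_id, src_a, src_b).
    Genes never reached from the output are 'neutral': they sit in the
    genotype without affecting fitness, and a later mutation can
    reactivate them.
    """
    active, stack = set(), [output]
    while stack:
        idx = stack.pop()
        if idx < n_inputs or idx in active:
            continue  # primary inputs and already-visited nodes
        active.add(idx)
        _, a, b = nodes[idx - n_inputs]
        stack.extend([a, b])
    return active

# Toy genotype: 2 inputs (indices 0, 1) and 3 computational nodes (2, 3, 4)
nodes = [(0, 0, 1),   # node 2 = f0(in0, in1)
         (1, 2, 0),   # node 3 = f1(node2, in0)
         (0, 1, 1)]   # node 4 = f0(in1, in1) -- never referenced, neutral
print(active_nodes(nodes, n_inputs=2, output=3))  # node 4 stays inactive
```

Because the genotype length is fixed and only the active subgraph is ever executed, adding or mutating neutral genes cannot grow the evaluated program, which is the sense in which CGP avoids bloat.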
The computational cost of a WNN increases with the input dimensions of the system. Our objective is to introduce an algorithm that can switch features on and off, making them active or inactive during the evolution process. Discarding too many features might reduce accuracy. The advantage of using an evolution-based concept to evolve parameters is that features can be pruned during evolution while balancing the need for accuracy, thus efficiently reducing the time to train a network.
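The feature on/off idea can be pictured as a binary mask that evolves alongside the other parameters; only the masked-in features are fed to the network. The sketch below is a hypothetical illustration (the mask representation, the mutation rate and the helper names are assumptions, not the paper's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def masked_input(x, mask):
    # Keep only the features the evolved genotype marks as active (mask == 1)
    return x[mask.astype(bool)]

def mutate_mask(mask, p=0.1):
    # Bit-flip mutation: each feature may toggle between active and inactive,
    # letting evolution prune inputs while selection guards accuracy
    flips = rng.random(mask.shape) < p
    return np.where(flips, 1 - mask, mask)

x = np.array([0.2, 1.5, -0.7, 3.1])
mask = np.array([1, 0, 1, 1])
print(masked_input(x, mask))  # keeps features 0, 2 and 3 only
```

Since inactive features never reach the hidden layer, each dropped feature removes a full column of wavelet evaluations, which is where the training-time saving comes from.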
Another contribution of this work is the introduction of a rotation parameter Ri, represented as an n × n matrix where n is the total number of input features. Rotation matrices have not been used in any similar applications yet, due to non-differentiability issues and high computational cost. Our intent is to exploit rotations so that the approximation capability of WNNs can be correctly assessed.
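To see why a rotation parameter matters, consider an elliptic (per-axis-scaled) wavelon: rotating the input before scaling reorients the wavelet's axes, which a purely axis-aligned scale/translation pair cannot do. The sketch below is a minimal 2D illustration under assumed conventions (Mexican-hat product wavelet, u = R(x - t)/s); it is not the paper's implementation:

```python
import numpy as np

def rotation_2d(theta):
    # 2-D rotation matrix; for n input features this generalizes to an
    # n x n matrix (e.g. a product of Givens rotations)
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def elliptic_wavelon(x, R, t, s):
    # Product of 1-D Mexican hats along the rotated axes,
    # with per-axis scale vector s and translation t
    u = (R @ (x - t)) / s
    return float(np.prod((1.0 - u**2) * np.exp(-u**2 / 2.0)))

x = np.array([1.0, 0.0])
t = np.zeros(2)
s = np.array([1.0, 2.0])
out_axis = elliptic_wavelon(x, rotation_2d(0.0), t, s)       # x lands on a zero of psi
out_rot = elliptic_wavelon(x, rotation_2d(np.pi / 4), t, s)  # rotation moves the response
```

With no rotation, the point (1, 0) falls exactly on a zero-crossing of the first-axis wavelet, so the wavelon outputs 0; after a 45-degree rotation the same point produces a non-zero response, showing that rotation genuinely enlarges the family of shapes the network can represent.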
In two of our previous publications [29], [42] we used CGP to evolve wavelet parameters for a one-dimensional WNN. The present manuscript describes a separate study on multi-dimensional WNNs and the introduction of the rotation parameter for function approximation. The structure of this paper is as follows. Section 2 describes WNNs, their properties and the tuning parameters, with visual examples. This section also introduces the mechanism used for building wavelet networks via Cartesian Genetic Programming (CGPWNN), constituting the main technical contribution of the paper. Sections 3–5 present the application of WNNs to three test problems: the standard 2D spiral benchmark, breast cancer classification via mammographic images, and Parkinson's disease detection via speech signal analysis. Section 6 incorporates conclusions and possible directions for future research.
Wavelet neural networks
WNNs represent a class of neural networks with wavelets as activation functions; i.e. they combine the theory of wavelet transforms and neural networks [43]. WNNs generally have a feed-forward structure, with one hidden layer, as shown in Fig. 1, and activation functions are drawn from an orthonormal wavelet family. The most common wavelet activation functions are Gaussian, Mexican hat, Morlet and Haar wavelets [44].
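A single-hidden-layer WNN of this form can be sketched as follows. This is a minimal illustration, not the paper's implementation: the Mexican-hat activation, the three-wavelon toy network and the scalar input are all assumptions made for clarity.

```python
import numpy as np

def mexican_hat(u):
    # Mexican-hat (Ricker) mother wavelet: psi(u) = (1 - u^2) * exp(-u^2 / 2)
    return (1.0 - u**2) * np.exp(-u**2 / 2.0)

def wnn_forward(x, scales, translations, weights, bias=0.0):
    """One-hidden-layer WNN on a scalar input:
    y = sum_i w_i * psi((x - t_i) / s_i) + b."""
    u = (x - translations) / scales  # per-wavelon dilation and translation
    return float(weights @ mexican_hat(u) + bias)

# Toy network with three hidden wavelons
scales = np.array([1.0, 0.5, 2.0])
translations = np.array([-1.0, 0.0, 1.0])
weights = np.array([0.3, -0.2, 0.5])
y = wnn_forward(0.0, scales, translations, weights)
```

Each hidden unit is thus a dilated and translated copy of one mother wavelet, and training (or evolution) adjusts the scales, translations and output weights; multi-dimensional variants replace the scalar argument with a vector transformed by the scale, translation and rotation parameters discussed above.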
Three parameters play a significant role in the tuning of wavelets for
Case study I: Two-spiral task
The two-spiral task is a benchmark task for non-linear classification [71], [72]. The dataset consists of two spirals, each with 97 sample data points in a 2D Cartesian space (shown in Fig. 11). The objective is to assign each sample point to the correct spiral using only its (x, y) Cartesian coordinates.
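For reference, the benchmark is commonly generated with the formulation attributed to Lang and Witbrock, in which the second spiral is the point-wise negation of the first. The sketch below uses that standard recipe (an assumption; the paper may use a pre-generated copy of the dataset):

```python
import numpy as np

def two_spirals(n=97):
    """Generate the two-spiral benchmark: n points per spiral, with the
    second spiral obtained by negating the first point-wise."""
    i = np.arange(n)
    angle = i * np.pi / 16.0
    radius = 6.5 * (104 - i) / 104.0
    spiral_a = np.stack([radius * np.sin(angle), radius * np.cos(angle)], axis=1)
    X = np.vstack([spiral_a, -spiral_a])           # 2n interleaved samples
    labels = np.hstack([np.zeros(n, dtype=int), np.ones(n, dtype=int)])
    return X, labels

X, y = two_spirals()   # 194 samples, 97 per class
```

The radius shrinks linearly while the angle grows, so the two classes wind around each other three times; no linear (or mildly non-linear) decision boundary separates them, which is why the task is a stress test for classifiers.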
In this study, the two-spiral task is investigated under three different configurations of the wavelet neural network.
Case study II: Breast cancer classification
The Digital Database for Screening Mammography (DDSM) [74], [75] is an online repository of mammographic images of different resolutions obtained from various hospitals. The suspicious areas on the mammograms are manually marked by two experienced radiologists. For analysis, these markings are represented as chain codes and hence can be extracted easily.
In the dataset used by [76], mammographic images scanned by a HOWTEK scanner at 43.5 microns per pixel spatial resolution were downloaded and
Case study III: Sakar’s Parkinson’s disease dataset classification
This part of the research uses a recent, publicly available dataset from an online machine learning data repository at the University of California at Irvine (UCI) [98], [99]. The dataset consists of features extracted from the speech signals of Parkinson's disease (PD) affected and healthy individuals.
Case study IV: Little’s Parkinson’s disease dataset classification
This case study uses a Parkinson's disease dataset donated by Max Little to the University of California Irvine's machine learning repository [103], [104]. This dataset consists of multiple recordings of the same speech task from 31 individuals, with 6 or 7 speech records per individual. A total of 22 features were extracted from each speech sample using acoustic analysis software. Details of the dataset and the research literature surrounding its usage can be found in [29].
Conclusion and future work
Wavelet neural networks (WNNs) combine the characteristics of wavelet transforms and neural networks. They have been the focus of many studies, including studies on time-series prediction and approximation of 1D functions. One of the contributions of our study is the introduction of a rotation parameter for multi-dimensional networks. In addition, we have proposed a genetic algorithm to evolve all of the parameters of the network to obtain better classification accuracies. We have applied these
Acknowledgment
The first author would like to acknowledge the support through an Australian Government Research Training Program Scholarship.
References (109)
- Wavelet transforms and neural networks for compression and recognition, Neural Netw. (1996)
- Learning algorithm of wavelet network based on sampling theory, Neurocomputing (2007)
- Short term load forecasting by using wavelet neural networks, Proceedings of the Canadian Conference on Electrical and Computer Engineering (2000)
- Wavelet basis function neural networks, Proceedings of the IEEE International Joint Conference on Neural Networks (IJCNN2007) (2007)
- Evolving wavelet neural networks for function approximation, Electron. Lett. (1996)
- A hierarchical evolutionary algorithm for constructing and training wavelet networks, Neural Comput. Appl. (2002)
- A network traffic prediction model based on recurrent wavelet neural network, Proceedings of the 2nd International Conference on Computer Science and Network Technology (ICCSNT2012) (2012)
- Time series prediction with wavelet neural networks, Proceedings of the 5th Seminar on Neural Network Applications in Electrical Engineering (NEUREL2000) (2000)
- Elliptic and radial wavelet neural networks, Proceedings of the Second World Automation Congress (WAC1996) (1996)
- Wavelet neural networks with a hybrid learning approach, J. Inf. Sci. Eng. (2006)
- Radial wavelet neural network with a novel self-creating disk-cell-splitting algorithm for license plate character recognition, Entropy (Basel)
- Intelligent classification of real heart diseases based on radial wavelet neural network, Proceedings of the Cairo International Biomedical Engineering Conference (CIBEC2014)
- Classification and Regression Trees
- Comparing data mining with ensemble classification of breast cancer masses in digital mammograms, Proceedings of the Second Australian Workshop on Artificial Intelligence in Health (AIH 2012)
- A theory for multiresolution signal decomposition: the wavelet representation, IEEE Trans. Pattern Anal. Mach. Intell.
- Pattern recognition of motor imagery EEG using wavelet transform, J. Biomed. Sci. Eng.
- Adaptive wavelets for signal classification and compression, Int. J. Electron. Commun.
- Neural network adaptive wavelets for signal representation and classification, Opt. Eng.
- Wavelet transforms and neural networks for compression and recognition, Neural Netw.
- Wavelet neural networks for function learning, IEEE Trans. Signal Process.
- Wavelet neural network with improved genetic algorithm for traffic flow time series prediction, Opt. - Int. J. Light Electron. Opt.
- Evolving wavelet neural networks, Proceedings of the IEEE International Conference on Neural Networks
- Air traffic flow of genetic algorithm to optimize wavelet neural network prediction, Proceedings of the IEEE International Conference on Software Engineering and Service Science (ICSESS2014)
- Wavelet neural network with improved genetic algorithm for traffic flow time series prediction, Opt. - Int. J. Light Electron. Opt.
- Analysis of energy consumption prediction model based on genetic algorithm and wavelet neural network, Proceedings of the 3rd International Workshop on Intelligent Systems and Applications (ISA2011)
- Evolutionary wavelet neural network for large scale function estimation in optimization, Proceedings of the 11th Multidisciplinary Analysis and Optimization Conference (AIAA/ISSMO)
- A genetic algorithm for constructing wavelet neural networks, Proceedings of the International Conference on Intelligent Computing (ICIC2006)
- A niche hierarchy genetic algorithms for learning wavelet neural networks, Proceedings of the 2nd IEEE Conference on Industrial Electronics and Applications
- A novel learning algorithm for wavelet neural networks
- Evolving wavelet networks for power transformer condition monitoring, IEEE Trans. Power Deliv.
- Immune evolutionary algorithm of wavelet neural network to predict the performance in the centrifugal compressor and research, Proceedings of the Third International Conference on Measuring Technology and Mechatronics Automation (ICMTMA2011)
- Application a novel evolutionary computation algorithm for load forecasting of air conditioning, Proceedings of the Asia-Pacific Power and Energy Engineering Conference
- Differential evolution trained wavelet neural networks: application to bankruptcy prediction in banks, Expert Syst. Appl.
- Using genetic algorithms to optimize wavelet neural networks parameters for function approximation, Int. J. Comp. Sci. Issues
- Fault diagnosis of piston compressor based on wavelet neural network and genetic algorithm, Proceedings of the 7th World Congress on Intelligent Control and Automation (WCICA2008)
- Improved hybrid particle swarm optimized wavelet neural network for modeling the development of fluid dispensing for electronic packaging, IEEE Trans. Ind. Electron.
- Self-adaptive differential evolution with global neighborhood search, Soft Comput.
- Enhancing social emotional optimization algorithm using local search, Soft Comput.
- Fast learning neural networks using cartesian genetic programming, Neurocomputing
- What bloat? Cartesian genetic programming on boolean problems, Proceedings of the Genetic and Evolutionary Computation Conference (GECCO2001) - Late Breaking Papers
- Efficient representation of recurrent neural networks for Markovian/non-Markovian non-linear control problems, Proceedings of the International Conference on System Design and Applications (ISDA2010)
- Solving real-valued optimisation problems using cartesian genetic programming, Proceedings of the Genetic and Evolutionary Computation Conference (GECCO2007)
Maryam Mahsal Khan received her B.Sc. in Computer System Engineering from the University of Engineering & Technology Peshawar, Pakistan, in 2005 and her Masters in Electrical & Electronic Engineering from Universiti Teknologi Petronas, Malaysia, in 2008. Before commencing her Ph.D. at the University of Newcastle, she worked as an Assistant Professor at UET Peshawar, Pakistan, and later as a Research Engineer at LMKR Pvt. Ltd, Islamabad, Pakistan. Her interests include non-linear control, genetic algorithms and genetic programming, artificial neural networks, pattern recognition, image processing, signal processing and time-frequency decomposition. She has published in these fields in reputable conferences.
Alexandre Mendes received his Ph.D. degree in Electrical Engineering from the State University of Campinas, Brazil, in 2003. He is a Senior Lecturer with the School of Electrical Engineering and Computer Science at The University of Newcastle, Australia. His research interests include optimization and data mining, with applications in bioinformatics, robotics and operations research.
Dr. Ping Zhang is a Research Fellow at Menzies Health Institute Queensland, Griffith University Australia. She has worked in bioinformatics and health informatics research area in the last 10 years. Her research interests include pattern recognition, biomarkers discovery, vaccine target identification and applying machine learning and statistical techniques for medical decision making.
Stephan Chalup is an Associate Professor in Computer Science and Software Engineering at the University of Newcastle, Australia, where he leads the Interdisciplinary Machine Learning Research Group. He received his Ph.D. (Machine Learning) in 2002 from Queensland University of Technology in Brisbane, Australia. His research interests include manifold learning, kernel machines, humanoid robots, computer vision, and neural information processing systems.