Abstract
In this paper, we propose a hybrid approach to solving multi-class problems which combines evolutionary computation with elements of traditional machine learning. The method, Grammatical Evolution Machine Learning (GEML), adapts concepts from decision tree learning and clustering methods and integrates them into a Grammatical Evolution framework. We investigate the effectiveness of GEML on several supervised, semi-supervised and unsupervised multi-class problems and demonstrate that its performance is competitive with that of several well-known machine learning algorithms. The GEML framework evolves human-readable solutions which explain the logic behind its classification decisions, a significant advantage over existing paradigms for unsupervised and semi-supervised learning. In addition, we examine whether the performance of the algorithm can be improved by applying several ensemble techniques.
Acknowledgement
We gratefully acknowledge the support of Science Foundation Ireland, grant number 10/IN.1/I3031.
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Fitzgerald, J.M., Azad, R.M.A., Ryan, C. (2017). GEML: A Grammatical Evolution, Machine Learning Approach to Multi-class Classification. In: Merelo, J.J., et al. Computational Intelligence. IJCCI 2015. Studies in Computational Intelligence, vol 669. Springer, Cham. https://doi.org/10.1007/978-3-319-48506-5_7
DOI: https://doi.org/10.1007/978-3-319-48506-5_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-48504-1
Online ISBN: 978-3-319-48506-5
eBook Packages: Engineering, Engineering (R0)