Abstract
Ensemble learning is a powerful paradigm that has been used in the top state-of-the-art machine learning methods like Random Forests and XGBoost. Inspired by the success of such methods, we have developed a new Genetic Programming method called Ensemble GP. The evolutionary cycle of Ensemble GP follows the same steps as other Genetic Programming systems, but with differences in the population structure, fitness evaluation and genetic operators. We have tested this method on eight binary classification problems, achieving results significantly better than standard GP, with much smaller models. Although other methods like M3GP and XGBoost were the best overall, Ensemble GP was able to achieve exceptionally good generalization results on a particularly hard problem where none of the other methods was able to succeed.
This is a preview of subscription content, log in via an institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsReferences
de Araújo Padilha, C.A., Barone, D.A.C., Neto, A.D.D.: A multi-level approach using genetic algorithms in an ensemble of least squares support vector machines. Knowl.-Based Syst. 106, 85–95 (2016). https://doi.org/10.1016/j.knosys.2016.05.033
Bhowan, U., Johnston, M., Zhang, M., Yao, X.: Evolving diverse ensembles using genetic programming for classification with unbalanced data. IEEE Trans. Evol. Comput. 17(3), 368–386 (2013). https://doi.org/10.1109/TEVC.2012.2199119
Brameier, M., Banzhaf, W.: Evolving teams of predictors with linear genetic programming. Genet. Program Evolvable Mach. 2(4), 381–407 (2001). https://doi.org/10.1023/A:1012978805372
Breiman, L.: Random forests. Mach. Learn. 45, 5–32 (2001). https://doi.org/10.1023/A:1010933404324
Cantu-Paz, E., Kamath, C.: Inducing oblique decision trees with evolutionary algorithms. IEEE Trans. Evol. Comput. 7(1), 54–68 (2003)
Chandra, A., Yao, X.: Ensemble learning using multi-objective evolutionary algorithms. J. Math. Model. Algorithms 5(4), 417–445 (2006). https://doi.org/10.1007/s10852-005-9020-3
Chen, T., Guestrin, C.: XGBoost: a scalable tree boosting system. ArXiv abs/1603.02754 (2016)
Coelho, A.L.V., Fernandes, E., Faceli, K.: Multi-objective design of hierarchical consensus functions for clustering ensembles via genetic programming. Decis. Support Syst. 51(4), 794–809 (2011). https://doi.org/10.1016/j.dss.2011.01.014
Dietterich, T.G.: Ensemble methods in machine learning. In: Kittler, J., Roli, F. (eds.) MCS 2000. LNCS, vol. 1857, pp. 1–15. Springer, Heidelberg (2000). https://doi.org/10.1007/3-540-45014-9_1
Escalante, H.J., Acosta-Mendoza, N., Morales-Reyes, A., Gago-Alonso, A.: Genetic programming of heterogeneous ensembles for classification. In: Ruiz-Shulcloper, J., Sanniti di Baja, G. (eds.) CIARP 2013. LNCS, vol. 8258, pp. 9–16. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-41822-8_2
Gagné, C., Sebag, M., Schoenauer, M., Tomassini, M.: Ensemble learning for free with evolutionary algorithms? In: Proceedings of the 9th Annual Conference on Genetic and Evolutionary Computation (GECCO 2007), pp. 1782–1789. ACM, New York (2007). https://doi.org/10.1145/1276958.1277317
Gijsbers, P.: Gametes\_epistasis\_2-way\_1000atts\_0.4h\_edm-1\_edm-1\_1 (2017). https://www.openml.org/d/40645
Iba, H.: Bagging, boosting, and bloating in genetic programming. In: Proceedings of the 1st Annual Conference on Genetic and Evolutionary Computation (GECCO 1999), vol. 2, pp. 1053–1060. Morgan Kaufmann Publishers Inc., San Francisco (1999). http://dl.acm.org/citation.cfm?id=2934046.2934063
Ingalalli, V., Silva, S., Castelli, M., Vanneschi, L.: A multi-dimensional genetic programming approach for multi-class classification problems. In: Nicolau, M., et al. (eds.) EuroGP 2014. LNCS, vol. 8599, pp. 48–60. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-662-44303-3_5
Islam, M.M., Yao, X.: Evolving artificial neural network ensembles. IEEE Comput. Intell. Mag. 3, 31–42 (2008)
Johansson, U., Lofstrom, T., Konig, R., Niklasson, L.: Building neural network ensembles using genetic programming. In: The 2006 IEEE International Joint Conference on Neural Network Proceedings, pp. 1260–1265, July 2006. https://doi.org/10.1109/IJCNN.2006.246836
Koza, J.R.: Genetic Programming (1992)
La Cava, W., Silva, S., Vanneschi, L., Spector, L., Moore, J.: Genetic programming representations for multi-dimensional feature learning in biomedical classification. In: Squillero, G., Sim, K. (eds.) EvoApplications 2017. LNCS, vol. 10199, pp. 158–173. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-55849-3_11
Langdon, W.B., Buxton, B.F.: Genetic programming for combining classifiers. In: Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2001), pp. 66–73. Morgan Kaufmann (2001)
Lichman, M.: UCI Machine Learning Repository (2013). https://archive.ics.uci.edu/ml/index.php
Luke, S., Panait, L.: Fighting bloat with nonparametric parsimony pressure. In: Guervós, J.J.M., Adamidis, P., Beyer, H.-G., Schwefel, H.-P., Fernández-Villacañas, J.-L. (eds.) PPSN 2002. LNCS, vol. 2439, pp. 411–421. Springer, Heidelberg (2002). https://doi.org/10.1007/3-540-45712-7_40
Muñoz, L., Silva, S., Trujillo, L.: M3GP – multiclass classification with GP. In: Machado, P., et al. (eds.) EuroGP 2015. LNCS, vol. 9025, pp. 78–91. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-16501-1_7
Muñoz, L., Trujillo, L., Silva, S., Castelli, M., Vanneschi, L.: Evolving multidimensional transformations for symbolic regression with M3GP. Memetic Comput. 11(2), 111–126 (2018). https://doi.org/10.1007/s12293-018-0274-5
de Oliveira, D.F., Canuto, A.M.P., de Souto, M.C.P.: Use of multi-objective genetic algorithms to investigate the diversity/accuracy dilemma in heterogeneous ensembles. In: 2009 International Joint Conference on Neural Networks, pp. 2339–2346 (2009)
Poli, R., Langdon, W.B., McPhee, N.F.: A Field Guide to Genetic Programming. Lulu Enterprises, UK Ltd., Essex (2008)
Silva, S., Vanneschi, L., Cabral, A.I., Vasconcelos, M.J.: A semi-supervised genetic programming method for dealing with noisy labels and hidden overfitting. Swarm Evol. Comput. 39, 323–338 (2018). https://doi.org/10.1016/j.swevo.2017.11.003
Sousa, R.T., Silva, S., Pesquita, C.: Evolving knowledge graph similarity for supervised learning in complex biomedical domains. BMC Bioinform. 21, 6 (2020). https://doi.org/10.1186/s12859-019-3296-1
Vanneschi, L.: An introduction to geometric semantic genetic programming. In: Schütze, O., Trujillo, L., Legrand, P., Maldonado, Y. (eds.) NEO 2015. SCI, vol. 663, pp. 3–42. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-44003-3_1
Veeramachaneni, K., Arnaldo, I., Derby, O., O’Reilly, U.M.: FlexGP. J. Grid Comput. 13, 391–407 (2015)
Yu, J., Guo, M., Needham, C.J., Huang, Y., Cai, L., Westhead, D.R.: Simple sequence-based kernels do not predict protein-protein interactions. Bioinformatics 26(20), 2610–2614 (2010). https://doi.org/10.1093/bioinformatics/btq483
Zhang, B., Joung, J.G.: Enhancing robustness of genetic programming at the species level. In: Genetic Programming Conference (GP 1997), pp. 336–342. Morgan Kaufmann (1997)
Zhang, S.: sonar.all-data (2018). https://www.kaggle.com/ypzhangsam/sonaralldata
Acknowledgement
This work was partially supported by FCT through funding of LASIGE Research Unit UIDB/00408/2020 and projects PTDC/CCI-INF/29168/2017, PTDC/CCI-CIF/29877/2017, DSAIPA/DS/0022/2018, PTDC/ASP-PLA/28726/2017 and PTDC/CTA-AMB/30056/2017.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Rodrigues, N.M., Batista, J.E., Silva, S. (2020). Ensemble Genetic Programming. In: Hu, T., Lourenço, N., Medvet, E., Divina, F. (eds) Genetic Programming. EuroGP 2020. Lecture Notes in Computer Science(), vol 12101. Springer, Cham. https://doi.org/10.1007/978-3-030-44094-7_10
Download citation
DOI: https://doi.org/10.1007/978-3-030-44094-7_10
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-44093-0
Online ISBN: 978-3-030-44094-7
eBook Packages: Computer ScienceComputer Science (R0)