A Boosting Approach to Constructing an Ensemble Stack

Zhou, Zhilei; Qiu, Ziyu; Niblett, Brad; Johnston, Andrew; Schwartzentruber, Jeffrey; Zincir-Heywood, Nur; Heywood, Malcolm I.

doi:10.1007/978-3-031-29573-7_9

ORCID: Nur Zincir-Heywood, orcid.org/0000-0003-2796-7265; Malcolm I. Heywood, orcid.org/0000-0002-1521-0671

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13986)

Included in the following conference series:

European Conference on Genetic Programming (Part of EvoStar)

Abstract

An approach to evolutionary ensemble learning for classification is proposed in which genetic programming is combined with boosting to construct a stack of programs. Each application of boosting identifies a single champion and a residual dataset, i.e. the training records that have not yet been correctly classified. The next program is trained only against the residual, with the process iterating until a maximum ensemble size is reached or no residual remains. Training against a residual dataset reduces the cost of training, since each successive program is trained on a smaller set of records. Deploying the ensemble as a stack also means that a single classifier may be sufficient to make a prediction, thereby improving interpretability. Benchmarking studies are conducted to illustrate competitiveness with the prediction accuracy of current state-of-the-art evolutionary ensemble learning algorithms, while providing solutions that are orders of magnitude simpler. Further benchmarking with a high cardinality dataset indicates that the proposed method is also more accurate and efficient than XGBoost.
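
The stack construction outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a one-feature threshold rule (the hypothetical `Rule` class) stands in for an evolved program, an exhaustive search (`fit_champion`) stands in for the evolutionary run, and the function names, the scoring (correct minus incorrect firings) and the abstain-by-returning-`None` convention are all assumptions made for illustration. Only the overall loop follows the abstract: fit a champion, keep the records it did not classify correctly as the residual, and repeat until a size limit is reached or the residual is empty.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Record = Tuple[Sequence[float], int]  # (feature vector, class label)


@dataclass
class Rule:
    """Hypothetical stand-in for an evolved champion: a one-feature threshold rule."""
    feature: int
    threshold: float
    below: bool  # True: fires when x[feature] <= threshold; False: when it is greater
    label: int

    def predict(self, x: Sequence[float]) -> Optional[int]:
        # Return a label when the rule fires, or None to defer to the next layer.
        fires = (x[self.feature] <= self.threshold) == self.below
        return self.label if fires else None


def fit_champion(data: List[Record]) -> Optional[Rule]:
    """Exhaustive search standing in for the evolutionary run (an assumption)."""
    best, best_score = None, 0
    labels = sorted({y for _, y in data})
    for f in range(len(data[0][0])):
        for t in sorted({x[f] for x, _ in data}):
            for below in (True, False):
                for lab in labels:
                    rule = Rule(f, t, below, lab)
                    fired = [y for x, y in data if rule.predict(x) is not None]
                    # Score: correct firings minus incorrect firings.
                    score = sum(1 if y == lab else -1 for y in fired)
                    if score > best_score:
                        best, best_score = rule, score
    return best  # None if no rule classifies more records correctly than wrongly


def build_stack(data: List[Record], max_size: int = 10):
    """Add one champion per round; each round trains only on the residual."""
    stack, residual = [], list(data)
    while residual and len(stack) < max_size:
        champion = fit_champion(residual)
        if champion is None:
            break
        stack.append(champion)
        # Residual: records the new champion did not classify correctly.
        residual = [(x, y) for x, y in residual if champion.predict(x) != y]
    return stack, residual


def predict(stack: List[Rule], x: Sequence[float]):
    """Query layers in order; the first one that fires makes the prediction."""
    for depth, rule in enumerate(stack):
        label = rule.predict(x)
        if label is not None:
            return label, depth
    return None, None  # no layer fired


# Toy data (illustrative only): one feature, two classes.
toy = [([1.0], 0), ([2.0], 0), ([3.0], 0), ([4.0], 1), ([5.0], 1), ([6.0], 1)]
toy_stack, toy_residual = build_stack(toy)
print(len(toy_stack), len(toy_residual))
print(predict(toy_stack, [2.0]), predict(toy_stack, [5.0]))
```

In this toy run the second champion is fitted on three records rather than six, which is the sense in which residual training lowers cost, and a record claimed by the first layer is explained by that single rule alone.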

Supported by 2Keys Corporation - An Interac Company.

Notes

1. Classification tasks often assume the majority vote, although voting/weighting schemes might be evolved [3].
2. https://archive-beta.ics.uci.edu.
3. Laptop with Intel i7 10700k CPU, 4.3 GHz single core.
4. 2SEGP parameterization: pop. size 500, ensemble size 50, max. tree size 500.

References

Agapitos, A., Loughran, R., Nicolau, M., Lucas, S.M., O’Neill, M., Brabazon, A.: A survey of statistical machine learning elements in genetic programming. IEEE Trans. Evol. Comput. 23(6), 1029–1048 (2019)
Badran, K.M.S., Rockett, P.I.: Multi-class pattern classification using single, multi-dimensional feature-space feature extraction evolved by multi-objective genetic programming and its application to network intrusion detection. Genet. Program Evolvable Mach. 13(1), 33–63 (2012)
Brameier, M., Banzhaf, W.: Evolving teams of predictors with linear genetic programming. Genet. Program Evolvable Mach. 2(4), 381–407 (2001)
Cava, W.G.L., Silva, S., Danai, K., Spector, L., Vanneschi, L., Moore, J.H.: Multidimensional genetic programming for multiclass classification. Swarm Evol. Comput. 44, 260–272 (2019)
Chen, T., Guestrin, C.: XGBoost: a scalable tree boosting system. In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 785–794. ACM (2016)
Curry, R., Lichodzijewski, P., Heywood, M.I.: Scaling genetic programming to large datasets using hierarchical dynamic subset selection. IEEE Trans. Syst. Man, Cybern. - Part B 37(4), 1065–1073 (2007)
Dietterich, T.G.: An experimental comparison of three methods for constructing ensembles of decision trees: bagging, boosting, and randomization. Mach. Learn. 40(2), 139–157 (2000)
Fahlman, S.E., Lebiere, C.: The cascade-correlation learning architecture. In: Advances in Neural Information Processing Systems, vol. 2, pp. 524–532. Morgan Kaufmann (1989)
Folino, G., Pizzuti, C., Spezzano, G.: Training distributed GP ensemble with a selective algorithm based on clustering and pruning for pattern classification. IEEE Trans. Evol. Comput. 12(4), 458–468 (2008)
García, S., Grill, M., Stiborek, J., Zunino, A.: An empirical comparison of botnet detection methods. Comput. Secur. 45, 100–123 (2014)
Gathercole, C., Ross, P.: Dynamic training subset selection for supervised learning in genetic programming. In: Davidor, Y., Schwefel, H.-P., Männer, R. (eds.) PPSN 1994. LNCS, vol. 866, pp. 312–321. Springer, Heidelberg (1994). https://doi.org/10.1007/3-540-58484-6_275
Iba, H.: Bagging, boosting, and bloating in genetic programming. In: Proceedings of the Genetic and Evolutionary Computation Conference, pp. 1053–1060. Morgan Kaufmann (1999)
Imamura, K., Soule, T., Heckendorn, R.B., Foster, J.A.: Behavioral diversity and a probabilistically optimal GP ensemble. Genet. Program Evolvable Mach. 4(3), 235–253 (2003)
Lichodzijewski, P., Heywood, M.I.: Managing team-based problem solving with symbiotic bid-based genetic programming. In: Proceedings of the Genetic and Evolutionary Computation Conference, pp. 363–370. ACM (2008)
Lichodzijewski, P., Heywood, M.I.: Symbiosis, complexification and simplicity under GP. In: Proceedings of the Genetic and Evolutionary Computation Conference, pp. 853–860. ACM (2010)
McIntyre, A.R., Heywood, M.I.: Classification as clustering: a pareto cooperative-competitive GP approach. Evol. Comput. 19(1), 137–166 (2011)
Muni, D.P., Pal, N.R., Das, J.: A novel approach to design classifiers using genetic programming. IEEE Trans. Evol. Comput. 8(2), 183–196 (2004)
Muñoz, L., Silva, S., Trujillo, L.: M3GP – multiclass classification with GP. In: Machado, P., et al. (eds.) EuroGP 2015. LNCS, vol. 9025, pp. 78–91. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-16501-1_7
Potter, M.A., Jong, K.A.D.: Cooperative coevolution: an architecture for evolving coadapted subcomponents. Evol. Comput. 8(1), 1–29 (2000)
Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann, Burlington (1993)
Rodrigues, N.M., Batista, J.E., Silva, S.: Ensemble genetic programming. In: Hu, T., Lourenço, N., Medvet, E., Divina, F. (eds.) EuroGP 2020. LNCS, vol. 12101, pp. 151–166. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-44094-7_10
Rudin, C.: Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat. Mach. Intell. 1(5), 206–215 (2019)
Sipper, M., Moore, J.H.: Symbolic-regression boosting. CoRR abs/2206.12082 (2022)
Song, D., Heywood, M.I., Zincir-Heywood, A.N.: Training genetic programming on half a million patterns: an example from anomaly detection. IEEE Trans. Evol. Comput. 9(3), 225–239 (2005)
Soule, T.: Voting teams: a cooperative approach to non-typical problems using genetic programming. In: Proceedings of the Genetic and Evolutionary Computation Conference, pp. 916–922. Morgan Kaufmann (1999)
Thomason, R., Soule, T.: Novel ways of improving cooperation and performance in ensemble classifiers. In: Proceedings of the Genetic and Evolutionary Computation Conference, pp. 1708–1715. ACM (2007)
Virgolin, M.: Genetic programming is naturally suited to evolve bagging ensembles. In: Proceedings of the Genetic and Evolutionary Computation Conference, pp. 830–839. ACM (2021)
Wang, S., Mei, Y., Zhang, M.: Novel ensemble genetic programming hyper-heuristics for uncertain capacitated arc routing problem. In: Proceedings of the Genetic and Evolutionary Computation Conference, pp. 1093–1101. ACM (2019)
Wu, S.X., Banzhaf, W.: Rethinking multilevel selection in genetic programming. In: Proceedings of the Genetic and Evolutionary Computation Conference, pp. 1403–1410. ACM (2011)

Acknowledgements

This research was enabled by the support of a Natural Sciences and Engineering Research Council (NSERC) of Canada Alliance Grant.

Author information

Authors and Affiliations

Faculty of Computer Science, Dalhousie University, Nova Scotia, Canada
Zhilei Zhou, Ziyu Qiu, Jeffrey Schwartzentruber, Nur Zincir-Heywood & Malcolm I. Heywood
2Keys Corporation - An Interac Company, Ottawa, Canada
Brad Niblett & Andrew Johnston

Corresponding author

Correspondence to Malcolm I. Heywood.

Editor information

Editors and Affiliations

Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
Gisele Pappa
Università degli studi di Torino, Turin, Italy
Mario Giacobini
Brno University of Technology, Brno, Czech Republic
Zdenek Vasicek

Cite this paper

Zhou, Z. et al. (2023). A Boosting Approach to Constructing an Ensemble Stack. In: Pappa, G., Giacobini, M., Vasicek, Z. (eds) Genetic Programming. EuroGP 2023. Lecture Notes in Computer Science, vol 13986. Springer, Cham. https://doi.org/10.1007/978-3-031-29573-7_9

DOI: https://doi.org/10.1007/978-3-031-29573-7_9
Published: 29 March 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-29572-0
Online ISBN: 978-3-031-29573-7
eBook Packages: Computer Science, Computer Science (R0)
