Bloating Reduction in Symbolic Regression Through Function Frequency-Based Tree Substitution in Genetic Programming

  • Conference paper
AI 2023: Advances in Artificial Intelligence (AI 2023)

Abstract

Genetic programming (GP) is an evolutionary machine learning method that can be applied to a wide range of classification and regression problems. However, traditional GP algorithms can suffer from unnecessary code growth, known as bloating. Bloating can slow convergence, lead to over-fitting, and increase the computational cost of the algorithm. The main focus of this paper is to control bloating in GP trees for symbolic regression. To address the bloating issue, this paper introduces a novel tree substitution method that reduces tree size while increasing the exploration ability of the GP algorithm. The proposed method incorporates a comprehensive analysis to detect bloating in parent trees. When a bloated tree is detected, a new, smaller tree is generated, leveraging the function frequency of the identified bloated tree. A set of regression experiments has been conducted on six real-world datasets. The results show that the proposed GP method reduces the size of the best individual while maintaining performance similar to that of standard GP with a tree height limit.
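The substitution idea described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the nested-tuple tree encoding, the function set in `ARITY`, the terminal set, the plain node-count threshold standing in for the paper's bloat analysis, and the helper names `function_frequency`, `grow`, and `substitute_if_bloated`. The sketch only shows one plausible way to count function symbols in a bloated tree and use those counts as sampling weights when growing a smaller replacement.

```python
import random
from collections import Counter

# Hypothetical function set (name -> arity) and terminals; not from the paper.
ARITY = {"add": 2, "sub": 2, "mul": 2, "sin": 1}
TERMINALS = ["x0", "x1", "1.0"]


def tree_size(tree):
    """Count nodes. A tree is a terminal string or a tuple (function, child, ...)."""
    if isinstance(tree, str):
        return 1
    return 1 + sum(tree_size(child) for child in tree[1:])


def function_frequency(tree, counts=None):
    """Count how often each function symbol occurs in the tree."""
    counts = Counter() if counts is None else counts
    if not isinstance(tree, str):
        counts[tree[0]] += 1
        for child in tree[1:]:
            function_frequency(child, counts)
    return counts


def grow(counts, max_depth, rng):
    """Grow a random tree, sampling functions in proportion to their counts."""
    # Stop at the depth limit, or early with probability 0.3 (arbitrary choice).
    if max_depth == 0 or not counts or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    names = list(counts)
    weights = [counts[name] for name in names]
    name = rng.choices(names, weights=weights, k=1)[0]
    children = tuple(grow(counts, max_depth - 1, rng) for _ in range(ARITY[name]))
    return (name,) + children


def substitute_if_bloated(tree, size_limit, max_depth, rng):
    """Assumed bloat test: a simple node-count threshold, not the paper's analysis."""
    if tree_size(tree) <= size_limit:
        return tree
    return grow(function_frequency(tree), max_depth, rng)


bloated = ("add",
           ("mul", ("add", "x0", "x1"), ("add", "x0", "1.0")),
           ("add", ("mul", "x1", "x1"), ("add", "x0", "x0")))
replacement = substitute_if_bloated(bloated, size_limit=10, max_depth=2,
                                    rng=random.Random(0))
print(tree_size(bloated), function_frequency(bloated), replacement)
```

With binary functions and `max_depth=2`, the replacement has at most 7 nodes, and it can only contain function symbols that already appeared in the bloated parent. How the paper actually decides that a tree is bloated, and how it balances frequency weighting against exploration, is not stated in the abstract.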

Acknowledgement

This work is supported in part by the Marsden Fund of the New Zealand Government under Contracts MFP-VUW2016 and MFP-VUW1913.

Author information

Corresponding author

Correspondence to Mohamad Rimas.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Rimas, M., Chen, Q., Zhang, M. (2024). Bloating Reduction in Symbolic Regression Through Function Frequency-Based Tree Substitution in Genetic Programming. In: Liu, T., Webb, G., Yue, L., Wang, D. (eds) AI 2023: Advances in Artificial Intelligence. AI 2023. Lecture Notes in Computer Science, vol 14472. Springer, Singapore. https://doi.org/10.1007/978-981-99-8391-9_34

  • DOI: https://doi.org/10.1007/978-981-99-8391-9_34

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-8390-2

  • Online ISBN: 978-981-99-8391-9

  • eBook Packages: Computer Science, Computer Science (R0)
