DOI: 10.1145/3583131.3590400
research-article

Down-Sampled Epsilon-Lexicase Selection for Real-World Symbolic Regression Problems

Published: 12 July 2023

ABSTRACT

Epsilon-lexicase selection is a parent selection method in genetic programming that has been successfully applied to symbolic regression problems. Recently, combining random subsampling with lexicase selection significantly improved performance in other genetic programming domains such as program synthesis. However, the influence of subsampling on the solution quality of real-world symbolic regression problems has not yet been studied. In this paper, we propose down-sampled epsilon-lexicase selection, which combines epsilon-lexicase selection with random subsampling to improve performance in the domain of symbolic regression. We compare down-sampled epsilon-lexicase selection with traditional selection methods on common real-world symbolic regression problems and analyze its influence on the properties of the population over a genetic programming run. We find that down-sampled epsilon-lexicase selection reduces diversity compared to standard epsilon-lexicase selection, which coincides with the high hyperselection rates we observe for down-sampled epsilon-lexicase selection. Further, we find that down-sampled epsilon-lexicase selection outperforms the traditional selection methods on all studied problems. Overall, with down-sampled epsilon-lexicase selection we observe an improvement in solution quality of up to 85% compared to standard epsilon-lexicase selection.
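
The selection scheme named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give implementation details, so everything below is an assumption made for illustration: the function names, the down-sampling rate parameter, the use of the median absolute deviation (MAD) of the population's errors as the per-case epsilon, and the choice to measure the threshold from the best error among the remaining candidates. It is not the authors' implementation.

```python
import random
from statistics import median


def median_absolute_deviation(values):
    """Median of the absolute deviations from the median."""
    m = median(values)
    return median([abs(v - m) for v in values])


def down_sample(num_cases, rate, rng):
    """Draw a random subset of training-case indices (hypothetical helper).

    Assumption: a fresh subset of size rate * num_cases is drawn once per
    generation and shared by all selection events in that generation.
    """
    k = max(1, int(round(rate * num_cases)))
    return rng.sample(range(num_cases), k)


def epsilon_lexicase_select(errors, case_ids, rng):
    """Select one parent index using only the cases in case_ids.

    errors[i][c] is the absolute error of individual i on training case c.
    """
    population = range(len(errors))
    candidates = list(population)
    cases = list(case_ids)
    rng.shuffle(cases)  # each selection event uses a new case ordering
    for c in cases:
        # Assumption: epsilon is the MAD of the whole population on case c.
        eps = median_absolute_deviation([errors[i][c] for i in population])
        best = min(errors[i][c] for i in candidates)
        # Keep every candidate within epsilon of the best remaining error.
        candidates = [i for i in candidates if errors[i][c] <= best + eps]
        if len(candidates) == 1:
            break
    return rng.choice(candidates)


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy error matrix: 3 individuals, 4 training cases.
    errors = [
        [0.1, 2.0, 0.3, 0.5],
        [1.0, 0.1, 0.2, 0.6],
        [5.0, 5.0, 5.0, 5.0],
    ]
    sample = down_sample(num_cases=4, rate=0.5, rng=rng)
    parent = epsilon_lexicase_select(errors, sample, rng)
    print("down-sampled cases:", sample, "selected parent:", parent)
```

In this sketch, down-sampling only changes which cases are passed to the selection step; the filtering loop itself is the same as for selection over the full training set. In a real run the individuals would also only need to be evaluated on the sampled cases.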


Published in

GECCO '23: Proceedings of the Genetic and Evolutionary Computation Conference
July 2023
1667 pages
ISBN: 9798400701191
DOI: 10.1145/3583131

Copyright © 2023 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Overall Acceptance Rate: 1,669 of 4,410 submissions, 38%
