research-article

Down-Sampled Epsilon-Lexicase Selection for Real-World Symbolic Regression Problems

Authors:
Alina Geiger, University of Mainz, Mainz, Germany (https://orcid.org/0009-0002-3413-283X)
Dominik Sobania, University of Mainz, Mainz, Germany (https://orcid.org/0000-0001-8873-7143)
Franz Rothlauf, University of Mainz, Mainz, Germany (https://orcid.org/0000-0003-3376-427X)

GECCO '23: Proceedings of the Genetic and Evolutionary Computation Conference, July 2023, Pages 1109–1117, https://doi.org/10.1145/3583131.3590400

Published: 12 July 2023

ABSTRACT

Epsilon-lexicase selection is a parent selection method in genetic programming that has been successfully applied to symbolic regression problems. Recently, combining random subsampling with lexicase selection has significantly improved performance in other genetic programming domains such as program synthesis. However, the influence of subsampling on solution quality for real-world symbolic regression problems has not yet been studied. In this paper, we propose down-sampled epsilon-lexicase selection, which combines epsilon-lexicase selection with random subsampling to improve performance in the domain of symbolic regression. To this end, we compare down-sampled epsilon-lexicase selection with traditional selection methods on common real-world symbolic regression problems and analyze its influence on the properties of the population over a genetic programming run. We find that down-sampled epsilon-lexicase selection reduces diversity compared to standard epsilon-lexicase selection. This reduction coincides with the high hyperselection rates we observe for down-sampled epsilon-lexicase selection. Further, we find that down-sampled epsilon-lexicase selection outperforms the traditional selection methods on all studied problems. Overall, down-sampled epsilon-lexicase selection improves solution quality by up to 85% in comparison to standard epsilon-lexicase selection.
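The combination described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `sample_rate` parameter, and the choice of the median absolute deviation as the per-case epsilon are assumptions made for illustration. For brevity the sketch also takes a full precomputed error matrix, whereas a practical implementation would presumably evaluate individuals only on the sampled training cases.

```python
import numpy as np


def mad(values):
    """Median absolute deviation of a 1-D array."""
    return np.median(np.abs(values - np.median(values)))


def epsilon_lexicase_select(errors, case_ids, rng):
    """Pick one parent index using only the training cases in case_ids.

    errors: array of shape (n_individuals, n_cases) holding absolute errors.
    """
    candidates = np.arange(errors.shape[0])
    # Visit the available cases in a fresh random order for every selection.
    for case in rng.permutation(case_ids):
        case_errors = errors[:, case]
        eps = mad(case_errors)  # assumed epsilon: MAD over the whole population
        best = case_errors[candidates].min()
        # Keep candidates whose error is within eps of the best remaining one.
        candidates = candidates[case_errors[candidates] <= best + eps]
        if len(candidates) == 1:
            break
    # Ties that survive all cases are broken uniformly at random.
    return int(rng.choice(candidates))


def down_sampled_selection(errors, n_parents, sample_rate, rng):
    """Select n_parents indices for one generation on a random case subset."""
    n_cases = errors.shape[1]
    subset_size = max(1, int(round(sample_rate * n_cases)))
    # Draw the subsample once per generation, without replacement.
    case_ids = rng.choice(n_cases, size=subset_size, replace=False)
    parents = [
        epsilon_lexicase_select(errors, case_ids, rng) for _ in range(n_parents)
    ]
    return parents, case_ids


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    # Hypothetical data: 20 individuals, 50 training cases.
    errors = np.abs(rng.normal(size=(20, 50)))
    parents, case_ids = down_sampled_selection(errors, 20, 0.1, rng)
    print("sampled cases:", sorted(case_ids.tolist()))
    print("selected parents:", parents)
```

Because each selection event sees only a small subset of cases, fewer filtering steps are available to separate individuals, which is one plausible way to picture how a few individuals could be selected very often.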

Index Terms

Computing methodologies → Machine learning → Learning paradigms → Supervised learning → Supervised learning by regression
Computing methodologies → Machine learning → Machine learning approaches → Bio-inspired approaches → Genetic programming

Published in
GECCO '23: Proceedings of the Genetic and Evolutionary Computation Conference
July 2023
1667 pages
ISBN: 9798400701191
DOI: 10.1145/3583131
Chair: Sara Silva
Program Chair: Luís Paquete
Copyright © 2023 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
symbolic regression
genetic programming
parent selection
down-sampled epsilon-lexicase selection
Acceptance Rates
Overall Acceptance Rate: 1,669 of 4,410 submissions, 38%