ABSTRACT
Epsilon-lexicase selection is a parent selection method in genetic programming that has been successfully applied to symbolic regression problems. Recently, combining random subsampling with lexicase selection significantly improved performance in other genetic programming domains such as program synthesis. However, the influence of subsampling on solution quality for real-world symbolic regression problems has not yet been studied. In this paper, we propose down-sampled epsilon-lexicase selection, which combines epsilon-lexicase selection with random subsampling to improve performance in the domain of symbolic regression. We compare down-sampled epsilon-lexicase selection with traditional selection methods on common real-world symbolic regression problems and analyze its influence on the properties of the population over a genetic programming run. We find that down-sampled epsilon-lexicase selection reduces diversity compared to standard epsilon-lexicase selection. This reduction is accompanied by the high hyperselection rates we observe for down-sampled epsilon-lexicase selection. Further, we find that down-sampled epsilon-lexicase selection outperforms the traditional selection methods on all studied problems. Overall, with down-sampled epsilon-lexicase selection we observe an improvement in solution quality of up to 85% in comparison to standard epsilon-lexicase selection.
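To make the combination described in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that a fresh random subset of training cases is drawn once per generation, and that epsilon is the median absolute deviation of the current candidates' errors on each case (one common variant of epsilon-lexicase selection). The function names `down_sample`, `epsilon_lexicase_select`, and `select_parents`, as well as the `rate` parameter, are hypothetical and chosen only for illustration.

```python
import random
from statistics import median


def down_sample(n_cases, rate, rng):
    """Draw a random subset of training-case indices (at least one case)."""
    k = max(1, round(rate * n_cases))
    return rng.sample(range(n_cases), k)


def epsilon_lexicase_select(errors, case_ids, rng):
    """Select one parent index.

    errors[i][c] is the absolute error of individual i on training case c.
    Assumption: epsilon is recomputed per case from the remaining candidates.
    """
    candidates = list(range(len(errors)))
    cases = list(case_ids)
    rng.shuffle(cases)  # consider the sampled cases in random order
    for c in cases:
        col = [errors[i][c] for i in candidates]
        med = median(col)
        # Median absolute deviation of the candidates' errors on this case.
        eps = median(abs(e - med) for e in col)
        threshold = min(col) + eps
        # Keep every candidate within epsilon of the best error on this case.
        candidates = [i for i, e in zip(candidates, col) if e <= threshold]
        if len(candidates) == 1:
            break
    return rng.choice(candidates)


def select_parents(errors, n_parents, rate, rng):
    """Select parents for one generation using a single down-sample."""
    case_ids = down_sample(len(errors[0]), rate, rng)
    return [epsilon_lexicase_select(errors, case_ids, rng)
            for _ in range(n_parents)]


# Toy example: individual 0 is clearly best on every case.
toy_errors = [[0.0, 0.0, 0.0], [5.0, 5.0, 5.0], [9.0, 9.0, 9.0]]
parents = select_parents(toy_errors, n_parents=4, rate=0.5, rng=random.Random(0))
print(parents)
```

In this toy example every selection returns individual 0, because its error is the only one within epsilon of the best on each case. Because each generation filters on only a fraction of the cases, fewer evaluations are needed per individual, but a single individual can also dominate selection more easily, which is in line with the reduced diversity and high hyperselection rates reported in the abstract.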
Index Terms
- Down-Sampled Epsilon-Lexicase Selection for Real-World Symbolic Regression Problems