A Comprehensive Comparison of Lexicase-Based Selection Methods for Symbolic Regression Problems

Geiger, Alina; Sobania, Dominik; Rothlauf, Franz

doi:10.1007/978-3-031-56957-9_12

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14631)

Included in the following conference series:

European Conference on Genetic Programming (Part of EvoStar)

Abstract

Lexicase selection is a parent selection method that has been successfully used in many application domains. In recent years, several variants of lexicase selection have been proposed and analyzed. However, it is still unclear which lexicase variant performs best in the domain of symbolic regression. In this work, we therefore compare relevant lexicase variants on a wide range of symbolic regression problems. We conduct experiments not only for a given evaluation budget but also for a given time budget, since practitioners usually have limited time for solving their problems. Consequently, this work provides users with a comprehensive guide for choosing the right selection method under different constraints in the domain of symbolic regression. Overall, we find that down-sampled \(\epsilon\)-lexicase selection outperforms the other selection methods on the studied benchmark problems, both for the given evaluation budget and for the given time budget. With a time budget of 24 h, down-sampled \(\epsilon\)-lexicase selection improves solution quality by up to 68%.
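For readers unfamiliar with the method named in the abstract, the following is a minimal Python sketch of one common formulation of down-sampled \(\epsilon\)-lexicase selection. It is an illustration under stated assumptions, not the authors' implementation: the function names (`mad`, `epsilon_lexicase_select`, `down_sample`), the error-matrix layout, and the choice to recompute \(\epsilon\) as the median absolute deviation of the remaining candidates on each case are all assumptions made here; other published variants compute \(\epsilon\) differently.

```python
import random
from statistics import median


def mad(values):
    """Median absolute deviation of a list of numbers."""
    m = median(values)
    return median(abs(v - m) for v in values)


def down_sample(n_cases, rate, rng):
    """Draw a random subset of training-case indices (assumed: once per generation)."""
    k = max(1, round(rate * n_cases))
    return rng.sample(range(n_cases), k)


def epsilon_lexicase_select(errors, case_ids, rng):
    """Select one parent index.

    errors[i][c] is the error of individual i on training case c
    (hypothetical layout). case_ids is the (possibly down-sampled)
    set of cases to use for this selection event.
    """
    candidates = list(range(len(errors)))
    cases = list(case_ids)
    rng.shuffle(cases)  # a fresh random case order for every selection event
    for c in cases:
        case_errors = [errors[i][c] for i in candidates]
        # Assumption: epsilon is the MAD of the remaining candidates' errors.
        threshold = min(case_errors) + mad(case_errors)
        candidates = [i for i in candidates if errors[i][c] <= threshold]
        if len(candidates) == 1:
            break
    # Ties that survive all cases are broken at random.
    return rng.choice(candidates)
```

In a generational loop, one would typically call `down_sample` once per generation, evaluate all individuals on that subset only, and then call `epsilon_lexicase_select` once per required parent.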

Notes

1. We refer to the variant called BTSS from De Melo et al. [4].
2. problems: 589_fri_c2_1000_25, 606_fri_c2_1000_10, 623_fri_c4_1000_10, 1030_ERA, 607_fri_c4_1000_50, 581_fri_c3_500_25, 617_fri_c3_500_5, 654_fri_c0_500_10, 641_fri_c1_500_10, 1027_ESL, 519_vinnie, 647_fri_c1_250_10, 615_fri_c4_250_10, 230_machine_cpu, 207_autoPrice, 665_sleuth_case2002, 523_analcatdata_neavote, 621_fri_c0_100_10, 624_fri_c0_100_5, 591_fri_c1_100_10.
3. We track in each generation the MSE of the best-performing individual (according to the performance on the validation cases) in the current population. The normalized MSE is averaged over all problems every 10 min.
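The tracking procedure described in the last note can be sketched as follows. This is a hedged illustration only: the record layout (`val_mse` key), the function names, and the rule of carrying the last logged value forward to each 10-minute checkpoint are assumptions made here, and the normalization itself is taken as already applied because the note does not specify it.

```python
from bisect import bisect_right


def best_by_validation(population):
    """Return the individual with the lowest validation MSE.

    population is a list of dicts with a hypothetical 'val_mse' key.
    """
    return min(population, key=lambda ind: ind["val_mse"])


def value_at(log, t):
    """Last value logged at or before time t, or None if nothing was logged yet.

    log is a time-sorted list of (elapsed_seconds, normalized_mse) pairs.
    """
    times = [s for s, _ in log]
    i = bisect_right(times, t)
    return log[i - 1][1] if i else None


def average_over_problems(logs, horizon_s, step_s=600):
    """Average the normalized MSE over all problems at every checkpoint.

    logs maps a problem name to its time-sorted log. step_s=600 corresponds
    to the 10-minute interval; carrying values forward is an assumption.
    """
    curve = []
    for t in range(step_s, horizon_s + 1, step_s):
        values = [value_at(log, t) for log in logs.values()]
        values = [v for v in values if v is not None]
        if values:
            curve.append((t, sum(values) / len(values)))
    return curve
```

Each generation would append one `(elapsed_seconds, normalized_mse)` entry for the individual returned by `best_by_validation`, so that runs with different generation durations can be compared on a common time axis.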

References

Aenugu, S., Spector, L.: Lexicase selection in learning classifier systems. In: Proceedings of the Genetic and Evolutionary Computation Conference, pp. 356–364. ACM (2019)
Boldi, R., et al.: Informed down-sampled lexicase selection: Identifying productive training cases for efficient problem solving. arXiv preprint arXiv:2301.01488v1 (2023)
Chen, S.H.: Genetic Algorithms and Genetic Programming in Computational Finance. Springer, New York (2012). https://doi.org/10.1007/978-1-4615-0835-9
De Melo, V.V., Vargas, D.V., Banzhaf, W.: Batch tournament selection for genetic programming: the quality of lexicase, the speed of tournament. In: Proceedings of the Genetic and Evolutionary Computation Conference, pp. 994–1002. GECCO ’19, ACM (2019)
Ding, L., Boldi, R., Helmuth, T., Spector, L.: Going faster and hence further with lexicase selection. In: Proceedings of the Genetic and Evolutionary Computation Conference Companion, pp. 538–541. ACM (2022)
Ding, L., Boldi, R., Helmuth, T., Spector, L.: Lexicase selection at scale. In: Proceedings of the Genetic and Evolutionary Computation Conference Companion, pp. 2054–2062. ACM (2022)
Ding, L., Pantridge, E., Spector, L.: Probabilistic lexicase selection. In: Proceedings of the Genetic and Evolutionary Computation Conference, pp. 1073–1081. GECCO ’23, ACM (2023)
Ding, L., Spector, L.: Optimizing neural networks with gradient lexicase selection. In: International Conference on Learning Representations (2021)
Fang, Y., Li, J.: A review of tournament selection in genetic programming. In: Cai, Z., Hu, C., Kang, Z., Liu, Y. (eds.) Advances in Computation and Intelligence. ISICA 2010. LNCS, vol. 6382, pp. 181–192. Springer, Berlin, Heidelberg (2010). https://doi.org/10.1007/978-3-642-16493-4_19
Ferguson, A.J., Hernandez, J.G., Junghans, D., Lalejini, A., Dolson, E., Ofria, C.: Characterizing the effects of random subsampling on lexicase selection. In: Banzhaf, W., Goodman, E., Sheneman, L., Trujillo, L., Worzel, B. (eds.) Genetic Programming Theory and Practice XVII. Genetic and Evolutionary Computation, LNCS, pp. 1–23. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-39958-0_1
Fortin, F.A., de Rainville, F.M., Gardner, M.A., Parizeau, M., Gagné, C.: DEAP: evolutionary algorithms made easy. J. Mach. Learn. Res. 13(1), 2171–2175 (2012)
Geiger, A., Sobania, D., Rothlauf, F.: Down-sampled epsilon-lexicase selection for real-world symbolic regression problems. In: Proceedings of the Genetic and Evolutionary Computation Conference, pp. 1109–1117. GECCO ’23, ACM (2023)
Helmuth, T., Abdelhady, A.: Benchmarking parent selection for program synthesis by genetic programming. In: Proceedings of the 2020 Genetic and Evolutionary Computation Conference Companion, pp. 237–238. GECCO ’20, ACM (2020)
Helmuth, T., McPhee, N.F., Spector, L.: Effects of lexicase and tournament selection on diversity recovery and maintenance. In: Proceedings of the 2016 on Genetic and Evolutionary Computation Conference Companion, pp. 983–990. GECCO ’16 Companion, ACM (2016)
Helmuth, T., McPhee, N.F., Spector, L.: Lexicase selection for program synthesis: a diversity analysis. In: Riolo, R., Worzel, W., Kotanchek, M., Kordon, A. (eds.) Genetic Programming Theory and Practice XIII, pp. 151–167. LNCS, Genetic and Evolutionary Computation. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-34223-8_9
Helmuth, T., Pantridge, E., Spector, L.: Lexicase selection of specialists. In: Proceedings of the Genetic and Evolutionary Computation Conference, pp. 1030–1038. GECCO ’19, ACM (2019)
Helmuth, T., Pantridge, E., Spector, L.: On the importance of specialists for lexicase selection. Genet. Program. Evolvable Mach. 21(3), 349–373 (2020)
Helmuth, T., Spector, L.: General program synthesis benchmark suite. In: Proceedings of the 2015 Annual Conference on Genetic and Evolutionary Computation, pp. 1039–1046. GECCO ’15, ACM (2015)
Helmuth, T., Spector, L.: Explaining and exploiting the advantages of down-sampled lexicase selection. In: ALIFE 2020: The 2020 Conference on Artificial Life, pp. 341–349. MIT Press (2020)
Helmuth, T., Spector, L.: Problem-solving benefits of down-sampled lexicase selection. Artif. Life 27(3–4), 183–203 (2021)
Helmuth, T., Spector, L., Matheson, J.: Solving uncompromising problems with lexicase selection. IEEE Trans. Evol. Comput. 19(5), 630–643 (2014)
Hernandez, A., Balasubramanian, A., Yuan, F., Mason, S.A., Mueller, T.: Fast, accurate, and transferable many-body interatomic potentials by symbolic regression. NPJ Comput. Mater. 5(1), 112 (2019)
Hernandez, J.G., Lalejini, A., Dolson, E., Ofria, C.: Random subsampling improves performance in lexicase selection. In: Proceedings of the Genetic and Evolutionary Computation Conference Companion, pp. 2028–2031. GECCO ’19, ACM (2019)
Jundt, L., Helmuth, T.: Comparing and combining lexicase selection and novelty search. In: Proceedings of the Genetic and Evolutionary Computation Conference, pp. 1047–1055. ACM (2019)
Kelly, J., Hemberg, E., O’Reilly, U.M.: Improving genetic programming with novel exploration - exploitation control. In: Sekanina, L., Hu, T., Lourenco, N., Richter, H., Garcia-Sanchez, P. (eds.) Genetic Programming. EuroGP 2019. LNCS, vol. 11451, pp. 64–80. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-16670-0_5
Koza, J.R.: On the Programming of Computers by Means of Natural Selection, A Bradford Book, vol. 1. MIT Press, Cambridge (1992)
Krawiec, K., Liskowski, P.: Automatic derivation of search objectives for test-based genetic programming. In: Machado, P., et al. (eds.) Genetic Programming. EuroGP 2015. LNCS, vol. 9025, pp. 53–65. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-16501-1_5
La Cava, W., Helmuth, T., Spector, L., Moore, J.H.: A probabilistic and multi-objective analysis of lexicase selection and epsilon-lexicase selection. Evol. Comput. 27(3), 377–402 (2019)
La Cava, W., et al.: Contemporary symbolic regression methods and their relative performance. In: Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (2021)
La Cava, W., Spector, L., Danai, K.: Epsilon-lexicase selection for regression. In: Proceedings of the Genetic and Evolutionary Computation Conference 2016, pp. 741–748. GECCO ’16, ACM (2016)
La Cava, W.G., et al.: A flexible symbolic regression method for constructing interpretable clinical prediction models. NPJ Digit. Med. 6(1), 107 (2023)
Moore, J.M., Stanton, A.: Lexicase selection outperforms previous strategies for incremental evolution of virtual creature controllers. In: ECAL 2017, The Fourteenth European Conference on Artificial Life, pp. 290–297 (2017)
Moore, J.M., Stanton, A.: Tiebreaks and diversity: isolating effects in lexicase selection. In: The 2018 Conference on Artificial Life, pp. 590–597. MIT Press, Cambridge, MA (2018)
Ni, J., Drieberg, R.H., Rockett, P.I.: The use of an analytic quotient operator in genetic programming. IEEE Trans. Evol. Comput. 17(1), 146–152 (2013)
Pantridge, E., Helmuth, T., McPhee, N.F., Spector, L.: Specialization and elitism in lexicase and tournament selection. In: Proceedings of the Genetic and Evolutionary Computation Conference Companion, pp. 1914–1917. GECCO ’18, ACM (2018)
Pham-Gia, T., Hung, T.L.: The mean and median absolute deviations. Math. Comput. Model. 34(7–8), 921–936 (2001)
Poli, R., Langdon, W.B., McPhee, N.F.: A Field Guide to Genetic Programming. Lulu Press, Morrisville (2008)
Schweim, D., Sobania, D., Rothlauf, F.: Effects of the training set size: a comparison of standard and down-sampled lexicase selection in program synthesis. In: 2022 IEEE Congress on Evolutionary Computation (CEC), pp. 1–8. IEEE (2022)
Sobania, D., Rothlauf, F.: A generalizability measure for program synthesis with genetic programming. In: Proceedings of the Genetic and Evolutionary Computation Conference, pp. 822–829. GECCO ’21, ACM (2021)
Sobania, D., Rothlauf, F.: Program synthesis with genetic programming: the influence of batch sizes. In: Medvet, E., Pappa, G., Xue, B. (eds.) Genetic Programming. EuroGP 2022. LNCS, vol. 13223, pp. 118–129. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-02056-8_8
Sobania, D., Schweim, D., Rothlauf, F.: A comprehensive survey on program synthesis with evolutionary algorithms. IEEE Trans. Evol. Comput. 27(1), 82–97 (2023)
Spector, L.: Assessment of problem modality by differential performance of lexicase selection in genetic programming: a preliminary report. In: Proceedings of the 14th Annual Conference Companion on Genetic and Evolutionary Computation, pp. 401–408. GECCO ’12, ACM (2012)
Spector, L., La Cava, W., Shanabrook, S., Helmuth, T., Pantridge, E.: Relaxations of lexicase parent selection. In: Banzhaf, W., Olson, R., Tozier, W., Riolo, R. (eds.) Genetic Programming Theory and Practice XV, pp. 105–120. LNCS, Genetic and Evolutionary Computation. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-90512-9_7
Wagner, A.R.M., Stein, A.: Adopting lexicase selection for Michigan-style learning classifier systems with continuous-valued inputs. In: Proceedings of the Genetic and Evolutionary Computation Conference Companion, pp. 171–172. ACM (2021)

Author information

Authors and Affiliations

Johannes Gutenberg University Mainz, Mainz, Germany
Alina Geiger, Dominik Sobania & Franz Rothlauf

Corresponding author

Correspondence to Alina Geiger.

Editor information

Editors and Affiliations

University of Torino, Grugliasco, Italy
Mario Giacobini
Victoria University of Wellington, Wellington, New Zealand
Bing Xue
University of Trieste, Trieste, Italy
Luca Manzoni

About this paper

Cite this paper

Geiger, A., Sobania, D., Rothlauf, F. (2024). A Comprehensive Comparison of Lexicase-Based Selection Methods for Symbolic Regression Problems. In: Giacobini, M., Xue, B., Manzoni, L. (eds) Genetic Programming. EuroGP 2024. Lecture Notes in Computer Science, vol 14631. Springer, Cham. https://doi.org/10.1007/978-3-031-56957-9_12

DOI: https://doi.org/10.1007/978-3-031-56957-9_12
Published: 28 March 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-56956-2
Online ISBN: 978-3-031-56957-9
eBook Packages: Computer Science, Computer Science (R0)
