DALex: Lexicase-Like Selection via Diverse Aggregation

  • Conference paper
Genetic Programming (EuroGP 2024)

Abstract

Lexicase selection has been shown to provide advantages over other selection algorithms in several areas of evolutionary computation and machine learning. In its standard form, lexicase selection filters a population or other collection based on randomly ordered training cases that are considered one at a time. This iterated filtering process can be time-consuming, particularly in settings with large numbers of training cases, including many symbolic regression and deep learning applications. In this paper, we propose a new method that is nearly equivalent to lexicase selection in terms of the individuals that it selects, but which does so in significantly less time. The new method, called DALex (for Diversely Aggregated Lexicase selection), selects the best individual with respect to a randomly weighted sum of training case errors. This allows us to formulate the core computation required for selection as matrix multiplication instead of recursive loops of comparisons, which in turn allows us to take advantage of optimized and parallel algorithms designed for matrix multiplication for speedup. Furthermore, we show that we can interpolate between the behavior of lexicase selection and its “relaxed” variants, such as epsilon and batch lexicase selection, by adjusting a single hyperparameter, named “particularity pressure,” which represents the importance granted to each individual training case. Results on program synthesis, deep learning, symbolic regression, and learning classifier systems demonstrate that DALex achieves significant speedups over lexicase selection and its relaxed variants while maintaining almost identical problem-solving performance. Under a fixed computational budget, these savings free up resources that can be directed towards increasing population size or the number of generations, enabling the potential for solving more difficult problems.
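
The core computation is simple enough to sketch directly. The following is a minimal illustration in Python with NumPy, not the authors' released implementation: it assumes that each selection event's importance scores are drawn from a normal distribution whose standard deviation is the particularity pressure and are normalized into weights with a softmax, so that selecting many parents reduces to one matrix multiplication followed by a row-wise argmin.

    import numpy as np

    def dalex_select(errors, n_parents, particularity_pressure=20.0, rng=None):
        """Select parent indices DALex-style (illustrative sketch).

        errors: (pop_size, n_cases) array of per-case errors; lower is better.
        particularity_pressure: spread of the raw importance scores. Large
            values let a few cases dominate each selection event
            (lexicase-like); small values approach a uniform aggregate.
        """
        rng = np.random.default_rng() if rng is None else rng
        n_cases = errors.shape[1]
        # One row of random importance scores per selection event.
        raw = rng.normal(0.0, particularity_pressure, size=(n_parents, n_cases))
        raw -= raw.max(axis=1, keepdims=True)  # numerical stability
        weights = np.exp(raw)
        weights /= weights.sum(axis=1, keepdims=True)  # row-wise softmax
        # All selection events at once as a single matrix multiplication:
        # (n_parents, n_cases) @ (n_cases, pop_size) = weighted error sums.
        weighted_errors = weights @ errors.T
        return weighted_errors.argmin(axis=1)  # best individual per event

    # Example: pick 5 parents from a population of 100 scored on 50 cases.
    errors = np.random.rand(100, 50)
    parents = dalex_select(errors, n_parents=5)

With a small particularity pressure the softmax weights are nearly uniform and selection approaches a plain aggregated fitness; with a large one, a single case tends to dominate each row, approximating lexicase selection's strict case-by-case filtering.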

Supported by Amherst College and members of the PUSH lab.

Notes

  1. https://arxiv.org/abs/2401.12424.

  2. Fixed a bug in the downsampling implementation in the released version of [8].

  3. Due to specific quirks of the code-building system, it is very difficult for CBGP to generalize successfully on Compare String Lengths.

References

  1. Aenugu, S., Spector, L.: Lexicase selection in learning classifier systems. In: Proceedings of the Genetic and Evolutionary Computation Conference. GECCO 2019, pp. 356–364. Association for Computing Machinery, New York, NY, USA (2019). https://doi.org/10.1145/3321707.3321828

  2. Boldi, R., et al.: Informed down-sampled lexicase selection: identifying productive training cases for efficient problem solving. Evol. Comput. 1–32 (2024, to appear). https://doi.org/10.48550/arXiv.2301.01488

  3. Breiman, L., Friedman, J., Olshen, R., Stone, C.: LED Display Domain. UCI Mach. Learn. Repository (1988). https://doi.org/10.24432/C5FG61

  4. Coppersmith, D., Winograd, S.: Matrix multiplication via arithmetic progressions. In: Proceedings of the Nineteenth Annual ACM Symposium on Theory of Computing. STOC 1987, pp. 1–6. Association for Computing Machinery, New York, NY, USA (1987). https://doi.org/10.1145/28395.28396

  5. Deb, K., Agrawal, S., Pratap, A., Meyarivan, T.: A fast elitist non-dominated sorting genetic algorithm for multi-objective optimization: NSGA-II. In: Schoenauer, M., et al. (eds.) PPSN 2000. LNCS, pp. 849–858. Springer, Berlin, Heidelberg (2000). https://doi.org/10.1007/3-540-45356-3_83

  6. Deb, K., Goldberg, D.E.: An investigation of niche and species formation in genetic function optimization. In: Proceedings of the Third International Conference on Genetic Algorithms, pp. 42–50. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA (1989)

  7. Ding, L., Boldi, R., Helmuth, T., Spector, L.: Lexicase selection at scale. In: Proceedings of the Genetic and Evolutionary Computation Conference Companion. GECCO 2022, pp. 2054–2062. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3520304.3534026

  8. Ding, L., Pantridge, E., Spector, L.: Probabilistic lexicase selection. In: Proceedings of the Genetic and Evolutionary Computation Conference. GECCO 2023, pp. 1073–1081. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3583131.3590375

  9. Ding, L., Spector, L.: Optimizing neural networks with gradient lexicase selection. In: International Conference on Learning Representations (2022). https://doi.org/10.48550/arXiv.2312.12606

  10. Dolson, E.: Calculating lexicase selection probabilities is NP-hard. In: Proceedings of the Genetic and Evolutionary Computation Conference. GECCO 2023. ACM (2023). https://doi.org/10.1145/3583131.3590356

  11. Goldberg, D.E., Deb, K.: A comparative analysis of selection schemes used in genetic algorithms. Found. Genet. Algorithms 1, 69–93 (1991). https://doi.org/10.1016/B978-0-08-050684-5.50008-2

  12. Haarnoja, T., Zhou, A., Abbeel, P., Levine, S.: Soft actor-critic: off-policy maximum entropy deep reinforcement learning with a stochastic actor. In: Dy, J., Krause, A. (eds.) Proceedings of the 35th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 80, pp. 1861–1870. PMLR (2018). https://proceedings.mlr.press/v80/haarnoja18b.html

  13. Hasselt, H.: Double Q-learning. In: Advances in Neural Information Processing Systems, vol. 23, pp. 2613–2621 (2010)

  14. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)

  15. Helmuth, T., Abdelhady, A.: Benchmarking parent selection for program synthesis by genetic programming. In: Proceedings of the 2020 Genetic and Evolutionary Computation Conference Companion. GECCO 2020, pp. 237–238. Association for Computing Machinery, New York, NY, USA (2020). https://doi.org/10.1145/3377929.3389987

  16. Helmuth, T., Lengler, J., La Cava, W.: Population diversity leads to short running times of lexicase selection. In: Rudolph, G., Kononova, A.V., Aguirre, H., Kerschke, P., Ochoa, G., Tušar, T. (eds.) PPSN 2022. LNCS, pp. 485–498. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-14721-0_34

  17. Helmuth, T., McPhee, N.F., Spector, L.: Lexicase selection for program synthesis: a diversity analysis. In: Riolo, R., Worzel, B., Kotanchek, M., Kordon, A. (eds.) Genetic Programming Theory and Practice XIII. GEC, pp. 151–167. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-34223-8_9

  18. Helmuth, T., McPhee, N.F., Spector, L.: Program synthesis using uniform mutation by addition and deletion. In: Proceedings of the Genetic and Evolutionary Computation Conference. GECCO 2018, pp. 1127–1134. Association for Computing Machinery, New York, NY, USA (2018). https://doi.org/10.1145/3205455.3205603

  19. Helmuth, T., Pantridge, E., Spector, L.: Lexicase selection of specialists. In: Proceedings of the Genetic and Evolutionary Computation Conference. GECCO 2019, pp. 1030–1038. Association for Computing Machinery, New York, NY, USA (2019). https://doi.org/10.1145/3321707.3321875

  20. Helmuth, T., Pantridge, E., Spector, L.: On the importance of specialists for lexicase selection. Genet. Program Evolvable Mach. 21(3), 349–373 (2020). https://doi.org/10.1007/s10710-020-09377-2

  21. Helmuth, T., Spector, L.: General program synthesis benchmark suite. In: Proceedings of the 2015 Annual Conference on Genetic and Evolutionary Computation. GECCO 2015, pp. 1039–1046. Association for Computing Machinery, New York, NY, USA (2015). https://doi.org/10.1145/2739480.2754769

  22. Helmuth, T., Spector, L.: Explaining and exploiting the advantages of down-sampled lexicase selection. In: Proceedings of ALIFE 2020: The 2020 Conference on Artificial Life, pp. 341–349 (2020). https://doi.org/10.1162/isal_a_00334

  23. Helmuth, T., Spector, L.: Problem-solving benefits of down-sampled lexicase selection. Artif. Life 27(3–4), 183–203 (2022). https://doi.org/10.1162/artl_a_00341

  24. Helmuth, T., Spector, L., Matheson, J.: Solving uncompromising problems with lexicase selection. IEEE Trans. Evol. Comput. 19(5), 630–643 (2015). https://doi.org/10.1109/TEVC.2014.2362729

  25. Hernandez, J.G., Lalejini, A., Dolson, E., Ofria, C.: Random subsampling improves performance in lexicase selection. In: Proceedings of the 2019 Genetic and Evolutionary Computation Conference Companion. GECCO 2019, pp. 2028–2031. Association for Computing Machinery, New York, NY, USA (2019). https://doi.org/10.1145/3319619.3326900

  26. Holland, J.H.: Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence. MIT Press, Cambridge (1992)

  27. Keskar, N.S., Mudigere, D., Nocedal, J., Smelyanskiy, M., Tang, P.T.P.: On large-batch training for deep learning: generalization gap and sharp minima. CoRR (2016). http://arxiv.org/abs/1609.04836

  28. Klein, J., Spector, L.: Genetic programming with historically assessed hardness. In: Genetic Programming Theory and Practice VI, pp. 1–14. Springer, Boston (2008). https://doi.org/10.1007/978-0-387-87623-8_5

  29. Krizhevsky, A., Hinton, G., et al.: Learning multiple layers of features from tiny images. Technical report. University of Toronto (2009)

  30. La Cava, W., Helmuth, T., Spector, L., Moore, J.H.: A probabilistic and multi-objective analysis of lexicase selection and ε-lexicase selection. Evol. Comput. 27(3), 377–402 (2019). https://doi.org/10.1162/evco_a_00224

  31. La Cava, W., et al.: Contemporary symbolic regression methods and their relative performance. In: Vanschoren, J., Yeung, S. (eds.) Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks, vol. 1. Curran (2021)

  32. La Cava, W., Spector, L., Danai, K.: Epsilon-lexicase selection for regression. In: Proceedings of the Genetic and Evolutionary Computation Conference 2016. GECCO 2016, pp. 741–748. Association for Computing Machinery, New York, NY, USA (2016). https://doi.org/10.1145/2908812.2908898

  33. McKay, R.I.B.: Fitness sharing in genetic programming. In: Proceedings of the 2nd Annual Conference on Genetic and Evolutionary Computation. GECCO 2000, pp. 435–442. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA (2000)

  34. de Melo, V.V., Vargas, D.V., Banzhaf, W.: Batch tournament selection for genetic programming: the quality of lexicase, the speed of tournament. In: Proceedings of the 2019 Genetic and Evolutionary Computation Conference. GECCO 2019, pp. 994–1002. Association for Computing Machinery, New York, NY, USA (2019). https://doi.org/10.1145/3321707.3321793

  35. Metevier, B., Saini, A.K., Spector, L.: Lexicase selection beyond genetic programming. In: Banzhaf, W., Spector, L., Sheneman, L. (eds.) Genetic Programming Theory and Practice XVI. GEC, pp. 123–136. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-04735-1_7

  36. Moore, J.M., Stanton, A.: Tiebreaks and diversity: isolating effects in lexicase selection. In: Proceedings of ALIFE 2018: The 2018 Conference on Artificial Life, pp. 590–597 (2018). https://doi.org/10.1162/isal_a_00109

  37. Pantridge, E., Spector, L.: Code building genetic programming. In: Proceedings of the 2020 Genetic and Evolutionary Computation Conference. GECCO 2020, pp. 994–1002. Association for Computing Machinery, New York, NY, USA (2020). https://doi.org/10.1145/3377930.3390239

  38. Romano, J.D., et al.: PMLB v1.0: an open source dataset collection for benchmarking machine learning methods (2021). https://doi.org/10.48550/arXiv.2012.00058

  39. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. In: Proceedings of the International Conference on Learning Representations (2015). https://doi.org/10.48550/arXiv.1409.1556

  40. Sobania, D., Rothlauf, F.: Program synthesis with genetic programming: the influence of batch sizes. In: Medvet, E., Pappa, G., Xue, B. (eds.) EuroGP 2022: Genetic Programming. Lecture Notes in Computer Science, vol. 13223, pp. 118–129. Springer, Cham (2022)

  41. Spector, L.: Autoconstructive evolution: Push, PushGP, and Pushpop. In: Proceedings of the 2001 Genetic and Evolutionary Computation Conference, GECCO-2001, pp. 137–146. Morgan Kaufmann Publishers, San Francisco, CA (2001)

  42. Spector, L., Ding, L., Boldi, R.: Particularity. In: Winkler, S., Trujillo, L., Ofria, C., Hu, T. (eds.) Genetic Programming Theory and Practice XX. Genetic and Evolutionary Computation. Springer, Singapore (2023, to appear). https://doi.org/10.1007/978-981-99-8413-8_9

  43. Spector, L., Robinson, A.: Genetic programming and autoconstructive evolution with the push programming language. Genet. Program Evolvable Mach. 3(1), 7–40 (2002). https://doi.org/10.1023/A:1014538503543

  44. Stanton, A., Moore, J.M.: Lexicase selection for multi-task evolutionary robotics. Artif. Life 28(4), 479–498 (2022). https://doi.org/10.1162/artl_a_00374

  45. Stephens, T.: gplearn (2023). https://github.com/trevorstephens/gplearn

  46. Urbanowicz, R.J., Moore, J.H.: Learning classifier systems: a complete introduction, review, and roadmap. J. Artif. Evol. Appl. 2009, 1–25 (2009). https://doi.org/10.1155/2009/736398

Acknowledgements

This work was performed in part using high-performance computing equipment at Amherst College obtained under National Science Foundation Grant No. 2117377. Any opinions, findings, and conclusions or recommendations expressed in this publication are those of the authors and do not necessarily reflect the views of the National Science Foundation. The authors would like to thank Ryan Boldi, Bill Tozier, Tom Helmuth, Edward Pantridge and other members of the PUSH lab for their insightful comments and suggestions.

Author information

Corresponding author

Correspondence to Andrew Ni.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Ni, A., Ding, L., Spector, L. (2024). DALex: Lexicase-Like Selection via Diverse Aggregation. In: Giacobini, M., Xue, B., Manzoni, L. (eds) Genetic Programming. EuroGP 2024. Lecture Notes in Computer Science, vol 14631. Springer, Cham. https://doi.org/10.1007/978-3-031-56957-9_6

  • DOI: https://doi.org/10.1007/978-3-031-56957-9_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-56956-2

  • Online ISBN: 978-3-031-56957-9

  • eBook Packages: Computer Science, Computer Science (R0)
