Abstract
Lexicase selection has been shown to provide advantages over other selection algorithms in several areas of evolutionary computation and machine learning. In its standard form, lexicase selection filters a population or other collection based on randomly ordered training cases that are considered one at a time. This iterated filtering process can be time-consuming, particularly in settings with large numbers of training cases, including many symbolic regression and deep learning applications. In this paper, we propose a new method that is nearly equivalent to lexicase selection in terms of the individuals that it selects, but which does so in significantly less time. The new method, called DALex (for Diversely Aggregated Lexicase selection), selects the best individual with respect to a randomly weighted sum of training case errors. This allows us to formulate the core computation required for selection as matrix multiplication instead of recursive loops of comparisons, which in turn allows us to take advantage of optimized and parallel algorithms designed for matrix multiplication for speedup. Furthermore, we show that we can interpolate between the behavior of lexicase selection and its “relaxed” variants, such as epsilon and batch lexicase selection, by adjusting a single hyperparameter, named “particularity pressure,” which represents the importance granted to each individual training case. Results on program synthesis, deep learning, symbolic regression, and learning classifier systems demonstrate that DALex achieves significant speedups over lexicase selection and its relaxed variants while maintaining almost identical problem-solving performance. Under a fixed computational budget, these savings free up resources that can be directed towards increasing population size or the number of generations, enabling the potential for solving more difficult problems.
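The selection step described in the abstract can be sketched in a few lines of NumPy. This is an illustrative reading, not the authors' released implementation: it assumes an error matrix of shape (population size, number of cases), and it instantiates the "randomly weighted sum" by sampling per-case logits from a normal distribution scaled by the particularity pressure and normalizing them with a softmax, so that high pressure concentrates nearly all weight on a few cases (lexicase-like behavior) while low pressure spreads weight evenly (average-error behavior). The function name `dalex_select` and the exact weight-sampling scheme are assumptions for this sketch.

```python
import numpy as np

def dalex_select(errors, num_parents, particularity_pressure=20.0, rng=None):
    """Select parents via randomly weighted sums of per-case errors.

    errors: (pop_size, num_cases) array of non-negative errors (lower is better).
    Returns an array of num_parents indices into the population.
    Note: the normal-logits-plus-softmax weighting here is one plausible
    instantiation of DALex's random weights, used for illustration.
    """
    rng = np.random.default_rng() if rng is None else rng
    pop_size, num_cases = errors.shape
    # One random weight vector per selection event: (num_parents, num_cases).
    logits = particularity_pressure * rng.standard_normal((num_parents, num_cases))
    # Row-wise softmax (shifted for numerical stability) yields weights summing to 1.
    weights = np.exp(logits - logits.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    # A single matrix multiplication scores every individual for every event,
    # replacing lexicase's per-case filtering loop.
    aggregated = weights @ errors.T          # (num_parents, pop_size)
    return aggregated.argmin(axis=1)         # best (lowest weighted error) per event
```

Because the whole computation reduces to one `weights @ errors.T` product, it benefits directly from optimized BLAS routines and GPU parallelism, which is the source of the speedups the abstract reports.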
Supported by Amherst College and members of the PUSH lab.
Notes
- 1.
- 2. Fixed a bug in the downsampling implementation in the released version of [8].
- 3. Due to specific quirks of the code-building system, it is very difficult for CBGP to generalize successfully on Compare String Lengths.
References
Aenugu, S., Spector, L.: Lexicase selection in learning classifier systems. In: Proceedings of the Genetic and Evolutionary Computation Conference. GECCO 2019, pp. 356–364. Association for Computing Machinery, New York, NY, USA (2019). https://doi.org/10.1145/3321707.3321828
Boldi, R., et al.: Informed down-sampled lexicase selection: identifying productive training cases for efficient problem solving. Evol. Comput. 1–32 (2024). https://doi.org/10.48550/arXiv.2301.01488, to appear
Breiman, L., Friedman, J., Olshen, R., Stone, C.: LED Display Domain. UCI Mach. Learn. Repository (1988). https://doi.org/10.24432/C5FG61
Coppersmith, D., Winograd, S.: Matrix multiplication via arithmetic progressions. In: Proceedings of the Nineteenth Annual ACM Symposium on Theory of Computing. STOC 1987, pp. 1–6. Association for Computing Machinery, New York, NY, USA (1987). https://doi.org/10.1145/28395.28396
Deb, K., Agrawal, S., Pratap, A., Meyarivan, T.: A fast elitist non-dominated sorting genetic algorithm for multi-objective optimization: NSGA-II. In: Schoenauer, M., et al. (eds.) PPSN 2000. LNCS, pp. 849–858. Springer, Berlin Heidelberg, Berlin, Heidelberg (2000). https://doi.org/10.1007/3-540-45356-3_83
Deb, K., Goldberg, D.E.: An investigation of niche and species formation in genetic function optimization. In: Proceedings of the Third International Conference on Genetic Algorithms, pp. 42–50. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA (1989)
Ding, L., Boldi, R., Helmuth, T., Spector, L.: Lexicase selection at scale. In: Proceedings of the Genetic and Evolutionary Computation Conference Companion. GECCO 2022, pp. 2054–2062. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3520304.3534026
Ding, L., Pantridge, E., Spector, L.: Probabilistic lexicase selection. In: Proceedings of the Genetic and Evolutionary Computation Conference. GECCO 2023, pp. 1073–1081. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3583131.3590375
Ding, L., Spector, L.: Optimizing neural networks with gradient lexicase selection. In: International Conference on Learning Representations (2022). https://doi.org/10.48550/arXiv.2312.12606
Dolson, E.: Calculating lexicase selection probabilities is NP-hard. In: Proceedings of the Genetic and Evolutionary Computation Conference. GECCO 2023. ACM (2023). https://doi.org/10.1145/3583131.3590356
Goldberg, D.E., Deb, K.: A comparative analysis of selection schemes used in genetic algorithms. Found. Genet. Algorithms 1, 69–93 (1991). https://doi.org/10.1016/B978-0-08-050684-5.50008-2
Haarnoja, T., Zhou, A., Abbeel, P., Levine, S.: Soft actor-critic: off-policy maximum entropy deep reinforcement learning with a stochastic actor. In: Dy, J., Krause, A. (eds.) Proceedings of the 35th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 80, pp. 1861–1870. PMLR (2018). https://proceedings.mlr.press/v80/haarnoja18b.html
Hasselt, H.: Double Q-learning. In: Advances in Neural Information Processing Systems, vol. 23, pp. 2613–2621 (2010)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
Helmuth, T., Abdelhady, A.: Benchmarking parent selection for program synthesis by genetic programming. In: Proceedings of the 2020 Genetic and Evolutionary Computation Conference Companion. GECCO 2020, pp. 237–238. Association for Computing Machinery, New York, NY, USA (2020). https://doi.org/10.1145/3377929.3389987
Helmuth, T., Lengler, J., La Cava, W.: Population diversity leads to short running times of lexicase selection. In: Rudolph, G., Kononova, A.V., Aguirre, H., Kerschke, P., Ochoa, G., Tušar, T. (eds.) PPSN 2022. LNCS, pp. 485–498. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-14721-0_34
Helmuth, T., McPhee, N.F., Spector, L.: Lexicase selection for program synthesis: a diversity analysis. In: Riolo, R., Worzel, B., Kotanchek, M., Kordon, A. (eds.) Genetic Programming Theory and Practice XIII. GEC, pp. 151–167. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-34223-8_9
Helmuth, T., McPhee, N.F., Spector, L.: Program synthesis using uniform mutation by addition and deletion. In: Proceedings of the Genetic and Evolutionary Computation Conference. GECCO 2018, pp. 1127–1134. Association for Computing Machinery, New York, NY, USA (2018). https://doi.org/10.1145/3205455.3205603
Helmuth, T., Pantridge, E., Spector, L.: Lexicase selection of specialists. In: Proceedings of the Genetic and Evolutionary Computation Conference. GECCO 2019, pp. 1030–1038. Association for Computing Machinery, New York, NY, USA (2019). https://doi.org/10.1145/3321707.3321875
Helmuth, T., Pantridge, E., Spector, L.: On the importance of specialists for lexicase selection. Genet. Program Evolvable Mach. 21(3), 349–373 (2020). https://doi.org/10.1007/s10710-020-09377-2
Helmuth, T., Spector, L.: General program synthesis benchmark suite. In: Proceedings of the 2015 Annual Conference on Genetic and Evolutionary Computation. GECCO 2015, pp. 1039–1046. Association for Computing Machinery, New York, NY, USA (2015). https://doi.org/10.1145/2739480.2754769
Helmuth, T., Spector, L.: Explaining and exploiting the advantages of down-sampled lexicase selection. In: Proceedings of ALIFE 2020: The 2020 Conference on Artificial Life, pp. 341–349 (2020). https://doi.org/10.1162/isal_a_00334
Helmuth, T., Spector, L.: Problem-solving benefits of down-sampled lexicase selection. Artif. Life 27(3–4), 183–203 (2022). https://doi.org/10.1162/artl_a_00341
Helmuth, T., Spector, L., Matheson, J.: Solving uncompromising problems with lexicase selection. IEEE Trans. Evol. Comput. 19(5), 630–643 (2015). https://doi.org/10.1109/TEVC.2014.2362729
Hernandez, J.G., Lalejini, A., Dolson, E., Ofria, C.: Random subsampling improves performance in lexicase selection. In: Proceedings of the 2019 Genetic and Evolutionary Computation Conference Companion. GECCO 2019, pp. 2028–2031. Association for Computing Machinery, New York, NY, USA (2019). https://doi.org/10.1145/3319619.3326900
Holland, J.H.: Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence. MIT Press, Cambridge (1992)
Keskar, N.S., Mudigere, D., Nocedal, J., Smelyanskiy, M., Tang, P.T.P.: On large-batch training for deep learning: generalization gap and sharp minima. CoRR (2016). http://arxiv.org/abs/1609.04836
Klein, J., Spector, L.: Genetic programming with historically assessed hardness. In: Genetic Programming Theory and Practice VI, pp. 1–14. Springer, Boston (2008). https://doi.org/10.1007/978-0-387-87623-8_5
Krizhevsky, A., Hinton, G., et al.: Learning multiple layers of features from tiny images. Technical report. University of Toronto (2009)
La Cava, W., Helmuth, T., Spector, L., Moore, J.H.: A probabilistic and multi-objective analysis of lexicase selection and ε-lexicase selection. Evol. Comput. 27(3), 377–402 (2019). https://doi.org/10.1162/evco_a_00224
La Cava, W., et al.: Contemporary symbolic regression methods and their relative performance. In: Vanschoren, J., Yeung, S. (eds.) Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks, vol. 1. Curran (2021)
La Cava, W., Spector, L., Danai, K.: Epsilon-lexicase selection for regression. In: Proceedings of the Genetic and Evolutionary Computation Conference 2016. GECCO 2016, pp. 741–748. Association for Computing Machinery, New York, NY, USA (2016). https://doi.org/10.1145/2908812.2908898
McKay, R.I.B.: Fitness sharing in genetic programming. In: Proceedings of the 2nd Annual Conference on Genetic and Evolutionary Computation. GECCO 2000, pp. 435–442. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA (2000)
de Melo, V.V., Vargas, D.V., Banzhaf, W.: Batch tournament selection for genetic programming: the quality of lexicase, the speed of tournament. In: Proceedings of the 2019 Genetic and Evolutionary Computation Conference. GECCO 2019, pp. 994–1002. Association for Computing Machinery, New York, NY, USA (2019). https://doi.org/10.1145/3321707.3321793
Metevier, B., Saini, A.K., Spector, L.: Lexicase selection beyond genetic programming. In: Banzhaf, W., Spector, L., Sheneman, L. (eds.) Genetic Programming Theory and Practice XVI. GEC, pp. 123–136. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-04735-1_7
Moore, J.M., Stanton, A.: Tiebreaks and diversity: isolating effects in lexicase selection. In: Proceedings of ALIFE 2018: The 2018 Conference on Artificial Life, pp. 590–597 (2018). https://doi.org/10.1162/isal_a_00109
Pantridge, E., Spector, L.: Code building genetic programming. In: Proceedings of the 2020 Genetic and Evolutionary Computation Conference. GECCO 2020, pp. 994–1002. Association for Computing Machinery, New York, NY, USA (2020). https://doi.org/10.1145/3377930.3390239
Romano, J.D., et al.: PMLB v1.0: an open source dataset collection for benchmarking machine learning methods (2021). https://doi.org/10.48550/arXiv.2012.00058
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. In: Proceedings of the International Conference on Learning Representations (2015). https://doi.org/10.48550/arXiv.1409.1556
Sobania, D., Rothlauf, F.: Program synthesis with genetic programming: the influence of batch sizes. In: Medvet, E., Pappa, G., Xue, B. (eds.) EuroGP 2022: Genetic Programming. Lecture Notes in Computer Science, vol. 13223, pp. 118–129. Springer, Cham (2022)
Spector, L.: Autoconstructive evolution: Push, PushGP, and Pushpop. In: Proceedings of the 2001 Genetic and Evolutionary Computation Conference, GECCO-2001, pp. 137–146. Morgan Kaufmann Publishers, San Francisco, CA (2001)
Spector, L., Ding, L., Boldi, R.: Particularity. In: Winkler, S., Trujillo, L., Ofria, C., Hu, T. (eds.) Genetic Programming Theory and Practice XX. Genetic and Evolutionary Computation. Springer, Singapore (2023). https://doi.org/10.1007/978-981-99-8413-8_9, to appear
Spector, L., Robinson, A.: Genetic programming and autoconstructive evolution with the push programming language. Genet. Program Evolvable Mach. 3(1), 7–40 (2002). https://doi.org/10.1023/A:1014538503543
Stanton, A., Moore, J.M.: Lexicase selection for multi-task evolutionary robotics. Artif. Life 28(4), 479–498 (2022). https://doi.org/10.1162/artl_a_00374
Stephens, T.: gplearn (2023). https://github.com/trevorstephens/gplearn
Urbanowicz, R.J., Moore, J.H.: Learning classifier systems: a complete introduction, review, and roadmap. J. Artif. Evol. Appl. 2009, 1–25 (2009). https://doi.org/10.1155/2009/736398
Acknowledgements
This work was performed in part using high-performance computing equipment at Amherst College obtained under National Science Foundation Grant No. 2117377. Any opinions, findings, and conclusions or recommendations expressed in this publication are those of the authors and do not necessarily reflect the views of the National Science Foundation. The authors would like to thank Ryan Boldi, Bill Tozier, Tom Helmuth, Edward Pantridge and other members of the PUSH lab for their insightful comments and suggestions.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Ni, A., Ding, L., Spector, L. (2024). DALex: Lexicase-Like Selection via Diverse Aggregation. In: Giacobini, M., Xue, B., Manzoni, L. (eds) Genetic Programming. EuroGP 2024. Lecture Notes in Computer Science, vol 14631. Springer, Cham. https://doi.org/10.1007/978-3-031-56957-9_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-56956-2
Online ISBN: 978-3-031-56957-9