Symbolic Regression in Materials Science: Discovering Interatomic Potentials from Data

Part of the book series: Genetic and Evolutionary Computation (GEVO)

Abstract

Particle-based modeling of materials at the atomic scale plays an important role in the development of new materials and the understanding of their properties. The accuracy of particle simulations is determined by interatomic potentials, which give the potential energy of an atomic system as a function of atomic coordinates and potentially other properties. First-principles-based ab initio potentials can reach arbitrary levels of accuracy; however, their applicability is limited by their high computational cost. Machine learning (ML) has recently emerged as an effective way to offset the high computational costs of ab initio atomic potentials by replacing expensive models with highly efficient surrogates trained on electronic structure data. Among the many current methods, symbolic regression (SR) is gaining traction as a powerful “white-box” approach for discovering functional forms of interatomic potentials. This contribution discusses the role of symbolic regression in Materials Science (MS) and offers a comprehensive overview of current methodological challenges and state-of-the-art results. A genetic programming-based approach for modeling atomic potentials from raw data (consisting of snapshots of atomic positions and associated potential energy) is presented and empirically validated on ab initio electronic structure data.

Notes

  1. https://gitlab.com/muellergroup/poet

  2. https://www.vasp.at/wiki/index.php/POSCAR

  3. https://en.wikipedia.org/wiki/Callable_object

  4. https://en.wikipedia.org/wiki/Reduction_operator

References

  1. Agrawal, A., Choudhary, A.: Perspective: Materials informatics and big data: realization of the “fourth paradigm” of science in materials science. APL Mater. 4(5), 053208 (2016)

  2. Araújo, J.P., Ballester, M.Y.: A comparative review of 50 analytical representation of potential energy interaction for diatomic systems: 100 years of history. Int. J. Quantum Chem. 121(24), e26808 (2021)

  3. Baker, J.E.: Reducing bias and inefficiency in the selection algorithm. In: Proceedings of the Second International Conference on Genetic Algorithms on Genetic Algorithms and Their Application, pp. 14–21, L. Erlbaum Associates Inc., USA (1987)

  4. Balabin, R.M., Lomakina, E.I.: Support vector machine regression (ls-svm)–an alternative to artificial neural networks (anns) for the analysis of quantum chemistry data? Phys. Chem. Chem. Phys. 13, 11710–11718 (2011)

  5. Bartók, A.P., Kondor, R., Csányi, G.: On representing chemical environments. Phys. Rev. B 87, 184115 (2013)

  6. Behler, J.: Perspective: Machine learning potentials for atomistic simulations. J. Chem. Phys. 145(17), 170901 (2016)

  7. Bellucci, M.A., Coker, D.F.: Empirical valence bond models for reactive potential energy surfaces: A parallel multilevel genetic program approach. J. Chem. Phys. 135(4), 044115 (2011)

  8. Bellucci, M.A., Coker, D.F.: Molecular dynamics of excited state intramolecular proton transfer: 3-hydroxyflavone in solution. J. Chem. Phys. 136(19), 194505 (2012)

  9. Binder, K., Heermann, D., Roelofs, L., John Mallinckrodt, A., McKay, S.: Monte carlo simulation in statistical physics. Comput. Phys. 7(2), 156–157 (1993)

  10. Brown, A., McCoy, A.B., Braams, B.J., Jin, Z., Bowman, J.M.: Quantum and classical studies of vibrational motion of CH5+ on a global potential energy surface obtained from a novel ab initio direct dynamics approach. J. Chem. Phys. 121(9), 4105–4116 (2004)

  11. Brown, M.W., Thompson, A.P., Watson, J.-P., Schultz, P.A.: Bridging scales from ab initio models to predictive empirical models for complex materials. Technical report, Sandia National Laboratories (2008)

  12. Brown, W.M., Thompson, A.P., Schultz, P.A.: Efficient hybrid evolutionary optimization of interatomic potential models. J. Chem. Phys. 132(2), 024108 (2010)

  13. Burlacu, B., Kronberger, G., Kommenda, M.: Operon C++: an efficient genetic programming framework for symbolic regression. In: Proceedings of the 2020 Genetic and Evolutionary Computation Conference Companion, GECCO’20, pp. 1562–1570. Association for Computing Machinery (2020). (internet, 8–12 July 2020)

  14. La Cava, W.G., Orzechowski, P., Burlacu, B., de França, F.O., Virgolin, M., Jin, Y., Kommenda, M., Moore, J.H.: Contemporary symbolic regression methods and their relative performance (2021). CoRR, arXiv:2107.14351

  15. Chen, R., Shao, K., Fu, B., Zhang, D.H.: Fitting potential energy surfaces with fundamental invariant neural network. ii. generating fundamental invariants for molecular systems with up to ten atoms. J. Chem. Phys. 152(20), 204307 (2020)

  16. Deb, K., Agrawal, S., Pratap, A., Meyarivan, T.: A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evol. Comput. 6(2), 182–197 (2002)

  17. Dral, P.O.: Quantum chemistry in the age of machine learning. J. Phys. Chem. Lett. 11(6), 2336–2347 (2020). PMID: 32125858

  18. Eldridge, A., Rodriguez, A., Hu, M., Hu, J.: Genetic programming-based learning of carbon interatomic potential for materials discovery (2022)

  19. Gagné, C., Parizeau, M.: Genericity in evolutionary computation software tools: Principles and case study. Int. J. Artif. Intell. Tools 15(2), 173–194 (2006)

  20. Gao, H., Wang, J., Sun, J.: Improve the performance of machine-learning potentials by optimizing descriptors. J. Chem. Phys. 150(24), 244110 (2019)

  21. Ghiringhelli, L.M., Vybiral, J., Levchenko, S.V., Draxl, C., Scheffler, M.: Big data of materials science: Critical role of the descriptor. Phys. Rev. Lett. 114, 105503 (2015)

  22. Guennebaud, G., Jacob, B., et al.: Eigen v3 (2010). http://eigen.tuxfamily.org

  23. Handley, C.M., Behler, J.: Next generation interatomic potentials for condensed systems. Eur. Phys. J. B 87(7), 152 (2014)

  24. Hernandez, A., Balasubramanian, A., Yuan, F., Mason, S.A.M., Mueller, T.: Fast, accurate, and transferable many-body interatomic potentials by symbolic regression. NPJ Comput. Mater. 5(1), 112 (2019)

  25. Hey, T., Butler, K., Jackson, S., Thiyagalingam, J.: Machine learning and big scientific data. Philos. Trans. R. Soc. A Math. Phys. Eng. Sci. 378(2166), 20190054 (2020)

  26. Himanen, L., Geurts, A., Foster, A.S., Rinke, P.: Data-driven materials science: status, challenges, and perspectives. Adv. Sci. 6(21), 1900808 (2019)

  27. Hospital, A., Goñi, J.R., Orozco, M., Gelpí, J.L.: Molecular dynamics simulations: advances and applications. Adv. Appl. Bioinform. Chem. AABC 8, 37 (2015)

  28. Hu, J., Goodman, E., Seo, K., Fan, Z., Rosenberg, R.: The hierarchical fair competition (HFC) framework for sustainable evolutionary algorithms. Evol. Comput. 13(2), 241–277 (2005)

  29. Ischtwan, J., Collins, M.A.: Molecular potential energy surfaces by interpolation. J. Chem. Phys. 100(11), 8080–8088 (1994)

  30. Kenoufi, A., Kholmurodov, K.: Symbolic regression of interatomic potentials via genetic programming. Biol. Chem. Res 2, 1–10 (2015)

  31. Kim, C., Pilania, G., Ramprasad, R.: From organized high-throughput data to phenomenological theory using machine learning: the example of dielectric breakdown. Chem. Mater. 28(5), 1304–1311 (2016)

  32. Kim, K.H., Lee, Y.S., Ishida, T., Jeung, G.-H.: Dynamics calculations for the LiH+H ⇌ Li+H2 reactions using interpolations of accurate ab initio potential energy surfaces. J. Chem. Phys. 119(9), 4689–4693 (2003)

  33. Kohn, W., Sham, L.J.: Self-consistent equations including exchange and correlation effects. Phys. Rev. 140, A1133–A1138 (1965)

  34. Koza, J.R.: Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press, Cambridge, MA, USA (1992)

  35. Kresse, G., Furthmüller, J.: Efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set. Phys. Rev. B 54, 11169–11186 (1996)

  36. Kruskal, W.H., Allen Wallis, W.: Use of ranks in one-criterion variance analysis. J. Am. Stat. Assoc. 47(260), 583–621 (1952)

  37. Kusne, A., Mueller, T., Ramprasad, R.: Machine learning in materials science: recent progress and emerging applications. Rev. Comput. Chem. (2016). (2016-05-06)

  38. Makarov, D.E., Metiu, H.: Fitting potential-energy surfaces: a search in the function space by directed genetic programming. J. Chem. Phys. 108(2), 590–598 (1998)

  39. Makarov, D.E., Metiu, H.: Using genetic programming to solve the schrödinger equation. J. Phys. Chem. A 104(37), 8540–8545 (2000)

  40. Mueller, T., Hernandez, A., Wang, C.: Machine learning for interatomic potential models. J. Chem. Phys. 152(5), 050902 (2020)

  41. Mueller, T., Johlin, E., Grossman, J.C.: Origins of hole traps in hydrogenated nanocrystalline and amorphous silicon revealed through machine learning. Phys. Rev. B 89, 115202 (2014)

  42. Pilania, G.: Machine learning in materials science: From explainable predictions to autonomous design. Comput. Mater. Sci. 193, 110360 (2021)

  43. Plimpton, S.: Fast parallel algorithms for short-range molecular dynamics. J. Comput. Phys. 117(1), 1–19 (1995)

  44. Rothe, T., Schuster, J., Teichert, F., Lorenz, E.E.: Machine Learning Potentials-State of the Research and Potential Applications for Carbon Nanostructures. Technische Universität, Faculty of Natural Sciences, Institute of Physics (2019)

  45. Sastry, K.N.: Genetic algorithms and genetic programming for multiscale modeling: Applications in materials science and chemistry and advances in scalability. PhD thesis, University of Illinois, Urbana-Champaign (March 2007)

  46. Shao, K., Chen, J., Zhao, Z., Zhang, D.H.: Communication: fitting potential energy surfaces with fundamental invariant neural network. J. Chem. Phys. 145(7), 071101 (2016)

  47. Shapeev, A.V.: Moment tensor potentials: a class of systematically improvable interatomic potentials. Multiscale Model. Simul. 14(3), 1153–1173 (2016)

  48. Slepoy, A., Peters, M.D., Thompson, A.P.: Searching for globally optimal functional forms for interatomic potentials using genetic programming with parallel tempering. J. Comput. Chem. 28(15), 2465–2471 (2007)

  49. Steele, D., Lippincott, E.R., Vanderslice, J.T.: Comparative study of empirical internuclear potential functions. Rev. Mod. Phys. 34, 239–251 (1962)

  50. Stillinger, F.H., Weber, T.A.: Computer simulation of local order in condensed phases of silicon. Phys. Rev. B 31, 5262–5271 (1985)

  51. Sutton, A.P., Chen, J.: Long-range finnis-sinclair potentials. Philos. Mag. Lett. 61(3), 139–146 (1990)

  52. Thompson, A.P., Swiler, L.P., Trott, C.R., Foiles, S.M., Tucker, G.J.: Spectral neighbor analysis method for automated generation of quantum-accurate interatomic potentials. J. Comput. Phys. 285, 316–330 (2015)

  53. Unke, O.T., Chmiela, S., Sauceda, H.E., Gastegger, M., Poltavsky, I., Schütt, K.T., Tkatchenko, A., Müller, K.-R.: Machine learning force fields. Chem. Rev. 121(16), 10142–10186 (2021). PMID: 33705118

  54. Wang, Y., Wagner, N., Rondinelli, J.M.: Symbolic regression in materials science. MRS Commun. 9(3), 793–805 (2019)

  55. Zhang, L., Han, J., Wang, H., Car, R., E, W.: Deep potential molecular dynamics: a scalable model with the accuracy of quantum mechanics. Phys. Rev. Lett. 120, 143001 (2018)

Corresponding author

Correspondence to Bogdan Burlacu.

5 Appendix

Empirical potentials

For a comprehensive overview of empirical potentials, we recommend the work of Araújo and Ballester [2]. Below, we give a brief overview of the most important empirical potentials mentioned in this contribution.

Morse potential

This is an empirical potential used to model diatomic molecules:

$$\begin{aligned} V_\text {M}(r) = D\Big ( 1 - \exp \big ( -a(r - r_0) \big ) \Big ) ^2 \end{aligned}$$
(17)

where D is the dissociation energy (the depth of the potential well), r is the distance between atoms, a is a parameter that controls the width of the well and \(r_0\) is the equilibrium bond distance.
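
As a quick numerical check, the Morse potential can be evaluated in a few lines of Python. This is a minimal sketch that assumes the standard form in which the whole bracket is squared, \(V_\text {M}(r) = D\,(1 - e^{-a(r - r_0)})^2\). The function name `morse` and the parameter values are illustrative and are not taken from the chapter.

```python
import math


def morse(r, D, a, r0):
    """Morse potential V(r) = D * (1 - exp(-a * (r - r0)))**2."""
    return D * (1.0 - math.exp(-a * (r - r0))) ** 2


# Illustrative (hypothetical) parameters, not taken from the chapter.
D, a, r0 = 4.7, 1.9, 0.74

print(morse(r0, D, a, r0))    # zero at the equilibrium distance
print(morse(50.0, D, a, r0))  # approaches D at large separation
```

With this convention the energy is zero at \(r = r_0\) and tends to D as the atoms separate, so D is measured from the bottom of the well.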

Lennard-Jones potential

The Lennard-Jones potential models soft repulsive and attractive interactions and can describe electronically neutral atoms or molecules. Interacting particles repel each other at very close distances, attract each other at moderate distances, and do not interact at infinite distances:

$$\begin{aligned} V_\text {LJ}(r) = 4 \varepsilon \bigg [ \Big ( \frac{\sigma }{r} \Big )^{12} - \Big ( \frac{\sigma }{r} \Big )^6 \bigg ] \end{aligned}$$
(18)

where r is the distance between atoms, \(\varepsilon \) is the dispersion energy and \(\sigma \) is the distance at which the particle-particle potential energy V is zero.
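
The two landmarks of the Lennard-Jones curve can be checked numerically. The sketch below evaluates Eq. (18) and confirms that the energy vanishes at \(r = \sigma \) and reaches its minimum value \(-\varepsilon \) at \(r = 2^{1/6}\sigma \), which follows from setting \(dV_\text {LJ}/dr = 0\). The function name `lennard_jones` and the reduced units \(\varepsilon = \sigma = 1\) are assumptions made for illustration.

```python
def lennard_jones(r, eps, sigma):
    """12-6 Lennard-Jones potential V(r) = 4*eps*((sigma/r)**12 - (sigma/r)**6)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * eps * (sr6 * sr6 - sr6)


# Reduced units, chosen only for illustration.
eps, sigma = 1.0, 1.0
r_min = 2.0 ** (1.0 / 6.0) * sigma  # location of the energy minimum

print(lennard_jones(sigma, eps, sigma))  # zero crossing at r = sigma
print(lennard_jones(r_min, eps, sigma))  # close to -eps at the minimum
```

Computing \((\sigma /r)^6\) once and squaring it avoids evaluating two separate powers, a common choice in inner loops of particle simulations.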

Lippincott potential

The Lippincott potential [49] involves an exponential of the interatomic distance:

$$\begin{aligned} V_\text {LIP}(r) = D\Bigg (1 - \exp \Big (\frac{-n(r - r_0)^2}{2r} \Big ) \Bigg )\Big (1 + aF(r) \Big ) \end{aligned}$$
(19)

where D is the dissociation energy, r is the distance between atoms, \(r_0\) is the equilibrium bond distance and a and n are parameters. F(r) is a function of internuclear distance such that \(F(r) = 0\) when \(r=\infty \) and \(F(r) = \infty \) when \(r=0\).
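
Because the text constrains F(r) only through its limits, a code version of Eq. (19) has to leave F open. The sketch below takes F as a caller-supplied function. The name `lippincott` and the example choice \(F(r) = 1/r\), which satisfies both limits, are hypothetical and are not taken from [49].

```python
import math


def lippincott(r, D, n, r0, a, F):
    """Eq. (19): D * (1 - exp(-n*(r - r0)**2 / (2*r))) * (1 + a*F(r))."""
    if r <= 0.0:
        raise ValueError("r must be positive")
    bracket = 1.0 - math.exp(-n * (r - r0) ** 2 / (2.0 * r))
    return D * bracket * (1.0 + a * F(r))


def inverse_distance(r):
    """Hypothetical F(r): tends to 0 as r grows and diverges as r -> 0."""
    return 1.0 / r


# Illustrative (hypothetical) parameters.
print(lippincott(1.2, D=4.0, n=5.0, r0=1.2, a=0.1, F=inverse_distance))
```

Whatever F is chosen, the first bracket vanishes at \(r = r_0\), so the energy is zero at the equilibrium bond distance; with \(F(\infty ) = 0\) the energy tends to D at large separation.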

Stillinger-Weber potential

The Stillinger-Weber potential [50] models two- and three-body interactions by taking into account not only the distances between atoms but also the bond angles:

$$\begin{aligned} V_\text {SW}(r) = \sum _{\langle i, j \rangle } \phi _2(r_{ij}) + \sum _{\langle i, j, k \rangle } \phi _3(r_{ij}, r_{ik}, \theta _{ijk}) \end{aligned}$$
(20)

where

$$\begin{aligned} \phi _2(r_{ij})&= A \varepsilon \left[ B\left( \frac{\sigma }{r_{ij}} \right) ^p - \left( \frac{\sigma }{r_{ij}} \right) ^q \right] \exp \left( \frac{\sigma }{r_{ij} - a\sigma } \right) \text { and}\end{aligned}$$
(21)
$$\begin{aligned} \phi _3(r_{ij}, r_{ik}, \theta _{ijk})&= \lambda \varepsilon \left[ \cos \theta _{ijk} - \cos \theta _0 \right] ^2 \times \exp \left( \frac{\gamma \sigma }{r_{ij} - a\sigma } \right) \exp \left( \frac{\gamma \sigma }{r_{ik} - a \sigma } \right) \end{aligned}$$
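(22)

where \(\varepsilon \) and \(\sigma \) set the energy and length scales, a is the cutoff distance in units of \(\sigma \), \(\theta _0\) is the preferred bond angle and A, B, p, q, \(\lambda \) and \(\gamma \) are fitted parameters.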
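
The two- and three-body terms of Eqs. (21) and (22) translate directly into code. The sketch below works in reduced units (\(\varepsilon = \sigma = 1\)) and uses the widely quoted silicon parameter set as defaults; these values should be checked against [50] before any real use. The function names are hypothetical. Setting both terms to zero at and beyond \(r = a\sigma \) follows the usual Stillinger-Weber convention and is an assumption here, since the text above does not state the cutoff explicitly.

```python
import math


def sw_phi2(r, eps=1.0, sigma=1.0, A=7.049556277, B=0.6022245584,
            p=4, q=0, a=1.80):
    """Two-body term of Eq. (21); zero at and beyond the cutoff a*sigma."""
    if r >= a * sigma:
        return 0.0
    s = sigma / r
    return A * eps * (B * s ** p - s ** q) * math.exp(sigma / (r - a * sigma))


def sw_phi3(r_ij, r_ik, cos_theta, eps=1.0, sigma=1.0, lam=21.0,
            gamma=1.20, a=1.80, cos_theta0=-1.0 / 3.0):
    """Three-body term of Eq. (22); zero if either bond reaches the cutoff."""
    rc = a * sigma
    if r_ij >= rc or r_ik >= rc:
        return 0.0
    decay = math.exp(gamma * sigma / (r_ij - rc)) * math.exp(gamma * sigma / (r_ik - rc))
    return lam * eps * (cos_theta - cos_theta0) ** 2 * decay


r_bond = 2.0 ** (1.0 / 6.0)  # pair-term minimum in reduced units
print(sw_phi2(r_bond))                          # close to -1 (i.e. -eps)
print(sw_phi3(r_bond, r_bond, -1.0 / 3.0))      # no penalty at the tetrahedral angle
print(sw_phi3(r_bond, r_bond, 0.0))             # positive penalty at 90 degrees
```

The exponential factors drive both terms smoothly to zero as a bond length approaches the cutoff from below, which is why no separate switching function is needed.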

Sutton-Chen potential

The Sutton-Chen potential [51] has been used in molecular dynamics and Monte Carlo simulations of metallic systems. It offers a reasonable description of various bulk properties, with an approximate many-body representation of the delocalized metallic bonding:

$$\begin{aligned} V_\text {SC} = \sum _{\langle i,j \rangle } U(r_{ij}) - \sum _i u\sqrt{\rho _i} \end{aligned}$$
(23)

Here, the first term represents the repulsion between atomic cores, and the second term models the bonding energy due to the electrons. Both terms are defined as reciprocal powers of the interatomic distance, so that the complete expression is

$$\begin{aligned} V_\text {SC} = \epsilon \left[ \sum _{\langle i,j \rangle } \left( \frac{a}{r_{ij}} \right) ^n - C \sum _i \sqrt{\sum _j \left( \frac{a}{r_{ij}} \right) ^m} \right] \end{aligned}$$
(24)

where C is a dimensionless parameter, \(\epsilon \) is a parameter with dimensions of energy, a is the lattice constant, m and n are positive integers with \(n > m\) and \(r_{ij}\) is the distance between the ith and jth atoms.
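
The structure of Eq. (24), a pairwise repulsion plus a square root of a per-atom density, can be seen in a short total-energy routine. This is a minimal sketch for a finite cluster without periodic boundaries; the inner sum is taken over all \(j \ne i\), and the function name and parameter values are illustrative rather than a fitted parameter set.

```python
import math
from itertools import combinations


def sutton_chen_energy(positions, eps, a, n, m, C):
    """Total Sutton-Chen energy of a finite cluster, following Eq. (24)."""
    n_atoms = len(positions)
    pair_sum = 0.0          # sum over unique pairs of (a / r_ij)**n
    rho = [0.0] * n_atoms   # rho_i = sum over j != i of (a / r_ij)**m
    for i, j in combinations(range(n_atoms), 2):
        r = math.dist(positions[i], positions[j])
        pair_sum += (a / r) ** n
        contribution = (a / r) ** m
        rho[i] += contribution
        rho[j] += contribution
    bonding = sum(math.sqrt(x) for x in rho)
    return eps * (pair_sum - C * bonding)


# Dimer at separation a with illustrative (hypothetical) parameters:
# pair term = 1, rho_i = 1 for both atoms, so V = eps * (1 - 2*C).
dimer = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
print(sutton_chen_energy(dimer, eps=1.0, a=1.0, n=9, m=6, C=2.0))
```

Because the bonding term depends on \(\sqrt{\rho _i}\) rather than on a sum of independent pair contributions, the energy of an atom is not additive over its neighbours, which is what makes the potential many-body.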

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Burlacu, B., Kommenda, M., Kronberger, G., Winkler, S.M., Affenzeller, M. (2023). Symbolic Regression in Materials Science: Discovering Interatomic Potentials from Data. In: Trujillo, L., Winkler, S.M., Silva, S., Banzhaf, W. (eds) Genetic Programming Theory and Practice XIX. Genetic and Evolutionary Computation. Springer, Singapore. https://doi.org/10.1007/978-981-19-8460-0_1

  • DOI: https://doi.org/10.1007/978-981-19-8460-0_1

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-19-8459-4

  • Online ISBN: 978-981-19-8460-0

  • eBook Packages: Computer Science (R0)
