Integrative approach of machine learning and symbolic regression for stability prediction of multicomponent perovskite oxides and high-throughput screening
Created by W.Langdon from
gp-bibliography.bib Revision:1.8129
@Article{ZHANG:2024:commatsci,
author = "Zhaosheng Zhang and Yingjie Zhang and Sijia Liu",
title = "Integrative approach of machine learning and symbolic
regression for stability prediction of multicomponent
perovskite oxides and high-throughput screening",
journal = "Computational Materials Science",
year = "2024",
volume = "236",
pages = "112889",
month = mar,
keywords = "genetic algorithms, genetic programming, Symbolic
regression, Machine learning, High-throughput,
Perovskite oxides",
ISSN = "0927-0256",
URL = "https://www.sciencedirect.com/science/article/pii/S0927025624001101",
DOI = "10.1016/j.commatsci.2024.112889",
abstract = "This work unfolds a robust and interpretable strategy
for evaluating the stability and potential photovoltaic
application of 6526 multicomponent perovskite oxides,
employing a synergetic methodology that intertwines
advanced machine learning (ML) algorithms and symbolic
regression based on genetic programming. Initially, ML
algorithms, namely XGBoost, LightGBM, and random
forest, were harnessed, with elemental oxidation state
and electronegativity serving as input features,
achieving R$^2$ values of 0.98, 0.98, and 0.74,
respectively, on the test set for predicting the
formation enthalpy, a criterion for perovskite
stability. Despite the amplified interpretability
offered by SHAP analysis, the inherent black-box nature
of ML obfuscates a transparent understanding of
intrinsic relations between input features and
performance. To surmount this, symbolic regression
introduced not only elucidates a clear functional
relationship between input features and perovskite
stability but also achieves a commendable R$^2$ of 0.79 on
the test set. Subsequent high-throughput screening,
based on perovskite stability ranking, designated the
top 500 stable perovskites for band gap calculation
using the PBE functional, wherein DyNdHf2O6, CeEuAl2O6,
and CeSrAl2O6 emerged as potential candidates for
photovoltaic applications and were subjected to further
electronic structure simulations employing the HSE06
functional, encompassing density of states, band
structure, charge density, and optical absorption
spectra. Ultimately, CeEuAl2O6, boasting an optical
direct bandgap of 2.31 eV and minimal electron-hole
wavefunction overlap, stands out as the prime choice
for photovoltaic materials. This research not only
pioneers the exploration of enhancing the
interpretability of ML but also propels theoretical
guidance for the evolution of photovoltaic cells by
bridging predictive modeling with high-throughput
screening.",
}