Partial differential equations discovery with EPDE framework: Application for real and synthetic data ☆
Introduction
The ability to simulate complex processes despite a lack of knowledge about the system's underlying structure can be vital for developing models in fields such as biology, medicine, materials technology, and metocean studies. In contrast to deterministic physics-based models, which are developed by applying conservation laws to the studied process, data-driven modeling (DDM) builds complete models from measurement fields that describe the process, using statistics and machine learning algorithms. Moreover, on some occasions, DDM can enhance existing physics-based models with supplementary expressions or refined weight values [3]. In fluid dynamics and hydrometeorology, the development of surrogate models is the most common application of data-driven algorithms.
This paper focuses on methods of data-driven differential equation discovery. In some cases, differential equations can be interpreted by an expert in either the application field or the theory of differential equations. Moreover, the well-developed methods of mathematical physics can be used to analyze and interpret the discovered equations. In most cases, existing algorithms use sparse regression over a prescribed library of differential terms [4], [5]. The second popular approach is to use neural network algorithms for differential equation discovery [6], [7], [8].
We treat the discovered models as surrogate models that can be applied to hydrometeorological examples. Various approaches to surrogate modeling, including differential equation discovery, are described below.
Modern surrogate models tend to belong to one of three major groups [9]:
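To make the idea of sparse regression over a prescribed library concrete, the following is a minimal sketch and not the implementation used in [4], [5] or in EPDE. It assumes a one-dimensional advection field with the hypothetical governing equation u_t = -c u_x, a hand-picked library of four candidate terms, and sequentially thresholded least squares as the sparse solver; all names and parameter values are illustrative.

```python
import numpy as np

def stlsq(library, target, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares: fit, zero small weights, refit."""
    coef = np.linalg.lstsq(library, target, rcond=None)[0]
    for _ in range(n_iter):
        active = np.abs(coef) >= threshold
        coef[~active] = 0.0
        if not active.any():
            break
        coef[active] = np.linalg.lstsq(library[:, active], target, rcond=None)[0]
    return coef

# Assumed synthetic field u(t, x) = f(x - c t), which satisfies u_t = -c u_x.
c = 2.0
t = np.linspace(0.0, 1.0, 101)
x = np.linspace(0.0, 2.0 * np.pi, 128)
T, X = np.meshgrid(t, x, indexing="ij")
u = np.sin(X - c * T) + 0.5 * np.cos(2.0 * (X - c * T))

# Finite-difference derivatives on the grid (axis 0 is time, axis 1 is space).
dt, dx = t[1] - t[0], x[1] - x[0]
u_t = np.gradient(u, dt, axis=0, edge_order=2)
u_x = np.gradient(u, dx, axis=1, edge_order=2)
u_xx = np.gradient(u_x, dx, axis=1, edge_order=2)

# Prescribed library of candidate terms; boundary points are dropped because
# one-sided differences are less accurate there.
term_names = ["u", "u_x", "u_xx", "u*u_x"]
terms = [u, u_x, u_xx, u * u_x]
interior = (slice(2, -2), slice(2, -2))
library = np.column_stack([term[interior].ravel() for term in terms])
target = u_t[interior].ravel()

weights = dict(zip(term_names, stlsq(library, target)))
print(weights)
```

In this sketch only the `u_x` term should survive thresholding, with a weight close to -c. The outcome depends on the library containing the true terms, which is the limitation that motivates more flexible search strategies.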
- Data-driven empirical approximations of the deterministic model outputs. These models use conclusions obtained with the statistical or machine learning tools (response surfaces, kriging) applied to the data.
- Reduced-order models, which are based on the projection of the model's main equations onto a subspace of reduced dimensionality, using various orthogonal decompositions.
- Multifidelity models, which simplify the representation of the complex physics of the modeled process by omitting the less significant subprocesses or increasing the model's scale. In some cases, the experimental setup requires applying models with different fidelity levels to evaluate multiple scales of processes or a modeling ensemble [10], [11].
In this research, we are interested in developing a new approach that belongs to the first class of models. However, applications in the natural sciences require the model to be robust and to work in a high-dimensional space in order to handle spatio-temporal and other types of variability. Moving from the single spatial dimension usually considered in the literature to higher spatial dimensions requires the algorithm to handle exponentially growing noise levels.
In previous work [12], we described the EPDE (Evolutionary Partial Differential Equations) approach, which provides a flexible yet efficient tool for data-driven equation derivation. This work increases the problem's difficulty by introducing higher-dimensional cases and high-magnitude noise in the data.
This version extends the conference paper [2] and introduces a series of experiments that allow a better comparison of the EPDE framework with its analogs. The modular structure of the algorithm, briefly described in Section 6, makes it possible, for example, to use a differentiation scheme other than finite differences. We demonstrate this using neural networks and automatic differentiation in Section 7.
This paper is organized as follows. Section 2 briefly introduces the existing surrogate modeling approaches. Section 3 describes the problem of data-driven PDE discovery, and Section 4 describes its practical realization. Section 5 presents numerical examples on synthetic and real data. Section 6 presents the additions to the method described in the previous article [12] that allow it to handle higher-dimensional data-driven PDE discovery. Section 7 illustrates the modular structure and describes experiments in which the differentiation model is replaced with a neural network approximation. Section 8 concludes the paper.
Section snippets
Related work
The first examples of data-driven surrogate modeling in hydrometeorology appeared in the field's earliest stages, with the understanding that contemporary full-scale models required computational power inaccessible to many research teams. The original approaches were based on pattern scaling: the extension of the present trend obtained from an ensemble of full-scale models [13], [14]. Statistical emulation based on an ensemble of pre-computed deterministic models has
Problem statement
The class of problems that the described EPDE algorithm can solve can be summarized as follows: a process involving a scalar field occurs in a given area and is governed by the partial differential equation (1). However, there is no a priori information about the dynamics of the process except that some form of PDE can describe it (for simplicity, we consider the case of a temporally varying 2D field, even though the problem can be formulated for an arbitrary field). In recent
Method description
In this section, the details of the evolutionary method of partial differential equation derivation are described. The proposed method combines evolutionary algorithms and sparse regression to detect the equation structure. The sparse regression aims to construct the set of equation terms, while the evolutionary algorithm focuses on selecting significant terms from the created set and calculating the weights that will be present in the resulting equation. At first, we introduce the
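As a rough illustration of coupling an evolutionary search over term sets with a regression step for the weights, the sketch below evolves boolean masks over a library of candidate terms. It is not the EPDE algorithm: the library is a hypothetical random matrix standing in for evaluated differential terms, the operators are a plain bit-flip mutation with elitist selection, and ordinary least squares with a sparsity penalty in the fitness is swapped in for the sparse regression step. All names and parameter values are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical library: 8 candidate term columns evaluated at 200 points.
# The target is built from terms 1 and 4 only, plus small noise.
n_points, n_terms = 200, 8
library = rng.normal(size=(n_points, n_terms))
true_weights = np.zeros(n_terms)
true_weights[[1, 4]] = [1.5, -0.7]
target = library @ true_weights + 0.01 * rng.normal(size=n_points)

def fit_weights(mask):
    """Least-squares weights for the terms selected by a boolean mask."""
    weights = np.zeros(n_terms)
    if mask.any():
        weights[mask] = np.linalg.lstsq(library[:, mask], target, rcond=None)[0]
    return weights

def fitness(mask, sparsity_penalty=0.01):
    """Mean squared residual plus a penalty per active term (lower is better)."""
    residual = target - library @ fit_weights(mask)
    return np.mean(residual ** 2) + sparsity_penalty * mask.sum()

def mutate(mask):
    """Flip one randomly chosen term in or out of the candidate equation."""
    child = mask.copy()
    flip = rng.integers(n_terms)
    child[flip] = ~child[flip]
    return child

def evolve(pop_size=20, n_generations=150):
    """Elitist loop: every individual spawns one mutant, the best survive."""
    population = [rng.random(n_terms) < 0.5 for _ in range(pop_size)]
    for _ in range(n_generations):
        offspring = [mutate(individual) for individual in population]
        population = sorted(population + offspring, key=fitness)[:pop_size]
    return population[0]

best_mask = evolve()
best_weights = fit_weights(best_mask)
print(np.flatnonzero(best_mask), best_weights[best_mask])
```

The design choice illustrated here is the split of responsibilities: the discrete search decides which terms are present, and the continuous fit decides their weights.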
Synthetic data
a) Wave equation. The algorithm's performance is analyzed on synthetic data. This simplification makes it possible to show how the result responds to various types and magnitudes of noise, which are generally unknown for measurement data. As in the previous studies, the solution of the wave equation with two spatial variables, Eq. (14), in which the studied function (for example, a small out-of-plane membrane displacement) depends on time and two spatial coordinates, was taken as the synthetic
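The sensitivity of such a synthetic test to noise can be sketched as follows. Since Eq. (14) is not reproduced here, the sketch assumes the unit-speed form u_tt = u_xx + u_yy and a hypothetical standing-wave solution on the unit square; the grid sizes and the 1% Gaussian noise level are illustrative choices, not the settings of the experiments.

```python
import numpy as np

def second_derivative(field, step, axis):
    """Second derivative obtained by applying central differences twice."""
    first = np.gradient(field, step, axis=axis, edge_order=2)
    return np.gradient(first, step, axis=axis, edge_order=2)

def wave_residual_rms(u, dt, dx, dy):
    """RMS of u_tt - u_xx - u_yy over interior points of the grid."""
    residual = (second_derivative(u, dt, 0)
                - second_derivative(u, dx, 1)
                - second_derivative(u, dy, 2))
    inner = residual[3:-3, 3:-3, 3:-3]
    return float(np.sqrt(np.mean(inner ** 2)))

# Assumed analytic solution of u_tt = u_xx + u_yy (unit wave speed).
t = np.linspace(0.0, 1.0, 51)
x = np.linspace(0.0, 1.0, 51)
y = np.linspace(0.0, 1.0, 51)
T, X, Y = np.meshgrid(t, x, y, indexing="ij")
u_clean = np.cos(np.sqrt(2.0) * np.pi * T) * np.sin(np.pi * X) * np.sin(np.pi * Y)

# Additive Gaussian noise with standard deviation equal to 1% of the amplitude.
rng = np.random.default_rng(0)
noise_level = 0.01
u_noisy = u_clean + noise_level * np.max(np.abs(u_clean)) * rng.normal(size=u_clean.shape)

steps = (t[1] - t[0], x[1] - x[0], y[1] - y[0])
rms_clean = wave_residual_rms(u_clean, *steps)
rms_noisy = wave_residual_rms(u_noisy, *steps)
print(rms_clean, rms_noisy)
```

Even a small amount of noise is strongly amplified by repeated finite differencing, so the residual of the true equation grows by orders of magnitude. This is why noisy data need preprocessing before the derivatives are computed.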
EPDE framework description
The framework, encompassing the described method, is designed to let the user customize the algorithm's significant elements while providing a default pipeline and the necessary tools for differential equation discovery. Setting up an equation discovery experiment requires selecting the functions (tokens) that form the pool from which the algorithm creates candidate equations. The main element that has to be defined is the procedure for obtaining the function values on the set of processed points
Neural networks approximation with automatic differentiation
This section is dedicated to changing the differentiation method. Because the proposed algorithm has a modular structure, the differentiation algorithm can be replaced: instead of finite differences or analytical differentiation of polynomials, a neural network approximation followed by automatic differentiation can be used.
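The replacement of the differentiation step can be illustrated with a self-contained sketch. It makes several assumptions: a tiny tanh network with fixed random weights stands in for a network fitted to data, forward-mode automatic differentiation is implemented with dual numbers instead of a deep learning framework, and the names `Dual`, `network`, and `d_dx` are hypothetical. The automatic derivative is compared with a central finite difference of the same network.

```python
import math
import random

class Dual:
    """Number val + der*eps with eps**2 = 0; `der` carries the derivative."""
    def __init__(self, val, der=0.0):
        self.val = float(val)
        self.der = float(der)

    @staticmethod
    def lift(other):
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.val + other.val, self.der + other.der)
    __radd__ = __add__

    def __mul__(self, other):
        other = Dual.lift(other)
        return Dual(self.val * other.val,
                    self.val * other.der + self.der * other.val)
    __rmul__ = __mul__

def tanh(z):
    """tanh with its derivative propagated by the chain rule."""
    value = math.tanh(z.val)
    return Dual(value, (1.0 - value * value) * z.der)

def make_network(n_hidden=8, seed=0):
    """Fixed random weights; a stand-in for a network trained on the data."""
    rng = random.Random(seed)
    return [tuple(rng.uniform(-1.0, 1.0) for _ in range(4)) for _ in range(n_hidden)]

def network(t, x, params):
    """One hidden tanh layer mapping (t, x) to a scalar field value."""
    out = Dual(0.0)
    for w_t, w_x, bias, w_out in params:
        out = out + w_out * tanh(w_t * t + w_x * x + bias)
    return out

def d_dx(t0, x0, params):
    """Automatic derivative with respect to x: seed x with derivative 1."""
    return network(Dual(t0), Dual(x0, 1.0), params).der

def d_dx_fd(t0, x0, params, h=1e-5):
    """Central finite difference of the same network, for comparison."""
    plus = network(Dual(t0), Dual(x0 + h), params).val
    minus = network(Dual(t0), Dual(x0 - h), params).val
    return (plus - minus) / (2.0 * h)

params = make_network()
ad_value = d_dx(0.3, 0.7, params)
fd_value = d_dx_fd(0.3, 0.7, params)
print(ad_value, fd_value)
```

Because the derivative is taken of a smooth approximating function rather than of the raw samples, this route avoids differencing the data directly.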
Conclusion
The proposed method has proven suitable for the data-driven derivation of equations that can model various physical processes. The algorithm's robustness to noise in the input data, provided by improved data preprocessing, makes the framework applicable to real-world problems. Even in cases of substantial noise in the input data, the resulting equations had the correct structure and can therefore correctly describe the studied system. Other notable points about the
Declaration of interests
None.
Acknowledgements
This research is financially supported by the Russian Science Foundation, Agreement #19-71-00150.
References (26)
- et al., Data driven governing equations approximation using deep neural networks, J. Comput. Phys. (2019)
- et al., Data-driven discovery of pdes in complex datasets, J. Comput. Phys. (2019)
- et al., On the recovery of multiple flow parameters from transient head data, J. Comput. Appl. Math. (2004)
- et al., Multilayer feedforward networks are universal approximators, Neural Networks (1989)
- NSS Team, Fedot E* algorithms, https://github.com/ITMO-NSS-team/FEDOT.Algs...
- et al., Data-driven partial differential equations discovery approach for the noised multi-dimensional data
- J. Berg, K. Nyström, Neural network augmented inverse problems for pdes, arXiv preprint arXiv:1712.09685 (2017)....
- et al., Learning partial differential equations via data discovery and sparse optimization, Proc. Royal Soc. A: Math. Phys. Eng. Sci. (2017)
- S.H. Kang, W. Liao, Y. Liu, Ident: Identifying differential equations with numerical time evolution, arXiv preprint...
- et al., PDE-net: Learning PDEs from data, International Conference on Machine Learning (2018)
- Deep hidden physics models: Deep learning of nonlinear partial differential equations, J. Mach. Learn. Res.
- A review of surrogate models and their application to groundwater modeling, Water Resour. Res.
- Multi-fidelity surrogate models for flutter database generation, Comput. Fluids
Mikhail Maslyaev is an engineer and PhD student at the Nature System Simulation lab, National Centre for Cognitive Research, ITMO University, Russia. Mikhail is working on his thesis on an evolutionary algorithm for PDE discovery. He is the creator and maintainer of the EPDE framework repository.
Dr. Alexander Hvatov is a senior researcher at the Nature System Simulation lab, National Centre for Cognitive Research, ITMO University, Russia. Alex is mostly interested in classical mathematical models, such as partial differential equations, and in the methods of mathematical physics. Another direction of Alex's research is wave propagation in periodic structures.
Dr. Anna Kalyuzhnaya is the head of the Nature System Simulation lab, National Centre for Cognitive Research, ITMO University, Russia, and an assistant professor. Anna is mostly interested in statistical and machine learning methods applied in different fields of the natural and social sciences.
☆ The code (and data) in this article has been certified as Reproducible by Code Ocean (https://codeocean.com/). More information on the Reproducibility Badge Initiative is available at https://www.elsevier.com/physical-sciences-and-engineering/computer-science/journals.