Partial differential equations discovery with EPDE framework: Application for real and synthetic data

https://doi.org/10.1016/j.jocs.2021.101345

Abstract

Data-driven methods provide model creation tools for systems where the application of conventional analytical methods is restricted. The proposed method derives a partial differential equation (PDE) for the process dynamics from data, which supports simulation and study of the process. The paper describes the methods used within the EPDE (Evolutionary Partial Differential Equations) framework for partial differential equation discovery [1]. The framework combines evolutionary algorithms with sparse regression. Such an approach is more versatile than other commonly used data-driven methods of partial differential equation derivation because it makes fewer assumptions about the form of the resulting equation. This paper highlights the algorithm features that allow processing of noisy data, which is typical of the algorithm's real-world applications. This paper is an extended version of the ICCS-2020 conference paper [2].

Introduction

The ability to simulate complex processes despite a lack of knowledge about the system's underlying structure can be vital for developing models in such fields of science as biology, medicine, materials technology, and metocean studies. In contrast to deterministic physics-based models, which are developed by applying conservation laws to the studied process, data-driven modeling (DDM) builds complete models from measured fields that describe the process, using statistics and machine learning algorithms. Moreover, on some occasions, DDM can enhance existing physics-based models with supplementary expressions or refined weight values [3]. In fluid dynamics and hydrometeorology, the development of surrogate models is the most common application of data-driven algorithms.

This paper focuses on methods of data-driven differential equation discovery. Differential equations can, in some cases, be interpreted by an expert either in the application field or in differential equations themselves. Moreover, the well-developed methods of mathematical physics for the analysis of differential equations can help interpret the discovered equations. Most current algorithms apply sparse regression to a prescribed library of differential terms [4], [5]. The second popular line of study is neural network algorithms for differential equation discovery [6], [7], [8].
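To make the idea of sparse regression over a prescribed library of differential terms concrete, the following is a minimal sketch, not the method of [4], [5] or of EPDE. It assumes sequentially thresholded least squares as the sparsification rule, a hand-picked library (u, u_x, u_xx, u*u_x), and noise-free samples of an analytical heat-equation solution with exact derivatives. All function names are hypothetical and chosen for illustration only.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def least_squares(columns, target):
    """Ordinary least squares through the normal equations."""
    gram = [[sum(p * q for p, q in zip(ci, cj)) for cj in columns]
            for ci in columns]
    rhs = [sum(p * q for p, q in zip(ci, target)) for ci in columns]
    return solve_linear(gram, rhs)


def sparse_fit(library, target, threshold=0.1, rounds=5):
    """Refit repeatedly, dropping terms whose weight is below the threshold."""
    names = list(library)
    active = names[:]
    coeffs = {}
    for _ in range(rounds):
        weights = least_squares([library[n] for n in active], target)
        coeffs = dict(zip(active, weights))
        kept = [n for n in active if abs(coeffs[n]) >= threshold]
        if kept == active:
            break
        active = kept
        if not active:
            break
    return {n: (coeffs[n] if n in active else 0.0) for n in names}


def sample_fields(nx=24, nt=12):
    """Sample u = e^{-t} sin x + e^{-4t} sin 2x, which satisfies u_t = u_xx.

    Derivatives are written analytically here (an assumption that keeps the
    sketch short); real data would need numerical differentiation.
    """
    u, u_t, u_x, u_xx = [], [], [], []
    for k in range(nt):
        t = k / (nt - 1)
        a, b = math.exp(-t), math.exp(-4.0 * t)
        for i in range(nx):
            x = 2.0 * math.pi * i / nx
            u.append(a * math.sin(x) + b * math.sin(2 * x))
            u_t.append(-a * math.sin(x) - 4 * b * math.sin(2 * x))
            u_x.append(a * math.cos(x) + 2 * b * math.cos(2 * x))
            u_xx.append(-a * math.sin(x) - 4 * b * math.sin(2 * x))
    return u, u_t, u_x, u_xx


u, u_t, u_x, u_xx = sample_fields()
library = {
    "u": u,
    "u_x": u_x,
    "u_xx": u_xx,
    "u*u_x": [p * q for p, q in zip(u, u_x)],
}
coefficients = sparse_fit(library, u_t)
print(coefficients)
```

The sketch also shows the limitation that motivates less restrictive approaches: the regression can only return equations whose terms were placed in the library beforehand.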

We consider discovered models as the surrogate models that could be applied to the hydrometeorological examples. Various approaches to surrogate modeling are described below, including differential equations discovery.

The modern surrogate models tend to belong to one of three major groups [9]:

  • Data-driven empirical approximations of the deterministic model outputs. These models use conclusions obtained with the statistical or machine learning tools (response surfaces, kriging) applied to the data.

  • Reduced-order models are based on the projection of the model's main equations to the subspace with the reduced dimensionality, using various orthogonal decompositions.

  • Multifidelity models simplify the representation of the complex physics of the modeled process by omitting the less significant subprocesses or increasing the model's scale. In some cases, the experimental setup requires applying models with different fidelity levels to evaluate processes at multiple scales or to build a model ensemble [10], [11].

In this research, we are interested in developing a new approach that belongs to the first class of models. However, applications in the natural sciences require the model to be robust and to work in a high-dimensional space in order to handle spatio-temporal and other types of variability. Moving from the single spatial dimension usually considered in the literature to higher spatial dimensions requires the algorithm to handle exponentially growing noise levels.

In previous work [12] we described the EPDE (Evolutionary Partial Differential Equations) approach, which can provide a flexible yet efficient tool for data-driven equation derivation. This work increases the problem's difficulty by introducing higher-dimensional cases and high-magnitude noise in the data.

This version extends the conference paper [2] and introduces a series of experiments that allow a better comparison of the EPDE framework with its analogs. The modular structure of the algorithm, briefly described in Section 6, makes it possible, for example, to use a differentiation scheme other than finite differences. We demonstrate this using neural networks and automatic differentiation in Section 7.

This paper is organized as follows: Section 2 briefly introduces the existing surrogate modeling approaches. Section 3 describes the problem of data-driven PDE discovery, and Section 4 describes the practical realization. In Section 5, numerical examples on synthetic and real data are shown. Section 6 presents the additions to the method described in the previous article [12], which allow it to deal with higher-dimensional data-driven PDE discovery. Section 7 illustrates the modular structure and presents experiments in which the differentiation method is replaced with a neural network approximation. Section 8 concludes the paper.

Related work

The first examples of data-driven surrogate modeling in hydrometeorology appeared in its earliest stages, with the understanding that the contemporary full-scale models required computational power inaccessible to many research teams. The original approaches were based on pattern scaling, that is, the extension of the present trend obtained from an ensemble of full-scale models [13], [14]. The statistical emulation on the base of an ensemble of pre-computed deterministic models has

Problem statement

The class of problems that the described EPDE algorithm can solve can be summarized as follows: a process involving a scalar field u occurs in the area Ω and is governed by the partial differential equation (1). However, there is no a priori information about the dynamics of the process except that some form of PDE can describe it (for simplicity, we consider the case of a temporally varying 2D field, even though the problem could be formulated for an arbitrary field). In recent

Method description

In this section, the details of the evolutionary method of partial differential equation derivation are described. The proposed method combines evolutionary algorithms and sparse regression to detect the equation structure. The sparse regression aims to construct the set of equation terms, while the evolutionary algorithm focuses on selecting significant terms from the created set and calculating the weights that will be present in the resulting equation. At first, we introduce the

Synthetic data

a) Wave equation. The algorithm's performance is analyzed on synthetic data. This simplification shows how the result responds to various types and magnitudes of noise, which are generally unknown for measured data. As in the previous studies, the solution of the wave equation with two spatial variables, Eq. (14), where t is time, x and y are spatial coordinates, u is the studied function (for example, small out-of-plane membrane displacement), and α1=α2=1, was taken as the synthetic

EPDE framework description

The framework that encompasses the described method is designed to let the user customize the algorithm's significant elements while providing a default pipeline and the necessary tools for differential equation discovery. Setting up an equation discovery experiment requires selecting the functions (tokens) that form the pool from which the algorithm creates candidate equations. The main element that has to be defined is obtaining the function values on the set of processed points

Neural networks approximation with automatic differentiation

This section is dedicated to changing the differentiation method. The proposed algorithm has a modular structure. Thus, we may replace finite differences or analytical differentiation of polynomials with a neural network approximation followed by automatic differentiation.
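The swap of differentiation modules described above can be illustrated with a small, self-contained sketch. It is not the EPDE API: the function names are hypothetical, the "network" is a tiny tanh model with hand-set (untrained) weights standing in for a fitted approximation, and forward-mode automatic differentiation with dual numbers stands in for the autodiff engine of a deep learning library.

```python
import math


class Dual:
    """Number val + der*eps with eps**2 == 0; der carries the derivative."""

    def __init__(self, val, der=0.0):
        self.val = val
        self.der = der

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    __rmul__ = __mul__


def _lift(x):
    """Promote a plain number to a Dual with zero derivative part."""
    return x if isinstance(x, Dual) else Dual(float(x))


def tanh(x):
    """tanh that propagates derivatives when given a Dual."""
    if isinstance(x, Dual):
        t = math.tanh(x.val)
        return Dual(t, (1.0 - t * t) * x.der)
    return math.tanh(x)


# Hypothetical hand-set weights of a one-hidden-layer network (assumption:
# in practice these would come from fitting the network to the data).
W1 = [0.8, -1.3, 0.5]
B1 = [0.1, 0.4, -0.2]
W2 = [1.2, 0.7, -0.9]


def surrogate(x):
    """Smooth approximation of the field along one coordinate."""
    total = 0.0
    for w1, b1, w2 in zip(W1, B1, W2):
        total = total + w2 * tanh(w1 * x + b1)
    return total


def diff_finite(f, xs, h=1e-5):
    """Central finite differences on the given points."""
    return [(f(x + h) - f(x - h)) / (2 * h) for x in xs]


def diff_autodiff(f, xs):
    """Forward-mode automatic differentiation via dual numbers."""
    return [f(Dual(x, 1.0)).der for x in xs]


def derivative_on_grid(f, xs, differentiator):
    """The pluggable step: any callable with this signature can be used."""
    return differentiator(f, xs)


grid = [i * 0.1 for i in range(11)]
fd = derivative_on_grid(surrogate, grid, diff_finite)
ad = derivative_on_grid(surrogate, grid, diff_autodiff)
max_gap = max(abs(a - b) for a, b in zip(fd, ad))
print("largest difference between the two modules:", max_gap)
```

Because both modules share one call signature, the rest of a pipeline does not need to know which one produced the derivative values. On a smooth surrogate the two agree closely; the practical difference appears on noisy data, where differentiating a fitted smooth model avoids differencing the noise directly.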

Conclusion

The proposed method has proven to be suitable for the data-driven derivation of equations that can model various physical processes. The robustness of the algorithm to noise in the input data, provided by improved data preprocessing, makes the framework applicable to real-world problems. Even in cases of substantial noise in the input data, the resulting equations had the correct structure and can therefore correctly describe the studied system. Other notable points about the

Declaration of interests

None.

Acknowledgements

This research is financially supported by The Russian Scientific Foundation, Agreement #19-71-00150.

Mikhail Maslyaev is an engineer and PhD student of Nature System Simulation lab, National Centre for Cognitive research, ITMO University, Russia. Mikhail works on his thesis on an evolutionary algorithm for PDE discovery. Mikhail is a creator and maintainer of EPDE framework repository.

References (26)

  • M. Raissi. Deep hidden physics models: Deep learning of nonlinear partial differential equations. J. Mach. Learn. Res. (2018)

  • M.J. Asher et al. A review of surrogate models and their application to groundwater modeling. Water Resour. Res. (2015)

  • M.P. Rumpfkeil et al. Multi-fidelity surrogate models for flutter database generation. Comput. Fluids (2020)
Dr. Alexander Hvatov is a senior researcher of Nature System Simulation lab, National Centre for Cognitive research, ITMO University, Russia. Alex is mostly interested in classical mathematical models such as partial differential equations and in mathematical physics methods. Another direction of Alex's research is wave propagation in periodic structures.

Dr. Anna Kalyuzhnaya is the head of Nature System Simulation lab, National Centre for Cognitive research, ITMO University, Russia, and an assistant professor. Anna is mostly interested in statistical and machine learning methods applied in different fields of natural and social sciences.

The code (and data) in this article has been certified as Reproducible by Code Ocean (https://codeocean.com/). More information on the Reproducibility Badge Initiative is available at https://www.elsevier.com/physical-sciences-and-engineering/computer-science/journals.
