Elsevier

Engineering Geology

Volume 315, 20 March 2023, 107031

Physics-guided genetic programming for predicting field-monitored suction variation with effects of vegetation and atmosphere

https://doi.org/10.1016/j.enggeo.2023.107031

Highlights

  • Propose a physics-guided genetic programming method for more accurate calculation of field-monitored soil suction.

  • Comparative and uncertainty analyses indicate the reliability and validity of the proposed method.

  • Perform global sensitivity analysis to provide the importance ranking of selected influencing parameters.

  • Provide a new way of integrating analytical solution and machine learning for solving complicated engineering problems.

Abstract

The complicated interactions among shallow soil, vegetation, and atmospheric parameters make the precise prediction of field-monitored soil suction under natural conditions challenging. This study integrated an analytical solution with a genetic programming (GP) model to propose a physics-guided GP method for better calculation and prediction of field-monitored matric suction in a shallow soil layer. Model development and analysis involved 3987 collected data values for soil suction as well as atmospheric and tree-related parameters from a field monitoring site. Natural logarithm values of transpiration rates obtained by back-calculation were simulated with GP using easily obtained parameters. Global sensitivity analysis demonstrated that the tree canopy-related parameter was the most important for transpiration rate. The proposed physics-guided GP method greatly improved calculation accuracy and was more reliable than the individual GP method in calculating field-monitored suction. Quantile regression uncertainty analysis also showed the proposed physics-guided GP method to be more stable and reliable than the individual GP method, with smaller uncertainty and a higher confidence level.

Introduction

A major challenge in implementing unsaturated soil mechanics involves identifying indirect techniques for estimating unsaturated soil property functions (Fredlund, 2019). As a key parameter in many property functions of unsaturated soil, soil suction plays an essential role in evaluating the performance of geotechnical infrastructures. For example, landslides, a typical engineering problem, usually involve a decrease in soil suction triggered by rainfall under natural conditions; in such cases, any safety assessment of engineering infrastructure requires determining the actual soil suction (Augusto Filho and Fernandes, 2019; Chang et al., 2020; Crawford et al., 2019; Guo et al., 2021; Sattler et al., 2021). In this study, unless otherwise specified, soil suction refers to soil matric suction. The use of common vegetation to control water and soil erosion in green geotechnical infrastructures is a frequently employed and environmentally friendly solution. Consequently, matric suction in shallow soil layers varies because of complex interactions involving vegetation-related factors as well as atmospheric parameters in green geotechnical infrastructures. Accordingly, considering complicated soil-vegetation-atmosphere interactions is a necessary component of calculating and predicting soil suction variations in situ and can provide significant insight into the mechanisms involved.

The theory of soil suction arose from investigations focused on the soil-water-plant system, and the concept of soil suction in soil physics was developed in the early 1900s (Fredlund and Rahardjo, 1993). Soil suction can be defined as “the free energy state of soil water,” which clearly establishes that soil suction is controlled by soil moisture. Additionally, some soil physical parameters, such as void ratio, density, and particle size distribution, also have a noteworthy influence on soil suction (Chiandussi et al., 2012; Doncieux et al., 2015; Gao and Sun, 2017; Garg et al., 2021; Johari et al., 2006; Johari and Hooshmand Nejad, 2015; Xu et al., 2020; Zeng et al., 2022; Zhai et al., 2020). Therefore, soil suction can be depicted as a function of soil moisture, as demonstrated by the soil water characteristic curve (SWCC). Soil suction variation primarily comprises two processes: the drying process, involving the removal of water from the soil, and the wetting process, related to water infiltration into the soil. Both can be triggered by vegetation-related and atmospheric factors under natural conditions. Various analytical and numerical solutions have been established to date for calculating soil suction while taking the effects of multiple influential factors into consideration. Furthermore, the ongoing development of computer techniques promotes the application of machine learning (ML) approaches in different research fields, and some ML methods are available for analyzing the interactions between soil, vegetation, and the atmosphere. However, scholars commonly acknowledge that each individual approach has unique advantages and disadvantages.
In general, the analytical solution provides a strict, physically meaningful description of the relationships between different parameters, but it is challenging to apply in a complex system because of the assumptions and simplifications that must be made and, in some cases, the lack of understanding of one or more of the associated physical processes. Numerical simulation integrated with the finite element method or finite difference method is usually easier to apply than the analytical solution in solving various engineering problems. However, some required input parameters in the numerical simulation are difficult to measure or determine in the field; furthermore, the whole process is time-consuming. In recent decades, with the advent of the era of big data and the Internet of Things, ML has attracted much attention due to its ability to evolve a numerical model automatically for the purpose of exploring an incomprehensible or even previously unknown interaction mechanism among multi-dimensional and multi-source parameters in a complex system. However, the ML approach faces various challenges, including model generalization, interpretability, and repeatability. These observations show that each individual approach has limitations in solving complex problems. Consequently, some scholars have combined different methods in developing models designed to analyze complicated problems, taking advantage of the methods' complementary features to create a more user-friendly and reliable approach. Among these combined methods, the physics-guided ML approach incorporates the respective advantages of scientific theories and ML in providing more reliable and efficient solutions for a variety of complicated problems. The prospect of combining ML methods with an analytical solution has attracted much scholarly attention, and researchers have applied this technique in many research fields.
As an example of an effective application of the complementary advantages of multiple methods, the physics-guided ML approach is likely to enjoy ready acceptance. To the authors' knowledge, none of the currently available studies have focused on estimating or predicting field-monitored soil suction by integrating an analytical solution with the ML method. Thus, exploring the potential of physics-guided ML in the context of some complex problems, such as field-monitored soil suction variation, is a worthwhile endeavor. That said, regardless of the nature of the combination model, the key is to find the “pain point” of the analytical solution for a multifaceted problem. As one example of an analytical solution for soil suction variation that takes other factors into account, Ng et al. (2015) established an analytical solution for suction calculation that incorporates the effect of root water uptake. In the calculations for this analytical solution, several critical parameters had to be determined precisely to ensure the calculation accuracy of soil suction, such as the ground water table, desaturation coefficient, saturated permeability, and transpiration rate. Among them, transpiration rate is a key parameter that must be determined accurately. The Penman–Monteith equation can be used to calculate the transpiration once some vegetation- and atmosphere-related parameters are known. That said, the reliability of this equation is primarily influenced by canopy resistance parameterization and the accuracy of the input variables that determine the magnitude of net radiation, and the uncertainties of those two key parameterizations, which are limited by the experimental setup and required relation model, can cause a large computation error in terms of transpiration (Langensiepen et al., 2009). Additionally, Merta et al. (2001) pointed out the difficulty of quantifying the transpiration rate because of the influences imposed by the atmosphere, the soil, and the plants. A plant physiological method was proposed for estimating transpiration, but it required a series of complex measurements for some plant-related parameters. Accordingly, the transpiration rate can be recognized as a major “pain point” for engineers in a determination of field-monitored soil suction that includes the effects of atmospheric and vegetation-related factors. Atmospheric conditions are the main factor influencing the transpiration rate (Wang et al., 2018), and they are relatively easy to measure. From the perspective of physics-guided ML, one combination approach is to replace a difficult-to-determine key parameter of an analytical solution with an ML model. Therefore, ML can be used to simulate the transpiration rate using several easily obtained parameters. The analytical solution, supplied with the transpiration rate from this ML model, can then be used to obtain a solution. In particular, among the available ML methods, genetic programming (GP) appears to be a good choice for integration with an analytical solution because it can automatically generate an explicit mathematical formula with no assumptions regarding the model structure, which makes the proposed solution more interpretable.
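To illustrate why canopy resistance parameterization matters for the Penman–Monteith equation mentioned above, the following minimal sketch evaluates the standard combination form of that equation for two canopy resistance values. The function name, argument names, and all numerical inputs are illustrative assumptions and are not taken from the study or its field data.

```python
def penman_monteith_latent_heat(delta, r_net, g_soil, rho_air, cp_air,
                                vpd, r_a, r_s, gamma=0.066):
    """Latent heat flux (W/m^2) from the Penman-Monteith combination equation.

    delta   : slope of the saturation vapour pressure curve (kPa/degC)
    r_net   : net radiation (W/m^2)
    g_soil  : soil heat flux (W/m^2)
    rho_air : air density (kg/m^3)
    cp_air  : specific heat of air (J/(kg degC))
    vpd     : vapour pressure deficit (kPa)
    r_a     : aerodynamic resistance (s/m)
    r_s     : canopy (surface) resistance (s/m)
    gamma   : psychrometric constant (kPa/degC), typical near-sea-level value
    """
    numerator = delta * (r_net - g_soil) + rho_air * cp_air * vpd / r_a
    denominator = delta + gamma * (1.0 + r_s / r_a)
    return numerator / denominator


# Illustrative inputs only (assumed, not measured values from the paper).
common = dict(delta=0.19, r_net=400.0, g_soil=40.0, rho_air=1.2,
              cp_air=1013.0, vpd=1.0, r_a=50.0)

low_resistance = penman_monteith_latent_heat(r_s=50.0, **common)
high_resistance = penman_monteith_latent_heat(r_s=200.0, **common)

# With all other inputs held fixed, a larger canopy resistance lowers the flux.
print(low_resistance > high_resistance)  # True
```

Because the canopy resistance appears in the denominator, an error in its parameterization propagates directly into the computed flux, which is consistent with the difficulty the text attributes to this approach.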

This study therefore proposes a physics-guided GP (PGGP) method for estimating field-monitored suction variation with the effects of various influential factors. The following discussion will clarify the rationale and working procedure of applying this PGGP method for the calculation of field-monitored soil suction. Model development and analysis employed data drawn from a database established from a field monitoring test. Global sensitivity analysis was performed to rank the importance of influential parameters on the transpiration rate. Lastly, the efficiency and reliability of the proposed PGGP method for calculating field-monitored soil suction were validated through a performance and uncertainty evaluation via comparison with the results from the individual GP method.
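The combination idea described above, replacing a hard-to-determine parameter of an analytical solution with a data-driven expression, can be sketched as follows. This is a structural sketch only: `toy_surrogate` and `toy_analytical_solution` are hypothetical placeholders, not the GP-evolved formula or the analytical solution of Ng et al. (2015). The only element taken from the source is that the GP target is the negative natural logarithm of the transpiration rate, -ln T.

```python
import math
from typing import Callable, Sequence


def transpiration_from_target(neg_ln_t: float) -> float:
    """Invert the modeling target y = -ln(T) to recover the transpiration rate T."""
    return math.exp(-neg_ln_t)


def pggp_predict(inputs: Sequence[float],
                 surrogate: Callable[[Sequence[float]], float],
                 analytical_solution: Callable[[float], float]) -> float:
    """Chain a data-driven surrogate for -ln(T) with a physics-based solution."""
    neg_ln_t = surrogate(inputs)                      # ML part: easily measured inputs -> -ln(T)
    transpiration = transpiration_from_target(neg_ln_t)
    return analytical_solution(transpiration)         # physics part: T -> suction


# Hypothetical placeholder standing in for an explicit GP-evolved formula.
def toy_surrogate(inputs: Sequence[float]) -> float:
    return 5.0 + 0.1 * inputs[0] - 0.02 * inputs[1]


# Hypothetical placeholder standing in for the analytical suction solution.
def toy_analytical_solution(transpiration: float) -> float:
    return 20.0 + 1.0e3 * transpiration


suction = pggp_predict([1.0, 2.0], toy_surrogate, toy_analytical_solution)
print(suction)
```

Keeping the surrogate and the analytical solution as separate callables means the data-driven part can be retrained or replaced without touching the physics-based part.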

Section snippets

The theory-guided data science

The rapid development of information and communication technology has given rise to the advent of the era of big data, facilitating the exploration of uncharted territories or the development of feasible ways to find solutions for serious challenges in various research fields. That exploration can be regarded as a part of data science development. Specifically, data science is defined as a “concept to unify statistics, data analysis, informatics, and their related methods” in order to

Field monitoring

A site covered with trees and grass on the campus of the University of Macau was selected for the field monitoring test to explore the soil parameters' variations while taking into account the effects of plant and weather conditions. The campus is located on Hengqin Island in Zhuhai, China. According to the latest statistics from the Bureau of Geophysics and Meteorology (SMG) (2016), the air temperature varies from 5 °C to 36.1 °C, and the mean relative air humidity is 80% in Macau. The name of

Modeling results

From the total selected data for model development, 3190 data sets (80% of the total data) were taken as the training data to build the GP model, and the remaining data, comprising 797 data sets (20% of the total data), were used as the testing data to validate the reliability of the obtained multivariate models. The parameter settings used in the GP method for model development are shown in Table 2. The probability rates for crossover, mutation, and reproduction were set at 85%, 10%, and 5%,
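The 80/20 partition and the operator probabilities quoted in this snippet can be sketched as below. The snippet does not say how records were assigned to the two sets, so the seeded random shuffle is an assumption, and the function and variable names are hypothetical.

```python
import random


def split_indices(n_total, train_fraction=0.8, seed=0):
    """Shuffle record indices and split them into training and testing sets."""
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)  # assumed random assignment, seeded for repeatability
    n_train = round(n_total * train_fraction)
    return indices[:n_train], indices[n_train:]


# Operator probabilities as stated in the text.
OPERATOR_PROBABILITIES = {"crossover": 0.85, "mutation": 0.10, "reproduction": 0.05}


def pick_operator(rng):
    """Draw one genetic operator according to the stated probabilities."""
    names = list(OPERATOR_PROBABILITIES)
    weights = list(OPERATOR_PROBABILITIES.values())
    return rng.choices(names, weights=weights, k=1)[0]


train_idx, test_idx = split_indices(3987)
print(len(train_idx), len(test_idx))  # 3190 797
```

Rounding 80% of 3987 records gives the 3190 training sets reported in the text, leaving 797 for testing.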

Calculation accuracy

As previously described, the proposed PGGP method integrated the GP equation with the analytical solution to calculate field-monitored soil suction (Fig. 14). The PGGP method incorporated all of the advantages of the analytical solution and GP. Applying the PGGP method to calculate and predict field-monitored soil suction required determining several other basic physical soil parameters in addition to the selected influential parameters for the transpiration rate, such as desaturation

Conclusions

A PGGP solution was proposed for calculating and predicting field-monitored soil suction while including the effects of vegetation-related and atmospheric factors. In all, 3987 data values collected from a field monitoring test were used for model development through GP to simulate the transpiration rate, a key parameter in suction calculation using an analytical solution. The target variable -ln T was initially obtained by back-calculation based on one analytical solution. Next, GP modeling was

CRediT authorship contribution statement

Zhi-Liang Cheng: Writing – original draft, Writing – review & editing, Methodology, Formal analysis, Validation. K.K. Pabodha M. Kannangara: Writing – review & editing, Supervision. Li-Jun Su: Investigation. Wan-Huan Zhou: Conceptualization, Writing – review & editing, Supervision. Chen Tian: Data curation.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

The authors would like to acknowledge the support of the Guangdong Provincial Department of Science and Technology (Grant No. 2022A0505030019), the Science and Technology Development Fund of Macau SAR (Grant Nos. SKL-IOTSC(UM)-2021–2023), and the University of Macau (No. MYRG2018-00173-FST).

References (38)

  • O. Augusto Filho et al. Landslide analysis of unsaturated soil slopes based on rainfall and matric suction data. Bull. Eng. Geol. Environ. (2019)

  • G. Carleo et al. Machine learning and the physical sciences. Rev. Mod. Phys. (2019)

  • Z. Chang et al. Study on the creep behaviours and the improved Burgers model of a loess landslide considering matric suction. Nat. Hazards (2020)

  • Z.L. Cheng et al. Estimation of spatiotemporal response of rooted soil using a machine learning approach. J. Zhejiang Univ.: Sci. A (2020)

  • Z.L. Cheng et al. Genetic programming model for estimating soil suction in shallow soil layers in the vicinity of a tree. Eng. Geol. (2020)

  • G. Chiandussi et al. Comparison of multi-objective optimization methodologies for engineering applications

  • A. Danandeh Mehr et al. Season Algorithm-multigene genetic programming: a new approach for Rainfall-Runoff modelling. Water Resour. Manag. (2018)

  • S.D. Day et al. At the Root of it. 20–22

  • S. Doncieux et al. Multi-objective analysis of computational models
