Scour depth modelling by a multi-objective evolutionary paradigm

https://doi.org/10.1016/j.envsoft.2010.10.013

Abstract

Local scour modelling is an important issue in environmental engineering, both to prevent degradation of the river bed and to safeguard the stability of grade-control structures. Many empirical formulations for predicting the equilibrium scour depth, which is usually assumed to be representative of the phenomenon, can be found in the literature. Most of these empirical equations have been constructed by applying regression procedures to experimental data, usually laboratory observations (thus from small/medium-scale experiments). Laboratory data are more accurate measurements, but they are generally not fully representative of real-world conditions, which are often much more complex than those schematized by laboratory equipment. This is the main reason why some of the literature expressions have proved inadequate in large-scale practical applications. This work deals with the application of an evolutionary modelling paradigm, named Evolutionary Polynomial Regression (EPR), to this problem. The technique was originally presented as a classical approach, used to obtain a single model for each analysis, and has recently been extended with a multi-modelling approach (i.e., one that returns a set of optimal candidate solutions/models), in which a multi-objective genetic algorithm is used to obtain models that are optimal in terms of parsimony of the mathematical expression vs. fitting to data. A wide database of field and laboratory observations is used to predict the equilibrium scour depth as a function of a set of variables characterizing the flow, the sediments and the dimensions of the grade-control structure. Results are discussed with reference to two regressive models available in the literature that have been trained on the same data used for EPR. The proposed modelling paradigm proved to be a useful tool for data analysis and, in this particular case study, was able to find feasible explicit models characterized by appreciable generalization performance.

Introduction

Analysis and modelling of the interaction between engineering structures and environmental systems are key to an appropriate and effective design of these structures that also prevents excessive degradation of the environment. This manuscript considers one effect of Grade-Control Structures (GCSs): the local scour of alluvial bed channels. Structures of this kind (aprons, spillways, bed sills, weirs, check dams, etc.) are intended to limit the excessive degradation of streams and rivers, while reducing their slope, limiting solid transport, and reducing flow velocity and bank erosion. Their positive effects are, however, counterbalanced by local scouring, which occurs downstream of them and can affect the stability of the structures themselves and, eventually, of neighbouring structures such as bridge piers and embankments.

Local scouring is caused by the presence of the structure itself, which modifies the water flow directions and generates jets whose velocity and impact angle differ markedly from those in the main stream. As a result, shear stresses are exerted on the soil particles of the river bed downstream of the structure (see Fig. 1), causing their removal. It is worth noting that under live-bed conditions (i.e., those that usually occur in real-world cases) bed material is transported from upstream; the scour depth increases rapidly and, owing to the interaction between erosion and deposition, tends to fluctuate around an equilibrium value (D’Agostino and Ferro, 2004). The consequence of this phenomenon is the creation of a “scour pool” (also called an impinging pool), which is often characterized by its maximum depth. Therefore, the amount of erosion can be directly correlated to the maximum depth of the scour pool, henceforth referred to as the “equilibrium scour depth”.

This brief description of the phenomenon helps in listing the components that play a role in its definition: the dimensional features of the structure, the geological and geophysical features of the channel bed, and the hydraulics of the flow, without omitting the temporal evolution of the phenomenon (Pagliara et al., 2008). The resulting picture is that of a very complex phenomenon, owing to the wide range of physical parameters that need to be defined/measured for its complete description.

For these reasons, researchers have mainly concentrated their efforts on developing empirical formulæ based on laboratory (usually) and field (rarely) measurements to predict the equilibrium (or quasi-equilibrium) scour depth under various flow conditions and structure configurations, such as those reported in Veronese (1937), Mason and Arumugam (1985), Hoffmans (1998) and Dargahi (2003). In general, these are regressive models defined on their authors’ own experimental observations; they are therefore mainly effective on data that fall within the range of the training data, and often show problems when applied outside this range (e.g., in real-scale applications). This is basically due to the difference in the scale of representation (from laboratory scale to real-world scale), the presence of errors in the field data, and the unavailability in real-scale contexts of some input variables required by the model, which must then be estimated (from tabulated values, for example). This, in turn, introduces uncertainty/errors (Ettema et al., 2000, D’Agostino and Ferro, 2004, Azmathullah et al., 2005).

In an attempt to define a more general formulation for equilibrium scour depth prediction, Bormann and Julien (1991) theoretically derived an equation based on the concepts of jet diffusion and particle stability in scour pools downstream of grade-control structures, and tested the equation on prototype experiments. Along the same lines, researchers have more recently preferred to use dimensionless variables to characterize the phenomenon, in order to overcome scaling problems. For example, D’Agostino and Ferro (2004) studied the scour pattern downstream of a grade-control structure using dimensionless groups, by applying the incomplete self-similarity theory. They tested the procedure on published and unpublished data collected at different scales and characterized by different bed grain-size distributions, producing two relationships for predicting the maximum scour depth.

In recent years, in light of the growing availability of computational power, some researchers have tried to assess local scouring by means of purely data-driven approaches, such as artificial neural networks (Liriano and Day, 2001, Azmathullah et al., 2005, Guven and Gunal, 2008) and fuzzy logic (Uyumaz et al., 2006, Azamathulla et al., 2009, Farhoudi et al., 2010). All these studies aimed at increasing the generalization capacity of the returned models, that is, their performance on unseen data (i.e., outside their training range). However, limitations of purely data-driven approaches affected the final results in a few ways. (i) The relationship between the explanatory variables and the scour depth is sought as a single mathematical expression, under prior assumptions about the influencing factors and its mathematical form, which motivates the adoption of “trial & error” approaches to select good models. (ii) More often than not, the retrieved expression/model (e.g., an artificial neural network) is not parsimonious in terms of the number of variables involved and the complexity of its mathematical structure; it is thus quite accurate on training data but shows poor generalization ability when applied to real-scale problems. In fact, unnecessary complexity in such models is often a symptom of over-fitting to experimental data and of scarce generalization capability; moreover, complex formulas are usually difficult to evaluate from a physical point of view. (iii) The modelling strategy usually provides one model, with the maximization of fitting to data as the single objective and without explicitly accounting for model parsimony (useful for its generalization ability). Azamathulla et al. (2010) applied the Genetic Programming (GP) modelling paradigm (Babovic and Keijzer, 2000). GP provides explicit formulations of data models (symbolic structures of models) by performing a global exploration of the model space, although this approach shows some drawbacks. For example, it is not very powerful in finding constants and, more importantly, it tends to produce models with very complex formulations (Giustolisi and Savic, 2006), which are formally symbolic structures but are difficult to analyze.

In summary, all models are supported by physical knowledge of the phenomenon, which has guided scientists in collecting data and defining the mathematical structure of the model expression. Such models have usually been calibrated by numerical regression or alternative data-driven techniques using measured data. This undoubtedly affects the portability of the formulas to different data, which can often be incomplete or affected by errors. For example, even leaving aside data from different scales, a model over-fitted on laboratory data may show poor accuracy when applied to data from the same scale but affected by errors. Therefore, when modelling local scouring by fitting data (maximization of accuracy), it can be useful to introduce another objective that controls the complexity of the resulting mathematical expression, thus leading to improved generalization of the model. In fact, the so-called principle of parsimony (Ockham’s razor, William of Ockham, 1300–1349) states that, for a set of otherwise equivalent models of a given phenomenon, one should choose the simplest one to explain a dataset; this in turn helps prevent over-fitting to training data (Young et al., 1996, Crout et al., 2009). However, this modelling strategy calls for a trade-off between model complexity and fitness to data while developing the model itself.

The paper focuses on the implementation of a modelling strategy recently proposed by Giustolisi and Savic (2006, 2009): Evolutionary Polynomial Regression (EPR). It is a hybrid modelling technique which combines linear regression and evolutionary search for mathematical model structures. A multi-objective evolutionary optimization paradigm is employed to find a set of mathematical expressions that can be assumed as “optimal” given some objectives according to the Pareto dominance criterion (Pareto, 1896). EPR is used here to model the equilibrium scour depth downstream of grade-control structures by analyzing a large database, ranging from real observations to laboratory-scale data (D’Agostino and Ferro, 2004, Bormann and Julien, 1991). EPR generates a number of eligible models for predicting the equilibrium scour depth (i.e., the Pareto front of models) which have symbolic/explicit formulations, thus allowing physical insight into the process under study (in this case, local scouring) to be evaluated. EPR has already proved reliable in many applications in civil and environmental engineering. For further details on EPR applications the reader is referred to the EPR (2009) webpage.
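The Pareto dominance criterion mentioned above can be illustrated with a minimal Python sketch. It assumes each candidate model is summarised by two objectives to be minimised, the number of terms (parsimony) and a fitting error (accuracy); the function names and the numerical values are hypothetical and are not taken from the paper or from any EPR software.

```python
from typing import List, Tuple

# Hypothetical summary of a candidate model: (number of terms, fitting error).
# Both objectives are to be MINIMISED. Values below are made up for illustration.
Candidate = Tuple[int, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that are not dominated by any other candidate."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates if other != c)
    ]


candidates = [(1, 0.90), (2, 0.50), (2, 0.70), (3, 0.45), (4, 0.45), (5, 0.30)]
front = sorted(pareto_front(candidates))
print(front)  # [(1, 0.9), (2, 0.5), (3, 0.45), (5, 0.3)]
```

In this toy set, (2, 0.70) is discarded because (2, 0.50) fits better with the same number of terms, and (4, 0.45) is discarded because (3, 0.45) is simpler with the same error. The remaining models form the trade-off curve from which an analyst would choose.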

Section snippets

Introduction to evolutionary polynomial regression

EPR can be defined as a non-linear global stepwise regression, providing symbolic formulæ of models. It is global since the search for optimal model structure is based on the exploration of the entire space of models by leveraging a flexible coding of model structure. Moreover, EPR generalizes the original stepwise regression of Draper and Smith (1998) by considering non-linear model structures (i.e., pseudo-polynomials) although they are linear with respect to regression parameters.
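As a hedged illustration of what "non-linear in the inputs but linear in the regression parameters" means, the sketch below fits a pseudo-polynomial of the assumed form y = a0 + Σ_j a_j · x1^e(j,1) · x2^e(j,2) by ordinary least squares. The exponent matrix is fixed by hand here, whereas in EPR it would be the object of the evolutionary search; the data, the exponents and all function names are assumptions made for this example, and plain Python is used only to keep it dependency-free.

```python
from typing import List, Sequence


def build_terms(X: Sequence[Sequence[float]],
                exponents: Sequence[Sequence[float]]) -> List[List[float]]:
    """Design matrix: a bias column plus one monomial column per row of exponents."""
    rows = []
    for x in X:
        row = [1.0]  # bias term a0
        for exps in exponents:
            term = 1.0
            for value, e in zip(x, exps):
                term *= value ** e
            row.append(term)
        rows.append(row)
    return rows


def solve_least_squares(A: List[List[float]], y: Sequence[float]) -> List[float]:
    """Solve the normal equations (A^T A) a = A^T y by Gauss-Jordan elimination."""
    n = len(A[0])
    # Augmented matrix [A^T A | A^T y]
    M = [[sum(r[i] * r[j] for r in A) for j in range(n)]
         + [sum(r[i] * t for r, t in zip(A, y))]
         for i in range(n)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda k: abs(M[k][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for k in range(n):
            if k != col:
                factor = M[k][col] / M[col][col]
                M[k] = [v - factor * p for v, p in zip(M[k], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


# Synthetic data generated from y = 0.5 + 2*x1*x2**0.5 + 3*x2**2 (illustrative only).
X = [(1.0, 1.0), (2.0, 1.0), (1.0, 4.0), (3.0, 9.0), (2.0, 4.0), (4.0, 1.0)]
y = [0.5 + 2.0 * x1 * x2 ** 0.5 + 3.0 * x2 ** 2 for x1, x2 in X]

exponents = [[1.0, 0.5], [0.0, 2.0]]  # one row per term, one column per input
coeffs = solve_least_squares(build_terms(X, exponents), y)
print([round(c, 6) for c in coeffs])  # approximately a0 = 0.5, a1 = 2, a2 = 3
```

Because the coefficients enter linearly, they follow from a single linear solve once the exponents are chosen, so a search over model structures only has to explore the exponents.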

Although

Data collection

In this work, the models obtained by MO-EPR application to local scouring downstream of GCSs have been compared with those obtained by D’Agostino and Ferro (2004). For this reason, MO-EPR employs the same database used therein, thus the same explanatory variables. In particular, the database reported in D’Agostino and Ferro (2004) can be distinguished into seven different sets coming from different experimental layouts. In the following application, they will be referred to the relevant

Modelling strategy

The selection of the explanatory variables is an important issue in environmental modelling. Usually it is performed based on the physical insight of the analyst and influenced by the availability of measurements/observations. In this particular context, for example, D’Agostino and Ferro (2004) selected five dimensionless groups (those on the left of the line in Eq. (2)) assuming them as the most important factors based on the physics of the scour phenomenon, validated by a multiple-regression

Modelling discussion of MO-EPR results

The aim of this section is to discuss MO-EPR results from the modelling point of view, emphasizing its main features (e.g., ability to provide symbolic expressions, ranking of formulæ according to their accuracy vs. parsimony, etc.). As evident from Table 2, the optimal solutions range from a very parsimonious model, the average model in Eq. (6), which is however the least accurate, up to the best fitting model, Eq. (19), which conversely contains the highest number of terms and explanatory

Statistical comparison using literature formulæ

To evaluate the generalization performance of EPR, the models in Eqs. (8) and (16) have been selected (see Table 2). The model in Eq. (8) is very parsimonious and for this reason it is expected to have a reliable generalization performance on equilibrium scour prediction at different scales, as demonstrated by D’Agostino and Ferro (2004) for the model in Eq. (4). The model in Eq. (16) has similar generalization performance to the models in Eqs. (18) and (19), see the last column in Table 2,

Summary and conclusions

The paper presents the application of Multi-Objective Evolutionary Polynomial Regression (MO-EPR), a new modelling technique that combines numerical regression and evolutionary computing, to the modelling of local scouring downstream of GCSs. MO-EPR performs an evolutionary-based multi-objective optimization in the space of solutions, using three conflicting objective functions describing accuracy and parsimony of the candidate models. MO-EPR returns a set of optimal data models (Pareto front) of different

References (41)

  • N.E. Bormann et al. Scour downstream of grade-control structures. J. Hydraul. Eng. (1991)
  • V. D’Agostino. Indagine sullo scavo a valle di opere trasversali mediante modello fisico a fondo mobile. Energ. Elettr. (1994)
  • V. D’Agostino et al. Scour on alluvial bed downstream of grade-control structures. J. Hydraul. Eng. (2004)
  • B. Dargahi. Scour development downstream of a spillway. J. Hydraul. Res. (2003)
  • K. Deb et al. Scalable multi-objective optimization test problems
  • N.R. Draper et al. Applied Regression Analysis (1998)
  • EPR
  • R. Ettema et al. Scale effect in pier-scour experiments. J. Hydraul. Eng. (2000)
  • M. Falciai et al. Indagine sui gorghi che si formano a valle delle traverse torrentizie. Italia Forestale Mont. (1978)
  • J. Farhoudi et al. Application of neuro-fuzzy model to estimate the characteristics of local scour downstream of stilling basins. J. Hydroinformatics (2010)