Scour depth modelling by a multi-objective evolutionary paradigm
Introduction
Analysis and modelling of the interaction between engineering structures and environmental systems are a key element for the appropriate and effective design of such structures, while preventing excessive degradation of the environment. This manuscript considers one effect of Grade-Control Structures (GCSs): the local scour of alluvial bed channels. These structures (aprons, spillways, bed sills, weirs, check dams, etc.) are intended to reduce or limit the excessive degradation of streams and rivers by reducing their slope, limiting sediment transport, and reducing flow velocity and bank erosion. Their positive effects are, however, counterbalanced by local scouring, which occurs downstream of them and can affect the stability of the structures themselves and, eventually, of neighbouring structures such as bridge piers and embankments.
Local scouring is caused by the presence of the structure itself, which modifies the flow direction, generating jets whose velocity and impact angle are quite different from those in the main stream. As a result, shear stresses are exerted on the soil particles of the river bed downstream of the structure (see Fig. 1), causing their removal. It is worth noting that under live-bed conditions (i.e., those that usually occur in real-world cases) there is a transport of bed material from upstream; the scour depth increases rapidly and, due to the interaction between erosion and deposition, tends to fluctuate around an equilibrium value (D’Agostino and Ferro, 2004). The consequence of this phenomenon is the creation of a “scour pool” (also named impinging pool), which is often characterized by its maximum depth. Therefore, the amount of erosion can be directly correlated to the maximum depth of the scour pool, henceforth named the “equilibrium scour depth”.
This brief description of the phenomenon helps in listing the components playing a role in its definition: the dimensional features of the structure, the geological and geophysical features of the channel bed, and the hydraulics of the flow, without omitting the temporal evolution of the phenomenon (Pagliara et al., 2008). The resulting picture is that of a very complex phenomenon, owing to the wide range of physical parameters that must be defined or measured for its complete characterization.
For these reasons, researchers have mainly concentrated their efforts on developing empirical formulæ based on laboratory (usually) and field (rarely) measurements to predict the equilibrium (or quasi-equilibrium) scour depth under various flow conditions and structure configurations, such as those reported in Veronese (1937), Mason and Arumugam (1985), Hoffmans (1998), and Dargahi (2003). In general, these are regressive models fitted to their own experimental observations; they are therefore mainly effective on data falling within the range of the training data, while often showing problems when applied outside this range (e.g., in real-scale applications). This is basically due to the difference in the scale of representation (from laboratory scale to real-world scale), the presence of errors (in the field data), and the unavailability of some input variables needed to define the model in real-scale contexts, which must then be estimated (e.g., from tabulated values). This, in turn, introduces uncertainty and errors (Ettema et al., 2000, D’Agostino and Ferro, 2004, Azmathullah et al., 2005).
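As a minimal illustration of how such regressive formulæ are calibrated, the sketch below fits a power law to laboratory-style observations by linear least squares in log space. All variable names and data values are purely illustrative assumptions, not taken from the cited studies:

```python
import numpy as np

# Hypothetical laboratory observations (illustrative values only):
# Fd = densimetric Froude number, s_rel = dimensionless scour depth.
Fd = np.array([1.2, 1.8, 2.5, 3.1, 4.0])
s_rel = np.array([0.9, 1.6, 2.4, 3.0, 4.1])

# Fit a power law s_rel = a * Fd**b by linear least squares in log space.
b, log_a = np.polyfit(np.log(Fd), np.log(s_rel), 1)
a = np.exp(log_a)

def predict(fd):
    """Predict the dimensionless scour depth from the fitted power law."""
    return a * fd ** b
```

As the text stresses, such a fit is only trustworthy within the range of the calibration data; extrapolating it to real-scale conditions outside that range is exactly where these empirical models tend to fail.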
In an attempt to define a more general formulation for equilibrium scour depth prediction, Bormann and Julien (1991) theoretically derived an equation based on the concepts of jet diffusion and particle stability in scour pools downstream of grade-control structures, testing the equation on prototype experiments. Along the same lines, researchers have recently preferred the use of dimensionless variables for characterizing the phenomenon, in order to overcome scaling problems. For example, D’Agostino and Ferro (2004) studied the scour pattern downstream of a grade-control structure using dimensionless groups, through the application of the incomplete self-similarity theory. They tested the procedure on published and unpublished data collected at different scales and characterized by different bed grain-size distributions, producing two relationships for predicting the maximum scour depth.
In recent years, in the light of the growing availability of computational power, some researchers have tried to assess local scouring by means of purely data-driven approaches, such as artificial neural networks (Liriano and Day, 2001, Azmathullah et al., 2005, Guven and Gunal, 2008) and fuzzy logic (Uyumaz et al., 2006, Azamathulla et al., 2009, Farhoudi et al., 2010). All these studies were aimed at increasing the generalization capacity of the returned models, i.e., their performance on unseen data (outside their training range). However, limitations of purely data-driven approaches affected the final results in several ways. (i) The relationship between the explanatory variables and the scour depth is sought as a single mathematical expression, under prior assumptions about the influencing factors and its mathematical form, thus motivating the adoption of “trial & error” approaches to select good models. (ii) More often than not, the retrieved expression/model is not parsimonious in terms of the number of variables involved and the complexity of its mathematical structure (e.g., artificial neural networks), thus being quite accurate on training data but showing poor generalization when applied to real-scale problems. In fact, unnecessary model complexity is often a symptom of over-fitting to experimental data and of scarce generalization capability; moreover, complex formulas are usually difficult to evaluate from a physical point of view. (iii) The modelling strategy usually provides one model whose single objective is the maximization of fitting to data, without explicitly accounting for model parsimony (which is useful for generalization). Azamathulla et al. (2010) applied the Genetic Programming (GP) modelling paradigm (Babovic and Keijzer, 2000), which provides explicit formulations of data models (symbolic model structures) by performing a global exploration of the model space, although this approach shows some drawbacks. For example, it is not very powerful in finding constants and, more importantly, it tends to produce models with a very complex formulation (Giustolisi and Savic, 2006), which are formally symbolic structures but are difficult to analyze.
In summary, all such models are supported by physical knowledge of the phenomenon, which has guided scientists in collecting data and in defining the mathematical structure of the model expression. These models have usually been calibrated by numerical regression or alternative data-driven techniques using measured data. This undoubtedly affects the portability of the formulas to different data, which can often be incomplete or affected by errors. For example, leaving aside data of different scale, a model over-fitted on laboratory data may show scarce accuracy when applied to data from the same scale but affected by errors. Therefore, while modelling local scouring by fitting data (maximizing accuracy), it can be useful to introduce another objective that controls the complexity of the resulting mathematical expression, thus improving the generalization of the model. In fact, the so-called principle of parsimony (Ockham’s razor) states that, among a set of otherwise equivalent models of a given phenomenon, one should choose the simplest one to explain a dataset; this in turn helps prevent over-fitting to training data (Young et al., 1996, Crout et al., 2009). However, this modelling strategy calls for a trade-off between model complexity and fitness to data while developing the model itself.
The paper focuses on the implementation of a modelling strategy recently proposed by Giustolisi and Savic (2006, 2009): Evolutionary Polynomial Regression (EPR). It is a hybrid modelling technique which combines linear regression and an evolutionary search for mathematical model structures. A multi-objective evolutionary optimization paradigm is employed to find a set of mathematical expressions that can be assumed to be “optimal” with respect to some objectives, according to the Pareto dominance criterion (Pareto, 1896). EPR is used here to model the equilibrium scour depth downstream of grade-control structures by analyzing a large database, ranging from real observations to laboratory-scale data (D’Agostino and Ferro, 2004, Bormann and Julien, 1991). EPR generates a number of eligible models for predicting the equilibrium scour depth (i.e., the Pareto front of models) with symbolic/explicit formulations, thus allowing physical insight into the local scouring process to be evaluated. EPR has already proved reliable in many applications in civil and environmental engineering. For further details on EPR applications the reader is referred to the EPR (2009) webpage.
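The Pareto dominance criterion used to rank candidate models can be sketched as follows. This is a minimal illustration, not the MO-EPR implementation: only two objectives are shown here (prediction error and number of terms, both minimized), and the objective values are hypothetical:

```python
def dominates(m1, m2):
    """m1 dominates m2 if it is no worse in every objective and
    strictly better in at least one (all objectives minimized)."""
    return (all(a <= b for a, b in zip(m1, m2))
            and any(a < b for a, b in zip(m1, m2)))

def pareto_front(models):
    """Keep only the non-dominated candidates: the set of optimal
    trade-offs between accuracy and parsimony."""
    return [m for m in models
            if not any(dominates(o, m) for o in models if o != m)]

# Each candidate model: (prediction error, number of terms).
candidates = [(0.30, 1), (0.12, 3), (0.25, 3), (0.05, 7), (0.10, 8)]
front = pareto_front(candidates)  # → [(0.30, 1), (0.12, 3), (0.05, 7)]
```

No single point on the front is “best”: each non-dominated model trades some accuracy for parsimony, which is why EPR returns a set of eligible models rather than one.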
Section snippets
Introduction to evolutionary polynomial regression
EPR can be defined as a non-linear global stepwise regression that provides symbolic formulæ of models. It is global because the search for the optimal model structure explores the entire space of models through a flexible coding of model structures. Moreover, EPR generalizes the original stepwise regression of Draper and Smith (1998) by considering non-linear model structures (i.e., pseudo-polynomials) that remain linear with respect to the regression parameters.
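A minimal sketch of this linear-in-parameters step, assuming a candidate exponent matrix has already been proposed by the evolutionary search (the data, variable names, and exponents below are purely illustrative, not an actual EPR run):

```python
import numpy as np

def epr_design_matrix(X, ES):
    """Build pseudo-polynomial terms X1**e1 * X2**e2 * ... for each row
    of the candidate exponent matrix ES, plus a bias column; the model
    is then linear in the coefficients multiplying these terms."""
    terms = [np.prod(X ** row, axis=1) for row in ES]
    return np.column_stack([np.ones(X.shape[0])] + terms)

# Illustrative data: 2 explanatory variables, 5 observations.
rng = np.random.default_rng(0)
X = rng.uniform(0.5, 2.0, size=(5, 2))
y = 1.0 + 2.0 * X[:, 0] * X[:, 1] ** 0.5   # known target for the demo

ES = np.array([[1.0, 0.5]])                # one candidate term: X1 * X2**0.5
A = epr_design_matrix(X, ES)
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)   # linear least squares
```

The evolutionary search varies the exponent matrix `ES` (the symbolic structure), while the coefficients are always recovered by ordinary least squares, which is why EPR does not suffer from GP's weakness in finding constants.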
Data collection
In this work, the models obtained by the MO-EPR application to local scouring downstream of GCSs have been compared with those obtained by D’Agostino and Ferro (2004). For this reason, MO-EPR employs the same database used therein, and thus the same explanatory variables. In particular, the database reported in D’Agostino and Ferro (2004) can be divided into seven different sets coming from different experimental layouts. In the following application, they will be referred to by the relevant
Modelling strategy
The selection of the explanatory variables is an important issue in environmental modelling. It is usually performed based on the physical insight of the analyst and is influenced by the availability of measurements/observations. In this particular context, for example, D’Agostino and Ferro (2004) selected five dimensionless groups (those on the left of the line in Eq. (2)), assuming them to be the most important factors based on the physics of the scour phenomenon, validated by a multiple-regression
Modelling discussion of MO-EPR results
The aim of this section is to discuss the MO-EPR results from the modelling point of view, emphasizing its main features (e.g., the ability to provide symbolic expressions, the ranking of formulæ according to their accuracy vs. parsimony, etc.). As evident from Table 2, the optimal solutions range from a very parsimonious model, the average model in Eq. (6), which is however the least accurate, up to the best-fitting model, Eq. (19), which conversely contains the highest number of terms and explanatory
Statistical comparison using literature formulæ
EPR generalization performance has been evaluated with respect to the models in Eqs. (8) and (16), which have been selected (see Table 2). The model in Eq. (8) is very parsimonious, and for this reason it is expected to have reliable generalization performance for equilibrium scour prediction at different scales, as demonstrated by D’Agostino and Ferro (2004) for the model in Eq. (4). The model in Eq. (16) has generalization performance similar to the models in Eqs. (18) and (19) (see the last column in Table 2),
Summary and conclusions
The paper presents the application of Multi-Objective Evolutionary Polynomial Regression (MO-EPR), a new modelling technique combining numerical regression and evolutionary computing, to the modelling of local scouring downstream of GCSs. MO-EPR performs an evolutionary multi-objective optimization in the space of candidate models, using three conflicting objective functions describing the accuracy and parsimony of the candidate models. MO-EPR returns a set of optimal data models (a Pareto front) of different
References (41)
- et al., Is my model too complex? Evaluating model formulation using model reduction, Environ. Model. Software (2009)
- et al., A multi-model approach to analysis of environmental phenomena, Environ. Model. Software (2007)
- et al., Comparing state-of-the-art evolutionary multi-objective algorithms for long-term groundwater monitoring design, Adv. Water Resour. (2006)
- et al., Local scouring and morphological adjustments in steep channels with check-dam sequences, Geomorphology (2003)
- et al., Selection and validation of parameters in multiple linear and principal component regressions, Environ. Model. Software (2008)
- et al., Using interactive archives in evolutionary multi-objective optimization: a case study for long-term groundwater monitoring design, Environ. Model. Software (2007)
- et al., ANFIS based approach for predicting maximum scour location of spillway, Water Manage. (2009)
- et al., Genetic programming to predict bridge pier scour, J. Hydraul. Eng. (2010)
- et al., Neural networks for estimation of scour downstream of a ski-jump bucket, J. Hydraul. Eng. (2005)
- et al., Genetic programming as a model induction engine, J. Hydroinformatics (2000)
- Scour downstream of grade-control structures, J. Hydraul. Eng.
- Indagine sullo scavo a valle di opere trasversali mediante modello fisico a fondo mobile [Investigation of scour downstream of transverse structures by means of a mobile-bed physical model], Energ. Elettr.
- Scour on alluvial bed downstream of grade-control structures, J. Hydraul. Eng.
- Scour development downstream of a spillway, J. Hydraul. Res.
- Scalable multi-objective optimization test problems
- Applied Regression Analysis
- Scale effect in pier-scour experiments, J. Hydraul. Eng.
- Indagine sui gorghi che si formano a valle delle traverse torrentizie [Investigation of the scour holes forming downstream of torrent check dams], Italia Forestale Mont.
- Application of neuro-fuzzy model to estimate the characteristics of local scour downstream of stilling basins, J. Hydroinformatics