Elsevier

Journal of Hydrology

Volumes 450–451, 11 July 2012, Pages 48-58

Suspended sediment modeling using genetic programming and soft computing techniques

https://doi.org/10.1016/j.jhydrol.2012.05.031

Summary

Modeling suspended sediment load is an important task in water resources engineering because it crucially affects the design and management of water resources structures. In this study, the genetic programming (GP) technique was applied to estimate the daily suspended sediment load at two stations on the Cumberland River in the U.S. Daily flow and sediment data from 1972 to 1989 were used to train and test the GP models. The effect of various GP operators on sediment load estimation was investigated, and the optimal fitness function, operator functions, linking function and learning algorithm were obtained for modeling daily suspended sediment. The GP estimates were compared with those of the Adaptive Neuro-Fuzzy Inference System (ANFIS), Artificial Neural Network (ANN) and Support Vector Machine (SVM) models in terms of the coefficient of determination, mean absolute error, coefficient of residual mass and variance accounted for. The comparison indicated that GP is superior to the ANFIS, ANN and SVM models in estimating daily suspended sediment load.
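
The four comparison criteria named above can be computed as in the following minimal sketch. The formulas are the definitions commonly used in the hydrological literature and are an assumption here, since this summary does not state them. In particular, the coefficient of determination is taken as the squared Pearson correlation, and the function name and sample values are purely illustrative.

```python
import numpy as np


def evaluation_metrics(observed, estimated):
    """Return R2, MAE, CRM and VAF for paired observed/estimated loads."""
    obs = np.asarray(observed, dtype=float)
    est = np.asarray(estimated, dtype=float)
    residual = obs - est
    # Coefficient of determination, assumed to be the squared Pearson correlation.
    r2 = np.corrcoef(obs, est)[0, 1] ** 2
    # Mean absolute error, in the units of the load.
    mae = np.mean(np.abs(residual))
    # Coefficient of residual mass: positive means overall underestimation.
    crm = (obs.sum() - est.sum()) / obs.sum()
    # Variance accounted for, in percent.
    vaf = (1.0 - np.var(residual) / np.var(obs)) * 100.0
    return {"R2": r2, "MAE": mae, "CRM": crm, "VAF": vaf}


# Made-up loads, not data from the study.
obs = [10.0, 20.0, 30.0, 40.0]
est = [12.0, 18.0, 33.0, 37.0]
metrics = evaluation_metrics(obs, est)
perfect = evaluation_metrics(obs, obs)
```

A perfect model gives R2 = 1, MAE = 0, CRM = 0 and VAF = 100, which is a convenient sanity check before comparing models.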

Highlights

► We use the genetic programming (GP) technique to model daily suspended sediment load.
► The GP results are compared with those of other soft computing techniques.
► These techniques are neuro-fuzzy, neural network and Support Vector Machine models.
► Comparison results show that the GP models perform better than the others.

Introduction

The correct estimation of the volume of sediment transported by a river is of great importance in water resources engineering, as it directly affects the planning, design, management and operation of hydraulic structures. Several attempts have been made to relate the amount of suspended load to flow characteristics of the river, such as flow velocity, discharge and shear stress, but none of these relationships has proved universally applicable (Vanoni, 1971). Physically based models rely on simplified partial differential equations of flow and sediment flux, on some unrealistic simplifying assumptions for flow, and on empirical relationships for the erosive effects of rainfall and flow (Aytek and Kisi, 2008). Examples of such models are presented by Wicks and Bathurst (1996), Kothyari et al. (1997) and Refsgaard (1997). McBean and Al-Nassri (1998) proposed that the regression model be established between sediment concentration and discharge. Karim and Kennedy (1990) aimed to establish relations among flow velocity, sediment discharge, bed form geometry and friction factor of alluvial rivers. Lopes and Ffolliott (1993) concluded that the relationship between sediment concentration and stream flow is rather complex because of the hysteresis effect. Over the years, several sediment rating curves have been proposed to describe the average relationship between discharge and suspended sediment load (Thomas, 1985, Asselman, 2000, Picoet et al., 2001, Overleir, 2004, Crowder et al., 2007). Physically based models have components that correspond to physical processes and are theoretically capable of taking into account the spatial variation of catchment properties as well as the uneven distribution of precipitation and evapotranspiration (Aytek and Kisi, 2008).
However, the application of physically based models is rather complicated because it requires detailed spatial and temporal environmental data, which are often unavailable. Suspended sediment load estimation at high resolution is a difficult task because (Sivakumar, 2006): (i) it depends on the availability of high-resolution water discharge and sediment concentration data, which may not be available in every case; (ii) it is subject to any error in the measurement of these two components; and (iii) direct measurements of these variables are expensive. Time series techniques generally assume linear relationships between variables, but such relationships rarely hold for real hydrological data, so newer artificial intelligence techniques may improve the analysis.
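
As an illustration of the sediment rating curves mentioned above, the following sketch fits a power-law curve S = a Q^b by least squares on log-transformed data. The power-law form is the one most often used for rating curves but is an assumption here, as the text does not specify a functional form; the function names and the synthetic data are hypothetical.

```python
import numpy as np


def fit_rating_curve(discharge, sediment_load):
    """Fit S = a * Q**b by ordinary least squares in log-log space."""
    log_q = np.log(np.asarray(discharge, dtype=float))
    log_s = np.log(np.asarray(sediment_load, dtype=float))
    # For degree 1, polyfit returns [slope, intercept].
    b, log_a = np.polyfit(log_q, log_s, 1)
    return float(np.exp(log_a)), float(b)


def predict_rating_curve(discharge, a, b):
    """Evaluate the fitted curve for new discharge values."""
    return a * np.asarray(discharge, dtype=float) ** b


# Synthetic data generated from a known curve (a = 0.5, b = 1.8), not real measurements.
q = np.array([5.0, 10.0, 20.0, 40.0, 80.0])
s = 0.5 * q ** 1.8
a, b = fit_rating_curve(q, s)
s_hat = predict_rating_curve(q, a, b)
```

Because a single curve gives one load for each discharge, it cannot reproduce the hysteresis effect noted by Lopes and Ffolliott (1993), which is one motivation for the data-driven models discussed next.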

In recent years, Artificial Neural Networks (ANNs), Adaptive Neuro-Fuzzy Inference Systems (ANFIS), Support Vector Machines (SVMs) and genetic programming (GP) have been applied in water resources. Recent studies have reported that ANNs may offer promising results in hydrology and water resources engineering (e.g. ASCE, 2000a, ASCE, 2000b). A review of all ANN applications in hydrology is beyond the scope of the present paper, and only the most relevant papers are discussed here.

Jain (2001) used the ANN approach to establish an integrated stage-discharge-sediment concentration relation for two sites on the Mississippi River and found the ANN approach to be better than the conventional methods. Cigizoglu (2002) used ANNs to forecast and estimate sediment concentration values and compared the results with those of the corresponding classical regression models; the results showed the superiority of ANNs. Cigizoglu (2004) investigated the performance of multi-layer perceptrons (MLPs) in daily suspended sediment estimation and forecasting and found that MLPs capture the complex non-linear behavior of the sediment series better than the conventional models. Kisi (2004) used multi-layer perceptrons with the Levenberg–Marquardt training algorithm for suspended sediment concentration prediction and estimation. Partal and Cigizoglu (2008) applied a combined Wavelet–ANN method to estimate and predict the suspended sediment load in rivers and observed a good fit between modeled and measured data. Azamathulla and Wu (2011) applied the SVM technique to compute longitudinal dispersion coefficients in natural streams.

An ANFIS is a combination of an adaptive neural network and a fuzzy inference system. The parameters of the fuzzy inference system are determined by the ANN learning algorithms. Since the system is based on a fuzzy inference system, an important aspect is that it should always be interpretable in terms of fuzzy IF-THEN rules. ANFIS is capable of approximating any real continuous function on a compact set to any degree of accuracy (Jang et al., 1997). ANFIS identifies a set of parameters through a hybrid learning rule that combines back-propagation gradient descent with a least-squares method. There are two main approaches to fuzzy inference systems, namely those of Mamdani (Mamdani and Assilian, 1975) and Sugeno (Takagi and Sugeno, 1985). The two approaches differ in the consequent part: Mamdani's approach uses fuzzy membership functions, while Sugeno's approach uses linear or constant functions. The neuro-fuzzy model used in this study implements Sugeno's fuzzy approach (Takagi and Sugeno, 1985) to obtain the values of the output variable from those of the input variables. Here, ANFIS has several input variables (recorded river discharge and sediment loads) and one output, the sediment load at the present time.
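
To make the Sugeno consequent concrete, the sketch below evaluates a first-order Sugeno system with one input and two rules: each rule fires with a Gaussian membership degree, and the output is the firing-strength-weighted average of the linear rule outputs. The Gaussian membership shape, the rule parameters and the function names are illustrative assumptions, not the configuration used in the study, and no parameter learning is shown.

```python
import numpy as np


def gaussian_mf(x, center, sigma):
    """Gaussian membership degree of x in a fuzzy set."""
    return np.exp(-0.5 * ((x - center) / sigma) ** 2)


def sugeno_predict(x, rules):
    """First-order Sugeno output for a scalar input x.

    Each rule is (center, sigma, p, r), read as:
    IF x is Gaussian(center, sigma) THEN y = p * x + r.
    """
    weights = np.array([gaussian_mf(x, c, s) for c, s, _, _ in rules])
    outputs = np.array([p * x + r for _, _, p, r in rules])
    # Weighted average of the linear consequents (defuzzified output).
    return float(np.sum(weights * outputs) / np.sum(weights))


# Hypothetical rules for a discharge-like input; the values are made up.
rules = [
    (10.0, 5.0, 0.5, 1.0),     # "low discharge":  y = 0.5 * x + 1
    (50.0, 15.0, 2.0, -20.0),  # "high discharge": y = 2.0 * x - 20
]
y_low = sugeno_predict(10.0, rules)
y_high = sugeno_predict(50.0, rules)
```

In ANFIS, the membership parameters (center, sigma) would be tuned by gradient descent and the consequent parameters (p, r) by least squares, which is the hybrid learning rule described above.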

The methodology of GP was first proposed by Koza (1992) as a generalization of Genetic Algorithms (GAs) (Goldberg, 1989). The fundamental difference between GP and GAs lies in the nature of the individuals: in GAs, individuals are linear strings of fixed length (chromosomes), while in GP, individuals are nonlinear entities of different sizes and shapes (parse trees). One of the strengths of GP over other data-driven techniques (e.g., ANFIS) is that it can produce explicit formulations (model expressions) of the relationship that governs the physical phenomenon. Such expressions may be open to physical interpretation. The comprehensibility of GP models is also a way to reduce the risk of over-fitting to the training data and to improve the generalization of the resulting models. In this way, one may perform knowledge discovery using GP, confirming well-known physical relationships and evolving interesting new formulae for particular case studies.
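
The parse-tree representation described above can be sketched as follows: an individual is a nested tuple that can be evaluated on data and also printed as an explicit formula. The function set, the variable names Q_t and Q_t1 (current and previous-day discharge) and the example individual are hypothetical and are not a model from the study; selection, crossover and mutation are omitted.

```python
import operator

# Illustrative function set; a real GP run would choose its own operators.
FUNCTIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(tree, variables):
    """Recursively evaluate a parse tree for the given variable values."""
    if isinstance(tree, tuple):
        symbol, left, right = tree
        return FUNCTIONS[symbol](evaluate(left, variables), evaluate(right, variables))
    if isinstance(tree, str):
        return variables[tree]  # terminal: input variable
    return float(tree)          # terminal: numeric constant


def to_expression(tree):
    """Render the tree as an explicit, human-readable formula."""
    if isinstance(tree, tuple):
        symbol, left, right = tree
        return f"({to_expression(left)} {symbol} {to_expression(right)})"
    return str(tree)


def size(tree):
    """Count nodes; unlike fixed-length GA strings, this varies per individual."""
    if isinstance(tree, tuple):
        return 1 + size(tree[1]) + size(tree[2])
    return 1


# Hypothetical individual: 0.3 * Q_t + Q_t * Q_t1
individual = ("+", ("*", 0.3, "Q_t"), ("*", "Q_t", "Q_t1"))
value = evaluate(individual, {"Q_t": 2.0, "Q_t1": 3.0})
formula = to_expression(individual)
```

The string returned by `to_expression` is what makes a GP model inspectable, in contrast to the weights of an ANN or the rule base of an ANFIS.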

Keskin et al. (2004) used fuzzy models to estimate daily pan evaporation in Western Turkey. Kisi (2005) applied neuro-fuzzy and neural network techniques to estimate suspended sediment. Kisi (2006) investigated the ability of the ANFIS technique to improve the accuracy of daily pan evaporation estimation. Partal and Kisi (2007) proposed a new wavelet-neuro-fuzzy conjunction model for precipitation forecasting. Kisi (2009) proposed evolutionary fuzzy models for suspended sediment concentration estimation. Shiri and Kisi (2010a) introduced a wavelet-neuro-fuzzy conjunction model for predicting short-term and long-term streamflows. Azamathulla and Ghani (2011) used ANFIS to predict scour depth at culvert outlets and found it to be more effective than regression equations and ANNs. Ahmad et al. (2011) applied a neuro-fuzzy method to estimate transverse mixing coefficients. Shiri et al. (2011a) applied ANFIS to model daily pan evaporation in the state of Illinois, USA, and found it to be better than ANNs. Shiri et al. (2011b) applied ANFIS to short-term operational sea water level variations. Azamathulla et al. (2012) introduced an ANFIS-based approach for predicting sediment transport in clean sewers.

Drecourt (1999), Savic et al. (1999) and Aytek and Alp (2008) applied GP to rainfall-runoff modeling. Giustolisi (2004) determined the Chezy resistance coefficient using GP. Aytek and Kisi (2008) applied GP to suspended sediment transport in streams and found it to be better than conventional rating curve and multi-linear regression techniques. Harris et al. (2003) used GP to predict velocity in compound channels with vegetated flood plains. Babovic et al. (2001) and Babovic and Keijzer (2002) applied GP to the modeling of risks in water supply. Shiri and Kisi (2010b) compared GP with ANFIS for predicting short-term groundwater table depth fluctuations. Kisi and Shiri (2011) introduced a new wavelet-GEP conjunction model for precipitation forecasting. Azamathulla and Ahmad (2012) applied GEP to compute transverse mixing coefficients. Shiri et al. (2012) applied GEP to model daily evapotranspiration.

The focus of this paper is to apply GP (i.e. gene expression programming, GEP), ANFIS, ANN and SVM models, all of which are data-driven approaches, to estimate daily sediment load values from recorded river discharge and sediment load data, and to compare the results obtained with these techniques.

Data used

In the present work, the daily water discharge and river sediment load data of two stations on the Cumberland River in the U.S. for a period of 10 years (01-October-1979 to 30-September-1989), taken from the USGS web site, were used for training and testing the employed models. The upstream station, near Pineville (Station ID: 03403000), has a basin area of 2095 km², and the downstream station, located at Barbourville (Station ID: 03403500), has a basin area of 960 sq. mi (about 2486 km²). Fig. 1 illustrates the time

Results and discussions

All of the applied models, viz. GEP, ANFIS, SVM and ANN, were trained and tested to estimate the suspended sediment load at the studied stations. The river parameters considered in this study are the river water discharge (Q) and the suspended sediment load (S). The study examines various combinations of these two parameters as inputs to the applied models in order to evaluate the effect of each variable on the estimated sediment load. The following input combinations were investigated:

  • (i) Q_t
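
Input combinations of this kind can be assembled from the two daily series as in the sketch below, where the target is the current sediment load S_t and each input column is Q or S at a chosen lag in days. Only combination (i), with Q_t alone, appears in the list above; the second lag set in the example is purely illustrative and is not taken from the paper. The function name and the short series are hypothetical.

```python
import numpy as np


def build_lagged_inputs(q, s, q_lags, s_lags):
    """Return (X, y) where X holds Q at q_lags and S at s_lags, and y is S_t.

    A lag of 0 means the current day. S at lag 0 is rejected because
    S_t is the quantity being estimated.
    """
    q = np.asarray(q, dtype=float)
    s = np.asarray(s, dtype=float)
    if 0 in s_lags:
        raise ValueError("S_t is the target and cannot be used as an input")
    max_lag = max(list(q_lags) + list(s_lags))
    n = len(q)
    # Shift each series so that row i of every column refers to day max_lag + i.
    columns = [q[max_lag - lag : n - lag] for lag in q_lags]
    columns += [s[max_lag - lag : n - lag] for lag in s_lags]
    return np.column_stack(columns), s[max_lag:]


# Made-up series, not data from the studied stations.
q = [1.0, 2.0, 3.0, 4.0, 5.0]
s = [10.0, 20.0, 30.0, 40.0, 50.0]

X1, y1 = build_lagged_inputs(q, s, q_lags=[0], s_lags=[])      # combination (i): Q_t
X2, y2 = build_lagged_inputs(q, s, q_lags=[0, 1], s_lags=[1])  # illustrative lag set only
```

The first `max_lag` days are dropped so that every row has a complete set of lagged values.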

Conclusions

Modeling suspended sediment load in rivers is of great importance, as it affects the management and operation of hydraulic structures as well as river morphology. The current work investigated the potential of an evolutionary data-driven method, the GP (i.e. GEP) approach, for estimating suspended sediment load. The data of two stations on the Cumberland River in the U.S. for a period of 10 years (01-October-1979 to 30-September-1989), taken from the USGS web site for training and testing the

References (59)

  • T. Partal et al. Estimation and forecasting of daily suspended sediment data using wavelet-neural networks. J. Hydrol. (2008)
  • T. Partal et al. Wavelet and neuro-fuzzy conjunction model for precipitation forecasting. J. Hydrol. (2007)
  • J.C. Refsgaard. Parameterization, calibration and validation of distributed hydrological models. J. Hydrol. (1997)
  • J. Shiri et al. Short-term and long-term streamflow forecasting using a wavelet and neuro-fuzzy conjunction model. J. Hydrol. (2010)
  • J. Shiri et al. Daily reference evapotranspiration modeling by using genetic programming approach in the Basque Country (Northern Spain). J. Hydrol. (2012)
  • J.M. Wicks et al. SHESED: a physically based, distributed erosion and sediment yield component for the SHE hydrological modeling system. J. Hydrol. (1996)
  • Z. Ahmad et al. ANFIS-based approach for the estimation of transverse mixing coefficient. Water Sci. Technol. (2011)
  • ASCE Task Committee on Application of Artificial Neural Networks in Hydrology, 2000a. Artificial neural networks in...
  • ASCE Task Committee on Application of Artificial Neural Networks in Hydrology, 2000b. Artificial neural networks in...
  • A. Aytek et al. An application of artificial intelligence for rainfall runoff modeling. J. Earth Syst. Sci. (2008)
  • H.Md. Azamathulla et al. Gene-expression programming for transverse mixing coefficient. J. Hydrol. (2012)
  • H.Md. Azamathulla et al. ANFIS-based approach for predicting the scour depth at culvert outlets. J. Pipeline Syst. Eng. Pract. (2011)
  • V. Babovic et al. Rainfall runoff modeling based on genetic programming. Nord. Hydrol. (2002)
  • V. Babovic et al. Neural networks as routine for error updating of numerical models. J. Hydrol. Eng. (2001)
  • H.K. Cigizoglu. Suspended sediment estimation and forecasting using artificial neural networks. Turkish J. Eng. Env. Sci. (2002)
  • Drecourt, J.P., 1999. Application of Neural Networks and Genetic Programming to Rainfall Runoff Modeling. D2K Technical...
  • C. Ferreira. Gene expression programming: a new adaptive algorithm for solving problems. Complex Syst. (2001)
  • Ferreira, C., 2001b. Gene expression programming in problem solving. In: 6th Online World Conference on Soft Computing...
  • C. Ferreira. Gene Expression Programming: Mathematical Modeling by an Artificial Intelligence. (2006)