Comparison of genetic programming with neuro-fuzzy systems for predicting short-term water table depth fluctuations

doi:10.1016/j.cageo.2010.11.010

Computers & Geosciences

Volume 37, Issue 10, October 2011, Pages 1692-1701

Abstract

This paper investigates the ability of genetic programming (GP) and adaptive neuro-fuzzy inference system (ANFIS) techniques to forecast groundwater depth. Five different GP and ANFIS models comprising various combinations of water table depth values from two stations, Bondville and Perry, are developed to forecast one-, two- and three-day ahead water table depths. The root mean square error (RMSE), scatter index (SI), variance accounted for (VAF) and coefficient of determination (R²) statistics are used to evaluate the accuracy of the models. The comparisons show that both the GP and ANFIS models can be employed successfully to forecast water table depth fluctuations. However, GP is superior to ANFIS in that it gives explicit expressions for the problem.

Introduction

Physically based numerical groundwater flow models are powerful tools for representing the high spatial and temporal variations of aquifers. However, this capability renders the models data intensive, and to achieve acceptable simulation and prediction performance, the properties and conditions of the groundwater system must be accurately represented within the model's space and time domains (Coppola et al., 2003; Feng et al., 2008). Because the properties and conditions of groundwater can never be ascertained with absolute accuracy, unavoidable discrepancies between the model and the real-world system reduce simulation accuracy and hinder efforts to manage groundwater resources appropriately (Coppola et al., 2005). Therefore, empirical models may be considered as alternative methods that can provide useful results without costly calibration time (Daliakopoulos et al., 2005; Box and Jenkins, 1976; Hipel and McLeod, 1994). However, these models have their own limitations: they are data demanding, and they are not adequate when the dynamical behavior of the hydrological system changes in time (Bierkens, 1998).

In the recent past, the use of Artificial Intelligence techniques, such as Genetic Programming (GP), Adaptive Neuro-Fuzzy Inference System (ANFIS) and Artificial Neural Networks (ANNs), has become viable. Coulibaly et al. (2001) applied ANNs to model monthly groundwater level fluctuations; Coppola et al. (2005) developed ANNs for accurately predicting potentiometric surface elevations; Daliakopoulos et al. (2005) applied ANNs to forecast groundwater levels; Szidarovszky et al. (2007) introduced a hybrid ANN-numerical model for groundwater problems; Coppola et al. (2007) combined ANN modeling with multi-objective optimization for a complicated real-world groundwater management problem in New Jersey; Feng et al. (2008) applied ANNs to investigate the effects of human activities on regional groundwater levels; and Yang et al. (2009) applied ANNs to forecast groundwater levels in Western Jilin Province, China.

The focus of the current paper is on the application of GP and ANFIS data-driven models to forecast groundwater table depth time series. The methodology of GP was first proposed by Koza (1992) as a generalization of Genetic Algorithms (GAs) (Goldberg, 1989). The fundamental difference between GP and GAs lies in the nature of the individuals: in GAs, individuals are linear strings of fixed length (chromosomes), while in GP, individuals are nonlinear entities of different sizes and shapes (parse trees). Major advantages of GP are that it can be applied to areas where (a) the interrelationships among the relevant variables are poorly understood (or where it is suspected that the current understanding may well be less than satisfactory), (b) finding the ultimate solution is hard, (c) conventional mathematical analysis does not, or cannot, provide analytical solutions, (d) an approximate solution is acceptable (or is the only result that is ever likely to be obtained), (e) small improvements in performance are routinely measured (or easily measurable) and highly valued, and (f) there is a large amount of data, in computer-readable form, that requires examination, classification, and integration (such as satellite observations) (Banzhaf et al., 1998).

Effective data-driven neuro-fuzzy models have also received more attention in the recent past. ANFIS was first introduced by Jang (1993), Jang and Sun (1995) and Jang et al. (1997), and was later widely applied to engineering problems. Jang (1993) introduced an architecture and a learning procedure for Fuzzy Inference Systems (FIS) that uses a neural network learning algorithm to construct a set of fuzzy if-then rules with appropriate membership functions (MFs) from specified input–output pairs. This procedure is called an adaptive network-based fuzzy inference system (ANFIS).

There are two main approaches to fuzzy inference systems, namely those of Mamdani (Mamdani and Assilian, 1975) and Sugeno (Takagi and Sugeno, 1985). The differences between the two approaches arise from the consequent part: Mamdani's approach uses fuzzy membership functions, whereas Sugeno's approach uses linear or constant functions. The neuro-fuzzy model used in this study implements Sugeno's fuzzy approach (Takagi and Sugeno, 1985) to obtain the values of the output variable from those of the input variables. For a given input–output data set, various Sugeno models may be developed by using different identification methods (e.g., grid partitioning, subtractive clustering and Gustafson–Kessel clustering). However, recent research has shown that the type of identification method does not substantially affect the results (Vernieuwe et al., 2005). Therefore, the commonly used grid partitioning identification method was applied for constructing the neuro-fuzzy models in this paper. The grid partitioning method partitions each antecedent variable independently by defining the membership functions of all antecedent variables. A major problem with this method is that the membership functions of each variable are constructed independently of the others, hence the relationship between the variables is ignored.
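To make the Sugeno approach with grid partitioning more concrete, the following is a minimal Python sketch of the inference step only. It is not the authors' implementation: the Gaussian membership functions, the product rule for firing strengths, and all names (`gaussian_mf`, `sugeno_predict`, the example parameter values) are assumptions chosen for illustration, and the ANFIS learning procedure that would tune these parameters is not shown.

```python
import itertools
import math


def gaussian_mf(x, center, sigma):
    """Gaussian membership degree of x (assumed MF shape, for illustration)."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def sugeno_predict(inputs, mfs, consequents):
    """First-order Sugeno inference over a grid partition.

    inputs      : crisp input values, one per antecedent variable
    mfs         : per variable, a list of (center, sigma) membership functions
    consequents : dict mapping a rule (tuple of MF indices, one per variable)
                  to (coefficients, bias) of its linear consequent
    """
    weighted_sum = 0.0
    weight_total = 0.0
    # Grid partitioning: one rule for every combination of membership functions.
    for rule in itertools.product(*(range(len(m)) for m in mfs)):
        # Firing strength of the rule: product of the membership degrees.
        strength = 1.0
        for x, var_mfs, k in zip(inputs, mfs, rule):
            center, sigma = var_mfs[k]
            strength *= gaussian_mf(x, center, sigma)
        # Sugeno consequent: a linear (or constant) function of the inputs.
        coeffs, bias = consequents[rule]
        rule_output = sum(a * x for a, x in zip(coeffs, inputs)) + bias
        weighted_sum += strength * rule_output
        weight_total += strength
    # Crisp output: firing-strength-weighted average of the rule outputs.
    return weighted_sum / weight_total


# Hypothetical setup: two inputs, two MFs each ("shallow", "deep"), so 2 x 2 = 4 rules.
mfs = [[(2.0, 1.0), (4.0, 1.0)], [(2.0, 1.0), (4.0, 1.0)]]
# Every rule uses the same made-up consequent y = 1.0 * x1 + 0.0 * x2 + 0.0.
consequents = {
    rule: ((1.0, 0.0), 0.0) for rule in itertools.product(range(2), repeat=2)
}
print(sugeno_predict([3.2, 3.1], mfs, consequents))
```

Because each variable is partitioned on its own, the number of rules is the product of the number of membership functions per variable, which is why the rule loop is written as a Cartesian product.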

One of the strong points of GP over other data-driven techniques (e.g., ANFIS) is that it can produce explicit formulations (model expressions) of the relationship that governs the physical phenomenon. Such expressions may be open to physical interpretation. The comprehensibility of GP models is also a way to reduce the risk of over-fitting to the training data and to improve the generalization of the resulting models. In this way, one may perform knowledge discovery using GP, confirming well-known physical relationships and evolving interesting new formulae for particular case studies.
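The idea that a GP individual is a parse tree that can be read back as an explicit formula can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the nested-tuple representation, the function set, the variable names `d_t` and `d_t1`, and the example tree are all hypothetical and are not an expression evolved in this study. The evolutionary search itself (selection, crossover, mutation) is not shown.

```python
import operator

# Assumed function set for illustration; real GP runs choose their own.
FUNCTIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(node, variables):
    """Evaluate a parse tree given as nested tuples (op, left, right)."""
    if isinstance(node, tuple):
        op, left, right = node
        return FUNCTIONS[op](evaluate(left, variables), evaluate(right, variables))
    if isinstance(node, str):
        return variables[node]  # terminal: input variable
    return node  # terminal: numeric constant


def to_expression(node):
    """Render the tree as an explicit, human-readable formula."""
    if isinstance(node, tuple):
        op, left, right = node
        return f"({to_expression(left)} {op} {to_expression(right)})"
    return str(node)


def size(node):
    """Number of nodes; GP individuals may differ in size and shape."""
    if isinstance(node, tuple):
        return 1 + size(node[1]) + size(node[2])
    return 1


# Hypothetical individual: today's depth plus half of the last daily change.
tree = ("+", "d_t", ("*", 0.5, ("-", "d_t", "d_t1")))
print(to_expression(tree))
print(evaluate(tree, {"d_t": 3.0, "d_t1": 2.8}))
```

Unlike a fixed-length GA chromosome, two such trees can have different numbers of nodes, and the printed expression is the kind of explicit model a reader can inspect directly.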

A review of all the applications of GP and ANFIS in hydrology and water resources engineering is beyond the scope of this paper, and only a limited number of studies are discussed here. Babovic et al. (2002) applied GP for modeling of risks in water supply. Aytek and Alp (2008) applied GP to rainfall-runoff modeling. Aytek and Kisi (2008) applied GP to suspended sediment transport in streams. Ghorbani et al. (2010) applied GP to forecast averaged sea water level values. Kisi and Shiri (2010) applied GP and ANFIS techniques for predicting short-term and long-term river flow.

Kisi (2005) estimated suspended sediment using neuro-fuzzy and neural network approaches. Kisi (2006) proposed a neuro-fuzzy computing technique for daily pan evaporation modeling. Partal and Kisi (2007) proposed a new wavelet-neuro-fuzzy conjunction model for precipitation forecast. Kisi (2009) applied evolutionary fuzzy models for river suspended sediment concentration estimation.

To the best of the authors' knowledge, no study has been carried out to predict groundwater table fluctuations using GP and ANFIS. This provides the impetus for the current work. The aim of this study is to apply and compare GP and ANFIS for forecasting short-term daily groundwater table depths. It is relevant to remark that the models investigated here are normally applied within deterministic frameworks in professional practice, which has encouraged the practice of comparing actual with predicted values. However, this is a black-and-white approach to judging the merits of a method and does not necessarily measure the impact on the decision.
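The forecasting setup described in the abstract (combinations of past water table depths from the two stations as inputs, and the depth one, two or three days ahead as the target) can be sketched as follows. This is a hedged illustration, not the paper's procedure: the function name `make_lagged_samples`, the choice of lags, which station serves as the target, and the toy numbers are all assumptions, since the actual input combinations are not listed in this excerpt.

```python
def make_lagged_samples(target_series, other_series, target_lags, other_lags, horizon):
    """Build (inputs, target) pairs from two aligned daily series.

    target_lags / other_lags : lags in days (0 means the current day t)
    horizon                  : forecast lead time in days (e.g. 1, 2 or 3)
    The target is target_series at day t + horizon.
    """
    n = min(len(target_series), len(other_series))
    max_lag = max(list(target_lags) + list(other_lags))
    samples = []
    for t in range(max_lag, n - horizon):
        inputs = [target_series[t - k] for k in target_lags]
        inputs += [other_series[t - k] for k in other_lags]
        samples.append((inputs, target_series[t + horizon]))
    return samples


# Toy, made-up depth values (m); not data from the study.
station_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
station_b = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]

# One hypothetical input combination: station A at t and t-1, station B at t,
# used to forecast station A two days ahead.
samples = make_lagged_samples(station_a, station_b, [0, 1], [0], horizon=2)
for inputs, target in samples:
    print(inputs, "->", target)
```

Changing the lag lists gives a different input combination, and changing `horizon` switches between one-, two- and three-day ahead targets, so the same helper could feed either a GP or an ANFIS model.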

Section snippets

Used data

The data set used in this study was obtained from the Illinois State Water Survey, U.S. (www.isws.illinois.edu/data.asp). The time series of daily depth to water table records from two wells were used: Bondville (station no: 421832; FIPS code: 019; Latitude: 40°05′N; Longitude: 88°37′W; Altitude: 213 m) and Perry (station no: 421843; FIPS Code: 149; Latitude: 39°80′N; Longitude: 90°83′W; Altitude: 213 m). Groundwater levels are monitored continuously with Stevens Type-F paper chart recorders. The

Statistical measures and model implementations

Four statistical evaluation criteria were used to assess the model performance

(1) The coefficient of determination (R²), which ranges between 0 and 1, with higher values indicating better model performance. Legates and McCabe (1999) argued that this indicator should not be used as the only fitness measure. Therefore, it is appropriate to also quantify the error in the same units as the variables, as discussed by Legates and McCabe (1999). One of these measures is
(2) the root mean square error (
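The four criteria named in the abstract (R², RMSE, SI and VAF) can be computed as in the Python sketch below. The formulas here are commonly used definitions and are an assumption, because the paper's own equations are cut off in this excerpt: R² is taken as the squared Pearson correlation, SI as RMSE divided by the mean of the observations, and VAF as (1 − var(residuals)/var(observations)) × 100. All function names are illustrative.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def _variance(values):
    m = _mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def rmse(observed, predicted):
    """Root mean square error, in the same units as the variable."""
    return math.sqrt(_mean([(o - p) ** 2 for o, p in zip(observed, predicted)]))


def scatter_index(observed, predicted):
    """Assumed definition: RMSE normalized by the mean observation."""
    return rmse(observed, predicted) / _mean(observed)


def vaf(observed, predicted):
    """Assumed definition: variance accounted for, in percent."""
    residuals = [o - p for o, p in zip(observed, predicted)]
    return (1.0 - _variance(residuals) / _variance(observed)) * 100.0


def r_squared(observed, predicted):
    """Assumed definition: squared Pearson correlation (between 0 and 1)."""
    mo, mp = _mean(observed), _mean(predicted)
    cov = _mean([(o - mo) * (p - mp) for o, p in zip(observed, predicted)])
    return cov ** 2 / (_variance(observed) * _variance(predicted))


# Made-up values: the prediction is the observation shifted by a constant 1.0.
observed = [1.0, 2.0, 3.0, 4.0]
biased = [2.0, 3.0, 4.0, 5.0]
print(r_squared(observed, biased), vaf(observed, biased))
print(rmse(observed, biased), scatter_index(observed, biased))
```

With these definitions, a constant bias leaves R² and VAF at their best values while RMSE and SI register the error, which illustrates why R² is not recommended as the only fitness measure.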

Conclusions

The accuracy of GEP and ANFIS techniques in forecasting short-term (one-, two- and three-day ahead) ground water depth has been investigated in the present study. The inter-comparison of the results obtained using GEP and ANFIS indicated that the GEP models performed slightly better than the ANFIS models in forecasting ground water depths. It can be concluded that both the GEP and ANFIS models can be considered as promising tools for forecasting daily groundwater depths, based on previously

Acknowledgments

The data set used in this study was obtained from the website of the Illinois State Water Survey, U.S. The authors are grateful to the staff of the Illinois State Water Survey who were involved in data observation, processing and management of the website.

References (41)

A. Aytek et al.
A genetic programming approach to suspended sediment modeling
Journal of Hydrology
(2008)
V. Babovic et al.
A data mining approach to modeling of water supply assessment
Urban Water
(2002)
G.J. Bowden et al.
Input determination for neural network models in water resources applications. Part1—background and methodologies
Journal of Hydrology
(2005)
I. Daliakopoulos et al.
Ground water level forecasting using artificial neural networks
Journal of Hydrology
(2005)
O. Kisi
Daily pan evaporation modeling using a neuro-fuzzy computing technique
Journal of Hydrology
(2006)
O. Kisi
Evolutionary fuzzy models for river suspended sediment concentration estimation
Journal of Hydrology
(2009)
E.H. Mamdani et al.
An experiment in linguistic synthesis with a fuzzy logic controller
International Journal of Man–Machine Studies
(1975)
T. Partal et al.
Wavelet and neuro-fuzzy conjunction model for precipitation forecasting
Journal of Hydrology
(2007)
H. Vernieuwe et al.
Comparison of data-driven Takagi-Sugeno models of rainfall-discharge dynamics
Journal of Hydrology
(2005)
Z.P. Yang et al.
Application and comparison of two prediction models for groundwater levels: a case study in Western Jilin Province, China
Journal of Arid Environments
(2009)

A. Aytek et al.
An application of artificial intelligence for rainfall runoff modeling
Journal of Earth Systems Science
(2008)
W. Banzhaf et al.
Genetic programming
(1998)
M.F.P. Bierkens
Modeling water table fluctuations by means of a stochastic differential equation
Water Resources Research
(1998)
G.E.P. Box et al.
Time series analysis: forecasting and control
(1976)
E. Coppola et al.
Artificial neural network approach for predicting transient water levels multi layered ground water system under variable state, pumping and climatic conditions
Journal of Hydrological Engineering
(2003)
E. Coppola et al.
A neural network model for predicting aquifer water level elevations
Ground Water
(2005)
E. Coppola et al.
Multi objective analysis of a public wellfield using artificial neural networks
Ground Water
(2007)
P. Coulibaly et al.
Artificial neural network modeling of water table depth fluctuations
Water Resources Research
(2001)
S. Feng et al.
Neural networks to simulate regional groundwater levels affected by human activities
Ground Water
(2008)
C. Ferreira
Gene expression programming: a new adaptive algorithm for solving problems
Complex Systems
(2001)
