Elsevier

Journal of Hydrology

Volume 571, April 2019, Pages 406-415

Research papers
Pareto-optimal MPSA-MGGP: A new gene-annealing model for monthly rainfall forecasting

https://doi.org/10.1016/j.jhydrol.2019.02.003

Highlights

  • A new Pareto-optimal MPSA-MGGP model was proposed for rainfall forecasting.

  • The new model embeds the periodic patterns of rainfall time series in a Pareto-optimal multigene model.

  • The Pareto-optimal MPSA-MGGP is superior to GP, MGGP and SA-MGGP models.

  • The new model can be applied to one-month-ahead forecasting of monthly rainfall in semi-arid catchments.

Abstract

Rainfall is considered the hardest weather variable to forecast, and its cause-effect relationships often cannot be expressed in simple or complex mathematical forms. This study introduces a novel hybrid model for one-month-ahead forecasting of monthly rainfall amounts, intended for use in semi-arid basins. The new approach, called MPSA-MGGP, integrates a multi-period simulated annealing (MPSA) optimizer with multigene genetic programming (MGGP) symbolic regression so that the hybrid model embeds the periodic patterns of the rainfall time series in a Pareto-optimal multigene forecasting equation. The model was trained and verified using observed rainfall at two meteorology stations located in north-west Iran. The model accuracy was also cross-validated against two benchmarks: conventional genetic programming (GP) and MGGP. The results indicated that the proposed gene-annealing model provides a slight to moderate decline in absolute error as well as a noteworthy increase in the Nash-Sutcliffe coefficient of efficiency. Promising efficiency together with a parsimonious structure supports the use of the proposed model for monthly rainfall forecasting in practice, particularly in semi-arid regions.
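
The two accuracy measures named in the abstract can be illustrated with a minimal sketch. It assumes that "absolute error" refers to the mean absolute error; the function names and the sample rainfall values are hypothetical and are not taken from the study.

```python
def mean_absolute_error(observed, forecast):
    """Average magnitude of the forecast errors (same units as the rainfall)."""
    if len(observed) != len(forecast) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    return sum(abs(o - f) for o, f in zip(observed, forecast)) / len(observed)


def nash_sutcliffe_efficiency(observed, forecast):
    """NSE = 1 - SSE / (squared deviations of the observations from their mean)."""
    if len(observed) != len(forecast) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - f) ** 2 for o, f in zip(observed, forecast))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0:
        raise ValueError("NSE is undefined for a constant observed series")
    return 1.0 - sse / spread


# Hypothetical monthly rainfall totals (mm), for illustration only.
observed = [0.0, 12.0, 30.0, 18.0]
forecast = [2.0, 10.0, 26.0, 20.0]
print(mean_absolute_error(observed, forecast))
print(nash_sutcliffe_efficiency(observed, forecast))
```

An NSE of 1 indicates a perfect forecast, while an NSE of 0 means the forecast is no better than always predicting the observed mean, which is why a rise in NSE alongside a fall in absolute error is read as an improvement.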

Introduction

Medium- to long-term forecasts of rainfall amount are required for many practical purposes in watershed management such as optimal irrigation, food production, water allocation, mine operations, management of water infrastructure, and flood preventive measures. However, rainfall forecasting is known as one of the most challenging tasks in the forecasting community (Mekanik et al., 2013, Abbot and Marohasy, 2014, Feng et al., 2015, Farajzadeh and Alizadeh, 2018). To capture the underlying process, a number of physically-based and stochastic/probabilistic models and approaches have been established in the literature. Stochastic modelling of rainfall events for long time scales such as monthly and seasonal has been attempted in earlier studies using classical time series modelling approaches such as auto regressive integrated moving average (ARIMA), seasonal ARIMA (SARIMA), and periodic autoregressive moving average (PARMA) (e.g., Delleur and Kavvas, 1978, Kaushik and Singh, 2008). Despite being popular, these are basically linear models and are incapable of truly capturing the irregularities of rainfall (Cramer et al., 2018). They can be applied to stationary time series when month-to-month or season-to-season correlations do not vary throughout the year (Salas et al., 2003, Nourani et al., 2009).

Machine learning (ML) methods are seen as robust alternatives and have become more popular over recent years. ML methods commonly applied to rainfall forecasting include artificial neural networks (ANNs, Nasseri et al., 2008, Aksoy and Dahamsheh, 2009, Moustris et al., 2011), fuzzy logic (FL, Pongracz et al., 2001), support vector regression (SVR, Chau and Wu, 2010; Danandeh Mehr et al., 2019), and genetic programming (GP, Kisi and Shiri, 2011; Danandeh Mehr et al., 2017). Despite their flexibility, recent studies have shown that stand-alone ML methods are not suitable enough for rainfall forecasting at long time scales, particularly in arid and semi-arid regions where the rainfall time series is highly discontinuous and the probability of zero values is not negligible. Therefore, hybrid ML methods such as wavelet-ANN (Nourani et al., 2009, Wu et al., 2010), wavelet-SVR (Kisi and Cimen, 2012), wavelet-GP (Kisi and Shiri, 2011), adaptive neuro-fuzzy inference system (ANFIS; Partal and Kişi, 2007, Mekanik et al., 2016), and wavelet-least square-SVR (Farajzadeh and Alizadeh, 2018) were developed and recommended. For instance, singular spectrum analysis (SSA) was used by Sivapragasam et al. (2001) to improve the forecasting accuracy of SVR. A genetic algorithm (GA) was utilized to optimize ANN-based rainfall forecasting model structures (Nasseri et al., 2008, Saxena et al., 2014). Chau and Wu (2010) showed that combining SSA and fuzzy C-means clustering with ANNs and SVR provides considerable accuracy in rainfall forecasting. Nourani et al. (2009) showed that a hybrid wavelet-ANN conjunction model may be effectively used for one-month-ahead rainfall forecasting at Ligvan Basin, Iran. Kisi and Cimen (2012) studied the effect of wavelet decomposition on SVR-based rainfall forecasting at two meteorology stations in Turkey, and demonstrated that the hybrid model is able to statistically outperform the stand-alone SVR and ANN models.
Zhu and Wu (2013) suggested a hybrid optimization algorithm in which GA is combined with simulated annealing (SA) to simultaneously choose the proper input variables and optimize SVM parameters for monthly rainfall forecasting in Guilin of Guangxi, China. Solgi et al. (2014) showed that a wavelet-ANN conjunction model is superior to ANFIS for forecasting rainfall with a one-day-ahead lead time at a rain gauge station in Iran. In a recent study, a firefly optimization technique was applied by Yaseen et al. (2018) to increase the accuracy of ANFIS-based monthly rainfall forecasting in the Pahang River basin, Malaysia. The results demonstrated the superiority of the hybrid model over the stand-alone ANFIS.

In hydrological studies, SA has been widely used as a classic adaptive optimization algorithm to solve a range of optimization problems. In an early study, Pardo-Igúzquiza (1998) presented a new approach to optimal network design for estimating the areal average of rainfall events. The author used SA to optimize an objective function that includes both the accuracy of the areal mean estimation and the economic cost of data collection. In another study, Wang et al. (2010) applied a neural network-based wavelet function to model the rainfall-runoff process. The authors combined SA with ANN and showed that the hybrid SA-ANN approach enables the prediction model to reach global optimal solutions and performs very well in mapping non-linear relations in the data. SA has also been combined with GA to improve rainfall-runoff forecasting models. For instance, Ding et al. (2012) developed a hybrid optimization strategy that integrates the SA search methodology into GA in order to train and optimize the network architecture and connection weights of ANNs for rainfall-runoff forecasting in a catchment area. In a similar study, Pan and Wu (2014) used the hybrid SA-GA algorithm to simultaneously choose appropriate input variables and optimize all SVR parameters for daily rainfall-runoff modelling.

One-month-ahead rainfall forecasts, as a medium- to long-term scenario, are beneficial for many activities in the planning and operation of water resource systems such as reservoir management, drought monitoring, and food production. On the other hand, forecasts with shorter lead times (up to a day) are particularly required to drive hydrological rainfall-runoff models and to operate flood warning systems. The majority of published works have so far focused on rainfall forecasting models at short time scales. By contrast, only a few works have investigated the efficiency of hybrid ML techniques for long lead time rainfall forecasting (e.g., Mekanik et al., 2013, Abbot and Marohasy, 2014, Yaseen et al., 2018, Farajzadeh and Alizadeh, 2018), and additional research is therefore required to model the stochastic nature of rainfall events on a long-term basis.

The review above shows that accurate forecasting of monthly rainfall amounts is usually difficult because of the underlying nonlinear interrelations between rainfall and its preceding amounts. The task is even more difficult in arid and semi-arid regions, where the probability of zero values is not negligible. The objective of the present study is therefore to propose a new hybrid model that enhances the ability of available AI methods to forecast monthly rainfall amounts in heavily localized semi-arid areas. Furthermore, the parsimony of rainfall forecasting models is, for the first time, taken into account as an additional goal, so that the proposed model is suitable for use in practice. To achieve these goals, the SA optimizer is integrated with the multi-gene GP (MGGP) technique to detect the underlying non-linear relations more precisely. Despite the various applications of SA in hydrological studies, to the best of the authors’ knowledge, this is the first study that couples SA with MGGP to improve the forecasting accuracy of MGGP-based rainfall forecasting models. The proposed model, called MPSA-MGGP, is trained and validated using monthly rainfall amounts from two meteorology stations in north-west Iran that reflect the rainfall pattern of semi-arid regions. The step-by-step modelling procedure of the proposed MPSA-MGGP model is presented in the next section, after a brief overview of the fundamentals of the SA and MGGP methods.
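
Forecasting rainfall one month ahead from its preceding amounts amounts to pairing each monthly total with the values recorded just before it. The sketch below illustrates that data layout only; the function name, the number of lags, and the sample series are hypothetical, and the lag structure used in the study is not stated in the text above.

```python
def make_lagged_samples(series, n_lags):
    """Pair each target value with the n_lags values that precede it."""
    if n_lags < 1 or n_lags >= len(series):
        raise ValueError("n_lags must be between 1 and len(series) - 1")
    inputs, targets = [], []
    for t in range(n_lags, len(series)):
        inputs.append(list(series[t - n_lags:t]))  # R(t - n_lags) ... R(t - 1)
        targets.append(series[t])                  # R(t), the month to forecast
    return inputs, targets


# Hypothetical monthly totals (mm) with dry months, as in a semi-arid record.
rainfall = [5.0, 0.0, 0.0, 22.0, 40.0, 13.0]
X, y = make_lagged_samples(rainfall, n_lags=2)
for row, target in zip(X, y):
    print(row, "->", target)
```

Any regression method, including GP or MGGP, can then be trained to map each input row to its target.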

Section snippets

Simulated annealing (SA)

SA (Kirkpatrick et al., 1983) is a classic adaptive optimization algorithm that has roots in statistical mechanics. Statistical mechanics models and analyzes the behavior of systems with a large number of atoms in a sample of liquid or solid matter. The goal of statistical mechanics is to predict and explain the aggregate properties of atoms in condensed matter during the annealing process, in which the temperature approaches the ground state; i.e., when the temperature decreases so much that the...
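
For orientation, a generic simulated annealing loop with the Metropolis acceptance rule might look like the sketch below. It is a textbook form of SA, not the multi-period variant (MPSA) developed in the study; the geometric cooling schedule, the parameter defaults, and the toy cost function are all assumptions made for illustration.

```python
import math
import random


def simulated_annealing(cost, neighbour, x0, t_start=1.0, t_end=1e-3,
                        cooling=0.95, steps_per_temp=50, seed=0):
    """Minimise cost(x), starting from x0, by generic simulated annealing."""
    rng = random.Random(seed)
    current, current_cost = x0, cost(x0)
    best, best_cost = current, current_cost
    temperature = t_start
    while temperature > t_end:
        for _ in range(steps_per_temp):
            candidate = neighbour(current, rng)
            candidate_cost = cost(candidate)
            delta = candidate_cost - current_cost
            # Metropolis criterion: always accept an improvement; accept a
            # worse candidate with probability exp(-delta / temperature).
            if delta <= 0 or rng.random() < math.exp(-delta / temperature):
                current, current_cost = candidate, candidate_cost
                if current_cost < best_cost:
                    best, best_cost = current, current_cost
        temperature *= cooling  # geometric cooling schedule (assumed)
    return best, best_cost


# Hypothetical test problem: minimise (x - 3)^2 starting far from the optimum.
best_x, best_cost = simulated_annealing(
    cost=lambda x: (x - 3.0) ** 2,
    neighbour=lambda x, rng: x + rng.uniform(-0.5, 0.5),
    x0=-10.0,
)
print(best_x, best_cost)
```

At high temperature the search accepts many worsening moves and explores widely; as the temperature falls, it behaves more and more like a greedy local search.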

Results and discussion

To obtain the best stand-alone GP and MGGP models (i.e., benchmarks) at each rain gauge station, two open-source GP toolboxes written in MATLAB, i.e., GPLABv4 (Silva and Almeida, 2003) and GPTIPS2 (Searson, 2015), were used in the present study. The former includes most of the standard GP rules in a highly modular structure that makes it a particularly versatile and easily extendable tool for creating classic GP solutions. By contrast, the latter is based on the multigene variant of GP that...
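
In multigene symbolic regression, the model output is commonly formed as a bias plus a weighted sum of several gene expressions, with the weights fitted by ordinary least squares. The sketch below illustrates that combination step in plain Python. It is not GPTIPS2 code: the genes are hand-written stand-ins for evolved trees, and all names and data are hypothetical.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: gene outputs are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_gene_weights(genes, inputs, targets):
    """Least-squares weights [bias, w1, ..., wk] for fixed gene expressions."""
    design = [[1.0] + [g(x) for g in genes] for x in inputs]
    k = len(design[0])
    # Normal equations: (D^T D) w = D^T y
    dtd = [[sum(row[i] * row[j] for row in design) for j in range(k)]
           for i in range(k)]
    dty = [sum(row[i] * y for row, y in zip(design, targets)) for i in range(k)]
    return solve_linear_system(dtd, dty)


def predict(genes, weights, x):
    """Bias plus the weighted sum of the gene outputs."""
    return weights[0] + sum(w * g(x) for w, g in zip(weights[1:], genes))


# Two hand-written genes standing in for evolved trees (hypothetical).
genes = [lambda x: x[0] * x[1], lambda x: math.sqrt(x[0])]
inputs = [[1.0, 2.0], [4.0, 1.0], [9.0, 3.0], [16.0, 0.5], [25.0, 2.0]]
targets = [2.0 + 0.5 * genes[0](x) - 1.5 * genes[1](x) for x in inputs]
weights = fit_gene_weights(genes, inputs, targets)
print(weights)
```

Because the targets here are generated from known weights, the fit should recover values close to 2.0, 0.5 and -1.5. Solving the normal equations directly is chosen for brevity; a QR-based or SVD-based solver is numerically safer when gene outputs are nearly collinear.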

Conclusion

In this study, a new Pareto-optimal multi-period simulated annealing multigene genetic programming model (MPSA-MGGP) was developed for one-month-ahead forecasting of total monthly rainfall (TMR) in a semi-arid region. Rainfall records from two rain gauges located in the Urmia Lake basin were used to train and verify the model. The original TMR observations were transformed into weakly stationary time series and fed into the MGGP engine to construct an initial Pareto-optimal multigene regression model that uses...
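
The idea of a Pareto-optimal model can be illustrated by filtering candidate models on two competing objectives. The sketch below assumes the objectives are model complexity and forecast error, both to be minimised; the candidate names, complexity scores, and error values are hypothetical, and the exact objectives used in the study are not stated in the text above.

```python
def pareto_front(models):
    """Return the models not dominated in (complexity, error); lower is better."""
    front = []
    for name, complexity, error in models:
        dominated = any(
            (c <= complexity and e <= error) and (c < complexity or e < error)
            for _, c, e in models
        )
        if not dominated:
            front.append((name, complexity, error))
    return sorted(front, key=lambda m: m[1])


# Hypothetical candidates: (name, complexity score, validation error).
candidates = [
    ("A", 5, 0.40),
    ("B", 12, 0.31),
    ("C", 12, 0.35),  # dominated by B
    ("D", 30, 0.30),
    ("E", 40, 0.33),  # dominated by B and D
]
for name, complexity, error in pareto_front(candidates):
    print(name, complexity, error)
```

A parsimonious choice would typically favour a model such as B, which is far simpler than D at nearly the same error.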

Conflict of interest

None.

Acknowledgments

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors. The rainfall data used in the present study were obtained from the Iran Meteorological Service (www.irimo.ir). The authors are thankful to the editors and four anonymous reviewers for the time they devoted to reviewing the initial versions of this paper.

References (48)

  • T. Partal et al. Wavelet and neuro-fuzzy conjunction model for precipitation forecasting. J. Hydrol. (2007)

  • R. Pongracz et al. Fuzzy rule-based prediction of monthly precipitation. Phys. Chem. Earth, Part B: Hydrol., Oceans Atmos. (2001)

  • C.L. Wu et al. Prediction of rainfall time series using modular artificial neural networks coupled with data-preprocessing techniques. J. Hydrol. (2010)

  • H. Aksoy et al. Artificial neural network models for forecasting monthly precipitation in Jordan. Stoch. Env. Res. Risk Assess. (2009)

  • V. Babovic et al. Genetic programming as a model induction engine. J. Hydroinform. (2000)

  • K.W. Chau et al. A hybrid model coupled with singular spectrum analysis for daily rainfall prediction. J. Hydroinform. (2010)

  • K.W. Chau. Use of meta-heuristic techniques in rainfall-runoff modelling. Water (2017)

  • A. Danandeh Mehr et al. Season algorithm-multigene genetic programming: a new approach for rainfall-runoff modelling. Water Resour. Manage. (2018)

  • A.D. Danandeh Mehr et al. A hybrid support vector regression–firefly model for monthly rainfall forecasting. Int. J. Environ. Sci. Technol. (2019)

  • J.W. Delleur et al. Stochastic models for monthly rainfall forecasting and synthetic generation. J. Appl. Meteorol. (1978)

  • Ding, H., Wu, J., Li, X. 2012, June. Evolving neural network using hybrid genetic algorithm and simulated annealing for...

  • O. Eray et al. Comparison of multi-gene genetic programming and dynamic evolving neural-fuzzy inference system in modeling pan evaporation. Hydrol. Res. (2018)

  • J. Farajzadeh et al. A hybrid linear–nonlinear approach to predict the monthly rainfall over the Urmia Lake watershed using wavelet-SARIMAX-LSSVM conjugated model. J. Hydroinform. (2018)

  • Q. Feng et al. Wavelet analysis-support vector machine coupled models for monthly rainfall forecasting in arid regions. Water Resour. Manage. (2015)

Cited by (32)

    • A review on rainfall forecasting using ensemble learning techniques

      2023, e-Prime - Advances in Electrical Engineering, Electronics and Energy
    • Sediment transport with soft computing application for tropical rivers

      2022, Handbook of HydroInformatics: Volume III: Water Data Management Best Practices
    • Daily suspended sediment forecast by an integrated dynamic neural network

      2022, Journal of Hydrology
      Citation Excerpt :

      Consequently, for better representation, data pre-processing techniques (e.g. sampling, transformation, de-noising and normalization) are suggested to reformulate and reshape the raw signals. In a rainfall forecast study, Mehr et al. (2019) remove the non-stationary features by performing square root and standardization of the original data. Such a procedure produces a weak stationary signal that is easy to model, as the trend in the mean is separated and the variance is suppressed.

    • Genetic programming for streamflow forecasting: A concise review of univariate models with a case study

      2021, Advances in Streamflow Forecasting: From Traditional to Modern Approaches