Statistical downscaling of watershed precipitation using Gene Expression Programming (GEP)
Highlights
► We present a GEP model as a new tool for precipitation downscaling at the watershed scale. ► The new model was compared with SDSM, an existing model of a similar nature. ► Our new model proves to be simpler and more efficient than SDSM.
Introduction
Investigation of the hydrological impacts of climate change at the regional or local scale requires the use of a statistical downscaling technique rather than dynamical downscaling because such investigations often need point-specific information (Timbal et al., 2009). Downscaling is the process of transforming the coarse spatial resolution of Global Climate Model (GCM) information into a form suitable for direct use in many types of climate impact models. On the basis of its working principle, it can be broadly categorized as statistical or dynamical. A number of studies provide detailed discussions of both categories, such as Xu (1999), Wilby et al. (2004) and Christenson et al. (2007). Significant progress has already been made in the development of new statistical downscaling techniques. These techniques can be grouped into three major classes: (1) weather generator, (2) weather typing and (3) multiple regression. The third class involves the development of relationships between large-scale climatic parameters (predictors), such as geopotential height and mean sea level pressure, and local variables (predictands), such as temperature and precipitation. Traditionally, this has been done through multiple linear regression, with or without data pre-processing techniques aimed at reducing the dimensionality of the problem (Huth, 1999, Wilby et al., 2002). These pre-processing techniques include principal component analysis (e.g. Schoof and Pryor, 2001) and canonical correlation analysis (e.g. Busuioc et al., 2008). When the variable of interest is precipitation, the predictor-predictand relationships are often very complex and methods based on linear regression may not work well. For this reason, a number of non-linear regression downscaling schemes and the use of Artificial Neural Networks (ANNs) have been introduced.
Recent studies have shown that statistical downscaling schemes based on ANN models can provide good non-linear regression models (e.g. Mpelasoka et al., 2001, Haylock et al., 2006) and can be considered as global correlation detectors. ANN-based models are often complex, so their solutions are not easily interpretable in a physically meaningful way, and they can suffer significantly from conventional modelling problems such as becoming trapped in a local optimum during calibration (Tripathi et al., 2006).
Another soft computing technique known as Genetic Programming (GP) (Koza, 1994) has emerged as a popular tool in recent years. In the field of water resources, GP has a number of applications. It has been used to model groundwater related problems (Hong and Rosen, 2002). It has also been used as a rainfall-runoff modelling tool on a number of occasions, which are not fully reviewed here. Previous studies show that GP is more efficient than ANNs in the area of rainfall-runoff modelling (e.g. Savic et al., 1999, Whigham and Crapper, 2001, Babovic and Keijzer, 2002, Liong et al., 2002 and Aytek et al., 2008). In the context of downscaling for climate change studies, the use of GP (or its variants) is not widespread. The advantage of GP over ANNs for developing downscaling functions is that it provides efficient and transparent modelling solutions. As far as the authors are aware, the first attempt to use GP as a downscaling tool was made by Coulibaly (2004). The target parameter in that study was temperature, and testing GP on the more challenging task of precipitation downscaling was recommended. This study follows up on that recommendation. A precipitation downscaling case study is presented here for the Clutha watershed in New Zealand, using a non-linear multiple regression model developed by the authors using Gene Expression Programming (GEP), which is a variant of GP (Ferreira, 2001). While some of the results of the GEP model used in this study were presented previously (Hashmi et al., 2009), complete modelling details were not provided at that time. The main focus of the present study is to: i) fully explain the GEP modelling process in the context of watershed precipitation downscaling; and ii) evaluate and compare the performance of our GEP model with a popular downscaling model of a similar nature.
Section snippets
Study site and data
This study is focused on the Clutha River watershed, upstream of the Balclutha stream flow gauge, located in the South Island of New Zealand (Fig. 1). The Clutha is the largest river by volume and the second longest river in New Zealand, having a length of 322 km. As discussed by Murray (1975), the precipitation in this watershed is strongly influenced by the mountains of Fiordland and the Southern Alps (see Fig. 1). As far as the distribution of the monthly precipitation is concerned, recorded
Methodology
Symbolic regression is a type of nonparametric regression which relates the predictor and predictand variables in the form of a function. This function is not specified a priori, but is constrained to contain a number of mathematical or logical expressions chosen from a pre-selected set of mathematical expressions (symbols) and predictor variables. In symbolic regression, genetic programming, which is based on Darwin’s theory of evolution, is used to obtain the optimum set of symbols and
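As a rough illustration of evolutionary symbolic regression, the sketch below evolves fixed-length genes with a head (functions or terminals) and a tail (terminals only), decodes each gene breadth-first into an expression tree, and keeps the genes with the lowest mean squared error. It is a toy under stated assumptions and not the authors' model: the predictor names `x0` and `x1`, the synthetic target, the function set, and all parameter values are hypothetical, and only point mutation with elitist selection is used, whereas full GEP (Ferreira, 2001) also applies transposition and recombination operators.

```python
import random

# Function set (all binary); terminals are hypothetical predictor names.
FUNCS = {"+": lambda a, b: a + b,
         "-": lambda a, b: a - b,
         "*": lambda a, b: a * b}
TERMS = ["x0", "x1"]
HEAD = 5                    # head positions may hold functions or terminals
TAIL = HEAD * (2 - 1) + 1   # tail holds terminals only (maximum arity is 2)


def random_gene(rng):
    head = [rng.choice(list(FUNCS) + TERMS) for _ in range(HEAD)]
    tail = [rng.choice(TERMS) for _ in range(TAIL)]
    return head + tail


def decode(gene):
    """Read the gene breadth-first into a nested [symbol, children] tree."""
    root = [gene[0], []]
    queue, pos = [root], 1
    while queue:
        node = queue.pop(0)
        if node[0] in FUNCS:
            for _ in range(2):
                child = [gene[pos], []]
                pos += 1
                node[1].append(child)
                queue.append(child)
    return root


def evaluate(node, row):
    """Evaluate a decoded tree for one row of predictor values."""
    sym, kids = node
    if sym in FUNCS:
        return FUNCS[sym](evaluate(kids[0], row), evaluate(kids[1], row))
    return row[sym]


def mse(gene, data):
    tree = decode(gene)
    return sum((evaluate(tree, row) - y) ** 2 for row, y in data) / len(data)


def mutate(gene, rng, rate=0.1):
    """Point mutation that never places a function in the tail."""
    out = list(gene)
    for i in range(len(out)):
        if rng.random() < rate:
            pool = list(FUNCS) + TERMS if i < HEAD else TERMS
            out[i] = rng.choice(pool)
    return out


def evolve(data, generations=40, pop_size=60, seed=0):
    rng = random.Random(seed)
    pop = [random_gene(rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda g: mse(g, data))
        history.append(mse(pop[0], data))
        parents = pop[: pop_size // 4]
        # Elitism: the best gene is copied unchanged, so the best error
        # can never get worse from one generation to the next.
        pop = [pop[0]] + [mutate(rng.choice(parents), rng)
                          for _ in range(pop_size - 1)]
    best = min(pop, key=lambda g: mse(g, data))
    history.append(mse(best, data))
    return best, history


# Synthetic, noise-free target (an assumption for illustration): y = x0*x1 + x0
GRID = (-2.0, -1.0, 0.0, 1.0, 2.0)
DATA = [({"x0": a, "x1": b}, a * b + a) for a in GRID for b in GRID]

best, history = evolve(DATA)
print("best gene:", best)
print("best MSE per generation (first and last):", history[0], history[-1])
```

Restricting the tail to terminals is what guarantees that every gene decodes to a complete expression: with at most `HEAD` binary functions, a tree needs at most `2 * HEAD + 1` symbols, which is exactly the gene length.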
Results and discussion
In the GEP-based symbolic regression the most important predictors are selected automatically from the set of all of the independent variables (predictors). The best evolved model in our case contained seven predictors, these being considered as the most important among the total of 26 predictors. Further details about this model are given in Appendix A. Table 5 shows the full list of the predictors with their short names as used in the best model. The predictors shown in bold in Table 5 were
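One way to picture automatic predictor selection is to read off which predictors actually appear in an evolved expression. The sketch below does this for an expression stored as nested tuples. The candidate names `p1` to `p5` and the example expression are hypothetical placeholders and do not correspond to the predictors in Table 5 or to the model in Appendix A.

```python
def used_predictors(expr):
    """Return the set of predictor names appearing in a nested-tuple expression.

    An expression is either a predictor name (str), a numeric constant,
    or a tuple of the form (operator, argument, argument, ...).
    """
    if isinstance(expr, str):
        return {expr}
    if isinstance(expr, tuple):
        _op, *args = expr          # skip the operator symbol itself
        found = set()
        for arg in args:
            found |= used_predictors(arg)
        return found
    return set()                   # numeric constants contribute no predictors


# Hypothetical candidate pool and evolved model: (p2 * p5) + (p2 - 0.5)
CANDIDATES = ["p1", "p2", "p3", "p4", "p5"]
MODEL = ("+", ("*", "p2", "p5"), ("-", "p2", 0.5))

selected = sorted(used_predictors(MODEL))
unused = [p for p in CANDIDATES if p not in selected]
print("selected:", selected)
print("unused:", unused)
```

Predictors that never survive into the best expression are, in effect, discarded by the evolutionary search rather than by a separate screening step.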
Conclusions
A comparative statistical downscaling analysis was undertaken to evaluate the GEP-based symbolic regression as a statistical downscaling tool. Daily precipitation time series of the Clutha Watershed in New Zealand were downscaled using GEP. The SDSM model results were used as a benchmark for analysing the downscaled results of the GEP model. The results were analysed visually by using scatter plots and also in terms of parameters such as the number of predictors used by the downscaling model,
Acknowledgements
The authors of this paper are thankful to the National Institute of Water and Atmospheric Research (NIWA), New Zealand, for providing the precipitation data used in this study and the Higher Education Commission (HEC) of Pakistan for funding this research work.
References (31)
- et al. (2002). Identification of an urban fractured-rock aquifer dynamics using an evolutionary self-organizing modelling. Journal of Hydrology.
- et al. (2009). Generalization of a statistical downscaling model to provide local climate change projections for Australia. Environmental Modelling and Software.
- et al. (2006). Downscaling of precipitation for climate change scenarios: a support vector machine approach. Journal of Hydrology.
- et al. (2001). Modelling rainfall-runoff using genetic programming. Mathematical and Computer Modelling.
- et al. (2002). SDSM - a decision support tool for the assessment of regional climate change impacts. Environmental Modelling and Software.
- et al. (2008). An application of artificial intelligence for rainfall-runoff modeling. Journal of Earth System Science.
- et al. (2002). Rainfall-runoff modelling based on genetic programming. Nordic Hydrology.
- et al. (2008). Empirical-Statistical Downscaling.
- et al. (2008). Statistical downscaling model based on canonical correlation analysis for winter extreme precipitation events in the Emilia-Romagna region. International Journal of Climatology.
- et al. (2007). Evaluating the performance and utility of regional climate models: the PRUDENCE project. Climatic Change.
- The internal climate variability of HadCM3, a version of the Hadley centre coupled model without flux adjustments. Climate Dynamics.
- Downscaling daily extreme temperatures with genetic programming. Geophysical Research Letters.
- Gene expression programming: a new adaptive algorithm for solving problems. Complex Systems.
- Gene Expression Programming: Mathematical Modeling by an Artificial Intelligence.
- "What is GEP?" From GeneXproTools Tutorials - A Gepsoft Web Resource.