Statistical downscaling of precipitation using machine learning techniques
Introduction
The assessment of water resources in catchments under a changing climate is important because the spatial and temporal variability of water resources is strongly influenced by changes in climate (Pascual et al., 2015). General Circulation Models (GCMs) are considered the most advanced tools available for obtaining global scale climate change projections of hydroclimatic variables (Bates et al., 2010). GCMs are forced with likely future greenhouse gas (GHG) emission scenarios to produce scenarios of the global climate likely to occur in the future. Owing to the coarse spatial scale at which GCMs operate, they cannot resolve sub-grid scale processes such as cloud physics and land surface processes, and the topography of the Earth is only coarsely represented within their structure (Iorio et al., 2004). Therefore, GCM projections cannot be readily used in catchment scale applications such as hydrologic modelling or water resources allocation modelling.
To bridge the spatial scale gap between coarse scale GCM outputs and catchment scale hydroclimatic variables, statistical and dynamic downscaling approaches have been developed (Wilby and Wigley, 1997). In statistical downscaling, empirical statistical relationships are developed between GCM outputs and catchment scale hydroclimatic variables (Benestad et al., 2008). In dynamic downscaling, physics based equations are used for the same purpose (Fowler and Wilby, 2010). Compared to dynamic downscaling, statistical downscaling has gained wide popularity due to its low computational cost and simplicity (Okkan and Inan, 2014; Rashid et al., 2015; Sachindra et al., 2016).
According to Wilby et al. (2004), statistical downscaling approaches can be further sub-divided into three categories: regression-based approaches, weather classification-based approaches and approaches based on weather generators. Of these three categories, regression-based approaches have gained the most popularity owing to their simplicity in application. The regression techniques widely used in statistical downscaling include Multiple Linear Regression (MLR) (Sachindra et al., 2014a), Generalized Linear Models (GLMs) (Beecham et al., 2014), Artificial Neural Networks (ANNs) (Tripathi et al., 2006; Ahmed et al., 2015), Support Vector Machine (SVM) (Sachindra et al., 2013; Goly et al., 2014), Relevance Vector Machine (RVM) (Ghosh and Mujumdar, 2008; Okkan and Inan, 2014), Genetic Programming (GP) (Coulibaly, 2004; Sachindra et al., 2018) and Gene Expression Programming (GEP) (Hashmi et al., 2011; Sachindra et al., 2016). Owing to their ability to learn from data and their implementation as computer algorithms, techniques such as ANN, SVM, RVM, and GP are often called machine learning techniques.
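As a rough illustration of the regression-based idea, the sketch below fits a one-predictor least-squares relationship between a large scale variable and station precipitation over a calibration period, then applies it to a validation period. It is a minimal sketch and not the procedure used in this study: the function names, the single predictor, the split and all numbers are hypothetical, and clipping negative predictions to zero is an assumption added here because precipitation cannot be negative.

```python
from statistics import mean


def fit_linear(predictor, predictand):
    """Ordinary least squares with one predictor; returns (slope, intercept)."""
    x_bar, y_bar = mean(predictor), mean(predictand)
    sxx = sum((x - x_bar) ** 2 for x in predictor)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(predictor, predictand))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def downscale(predictor, slope, intercept):
    """Apply the fitted relationship; negative values are clipped to zero."""
    return [max(0.0, slope * x + intercept) for x in predictor]


# Hypothetical values for illustration only, not data from the study.
large_scale = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]         # coarse scale predictor
station_mm = [12.0, 21.0, 33.0, 39.0, 52.0, 59.0]    # observed monthly precipitation

# Calibrate on the first four months, validate on the last two.
slope, intercept = fit_linear(large_scale[:4], station_mm[:4])
predicted = downscale(large_scale[4:], slope, intercept)
```

The machine learning techniques listed above replace the linear relationship with a more flexible one, but the calibrate-then-validate workflow is the same.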
Previous studies have compared the performance of downscaling models developed with machine learning techniques and with traditional statistical techniques; some examples follow. Coulibaly (2004) found that GP-based downscaling models simulated both daily minimum and maximum temperature better than MLR-based downscaling models. In a streamflow downscaling study, Sachindra et al. (2013) found that a Least Square Support Vector Machine (LSSVM) based downscaling model captured the observed streamflow better than an MLR-based model. Duhan and Pandey (2015) found that SVM-based downscaling models simulated the observed monthly maximum and minimum temperature better than ANN and MLR-based models. Goly et al. (2014) employed MLR, positive coefficient regression (PCR), stepwise regression (SR), and SVM for downscaling large scale atmospheric variables to monthly precipitation, and concluded that SVM-based downscaling models outperformed models developed with all other techniques in simulating statistics of monthly observed precipitation. According to these studies, downscaling models developed with machine learning techniques perform better than those developed with traditional statistical regression techniques.
Although the downscaling literature contains many studies comparing models developed with various techniques, it lacks a single study that assesses the performance of models developed using the machine learning techniques GP, ANN, SVM and RVM for downscaling large scale atmospheric information to catchment scale precipitation under diverse climate (relatively wet, intermediate and relatively dry). Also, the current literature does not contain a detailed investigation of the selection of a suitable kernel when SVM and RVM techniques are applied to develop downscaling models. This paper assesses the effectiveness of GP, ANN, SVM and RVM in the development of models for downscaling large scale atmospheric information to catchment scale monthly precipitation under diverse climate. In addition, it investigates the suitability of a number of different kernel functions in SVM and RVM-based downscaling models under diverse climate.
Section snippets
Study area and data
For the case study, 48 precipitation observation stations located across Victoria (237,000 km²), Australia were selected. These stations were selected so that they contain records of observations over the period 1950–2014 with minimal missing data and represent relatively wet, intermediate and relatively dry climate regimes. Names of the precipitation observation stations, their locations along with the long-term statistics of observed
Methodology
Sub-section 3.1 provides the details of the theory of machine learning techniques used in this study, and Sub-section 3.2 details the application of the machine learning techniques in the development of downscaling models.
Results of selection of kernels for SVM and RVM-based downscaling models
It was observed that the kernel which produced the best performance in calibration in terms of RMSE (the best kernel) in the SVM and RVM-based downscaling models varied from one calendar month to another, even at the same station. Table 4 shows the percentage of selection of a given kernel as the best in relatively wet, intermediate and relatively dry climate regimes for SVM and RVM. The percentage of selection of a kernel as the best for a given climate regime was calculated using Eq.
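The selection logic described above can be sketched as follows. This is a hedged illustration only: the equation referred to in the text is not reproduced in this snippet, so the percentage is assumed here to be the share of station-month cases within a climate regime in which a kernel had the lowest calibration RMSE. The kernel names ("linear", "rbf"), the function names and all numbers are hypothetical.

```python
from collections import Counter
from math import sqrt


def rmse(observed, simulated):
    """Root mean square error between two equal-length series."""
    return sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / len(observed))


def best_kernel(observed, simulations_by_kernel):
    """Name of the kernel whose calibration simulation has the lowest RMSE."""
    return min(simulations_by_kernel,
               key=lambda kernel: rmse(observed, simulations_by_kernel[kernel]))


def selection_percentages(best_kernels):
    """Assumed definition: share (%) of cases in which each kernel was the best."""
    counts = Counter(best_kernels)
    return {kernel: 100.0 * n / len(best_kernels) for kernel, n in counts.items()}


# Hypothetical station-month cases for one climate regime:
# (observed calibration series, simulated series per kernel).
cases = [
    ([10, 20, 30], {"linear": [12, 18, 33], "rbf": [10, 21, 29]}),
    ([5, 0, 8],    {"linear": [5, 1, 8],    "rbf": [7, 2, 5]}),
    ([40, 35, 50], {"linear": [30, 30, 60], "rbf": [41, 36, 48]}),
    ([15, 25, 20], {"linear": [20, 20, 20], "rbf": [16, 24, 21]}),
]

winners = [best_kernel(obs, sims) for obs, sims in cases]
shares = selection_percentages(winners)
```

Because the best kernel is chosen separately for each case, the same station can contribute different kernels in different calendar months, which is the behaviour reported above.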
Differences in bias in the statistics of modelled precipitation in calibration and validation
It was seen that there are noticeable differences in the percentages of bias in the statistics of precipitation simulated by the downscaling models in the calibration and validation periods, at certain stations. It is natural for a downscaling model to display a relatively larger bias percentage in validation in comparison to that in calibration. This is because during calibration, the model parameters are optimised (values of parameters are allowed to change freely) in order to achieve the
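A comparison of this kind can be sketched as below. The text does not give the formula for the bias percentage, so the sketch assumes the common definition of relative bias, 100 × (modelled statistic − observed statistic) / observed statistic. The choice of statistics (mean, standard deviation, maximum), the function names and all numbers are hypothetical.

```python
from statistics import mean, pstdev


def percent_bias(observed_stat, modelled_stat):
    """Assumed definition: relative difference of a statistic, in percent."""
    return 100.0 * (modelled_stat - observed_stat) / observed_stat


def bias_in_statistics(observed, modelled):
    """Percent bias of several summary statistics of a precipitation series."""
    statistics_used = {"mean": mean, "std": pstdev, "max": max}
    return {name: percent_bias(stat(observed), stat(modelled))
            for name, stat in statistics_used.items()}


# Hypothetical monthly precipitation (mm), not data from the study.
calibration_bias = bias_in_statistics([10, 20, 30, 40], [12, 20, 28, 40])
validation_bias = bias_in_statistics([10, 20, 30, 40], [14, 24, 30, 36])

# Difference in bias between validation and calibration, per statistic.
difference = {name: validation_bias[name] - calibration_bias[name]
              for name in calibration_bias}
```

In this made-up example the mean is reproduced without bias in calibration, where the parameters were fitted, but not in validation, which mirrors the pattern described above.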
Conclusions
The following conclusions were drawn from this investigation:
Irrespective of the climate regime and the machine learning technique, in both calibration and validation at the majority of the stations downscaling models showed an over-estimating trend of low to mid percentiles (i.e. below the 50th percentile) of precipitation and under-estimating trend of high percentiles of precipitation (i.e. above the 90th percentile). The over-estimating trend of low to mid percentiles and under-estimating trend
References
et al. Basic concepts of artificial neural network (ANN) modeling and its application in pharmaceutical research. J. Pharmaceut. Biomed. (2000)
et al. Sea water level forecasting using genetic programming and comparing the performance with Artificial Neural Networks. Comput. Geosci. (2010)
et al. Long-term SPI drought forecasting in the Awash River Basin in Ethiopia using wavelet neural networks and wavelet support vector regression models. J. Hydrol. (2014)
et al. Coupling machine learning methods with wavelet transforms and the bootstrap and boosting ensemble approaches for drought prediction. Atmos. Res. (2016)
et al. Daily reservoir inflow forecasting using artificial neural networks with stopped training approach. J. Hydrol. (2000)
et al. Application of the artificial neural network model for prediction of monthly standardized precipitation and evapotranspiration index using hydrometeorological parameters and climate indices in eastern Australia. Atmos. Res. (2015)
et al. Drought forecasting in eastern Australia using multivariate adaptive regression spline, least square support vector machine and M5Tree model. Atmos. Res. (2017)
et al. Dynamic coupling of support vector machine and K-nearest neighbour for downscaling daily rainfall. J. Hydrol. (2015)
et al. Advancing monthly streamflow prediction accuracy of CART models using ensemble learning paradigms. J. Hydrol. (2013)
et al. Statistical downscaling of GCM simulations to streamflow using relevance vector machine. Adv. Water Resour. (2008)
Statistical downscaling of watershed precipitation using Gene Expression Programming (GEP). Environ. Model. Softw.
A comparative study of artificial neural network, adaptive neuro fuzzy inference system and support vector machine for forecasting river flow in the semiarid mountain region. J. Hydrol.
A GA-based feature selection and parameters optimization for support vector machines. Expert Syst. Appl.
Databased comparison of sparse Bayesian learning and multiple linear regression for statistical downscaling of low flow indices. J. Hydrol.
Projection of climate change impacts on precipitation using soft-computing techniques: a case study in Zayandeh-rud Basin, Iran. Glob. Planet. Chang.
Detecting outliers: do not use standard deviation around the mean, use absolute deviation around the median. J. Exp. Soc. Psychol.
Multiple regression and Artificial Neural Network for long-term rainfall forecasting using large scale climate modes. J. Hydrol.
Comparison of four machine learning algorithms for their applicability in satellite-based optical rainfall retrievals. Atmos. Res.
A combined neural-wavelet model for prediction of Ligvanchai watershed precipitation. Eng. Appl. Artif. Intell.
Daily streamflow forecasting by machine learning methods with weather and climate inputs. J. Hydrol.
Drought sensitivity mapping using two one-class support vector machine algorithms. Atmos. Res.
Water resources climate change projections using supervised nonlinear and multivariate soft computing techniques. J. Hydrol.
Testing the structure of a hydrological model using genetic programming. J. Hydrol.
Flood susceptibility mapping using a novel ensemble weights-of-evidence and support vector machine models in GIS. J. Hydrol.
Generalization of a statistical downscaling model to provide local climate change projections for Australia. Environ. Model. Softw.
Downscaling of precipitation for climate change scenarios: a support vector machine approach. J. Hydrol.
Modelling rainfall-runoff using genetic programming. Math. Comput. Model.
Discharge forecasting using an online sequential extreme learning machine (OS-ELM) model: a case study in Neckar River, Germany. Measurement
Stream-flow forecasting using extreme learning machines: a case study in a semi-arid region in Iraq. J. Hydrol.
Multilayer perceptron neural network for downscaling rainfall in arid region: a case study of Baluchistan, Pakistan. J. Earth Syst. Sci.
Downscaling precipitation to river basin in India for IPCC SRES scenarios using support vector machine. Int. J. Climatol.
Role of predictors in downscaling surface temperature to river basin in India for IPCC SRES scenarios using support vector machine. Int. J. Climatol.
Incorporating Climate Change in Water Allocation Planning, Waterlines Report
Statistical downscaling of multi-site daily rainfall in a south Australian catchment using a generalized linear model. Int. J. Climatol.
Empirical-Statistical Downscaling
On the use of reanalysis data for downscaling. J. Clim.
Comparison of statistical downscaling methods for monthly total precipitation: case study for the Paute River basin in southern Ecuador. Adv. Meteorol.
The twentieth century reanalysis project. Q. J. Roy. Meteorol. Soc.
Downscaling daily extreme temperatures with genetic programming. Geophys. Res. Lett.
Estimation of monthly evaporative loss using relevance vector machine, extreme learning machine and multivariate adaptive regression spline models. Stoch. Env. Res. Risk A.
Statistical downscaling of temperature using three techniques in the Tons River basin in Central India. Theor. Appl. Climatol.
Detecting changes in seasonal precipitation extremes using regional climate model projections: implications for managing fluvial flood risk. Water Resour. Res.
SVM-PGSL coupled approach for statistical downscaling to predict rainfall from GCM output. J. Geophys. Res.
Development and evaluation of statistical downscaling models for monthly precipitation. Earth Interact.