Towards estimation of electricity demand utilizing a robust multi-gene genetic programming technique

Mousavi, Seyyed Mohammad; Mostafavi, Elham Sadat; Hosseinpour, Fariba

doi:10.1007/s12053-015-9343-5

Original Article
Published: 12 April 2015

Energy Efficiency, Volume 8, pages 1169–1180 (2015)

Seyyed Mohammad Mousavi¹,
Elham Sadat Mostafavi² &
Fariba Hosseinpour³

Abstract

Multi-gene genetic programming (MGGP) is a new nonlinear system modeling approach that integrates the capabilities of standard genetic programming and classical regression. This paper deals with the application of this robust technique for the prediction of annual electricity demand in Thailand. The predictor variables included in the analysis were population, gross domestic product, stock index, and total revenue from exporting industrial products. Several statistical criteria were used to verify the validity of the model. A sensitivity analysis was performed to evaluate the contributions of the input features. The correlation coefficients between the measured and predicted electricity demand values are equal to 0.999 and 0.997 for the calibration and testing data sets, respectively. In addition to its high accuracy, MGGP outperforms regression and other powerful soft computing-based techniques.

Introduction

A major concern in the field of energy management is obtaining accurate estimates of electricity demand. Accordingly, a wide variety of methods have been used by researchers to tackle this issue. In this context, autoregressive integrated moving average (ARIMA) is one of the most widely used approaches. This method is usually utilized for the modeling of stochastic disturbances in time series analysis (Box and Jenkins 1970). The ARIMA technique has been applied to different energy problems such as prediction of the customer short-term load and domestic electric energy consumption (Cho et al. 1995; Abdel-Aal and Al-Garni 1997; Ediger and Akar 2007). Moreover, several researchers used double seasonal ARIMA for electricity demand forecasting (Suhartono and Endharta 2009; Mohamed et al. 2010; Taylor 2003, 2010). Besides, conventional approaches such as multiple linear regression (MLR) are widely used for energy demand estimation (Kandananond 2011). However, ARIMA, MLR, and other statistical analyses require the linear or nonlinear structure of the model to be defined in advance, and the assumed structure is not always appropriate (Alavi and Gandomi 2011a; Mostafavi et al. 2013a).

Soft computing techniques are considered an alternative to traditional methods for tackling real-world problems. They automatically learn from experience and extract various discriminators (Mitchell 1997). Artificial neural networks (ANNs) are one of the most widely used branches of soft computing. ANNs and other soft computing techniques have been successfully applied to different real-world problems including industrial engineering and energy management (e.g., Abdel-Aal et al. 1997; Miranda et al. 1998; Pino et al. 2000, 2008; Annunziato et al. 2004, 2006; Tien Pao 2007; Srinivasan 2008; Tsujimura et al. 1997; Abdel-Aal 2008; Moghrabi and Eid 1998; Currie 1992; Daim et al. 2010; Alavi and Gandomi 2011b; Pizzuti et al. 2013; Annunziato et al. 2013; Najafzadeh et al. 2013; Najafzadeh and Lim 2014; Najafzadeh and Azamathulla 2013a, b; Kaydani et al. 2014; Najafzadeh et al. 2012, 2014a, b). ANNs have been used to predict the electricity demand in different countries such as Taiwan (Hsu and Chen 2003), Ireland (Ringwood et al. 2001), Spain (Catalao et al. 2007), Saudi Arabia (Abdel-Aal 2008), and Iran (Azadeh et al. 2007, 2008a, b; Kheirkhah et al. 2013). Support vector machine (SVM) is another soft computing technique that has been successfully applied to predict electricity consumption (Fan and Chen 2006; Hong 2009). Despite their good performance, ANNs and SVM are considered black-box models. That is, they are not capable of generating practical prediction equations. In addition, the structure of ANNs must be defined in advance, which limits their practicability (Alavi and Gandomi 2011a; Yang et al. 2012; Mostafavi et al. 2013a, b).

In order to cope with the limitations of the existing methods, a robust soft computing approach, namely genetic programming (GP), was introduced (Koza 1992). GP uses the principle of Darwinian natural selection to generate computer programs for solving a problem. GP has several advantages over the conventional and ANN techniques. A notable feature of GP is that it can produce practical prediction equations without a need to pre-define the form of the existing relationship (Tay and Ho 2008; Alavi et al. 2011a, b; Can and Heavey 2011, 2012; Gandomi and Alavi 2011, 2013; Alavi and Gandomi 2012; Mostafavi et al. 2013b). GP and its variants have been shown to be powerful tools for electricity demand prediction (Lee et al. 1997; Bhattacharya et al. 2001). Multi-gene genetic programming (MGGP) (Searson et al. 2007, 2010) is a robust variant of GP. MGGP is designed to generate mathematical models of predictor response data that are “multi-gene” in nature, i.e., linear combinations of low-order nonlinear transformations of the input variables. The traditional GP representation is based on the evaluation of a single tree (model) expression. In the multi-gene representation, a single GP individual (program) is constructed from a number of genes, each of which is a tree expression (Searson et al. 2010; Gandomi and Alavi 2012a, b). Despite the remarkable prediction capabilities of the MGGP approach (Searson et al. 2010; Gopalakrishnan et al. 2010; Desai and Shaikh 2012; Gandomi and Alavi 2012a, b), it has not yet been applied to problems in the field of energy conversion and management.

This study proposes a new MGGP approach to derive a prediction model for the electricity demand. The data for total electricity demand in Thailand from 1986 to 2009 were used for the model development. The results provided by the developed MGGP model were further compared with those obtained by other existing methods. The paper is organized as follows: The “Multi-gene genetic programming” section presents brief descriptions of the MGGP technique. The “Methodology” section outlines the model development using MGGP and reviews the results. The detailed performance analysis of the proposed model is discussed in the “Performance analysis” section. The results of the sensitivity analysis are given in the “Sensitivity analysis” section. Finally, concluding remarks are outlined in the “Conclusion” section.

Multi-gene genetic programming

GP creates computer programs to solve a problem by simulating the biological evolution of living organisms (Koza 1992). The genetic operators of the genetic algorithm (GA) and GP are almost the same. The difference between GA and GP is that the former gives the solution as a string of numbers, while the solutions generated by the latter are computer programs represented as tree structures (Koza 1992; Gandomi and Alavi 2012a, b).

Figure 1 shows GP in the context of the input-processing-output (IPO) model (Weise 2009). As can be seen, the inputs and the corresponding output data samples are known in GP, and the main goal is to find a program that connects them. In GP, a random population of computer programs is created to obtain high diversity. Each program evolved by GP is a structured tree composed of functions (e.g., +, −, ×, /, etc.) and terminals (e.g., numerical constants, logical constants, variables, etc.). The tree-like structure of a GP model is constructed by randomly choosing the functions and terminals. This structure has a root point with branches extending from each function and ending in a terminal. An example of a simple tree representation of a GP program is shown in Fig. 2. Comprehensive descriptions of the GP algorithm can be found in Koza (1992), Alavi and Gandomi (2011a), and Gandomi and Alavi (2012a, b).

MGGP (Searson et al. 2007, 2010; Searson 2009) is a new variant of GP. The traditional GP representation is based on the evaluation of a single tree (model) expression. In MGGP, a single GP individual (program) is derived from a number of genes, each of which is a tree expression (Searson et al. 2007, 2010; Searson 2009). In other words, each model evolved by MGGP is a weighted linear combination of the outputs from a number of GP trees. The trees are called “genes”. Figure 3 shows a typical program evolved by MGGP. The inputs of the model are a, b, and c and the functions used for the evolution process are ×, −, +, Log, and √. The model is linear in the coefficients α₀, α₁, and α₂ despite using nonlinear terms. As can be seen, the evolved model is a linear combination of nonlinear transformations of the predictor variables (Searson et al. 2007, 2010; Gandomi and Alavi 2012a, b). Two important MGGP parameters that need careful control are the maximum allowable number of genes and the maximum tree depth. Restricting the tree depth mostly results in more compact models (Searson et al. 2007, 2010; Gandomi and Alavi 2012a, b).
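The multi-gene structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two genes below are hypothetical stand-ins (the genes of Fig. 3 are not reproduced here), the function names are invented for this sketch, and the weights are fitted by ordinary least squares through the normal equations rather than with any particular toolbox.

```python
import math

# Hypothetical genes: each one is a nonlinear transformation of the inputs a, b, c.
GENES = [
    lambda a, b, c: a * b,
    lambda a, b, c: math.sqrt(c) - a,
]

def design_row(a, b, c):
    """Return [1, gene_1(a, b, c), gene_2(a, b, c)]; the leading 1 is the bias term."""
    return [1.0] + [gene(a, b, c) for gene in GENES]

def solve_linear(A, y):
    """Solve A x = y by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, y)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def fit_gene_weights(samples, targets):
    """Least-squares weights [alpha_0, alpha_1, ...] from (G^T G) w = G^T y."""
    G = [design_row(*s) for s in samples]
    p = len(G[0])
    GtG = [[sum(row[i] * row[j] for row in G) for j in range(p)] for i in range(p)]
    Gty = [sum(row[i] * t for row, t in zip(G, targets)) for i in range(p)]
    return solve_linear(GtG, Gty)

def predict(weights, a, b, c):
    """Weighted linear combination of the bias and the gene outputs."""
    return sum(w * v for w, v in zip(weights, design_row(a, b, c)))
```

Because the model is linear in the weights, fitting them is a plain regression problem even though each gene is nonlinear in the inputs.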

In order to obtain the linear coefficients, an ordinary least squares analysis is performed on the training data. Besides, it is possible to embed the multi-gene approach within a partial least squares method (Searson et al. 2007, 2010; Gandomi and Alavi 2012a, b). The initial population generated by MGGP contains GP trees with different randomly generated genes. In addition to traditional GP’s recombination operators, MGGP uses a tree crossover operator, called two-point high-level crossover, to acquire and delete genes (Searson et al. 2007, 2010; Gandomi and Alavi 2012a, b). As an example, assume that two parent programs evolved by MGGP contain two (Gene 1 Gene 2) and three genes (Gene 3 Gene 4 Gene 5). The genes enclosed by the crossover points are denoted by {} as follows: (Gene 1 {Gene 2}) and (Gene 3 {Gene 4 Gene 5}). Thus, during the crossover operation the genes are exchanged to create two new programs: (Gene 1 Gene 4 Gene 5) and (Gene 3 Gene 2). In MGGP, standard GP sub-tree crossover is referred to as low-level crossover. In this case, a gene is chosen at random from each parent individual. Then, the standard sub-tree crossover is applied and the created trees replace the parent trees in the otherwise unaltered individuals in the next generation. Moreover, there are different types of mutation in MGGP such as sub-tree mutation, mutation of constants using an additive Gaussian perturbation, and setting a randomly selected constant to zero. Further details about MGGP can be found in Searson et al. (2007, 2010) and Gandomi and Alavi (2012a, b).
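The two-point high-level crossover example above can be reproduced with a short Python sketch. Genes are represented simply as labels in a list, and the function name and the explicit cut-point arguments are assumptions made for illustration; an actual implementation would pick the crossover points at random and would also enforce the maximum allowable number of genes.

```python
def high_level_crossover(parent1, parent2, cut1, cut2):
    """Swap the gene segment parent1[s1:e1] with the segment parent2[s2:e2]."""
    (s1, e1), (s2, e2) = cut1, cut2
    child1 = parent1[:s1] + parent2[s2:e2] + parent1[e1:]
    child2 = parent2[:s2] + parent1[s1:e1] + parent2[e2:]
    return child1, child2

# Parents from the text: (Gene 1 {Gene 2}) and (Gene 3 {Gene 4 Gene 5}).
parent1 = ["Gene 1", "Gene 2"]
parent2 = ["Gene 3", "Gene 4", "Gene 5"]

child1, child2 = high_level_crossover(parent1, parent2, (1, 2), (1, 3))
print(child1)  # ['Gene 1', 'Gene 4', 'Gene 5']
print(child2)  # ['Gene 3', 'Gene 2']
```

Note that the operator changes the number of genes in each individual, which is how MGGP acquires and deletes genes during evolution.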

Methodology

The steps followed by the soft computing techniques to find optimal models are generally similar. A methodology similar to that successfully used in previously published studies was considered to derive a precise MGGP-based prediction model for the electricity demand (Azadeh et al. 2008b; Mostafavi et al. 2013b). The steps followed to derive the model were as follows:

I. The input variables affecting the electricity demand were selected.
II. Annual energy data of Thailand from 1986 to 2009 were collected.
III. The gathered database was divided into training and testing data. The training data were taken for the learning process. The testing data were used to measure the performance of the models obtained by MGGP on data that played no role in building the models (model validation) (Mostafavi et al. 2013b).
IV. MGGP was run on the training data to find a computer program that connects the input variables to the output (annual electricity demand).
V. The best MGGP model was chosen considering both its simplicity and its performance on the training data.
VI. The best MGGP model was run on the testing data to assess its generalization capability when dealing with unseen data in future applications.

Figure 4 shows the steps of the proposed methodology for developing a prediction model for the electricity demand.

The performance measures used in this study were correlation coefficient (R), root mean squared error (RMSE), and mean absolute percent error (MAPE):

$$ R=\frac{\sum_{i=1}^{n}\left(h_i-\overline{h_i}\right)\left(t_i-\overline{t_i}\right)}{\sqrt{\sum_{i=1}^{n}\left(h_i-\overline{h_i}\right)^2\,\sum_{i=1}^{n}\left(t_i-\overline{t_i}\right)^2}} $$

(1)

$$ \mathrm{RMSE}=\sqrt{\frac{\sum_{i=1}^{n}\left(h_i-t_i\right)^2}{n}} $$

(2)

$$ \mathrm{MAPE}=\frac{1}{n}\sum_{i=1}^{n}\left[\frac{\left|h_i-t_i\right|}{h_i}\right] $$

(3)

where h_i and t_i are, respectively, the actual and predicted output values for the ith output, $ \overline{h_i} $ and $ \overline{t_i} $ are, respectively, the averages of the actual and predicted outputs, and n is the number of samples. The R value alone is not a sufficient indicator of prediction accuracy, because shifting all output values of a model by the same amount leaves R unchanged. That is why the RMSE and MAPE measures were also included in the performance evaluation. Evidently, higher R values and lower RMSE and MAPE values indicate a more precise model (Gandomi et al. 2011a).
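As a minimal sketch, Eqs. 1 to 3 can be written directly in Python. The function names are chosen for illustration only, and `mape` follows Eq. 3 as written, so it returns a fraction; multiplying by 100 would express it as a percentage. The short example also shows why R is complemented by the error measures: shifting every prediction by a constant leaves R unchanged but raises RMSE.

```python
import math

def correlation_coefficient(h, t):
    """Eq. 1: correlation between actual values h and predicted values t."""
    h_mean = sum(h) / len(h)
    t_mean = sum(t) / len(t)
    num = sum((hi - h_mean) * (ti - t_mean) for hi, ti in zip(h, t))
    den = math.sqrt(
        sum((hi - h_mean) ** 2 for hi in h) * sum((ti - t_mean) ** 2 for ti in t)
    )
    return num / den

def rmse(h, t):
    """Eq. 2: root mean squared error."""
    return math.sqrt(sum((hi - ti) ** 2 for hi, ti in zip(h, t)) / len(h))

def mape(h, t):
    """Eq. 3 as written: mean of |h_i - t_i| / h_i (a fraction, not a percentage)."""
    return sum(abs(hi - ti) / hi for hi, ti in zip(h, t)) / len(h)

# Illustrative numbers only, not data from the study.
actual = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 300.0]
shifted = [p + 50.0 for p in predicted]

r_plain = correlation_coefficient(actual, predicted)
r_shifted = correlation_coefficient(actual, shifted)
```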

Medium- and long-term electricity demand prediction models have a wide variety of applications, including evaluation of the generation and transmission capacity and of the type of facilities required in transmission expansion planning, power plant construction scheduling, etc. (Lee et al. 1997; Henriksson et al. 2013; Mostafavi et al. 2013a). This study was dedicated to presenting a new approach for long-term electricity demand prediction. Four effective parameters were considered as inputs of the MGGP models to make a more comprehensive prediction of the electricity demand. Selection of the input parameters was based on their popularity in the literature (Hsu and Chen 2003; Ediger and Akar 2007; Kandananond 2011; Mostafavi et al. 2013a). Finally, the formulation of the annual electricity demand (E) (GWh) was considered to be as follows:

$$ E = f\left(P,\mathrm{GDP},\mathrm{SI},\mathrm{TR}\right) $$

(4)

where, P, GDP, SI, and TR represent the yearly values of the population, gross domestic product, stock index (SET index), and total revenue from exporting industrial products (export) (million baht), respectively.

Data preprocessing

The proposed model was developed upon a reliable database containing the energy data of Thailand from 1986 to 2009 (Kandananond 2011). Table 1 presents the descriptive statistics of the input and output variables. The proposed models are applicable to the ranges shown in this table.

Table 1 Descriptive statistics of the variables included in the analysis

Of the available data sets, 18 data vectors (75 %) were taken for the training process and the remaining 6 data sets (25 %) were used as the testing data. The selection strategy was based on the consistency of the parameters in the training and testing data sets with regard to some statistical parameters (Alavi et al. 2011a, b; Gandomi et al. 2011a, b).

MGGP-based formulation of electricity demand

Several runs were conducted to obtain the best parameterization for MGGP. Various parameters are involved in the MGGP predictive algorithm. These parameters were selected based on both previously suggested values (Searson et al. 2010; Gopalakrishnan et al. 2010; Desai and Shaikh 2012; Gandomi and Alavi 2012a, b) and the performance observed in several preliminary runs. The parameter settings are shown in Table 2. In this study, basic arithmetic operators and mathematical functions are utilized to get the optimum MGGP models. The number of programs in the population is set by the population size. The number of generations sets the number of levels the algorithm uses before the run terminates (Searson et al. 2010; Gandomi and Alavi 2012a, b). The proper population size and number of generations often depend on the complexity of the problem and on the number of possible solutions. Fairly large populations and numbers of generations were tested to find models with minimum error. The programs were run until the runs terminated automatically. The maximum allowable number of genes in an individual and the maximum tree depth directly influence the size of the search space and the number of solutions explored within the search space (Searson et al. 2010; Gandomi and Alavi 2012a, b). The success of the MGGP algorithm usually increases as these parameters increase. In this case, however, the complexity of the evolved function increases and the speed of the algorithm decreases. The allowable number of genes and the tree depth were therefore set as tradeoffs between the running time and the complexity of the evolved solutions (Gandomi and Alavi 2012a, b). There are 3 × 3 × 3 × 2 × 2 × 2 = 216 different combinations of the parameters. All of these parameter combinations were tested, with two replications each. Therefore, the overall number of individual runs is equal to 216 × 2 = 432.

The GPTIPS toolbox (Searson 2009), in conjunction with subroutines coded in MATLAB, is used to implement MGGP. The fitness function evaluates the evolved expressions to designate the best encoded expressions (Gandomi and Alavi 2012a, b). The default GPTIPS multi-gene symbolic regression function is used to minimize the error (root mean squared error). The best MGGP models are chosen on the basis of providing the best fitness value on the training data as well as the simplicity of the models (Gandomi and Alavi 2012a, b).
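The run count above can be checked with a small Python enumeration. Only the number of levels per parameter (3, 3, 3, 2, 2, 2) and the two replications come from the text; the parameter names and the individual level values below are hypothetical placeholders, since Table 2 is not reproduced here, and the sketch does not use or imitate the GPTIPS interface.

```python
from itertools import product

# Hypothetical names and levels; only the level counts (3, 3, 3, 2, 2, 2) are from the text.
PARAMETER_LEVELS = {
    "population_size": [50, 100, 200],
    "generations": [50, 100, 200],
    "max_genes": [2, 4, 6],
    "max_tree_depth": [3, 5],
    "crossover_rate": [0.5, 0.85],
    "mutation_rate": [0.1, 0.3],
}
REPLICATIONS = 2

def enumerate_runs(levels, replications):
    """Return one settings dict per (parameter combination, replication) pair."""
    names = list(levels)
    runs = []
    for values in product(*(levels[name] for name in names)):
        settings = dict(zip(names, values))
        for rep in range(replications):
            runs.append({**settings, "replication": rep})
    return runs

runs = enumerate_runs(PARAMETER_LEVELS, REPLICATIONS)
print(len(runs) // REPLICATIONS, len(runs))  # 216 432
```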

Table 2 Parameter settings for the MGGP algorithm

The optimal MGGP-based formulation of the electricity demand (E) is as follows:

$$ \begin{aligned} E\,(\mathrm{GWh}) ={} & 385.9 + 0.0002098\,P - 0.05322\,\mathrm{GDP} + 0.05826\,\mathrm{TR} \\ & - \frac{1.515\,\mathrm{SI}\left(\mathrm{GDP}-15.66\,\mathrm{SI}\right)}{10^{7}} \\ & + \frac{6.994\,\mathrm{GDP}\left(3\,\mathrm{GDP}+7.831\,\mathrm{SI}-\mathrm{TR}\right)}{10^{10}} \\ & + \frac{8.755\,P\left(\mathrm{GDP}-\mathrm{TR}\right)}{10^{10}} - \frac{8.755\,\mathrm{SI}\left(\mathrm{GDP}-\mathrm{TR}\right)}{10^{10}} \end{aligned} $$

(5)

in which P, GDP, SI, and TR represent the input variables. Figure 5 shows the measured versus predicted E values using the MGGP model. The population size, number of generations, number of genes, and head size for the optimal run were equal to 200, 200, 6, and 5, respectively. As can be seen, the performance of the model on the testing data is better than on the training data. Figure 6 shows the variation of the best (log values) and mean fitness with the number of generations. It can be observed from this figure that the fitness value decreases as the number of generations increases. The best fitness is found at the 195th generation. The statistical significance of each gene of the derived model is visualized in Fig. 7. According to Fig. 7, the weight of the bias term is higher than that of the other genes. Figure 7 also depicts the degree of significance of each gene evaluated using p values. As can be seen, except for the bias and the fourth gene (Gene 4), the contribution of the genes to explaining variations in E is very high, as their relevant p values are very low and are near 0. The statistical significance of Gene 1, Gene 3, Gene 5, and Gene 6 is lower than the bias term and other genes.
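For readers who want to evaluate Eq. 5 numerically, a direct transcription into Python is sketched below. The function name is an assumption made for this illustration; the coefficients are copied from Eq. 5, and the inputs are expected in the same units and within the ranges of the data described for Table 1.

```python
def mggp_electricity_demand(P, GDP, SI, TR):
    """Evaluate Eq. 5: annual electricity demand E in GWh."""
    return (
        385.9
        + 0.0002098 * P
        - 0.05322 * GDP
        + 0.05826 * TR
        - 1.515 * SI * (GDP - 15.66 * SI) / 1e7
        + 6.994 * GDP * (3 * GDP + 7.831 * SI - TR) / 1e10
        + 8.755 * P * (GDP - TR) / 1e10
        - 8.755 * SI * (GDP - TR) / 1e10
    )
```

Written this way, the structure of the model is easy to see: a bias, three linear terms, and four second-order interaction terms between the predictors.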

Performance analysis

Smith (1986) suggested the following criteria for judging performance of a model:

if a model gives |R| > 0.8, a strong correlation exists between the predicted and measured values.

In all cases, the error values (e.g., RMSE, MAPE) should be at the minimum. It can be observed from Fig. 5 that the MGGP model provides very good predictions both for the training (R = 0.999, RMSE = 389.913, MAPE = 1.461) and testing (R = 0.997, RMSE = 317.968, MAPE = 0.425) data. Besides, new criteria recommended by Golbraikh and Tropsha (2002) were checked for external validation of the models on the validation data sets. It is suggested that at least one slope of the regression lines (k or k') through the origin should be close to 1. It should be noted that k and k' are the slopes of the regression lines through the origin of actual output (h_i) against predicted output (t_i) and of t_i against h_i, i.e., h_i = k t_i and t_i = k' h_i, respectively. Also, the performance indexes m and n should be lower than 0.1. Recently, Roy and Roy (2008) introduced a confirmatory indicator (R_m) of the external predictability of models. For R_m > 0.5, the condition is satisfied. Either the squared correlation coefficient (through the origin) between predicted and experimental values (Ro²), or the coefficient between experimental and predicted values (Ro′²) should be close to R², and to 1 (Alavi et al. 2011a, b). The considered validation criteria and the relevant results obtained by the models are presented in Table 3. As can be seen, the derived model satisfies the required conditions. The validation phase thus supports the validity of the derived model.
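The slopes k and k' defined above follow from least-squares regression through the origin, which gives k = Σ h_i t_i / Σ t_i² for h_i = k t_i and k' = Σ h_i t_i / Σ h_i² for t_i = k' h_i. A minimal Python sketch, with a function name chosen for illustration and made-up numbers rather than data from the study, is:

```python
def slopes_through_origin(h, t):
    """Return (k, k_prime) for the fits h = k * t and t = k_prime * h."""
    cross = sum(hi * ti for hi, ti in zip(h, t))
    k = cross / sum(ti ** 2 for ti in t)
    k_prime = cross / sum(hi ** 2 for hi in h)
    return k, k_prime

# Illustrative values only: predictions exactly twice the actual values.
k, k_prime = slopes_through_origin([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
print(k, k_prime)  # 0.5 2.0
```

For a well-calibrated model both slopes are near 1; the remaining indices in Table 3 (m, n, R_m) are not defined in the text and are therefore not sketched here.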

Table 3 Statistical parameters of the MGGP model for the external validation

In order to have an idea about the predictive power of the MGGP model, its performance was compared with that of a conventional and two powerful soft computing-based models. For this aim, traditional regression (MLR) (Kandananond 2011), ANN (Mostafavi et al. 2013a), and hybrid genetic programming-simulated annealing (GSA) (Mostafavi et al. 2013a) models were considered. The MLR model proposed by Kandananond (2011) is as follows:

$$ E\,(\mathrm{GWh}) = -91411 + 0.00170\times P + 0.00794\times \mathrm{GDP} - 2.57\times \mathrm{SI} + 0.00114\times \mathrm{TR} $$

(6)
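For comparison with the MGGP formulation, Eq. 6 can be transcribed in the same way. The function name is an assumption for this sketch; the coefficients are those of Eq. 6.

```python
def mlr_electricity_demand(P, GDP, SI, TR):
    """Evaluate Eq. 6: the MLR model of Kandananond (2011), E in GWh."""
    return -91411 + 0.00170 * P + 0.00794 * GDP - 2.57 * SI + 0.00114 * TR
```

Unlike Eq. 5, this model contains no interaction terms, so each predictor contributes independently of the others.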

The best ANN architecture for the estimation of E had one input layer with four arguments (P, GDP, SI, and TR), one output layer with one node providing the value of E, and one hidden layer having 15 nodes. Log-sigmoid was adopted as the transfer function between the input-hidden and hidden-output layers. Also, the ANN model was built with a learning rate of 0.05 and trained for 1500 epochs (Mostafavi et al. 2013a). It should be noted that GSA and MGGP are totally different evolutionary approaches. In GSA, only a single computer program is initially created at random. Then, the essential role of SA in the integrated GP and SA algorithm is the selection of new computer programs and the optimization of the evolutionary process to find the optimal model (Alavi et al. 2010). On the other hand, MGGP works on a population of computer programs and follows the basic GP evolutionary process. The final model evolved by MGGP is a weighted linear combination of the outputs from a number of GP trees (computer programs). Figure 8 presents a comparison of the predictions made by the MGGP, MLR, ANN, and GSA models. It can be observed from this figure that the results provided by the proposed model are a significant improvement over those provided by the MLR model. Besides, the MGGP model outperforms the ANN and GSA models. It was not possible to include the ARIMA model proposed by Kandananond (2011) in the comparative study because the predicted electricity demand values were not presented in that research. However, Kandananond (2011) found that the ANN model had a notably better performance than the ARIMA model. Since the MGGP model outperforms the ANN model, it can be expected to outperform the ARIMA model as well.

As can be seen, ANN provides good results, but a significant limitation of this method is that it usually does not provide practical prediction equations. Thus, it is very difficult for practitioners to utilize and interpret an ANN model (Kandananond 2011). It should be noted, however, that an ANN model can be converted into a functional form only if a single hidden layer is used. Even in this case, the derived equations are very complicated as they are based on all of the connection weights between the input, hidden, and output layers (Alavi and Gandomi 2011b). Another important point is that the MLR model provides acceptable predictions only for years 2008 and 2009. A plausible reason for this behavior of the MLR model is that such conventional statistical analyses assume a linear relationship between the outcome and the predictor variables, which is not always true. In most cases, the best models developed using the commonly used statistical approaches are obtained after controlling only some equations established in advance. Thus, such models cannot efficiently consider the interactions between the dependent and independent variables. On the other hand, the trends shown in Fig. 8 confirm that the performance of the proposed MGGP model is very good for all years. This is because the best solutions provided by this technique are determined after controlling numerous preliminary models, even billions of linear and nonlinear models (Alavi and Gandomi 2011a; Alavi et al. 2011a, b; Mostafavi et al. 2013b). A notable limitation of GP and its variants such as MGGP is that these methods are parameter sensitive. The best models are usually obtained after a considerable number of runs with different combinations of the parameters. However, this process may be improved by using some form of optimal control of the run parameters (e.g., GAs) (Alavi et al. 2010).

Sensitivity analysis

In order to evaluate the importance of the input parameters to the prediction of the electricity demand, their frequency values (Gandomi et al. 2011a, b) were obtained. A frequency value equal to 100 for an input indicates that this input variable has appeared in 100 % of the best 30 programs evolved by MGGP (Gandomi et al. 2011a, b). The frequency values of the predictor variables are presented in Fig. 9. According to this figure, the electricity demand is notably sensitive to all of the considered predictor variables. However, the electricity demand seems to be more influenced by P and GDP than by SI and TR. The results are in close agreement with those reported by other researchers (Mostafavi et al. 2013a).
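The frequency value described above is simply the percentage of the best programs in which an input appears. A minimal Python sketch follows; representing each program by the set of variables it uses is an assumption made for illustration, and the four example programs are invented rather than taken from the study (which used the best 30 programs).

```python
def frequency_values(programs, variables):
    """Percentage of programs in which each variable appears at least once."""
    total = len(programs)
    return {
        var: 100.0 * sum(var in prog for prog in programs) / total
        for var in variables
    }

# Hypothetical best programs, each given as the set of inputs it uses.
best_programs = [
    {"P", "GDP"},
    {"P"},
    {"P", "SI"},
    {"GDP", "TR"},
]
freq = frequency_values(best_programs, ["P", "GDP", "SI", "TR"])
print(freq)  # {'P': 75.0, 'GDP': 50.0, 'SI': 25.0, 'TR': 25.0}
```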

Conclusion

This study presents a novel application of MGGP for the empirical modeling of the electricity demand in Thailand based on historical data from 1986 to 2009. This case study illustrated the success of the MGGP technique for the prediction of the electricity demand. The validity of the derived model was verified using different criteria. The model has a very good performance on both training (R = 0.999, RMSE = 389.913, MAPE = 1.461) and testing data (R = 0.997, RMSE = 317.968, MAPE = 0.425). Besides, MGGP has produced better outcomes than MLR and two soft computing methods (ANN and GSA). The main advantage of MGGP over traditional methods such as MLR and ARIMA is that no predefined function has to be assumed for the modeling of the electricity demand. As expected, the results of the sensitivity analysis indicate that the electricity demand is more affected by population and GDP. The model can be easily retrained and improved to make more accurate predictions for a wider range by including the data for other years. Further research can focus on identifying other predictor variables and incorporating them into the modeling process. For instance, visibility, water vapor pressure, and wind speed may be included directly in the analysis in addition to the parameters considered in this study.

References

Abdel-Aal, R. E. (2008). Univariate modeling and forecasting of monthly energy demand time series using abductive and neural networks. Computers & Industrial Engineering, 54(4), 903–917.
Abdel-Aal, R. E., & Al-Garni, A. Z. (1997). Forecasting monthly electric energy consumption in eastern Saudi Arabia using univariate time-series analysis. Energy, 22, 1059–1069.
Abdel-Aal, R. E., Al-Garni, A. Z., & Al-Nassar, Y. N. (1997). Modelling and forecasting monthly electric energy consumption in eastern Saudi Arabia using abductive networks. Energy, 22(9), 911–921.
Alavi, A. H., & Gandomi, A. H. (2011a). A robust data mining approach for formulation of geotechnical engineering systems. Engineering Computations, 28(3), 242–274.
Alavi, A. H., & Gandomi, A. H. (2011b). Prediction of principal ground-motion parameters using a hybrid method coupling artificial neural networks and simulated annealing. Computers & Structures, 89(23–24), 2176–2194.
Alavi, A. H., & Gandomi, A. H. (2012). Energy-based models for assessment of soil liquefaction. Geoscience Frontiers, 3(4), 541–555.
Alavi, A. H., Gandomi, A. H., Mousavi, M., & Mollahasani, A. (2010). High-precision modeling of uplift capacity of suction caissons using a hybrid computational method. Geomechanics and Engineering, 2(4), 253–280.
Alavi, A. H., Gandomi, A. H., & Mollahasani, A. (2011a). A genetic programming-based approach for performance characteristics assessment of stabilized soil. Variants of evolutionary algorithms for real-world applications, Springer, Berlin, Chapter 9, 343–375
Alavi, A. H., Ameri, M., Gandomi, A. H., & Mirzahosseini, M. R. (2011b). Formulation of flow number of asphalt mixes using a hybrid computational method. Construction and Building Materials, 25(3), 1338–1355.
Annunziato, M., Bertini, I., Lucchetti, M., Pannicelli, A., & Pizzuti, S. (2004). The evolutionary control methodology: an overview. Artificial Evolution, Lecture Notes in Computer Science Volume, 2936, 331–342.
Annunziato, M., Bertini, I., Iannone, R., & Pizzuti, S. (2006). Evolving feed-forward neural networks through evolutionary mutation parameters. Intelligent Data Engineering and Automated Learning—IDEAL 2006. Lecture Notes in Computer Science Volume, 4224(2006), 554–561.
Annunziato, M., Moretti, F., & Pizzuti, S. (2013). Urban traffic flow forecasting using neural-statistic hybrid modeling. Soft computing models in industrial and environmental applications. Advances in Intelligent Systems and Computing Volume, 188, 183–190.
Azadeh, A., Ghaderi, S. F., Tarverdian, S., & Saberi, M. (2007). Integration of artificial neural networks and genetic algorithm to predict electrical energy consumption. Applied Mathematics and Computation, 186, 1731–1741.
Azadeh, A., Ghaderi, S. F., & Sohrabkhani, S. (2008a). Annual electricity consumption forecasting by neural network in high energy consuming industrial sectors. Energy Conversion and Management, 49, 2272–2278.
Article Google Scholar
Azadeh, A., Ghaderi, S. F., & Sohrabkhani, S. (2008b). A simulated-based neural network algorithm for forecasting electrical energy consumption in Iran. Energy Policy, 36, 2637–2644.
Article Google Scholar
Bhattacharya, M., Abraham, A., & Nath, B. (2001). A linear genetic programming approach for modeling electricity demand prediction in Victoria. In Proceedings of the hybrid information systems, Adelaide, Australia, pp. 379–393.
Box, G. E. P., & Jenkins, G. M. (1970). Time series analysis: forecasting and control. San Francisco: Holden-Day.
MATH Google Scholar
Can, B., & Heavey, C. (2011). Comparison of experimental designs for simulation-based symbolic regression of manufacturing systems. Computers & Industrial Engineering, 61(3), 447–462.
Article Google Scholar
Can, B., & Heavey, C. (2012). A comparison of genetic programming and artificial neural networks in metamodeling of discrete-event simulation models. Computers & Operations Research, 39(2), 424–436.
Article Google Scholar
Catalao, J. P. S., Mariano, S. J. P. S., Mendes, V. M. F., & Ferreira, L. A. F. M. (2007). Short-term electricity prices forecasting in a competitive market: a neural network approach. Electric Power Systems Research, 77, 1297–1304.
Article Google Scholar
Cho, M. Y., Hwang, J. C., & Che, S. (1995). Customer short term load forecasting by using ARIMA transfer function model. In Proceedings of the International Conference on Energy Management and Power Delivery, EMPD’95, Singapore, pp. 317–322.
Currie, K. R. (1992). An intelligent grouping algorithm for cellular manufacturing. Computers & Industrial Engineering, 23(1–4), 109–112.
Article Google Scholar
Daim, T. U., Kayakutlu, G., & Cowan, K. (2010). Developing Oregon’s renewable energy portfolio using fuzzy goal programming model. Computers & Industrial Engineering, 59(4), 786–793.
Article Google Scholar
Desai, C. K., & Shaikh, A. (2012). Prediction of depth of cut for single-pass laser micro-milling process using semi-analytical, ANN and GP approaches. The International Journal of Advanced Manufacturing Technology, 60, 865–882.
Article Google Scholar
Ediger, V. S., & Akar, S. (2007). ARIMA forecasting of primary energy demand by fuel in Turkey. Energy Policy, 35, 1701–1708.
Article Google Scholar
Fan, S., & Chen, L. (2006). Short-term load forecasting based on an adaptive hybrid method. IEEE Transactions on Power Systems, 21, 392–401.
Article Google Scholar
Gandomi, A. H., & Alavi, A. H. (2011). Multi-stage genetic programming: a new strategy to nonlinear system modeling. Information Sciences, 181(23), 5227–5239.
Article Google Scholar
Gandomi, A. H., & Alavi, A. H. A. (2012a). New multi-gene genetic programming approach to nonlinear system modeling. Part I: materials and structural engineering problems. Neural Computing & Applications, 21, 171–187.
Article Google Scholar
Gandomi, A. H., & Alavi, A. H. (2012b). A new multi-gene genetic programming approach to nonlinear system modeling. Part II: geotechnical and earthquake engineering problems. Neural Computing & Applications, 21, 189–201.
Article Google Scholar
Gandomi, A. H., & Alavi, A. H. (2013). Hybridizing genetic programming with orthogonal least squares for modeling of soil liquefaction. International Journal of Earthquake Engineering and Hazard Mitigation, 1(1), 2–8. Praise Worthy Prize.
MathSciNet Google Scholar
Gandomi, A. H., Alavi, A. H., Mirzahosseini, M. R., & Moghadas Nejad, F. (2011a). Nonlinear genetic-based models for prediction of flow number of asphalt mixtures. Journal of Materials in Civil Engineering, 23(3), 1–18.
Article Google Scholar
Gandomi, A. H., Alavi, A. H., & Yun, G. J. (2011b). Nonlinear modeling of shear strength of SFRCB beams using linear genetic programming. Structural Engineering and Mechanics, 38(1), 1–25.
Article Google Scholar
Golbraikh, A., & Tropsha, A. (2002). Beware of q2. Journal of Molecular Graphics and Modelling, 20, 269–276.
Article Google Scholar
Gopalakrishnan, K., Kim, S., Ceylan, H., & Khaitan, S. K. (2010) Natural selection of asphalt stiffness predictive models with genetic programming. Proceedings of the Artificial Neural Networks In Engineering (ANNIE) 2010. In C. H. Dagli et al. (Eds.), St. Louis, Missouri, November 1–3, 2010.
Henriksson, E., Söderholm, P., & Wårell, L. (2013). Industrial electricity demand and energy efficiency policy: the case of the Swedish mining industry. Energy Efficiency. doi:10.1007/s12053-013-9233-7.
Google Scholar
Hong, W. C. (2009). Electric load forecasting by support vector model. Applied Mathematical Modelling, 33, 2444–2454.
Article MATH Google Scholar
Hsu, C. C., & Chen, C. Y. (2003). Regional load forecasting in Taiwan-applications of artificial neural networks. Energy Conversion and Management, 44, 1941–1949.
Article Google Scholar
Kandananond, K. (2011). Forecasting electricity demand in Thailand with an artificial neural network approach. Energies, 4, 1246–1257.
Article Google Scholar
Kaydani, H., Najafzadeh, M., & Mohebbi, A. (2014). Wellhead choke performance in oil well pipeline systems based on genetic programming. Journal of Pipeline Systems Engineering and Practice, 5(3), 06014001.
Article Google Scholar
Kheirkhah, A., Azadeh, A., Saberi, M., Azaron, A., & Shakouri, H. (2013). Improved estimation of electricity demand function by using of artificial neural network, principal component analysis and data envelopment analysis. Computers & Industrial Engineering, 64(1), 425–441.
Article Google Scholar
Koza, J. R. (1992). Genetic programming, on the programming of computers by means of natural selection. Cambridge: MIT Press.
MATH Google Scholar
Lee, D. G., Lee, B. W., & Chang, S. H. (1997). Genetic programming model for long-term forecasting of electric power demand. Electric Power Systems Research, 40, 17–22.
Article Google Scholar
Miranda, V., Srinivasan, D., & Proenca, L. M. (1998). Evolutionary computation in power systems. International Journal of Power & Energy Systems, 20(2), 89–98.
Article Google Scholar
Mitchell, T. (1997). Does machine learning really work? AI Magazine, 18(3), 11–20.
Google Scholar
Moghrabi, C., & Eid, M. S. (1998). Modeling users through an expert system and a neural network. Computers & Industrial Engineering, 35(3–4), 583–586.
Article Google Scholar
Mohamed, N., Ahmad, M. H., Ismail, Z., & Suhartono. (2010). Double seasonal ARIMA model for forecasting load demand. Matematika, 26, 217–231.
MathSciNet Google Scholar
Mostafavi, E. S., Mostafavi, S. I., Hosseinpour, F., & Jaafari, A. (2013a). A novel machine learning approach for the estimation of electricity demand. Energy Conversion and Management, 74, 548–555.
Article Google Scholar
Mostafavi, E. S., Saeedi, S., Sarvar, R., Izadi Moud, H., & Mousavi, S. M. (2013b). A hybrid computational approach to estimate solar global radiation: an empirical evidence from Iran. Energy, 49, 204–210.
Article Google Scholar
Najafzadeh, M., & Azamathulla, H. M. (2013a). Group method of data handling to predict scour depth around bridge piers. Neural Computing and Applications, 23(7–8), 2107–2112.
Article Google Scholar
Najafzadeh, M., & Azamathulla, H. (2013a). Neuro-fuzzy GMDH to predict the scour pile groups due to waves. Journal of Computing in Civil Engineering. doi:10.1061/(ASCE)CP.1943-5487.0000376, 04014068.
Najafzadeh, M., & Lim, S. Y. (2014). Application of improved neuro-fuzzy GMDH to predict scour depth at sluice gates. Earth Science Informatics 8(1), 187–196.
Najafzadeh, M., Barani, G., & Azamathulla, H. M. (2012). Prediction of pipeline scour depth in clear-water and live-bed conditions using group method of data handling. Neural Computing and Applications, 24(3–4), 629–635.
Google Scholar
Najafzadeh, M., Barani, G., & Hessami-Kermani, M. (2013). Group method of data handling to predict scour depth around vertical piles under regular waves. Scientia Iranica, 20(3), 406–413.
Google Scholar
Najafzadeh, M., Barani, G., & Hessami Kermani, M. (2014a). Estimation of pipeline scour due to waves by GMDH. Journal of Pipeline Systems Engineering and Practice, 5(3), 06014002.
Article Google Scholar
Najafzadeh, M., Barani, G., & Hessami-Kermani, M. (2014b). Group method of data handling to predict scour at downstream of a ski-jump bucket spillway. Earth Science Informatics, 7(4), 231–248.
Article Google Scholar
Pino, R., de la Fuente, D., Priore, P., & Parreño, J. (2000). Short term forecasting of the electricity market of Spain using neural networks. In: The 20th International Symposium on Forecasting, ISF’00, pp. 39-53, Lisboa, Portugal.
Pino, R., Parreno, J., Gomez, A., & Priore, P. (2008). Forecasting next-day price of electricity in the Spanish energy market using artificial neural networks. Engineering Applications of Artificial Intelligence, 21(1), 53–62.
Article Google Scholar
Pizzuti, S., Annunziato, M., & Moretti, F. (2013). Smart street lighting management. Energy Efficiency, 6(3), 607–616.
Article Google Scholar
Ringwood, J. V., Bofelli, D., & Murray, F. T. (2001). Forecasting electricity demand on short, medium and long time scales using neural networks. Journal of Intelligent and Robotic Systems, 31, 129–147.
Article MATH Google Scholar
Roy, P. P., & Roy, K. (2008). On some aspects of variable selection for partial least squares regression models. QSAR and Combinatorial Science, 27, 302–313.
Article Google Scholar
Searson, D. P. (2009). GPTIPS: genetic programming & symbolic regression for MATLAB. White paper. http://www.cs.bham.ac.uk/~wbl/biblio/cache/http___sites.google.com_site_gptips4matlab_file-cabinet_GPTIPSGuide1.0.pdf. Accessed 01 November 2013.
Searson, D. P., Willis, M. J., & Montague, G. A. (2007). Co-evolution of nonlinear PLS model components. Journal of Chemometrics, 2, 592–603.
Article Google Scholar
Searson, D. P., Leahy, D. E., & Willis, M. J. (2010). GPTIPS: an open source genetic programming toolbox for multigene symbolic regression. Proc Int Multi Conf Eng Comput Scie Hong Kong.
Smith, G. N. (1986). Probability and statistics in civil engineering. London: Collins.
Google Scholar
Srinivasan, D. (2008). Energy demand prediction using GMDH networks. Neurocomputing, 72(1–3), 625–629.
Article Google Scholar
Suhartono, & Endharta, A. J. (2009). Short term electricity load demand forecasting in Indonesia by using double seasonal recurrent neural networks. International Journal of Mathematical Models and Methods in Applied Sciences, 3, 171–178.
Google Scholar
Tay, J. C., & Ho, N. B. (2008). Evolving dispatching rules using genetic programming for solving multi-objective flexible job-shop problems. Computers & Industrial Engineering, 54(3), 453–473.
Article Google Scholar
Taylor, J. W. (2003). Short-term electricity demand forecasting using double seasonal exponential smoothing. Journal of the Operational Research Society, 54, 799–805.
Article MATH Google Scholar
Taylor, J. W. (2010). Triple seasonal methods for short-term electricity demand forecasting. European Journal of Operational Research, 204, 139–152.
Article MATH Google Scholar
Tien Pao, H. (2007). Forecasting electricity market pricing using artificial neural networks. Energy Conversion and Management, 48, 907–912.
Article Google Scholar
Tsujimura, Y., Gen, M., & Ishizaki, S. (1997). Optimal routing in multiple IO data network using neural network with perturbed energy function. Computers & Industrial Engineering, 33(3–4), 477–480.
Article Google Scholar
Weise, T. (2009). Global optimization algorithms—theory and application [online]. Germany. Available from Internet: http://www.it-weise.de. Accessed 21 October 2013.
Yang, X. S., Gandomi, A. H., Talatahari, S., & Alavi, A. H. (2012). Metaheuristics in water resources, geotechnical and transportation engineering. Waltham: Elsevier.
Google Scholar

Author information

Authors and Affiliations

Department of Geography and Urban Planning, Islamic Azad University, Science and Research Branch, Tehran, Iran
Seyyed Mohammad Mousavi
Department of Industrial Engineering, Isfahan University of Technology, Isfahan, Iran
Elham Sadat Mostafavi
Faculty of Economic and Accounting, Islamic Azad University, Central Tehran Branch (IAUCTB), Tehran, Iran
Fariba Hosseinpour

Corresponding author

Correspondence to Elham Sadat Mostafavi.

About this article

Cite this article

Mousavi, S.M., Mostafavi, E.S. & Hosseinpour, F. Towards estimation of electricity demand utilizing a robust multi-gene genetic programming technique. Energy Efficiency 8, 1169–1180 (2015). https://doi.org/10.1007/s12053-015-9343-5

Received: 18 February 2014
Accepted: 03 March 2015
Published: 12 April 2015
Issue Date: December 2015
DOI: https://doi.org/10.1007/s12053-015-9343-5
