Elsevier

Applied Soft Computing

Volume 80, July 2019, Pages 310-328
Straight line programs for energy consumption modelling

https://doi.org/10.1016/j.asoc.2019.04.001

Highlights

  • Straight Line Programs are studied as a representation for Symbolic Regression problems.

  • Straight Line Programs make it possible to reduce the search space.

  • A hybrid genetic algorithm estimates both parameters and energy consumption data.

Abstract

Energy consumption has increased in recent decades at a rate ranging from 1.5% to 10% per year in the developed world. As a consequence, several efforts have been made to model energy consumption in order to achieve a better use of energy and to minimize environmental impact. Open problems in this area include energy consumption forecasting, user profile mining, energy source planning and transportation, among others. To address these problems, it is important to have suitable tools to model energy consumption data series, so that analysts and CEOs can understand the underlying properties of the power demand and make high-level decisions. In this paper, we focus on the problem of energy consumption modelling, and provide a solution from the perspective of symbolic regression. More specifically, we develop hybrid genetic programming algorithms to find the algebraic expression that best models daily energy consumption, using public buildings at the University of Granada as a testbed, and compare the benefits of Straight Line Programs with the classic tree representation used in symbolic regression. Regarding algorithm design, the outcomes of our experimentation suggest that Straight Line Programs outperform other representation models in the symbolic regression problems studied, and also that hybridization with local search methods can improve the quality of the resulting algebraic expression. With regard to energy consumption modelling, our approach empirically demonstrates that symbolic regression can be a powerful tool to find underlying relationships between multivariate energy consumption data series.

Introduction

Energy efficiency has gained special interest in recent years due to the remarkable increase in energy consumption over recent decades [1]. As shown in [2], energy consumption in buildings increased by between 1.5% and 1.9% per year in Europe and North America from 1994 to 2004, by 10% per year during the past 20 years in China [3], and by a factor of 1.54 in Iran [4]. The increase in the price of energy and the high demand from citizens and companies have encouraged governments to consider energy saving policies, trying to avoid irresponsible energy consumption and increase social welfare [5], [6]. For all of these reasons, researchers have carried out numerous studies aimed at reducing energy consumption and using energy efficiently [7].

If we focus on the case of residential and public buildings, nowadays we can set up a Building Automation System (BAS) [8] and deploy multiple sensors to monitor energy consumption, occupancy, lighting, temperature, etc., for online or later offline analysis [9]. While energy consumption forecasting is the most studied problem [10], [11], sensor-based technologies have made it possible to study further applications of computer science in energy efficiency research, such as anomaly detection [12], [13], energy consumption modelling [14], [15], consumer profile mining [16], [17], systems control [18], [19], or energy demand planning [20], among others. The techniques used to solve each of these problems vary depending not only on the nature of the problem, but also on the requirements of the desired outputs. For instance, in the case of consumer profile mining, a Fuzzy C-Means algorithm is used in [16] to classify consumer patterns assuming there are pre-selected clusters, while in [17] it is assumed that the consumer patterns are unknown in advance and the authors propose carrying out a cluster analysis prior to the consumer profile classification procedure. As another example, Ref. [12] develops an anomaly detection system for energy consumption that is required to work in real time, comparing different prediction methods such as neural networks or ARIMA for a later classification with K-Means. On the other hand, the work in [13] also addresses anomaly detection, but it focuses on data visualization and model selection to improve output information and assessment for facility managers.

The use of a BAS in a single building or a compound eases data collection and control over the automation systems with which a building is equipped, but it also enables the integration of energy consumption data with other information coming from external sources (climate, occupancy, etc.). Thus, we can use this new information to build more accurate prediction systems of energy consumption, or to include new knowledge in the system. As an example of both situations, we cite the study [21], in which the authors address the problem of energy consumption forecasting using neural networks with exogenous input data such as the temperature, time of day, solar radiation, or wind speed, among others; and the work [18], which develops a control system that uses the WiFi network traffic in a building to calculate occupancy and then uses this new information to control the HVAC. However, using several information sources also increases the complexity of the monitoring system, due to the potential heterogeneity of the data [21], and machine learning techniques are needed to process and extract knowledge from a large number of data sources. We can find different studies focused on the use of machine learning to reduce or manage the energy consumption in buildings [22]. More specifically, the proposal in [23] uses neural networks and support vector machines to predict the power consumption in residential buildings as a support for decision making.

Another aspect to be considered in a preliminary study of an energy consumption analysis problem is whether the input energy consumption data are univariate or multivariate. Traditionally, forecasting methods used to predict energy consumption, and time series in general, assume that the consumption data series is univariate and comes from a single source (i.e., the energy consumption sensor of a building or room, etc.), as for instance in [10], [21], although they may use additional exogenous data as in [21]. However, other applications, such as energy consumption modelling problems, may consider energy consumption data from different sources as a single multivariate data series, where each dimension of the data could come from a different energy consumption sensor. One example of this situation is the work in Ref. [24], where the authors tackle the problem of finding the relationship between the energy consumption of similar buildings in the same compound.

Finally, another relevant aspect to be considered in the design of machine learning techniques for energy consumption analysis is the trade-off between accuracy and interpretability [25]. There are plenty of forecasting approaches with an average or high prediction accuracy, such as neural networks [26], support vector machines or ensemble models [10]. In our opinion, these techniques are useful if their role is that of a black box system that takes input data from sources and provides output data that can be used for decision-making or as input for another system. The interpretability of the prediction model itself is not relevant in these types of applications. However, there are applications in which the interpretability of the model is a key requirement. One example is the work [27], which develops a system to model household energetic behaviour for high-level information gathering and modelling. The authors selected Mamdani fuzzy rules to model the energy consumption behaviour, so that the resulting models could be assessed by experts for a later analysis of energy plant sizing management.

After this short summary of energy efficiency, energy consumption analysis, and related problems and their characteristics, we are ready to formulate the principal problem addressed in this study. In this manuscript, we tackle the problem of energy consumption modelling. Unlike forecasting, where the goal is to predict future values of energy consumption data, energy consumption modelling focuses on data mining and aims to develop models that can discover new knowledge, or explain the behaviour of energy consumption considering either univariate or multivariate energy consumption data series, plus additional exogenous data in some cases. Refs. [14], [15], [28], [29] are examples of previous approaches to energy consumption modelling. In [28], the authors study the relationships between pollutant emissions and energy consumption in France, using co-integration and vector error-correction modelling techniques, and conclude that the studied variables are highly correlated. Ref. [14] proposes a vector autoregressive model (VAR) to model the relationship between energy consumption, employment and output for Taiwan, concluding that these three variables are co-integrated with one vector. Al-Garni et al. [15] show how the energy consumption in Eastern Saudi Arabia could be modelled as a function of climate data, solar radiation and population, using a regression model. Conversely, Perry Sadorsky [29] presents models of renewable energy consumption using panel co-integration techniques that explain how the economic growth of a country and the demand for energy create opportunities for increasing the use of renewable energy.

To be more specific, the problem addressed in this work is the development of a method that can automatically find the inter-relationships between data in an energy consumption data series, and provide an accurate interpretable model that explains energy consumption considering these relationships. As these relationships might not be linear, and the problem is not targeted at forecasting but at knowledge discovery, classic time series analysis such as autocorrelation, linear regression or Box–Jenkins methodology [30] cannot answer both questions. Finding these relationships is a data mining problem that could not only provide information that explains the data series behaviour, but could also be a powerful tool for an accurate estimation of energy consumption data. We tackle this problem from the perspective of symbolic regression (see Section 2), and the main contributions of this work are: the formulation of an energy consumption modelling problem from the perspective of the symbolic regression paradigm, for both data approximation and feature selection in energy consumption data; the proposal of a suitable representation model for symbolic regression, which is compared experimentally with other classic models; and the methodology for data acquisition and treatment for energy consumption data modelling, including an algorithm with dynamic parameter settings estimation during the genetic algorithm iteration. Further applications such as time series prediction, anomaly detection, or higher-level decision making might benefit from the outputs of our approach, as we suggest in Section 5.

In our experimentation, we provide a proof of concept of the proposed method applied to energy consumption data of public buildings at the University of Granada. The problem is to determine whether the energy consumption of a working day can be explained over time by the energy consumption of the remaining working days in the same week and, if so, which days are related and how. The methodology that we propose to achieve our goal formulates the problem under the symbolic regression paradigm [31], since symbolic regression can be applied to a dataset of numeric data series, it can find the relationships between dependent and independent data, and it provides as output an algebraic expression that accurately explains the relationships between dependent (output) and independent (input) data. Despite the potential benefits of symbolic regression, classic genetic programming techniques [31] have limitations and easily return local optima. In this research work, we study different representation techniques for symbolic regression and provide a hybrid algorithm to overcome some limitations of the state of the art. Because the reader might not be familiar with the concepts related to Symbolic Regression, Straight Line Programs and genetic programming, Section 2 provides additional background on these topics so that the remainder of the article can be read fluently. After that, Section 3 explains how the problem of energy consumption modelling can be formulated as a symbolic regression problem, and describes the proposed search algorithm. Section 4 presents the experimentation and discusses the outcomes and limitations of the approach, and finally conclusions and future research work are presented in Section 5.
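
To make the formulation above more concrete, the following minimal sketch evaluates a Straight Line Program on weekly data and scores it with the mean squared error. It is not the authors' implementation: the instruction encoding, the operator set, the protected division, the helper names (`evaluate_slp`, `mse`) and the consumption values are all hypothetical, chosen only to illustrate how a target day might be expressed from the remaining working days.

```python
import operator

def protected_div(a, b):
    # Assumption: division by (almost) zero returns 1.0 instead of failing.
    return a / b if abs(b) > 1e-12 else 1.0

# Hypothetical operator set for the sketch.
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": protected_div}

def evaluate_slp(program, inputs):
    """Evaluate a straight line program on one input vector.

    program is a list of (op, a, b) instructions. Each operand is
      ("x", i): the i-th input variable,
      ("u", j): the result of an earlier instruction j,
      ("c", v): the numeric constant v.
    The value of the last instruction is the output of the program.
    """
    results = []
    for k, (op, a, b) in enumerate(program):
        args = []
        for kind, val in (a, b):
            if kind == "x":
                args.append(inputs[val])
            elif kind == "u":
                if val >= k:
                    raise ValueError("only earlier results may be used")
                args.append(results[val])
            else:
                args.append(val)
        results.append(OPS[op](*args))
    return results[-1]

def mse(program, rows, targets):
    """Mean squared error of the program over a dataset."""
    errors = [(evaluate_slp(program, row) - t) ** 2
              for row, t in zip(rows, targets)]
    return sum(errors) / len(errors)

# Made-up data: sources are (Mon, Tue, Thu, Fri), the target is Wed.
rows = [[100.0, 110.0, 90.0, 80.0],
        [120.0, 130.0, 110.0, 95.0]]
targets = [100.0, 121.0]

# u0 = Tue + Thu ; u1 = u0 * 0.5, i.e. Wed is approximated by (Tue + Thu) / 2
program = [("+", ("x", 1), ("x", 2)),
           ("*", ("u", 0), ("c", 0.5))]

print(mse(program, rows, targets))  # prints 0.5
```

In this encoding, a later instruction may reference any earlier result, so a sub-expression can be reused without being duplicated, which is the usual argument for Straight Line Programs being more compact than trees.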

Section snippets

Fundamentals of symbolic regression

Regression analysis [32] is a mathematical methodology used to fit a functional model between independent and dependent variables. In the literature, we can find that regression analysis is a widely used methodology in research for prediction [33] or data modelling [34]. The components of regression analysis are: a function or model hypothesis f(x̄, w̄), a set of input data x̄ = (x1, x2, …, xn), a set of output data ȳ = (y1, y2, …, ym), and a set of constant parameters that depend on the model hypothesis
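
As a small illustration of these components, the sketch below fixes the model hypothesis to a straight line and estimates its two parameters by ordinary least squares. The function names (`hypothesis`, `fit_line`) and the data are hypothetical and do not come from the paper. In classical regression, as here, the form of f is chosen in advance and only the parameters are fitted, whereas symbolic regression also searches for the form of f.

```python
def hypothesis(x, w):
    """Model hypothesis f(x, w): a straight line w[0] + w[1] * x."""
    return w[0] + w[1] * x

def fit_line(xs, ys):
    """Closed-form least-squares estimate of (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return (mean_y - slope * mean_x, slope)

# Made-up input and output data, generated by y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

w = fit_line(xs, ys)          # estimated parameters
print(w, hypothesis(5.0, w))  # fitted parameters and a model evaluation
```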

Problem statement

As described in the introduction, energy consumption modelling is a general research topic that can be tackled in different ways, depending on the objectives pursued and the output requirements. In this piece of research, our input is an energy consumption data series measured in kWh, coming from a BAS installed in a building. Our goal is to find inter-relationships between the daily energy consumption of the building, which help to approximate the energy consumption of working days in the

Experimentation

With this experimentation, we aim to validate experimentally the hypotheses expressed in the following questions:

  • (a) Is it possible to model the energy consumption of a working day (target) considering the remaining days in the week (sources) using symbolic regression?

  • (b) If so, is it possible to know which source days influence the prediction of the energy consumption of the target day, and which ones have no influence on the model?
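
Question (b) can be read as a feature selection question: once an expression has been found, the source days that matter are those on which its output depends. The sketch below illustrates one way to check this for an expression encoded as a list of (op, a, b) instructions, where an operand is ("x", i) for an input, ("u", j) for an earlier result, or ("c", v) for a constant. This encoding, the `used_inputs` helper and the example program are assumptions made for illustration, not the procedure used in the paper.

```python
def used_inputs(program):
    """Return the indices of the inputs reachable from the last instruction."""
    last = len(program) - 1
    visited = {last}
    stack = [last]
    inputs = set()
    while stack:
        k = stack.pop()
        _, a, b = program[k]
        for kind, val in (a, b):
            if kind == "x":
                inputs.add(val)
            elif kind == "u" and val not in visited:
                visited.add(val)
                stack.append(val)
    return sorted(inputs)

# Hypothetical source days for a Wednesday target.
days = ["Mon", "Tue", "Thu", "Fri"]

program = [
    ("+", ("x", 1), ("x", 2)),    # u0 = Tue + Thu
    ("*", ("x", 0), ("x", 3)),    # u1 = Mon * Fri, never used by the output
    ("*", ("u", 0), ("c", 0.5)),  # u2 = 0.5 * u0, the output
]

print([days[i] for i in used_inputs(program)])  # prints ['Tue', 'Thu']
```

This check is purely syntactic: a day that appears in the expression but cancels out algebraically (for example in x - x) would still be reported, so it gives an upper bound on the set of influential days.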

The answer to these two questions, formulated as a symbolic regression

Conclusions

In this paper, we have used symbolic regression to model the energy consumption of the working days in different public buildings of the University of Granada. The results suggest that symbolic regression can be used to find algebraic expressions that model energy consumption accurately, using different representation models such as trees, Straight Line Programs or Linear Programs. The outcomes of our experimentation show that modelling energy consumption can be performed accurately, and

Acknowledgement

This work has been supported by the project TIN2015-64776-C3-1-R.

References (67)

  • Shaikh P.H. et al. Intelligent optimized control system for energy and comfort management in efficient and sustainable buildings. Proc. Technol. (2013)
  • Guan J. et al. Energy planning of university campus building complex: Energy usage and coincidental analysis of individual buildings with a case study. Energy Build. (2016)
  • Edwards R.E. et al. Predicting future hourly residential electrical consumption: A machine learning case study. Energy Build. (2012)
  • Khoshnevisan B. et al. Modeling of energy consumption and GHG (greenhouse gas) emissions in wheat production in Esfahan province of Iran using artificial neural networks. Energy (2013)
  • Ciabattoni L. et al. Fuzzy logic home energy consumption modeling for residential photovoltaic plant sizing in the new Italian scenario. Energy (2014)
  • Ang J.B. CO2 emissions, energy consumption, and output in France. Energy Policy (2007)
  • Sadorsky P. Renewable energy consumption and income in emerging economies. Energy Policy (2009)
  • Tso G.K. et al. Predicting electricity energy consumption: A comparison of regression analysis, decision tree and neural networks. Energy (2007)
  • Kialashaki A. et al. Modeling of the energy demand of the residential sector in the United States using regression models and artificial neural networks. Appl. Energy (2013)
  • Braun M. et al. Using regression analysis to predict the future energy consumption of a supermarket in the UK. Appl. Energy (2014)
  • Tosun E. et al. Comparison of linear regression and artificial neural network model of a diesel engine fueled with biodiesel-alcohol mixtures. Alexandria Eng. J. (2016)
  • Boussaïd I. et al. A survey on optimization metaheuristics. Inform. Sci., Pred., Control Diag. Adv. Neural Comput. (2013)
  • McCall J. Genetic algorithms for modelling and optimisation. J. Comput. Appl. Math. (2005)
  • Berkowitz S.J. On computing the determinant in small parallel time using a small number of processors. Inf. Process. Lett. (1984)
  • Giusti M. et al. Straight-line programs in geometric elimination theory. J. Pure Appl. Algebra (1998)
  • Sadeghi H. et al. AWT TAG, estimation of electricity demand in residential sector using genetic algorithm approach. Int. J. Ind. Eng. Prod. Res. (2011)
  • Zarei M. et al. Energy consumption modeling in residential buildings. Int. J. Archit. Urban Devel. (2013)
  • Pan J. et al. A survey of energy efficiency in buildings and microgrids using networking technologies. IEEE Commun. Surv. Tutor. (2014)
  • T. Ekwevugbe, N. Brown, V. Pakka, D. Fan, Real-time building occupancy sensing using neural-network based sensor...
  • W. Cui, H. Wang, A new anomaly detection system for school electricity consumption data, Information, 8 (4)...
  • Chang T. et al. Energy consumption, employment, output, and temporal causality: evidence from Taiwan based on cointegration and error-correction modelling techniques. Appl. Econ. (2001)
  • Figueiredo V. et al. An electric energy consumer characterization framework based on data mining techniques. IEEE Trans. Power Syst. (2005)
  • B. Balaji, J. Xu, A. Nwokafor, R. Gupta, Y. Agarwal, Sentinel: Occupancy based HVAC actuation using existing WiFi...

No author associated with this paper has disclosed any potential or pertinent conflicts which may be perceived to have impending conflict with this work. For full disclosure statements refer to https://doi.org/10.1016/j.asoc.2019.04.001.