Elsevier

Information Sciences

Volume 301, 20 April 2015, Pages 99-123

Time-series event-based prediction: An unsupervised learning framework based on genetic programming

https://doi.org/10.1016/j.ins.2014.12.054

Abstract

In this paper, we propose an unsupervised learning framework based on Genetic Programming (GP) to predict the position of any particular target event (defined by the user) in a time-series. GP is used to automatically build a library of candidate temporal features. The proposed framework receives a training set S = {V_a | a = 0, …, n}, where each V_a ∈ S is a time-series vector V_a = {x_t | t = 0, …, t_max} and t_max is the size of the time-series. All V_a ∈ S are assumed to be generated from the same environment. The proposed framework uses a divide-and-conquer strategy for the training phase, which works as follows. The user specifies the target event to be predicted (e.g., highest value, second highest value, etc.). The framework then classifies the training samples into different bins, Bins = {b_i | i = 0, …, t_max}, based on the time-slot t of the target event in each training sample V_a. Each b_i ∈ Bins contains a subset of S. For each b_i, the framework further classifies its samples into statistically independent clusters. To achieve this, each b_i is treated as an independent problem where GP is used to evolve programs that extract statistical features from b_i's members, which are then grouped into clusters using the K-Means algorithm. At the end of the training process, GP is used to build an 'event detector' that receives an unseen time-series and predicts the time-slot where the target event is expected to occur. Empirical evidence on artificially generated data and real-world data shows that the proposed framework significantly outperforms standard Radial Basis Function Networks, a standard GP system, Gaussian Process regression, Linear regression, and Polynomial regression.

Introduction

A time-series is a sequence of data points, typically measured at successive time instants spaced at equidistant time intervals. Usually, time-series data have a natural temporal ordering, which makes time-series analysis distinct from other common data analysis problems, in which there is no natural ordering of the observations. In many real-world applications, a vector V of observations {x_0, x_1, …, x_tmax} collected from equidistant time periods maintains some form of salient characteristics that can be exploited to predict the near future. Although time-series analysis algorithms may use solid mathematical formulas or complex statistical models, their predictions are entirely limited to the available historical data. Thus, no matter how accurate these algorithms tend to be on training data, they cannot guarantee a 100% correct prediction of the future. For this reason, time-series prediction can be seen as conditional statements of the form that "if such-and-such behaviour continues in the future, then so and so may happen…" [7].

Generally, time-series analysis is divided into two categories: (A) forecasting algorithms, in which the aim is to predict the value x at time t+1 given that sufficient historical data points are available, and (B) algorithms for discovering events in a time-series. Discovering an event means detecting unusual variations in the time-series pattern and labelling them as rare events. An event in a time-series is defined as "the occurrence of a variation in values over a time span that is of particular interest to a user" [37]. The focus of this paper is on time-series event detection. Generally, work on time-series event-based detection is divided into two main categories. The first category is based on extracting rule sets from the time-series and correlating them with particular events using machine learning algorithms (e.g., see [36]). The disadvantage of these techniques is that they are suitable only when the rules for determining the occurrence of an event are clear and well understood. The second category detects changes in the flow of the time-series values and labels these changes as events (e.g., see [37]). The underlying assumption of these models is that it is possible to mathematically model a time-series to detect unusual variations. The advantage of this approach is that it requires no previous knowledge of the problem domain. However, its main disadvantage is that it looks at the time-series from only one dimension, assuming events are determined by the past behaviour of the time-series itself and ignoring the fact that other variables may cause an event. Another disadvantage is that it defines events based on time-series variations, which prevents the user from defining a particular event of interest.

For the purpose of this work, we consider an event to be the occurrence of an occasion of interest defined by the user. For example, given a time-series vector V = {x_t | t = 0, …, t_max}, a user may sometimes be interested in knowing when the highest point (i.e., max x_j for 0 ≤ j ≤ t_max) is likely to occur (t_max is the size of the time-series). In general, the user may be interested in knowing when the nth point will occur (e.g., the highest point, the second highest, or the lowest point in an unseen time-series), depending on the problem domain.
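To make the notion of a user-defined target event concrete, the following sketch locates the time-slot of the nth-highest value in a series. The helper name `event_time` is illustrative, not from the paper; it only formalises the event definition above, which the framework is trained to predict for unseen series.

```python
def event_time(series, n=1):
    """Return the time-slot t at which the nth-highest value occurs.

    n=1 gives the highest point, n=2 the second highest, and
    n=len(series) the lowest point. Ties are broken by earlier time.
    Illustrative helper; not part of the paper's framework itself.
    """
    if not 1 <= n <= len(series):
        raise ValueError("n must be between 1 and the series length")
    # Rank time-slots by value, descending, then pick the nth one.
    ranked = sorted(range(len(series)), key=lambda t: (-series[t], t))
    return ranked[n - 1]
```

For instance, `event_time([1, 5, 3, 4], 1)` returns 1 (the slot of the highest value 5), while `event_time([1, 5, 3, 4], 2)` returns 3.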

The contributions of this paper are twofold:

  • 1.

    We propose an unsupervised learning framework based on GP to predict the position of any particular target event (defined by the user) in an unseen time-series.

  • 2.

    Unlike other time-series-event based detectors, the proposed framework learns the behaviour of the environment that generates the time-series itself and uses this knowledge to predict when a target event is likely to occur in an unseen time-series.

The proposed framework receives training examples of historical time-series vectors generated from the same environment and uses GP to automatically build a library of candidate temporal features. In this paper we will use the term “behaviour” to refer to statistical features. Thus, for example, as illustrated in Fig. 1, two time-series V1 and V2 generated from the same environment may not be identical but have similar behaviour in their trends of going up and down. In real-world applications, the environment can be anything including, but not limited to, stock markets, buyer–seller negotiations, or prices of oil, gas, or electricity in international markets.
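The "behaviour" of a time-series can be illustrated with a few hand-picked statistics. In the paper the feature extractors are evolved by GP; the fixed features below (mean, spread, fraction of up-moves, range) are only a hypothetical stand-in showing how two non-identical series from the same environment can map to nearby feature vectors.

```python
import statistics

def behaviour_features(series):
    """Hand-picked statistical features standing in for GP-evolved ones.

    Two series 'generated from the same environment' (as in Fig. 1)
    need not be identical, but should yield similar feature vectors.
    """
    diffs = [b - a for a, b in zip(series, series[1:])]
    return (
        statistics.mean(series),                       # overall level
        statistics.pstdev(series),                     # variability
        sum(1 for d in diffs if d > 0) / len(diffs),   # fraction of up-moves
        max(series) - min(series),                     # overall range
    )
```

A strictly rising series such as `[1, 2, 3, 4]` yields an up-move fraction of 1.0, so another rising series with different raw values would land close to it in feature space.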

The proposed framework works as follows. The examples of the training set are first put in different bins based on the exact time (in [0, t_max]) at which the event of interest happens. All bins are considered independent learning problems, and the next goal is to partition each bin into clusters of time-series with similar statistical features using GP and the K-Means algorithm. As a new unseen time-series comes in (sequentially, one point at a time), the system measures the similarity between it and the clusters formed in the training phase. The closest cluster among all clusters of all bins is computed, and the algorithm returns the time of occurrence of the event as its prediction (more on this in Section 3). One advantage of the proposed framework is that it allows the user to define a particular target event of interest. Another advantage is that it requires no previous knowledge of the problem domain in order to predict events. As will be shown in the experiments section, this approach is evaluated first on artificial data, for the sake of understanding the behaviour of the method, and then on real-world data from the Google Trends service, which reports frequencies of use of keywords in Google searches. The results of the proposed method are better than those of standard Radial Basis Function Networks, a standard GP system, Gaussian Process regression, Linear regression, and Polynomial Regression.
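The bin-then-cluster-then-predict pipeline described above can be sketched as follows. This is a deliberately simplified model, not the paper's implementation: the GP-evolved feature extractors are replaced by a fixed two-statistic feature function, and per-bin K-Means clustering is collapsed to a single centroid per bin; all function names are illustrative.

```python
def features(series):
    """Fixed stand-in for GP-evolved feature extraction."""
    mean = sum(series) / len(series)
    return (mean, max(series) - min(series))

def train(samples, event_time):
    """Bin training series by the time-slot of their target event,
    then summarise each bin by one feature-space centroid
    (the paper uses K-Means to form several clusters per bin)."""
    bins = {}
    for v in samples:
        bins.setdefault(event_time(v), []).append(features(v))
    return {t: tuple(sum(col) / len(col) for col in zip(*feats))
            for t, feats in bins.items()}

def predict(centroids, series):
    """Return the event time of the closest centroid to the unseen series."""
    f = features(series)
    dist = lambda c: sum((a - b) ** 2 for a, b in zip(f, c))
    return min(centroids, key=lambda t: dist(centroids[t]))
```

For example, training on series whose maximum falls in slot 1 versus slot 2 and then calling `predict` on an unseen series returns the slot of the nearest centroid, i.e., the predicted event time.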

The proposed framework has many potential applications. For example, in economics, a monopsony is a market form in which only one buyer faces many sellers (e.g., for military equipment, contracts are limited to governments) [20]. Each seller makes different offers to the buyer in different time-slots based on their true valuation of the deal. The buyer needs to know the best time to accept an offer before it increases and the buyer loses the opportunity to maximise his/her savings. The buyer cannot recall offers from the past because the sellers' interests and true valuations change over time. Another example is multi-round online auctions with limited time steps [34]. Here, if the buyer stores historical data about the sellers' behaviour, he/she can predict whether the seller's bid will win an item, whether the seller will re-list the same item in the next round at a lower price, or whether he/she should bid in the current round. Also, as we will see in the experiments section, the proposed framework can be used to analyse time-series data regarding keyword searches on the Internet and predict the next peak, to assist marketing managers in deciding the best time to release their digital marketing campaigns. In addition to the examples mentioned above, the framework can assist market makers in understanding the behaviour of the players so as to design better market rules. Hence, understanding the buyer–seller relation is very useful in situations where governments decide to intervene in the stock markets (or any other market) during a crisis. In order to minimise the impact of a crisis, governments sometimes opt to inject cash by buying shares from different stocks. However, such acts may not induce the right reaction when the correct rules of engagement are not properly set up and followed. By carefully studying the players' behaviour in such a scenario, governmental efforts towards salvaging markets during panic periods may be more effective.

The remainder of this paper is organised as follows. In the next section, we discuss related work and highlight the difference between our framework and other existing algorithms. In Section 3, we provide a comprehensive description of the framework. This is followed by experimental results and analysis in Sections 4 and 5, respectively. Finally, conclusions and possible future work are provided in Section 6.

Section snippets

GP for time-series forecasting

Time-series forecasting algorithms can be divided into two main categories: (A) algorithms to forecast univariate time-series, and (B) algorithms to forecast multivariate time-series [35]. The difference is that the former assumes that the future behaviour of the time-series is affected by its past — for example, sales are affected by sales levels in previous periods. The latter, however, assumes that other variables impact the time-series behaviour, for example, sales are affected by

The proposed learning framework

The proposed framework process is divided into two main phases: (A) Training phase, in which the framework extracts statistical features from training time-series vectors (generated from the same environment) and matches them with the target event defined by the user (the training phase can be seen as an attempt to understand the distinct set of possible behaviours that an environment may generate in order to predict target events in unseen time-series vectors), and (B) Testing phase, in which

Experiment strategy

The key question that should be addressed in any experimental setup is: will the given algorithm solve the problem that it intends to solve or not?

In an abstract sense, algorithms can be viewed as a mathematical formulation of a particular problem and a set of computer instructions to solve this problem. Naturally, all algorithms are bound by the "No Free Lunch" theorem, and it is not possible to design a single approach that solves all instances in a class of problems. Ideally, one would like to

Analysis

Due to the dynamic nature of evolutionary algorithms, experiments render distributions, not numbers [31]. It is no longer sufficient to report the mean of best-of-run values over a finite number of runs and to perform an off-the-shelf statistical test to conclude that the presented work is robust. It is important to build an understanding of why the algorithm performs well on some testing cases and poorly on others (assuming the experimental design was broad enough) in order to gain more

Conclusion and future work

In our opinion, the goodness of an algorithm should not be measured only by its results or how far it is from other state-of-the-art techniques; rather, it should also be evaluated by its novelty and how far it will allow other researchers to build on it in order to achieve a truly intelligent system. If an ideal time-series event detector system were to exist, it would return a full prediction of what the environment will generate before it actually generates anything. While this ideal system

References (38)

  • C. Chatfield

    The Analysis of Time Series

    (1996)
  • D. Dohare et al.

    Combination of similarity measures for time series classification using genetic algorithms

  • Google, Google insights, June 2012....
  • V. Guralnik, J. Srivastava, Event detection from time series data, in: KDD, 1999, pp....
  • M. Hetland et al.

    Evolutionary rule mining in time series databases

    Machine Learn.

    (2005)
  • D. Jackson

    The performance of a selection architecture for genetic programming

  • A. Kattan et al.

    Unsupervised problem decomposition using genetic programming

  • A. Kattan et al.

    Detecting localised muscle fatigue during isometric contraction using genetic programming

  • J.R. Koza

    Genetic Programming II: Automatic Discovery of Reusable Programs

    (1994)