Adaptive sampling for active learning with genetic programming
Introduction
Evolutionary Algorithms (EA) (Pétrowski and Ben Hamida, 2017, Simon, 2013, Yu and Gen, 2010) are meta-heuristics that can address a wide range of problems, such as complex optimization, identification, machine learning, and adaptation problems. Applied to machine learning, Evolutionary Algorithms, and especially Genetic Programming (GP) (Koza, 1992), have proven very effective across a wide range of supervised and unsupervised learning problems. However, their flexibility and expressiveness come with two major flaws: an excessive computational cost and difficult parameter setting.
In supervised learning, a lack of data may lead to unsatisfactory learners. This is no longer an issue given the numerous data sources and high data volumes of the Big Data era. Nonetheless, this abundance aggravates the computational burden of GP and precludes its application to data-intensive problems. There have been various research efforts to improve GP on large datasets, including hardware solutions, such as parallelization, and algorithmic solutions. The most affordable are software-based solutions, which do not require any specific hardware configuration. Sampling is the mainstream approach in this category: it reduces processing time by reducing the data while keeping the relevant records.
A complete review of sampling methods used with GP is published in Hmida, Ben Hamida, Borgi, and Rukoz (2016b), extended with a discussion of their ability to deal with large datasets. Sampling methods can be classified with regard to three properties: re-sampling frequency, sampling scheme (or strategy), and sampling quantity. The sampling strategy defines how to select records from the input database. The sampling quantity defines how many samples are needed by the algorithm. The sampling frequency defines when the sampling technique is applied throughout the training process. This last property is the focus of this study.
According to the re-sampling frequency, machine learning algorithms use either a unique sample or a renewable one; these settings are called static and dynamic sampling, respectively. On the one hand, static sampling for GP, like Historical Subset Selection (Gathercole & Ross, 1994) and bagging/boosting (Iba, 1999, Paris et al., 2003), requires selecting a representative training set up front. With large datasets, this raises the problem of reconciling the downsizing and data-coverage objectives. On the other hand, dynamic sampling creates a new sample at each generation according to its selection strategy. Consequently, GP individuals may not have enough time to learn from the sampled data, and the population might waste good resources for solving some difficult cases in the current training set. Moreover, re-sampling at each GP iteration might be computationally expensive, especially when using sophisticated sampling strategies.
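The static/dynamic distinction can be sketched as follows. This is a minimal illustration (the function and parameter names are ours, not the paper's): a static learner draws one subset before the loop, while a dynamic learner redraws the subset at every generation.

```python
import random

def train(dataset, generations, sample_size, dynamic=True, seed=0):
    """Sketch of static vs. dynamic sampling; records the subset used per generation."""
    rng = random.Random(seed)
    sample = rng.sample(dataset, sample_size)   # static: drawn once, before evolution
    history = []
    for gen in range(generations):
        if dynamic:                             # dynamic: a fresh subset every generation
            sample = rng.sample(dataset, sample_size)
        history.append(list(sample))            # fitness evaluation on `sample` would go here
    return history

data = list(range(1000))
static_runs = train(data, generations=5, sample_size=10, dynamic=False)
dynamic_runs = train(data, generations=5, sample_size=10, dynamic=True)
```

In the static run every generation sees the same ten records; in the dynamic run each generation sees a different subset, which is precisely what gives individuals little time to learn from any one sample.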
In this paper, we propose an extension to dynamic sampling techniques in which sample renewal is controlled through a parameter that adapts the sampling to the learning process. This extension aims to preserve the original sampling strategy while improving learning robustness and/or learning time.
After studying the effect of the re-sampling frequency on training quality and learning time, we propose two predicates that implement adaptive sampling based on the status of resolved fitness cases. These predicates are tested and compared with two deterministic variation rules defined by functions with increasing and decreasing patterns. The objective of this study is to demonstrate that controlling the sampling frequency with deterministic or adaptive functions does not degrade the results; on the contrary, in some cases it improves both quality and learning time.
This paper is organized as follows. The next section gives an overview of adaptive sampling in active machine learning. In Section 3, we present the background of this work in GP and the design decisions needed to add dynamic sampling to the GP engine. Section 4 reviews the sampling methods for active learning with GP that are involved in the experimental study. In Section 5, we study the effect of varying the sampling frequency on the genetic learners. Section 6 introduces the novel sampling approach and explains how it can extend dynamic sampling methods. Then, Section 7 presents an experimental study that gives a proof of concept of adaptive sampling, and Section 8 traces its effect on the learning process through a discussion of the recorded results. The main results are compared to the results of three multi-level dynamic sampling methods published in Hmida, Ben Hamida, Borgi, and Rukoz (2016a) to demonstrate how adaptive sampling could be an alternative to hierarchical sampling. Finally, we draw some conclusions and propose further developments.
Section snippets
Related works: adaptive sampling
In this paper, we are mainly interested in sampling methods that reduce the original training data-set size by substituting it with a much smaller representative subset, thus reducing the evaluation cost of the learning algorithm. Two major classes of sampling techniques can be laid out: static sampling, where the training set is selected independently of the training process and remains unmodified throughout the evolution, and active sampling, also known as active learning, which could be defined as (
Genetic programming engine
Like any EA, GP evolves a population of individuals over a number of generations. A generation is in fact one iteration of the main loop, as described in Fig. 1. Each individual represents a complete mathematical expression or a small computer program. Standard GP uses a tree representation of individuals, built from a function set for the nodes and a terminal set for the leaves. When GP is applied to a classification problem, each individual is a candidate classifier. Thus, the objective is to
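The generational loop described above (evaluate, select, vary) can be sketched generically. This is an illustrative skeleton with hypothetical names, not the engine used in the paper; it uses simple truncation selection and mutation on a toy integer problem rather than GP trees.

```python
import random

def evolve(pop_size, generations, fitness, mutate, random_individual, seed=0):
    """One generation = one pass of evaluate -> select -> vary (cf. Fig. 1)."""
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(population, key=fitness, reverse=True)
        parents = scored[: pop_size // 2]            # keep the best half (truncation selection)
        offspring = [mutate(rng.choice(parents), rng)
                     for _ in range(pop_size - len(parents))]
        population = parents + offspring             # elitist replacement
    return max(population, key=fitness)

# toy problem: maximise -(x - 3)^2 over the integers
best = evolve(
    pop_size=20, generations=50,
    fitness=lambda x: -(x - 3) ** 2,
    mutate=lambda x, rng: x + rng.choice([-1, 1]),
    random_individual=lambda rng: rng.randint(-10, 10),
)
```

Because the best half survives unchanged, the best fitness in the population is non-decreasing across generations; in real GP the same loop operates on expression trees, and fitness evaluation over the training set is the dominant cost that sampling aims to reduce.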
Active sampling with GP
To select a training subset from the database, many approaches have been proposed for both static and active sampling. For static sampling, the database is partitioned before the learning process, based essentially on some criteria or some features in the data. This sampling strategy is not discussed in this paper. For active sampling, we identify five main approaches used with GP: stochastic sampling, weighted sampling, data-topology based sampling, balanced sampling and incremental
The sampling frequency feature
Sampling frequency (f) is a main parameter of any active sampling technique. It defines how often the training subset is changed across the learning process. When f = 1, the training sample is extracted at each generation and the sampling approach is considered a generation-wise sampling technique. Most sampling techniques applied with GP belong to this category, including those described in Section 4. When f is set to 1, individuals in the current population have
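The cost implication of f is easy to quantify: if the subset is rebuilt every f generations, the number of (possibly expensive) sampling operations over a run shrinks proportionally. A small sketch, with names of our own choosing:

```python
import math

def resampling_events(generations, f):
    """How many times the training subset is rebuilt over a run.

    The sample is (re)drawn at generations 0, f, 2f, ...; f = 1 is the
    generation-wise case in which every generation pays the sampling cost.
    """
    return math.ceil(generations / f)

counts = {f: resampling_events(100, f) for f in (1, 5, 10, 25)}
```

For a 100-generation run, f = 1 triggers 100 rebuilds while f = 25 triggers only 4; the trade-off studied in this paper is how far f can be raised (or varied) before the reduced sample turnover harms learning quality.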
The proposed sampling approach
Three main approaches are possible to control any EA parameter: deterministic, adaptive and self-adaptive (Eiben, Michalewicz, Schoenauer, & Smith, 2007). Deterministic control uses a fixed rule to alter the EA parameter over the course of the evolution. Adaptive control, in contrast, uses feedback from the search state to define a strategy for updating the parameter value. With self-adaptive control, the parameter is encoded within the chromosome and evolves with the population.
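The first two control styles can be sketched for the sampling frequency f. Both functions below are illustrative stand-ins (the paper's exact schedules and predicates are defined later, not here): the deterministic rule varies f linearly with the generation index, while the adaptive rule renews the sample based on feedback, namely the fraction of fitness cases the population has already solved.

```python
def deterministic_f(gen, max_gen, f_min=1, f_max=20, increasing=True):
    """Deterministic control: f follows a fixed schedule over the run.

    increasing=True grows f from f_min to f_max; increasing=False shrinks it.
    """
    frac = gen / max_gen
    if not increasing:
        frac = 1.0 - frac
    return round(f_min + frac * (f_max - f_min))

def adaptive_renewal(solved_cases, sample_size, threshold=0.9):
    """Adaptive control (illustrative): renew the sample once the population
    has solved most of the current fitness cases."""
    return solved_cases / sample_size >= threshold
```

An increasing schedule samples aggressively early on and lets the population settle later, a decreasing one does the opposite, and the adaptive predicate re-samples only when the current subset has stopped being challenging.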
Cartesian Genetic Programming
Cartesian Genetic Programming (CGP) (Miller & Thomson, 2000) is a GP variant in which individuals represent graph-like programs. It is called "Cartesian" because it uses a two-dimensional grid of computational nodes implementing directed acyclic graphs. Each graph node encodes a function from the function set; the arguments of the encoded function are provided by the inputs of the node, and its output delivers the result.
CGP shows several advantages over other GP approaches. Unlike trees, there are
Results and discussion
The experimental study is organised in two parts. The first set of experiments studies the impact of varying the sampling frequency on the learning time and performance indicators, using a set of fixed values for f given in Section 7.5. The second set of experiments studies the efficiency of the proposed strategies for controlling the sampling frequency. Results are discussed and then compared with hierarchical sampling results published in previous works.
For each
Conclusion
This work proposes a new form of active learning with Genetic Programming based on adaptive sampling. Its main objective is to extend some known dynamic sampling techniques with an adaptive frequency control that takes the state of the learning process into account. After studying the impact of sampling frequency variation on the performance of the derived models and on the mean learning time, we proposed increasing and decreasing deterministic patterns, and two adaptive patterns for
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
References (41)
- et al. (2008). Adaptive data reduction for large-scale transaction data. European Journal of Operational Research.
- et al. (2017). Sampling-based adaptive bounding evolutionary algorithm for continuous optimization problems. Information Sciences.
- Atlas, L. E., Cohn, D., & Ladner, R. (1990). Training connectionist networks with queries and selective sampling. In...
- Balkanski, E., & Singer, Y. (2018a). The adaptive complexity of maximizing a submodular function. In: I. Diakonikolas,...
- Balkanski, E., & Singer, Y. (2018b). Approximation guarantees for adaptive sampling. In: J.G. Dy, A. Krause (eds.)...
- CGP. (2009). Cartesian GP website....
- et al. (1994). Improving generalization with active learning. Machine Learning.
- Curry, R., & Heywood, M. I. (2004). Towards efficient training on large datasets for genetic programming. In Advances in...
- et al. (2011). Adaptive sampling algorithm for macromodeling of parameterized s-parameter responses. IEEE Transactions on Microwave Theory and Techniques.
- et al. Parameter control in evolutionary algorithms.
- Data mining and knowledge discovery with evolutionary algorithms.
- A survey on instance selection for active learning. Knowledge and Information Systems.
- Dynamic training subset selection for supervised learning in genetic programming.
- Small populations over many generations can beat large populations over few generations in genetic programming.
- A survey of adaptive sampling for global metamodeling in support of simulation-based complex engineering design. Structural and Multidisciplinary Optimization.
- Implementing Cartesian genetic programming classifiers on graphics processing units using GPU.NET.