Abstract
This paper explores genetic programming and boosting to obtain an ensemble of regressors, and proposes a new formula for updating the weights as well as for forming the final hypothesis. Unlike earlier studies in the literature, we investigate the use of the correlation coefficient as an additional factor in the error metric. This new approach, called Boosting using Correlation Coefficients (BCC), was derived empirically while attempting to improve on the results of existing methods. To validate the method, we conducted two groups of experiments. In the first group, we apply BCC to time series forecasting, both on academic series and on a comprehensive Monte Carlo simulation covering the ARMA spectrum. Genetic Programming (GP) serves as the base learner, and the mean squared error (MSE) is used to compare the accuracy of the proposed method against plain GP, GP with traditional boosting, and the traditional statistical methodology (ARMA). The second group of experiments evaluates the proposed method on multivariate regression problems, with CART (Classification and Regression Tree) as the base learner.
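The abstract names a correlation-adjusted boosting scheme but does not state the BCC update formula. Purely to illustrate the family of algorithms involved, the sketch below implements a minimal AdaBoost.R2-style regression booster (Drucker's scheme) in which a hypothetical correlation factor rescales the average loss before the weight update. The correlation factor, the weighted linear base learner, and all names here are illustrative assumptions, not the paper's actual formulas.

```python
import numpy as np

def fit_linear(X, y, w):
    """Weighted least-squares line (with intercept) as a toy base learner."""
    Xa = np.hstack([X, np.ones((len(X), 1))])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(Xa * sw[:, None], y * sw, rcond=None)
    return coef

def predict_linear(coef, X):
    Xa = np.hstack([X, np.ones((len(X), 1))])
    return Xa @ coef

def boost_regression(X, y, n_rounds=5):
    """AdaBoost.R2-style booster with a hypothetical correlation factor."""
    n = len(y)
    w = np.full(n, 1.0 / n)           # uniform initial example weights
    models, alphas = [], []
    for _ in range(n_rounds):
        coef = fit_linear(X, y, w)
        pred = predict_linear(coef, X)
        err = np.abs(pred - y)
        emax = err.max()
        if emax < 1e-12:              # base learner is already exact
            models.append(coef)
            alphas.append(1.0)
            break
        loss = err / emax             # linear loss, scaled to [0, 1]
        lbar = float(np.sum(w * loss))
        # Hypothetical correlation adjustment (NOT the paper's BCC formula):
        # a hypothesis whose outputs correlate strongly with the target is
        # treated as having a lower effective loss.
        r = np.corrcoef(pred, y)[0, 1]
        if np.isfinite(r):
            lbar *= 1.0 - abs(r)
        lbar = min(max(lbar, 1e-12), 0.5 - 1e-6)   # keep beta in (0, 1)
        beta = lbar / (1.0 - lbar)
        models.append(coef)
        alphas.append(np.log(1.0 / beta))          # confidence of this round
        w *= beta ** (1.0 - loss)     # shrink weight of well-predicted points
        w /= w.sum()

    def predict(Xq):
        """Combine rounds by the weighted median, as in Drucker's AdaBoost.R2."""
        preds = np.array([predict_linear(c, Xq) for c in models])  # (M, nq)
        a = np.array(alphas)
        order = np.argsort(preds, axis=0)          # per-point prediction ranking
        cum = np.cumsum(a[order], axis=0)          # cumulative model weight
        idx = np.argmax(cum >= 0.5 * a.sum(), axis=0)
        sorted_preds = np.take_along_axis(preds, order, axis=0)
        return sorted_preds[idx, np.arange(preds.shape[1])]

    return predict
```

The weighted-median combiner keeps the ensemble robust to a single wild round; the per-example weight update `beta ** (1 - loss)` concentrates subsequent rounds on the points the current hypothesis predicts worst.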
Cite this article
de Souza, L.V., Pozo, A., da Rosa, J.M.C. et al. Applying correlation to enhance boosting technique using genetic programming as base learner. Appl Intell 33, 291–301 (2010). https://doi.org/10.1007/s10489-009-0166-y