Abstract
The chapter summarizes the use of Genetic Programming (GP) in Multiple Linear Regression (MLR) to address multicollinearity and Lack of Fit (LOF). The proposed method applies appropriate input transforms (model respecification) that deal with these issues while preserving the information content of the original variables. The transforms are selected from symbolic regression models that offer an optimal trade-off between prediction accuracy and expressional complexity, generated by multiobjective Pareto-front GP. The chapter includes a comparative study of the GP-generated transforms and Ridge Regression, a variant of ordinary Multiple Linear Regression that has been a useful and commonly employed approach for reducing multicollinearity. The advantages of GP-generated model respecification are clearly defined and demonstrated, and some recommendations for transform selection are given. The application benefits of the proposed approach are illustrated with a real industrial application in one of the broadest empirical modeling areas in manufacturing: robust inferential sensors. The chapter contributes to increasing awareness of the potential of GP in statistical model building by MLR.
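As a rough illustration of the multicollinearity problem and of the Ridge Regression baseline mentioned above, the following sketch builds two nearly identical synthetic inputs, measures their variance inflation factors (VIF), and compares ordinary least squares with a ridge fit. It is not the chapter's method: the data, the helper names `vif` and `ridge_fit`, and the ridge constant `k = 1.0` are all assumptions chosen for illustration, and no GP-generated transform is involved.

```python
import numpy as np

def vif(X):
    """VIF of each input: the diagonal of the inverse correlation matrix."""
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))

def ridge_fit(X, y, k):
    """Ridge coefficients on standardized inputs: (Z'Z + k I)^-1 Z'y.

    k = 0 reduces to ordinary least squares on the standardized inputs.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = y - y.mean()
    p = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + k * np.eye(p), Z.T @ yc)

# Synthetic data (assumed for illustration): x2 is almost a copy of x1.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 0.1 * rng.normal(size=n)
X = np.column_stack([x1, x2])

vifs = vif(X)                       # very large values signal multicollinearity
beta_ols = ridge_fit(X, y, 0.0)     # unstable coefficients under collinearity
beta_ridge = ridge_fit(X, y, 1.0)   # shrunken, more stable coefficients

print("VIF:", vifs)
print("OLS coefficients:  ", beta_ols)
print("Ridge coefficients:", beta_ridge)
```

A VIF above roughly 10 is a common rule of thumb for problematic collinearity. Ridge trades a small bias for lower coefficient variance, whereas the approach summarized in the abstract respecifies the inputs instead.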
Copyright information
© 2011 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Castillo, F., Kordon, A., Villa, C. (2011). Genetic Programming Transforms in Linear Regression Situations. In: Riolo, R., McConaghy, T., Vladislavleva, E. (eds) Genetic Programming Theory and Practice VIII. Genetic and Evolutionary Computation, vol 8. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-7747-2_11
DOI: https://doi.org/10.1007/978-1-4419-7747-2_11
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4419-7746-5
Online ISBN: 978-1-4419-7747-2
eBook Packages: Computer Science, Computer Science (R0)