Abstract
Symbolic regression is a common application for genetic programming (GP). This paper presents a new non-evolutionary technique for symbolic regression that, compared to competent GP approaches on real-world problems, is orders of magnitude faster (taking just seconds), returns simpler models, has comparable or better prediction on unseen data, and converges reliably and deterministically. I dub the approach FFX, for Fast Function Extraction. FFX uses a recently developed machine learning technique, pathwise regularized learning, to rapidly prune a huge set of candidate basis functions down to compact models. FFX is verified on a broad set of real-world problems having 13 to 1468 input variables, outperforming GP as well as several state-of-the-art regression techniques.
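The pruning idea in the abstract can be illustrated with a minimal sketch. This is not the chapter's implementation: it assumes an elastic-net penalty solved by coordinate descent along a decreasing sequence of regularization strengths, and the toy data, the tiny basis set (x, x^2, abs(x) per variable), and all function names are hypothetical choices made for illustration. As the penalty weakens, bases enter the model one at a time, giving a sequence of models from simple to complex.

```python
import math

def soft_threshold(z, t):
    """Shrink z toward zero by t (the proximal step of the L1 penalty)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def standardize(col):
    """Center a column and scale it to unit variance (assumes it is not constant)."""
    n = len(col)
    mean = sum(col) / n
    centered = [v - mean for v in col]
    scale = math.sqrt(sum(v * v for v in centered) / n)
    return [v / scale for v in centered]

def elastic_net_path(columns, y, l1_ratio=0.95, n_lambdas=30, eps=1e-3, max_sweeps=500):
    """Return [(lambda, coefs)] for a log-spaced, decreasing lambda sequence.

    Coefficients are on the standardized scale. Each lambda is warm-started
    from the previous solution, which is what makes the path cheap to compute.
    """
    n = len(y)
    y_mean = sum(y) / n
    resid = [v - y_mean for v in y]          # residual of the all-zero model
    X = [standardize(c) for c in columns]
    # Smallest lambda at which every coefficient is still zero.
    lam_max = max(abs(sum(a * b for a, b in zip(col, resid)) / n) for col in X) / l1_ratio
    coefs = [0.0] * len(X)
    path = []
    for k in range(n_lambdas):
        lam = lam_max * eps ** (k / (n_lambdas - 1))
        for _ in range(max_sweeps):
            biggest = 0.0
            for j, col in enumerate(X):
                old = coefs[j]
                # Correlation of column j with the partial residual.
                rho = sum(c * r for c, r in zip(col, resid)) / n + old
                new = soft_threshold(rho, lam * l1_ratio) / (1.0 + lam * (1.0 - l1_ratio))
                if new != old:
                    delta = new - old
                    resid = [r - delta * c for r, c in zip(resid, col)]
                    coefs[j] = new
                    biggest = max(biggest, abs(delta))
            if biggest < 1e-9:
                break
        path.append((lam, coefs[:]))
    return path

# Toy problem (hypothetical): y = 3*x1^2 - 2*x2 on a 5x5 grid.
grid = [-2.0, -1.0, 0.0, 1.0, 2.0]
points = [(a, b) for a in grid for b in grid]
y = [3.0 * a * a - 2.0 * b for a, b in points]

# Candidate bases: a deliberately small set, for illustration only.
bases = {}
for i, name in enumerate(("x1", "x2")):
    xs = [p[i] for p in points]
    bases[name] = xs
    bases[name + "^2"] = [v * v for v in xs]
    bases["abs(" + name + ")"] = [abs(v) for v in xs]
names = list(bases)
path = elastic_net_path([bases[k] for k in names], y)

def support(coefs, tol=1e-8):
    """Names of the bases whose coefficients are meaningfully nonzero."""
    return [names[j] for j, c in enumerate(coefs) if abs(c) > tol]

# Report the first model found at each complexity (number of active bases).
seen = set()
for lam, coefs in path:
    active = support(coefs)
    if active and len(active) not in seen:
        seen.add(len(active))
        print(f"lambda={lam:.4f}  bases={active}")
```

Because every step is a closed-form coordinate update with no random initialization, repeated runs on the same data give the same path, which matches the deterministic behavior the abstract describes.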
Copyright information
© 2011 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
McConaghy, T. (2011). FFX: Fast, Scalable, Deterministic Symbolic Regression Technology. In: Riolo, R., Vladislavleva, E., Moore, J. (eds) Genetic Programming Theory and Practice IX. Genetic and Evolutionary Computation. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-1770-5_13
DOI: https://doi.org/10.1007/978-1-4614-1770-5_13
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-1769-9
Online ISBN: 978-1-4614-1770-5
eBook Packages: Computer Science, Computer Science (R0)