Abstract
This paper presents the first step in our research on designing an effective and efficient GP-based method for symbolic regression. First, we propose three extensions of the standard Single Node GP (SNGP), namely (1) a selection strategy for choosing nodes to be mutated based on the depth and performance of the nodes, (2) operators for placing a compact version of the best-performing graph at the beginning and at the end of the population, respectively, and (3) a local search strategy with multiple mutations applied in each iteration. All the proposed modifications have been experimentally evaluated on five symbolic regression benchmarks and compared with standard GP and SNGP. The results are promising and show the potential of the proposed modifications to improve the performance of the SNGP algorithm. We then propose two variants of hybrid SNGP that use a linear regression technique, LASSO, to improve its performance. The proposed algorithms have been compared with state-of-the-art symbolic regression methods that also use linear regression techniques on four real-world benchmarks. The results show that the hybrid SNGP algorithms are competitive with, or better than, the compared methods.
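The idea of combining evolved expressions with LASSO can be illustrated with a minimal sketch. This is not the authors' implementation: the candidate features (x, x^2, sin x) are hypothetical stand-ins for the outputs of nodes in an SNGP population, and the coordinate-descent solver, the function names, and the penalty value are assumptions chosen for illustration only.

```python
import math


def soft_threshold(rho, lam):
    """Soft-thresholding operator used by the LASSO coordinate update."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(columns, y, lam, n_iters=200):
    """Minimise (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1 (no intercept).

    `columns` is a list of feature columns; each column holds the output of
    one candidate expression evaluated on all n training samples.
    """
    n = len(y)
    weights = [0.0] * len(columns)
    residual = list(y)  # equals y - Xw for the initial w = 0
    for _ in range(n_iters):
        for j, col in enumerate(columns):
            z = sum(v * v for v in col) / n
            if z == 0.0:
                continue  # an all-zero feature cannot contribute
            # Correlation of feature j with the residual that excludes j.
            rho = sum(v * (r + weights[j] * v)
                      for v, r in zip(col, residual)) / n
            new_w = soft_threshold(rho, lam) / z
            delta = new_w - weights[j]
            if delta != 0.0:
                residual = [r - delta * v for r, v in zip(residual, col)]
                weights[j] = new_w
    return weights


# Hypothetical candidate expressions standing in for evolved node outputs.
xs = [i / 10 for i in range(-20, 21)]
features = [
    [x for x in xs],            # x
    [x * x for x in xs],        # x^2
    [math.sin(x) for x in xs],  # sin(x)
]
y = [3.0 * x * x for x in xs]   # assumed target: y = 3 x^2

weights = lasso_coordinate_descent(features, y, lam=0.01)
print(weights)
```

In this toy setting the L1 penalty drives the weights of the two irrelevant features to zero and keeps only the x^2 term, which is the kind of feature selection a LASSO-based hybrid relies on when many evolved expressions are available as regressors.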
Notes
- 1.
- 2.
- 3. Checked using a t-test at the significance level \(\alpha = 0.05\).
- 4. The only exception is EFS: we changed the round variable to false (it was originally hard-coded to true), following the issue reported on the algorithm’s GitHub repository, see https://github.com/exgp/efs/issues/1.
References
Arnaldo, I., Krawiec, K., O’Reilly, U.-M.: Multiple regression genetic programming. In: Proceedings of the 2014 Annual Conference on Genetic and Evolutionary Computation, GECCO 2014, pp. 879–886. ACM, New York (2014)
Arnaldo, I., O’Reilly, U.-M., Veeramachaneni, K.: Building predictive models via feature synthesis. In: Proceedings of the 2015 Annual Conference on Genetic and Evolutionary Computation, GECCO 2015, pp. 983–990. ACM, New York (2015)
EFS commit 6d991fa. http://github.com/exgp/efs/tree/6d991fa
Ferreira, C.: Gene expression programming: a new adaptive algorithm for solving problems. Complex Syst. 13(2), 87–129 (2001)
Friedman, J., Hastie, T., Tibshirani, R.: Regularization paths for generalized linear models via coordinate descent. J. Stat. Softw. 33(1), 1–22 (2010)
Garg, A., Garg, A., Tai, K.: A multi-gene genetic programming model for estimating stress-dependent soil water retention curves. Comput. Geosci. 18(1), 45–56 (2013)
Hart, E., Smith, J.E., Krasnogor, N.: Recent Advances in Memetic Algorithms. STUDFUZZ, vol. 166. Springer, Heidelberg (2005)
Hinchliffe, M., Hiden, H., McKay, B., Willis, M., Tham, M., Barton, G.: Modelling chemical process systems using a multi-gene genetic programming algorithm. In: Koza, J.R. (ed.) Late Breaking Papers at the Genetic Programming 1996 Conference, pp. 56–65 (1996)
Jackson, D.: A new, node-focused model for genetic programming. In: Moraglio, A., Silva, S., Krawiec, K., Machado, P., Cotta, C. (eds.) EuroGP 2012. LNCS, vol. 7244, pp. 49–60. Springer, Heidelberg (2012). doi:10.1007/978-3-642-29139-5_5
Jackson, D.: Single node genetic programming on problems with side effects. In: Coello, C.A.C., Cutello, V., Deb, K., Forrest, S., Nicosia, G., Pavone, M. (eds.) PPSN 2012. LNCS, vol. 7491, pp. 327–336. Springer, Heidelberg (2012). doi:10.1007/978-3-642-32937-1_33
Keijzer, M.: Scaled symbolic regression. Genet. Program Evolvable Mach. 5(3), 259–269 (2004)
Koza, J.: On the Programming of Computers by Means of Natural Selection, 2nd edn. MIT Press, Cambridge (1992)
Luke, S., Panait, L.: Lexicographic parsimony pressure. In: Proceedings of GECCO 2002, pp. 829–836. Morgan Kaufmann Publishers (2002)
Lichman, M.: UCI Machine Learning Repository. University of California, School of Information and Computer Science, Irvine (2013). http://archive.ics.uci.edu/ml
McConaghy, T.: Fast, scalable, deterministic symbolic regression technology. In: Riolo, R., Vladislavleva, E., Moore, J.H. (eds.) Genetic Programming Theory and Practice IX, Genetic and Evolutionary Computation, pp. 235–260 (2011)
FFX 1.3.4. http://pypi.python.org/pypi/ffx/1.3.4
McDermott, J., et al.: Genetic programming needs better benchmarks. In: Proceedings of the GECCO 2012, pp. 791–798. ACM, New York (2012)
Miller, J.F., Thomson, P.: Cartesian genetic programming. In: Poli, R., Banzhaf, W., Langdon, W.B., Miller, J., Nordin, P., Fogarty, T.C. (eds.) EuroGP 2000. LNCS, vol. 1802, pp. 121–132. Springer, Heidelberg (2000). doi:10.1007/978-3-540-46239-2_9
Ryan, C., Azad, R.M.A.: A simple approach to lifetime learning in genetic programming-based symbolic regression. Evol. Comput. 22(2), 287–317 (2014)
Ryan, C., Collins, J.J., O’Neill, M.: Grammatical evolution: evolving programs for an arbitrary language. In: Banzhaf, W., Poli, R., Schoenauer, M., Fogarty, T.C. (eds.) EuroGP 1998. LNCS, vol. 1391, pp. 83–96. Springer, Heidelberg (1998). doi:10.1007/BFb0055930
Searson, D.P., Leahy, D.E., Willis, M.J.: GPTIPS: an open source genetic programming toolbox for multigene symbolic regression. In: International MultiConference of Engineers and Computer Scientists, vol. 1, pp. 77–80 (2010)
Searson, D.P.: GPTIPS 2: an open-source software platform for symbolic datamining. In: Gandomi, A.H., Alavi, A.H., Ryan, C. (eds.) Springer Handbook of Genetic Programming Applications, pp. 551–573. Springer, Switzerland (2015)
Vladislavleva, E.J., Smits, G.F., Den Hertog, D.: Order of nonlinearity as a complexity measure for models generated by symbolic regression via Pareto genetic programming. IEEE Trans. Evol. Comput. 13(2), 333–349 (2009)
Zou, H., Hastie, T.: Regularization and variable selection via the elastic net. J. Roy. Stat. Soc. Ser. B (Stat. Methodol.) 67(2), 301–320 (2005)
Acknowledgment
This research was supported by the Grant Agency of the Czech Republic (GAČR) with the grant no. 15-22731S entitled “Symbolic Regression for Reinforcement Learning in Continuous Spaces”.
Copyright information
© 2016 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Kubalík, J., Alibekov, E., Žegklitz, J., Babuška, R. (2016). Hybrid Single Node Genetic Programming for Symbolic Regression. In: Nguyen, N., Kowalczyk, R., Filipe, J. (eds) Transactions on Computational Collective Intelligence XXIV. Lecture Notes in Computer Science, vol 9770. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-53525-7_4
DOI: https://doi.org/10.1007/978-3-662-53525-7_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-53524-0
Online ISBN: 978-3-662-53525-7
eBook Packages: Computer Science (R0)