ABSTRACT
Symbolic Regression searches for a function form that approximates a dataset, often using Genetic Programming. Since there is usually no restriction on the form the function can take, Genetic Programming may return a hard-to-understand model due to non-linear function chaining or long expressions. A novel representation called Interaction-Transformation was recently proposed to alleviate this problem. In this representation, the function form is restricted to an affine combination of terms, each generated by applying a single univariate function to the interaction of selected variables. This representation obtained competitive solutions on standard benchmarks. Despite the initial success, a broader set of benchmarking functions revealed the limitations of the constrained representation. In this paper we propose an extension to this representation, called the Transformation-Interaction-Rational representation, that defines a new function form as the ratio of two Interaction-Transformation functions. Additionally, the target variable can also be transformed with a univariate function. The main goal is to improve the approximation power while still constraining the overall complexity of the expression. We tested this representation with a standard Genetic Programming algorithm with crossover and mutation. The results show a great improvement when compared to its predecessor and state-of-the-art performance on a large benchmark.
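The two function forms described in the abstract can be illustrated with a minimal sketch. It assumes that an interaction is a product of variables raised to per-variable exponents and that a target transformation is undone by applying its inverse to the ratio; the names `it_eval` and `tir_eval`, the tuple layout, and all weights and exponents are hypothetical and chosen only for illustration. The sketch does not guard against a zero denominator.

```python
import math


def it_eval(x, bias, terms):
    """Evaluate an Interaction-Transformation expression at point x.

    terms is a list of (weight, transform, exponents) tuples (hypothetical
    layout). exponents[j] is the power applied to x[j] in the interaction,
    and transform is the single univariate function applied to it.
    """
    total = bias
    for weight, transform, exponents in terms:
        # Interaction: product of the selected variables raised to their powers.
        interaction = math.prod(xj ** k for xj, k in zip(x, exponents))
        total += weight * transform(interaction)
    return total


def tir_eval(x, numerator, denominator, inverse_target_transform):
    """Ratio of two IT expressions, mapped back through the inverse of the
    univariate function that was applied to the target (an assumption)."""
    p = it_eval(x, *numerator)
    q = it_eval(x, *denominator)
    return inverse_target_transform(p / q)


def identity(z):
    return z


# Numerator: 1.0 + 2.0 * (x0 * x1)
numerator = (1.0, [(2.0, identity, (1, 1))])
# Denominator: 1.0 + 0.5 * sqrt(x0^2)
denominator = (1.0, [(0.5, math.sqrt, (2, 0))])

x = (2.0, 3.0)
p = it_eval(x, *numerator)
q = it_eval(x, *denominator)
# If the target was fitted as log(y), the prediction is exp(p / q).
y = tir_eval(x, numerator, denominator, math.exp)
```

Each Interaction-Transformation expression stays linear in its weights, which is what keeps the overall expression constrained even after the ratio and the target transformation are added.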