TaylorGP

The paper Taylor Genetic Programming for Symbolic Regression has been accepted by GECCO-2022. See our appendix for more details.

1. Introduction

TaylorGP, a symbolic regression (SR) method, uses a Taylor polynomial to approximate the symbolic equation that fits the dataset. It also uses the Taylor polynomial to extract features of the symbolic equation: low-order polynomial discrimination, variable separability, boundary, monotonicity, and parity. These Taylor polynomial techniques enhance genetic programming (GP). Experiments are conducted on three kinds of benchmarks: classical SR, machine learning, and physics. The results show that TaylorGP not only achieves higher accuracy than the nine baseline methods but also finds stable results faster.
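To make the idea of reading features off a Taylor polynomial concrete, here is a minimal sketch using sympy. It is not the TaylorGP implementation: the helper names `taylor_polynomial` and `parity_from_taylor` are hypothetical, the expansion is taken around 0 from a known expression (whereas TaylorGP works from a dataset), and only the parity feature is shown.

```python
import sympy as sp

def taylor_polynomial(expr, var, order):
    """Return the Taylor polynomial of `expr` around 0, up to degree `order`."""
    # series(..., n=order + 1) keeps terms up to var**order; removeO() drops the O(...) term.
    return sp.series(expr, var, 0, order + 1).removeO()

def parity_from_taylor(poly, var):
    """Classify a polynomial in `var` as 'even', 'odd', or 'neither' from its nonzero powers."""
    powers = [monom[0] for monom in sp.Poly(poly, var).monoms()]
    if all(p % 2 == 0 for p in powers):
        return "even"
    if all(p % 2 == 1 for p in powers):
        return "odd"
    return "neither"

x = sp.Symbol("x")
cos_poly = taylor_polynomial(sp.cos(x), x, 4)  # 1 - x**2/2 + x**4/24
sin_poly = taylor_polynomial(sp.sin(x), x, 5)  # x - x**3/6 + x**5/120
exp_poly = taylor_polynomial(sp.exp(x), x, 3)  # 1 + x + x**2/2 + x**3/6

print(parity_from_taylor(cos_poly, x))  # even
print(parity_from_taylor(sin_poly, x))  # odd
print(parity_from_taylor(exp_poly, x))  # neither
```

A polynomial with only even powers is an even function, and one with only odd powers is an odd function, so the parity of the approximated equation can be inferred from which coefficients are nonzero.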

2. Code

Requirements

Make sure you have installed the following Python version and packages before running our code:

python3.6~3.8
scikit-learn
numpy
sympy
pandas
timeout_decorator
scipy
joblib

The code also imports the standard-library modules time, copy, itertools, numbers, abc, warnings, and math, which ship with Python and need no separate installation.
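As a quick way to confirm the third-party requirements are present before running anything, the following sketch checks whether each package can be located for import. The helper `find_missing` and the `REQUIRED_MODULES` list are illustrative and not part of the repository; note that scikit-learn is imported under the name `sklearn`.

```python
import importlib.util

# Import names of the third-party packages from the requirements list.
# scikit-learn is imported as "sklearn".
REQUIRED_MODULES = [
    "sklearn", "numpy", "sympy", "pandas",
    "timeout_decorator", "scipy", "joblib",
]

def find_missing(modules):
    """Return the names in `modules` that cannot be located for import."""
    return [name for name in modules if importlib.util.find_spec(name) is None]

missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages were found.")
```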

Our experiments were run on Ubuntu 18.04 with an Intel(R) Xeon(R) Gold 5218R CPU @ 2.10GHz.

Examples

We provide an example to test whether the modules required by TaylorGP are successfully installed:

python TaylorGP.py

In addition, you can run TaylorGP on a specified dataset as follows:

python TaylorGP.py --fileName="Feynman/F24.tsv"
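To run several datasets in sequence, the command above can be wrapped in a small driver script. This is a sketch under assumptions: the helpers `build_command` and `run_datasets` are hypothetical, the script is assumed to be launched from the directory containing TaylorGP.py, and only the `--fileName` option shown above is used.

```python
import subprocess
import sys

def build_command(file_name, script="TaylorGP.py"):
    """Build the argument list for one TaylorGP run on `file_name`."""
    # Passing a list to subprocess avoids shell quoting, so no quotes are needed around the path.
    return [sys.executable, script, f"--fileName={file_name}"]

def run_datasets(file_names):
    """Run TaylorGP on each dataset in turn and return a {file_name: exit_code} mapping."""
    results = {}
    for file_name in file_names:
        completed = subprocess.run(build_command(file_name))
        results[file_name] = completed.returncode
    return results

if __name__ == "__main__":
    # "Feynman/F24.tsv" is the path from the example above; add further dataset paths as needed.
    print(run_datasets(["Feynman/F24.tsv"]))
```

A nonzero exit code in the returned mapping indicates that the corresponding run did not finish successfully.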

3. Experiments

DataSet

We evaluate the performance of TaylorGP on three kinds of benchmarks: classical Symbolic Regression Benchmarks (SRB), Penn Machine Learning Benchmarks (PMLB), and Feynman Symbolic Regression Benchmarks (FSRB). (You can get them from the directories GECCO, PMLB, and Feynman, respectively.) The distribution of the 81 benchmarks by number of samples and features is shown below.

The details of these benchmarks are listed in the appendix.

Performance

We compare TaylorGP with two kinds of baseline algorithms. The nine baseline algorithms are implemented in SRBench: four symbolic regression methods and five machine learning methods. The symbolic regression methods include GPlearn, FFX, geometric semantic genetic programming (GSGP), and Bayesian symbolic regression (BSR). The machine learning methods include linear regression (LR), kernel ridge regression (KR), random forest regression (RF), support vector machines (SVM), and XGBoost.

The figures below show the normalized R^2 scores of the ten algorithms, each run 30 times on all benchmarks. Since a normalized R^2 closer to 1 indicates better results, TaylorGP overall finds more accurate results than the other algorithms.

Normalized R^2 comparisons of the ten SR methods on classical Symbolic Regression Benchmarks

Normalized R^2 comparisons of the ten SR methods on Feynman Symbolic Regression Benchmarks

Normalized R^2 comparisons of the ten SR methods on Penn Machine Learning Benchmarks
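For readers who want to reproduce the metric, the sketch below computes the standard R^2 score with numpy. The normalization step is an assumption: this README does not define it, so the hypothetical helper `normalized_r2` simply clips R^2 to the range [0, 1]; consult the paper and appendix for the definition actually used.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return float(1.0 - ss_res / ss_tot)

def normalized_r2(y_true, y_pred):
    """ASSUMPTION: normalize by clipping R^2 to [0, 1]; not confirmed by the source."""
    return float(np.clip(r_squared(y_true, y_pred), 0.0, 1.0))

y = [1.0, 2.0, 3.0, 4.0]
print(normalized_r2(y, y))                     # perfect fit
print(normalized_r2(y, [2.5, 2.5, 2.5, 2.5]))  # predicting the mean
print(normalized_r2(y, [4.0, 3.0, 2.0, 1.0]))  # worse than the mean, clipped
```

With the raw R^2, a model worse than predicting the mean gets a negative and unbounded score; clipping keeps every score on a common [0, 1] scale so that runs across benchmarks can be compared.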

4. Cite

Please cite our paper if you use the code.
