Data Imputation for Symbolic Regression with Missing Values: A Comparative Study
Created by W.Langdon from
gp-bibliography.bib Revision:1.8051
@InProceedings{Al-Helali:2020:SSCI,
author = "Baligh Al-Helali and Qi Chen and Bing Xue and
Mengjie Zhang",
title = "Data Imputation for Symbolic Regression with Missing
Values: A Comparative Study",
booktitle = "2020 IEEE Symposium Series on Computational
Intelligence (SSCI)",
year = "2020",
pages = "2093--2100",
abstract = "Symbolic regression via genetic programming is
considered as a crucial machine learning tool for
empirical modelling. However, in reality, it is common
for real-world data sets to have some data quality
problems such as noise, outliers, and missing values.
Although several approaches can be adopted to deal with
data incompleteness in machine learning, most studies
consider the classification tasks, and only a few have
considered symbolic regression with missing values. In
this work, the performance of symbolic regression using
genetic programming on real-world data sets that have
missing values is investigated. This is done by
studying how different imputation methods affect
symbolic regression performance. The experiments are
conducted using thirteen real-world incomplete data
sets with different ratios of missing values. The
experimental results show that although the performance
of the imputation methods differs with the data set,
CART has a better effect than others. This might be due
to its ability to deal with categorical and numerical
variables. Moreover, the superiority of the use of
imputation methods over the commonly used deletion
strategy is observed.",
keywords = "genetic algorithms, genetic programming",
DOI = "10.1109/SSCI47803.2020.9308216",
month = dec,
notes = "Also known as \cite{9308216}",
}