DG-SMOTE: A Distance-Angle-Based Genetic Synthetic Minority Over-Sampling Technique for Unbalanced Data Learning
Created by W.Langdon from
gp-bibliography.bib Revision:1.8344
@Article{Pei:TEVC,
author = "Wenbin Pei and Yuyang Cui and Bing Xue and
Mengjie Zhang and Jiqing Zhang and Yaqing Hou and
Guangyu Zou and Zhang Qiang",
title = "{DG-SMOTE:} A Distance-Angle-Based Genetic Synthetic
Minority Over-Sampling Technique for Unbalanced Data
Learning",
journal = "IEEE Transactions on Evolutionary Computation",
note = "Early access",
keywords = "genetic algorithms, genetic programming, Evolutionary
computation, Sampling methods, Noise measurement,
Meteorology, Contracts, Standards, Noise, Nearest
neighbour methods, Ensemble learning, Unbalanced data
classification, Oversampling",
ISSN = "1941-0026",
DOI = "doi:10.1109/TEVC.2024.3515485",
abstract = "Many real-world applications often generate unbalanced
data. Learning from such data may lead to biased
classifiers that perform poorly on the class of
interest. Oversampling methods have been shown to be
effective in re-balancing unbalanced data to help
classifiers avoid performance bias. However, many
existing oversampling methods rely on a pre-designed
linear model structure and the neighbourhood
information of an original instance. This may lead to
the generation of noisy instances when the original
data has noise. In this study, we develop a novel
oversampling method in which genetic programming is
introduced to automatically select good-quality
instances and evolve a model structure that combines
the selected instances to create a new instance. In the
proposed oversampling method, an individual is used to
represent a generated instance, which is evaluated by
the fitness function designed based on the Euclidean
distance and cosine theorem. In the experiments, we
examine the effectiveness of the proposed oversampling
method in assisting different types of classifiers to
solve the issue of class imbalance, and compare it with
popular sampling methods in unbalanced classification.
The results have been analysed comprehensively,
indicating that the new method successfully addressed
the class imbalance issue by generating a group of
good-quality instances for the minority class and
outperformed the compared sampling methods in almost
all cases.",
notes = "Also known as \cite{10793073}",
}