A Classification Model For Class Imbalance Dataset Using Genetic Programming
Created by W.Langdon from
gp-bibliography.bib Revision:1.8051
@Article{Tahir:2019:ACC,
author = "Mirza Amaad Ul Haq Tahir and Sohail Asghar and
Awais Manzoor and Muhammad Asim Noor",
title = "A Classification Model For Class Imbalance Dataset
Using Genetic Programming",
journal = "IEEE Access",
year = "2019",
volume = "7",
keywords = "genetic algorithms, genetic programming",
DOI = "doi:10.1109/ACCESS.2019.2915611",
ISSN = "2169-3536",
pages = "71013--71037",
abstract = "For the last few decades, class imbalance has been
one of the most challenging problems in various fields,
such as data mining and machine learning. An imbalanced
dataset is one in which the classes associated with the
dataset are distributed unevenly. This happens when the
positive class is much
smaller than the negative class. In this case, most
standard classification algorithms do not identify
examples related to the positive class. A positive
class usually refers to the key interest of the
classification task. In order to solve this problem,
several solutions were proposed such as sampling-based
over-sampling and under-sampling, changes at the
classifier level or the combination of two or more
classifiers. However, the main problem is that most
solutions are biased towards the negative class, are
computationally expensive, have storage issues or
take a long training time. An alternative approach to
this problem is the genetic algorithm (GA), which has
shown promising results. The GA is an evolutionary
learning algorithm that uses the principles of
Darwinian evolution; it is a powerful global search
algorithm. Moreover, the fitness function is a key
parameter in GA. It determines how well a solution can
solve the given problem. In this paper, we propose a
solution which uses entropy and information gain as a
fitness function in GA with the objective of improving the
impurity and giving a more balanced result without
changing the original dataset. The experiments
conducted on different datasets demonstrate the
effectiveness of the proposed solution in comparison
with several other state-of-the-art algorithms in
terms of accuracy (Acc), geometric mean (GM), F-measure
(FM), kappa, and Matthews correlation coefficient
(MCC).",
notes = "Also known as \cite{8709798}",
}