A hybrid under-sampling approach for mining unbalanced datasets: applications to banking and insurance
Created by W.Langdon from
gp-bibliography.bib Revision:1.7913
@Article{Vasu:2011:IJDMMM,
title = "A hybrid under-sampling approach for mining unbalanced
datasets: applications to banking and insurance",
author = "Madireddi Vasu and Vadlamani Ravi",
publisher = "Inderscience Publishers",
year = "2011",
month = mar # "~03",
volume = "3",
keywords = "genetic algorithms, genetic programming, insurance
fraud detection, credit card churn prediction, data
mining, unbalanced datasets, machine learning, banking,
classifiers, classifier performance, k-means
clustering, support vector machines, SVM, logistic
regression, multilayer perceptron, radial basis
function networks, RBF neural networks, GMDH, decision
trees",
ISSN = "1759-1171",
bibsource = "OAI-PMH server at www.inderscience.com",
journal = "Int. J. of Data Mining, Modelling and Management",
issue = "1",
language = "eng",
pages = "75--105",
relation = "ISSN online: 1759-1171 ISSN print: 1759-1163",
rights = "Inderscience Copyright",
source = "IJDMMM (2011), Vol 3 Issue 1, pp 75 - 105",
URL = "http://www.inderscience.com/link.php?id=38812",
DOI = "10.1504/IJDMMM.2011.038812",
abstract = "In solving unbalanced classification problems, machine
learning algorithms are overwhelmed by the majority
class and consequently misclassify the minority class
observations. Here, we propose a hybrid under-sampling
approach to improve the performance of classifiers. The
proposed approach first employs the k-reverse nearest
neighbour (kRNN) method to detect outliers in the
majority class. After removing the outliers, using
K-means clustering, K clusters are selected to further
reduce the influence of the majority class. Then, we
employed support vector machine (SVM), logistic
regression (LR), multilayer perceptron (MLP), radial
basis function network (RBF), group method of data
handling (GMDH), genetic programming (GP) and decision
tree (J48) classifiers. The
effectiveness of the proposed approach was demonstrated
on datasets taken from insurance fraud detection and
credit card churn in the banking domain. Ten-fold
cross-validation was used in the study. It is observed
that the proposed approach improved the performance of
the classifiers.",
}