Optimization Techniques To Record Deduplication
Created by W.Langdon from
gp-bibliography.bib Revision:1.8051
@Article{Karunakaran:2012:JCS,
  author = "Deepa Karunakaran and Rangarajan Rangaswamy",
  title = "Optimization Techniques To Record Deduplication",
  journal = "Journal of Computer Science",
  year = "2012",
  volume = "8",
  number = "9",
  pages = "1487--1495",
  month = aug # " 11",
  keywords = "genetic algorithms, genetic programming, Data
    preprocessing, remaining datasets, similarity measure
    obtained, evaluation metrics, Artificial Bee Colony
    (ABC)",
  publisher = "Science Publications",
  ISSN = "1549-3636",
  bibsource = "OAI-PMH server at www.doaj.org",
  language = "eng",
  oai = "oai:doaj-articles:6180e9f77d61fc46394fa6778978efc6",
  URL = "http://www.thescipub.com/pdf/10.3844/jcssp.2012.1487.1495",
  URL = "http://thescipub.com/abstract/10.3844/jcssp.2012.1487.1495",
  DOI = "doi:10.3844/jcssp.2012.1487.1495",
  size = "9 pages",
  abstract = "Duplicate record detection is important for data
    preprocessing and cleaning. Artificial Bee Colony (ABC)
    is one of the most recently introduced algorithms based
    on the intelligent foraging behaviour of a honey bee
    swarm. Our approach to duplicate detection is the use
    of ABC algorithm for generating the optimal similarity
    measure to decide whether the data is duplicate or not.
    In the training phase, ABC algorithm is used to
    generate the optimal similarity measure. Once the
    optimal similarity measure obtained, the deduplication
    of remaining datasets is done with the help of optimal
    similarity measure generated from the ABC algorithm. We
    have used Restaurant and Cora datasets to analyse the
    proposed algorithm and the performance of the proposed
    algorithm is compared against the genetic programming
    technique with the help of evaluation metrics.",
}