Discovery of characteristic knowledge in databases using cluster analysis and genetic programming
Created by W.Langdon from
gp-bibliography.bib Revision:1.8120
@PhdThesis{oai:xtcat.oclc.org:OCLCNo/ocm42996482,
  title = "Discovery of characteristic knowledge in databases
    using cluster analysis and genetic programming",
  author = "Tae-wan Ryu",
  year = "1998",
  description = "Degree granted by Dept. of Computer Science.; Thesis
    (Ph. D.)--University of Houston, 1998.; Includes
    bibliographical references (leaves 142-150).",
  oai = "oai:xtcat.oclc.org:OCLCNo/ocm42996482",
  school = "Department of Computer Science, University of
    Houston",
  address = "USA",
  month = dec,
  email = "tryu@ecs.fullerton.edu",
  keywords = "genetic algorithms, genetic programming, Computer
    science, Cluster analysis--Data processing",
  URL = "http://search.proquest.com/docview/304438390",
  size = "156 pages",
  abstract = "Knowledge discovery in data (KDD) is the generic
    approach to analyse and extract useful knowledge from
    data collections using computerised tools. Applying KDD
    techniques directly to a database is not
    straightforward, since in a database, there may be
    several views of the database depending on the user's
    interests, unlike the data collections stored in a
    single flat file format. Moreover, in many cases, there
    is a data model discrepancy between the target database
    and the representation format for the input data set
    that most KDD techniques expect. The presented research
    centres on developing methodologies, techniques, and
    tools to discover useful characteristic knowledge in
    databases. Our approach is first to partition a given
    database into several clusters with similar properties
    using cluster analysis, and then to discover
    characteristic knowledge in each cluster using genetic
    programming. In this research, we analyzed the problems
    in clustering databases. We proposed an extended data
    set format as an input data set format that can store
    related information unlike a traditional flat file
    format. We developed an automatic tool that generates
    an extended data set from databases, which may contain
    the related information from related tables or classes.
    We proposed a unified similarity framework that can
    cope with various kinds of data sets, and generalised
    clustering algorithms for the proposed similarity
    framework. We also developed a discovery system that
    takes the set of data objects in each cluster and
    discovers characteristic knowledge for the given object
    set using genetic programming.",
  notes = "http://www2.cs.uh.edu/~ceick/stud2004.html UMI 9917211
    Supervisor: Christoph F. Eick",
}