Created by W.Langdon from gp-bibliography.bib Revision:1.7538
Many studies that were conducted in the realm of clustering have addressed some of these challenges. This study proposes a new approach, called parallel hybrid clustering using genetic programming and multi-objective fitness with density (PYRAMID), that tackles several of these challenges from a different perspective.
PYRAMID employs genetic programming to represent arbitrary cluster shapes and circumvent falling in local optima. It accommodates large data sets and avoids dependency on the order of input by quantizing the data space, i.e., the space on which the data set resides, thus abstracting it into hyper-rectangular cells and creating genetic programming individuals as concatenations of these cells. Thus the cells become the subject of clustering, rather than the data points themselves. PYRAMID also uses a density-based multi-objective fitness function to handle outliers. It gathers statistics in a pre-processing step and uses them so not to rely on user-defined parameters. Finally, PYRAMID employs data parallelism in a master-slave model in an attempt to cure the inherent slow performance of evolutionary algorithms and provide speedup. A master processor distributes the clustering data evenly onto multiple slave processors. The slave processors conduct the clustering on their local data sets and report their clustering results back to the master, which consolidates them by merging the partial results into a final clustering solution. This last step also involves determining the number of clusters dynamically and labeling them accordingly.
Experiments have demonstrated that, using these features, PYRAMID offers an advantage over some of the existing approaches by tackling the clustering challenges from a different angle.",
UMI Microform 3208008",
Genetic Programming entries for Samir Tout