TPE-AutoClust: A Tree-based Pipline Ensemble Framework for Automated Clustering
Created by W.Langdon from
gp-bibliography.bib Revision:1.7964
- @InProceedings{ElShawi:2022:ICDMW,
-
author = "Radwa ElShawi and Sherif Sakr",
-
booktitle = "2022 IEEE International Conference on Data Mining
Workshops (ICDMW)",
-
title = "{TPE-AutoClust:} A Tree-based Pipline Ensemble
Framework for Automated Clustering",
-
year = "2022",
-
pages = "1144--1153",
-
abstract = "Novel technologies in automated machine learning ease
the complexity of building well-performed machine
learning pipelines. However, these are usually
restricted to supervised learning tasks such as
classification and regression, while unsupervised
learning, particularly clustering, remains a largely
unexplored problem due to the ambiguity involved when
evaluating the clustering solutions. Motivated by this
shortcoming, in this paper, we introduce TPE-AutoClust,
a genetic programming-based automated machine learning
framework for clustering. TPE-AutoClust optimizes a
series of feature preprocessors and machine learning
models to optimize the performance on an unsupervised
clustering task. TPE-AutoClust mainly consists of three
main phases: meta-learning phase, optimization phase
and clustering ensemble construction phase. The
meta-learning phase suggests some instantiations of
pipelines that are likely to perform well on a new
dataset. These pipelines are used to warm start the
optimization phase that adopts a multi-objective
optimization technique to select pipelines based on the
Pareto front of the trade-off between the pipeline
length and performance. The ensemble construction phase
develops a collaborative mechanism based on a
clustering ensemble to combine optimized pipelines
based on different internal cluster validity indices
and construct a well-performing solution for a new
dataset. The proposed framework is based on
scikit-learn with 4 preprocessors and 6 clustering
algorithms. Extensive experiments are conducted on 27
real and synthetic benchmark datasets to validate the
superiority of TPE-AutoClust. The results show that
TPE-AutoClust outperforms the state-of-the-art
techniques for building automated clustering
solutions.",
-
keywords = "genetic algorithms, genetic programming",
-
DOI = "doi:10.1109/ICDMW58026.2022.00149",
-
ISSN = "2375-9259",
-
month = nov,
-
notes = "Also known as \cite{10031132}",
- }