Can-Evo-Ens: Classifier stacking based evolutionary ensemble system for prediction of human breast cancer using amino acid sequences
Created by W.Langdon from
gp-bibliography.bib Revision:1.8081
@Article{Ali:2015:JBI,
  author = "Safdar Ali and Abdul Majid",
  title = "{Can-Evo-Ens}: Classifier stacking based evolutionary
    ensemble system for prediction of human breast cancer
    using amino acid sequences",
  journal = "Journal of Biomedical Informatics",
  volume = "54",
  pages = "256--269",
  year = "2015",
  month = apr,
  keywords = "genetic algorithms, genetic programming, Breast
    cancer, Amino acids, Physicochemical properties,
    Stacking ensemble",
  ISSN = "1532-0464",
  URL = "http://www.sciencedirect.com/science/article/pii/S1532046415000064",
  DOI = "10.1016/j.jbi.2015.01.004",
  abstract = "The diagnosis of human breast cancer is an intricate
    process and specific indicators may produce negative
    results. In order to avoid misleading results, an accurate
    and reliable diagnostic system for breast cancer is
    indispensable. Recently, several interesting
    machine-learning (ML) approaches have been proposed for
    prediction of breast cancer. To this end, we developed
    a novel classifier stacking based evolutionary ensemble
    system, Can-Evo-Ens, for predicting amino acid sequences
    associated with breast cancer. In this paper, first, we
    selected four diverse types of ML algorithms, Naive
    Bayes, K-Nearest Neighbour, Support Vector Machines,
    and Random Forest, as base-level classifiers. These
    classifiers are trained individually in different
    feature spaces using physicochemical properties of
    amino acids. In order to exploit the decision spaces,
    the preliminary predictions of base-level classifiers
    are stacked. Genetic programming (GP) is then employed
    to develop a meta-classifier that optimally combines the
    predictions of the base classifiers. The most suitable
    threshold value of the best-evolved predictor is
    computed using the Particle Swarm Optimisation technique.
    Our experiments have demonstrated the robustness of
    the Can-Evo-Ens system on an independent validation dataset.
    The proposed system has achieved the highest value of
    Area Under Curve (AUC) of ROC Curve of 99.95\% for
    cancer prediction. The comparative results revealed
    that the proposed approach is better than individual ML
    approaches and conventional ensemble approaches of
    AdaBoostM1, Bagging, GentleBoost, and Random Subspace.
    It is expected that the proposed novel system would
    have a major impact on the fields of Biomedicine,
    Genomics, Proteomics, Bioinformatics, and Drug
    Development.",
}