Improving Feature Ranking for Biomarker Discovery in Proteomics Mass Spectrometry Data using Genetic Programming
Created by W.Langdon from
gp-bibliography.bib Revision:1.8051
@Article{Ahmed:2014:CS,
author = "Soha Ahmed and Mengjie Zhang and Lifeng Peng",
title = "Improving Feature Ranking for Biomarker Discovery in
Proteomics Mass Spectrometry Data using Genetic
Programming",
journal = "Connection Science",
year = "2014",
volume = "26",
number = "3",
pages = "215--243",
keywords = "genetic algorithms, genetic programming, biomarker
discovery, feature selection, classification",
ISSN = "0954-0091",
DOI = "10.1080/09540091.2014.906388",
size = "29 pages",
abstract = "Feature selection on mass spectrometry (MS) data is
essential for improving classification performance and
biomarker discovery. The number of MS samples is
typically very small compared with the high
dimensionality of the samples, which makes the problem
of biomarker discovery very hard. In this paper, we
propose the use of genetic programming for biomarker
detection and classification of MS data. The proposed
approach is composed of two phases: in the first phase,
feature selection and ranking are performed. In the
second phase, classification is performed. The results
show that the proposed method can achieve better
classification performance and biomarker detection rate
than the information gain (IG) based and the RELIEF
feature selection methods. Meanwhile, four classifiers,
Naive Bayes, J48 decision tree, random forest and
support vector machines, are also used to further test
the performance of the top ranked features. The results
show that the four classifiers using the top ranked
features from the proposed method achieve better
performance than the IG and the RELIEF methods.
Furthermore, GP also outperforms a genetic algorithm
approach on most of the used data sets.",
}