abstract = "Biomarker detection in LC-MS data depends mainly on
feature selection algorithms as the number of features
is extremely high while the number of samples is very
small. This makes classification of these data sets
extremely challenging. In this paper we propose the use
of genetic programming (GP) for subset feature
selection in LC-MS data, which works by maximizing the
signal-to-noise ratio of the selected features.
The proposed method was applied to eight LC-MS data
sets with different sample sizes and different levels
of concentration of the spiked biomarkers. We evaluated
the method by the accuracy of selection against the list
of biomarkers and by the classification accuracy of the
selected features via support vector machine
(SVM) and Naive Bayes (NB) classifiers. Features
selected by the proposed GP method achieved
perfect classification accuracy for most of the data
sets. The results show that the proposed method strikes
a reasonable compromise between the detection rate of
the biomarkers and the classification accuracy for all
data sets. The method was also compared to linear
Support Vector Machine-Recursive Feature Elimination
(SVM-RFE) and t-test for feature selection and the
results show that the biomarker detection rate of the
proposed approach is higher.",
notes = "Also known as \cite{6557621}
CEC 2013 - A joint meeting of the IEEE, the EPS and the
IET.",