A novel fitness function in genetic programming for medical data classification

https://doi.org/10.1016/j.jbi.2020.103623
Under an Elsevier user license
open archive

Highlights

  • A novel fitness function in Genetic Programming for Medical Data Classification has been proposed.

  • Four benchmark medical data-sets, taken from the UCI repository, are classified using the proposed technique.

  • The performance of the proposed technique has been compared with Support Vector Machine (SVM) and other state-of-the-art works available in the literature.

  • The results show that the proposed technique performs better than or comparably to the SVM and other state-of-the-art works.

Abstract

In the last decade, machine learning (ML) techniques have been widely applied to identify different diseases, which facilitates early diagnosis and increases the chance of survival. The majority of medical data-sets are unbalanced, so ML classification techniques tend to give classifications biased toward the majority class. In this paper, a novel fitness function in Genetic Programming (GP) for medical data classification is proposed that handles the problem of unbalanced data. Four benchmark medical data-sets, namely chronic kidney disease (CKD), fertility, BUPA liver disorder, and Wisconsin diagnostic breast cancer (WDBC), were taken from the University of California, Irvine (UCI) machine learning repository and classified using the proposed technique. The proposed technique achieved best accuracies of 100%, 99.12%, 85.0%, and 75.36% and best AUC values of 1.0, 0.99, 0.92, and 0.75 on the CKD, WDBC, fertility, and BUPA data-sets, respectively. The results show an improvement over other GP and SVM methods, which confirms the efficiency of the proposed algorithm.
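
The majority-class bias mentioned in the abstract can be illustrated with a small sketch. The abstract does not define the proposed fitness function, so the code below does not reproduce it. It only contrasts plain accuracy with balanced accuracy (the mean of per-class recall, used here as a generic imbalance-aware stand-in) on made-up labels. The function names and the 90/10 class split are assumptions chosen for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally regardless of size.

    Illustrative stand-in only; this is NOT the fitness function of the paper.
    """
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        hits = sum(1 for i in idx if y_pred[i] == cls)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical unbalanced data: 90 majority-class (0) and 10 minority-class (1) samples.
y_true = [0] * 90 + [1] * 10

# A degenerate classifier that always predicts the majority class.
majority_only = [0] * 100

acc = accuracy(y_true, majority_only)           # high, despite missing every minority case
bal = balanced_accuracy(y_true, majority_only)  # exposes the missed minority class

print(f"accuracy = {acc:.2f}, balanced accuracy = {bal:.2f}")
```

A classifier that ignores the minority class entirely still scores well on plain accuracy here, which is why a fitness measure based on accuracy alone can steer a search toward majority-biased solutions.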

Keywords

Medical data classification
Genetic Programming
Fitness function
Unbalanced data classification
