Abstract
Real-world classification problems often have an imbalanced distribution of samples across classes. This imbalance poses a serious challenge, because traditional classification methods tend to perform poorly on such data. The problem also affects genetic programming (GP). One approach to address it is to assign a custom, higher weight to the minority class during training, which can offset the effect of the larger sample counts of the other classes during the learning phase of the classifier. In GP, this custom weight can be introduced for the minority class samples through the fitness function. The fitness function plays an essential role in GP and influences each of its building blocks. This work assesses the impact of weight factors in GP's fitness function for imbalanced data classification. For this assessment, eight imbalanced classification problems are taken from the UCI repository, and extensive experiments are conducted with different weight factors.
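To make the idea of a weight factor concrete, the following is a minimal sketch of one common form of weighted fitness for binary classification: a weighted average of the per-class accuracies. The abstract does not state the exact fitness function used in the paper, so the formula, the function name `weighted_fitness`, and the parameters `minority_label` and `weight` are assumptions made for illustration only.

```python
from typing import Sequence


def weighted_fitness(y_true: Sequence[int], y_pred: Sequence[int],
                     minority_label: int = 1, weight: float = 0.5) -> float:
    """Illustrative fitness: weight * minority accuracy + (1 - weight) * majority accuracy.

    This is an assumed formulation, not necessarily the one used in the paper.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    min_total = min_hit = maj_total = maj_hit = 0
    for t, p in zip(y_true, y_pred):
        if t == minority_label:
            min_total += 1
            min_hit += int(p == t)
        else:
            maj_total += 1
            maj_hit += int(p == t)

    # Per-class accuracy; a class with no samples contributes 0.
    min_acc = min_hit / min_total if min_total else 0.0
    maj_acc = maj_hit / maj_total if maj_total else 0.0
    return weight * min_acc + (1.0 - weight) * maj_acc


# Toy data: nine majority samples (label 0) and one minority sample (label 1).
y_true = [0] * 9 + [1]
always_majority = [0] * 10  # a degenerate classifier that ignores the minority class

plain_accuracy = sum(t == p for t, p in zip(y_true, always_majority)) / len(y_true)
balanced = weighted_fitness(y_true, always_majority, weight=0.5)
minority_heavy = weighted_fitness(y_true, always_majority, weight=0.8)
```

On this toy data the degenerate classifier scores well on plain accuracy but is penalized by the weighted fitness, and the penalty grows as `weight` moves toward the minority class. That is the kind of effect a study of weight factors would be expected to examine.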
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Kumar, A., Goel, S., Sinha, N., Bhardwaj, A. (2022). Assessment of Weight Factor in Genetic Programming Fitness Function for Imbalanced Data Classification. In: Sharma, H., Vyas, V.K., Pandey, R.K., Prasad, M. (eds) Proceedings of the International Conference on Intelligent Vision and Computing (ICIVC 2021). ICIVC 2021. Proceedings in Adaptation, Learning and Optimization, vol 15. Springer, Cham. https://doi.org/10.1007/978-3-030-97196-0_1
DOI: https://doi.org/10.1007/978-3-030-97196-0_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-97195-3
Online ISBN: 978-3-030-97196-0
eBook Packages: Intelligent Technologies and Robotics (R0)