Assessment of Weight Factor in Genetic Programming Fitness Function for Imbalanced Data Classification

  • Conference paper
Proceedings of the International Conference on Intelligent Vision and Computing (ICIVC 2021)

Part of the book series: Proceedings in Adaptation, Learning and Optimization (PALO, volume 15)

Abstract

In real-world classification applications, data are often distributed unevenly across classes. This imbalance poses a serious challenge, and traditional classification methods are often ineffective as a result. Genetic programming (GP) is affected by the same problem. One way to address it is to assign a custom, higher weight to selected classes during training, which can offset the influence of classes with higher sample counts during the classifier's learning phase. In GP, this custom weight can be introduced for the minority-class samples through the fitness function. The fitness function plays an essential role in GP and influences each of its building blocks. This work assesses the impact of weight factors in GP's fitness function for imbalanced data classification. For this assessment, eight imbalanced classification problems are taken from the UCI repository, and extensive experiments are run with different weight factors.
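The abstract does not state the fitness function itself, so the following is only a minimal sketch of the general idea: a weighted-accuracy fitness in which minority-class samples count more than majority-class samples. The function name `weighted_fitness`, its parameters `minority_label` and `weight`, and the toy data are all hypothetical and are not taken from the paper.

```python
from typing import Sequence


def weighted_fitness(y_true: Sequence[int], y_pred: Sequence[int],
                     minority_label: int = 1, weight: float = 1.0) -> float:
    """Weighted accuracy in [0, 1] (illustrative, not the paper's formula).

    Each minority-class sample contributes `weight` to the score,
    each majority-class sample contributes 1.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    earned = 0.0  # weight collected from correctly classified samples
    total = 0.0   # maximum weight that could be collected
    for truth, pred in zip(y_true, y_pred):
        w = weight if truth == minority_label else 1.0
        total += w
        if truth == pred:
            earned += w
    return earned / total if total else 0.0


# Toy data: 8 majority samples (label 0) and 2 minority samples (label 1).
y_true = [0] * 8 + [1] * 2
# A degenerate classifier that always predicts the majority class.
always_majority = [0] * 10

plain = weighted_fitness(y_true, always_majority, weight=1.0)
balanced = weighted_fitness(y_true, always_majority, weight=4.0)
print(plain, balanced)  # 0.8 0.5
```

With `weight=1.0` the degenerate classifier still scores 0.8, which is why unweighted accuracy can mislead GP on imbalanced data. Setting the weight to the imbalance ratio (here 8/2 = 4) drops its score to 0.5, so an evolved program gains fitness only by also classifying minority samples correctly.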

Author information

Corresponding author

Correspondence to Arvind Kumar.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Kumar, A., Goel, S., Sinha, N., Bhardwaj, A. (2022). Assessment of Weight Factor in Genetic Programming Fitness Function for Imbalanced Data Classification. In: Sharma, H., Vyas, V.K., Pandey, R.K., Prasad, M. (eds) Proceedings of the International Conference on Intelligent Vision and Computing (ICIVC 2021). ICIVC 2021. Proceedings in Adaptation, Learning and Optimization, vol 15. Springer, Cham. https://doi.org/10.1007/978-3-030-97196-0_1
