Fitness Functions in Genetic Programming for Classification with Unbalanced Data

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4830)

Abstract

This paper describes a genetic programming (GP) approach to binary classification with class imbalance problems. This approach is examined on two benchmark and two synthetic data sets. The results show that when using the overall classification accuracy as the fitness function, the GP system is strongly biased toward the majority class. Two new fitness functions are developed to deal with the class imbalance problem. The experimental results show that both of them substantially improve the performance for the minority class, and the performance for the majority and minority classes is much more balanced.
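The bias described in the abstract can be illustrated with a small sketch. The abstract does not define the two new fitness functions, so the `balanced_fitness` function below (the mean of the per-class accuracies) is only an assumed stand-in for a class-aware fitness measure, not the authors' formulation. All function names and the 90/10 class split are hypothetical choices made for illustration.

```python
from typing import Sequence


def overall_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of all examples classified correctly."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def per_class_accuracy(y_true: Sequence[int], y_pred: Sequence[int], cls: int) -> float:
    """Fraction of the examples of class `cls` that are classified correctly."""
    total = sum(1 for t in y_true if t == cls)
    if total == 0:
        return 0.0  # no examples of this class, so nothing to get right
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    return hits / total


def balanced_fitness(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Assumed stand-in fitness: mean of the two per-class accuracies."""
    return 0.5 * (per_class_accuracy(y_true, y_pred, 0)
                  + per_class_accuracy(y_true, y_pred, 1))


if __name__ == "__main__":
    # Hypothetical unbalanced data: 90 majority (0) and 10 minority (1) examples.
    y_true = [0] * 90 + [1] * 10

    # A degenerate classifier that always predicts the majority class.
    always_majority = [0] * 100

    # A classifier that gets 80% of each class right.
    even_handed = [0] * 72 + [1] * 18 + [1] * 8 + [0] * 2

    for name, y_pred in [("always_majority", always_majority),
                         ("even_handed", even_handed)]:
        print(name,
              round(overall_accuracy(y_true, y_pred), 3),
              round(balanced_fitness(y_true, y_pred), 3))
```

Under overall accuracy the always-majority classifier scores 0.9 and beats the even-handed one at 0.8, so selection would favour it even though it never detects the minority class. Under the assumed per-class average the ranking reverses (0.5 versus 0.8).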

Editor information

Mehmet A. Orgun, John Thornton

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Patterson, G., Zhang, M. (2007). Fitness Functions in Genetic Programming for Classification with Unbalanced Data. In: Orgun, M.A., Thornton, J. (eds) AI 2007: Advances in Artificial Intelligence. AI 2007. Lecture Notes in Computer Science, vol 4830. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-76928-6_90

  • DOI: https://doi.org/10.1007/978-3-540-76928-6_90

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-76926-2

  • Online ISBN: 978-3-540-76928-6

  • eBook Packages: Computer Science, Computer Science (R0)
