Kaizen Programming for Feature Construction for Classification

Chapter in Genetic Programming Theory and Practice XIII

Part of the book series: Genetic and Evolutionary Computation (GEVO)

Abstract

A data set for classification is commonly composed of a set of features defining the data space representation and one attribute corresponding to each instance's class. A classification tool has to discover how to separate classes based on the features, but the discovery of useful knowledge may be hampered by inadequate or insufficient features. Pre-processing steps for the automatic construction of new high-level features have been proposed to discover hidden relationships among features and to improve classification quality. Here we present a new tool for high-level feature construction: Kaizen Programming. This tool can construct many complementary/dependent high-level features simultaneously. We show that our approach outperforms related methods on well-known binary-class medical data sets using a decision-tree classifier, achieving greater accuracy and smaller trees.
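To make the motivation concrete, the following is a minimal sketch of why a constructed feature can yield a more accurate and smaller tree. It is not Kaizen Programming and does not use the chapter's data sets: the data are synthetic, the high-level feature (`x1 * x2`) is built by hand rather than discovered automatically, and `best_stump_accuracy` is a hypothetical helper that scores the best single-split (depth-1) decision tree on one feature.

```python
import random


def best_stump_accuracy(column, labels):
    """Training accuracy of the best single-threshold split on one feature."""
    n = len(labels)
    best = 0.0
    values = sorted(set(column))
    # Candidate thresholds: midpoints between consecutive distinct values.
    thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])]
    for t in thresholds:
        # Rule "predict class 1 when value > t"; the flipped rule scores n - hits.
        hits = sum((v > t) == bool(y) for v, y in zip(column, labels))
        best = max(best, hits / n, (n - hits) / n)
    return best


random.seed(0)

# Synthetic binary-class data (an assumption for illustration only):
# the class depends on the sign of x1 * x2, which no single original
# feature reveals on its own.
points = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(200)]
labels = [1 if x1 * x2 > 0 else 0 for x1, x2 in points]

x1_col = [p[0] for p in points]
x2_col = [p[1] for p in points]
product_col = [p[0] * p[1] for p in points]  # hand-built high-level feature

acc_x1 = best_stump_accuracy(x1_col, labels)
acc_x2 = best_stump_accuracy(x2_col, labels)
acc_product = best_stump_accuracy(product_col, labels)

print("x1 only:     ", acc_x1)
print("x2 only:     ", acc_x2)
print("x1 * x2 only:", acc_product)
```

On this toy data a single split on the constructed feature separates the two classes, whereas a tree restricted to the original features needs more than one split to do so. An automatic feature-construction method aims to find such combinations without being told which ones to try.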

Notes

  1. Thirty-two runs were performed because it is a multiple of 8, and the runs were done in parallel on a quad-core machine with hyper-threading, so we employed all available processing units.

Acknowledgements

This paper was supported by the Brazilian Government CNPq (Universal) grant (486950/2013-1) and CAPES (Science without Borders) grant (12180-13-0) to Vinícius Veloso de Melo, and Canada’s NSERC Discovery grant RGPIN 283304-2012 to Wolfgang Banzhaf.

Author information

Corresponding author

Correspondence to Vinícius Veloso de Melo.

Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

de Melo, V.V., Banzhaf, W. (2016). Kaizen Programming for Feature Construction for Classification. In: Riolo, R., Worzel, W., Kotanchek, M., Kordon, A. (eds) Genetic Programming Theory and Practice XIII. Genetic and Evolutionary Computation. Springer, Cham. https://doi.org/10.1007/978-3-319-34223-8_3

  • DOI: https://doi.org/10.1007/978-3-319-34223-8_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-34221-4

  • Online ISBN: 978-3-319-34223-8

  • eBook Packages: Computer Science, Computer Science (R0)
