Discovering New Rule Induction Algorithms with Grammar-based Genetic Programming

Rule induction is a data mining technique used to extract classification rules of the form IF (conditions) THEN (predicted class) from data. Most rule induction algorithms in the literature follow the sequential covering strategy, which induces one rule at a time until all (or almost all) of the training data is covered by the induced rule set. This strategy describes a basic algorithm composed of several key elements, which can be modified and/or extended to generate new and better rule induction algorithms. With this in mind, this work proposes the use of a grammar-based genetic programming (GGP) algorithm to automatically discover new sequential covering algorithms. The proposed system is evaluated on 20 data sets, and the automatically discovered rule induction algorithms are compared with four well-known human-designed rule induction algorithms. The results show that the GGP system is a promising approach for discovering new sequential covering algorithms.
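
To make the sequential covering strategy described above concrete, here is a minimal Python sketch of the generic loop: learn one rule, remove the examples it covers, and repeat. It is an illustration under stated assumptions, not the chapter's GGP system or any of the algorithms it evolves. The function names, the toy weather data, the use of training accuracy to evaluate candidate rules, and the greedy general-to-specific search are all assumptions chosen for brevity; these are the kind of elements the abstract says can be modified or extended.

```python
from collections import Counter


def covers(rule, example):
    """True if the example satisfies every condition in the rule's antecedent."""
    return all(example[attr] == value for attr, value in rule["conditions"].items())


def majority_class(examples, target):
    """Return (most frequent class, its relative frequency) among the examples."""
    label, count = Counter(e[target] for e in examples).most_common(1)[0]
    return label, count / len(examples)


def learn_one_rule(examples, attributes, target):
    """Greedy general-to-specific search (an assumed design choice): start with an
    empty antecedent and keep adding the attribute=value test that most improves
    accuracy on the covered examples."""
    conditions, covered = {}, list(examples)
    label, accuracy = majority_class(covered, target)
    while accuracy < 1.0:
        best = None
        for attr in attributes:
            if attr in conditions:
                continue
            for value in sorted({e[attr] for e in covered}):
                subset = [e for e in covered if e[attr] == value]
                sub_label, sub_acc = majority_class(subset, target)
                if sub_acc > accuracy and (best is None or sub_acc > best[0]):
                    best = (sub_acc, attr, value, subset, sub_label)
        if best is None:
            break  # no single refinement improves the rule
        accuracy, attr, value, covered, label = best
        conditions[attr] = value
    return {"conditions": conditions, "predicted_class": label}


def sequential_covering(examples, attributes, target, max_uncovered=0):
    """Induce one rule at a time, removing covered examples after each rule."""
    rules, remaining = [], list(examples)
    while len(remaining) > max_uncovered:
        rule = learn_one_rule(remaining, attributes, target)
        if not rule["conditions"]:
            break  # nothing better than a default rule is left to learn
        rules.append(rule)
        remaining = [e for e in remaining if not covers(rule, e)]
    default, _ = majority_class(remaining or examples, target)
    return rules, default


def predict(rules, default, example):
    """Ordered rule list: the first rule that covers the example decides its class."""
    for rule in rules:
        if covers(rule, example):
            return rule["predicted_class"]
    return default


# Hypothetical toy data, used only to exercise the sketch.
examples = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]
rules, default = sequential_covering(examples, ["outlook", "windy"], "play")
for rule in rules:
    print("IF", rule["conditions"], "THEN", rule["predicted_class"])
print("DEFAULT", default)
```

In this sketch the rule search, the rule evaluation measure, and the stopping criterion are each isolated in a few lines, so swapping any one of them (for example, replacing accuracy with another measure, or raising max_uncovered to tolerate uncovered examples) yields a different sequential covering variant.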

Copyright information

© 2008 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Pappa, G.L., Freitas, A.A. (2008). Discovering New Rule Induction Algorithms with Grammar-based Genetic Programming. In: Maimon, O., Rokach, L. (eds) Soft Computing for Knowledge Discovery and Data Mining. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-69935-6_6

  • DOI: https://doi.org/10.1007/978-0-387-69935-6_6

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-387-69934-9

  • Online ISBN: 978-0-387-69935-6

  • eBook Packages: Computer Science, Computer Science (R0)
