
Training genetic programming classifiers by vicinal-risk minimization


Abstract

We propose and motivate the use of vicinal-risk minimization (VRM) for training genetic programming classifiers. We show that VRM has a number of attractive properties and that it correlates better with generalization error than empirical risk minimization (ERM), so minimizing it is more likely to yield good generalization performance. From statistical tests over a range of real and synthetic datasets, we further demonstrate that VRM yields consistently lower generalization errors than conventional ERM.
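To make the distinction concrete, here is a minimal sketch (not the paper's implementation) contrasting the two risks for a fixed classifier, using hinge loss and Gaussian vicinities around each training point; the toy discriminant f, the vicinity width sigma and the Monte Carlo sample count are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)

    def erm_risk(f, X, y):
        # Empirical risk: mean hinge loss evaluated at the training points only.
        return float(np.mean(np.maximum(0.0, 1.0 - y * f(X))))

    def vrm_risk(f, X, y, sigma=0.1, n_samples=50):
        # Vicinal risk: mean hinge loss over a Gaussian vicinity around each
        # training point, estimated here by Monte Carlo sampling.
        per_point = []
        for x_i, y_i in zip(X, y):
            vicinity = x_i + sigma * rng.standard_normal((n_samples, x_i.size))
            per_point.append(np.mean(np.maximum(0.0, 1.0 - y_i * f(vicinity))))
        return float(np.mean(per_point))

    # Toy linear discriminant standing in for an evolved GP expression tree.
    f = lambda X: X @ np.array([1.0, -0.5])
    X = rng.standard_normal((40, 2))
    y = np.where(f(X) + 0.3 * rng.standard_normal(40) >= 0.0, 1.0, -1.0)

    print("ERM risk:", erm_risk(f, X, y))
    print("VRM risk:", vrm_risk(f, X, y))

Because each training point is replaced by a small cloud of perturbed copies, the vicinal risk penalizes decision boundaries that pass close to the data even when every training point is classified correctly.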


Notes

  1. This is exactly what is done in the hinge loss used in soft-margin support-vector machines [4]; the loss is written out after these notes.

  2. We omit the normalization of the Parzen density estimate since this contributes only a multiplicative constant that does not affect the subsequent minimization stage; the estimate is written out after these notes.

  3. Downloadable from http://www.stats.ox.ac.uk/pub/PRNN/.

  4. The sum of ranks 1–11 is \(1 + 2 + \cdots + 11 = 11 \times 12 / 2 = 66\).

  5. In this paper we make what may seem to be rather cautious statements about the outcome of hypothesis tests. Hypothesis tests are frequently misinterpreted—see Cohen [6] for a discussion of the technical arguments.
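For reference, the two quantities mentioned in notes 1 and 2, written out explicitly. The hinge loss used by soft-margin SVMs penalizes margin violations linearly:

\[ \ell_{\mathrm{hinge}}(y, f(\mathbf{x})) = \max\bigl(0,\, 1 - y\,f(\mathbf{x})\bigr), \qquad y \in \{-1, +1\}. \]

Assuming Gaussian kernels of width \(\sigma\) in \(d\) dimensions (an illustrative choice of kernel), the Parzen estimate over training points \(\mathbf{x}_1, \ldots, \mathbf{x}_N\) is

\[ \hat{p}(\mathbf{x}) = \frac{1}{N(2\pi\sigma^2)^{d/2}} \sum_{i=1}^{N} \exp\!\left(-\frac{\lVert \mathbf{x} - \mathbf{x}_i \rVert^2}{2\sigma^2}\right), \]

where the leading factor \(1/N(2\pi\sigma^2)^{d/2}\) is the multiplicative constant omitted in note 2.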

References

  1. N.M. Amil, N. Bredeche, C. Gagné, S. Gelly, M. Schoenauer, O. Teytaud, A statistical learning perspective of genetic programming, 12th European Conference on Genetic Programming (EuroGP 2009) (Tübingen, Germany, 2009), pp. 327–338

  2. C.E. Borges, C.L. Alonso, J.L. Montaña, Model selection in genetic programming, 12th Annual Conference on Genetic and Evolutionary Computation (GECCO 2010) (Portland, OR, 2010), pp. 985–986

  3. O. Chapelle, J. Weston, L. Bottou, V. Vapnik, Vicinal risk minimization, Advances in Neural Information Processing Systems 13 (NIPS 2000) (Denver, CO, 2000), pp. 416–422

  4. V. Cherkassky, F.M. Mulier, Learning from data: concepts, theory and methods, 2nd edn. (Wiley-IEEE Press, New Jersey, 2007)

  5. C.A.C. Coello, G.B. Lamont, Applications of multi-objective evolutionary algorithms, vol. 1 (World Scientific, Singapore, 2004)

  6. J. Cohen, The earth is round (\(p <.05\)). Am. Psychol. 49(12), 997–1003 (1994)

  7. J. Demšar, Statistical comparisons of classifiers over multiple data sets. J. Mach. Learn. Res. 7, 1–30 (2006)

  8. R.O. Duda, P.E. Hart, D.G. Stork, Pattern recognition, 2nd edn. (John Wiley and Sons, New York, 2001)

  9. A. Ekárt, S.Z. Németh, Selection based on the Pareto nondomination criterion for controlling code growth in genetic programming. Genet. Program. Evol. M. 2(1), 61–73 (2001)

  10. A. Frank, A. Asuncion, UCI Machine Learning Repository. University of California, Irvine, School of Information and Computer Sciences. (2010) http://archive.ics.uci.edu/ml

  11. T. Hastie, R. Tibshirani, J. Friedman, The elements of statistical learning: data mining, inference, and prediction, 2nd edn. (Springer, Berlin, 2009)

  12. L. Holmström, P. Koistinen, Using additive noise in back-propagation training. IEEE Trans. Neural Netw. 3(1), 24–38 (1992)

  13. H. Iba, H. de Garis, T. Sato, Genetic programming using a minimum description length principle, in Advances in Genetic Programming (MIT Press, 1994), pp. 265–284

  14. G.N. Karystinos, D.A. Pados, On overfitting, generalization, and randomly expanded training sets. IEEE Trans. Neural Netw. 11(5), 1050–1057 (2000)

  15. R. Kumar, P.I. Rockett, Improved sampling of the Pareto-front in multiobjective genetic optimizations by steady-state evolution: A Pareto converging genetic algorithm. Evol. Comput. 10(3), 283–314 (2002)

  16. C. Nadeau, Y. Bengio, Inference for the generalization error. Mach. Learn. 52(3), 239–281 (2003)

  17. E. Polak, Optimization: algorithms and consistent approximations (Springer, New York, 1997)

  18. R. Poli, W.B. Langdon, N.F. McPhee, A Field Guide to Genetic Programming. Published via http://lulu.com and freely available at http://www.gp-field-guide.org.uk (2008)

  19. B.D. Ripley, Neural networks and related methods for classification. J. Roy. Stat. Soc. B Met. 56(3), 409–456 (1994)

  20. J. Rissanen, Modeling by shortest data description. Automatica 14(5), 465–471 (1978)

  21. C.P. Robert, G. Casella, Monte Carlo statistical methods, 2nd edn. (Springer, New York, 2005)

  22. S. Silva, S. Dignum, L. Vanneschi, Operator equalisation for bloat free genetic programming and a survey of bloat control methods. Genet. Program. Evol. M. 13(2), 197–238 (2012)

  23. C. Soares, Is the UCI repository useful for data mining?, \(11^{th}\) Portuguese Conference on Artificial Intelligence (EPIA 2003) (Beja, Portugal, 2003), pp. 209–223

  24. A.N. Tikhonov, V.Y. Arsenin, Solutions of ill posed problems (V.H. Winston, Washington, DC, 1977)

  25. V.N. Vapnik, The nature of statistical learning theory, 2nd edn. (Springer, New York, 2000)

  26. Y. Zhang, P. Rockett, A comparison of three evolutionary strategies for multiobjective genetic programming. Artif. Intell. Rev. 27(2–3), 149–163 (2007)

  27. Y. Zhang, P.I. Rockett, A generic optimising feature extraction method using multiobjective genetic programming. Appl. Soft Comput. 11(1), 1087–1097 (2011)

Acknowledgments

The authors would like to thank Yilong Cao and Richard Everson for valuable discussions.

Author information

Correspondence to Peter Rockett.


Cite this article

Ni, J., Rockett, P. Training genetic programming classifiers by vicinal-risk minimization. Genet Program Evolvable Mach 16, 3–25 (2015). https://doi.org/10.1007/s10710-014-9222-4
