Abstract
John R. Koza defined several metrics for measuring the performance of an Evolutionary Algorithm, and these have been widely used by the Genetic Programming community. Despite the importance of these metrics, and the doubts they have raised among many authors, their reliability has attracted little research attention and is still not well understood. This lack of knowledge has likely contributed to the decline in their usage in recent years. This paper attempts to increase the knowledge about these measures by exploring the circumstances in which they are more reliable, providing some clues to improve how they are used, and eventually making their use more justifiable. Specifically, we investigate the amount of uncertainty associated with the measures, taking an analytical and empirical approach and deriving theoretical bounds on the error. Additionally, a new method to calculate Koza’s performance measures is presented. It is shown that these metrics, under common experimental configurations, have an unacceptable error, which can be arbitrarily large under certain conditions.
Notes
All the code, configuration files and datasets required to reproduce these experiments are available on http://atc1.aut.uah.es/~david/gpem2014
Acknowledgments
The authors would like to thank Héctor Menéndez, Alejandro Sierra and Ricardo Aler for their reviews and valuable suggestions. This work is supported by the Project of Castilla-La Mancha PEII11-0079-8929.
Cite this article
Barrero, D.F., Castaño, B., R-Moreno, M.D. et al. A study on Koza’s performance measures. Genet Program Evolvable Mach 16, 327–349 (2015). https://doi.org/10.1007/s10710-014-9238-9
DOI: https://doi.org/10.1007/s10710-014-9238-9