1. the complete title of one (or more) paper(s) published in the open literature describing the work that the author claims describes a human-competitive result;

Human Competitiveness of Genetic Programming in Spectrum Based Fault Localisation: Theoretical and Empirical Analysis

-------------------------------------------------------------------------------

2. the name, complete physical mailing address, e-mail address, and phone number of EACH author of EACH paper(s);

Shin Yoo
Korea Advanced Institute of Science and Technology
School of Computing
291 Daehak Ro, Yuseong Gu
Daejeon 34141
Republic of Korea
e-mail: shin.yoo@kaist.ac.kr
tel: +82-42-350-3567

Xiaoyuan Xie
Wuhan University
Computer School
Wuchang, Wuhan, Hubei, 430072
China
e-mail: xxiw@whu.edu.cn
tel: +86-27-6877-6027

Fei-Ching Kuo
Swinburne University of Technology
Hawthorn, VIC 3122
Australia
e-mail: dkuo@swin.edu.au
tel: +61-3-9214-4752

Tsong Yueh Chen
Swinburne University of Technology
Hawthorn, VIC 3122
Australia
e-mail: tychen@swin.edu.au
tel: +61-3-9214-4369

Mark Harman
Facebook London
10 Brock Street, London
and
University College London
Gower Street, London WC1E 6BT
United Kingdom
e-mail: mark.harman@ucl.ac.uk
tel: +44-20-7679-0325

-------------------------------------------------------------------------------

3. the name of the corresponding author (i.e., the author to whom notices will be sent concerning the competition);

Shin Yoo

-------------------------------------------------------------------------------

4. the abstract of the paper(s);

We report on the application of Genetic Programming to Software Fault Localisation, a problem in the area of Search Based Software Engineering (SBSE). We give both empirical and theoretical evidence for the human competitiveness of the evolved fault localisation formulæ under the single fault scenario, compared to those generated by human ingenuity and reported in many papers published over more than a decade. Though there have been previous human-competitive results claimed for SBSE problems, this is the first time that evolved solutions have been formally proved to be human competitive. We further prove that no future human investigation could outperform the evolved solutions. We complement these proofs with an empirical analysis of both human and evolved solutions, which indicates that the evolved solutions are not only theoretically human competitive, but also convey practical benefits similar to those of their human-designed counterparts.

-------------------------------------------------------------------------------

5. a list containing one or more of the eight letters (A, B, C, D, E, F, G, or H) that correspond to the criteria (see above) that the author claims that the work satisfies;

(B) The result is equal to or better than a result that was accepted as a new scientific result at the time when it was published in a peer-reviewed scientific journal.

(D) The result is publishable in its own right as a new scientific result — independent of the fact that the result was mechanically created.

(E) The result is equal to or better than the most recent human-created solution to a long-standing problem for which there has been a succession of increasingly better human-created solutions.

(F) The result is equal to or better than a result that was considered an achievement in its field at the time it was first discovered.

(G) The result solves a problem of indisputable difficulty in its field.

-------------------------------------------------------------------------------
6. a statement stating why the result satisfies the criteria that the contestant claims (see examples of statements of human-competitiveness as a guide to aid in constructing this part of the submission);

(B) The primary contribution of the cited paper is to prove not only that the GP approach used is human competitive, but also that it finds a maximal solution. More importantly, the paper proves that, no matter how competitive humans may become in the future, they cannot find any solution that is better than this GP result. GP has already outperformed over a decade of previous human endeavour, and provably cannot be surpassed by future human endeavour. Specifically, Spectrum Based Fault Localisation (SBFL) is an automated debugging technique that has been widely studied for over a decade. It relies on a so-called risk evaluation formula to convert observations of passing and failing test executions into an estimate of how likely each structural entity (e.g. statements or methods) is to be the cause of the failure (a small illustrative sketch of such formulae is given after criterion (F) below). Over the last decade, more than 30 such formulae have been manually designed. A few of them are empirically known to be among the "better" ones, and have been theoretically proven to be "maximal" (i.e. it is not possible to do "better" than these with respect to a specific class of faults). This work shows, with proofs, that GP evolved formulae that are either 1) equivalent to the known maximal formulae, or 2) previously unknown maximal formulae. Moreover, we prove that the greatest formula, i.e. the single formula that "betters" all other formulae with respect to any arbitrary fault, does not exist. All in all, GP has not only expanded the horizon of what is feasible, but has also reached a point upon which no human can improve.

(D) The formulae GP has produced have been published in their own right [1]. Their equivalence to the best-performing existing "maximal" formulae has also been published independently [2]. Finally, the paper on which this submission is based contains a proof that GP cannot be bettered by humans.

(E) Starting with the earliest human-designed risk evaluation formulae, such as Tarantula (proposed in 2001) [3, 4], SBFL has been widely studied [5-11], with each subsequent human-designed algorithm being published with a claim to be better than those that preceded it. It is only with the work of Xie et al. [12] that the hierarchy between these formulae was revealed, through an underpinning mathematical theory of subsumption. This mathematical analysis allows precise statements about the superiority of one technique over another. The formulae evolved by GP have been directly proven to be maximal. This makes them provably at least as good as all those achieved by humans over two decades of work, and strictly superior to most of the results produced by humans over this protracted period of human intellectual effort [2].

(F) One human-designed formula that was previously known to be maximal is Naish2, designed by Naish et al., which was an achievement in its own right [11]. The GP-evolved formula GP13 is provably mathematically equivalent to Naish2, and better than many other widely studied, human-designed formulae such as Tarantula, Ochiai, and Jaccard [2]: all of these formulae were considered independent achievements at the time of their discovery, and have been widely used by the software testing community, yet the GP result (GP13) is provably strictly better.
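To make the risk evaluation formulae discussed in (B) and (F) concrete, the following minimal sketch (illustrative only, not taken from the cited paper) shows how such formulae turn spectrum counts into suspiciousness scores. The definitions of Tarantula, Ochiai, Naish2, and the GP-evolved GP13 follow their commonly reported forms in the SBFL literature; the authoritative definitions are those in [1, 2, 11, 12], and the zero-division handling here is a simplification.

    # Illustrative sketch only: commonly reported SBFL formula definitions;
    # see [1, 2, 11, 12] for the authoritative versions.
    import math

    def tarantula(ef, ep, nf, np):
        # Tarantula [3, 4]: normalised failing coverage over total normalised coverage.
        fail_rate = ef / (ef + nf) if (ef + nf) > 0 else 0.0
        pass_rate = ep / (ep + np) if (ep + np) > 0 else 0.0
        total = fail_rate + pass_rate
        return fail_rate / total if total > 0 else 0.0

    def ochiai(ef, ep, nf, np):
        # Ochiai [5, 6]: a similarity coefficient between coverage and failure.
        denom = math.sqrt((ef + nf) * (ef + ep))
        return ef / denom if denom > 0 else 0.0

    def naish2(ef, ep, nf, np):
        # Naish2 [11]: a human-designed formula proven maximal for single faults.
        return ef - ep / (ep + np + 1)

    def gp13(ef, ep, nf, np):
        # GP13 [1]: the GP-evolved formula, proven equivalent to Naish2 [2].
        return ef * (1.0 + 1.0 / (2.0 * ep + ef)) if (2.0 * ep + ef) > 0 else 0.0

    # Toy spectrum: for each statement, (ef, ep, nf, np) = (failing tests covering it,
    # passing tests covering it, failing tests missing it, passing tests missing it).
    spectrum = {"s1": (3, 1, 0, 6), "s2": (1, 5, 2, 2), "s3": (0, 4, 3, 3)}

    for stmt, counts in spectrum.items():
        scores = {f.__name__: round(f(*counts), 3) for f in (tarantula, ochiai, naish2, gp13)}
        print(stmt, scores)

In this toy spectrum, GP13 and Naish2 produce different raw scores but the same ranking of the statements; the equivalence and maximality results in the cited papers are stated in terms of the rankings the formulae induce, not their raw score values.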
(G) The succession of publications in the SBFL area has yielded over 30 distinct manually-designed formulae, each of which has been worthy of an independent publication. This protracted "human evolution" of ever-improved formulae over a long period is a testament to both the continuing effort devoted to, and the importance of, finding "the best" formula. Our proof shows not only that GP has evolved a maximal (best possible) formula, but also the futility of any further human attempt to design a better one. This breakthrough result will thus shape the future of automated debugging research: instead of attempting to design a single technique that works well all the time, future research should treat the problem of automated debugging as a learning problem, and use approaches such as GP to adapt to each individual project and fault.

-------------------------------------------------------------------------------

7. a full citation of the paper (that is, author names; publication date; name of journal, conference, technical report, thesis, book, or book chapter; name of editors, if applicable, of the journal or edited book; publisher name; publisher city; page numbers, if applicable);

Shin Yoo, Xiaoyuan Xie, Fei-Ching Kuo, Tsong Yueh Chen, and Mark Harman. "Human Competitiveness of Genetic Programming in Spectrum Based Fault Localisation: Theoretical and Empirical Analysis". ACM Transactions on Software Engineering and Methodology, ACM, to appear.

-------------------------------------------------------------------------------

8. a statement either that "any prize money, if any, is to be divided equally among the co-authors" OR a specific percentage breakdown as to how the prize money, if any, is to be divided among the co-authors;

Any prize money, if any, is to be divided equally among the co-authors of the cited paper.

-------------------------------------------------------------------------------

9. a statement stating why the authors expect that their entry would be the "best"

Locating the lines of code responsible for a fault is a pressing yet challenging problem for practising software engineers. This fault localisation problem has been studied widely in the scientific literature for over two decades. This entry for the HUMIES not only outperforms previously published, peer-reviewed human scientific achievements in the field, it goes much further: it has been proved that no human approach is superior, nor can one ever be. That is, one of the primary contributions of this entry is a mathematical demonstration of the superiority of the genetic programming solution, together with a theorem demonstrating that no future human endeavour can outperform the GP solution. This is important because a challenging real-world engineering problem has been solved by a human-competitive evolutionary algorithm, which provably outperforms previous human results with a maximal solution that no future human intellectual effort can surpass.

Our claims are as follows:

- We provide GP results in automated debugging that are mathematically proven to be human competitive.
- Moreover, no human can surpass what GP has produced: we provide a proof that no such improvement is possible.
- Finally, our results provide a fundamental insight into how automated debugging research should progress in the future.
Our results not only show the capability of GP to produce human-competitive results, but also provide guidance to the research community regarding the future direction of fault localisation: instead of trying to design a single best technique, we should treat the problem as one of learning, and adapt to each project and fault.

-------------------------------------------------------------------------------

10. an indication of the general type of genetic or evolutionary computation used, such as GA (genetic algorithms), GP (genetic programming), ES (evolution strategies), EP (evolutionary programming), LCS (learning classifier systems), GE (grammatical evolution), GEP (gene expression programming), DE (differential evolution), etc.

Genetic Programming (GP)

-------------------------------------------------------------------------------

References:

[1] S. Yoo. Evolving human competitive spectra-based fault localisation techniques. In G. Fraser and J. Teixeira de Souza, editors, Search Based Software Engineering, volume 7515 of Lecture Notes in Computer Science, pages 244–258. Springer Berlin Heidelberg, 2012.

[2] X. Xie, F.-C. Kuo, T. Y. Chen, S. Yoo, and M. Harman. Provably optimal and human-competitive results in SBSE for spectrum based fault localisation. In G. Ruhe and Y. Zhang, editors, Search Based Software Engineering, volume 8084 of Lecture Notes in Computer Science, pages 224–238. Springer Berlin Heidelberg, 2013.

[3] J. A. Jones and M. J. Harrold. Empirical evaluation of the Tarantula automatic fault-localization technique. In Proceedings of the 20th International Conference on Automated Software Engineering (ASE 2005), pages 273–282. ACM Press, 2005.

[4] J. A. Jones, M. J. Harrold, and J. T. Stasko. Visualization for fault localization. In Proceedings of the ICSE Workshop on Software Visualization, pages 71–75, 2001.

[5] R. Abreu, P. Zoeteweij, and A. J. van Gemund. An evaluation of similarity coefficients for software fault localization. In Proceedings of the 12th Pacific Rim International Symposium on Dependable Computing (PRDC 2006), pages 39–46. IEEE, 2006.

[6] R. Abreu, P. Zoeteweij, and A. J. C. van Gemund. On the accuracy of spectrum-based fault localization. In Proceedings of the Testing: Academic and Industrial Conference Practice and Research Techniques - MUTATION, pages 89–98. IEEE Computer Society, 2007.

[7] S. Artzi, J. Dolby, F. Tip, and M. Pistoia. Directed test generation for effective fault localization. In Proceedings of the 19th International Symposium on Software Testing and Analysis (ISSTA '10), pages 49–60, New York, NY, USA, 2010. ACM.

[8] V. Dallmeier, C. Lindig, and A. Zeller. Lightweight bug localization with AMPLE. In Proceedings of the Sixth International Symposium on Automated Analysis-Driven Debugging (AADEBUG '05), pages 99–104, New York, NY, USA, 2005. ACM.

[9] M. Renieres and S. Reiss. Fault localization with nearest neighbor queries. In Proceedings of the 18th International Conference on Automated Software Engineering, pages 30–39, October 2003.

[10] W. E. Wong, Y. Qi, L. Zhao, and K.-Y. Cai. Effective fault localization using code coverage. In Proceedings of the 31st Annual International Computer Software and Applications Conference (COMPSAC '07), Volume 01, pages 449–456, Washington, DC, USA, 2007. IEEE Computer Society.

[11] L. Naish, H. J. Lee, and K. Ramamohanarao. A model for spectra-based software diagnosis. ACM Transactions on Software Engineering and Methodology, 20(3):11:1–11:32, August 2011.

[12] X. Xie, T. Y. Chen, F.-C. Kuo, and B. Xu.
A theoretical analysis of the risk evaluation formulas for spectrum-based fault localization. ACM Transactions on Software Engineering and Methodology, 22(4):31:1–31:40, October 2013.