Human Competitiveness of Genetic Programming in Spectrum-Based Fault Localisation: Theoretical and Empirical Analysis

Published: 28 June 2017

Abstract

We report on the application of Genetic Programming to Software Fault Localisation, a problem in the area of Search-Based Software Engineering (SBSE). We give both empirical and theoretical evidence for the human competitiveness of the evolved fault localisation formulæ under the single fault scenario, compared to those generated by human ingenuity and reported in many papers, published over more than a decade. Though there have been previous human competitive results claimed for SBSE problems, this is the first time that evolved solutions have been formally proved to be human competitive. We further prove that no future human investigation could outperform the evolved solutions. We complement these proofs with an empirical analysis of both human and evolved solutions, which indicates that the evolved solutions are not only theoretically human competitive, but also convey similar practical benefits to human-evolved counterparts.

      • Published in

        ACM Transactions on Software Engineering and Methodology, Volume 26, Issue 1
        January 2017
        176 pages
        ISSN: 1049-331X
        EISSN: 1557-7392
        DOI: 10.1145/3092955

        Copyright © 2017 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 28 June 2017
        • Accepted: 1 March 2017
        • Revised: 1 January 2017
        • Received: 1 December 2015

        Qualifiers

        • research-article
        • Research
        • Refereed
