
Human Competitiveness of Genetic Programming in Spectrum-Based Fault Localisation: Theoretical and Empirical Analysis

Authors:
Shin Yoo, Korea Advanced Institute of Science and Technology, Republic of Korea
Xiaoyuan Xie, Wuhan University, Hubei, China
Fei-Ching Kuo, Swinburne University of Technology, Hawthorn, VIC, Australia
Tsong Yueh Chen, Swinburne University of Technology, Hawthorn, VIC, Australia
Mark Harman, University College London, UK

ACM Transactions on Software Engineering and Methodology, Volume 26, Issue 1, Article No. 4, pp. 1–30. https://doi.org/10.1145/3078840

Published: 28 June 2017


Abstract

We report on the application of Genetic Programming to Software Fault Localisation, a problem in the area of Search-Based Software Engineering (SBSE). We give both empirical and theoretical evidence for the human competitiveness of the evolved fault localisation formulæ under the single fault scenario, compared to those generated by human ingenuity and reported in many papers, published over more than a decade. Though there have been previous human competitive results claimed for SBSE problems, this is the first time that evolved solutions have been formally proved to be human competitive. We further prove that no future human investigation could outperform the evolved solutions. We complement these proofs with an empirical analysis of both human and evolved solutions, which indicates that the evolved solutions are not only theoretically human competitive, but also convey similar practical benefits to human-evolved counterparts.
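To make the notion of a fault localisation formula concrete, the following is a minimal sketch, not the authors' implementation. It assumes the usual spectrum-based setting: each program element is summarised by four counts (failing and passing tests that execute it, failing and passing tests that do not), and a formula maps those counts to a suspiciousness score. Ochiai and Jaccard are shown as two well-known human-designed formulæ; the names `spectrum`, `rank`, `COVERAGE`, and `FAILED`, and the toy data, are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy data: COVERAGE[t][s] is True if test t executed element s.
COVERAGE = [
    [True, True, False],
    [True, False, True],
    [True, True, True],
    [True, False, False],
]
# FAILED[t] is True if test t failed.
FAILED = [True, False, True, False]


def spectrum(coverage, failed):
    """Return one (ef, ep, nf, np) tuple per program element."""
    counts = []
    for s in range(len(coverage[0])):
        ef = sum(1 for t, row in enumerate(coverage) if row[s] and failed[t])
        ep = sum(1 for t, row in enumerate(coverage) if row[s] and not failed[t])
        nf = sum(1 for t, row in enumerate(coverage) if not row[s] and failed[t])
        np_ = sum(1 for t, row in enumerate(coverage) if not row[s] and not failed[t])
        counts.append((ef, ep, nf, np_))
    return counts


def ochiai(ef, ep, nf, np_):
    """Ochiai: ef / sqrt((ef + nf) * (ef + ep)); 0 when undefined."""
    denom = math.sqrt((ef + nf) * (ef + ep))
    return ef / denom if denom else 0.0


def jaccard(ef, ep, nf, np_):
    """Jaccard: ef / (ef + nf + ep); 0 when undefined."""
    denom = ef + nf + ep
    return ef / denom if denom else 0.0


def rank(counts, formula):
    """Return element indices ordered from most to least suspicious."""
    scores = [formula(*c) for c in counts]
    return sorted(range(len(scores)), key=lambda s: -scores[s])


if __name__ == "__main__":
    counts = spectrum(COVERAGE, FAILED)
    print(rank(counts, ochiai))
    print(rank(counts, jaccard))
```

In this toy example element 1 is executed by every failing test and by no passing test, so both formulæ rank it first. A formula evolved by Genetic Programming would take the place of `ochiai` or `jaccard` as the `formula` argument.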


Index Terms

1. Software and its engineering
  1. Software creation and management
    1. Search-based software engineering
    2. Software verification and validation
      1. Software defect analysis
        Software testing and debugging



Published in
ACM Transactions on Software Engineering and Methodology, Volume 26, Issue 1
January 2017
176 pages
ISSN: 1049-331X
EISSN: 1557-7392
DOI: 10.1145/3092955
Editor: David S. Rosenblum, National University of Singapore, Singapore
Copyright © 2017 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 28 June 2017
- Accepted: 1 March 2017
- Revised: 1 January 2017
- Received: 1 December 2015

Author Tags
Spectrum-based fault localisation
genetic programming
search-based software engineering
Qualifiers
- research-article
- Research
- Refereed