Abstract
We report on the application of Genetic Programming to Software Fault Localisation, a problem in the area of Search-Based Software Engineering (SBSE). We give both empirical and theoretical evidence for the human competitiveness of the evolved fault localisation formulæ under the single fault scenario, compared to those generated by human ingenuity and reported in many papers, published over more than a decade. Though there have been previous human competitive results claimed for SBSE problems, this is the first time that evolved solutions have been formally proved to be human competitive. We further prove that no future human investigation could outperform the evolved solutions. We complement these proofs with an empirical analysis of both human and evolved solutions, which indicates that the evolved solutions are not only theoretically human competitive, but also convey similar practical benefits to human-evolved counterparts.
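The abstract refers to fault localisation formulæ without showing one. As a rough illustration only, the sketch below assumes the conventional spectrum counts for a statement (`ef`/`ep`: failing/passing tests that execute it; `nf`/`np_`: failing/passing tests that do not) and scores statements with the Ochiai coefficient, a human-designed formula from the literature rather than one of the evolved formulæ. The function names and the spectra values are hypothetical.

```python
import math


def ochiai(ef, ep, nf, np_):
    """Ochiai suspiciousness: ef / sqrt((ef + nf) * (ef + ep)).

    np_ is unused by Ochiai but kept so every formula shares one signature.
    """
    denom = math.sqrt((ef + nf) * (ef + ep))
    # Guard against division by zero when no test covers the statement
    # or no test fails.
    return ef / denom if denom > 0 else 0.0


def rank_statements(spectra, formula):
    """Return statement ids ordered from most to least suspicious."""
    scores = {stmt: formula(*counts) for stmt, counts in spectra.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical spectra: statement -> (ef, ep, nf, np),
# for a test suite with 2 failing and 4 passing tests.
spectra = {
    "s1": (1, 4, 1, 0),
    "s2": (2, 1, 0, 3),
    "s3": (0, 3, 2, 1),
}

ranking = rank_statements(spectra, ochiai)
print(ranking)
```

A formula evolved by Genetic Programming would take the place of `ochiai` here: any function of the same four counts can be passed to `rank_statements`, which is what makes human-designed and evolved formulæ directly comparable.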
Index Terms
- Human Competitiveness of Genetic Programming in Spectrum-Based Fault Localisation: Theoretical and Empirical Analysis