ABSTRACT
Semantics-based techniques form a notable class of automatic program repair. Such techniques, e.g., Angelix, infer semantic specifications via symbolic execution and then use program synthesis to construct new code that satisfies those specifications. However, the inferred specifications are inherently incomplete. The synthesis engine must therefore find a general solution in a sparse search space: many candidates are consistent with the provided specifications, but few of them generalize. We present S3, a new repair synthesis engine that leverages the programming-by-examples methodology to synthesize high-quality bug repairs. The novelty that allows S3 to navigate this sparse search space and produce more general repairs is threefold: (1) a systematic way to customize and constrain the syntactic search space via a domain-specific language, (2) an efficient enumeration-based search strategy over the constrained search space, and (3) a number of ranking features based on measures of the syntactic and semantic distances between candidate solutions and the original buggy program. We compare S3’s repair effectiveness with that of the state-of-the-art synthesis engines Angelix, Enumerative, and CVC4. S3 successfully and correctly fixes at least three times more bugs than the best baseline on datasets of 52 bugs in small programs and 100 bugs in real-world large programs.
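The three ingredients named in the abstract can be illustrated with a small sketch. This is not S3's implementation: the toy DSL, the tuple encoding of expressions, the two distance measures, and all function names (`enumerate_candidates`, `consistent`, `syntactic_distance`, `semantic_distance`, `synthesize`) are assumptions made purely for illustration. The sketch enumerates comparison expressions from a restricted grammar, keeps those consistent with input-output examples, and ranks the survivors by their distance to the original buggy expression.

```python
from itertools import product

# Hypothetical toy DSL (an assumption for illustration, not S3's grammar):
# expressions have the form `lhs op rhs` over two variables and two constants.
VARIABLES = ["x", "y"]
CONSTANTS = [0, 1]
OPERATORS = {
    "<": lambda a, b: a < b,
    "<=": lambda a, b: a <= b,
    ">": lambda a, b: a > b,
    ">=": lambda a, b: a >= b,
    "==": lambda a, b: a == b,
    "!=": lambda a, b: a != b,
}


def enumerate_candidates():
    """Enumerate every `lhs op rhs` expression allowed by the toy DSL."""
    operands = VARIABLES + CONSTANTS
    for lhs, op, rhs in product(operands, OPERATORS, operands):
        if lhs != rhs:  # skip trivial comparisons such as `x < x`
            yield (lhs, op, rhs)


def evaluate(expr, env):
    """Evaluate an expression under a variable assignment `env`."""
    lhs, op, rhs = expr
    value = lambda term: env[term] if isinstance(term, str) else term
    return OPERATORS[op](value(lhs), value(rhs))


def consistent(expr, examples):
    """Check that the expression reproduces every input-output example."""
    return all(evaluate(expr, env) == expected for env, expected in examples)


def syntactic_distance(expr, original):
    """Count the positions (lhs, op, rhs) where the two expressions differ."""
    return sum(a != b for a, b in zip(expr, original))


def semantic_distance(expr, original, probes):
    """Count the probe inputs on which the two expressions disagree."""
    return sum(evaluate(expr, env) != evaluate(original, env) for env in probes)


def synthesize(original, examples, probes):
    """Return all consistent candidates, closest to the original first."""
    candidates = [e for e in enumerate_candidates() if consistent(e, examples)]
    return sorted(
        candidates,
        key=lambda e: (
            syntactic_distance(e, original),
            semantic_distance(e, original, probes),
        ),
    )


# Buggy condition `x < y`; the examples require the behavior of `x <= y`.
original = ("x", "<", "y")
examples = [
    ({"x": 1, "y": 2}, True),
    ({"x": 2, "y": 2}, True),
    ({"x": 3, "y": 2}, False),
]
probes = [env for env, _ in examples]
ranked = synthesize(original, examples, probes)
print(ranked[0])  # ('x', '<=', 'y')
```

In this toy setting two candidates satisfy the examples, `x <= y` and `y >= x`. The ranking prefers `x <= y` because it changes only the operator of the original expression, which mirrors the intuition that a repair close to the buggy program is more likely to generalize.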
ESEC/FSE’17, September 4–8, 2017, Paderborn, Germany. Xuan-Bach D. Le, Duc-Hiep Chu, David Lo, Claire Le Goues, and Willem Visser.
Index Terms
- S3: syntax- and semantic-guided repair synthesis via programming by examples