research-article

S3: syntax- and semantic-guided repair synthesis via programming by examples

Authors:
Xuan-Bach D. Le, Singapore Management University, Singapore
Duc-Hiep Chu, IST Austria, Austria
David Lo, Singapore Management University, Singapore
Claire Le Goues, Carnegie Mellon University, USA
Willem Visser, Stellenbosch University, South Africa

ESEC/FSE 2017: Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering, August 2017, Pages 593–604
https://doi.org/10.1145/3106237.3106309

Published: 21 August 2017

ABSTRACT

A notable class of techniques for automatic program repair is known as semantics-based. Such techniques, e.g., Angelix, infer semantic specifications via symbolic execution, and then use program synthesis to construct new code that satisfies those inferred specifications. However, the obtained specifications are naturally incomplete, leaving the synthesis engine with a difficult task of synthesizing a general solution from a sparse space of many possible solutions that are consistent with the provided specifications but that do not necessarily generalize. We present S3, a new repair synthesis engine that leverages programming-by-examples methodology to synthesize high-quality bug repairs. The novelty in S3 that allows it to tackle the sparse search space to create more general repairs is three-fold: (1) A systematic way to customize and constrain the syntactic search space via a domain-specific language, (2) An efficient enumeration-based search strategy over the constrained search space, and (3) A number of ranking features based on measures of the syntactic and semantic distances between candidate solutions and the original buggy program. We compare S3’s repair effectiveness with state-of-the-art synthesis engines Angelix, Enumerative, and CVC4. S3 can successfully and correctly fix at least three times more bugs than the best baseline on datasets of 52 bugs in small programs, and 100 bugs in real-world large programs.
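
The three ingredients named in the abstract (a DSL that constrains the search space, enumerative search, and ranking by distance to the buggy program) can be illustrated with a toy sketch. This is not S3's implementation: the mini-DSL, the function names (`evaluate`, `terms`, `candidates`, `synthesize`), and the token-mismatch distance are all assumptions made for illustration, and only a syntactic distance is shown; the semantic ranking features are omitted.

```python
import operator
from itertools import product

# Hypothetical mini-DSL (an assumption for illustration, not S3's grammar):
#   term := ("var", name) | ("const", n) | (arith_op, term, term)
#   cond := (cmp_op, term, term)
ARITH = {"+": operator.add, "-": operator.sub}
CMP = {"<": operator.lt, "<=": operator.le, ">": operator.gt,
       ">=": operator.ge, "==": operator.eq, "!=": operator.ne}


def evaluate(expr, env):
    """Evaluate a DSL expression under a variable assignment."""
    head = expr[0]
    if head == "var":
        return env[expr[1]]
    if head == "const":
        return expr[1]
    fn = ARITH.get(head) or CMP[head]
    return fn(evaluate(expr[1], env), evaluate(expr[2], env))


def terms(variables, constants, depth):
    """Enumerate integer terms, smallest first, with up to `depth` operators."""
    atoms = [("var", v) for v in variables] + [("const", c) for c in constants]
    level = atoms
    yield from level
    for _ in range(depth):
        level = [(op, l, r) for op in ARITH for l, r in product(level, atoms)]
        yield from level


def candidates(variables, constants, depth=1):
    """Enumerate every condition the DSL allows within the depth bound."""
    ts = list(terms(variables, constants, depth))
    for op, left, right in product(CMP, ts, ts):
        yield (op, left, right)


def tokens(expr):
    """Flatten an expression into a preorder token list."""
    if expr[0] in ("var", "const"):
        return [expr[1]]
    return [expr[0]] + tokens(expr[1]) + tokens(expr[2])


def syntactic_distance(a, b):
    """Crude distance: position-wise token mismatches plus length difference."""
    ta, tb = tokens(a), tokens(b)
    return sum(x != y for x, y in zip(ta, tb)) + abs(len(ta) - len(tb))


def synthesize(buggy, examples, variables, constants, depth=1):
    """Return all example-consistent conditions, closest to `buggy` first."""
    consistent = [c for c in candidates(variables, constants, depth)
                  if all(evaluate(c, env) == out for env, out in examples)]
    return sorted(consistent,
                  key=lambda c: (syntactic_distance(c, buggy), len(tokens(c))))


# Buggy condition `x < y`; the examples require the result to hold when x == y.
buggy = ("<", ("var", "x"), ("var", "y"))
examples = [({"x": 1, "y": 2}, True),
            ({"x": 2, "y": 2}, True),
            ({"x": 3, "y": 2}, False)]
ranked = synthesize(buggy, examples, ["x", "y"], [0, 1])
print(ranked[0])
```

Many candidates agree with the three examples (for instance `y >= x` or `x < y + 1`), which is the sparse-specification problem the abstract describes. Ranking by closeness to the original expression is one plausible way to prefer the minimal change `x <= y` over the others.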


Index Terms

Software and its engineering


Published in
ESEC/FSE 2017: Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering
August 2017
1073 pages
ISBN: 9781450351058
DOI: 10.1145/3106237
General Chairs:
Eric Bodden, Paderborn University, Germany / Fraunhofer IEM, Germany
Wilhelm Schäfer, Paderborn University, Germany
Program Chairs:
Arie van Deursen, Delft University of Technology, Netherlands
Andrea Zisman, Open University, UK
Copyright © 2017 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Inductive Synthesis
Program Repair
Programming by Examples
Symbolic Execution
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 112 of 543 submissions, 21%