Letters to the editor

Adding More Color to Patch Picture

I read "Automated Program Repair" with interest (Dec. 2019, p. 56–65). This is exciting technology that, if successful, holds out the promise of substantially improving software quality. While the article highlights systems developed by the first and third authors (GenProg, SemFix, Angelix), it omits quantitative data that can provide a more complete picture of the capabilities of extant program repair systems. My hope is this quantitative data can help researchers and practitioners better understand the capabilities and current limitations of this promising technology.

The most complete evaluation of the GenProg system was reported in Le Goues et al.,1,2 which examines results for a superset of the defects originally considered in Le Goues et al.3 Unfortunately, as reported in Qi et al.7 and communicated to the authors of Le Goues et al.3 in fall of 2014, the experimental setup contains a variety of test harness and test script issues. When these issues are corrected, the results show that GenProg does not fix 55 of 105 bugs, as one might reasonably expect from reading the title of the article. Instead, GenProg fixes only two bugs, highlighting the remarkable ineffectiveness of GenProg as an automatic patch generation system. Moreover, only 69 of the reported 105 bugs are bugs; the remaining 36 are deliberate functionality changes.

I note this ineffectiveness may not be widely recognized—despite being informed of these results in fall of 2014, and despite the publication of Qi,7 at press time, websites maintained by the authors of GenProg still do not reflect the corrections required to accurately represent the capabilities of the GenProg system (for example, see https://squareslab.github.io/genprog-code/).

For comparison, the Prophet system,6 the current state of the art on this benchmark set, generates correct patches for 18 of the 69 defects. But for another 21 defects, Prophet generates incorrect patches that nevertheless validate. This situation requires developers to manually filter the validated patches, with developer evaluation effort and false positives an important concern.

These quantitative results can provide insight into why current commercial automatic patch generation systems such as those discussed in the article focus on specific defect classes such as null dereference defects. Focusing on these classes enables the development of more narrowly tailored techniques that can aspire to fix a larger proportion of the defects with fewer false positives.4,5

In the near term, I think we can expect patch generation systems that focus on specific defect classes to play an increasingly prominent role in maintaining large software systems. Because of the substantial redundancy present in and across most large software systems, as well as the availability of multiple sources of information such as revision histories present in software repositories, I would expect efforts directed at broader classes of defects to pay off in the future. Of course, accurate reporting of relevant results can play an important role in helping the field progress.

Martin Rinard, Cambridge, MA, USA

CS + CS

I read "When Human-Computer Interaction Meets Community Citizen Science" (Feb. 2020, p. 31–34) with interest given my own, multidisciplinary exploration of similar territory. The authors do a nice job of describing the increasingly wide range of citizen science activities. Not only do many leading the expansion of citizen science refer to it as CS, a challenge for those of us who use that term for computer science, but that recent expansion has been occasioned by the launch and growth of online platforms, laying a foundation for the intersection of the two kinds of CS, as is implicit in the article.

I led a small team at RAND that has published two small reports on community citizen science. The Promise of Community Citizen Science9 came out in 2017; Community Citizen Science: From Promise to Action8 came out in 2019. So, while we would like to think we were the ones to introduce the concept, we applaud the work of Yen-Chia Hsu and Illah Nourbakhsh and hope that we can find a way to collaborate.

Marjory S. Blumenthal, Washington, D.C., USA

Editor-in-Chief response

It's great to see excitement and energy in this important area!

Andrew A. Chien, Chicago, IL, USA

References

    1. Le Goues, C. et al. The ManyBugs and IntroClass benchmarks for automated repair of C programs. IEEE Trans. Software Engineering 41, 12 (Dec. 2015), 1236–1256.

    2. Le Goues, C., Brun, Y., Forrest, S. and Weimer, W. Clarifications on the construction and use of the ManyBugs benchmark. IEEE Trans. Software Engineering 43, 11 (Nov. 2017), 1089–1090.

    3. Le Goues, C., Dewey-Vogt, M., Forrest, S. and Weimer, W. A systematic study of automated program repair: Fixing 55 out of 105 bugs for $8 each. In Proceedings of the 34th Intern. Conf. Software Engineering (Zurich, Switzerland, June 2–9, 2012), 3–13.

    4. Long, F. Automatic patch generation via learning from successful human patches. Ph.D. thesis, MIT, Cambridge, USA, 2018.

    5. Long, F., Amidon, P. and Rinard, M. Automatic inference of code transforms for patch generation. In Proceedings of the 11th Joint Meeting on Foundations of Software Engineering (Paderborn, Germany, Sept. 4–8, 2017), 727–739.

    6. Long, F. and Rinard, M. Automatic patch generation by learning correct code. In Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symp. Principles of Programming Languages (St. Petersburg, FL, USA, Jan. 20–22, 2016), 298–312.

    7. Qi, Z., Long, F., Achour, S. and Rinard, M.C. An analysis of patch plausibility and correctness for generate-and-validate patch generation systems. In Proceedings of the Intern. Symp. Software Testing and Analysis (Baltimore, MD, USA, July 12–17, 2015), 24–36.

    8. Chari, R. et al. Community Citizen Science: From Promise to Action (2019); https://www.rand.org/pubs/research_reports/RR2763.html

    9. Chari, R. et al. The Promise of Community Citizen Science (2017); https://www.rand.org/pubs/perspectives/PE256.html
