Skip to main content

Advertisement

Log in

Improved representation and genetic operators for linear genetic programming for automated program repair

  • Published:
Empirical Software Engineering Aims and scope Submit manuscript

Abstract

Genetic improvement for program repair can fix bugs or otherwise improve software via patch evolution. Consider GenProg, a prototypical technique of this type. GenProg’s crossover and mutation operators manipulate individuals represented as patches. A patch is composed of high-granularity edits that indivisibly comprise an edit operation, a faulty location, and a fix statement used in replacement or insertions. We observe that recombination and mutation of such high-level units limits the technique’s ability to effectively traverse and recombine the repair search spaces. We propose a reformulation of program repair representation, crossover, and mutation operators such that they explicitly traverse the three subspaces that underlie the search problem: the Operator, Fault and Fix Spaces. We provide experimental evidence validating our insight, showing that the operators provide considerable improvement over the baseline repair algorithm in terms of search success rate and efficiency. We also conduct a genotypic distance analysis over the various types of search, providing insight as to the influence of the operators on the program repair search problem.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9

Similar content being viewed by others

Notes

  1. Note that GenProg manipulates C programs at the statement-level, but the formulation generalizes to arbitrary languages and granularity levels.

  2. Both available from the GenProg project, http://genprog.cs.virginia.edu/

  3. IntroClass and ManyBugs are available at http://repairbenchmarks.cs.umass.edu/

References

  • Ackling T, Alexander B, Grunert I (2011) Evolving patches for software repair. In: Genetic and Evolutionary Computation, pp 1427–1434

  • Arcuri A (2011) Evolutionary repair of faulty software. Appl Soft Comput 11 (4):3494–3514

    Article  Google Scholar 

  • Arcuri A, Briand L (2011) A practical guide for using statistical tests to assess randomized algorithms in software engineering. In: International Conference on Software Engineering, ACM, New York, NY, USA, ICSE ’11, pp 1–10

  • Arcuri A, Yao X (2008) A novel co-evolutionary approach to automatic software bug fixing. In: 2008 IEEE Congress on Evolutionary Computation. CEC 2008. (IEEE World Congress on Computational Intelligence), IEEE, pp 162–168

  • Barr ET, Brun Y, Devanbu P, Harman M, Sarro F (2014) The plastic surgery hypothesis. In: Proceedings of the 22nd ACM SIGSOFT Symposium on the Foundations of Software Engineering (FSE), pp 306–317

  • Barr ET, Harman M, Jia Y, Marginean A, Petke J (2015) Automated software transplantation. In: International Symposium on Software Testing and Analysis (ISSTA), pp 257–269

  • Brameier MF, Banzhaf W (2007) Linear genetic programming, 1st edn. Springer, Berlin

    MATH  Google Scholar 

  • Britton T, Jeng L, Carver G, Cheak P, Katzenellenbogen T (2013) Reversible debugging software. Tech rep., University of Cambridge, Judge Business School

  • Bruce BR, Petke J, Harman M (2015) Reducing energy consumption using genetic improvement. In: Annual Conference on Genetic and Evolutionary Computation (GECCO), pp 1327–1334

  • Burlacu B, Affenzeller M, Winkler S, Kommenda M, Kronberger G (2015) Methods for genealogy and building block analysis in genetic programming. In: Computational Intelligence and Efficiency in Engineering Systems. Springer, pp 61–74

  • Debroy V, Wong WE (2010) Using mutation to automatically suggest fixes for faulty programs. In: International Conference on Software Testing, Verification, and Validation, pp 65–74

  • DeJong K (1975) An analysis of the behavior of a class of genetic adaptive systems. Ph D Thesis, University of Michigan

  • Fast E, Le Goues C, Forrest S, Weimer W (2010) Designing better fitness functions for automated program repair. In: Pelikan M, Branke J (eds) Genetic and Evolutionary Computation Conference (GECCO). ACM, pp 965–972

  • Forrest S (1993) Genetic algorithms: principles of natural selection applied to computation. Science 261:872–878

    Article  Google Scholar 

  • Forrest S, Nguyen T, Weimer W, Le Goues C (2009) A genetic programming approach to automated software repair. In: Genetic and evolutionary computation conference (GECCO), pp 947–954

  • Freitas E, Camilo CG Jr, Vincenzi AMR (2016) SCOUT: a multi-objective method to select components in designing unit testing. In: IEEE 27th International Symposium on Software Reliability Engineering (ISSRE), pp 36–46

  • Goldberg DE (1989) Genetic algorithms in search, optimization and machine learning, 1st edn. Addison-Wesley Longman Publishing Co., Inc, Reading

    MATH  Google Scholar 

  • Gupta D, Ghafir S (2012) An overview of methods maintaining diversity in genetic algorithms. Int J Emerg Technol Adv Eng 2(5):56–60

    Google Scholar 

  • Harik GR, Lobo FG, Goldberg DE (1999) The compact genetic algorithm. IEEE Trans Evol Comput 3(4):287–297

    Article  Google Scholar 

  • Harman M, Mansouri SA, Zhang Y (2012) Search-based software engineering: trends, techniques and applications. ACM Comput Surv 45(1):11:1–11:61. https://doi.org/10.1145/2379776.2379787

    Article  Google Scholar 

  • Holland JH (1992) Adaptation in natural and artificial systems: an introductory analysis with applications to biology, control and artificial intelligence. MIT Press, Cambridge

    Google Scholar 

  • Jones JA, Harrold MJ, Stasko J (2002) Visualization of test information to assist fault localization. In: International conference on software engineering (ICSE), Orlando, FL, USA. https://doi.org/10.1145/581339.581397, pp 467–477

  • Ke Y, Stolee KT, Le Goues C, Brun Y (2015) Repairing programs with semantic code search. In: 30th IEEE/ACM International Conference on Automated Software Engineering (ASE), pp 295–306

  • Kim D, Nam J, Song J, Kim S (2013) Automatic patch generation learned from human-written patches. In: ACM/IEEE International Conference on Software Engineering (ICSE), San Francisco, CA, USA, pp 802–811

  • Kim YH, Moon BR (2004) Distance measures in genetic algorithms. In: Genetic and evolutionary computation conference (GECCO). Springer, pp 400–401

  • Koza JR (1992) Genetic programming: on the programming of computers by means of natural selection. MIT Press, Cambridge

    MATH  Google Scholar 

  • Koza JR (1994) Genetic programming II: automatic discovery of reusable programs. MIT Press, Cambridge

    MATH  Google Scholar 

  • Langdon WB, Harman M (2015) Grow and graft a better CUDA pknotsRG for RNA pseudoknot free energy calculation. In: Genetic and Evolutionary Computation Conference, GECCO Companion ’15, pp 805–810

  • Le XBD, Lo D, Le Goues C (2016) History driven program repair. In: IEEE 23Rd International Conference on Software Analysis, Evolution, and Reengineering (SANER), vol 1. IEEE, pp 213–224

  • Le Goues C, Dewey-Vogt M, Forrest S, Weimer W (2012a) A systematic study of automated program repair: Fixing 55 out of 105 bugs for $8 each. In: International Conference on Software Engineering (ICSE), pp 3–13

  • Le Goues C, Nguyen T, Forrest S, Weimer W (2012b) Genprog: a generic method for automatic software repair. IEEE Transactions on Software Engineering (TSE) 38:54–72. https://doi.org/10.1109/TSE.2011.104

  • Le Goues C, Weimer W, Forrest S (2012) Representations and operators for improving evolutionary software repair. In: Genetic and evolutionary computation conference (GECCO), pp 959–966

  • Le Goues C, Forrest S, Weimer W (2013) Current challenges in automatic software repair. Softw Qual J 21(3):421–443. https://doi.org/10.1007/s11219-013-9208-0

    Article  Google Scholar 

  • Le Goues C, Holtschulte N, Smith EK, Brun Y, Devanbu P, Forrest S, Weimer W (2015) The ManyBugs and IntroClass benchmarks for automated repair of C programs. IEEE Transactions on Software Engineering (TSE)

  • Liblit B, Naik M, Zheng AX, Aiken A, Jordan MI (2005) Scalable statistical bug isolation. SIGPLAN Not 40(6):15–26. https://doi.org/10.1145/1064978.1065014

    Article  Google Scholar 

  • Long F, Rinard M (2015) Staged program repair with condition synthesis. In: Joint Meeting of the European Software Engineering Conference and ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE), ACM, New York, NY, USA, ESEC/FSE 2015, pp 166–178

  • Long F, Rinard M (2016) Automatic patch generation by learning correct code. In: Principles of Programming Languages, POPL ’16, pp 298–312

  • Louis SJ, Rawlins GJE (1992) Syntactic analysis of convergence in genetic algorithms. In: Foundations of Genetic Algorithms 2, Morgan Kaufmann, pp 141–151

  • Luke S, Spector L (1997) A comparison of crossover and mutation in genetic programming. Genet Program 97:240–248

    Google Scholar 

  • Machado BN, Camilo CG Jr, Rodrigues CL, Quijano EHD (2016) Sbstframe: a framework to search-based software testing. In: 2016 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp 004,106–004,111

  • Martinez M, Monperrus M (2015) Mining software repair models for reasoning on the search space of automated program fixing. Empir Softw Eng 20(1):176–205

    Article  Google Scholar 

  • Mattfeld DC (2013) Evolutionary search and the job shop: Investigations on genetic algorithms for production scheduling. Springer, Berlin

    MATH  Google Scholar 

  • Mechtaev S, Yi J, Roychoudhury A (2016) Angelix: Scalable multiline program patch synthesis via symbolic analysis. In: International Conference on Software Engineering, ICSE ’16, pp 691–701

  • Moncao ACBL, Camilo CG, Queiroz LT, Rodrigues CL, de Sa Leitao P, Vincenzi AMR (2013) Shrinking a database to perform SQL mutation tests using an evolutionary algorithm. In: IEEE Congress on Evolutionary Computation (CEC), pp 2533–2539

  • Morrison RW, De Jong KA (2001) Measurement of population diversity. In: International Conference on Artificial Evolution (Evolution Artificielle). Springer, pp 31–41

  • Nguyen HDT, Qi D, Roychoudhury A, Chandra S (2013) SemFix: program repair via semantic analysis. In: International Conference on Software Engineering (ICSE), pp 772–781

  • Oliveira AAL, Camilo CG Jr, Vincenzi AMR (2013) A coevolutionary algorithm to automatic test case selection and mutant in mutation testing. In: IEEE Congress on Evolutionary Computation (CEC), pp 829–836

  • Oliveira VPL, Souza EF, Le Goues C, Camilo CG Jr (2016) Improved crossover operators for genetic programming for program repair. In: International Symposium on Search Based Software Engineering (SSBSE). Springer, pp 112–127

  • Orlov M, Sipper M (2011) Flight of the FINCH through the Java wilderness. IEEE Trans Evol Comput 15(2):166–182

    Article  Google Scholar 

  • Petke J, Harman M, Langdon WB, Weimer W (2014) Using genetic improvement and code transplants to specialise a C++ program to a problem class. In: Genetic Programming, pp 137–149

  • Pressman RS (2001) Software engineering: a practitioner’s approach, 5th edn. McGraw-Hill Higher Education, Burr Ridge

    MATH  Google Scholar 

  • Qi Y, Mao X, Lei Y, Dai Z, Wang C (2014) The strength of random search on automated program repair. In: International Conference on Software Engineering (ICSE), pp 254–265

  • Qi Z, Long F, Achour S, Rinard M (2015) An analysis of patch plausibility and correctness for generate-and-validate patch generation systems. In: International Symposium on Software Testing and Analysis (ISSTA), pp 24–36

  • Rawlins GJE (1991) Foundations of genetic algorithms. Morgan Kaufmann, San Francisco

    Google Scholar 

  • Rothlauf F (2011) Design of modern heuristics: principles and application. Springer, Berlin

    Book  MATH  Google Scholar 

  • Saha D, Nanda MG, Dhoolia P, Nandivada VK, Sinha V, Chandra S (2011) Fault localization for data-centric programs. In: Joint Meeting of the European Software Engineering Conference and ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE), pp 157–167

  • Schulte E, Forrest S, Weimer W (2010) Automated program repair through the evolution of assembly code. In: Automated software engineering (ASE), pp 313–316

  • Schulte E, Dorn J, Harding S, Forrest S, Weimer W (2014) Post-compiler software optimization for reducing energy. SIGARCH Comput Archit News 42(1):639–652

    Google Scholar 

  • Silva S, Esparcia-Alcázar AI (eds.) (2015) Genetic and evolutionary computation conference companion material proceedings, Workshop on Genetic Improvement, ACM

  • Smith EK, Barr E, Le Goues C, Brun Y (2015) Is the cure worse than the disease? Overfitting in automated program repair. In: Joint Meeting of the European Software Engineering Conference and ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE), pp 532–543

  • Weimer W, Nguyen T, Le Goues C, Forrest S (2009) Automatically finding patches using genetic programming. In: International Conference on Software Engineering (ICSE), pp 364–374

  • Weimer W, Fry ZP, Forrest S (2013) Leveraging program equivalence for adaptive program repair: models and first results. In: Automated Software Engineering (ASE), pp 356–366

  • Wong WE, Gao R, Li Y, Abreu R, Wotawa F (2016) A survey on software fault localization. IEEE Transactions on Software Engineering (TSE) 42(8):707–740

    Article  Google Scholar 

  • Zeller A (1999) Yesterday, my program worked. Today, it does not. Why?. In: Joint Meeting of the European Software Engineering Conference and ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE), pp 253–267

Download references

Acknowledgements

Acknowledgements to be added to support a camera-ready.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Vinicius Paulo L. Oliveira.

Additional information

Communicated by: Martin Monperrus and Westley Weimer

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Oliveira, V.P.L., Souza, E.F.d., Goues, C.L. et al. Improved representation and genetic operators for linear genetic programming for automated program repair. Empir Software Eng 23, 2980–3006 (2018). https://doi.org/10.1007/s10664-017-9562-9

Download citation

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10664-017-9562-9

Keywords

Navigation