Impact of Test Suite Coverage on Overfitting in Genetic Improvement of Software

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 12420)

Abstract

Genetic Improvement (GI) uses automated search to improve existing software. It can be used to improve runtime, reduce energy consumption, fix bugs, or improve any other software property, provided that the property can be encoded in a fitness function. GI usually relies on testing to check whether the changes disrupt the intended functionality of the software, which makes test suites important artefacts for the overall success of GI. The objective of this work is to establish which characteristics of test suites correlate with the effectiveness of GI. We hypothesise that different test suite properties may have different levels of correlation with the ratio between overfitting and non-overfitting patches generated by the GI algorithm. To test our hypothesis, we perform a set of experiments with test suites generated automatically by EvoSuite under four popular coverage criteria. We used these test suites as input to a GI process and collected the patches generated throughout that process. We find that while test suite coverage has an impact on the ability of GI to produce correct patches, with branch coverage leading to the least overfitting, the overfitting rate was still significant. We also compared automatically generated tests with manual, developer-written ones and found that, although the manual tests had lower coverage, GI runs with manual tests led to less overfitting than runs with automatically generated tests. Finally, we did not observe enough statistically significant correlations between the coverage metrics and the overfitting ratios of patches, i.e., the coverage of test suites cannot be used as a linear predictor for the level of overfitting of the generated patches.
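The correlation analysis described in the abstract can be pictured with a small sketch. It rests on several assumptions that are not taken from the paper: that a patch counts as overfitting when it passes the tests used during the search but fails held-out tests, that the overfitting ratio is the share of overfitting patches among all collected patches, and that a rank correlation (Spearman's rho here) is the measure of association. The function names and all numbers are hypothetical.

```python
from math import sqrt


def overfitting_ratio(overfitting, non_overfitting):
    """Share of collected patches that overfit (assumed definition)."""
    total = overfitting + non_overfitting
    return overfitting / total if total else 0.0


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Pearson correlation computed on the ranks of xs and ys."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        return float("nan")  # undefined when one variable is constant
    return cov / (sx * sy)


# Made-up numbers for illustration only; these are not results from the paper.
branch_coverage = [0.55, 0.70, 0.80, 0.95]
patch_counts = [(8, 2), (6, 4), (7, 3), (3, 7)]  # (overfitting, non-overfitting)
ratios = [overfitting_ratio(o, n) for o, n in patch_counts]
rho = spearman_rho(branch_coverage, ratios)
print(f"overfitting ratios: {ratios}")
print(f"Spearman rho: {rho:.2f}")
```

With only a handful of test suites, a coefficient like this says little on its own; a significance test would be needed before reading it as evidence for or against a relationship.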

Notes

  1. We use the word ‘potentially’ here because, although the patch might improve upon our training and test set, this does not mean the runtime improvement will generalise to all possible usages of the software. A manual check is thus necessary.

  2. See gin.util.TestCaseGenerator at https://github.com/justynapt/ssbse2020RENE.

  3. Following advice given here: https://github.com/EvoSuite/evosuite/issues/48.

Acknowledgements

This work was funded by the EPSRC grant EP/P023991/1 and the ERC grant 741278 Evolving Program Improvement Collaborators (EPIC). The authors would also like to thank Prof. Gordon Fraser from the University of Passau for consultation on the output diversity metric.

Corresponding author

Correspondence to Justyna Petke.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Lim, M., Guizzo, G., Petke, J. (2020). Impact of Test Suite Coverage on Overfitting in Genetic Improvement of Software. In: Aleti, A., Panichella, A. (eds) Search-Based Software Engineering. SSBSE 2020. Lecture Notes in Computer Science, vol 12420. Springer, Cham. https://doi.org/10.1007/978-3-030-59762-7_14

  • DOI: https://doi.org/10.1007/978-3-030-59762-7_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-59761-0

  • Online ISBN: 978-3-030-59762-7

  • eBook Packages: Computer Science, Computer Science (R0)
