Enhancing Genetic Improvement Mutations Using Large Language Models

  • Conference paper
Search-Based Software Engineering (SSBSE 2023)

Abstract

Large language models (LLMs) have been successfully applied to software engineering tasks, including program repair. However, their application in search-based techniques such as Genetic Improvement (GI) is still largely unexplored. In this paper, we evaluate the use of LLMs as mutation operators for GI to improve the search process. We expand the Gin Java GI toolkit to call OpenAI’s API to generate edits for the JCodec tool. We randomly sample the space of edits using 5 different edit types. We find that the number of patches passing unit tests is up to \(75\%\) higher with LLM-based edits than with standard Insert edits. Further, we observe that the patches found with LLMs are generally less diverse than those found with standard edits. We also run GI with local search to find runtime improvements. Although LLM-enhanced GI finds many improving patches, the best improving patch was found by standard GI.
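
The idea of treating an LLM as one more mutation operator inside a local search can be illustrated with a minimal sketch. This is not Gin's API and not the authors' implementation: Gin is a Java toolkit, Python is used here only for brevity, and every name below (`insert_edit`, `delete_edit`, `make_llm_edit`, `local_search`, `fake_llm`, and the toy test and cost functions) is hypothetical. The program is assumed to be a flat list of statements, the LLM call is replaced by a deterministic stub, and runtime is replaced by a made-up cost function.

```python
import random
from typing import Callable, List

Program = List[str]  # assumption: a program is a flat list of statements
Edit = Callable[[Program, random.Random], Program]


def insert_edit(program: Program, rng: random.Random) -> Program:
    """Insert-style edit: copy an existing statement to a random position."""
    donor = rng.choice(program)
    position = rng.randrange(len(program) + 1)
    return program[:position] + [donor] + program[position:]


def delete_edit(program: Program, rng: random.Random) -> Program:
    """Delete-style edit: remove one randomly chosen statement."""
    if len(program) <= 1:
        return list(program)
    position = rng.randrange(len(program))
    return program[:position] + program[position + 1:]


def make_llm_edit(suggest: Callable[[str], str]) -> Edit:
    """Wrap a text-to-text function (a stand-in for an LLM call) as an edit."""
    def llm_edit(program: Program, rng: random.Random) -> Program:
        position = rng.randrange(len(program))
        patched = list(program)
        patched[position] = suggest(program[position])
        return patched
    return llm_edit


def local_search(program: Program, edits: List[Edit],
                 passes_tests: Callable[[Program], bool],
                 cost: Callable[[Program], float],
                 steps: int, seed: int = 0) -> Program:
    """First-improvement local search: keep a patch only if it passes the
    tests and strictly lowers the cost."""
    rng = random.Random(seed)
    best, best_cost = list(program), cost(program)
    for _ in range(steps):
        candidate = rng.choice(edits)(best, rng)
        if not passes_tests(candidate):
            continue  # patches that break the tests are discarded
        candidate_cost = cost(candidate)
        if candidate_cost < best_cost:
            best, best_cost = candidate, candidate_cost
    return best


# Toy stand-ins for a test suite, a runtime measurement, and an LLM.
def toy_passes_tests(program: Program) -> bool:
    return bool(program) and program[-1] == "return x" and "x = 1" in program[:-1]


def toy_cost(program: Program) -> int:
    return sum(10 if statement == "sleep" else 1 for statement in program)


def fake_llm(statement: str) -> str:
    return "pass" if statement == "sleep" else statement


original = ["x = 1", "sleep", "sleep", "return x"]
edits = [insert_edit, delete_edit, make_llm_edit(fake_llm)]
improved = local_search(original, edits, toy_passes_tests, toy_cost, steps=200, seed=1)
```

Because the search only accepts test-passing patches with strictly lower cost, `improved` always passes the toy tests and never costs more than `original`. In a real setting, `suggest` would send the selected code to an LLM and return its proposed replacement, and `cost` would be a measured runtime.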

Data Availability Statement

The code, LLM prompts, experimental infrastructure, evaluation data, and results are available as open source at [1]. The code is also available on the ‘llm’ branch of github.com/gintool/gin (commit 9fe9bdf; branched from master commit 2359f57 pending full integration with Gin).

References

  1. Artifact of Enhancing Genetic Improvement Mutations Using Large Language Models. Zenodo (2023). https://doi.org/10.5281/zenodo.8304433

  2. Böhme, M., Soremekun, E.O., Chattopadhyay, S., Ugherughe, E., Zeller, A.: Where is the bug and how is it fixed? an experiment with practitioners. In: Proceedings of ACM Symposium on the Foundations of Software Engineering, pp. 117–128 (2017)

  3. Brownlee, A.E., Petke, J., Alexander, B., Barr, E.T., Wagner, M., White, D.R.: Gin: genetic improvement research made easy. In: GECCO, pp. 985–993 (2019)

  4. Brownlee, A.E., Petke, J., Rasburn, A.F.: Injecting shortcuts for faster running Java code. In: IEEE CEC 2020, pp. 1–8 (2020)

  5. Chen, M., et al.: Evaluating large language models trained on code. arXiv preprint arXiv:2107.03374 (2021)

  6. Fan, A., et al.: Large language models for software engineering: survey and open problems (2023)

  7. GitHub - jcodec/jcodec: JCodec main repo. https://github.com/jcodec/jcodec

  8. Han, S.J., Ransom, K.J., Perfors, A., Kemp, C.: Inductive reasoning in humans and large language models. Cogn. Syst. Res. 83, 101155 (2023)

  9. Hou, X., et al.: Large language models for software engineering: a systematic literature review. arXiv:2308.10620 (2023)

  10. Kang, S., Yoo, S.: Towards objective-tailored genetic improvement through large language models. arXiv:2304.09386 (2023)

  11. Kim, D., Nam, J., Song, J., Kim, S.: Automatic patch generation learned from human-written patches (2013). http://logging.apache.org/log4j/

  12. Kirbas, S., et al.: On the introduction of automatic program repair in Bloomberg. IEEE Softw. 38(4), 43–51 (2021)

  13. Marginean, A., et al.: SapFix: automated end-to-end repair at scale. In: ICSE-SEIP, pp. 269–278 (2019)

  14. Petke, J., Alexander, B., Barr, E.T., Brownlee, A.E., Wagner, M., White, D.R.: Program transformation landscapes for automated program modification using Gin. Empir. Softw. Eng. 28(4), 1–41 (2023)

  15. Petke, J., Haraldsson, S.O., Harman, M., Langdon, W.B., White, D.R., Woodward, J.R.: Genetic improvement of software: a comprehensive survey. IEEE Trans. Evol. Comput. 22, 415–432 (2018)

  16. Siddiq, M.L., Santos, J., Tanvir, R.H., Ulfat, N., Rifat, F.A., Lopes, V.C.: Exploring the effectiveness of large language models in generating unit tests. arXiv preprint arXiv:2305.00418 (2023)

  17. Sobania, D., Briesch, M., Hanna, C., Petke, J.: An analysis of the automatic bug fixing performance of ChatGPT. In: 2023 IEEE/ACM International Workshop on Automated Program Repair (APR), pp. 23–30. IEEE Computer Society (2023)

  18. Xia, C.S., Paltenghi, M., Tian, J.L., Pradel, M., Zhang, L.: Universal fuzzing via large language models. arXiv preprint arXiv:2308.04748 (2023)

  19. Xia, C.S., Zhang, L.: Keep the conversation going: fixing 162 out of 337 bugs for \$0.42 each using ChatGPT. arXiv preprint arXiv:2304.00385 (2023)

Acknowledgements

This work was supported by the UKRI EPSRC grant no. EP/P023991/1 and the ERC advanced fellowship grant no. 741278.

Author information

Corresponding author

Correspondence to Alexander E. I. Brownlee.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Brownlee, A.E.I. et al. (2024). Enhancing Genetic Improvement Mutations Using Large Language Models. In: Arcaini, P., Yue, T., Fredericks, E.M. (eds) Search-Based Software Engineering. SSBSE 2023. Lecture Notes in Computer Science, vol 14415. Springer, Cham. https://doi.org/10.1007/978-3-031-48796-5_13

  • DOI: https://doi.org/10.1007/978-3-031-48796-5_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-48795-8

  • Online ISBN: 978-3-031-48796-5

  • eBook Packages: Computer Science, Computer Science (R0)
