Enhancing Genetic Improvement Mutations Using Large Language Models

Brownlee, Alexander E. I.; Callan, James; Even-Mendoza, Karine; Geiger, Alina; Hanna, Carol; Petke, Justyna; Sarro, Federica; Sobania, Dominik

doi:10.1007/978-3-031-48796-5_13

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14415)

Included in the following conference series:

International Symposium on Search Based Software Engineering

Abstract

Large language models (LLMs) have been successfully applied to software engineering tasks, including program repair. However, their application in search-based techniques such as Genetic Improvement (GI) is still largely unexplored. In this paper, we evaluate the use of LLMs as mutation operators for GI to improve the search process. We extend the Gin Java GI toolkit to call OpenAI’s API to generate edits for the JCodec tool. We randomly sample the space of edits using five different edit types. We find that the number of patches passing unit tests is up to 75% higher with LLM-based edits than with standard Insert edits. Further, we observe that the patches found with LLMs are generally less diverse than those found with standard edits. We also run GI with local search to find runtime improvements. Although LLM-enhanced GI finds many improving patches, the best improving patch was found by standard GI.
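
The idea of mixing standard edits and LLM-proposed edits inside a local search can be illustrated with a small toy model. The sketch below is not Gin's API and makes no call to OpenAI: the `Stmt` record, the `stub_suggest` function standing in for a model response, and the toy test and runtime functions are all assumptions made for illustration. It only shows the accept-if-tests-pass-and-runtime-improves loop described in the abstract.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Stmt:
    """Toy statement: a name, a runtime cost, and whether the tests need it."""
    name: str
    cost: int
    required: bool


Program = Tuple[Stmt, ...]
Edit = Callable[[Program], Program]


def delete_edit(index: int) -> Edit:
    """Standard GI-style edit: delete the statement at `index` if it exists."""
    def apply(p: Program) -> Program:
        return p[:index] + p[index + 1:] if index < len(p) else p
    return apply


def stub_suggest(s: Stmt) -> Stmt:
    # Hypothetical stand-in for an LLM response: same statement, half the cost.
    return Stmt(s.name, max(1, s.cost // 2), s.required)


def llm_replace_edit(index: int, suggest: Callable[[Stmt], Stmt]) -> Edit:
    """LLM-style edit: replace the statement at `index` with a suggested variant."""
    def apply(p: Program) -> Program:
        if index >= len(p):
            return p
        return p[:index] + (suggest(p[index]),) + p[index + 1:]
    return apply


def passes_tests(original: Program, variant: Program) -> bool:
    # Toy test suite: every required statement name must still be present.
    needed = {s.name for s in original if s.required}
    return needed <= {s.name for s in variant}


def runtime(p: Program) -> int:
    # Toy fitness: total cost of the remaining statements.
    return sum(s.cost for s in p)


def apply_patch(p: Program, patch: List[Edit]) -> Program:
    # A patch is a sequence of edits applied in order to the original program.
    for edit in patch:
        p = edit(p)
    return p


def local_search(original: Program, steps: int, rng: random.Random):
    """Append one random edit per step; keep it only if the variant still
    passes the tests and is strictly faster than the best found so far."""
    best_patch: List[Edit] = []
    best_time = runtime(original)
    for _ in range(steps):
        index = rng.randrange(len(original))
        if rng.random() < 0.5:
            edit = delete_edit(index)
        else:
            edit = llm_replace_edit(index, stub_suggest)
        candidate = best_patch + [edit]
        variant = apply_patch(original, candidate)
        if passes_tests(original, variant) and runtime(variant) < best_time:
            best_patch, best_time = candidate, runtime(variant)
    return best_patch, best_time


PROGRAM: Program = (
    Stmt("decode", 8, True),
    Stmt("log", 3, False),
    Stmt("scale", 6, True),
)

if __name__ == "__main__":
    patch, best = local_search(PROGRAM, 200, random.Random(0))
    print(f"original runtime: {runtime(PROGRAM)}, best found: {best}, edits: {len(patch)}")
```

In a real setting the stub would be replaced by a prompt to a model and the toy checks by compiling the patched source and running its unit tests; the structure of the search loop is the part this sketch is meant to convey.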

Data Availability Statement

The code, LLM prompts, experimental infrastructure, evaluation data, and results are available as open source at [1]. The code is also available on the ‘llm’ branch of github.com/gintool/gin (commit 9fe9bdf; branched from master commit 2359f57 pending full integration with Gin).

References

1. Artifact of Enhancing Genetic Improvement Mutations Using Large Language Models. Zenodo (2023). https://doi.org/10.5281/zenodo.8304433
2. Böhme, M., Soremekun, E.O., Chattopadhyay, S., Ugherughe, E., Zeller, A.: Where is the bug and how is it fixed? an experiment with practitioners. In: Proceedings of ACM Symposium on the Foundations of Software Engineering, pp. 117–128 (2017)
3. Brownlee, A.E., Petke, J., Alexander, B., Barr, E.T., Wagner, M., White, D.R.: Gin: genetic improvement research made easy. In: GECCO, pp. 985–993 (2019)
4. Brownlee, A.E., Petke, J., Rasburn, A.F.: Injecting shortcuts for faster running Java code. In: IEEE CEC 2020, pp. 1–8 (2020)
5. Chen, M., et al.: Evaluating large language models trained on code. arXiv preprint arXiv:2107.03374 (2021)
6. Fan, A., et al.: Large language models for software engineering: survey and open problems (2023)
7. Github - jcodec/jcodec: Jcodec main repo. https://github.com/jcodec/jcodec
8. Han, S.J., Ransom, K.J., Perfors, A., Kemp, C.: Inductive reasoning in humans and large language models. Cogn. Syst. Res. 83, 101155 (2023)
9. Hou, X., et al.: Large language models for software engineering: a systematic literature review. arXiv:2308.10620 (2023)
10. Kang, S., Yoo, S.: Towards objective-tailored genetic improvement through large language models. arXiv:2304.09386 (2023)
11. Kim, D., Nam, J., Song, J., Kim, S.: Automatic patch generation learned from human-written patches (2013). http://logging.apache.org/log4j/
12. Kirbas, S., et al.: On the introduction of automatic program repair in bloomberg. IEEE Softw. 38(4), 43–51 (2021)
13. Marginean, A., et al.: Sapfix: automated end-to-end repair at scale. In: ICSE-SEIP, pp. 269–278 (2019)
14. Petke, J., Alexander, B., Barr, E.T., Brownlee, A.E., Wagner, M., White, D.R.: Program transformation landscapes for automated program modification using Gin. Empir. Softw. Eng. 28(4), 1–41 (2023)
15. Petke, J., Haraldsson, S.O., Harman, M., Langdon, W.B., White, D.R., Woodward, J.R.: Genetic improvement of software: a comprehensive survey. IEEE Trans. Evol. Comput. 22, 415–432 (2018)
16. Siddiq, M.L., Santos, J., Tanvir, R.H., Ulfat, N., Rifat, F.A., Lopes, V.C.: Exploring the effectiveness of large language models in generating unit tests. arXiv preprint arXiv:2305.00418 (2023)
17. Sobania, D., Briesch, M., Hanna, C., Petke, J.: An analysis of the automatic bug fixing performance of chatGPT. In: 2023 IEEE/ACM International Workshop on Automated Program Repair (APR), pp. 23–30. IEEE Computer Society (2023)
18. Xia, C.S., Paltenghi, M., Tian, J.L., Pradel, M., Zhang, L.: Universal fuzzing via large language models. arXiv preprint arXiv:2308.04748 (2023)
19. Xia, C.S., Zhang, L.: Keep the conversation going: fixing 162 out of 337 bugs for $0.42 each using chatgpt. arXiv preprint arXiv:2304.00385 (2023)

Acknowledgements

This work was supported by the UKRI EPSRC grant no. EP/P023991/1 and the ERC advanced fellowship grant no. 741278.

Author information

Authors and Affiliations

University of Stirling, Stirling, UK
Alexander E. I. Brownlee
University College London, London, UK
James Callan, Carol Hanna, Justyna Petke & Federica Sarro
King’s College London, London, UK
Karine Even-Mendoza
Johannes Gutenberg University Mainz, Mainz, Germany
Alina Geiger & Dominik Sobania

Corresponding author

Correspondence to Alexander E. I. Brownlee.

Editor information

Editors and Affiliations

National Institute of Informatics, Tokyo, Japan
Paolo Arcaini
Beihang University, Beijing, China
Tao Yue
Grand Valley State University, Allendale, MI, USA
Erik M. Fredericks

About this paper

Cite this paper

Brownlee, A.E.I. et al. (2024). Enhancing Genetic Improvement Mutations Using Large Language Models. In: Arcaini, P., Yue, T., Fredericks, E.M. (eds) Search-Based Software Engineering. SSBSE 2023. Lecture Notes in Computer Science, vol 14415. Springer, Cham. https://doi.org/10.1007/978-3-031-48796-5_13

DOI: https://doi.org/10.1007/978-3-031-48796-5_13
Published: 04 December 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-48795-8
Online ISBN: 978-3-031-48796-5
eBook Packages: Computer Science, Computer Science (R0)
