Exploiting Knowledge from Code to Guide Program Search

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13223)

Abstract

Human-written code differs from code generated by program search. We investigate whether properties of human-written code can guide program search to improve qualities of the generated programs, e.g., readability and performance. We focus on program search with grammatical evolution, which produces code whose structure differs from that of human-written code; for example, loops and conditions are hardly used. We use a large code corpus mined from the open software repository service GitHub and measure software metrics and properties that describe this code base. We use this knowledge to guide the search through a new selection scheme that favors programs that are structurally similar to the programs in the GitHub code base. We find noticeable evidence that software metrics can help guide evolutionary search.
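To make the idea in the abstract concrete, the following minimal Python sketch breaks ties between equally correct candidate programs by their structural similarity to a small reference corpus. It is not the selection scheme from the paper: the node types that are counted, the Manhattan distance, the lexicographic tie-break, and all names (profile, select, CORPUS, and so on) are assumptions made for this example, and the two corpus functions merely stand in for code mined from GitHub.

```python
import ast

# Node types used as a crude structural profile (an assumption of this sketch).
TRACKED = (ast.For, ast.While, ast.If, ast.Call, ast.Assign)


def profile(source):
    """Count how often each tracked AST node type occurs in `source`."""
    nodes = list(ast.walk(ast.parse(source)))
    return {t.__name__: sum(isinstance(n, t) for n in nodes) for t in TRACKED}


def mean_profile(sources):
    """Average the profiles of all programs in a reference corpus."""
    profiles = [profile(s) for s in sources]
    return {k: sum(p[k] for p in profiles) / len(profiles) for k in profiles[0]}


def distance(p, q):
    """Manhattan distance between two structural profiles."""
    return sum(abs(p[k] - q[k]) for k in p)


def error(source, cases):
    """Number of test cases that the function `f` defined in `source` fails."""
    namespace = {}
    exec(source, namespace)
    return sum(namespace["f"](arg) != expected for arg, expected in cases)


def select(candidates, reference, cases):
    """Prefer low error; among equal errors, prefer corpus-like structure."""
    return min(
        candidates,
        key=lambda c: (error(c, cases), distance(profile(c), reference)),
    )


# Hypothetical stand-in for a mined corpus of human-written code.
CORPUS = [
    "def count_positive(xs):\n    n = 0\n    for x in xs:\n"
    "        if x > 0:\n            n += 1\n    return n\n",
    "def largest(xs):\n    best = xs[0]\n    for x in xs:\n"
    "        if x > best:\n            best = x\n    return best\n",
]
REFERENCE = mean_profile(CORPUS)

# Two candidates that both pass the test cases below.
FLAT = "def f(xs):\n    return xs[0] + xs[1] + xs[2]\n"
LOOP = (
    "def f(xs):\n    total = 0\n    for x in xs:\n"
    "        total = total + x\n    return total\n"
)
CASES = [([1, 2, 3], 6), ([0, 0, 0], 0)]

chosen = select([FLAT, LOOP], REFERENCE, CASES)
print(chosen)
```

Both candidates have zero error on these cases, so the choice falls to the structural distance, which favors the loop-based program because the reference corpus also uses loops. The raw counts are not normalized by program size, which is a simplification; a metric library such as radon could supply richer features in place of the hand-picked node counts.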

The authors thank Jordan Wick for sharing his expertise, the insightful discussions, and his help on our project. This work was supported by a fellowship within the IFI programme of the German Academic Exchange Service (DAAD).

Notes

  1. radon: https://pypi.org/project/radon/

  2. astdump: https://pypi.org/project/astdump/

Author information

Corresponding author

Correspondence to Dirk Schweim.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Schweim, D., Hemberg, E., Sobania, D., O’Reilly, U.M. (2022). Exploiting Knowledge from Code to Guide Program Search. In: Medvet, E., Pappa, G., Xue, B. (eds) Genetic Programming. EuroGP 2022. Lecture Notes in Computer Science, vol 13223. Springer, Cham. https://doi.org/10.1007/978-3-031-02056-8_17

  • DOI: https://doi.org/10.1007/978-3-031-02056-8_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-02055-1

  • Online ISBN: 978-3-031-02056-8

  • eBook Packages: Computer Science, Computer Science (R0)
