
Reinforced Genetic Programming


Abstract

This paper introduces the Reinforced Genetic Programming (RGP) system, which enhances standard tree-based genetic programming (GP) with reinforcement learning (RL). RGP adds a new element to the GP function set: monitored action-selection points that provide hooks to a reinforcement-learning system. Using strong typing, RGP can restrict these choice points to leaf nodes, thereby turning GP trees into classify-and-act procedures. Environmental reinforcements channeled back through the choice points then provide the basis for both lifetime learning and general GP fitness assessment. This paves the way for evolutionary acceleration via both Baldwinian and Lamarckian mechanisms. In addition, the hybrid hints at potential improvements to RL, since evolution can be exploited to design proper abstraction spaces via the problem-state classifications of the internal tree nodes. This paper details the basic mechanisms of RGP and demonstrates its application to a series of static and dynamic maze-search problems.
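The classify-and-act scheme sketched in the abstract lends itself to a compact illustration. The following Python is a minimal sketch, assuming a toy maze state, epsilon-greedy action selection, and a one-step value update at each leaf; the class names (ConditionNode, ChoicePoint), the state representation, and all parameters are illustrative assumptions, not the paper's actual implementation. Internal nodes classify the problem state, leaf choice points select actions, and the environmental reward is channeled back to the leaf that acted.

import random

ACTIONS = ["north", "south", "east", "west"]

class ChoicePoint:
    # Leaf node: a monitored action-selection point backed by local Q-values.
    def __init__(self, epsilon=0.1, alpha=0.5):
        self.q = {a: 0.0 for a in ACTIONS}
        self.epsilon = epsilon
        self.alpha = alpha

    def act(self, state):
        # Epsilon-greedy selection over this leaf's Q-table; the leaf is
        # returned alongside the action so reward can be channeled back to it.
        if random.random() < self.epsilon:
            action = random.choice(ACTIONS)
        else:
            action = max(self.q, key=self.q.get)
        return action, self

    def reinforce(self, action, reward):
        # Lifetime learning: a one-step value update at the choice point.
        self.q[action] += self.alpha * (reward - self.q[action])

class ConditionNode:
    # Internal node: classifies the problem state and routes to a subtree.
    def __init__(self, predicate, if_true, if_false):
        self.predicate = predicate
        self.if_true = if_true
        self.if_false = if_false

    def act(self, state):
        branch = self.if_true if self.predicate(state) else self.if_false
        return branch.act(state)

# A two-leaf classify-and-act tree for a toy maze state.
tree = ConditionNode(
    predicate=lambda s: s["wall_ahead"],
    if_true=ChoicePoint(),
    if_false=ChoicePoint(),
)

state = {"wall_ahead": True}
action, leaf = tree.act(state)                 # classify state, pick action
reward = 1.0 if action == "east" else -0.1     # stand-in environmental reward
leaf.reinforce(action, reward)                 # reinforcement flows back

In a Baldwinian setting, the values learned at the choice points during an individual's lifetime would affect only its fitness, whereas a Lamarckian variant would also write the learned values back into the inherited genotype.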





About this article

Cite this article

Downing, K.L. Reinforced Genetic Programming. Genetic Programming and Evolvable Machines 2, 259–288 (2001). https://doi.org/10.1023/A:1011953410319

