DOI: 10.1145/2598394.2598480
poster

Efficient interleaved sampling of training data in genetic programming

Published: 12 July 2014

ABSTRACT

The ability to generalize beyond the training set is important for Genetic Programming (GP). Interleaved Sampling is a recently proposed approach to improving generalization in GP. In this technique, GP alternates between using the entire data set and using only a single data point. Initial results showed that the technique not only produces solutions that generalize well, but also does so at reduced computational expense, since half of the generations evaluate only a single data point.
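The alternation described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say which generations use which mode or how the single data point is chosen, so the even/odd schedule, the random choice of point, the mean absolute error measure, and the names `training_cases`, `error`, and `cases_evaluated` are all hypothetical.

```python
import random


def training_cases(generation, data, rng):
    """Return the training cases used to score candidates in this generation.

    Assumption: even generations use the full data set, odd generations
    use one data point drawn at random.
    """
    if generation % 2 == 0:
        return data
    return [rng.choice(data)]


def error(candidate, cases):
    """Mean absolute error of a candidate (any callable) over the given cases."""
    return sum(abs(candidate(x) - y) for x, y in cases) / len(cases)


def cases_evaluated(generations, n_points, interleaved):
    """Count data-point evaluations per candidate over a whole run."""
    if not interleaved:
        return generations * n_points
    full = (generations + 1) // 2      # even-numbered generations: 0, 2, 4, ...
    single = generations // 2          # odd-numbered generations: 1, 3, 5, ...
    return full * n_points + single
```

With 100 training points and 50 generations, `cases_evaluated(50, 100, False)` gives 5000 evaluations per candidate, while `cases_evaluated(50, 100, True)` gives 2525, which is roughly the halving the abstract refers to.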

This paper further investigates the merit of interleaving the use of the training set with two alternative approaches: the use of random search instead of a single data point, and simply minimising the tree size. Both alternatives are computationally even cheaper than the original setup, as they do not invoke the fitness function at all half the time. We test the utility of these new methods on four well-cited, high-dimensional problems from the symbolic regression domain.
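A minimal sketch of how the two alternatives might replace the fitness call in alternate generations is given below. Several things here are assumptions rather than details from the paper: the even/odd schedule, the reading of "random search" as assigning each candidate a random score, the nested-tuple tree representation, and the names `tree_size` and `selection_score`.

```python
import random


def tree_size(tree):
    """Count nodes in an expression tree written as nested tuples.

    Example representation (hypothetical): ('+', 'x', ('*', 'x', 'x')).
    """
    if not isinstance(tree, tuple):
        return 1                      # a leaf: variable or constant
    return 1 + sum(tree_size(child) for child in tree[1:])


def selection_score(tree, generation, fitness, mode, rng):
    """Return the score to minimise for one candidate.

    Assumption: even generations call the real fitness function on the
    training set; odd generations skip it and use the cheap criterion.
    """
    if generation % 2 == 0:
        return fitness(tree)
    if mode == "random":
        return rng.random()           # selection ignores the training data
    if mode == "size":
        return tree_size(tree)        # smaller trees are preferred
    raise ValueError(f"unknown mode: {mode!r}")
```

In this sketch the `fitness` callable is never invoked in odd generations, which is where the saving in fitness evaluations comes from in both variants.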

The results show that the new approaches continue to produce general solutions despite using only half the fitness evaluations. Size minimisation also prevents bloat while producing competitive results on both training and test data sets. The tree sizes with size minimisation are substantially smaller than those of the other setups, which further reduces training costs.

References

  1. I. Goncalves and S. Silva. Balancing learning and overfitting in genetic programming with interleaved sampling of training data. In K. Krawiec, A. Moraglio, T. Hu, A. S. Uyar, and B. Hu, editors, Proceedings of the 16th European Conference on Genetic Programming, EuroGP 2013, volume 7831 of LNCS, pages 73-84, Vienna, Austria, 3-5 Apr. 2013. Springer Verlag.

            Published in

            GECCO Comp '14: Proceedings of the Companion Publication of the 2014 Annual Conference on Genetic and Evolutionary Computation
            July 2014
            1524 pages
            ISBN: 9781450328814
            DOI: 10.1145/2598394

            Copyright © 2014 Owner/Author

            Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Acceptance Rates

            GECCO Comp '14 paper acceptance rate: 180 of 544 submissions, 33%
            Overall acceptance rate: 1,669 of 4,410 submissions, 38%
