DOI: 10.1145/2739480.2754693
research-article

Building Predictive Models via Feature Synthesis

Published: 11 July 2015

ABSTRACT

We introduce Evolutionary Feature Synthesis (EFS), a regression method that generates readable, nonlinear models of small- to medium-sized datasets in seconds. EFS is, to the best of our knowledge, the fastest regression tool based on evolutionary computation reported to date. The feature search involved in the proposed method is composed of two main steps: feature composition and feature subset selection. EFS adopts a bottom-up feature composition strategy that eliminates the need for a symbolic representation of the features and exploits the variable selection process involved in pathwise regularized linear regression to perform the feature subset selection step. The result is a regression method that is competitive with neural networks, and outperforms both linear methods and Multiple Regression Genetic Programming, until now the best regression tool based on evolutionary computation.
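The two steps named in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the function names (`compose_products`, `lasso_cd`, `synthesize`, `demo`), the choice of pairwise products as the only composition operator, the toy data, and the use of a single fixed L1 penalty in place of a full regularization path are all assumptions made for illustration. Features are stored as columns of computed values, with a name string kept only so the final model stays readable.

```python
import math
import random


def standardize(col):
    """Center a column and scale it to unit variance (constant columns become zeros)."""
    n = len(col)
    mean = sum(col) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
    return [(v - mean) / sd if sd > 0 else 0.0 for v in col]


def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(columns, y, lam, sweeps=100):
    """Coordinate descent for (1/2n)||y - Xw||^2 + lam * ||w||_1.

    Assumes standardized columns and a centered target, so each
    coordinate update reduces to a single soft-thresholding step.
    """
    n = len(y)
    w = [0.0] * len(columns)
    resid = list(y)  # always equals y - Xw for the current w
    for _ in range(sweeps):
        for j, col in enumerate(columns):
            # Covariance of column j with the residual that excludes its own term.
            rho = sum(c * (r + w[j] * c) for c, r in zip(col, resid)) / n
            new_w = soft_threshold(rho, lam)
            if new_w != w[j]:
                delta = new_w - w[j]
                resid = [r - delta * c for r, c in zip(resid, col)]
                w[j] = new_w
    return w


def compose_products(features):
    """Bottom-up composition (assumed operator): all pairwise products, squares included."""
    names = list(features)
    new = {}
    for i, a in enumerate(names):
        for b in names[i:]:
            new[f"({a}*{b})"] = [u * v for u, v in zip(features[a], features[b])]
    return new


def synthesize(inputs, y, generations=2, lam=0.05):
    """Alternate composition and L1-based subset selection; return (name, weight) pairs."""
    y_mean = sum(y) / len(y)
    y_centered = [v - y_mean for v in y]
    features = dict(inputs)
    model = []
    for _ in range(generations):
        candidates = {**features, **compose_products(features)}
        names = list(candidates)
        weights = lasso_cd([standardize(candidates[k]) for k in names], y_centered, lam)
        # Features with a nonzero weight survive; the raw inputs are always kept.
        model = [(k, w) for k, w in zip(names, weights) if w != 0.0]
        features = {**inputs, **{k: candidates[k] for k, _ in model}}
    return model


def demo():
    """Toy problem (hypothetical data): y depends only on the product x0 * x1."""
    rng = random.Random(0)
    n = 200
    inputs = {f"x{i}": [rng.uniform(-1, 1) for _ in range(n)] for i in range(3)}
    y = [3.0 * a * b for a, b in zip(inputs["x0"], inputs["x1"])]
    return synthesize(inputs, y)
```

Calling `demo()` returns the surviving features with their weights on the standardized scale. On this noise-free toy target the L1 penalty is expected to discard every candidate except the product of `x0` and `x1`, which is the kind of variable selection the abstract relies on for the subset selection step.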

References

  1. Multiple regression genetic programming in Java. http://flexgp.github.io/gp-learners/, 2014.
  2. I. Arnaldo, K. Krawiec, and U.-M. O'Reilly. Multiple regression genetic programming. In Proceedings of the 2014 Conference on Genetic and Evolutionary Computation, GECCO '14, pages 879–886, New York, NY, USA, 2014. ACM.
  3. T. Bertin-Mahieux, D. Ellis, B. Whitman, and P. Lamere. The million song dataset. In ISMIR 2011: Proceedings of the 12th International Society for Music Information Retrieval Conference, October 24–28, Miami, Florida, pages 591–596, 2011.
  4. V. V. De Melo. Kaizen programming. In Proceedings of the 2014 Conference on Genetic and Evolutionary Computation, GECCO '14, pages 895–902, New York, NY, USA, 2014. ACM.
  5. J. H. Friedman, T. Hastie, and R. Tibshirani. Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 33(1):1–22, 2 2010.
  6. Y. Ganjisaffar. Lasso4j. https://code.google.com/p/lasso4j/, 2014.
  7. M. Garcia-Limon, H. J. Escalante, E. Morales, and A. Morales-Reyes. Simultaneous generation of prototypes and features through genetic programming. In Proceedings of the 2014 Conference on Genetic and Evolutionary Computation, GECCO '14, pages 517–524, New York, NY, USA, 2014. ACM.
  8. C. Gathercole and P. Ross. Dynamic training subset selection for supervised learning in genetic programming. In Y. Davidor, H.-P. Schwefel, and R. Manner, editors, Parallel Problem Solving from Nature, PPSN III, volume 866 of Lecture Notes in Computer Science, pages 312–321. Springer Berlin Heidelberg, 1994.
  9. I. Guyon and A. Elisseeff. An introduction to variable and feature selection. Journal of Machine Learning Research, 3:1157–1182, Mar. 2003.
  10. T. Hastie, R. Tibshirani, and J. Friedman. The elements of statistical learning: data mining, inference and prediction. Springer, 2nd edition, 2009.
  11. I. Icke and J. Bongard. Improving genetic programming based symbolic regression using deterministic machine learning. In 2013 IEEE Congress on Evolutionary Computation (CEC), pages 1763–1770, June 2013.
  12. U. Kamath, J. Lin, and K. De Jong. SAX-EFG: An evolutionary feature generation framework for time series classification. In Proceedings of the 2014 Conference on Genetic and Evolutionary Computation, GECCO '14, pages 533–540, New York, NY, USA, 2014. ACM.
  13. K. Krawiec and U.-M. O'Reilly. Behavioral programming: A broader and more detailed take on semantic GP. In Proceedings of the 2014 Conference on Genetic and Evolutionary Computation, GECCO '14, pages 935–942, New York, NY, USA, 2014. ACM.
  14. J. Langford, L. Li, and T. Zhang. Sparse online learning via truncated gradient. Journal of Machine Learning Research, 10:777–801, June 2009.
  15. Y. Lin and B. Bhanu. Evolutionary feature synthesis for object recognition. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 35(2):156–171, May 2005.
  16. MathWorks. Neural network toolbox, 2014.
  17. T. McConaghy. FFX: Fast, scalable, deterministic symbolic regression technology. In R. Riolo, E. Vladislavleva, and J. H. Moore, editors, Genetic Programming Theory and Practice IX, Genetic and Evolutionary Computation, pages 235–260. Springer New York, 2011.
  18. R. Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B, 58:267–288, 1994.
  19. A. Tsanas and A. Xifara. Accurate quantitative estimation of energy performance of residential buildings using statistical machine learning tools. Energy and Buildings, 49(0):560–567, 2012.
  20. F. Xue, R. Subbu, and P. Bonissone. Locally weighted fusion of multiple predictive models. In International Joint Conference on Neural Networks, 2006. IJCNN '06., pages 2137–2143, 2006.

    Published in

    GECCO '15: Proceedings of the 2015 Annual Conference on Genetic and Evolutionary Computation
    July 2015
    1496 pages
    ISBN: 9781450334723
    DOI: 10.1145/2739480

    Copyright © 2015 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Acceptance Rates

    GECCO '15 Paper Acceptance Rate: 182 of 505 submissions, 36%
    Overall Acceptance Rate: 1,669 of 4,410 submissions, 38%
