DOI: 10.1145/1830483.1830651
research-article

Knowledge mining with genetic programming methods for variable selection in flavor design

Published: 07 July 2010

ABSTRACT

This paper presents a novel approach for knowledge mining from a sparse, repeated-measures dataset. Genetic programming based symbolic regression is employed to generate multiple models that provide alternate explanations of the data. This set of models, called an ensemble, is generated separately for each of the repeated measures. The resulting ensembles are then used to (a) determine which variables are important in each ensemble, (b) cluster the ensembles into groups whose response variable is driven by similar variables, and (c) measure the sensitivity of the response with respect to the important variables. We apply our methodology to a sensory science dataset. The data contains hedonic evaluations (liking scores), assigned by a diverse set of human testers, for a small set of flavors composed from seven ingredients. Our approach (1) identifies the important ingredients that drive the liking score of a panelist, (2) segments the panelists into groups that are driven by the same ingredient, and (3) enables flavor scientists to perform sensitivity analysis of liking scores relative to changes in the levels of important ingredients.
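
The three analyses named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes each ensemble is already available as a list of (variables used, prediction function) pairs, scores a variable by the fraction of models that contain it, groups panelists by their single highest-scoring variable, and estimates sensitivity with a central finite difference averaged over the ensemble. The function names, the variable names `x1` to `x3`, and the toy models are all hypothetical.

```python
from collections import Counter, defaultdict


def variable_importance(ensemble, variables):
    """Fraction of models in the ensemble that use each variable.

    Presence frequency is an assumed importance measure for illustration.
    """
    counts = Counter(v for used, _ in ensemble for v in set(used))
    n = len(ensemble)
    return {v: counts[v] / n for v in variables}


def group_by_dominant_variable(ensembles, variables):
    """Group panelists by the variable with the highest importance."""
    groups = defaultdict(list)
    for panelist, ensemble in ensembles.items():
        importance = variable_importance(ensemble, variables)
        dominant = max(variables, key=lambda v: importance[v])
        groups[dominant].append(panelist)
    return dict(groups)


def sensitivity(ensemble, point, variable, delta=0.01):
    """Mean central finite-difference slope of the models' predictions."""
    up = dict(point, **{variable: point[variable] + delta})
    down = dict(point, **{variable: point[variable] - delta})
    slopes = [(f(up) - f(down)) / (2 * delta) for _, f in ensemble]
    return sum(slopes) / len(slopes)


# Hypothetical toy data: two panelists, two hand-written models each.
VARIABLES = ["x1", "x2", "x3"]
ensembles = {
    "panelist_A": [
        (("x1",), lambda p: 2.0 * p["x1"]),
        (("x1", "x2"), lambda p: 2.0 * p["x1"] + 0.5 * p["x2"]),
    ],
    "panelist_B": [
        (("x3",), lambda p: 3.0 * p["x3"]),
        (("x3", "x2"), lambda p: 3.0 * p["x3"] - p["x2"]),
    ],
}

groups = group_by_dominant_variable(ensembles, VARIABLES)
slope = sensitivity(
    ensembles["panelist_A"], {"x1": 1.0, "x2": 1.0, "x3": 1.0}, "x1"
)
print(groups)
print(slope)
```

In this toy setup, panelist A is grouped under `x1` and panelist B under `x3`, because those variables appear in every model of the respective ensemble. In practice the models would come from a symbolic regression run, and other importance or clustering measures could be substituted for the ones assumed here.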

    Published in

      GECCO '10: Proceedings of the 12th annual conference on Genetic and evolutionary computation
      July 2010
      1520 pages
      ISBN: 9781450300728
      DOI: 10.1145/1830483

      Copyright © 2010 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 1,669 of 4,410 submissions, 38%
