ABSTRACT
This paper presents a novel approach for knowledge mining from a sparse, repeated-measures dataset. Genetic-programming-based symbolic regression is employed to generate multiple models that provide alternate explanations of the data. Such a set of models, called an ensemble, is generated separately for each of the repeated measures. These multiple ensembles are then used to (a) identify which variables are important in each ensemble, (b) cluster the ensembles into groups whose response variable is driven by similar variables, and (c) measure the sensitivity of the response with respect to the important variables. We apply our methodology to a sensory science dataset. The data contain hedonic evaluations (liking scores) assigned by a diverse set of human testers to a small set of flavors composed from seven ingredients. Our approach (1) identifies the important ingredients that drive a panelist's liking score, (2) segments the panelists into groups driven by the same ingredient, and (3) enables flavor scientists to perform sensitivity analysis of liking scores relative to changes in the levels of important ingredients.
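The ensemble-based variable-importance and segmentation steps described above can be sketched as follows. This is a minimal illustration, not the paper's actual method: the presence-fraction importance metric, the variable names (`x1`..`x3`), and the per-panelist ensembles are all hypothetical assumptions introduced here, and each model is reduced to just the set of variables it references.

```python
from collections import Counter

def presence_importance(models):
    """Score each variable by the fraction of ensemble models that use it.

    `models` is a list of models, each represented (for this sketch) as the
    list of variable names the evolved expression references.
    """
    counts = Counter(v for m in models for v in m)
    n = len(models)
    return {v: c / n for v, c in counts.items()}

# Hypothetical ensemble for one panelist: four alternate symbolic models.
ensemble = [["x1", "x3"], ["x1"], ["x1", "x2", "x3"], ["x2"]]

imp = presence_importance(ensemble)   # e.g. x1 appears in 3 of 4 models
driver = max(imp, key=imp.get)        # the panelist's dominant ingredient

# Segment panelists by their dominant ingredient (illustrative data).
panelists = {
    "panelist_A": ensemble,
    "panelist_B": [["x2"], ["x2", "x3"], ["x2"]],
}
segments = {}
for pid, models in panelists.items():
    imp_p = presence_importance(models)
    segments.setdefault(max(imp_p, key=imp_p.get), []).append(pid)
```

Under these assumptions, `segments` groups panelists whose liking scores are driven by the same ingredient; a real pipeline would weight importance by model quality on the Pareto front rather than by raw presence.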
Index Terms
- Knowledge mining with genetic programming methods for variable selection in flavor design