
Knowledge mining sensory evaluation data: genetic programming, statistical techniques, and swarm optimization

Published in: Genetic Programming and Evolvable Machines

Abstract

Knowledge mining sensory evaluation data is challenging due to the extreme sparsity of the data and the large variation in responses across panel members (called assessors). The main goals of knowledge mining in sensory science are to understand how the perceived liking score depends on the concentration levels of a flavor's ingredients, to identify the ingredients that drive liking, to segment the panel into groups with similar liking preferences, and to optimize flavors to maximize liking per group. Our approach employs (1) genetic programming (symbolic regression) and ensemble methods to generate multiple diverse explanations of assessor liking preferences with confidence information; (2) statistical techniques to extrapolate with the produced ensembles into unobserved regions of the flavor space, and to segment the assessors into groups that either have the same propensity to like flavors or are driven by the same ingredients; and (3) two-objective swarm optimization to identify flavors that are well and consistently liked by a selected segment of assessors.
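The idea behind steps (1) and (3) can be illustrated with a small, hypothetical sketch: an ensemble of diverse models stands in for the GP symbolic-regression ensemble, and a candidate flavor is scored on two objectives — mean predicted liking and ensemble disagreement, where low disagreement marks a more trustworthy (consistently liked) prediction. All function names, weights, and data below are illustrative assumptions, not taken from the paper.

```python
import random
from statistics import mean, stdev

def make_ensemble(base_weights, n_models=10, noise=0.1, seed=0):
    """Hypothetical stand-in for a GP symbolic-regression ensemble:
    each 'model' is a linear predictor with perturbed weights,
    mimicking multiple diverse explanations of one assessor's liking."""
    rng = random.Random(seed)
    return [[w + rng.gauss(0.0, noise) for w in base_weights]
            for _ in range(n_models)]

def predict(weights, flavor):
    # flavor: concentration levels of the ingredients
    return sum(w * x for w, x in zip(weights, flavor))

def score_flavor(ensemble, flavor):
    """Two objectives for the swarm optimizer: maximize mean predicted
    liking, minimize ensemble disagreement (std of predictions)."""
    preds = [predict(w, flavor) for w in ensemble]
    return mean(preds), stdev(preds)

ensemble = make_ensemble([0.8, -0.2, 0.5])
liking, disagreement = score_flavor(ensemble, [1.0, 0.3, 0.7])
```

In the paper's setting, the ensemble members would be GP-evolved symbolic expressions rather than perturbed linear models, and a multi-objective particle swarm would search the flavor space for candidates that trade off these two objectives.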



Notes

  1. The greater-than-normal number of samples was enabled by a proprietary method of delivering the flavor to the assessor that delays sensory fatigue.

  2. The choice of the θ threshold is highly influential in the subsequent conclusions.



Author information


Corresponding author

Correspondence to Kalyan Veeramachaneni.


About this article

Cite this article

Veeramachaneni, K., Vladislavleva, E. & O’Reilly, UM. Knowledge mining sensory evaluation data: genetic programming, statistical techniques, and swarm optimization. Genet Program Evolvable Mach 13, 103–133 (2012). https://doi.org/10.1007/s10710-011-9153-2
