Explanatory Analysis of the Metabolome Using Genetic Programming of Simple, Interpretable Rules

Genetic Programming and Evolvable Machines

Abstract

Genetic programming, in conjunction with advanced analytical instruments, is a novel tool for the investigation of complex biological systems at the whole-tissue level. In this study, samples from tomato fruit grown hydroponically under both high- and low-salt conditions were analysed using Fourier-transform infrared spectroscopy (FTIR), with the aim of identifying spectral and biochemical features linked to salinity in the growth environment. FTIR spectra of whole-tissue extracts are not amenable to direct visual analysis, so numerical modelling methods were used to generate models capable of classifying the samples based on their spectral characteristics. Genetic programming (GP) provided models with better prediction accuracy than the conventional data modelling methods used, whilst being much easier to interpret in terms of the variables used. Examination of the GP-derived models showed that a small number of spectral regions were used consistently. In particular, the spectral region containing absorbances potentially due to a cyanide/nitrile functional group was identified as discriminatory. The explanatory power of the GP models enabled a chemical interpretation of the biochemical differences to be proposed. The combination of FTIR and GP is therefore a powerful and novel analytical tool that, in this study, improves our understanding of the biochemistry of salt tolerance in tomato plants.


Cite this article

Johnson, H.E., Gilbert, R.J., Winson, M.K. et al. Explanatory Analysis of the Metabolome Using Genetic Programming of Simple, Interpretable Rules. Genetic Programming and Evolvable Machines 1, 243–258 (2000). https://doi.org/10.1023/A:1010014314078

  • DOI: https://doi.org/10.1023/A:1010014314078