
Declarative and Preferential Bias in GP-based Scientific Discovery

Genetic Programming and Evolvable Machines

Abstract

This work examines two methods for evolving dimensionally correct equations from data. It is demonstrated that the use of units of measurement aids in evolving equations that are amenable to interpretation by domain specialists. One method uses a strong typing approach that implements a declarative bias towards correct equations; the other uses a coercion mechanism to implement a preferential bias towards the same objective. Four experiments using real-world, unsolved scientific problems were performed to examine the differences between the approaches and to judge the worth of the induction methods.

Not only does the coercion approach perform significantly better on two out of the four problems when compared to the strongly typed approach, but it also regularizes the expressions it induces, resulting in a more reliable search process.

A trade-off between type correctness and ability to solve the problem is identified. Due to the preferential bias implemented in the coercion approach, this trade-off does not lead to sub-optimal performance. No evidence is found that the reduction of the search space achieved through declarative bias helps in finding better solutions faster. In fact, for the class of scientific discovery problems the opposite seems to be the case.
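The difference between the two kinds of bias can be illustrated with a minimal sketch. It is not the authors' implementation: the tree encoding, the function names (`infer`, `declarative_accepts`, `preferential_penalty`), and the choice of penalty (the summed distance between unit exponents) are all assumptions made for illustration. A declarative bias rejects every dimensionally inconsistent expression outright, whereas a preferential bias admits it and attaches a cost that the search can trade off against goodness of fit.

```python
# Illustrative sketch only: encoding, names, and penalty are assumptions,
# not the method described in the paper.
from typing import Dict, Tuple, Union

Unit = Tuple[int, ...]      # exponents of (length, mass, time)
Expr = Union[str, tuple]    # a variable name, or (op, left, right)


def infer(expr: Expr, units: Dict[str, Unit]) -> Tuple[Unit, int]:
    """Return (unit of expr, coercion cost needed to make it consistent)."""
    if isinstance(expr, str):
        return units[expr], 0
    op, left, right = expr
    left_unit, left_cost = infer(left, units)
    right_unit, right_cost = infer(right, units)
    cost = left_cost + right_cost
    if op in ("add", "sub"):
        # Both operands must share a unit. If they do not, coerce the right
        # operand to the left one and charge the exponent distance.
        cost += sum(abs(a - b) for a, b in zip(left_unit, right_unit))
        return left_unit, cost
    if op == "mul":
        return tuple(a + b for a, b in zip(left_unit, right_unit)), cost
    if op == "div":
        return tuple(a - b for a, b in zip(left_unit, right_unit)), cost
    raise ValueError(f"unknown operator: {op}")


def declarative_accepts(expr: Expr, units: Dict[str, Unit], target: Unit) -> bool:
    """Hard constraint: only fully consistent expressions are in the search space."""
    unit, cost = infer(expr, units)
    return cost == 0 and unit == target


def preferential_penalty(expr: Expr, units: Dict[str, Unit], target: Unit) -> int:
    """Soft constraint: any expression is allowed, but inconsistency is penalized."""
    unit, cost = infer(expr, units)
    return cost + sum(abs(a - b) for a, b in zip(unit, target))
```

For instance, with a length `d` of unit `(1, 0, 0)`, a time `t` of unit `(0, 0, 1)`, and a velocity target `(1, 0, -1)`, the expression `("div", "d", "t")` is accepted by the declarative check and has zero penalty, while `("add", ("div", "d", "t"), "d")` is rejected by the declarative check but remains available to the preferential search at a penalty of 1.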




Cite this article

Keijzer, M., Babovic, V. Declarative and Preferential Bias in GP-based Scientific Discovery. Genetic Programming and Evolvable Machines 3, 41–79 (2002). https://doi.org/10.1023/A:1014596120381


  • DOI: https://doi.org/10.1023/A:1014596120381
