Abstract
This work examines two methods for evolving dimensionally correct equations from data. It is demonstrated that the use of units of measurement aids in evolving equations that are amenable to interpretation by domain specialists. One method uses a strong typing approach that implements a declarative bias towards correct equations; the other uses a coercion mechanism to implement a preferential bias towards the same objective. Four experiments using real-world, unsolved scientific problems were performed in order to examine the differences between the approaches and to judge the worth of the induction methods.
Not only does the coercion approach perform significantly better on two out of the four problems when compared to the strongly typed approach, but it also regularizes the expressions it induces, resulting in a more reliable search process.
A trade-off between type correctness and ability to solve the problem is identified. Due to the preferential bias implemented in the coercion approach, this trade-off does not lead to sub-optimal performance. No evidence is found that the reduction of the search space achieved through declarative bias helps in finding better solutions faster. In fact, for the class of scientific discovery problems the opposite seems to be the case.
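The difference between the two kinds of bias can be illustrated with a minimal sketch. It is not the authors' implementation: the variable names, the three-exponent unit encoding, and the penalty (the sum of absolute exponent differences at each mismatched addition or subtraction) are assumptions chosen for illustration. The strict checker rejects any dimensionally inconsistent expression outright, in the spirit of a declarative bias. The coercing checker accepts every expression but reports how much repair it needed, a value that a search could try to minimize alongside the fitting error, in the spirit of a preferential bias.

```python
from typing import Tuple, Union

# A dimension is a tuple of exponents over (length, mass, time).
Dim = Tuple[int, int, int]
# An expression is a variable name or a tuple (operator, left, right).
Expr = Union[str, tuple]

# Hypothetical input variables and their units (assumed for illustration).
UNITS = {
    "v": (1, 0, -1),  # velocity, m/s
    "d": (1, 0, 0),   # diameter, m
    "g": (1, 0, -2),  # gravitational acceleration, m/s^2
}


class DimensionError(Exception):
    """Raised by the strict checker when units do not match."""


def _mul(a: Dim, b: Dim) -> Dim:
    # Multiplying quantities adds their unit exponents.
    return tuple(x + y for x, y in zip(a, b))


def _div(a: Dim, b: Dim) -> Dim:
    # Dividing quantities subtracts their unit exponents.
    return tuple(x - y for x, y in zip(a, b))


def strict_dim(expr: Expr) -> Dim:
    """Declarative-bias sketch: inconsistent expressions are rejected."""
    if isinstance(expr, str):
        return UNITS[expr]
    op, left, right = expr
    l, r = strict_dim(left), strict_dim(right)
    if op == "*":
        return _mul(l, r)
    if op == "/":
        return _div(l, r)
    if op in ("+", "-"):
        if l != r:
            raise DimensionError(f"cannot apply {op} to {l} and {r}")
        return l
    raise ValueError(f"unknown operator {op!r}")


def coerced_dim(expr: Expr) -> Tuple[Dim, int]:
    """Preferential-bias sketch: mismatches are repaired and penalized.

    Returns the resulting dimension and the accumulated penalty.
    """
    if isinstance(expr, str):
        return UNITS[expr], 0
    op, left, right = expr
    (l, pl), (r, pr) = coerced_dim(left), coerced_dim(right)
    penalty = pl + pr
    if op == "*":
        return _mul(l, r), penalty
    if op == "/":
        return _div(l, r), penalty
    if op in ("+", "-"):
        # Coerce the right operand to the left operand's dimension and
        # charge the distance between the two exponent vectors.
        penalty += sum(abs(x - y) for x, y in zip(l, r))
        return l, penalty
    raise ValueError(f"unknown operator {op!r}")


# v*v / (g*d) is dimensionless; v + d mixes m/s with m.
consistent = ("/", ("*", "v", "v"), ("*", "g", "d"))
inconsistent = ("+", "v", "d")
```

Under these assumptions, `strict_dim(consistent)` and `coerced_dim(consistent)` agree on a dimensionless result with zero penalty. `strict_dim(inconsistent)` raises `DimensionError`, whereas `coerced_dim(inconsistent)` returns the dimension of `v` with a penalty of 1, so the expression stays in the search but is marked as less desirable.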
Keijzer, M., Babovic, V. Declarative and Preferential Bias in GP-based Scientific Discovery. Genetic Programming and Evolvable Machines 3, 41–79 (2002). https://doi.org/10.1023/A:1014596120381
DOI: https://doi.org/10.1023/A:1014596120381