Abstract
In recent years, several approaches have been proposed to introduce semantic information into genetic programming. In particular, geometric semantic genetic programming (GSGP) and the interesting properties of its evolutionary operators have attracted the attention of the community. This paper focuses on the use of GSGP to solve symbolic regression problems, where the semantics of an individual is defined as the set of outputs it generates when applied to the training cases. In this scenario, both the mutation and crossover operators defined for a fitness function based on the Manhattan distance use randomly built functions to generate offspring. However, the outputs of these random functions are not guaranteed to be uniformly distributed in the semantic space, because the functions are generated in the syntactic space. We hypothesize that the non-uniformity of the semantics of these functions may bias the search, and we propose three standard normalization techniques to improve the distribution of their outputs over the semantic space. The results are compared with a popular strategy that uses a logistic function as a wrapper for the outputs, and show that the strategies tested can improve on the results of that method. The experimental analysis also indicates that a more uniform distribution of the semantics of these functions does not necessarily imply better results in terms of test error.
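The idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not name the three normalization techniques, so min-max scaling is used here only as an assumed example of a standard one. The helper names (`logistic`, `min_max`, `manhattan_crossover`) and all numeric values are hypothetical. The crossover form r*p1 + (1-r)*p2, with r taken from a random function whose outputs lie in [0, 1], is the usual GSGP formulation for the Manhattan distance.

```python
import math

def logistic(values):
    """Squash each output into [0, 1] with a numerically stable logistic."""
    out = []
    for v in values:
        if v >= 0:
            out.append(1.0 / (1.0 + math.exp(-v)))
        else:
            e = math.exp(v)
            out.append(e / (1.0 + e))
    return out

def min_max(values):
    """Rescale outputs linearly so they span [0, 1] on the training cases."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant function: nothing to spread, pick the midpoint
        return [0.5] * len(values)
    return [(v - lo) / (hi - lo) for v in values]

def manhattan_crossover(p1, p2, r):
    """Semantics of the offspring r*p1 + (1-r)*p2, one training case at a time."""
    return [ri * a + (1.0 - ri) * b for ri, a, b in zip(r, p1, p2)]

# Hypothetical semantics (outputs on four training cases).
parent1 = [1.0, 4.0, -2.0, 0.5]
parent2 = [3.0, 0.0, 2.0, 0.5]
raw_random = [0.1, 0.3, 0.2, 0.4]  # random function with bunched outputs

r_logistic = logistic(raw_random)  # stays in a narrow band above 0.5
r_min_max = min_max(raw_random)    # stretched over the whole [0, 1] range
child = manhattan_crossover(parent1, parent2, r_min_max)
```

In this toy case the logistic wrapper keeps the coefficients in a narrow band, while the min-max version spreads them over [0, 1]. Either way, each coordinate of `child` stays between the corresponding parent outputs. As the abstract notes, a wider spread does not by itself guarantee lower test error.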
Notes
1. The hyperrectangle in a semantic space under the Euclidean metric is equivalent to a segment in a semantic space under the Manhattan distance.
2. The use of AQ instead of division (or protected division) makes the approach proposed by Dick [7] redundant. For this reason, we do not present the method in our experimental analysis.
3. Five training partitions of the 5-fold cross-validation, 5 samples generated for Keijzer-5 and Vladislavleva-1 and a single set generated for Keijzer-6 and Keijzer-7.
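As a small illustration of the second note, the sketch below contrasts the analytic quotient (AQ) with protected division. The AQ formula a / sqrt(1 + b^2) follows the definition by Ni et al. cited in the references. The protected-division variant shown, which returns 1 for a zero denominator, is an assumption about which variant is meant, and the function names are hypothetical.

```python
import math

def analytic_quotient(a, b):
    """AQ(a, b) = a / sqrt(1 + b^2): smooth and defined for every b."""
    return a / math.sqrt(1.0 + b * b)

def protected_division(a, b):
    """Assumed Koza-style variant: return 1.0 when the denominator is zero."""
    return 1.0 if b == 0 else a / b

# Near b = 0 protected division can blow up, while AQ stays bounded by |a|.
near_zero = 1e-9
aq_value = analytic_quotient(1.0, near_zero)   # close to 1.0
pd_value = protected_division(1.0, near_zero)  # about 1e9
```

Because the magnitude of AQ never exceeds that of its numerator, it avoids the extreme outputs that a denominator close to zero would otherwise produce.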
References
Vanneschi, L., Castelli, M., Silva, S.: A survey of semantic methods in genetic programming. Genet. Program. Evolvable Mach. 15(2), 195–214 (2014)
Oliveira, L.: Improving search in geometric semantic genetic programming. Ph.D. thesis, Universidade Federal de Minas Gerais, September 2016
Moraglio, A., Krawiec, K., Johnson, C.G.: Geometric semantic genetic programming. In: Coello, C.A.C., Cutello, V., Deb, K., Forrest, S., Nicosia, G., Pavone, M. (eds.) PPSN 2012. LNCS, vol. 7491, pp. 21–31. Springer, Heidelberg (2012). doi:10.1007/978-3-642-32937-1_3
Pawlak, T., Wieloch, B., Krawiec, K.: Semantic backpropagation for designing search operators in genetic programming. IEEE Trans. Evol. Comput. 19(3), 326–340 (2014)
Pawlak, T.P.: Competent algorithms for geometric semantic genetic programming. Ph.D. thesis, Poznan University of Technology, Poznań, Poland (2015)
Castelli, M., Silva, S., Vanneschi, L.: A C++ framework for geometric semantic genetic programming. Genet. Program. Evolvable Mach. 16(1), 73–81 (2015)
Dick, G.: Improving geometric semantic genetic programming with safe tree initialisation. In: Machado, P., Heywood, M.I., McDermott, J., Castelli, M., García-Sánchez, P., Burelli, P., Risi, S., Sim, K. (eds.) EuroGP 2015. LNCS, vol. 9025, pp. 28–40. Springer, Heidelberg (2015). doi:10.1007/978-3-319-16501-1_3
Jackson, D.: Phenotypic diversity in initial genetic programming populations. In: Esparcia-Alcázar, A.I., Ekárt, A., Silva, S., Dignum, S., Uyar, A.Ş. (eds.) EuroGP 2010. LNCS, vol. 6021, pp. 98–109. Springer, Heidelberg (2010). doi:10.1007/978-3-642-12148-7_9
Koza, J.R.: Genetic Programming: On the Programming of Computers by Means of Natural Selection, vol. 1. MIT Press, Cambridge (1992)
Vanneschi, L., Castelli, M., Manzoni, L., Silva, S.: A new implementation of geometric semantic GP and its application to problems in pharmacokinetics. In: Krawiec, K., Moraglio, A., Hu, T., Etaner-Uyar, A.Ş., Hu, B. (eds.) EuroGP 2013. LNCS, vol. 7831, pp. 205–216. Springer, Heidelberg (2013). doi:10.1007/978-3-642-37207-0_18
Castelli, M., Trujillo, L., Vanneschi, L., Silva, S., Z-Flores, E., Legrand, P.: Geometric semantic genetic programming with local search. In: Proceedings of the 2015 Genetic and Evolutionary Computation Conference, GECCO 2015, pp. 999–1006. ACM, New York (2015)
Oliveira, L.O.V.B., Miranda, L.F., Pappa, G.L., Otero, F.E.B., Takahashi, R.H.C.: Reducing dimensionality to improve search in semantic genetic programming. In: Handl, J., Hart, E., Lewis, P.R., López-Ibáñez, M., Ochoa, G., Paechter, B. (eds.) PPSN 2016. LNCS, vol. 9921, pp. 375–385. Springer, Heidelberg (2016). doi:10.1007/978-3-319-45823-6_35
Oliveira, L., Otero, F.E.B., Pappa, G.L.: A dispersion operator for geometric semantic genetic programming. In: Proceedings of the Genetic and Evolutionary Computation Conference 2016, pp. 773–780. ACM (2016)
Gonçalves, I., Silva, S., Fonseca, C.M.: On the generalization ability of geometric semantic genetic programming. In: Machado, P., Heywood, M.I., McDermott, J., Castelli, M., García-Sánchez, P., Burelli, P., Risi, S., Sim, K. (eds.) EuroGP 2015. LNCS, vol. 9025, pp. 41–52. Springer, Heidelberg (2015). doi:10.1007/978-3-319-16501-1_4
Han, J., Pei, J., Kamber, M.: Data Mining: Concepts and Techniques. The Morgan Kaufmann Series in Data Management Systems. Elsevier Science, Amsterdam (2011)
Ni, J., Drieberg, R.H., Rockett, P.I.: The use of an analytic quotient operator in genetic programming. IEEE Trans. Evol. Comput. 17(1), 146–152 (2013)
Bache, K., Lichman, M.: UCI machine learning repository (2014). http://archive.ics.uci.edu/ml
Castelli, M., Vanneschi, L., Silva, S.: Prediction of high performance concrete strength using genetic programming with geometric semantic genetic operators. Expert Syst. Appl. 40(17), 6856–6862 (2013)
McDermott, J., White, D.R., Luke, S., Manzoni, L., Castelli, M., Vanneschi, L., Jaskowski, W., Krawiec, K., Harper, R., De Jong, K., O’Reilly, U.M.: Genetic programming needs better benchmarks. In: Proceedings of the 14th Annual Conference on Genetic and Evolutionary Computation, pp. 791–798. ACM (2012)
Iman, R.L., Davenport, J.M.: Approximations of the critical region of the Friedman statistic. Commun. Stat. - Theory Methods 9(6), 571–595 (1980)
Demšar, J.: Statistical comparisons of classifiers over multiple data sets. J. Mach. Learn. Res. 7, 1–30 (2006)
Acknowledgements
The authors would like to thank the anonymous reviewers for their valuable comments and suggestions. This work was partially supported by the following Brazilian Research Support Agencies: CNPq, FAPEMIG, and CAPES.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Oliveira, L.O.V.B., Casadei, F., Pappa, G.L. (2017). Strategies for Improving the Distribution of Random Function Outputs in GSGP. In: McDermott, J., Castelli, M., Sekanina, L., Haasdijk, E., García-Sánchez, P. (eds) Genetic Programming. EuroGP 2017. Lecture Notes in Computer Science, vol 10196. Springer, Cham. https://doi.org/10.1007/978-3-319-55696-3_11
DOI: https://doi.org/10.1007/978-3-319-55696-3_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-55695-6
Online ISBN: 978-3-319-55696-3
eBook Packages: Computer Science (R0)