Abstract
Genetic programming approaches are moving from analysing the syntax of individual solutions to look into their semantics. One of the common definitions of the semantic space in the context of symbolic regression is a n-dimensional space, where n corresponds to the number of training examples. In problems where this number is high, the search process can became harder as the number of dimensions increase. Geometric semantic genetic programming (GSGP) explores the semantic space by performing geometric semantic operations—the fitness landscape seen by GSGP is guaranteed to be conic by construction. Intuitively, a lower number of dimensions can make search more feasible in this scenario, decreasing the chances of data overfitting and reducing the number of evaluations required to find a suitable solution. This paper proposes two approaches for dimensionality reduction in GSGP: (i) to apply current instance selection methods as a pre-process step before training points are given to GSGP; (ii) to incorporate instance selection to the evolution of GSGP. Experiments in 15 datasets show that GSGP performance is improved by using instance reduction during the evolution.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Albinati, J., Pappa, G.L., Otero, F.E.B., Oliveira, L.O.V.B.: The effect of distinct geometric semantic crossover operators in regression problems. In: Proceedings of EuroGP, pp. 3–15 (2015)
Arnaiz-González, Á., Blachnik, M., Kordos, M., García-Osorio, C.: Fusion of instance selection methods in regression tasks. Inf. Fus. 30, 69–79 (2016)
Castelli, M., Silva, S., Vanneschi, L.: A C++ framework for geometric semantic genetic programming. Genet. Prog. Evolvable Mach. 16(1), 73–81 (2015)
Demšar, J.: Statistical comparisons of classifiers over multiple data sets. J. Mach. Learn. Res. 7, 1–30 (2006)
Domingos, P.: A few useful things to know about machine learning. Commun. ACM 55(10), 78–87 (2012)
Garcia, S., Derrac, J., Cano, J., Herrera, F.: Prototype selection for nearest neighbor classification: taxonomy and empirical study. IEEE Trans. Pattern Anal. Mach. Intell. 34(3), 417–435 (2012)
Guillen, A., Herrera, L.J., Rubio, G., Pomares, H., Lendasse, A., Rojas, I.: New method for instance or prototype selection using mutual information in time series prediction. Neurocomputing 73(10–12), 2030–2038 (2010)
Hart, P.: The condensed nearest neighbor rule (corresp.). IEEE Trans. Inf. Theor. 14(3), 515–516 (1968)
Kordos, M., Blachnik, M.: Instance selection with neural networks for regression problems. In: Villa, A.E.P., Duch, W., Érdi, P., Masulli, F., Palm, G. (eds.) ICANN 2012, Part II. LNCS, vol. 7553, pp. 263–270. Springer, Heidelberg (2012)
Koza, J.R.: Genetic Programming: On the Programming of Computers by Means of Natural Selection, vol. 1. MIT Press, Cambridge (1992)
Lichman, M.: UCI Machine Learning Repository (2015). http://archive.ics.uci.edu/ml
McDermott, J., White, D.R., Luke, S., Manzoni, L., Castelli, M., Vanneschi, L., Jaskowski, W., Krawiec, K., Harper, R., De Jong, K., O’Reilly, U.M.: Genetic programming needs better benchmarks. In: Proceedings of GECCO, pp. 791–798 (2012)
Moraglio, A.: Abstract convex evolutionary search. In: Proceedings of the 11th FOGA, pp. 151–162 (2011)
Moraglio, A., Krawiec, K., Johnson, C.G.: Geometric semantic genetic programming. In: Coello, C.A.C., Cutello, V., Deb, K., Forrest, S., Nicosia, G., Pavone, M. (eds.) PPSN 2012, Part I. LNCS, vol. 7491, pp. 21–31. Springer, Heidelberg (2012)
Ni, J., Drieberg, R.H., Rockett, P.I.: The use of an analytic quotient operator in genetic programming. IEEE Trans. Evol. Comput. 17(1), 146–152 (2013)
Rodrguez-Fdez, I., Mucientes, M., Bugarn, A.: An instance selection algorithm for regression and its application in variance reduction. In: 2013 IEEE International Conference on Fuzzy Systems (FUZZ), pp. 1–8, July 2013
Vanneschi, L., Castelli, M., Silva, S.: A survey of semantic methods in genetic programming. Genet. Program. Evolvable Mach. 15(2), 195–214 (2014)
Wilson, D.L.: Asymptotic properties of nearest neighbor rules using edited data. IEEE Trans. Syst. Man Cybern. 2(3), 408–421 (1972)
Acknowledgements
The authors would like to thank CNPq (141985/2015-1), CAPES and Fapemig for their financial support.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Oliveira, L.O.V.B., Miranda, L.F., Pappa, G.L., Otero, F.E.B., Takahashi, R.H.C. (2016). Reducing Dimensionality to Improve Search in Semantic Genetic Programming. In: Handl, J., Hart, E., Lewis, P., López-Ibáñez, M., Ochoa, G., Paechter, B. (eds) Parallel Problem Solving from Nature – PPSN XIV. PPSN 2016. Lecture Notes in Computer Science(), vol 9921. Springer, Cham. https://doi.org/10.1007/978-3-319-45823-6_35
Download citation
DOI: https://doi.org/10.1007/978-3-319-45823-6_35
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-45822-9
Online ISBN: 978-3-319-45823-6
eBook Packages: Computer ScienceComputer Science (R0)