Can Genetic Programming Do Manifold Learning Too?

Lensen, Andrew; Xue, Bing; Zhang, Mengjie

doi:10.1007/978-3-030-16670-0_8

Andrew Lensen ORCID: orcid.org/0000-0003-1269-4751¹⁹,
Bing Xue¹⁹ &
Mengjie Zhang¹⁹

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 11451))

Included in the following conference series:

European Conference on Genetic Programming

994 Accesses
14 Citations

Abstract

Exploratory data analysis is a fundamental aspect of knowledge discovery that aims to find the main characteristics of a dataset. Dimensionality reduction, such as manifold learning, is often used to reduce the number of features in a dataset to a manageable level for human interpretation. Despite this, most manifold learning techniques do not explain anything about the original features nor the true characteristics of a dataset. In this paper, we propose a genetic programming approach to manifold learning called GP-MaL which evolves functional mappings from a high-dimensional space to a lower dimensional space through the use of interpretable trees. We show that GP-MaL is competitive with existing manifold learning algorithms, while producing models that can be interpreted and re-used on unseen data. A number of promising future directions of research are found in the process.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

1.
An embedding here refers to the low-dimensional representation of the structure present in a dataset.
2.
Here, neighbours refer to the closest instances to a point by (Euclidean) distance.
3.
Five inputs were found to be a good balance between encouraging wider trees and minimising computing resources required.
4.
Information gain (mutual information) is often used in feature selection for classification to measure the dependency between a feature and the class label.

References

Bengio, Y., Courville, A.C., Vincent, P.: Representation learning: a review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 35(8), 1798–1828 (2013)
Article Google Scholar
Cano, A., Ventura, S., Cios, K.J.: Multi-objective genetic programming for feature extraction and data visualization. Soft Comput. 21(8), 2069–2089 (2017)
Article Google Scholar
Dheeru, D., Karra Taniskidou, E.: UCI machine learning repository (2017). http://archive.ics.uci.edu/ml
François, D., Wertz, V., Verleysen, M.: The concentration of fractional distances. IEEE Trans. Knowl. Data Eng. 19(7), 873–886 (2007)
Article Google Scholar
Jolliffe, I.T.: Principal component analysis. In: Lovric, M. (ed.) International Encyclopedia of Statistical Science, pp. 1094–1096. Springer, Berlin (2011). https://doi.org/10.1007/978-3-642-04898-2
Chapter Google Scholar
Kruskal, J.B.: Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika 29(1), 1–27 (1964)
Article MathSciNet Google Scholar
Lensen, A., Xue, B., Zhang, M.: New representations in genetic programming for feature construction in k-means clustering. In: Shi, Y., et al. (eds.) SEAL 2017. LNCS, vol. 10593, pp. 543–555. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-68759-9_44
Chapter Google Scholar
Lensen, A., Xue, B., Zhang, M.: Automatically evolving difficult benchmark feature selection datasets with genetic programming. In: Proceedings of the Genetic and Evolutionary Computation Conference, GECCO, pp. 458–465. ACM (2018)
Google Scholar
Liu, H., Motoda, H.: Feature Selection for Knowledge Discovery and Data Mining, vol. 454. Springer, Boston (2012). https://doi.org/10.1007/978-1-4615-5689-3
Book MATH Google Scholar
van der Maaten, L.: Accelerating t-SNE using tree-based algorithms. J. Mach. Learn. Res. 15(1), 3221–3245 (2014)
MathSciNet MATH Google Scholar
van der Maaten, L., Hinton, G.E.: Visualizing high-dimensional data using t-SNE. J. Mach. Learn. Res. 9, 2579–2605 (2008)
MATH Google Scholar
Neshatian, K., Zhang, M., Andreae, P.: A filter approach to multiple feature construction for symbolic learning classifiers using genetic programming. IEEE Trans. Evol. Comput. 16(5), 645–661 (2012)
Article Google Scholar
Nguyen, S., Zhang, M., Alahakoon, D., Tan, K.C.: Visualizing the evolution of computer programs for genetic programming [research frontier]. IEEE Comput. Intell. Mag. 13(4), 77–94 (2018)
Article Google Scholar
Pedregosa, F., et al.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
MathSciNet MATH Google Scholar
Poli, R., Langdon, W.B., McPhee, N.F.: A Field Guide to Genetic Programming. Lulu.com, Morrisville (2008)
Google Scholar
Rodriguez-Coayahuitl, L., Morales-Reyes, A., Escalante, H.J.: Structurally layered representation learning: towards deep learning through genetic programming. In: Castelli, M., Sekanina, L., Zhang, M., Cagnoni, S., García-Sánchez, P. (eds.) EuroGP 2018. LNCS, vol. 10781, pp. 271–288. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-77553-1_17
Chapter Google Scholar
Roweis, S.T., Saul, L.K.: Nonlinear dimensionality reduction by locally linear embedding. Science 290(5500), 2323–2326 (2000)
Article Google Scholar
Sun, Y., Xue, B., Zhang, M., Yen, G.G.: A particle swarm optimization-based flexible convolutional auto-encoder for image classification. IEEE Trans. Neural Netw. Learn. Syst. (2018). https://doi.org/10.1109/TNNLS.2018.2881143
Sun, Y., Yen, G.G., Yi, Z.: Evolving unsupervised deep neural networks for learning meaningful representations. IEEE Trans. Evol. Comput. (2018). https://doi.org/10.1109/TEVC.2018.2808689
Tran, B., Xue, B., Zhang, M.: Genetic programming for feature construction and selection in classification on high-dimensional data. Memet. Comput. 8(1), 3–15 (2016)
Article Google Scholar
Zhang, C., Liu, C., Zhang, X., Almpanidis, G.: An up-to-date comparison of state-of-the-art classification algorithms. Expert Syst. Appl. 82, 128–150 (2017)
Article Google Scholar

Download references

Author information

Authors and Affiliations

School of Engineering and Computer Science, Victoria University of Wellington, PO Box 600, Wellington, 6140, New Zealand
Andrew Lensen, Bing Xue & Mengjie Zhang

Authors

Andrew Lensen
View author publications
You can also search for this author in PubMed Google Scholar
Bing Xue
View author publications
You can also search for this author in PubMed Google Scholar
Mengjie Zhang
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Andrew Lensen .

Editor information

Editors and Affiliations

Brno University of Technology, Brno, Czech Republic
Lukas Sekanina
Memorial University, St. John's, NL, Canada
Ting Hu
University of Coimbra, Coimbra, Portugal
Nuno Lourenço
HTWK Leipzig University of Applied Sciences, Leipzig, Germany
Hendrik Richter
University of Granada, Granada, Spain
Pablo García-Sánchez

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Lensen, A., Xue, B., Zhang, M. (2019). Can Genetic Programming Do Manifold Learning Too?. In: Sekanina, L., Hu, T., Lourenço, N., Richter, H., García-Sánchez, P. (eds) Genetic Programming. EuroGP 2019. Lecture Notes in Computer Science(), vol 11451. Springer, Cham. https://doi.org/10.1007/978-3-030-16670-0_8

Download citation

DOI: https://doi.org/10.1007/978-3-030-16670-0_8
Published: 27 March 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-16669-4
Online ISBN: 978-3-030-16670-0
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics