A Genetic Programming Encoder for Increasing Autoencoder Interpretability

Schofield, Finn; Slyfield, Luis; Lensen, Andrew

doi:10.1007/978-3-031-29573-7_2

Finn Schofield, Luis Slyfield & Andrew Lensen (ORCID: orcid.org/0000-0003-1269-4751)

Conference paper
First Online: 29 March 2023

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13986)

Abstract

Autoencoders are powerful models for non-linear dimensionality reduction. However, their neural network structure makes it difficult to interpret how the high-dimensional features relate to the low-dimensional embedding, which is an issue in applications where explainability is important. There have been attempts to replace both neural network components in autoencoders with interpretable genetic programming (GP) models. However, for the purposes of interpretable dimensionality reduction, we observe that replacing only the encoder with GP is sufficient. In this work, we propose the Genetic Programming Encoder for Autoencoding (GPE-AE). GPE-AE uses a multi-tree GP individual as an encoder, while retaining the neural network decoder. We demonstrate that GPE-AE is a competitive non-linear dimensionality reduction technique compared to conventional autoencoders and a GP-based method that does not use an autoencoder structure. As visualisation is a common goal for dimensionality reduction, we also evaluate the quality of visualisations produced by our method, and highlight the value of functional mappings by demonstrating insights that can be gained from interpreting the GP encoders.
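
The arrangement described in the abstract, a multi-tree GP individual as the encoder paired with a trainable decoder, can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the function set, the nested-tuple tree encoding, the single linear layer standing in for the neural network decoder, the use of mean squared reconstruction error as the quality measure, and all names (`FUNCTIONS`, `encode`, `LinearDecoder`, `reconstruction_mse`). The evolutionary search over trees is omitted; two trees are fixed by hand.

```python
import math
import random

# Hypothetical function set: name -> (arity, implementation).
FUNCTIONS = {
    "add": (2, lambda a, b: a + b),
    "sub": (2, lambda a, b: a - b),
    "mul": (2, lambda a, b: a * b),
    "tanh": (1, math.tanh),
}

def eval_tree(tree, x):
    """Evaluate a nested-tuple expression tree on one input vector x."""
    if tree[0] == "x":  # leaf: ("x", feature_index)
        return x[tree[1]]
    _, fn = FUNCTIONS[tree[0]]
    return fn(*(eval_tree(child, x) for child in tree[1:]))

def encode(trees, x):
    """Multi-tree encoder: each tree yields one embedding dimension."""
    return [eval_tree(t, x) for t in trees]

def to_string(tree):
    """Render a tree as a formula; this is what makes the mapping inspectable."""
    if tree[0] == "x":
        return f"x{tree[1]}"
    return tree[0] + "(" + ", ".join(to_string(c) for c in tree[1:]) + ")"

class LinearDecoder:
    """Assumed stand-in for the neural network decoder: one linear layer."""

    def __init__(self, n_in, n_out, rng):
        self.w = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                  for _ in range(n_out)]
        self.b = [0.0] * n_out

    def decode(self, z):
        return [sum(wi * zi for wi, zi in zip(row, z)) + b
                for row, b in zip(self.w, self.b)]

    def train_step(self, z, x, lr=0.01):
        """One SGD step on squared reconstruction error for one sample."""
        out = self.decode(z)
        for j, (o, target) in enumerate(zip(out, x)):
            grad = 2.0 * (o - target)
            for i, zi in enumerate(z):
                self.w[j][i] -= lr * grad * zi
            self.b[j] -= lr * grad

def reconstruction_mse(trees, decoder, data):
    """Mean squared error between inputs and their reconstructions."""
    total = 0.0
    for x in data:
        x_hat = decoder.decode(encode(trees, x))
        total += sum((a - b) ** 2 for a, b in zip(x_hat, x)) / len(x)
    return total / len(data)

rng = random.Random(0)
data = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(50)]

# A hand-written two-tree individual mapping 4 features to 2 dimensions.
trees = [("add", ("x", 0), ("x", 1)),
         ("tanh", ("mul", ("x", 2), ("x", 3)))]
decoder = LinearDecoder(n_in=len(trees), n_out=4, rng=rng)

mse_before = reconstruction_mse(trees, decoder, data)
for _ in range(20):  # train only the decoder; the encoder trees stay fixed
    for x in data:
        decoder.train_step(encode(trees, x), x)
mse_after = reconstruction_mse(trees, decoder, data)

print([to_string(t) for t in trees])
print(mse_before, mse_after)
```

In this sketch each embedding dimension can be read directly as a formula over the original features (for example `add(x0, x1)`), while the decoder remains an opaque learned map. A GP search would vary the trees and could score each candidate by the reconstruction error obtained after training the decoder, though how the paper couples the two is not stated in the abstract.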

Author information

Authors and Affiliations

School of Engineering and Computer Science, Victoria University of Wellington, PO Box 600, Wellington, 6140, New Zealand
Finn Schofield, Luis Slyfield & Andrew Lensen

Corresponding author

Correspondence to Andrew Lensen.

Editor information

Editors and Affiliations

Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
Gisele Pappa
Università degli studi di Torino, Turin, Italy
Mario Giacobini
Brno University of Technology, Brno, Czech Republic
Zdenek Vasicek

About this paper

Cite this paper

Schofield, F., Slyfield, L., Lensen, A. (2023). A Genetic Programming Encoder for Increasing Autoencoder Interpretability. In: Pappa, G., Giacobini, M., Vasicek, Z. (eds) Genetic Programming. EuroGP 2023. Lecture Notes in Computer Science, vol 13986. Springer, Cham. https://doi.org/10.1007/978-3-031-29573-7_2

DOI: https://doi.org/10.1007/978-3-031-29573-7_2
Published: 29 March 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-29572-0
Online ISBN: 978-3-031-29573-7
eBook Packages: Computer Science, Computer Science (R0)
