A Genetic Programming Encoder for Increasing Autoencoder Interpretability

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13986)

Abstract

Autoencoders are powerful models for non-linear dimensionality reduction. However, their neural network structure makes it difficult to interpret how the high-dimensional features relate to the low-dimensional embedding, which is an issue in applications where explainability is important. There have been attempts to replace both neural network components in autoencoders with interpretable genetic programming (GP) models. However, for the purposes of interpretable dimensionality reduction, we observe that replacing only the encoder with GP is sufficient. In this work, we propose the Genetic Programming Encoder for Autoencoding (GPE-AE). GPE-AE uses a multi-tree GP individual as an encoder, while retaining the neural network decoder. We demonstrate that GPE-AE is a competitive non-linear dimensionality reduction technique compared to conventional autoencoders and a GP-based method that does not use an autoencoder structure. As visualisation is a common goal for dimensionality reduction, we also evaluate the quality of visualisations produced by our method, and highlight the value of functional mappings by demonstrating insights that can be gained from interpreting the GP encoders.
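
As a rough illustration of the idea of a multi-tree GP encoder, the sketch below represents an individual as a list of expression trees, one per embedding dimension, so that each embedding coordinate can be read back as a formula over the input features. This is a minimal sketch under stated assumptions, not the authors' implementation: the tuple-based tree encoding, the function set in FUNCTIONS, the helper names, and the hand-written toy_decoder (standing in for the trained neural network decoder) are all hypothetical choices made for this example, and the evolutionary search itself is omitted.

```python
import math

# Hypothetical function set; the abstract does not specify the one used by GPE-AE.
FUNCTIONS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
    "tanh": lambda a: math.tanh(a),
}

def evaluate(tree, x):
    """Evaluate one expression tree on a single feature vector x.

    A tree is ("x", i) for feature i, ("c", value) for a constant,
    or (function_name, child, ...) for an internal node.
    """
    op = tree[0]
    if op == "x":
        return x[tree[1]]
    if op == "c":
        return float(tree[1])
    args = [evaluate(child, x) for child in tree[1:]]
    return FUNCTIONS[op](*args)

def to_formula(tree):
    """Render a tree as a readable formula string."""
    op = tree[0]
    if op == "x":
        return f"x{tree[1]}"
    if op == "c":
        return str(tree[1])
    return f"{op}(" + ", ".join(to_formula(c) for c in tree[1:]) + ")"

def encode(individual, x):
    """Multi-tree encoder: one tree per embedding dimension."""
    return [evaluate(tree, x) for tree in individual]

def reconstruction_error(individual, decoder, data):
    """Mean squared reconstruction error of decoder(encode(x)) over data."""
    total = 0.0
    for x in data:
        x_hat = decoder(encode(individual, x))
        total += sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)
    return total / len(data)

# A two-tree individual mapping three features to a two-dimensional embedding.
individual = [
    ("add", ("x", 0), ("x", 1)),
    ("mul", ("x", 2), ("c", 0.5)),
]

def toy_decoder(z):
    """Stand-in for a trained neural network decoder (assumption for this example)."""
    return [z[0] / 2, z[0] / 2, 2 * z[1]]

print([to_formula(t) for t in individual])
print(encode(individual, [1.0, 2.0, 4.0]))
print(reconstruction_error(individual, toy_decoder, [[1.0, 2.0, 4.0]]))
```

In a full system along these lines, a reconstruction error of this kind could serve as the fitness of a GP individual, while the printed formulas are what make the encoder inspectable.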

Author information

Corresponding author

Correspondence to Andrew Lensen.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Schofield, F., Slyfield, L., Lensen, A. (2023). A Genetic Programming Encoder for Increasing Autoencoder Interpretability. In: Pappa, G., Giacobini, M., Vasicek, Z. (eds) Genetic Programming. EuroGP 2023. Lecture Notes in Computer Science, vol 13986. Springer, Cham. https://doi.org/10.1007/978-3-031-29573-7_2

  • DOI: https://doi.org/10.1007/978-3-031-29573-7_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-29572-0

  • Online ISBN: 978-3-031-29573-7

  • eBook Packages: Computer Science (R0)
