
Evolving Adaptive Neural Network Optimizers for Image Classification

  • Conference paper
In: Genetic Programming (EuroGP 2022)

Abstract

The evolution of hardware has enabled Artificial Neural Networks to become a staple solution to many modern Artificial Intelligence problems, such as natural language processing and computer vision. A neural network's effectiveness is highly dependent on the optimizer used during training, which has motivated significant research into the design of neural network optimizers. Current research focuses on creating optimizers that perform well across different topologies and network types. While there is evidence that it is desirable to fine-tune optimizer parameters for specific networks, the benefits of designing optimizers specialized for individual networks remain mostly unexplored.

In this paper, we propose an evolutionary framework called Adaptive AutoLR (ALR) to evolve adaptive optimizers for specific neural networks in an image classification task. The evolved optimizers are then compared with state-of-the-art, human-made optimizers on two popular image classification problems. The results show that some evolved optimizers perform competitively in both tasks, even achieving the best average test accuracy on one dataset. An analysis of the best evolved optimizer also reveals that it functions differently from human-made approaches. These results suggest that ALR can evolve novel, high-quality optimizers, motivating further research into and applications of the framework.
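To make the framework's core idea concrete, the minimal Python sketch below (ours, not the authors' implementation; names such as candidate_update and evaluate_candidate are hypothetical) illustrates what ALR searches over: a per-weight adaptive update rule, scored by how well a fixed model trains under it. A hand-written RMSProp-like rule stands in for a grammar-evolved candidate, and a 1-D quadratic stands in for the image classifiers used in the paper.

    import numpy as np

    def candidate_update(weights, grad, state, lr=0.1, beta=0.9):
        # One example of the kind of adaptive rule grammar-guided GP could
        # produce: an exponential moving average of squared gradients
        # scales each step (RMSProp-like behaviour).
        state = beta * state + (1.0 - beta) * grad ** 2
        step = lr * grad / (np.sqrt(state) + 1e-8)
        return weights - step, state

    def evaluate_candidate(update_rule, steps=200):
        # Fitness of a candidate optimizer: train a fixed problem with it
        # and measure the final loss (here, minimizing f(w) = w**2).
        w = np.array([5.0])
        state = np.zeros_like(w)
        for _ in range(steps):
            grad = 2.0 * w                   # gradient of f(w) = w**2
            w, state = update_rule(w, grad, state)
        return -float(w[0] ** 2)             # higher fitness = lower loss

    print(evaluate_candidate(candidate_update))

In the actual framework, the update rule is generated from a grammar and the fitness evaluation trains a full neural network on image data; the plumbing, however, is the same: propose a rule, train under it, score the result.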



Acknowledgments

This work is partially funded by Fundação para a Ciência e Tecnologia (FCT), Portugal, under grant UI/BD/151053/2021; by national funds through FCT - Foundation for Science and Technology, I.P., within the scope of project CISUC - UID/CEC/00326/2020; and by the European Social Fund, through the Regional Operational Program Centro 2020.

Author information

Corresponding author

Correspondence to Pedro Carvalho.



Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Carvalho, P., Lourenço, N., Machado, P. (2022). Evolving Adaptive Neural Network Optimizers for Image Classification. In: Medvet, E., Pappa, G., Xue, B. (eds) Genetic Programming. EuroGP 2022. Lecture Notes in Computer Science, vol 13223. Springer, Cham. https://doi.org/10.1007/978-3-031-02056-8_1

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-02056-8_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-02055-1

  • Online ISBN: 978-3-031-02056-8

  • eBook Packages: Computer Science, Computer Science (R0)
