Skip to main content

TPOT2: A New Graph-Based Implementation of the Tree-Based Pipeline Optimization Tool for Automated Machine Learning

  • Chapter
  • First Online:
Genetic Programming Theory and Practice XX

Abstract

In this chapter, we present a new implementation of the popular Tree-Based Pipeline Optimization Tool (TPOT). This new implementation, called TPOT2, was rebuilt from the ground up to be more modular, easier to maintain, and easier to expand. TPOT2 comes with new features and optimizations, such as a more flexible graph-based representation of Scikit-Learn pipelines and the ability to specify various aspects of the evolutionary run. Using experiments on multiple benchmark datasets, we show that TPOT2 performs at least as well as TPOT1 with equivalent settings, with stronger performance on a few datasets. We outline some future directions for further optimizations and applications.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 119.00
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Hardcover Book
USD 159.99
Price excludes VAT (USA)
  • Durable hardcover edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

  1. 1.

    https://github.com/EpistasisLab/tpot2.

  2. 2.

    https://github.com/alegonz/baikal.

  3. 3.

    https://github.com/scikit-learn-contrib/skdag.

  4. 4.

    The code to run experiments: https://github.com/epistasisLab/tpot2_gptp_experiments.

References

  1. Akiba, T., Sano, S., Yanase, T., Ohta, T., Koyama, M.: Optuna: a next-generation hyperparameter optimization framework. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2019)

    Google Scholar 

  2. Blank, J., Deb, K.: pymoo: Multi-objective optimization in python. IEEE Access 8, 89497–89509 (2020)

    Article  Google Scholar 

  3. Cavaglià, M., Gaudio, S., Hansen, T., Staats, K., Szczepańczyk, M., Zanolin, M.: Improving the background of gravitational-wave searches for core collapse supernovae: a machine learning approach. Mach. Learn.: Sci. Technol. 1(1), 015005 (2020)

    Google Scholar 

  4. Deb, K., Pratap, A., Agarwal, S., Meyarivan, T.: A fast and elitist multiobjective genetic algorithm: Nsga-ii. IEEE Trans. Evol. Comput. 6(2), 182–197 (2002)

    Article  Google Scholar 

  5. Feurer, M., Eggensperger, K., Falkner, S., Lindauer, M., Hutter, F.: Auto-sklearn 2.0: hands-free automl via meta-learning. arXiv:2007.04074 [cs.LG] (2020)

  6. Feurer, M., Eggensperger, K., Falkner, S., Lindauer, M., Hutter, F.: Auto-sklearn 2.0: hands-free automl via meta-learning. J. Mach. Learn. Res. 23(1), 11936–11996 (2022)

    Google Scholar 

  7. Feurer, M., Klein, A., Eggensperger, J., Springenberg, K., Blum, M., Hutter, F.: Efficient and robust automated machine learning. In: Advances in Neural Information Processing Systems 2015, vol. 28, pp. 2962–2970 (2015)

    Google Scholar 

  8. Fortin, F.-A., De Rainville, F.-M., Gardner, M.-A., Parizeau, M., Gagné, C.: DEAP: evolutionary algorithms made easy. J. Mach. Learn. Res. 13, 2171–2175 (2012)

    MathSciNet  Google Scholar 

  9. Freda, P.J., Ghosh, A., Zhang, E., Luo, T., Chitre, A.S., Polesskaya, O., St. Pierre, C.L., Gao, J., Martin, C.D., Chen, H. et al.: Automated quantitative trait locus analysis (autoqtl). BioData Mining 16(1) (2023)

    Google Scholar 

  10. Hagberg, A.A., Schult, D.A., Swart, P.J.: Exploring network structure, dynamics, and function using networkx. In: Varoquaux, G., Vaught, T., Millman, J. (eds.), Proceedings of the 7th Python in Science Conference, pp. 11–15. Pasadena (2008)

    Google Scholar 

  11. Manduchi, E., Fu, W., Romano, J.D., Ruberto, S., Moore, J.H.: Embedding covariate adjustments in tree-based automated machine learning for biomedical big data analyses. BMC Bioinf. 21(1) (2020)

    Google Scholar 

  12. Manduchi, E., Romano, J.D., Moore, J.H.: The promise of automated machine learning for the genetic analysis of complex traits. Hum. Genet. 141(9), 1529–1544 (2021)

    Article  Google Scholar 

  13. Olson, R.S., Moore, J.H.: Tpot: a tree-based pipeline optimization tool for automating machine learning. In: Workshop on Automatic Machine Learning, pp. 66–74. PMLR (2016)

    Google Scholar 

  14. Parmentier, L., Nicol, O., Jourdan, L., Kessaci, M.E.: Tpot-sh: A faster optimization algorithm to solve the automl problem on large datasets. In: 2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI), pp. 471–478. IEEE (2019)

    Google Scholar 

  15. Soper, D.S.: Greed is good: Rapid hyperparameter optimization and model selection using greedy k-fold cross validation. Electronics 10(16), 1973 (2021)

    Article  Google Scholar 

  16. Thornton, C., Hutter, F., Hoos, H. H., Leyton-Brown, K.: Auto-weka: Combined selection and hyperparameter optimization of classification algorithms. In: Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 847–855 (2013)

    Google Scholar 

Download references

Acknowledgements

The study was supported by the following NIH grants: R01 LM010098 and U01 AG066833.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Pedro Ribeiro .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter

Check for updates. Verify currency and authenticity via CrossMark

Cite this chapter

Ribeiro, P. et al. (2024). TPOT2: A New Graph-Based Implementation of the Tree-Based Pipeline Optimization Tool for Automated Machine Learning. In: Winkler, S., Trujillo, L., Ofria, C., Hu, T. (eds) Genetic Programming Theory and Practice XX. Genetic and Evolutionary Computation. Springer, Singapore. https://doi.org/10.1007/978-981-99-8413-8_1

Download citation

  • DOI: https://doi.org/10.1007/978-981-99-8413-8_1

  • Published:

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-8412-1

  • Online ISBN: 978-981-99-8413-8

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics