
Latent Variable Symbolic Regression for High-Dimensional Inputs

  • Chapter
Genetic Programming Theory and Practice VII

Part of the book series: Genetic and Evolutionary Computation (GEVO)

Abstract

This paper explores symbolic regression when there are hundreds of input variables and those variables have similar influence, which means that variable pruning (a priori or on the fly) will be ineffective. For this problem, traditional genetic programming and many other regression approaches do poorly. We develop a technique based on latent variables, nonlinear sensitivity analysis, and genetic programming that is designed to manage the challenge. The technique handles 340-input-variable problems in minutes, with promise to scale well to even higher dimensions. The technique is successfully verified on 24 real-world circuit modeling problems.
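To make the latent-variable idea concrete, the following is a minimal sketch, not the chapter's algorithm. It assumes a single latent variable t = w·x and fits y ≈ g(t). Two stand-ins are used and both are assumptions: the direction w comes from an ordinary linear least-squares fit (in place of the nonlinear sensitivity analysis the abstract mentions), and g is a cubic polynomial (in place of a one-dimensional function evolved by genetic programming). The function names, the synthetic data, and the sample sizes are hypothetical.

```python
import numpy as np

def fit_latent_model(X, y, degree=3):
    """Fit y ~ g(w.x) with one latent variable (illustrative stand-in only)."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Assumption: a linear least-squares direction replaces the chapter's
    # nonlinear sensitivity analysis for choosing the projection vector w.
    w, *_ = np.linalg.lstsq(Xc, y - y.mean(), rcond=None)
    w = w / np.linalg.norm(w)
    t = Xc @ w  # latent variable: a 1-D projection of all inputs
    # Assumption: a polynomial replaces a GP-evolved 1-D function g(t).
    coeffs = np.polyfit(t, y, degree)
    return mean, w, coeffs

def predict(model, X):
    mean, w, coeffs = model
    return np.polyval(coeffs, (X - mean) @ w)

# Synthetic problem: 340 inputs with equal influence, so pruning cannot help.
rng = np.random.default_rng(0)
n_vars = 340
true_w = np.ones(n_vars) / np.sqrt(n_vars)

def make_data(n_samples):
    X = rng.normal(size=(n_samples, n_vars))
    t = X @ true_w
    return X, t + 0.3 * t**2

X_train, y_train = make_data(2000)
X_test, y_test = make_data(500)

model = fit_latent_model(X_train, y_train)
residual = y_test - predict(model, X_test)
r2 = 1.0 - residual.var() / y_test.var()
alignment = abs(model[1] @ true_w)  # cosine between fitted and true direction
print(f"test R^2 = {r2:.3f}, direction alignment = {alignment:.3f}")
```

The point of the sketch is the structure: once every input is folded into a single projection, the remaining nonlinear fit is one-dimensional regardless of how many inputs there are, which is what makes hundreds of equally influential variables tractable.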

Copyright information

© 2010 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

McConaghy, T. (2010). Latent Variable Symbolic Regression for High-Dimensional Inputs. In: Riolo, R., O'Reilly, UM., McConaghy, T. (eds) Genetic Programming Theory and Practice VII. Genetic and Evolutionary Computation. Springer, Boston, MA. https://doi.org/10.1007/978-1-4419-1626-6_7

  • DOI: https://doi.org/10.1007/978-1-4419-1626-6_7

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4419-1653-2

  • Online ISBN: 978-1-4419-1626-6

  • eBook Packages: Computer Science (R0)
