
Applied Soft Computing

Volume 94, September 2020, 106432

Constructing parsimonious analytic models for dynamic systems via symbolic regression

https://doi.org/10.1016/j.asoc.2020.106432

Highlights

  • Accurate models can be found from small data sets and used for real-time control.

  • Symbolic regression finds both the structure and the parameters of the models.

  • The models have the form of mathematical expressions, facilitating further processing.

  • The method works both with state-space and input–output (NARX-type) models.

  • Extensive experimental evaluation on three systems demonstrates its practical utility.

Abstract

Developing mathematical models of dynamic systems is central to many disciplines of engineering and science. Models facilitate simulations, analysis of the system’s behavior, decision making and design of automatic control algorithms. Even inherently model-free control techniques such as reinforcement learning (RL) have been shown to benefit from the use of models, typically learned online. Any model construction method must address the tradeoff between the accuracy of the model and its complexity, which is difficult to strike. In this paper, we propose to employ symbolic regression (SR) to construct parsimonious process models described by analytic equations. We have equipped our method with two different state-of-the-art SR algorithms which automatically search for equations that fit the measured data: Single Node Genetic Programming (SNGP) and Multi-Gene Genetic Programming (MGGP). In addition to the standard problem formulation in the state-space domain, we show how the method can also be applied to input–output models of the NARX (nonlinear autoregressive with exogenous input) type. We present the approach on three simulated examples with up to 14-dimensional state space: an inverted pendulum, a mobile robot, and a bipedal walking robot. A comparison with deep neural networks and local linear regression shows that SR in most cases outperforms these commonly used alternative methods. We demonstrate on a real pendulum system that the analytic model found enables a RL controller to successfully perform the swing-up task, based on a model constructed from only 100 data samples.

Introduction

Numerous methods in engineering and science rely on an accurate model of the system. Model-based techniques comprise a wide variety of methods such as model predictive control [1], [2], time series prediction [3], fault detection and diagnosis [4], [5], or reinforcement learning (RL) [6], [7].

Even though model-free algorithms are available, the absence of a model slows down convergence and leads to extensive learning times [8], [9], [10]. Various model-based methods have been proposed to speed up learning [11], [12], [13], [14], [15]. To that end, many model-learning approaches are available: time-varying linear models [16], [17], Gaussian processes [18], [19] and other probabilistic models [20], basis function expansions [21], [22], regression trees [23], deep neural networks [7], [24], [25], [26], [27], [28], [29] or local linear regression [30], [31], [32].

All the above approaches suffer from drawbacks induced by the specific approximation technique used, such as a large number of parameters (deep neural networks), the local nature of the approximator (local linear regression), or computational complexity (Gaussian processes). In this article, we propose another way to capture the system dynamics: analytic models constructed by means of symbolic regression (SR). Symbolic regression is based on genetic programming and has been used in nonlinear data-driven modeling, often with quite impressive results [33], [34], [35], [36], [37].

Symbolic regression remains relatively unknown to the machine learning community, as only a few works report the use of SR for control of dynamic systems. For instance, modeling of the value function by means of genetic programming is presented in [38], where analytic descriptions of the value function are obtained from data sampled from the optimal value function. Another example is [39], where SR is used to construct an analytic function that serves as a proxy to the value function and from which a continuous policy can be derived. A multi-objective evolutionary algorithm based on interactive learning of the value function through inputs from the user was proposed in [40]. In [41], SR is employed to construct a smooth analytic approximation of the policy, using data sampled from the interpolated policy.

To the best of our knowledge, there have been no reports in the literature on the use of symbolic regression for constructing a process model in model-based control methods. We argue that the use of SR for model learning is a valuable element missing from current nonlinear control schemes, and we demonstrate its usefulness.

In this paper, we extend our previous work [42], [43], which indicated that SR is a suitable tool for this task. It requires no basis functions defined a priori and, contrary to (deep) neural networks, it learns accurate, parsimonious models even from very small data sets. Symbolic regression can also handle high-dimensional problems, and its computational complexity does not grow exponentially with the dimensionality of the problem, which we demonstrate on an enriched set of experiments including a complex bipedal walking robot system. In this work, we extend the method to the class of input–output models, which are suitable when the full state vector cannot be measured. By testing our method with two different state-of-the-art genetic programming algorithms, we demonstrate that the method does not depend on the particular choice of the SR algorithm.

The paper is organized as follows. Sections 2 and 3 present the theoretical background on model learning and the proposed method. The experimental evaluation of the method is reported in Section 4 and conclusions are drawn in Section 5. Appendix A describes the RL method used in this paper.


Theoretical background

The discrete-time nonlinear state-space process model is described as

x_{k+1} = f(x_k, u_k)    (1)

with the state x_k, x_{k+1} ∈ X ⊂ ℝ^n and the input u_k ∈ U ⊂ ℝ^m. Note that the actual process can be stochastic (typically when the sensor readings are corrupted by noise), but in this paper we aim at constructing a deterministic process model (1).
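The state-space model x_{k+1} = f(x_k, u_k) can be illustrated in code. Below is a minimal sketch using an Euler-discretized 1-DOF inverted pendulum; all physical parameters and the sampling period are assumed placeholder values, not those of the paper's experimental setup.

```python
import math

# Illustrative discrete-time state-space model x_{k+1} = f(x_k, u_k):
# an Euler-discretized 1-DOF inverted pendulum with placeholder values.
G, L_ROD, M, B = 9.81, 0.5, 0.1, 0.01  # gravity, rod length, mass, damping
TS = 0.02                              # sampling period [s]

def f(x, u):
    """One simulation step; state x = (angle, angular velocity).
    Angle is measured from the upright position; theta = pi hangs down."""
    theta, omega = x
    domega = (G / L_ROD) * math.sin(theta) \
             - (B / (M * L_ROD ** 2)) * omega + u / (M * L_ROD ** 2)
    return (theta + TS * omega, omega + TS * domega)

# Rolling the model forward from the stable (hanging) equilibrium with
# zero input leaves the state unchanged up to numerical noise.
x = (math.pi, 0.0)
for _ in range(10):
    x = f(x, 0.0)
```

Repeated application of f plays the role of a simulator; a model learned by SR replaces the hand-written right-hand side with an expression found from data.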

The full state vector cannot be directly measured for a vast majority of processes and a state estimator would have to be used. In the absence of an accurate process model, such a
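When the full state cannot be measured, the input–output (NARX-type) formulation mentioned in the highlights predicts the next output from past outputs and inputs instead. A minimal sketch follows; f_example is a hypothetical second-order linear model chosen purely for illustration, not a model produced by the paper's method.

```python
# Minimal sketch of a NARX-type input-output model: the next output is a
# function of past outputs and past inputs rather than of the full state.
def narx_step(f, y_hist, u_hist):
    """y_{k+1} = f(y_k, ..., y_{k-ny+1}, u_k, ..., u_{k-nu+1})."""
    return f(y_hist, u_hist)

# Hypothetical second-order model, for illustration only.
def f_example(y_hist, u_hist):
    return 1.5 * y_hist[-1] - 0.6 * y_hist[-2] + 0.1 * u_hist[-1]

# Simulate the step response from rest with a constant unit input.
y = [0.0, 0.0]
for _ in range(5):
    y.append(narx_step(f_example, y[-2:], [1.0]))
```

In the SR setting, the function f itself (its structure and parameters) is what the genetic programming algorithm searches for.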

Method

In this section, we explain the principle of our method, briefly describe two variants of genetic programming algorithms used in this work, and discuss the computational complexity of our approach.
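The search principle behind symbolic regression can be sketched as follows. This is a conceptual illustration only, not the SNGP or MGGP algorithms used in the paper: it ranks a fixed set of candidate analytic expressions by mean-squared error, where a real SR algorithm would evolve the candidates with genetic operators. The target function y = sin(x) + 0.5*x is assumed purely for the example.

```python
import math

# Conceptual sketch of symbolic regression: score candidate analytic
# expressions against measured data and keep the best one. A real SR
# algorithm evolves such candidates instead of enumerating a fixed list.
candidates = {
    "x":              lambda x: x,
    "sin(x)":         math.sin,
    "sin(x) + 0.5*x": lambda x: math.sin(x) + 0.5 * x,
    "x**2":           lambda x: x ** 2,
}

xs = [i * 0.1 for i in range(50)]
ys = [math.sin(x) + 0.5 * x for x in xs]  # "measured" training data

def mse(model):
    """Mean-squared error of a candidate model on the training data."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

best = min(candidates, key=lambda name: mse(candidates[name]))
```

The selected expression is itself the model, which is what makes the result parsimonious and amenable to further analytic processing.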

Experimental results

We have carried out experiments with three nonlinear systems: a mobile robot, a 1-DOF inverted pendulum and a bipedal walking robot. The data, the code and the detailed configuration of the experiments are available in our repository.1

The simulation experiment with the mobile robot illustrates the use of the presented method, showing the precision and compactness of the models found in the case where the ground truth is known (Section 4.1). We

Conclusions

We showed that symbolic regression is a very effective method for constructing dynamic process models from data. It generates parsimonious models in the form of analytic expressions, which makes it a good alternative to black-box models, especially in problems with limited amounts of data. Prior knowledge of the type of nonlinearities and the model complexity can easily be included in the symbolic regression procedure. Although the technique is not yet broadly used in the field of robotics and

CRediT authorship contribution statement

Erik Derner: Methodology, Software, Data curation, Writing - original draft. Jiří Kubalík: Conceptualization, Formal analysis, Validation, Resources. Nicola Ancona: Investigation, Visualization. Robert Babuška: Supervision, Writing - reviewing & editing, Project administration, Funding acquisition.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported by the European Regional Development Fund under the project Robotics for Industry 4.0 (reg. no. CZ.02.1.01/0.0/0.0/15_003/0000470) and by the Grant Agency of the Czech Technical University in Prague, grant no. SGS19/174/OHK3/3T/13.

The authors thank Tim de Bruin and Jonáš Kulhánek for their help with the DNN experiments for the walking robot and Jan Žegklitz for his help with experiments using the MGGP algorithm.

References (60)

  • Masters, T., Neural, Novel and Hybrid Algorithms for Time Series Prediction (1995)
  • Gertler, J., Fault Detection and Diagnosis (2013)
  • Sutton, R.S., et al., Reinforcement Learning: An Introduction (2018)
  • Mnih, V., et al., Human-level control through deep reinforcement learning, Nature (2015)
  • Gu, S., et al., Continuous deep Q-learning with model-based acceleration (2016)
  • Peters, J., et al., Policy gradient methods for robotics
  • Kober, J., et al., Reinforcement learning in robotics: A survey
  • Kuvayev, L., et al., Model-based reinforcement learning with an approximate, learned model
  • J. Forbes, D. Andre, Representations for learning control policies, in: Proc. 19th Int. Conf. Mach. Learn. Workshop...
  • Jong, N.K., et al., Model-based function approximation in reinforcement learning
  • Sutton, R.S., Dyna, an integrated architecture for learning, planning, and reacting, SIGART Bull. (1991)
  • Levine, S., et al., Learning neural network policies with guided policy search under unknown dynamics
  • Lioutikov, R., et al., Sample-based information-theoretic stochastic optimal control
  • Deisenroth, M., et al., PILCO: A model-based and data-efficient approach to policy search
  • Boedecker, J., et al., Approximate real-time optimal control based on sparse Gaussian process models
  • Ng, A.Y., et al., Autonomous inverted helicopter flight via reinforcement learning
  • Munos, R., et al., Variable resolution discretization in optimal control, Mach. Learn. (2002)
  • Buşoniu, L., et al., Cross-entropy optimization of control policies with adaptive basis functions, IEEE Trans. Syst. Man Cybern. B (2011)
  • Ernst, D., et al., Tree-based batch mode reinforcement learning, J. Mach. Learn. Res. (2005)
  • S. Lange, M. Riedmiller, A. Voigtlander, Autonomous reinforcement learning on raw visual input data in a real world...