Abstract
A methodology has been developed to determine the causal relations between time series and to derive the set of equations describing the interacting systems. The proposed techniques are completely data driven and are based on ensembles of Time Delay Neural Networks (TDNNs) and Symbolic Regression (SR) via Genetic Programming (GP). With regard to the detection of causal influences and the identification of graphical causal networks, the developed tools perform better than those reported in the literature. For example, the TDNN ensembles can cope with evolving systems, non-Markovianity, feedback loops and multicausality. In turn, on the basis of the information derived from the TDNN ensembles, SR via GP allows the identification of the set of equations, i.e. the detailed model of the interacting systems. Numerical tests and real-life examples from various disciplines demonstrate the power and versatility of the developed tools, which are capable of handling tens of time series and even images. The excellent results obtained emphasize the importance of recording the time evolution of signals, which would allow a much better understanding of many issues, ranging from the physical to the social and medical sciences.
Data availability
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.
Appendices
Appendix 1
Theoretical interpretation of the indicators for causality detection
To analyse the properties of the statistical indicators introduced in Sect. 3, let y denote the time series under study, which can be represented by the following generic function:
where i indicates the time axis, \({x}_{j}\) is the j-th driver of the time series, and \(\epsilon\) represents all the stochastic processes influencing the measurements (for example random noise), assumed additive and independent. The causality detection procedure consists of calculating the prediction error in two cases: first using all the expected drivers (“all”), and secondly removing from the network one driver, identified with the letter “k”. The prediction using all the drivers is indicated by \({y}_{p,all}\), while the prediction without the k-th driver is indicated by \({y}_{p,k}\). With this notation, the overall prediction errors can be written as follows:
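For reference, the generic function and the two prediction errors can be written in the compact notation of equation A1 as below. This is a reconstruction consistent with A1, assuming a zero-mean noise term uncorrelated with the regressors and a predictor that reproduces f exactly, so that the residual of the full model is the noise alone:

```latex
\begin{aligned}
y_i &= f\left(y_{i-\Delta_y},\, x_{j,i-\Delta_{x_j}},\, x_{k,i-\Delta_{x_k}}\right) + \epsilon_i \\
E_{all} &= \sum_i \left(y_i - y_{p,all,i}\right)^2 = \sum \epsilon^2 \\
E_k &= \sum_i \left(y_i - y_{p,k,i}\right)^2
     = \sum \left( f\left(y_{i-\Delta_y}, x_{j,i-\Delta_{x_j}}, x_{k,i-\Delta_{x_k}}\right)
       - f\left(y_{i-\Delta_y}, x_{j,i-\Delta_{x_j}}\right) \right)^2 + \sum \epsilon^2
\end{aligned}
```

Here j runs over the drivers other than k, and the cross term between the noise and the difference of the two predictions is taken to vanish.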
From the previous equations, it is evident that \({E}_{k}\) is always larger than or equal to \({E}_{all}\). If \({E}_{k}={E}_{all}\), excluding \({x}_{k}\) from the list of regressors does not lead to any prediction degradation, and therefore \({x}_{k}\) cannot have a causal influence on y. On the contrary, if \({E}_{k}>{E}_{all}\), including \({x}_{k}\) has the effect of improving the prediction and therefore it can be considered to have a causal influence on y.
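The leave-one-driver-out comparison can be illustrated with a minimal Python sketch. It is not the method of the paper: a linear least-squares predictor on lagged values stands in for the TDNN ensemble, and the toy system, its coefficients and the helper names (`lagged_matrix`, `prediction_error`) are assumptions introduced only for illustration.

```python
import numpy as np

def lagged_matrix(series_list, max_lag):
    """Stack lags 1..max_lag of every series as regressor columns."""
    n = len(series_list[0])
    cols = [s[max_lag - lag : n - lag]
            for s in series_list
            for lag in range(1, max_lag + 1)]
    return np.column_stack(cols)

def prediction_error(y, regressors, max_lag):
    """Sum of squared one-step-ahead residuals of a least-squares fit.

    A linear model is used here purely as a stand-in for the TDNN ensemble.
    """
    X = lagged_matrix(regressors, max_lag)
    X = np.column_stack([X, np.ones(len(X))])   # intercept column
    target = y[max_lag:]
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    return float(np.sum((target - X @ coef) ** 2))

# Hypothetical toy system: x1 drives y, x2 does not.
rng = np.random.default_rng(0)
n = 2000
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = np.zeros(n)
for i in range(1, n):
    y[i] = 0.5 * y[i - 1] + 0.8 * x1[i - 1] + 0.1 * rng.normal()

E_all = prediction_error(y, [y, x1, x2], max_lag=2)    # all drivers
E_no_x1 = prediction_error(y, [y, x2], max_lag=2)      # driver x1 removed
E_no_x2 = prediction_error(y, [y, x1], max_lag=2)      # driver x2 removed

print(f"E_k / E_all with x1 removed: {E_no_x1 / E_all:.2f}")
print(f"E_k / E_all with x2 removed: {E_no_x2 / E_all:.4f}")
```

Removing the true driver degrades the prediction markedly, whereas removing the irrelevant one leaves the error essentially unchanged. With nested least-squares fits evaluated on the training data, the reduced model can never have a smaller error than the full one, which mirrors the inequality stated above.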
In the proposed methodology, \({E}_{k}\) and \({E}_{all}\) are estimated with TDNN ensembles. The motivation for this choice is that TDNNs are universal approximators and therefore avoid the need for a priori assumptions on the mathematical form of the function f (contrary to classical causality detection algorithms, such as the various versions of Granger causality detection). Their ensembles also significantly improve the statistical estimators of the prediction errors, i.e. the median value (more robust than the average against fluctuations due to outliers) and the standard deviation of the median errors.
With the estimators proposed in Sect. 3, two major aspects of causality can be tackled. The first is causality detection, which is a binary problem (does \({x}_{k}\) cause y?); this task is accomplished with a hypothesis test on the difference of means. The corresponding indicator Z is calculated as follows:
If \(Z\ge {Z}_{threshold}\), the null hypothesis (that \({x}_{k}\) has no causal influence on y) can be rejected at a significance level α depending on the value of Z; otherwise the null hypothesis cannot be rejected.
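As an illustration of such a test, the sketch below computes a standard two-sample z statistic on the prediction errors of two ensembles. The exact form of Z is not reproduced here, so the formula, the function names and the synthetic error samples are assumptions; in particular, sample means and standard errors are used, while the text also mentions medians as the robust estimator.

```python
import math
import numpy as np

def z_indicator(errors_k, errors_all):
    """Two-sample z statistic for the difference of mean prediction errors.

    Assumed form: (mean_k - mean_all) / sqrt(var_k / n_k + var_all / n_all).
    """
    errors_k = np.asarray(errors_k, dtype=float)
    errors_all = np.asarray(errors_all, dtype=float)
    diff = errors_k.mean() - errors_all.mean()
    se = math.sqrt(errors_k.var(ddof=1) / errors_k.size
                   + errors_all.var(ddof=1) / errors_all.size)
    return diff / se

def one_sided_p_value(z):
    """Upper-tail probability of a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Synthetic ensemble errors (hypothetical values, 50 networks per ensemble).
rng = np.random.default_rng(1)
errors_all = rng.normal(1.00, 0.05, size=50)          # all drivers included
errors_with_effect = rng.normal(1.30, 0.05, size=50)  # a causal driver removed
errors_no_effect = rng.normal(1.00, 0.05, size=50)    # an irrelevant driver removed

Z_THRESHOLD = 1.645  # one-sided critical value of the standard normal at 5%

z_effect = z_indicator(errors_with_effect, errors_all)
z_none = z_indicator(errors_no_effect, errors_all)

for label, z in [("causal driver removed", z_effect),
                 ("irrelevant driver removed", z_none)]:
    verdict = "reject H0" if z >= Z_THRESHOLD else "cannot reject H0"
    print(f"{label}: Z = {z:.2f}, p = {one_sided_p_value(z):.3g} -> {verdict}")
```

The test is one-sided because removing a driver can only be expected to increase the prediction error; a different threshold simply corresponds to a different significance level α.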
The second objective is causality quantification, and it has been tackled with the \({R}_{\sigma }\) indicator, defined as:
\({R}_{\sigma }=\frac{{E}_{k}}{{E}_{all}}=\frac{\sum {\left(f\left({y}_{i-{\Delta }_{y}},{x}_{j,i-{\Delta }_{{x}_{j}}},{x}_{k,i-{\Delta }_{{x}_{k}}}\right)-f\left({y}_{i-{\Delta }_{y}},{x}_{j,i-{\Delta }_{{x}_{j}}}\right)\right)}^{2}+\sum {\epsilon }^{2}}{\sum {\epsilon }^{2}}\) (A1)
In some complex cases, equation A1 can be difficult to calculate. The problem can often be circumvented by using the Taylor expansion with respect to \({x}_{k,i-1},\dots ,{x}_{k,i-{\Delta }_{{x}_{k}}}\):
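To first order, and in the notation of equation A2, the expansion can be sketched as follows. This is a reconstruction: the subscript 0 denotes the reference point of the expansion, at which the contribution of \({x}_{k}\) is taken to be absent, and higher-order terms are neglected.

```latex
f\left(y_{i-\Delta_y},\, x_{j,i-\Delta_{x_j}},\, x_{k,i-\Delta_{x_k}}\right)
\approx
f\left(y_{i-\Delta_y},\, x_{j,i-\Delta_{x_j}}\right)
+ \sum_{t} \left[\frac{\partial f}{\partial x_{k,t}}\right]_{0} \Delta x_{k,t}
```

Substituting this approximation into the numerator of equation A1 leaves only the squared first-order term plus the noise contribution.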
One then obtains:
\({R}_{\sigma }=\frac{{E}_{k}}{{E}_{all}}=\frac{\sum {\left(\sum {\left[\frac{\partial f}{\partial {x}_{k,t}}\right]}_{0}\Delta {x}_{k,t}\right)}^{2}+\sum {\epsilon }^{2}}{\sum {\epsilon }^{2}}\) (A2)
From the above results, it is clear that the higher the value of the indicator, the higher the influence of \({x}_{k}\) on the time series \(y\).
In some cases, the above relations provide an immediate interpretation of the methodology. In the following, we report various cases that have also been analysed in the main text of the paper and can be addressed directly with equation A1.
The first example is the “Coupled AR model”, described by the following equations:
where the objective consists of calculating the influence of x on y. With the ensemble, it is possible to derive:
Then, the \({R}_{\sigma }\) indicator becomes:
Consequently:
Therefore any value of \({R}_{\sigma }\) significantly higher than one indicates a causal link between x and y.
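The behaviour of the indicator in a linear coupled autoregressive setting can be checked numerically with equation A1. The sketch below assumes a generic pair of first-order AR processes in which x drives y but not vice versa; the model form, the coefficients `a_x`, `a_y`, `c` and the noise level `sigma` are hypothetical and are not taken from the paper. The true function f is used in place of a trained predictor.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000
a_x, a_y, c = 0.6, 0.5, 0.4   # hypothetical AR and coupling coefficients
sigma = 0.2                   # hypothetical noise standard deviation

eps_x = sigma * rng.normal(size=n)
eps_y = sigma * rng.normal(size=n)
x = np.zeros(n)
y = np.zeros(n)
for i in range(1, n):
    x[i] = a_x * x[i - 1] + eps_x[i]                   # x evolves on its own
    y[i] = a_y * y[i - 1] + c * x[i - 1] + eps_y[i]    # y is driven by x

def r_sigma(full_pred, reduced_pred, noise):
    """Equation A1: (sum (f_full - f_reduced)^2 + sum eps^2) / sum eps^2."""
    noise_energy = np.sum(noise ** 2)
    return (np.sum((full_pred - reduced_pred) ** 2) + noise_energy) / noise_energy

# Influence of x on y: the reduced function drops the coupling term c * x.
full_y = a_y * y[:-1] + c * x[:-1]
reduced_y = a_y * y[:-1]
r_x_to_y = r_sigma(full_y, reduced_y, eps_y[1:])

# Influence of y on x: x has no dependence on y, so both functions coincide.
full_x = a_x * x[:-1]
reduced_x = a_x * x[:-1]
r_y_to_x = r_sigma(full_x, reduced_x, eps_x[1:])

print(f"R_sigma(x -> y) = {r_x_to_y:.3f}")
print(f"R_sigma(y -> x) = {r_y_to_x:.3f}")
```

For this toy model the indicator for x → y is approximately 1 + c² Var(x)/σ², which grows with the coupling strength, while the indicator for y → x is exactly one because removing y changes nothing in the prediction of x.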
The second example is the Lorenz trivariate case introduced in Sect. 3:
System X:
System Y:
System Z:
where, for the sake of simplicity, the prediction of the system variables, and not of embedded signals, is used. In this case, the \({R}_{\sigma }\) indicators for all the systems can be written as:
where the results are perfectly in line with the expected ones. For example, for case 6 of Table 1 of the main text, the following values have to be assigned to the constants \({c}_{ij}\):
And the \({R}_{\sigma }\) indicators become:
\({R}_{\sigma }\left(Y\to X\right)>1\); \({R}_{\sigma }\left(Z\to X\right)=1\); \({R}_{\sigma }\left(X\to Y\right)=1\); \({R}_{\sigma }\left(Z\to Y\right)>1\); \({R}_{\sigma }\left(X\to Z\right)>1\); \({R}_{\sigma }\left(Y\to Z\right)=1\)
The extension to all the other trivariate cases is immediate.
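The pattern of indicators listed for case 6 maps directly onto a directed causal graph. The following sketch only encodes which indicators are reported as larger than one; the helper names `causal_edges` and `follow` are introduced here for illustration.

```python
# True where the text reports R_sigma > 1, False where it reports R_sigma = 1.
# Keys are (driver, target) pairs.
exceeds_one = {
    ("Y", "X"): True,  ("Z", "X"): False,
    ("X", "Y"): False, ("Z", "Y"): True,
    ("X", "Z"): True,  ("Y", "Z"): False,
}

def causal_edges(indicator_flags):
    """Return the directed edges (driver, target) flagged as causal."""
    return sorted(edge for edge, flag in indicator_flags.items() if flag)

def follow(edges, start, steps):
    """Follow the outgoing edge of each node for a given number of hops.

    Assumes every node has exactly one outgoing edge, as in this case.
    """
    successor = {src: dst for src, dst in edges}
    path = [start]
    for _ in range(steps):
        path.append(successor[path[-1]])
    return path

edges = causal_edges(exceeds_one)
path = follow(edges, "X", 3)

print("causal edges:", edges)
print("path from X:", " -> ".join(path))
```

Following the edges from X returns to X after three hops, so this case corresponds to a closed loop in which each system drives exactly one other system.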
About this article
Cite this article
Murari, A., Rossi, R. & Gelfusa, M. Combining neural computation and genetic programming for observational causality detection and causal modelling. Artif Intell Rev 56, 6365–6401 (2023). https://doi.org/10.1007/s10462-022-10320-3
DOI: https://doi.org/10.1007/s10462-022-10320-3