This blog summarizes research trends based on published paper abstracts. Specific numbers or findings may contain inaccuracies. For scholarly rigor, always consult the original papers cited in each post.
Correlation is not causation. Everyone knows that. But how do you actually discover causation? If you observe two variables moving together in a complex system---say, sea surface temperatures in the western and eastern Pacific---how do you determine which one drives the other, and over what timescale?
A new framework borrowed from weather forecasting does it by running the clock backwards.
The three papers reviewed here represent two complementary axes of modern causal inference for time series. On one axis, Andreou, Chen, and Bollt reformulate Bayesian data assimilation---the mathematical engine behind modern weather prediction---as a causal discovery tool. On the other, Zhang et al. tackle the same causal discovery problem from a data-driven machine learning perspective, handling latent confounders and shifting dynamics that plague classical Granger causality. Together, they sketch a surprisingly complete picture of where causal inference for dynamical systems stands today.
The Core Idea: Causation as an Inverse Problem
Weather forecasting is a forward problem: given initial conditions, predict the future. Data assimilation is its inverse: given noisy observations scattered across time and space, reconstruct the hidden state that produced them. This inverse-problem perspective is what Andreou, Chen, and Bollt exploit.
Their Assimilative Causal Inference (ACI) framework, published in Nature Communications, starts from a simple insight. If variable X causally influences variable Y, then observing Y should improve our ability to reconstruct the past trajectory of X through data assimilation. The degree of this improvement---quantified by comparing posterior uncertainties with and without the observations---measures the strength of the causal link.
This is not merely metaphorical. The authors formulate causal influence as a rigorous Bayesian quantity: the reduction in posterior uncertainty of the cause variable when effect observations are assimilated, compared to when they are withheld. The framework inherits the full mathematical machinery of Bayesian filtering and smoothing, including its ability to handle nonlinear dynamics, partial observations, and non-Gaussian noise.
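The uncertainty comparison described above can be illustrated on a toy problem. The sketch below is not the authors' implementation: it assumes a hypothetical two-variable, discrete-time linear-Gaussian model in which only y is observed, and it uses half the log ratio of filter to smoother variance as an illustrative measure of how much future observations of y sharpen the estimate of x. In a linear-Gaussian model the posterior covariances do not depend on the observed values, so no data need to be simulated.

```python
import numpy as np

def x_variances(c, n_steps=200):
    """Filter and smoother variances of x over time when only y is observed.

    Toy model (an assumption for illustration): z = [x, y],
    z[k+1] = A z[k] + process noise, observation = y[k] + measurement noise.
    """
    A = np.array([[0.9, 0.0],
                  [c,   0.5]])        # c is the coupling from x into y
    Q = 0.1 * np.eye(2)               # process-noise covariance
    H = np.array([[0.0, 1.0]])        # only y is measured
    R = np.array([[0.05]])            # measurement-noise variance

    P = np.eye(2)                     # prior covariance of the initial state
    pred, filt = [], []
    for _ in range(n_steps):          # forward pass: Kalman filter covariances
        P_pred = A @ P @ A.T + Q
        K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + R)
        P = (np.eye(2) - K @ H) @ P_pred
        pred.append(P_pred)
        filt.append(P)

    smooth = [None] * n_steps         # backward pass: RTS smoother covariances
    smooth[-1] = filt[-1]
    for k in range(n_steps - 2, -1, -1):
        G = filt[k] @ A.T @ np.linalg.inv(pred[k + 1])
        smooth[k] = filt[k] + G @ (smooth[k + 1] - pred[k + 1]) @ G.T

    return (np.array([P[0, 0] for P in filt]),
            np.array([P[0, 0] for P in smooth]))

def uncertainty_reduction(c, k=100):
    """Entropy reduction (nats) for x[k] from adding future y observations."""
    f, s = x_variances(c)
    return 0.5 * np.log(f[k] / s[k])

coupled = uncertainty_reduction(c=0.8)    # x drives y
uncoupled = uncertainty_reduction(c=0.0)  # x and y evolve independently
print(coupled, uncoupled)
```

When the coupling is switched off, observing y says nothing about x, so the filter and smoother variances of x coincide and the measure is zero. With the coupling on, future values of y carry information about the current x and the measure is positive.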
Causal Influence Range: How Far Does a Cause Reach?
One of the most striking concepts introduced across the first two papers is the Causal Influence Range (CIR)---a measure of how far into the future (or how far back into the past) a causal link extends.
Classical causal inference methods typically answer binary questions: does X cause Y, yes or no? CIR goes further. It answers: for how long does X cause Y? In a climate system where ocean temperatures influence atmospheric circulation over months or years, this temporal dimension of causality is essential.
Andreou and Chen's companion paper in SIAM/ASA Journal on Uncertainty Quantification develops this idea fully, introducing two complementary measures:
| Concept | Definition | Analogy |
|---|---|---|
| Forward CIR | How far into the future does a cause at time t continue to influence the effect? | Future light cone in relativity |
| Backward CIR | How far back in time must we look to find when the cause of a current effect began? | Past light cone in relativity |
The relativistic analogy is not decorative. Just as a light cone in spacetime defines the set of events that can causally affect (or be affected by) a given event, the forward and backward CIR define the temporal boundary of causal influence in a dynamical system. Events outside the CIR are causally disconnected---no matter how correlated they may appear.
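To make the idea of a temporal boundary concrete, here is a minimal sketch that turns a lag-dependent causal strength curve into a single range. The function name `influence_range`, the threshold rule, and the exponentially decaying curve are all assumptions made for illustration. The papers define CIR within the ACI framework itself, and this sketch does not reproduce that definition.

```python
import numpy as np

def influence_range(strength, dt, threshold):
    """Duration of the initial run of lags whose strength stays at or above threshold.

    strength[i] is a (hypothetical) causal strength at lag i * dt.
    """
    strength = np.asarray(strength, dtype=float)
    below = np.flatnonzero(strength < threshold)
    n_above = below[0] if below.size else strength.size
    return n_above * dt

# Hypothetical strength curve that decays with lag (time constant of 5 units).
lags = np.arange(0, 50)
strength = np.exp(-lags / 5.0)

cir = influence_range(strength, dt=1.0, threshold=0.1)
print(cir)
```

The same function applies to a forward or a backward range, depending on whether the lags index the future of the cause or the past of the effect.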
ENSO: A Six-Variable Causal Laboratory
The ACI framework is demonstrated on the El Niño-Southern Oscillation (ENSO), arguably the most consequential climate oscillation on Earth. The authors construct a six-variable model capturing the essential thermodynamic and dynamical couplings between the western and eastern Pacific.
The results are both intuitive and informative. The strongest ACI signal runs from the thermocline depth in the western Pacific (T_C) to the sea surface temperature in the eastern Pacific (T_E)---consistent with the established physical understanding that subsurface heat transport drives ENSO events. Meanwhile, the variable h_W (western Pacific thermocline depth) exhibits the longest CIR to T_E, suggesting that western Pacific subsurface anomalies influence eastern Pacific surface temperatures over extended periods.
The framework also handles extreme events---a critical capability for climate science. Using nonlinear dyad models, the authors demonstrate that ACI can detect causal links that activate only during extreme-event regimes, capturing the asymmetry between El Niño and La Niña dynamics that linear methods miss entirely.
Closed-Form Solutions for Tractable Causality
A practical concern with any Bayesian framework is computational cost. Full nonlinear Bayesian filtering requires either particle methods (expensive) or approximations (potentially inaccurate). Andreou and Chen address this in their SIAM/ASA paper by deriving closed-form analytical solutions for CIR in a class of models called Conditional Gaussian Nonlinear Systems (CGNS).
CGNS models are structured so that some variables are conditionally Gaussian given others, even when the full system is nonlinear. This structure is common in geophysical systems---atmospheric variables often follow approximately Gaussian distributions conditioned on slowly varying ocean states. For CGNS models, the posterior distributions required for CIR computation can be obtained exactly, without Monte Carlo sampling.
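For readers who want the structure written out, the form below is the one commonly used in the CGNS literature. The symbols are assumptions of this sketch and may differ from the notation in the paper: X collects the observed variables, Y the unobserved ones, and W_1, W_2 are independent Wiener processes.

$$
\begin{aligned}
\mathrm{d}X &= \left[A_0(X,t) + A_1(X,t)\,Y\right]\mathrm{d}t + B_1(X,t)\,\mathrm{d}W_1,\\
\mathrm{d}Y &= \left[a_0(X,t) + a_1(X,t)\,Y\right]\mathrm{d}t + b_2(X,t)\,\mathrm{d}W_2.
\end{aligned}
$$

The coefficients may depend nonlinearly on X, but Y enters only linearly, so the distribution of Y(t) conditioned on the observed path of X up to time t is Gaussian. Its mean \mu and covariance R satisfy closed-form equations:

$$
\begin{aligned}
\mathrm{d}\mu &= \left(a_0 + a_1\mu\right)\mathrm{d}t + R A_1^{*}\left(B_1 B_1^{*}\right)^{-1}\left[\mathrm{d}X - \left(A_0 + A_1\mu\right)\mathrm{d}t\right],\\
\mathrm{d}R &= \left[a_1 R + R a_1^{*} + b_2 b_2^{*} - R A_1^{*}\left(B_1 B_1^{*}\right)^{-1} A_1 R\right]\mathrm{d}t.
\end{aligned}
$$

These are ordinary and stochastic differential equations for a mean and a covariance rather than for a full probability density, which is why no Monte Carlo sampling is needed.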
The paper demonstrates these closed-form solutions on three physically meaningful cases:
Tipping-point dynamics: Where a slow variable approaches a critical threshold and abruptly shifts the system's behavior. CIR reveals how early the causal signal of an approaching tipping point appears in the data.
Multiscale atmosphere-ocean coupling: Where fast atmospheric fluctuations interact with slow ocean dynamics. Forward and backward CIR capture the asymmetric timescales of influence.
Lorenz-84 atmospheric blocking: A canonical model for persistent weather patterns. CIR quantifies how long a blocking event causally constrains downstream atmospheric evolution.
The Other Axis: Data-Driven Causal Discovery Under Confounding
While ACI operates within a model-based, continuous-time paradigm rooted in physics, many real-world causal discovery problems lack a known dynamical model. Time series from industrial processes, financial markets, or sensor networks come without governing equations. For these settings, Granger causality---the idea that X causes Y if past values of X help predict Y beyond what Y's own past provides---has been the workhorse method since the 1960s.
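The Granger idea can be sketched in a few lines. The example below assumes a linear autoregressive setting and hypothetical simulated data in which x drives y with a one-step delay. It compares a restricted model (y's own past) with a full model (y's past plus x's past) and reports the log ratio of their residual sums of squares. The function names are illustrative, and this is the classical linear version, not the method of any of the reviewed papers.

```python
import numpy as np

def lagged(series, p):
    """Columns are the series at lags 1..p, aligned with series[p:]."""
    n = len(series)
    return np.column_stack([series[p - k : n - k] for k in range(1, p + 1)])

def granger_strength(x, y, p=2):
    """log(RSS_restricted / RSS_full) for predicting y.

    Larger values mean the past of x improves the prediction of y
    beyond what the past of y already provides.
    """
    target = y[p:]
    ones = np.ones((len(target), 1))
    restricted = np.hstack([ones, lagged(y, p)])   # y's own past only
    full = np.hstack([restricted, lagged(x, p)])   # plus x's past
    rss = []
    for design in (restricted, full):
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        rss.append(resid @ resid)
    return float(np.log(rss[0] / rss[1]))

# Hypothetical data: x is autoregressive, y depends on its own past and on x.
rng = np.random.default_rng(0)
n = 2000
x = np.zeros(n)
y = np.zeros(n)
for t in range(1, n):
    x[t] = 0.7 * x[t - 1] + rng.normal()
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + rng.normal()

x_to_y = granger_strength(x, y)   # does x's past help predict y?
y_to_x = granger_strength(y, x)   # does y's past help predict x?
print(x_to_y, y_to_x)
```

In this setup the x-to-y strength is clearly positive while the y-to-x strength stays near zero. A formal test would compare the two fits with an F-statistic rather than reading off the raw log ratio.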
But Granger causality has well-known vulnerabilities. Latent confounders (unobserved common causes) can create spurious Granger-causal links. Unknown interventions (regime changes, equipment replacements, policy shifts) can mask or distort genuine causal relationships. In practice, both problems occur simultaneously.
Zhang, Ren, Qian, and Duffield's InvarGC (Invariant Granger Causality) tackles both problems at once through a four-module architecture:
Latent Confounder Identification Module (LCIM): Extracts latent confounding factors from the observed time series using a variational approach.
Intervention Identification Module: Detects unknown intervention points---moments when the data-generating process changes---without requiring prior knowledge of when or how interventions occurred.
Invariant Granger Causality Module: Identifies causal relationships that remain stable across the detected intervention regimes. This invariance criterion is the key: if X Granger-causes Y in every regime, the link is more likely genuine than confounded.
Prediction Module: Validates the discovered causal structure through forecasting performance.
The theoretical contribution is substantial. The authors prove three identifiability theorems establishing that their method can correctly recover causal structure even when confounders are present and interventions are unknown. On synthetic data, InvarGC achieves AUROC and AUPRC scores of 1.0---perfect recovery of the true causal graph. On real-world benchmarks---the Tennessee Eastman Process (TEP) industrial dataset and the Causal-Rivers hydrological dataset---InvarGC sets new state-of-the-art performance.
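The invariance criterion behind the third module can be shown in isolation. The sketch below assumes that a causal strength has already been estimated for every ordered pair of variables within each regime, and it keeps only the links that clear a threshold in all regimes. The function name, the threshold, and the numbers are hypothetical. InvarGC itself learns confounders, regimes, and causal structure with neural components, which this sketch does not attempt to reproduce.

```python
import numpy as np

def invariant_links(strength_by_regime, threshold):
    """Boolean adjacency matrix of links that hold in every regime.

    strength_by_regime has shape (n_regimes, n_vars, n_vars); entry
    [r, i, j] is the estimated strength of the link i -> j in regime r.
    """
    s = np.asarray(strength_by_regime, dtype=float)
    return np.all(s > threshold, axis=0)

# Hypothetical strengths for two variables across two regimes.
strengths = np.array([
    [[0.0, 0.60], [0.02, 0.0]],   # regime 0
    [[0.0, 0.50], [0.40, 0.0]],   # regime 1
])

stable = invariant_links(strengths, threshold=0.1)
print(stable)
```

Here the link from variable 0 to variable 1 is strong in both regimes and is kept, while the reverse link appears only in regime 1 and is discarded as unstable.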
Two Axes, One Problem
The complementarity between these papers is worth making explicit. ACI and InvarGC are solving the same fundamental problem---identifying causal relationships in temporal data---but from opposite directions:
| Dimension | ACI (Papers 1 & 2) | InvarGC (Paper 3) |
|---|---|---|
| Paradigm | Model-based, physics-informed | Data-driven, model-agnostic |
| Time domain | Continuous-time dynamics | Discrete-time series |
| Primary tool | Bayesian data assimilation | Granger causality + invariance |
| Handles nonlinearity | Via nonlinear state-space models | Via neural network components |
| Handles confounders | Through full state estimation | Through explicit latent factor extraction |
| Handles regime changes | Via extreme-event ACI | Via intervention identification |
| Temporal resolution | Causal Influence Range (continuous) | Lag-based Granger (discrete) |
| Primary application | Climate, geophysics | Industrial processes, sensor networks |
Neither approach subsumes the other. When a governing physical model is available---as in climate science, fluid dynamics, or neuroscience---ACI leverages that model to extract richer causal information, including the temporal extent of causal influence. When no model is available and the data come with unknown confounders and regime shifts, InvarGC provides a principled discovery procedure that goes beyond classical Granger causality.
The ideal causal analysis of a complex system would deploy both: model-based ACI where physics is known, data-driven InvarGC where it is not, and cross-validation between the two where domains overlap.
Open Questions
Scalability of ACI. The ENSO demonstration uses six variables. Real climate models involve millions of state variables. Can ACI scale to high-dimensional systems, or does it require careful dimensionality reduction that may itself introduce causal artifacts?
CIR for non-CGNS systems. Closed-form CIR solutions are available only for CGNS models. For fully nonlinear systems, particle filter-based CIR computation may be prohibitively expensive. Variational or neural surrogate approaches could bridge this gap.
InvarGC with continuous dynamics. InvarGC operates on discrete time series. Extending the invariance principle to continuous-time settings---perhaps through neural ODEs or stochastic differential equations---could unify it with the ACI framework.
Causal discovery in partially observed systems. Both approaches assume that the relevant variables are observed, even if confounders are latent. In many real systems, the causal variables themselves may be unobserved or measured only through noisy proxies. Combining ACI's state estimation with InvarGC's confounder extraction is a natural but unexplored direction.
Validation beyond benchmarks. Perfect AUROC on synthetic data and SOTA on standard benchmarks are encouraging, but causal discovery methods ultimately need validation through interventional experiments. Can the causal graphs discovered by these methods guide experiments that confirm the predicted causal links?
Closing Reflection
Causal inference has long been split between two cultures. One, rooted in statistics and econometrics, works with observational data and worries about confounders. The other, rooted in physics and dynamical systems, works with mechanistic models and worries about state estimation. These three papers suggest the split is closing.
ACI shows that the machinery of weather forecasting---Bayesian data assimilation, state-space models, filtering and smoothing---is also machinery for causal discovery. InvarGC shows that the machinery of robust machine learning---latent variable extraction, invariance across environments, neural prediction---addresses the same causal questions from a complementary angle. The concept of Causal Influence Range adds a temporal dimension that neither classical Granger causality nor Pearl's graphical framework naturally provides.
The next step is integration. A causal inference framework that combines model-based temporal attribution with data-driven robustness to confounders and regime shifts would be more than the sum of its parts. These papers point the way.