Abstract
Complex systems with many interacting nodes are inherently stochastic and best described by stochastic differential equations. Despite increasing observation data, inferring these equations from empirical data remains challenging. Here, we propose the Langevin graph network approach to learn the hidden stochastic differential equations of complex networked systems, outperforming five state-of-the-art methods. We apply our approach to two real systems: bird flock movement and tau pathology diffusion in brains. The inferred equation for bird flocks closely resembles the second-order Vicsek model, providing unprecedented evidence that the Vicsek model captures genuine flocking dynamics. Moreover, our approach uncovers the governing equation for the spread of abnormal tau proteins in mouse brains, enabling early prediction of tau occupation in each brain region and revealing distinct pathology dynamics in mutant mice. By learning interpretable stochastic dynamics of complex systems, our findings open new avenues for downstream applications such as control.
Introduction
The behaviors of complex systems, ranging from cell migration to pathological protein diffusion, brain activity, human mobility, and bird flocking, exhibit not only nonlinearity but also stochasticity^{1,2,3}. Stochasticity plays a crucial role in enhancing a system’s adaptability to rapidly changing environments^{4,5}, facilitating information processing^{6,7}, and increasing robustness^{8,9}. The emergence of order from disorder has long fascinated scientists, particularly in the context of system dynamics. While the behaviors of complex systems can be observed experimentally, their underlying dynamics remain elusive. Therefore, stochastic differential equations (SDEs) have been widely employed to model such stochastic systems due to their ability to simultaneously describe deterministic evolution and random fluctuations stemming from unresolved degrees of freedom.
However, conventional SDE models used to describe real-world scenarios have certain limitations, such as predefined forms, simplified physics, and assumed parameter values. Fortunately, the increasing availability of empirical data, including network topologies and node activities, provides an opportunity to shift this paradigm. Instead of modeling the dynamics of a complex system with a predefined SDE, it becomes possible to infer the hidden SDE from observational data on system behaviors.
Discovering the governing laws of dynamics from data has become a prominent field of artificial intelligence-empowered scientific exploration^{10,11,12,13}, making significant progress in recent years^{14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29}. Numerous data-driven methods have been proposed to identify ordinary differential equations (ODEs) and partial differential equations for single- and few-body nonlinear systems^{14,15,16}, as well as ODEs for large networks^{17,18,19}. However, these methods may not effectively address real systems exhibiting stochasticity. Previous efforts to learn stochastic dynamics have primarily focused on predicting a system's future evolution rather than inferring its underlying SDE^{20}. Additionally, the majority of previous methods have been validated on simulated systems with known ground-truth dynamics^{24,25}, and few have demonstrated the ability to infer real stochastic systems with unknown underlying dynamics (with exceptions such as refs.^{27,28}).
Here, we aim to address a fundamental question: given the observations of network topology and nodes’ activity series, how can we infer the coupled SDEs that capture the hidden stochastic dynamics of a complex system? The main contributions of our work are summarized as follows:

1.
We propose a method termed the Langevin Graph Network Approach (LaGNA) that incorporates an innovative message-passing mechanism to separate dynamical sources within nodal activity data. This method subsequently infers concise mathematical expressions for each of these dynamical sources by leveraging corresponding neural network modules. Comparative analyses showcase our method's proficiency in effectively unveiling the hidden coupled SDEs of complex networked systems, demonstrating superior performance compared to five state-of-the-art methodologies in the field.

2.
We apply our method to natural flocking, an intriguing phenomenon and important research topic in the community of statistical physics and complex systems. From the trajectory data of several flocks, our method successfully infers the SDE of real flocking dynamics. The inferred SDE exhibits a remarkable resemblance to the second-order Vicsek model, providing unprecedented evidence that the Vicsek model is not just a toy model but captures genuine flocking dynamics.

3.
We apply our method to the spreading of pathological tau protein in Alzheimer's disease (AD) brains, a frontier problem in neuroscience. From the experimental data of tau pathology in AD mouse brains, our method successfully infers a novel SDE that captures the tau diffusion dynamics. The finding not only enables early-stage prediction of the percentage of brain areas that will be affected by tau pathology but also offers novel quantitative insights into the mechanism of tau pathology.
Results
Overview of the LaGNA framework
The state evolution of a complex system is often driven by several dynamical sources, including the self-dynamics of each node, the interactions between nodes, and intrinsic stochastic diffusion. In the first stage of LaGNA (Fig. 1a–d), we design a message-passing mechanism guided by the complex network's nontrivial topology. The message-passing mechanism consists of three neural network (NN) modules: the self-dynamics simulator \(\hat{\mathbf{f}}(\cdot)\), the interaction dynamics simulator \(\hat{\mathbf{g}}(\cdot)\), and the diffusion simulator \(\hat{\boldsymbol{\phi}}(\cdot)\), tailored to separate the dynamical sources hidden in nodes' activity data (Fig. 1b, d) and differing from the mechanism used in graph neural networks^{30}.
Each node i's activity at time t is denoted as a d-dimensional vector \(\mathbf{x}_i(t) \equiv (x_{i,1}(t), x_{i,2}(t), \ldots, x_{i,d}(t))^{\mathrm{T}}\). Given the input of nodes' activities x_{i}(t), i = 1, 2, …, n (Fig. 1a), LaGNA estimates the states at the next time step \(\hat{\mathbf{x}}_i(t+\mathrm{d}t)\) (Fig. 1c) using the following equation:

$$\hat{\mathbf{x}}_i(t+\mathrm{d}t) = \mathbf{x}_i(t) + \Big[\hat{\mathbf{f}}(\mathbf{x}_i(t)) + \sum_{j=1}^{n} A_{ij}\, \hat{\mathbf{g}}\big(\mathbf{x}_i(t), \mathbf{x}_j(t)\big)\Big]\mathrm{d}t + \hat{\boldsymbol{\phi}}(\mathbf{x}_i(t))\, \mathbf{W}_t. \qquad (1)$$
Here, A_{ij} is the adjacency matrix representing the network topology, and W_{t} is the d-dimensional vector representing the Wiener process (i.e., normally distributed around zero with variance dt). Note that the form of Eq. (1) can describe a wide range of complex dynamical systems^{31,32,33}, including epidemic spreading, neuronal dynamics, ecological dynamics, gene regulation, as well as flocking and tau pathology diffusion, as we will show below. The current stage of LaGNA can be viewed as an implicit dynamical system with a large number of trainable parameters: θ_{f}, θ_{g}, and θ_{ϕ}, representing the self, interaction, and diffusion simulators, respectively. To capture the underlying dynamics of a given complex system, LaGNA's outputs \(\hat{\mathbf{x}}_i(t+\mathrm{d}t)\) need to exhibit behavior similar to the true observation x_{i}(t + dt). Due to the intrinsic stochasticity, directly minimizing the difference between \(\hat{\mathbf{x}}_i(t+\mathrm{d}t)\) and x_{i}(t + dt) would result in overfitting. Therefore, we train LaGNA with observation pairs x_{i}(t) and x_{i}(t + dt), and obtain its optimal parameters by maximizing instead the expectation:

$$\max_{\boldsymbol{\theta}_f, \boldsymbol{\theta}_g, \boldsymbol{\theta}_\phi} \; \mathbb{E}\Big[\log p_{\boldsymbol{\theta}_f, \boldsymbol{\theta}_g, \boldsymbol{\theta}_\phi}\big(x_i(t+\mathrm{d}t) \mid \mathbf{x}(t)\big)\Big], \qquad (2)$$
where \({p}_{{{{{{{\boldsymbol{\theta }}}}}}}_{f},{{{{{{\boldsymbol{\theta }}}}}}}_{g},{{{{{{\boldsymbol{\theta }}}}}}}_{\phi }}\) is the probability density of the normal distribution generated by the model of Fig. 1b with parameters θ_{f}, θ_{g}, θ_{ϕ}. Note that Eq. (2) describes the case of d = 1; refer to the “Methods” section for situations d > 1.
The well-trained LaGNA has the ability to predict future behaviors; however, it lacks an explicit equation describing the underlying dynamics of the system. In the second stage, we aim to unveil the inner workings of the LaGNA black box. The tailored message-passing mechanism (Fig. 1d) has separated the underlying dynamics into three neural network modules, namely \(\hat{\mathbf{f}}(\cdot)\), \(\hat{\mathbf{g}}(\cdot)\), and \(\hat{\boldsymbol{\phi}}(\cdot)\). This decomposition allows us to penetrate each module, deriving explicit expressions for the three parts. Using preconstructed comprehensive libraries of terms, i.e., L_{F}, L_{G}, and L_{Φ} shown in Supplementary Information Section IB, we identify the optimal combination of terms from the libraries using a modified version of our two-phase approach^{17}. Our framework successfully separates and identifies concise mathematical expressions for the self-dynamics, interaction dynamics, and intrinsic stochastic diffusion, respectively, which together form the final stochastic differential equation (Fig. 1e–g). LaGNA enables the balance of accuracy and complexity of mathematical expressions (see Supplementary Information Section IIIC), becoming an interpretable learner for discovering the hidden SDEs of complex networked systems. Further details are described in the Methods and Supplementary Information Section I.
Learning the stochastic dynamics of signed and weighted networks
Signed and weighted networks are prevalent in various biological and physical systems. In neuronal systems, for instance, the synapses between neurons can be either excitatory, enhancing the activity of the receiving neuron, or inhibitory, reducing activity. In physical systems like power grids and traffic networks, link weights play a crucial role in system characterization. The combined effect of heterogeneous links in these networks makes interaction intricate and poses challenges in dynamics inference.
To address the challenge of interaction heterogeneity and validate the effectiveness of our framework, we simulate a stochastic system with Hindmarsh-Rose (HR) neuronal dynamics on a signed network^{34} (refer to Supplementary Information Section IIB.3). In the simulations, we randomly assign half of the nodes as excitatory and the other half as inhibitory. The links from excitatory nodes show excitability with V_{syn} = 2, while the links from inhibitory nodes show inhibition with V_{syn} = −1.5. To infer the hidden HR dynamics, we incorporate the knowledge of link types and utilize two NNs for estimating excitatory and inhibitory interaction dynamics, respectively. It is worth noting that we use only one trial of the nodes' activity sequence. The results in Fig. 2b–d show that our framework accurately estimates the terms of self, diffusion, and, notably, the two types of interactions. The inferred stochastic differential equations (SDEs) successfully reproduce the force field (Fig. 2e) and the stochastic trajectory (Fig. 2f). Additionally, we consider a weighted network A_{ij}, where 0 ≤ A_{ij} ≤ 1, and simulate the network dynamics using stochastic Rössler equations^{35} (refer to Supplementary Information Section IIB.2). The results in Fig. 2h–j demonstrate the applicability of our framework to the stochastic Rössler dynamics on weighted networks. The trajectory generated by the inferred SDEs exhibits dynamical characteristics similar to the original trajectory (Fig. 2g, l), and the reproduced force field closely aligns with the true force field (Fig. 2k).
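The simulated systems above all share the generic networked-SDE form that LaGNA targets. As a minimal illustration of how such training trajectories can be generated, the sketch below integrates a networked SDE with the Euler-Maruyama scheme; the linear self-decay, diffusive coupling, and constant noise amplitude in the toy example are illustrative assumptions, not the HR or Rössler dynamics used in the paper:

```python
import numpy as np

def simulate_networked_sde(F, G, Phi, A, x0, dt=1e-3, steps=1000, seed=0):
    """Euler-Maruyama integration of
    x_i(t+dt) = x_i(t) + [F(x_i) + sum_j A_ij * G(x_i, x_j)] * dt + Phi(x_i) * W_t,
    with W_t ~ N(0, dt), for scalar node states (d = 1)."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    n = len(x)
    traj = [x.copy()]
    for _ in range(steps):
        # drift: self-dynamics plus network-weighted pairwise interactions
        drift = np.array([F(x[i]) + sum(A[i, j] * G(x[i], x[j]) for j in range(n))
                          for i in range(n)])
        # diffusion: state-dependent amplitude times a Wiener increment
        diffusion = np.array([Phi(x[i]) for i in range(n)])
        x = x + drift * dt + diffusion * rng.normal(0.0, np.sqrt(dt), size=n)
        traj.append(x.copy())
    return np.array(traj)

# toy two-node example: linear self-decay, diffusive coupling, weak constant noise
A = np.array([[0.0, 1.0], [1.0, 0.0]])
traj = simulate_networked_sde(F=lambda xi: -xi,
                              G=lambda xi, xj: xj - xi,
                              Phi=lambda xi: 0.1,
                              A=A, x0=[1.0, -1.0], steps=500)
```

With the decaying drift, the two nodes relax toward zero while the noise term keeps the trajectory fluctuating, mimicking the intrinsic stochasticity that LaGNA must separate from the deterministic parts.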
To underscore the significance of our LaGNA method in inferring the stochastic dynamics of complex networked systems, we compare LaGNA with five state-of-the-art methods, namely Modified-SINDy^{29}, Two-Phase inference^{17}, SDENet^{25}, SVISE^{26}, and SFI^{28}, utilizing the stochastic Lorenz networked system (refer to Supplementary Information Section IIB.1). In this model system, each node's state is represented by a three-dimensional vector x_{i} = (x_{i,1}, x_{i,2}, x_{i,3}), where the intrinsic stochastic diffusion in one dimension (e.g., x_{i,2}) can be influenced by another (e.g., x_{i,3}). The stochastic intensity is denoted by \(1/\sqrt{\gamma}\), with a smaller γ indicating higher stochasticity. As shown in Fig. 2m, among the evaluated methods, LaGNA demonstrates significantly reduced errors in inferring the networked SDE, outperforming the other methods by two orders of magnitude. Although SDENet, SVISE, and SFI can effectively estimate drift and diffusion, explicit expressions for the networked SDEs remain elusive for these three methods. This challenge arises from the fact that the drift field of networked SDEs encompasses both self-dynamics and interaction effects. In contrast, LaGNA incorporates three specifically designed neural network modules, especially the message-passing module defined on links to capture the interaction dynamics, which together distinguish between self, interaction, and diffusion effects in observation data. This distinction enables accurate learning of networked SDEs from stochastic trajectories, effectively overcoming the limitations of previous methods (see Supplementary Information Section V for further comparisons).
Note that there is an interesting method recently introduced for learning macroscopic dynamical descriptions of stochastic dissipative systems^{27}. However, the objective of this method differs from ours, as it aims to capture coarse-grained macroscopic behavior rather than the node-level microscopic dynamics required for our exploration of the real complex systems in the following sections. Due to the disparity in objectives and outputs, this method is not included in the comparison tests.
Learning the dynamics of empirical bird flocks
Collective motion and swarming are fascinating phenomena widely observed in nature^{3,36,37}, such as bird and fish flocks, cell motions, and bacterial colonies. Understanding how individuals interact when large numbers of them move together in groups without colliding, or even perform tasks together, like hunting^{3,38}, has been a topic of widespread discussion. The prevailing consensus is that the condensation of individuals results from their keeping consistent with their neighbors in velocity, known as alignment, as well as their tendency to maintain a close distance while avoiding collisions, known as cohesion^{3,39}.
To discover the underlying dynamics from the flocking trajectories, we extend the internal architecture of LaGNA based on the above hypothesis. Specifically, we implement a second-order version by setting up three specialized NNs for simulating the self-propulsion, cohesion, and alignment, respectively. We modify the loss function to be the sum of the negative log-likelihood loss \(\mathcal{L}_{\mathrm{nl}}\) and three prediction errors:

$$\mathcal{L} = \beta_1 \mathcal{L}_{\mathrm{nl}} + \beta_2 \mathcal{L}_{\mathrm{r}} + \beta_3 \mathcal{L}_{\mathrm{v}} + \beta_4 \mathcal{L}_{\mathrm{a}}, \qquad (3)$$
where β_{1}, β_{2}, β_{3}, β_{4} are hyperparameters balancing the different parts of the loss, and \(\mathcal{L}_{\mathrm{r}}\), \(\mathcal{L}_{\mathrm{v}}\), and \(\mathcal{L}_{\mathrm{a}}\) are the squared errors between the predicted and true displacements, velocities, and accelerations, respectively. To validate the potency of the extended framework, we generate a 20-bird flocking system with the three-dimensional Vicsek model. The results show that our framework accurately estimates the self-propulsion, cohesion, and alignment strengths of the Vicsek model system, as shown in Fig. 3a–e. The inferred second-order SDEs successfully regenerate the collective behaviors, as detailed in Supplementary Information Section IIB.4.
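The weighted composition of the loss above can be sketched in a few lines; the function name and the equal default weights are hypothetical, and in practice the negative log-likelihood term comes from the trained likelihood model while the β's are tuned:

```python
import numpy as np

def flock_loss(nll, r_pred, r_true, v_pred, v_true, a_pred, a_true,
               betas=(1.0, 1.0, 1.0, 1.0)):
    """Total loss = beta1 * L_nl + beta2 * L_r + beta3 * L_v + beta4 * L_a,
    where L_r, L_v, L_a are mean squared errors on displacements,
    velocities, and accelerations (illustrative weighting)."""
    b1, b2, b3, b4 = betas
    L_r = np.mean((r_pred - r_true) ** 2)
    L_v = np.mean((v_pred - v_true) ** 2)
    L_a = np.mean((a_pred - a_true) ** 2)
    return b1 * nll + b2 * L_r + b3 * L_v + b4 * L_a
```

When all predictions match the observations exactly, the three error terms vanish and only the likelihood term remains, so the β's control how strongly the kinematic errors regularize the likelihood fit.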
Furthermore, we apply our extended second-order SDE inference framework to learn the empirical dynamics of bird flocks. The dataset consists of four sets of homing flights collected from homing pigeons equipped with GPS devices^{40}. The pigeons were released approximately 15 km away from their home loft, and the GPS devices recorded their locations during the return journey every 0.2 seconds. Because there is little variation in the vertical direction of the bird flocks, we primarily focus on the movement in the horizontal plane and perform the following data preprocessing steps: spline interpolation to a time step of 0.01 for data augmentation; normalization of the data; extraction of the time period after takeoff and before descent, when collective behaviors are most prominent; and alignment of the coordinates. Note that some pigeons exhibited outliers, so their data were removed. In the end, the first flock set contains 8 pigeons, and the other three sets contain 7 pigeons each.
Using the extended framework, we successfully learn the self-propulsion, alignment, and cohesion parts based on one of the four flock datasets. The estimated strengths exhibit a close correspondence with specific scaled functions. Specifically, the alignment strength matches \(\hat{\mathcal{A}} = a_1(\exp(-r_{ij}/3) + a_2) + a_3\), the cohesion strength matches \(\hat{\mathcal{C}} = c_1\big((r_{ij}/2 - 1)^3/(r_{ij}/2 + 1)^6 + c_2\big) + c_3\), and the self-propulsion strength is \(s_1(|\mathbf{v}_i|^2 + s_2) + s_3\) (refer to Supplementary Information Section IIIA). Therefore, the inferred SDE is
where \(\mathbf{v}_i = \dot{\mathbf{r}}_i\), \(\mathbf{r}_{ij} = \mathbf{r}_j - \mathbf{r}_i\), \(\mathbf{v}_{ij} = \mathbf{v}_j - \mathbf{v}_i\), \(r_{ij} = |\mathbf{r}_{ij}|\), \(\mathbf{W}_t \sim \mathcal{N}(0, \mathrm{d}t)\) represents the Wiener process with mean zero and variance dt (here dt = 0.01), and \(\hat{\boldsymbol{\epsilon}}\) is the estimated intensity of stochasticity. The reproduced force field exhibits consistency with the actual field for a substantial duration across each flock system, and long-term predictions reveal a diverse range of behaviors, as depicted in Fig. 3g–j. To assess the generalizability of the inferred SDE, we employ it to describe the three other datasets that were not used in training. We observe that by solely fine-tuning the scaling coefficients without altering the equation form, Eq. (4) effectively captures the underlying dynamic mechanism of the collective behaviors exhibited in these three datasets, as illustrated in Fig. 3k–p. The hyperparameters and coefficients are shown in Supplementary Information Table 5.
The renowned Vicsek model has long served as a staple in flocking dynamics research, often regarded as a simplistic representation. Our finding offers unprecedented evidence that the Vicsek model transcends its toy model status, effectively encapsulating authentic flocking dynamics. Remarkably, Eq. (4) is autonomously inferred from the observation data, devoid of preconceived assumptions about its structure. Consequently, the striking resemblance of the inferred SDE to the secondorder Vicsek model^{41,42} unveils new perspectives for understanding and modeling the collective behaviors of real flocks.
Learning the spreading dynamics of tau pathology in mouse brain
Tau proteins play a crucial role in maintaining the stability of axon microtubules, which is essential for the proper functioning of the brain^{43}. However, in Alzheimer’s disease (AD), misfolded and hyperphosphorylated tau proteins lose their ability to bind microtubules properly, leading to their accumulation as neurofibrillary tangles, a hallmark of the disease^{44}. Previous experimental studies have shown that in the early stages of AD, pathological tau spreads from the transentorhinal cortex to other areas of the limbic and neocortical regions, suggesting a spread along neuroanatomical connections^{45}. There is also evidence indicating that tau can be released into the extracellular space, either as free tau or in vesicles such as exosomes^{46,47}. As the second empirical system, here we apply our proposed inference framework to the observed tau pathology data to identify the hidden governing equation of the spreading dynamics contributed by both neuroanatomical connections and spatial proximity due to extracellular diffusion.
To obtain the tau spreading data, biologists injected non-transgenic (NTG) mice at five specific injection sites with paired helical filament (PHF) tau extracted from the hippocampus and overlying cortex. The injected mice were euthanized at different time points (1, 3, 6, and 9 months after injection) to obtain pseudo-longitudinal data^{48,49}. The brain sections of each mouse were then stained to label the percentage of infected area in different brain regions, as shown in Fig. 4a. We consider the bidirectional diffusion of tau pathology along neuroanatomical connections, with retrograde (from terminals to the cell body) and anterograde (from the cell body to terminals) directions^{49,50}. We also take into account the influence of geographical distance on diffusion, as shown in Fig. 4b. This leads to a total of n = 160 brain regions, with time-dependent percentages of the area occupied by tau pathology denoted as y(t). We have neuroanatomical and Euclidean weighted adjacency matrices A and D, respectively, with the anterograde matrix represented as A^{T}.
By applying our framework, we infer the governing equation of the tau pathology diffusion dynamics as follows:
Here, \(D_{ij} = 1/\log(E_{ij}^2)\), where matrix E represents the actual Euclidean distance between different regions, and the term T(t) = c_{t} + 1.5t includes the trainable parameter c_{t} that captures the varying propagation rate over time. Elements of D less than 0.11 are set to zero, meaning that there is no immediate spatial diffusion between two regions that are far apart (refer to Supplementary Information Section IIIB). The binary matrix \(\tilde{A}\) has elements \(\tilde{A}_{ij} = 1\) when A_{ij} > 0 and \(\tilde{A}_{ij} = 0\) otherwise; the same applies to the binary matrix \(\tilde{D}\). In the initial state x(0), only the five sites injected with 1 μg of tau are set to a value of 1 and the rest are zero, consistent with previous work^{49}. The weights b_{0}, b_{1}, b_{2}, and b_{3} are heterogeneous factors assigned to different regions^{51}. The term σ represents the stochastic noise in the system. It is worth mentioning that the three terms on the right-hand side of the inferred equation correspond to retrograde, anterograde, and spatial diffusion, respectively, which demonstrates the biological interpretability of our inference result.
Due to the stochastic nature of pathology propagation in brains, spatial patterns emerge only after early stages of fluctuating duration. As shown in Fig. 4c, d, the inferred equation adeptly predicts the tau diffusion at 6 and 9 months post-injection (MPI) given the injection sites. To validate the prediction's specificity to the injection sites, we assess its performance against 500 randomly selected initial sets comprising five regions each. The results affirm that injections at the experimental seed regions yield the highest accuracy across all time points (Fig. 4e). Furthermore, we evaluate a degenerate model that treats each brain region equally. The outcomes demonstrate reduced predictability of the degenerate model (Fig. 4f), underscoring the significant influence of regional heterogeneity on tau pathology diffusion.
Finally, we apply our method to infer the tau diffusion dynamics in mice carrying the LRRK2^{G2019S} mutation. This mutation is the most common cause of familial Parkinson's disease and a common risk factor for idiopathic Parkinson's disease. Mice carrying this mutation exhibit altered tau pathology patterns; intriguingly, the inferred equation that accurately delineates tau diffusion in mutant mice (Fig. 4g) shares the same form as Eq. (5). The remarkable distinction is that while tau pathology diffusion in NTG mice lacks a directional preference, the diffusion in mutant mice exhibits a pronounced inclination towards the retrograde direction. This preference is quantified by the absolute average value of the coefficient b_{1} of the retrograde term, ranging from 0.7 to 0.9, contrasting with b_{2} for the anterograde term, which falls within the range 0.1–0.3. These results align with a recent experiment^{50}.
Our discovery offers new insights into tau pathology. Firstly, the inferred Eq. (5) holds promise in generally capturing tau pathology dynamics in brains. Secondly, the results shed light on the significance of spatial diffusion, a factor overlooked in previous studies, indicating its nonnegligible impact on tau pathology. Lastly, the delineation of coefficients for retrograde and anterograde diffusion terms underscores the distinct tau pathology dynamics in mutant mice. These findings collectively enhance our understanding of tau pathology mechanisms.
Discussion
Inferring the governing equations of complex systems from observational data is a crucial step toward automating scientific discovery. Previous studies have primarily focused on benchmarking algorithms on model systems with known ground truths. In contrast, our work delves into two important real-world systems and successfully distills their concealed networked SDEs. This not only showcases the applicability of our approach but also generates novel insights for understanding the mechanisms hidden in empirical flocking and tau pathology diffusion. Importantly, our LaGNA method requires only a single trial of nodes' activity sequences, and only snapshots rather than continuous time series data, enhancing its flexibility and adaptability to other real scenarios with the aid of inductive bias.
While LaGNA demonstrates superior performance compared to previous state-of-the-art methods and provides valuable insights into real complex systems, it does have limitations that necessitate further attention in future research. Firstly, in some scenarios, the activity time series of certain nodes may be inaccessible. Therefore, it is worth determining the minimal subnetwork structure required to unveil the system dynamics^{52,53,54}. Real data from stochastic systems often exhibit a combination of intrinsic stochasticity and extrinsic noise, with the latter arising from measurement errors. Distinguishing between these types of noise poses significant challenges^{55,56,57}. Without prior knowledge of the dominant source of noise, we treat all noises as intrinsic in this work, and LaGNA demonstrates accurate inference when the relative strength of extrinsic noise is below 10%. When extrinsic noise is more pronounced, a preprocessing step of denoising, such as the Kalman-Takens filter^{57}, enhances inference capability (see Supplementary Information Sections VB and VC). However, future efforts are needed to better address extrinsic noises in data.
Secondly, while many real complex networks have been successfully mapped in the past, obtaining the topological data of a network may not always be feasible in certain scenarios. In such cases, there is a need to infer both network topology and system dynamics. Recent commendable efforts have been made to address this challenge^{18,58,59}, yet they either require activity data from many trials with different initial states^{18} or learn for dynamics prediction rather than inference^{58,59}. Simultaneously inferring both the dynamical equation and network topology of a large real system using a limited amount of experimentally feasible data remains a challenging task.
Thirdly, while the preconstructed libraries in the second stage of LaGNA can contain a large number of orthogonal or non-orthogonal elementary function terms, it is still possible that preconstructed libraries overlook certain terms. Symbolic regression, an alternative method that does not rely on preconstructed libraries, faces challenges in higher dimensions. Thus, further efforts are needed to enhance the automation of current methods.
Fourthly, there has been considerable interest in higher-order interactions within complex systems in recent years^{59,60,61}. LaGNA can be extended to accommodate higher-order systems by incorporating additional terms such as \(\sum_{j,k} A_{i,j,k}\, \mathbf{h}(\mathbf{x}_i(t), \mathbf{x}_j(t), \mathbf{x}_k(t))\) into the interaction part of Eq. (1), where A_{i,j,k} represents the third-order network and the function h denotes the third-order interaction dynamics. Yet this extension will increase the complexity of identifying an optimal equation, presenting a promising avenue for future efforts.
Methods
Loss function of LaGNA
Consider a complex networked system whose dynamics are governed by stochastic differential equations (SDEs)

$$\mathrm{d}\mathbf{x}_i(t) = \Big[\mathbf{F}(\mathbf{x}_i(t)) + \sum_{j=1}^{n} A_{ij}\, \mathbf{G}\big(\mathbf{x}_i(t), \mathbf{x}_j(t)\big)\Big]\mathrm{d}t + \boldsymbol{\Phi}(\mathbf{x}_i(t))\, \mathbf{W}_t. \qquad (6)$$
Here, x_{i}(t) represents the d-dimensional state of node i at time t; A is the adjacency matrix of size n × n, where A_{ij} denotes the influence of node j on node i; \(\mathbf{F} \equiv (F_1(\mathbf{x}_i), F_2(\mathbf{x}_i), \ldots, F_d(\mathbf{x}_i))^{\mathrm{T}}\) and \(\mathbf{G} \equiv (G_1(\mathbf{x}_i, \mathbf{x}_j), G_2(\mathbf{x}_i, \mathbf{x}_j), \ldots, G_d(\mathbf{x}_i, \mathbf{x}_j))^{\mathrm{T}}\) are nonlinear functions representing the self and interaction dynamics, respectively; Φ(x_{i}(t)) is the positive-definite diffusion matrix of size d × d, and W_{t} is a d-dimensional vector representing the Wiener process with mean zero and variance dt^{25}. Note that, by choosing different F and G, Eq. (6) can describe a wide range of system dynamics^{32,33}.
For simplicity, let us first consider the case d = 1. Given x(t) and dt, x_{i}(t + dt) can be considered as drawn from the normal distribution

$$x_i(t+\mathrm{d}t) \sim \mathcal{N}\big(\mu_i(t),\, \sigma_i^2(t)\big), \qquad (7)$$
where \({\mu }_{i}(t)={x}_{i}(t)+[F({x}_{i}(t))+{\sum }_{j=1}^{n}{A}_{ij}G({x}_{i}(t), \ {x}_{j}(t))]{{\mbox{d}}}t\), and \({\sigma }_{i}^{2}(t)={\Phi }^{2}({x}_{i}(t)){{\mbox{d}}}t\). To train the network end to end, we use all nodes’ states at time t, x(t), as inputs. Based on the network topology A_{ij}, we map the information flow from node j to node i using a function g(x_{i}(t), x_{j}(t)). The estimated information values are then aggregated elementwise for each receiving node over all respective sending nodes. Additionally, we map the selfdynamics of each node i using a function f(x_{i}(t)). The estimated mean and variance of node i’s activity distribution can be written as \({\tilde{\mu }}_{i}(t)={x}_{i}(t)+[f({x}_{i}(t))+{\sum }_{j=1}^{n}{A}_{ij}g({x}_{i}(t), \ {x}_{j}(t))]{{\mbox{d}}}t\) and \({\tilde{\sigma }}_{i}^{2}(t)={\phi }^{2}({x}_{i}(t)){{\mbox{d}}}t\) respectively. The functions g, f and ϕ are determined by trainable parameters θ_{f}, θ_{g} and θ_{ϕ}, respectively.
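The estimated per-node mean and variance described above can be made concrete with a short sketch for d = 1; the linear f, diffusive g, and constant ϕ below are toy stand-ins for the trained neural network modules:

```python
import numpy as np

def node_distribution_params(x, A, f, g, phi, dt):
    """Per-node mean mu_i = x_i + [f(x_i) + sum_j A_ij * g(x_i, x_j)] * dt and
    variance var_i = phi(x_i)^2 * dt of x_i(t+dt), for scalar states (d = 1)."""
    n = len(x)
    mu = np.empty(n)
    var = np.empty(n)
    for i in range(n):
        # aggregate messages from all sending nodes j weighted by the topology
        interaction = sum(A[i, j] * g(x[i], x[j]) for j in range(n))
        mu[i] = x[i] + (f(x[i]) + interaction) * dt
        var[i] = phi(x[i]) ** 2 * dt
    return mu, var

# toy stand-ins for the trained self, interaction, and diffusion modules
x = np.array([0.5, -0.2])
A = np.array([[0.0, 1.0], [1.0, 0.0]])
mu, var = node_distribution_params(x, A,
                                   f=lambda xi: -xi,
                                   g=lambda xi, xj: xj - xi,
                                   phi=lambda xi: 0.2,
                                   dt=0.01)
```

For node 0, for example, the drift is f(0.5) + g(0.5, −0.2) = −0.5 − 0.7, giving μ₀ = 0.5 − 0.012 = 0.488 and variance 0.2² × 0.01 = 4 × 10⁻⁴.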
To obtain the optimal parameters, we train the model in Fig. 1b by maximizing the likelihood between the true and estimated distributions. Since the true distribution is inaccessible and only the next-step state is available, we instead maximize the following expectation using maximum likelihood estimation (MLE):

$$\max_{\boldsymbol{\theta}_f, \boldsymbol{\theta}_g, \boldsymbol{\theta}_\phi} \; \mathbb{E}\Big[\log p_{\boldsymbol{\theta}_f, \boldsymbol{\theta}_g, \boldsymbol{\theta}_\phi}\big(x_i(t+\mathrm{d}t) \mid \mathbf{x}(t)\big)\Big], \qquad (8)$$
where \(p_{\boldsymbol{\theta}_f, \boldsymbol{\theta}_g, \boldsymbol{\theta}_\phi}\) represents the probability density of the normal distribution generated by the model of Fig. 1b with parameters θ_{f}, θ_{g}, θ_{ϕ}, i.e.,

$$p_{\boldsymbol{\theta}_f, \boldsymbol{\theta}_g, \boldsymbol{\theta}_\phi}\big(x_i(t+\mathrm{d}t) \mid \mathbf{x}(t)\big) = \frac{1}{\sqrt{2\pi \tilde{\sigma}_i^2(t)}} \exp\!\left(-\frac{\big(x_i(t+\mathrm{d}t) - \tilde{\mu}_i(t)\big)^2}{2\tilde{\sigma}_i^2(t)}\right). \qquad (9)$$
Maximizing the likelihood in Eq. (9) is equivalent to minimizing the negative log-likelihood using the estimated \(\tilde{\mu}_i(t)\) and \(\tilde{\sigma}_i^2(t)\), i.e.,

$$-\log p_{\boldsymbol{\theta}_f, \boldsymbol{\theta}_g, \boldsymbol{\theta}_\phi} = \frac{1}{2}\log\big(2\pi \tilde{\sigma}_i^2(t)\big) + \frac{\big(x_i(t+\mathrm{d}t) - \tilde{\mu}_i(t)\big)^2}{2\tilde{\sigma}_i^2(t)}. \qquad (10)$$
Here, the constant coefficients and terms can be omitted, hence the loss function becomes

$$\mathcal{L}_i(t) = \log \tilde{\sigma}_i(t) + \frac{\big(x_i(t+\mathrm{d}t) - \tilde{\mu}_i(t)\big)^2}{2\tilde{\sigma}_i^2(t)}. \qquad (11)$$
For a training dataset containing n observed nodes, the expectation becomes

$$\mathcal{L} = \frac{1}{n} \sum_{i=1}^{n} \left[\log \tilde{\sigma}_i(t) + \frac{\big(x_i(t+\mathrm{d}t) - \tilde{\mu}_i(t)\big)^2}{2\tilde{\sigma}_i^2(t)}\right]. \qquad (12)$$
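A minimal numpy sketch of this simplified per-node loss (constants dropped, d = 1), assuming arrays of observed next states, estimated means, and estimated variances:

```python
import numpy as np

def gaussian_nll(x_next, mu, var):
    """Average negative log-likelihood of observed next states under N(mu, var),
    with the additive constant 0.5 * log(2*pi) dropped, matching the simplified
    per-node loss above. Note 0.5 * log(var) equals log(sigma)."""
    return np.mean(0.5 * np.log(var) + (x_next - mu) ** 2 / (2.0 * var))
```

With unit variance and perfect mean estimates the loss is zero, and any deviation of the observed state from the estimated mean increases it, which is what drives the training of the three simulator modules.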
For the case d > 1, the negative logarithm of the multivariate normal density can be written as

$$-\log p = \frac{1}{2}\log\big((2\pi)^d \det \Sigma(t)\big) + \frac{1}{2}\big(\mathbf{x}_i(t+\mathrm{d}t) - \tilde{\boldsymbol{\mu}}_i(t)\big)^{\mathrm{T}} \Sigma^{-1}(t)\, \big(\mathbf{x}_i(t+\mathrm{d}t) - \tilde{\boldsymbol{\mu}}_i(t)\big), \qquad (13)$$
where Σ(t) is a positive semidefinite matrix.
Inference of self, interaction, and diffusion parts
After the well-trained model of Fig. 1b separates the self, interaction, and diffusion parts, we adopt the core idea of the two-phase inference approach proposed by us previously^{17} to infer the concise form for each part. Specifically, with three pre-constructed extensive libraries L_{F}, L_{G}, and L_{Φ} that contain widely used elementary functions (see Supplementary Information Section IB), we introduce the time series data x_{i}(t), where i = 1, …, n, into L_{F}, L_{G}, and L_{Φ}, and obtain the time-varying matrices Θ_{F}(t) ≡ L_{F}(x_{i}(t)), Θ_{G}(t) ≡ L_{G}(x_{i}(t), x_{j}(t)), and Θ_{Φ}(t) ≡ L_{Φ}(x_{i}(t)). Then, the inference problem can be formulated using the estimated values as follows:
Here \({\tilde{\Theta }}_{F}\equiv {\Theta }_{F}\bigotimes {I}_{d}\), \({\tilde{\Theta }}_{G}\equiv {\Theta }_{G}\bigotimes {I}_{d}\), and \({\tilde{\Theta }}_{\Phi }\equiv {\Theta }_{\Phi }\bigotimes {I}_{d}\), where ⨂ is the Kronecker product and I_{d} is the d × d identity matrix. Therefore, the objective is to find the appropriate sparse coefficients ξ_{F}, ξ_{G}, and ξ_{Φ}, in which most elements are zero and only the coefficients of highly relevant elementary functions are nonzero, such that Eq. (14) closely matches the observed data.
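As an illustration of how such a library matrix can be assembled, the sketch below evaluates a few candidate elementary functions on a scalar time series. The function list here is a small placeholder; the actual libraries L_{F}, L_{G}, and L_{Φ} are far larger (see Supplementary Information Section IB):

```python
import numpy as np

def build_library(x, funcs):
    """Evaluate candidate elementary functions on a time series x of shape (T,),
    returning the design matrix Theta of shape (T, len(funcs))."""
    return np.column_stack([fn(x) for fn in funcs])

# Placeholder library of elementary functions (constant, linear, quadratic, sine).
library = [lambda x: np.ones_like(x), lambda x: x, lambda x: x ** 2, np.sin]
```

Each column of the resulting matrix corresponds to one candidate term; sparse regression then selects the few columns that explain the separated dynamics.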
The first phase involves global regression to find a few most relevant elementary functions for each part, based on the optimization formulas:
where λ_{F}, λ_{G} and λ_{ϕ} are hyperparameters that regulate the sparsity of the coefficients. In the implementation, we use the least absolute shrinkage and selection operator (LASSO) with five-fold cross-validation to determine the optimal hyperparameters. Through global regression, we obtain the degree of relevance between each elementary function in the libraries and the hidden dynamics, significantly reducing the model space.
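In practice one would call an off-the-shelf solver such as scikit-learn's LassoCV with five-fold cross-validation; the sketch below implements the same ℓ1-penalized regression by plain iterative soft-thresholding (ISTA), purely as a dependency-free stand-in:

```python
import numpy as np

def lasso_ista(Theta, y, lam, lr=1e-3, n_iter=5000):
    """Minimize 0.5 * ||Theta @ xi - y||^2 + lam * ||xi||_1 by iterative
    soft-thresholding (ISTA). lam regulates the sparsity of xi."""
    xi = np.zeros(Theta.shape[1])
    for _ in range(n_iter):
        grad = Theta.T @ (Theta @ xi - y)   # gradient of the smooth part
        xi = xi - lr * grad
        # Soft-thresholding: the proximal step for the l1 penalty.
        xi = np.sign(xi) * np.maximum(np.abs(xi) - lr * lam, 0.0)
    return xi
```

Coefficients that survive the shrinkage indicate relevant library terms, and their magnitudes provide the relevance ranking used in the second phase.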
Next, we utilize the second phase to identify the minimal number of elementary functions for the self, interaction, and diffusion parts, respectively, which constitute the final stochastic differential equation. To do so, we add the most relevant elementary functions one by one according to the relevance degree obtained in the first phase. We use the metric \({\kappa }^{2}=1-\frac{{\sum }_{i}{({\hat{y}}_{i}-{y}_{i})}^{2}}{{\sum }_{i}{({y}_{i}-\overline{y})}^{2}}\) to indicate the regression score of a temporary combination of elementary functions. The more accurate the current equation, the closer κ^{2} is to 1. Here, \({\hat{y}}_{i}\), y_{i}, and \(\overline{y}\) are the prediction, the true value, and the mean of the true values, respectively. As we sequentially add the relevant elementary functions to the equation, the metric κ^{2} changes accordingly. The minimal number of elementary functions for each part is determined when adding more elementary functions to the equation does not increase, or even decreases, the value of κ^{2}, as shown in Fig. 1e–g.
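This second phase can be sketched as follows, assuming the relevance ordering from the first phase is supplied as a list of library column indices (κ² is the usual coefficient of determination):

```python
import numpy as np

def kappa_squared(y_true, y_pred):
    """Regression score kappa^2 = 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_pred - y_true) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

def stepwise_select(Theta, y, order, tol=1e-3):
    """Add library terms one by one in decreasing relevance (column indices in
    `order`); stop once kappa^2 no longer improves by at least tol."""
    chosen, best = [], -np.inf
    for idx in order:
        trial = chosen + [idx]
        coef, *_ = np.linalg.lstsq(Theta[:, trial], y, rcond=None)
        score = kappa_squared(y, Theta[:, trial] @ coef)
        if score - best < tol:
            break  # no further improvement: keep the minimal combination
        chosen, best = trial, score
    return chosen, best
```

For a signal built from two library terms, the selection stops after exactly those two are added, since further terms leave κ² unchanged.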
Quantification of inference inaccuracy
The goal of our study is to infer the mathematical equation that describes the dynamics underlying a complex system, rather than only to predict the system’s future states. To quantify the difference between the inferred and true equations, we use the symmetric mean absolute percentage error (sMAPE): \({{\mbox{sMAPE}}}=\frac{1}{k}{\sum }_{i=1}^{k}\frac{| {I}_{i}-{R}_{i}| }{| {I}_{i}| +| {R}_{i}| }\).
Here, k is the cardinality of the set containing the inferred and true terms, and I_{i} and R_{i} are the inferred and true coefficients of each term, respectively. The value of sMAPE lies in the interval [0, 1]; the smaller the sMAPE, the more accurate the inference result. Importantly, sMAPE is sensitive to false negative and false positive errors: if the inferred equation contains a term that should not be there, or lacks a term that should be there, the value of sMAPE increases significantly.
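Under the standard definition of sMAPE (consistent with the properties described here: bounded in [0, 1], with a missing or spurious term encoded as a zero coefficient that contributes the maximal per-term error of 1), a minimal implementation is:

```python
import numpy as np

def smape(inferred, true):
    """Symmetric mean absolute percentage error between inferred and true
    coefficient vectors over the union of terms; a term absent from one
    equation is encoded there as a zero coefficient."""
    inferred = np.asarray(inferred, dtype=float)
    true = np.asarray(true, dtype=float)
    return np.mean(np.abs(inferred - true) / (np.abs(inferred) + np.abs(true)))
```

A perfectly recovered equation gives sMAPE = 0, while an equation with one correct term and one falsely omitted term gives sMAPE = 0.5.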
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
The code for generating the simulation data in this study is deposited in the public GitHub repository^{51}. The real data of bird flocks^{40} and tau pathology^{49} are shared via the link https://doi.org/10.6084/m9.figshare.24804894.v4. Source data are provided with this paper.
Code availability
The code is available in the public GitHub repository https://github.com/TingTingGao/NetworkSDEInference.git and on Zenodo: https://doi.org/10.5281/zenodo.12112887^{51}.
References
Brückner, D. B. et al. Stochastic nonlinear dynamics of confined cell migration in two-state systems. Nat. Phys. 15, 595–601 (2019).
Ji, F., Wu, Y., Pumera, M. & Zhang, L. Collective behaviors of active matter learning from natural taxes across scales. Adv. Mater. 35, 2203959 (2023).
Vicsek, T. & Zafeiris, A. Collective motion. Phys. Rep. 517, 71–140 (2012).
Shahrezaei, V. & Swain, P. S. The stochastic nature of biochemical networks. Curr. Opin. Biotechnol. 19, 369–374 (2008).
Acar, M., Mettetal, J. T. & Van Oudenaarden, A. Stochastic switching as a survival strategy in fluctuating environments. Nat. Genet. 40, 471–475 (2008).
Rolls, E. T. & Deco, G. The Noisy Brain: Stochastic Dynamics as a Principle of Brain Function (Oxford University Press, Oxford, 2010).
Mendonça, P. R. et al. Stochastic and deterministic dynamics of intrinsically irregular firing in cortical inhibitory interneurons. eLife 5, e16475 (2016).
Palmer, T. Stochastic weather and climate models. Nat. Rev. Phys. 1, 463–471 (2019).
Grilli, J. Macroecological laws describe variation and diversity in microbial communities. Nat. Commun. 11, 4743 (2020).
Georgescu, I. How machines could teach physicists new scientific concepts. Nat. Rev. Phys. 4, 736–738 (2022).
Krenn, M. et al. On scientific understanding with artificial intelligence. Nat. Rev. Phys. 4, 761–769 (2022).
Liu, Z. & Tegmark, M. Machine learning conservation laws from trajectories. Phys. Rev. Lett. 126, 180604 (2021).
Wang, H. et al. Scientific discovery in the age of artificial intelligence. Nature 620, 47–60 (2023).
Brunton, S. L., Proctor, J. L. & Kutz, J. N. Discovering governing equations from data by sparse identification of nonlinear dynamical systems. Proc. Natl Acad. Sci. USA 113, 3932–3937 (2016).
Raissi, M., Perdikaris, P. & Karniadakis, G. E. Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 378, 686–707 (2019).
Cranmer, M. et al. Discovering symbolic models from deep learning with inductive biases. Adv. Neural Inf. Process. Syst. 33, 17429–17442 (2020).
Gao, T.-T. & Yan, G. Autonomous inference of complex network dynamics from incomplete and noisy data. Nat. Comput. Sci. 2, 160–168 (2022).
Zhang, Y. et al. Universal framework for reconstructing complex networks and node dynamics from discrete or continuous dynamics data. Phys. Rev. E 106, 034315 (2022).
Rao, C. et al. Encoding physics to learn reaction–diffusion processes. Nat. Mach. Intell. 5, 765–779 (2023).
Gu, T. et al. Stochastic trajectory prediction via motion indeterminacy diffusion. Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit. 17113–17122 (2022).
Tang, K., Ao, P. & Yuan, B. Robust reconstruction of the Fokker-Planck equations from time series at different sampling rates. EPL 102, 40003 (2013).
Bernhard, J. E., Moreland, J. S. & Bass, S. A. Bayesian estimation of the specific shear and bulk viscosity of quark–gluon plasma. Nat. Phys. 15, 1113–1117 (2019).
Mitra, E. D. & Hlavacek, W. S. Parameter estimation and uncertainty quantification for systems biology models. Curr. Opin. Syst. Biol. 18, 9–18 (2019).
Brückner, D. B., Ronceray, P. & Broedersz, C. P. Inferring the dynamics of underdamped stochastic systems. Phys. Rev. Lett. 125, 058103 (2020).
Dietrich, F. et al. Learning effective stochastic differential equations from microscopic simulations: linking stochastic numerics to deep learning. Chaos 33, 023121 (2023).
Course, K. & Nair, P. B. State estimation of a physical system with unknown governing equations. Nature 622, 261–267 (2023).
Chen, X. et al. Constructing custom thermodynamics using deep learning. Nat. Comput. Sci. 4, 66–85 (2024).
Frishman, A. & Ronceray, P. Learning force fields from stochastic trajectories. Phys. Rev. X 10, 021009 (2020).
Kaheman, K., Brunton, S. L. & Kutz, J. N. Automatic differentiation to simultaneously identify nonlinear dynamics and extract noise probability distributions from data. Mach. Learn.: Sci. Technol. 3, 015031 (2022).
Xu, K., Hu, W., Leskovec, J. & Jegelka, S. How powerful are graph neural networks? Int. Conf. Learn. Represent. (ICLR) (2019).
Boccaletti, S., Latora, V., Moreno, Y., Chavez, M. & Hwang, D.-U. Complex networks: structure and dynamics. Phys. Rep. 424, 175–308 (2006).
Barzel, B. & Barabási, A.-L. Universality in network dynamics. Nat. Phys. 9, 673–681 (2013).
Meena, C. et al. Emergent stability in complex network dynamics. Nat. Phys. 19, 1033–1042 (2023).
Borges, F. et al. Inference of topology and the nature of synapses, and the flow of information in neuronal networks. Phys. Rev. E 97, 022303 (2018).
Arenas, A., Díaz-Guilera, A., Kurths, J., Moreno, Y. & Zhou, C. Synchronization in complex networks. Phys. Rep. 469, 93–153 (2008).
Cavagna, A. et al. Scalefree correlations in starling flocks. Proc. Natl Acad. Sci. USA 107, 11865–11870 (2010).
Katz, Y., Tunstrøm, K., Ioannou, C. C., Huepe, C. & Couzin, I. D. Inferring the structure and dynamics of interactions in schooling fish. Proc. Natl Acad. Sci. USA 108, 18720–18725 (2011).
Vásárhelyi, G. et al. Optimized flocking of autonomous drones in confined environments. Sci. Robot. 3, eaat3536 (2018).
Reynolds, C. W. Flocks, herds and schools: a distributed behavioral model. Comput. Graph. 21, 25–34 (1987).
Nagy, M., Ákos, Z., Biro, D. & Vicsek, T. Hierarchical group dynamics in pigeon flocks. Nature 464, 890–893 (2010).
Vicsek, T., Czirók, A., Ben-Jacob, E., Cohen, I. & Shochet, O. Novel type of phase transition in a system of self-driven particles. Phys. Rev. Lett. 75, 1226 (1995).
Grégoire, G., Chaté, H. & Tu, Y. Moving and staying together without a leader. Physica D 181, 157–170 (2003).
Wang, Y. & Mandelkow, E. Tau in physiology and pathology. Nat. Rev. Neurosci. 17, 22–35 (2016).
Guo, J. L. et al. Unique pathological tau conformers from Alzheimer’s brains transmit tau pathology in nontransgenic mice. J. Exp. Med. 213, 2635–2654 (2016).
Braak, H. & Braak, E. Neuropathological stageing of Alzheimer-related changes. Acta Neuropathol. 82, 239–259 (1991).
Wu, J. W. et al. Neuronal activity enhances tau propagation and tau pathology in vivo. Nat. Neurosci. 19, 1085–1092 (2016).
Asai, H. et al. Depletion of microglia and inhibition of exosome synthesis halt tau propagation. Nat. Neurosci. 18, 1584–1593 (2015).
Henderson, M. X. et al. Spread of α-synuclein pathology through the brain connectome is modulated by selective vulnerability and predicted by network analysis. Nat. Neurosci. 22, 1248–1257 (2019).
Cornblath, E. J. et al. Computational modeling of tau pathology spread reveals patterns of regional vulnerability and the impact of a genetic risk factor. Sci. Adv. 7, eabg6677 (2021).
Ramirez, D. M. et al. Endogenous pathology in tauopathy mice progresses via brain networks. Preprint at bioRxiv https://doi.org/10.1101/2023.05.23.541792 (2023).
Gao, T. LaGNA: Learning interpretable dynamics of stochastic complex systems. https://doi.org/10.5281/zenodo.12112887 (2024).
Casadiego, J., Nitzan, M., Hallerberg, S. & Timme, M. Model-free inference of direct network interactions from nonlinear collective dynamics. Nat. Commun. 8, 2192 (2017).
Shen, J., Liu, F., Tu, Y. & Tang, C. Finding gene network topologies for given biological function with recurrent neural network. Nat. Commun. 12, 3125 (2021).
Levina, A., Priesemann, V. & Zierenberg, J. Tackling the subsampling problem to infer collective properties from limited data. Nat. Rev. Phys. 4, 770–784 (2022).
Swain, P. S., Elowitz, M. B. & Siggia, E. D. Intrinsic and extrinsic contributions to stochasticity in gene expression. Proc. Natl Acad. Sci. USA 99, 12795–12800 (2002).
Lind, P. G. et al. Extracting strong measurement noise from stochastic time series: applications to empirical data. Phys. Rev. E 81, 041125 (2010).
Hamilton, F., Berry, T. & Sauer, T. Kalman-Takens filtering in the presence of dynamical noise. Eur. Phys. J. Spec. Top. 226, 3239–3250 (2017).
Prasse, B. & Van Mieghem, P. Predicting network dynamics without requiring the knowledge of the interaction graph. Proc. Natl Acad. Sci. USA 119, e2205517119 (2022).
Li, X. et al. Higher-order Granger reservoir computing: simultaneously achieving scalable complex structures inference and accurate dynamics prediction. Nat. Commun. 15, 2506 (2024).
Lambiotte, R., Rosvall, M. & Scholtes, I. From networks to optimal higherorder models of complex systems. Nat. Phys. 15, 313–320 (2019).
Battiston, F. et al. The physics of higher-order interactions in complex systems. Nat. Phys. 17, 1093–1098 (2021).
Acknowledgements
GY is supported by the National Natural Science Foundation of China (grants no. T2225022, no. 12161141016, no. 12350710786, and no. 62088101), STI2030 Major Project (grant no. 2021ZD0204500), Shanghai Municipal Science and Technology Major Project (grant no. 2021SHZDZX0100), Shuguang Program of Shanghai Education Development Foundation and Shanghai Municipal Education Commission (grant no. 22SG21), and the Fundamental Research Funds for the Central Universities. BB is supported by the Israel Science Foundation (grant no. 499/19), the Israel-China ISF-NSFC joint research program (grant no. 3552/21), the US National Science Foundation CRISP award no. 1735505, and the VATAT grant for data science research. The authors are also grateful for the helpful discussion with Zhuohao He, Jack M. Moore, Xiaozhu Zhang, Tongyu Li, and Xiaolei Ru.
Author information
Authors and Affiliations
Contributions
G.Y. conceived the research, G.Y., T.T.G. and B.B. designed it, T.T.G. performed it, G.Y., T.T.G., and B.B. analyzed the results, and G.Y. and T.T.G. wrote the manuscript with input from B.B.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks Tailin Wu, and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Source data
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Gao, T.-T., Barzel, B. & Yan, G. Learning interpretable dynamics of stochastic complex systems from experimental data. Nat Commun 15, 6029 (2024). https://doi.org/10.1038/s41467-024-50378-x
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s41467-024-50378-x