# Model-free inference of direct network interactions from nonlinear collective dynamics

## Abstract

The topology of interactions in network dynamical systems fundamentally underlies their function. Accelerating technological progress creates massively available data about collective nonlinear dynamics in physical, biological, and technological systems. Detecting direct interaction patterns from those dynamics still constitutes a major open problem. In particular, current nonlinear dynamics approaches mostly require a model of the (often high-dimensional) system dynamics to be known a priori. Here we develop a model-independent framework for inferring direct interactions solely from recordings of the nonlinear collective dynamics generated. Introducing an explicit dependency matrix in combination with a block-orthogonal regression algorithm, the approach works reliably across many dynamical regimes, including transient dynamics toward steady states, periodic and non-periodic dynamics, and chaos. Together with its capabilities to reveal network (two-point) as well as hypernetwork (e.g., three-point) interactions, this framework may thus open up nonlinear dynamics options of inferring direct interaction patterns across systems where no model is known.

## Introduction

The collective dynamics of many natural systems ranging from regulatory circuits and metabolic systems1,2,3,4,5,6,7 to communication, distribution, and supply networks8,9 is derived from the direct interactions of their parts. Determining how such systems are connected may help us in understanding and controlling their function10,11. Current nonlinear dynamics approaches may recover direct interactions from the collective dynamics of a system if a mathematical model is provided in advance and only its unknown parameters, network links, and nonlinear terms are to be determined11,12,13,14,15,16,17,18,19. Such models, however, are usually not at hand under most experimental conditions, thereby constraining the applicability of these methods to a limited number of examples. Recent works20,21 on low-dimensional systems suggest that approximating the dynamics through expansions in basis functions may reveal the interaction patterns, if such dynamics admits a sparse representation in the proposed basis. A more recent model-free approach that takes into account the nonlinear network dynamics requires the systems to be externally driven in a controlled way, thus enabling reconstruction from experimental settings for one particular range of settings22. Common model-free approaches not considering the nonlinear system dynamics construct functional links by detecting statistical dependencies (e.g., correlations, mutual information, Granger causality, and extensions thereof)23,24,25,26,27,28,29,31 and thus are prone to recover indirect interactions among the units of a network, for instance, due to common external inputs or decorrelating effects induced by other units in the network11,27,28,32,34. Although the latest efforts have focused on filtering indirect connections27,28 from pairwise statistical dependencies, recent studies show that these functional links can only match direct connections under specific homogeneity conditions35, which rarely occur in real-world systems.

In this article, we propose a novel concept for inferring direct interactions in coupled dynamical systems, relying only on their nonlinear collective dynamics, neither assuming specific dynamic models to be known in advance, nor assuming that the dynamics admits a sparse representation, nor imposing controlled driving, nor expecting statistical dependencies to faithfully reveal direct, physical interactions. To achieve this goal, we here change the perspective and ask which units j of the network provide direct physical interactions to a given unit i and appear on the right hand side of its differential equation, rather than asking for details of the interaction functions among those units. We demonstrate that the problem of inferring direct interactions based on observed nonlinear dynamics may be posed as a multivariate regression problem by introducing an explicit dependency matrix and thereby systematically decomposing each unit's dynamics into pairwise, three-point, and higher-order interactions with other units in the network. Such decompositions provide restricting equations for mapping the collective dynamics to direct interactions. We validate and characterize the predictive power of our approach by successfully revealing the structure of generic as well as specific biological model systems. These model systems may exhibit complex noisy dynamics such as transient dynamics toward steady states, periodic and non-periodic dynamics, or chaos, and have standard pairwise as well as hypernetwork (such as three-point) interactions. Interaction networks may even be revealed if some units are not measured (and thus hidden during observation).

## Results

### Mapping time series to direct interactions

To understand which information a time series contains about the direct interactions in networks, consider a system whose time evolution is given by

$${\bf{\dot x}} = {\bf{F}}({\bf{x}}(t)) + {\boldsymbol{\xi }}(t)$$
(1)

where $${\bf{x}}(t) = \left[ {x_1(t),...,x_N(t)} \right]^\intercal \in {\Bbb R}^N$$ is the state of the entire system consisting of units with variables x i (t), $${\bf{\dot x}} = d{\bf{x}}(t){\mathrm{/}}dt$$ denotes its temporal derivative, $${\boldsymbol{\xi }}(t) = \left[ {\xi _1(t), \ldots ,\xi _N(t)} \right] \in {\Bbb R}^N$$ represents external noise acting on the whole system, and $${\bf{F}}:{\Bbb R}^N \to {\Bbb R}^N$$ is any smooth, typically nonlinear function that we assume to be unknown. Common examples are the regulation functions in models for gene regulatory networks3,5,7 or rate laws in metabolic systems36.

Given a multivariate time series

$$x_{i,m}: = x_i\left( {t_m} \right)$$
(2)

recorded at discrete time points t m = mΔt + t 0, system identification aims to reveal the exact functional form of F and to exactly predict the system's future37. Owing to the high dimensionality of most networks, such identification is typically restricted or even impossible. Here we address the problem in a slightly yet essentially different manner, asking only: which of the variables x j directly act on a given unit i and thus explicitly appear on the right hand side of Eq. (1)? We aim to reveal not only pairwise network interactions, specified by terms of the form $$\dot x_i = ... + g_j^i\left( {x_j} \right) + g_{ij}^i\left( {x_i,x_j} \right) + \ldots ,$$ but also higher-order hypernetwork interactions, induced, for instance, by terms of the form $$\dot x_i = ... + g_{jk}^i\left( {x_j,x_k} \right) + g_{ijk}^i\left( {x_i,x_j,x_k} \right) + ...$$ where two or more units j and k different from i jointly influence unit i directly.
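As an illustration of the first practical step, turning a sampled series x i,m into pairs of states and estimated rates of change, the following minimal sketch uses central finite differences. The text does not state which derivative estimator is used, so the estimator, the function name `estimate_derivatives`, and the toy signal are assumptions made for illustration only.

```python
import numpy as np

def estimate_derivatives(x, dt):
    """Central-difference estimate of dx/dt for a series x of shape (M, N).

    Assumption: uniform sampling interval dt, small enough for the
    difference quotient to approximate the true rate of change.
    """
    x = np.asarray(x, dtype=float)
    # Interior points only: (x[m+1] - x[m-1]) / (2 dt).
    dx = (x[2:] - x[:-2]) / (2.0 * dt)
    # Return the states at which the derivatives were estimated, and the estimates.
    return x[1:-1], dx

# Toy two-unit signal with known derivatives (illustrative, not from the paper).
dt = 0.01
t = dt * np.arange(100)
series = np.column_stack([np.sin(t), np.cos(t)])
states, rates = estimate_derivatives(series, dt)
```

Each row of `states` paired with the matching row of `rates` corresponds to one sampled point of the state space together with its rate of change.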

To distinguish among units and at the same time treat all orders of interactions simultaneously, we introduce explicit dependency matrices $${\mathrm{\Lambda }}^i \in \{ 0,1\} ^{N \times N}$$, diagonal matrices defined by

$${\mathrm{\Lambda }}_{jj}^i = \left\{ {\begin{array}{*{20}{l}} 0 & {{\mathrm{if}}\,\frac{{\partial F_i}}{{\partial x_j}} \equiv 0,} \\ 1 & {{\mathrm{if}}\,\frac{{\partial F_i}}{{\partial x_j}}\not \equiv 0.} \end{array}} \right.$$
(3)

Hence, $${\mathrm{\Lambda }}_{jj}^i = 1$$ if unit j directly acts on unit i, and $${\mathrm{\Lambda }}_{jj}^i = 0$$ otherwise. With this notation, the dynamics of the units becomes

$$\dot x_i = f_i({\mathrm{\Lambda }}^i{\bf{x}}(t)) + \xi _i(t),$$
(4)

where $$f_i:{\Bbb R}^N \to {\Bbb R}$$ is a smooth function that specifies the deterministic evolution of component i and $$\xi _i(t) \in {\Bbb R}$$ represents external noise acting on i.
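To make the definition in Eq. (3) concrete, the sketch below computes the diagonals of the explicit dependency matrices for a small system whose vector field is known. This is only an illustration of the definition, not part of the inference method (which assumes F is unknown). The function names, the toy vector field, and the finite-difference probing are all assumptions.

```python
import numpy as np

def toy_field(x):
    """Hypothetical 3-unit vector field F(x), used only for illustration."""
    return np.array([
        -x[0] + x[1] * x[2],   # unit 0 depends on units 0, 1, and 2
        -x[1],                 # unit 1 depends on itself only
        np.sin(x[0]),          # unit 2 depends on unit 0 only
    ])

def explicit_dependency(F, n_units, n_probes=20, eps=1e-6, tol=1e-8, seed=0):
    """Row i of the result holds the diagonal of Lambda^i.

    Entry (i, j) is 1 if dF_i/dx_j is non-zero at any probed point,
    approximating 'not identically zero' by random probing.
    """
    rng = np.random.default_rng(seed)
    dep = np.zeros((n_units, n_units), dtype=int)
    for _ in range(n_probes):
        x = rng.normal(size=n_units)
        base = F(x)
        for j in range(n_units):
            shifted = x.copy()
            shifted[j] += eps
            sensitivity = np.abs(F(shifted) - base) / eps
            dep[:, j] |= (sensitivity > tol).astype(int)
    return dep

dep = explicit_dependency(toy_field, 3)
Lambda_0 = np.diag(dep[0])   # explicit dependency matrix of unit 0
```

Here unit 0 receives a joint (hypernetwork-type) influence from units 1 and 2, yet its dependency matrix records this in the same way as a pairwise influence: by ones on the diagonal.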

The explicit dependency matrix Λi selects which variables x j directly control the rate of change of x i , thus going beyond the related graph-theoretical notions of adjacency and incidence matrices and thereby emphasizing aspects of the dynamics: first, it offers a uniform representation of pairwise and higher-order interactions; and second, it is thus suitable for generic dynamical systems representations, as it appears exactly once in the right hand side of Eq. (4).

The resulting generic model (Eq. (4)) links state space points x(t) at time t to their rate of change $$\dot x_i(t)$$. In particular, the complemented system state $$s_i(t) = \left[ {{\bf{x}}(t),\dot x_i(t)} \right]^\intercal \in {\Bbb R}^{N + 1}$$ is an element of a higher-dimensional “dynamics space” $${\cal D}_i$$ for each i formed by the state space and the rate of change of unit i. Therefore, the f i specifying the dynamics defines a smooth manifold $${\cal M}_i \subset {\cal D}_i$$, with the $${\mathrm{\Lambda }}_{jj}^i$$ indicating whether or not $${\cal M}_i$$ is constant in direction x j (cf. Fig. 1).

In practical scenarios, the functions f i are generally not accessible. We address this challenge in two stages. First, we functionally decompose the dynamics of units i ∈ {1, 2, ..., N} into interaction terms with the entire network as

$$\begin{array}{*{20}{l}} {\dot x_i} \hfill & \hskip-8pt = \hfill &\hskip-7pt {f_i\left( {{\mathrm{\Lambda }}^i{\bf{x}}} \right) = \mathop {\sum}\limits_{j = 1}^N {\mathrm{\Lambda }}_{jj}^ig_j^i\left( {x_j} \right) + \mathop {\sum}\limits_{j = 1}^N \mathop {\sum}\limits_{s = 1}^N {\mathrm{\Lambda }}_{jj}^i{\mathrm{\Lambda }}_{ss}^ig_{js}^i\left( {x_j,x_s} \right)} \hfill \\ {} \hfill & {} \hfill & { + \mathop {\sum}\limits_{j = 1}^N \mathop {\sum}\limits_{s = 1}^N \mathop {\sum}\limits_{w = 1}^N {\mathrm{\Lambda }}_{jj}^i{\mathrm{\Lambda }}_{ss}^i{\mathrm{\Lambda }}_{ww}^ig_{jsw}^i\left( {x_j,x_s,x_w} \right) + \ldots + \xi _i,} \hfill \end{array}$$
(5)

where $$g_j^i:{\Bbb R} \to {\Bbb R}$$, $$g_{js}^i:{\Bbb R}^2 \to {\Bbb R}$$, $$g_{jsw}^i:{\Bbb R}^3 \to {\Bbb R}$$ and, in general, $$g_{j_1j_2 \ldots j_K}^i:{\Bbb R}^K \to {\Bbb R}$$ represent the (unknown) K-th order interactions between units j k for all k ∈ {1, 2, …, K} and unit i. Specifically, the decomposition (Eq. (5)) separates contributions to unit i arising from different orders, e.g., pairwise and higher-order interactions with other units in the system. The Λ i are defined such that, if $${\mathrm{\Lambda }}_{rr}^i \equiv 0$$, all functions $$g_{j_1,j_2,...,j_K}^i$$ with any of the indices j k = r disappear from the right hand side of Eq. (5).

Given that functions $$g_{j_1,j_2,...,j_K}^i$$ are taken to not be accessible, we decompose each $$g_{j_1,j_2,...,j_K}^i$$ into basis functions h as

$$\begin{array}{*{20}{l}} {\dot x_i} \hfill & \hskip-8pt = \hfill &\hskip-7pt {\mathop {\sum}\limits_{j = 1}^N {\mathrm{\Lambda }}_{jj}^i\mathop {\sum}\limits_{p = 1}^{P_1} c_{j,p}^ih_{j,p}\left( {x_j} \right) + \mathop {\sum}\limits_{j = 1}^N \mathop {\sum}\limits_{s = 1}^N {\mathrm{\Lambda }}_{jj}^i{\mathrm{\Lambda }}_{ss}^i\mathop {\sum}\limits_{p = 1}^{P_2} c_{js,p}^ih_{js,p}\left( {x_j,x_s} \right)} \hfill \\ {} \hfill & {} \hfill & { + \mathop {\sum}\limits_{j = 1}^N \mathop {\sum}\limits_{s = 1}^N \mathop {\sum}\limits_{w = 1}^N {\mathrm{\Lambda }}_{jj}^i{\mathrm{\Lambda }}_{ss}^i{\mathrm{\Lambda }}_{ww}^i\mathop {\sum}\limits_{p = 1}^{P_3} c_{jsw,p}^ih_{jsw,p}\left( {x_j,x_s,x_w} \right) + \ldots + \xi _i,} \hfill \end{array}$$
(6)

where P k indicates the number of basis functions employed in the expansion of the k-th order interactions (cf. ref. 38). Thus, given a time series (Eq. (2)) in which Δt is sufficiently small to reliably estimate the time derivatives $$\dot x_{i,m}$$, revealing direct interactions amounts to identifying the non-zero coefficients on the right hand side of Eq. (6) that best fit the estimated $$\dot x_{i,m}$$. Such expansions (Eq. (6)) differ qualitatively from those developed in refs. 15,16,18,20 since ours require neither that the functions $$g_{j_1,j_2,...,j_K}^i$$ be represented exactly by the chosen basis functions nor that they admit a sparse representation in the basis. Instead, we only require the functions h to form any basis of a relevant function space, thereby additionally allowing the investigator to choose basis functions not appearing explicitly in any of the $$g_.^i$$. In particular, this reduced requirement implies that, for instance, all coefficients $$c_{j,p}^i$$ are (indistinguishable from) zero for all p if there is no functional dependency, i.e., if $$g_j^i\left( {x_j} \right) = \mathop {\sum}\nolimits_p c_{j,p}^ih_{j,p}(x_j) \equiv 0$$.
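The first-order part of Eq. (6) can be sketched as an ordinary regression whose columns are grouped into one block per candidate unit j. The example below is a minimal illustration under stated assumptions: a hypothetical unit driven by units 0 and 2 only, monomial basis functions x j, x j², x j³ (an arbitrary choice that does not represent the true coupling exactly), and a plain least-squares fit with more samples than columns.

```python
import numpy as np

rng = np.random.default_rng(1)
M, N, P = 300, 4, 3                    # samples, units, basis functions per unit
x = rng.uniform(0.0, 2.0, size=(M, N))

# Hypothetical rate of change of one unit: it depends on x_0 and x_2 only.
y = -x[:, 0] + x[:, 2] / (1.0 + x[:, 2])

# Design matrix: a constant column followed by one block of P columns per unit j.
columns = [np.ones(M)]
for j in range(N):
    for p in range(1, P + 1):
        columns.append(x[:, j] ** p)
design = np.column_stack(columns)      # shape (M, 1 + N * P)

coef, *_ = np.linalg.lstsq(design, y, rcond=None)

# Euclidean norm of the coefficients in each unit's block.
block_norms = np.linalg.norm(coef[1:].reshape(N, P), axis=1)
```

In this toy case the blocks belonging to units 0 and 2 carry coefficients of order one, whereas the blocks of units 1 and 3 are close to zero, even though the cubic polynomial only approximates the saturating coupling term.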

This weaker requirement is sufficient to impose a structure of blocks of zero and non-zero coefficients in Eq. (6), representing absent and existing interactions, respectively, thereby posing a mathematical regression problem with grouped variables39,40,42. To solve such structured problems, we developed the Algorithm for Revealing Network Interactions (ARNI) (Supplementary Note 1), a greedy approach based on the Block Orthogonal Least Squares (BOLS) algorithm40. Specifically, our approach takes the time series of all units in the network as inputs and returns a ranked list of interactions indicating the order in which interactions in the right hand side of Eq. (6) were identified as most strongly lowering a cost function (see text below and Supplementary Note 1/Supplementary Fig. 3). We remark that here we do not intend to recover the actual functional form of interactions, but instead we aim at determining the existence or absence of interactions between units. So, even if our scheme infers an optimal model from a given time series, it is not guaranteed that such a model would agree with an actual model generating the dynamics43. Indeed, the fact that we only ask for the units interacting with a given unit and not for details of the coupling functions enables robust performance across systems (compare Figs. 1, 2, 3, 4, 5, and 6).
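The idea of ranking interactions by how strongly they lower a cost function can be sketched with a simplified greedy forward selection over blocks. This is not the ARNI or BOLS procedure of Supplementary Note 1: it refits a full least-squares model at every step instead of using orthogonal projections. The function names, the monomial basis, and the toy data are assumptions for illustration.

```python
import numpy as np

def power_blocks(x, degree=3):
    """One block per unit j, with columns x_j, x_j^2, ..., x_j^degree."""
    return [np.column_stack([x[:, j] ** p for p in range(1, degree + 1)])
            for j in range(x.shape[1])]

def greedy_block_selection(blocks, y, n_steps):
    """Rank blocks by how much adding each one lowers the mean squared residual."""
    ones = np.ones((len(y), 1))        # constant term, always included
    chosen, costs = [], []
    for _ in range(n_steps):
        best_j, best_cost = None, np.inf
        for j in range(len(blocks)):
            if j in chosen:
                continue
            A = np.hstack([ones] + [blocks[k] for k in chosen + [j]])
            coef, *_ = np.linalg.lstsq(A, y, rcond=None)
            cost = np.mean((y - A @ coef) ** 2)
            if cost < best_cost:
                best_j, best_cost = j, cost
        chosen.append(best_j)
        costs.append(best_cost)
    return chosen, costs

# Toy data: the rate of change depends on units 1 and 4 out of 6.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(200, 6))
y = np.sin(2.0 * x[:, 1]) - 0.5 * x[:, 4] ** 2 + 0.01 * rng.normal(size=200)

chosen, costs = greedy_block_selection(power_blocks(x), y, n_steps=4)
```

The returned list `chosen` is the ranked list of candidate units, and `costs` traces the cost after each added block; in this example the two true inputs are picked first and the cost drops sharply once both are included.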

### Revealing direct links in model systems

To demonstrate the robustness of our approach, we inferred the interactions of model systems and compared our results to those obtained from thresholding correlations11,44, partial correlations45, and transfer entropy46. We selected these quantities because they are model independent and have traditionally been used to quantify interactions in networked systems. We tested our framework on systems displaying diverse types of collective dynamics, such as transient dynamics toward steady states, non-periodic dynamics, and chaotic and noisy dynamics, as emerging in models of Michaelis–Menten kinetics in gene regulation, generic heteroclinic, and generic chaotic oscillatory dynamics. We measured the quality of reconstruction in terms of the area under the receiver-operating-characteristic curve (AUC) score (Supplementary Note 3). The AUC score equals 1 for perfect reconstruction and 1/2 for predictions equivalent to random guessing.
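For readers unfamiliar with the AUC score, a minimal sketch of its standard rank-based form follows: the probability that a randomly chosen existing link receives a higher score than a randomly chosen absent link, with ties counted as one half. The exact evaluation procedure of the paper is described in Supplementary Note 3; the function name and example values here are illustrative assumptions.

```python
import numpy as np

def auc_score(scores, truth):
    """Rank-based AUC for link predictions.

    scores: one confidence value per candidate link (higher = more likely).
    truth:  boolean array, True where the link actually exists.
    """
    scores = np.asarray(scores, dtype=float)
    truth = np.asarray(truth, dtype=bool)
    pos, neg = scores[truth], scores[~truth]
    # Compare every existing link against every absent link.
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

truth = np.array([True, True, False, False, False])
perfect = auc_score([0.9, 0.8, 0.3, 0.2, 0.1], truth)
uninformative = auc_score([0.5, 0.5, 0.5, 0.5, 0.5], truth)
```

A ranking that places all existing links above all absent ones gives 1, while scores that carry no information give 1/2.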

Predictions improve with longer time series as well as by composing one long time series out of different short ones, as illustrated for non-periodic dynamics in Fig. 1c. This indicates that sampling sufficient parts of state space is essential for revealing direct network interactions. Generally, we found that if long time series are not available (or not preferred, see the following), compositions of short time series are at least equally appropriate for reconstruction, see, e.g., Fig. 1c. Exemplary tests demonstrate that even time series as short as m = 5 time points recorded from different trajectories evolving toward a steady state might be sufficient. Moreover, reconstruction quality improves with the total number of available recordings M = S × m, where S is the number of experiments, in contrast to inferences from thresholding correlations, partial correlations, and transfer entropy, which cannot predict existing interactions under these minimal sampling conditions (Fig. 2a–c). Inference studies on collections of short time series extracted from non-periodic dynamics further confirm that larger numbers M = S × m of recordings improve quality (as expected). Again, correlations, partial correlations, and transfer entropy are in general less capable of capturing the intrinsic structure of interactions under equally minimal conditions (Fig. 2d–f). Finally, interactions may still be recovered in networks of higher-dimensional units by extending Eq. (5) to include all components $$x_i^d(t)$$ of i ∈ {1, 2, …, N}, where d ∈ {1, 2, …, D i } and D i is the number of components of unit i (Fig. 3a–c).

### Performance

To further characterize the performance of our approach, we carried out systematic reconstructions of various networks of different sizes, numbers of incoming connections per unit, noise levels, fractions of higher-order (hypernetwork) interactions, and numbers of hidden units (Fig. 4). We report four classes of results. First, the number M θ of time points necessary for AUC scores larger than a threshold θ scales sublinearly with the size of the network, Fig. 4a, and linearly with the number of incoming connections per unit, Fig. 4b. Moreover, inferring the incoming connections of single units in large sparse networks (N = 1000, n i = 10) on conventional hardware (Intel® Core™ i5-2430M) takes 65 ± 26 s per unit. Such results highlight the potential applicability of our approach in combination with parallel computing for revealing interactions in real-world networks, which are often large in size and sparsely connected. Second, M θ depends supralinearly on the noise level η, Fig. 4c. Here, sampling longer time series (more data) improves reconstruction quality. These results indicate that inference is still viable for highly noisy dynamics at the expense of recording longer time series. Third, systematic reconstructions of hypernetwork interactions in exemplary models of phase-coupled oscillators (Supplementary Note 4) suggest that our results are independent of the probability p h of having hypernetwork interactions, Fig. 4d, e. This is a consequence of treating pairwise and higher-order interactions equally, by decomposing the coupling into orders of jointly acting units via explicit dependency matrices (Eq. (3)). Thus the approach is insensitive to the appearance of higher-order interactions. Finally, even if some units of the network are not measured (hidden units), existing and non-existing links among measured units may still be reliably inferred, Fig. 4f.
To compute AUC scores, we compare our predictions for the existence and absence of links among the measured units with those actually existing and not existing among those units, making no statement about indirect interactions mediated by hidden units. As more units are hidden, the quality of reconstruction decreases because the hidden units act upon the measured units in an unknown way. Still, sampling longer time series again improves reconstruction quality. Thereby, the model-free approach provides accurate predictions even if only a fraction of the network is recorded.

### Proper basis functions and learning curves

Selecting an appropriate class of basis functions to represent the network interactions in system (Eq. (6)) is vital for any such approach. Choosing basis functions that capture the intrinsic nature of interactions (e.g., h(x i ), h(x i , x j ), h(x i , x j , x w ), and so on) by construction yields optimal results. However, picking exactly the correct interaction function requires prior knowledge of the potential functions involved in coupling units of the system under consideration. To overcome this limitation, we aim only at appropriate classes of coupling functions and do not require picking a correct function (that would enable prediction of time series). While the former implies finding basis functions of the correct order, the latter implies finding a unique set of basis functions capable of fitting the recorded dynamics (see below for further consequences). We remark that a particularly chosen basis function constitutes a representative of an entire class of appropriate functions. For instance, the functions indexed a–d in Table 1 are all equally appropriate representatives of the class of pairwise functions $$g_{ij}^i\left( {x_i,x_j} \right)$$, Fig. 5.

We investigated the effects of selecting different basis functions. For the example shown in Fig. 5, we studied networks of phase-coupled oscillators and divided the time series in a training set (60% of time points) for inferring interactions and a validation set (40% of time points) for evaluating the predictions; we tracked the evolution of a fitting cost function with respect to the l-th discovered interaction. Specifically, the fitting cost function is defined as

$$C_i(l): = \frac{1}{{M_{\mathrm{s}}}}\mathop {\sum}\limits_{m = 1}^{M_{\mathrm{s}}} \left( {\dot x_{i,m} - \widehat {\dot x}_{i,m}(l)} \right)^2,$$
(7)

where M s is the number of time points in the set and $$\widehat {\dot x}_{i,m}(l) \in {\Bbb R}$$ is the prediction by our approach of a computed $$\dot x_{i,m}$$ using the inferred interactions up to the l-th discovered interaction.
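Eq. (7) is a mean squared error between estimated and predicted rates of change, evaluated separately on the training and validation sets. The sketch below shows one way to compute it; the contiguous 60%/40% split is an assumption, since the text states the proportions but not how the time points are assigned to each set, and the function names are hypothetical.

```python
import numpy as np

def fitting_cost(rate_estimated, rate_predicted):
    """Eq. (7): mean squared difference over the M_s points of a set."""
    residual = (np.asarray(rate_estimated, dtype=float)
                - np.asarray(rate_predicted, dtype=float))
    return float(np.mean(residual ** 2))

def split_series(n_points, train_fraction=0.6):
    """Index arrays for a contiguous training/validation split (an assumption)."""
    n_train = int(round(train_fraction * n_points))
    idx = np.arange(n_points)
    return idx[:n_train], idx[n_train:]

train_idx, valid_idx = split_series(10)
example_cost = fitting_cost([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

Evaluating `fitting_cost` on the validation indices after each newly added interaction l yields the learning curve C i (l) discussed in the text.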

The functional forms of the cost function C i (l), depending on the number l of interactions considered, are either L-shaped, indicating the number of incoming connections at the knee l* of the L (basis functions a–d of Table 1, Fig. 5a–d), or not, thereby not revealing any features of the network (basis functions e and f of Table 1, Fig. 5e, f). In addition to revealing the number of incoming connections, the first l* interactions actually chosen provide the full information about which units j directly act on unit i. We remark that, for sufficiently short sampling intervals, both the time derivative $$\dot x_{i,m}$$ and its estimator $$\widehat {\dot x}_{i,m}$$ are obtainable from recorded dynamics data without any model assumption.

These findings confirm that basis functions that merely capture the essential structure of the interactions, but do not necessarily represent the full dynamics exactly, are sufficient to reveal network connectivity. As a consequence, reconstruction of direct network interactions is possible without prior knowledge of a system model.

### Effects of noise and hidden units

In experimentally relevant biological settings, there may be several uncontrolled factors affecting the recorded time series. For instance, in gene networks, noisy dynamics is simultaneously present at several different levels (e.g., gene-intrinsic, network-intrinsic, and cell-intrinsic)2. Fundamentally, noise complicates the inference process by corrupting measurements of the units' dynamics, thereby masking network interactions. Moreover, one may not have complete access to measure all units in the network. This may induce correlating or decorrelating effects among units, thus promoting the recovery of indirect interactions34,47.

To test the robustness of our approach against the combination of both noise and hidden units, we simulated transients toward steady states under the external influence of Gaussian noise and recorded the dynamics of only a subset of randomly selected units in the network. Results indicate that both noise and hidden units moderately reduce the performance of our approach, Fig. 6. However, the inference quality still increases with M, Fig. 6d, such that larger sampling collections may still reveal interaction topology. Moreover, systematic reconstructions of different sets of recorded units indicate that our predictions generally outperform those extracted from correlations, partial correlations, and transfer entropy.

### Robust inference of biological networks

Next we establish the potential of our framework to reconstruct interactions in biological system settings. Specifically, we demonstrate results on two networked model biological systems: the glycolytic oscillator in yeast48 and the circadian clock in Drosophila49. The glycolytic oscillator, one of the classical examples of cellular oscillations, accounts for the main reactions of glycolysis. Here we focus on a model for anaerobic glycolytic oscillations in yeast, containing the influx of glucose and outflux of pyruvate and/or acetaldehyde48 (see Supplementary Note 5 for an extended description). The circadian clock underlies the biological response to the day–night cycle, and the oscillations it exhibits in Drosophila are driven by a negative feedback between two genes and the complex that is formed by the proteins they code for. The model equations for the circadian clock are based on ref. 49 (see Supplementary Note 5).

Employing the above approach of combining a dynamics space representation, expanding in suitable families of basis functions, and solving the resulting linear regression problem by an orthogonal least squares method, we reconstructed the interactions between the different components of the glycolytic oscillator (Fig. 7a, b) and the circadian clock (Fig. 7c, d) from transient dynamics toward their periodic orbits. As for the other systems’ settings, the results confirm that larger numbers M of observations improve the predictions. Moreover, the reconstruction quality of this method again exceeds that resulting from correlations, partial correlations, and transfer entropy.

## Discussion

We proposed a model-free framework for inferring direct interaction networks from only the time series of collective nonlinear system dynamics. First, defining the notion of explicit dependency matrices enabled us to systematically decompose each unit’s dynamics into pairwise, three-point, and higher-order interactions and at the same time treat present influences from one unit to another on the same footing independently of the interaction order. Second, by capturing the structure (but not necessarily the exact functional form) of the dynamical influences through appropriately chosen basis functions, we posed the reconstruction problem based on nonlinear dynamics as a mathematical regression problem with grouped variables. Given that the reconstructions of the sets of incoming connections to different units of the network are mathematically independent (despite using overlapping recorded dynamical data), the framework is scalable (see Supplementary Note 2) and computationally parallelizable for large networks. Reconstruction is robust across a wide range of dynamical regimes, combined pairwise and hypernetwork interactions, noise, and hidden units.

The main advantage of our framework is its minimal sampling conditions. For instance, in systems during transients to steady states (such as in gene regulation3,5,7) or periodic orbits (such as in glycolytic oscillations48), we reconstructed direct interactions without the need to know the actual strength or actual distributed patterns of perturbations from those states. In contrast to several previous studies1,11,13,50, our framework in general does not require applying external driving signals, and if a system is externally driven, e.g., to create transients, these signals need not be controlled; thus our framework might be suitable for systems not easily accessible for controlled driving or for external driving at all. Moreover, collections of very short time series, in practice potentially resulting from different experiments on the same system, are sufficient for reconstruction. In particular, collective dynamics that is transient, stochastically driven, or otherwise sufficiently complex helps reveal interactions, whereas certain stable dynamics on low-dimensional subsets of state space only sample limited regions of the dynamics space and thus in principle do not provide full information about network interactions. Lower-dimensional dynamics may in particular be induced by symmetries or other invariants represented by algebraic conditions, such as z(x) = 0. For instance, in systems evolving in synchronized states, the existence and directionality of interactions are impossible to extract from time series32. Furthermore, the number of independent measurements required for successful reconstruction grows linearly with the local number of interaction partners and sublinearly with the number of units in the network, providing an advantage for reconstructing large systems.
As we illustrated by examples, our framework may be easily combined with learning curves derivable from recorded data only and thus enables researchers to determine the accuracy of inferences when there is no ground truth available.

Previous studies on inferring the direct interaction structure from time series have focused on the reconstruction of networks with known local dynamics and coupling functions11,12,14,15,16,17,18. Such prior knowledge reduces the task to a standard linear algebra problem, where one has to solve linear systems of equations to reveal the network connections, cf. ref. 11 for a comprehensive review. Recent work on low-dimensional dynamical systems20, based on expanding the system dynamics in basis functions, requires the dynamics to admit a sparse representation in the proposed basis. Moreover, a work21 applying an extension of the method described in ref. 20 to models for gene regulation also suggests that such approaches scale supralinearly with the dimensionality of the network for both the number of candidate coupling functions and the time points necessary for successful reconstruction. The theory presented above requires neither prior knowledge of the parameters and coupling functions involved in the network dynamics nor that these functions admit a sparse representation in any basis chosen; it is not limited to low-dimensional networked systems, also because the number of time points necessary for successful reconstruction scales sublinearly with network size.

Taken together, this model-free, robust framework can be based on collections of short time series, noisy data, partially inaccessible units, and essentially arbitrary nonlinear dynamics and may thus enable the reconstruction of direct interaction networks from a new range of time series from coupled dynamical systems where no model is known.

## Methods

### Overview

To generate dynamical trajectories displaying transients toward steady states, we simulated dynamical systems employing Michaelis–Menten kinetics (Supplementary Note 4), systems frequently used to model gene regulation3,5,27. To generate dynamical trajectories exhibiting transients to periodic dynamics, we employed two biological model systems: (i) glycolytic oscillations in yeast48 and (ii) the circadian clock in Drosophila49 (Supplementary Note 5), which possess hypernetwork interactions, where two units jointly and directly influence a third such that their interaction function cannot be disentangled into sums of pairwise interactions. To study the effects of non-periodicity, we simulated networks of phase-coupled oscillators (Supplementary Note 4) whose coupling stems from a simple model of weakly coupled populations of biological neurons52,53,54. Finally, to test robustness against chaos and noise, we simulated networks of noisy and asynchronous Rössler oscillators (Supplementary Note 4), prototypical systems for studying chaos55.

In what follows, we provide a brief description of each model (see Supplementary Notes 4 and 5 for further details).

### Gene regulatory circuits

To simulate systems mimicking gene regulation, we simulated networks of dynamical systems having Michaelis–Menten kinetics3,27

$$\dot x_i = - x_i + \frac{1}{{n_i}}\mathop {\sum}\limits_{j = 1}^N J_{ij}\frac{{x_j}}{{1 + x_j}} + \xi _i,$$
(8)

having n i randomly-selected incoming connections per node. Here J ij of $$J \in {\Bbb R}^{N \times N}$$ represents a weighted and directed link from unit j to i.
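A minimal simulation sketch of Eq. (8) follows. The integration scheme (Euler–Maruyama with Gaussian white noise), the way n i is obtained from the non-zero entries of J, and all function and parameter names are assumptions for illustration; the simulation details actually used are given in Supplementary Note 4.

```python
import numpy as np

def michaelis_menten_rhs(x, J, n_in):
    """Deterministic part of Eq. (8): -x_i + (1/n_i) sum_j J_ij x_j / (1 + x_j)."""
    return -x + (J @ (x / (1.0 + x))) / n_in

def simulate(J, x0, dt, steps, noise=0.0, seed=0):
    """Euler-Maruyama integration (an assumed scheme); returns shape (steps + 1, N)."""
    rng = np.random.default_rng(seed)
    x0 = np.asarray(x0, dtype=float)
    # n_i = number of incoming connections; guard against division by zero.
    n_in = np.maximum((J != 0).sum(axis=1), 1)
    x = np.empty((steps + 1, len(x0)))
    x[0] = x0
    for m in range(steps):
        drift = michaelis_menten_rhs(x[m], J, n_in)
        kick = noise * np.sqrt(dt) * rng.normal(size=len(x0))
        x[m + 1] = x[m] + dt * drift + kick
    return x

# Hypothetical 5-unit network with random weights on a sparse mask.
rng = np.random.default_rng(2)
J = rng.uniform(0.5, 1.5, size=(5, 5)) * (rng.random((5, 5)) < 0.4)
trajectory = simulate(J, x0=rng.uniform(0.0, 2.0, size=5), dt=0.05, steps=200)
```

Several short trajectories started from different initial conditions can be generated by calling `simulate` repeatedly and stacking the results, mirroring the composition of short time series described in the Results.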

### Networks and hypernetworks of phase-coupled oscillators

To generate non-periodic dynamics, we simulated a model52 of phase-coupled oscillators with coupling functions having two Fourier modes

$$\dot x_i = \omega _i + \frac{1}{{n_i}}\mathop {\sum}\limits_{j = 1}^N J_{ij}\left[ {{\mathrm{sin}}\left( {x_j - x_i - 1.05} \right) + 0.33\,{\mathrm{sin}}\left( {2\left( {x_j - x_i} \right)} \right)} \right] + \xi _i,$$
(9)

with constant natural frequencies $$\omega_i$$.
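The right-hand side of Eq. (9) can be evaluated for all units at once by forming the matrix of pairwise phase differences. The following is a minimal sketch under that assumption; the names `coupling` and `phase_rhs` are hypothetical, and `n_in` is assumed to contain no zeros.

```python
import numpy as np

def coupling(d):
    """Two-mode coupling function of Eq. (9) at phase difference d."""
    return np.sin(d - 1.05) + 0.33 * np.sin(2.0 * d)

def phase_rhs(x, omega, J, n_in, xi=0.0):
    """Right-hand side of Eq. (9); x, omega, n_in have shape (N,), J has shape (N, N)."""
    diff = x[None, :] - x[:, None]     # diff[i, j] = x_j - x_i
    return omega + np.sum(J * coupling(diff), axis=1) / n_in + xi

# Assumed two-unit example: unit 1 drives unit 0, unit 1 receives no input.
x = np.array([0.0, np.pi / 2])
omega = np.array([1.0, 2.0])
J = np.array([[0.0, 1.0],
              [0.0, 0.0]])
n_in = np.array([1.0, 1.0])            # set to 1 where there are no inputs
dx = phase_rhs(x, omega, J, n_in)
```

Because the coupling function does not vanish at zero phase difference, self-interactions are excluded here only through the zero diagonal of `J`.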

We extended this model to hypernetworks of the form

$$\dot x_i = \omega _i + \frac{1}{{n_i}}\mathop {\sum}\limits_{j = 1}^N \mathop {\sum}\limits_{k = 1}^N E_{jk}^i\left[ {{\mathrm{sin}}\left( {x_j - x_k - 1.05} \right) + 0.33\,{\mathrm{sin}}\left( {2\left( {x_j - x_k} \right)} \right)} \right] + \xi _i.$$
(10)

In contrast to Eq. (9), here we introduce a second-order interaction matrix $$E^i \in {\Bbb R}^{N \times N}$$ for each unit $$i \in \{ 1,2, \ldots ,N\}$$. Specifically, the element $$E_{jk}^i$$ quantifies how strongly units j and k jointly and directly influence unit i.
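The double sum in Eq. (10) is a contraction of the matrices $$E^i$$ with the matrix of coupling values for all pairs (j, k). A minimal sketch follows, assuming the matrices are stacked into one array `E` of shape (N, N, N) with `E[i, j, k]` equal to $$E_{jk}^i$$; the function name `hyper_rhs` and the example values are hypothetical.

```python
import numpy as np

def hyper_rhs(x, omega, E, n_in, xi=0.0):
    """Right-hand side of Eq. (10); E[i, j, k] weights the joint action of j and k on i."""
    diff = x[:, None] - x[None, :]                        # diff[j, k] = x_j - x_k
    g = np.sin(diff - 1.05) + 0.33 * np.sin(2.0 * diff)   # coupling for every pair
    return omega + np.einsum("ijk,jk->i", E, g) / n_in + xi

# Assumed three-unit example: units 1 and 2 jointly act on unit 0.
N = 3
x = np.array([0.0, np.pi / 2, 0.0])
omega = np.array([1.0, 2.0, 3.0])
E = np.zeros((N, N, N))
E[0, 1, 2] = 1.0
n_in = np.ones(N)                                         # assumed normalization
dx = hyper_rhs(x, omega, E, n_in)
```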

### Networks of Rössler oscillators

To generate chaotic dynamics, we simulated networks of coupled Rössler oscillators55. The dynamics of each oscillator $${\boldsymbol{x}}_i = \left[ {x_i^1,x_i^2,x_i^3} \right] \in {\Bbb R}^3$$ is set by three differential equations

$$\dot x_i^1 = - x_i^2 - x_i^3 + \frac{1}{{n_i}}\mathop {\sum}\limits_{j = 1}^N J_{ij}\,{\mathrm{sin}}\left( {x_j^1} \right) + \xi _i^1,$$
(11)
$$\dot x_i^2 = x_i^1 + 0.1x_i^2 + \xi _i^2,$$
(12)
$$\dot x_i^3 = 0.1 + x_i^3\left( {x_i^1 - 18.0} \right) + \xi _i^3,$$
(13)

where $$\xi _i^k$$ with $$k \in \{ 1,2,3\}$$ represent external noisy signals acting on the unit’s components.
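Equations (11)–(13) can be written compactly by storing the three components of every oscillator as the columns of one array. The sketch below assumes this layout; the name `roessler_rhs` is hypothetical, and the noise is passed in as a precomputed array or scalar because its statistics are not specified here.

```python
import numpy as np

def roessler_rhs(X, J, n_in, xi=0.0):
    """Right-hand side of Eqs. (11)-(13); X has shape (N, 3) with columns x^1, x^2, x^3."""
    x1, x2, x3 = X[:, 0], X[:, 1], X[:, 2]
    dx1 = -x2 - x3 + (J @ np.sin(x1)) / n_in   # only the first component is coupled
    dx2 = x1 + 0.1 * x2
    dx3 = 0.1 + x3 * (x1 - 18.0)
    return np.stack([dx1, dx2, dx3], axis=1) + xi

# Assumed two-unit example: unit 1 drives unit 0.
X = np.array([[0.0, 0.0, 0.0],
              [np.pi / 2, 1.0, 2.0]])
J = np.array([[0.0, 1.0],
              [0.0, 0.0]])
n_in = np.array([1.0, 1.0])                    # set to 1 where there are no inputs
dX = roessler_rhs(X, J, n_in)
```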

### Glycolytic oscillator model

To test performance on biological model systems, we first simulated the glycolytic oscillator defined as48

$$\dot S_1 = J_0 - \frac{{k_1S_1S_6}}{{1 + (S_6{\mathrm{/}}K_1)^q}}$$
(14)
$$\dot S_2 = 2\frac{{k_1S_1S_6}}{{1 + (S_6{\mathrm{/}}K_1)^q}} - k_2S_2(N - S_5) - k_6S_2S_5$$
(15)
$$\dot S_3 = k_2S_2(N - S_5) - k_3S_3(A - S_6)$$
(16)
$$\dot S_4 = k_3S_3(A - S_6) - k_4S_4S_5 - \kappa (S_4 - S_7)$$
(17)
$$\dot S_5 = k_2S_2(N - S_5) - k_4S_4S_5 - k_6S_2S_5$$
(18)
$$\dot S_6 = - 2\frac{{k_1S_1S_6}}{{1 + (S_6{\mathrm{/}}K_1)^q}} + 2k_3S_3(A - S_6) - k_5S_6$$
(19)
$$\dot S_7 = \psi \kappa (S_4 - S_7) - kS_7$$
(20)

where $$S_1$$ represents the concentration of glucose, $$S_2$$ that of the glyceraldehyde-3-phosphate and dihydroxyacetone phosphate pool, $$S_3$$ that of 1,3-bisphosphoglycerate, $$S_4$$ that of the cytosolic pyruvate and acetaldehyde pool, $$S_5$$ that of NADH, $$S_6$$ that of ATP, and $$S_7$$ that of the extracellular pyruvate and acetaldehyde pool.
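Several rate expressions recur across Eqs. (14)–(20), which makes the hypernetwork structure easy to read off when the right-hand side is written in terms of these shared rates. The following is a minimal sketch; the function name `glycolysis_rhs`, the dictionary layout, and in particular the parameter values in the example are placeholders chosen for illustration, not the values used in this study (see Supplementary Note 5).

```python
import numpy as np

def glycolysis_rhs(S, p):
    """Right-hand side of Eqs. (14)-(20); S = (S1, ..., S7), p is a parameter dict."""
    S1, S2, S3, S4, S5, S6, S7 = S
    v1 = p["k1"] * S1 * S6 / (1.0 + (S6 / p["K1"]) ** p["q"])  # depends jointly on S1, S6
    v2 = p["k2"] * S2 * (p["N"] - S5)
    v3 = p["k3"] * S3 * (p["A"] - S6)
    v4 = p["k4"] * S4 * S5
    v5 = p["k5"] * S6
    v6 = p["k6"] * S2 * S5
    ex = p["kappa"] * (S4 - S7)                                # exchange with the exterior
    return np.array([
        p["J0"] - v1,
        2.0 * v1 - v2 - v6,
        v2 - v3,
        v3 - v4 - ex,
        v2 - v4 - v6,
        -2.0 * v1 + 2.0 * v3 - v5,
        p["psi"] * ex - p["k"] * S7,
    ])

# Placeholder parameters (NOT the published values), used only to exercise the function.
params = {"J0": 1.0, "k1": 1.0, "k2": 1.0, "k3": 1.0, "k4": 1.0, "k5": 1.0,
          "k6": 1.0, "k": 1.0, "kappa": 1.0, "psi": 1.0, "q": 2.0,
          "K1": 1.0, "N": 2.0, "A": 2.0}
dS = glycolysis_rhs(np.ones(7), params)
```

Products such as $$S_1S_6$$ in `v1` and $$S_2S_5$$ in `v6` are the three-point terms that cannot be written as sums of pairwise interactions.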

A second biological model system we have studied is the circadian clock, underlying the response to the day–night cycle. It is defined as49:

$$\dot M_P = v_{sP}\frac{{K_{IP}^n}}{{K_{IP}^n + C_N^n}} - v_{mP}\frac{{M_P}}{{K_{mP} + M_P}} - k_dM_P$$
(21)
$$\dot P_0 = k_{sP}M_P - V_{1P}\frac{{P_0}}{{K_{1P} + P_0}} + V_{2P}\frac{{P_1}}{{K_{2P} + P_1}} - k_dP_0$$
(22)
$$\dot P_1 = V_{1P}\frac{{P_0}}{{K_{1P} + P_0}} - V_{2P}\frac{{P_1}}{{K_{2P} + P_1}} - V_{3P}\frac{{P_1}}{{K_{3P} + P_1}} + V_{4P}\frac{{P_2}}{{K_{4P} + P_2}} - k_dP_1$$
(23)
$$\dot P_2 = V_{3P}\frac{{P_1}}{{K_{3P} + P_1}} - V_{4P}\frac{{P_2}}{{K_{4P} + P_2}} - k_3P_2T_2 + k_4C - v_{dP}\frac{{P_2}}{{K_{dP} + P_2}} - k_dP_2$$
(24)
(24)
$$\dot M_T = v_{sT}\frac{{K_{IT}^n}}{{K_{IT}^n + C_N^n}} - v_{mT}\frac{{M_T}}{{K_{mT} + M_T}} - k_dM_T$$
(25)
$$\dot T_0 = k_{sT}M_T - V_{1T}\frac{{T_0}}{{K_{1T} + T_0}} + V_{2T}\frac{{T_1}}{{K_{2T} + T_1}} - k_dT_0$$
(26)
$$\dot T_1 = V_{1T}\frac{{T_0}}{{K_{1T} + T_0}} - V_{2T}\frac{{T_1}}{{K_{2T} + T_1}} - V_{3T}\frac{{T_1}}{{K_{3T} + T_1}} + V_{4T}\frac{{T_2}}{{K_{4T} + T_2}} - k_dT_1$$
(27)
$$\dot T_2 = V_{3T}\frac{{T_1}}{{K_{3T} + T_1}} - V_{4T}\frac{{T_2}}{{K_{4T} + T_2}} - k_3P_2T_2 + k_4C - v_{dT}\frac{{T_2}}{{K_{dT} + T_2}} - k_dT_2$$
(28)
$$\dot C = k_3P_2T_2 - k_4C - k_1C + k_2C_N - k_{dC}C$$
(29)
$$\dot C_N = k_1C - k_2C_N - k_{dN}C_N$$
(30)

where $$M_T$$ and $$M_P$$ are the tim and per mRNAs, respectively; $$T_0$$, $$T_1$$, and $$T_2$$ are forms of the TIM protein; $$P_0$$, $$P_1$$, and $$P_2$$ are forms of the PER protein; and $$C$$ and $$C_N$$ are forms of the PER–TIM complex.
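Because the PER and TIM branches of Eqs. (21)–(30) have the same form, the right-hand side may be sketched with one helper that is applied to both branches. The code below is an illustration only: the names `mm`, `chain`, and `circadian_rhs` are hypothetical, all parameters are set to the placeholder value 1 (not the published values), and the exchange between $$C$$ and $$C_N$$ is assumed to conserve the complex, so that the nuclear export term $$k_2C_N$$ enters $$\dot C$$ with a positive sign as in the model of ref. 49.

```python
import numpy as np

PARAM_NAMES = ["vsP", "vsT", "vmP", "vmT", "vdP", "vdT", "KIP", "KIT", "KmP", "KmT",
               "KdP", "KdT", "ksP", "ksT", "V1P", "V2P", "V3P", "V4P", "V1T", "V2T",
               "V3T", "V4T", "K1P", "K2P", "K3P", "K4P", "K1T", "K2T", "K3T", "K4T",
               "k1", "k2", "k3", "k4", "kd", "kdC", "kdN", "n"]

def mm(v, K, s):
    """Michaelis-Menten rate v * s / (K + s)."""
    return v * s / (K + s)

def chain(M, X0, X1, X2, partner2, C, CN, p, s):
    """Rates for one mRNA/protein branch; s is 'P' (PER) or 'T' (TIM)."""
    g = lambda name: p[name + s]
    rep = g("KI") ** p["n"] / (g("KI") ** p["n"] + CN ** p["n"])   # repression by C_N
    f1, f2 = mm(g("V1"), g("K1"), X0), mm(g("V2"), g("K2"), X1)
    f3, f4 = mm(g("V3"), g("K3"), X1), mm(g("V4"), g("K4"), X2)
    bind = p["k3"] * X2 * partner2 - p["k4"] * C                   # net complex formation
    dM = g("vs") * rep - mm(g("vm"), g("Km"), M) - p["kd"] * M
    dX0 = g("ks") * M - f1 + f2 - p["kd"] * X0
    dX1 = f1 - f2 - f3 + f4 - p["kd"] * X1
    dX2 = f3 - f4 - bind - mm(g("vd"), g("Kd"), X2) - p["kd"] * X2
    return dM, dX0, dX1, dX2, bind

def circadian_rhs(y, p):
    """Right-hand side for y = (M_P, P_0, P_1, P_2, M_T, T_0, T_1, T_2, C, C_N)."""
    MP, P0, P1, P2, MT, T0, T1, T2, C, CN = y
    dMP, dP0, dP1, dP2, bind = chain(MP, P0, P1, P2, T2, C, CN, p, "P")
    dMT, dT0, dT1, dT2, _ = chain(MT, T0, T1, T2, P2, C, CN, p, "T")
    dC = bind - p["k1"] * C + p["k2"] * CN - p["kdC"] * C
    dCN = p["k1"] * C - p["k2"] * CN - p["kdN"] * CN
    return np.array([dMP, dP0, dP1, dP2, dMT, dT0, dT1, dT2, dC, dCN])

params = dict.fromkeys(PARAM_NAMES, 1.0)     # placeholder values only
dy = circadian_rhs(np.ones(10), params)
```

The product $$P_2T_2$$ in `bind` is the joint influence of two units on a third that makes this system a hypernetwork.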

### Data availability

All data reported in this study are available from the corresponding authors upon request. Example codes for simulating and reconstructing network dynamical systems may be found at https://github.com/networkinference/ARNI.

## References

1. Gardner, T. S., di Bernardo, D., Lorenz, D. & Collins, J. J. Inferring genetic networks and identifying compound mode of action via expression profiling. Science 301, 102–105 (2003).
2. Kaern, M., Elston, T. C., Blake, W. J. & Collins, J. J. Stochasticity in gene expression: from theories to phenotypes. Nat. Rev. Genet. 6, 451–464 (2005).
3. Karlebach, G. & Shamir, R. Modelling and analysis of gene regulatory networks. Nat. Rev. Mol. Cell Biol. 9, 770–780 (2008).
4. Fujita, A. et al. Modeling nonlinear gene regulatory networks from time series gene expression data. J. Bioinform. Comput. Biol. 6, 961–979 (2008).
5. Marbach, D. et al. Revealing strengths and weaknesses of methods for gene network inference. Proc. Natl. Acad. Sci. USA 107, 6286–6291 (2010).
6. Bar-Joseph, Z., Gitter, A. & Simon, I. Studying and modelling dynamic biological processes using time-series gene expression data. Nat. Rev. Genet. 13, 552–564 (2012).
7. Marbach, D. et al. Wisdom of crowds for robust gene network inference. Nat. Methods 9, 796–804 (2012).
8. Ronellenfitsch, H., Lasser, J., Daly, D. C. & Katifori, E. Topological phenotypes constitute a new dimension in the phenotypic space of leaf venation networks. PLoS Comput. Biol. 11, e1004680 (2015).
9. Kirst, C., Timme, M. & Battaglia, D. Dynamic information routing in complex networks. Nat. Commun. 7, 11061 (2016).
10. Cornelius, S. P., Kath, W. L. & Motter, A. E. Realistic control of network dynamics. Nat. Commun. 4, 1942 (2013).
11. Timme, M. & Casadiego, J. Revealing networks from dynamics: an introduction. J. Phys. A Math. Theor. 47, 343001 (2014).
12. Yu, D., Righero, M. & Kocarev, L. Estimating topology of networks. Phys. Rev. Lett. 97, 188701 (2006).
13. Timme, M. Revealing network connectivity from response dynamics. Phys. Rev. Lett. 98, 224101 (2007).
14. Shandilya, S. G. & Timme, M. Inferring network topology from complex dynamics. New J. Phys. 13, 013004 (2011).
15. Wang, W.-X., Yang, R., Lai, Y.-C., Kovanis, V. & Harrison, M. A. F. Time-series based prediction of complex oscillator networks via compressive sensing. Europhys. Lett. 94, 48006 (2011).
16. Wang, W. X., Yang, R., Lai, Y. C., Kovanis, V. & Grebogi, C. Predicting catastrophes in nonlinear dynamical systems by compressive sensing. Phys. Rev. Lett. 106, 154101 (2011).
17. Han, X., Shen, Z., Wang, W. X. & Di, Z. Robust reconstruction of complex networks from sparse data. Phys. Rev. Lett. 114, 028701 (2015).
18. Wang, W.-X., Lai, Y.-C. & Grebogi, C. Data based identification and prediction of nonlinear and complex dynamical systems. Phys. Rep. 644, 1–76 (2016).
19. Liu, Y.-Y. & Barabási, A.-L. Control principles of complex systems. Rev. Mod. Phys. 88, 035006 (2016).
20. Brunton, S. L., Proctor, J. L. & Kutz, J. N. Discovering governing equations from data by sparse identification of nonlinear dynamical systems. Proc. Natl. Acad. Sci. USA 113, 3932–3937 (2016).
21. Mangan, N. M., Brunton, S. L., Proctor, J. L. & Kutz, J. N. Inferring biological networks by sparse identification of nonlinear dynamics. IEEE Trans. Mol. Biol. Multiscale Commun. 2, 52–63 (2016).
22. Nitzan, M., Casadiego, J. & Timme, M. Revealing physical interaction networks from statistics of collective dynamics. Sci. Adv. 3, e1600396 (2017).
23. Granger, C. W. J. Investigating causal relations by econometric models and cross-spectral methods. Econometrica 37, 424–438 (1969).
24. Ren, J., Wang, W. X., Li, B. & Lai, Y. C. Noise bridges dynamical correlation and topology in coupled oscillator networks. Phys. Rev. Lett. 104, 058701 (2010).
25. Quinn, C. J., Coleman, T. P., Kiyavash, N. & Hatsopoulos, N. G. Estimating the directed information to infer causal relationships in ensemble neural spike train recordings. J. Comput. Neurosci. 30, 17–44 (2011).
26. Friston, K. J. Functional and effective connectivity: a review. Brain Connect. 1, 13–36 (2011).
27. Barzel, B. & Barabási, A.-L. Network link prediction by global silencing of indirect correlations. Nat. Biotechnol. 31, 720–725 (2013).
28. Feizi, S., Marbach, D., Médard, M. & Kellis, M. Network deconvolution as a general method to distinguish direct dependencies in networks. Nat. Biotechnol. 31, 726–733 (2013).
29. Guo, X., Zhang, Y., Hu, W., Tan, H. & Wang, X. Inferring nonlinear gene regulatory networks from gene expression data based on distance correlation. PLoS ONE 9, e87446 (2014).
30. Tirabassi, G., Sevilla-Escoboza, R., Buldú, J. M. & Masoller, C. Inferring the connectivity of coupled oscillators from time-series statistical similarity analysis. Sci. Rep. 5, 10829 (2015).
31. Ching, E. S. C. & Tam, H. C. Reconstructing links in directed networks from noisy dynamics. Phys. Rev. E 95, 010301 (2017).
32. Paluš, M. & Vejmelka, M. Directionality of coupling from bivariate time series: How to avoid false causalities and missed connections. Phys. Rev. E 75, 056211 (2007).
33. Nawrath, J. et al. Distinguishing direct from indirect interactions in oscillatory networks with multiple time scales. Phys. Rev. Lett. 104, 38701 (2010).
34. Zou, Y., Romano, M. C., Thiel, M., Marwan, N. & Kurths, J. Inferring indirect coupling by means of recurrences. Int. J. Bifurc. Chaos 21, 1099–1111 (2011).
35. Lin, W., Wang, Y., Ying, H., Lai, Y. C. & Wang, X. Consistency between functional and structural networks of coupled nonlinear oscillators. Phys. Rev. E 92, 012912 (2015).
36. Kaplan, U., Türkay, M., Biegler, L. & Karasözen, B. Modeling and simulation of metabolic networks for estimation of biomass accumulation parameters. Discret. Appl. Math. 157, 2483–2493 (2009).
37. Bekey, G. A. System Identification-An Introduction and a Survey 15 (Springer London, London, 1970).
38. Hastie, T., Tibshirani, R. & Friedman, J. The Elements of Statistical Learning, Springer Series in Statistics (Springer New York, NY, 2009).
39. Eldar, Y. C. & Mishali, M. Robust recovery of signals from a structured union of subspaces. IEEE Trans. Inf. Theory 55, 5302–5316 (2009).
40. Majumdar, A. & Ward, R. K. Fast group sparse classification. Can. J. Electr. Comput. Eng. 34, 136–144 (2009).
41. Eldar, Y. C., Kuppinger, P. & Bölcskei, H. Block-sparse signals: uncertainty relations and efficient recovery. IEEE Trans. Signal Process. 58, 3042–3054 (2010).
42. Duarte, M. F. & Eldar, Y. C. Structured compressed sensing: from theory to applications. IEEE Trans. Signal Process. 59, 4053–4085 (2011).
43. Judd, K. & Nakamura, T. Degeneracy of time series models: the best model is not always the correct model. Chaos 16, 033105 (2006).
44. Lünsmann, B. J., Kirst, C. & Timme, M. Transition to reconstructibility in weakly coupled networks. PLoS ONE 12, 1–12 (2017).
45. Opgen-Rhein, R. & Strimmer, K. From correlation to causation networks: a simple approximate learning algorithm and its application to high-dimensional plant gene expression data. BMC Syst. Biol. 1, 37 (2007).
46. Schreiber, T. Measuring information transfer. Phys. Rev. Lett. 85, 461–464 (2000).
47. Hirata, Y. & Aihara, K. Identifying hidden common causes from bivariate time series: a method using recurrence plots. Phys. Rev. E 81, 016203 (2010).
48. Wolf, J. & Heinrich, R. Effect of cellular interaction on glycolytic oscillations in yeast: a theoretical investigation. Biochem. J. 345, 321–334 (2000).
49. Leloup, J.-C. & Goldbeter, A. Chaos and birhythmicity in a model for circadian oscillations of the PER and TIM proteins in Drosophila. J. Theor. Biol. 198, 445–459 (1999).
50. Yeung, M. K., Tegner, J. & Collins, J. J. Reverse engineering gene networks using singular value decomposition and robust regression. Proc. Natl. Acad. Sci. USA 99, 6163–6168 (2002).
51. Yu, D. & Parlitz, U. Estimating parameters by autosynchronization with dynamics restrictions. Phys. Rev. E 77, 66221 (2008).
52. Hansel, D., Mato, G. & Meunier, C. Clustering and slow switching in globally coupled phase oscillators. Phys. Rev. E 48, 3470–3477 (1993).
53. Hansel, D., Mato, G. & Meunier, C. Phase dynamics for weakly coupled Hodgkin-Huxley neurons. Europhys. Lett. 23, 367–372 (2007).
54. Izhikevich, E. M. Dynamical Systems in Neuroscience: The Geometry of Excitability and Bursting (MIT Press, Cambridge, 2007).
55. Rössler, O. E. An equation for continuous chaos. Phys. Lett. A 57, 397–398 (1976).
56. Buhmann, M. D. Radial Basis Functions: Theory and Implementations (Cambridge University Press, Cambridge, 2003).

## Acknowledgements

We thank Fabio Schittler Neves, Benedict Lünsmann and Fenna Müller for useful discussions. M.T. thanks Albert Laszlo Barabasi for hospitality and useful discussions during a visit in March 2016. We acknowledge support by the German Research Foundation and the Open Access Publication Funds of the TU Dresden. This work is supported through the German Science Foundation (DFG) by a grant toward the Center of Excellence “Center for Advancing Electronics Dresden” (cfaed). We also gratefully acknowledge support from the Federal Ministry of Education and Research (BMBF Grant Nos. 03SF0472E and 03SF0472F) and the Max Planck Society.

## Author information

All authors conceived the research and contributed materials and analysis tools. J.C. and M.T. developed the theory and algorithms and designed the research. All authors provided model systems and the quality measures. J.C., M.N., and S.H. carried out the numerical experiments. All authors analyzed the data, discussed and interpreted the results, and wrote the manuscript.

Correspondence to Jose Casadiego or Marc Timme.

## Ethics declarations

### Competing interests

The authors declare no competing financial interests.

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
