Network inference from perturbation time course data

Sarmah, Deepraj; Smith, Gregory R.; Bouhaddou, Mehdi; Stern, Alan D.; Erskine, James; Birtwistle, Marc R.

doi:10.1038/s41540-022-00253-6

Download PDF

Article
Open access
Published: 01 November 2022

Network inference from perturbation time course data

npj Systems Biology and Applications volume 8, Article number: 42 (2022) Cite this article

2566 Accesses
5 Citations
4 Altmetric
Metrics details

Subjects

Abstract

Networks underlie much of biology from subcellular to ecological scales. Yet, understanding what experimental data are needed and how to use them for unambiguously identifying the structure of even small networks remains a broad challenge. Here, we integrate a dynamic least squares framework into established modular response analysis (DL-MRA), that specifies sufficient experimental perturbation time course data to robustly infer arbitrary two and three node networks. DL-MRA considers important network properties that current methods often struggle to capture: (i) edge sign and directionality; (ii) cycles with feedback or feedforward loops including self-regulation; (iii) dynamic network behavior; (iv) edges external to the network; and (v) robust performance with experimental noise. We evaluate the performance of and the extent to which the approach applies to cell state transition networks, intracellular signaling networks, and gene regulatory networks. Although signaling networks are often an application of network reconstruction methods, the results suggest that only under quite restricted conditions can they be robustly inferred. For gene regulatory networks, the results suggest that incomplete knockdown is often more informative than full knockout perturbation, which may change experimental strategies for gene regulatory network reconstruction. Overall, the results give a rational basis to experimental data requirements for network reconstruction and can be applied to any such problem where perturbation time course experiments are possible.

Three million images and morphological profiles of cells treated with matched chemical and genetic perturbations

Article Open access 09 April 2024

Srinivas Niranj Chandrasekaran, Beth A. Cimini, … Anne E. Carpenter

Inferring gene regulatory networks from single-cell multiome data using atlas-scale external data

Article Open access 12 April 2024

Qiuyue Yuan & Zhana Duren

Gene trajectory inference for single-cell data by optimal transport metrics

Article 05 April 2024

Rihao Qu, Xiuyuan Cheng, … Yuval Kluger

Introduction

Networks underlie much cellular and biological behavior, including transcriptional, protein-protein interaction, signaling, metabolic, cell-cell, endocrine, ecological, and social networks, among many others. As such, identifying and then representing their structure has been a focus of many for decades now. This is not just from experimental perspectives alone, but predominantly computational with a variety of statistical methodologies that integrate prior knowledge from interaction databases with new experimental data sets^{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24}. Alternatively, a variety of methods have investigated general ways to infer detailed reaction mechanisms—often a foundation of networks—from experimental data^{25,26,27,28,29}. Such tasks may be considered a subset of network inference.

Network structure is usually represented as either an undirected or a directed graph, with edges between nodes specifying the system. There are five main areas where current approaches to reconstructing networks struggle to capture important features of biological networks. The first is directionality of edges^6,8,30,31. Commonly employed correlational methods predominantly generate undirected edges, which impedes causal and other mechanistic analyses. Second is cycles. Cycles such as feedback or feedforward loops are nearly ubiquitous in biological systems and central to their function^32,33. This also includes an important type of cycle: self-regulation of a node, that is, an edge onto itself, which is rarely considered³⁴. Third is that biological networks are often dynamic. Two notable examples are circadian and p53 oscillators^35,36, where dynamics are key to biological function. Directionality and edge signs (i.e. positive or negative) dictate dynamics. Fourth is pinpointing how external variables impinge on network nodes. For example, is the effect of a growth factor on a network node direct, or though other nodes in the network? Fifth, the design and method employed should be robust to typical experimental noise levels. The experimental design and data requirements to uniquely identify the dynamic, directed and signed edge structures in biological networks containing all types of cycles and external stimuli remains a largely open but significant problem. Any such design should ideally be feasible to implement with current experimental technologies.

Modular Response Analysis (MRA) approaches, first pioneered by Kholodenko and colleagues in 2002^37,38 inherently deal with cycles and directionality by prescribing systematic perturbation experiments followed by steady-state measurements. The premise for data requirements is to measure the entire system response to at least one perturbation for each node. Thus, an n node system requires n experiments, if the system response can be measured in a global fashion (i.e. all nodes measured at once). The original instantiations struggled with the impact of experimental noise, but total least squares MRA and Monte Carlo sampling helped to improve performance^39,40,41. Incomplete and prior knowledge can be handled as well using both maximum likelihood and Bayesian approaches^42,43,44,45. However, these approaches are based on steady-state data, or fixed time point data, limiting abilities to deal with dynamic systems. There is a formal requirement for small perturbations, which are experimentally problematic and introduce issues for estimation with noisy data. Subsequent approaches have recommended the use of large perturbations as a trade off in dealing with noisy data, but the theory still formally requires small perturbations⁴¹. Lastly, there are two classes of biologically relevant edges that MRA does not comprehensively address. First is self-regulation of a node, which is often normalized (to -1) causing it to not be uniquely identifiable. The other are the effects of stimuli external to the network (basally present or administered) on the modeled nodes.

In addition to perturbations, another experimental design feature that can inform directionality is a time-series, which have also been integrated into MRA. This work^37,46 uses time-series perturbation data to uniquely infer a signed, directed network that can predict dynamic network behavior. In an n node open system (e.g. protein levels are not constant), multiple nodes would either be distinctly perturbed more than once, such as both production and degradation of a transcript, or phosphorylation and dephosphorylation of a protein, or the system monitored before and after the perturbation (with one perturbation per node). This can be experimentally challenging both in terms of scale and finding suitable distinct perturbations for a node. Moreover, as is often the case, noise in the experimental data severely limits inference accuracy (due to required estimation of 2^nd derivatives). Subsequent work⁴⁷ recommends smaller perturbations and difference in timepoints but also does not address noisy data. Further work has demonstrated that larger perturbations produce better results due to inevitable experimental noise⁴¹. Thus, there remains a need for methods that can infer signed, directed networks from feasible perturbation time course experiments that capture dynamics, can uniquely estimate edge properties related to self-regulation and external stimuli, and finally that function in the presence of typical experimental noise levels.

Here we describe a novel, MRA-inspired approach called Dynamic Least-squares MRA (DL-MRA). For an n-node system, n perturbation time courses are required, and thus experimental requirements scale linearly as the network size increases. The approach uses an underlying network model that captures dynamic, directional, and signed networks that include cycles, self-regulation, and external stimulus effects. We test DL-MRA using simulated time-series perturbation data with known network topology under increasing levels of simulated noise. The approach has good accuracy and precision for identifying network structure in randomly generated two and three node networks that contain a wide variety of cycles. For the investigated cases, we find between 7 to 11 evenly distributed time points yielded reasonable results, although we expect this will strongly depend on time point placement. We apply the approach to models describing a cell state switching network⁴⁸, a signal transduction network⁴⁹, and a gene regulatory network³². Although signaling networks are often a focus in network biology, our analysis suggests they have unique properties that render them generally recalcitrant to reconstruction. Results from the gene regulatory network application suggest that incomplete perturbation (e.g. partial knockdown vs. knockout) is more informative than complete inhibition. While challenges remain for expanding to other and larger systems, the proposed algorithm robustly infers a wide range of networks with good specificity and sensitivity using feasible time course experiments, all while making progress on limitations of current inference approaches.

Results

Formulation of sufficient experimental data requirements for network reconstruction

Consider a 2-node network with four directed, weighted edges (Fig. 1a). An external stimulus may affect each of the two nodes differently and its effect is quantified by S_1,ex and S_2,ex, respectively (e.g. Methods, Eq. (15)). We also allow for basal/constitutive production in each node (S_i,b). Let x_i(k) be the activity of node i at time point t_k. The network dynamics can be cast as a system of ordinary differential equations (ODEs) as follows

$$\frac{{dx_1}}{{dt}} \equiv f_1(x_1(k),x_2(k),S_{1,ex},S_{1,b}) \equiv f_1(k);\frac{{dx_2}}{{dt}} \equiv f_2(x_1(k),x_2(k),S_{2,ex},S_{2,b}) \equiv f_2(k).$$

(1)

The network edges can be connected to the system dynamics through the Jacobian matrix J^37,38,46,

$${{{\mathbf{J}}}} \equiv \left( {\begin{array}{*{20}{c}} {F_{11}} & {F_{12}} \\ {F_{21}} & {F_{22}} \end{array}} \right) \equiv \left( {\begin{array}{*{20}{c}} {{\textstyle{{\partial f_1} \over {\partial x_1}}}} & {{\textstyle{{\partial f_1} \over {\partial x_2}}}} \\ {{\textstyle{{\partial f_2} \over {\partial x_1}}}} & {{\textstyle{{\partial f_2} \over {\partial x_2}}}} \end{array}} \right)$$

(2)

The network edge weights (F_ij’s) describe how the activity of one node affects the dynamics of another node in a causal and direct sense, given the explicitly considered nodes (though not necessarily in a physical sense). In practice, however, causality can only be approached if every component of the system is included in the model, which is not typical (and even more so, there must be no model mismatch, which is almost impossible to guarantee)^{6,30,31,50,51}. In MRA, these nodes may be individual species or “modules”. In order to simplify a complex network it may often be separated into “modules” comprising smaller networks of inter-connected species with the assumption that each module is generally insulated from other modules except for information transfer through so-called communicating species³⁷. Cases where such modules may not be completely isolated are explored elsewhere⁵².

What experimental data are sufficient to uniquely estimate the signed directionality of the network edges and thus infer the causal relationships within the system? Fundamentally, we know that perturbations and/or dynamics are important for inferring causality^{6,37,46,51,52}. Consider a simple setup of three time-course experiments that each measure x₁ and x₂ dynamics in response to a stimulus (Fig. 1b–g). One time course is in the presence of no perturbation (vehicle), one has a perturbation of Node 1, and one has a perturbation of Node 2. Consider further that the perturbations are reasonably specific, such that the perturbation of x₁ has negligible direct effects on x₂, and vice versa, and that these perturbations may be large. Experimentally, this could be an shRNA or gRNA that is specific to a particular node, or that a small molecule inhibitor is used at low enough dose to predominantly inhibit the targeted node. A well-posed estimation problem can be formulated (see Methods) that, in principle, allows for unique estimation of the Jacobian elements as a function of time with the following set of linear algebra relations:

$$\left[ {\begin{array}{*{20}{c}} {y_1(t_{k + 1})} \\ {y_{1,2}^{}(t_k)} \end{array}} \right] = \left[ {\begin{array}{*{20}{c}} {\Delta _tx_1(t_{k + 1})} & {\Delta _tx_2(t_{k + 1})} \\ {\Delta _{p,2}^{}x_1(t_k)} & {\Delta _{p,2}^{}x_2(t_k)} \end{array}} \right]\left[ {\begin{array}{*{20}{c}} {F_{11}(t_k)} \\ {F_{12}(t_k)} \end{array}} \right]$$

(3)

$$\left[ {\begin{array}{*{20}{c}} {y_2(t_{k + 1})} \\ {y_{2,1}^{}(t_k)} \end{array}} \right] = \left[ {\begin{array}{*{20}{c}} {\Delta _tx_1(t_{k + 1})} & {\Delta _tx_2(t_{k + 1})} \\ {\Delta _{p,1}^{}x_1(t_k)} & {\Delta _{p,1}^{}x_2(t_k)} \end{array}} \right]\left[ {\begin{array}{*{20}{c}} {F_{21}(t_k)} \\ {F_{22}(t_k)} \end{array}} \right]$$

(4)

Here, y_i,j refers to a measured first-time derivative of node i in the presence of node j perturbation (if used), and Δ to a difference with respect to perturbation (subscript p) or time (subscript t) (see Methods). Since we do not use data from the perturbation of node i for estimation of node i edges, we do not have to impose assumptions on how the perturbation functionally acts on the system dynamics (see Methods). Moreover, constraints on the perturbation strength can be relaxed, following recent recommendations⁴¹ (although accuracy of the underlying Taylor series approximation can affect estimation—see Methods). If these measurements with and without perturbations were each taken in their steady state as is done in MRA, the solution for F_ij would be trivial. MRA gets around this by normalizing self-regulatory parameters F_ii to -1. Using dynamic data allows unique estimation of self-regulatory parameters without such normalization. Estimation of the node-specific stimulus strengths or basal production rates (S’s) requires evaluation after specific functional assumptions, but in general these effects are knowable from the data to be generated (see Methods and below results).

Note that this formulation is generalizable to an n dimensional network. With n² unknown parameters in the Jacobian matrix, n equations originate from the vehicle perturbation and n−1 equations originate from each of the n perturbations (discarding equations from Node i with Perturbation i). This results in $n + n^\ast (n - 1) = n + n^2 - n = n^2$ independent equations.

$$\begin{array}{l}S_{1,b} = - \left(F_{11}x_{1,ss} + F_{12}x_{2,ss}\right)\\ S_{2,b} = - \left(F_{21}x_{1,ss} + F_{22}x_{2,ss}\right)\end{array}$$

Using sufficient simulated data to reconstruct a network

As an initial test of the above formulation, we used a simple 2 node, single activator network where Node 1 activates Node 2, one node has first-order degradation (-1 diagonal elements), and the other has negative self-regulation (-0.8 diagonal) (Fig. 1a—see Methods for equations). A stimulus at t = 0 (time-invariant; S_,ex = 1) increases the activity of each node, which we sample with an evenly spaced 11-point time course. This simulation was done for no perturbation (i.e. vehicle) and for each perturbation (Node 1 and Node 2) to generate the necessary simulation data per the theoretical considerations above (Fig. 1c, e, g, left panel). Here, we modeled perturbations as complete inhibition; for example, a perturbation of Node 1 makes its value 0 at all times. Solving Eqs. (3) and (4) to infer the Jacobian elements at each time point yielded good agreement between the median estimates and the ground truth values (Fig. 1h, “Analytic Solution”, No Noise). Using the node activity data corresponding to the last time point in the time course and the median estimates of Jacobian elements, the external stimuli S_1,ex and S_2,ex were also determined (Eqs. (18) and (19)) and reasonably agree with the ground truth values.

How does this approach fare when data are noisy? We performed the estimation with the same data but with a relatively small amount of simulated noise added (10:1 signal-to-noise—Fig. 1c, e, g). The resulting estimates are neither accurate nor precise, varying on a scale more than ten times greater than each parameter’s magnitude with median predictions both positive and negative regardless of the ground truth value (Fig. 1i). The stimulus strengths S_1,ex and S_2,ex are estimated to be negative, while the ground truth is positive.

Although the analytic equations suggest the sufficiency of the perturbation time course datasets to uniquely estimate the edge weights, in practice even small measurement noise corrupts estimates obtained from direct solution of these equations. Therefore, we considered an alternative representation by employing a least squares estimation approach rather than solving the linear equations directly. For a given set of guesses for edge weight and stimulus parameters, one can integrate to obtain a solution for the dynamic behavior of the resulting model, which can be directly compared to data in a least-squares sense. Least squares methods were shown to improve traditional MRA-based approaches^39,40, but had never been formulated for such dynamic problems. Two hurdles were how to model the effect of a perturbation without (i) adding additional parameters to estimate or (ii) requiring strong functional assumptions regarding perturbation action. We solved these here by using the already-available experimental measurements within the context of the least-squares estimation (see Methods). We applied this approach to the single activator model, 10:1 signal-noise ratio case above where the analytic approach failed. This new estimation approach was able to infer the network structure accurately and precisely (Fig. 1j). We conclude that analytic formulations can be useful for suggesting experimental designs that should be sufficient for obtaining unique estimates for a network reconstruction exercise, but in practice directly applying those equations may not yield precise nor accurate estimates. Alternatively, using a least-squares formulation seems to work well for this application.

Reconstruction of random 2 and 3 node networks

To investigate the robustness of the least-squares estimation approach, we applied it to increasingly complex networks with larger amounts of measurement noise and smaller numbers of time points (Fig. 2). We focused on 2 and 3 node networks. We generated 50 randomized 2 and 3 node models, where each edge weight is randomly sampled from a uniform distribution over the interval [−2, 2], and the basal and external strength from [0, 2] (Fig. 2a, Supplementary Figs. S1a and S2a). Each random network was screened for stability. Many networks (29/50 for 2 node and 3 node) displayed potential for oscillatory behavior (non-zero imaginary parts of the eigenvalues of the Jacobian matrix). However, since the real parts of the eigenvalues are non-zero and negative, these oscillations should dampen over time, and no sustained oscillatory behavior was analyzed. For each random model, we generated a simulated dataset based on the prescribed experimental design, using complete inhibition as the perturbation. We considered evenly-spaced sampling within the time interval of 0-10 AU (approximate time to reach steady-state—Supplementary Figs. S1b and S2b) with different numbers of time points (3, 7, 11 and 21), and added 10:1 signal-to-noise, 5:1 signal-to-noise, and 2:1 signal-to-noise to the data. Non-uniform time point spacing may change inference results but that was not explored at these first investigations.

**Fig. 2: Application to linear two and three node models.**

For each random network model, number of time points, and noise level, we evaluated the fidelity of the proposed reconstruction approach in terms of signed directionality (Fig. 2c–f). We overall found reasonable agreement between inferred and ground truth values, even at the higher noise levels and low number of timepoints. Expectedly, the overall classification accuracy increases with more time points and decreases with higher noise levels. But, surprisingly, even in the worst case investigated of 3 timepoints and 2:1 signal-to-noise ratio, classification accuracy was above 85% for 2 node models and 70% for 3 node models. Increasing the number of nodes decreases performance, with 3-node reconstruction being slightly worse than 2-node reconstruction, other factors held constant.

We wondered whether the magnitude of an edge weight influenced its classification accuracy, since small edge weights may be more difficult to discriminate from noise. We found that edge weights with greater absolute values, which are expected to have a greater influence on the networks, were more likely to be classified correctly (Supplementary Figs. S1c–f and S2c–f). Also, for models with damped oscillatory behavior, the classification accuracy is very similar to that of all 50 random models (Supplementary Fig. S3a, b).

How does this method compare to similar network reconstruction methods? There are limited methods to compare to which also use dynamic data and sequential perturbations. MRA³⁷, from which this method was inspired, uses steady-state data. However, we could use MRA methods requiring dynamic perturbation data as is used in our method^46,53.To compare, we further generated another set of perturbation data with 50% perturbation (as opposed to 100%). We then used the two sets of perturbation data to estimate the network node edges with dynamic modular response analysis (Fig. 2g). Even in absence of noise, for low to medium numbers of timepoints (3-11) the network is not always accurately inferred (Fig. 2g). In the presence of noise, DL-MRA performs better, although the difference between the two methods becomes lower at high number of timepoints. Thus, DL-MRA not only outperforms with half the data, but it also estimates 6 additional parameters-basal production and external stimulus for each node. Although Cho’s approach⁴⁷ builds upon MRA methods by recommending smaller time point intervals and smaller perturbations, for our purposes, the time intervals and perturbations are fixed and this would not affect the results obtained here. Moreover, further work has actually recommended larger perturbations while dealing with noisy data⁴¹.

To explore a scenario where data from a node might be unavailable, we removed the data from one of the nodes in the 50 random 3 node models and used the remaining data to reconstruct a 2-node system (Supplementary Fig. S4). Comparing with corresponding model parameters in the 3 node system, we find a good but expectedly reduced classification accuracy (No Noise-94.75%, 10:1 Signal: Noise-93.75%, 5:1 Signal: Noise-91.25%, 2:1 Signal: Noise-87).

A part of the inference process is performing parameter estimation using multiple starting guesses (i.e. multi-start), and we wanted to determine how robust the estimated parameters were across the multi-start processes. We looked at the distribution of coefficient of variation (CV) among the parameters in the multi-start results in the 50 random 3 node models where either the data generated from the estimated parameters had low sum of squared errors (SSE) compared to the original data (<10⁻⁴) or with SSE less than twice the minimum SSE. We find that the CVs peak around zero and generally have a small spread, especially for low noise scenarios (Supplementary Fig. S5). This implies a good convergence of the parameter sets obtained through multi-start.

We conclude that the network parameters of 2 and 3 node systems can be robustly and uniquely estimated using DL-MRA. However, these were ideal conditions where there was no model mismatch that is expected in specific biological applications. How does DL-MRA perform when applied to data reflective of different biological use cases?

Application to cell state networks

Cell state transitions are central to multi-cellular organism biology. They are commonly transcriptomic in nature and underlie development and tissue homeostasis and can also play roles in disease, such as drug resistance in cancer^{48,54,55,56,57,58,59,60,61}. Could DL-MRA reconstruct cell state transition networks? As the application, we use previous data on SUM159 cells that transition between luminal, basal and stem-like cells⁴⁸. Pure populations of luminal, basal and stem-like cells eventually grow to a stable final ratio amongst the three. The authors used a discrete time Markov transition probability model to describe the data and estimate a cell state transition network (Fig. 3a). Thus, we seek to compare DL-MRA to such a Markov model in this case.

**Fig. 3: Application to cell state transition networks.**

We hypothesized that perturbations to the system in this case, in contrast to above, did not have to change node activity (i.e. edges). Rather, we thought that perturbing the equilibrium cell state distribution could serve an equivalent purpose. Thus, the data for reconstruction consisted of observing the cell state proportions evolve over time from “pure” populations (Fig. 3b), in addition to equal proportions. DL-MRA is capable of explaining the data (Fig. 3b). Interpretation of the estimated network parameters to DL-MRA depends on the transformation of the original discrete time Markov probabilities to a continuous time formulation (see Methods—there are constraints on self-regulatory parameters), but DL-MRA correctly infers the cell state transition network as well (Fig. 3c). Conveniently, DL-MRA is not constrained to 1-day time point spacing as is the original discrete time Markov model.

How does noise and the number of timepoints affect the reconstruction? As above, we generated data for 50 random cell state transition models with 3, 7, 11 and 21 timepoints within 5 days, as the models generally seemed to reach close to equilibrium within 5 days. Noise levels of 10:1, 5:1 and 2:1 were used. All parameters were classified accurately (Fig. 3d) (although additional constraints in the estimation—see Methods—facilitates this classification performance). With 3 timepoints, there was deviation from perfect fit even with no noise in the data. At 7 and higher number of timepoints, the estimates matched ground truth well, and noise expectedly reduced the accuracy (Fig. 3d). We conclude that DL-MRA can robustly infer cell state networks given perturbation data in the form of non-equilibrium proportions as initial conditions.

Application to intracellular signaling networks

How does the method perform for intracellular signaling networks? The Huang–Ferrell model⁴⁹ (Fig. 4a) is a well-known intracellular signaling pathway model and has been investigated by different reconstruction methods, including previous versions of MRA^{37,39,41,46,62}. It captures signal flux through a three-tiered MAPK cascade where the 2^nd and 3^rd tier contain two phosphorylation sites. An important aspect of the Huang–Ferrell model is that although the reaction scheme is a cascade and without obvious feedbacks, there may be hidden feedbacks due to sequestration effects and depending on how the perturbations were performed.

**Fig. 4: Application to a signaling network.**

In order to reconstruct the Huang–Ferrell MAPK network through DL-MRA, we first simplified it to a three-node model with p-MAPKKK, pp-MAPKK and pp-MAPK as observable nodes, as is typical for reconstruction efforts (Fig. 4b)^{37,39,41,46,52,62}. Second, to model perturbations, we sequentially perturbed the activation parameters of each of the observable species (k3, k15 and k27 respectively). Such perturbations, although hard to achieve experimentally, are important because modules must be “insulated” from one another and perturbations must be specific to the observables^37,52. Even specific inhibitors do not have such kinetic specificity. Third, in the simplification of the reaction scheme, the observables are shown to influence each other but in the actual scheme, they conduct their effects through the unphosphorylated and semi-phosphorylated species. We sought to keep the levels of these two species relatively constant between different perturbations, so that they wouldn’t add to non-linearities in the estimation. Therefore, we used a stimulus which only activated the observables to a maximum of about 5% of the total forms of the protein⁵².

Estimation with DL-MRA under the above conditions fits the data (Fig. 4c) and predicts positive node edges down the reaction cascade (F₂₁, F₃₂), negligible direct relation between p-MAPKKK and pp-MAPK (F₁₃, F₃₁), negative self-regulation of each of the observables (F₁₁, F₂₂, F₃₃) negative feedbacks from pp-MAPKK to p-MAPKKK (F₁₂) and from pp-MAPK to pp-MAPKK (F₂₃), and negligible external stimuli on pp-MAPK to pp-MAPKK (F₁₃, F₃₁). All these effects are consistent with the reaction scheme. The negative feedback effects, although not immediately obvious, are consistent with ground truth sequestration effects. For instance, pp-MAPK has an overall negative effect on pp-MAPKK as the existence of pp-MAPK lowers the amount of species MAPK and p-MAPK which sequester pp-MAPKK and makes it avoid deactivation by its phosphatase.

How do the estimation results for the Huang–Ferrell model in our method compare with those obtained from other methods? Previous work using MRA also reported negative feedbacks from successive modules to the preceding ones^37,46,52. Similarly, self-regulation parameters in most preceding MRA based methods are also estimated to be negative but are fixed at -1^37,39,52.

Besides MRA inspired methods, SELDOM is another network reconstruction method which can also deal with dynamic data⁶². SELDOM is a data-driven method which uses ensembles of logic based dynamic models followed by training and model reduction steps to predict state trajectories under untested conditions. However, when dealing with the Huang–Ferrell network, the true value model of SELDOM does not map the effects of self-regulation, nor feedback effects between nodes (Fig. 4e). This may be explained by the fact that although SELDOM uses an extensive number of models to test the data obtained from multiple different stimuli, perturbation data was not included to test the Huang–Ferrell Model. This implies that systematic perturbation of each of the nodes, as prescribed by MRA-based methods, are necessary in order to unearth feedbacks and self-regulation effects.

Although application of DL-MRA to the Huang–Ferrell model was able to unearth latent network structure, the simulation conditions were restrictive. First, the perturbation scheme chosen in this paper, although specifically targeted at the observable species, is hard to produce experimentally. In practice, knock-down/out, overexpression, or specific inhibitors could be used as suitable perturbations, but do not have the preciseness needed to be compatible with MRA-imposed constraints. The feedback effect observed could depend on the perturbation scheme chosen-for instance knockdown of an entire module as a perturbation would likely have manifested as positive feedback to the preceding module. That is because such a knockdown would have reduced the effect of sequestration of the module on the preceding observable and would have made it more available for dephosphorylation. Second, we assumed a low stimulus to avoid effects from the unphosphorylated version of the proteins. A higher activation may increase non-linearities adding to the complexity of the model, whereas a lower stimulus may not activate enough proteins to be well detected in experiments. The degree of activation needed for an experiment may be hard to predict beforehand. Such specific perturbations and stimulus had to be done to reduce the effects arising from the non-observable species behavior. Hence application of DL-MRA to intracellular signaling networks with multiple physical interactions needs to be carefully considered before modeling or experiments.

Application to gene regulatory networks: partial perturbations are more informative than full perturbations

Here, we applied DL-MRA further to a series of well-studied non-linear feed forward loop (FFL) gene regulatory network models that have time-varying Jacobian elements (Fig. 5a, Table 1)^32,33. Such FFL motifs are strongly enriched in multiple organisms and are important for signaling functions such as integrative control, persistence detection, and fold-change responsiveness^63,64,65.

**Fig. 5: Application to 16 non-linear gene regulatory networks: no noise.**

Table 1 Structure of each of the 16 non-linear models.

Full size table

The FFL network has three nodes (x₁, x₂, and x₃), and the external stimulus acts on x₁ (S_1,ex). There is no external stimulus on x₂ and x₃; however, there may be basal production of x₂ (S_2,b) and x₃ (S_3,b)_,. Each node exhibits first-order decay (F_ii = −1). The parameters F₁₂, F₁₃, and F₂₃ represent connections that do not exist in the model; we call these null edges, but we allow them to be estimated. The relationship between x₁ and x₂ (F₂₁), between x₁ and x₃ (F₃₁), or between x₂ and x₃ (F₃₂) can be either activating or inhibitory. Furthermore, x₁ and x₂ can regulate x₃ through an “AND” gate (both needed) or an “OR” gate (either sufficient) (Fig. 5a). These permutations give rise to 16 different FFL structures (Table 1).

To generate simulated experimental data from these models, we first integrated the system of ODEs starting from a zero initial condition to find the steady state in the absence of stimulus. We then introduced the external stimulus and integrated the system of ODEs (see Methods) to generate time series perturbation data consistent with the proposed reconstruction algorithm, using full inhibitory perturbations. We used 11 evenly spaced timepoints for all 16 non-linear models, based on the random 3-node model analysis above, and also added noise as above.

We first noticed that even in the absence of added noise, a surprising number of inferences were incorrect (Fig. 5b, f). Model #1 (Table 1, Fig. 5b, c) is used as an example, where F₂₁, F₃₁ and F₃₂ are activators with an AND gate, and F₃₁ is incorrectly predicted as null (Fig. 5b—compare ground truth to 100% inhibition). To understand the reason for the incorrect estimation, we looked at the node activity dynamics across the perturbation time courses (Fig. 5d). All three nodes start from an initial steady state of zero, but Node 3 is zero for all three perturbation cases. This is because of the following. Since x₁ is required for the activation of x₂ and x₃, complete inhibition of x₁ completely blocks both x₂ and x₃ activation. But, because both x₁ and x₂ are required for the activation of x₃, completely inhibiting x₂ activity also completely inhibits x₃. Thus, given this experimental setup, it is impossible to discern if x₁ directly influences x₃ or if it acts solely through x₂.

We thus reasoned that full inhibitory perturbation may suppress the information necessary to correctly reconstruct the network, but that a partial perturbation experiment may contain enough information available to make a correct estimate. If this were true, then upon applying partial perturbations (we chose 50% here), Node 3 dynamics should show differences across the perturbation time courses. Simulations showed that this is the case (Fig. 5e). Subsequently, we found that for partial perturbation data, F₃₁ is correctly identified as an activator. More broadly, we obtain perfect classification from noise-free data across all 16 FFL networks when partial perturbation data are used, as opposed to 5/16 networks having discrepancy with full perturbation data (Fig. 5f). The fits to simulated data from the reconstructed model align very closely, despite model mismatch (Supplementary Fig. S6). We conclude that in these cases of non-linear networks, a partial inhibition is necessary to estimate all the network parameters accurately. Thus, moving forward, we instead applied 50% perturbation to all simulation data and proceeded with least squares estimation.

Application to gene regulatory networks: performance

The above analysis prompted us to use a partial (50%) perturbation strategy, since it classified each edge for each model in the absence of noise correctly. What classification performance do we obtain in the presence of varying levels of experimental noise? We first devised the following strategy to assess classification performance. We generated 50 bootstrapped datasets for each network structure/signal-to-noise pair, and thus obtained 50 sets of network parameter estimates. To classify the network parameters, we used a symmetric cutoff of a percentile window around the median of these 50 estimates (Fig. 6a). We illustrate this approach with three different example edges and associated estimates, one being positive (Edge 1), one being negative (Edge 2), and one being null (Edge 3). Given the window of values defined by the percentile cutoff being chosen, if the estimates in this window are all positive, then the network parameter would be classified as positive. Similarly, if the estimates in this window are all negative, then the parameter would be classified as negative. Finally, if the estimates in the window cross zero (i.e. span both positive and negative terms), then it would be classified as null. First, consider the case that the percentile window is just set at the median with no percentile span. Then, the classifications for true positives and negatives are likely to be accurate while the null parameters are likely to be incorrectly categorized as either positive or negative (Fig. 6a). If we increase the percentile window span slightly (e.g. between the 40th and 60^th percentile, middle panel), we can categorize null edges better, while maintaining good classification accuracy of both true positive and negative edges. However, if we relax the percentile window too much, (e.g. between the 10th and 90^th percentile, far right panel) we may categorize most parameters as null, including the true positive and negatives. Thus, it is clear there will be an optimal percentile cutoff that maximizes true positives and minimizes false positives as the threshold is shifted from the median to the entire range.

**Fig. 6: Application to 16 non-linear gene regulatory networks: including noise.**

Now, we applied this classification strategy to the 16 FFL model estimates from data with different noise levels. We varied the percentile window from the median only (50) to the entire range of estimated values (100) and calculated the true and false positive rates for all edges across all 16 FFL models, which allowed generation of receiver operator characteristic (ROC) curves (Fig. 6b). For each noise level, we chose the percentile window that yielded a 5% false positive rate (13-87 percentile for 10:1 Signal:Noise, 19-81 percentile for 5:1 Signal:Noise, and 21-79 percentile for 2:1 Signal:Noise). Using this simple cutoff classifier, we observed good classification performance across all noise levels according to traditional area under the ROC curve metrics (10:1 AUC = 0.99, 5:1 AUC = 0.9, 2:1 AUC = 0.92).

How does classification accuracy break down by FFL model and edge type? To evaluate the performance for each of the 16 FFL cases, we calculated the fraction of the 12 links in each FFL model that was classified correctly as a function of signal-to-noise, given the percentile windows determined above (Fig. 6c). We also looked at the fraction of the 16 models where each of the 12 links were correctly classified (Fig. 6d). Perfect classification is a value of one, which is the case for no noise, and for many cases with 10:1 signal-to-noise.

In general, as noise level increases, prediction accuracy decreases, as expected. Although for some models and parameters, performance at 2:1 signal-to-noise is poor, in some cases it is surprisingly good. This suggests that the proposed method can yield information even in high noise cases; this information is particularly impactful for null, self-regulatory, and stimulus edges. High noise has strong effects on inference of edges that are either distinct across models, time variant or reliant on other node activities (F₂₁, F₃₁, F₃₂) (Fig. 6c, d, Supplementary Fig. S7). F_21, which is reliant on activity of x₁, is inferred better than F₃₁ and F₃₂. This may be caused by the fact that x₃ dynamics depend on both x₁ and x₂, whereas x₂ dynamics only depend on x₁.

Comparing across models, we find that Models 1-8 are reconstructed slightly better than Models 9-16 (Fig. 6c) when noise is high. This performance gap is predominantly caused by S_3,b misclassification—basal production of Node 3 (Supplementary Fig. S7). What is the reason for the possible misclassification of S_3,b in Models 9-16? We know that S_3,b depends on the initial values of x₁, x₂ and x₃ and the estimated values of F₃₁, F₃₂ and F₃₃ (See Methods, Eq. (19)). For Models 1-8, x₁(t = 0) and x₂(t = 0) are both zero and therefore S_3,b is effectively only dependent on estimated value of F₃₃ and x₃(t = 0) (Supplementary Fig. S6 and Methods). But for Models 9-16, x₂(t = 0) is non-zero and S_3,b is dependent on the estimated values of both F₃₂ and F₃₃, in addition to x₂(t = 0) and x₃(t = 0), which increases the variability of S_3,b estimates. Therefore with high levels of noise, S_3,b is more likely to be mis-classified in Models 9-16, whereas this does not happen in Models 1-8 (Fig. 6c, d, Supplementary Fig. S7). In the future, including stimulus and basal production parameters in the least squares estimations themselves, rather than further deriving algebraic relations to estimate them, will likely help improve reliability.

We conclude that (i) when dealing with non-linear gene regulatory networks, complete perturbations such as genetic knockouts may fundamentally impede one’s ability to deduce network architecture and (ii) this class of non-linear networks can be reconstructed with reasonable performance using the proposed strategy employing partial perturbations.

Discussion

Despite intensive research focus on network reconstruction, there is still room to improve discrimination between direct and indirect edges (towards causality), particularly when biologically-ubiquitous feedback and feedforward cycles are present that stymie many statistical or correlation-based methods, and given that experimental noise is inevitable. The presented DL-MRA method prescribes a realistic experimental design for inference of signed, directed edges when typical levels of noise are present. It allows estimation of self-regulation edges as well as those for basal production and external stimuli. For 2 and 3 node networks, the method can successfully handle random linear networks, cell state transition networks, and gene regulatory networks, and, under certain limiting conditions, signaling networks. Prediction accuracy improved with more timepoints, which in our case accounted for more relevant dynamic data. However, we would like to stress that here we did not explore time point placement, which likely underlies the performance increase rather than simply number of timepoints. Prediction accuracy was strong in many cases even with simulated noise that exceeds typical experimental variability (2:1 signal-to-noise). The method presented here is quite general and could be applied not only to cell and molecular biology, but also vastly different fields where perturbation time course experiments are possible, and where network structures are important to determine.

One type of non-linear model that we did not investigate is one with sustained oscillations, such as those found in the cell cycle⁶⁶, or sometimes even MAPK signaling pathways^67,68,69. We found that in our application to general two and three node linear models, DL-MRA could reconstruct multiple networks that have damped oscillatory behavior (Fig. 1b). However, we expect time point measurement selection and frequency to be much more important for inferring networks that give rise to sustained oscillations, with time points comprehensively covering peaks and troughs, and the frequency high enough to do so. We do expect that the method could infer the structure of such networks given appropriate sampling, but this requires a much deeper investigation.

MRA and its subsequent methods allow for inference of direct edges by prescribing systematic perturbation of each node^{37,39,41,43,45} and the idea of directionality has been followed through in DL-MRA. Often, such edge directness is equated to causality, but this is not necessarily the case, especially when the entire system is not explicitly represented. In practice, the causality and strength of an edge may be dependent on how well the model represents the underlying phenomenon and might be affected by simplification of larger networks, non-linearities in the actual model and even by noise in the data. Secondly, in discussions about causal system inferences, consideration of the counterfactuals is important^30,31,50,51. For a network of nodes going through dynamics, the counterfactuals to intrinsic network edges causing the dynamics would be the environmental factors extrinsic to the network edges. In DL-MRA, by evaluating external stimuli and basal production as well as the network edges, we have mapped some counterfactuals to node dynamics, thus presenting a more complete map of the causal factors to the network dynamics compared to methods which only show network edges. This also allows for a concise mapping of the environmental contexts in which the network edges are reconstructed.

Application of DL-MRA could reconstruct cell state transition networks based on discrete time Markov transition models, with the added benefit of not being constrained to specific time intervals. It can also successfully handle noisy data. The additional constraints in DL-MRA in the context of cell state transitions (summations of transition rates—see Methods) implies that the underlying network may be estimated even with less data requirements than in other cases. This method can be a useful tool to model cell state transitions and predict cell state. Perturbations were modeled as a difference in initial states, and that worked well in this case, suggesting that such modeling of perturbations may work in other cell state transition or biological networks.

Although application of DL-MRA to an intracellular signaling network (Huang–Ferrell MAPK) was able to explain its ground truth, including feedback due to sequestration, the method was constrained to specific, difficult-to-implement perturbations and a low stimulus which may not always be feasible experimentally. Specific inhibitors could be a source of perturbation, but even they influence more kinetic parameters than was required here for a clean solution. In MRA, a larger reaction scheme is often simplified into modules with one species in the module representing the activity of the module. But often, the activity of the other species in the module is implicit and becomes significant in dictating how perturbations and stimulus affect the network dynamics. Moreover, the type of perturbation chosen, such as specific inhibitors versus knock-down, also may yield different network inference results. Therefore, the use of MRA methods on simplified large intracellular signaling networks, especially while dealing with experiments, have significant caveats that should be carefully considered^41,70.

Although complete inhibition is often used for perturbation studies of gene regulatory networks (e.g. CRISPR-mediated gene knockout), we found that partial inhibition is important to fully reconstruct the considered non-linear gene regulatory networks. It is important to distinguish here, however, small perturbations vs. partial perturbations. Small perturbations are formally recommended for both MRA and other techniques⁷⁰ where the effects of noise are not extensively explored. In practice however, there is a tradeoff between perturbation strength and feasibility, since the effects of small perturbations are masked by noise⁴¹. Partial perturbations, as considered in this work (~50%) are much larger than what are typically considered small perturbations. The theoretical formulation of DL-MRA reduces the impact of not having small perturbations, because perturbation data from a particular node is not used for inference of edges connected to that node. Yet, DL-MRA still uses linearizations of the Jacobian which are are always subject to greater inaccuracy the further away from reference points such perturbations take the system. Since many biological networks share the same types of non-linear features contained within the considered FFL models, this is not likely to be the only case when partial inhibition will be important. We are thus inclined to speculate that large partial perturbations may be a generally important experimental design criterion moving forward. Partial inhibition is often “built-in” to certain assay types, such as si/shRNA or pharmacological inhibition that are titratable to a certain extent.

One major remaining challenge is scaling to larger networks. Here, we limited our analysis to 2 and 3 node networks. Conveniently, the number of necessary perturbation time courses needed grows linearly (as opposed to exponentially) with the number of considered nodes. Furthermore, as long as system-wide or omics-scale assays are available, the experimental workload also grows linearly. This is routine for transcriptome analyses⁷¹, and is becoming even more commonplace for proteomic assays (e.g. mass cytometry⁷², cyclic immunofluorescence, mass spectrometry⁷³, RPPA⁷⁴. Thus, the method is arguably experimentally scalable to larger networks.

However, the computational scaling past 2 and 3 node models remains to be determined and is likely to require different approaches for parameter estimation. Increasing the network size will quadratically increase the number of unknown parameters, which will significantly increase the computational requirements for obtaining robust solutions. Yet, recent work has shown that large estimation problems in ODE models may be broken into several smaller problems⁷⁵, which may be applicable here, and is likely to yield large computational speed up by allowing parallelization of much smaller tasks. However, theory on how to merge potentially discrepant results between independently estimated overlapping subnetworks would need to be derived. Importantly, we saw in the linear 2 and 3 node model examples that the impact of experimental noise was larger for 3 node models, implying that increasing the number of nodes past 3 will further increase the impact of experimental noise. Another synergistic avenue could be imposing prior knowledge to improve initial parameter guesses and even reduce the parametric space, such as in Bayesian Modular Response Analysis⁴⁵, or with functional database information⁷⁶. Such prior knowledge could also help inform emergent network properties as network size grows, such as degree distributions for scale-free networks². Here, we only investigated dense subnetworks, so sparseness patterns and judicious allocation of non-zero Jacobian elements could also have great impact on estimation for large networks. Overall, application to larger networks is of great interest but these non-trivial computational roadblocks must be solved first.

In conclusion, the proposed approach to network reconstruction is systematic and feasible, robustly operating in the presence of experimental noise and accepting data from large perturbations. It addresses important features of biological networks that current methods struggle to account for: causality/directionality/sign, cycles (including self-regulation), dynamic behavior and environmental stimuli. It does so while leveraging dynamic data of the network and only requires one perturbation per node for completeness. We expect this approach to be broadly useful not only for reconstruction of biological networks, but to enable using such networks to build more predictive models of disease and response to treatment, and more broadly, to other fields where such networks are important for system behavior.

Methods

Deriving sufficiency conditions for unique estimation of Jacobian elements

The first-order partial derivatives comprising J (Eq. (2)) can be approximated by a first-order Taylor series expansion of Eq. (1) about a time point k

$$f_1(k + 1) \approx f_1(k) + \frac{\partial }{{\partial x_1}}\left( {f_1(k)} \right).\left( {x_1(k + 1) - x_1(k)} \right)\, + \frac{\partial }{{\partial x_2}}\left( {f_1(k)} \right) \cdot \left( {x_2(k + 1) - x_2(k)} \right)$$

(5)

$$f_2(k + 1) \approx f_2(k) + \frac{\partial }{{\partial x_1}}\left( {f_2(k)} \right) \cdot \left( {x_1(k + 1) - x_1(k)} \right)\, + \frac{\partial }{{\partial x_2}}\left( {f_2(k)} \right) \cdot \left( {x_2(k + 1) - x_2(k)} \right)$$

(6)

Equations (5) and (6) may be written more succinctly as

$$\begin{array}{l}y_1(k + 1) \approx F_{11}(k) \cdot \Delta _tx_1(k + 1) + F_{12}(k) \cdot \Delta _tx_2(k + 1)\\ y_2(k + 1) \approx F_{21}(k) \cdot \Delta _tx_1(k + 1) + F_{22}(k) \cdot \Delta _tx_2(k + 1)\end{array}$$

(7)

where

$$y_i(k + 1) \equiv f_i(k + 1) - f_i(k);\Delta _tx_i(k + 1) \equiv x_i(k + 1) - x_i(k).$$

(8)

The approximation in Eq. (7) becomes more accurate as more time points are measured. Also, the edge weights are potentially time-dependent, although this is rarely considered when describing biological networks.

How do we estimate the edge weights F in Eq. (7) and thus reconstruct the network? Time series data can inform x_i’s and f_i’s as a function of time, following application of a stimulus. Given such stimulus-response data, however, for each time point there are only two equations for four unknowns, an underdetermined system for which more data are needed.

Consider now stimulus-response time course data in the presence of single perturbations. Let _pi be a variable that reflects the strength and/or presence of different potential perturbations: p₁ represents perturbation of x₁ and p₂ represents perturbation of x₂. If p_j is not explicitly written, its value is zero and/or it has no effect. Now, the ODEs become a function of the perturbation variables

$$f_{i,j}^{}(k) \equiv f_i(k,p_j) = f_i(x_1(k),x_2(k),p_j)$$

(9)

The 1^st order Taylor series expansions for cases with perturbations become

$$y_{1,1}^{_{_{_{_{}}}}}(k) \approx F_{11}(k) \cdot \Delta _{p,1}^{}x_1(k) + F_{12}(k) \cdot \Delta _{p,1}^{}x_2(k) + \frac{\partial }{{\partial p_1}}\left( {f_1(k)} \right) \cdot p_1$$

(10)

$$y_{1,2}^{}(k) \approx F_{11}(k) \cdot \Delta _{p,2}^{}x_1(k) + F_{12}(k) \cdot \Delta _{p,2}^{}x_2(k) + \frac{\partial }{{\partial p_2^{}}}\left( {f_1(k)} \right) \cdot p_2^{_{}}$$

(11)

$$y_{2,1}(k) \approx F_{21}(k) \cdot \Delta _{p,1}x_1(k) + F_{22}(k) \cdot \Delta _{p,1}x_2(k) + \frac{\partial }{{\partial p_1}}\left( {f_2(k)} \right) \cdot p_1$$

(12)

$$y_{2,2}(k) \approx F_{21}(k) \cdot \Delta _{p,2}x_1(k) + F_{22}(k) \cdot \Delta _{p,2}x_2(k) + \frac{\partial }{{\partial p_2}}\left( {f_2(k)} \right) \cdot p_2$$

(13)

where

$$y_{i,j}(k) \equiv f_{i,j}(k) - f_i(k);\Delta _{p,j}x_i(k) \equiv x_i(k,p_j) - x_i(k)$$

(14)

Here, we have expanded with respect to the perturbation, rather than with respect to time as previously. However, since the reference point is the same, the Jacobian elements remain identical in these equations. It is also interesting to note that the Jacobian elements, or network, may be affected by the perturbation, but we do not necessarily have to know those effects mathematically, since the reference point is the same. Now we have six potential equations with which to estimate the four Jacobian elements. However, we must make some determination as to how the perturbations p₁ and p₂ directly affect Node 1 and Node 2 dynamics f₁ and f₂ to account for the perturbation variable partial derivatives.

By design, the Node 1 perturbation has significant direct effects on Node 1 dynamics, and similarly for the Node 2 perturbation on Node 2 dynamics. Using equations including $\partial f_1/\partial p_1$ and $\partial f_2/\partial p_2$ require precise definition of perturbation strength and their effects on dynamics, which could be difficult to determine experimentally and implement in simulations. Therefore, we do not employ equations involving such terms. On the other hand, if the Node 1 perturbation has negligible direct effect on Node 2 dynamics, that is, the effects on Node 2 dynamics are through the network (i.e. p₁) and not explicit in f₂), and similarly the Node 2 perturbation has negligible direct effect on Node 1 dynamics, then $\partial f_2/\partial p_1$ and $\partial f_1/\partial p_2$ are approximately zero. This mild condition is often the case experimentally. The only determining factors for the suitability of the Taylor series truncation are the spacing of time points and the accuracy of the expansion about the perturbation difference. From this, the main set of linear equations presented in Eqs. (3) and (4) are obtained.

General estimation model equations

We employ the following general model for a two-node network: -

$$\begin{array}{l}\frac{{dx_1}}{{dt}} = f_1(x_1,x_2) = S_1 + F_{11}x_1 + F_{12}x_2\\ \frac{{dx_2}}{{dt}} = f_2(x_1,x_2) = S_2 + F_{21}x_1 + F_{22}x_2\end{array}$$

(15)

Here, S₁ and S₂ are the stimuli strengths on Node 1 and Node 2 respectively, and F₁₁, F₁₂, F₂₁ and F₂₂ are the network edge weights (Fig. 1a). In many systems, there may be a basal or constitutive production driving the node activities, besides an external stimulus. For these cases, the Stimulus term (S_i), may be considered as an addition of these two effects- the basal production term (S_i,b) and the external stimulus (S_i,ex). Then the two-node model can be represented by the following equations-

$$\begin{array}{l}\frac{{dx_1}}{{dt}} = S_{1,b} + S_{1,ex} + F_{11}x_1 + F_{12}x_2\\ \frac{{dx_2}}{{dt}} = S_{2,b} + S_{2,ex} + F_{21}x_1 + F_{22}x_2\end{array}$$

(16)

Or more generally,

$$\frac{{dx_i}}{{dt}} = S_{i,b} + S_{i,ex} + \mathop {\sum}\limits_{j = 1}^n {F_{ij}x_j} ,$$

(17)

where n is the total number of nodes.

When a steady state exists, the dx_i/dt terms become zero and it becomes easy to represent the stimulus terms as a function of the node activities (x_i) and network edges (F_ij).

$$S_{i,b} + S_{i,ex} = - (\mathop {\sum}\limits_{j = 1}^n {F_{ij}x_{i,ss}} )$$

(18)

This is helpful to understand that the perturbation time course data also generally constrains not only the edge weights, but also the stimulus terms. For a system at a steady state without an external stimulus, for example at t = 0:

$$S_{i,b} = - (\mathop {\sum}\limits_{j = 1}^n {F_{ij}x_{i,ss}} )$$

(19)

The two-node single activator model

The two-node single activator model (Fig. 1a, Supplementary Fig. S1a) is described by

$$\begin{array}{l}\frac{{dx_1}}{{dt}} = f_1(x_1,x_2) = 1 - x_1\\ \frac{{dx_2}}{{dt}} = f_2(x_1,x_2) = 1 + 1.5x_1 - 0.8x_2\end{array}$$

(20)

Here, S_1,ex = 1, F₁₁ = −1, F₁₂ = 0, S_2,ex = 1, F₂₁ = 1.5, F₂₂ = −0.8. The basal production terms are both zero, for simplicity, and the initial conditions for x₁(t = 0) and x₂(t = 0) are zero. The stimulus terms S_i,ex are calculated through Eq. (18), using the median values of F_ij and the x_i(t = 10), when the system reaches near steady state.

Random two-node and three-node models

The random 2 node network is described by

$$\begin{array}{l}\frac{{dx_1}}{{dt}} = f_1(x_1,x_2) = S_{1,b} + S_{1,ex} + F_{11}x_1 + F_{12}x_2\\ \frac{{dx_2}}{{dt}} = f_2(x_1,x_2) = S_{2,b} + S_{2,ex} + F_{21}x_1 + F_{22}x_2\end{array}$$

(21)

Values for S_1,b, S_2,b, S_1,ex and S_2,ex are sampled from a uniform distribution over the range [0,2] and values for F₁₁, F₁₂, F₂₁, and F₂₂ are sampled from a uniform distribution over the range [−2,2] using the MATLAB function rand. To capture basal activity, we use a two-step approach. First, starting from node activity values of zero, without the external stimulus on Node 1 and Node 2 (S_1,ex = S_2,ex = 0 in Eq. (22)) we simulate until the network reaches steady-state with just basal production driving the network behavior. Then, we introduce the external stimulus on Node 1 and Node 2, integrate the ODEs, and sample evenly spaced time-points using ode15s in MATLAB with default settings. We sample 3,7, 11, and 21 evenly spaced time points across a time course, from 0 to 10 arbitrary time units in all the cases.

The random 3 node networks use the same sampling rules as the 2 node networks with the following equations.

$$\begin{array}{l}\frac{{dx_1}}{{dt}} = f_1(x_1,x_2,x_3) = S_{1,b} + S_{1,ex} + F_{11}x_1 + F_{12}x_2 + F_{13}x_3\\ \frac{{dx_2}}{{dt}} = f_2(x_1,x_2,x_3) = S_{2,b} + S_{2,ex} + F_{21}x_1 + F_{22}x_2 + F_{23}x_3\\ \frac{{dx_3}}{{dt}} = f_3(x_1,x_2,x_3) = S_{3,b} + S_{3,ex} + F_{31}x_1 + F_{32}x_2 + F_{33}x_3\end{array}$$

(22)

Intracellular signaling networks

In the simplification of the Huang–Ferrell network to three nodes, p-MAPKKK, pp-MAPKK and pp-MAPK were taken as nodes. Since, in absence of external stimuli, the basal values of the nodes are zero, the basal production was estimated as zero beforehand and not considered in the estimation of the rest of the network. Aside from the basal production edges, a full 3 node network (Fig. 4b) was estimated from the simulation data of each of the observables. After estimation, parameters with values less than 1/100th of the largest parameter, were considered negligible.

Cell state transition models

The cell transition model⁴⁸ is a discrete time Markov probability model. Here, we show how this form is related to the ODE model used in DL-MRA. Starting at any initial value, each next step representing a time difference of one day follows from the previous time point as follows-

$$\begin{array}{l}x_{1,t + 1} = M_{11}x_{1,t} + M_{12}x_{2,t} + M_{13}x_{3,t}\\ x_{2,t + 1} = M_{21}x_{1,t} + M_{22}x_{2,t} + M_{23}x_{3,t}\\ x_{3,t + 1} = M_{31}x_{1,t} + M_{32}x_{2,t} + M_{33}x_{3,t}\end{array}$$

(23)

Where M_ij denotes the Markov transition probabilities of species j into species i. In matrix form it may be represented as follows-

$$\left[ \begin{array}{l}x_{1,t + 1}\\ x_{2,t + 1}\\ x_{3,t + 1}\end{array} \right] = \left[ \begin{array}{l}M_{11}{{{\mathrm{ }}}}M_{12}{{{\mathrm{ }}}}M_{13}\\ M_{21}{{{\mathrm{ }}}}M_{22}{{{\mathrm{ }}}}M_{23}\\ M_{31}{{{\mathrm{ }}}}M_{32}{{{\mathrm{ }}}}M_{33}\end{array} \right]\left[ \begin{array}{l}x_{1,t}\\ x_{2,t}\\ x_{3,t}\end{array} \right]$$

(24)

Representing the Markov parameter matrix as M and the species relative concentration variables as vector X, the equation becomes

$$X_{t + 1} = MX_t$$

(25)

The Markov transition probabilities for a species must add up to 1. In experimental terms, a species can either transition to other species or stay the same and the sum of all those probabilities is 1.

$$\mathop {\sum}\limits_{i = 1:3} {M_{ij} = 1}$$

(26)

As a first step in relating these equations to the ODE form underlying DL-MRA, we put the variables in terms in terms of ∆x (with respect to time),

$$X_{t + 1} - X_t = MX_t - X_t$$

(27)

$$\Delta X_{t + 1} = (M - I)X_t$$

(28)

$$\Delta X_{t + 1} = M^\prime X_t$$

(29)

Where M’ is M-I, and I is the identity matrix. M’ is M, except that 1 is subtracted from all its diagonal elements. Hence Eq. (26) for M’ becomes

$$\mathop {\sum}\limits_{i = 1:3} {M^\prime _{ij} = 0}$$

(30)

This also implies that the diagonal term for M’ is negative of the sum of the other two terms in the same column. In experimental terms, the amount of reduction of a species is equal to how much it got converted to other species.

The above equations apply for the cases where ∆t is 1. We can incorporate arbitrary time steps as

$$\Delta X_{t + \Delta t} = M^\prime _{\Delta t}X_t\Delta t$$

(31)

Where ∆t is the scalar value of time difference and M’_∆t is the matrix of the set of parameters, specific to the time difference chosen. For a case where ∆t tends to 0, the equation becomes-

$$\mathop {{\lim }}\limits_{\Delta t \to 0} (\Delta X_{t + \Delta t}/\Delta t) = M_{dt}^\prime X_t$$

(32)

$$dX/dt = M_{dt}^\prime X_t$$

(33)

Where M’_dt is the matrix of the set of parameters specific to the case where ∆t is infinitesimally small. Note that Eq. (33) is similar in form to Eq. (22), only without the extra stimulus terms and where M’_dt is equivalent to the Jacobian matrix F with terms F_ij. There would be an added constraint that the sum of the terms in the same column would add up to zero, or that the diagonal term is the negative of the sum of the other two terms in the same column.

$$\frac{{dX}}{{dt}} = FX_t$$

(34)

$$F_{ii} = - \mathop {\sum}\limits_{j = 1,j \ne i}^3 {F_{ij}}$$

(35)

Non-linear models

The non-linear feedforward loop models³² are described by:

$$\begin{array}{l}\frac{{dx_1}}{{dt}} = f_1(x_1,x_2,x_3) = 1 - x_1\\ \frac{{dx_2}}{{dt}} = f_2(x_1,x_2,x_3) = f(x_1,K_{x_1x_2}) - x_2\\ \frac{{dx_3}}{{dt}} = f_3(x_1,x_2,x_3) = G(x_1,K_{x_1x_3},x_2,.K_{x_2x_3}) - x_3\end{array}$$

(36)

When an AND gate is present

$$G(x_1,K_{x_1x_3},x_2,K_{x_2x_3}) = f(x_1,K_{x_1x_3})^\ast f(x_2,K_{x_2x_3})$$

(37)

When an OR gate is present

$$G(x_1,K_{x_1x_3},x_2,K_{x_2x_3}) = fc(x_1,K_{x_1x_3},K_{x_2x_3},x_2) + fc(x_2,K_{x_2x_3},K_{x_1x_3},x_1)$$

(38)

For a given u, v ϵ {x₁, x₂, x₃} and K, K_u, K_v ϵ {$K_{x_1x_2}$, $K_{x_1x_3}$, $K_{x_2x_3}$}:

If u activates its target, then:

$$f(u,K) = \frac{{\left( {\frac{u}{K}} \right)^2}}{{1 + \left( {\frac{u}{K}} \right)^2}};\,fc(u,K_u,K_v,v) = \frac{{\left( {\frac{u}{{K_u}}} \right)^2}}{{1 + \left( {\frac{u}{{K_u}}} \right)^2 + \left( {\frac{v}{{K_v}}} \right)^2}}$$

(39)

If u represses its target, then:

$$f(u,K) = \frac{1}{{1 + \left( {\frac{u}{K}} \right)^2}};\,fc(u,K_u,K_v,v) = \frac{1}{{1 + \left( {\frac{u}{{K_u}}} \right)^2 + \left( {\frac{v}{{K_v}}} \right)^2}}$$

(40)

Effectively, an external stimulus of ‘S_1,ex = 1’, acts on Node 1 at t = 0 and is propagated through the network. There is no external stimulus acting on Node 2 and Node 3. However, in many cases there is basal production in one or both of Node 2 and Node 3. This leads to a non-zero steady-state of the network before the external stimulus is introduced.

To capture basal activity, we use a two-step approach. First, starting from node activity values of zero, without the external stimulus on Node 1 (S_1,ex = 0), we simulate until the network reaches steady-state. Then, we introduce the external stimulus on Node 1, integrate the ODEs, and sample 11 evenly spaced time-points using ode15s in MATLAB with default settings and steady-state node values without the external stimulus as the initial conditions. We chose 11 timepoints because it yields good classification accuracy for the above random 3 node model even in presence of noisy data. For each of the 16 non-linear models, the values of the parameters (K, Ku, Kv), were varied and chosen so that the resulting node activity data are responsive to the stimulus and perturbations (Supplementary Fig. S6, See Supplementary Code for values).

Modeling perturbations

Precisely modeling perturbations can be a challenge, since experimentally, there may be several ways of causing a perturbation with different mechanisms such as siRNAs, competitive/non-competitive/uncompetitive inhibition, etc. It may be hard to quantify how much a perturbation is affecting a node, in terms of its dynamics (i.e. right-hand sides of the ODEs). Therefore, we employ the following approaches which circumvent the need to model how each perturbation mechanistically manifests in the ODEs during parameter estimation. There are two cases to consider: (i) when we have a perturbation of node i and we need to simulate node i dynamics; (ii) when we have a perturbation of node i and we need to simulate other node j dynamics. To illustrate the approach, we use the above-described 2 node model with an example of a Node 1 perturbation. Recall that

$$\begin{array}{l}\frac{{dx_1}}{{dt}} = S_{1,b} + S_{1,ex} + F_{11}x_1 + F_{12}x_2\\ \frac{{dx_2}}{{dt}} = S_{2,b} + S_{1,ex} + F_{21}x_1 + F_{22}x_2\end{array}$$

(41)

For case (i), we have to obtain values for x₁ under perturbation of Node 1. We refer to the perturbed time-course as x_1,1. In experimental situations, x_1,1 would be measured directly. To obtain simulation data for x_1,1 we use the following:

$$x_{1,1}(k) = p_1 \times x_1(k),$$

(42)

where x₁ is obtained from the simulations without perturbations, and recall that k refers to time point k. For a 50% inhibition, p = 0.5 and for a complete inhibition, p = 0.

For case (ii), we have to obtain the values for x₂ under perturbation of Node 1, which we refer to as x_2,1. To do this, we have to integrate the ODE for dx₂/dt, but using x_1,1 values, as follows

$$\frac{{dx_{2,1}}}{{dt}} = S_{2,b} + S_{2,ex} + F_{21}x_{1,1} + F_{22}x_{2,1}$$

(43)

Here, x₂ has been replaced with x_2,1 to represent x₂ under perturbation of Node 1, for clarity. To solve this equation, we simply use the “measured” x_1,1 time course directly in the ODE.

When data are generated by simulations, there is little practical limit to temporal resolution, but with real data, to solve Eq. (43) one may need values for x_1,1 at multiple time points where measurements are not available, depending on the solver being used. We therefore fit x_1,1 data to a polynomial using polyfit in MATLAB, and use the polynomial to interpolate given a required time point. In this work, we have used an order of 5 to fit the data as well as avoid overfitting, but the functional form is quite malleable so long as it captures the data trends.

For modeling perturbations of the cell transition model, the initial value of the simulated data for the perturbed node was taken as zero during simulation. The estimation was performed in a similar way as a random 3 node network as described above.

For modeling perturbations for the Huang–Ferrell model, the parameters k3, k15 and k27 were sequentially set as zero. The estimation was performed in a similar way as a random 3 node network as described above.

Simulated noise

Normally distributed white (zero mean) noise is added to simulated time courses point-wise with

$$y = x + N(0,d \cdot x)$$

(44)

where x is the simulation data point, y is the noisy data point, and d represents the noise level. Signal-to-noise ratio of 10:1, 5:1 and 2:1 are, respectively d = 0.1, 0.2, and 0.5. Normally distributed samples are obtained using randn in MATLAB. While there are many different distributional options for modeling noise, we chose this for simplicity and to capture the effects generically of noisier data. We do not intend to answer questions related to whether specific distributional assumptions about the form of the noise have significant impact of the methods performance.

Parameter estimation

For the two-node model, the entire network, with and without perturbations, can be explained by the following system of equations

$$\begin{array}{l}\frac{{dx_1}}{{dt}} = S_{1,b} + S_{1,ex} + F_{11}x_1 + F_{12}x_2\\ \frac{{dx_2}}{{dt}} = S_{2,b} + S_{2,ex} + F_{21}x_1 + F_{22}x_2\\ \frac{{dx_{2,1}}}{{dt}} = S_{2,b} + S_{2,ex} + F_{21}x_{1,1} + F_{22}x_{2,1}\\ \frac{{dx_{1,2}}}{{dt}} = S_{1,b} + S_{1,ex} + F_{11}x_{1,2} + F_{12}x_{2,2}\end{array}$$

(45)

where x_1,1 and x_2,2 are the perturbed node values, from either simulated or experimental data. Eight parameters (S_1,b, S_1,ex, F₁₁, F₁₂, S_2,b, S_2,ex, F₂₁, F₂₂) need to be estimated to fully reconstruct this network. We seek a set of parameters that minimizes deviation between simulated and measured dynamics.

For an initial guess, the node edge parameters (F_ij) are randomly sampled from a uniform distribution over the interval [−2,2] and the stimulus parameters (S_i,ex) are sampled from a uniform distribution over the interval [0,2]. Using data at t = 0, which corresponds to a steady-state without S_i,ex, the S_i,b can be estimated during each iteration of the estimation as follows-

$$\begin{array}{l}\hat S_{1,b} = - (\hat F_{11}x_1(t = 0) + \hat F_{12}x_2(t = 0))\\ \hat S_{2,b} = - (\hat F_{21}x_1(t = 0) + \hat F_{22}x_2(t = 0))\end{array}$$

(46)

For an n-node model, this equation can be scaled accordingly to obtain each $\hat S$_i,b.

For these initial guesses we compute the activity data using Eq. (45). The perturbation data x_k,k is used in the perturbation equations as detailed above (Eq. (43)). Let $\hat x$_i, and $\hat x$_i,j denote the predicted node activity values for non-perturbed and perturbed cases respectively. For a total of n nodes and N_t timepoints, the objective function is the sum of squared errors Φ

$$\Phi = \mathop {\sum}\limits_{k = 1}^{N_t} {\left[ {\left( {\mathop {\sum}\limits_{i = 1}^n {\left( {x_i(k) - \hat x_i(k)} \right)^2} } \right) + \mathop {\sum}\limits_{i = 1}^n {\mathop {\sum}\limits_{j \ne i} {\left( {x_{i,j}(k) - \hat x_{i,j}(k)} \right)^2} } } \right]}$$

(47)

Note here that we do not use data from node j, when perturbation j was used (per the derivation). The MATLAB function fmincon is used to minimize Φ by changing edge weights and stimulus terms within the range [−10,10].

We employ “multi-start” by running the estimation 10 times, starting from different randomly generated initial starting points⁷⁷. The estimated parameters and their respective final sum of squared errors (Φ) are saved and the estimated parameter set corresponding to the minimum Φ is chosen as the final parameter set. Variability of parameter estimates across multi-start runs is explored in Supplementary Fig. S5.

Parameter estimation for non-linear models

For estimating the Non-Linear models, we start with a prior knowledge that S_1,b is always zero and S_2,ex and S_3,ex are always zero as well, which is directly evident from x₁ initial conditions and x₂, x₃ stimulus response in the presence of a complete Node 1 perturbation. The equations for the non-perturbation case become as follows

$$\begin{array}{l}\frac{{dx_1}}{{dt}} = S_{1,ex} + F_{11}x_1 + F_{12}x_2 + F_{13}x_3\\ \frac{{dx_2}}{{dt}} = S_{2,b} + F_{21}x_1 + F_{22}x_2 + F_{23}x_3\\ \frac{{dx_3}}{{dt}} = S_{3,b} + F_{31}x_1 + F_{32}x_2 + F_{33}x_3\end{array}$$

(48)

Since the system is at steady-state before the external stimulus, the basal production parameter can be estimated during each iteration of the estimation as-

$$\begin{array}{l}\hat S_{2,b} = - (\hat F_{21}x_1(t = 0) + \hat F_{22}x_2(t = 0) + \hat F_{23}x_3(t = 0))\\ \hat S_{3,b} = - (\hat F_{31}x_1(t = 0) + \hat F_{32}x_2(t = 0) + \hat F_{33}x_3(t = 0))\end{array}$$

(49)

where $\hat F$_i,j are the current model parameter estimates and x_i (t = 0) are the x values at the initial system steady state before the induction of external stimulus.

Bootstrapping simulated data for the FFL model cases

To generate multiple parameter set estimates to classify edge weights for the FFL model cases, we employ a bootstrapping approach. In an experimental scenario, each data point will have a mean and a standard deviation, and upon a distributional assumption (e.g. normal), one can then resample datasets to obtain measures of estimation uncertainty. We use the simulated data as the mean, and then vary the standard deviation as described above to generate 50 bootstrapped datasets for each of the 16 considered models. Estimation is carried out for each of the 50 datasets using multi-start, which yields 50 best-fitting parameter sets for each model. Uncertainty analysis and classification error is based on these sets.

Data availability

All relevant simulated data used in the paper are provided and can be accessed along with the code at https://doi.org/10.5281/zenodo.6516238. Any other relevant data can be obtained from the authors.

Code availability

The code needed to reproduce the data and figures are included and can be accessed at https://doi.org/10.5281/zenodo.6516238. Parallelization when necessary to generate data was run on Palmetto cluster (372 GB, 48 nodes) and MATLAB 2020a. The code also includes Jupyter notebooks that implement the estimation functions (in python) for a 2 node system, a 3 node system, and a 3 node cell state system. These use simple csv input files where the experimental data are placed.

References

Angulo, M. T., Moreno, J. A., Lippner, G., Barabási, A.-L. & Liu, Y.-Y. Fundamental limitations of network reconstruction from temporal data. J. R. Soc. Interface 14, 20160966 (2017).
Article PubMed PubMed Central Google Scholar
Barabási, A. L. & Albert, R. Emergence of scaling in random networks. Science 286, 509–512 (1999).
Article PubMed Google Scholar
Califano, A., Butte, A. J., Friend, S., Ideker, T. & Schadt, E. Leveraging models of cell regulation and GWAS data in integrative network-based association studies. Nat. Genet. 44, 841–847 (2012).
Article CAS PubMed PubMed Central Google Scholar
Calvano, S. E. et al. A network-based analysis of systemic inflammation in humans. Nature 437, 1032–1037 (2005).
Article CAS PubMed Google Scholar
Dorel, M. et al. Modelling signalling networks from perturbation data. Bioinformatics 34, 4079–4086 (2018).
Article CAS PubMed Google Scholar
Hackett, S. R. et al. Learning causal networks using inducible transcription factors and transcriptome‐wide time series. Mol. Syst. Biol. 16, e9174 (2020).
Article CAS PubMed PubMed Central Google Scholar
Hein, M. Y. et al. A human interactome in three quantitative dimensions organized by stoichiometries and abundances. Cell 163, 712–723 (2015).
Article CAS PubMed Google Scholar
Hill, S. M. et al. Context specificity in causal signaling networks revealed by phosphoprotein profiling. Cell Syst. 4, 73–83.e10 (2017).
Article CAS PubMed PubMed Central Google Scholar
Ideker, T. et al. Integrated genomic and proteomic analyses of a systematically perturbed metabolic network. Science 292, 929–934 (2001).
Article CAS PubMed Google Scholar
Ideker, T., Ozier, O., Schwikowski, B. & Siegel, A. F. Discovering regulatory and signalling circuits in molecular interaction networks. Bioinformatics 18, S233–S240 (2002).
Article PubMed Google Scholar
Liu, Y.-Y., Slotine, J.-J. & Barabási, A.-L. Observability of complex systems. Proc. Natl Acad. Sci. USA 110, 2460–2465 (2013).
Article CAS PubMed PubMed Central Google Scholar
Ma’ayan, A. et al. Formation of regulatory patterns during signal propagation in a Mammalian cellular network. Science 309, 1078–1083 (2005).
Article PubMed PubMed Central Google Scholar
Margolin, A. A. et al. ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinforma. 7, S7 (2006).
Article Google Scholar
Mazloom, A. R. et al. Recovering protein-protein and domain-domain interactions from aggregation of IP-MS proteomics of coregulator complexes. PLoS Comput. Biol. 7, e1002319 (2011).
Article CAS PubMed PubMed Central Google Scholar
Mehla, J., Caufield, J. H. & Uetz, P. The yeast two-hybrid system: a tool for mapping protein-protein interactions. Cold Spring Harb. Protoc. 2015, 425–430 (2015).
PubMed Google Scholar
Molinelli, E. J. et al. Perturbation biology: inferring signaling networks in cellular systems. PLoS Comput. Biol. 9, e1003290 (2013).
Article PubMed PubMed Central Google Scholar
Nyman, E. et al. Perturbation biology links temporal protein changes to drug responses in a melanoma cell line. PLOS Comput. Biol. 16, e1007909 (2020).
Article CAS PubMed PubMed Central Google Scholar
Pe’er, D., Regev, A., Elidan, G. & Friedman, N. Inferring subnetworks from perturbed expression profiles. Bioinformatics 17, S215–S224 (2001).
Article PubMed Google Scholar
Pósfai, M., Liu, Y.-Y., Slotine, J.-J. & Barabási, A.-L. Effect of correlations on network controllability. Sci. Rep. 3, 1067 (2013).
Article PubMed PubMed Central Google Scholar
Schraivogel, D. et al. Targeted Perturb-seq enables genome-scale genetic screens in single cells. Nat. Methods 1–7, https://doi.org/10.1038/s41592-020-0837-5 (2020).
Shannon, P. et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13, 2498–2504 (2003).
Article CAS PubMed PubMed Central Google Scholar
Stein, R. R., Marks, D. S. & Sander, C. Inferring pairwise interactions from biological data using maximum-entropy probability models. PLoS Comput. Biol. 11, e1004182 (2015).
Article PubMed PubMed Central Google Scholar
Wynn, M. L. et al. Inferring intracellular signal transduction circuitry from molecular perturbation experiments. Bull. Math. Biol. 80, 1310–1344 (2018).
Article CAS PubMed Google Scholar
Yuan, B. et al. CellBox: interpretable machine learning for perturbation biology with application to the design of cancer combination therapy. Cell Syst. 12, 128–140.e4 (2021).
Article CAS PubMed Google Scholar
Chevalier, T., Schreiber, I. & Ross, J. Toward a systematic determination of complex reaction mechanisms. J. Phys. Chem. 97, 6776–6787 (1993).
Article CAS Google Scholar
Díaz-Sierra, R., Lozano, J. B. & Fairén, V. Deduction of chemical mechanisms from the linear response around steady state. J. Phys. Chem. A 103, 337–343 (1999).
Article Google Scholar
Hoffmann, M., Fröhner, C. & Noé, F. Reactive SINDy: discovering governing reactions from concentration data. J. Chem. Phys. 150, 025101 (2019).
Article PubMed Google Scholar
Kim, J., Bates, D. G., Postlethwaite, I., Heslop-Harrison, P. & Cho, K.-H. Least-squares methods for identifying biochemical regulatory networks from noisy measurements. BMC Bioinforma. 8, 8 (2007).
Article Google Scholar
Schmidt, H., Cho, K.-H. & Jacobsen, E. W. Identification of small scale biochemical networks based on general type system perturbations. FEBS J. 272, 2141–2151 (2005).
Article CAS PubMed Google Scholar
Morgan, S. L. & Winship, C. Counterfactuals and causal inference: methods and principles for social research. (Cambridge University Press, 2014). https://doi.org/10.1017/CBO9781107587991.
Pearl, J. Structural counterfactuals: a brief introduction. Cogn. Sci. 37, 977–985 (2013).
Article PubMed Google Scholar
Mangan, S. & Alon, U. Structure and function of the feed-forward loop network motif. Proc. Natl Acad. Sci. USA 100, 11980–11985 (2003).
Article CAS PubMed PubMed Central Google Scholar
Reeves, G. T. The engineering principles of combining a transcriptional incoherent feedforward loop with negative feedback. J. Biol. Eng. 13, 62 (2019).
Article PubMed PubMed Central Google Scholar
Fournier, T. et al. Steady-state expression of self-regulated genes. Bioinformatics 23, 3185–3192 (2007).
Article CAS PubMed Google Scholar
Bell-Pedersen, D. et al. Circadian rhythms from multiple oscillators: lessons from diverse organisms. Nat. Rev. Genet. 6, 544–556 (2005).
Article CAS PubMed PubMed Central Google Scholar
Stewart-Ornstein, J., Cheng, H. W. J. & Lahav, G. Conservation and divergence of p53 oscillation dynamics across species. Cell Syst. 5, 410–417.e4 (2017).
Article CAS PubMed PubMed Central Google Scholar
Kholodenko, B. N. et al. Untangling the wires: a strategy to trace functional interactions in signaling and gene networks. Proc. Natl Acad. Sci. USA 99, 12841–12846 (2002).
Article CAS PubMed PubMed Central Google Scholar
Santra, T., Rukhlenko, O., Zhernovkov, V. & Kholodenko, B. N. Reconstructing static and dynamic models of signaling pathways using Modular Response Analysis. Curr. Opin. Syst. Biol. 9, 11–21 (2018).
Article Google Scholar
Andrec, M., Kholodenko, B. N., Levy, R. M. & Sontag, E. Inference of signaling and gene regulatory networks by steady-state perturbation experiments: structure and accuracy. J. Theor. Biol. 232, 427–441 (2005).
Article CAS PubMed Google Scholar
Santos, S. D. M., Verveer, P. J. & Bastiaens, P. I. H. Growth factor-induced MAPK network topology shapes Erk response determining PC-12 cell fate. Nat. Cell Biol. 9, 324–330 (2007).
Article CAS PubMed Google Scholar
Thomaseth, C. et al. Impact of measurement noise, experimental design, and estimation methods on modular response analysis based network reconstruction. Sci. Rep. 8, 16217 (2018).
Article PubMed PubMed Central Google Scholar
Gross, T. & Blüthgen, N. Identifiability and experimental design in perturbation studies. Bioinformatics 36, i482–i489 (2020).
Article CAS PubMed PubMed Central Google Scholar
Halasz, M., Kholodenko, B. N., Kolch, W. & Santra, T. Integrating network reconstruction with mechanistic modeling to predict cancer therapies. Sci. Signal. 9, ra114 (2016).
Article PubMed Google Scholar
Klinger, B. et al. Network quantification of EGFR signaling unveils potential for targeted combination therapy. Mol. Syst. Biol. 9, 673 (2013).
Article PubMed PubMed Central Google Scholar
Santra, T., Kolch, W. & Kholodenko, B. N. Integrating Bayesian variable selection with Modular Response Analysis to infer biochemical network topology. BMC Syst. Biol. 7, 57 (2013).
Article PubMed PubMed Central Google Scholar
Sontag, E., Kiyatkin, A. & Kholodenko, B. N. Inferring dynamic architecture of cellular networks using time series of gene expression, protein and metabolite data. Bioinformatica 20, 1877–1886 (2004).
Article CAS Google Scholar
Cho, K.-H., Choo, S.-M., Wellstead, P. & Wolkenhauer, O. A unified framework for unraveling the functional interaction structure of a biomolecular network based on stimulus-response experimental data. FEBS Lett. 579, 4520–4528 (2005).
Article CAS PubMed Google Scholar
Gupta, P. B. et al. Stochastic state transitions give rise to phenotypic equilibrium in populations of cancer cells. Cell 146, 633–644 (2011).
Article CAS PubMed Google Scholar
Huang, C. Y. & Ferrell, J. E. Ultrasensitivity in the mitogen-activated protein kinase cascade. Proc. Natl Acad. Sci. 93, 10078–10083 (1996).
Article CAS PubMed PubMed Central Google Scholar
Höfler, M. Causal inference based on counterfactuals. BMC Med. Res. Methodol. 5, 28 (2005).
Article PubMed PubMed Central Google Scholar
Shipley, B. Cause and correlation in biology: a user’s guide to path analysis, structural equations and causal inference with R (Cambridge University Press, 2016). https://doi.org/10.1017/CBO9781139979573.
Lill, D. et al. Mapping connections in signaling networks with ambiguous modularity. npj Syst. Biol. Appl 5, 19 (2019).
Article PubMed PubMed Central Google Scholar
Kholodenko, B. N. & Sontag, E. D. Determination of functional network structure from local parameter dependence data. Preprint at http://arxiv.org/abs/physics/0205003 (2002).
Armond, J. W. et al. A stochastic model dissects cell states in biological transition processes. Sci. Rep. 4, 3692 (2014).
Article PubMed PubMed Central Google Scholar
Dirkse, A. et al. Stem cell-associated heterogeneity in Glioblastoma results from intrinsic tumor plasticity shaped by the microenvironment. Nat. Commun. 10, 1787 (2019).
Article PubMed PubMed Central Google Scholar
Hormoz, S. et al. Inferring cell-state transition dynamics from lineage trees and endpoint single-cell measurements. cels 3, 419–433.e8 (2016).
CAS Google Scholar
Larsson, I. et al. Modeling glioblastoma heterogeneity as a dynamic network of cell states. Mol. Syst. Biol. 17, e10105 (2021).
Article CAS PubMed PubMed Central Google Scholar
Neftel, C. et al. An integrative model of cellular states, plasticity, and genetics for glioblastoma. Cell 178, 835–849.e21 (2019).
Article CAS PubMed PubMed Central Google Scholar
Sha, Y., Wang, S., Zhou, P. & Nie, Q. Inference and multiscale model of epithelial-to-mesenchymal transition via single-cell transcriptomic data. Nucleic Acids Res. 48, 9505–9520 (2020).
Article CAS PubMed PubMed Central Google Scholar
Shen, S. & Clairambault, J. Cell plasticity in cancer cell populations. F1000Res 9, F1000 (2020). Faculty Rev-635.
Article PubMed PubMed Central Google Scholar
Zarkoob, H., Taube, J. H., Singh, S. K., Mani, S. A. & Kohandel, M. Investigating the link between molecular subtypes of glioblastoma, epithelial-mesenchymal transition, and CD133 cell surface protein. PLoS One 8, e64169 (2013).
Article CAS PubMed PubMed Central Google Scholar
Henriques, D., Villaverde, A. F., Rocha, M., Saez-Rodriguez, J. & Banga, J. R. Data-driven reverse engineering of signaling pathways using ensembles of dynamic models. PLOS Comput. Biol. 13, e1005379 (2017).
Article PubMed PubMed Central Google Scholar
Goentoro, L., Shoval, O., Kirschner, M. W. & Alon, U. The incoherent feedforward loop can provide fold-change detection in gene regulation. Mol. Cell 36, 894–899 (2009).
Article CAS PubMed PubMed Central Google Scholar
Goentoro, L. & Kirschner, M. W. Evidence that fold-change, and not absolute level, of β-catenin dictates wnt signaling. Mol. Cell 36, 872–884 (2009).
Article CAS PubMed PubMed Central Google Scholar
Nakakuki, T. et al. Ligand-specific c-fos expression emerges from the spatiotemporal control of ErbB network dynamics. Cell 141, 884–896 (2010).
Article CAS PubMed PubMed Central Google Scholar
Tyson, J. J., Csikasz-Nagy, A. & Novak, B. The dynamics of cell cycle regulation. Bioessays 24, 1095–1109 (2002).
Article CAS PubMed Google Scholar
Albeck, J. G., Mills, G. B. & Brugge, J. S. Frequency-modulated pulses of ERK activity transmit quantitative proliferation signals. Mol. Cell 49, 249–261 (2013).
Article CAS PubMed Google Scholar
Kholodenko, B. N. Negative feedback and ultrasensitivity can bring about oscillations in the mitogen-activated protein kinase cascades. Eur. J. Biochem 267, 1583–1588 (2000).
Article CAS PubMed Google Scholar
Ryu, H. et al. Frequency modulation of ERK activation dynamics rewires cell fate. Mol. Syst. Biol. 11, 838 (2015).
Article PubMed PubMed Central Google Scholar
Fuente, A., de la, Brazhnik, P. & Mendes, P. Linking the genes: inferring quantitative gene networks from microarray data. Trends Genet. 18, 395–398 (2002).
Article PubMed Google Scholar
Stark, R., Grzelak, M. & Hadfield, J. RNA sequencing: the teenage years. Nat. Rev. Genet. 20, 631–656 (2019).
Article CAS PubMed Google Scholar
Spitzer, M. H. & Nolan, G. P. Mass cytometry: single cells, many features. Cell 165, 780–791 (2016).
Article CAS PubMed PubMed Central Google Scholar
Aksenov, A. A., da Silva, R., Knight, R., Lopes, N. P. & Dorrestein, P. C. Global chemical analysis of biology by mass spectrometry. Nat. Rev. Chem. 1, 1–20 (2017).
Article Google Scholar
Akbani, R. et al. Realizing the promise of reverse phase protein arrays for clinical, translational, and basic research: a workshop report: the RPPA (Reverse Phase Protein Array) Society. Mol. Cell. Proteom. 13, 1625–1643 (2014).
Article CAS Google Scholar
Stapor, P. et al. Mini-batch optimization enables training of ODE models on large-scale datasets. Nat. Commun. 13, 34 (2022).
Article CAS PubMed PubMed Central Google Scholar
Wu, G., Feng, X. & Stein, L. A human functional protein interaction network and its application to cancer data analysis. Genome Biol. 11, R53 (2010).
Article PubMed PubMed Central Google Scholar
Raue, A. et al. Lessons learned from quantitative dynamical modeling in systems biology. PLOS One 8, e74335 (2013).
Article CAS PubMed PubMed Central Google Scholar

Download references

Acknowledgements

We would like to thank Clemson University and the CCIT team for the generous allotment of time and support in the Palmetto cluster for running the simulations in this paper. M.R.B. acknowledges funding from Mount Sinai, Clemson University, the National Institutes of Health Grants R01GM104184 and R35GM141891 and an IBM faculty award. M.B. and A.D.S. were supported by a National Institute for General Medical Sciences-funded Integrated Pharmacological Sciences Training Program grant (T32GM062754).

Author information

These authors contributed equally: Deepraj Sarmah, Gregory R. Smith.

Authors and Affiliations

Department of Chemical and Biomolecular Engineering, Clemson University, Clemson, SC, USA
Deepraj Sarmah, James Erskine & Marc R. Birtwistle
Department of Neurology, Center for Advanced Research on Diagnostic Assays, Icahn School of Medicine at Mount Sinai, New York, NY, USA
Gregory R. Smith
J. David Gladstone Institutes, San Francisco, CA, 94158, USA
Mehdi Bouhaddou
Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco, CA, 94158, USA
Mehdi Bouhaddou
Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
Alan D. Stern
Department of Bioengineering, Clemson University, Clemson, SC, USA
Marc R. Birtwistle

Authors

Deepraj Sarmah
View author publications
You can also search for this author in PubMed Google Scholar
Gregory R. Smith
View author publications
You can also search for this author in PubMed Google Scholar
Mehdi Bouhaddou
View author publications
You can also search for this author in PubMed Google Scholar
Alan D. Stern
View author publications
You can also search for this author in PubMed Google Scholar
James Erskine
View author publications
You can also search for this author in PubMed Google Scholar
Marc R. Birtwistle
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

M.R.B., G.R.S., and D.S. conceived of the work. D.S., G.R.S., M.B., M.R.B., and J.E. performed analyses. D.S., M.B., and G.R.S. made the figures. D.S., G.R.S., M.B., and M.R.B. wrote the manuscript.

Corresponding author

Correspondence to Marc R. Birtwistle.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Figures

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Sarmah, D., Smith, G.R., Bouhaddou, M. et al. Network inference from perturbation time course data. npj Syst Biol Appl 8, 42 (2022). https://doi.org/10.1038/s41540-022-00253-6

Download citation

Received: 05 May 2022
Accepted: 18 October 2022
Published: 01 November 2022
DOI: https://doi.org/10.1038/s41540-022-00253-6