Abstract
Networks underlie much of biology from subcellular to ecological scales. Yet, understanding what experimental data are needed and how to use them for unambiguously identifying the structure of even small networks remains a broad challenge. Here, we integrate a dynamic least squares framework into established modular response analysis (DLMRA), that specifies sufficient experimental perturbation time course data to robustly infer arbitrary two and three node networks. DLMRA considers important network properties that current methods often struggle to capture: (i) edge sign and directionality; (ii) cycles with feedback or feedforward loops including selfregulation; (iii) dynamic network behavior; (iv) edges external to the network; and (v) robust performance with experimental noise. We evaluate the performance of and the extent to which the approach applies to cell state transition networks, intracellular signaling networks, and gene regulatory networks. Although signaling networks are often an application of network reconstruction methods, the results suggest that only under quite restricted conditions can they be robustly inferred. For gene regulatory networks, the results suggest that incomplete knockdown is often more informative than full knockout perturbation, which may change experimental strategies for gene regulatory network reconstruction. Overall, the results give a rational basis to experimental data requirements for network reconstruction and can be applied to any such problem where perturbation time course experiments are possible.
Introduction
Networks underlie much cellular and biological behavior, including transcriptional, protein-protein interaction, signaling, metabolic, cell-cell, endocrine, ecological, and social networks, among many others. As such, identifying and then representing their structure has been a focus of many researchers for decades. This is not just from experimental perspectives alone, but predominantly computational ones, with a variety of statistical methodologies that integrate prior knowledge from interaction databases with new experimental data sets^{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24}. Alternatively, a variety of methods have investigated general ways to infer detailed reaction mechanisms—often a foundation of networks—from experimental data^{25,26,27,28,29}. Such tasks may be considered a subset of network inference.
Network structure is usually represented as either an undirected or a directed graph, with edges between nodes specifying the system. There are five main areas where current approaches to reconstructing networks struggle to capture important features of biological networks. The first is directionality of edges^{6,8,30,31}. Commonly employed correlational methods predominantly generate undirected edges, which impedes causal and other mechanistic analyses. Second is cycles. Cycles such as feedback or feedforward loops are nearly ubiquitous in biological systems and central to their function^{32,33}. This includes an important type of cycle that is rarely considered: self-regulation of a node, that is, an edge from a node onto itself^{34}. Third is that biological networks are often dynamic. Two notable examples are circadian and p53 oscillators^{35,36}, where dynamics are key to biological function. Directionality and edge signs (i.e. positive or negative) dictate dynamics. Fourth is pinpointing how external variables impinge on network nodes. For example, is the effect of a growth factor on a network node direct, or through other nodes in the network? Fifth, the design and method employed should be robust to typical experimental noise levels. The experimental design and data requirements to uniquely identify the dynamic, directed, and signed edge structures in biological networks containing all types of cycles and external stimuli remain a largely open but significant problem. Any such design should ideally be feasible to implement with current experimental technologies.
Modular Response Analysis (MRA) approaches, first pioneered by Kholodenko and colleagues in 2002^{37,38}, inherently deal with cycles and directionality by prescribing systematic perturbation experiments followed by steady-state measurements. The premise for data requirements is to measure the entire system response to at least one perturbation for each node. Thus, an n-node system requires n experiments, if the system response can be measured in a global fashion (i.e. all nodes measured at once). The original instantiations struggled with the impact of experimental noise, but total least squares MRA and Monte Carlo sampling helped to improve performance^{39,40,41}. Incomplete and prior knowledge can be handled as well, using both maximum likelihood and Bayesian approaches^{42,43,44,45}. However, these approaches are based on steady-state or fixed time point data, limiting their ability to deal with dynamic systems. There is a formal requirement for small perturbations, which are experimentally problematic and introduce issues for estimation with noisy data. Subsequent approaches have recommended the use of large perturbations as a trade-off in dealing with noisy data, but the theory still formally requires small perturbations^{41}. Lastly, there are two classes of biologically relevant edges that MRA does not comprehensively address. The first is self-regulation of a node, which is often normalized (to −1), causing it to not be uniquely identifiable. The second is the effect of stimuli external to the network (basally present or administered) on the modeled nodes.
In addition to perturbations, another experimental design feature that can inform directionality is a time series, which has also been integrated into MRA. This work^{37,46} uses time-series perturbation data to uniquely infer a signed, directed network that can predict dynamic network behavior. In an n-node open system (e.g. protein levels are not constant), multiple nodes would either be distinctly perturbed more than once (such as both production and degradation of a transcript, or phosphorylation and dephosphorylation of a protein), or the system monitored before and after the perturbation (with one perturbation per node). This can be experimentally challenging, both in terms of scale and of finding suitable distinct perturbations for a node. Moreover, as is often the case, noise in the experimental data severely limits inference accuracy (due to required estimation of 2nd derivatives). Subsequent work^{47} recommends smaller perturbations and smaller differences between time points, but also does not address noisy data. Further work has demonstrated that larger perturbations produce better results given inevitable experimental noise^{41}. Thus, there remains a need for methods that can infer signed, directed networks from feasible perturbation time course experiments; that capture dynamics; that can uniquely estimate edge properties related to self-regulation and external stimuli; and that function in the presence of typical experimental noise levels.
Here we describe a novel, MRA-inspired approach called Dynamic Least-squares MRA (DL-MRA). For an n-node system, n perturbation time courses are required, and thus experimental requirements scale linearly with network size. The approach uses an underlying network model that captures dynamic, directional, and signed networks that include cycles, self-regulation, and external stimulus effects. We test DL-MRA using simulated time-series perturbation data with known network topology under increasing levels of simulated noise. The approach has good accuracy and precision for identifying network structure in randomly generated two- and three-node networks that contain a wide variety of cycles. For the investigated cases, we find that between 7 and 11 evenly distributed time points yield reasonable results, although we expect this will depend strongly on time point placement. We apply the approach to models describing a cell state switching network^{48}, a signal transduction network^{49}, and a gene regulatory network^{32}. Although signaling networks are often a focus in network biology, our analysis suggests they have unique properties that render them generally recalcitrant to reconstruction. Results from the gene regulatory network application suggest that incomplete perturbation (e.g. partial knockdown vs. knockout) is more informative than complete inhibition. While challenges remain for expanding to other and larger systems, the proposed algorithm robustly infers a wide range of networks with good specificity and sensitivity using feasible time course experiments, all while making progress on limitations of current inference approaches.
Results
Formulation of sufficient experimental data requirements for network reconstruction
Consider a 2-node network with four directed, weighted edges (Fig. 1a). An external stimulus may affect each of the two nodes differently; its effect is quantified by S_{1,ex} and S_{2,ex}, respectively (e.g. Methods, Eq. (15)). We also allow for basal/constitutive production in each node (S_{i,b}). Let x_{i}(t_{k}) be the activity of node i at time point t_{k}. The network dynamics can be cast as a system of ordinary differential equations (ODEs) as follows
The network edges can be connected to the system dynamics through the Jacobian matrix J^{37,38,46},
The network edge weights (the F_{ij}'s) describe how the activity of one node affects the dynamics of another node in a causal and direct sense, given the explicitly considered nodes (though not necessarily in a physical sense). In practice, however, causality can only be approached if every component of the system is included in the model, which is not typical (and, moreover, there must be no model mismatch, which is almost impossible to guarantee)^{6,30,31,50,51}. In MRA, these nodes may be individual species or "modules". To simplify a complex network, it may often be separated into "modules" comprising smaller networks of interconnected species, with the assumption that each module is generally insulated from other modules except for information transfer through so-called communicating species^{37}. Cases where such modules may not be completely isolated are explored elsewhere^{52}.
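As a minimal numerical sketch of this relationship (our illustration with assumed linear dynamics and invented parameter values, not the paper's model), the edge weights F_{ij} are the Jacobian entries ∂(dx_i/dt)/∂x_j of the dynamics, which can be recovered by finite differences:

```python
import numpy as np

# Hypothetical 2-node linear network: F holds the edge weights F_ij,
# S_ex the external-stimulus strengths S_{i,ex} (values invented).
F = np.array([[-1.0, 0.5],
              [ 0.8, -0.8]])
S_ex = np.array([1.0, 0.0])

def dxdt(x):
    # dx/dt = F x + S_ex; for linear dynamics the Jacobian is exactly F
    return F @ x + S_ex

# Recover each column of the Jacobian by central finite differences
x0 = np.array([0.3, 0.7])
eps = 1e-6
J = np.zeros((2, 2))
for j in range(2):
    d = np.zeros(2)
    d[j] = eps
    J[:, j] = (dxdt(x0 + d) - dxdt(x0 - d)) / (2 * eps)

assert np.allclose(J, F)
```

For nonlinear dynamics the same construction gives a time-varying Jacobian evaluated along the trajectory, which is the quantity the reconstruction targets.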
What experimental data are sufficient to uniquely estimate the signed directionality of the network edges and thus infer the causal relationships within the system? Fundamentally, we know that perturbations and/or dynamics are important for inferring causality^{6,37,46,51,52}. Consider a simple setup of three time-course experiments that each measure x_{1} and x_{2} dynamics in response to a stimulus (Fig. 1b–g). One time course is in the presence of no perturbation (vehicle), one has a perturbation of Node 1, and one has a perturbation of Node 2. Consider further that the perturbations are reasonably specific, such that the perturbation of x_{1} has negligible direct effects on x_{2}, and vice versa, and that these perturbations may be large. Experimentally, this could be an shRNA or gRNA that is specific to a particular node, or a small-molecule inhibitor used at a low enough dose to predominantly inhibit the targeted node. A well-posed estimation problem can be formulated (see Methods) that, in principle, allows for unique estimation of the Jacobian elements as a function of time with the following set of linear algebra relations:
Here, y_{i,j} refers to a measured first time derivative of node i in the presence of node j perturbation (if used), and Δ to a difference with respect to perturbation (subscript p) or time (subscript t) (see Methods). Since we do not use data from the perturbation of node i for estimation of node i edges, we do not have to impose assumptions on how the perturbation functionally acts on the system dynamics (see Methods). Moreover, constraints on the perturbation strength can be relaxed, following recent recommendations^{41} (although accuracy of the underlying Taylor series approximation can affect estimation—see Methods). If these measurements with and without perturbations were each taken at steady state as is done in MRA, the solution for F_{ij} would be trivial. MRA gets around this by normalizing the self-regulatory parameters F_{ii} to −1. Using dynamic data allows unique estimation of self-regulatory parameters without such normalization. Estimation of the node-specific stimulus strengths or basal production rates (S's) requires evaluation after specific functional assumptions, but in general these effects are knowable from the data to be generated (see Methods and below results).
Note that this formulation is generalizable to an n-dimensional network. With n^{2} unknown parameters in the Jacobian matrix, n equations originate from the vehicle time course and n−1 equations originate from each of the n perturbations (discarding equations from Node i under Perturbation i). This results in \(n + n(n-1) = n + n^{2} - n = n^{2}\) independent equations.
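The equation count above can be checked directly (a trivial sketch of the bookkeeping):

```python
# n equations from the vehicle time course, plus (n - 1) usable equations
# from each of the n single-node perturbations (equations from node i under
# perturbation i are discarded).
def independent_equations(n):
    return n + n * (n - 1)

# Matches the n^2 unknown Jacobian entries for any network size.
assert all(independent_equations(n) == n**2 for n in (2, 3, 5, 10))
```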
Using sufficient simulated data to reconstruct a network
As an initial test of the above formulation, we used a simple 2-node, single-activator network where Node 1 activates Node 2, one node has first-order degradation (−1 diagonal element), and the other has negative self-regulation (−0.8 diagonal) (Fig. 1a—see Methods for equations). A stimulus at t = 0 (time-invariant; S_{i,ex} = 1) increases the activity of each node, which we sample with an evenly spaced 11-point time course. This simulation was done for no perturbation (i.e. vehicle) and for each perturbation (Node 1 and Node 2) to generate the necessary simulation data per the theoretical considerations above (Fig. 1c, e, g, left panel). Here, we modeled perturbations as complete inhibition; for example, a perturbation of Node 1 makes its value 0 at all times. Solving Eqs. (3) and (4) to infer the Jacobian elements at each time point yielded good agreement between the median estimates and the ground truth values (Fig. 1h, "Analytic Solution", No Noise). Using the node activity data corresponding to the last time point in the time course and the median estimates of Jacobian elements, the external stimuli S_{1,ex} and S_{2,ex} were also determined (Eqs. (18) and (19)) and agree reasonably with the ground truth values.
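This data-generation step can be sketched as follows (our illustration, assuming linear dynamics with the stated parameter values; clamping the derivative is our reading of "makes its value 0 at all times", not necessarily the paper's exact implementation):

```python
import numpy as np
from scipy.integrate import solve_ivp

# Edge weights from the text: Node 1 activates Node 2 (F_21 = 0.8),
# first-order degradation (F_11 = -1), negative self-regulation (F_22 = -0.8).
F = np.array([[-1.0, 0.0],
              [ 0.8, -0.8]])
S_ex = np.array([1.0, 1.0])  # time-invariant stimulus on both nodes (assumed)

def make_rhs(clamped=None):
    def rhs(t, x):
        dx = F @ x + S_ex
        if clamped is not None:
            dx[clamped] = 0.0  # complete inhibition: the node never leaves 0
        return dx
    return rhs

t_eval = np.linspace(0, 10, 11)  # 11 evenly spaced time points
courses = {}
for label, clamp in [("vehicle", None), ("perturb1", 0), ("perturb2", 1)]:
    sol = solve_ivp(make_rhs(clamp), (0, 10), np.zeros(2), t_eval=t_eval)
    courses[label] = sol.y
```

For these assumed values, removing Node 1's input lowers Node 2's late-time level (roughly 2.25 in the vehicle course vs. 1.25 under Node 1 perturbation), which is exactly the contrast the reconstruction exploits.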
How does this approach fare when data are noisy? We performed the estimation with the same data but with a relatively small amount of simulated noise added (10:1 signal-to-noise; Fig. 1c, e, g). The resulting estimates are neither accurate nor precise, varying on a scale more than ten times greater than each parameter's magnitude, with median predictions both positive and negative regardless of the ground truth value (Fig. 1i). The stimulus strengths S_{1,ex} and S_{2,ex} are estimated to be negative, while the ground truth is positive.
Although the analytic equations suggest the sufficiency of the perturbation time course datasets to uniquely estimate the edge weights, in practice even small measurement noise corrupts estimates obtained from direct solution of these equations. Therefore, we considered an alternative representation by employing a least squares estimation approach rather than solving the linear equations directly. For a given set of guesses for edge weight and stimulus parameters, one can integrate to obtain a solution for the dynamic behavior of the resulting model, which can be directly compared to data in a least-squares sense. Least squares methods were shown to improve traditional MRA-based approaches^{39,40}, but had never been formulated for such dynamic problems. Two hurdles were how to model the effect of a perturbation without (i) adding additional parameters to estimate or (ii) requiring strong functional assumptions regarding perturbation action. We solved these here by using the already-available experimental measurements within the context of the least-squares estimation (see Methods). We applied this approach to the single-activator model with 10:1 signal-to-noise ratio above, where the analytic approach failed. This new estimation approach was able to infer the network structure accurately and precisely (Fig. 1j). We conclude that analytic formulations can be useful for suggesting experimental designs that should be sufficient for obtaining unique estimates in a network reconstruction exercise, but directly applying those equations in practice may yield neither precise nor accurate estimates. Alternatively, a least-squares formulation seems to work well for this application.
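A minimal sketch of such a least-squares formulation follows (ours, under assumed linear dynamics; the paper substitutes the measured values of the perturbed node into the estimation, whereas for brevity we clamp the perturbed node's derivative, and all parameter values are invented):

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import least_squares

rng = np.random.default_rng(0)
t_eval = np.linspace(0, 10, 11)

F_true = np.array([[-1.0, 0.0], [0.8, -0.8]])
S_true = np.array([1.0, 1.0])

def simulate(F, S, clamp=None):
    def rhs(t, x):
        dx = F @ x + S
        if clamp is not None:
            dx[clamp] = 0.0  # hold the perturbed node at zero
        return dx
    return solve_ivp(rhs, (0, 10), np.zeros(2), t_eval=t_eval).y

# Simulated experiment: vehicle plus one perturbation per node, ~10:1 S/N
data = {c: simulate(F_true, S_true, c) for c in (None, 0, 1)}
noisy = {c: d + rng.normal(scale=np.abs(d) / 10 + 1e-12) for c, d in data.items()}

def residuals(theta):
    F, S = theta[:4].reshape(2, 2), theta[4:]
    res = []
    for c in (None, 0, 1):
        sim = simulate(F, S, c)
        # do not use node i data under perturbation of node i
        keep = [i for i in range(2) if i != c]
        res.append((sim[keep] - noisy[c][keep]).ravel())
    return np.concatenate(res)

# Generic stable starting guess (negative diagonal, small positive stimuli)
theta0 = np.array([-0.5, 0.0, 0.0, -0.5, 0.5, 0.5])
fit = least_squares(residuals, x0=theta0, bounds=(-3, 3))
F_hat = fit.x[:4].reshape(2, 2)
```

The fitted signs of the dominant edges (negative self-regulation, positive Node 1 → Node 2 activation) are recovered despite the noise, mirroring the behavior reported for Fig. 1j.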
Reconstruction of random 2 and 3 node networks
To investigate the robustness of the least-squares estimation approach, we applied it to increasingly complex networks with larger amounts of measurement noise and smaller numbers of time points (Fig. 2). We focused on 2- and 3-node networks. We generated 50 randomized 2- and 3-node models, where each edge weight is randomly sampled from a uniform distribution over the interval [−2, 2], and the basal and external stimulus strengths from [0, 2] (Fig. 2a, Supplementary Figs. S1a and S2a). Each random network was screened for stability. Many networks (29/50 for 2 node and 3 node) displayed potential for oscillatory behavior (non-zero imaginary parts of the eigenvalues of the Jacobian matrix). However, since the real parts of the eigenvalues are non-zero and negative, these oscillations should dampen over time, and no sustained oscillatory behavior was analyzed. For each random model, we generated a simulated dataset based on the prescribed experimental design, using complete inhibition as the perturbation. We considered evenly spaced sampling within the time interval of 0–10 AU (the approximate time to reach steady state—Supplementary Figs. S1b and S2b) with different numbers of time points (3, 7, 11 and 21), and added 10:1, 5:1, and 2:1 signal-to-noise to the data. Non-uniform time point spacing may change inference results, but that was not explored in these first investigations.
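The random-model generation and stability screen can be sketched as follows (our illustration; the sampling ranges follow the text, the rejection-sampling scheme is an assumption):

```python
import numpy as np

rng = np.random.default_rng(1)

def random_stable_models(n_nodes, n_models=50):
    """Sample edge weights from U[-2, 2] and basal/external strengths from
    U[0, 2], keeping only models whose Jacobian eigenvalues all have negative
    real parts (stable; complex pairs give damped oscillations)."""
    models = []
    while len(models) < n_models:
        F = rng.uniform(-2, 2, size=(n_nodes, n_nodes))
        S = rng.uniform(0, 2, size=n_nodes)
        if np.all(np.linalg.eigvals(F).real < 0):
            models.append((F, S))
    return models

models = random_stable_models(3)
# Count models with damped-oscillatory potential (complex eigenvalues)
oscillatory = sum(np.any(np.linalg.eigvals(F).imag != 0) for F, _ in models)
```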
For each random network model, number of time points, and noise level, we evaluated the fidelity of the proposed reconstruction approach in terms of signed directionality (Fig. 2c–f). We found overall reasonable agreement between inferred and ground truth values, even at the higher noise levels and low numbers of time points. Expectedly, the overall classification accuracy increases with more time points and decreases with higher noise levels. But, surprisingly, even in the worst case investigated (3 time points and 2:1 signal-to-noise ratio), classification accuracy was above 85% for 2-node models and 70% for 3-node models. Increasing the number of nodes decreases performance, with 3-node reconstruction being slightly worse than 2-node reconstruction, other factors held constant.
We wondered whether the magnitude of an edge weight influenced its classification accuracy, since small edge weights may be more difficult to discriminate from noise. We found that edge weights with greater absolute values, which are expected to have a greater influence on the networks, were more likely to be classified correctly (Supplementary Figs. S1c–f and S2c–f). Also, for models with damped oscillatory behavior, the classification accuracy is very similar to that of all 50 random models (Supplementary Fig. S3a, b).
How does this method compare to similar network reconstruction methods? There are limited methods to compare to that also use dynamic data and sequential perturbations. MRA^{37}, from which this method was inspired, uses steady-state data. However, we could use MRA variants requiring dynamic perturbation data, as is used in our method^{46,53}. To compare, we further generated another set of perturbation data with 50% perturbation (as opposed to 100%). We then used the two sets of perturbation data to estimate the network edges with dynamic modular response analysis (Fig. 2g). Even in the absence of noise, for low to medium numbers of time points (3–11) the network is not always accurately inferred (Fig. 2g). In the presence of noise, DL-MRA performs better, although the difference between the two methods narrows at high numbers of time points. Thus, DL-MRA not only outperforms with half the data, but it also estimates 6 additional parameters: basal production and external stimulus strength for each node. Although Cho's approach^{47} builds upon MRA methods by recommending smaller time point intervals and smaller perturbations, for our purposes the time intervals and perturbations are fixed, so this would not affect the results obtained here. Moreover, further work has actually recommended larger perturbations when dealing with noisy data^{41}.
To explore a scenario where data from a node might be unavailable, we removed the data from one of the nodes in the 50 random 3-node models and used the remaining data to reconstruct a 2-node system (Supplementary Fig. S4). Comparing with corresponding model parameters in the 3-node system, we find good but expectedly reduced classification accuracy (no noise: 94.75%; 10:1 signal-to-noise: 93.75%; 5:1: 91.25%; 2:1: 87%).
Part of the inference process is performing parameter estimation from multiple starting guesses (i.e. multi-start), and we wanted to determine how robust the estimated parameters were across the multi-start processes. We examined the distribution of the coefficient of variation (CV) among the parameters in the multi-start results for the 50 random 3-node models, keeping runs where either the sum of squared errors (SSE) between data generated from the estimated parameters and the original data was low (<10^{−4}) or the SSE was less than twice the minimum SSE. We find that the CVs peak around zero and generally have a small spread, especially for low noise scenarios (Supplementary Fig. S5). This implies good convergence of the parameter sets obtained through multi-start.
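This convergence check reduces to a simple filter-and-summarize step, sketched here with invented numbers (not the paper's multi-start output):

```python
import numpy as np

# Hypothetical multi-start results: one SSE and one parameter vector per run;
# the last run is stuck in a poor local optimum.
sse = np.array([0.010, 0.011, 0.010, 0.250])
params = np.array([[0.80, -1.00, 0.50],
                   [0.79, -1.01, 0.49],
                   [0.81, -0.99, 0.51],
                   [2.50, 0.30, -1.20]])

# Keep runs with SSE less than twice the minimum SSE, then compute the
# coefficient of variation (CV) of each parameter across the kept runs.
keep = sse < 2 * sse.min()
good = params[keep]
cv = good.std(axis=0, ddof=1) / np.abs(good.mean(axis=0))
```

Small CVs across the retained runs indicate that the independent starts converged to essentially the same parameter set.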
We conclude that the network parameters of 2- and 3-node systems can be robustly and uniquely estimated using DL-MRA. However, these were ideal conditions, without the model mismatch that is expected in specific biological applications. How does DL-MRA perform when applied to data reflective of different biological use cases?
Application to cell state networks
Cell state transitions are central to multicellular organism biology. They are commonly transcriptomic in nature, underlie development and tissue homeostasis, and can also play roles in disease, such as drug resistance in cancer^{48,54,55,56,57,58,59,60,61}. Could DL-MRA reconstruct cell state transition networks? As the application, we use previous data on SUM159 cells that transition between luminal, basal, and stem-like states^{48}. Pure populations of luminal, basal, and stem-like cells eventually grow to a stable final ratio amongst the three. The authors used a discrete-time Markov transition probability model to describe the data and estimate a cell state transition network (Fig. 3a). Thus, we seek to compare DL-MRA to such a Markov model in this case.
We hypothesized that perturbations to the system in this case, in contrast to above, did not have to alter node activities (i.e. edges). Rather, we reasoned that perturbing the equilibrium cell state distribution could serve an equivalent purpose. Thus, the data for reconstruction consisted of observing the cell state proportions evolve over time from "pure" populations (Fig. 3b), in addition to equal proportions. DL-MRA is capable of explaining the data (Fig. 3b). Interpretation of the network parameters estimated by DL-MRA depends on the transformation of the original discrete-time Markov probabilities to a continuous-time formulation (see Methods—there are constraints on self-regulatory parameters), but DL-MRA correctly infers the cell state transition network as well (Fig. 3c). Conveniently, DL-MRA is not constrained to 1-day time point spacing as the original discrete-time Markov model is.
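One standard route from a discrete-time transition-probability matrix to a continuous-time formulation is the matrix logarithm; we sketch it here with an invented matrix (the paper's actual transformation is given in its Methods and may differ):

```python
import numpy as np
from scipy.linalg import logm, expm

# Hypothetical 1-day transition-probability matrix among three cell states
# (rows: from-state; values invented for illustration, not the SUM159 data).
P = np.array([[0.90, 0.07, 0.03],
              [0.05, 0.90, 0.05],
              [0.02, 0.08, 0.90]])

# A continuous-time rate matrix Q satisfying P = expm(Q * 1 day); an ODE
# model built on Q is not tied to the Markov chain's 1-day sampling grid.
Q = logm(P).real

assert np.allclose(expm(Q), P, atol=1e-8)  # recovers the 1-day step
assert np.allclose(Q.sum(axis=1), 0.0)     # rate-matrix rows sum to zero
```

Because P's rows sum to one, the all-ones vector is an eigenvector with eigenvalue 1, which the logarithm maps to 0; this is why Q's rows sum to zero and probability is conserved at every time scale.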
How do noise and the number of time points affect the reconstruction? As above, we generated data for 50 random cell state transition models with 3, 7, 11 and 21 time points within 5 days, as the models generally seemed to reach close to equilibrium within 5 days. Noise levels of 10:1, 5:1 and 2:1 were used. All parameters were classified accurately (Fig. 3d) (although additional constraints in the estimation—see Methods—facilitate this classification performance). With 3 time points, there was deviation from a perfect fit even with no noise in the data. At 7 or more time points, the estimates matched ground truth well, and noise expectedly reduced the accuracy (Fig. 3d). We conclude that DL-MRA can robustly infer cell state networks given perturbation data in the form of non-equilibrium proportions as initial conditions.
Application to intracellular signaling networks
How does the method perform for intracellular signaling networks? The Huang–Ferrell model^{49} (Fig. 4a) is a well-known intracellular signaling pathway model and has been investigated by different reconstruction methods, including previous versions of MRA^{37,39,41,46,62}. It captures signal flux through a three-tiered MAPK cascade where the 2nd and 3rd tiers contain two phosphorylation sites. An important aspect of the Huang–Ferrell model is that although the reaction scheme is a cascade without obvious feedbacks, there may be hidden feedbacks due to sequestration effects, depending on how the perturbations are performed.
To reconstruct the Huang–Ferrell MAPK network through DL-MRA, we first simplified it to a three-node model with pMAPKKK, ppMAPKK and ppMAPK as observable nodes, as is typical for reconstruction efforts (Fig. 4b)^{37,39,41,46,52,62}. Second, to model perturbations, we sequentially perturbed the activation parameters of each of the observable species (k3, k15 and k27, respectively). Such perturbations, although hard to achieve experimentally, are important because modules must be "insulated" from one another and perturbations must be specific to the observables^{37,52}. Even specific inhibitors do not have such kinetic specificity. Third, in the simplified reaction scheme the observables are shown to influence each other directly, but in the actual scheme they exert their effects through the unphosphorylated and semi-phosphorylated species. We sought to keep the levels of these two species relatively constant between different perturbations, so that they would not add nonlinearities to the estimation. Therefore, we used a stimulus which only activated the observables to a maximum of about 5% of the total forms of the protein^{52}.
Estimation with DL-MRA under the above conditions fits the data (Fig. 4c) and predicts positive edges down the reaction cascade (F_{21}, F_{32}), a negligible direct relation between pMAPKKK and ppMAPK (F_{13}, F_{31}), negative self-regulation of each of the observables (F_{11}, F_{22}, F_{33}), negative feedbacks from ppMAPKK to pMAPKKK (F_{12}) and from ppMAPK to ppMAPKK (F_{23}), and negligible external stimuli on ppMAPKK and ppMAPK. All these effects are consistent with the reaction scheme. The negative feedback effects, although not immediately obvious, are consistent with ground truth sequestration effects. For instance, ppMAPK has an overall negative effect on ppMAPKK because the existence of ppMAPK lowers the amounts of the species MAPK and pMAPK, which sequester ppMAPKK and shield it from deactivation by its phosphatase.
How do the estimation results for the Huang–Ferrell model in our method compare with those obtained from other methods? Previous work using MRA also reported negative feedbacks from successive modules to the preceding ones^{37,46,52}. Similarly, self-regulation parameters in most preceding MRA-based methods are also negative, but are fixed at −1^{37,39,52}.
Besides MRA-inspired methods, SELDOM is another network reconstruction method that can also deal with dynamic data^{62}. SELDOM is a data-driven method that uses ensembles of logic-based dynamic models, followed by training and model reduction steps, to predict state trajectories under untested conditions. However, when dealing with the Huang–Ferrell network, the true value model of SELDOM does not map the effects of self-regulation, nor feedback effects between nodes (Fig. 4e). This may be explained by the fact that although SELDOM uses an extensive number of models to test the data obtained from multiple different stimuli, perturbation data were not included to test the Huang–Ferrell model. This implies that systematic perturbations of each of the nodes, as prescribed by MRA-based methods, are necessary to unearth feedbacks and self-regulation effects.
Although application of DL-MRA to the Huang–Ferrell model was able to unearth latent network structure, the simulation conditions were restrictive. First, the perturbation scheme chosen in this paper, although specifically targeted at the observable species, is hard to produce experimentally. In practice, knockdown/knockout, overexpression, or specific inhibitors could be used as suitable perturbations, but they do not have the precision needed to be compatible with MRA-imposed constraints. The feedback effect observed could depend on the perturbation scheme chosen; for instance, knockdown of an entire module as a perturbation would likely have manifested as positive feedback to the preceding module. That is because such a knockdown would have reduced the module's sequestration of the preceding observable, making that observable more available for dephosphorylation. Second, we assumed a low stimulus to avoid effects from the unphosphorylated versions of the proteins. A higher activation may increase nonlinearities, adding to the complexity of the model, whereas a lower stimulus may not activate enough protein to be well detected in experiments. The degree of activation needed for an experiment may be hard to predict beforehand. Such specific perturbations and stimuli were necessary to reduce effects arising from the behavior of the non-observable species. Hence, application of DL-MRA to intracellular signaling networks with multiple physical interactions needs to be carefully considered before modeling or experiments.
Application to gene regulatory networks: partial perturbations are more informative than full perturbations
Here, we applied DL-MRA further to a series of well-studied nonlinear feed-forward loop (FFL) gene regulatory network models that have time-varying Jacobian elements (Fig. 5a, Table 1)^{32,33}. Such FFL motifs are strongly enriched in multiple organisms and are important for signaling functions such as integrative control, persistence detection, and fold-change responsiveness^{63,64,65}.
The FFL network has three nodes (x_{1}, x_{2}, and x_{3}), and the external stimulus acts on x_{1} (S_{1,ex}). There is no external stimulus on x_{2} and x_{3}; however, there may be basal production of x_{2} (S_{2,b}) and x_{3} (S_{3,b}). Each node exhibits first-order decay (F_{ii} = −1). The parameters F_{12}, F_{13}, and F_{23} represent connections that do not exist in the model; we call these null edges, but we allow them to be estimated. The relationship between x_{1} and x_{2} (F_{21}), between x_{1} and x_{3} (F_{31}), or between x_{2} and x_{3} (F_{32}) can be either activating or inhibitory. Furthermore, x_{1} and x_{2} can regulate x_{3} through an "AND" gate (both needed) or an "OR" gate (either sufficient) (Fig. 5a). These permutations give rise to 16 different FFL structures (Table 1).
To generate simulated experimental data from these models, we first integrated the system of ODEs starting from a zero initial condition to find the steady state in the absence of stimulus. We then introduced the external stimulus and integrated the system of ODEs (see Methods) to generate time series perturbation data consistent with the proposed reconstruction algorithm, using full inhibitory perturbations. We used 11 evenly spaced timepoints for all 16 nonlinear models, based on the random 3node model analysis above, and also added noise as above.
We first noticed that even in the absence of added noise, a surprising number of inferences were incorrect (Fig. 5b, f). Model #1 (Table 1, Fig. 5b, c) serves as an example: F_{21}, F_{31} and F_{32} are activators with an AND gate, and F_{31} is incorrectly predicted as null (Fig. 5b; compare ground truth to 100% inhibition). To understand the reason for the incorrect estimation, we looked at the node activity dynamics across the perturbation time courses (Fig. 5d). All three nodes start from an initial steady state of zero, but Node 3 remains zero in all three perturbation cases, for the following reason. Since x_{1} is required for the activation of x_{2} and x_{3}, complete inhibition of x_{1} completely blocks activation of both x_{2} and x_{3}. And because both x_{1} and x_{2} are required for the activation of x_{3}, completely inhibiting x_{2} activity also completely inhibits x_{3}. Thus, given this experimental setup, it is impossible to discern whether x_{1} directly influences x_{3} or acts solely through x_{2}.
We thus reasoned that full inhibitory perturbation may suppress the information necessary to correctly reconstruct the network, but that a partial perturbation experiment may retain enough information to make a correct estimate. If this were true, then upon applying partial perturbations (we chose 50% here), Node 3 dynamics should differ across the perturbation time courses. Simulations showed that this is the case (Fig. 5e). Subsequently, we found that with partial perturbation data, F_{31} is correctly identified as an activator. More broadly, we obtain perfect classification from noise-free data across all 16 FFL networks when partial perturbation data are used, as opposed to 5/16 networks showing discrepancies with full perturbation data (Fig. 5f). The fits of the reconstructed models to the simulated data align very closely, despite model mismatch (Supplementary Fig. S6). We conclude that for these nonlinear networks, partial inhibition is necessary to estimate all the network parameters accurately. Thus, moving forward, we applied 50% perturbations to all simulation data and proceeded with least squares estimation.
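The full-versus-partial perturbation effect can be reproduced with a minimal sketch of an AND-gate coherent FFL. This is illustrative only: the Hill activation form, the constant K, the exponent n, and the unit decay rates are assumptions, not the paper's exact models or parameter values.

```python
import numpy as np
from scipy.integrate import solve_ivp

def and_ffl(t, x, p1, p2, K=0.5, n=2):
    """Coherent AND-gate FFL (cf. Model #1): x1 -> x2, x1 -> x3, x2 -> x3,
    all activating. p1, p2 scale production of x1 and x2
    (1 = unperturbed, 0.5 = 50% knockdown, 0 = full knockout)."""
    x1, x2, x3 = x
    act = lambda u: u**n / (K**n + u**n)   # Hill activation (assumed form)
    dx1 = p1 * 1.0 - x1                    # external stimulus S1,ex = 1
    dx2 = p2 * act(x1) - x2
    dx3 = act(x1) * act(x2) - x3           # AND gate: both inputs required
    return [dx1, dx2, dx3]

t_eval = np.linspace(0, 10, 11)            # 11 evenly spaced timepoints
results = {}
for label, (p1, p2) in {"no perturbation": (1.0, 1.0),
                        "full x1 knockout": (0.0, 1.0),
                        "full x2 knockout": (1.0, 0.0),
                        "50% x1 knockdown": (0.5, 1.0)}.items():
    sol = solve_ivp(and_ffl, (0, 10), [0, 0, 0], args=(p1, p2), t_eval=t_eval)
    results[label] = sol.y[2].max()        # peak Node 3 activity
    print(f"{label}: max x3 = {results[label]:.3f}")
```

Under either full knockout, x_{3} remains identically zero, so the data cannot distinguish the direct x_{1} → x_{3} edge from the indirect path through x_{2}; the 50% knockdown leaves x_{3} nonzero and therefore informative.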
Application to gene regulatory networks: performance
The above analysis prompted us to use a partial (50%) perturbation strategy, since it correctly classified each edge of each model in the absence of noise. What classification performance do we obtain in the presence of varying levels of experimental noise? We first devised the following strategy to assess classification performance. We generated 50 bootstrapped datasets for each network structure/signal-to-noise pair, and thus obtained 50 sets of network parameter estimates. To classify the network parameters, we used a symmetric cutoff of a percentile window around the median of these 50 estimates (Fig. 6a). We illustrate this approach with three example edges and their associated estimates: one positive (Edge 1), one negative (Edge 2), and one null (Edge 3). Given the window of values defined by the chosen percentile cutoff, if the estimates in this window are all positive, then the network parameter is classified as positive. Similarly, if the estimates in this window are all negative, then the parameter is classified as negative. Finally, if the estimates in the window cross zero (i.e. span both positive and negative values), then it is classified as null. First, consider the case where the percentile window is set at the median with no percentile span. Then, the classifications for true positives and negatives are likely to be accurate, while the null parameters are likely to be incorrectly categorized as either positive or negative (Fig. 6a). If we widen the percentile window slightly (e.g. between the 40th and 60th percentiles, middle panel), we can categorize null edges better while maintaining good classification accuracy for both true positive and negative edges. However, if we relax the percentile window too much (e.g. between the 10th and 90th percentiles, far right panel), we may categorize most parameters as null, including the true positives and negatives.
Thus, it is clear there will be an optimal percentile cutoff that maximizes true positives and minimizes false positives as the threshold is shifted from the median to the entire range.
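A minimal sketch of this percentile-window classifier follows; the helper name and the synthetic bootstrap distributions are illustrative assumptions, not the paper's code.

```python
import numpy as np

def classify_edge(estimates, lo_pct, hi_pct):
    """Classify an edge from bootstrapped parameter estimates using a
    symmetric percentile window around the median.
    Returns 'positive', 'negative', or 'null'."""
    lo, hi = np.percentile(estimates, [lo_pct, hi_pct])
    if lo > 0:
        return "positive"
    if hi < 0:
        return "negative"
    return "null"                          # window spans zero

# Synthetic bootstrap distributions for the three example edges.
rng = np.random.default_rng(0)
edge1 = rng.normal(1.5, 0.3, 50)           # true positive edge
edge2 = rng.normal(-0.8, 0.3, 50)          # true negative edge
edge3 = rng.normal(0.0, 0.3, 50)           # true null edge

for window in [(50, 50), (40, 60), (10, 90)]:
    print(window, [classify_edge(e, *window) for e in (edge1, edge2, edge3)])
```

The median-only window (50, 50) necessarily calls the null edge positive or negative, while a very wide window tends to call everything null, mirroring the tradeoff described above.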
Next, we applied this classification strategy to the 16 FFL model estimates from data with different noise levels. We varied the percentile window from the median only to the entire range of estimated values and calculated the true and false positive rates for all edges across all 16 FFL models, which allowed generation of receiver operating characteristic (ROC) curves (Fig. 6b). For each noise level, we chose the percentile window that yielded a 5% false positive rate (13th–87th percentile for 10:1 signal-to-noise, 19th–81st for 5:1, and 21st–79th for 2:1). Using this simple cutoff classifier, we observed good classification performance across all noise levels according to traditional area under the ROC curve metrics (10:1 AUC = 0.99, 5:1 AUC = 0.9, 2:1 AUC = 0.92).
How does classification accuracy break down by FFL model and edge type? To evaluate the performance for each of the 16 FFL cases, we calculated the fraction of the 12 links in each FFL model that was classified correctly as a function of signal-to-noise, given the percentile windows determined above (Fig. 6c). We also looked at the fraction of the 16 models in which each of the 12 links was correctly classified (Fig. 6d). Perfect classification is a value of one, which is the case for no noise, and for many cases with 10:1 signal-to-noise.
In general, as noise level increases, prediction accuracy decreases, as expected. Although performance at 2:1 signal-to-noise is poor for some models and parameters, in other cases it is surprisingly good. This suggests that the proposed method can yield information even in high-noise settings; this information is particularly impactful for null, self-regulatory, and stimulus edges. High noise strongly affects inference of edges that are distinct across models, time-variant, or reliant on other node activities (F_{21}, F_{31}, F_{32}) (Fig. 6c, d, Supplementary Fig. S7). F_{21}, which relies on the activity of x_{1}, is inferred better than F_{31} and F_{32}. This may be because x_{3} dynamics depend on both x_{1} and x_{2}, whereas x_{2} dynamics depend only on x_{1}.
Comparing across models, we find that Models 1–8 are reconstructed slightly better than Models 9–16 (Fig. 6c) when noise is high. This performance gap is predominantly caused by misclassification of S_{3,b}, the basal production of Node 3 (Supplementary Fig. S7). What is the reason for the possible misclassification of S_{3,b} in Models 9–16? We know that S_{3,b} depends on the initial values of x_{1}, x_{2} and x_{3} and the estimated values of F_{31}, F_{32} and F_{33} (see Methods, Eq. (19)). For Models 1–8, x_{1}(t = 0) and x_{2}(t = 0) are both zero and therefore S_{3,b} is effectively dependent only on the estimated value of F_{33} and on x_{3}(t = 0) (Supplementary Fig. S6 and Methods). But for Models 9–16, x_{2}(t = 0) is nonzero and S_{3,b} depends on the estimated values of both F_{32} and F_{33}, in addition to x_{2}(t = 0) and x_{3}(t = 0), which increases the variability of the S_{3,b} estimates. Therefore, with high levels of noise, S_{3,b} is more likely to be misclassified in Models 9–16, whereas this does not happen in Models 1–8 (Fig. 6c, d, Supplementary Fig. S7). In the future, including the stimulus and basal production parameters in the least squares estimation itself, rather than deriving further algebraic relations to estimate them, will likely improve reliability.
We conclude that (i) when dealing with nonlinear gene regulatory networks, complete perturbations such as genetic knockouts may fundamentally impede one’s ability to deduce network architecture and (ii) this class of nonlinear networks can be reconstructed with reasonable performance using the proposed strategy employing partial perturbations.
Discussion
Despite intensive research focus on network reconstruction, there is still room to improve discrimination between direct and indirect edges (towards causality), particularly when biologically ubiquitous feedback and feedforward cycles are present that stymie many statistical or correlation-based methods, and given that experimental noise is inevitable. The presented DLMRA method prescribes a realistic experimental design for inference of signed, directed edges when typical levels of noise are present. It allows estimation of self-regulation edges as well as those for basal production and external stimuli. For 2 and 3 node networks, the method can successfully handle random linear networks, cell state transition networks, and gene regulatory networks, and, under certain limiting conditions, signaling networks. Prediction accuracy improved with more timepoints, which in our case provided more relevant dynamic data. However, we stress that we did not explore time point placement here, which may underlie the performance increase rather than simply the number of timepoints. Prediction accuracy was strong in many cases even with simulated noise that exceeds typical experimental variability (2:1 signal-to-noise). The method presented here is quite general and could be applied not only in cell and molecular biology, but also in vastly different fields where perturbation time course experiments are possible and where network structures are important to determine.
One type of nonlinear model that we did not investigate is one with sustained oscillations, such as those found in the cell cycle^{66}, or sometimes even MAPK signaling pathways^{67,68,69}. We found that in our application to general two and three node linear models, DLMRA could reconstruct multiple networks that have damped oscillatory behavior (Fig. 1b). However, we expect time point measurement selection and frequency to be much more important for inferring networks that give rise to sustained oscillations, with time points comprehensively covering peaks and troughs, and the frequency high enough to do so. We do expect that the method could infer the structure of such networks given appropriate sampling, but this requires a much deeper investigation.
MRA and its subsequent methods allow for inference of direct edges by prescribing systematic perturbation of each node^{37,39,41,43,45}, and this idea of directionality has been carried through in DLMRA. Often, such edge directness is equated to causality, but this is not necessarily the case, especially when the entire system is not explicitly represented. In practice, the causality and strength of an edge may depend on how well the model represents the underlying phenomenon and might be affected by simplification of larger networks, by nonlinearities in the actual system, and even by noise in the data. Second, in discussions about causal system inference, consideration of counterfactuals is important^{30,31,50,51}. For a network of nodes undergoing dynamics, the counterfactuals to intrinsic network edges causing the dynamics are the environmental factors extrinsic to those edges. In DLMRA, by evaluating external stimuli and basal production as well as the network edges, we have mapped some counterfactuals to node dynamics, thus presenting a more complete map of the causal factors behind the network dynamics compared with methods that only show network edges. This also allows for a concise mapping of the environmental contexts in which the network edges are reconstructed.
Application of DLMRA could reconstruct cell state transition networks based on discrete-time Markov transition models, with the added benefit of not being constrained to specific time intervals. It can also successfully handle noisy data. The additional constraints in DLMRA in the context of cell state transitions (summations of transition rates; see Methods) imply that the underlying network may be estimated with fewer data than in other cases. This method can be a useful tool to model cell state transitions and predict cell state. Perturbations were modeled as a difference in initial states, which worked well in this case, suggesting that such modeling of perturbations may work in other cell state transition or biological networks.
Although application of DLMRA to an intracellular signaling network (Huang–Ferrell MAPK) was able to explain its ground truth, including feedback due to sequestration, the method was constrained to specific, difficult-to-implement perturbations and a low stimulus, which may not always be feasible experimentally. Specific inhibitors could be a source of perturbation, but even they influence more kinetic parameters than would be required here for a clean solution. In MRA, a larger reaction scheme is often simplified into modules, with one species in each module representing the activity of the module. But often, the activity of the other species in the module is implicit and becomes significant in dictating how perturbations and stimulus affect the network dynamics. Moreover, the type of perturbation chosen, such as specific inhibitors versus knockdown, may also yield different network inference results. Therefore, the use of MRA methods on simplified large intracellular signaling networks, especially when dealing with experiments, has significant caveats that should be carefully considered^{41,70}.
Although complete inhibition is often used for perturbation studies of gene regulatory networks (e.g. CRISPR-mediated gene knockout), we found that partial inhibition is important to fully reconstruct the considered nonlinear gene regulatory networks. It is important here, however, to distinguish small perturbations from partial perturbations. Small perturbations are formally recommended for both MRA and other techniques^{70} where the effects of noise are not extensively explored. In practice, however, there is a tradeoff between perturbation strength and feasibility, since the effects of small perturbations are masked by noise^{41}. Partial perturbations, as considered in this work (~50%), are much larger than what are typically considered small perturbations. The theoretical formulation of DLMRA reduces the impact of not having small perturbations, because perturbation data from a particular node are not used for inference of edges connected to that node. Yet, DLMRA still uses linearizations of the Jacobian, which are always subject to greater inaccuracy the further such perturbations take the system from reference points. Since many biological networks share the same types of nonlinear features contained within the considered FFL models, this is not likely to be the only case where partial inhibition is important. We are thus inclined to speculate that large partial perturbations may be a generally important experimental design criterion moving forward. Partial inhibition is often “built-in” to certain assay types, such as si/shRNA or pharmacological inhibition, which are titratable to a certain extent.
One major remaining challenge is scaling to larger networks. Here, we limited our analysis to 2 and 3 node networks. Conveniently, the number of necessary perturbation time courses grows linearly (as opposed to exponentially) with the number of considered nodes. Furthermore, as long as system-wide or omics-scale assays are available, the experimental workload also grows linearly. This is routine for transcriptome analyses^{71}, and is becoming even more commonplace for proteomic assays (e.g. mass cytometry^{72}, cyclic immunofluorescence, mass spectrometry^{73}, RPPA^{74}). Thus, the method is arguably experimentally scalable to larger networks.
However, the computational scaling past 2 and 3 node models remains to be determined and is likely to require different approaches for parameter estimation. Increasing the network size will quadratically increase the number of unknown parameters, which will significantly increase the computational requirements for obtaining robust solutions. Yet, recent work has shown that large estimation problems in ODE models may be broken into several smaller problems^{75}, which may be applicable here, and is likely to yield large computational speed-ups by allowing parallelization of much smaller tasks. However, theory on how to merge potentially discrepant results between independently estimated overlapping subnetworks would need to be derived. Importantly, we saw in the linear 2 and 3 node model examples that the impact of experimental noise was larger for 3 node models, implying that increasing the number of nodes past 3 will further increase the impact of experimental noise. Another synergistic avenue could be imposing prior knowledge to improve initial parameter guesses and even reduce the parametric space, such as in Bayesian Modular Response Analysis^{45}, or with functional database information^{76}. Such prior knowledge could also help inform emergent network properties as network size grows, such as degree distributions for scale-free networks^{2}. Here, we only investigated dense subnetworks, so sparseness patterns and judicious allocation of nonzero Jacobian elements could also have great impact on estimation for large networks. Overall, application to larger networks is of great interest, but these nontrivial computational roadblocks must be solved first.
In conclusion, the proposed approach to network reconstruction is systematic and feasible, robustly operating in the presence of experimental noise and accepting data from large perturbations. It addresses important features of biological networks that current methods struggle to account for: causality/directionality/sign, cycles (including selfregulation), dynamic behavior and environmental stimuli. It does so while leveraging dynamic data of the network and only requires one perturbation per node for completeness. We expect this approach to be broadly useful not only for reconstruction of biological networks, but to enable using such networks to build more predictive models of disease and response to treatment, and more broadly, to other fields where such networks are important for system behavior.
Methods
Deriving sufficiency conditions for unique estimation of Jacobian elements
The firstorder partial derivatives comprising J (Eq. (2)) can be approximated by a firstorder Taylor series expansion of Eq. (1) about a time point k
Equations (5) and (6) may be written more succinctly as
where
The approximation in Eq. (7) becomes more accurate as more time points are measured. Also, the edge weights are potentially time-dependent, although this is rarely considered when describing biological networks.
How do we estimate the edge weights F in Eq. (7) and thus reconstruct the network? Time series data can inform the x_{i}’s and f_{i}’s as a function of time, following application of a stimulus. Given such stimulus-response data, however, for each time point there are only two equations for four unknowns, an underdetermined system for which more data are needed.
Consider now stimulus-response time course data in the presence of single perturbations. Let p_{i} be a variable that reflects the strength and/or presence of different potential perturbations: p_{1} represents perturbation of x_{1} and p_{2} represents perturbation of x_{2}. If p_{j} is not explicitly written, its value is zero and/or it has no effect. Now, the ODEs become a function of the perturbation variables
The first-order Taylor series expansions for cases with perturbations become
where
Here, we have expanded with respect to the perturbation, rather than with respect to time as previously. However, since the reference point is the same, the Jacobian elements remain identical in these equations. It is also interesting to note that the Jacobian elements, or network, may be affected by the perturbation, but we do not necessarily have to know those effects mathematically, since the reference point is the same. Now we have six potential equations with which to estimate the four Jacobian elements. However, we must make some determination as to how the perturbations p_{1} and p_{2} directly affect Node 1 and Node 2 dynamics f_{1} and f_{2} to account for the perturbation variable partial derivatives.
By design, the Node 1 perturbation has significant direct effects on Node 1 dynamics, and similarly for the Node 2 perturbation on Node 2 dynamics. Using equations including \(\partial f_1/\partial p_1\) and \(\partial f_2/\partial p_2\) requires precise definition of perturbation strengths and their effects on dynamics, which could be difficult to determine experimentally and to implement in simulations. Therefore, we do not employ equations involving such terms. On the other hand, if the Node 1 perturbation has negligible direct effect on Node 2 dynamics (that is, its effects on Node 2 act through the network, and p_{1} does not appear explicitly in f_{2}), and similarly the Node 2 perturbation has negligible direct effect on Node 1 dynamics, then \(\partial f_2/\partial p_1\) and \(\partial f_1/\partial p_2\) are approximately zero. This mild condition is often the case experimentally. The only determining factors for the suitability of the Taylor series truncation are the spacing of time points and the accuracy of the expansion about the perturbation difference. From this, the main set of linear equations presented in Eqs. (3) and (4) are obtained.
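In generic notation (a sketch whose indexing may differ from the numbered equations, which are not reproduced above), the two expansions and the resulting simplification read:

\[
f_i\big(\mathbf{x}(t)\big) \approx f_i\big(\mathbf{x}(t_k)\big) + \sum_{j} F_{ij}\,\big[x_j(t) - x_j(t_k)\big],
\qquad F_{ij} = \left.\frac{\partial f_i}{\partial x_j}\right|_{t_k},
\]

\[
f_i\big(\mathbf{x}, p_j\big) \approx f_i\big(\mathbf{x}(t_k)\big) + \sum_{l} F_{il}\,\big[x_l - x_l(t_k)\big] + \frac{\partial f_i}{\partial p_j}\,p_j .
\]

Setting \(\partial f_2/\partial p_1 \approx \partial f_1/\partial p_2 \approx 0\) and discarding the equations containing \(\partial f_1/\partial p_1\) and \(\partial f_2/\partial p_2\) leaves, for each node i, equations in the Jacobian elements F_{ij} alone.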
General estimation model equations
We employ the following general model for a twonode network: 
Here, S_{1} and S_{2} are the stimuli strengths on Node 1 and Node 2 respectively, and F_{11}, F_{12}, F_{21} and F_{22} are the network edge weights (Fig. 1a). In many systems, there may be a basal or constitutive production driving the node activities, besides an external stimulus. For these cases, the stimulus term (S_{i}) may be considered as the sum of two effects: the basal production term (S_{i,b}) and the external stimulus (S_{i,ex}). Then the two-node model can be represented by the following equations
Or more generally,
where n is the total number of nodes.
When a steady state exists, the dx_{i}/dt terms become zero and the stimulus terms can be represented as a function of the node activities (x_{i}) and network edges (F_{ij}).
This helps show that the perturbation time course data generally constrain not only the edge weights but also the stimulus terms. For a system at a steady state without an external stimulus, for example at t = 0:
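In our notation, this steady-state relation reads (a hedged restatement, since the numbered equation is not reproduced above):

\[
0 = S_{i,b} + \sum_{j=1}^{n} F_{ij}\, x_j(0)
\quad\Longrightarrow\quad
S_{i,b} = -\sum_{j=1}^{n} F_{ij}\, x_j(0),
\]

with S_{i,ex} = 0 before the stimulus is applied.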
The twonode single activator model
The twonode single activator model (Fig. 1a, Supplementary Fig. S1a) is described by
Here, S_{1,ex} = 1, F_{11} = −1, F_{12} = 0, S_{2,ex} = 1, F_{21} = 1.5, F_{22} = −0.8. The basal production terms are both zero, for simplicity, and the initial conditions for x_{1}(t = 0) and x_{2}(t = 0) are zero. The stimulus terms S_{i,ex} are calculated through Eq. (18), using the median values of F_{ij} and the x_{i}(t = 10), when the system reaches near steady state.
Random twonode and threenode models
The random 2 node network is described by
Values for S_{1,b}, S_{2,b}, S_{1,ex} and S_{2,ex} are sampled from a uniform distribution over the range [0,2] and values for F_{11}, F_{12}, F_{21}, and F_{22} are sampled from a uniform distribution over the range [−2,2] using the MATLAB function rand. To capture basal activity, we use a two-step approach. First, starting from node activity values of zero, without the external stimulus on Node 1 and Node 2 (S_{1,ex} = S_{2,ex} = 0 in Eq. (22)), we simulate until the network reaches steady state with just basal production driving the network behavior. Then, we introduce the external stimulus on Node 1 and Node 2, integrate the ODEs, and sample evenly spaced timepoints using ode15s in MATLAB with default settings. We sample 3, 7, 11, and 21 evenly spaced time points across a time course, from 0 to 10 arbitrary time units in all the cases.
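A sketch of this two-step sampling procedure in Python (scipy standing in for MATLAB's ode15s). Unlike the paper, this sketch resamples F until it is stable so that a basal steady state is guaranteed to exist; that restriction is an assumption of the sketch, not of the method.

```python
import numpy as np
from scipy.integrate import solve_ivp

rng = np.random.default_rng(1)

# Sample S ~ U[0, 2] and F_ij ~ U[-2, 2]; resample F until all eigenvalues
# have negative real part so a basal steady state exists (sketch-only).
S_b = rng.uniform(0, 2, 2)
S_ex = rng.uniform(0, 2, 2)
while True:
    F = rng.uniform(-2, 2, (2, 2))
    if np.all(np.linalg.eigvals(F).real < 0):
        break

rhs = lambda t, x, S: S + F @ x            # linear model dx/dt = S + F x

# Step 1: relax from zero to the basal steady state (no external stimulus).
basal = solve_ivp(rhs, (0, 100), [0.0, 0.0], args=(S_b,)).y[:, -1]

# Step 2: add the external stimulus and sample 11 evenly spaced timepoints.
t_eval = np.linspace(0, 10, 11)
sol = solve_ivp(rhs, (0, 10), basal, args=(S_b + S_ex,), t_eval=t_eval)
print(sol.y.shape)                         # (2, 11): 2 nodes x 11 timepoints
```

The 3-, 7-, and 21-timepoint cases differ only in the length of `t_eval`.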
The random 3 node networks use the same sampling rules as the 2 node networks with the following equations.
Intracellular signaling networks
In the simplification of the Huang–Ferrell network to three nodes, pMAPKKK, ppMAPKK and ppMAPK were taken as nodes. Since, in the absence of external stimuli, the basal values of the nodes are zero, the basal production was estimated as zero beforehand and not included in the estimation of the rest of the network. Aside from the basal production edges, a full 3 node network (Fig. 4b) was estimated from the simulation data of each of the observables. After estimation, parameters with values less than 1/100th of the largest parameter were considered negligible.
Cell state transition models
The cell transition model^{48} is a discrete-time Markov probability model. Here, we show how this form relates to the ODE model used in DLMRA. Starting from any initial value, each step, representing a time difference of one day, follows from the previous time point as
Where M_{ij} denotes the Markov transition probabilities of species j into species i. In matrix form it may be represented as follows
Representing the Markov parameter matrix as M and the species relative concentration variables as vector X, the equation becomes
The Markov transition probabilities for a species must add up to 1. In experimental terms, a species can either transition to other species or stay the same and the sum of all those probabilities is 1.
As a first step in relating these equations to the ODE form underlying DLMRA, we express the variables in terms of ∆x (with respect to time),
Where M’ is M − I, and I is the identity matrix. M’ is M, except that 1 is subtracted from each diagonal element. Hence Eq. (26) for M’ becomes
This also implies that the diagonal term of M’ is the negative of the sum of the other two terms in the same column. In experimental terms, the amount of reduction of a species equals how much of it was converted to other species.
The above equations apply for the cases where ∆t is 1. We can incorporate arbitrary time steps as
Where ∆t is the scalar value of time difference and M’_{∆t} is the matrix of the set of parameters, specific to the time difference chosen. For a case where ∆t tends to 0, the equation becomes
Where M’_{dt} is the matrix of the set of parameters specific to the case where ∆t is infinitesimally small. Note that Eq. (33) is similar in form to Eq. (22), only without the extra stimulus terms, and M’_{dt} is equivalent to the Jacobian matrix F with terms F_{ij}. There is an added constraint that the terms in each column sum to zero; that is, the diagonal term is the negative of the sum of the other two terms in the same column.
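The relations above can be checked numerically; a minimal sketch with an arbitrary (assumed) three-state transition matrix:

```python
import numpy as np

# Assumed illustrative column-stochastic transition matrix M:
# M[i, j] = probability that a cell in state j is in state i one day later.
M = np.array([[0.90, 0.05, 0.02],
              [0.07, 0.90, 0.08],
              [0.03, 0.05, 0.90]])

M_prime = M - np.eye(3)                # M' = M - I
# Each column of M' sums to zero: the diagonal is the negative of the
# sum of the other terms in the same column.
print(M_prime.sum(axis=0))             # -> [0. 0. 0.] (up to rounding)

# The difference form x(k+1) - x(k) = M' x(k) matches one Markov step.
x0 = np.array([1.0, 0.0, 0.0])
step_diff = M @ x0 - x0
```

The same column-sum-to-zero property carries over as the constraint on the Jacobian F in the cell state transition application.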
Nonlinear models
The nonlinear feedforward loop models^{32} are described by:
When an AND gate is present
When an OR gate is present
For a given u, v ∈ {x_{1}, x_{2}, x_{3}} and K, K_{u}, K_{v} ∈ {\(K_{x_1x_2}\), \(K_{x_1x_3}\), \(K_{x_2x_3}\)}:
If u activates its target, then:
If u represses its target, then:
Effectively, an external stimulus of S_{1,ex} = 1 acts on Node 1 at t = 0 and is propagated through the network. There is no external stimulus acting on Node 2 and Node 3. However, in many cases there is basal production in one or both of Node 2 and Node 3. This leads to a nonzero steady state of the network before the external stimulus is introduced.
To capture basal activity, we use a two-step approach. First, starting from node activity values of zero, without the external stimulus on Node 1 (S_{1,ex} = 0), we simulate until the network reaches steady state. Then, we introduce the external stimulus on Node 1, integrate the ODEs, and sample 11 evenly spaced timepoints using ode15s in MATLAB with default settings and the steady-state node values without the external stimulus as the initial conditions. We chose 11 timepoints because this yields good classification accuracy for the above random 3 node model even in the presence of noisy data. For each of the 16 nonlinear models, the values of the parameters (K, K_{u}, K_{v}) were varied and chosen so that the resulting node activity data are responsive to the stimulus and perturbations (Supplementary Fig. S6, see Supplementary Code for values).
Modeling perturbations
Precisely modeling perturbations can be a challenge, since experimentally there may be several ways of causing a perturbation, with different mechanisms such as siRNAs or competitive/noncompetitive/uncompetitive inhibition. It may be hard to quantify how much a perturbation affects a node in terms of its dynamics (i.e. the right-hand sides of the ODEs). Therefore, we employ the following approaches, which circumvent the need to model how each perturbation mechanistically manifests in the ODEs during parameter estimation. There are two cases to consider: (i) when we have a perturbation of node i and we need to simulate node i dynamics; (ii) when we have a perturbation of node i and we need to simulate the dynamics of another node j. To illustrate the approach, we use the above-described 2 node model with an example of a Node 1 perturbation. Recall that
For case (i), we have to obtain values for x_{1} under perturbation of Node 1. We refer to the perturbed timecourse as x_{1,1}. In experimental situations, x_{1,1} would be measured directly. To obtain simulation data for x_{1,1} we use the following:
where x_{1} is obtained from the simulations without perturbations, and recall that k refers to time point k. For a 50% inhibition, p = 0.5 and for a complete inhibition, p = 0.
For case (ii), we have to obtain the values for x_{2} under perturbation of Node 1, which we refer to as x_{2,1}. To do this, we have to integrate the ODE for dx_{2}/dt, but using x_{1,1} values, as follows
Here, x_{2} has been replaced with x_{2,1} to represent x_{2} under perturbation of Node 1, for clarity. To solve this equation, we simply use the “measured” x_{1,1} time course directly in the ODE.
When data are generated by simulations, there is little practical limit to temporal resolution, but with real data, to solve Eq. (43) one may need values for x_{1,1} at time points where measurements are not available, depending on the solver being used. We therefore fit the x_{1,1} data to a polynomial using polyfit in MATLAB, and use the polynomial to interpolate at any required time point. In this work, we used a polynomial order of 5, which fits the data while avoiding overfitting, but the functional form is quite malleable so long as it captures the data trends.
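A sketch of cases (i) and (ii) with polynomial interpolation, using numpy's polyfit in place of MATLAB's; the parameter values are borrowed from the two-node single activator model, and the "measured" time course is generated analytically for illustration.

```python
import numpy as np
from scipy.integrate import solve_ivp

# "Measured" unperturbed x1 time course at 11 timepoints; here generated
# from x1' = 1 - x1 (so x1(t) = 1 - exp(-t)) for illustration.
t_meas = np.linspace(0, 10, 11)
x1 = 1 - np.exp(-t_meas)

# Case (i): perturbed Node 1 time course, x_{1,1}(k) = p * x1(k).
p = 0.5                                   # 50% inhibition (p = 0 for knockout)
x11 = p * x1

# Fit x_{1,1} with a degree-5 polynomial so the solver can query it
# at arbitrary times between measurements.
x11_of_t = np.poly1d(np.polyfit(t_meas, x11, 5))

# Case (ii): integrate Node 2 dynamics using the interpolated x_{1,1} directly.
S2, F21, F22 = 1.0, 1.5, -0.8             # single activator model values
rhs = lambda t, x2: S2 + F21 * x11_of_t(t) + F22 * x2
sol = solve_ivp(rhs, (0, 10), [0.0], t_eval=t_meas)
x21 = sol.y[0]                            # x_{2,1}: Node 2 under Node 1 perturbation
```

Note that the "measured" x_{1,1} data are never re-simulated mechanistically; they enter the Node 2 equation only through the fitted interpolant.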
For modeling perturbations of the cell transition model, the initial value of the simulated data for the perturbed node was taken as zero during simulation. The estimation was performed in a similar way as a random 3 node network as described above.
For modeling perturbations for the Huang–Ferrell model, the parameters k3, k15 and k27 were sequentially set as zero. The estimation was performed in a similar way as a random 3 node network as described above.
Simulated noise
Normally distributed white (zero mean) noise is added to simulated time courses pointwise with
where x is the simulated data point, y is the noisy data point, and d represents the noise level. Signal-to-noise ratios of 10:1, 5:1 and 2:1 correspond, respectively, to d = 0.1, 0.2, and 0.5. Normally distributed samples are obtained using randn in MATLAB. While there are many different distributional options for modeling noise, we chose this one for simplicity and to capture generically the effects of noisier data. We do not intend to answer questions about whether specific distributional assumptions on the noise have a significant impact on the method's performance.
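The noise equation itself is not reproduced above; assuming the multiplicative form y = x(1 + dε) implied by d being the reciprocal signal-to-noise ratio, a minimal sketch:

```python
import numpy as np

def add_noise(x, d, rng):
    """Pointwise multiplicative white noise, y = x * (1 + d * N(0, 1));
    the multiplicative form is an assumption here, with d the reciprocal
    signal-to-noise ratio (d = 0.1, 0.2, 0.5 for 10:1, 5:1, 2:1)."""
    return x * (1 + d * rng.standard_normal(np.shape(x)))

rng = np.random.default_rng(42)
x = 1 - np.exp(-np.linspace(0, 10, 11))   # a clean simulated time course
y = add_noise(x, 0.5, rng)                # 2:1 signal-to-noise
```

With this form, the noise standard deviation scales with the signal, so every data point has the same relative (rather than absolute) uncertainty.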
Parameter estimation
For the two-node model, the entire network, with and without perturbations, is described by the following system of equations
where x_{1,1} and x_{2,2} are the perturbed node values, from either simulated or experimental data. Eight parameters (S_{1,b}, S_{1,ex}, F_{11}, F_{12}, S_{2,b}, S_{2,ex}, F_{21}, F_{22}) must be estimated to fully reconstruct this network. We seek a set of parameters that minimizes the deviation between simulated and measured dynamics.
For an initial guess, the node edge parameters (F_{ij}) are randomly sampled from a uniform distribution over the interval [−2, 2] and the stimulus parameters (S_{i,ex}) from a uniform distribution over the interval [0, 2]. Using data at t = 0, which corresponds to a steady state without S_{i,ex}, S_{i,b} can be estimated during each iteration of the estimation as follows
For an n-node model, this equation scales accordingly to give each \(\hat S\)_{i,b}.
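A sketch of the initial-guess sampling and basal-parameter estimate in Python, assuming the linear rate law dx_i/dt = S_{i,b} + S_{i,ex} + Σ_j F_{ij} x_j and hypothetical steady-state values (the exact system of equations is given in Eq. (45) of the paper):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 2  # two-node network

# Random initial guesses, sampled as described in the text
F = rng.uniform(-2.0, 2.0, size=(n, n))   # edge weights F_ij
S_ex = rng.uniform(0.0, 2.0, size=n)      # external stimulus terms

# Hypothetical measured pre-stimulus steady state at t = 0
x0 = np.array([1.0, 0.5])

# At steady state without S_ex: 0 = S_ib + sum_j F_ij x_j(0),
# so the basal terms follow directly from the current F estimate:
S_b = -F @ x0
```

Because S_b is recomputed from F at each iteration, only the edge weights and stimulus terms remain free parameters in the optimization.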
From these initial guesses we compute the activity data using Eq. (45). The perturbation data x_{k,k} are used in the perturbation equations as detailed above (Eq. (43)). Let \(\hat x\)_{i} and \(\hat x\)_{i,j} denote the predicted node activity values for the non-perturbed and perturbed cases, respectively. For a total of n nodes and N_{t} time points, the objective function is the sum of squared errors Φ
Note that we do not use data from node j when perturbation j was applied (per the derivation). The MATLAB function fmincon is used to minimize Φ by varying edge weights and stimulus terms within the range [−10, 10].
We employ “multistart” by running the estimation 10 times, each from a different randomly generated initial point^{77}. The estimated parameters and their final sums of squared errors (Φ) are saved, and the parameter set corresponding to the minimum Φ is chosen as the final estimate. Variability of parameter estimates across multistart runs is explored in Supplementary Fig. S5.
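The objective and multistart scheme can be sketched in Python, with scipy.optimize.minimize standing in for fmincon; the toy exponential model below is illustrative only, not the paper's Eq. (45):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# "Measured" data generated from a hypothetical ground-truth parameter set
theta_true = np.array([1.5, -0.7])
t = np.linspace(0.0, 5.0, 50)
x_meas = theta_true[0] * np.exp(theta_true[1] * t)

def objective(theta):
    """Sum of squared errors Phi between predicted and measured dynamics."""
    x_pred = theta[0] * np.exp(theta[1] * t)
    return np.sum((x_pred - x_meas) ** 2)

# Multistart: 10 runs from random initial points within [-2, 2],
# bounded search over [-10, 10] as in the text; keep the best fit.
bounds = [(-10.0, 10.0)] * 2
best = None
for _ in range(10):
    theta0 = rng.uniform(-2.0, 2.0, size=2)
    res = minimize(objective, theta0, bounds=bounds)
    if best is None or res.fun < best.fun:
        best = res
```

Keeping only the minimum-Φ run guards against local minima that individual starts may fall into.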
Parameter estimation for nonlinear models
For estimating the nonlinear models, we start with the prior knowledge that S_{1,b}, S_{2,ex}, and S_{3,ex} are always zero, which is directly evident from the x_{1} initial conditions and the x_{2}, x_{3} stimulus responses in the presence of a complete Node 1 perturbation. The equations for the non-perturbation case become as follows
Since the system is at steady state before the external stimulus, the basal production parameter can be estimated during each iteration of the estimation as
where \(\hat F\)_{i,j} are the current model parameter estimates and x_{i}(t = 0) are the x values at the initial system steady state, before induction of the external stimulus.
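Numerically, the same steady-state constraint applies for any rate law; a sketch with a hypothetical saturating regulation function standing in for the paper's nonlinear form:

```python
import numpy as np

def estimate_basal(rate_fn, x0):
    """Pre-stimulus steady state: 0 = S_b + rate_fn(x0),
    hence S_b = -rate_fn(x0)."""
    return -np.asarray(rate_fn(x0))

# Hypothetical edge weights and saturating (x/(1+x)) regulation,
# illustrative only -- not the paper's specific nonlinear equations:
F = np.array([[0.0, -1.2, 0.0],
              [2.0,  0.0, 0.0],
              [0.0,  1.5, -0.5]])

def rate_fn(x):
    return F @ (x / (1.0 + x))

x0 = np.array([1.0, 0.8, 0.6])   # measured steady state at t = 0
S_b = estimate_basal(rate_fn, x0)
```

As in the linear case, this removes the basal production terms from the set of freely optimized parameters.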
Bootstrapping simulated data for the FFL model cases
To generate multiple parameter set estimates for classifying edge weights in the FFL model cases, we employ a bootstrapping approach. In an experimental scenario, each data point has a mean and a standard deviation, and under a distributional assumption (e.g., normal) one can resample datasets to obtain measures of estimation uncertainty. We use the simulated data as the mean and vary the standard deviation as described above to generate 50 bootstrapped datasets for each of the 16 considered models. Estimation is carried out on each of the 50 datasets using multistart, yielding 50 best-fitting parameter sets per model. Uncertainty analysis and classification error are based on these sets.
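A sketch of this parametric bootstrap in Python, using hypothetical simulated means and the noise level d from the noise model above:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical simulated time course, used as the per-point mean
x_mean = np.array([0.0, 0.4, 0.7, 0.9, 1.0])
d = 0.1  # noise level, as in the simulated-noise section

def bootstrap_datasets(x_mean, d, n_boot=50, rng=rng):
    """Parametric bootstrap: resample each point from a normal
    distribution centered on the simulated mean, sd = d * |mean|."""
    sd = d * np.abs(x_mean)
    return [rng.normal(x_mean, sd) for _ in range(n_boot)]

datasets = bootstrap_datasets(x_mean, d)
# Each of the 50 datasets is then fit with the multistart estimation,
# giving a distribution of edge-weight estimates per model.
```

The spread of the resulting 50 parameter sets is what drives the uncertainty analysis and edge-sign classification.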
Data availability
All relevant simulated data used in the paper are provided and can be accessed, along with the code, at https://doi.org/10.5281/zenodo.6516238. Any other relevant data can be obtained from the authors.
Code availability
The code needed to reproduce the data and figures is included and can be accessed at https://doi.org/10.5281/zenodo.6516238. Parallelization, when needed to generate data, was run on the Palmetto cluster (372 GB, 48 nodes) with MATLAB 2020a. The code also includes Jupyter notebooks that implement the estimation functions (in Python) for a two-node system, a three-node system, and a three-node cell state system. These use simple CSV input files containing the experimental data.
References
Angulo, M. T., Moreno, J. A., Lippner, G., Barabási, A.-L. & Liu, Y.-Y. Fundamental limitations of network reconstruction from temporal data. J. R. Soc. Interface 14, 20160966 (2017).
Barabási, A.-L. & Albert, R. Emergence of scaling in random networks. Science 286, 509–512 (1999).
Califano, A., Butte, A. J., Friend, S., Ideker, T. & Schadt, E. Leveraging models of cell regulation and GWAS data in integrative network-based association studies. Nat. Genet. 44, 841–847 (2012).
Calvano, S. E. et al. A network-based analysis of systemic inflammation in humans. Nature 437, 1032–1037 (2005).
Dorel, M. et al. Modelling signalling networks from perturbation data. Bioinformatics 34, 4079–4086 (2018).
Hackett, S. R. et al. Learning causal networks using inducible transcription factors and transcriptome‐wide time series. Mol. Syst. Biol. 16, e9174 (2020).
Hein, M. Y. et al. A human interactome in three quantitative dimensions organized by stoichiometries and abundances. Cell 163, 712–723 (2015).
Hill, S. M. et al. Context specificity in causal signaling networks revealed by phosphoprotein profiling. Cell Syst. 4, 73–83.e10 (2017).
Ideker, T. et al. Integrated genomic and proteomic analyses of a systematically perturbed metabolic network. Science 292, 929–934 (2001).
Ideker, T., Ozier, O., Schwikowski, B. & Siegel, A. F. Discovering regulatory and signalling circuits in molecular interaction networks. Bioinformatics 18, S233–S240 (2002).
Liu, Y.-Y., Slotine, J.-J. & Barabási, A.-L. Observability of complex systems. Proc. Natl Acad. Sci. USA 110, 2460–2465 (2013).
Ma’ayan, A. et al. Formation of regulatory patterns during signal propagation in a Mammalian cellular network. Science 309, 1078–1083 (2005).
Margolin, A. A. et al. ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinforma. 7, S7 (2006).
Mazloom, A. R. et al. Recovering protein–protein and domain–domain interactions from aggregation of IP-MS proteomics of coregulator complexes. PLoS Comput. Biol. 7, e1002319 (2011).
Mehla, J., Caufield, J. H. & Uetz, P. The yeast two-hybrid system: a tool for mapping protein–protein interactions. Cold Spring Harb. Protoc. 2015, 425–430 (2015).
Molinelli, E. J. et al. Perturbation biology: inferring signaling networks in cellular systems. PLoS Comput. Biol. 9, e1003290 (2013).
Nyman, E. et al. Perturbation biology links temporal protein changes to drug responses in a melanoma cell line. PLOS Comput. Biol. 16, e1007909 (2020).
Pe’er, D., Regev, A., Elidan, G. & Friedman, N. Inferring subnetworks from perturbed expression profiles. Bioinformatics 17, S215–S224 (2001).
Pósfai, M., Liu, Y.-Y., Slotine, J.-J. & Barabási, A.-L. Effect of correlations on network controllability. Sci. Rep. 3, 1067 (2013).
Schraivogel, D. et al. Targeted Perturb-seq enables genome-scale genetic screens in single cells. Nat. Methods 1–7, https://doi.org/10.1038/s41592-020-0837-5 (2020).
Shannon, P. et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13, 2498–2504 (2003).
Stein, R. R., Marks, D. S. & Sander, C. Inferring pairwise interactions from biological data using maximum-entropy probability models. PLoS Comput. Biol. 11, e1004182 (2015).
Wynn, M. L. et al. Inferring intracellular signal transduction circuitry from molecular perturbation experiments. Bull. Math. Biol. 80, 1310–1344 (2018).
Yuan, B. et al. CellBox: interpretable machine learning for perturbation biology with application to the design of cancer combination therapy. Cell Syst. 12, 128–140.e4 (2021).
Chevalier, T., Schreiber, I. & Ross, J. Toward a systematic determination of complex reaction mechanisms. J. Phys. Chem. 97, 6776–6787 (1993).
Díaz-Sierra, R., Lozano, J. B. & Fairén, V. Deduction of chemical mechanisms from the linear response around steady state. J. Phys. Chem. A 103, 337–343 (1999).
Hoffmann, M., Fröhner, C. & Noé, F. Reactive SINDy: discovering governing reactions from concentration data. J. Chem. Phys. 150, 025101 (2019).
Kim, J., Bates, D. G., Postlethwaite, I., Heslop-Harrison, P. & Cho, K.-H. Least-squares methods for identifying biochemical regulatory networks from noisy measurements. BMC Bioinforma. 8, 8 (2007).
Schmidt, H., Cho, K.-H. & Jacobsen, E. W. Identification of small scale biochemical networks based on general type system perturbations. FEBS J. 272, 2141–2151 (2005).
Morgan, S. L. & Winship, C. Counterfactuals and causal inference: methods and principles for social research. (Cambridge University Press, 2014). https://doi.org/10.1017/CBO9781107587991.
Pearl, J. Structural counterfactuals: a brief introduction. Cogn. Sci. 37, 977–985 (2013).
Mangan, S. & Alon, U. Structure and function of the feedforward loop network motif. Proc. Natl Acad. Sci. USA 100, 11980–11985 (2003).
Reeves, G. T. The engineering principles of combining a transcriptional incoherent feedforward loop with negative feedback. J. Biol. Eng. 13, 62 (2019).
Fournier, T. et al. Steady-state expression of self-regulated genes. Bioinformatics 23, 3185–3192 (2007).
Bell-Pedersen, D. et al. Circadian rhythms from multiple oscillators: lessons from diverse organisms. Nat. Rev. Genet. 6, 544–556 (2005).
Stewart-Ornstein, J., Cheng, H. W. J. & Lahav, G. Conservation and divergence of p53 oscillation dynamics across species. Cell Syst. 5, 410–417.e4 (2017).
Kholodenko, B. N. et al. Untangling the wires: a strategy to trace functional interactions in signaling and gene networks. Proc. Natl Acad. Sci. USA 99, 12841–12846 (2002).
Santra, T., Rukhlenko, O., Zhernovkov, V. & Kholodenko, B. N. Reconstructing static and dynamic models of signaling pathways using Modular Response Analysis. Curr. Opin. Syst. Biol. 9, 11–21 (2018).
Andrec, M., Kholodenko, B. N., Levy, R. M. & Sontag, E. Inference of signaling and gene regulatory networks by steadystate perturbation experiments: structure and accuracy. J. Theor. Biol. 232, 427–441 (2005).
Santos, S. D. M., Verveer, P. J. & Bastiaens, P. I. H. Growth factorinduced MAPK network topology shapes Erk response determining PC12 cell fate. Nat. Cell Biol. 9, 324–330 (2007).
Thomaseth, C. et al. Impact of measurement noise, experimental design, and estimation methods on modular response analysis based network reconstruction. Sci. Rep. 8, 16217 (2018).
Gross, T. & Blüthgen, N. Identifiability and experimental design in perturbation studies. Bioinformatics 36, i482–i489 (2020).
Halasz, M., Kholodenko, B. N., Kolch, W. & Santra, T. Integrating network reconstruction with mechanistic modeling to predict cancer therapies. Sci. Signal. 9, ra114 (2016).
Klinger, B. et al. Network quantification of EGFR signaling unveils potential for targeted combination therapy. Mol. Syst. Biol. 9, 673 (2013).
Santra, T., Kolch, W. & Kholodenko, B. N. Integrating Bayesian variable selection with Modular Response Analysis to infer biochemical network topology. BMC Syst. Biol. 7, 57 (2013).
Sontag, E., Kiyatkin, A. & Kholodenko, B. N. Inferring dynamic architecture of cellular networks using time series of gene expression, protein and metabolite data. Bioinformatics 20, 1877–1886 (2004).
Cho, K.-H., Choo, S.-M., Wellstead, P. & Wolkenhauer, O. A unified framework for unraveling the functional interaction structure of a biomolecular network based on stimulus-response experimental data. FEBS Lett. 579, 4520–4528 (2005).
Gupta, P. B. et al. Stochastic state transitions give rise to phenotypic equilibrium in populations of cancer cells. Cell 146, 633–644 (2011).
Huang, C. Y. & Ferrell, J. E. Ultrasensitivity in the mitogen-activated protein kinase cascade. Proc. Natl Acad. Sci. 93, 10078–10083 (1996).
Höfler, M. Causal inference based on counterfactuals. BMC Med. Res. Methodol. 5, 28 (2005).
Shipley, B. Cause and correlation in biology: a user’s guide to path analysis, structural equations and causal inference with R (Cambridge University Press, 2016). https://doi.org/10.1017/CBO9781139979573.
Lill, D. et al. Mapping connections in signaling networks with ambiguous modularity. npj Syst. Biol. Appl. 5, 19 (2019).
Kholodenko, B. N. & Sontag, E. D. Determination of functional network structure from local parameter dependence data. Preprint at http://arxiv.org/abs/physics/0205003 (2002).
Armond, J. W. et al. A stochastic model dissects cell states in biological transition processes. Sci. Rep. 4, 3692 (2014).
Dirkse, A. et al. Stem cell-associated heterogeneity in Glioblastoma results from intrinsic tumor plasticity shaped by the microenvironment. Nat. Commun. 10, 1787 (2019).
Hormoz, S. et al. Inferring cell-state transition dynamics from lineage trees and endpoint single-cell measurements. Cell Syst. 3, 419–433.e8 (2016).
Larsson, I. et al. Modeling glioblastoma heterogeneity as a dynamic network of cell states. Mol. Syst. Biol. 17, e10105 (2021).
Neftel, C. et al. An integrative model of cellular states, plasticity, and genetics for glioblastoma. Cell 178, 835–849.e21 (2019).
Sha, Y., Wang, S., Zhou, P. & Nie, Q. Inference and multiscale model of epithelial-to-mesenchymal transition via single-cell transcriptomic data. Nucleic Acids Res. 48, 9505–9520 (2020).
Shen, S. & Clairambault, J. Cell plasticity in cancer cell populations. F1000Res. 9, F1000 Faculty Rev-635 (2020).
Zarkoob, H., Taube, J. H., Singh, S. K., Mani, S. A. & Kohandel, M. Investigating the link between molecular subtypes of glioblastoma, epithelial-mesenchymal transition, and CD133 cell surface protein. PLoS One 8, e64169 (2013).
Henriques, D., Villaverde, A. F., Rocha, M., Saez-Rodriguez, J. & Banga, J. R. Data-driven reverse engineering of signaling pathways using ensembles of dynamic models. PLOS Comput. Biol. 13, e1005379 (2017).
Goentoro, L., Shoval, O., Kirschner, M. W. & Alon, U. The incoherent feedforward loop can provide fold-change detection in gene regulation. Mol. Cell 36, 894–899 (2009).
Goentoro, L. & Kirschner, M. W. Evidence that fold-change, and not absolute level, of β-catenin dictates Wnt signaling. Mol. Cell 36, 872–884 (2009).
Nakakuki, T. et al. Ligand-specific c-fos expression emerges from the spatiotemporal control of ErbB network dynamics. Cell 141, 884–896 (2010).
Tyson, J. J., Csikasz-Nagy, A. & Novak, B. The dynamics of cell cycle regulation. Bioessays 24, 1095–1109 (2002).
Albeck, J. G., Mills, G. B. & Brugge, J. S. Frequency-modulated pulses of ERK activity transmit quantitative proliferation signals. Mol. Cell 49, 249–261 (2013).
Kholodenko, B. N. Negative feedback and ultrasensitivity can bring about oscillations in the mitogen-activated protein kinase cascades. Eur. J. Biochem. 267, 1583–1588 (2000).
Ryu, H. et al. Frequency modulation of ERK activation dynamics rewires cell fate. Mol. Syst. Biol. 11, 838 (2015).
de la Fuente, A., Brazhnik, P. & Mendes, P. Linking the genes: inferring quantitative gene networks from microarray data. Trends Genet. 18, 395–398 (2002).
Stark, R., Grzelak, M. & Hadfield, J. RNA sequencing: the teenage years. Nat. Rev. Genet. 20, 631–656 (2019).
Spitzer, M. H. & Nolan, G. P. Mass cytometry: single cells, many features. Cell 165, 780–791 (2016).
Aksenov, A. A., da Silva, R., Knight, R., Lopes, N. P. & Dorrestein, P. C. Global chemical analysis of biology by mass spectrometry. Nat. Rev. Chem. 1, 1–20 (2017).
Akbani, R. et al. Realizing the promise of reverse phase protein arrays for clinical, translational, and basic research: a workshop report: the RPPA (Reverse Phase Protein Array) Society. Mol. Cell. Proteom. 13, 1625–1643 (2014).
Stapor, P. et al. Mini-batch optimization enables training of ODE models on large-scale datasets. Nat. Commun. 13, 34 (2022).
Wu, G., Feng, X. & Stein, L. A human functional protein interaction network and its application to cancer data analysis. Genome Biol. 11, R53 (2010).
Raue, A. et al. Lessons learned from quantitative dynamical modeling in systems biology. PLOS One 8, e74335 (2013).
Acknowledgements
We would like to thank Clemson University and the CCIT team for the generous allotment of time and support on the Palmetto cluster for running the simulations in this paper. M.R.B. acknowledges funding from Mount Sinai, Clemson University, the National Institutes of Health (Grants R01GM104184 and R35GM141891), and an IBM faculty award. M.B. and A.D.S. were supported by a National Institute of General Medical Sciences-funded Integrated Pharmacological Sciences Training Program grant (T32GM062754).
Author information
Authors and Affiliations
Contributions
M.R.B., G.R.S., and D.S. conceived of the work. D.S., G.R.S., M.B., M.R.B., and J.E. performed analyses. D.S., M.B., and G.R.S. made the figures. D.S., G.R.S., M.B., and M.R.B. wrote the manuscript.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Sarmah, D., Smith, G.R., Bouhaddou, M. et al. Network inference from perturbation time course data. npj Syst Biol Appl 8, 42 (2022). https://doi.org/10.1038/s41540-022-00253-6
This article is cited by

Characterizing heterogeneous singlecell dose responses computationally and experimentally using threshold inhibition surfaces and dosetitration assays
npj Systems Biology and Applications (2024)