Introduction

In living organisms, signaling systems are crucial for many biological functions, e.g., transmitting environmental nutrient signals in bacterial chemotaxis1, interpreting morphogen concentrations during the development of multicellular organisms2,3, activating T cells in response to antigens in the immune system4, and mediating the effects of a variety of medicines in disease treatment5. However, how signaling systems perform their functions with high accuracy remains intriguing6,7.

To function properly, signaling systems need to respond precisely to external signals. However, they are inevitably exposed to noise, including the randomness of molecular processes8,9, heterogeneity between individuals9,10, and environmental fluctuations11. At the same time, signaling systems also need to be sensitive to signal changes. While noise-controlling mechanisms have been studied extensively, e.g., feedback control12, pathway redundancy12,13,14, and temporal averaging and spatial signal integration14,15,16,17,18, some of them, such as negative feedback7, reduce noise at the expense of sensitivity. In recent years, there have been attempts to reveal the relation between sensitivity and precision for systems under fluctuating noise, resembling the fluctuation-dissipation theorem11. However, this relation remains unclear for systems affected by static noise, under which signals do not fluctuate over time but deviate constantly from the population average (Fig. 1a). Hence, we focus on the static noise component in this study.

Fig. 1: Illustration of the trade-off between sensitivity and precision of the response in a signaling process.

a The noise of a biochemical signal \(S(t)\) can generally be decomposed into time-varying fluctuation around the temporal average \(\bar{S}\), and the static noise, i.e., the deviation of the individual temporal average from the ensemble-temporal average \(\left\langle \bar{S}\right\rangle\). Our model focuses on the static noise. Green, yellow and blue lines represent the signal in three hypothetical individual systems. b In our model, a signaling system contains a response module (the dashed rectangle) that responds to the external input \(I\) via two intermediate signals \({S}_{1}\) and \({S}_{2}\). The response \(R\) is deterministic, while the noisy intermediate signals depend on the input. c As a proxy for the precision, the response noise is defined as the variance of the response for a given external input. Error bars are the square root of the noise. The response sensitivity is defined as the squared slope of the average response curve. d Given the response sensitivity, a lower limit on the response noise is expected. The actual value of the noise relies on the detail of the system.

To study the relationship between noise and sensitivity, instead of following the conventional bottom-up approach, e.g., computationally exploring different network topologies19,20, we attempt to derive the relationship between response sensitivity and precision directly from a top-down approach. Specifically, while the topological representation of complex networks is widely utilized in network enumeration, some recent seminal studies demonstrated that the geometric representation (e.g., the landscape, phase space, or phase diagram) of a network can be better at characterizing its function than the topological representation21,22,23,24,25,26. Hence, we focus on the phase diagram structure, which is amenable to analytical derivation.

In this paper, we establish a general relationship between response noise and sensitivity in signaling systems containing static noise, and provide the optimal condition for achieving the lower limit of noise for a given sensitivity. Furthermore, we confirm that this optimal relationship is better characterized by a defined phase diagram structure than by specific network topologies. We then apply this relationship to analyze the patterning network in early Drosophila embryos. The gene expression data match well with the derived noise-sensitivity relationship, and the predicted optimal phase diagram structure yields a parameter-free quantitative explanation of the measured patterning shifts of the downstream genes under Bicoid (Bcd) dosage perturbation.

Results

The lower limit of response noise is bound by sensitivity

To address the trade-off between sensitivity and precision of the response in signaling systems, we first focus on a simplified signal transduction process. As illustrated in Fig. 1, the input (\(I\)) is static and encoded into two static, noisy intermediate signals (\({S}_{{{{{\mathrm{1,2}}}}}}\)), to which a response module responds deterministically. This simplified process is a generalized feedforward cascade, i.e., \(I\to {S}_{1}\to R\) and \(I\to {S}_{2}\to R\) are two arbitrary feedforward pathways. Feedforward cascades are ubiquitous motifs in biological systems27, e.g., the osmotic stress response pathway in cells28, the multiple morphogens transmitting positional information in multicellular systems2,3,13, and the T cell activation pathway in immune systems4. In this model, the consideration of two intermediate signals represents the simplest non-trivial scenario (refer to Supplementary Note 1 for the derivation of multiple intermediate signals). Here we focus on the static noise source stemming from the encoding process, whereby the intermediate signals (\({S}_{{{{{\mathrm{1,2}}}}}}\)) exhibit variability among different individuals compared to the ensemble average (Fig. 1a). This choice is made because temporal fluctuations are commonly dampened by mechanisms such as temporal averaging15,16,17,18 (further discussion on the effect of fluctuating noise alongside static noise is provided in the Discussion). The response module could manifest as a complex, multi-node regulatory network. For the purpose of examining the properties of signal noise, we initially consider the response module to be deterministic. Detailed exploration of its noise effect is reserved for the Discussion.

The function of this signaling system can be characterized by two important traits: sensitivity and precision. Given a single-valued response curve of the system, we define sensitivity as the squared slope \({\left(d\left\langle R\right\rangle /{dI}\right)}^{2}\), which avoids negative values, where \(\left\langle \ldots \right\rangle\) denotes the ensemble average. For convenience, we use the response noise (defined as the variance of the response, \({\sigma }_{R}^{2}\)) as a proxy for precision, which correlates negatively with the noise (Fig. 1c). Due to the randomness in the intermediate signals, the response to a given input is noisy. Hence, a trade-off between sensitivity and precision is expected (Fig. 1d). For example, while a “constant response” can achieve high precision (zero noise) in a deterministic model, its sensitivity is also zero. Next, we explore what the general trade-off relationship is and how to achieve the lower noise limit for a given sensitivity.
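The two definitions above can be illustrated with a minimal numerical sketch. It is not part of the original analysis: the helper `noise_and_sensitivity`, the two toy response functions, and all parameter values are hypothetical and chosen only to show how the variance at fixed input and the squared slope of the average response curve could be estimated from an ensemble.

```python
import random
import statistics


def noise_and_sensitivity(respond, inputs, n_samples=2000, seed=0):
    """Return (noise, sensitivity) lists for the interior points of `inputs`.

    noise       = variance of the response across the ensemble at fixed input
    sensitivity = squared central-difference slope of the mean response curve
    """
    rng = random.Random(seed)
    means, variances = [], []
    for value in inputs:
        samples = [respond(value, rng) for _ in range(n_samples)]
        means.append(statistics.fmean(samples))
        variances.append(statistics.pvariance(samples))
    sensitivity = []
    for k in range(1, len(inputs) - 1):
        slope = (means[k + 1] - means[k - 1]) / (inputs[k + 1] - inputs[k - 1])
        sensitivity.append(slope ** 2)
    return variances[1:-1], sensitivity


def constant_response(value, rng):
    # Hypothetical "constant response": perfectly precise but insensitive.
    return 0.5


def linear_response(value, rng):
    # Hypothetical noisy linear response: slope 2, static noise amplitude 0.05.
    return 2.0 * (value + 0.05 * rng.gauss(0.0, 1.0))


inputs = [0.1 * k for k in range(11)]
noise_const, sens_const = noise_and_sensitivity(constant_response, inputs)
noise_lin, sens_lin = noise_and_sensitivity(linear_response, inputs)
```

In this toy setting the constant response has zero noise and zero sensitivity, whereas the linear response has a sensitivity close to 4 and a noise close to 0.01, which makes the expected trade-off concrete.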

The behavior of the response module in this signal system can be characterized by a phase diagram (Fig. 2), on which a signal curve represents the average intermediate signals under different inputs, and iso-response curves are defined as points with the same response on the phase diagram. Since in the feedforward cascade there is no feedback from the response module to the intermediate signal, the intermediate signals encoding the input can be expressed as:

$${S}_{1,2}={f}_{1,2}(I)+{a}_{1,2}{\eta }_{1,2},$$
(1)

where \({f}_{1}\) and \({f}_{2}\) are two arbitrary deterministic functions, \({a}_{{{{{\mathrm{1,2}}}}}}\) denote the noise amplitudes, and \({\eta }_{1}\) and \({\eta }_{2}\) are Gaussian random variables representing the encoding noise, with zero means, unit variances, and a correlation coefficient \(\rho =\left\langle {\eta }_{1}{\eta }_{2}\right\rangle\). The correlation between the two signals could stem from their common external noise or from their interaction. Equation (1) without the noise term is the parametric form of the signal curve on the phase diagram (Fig. 2).

Fig. 2: Illustration of a phase diagram of the response module.

Different responses, to various \({S}_{1}\) and \({S}_{2}\), are exhibited as different colors on the \({S}_{1}-{S}_{2}\) plane. The iso-response curves and the signal curve are two decisive geometric features. The green curve denotes an example iso-response curve \(R={R}_{0}\). The signal curve (red curve) represents the intermediate signals \({S}_{1}\) and \({S}_{2}\) under different external input \(I\) without noise. The green and red dashed lines denote the linear extension and the slopes,\(\,{k}_{{{{{{\rm{r}}}}}}}\) and \({k}_{{{{{{\rm{s}}}}}}}\), of the two curves at the intersection point, respectively.

While the form of the iso-response curve family is generally unknown, to perform the analytical derivation we resort to the linear perturbation approximation, which is a widely used approach in studying systems near fixed points. We will address nonlinear interactions and large noise via numerical simulations in the next section, and confirm that the sensitivity-precision trade-off derived from the linear approximation remains valid.

In the neighborhood of the intersection point of the signal curve and the iso-response curve \(R={R}_{0}\), we have \(R\approx {R}_{0}+{\nabla }_{{{{{{\bf{S}}}}}}}R\cdot {{{{{\mathbf{\Delta }}}}}}{{{{{\bf{S}}}}}}\). For signals determined by Eq. (1), \(R\approx {R}_{0}+\frac{\partial R}{\partial {S}_{1}}{a}_{1}{\eta }_{1}+\frac{\partial R}{\partial {S}_{2}}{a}_{2}{\eta }_{2}\). Combining with the identities, \(\frac{d\left\langle R\right\rangle }{{dI}}\approx \frac{\partial R}{\partial {S}_{1}}{f}_{1}^{{\prime} }\left(I\right)+\frac{\partial R}{\partial {S}_{2}}{f}_{2}^{{\prime} }\left(I\right)\), the slope of the iso-response curve \({k}_{{{{{{\rm{r}}}}}}}={k}_{{{{{{\rm{r}}}}}}}({S}_{1},{S}_{2})=-\frac{\partial R}{\partial {S}_{1}}/\frac{\partial R}{\partial {S}_{2}}\), and the slope of the signal curve \({k}_{{{{{{\rm{s}}}}}}}={f}_{2}^{{\prime} }\left(I\right)/{f}_{1}^{{\prime} }\left(I\right)\), we find that the response noise \({\sigma }_{R}^{2}\) is proportional to the response sensitivity \({\left(d\left\langle R\right\rangle /{dI}\right)}^{2}\), following:

$${\sigma }_{R}^{2}=\langle{(R-{R}_{0})}^{2}\rangle=\lambda \times {\left(\frac{d{\langle R \rangle }}{dI}\right)}^{2},$$
(2)

where the structural factor \(\lambda =\frac{{k}_{{{{{{\rm{r}}}}}}}^{2}{a}_{1}^{2}+{a}_{2}^{2}-2{k}_{{{{{{\rm{r}}}}}}}{a}_{1}{a}_{2}\rho }{{\left({k}_{{{{{{\rm{s}}}}}}}-{k}_{{{{{{\rm{r}}}}}}}\right)}^{2}}\frac{1}{{f}_{1}^{{\prime} }{\left(I\right)}^{2}}\) is determined by the structure of the phase diagram (\({k}_{{{{{{\rm{r}}}}}}}\)), the encoding functions (\({f}_{{{{{\mathrm{1,2}}}}}}\) and \({k}_{{{{{{\rm{s}}}}}}}\)), and the noise structure of the signals (\({a}_{{{{{\mathrm{1,2}}}}}}\) and \(\rho\)). The slope \({k}_{{{{{{\rm{r}}}}}}}\) varies between response modules, and minimizing \(\lambda\) with respect to \({k}_{{{{{{\rm{r}}}}}}}\) yields the inequality:

$$\lambda \ge {\lambda }_{\min }=\frac{{a}_{1}^{2}{a}_{2}^{2}(1-{\rho }^{2})}{({k}_{{{{{{\rm{s}}}}}}}^{2}{a}_{1}^{2}+{a}_{2}^{2}-2{k}_{{{{{{\rm{s}}}}}}}{a}_{1}{a}_{2}\rho )}\frac{1}{{f}_{1}^{{\prime} }{(I)}^{2}},$$
(3)

where \({\lambda }_{\min }\) is only determined by the encoding function and the noise structure, i.e., an external factor beyond the response module, which also depends on \({f}_{2}\) as \({k}_{s}={f}_{2}^{{\prime} }\left(I\right)/{f}_{1}^{{\prime} }\left(I\right)\). Therefore, for a given upstream signaling pathway, the response noise (\({\sigma }_{R}^{2}\)) and the response sensitivity \({\left(d\left\langle R\right\rangle /{dI}\right)}^{2}\) of a response module follow a general relationship:

$${\sigma }_{R}^{2}\, \ge \, {\lambda }_{\min }\times {\left(\frac{d{\langle R \rangle}}{dI}\right)}^{2}.$$
(4)
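Equations (2)–(4) can be checked numerically in the simplest setting where the linear approximation is exact, namely a linear response \(R={c}_{1}{S}_{1}+{c}_{2}{S}_{2}\). The sketch below is illustrative only: the coefficients `c1`, `c2`, the derivatives `f1_prime`, `f2_prime`, and the noise parameters are hypothetical values, and the Monte Carlo variance is compared against \(\lambda\) times the sensitivity.

```python
import math
import random


def structural_factor(k_r, k_s, a1, a2, rho, f1_prime):
    """Structural factor lambda as defined below Eq. (2)."""
    numerator = k_r ** 2 * a1 ** 2 + a2 ** 2 - 2 * k_r * a1 * a2 * rho
    return numerator / ((k_s - k_r) ** 2 * f1_prime ** 2)


def lambda_min(k_s, a1, a2, rho, f1_prime):
    """Lower limit of the structural factor, Eq. (3)."""
    numerator = a1 ** 2 * a2 ** 2 * (1 - rho ** 2)
    denominator = k_s ** 2 * a1 ** 2 + a2 ** 2 - 2 * k_s * a1 * a2 * rho
    return numerator / (denominator * f1_prime ** 2)


# Hypothetical linear response module and encoding (assumed values).
c1, c2 = 1.5, 0.8                  # dR/dS1 and dR/dS2
f1_prime, f2_prime = 1.0, 2.0      # slopes of the encoding functions
a1, a2, rho = 0.02, 0.03, 0.5      # noise amplitudes and correlation

# Monte Carlo estimate of the response noise at a fixed input.
rng = random.Random(1)
deviations = []
for _ in range(50000):
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    eta1 = z1
    eta2 = rho * z1 + math.sqrt(1 - rho ** 2) * z2  # correlation rho with eta1
    deviations.append(c1 * a1 * eta1 + c2 * a2 * eta2)  # R - R0
mean_dev = sum(deviations) / len(deviations)
noise_mc = sum((d - mean_dev) ** 2 for d in deviations) / len(deviations)

# Theoretical prediction from Eq. (2).
k_r = -c1 / c2                     # slope of the iso-response line
k_s = f2_prime / f1_prime          # slope of the signal curve
sensitivity = (c1 * f1_prime + c2 * f2_prime) ** 2
lam = structural_factor(k_r, k_s, a1, a2, rho, f1_prime)
lam_min = lambda_min(k_s, a1, a2, rho, f1_prime)
noise_theory = lam * sensitivity
```

For these assumed values the sampled variance agrees with \(\lambda\) times the sensitivity up to sampling error, and \(\lambda\) lies above \({\lambda }_{\min }\), as Eq. (4) requires.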

To achieve the lower limit of response noise for given sensitivity or, equivalently, to minimize \(\lambda\) with respect to the response module, the phase diagram of the response module should follow:

$${k}_{{{{{{\rm{r}}}}}}}=-\frac{{a}_{2}^{2}-{k}_{{{{{{\rm{s}}}}}}}{a}_{1}{a}_{2}\rho }{{k}_{{{{{{\rm{s}}}}}}}{a}_{1}^{2}-{a}_{1}{a}_{2}\rho }.$$
(5)

Equation (5) defines the phase diagram structure for the optimal response module in the signaling system under consideration here, and it could be used to further predict behavior of the system, which we will apply later in the patterning network in early Drosophila embryos. It is worth noting that this result relies on the feedforward architecture of the signaling cascade. If there exist feedback mechanisms from the response module to the intermediate signals, the function \({f}_{{{{{\mathrm{1,2}}}}}}\) could also become reliant on \(R\), thereby implying a potential extra dependency between \({k}_{{{{{{\rm{s}}}}}}}\) and \({k}_{{{{{{\rm{r}}}}}}}\).
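As a quick consistency check of Eqs. (3) and (5), one can scan the structural factor over \({k}_{{\rm{r}}}\) and compare it with the closed-form optimum. The sketch below uses hypothetical parameter values; the function names are illustrative and not taken from the original study.

```python
def structural_factor(k_r, k_s, a1, a2, rho, f1_prime):
    """Structural factor lambda as defined below Eq. (2)."""
    numerator = k_r ** 2 * a1 ** 2 + a2 ** 2 - 2 * k_r * a1 * a2 * rho
    return numerator / ((k_s - k_r) ** 2 * f1_prime ** 2)


def lambda_min(k_s, a1, a2, rho, f1_prime):
    """Lower limit of the structural factor, Eq. (3)."""
    numerator = a1 ** 2 * a2 ** 2 * (1 - rho ** 2)
    denominator = k_s ** 2 * a1 ** 2 + a2 ** 2 - 2 * k_s * a1 * a2 * rho
    return numerator / (denominator * f1_prime ** 2)


def optimal_k_r(k_s, a1, a2, rho):
    """Optimal slope of the iso-response curve, Eq. (5)."""
    return -(a2 ** 2 - k_s * a1 * a2 * rho) / (k_s * a1 ** 2 - a1 * a2 * rho)


# Hypothetical encoding and noise parameters (assumed values).
k_s, a1, a2, rho, f1_prime = 2.0, 0.02, 0.03, 0.5, 1.0

k_opt = optimal_k_r(k_s, a1, a2, rho)
lam_opt = structural_factor(k_opt, k_s, a1, a2, rho, f1_prime)
lam_min = lambda_min(k_s, a1, a2, rho, f1_prime)

# Brute-force scan over k_r, skipping the singular point k_r = k_s
# where the sensitivity vanishes.
scan = []
for i in range(2001):
    k_r = -10.0 + 0.01 * i
    if abs(k_r - k_s) < 1e-9:
        continue
    scan.append(structural_factor(k_r, k_s, a1, a2, rho, f1_prime))
```

With these values, `k_opt` evaluates to approximately -0.6, `lam_opt` coincides with `lam_min`, and no scanned slope gives a smaller structural factor.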

So far, we have derived a general relationship between the response sensitivity and the response noise, representing the sensitivity-precision trade-off, together with the corresponding optimal phase diagram structure. An immediate question is whether particular network topologies of the response module are better suited to achieve the lower limit of response noise.

The optimal response module is defined by the phase diagram

To systematically explore the potential optimal network topology for the response module with respect to the sensitivity-precision trade-off, we enumerate all possible three-node network topologies, i.e., 3411 different topologies after removing symmetric structures (see “Methods” section and Supplementary Fig. 1a–c for details).

We apply a coarse-grained nonlinear model, using a sigmoid function mimicking the saturation effect, to describe the network dynamics:

$${\tau }_{i}\frac{d{R}_{i}}{dt}=\frac{1}{1+\exp ({r}_{0i}-\sum {r}_{ji}{R}_{j}-\sum {r}_{sji}{S}_{j})}-{R}_{i},$$
(6)

where \({R}_{i}\) (\(i={{{{\mathrm{1,2,3}}}}}\)) represents the normalized instantaneous level of node \({i}\), whose steady-state value is regarded as the response of the signaling system (i.e., the system has three responses); \({S}_{j}\) (\(j={{{{\mathrm{1,2}}}}}\)) is the \(j\)th intermediate signal; \({\tau }_{i}\) is the timescale of node \(i\); \({r}_{0i}\) is a parameter of the regulatory function; \({r}_{{ji}}\) is the regulatory matrix representing the regulation from node \(j\) to node \(i\) (\({r}_{{ji}} > 0\) represents positive regulation, \({r}_{{ji}} < 0\) represents negative regulation); and \({r}_{{{{{{\rm{s}}}}}}{ji}}\) is the regulatory matrix representing the regulation from the \(j\)th intermediate signal to node \(i\). To test the generalizability of the model, we also apply different formalisms, such as the gene circuit model29,30 or a Monod–Wyman–Changeux (MWC)-inspired formalism31,32,33, to describe the network dynamics (see Supplementary Note 2 for more details).
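A minimal sketch of how Eq. (6) could be integrated to steady state is given below. The forward-Euler scheme, the initial condition, the step size, and the example network are assumptions made for illustration; the integration scheme actually used in the study is described in the “Methods” section and may differ.

```python
import math


def simulate_steady_state(r0, r, rs, S, tau, dt=0.01, t_max=50.0):
    """Integrate Eq. (6) by forward Euler, starting from R_i = 0.5.

    r[j][i]  : regulation from node j to node i
    rs[j][i] : regulation from intermediate signal j to node i
    """
    n = len(r0)
    R = [0.5] * n
    for _ in range(int(t_max / dt)):
        rates = []
        for i in range(n):
            exponent = (r0[i]
                        - sum(r[j][i] * R[j] for j in range(n))
                        - sum(rs[j][i] * S[j] for j in range(len(S))))
            target = 1.0 / (1.0 + math.exp(exponent))
            rates.append((target - R[i]) / tau[i])
        R = [R[i] + dt * rates[i] for i in range(n)]
    return R


def steady_state_residual(R, r0, r, rs, S):
    """Right-hand side of Eq. (6) times tau_i; zero at a steady state."""
    n = len(r0)
    out = []
    for i in range(n):
        exponent = (r0[i]
                    - sum(r[j][i] * R[j] for j in range(n))
                    - sum(rs[j][i] * S[j] for j in range(len(S))))
        out.append(1.0 / (1.0 + math.exp(exponent)) - R[i])
    return out


# Hypothetical network: node 0 activates node 1, node 1 represses node 2;
# signal 0 feeds node 0 and signal 1 feeds node 2.
r0 = [1.0, 1.0, 1.0]
r = [[0.0, 2.0, 0.0],
     [0.0, 0.0, -2.0],
     [0.0, 0.0, 0.0]]
rs = [[3.0, 0.0, 0.0],
      [0.0, 0.0, 3.0]]
tau = [1.0, 1.0, 1.0]
S = [1.0, 0.5]

R_ss = simulate_steady_state(r0, r, rs, S, tau)
residual = steady_state_residual(R_ss, r0, r, rs, S)
```

Because this example network contains no feedback loop, the trajectory relaxes monotonically to a single steady state; networks with feedback may oscillate or be multistable and would need the additional checks described in the text.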

Throughout the simulation, we assume that the intermediate signals encode the input as:

$$\bigg\{\begin{array}{c}{S}_{1}={e}^{3(I-1+{a}_{1}{\eta }_{1}+{a}_{{{{{{\rm{c}}}}}}1}{\eta }_{{{{{{\rm{c}}}}}}})}\,\\ {S}_{2}={e}^{-3(I+{a}_{2}{\eta }_{2}+{a}_{{{{{{\rm{c}}}}}}2}{\eta }_{{{{{{\rm{c}}}}}}})}\hfill\end{array},$$
(7)

where \({a}_{1}={a}_{{{{{{\rm{c}}}}}}1}=0.02\) and \({a}_{2}=-{a}_{{{{{{\rm{c}}}}}}2}=0.03\) are the noise amplitudes; \({\eta }_{1}\), \({\eta }_{2}\), and \({\eta }_{{{{{{\rm{c}}}}}}}\) are independent Gaussian random variables with zero means and unit variances; \({\eta }_{1}\) and \({\eta }_{2}\) represent the independent part of the encoding noise; and \({\eta }_{{{{{{\rm{c}}}}}}}\) represents the common part of the encoding noise. For the other parameters, i.e., \({r}_{{ji}}\), \({r}_{{sji}}\), and \({\tau }_{i}\), we test 10,000 randomly chosen parameter sets for each network, and simulate 50 times for each parameter set to evaluate the response noise and the response sensitivity (see “Methods” section for details). This form of encoding function follows the classic synthesis-diffusion-degradation model for morphogens encoding positional information in multicellular organisms, e.g., the spatial distribution of Bcd in early Drosophila embryos34. In addition, we also test two other forms of the encoding function (see Supplementary Note 2 for more details).
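The encoding in Eq. (7) can be sampled directly. The sketch below is illustrative; the function names, the sample size, and the input level are assumptions. It draws the two intermediate signals at a fixed input and estimates the correlation between their logarithms, which for the stated amplitudes works out analytically to 0.5.

```python
import math
import random


def encode(I, rng, a1=0.02, ac1=0.02, a2=0.03, ac2=-0.03):
    """Sample (S1, S2) from Eq. (7) for a given input I."""
    eta1 = rng.gauss(0.0, 1.0)    # independent noise of signal 1
    eta2 = rng.gauss(0.0, 1.0)    # independent noise of signal 2
    eta_c = rng.gauss(0.0, 1.0)   # common noise shared by both signals
    s1 = math.exp(3.0 * (I - 1.0 + a1 * eta1 + ac1 * eta_c))
    s2 = math.exp(-3.0 * (I + a2 * eta2 + ac2 * eta_c))
    return s1, s2


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


rng = random.Random(7)
pairs = [encode(0.5, rng) for _ in range(20000)]
log_s1 = [math.log(s1) for s1, _ in pairs]
log_s2 = [math.log(s2) for _, s2 in pairs]
rho_mc = pearson(log_s1, log_s2)

# Analytical correlation of ln S1 and ln S2 implied by Eq. (7).
a1, ac1, a2, ac2 = 0.02, 0.02, 0.03, -0.03
rho_theory = -ac1 * ac2 / math.sqrt((a1 ** 2 + ac1 ** 2) * (a2 ** 2 + ac2 ** 2))
```

Working with logarithms keeps the noise additive, so the sampled correlation can be compared with the closed-form value without any linearization.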

The resulting noise and sensitivity from all different topologies, different parameter sets, and different input levels exhibit a clear trade-off (Fig. 3a; only one-tenth of the data points are shown due to the limitation of graphical computation power). The lower bound agrees well with the derived equation without any parameter fitting (Eqs. (3) and (4)), despite the strong nonlinearity. By incorporating the structural factor \(\lambda\) into the horizontal axis, the data all collapse onto the predicted line described by Eq. (2), with the coefficient of determination being \({R}^{2}=1.00\) under the logarithmic scale (Fig. 3b). In the analysis, we exclude the resulting data points exhibiting multistability since their apparent response noise is dominated by the gap between different stable states rather than the encoding noise (see “Methods” section and Supplementary Fig. 1d for details).

Fig. 3: The noise-sensitivity relationship of all three-node network topologies.

a The lower bound of response noise clearly depends on the response sensitivity in the simulation data (dots), which aligns well with the theoretical prediction (dashed line, Eq. (4)). Data with \({\left({dR}/{dI}\right)}^{2}\le {10}^{-10}\) are discarded due to the limited precision in numerical simulations. Results without stable steady-state solutions or with multistability are also discarded. In our simulation, all three nodes receive the intermediate signals indicated by the red and blue triangles in the inset. b By incorporating the structural factor \(\lambda\) into the horizontal axis, data from all network topologies (dots) align well with the theoretical prediction (red dashed line, Eq. (2)) with the coefficient of determination \({R}^{2}=1.00\).

Generally, as long as the noise is low enough to fulfill the linear approximation in deriving Eq. (4), our result remains unaffected by varying the mathematical formalism of the dynamical system (Eq. (6)). This includes employing different formalisms of Eq. (6) such as the gene circuit model29,30 or an MWC-inspired model31,32,33 (Supplementary Fig. 2a, b), altering the noise correlation between the two intermediate signals (Supplementary Fig. 2c, d), and adjusting the form of the encoding functions (Eq. (7)) (Supplementary Fig. 3). Further details can be found in Supplementary Note 2.

However, we could not identify any network topology that is clearly preferred in achieving the lower bound of the response noise. To evaluate the performance of each topology, we calculate its optimal tendency (OT), defined as the proportion of the simulation results that lie close to the lower bound (\(\lambda \le 1.1{\lambda }_{\min }\)). This statistic differs from the Q-value measuring network robustness19,35, as it is based not on parameter sets but on the individual resulting points in Fig. 3. The histogram of the OT values shows a unimodal distribution with no obvious long tail or extra peak at the higher end (Fig. 4a). Besides, we find that the OT values for different network categories do not depend on their feedback types, i.e., the number of positive-feedback or negative-feedback sub-circuits (Fig. 4b). These results suggest that network topologies may not affect the performance of a response module regarding the sensitivity-precision trade-off. Moreover, by performing a hierarchical clustering, which is a promising method in searching for core topological structures19, for the high-OT network topologies (the top \(2 \%\) of all networks, see “Methods” section for details), we find no distinct clusters (Fig. 4c), indicating that no characteristic topological structure can be identified. We further perform a t-distributed stochastic neighbor embedding (t-SNE) (Fig. 4d; see “Methods” section for details) for the high-OT networks, which also suggests no obvious separable clusters. One concern is that the three-node network may not be the minimal module to achieve the optimal trade-off between sensitivity and precision. Therefore, we also test the topological structures in the simplest case, i.e., one-node network topologies, and confirm that there is no dependence on the type of auto-regulation (see Supplementary Note 3 and Supplementary Fig. 3c–f for details). These analyses indicate that the phase diagram structure, rather than the network topology, can better characterize the optimal response module.
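The OT statistic defined above amounts to a simple proportion. A minimal sketch follows; the function name and the example values are hypothetical and serve only to make the definition concrete.

```python
def optimal_tendency(lambdas, lambda_mins, tolerance=1.1):
    """Fraction of simulation results with lambda <= tolerance * lambda_min."""
    if len(lambdas) != len(lambda_mins) or not lambdas:
        raise ValueError("need two non-empty sequences of equal length")
    close = sum(1 for lam, lam_min in zip(lambdas, lambda_mins)
                if lam <= tolerance * lam_min)
    return close / len(lambdas)


# Hypothetical structural factors of four simulation results of one topology,
# each paired with the lower limit lambda_min at the same input.
example_lambdas = [1.0, 1.05, 1.2, 3.0]
example_lambda_mins = [1.0, 1.0, 1.0, 1.0]
ot = optimal_tendency(example_lambdas, example_lambda_mins)
```

Here two of the four hypothetical points satisfy \(\lambda \le 1.1{\lambda }_{\min }\), so the OT of this illustrative topology is 0.5.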

Fig. 4: The performance of all three-node network topologies.

a The OT shows a narrow distribution, indicating no apparent difference between topologies regarding the trade-off. b No clear dependence between OT and the number of feedback loops (both positive and negative feedback). For illustration, the dot size is set to be proportional to the number of network topologies in each category. c Hierarchical clustering of the high-OT networks shows no characteristic traits in the topological structure. Each row represents one network topology. Each column represents one edge between nodes as indicated by the labels below each column. Red, green, and black correspond to positive, negative, and no regulation, respectively. The dendrogram on the left indicates the assigned clusters. The height of the dendrogram indicates the average difference (Hamming distance) between two clusters. d t-SNE result shows no distinct clusters for the high-OT topologies. Each dot represents one network topology.

The patterning network in early Drosophila embryos is optimized

After numerically testing the noise-sensitivity relationship on coarse-grained three-node networks, we further test whether it applies to real biological systems. Here we consider the intensively studied gene regulatory network controlling the developmental pattern in early Drosophila embryos24,30,32,36,37,38,39. During early Drosophila embryogenesis, multiple genes express spatial patterns along the main body axis, i.e., the anterior-posterior (A-P) axis. By considering the A-P position (\(x\)) as the input of the patterning network and the gene expression level (\(G\)) as the response, the response sensitivity is equivalent to the squared spatial slope of the gene expression pattern, i.e., \({\left({dG}/{dx}\right)}^{2}\).

Based on this idea, we extract the gene expression noise and sensitivity from the published data set38, combine the data of 7 different gene expression patterns, including hunchback (hb), Krüppel (Kr), giant (gt), knirps (kni), even-skipped (eve), runt (run), and paired (prd), and observe a global trade-off (Fig. 5a). Furthermore, the identified lower bound agrees well with the derived relationship (Eq. (4)). Moreover, near the respective boundary positions of each pattern, the data lie close to the lower bound (Fig. 5b; for other genes, see Supplementary Fig. 4). This suggests that the patterning network, at least for the gene expression boundaries, could be optimized.

Fig. 5: The optimality of the expression boundaries of the developmental patterns in early Drosophila embryos.

a The expression data of 7 different genes exhibit a global trade-off. As predicted, the lower bound of noise, indicated by the dashed line, is proportional to the sensitivity following Eq. (4) with \({\lambda }_{\min }={0.007}^{2}\). We use the published data38; only the gene expression profiles within 40–44 min into n.c.14 are included here. b Data near the Hb expression boundary lie close to the lower bound (dashed line). Inset, the average Hb profile. For illustration, the dots are colored according to the distance from the boundary. c, d The estimated structural factor \({\lambda }_{{{\mbox{est.}}}}\) of the patterning data, reflecting the distance to the lower bound, exhibits a stronger negative correlation with sensitivity (with a Pearson correlation coefficient \(\rho =-0.90\)) compared with that in random networks (\(\rho =-0.20\)). Different colors in c denote different genes as in a.

To further test this hypothesis, we estimate the structural factor of the underlying patterning network by modifying Eq. (2) as \({\lambda }_{{{\mbox{est.}}}}={\sigma }_{G}^{2}/{\left(\frac{{dG}}{{dx}}\right)}^{2}\), since the structural factor can be used to evaluate the distance from optimality (Fig. 3a). A clear negative correlation between \({\lambda }_{{{\mbox{est.}}}}\) and \({\left({dG}/{dx}\right)}^{2}\) is observed (Fig. 5c, the Pearson correlation coefficient \(\rho =-0.90\) under the logarithmic scale), suggesting that the gene expression boundaries are being optimized in the patterning network. In contrast, the randomly generated networks used in the previous section exhibit only a mild negative correlation (Fig. 5d, \(\rho =-0.20\) under the logarithmic scale).
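The estimate \({\lambda }_{{\rm{est.}}}\) and its correlation with sensitivity can be sketched on synthetic profiles. Everything in the example below is hypothetical: the sigmoidal profile, the positional jitter, the additive noise floor, and the sensitivity cutoff are toy assumptions that merely illustrate the computation and are not the published expression data.

```python
import math
import random


def estimate_lambda(profiles, xs, min_sensitivity=1e-10):
    """Return (sensitivity, lambda_est) pairs at the interior positions.

    profiles[e][k] is the expression level of embryo e at position xs[k].
    """
    n = len(profiles)
    mean_profile = [sum(p[k] for p in profiles) / n for k in range(len(xs))]
    pairs = []
    for k in range(1, len(xs) - 1):
        variance = sum((p[k] - mean_profile[k]) ** 2 for p in profiles) / n
        slope = (mean_profile[k + 1] - mean_profile[k - 1]) / (xs[k + 1] - xs[k - 1])
        sensitivity = slope ** 2
        if sensitivity > min_sensitivity:   # skip numerically flat regions
            pairs.append((sensitivity, variance / sensitivity))
    return pairs


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


# Synthetic ensemble: a boundary at x = 0.5 whose position jitters between
# embryos, plus a small position-independent noise floor (toy assumptions).
rng = random.Random(3)
xs = [0.3 + 0.01 * k for k in range(41)]
profiles = []
for _ in range(400):
    shift = 0.01 * rng.gauss(0.0, 1.0)
    profiles.append([1.0 / (1.0 + math.exp((x - 0.5 - shift) / 0.05))
                     + 0.01 * rng.gauss(0.0, 1.0) for x in xs])

pairs = estimate_lambda(profiles, xs)
log_sens = [math.log10(s) for s, _ in pairs]
log_lam = [math.log10(lam) for _, lam in pairs]
rho_log = pearson(log_sens, log_lam)
```

In this toy ensemble the estimated structural factor is smallest near the boundary, where the sensitivity is highest, so the correlation under the logarithmic scale is negative.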

Subsequently, assuming the patterning network is optimized, we derive its phase diagram structure, which is then utilized to predict additional behaviors of the patterning network. Considering the patterning network as a highly coarse-grained model, we depict it as a signaling system in accordance with the framework proposed in Fig. 1b. Here, the input variable becomes the A-P position \(x\), and the intermediate signals represent maternal morphogen concentrations. In this setup, the feedforward pathway is modified to \(x\to\) maternal morphogens \(\to\) downstream gene expression. It is worth mentioning that the resultant downstream gene expression pattern is an outcome of the downstream gene regulatory network, which interprets the signals conveyed by maternal morphogen concentrations. It is well known that multiple maternal morphogens encode the positional information and govern patterning36, e.g., Bcd, Nanos (Nos), Caudal (Cad), Torso (Tor), and maternal Hb (mHb). For the sake of simplicity, we consider two static maternal morphogens as the intermediate signals: a static proxy of Bcd (\(B\)) and a second, posterior-originated maternal morphogen \(M\) (Fig. 6a, b), following:

$$\bigg\{\begin{array}{c}B=\exp (-xL/\ell )\\ M=f((1-x)L)\end{array},$$
(8)

where \(x\) is the normalized relative A-P position ranging from 0 to 1, \(L=\left\langle L\right\rangle +{a}_{{{{{{\rm{L}}}}}}}{\eta }_{{{{{{\rm{L}}}}}}}\) denotes the embryo length, \({a}_{{{{{{\rm{L}}}}}}}\) represents the variability amplitude of embryo size, \({\eta }_{{{{{{\rm{L}}}}}}}\) is a Gaussian random variable with a zero mean and a unit variance representing the variability of embryo size40,41, \(\ell =0.165\left\langle L\right\rangle\) is the length constant of the Bcd profile41, and the function \(f\) depicts the shape of the second maternal morphogen. Since we assume that the second maternal morphogen originates from the posterior end, the argument of its distribution function is the absolute distance from the posterior, i.e., \((1-x)L\). Without loss of generality, we use a normalized unit such that \(\left\langle L\right\rangle =1\). For simplification, this model only considers the embryo size variability as the noise source. A more general model considering independent noise terms is discussed in Supplementary Note 4. The second maternal morphogen could be an unidentified morphogen36 or an effective morphogen integrating multiple known maternal morphogens; its identity does not affect the results.

Fig. 6: The Bcd dosage response of the patterning network in early Drosophila embryos.

a, b In the simplified model, the patterning network is assumed to be controlled by two morphogens, the Bcd gradient (\(B\)) and a posteriorly originated gradient (\(M\)), that encode the A-P position \(x\). a The average Bcd profile follows an exponential curve. b The shape of the second morphogen is arbitrary and denoted by the function \(f\). c The response of the patterning network \(R\) could be the expression level of different target genes, whose wildtype patterns are denoted as \(G(x)\). d Based on the derived phase diagram structure, the predicted pattern position shift, \(\Delta x=\ell (1-{x}_{0})\,{{\mathrm{ln}}}\,D\), upon Bcd dosage perturbation (Eq. (13)) aligns well with the experimentally measured shift42, with the coefficient of determination \({R}^{2}=0.93\).

The deterministic response of the patterning network can be characterized by a phase diagram and formulated as \(R=R\left(B,M\right)\), where \(R\) represents the expression level of a gene. Furthermore, the observed spatial expression pattern is denoted as \(G\left(x\right)\) (Fig. 6c), i.e., on average

$$R(x)=R(B(x),M(x))=G(x).$$
(9)

On the one hand, following the procedure in deriving Eq. (5), we can use the linear approximation to derive the optimal slope of the iso-response curves. On the other hand, we can combine Eq. (8) and Eq. (9) to completely eliminate the influence of the noise term \({a}_{{{{{{\rm{L}}}}}}}{\eta }_{{{{{{\rm{L}}}}}}}\), which yields an analytical form of the optimal phase diagram as:

$$R(B,M)=G\left(\frac{\ell \,{{{{\mathrm{ln}}}}}\,B}{\ell \,{{{{\mathrm{ln}}}}} \, B-{{f}^{-1}}({M})}\right),$$
(10)

where the exact form of \(G\) is not required for our analysis. Based on this optimal phase diagram structure, we further predict the average patterning under Bcd dosage perturbation. To do that, Eq. (8) is modified as:

$$\bigg\{\begin{array}{c}B=D\exp (-x/\ell )\\ M=f(1-x)\hfill\end{array},$$
(11)

where \(D\) denotes the fold change of the Bcd dosage, and \(D=1\) for the wildtype (Fig. 6a). Combining Eqs. (10) and (11), we have the predicted response at position \(x\) as:

$$R(x,D)=G\left(\frac{\ell \,{{{{\mathrm{ln}}}}}\,D-x}{\ell \,{{{{\mathrm{ln}}}}}\,D-x-(1-x)}\right)=G\left(\frac{\ell \,{{{{\mathrm{ln}}}}}\,D-x}{\ell \,{{{{\mathrm{ln}}}}}\,D-1}\right).$$
(12)

Here we use \({f}^{-1}\left[f\left(1-x\right)\right]=1-x\) in the derivation to get rid of the function \(f\) in the expression. Although the function \(f\) may not be invertible in the whole domain \(x\in \left[{{{{\mathrm{0,1}}}}}\right]\), we can dissect the \(x\) range into subdomains in which it is invertible and perform the derivation from Eq. (10) for each domain separately.

Combining with Eq. (9), the model predicts that a response \({R}_{0}\) that originally occurs at \({x}_{0}\) when \(D=1\) now shifts to \(x\), following \(G({x}_{0})={R}_{0}=R(x,D)=G(\frac{\ell \,{{{{\mathrm{ln}}}}}\,D-x}{\ell \,{{{{\mathrm{ln}}}}}\,D-1})\), and therefore \({x}_{0}=\frac{\ell \,{{{{\mathrm{ln}}}}}\,D-x}{\ell \,{{{{\mathrm{ln}}}}}\,D-1}\), yielding:

$$\Delta x=x(D,{R}_{0})-{x}_{0}=\ell (1-{x}_{0}){{{{\mathrm{ln}}}}}\,D.$$
(13)
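The chain from Eq. (8) to Eq. (13) can be verified numerically. In the sketch below, the shapes chosen for \(f\) and \(G\) are arbitrary hypothetical examples (the derivation does not depend on them), and the helper names are illustrative. The code first checks that the phase diagram of Eq. (10) recovers \(G(x)\) for an embryo of non-average length, and then locates the shifted position under a Bcd dosage change by bisection and compares it with Eq. (13).

```python
import math

ELL = 0.165  # length constant of the Bcd profile, in units of <L>


def f(distance):
    # Hypothetical shape of the second morphogen (decays from the posterior).
    return math.exp(-distance / 0.2)


def f_inv(M):
    return -0.2 * math.log(M)


def G(x):
    # Hypothetical wildtype expression pattern with a boundary at x = 0.45.
    return 1.0 / (1.0 + math.exp((x - 0.45) / 0.05))


def response(B, M):
    """Optimal phase diagram, Eq. (10)."""
    return G(ELL * math.log(B) / (ELL * math.log(B) - f_inv(M)))


def perturbed_response(x, D):
    """Response at position x under Bcd dosage D, using Eq. (11) with L = 1."""
    return response(D * math.exp(-x / ELL), f(1.0 - x))


def shifted_position(x0, D, iterations=80):
    """Bisection for the position x where the response equals G(x0)."""
    target = G(x0)
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if perturbed_response(mid, D) > target:
            lo = mid   # response still too high: solution lies more posterior
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Check 1: Eq. (10) removes the embryo-length noise of Eq. (8).
L, x = 1.05, 0.4
length_error = abs(response(math.exp(-x * L / ELL), f((1.0 - x) * L)) - G(x))

# Check 2: numerically found shifts versus Eq. (13).
shift_errors = []
for x0 in (0.3, 0.5, 0.6):
    for D in (0.5, 2.0):
        numeric_shift = shifted_position(x0, D) - x0
        predicted_shift = ELL * (1.0 - x0) * math.log(D)
        shift_errors.append(abs(numeric_shift - predicted_shift))
```

The bisection relies on the assumed \(G\) being monotonic; for a non-monotonic pattern the same check would have to be carried out separately on each monotonic segment.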

This prediction serves as an independent validation of the observed optimality (Fig. 5). Indeed, this parameter-free prediction agrees well with the experimental data42,43 (with the coefficient of determination \({R}^{2}=0.93\), Fig. 6d). In previous work, the dosage of Bcd was perturbed and the position shifts of its downstream patterning were quantified, including the boundaries or peaks of four gap genes hb, Kr, kni, and gt, and the peaks of a pair-rule gene eve, along with the cephalic furrow (CF) position (see Supplementary Fig. 5 for details). As far as we know, most of the existing gene network models cannot quantitatively explain these patterning position deviations. Moreover, our model uses one single dosage response function, i.e., a master curve44, to explain the behavior of all measured patterns without any parameter fitting. Since these different patterns are controlled by different regulators and different network topologies, this result implies the universality of the derived relationship. It is also consistent with the numerical simulation results that no preferred network topology can be identified to achieve the optimal trade-off between sensitivity and precision.

The validity of the prediction in Eq. (13) hinges upon the assumption that the second maternal morphogen conveys positional information regarding the distance to the posterior pole. If the source of the second maternal morphogen relocates away from the posterior pole, the prediction will diverge from the experimental data. Nonetheless, despite the ongoing uncertainty surrounding its identity, this assumption remains reasonable for nearly all the proposed candidates for the second maternal morphogen to the best of our knowledge.

We also test a more comprehensive noise structure for the maternal morphogens, including the common noise from size variability discussed above and additional independent noise, and find a similar result (see Supplementary Note 4 and Supplementary Fig. 6 for details). By comparison, a model with a single maternal morphogen, i.e., the threshold-dependent model45, exhibits a large deviation between prediction and experimental data, with a much smaller coefficient of determination \({R}^{2}=0.41\) (Supplementary Fig. 6b).

Discussion

Signaling systems are critical for many biological functions. However, noise inevitably affects their performance. In this study, we investigate the trade-off between sensitivity and precision in near-steady-state signaling systems. Here we consider only static encoding noise, but in a real biological system other types of noise could affect the response behavior as well46,47. For example, temporally fluctuating noise, either intrinsic or extrinsic, is another potential noise source. While fluctuating noise can be well attenuated by temporal averaging via positive feedback without affecting the sensitivity in signaling systems7,16, we find that the timescale of the dynamical system does not affect the performance regarding the noise-sensitivity relationship we derived (see Supplementary Note 5). Therefore, our discussion of encoding noise could serve as an example for the perhaps more critical static noise in signaling systems.

Besides treating static noise as the only input noise, we apply several other assumptions in the derivation of the trade-off relation and in the numerical simulation for simplification, e.g., only two intermediate signals are involved in the pathway, the output response does not show multistability, and the input noise is low. However, we find that the discovered trade-off between sensitivity and precision could still be valid if we relax the above constraints. Firstly, we can mathematically derive the same noise-sensitivity relation (Eq. (4)) in pathways with an arbitrary number of intermediate signals (see Supplementary Note 1). When there is only one intermediate signal, Eq. (4) becomes an equality reproducing the well-known noise propagation relation. Moreover, although we do not explicitly consider the interaction between intermediate signals \({S}_{{{{{\mathrm{1,2}}}}}}\), the noise-sensitivity relation (Eq. (4)) remains valid in systems where \({S}_{{{{{\mathrm{1,2}}}}}}\) regulate each other. This holds true as long as the downstream network does not depend on dynamic decoding, e.g., pre-steady-state decoding48. The encoding function (Eq. (1)) can be interpreted as a steady-state relation, i.e., although \({S}_{{{{{\mathrm{1,2}}}}}}\) regulate each other dynamically, there still exists a stable value depending on the input level \(I\). Hence the general formalism of the encoding function remains unchanged, and subsequent derivations remain unaffected by this alternate interpretation of Eq. (1). Secondly, the addition of intrinsic fluctuating noise will only increase the response noise and, therefore, the derived inequality (Eq. (4)) is preserved. For example, if fluctuating noise is added in the three-node network simulation, the noise-sensitivity scatter plot remains nearly the same when the sensitivity is high enough, although a plateau appears at the low-\({\left({dR}/{dI}\right)}^{2}\) end (see “Methods” section and Supplementary Fig. 7a for details). Such plateaus resulting from intrinsic noise likely also appear in the gene expression data of the patterning network in Drosophila embryos when the sensitivity is low (i.e., low slope in the expression curves) (Fig. 5b and Supplementary Fig. 4). Due to the interference of intrinsic noise, data from an optimized system could also deviate from the lower bound on the noise-sensitivity scatter plot, for example, at the gene expression peaks of the developmental patterns in early Drosophila embryos. Thirdly, the derived trade-off relationship is still valid for multistable systems despite their exceedingly high apparent noise dominated by the gap between different stable states (Supplementary Fig. 7b). Since our definition of sensitivity is based on the average response (\(\left\langle R\right\rangle\)), multistable systems still have a continuous, single-valued average response curve, and the sensitivity is still well-defined. Finally, although a rigorous derivation without the small noise limit could be challenging, simulations show that the trade-off still appears even if the encoding noise (\({a}_{1}\), \({a}_{2}\), \({a}_{{{{{{\rm{c}}}}}}1}\), and \({a}_{{{{{{\rm{c}}}}}}2}\) in Eq. (7)) is increased tenfold (Supplementary Fig. 7c).

We notice that the trade-off between sensitivity and precision discovered in our study is different from a previously published relationship found in biological systems11, in which the response of a system to the input, equivalent to \(\frac{d\left\langle R\right\rangle }{{dI}}\) in our framework, was found to be proportional to the noise of the response, equivalent to the variance \({\sigma }_{R}^{2}\), i.e., \(\frac{d\left\langle R\right\rangle }{{dI}}\propto {\sigma }_{R}^{2}\). We find that the approximations in deriving that relationship are not justified in our model (see Supplementary Note 6 for details). This suggests that different noise structures could yield different trade-off relationships. Therefore, it is crucial to determine what the dominant noise structure is before applying specific theories. Our results could be an important step in extending the fluctuation-response relationship to nonequilibrium systems in which static noise, rather than temporal fluctuation, is under consideration. Moreover, it has been proposed that fluctuations can be attenuated by consuming energy6, and that dynamic decoding of temporally modulated signals can reduce the noise-induced information loss48,49. It remains to take energy dissipation and dynamic signaling into account to further explore the trade-off between sensitivity and precision of biological signaling systems.

Our results reveal a design principle for the response module in signaling networks to achieve optimal performance regarding the sensitivity-precision trade-off. This design principle is characterized by a phase diagram structure, as given by Eq. (5), rather than by a network topological structure. In short, the iso-response curve and the signal curve should be nearly orthogonal rather than nearly parallel to each other, while the specific optimal condition is given by Eq. (5) (Supplementary Fig. 3f). With this optimal phase diagram structure (Eq. (5)), we can predict non-trivial behaviors of the system even without knowing the real network topological structure. As a demonstration, we show the practicality of this phase diagram method in a real biological system, the patterning network in early Drosophila embryos (Fig. 5a, b), and derive a parameter-free prediction of its behavior under Bcd dosage perturbation (Fig. 6, Eqs. (12) and (13)) based on the optimal phase diagram structure (Eq. (10)). Although our model is highly simplified (Fig. 6), the agreement between prediction and data suggests that we might have captured a core property of this patterning network.

In terms of modeling a complex network, the phase diagram representation shows several advantages compared with the more widely applied network topology representation. First, it is a technically more convenient mathematical tool for analyzing some network functions. To analyze a network topology, a set of ordinary differential equations (ODEs) is needed20,32. However, ODE models always require some pre-defined functions and contain many uncertain parameters even with proper coarse-graining, while the phase diagram representation could contain fewer parameters by coarse-graining and omitting irrelevant details, e.g., the linear approximation in deriving the trade-off relationship (Eq. (2)). Besides, it could be computationally expensive to reveal design principles by network enumeration and parameter sampling19,20,35,50, while analytical derivation could be possible by utilizing the phase diagram representation, e.g., the analytical derivation of the optimal phase diagram (Eqs. (5) and (10)). Moreover, the phase diagram structure could be more tightly connected with some network functions. Although many network topology motifs have been identified to achieve certain biological functions27, it has been reported that the connection between the function and the topological structure could be loose in some scenarios21,22,51. In our example, the optimal response module regarding the sensitivity-precision trade-off does not correlate with its topology (Fig. 4, e.g., all topologies have a \(\sim 6 \%\) chance of achieving the optimal performance), but it can be mapped into a unique phase diagram structure. It is worth noting that the exact value of the OT peak position depends largely on the chosen threshold for \(\lambda\), i.e., the criterion for good performance; it occurs near \(6 \%\) when choosing \(\lambda \le 1.1{\lambda }_{\min }\).

In fact, geometric representations, including the phase diagram, have been successfully applied to study a variety of complex biological processes23,26,52,53,54,55. For example, a phase diagram can directly decode the cell fate decision process in developing neural tubes (also referred to as a decoding map)53. Gene-free geometric modeling has been applied to construct a phase diagram, resembling the Waddington landscape, to reproduce the cell fate dynamics in the vulval development of the nematode Caenorhabditis elegans55,56,57. A geometric method has been used to reveal the trade-off between internal and external noise resistance in biological clocks23. A Waddingtonian-like dynamical landscape can be constructed from quantitative gene expression data to accurately characterize the cell fate transition of pluripotent stem cells54. These studies imply that the topological representation is no longer the only option for describing a complex network.

It should be pointed out that the topological representation and the geometric representation are not mutually exclusive. The topological representation provides a more intuitive image of the complex network and is more straightforward to compare with gene-centered biological observations. While the phase diagram is usually analyzed after constructing the network topology of the dynamical system in the literature, we propose that the phase diagram structure could be determined theoretically in the first place and potentially serve as a constraint in constructing the network topology of the dynamical system. This could be a complementary approach in modeling complex networks.

Moreover, this approach resonates with an emerging modeling method based on first principles44,58,59, which does not depend as heavily on experimental data as the widely used data-driven modeling does. For example, by optimizing the direction inference (i.e., inferring binary information) against noisy sensory signals in a one-dimensional chemotactic system, the resulting model based on information-theoretically optimal dynamics is equivalent to the standard, experiment-based biochemical chemotactic network model60. In addition, without data fitting, the parameters of the gap gene network model in a topological representation could be determined by maximizing the positional information carried by the gap gene patterns during Drosophila embryogenesis. Surprisingly, the gene expression pattern can be reproduced well by this optimized network model32. These studies show that the principle-based top-down approach can complement the data-driven bottom-up method in constructing biological models.

In conclusion, we expect that combining the phase diagram representation with the data-free optimality approach could open a new avenue for studying and designing complex gene regulatory networks.

Methods

Enumeration of three-node network topologies

Mathematically, the total number of three-node network topologies is \({3}^{9}\), considering 9 edges between 3 nodes and 3 possible types of regulation for each edge, i.e., positive regulation, negative regulation, or no regulation. However, after discarding symmetric topologies, only 3411 different networks remain. Each topology has at most 5 other symmetric structures, generated by 2 rotational symmetry operations among the three nodes and 3 reflectional symmetry operations between node pairs. Some topologies, however, are identical to their own symmetric structures: \({3}^{3}=27\) topologies are invariant under each rotation, and, counted over the three reflections, there are \({3}^{6}=729\) cases in which a topology is identical to one of its reflectional symmetric structures (see Supplementary Fig. 1 for details). Therefore, after removing the symmetric topologies, \(\frac{{3}^{9}+2\times {3}^{3}+{3}^{6}}{6}=3411\) network topologies remain.
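
The count above can be cross-checked by brute force. The following minimal sketch assumes that "symmetric topologies" means topologies related by relabeling the three nodes (the six permutations of node indices), and that each topology is stored as 9 entries in \(\{-1,0,1\}\). The function names and the storage layout are illustrative choices, not part of the original simulation code.

```python
from itertools import permutations, product

def canonical(topology):
    """Smallest relabeled copy of a topology under all 6 node permutations.

    `topology` is a tuple of 9 values in {-1, 0, 1}; entry 3*j + i is the
    regulation from node j to node i (assumed layout, for illustration only).
    """
    best = None
    for p in permutations(range(3)):
        # Relabel nodes: the edge j -> i of the new copy is the edge p[j] -> p[i].
        relabeled = tuple(topology[3 * p[j] + p[i]]
                          for j in range(3) for i in range(3))
        if best is None or relabeled < best:
            best = relabeled
    return best

def count_unique_topologies():
    """Count three-node topologies that are distinct up to node relabeling."""
    return len({canonical(t) for t in product((-1, 0, 1), repeat=9)})

n_unique = count_unique_topologies()
# Counting formula quoted in the text: identity + 2 rotations + 3 reflections.
n_formula = (3**9 + 2 * 3**3 + 3**6) // 6
print(n_unique, n_formula)
```

Both numbers should agree, since the formula in the text is the orbit-counting (Burnside) average over the six symmetry operations.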

Parameter sampling in network enumeration

In our simulation, 10,000 different parameter sets are sampled for each network. Specifically, \({r}_{{ji}}\) follows a Gaussian distribution with zero mean and a standard deviation of 10, and its sign is set according to the topology, i.e., positive values for positive regulation, negative values for negative regulation, and zero if no regulation is present for the target node pair; \({r}_{{sji}}\) also follows a Gaussian distribution with zero mean and a standard deviation of 10; \({r}_{0i}=-1\) is the leakage; \({\tau }_{i}\) follows a log-uniform distribution, i.e., \({\log }_{10}{\tau }_{i}\) follows a uniform distribution with a range of 2. In each parameter set, the timescales \({\tau }_{i}\) are normalized so that the smallest one is \(0.1\). For each parameter set, the response noise and sensitivity are evaluated with a random input signal \(I\in \left[{{{{\mathrm{0,1}}}}}\right]\). The response noise is evaluated from 50 repetitions with the same parameters while varying \({S}_{{{{{\mathrm{1,2}}}}}}\) following Eq. (7).
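
A minimal sketch of one such parameter draw is given below. Several details are assumptions made for illustration: the sign of \({r}_{{ji}}\) is imposed by taking the absolute value of the Gaussian sample and multiplying by the topology entry, \({r}_{{sji}}\) is taken to couple two intermediate signals to three nodes, and \({\log }_{10}{\tau }_{i}\) is drawn from \([0,2]\) before normalization (only the width of the interval matters after rescaling). The function name and the dictionary layout are hypothetical.

```python
import random

def sample_parameters(topology, rng):
    """Draw one parameter set for a three-node network (illustrative sketch).

    `topology[j][i]` is -1, 0, or +1 for the regulation from node j to node i.
    """
    # |N(0, 10)| with the sign dictated by the topology (assumed reading).
    r = [[topology[j][i] * abs(rng.gauss(0.0, 10.0)) for i in range(3)]
         for j in range(3)]
    # Couplings from the two intermediate signals to the three nodes (assumed shape).
    r_s = [[rng.gauss(0.0, 10.0) for i in range(3)] for j in range(2)]
    # Leakage term, fixed to -1 for every node.
    r0 = [-1.0, -1.0, -1.0]
    # Log-uniform timescales spanning two decades, rescaled so that min(tau) = 0.1.
    tau = [10.0 ** rng.uniform(0.0, 2.0) for _ in range(3)]
    scale = 0.1 / min(tau)
    tau = [t * scale for t in tau]
    return {"r": r, "r_s": r_s, "r0": r0, "tau": tau}

example_topology = [[1, -1, 0], [0, 0, 1], [-1, 0, 0]]
example_params = sample_parameters(example_topology, random.Random(0))
```

Because the timescales span at most two decades before rescaling, the largest normalized \({\tau }_{i}\) in this sketch cannot exceed 10.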

Determination of multistability in simulations

k-means clustering and mean shift clustering are combined to determine whether a result shows multistability. Specifically, k-means is used to find two steady-state values as the initial centers, and mean shift clustering is then applied to refine the centers. If the result shows monostability, the two identified centers will colocalize; otherwise, two separate centers will be found. The threshold for colocalization is set to \(0.1\), i.e., the two centers are considered colocalized if their distance is less than 0.1, and the result is classified as multistable otherwise. No difference is noticed if the threshold value is 0.5.
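
The two-stage procedure can be illustrated with a one-dimensional sketch. This is a simplified stand-in and not the original implementation: it assumes scalar steady-state values, a flat mean shift kernel, and a hypothetical bandwidth of 0.2, none of which are specified in the text. Only the colocalization threshold of 0.1 is taken from the description above.

```python
def two_means_1d(values, n_iter=50):
    """Plain 1D k-means with k = 2, initialized at the minimum and maximum."""
    c_lo, c_hi = min(values), max(values)
    for _ in range(n_iter):
        lo = [v for v in values if abs(v - c_lo) <= abs(v - c_hi)]
        hi = [v for v in values if abs(v - c_lo) > abs(v - c_hi)]
        new_lo = sum(lo) / len(lo) if lo else c_lo
        new_hi = sum(hi) / len(hi) if hi else c_hi
        if new_lo == c_lo and new_hi == c_hi:
            break
        c_lo, c_hi = new_lo, new_hi
    return c_lo, c_hi

def mean_shift_1d(values, start, bandwidth, n_iter=200, tol=1e-9):
    """Move a center to the mean of the points within `bandwidth` until it settles."""
    center = start
    for _ in range(n_iter):
        window = [v for v in values if abs(v - center) <= bandwidth]
        if not window:
            break
        new_center = sum(window) / len(window)
        if abs(new_center - center) < tol:
            break
        center = new_center
    return center

def is_multistable(values, bandwidth=0.2, threshold=0.1):
    """True if the two refined centers are at least `threshold` apart."""
    c_lo, c_hi = two_means_1d(values)
    m_lo = mean_shift_1d(values, c_lo, bandwidth)
    m_hi = mean_shift_1d(values, c_hi, bandwidth)
    return abs(m_hi - m_lo) >= threshold

print(is_multistable([0.50, 0.51, 0.49, 0.505, 0.495]))
print(is_multistable([0.10, 0.11, 0.09, 0.90, 0.91, 0.89]))
```

The role of the second stage is visible in the first example: k-means with two clusters always splits the data, whereas mean shift pulls both centers back to the same mode when only one steady state is present.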

Network topology clustering

In the clustering procedure, a network topology is represented by a 9-dimensional vector in which each dimension represents one of the 9 edges. The value is \(1\), \(-1\), or \(0\) if the corresponding edge is a positive, negative, or no regulation, respectively. The pairwise distance between topologies is defined as the Hamming distance between the corresponding vectors, i.e., the number of different edges between the two topologies. Symmetric structures are included in the clustering. In the t-SNE analysis, the same data structure is used, i.e., each topology is represented by a 9-dimensional vector with values of \(0,\pm 1\). The pairwise distance is also defined as the Hamming distance.
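
As a concrete illustration of this distance, the sketch below computes the Hamming distance between topology vectors and assembles a pairwise distance matrix. The example vectors and function names are hypothetical; how the matrix is then passed to a clustering or t-SNE routine is not specified in the text and is left out here.

```python
def hamming(a, b):
    """Number of edges whose regulation type differs between two topologies."""
    if len(a) != len(b):
        raise ValueError("topology vectors must have the same length")
    return sum(1 for x, y in zip(a, b) if x != y)

def pairwise_hamming(topologies):
    """Symmetric matrix of Hamming distances between all topology vectors."""
    n = len(topologies)
    return [[hamming(topologies[p], topologies[q]) for q in range(n)]
            for p in range(n)]

# Three hypothetical 9-dimensional topology vectors with entries in {-1, 0, 1}.
topo_a = (1, 0, -1, 0, 0, 1, 0, 0, 0)
topo_b = (1, -1, -1, 0, 0, 0, 0, 0, 0)
topo_c = (-1, 0, 1, 0, 0, -1, 0, 0, 0)
distances = pairwise_hamming([topo_a, topo_b, topo_c])
print(distances)
```

Note that under this definition a sign flip on an edge counts the same as adding or removing that edge: each contributes a distance of 1.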

Network enumeration with intrinsic noise

Another three-node network enumeration is performed to evaluate the effect of intrinsic noise. In this simulation, Eq. (6) is modified into the following Langevin equation:

$$\frac{d{R}_{i}}{dt}=\frac{1}{{\tau }_{i}}\left[\frac{1}{1+\exp ({r}_{0i}-\sum {r}_{ji}{R}_{j}-\sum {r}_{sji}{S}_{j})}-{R}_{i}\right]+{a}_{{{{{{\rm{int}}}}}},i}{\eta }_{{{{{{\rm{int}}}}}},i},$$

where \({a}_{{\rm{int}},i}=0.01\) (\(i=1,2,3\)) represents the amplitude of intrinsic noise, and \({\eta }_{{\rm{int}},i}\) (\(i=1,2,3\)) denotes white noise, described by an independent Gaussian random variable with the autocorrelation function \(\langle {\eta }_{{\rm{int}},i}(t){\eta }_{{\rm{int}},i}({t}^{{\prime} })\rangle =\delta (t-{t}^{{\prime} })\), where \(\delta \left(t-{t}^{{\prime} }\right)\) is the Dirac delta function. Other parameters and the simulation scheme are the same as in the simulation in the main text.
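
One common way to integrate a Langevin equation of this form is the Euler-Maruyama scheme, in which the white-noise term contributes an increment of \({a}_{{\rm{int}},i}\sqrt{\Delta t}\) times a standard Gaussian sample per step. The sketch below is an illustration under that assumption; the integration scheme, the time step, the total simulation time, the initial condition, and all function and argument names are hypothetical choices rather than the settings used in this study.

```python
import math
import random

def sigmoid_from_exponent(z):
    """1 / (1 + exp(z)), with z clamped to avoid floating-point overflow."""
    z = max(-700.0, min(700.0, z))
    return 1.0 / (1.0 + math.exp(z))

def simulate_langevin(r, r_s, r0, tau, S, a_int=0.01, dt=1e-3, t_end=20.0,
                      rng=None, R_init=(0.0, 0.0, 0.0)):
    """Euler-Maruyama integration of the three-node Langevin equation (sketch).

    r[j][i] plays the role of r_ji, r_s[j][i] of r_sji, and S[j] of S_j.
    """
    rng = rng if rng is not None else random.Random()
    R = list(R_init)
    sqrt_dt = math.sqrt(dt)
    for _ in range(int(round(t_end / dt))):
        new_R = []
        for i in range(3):
            # Exponent of the regulation term: r_0i - sum_j r_ji R_j - sum_j r_sji S_j.
            z = (r0[i]
                 - sum(r[j][i] * R[j] for j in range(3))
                 - sum(r_s[j][i] * S[j] for j in range(len(S))))
            drift = (sigmoid_from_exponent(z) - R[i]) / tau[i]
            noise = a_int * sqrt_dt * rng.gauss(0.0, 1.0)
            new_R.append(R[i] + drift * dt + noise)
        R = new_R
    return R

# Sanity check without regulation and without noise: every node should relax
# to 1 / (1 + exp(r_0i)) with r_0i = -1.
no_edges = [[0.0] * 3 for _ in range(3)]
no_inputs = [[0.0] * 3 for _ in range(2)]
R_final = simulate_langevin(no_edges, no_inputs, [-1.0] * 3, [0.1, 0.2, 0.5],
                            S=[0.3, 0.7], a_int=0.0)
print(R_final)
```

Setting `a_int=0.0` recovers the deterministic dynamics, which gives a simple way to check the integrator before switching the intrinsic noise on.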

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.