## Abstract

Robustness, and the ability to function and thrive amid changing and unfavorable environments, is a fundamental requirement for living systems. Until now it has been an open question how large and complex biological networks can exhibit robust behaviors, such as perfect adaptation to a variable stimulus, since complexity is generally associated with fragility. Here we report that all networks that exhibit robust perfect adaptation (RPA) to a persistent change in stimulus are decomposable into well-defined modules, of which there exist two distinct classes. These two modular classes represent a topological basis for all RPA-capable networks, and generate the full set of topological realizations of the internal model principle for RPA in complex, self-organizing, evolvable bionetworks. This unexpected result supports the notion that evolutionary processes are empowered by simple and scalable modular design principles that promote robust performance no matter how large or complex the underlying networks become.

## Introduction

Robust perfect adaptation (RPA) is the ability of a system to generate an output that returns to a fixed reference level (its 'set point') following a persistent change in input stimulus, with no need for tuning of system parameters^{1,2,3}. RPA has been widely observed throughout biology^{1,4,5,6,7,8,9,10,11}, at the cellular level (signal transduction, gene regulation, protein interaction networks^{6,7,8}), in sensory systems^{6,10,11}, at the whole-organism level in mammals^{9}, and during development^{12,13}. For example, mammalian plasma calcium concentration exhibits perfect adaptation to persistent changes in calcium export (e.g., lactation), or influx (e.g., diet changes or bone remodeling), thereby keeping plasma calcium within very tight tolerances as calcium demands vary^{9}. In addition, perfect adaptation enables a biological system to reset itself following a perturbation, in order to maintain responsiveness to subsequent variations in external stimuli^{3}. The RPA property thus promotes high sensor sensitivity, while increasing the dynamic range, regardless of the intensity, or the variations, in the average stimulus^{1,2,3,14,15}.

Importantly, while RPA confers many benefits to living systems, loss of the RPA property in networks that require it could lead to disease (e.g., ras-mediated oncogenesis^{3,16}), reduced fitness^{1}, or death^{17}. During development, and throughout evolution^{1}, biologic networks (or bionetworks) grow to enormous size and molecular complexity, apparently without any compromise in robustness. Why isn’t this growth in complexity associated with heightened fragility, instability, or a loss of requisite function^{18,19,20}?

In this connection, it is essential to recognize that biological systems differ in fundamental ways from engineering control systems. Molecular signaling networks are self-organizing, self-regulating, adaptable, and evolvable, and as such, are composed of elements that must serve both as the transmitted signals and as their own controllers. Unlike their designed counterparts in engineering control systems, bionetworks do not have the luxury of employing specially-designed, dedicated components whose purpose is to sense or control biochemical signals. Although asymptotic tracking problems (of which RPA to constant exogenous inputs is a special case) have been studied extensively for engineering systems^{21,22,23}, how can we understand the mechanisms governing robust performance in the context of the self-organizing, self-regulating autonomous systems arising in biology, or that become deranged in disease^{24,25,26}?

Until now, three basic approaches have been used to understand the allowed topologies for RPA networks^{2,3,5,8,14,18,19}, and they have only provided answers for very small systems. The first approach is to model a small, well-defined, and well-studied natural system that exhibits RPA; a classic example of this approach is the work of Barkai and Leibler on RPA in bacterial chemotaxis^{5}. A second approach is to build trial modifications of input–output network connection maps, hoping to find circuit designs that have the RPA property^{3,18,19}. Both of these approaches consider the RPA problem via ad-hoc modeling. By contrast, a third approach is to undertake high-throughput computational searches to comprehensively study very small networks, typically only containing two or three components^{2,13}. Ma et al.^{2} used this approach to suggest that, for three-node networks, only two types of signaling motif were capable of implementing RPA.

Crucially, such ad-hoc or high-throughput strategies are impractical for most networks in living and evolving systems, which can contain huge numbers of interacting molecules, and for which we have limited or no a priori knowledge of component and pathway interconnections, reaction kinetics, or system parameters^{1,17,18}. If computational search strategies could be extended to networks with more than three nodes, would this reveal any RPA network topologies that are distinct from the two basic network designs already discovered for three-node RPA networks? Do the topological requirements for RPA in large systems increase in complexity, or change qualitatively, along with the growth of the system, or must larger systems simply replicate the same basic design principles used by smaller systems? Could there exist universal topological principles that characterize all RPA-capable network designs^{18}?

In the present study we provide definitive answers to these questions. In stark contrast to previous work on the RPA problem (see^{3,18} for recent reviews), we develop a topological framework, and a new set of unifying definitions, that is able to interpret and account for the flow and control of biochemical signals through a network. This global and top-down methodology allows us to describe the full set of possible network topologies for achieving RPA in arbitrarily large and complex networks, involving any number of interacting components, with no prior assumptions as to how the components are interconnected, or the kinetics of any reactions. Remarkably, we show that all networks, no matter how large or interconnected, have just two distinct mechanisms at their disposal, corresponding to two distinct types of integral control, in order to implement RPA. Each of these two mechanisms generates a rich class of well-defined network topologies—'modules'—containing previously unrecognized architectural features that are too complex to be observed in three- or four-node networks. Most importantly, we show that these two rich and distinct classes of modules represent a topological basis for the solution to the RPA problem: that is, the full set of all possible RPA-capable networks can be expressed via the interconnections of these special modules, subject to well-defined modular connectivity rules.

## Results

### General schema for identifying RPA topologies

In order to specify the complete solution space to the general RPA problem, we derive and analyze a suitable algebraic condition that we refer to as the RPA Equation (see Methods). This equation accounts for all possible interactions and interconnections in a network of arbitrary size, and establishes a special case of the Internal Model Principle (IMP) from which topological structures may be deduced. The equation takes the form of a particular Jacobian determinant, which is required to take a zero value for all system inputs, *I*.
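As a hedged sketch of where a determinant condition of this kind comes from, consider a generic network of *n* nodes with activities \(P_1, \dots, P_n\) obeying \(\dot{P}_i = f_i(P_1, \dots, P_n; I)\). The labelling (input entering only at node 1, output read at node *n*) and the symbols \(J\), \(M_{1n}\) and \(\mathbf{e}_1\) are assumptions made here for illustration; the authoritative statement is the one in the Methods. Differentiating the steady-state condition \(f(P^*, I) = 0\) with respect to *I* and applying Cramer's rule gives

\[
J\,\frac{dP^*}{dI} = -\frac{\partial f_1}{\partial I}\,\mathbf{e}_1,
\qquad J_{ij} = \frac{\partial f_i}{\partial P_j}
\quad\Longrightarrow\quad
\frac{dP_n^*}{dI} = -\frac{\partial f_1}{\partial I}\,\frac{(-1)^{1+n}\,\det M_{1n}}{\det J},
\]

where \(M_{1n}\) is the Jacobian \(J\) with the input row (row 1) and the output column (column *n*) deleted. Under these assumptions, and provided \(\det J \neq 0\) and \(\partial f_1/\partial I \neq 0\), the output steady state is insensitive to the input exactly when \(\det M_{1n} = 0\) for all *I*.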

At the broadest level, we interpret the (signed) terms of the RPA equation’s determinant expansion as a set, **R**, and recognize that partitions of **R** may exist for which the contents of every subset can sum to zero independently of the contents of every other subset. We proceed to identify general conditions under which such 'independently adapting subsets' can exist by accounting for the topological information contained within each term. In particular, we show in Supplementary Note 2 that each member of **R** represents a unique product of the three fundamental mathematical elements of signal transmission: (1) a route factor (a pathway through the network, corresponding to an unbroken sequence of node-node interactions commencing at the input node and ending at the output node), (2) circuit products (corresponding to multi-node cycles, i.e., feedback loops), and (3) 'kinetic multipliers' (single-node cycles, which encode important properties of the reaction kinetics at the node in question).

The cornerstone of our methodology is to identify conditions under which an arbitrary subset of **R** can sum to zero for all *I*, thus solving a local adaptation equation (LAQ), with no possible further sub-division into smaller adaptive subsets. We refer to such a subset as a 'minimally adaptive (MA)-subset', of which we identify two distinct types: singleton MA-subsets (hereafter, S-sets), whose single term must be able to assume a zero value for all *I*, and multi-term MA-subsets (hereafter, M-sets), all of whose terms must solve their LAQ with strictly non-zero values for all *I*.
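A small worked case may make the two types of MA-subset concrete. The following is a sketch under stated assumptions, not the construction used in the Supplementary Notes: we assume a three-node network with input at node 1 and output at node 3, write \(J_{ij} = \partial f_i / \partial P_j\) for the influence of node *j* on node *i*, and assume the RPA equation is the Jacobian determinant obtained by deleting the input row and the output column:

\[
\det\begin{pmatrix} J_{21} & J_{22} \\ J_{31} & J_{32} \end{pmatrix}
= \underbrace{J_{21}\,J_{32}}_{\text{route } 1 \to 2 \to 3}
\;-\; \underbrace{J_{31}}_{\text{route } 1 \to 3}\;\underbrace{J_{22}}_{\text{kinetic multiplier of node 2}}
= 0 .
\]

If the route \(1 \to 2 \to 3\) is absent, the single remaining term must vanish on its own, which (for a non-zero direct route) requires \(J_{22} = 0\) at steady state for all *I*: a singleton subset of the S-set type. If both terms are present and strictly non-zero, they must cancel, \(J_{21} J_{32} = J_{31} J_{22}\) for all *I*: a two-term subset of the M-set type.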

S-sets and M-sets are generated (that is, their LAQ is solved) by distinct mechanisms that are orchestrated by special classes of reaction kinetics at one or more key network nodes. For S-sets, we refer to the generating mechanism as 'opposition', and the special class of reaction kinetics that execute the mechanism as 'opposer kinetics'. The 'balancing' mechanism that generates M-sets, on the other hand, requires two different types of reaction kinetics working together in collaboration—'balancer kinetics' and 'connector kinetics'—at distinct nodes. We will see later that these special classes of reaction kinetics play remarkable computational roles in RPA-capable networks.

### S-sets and the creation of Opposer Modules

S-sets are created by a mechanism we refer to as opposition, since a factor in the cycle component of the term assumes a zero value, thereby opposing that term’s route component. The mechanism is transacted by an opposer node, *P*_{o}, which appears in the form of a kinetic multiplier, and whose reaction kinetics (opposer kinetics) must satisfy \(\frac{\partial f_o}{\partial P_o} = 0\) at steady-state, for all *I*.

Now, a node can only exhibit opposer kinetics if it participates in a feedback loop (Theorem 1, Supplementary Note 2), making opposition a circuit-based mechanism. We show that the mathematical requirements of opposer kinetics imply a well-defined class of possible chemical reaction forms, which we describe in Supplementary Note 4. In addition, an opposer node requires a single independent regulator, which must participate in a common circuit with the opposer, in order to implement these reaction kinetics. This requirement for a single independent regulator implies that an opposer node cannot occur at the junction of two independent feedback loops, for instance.

Importantly, a single opposer node will oppose (assign to S-sets) all instances in **R** of a particular route if and only if (a) it is disjoint from the route, and (b) it participates only in circuits that are contiguous with the route (Theorem 2, Supplementary Note 3). From this it follows that a single opposer node can only partially oppose a route if it participates in even one circuit that is disjoint from the route in question. Nevertheless, all instances of a route could still be assigned to S-sets by a collection of two or more opposer nodes working together in concert. We refer to such a collection of opposer nodes as an 'opposing set'.

We identify a strict set of topological conditions under which a collection of opposer nodes, {*P*_{o1},…, *P*_{om}}, constitutes an opposing set for a particular route in Theorem 3 of Supplementary Note 3. The topological requirements of opposing sets specified in Theorem 3, combined with the requirement for each opposer node to have a single independent regulator in a common circuit, define a rich class of network topologies associated with the opposition mechanism: a collection of opposer nodes distributed to a set of interlinked circuits, embedded into a feedback loop that is contiguous with the route being (fully) opposed. In this sense, a single opposer (one that has no disjoint circuits relative to a route it fully opposes, and that is embedded alone into a contiguous circuit) may be considered a trivial opposing set, a special case which vacuously satisfies the conditions of Theorem 3.

In Fig. 1a, b, we depict the class of network topologies corresponding to the mechanism of an opposing set, illustrating the case of a single opposer node, as well as a simple version of a two-node opposing set. We refer to these network topologies hereafter as Opposer Modules.

### Computational role of Opposer Nodes

We illustrate in Figs. 2a and 3 (with additional analysis in Supplementary Note 4) that opposer nodes calculate an integral of a tracking error—the difference between some network quantity (e.g., the activity of a particular node) and its steady-state value, the latter being determined purely by system parameters, rather than the magnitude of the system input.

For a single opposer node, the tracking error in question corresponds to the error in the activity of the single independent regulator (Fig. 2a). Since the opposer and its regulator participate in a common circuit, the computation of this integral constrains the regulator node to exhibit the RPA property. In an opposing set, on the other hand, each opposer in the collection computes the integral of a unique tracking error, involving various combinations of nodes in the master set (Fig. 3). All nodes featuring in these various tracking errors exhibit the RPA property due to the combined effect of the multiple integrals; the opposer nodes themselves, by contrast, never exhibit the RPA property (Supplementary Note 4).
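The single-opposer case can be sketched numerically. The two-node model below is a hypothetical illustration, not one of the networks analyzed in the Supplementary Notes: the kinetic forms, the rate constants `k_deg`, `k_act`, `k_off`, and all function names are assumptions. The opposer rate `p_o * (k_act * a - k_off)` is chosen so that its partial derivative with respect to `p_o` vanishes at steady state, and so that the logarithm of `p_o` integrates the tracking error of its regulator `a`.

```python
def rk4_step(rhs, state, dt, inp):
    """Advance `state` by one classical fourth-order Runge-Kutta step."""
    k1 = rhs(state, inp)
    k2 = rhs([s + 0.5 * dt * k for s, k in zip(state, k1)], inp)
    k3 = rhs([s + 0.5 * dt * k for s, k in zip(state, k2)], inp)
    k4 = rhs([s + dt * k for s, k in zip(state, k3)], inp)
    return [s + dt * (a + 2 * b + 2 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def opposer_rhs(state, inp, k_deg=1.0, k_act=1.0, k_off=1.0):
    """Hypothetical two-node feedback loop (assumed kinetics).

    a   : regulator node, which is also the input/output node here
    p_o : opposer node; d(ln p_o)/dt = k_act * a - k_off, so ln(p_o)
          accumulates the error between a and its set point k_off / k_act
    """
    a, p_o = state
    da = inp - k_deg * a * p_o          # input drives a, opposer removes it
    dp_o = p_o * (k_act * a - k_off)    # d(dp_o)/d(p_o) = 0 at steady state
    return [da, dp_o]


def simulate(rhs, state, inp, t_end=60.0, dt=0.01):
    """Integrate from `state` under a constant input; return the trajectory."""
    trajectory = [list(state)]
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(rhs, state, dt, inp)
        trajectory.append(list(state))
    return trajectory


# Settle at input 1.0, then apply a persistent step to input 2.0.
before = simulate(opposer_rhs, [0.5, 0.5], inp=1.0)
after = simulate(opposer_rhs, before[-1], inp=2.0)

a_before, p_before = before[-1]
a_after, p_after = after[-1]
peak_deviation = max(abs(a - 1.0) for a, _ in after)

print("regulator before/after step:", a_before, a_after)
print("opposer before/after step:  ", p_before, p_after)
print("largest transient deviation of regulator:", peak_deviation)
```

In this sketch the regulator responds transiently to the step and then returns to `k_off / k_act`, whereas the opposer settles at a new, input-dependent level, in line with the statement that opposer nodes themselves do not exhibit the RPA property.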

It is apparent that these more complex arrangements of opposer nodes are indeed employed in certain gene regulatory circuitries since we show in Supplementary Note 5 that the recently identified phenomenon of antithetical integral feedback^{3,14} is actually an instance of a two-node feedback opposing set with single input/output node. Some additional examples of opposing sets are depicted in Fig. 3, with a more general representation of opposing sets given in Fig. 4 (further details in Supplementary Note 3).

### M-sets and the creation of Balancer Modules

Although the opposition mechanism is orchestrated by feedback loops, along with the collections of opposer nodes embedded into them, the balancing mechanism is route-based. Indeed, we show that an M-set must contain at least two distinct routes (Theorem 4, Supplementary Note 3). A consequence of this requirement is that networks in which a single node acts as both input and output cannot invoke the balancing mechanism to achieve RPA, and must instead rely on opposition (Corollary 2, Supplementary Note 3).

Obtaining a solution to the LAQ places constraints on the reaction kinetics of all nodes within certain route segments, and any fully embedded feedback loops, for the particular routes contained in the proposed M-set. The interconnectivities among these constrained nodes delineate a specific sub-network topology that implements the balancing mechanism. We refer to this topological configuration hereafter as a Balancer Module, and present a general schematic for the module in Fig. 1c. As shown, a Balancer Module is characterized by a 'diverter node' (D-node) at the apex, and a 'connector node' (C-node) at the base of the module. Separating the D- and C-nodes are a collaborating set of balancer nodes, comprising all members of the route segments between the D-node and the C-node, for the M-set in question, along with any feedback loops fully embedded into those route segments. The requisite reaction kinetics for balancer nodes constrain their steady-state values to reside on a 'flat manifold' (Theorem 5, Supplementary Note 3)—that is, the steady-state for each balancer is a linear function of the steady-state of the D-node. Full details on the class of reaction forms that implement balancer kinetics are given in Supplementary Note 4; a key topological requirement for the implementation of balancer kinetics is for the set of collaborating balancer nodes to have a single independent regulator—namely, the D-node. The set of balancer nodes must also collaborate with one additional node—the connector node—whose special reaction kinetics (Supplementary Note 4) complete the balancing act for the M-set, allowing its terms to sum to zero for all *I*.

The balancing mechanism thus requires a computational collaboration between two distinct types of nodes: a collection of one or more balancer nodes, along with a single connector node. The reaction kinetics for these two node types implement a form of integral control that is distinct from the integral feedback control that characterizes the activity of opposer nodes. As we illustrate in Fig. 2b for a very simple Balancer Module, the computational function of the balancer nodes is to linearize their steady-state responses to the activity of the D-node. This upstream linearization creates the conditions that allow the connector node to compute a particular integral, one which constrains the connector to track a steady-state value that is independent of the D-node (and therefore, of the input to the network). In a Balancer Module, then, the connector node alone exhibits the RPA property; the D-node and all balancer nodes, by contrast, assume steady-state values that vary with network input.
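A minimal numerical sketch of such a module, assuming a three-node feedforward arrangement with hypothetical mass-action-style kinetics, is given below. The model, the rate constants `k1` to `k4`, and the function names are illustrative assumptions and are not taken from Fig. 2b. The balancer's steady state is linear in the D-node, `b* = (k1 / k2) * d*`, and the connector's steady state is `c* = k2 * k3 / (k1 * k4)`, which does not depend on the input.

```python
def rk4_step(rhs, state, dt, inp):
    """Advance `state` by one classical fourth-order Runge-Kutta step."""
    k1 = rhs(state, inp)
    k2 = rhs([s + 0.5 * dt * k for s, k in zip(state, k1)], inp)
    k3 = rhs([s + 0.5 * dt * k for s, k in zip(state, k2)], inp)
    k4 = rhs([s + dt * k for s, k in zip(state, k3)], inp)
    return [s + dt * (a + 2 * b + 2 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def balancer_rhs(state, inp, k1=2.0, k2=1.0, k3=3.0, k4=1.0):
    """Hypothetical three-node Balancer Module (assumed kinetics).

    d : D-node, follows the input
    b : balancer node, steady state linear in d
    c : connector node, activated by d and removed at a rate set by b
    """
    d, b, c = state
    dd = inp - d               # D-node steady state equals the input
    db = k1 * d - k2 * b       # b* = (k1 / k2) * d*
    dc = k3 * d - k4 * b * c   # c* = k3 * d* / (k4 * b*) = k2 * k3 / (k1 * k4)
    return [dd, db, dc]


def simulate(rhs, state, inp, t_end=60.0, dt=0.01):
    """Integrate from `state` under a constant input; return the trajectory."""
    trajectory = [list(state)]
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(rhs, state, dt, inp)
        trajectory.append(list(state))
    return trajectory


# Settle at input 1.0, then apply a persistent step to input 2.0.
before = simulate(balancer_rhs, [0.5, 0.5, 0.5], inp=1.0)
after = simulate(balancer_rhs, before[-1], inp=2.0)

d_before, b_before, c_before = before[-1]
d_after, b_after, c_after = after[-1]
connector_peak = max(abs(c - 1.5) for _, _, c in after)

print("D-node    before/after:", d_before, d_after)
print("balancer  before/after:", b_before, b_after)
print("connector before/after:", c_before, c_after)
print("largest transient deviation of connector:", connector_peak)
```

With these assumed kinetics, the two routes from the D-node to the connector (direct, and via the balancer) cancel at steady state, so only the connector returns to its pre-stimulus level while the D-node and the balancer shift with the input.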

### RPA basis sets and their corresponding basis modules

Armed with an understanding of the two basic RPA-generating mechanisms, we now turn our attention to the central concern of this work—the topological characterization of the set of all possible networks capable of exhibiting RPA.

We note at this point that the balancing mechanism described in the preceding section automatically balances all copies of all routes in **R** that contain the noted route segments between the D- and C- nodes, regardless of whether those terms were selected in the original M-set (see Supplementary Note 5). The union of the M-set with all such terms thus represents the independently adapting subset of **R** associated to the M-set. The Balancer Module depicted in Fig. 1c thus represents the class of network topologies that correspond to an independently adapting (balancing) subset of an RPA equation.

Moreover, any route in a network that is only partially opposed must have copies that are balanced. Since a balancing mechanism will automatically balance all copies of its routes, such a partially opposed route is redundantly opposed. As such, the independently adapting subset associated to any opposition mechanism (via opposing sets) comprises the union of only those terms in **R** whose routes are fully opposed by the mechanism. The union of S-sets generated by partial opposition of a particular route should thus be absorbed into the independently adapting subset associated with the relevant balancing mechanism.

From the conditions of Theorems 2 and 3, then, the independently adapting subset associated with an opposition mechanism contains all copies of all routes that are disjoint from the opposing set, while contiguous with a circuit into which the opposing set is embedded—that is, all routes fully opposed by the opposition mechanism in question. The Opposer Module presented in Fig. 1a,b then represents the class of network topologies that correspond to an independently adapting (or opposing) subset of an RPA equation.

Now, the hallmark of an RPA-capable network is the existence of a partition of its RPA equation into independently adapting subsets. (This general description includes the possibility of the trivial partition into a single subset comprising all of **R**). In addition, from the observation that the terms of **R** are distributed to independently adapting subsets by route (that is, all instances in **R** of a particular route are to be grouped together into a single such subset), it follows that these subsets are disjoint, and must cover **R**. We have seen, moreover, that two and only two mechanisms—which we call opposition and balancing—are able to generate the independently adapting subsets of **R** in an RPA-capable network, and that each such mechanism may be implemented by a rich class of sub-network topologies—Opposer Modules and Balancer Modules, respectively. Taken together, these considerations imply that a network can exhibit RPA only if it is decomposable into Opposer and/or Balancer Modules—that is, each route for the transmission of biochemical signal from input to output must be either balanced or (fully) opposed by a single network module.

A general RPA network could contain an arbitrary number of such modules—corresponding to its RPA equation being partitioned into (the same) arbitrary number of disjoint independently adapting subsets—so the question now remains as to how multiple such network modules may coexist (i.e., be connected together) in RPA networks.

### Interconnections of basis modules and larger RPA networks

Now that we have a clear picture of the topological basis modules from which any RPA-capable network must be constructed, a set of rules governing how the modules may be combined (interconnected) defines the full solution space of possible RPA-capable network topologies.

The nature of the two distinct RPA-generating mechanisms, and their topological realizations in self-organizing/self-regulating networks, does place some constraints on how RPA modules may be interconnected to form more complex multimodular networks. These constraints are two-fold: first, we note that the three types of reaction kinetics required to implement RPA—opposer kinetics, balancer kinetics and connector kinetics—are mutually exclusive (Supplementary Note 4). That is, any given node can exhibit at most one of the three types of reaction kinetics. Second, any given computational node (opposer, balancer or connector) is constrained in how it may be regulated: an opposer node, or a collection of collaborating balancer nodes, each has a single independent regulator; and a connector node works with a single collection of collaborating balancer nodes.

From this, we can conclude that the active part of each module (that is, the nodes residing between the apex (node C in Fig. 1a, b and node D in Fig. 1c) and the base (node D in Fig. 1a, b and node C in Fig. 1c)) must be distinct from the active part of any other module. A node that plays the role of an opposer in one module, for instance, cannot also be required to operate as a balancer (or a connector) for some other module. Moreover, the requirement for a single independent regulator implies that an opposer node can only perform its computational function for a single Opposer Module. Likewise, a set of collaborating balancer nodes, together with their connector node, delineates a single Balancer Module.

The requirement for distinctness of the active parts of RPA modules implies that the modules may either be connected 'in parallel', or 'in series' according to the definitions given in our Methods section.

### Live and blind regulations from intramodular nodes

In order to make the series connection of RPA modules precise from a topological perspective, we first recall from the preceding sections that within the active part of each module, some nodes exhibit the RPA property, while others do not. Opposer nodes, along with any associated dependent nodes, do not exhibit the RPA property. The single independent regulator for an opposer, along with any associated dependent nodes, does exhibit the RPA property. Likewise, balancer nodes, along with their single independent regulator (the associated D-node), do not exhibit the RPA property, while connector nodes do exhibit the RPA property. From these considerations, we can classify any outgoing regulations from the active parts of an RPA module (leading ultimately to the network’s output node) as either 'blind regulations', if they come from node(s) that exhibit the RPA property, or 'live regulations', if they come from node(s) that do not exhibit the RPA property.

We illustrate the essential principles of series interconnections of modules, which are required in any RPA network containing a module with live outgoing regulations, in Fig. 5. As shown in Fig. 5a, b, outgoing regulations from an opposer node (or associated dependent nodes) place that opposer in a route which must then be either balanced or (fully) opposed by some other RPA module (as indicated by the symbol A which indicates the position of the required ancillary module). Likewise, in a Balancer Module, outgoing regulations from balancer nodes place these nodes in routes that are not balanced by the module; these routes must be either balanced or (fully) opposed by some other ancillary module, as indicated by the symbol A in Fig. 5c. In either case, any outgoing blind regulations generate no requirements for any ancillary modules. Thus, any module with only blind outgoing regulations may exist alone in an RPA network, and any sub-network structures downstream of such regulations may be considered part of the module itself. In addition, blind outgoing regulations may feed into any other RPA module(s) without affecting the ability of those modules to contribute to RPA in the network as a whole.

Figures 6 and 7 present two different illustrative examples of RPA networks that are decomposable into two different RPA basis modules connected in series. In Fig. 6, the upstream module is an Opposer Module, whose single opposer node also contributes to two routes in the network (owing to the presence of live outgoing regulations from the opposer). These two routes are then balanced by the downstream Balancer Module. Figure 7 presents a network whose topological structure admits two different possible decompositions into RPA basis modules, depending on the choice of reaction kinetics at nodes 6 and 7. If the reaction kinetics at node 6 conform to opposer kinetics, for instance, this creates an Opposer Module where the single opposer node also participates in a collection of network routes; in this case, nodes 9 and 10 must be able to exhibit balancer kinetics, and node 11 connector kinetics, in order for the network as a whole to exhibit RPA through the creation of a downstream Balancer Module (Solution 1). If node 7 were to operate with opposer kinetics, on the other hand, this would be sufficient to create a single Opposer Module from the entire network (Solution 2). Detailed analyses of these examples are given in Supplementary Note 5.

## Discussion

In the postgenomic era, as we continue to amass ever larger quantities of data on the vast and complex networks of molecular interactions within living systems, a tantalizing question continues to be raised: could complex biological systems be constructible from just a limited set of simple design principles^{20,21,22,23,24,25,26}? Here we show conclusively that, for RPA-capable networks at least, the answer is an unequivocal yes.

The centerpiece of the present work is the identification of two rich yet well-defined classes of network topologies which, together, span the space of all possible RPA-capable networks when suitably interconnected. These two classes of network module thus represent a topological basis for the solution to the RPA problem in any network, no matter how large or interconnected. In this sense, the topological basis modules are like the atoms of robust adaptation.

These findings represent a significant advance in our understanding of the basic structures underlying the complex and evolving networks occurring in nature. In many biological contexts (cellular signal transduction and cellular metabolism, for instance) the underlying signaling networks are so complex and high-dimensional, so prone to change over time, and so extraordinarily variable from one realization to another (even from one cell to another phenotypically identical neighboring cell), that the networks themselves are virtually impossible to define concretely at any useful level of detail. Although most investigators view this variability as a source of intractable complexity, particularly in our current age of Big Data, our work reveals that these networks may now be considered from the point of view of their unexpected simplicity, that is, as decompositions into well-defined basis modules.

It is interesting to consider these findings in the light of established results in control theory, which have determined that asymptotic tracking problems (such as RPA) require integral control as a structural property of the system^{21,22,23}. Our work shows that there are, in fact, two distinct types of integral control involved in the solution to the general RPA problem, corresponding to each of the two classes of RPA basis module: one type of integral is computed within feedback structures, employing specialized computational nodes (opposing sets) within collections of interlinked circuits; the other type is computed by a collaboration between two different types of computational nodes (balancers and connectors) embedded into parallel pathways (routes). Beyond this, we offer the novel insight that sufficiently large networks may solve the RPA problem via the arbitrary combination of the two topological basis modules, thereby distributing integrals of the two possible types throughout their vast assemblies of interacting nodes.

We summarize a wide selection of previously reported RPA examples^{2,3,5,6,7,8,9,14} in Supplementary Note 7, and show that all such previous solutions are special cases of a single RPA basis module. All previous work to our knowledge considers networks that are so small and simple in construction that their RPA equations only admit the trivial partition into either a single S-set (generated by a single opposer node), or a single M-set (generated by a single balancer node and its collaborating connector). Indeed, all previously identified instances of integral feedback in simple signaling motifs^{2,3,5,8,16} are special cases of an Opposer Module using a single opposer node (itself a special case of an opposing set, Fig. 1a, b). The buffer nodes identified by Ma et al.^{2} are special cases of opposer nodes. Previously identified examples of the incoherent feedforward motif^{3,27,28,29,30} are special cases of the Balancer Module. The proportioner nodes described by Ma et al.^{2} are special cases of balancer nodes.

We emphasize again that combinations of these two distinct mechanisms within a single RPA network have never before been proposed, presumably since the requisite network sizes are beyond the reach of blind computational screening methods.

In Fig. 8, we consider the smallest RPA networks that are capable of invoking the various novel topological features we identify in the present work, illustrating the significant increases in the sizes of the computational screening problems that would be required to identify these topologies. For a network to employ both an Opposer Module and a Balancer Module working together in collaboration, for instance, a minimum of five nodes would be required (Fig. 8a, b). Likewise, for a network to feature an opposer node that is also involved in a route (Fig. 8a, c), at least five nodes are needed. For a network with distinct input/output nodes to incorporate a (non-trivial) opposing set (Fig. 8d), five nodes are, once again, the minimum requirement. If one or both of these opposer nodes are also in a route (Fig. 8e, f), the smallest such RPA networks contain seven nodes. Additional analysis of these small RPA topologies is presented in Supplementary Note 5.

Our general topological view of RPA networks highlights the role of antagonizing compensatory mechanisms—opposition and balancing—and the modular network structures induced by those mechanisms, in the robust regulation of signaling networks. This deep connection between compensation and modularity suggests that a modular design may characterize a wider class of robust networks beyond RPA-capable networks. Indeed, it raises the question as to whether all biochemical networks, of any size, with a fundamental need to exhibit robust functionalities, are characterized by modular architectures involving just a small number of topological basis modules.

Several lines of evidence already support the generalization of our network modularization to other bio-signaling contexts. The work of Averbukh et al.^{31} and Ben-Zvi et al.^{32}, for example, points to a limited set of modules in the context of spatial signaling problems in embryonic development. Those authors undertook computational searches of small reaction-diffusion networks to identify configurations that could produce robust and scalable morphogen gradients^{13,31,32}. Very few networks were capable of generating the requisite robust patterning, and those that did were of a very specific type.

This paradigm shift suggests a resolution to a baffling paradox in living systems—that while networks of interacting molecules are often unimaginably complex, a property that is generally associated with fragility^{1,20}, such networks are nevertheless characterized by a remarkable robustness^{1,17}. Indeed, robust traits are selected by evolution, and robustness facilitates evolvability in the face of changing and unfavorable environments. Our work provides strong evidence that a small number of universal modular designs offers a simplifying framework for the construction of complex bionetworks, giving them the capacity to scale to any size or complexity, without impairing their ability to adapt robustly, and without the requirement for any tuning of network parameters.

## Methods

### Methods overview and relationship to the IMP

In the 1970s, Francis and Wonham^{21,22} investigated the necessary controller structures required to achieve robust regulation with internal stability, and established what is now referred to as the IMP. By this principle, a controller can reject exogenous disturbances and/or track prescribed reference signals by incorporating within itself a model of the dynamic structure of the disturbances/references. More recently, Yi et al.^{23} considered a special case of the IMP concerning RPA to constant exogenous inputs, in the context of providing a framework for understanding the extraordinary precision of adaptation in bacterial chemotaxis^{5}. This analysis provided a purely algebraic condition that must be satisfied by an RPA-capable system, which has been shown to be equivalent to the requirement for integral control^{23}.

Here, our interest is not in confirming whether a particular network topology is capable of RPA, but in specifying all the possible network topologies—i.e., all the possible arrangements of nodes that are capable of exhibiting RPA, along with any constraints on the reaction kinetics for those nodes. For this, we begin by developing an alternative version of the algebraic condition specified by Yi et al.^{23}, for the special case in which a particular input/output node pair is specified, thereby constructing a framework from which topological structures may be deduced relative to that input/output node pair.

The generality of our method for studying RPA in biological networks, which are self-organizing, self-regulating, complex, and evolvable, rests on precise definitions of all the key terms of the problem, which we provide in detail in the Supplementary Information (SI). Briefly, we consider a node to be any entity that can encode and transmit a biochemical signal. Most commonly a node represents a molecule (its concentration, say, or the concentration of a particular activation state), a complex of molecules, or even a mathematical function of multiple biomolecular entities (see Supplementary Note 1). An input node is the recipient of some outside stimulus, *I*, while the end-point of interest (not necessarily distinct from the input node) is assigned the role of the output node. From this, a network may be defined as the set of all nodes that are 'connected' and 'transmissive' relative to a chosen input/output node pair (see Supplementary Note 1 for detailed explanations of these terms). RPA is said to occur when the output node always returns to the same steady-state level, regardless of the magnitude of the stimulus delivered to the input node, with no requirement for special (fine-tuned) parameter choices.
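As an informal illustration of this definition, the sketch below integrates a hypothetical three-node linear network (input *P*_{1}, output *P*_{2}, and a third node *P*_{3} that integrates the deviation of the output from a set point and inhibits the input node). This network, its rate constants, the set point, and the function name `settled_outputs` are all assumptions made for illustration and are not taken from this work. A fixed-step forward Euler scheme is used only to keep the sketch dependency-free, and the linear toy model does not constrain the variables to remain positive.

```python
def settled_outputs(stimulus_levels, leak=0.0, set_point=2.0,
                    t_phase=100.0, dt=0.01):
    """Return the output level reached at the end of each constant-stimulus phase.

    Hypothetical toy model (all constants are illustrative assumptions):
        dP1/dt = I - a*P1 - b*P3              (input node, inhibited by P3)
        dP2/dt = c*P1 - d*P2                  (output node)
        dP3/dt = e*(P2 - set_point) - leak*P3
    With leak == 0, P3 acts as a pure integrator of the output error.
    """
    a, b, c, d, e = 1.0, 1.0, 1.0, 1.0, 0.5
    p1 = p2 = p3 = 0.0
    steps = int(round(t_phase / dt))
    settled = []
    for stimulus in stimulus_levels:
        # Hold the stimulus constant for one phase and step with forward Euler.
        for _ in range(steps):
            f1 = stimulus - a * p1 - b * p3
            f2 = c * p1 - d * p2
            f3 = e * (p2 - set_point) - leak * p3
            p1, p2, p3 = p1 + dt * f1, p2 + dt * f2, p3 + dt * f3
        settled.append(p2)
    return settled


# Output after a stimulus of 3, then after a persistent step up to 6.
print(settled_outputs([3.0, 6.0]))            # pure integrator
print(settled_outputs([3.0, 6.0], leak=0.2))  # integrator with degradation
```

With `leak=0.0` the output settles at the set point for both stimulus levels, whereas a non-zero `leak` makes the settled output depend on the stimulus, so the toy network adapts only imperfectly.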

Based solely on these general characteristics and definitions, for a network containing *n* nodes *P*_{1},…, *P*_{n}, each with a reaction rate, *f*_{1},…, *f*_{n}, respectively, we show in Supplementary Note 2 that RPA occurs only when

\[\det \left( M_{IO} \right) = 0 \qquad (1)\]

and

\[\det \left( J_n \right) \ne 0, \qquad (2)\]

where \(J_n = \frac{\partial \left( f_1, \ldots , f_n \right)}{\partial \left( P_1, \ldots , P_n \right)}\) is the *n* × *n* Jacobian for the system \(\underline{\mathrm{f}} = \left[ f_1, \ldots , f_n \right]^T\), and *M*_{IO} is the (*n*−1) × (*n*−1) input–output minor of *J*_{n}, that is, the matrix obtained by removing the input row and the output column from *J*_{n}, and where all matrix entries (i.e., partial derivatives) are evaluated at the network’s steady-state, \(\underline{\boldsymbol{\pi}}_n = \left[ P_1^\ast , \ldots , P_n^\ast \right]^T\) (see Supplementary Note 2 for detailed derivations and supporting discussion).

We refer to Eq. 1 as the RPA equation. In contrast to previous work^{22}, we do not consider this equation as simply an algebraic test for RPA; rather, we view the determinant expansion of the RPA equation from a topological perspective—that is, as a set of signed terms, together with collections of its subsets that may play independent roles in its solution. Noting that each term in the expansion contains topological information on the underlying network in terms of routes from input to output, feedback loops, and reaction kinetics, we thereby uncover general principles as to how network sub-structures are able to work together in collaboration to generate RPA in arbitrarily large and complex networks.
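To make the two conditions concrete, the following sketch evaluates them for a hypothetical three-node network, reading Eq. 1 as det(*M*_{IO}) = 0 and Eq. 2 as det(*J*_{n}) ≠ 0. The example Jacobians, their integer entries, and the helper names `det`, `io_minor` and `meets_rpa_conditions` are assumptions chosen for illustration; they are not taken from the Supplementary Information. Integer entries are used so that the comparison with zero is exact.

```python
def det(matrix):
    """Determinant by cofactor expansion along the first row (adequate for small n)."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0
    for j, entry in enumerate(matrix[0]):
        sub = [row[:j] + row[j + 1:] for row in matrix[1:]]
        total += (-1) ** j * entry * det(sub)
    return total


def io_minor(jacobian, input_row, output_col):
    """Delete the input node's row and the output node's column from the Jacobian."""
    return [
        [value for c, value in enumerate(row) if c != output_col]
        for r, row in enumerate(jacobian)
        if r != input_row
    ]


def meets_rpa_conditions(jacobian, input_row, output_col):
    """True when det(M_IO) == 0 and det(J_n) != 0."""
    return (det(io_minor(jacobian, input_row, output_col)) == 0
            and det(jacobian) != 0)


# Hypothetical network: P1 (input) -> P2 (output) -> P3 -| P1, with
#   f1 = I - 2*P1 - 3*P3,   f2 = 5*P1 - 7*P2,   f3 = 11*(P2 - s)
J_INTEGRAL = [[-2, 0, -3],
              [5, -7, 0],
              [0, 11, 0]]

# Same wiring, but P3 also degrades: f3 = 11*(P2 - s) - 13*P3
J_LEAKY = [[-2, 0, -3],
           [5, -7, 0],
           [0, 11, -13]]

print(meets_rpa_conditions(J_INTEGRAL, input_row=0, output_col=1))
print(meets_rpa_conditions(J_LEAKY, input_row=0, output_col=1))
```

In the first Jacobian the rate of the third node does not depend on its own level, so the single surviving term of the minor's determinant vanishes; adding the degradation term in the second Jacobian leaves a non-zero term and the first condition fails.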

It is clear that the RPA equation is a potentially huge equation in general, comprising some subset of the (*n*−1)! terms corresponding to a fully-connected network of *n* nodes. A 10-node network, for instance, could have as many as 362,880 terms in its RPA equation. Adding just five more nodes to give a 15-node network results in an equation of up to 8.7 × 10^{10} terms. Doubling this network size to a 30-node network produces an RPA equation of up to 8.8 × 10^{30} terms (see Supplementary Table 1). In any event, an arbitrary network of *n* nodes will be able to exhibit RPA only if the *τ* ≤ (*n*−1)! terms of its RPA equation can sum to zero for all *I* without violating Eq. 2.
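The term counts quoted above follow directly from the (*n*−1)! upper bound and can be reproduced with a few lines of Python. The function name `max_rpa_terms` is a hypothetical label introduced here for illustration only.

```python
import math


def max_rpa_terms(n_nodes):
    """Upper bound (n - 1)! on the number of terms in the RPA equation."""
    if n_nodes < 1:
        raise ValueError("a network needs at least one node")
    return math.factorial(n_nodes - 1)


for n in (10, 15, 30):
    print(n, f"{max_rpa_terms(n):.2e}")
```

The bound is attained only by a fully-connected network; a sparser network has fewer non-zero terms.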

We note that an alternative, but mathematically equivalent, version of the RPA equation has also been developed in the recent work of Tang and McMillen^{33}. Those authors refer to the condition as 'the cofactor condition', and apply this approach to the issue of designing novel homeostatic systems. In particular, their design algorithm has been used to generate topologies and parameter constraints that 'will support homeostatic behavior for a given set of network components and a desired set of general regulatory constraints to be applied between them'^{33}.

### Deducing general mechanisms of RPA from the RPA equation

We provide full details of our method for solving the RPA equation in complete generality in our Supplementary Information. As noted in the preceding section, Supplementary Notes 1 and 2 provide a complete set of precise definitions corresponding to our problem, and present detailed derivations of the RPA equation and the RPA constraint, along with mathematical forms for a set of axioms for the reaction kinetics at individual network nodes.

Supplementary Note 3 provides full details on our topological approach to the solution of the RPA problem, identifying conditions whereby the RPA equation may be partitioned into independently adapting subsets. To this end, we begin with the notion of a minimally adaptive subset of the RPA equation, of which there are two basic types, S-sets and M-sets, each with their own type of LAQ. We explore how these mathematical conditions imply novel topological structures within RPA networks: Opposer Modules (employing the novel concept of opposing sets) and Balancer Modules.

The constraints on reaction kinetics that are imposed by the creation of S-sets and M-sets from the RPA equation are presented in detail in Supplementary Note 4. Here we also explore a range of parameter constraints that would allow the requisite reaction kinetics to be implemented in RPA networks, and also consider how these reactions implement some form of integral control.

Supplementary Note 5 considers the central matter of this work, namely the relationship between the two fundamental types of RPA Module (Opposer and Balancer) and a topological basis for RPA-capable networks. We consider how these topological basis modules may be interconnected to form larger (multimodular) RPA-capable networks.

To aid in the general delineation of all RPA-capable network topologies through the interconnections of Opposer and/or Balancer Modules we distinguish between the two possible relationships between interconnected modules, as we outline in the next section.

### Modules connected in series or in parallel

Two RPA modules are said to be connected in parallel if none of the computational nodes within either module participate in route(s) that are opposed/balanced by the other. The respective route collections for the two modules must, therefore, diverge upstream of the active parts of the modules, and then reconnect downstream of the active parts. Informally speaking, parallel modules are connected side-by-side within the global topology of the RPA network. When an Opposer Module is connected in parallel with all other RPA modules that comprise the network, for instance, its opposer node(s) do not participate in any route of the network; they participate in feedback loops only. This is a comparatively straightforward intermodular arrangement, for which we present an example in Supplementary Note 5 for two Opposer Modules connected in parallel (see Supplementary Figure 15).

A parallel arrangement of modules may be contrasted with the possibility that in some particular RPA module, one (or more) of its computational nodes may also participate in some network route that is not opposed or balanced by the module in question. For example, an opposer node—which operates within a feedback arrangement relative to the route(s) it opposes—may also participate in some route within the network. Likewise, a balancer node—embedded into the route segments defining its Balancer Module—may also participate in some other route in the network (that is, a route that is not balanced by the Balancer Module in question). In either case, the extramodular route(s) in which the computational node(s) participate must be either balanced or fully opposed by one (or more) additional RPA module(s) connected in series with the original module. Informally speaking, series modules are connected in an upstream-downstream arrangement, since computational nodes for the upstream module feed into the downstream module.
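The distinction between a computational node that lies only in feedback loops and one that also participates in a route can be checked mechanically on a wiring diagram. The sketch below treats a route as a simple directed path from the input node to the output node, which is an assumed reading of the term (the precise definition is given in the Supplementary Information). The edge lists, node labels, and function names are hypothetical.

```python
def routes(edges, source, target):
    """Enumerate all simple directed paths from source to target."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, []).append(v)

    found = []

    def walk(node, path):
        if node == target:
            found.append(path)
            return
        for nxt in graph.get(node, []):
            if nxt not in path:          # keep paths simple (no repeated nodes)
                walk(nxt, path + [nxt])

    walk(source, [source])
    return found


def on_some_route(edges, source, target, node):
    """True if the node lies on at least one input-to-output route."""
    return any(node in path for path in routes(edges, source, target))


# Hypothetical wiring: A is the input, B the output, C a candidate opposer node.
feedback_only = [("A", "B"), ("B", "C"), ("C", "A")]
also_in_route = feedback_only + [("A", "C"), ("C", "B")]

print(on_some_route(feedback_only, "A", "B", "C"))
print(on_some_route(also_in_route, "A", "B", "C"))
```

In the first wiring, C sits only in the feedback loop around the single route from A to B. In the second, C also lies on the route from A through C to B, so under the description above that extra route would itself need to be balanced or opposed by a further module connected in series.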

### Additional notes on Methods

We conclude the detailed presentation of our methods in the Supplementary Information with a brief consideration of how all previously reported instances of RPA in the literature, to our knowledge, are special cases of the general solution we present in this article (Supplementary Notes 6 and 7).

For completeness, we also observe that although the topological structures we identify here are necessary conditions for solving the RPA problem in complete generality, these conditions are not sufficient by themselves to guarantee the implementation of RPA across all possible parameter regimes. In practice, RPA also requires global stability to ensure that there is a unique and stable steady-state regardless of initial conditions. We discuss stability issues briefly in Supplementary Note 8, where we point out that feedback loops, if present at all, should be negative-feedback loops since these are stability promoting. We nevertheless acknowledge that negative feedback could potentially induce oscillations or even chaotic behavior. More direct dynamical systems approaches are required to examine these possibilities for specific RPA topologies and specific parameter regimes.

### Code availability

All computational simulations presented for illustrative purposes in this work were performed with MATLAB’s inbuilt ODE solver, `ode45`. All equations and parameters supplied to this solver are available in the Supplementary Information (see Supplementary Note 5).

### Data availability

All data used in the present research are available upon request from the authors.

## Additional information

**Publisher's note:** Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## References

1. Wagner, A. *Robustness and Evolvability in Living Systems* (Princeton University Press, New Jersey, 2005).
2. Ma, W., Trusina, A., El-Samad, H., Lim, W. A. & Tang, C. Defining network topologies that can achieve biochemical adaptation. *Cell* **138**, 760–773 (2009).
3. Ferrell, J. E. Perfect and near-perfect adaption in cell signaling. *Cell Syst.* **2**, 62–7 (2016).
4. Hoeller, O., Gong, D. & Weiner, O. D. How to understand and outwit adaptation. *Dev. Cell* **28**, 607–616 (2014).
5. Barkai, N. & Leibler, S. Robustness in simple biochemical networks. *Nature* **387**, 913–917 (1997).
6. Alon, U., Surette, M., Barkai, N. & Leibler, S. Robustness in bacterial chemotaxis. *Nature* **397**, 168–171 (1999).
7. Hart, Y. et al. Robust control of nitrogen assimilation by a bifunctional enzyme in E. coli. *Mol. Cell* **41**, 117–127 (2011).
8. Muzzey, D., Gomez-Uribe, C. A., Mettetal, J. T. & van Oudenaarden, A. A systems-level analysis of perfect adaptation in yeast osmoregulation. *Cell* **138**, 160–171 (2009).
9. El-Samad, H., Goff, J. P. & Khammash, M. Calcium homeostasis and parturient hypocalcemia: an integral feedback perspective. *J. Theor. Biol.* **214**, 17–29 (2002).
10. Kaupp, U. B. Olfactory signalling in vertebrates and insects: differences and commonalities. *Nat. Rev. Neurosci.* **11**, 188–200 (2010).
11. Yau, K. W. & Hardie, R. C. Phototransduction motifs and variations. *Cell* **139**, 246–264 (2009).
12. Ben-Zvi, D. & Barkai, N. Scaling of morphogen gradients by an expansion-repression integral feedback control. *Proc. Natl Acad. Sci. USA* **107**, 6924–6929 (2010).
13. Eldar, A. et al. Robustness of the BMP morphogen gradient in *Drosophila* embryonic patterning. *Nature* **419**, 304–308 (2002).
14. Briat, C., Gupta, A. & Khammash, M. Antithetic integral feedback ensures robust perfect adaptation in noisy biomolecular networks. *Cell Syst.* **2**, 15–26 (2016).
15. Bachmann, J. et al. Division of labor by dual feedback regulators controls JAK2/STAT5 signaling over broad ligand range. *Mol. Syst. Biol.* **7**, 516 (2011).
16. Shin, S. Y. et al. Positive- and negative-feedback regulations coordinate the dynamic behavior of the Ras-Raf-MEK-ERK signal transduction pathway. *J. Cell. Sci.* **122**, 425–435 (2009).
17. Whitacre, J. M. Biological robustness: paradigms, mechanisms, and systems principles. *Front. Genet.* **11**, 1–15 (2012).
18. Lim, W. A., Lee, C. M. & Tang, C. Design principles of regulatory networks: searching for the molecular algorithms of the cell. *Mol. Cell* **49**, 202–212 (2013).
19. Tyson, J., Chen, K. C. & Novak, B. Sniffers, buzzers, toggles, and blinkers: dynamics of regulatory and signaling pathways in the cell. *Curr. Opin. Cell. Biol.* **15**, 221–224 (2003).
20. Quinton-Tulloch, M. J., Bruggeman, F. J., Snoep, J. L. & Westerhoff, H. V. Trade-off of dynamic fragility but not of robustness in metabolic pathways in silico. *FEBS J.* **280**, 160–173 (2013).
21. Francis, B. A. & Wonham, W. M. The internal model principle for linear multivariable regulators. *Appl. Math. Optim.* **2**, 170–194 (1975).
22. Francis, B. A. & Wonham, W. M. The internal model principle of control theory. *Automatica* **12**, 457–465 (1976).
23. Yi, T. M., Huang, Y., Simon, M. I. & Doyle, J. Robust perfect adaptation in bacterial chemotaxis through integral feedback control. *Proc. Natl Acad. Sci.* **97**, 4649–4653 (2000).
24. Araujo, R. P., Liotta, L. A. & Petricoin, E. F. Proteins, drug targets and the mechanisms they control: the simple truth about complex networks. *Nat. Rev. Drug Discov.* **6**, 871–880 (2007).
25. Araujo, R. P. & Liotta, L. A. A control theoretic paradigm for cell signaling networks: a simple complexity for a sensitive robustness. *Curr. Opin. Chem. Biol.* **10**, 81–87 (2006).
26. Araujo, R. P., Petricoin, E. F. & Liotta, L. A. A mathematical model of combination therapy using the EGFR signaling network. *Biosystems* **80**, 57–69 (2007).
27. Goentoro, L., Shoval, O., Kirschner, M. W. & Alon, U. The incoherent feedforward loop can provide fold-change detection in gene regulation. *Mol. Cell* **36**, 894–899 (2009).
28. Heidary, Z., Ghaisari, J., Moein, S., Naderi, M. & Gheisari, Y. Stochastic petri net modeling of hypoxia pathway predicts a novel Incoherent feed-forward loop controlling SDF-1 expression in acute kidney injury. *IEEE Trans. Nanobioscience* **15**, 19–26 (2016).
29. Sasagawa, S., Ozaki, Y., Fujita, K. & Kuroda, S. Prediction and validation of the distinct dynamics of transient and sustained ERK activation. *Nat. Cell. Biol.* **7**, 365–373 (2005).
30. Mangan, S. & Alon, U. Structure and function of the feed-forward loop network motif. *Proc. Natl Acad. Sci. USA* **100**, 11980–11985 (2013).
31. Averbukh, I., Ben-Zvi, D., Mishra, S. & Barkai, N. Scaling morphogen gradients during tissue growth by a cell division rule. *Development* **141**, 2150–2156 (2014).
32. Ben-Zvi, D., Shilo, B. Z., Fainsod, A. & Barkai, N. Scaling of the BMP activation gradient in Xenopus embryos. *Nature* **453**, 1205–1211 (2008).
33. Tang, Z. F. & McMillen, D. R. Design principles for the analysis and construction of robustly homeostatic biological networks. *J. Theor. Biol.* **408**, 274–289 (2016).

## Acknowledgements

This study was partially supported by NIH grants R33CA206937 and R01AR068436.

## Author information

### Affiliations

#### School of Mathematical Sciences, Queensland University of Technology, Brisbane, QLD, 4000, Australia

- Robyn P. Araujo

#### Institute of Health and Biomedical Innovation (IHBI), 60 Musk Avenue, Kelvin Grove, Brisbane, QLD, 4059, Australia

- Robyn P. Araujo

#### Center for Applied Proteomics and Molecular Medicine, George Mason University, 10920 George Mason Circle, Manassas, Virginia, 20110, USA

- Lance A. Liotta

### Contributions

R.P.A. conceived of the analytical methodology, and performed all derivations, proofs and computational simulations. R.P.A. and L.A.L. wrote the paper.

### Competing interests

The authors declare no competing interests.

### Corresponding author

Correspondence to Robyn P. Araujo.

## Rights and permissions

**Open Access** This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
