Introduction

Controlling complex networked systems is a fundamental challenge in the natural sciences, the social sciences and engineering. A networked system is controllable if it can be driven from any initial state to any desired accessible state1,2 by applying external signals to a few suitably selected nodes, which are called input nodes3,4,5,6. Existing work3 provides an efficient method, based on maximum matching, for finding a Minimum Input node Set (abbreviated MIS) that fully controls a network.

However, these works have focused primarily on analyzing a single MIS4,5,6,7,8, while the underlying control relationships between nodes and MISs remain elusive. Owing to the structural complexity of a network, its MISs are typically not unique, and the number of MISs is exponential in the size of the network9,10. Enumerating all possible MISs is a #P problem11 and therefore computationally expensive. A few works have analyzed the node types in control12,13 and the control capacities10 of input nodes. Moreover, although any MIS is capable of fully controlling the network, different MISs may be composed of nodes with different topological properties, such as high-degree nodes14. Physical constraints and limitations15 may also affect the choice of a suitable MIS. For example, when controlling an inter-bank market16,17, one may need certain specific input nodes to ensure that an MIS can be manipulated by a given organization; when controlling a protein interaction network18, some proteins cannot be used as input nodes because of technical limitations.

Given the existence of numerous MISs in a network, a node can be classified based on its participation in MISs12: 1. possible input node, which appears in at least one MIS; 2. redundant node, which never appears in any MIS. Previous work12 found that dense networks exhibit a surprising bifurcation phenomenon, in which the majority of nodes are either redundant nodes or possible input nodes. However, the origin of the bifurcation phenomenon and a method for altering the type of nodes are still unknown.

Despite the many approaches to the controllability analysis of complex networks, the following critical questions remain unanswered: (i) what is the relationship among the many available MISs of a network? (ii) what topological structure determines whether a node is a possible input node? (iii) how can a suitable MIS containing the desired nodes be designed?

Here, we present the input graph, a simple geometry that is nevertheless capable of revealing the complex correlation of all MISs and nodes in control. The input graph is constructed by replacing the original edges with new edges that reflect the control correlations of nodes. We prove that a node adjacent to an input node in the input graph will appear in an MIS, and that the nodes of the same connected component in the input graph have the same control type, so they either all participate in control or none of them does. Therefore, the emergence of a giant connected component in the input graph provides a clear topological explanation of the bifurcation phenomenon12 in dense networks, and the complex control correlation of the nodes of the original network can be reduced to a few simple connected components of the input graph. Furthermore, we present an efficient method to precisely manipulate the control type of any node based on its connectivity in the input graph. We believe that the input graph is important because it (i) presents a framework that reveals the inherent correlation of MISs and nodes in control and (ii) enables the design and manipulation of a suitable MIS of a network under constraints. Ultimately, this will promote the application of network control in real networked systems.

Results

Control adjacency and input graph

The dynamics of a linear time-invariant network G(V, E) is described by:

$$\frac{d\mathbf{x}(t)}{dt} = A\,\mathbf{x}(t) + B\,\mathbf{u}(t)$$

where the state vector x(t) = (x1(t), …, xN(t))^T denotes the values of the N nodes in the network at time t, A is the transpose of the adjacency matrix of the network, B is the input matrix that defines how control signals are applied to the network, and u(t) = (u1(t), …, uH(t))^T represents the H input signals at time t.
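
As an illustrative sketch that is not part of the original analysis, consider a three-node network with edges c → a and c → b (the topology we assume for Fig. 1A), with hypothetical nonzero weights α and β and states ordered as (x_a, x_b, x_c). Driving nodes a and c with independent signals gives

$$A = \begin{pmatrix} 0 & 0 & \alpha \\ 0 & 0 & \beta \\ 0 & 0 & 0 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & 0 \\ 0 & 0 \\ 0 & 1 \end{pmatrix}, \qquad [\,B,\; AB,\; A^2B\,] = \begin{pmatrix} 1 & 0 & 0 & \alpha & 0 & 0 \\ 0 & 0 & 0 & \beta & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \end{pmatrix}.$$

By Kalman's rank condition (a standard criterion that the text does not state explicitly), the system is controllable when this matrix has rank N = 3, which holds whenever β ≠ 0. Driving c alone would give B = (0, 0, 1)^T and [B, AB, A^2B] of rank 2, so a single input at c cannot control both a and b independently.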

To analyze the relationships among all nodes in control, we first define the control adjacency of a node pair: for a network G and any maximum matching M, node a is said to be control adjacent to node b if there exists a node c connecting a and b with an unmatched edge eca and a matched edge ecb. Control adjacency reveals an important property of the control correlation between nodes: a node that is control adjacent to an input node must appear in another MIS. For example, in Fig. 1A, input node a is control adjacent to node b, and the two nodes alternately appear in MIS {a, c} and MIS {b, c}. We prove that this property holds for any network and call it the Exchange Theorem: for any MIS D of a network G, if an input node n ∈ D is control adjacent to another node m, then D′ = (D \ {n}) ∪ {m} is also an MIS of G (see Supplementary Information). This means that a new MIS D′ can be obtained from MIS D by exchanging a node of D with its control adjacent neighbor.
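
The exchange described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used for the results: the helper names (`max_matching`, `input_nodes`, `control_adjacent_pairs`, `exchange`) are hypothetical, the matching is found with a plain augmenting-path search rather than a faster algorithm, and the edge list is the topology we assume for Fig. 1A (c → a and c → b).

```python
def max_matching(nodes, edges):
    """Augmenting-path maximum matching; returns {target: source} for matched edges."""
    out = {v: [] for v in nodes}
    for s, t in edges:
        out[s].append(t)
    match_in = {}

    def augment(s, seen):
        for t in out[s]:
            if t not in seen:
                seen.add(t)
                if t not in match_in or augment(match_in[t], seen):
                    match_in[t] = s
                    return True
        return False

    for s in nodes:
        augment(s, set())
    return match_in


def input_nodes(nodes, match_in):
    """Nodes without a matched in-edge form the MIS of this matching."""
    return {v for v in nodes if v not in match_in}


def control_adjacent_pairs(edges, match_in):
    """Pairs (a, b): some c has an unmatched edge c->a and a matched edge c->b."""
    matched_target = {s: t for t, s in match_in.items()}
    return {(a, matched_target[c]) for c, a in edges
            if c in matched_target and matched_target[c] != a}


def exchange(edges, match_in, n, m):
    """Re-route the matched edge c->m to c->n, so that m replaces n in the MIS."""
    c = match_in[m]
    assert (c, n) in edges and n not in match_in
    new_match = {t: s for t, s in match_in.items() if t != m}
    new_match[n] = c
    return new_match


nodes = ["a", "b", "c"]
edges = [("c", "b"), ("c", "a")]        # assumed topology of Fig. 1A
M = max_matching(nodes, edges)          # matches c->b
D = input_nodes(nodes, M)               # {"a", "c"}
pairs = control_adjacent_pairs(edges, M)
M2 = exchange(edges, M, "a", "b")       # swap input node a for its neighbor b
D2 = input_nodes(nodes, M2)             # {"b", "c"}
print(sorted(D), sorted(pairs), sorted(D2))
```

The new matching has the same size as the old one and differs from it by a single edge, which is why the exchanged set is again an MIS.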

Figure 1

Control adjacency and input graph.

(A) A simple network with a dilation. For the maximum matching {ecb}, input node a and matched node b are control adjacent because they are connected through node c by an unmatched edge eca and a matched edge ecb, so they alternately appear in the two MISs {a, c} and {b, c}. (B) A sample directed network (left), its bipartite graph (upper right) and its input graph (lower right). The bipartite graph is constructed by splitting each node of the directed network into two nodes, one in the set "in" and one in the set "out", which gives a clear representation of the control structure. The input graph is built from the control adjacency relationships. The input components (shaded in yellow) contain all possible input nodes and the matched components (shaded in blue) contain all redundant nodes. (C) All six MISs of the network shown in (B); the edges represent their control adjacency relationships. If an input node of an MIS is no longer usable, we can immediately find a substitute node from the input graph. Note that adjacent MISs differ by one matched edge and one input node, which is the minimal change to the original control structure.

Then, we define the input graph GD(V, ED) based on the control adjacency between nodes, where V is the node set, ED is the edge set, and eij ∈ ED if node i is control adjacent to node j (Fig. 1C). The input graph thus captures all control relationships between nodes.

The input graph has several potential applications in analyzing the controllability of complex networks. We find that the degree of a node in the input graph reflects its substitutability in control. Based on the exchange theorem, for each edge of an input node in the input graph, we can find a substitute node and obtain a new MIS with only one node replaced. This has important practical value. For example, when an input node of an MIS is no longer available because of a physical constraint or an attack, we can immediately find a new MIS by replacing the node with one of its neighbors in the input graph. This method makes the minimum change to the original control scheme, namely a single edge. Note that, once the input graph is available, the computational complexity of this process is only O(1), which is a significant improvement over the state-of-the-art approach10 for obtaining a new MIS.

Connected components of input graph

Next, we analyze the connectivity of the input graph. By analogy with the concepts of path and reachable set in graph theory, we define a control path p as a node sequence in which consecutive nodes are control adjacent. The control-reachable set C(n) of node n is defined as the set of all nodes that are reachable from node n through any control path (Fig. 1B). Based on these definitions, we prove the following Adjacency Corollary: 1. For any MIS D and an input node n ∈ D, all nodes of C(n) must be possible input nodes; 2. For any MIS D, if node m does not belong to the control-reachable set of any input node of D, then m must be a redundant node and never appears in any MIS (see Supplementary Information). Therefore, a node is a possible input node if it is control reachable from an input node of any MIS.
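
On a very small network, the classification into possible input nodes and redundant nodes can be checked directly against its definition by enumerating every maximum matching. The sketch below is a brute-force illustration only: it is exponential in the number of edges, the function name `all_mis` and the five-node example network are hypothetical, and an MIS is identified here with the set of nodes left without a matched in-edge.

```python
from itertools import combinations


def all_mis(nodes, edges):
    """Every MIS of a tiny network, by brute force over edge subsets (exponential)."""
    def is_matching(sub):
        # No two chosen edges may share a source or share a target.
        return (len({s for s, _ in sub}) == len(sub)
                and len({t for _, t in sub}) == len(sub))

    for k in range(min(len(nodes), len(edges)), -1, -1):
        found = {frozenset(nodes) - {t for _, t in sub}
                 for sub in combinations(edges, k) if is_matching(sub)}
        if found:          # the first non-empty k is the maximum matching size
            return found


# Hypothetical example: c1 points to u, b and r; c2 points only to r.
nodes = ["c1", "c2", "u", "b", "r"]
edges = [("c1", "u"), ("c1", "b"), ("c1", "r"), ("c2", "r")]

mis_sets = all_mis(nodes, edges)
possible = set().union(*mis_sets)      # nodes that appear in at least one MIS
redundant = set(nodes) - possible      # nodes that appear in no MIS
print(sorted(map(sorted, mis_sets)), sorted(redundant))
```

In this example u and b are exchangeable through c1, whereas r is redundant: leaving r unmatched would leave c2 without a matched out-edge and shrink the matching.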

The adjacency corollary shows that the control-reachable sets of possible input nodes and redundant nodes never intersect. Thus, there are only two types of connected components in the input graph: 1. Input Component (IC), which contains at least one input node; and 2. Matched Component (MC), which contains no input node. We call these two types of connected components control components. All nodes of an IC are therefore possible input nodes and all nodes of an MC are redundant nodes.

We analyze the control components of some real networks and find that the complex control relationships of these networks can be reduced to a few control components of their input graphs, e.g., the Little Rock food web (Fig. 2A) and the political blog network (Fig. 2B). Furthermore, we find that many real networks have a giant control component in their input graph (Table 1 and Fig. S7), suggesting that the majority of nodes are tightly connected by control adjacency and have the same type in control. The giant control component can be either a giant IC or a giant MC.

Table 1: Characteristics of the real networks analyzed in the paper.

Figure 2

Control components of real and synthetic networks.

(A,B) Real networks and their corresponding input graphs. The complex structures of (A) the Little Rock food web and (B) the political blog network are reduced to several simple connected components of the input graph. (C) The size of the largest control component CCmax versus the average degree <k> in scale-free networks with degree exponents rin = rout = 3 and N = 10^4; (D) the number of control components (CC) decreases significantly with increasing <k>, which illustrates the emergence of a giant control component; (E) two types of giant control components; the input component (IC) and matched component (MC) cannot coexist in highly connected networks; (F) the emergence of a giant control component leads to the bifurcation phenomenon of possible input nodes in dense networks. The majority of nodes of a network with a giant IC are possible input nodes, whereas those of a network with a giant MC are redundant nodes; (G,H) three example networks with average degrees of <k> = 5 and <k> = 10; the type of their giant control component determines the type of the majority of nodes in control.

To further understand the origin of the giant control component, we analyze the size of the largest control component of synthetic networks. We find that the size of the largest control component increases with the average degree of a network (Fig. 2C), whereas the number of control components decreases monotonically (Fig. 2D). Therefore, there exists only one giant control component in dense networks (Fig. 2E). The type of a node in control is determined by the type of the control component to which it belongs, which depends on whether the control component contains an input node. If the giant control component contains at least one input node, most nodes of the network will be possible input nodes; if the giant control component contains no input node, most nodes will be redundant nodes. Thus, we observe the bifurcation phenomenon12 (Fig. 2F) that emerges in dense networks. The formation of the giant control component in the input graph therefore provides a clear explanation for the origin of the bifurcation phenomenon in dense networks.

Altering the type of nodes in control

Owing to the economic or physical constraints that exist in many actual control scenarios, we may need some specified nodes to act as input nodes. If such a node is a possible input node, we can easily find an MIS that contains it. However, if the node is a redundant node, we must alter the structure of the network to turn the node into a possible input node.

Since the nodes of the same control component have the same control type, we only need to alter the type of the control component in which the target node lies. This can be done by adding new edges to the network. For example, if we link several input nodes to an MC, the nodes in the MC will be turned into possible input nodes and the MC will be turned into an IC. Conversely, if we match all input nodes of an IC, it will be turned into an MC and all of its nodes will become redundant nodes (Fig. 3A,B,E).

Figure 3

Type alteration of a giant control component in scale-free networks with degree exponents rin = rout = 3 and N = 10^4.

(A) Illustration of altering an input component (IC) to a saturated matched component (SMC). For each input node of an IC, we add an edge pointing from an unsaturated node (a node without a matched out-edge) to the input node, and the IC turns into an SMC. (B) Illustration of altering an unsaturated matched component (UMC) to an SMC. For each unsaturated node of an MC, we add an edge from the unsaturated node to an input node, and the UMC turns into an SMC. (C,D) The percentage of added edges p/L needed to alter the control component versus the average degree, for IC to SMC and for UMC to SMC. For large <k>, the percentage of added edges decreases significantly and the number of changed possible input nodes per added edge, ΔnD/p, increases rapidly, which indicates that it is easy to alter the giant control component type of dense networks. (E) Illustration of altering an SMC to an IC. We link an input node to the SMC and it turns into an IC. (F) Average degree versus density of changed possible input nodes for SMC to IC. With only one added edge, the control type of most nodes changes in dense networks.

Therefore, we present an algorithm to alter the type of the control component (see Methods). To quantify the efficiency of the algorithm, we investigate the number of added edges p in both ER-random and scale-free networks. The results (Fig. 3C,D) show that p decreases significantly with the average degree <k>, while the proportion of changed possible input nodes ΔnD increases monotonically, which indicates that it is easier to alter the control component of a denser network.

Surprisingly, the giant control component of some networks can be changed by adding only one edge (Fig. 3E,F). For example, the control type of most nodes of some real networks (e.g. the Facebook and Amazon networks shown in Table 1) can be altered by only one added edge. All these networks have a special giant MC in their input graphs, whose nodes are not linked by any unsaturated node (a node without a matched out-edge) in the original network. We call such a component a saturated matched component (SMC). If we link an input node to an SMC, most nodes of the SMC become control reachable from the input node and are turned into possible input nodes. However, when an MC is linked by one or more unsaturated nodes in the original network, which we call an unsaturated matched component (UMC), we need to match all the unsaturated nodes to change its type. The results show that the cost of altering an IC to an SMC (Fig. 3C) is similar to that of altering a UMC to an SMC (Fig. 3D).

Furthermore, we find that the size of the MIS decreases significantly after altering the type of the giant control component (Fig. S8), which means that the method can also be used to optimize the controllability of complex networks19,20.

Discussion

In summary, we developed the input graph, a fundamental structure that reveals the control relationships of nodes and MISs. Our key finding, that control adjacent nodes have the same type in control, allows us to reveal the inherent control correlation of nodes and offers a general mechanism to manipulate the control type of nodes or to design a suitable control scheme. Furthermore, networks with a giant control component display a surprising type transition in response to well-chosen structural perturbations, a phenomenon that is ubiquitous in dense networks across multiple disciplines.

The input graph presented here is a starting point for a deeper investigation of the control properties of networks. It paves the way to analyzing the properties of all MISs of a network, such as enumerating all MISs14, estimating the node capacity in MISs10 or finding the optimum MIS under different control constraints or with specific node properties21. Furthermore, the structural properties of the input graph, such as node degree and connected components, also bear on several important topics in network controllability. We believe that other structural properties of the input graph, such as its average distance and diameter, are also worth deeper investigation in multiple disciplines, such as brain networks22 and protein interaction networks18.

However, besides the input nodes in an MIS, there may exist other nodes that also need to receive control signals. These nodes form a cycle4 in the network and are not accessible from any input node in the current MIS. Therefore, to fully control the network, a control signal needs to be applied to any node of the cycle, and the signal can be shared with any input node4. Furthermore, there may exist more than one input node within the same input component. Any of these input nodes can be exchanged with a control-reachable node based on the exchange theorem. However, if two or more input nodes share the same neighbor in the input graph, they cannot be substituted at the same time.

Furthermore, we design an algorithm to alter the control type of most nodes of a network with small structural perturbations, which, as far as we know, is the first attempt to convert the control mode12,13 of a network. Many real networks, especially biological networks, are incomplete and may have many missing edges. This means that newly discovered edges may dramatically alter the control type of existing nodes. However, these newly discovered edges will never weaken the performance of our algorithm, because they can only increase the size of the giant connected component of the input graph.

Overall, these findings will improve our understanding of the control principles of complex networks and may be useful in controlling various real complex systems, with applications such as drug design23,24,25, financial markets16,26 and biological networks18,22.

Methods

Construct input graph

To build the input graph GD(V, ED) of a directed network G(V, E), we need to find all control adjacency edges ED between nodes. By the adjacency corollary, there is no control adjacency relationship between any possible input node and any redundant node. Therefore, the edge set ED of the input graph is composed of two subsets: the set of edges EDi between possible input nodes and the set of edges EDr between redundant nodes.

The edge set EDi and the set of all possible input nodes VPD can be found as follows:

  1. Find a maximum matching M and the corresponding MIS D; let the candidate set of all possible input nodes VPD = D;
  2. Select a node n of D, let D = D − {n}; for all in-edges of node n, find the corresponding control adjacent neighbors {c1, c2, …, ci};
  3. Let D = D + {c1, c2, …, ci} (adding only nodes not already in VPD, so that every node is processed once), VPD = VPD + {c1, c2, …, ci} and EDi = EDi + {e(n, c1), …, e(n, ci)};
  4. Repeat steps 2 and 3 until D is empty.
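
The four steps above amount to a breadth-first search that starts from the input nodes of one maximum matching. The following Python sketch shows one way to write it; it is an illustration under our own assumptions rather than the implementation used for the results. The names `max_matching` and `build_edi` are hypothetical, the matching uses a simple augmenting-path search instead of the faster algorithm cited in the complexity analysis, and the example network is invented.

```python
from collections import deque


def max_matching(nodes, edges):
    """Augmenting-path maximum matching; returns {target: source} for matched edges."""
    out = {v: [] for v in nodes}
    for s, t in edges:
        out[s].append(t)
    match_in = {}

    def augment(s, seen):
        for t in out[s]:
            if t not in seen:
                seen.add(t)
                if t not in match_in or augment(match_in[t], seen):
                    match_in[t] = s
                    return True
        return False

    for s in nodes:
        augment(s, set())
    return match_in


def build_edi(nodes, edges):
    """Return (V_PD, E_Di): the possible input nodes and the edges between them."""
    match_in = max_matching(nodes, edges)                   # step 1
    matched_target = {s: t for t, s in match_in.items()}
    in_nbrs = {v: [] for v in nodes}
    for s, t in edges:
        in_nbrs[t].append(s)

    queue = deque(v for v in nodes if v not in match_in)    # D, the initial MIS
    vpd, edi = set(queue), set()
    while queue:                                            # step 4
        n = queue.popleft()                                 # step 2
        for c in in_nbrs[n]:
            if match_in.get(n) == c:
                continue                 # c->n is the matched edge, not an unmatched one
            m = matched_target.get(c)    # node matched by c is control adjacent to n
            if m is None:
                continue                 # cannot occur for a maximum matching; kept as a guard
            edi.add((n, m))                                 # step 3
            if m not in vpd:             # enqueue each node only once
                vpd.add(m)
                queue.append(m)
    return vpd, edi


# Hypothetical example: c1 points to b, u and r; c2 points only to r.
nodes = ["c1", "c2", "u", "b", "r"]
edges = [("c1", "b"), ("c1", "u"), ("c1", "r"), ("c2", "r")]
vpd, edi = build_edi(nodes, edges)
print(sorted(vpd), sorted(edi), sorted(set(nodes) - vpd))
```

Because each node enters the queue at most once and each in-edge is inspected once per dequeued node, the search itself is linear in the number of edges once the matching is known.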

The edge set EDr between redundant nodes can be found as follows:

  1. Let Vtemp = V − VPD;
  2. Select a node n of Vtemp, let Vtemp ← Vtemp − {n};
  3. Let the matched in-edge of n be e(m, n), find all out-edges {e(m, c1), …, e(m, ci)} of node m; let EDr = EDr + {e(c1, n), …, e(ci, n)} and Vtemp ← Vtemp + {c1, c2, …, ci} (skipping nodes that have already been processed);
  4. Repeat steps 2 and 3 until Vtemp is empty.

Next, we analyze the computational complexity of the above method. Let N be the number of nodes and L the number of edges of the directed network G(V, E). First, the complexity of finding a maximum matching is O(N^0.5 L)9. Second, each node requires a breadth-first search (BFS) to find its control-reachable set, whose computational complexity is O(L). In the worst case, we need to find the control-reachable set of every node, and the complexity is O(NL). Therefore, the total computational complexity of building an input graph is O(NL).

Altering an IC to an MC

For a network G(V, E), let B(Vout, Vin, E) be the corresponding bipartite graph and ICalter be the target input component. The basic idea of altering an IC to an MC is to match all input nodes of the IC by adding edges. The detailed steps are as follows:

  1. Find all unmatched nodes S corresponding to the current maximum matching. Let S1 = S ∩ Vout, S2 = S ∩ Vin ∩ ICalter;
  2. Select a node n ∈ S2 and a node m ∈ S1; add an edge emn to G; remove nodes n and m from S2 and S1, respectively;
  3. Repeat step 2 until S2 is empty.

The correctness of the above method is shown as follows. First, we prove that |S1| ≥ |S2|. By the definition of B(Vout, Vin, E), |Vin| = |Vout|. Because every edge of the maximum matching starts at a node of Vout and ends at a node of Vin, |S1| = |S ∩ Vout| = |S ∩ Vin| ≥ |S ∩ Vin ∩ ICalter| = |S2|, which means that for any node of S2 we can find a corresponding node in S1. When we add an edge emn to G in step 2, the matching M′ = M + emn must be a maximum matching of G′ = G + emn because n and m are not matched by M. Therefore, nodes n and m are matched by M′. Once all input nodes of ICalter are matched, ICalter is turned into an SMC.
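
A minimal Python sketch of these steps is given below, assuming the nodes of the target input component are already known (for instance from the input graph construction). The function names and the example network are hypothetical, and the matching is found with a simple augmenting-path search. Note that the pairing may select m = n, which would add a self-loop; the bipartite representation permits this, although the example is ordered so that it does not occur.

```python
def max_matching(nodes, edges):
    """Augmenting-path maximum matching; returns {target: source} for matched edges."""
    out = {v: [] for v in nodes}
    for s, t in edges:
        out[s].append(t)
    match_in = {}

    def augment(s, seen):
        for t in out[s]:
            if t not in seen:
                seen.add(t)
                if t not in match_in or augment(match_in[t], seen):
                    match_in[t] = s
                    return True
        return False

    for s in nodes:
        augment(s, set())
    return match_in


def alter_ic_to_mc(nodes, edges, ic_nodes):
    """Add one edge per input node of the target IC so that each becomes matched."""
    match_in = max_matching(nodes, edges)
    saturated = set(match_in.values())
    s1 = [v for v in nodes if v not in saturated]                   # S1 = S ∩ V_out
    s2 = [v for v in nodes if v not in match_in and v in ic_nodes]  # S2 = S ∩ V_in ∩ IC
    added = list(zip(s1, s2))        # edges (m, n); |S1| >= |S2| by the argument above
    return edges + added, added


# Hypothetical example: u and b form the target input component.
nodes = ["c1", "c2", "b", "r", "u"]
edges = [("c1", "b"), ("c1", "u"), ("c1", "r"), ("c2", "r")]

before = max_matching(nodes, edges)
new_edges, added = alter_ic_to_mc(nodes, edges, ic_nodes={"u", "b"})
after = max_matching(nodes, new_edges)
inputs_after = {v for v in nodes if v not in after}
print(added, len(before), len(after), sorted(inputs_after))
```

In this example a single added edge raises the matching size by one, and neither u nor b remains an input node afterwards.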

Altering an MC to an IC

For a network G(V, E) with an MC, let B(Vout, Vin, E) be the corresponding bipartite graph. The basic idea of altering an MC to an IC is to link several input nodes to the MC, which turns the MC into an IC. For a network with a UMC, if we directly link an input node to the UMC, the input node will be matched with an unsaturated node of the UMC. Therefore, we first need to alter the UMC to an SMC and then turn the SMC into an IC. The method of altering a UMC to an SMC is similar to that of altering an IC to an SMC and is as follows:

  1. Find all unmatched nodes S corresponding to the current maximum matching. Let S1 = S ∩ Vin, S2 = S ∩ Vout ∩ UMCalter;
  2. Select a node n ∈ S2 and a node m ∈ S1; add an edge enm to G; remove nodes n and m from S2 and S1, respectively;
  3. Repeat step 2 until S2 is empty.

For a network with a giant SMC, we only need to link an input node to the SMC, and the part that is control reachable from the input node will be turned into an IC. Therefore, in order to maximize the size of the resulting IC, we need to find the most “influential” node of the SMC based on the size of its control-reachable set. The algorithm is as follows:

  1. Let SMCalter be the target saturated matched component; compute the size of control-reachable set of nodes in SMCalter, let the node with maximum size be n;
  2. Let emn be the matched edge pointing to node n; select an input node d, add edge emd to G.

After adding the edge emd, the nodes of the control-reachable set of n will be turned into possible input nodes because the input node d is control adjacent to node n. If we want to convert all nodes of an SMC, we need to find a minimal set of input nodes that can reach all nodes of the SMC. This problem can be approached with a simple greedy algorithm.
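
The single-edge alteration can be sketched as follows. This is an illustration under our own assumptions: the helper names, the four-node example and the list `smc_nodes` (the redundant nodes taken to form the target SMC) are hypothetical, and control-reachable sets are computed with respect to one fixed maximum matching.

```python
def max_matching(nodes, edges):
    """Augmenting-path maximum matching; returns {target: source} for matched edges."""
    out = {v: [] for v in nodes}
    for s, t in edges:
        out[s].append(t)
    match_in = {}

    def augment(s, seen):
        for t in out[s]:
            if t not in seen:
                seen.add(t)
                if t not in match_in or augment(match_in[t], seen):
                    match_in[t] = s
                    return True
        return False

    for s in nodes:
        augment(s, set())
    return match_in


def control_reachable(n, edges, match_in):
    """Nodes reachable from n by repeatedly stepping to a control adjacent node."""
    matched_target = {s: t for t, s in match_in.items()}
    in_nbrs = {}
    for s, t in edges:
        in_nbrs.setdefault(t, []).append(s)
    seen, stack = {n}, [n]
    while stack:
        v = stack.pop()
        for c in in_nbrs.get(v, []):
            m = matched_target.get(c)     # for the matched edge c->v this is v itself
            if m is not None and m not in seen:
                seen.add(m)
                stack.append(m)
    return seen


def alter_smc_to_ic(edges, match_in, smc_nodes, d):
    """Pick the SMC node with the largest control-reachable set and link its matcher to d."""
    n = max(smc_nodes, key=lambda v: len(control_reachable(v, edges, match_in)))
    m = match_in[n]                       # e_mn is the matched edge pointing to n
    return n, (m, d)


# Hypothetical example: x, y, z are perfectly matched; d is an isolated input node.
nodes = ["d", "x", "y", "z"]
edges = [("x", "y"), ("y", "z"), ("y", "x"), ("z", "x")]
M = max_matching(nodes, edges)
n, new_edge = alter_smc_to_ic(edges, M, smc_nodes=["x", "y", "z"], d="d")

new_edges = edges + [new_edge]
M2 = max_matching(nodes, new_edges)
now_possible = control_reachable("d", new_edges, M2)
print(n, new_edge, sorted(now_possible))
```

Here the added edge z → d makes x and z control reachable from the input node d, while y stays redundant, which is the situation in which a further input node would be needed to convert the whole component.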

Additional Information

How to cite this article: Zhang, X. et al. Input graph: the hidden geometry in controlling complex networks. Sci. Rep. 6, 38209; doi: 10.1038/srep38209 (2016).
