Input graph: the hidden geometry in controlling complex networks

The ability to steer a complex network towards a desired behavior relies on our understanding of the complex nature of social and technological networks. The existence of numerous control schemes in a network prompts us to ask: what is the underlying relationship among all possible input nodes? Here we introduce the input graph, a simple geometry that reveals the complex relationship between all control schemes and input nodes. We prove that a node adjacent to an input node in the input graph will appear in another control scheme, and that connected nodes in the input graph have the same type in control: either all of them are possible input nodes or none of them is. Furthermore, we find that giant components emerge in the input graphs of many real networks, which provides a clear topological explanation of the bifurcation phenomenon emerging in dense networks and prompts us to design an efficient method to alter the node type in control. The findings provide insight into the control principles of complex networks and offer a general mechanism to design a suitable control scheme for different purposes.

provides a clear topological explanation of the bifurcation phenomenon 12 in dense networks, and the complex control correlation of the nodes of the original network can be reduced to a few simple connected components of the input graph. Furthermore, we present an efficient method to precisely manipulate the control type of any node based on its connectivity in the input graph. We believe that the input graph is important because it (i) presents a framework that reveals the inherent correlation of minimum input sets (MISs) and nodes in control and (ii) enables the design and manipulation of a suitable MIS of a network under constraints. Ultimately, this will promote the application of network control in real networked systems.

Results
Control adjacency and input graph. The dynamics of a linear time-invariant network $G(V, E)$ is described by:

$$\dot{x}(t) = A x(t) + B u(t)$$

where the state vector $x(t) = (x_1(t), \dots, x_N(t))^T$ denotes the values of the $N$ nodes in the network at time $t$, $A$ is the transpose of the adjacency matrix of the network, $B$ is the input matrix that defines how control signals are fed into the network, and $u(t) = (u_1(t), \dots, u_H(t))^T$ represents the $H$ input signals at time $t$.
To analyze the relationship among all nodes in control, we first define the control adjacency of a node pair: for a network $G$ and any maximum matching $M$, node $a$ is said to be control adjacent to node $b$ if there exists a node $c$ connecting $a$ and $b$ through an unmatched edge $e_{ca}$ and a matched edge $e_{cb}$. Control adjacency reveals an important property of the control correlation of nodes: a node that is control adjacent to an input node must appear in another MIS. For example, in Fig. 1A, input node a is control adjacent to node b, and the two nodes alternately appear in MIS {a, c} and MIS {b, c}. We prove that this property holds for any network and call it the Exchange Theorem: for any MIS $D$ of a network $G$, if an input node $n \in D$ is control adjacent to another node $m$, then $D' = (D \setminus \{n\}) \cup \{m\}$ is also an MIS of $G$ (see Supplementary Information). This means that a new MIS $D'$ can be obtained from MIS $D$ by exchanging a node of $D$ with its control adjacent neighbor.
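The definition and the exchange step can be illustrated with a minimal Python sketch. It rests on several assumptions that are not stated in this section: input nodes are taken to be the nodes without a matched incoming edge, the maximum matching is found with a plain augmenting-path search, and the three-node network (edges c→a and c→b) is a toy chosen to reproduce the MISs {a, c} and {b, c} quoted above, not necessarily the network of Fig. 1A. All function names are hypothetical.

```python
def maximum_matching(nodes, edges):
    """Return {head: tail} for the matched edges tail->head.

    Simple augmenting-path search on the bipartite view of a directed
    graph (each node has an "out" copy and an "in" copy).
    """
    successors = {u: [] for u in nodes}
    for tail, head in edges:
        successors[tail].append(head)
    match_in = {}

    def try_augment(tail, seen):
        for head in successors[tail]:
            if head in seen:
                continue
            seen.add(head)
            # Use head if it is free, or if its current partner can move.
            if head not in match_in or try_augment(match_in[head], seen):
                match_in[head] = tail
                return True
        return False

    for tail in nodes:
        try_augment(tail, set())
    return match_in


def input_nodes(nodes, match_in):
    """Assumption: input nodes are the nodes with no matched in-edge."""
    return {v for v in nodes if v not in match_in}


def control_adjacent_pairs(edges, match_in):
    """Pairs (a, b): some c has an unmatched edge c->a and a matched edge c->b."""
    matched_out = {tail: head for head, tail in match_in.items()}
    pairs = set()
    for c, a in edges:
        b = matched_out.get(c)
        if b is not None and b != a:
            pairs.add((a, b))
    return pairs


def exchange(match_in, n, m):
    """Swap the matched edge c->m for c->n (n must be an input node)."""
    swapped = dict(match_in)
    c = swapped.pop(m)
    swapped[n] = c
    return swapped


nodes = ["a", "b", "c"]
edges = [("c", "a"), ("c", "b")]

matching = maximum_matching(nodes, edges)
mis_before = input_nodes(nodes, matching)
pairs = control_adjacent_pairs(edges, matching)
n, m = next(iter(pairs))
mis_after = input_nodes(nodes, exchange(matching, n, m))
```

In this toy case the input node of the first MIS that is control adjacent to a matched node is replaced by that node, and the second set is again the set of unmatched nodes of a matching of the same size.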
Then, we define the input graph $G_D(V, E_D)$ based on the control adjacency between nodes, where $V$ is the node set, $E_D$ is the edge set, and $e_{ij} \in E_D$ if node $i$ is control adjacent to node $j$ (Fig. 1C). By construction, the input graph captures all control adjacency relationships between nodes.
The input graph has several potential applications in analyzing the controllability of complex networks. We find that the degree of a node in the input graph reflects its substitutability in control. Based on the Exchange Theorem, for each edge of an input node in the input graph, we can find a substitute node and obtain a new MIS with only one node replaced. This has important practical value. For example, when an input node of an MIS is no longer available due to a physical constraint or an attack, we can immediately find a new MIS by replacing the node with one of its neighbors in the input graph. This makes the minimum change to the original control scheme: only one matched edge is replaced. Note that the computational complexity of this process is only O(1), which is a significant improvement over the state-of-the-art approach 10 .
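As a rough illustration of the substitution step, the sketch below assumes the input graph is already available as a neighbor dictionary (a hypothetical data layout) and that any neighbor not already in the MIS is an acceptable substitute, as the Exchange Theorem suggests. The function name and the toy data are illustrative only.

```python
def substitute_input_node(mis, failed, neighbors):
    """Return a new MIS in which `failed` is replaced by one of its
    input-graph neighbors, or None if no neighbor is available.

    `neighbors` maps each node to the set of nodes it is control
    adjacent to (hypothetical representation of the input graph).
    """
    for candidate in sorted(neighbors.get(failed, ())):
        if candidate not in mis:  # skip nodes that are already inputs
            return (mis - {failed}) | {candidate}
    return None


# Toy input graph with a single edge a-b; node c is isolated.
neighbors = {"a": {"b"}, "b": {"a"}, "c": set()}
new_mis = substitute_input_node({"a", "c"}, "a", neighbors)
no_option = substitute_input_node({"a", "c"}, "c", neighbors)
```

Here `new_mis` is {b, c}, matching the pair of MISs quoted for Fig. 1A, while `no_option` is `None` because node c has no neighbor in the toy input graph.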
Connected components of input graph. Next we focus on analyzing the connectivity of input graph.
Similar to the concepts of path and reachable set in graph theory, we define a control path $p$ as a node sequence in which consecutive nodes are control adjacent. The control-reachable set $C(n)$ of node $n$ is defined as the set of all nodes that are reachable from node $n$ through any control path (Fig. 1B). Based on these definitions, we prove the following Adjacency Corollary: 1. For any MIS $D$ and an input node $n \in D$, all nodes of $C(n)$ must be possible input nodes; 2. For any MIS $D$, if node $m$ does not belong to the control-reachable set of any input node of $D$, then $m$ must be a redundant node and never appears in any MIS (see Supplementary Information). Therefore, a node is a possible input node if and only if it is control reachable from an input node of some MIS.
The Adjacency Corollary shows that the control-reachable sets of possible input nodes and redundant nodes never intersect. Thus, there are only two types of connected components in the input graph: 1. Input Component (IC), which contains at least one input node; and 2. Matched Component (MC), which contains no input node. We call these two types of connected components control components. It follows that all nodes of an IC are possible input nodes and all nodes of an MC are redundant nodes.
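The classification into ICs and MCs can be sketched as a connected-component search. The sketch assumes that the input graph is given as a list of control-adjacent pairs and is treated as undirected for connectivity, and that one MIS is known. The node labels, edges and function name are hypothetical illustration data, not taken from any network in the paper.

```python
from collections import deque


def control_components(nodes, input_graph_edges, mis):
    """Split the input graph into connected components and label each
    one "IC" (contains an input node of `mis`) or "MC" (contains none)."""
    adjacency = {v: set() for v in nodes}
    for i, j in input_graph_edges:
        adjacency[i].add(j)
        adjacency[j].add(i)

    seen = set()
    components = []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:  # breadth-first search over control paths
            v = queue.popleft()
            component.add(v)
            for w in adjacency[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        kind = "IC" if component & mis else "MC"
        components.append((kind, component))
    return components


nodes = ["a", "b", "c", "d", "e", "f"]
input_graph_edges = [("a", "b"), ("b", "c"), ("d", "e")]
mis = {"a", "f"}

components = control_components(nodes, input_graph_edges, mis)
possible_inputs = set().union(*(c for kind, c in components if kind == "IC"))
redundant = set(nodes) - possible_inputs
```

In this toy example the components {a, b, c} and {f} are ICs, so their nodes are possible input nodes, while {d, e} is an MC whose nodes are redundant.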
We analyze the control components of several real networks and find that the complex control relationships of these networks can be reduced to a few control components of their input graphs, for example in the Little Rock (Fig. 2A) and political blog (Fig. 2B) networks. Furthermore, we find that many real networks have a giant control component in their input graph (Table 1 and Fig. S7), suggesting that the majority of nodes are tightly connected by control adjacency and have the same type in control. The giant control component can be either a giant IC or a giant MC.
To further understand the origin of the giant control component, we analyze the size of the largest control component of synthetic networks. We find that the size of the largest control component increases with the average degree of a network (Fig. 2C), whereas the number of control components decreases monotonically (Fig. 2D). Therefore, there exists only one giant control component in dense networks (Fig. 2E). The type of a node in control is determined by the type of the control component to which it belongs, which depends on whether the control component contains an input node. If the giant control component contains at least one input node, most nodes of the network will be possible input nodes; if it contains no input node, most nodes will be redundant nodes. This is the bifurcation phenomenon 12 (Fig. 2F) observed in dense networks. Therefore, the formation of the giant control component in the input graph provides a clear explanation for the origin of the bifurcation phenomenon in dense networks.
Scientific RepoRts | 6:38209 | DOI: 10.1038/srep38209

Altering the type of nodes in control. Owing to the economic or physical constraints that exist in many actual control scenarios, we may need some specified nodes to serve as input nodes. If the node is a possible input node, we can easily find an MIS that contains it. However, if the node is a redundant node, we must alter the structure of the network to turn the node into a possible input node.
Since the nodes of the same control component have the same control type, we only need to alter the type of the control component in which the target node lies. This can be done by adding new edges to the network. For example, if we link several input nodes to an MC, the MC will be turned into an IC and its nodes will become possible input nodes. Conversely, if we match all input nodes of an IC, it will be turned into an MC and all of its nodes will become redundant nodes (Fig. 3A,B,E).
Therefore, we present an algorithm to alter the type of the control component (see Method). To quantify the efficiency of the algorithm, we investigate the number of added edges $p$ in both ER-random and scale-free networks. The results (Fig. 3C,D) show that $p$ decreases significantly with the average degree $\langle k \rangle$, and the proportion of changed possible input nodes $\Delta n_D$ increases monotonically, which indicates that it is easier to alter the control component of a denser network.

If an input node of an MIS is no longer usable, we can immediately find its substitute node based on the input graph. Note that adjacent MISs differ by one matched edge and one input node, which is the minimal change to the original control structure.
Surprisingly, the giant control component of a few networks can be changed by adding only one edge (Fig. 3E,F). For example, the control type of most nodes of some real networks (e.g. the Facebook and Amazon networks shown in Table 1) can be altered by adding only one edge. All these networks have a special giant MC in their input graphs: the nodes of the MC are not linked by any unsaturated node (a node without a matched out-edge) in the original network. We call such a component a saturated matched component (SMC). If we link an input node to an SMC, most nodes of the SMC become control reachable from the input node and are turned into possible input nodes. However, when an MC is linked by one or more unsaturated nodes in the original network, which we call an unsaturated matched component (UMC), we need to match all the unsaturated nodes to change its type. The results show that the cost of altering an IC to a UMC (Fig. 3C) is similar to that of altering a UMC to an IC (Fig. 3D).
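The distinction between SMC and UMC can be checked directly from the original network and a maximum matching. The sketch below reads "linked by an unsaturated node" as "has an incoming edge from a node without a matched out-edge", which is an interpretation of the text. The matching is stored as a hypothetical `{head: tail}` dictionary, and the four-node network is a toy example.

```python
def unsaturated_nodes(nodes, match_in):
    """Nodes without a matched out-edge; match_in maps head -> tail."""
    saturated = set(match_in.values())
    return set(nodes) - saturated


def classify_matched_component(component, nodes, edges, match_in):
    """Label a matched component "SMC" or "UMC" and return the
    unsaturated nodes that link to it (empty for an SMC)."""
    unsaturated = unsaturated_nodes(nodes, match_in)
    linking = {tail for tail, head in edges
               if head in component and tail in unsaturated}
    return ("UMC" if linking else "SMC", linking)


# Toy network: s -> p -> q and r -> q, with matched edges s->p and p->q.
nodes = ["s", "p", "q", "r"]
edges = [("s", "p"), ("p", "q"), ("r", "q")]
match_in = {"p": "s", "q": "p"}

kind_q, linking_q = classify_matched_component({"q"}, nodes, edges, match_in)
kind_p, linking_p = classify_matched_component({"p"}, nodes, edges, match_in)
```

In this toy network node r has no matched out-edge and points to q, so the component {q} is labeled UMC, whereas {p} only receives an edge from the saturated node s and is labeled SMC.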
Furthermore, we find that the size of MIS significantly decreases after altering the type of the giant control component (Fig. S8), which means that the method can also be used to optimize the controllability of complex networks 19,20 .

Discussion
In summary, we developed the input graph, a fundamental structure that reveals the control relationship of nodes and MISs. Our key finding, that the control adjacent nodes have the same type in control, allows us to reveal the inherent control correlation of nodes, and offers a general mechanism to manipulate the control type of nodes or design a suitable control scheme. Furthermore, networks with a giant control component display a surprising type transition phenomenon in response to well-chosen structural perturbations, which is ubiquitous in dense networks across multiple disciplines.
The input graph presented here is a starting point for investigating the control properties of networks in depth. It paves the way to analyzing the properties of all MISs of a network, such as enumerating all MISs 14 , estimating the node capacity in MISs 10 or finding the optimum MIS under different control constraints or with specific node properties 21 . Furthermore, the structural properties of the input graph, such as node degree and connected components, also shed light on several important topics in network controllability. We believe that other structural properties of the input graph, such as average distance and diameter, are also worth deep investigation in multiple disciplines, such as brain networks 22 , protein interaction 18 and others.
However, besides the input nodes in an MIS, there may exist other nodes that also need to receive control signals. These nodes form a cycle 4 in the network and are not accessible from any input node in the current MIS. Therefore, to fully control the network, a control signal needs to be fed into any node of the cycle, and the signal can be shared with any input node 4 . Furthermore, there may exist more than one input node within the same input component. Any of these input nodes can be exchanged with one of its control-reachable nodes based on the Exchange Theorem. However, if two or more input nodes share the same neighbor in the input graph, they cannot be substituted by that neighbor at the same time.

Table 1. Characteristics of the real networks analyzed in the paper. For each network, we show its type, name, number of nodes ($N$) and edges ($L$), density of input nodes $n_{MIS}$, size and type of the largest control component $CC_{max}$ (I, U and S denote IC, UMC and SMC respectively), the proportion of edges ($p$) added to the network to change the type of $CC_{max}$, and the density of changed possible input nodes ($\Delta n_D$) after adding edges.
Furthermore, we design an algorithm to alter the control type of most nodes of a network with small structural perturbations, which is, to our knowledge, the first attempt to convert the control mode 12,13 of a network. Many real networks, especially biological networks, are incomplete and may have many missing edges. This means that newly discovered edges may dramatically alter the control type of existing nodes. However, these newly discovered edges will never weaken the performance of our algorithm, because they can only increase the size of the giant connected component of the input graph.
Overall, these findings will improve our understanding of the control principles of complex networks and may be useful in controlling various real complex systems, such as drug design 23-25 , financial markets 16,26 and biological networks 18,22 .

Construct input graph.
To build the input graph $G_D(V, E_D)$ of a directed network $G(V, E)$, we need to find all control adjacency edges $E_D$ between nodes. Based on the Adjacency Corollary, there is no control adjacency relationship between any possible input node and any redundant node. Therefore, the edge set $E_D$ of the input graph is composed of two subsets: the set of edges $E_{Di}$ between possible input nodes and the set of edges $E_{Dr}$ between redundant nodes.
The edge set $E_{Di}$ and the set of all possible input nodes $V_{PD}$ can be found as follows:

The edge set $E_{Dr}$ between all redundant nodes can be found as follows:

Figure 3. (A) For each input node of an IC, we must add an edge pointing from an unsaturated node (a node without a matched out-edge) to the input node, and the IC will turn into an SMC; (B) Illustration of an alteration of an unsaturated matched component (UMC) to an SMC. For each unsaturated node of an MC, we add an edge from the unsaturated node to an input node, and the UMC will turn into an SMC; (C,D) Average degree versus the percentage of added edges $p/L$ used to alter the control component from IC and UMC to SMC. For large $\langle k \rangle$, the percentage of added edges decreases significantly, and the number of changed possible input nodes per added edge, $\Delta n_D / p$, increases rapidly, which indicates that it is easy to alter the type of the giant control component of dense networks; (E) Illustration of an alteration of an SMC to an IC. We need to link an input node to the SMC, which then turns into an IC; (F) Average degree versus the density of changed possible input nodes for SMC to IC. With only one added edge, the control type of most nodes changes in dense networks.