Abstract
The ability to control a complex network towards a desired behavior relies on our understanding of the complex nature of these social and technological networks. The existence of numerous control schemes in a network prompts us to ask: what is the underlying relationship among all possible input nodes? Here we introduce the input graph, a simple geometry that reveals the complex relationship between all control schemes and input nodes. We prove that a node adjacent to an input node in the input graph will appear in another control scheme, and that connected nodes in the input graph have the same type in control: they are either all possible input nodes or all redundant nodes. Furthermore, we find that giant components emerge in the input graphs of many real networks, which provides a clear topological explanation of the bifurcation phenomenon emerging in dense networks and leads us to design an efficient method to alter the node type in control. The findings provide insight into the control principles of complex networks and offer a general mechanism to design a suitable control scheme for different purposes.
Introduction
Controlling complex networked systems is a fundamental challenge in the natural and social sciences and in engineered systems. A networked system is controllable if its state can be driven from any initial state to a desired accessible state^{1,2} by injecting external signals into a few suitably selected nodes, which are called input nodes^{3,4,5,6}. Existing works^{3} provide an efficient method based on maximum matching to find a Minimum Input nodes Set (abbreviated MIS) that can fully control a network.
However, these works have primarily focused on analyzing a single MIS^{4,5,6,7,8}, while the underlying control relationships between nodes and MISs remain elusive. Owing to the structural complexity of a network, its MISs are typically not unique, and the number of MISs can grow exponentially with the size of the network^{9,10}. Enumerating all possible MISs is a #P problem^{11}, which requires high computational cost. A few works have analyzed the node types in control^{12,13} and the control capacities^{10} of input nodes. Moreover, although any MIS is capable of fully controlling the network, different MISs may be composed of nodes with different topological properties, such as high-degree nodes^{14}. Physical constraints and limitations^{15} may also affect the choice of a suitable MIS. For example, when controlling an interbank market^{16,17}, one may need certain specific input nodes to ensure that an MIS can be manipulated by a given organization; when controlling a protein interaction network^{18}, some proteins cannot be used as input nodes because of technical limitations.
Given the existence of numerous MISs in a network, a node can be classified based on its participation in MISs^{12}: 1. possible input node, which appears in at least one MIS; 2. redundant node, which never appears in any MIS. Previous works^{12} found that dense networks exhibit a surprising bifurcation phenomenon, in which the majority of nodes are either redundant nodes or possible input nodes. However, the origin of the bifurcation phenomenon and a method for altering the type of nodes are still unknown.
Despite the many approaches to controllability analysis of complex networks, the following critical questions remain open: (i) what is the relationship between the many available MISs of a network? (ii) what topological structure determines whether a node is a possible input node? (iii) how can one design a suitable MIS with the desired nodes?
Here, we present the input graph, a simple geometry that is nevertheless capable of revealing the complex correlation of all MISs and nodes in control. The input graph is constructed by replacing the original edges with new edges that reflect the control correlations of nodes. We prove that a node adjacent to an input node in the input graph will appear in an MIS, and that the nodes of the same connected component in the input graph have the same control type: they either all participate in control or none of them does. Therefore, the emergence of a giant connected component in the input graph provides a clear topological explanation of the bifurcation phenomenon^{12} in dense networks, and the complex control correlation of nodes in the original network can be reduced to a few simple connected components of the input graph. Furthermore, we present an efficient method to precisely manipulate the control type of any node based on its connectivity in the input graph. We believe that the input graph is important because it (i) presents a framework that reveals the inherent correlation of MISs and nodes in control and (ii) enables the design and manipulation of a suitable MIS of a network under constraints. Ultimately, this will promote the application of network control in real networked systems.
Results
Control adjacency and input graph
The dynamics of a linear time-invariant network G(V, E) is described by:
\frac{dx(t)}{dt} = A x(t) + B u(t)
where the state vector x(t) = (x_{1}(t), …, x_{N}(t))^{T} denotes the values of the N nodes in the network at time t, A is the transpose of the adjacency matrix of the network, B is the input matrix that defines how control signals are injected into the network, and u(t) = (u_{1}(t), …, u_{H}(t))^{T} represents the H input signals at time t.
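As a small worked illustration (not taken from the paper's figures), consider a hypothetical three-node network with edges c → a and c → b, node order (a, b, c), nonzero edge weights a_{ac} and a_{bc}, and two input signals attached to nodes a and c with nonzero gains b_{1} and b_{2}. All of these symbols, and the notation \mathcal{C} for Kalman's controllability matrix, are assumptions introduced here. Since A^{2} = 0 for this network, the matrices read:

```latex
A = \begin{pmatrix} 0 & 0 & a_{ac} \\ 0 & 0 & a_{bc} \\ 0 & 0 & 0 \end{pmatrix},
\qquad
B = \begin{pmatrix} b_{1} & 0 \\ 0 & 0 \\ 0 & b_{2} \end{pmatrix},
\qquad
\mathcal{C} = [\, B,\; AB,\; A^{2}B \,] =
\begin{pmatrix}
b_{1} & 0 & 0 & a_{ac} b_{2} & 0 & 0 \\
0 & 0 & 0 & a_{bc} b_{2} & 0 & 0 \\
0 & b_{2} & 0 & 0 & 0 & 0
\end{pmatrix}
```

Columns 1, 2 and 4 have determinant -b_{1} a_{bc} b_{2}^{2} ≠ 0, so rank \mathcal{C} = 3 = N and this small system is controllable with inputs on a and c.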
To analyze the relationship of all nodes in control, we first define the control adjacency of a node pair: for a network G and any maximum matching M, node a is said to be control adjacent to node b if there exists a node c connecting a and b with an unmatched edge e_{ca} and a matched edge e_{cb}. Control adjacency reveals an important property of the control correlation of nodes: a node that is control adjacent to an input node must appear in another MIS. For example, in Fig. 1A, input node a is control adjacent to node b, and the two nodes alternately appear in MIS {a, c} and MIS {b, c}. We prove that this property holds for any network and call it the Exchange Theorem: for any MIS D of a network G, if an input node n ∈ D is control adjacent to another node m, then D’ = D\{n} ∪ {m} is also an MIS of G (see Supplementary Information). This means that a new MIS D’ can be obtained from MIS D by exchanging a node of D with its control adjacent neighbor.
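The definition and the Exchange Theorem can be checked by brute force on a tiny example. The sketch below is illustrative only: the three-node network (edges c → a and c → b) is a hypothetical example chosen so that its MISs are {a, c} and {b, c}, the helper names are made up here, and it assumes that the MIS of a maximum matching is the set of nodes without a matched in-edge.

```python
from itertools import combinations

def is_matching(edge_subset):
    """True if no two directed edges share a source or share a target."""
    sources = [u for u, _ in edge_subset]
    targets = [v for _, v in edge_subset]
    return len(set(sources)) == len(sources) and len(set(targets)) == len(targets)

def maximum_matchings(edges):
    """Enumerate all maximum matchings by brute force (tiny graphs only)."""
    for size in range(len(edges), 0, -1):
        found = [frozenset(m) for m in combinations(edges, size) if is_matching(m)]
        if found:
            return found
    return [frozenset()]

def input_set(nodes, matching):
    """Assumed MIS of a matching: the nodes that have no matched in-edge."""
    return frozenset(nodes) - {v for _, v in matching}

def control_adjacent_pairs(edges, matching):
    """Pairs (a, b): some node c has an unmatched edge c->a and a matched edge c->b."""
    return {(a, b) for c, b in matching for c2, a in edges
            if c2 == c and (c2, a) not in matching}

nodes = ["a", "b", "c"]
edges = [("c", "a"), ("c", "b")]
all_mis = {input_set(nodes, m) for m in maximum_matchings(edges)}

M = frozenset({("c", "b")})   # one maximum matching
D = input_set(nodes, M)       # its MIS: {a, c}
# Exchange every input node n with each node m it is control adjacent to.
exchanged = [D - {n} | {m} for n, m in control_adjacent_pairs(edges, M) if n in D]
```

Here `exchanged` contains the single set {b, c}, which is also found by the brute-force enumeration in `all_mis`, as the theorem predicts for this example.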
Then, we define the input graph G_{D}(V, E_{D}) based on the control adjacency between nodes, where V is the node set, E_{D} is the edge set, and e_{ij} ∈ E_{D} if node i is control adjacent to node j (Fig. 1C). The input graph thus captures all control adjacency relationships among nodes.
The input graph has several potential applications in analyzing the controllability of complex networks. We find that the degree of a node in the input graph reflects its substitutability in control. Based on the exchange theorem, for each edge of an input node in the input graph, we can find a substitute node and obtain a new MIS with only one node replaced. This has important practical value. For example, when an input node of an MIS is no longer available due to a physical constraint or an attack, we can immediately find a new MIS by replacing the node with one of its neighbors in the input graph. This method makes the minimum change to the original control scheme, namely a single edge. Note that the computational complexity of the above process is only O(1), which yields a significantly improved method for obtaining a new MIS in comparison with the state-of-the-art approach^{10}.
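A minimal sketch of this substitution, under assumptions made here: the matching is stored in a hypothetical dictionary `source_of` that maps each matched node to the source of its matched in-edge, `edges` is a set so that membership tests take constant expected time, and the small network is invented for illustration. The swap removes one dictionary entry and adds one.

```python
def substitute_input(source_of, edges, n, m):
    """Replace input node n by its control-adjacent neighbour m, in place.

    Requires a node c with an unmatched edge c->n and the matched edge c->m.
    """
    if n in source_of:
        raise ValueError(f"{n!r} is matched, so it is not an input node")
    c = source_of.get(m)
    if c is None or (c, n) not in edges:
        raise ValueError(f"{n!r} is not control adjacent to {m!r}")
    del source_of[m]   # drop matched edge c->m: m becomes an input node
    source_of[n] = c   # add matched edge c->n: n is no longer an input node

# Hypothetical network: c->a, c->b, b->d with maximum matching {c->b, b->d}.
nodes = {"a", "b", "c", "d"}
edges = {("c", "a"), ("c", "b"), ("b", "d")}
source_of = {"b": "c", "d": "b"}

before = nodes - set(source_of)   # input nodes before the swap
substitute_input(source_of, edges, "a", "b")
after = nodes - set(source_of)    # input nodes after the swap
```

In this example the input set changes from {a, c} to {b, c}, and only the matched edge out of c is modified.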
Connected components of input graph
Next we analyze the connectivity of the input graph. Similar to the concepts of path and reachable set in graph theory, we define a control path p as a node sequence in which consecutive nodes are control adjacent. The control-reachable set C(n) of node n is defined as the set of all nodes that are reachable from node n through any control path (Fig. 1B). Based on the above definitions, we prove the following Adjacency Corollary: 1. For any MIS D and an input node n ∈ D, all nodes of C(n) must be possible input nodes; 2. For any MIS D, if node m does not belong to the control-reachable set of any input node of D, then m must be a redundant node and never appears in any MIS (see Supplementary Information). Therefore, a node is a possible input node if it is control reachable from an input node of an MIS.
The adjacency corollary shows that the control-reachable sets of possible input nodes and redundant nodes never intersect. Thus, there are only two types of connected components in the input graph: 1. Input Component (IC), which contains at least one input node; and 2. Matched Component (MC), which contains no input node. We call these two types of connected components control components. It follows that all nodes of an IC are possible input nodes and all nodes of an MC are redundant nodes.
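The classification into control components can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the function name is made up, the input graph is treated as undirected when computing connectivity, and the example edges and MIS were worked out by hand for a hypothetical six-node network (c → a, c → b, d → b, d → e, e → f with matched edges c → a, d → b, e → f).

```python
from collections import deque

def control_components(nodes, input_graph_edges, mis):
    """Split the input graph into connected components and label each IC or MC."""
    nbrs = {v: set() for v in nodes}
    for i, j in input_graph_edges:   # treated as undirected for connectivity
        nbrs[i].add(j)
        nbrs[j].add(i)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        comp, queue = {start}, deque([start])
        while queue:                 # breadth-first search from `start`
            v = queue.popleft()
            for w in nbrs[v] - seen:
                seen.add(w)
                comp.add(w)
                queue.append(w)
        # A component with at least one input node is an IC, otherwise an MC.
        label = "IC" if comp & set(mis) else "MC"
        components.append((label, comp))
    return components

nodes = ["a", "b", "c", "d", "e", "f"]
input_graph_edges = [("b", "a"), ("e", "b")]   # hand-computed control adjacencies
mis = {"c", "d", "e"}                          # input nodes of the chosen matching
components = control_components(nodes, input_graph_edges, mis)
```

For this example the component {a, b, e} is an IC because it contains input node e, the isolated input nodes c and d form ICs of size one, and f is an MC of size one.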
We analyze the control components of some real networks and find that the complex control relationships of these networks can be reduced to a few control components of their input graphs, e.g., the Little Rock (Fig. 2A) and political blog (Fig. 2B) networks. Furthermore, we find that many real networks have a giant control component in their input graph (Table 1 and Fig. S7), suggesting that the majority of nodes are tightly connected by control adjacency and have the same type in control. The giant control component can be either a giant IC or a giant MC.
To further understand the origin of the giant control component, we analyze the size of the largest control component of synthetic networks. We find that the size of the largest control component increases with the average degree of a network (Fig. 2C), whereas the number of control components decreases monotonically (Fig. 2D). Therefore, there exists only one giant control component in dense networks (Fig. 2E). The type of a node in control is determined by the type of the control component to which it belongs, which depends on whether the control component contains an input node. If the giant control component contains at least one input node, most nodes of the network will be possible input nodes; if the giant control component contains no input node, most nodes will be redundant nodes. Thus, we observe the bifurcation phenomenon^{12} (Fig. 2F) that emerges in dense networks. The formation of the giant control component in the input graph therefore provides a clear explanation for the origin of the bifurcation phenomenon in dense networks.
Altering the type of nodes in control
Owing to the economic or physical constraints that exist in many actual control scenarios, we may need some specified nodes to be input nodes. If the node is a possible input node, we can easily find an MIS that contains it. However, if the node is a redundant node, we must alter the structure of the network to turn the node into a possible input node.
Since the nodes of the same control component have the same control type, we only need to alter the type of the control component in which the target node lies. This problem can be solved by adding new edges to the network. For example, if we link several input nodes to an MC, the nodes in the MC will be turned into possible input nodes and the MC will be turned into an IC. Conversely, if we match all input nodes of an IC, it will be turned into an MC and all of its nodes will become redundant nodes (Fig. 3A,B,E).
Therefore, we present an algorithm to alter the type of a control component (see Methods). To quantify the efficiency of the algorithm, we investigate the number of added edges p in both ER random and scale-free networks. The results (Fig. 3C,D) show that p decreases significantly with the average degree ⟨k⟩, and the proportion of changed possible input nodes Δn_{D} increases monotonically, which indicates that it is easier to alter the control component of a denser network.
Surprisingly, the giant control component of a few networks can be changed by adding only one edge (Fig. 3E,F). For example, the control type of most nodes of some real networks (e.g., the Facebook and Amazon networks shown in Table 1) can be altered by only one added edge. All these networks have a special giant MC in their input graphs, whose nodes are not linked by any unsaturated node (a node without a matched out-edge) in the original network. We call such a component a saturated matched component (SMC). If we link an input node to an SMC, most nodes of the SMC become control reachable from the input node and are turned into possible input nodes. However, when an MC is linked by one or more unsaturated nodes in the original network, which we call an unsaturated matched component (UMC), we need to match all the unsaturated nodes to change its type. The results show that the cost of altering an IC to a UMC (Fig. 3C) is similar to that of altering a UMC to an IC (Fig. 3D).
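The distinction between an SMC and a UMC can be sketched as a simple check. This is a sketch under assumptions made here: the function name, the `source_of` dictionary (matched node mapped to the source of its matched in-edge) and the two small networks are hypothetical, and "linked by" is read as "has an edge pointing into the component".

```python
def classify_matched_component(edges, source_of, mc_nodes):
    """Label a matched component SMC or UMC and list the unsaturated linking nodes."""
    saturated = set(source_of.values())   # nodes that have a matched out-edge
    linking = sorted({u for u, v in edges if v in mc_nodes and u not in saturated})
    return ("UMC" if linking else "SMC"), linking

# Hypothetical network: s->u, s->v, t->v with maximum matching {s->u, t->v}.
# Node v is control adjacent to u, and {u, v} contains no input node, so it is an MC.
source_of = {"u": "s", "v": "t"}
mc = {"u", "v"}
smc_edges = [("s", "u"), ("s", "v"), ("t", "v")]
umc_edges = smc_edges + [("w", "v")]      # w has no matched out-edge

kind_smc = classify_matched_component(smc_edges, source_of, mc)
kind_umc = classify_matched_component(umc_edges, source_of, mc)
```

In the first network every node pointing into {u, v} is saturated, so the component is an SMC; adding the edge from the unsaturated node w makes it a UMC.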
Furthermore, we find that the size of MIS significantly decreases after altering the type of the giant control component (Fig. S8), which means that the method can also be used to optimize the controllability of complex networks^{19,20}.
Discussion
In summary, we developed the input graph, a fundamental structure that reveals the control relationship of nodes and MISs. Our key finding, that control adjacent nodes have the same type in control, allows us to reveal the inherent control correlation of nodes and offers a general mechanism to manipulate the control type of nodes or design a suitable control scheme. Furthermore, networks with a giant control component display a surprising type transition in response to well-chosen structural perturbations, which is ubiquitous in dense networks across multiple disciplines.
The input graph presented here is a starting point for deeper investigation of the control properties of networks. It paves the way to analyzing the properties of all MISs of a network, such as enumerating all MISs^{14}, estimating the node capacity in MISs^{10}, or finding the optimum MIS under different control constraints or with specific node properties^{21}. Furthermore, the structural properties of the input graph, such as node degree and connected components, bear on several important topics in the controllability of a network. We believe that other structural properties of the input graph, such as average distance and diameter, are also worth deep investigation in multiple disciplines, such as brain networks^{22}, protein interaction networks^{18} and others.
However, besides the input nodes in an MIS, there may exist other nodes that also need to receive control signals. These nodes form a cycle^{4} in the network and are not accessible from any input node in the current MIS. Therefore, to fully control the network, a control signal needs to be injected into any one node of the cycle, and this signal can be shared with any input node^{4}. Furthermore, there may exist more than one input node within the same input component. Any of these input nodes can be exchanged with its control reachable node based on the exchange theorem. However, if two or more input nodes share the same neighbor in the input graph, they cannot be substituted at the same time.
Furthermore, we design an algorithm to alter the control type of most nodes of a network with small structural perturbations, which is, as far as we know, the first attempt to convert the control mode^{12,13} of a network. Many real networks, especially biological networks, are incomplete and may have many missing edges. This means that newly discovered edges may alter the control type of existing nodes dramatically. However, these newly discovered edges will never weaken the performance of our algorithm, because they will only increase the size of the giant connected component of the input graph.
Overall, these findings will improve our understanding of the control principles of complex networks and may be useful in controlling various real complex systems, such as drug designs^{23,24,25}, financial markets^{16,26} and biological networks^{18,22}.
Methods
Construct input graph
To build the input graph G_{D}(V, E_{D}) of a directed network G(V, E), we need to find all control adjacent edges E_{D} between nodes. Based on the adjacency corollary, there is no control adjacency between any possible input node and any redundant node. Therefore, the edge set E_{D} of the input graph is composed of two subsets: the set of edges E_{Di} between possible input nodes and the set of edges E_{Dr} between redundant nodes.
The edge set E_{Di} and the set of all possible input nodes V_{PD} can be found as follows:
1. Find a maximum matching M and the corresponding MIS D; let the candidate set of all possible input nodes V_{PD} = D;
2. Select a node n of D, let D = D − {n}; for all in-edges of node n, find the corresponding control adjacent neighbors {c_{1}, c_{2}, …, c_{i}};
3. Let D = D + {c_{1}, c_{2}, …, c_{i}}, V_{PD} = V_{PD} + {c_{1}, c_{2}, …, c_{i}} and E_{Di} = E_{Di} + {e(n, c_{1}), …, e(n, c_{i})};
4. Repeat steps 2 and 3 until D is empty.
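The steps above can be sketched as a breadth-first search from the input nodes. This is a sketch under stated assumptions rather than the authors' code: the maximum matching is computed with a plain augmenting-path routine instead of a faster algorithm, the function and variable names are made up, the example network is hypothetical, and a visited check is added so that each node is expanded only once and the loop terminates.

```python
from collections import deque

def maximum_matching(nodes, edges):
    """Augmenting-path maximum matching; returns {matched node: source of its in-edge}."""
    out_nbrs = {u: [] for u in nodes}
    for u, v in edges:
        out_nbrs[u].append(v)
    source_of = {}

    def try_augment(u, seen):
        for v in out_nbrs[u]:
            if v in seen:
                continue
            seen.add(v)
            if v not in source_of or try_augment(source_of[v], seen):
                source_of[v] = u
                return True
        return False

    for u in nodes:
        try_augment(u, set())
    return source_of

def possible_input_nodes(nodes, edges, source_of):
    """Return (V_PD, E_Di): all possible input nodes and the edges found between them."""
    in_nbrs = {v: [] for v in nodes}
    for u, v in edges:
        in_nbrs[v].append(u)
    target_of = {u: v for v, u in source_of.items()}   # source -> its matched target
    mis = [v for v in nodes if v not in source_of]     # step 1: unmatched nodes
    possible, e_di, queue = set(mis), set(), deque(mis)
    while queue:                                       # steps 2-4
        n = queue.popleft()
        for c in in_nbrs[n]:
            b = target_of.get(c)   # c->n is unmatched, c->b is matched
            if b is None or b == n:
                continue
            e_di.add((n, b))
            if b not in possible:                      # visited check (added here)
                possible.add(b)
                queue.append(b)
    return possible, e_di

nodes = ["a", "b", "c", "d", "e", "f"]
edges = [("c", "a"), ("c", "b"), ("d", "b"), ("d", "e"), ("e", "f")]
source_of = maximum_matching(nodes, edges)
possible, e_di = possible_input_nodes(nodes, edges, source_of)
```

In this hypothetical network the edge e → f belongs to every maximum matching, so f is the only redundant node and it is never reached by the search.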
The edge set E_{Dr} between redundant nodes can be found as follows:
1. Let V_{temp} = V − V_{PD};
2. Select a node n of V_{temp}, let V_{temp} ← V_{temp} − {n};
3. Let the matched in-edge of n be e(m, n), find all out-edges {e(m, c_{1}), …, e(m, c_{i})} of node m; let E_{Dr} = E_{Dr} + {e(c_{1}, n), …, e(c_{i}, n)} and V_{temp} ← V_{temp} + {c_{1}, c_{2}, …, c_{i}};
4. Repeat steps 2 and 3 until V_{temp} is empty.
Next we analyze the computational complexity of the above method. Let N be the number of nodes and L the number of edges of the directed network G(V, E). First, the complexity of finding a maximum matching is O(N^{0.5}L)^{9}. Second, each node requires a breadth-first search (BFS) to find its control-reachable set, whose computational complexity is O(L). In the worst case, we need to find the control-reachable set of all nodes, and the complexity is O(NL). Therefore, the total computational complexity of building an input graph is O(NL).
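Written out, the bound combines the two stages. The symbol T(N, L) for the total running time is introduced here for illustration only:

```latex
T(N, L)
  = \underbrace{O\!\left(\sqrt{N}\,L\right)}_{\text{maximum matching}}
  + \underbrace{N \cdot O(L)}_{\text{one BFS per node, worst case}}
  = O(NL)
```

The matching term is dominated by the search term because \sqrt{N} \le N.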
Altering an IC to an MC
For a network G(V, E), let B(V_{out}, V_{in}, E) be the corresponding bipartite graph and IC_{alter} be the target input component. The basic idea of altering an IC to an MC is to match all input nodes of the IC by adding edges. The detailed steps are as follows:
1. Find all unmatched nodes S corresponding to the current maximum matching. Let S_{1} = S ∩ V_{out}, S_{2} = S ∩ V_{in} ∩ IC_{alter};
2. Select a node n ∈ S_{2} and a node m ∈ S_{1}; add an edge e_{mn} to G; remove nodes n and m from S_{2} and S_{1}, respectively;
3. Repeat step 2 until S_{2} is empty.
The correctness of the above method is shown as follows. First, we prove that |S_{1}| ≥ |S_{2}|. By the definition of B(V_{out}, V_{in}, E), |V_{in}| = |V_{out}|. Because every edge of the maximum matching starts at a node of V_{out} and ends at a node of V_{in}, both sides have the same number of unmatched nodes, so |S_{1}| = |S ∩ V_{out}| = |S ∩ V_{in}| ≥ |S ∩ V_{in} ∩ IC_{alter}| = |S_{2}|, which means that for every node of S_{2} we can find a corresponding node in S_{1}. When we add an edge e_{mn} to G in step 2, the matching M’ = M + e_{mn} must be a maximum matching of G’ = G + e_{mn}: n and m are not matched by M, so M’ is a valid matching with one more edge than M, and adding a single edge can increase the size of a maximum matching by at most one. Therefore, nodes n and m are matched by M’. When all input nodes of IC_{alter} are matched, IC_{alter} is turned into an SMC.
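The procedure can be sketched in a few lines. This is an illustrative sketch, not the authors' implementation: the function name, the `source_of` dictionary (matched node mapped to the source of its matched in-edge) and the example network are hypothetical, and nodes are paired simply in list order.

```python
def alter_ic_to_mc(nodes, edges, source_of, ic_nodes):
    """Match every input node of the target IC by adding one edge per input node."""
    matched_sources = set(source_of.values())
    s1 = [u for u in nodes if u not in matched_sources]                # unmatched in V_out
    s2 = [v for v in nodes if v not in source_of and v in ic_nodes]    # input nodes of the IC
    new_edges, new_source_of, added = list(edges), dict(source_of), []
    for m, n in zip(s1, s2):          # |S1| >= |S2|, so every n gets a partner m
        new_edges.append((m, n))      # add edge e_mn to G
        new_source_of[n] = m          # e_mn joins the matching
        added.append((m, n))
    return new_edges, new_source_of, added

# Hypothetical network with maximum matching {c->a, d->b, e->f};
# its input component {a, b, e} contains the single input node e.
nodes = ["a", "b", "c", "d", "e", "f"]
edges = [("c", "a"), ("c", "b"), ("d", "b"), ("d", "e"), ("e", "f")]
source_of = {"a": "c", "b": "d", "f": "e"}

new_edges, new_source_of, added = alter_ic_to_mc(nodes, edges, source_of, {"a", "b", "e"})
remaining_inputs = [v for v in nodes if v not in new_source_of]
```

Here one added edge, a → e, matches the only input node of the component, and the remaining input nodes c and d lie outside it.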
Altering an MC to an IC
For a network G(V, E) with an MC, let B(V_{out}, V_{in}, E) be the corresponding bipartite graph. The basic idea of altering an MC to an IC is to link several input nodes to the MC so that the MC is turned into an IC. For a network with a UMC, if we directly link an input node to the UMC, the input node will be matched with an unsaturated node of the UMC. Therefore, we first need to alter the UMC to an SMC and then turn the SMC into an IC. The method of altering a UMC to an SMC is similar to that of altering an IC to an SMC, and is as follows:
1. Find all unmatched nodes S corresponding to the current maximum matching. Let S_{1} = S ∩ V_{in}, S_{2} = S ∩ V_{out} ∩ UMC_{alter};
2. Select a node n ∈ S_{2} and a node m ∈ S_{1}; add an edge e_{nm} to G; remove nodes n and m from S_{2} and S_{1}, respectively;
3. Repeat step 2 until S_{2} is empty.
For a network with a giant SMC, we only need to link an input node to the SMC, and the part that is control reachable from the input node will be turned into an IC. Therefore, in order to maximize the size of the resulting IC, we need to find the most “influential” node of the SMC based on the size of its control-reachable set. The algorithm is as follows:
1. Let SMC_{alter} be the target saturated matched component; compute the size of the control-reachable set of each node in SMC_{alter}, and let the node with the maximum size be n;
2. Let e_{mn} be the matched edge pointing to node n; select an input node d and add edge e_{md} to G.
After adding the edge e_{md}, the nodes of the control-reachable set of n will be turned into possible input nodes because the input node d is control adjacent to node n. If we want to convert all nodes of an SMC, we need to find the minimal set of input nodes that can reach all nodes of the SMC. The problem can be solved by a simple greedy algorithm.
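The text does not spell out the greedy algorithm, so the following is only one plausible reading: a standard greedy set-cover heuristic that repeatedly picks the SMC node whose control-reachable set covers the most still-uncovered nodes. The function name, the `reach_sets` dictionary and the example data are hypothetical, and a greedy choice of this kind gives a small cover but is not guaranteed to be minimal.

```python
def greedy_cover(reach_sets, targets):
    """Pick nodes whose control-reachable sets together cover `targets`."""
    uncovered = set(targets)
    chosen = []
    while uncovered:
        # Node covering the largest number of still-uncovered targets.
        best = max(reach_sets, key=lambda n: len(reach_sets[n] & uncovered))
        gain = reach_sets[best] & uncovered
        if not gain:          # the remaining targets cannot be reached at all
            break
        chosen.append(best)
        uncovered -= gain
    return chosen, uncovered

# Hypothetical control-reachable sets of three candidate nodes in an SMC.
reach_sets = {
    "p": {"p", "x", "y"},
    "q": {"q", "y"},
    "r": {"r", "q", "z"},
}
targets = {"p", "q", "r", "x", "y", "z"}
chosen, uncovered = greedy_cover(reach_sets, targets)
```

Each chosen node would then be linked to an input node as in step 2 above; in this example two links, anchored at p and r, cover the whole component.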
Additional Information
How to cite this article: Zhang, X. et al. Input graph: the hidden geometry in controlling complex networks. Sci. Rep. 6, 38209; doi: 10.1038/srep38209 (2016).
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
1. Kalman, R. E. Mathematical description of linear dynamical systems. Journal of the Society for Industrial and Applied Mathematics, Series A: Control. 1, 152–192 (1963).
2. Luenberger, D. G. Introduction to Dynamic Systems: Theory, Models, & Applications. Proceedings of the IEEE. 69, 1173 (1979).
3. Murota, K. Matrices and Matroids for Systems Analysis (Springer Science & Business Media, 2000).
4. Liu, Y. Y., Slotine, J. J. & Barabasi, A. L. Controllability of complex networks. Nature. 473, 167 (2011).
5. Nepusz, T. & Vicsek, T. Controlling edge dynamics in complex networks. Nat Phys. 8, 568–573 (2012).
6. Menichetti, G., Dall’Asta, L. & Bianconi, G. Network Controllability Is Determined by the Density of Low In-Degree and Out-Degree Nodes. Physical Review Letters. 113 (2014).
7. Posfai, M., Liu, Y. Y., Slotine, J. J. & Barabasi, A. L. Effect of correlations on network controllability. Sci. Rep. 3, 1067 (2013).
8. Ruths, J. & Ruths, D. Control profiles of complex networks. Science. 343, 1373 (2014).
9. Zdeborová, L. & Mézard, M. The number of matchings in random graphs. Journal of Statistical Mechanics Theory & Experiment. 2006 (2006).
10. Jia, T. & Barabasi, A. L. Control capacity and a random sampling method in exploring controllability of complex networks. Scientific reports. 3, 2354 (2013).
11. Valiant, L. G. The complexity of computing the permanent. Theoretical Computer Science. 8, 189–201 (1979).
12. Jia, T. & Barabasi, A. L. Emergence of bimodality in controlling complex networks. Nature communications. 4, 2002 (2013).
13. Jia, T. & Posfai, M. Connecting Core Percolation and Controllability of Complex Networks. Scientific Reports. 4, 5379 (2014).
14. Zhang, X., Lv, T., Yang, X. & Zhang, B. Structural Controllability of Complex Networks Based on Preferential Matching. PloS one. 9, e112039 (2014).
15. Pasqualetti, F., Zampieri, S. & Bullo, F. Controllability Metrics, Limitations and Algorithms for Complex Networks. IEEE Transactions on Control of Network Systems. 1, 40–52 (2014).
16. Delpini, D. et al. Evolution of controllability in interbank networks. Scientific reports. 3, 1626 (2013).
17. Schweitzer, F. Economic networks: The new challenges. Science. 325, 422–425 (2009).
18. Wuchty, S. Controllability in protein interaction networks. Proceedings of the National Academy of Sciences. 111, 7156–7160 (2014).
19. Wang, W. X., Ni, X., Lai, Y. C. & Grebogi, C. Optimizing controllability of complex networks by minimum structural perturbations. Physical Review E. 85, 026115 (2012).
20. Xiao, Y., Lao, S., Hou, L. & Bai, L. Edge orientation for optimizing controllability of complex networks. Physical Review E. 90, 042804 (2014).
21. Yan, G., Ren, J., Lai, Y. C., Lai, C. H. & Li, B. Controlling complex networks: How much energy is needed? Physical Review Letters. 108, 218703 (2012).
22. Kumar, A., Vlachos, I., Aertsen, A. & Boucsein, C. Challenges of understanding brain function by selective modulation of neuronal subpopulations. Trends in neurosciences. 36, 579–586 (2013).
23. Yıldırım, M. A., Goh, K. I., Cusick, M. E., Barabási, A. L. & Vidal, M. Drug-target network. Nature biotechnology. 25, 1119–1126 (2007).
24. Wu, L., Shen, Y., Li, M. & Wu, F. X. Network Output Controllability-Based Method for Drug Target Identification. IEEE Transactions on NanoBioscience. 14, 184–191 (2015).
25. Asgari, Y., Salehzadeh-Yazdi, A., Schreiber, F. & Masoudi-Nejad, A. Controllability in cancer metabolic networks according to drug targets as driver nodes. PloS one. 8, e79397 (2013).
26. Battiston, S., Puliga, M., Kaushik, R., Tasca, P. & Caldarelli, G. Debtrank: Too central to fail? financial networks, the fed and systemic risk. Scientific reports. 2 (2012).
Acknowledgements
We thank Guangyan Zhang and Bin Zhang for discussions. This work was supported by the Fundamental Research Funds for the Central Universities of China under grant number N140404011, the Natural Science Foundation of China under grant numbers 60903009, 71272216 and 91546110, the Special Program for Applied Research on Super Computation of the NSFC-Guangdong Joint Fund (the second phase), and the Plan Project for Youth Scholar Backbone of General Colleges and Universities of Heilongjiang under grant number 1253G017.
Author information
Affiliations
School of Computer Science and Engineering, Northeastern University, Shenyang 110819, China
 Xizhe Zhang
 & Yuanyuan Pu
College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China
 Tianyang Lv
IT Center, National Audit Office, Beijing 100830, China
 Tianyang Lv
Contributions
X.Z.Z. was the lead writer of the manuscript. X.Z.Z. designed research, proved the theorems, analyzed the data and wrote the paper. T.Y.L. wrote the paper and participated in designing the research. Y.Y.P. performed the experiments. All authors reviewed the manuscript.
Competing interests
The authors declare no competing financial interests.
Corresponding authors
Correspondence to Xizhe Zhang or Tianyang Lv.
Rights and permissions
This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/