Trade-offs between driving nodes and time-to-control in complex networks

Recent advances in control theory provide us with efficient tools to determine the minimum number of driving (or driven) nodes needed to steer a complex network towards a desired state. In practice, we often need to do so within a given time window, so it is important to understand the trade-offs between the minimum number of driving/driven nodes and the minimum time required to reach a desired state. We therefore introduce the notion of actuation spectrum to capture such trade-offs, and we use it to find that in many complex networks only a small fraction of driving (or driven) nodes is required to steer the network to a desired state within a relatively small time window. Furthermore, our empirical studies reveal that, even though synthetic network models are designed to present structural properties similar to those observed in real networks, their actuation spectra can be dramatically different. This finding supports the need to develop new synthetic network models able to replicate the controllability properties of real-world networks.

external input signal. In addition, let us assume that the input matrix $B \in \mathbb{R}^{N \times P}$ has rank $P$ (i.e., full column rank). Notice that if $B$ does not have full column rank, then some columns of $B$ can be written as linear combinations of the remaining ones. Consequently, removing the linearly dependent columns of $B$ (and their corresponding inputs) would not affect our ability to control the network. If the system in (1) is controllable, then its controllability matrix has rank $N$ (i.e., $N$ linearly independent columns). In what follows, we introduce the concept of controllability index to account for the time-to-control.
Let $b_i$ be the $i$th column of $B$; then the controllability matrix $C(A, B; N)$ can be written as follows:
$$C(A, B; N) = [\, b_1 \ \cdots \ b_P \ \ A b_1 \ \cdots \ A b_P \ \ \cdots \ \ A^{N-1} b_1 \ \cdots \ A^{N-1} b_P \,].$$
To define the controllability index, for each $i \in \{1, \dots, P\}$, we define the set of column vectors $S_i = \{b_i, A b_i, \dots, A^{\tau_i - 1} b_i\}$, where $\tau_i$ is the maximum integer for which the set of column vectors in $S_i$ is linearly independent. Therefore, it can be proved that if $C(A, B; N)$ has rank $N$, then [...] The controllability index of the pair $(A, B)$, describing the dynamical network in (1), is defined as [1]
$$\tau(A, B) = \max\{\tau_1, \tau_2, \dots, \tau_P\}.$$
Equivalently, if $(A, B)$ is controllable, the controllability index $\tau(A, B)$ is the least integer $T$ such that $\operatorname{rank}(C(A, B; T)) = N$.
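The rank characterization above can be checked numerically. The following Python sketch searches for the least $T$ at which the partial controllability matrix reaches rank $N$. It assumes that $C(A, B; T) = [B \ AB \ \cdots \ A^{T-1}B]$; the helper names `rank`, `matmul`, and `controllability_index` are hypothetical and not taken from the original work. Exact rational arithmetic is used only to avoid choosing a numerical rank tolerance.

```python
from fractions import Fraction

def rank(rows):
    """Exact rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def matmul(X, Y):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def controllability_index(A, B):
    """Least T with rank [B, AB, ..., A^(T-1) B] = N; None if (A, B) is not controllable."""
    n = len(A)
    C = [row[:] for row in B]   # partial controllability matrix C(A, B; T)
    power = B                   # most recent block A^(T-1) B
    for T in range(1, n + 1):
        if rank(C) == n:
            return T
        power = matmul(A, power)
        C = [r + p for r, p in zip(C, power)]  # append the next block column-wise
    return None

# Directed chain x1 -> x2 -> x3 (A[j][i] != 0 encodes an edge from x_i to x_j).
chain = [[0, 0, 0], [1, 0, 0], [0, 1, 0]]
print(controllability_index(chain, [[1], [0], [0]]))        # one input at x1
print(controllability_index(chain, [[1, 0], [0, 0], [0, 1]]))  # inputs at x1 and x3
```

In this toy chain, adding a second input shortens the time-to-control, which is the kind of trade-off the controllability index is meant to capture.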
From a control point of view, the controllability index $\tau(A, B)$ is equal to the minimum number of time steps required to steer the system from an initial state $x_0$ to an arbitrary desired state $x_d \in \mathbb{R}^N$. In particular, if the system is controllable in $T$ time steps, and the initial state is the origin (i.e., $x_0 = 0$), then the input signal $\{u[t]\}_{t=0}^{T-1}$ that steers the system to $x_d$ can be explicitly computed as [5]: [...], where $u_{0:T-1} = [u[0], u[1], \dots, u[T-1]]$ is a vector in $\mathbb{R}^{TP}$ containing a concatenation of the input signal. Notice that, for $T \geq \tau(A, B)$, the matrix inside the brackets in (2) is invertible and $u_{0:T-1}$ is well-defined.
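To make the explicit input computation concrete, here is a minimal Python sketch. It rests on two assumptions that are not confirmed by this section: that (1) is the discrete-time system $x[t+1] = A x[t] + B u[t]$, and that the input is the standard minimum-norm solution $u_{0:T-1} = G^\top (G G^\top)^{-1} x_d$ with $G = [A^{T-1}B \ \cdots \ AB \ B]$. The block ordering in the original expression (2) may differ, and all function names are hypothetical.

```python
from fractions import Fraction

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def solve(M, b):
    """Solve M y = b exactly by Gauss-Jordan elimination (M must be invertible)."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(bi)] for row, bi in zip(M, b)]
    for c in range(n):
        pivot = next(i for i in range(c, n) if aug[i][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        head = aug[c][c]
        aug[c] = [v / head for v in aug[c]]
        for i in range(n):
            if i != c and aug[i][c] != 0:
                f = aug[i][c]
                aug[i] = [a - f * p for a, p in zip(aug[i], aug[c])]
    return [row[n] for row in aug]

def min_energy_input(A, B, x_d, T):
    """Inputs u[0], ..., u[T-1] steering x[0] = 0 to x[T] = x_d with least squared norm.

    Assumes T is at least the controllability index, so that G G^T is invertible.
    """
    blocks, power = [], B
    for _ in range(T):
        blocks.insert(0, power)          # builds [A^(T-1) B, ..., A B, B]
        power = matmul(A, power)
    G = [sum((blk[i] for blk in blocks), []) for i in range(len(A))]
    y = solve(matmul(G, transpose(G)), x_d)
    u = [sum(g * yi for g, yi in zip(col, y)) for col in zip(*G)]  # u = G^T y
    p = len(B[0])
    return [u[t * p:(t + 1) * p] for t in range(T)]

def simulate(A, B, inputs):
    """Run x[t+1] = A x[t] + B u[t] from the origin and return the final state."""
    x = [Fraction(0)] * len(A)
    for u in inputs:
        x = [sum(a * xi for a, xi in zip(A[i], x)) +
             sum(b * uj for b, uj in zip(B[i], u)) for i in range(len(A))]
    return x

chain = [[0, 0, 0], [1, 0, 0], [0, 1, 0]]
inputs = min_energy_input(chain, [[1], [0], [0]], [1, 2, 3], T=3)
print(simulate(chain, [[1], [0], [0]], inputs) == [1, 2, 3])
```

Simulating the system with the computed inputs is a simple way to confirm that the target state is reached in exactly $T$ steps.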

Graph Theory and Structural Systems
The following standard terminology and notions from graph theory can be found, for instance, in [3].
Let $D(\bar{A}) = (\mathcal{X}, \mathcal{E}_{\mathcal{X},\mathcal{X}})$ be the state digraph, i.e., the digraph representation of $\bar{A} \in \{0, \star\}^{N \times N}$ (the structural matrix associated with $A$ in (1)), where the node set $\mathcal{X}$ has its nodes labeled by the state variables (also referred to as state nodes) and $\mathcal{E}_{\mathcal{X},\mathcal{X}} = \{(x_i, x_j) : \bar{A}_{ji} \neq 0\}$ denotes the set of edges connecting state nodes. Similarly, we define the system digraph $D(\bar{A}, \bar{B}) = (\mathcal{X} \cup \mathcal{U}, \mathcal{E}_{\mathcal{X},\mathcal{X}} \cup \mathcal{E}_{\mathcal{U},\mathcal{X}})$, where $\bar{B} \in \{0, \star\}^{N \times P}$ represents the structural matrix associated with $B$ in (1), $\mathcal{U}$ represents the set of $P$ nodes labeled by the input variables (also referred to as input nodes), and $\mathcal{E}_{\mathcal{U},\mathcal{X}} = \{(u_i, x_j) : \bar{B}_{ji} \neq 0\}$ denotes the set of edges from input nodes to state nodes. [...]
Two digraphs are disjoint if they do not share any node. A subgraph of $D_s$ with some property P (e.g., being connected) is maximal if there is no other subgraph containing [...] called the root (respectively, the end) of the path. An elementary path is said to be open if $v_1 \neq v_l$. An elementary path for which $v_1 = v_l$ is called a cycle (in particular, a node with an edge to itself, i.e., a self-loop, is a cycle).
In addition, a digraph $D$ is said to be strongly connected if there exists an elementary path from each node to every other node. A strongly connected component (SCC) is a maximal subgraph $D_S = (V_S, E_S)$ of $D$ for which the following property is satisfied: for every pair of nodes $v, w \in V_S$, there exists a path in $D_S$ from $v$ to $w$. Similarly, if the graph is undirected, a strongly connected graph is simply said to be connected. A digraph is said to be weakly connected if, after disregarding edge directions, the resulting graph is connected. A directed tree is a digraph such that, after disregarding edge directions, the resulting graph is connected and does not contain cycles. Furthermore, a collection of disjoint directed trees is referred to as a directed forest. Finally, the graph partition (GP) problem consists in determining $\kappa$ weakly connected [...]
In what follows, we define specific subgraphs that are relevant in structural system theory [2], [3]. Given a state digraph $D(\bar{A})$ and a system digraph $D(\bar{A}, \bar{B})$, we define the following special subgraphs [6]:
• State Stem - An isolated node or an open elementary path, composed exclusively of state nodes.
• Input Stem - An input node linked to the root of a state stem.
• State Cactus - Defined recursively as follows: A state stem is a state cactus. A state cactus connected by an edge to a (disjoint) cycle is also a state cactus.
• Input Cactus - Defined recursively as follows: An input stem with at least one state node is an input cactus. An input cactus connected by an edge to a disjoint cycle of state nodes is also an input cactus.
• The root and the end of a stem are also called the root and the end of the associated cactus, respectively.
Note that, by definition, an input cactus may have an input node linked to several state nodes. In particular, an input node may connect to the root of a state stem and to one or more states in a cycle. Finally, the input vertices in the system digraph required to ensure structural controllability, also referred to as driving nodes, can be characterized in terms of the state unmatched nodes associated with a maximum matching of the bipartite representation of the state digraph (see [8] for details). In particular, the minimum number of driving nodes (or, equivalently, the minimum number of state unmatched nodes) is related to the so-called term rank of $\bar{A} \in \{0, \star\}^{N \times N}$, denoted by $\sigma(\bar{A})$ and defined as the maximum number of nonzero diagonal elements in any of the matrices resulting from a permutation of rows and columns of $\bar{A}$. Therefore, the minimum number of driving nodes can be described as [...] (3)
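The term rank can be computed as the size of a maximum matching between the rows and columns of the zero/nonzero pattern. The Python sketch below uses a simple augmenting-path matching to do so. The function names are hypothetical, and the count $\max\{1, N - \sigma(\bar{A})\}$ used in `min_driving_nodes` is the usual maximum-matching count from the literature, adopted here as an assumption because expression (3) is not legible in this section.

```python
def term_rank(pattern):
    """Size of a maximum matching between rows and columns over nonzero entries.

    `pattern[j][i]` is truthy when the structural matrix has a nonzero entry at (j, i).
    """
    n_rows, n_cols = len(pattern), len(pattern[0])
    match_col = [-1] * n_cols  # match_col[c] = row currently matched to column c

    def augment(r, seen):
        # Try to match row r, re-routing previously matched rows if necessary.
        for c in range(n_cols):
            if pattern[r][c] and not seen[c]:
                seen[c] = True
                if match_col[c] == -1 or augment(match_col[c], seen):
                    match_col[c] = r
                    return True
        return False

    return sum(augment(r, [False] * n_cols) for r in range(n_rows))

def min_driving_nodes(pattern):
    """Assumed count max{1, N - term rank}: unmatched state nodes, and at least one input."""
    return max(1, len(pattern) - term_rank(pattern))

chain = [[0, 0, 0], [1, 0, 0], [0, 1, 0]]   # x1 -> x2 -> x3
star = [[0, 0, 0], [1, 0, 0], [1, 0, 0]]    # x1 -> x2 and x1 -> x3
cycle = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]   # x1 -> x2 -> x3 -> x1
for name, pat in [("chain", chain), ("star", star), ("cycle", cycle)]:
    print(name, term_rank(pat), min_driving_nodes(pat))
```

The star example shows why unmatched nodes matter: the two leaves cannot both be matched to the single hub, so one input is not enough.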

Computational Complexity
In what follows, we introduce some concepts of computational complexity theory [9]. This theory allows the classification of (computational) problems into complexity classes. In particular, we can classify decision problems, i.e., problems with a "yes" or "no" answer. If there exists an algorithm that obtains the correct answer to a decision problem in a number of steps that is bounded by a polynomial in the size of the input data, then the algorithm is referred to as an efficient or polynomial-time solution, and the decision problem is said to be polynomially solvable. A decision problem is said to be in NP (i.e., nondeterministic polynomial) if any candidate solution can be verified using a polynomial-time procedure. It is easy to see that any problem that is polynomially solvable is also in NP, although there are problems in NP for which it is unclear whether polynomial solutions exist. Among these, the NP-complete problems are the hardest problems in NP: their solutions are verifiable using polynomial algorithms, but no polynomial algorithms to solve them are known to exist. Although the above classification is intended for decision problems, it can be immediately extended to general optimization problems by noticing that every optimization problem can be posed as a decision problem. More precisely, given a minimization problem, we can pose the following decision problem: Is there a solution to the minimization problem whose cost is less than or equal to a prescribed value? Conversely, if the solution to the optimization problem is known, then its decision version can be trivially answered. Consequently, if a (decision) problem is NP-complete, then the associated optimization problem is referred to as being NP-hard. We refer the reader to [4] for an introduction to the topic of computational complexity.
In the next section, we show that the problem of finding the minimum number of driven nodes to ensure a structural controllability index equal to $T$ is NP-hard. The NP-hardness is demonstrated by polynomially reducing a well-known NP-hard problem, namely the GP problem introduced above, to this problem.
Even though polynomial complexity algorithms able to solve general instances of the GP problem are unlikely to exist, it is possible to approximate its solution using polynomial-time algorithms (with some optimality guarantees). One of the most successful software tools to approximate the GP problem is called METIS [10], described in further detail in the next section.

II. MAIN RESULTS
In this section, we formally introduce the structural counterpart of the controllability index, the structural controllability index. In Theorem 2, we provide a graph-theoretical characterization of the structural controllability index. In Theorem 3, we provide a lower bound on the structural controllability index in terms of the state unmatched nodes and graph partitions. In Theorem 4, we show that the problem of computing the minimum number of driven nodes under constraints on the time-to-control is NP-hard.
Subsequently, we propose an efficient approximation approach, described in Algorithm 1. This algorithm provides us with an upper bound on the structural controllability index. Comparing this upper bound with the lower bound in Theorem 3, we can assess the quality of our approximation. Furthermore, the proposed algorithm leverages existing tools that consistently achieve approximate solutions with optimality guarantees.
The structural controllability index is defined as follows [11], [12]. Consider structural matrices whose entries are either 0 (i.e., there is no edge between two nodes) or an unknown nonzero entry (i.e., there is an edge between two nodes with an arbitrary weight), denoted by $\star$. In other words, the matrices $\bar{A}$ and $\bar{B}$ characterize the topology of the system digraph when the weights can take arbitrary values. Given a structural state matrix $\bar{A}$ and a structural input matrix $\bar{B}$, we say that the corresponding structural system is structurally controllable with index $T$ if there exists a pair of real matrices $(A, B)$ corresponding to a weighted realization of the system digraph such that the controllability index of $(A, B)$ is equal to $T$. In other words, we can find a (weighted) network with a system digraph matching the topology described by the pair $(\bar{A}, \bar{B})$ such that it can be controlled in (at least) $T$ time steps. This value of $T$ is called the structural controllability index, which we denote by $\bar{\tau}(\bar{A}, \bar{B})$.
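One informal way to explore this definition is to draw a random weighted realization of the zero/nonzero pattern and compute its controllability index. The Python sketch below does this with random integer weights and exact arithmetic. It is only a heuristic under stated assumptions: it relies on the fact that rank properties of this kind hold for almost all weight choices, so a single random draw is very likely, but not guaranteed, to be representative, and it is not the characterization developed in this section. All names, the seed, and the weight range are hypothetical.

```python
import random
from fractions import Fraction

def rank(rows):
    """Exact rank by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def random_realization_index(A_pattern, B_pattern, seed=0, bound=10**6):
    """Controllability index of one random realization of a 0/1 pattern pair.

    Returns None when the sampled realization is not controllable.
    """
    rng = random.Random(seed)

    def realize(pattern):
        # Replace each nonzero marker by a random nonzero integer weight.
        return [[rng.randint(1, bound) if e else 0 for e in row] for row in pattern]

    A, B = realize(A_pattern), realize(B_pattern)
    n = len(A)
    C, power = [row[:] for row in B], B
    for T in range(1, n + 1):
        if rank(C) == n:
            return T
        power = [[sum(a * p for a, p in zip(row, col)) for col in zip(*power)] for row in A]
        C = [r + p for r, p in zip(C, power)]
    return None

chain = [[0, 0, 0], [1, 0, 0], [0, 1, 0]]   # pattern of x1 -> x2 -> x3
print(random_realization_index(chain, [[1], [0], [0]]))
print(random_realization_index(chain, [[1, 0], [0, 0], [0, 1]]))
```

For these chain patterns the answer does not depend on the sampled weights, which makes them convenient sanity checks.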
To formalize this concept, we need to introduce the notion of generic rank of the partial controllability [...]
In what follows, we define the term rank of a structural matrix, which is useful to provide a graph-theoretical interpretation of the structural controllability index. Consider a partial structural controllability matrix of order $T$, given by $\bar{C}$ [...] where $P_1$ and $P_2$ are two permutation matrices of appropriate dimensions.
Based on these concepts, the structural controllability index can be characterized as follows: [...]
The optimal set $J \subset \{1, \dots, N\}$ represents the minimal set of dedicated driving nodes, i.e., the set of input nodes that are connected to only one state node, which we will refer to as driven (state) nodes. A similar problem can be posed to determine the minimum number of driving nodes (i.e., not necessarily dedicated), which can be immediately obtained from the solution to problem $P_1$, as described in [3]. Now, we provide a novel graph-theoretic interpretation of the structural controllability index.
[...] to zero the free parameters associated with these edges) that do not belong to any of the input cacti in $\mathcal{C}$. Therefore, we obtain $p$ disjoint subsystems $(\bar{A}(C_i), \bar{B}(C_i))$, where $\bar{M}(C_i)$ consists of the submatrix of $\bar{M}$ with the columns and rows associated with the nodes in $C_i$. Subsequently, invoking Theorem 1, it follows that each subsystem is structurally controllable, and, in particular, $\bar{\tau}(\bar{A}(C_i), \bar{B}(C_i)) \leq T$ for all $i = 1, \dots, p$. Hence, by definition of the structural controllability index, it follows that $\bar{\tau}(\bar{A}, \bar{B}) = T$.
If the pair $(\bar{A}, \bar{B})$ is structurally controllable with index $T$, then it is also structurally controllable, and, in particular, it has to be spanned by a disjoint union of $p$ input cacti $\mathcal{C} = \{C_i\}_{i=1}^{p}$. Suppose, by contradiction, that there exists no decomposition in which all input cacti contain at most $T$ state nodes. Then, in any decomposition there exists a cactus $C_j$ that has more than $T$ state nodes. Consider two permutation matrices $P$ and $P'$ of appropriate dimensions, such that $\bar{A}' = P \bar{A} P^\top$ has diagonal blocks corresponding to $\bar{A}(C_i)$ for $i = 1, \dots, p$, and $\bar{B}' = P \bar{B} P'$ is such that the $i$-th column of $\bar{B}'$ corresponds to $\bar{B}(C_i)$. [...] associated with $C_i$, i.e., its rows are those indexed by the rows of $(\bar{A}(C_j), \bar{B}(C_j))$. Therefore, it follows that the $j$-th block-row of $[\bar{B}' \ \bar{A}'\bar{B}' \ \dots \ (\bar{A}')^{T-1}\bar{B}']$ is such that no two permutation matrices $P_1$ and $P_2$ exist such that $\operatorname{Tr}(P_1 [\bar{B}' \ \bar{A}'\bar{B}' \ \dots \ (\bar{A}')^{T-1}\bar{B}']_j P_2)$ equals the number of rows, which implies that $r(\bar{A}', \bar{B}'; T) < N$. Consequently, we obtain that $\bar{\tau}(\bar{A}', \bar{B}') > T$, or, equivalently, $\bar{\tau}(\bar{A}, \bar{B}) > T$ (since the structural controllability index is invariant with respect to permutation operations), which leads to a contradiction, and the result follows.
As a corollary to Theorem 2, we can obtain the following known result: [...] spanning forests. Further, a spanning forest $F_i$ contains $p_i \in \mathbb{N}$ directed trees, whose collection we denote [...] Then, the structural controllability index can be defined as follows: [...] where $|\mathcal{T}|_s$ denotes the number of state nodes in the tree $\mathcal{T}$ rooted in an input node.
As a consequence of Theorem 2, a lower bound on the minimum number of driving nodes can be obtained as follows: [...]
Therefore, if we want to ensure structural controllability (i.e., the structural controllability index is equal to $N$), then we obtain $n^{\mathrm{LSB}}_{N} = \max\{1, \alpha(\bar{A})\}$, as prescribed in [8]. As already mentioned, finding the minimum number of driving/driven nodes to ensure a given controllability index is NP-hard [13], [15], [16]. In what follows, we provide an alternative proof of this result that relies on the GP problem, which is later used to obtain an approximate solution to our problem.
Proof: To show that $P_1$ is NP-hard, we need to show that there exists a polynomial reduction from a problem that is known to be NP-hard to our problem. Towards this goal, we consider the graph partitioning (GP) problem, which is known to be NP-hard. Let $G = (V, E)$ be a connected undirected graph; then the GP problem aims at determining the minimum decomposition of $G$ into $p$ connected undirected graphs [...] consider $P_1$, where $\bar{A} = \bar{A}(G)$ is described as follows: [...] Therefore, $D(\bar{A})$ is a strongly connected digraph where every state node has a self-loop. If we assume that $J^*$ is a solution to $P_1$, then, as a consequence of Corollary 1, it follows that each dedicated input is the root of a tree with at most $T$ state nodes.
[...] where (without loss of generality) $x_i$ is the only state variable to which the input is connected. Therefore, it readily follows that [...] corresponds to a subgraph $i$ in a partition of $G$. In other words, by solving $P_1$ with $\bar{A}$ as described above, we will be able to obtain a solution to the GP problem; hence, our problem is at least as difficult as the GP problem, i.e., problem $P_1$ is NP-hard.
Computational complexity theory can also help us to establish some strategies to find the solution to P 1 by reduction to other well-known NP-hard problems, which can then be leveraged to obtain approximate solutions to Problem P 1 . This is what we will do next, by polynomially reducing P 1 to a GP problem, under certain assumptions.
Theorem 5: Let $\bar{A}$ be a symmetric structural matrix with zero-free diagonal, and let $G$ be the undirected graph associated with $D(\bar{A})$, which we assume to be strongly connected. Let $J^*$ contain the indices of exactly one node from each subgraph in a partition of $G$, where each subgraph contains at most $T$ nodes. Then $J^*$ is a solution to $P_1$.

Proof: Consider a partition of $G$ into a collection of subgraphs [...] with at most $T$ nodes. The proof follows by noticing that [...] represent the collection of these directed trees; then, the first condition in Lemma 1 holds. In addition, the second condition of the same lemma holds by assumption, since $D(\bar{A})$ is spanned by cycles, due to the self-loops corresponding to the nonzero diagonal entries in $\bar{A}$. Subsequently, the minimality immediately follows by noticing that if a set $J$ with fewer elements than $J^*$ existed, then, from the proof of Theorem 4, a graph partitioning with fewer parts would exist, which would lead to a contradiction.
Notice that in Theorem 5 we made two assumptions: (i) $\bar{A}$ is symmetric and $D(\bar{A})$ is strongly connected, and (ii) $\bar{A}$ has a zero-free diagonal. On the one hand, if assumption (i) does not hold, then $D(\bar{A})$ can be an arbitrary digraph. Consequently, Problem $P_1$ can be reduced to a graph partitioning problem by considering the structural matrix $\tilde{A} = \bar{A} + \bar{A}^\top$ (i.e., we disregard directions in the state digraph). On the other hand, if assumption (ii) does not hold, then we may need to consider at least $\alpha(\bar{A}(G_i))$ driving nodes for each partition $G_i$, where $\bar{A}(G_i)$ is the submatrix of $\bar{A}$ with columns and rows associated with the nodes in $G_i$. Nonetheless, the total [...]
One of the advantages of reducing our problem to that of determining the solution to GP is that its solution can be determined using approximation (polynomial-time) algorithms with some optimality guarantees. One of the most successful software tools to solve GP problems is METIS [10]. This software package is publicly available, and has been shown to consistently lead to only 1%-3% of partitions that do not satisfy the partitioning criteria, i.e., |V i | ≤ |V| T . Notwithstanding, one can further evaluate the quality of the solution by comparing the number of nodes obtained with the lower bound in Theorem 3. In fact, we notice that, in our empirical evaluations, the total number of driving/driven nodes is often close to the lower bound provided in Theorem 3, which implies that the solution obtained by Algorithm 1 is close to optimal. Furthermore, the proposed partition-based algorithm achieves in practice better results than the sequential minimization algorithms previously suggested in the literature [13], [15] (see Supplementary Figure 2 for a brief comparison between the two).
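The partition-based idea can be illustrated with a small stand-in heuristic. The Python sketch below is neither METIS nor Algorithm 1 from this work: it is a simple greedy breadth-first procedure, introduced here only for illustration, that splits an undirected graph into connected parts with at most $T$ nodes and picks one node per part as a driven node. It assumes the setting of Theorem 5 (symmetric $\bar{A}$ with zero-free diagonal), and the function names are hypothetical.

```python
from collections import deque

def greedy_partition(adj, T):
    """Split an undirected graph into connected parts with at most T nodes each.

    `adj` maps each node to an iterable of its neighbours (self-loops omitted).
    """
    unassigned = set(adj)
    parts = []
    for start in sorted(adj):
        if start not in unassigned:
            continue
        part, queue = [], deque([start])
        unassigned.discard(start)
        while queue:
            v = queue.popleft()
            part.append(v)
            for w in sorted(adj[v]):
                # Claim a neighbour only while the part still has room for it.
                if w in unassigned and len(part) + len(queue) < T:
                    unassigned.discard(w)
                    queue.append(w)
        parts.append(part)
    return parts

def driven_nodes(adj, T):
    """One representative node per part, used as a candidate set of driven nodes."""
    return [part[0] for part in greedy_partition(adj, T)]

# Undirected path 0 - 1 - 2 - 3 - 4 - 5 with a time-to-control budget of T = 3.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
print(greedy_partition(path, 3))
print(driven_nodes(path, 3))
```

Since every part holds at most $T$ nodes, any such partition needs at least $\lceil N/T \rceil$ parts, which gives an elementary yardstick for how far a heuristic partition is from the best possible one. A dedicated partitioner would typically produce better-balanced parts on large graphs.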