Noise Response Data Reveal Novel Controllability Gramian for Nonlinear Network Dynamics

Control of nonlinear large-scale dynamical networks, e.g., the collective behavior of agents interacting via a scale-free connection topology, is a central problem in many scientific and engineering fields. For the linear version of this problem, the so-called controllability Gramian has played an important role in quantifying how effectively the dynamical states can be reached by a suitable driving input. In this paper, we first extend the notion of the controllability Gramian to nonlinear dynamics in terms of the Gibbs distribution. Next, we show that, when the networks are open to environmental noise, the newly defined Gramian is equal to the covariance matrix associated with randomly excited, but uncontrolled, dynamical state trajectories. This fact theoretically justifies a simple Monte Carlo simulation that can extract effectively controllable subdynamics in nonlinear complex networks. In addition, the result provides novel insight into the relationship between controllability and statistical mechanics.

Scientific Reports | 6:27300 | DOI: 10.1038/srep27300

Although not directly used as a numerical procedure to solve the optimal control problem below, this is a key building block in the proof of the main result.

Results
Controllability function and Gramian. Consider the nonlinear controlled dynamics

  dx(t)/dt = f(x(t)) + g(x(t)) u(t),  (1)

where t represents time, x(t) = [x_1(t), …, x_n(t)]^T and u(t) are the state and input variables, and the smooth functions f and g describe the autonomous dynamics and the input effect, respectively. In (1), the initial state x_0 is fixed, which affects both controllability determination and quantification. It can be chosen arbitrarily, although in conventional controllability quantification results for nonlinear systems the initial state is typically fixed to a stable equilibrium 22. Moreover, all the results in this paper hold for any probabilistic initial state (i.e., x_0 a random variable) and for multi-input cases, as long as x_0 is independent of the input noise introduced below. For a final time τ > 0, the controllability function is defined as the minimum control effort to reach a target state x̃:

  L_τ(x̃) := inf { (1/2) ∫_0^τ |u(t)|^2 dt : x(τ) = x̃ },

which, for linear dynamics, reduces to the quadratic form L_τ(x̃) = (1/2) x̃^T G_τ^{-1} x̃, provided G_τ is nonsingular and x_0 = 0 6,8,18. However, this definition of G_τ cannot straightforwardly be extended to nonlinear systems. Historically, the controllability function and the Gramian were introduced independently, and this simple quadratic relation between them was shown afterwards. Changing our way of thinking, let us define G(L_τ), which we call the Gibbs Gramian, in terms of the Gibbs distribution associated with the given controllability function:

  φ_{L_τ}(x̃) := exp(−L_τ(x̃)) / ∫ exp(−L_τ(x̃')) dx̃',   G(L_τ) := ∫ x̃ x̃^T φ_{L_τ}(x̃) dx̃.  (2)

For the quadratic controllability function (1/2) x̃^T G_τ^{-1} x̃, the Gaussian integral formula shows G(L_τ) = G_τ. Therefore, this definition is consistent with the conventional one for linear dynamics. It should be noted that the definition of the controllability function does not assume linearity, so this definition can readily be employed for nonlinear cases as well. Another important feature is that we do not need to care about the reachability of each state: even if some x̃ are not reachable by any finite-energy input (e.g., linear dynamics for which G_τ is singular), we simply have L_τ(x̃) = ∞ and φ_{L_τ}(x̃) = 0. This means we can handle network dynamics that evolve on a specific domain due to the dynamics' structure or physical constraints.
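As a numerical sanity check of this consistency claim (not part of the original derivation), the following sketch evaluates both sides for a toy two-dimensional linear system; the matrices A and b and the horizon τ are arbitrary illustrative choices. It computes G_τ by quadrature and then the Gibbs Gramian (2) of the quadratic controllability function by grid integration.

```python
import numpy as np
from scipy.linalg import expm

# Toy linear system dx/dt = A x + b u (illustrative values, not from the paper)
A = np.array([[0.0, 1.0], [-1.0, -0.5]])
b = np.array([[0.0], [1.0]])
tau = 2.0

# Controllability Gramian G_tau = int_0^tau e^{At} b b^T e^{A^T t} dt (trapezoidal rule)
ts = np.linspace(0.0, tau, 2001)
rows = np.array([expm(A * t) @ b for t in ts])               # (2001, 2, 1)
integ = rows @ rows.transpose(0, 2, 1)                       # integrand samples
h = ts[1] - ts[0]
G_tau = h * (integ.sum(axis=0) - 0.5 * (integ[0] + integ[-1]))

# Gibbs Gramian (2) for L(x) = x^T G_tau^{-1} x / 2, by grid quadrature
Ginv = np.linalg.inv(G_tau)
s = 6.0 * np.sqrt(np.linalg.eigvalsh(G_tau).max())           # cover +/- 6 sigma
xs = np.linspace(-s, s, 401)
X1, X2 = np.meshgrid(xs, xs, indexing="ij")
pts = np.stack([X1.ravel(), X2.ravel()], axis=1)             # grid of final states
w = np.exp(-0.5 * np.einsum("ni,ij,nj->n", pts, Ginv, pts))  # unnormalized Gibbs weights
G_gibbs = (pts.T * w) @ pts / w.sum()                        # int x x^T phi_L(x) dx
```

Because L_τ is quadratic here, the Gibbs distribution is Gaussian with covariance G_τ, so the grid integral reproduces G_τ without any sampling.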
For linear systems with the initial state at the origin, the eigenstructure of the controllability Gramian G_τ is useful for identifying directions in the state space that can be reached with small control energy. The Gibbs Gramian enjoys a similar property. Specifically, principal component analysis of G(L_τ) reveals all effectively reachable directions. For instance, by setting ‖x̃‖ = 1, we observe for the linear case that the principal eigenvector of G(L_τ) = G_τ minimizes 2L_τ(x̃)/‖x̃‖^2 = x̃^T G_τ^{-1} x̃/‖x̃‖^2, that is, the control effort divided by the squared distance takes its minimum value when the final state x̃ lies on the principal eigenvector. Another interpretation is that, of every possible direction, with a fixed control energy the state can be driven furthest from the origin by steering it to a destination state that lies along the principal eigenvector. The interpretation for the nonlinear case is similar. Note that, for any unit vector e, a large value of φ_{L_τ}(x̃)(e^T x̃)^2 implies that a small-energy input (i.e., small L_τ(x̃), and consequently large φ_{L_τ}(x̃)) can place the state far from the origin along the direction of e (i.e., large (e^T x̃)^2) at the final time. Its spatial integral over all final states x̃ satisfies the equality

  e^T G(L_τ) e = ∫ (e^T x̃)^2 φ_{L_τ}(x̃) dx̃,

which is maximized when e is the principal eigenvector of G(L_τ). In this sense, the principal eigenvector of the Gibbs Gramian captures the direction along which the states can be driven furthest from the origin by a control effort that is small on average. In addition, it is straightforward to change the reference point; for example, one can modify the definition so as to evaluate the travelling distance instead of the distance from the origin. Similarly, the second and further eigenvectors enable us to characterize an effectively reachable subspace.
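The minimum-energy interpretation of the principal eigenvector can be checked directly in the linear case. The sketch below uses a generic positive-definite matrix as a stand-in Gramian (an illustrative construction, not the paper's example):

```python
import numpy as np

rng = np.random.default_rng(0)
# A generic positive-definite matrix standing in for a controllability Gramian
M = rng.normal(size=(3, 3))
G = M @ M.T + 0.1 * np.eye(3)

evals, evecs = np.linalg.eigh(G)        # eigenvalues in ascending order
e_max = evecs[:, -1]                    # principal eigenvector

# For linear dynamics, 2 L_tau(x) / ||x||^2 = x^T G^{-1} x / ||x||^2:
# control energy per squared distance, minimized along the principal eigenvector
Ginv = np.linalg.inv(G)
def energy_ratio(x):
    return (x @ Ginv @ x) / (x @ x)

best = energy_ratio(e_max)              # equals 1 / lambda_max
others = [energy_ratio(rng.normal(size=3)) for _ in range(100)]
```

Random directions never beat the principal eigenvector: every `others` value is at least `best`, which equals the reciprocal of the largest eigenvalue.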
See also the subsection entitled Dimensionality reduction of nonlinear network dynamics below for another interpretation in terms of the minimal projection error. The conclusion is that the Gibbs Gramian G(L_τ) introduced in (2) is a proper extension of the conventional controllability Gramian G_τ to nonlinear dynamics.
Stochasticity connects Gibbs Gramian and simulation data. For linear dynamics, G_τ is given as the solution to a linear matrix equation. For nonlinear dynamics, by contrast, the controllability function L_τ is given as the solution to a nonlinear partial differential equation 22. Therefore, it is not realistic to compute L_τ, and consequently G(L_τ), even for small-scale cases. However, the situation drastically changes when the input is disturbed by random noise:

  dx(t)/dt = f(x(t)) + g(x(t)) (u(t) + √T ξ(t)),  (3)

where T > 0 is a noise level or temperature, ξ(t) is white noise such that 〈ξ^2(t)〉 = 1, and expectations are taken over noise samples 29,30. There are several theoretical results concerning the controllability determination of stochastic systems (e.g., approximate controllability 31,32). In this paper, we define a stochastic controllability function 𝓛_τ(x̃) for controllability quantification as

  𝓛_τ(x̃) := inf E[ (1/2) ∫_0^τ |u(t)|^2 dt − T log Φ(x(τ)) ],  (4)

where the infimum is taken over all feedback control laws, and we define Φ such that

  Φ(x) := δ(x − x̃),  (5)

with the Dirac delta function δ. Then, the terminal cost −T log Φ(x(τ)) is an alternative representation of the terminal boundary condition x(τ) = x̃. Therefore, 𝓛_τ(x̃) can be viewed as the minimum expected value of the control effort to regulate x(τ) = x̃; see Fig. 1a. Similarly to the deterministic case, we do not require the boundedness of 𝓛_τ. It should be emphasized that the resulting Gibbs distribution φ_{𝓛_τ} is not identical to φ_{L_τ}, and depends on T. Next, we refer to the uncontrolled (u(t) = 0), but randomly excited dynamics

  dx(t)/dt = f(x(t)) + g(x(t)) √T ξ(t),  (6)

as the noise response, a sample path of which is shown in Fig. 1b. The key finding of this paper is the following theorem, the proof of which is in the Methods section: the probability density function of the noise response x(τ) is the Gibbs distribution associated with 𝓛_τ/T, that is, φ_{𝓛_τ/T}. This result means that the noise response data completely characterize the minimum required input energy 𝓛_τ(x̃) for each target state x̃.
An intuitive reason why this nontrivial relation holds is that the noise in (6) is added through the input channel. Noise of this type is known to have the ability to search for the solution to a wide class of optimal control problems 28. By this connection, the noise response data x(t) inherently contain information about the control-energy minimization problem, which bridges the gap to the controllability function defined via the minimum-energy control input.
Note that the evaluation of 𝓛_τ(x̃) over the whole state space based on density estimation of x(τ) is still computationally intractable. However, in this paper we focus on the Gramian induced by the stochastic controllability function, which is given by the spatial integral in (2) and is much easier to determine than the pointwise evaluation of 𝓛_τ(x̃). Indeed, the theorem above yields the following equality for the stochastic Gibbs Gramian:

  G(𝓛_τ/T) = E[ x(τ) x(τ)^T ],  (7)

where x(τ) is the noise response in (6). This equality tells us that the stochastic Gibbs Gramian can easily be calculated via Monte Carlo sampling of the uncontrolled dynamics open to environmental noise. Furthermore, both this computation and the principal component analysis of G(𝓛_τ/T) are efficiently implementable, because various algorithms exist to achieve computational scalability for both Monte Carlo sampling (e.g., importance sampling) and matrix eigenvalue analysis. Thus, the novel equality (7) leads to the first numerically tractable procedure for finding an effectively reachable subspace of large-scale nonlinear dynamics open to environmental noise, and is particularly useful for network dynamics for which only simulation algorithms, or time-series data collected in a noisy environment, are available.

Figure 1. (a) The corresponding control effort is measured as an average over noise samples; 𝓛_τ(x̃) is the minimum of these average values over all such control laws. (b) Sample paths of the noise response x(t) in (6).
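Equality (7) can be verified numerically in the linear special case, where the noise-response covariance is known to equal T·G_τ. The following sketch (illustrative system matrices; Euler–Maruyama discretization) estimates E[x(τ)x(τ)^T] by Monte Carlo sampling of the uncontrolled dynamics and compares it with T·G_τ computed by quadrature.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
# Toy linear system dx/dt = A x + b (u + sqrt(T) xi), with u = 0 (illustrative values)
A = np.array([[0.0, 1.0], [-1.0, -0.5]])
b = np.array([0.0, 1.0])
T, tau, dt, n_paths = 0.5, 2.0, 0.002, 20000

# Monte Carlo: Euler-Maruyama simulation of the uncontrolled noise response (6)
x = np.zeros((n_paths, 2))
for _ in range(int(tau / dt)):
    dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, 1))
    x = x + (x @ A.T) * dt + np.sqrt(T) * dW * b
cov = x.T @ x / n_paths                      # estimate of E[x(tau) x(tau)^T]

# Reference: for linear dynamics the noise-response covariance is T * G_tau
ts = np.linspace(0.0, tau, 2001)
rows = np.array([expm(A * t) @ b for t in ts])         # (2001, 2)
integ = rows[:, :, None] @ rows[:, None, :]
h = ts[1] - ts[0]
G_tau = h * (integ.sum(axis=0) - 0.5 * (integ[0] + integ[-1]))
```

With 20000 sample paths, the Monte Carlo estimate matches T·G_τ to within a few percent; statistical and discretization errors shrink with more paths and smaller dt.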
Dimensionality reduction of nonlinear network dynamics. The controllability quantification enables us to characterize subspaces that require a large control energy to be reached. By eliminating such subspaces, we obtain a reduced-order model that is expected to approximate the state trajectories well as long as the input energy is not large. Indeed, the dimensionality reduction of (mainly linear 18,33) dynamical systems, which is helpful for understanding hidden core mechanisms and for performing efficient numerical simulation, is an important application of the controllability quantification. In this section, we investigate two conceptually different nonlinear model reduction methods in the light of the Gibbs Gramian. Let an integer k (< n) be the desired order of the reduced model, and consider the set of (n × k)-matrices ρ with orthonormal columns satisfying

  x(t) ≈ ρ ρ^T x(t)  (8)

along the trajectories of interest, so that the reduced state z(t) := ρ^T x(t) can approximately recover the original one by x(t) ≈ ρ z(t). Hence, we expect the Galerkin projection

  dz(t)/dt = ρ^T f(ρ z(t)) + ρ^T g(ρ z(t)) u(t)

to be a good reduced-order model of dynamics (1). In what follows, we focus on the problem of finding such a ρ.
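To illustrate the Galerkin projection, consider the linear special case f(x) = Ax: if the columns of ρ span an A-invariant subspace, the reduced model dz/dt = ρ^T f(ρz) reproduces the full trajectory exactly. The construction below is a hypothetical toy example, not from the paper:

```python
import numpy as np
from scipy.linalg import expm, qr

rng = np.random.default_rng(2)
# Build A = Q D Q^T so that the first two columns of Q span an A-invariant subspace
Q, _ = qr(rng.normal(size=(4, 4)))
A = Q @ np.diag([-1.0, -2.0, -3.0, -4.0]) @ Q.T
rho = Q[:, :2]                         # (n x k) with orthonormal columns
A_red = rho.T @ A @ rho                # Galerkin-projected dynamics matrix

# Compare full and reduced trajectories from an initial state in range(rho)
z0 = np.array([1.0, -0.5])
t = 0.7
x_full = expm(A * t) @ (rho @ z0)      # exact full-order solution
x_red = rho @ (expm(A_red * t) @ z0)   # lifted reduced-order solution
```

For general ρ and nonlinear f the projection only approximates the full model; the sections below discuss how to choose ρ so that the approximation error is small for effectively reachable trajectories.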
In computational physics, the Proper Orthogonal Decomposition (POD), or Karhunen–Loève method, has a long history of intensive research 25,26. It is a simulation-based model reduction method, widely used for the simulation of nonlinear large-scale dynamical systems as found in computational fluid dynamics and aerospace engineering. Suppose we replace the requirement (8) by the minimization of the projection error of the snapshots x(τ_1), x(τ_2), …, where the error is evaluated at multiple given time instances τ_i. (This optimization criterion is equivalent to minimizing the maximal singular value of (I − ρρ^T)[x(τ_1), x(τ_2), …].) This is the fundamental idea of the POD, and is referred to as the method of snapshots. If we need to approximate only the autonomous system dx(t)/dt = f(x(t)), this optimization is computationally tractable even for nonlinear large-scale dynamics, and the resulting Galerkin projection yields a satisfactory reduced model. However, when the controlled dynamics (1) are of interest, we need to determine which input signal u(t) should be injected when collecting snapshots, because we cannot simulate the trajectories corresponding to all possible input signals. Many practically useful techniques, as well as theoretical analysis tools, have been developed for this purpose; see 34 and the references therein. On the other hand, from a controllability quantification viewpoint, it is also reasonable to find ρ such that x̃ ≈ ρρ^T x̃ whenever x̃ is reachable with a small-energy input u(t). For this purpose, the Gramian-based model reduction for linear systems employs a ρ that maximizes Trace(ρ^T G_τ ρ). The Galerkin projection associated with this choice extracts effectively reachable subdynamics, in that the resulting projection eliminates a subspace on which L_τ is large. However, as mentioned at the beginning of the previous section, for nonlinear dynamics it is unrealistic to compute the controllability function L_τ, which is no longer a quadratic form.
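The method of snapshots reduces to a singular value decomposition of the snapshot matrix: the optimal ρ consists of the leading left singular vectors. A minimal sketch on synthetic snapshots (data concentrated near a 2-dimensional subspace; all values illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(3)
# Synthetic snapshots x(tau_1), x(tau_2), ... lying near a 2-D subspace of R^5
basis = np.linalg.qr(rng.normal(size=(5, 2)))[0]
snaps = basis @ rng.normal(size=(2, 200)) + 0.01 * rng.normal(size=(5, 200))

# POD / method of snapshots: rho = leading k left singular vectors
U, svals, _ = np.linalg.svd(snaps, full_matrices=False)
k = 2
rho = U[:, :k]                      # (n x k) with orthonormal columns
# Relative projection error ||(I - rho rho^T) X||_F / ||X||_F
err = np.linalg.norm(snaps - rho @ (rho.T @ snaps)) / np.linalg.norm(snaps)
```

Because the snapshots lie near a two-dimensional subspace, the k = 2 projection leaves only the small noise component unexplained.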
This is the main reason why there have been no practical methods for the control-theoretic model reduction of general nonlinear large-scale systems [20][21][22]34. This limited applicability stands in clear contrast to the POD. There are many results that attempt to solve optimal control problems using the POD [35][36][37]. However, the relation between the simulation-based and Gramian-based model reductions has not yet been fully understood.
The remainder of this section is devoted to building a theoretical bridge between these two model reduction approaches, which were developed independently for similar purposes. Concerning the input selection for the POD, impulse signals for the empirical Gramian 24, or sinusoidal signals for the frequency-domain POD 21,25, may be suitable for linear systems. Indeed, the POD with these input signals is equivalent to the Gramian-based model reduction for linear systems [34, Chapter 5], [18, Section 9.1]. However, although it is technically easy to inject the same inputs into nonlinear systems, there is no solid justification for their use. An interesting solution is to choose the white noise ξ(t) as the input signal, and minimize the snapshots' ensemble average of the squared projection error, that is,

  E[ ‖(I − ρρ^T) x(τ)‖^2 ] = ∫ ‖(I − ρρ^T) x̃‖^2 φ_{𝓛_τ/T}(x̃) dx̃.  (9)

Therefore, the approximation error of the noise response data is equal to the projection error weighted by the Gibbs distribution associated with the stochastic controllability function 𝓛_τ. Consequently, the POD evaluates the projection errors on the trajectories that are realizable by a small control effort, without computing any minimum-energy input. In this sense, the POD with noise response data can be regarded as an easily implementable nonlinear model reduction method that explicitly takes controllability into account.
Note that the right-hand side of (9) is equal to Trace(G(𝓛_τ/T)) − Trace(ρ^T G(𝓛_τ/T) ρ). Therefore, the stochastic Gibbs Gramian-based reduction (the maximization of Trace(ρ^T G(𝓛_τ/T) ρ)) is equivalent to the best approximation of effectively reachable states (the minimization of the right-hand side of (9)). This is another justification for the conclusion that the Gibbs Gramian is a proper extension of the conventional controllability Gramian.
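The trace identity behind this equivalence is easy to verify on data: for an empirical Gramian Ĝ = X^T X/N, the top-k eigenvectors simultaneously maximize Trace(ρ^T Ĝ ρ) and minimize the mean squared projection error. A sketch with synthetic anisotropic data (illustrative, not the paper's example):

```python
import numpy as np
from scipy.linalg import qr

rng = np.random.default_rng(4)
# Anisotropic sample "noise response" data; Gramian estimated as in (7)
X = rng.normal(size=(500, 4)) * np.array([3.0, 1.5, 0.5, 0.1])
G_hat = X.T @ X / len(X)

# Top-k eigenvectors maximize Trace(rho^T G rho) ...
evals, evecs = np.linalg.eigh(G_hat)        # ascending eigenvalues
rho = evecs[:, -2:]                         # k = 2 leading eigenvectors
trace_opt = np.trace(rho.T @ G_hat @ rho)
# ... and equivalently minimize the mean squared projection error,
# since E||(I - rho rho^T)x||^2 = Trace(G) - Trace(rho^T G rho)
err_opt = np.trace(G_hat) - trace_opt

# Any other orthonormal rho does no better
rho_rand = qr(rng.normal(size=(4, 2)), mode="economic")[0]
trace_rand = np.trace(rho_rand.T @ G_hat @ rho_rand)
```

The second comment is an exact algebraic identity for orthonormal ρ: the captured trace and the residual error always sum to Trace(Ĝ).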
Furthermore, equality (9) holds for any nonlinear projection in place of ρρ^T, although its optimization is nontrivial. In the linear case, observability, the dual concept of controllability, is also investigated, and the corresponding method is often referred to as the balanced POD 34. Extensions in this direction are currently under investigation.

Discussion
The dynamics' nonlinearity makes the controllability sensitive to T. This temperature dependency is discussed in this section. First, 𝓛_τ/T in (7) and (9) indicates that the input cost is inversely proportional to the noise level; that is, a less noisy (more accurate) control channel is more expensive. In particular, as T → 0, the criterion (9) evaluates the error at snapshots located almost on the trajectory of the autonomous system dx(t)/dt = f(x(t)). Its interpretation from a controllability perspective is as follows: the input weight T^{-1} becomes unboundedly large, and consequently the states reachable with small control energy are limited to a small neighborhood of the autonomous trajectory (the noise is negligible because T → 0).
Next, we demonstrate the nontrivial effect of the noise level by means of a numerical example of p identical, coupled neuronal oscillators of the FitzHugh–Nagumo model. Each individual neuron generates the stable limit cycle shown in Fig. 2a. The state variable of the i-th neuron is denoted by x_i(t) = [v_i(t) w_i(t)]^T, and the dimension of the entire system's state is n = 2p. The dynamics of the i-th neuron, subject to diffusive coupling with nonuniform intensity and external forcing, are given by the diffusively coupled FitzHugh–Nagumo equations. As explained in Table 1, e_1 and e_2 approximately span the subspace onto which the (de)synchronized trajectories quickly converge. Recall that ρ = [e_1 e_2] minimizes (9) because Trace(ρ^T G(𝓛_τ/T) ρ) is maximized; see the previous section. Thus, the Galerkin projection onto this subspace extracts core subdynamics in the following two senses. First, from a POD perspective, this subspace best approximates the noise response data; see the left-hand side of (9). This is confirmed by the fact that quick convergence to this subspace is nothing but the aforementioned (de)synchronization phenomena. Second, from a controllability perspective, this subspace best approximates the effectively reachable states; see the right-hand side of (9). In other words, even if we apply the optimal feedback control, it is expensive to avoid the (de)synchronization phenomena.
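A minimal simulation sketch of this procedure (the parameter values and the ring-shaped coupling topology are illustrative choices, not the paper's setup): simulate coupled FitzHugh–Nagumo neurons driven only by common input-channel noise, then form the empirical stochastic Gibbs Gramian of (7) and eigendecompose it.

```python
import numpy as np

rng = np.random.default_rng(5)
# p coupled FitzHugh-Nagumo neurons with common input-channel noise
# (illustrative parameters, not those of the paper)
p, T, dt, tau, n_paths = 5, 0.1, 0.001, 10.0, 100
eps, a, beta, I_ext, c = 0.08, 0.7, 0.8, 0.8, 0.5

v = rng.normal(0.0, 0.1, (n_paths, p))
w = np.zeros((n_paths, p))
for _ in range(int(tau / dt)):
    xi = rng.normal(0.0, np.sqrt(dt), (n_paths, 1))   # common noise within each path
    coup = c * (np.roll(v, 1, axis=1) + np.roll(v, -1, axis=1) - 2.0 * v)
    v_new = v + (v - v ** 3 / 3.0 - w + I_ext + coup) * dt + np.sqrt(T) * xi
    w = w + eps * (v + a - beta * w) * dt
    v = v_new

X = np.concatenate([v, w], axis=1)      # noise-response samples x(tau)
G_hat = X.T @ X / n_paths               # empirical Gramian, right side of (7)
evals, evecs = np.linalg.eigh(G_hat)    # ascending; evecs[:, -1] plays the role of e_1
```

Here `evecs[:, -1]` and `evecs[:, -2]` play the role of e_1 and e_2; in the paper's setting these leading eigenvectors approximately span the subspace that captures the (de)synchronization behavior.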
This can also be understood from the structure of the dynamics. Concerning (A) and (C), since only a common input is allowed, the synchronization induced by the strong coupling and the noise is difficult to prevent, independently of T. On the other hand, concerning the desynchronization in (B), even though some well-designed entrainment signals exist 38, they are not effective enough when the input weight T^{-1} is large. As observed in this example, the controllability of highly nonlinear phenomena can be suitably captured from noise-driven simulation data. Note that reduced-order models of controlled complex networks obtained by the proposed method do not always admit such a simple interpretation. In other words, the method can extract nontrivial core dynamics purely from time-series data.
In summary, we have proven that the noise response of the uncontrolled dynamics reveals the temperature-dependent controllability of general nonlinear network dynamics. This contribution consists of two achievements. The first is a novel extension of the celebrated controllability Gramian for linear systems. To the author's best knowledge, this is the first nonlinear extension of the controllability Gramian, which was introduced over half a century ago and has played a central role in the development of modern control theory 5. The second is equality (7), which mathematically proves that, when the system is open to environmental noise, the newly introduced Gramian is equal to the covariance matrix of the noise response data. This result forms a theoretical bridge connecting controllability quantification and time-series data analysis. An important outcome is that equality (9) yields, for the first time, an easily implementable method for the control-theoretic nonlinear model reduction problem. An extensive amount of noisy data has been, and is presently being, gathered for a variety of uncontrolled systems. Equality (7) makes such data useful for gleaning insight into the modeling and controllability of controlled systems. We believe that this result can provide new methods and viewpoints in many research fields, given how much controllability-related work has been inspired by the pivotal contribution of Liu et al. 1. For example, the condition number of the Gibbs Gramian should characterize the effect of nonlinearity on network nonlocality, analogously to the linear case 8. Also, in view of the numerical simulation above, the relation between the noise effect and the connection topology of dynamical complex networks 40 can be analyzed.
Furthermore, the fact that any uncontrolled nonlinear dynamics subject to environmental noise obey the Gibbs distribution associated with  τ T / , which is the minimum input energy divided by the temperature, suggests a nontrivial link to the canonical distribution that is used in statistical mechanics.

Methods
For simplicity of exposition we let T = 1; the general results can be shown similarly. It is well known 28 that the optimal value of the stochastic control problem in (4) satisfies exp(−𝓛_τ(x̃)) = E[Φ(x(τ))], where the expectation is taken along the uncontrolled noise response (6). Consequently,

  E[ w(x(τ)) ] = ∫ w(x̃) φ_{𝓛_τ}(x̃) dx̃  (11)

for an arbitrary smooth function w(x) defined on R^n, where we exchanged the order of expectation and spatial integral. The arbitrariness of w(x) in (11) means that the probability density function of the noise response x(τ) is given by φ_{𝓛_τ}(x̃). Finally, (11) with w(x) = xx^T yields (7).

Table 1. Correlation analysis of the eigenvectors with the ρ_i listed above. Based on the standard correlation analysis, we conclude that ρ_1 ≈ ρ_2, ρ_3 ≈ ρ_4, ρ_2 ≉ ρ_3 for T = T_L, and ρ_1 ≈ ρ_2 ≈ ρ_3 ≈ ρ_4.