Controlling network ensembles

The field of optimal control typically assumes perfect knowledge of the system one desires to control, an unrealistic assumption for biological systems, or networks, typically affected by high levels of uncertainty. Here, we investigate the minimum energy control of network ensembles, which may take one of a number of possible realizations. We ensure that the derived controller can perform the desired control with a tunable amount of accuracy, and we study how the control energy and the overall control cost scale with the number of possible realizations. Our focus is on characterizing the solution of the optimal control problem in the limit in which the systems are drawn from a continuous distribution, and in particular on how to properly pose the weighting terms in the objective function. We verify the theory in three examples of interest: a unidirectional chain network with uncertain edge weights and self-loop weights, a network where each edge weight is drawn from a given distribution, and the Jacobian of the dynamics corresponding to the cell signaling network of autophagy in the presence of uncertain parameters.
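The minimum-energy ensemble control setting described above can be sketched in a toy discrete-time analogue (all dimensions, horizons, and weight distributions here are invented for illustration; the paper's formulation is continuous-time):

```python
import numpy as np

rng = np.random.default_rng(0)

n, N, T = 4, 5, 8                      # state dim, ensemble size, horizon (illustrative)
B = np.zeros((n, 1)); B[0, 0] = 1.0    # a single driver node

# Sample N realizations of a unidirectional chain with uncertain weights.
def sample_chain():
    A = np.diag(rng.uniform(0.5, 1.5, n - 1), k=-1)  # uncertain edge weights
    A += np.diag(rng.uniform(-0.5, 0.5, n))          # uncertain self-loop weights
    return A

As = [sample_chain() for _ in range(N)]

# Composite (block-diagonal) system: one shared input drives all realizations.
A_t = np.zeros((n * N, n * N))
B_t = np.zeros((n * N, 1))
for j, A in enumerate(As):
    A_t[j*n:(j+1)*n, j*n:(j+1)*n] = A
    B_t[j*n:(j+1)*n, :] = B

# Discrete-time controllability matrix over T steps.
K = np.hstack([np.linalg.matrix_power(A_t, T - 1 - k) @ B_t for k in range(T)])

# Steer every copy from 0 toward x_f: the minimum-norm least-squares input.
x_f = np.tile(np.ones(n), N)
u = np.linalg.pinv(K) @ x_f            # minimum-energy (least-squares) input
x_T = K @ u                            # best achievable final state
```

Because a single shared input drives N block-diagonal copies, the target is generally unreachable exactly; the least-squares solution trades deviation across realizations, which is the trade-off the weighting terms in the objective function control.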

I like that the paper is clearly written and the contributions are stated clearly. The flow of the presentation of the preliminaries and results is also very good. The topic is also very timely, with applications all over science and engineering. There are also two points that are not major contributions but that I liked about the paper. One is the formulation of the problem in the form of the optimization problem (4). It is quite straightforward but still not something that I had seen elsewhere. Second is the exponential decay rates they have observed as Assumptions 1 and 2. They are not proved or shown in any systematic form, only in three examples, but seeing them formulated in one place is nice.
What I am most concerned about is the significance of the contributions. The main conclusion of the paper that the optimal cost remains finite seems trivial to me, unless I am missing something. The reason is that, let's assume the control input u is set to 0 for all t. As long as the control horizon is finite, which is the case in this paper, the deviation of the output Cx_j(t) from the target output y_f is necessarily finite (say U_j), no matter how large, even if the system is unstable. It is of course true that this finite value U_j is different for different system realizations j, but the authors' choice of alpha(N) is such that (1-alpha)*D is essentially a Monte Carlo (sample) expected value of the output deviation, which is expected to be finite especially when the parameters have finite support. Even if the support of parameter distributions was unbounded, still I'd expect the expected value of output deviation to be finite for finite time horizon. And all of these were assuming u = 0. When you can optimize over u, the bound will definitely be lower. So in any case, I do not see why this main result of the paper is non-trivial. Also, I am not sure how the result is useful, because the upper bounds on the optimal cost can be arbitrarily high. They are only finite, which I don't think is enough for any application.
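The reviewer's u = 0 argument is easy to check numerically on toy scalar realizations (all values illustrative): over a finite horizon, every realization's deviation, and hence its sample mean, stays finite even when some realizations are unstable.

```python
import numpy as np

rng = np.random.default_rng(1)

t_f, y_f, N = 1.0, 1.0, 10_000            # finite horizon, target output, samples
a = rng.uniform(-1.0, 2.0, N)             # uncertain rates, some unstable (a > 0)

# Scalar realizations dx/dt = a_j x with x(0) = 1 and u(t) = 0:
# x_j(t_f) = exp(a_j * t_f), finite for every j because t_f is finite.
x_tf = np.exp(a * t_f)
deviation = (x_tf - y_f) ** 2             # per-realization output deviation

# The sample mean of the deviation (what (1 - alpha) * D approximates) is finite.
mean_dev = deviation.mean()
```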
I would recommend giving the authors a chance to revise, but if granted, I only advise them to use it if they have a strong response in defense of the significance of their contributions. If they want to resubmit, please also address the following. I also haven't checked all the derivations carefully at this point, and will do so later if a revision is to be made. So far, the derivations that I have checked have been correct except for minor modifications mentioned below. 1) There is quite some literature on robust optimal control. I do not see any papers from that literature cited, even though the problem of robust optimal control is almost exactly the same as this paper: finding the optimal control of a system with (parametric) uncertainty. There might be some differences in approach, but the distinction and superiority with respect to that literature should be comprehensively discussed.
2) The abstract reads "Our work sheds fundamental insight into the relationship between optimality and uncertainty." Please either remove that sentence or clearly explain what fundamental light the work sheds on this relationship. This is of course related to my main concern above.
3) The introduction reads "While the optimal control can be derived for any particular set of parameters, the resulting control is only optimal for that set." How is this not exactly the case for this paper? If N is large you can intuitively say that the optimal control of the finite sample is also probably good for the entire distribution, but you haven't shown that, and the same would hold for any other work. Note that in contrast, the literature of robust optimal control I mentioned often considers infinite parameter sets all at once (at least as far as I know), not using finite sampling as done here.
4) The construction of the composite system has a minor problem and that is the fact that the matrix C tilde is not a horizontal concatenation of N copies of C, but a block-diagonal concatenation such as in A tilde. The vector y(t) should also be defined the same way as x(t).
5) The two matrices U bar and W bar are NOT similar. Two matrices A and B are similar if they can be related to one another via a similarity transform A = PBP^{-1}. Similar matrices always have the same eigenvalues but different eigenvectors.
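The reviewer's definition is straightforward to verify numerically (a generic random example; any invertible P works):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
B = rng.normal(size=(n, n))
P = rng.normal(size=(n, n))           # almost surely invertible
A = P @ B @ np.linalg.inv(P)          # A and B are similar by construction

eig_A = np.sort_complex(np.linalg.eigvals(A))
eig_B = np.sort_complex(np.linalg.eigvals(B))
# Similar matrices share their eigenvalues (eigenvectors are mapped by P).
assert np.allclose(eig_A, eig_B)
```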
6) The ordering of theta_k (with k) in Assumption 2 cannot be the same as the ordering of mu_k in Assumption 1. This is because the ordering of theta_k can be changed arbitrarily by changing y_f. Please clarify this in the paper.
7) The matrix B is defined in two inconsistent ways in Example 1. One for what is shown in the main text and one, more general, for the supplementary.

8) What is the point of assuming that the parameter supports be finite? I don't see any use of it in the paper.

9) In the supplementary, the sentence following (S8) is incorrect. There is no optimization problem remaining. There is only a set of equations to solve to find x(t_f). That is what the authors do as well: solve for x(t_f) from the constraints; no free variable is left after applying all the conditions of the maximum principle.
Also I have two points to make:

1. The neural network of E. coli is actually quite well established, although there might be different genomic variation, so people use it as a brain network as a consensus model.

2. The metabolic network is quite well established and fixed.

We do not impose hard constraints on the networks, such as that they must satisfy a specific degree sequence, though in principle we could apply our analysis to that case. In two of our examples, the existence of the connections is assigned but the weights associated to the connections are not, which indicates the ensemble constraints are satisfied on average. Thus, based on the reviewer's input, we think it is more appropriate to call the ensembles we consider in the paper canonical, with the more specific case of microcanonical ensembles remaining a possibility.

We thank the reviewer for their positive feedback. We have followed their recommendation and worked on the presentation to make it more accessible to the broad readership of Nature Communications. All the changes aimed at increasing readability have been highlighted in red.

This paper studies the problem of the optimal control of an uncertain system using Monte Carlo sampling of its unknown parameters and designing the control input such that the final state deviation of some average system is optimized. As stated by the authors, their main result is that the optimal cost remains finite as the number of Monte Carlo samples goes to infinity, thus providing better and better approximations of the uncertainty distributions.

I like that the paper is clearly written and the contributions are stated clearly. The flow of the presentation of the preliminaries and results is also very good. The topic is also very timely, with applications all over science and engineering. There are also two points that are not major contributions but that I liked about the paper. One is the formulation of the problem in the form of the optimization problem (4). It is quite straightforward but still not something that I had seen elsewhere. Second is the exponential decay rates they have observed as Assumptions 1 and 2.

What I am most concerned about is the significance of the contributions. The main conclusion of the paper that the optimal cost remains finite seems trivial to me, unless I am missing something. The reason is that, let's assume the control input u(t) is set to 0 for all t.

As long as the control horizon is finite, which is the case in this paper, the deviation of the output Cx_j(t) from the target output y_f is necessarily finite (say U_j), no matter how large, even if the system is unstable. It is of course true that this finite value U_j is different for different system realizations j. [...] So in any case, I do not see why this main result of the paper is non-trivial. Also, I am not sure how the result is useful, because the upper bounds on the optimal cost can be arbitrarily high. They are only finite, which I don't think is enough for any application.

The reviewer is bringing up an important question, related to the underlying significance of our contribution. We are grateful for having a chance to clarify this point. The case the author is presenting (u = 0) is not actually encompassed in our formulation, Eq. (4). In fact, this case is a solution when setting α = 1 (we only allow α to vary in the open interval (0, 1); in the revised version of the paper we have added a sentence that explicitly mentions that the α = 1 case is excluded). For the special case α = 1, the objective function coincides with only the control energy, J = (1/2)E, for which the solution of the optimal control problem simply yields u = 0 and the objective function achieves a minimum at J = 0. However, for the case we consider, 0 < α < 1, things can be quite different, because the objective function depends on both the deviation and the control energy.

The reviewer is correct that, for finite time, D_N(α)/(N p) remains finite even in the thermodynamic limit (N → ∞). This is both clear from the logical arguments made by the reviewer, and is proven in Eq. (S37) in the revised SI, where we show the average deviation is bounded by a value independent of the weighting parameter α and converges to a finite value in the thermodynamic limit. What we found to not be trivial is that for a proper choice of α(N), the control energy remains bounded in the thermodynamic limit, since this does not hold true for other choices of α(N). In the revised SI, in Eq. (S29) we derive the approximate control energy, Ē_N(α), and bound it on both sides, where the lower bound appears in Eq. (S34) and grows as (1/(N p))(1/r_1)^{log((1−α)N p/α)}, while the upper bound appears in Eq. (S30) and grows as (1−α)N p/α². Note that both the lower and upper bounds' behavior depends on the quantity (1−α)N p/α and its behavior as N → ∞. For the upper and lower bounds to remain finite in the thermodynamic limit, it is necessary that this quantity remain finite, which is guaranteed by the choice α(N) = N p/(N p + b).

To the second point, that the bound can be arbitrarily large: in section S1.5 of the revised SI, we demonstrate that the average deviation is bounded (as stated by the reviewer). We hope the reviewer is convinced by the changes we have made to refocus the paper towards the fact that the control energy in the thermodynamic limit, lim_{N→∞} Ē_N(α(N)), can be calculated, while maintaining an upper bound on the deviation which can be designed.

1. There is quite some literature on robust optimal control. I do not see any papers from that literature cited, even though the problem of robust optimal control is almost exactly the same as this paper: finding the optimal control of a system with (parametric) uncertainty. There might be some differences in approach, but the distinction and superiority with respect to that literature should be comprehensively discussed.
The problem of integrating optimal control and robustness has been dealt with in a number of classic works, such as, for example, refs. 2-8. Typically, though, robust control develops feedback control which, for biological systems, may not be feasible given the limited ability for continuous measurement. We have briefly reviewed the field of robust optimal control and included a discussion of previous work in this area in the introduction. We have also emphasized the differences with our proposed approach and the reason why we believe it is convenient when dealing with uncertain biological systems.

3. The introduction reads "While the optimal control can be derived for any particular set of parameters, the resulting control is only optimal for that set." How is this not exactly the case for this paper? If N is large you can intuitively say that the optimal control of the finite sample is also probably good for the entire distribution, but you haven't shown that, and the same would hold for any other work. Note that in contrast, the literature of robust optimal control I mentioned often considers infinite parameter sets all at once (at least as far as I know), not using finite sampling as done here.

The sentence in the introduction was not referring to the particular optimal control problem considered in this paper, but to previous work in which several optimal control solutions were computed for slightly different versions of the same "network" to assess robustness of the computed optimal control solution. In this previous work, the optimal control inputs u_1(t), u_2(t), ... were allowed to vary from system realization to system realization. Then they were compared to assess the extent to which the optimal control solution was sensitive to parameter variations.
4. The construction of the composite system has a minor problem and that is the fact that the matrix C̃ is not a horizontal concatenation of N copies of C, but a block-diagonal concatenation such as in Ã. The vector y(t) should also be defined the same way as x(t).

We thank the reviewer for pointing this out. The definition of the matrix C̃ has been corrected.

The reviewer is correct and we apologize for the oversight. This has been fixed in the revised manuscript.

We thank the reviewer for pointing this out. That sentence has been rewritten following the reviewer's recommendation.

The authors have improved the presentation of the work, answering satisfactorily to my comments. I think that the paper is now suitable for publication in Nature Communications. Just as one historical note that might not be clear to people not strictly in the field: the first paper that classified network ensembles as canonical and microcanonical is Anand & Bianconi PRE 2009, and the non-equivalence of the ensembles was already discussed in this paper and fully demonstrated for the first time in Anand & Bianconi PRE 2010, despite claims that this was found many years later by other authors.
Reviewer #2 (Remarks to the Author):

The authors have made a decent effort to improve the paper and address my concerns, but unfortunately my main concern regarding the significance of the contributions remains. In the rebuttal letter, in response to my main criticism, a number of points are mentioned:

1) "The case the author is presenting ... the control energy.": I think the authors have not understood what I meant by setting u = 0. I did not mean a case in which u = 0 is the optimal solution to Eq. (4). Instead, I was pointing to the fact that u = 0 (or u = 1, or u = 8.3434, whatever) is a feasible value, and it gives a cost value J(u = 0) which is by definition larger than the optimal cost J(u = u^*). But even J(u = 0) is finite, and so is J(u = u^*). Anyhow, this is not a point of disagreement.
2) "The reviewer is correct that ... additional requirements on b.": I understand the refocusing of the paper and the authors' point, but I think boundedness of optimal control energy is more or less a direct consequence of the boundedness of average deviation. Here is why. Let alpha = Np/(Np+b), as chosen in the paper. For N -> infinity, this makes 2*J equal to the control energy plus sample average of output deviation. Again, setting u = 0 (or any other value, as a feasible point) gives a bounded J, say J-bar, because the second term is bounded. Now if we optimize over u, the resulting u^* cannot be (or converge to) infinity because it has to give J^* <= J-bar < infinity.
The situation with constant alpha is different, and in fact interesting. First, please note that the control energy does not need to diverge to infinity if alpha is constant. Only if the lower bound on E goes to infinity, we can say that E goes to infinity as well. And the limit of 1/(Np)*(1/r_1)^log((1-alpha)Np/alpha) as N goes to infinity may be 0, if r_1 > 1/e, or infinity, if r_1 < 1/e. Now assume, as is the case in Example 1, that r_1 < 1/e and the control energy corresponding to constant alpha diverges to infinity. This is a non-trivial and interesting observation, but it is not well developed (or even clearly mentioned) in the paper. The reason I think it is interesting is because with constant alpha, the contribution of E in J goes to 0 as N increases. So control energy becomes cheaper and cheaper. In the limit, control energy is free. But even with free control energy, it is not trivial to me that minimizing the expected value of output deviation may end up using infinite energy. So the contrast between the r_1 > 1/e case and the r_1 < 1/e case is very interesting, and something that the authors can potentially build on.
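The reviewer's threshold at r_1 = 1/e can be checked numerically on the lower-bound expression (constant alpha, natural logarithm assumed, all values illustrative):

```python
import numpy as np

alpha, p = 0.5, 1                             # constant alpha (illustrative)

def lower_bound_scale(r1, N):
    Np = N * p
    # (1/(Np)) * (1/r1)^log((1-alpha)*Np/alpha), natural log assumed
    return (1.0 / Np) * (1.0 / r1) ** np.log((1 - alpha) * Np / alpha)

Ns = np.array([1e2, 1e4, 1e6])
diverging = lower_bound_scale(0.2, Ns)        # r1 = 0.2 < 1/e
vanishing = lower_bound_scale(0.5, Ns)        # r1 = 0.5 > 1/e

assert diverging[0] < diverging[1] < diverging[2]   # grows without bound
assert vanishing[0] > vanishing[1] > vanishing[2]   # decays toward 0
```

With constant alpha the expression scales as N^(−1−ln r1), so it diverges exactly when r1 < 1/e, consistent with the reviewer's dichotomy.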
3) "To the second point ... in the manuscript.": Unless I am missing the authors' point, the provided argument is clearly false. If a function is monotonically decreasing and positive, it need not converge to 0. 1 + 1/b is such a function, so is 1000 + 1/b, and none of them converge to 0. In fact, the value of the upper bound as b -> infinity can remain arbitrarily large, which goes back to my point last time: we only know a bound exists, which is expected, and its value depends on extra variables r_1, ... from Assumptions 1 & 2 which cannot be computed except using simulations (in which case, the actual value of E and J can be computed using simulations as well). So how can this be useful in any real application?
The authors have improved the presentation of the work, answering satisfactorily to my comments. I think that the paper is now suitable for publication in Nature Communications.

Just as one historical note that might not be clear to people not strictly in the field, the first paper that has classified network ensembles as canonical and microcanonical is Anand & Bianconi PRE 2009. [...]

We thank the reviewer for accepting our responses. We have included these additional references to the original definitions of the canonical and microcanonical ensembles as applied to networks for our discussion therein to be thorough with respect to the previous literature.
The authors have made a decent effort to improve the paper and address my concerns, but unfortunately my main concern regarding the significance of the contributions remains.

In the rebuttal letter, in response to my main criticism, a number of points are mentioned:

We hope that our responses to the reviewers' comments will convince the reviewer that our contributions are significant.

We agree with the reviewer that this argument can be shown to demonstrate the control energy remains finite given a proper choice of α(N) in a more straightforward manner than we had used.

The argument can be formally stated as follows. Let ū(t) be any integrable sub-optimal control input, and let γ_max be the resulting maximum deviation of any state realization, γ_max = max_j ||C x̄_j(t_f) − y_f||². The optimal cost can now be upper bounded by

J* ≤ J(ū) = (α/2) Ē(ū) + ((1−α)/2) Σ_{j=1}^N ||C x̄_j(t_f) − y_f||² ≤ (α/2) Ē(ū) + ((1−α)/2) N γ_max.

Note that the energy term is a constant independent of N while the deviation term grows linearly with N. To compensate for this growth, we need (1 − α) to decay as 1/N, so we choose α(N) = N p/(N p + b). This upper bound can then be expressed as

J* ≤ (N p/(2(N p + b))) Ē(ū) + (b N/(2(N p + b))) γ_max.

Taking the limit of this expression for constant b yields the asymptotic behavior

lim_{N→∞} J* ≤ (1/2) Ē(ū) + (b/(2 p)) γ_max,

which is a constant.

Now if we optimize over u, the resulting u* cannot be (or converge to) infinity because it has to give J* ≤ J̄ < ∞.

We agree with the reviewer. By following the reviewer's argument and our derivations presented above, we can prove that J is upper bounded given a proper choice of the parameter α(N).
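The boundedness argument can be verified numerically with illustrative constants, assuming an upper bound of the form J* ≤ (α/2)Ē(ū) + ((1−α)/2)N γ_max with α(N) = N p/(N p + b) (the constants E_bar, gamma_max, b, and p below are invented for the check):

```python
import numpy as np

E_bar, gamma_max, b, p = 4.0, 10.0, 2.0, 3    # illustrative constants

def cost_upper_bound(N):
    Np = N * p
    alpha = Np / (Np + b)                     # the choice alpha(N) = Np/(Np + b)
    # J* <= (alpha/2) * E(u_bar) + ((1 - alpha)/2) * N * gamma_max
    return 0.5 * alpha * E_bar + 0.5 * (1 - alpha) * N * gamma_max

limit = 0.5 * E_bar + b * gamma_max / (2 * p) # the claimed constant limit
bounds = [cost_upper_bound(N) for N in (1e2, 1e4, 1e6)]

# The upper bound stays finite and approaches the constant limit as N grows.
assert all(np.isfinite(v) for v in bounds)
assert abs(bounds[-1] - limit) < 1e-3
```

Note also that with this choice of α(N) the quantity (1−α)N p/α equals b exactly for every N, which is what keeps both bounds finite.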

Based on the reviewer's feedback, we have removed claims throughout the manuscript that we can make the optimal control problem either feasible or infeasible in the large N limit. We also present our derivations above to show boundedness of the optimal control energy. We hope the reviewer will find our revised presentation of the paper and our added discussion satisfactory.

The situation with constant α is different, and in fact interesting. First, please note that the control energy does not need to diverge to infinity if α is constant. Only if the lower bound on E goes to infinity, we can say that E goes to infinity as well. And the limit of (1/(N p))(1/r_1)^{log((1−α)N p/α)} as N goes to infinity may be 0, if r_1 > 1/e, or infinity, if r_1 < 1/e.

Now assume, as is the case in Example 1, that r_1 < 1/e and the control energy corresponding to constant α diverges to infinity. This is a non-trivial and interesting observation, but it is not well developed (or even clearly mentioned) in the paper. The reason I think it is interesting is because with constant α, the contribution of E in J goes to 0 as N increases. So control energy becomes cheaper and cheaper. In the limit, control energy is free. But even with free control energy, it is not trivial to me that minimizing the expected value of output deviation may end up using infinite energy. So the contrast between the r_1 > 1/e case and the r_1 < 1/e case is very interesting, and something that the authors can potentially build on.

We thank the reviewer for proposing an interesting extension of our work. As the reviewer can easily verify, our Assumption 1 is that µ_k ≈ µ_0 r_1^k, where the eigenvalues are ordered such that µ_k ≥ µ_{k+1}. The approximation implies that 0 < r_1 < 1, so that 1/r_1 > 1. Then the limit of (1/(N p))(1/r_1)^{log((1−α)N p/α)} as N goes to infinity is infinity, independent of the particular value of r_1 ∈ (0, 1).

The reviewer is correct that the fact that a function is monotonically decreasing and positive does not imply it converges to 0 as the argument approaches infinity. However, this is not our argument. Please see our response above.

By a 'good' formulation, we meant one in which we parameterize the weighting coefficient in such a way that the control energy is bounded and there exists a parameter to arbitrarily adjust this upper bound. The reviewer suggests we choose α = (N p − 1)/(N p) (which makes (1 − α) = 1/(N p)), or in other words dividing the first term by N. While the reviewer is correct that this choice bounds the control energy, it does not introduce a parameter capable of adjusting the upper bound of the control energy. One could then attempt to introduce a parameter by defining α = (N p − b_2)/(N p), but note that b_2 is restricted to the interval b_2 ∈ (0, N p) to ensure α ∈ (0, 1). This means that b_2 cannot be adjusted independently of the size of the system, and in fact, if we attempt to relate this alternative parametrization to the original one, α = N p/(N p + b), we find that for any choice of b, the corresponding value of b_2 is b_2 = N p b/(N p + b).

The importance of this relation arises from the fact that we have already shown that the upper bound can be arbitrarily reduced or raised for some value of b, but the corresponding value of b_2 depends on the system size.
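The dependence of b_2 on the system size can be made concrete (illustrative values; b_2 is obtained by matching α = (N p − b_2)/(N p) to α = N p/(N p + b)):

```python
import numpy as np

b, p = 5.0, 2                                 # fix b in alpha = Np/(Np + b)
b2_values = []
for N in (10, 1_000, 100_000):
    Np = N * p
    b2 = Np * b / (Np + b)                    # b_2 reproducing the same alpha
    assert np.isclose(Np / (Np + b), (Np - b2) / Np)  # identical alpha
    b2_values.append(b2)

# b_2 is not a free constant: it must track the system size N (it tends to b
# only as N grows), so it cannot be chosen independently of N.
assert b2_values[0] < b2_values[1] < b2_values[2]
```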
Another option would be α = ((N p − 1)/(N p)) b_3. In this case, the parameter b_3 must be restricted to an interval that itself depends on N p to ensure α ∈ (0, 1). While this choice of parameterization may be equally capable of bounding the control energy, the resulting form is not as clear as what we derived with our choice of α(N), which makes the relation between the parameter and the control energy very clear.

The reviewer may still believe that our choice of α(N) is trivial, but our choice is the one capable of achieving both of them. Just plug in r_1 = 0.5 or any other r_1 > 1/e and you can easily see numerically that the expression diverges.

We have derived a far simpler lower bound for the control energy in the revised supplementary information that better encompasses the points we are trying to make. We believe that there was a notation mistake on our part that led to the confusion. We apologize for this oversight.

Now I see the authors' argument better. I hadn't understood it correctly last time. I haven't yet checked all the details of the derivations in S1.5, but I assume they are correct. My problem is bigger. Even if the limit of D̄_N(b)/N p as b goes to infinity is 0, that is for a fixed N. But the whole focus and motivation of the paper is for N to infinity. At first I thought this is a minor point, but now I think it is a pretty fundamental inconsistency. Note that b is a constant in the denominator of alpha, which can be chosen as large as you want, but that only shifts the initial value of α towards 0, which has 2 problems: 1) it places all the weight of the objective function on deviations and makes energy very cheap, which blows up energy as the authors acknowledge, and 2) it becomes irrelevant as N goes to infinity, which is the thermodynamic limit the entire paper is about.

We once again thank the reviewer for thinking carefully about our derivations. We agree with the reviewer that we want to show that for a fixed value of b the control energy is bounded for any finite N and even in the thermodynamic limit. This is exactly what we do in the paper: from Eq. (11) we see that the limit lim_{N→∞} J̄_N(b) is a constant independent of N. In addition to that, we