The feasibility and stability of large complex biological networks: a random matrix approach

In the 70’s, Robert May demonstrated that complexity creates instability in generic models of ecological networks having random interaction matrices A. Similar random matrix models have since been applied in many disciplines. Central to assessing stability is the “circular law” since it describes the eigenvalue distribution for an important class of random matrices A. However, despite widespread adoption, the “circular law” does not apply for ecological systems in which density-dependence operates (i.e., where a species growth is determined by its density). Instead one needs to study the far more complicated eigenvalue distribution of the community matrix S = DA, where D is a diagonal matrix of population equilibrium values. Here we obtain this eigenvalue distribution. We show that if the random matrix A is locally stable, the community matrix S = DA will also be locally stable, providing the system is feasible (i.e., all species have positive equilibria D > 0). This helps explain why, unusually, nearly all feasible systems studied here are locally stable. Large complex systems may thus be even more fragile than May predicted, given the difficulty of assembling a feasible system. It was also found that the degree of stability, or resilience of a system, depended on the minimum equilibrium population.


Robert May's "Neutral Interaction" Model of Large Complex Systems
It is helpful to first recall the original argument of May 1 . For an n-species community, let N i (t) be the abundance or biomass of the i'th species and let ⁎ N i be the species equilibrium value (the symbol * indicating equilibrium). The actual units of N i (t) will depend on the details of the particular model and on many occasions it is convenient to scale the populations as we discuss shortly. We suppose that the growth rate of species-i depends on its interactions with other species as defined by some possibly complicated nonlinear function …. f N N N ( , , , ) i n 1 2 . That is: Close to equilibrium, the abundance of species-i is given by is the perturbation from the equilibrium value at time t. The dynamics of the population perturbations, when linearized around equilibrium, is of the form 1,2 : d dt The vector = x x ( ) i contains the perturbed population disturbances x i (t). The matrix A is referred to as the "community matrix" or Jacobian matrix and has elements = Here a ij represents the effect species-j has on the growth of species-i when close to equilibrium. A cooperative effect implies > a 0 ij , while a negative effect is just the opposite with < a 0 ij . The self-interactions between species are all scaled such that a 1 ii = − , as in May 1 . May 1 studied communities under the limited "neutral interaction" assumption where interspecific interactions are equally positive as negative, and their expected or average value is zero i.e., E(a ij ) = 0. Environmental fluctuations are assumed to perturb the interaction strengths, so that the community matrix is: where I is the identity matrix. The matrix B is a random matrix with coefficients b ij drawn from some random distribution having mean zero and variance Var(b ij ) = σ 2 , while diagonal terms = .
b 0 ii Entries of the matrix B are identically distributed independent random variables, and thus have no correlations. The variability of the interaction strengths σ reflects the strength of the disturbance perturbing the ecological system. Finally, to model connectance of the interaction network, a proportion C (1 ) − of randomly chosen off-diagonal interactions a ij are set to zero, leaving a proportion C nonzero.
We are interested in finding conditions for the "local stability" of these biological models which guarantee that a system will return to equilibrium after a "small" population perturbation. Unless otherwise stated, the paper will be concerned exclusively with local stability. It is well known that if all eigenvalues (λ i ) of the community matrix A have negative real parts (Re(λ i ) < 0), the system is locally stable. Thus local stability depends on the critical eigenvalue of the community matrix A that has the largest real part, i.e., on the stability threshold: The system is locally stable if Λ < 0, in which case all perturbations x i (t) eventually die out and the populations eventually converge to their equilibrium values N i ⁎ . When the eigenvalues of a matrix A are such that Λ < 0, it is sometimes convenient to refer to A as a "stable matrix. " Instability occurs when Λ > 0.
A central result in RMT states that the eigenvalues of random matrices such as A are distributed according to a "circular law. " Specifically, the n eigenvalues λ i of A are distributed uniformly in a circle with radius nC γ σ = in the complex plane [22][23][24] . As an example, Fig. 1a visualises the distribution of eigenvalues of the random matrix A I B = − + in the complex plane for n = 400 and γ σ = = . n 0 5, as determined numerically. The eigenvalues clearly fall in a circle having radius γ. The circle is centred at the point (−1, 0) and thus translated one unit to the left of the origin (0, 0) as a result of the identity matrix −I that is present in A.
In this paper, it is often of interest to study the properties of each new matrix A, as γ is increased incrementally from zero. If the radius of the eigenvalues circular distribution is increased to the point where it exceeds γ = 1, then one or more eigenvalues of A will populate the right-hand-side (RHS) of the complex plane (see Fig. 1a). In that case, at least one eigenvalue has a real part that is positive, so that Λ > 0. But the latter is the aforementioned condition for triggering instability.
This was the argument used by May 1 to demonstrate that Eq. 1 is locally stable for the neutral interaction model, if the interaction disturbances are "not too large, " namely if: and unstable otherwise. The larger the number of species n, the sharper the transition from stability to instability at γ = 1. This is visualised in Fig. 1b which plots the percentage of random matrices that are locally stable as a function of disturbance γ. In terms of model parameters, the threshold criterion Eq. 5 means that if either n, σ or C become too large, the system will transition into an unstable regime. With this simple but powerful argument, May demonstrated the fragility of large complex and highly connected systems.

Eigenvalue Distribution of Density-Dependent Community Matrix S = DA
More plausible biological models that include the operation of density-dependence (DD), may be framed in the form 2 i i i n 1 2 In these models, the net per-capita growth rate of an individual of species-i (i.e. 1/N i dN i /dt) depends on its interactions with other species as defined by some (often complicated) function ….
In the simplest DD model, dN dt r N / i i = , and each species has the same constant per-capita growth rate r, giving rise to exponential growth for each species. The well known Lotka-Volterra equations are a paradigmatic example of a more complex DD model, as discussed below.
We will examine whether or not a feasible equilibrium solution of Eq. (6), = …….
, is stable. But stability of this equilibrium is no longer solely determined by eigenvalues of matrices of the form A, as defined earlier. Linearizing Eq. 6 about equilibrium using a Taylor expansion yields an equation for the perturbations 2 : where the diagonal matrix D = diag( ⁎ N i ). Stability of the perturbations is determined by the critical eigenvalue of the community matrix S = DA, where the diagonal matrix D = diag(N i ⁎ ) 2,19,20 . Again, local stability of a feasible equilibrium is guaranteed if the critical eigenvalue component of S satisfies Λ(S) < 0. It is important to emphasise, that even though the matrix A might be stable, this does not automatically imply the matrix S = DA is stable (D > 0) [6][7][8][9][10] . Only very special classes of matrices have the property of D-stability for which stability of A implies S = DA is stable for any D > 0 (see ref. 27

and below).
A useful although hypothetical starting point is to assume that all n population equilibria N i ⁎ are randomly distributed in the interval (0, 1), and then examine the community matrix S = DA, taking A as a random matrix (n = 400, γ = 0.2). While A has eigenvalues distributed in a circle in the complex plane as shown in Fig. 2a, this is no longer the case for the community matrix S = DA which now has a "guitar-shaped" distribution as seen in in the complex plane as seen in Fig. 2c between the two demarked points in blue.
Here we show how to extract the eigenvalue distribution for S = DA. In a recent important paper in the context of neuronal networks, Ahmadian et al. 16 studied the eigenvalue distribution of matrices having forms similar to the community matrix = = − + S DA D I B ( ) where B is a random matrix. Their results imply that for large n, the eigenvalue density of S is nonzero in the region of the complex plane, satisfying: The complex variable = + z x iy, and trace(A) = ∑ a ii is defined as the usual sum of the matrix A's diagonal elements. In the Methods section, it is shown that the region corresponds to those values of z x iy = + for which: The inequality specifies a well-defined region in the complex plane where the eigenvalues of S lie. The region is referred to as the "support" of the eigenvalue distribution, and unlike the RMT circular law, the eigenvalue density is generally not uniform in this region. Note though, that a potential caveat of the method of Ahmadian et al. 16 is that it depends on the matrices D and A being independent. But for our application the caveat appears to be relatively inconsequential (see SI6). Inequality Eq. 9 shows that the support region of the eigenvalues is determined exclusively by the equilibrium populations N i ⁎ , σ, and connectance C. Furthermore, one immediately observes that T has singularities at those points where ⁎ = − z N i , indicating that the region containing the eigenvalues of S must necessarily envelope the population equilibria − ⁎ N i . This gives an important hint of the strong relationship between the eigenvalues and the population equilibria.
It is possible to capture the complicated eigenvalue boundary that arises by evaluating Eq. 9 at equality.   Figure 3b plots the eigenvalues of S for both γ = 0.2 (yellow dots) and γ = 0.9 (blue dots) superimposed on the same graph. Note that as γ is increased to γ = 0.9, the red boundary expands considerably. When γ > 1, the boundary moves into the RHS of the complex plane (not shown) where eigenvalues have positive real parts Re ( ( ) 0) i λ > , and the system is unstable. Finally, it is not hard to see from Eq. 9 that if we set all = N 1, i ⁎ the May model is retrieved, and all eigenvalues lie in a circle in the complex plane centred at the point z 1 = − , having radius γ σ = nC . This of course retrieves May's result Eq. 5, that stability is ensured if γ < 1. For a given system, the eigenvalue distribution changes as γ σ = nC is increased, similar to that shown in Fig. 3b for two different values of γ. For small γ, the eigenvalues all sit in the LHS of the complex plane, and the system is stable (the critical eigenvalue component is As γ is increased beyond a threshold point, the eigenvalues start to populate the RHS and the system becomes unstable (Λ > 0). Based on Eq. 5, as γ is increased from zero, the threshold between stability and instability occurs when the right-most point of the elliptical-like eigenvalue boundary (red line) first touches the point z = 0 at the origin, in the complex plane. That The threshold value of γ, can be found by evaluating Eq. 9 at equality with z = 0: And thus the feasible equilibrium N * is locally stable if: which surprisingly is independent of the positive equilibrium populations. The system is unstable if γ σ = > nC 1. We thus find that May's stability criterion is unusually general and holds for DD systems having community matrices of the form S = DA, even though the eigenvalue distributions of the latter are far from "circular. " Note that the identical stability criteria (5) and (11) for A and S = DA are statistical criteria for an ensemble of matrices, and do not necessarily imply that the stability of the individual matrix A guarantees the stability of the matrix S = DA. However, based on the above results, it is demonstrated in SI2 that for these feasible RMT systems the matrices A and S = DA become unstable at exactly the same parameter values (approximately γ = 1). Thus for large feasible systems (D > 0):

= .
A S DA stability of the interaction matrix implies stability of the community matrix (12) where A is a random matrix as defined by May 1 . This is similar but not exactly the same as D-stability (6,11,27), where by definition local stability of the interaction matrix A implies local stability of the community matrix S = DA, for any D > 0. It is important to emphasise that our results concerning the eigenvalue distribution of the stability matrix S = DA here, depend on the work of Ahmadian et al. 16 , and thus assume that the matrix D is fixed and deterministic while the matrix A is random. This is not always the case, as will be discussed when we study the Lotka-Volterra equations shortly.

Relationship Between Eigenvalues of S and the Equilibrium Abundances
Based on an "off-diagonal" matrix perturbation analysis (ref. 28 ) it is possible to show that the eigenvalues λ i of the community matrix S = DA of RMT systems and the equilibrium abundances N i ⁎ , are simply related, namely: (see Methods and SI1). The approximation holds in the range γ < 1. Thus the critical eigenvalue component , can be well approximated by the minimum equilibrium population N min The critical eigenvalue component is often used as a stability or resiliency index 5,29,30 . When Λ < S ( ) 0, the system is technically locally stable. However, the smaller or more negative is S ( ), Λ the more resilient is the community in terms of the time taken to return to equilibrium after a small perturbation. Arnoldi et al. (2017) 31 write that this form of "resilience is the most commonly used stability measure in theoretical ecology" (30). Equation 13 implies that the larger is the biomass of the rarest species ⁎ N ( ) min , the stronger is the stability or resilience of the system since it will ensure a more negative S ( ) Λ , and faster return-time to equilibrium after perturbation 5,29,30 . Since feasibility requires that the smallest equilibrium population N min ⁎ is positive i e N ( , 0) min ⁎ . . > , then Eq. 13 makes transparent that in the regime γ < 1, feasibility is linked to both local stability (which requires Λ < 0) and resilience. Note this result does not depend on any assumptions about randomness of the perturbation matrix B. In the feasible regime, resilience of the most simple or the most complex network, is entirely dependent on the smallest equilibrium abundance, and not directly determined by network properties such as topology, modularity, clustering, and connectedness.
To give an indication of the performance of Eq. 13 as an estimator for Λ, results for DD community matrices S = DA are examined later in Figs 4b and SI4. Some caveats and limitations concerning this approach are discussed in SI4 and 5.

Extensions and Ecological Examples
Example 1. Stability and eigenvalues of Lotka-Volterra competition communities. We now proceed to explore a fully defined density-dependent biological model, rather than just an abstract analysis of an arbitrary community matrix with random equilibria populations. The classical Lotka-Volterra (LV) equations serve this purpose well, being one of the most successful models for studying large complex systems [2][3][4][5][6][7]25 . For an n-species system, the equation for the abundance of species-i is: The simplest competition community is the "uniform model, " where all coefficients are fixed to the same constant = − a c ij , c 0 1 , < < and the system is fully connected (C = 1). In this parameter range, the equilibrium is always feasible and stable 10,11 . Hence the deterministic uniform model predicts that large competitive communities will satisfy two potentially advantageous features of viable ecosystems, namely feasibility and stability. We will see nevertheless that these seemingly stable and well-organized systems may be highly fragile in the presence of environmental fluctuations.
In the spirit of May 1 and Roberts 7 , a large ensemble of competitive communities may be specified all of which, on the average, resemble the uniform model with mean interaction strength where the b ij are mean zero random perturbations with variance Var(b ij ) = σ 2 . In this model, environmental fluctuations make the interaction strengths vary about the community's mean strength of competition. Thus two communities may both have the same average interaction strength -c, but the one undergoing stronger perturbation will show a greater variation in its interaction coefficients. Hence the stochastic model associates increasing disturbance with an increase in σ 2 .
The interaction matrix can now be written as  The eigenvalue support for S m = DA m may be found by using Eq. 9 after appropriate adjustment for the factor (1 − c). That is, the eigenvalue density of the matrix S m = DA m is nonzero in the region of the complex plane, satisfying: Figure 4a plots the eigenvalue distribution of the community matrix S for a typical n = 400 species competition community (γ = 0.3, c = 0.1) and we see that the red boundary for the support of the eigenvalues predicted by Eq. 16 at equality, is an excellent fit. Note that the eigenvalues in Fig. 4a are all in close vicinity, and are referred to as the "bulk" eigenvalues 23 . There is also an outlying 23 real eigenvalue λ = −21.41 not shown in the figure as it is completely out of scale. For competition communities, the outlying eigenvalue is an outcome of having added a constant term -c to all interaction coefficients = −    +    a c b ij ij , or a rank-one perturbation (see Methods). Hence, using previous arguments and Eq. 16 (setting z 0 = ), we can understand that if A m is stable, both S m = DA m and S = DA are stable, and it may be simply deduced from Eq. 16 that: apart from rare statistical exceptions (see also refs 10,11 ). To remind the reader, the key assumptions behind this result is that parameters are such that the "uniform competition model, " is stable with c 0 1 , < < and the system is fully connected (C = 1).
In Fig. 4b, the real parts of the eigenvalues of S are plotted against the equilibrium populations (1 − c) , and the points sit close to the 45 degree line as predicted by the theory, Eq. 13, when adjusted for competition.
It is important to note that Ahmadian et al. 16 , in their study of the eigenvalue distribution of S = DA, assume that the matrix D is fixed and deterministic while the matrix A is random. However, in our case the matrix D = diag( ⁎ N i ) is composed of the equilibrium populations which depend on the interaction matrix A as it changes from realization to realization. Thus given individual realizations etc. of sufficiently large random matrices, all of these would approximately exhibit the same eigenvalue distribution. However, the analogous set of random matrices ′ ′ ″ ″

DA D A D A
, , etc. considered in the current manuscript can potentially have different eigenvalue distributions. However, although differences do occur, in practice they are relatively minor in the parameter range required for feasible systems (i.e., where 1 γ  , as will be shortly demonstrated). Moreover, the key result given by Eq. 17 should remain unaffected by this limitation. Example 2. Feasibility implies stability in the ensemble LV model. We have just seen that when 1 γ < all feasible systems of the structure examined here are locally stable (apart from rare statistical exceptions). Without having gained an understanding of the eigenvalue relationship between the interaction matrix A and S = DA, this result would not be available to us. An important question to ask now, is whether feasible RMT competition systems are always locally stable? This would be the case if it could be shown that feasible systems only occur when 1 γ < . We therefore need to estimate the parameter regime where feasible systems can be found. This requires determining analytically the conditions all n species have positive equilibrium values with N i n 0 1, 2, , i > = … .
⁎ The mathematical techniques required to accomplish this may be found in ref. 11 (and in Supplementary Information of ref. 10 , but given here because the approach and result is critical to the main findings in this paper. The probability that a particular system is feasible Pr(Feasible) requires first the determination that a typical single species has positive population i.e., = > p Pr N ( 0) i ⁎ , which we proceed to find.
Competition communities. Based on the equilibrium condition AN * = −1 from Eq. 14, when γ < 1, a first-order approximation of the equilibrium populations of the competition equations is where κ is a positive constant and the symbol ′ represents a division by (1 − c).
and note that by the Central Limit Theorem, X i is a normally distributed random v a r i a b l e w i t h m e a n a n d v a r i a n c e : where Z is the standardized normal variate, namely Z ~ N(0, 1).
i ⁎ = > is purely a function of the single aggregated parameter γ i.e., p = p(γ). Since the species are relatively independent, and since the n-species all have similar characteristics, a first order estimate of system feasibility is given by the probability that all n-species equilibria are greater than zero, namely: In addition to the theoretical results, Fig. 5 provides a plot of the percentage of feasible competition Lotka-Volterra models from a simulated random ensemble of systems, as a function of disturbance γ The graphs corroborate that the larger the number of species n, the more difficult it becomes to generate a feasible system. One also sees that the analytical predictions based on Eq. 19 are accurate since they sit close to the numerical equilibrium analyes (circles) of the Lotka-Volterra systems in Fig. 5. Return now to our initial query: Are all feasible systems stable? Results for competition communities indicate that we should not expect to find feasible systems unless γ ≪ 1, (refer to discussion related to Eq. 20). Yet from Example 1 above, it was found that all feasible systems are stable as long as γ < 1. This implies all feasible competition systems must be stable. It also explains both in intuitive terms and theoretical terms (details in SI3) why there are no feasible-stable systems when γ > 1.
Mutualistic communities. For the case of mutualist systems, the result can be generalized further. Consider the LV n-species mutualistic system: i i i j n ij j 1 in which a a 0, 1, ij ii ≥ =− and it is assumed the matrix A is strongly connected (i.e., irreducible). The birth rates r i > 0 and at least one r i > 0. (This is just Eq. 9 for competition communities, but with signs of interactions made appropriate for mutualistic systems). A simple application of M-matrix theory establishes that all feasible systems are locally stable 11,32 . More recently, this result has been extended and it has been shown that the mutualistic system Eq. (23) possess a globally asymptotically stable feasible equilibrium iff A is locally stable 32 . This leads to an interesting situation with regards to mutualist systems, in that local stability of the interaction matrix A and feasibility are tied in a manner that ensures that all feasible mutualistic systems are stable.
The equilibrium N i ⁎ are solutions of the the LV model (eq. 21) whereby AN* = −1. Thus the community matrix SN* = −1 N*, has an eigenvalue of −1 (see Methods). This "outlier" eigenvalue is well separated from the "bulk" as shown in Fig. 6. The critical eigenvalue of S proves to be Λ = −1 for all values m for which there is a feasible equilibrium. Thus the degree of mutualistic interaction m has no impact on the resiliency of a feasible equilibrium.

Discussion
Many previous studies of biological networks have been unable to determine the stability properties of the community matrix S = DA for large complex random matrix systems. This is considered an unsolved and open problem 6,10 . Here a simple solution is presented based on the trace statistics of random matrices. For feasible RMT systems, it was shown that the community matrix S = DA transitions from stability to instability, at exactly the same parameter values for which the interaction matrix A transitions. Thus for a large feasible system with D > 0, stability of the interaction matrix A implies stability of the community matrix S = DA. The theoretical prediction depends on the assumption that the matrices D and A are independent, but the presence of correlations appears only minor in the present context. Simulations indicate the dependence is weak and appears to have little impact (see SI6).  Feasible RMT systems were shown to be nearly always stable in the regime γ < 1. However, for the classical ecological RMT models examined here, feasible systems are rarely found when γ > 1. For this reason all feasible systems examined were stable. While these results may in some sense be model dependent, they should provide a good general characterization of how the addition of heterogeneity and external perturbations will affect any feasible stable system. Namely, as heterogeneity and disturbance increases, the feasibility of the system will be particularly sensitive to the heterogeneity in interaction, and feasibility will be lost often even before the transition from stability to instability of the interaction matrix. Future research in network science may benefit from shifting focus to study those factors which promote system feasibility 6,10,15 .
Finally, if the LV systems are a good guide to real world ecological systems, they inform us that large complex systems may be far more fragile than May's main result predicts. The models studied here suggest that feasible stable mutualistic, and competition systems can only be found if  1 γ (refer to Eq. 20). The same is true for predator-prey systems modeled by setting c = 0. Thus the models indicate the difficulty of assembling large complex ecosystems that are feasible, and in addition indicate their fragility to perturbation in interaction strength. However, those RMT systems that can be assembled and that are feasible, are nearly always found to be automatically stable. This suggests the possibility that many large ecological networks observed in the real world (i.e., feasible systems) may be endowed with some underlying intrinsic stability, a possibility that needs further investigation.

Methods
Boundary of eigenvalue distributions. Ahmadian et al. 16 16 demonstrated that for large n, the eigenvalue density of the matrix S is nonzero in the region of the complex plane, satisfying: Being a product of random variables, = σ ( ) b C Var ij u 2 , and the term on the RHS of the inequality of Eq. 8 should thus be replaced by 1/(Cσ 2 ). Hence, with connectance C, the eigenvalue density of the community matrix is nonzero in the region of the complex plane, satisfying: , D is diagonal, and the i'th diagonal entry is of the form Thus the region for which the eigenvalue density of the matrix S is nonzero corresponds to the region in the z-plane where:

Relation between population equilibria ⁎
N i and eigenvalues λ i . Returning to inequality Eq. 9, note that the left-hand-side of the expression for T has a singularity for those values of z for which z N i ⁎ = − .This is visualised in Fig. 7 where T is plotted as a function of z for a hypothetical n = 10 species community with = .
For purposes of illustration, it is assumed that z is a real number in the interval [0, 1]. The function T clearly explodes at all points where z N i ⁎ = − . In this cut in the complex plane, the eigenvalues are predicted to be located on the x-axis (real-axis) at those points where T C 1/[ ] 2 σ > = 1000 (in this example). It is clear that the eigenvalues must lie close to the population equilibria In general, the smaller the population −N i ⁎ , the more exacting is the approximation as can be seen from comparing the slopes of the graphs about the equilibria (and as can be verified by examining ∂T/∂x). When interspecific interactions are switched on (γ > 0), and for reasonable assumptions (see also SI4 and ref. 28 ), the "off-diagonal" perturbation expansion is: The intuition behind approximation Eq. 13 may be understood as follows. In the extreme limiting case, when all off-diagonal interspecific interactions are set to zero (γ = 0), the eigenvalues of S = DA have precisely the same magnitude as the equilibrium population values with ⁎ λ = −N i i , and therefore this holds exactly. Figure 2c shows a situation very near to this case with γ = 0.01, for which there are many weak off-diagonal interspecific interactions, and nearly all of the eigenvalues sit on the real axis in close proximity to the equilibrium population values  Fig. 2c between the two demarked points in blue. But this holds to a good approximation even when the intensity of the perturbed interactions is increased, as shown for γ = 0.2 in Fig. 2b. See SI4 for more examples and a discussion of caveats. Data Availability Statement. No datasets were generated or analysed during the current study.

The eigenvalue approximation.
Note. After completion and submission of this manuscript (first submitted June 2017), it was found that some results overlap with an unpublished manuscript of Gibbs, Grilli, Rogers, Allesina found on BioArxiv 1708.08837v1 (submitted 29/8/17), but were obtained by different methods.