Ranking species in complex ecosystems through nestedness maximization

Identifying the rank of species in a social or ecological network is a difficult task, since the rank of each species is invariably determined by complex interactions stipulated with other species. Simply put, the rank of a species is a function of the ranks of all other species through the adjacency matrix of the network. A common system of ranking is to order species in such a way that their neighbours form maximally nested sets, a problem called nested maximization problem (NMP). Here we show that the NMP can be formulated as an instance of the Quadratic Assignment Problem, one of the most important combinatorial optimization problem widely studied in computer science, economics, and operations research. We tackle the problem by Statistical Physics techniques: we derive a set of self-consistent nonlinear equations whose fixed point represents the optimal rankings of species in an arbitrary bipartite mutualistic network, which generalize the Fitness-Complexity equations widely used in the field of economic complexity. Furthermore, we present an efficient algorithm to solve the NMP that outperforms state-of-the-art network-based metrics and genetic algorithms. Eventually, our theoretical framework may be easily generalized to study the relationship between ranking and network structure beyond pairwise interactions, e.g. in higher-order networks.

and columns of the adjacency matrix has revealed the existence of nested structures: neighbors of low rank nodes are subsets of the neighbors of high rank nodes [1][2][3].For example, nested patterns are found in the world trade, in which products exported by low-fitness countries constitute subset of those exported by high-fitness countries [4].In fragmented habitats, species found in the least hospitable islands are a subset of species in the most hospitable islands [1].Nestedness in real world interaction networks has captured cross-disciplinary interest for three main reasons.First, nested patterns are ubiquitous among complex systems, ranging from ecological networks [1,2] and the human gut microbiome [5] to socioeconomic systems [4,6] and online social media and collaboration networks [7,8].Second, the ubiquity of nested patterns have triggered intensive debates about the reasons behind the emergence of nestedness in mutualistic systems [9][10][11][12] and socioeconomic networks [6,8].Third, nestedness may have profound implications for the stability and dynamics of ecological and economic communities: highly-nested rankings of the nodes have revealed vulnerable species in mutualistic networks [13] and competitive actors in the world trade [14,15].
The ubiquity of nestedness and its implications in shaping the structure of biotas have motivated the formulation of the nestedness maximization problem.This problem can be stated in the following way: find the permutation (i.e.ranking) of the rows and columns of the adjacency matrix of the network resulting in a maximally nested layout of the matrix elements.Originally introduced by Atmar and Patterson [1], the problem has been widely studied in ecology, leading to several algorithms for measuring the nestedness of a matrix, e.g. the popular nestedness temperature calculator and its variants [1,[16][17][18].Yet many of these methods do not attempt to optimize the actual cost of a nested solution, but exploit some simple heuristic that is deemed to be correlated with nestedness.Other methods, e.g.BINMATNEST [16], do optimize a nestedness cost following a genetic algorithm, but lack the theoretical insight contained in an analytic solution to the problem.
More generally, we lack a formal theory to derive the degree of nestedness of a network from the structure of the adjacency matrix and the ranking of the nodes.
Here, we map the nestedness maximization problem onto the Quadratic Assignement Problem [19], thereby tackling directly the problem of finding the optimal permutation of rows and columns that maximizes the nestedness of the adjacency matrix.In our formulation, the degree of nestedness is measured by a cost function over the space of all possible rows and columns permutations, whose global minimum corresponds to a matrix layout having maximum nestedness.
Roughly speaking, the cost function is designed to reward permutations that move the maximum number of non-zero elements of the matrix in the upper left corner and to penalize those that move non-zero elements in the bottom right corner.Next, we set up a theoretical framework which allows us to obtain the mean field solution to the NMP as a leading order approximation and, in principle, calculate also next-to-leading order corrections.

II. PROBLEM FORMULATION
We consider bipartite networks where nodes of one kind, representing for example plants indexed by a variable i = 1, ..., N , can only be connected with nodes of another kind, e.g.pollinators indexed by another variable a = 1, ..., M , as seen in Fig. 1a.We denote by A ia the element of the network's N × M adjacency matrix: A ia ̸ = 0 if i and a are connected, and A ia = 0 otherwise.
Besides connectivity, the adjacency matrix encodes the interaction strength between nodes such that whenever i and a are connected, the strength of their interaction is A ia = w ia > 0. A ranking of the rows is represented by a permutation of the integers {1, 2, ..., N }, denoted r ≡ {r 1 , r 2 , ..., r N }; a ranking of the columns is represented by a (different) permutation of the integers {1, 2, ..., M }, denoted c ≡ {c 1 , c 2 , ..., c M }.More precisely, the r sequence arranges rows in ascending order of their ordinal rankings r i such that row i is ranked higher than row j if r i < r j .Similarly, the c sequence arranges columns such that column a ranks higher than column b if c a < c b .
To model the problem, one more concept is needed: network nestedness.Nestedness is the property whereby if j ranks lower than i, than the neighbors of j form a subset of the neighbors of i, as illustrated in Fig. 1b.Different rankings, i.e. different sequences r and c, produce different nested patterns, that is, nestedness is a function of the rankings.Therefore, any cost (energy) function that seeks to quantify matrix nestedness must be a function of the rankings r and c.The simplest energy function that does the job, aside from trivial cases (see Supplementary Information Sec.VI), is The product A ia r i c a penalizes strong interactions between low-rank nodes, since they contribute a large amount to the cost function; thus, low rank nodes typically interact weakly.Strong interactions are only allowed between high rank nodes, because when A ia is large the product A ia r i c a can be made small by choosing r i and c a to be small.Furthermore, high rank nodes can have moderate interactions with low rank nodes, because the product r i A ia c a can be still relatively small when r i is large and c a is small (or viceversa) provided A ia is not too large (hence the name 'moderate' interaction).
The assumptions of our model are relevant to diverse scenarios where nestedness has been ob- equivalent to optimizing the cost E(P, Q) with respect to the permutations matrices P and Q.The optimal permutation matrices bring the adjacency matrix to its maximally nested form P t AQ = A nested , which is complementary to the layout of matrix B.
served.In bipartite networks of countries connected to their exported products, we could interpret r i as the fitness of country i and c a as the inverse of the complexity of product a.In this scenario, high-energy links r i A ia c a represent the higher barriers faced by underdeveloped countries to produce and export sophisticated products [4], whereas low-energy links represent competitive countries exporting ubiquitous products.In mutualistic ecological networks, high-energy links represent the higher extinction risk for specialist pollinators to be connected with specialist plants, whereas low-energy links represent connections within the core of generalist nodes [2] as depicted in Fig. 1b.
With this equipment, it should be clear that to maximize nestedness, we have to minimize the energy function in Eq. ( 1).More precisely, nestedness maximization is the mathematical optimization problem in which we seek to find the optimal sequences r * and c * that minimize the energy function, i.e. min r,c E(r, c) = E(r * , c * ).Since the sequence r is a permutation of the ordered sequence {1, 2, ..., N }, we can always write r i = N n=1 P in n, where P is a N × N permutation matrix.Similarly, we can write c a = M m=1 Q am m where Q is a M × M permutation matrix.Therefore, the energy function, considered as a function of the permutation matrices P and Q, can be rewritten in the form where B is a N × M matrix with entries B ia = ia, as shown in Fig. 1c.In this language, the NMP is simply the problem of finding the permutations P * and Q * that minimize the energy function given by Eq. ( 2), which mathematically reads The geometric meaning of the optimal permutations P * and Q * is clear if we apply them to the adjacency matrix as P t AQ = A nest in that the nested structure in A nest is visually manifest, as schematized in Fig. 1c.The optimization problem defined by Eqs. ( 2) and ( 3) can be recognized as an instance of the Quadratic Assignment Problem (QAP) in the Koopmans-Beckmann form [19], one of the most important problem in combinatorial optimization, that is known to be NP-hard.
The formal mathematical mapping of the NMP onto an instance of the QAP represents our first most important result.Having formulated the NMP in the language of permutation matrices, we move next to solve it using a Statistical Physics approach.

III. SOLVING THE NMP WITH STATISTICAL PHYSICS
Our basic tool to study the NMP is the partition function Z(β) defined by where β is an external control parameter, akin to the 'inverse temperature' in the statistical physics language.The partition function Z(β) provides a tool to determine the global minimum of the energy function via the limit Calculating the partition function may seem hopeless, since it requires to evaluate and sum up N !M !terms.Nonetheless, the calculation is greatly simplified in the limit of large β, since we can evaluate Z(β) via the steepest descent method.The strategy consists of two main steps.The first step is to work out an integral representation of Z(β) of the form where the integral is over the space of N × N doubly-stochastic (DS) matrices X and M × M DS matrices Y , that converge onto permutation matrices P and Q when β → ∞; and F (X, Y ) is an "effective cost function" that coincides with E(P, Q) for β → ∞.The second step is to find the stationary points of F (X, Y ) by zeroing the derivatives ∂F/∂X = ∂F/∂Y = 0, resulting in a set of self-consistent equations for X and Y , called saddle point equations.All steps of the calculation are explained in great detail in Supplementary Information VII.The resulting saddle point equations are given by where u, v are N -dimensional vectors and µ, ν are M -dimensional vectors determined by imposing that all row and column sums of X and Y are equal to 1.At this point we can exploit the specific form of matrix B, i.e.B ia = ia, to further simplify Eqs.(7).Specifically, we define the "stochastic" rankings ρ i and σ a as whereby we can cast Eqs.(7) in the following vectorial form (details in Supplementary Information VII) where the normalizing vectors v and ν satisfy Equations ( 9) and ( 10) represent our second most important result and, when interpreted as iterative equations, provide a simple algorithm to solve the NMP, whose implementation is discussed in detail in Supplementary Information VIII.Note that ρ and σ converge to the the actual ranking r and s for β → ∞.However, in practice, we solve Eqs. ( 9) and ( 10) iteratively at finite β.Once we reach convergence, we estimate r and s by simply sorting the entries of ρ and σ.We observe that larger values of β give better results, i.e., lower values of the cost E(r, s), as seen in Fig. 2a.
A full discussion of convergence and bounds of our algorithm will be published elsewhere.Here, we test its performance by applying it to many real mutualistic networks and show that we obtain better results than state-of-the-art network metrics and genetic algorithms, as discussed next.

IV. NUMERICAL RESULTS
We apply our algorithm on 47 real mutualistic networks freely downloadable at https: //www.web-of-life.es/,whose filenames can be found in the first column of Table I.To standardize the comparison with existing methods, we binarize the adjacency matrices of the networks setting A ij = 1 if nodes i and j are connected and zero otherwise, thus ignoring the weights.
Despite this simplification, we like to emphasize that our algorithm can be applied, as is, to any mutualistic weighted network of the most general form.Then we run four different algorithms comprising: naive degree [20], fitness-complexity (FC) [4], minimal extremal metric (MEM) [21], and BINMATNEST [16].While BINMATNEST is the state-of-the-art algorithm in ecology for nestedness maximization [22], the effectiveness of the FC [23,24] and MEM [21] has been proved in recent works in economic complexity, which also connected the FC to the Sinkhorn algorithm from optimal transport [24][25][26].We compare the value of the cost function E(r, c) returned by each of the analyzed algorithms to the value returned by our algorithm (see Supplementary Information Sec.VI for implementation details).As shown in Fig. 2b, our algorithm finds a better (i.e.lower) cost than degree, FC, and MEM on 100% of the networks.When compared to BINMATNEST, we find a better (or equal) minimum cost in 80% of the instances, as seen in Fig. 2b and Table I.
We conclude this section by showing an application of the similarity transformation that brings the adjacency matrix to its maximally nested form.We call P and Q the optimal permutations that solve the QAP in Eq. ( 3) (details in Supplementary Information Sec.VIII) and we perform the similarity trasformation which reveals the nested structure of the adjacency matrix shown in Fig. 2c.

V. CONCLUSIONS
In this work we introduced a cost function for the NMP in bipartite mutualistic networks.
This formulation allowed us to recast the problem as an instance of the QAP, that we tackled by Statistical Physics techniques.In particular, we obtained a mean field solution by using the steepest-descent approximation of the partition function.The corresponding saddle-point equations depend on a single hyper-parameter (the inverse temperature β) and can be solved by iteration to find the optimal rankings of the rows and columns of the adjacency matrix that result in a maximally nested layout.We benchmarked our algorithm against other methods on several real ecological networks and showed that our algorithm outperforms the best existing algorithm in 80% of the instances.
We note that by changing the definition of the matrix B, i.e. using measures other than a sequence of ordinal numbers, one can repurpose our algorithm to rank rows and columns of a matrix according to other geometric patterns [27,28].Therefore, the proposed framework holds promise for the effective detection of a wide range of network structural patterns beyond the nestedness considered here.Finally, the present framework can be easily extended and applied to solve the ranking problem in networks with higher order interactions.For example, given the adjacency tensor A iaγ for a system with 3-body interactions, we can define the energy function E(P, Q, R) to be optimized over 3 permutation matrices P , Q, and R following exactly the same steps outlined in this paper for the case of pairwise interactions.This may be especially relevant in the world trade for ranking countries according to both exported and imported goods.In each panel we plot the cost returned by each algorithm divided by the cost returned by our algorithm, denoted E/E our , for each network considered in this work.A value E/E our > 1 means that our algorithm returns a better, i.e. lower, cost.We find that our algorithm returns a better cost in 100% of the networks when compared to degree, FC, and MEM, and in 80% of the networks when compared to BINMATNEST (see also Table I).c, Similarity transformation applied to the adjacency matrix A of network ML-PL-OO1 that brings A into its maximally nested form P t AQ, where P and Q are the optimal permutation matrices constructed from the optimal ranking vectors r * and s * .
and Patterson [1], and their mutations.From a well-performing candidate solution, an offspring of solution is created by selecting a second "parent" from the remaining solutions in the population, suitably combining the information from the two solutions, and eventually performing random mutations in the resulting child solution.Specifically, denote as w the row ranking vector of a well-performing solution and p the row ranking vector of its selected partner (the procedure is analogous for the column ranking vectors).The row ranking vector of the offspring solution, o, is set to w with probability 0.5, otherwise it is determined by a combination of w and p determined by the following algorithm [16]: • An integer k ∈ {1, . . ., N } is selected uniformly at random.
• For i ∈ {k + 1, . . ., N }, if p i ∈ {w i , . . ., w k }, then the value of o i is chosen at random from all the unused positions.
As final step, a random mutation of ranking vector o is performed by selecting at random k 1 , k 2 ∈ {1, . . ., N } and performing a cyclical permutation of the elements r k 1 , . . ., r k 2 .For both rows and columns, the procedure is repeated for a prefixed number of iterations, and the lowesttemperature candidate solution (r * , c * ) is then chosen as the final solution.In our study, we run the BINMATNEST algorithm through the nestedrank function of the bipartite R package [2].

D. Fitness-complexity
The fitness-complexity algorithm has been introduced to simultanoeusly measure the economic competitivenss of countries (f i ∈ [0, ∞)) and the sophistication of products (q α ∈ [0, ∞) from the bipartite network connecting the countries with the products they export in world trade [4].The original fitness-complexity equations read [4] which implies that high-fitness countries export many products -both high-and low-complexity ones -and high-complexity products are rarely exported by low-fitness countries.We observe that the fitness-complexity equations are formally equivalent to the Sinkhorn-Knopp equations used in optimal transport [24,25].As such, they can be derived by solving a quadratic optimization problem with logarithmic barriers, defined by the energy function [26] By taking the partial derivatives of E(x, y) with respect to x i and y α , respectively, we obtain indeed the fitness-complexity equations in Eq. ( 14).This remark provides an optimization-based interpretation of the fitness-complexity equations, while it does not provide a principled interpretation for the logarithmic barriers and the relation between the fitness-complexity scores and the degree of nestedness of a network.The algorithm has been shown to effectively pack bipartite adjacency matrices into nested configurations through both qualitative and quantitative arguments [4,23], which motivates its inclusion in our paper.

E. Minimal extremal metric
The minimal extremal metric (MEM) is a variant of the fitness-complexity algorithm that penalizes more heavily products exported by low-fitness countries.The MEM equations read [21] which implies high-complexity products are never exported by low-fitness countries.The metric has been shown to visually pack bipartite adjacency matrices better than the original FC algorithm [21], which motivates its inclusion in our paper.

VII. DERIVATION OF THE SADDLE POINT EQUATIONS
In this section we discuss in detail how to derive the saddle point Eqs.(7) given in the main text.We consider the minimization problem defined by (r * , s * ) = arg min where the cost (energy) function is given by and R N and R M are the sets of all vectors r and s obtained by permuting the entries of the representative vectors r 0 and s 0 defined as r 0 ≡ (1, 2, 3, ..., N ) , Therefore, we can write any two vectors r and s as where P and Q are arbitrary permutation matrices of size N × N and M × M , respectively.
Furthermore, we introduce the N × M matrix B defined as the tensor product of r 0 and s 0 , whose components are explicitly given by With these definitions we can rewrite the energy function as the trace of a product of matrices in the following way: The minimization problem in Eq. ( 17) can be reformulated as a minimization problem in the space of permutation matrices as follows (P * , Q * ) = arg min where S N and S M denote the symmetric groups on N and M elements, respectively.
Next we discuss a relaxation of the problem in Eq. ( 23) that amounts to extend the spaces F (X * , Y * ) = min and are only 'slightly different' from the permutation matrices P * and Q * (we will specify later what 'slightly different' means in mathematical terms and what F actually is).The quantity which plays where the integration measures are defined by DX ≡ i,j dX ij and DY ≡ a,b dY ab .The next step is to transform the sum over permutation matrices P, Q into a sum over semi-permutations matrices / P , / Q and then performing explicitly this sum using the Lemma in Eq. (30).In order to achieve this goal, we insert into Eq.(34) N delta functions N j=1 δ i X ij − 1 and M delta functions M b=1 δ a Y ab − 1 to enforce the conditions that the columns of X and Y do sum up to one.By inserting these delta functions, we can then replace the sum over P, Q by a sum over / P , / Q, thus obtaining To proceed further in the calculation, we use the following integral representations of the deltafunctions: into Eq.( 35) and we get where we defined the integration measures D X ≡ i,j d Xij /2πi, D Ŷ ≡ a,b d Ŷab /2πi, Dz ≡ j dz j /2πi, and Dw ≡ b dw b /2πi.Performing the sums over / P and / Q using Eq.(30) we obtain Next we introduce the effective cost function F (X, X, Y, Ŷ , z, w) defined as whereby we can write the partition function as which can be evaluated by the steepest descent method when β → ∞, as we explain next.
Steepest descent evaluation of the partition function In the limit of large β the integral in Eq. ( 40) is dominated by the saddle point where E(X, Y ) is minimized and S(X, X, Y, Ŷ , z, w) is stationary (in order for the oscillating contributions to not cancel out).In order to find the saddle point, we have to set the derivatives of F (X, X, Y, Ŷ , z, w) to zero, thus obtaining the following saddle point equations and similar equations for the triplet (Y, Ŷ , w).The derivative of E with respect to X ij gives for arbitrary values of ζ and ξ.This translational symmetry is due to the fact that the 2N constraints on the row and column sums of P are not linearly independent, since the sum of all entries of P must be equal to N , i.e. ij P ij = N .The same reasoning applies to the 2M constraints on the row and column sums of Q, of which only 2M − 1 are linearly independent, since ab Q ab = M .Furthermore, we notice that the solutions matrices X and Y in Eqs. ( 44 48), (49), and (50) are the constitutive equations for the relaxed nestednessmaximization problem corresponding to Eqs. (7) given in the main text.
We conclude this section by deriving the self-consistent equations for the "stochastic rankings" corresponding to Eqs. ( 9) and (10) given in the main text.We define the stochastic rankings as the two vectors

Fig. 1 :
Fig. 1: Modeling of the Nested Maximization Problem.a, A bipartite network models the interactions between, e.g., plants i, represented by purple circles, and pollinators a, represented by cyan squares, through the adjacency matrix A. The interaction is mutualistic, i.e.A ia = 1 > 0 if i interacts with a and A ia = 0 otherwise.b, A nested network has a hierarchical structure wherein the neighbors of low rank nodes (the specialist species at the bottom) are a subset of the neighbors of high rank nodes (the generalists at the top).The rank of a node is encoded in the variables r i (for plants) and c a (for pollinators).Top rank nodes have r = c − 1, while bottom ones have r = c = 4.The adjacency matrix of a nested network shows a peculiar pattern with all non-zero entries clustered in the upper left corner.c, Maximizing network nestedness amounts to minimize the cost function E(r, c) over the ranking vectros r and c, which, in turn, is

Fig. 2 :
Fig. 2: Numerical solution and comparison with other methods.a, Optimal cost E(r, c) returned by our algorithm on the mutualistic network named ML-PL-OO1 in the Web-of-Life database, for several choices of the parameter β.Larger values of β give lower costs.In particular, for sufficiently large β our algorithm returns a lower cost than the best off-the-shelf algorithm for nestedness maximization (BINMATNEST, red line).b, Comparison of our algorithm with state-of-the-art methods in the literature: Degree (upper-left), Fitness-Complexity (upper-right), Minimal-Extremal-Metric (bottom-left) , and BINMATNEST (bottomright).In each panel we plot the cost returned by each algorithm divided by the cost returned by our

S
N and S M of permutation matrices onto the spaces of doubly-stochastic (DS) matrices D N and D M .The space D N (D M ) is a superset of the original space S N (S M ).Solving the problem on the D-space means to find two doubly-stochastic matrices X * and Y * that minimize an 'effective' cost function F , i.e.
and the derivative of W with respect to Xij gives Eq. (41) with respect to X ij we getX ij = e −β(AY B t ) ij −z j k e −β(AY B t ) ik −z k .(44)Analogously, solving with respect to Y ab we getY ab = e −β(A t XB) ab −w b c e −β(A t XB)ac−wc .(45)It is worth noticing that Eqs.(44) and (45) are invariant under the tranformationsz j → z j + ζ , w b → w b + ξ ,

1 , 1 , 1 , 1 ,
), (45) automatically satisfy the condition of having row sums equal to one.Next, we derive the equations to determine the Lagrange multipliers z j and w b .To this end we first introduce the vectors v and ν with componentsv j = e −z j , ν b = e −w b .(47)Then, we define the vectors u and µ asu i = k e −β(AY B t ) ik v k −µ a = c e −β(A t XB)ac ν c −(48)so that we can write the solutions matrices X and Y in Eqs.(44), (45) asX ij = u i e −β(AY B t ) ij v j , Y ab = µ a e −β(A t XB) ab ν b .(49)Finally,imposing the conditions on X and Y to have column sums equal to one, we find the equations to be satisfied by v and ν v j = i u i e −β(AY B t ) ij −ν b = a µ a e −β(A t XB) ab −