## Introduction

Complex systems store energy, process and, very often, efficiently exchange information to perform complex tasks. The universal mechanisms behind this behavior are unknown, although pioneering works have shown that the robustness of this type of systems to random failures or targeted attacks1 might emerge from the trade-off between the cost of exchanging information and the importance of guaranteeing communication dynamics for functioning2,3,4. Therefore, it is crucial for units in a complex network to route information through shortest paths, broadcasting, or according to some dynamics between these two extremes5,6, as it happens for instance in the Internet7. For several applications of interest, even the inverse problem, of identifying either the origin or the destination of the flow from the observation of pathways, is relevant8,9. This framework enables the description of a wide variety of systems, from cell signaling to individuals exchanging information in social/socio-technical systems such as human flows through different parts of a city by public or private transportation means. In the following we focus our attention on flow networks, systems characterized by the exchange of flows—e.g., number of streets between different parts of the city or human movements within a city, migration between different geographic areas, goods traded among countries, packets routed among servers, electricity in a power grid—through edges10,11,12,13. System’s units and their connections have a limited capacity and, in absence of sources and sinks, the sum of the overall incoming and outgoing flows is constant.

Two descriptors traditionally employed to characterize the structure, and hence indirectly the information flow, of unweighted, simple, sparse, and connected networks are the characteristic path length L and the clustering coefficient C (ref. 2). L is defined as the average length over all shortest paths in the network, while C is the average local clustering coefficient over all nodes in the network and quantifies the network transitivity. Networks having both a small characteristic path length L (typical of random graphs) and a large clustering coefficient C (typical of regular lattices) display the so-called “small-world” property2, which is found in real-world networks and is related to how efficiently information is exchanged in a system3,14.

One widely accepted measure of efficiency in information flow is the communication efficiency, which has been used to highlight the possible design principles behind neural, man-made communication, and transportation systems3. This measure was introduced in 2001 (ref. 3) as a physically grounded and more general way to characterize networks displaying the small-world property. Instead of using two descriptors, L and C, for two apparently different kinds of properties of these networks, the communication efficiency evaluated at different scales identifies both structural features: 1/L and C have been shown to be approximations of the efficiency at the global and local scale, respectively. While the clustering coefficient finds a natural, physical generalization in the local communication efficiency, one main difference between the global communication efficiency and the characteristic path length remains: the former concerns parallel information exchange in a system, the latter sequential exchange. This discrepancy is negligible if the distances in the network are not too diverse, while it becomes significant if they are highly heterogeneous, as, for instance, in the Internet3,14. A further advantage of the efficiency over the original characterization of small-world networks through L and C is that it does not require the network to be connected or sparse and, subject to an appropriate normalization, not even unweighted.

The topology of a complex network influences the information exchange among its units and is responsible for a rich repertoire of interaction patterns. For instance, the existence of a connection between two neurons allows them to exchange electrochemical signals and their communication dynamics is relevant for the functional organization of the brain. Similarly, human flows through different geographic areas shape the functional organization of a city and its neighborhood, or even email interactions among individuals in an organization determine how information reaches different teams. In these real systems we never see the everyone-is-connected-to-everyone structure, i.e., fully connected networks, because, even if it would be very efficient for the information exchange, it would also be extremely costly. The trade-off between the communication efficiency and the wiring cost characterizes complex systems and their robustness to perturbation in communication dynamics15,16.

Even more importantly, many empirical systems are characterized by connections with heterogeneous intensities, and different correlations among weighted and purely topological network descriptors are ubiquitous17, from the human brain4,18,19,20 to transportation networks21. Therefore, it is essential to account for these underlying weighted architectures to gain real insights about the hidden construction principles and mechanisms used to transform, process, and exchange information14. An even broader scenario is possible: think, for instance, of infrastructure systems, where the units do not exchange information in parallel, where communication is subject to queues or priorities, and where noise and failures may play an important role in the communication. In this case, to assess the efficiency of the system one needs more information, which may not be present (or even representable) in the topology or flows of the network.

However, even assuming that information is exchanged in parallel—which is assumed from henceforth so that when we use the terms efficiency or communication efficiency we mean the efficiency of parallel communication—for a wide class of weighted systems17 which are not embedded in space or for which metadata about the underlying geometry (nodes coordinates) are not available, the normalization of the weighted efficiency descriptor proposed by Latora and Marchiori3,14 may fail—due to a mathematical constraint which is not fulfilled—or may be difficult to compute—because of the nature of flows, encoded in edge weights. As a matter of fact, we observed that in many applications22,23,24,25,26,27 the weighted efficiency is not normalized by comparison with the most efficient version of the network at hand, as suggested by Latora and Marchiori3,14, but instead it is computed upon normalized weights. This latter descriptor, to the best of the authors’ knowledge, has not yet been studied in detail, so we will take care of it, underlining especially its lack of statistical robustness to fluctuations in the network topology or flows.

In this work we show that a mathematically rigorous, statistically robust, and physically grounded normalized descriptor of the global efficiency of parallel information exchange can be computed without any knowledge of the system other than its weighted network representation. We demonstrate how to define a suitable “physical distance” between the system’s units in terms of the flow they exchange across least-resistance pathways. We also show that the quantification of the system efficiency might vary dramatically if flows are not adequately accounted for. In fact, discarding edge weights and considering only the topology of a network leads to an overestimation of its communication efficiency. In the opposite direction, incorporating the flows without normalizing the weighted efficiency descriptor leads to a measure that cannot be used to compare different systems. In between these two extremes lie several normalizing procedures for the weighted communication efficiency, which are discussed and compared in the remainder of this article. The normalizing procedure we propose in this work yields an efficiency descriptor that effectively summarizes both the topological and flow information encoded in the networks; in particular, on synthetic models we observe that the efficiency grows not only when the heterogeneity of the flows decreases, but also when there is no subset of privileged pathways monopolizing the whole information flow in the network.

## Results

### Flow exchange in complex topologies

Let us consider a complex network G = (V, E), whose weighted adjacency matrix $${\bf{W}}={({w}_{ij})}_{{i,j}\in V}$$ characterizes both its topology—indeed, wij = 0 if i, j are not adjacent, while wij > 0 if they are—and flows—by the magnitude of the weights wij.

The efficiency ϵij in the communication between two nodes i ≠ j ∈ V is assumed to be inversely proportional to their distance dij (ref. 3). It follows that if i and j belong to different connected components, i.e., dij = ∞, then ϵij = 0. The global communication efficiency of the network G is the average over pairwise efficiencies

$$E(G)=\frac{1}{N(N-1)}\mathop{\sum}\limits_{i\ne {j}\in V}{d}_{ij}^{-1}.$$
(1)

The natural metric on unweighted networks is the shortest-path distance. In this case the topological distances satisfy $$0\le {d}_{ij}^{-1}\le 1$$, implying 0 ≤ E(G) ≤ 1, with E(G) = 1 when G is a clique: since each pairwise communication occurs without mediators, information propagates most efficiently. In the case of weighted networks, distances should also account for the weights and for what they stand for28. As a matter of fact, the algorithm proposed by Dijkstra in 1959 (ref. 29), which is the most widely used, sums the costs of connections to find the path of least resistance; this means that if the edge weights encode the intensity of interactions, their costs have to be derived before computing weighted distances. Furthermore, weighted distances are real valued so that, in general, $$E(G)\in [0,\infty )$$ and depends on the scale of the weights. For this reason, a global indicator of efficiency should be rescaled into [0, 1] by considering an idealized proxy of G, called Gideal, having maximum efficiency.
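As an illustration of Eq. (1) in the unweighted case, the following minimal sketch computes the topological efficiency with breadth-first search. The function name `topological_efficiency`, the adjacency-dictionary input format, and the two toy graphs are assumptions made here for illustration only; they are not taken from the original analysis.

```python
from collections import deque


def topological_efficiency(adj):
    """Eq. (1) with hop-count distances.

    adj: dict mapping each node to an iterable of its neighbours
    (hypothetical input format chosen for this sketch).
    """
    nodes = list(adj)
    n = len(nodes)
    total = 0.0
    for source in nodes:
        # Breadth-first search gives shortest-path lengths in hops.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        # Unreachable nodes never enter `dist`, so they contribute 0,
        # which matches an infinite distance in Eq. (1).
        total += sum(1.0 / d for node, d in dist.items() if node != source)
    return total / (n * (n - 1))


# Toy graphs (illustrative only): a 4-node clique and a 3-node path.
clique = {i: [j for j in range(4) if j != i] for i in range(4)}
path = {0: [1], 1: [0, 2], 2: [1]}

print(topological_efficiency(clique))
print(topological_efficiency(path))
```

For the clique every pairwise distance is one hop, so the efficiency attains its upper bound; for the path the two end nodes are two hops apart and the average drops below 1.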

In refs. 3,14 the authors propose to build Gideal based on pairwise physical distances ℓij, which (i) are supposed “to be known even if in the graph there is no edge between i and j”, i.e., ℓij > 0 for all i ≠ j, (ii) should fulfill the constraint ℓij ≤ dij for all i, j ∈ V, and (iii) should be considered along with topological information in the computation of weighted shortest-path distances dij. Then, $$E({G}_{\text{ideal}})=\frac{1}{N(N-1)}\mathop{\sum}\nolimits_{i\ne {j}\in V}{\ell }_{ij}^{-1}\ge E(G)$$ and $$\frac{E(G)}{E({G}_{\text{ideal}})}$$, henceforth denoted by GCE(G), is correctly normalized. For some spatial networks (e.g., transportation systems like the railway, or infrastructures such as the power grid) the physical distances are well-defined by the underlying geometry; for others, among which power stations and water resources, it might be difficult to calculate physical distances because of the lack of direct information about the spatial coordinates of the units. For nonspatial systems, such as social and socio-technical systems, $${({\ell }_{ij})}_{i,j \in V}$$ can be found as ad hoc transformations of connection strengths (weights) into connection costs. For instance, in a biological network, where wij represents the velocity of the chemical reaction along a direct connection between i and j, ℓij could be taken as its inverse3; or ℓij could be the minimum between 1 and the inverse number of edges between i and j in a network with multiple unweighted edges14. Unfortunately, this apparently straightforward procedure hides several issues. For example, if there is no direct connection between two biochemical units in a connected network, their physical distance is infinite according to the previous definition, while their weighted shortest-path distance will be some positive real number, violating (ii).
Furthermore, in the case of real positive weights $${w}_{ij}\in {{\mathbb{R}}}_{+}$$ one cannot take $${\ell }_{ij}=\min \left\{1,\frac{1}{{w}_{ij}}\right\}$$, since this introduces a cut-off on weights smaller than 1. We indicate by $$E^{\mathrm{LM}} = \frac{E(G)} {E(G_{\mathrm{ideal}})}$$ the weighted efficiency of $$G$$ when $$G_{\mathrm{ideal}}$$ is built according to refs. 3,14. Another common method for obtaining a normalized efficiency indicator22,23,24,25,26, assuming that the weights encode the interaction intensities, consists of first rescaling the weights into [0, 1], then transforming them into costs (usually by taking their reciprocals), applying Dijkstra’s algorithm to evaluate the pairwise distances, and finally computing the efficiency by (8), without any further comparison with a Gideal. See the “Methods” section for further details. For instance, let us mention the max-normalization of weights $${\tilde{w}}_{ij}=\frac{{w}_{ij}}{\mathop{\max }\limits_{i,j}\{{w}_{ij}\}}$$, which leads to $${E}^{\text{MN}}(G)=\frac{E(G)}{\mathop{\max} \limits_{i, j} \{{w}_{ij}\}}$$ and will be used for comparison in the rest of this study. Observe already that this rescaling is particularly sensitive to outliers or extreme values of the link weights and that, differently from the original definition by Latora and Marchiori14, it compares every pairwise weighted efficiency to the maximum possible efficiency in the whole network. In conclusion, in a broad spectrum of scenarios of practical interest for applications, there is no general recipe to compute E(Gideal).
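The max-normalization recipe described above (rescale weights, take reciprocals as costs, run Dijkstra, average inverse distances) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the edge-dictionary input format, and the two toy triangles are hypothetical and chosen only to show how a single extreme weight depresses the descriptor.

```python
import heapq


def weighted_efficiency(weights, n):
    """Average inverse shortest-path distance, with edge cost 1/w.

    weights: dict {(i, j): w_ij > 0} of directed edges on nodes 0..n-1
    (hypothetical input format for this sketch).
    """
    out = {i: [] for i in range(n)}
    for (i, j), w in weights.items():
        out[i].append((j, 1.0 / w))
    total = 0.0
    for s in range(n):
        # Dijkstra from s over the reciprocal-weight costs.
        dist = {s: 0.0}
        heap = [(0.0, s)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist.get(u, float("inf")):
                continue  # stale heap entry
            for v, c in out[u]:
                nd = d + c
                if nd < dist.get(v, float("inf")):
                    dist[v] = nd
                    heapq.heappush(heap, (nd, v))
        total += sum(1.0 / d for v, d in dist.items() if v != s)
    return total / (n * (n - 1))


def max_normalised_efficiency(weights, n):
    """Rescale weights by their maximum, then compute the efficiency."""
    w_max = max(weights.values())
    rescaled = {e: w / w_max for e, w in weights.items()}
    return weighted_efficiency(rescaled, n)


def symmetric(edges):
    """Turn undirected edges into a pair of directed ones."""
    sym = dict(edges)
    sym.update({(j, i): w for (i, j), w in edges.items()})
    return sym


# Toy triangles (illustrative only): uniform weights vs. one extreme weight.
triangle = symmetric({(0, 1): 2.0, (1, 2): 2.0, (0, 2): 2.0})
with_outlier = symmetric({(0, 1): 2.0, (1, 2): 2.0, (0, 2): 200.0})

print(max_normalised_efficiency(triangle, 3))
print(max_normalised_efficiency(with_outlier, 3))
```

In the second triangle, one link carries a weight a hundred times larger than the others. Every other pairwise efficiency is then measured against that single link, so the descriptor falls far below 1 even though the topology is still a clique.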

### Rethinking efficiency of information flow in weighted architectures

To overcome the above issues, we build Gideal from the weighted graph G so that physical distances (i) are not necessarily calculated from metadata or accessible spatial information and (ii) preserve a local feature.

We assume hereafter that edge weights are nonnegative real values and represent the strength of connections. Recall that a path is the sequence of vertices in a nonintersecting walk across the network; the length of the path is the number of edges in—or the sum of edge costs along—that path. Weighted shortest-path distances are then computed minimizing the sum of the reciprocals of weights30,31, which can be seen as costs, over all paths between node pairs (other weighted metrics may be used28 but are not discussed here). Let us denote by SP(i, j) a weighted and possibly directed shortest-path from i to j; its length, $${d}_{ij}=\mathop{\sum}\nolimits_{n,m\in {\mathrm{SP}}({i,j})}{w}_{nm}^{-1}$$, is the shortest-path distance between i and j, while $${\phi }_{ij}=\mathop{\sum}\nolimits_{n,m\in {\mathrm{SP}}({i,j})}{w}_{nm}$$ is the total flow along SP(i, j).

The matrix $${\mathbf{\Phi }}={({\phi }_{ij})}_{i,j\in V}$$ represents an artificial connectivity made of shortcuts, where total flows along shortest paths are delivered in one topological step, as shown in Fig. 1. Gideal is then obtained averaging between the true structure W and the artificial connectivity, i.e., $${{\bf{W}}}_{\text{ideal}}=\frac{{\mathbf{\Phi }}+{\bf{W}}}{2}$$. We finally define $${\ell }_{ij}={({w}_{ij}^{\text{ideal}})}^{-1}$$ and, hereafter, $${\mathrm{GCE}}(G)$$ indicates the global communication efficiency normalized w.r.t. our $$G_{\mathrm{ideal}}$$. Note that a stronger option than averaging between the real and the artificial connectivity would be to take Wideal = Φ, or to define ϕij as the average flow along SP(i, j).
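The construction above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names `shortest_paths_with_flow` and `gce`, the edge-dictionary input, and the toy triangle are assumptions. When several shortest paths tie, the flow recorded is that of whichever path Dijkstra happens to find first, which is a simplification of this sketch.

```python
import heapq


def shortest_paths_with_flow(out, source):
    """Dijkstra over costs 1/w from `source`.

    Returns two dicts: the shortest-path distance d and the total flow
    phi (sum of weights) along the shortest path found to each node.
    """
    dist = {source: 0.0}
    flow = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in out[u]:
            nd = d + 1.0 / w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                flow[v] = flow[u] + w
                heapq.heappush(heap, (nd, v))
    return dist, flow


def gce(weights, n):
    """E(G) / E(G_ideal) with W_ideal = (Phi + W) / 2.

    weights: dict {(i, j): w_ij > 0} of directed edges on nodes 0..n-1
    (hypothetical input format for this sketch).
    """
    out = {i: [] for i in range(n)}
    for (i, j), w in weights.items():
        out[i].append((j, w))
    e_real = 0.0
    e_ideal = 0.0
    for s in range(n):
        dist, flow = shortest_paths_with_flow(out, s)
        for t, d in dist.items():
            if t == s:
                continue
            w_ideal = 0.5 * (flow[t] + weights.get((s, t), 0.0))
            e_real += 1.0 / d   # 1 / d_ij
            e_ideal += w_ideal  # 1 / l_ij = w_ij^ideal
    # The common factor 1 / (N (N - 1)) cancels in the ratio.
    return e_real / e_ideal if e_ideal > 0 else 0.0


# Toy triangle (illustrative only): the weak link 0-2 is bypassed via node 1.
edges = {(0, 1): 4.0, (1, 2): 4.0, (0, 2): 1.0}
edges.update({(j, i): w for (i, j), w in list(edges.items())})
print(gce(edges, 3))
```

In the toy triangle the shortest path between nodes 0 and 2 goes through node 1, so the ideal shortcut between them carries more flow than the direct link and the ratio falls below 1. With constant weights on a clique the ratio is exactly 1.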

When G is connected, Gideal is completely connected and ℓij is finite ∀ i ≠ j. If instead G is not connected, Gideal will be disconnected as well. If there is no path between i and j, then ℓij = dij = ∞ and their pairwise efficiency contributes neither to E(G) nor to E(Gideal). Note that in this case we are computing the average communication efficiency, a global indicator, of disconnected subnetworks, which may not be meaningful. Finally, it is possible to prove (using the Cauchy–Schwarz inequality, see “Methods” section) that the constraint ℓij ≤ dij ∀ i ≠ j is always satisfied, hence the ℓij are well-defined physical distances that can be calculated for any weighted system. Having defined the mathematical tools, we now analyze some synthetic networks with a tunable structure. This enables us to separate the effects of topology and flows on the global efficiency of the network.

### Global efficiency of synthetic networks

We start with the simplest combination of topology and weights: upon a full network (a clique) with N = 30 nodes, we generate two ensembles of weighted networks by sampling edge weights from different families of probability distributions. The topological efficiency ET, i.e., (1) with shortest-path distances computed ignoring weights, is 1 for all networks, since they are fully connected. We therefore focus on the weighted descriptors ELM, EMN, and GCE. The trivial case, wij = w > 0 constant, leads to ELM = EMN = GCE = 1. We impose more realistic homogeneous flows by sampling from a Poisson distribution $${\mathcal{P}}(\lambda )$$ with varying λ. Since zero belongs to the support of the distribution, we add one to each sample to keep the network completely connected. The heterogeneity in the weighted structure is instead modeled with wij following a power law with exponent α and lower bound xmin = 5 (ref. 32). For each value of the parameters λ and α we take 30 random samples from the respective distribution and generate 30 synthetic weighted networks. Figure 2a) shows their GCEs summarized through boxplots, as a function of λ and α. The five statistics of the GCE distribution shown in the boxplots are: the first (Q1) and third (Q3) quartiles, i.e., the quantiles of order 25% and 75% (lower and upper box hinges, respectively), with the width of the box showing the interquartile range (IQR = Q3 − Q1); the median (middle line in the box); the smallest observation greater than or equal to Q1 − 1.5 × IQR (lower whisker); and the largest observation smaller than or equal to Q3 + 1.5 × IQR (upper whisker). Outliers, i.e., observations falling outside the expanded IQR, are shown as dots. All these synthetic networks are topologically equally efficient since they are fully connected; however, accounting for the weights can lead to dramatically different results.
The extreme heterogeneity of edge weights, characteristic of power-law distributions with small scaling exponent, strongly reduces the average communication efficiency of the network. Furthermore, as the tails of the weight distributions become lighter, the weighted GCE tends to the topological one. As the parameters λ and α grow, the heterogeneity of the weights decreases, since the tailedness of the distributions decreases. A measure of the tailedness of a distribution is the kurtosis, the standardized central moment of order 4. Usually, one evaluates the kurtosis minus three, which is called the excess kurtosis and represents the excess w.r.t. the kurtosis of any normal distribution, which is always equal to three. For the Poisson distribution, the excess kurtosis is $${\lambda }^{-1}$$; for the power law the excess kurtosis is finite only for α > 5, where it decreases as a function of α, tending to 6 as α → ∞. More details are given in Supplementary Note 1. This reduction in the tailedness of the weight distributions can also be seen in Fig. 2b), where we show the probability mass (resp. density) functions for λ = 1, 12 and α = 1.5, 7.
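The statement about the Poisson excess kurtosis can be checked numerically from the probability mass function. The sketch below is illustrative only: the function name, the truncation point `k_max`, and the `shift` argument (mimicking the "add one to each sample" step) are assumptions introduced here.

```python
import math


def poisson_excess_kurtosis(lam, shift=1, k_max=200):
    """Excess kurtosis of (Poisson(lam) + shift), from a truncated pmf.

    The pmf is built iteratively to avoid huge factorials; truncating
    at k_max is an assumption that is harmless for moderate lam.
    """
    p = math.exp(-lam)
    pmf = []
    for k in range(k_max):
        pmf.append(p)
        p *= lam / (k + 1)
    values = [k + shift for k in range(k_max)]
    mean = sum(v * q for v, q in zip(values, pmf))
    var = sum((v - mean) ** 2 * q for v, q in zip(values, pmf))
    m4 = sum((v - mean) ** 4 * q for v, q in zip(values, pmf))
    # Kurtosis is the standardized fourth central moment; subtract 3.
    return m4 / var ** 2 - 3.0


for lam in (1.0, 12.0):
    print(lam, poisson_excess_kurtosis(lam))
```

Shifting the samples by a constant leaves all central moments unchanged, so the shifted weights keep the excess kurtosis of the underlying Poisson distribution, which shrinks as λ grows.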

We next study the interplay between weight heterogeneity and topology through bond percolation, i.e., the targeted attack and removal of the links in the network26,33. By removing edges in decreasing weight order, we trim the tail of the weight distributions, reducing their heterogeneity. In Fig. 2c) we plot the four efficiency quantifiers as functions of the fraction f of removed edges, averaged over 30 random realizations of each model. Shaded areas indicate the standard deviation from the mean. We denote by Gf the damaged network obtained from G by removing a fraction f of its heaviest links. G0 is, topologically, a clique, so ET(G0) = 1. In ET(Gf) the denominator is always N(N − 1), hence Gf is by definition compared with a clique and ET(Gf) decreases monotonically. On the other hand, EMN, ELM (Fig. S4), and GCE use the flows of Gf to build the corresponding $${G}_{f}^{\text{ideal}}$$ and are, consequently, non-monotone functions of f. This might seem a limitation; nevertheless, it allows us to compare a series of networks Gf with slightly different, increasingly sparser topologies and flows that become increasingly homogeneous. In the “Methods” section we propose a modification of the GCE to overcome this possible limitation in percolation applications. EMN and GCE behave similarly, although EMN has larger fluctuations because at each step the edge with maximum weight is removed. As expected, there are clear differences in the percolation plots of Poissonian and power-law network flows, but in both cases removing the heaviest links produces an increase in the average communication efficiency. In both cases, when the flows become more homogeneous the GCE depends largely on the topology. Finally, when the network is disrupted, near the critical threshold fc indicated by the maximum of the second largest component size34,35 (insets of Fig. 2), the GCE has a break-down point, since we are averaging the efficiencies of many small, distinct (and maybe individually efficient) disconnected networks.
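The targeted removal of the heaviest links, together with the second-largest-component indicator of the critical threshold, can be sketched as below. The function names, the undirected edge-dictionary input, and the four-node toy network are assumptions made for illustration. An efficiency descriptor would be evaluated on the surviving edges at each step, which this sketch leaves out.

```python
def component_sizes(n, edges):
    """Sizes of connected components (largest first) via union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
    sizes = {}
    for v in range(n):
        r = find(v)
        sizes[r] = sizes.get(r, 0) + 1
    return sorted(sizes.values(), reverse=True)


def percolation_curve(n, weights, steps):
    """Remove the heaviest edges first and track component sizes.

    weights: dict {(i, j): w} of undirected edges (hypothetical format).
    Returns a list of (f, largest, second_largest) for f = k / steps.
    """
    ranked = sorted(weights, key=weights.get, reverse=True)
    curve = []
    for k in range(steps + 1):
        cut = (k * len(ranked)) // steps  # number of removed edges
        kept = ranked[cut:]
        sizes = component_sizes(n, kept)
        second = sizes[1] if len(sizes) > 1 else 0
        curve.append((k / steps, sizes[0], second))
    return curve


# Toy 4-cycle with distinct weights (illustrative only).
toy = {(0, 1): 4.0, (1, 2): 3.0, (2, 3): 2.0, (0, 3): 1.0}
for point in percolation_curve(4, toy, 4):
    print(point)
```

The step at which the second-largest component peaks gives a rough estimate of the critical fraction. On real ensembles one would average this curve over realizations, as done for the insets mentioned in the text.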

Before moving to synthetic networks with realistic topologies, let us spend a few words on the comparison between the GCE and EMN. The latter is, apparently, more attractive than the GCE, because it is easier to compute. A closer look, however, reveals some issues. Firstly, the sample maximum is the least robust order statistic, i.e., it is very sensitive to extreme values and outliers. If this is not a strong enough reason to avoid it, consider Fig. 3: the GCE converges to 1 as the weights become more homogeneous, while EMN remains approximately below 0.5, and these are very specific networks, namely fully connected ones. What is then the meaning of a descriptor normalized into [0, 1], when the maximum value is so difficult to reach?

Finally, we compare the two descriptors on synthetic networks generated from models of real systems, in particular small-world networks (Watts–Strogatz (WS) model2) and scale-free networks (Barabási–Albert (BA) model36). Again we consider 30 realizations of each topology, each having N = 256 nodes, average degree 〈k〉 ≈ 12, and around 5% of all possible edges. We indicate by $${\bf{A}}=({A}_{ij})$$ the adjacency matrix of each network. Upon these, the edge weights are assigned according to the following two rules:

$${w}_{ij}={k}_{i}^{\beta }$$
(2)
$${w}_{ij}={e}_{ij}^{\beta }$$
(3)

where $${k}_{i}=\mathop{\sum }\nolimits_{j = 1}^{N}{A}_{ij}$$ is the degree of node i, eij is the (topological) edge-betweenness of the link {i,j}, and β is a free parameter allowing us to tune the flow structure. The betweenness37 of the edge {i,j} is the sum, over all pairs of nodes s, t, of the number of shortest paths between s and t that go through {i,j}, here indicated by gsijt, divided by the total number gst of shortest paths between s and t: $${e}_{ij}=\mathop{\sum}\nolimits_{s\ne t}\frac{{g}_{sijt}}{{g}_{st}}$$. First observe that rule (2) (ref. 38) generates asymmetric weights; for positive β hubs have strong outgoing links, while for negative values of β the intensity of the connections decreases with the degree. The case β = 0 leads to unweighted systems. Results are summarized in Fig. 4. The poor robustness of EMN emerges in the plot: the distributions of EMN over the ensembles are skewed and have greater variance. Therefore, we focus on the GCE for the comparison between models. Topologically, BA networks are slightly more efficient than WS networks, but as soon as weights are introduced the picture changes: the degree distribution of small-world networks is less heterogeneous than that of scale-free networks, leading to less heterogeneous weights generated by (2) and higher efficiency, independently of the sign of β. On the other hand, when link strengths are related to their topological betweenness, BA networks are generally more efficient than WS networks, and when β > 0 the networks are more efficient, because those edges that lie on many shortest paths are also very efficient. Notice that here, differently from the previous example where all possible links were present, communication paths are unlikely to be able to “reorganize” (i.e., choose a different sequence of edges) in response to weight changes.
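The degree-based rule (2) is simple enough to sketch directly. The code below is a hedged illustration: the function name `degree_weights`, the list-of-lists adjacency format, and the star graph are assumptions. The betweenness-based rule (3) would need a shortest-path counting routine and is not covered here.

```python
def degree_weights(adjacency, beta):
    """Rule (2): w_ij = k_i ** beta on existing edges, 0 elsewhere.

    adjacency: square list of lists with 0/1 entries (hypothetical
    input format). The weight of the edge i -> j depends only on the
    degree of the source node i, hence the asymmetry.
    """
    n = len(adjacency)
    degrees = [sum(row) for row in adjacency]
    return [
        [float(degrees[i]) ** beta if adjacency[i][j] else 0.0
         for j in range(n)]
        for i in range(n)
    ]


# Star graph (illustrative only): node 0 is a hub with three leaves.
star = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]

for beta in (-1.0, 0.0, 1.0):
    w = degree_weights(star, beta)
    print(beta, w[0][1], w[1][0])
```

On the star, a positive β makes the hub's outgoing links stronger than the leaves' links back to it, a negative β reverses this, and β = 0 recovers the unweighted adjacency matrix.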

### Global efficiency of real interconnected systems

We use our framework to study the efficiency of four real systems (see Table 1). From the FAO worldwide food trade network we selected the layers of cocoa, coffee, tea, and tobacco. From the migration dataset we selected internal migration flows inside three Asian regions: India, China, and Vietnam. From the worldwide air traffic network we extracted the traffic in and between Europe and Africa. Finally, we consider the structural connectivity of human brain, quantified through diffusion tensor imaging (DTI) and fiber tractography methods.

These real networks have different properties, among which edge density and weight distribution. Based on the results of our analysis of synthetic networks and on previous studies14,26, we expect the weighted efficiency of these real networks to be smaller than their topological efficiency. Thanks to our normalizing procedure, which can be applied unchanged to all networks, it is legitimate to compare the weighted efficiencies of diverse systems and we expect the trade networks to have the smallest efficiency. As a matter of fact, observing the boxplots in Fig. 5 (and Fig. S8 of the Supplementary Note 2) we can see that the distributions of the network flows in the trade networks are highly heterogeneous. The whiskers in the boxplots extend from the minimum to the maximum of the distribution to highlight the presence of extreme outliers or very heavy tails. Let us then look at the results of the analysis.

Figure 6 shows the curves corresponding to ET(Gf) and GCE(Gf). Independently of the system, ignoring the network flows leads to an overestimation of the average efficiency, especially when flows are highly heterogeneous. The network of internal migration is the most efficient, but it also has the highest cost, being a clique. The tea trade network is the least efficient. Finally, the brain and the airport networks have similar GCEs until the first 25% of their edges are removed, with the brain remaining more efficient afterwards w.r.t. the reduced flows. Observe that the total flow could be restored, while keeping a specific efficiency value, by redistributing the removed flow on the remaining links. In general, removing those edges monopolizing shortest paths forces their reallocation, inducing an increase of the global weighted efficiency.

## Conclusion

Exchanging information is one of the main functions of many real complex systems and quantifying how efficiently they perform this task is of great interest for different disciplines. Consequently, the concept of communication and transport efficiency is relevant for a broad range of applications, from public transportation to the human brain and the Internet. Defining the efficiency, and telling which systems are efficient and which are not, is still an open and debated question. It depends on the context (“Are we interested in the efficiency of a system in relation to an objective, such as a maximum cost, a performance level, or in an absolute efficiency measure?”), on the amount of information available on the system (“Do its units have to wait before passing a message?”, “Is this information encoded (or even, can this information be encoded) in the network representing the system?”), etc. For instance, communication and transport processes in many real systems may involve waiting times, error rates and technical inefficiency of the networked system, and possibly also other variables that cannot be encoded as purely topological network features. In this case, characterizing the efficiency of the system by link weights and weighted shortest-path distances is an oversimplification.

However, given a network, without any metadata, we can reasonably assume that nodes connected by a link are closer or more similar than disconnected nodes, and that connected nodes communicate more easily and efficiently than disconnected ones. Furthermore, we can assume that strong, heavy links bring the nodes nearer, reducing the cost of their interaction and communication. Hence, our approach on system’s efficiency is based on the assumptions (i) of parallel information exchange between units and (ii) that the network representation of the system is enough for assessing its communication efficiency. Under these assumptions, we studied systems represented by flow networks, where links encode volumes of people, electrochemical junctions, packets, etc. While there is a widely adopted descriptor for the global communication efficiency in case of unweighted networks, we have found that its generalization to the case of weighted networks might not be suitable in all relevant cases. In this work we have identified and explained the mathematical limitations of the current measures.

A direct consequence of our analysis is that an estimation of global efficiency can be trusted only under specific conditions: i.e., the analysis of efficiency in the case of real network flows cannot be performed or, alternatively, when it is performed it might lead to important underestimation or overestimation of results, which agrees with previous results26. Since flow networks are ubiquitous, here we have proposed the most general definition of the global communication efficiency for weighted directed networks, which does not assume any other (meta-)information on the system.

Using our physically grounded definition of flow network efficiency, our results indicate that one can achieve a desired level of efficiency by wisely redistributing weights, instead of altering the underlying topology. This result is relevant for practical applications, since it is not always guaranteed that one can rewire or dramatically change with other interventions the network connectivity. In fact, altering network structure is usually expensive in economic or energetic terms. Our framework works under mild assumptions about the underlying topology and about the ideal and most efficient network, with no metadata, nor additional spatial (e.g., geographic) information on the system, allowing for trustworthy applications to empirical problems. It also allows for a complementary view of bond percolation from a functional perspective, allowing us to gain new insights about critical phases of information exchange and network flows in addition to topological ones.

## Methods

### Mathematical details on the normalizing procedure

We provide the proof of dij ≥ ℓij ∀ i ≠ j ∈ V, which is a sufficient condition for the GCE to be correctly normalized in [0, 1]. Recall that SP(i, j) denotes a weighted (directed) shortest-path from i to j and $${d}_{ij}=\mathop{\sum}\nolimits_{n,m\in {\mathrm{SP}}(i,j)}{w}_{nm}^{-1}$$. Observe also that, if the shortest-path between i, j coincides with their link (i, j), the number of vertices in the sequence is |SP(i, j)| = 2 and their shortest-path distance is $${d}_{ij}=\frac{1}{{w}_{ij}}$$. The total flow between i and j through the shortest-path SP(i, j) is defined as $${\phi }_{ij}=\mathop{\sum}\nolimits_{n,m\in {\mathrm{SP}}(i,j)}{w}_{nm}$$.

Before proving our main statements we write an inequality, which will be extensively used in the following proofs. The Cauchy–Schwarz inequality for vectors u, v in an inner product space reads $$| \langle {\bf{u}},{\bf{v}}\rangle {| }^{2}\le \langle {\bf{u}},{\bf{u}}\rangle \cdot \langle {\bf{v}},{\bf{v}}\rangle$$. Taking $${\bf{u}}=\left(\frac{1}{\sqrt{{x}_{1}}},\ldots ,\frac{1}{\sqrt{{x}_{n}}}\right)$$ and $${\bf{v}}=\left(\sqrt{{x}_{1}},\ldots ,\sqrt{{x}_{n}}\right)$$ the inequality becomes

$${n}^{2}={\left(\mathop{\sum }\limits_{i = 1}^{n}\frac{\sqrt{{x}_{i}}}{\sqrt{{x}_{i}}}\right)}^{2} \le \, \left(\mathop{\sum }\limits_{i=1}^{n}\frac{1}{{x}_{i}}\right)\left(\mathop{\sum }\limits_{i=1}^{n}{x}_{i}\right)\\ {n}^{2}{\left(\mathop{\sum }\limits_{i = 1}^{n}{x}_{i}\right)}^{-1} \le \, \left(\mathop{\sum }\limits_{i=1}^{n}\frac{1}{{x}_{i}}\right).$$
(4)

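Inequality (4) states that, for positive real numbers x1, …, xn, the sum of their reciprocals is at least $${n}^{2}$$ times the inverse of their sum; in particular, since n ≥ 1, the inverse of their sum is smaller than or equal to the sum of their reciprocals.

Since we have assumed edge weights to be positive we can apply the inequality, which leads us to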
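As a quick numerical illustration of inequality (4), the following minimal sketch samples positive numbers and compares the two sides. The function name `cauchy_schwarz_sides`, the sampling ranges, and the tolerance are assumptions made only for this example.

```python
import random


def cauchy_schwarz_sides(xs):
    """Return (lhs, rhs) of inequality (4): n^2 / sum(x) <= sum(1 / x)."""
    n = len(xs)
    lhs = n ** 2 / sum(xs)
    rhs = sum(1.0 / x for x in xs)
    return lhs, rhs


random.seed(0)  # fixed seed so the check is reproducible

# Hypothetical test data: 1000 lists of 1 to 8 strictly positive numbers.
samples = [
    [random.uniform(0.1, 10.0) for _ in range(random.randint(1, 8))]
    for _ in range(1000)
]

# A small relative tolerance absorbs floating-point rounding.
all_hold = all(
    lhs <= rhs * (1.0 + 1e-12)
    for lhs, rhs in (cauchy_schwarz_sides(xs) for xs in samples)
)

# Equality is reached when all the numbers coincide.
equal_case = cauchy_schwarz_sides([3.0, 3.0, 3.0])
```

The equality case corresponds to homogeneous values, which is consistent with the role that homogeneous weights play in the rest of the section.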

$${\left(\mathop{\sum}\limits_{n,m\in {\mathrm{SP}}(i,j)}{w}_{nm}\right)}^{-1}\le | {\mathrm{SP}}(i,j){| }^{2}{\left(\mathop{\sum}\limits_{n,m\in {\mathrm{SP}}(i,j)}{w}_{nm}\right)}^{-1}\le \mathop{\sum}\limits_{n,m\in {\mathrm{SP}}(i,j)}{w}_{nm}^{-1}.$$
(5)

Observe that |SP(i, j)| ≥ 2 if G is connected, therefore the first inequality is actually strict.

From (5) we can derive useful inequalities involving wij, ϕij, dij, and ℓij:

$${\phi }_{ij}^{-1}={\left(\mathop{\sum}\limits_{n,m\in {\mathrm{SP}}(i,j)}{w}_{nm}\right)}^{-1}\le \mathop{\sum}\limits_{n,m\in {\mathrm{SP}}(i,j)}{w}_{nm}^{-1}={d}_{ij}$$
(6)

Note that if wij ≠ 0, it also holds that $${d}_{ij}\le \frac{1}{{w}_{ij}}$$.

It is also possible to prove that ϕij ≥ wij ∀ i, j ∈ V. Indeed, if i, j are not adjacent then wij = 0 but, since G is connected, there is a path between them with ϕij > 0. If instead they are adjacent, either ϕij = wij, meaning that the weighted shortest path coincides with the edge (i, j), or there is a shortest path going through other vertices, such that $${d}_{ij}=\mathop{\sum}\nolimits_{n,m\in {\mathrm{SP}}(i,j)}{w}_{nm}^{-1}<\frac{1}{{w}_{ij}}$$ and the claim follows from (6).

Starting from the definition of physical distances ℓij, and using simple inequalities and (5), we obtain

$${\ell }_{ij} = \, 2{\left({w}_{ij}+{\phi }_{ij}\right)}^{-1}\,\le\, 2{\left(\mathop{\sum}\limits_{n,m\in {\mathrm{SP}}(i,j)}{w}_{nm}\right)}^{-1}\,\le\, | {\mathrm{SP}}(i,j){| }^{2}{\left(\mathop{\sum}\limits_{n,m\in {\mathrm{SP}}(i,j)}{w}_{nm}\right)}^{-1}\\ \le \,\mathop{\sum}\limits_{n,m\in {\mathrm{SP}}(i,j)}{w}_{nm}^{-1}={d}_{ij}.$$
(7)

Again, for a connected network G the strict inequality ℓij < dij holds.

Finally, ϕij = 0 if and only if i and j lie in different connected components and, consequently, the ideal network is disconnected like the original one. In this case $${d}_{ij}=\frac{1}{{\phi }_{ij}}=\infty$$, and the missing links among disconnected components do not produce an underestimation of the efficiencies of the subgraphs. Of course, if the network is very fragmented, the GCE, being a global descriptor, will not be very informative. Below, we propose a variant of the GCE that is more appropriate in this case and in percolation simulations in general.
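The quantities used in this proof can be computed on a small example. The sketch below is only illustrative: it assumes an undirected network stored as a dense weight matrix, uses the Floyd–Warshall algorithm on the costs 1/w, and, when several shortest paths tie, keeps the first one found. The function names `distances_and_flows` and `gce` are hypothetical and not taken from any released code.

```python
import math


def distances_and_flows(W):
    """Floyd-Warshall on edge costs 1/w.

    Returns (d, phi): d[i][j] is the weighted shortest-path distance and
    phi[i][j] the sum of the weights along that same path (0 if j cannot
    be reached from i). Diagonal entries are left at inf / 0.
    """
    n = len(W)
    d = [[math.inf] * n for _ in range(n)]
    phi = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and W[i][j] > 0:
                d[i][j] = 1.0 / W[i][j]
                phi[i][j] = float(W[i][j])
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if len({i, j, k}) == 3 and d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
                    phi[i][j] = phi[i][k] + phi[k][j]
    return d, phi


def gce(W):
    """E(G) / E(G_ideal) with 1 / ell_ij = (w_ij + phi_ij) / 2."""
    n = len(W)
    d, phi = distances_and_flows(W)
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    # The common factor 1 / (N (N - 1)) cancels in the ratio.
    e_real = sum(1.0 / d[i][j] for i, j in pairs if d[i][j] < math.inf)
    e_ideal = sum((W[i][j] + phi[i][j]) / 2.0 for i, j in pairs)
    return e_real / e_ideal if e_ideal > 0 else 0.0


# Three-node example with weights 4, 1, 2 (chosen for illustration).
example_W = [[0, 4, 1], [4, 0, 2], [1, 2, 0]]
d, phi = distances_and_flows(example_W)

# Check ell_ij <= d_ij on every ordered pair.
bound_holds = all(
    2.0 / (example_W[i][j] + phi[i][j]) <= d[i][j] + 1e-12
    for i in range(3) for j in range(3) if i != j
)
value = gce(example_W)  # 44/57, about 0.772
```

On a complete network with constant weights the same function returns 1, in agreement with the normalization discussed above.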

### Comparisons with other weighted efficiency measures

In this work we considered existing measures of topological and weighted efficiency, more specifically: ET, the topological efficiency defined in ref. 3; ELM, defined by Latora and Marchiori in ref. 14; and EMN, obtained by evaluating the efficiency on the network with max-normalized weights.

Let us recall the definition of efficiency3

$$E(G)=\frac{1}{N(N-1)}\mathop{\sum}\limits_{i\ne j\in V}{d}_{ij}^{-1}.$$
(8)

where we have the sum of the reciprocal values of pairwise distances $$\mathop{\sum}\nolimits_{i\ne j\in V}{d}_{ij}^{-1}$$ divided by the number of non-diagonal entries in the distance matrix, i.e., N(N − 1). In the topological case this last term plays the role of a normalizing factor, since the sum of inverse shortest-path distances in a clique is exactly equal to N(N − 1). We refer to the efficiency (8) evaluated without edge weights, or in other words with topological shortest-path distances $$({d}_{ij})$$, as the topological efficiency and indicate it by ET. Then ET naturally lies in [0, 1].

The main difficulty arising in the definition of a weighted efficiency descriptor has to do with the diversity of information that can be encoded as edge weights in a network. Usually weights represent connection strengths, and connection costs are obtained as a function (e.g., the inverse) of weights. Given the connection costs, one can compute weighted shortest-path distances28,29,30,31, which vary in [0, +∞]; therefore, the efficiency computed according to (1) satisfies E(G) ∈ [0, +∞) and needs to be rescaled (or normalized) in order to be comparable among different systems.
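To make the normalization of the topological efficiency concrete, here is a minimal sketch that evaluates (8) with hop-count distances obtained by breadth-first search. The adjacency-dictionary representation and the function name `topological_efficiency` are assumptions for this example.

```python
from collections import deque


def topological_efficiency(adj):
    """Efficiency (8) with unweighted (hop-count) shortest-path distances.

    adj maps each node to an iterable of its neighbours. Unreachable
    pairs contribute 0, i.e. their distance is treated as infinite.
    """
    nodes = list(adj)
    n = len(nodes)
    total = 0.0
    for source in nodes:
        # Breadth-first search gives hop distances from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(1.0 / h for t, h in dist.items() if t != source)
    return total / (n * (n - 1))


# A clique on 4 nodes: every distance is 1, so the efficiency is 1.
clique = {i: [j for j in range(4) if j != i] for i in range(4)}
e_clique = topological_efficiency(clique)

# A path 0 - 1 - 2: distances are 1, 1 and 2, so the efficiency is 5/6.
path = {0: [1], 1: [0, 2], 2: [1]}
e_path = topological_efficiency(path)
```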

EMN, which has been used, for instance, in refs. 22,24,25,26, is the simplest generalization of ET to the weighted case: rescaling the weights to [0, 1] implies that shortest-path distances dij ≥ 1, since weighted shortest paths are those paths minimizing the sum of edge costs, that is, inverse weights. Consequently, since dij ≥ 1, (8) is normalized. Different rescaling transformations of weights are possible; the most common is the max-normalization (hence the superscript MN) $${\tilde{w}}_{ij}=\frac{{w}_{ij}}{\mathop{\max }\limits_{i,j}{w}_{ij}}$$. The cost of edges is then $${\tilde{w}}_{ij}^{-1}$$. We now show that $${E}^{\text{MN}}(G)=\frac{E(G)}{{w}_{\max }}$$, where E(G) is (8) calculated on weighted geodesic distances without the max-normalization of weights. Let wmax be the maximum weight over all edges of a weighted network G = (V, E) and let SP(i, j) be a weighted shortest path between i, j ∈ V. Observe that the max-normalization of weights does not affect the shortest path, but it does affect the shortest-path distance

$${\tilde{d}}_{ij}=\mathop{\sum}\limits_{n,m\in {\mathrm{SP}}(i,j)}\frac{1}{{\tilde{w}}_{nm}}=\mathop{\sum}\limits_{n,m\in {\mathrm{SP}}(i,j)}\frac{{w}_{\max }}{{w}_{nm}}={w}_{\max }{d}_{ij}.$$

Finally,

$${E}^{\text{MN}}(G)=\frac{1}{N(N-1)}\mathop{\sum}\limits_{i\ne j\in V}{\tilde{d}}_{ij}^{-1}=\frac{1}{{w}_{\max }}E(G).$$

This fact could be appealing for computational reasons, but it is definitely not from a statistical point of view: the sample maximum and minimum are the least robust statistics, being maximally sensitive to outliers. For this reason EMN may fluctuate wildly over networks that are topologically similar but have different maximum weights, which makes this indicator not well suited for comparisons between different systems. In other words, using this descriptor one might not be able to tell whether two systems show different global efficiency values because they are characterized by different topologies and a different interplay between topology and flows, or because their maximum weights are, or are not, outliers of their weight distributions, a very global and extreme feature of the network. Of course, max-normalization and the inverse are not the only available choices: for instance, in ref. 23 weights are wavelet correlation coefficients between regions in the brain and the cost of the connection between regions i and j is defined as cij = 1 − wij.
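The identity between EMN and E(G) divided by the maximum weight can be verified numerically. The following sketch assumes a small undirected network given as a dense weight matrix and a plain Floyd–Warshall routine; the name `efficiency` and the example weights are illustrative choices.

```python
import math


def efficiency(W):
    """Efficiency (8) on weighted shortest paths with edge cost 1/w."""
    n = len(W)
    d = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and W[i][j] > 0:
                d[i][j] = 1.0 / W[i][j]
    # Floyd-Warshall relaxation over intermediate nodes k.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if len({i, j, k}) == 3 and d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    total = sum(1.0 / x for row in d for x in row if x < math.inf)
    return total / (n * (n - 1))


W = [[0, 4, 1], [4, 0, 2], [1, 2, 0]]          # illustrative weights
w_max = max(max(row) for row in W)              # here w_max = 4
W_tilde = [[w / w_max for w in row] for row in W]

e_raw = efficiency(W)        # 22/9
e_mn = efficiency(W_tilde)   # 11/18, i.e. e_raw / w_max
```

Because the whole curve is divided by a single number, replacing only the largest weight by an extreme value rescales EMN for every pair of nodes, which is the sensitivity to outliers described above.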

ELM is the weighted generalization of E(G) proposed by Latora and Marchiori in ref. 14. The idea is to normalize E(G) by considering an ideal case Gideal, where all possible edges are present in the idealized graph and information propagates most efficiently. Then,

$$\frac{E(G)}{E({G}_{\text{ideal}})}=\frac{\frac{1}{N(N-1)}\mathop{\sum}\limits_{i\ne j\in V}{d}_{ij}^{-1}}{\frac{1}{N(N-1)}\mathop{\sum}\limits_{i\ne j\in V}{\ell }_{ij}^{-1}}\le 1.$$
(9)

Observing that a sufficient condition for (9) is 0 ≤ ℓij ≤ dij for all i, j ∈ V, defining Gideal reduces to building the matrix $${({\ell }_{ij})}_{i,j}$$. They called ℓij physical distances, in contrast to shortest-path distances, highlighting that the latter are computed using “the information contained both in the binary adjacency matrix and in $${({\ell }_{ij})}_{i,j}$$”. Observe that the matrix $${({\ell }_{ij})}_{i,j}$$ is, in every respect, a matrix of connection costs. In ref. 14 (Sec. 3) the authors give some examples of how to obtain ℓij from edge weights. For instance, if weights wij ≥ 1 one can define $${\ell }_{ij}=\min \{1,\frac{1}{{w}_{ij}}\}$$, which is the transformation adopted in this work to compute ELM.

We refer to the Supplementary Note 2 (Figs. S2 and S3) for the full plot corresponding to Fig. 2b) of this study. The GCE converges faster to 1 as the weight distributions become less heterogeneous (in terms of kurtosis). We claim that the maximum of the GCE is obtained not only for full networks with constant edge weight distribution, but it is sufficient to have a uniform edge betweenness, as shown in Fig. S3.

Finally, panel c) of Fig. 2 without the cut on the range of y-values is reported in Fig. S4 of the Supplementary Note 2. The percolation simulation consists in removing edges from an undirected weighted full network G, in decreasing weight order. We indicate by f the fraction of edges removed from G and by Gf the resulting, damaged, network with G0 = G. We then evaluate the efficiency of Gf by means of the already described measures: ET, EMN, ELM, and GCE. We repeat the process 30 times, sampling the edge weights from a Poisson distribution with parameter λ = 2, and 30 times, sampling the edge weights from a power-law distribution with free parameter α = 2.5. We also include the plots for common percolation indicators, such as the total weight of the largest connected component (LCC), i.e., the sum of its edge weights, rescaled in [0, 1]; the size of the second LCC, divided by N = |V|; and the number of clusters (components), also divided by N = |V|; see Fig. S5. The second LCC size has proven better at pinpointing the critical threshold in percolating lattices35, as well as at distinguishing between percolation regimes34.

Our normalization procedure can also be used to build a slightly modified version of the GCE that plays the role of a weighted integrity descriptor for percolation analysis. Let $${G}_{0}^{\,\text{ideal}\,}$$ be the idealized network corresponding to G0, built as described in our study. Then

$${\text{GCE}}^{* }({G}_{f})=\frac{E({G}_{f})}{E({G}_{0}^{\,\text{ideal}\,})}$$
(10)

is normalized in [0, 1] and it is a monotone decreasing function w.r.t. f.

This variant of the GCE has been evaluated for real networks in the Supplementary Note 2, Fig. S9.
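The percolation protocol and the variant (10) can be sketched in a few lines. This is a simplified illustration rather than the simulation code used for the figures: it assumes a small complete undirected network with integer weights drawn uniformly from 1 to 5 (instead of the Poisson and power-law ensembles), removes one edge at a time in decreasing weight order, and uses hypothetical function names.

```python
import math
import random


def distances_and_flows(W):
    """Floyd-Warshall on costs 1/w; also tracks the weight sum phi per path."""
    n = len(W)
    d = [[math.inf] * n for _ in range(n)]
    phi = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and W[i][j] > 0:
                d[i][j] = 1.0 / W[i][j]
                phi[i][j] = float(W[i][j])
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if len({i, j, k}) == 3 and d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
                    phi[i][j] = phi[i][k] + phi[k][j]
    return d, phi


def inverse_distance_sum(d):
    """Sum of 1/d_ij over pairs at finite distance (diagonal is inf)."""
    return sum(1.0 / x for row in d for x in row if x < math.inf)


def gce_star_curve(W0):
    """Return [(f, GCE*(G_f))] while removing edges by decreasing weight."""
    n = len(W0)
    _, phi0 = distances_and_flows(W0)
    # Denominator of (10): fixed once, from the ideal version of G_0.
    ideal = sum((W0[i][j] + phi0[i][j]) / 2.0
                for i in range(n) for j in range(n) if i != j)
    edges = sorted(((W0[i][j], i, j) for i in range(n)
                    for j in range(i + 1, n) if W0[i][j] > 0), reverse=True)
    W = [list(row) for row in W0]
    d, _ = distances_and_flows(W)
    curve = [(0.0, inverse_distance_sum(d) / ideal)]
    for removed, (_, i, j) in enumerate(edges, start=1):
        W[i][j] = W[j][i] = 0  # delete the heaviest remaining edge
        d, _ = distances_and_flows(W)
        curve.append((removed / len(edges), inverse_distance_sum(d) / ideal))
    return curve


random.seed(1)
N = 6
W0 = [[0] * N for _ in range(N)]
for a in range(N):
    for b in range(a + 1, N):
        W0[a][b] = W0[b][a] = random.randint(1, 5)

curve = gce_star_curve(W0)
```

Since edge removal can only lengthen shortest paths while the denominator stays fixed, the values in `curve` never increase with f and reach 0 once all edges are gone.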

### On artificial flows

We choose to build the artificial flows integrating weights over paths, but this is not the only possibility, provided that the constraint ℓij ≤ dij is satisfied. Now, using the same definitions and notation adopted in this study,

$${d}_{ij} = \, \mathop{\sum}\limits_{n,m\in {\mathrm{SP}}(i,j)}\frac{1}{{w}_{nm}}\\ \ge \, \frac{1}{\min \{{w}_{nm}:n,m\in {\mathrm{SP}}(i,j)\}}$$
(11)
$$\, \ge \frac{| {\mathrm{SP}}(i,j)| }{{\sum }_{n,m\in {\mathrm{SP}}(i,j)}{w}_{nm}}\qquad\qquad$$
(12)
$$\; \;\;\ge \frac{1}{\max \{{w}_{nm}:n,m\in {\mathrm{SP}}(i,j)\}}$$
(13)
$$\ge \frac{1}{{\phi }_{ij}}\qquad\qquad\qquad\qquad\quad$$

So the minimum, the maximum, and the average weight over the path are all valid choices, as are the sum (our choice) and the maximum over all edges (the already discussed max-normalization). Now, when using the sum of flows over paths, we can combine the two sources of information W and Φ through the arithmetic mean, while this strategy is not possible if we define $${\phi }_{ij}^{* }=\min \{{w}_{nm}:n,m\in {\mathrm{SP}}(i,j)\}$$ (resp. $$\max$$) because we cannot prove that $${\ell }_{ij}=\frac{2}{{\phi }_{ij}^{* }+{w}_{ij}}\le {d}_{ij}$$, so we should drop W and simply define Wideal = Φ*. We implemented the choices corresponding to the minimum and maximum and show the results on our synthetic network ensembles in Fig. S6. The two variants, called here GCEmin and $${\text{GCE}}^{\max }$$, converge faster to 1, since the bounds (11) and (13) are less strict than ours (the sum). Taking the minimum (11), in particular, may result in values of efficiency spanning a narrow range, near 1, with a consequent difficulty in distinguishing networks on the basis of efficiency. Furthermore, in the bottom panel of Fig. S6 we can see a decreasing-increasing behavior of GCEmin which is, from our point of view, not desirable. $${\text{GCE}}^{\max }$$ displays, in general, a larger variability. We opt for the sum, since it has a physical meaning in terms of total flow of a subgraph (a path SP(i, j)), it allows us to average the artificial flows matrix with the original flows given by W and, last but not least, it is easy to work with mathematically (it simplifies rigorous proofs).
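The ordering of the candidate bounds in (11)–(13) can be checked on the weights of a single path. The sketch below uses exact rational arithmetic and therefore assumes positive integer weights; it also assumes that the factor |SP(i, j)| in (12) is read as the number k of edges on the path. The function name `path_bounds` is hypothetical.

```python
import random
from fractions import Fraction


def path_bounds(weights):
    """Bounds along one path, listed from the loosest to the strictest.

    weights: positive integer weights of the edges on the path.
    """
    k = len(weights)      # number of edges on the path (assumed reading)
    phi = sum(weights)    # total flow along the path
    return [
        ("d_ij", sum(Fraction(1, w) for w in weights)),
        ("1/min", Fraction(1, min(weights))),
        ("k/sum", Fraction(k, phi)),
        ("1/max", Fraction(1, max(weights))),
        ("1/phi", Fraction(1, phi)),
    ]


# Path made of two edges with weights 4 and 2.
example = path_bounds([4, 2])
example_values = [value for _, value in example]  # 3/4, 1/2, 1/3, 1/4, 1/6

# Randomized check on hypothetical paths of 1 to 6 edges.
random.seed(2)
chain_holds = True
for _ in range(500):
    ws = [random.randint(1, 20) for _ in range(random.randint(1, 6))]
    vals = [value for _, value in path_bounds(ws)]
    chain_holds = chain_holds and all(a >= b for a, b in zip(vals, vals[1:]))
```

The last entry, based on the sum of the weights, is the smallest of the five, which is why the sum-based ideal network gives the strictest normalization among these choices.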

### On the normalized weighted efficiency of Latora and Marchiori14

Let us take G as the subgraph consisting of vertices q, r, v of Fig. 1 (indicated now by the indices {1, 2, 3}) and suppose that the weights are the result of the aggregation of multiple binary connections. Its weighted adjacency matrix is

$${\bf{W}}=\left(\begin{array}{lll}\cdot &4&1\\ 4&\cdot &2\\ 1&2&\cdot \end{array}\right)$$

We can compute physical distances ℓij following the suggestions in ref. 14 and shortest-path distances dij minimizing the sum of costs (i.e., inverse weights)

$${\bf{L}}=\left(\begin{array}{lll}\cdot &\frac{1}{4}&1\\ \frac{1}{4}&\cdot &\frac{1}{2}\\ 1&\frac{1}{2}&\cdot \end{array}\right)\quad {\bf{D}}=\left(\begin{array}{lll}\cdot &\frac{1}{4}&\frac{3}{4}\\ \frac{1}{4}&\cdot &\frac{1}{2}\\ \frac{3}{4}&\frac{1}{2}&\cdot \end{array}\right).$$

The global communication efficiency defined in ref. 14 is given by $${E}^{{\rm{LM}}}=\frac{E(G)}{E\left({G}_{{\rm{ideal}}}\right)}$$, where $$E(G)=\frac{1}{N(N-1)}\mathop{\sum}\nolimits_{i\ne j}\frac{1}{{d}_{ij}}$$ and $$E({G}_{{\rm{ideal}}})=\frac{1}{N(N-1)}\mathop{\sum}\nolimits_{{\mathrm{i}}\ne {\mathrm{j}}}\frac{1}{{\ell }_{{\mathrm{ij}}}}$$. Observe that the condition (which is sufficient for $$\frac{E(G)}{E({G}_{{\rm{ideal}}})}\le 1$$)

$${d}_{{\mathrm{ij}}}\ge {\ell }_{{\mathrm{ij}}}\quad \forall {\mathrm{i}}\ne {\mathrm{j}}\in {V}$$
(14)

is not satisfied for i = 1, j = 3 and this causes $${E}^{{\rm{LM}}}=\frac{E(G)}{E({G}_{{\rm{ideal}}})}=\frac{22}{9}{\left(\frac{7}{3}\right)}^{-1}=\frac{22}{21}> 1$$.

This counterexample to (14) is not a pathological case: (14) is violated whenever the weighted shortest path between adjacent nodes i, j does not traverse the direct link eij, i.e., $${d}_{{\mathrm{ij}}}<\frac{1}{{w}_{{\mathrm{ij}}}}$$, which may often happen in real networks with large, heterogeneous weights.
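The three-node counterexample can be reproduced with exact rational arithmetic. The sketch below assumes the weight matrix W given above, physical distances defined as min{1, 1/w}, and a Floyd–Warshall computation of the shortest-path distances; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

# Weighted adjacency matrix of the three-node subgraph (nodes 0, 1, 2).
W = [[0, 4, 1],
     [4, 0, 2],
     [1, 2, 0]]
n = len(W)
pairs = list(permutations(range(n), 2))  # ordered pairs i != j

# Physical distances: ell_ij = min{1, 1 / w_ij}.
ell = {(i, j): min(Fraction(1), Fraction(1, W[i][j])) for i, j in pairs}

# Shortest-path distances with edge cost 1 / w_ij (the graph is complete,
# so every off-diagonal entry starts from a finite value).
d = [[Fraction(1, W[i][j]) if i != j else Fraction(0) for j in range(n)]
     for i in range(n)]
for k in range(n):
    for i in range(n):
        for j in range(n):
            d[i][j] = min(d[i][j], d[i][k] + d[k][j])

e_real = sum(1 / d[i][j] for i, j in pairs) / (n * (n - 1))   # 22/9
e_ideal = sum(1 / ell[p] for p in pairs) / (n * (n - 1))      # 7/3
e_lm = e_real / e_ideal                                       # 22/21

# Ordered pairs for which condition (14) fails, i.e. d_ij < ell_ij.
violations = [(i, j) for i, j in pairs if d[i][j] < ell[(i, j)]]
```

With zero-based indices, the only violating pairs are (0, 2) and (2, 0), which correspond to i = 1, j = 3 in the text.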

Trying to reproduce the results in ref. 14, we considered the neural network of the C. elegans2,14, with data from http://www-personal.umich.edu/~mejn/netdata/. First, we aggregated multiple edges, obtaining a simple, directed, weighted network with N = 297 nodes, m = 2345 edges, and weights in the range [1, 70]. If we consider the network as undirected, we obtain m = 2148 edges and weights in the range [1, 72]. The data are not the same as those used in ref. 14, so we cannot reproduce their results exactly. Let us focus on the undirected network: Fig. S7 shows the distance matrix D evaluated using Dijkstra’s algorithm with the reciprocal of edge weights, and the matrix of physical distances L, with $${\ell }_{{\mathrm{ij}}}=\min \{1,\frac{1}{{w}_{{\mathrm{ij}}}}\}$$.

### Real interconnected systems, additional results

Here we first apply the variant of the GCE, i.e., GCE*(Gf) to the networks of migrations inside Vietnam and of the human brain; secondly, we report the detailed percolation results for the real network flows discussed in this study.

We refer to the Supplementary Note 2, Fig. S9, to show the behavior of GCE*(G) for two of the real networks from Table 1.

Finally, we show the percolation plots for the remaining datasets studied in this work, see Supplementary Figs. S11 and S12.