## Introduction

A multiplex is a network in which nodes are connected through different types or flavors of pairwise edges1,2,3. A convenient way to think of a multiplex is as a collection of network layers, each representing a specific type of edges. Multiplex networks are genuine representations for several real-world systems, including social4,5, and technological systems6,7. From a theoretical point of view, a common strategy to understand the role played by the co-existence of multiple network layers is based on a rather simple approach. Given a process and a multiplex network, one studies the process on the multiplex and on the single-layer projections of the multiplex (e.g., each of the individual layers, or the network obtained from aggregation of the layers). Recent research has demonstrated that ignoring the effective co-existence of different types of interactions in the study of a multiplex network may have dramatic consequences in the ability to model and predict properties of the system. Examples include dynamical processes, such as diffusion8,9, epidemic spreading10,11,12,13, synchronization14, and controllability15, as well as structural processes such as those typically framed in terms of percolation models16,17,18,19,20,21,22,23,24,25,26,27,28,29.

The vast majority of the work on structural processes on multiplex networks has focused on ordinary percolation models, where nodes (or edges) are considered to be either in a functional or in a non-functional state with homogeneous probability30. In this paper, we shift the focus to the optimal version of the percolation process: we study the problem of identifying the smallest set of nodes in a multiplex network such that, if these nodes are removed, the network is fragmented into many disconnected clusters with non-extensive sizes. We refer to the nodes belonging to this minimal set as structural nodes (SNs) of the multiplex network. The solution of the optimal percolation problem has direct applicability in the context of robustness, representing the cheapest way to dismantle a network31,32,33. The solution is also important in other contexts, being equivalent to the best strategy of immunization against a spreading process, and to the best strategy of seeding a network for some classes of opinion dynamical models34,35,36,37. Despite its importance, optimal percolation has been introduced and considered in the framework of single-layer networks only recently35,36. Optimal percolation is an NP-complete problem32. Hence, on large networks, we can only use heuristic methods to find approximate solutions. Most of the research activity on this topic has indeed focused on the development of greedy algorithms31,32,33,35.

Here we consider the generalization of optimal percolation to multiplex networks. Our generalization consists in the redefinition of the problem in terms of mutual connectedness16. To this end, we reframe several algorithms for optimal percolation in single-layer networks to obtain methods that consider the multiplex structure of networks as well. Basically all the algorithms we use provide coherent solutions to the problem, finding sets of SNs that are almost identical. Our main focus, however, is not on the development of new algorithms, but on understanding the consequences that arise from neglecting the multiplex nature of a network under an optimal percolation process. We compare the actual solution of the optimal percolation problem in a multiplex network with the solutions to the same problem for single-layer networks extracted from the multiplex system. We show that “forgetting” about the presence of multiple layers can be potentially dangerous, leading to the overestimation of the true robustness of the system mostly due to the identification of a very high number of false SNs. We reach this conclusion with a systematic analysis of both synthetic and real-world multiplex networks.

## Results

### Identifying structural nodes in multiplex networks

We consider a multiplex network composed of N nodes arranged in two layers. Each layer is an undirected and unweighted network. Connections of the two layers are encoded in the adjacency matrices A and B. The generic element $$A_{ij} = A_{ji} = 1$$ if nodes i and j are connected in the first layer, whereas $$A_{ij} = A_{ji} = 0$$ otherwise. The same definition applies to the second layer, and thus to the matrix B. The aggregated network obtained from the superposition of the two layers is characterized by the adjacency matrix C, with generic elements $$C_{ij} = A_{ij} + B_{ij} - A_{ij}B_{ij}$$, so that $$C_{ij} = 1$$ whenever i and j are connected in at least one layer. We focus our attention on clusters of mutually connected nodes16: two nodes in a multiplex network are mutually connected, and thus part of the same cluster of mutually connected nodes, only if they are connected by at least one path, composed of nodes within the same cluster, in every layer of the system. In particular, we focus our attention on the largest among these clusters, usually referred to as the giant mutually connected cluster (GMCC). Our goal is to find the minimal set of nodes such that, if removed from the multiplex, no mutual cluster with a size greater than $$N^{1/2}$$ is found in the network. This is a common prescription, yet not the only one possible, to ensure that all clusters have non-extensive sizes in systems with a finite number of elements35. Whenever we consider single-layer networks, the above prescription applies to the single-layer clusters in exactly the same way.
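The definitions above can be made concrete with a minimal Python sketch. It is not the implementation used for the results; it only illustrates one way to build the aggregated network and to find clusters of mutually connected nodes by repeatedly splitting a candidate set until it is connected in every layer. The function names (`components`, `mutual_clusters`, `aggregate`) and the toy layers are hypothetical.

```python
from collections import deque


def components(nodes, adj):
    """Connected components of the subgraph of `adj` induced by `nodes`."""
    nodes = set(nodes)
    seen, out = set(), []
    for start in nodes:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v in nodes and v not in seen:
                    seen.add(v)
                    comp.add(v)
                    queue.append(v)
        out.append(comp)
    return out


def mutual_clusters(nodes, layers):
    """Maximal node sets that are connected, using only their own nodes, in every layer."""
    stack, result = [set(nodes)], []
    while stack:
        cand = stack.pop()
        for adj in layers:
            parts = components(cand, adj)
            if len(parts) > 1:       # cand splits in this layer: examine each part
                stack.extend(parts)
                break
        else:                        # cand is connected in all layers
            result.append(cand)
    return result


def aggregate(adj_a, adj_b):
    """Union of two layers, i.e. C_ij = A_ij + B_ij - A_ij * B_ij for 0/1 entries."""
    keys = set(adj_a) | set(adj_b)
    return {u: set(adj_a.get(u, ())) | set(adj_b.get(u, ())) for u in keys}


# Toy two-layer multiplex on 4 nodes (illustrative data, not from the paper).
layer_a = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
layer_b = {0: {1, 2}, 1: {0}, 2: {0}, 3: set()}
clusters = mutual_clusters(range(4), [layer_a, layer_b])
gmcc_size = max(len(c) for c in clusters)
dismantled = gmcc_size <= 4 ** 0.5   # threshold N^(1/2) with N = 4
```

In this toy example, node 3 is isolated in the second layer, so it forms a mutual cluster on its own even though it belongs to the single connected component of the first layer and of the aggregated network.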

We generalize to multiplex networks most of the algorithms devised to find approximate solutions to the optimal percolation problem in single-layer networks31,32,33,35,36. Details on the implementation of the various methods are provided in Supplementary Note 1. We stress that the generalization of these methods is far from trivial. For instance, most of the greedy methods use node degrees as crucial ingredients to calculate and assign scores to the nodes, and then remove nodes according to their scores. In a multiplex network, however, a node has multiple degree values, one for every layer. It is therefore not clear what the most effective way of combining these numbers into a single node score is: they may be summed, thus obtaining a number approximately equal to the degree of the node in the aggregated network derived from the multiplex, but also multiplied, or combined in more complicated ways. We find that the results of the various algorithms are not particularly sensitive to this choice, provided that the simple but effective post-processing technique considered in refs. 31,32,33 is applied to the set of SNs found by a given method. In Fig. 1, for example, we show the performance of several greedy algorithms when applied to a multiplex network composed of two layers generated independently according to the Erdős−Rényi (ER) model. Although the mere application of an algorithm may lead to different estimates of the size of the set of SNs, if we greedily remove from these sets the nodes whose presence does not raise the size of the GMCC to the predefined sublinear threshold ($$N^{1/2}$$)31,32,33 (Supplementary Note 2), the sets obtained after this post-processing technique have almost identical sizes (Supplementary Figs. 1–4).
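To illustrate the ambiguity in combining per-layer degrees, the following Python sketch computes two of the combinations mentioned above, the sum and the product. It is only an illustration under simple assumptions: the function name `combined_scores`, its `how` parameter, and the toy layers are hypothetical, and the actual scoring rules used by the algorithms are those described in Supplementary Note 1.

```python
def combined_scores(adj_a, adj_b, how="sum"):
    """Combine the two layer degrees of every node into a single score.

    how="sum"     -> k_A + k_B (close to the aggregated degree when overlap is small)
    how="product" -> k_A * k_B (zero for nodes isolated in either layer)
    """
    if how not in ("sum", "product"):
        raise ValueError("how must be 'sum' or 'product'")
    scores = {}
    for u in set(adj_a) | set(adj_b):
        k_a = len(adj_a.get(u, ()))
        k_b = len(adj_b.get(u, ()))
        scores[u] = k_a + k_b if how == "sum" else k_a * k_b
    return scores


# Toy layers (illustrative data): node 0 is a hub in layer A, node 1 in layer B.
adj_a = {0: {1, 2}, 1: {0}, 2: {0}}
adj_b = {0: {1}, 1: {0, 2}, 2: {1}}
by_sum = combined_scores(adj_a, adj_b, "sum")
by_product = combined_scores(adj_a, adj_b, "product")
# Nodes ranked from highest to lowest score, as a greedy method would visit them.
ranking = sorted(by_sum, key=by_sum.get, reverse=True)
```

A score-based greedy method would then remove nodes in decreasing order of score, typically recomputing the scores after each removal.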

As Fig. 1 clearly shows, the best results, in the sense that the size of the set of SNs is minimal, are found with a simulated annealing (SA) optimization strategy32 (see details in Supplementary Note 1). The fact that the SA method outperforms score-based algorithms is not surprising. SA is one of the best strategies that one can apply in hard-optimization tasks. In our case, it provides us with a reasonable upper bound on the size of the set of SNs that can be identified in a multiplex network. The second advantage of SA in our context is that it does not rely on ambiguous definitions of ingredients (e.g., node degree). Despite its better performance, SA has a serious drawback in terms of computational speed: the algorithm can be applied only to multiplex networks of moderate size. As here we are interested in understanding properties of the optimal percolation problem in multiplex networks, the analysis presented in the main text of the paper is entirely based on results obtained through SA optimization. This provides us with solid ground to support our statements. Extending the analysis of score-based algorithms to larger multiplex networks leads to qualitatively similar results (Supplementary Note 3, Supplementary Figs. 5–8).

### The size of the set of structural nodes

We consider the relative size of the set of SNs, denoted by q, for a multiplex composed of two independently generated ER network layers as a function of their average degree 〈k〉. We compare the results obtained by applying the SA algorithm to the multiplex, namely $$q_{\rm M}$$, with those obtained using SA on the individual layers, i.e., $$q_{\rm A}$$ and $$q_{\rm B}$$, or on the aggregated network generated from the superposition of the two layers, i.e., $$q_{\rm S}$$. By definition, we expect that $$q_{\rm M} \le q_{\rm A} \simeq q_{\rm B} \le q_{\rm S}$$. What we do not know, however, is how good or bad the measures $$q_{\rm A}$$, $$q_{\rm B}$$ and $$q_{\rm S}$$ are as predictors of the effective robustness of the multiplex, $$q_{\rm M}$$. For ordinary random percolation on ER multiplex networks with negligible overlap, we know that $$q_{\rm M} \simeq 1 - 2.4554/\langle k\rangle$$ 16, $$q_{\rm A} \simeq q_{\rm B} \simeq 1 - 1/\langle k\rangle$$, and $$q_{\rm S} \simeq 1 - 1/(2\langle k\rangle )$$ 38. Relative errors are therefore $$\varepsilon _{\rm A} \simeq \varepsilon _{\rm B} \simeq (2.4554 - 1)/(\langle k\rangle - 2.4554)$$, and $$\varepsilon _{\rm S} \simeq (2.4554 - 1/2)/(\langle k\rangle - 2.4554)$$. We find that the relative error for optimal percolation behaves more or less in the same way as that of ordinary percolation (Fig. 2b), noting that, as 〈k〉 is increased, the decrease in the relative error associated with the individual layers is slightly faster than expected for ordinary percolation. The relative error associated with the aggregated network is larger than the one expected from the theory of ordinary percolation. As shown in Fig. 2a, for sufficiently large 〈k〉, dismantling an ER multiplex network is almost as hard as dismantling any of its constituent layers.
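The ordinary-percolation expressions quoted above can be checked numerically. The short Python sketch below assumes that the relative error is defined as $$\varepsilon_{\rm X} = (q_{\rm X} - q_{\rm M})/q_{\rm M}$$, a definition that reproduces the closed forms in the text; the function names are hypothetical, and the formulas are meaningful only for 〈k〉 > 2.4554, where $$q_{\rm M} > 0$$.

```python
C_MULTIPLEX = 2.4554  # constant quoted in the text for two ER layers with negligible overlap


def q_ordinary(k, kind):
    """Ordinary-percolation estimate of q for average degree k.

    kind: "M" (multiplex), "A" or "B" (single layer), "S" (aggregated network).
    """
    if kind == "M":
        return 1 - C_MULTIPLEX / k
    if kind in ("A", "B"):
        return 1 - 1 / k
    if kind == "S":
        return 1 - 1 / (2 * k)
    raise ValueError(f"unknown kind: {kind}")


def relative_error(k, kind):
    """Assumed definition: (q_X - q_M) / q_M, valid for k > C_MULTIPLEX."""
    q_m = q_ordinary(k, "M")
    return (q_ordinary(k, kind) - q_m) / q_m


# Both relative errors decay as 1 / (k - 2.4554) when the average degree grows.
for k in (3.0, 5.0, 10.0):
    print(k, relative_error(k, "A"), relative_error(k, "S"))
```

Because both errors share the denominator 〈k〉 − 2.4554, their ratio is independent of 〈k〉 in this approximation.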

### Edge overlap and degree correlations

Next, we test the role played by edge overlap and layer-to-layer degree correlation in the optimal percolation problem. These are the ingredients that dramatically change the nature of the ordinary percolation transition in multiplex networks26,39,40,41,42,43. In Fig. 3, we report the results of a simple analysis. We take advantage of the model introduced in ref. 44. This is one of the simplest models able to tune a system from a multiplex to a simplex topology. The system is composed of two identical network layers. Nodes in one of the two layers are relabeled with a certain probability α. For α = 0, multiplex, aggregated network and single-layer graphs are all identical. For α = 1, the networks are analogous to those considered in the previous section. We note that this model does not allow one to disentangle the role played by edge overlap among layers from the one played by the correlation of node degrees. For α = 0, edge overlap amounts to 100%, and there is a one-to-one match between the degree of a node in one layer and its degree in the other layer. As α increases, both edge overlap and degree correlation decrease simultaneously. As is apparent from the results of Fig. 3, the system reaches the multiplex regime for very small values of α, in the sense that the relative size of the set of SNs deviates immediately from its value for α = 0. This is in line with what was already found in the context of ordinary percolation processes in multiplex networks: as soon as there is a finite fraction of edges that are not shared by the two layers, the system behaves exactly as a multiplex26,39,40,41,42,43.
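A possible reading of the relabeling procedure is sketched below in Python. The details are assumptions made for illustration only: each node is selected independently with probability α, and the labels of the selected nodes are shuffled among themselves; the precise procedure is the one defined in ref. 44. The edge overlap is measured here as the fraction of shared edges among all distinct edges, which is also an assumed definition, and the function names are hypothetical.

```python
import random


def relabel_layer(edges, n, alpha, rng):
    """Copy of a layer in which a random subset of nodes has its labels permuted.

    Each of the n nodes is selected with probability alpha (assumed rule);
    the labels of the selected nodes are then shuffled among themselves.
    """
    chosen = [u for u in range(n) if rng.random() < alpha]
    shuffled = chosen[:]
    rng.shuffle(shuffled)
    mapping = dict(zip(chosen, shuffled))
    return {frozenset((mapping.get(u, u), mapping.get(v, v))) for u, v in edges}


def edge_overlap(edges_a, edges_b):
    """Shared edges divided by distinct edges across the two layers (assumed definition)."""
    a = {frozenset(e) for e in edges_a}
    b = {frozenset(e) for e in edges_b}
    return len(a & b) / len(a | b)


# Toy first layer: a ring of 8 nodes (illustrative data).
n = 8
ring = [(i, (i + 1) % n) for i in range(n)]
rng = random.Random(0)
for alpha in (0.0, 0.5, 1.0):
    second_layer = relabel_layer(ring, n, alpha, rng)
    print(alpha, edge_overlap(ring, second_layer))
```

Since relabeling is a permutation of node labels, the second layer keeps the same number of edges and the same degree sequence as the first; only the assignment of degrees to node labels, and hence the overlap and the interlayer degree correlation, changes with α.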

### Accuracy and sensitivity

So far, we have focused our attention only on the size of the set of SNs, neglecting any analysis of the identity of the nodes that actually compose this set. To proceed with such an analysis, we note that different runs of the SA algorithm (or of any algorithm with stochastic features) generally produce slightly different sets of SNs, even if they all have almost identical sizes. The issue is not related to the optimization technique, but rather to the existence of degenerate solutions to the problem. In this respect, we work with the quantities $$p_i$$, each of which describes the probability that node i appears in the set of SNs in a realization of the detection method (here, the SA algorithm). This treatment takes into account the fact that a node may belong to the set of SNs in some realizations of the detection method and be absent from it in others.
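In practice, the probabilities $$p_i$$ can be estimated as the fraction of runs in which node i appears in the set of SNs. The Python sketch below illustrates this frequency estimate; the function name `membership_probabilities` and the example runs are hypothetical, and the estimator is an assumption consistent with the definition above rather than a description of the authors' code.

```python
from collections import Counter


def membership_probabilities(runs, nodes):
    """Estimate p_i as the fraction of runs whose SN set contains node i."""
    if not runs:
        raise ValueError("at least one run is required")
    counts = Counter(u for sn_set in runs for u in set(sn_set))
    return {u: counts[u] / len(runs) for u in nodes}


# Four hypothetical runs of a stochastic detection method on a 5-node network.
runs = [{0, 1}, {0, 2}, {0, 1}, {0, 3}]
p = membership_probabilities(runs, range(5))
```

Here node 0 belongs to every detected set, whereas nodes 1, 2 and 3 alternate across runs, which is the kind of degeneracy discussed above.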

We define the self-consistency of an SN-detection method as $$S = \left[ {\mathop {\sum}\nolimits_i p_i^2} \right]/\left[ {\mathop {\sum}\nolimits_i p_i} \right]$$, which describes the ratio of the expected overlap between two SN sets obtained from two independent realizations of the detection method to the expected size of the SN set. If the set of SNs is identical across different runs, then S = 1. The minimal value we can observe is S = Q/N, assuming that the size of the structural set is equal to Q in all runs, but the nodes belonging to this set change all the time, so that for every node i we have $$p_i = Q/N$$. As reported in Fig. 4a, self-consistency S assumes high values for single-layer representations of the network, even for synthetic multiplex networks. On the other hand, S decreases significantly as the overlap and interlayer degree correlations decrease (Fig. 4a). Low S values for multiplexes with small overlap and correlation, together with the small sizes of their sets of SNs (Fig. 2), suggest that in such networks many slightly different SN sets may exist.
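The two limiting values of S can be verified with a few lines of Python. This is a direct transcription of the formula above; the function name `self_consistency` is hypothetical.

```python
def self_consistency(p):
    """S = sum_i p_i^2 / sum_i p_i for a sequence of membership probabilities."""
    total = sum(p)
    if total == 0:
        raise ValueError("at least one node must have non-zero probability")
    return sum(x * x for x in p) / total


# Identical SN set in every run: the same Q = 2 nodes out of N = 10 are always selected.
s_identical = self_consistency([1, 1] + [0] * 8)
# Fully degenerate case: every node is selected with probability Q / N = 0.2.
s_uniform = self_consistency([0.2] * 10)
```

The first case gives S = 1 and the second gives S = Q/N, up to floating-point rounding.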

Next, we turn our attention to quantifying how representative the sets of SNs identified in single-layer or aggregated networks are of the ground-truth sets found on multiplex networks. We denote by $$p_i$$ and $$w_i$$ the probability that node i is found within the set of SNs of, respectively, a multiplex network (ground truth) and a specific single-layer representation of that multiplex. To compare the sets represented by $$w_i$$ with the ground-truth sets, we adopt three standard metrics in information retrieval45,46, namely precision, recall and van Rijsbergen's $$F_1$$ score. Precision is defined as $$P = \left[\sum\nolimits_i p_i w_i\right]/\left[\sum\nolimits_i w_i\right]$$, i.e., the ratio of the expected number of correctly detected SNs to the expected total number of detected SNs. Recall is defined as $$R = \left[\sum\nolimits_i p_i w_i\right]/\left[\sum\nolimits_i p_i\right]$$, i.e., the ratio of the expected number of correctly detected SNs to the expected number of actual SNs of the multiplex. We note that the self-consistency we previously defined corresponds to precision and recall of the ground-truth set with respect to itself, thus providing a baseline for the interpretation of the results. The $$F_1$$ score, defined as $$F_1 = 2/(1/P + 1/R)$$, provides a balanced measure in terms of P and R. As Fig. 4b shows, P deteriorates as the edge overlap and interlayer degree correlation decrease. In particular, when overlap and correlation between the layers of the multiplex network are not large, precision values for the sets of SNs identified in single layers or in the superposition of the layers are quite small ($$P \simeq 0.3$$), even smaller than the ratio of the $$q_{\rm M}$$ of the multiplex to the q of any of these sets (Fig. 3). This means that, when the multiplex nature of the system is neglected, two systematic errors are committed. First, the number of SNs is greatly overestimated; second, a significant number of the true SNs of the multiplex are not identified. The quantity R, on the other hand, behaves differently for single-layer and aggregated networks (Fig. 4c). In single layers, we see that R systematically decreases as the relabeling probability increases. The structural set of nodes obtained on the superposition of the layers instead provides large values of R. This is not due to good performance, but rather to the fact that the set of SNs identified on the aggregated network is very large (Fig. 3), as further supported by the results of Fig. 4c, d, where large R values do not correspond to high $$F_1$$ scores.
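The three metrics can be computed directly from the probabilities $$p_i$$ and $$w_i$$. The Python sketch below transcribes the definitions above; the function name `precision_recall_f1` and the example vectors are hypothetical and chosen only to show how a large detected set can yield high recall without a high $$F_1$$ score.

```python
def precision_recall_f1(p, w):
    """Precision, recall and F1 of detected probabilities w against ground truth p."""
    overlap = sum(pi * wi for pi, wi in zip(p, w))
    precision = overlap / sum(w)
    recall = overlap / sum(p)
    # F1 is the harmonic mean of precision and recall; set to 0 when there is no overlap.
    f1 = 2 / (1 / precision + 1 / recall) if overlap > 0 else 0.0
    return precision, recall, f1


# Hypothetical example: the ground truth has 2 SNs, the single-layer method detects 4,
# including both true ones.
p = [1, 1, 0, 0, 0, 0]
w = [1, 1, 1, 1, 0, 0]
precision, recall, f1 = precision_recall_f1(p, w)
```

In this example all true SNs are recovered, so recall is 1, but half of the detected nodes are false SNs, so precision is 0.5 and the $$F_1$$ score stays at 2/3.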

### Real-world multiplex networks

In Table 1, we present summary statistics of the solution of the optimal percolation problem studied on several real-world multiplex networks generated from empirical data. For most of these networks, we find high values of self-consistency among solutions. This implies that there is a small group of nodes of major importance for the robustness of such real-world networks to the optimal percolation process. For most of the networks, the $$F_1$$ scores are low, indicating that on real-world networks we lose essential information about the optimal percolation problem if the multiplex structure is not taken into account.

To provide a practical case study with an intuitive interpretation, we depict in Fig. 5 the solution of the optimal percolation problem on a multiplex network describing air transportation in the United States. SA always identifies 10 airports in the set of SNs of this network. There is a slight variability among different instances of the SA optimization, with a total of 14 distinct airports appearing in the structural set at least once over 100 SA instances. However, changes in the SN set from run to run mostly concern airports in the same geographical region. Overall, airports in the structural set are scattered homogeneously across the country, suggesting that the GMCC of the network mostly relies on hubs serving specific geographical regions, rather than on global hubs of the entire transportation system. For instance, the probabilities that describe the membership of the airports in the set of SNs do not strictly follow the same order as that of the recorded flight traffic; nor is the number of connections of the airports (not shown) alone sufficient to determine the SNs.

## Discussion

In this paper, we studied the optimal percolation problem on multiplex networks. The problem regards the detection of the minimal set of nodes (i.e., the set of structural nodes, SNs) such that, if its members are removed from the network, the network is dismantled. The solution to the problem provides important information on the microscopic parts that should be maintained in a functional state to keep the overall system functioning, in a scenario of maximal stress. Our study focused mostly on the characterization of the SN sets of a given multiplex network in comparison with those found on the single-layer projections of the same multiplex, i.e., in a scenario where one “forgets” about the multiplex nature of the system. Our results demonstrate that, generally, multiplex networks have considerably smaller sets of SNs compared to the SN sets of their single-layer based network representations. The error committed when relying on single-layer representations of the multiplex does not regard only the size of the SN sets, but also the identity of the SNs. Both issues emerge in the analysis of synthetic network models, where edge overlap and/or interlayer degree–degree correlations seem to fully explain the amount of discrepancy between the SN set of a multiplex and the SN sets of its single-layer based representations. These issues are apparent also in many of the real-world multiplex networks we analyzed. Overall, we conclude that neglecting the multiplex structure of a network system subjected to maximal structural stress may result in significant inaccuracies about its robustness.

### Data availability

Real multiplex networks analyzed in the paper have been constructed using data publicly available on the Web (see references in Table 1). The source code of the implementation of the various algorithms used in the paper is available from the authors upon request.