Interplay between $k$-core and community structure in complex networks

The organisation of a network in a maximal set of nodes having at least $k$ neighbours within the set, known as $k$-core decomposition, has been used for studying various phenomena. It has been shown that nodes in the innermost $k$-shells play a crucial role in contagion processes, emergence of consensus, and resilience of the system. It is known that the $k$-core decomposition of many empirical networks cannot be explained by the degree of each node alone, or equivalently, random graph models that preserve the degree of each node (i.e., configuration model). Here we study the $k$-core decomposition of some empirical networks as well as that of some randomised counterparts, and examine the extent to which the $k$-shell structure of the networks can be accounted for by the community structure. We find that preserving the community structure in the randomisation process is crucial for generating networks whose $k$-core decomposition is close to the empirical one. We also highlight the existence, in some networks, of a concentration of the nodes in the innermost $k$-shells into a small number of communities.


Introduction
Whenever a system can be abstracted as a set of units (nodes) interacting in pairs (edges), we can describe it as a network (also called a graph). Network analysis has proven to be a valuable framework for understanding a plethora of phenomena taking place in many complex systems. Examples include cascades and collective behaviour in socio-technical systems, the emergence of cognitive functions in neural systems, the stability of chemical/biological systems, and the shape of spatially embedded systems, to cite a few [1][2][3].
One of the advantages of the network representation is the possibility to probe the system in a coarse-grained manner, going beyond dyadic interactions by identifying high-order structures of the network 4,5 . Examples include tightly connected groups of nodes, i.e., communities 6 , multiscale coarse-grained structures 7 , core-periphery structure 8,9 , nested assembly of nodes 10 , rich clubs 11,12 , and the k-core 13,14 .
The k-core of a network is the maximal set of nodes that have at least k neighbours within the set 13,14, and the k-core decomposition is the organisation of the network into such nested sets. The algorithm to extract the k-core consists in recursively removing the nodes having fewer than k connections. A k-shell is defined as the set of nodes belonging to the kth core but not to the (k + 1)th core 15. The k-core decomposition has proven to be useful in a variety of domains, such as identifying and ranking the most influential spreaders in networks, identifying keywords used for classifying documents, and assessing the robustness of mutualistic ecosystems and protein networks 16.
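The recursive pruning described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the implementation used in the study; the function name `k_shell_index` and the edge-list input format are assumptions made for the example.

```python
from collections import defaultdict


def k_shell_index(edges):
    """Return {node: k-shell index} for an undirected simple graph.

    `edges` is an iterable of (u, v) pairs (hypothetical input format).
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    degree = {n: len(nb) for n, nb in adj.items()}
    remaining = set(adj)
    shell = {}
    k = 0
    while remaining:
        # Nodes with at most k remaining neighbours cannot be in the (k+1)-core.
        to_remove = [n for n in remaining if degree[n] <= k]
        if not to_remove:
            k += 1  # everything left belongs to the (k+1)-core
            continue
        while to_remove:
            n = to_remove.pop()
            if n not in remaining:
                continue  # already pruned via another neighbour
            remaining.discard(n)
            shell[n] = k
            for m in adj[n]:
                if m in remaining:
                    degree[m] -= 1
                    if degree[m] <= k:
                        to_remove.append(m)
    return shell
```

For a triangle with one pendant node attached, the pendant node falls in the 1-shell and the three triangle nodes in the 2-shell.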
Despite the vast range of applications of the k-core decomposition, to the best of our knowledge, there have been only few attempts to build models to generate networks with a given k-core structure. One indirect attempt to generate networks with a given k-core decomposition is the so-called BRITE model 17 . Originally, this model sought to replicate the features (including the k-core) of the Internet network at the Autonomous System (AS) level by mixing the mechanism of growth with preferential attachment 18,19 and that of adding edges between already existing nodes. Another model aimed at generating networks with a k-core structure akin to an empirical one by leveraging the information stored in the so-called core fingerprint 20 . The core fingerprint corresponds to knowing the number of nodes in each k-shell, the number of intra-shell edges (i.e., those connecting nodes belonging to the same k-shell), and the number of inter-shell edges (i.e., those connecting nodes belonging to different k-shells) of a given network. Moreover, the authors qualitatively compared the Internet AS networks and synthetic networks preserving the core fingerprint of the original networks using several indicators 20 .
A more quantitative comparison of the distribution $P_{\geq}(k_s)$ between the empirical and deg networks may be done by, for example, the Kolmogorov-Smirnov (KS) test 31. However, a majority of the nodes usually belongs to outer k-shells (i.e., sets of nodes with small $k_s$ values), whereas Fig. 1 shows that the strongest discrepancies between the two distributions tend to occur at large $k_s$ values. As a result, the KS test fails to grasp the differences at large $k_s$ values, which we are mostly interested in. Therefore, we compare the k-core decomposition of the empirical and deg networks using four indicators, i.e., the relative difference in the average k-shell index, $\Delta\langle k_s \rangle$, the relative difference in the network's degeneracy, $\Delta D$, and the Jaccard score, $J$, and Kendall's tau, $\tau_K$, of the nodes belonging to the top 10% (i.e., innermost k-shells) of the $P_{\geq}(k_s)$ distribution. The average of each indicator over all the data sets for the networks obtained with the deg shuffling method is equal to $\Delta\langle k_s \rangle = 0.052 \pm 0.056$, $\Delta D = 0.302 \pm 0.288$, $J = 0.563 \pm 0.194$, and $\tau_K = 0.763 \pm 0.176$. The value of $\Delta\langle k_s \rangle$ indicates that $\langle k_s \rangle$ is only ≈ 5% different between the original and deg networks on average. However, their degeneracy differs by ≈ 30% on average. The $J$ and $\tau_K$ values inform us that the innermost k-shells of the original networks and those of the deg networks tend to share approximately half of the nodes, although their ranking seems to be fairly preserved. The values of each indicator are reported in Table S1.

Community-aware reconstruction of the k-core
We have seen that the degree distribution by itself does not reproduce the main features of the k-shell index distribution. An alternative feature that may explain the k-shell index distribution is the community structure. For this reason, we generated synthetic networks that preserve both the degree of each node and the community structure, $C = \{C_1, \ldots, C_{N_c}\}$, where $N_c$ is the number of communities of the original network. We identified the communities of each network using two methods: the Louvain method 32, denoted by Lvn, and the degree-corrected stochastic block model 33, denoted by SBM. In combination with each of the two community detection methods, we considered two rewiring methods preserving $C$ and the degree of each node, denoted by commA and commB. Method commA preserves the exact number of inter- and intra-community edges at the level of single communities. Method commB preserves the number of inter- and intra-community edges for each node. Figure 1 indicates that preserving the community structure in addition to the degree of each node improves the similarity in $P_{\geq}(k_s)$ between the empirical and synthetic networks, especially at large $k_s$ values, which correspond to inner k-shells. In particular, commA and commB generate networks whose $D$ value tends to be closer to the empirical value than deg does. Furthermore, $P_{\geq}(k_s)$ for commA and commB tends to have plateaus and abrupt drops at $k_s \leq D$ similarly to the empirical networks. Overall, synthetic networks preserving the SBM community structure have a k-core decomposition more akin to the empirical one than those preserving the Lvn community structure. This observation is quantitatively supported by the values of the four indices reported in Table S1.
To obtain an overview of the performances of different network randomisation methods, in Fig. 2 we show the fraction of data sets, $f_X$, for which a certain shuffling method generates a k-core decomposition that is the most similar to that of the empirical network according to each indicator. The figure indicates that commB-SBM (i.e., the commB shuffling method that preserves the community structure determined by SBM) performs the best in mimicking the k-shell index features for approximately 65%-80% of the data sets, depending on the indicator. Detailed results for the performance of each method for each empirical network are shown in Fig. S2 and Table S1.
Imposing the simultaneous conservation of each node's degree and community structure may result in synthetic networks that are not substantially different from the original ones. To exclude this possibility, we computed the Jaccard score, $J(L, L')$ (see Eq. (3)), for the sets of edges, $L$ and $L'$, of the original and shuffled networks, respectively. The values of $J$ fall approximately between 0.01 and 0.5, confirming that the sets of edges, and hence the networks, are considerably different.
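As an illustration of this check, the Jaccard score between two undirected edge sets can be computed as follows. This is a minimal sketch; the function name `edge_jaccard` and the convention of returning 1 for two empty sets are assumptions made for the example.

```python
def edge_jaccard(edges_a, edges_b):
    """Jaccard score |A ∩ B| / |A ∪ B| between two undirected edge sets."""
    # frozenset makes (u, v) and (v, u) count as the same undirected edge.
    set_a = {frozenset(e) for e in edges_a}
    set_b = {frozenset(e) for e in edges_b}
    union = set_a | set_b
    if not union:
        return 1.0  # assumed convention: two empty edge sets are identical
    return len(set_a & set_b) / len(union)
```

A score close to 0 means that the shuffled network shares few edges with the original one, whereas a score of 1 means that the shuffling left the edge set unchanged.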
The results presented so far suggest that preserving the community structure improves the preservation of the k-core decomposition of the original network. Therefore, the mere presence of a community structure may be enough to preserve the main features of the k-core decomposition of the original networks. To test this possibility, we applied the k-core decomposition to networks with communities generated using the LFR model 34 (see Sec. 2 of SM). The plots of $P_{\geq}(k_s)$ shown in Figs. S3-S6 indicate that the presence of a community structure is not sufficient for producing the main features of the k-core structure in the empirical networks. Specifically, $P_{\geq}(k_s)$ of networks generated by the LFR model does not show plateaus or abrupt drops as $k_s$ increases, and their range of $k_s$ values is narrow, i.e., $\max(k_s) - \min(k_s) \approx 10$. It should also be noted that for the LFR model, as for the empirical networks, commB-SBM is the method that generates the networks most similar to the original LFR networks among the different shuffling methods in terms of $P_{\geq}(k_s)$.
Overlap between communities and k-core
Preserving the community structure in addition to the node's degree can lead to preservation of features of the k-core structure, possibly because nodes with high values of $k_s$ form a k-core that tends to belong to a single community. To examine this possibility, we show the number of communities to which the nodes of a given k-shell belong, $n_C(k_s)$, in Fig. 3 (see Fig. S7 for the other data sets). Although each data set shows a distinct pattern, for many data sets, inner k-shells (i.e., nodes with large $k_s$ values) are concentrated into one or a few communities. The concentration effect is particularly noticeable for some data sets, e.g., Facebook 1 and Twitter. To check whether the number of communities per k-shell is merely a byproduct of the random combinatorial effect owing to the number of communities, the distribution of the community size, and the distribution of $k_s$, we computed random assignments of the nodes to communities and then calculated $n_C(k_s)$ for each $k_s$ value (see Sec. 3 and Fig. S8 of the SM). We have found that the nodes in each k-shell are almost always concentrated into a smaller number of communities than is expected from the random assignment of the nodes to communities for all the data sets and community detection methods, with the only exception of SBM for the Cookpad data sets. The concentration implies a stark overlap between the k-shells and the community structure, suggesting that nodes belonging to those k-shells might share some common functions. In particular, we observe a strong concentration of the k-shells into a few communities for the Facebook 1, Twitter, Cond. Matter, Comp. Science, and Words networks, which are those showing a more pronounced difference in the values of $D$ between the original and deg networks.
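The quantity discussed above, i.e., the number of communities per k-shell, and a random baseline for it can be sketched as follows. This is only an illustration: the function names are hypothetical, and the baseline assumes that the random assignment is obtained by permuting the community labels among the nodes (which keeps the number and sizes of the communities and the $k_s$ values fixed). The actual procedure is described in Sec. 3 of the SM and may differ.

```python
import random
from collections import defaultdict


def communities_per_shell(k_shell, community):
    """Return {k_s: number of distinct communities hosting nodes of that shell}."""
    groups = defaultdict(set)
    for node, ks in k_shell.items():
        groups[ks].add(community[node])
    return {ks: len(labels) for ks, labels in groups.items()}


def random_baseline(k_shell, community, n_runs=100, seed=None):
    """Average communities per shell after permuting community labels.

    Assumption: labels are shuffled uniformly at random among the nodes.
    """
    rng = random.Random(seed)
    nodes = list(community)
    labels = [community[n] for n in nodes]
    totals = defaultdict(float)
    for _ in range(n_runs):
        rng.shuffle(labels)
        shuffled = dict(zip(nodes, labels))
        for ks, count in communities_per_shell(k_shell, shuffled).items():
            totals[ks] += count
    return {ks: total / n_runs for ks, total in totals.items()}
```

A k-shell is more concentrated than expected by chance when its observed value is smaller than the corresponding baseline value.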

Discussion
The information encoded in the degree of each node is not sufficient for generating networks with a k-core structure that is similar to those of empirical networks 29 . This gap of knowledge calls for the design of generative models of networks beyond the configuration model. Such models are expected to be useful to generate benchmark networks and to understand the mechanisms behind the emergence of the k-core. To the best of our knowledge, the few models available to generate networks with a given k-core decomposition are based on heuristics 20 .
In the present study, we investigated how much the combination of the nodes' degrees and community structure accounts for the k-core structure of empirical networks. Given a network G, we randomly shuffled G's edges to generate its synthetic counterparts preserving each node's degree and/or community structure of G. We found that randomised networks preserving the community structure obtained through a stochastic block model showed a k-shell index distribution that was reasonably similar to the distribution for the original networks. We also sought to better understand the relationship between k-core and communities by studying networks generated by the LFR model, which enables us to control the extent to which the communities are distinguished from each other. However, regardless of whether or not different communities are relatively distinguished from each other in a network, the k-shell index distribution of LFR networks does not show the same features as those observed in the empirical networks. Finally, we have investigated the overlap between communities and k-shells and found that, in some empirical networks, the nodes in inner k-shells are concentrated into a small number of communities, much more so than in a randomised counterpart. As far as our numerical results indicate, the concentration is observed if and only if the empirical network and its deg counterpart are substantially different in terms of their k-core decomposition. This result suggests that inner k-shells may perform specific functions in such networks, corresponding to the functions of the communities they belong to.
The "community aware" rewiring mechanisms introduced in this paper can be used for assessing whether or not a given property of a network is a direct expression of its community structure. One example of such an approach is given in 35 , where the authors have improved the robustness against attacks on a network while keeping its community structure. In that case, the method only preserves the communities and alters the connectivity pattern by increasing the density of intra-community edges as well as changing the edges between communities. It may be interesting, instead, to check whether the robustness of the network can be improved even when one also preserves the degree of the nodes using our community-aware rewiring mechanisms.
One viable extension of our work is to the k-peak graph decomposition method 36. In Ref. 36, the authors argue that for networks with communities, the k-core decomposition should be performed locally rather than globally, thus returning the k-peak decomposition of each of the system's regions. The rationale behind this approach is that, if the network contains regions with different densities of edges, the standard k-core decomposition would fail to recognise local core nodes in sparser regions. Studying the evolution of the k-peak decomposition in response to the rewiring of the connections may unveil salient features of complex systems.
Summing up, in this work we have analysed the interplay between the k-core decomposition and community structure of networks. Understanding such a relationship is useful not only owing to the broad range of applications of k-core decomposition, but also to inform the design of models capable of generating networks with both a community structure and k-core's features beyond those explainable by the degree distribution. Such models may stand on, for instance, the stochastic block model 33 , the enhanced configuration model based on maximum entropy 37 , or the hierarchical extension of the LFR model 38 . Alternatively, models based on microscopic growth mechanisms such as triadic closure 39,40 or modified preferential attachment 41 may deserve further investigation.

Data
We have considered networks corresponding to systems of different types: from social to technological, from semantic to transportation. Table 1 summarises main properties of such networks. Except for Cookpad networks, all the data sets are publicly available and have been retrieved from the Stanford Large Network data set Collection 42 (Facebook 1, Twitter, Emails, and Cond. Matter), the Network Repository 43-45 (Facebook 2, 3, 4, and 5), the Koblenz Network Collection (KONECT) 46 (Comp. Science, and Words), Mark E. J. Newman's personal network data repository 47 (Web-blogs), and the OpenFlights data repository 48 (Global airline). In the following text, we provide a brief description of each data set.
Facebook & Twitter. These networks describe social relationships. Nodes are people. Edges represent their friendship relations.
Web-blogs. This network is composed of the hyperlinks (edges) between weblogs on US politics (nodes) recorded in 2005.

Emails. This is a network of email data from a large European research institution. Nodes are people. Edges connect pairs of individuals who have exchanged at least one e-mail.
Cond. Matter & Comp. Science. The former network is the co-authorship network of the authors of preprint manuscripts submitted to the Condensed Matter Physics arXiv e-print archive from January 1993 to April 2003. The latter network is similarly defined using manuscripts appearing in the DBLP computer science bibliography, using a comprehensive list of research papers in computer science. The submission time of the papers of the DBLP collection is unavailable. A node is an author. An edge represents the existence of at least one manuscript co-authored by two authors.
Global airline. In this network nodes are airports across the globe. An edge indicates direct commercial flights between two airports.
Words. This network accounts for the lexical relationships among words extracted from the WordNet data set. Nodes are English words. Edges are relationships (synonymy, antonymy, meronymy, etc.) between pairs of words.
Cookpad. These networks are extracted from the Cookpad online recipe sharing platform 49. Users can post and browse recipes, as well as interact with other users through recipes in multiple ways including liking, sharing, and posting a comment. The platform is present in many countries (e.g., Japan, Indonesia, United Kingdom, and Italy). Here, we consider the data collected from September to November of 2018 in Greece, Spain, and the United Kingdom, separately for each country. In the three networks, nodes are users. An edge between a pair of users exists if one or more of the following types of events takes place: liking or following a user, or viewing, bookmarking, commenting on, or making a cooksnap of another user's recipe.
All the networks considered in this work are treated as undirected and unweighted, even when the original data contains more information. Finally, we also consider synthetic networks, generated using the LFR (Lancichinetti-Fortunato-Radicchi) model 34 (see Sec. 2 of SM for details).

Network shuffling
Given a network, $G$, with $N$ nodes and $L$ edges, we generate a randomised counterpart, $G'$, that has the same nodes and the same number of edges by shuffling the edges of $G$. We consider three shuffling methods denoted by deg, commA, and commB; each shuffling method preserves different properties of $G$. The shuffling consists in selecting uniformly at random two edges (a, b) and (c, d), and replacing them with, e.g., (a, c) and (b, d), if the swapping of the edges is accepted. An attempt to swap edges is accepted, in which case we call the swapping effective, if and only if it respects the rule of the specific shuffling method and the swapping does not generate self-loops or multiple edges. We continued the shuffling until we carried out 2L effective swaps, such that an edge was swapped four times on average.
In the following text, we provide the details of each shuffling method. Assume that network $G$ partitions into communities such that the set of the communities is $C = \{C_1, \ldots, C_{N_c}\}$, where $N_c$ is the number of communities. Furthermore, let $g(i) \in C$, $i = 1, \ldots, N$, be the community to which the ith node belongs and $k_i$ be the degree of node i. The three methods are as follows.
Degree-preserving shuffling (deg). This method preserves degree $k_i$ of each node i and is equivalent to the configuration model 30.
Community-preserving shuffling of type A (commA). On top of the degree of each node, this method preserves the total number of edges within each community and between each pair of communities. In attempts to swap edges, we replace two randomly selected edges (a, b) and (c, d) by (a, c) and (b, d) if and only if an end node of edge (a, b) and an end node of edge (c, d) belong to the same community (i.e., if g(b) = g(c) or g(a) = g(d)).
Community-preserving shuffling of type B (commB). Like commA, this method preserves the degree of each node and the number of edges within each community and between each pair of communities. In contrast with commA, the commB method preserves the numbers of edges within and across communities for each node, and not only for each community or pairs of communities. Given two selected edges (a, b) and (c, d), we replace them with (a, c) and (b, d) if and only if the two new edges connect the same community pairs as before the swapping (i.e., g(b) = g(c) and g(a) = g(d)).
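The three acceptance rules can be sketched in a single routine. This is a minimal illustration under stated assumptions, not the authors' code: the function name `shuffle_edges`, the random orientation of the second edge (so that both possible rewirings can occur), and the `max_attempts` guard against non-converging runs are choices made for the example.

```python
import random


def shuffle_edges(edges, community=None, method="deg", n_swaps=None,
                  seed=None, max_attempts=1_000_000):
    """Rewire an undirected simple graph by accepted double-edge swaps.

    method: "deg", "commA", or "commB"; `community` maps node -> label
    and is required for the two community-preserving methods.
    """
    if method not in ("deg", "commA", "commB"):
        raise ValueError(f"unknown method: {method}")
    rng = random.Random(seed)
    edge_list = [tuple(e) for e in edges]
    edge_set = {frozenset(e) for e in edge_list}
    if n_swaps is None:
        n_swaps = 2 * len(edge_list)  # 2L effective swaps, as in the text
    g = community
    done = attempts = 0
    while done < n_swaps and attempts < max_attempts:
        attempts += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:  # assumed: pick one of the two rewirings
            c, d = d, c
        if a == c or b == d:
            continue  # would create a self-loop
        if frozenset((a, c)) in edge_set or frozenset((b, d)) in edge_set:
            continue  # would create a multiple edge
        if method == "commA" and not (g[b] == g[c] or g[a] == g[d]):
            continue
        if method == "commB" and not (g[b] == g[c] and g[a] == g[d]):
            continue
        edge_set -= {frozenset((a, b)), frozenset((c, d))}
        edge_set |= {frozenset((a, c)), frozenset((b, d))}
        edge_list[i], edge_list[j] = (a, c), (b, d)
        done += 1
    return edge_list
```

By construction, every accepted swap keeps the degree of each node; the commA rule additionally keeps the number of edges within each community and between each pair of communities, and the commB rule keeps, for each node, the number of its neighbours in each community.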

Comparison of the k-core decomposition
To assess the similarity between the k-core decomposition of the original network, $G$, and of its shuffled counterpart, $G'$, we used four indicators: the average k-shell index, $\langle k_s \rangle$, the network's degeneracy, $D$, the Jaccard score, $J$, and the generalised Kendall's tau, $\tau_K$. The indicator $\langle k_s \rangle$ explicitly depends on all the nodes in the network, whereas $D$, $J$, and $\tau_K$ only depend on the nodes belonging to the innermost k-shell(s). We use the latter three indicators because, although a majority of nodes tends to belong to outer k-shells, it is a difference in the tails of the $k_s$ distributions that often affects functions of networks such as the impact of influencers in contagion processes 50. The four indicators are defined as follows.
The average of the k-shell index, $\langle k_s \rangle$, is equal to
$$\langle k_s \rangle = \frac{1}{N} \sum_{i=1}^{N} k_s(i),$$
where $k_s(i)$ is the k-shell index of node i. The degeneracy, $D$, of a network $G$ is given by 51
$$D = \max_{i} k_s(i).$$
Rather than using these raw indicators, to compare across the different data sets, we compute their relative difference between the empirical network and its shuffled counterpart given by
$$\Delta X = \frac{|X(G) - X(G')|}{X(G)}, \qquad X \in \{\langle k_s \rangle, D\}.$$
To compute $J$ and $\tau_K$, we need to define a criterion to select nodes belonging to the innermost k-shells. We decided to confine the comparison to the nodes whose $k_s$ falls within the top 10% among the $N$ nodes. The horizontal lines in Fig. 1 indicate the threshold values of $k_s$, $k_s^{*}$, such that $P_{\geq}(k_s^{*}) = 0.1$. We used the set of nodes with $k_s \geq k_s^{*}$ in $G$ and the set of nodes with $k_s \geq k_s^{*}$ in $G'$ to calculate $J$ and $\tau_K$. Note that the $k_s^{*}$ value is different between $G$ and $G'$ in general. Furthermore, the value of $k_s^{*}$ varies from one combination of a run of shuffling and community detection to another. Moreover, as in the case of the Facebook 2 data set, $k_s^{*}$ sometimes does not even exist. In such a case, we set $k_s^{*} = D$ and select all the nodes belonging to the innermost k-shell although they constitute more than 10% of the nodes in the network.
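The indicators and the selection of the innermost nodes can be sketched as follows, given a dictionary mapping each node to its k-shell index. The function names are hypothetical, and the threshold rule is one possible reading of the text: the threshold is taken as the smallest occupied $k_s$ value whose survival probability does not exceed 0.1, falling back to the degeneracy when no such value exists.

```python
def survival(k_shell):
    """P_>=(k): fraction of nodes whose k-shell index is at least k."""
    n = len(k_shell)
    values = sorted(set(k_shell.values()))
    return {k: sum(1 for v in k_shell.values() if v >= k) / n for k in values}


def average_k_shell(k_shell):
    return sum(k_shell.values()) / len(k_shell)


def degeneracy(k_shell):
    return max(k_shell.values())


def relative_difference(x_original, x_shuffled):
    """|X(G) - X(G')| / X(G) for X equal to the average k_s or D."""
    return abs(x_original - x_shuffled) / x_original


def innermost_nodes(k_shell, fraction=0.1):
    """Return (threshold, node set) for the innermost k-shells.

    Assumed rule: smallest occupied k with P_>=(k) <= fraction; if none
    exists, use the degeneracy and keep the whole innermost shell.
    """
    surv = survival(k_shell)
    candidates = [k for k, p in surv.items() if p <= fraction]
    threshold = min(candidates) if candidates else degeneracy(k_shell)
    return threshold, {n for n, v in k_shell.items() if v >= threshold}
```

The two node sets obtained for the original and shuffled networks are the inputs to the Jaccard score and to the generalised Kendall's tau.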
Given two sets $A$ and $B$, the Jaccard score quantifies their overlap and is given by
$$J(A, B) = \frac{|A \cap B|}{|A \cup B|}. \tag{3}$$
The Jaccard score ranges between 0 and 1. A value of 1 indicates the complete overlap between the two sets (i.e., the sets are the same), whereas a value of 0 indicates that the sets are completely different. The generalised Kendall's tau, $\tau_K$, measures the consistency between two rankings by assigning penalties to pairs of elements on which the two rankings disagree 52,53. Given two sets $A$ and $B$ having $m_A$ and $m_B$ elements, respectively, consider their associated ranking functions $X$ and $Y$. We denote with $(z_1, z_2)$ an arbitrary pair of elements of $A \cup B$. We assign a penalty $K_{z_1,z_2}(X, Y) = 1$ to $(z_1, z_2)$ if (a) the two rankings order the two elements differently (i.e., $X(z_1) \gtrless X(z_2)$ and $Y(z_1) \lessgtr Y(z_2)$), (b) the element with the higher rank in one set is missing in the other set, i.e., $X(z_1) > X(z_2)$ and $z_1 \notin B$ (or $X(z_2) > X(z_1)$ and $z_2 \notin B$), or (c) each element belongs to only one of the two sets, and not to the same one, i.e., $z_1 \notin B$ and $z_2 \notin A$ (or vice versa). In all the other cases $K_{z_1,z_2}(X, Y) = 0$, such that we do not penalise the $(z_1, z_2)$ pair. Finally, we sum the penalties over all the possible pairs of elements and normalise the sum, thus obtaining the generalised Kendall's tau, $\tau_K$. Index $\tau_K$ ranges between 0 and 1. If $\tau_K = 1$, the two rankings are completely coherent. If $\tau_K = 0$, the two sets $A$ and $B$ have no pair of elements on which rankings $X$ and $Y$ are coherent. The above formulation of Kendall's tau is the so-called optimistic approach 52. This means that we do not penalise the case in which a pair of elements is present in one set and not in the other set.

Community detection methods
We considered two methods for community detection. The first is the Louvain method (Lvn) 32, which is a heuristic greedy multiscale method that approximately maximises the modularity function. Given a network with $N$ nodes distributed among $N_c$ communities, the modularity, $Q$, reads
$$Q = \frac{1}{2L} \sum_{i,j=1}^{N} \left[ a_{i,j} - \frac{k_i k_j}{2L} \right] \delta_{g(i), g(j)},$$
where $a_{i,j}$ is the element of the network's adjacency matrix $A$; $L$ is the number of edges; $k_i$ is the degree of node i; $g(i)$ is the community to which the i-th node belongs ($1 \leq g(i) \leq N_c$), and $\delta_{g(i), g(j)}$ is the Kronecker delta. A large value of $Q$ implies a good partitioning. The Louvain method seeks the partitioning that maximises the modularity. Note that we obtain $Q \approx 0$ for random assignment of nodes to communities and that we obtain $Q \approx 1$ when the network is made of perfectly disjoint communities.
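As a small worked illustration, the standard (Newman-Girvan) modularity of a given partition can be computed from an edge list by summing, over communities, the fraction of internal edges minus the squared fraction of edge ends attached to the community. This is algebraically equivalent to the double sum over node pairs. The function name `modularity` and the input format are assumptions made for the example; the Louvain optimisation itself is not reproduced here.

```python
from collections import defaultdict


def modularity(edges, community):
    """Modularity Q of a partition of an undirected, unweighted simple graph."""
    n_edges = len(edges)
    internal = defaultdict(int)    # number of edges inside each community
    degree_sum = defaultdict(int)  # total degree of the nodes of each community
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(
        internal[c] / n_edges - (degree_sum[c] / (2 * n_edges)) ** 2
        for c in degree_sum
    )
```

For two disjoint triangles, each forming one community, this gives $Q = 2 \times (3/6 - (6/12)^2) = 0.5$, whereas placing all nodes in a single community gives $Q = 0$.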

The other community detection method that we used is the stochastic block model 54. It uses the probabilities $P = \{p_{C_i, C_j}\}$ with which there exists an edge (a, b) connecting an arbitrarily selected node a in community $C_i$ (i.e., $g(a) = C_i$) and an arbitrarily selected node b in community $C_j$ (i.e., $g(b) = C_j$). Different instances of probabilities $P$ allow the description of different mixing patterns. When the diagonal entries of $P$ predominate, we obtain the most usual community structure, whereas other instances yield other structures such as bipartite or core-periphery structure.
To find the optimal partition, one maximises the likelihood function with respect to $\{p_{C_i, C_j}\}$ corresponding to the partitioning $C = \{C_i\}$, where $i, j \in \{1, \ldots, N_c\}$. The unnormalised log-likelihood, $\mathcal{L}$, with which a partition of network $G$ into $N_c$ communities, $C$, is reproduced reads
$$\mathcal{L}(G \mid C) = \sum_{i,j=1}^{N_c} e_{ij} \ln \frac{e_{ij}}{m_i m_j}, \tag{6}$$
where $e_{ij}$ is the number of edges connecting community $C_i$ and community $C_j$, and $m_i$ is the number of nodes belonging to $C_i$. The above formulation, however, has one major limitation: it assumes that the degrees of the nodes are distributed according to a Poisson-like function. To account for the degrees' heterogeneity, Karrer and Newman have implemented the so-called degree-corrected stochastic block model, in which the expected degree of each node is kept constant via the introduction of additional parameters 33. Let $e_i$ be the sum of the degrees of all nodes in community $C_i$. Then, the unnormalised log-likelihood for the degree-corrected stochastic block model reads
$$\mathcal{L}(G \mid C) = \sum_{i,j=1}^{N_c} e_{ij} \ln \frac{e_{ij}}{e_i e_j}. \tag{7}$$
Equations (6) and (7) depend on the number of communities $N_c$. Because the value of $N_c$ is not known a priori, it is inferred through the minimisation of a quantity called the description length. The minimum description length principle describes how much a model compresses the data and allows us to find the optimal number of communities avoiding overfitting. We use the degree-corrected stochastic block model, which we refer to as SBM for brevity, in the present work.
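To make the degree-corrected objective concrete, the following sketch evaluates the unnormalised log-likelihood of a given partition in the form proposed by Karrer and Newman, i.e., a sum over community pairs of $e_{ij} \ln [e_{ij} / (e_i e_j)]$. It assumes their counting convention, in which the diagonal term $e_{ii}$ equals twice the number of edges inside community $C_i$. The function name is hypothetical, and the sketch only scores a fixed partition; it does not perform the inference or the description-length minimisation.

```python
import math
from collections import defaultdict


def dcsbm_log_likelihood(edges, community):
    """Unnormalised degree-corrected SBM log-likelihood of a partition."""
    e = defaultdict(int)      # e[(r, s)]: edge ends running between r and s
    kappa = defaultdict(int)  # kappa[r]: sum of the degrees in community r
    for u, v in edges:
        r, s = community[u], community[v]
        e[(r, s)] += 1
        e[(s, r)] += 1  # for r == s this counts each internal edge twice
        kappa[r] += 1
        kappa[s] += 1
    # Pairs with no edges contribute zero and are simply absent from `e`.
    return sum(
        e_rs * math.log(e_rs / (kappa[r] * kappa[s]))
        for (r, s), e_rs in e.items()
    )
```

For two disjoint triangles, the partition that puts each triangle in its own community scores higher than a partition that mixes the nodes of the two triangles, as expected.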

Additional information
Competing interests The authors declare no competing interests.
Data availability The data sets on Cookpad TM analysed in the current study are not publicly available due to exclusive ownership of Cookpad Limited. All the other data sets are available from the corresponding repositories listed in the bibliography.

Figure caption fragment: [...] Fig. 1 for the details). For each data set, we show the results corresponding to the community structure obtained using either Lvn or SBM.

Supplementary Materials for the manuscript entitled: Interplay between k-core and community structure in complex networks Irene Malvestio, Alessio Cardillo, & Naoki Masuda

Comparison between the original and shuffled networks
In this section, we provide a detailed characterisation of the k-core decomposition of the shuffled networks for all the empirical networks. Figure S1 shows the survival probability of the k-shell index, $P_{\geq}(k_s)$, of the original networks and their shuffled counterparts (deg, commA, and commB). The figure indicates that commA and commB produce k-shell distributions that are more similar to the original ones, compared to deg, in particular when commA or commB is combined with SBM. This result also holds true for the Greece and Spain networks of Cookpad where, contrary to the other data sets, the deg networks have a degeneracy, $D$, higher than the original networks.
In Table S1, we report the values of the four indicators used for comparing the k-core decomposition between the original and shuffled networks. In particular, we report the average value and standard deviation of the relative difference $\Delta X = |X(G) - X(G')| / X(G)$, where $X$ is either the average k-shell index, $\langle k_s \rangle$, or $D$. In the same table, we also report the values of the Jaccard score, $J$, and Kendall's tau, $\tau_K$, calculated for the set of nodes belonging to the innermost k-shells (see the main text for the details of the methods). We notice that, in general, commB-SBM yields the smallest values of $\Delta\langle k_s \rangle$ and $\Delta D$ and the largest values of $J$ and $\tau_K$, confirming its good performance in reconstructing the k-core decomposition of the original network. Figure S2 provides an overview of the performances of each shuffling method. Figure 2 in the main text is a projection of the information contained in Fig. S2.
Figure S1: Survival probability of the k-shell index, $P_{\geq}(k_s)$, as a function of $k_s$ for the empirical network (dotted lines) and shuffled networks (solid lines). Each panel corresponds to a data set. The horizontal dashed lines represent $P_{\geq}(k_s) = 0.1$. The results shown are averages over 10 different runs of each shuffling method, and the shaded areas (when visible) represent the standard deviations.
Table S1: Average and standard deviation of the four indicators characterising the k-core decomposition. In the cells with missing values, the shuffling method did not converge.
Figure S2: [...] Table S1. For each pair of an empirical network and indicator, we show the value of the indicator for each shuffling method. The error bars represent the standard deviation.

The LFR model
The Lancichinetti-Fortunato-Radicchi (LFR) model generates networks where both the node's degree and the size of the communities (i.e., the number of nodes belonging to a community) follow power-law distributions [1]. Such features are found in many empirical networks [2] and have led to the success of the LFR model as a generator of benchmark networks to test community detection algorithms [3]. A main finding presented in the main text is that preserving the community structure of the original network in addition to the degree of each node improves the ability of the shuffling methods to mimic the k-core decomposition of the original networks. Here, to test whether or not the community structure and the degree of each node, but not a possible intricate association between the two, are sufficient for mimicking the features of k-core decomposition observed for many empirical networks, we generated networks using the LFR model and analysed their k-cores and those of the shuffled counterparts. The LFR algorithm depends on the following parameters: the exponent, $t_1 \in [2, 3]$, of the degree distribution $P(k) \propto k^{-t_1}$; the exponent, $t_2 \in [1, 2]$, of the community size distribution $P(S_c) \propto S_c^{-t_2}$; the mixing parameter, $\mu \in [0, 1]$, specifying the fraction of inter-community edges for a node; and one of the following: the average degree, $\langle k \rangle$, the minimum degree, $k_{\min}$, or the minimum number of communities, $\min N_c$. A value of $\mu = 0$ indicates that a node is connected exclusively with nodes belonging to its own community. A value of $\mu = 1$ indicates that a node is connected only with nodes belonging to communities different from its own. This stochastic algorithm may not produce a network fulfilling all the requirements in some realisations. Therefore, we have to set the parameter values to ensure the algorithm's convergence.
To cover a broad spectrum of networks, we consider four batches of parameter sets, which are summarised in Table S2 together with the properties of the generated networks. Each batch of parameter sets consists of a value of $t_1$, a value of $t_2$, and seven values of $\mu$ ranging from 0.1 to 0.8. We set $N = 10000$ nodes and used the implementation of the LFR algorithm in the NetworkX Python package [4].
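As an illustration of how such networks can be generated, the sketch below calls `LFR_benchmark_graph` from NetworkX. It is not the authors' script: the small parameter values are those of the example in the NetworkX documentation rather than those of Table S2, and the helpers `generate_lfr` and `observed_mixing`, as well as the retry-over-seeds strategy, are assumptions made for illustration. Recent NetworkX versions may additionally require SciPy when `average_degree` is given.

```python
import networkx as nx


def generate_lfr(n, t1, t2, mu, first_seed=10, max_tries=5, **kwargs):
    """Try a few seeds until the stochastic LFR generator converges."""
    for seed in range(first_seed, first_seed + max_tries):
        try:
            return nx.LFR_benchmark_graph(n, t1, t2, mu, seed=seed, **kwargs)
        except nx.ExceededMaxIterations:
            continue  # this realisation did not fulfil the requirements
    raise RuntimeError("LFR generation did not converge for any seed tried")


def observed_mixing(G):
    """Average fraction of a node's neighbours lying outside its own community."""
    fractions = []
    for v in G:
        neighbours = [u for u in G[v] if u != v]  # ignore possible self-loops
        if neighbours:
            own = G.nodes[v]["community"]
            outside = sum(u not in own for u in neighbours)
            fractions.append(outside / len(neighbours))
    return sum(fractions) / len(fractions)


# Small illustrative network (the analysis in the text uses N = 10000).
G = generate_lfr(250, 3.0, 1.5, 0.1, average_degree=5, min_community=20)

# Each node stores the set of nodes forming its community.
communities = {frozenset(G.nodes[v]["community"]) for v in G}
mu_observed = observed_mixing(G)
```

The value of `mu_observed` gives a quick check that the generated network has roughly the requested fraction of inter-community edges per node.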
For each generated network, we extracted its k-core decomposition and calculated the four indicators. We did the same for the shuffled counterparts generated using the deg, commA, and commB methods. In analogy to Fig. S1, in Figs. S3-S6 we show the survival probability distribution of the k-shell index, $P_{\geq}(k_s)$, for the original LFR networks and the shuffled counterparts, with one figure for each $(t_1, t_2)$ pair. A visual inspection of Figs. S3-S6 reveals three trends.
First, Figs. S3 and S4 indicate that, in networks generated using the smaller $t_1$ values (i.e. parameter batches 1 and 2 in Table S2), the shuffled networks generated by deg, commA-Lvn, and commB-Lvn attain a k-core decomposition with a degeneracy, $D$, considerably higher than the original one. In contrast, Figs. S5 and S6 indicate that, with the larger $t_1$ values (i.e. parameter batches 3 and 4), we recover the same trend as that shown in Fig. 1. In other words, $D$ for the original networks is larger than that for the shuffled networks. The difference between the original $D$ and its shuffled counterpart seems to be influenced by the value of $t_1$, but not by $t_2$ or $\mu$.
Second, $P_{\geq}(k_s)$ for the original LFR networks mainly decreases smoothly as $k_s$ increases, without plateaus or abrupt drops. Therefore, the k-core decomposition of LFR networks does not return any k-shell that is empty or much more populated than its adjacent k-shells. This result is in stark contrast to that for various empirical networks, e.g. the Facebook 1 data set (see Fig. S1).
Third, regardless of the values of $t_1$, $t_2$, and $\mu$, the commB-SBM shuffling method produces networks whose $P_{\geq}(k_s)$ is closer to the original one than that produced by the other shuffling methods. This result is consistent with that for the empirical networks presented in the main text.
In a nutshell, the analysis of the k-core decomposition of networks generated by the LFR model reveals that the presence of communities is not enough to explain the main properties of the k-shell structure observed in the empirical networks.

Table S2: Summary of the properties of the networks generated with the LFR model. For each combination of parameters $t_1$, $t_2$, and $\mu$ we report the number of edges, $L$, minimum degree, $k_{\min}$, average degree, $\langle k \rangle$, maximum degree, $k_{\max}$, degeneracy, $D$, number of communities, $N_c$, and modularity, $Q$, for communities extracted using either the Louvain (Lvn) or stochastic block model (SBM) method. All networks have $N = 10000$ nodes.

Figure S3: Survival probability distribution, $P_{\geq}(k_s)$, of the k-shell index, $k_s$, for the LFR networks generated using parameter batch 1 (i.e. with $t_1 = 2.2$ and $t_2 = 1.5$; see Table S2). The dotted lines correspond to the original network. The solid lines correspond to shuffled networks. Each panel corresponds to a value of $\mu$. Shuffled results are averages over 10 realisations. The shaded area corresponds to the standard deviation. All networks have $N = 10000$ nodes.

Figure S4: Survival probability distribution, $P_{\geq}(k_s)$, of the k-shell index, $k_s$, for the LFR networks generated using parameter batch 2 (i.e. with $t_1 = 2.6$ and $t_2 = 2.0$; see Table S2). See the caption of Fig. S3 for notations and legends.

Figure S5: Survival probability distribution, $P_{\geq}(k_s)$, of the k-shell index, $k_s$, for the LFR networks generated using parameter batch 3 (i.e. with $t_1 = 2.9$ and $t_2 = 1.5$; see Table S2). See the caption of Fig. S3 for notations and legends.

Figure S6: Survival probability distribution, $P_{\geq}(k_s)$, of the k-shell index, $k_s$, for the LFR networks generated using parameter batch 4 (i.e. with $t_1 = 3.0$ and $t_2 = 2.0$; see Table S2). See the caption of Fig. S3 for notations and legends.
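The survival probability $P_{\geq}(k_s)$ shown in Figs. S1 and S3-S6 can be obtained from the k-shell index of each node. The sketch below is a plain-Python illustration under stated assumptions: the network is a simple undirected graph stored as a dictionary of neighbour sets, and the function names `k_shell_index` and `survival_probability` are hypothetical. It uses the standard peeling procedure rather than any specific library routine.

```python
from collections import Counter


def k_shell_index(adj):
    """Peel nodes of degree <= k for increasing k; return {node: k_s}."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    k_shell = {}
    k = 0
    while remaining:
        peel = [v for v in remaining if degree[v] <= k]
        if not peel:
            k += 1  # nothing left to remove at this k, move to the next shell
            continue
        while peel:
            v = peel.pop()
            if v not in remaining:
                continue  # already removed (node was queued more than once)
            remaining.discard(v)
            k_shell[v] = k
            for u in adj[v]:
                if u in remaining:
                    degree[u] -= 1
                    if degree[u] <= k:
                        peel.append(u)
    return k_shell


def survival_probability(k_shell):
    """Fraction of nodes whose k-shell index is at least k, for each k."""
    n = len(k_shell)
    counts = Counter(k_shell.values())
    survival = {}
    at_least = n
    for k in range(min(counts), max(counts) + 1):  # empty shells are included
        survival[k] = at_least / n
        at_least -= counts.get(k, 0)
    return survival


# Toy network: a 4-clique (nodes 0-3) with a two-node tail (3-4-5).
adj = {
    0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1, 3},
    3: {0, 1, 2, 4}, 4: {3, 5}, 5: {4},
}
k_shell = k_shell_index(adj)
survival = survival_probability(k_shell)
```

In this toy network the 2-shell is empty, so `survival` takes the same value at $k_s = 2$ and $k_s = 3$, which is the kind of plateau discussed for the empirical networks.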

Relationship between community structure and k-core decomposition
In this section, we examine the number of communities to which the nodes in each k-shell belong, with the aim of determining whether those nodes are concentrated into one or a small number of communities, particularly for nodes in the innermost k-shells. Figure S7 shows the number of distinct communities to which the nodes with a given $k_s$ value belong, denoted by $n_C(k_s)$, for all the data sets. In agreement with Fig. 3, some data sets show a strong concentration of the innermost k-shells (i.e. nodes with large $k_s$ values) into one or a few communities.

Next, we ask whether the number of communities across which each k-shell is distributed is a byproduct of random interactions. To answer this question, first, for each network, we extract communities using either Lvn or SBM. Second, we compute $n_C(k_s)$ for each $k_s$. Third, we permute the association between the k-shell index of each node, $k_s(i)$, and the community membership of the node, $g(i)$, uniformly at random; in fact, it is sufficient to randomly permute either $\{k_s(1), \ldots, k_s(N)\}$ or $\{g(1), \ldots, g(N)\}$, not both. Fourth, we calculate the number of communities to which the set of nodes with a given $k_s$ value belong after the permutation, which is denoted by $n_C^S(k_s)$. Fifth, using an approach similar to the calculation of the rich-club coefficient [5], we compute

$$\phi(k_s) = \frac{n_C^S(k_s)}{n_C(k_s)} \tag{S1}$$

for each $k_s$. A value of $\phi(k_s)$ larger (smaller) than 1 indicates that the number of communities to which the nodes having the $k_s$ value belong is smaller (larger) than in the case of the randomised association between the nodes and communities. Therefore, $\phi(k_s)$ larger than 1 implies that the nodes with the given k-shell index, $k_s$, are concentrated into a relatively small number of communities as compared to randomised counterparts. In Fig. S8 we plot $\phi(k_s)$ against $k_s$ for all the data sets. We observe that, with the exception of the Spanish and British Cookpad's networks, $\phi(k_s)$ tends to be larger than 1. This result implies that, on average, nodes of a given k-shell tend to belong to fewer communities than in the randomised case. We stress that the permutation of either the k-shell index or the community membership sequence may return networks whose k-shell and community structure are not physically plausible. For instance, if a node $i$ receives a k-shell index value of $\alpha$ upon randomisation and $\alpha$ is larger than $k_i$ (i.e. the degree of node $i$), then the node cannot belong to the corresponding k-shell.

Figure S7: Number of communities, $n_C(k_s)$, to which the nodes having k-shell index $k_s$ belong. The horizontal line is a guide to the eye representing $n_C(k_s) = 1$. We identified the community structure using either Lvn or SBM. Each panel corresponds to a different data set.

Figure S8: Ratio, $\phi(k_s)$ (see Eq. S1), plotted against the k-shell index, $k_s$, for all the data sets. We identified the community structure using either Lvn or SBM. Each panel corresponds to a different data set. Results are averaged over one hundred randomisations of the association between the node's k-shell index and community label. The horizontal dashed lines represent $\phi(k_s) = 1$.
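The permutation procedure behind Eq. (S1) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that the k-shell index and the community label of every node are given as dictionaries, it permutes the community labels only, and it averages $n_C^S(k_s)$ over the runs before dividing by $n_C(k_s)$. The function names, the seed handling, and the toy data are hypothetical.

```python
import random
from collections import defaultdict


def communities_per_shell(k_shell, membership):
    """Number of distinct communities hosting the nodes of each k-shell."""
    groups = defaultdict(set)
    for node, ks in k_shell.items():
        groups[ks].add(membership[node])
    return {ks: len(labels) for ks, labels in groups.items()}


def phi(k_shell, membership, n_runs=100, seed=0):
    """Ratio of the shuffled to the original number of communities per k-shell."""
    rng = random.Random(seed)
    n_c = communities_per_shell(k_shell, membership)
    nodes = list(membership)
    labels = [membership[v] for v in nodes]
    totals = defaultdict(float)
    for _ in range(n_runs):
        rng.shuffle(labels)  # permute community labels, keep k-shell indices
        shuffled = dict(zip(nodes, labels))
        for ks, count in communities_per_shell(k_shell, shuffled).items():
            totals[ks] += count
    return {ks: (totals[ks] / n_runs) / n_c[ks] for ks in n_c}


# Toy data: 12 nodes in three communities of four nodes each.
membership = {i: "ABC"[i // 4] for i in range(12)}
# The innermost shell (k_s = 3) coincides with community "A".
k_shell = {i: 3 if i < 4 else 1 for i in range(12)}
result = phi(k_shell, membership)
```

In the toy data the innermost shell sits entirely inside one community, so `result[3]` exceeds 1, mirroring the concentration described for some of the empirical data sets.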