Core-like groups resulting in invalidation of k-shell decomposition analysis

Identifying the most influential spreaders is an important issue in understanding and controlling spreading processes on complex networks. Recent studies showed that nodes located in the core of a network, as identified by the k-shell decomposition, are the most influential spreaders. However, through extensive numerical simulations, we observe that nodes in high shells are not highly influential in all real networks: in some networks the core nodes are the most influential, which we call a true core, while in others nodes in high shells, even in the innermost core, are not good spreaders, which we call a core-like group. By analyzing the k-core structure of the networks, we find that the true core of a network links diversely to the shells of the network, while the core-like group links very locally within the group. For nodes in the core-like group, the k-shell index cannot reflect the importance of their location in the network. We further introduce a measure based on the link diversity of shells to effectively distinguish the true core from the core-like group, and to identify core-like groups throughout the networks. Our findings help to better understand the structural features of real networks and influential nodes.

The most influential nodes can maximize the speed and scope of information spreading compared with other nodes in a network [1]. Locating these influential nodes is important in improving the use of available resources [2] and controlling the spread of information [3]. A critical issue is how to determine and distinguish the spreading capability of a node. Centrality is usually used to measure the relative importance of nodes within the network; examples include degree centrality [4], betweenness centrality [5], closeness centrality [6], eigenvector centrality [7], and PageRank centrality [8] and its variants [9]. Nodes with high centrality are considered more influential in the spreading process [10][11][12][13][14]. Among these measures, degree centrality is simple and effective, although it is based only on local link information [15,16]. The merit of degree was challenged by a recent study [12], in which the authors pointed out that the most influential spreaders are not the nodes with the largest degree, but those located in the core of the network as identified by the k-shell decomposition [17]. This means that the higher the coreness of a node, the more influential it is in the spreading dynamics.
The k-shell decomposition decomposes a network into hierarchically ordered shells by recursively pruning the nodes with degree less than the current shell index (see Methods for details). This procedure assigns each node an index $k_S$, representing its coreness in the network. A large $k_S$ value means a core position in the network, while a small $k_S$ value defines the periphery of the network [12]. Because of its low computational complexity of O(N + E) [18], where N is the network size and E is the number of edges in the network, this method is extensively used for large-scale network analysis. Generally speaking, it is used to efficiently visualize the structure of large-scale networks [19,20], analyze the core structure of networks [21][22][23][24][25], and capture the essential structural properties of real networks [26]. Since the publication of Ref. [12], the coreness has been widely used to quantify the importance of a node in a spreading process [27,28]. For example, in an economic crisis network, nodes with the highest coreness are most likely to spread a crisis globally [29], while in rumor spreading, nodes with high coreness act as firewalls that prevent the diffusion of a rumor to the whole system [30]. Even nodes with low coreness are considered as bridge elements, which can effectively control a disease in small-world networks through an acquaintance-based vaccination strategy [31]. Many works have extended the k-shell decomposition method, either modifying it for a better ranking [32][33][34][35][36] or generalizing it to weighted networks [37,38], dynamical networks [39] and multiplex networks [40].
In all these studies, the k-shell decomposition is used as a powerful tool to analyze the network structure and identify important nodes. A critical question then is: in different real networks, do the core nodes have a higher spreading influence than other nodes? The common belief is that they do. But through intensive computer simulations, we find that this is not always the case. In some networks core nodes have the largest spreading efficiency, while in other networks core nodes have relatively low spreading efficiency. What is the reason for these clearly distinct results? To our knowledge, no work has focused on this question. In Refs. [12,36], the authors pointed out that the performance of a centrality measure is related to the infection probability used when evaluating the spreading capability of nodes. We find that although the infection probability causes some fluctuations, the specific structure of real networks is the origin of the distinct performance of coreness: in the first case, the core of a network has diverse links to other shells of the network, while in the latter case the core is linked very locally. We call them the true core and the core-like group, respectively. We then propose a measure based on information entropy to locate these core-like groups in real networks. These findings will help us understand real network structures.

Results
We first calculate the imprecision function of coreness and degree in the SIR spreading process and discover the true core and core-like group in real networks. We then analyze the structural characteristics of the true cores and core-like groups. Finally, we locate the core-like groups throughout the network by defining a measure of link entropy.
Calculating the imprecision function of coreness and degree in the SIR spreading process. We use the classic Susceptible-Infected-Recovered (SIR) spreading model to simulate the spreading process [41,42], and record the spreading capability (or spreading efficiency) of each node, defined as the average size M of the infected population when that node is the spreading origin (see Methods for details). To evaluate whether coreness is an effective index of the spreading capability of nodes compared with degree, we calculate the imprecision functions $\epsilon_{k_S}(p)$ and $\epsilon_k(p)$ proposed in Ref. [12]. The imprecision function is defined as
$$\epsilon_{k_S(k)}(p) = 1 - \frac{M_{k_S(k)}(p)}{M_{eff}(p)}, \qquad (1)$$
where p is the fraction of the network size N ($p \in [0, 1]$), and $M_{k_S(k)}(p)$ and $M_{eff}(p)$ are the average spreading efficiencies of the pN nodes with the highest coreness $k_S$ (degree k) values and the largest spreading efficiency, respectively. This function quantifies how close the average spreading of the pN nodes with the largest $k_S$ (k) values is to the optimal spreading. The smaller the $\epsilon_{k_S(k)}$ value, the more accurately the $k_S$ (k) index identifies the most influential spreaders.
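The imprecision function defined above can be illustrated with a short Python sketch. This is not the authors' code: the function name `imprecision` and the dictionary-based inputs are assumptions made for illustration, and ties in the ranking are broken by insertion order here, whereas the text chooses among tied nodes at random.

```python
def imprecision(score, efficiency, p):
    """Imprecision 1 - M_score(p) / M_eff(p) of a centrality ranking.

    score:      {node: centrality value}, e.g. coreness k_S or degree k
    efficiency: {node: average infected population M when the node is the seed}
    p:          fraction of the network size N to consider, 0 < p <= 1
    """
    n = max(1, round(p * len(score)))  # number of top nodes, pN
    # Top-n nodes by centrality (ties kept in insertion order; an assumption).
    top_by_score = sorted(score, key=score.get, reverse=True)[:n]
    # Top-n nodes by actual spreading efficiency (the optimal choice).
    top_by_eff = sorted(efficiency, key=efficiency.get, reverse=True)[:n]
    m_score = sum(efficiency[v] for v in top_by_score) / n
    m_eff = sum(efficiency[v] for v in top_by_eff) / n
    return 1.0 - m_score / m_eff


# Toy data (hypothetical): node "a" ranks first by score but spreads poorly.
score = {"a": 3, "b": 2, "c": 1, "d": 1}
efficiency = {"a": 10.0, "b": 40.0, "c": 30.0, "d": 20.0}
print(imprecision(score, efficiency, 0.5))
```

A value of 0 means the centrality picks exactly the best spreaders; larger values mean the selected nodes spread less than the optimal set of the same size.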
The imprecision functions of nine real networks are shown in Fig. 1. Contrary to common belief, the coreness $k_S$ does not perform consistently well in all networks. We divide the networks into three groups. In the Router, Email-contact and AS networks, the $k_S$ imprecision is consistently lower than that of the k-based method. In Fig. 1(a)-(c), the imprecision $\epsilon_{k_S}(p)$ is very low, under 0.06 in the demonstrated range of $0.003 \le p \le 0.029$, and is much lower than $\epsilon_k(p)$. This means that the coreness predicts the outcome of spreading more reliably than degree. However, the imprecision $\epsilon_{k_S}(p)$ for the next three networks (i.e., Email, CA-Hep and Hamster) is much higher than the imprecision $\epsilon_k(p)$. In Fig. 1(d)-(f), the value of $\epsilon_{k_S}(p)$ is above 0.2 for all three networks, and is much higher than $\epsilon_k(p)$. This is completely contrary to the case of the first three networks. For the last three networks, PGP, Netsci and Astro, the picture is more complicated, as shown in Figs. 1(g), (h) and (i). In PGP, $k_S$ performs better than k when p < 0.015. Then there is a sudden rise in $\epsilon_{k_S}(p)$, which becomes higher than the imprecision of degree. In Netsci, when $p \le 0.026$, the imprecision of $k_S$ is much lower than that of k. There is a fast rise of $\epsilon_{k_S}(p)$ at p = 0.027, and at p = 0.05 the $k_S$ imprecision exceeds the k imprecision (see Fig. S1 in Supporting Information (SI) for large p plots). In Astro, the sudden rise of the $k_S$ imprecision occurs at p = 0.015 and the value of $\epsilon_{k_S}(p)$ goes up to around 0.18. This indicates a complex performance of coreness as a measure of spreading efficiency.
Discovering true core and core-like group in real networks. Finding the reason for the distinct performance of coreness in the SIR spreading process is the motivation of this paper. In the following, we first explore the structural characteristics of the first two groups of networks, and then explain the performance of coreness in the last three networks. The k-shell decomposition tends to assign the same $k_S$ value to many nodes, although their spreading capabilities may be different. When we calculate the imprecision function at a certain p, nodes with the same $k_S$ value are chosen randomly. This causes some fluctuation in the $k_S$ imprecision curve (the fraction of nodes in high shells is shown in Table S1 in SI). Given this fluctuation, we instead calculate
$$\epsilon_{k_S}(k_S) = 1 - \frac{M_{k_S}(k_S)}{M_{eff}(k_S)}, \qquad (2)$$
where $M_{k_S}(k_S)$ is the average spreading efficiency of the nodes with coreness $k'_S \ge k_S$ (nodes in the $k_S$-core), and $M_{eff}(k_S)$ is the average spreading efficiency of the n nodes with the highest spreading efficiency, where n equals the number of nodes with coreness $k'_S \ge k_S$. To compare with the performance of k, we have
$$\epsilon_k(k_S) = 1 - \frac{M_k(k_S)}{M_{eff}(k_S)}, \qquad (3)$$
where $M_k(k_S)$ is the average spreading efficiency of the n nodes with the highest degree, and n equals the number of nodes with coreness $k'_S \ge k_S$. The imprecision of $k_S$ is expected to be low if the nodes in high shells are efficient spreaders. The results are shown in Fig. 2.
In the first three networks (a)-(c), $\epsilon_{k_S}(k_S)$ is very low and much lower than the imprecision of degree $\epsilon_k(k_S)$ for large $k_S$, which means that most nodes in high shells (shells with large $k_S$ values) are efficient spreaders. In the next three networks (d)-(f), $\epsilon_{k_S}(k_S)$ is much higher than $\epsilon_k(k_S)$ for the innermost core (the shell with the maximum $k_S$ value), and its absolute value is even greater than 0.4, which means that many nodes in the innermost core are not influential spreaders. From the dynamical perspective, we call the innermost core of the first three networks a true core, and tentatively call that of the other three networks a false core, or core-like group. This poor $k_S$ performance is clearly different from the fluctuation of imprecision caused by the resolution of the $k_S$ index mentioned above.
Investigating the impact of structural features on the coreness imprecision. To find the reason for the distinct performance of coreness in the spreading process, we first look into the structural properties of the studied real networks, which are listed in Table I. From Table I, we see that the degree heterogeneity of the first group is considerably larger than that of the second group. In Ref. [26] the authors pointed out that the maximum $k_S$ value in a network, $k_{Smax}$, scales approximately as the square root of the maximum degree, $\sqrt{k_{max}}$, in many real networks, which implies that the larger the maximum degree of the network, the more shells it has. A broad degree distribution with large $k_{max}$ results in a large degree heterogeneity $H_k$, which is the feature of the three networks in the first group. In addition, the degree assortativity r of the first group is negative, which implies that nodes of large degree tend to connect to nodes of small degree. As nodes in high shells always have large degrees and nodes in low shells (shells with small $k_S$ values) have small degrees, negative assortativity implies a good connection between high-shell nodes and low-shell nodes. On the contrary, the assortativity of the second group is positive or close to zero, which implies that nodes of large degree tend to connect to each other or connect randomly.
To evaluate whether the degree heterogeneity $H_k$ and assortativity r have a direct impact on the performance of coreness, we randomize the networks using two rewiring schemes. In the first scheme, the degrees of nodes are preserved after each single rewiring but the correlations between the degrees of connected nodes are destroyed [43]. In the second scheme, the rewiring preserves both the degrees of nodes and the joint degree-degree distribution of connected nodes, $P(k, k')$, so that the degree-degree correlations of all nodes are preserved. For each scheme, we rewire the real network 100E times, where E is the total number of edges. As shown in Fig. S2 in SI, the $k_S$ imprecision is very low and generally lower than or close to the k imprecision in the degree-preserving randomized networks, which indicates that coreness is an accurate measure of the spreading efficiency there. The results for the degree-degree correlation preserving randomization in Fig. S3 are almost the same as those in Fig. S2, with only a slight difference in that the $k_S$ imprecision is lower than that of k in high shells in Email and Hamster. These results in the randomized networks imply that $H_k$ and r are not the key factors behind the poor $k_S$ performance in high shells in the second group, and that some specific structures should exist in these real networks.
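The first rewiring scheme can be sketched as a standard double-edge swap, which keeps every node's degree while shuffling who is connected to whom. The code below is an illustrative assumption about how such a scheme may be implemented, not the authors' implementation; the names `degree_preserving_rewire` and `degree_of` are hypothetical, the input is assumed to be a simple undirected edge list, and the second scheme (which also preserves $P(k, k')$) is not covered.

```python
import random


def degree_preserving_rewire(edges, n_swaps, seed=None):
    """Randomize a simple undirected graph by double-edge swaps.

    Two edges (a, b) and (c, d) are replaced by (a, d) and (c, b) unless that
    would create a self-loop or a duplicate edge. Node degrees never change.
    """
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    done = attempts = 0
    # Cap the attempts so the loop ends even if few valid swaps exist.
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        a, b = edges[i]
        c, d = edges[j]
        if rng.random() < 0.5:  # pick one of the two possible pairings
            c, d = d, c
        if len({a, b, c, d}) < 4:
            continue  # would create a self-loop or re-create the same edge
        if frozenset((a, d)) in present or frozenset((c, b)) in present:
            continue  # would create a duplicate edge
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {frozenset((a, d)), frozenset((c, b))}
        edges[i], edges[j] = (a, d), (c, b)
        done += 1
    return edges


def degree_of(edges):
    """Degree of every node appearing in an edge list."""
    deg = {}
    for a, b in edges:
        deg[a] = deg.get(a, 0) + 1
        deg[b] = deg.get(b, 0) + 1
    return deg


# Toy graph (hypothetical): a ring of 8 nodes with two chords.
ring = [(i, (i + 1) % 8) for i in range(8)] + [(0, 4), (1, 5)]
shuffled = degree_preserving_rewire(ring, 100 * len(ring), seed=1)
print(degree_of(shuffled) == degree_of(ring))
```

With `n_swaps` set to 100 times the number of edges, the sketch mirrors the amount of rewiring stated in the text.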
Analyzing the connectivity between shells. We then explore the connectivity between shells of the studied real networks. Specifically, for each node i we consider the links to its upper shells (shells with index $k_S$ greater than the coreness of node i), equal shell (the shell with index $k_S$ equal to the coreness of node i) and lower shells (shells with index $k_S$ smaller than the coreness of node i). We define the link strength of node i to its upper shells by the proportion function
$$r^u_i = \frac{e^u_i}{k_i}, \qquad (4)$$
where $e^u_i$ is the number of links from node i to nodes in upper shells and $k_i$ is the total number of links of node i, that is, the degree of node i. A large $r^u_i$ indicates more links to the upper shells. Similarly, the link strengths of node i to its equal shell and lower shells are quantified by $r^e_i = e^e_i/k_i$ and $r^l_i = e^l_i/k_i$, respectively. The link strengths of the $k_S$ shell to its upper (equal, lower) shells are the average link strengths of the nodes in that shell, that is,
$$R^u_{k_S} = \frac{1}{n_{k_S}} \sum_{i \in \Gamma_{k_S}} r^u_i, \quad R^e_{k_S} = \frac{1}{n_{k_S}} \sum_{i \in \Gamma_{k_S}} r^e_i, \quad R^l_{k_S} = \frac{1}{n_{k_S}} \sum_{i \in \Gamma_{k_S}} r^l_i, \qquad (5)$$
where $\Gamma_{k_S}$ is the set of nodes with coreness $k_S$, $n_{k_S}$ is the number of nodes with coreness $k_S$, and $R^u_{k_S} + R^e_{k_S} + R^l_{k_S} = 1$. In Fig. 3(a)-(c) for the first group of networks, $R^u_{k_S}$ generally decreases with $k_S$, because the number of nodes in shells decreases monotonically with increasing $k_S$. $R^e_{k_S}$ remains stable as $k_S$ increases. $R^l_{k_S}$ increases with $k_S$, and in the innermost core it reaches 0.6 and above. For large $k_S$ values, $R^l_{k_S}$ is much greater than $R^e_{k_S}$, which means that a large proportion of the links of high shells point to lower shells, clearly more than the proportion of links within the shell.
In Fig. 3(d)-(f) for the second group, $R^u_{k_S}$ decreases with $k_S$ in Email and CA-Hep, but there is some fluctuation in Hamster. $R^e_{k_S}$ increases with $k_S$. In the innermost core, $R^e_{k_S}$ is close to 0.7 in Email, close to 1.0 in CA-Hep, and close to 0.8 in Hamster, which is at least 50% larger than that of the first group. $R^l_{k_S}$ increases with $k_S$ at first and falls suddenly at some high shells. For these three networks, $R^l_{k_S}$ is under 0.4 in the innermost core, and is much lower than $R^e_{k_S}$. This indicates that in the second group, the proportion of links from high shells pointing to lower shells is clearly lower than the proportion of links within the shell. This is a sign of a densely connected small group within the shell. The average clustering coefficient of nodes in high shells also reflects the overly dense connections in high shells in the second group (see Fig. S4 in SI). We also plot the link strength of each shell to its lower shells, equal shell and upper shells in the degree-degree correlation preserving randomized networks in Fig. S5. $R^l_{k_S}$ rises above 0.35 and is greater than $R^e_{k_S}$ in most high shells in CA-Hep and Hamster, although in Email there is only a small rise. The rewiring has changed the dense local link patterns of core-like groups, which is reflected by the increase of $R^l_{k_S}$ and the decrease of $R^e_{k_S}$ in high shells. The improved $k_S$ performance is the result of enhanced link diversity.
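The shell-level link strengths described above can be computed directly from an adjacency structure and a coreness assignment. The following is a minimal sketch under stated assumptions: the function name `shell_link_strengths` is hypothetical, the graph is given as a dictionary of neighbor sets, and the coreness of each node is supplied as input.

```python
def shell_link_strengths(adj, shell):
    """Average link strength of each shell to its upper, equal and lower shells.

    adj:   {node: set of neighbors} for an undirected graph
    shell: {node: coreness k_S}
    Returns {k_S: (R_upper, R_equal, R_lower)}; the three values sum to 1.
    """
    sums, counts = {}, {}
    for i, nbrs in adj.items():
        k_i = len(nbrs)
        if k_i == 0:
            continue  # isolated nodes have no links to classify
        upper = sum(1 for j in nbrs if shell[j] > shell[i])
        equal = sum(1 for j in nbrs if shell[j] == shell[i])
        lower = k_i - upper - equal
        acc = sums.setdefault(shell[i], [0.0, 0.0, 0.0])
        acc[0] += upper / k_i  # node-level strength to upper shells
        acc[1] += equal / k_i  # node-level strength to the equal shell
        acc[2] += lower / k_i  # node-level strength to lower shells
        counts[shell[i]] = counts.get(shell[i], 0) + 1
    return {k: tuple(x / counts[k] for x in acc) for k, acc in sums.items()}


# Toy graph (hypothetical): a triangle 0-1-2 with a pendant node 3 attached to 0.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
shell = {0: 2, 1: 2, 2: 2, 3: 1}
print(shell_link_strengths(adj, shell))
```

In the toy graph the 2-shell keeps most of its links inside the shell, the pattern that the text associates with core-like groups when it appears in high shells of real networks.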
Next, we focus on the link pattern of the innermost core. The link strength $r^{k_S}_i = e^{k_S}_i/k_i$ is the ratio of the number of links from an innermost core node i to the shell with index $k_S$ to the degree of node i. $R_{k_{Smax}}$ is the average link strength of nodes in the innermost core to the shell with index $k_S$. Fig. 4(a) shows the link strength of the innermost core to all shells in the first three networks, which is a U-shaped curve. In these networks, apart from the link ratio within the core, the largest link ratio points to the shell with the most nodes, usually the 1-shell. A U-shaped distribution of links from the core is a typical feature of a core-periphery structure, in which core nodes are well connected to other core nodes and to periphery nodes, and periphery nodes are not well connected to each other [44]. In the second group, shown in Fig. 4(b), the link pattern of the innermost core to all shells is different. Core nodes strongly tend to connect to other core nodes, with a link strength above 0.6. The second largest link ratio points to the shell adjacent to the innermost core, rather than the shell with the most nodes. When a disease originates from core nodes, it spreads easily throughout the core, but spreads system-wide with relative difficulty. This locally connected phenomenon also suggests the origin of the core-like group (i.e., false core): nodes are densely connected within a small group, which contributes much to the $k_S$ index of the nodes, but in the whole network these nodes are not the best connected and are not located in the most important position for spreading. The link pattern of the second innermost shell is shown in Fig. S6 in SI.
Identifying core-like groups from a structural perspective. The above analysis suggests a clear structural difference between the two groups of networks: in the first group, the link pattern of the innermost core to other shells exhibits a strong diversity, while in the second group, the links of the innermost core are very localized within the shell. To quantify the link diversity of a shell with index $k_S$, we define a link entropy as
$$H_{k_S} = -\frac{1}{\ln L} \sum_{k'_S} r_{k_S,k'_S} \ln r_{k_S,k'_S}, \qquad (6)$$
where $r_{k_S,k'_S}$ is the average link strength of nodes of coreness $k_S$ to the shell of coreness $k'_S$, L is the number of shells, and the sum runs over all shells. The normalization factor $\ln L$ is the entropy when links are uniformly distributed over all shells. This normalization makes networks with different numbers of shells comparable. For the innermost core of each network, $k_S$ is set to the maximum $k_S$ value of the network. The entropies of the cores of the real networks and their degree-preserving randomized versions are shown in Fig. 5. In Fig. 5(a), true cores have a link entropy $H_{k_{Smax}}$ higher than 0.6 while false cores have a link entropy lower than 0.5. But in the randomized networks, Fig. 5(b), all the cores have a link entropy $H_{k_{Smax}}$ higher than 0.6. Fig. S7 in SI shows the core entropy of the degree-degree correlation preserving randomized networks, which is above 0.5 for all studied networks. High entropy corresponds to a more uniform link pattern, where the core is well connected to the other parts of the network. Low entropy corresponds to a localized link pattern, where the core is densely connected within the shell. In fact, these false cores are not located in the central position of the networks, as reflected by their relatively low spreading efficiency, e.g., the 11-shell in Email, the 31-shell in CA-Hep and the 24-shell in Hamster, as shown in Figs. S9 and S10 in SI.
Locating the position of core-like groups throughout the networks. Uncovering locally connected core-like groups helps us understand the imprecision of coreness centrality in the spreading process. We present the imprecision functions of the PGP, Netsci and Astro networks in Fig. 6(a)-(c). The coreness performs very well at large $k_S$ values, but its imprecision then rises suddenly at certain shells. Specifically, in PGP at the 22-shell and above, the imprecision of $k_S$ is lower than that of k. However, there are sudden rises at the 21-shell, 16-shell and 15-shell. From the 10-shell, $k_S$ again performs better than k. In Netsci, the $k_S$ imprecision is very low at the 8-shell. Then it rises at the 7-shell and 6-shell. The $k_S$ imprecision is worse than that of k until the 4-shell. In Astro, the $k_S$ imprecision is low at the 51-shell and higher shells. Then it rises at the 48-shell and then falls. The same phenomenon occurs at the 30-shell. The rise of the $k_S$ imprecision implies that the corresponding shells are core-like groups. Locating them by a dynamic spreading method requires time-consuming simulations.
According to Eq. (6), we calculate the link entropy of each shell in these networks. The shells outlined by hollow red circles in Fig. 6(d)-(f) have relatively low entropy, which corresponds to locally connected core-like groups. This is reflected by the rise in the $k_S$ imprecision shown in Fig. 6(a)-(c). The link patterns of the core-like groups, shown in Fig. S8, are similar to those of the false cores in the second group (Email, CA-Hep and Hamster): a dense connection within the shell. The only difference is that these core-like groups are located in outer shells of the network rather than in the innermost shell. These core-like groups have a clearly lower spreading efficiency than their adjacent shells, which is also confirmed in Figs. S9 and S10. From the above, we see that the link entropy provides a fast way to locate core-like groups in the network without running a large number of spreading simulations, which is very important in identifying key spreaders and controlling the spreading dynamics on networks.
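The link entropy of a shell can be evaluated from the same inputs as the link strengths. The sketch below is an illustrative assumption rather than the authors' code: the name `link_entropy` is hypothetical, L is taken as the number of distinct coreness values present in the network, and terms with zero link strength are skipped (the usual convention that 0 ln 0 = 0).

```python
import math


def link_entropy(adj, shell, k_s):
    """Normalized link entropy of the shell with index k_s.

    adj:   {node: set of neighbors} for an undirected graph
    shell: {node: coreness k_S}
    The result lies in [0, 1]: low values mean links concentrate on few shells.
    """
    shells = sorted(set(shell.values()))
    n_shells = len(shells)  # L, assumed to be the number of non-empty shells
    members = [v for v in adj if shell[v] == k_s and adj[v]]
    if n_shells < 2 or not members:
        return 0.0
    # r[k'] = average over shell members of (links to shell k') / degree
    r = dict.fromkeys(shells, 0.0)
    for i in members:
        for j in adj[i]:
            r[shell[j]] += 1.0 / (len(adj[i]) * len(members))
    return -sum(x * math.log(x) for x in r.values() if x > 0) / math.log(n_shells)


# Toy graph (hypothetical): a triangle 0-1-2 with a pendant node 3 attached to 0.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
shell = {0: 2, 1: 2, 2: 2, 3: 1}
print(link_entropy(adj, shell, 2))
```

Scanning all shells with this function and flagging those with unusually low values is the structural alternative to running spreading simulations that the text describes.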

Discussion
Analyzing and profiling the structures of real networks is an important step in understanding and controlling dynamic behaviors on these networks. The k-shell decomposition is a powerful tool to profile the hierarchical structures of networks. The inner core corresponds to the shells of large $k_S$ and the network periphery corresponds to the shells of small $k_S$. This makes the $k_S$ index an effective centrality measure to distinguish the spreading capability of nodes, which is validated in many real networks. However, in some real networks, there may exist core-like groups, which have high coreness but are in fact not influential spreaders. By analyzing the k-core structures of real networks, we discover the distinct link patterns of true cores and core-like groups. The true core of a network displays strong link diversity to other shells of the network, represented by a U-shaped link curve.
The core-like group, in contrast, has very dense and local internal connections, represented by a slope-shaped link curve. Based on the link pattern, we define a measure of link entropy to evaluate the link diversity of a shell to the remaining shells of the network. This provides a fast way to locate, from a structural perspective, the core-like groups throughout the network, which have a relatively low link entropy. Uncovering these core-like groups is important in identifying key players and designing control strategies for spreading dynamics. It is worth noting that in the core-like groups, apart from nodes of low spreading efficiency, there may also exist some good spreaders. A better ranking strategy that is able to distinguish them needs further investigation. This implies that a new network decomposition method is needed, one that places the nodes of different importance in core-like groups in the right hierarchical position. Such a method should apply well to real networks with specific structures such as strong community structures.

Methods
The k-shell decomposition. The algorithm starts by removing all nodes with degree k = 1. After removing these nodes, some remaining nodes may be left with only one link. We iteratively remove these nodes until there is no node left with k = 1. The removed nodes are assigned an index $k_S = 1$ and are considered to be in the 1-shell. In a similar way, nodes with remaining degree $k \le 2$ are iteratively removed and assigned an index $k_S = 2$. This pruning process continues, removing higher shells until all nodes are removed. As a result, each node is assigned a $k_S$ index, and the network can be viewed as a hierarchical structure from the innermost shell to the periphery shell.
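The pruning procedure above can be written as a short Python sketch. It is meant to make the steps concrete and is not the linear-time O(N + E) algorithm of Ref. [18]; the function name `k_shell` and the dictionary-of-sets graph format are assumptions for illustration, and isolated nodes (which the description does not mention) receive index 0 here.

```python
def k_shell(adj):
    """Assign each node its k-shell index by iterative pruning.

    adj: {node: set of neighbors} for a simple undirected graph.
    Returns {node: k_S}.
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    shell = {}
    k = 0
    while remaining:
        # The next shell index is the smallest remaining degree (never decreasing).
        k = max(k, min(degree[v] for v in remaining))
        queue = [v for v in remaining if degree[v] <= k]
        while queue:
            v = queue.pop()
            if v not in remaining:
                continue  # already removed through another queue entry
            remaining.discard(v)
            shell[v] = k
            for u in adj[v]:
                if u in remaining:
                    degree[u] -= 1
                    # Removing v may pull u down into the current shell.
                    if degree[u] <= k:
                        queue.append(u)
    return shell


# Toy graph (hypothetical): a triangle 0-1-2 with a pendant node 3 attached to 0.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
print(k_shell(adj))
```

In the toy graph the pendant node falls in the 1-shell and the triangle forms the 2-shell, the innermost core.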
SIR Model. The Susceptible-Infected-Recovered (SIR) model is widely used for simulating spreading processes on networks. In the model, a node has three possible states: S (susceptible), I (infected) and R (recovered). An individual in the susceptible state does not have the disease yet but could catch it on contact with someone who does. An individual in the infected state has the disease and can pass it to susceptible individuals. An individual in the recovered state neither spreads the disease nor can be infected by others. At the start of a spreading process, a single node, the seed, is infected and all other nodes are susceptible. Each time step has two stages. In the first stage, susceptible individuals become infected with probability λ when they are in contact with an infected neighbor. In the second stage, infected nodes recover or die (change to the R state) with probability µ. Here we set µ = 1 for generality. The spreading process stops when there is no infected node in the network. We record the infected population originating at node i, averaged over 100 realizations of the spreading process, as $M_i$ to quantify the influence of node i in SIR spreading.
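A compact simulation of this process with µ = 1 might look as follows. This is a sketch under stated assumptions, not the authors' code: the names `sir_outbreak_size` and `spreading_efficiency` are hypothetical, updates are synchronous, each infected neighbor makes an independent infection attempt, and the seed itself is counted in the final infected population.

```python
import random


def sir_outbreak_size(adj, seed_node, lam, rng):
    """Run one SIR realization with recovery probability mu = 1.

    adj: {node: set of neighbors}; lam: infection probability per contact.
    Returns the number of nodes that were ever infected (seed included).
    """
    infected = {seed_node}
    recovered = set()
    while infected:
        newly_infected = set()
        # Stage 1: every infected node tries to infect its susceptible neighbors.
        for i in infected:
            for j in adj[i]:
                susceptible = (j not in infected and j not in recovered
                               and j not in newly_infected)
                if susceptible and rng.random() < lam:
                    newly_infected.add(j)
        # Stage 2: with mu = 1 all currently infected nodes recover.
        recovered |= infected
        infected = newly_infected
    return len(recovered)


def spreading_efficiency(adj, lam, runs=100, seed=0):
    """Average outbreak size M_i for every node i used as the single seed."""
    rng = random.Random(seed)
    return {v: sum(sir_outbreak_size(adj, v, lam, rng) for _ in range(runs)) / runs
            for v in adj}


# Toy graph (hypothetical): a path 0-1-2 plus an isolated node 3.
adj = {0: {1}, 1: {0, 2}, 2: {1}, 3: set()}
print(spreading_efficiency(adj, lam=0.5, runs=100))
```

The default of 100 runs per seed matches the averaging described in the text; the random seed is fixed only to make the illustration reproducible.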
As we take the final infected population to quantify the spreading efficiency of each node, the infection probability should be chosen carefully. If it is too large, the effect of node position is not obvious and all nodes show almost identical spreading capabilities. If it is too small, the infection stays localized in the neighborhood of the seed and cannot reflect the overall spreading influence of the nodes. So we first calculate the epidemic threshold of a network using the heterogeneous mean-field method in Ref. [45], that is, $\lambda_c = \langle k \rangle / (\langle k^2 \rangle - \langle k \rangle)$. Then we choose an infection probability $\lambda > \lambda_c$ [14,36], so that the final infected population is above the critical point, M > 0, and reaches a finite but small fraction of the network size for most nodes as spreading origins, in the range of 1%-20% [12]. In addition, we plot the infected population of a shell, averaged over the nodes belonging to the shell, when the infection probability is 1-5 times the threshold $\lambda_c$, as well as the infected population when the infection probability is around the chosen infection probability λ. We find that the relative spreading efficiency of shells is almost the same under different infection probabilities (see Figs. S9 and S10 in SI).
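The heterogeneous mean-field threshold only needs the first two moments of the degree distribution, so it can be checked with a few lines of Python. The function name `epidemic_threshold` and the toy graph are assumptions for illustration.

```python
def epidemic_threshold(adj):
    """Heterogeneous mean-field threshold <k> / (<k^2> - <k>).

    adj: {node: set of neighbors} for an undirected graph.
    """
    degrees = [len(nbrs) for nbrs in adj.values()]
    mean_k = sum(degrees) / len(degrees)                  # <k>
    mean_k2 = sum(d * d for d in degrees) / len(degrees)  # <k^2>
    return mean_k / (mean_k2 - mean_k)


# Toy graph (hypothetical): a triangle 0-1-2 with a pendant node 3 attached to 0.
# Degrees are 3, 2, 2, 1, so <k> = 2 and <k^2> = 4.5.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
print(epidemic_threshold(adj))
```

For the toy graph the threshold is 2 / (4.5 - 2) = 0.8; on heterogeneous real networks $\langle k^2 \rangle$ is much larger than $\langle k \rangle$, so the threshold is correspondingly small.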
Data Sets. The real networks studied in the paper are: (1) Router (the router-level topology of the Internet, collected by the Rocketfuel Project) [46]; (2) Email-contact (email contacts at the Computer Science Department of University College London) [12]; (3) AS (Internet at the autonomous system level) [47]; (4) Email (e-mail network of the University Rovira i Virgili, URV) [48]; (5) CA-Hep (collaboration network of arXiv in high-energy physics theory) [49]; (6) Hamster (friendships and family links between users of the website hamsterster.com) [50]; (7) PGP (an encrypted communication network) [51]; (8) Netsci (collaboration network of network scientists) [52]; (9) Astro (collaboration network of astrophysics scientists) [53].
FIG. S10: Infected population of $k_S$ shells as a function of the infection probability, which is q times the chosen infection probability λ, with q ranging from 0.9 to 1.5. The relative spreading efficiency of shells is the same as when the infection probability is around $\lambda_c$, as shown in Fig. S9.

FIG. 1: The imprecision of $k_S$ and k as a function of p for nine real networks. The $k_S$ imprecision (black squares) and k imprecision (red circles) are compared in each network. p is the proportion of nodes considered, ranging from 0.003 to 0.029. See Fig. S1 in SI for large p plots.

FIG. 2: The imprecision of $k_S$ and k as a function of $k_S$ for six real networks. The $k_S$ imprecision (black squares) and k imprecision (red circles) are compared in each network. Each square represents the $k_S$ imprecision of the nodes in the $k_S$-core, and each circle represents the k imprecision of the n highest-degree nodes, where n equals the number of nodes in the $k_S$-core. $k_S$ is an integer representing the shell index, ranging from the smallest to the largest $k_S$ value in the network.

FIG. 3: Link strength of shells for the real networks. The link strengths of each shell to its lower shells $R^l_{k_S}$ (black squares), equal shell $R^e_{k_S}$ (red circles) and upper shells $R^u_{k_S}$ (blue triangles) are shown. $k_S$ ranges from the smallest to the largest $k_S$ value in the network.

FIG. 4: Link strength of the innermost core to each shell of the network. (a) The link strength of the innermost core to each shell exhibits a U-shaped curve in the Router (black squares), Email-contact (red circles) and AS (blue triangles) networks. (b) The link strength of the innermost core to each shell exhibits a slope in the Email (black squares), CA-Hep (red circles) and Hamster (blue triangles) networks. $k_S$ ranges from the smallest to the largest $k_S$ value in the network.

FIG. 5: Link entropy of the innermost core for the real networks and their randomized versions. (a) Link entropy of the innermost core for the real networks. (b) Link entropy of the innermost core for the degree-preserving randomized networks. $k_{Smax}$ is the largest $k_S$ value in the network. $H_{k_{Smax}}$ is the link entropy of the innermost core.

FIG. 6: Locating core-like groups in real networks by link entropy. (a)-(c) The imprecision of $k_S$ and k as a function of $k_S$ for three real networks. The $k_S$ imprecision (black squares) and k imprecision (red circles) are compared. (d)-(f) Link entropy of shells in the three networks. $H_{k_S}$ is the link entropy of the $k_S$ shell. Hollow red circles outline the shells that are densely connected core-like groups. These are the 21-shell, 16-shell and 15-shell in PGP, the 7-shell and 6-shell in Netsci, and the 48-shell and 30-shell in Astro. $k_S$ ranges from the smallest to the largest $k_S$ value in the network.
FIG. S2: The imprecision of $k_S$ and k as a function of $k_S$ for degree-preserving randomized networks. In all the networks shown in (a)-(f), the $k_S$ imprecision is very low, under 0.07, and in most cases lower than the k imprecision. For the randomized networks of Email and Hamster, although the $k_S$ imprecision is slightly higher than that of degree in some shells, the absolute values are very low, under 0.025. This indicates that the k-shell strategy is more effective than, or at least as good as, degree in most cases.
FIG. S3: The imprecision of $k_S$ and k as a function of $k_S$ for degree-degree correlation preserving randomized networks. The imprecision of $k_S$ is very low in high shells and is lower than that of k.
FIG. S4: Clustering coefficient of shells for the real networks. (a), (b), (c) The average clustering coefficient is smaller than 0.1 in the innermost core. (d), (e), (f) The average clustering coefficient is greater than 0.5 in the innermost core.
FIG. S5: Link strength of shells for degree-degree correlation preserving randomized networks. The link strengths of each shell to its lower shells $R^l_{k_S}$ (black squares), equal shell $R^e_{k_S}$ (red circles), and upper shells $R^u_{k_S}$ (blue triangles) in the degree-degree correlation preserving randomized networks are shown. (a), (b), (c) $R^l_{k_S}$ is much larger than $R^e_{k_S}$ in high shells. (d), (e), (f) $R^l_{k_S}$ rises strongly and is always larger than $R^e_{k_S}$ in high shells in CA-Hep (e) and Hamster (f), although in Email (d) there is no obvious rise.

FIG. S6: Link strength of the second innermost shell to each shell of the network. (a) Similar to the core, the second innermost shell is well connected to other parts of the network in Router (black squares), Email-contact (red circles) and AS (blue triangles). (b) The link ratio within the second innermost shell is lower than 0.6 in Email (black squares) and Hamster (blue triangles), but is still close to 1.0 in CA-Hep (red circles).
FIG. S7: Link entropy of the innermost core for the real networks and their randomized versions. (a) Link entropy of the innermost core for the real networks. (b) Link entropy of the innermost core for the degree-degree correlation preserving randomized networks. In all the randomized networks, the core entropy is above 0.5.

TABLE I: Properties of the real networks studied in this work. Structural properties include the number of nodes (N), number of edges (E), average degree ($\langle k \rangle$), maximum degree ($k_{max}$), degree heterogeneity ($H_k$), degree assortativity (r), clustering coefficient (C), maximum $k_S$ index ($k_{Smax}$), epidemic threshold ($\lambda_c$), and infection probability used in the SIR spreading in the main text (λ).
11-shell in Email, 31-shell in CA-Hep and 24-shell in Hamster, as shown in Figs.S9 and S10 in SI.

FIG. S1: The imprecision of $k_S$ and k as a function of p for nine real networks. p is the proportion of nodes considered, ranging from 0.02 to 0.2.