Abstract
The rich club organization (the presence of a highly connected hub core in a network) influences many structural and functional characteristics of networks, including topology, the efficiency of paths and the distribution of load. Despite its major role, the literature contains only a very limited set of models capable of generating networks with realistic rich club structure. One possible reason is that the rich club organization is a divisive property among complex networks, which exhibit great diversity in this respect, in contrast to other metrics (e.g. diameter, clustering or degree distribution), which seem to behave very similarly across many networks. Here we propose a simple yet powerful geometry-based growing model which can generate realistic complex networks with high rich club diversity by controlling a single geometric parameter. The growing model is validated against the Internet, protein-protein interaction, airport and power grid networks.
Introduction
The rich club organization plays a central role in the structure and function of networks^{1,2,3,4,5,6,7,8}. Some networks (e.g. the human brain^{7}, airport networks, social networks^{1} and the Internet^{8}) have a strong rich club, meaning that their hubs are densely connected to each other. Others (e.g. protein-protein interaction networks^{1} and the power grid^{9}) behave in the opposite way: the subgraphs formed by their hubs are very sparse. This high variation across networks is illustrated in Fig. 1, which shows the normalized rich club coefficient ρ(k)^{1} as a function of the degree k for the airport network, the Internet and the protein-protein interaction network. Explaining and reproducing this great rich club diversity is highly non-trivial. The state-of-the-art models targeting the rich club organization are based on heavy randomization techniques^{10,11,12,13}, which shuffle network connections until a given organization structure is artificially imitated. Although these randomization-based models are fairly usable, they do not give deeper insight into the mechanisms causing this diversity during the evolution of the networks. Consequently, growing models capable of reproducing networks with various rich club organizations in a simple and intuitive manner would help towards a deeper understanding of the evolutionary reasons underlying this diversity.
Here we propose a simple geometry-based growing model which can explain the emergence of the rich club variability in real networks by adjusting a single spatial parameter. Our model is built upon the real-world observation that in some networks the establishment of very long connections is not feasible. For example, in power grid networks, the electric current cannot be transferred efficiently (i.e. without huge losses of energy) over large distances without intermediate transformations at middle stations^{14, 15}. Similarly, optical networks apply signal regenerators for the transmission of light signals over large distances in order to sustain the signal-to-noise ratio^{16}. Also, in certain social networks, middlemen acting as intermediate nodes may play a crucial role in enhancing cooperation between individuals or groups^{17}. Such networks seem to implement an “artificial” threshold above which no direct connections are allowed. Other networks do not have such inherent thresholds, and the length of the edges is only limited by the “natural” geometric boundary of the network. For example, in airport networks we can find very long links, because transferring passengers over large distances is not an issue with current aviation technologies.
In this paper we condense these observations into a simple geometric growing model, in which we introduce a length threshold for creating edges. We show that such a growing model can naturally reproduce and account for the observed diversity in the rich-club organization of networks, while keeping other network statistics (diameter, degree distribution and clustering) intact. The applied geometric representation of networks is an active and quickly advancing research direction in network science^{18}. There are numerous studies describing networks as random geometric graphs, analyzing some functions^{19, 20} (e.g. navigation, information transmission) or structural properties (e.g. small-world, clustering, modularity)^{21, 22} of networks in a geometric context, and disclosing some fundamental relations between topology and hidden metric spaces^{23}. The proper choice of geometry (e.g. Euclidean, Bolyai-Lobachevskian hyperbolic geometry or another metric space) can also promote the interpretation of numerous network processes^{24,25,26}.
Results
In our model N nodes are randomly generated one after another on a Euclidean 2D R-disk (a disk of radius R) with uniformly distributed coordinates. When a new node is added, it selects the m closest nodes already residing on the disk (if there are fewer than m nodes on the disk, it selects all of them). The distances between the new node and the old ones are calculated as Euclidean distances normalized by a function of the degree of the old node (as in the Growing Homophilic model^{22}). If this “effective” distance between the new node and a selected one is smaller than the threshold T, then they are directly connected; otherwise a so-called “bridge” node is established at the midpoint of the two nodes and is connected to both of them. The formal description of the network generation process is given in panel (a) of Fig. 2, while panel (b) shows a small network generated with the model.
The time evolution of the model, as new nodes are inserted into the network at different stages, is shown in Fig. 2. For the sake of simplicity, the distance normalization by node degrees is omitted in this illustration. At the beginning of the generation process, many bridge nodes are inserted, as the distance between the nodes is typically larger than T (see panel (c) in Fig. 2). As the network grows, the average node density and the degrees increase, so the typical normalized distance between the nodes falls below T and no more bridge nodes are added (panel (d) in Fig. 2). From this stage on, the model falls back to the growing homophilic model analyzed in ref. 22. Setting T to a very large value (e.g. T > 2R) completely recovers the model in ref. 22, because bridge nodes are never inserted into the graph. We show that by varying T, the model generates complex networks with diverse rich-club organization, while having a scale-free degree distribution, small diameter and large clustering. In the remainder of the paper we use the settings summarized in Table 1 in our analytical and simulation results.
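The generation process described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the effective distance is taken as the Euclidean distance divided by the square root of the old node's degree (the form used later in the proof of Theorem 1), degree-zero nodes are treated as degree one to avoid division by zero, and bridge nodes are not counted among the N random nodes. The function names `sample_disk` and `grow_network` are hypothetical.

```python
import math
import random


def sample_disk(R, rng):
    """Draw a point uniformly from a disk of radius R centred at the origin."""
    r = R * math.sqrt(rng.random())      # sqrt gives uniform density by area
    phi = 2.0 * math.pi * rng.random()
    return (r * math.cos(phi), r * math.sin(phi))


def grow_network(N, m, R, T, seed=0):
    """Grow a network with N random nodes; return (positions, adjacency, bridges)."""
    rng = random.Random(seed)
    pos = []      # node index -> (x, y)
    adj = []      # node index -> set of neighbour indices
    bridges = []  # indices of inserted bridge nodes

    def add_node(p):
        pos.append(p)
        adj.append(set())
        return len(pos) - 1

    def link(u, v):
        adj[u].add(v)
        adj[v].add(u)

    for _ in range(N):
        p = sample_disk(R, rng)
        # Effective distance to every node already on the disk.
        # Assumption: a node with degree 0 is treated as having degree 1.
        candidates = []
        for v, q in enumerate(pos):
            k = max(len(adj[v]), 1)
            candidates.append((math.dist(p, q) / math.sqrt(k), v))
        candidates.sort()
        chosen = candidates[:m]          # all nodes if fewer than m exist
        new = add_node(p)
        for d_eff, v in chosen:
            if d_eff < T:
                link(new, v)             # short enough: connect directly
            else:
                # Too long: insert a bridge node at the Euclidean midpoint.
                mid = ((p[0] + pos[v][0]) / 2.0, (p[1] + pos[v][1]) / 2.0)
                b = add_node(mid)
                bridges.append(b)
                link(new, b)
                link(b, v)
    return pos, adj, bridges
```

For example, `grow_network(2000, 3, 50.0, 12.0)` would use a disk of radius 50 (the radius implied by the text's statement that T = 100 equals the disk diameter) with the threshold T = 12; the values of N and m here are illustrative only.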
Number of bridge nodes
First, we show that the total number of bridge nodes quickly converges during the generation of the graph to a value that is small relative to the network size (N) and independent of the graph size. To support this observation, we give a recursive estimate of the expected number of new bridge nodes generated at each step of the model and, based on this recursion, a mathematical expression for the limit of the expected total number of bridge nodes (see Methods for more details). By analyzing the recursion one can show that the expected number of bridge nodes at step N, denoted by \({b}_{N}\), can approximately be expressed in the form
where the functions \({f}_{1}\), \({f}_{2}\) and \({f}_{3}\) may depend on R, T and m but are independent of N. From this it immediately follows that for the total number of bridge nodes \({B}_{N}\)
where E is the exponential integral function. The term that vanishes during the convergence of \({B}_{N}\) is \({N}^{1+{f}_{2}}{E}_{{f}_{2}}({f}_{1}N)\), which is also approximately exponential.
In Fig. 3 the expected total number of bridge nodes (\({B}_{N}\)) calculated by recursion (9) is plotted for each iteration together with the simulation result for the same parameters. The two plots show that \({B}_{N}\) reaches a characteristic plateau after a certain number of iterations, which means that \({B}_{N}\) converges to a finite fixed value during the graph generation process. This also illustrates that for sufficiently large networks the total number of bridge nodes is negligible compared to the network size. Furthermore, according to statistical tests, the overall distribution of the nodes on the R-disk is apparently not affected by the existence of the bridge nodes and can still be treated as uniform.
Diameter, clustering and degree distribution
The diameter of all three generated networks (see Table 1) is around 9–10, similar to the real networks (Table 2). Figure 4 shows that the diameter of the T = 12 networks is approximately a logarithmic function of the network size, which confirms the small-world property. The generated networks also have high clustering coefficients, with values very close to those of real networks. Finally, Table 2 confirms that the clustering coefficient is insensitive to the threshold parameter. Now we show that the generated networks have a scale-free degree distribution independently of T.
Theorem 1 The networks produced by the model have a scale-free degree distribution with γ = 3 when N → ∞.
Proof: Suppose we compute the effective distance as \({d}_{eff}=\frac{{d}_{Euc}}{\sqrt{k}}\). At each insertion step the algorithm connects the new element to exactly m neighbors that globally minimize the normalized distance. To infer the degree distribution of the neighbor elements, we temporarily fix the distance to the (m + 1)th nearest neighbor, \({d}_{{\rm{eff}}}^{m+1}\), and randomly shuffle the positions of the m neighbor nodes under the condition that they all remain the m nearest neighbors with respect to the new element (i.e. their effective distance to the new element stays below \({d}_{{\rm{eff}}}^{m+1}\)).
For every possible value of the neighbor degree k, the possible element positions are bounded in the initial Euclidean space by a radius \({r}_{{\rm{Euc}}}={d}_{{\rm{eff}}}^{m+1}\sqrt{k}\). Since the nodes are distributed uniformly in the Euclidean space, the probability of having an element with degree k is proportional to the volume of the \({r}_{{\rm{Euc}}}\)-ball, which in two dimensions is \(\pi {r}_{{\rm{Euc}}}^{2}=\pi {({d}_{{\rm{eff}}}^{m+1})}^{2}k\). Thus, under fixed \({d}_{{\rm{eff}}}^{m+1}\), the overall probability of connecting to an element with degree k is proportional to k.
The probability inferred for a fixed value of \({d}_{{\rm{eff}}}^{m+1}\) depends neither on the value of \({d}_{{\rm{eff}}}^{m+1}\) nor on the positions of the nodes that are not among the closest neighbors of the inserted element. The result therefore holds for every possible arrangement of the elements in the Euclidean space, and the overall probability of connecting to a node is proportional to its degree k. This means that new nodes connect to the old ones with probability proportional to k, which is equivalent to the Barabási-Albert model^{27}, proved to produce scale-free networks with γ = 3.
Figure 5 shows the degree distributions of three networks generated with our model for various values of T. The plot confirms that the degree distributions are indeed scale-free with γ = 3, independently of T.
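The three statistics discussed in this subsection can be measured on any undirected graph with a short script. The sketch below assumes the graph is stored as a list of neighbour sets (node index to set of neighbours) and is connected; the function names are hypothetical, and the clustering coefficient is taken as the average of the local clustering coefficients, which may differ from the exact definition used for Table 2.

```python
from collections import Counter, deque


def eccentricity(adj, source):
    """Largest hop distance from source, found by breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())


def diameter(adj):
    """Longest shortest path; runs one BFS per node, assumes a connected graph."""
    return max(eccentricity(adj, s) for s in range(len(adj)))


def average_clustering(adj):
    """Mean local clustering coefficient; nodes with degree < 2 contribute 0."""
    total = 0.0
    for v in range(len(adj)):
        nbrs = sorted(adj[v])
        k = len(nbrs)
        if k < 2:
            continue
        # Count links between pairs of neighbours of v.
        links = sum(1 for i, a in enumerate(nbrs) for b in nbrs[i + 1:] if b in adj[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def degree_histogram(adj):
    """Map each degree value to the number of nodes having that degree."""
    return Counter(len(a) for a in adj)
```

Plotting `degree_histogram` on doubly logarithmic axes is one simple way to inspect whether the tail is compatible with a power law; fitting the exponent reliably would need more care than this sketch provides.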
Richclub coefficient
Although the insertion of bridge nodes keeps the degree distribution, clustering and diameter intact, the simulation results plotted in Fig. 6 clearly show that the graphs generated by the model differ greatly in their rich-club organization depending on T. When T is set to the diameter of the R-disk (T = 100, red triangles in Fig. 6), the model does not limit the lengths of the edges artificially, so the only limiting factor is the natural geometry of the disk itself. In this case we obtain a network with a strong rich-club, similar to the airport network. Conversely, when T is set to 12, the model creates only edges having \({d}_{{\rm{eff}}}\) < 12. This is a strong “artificial” limitation on the edge lengths imposed by the generation process. As a result, the model yields a network with no rich-club (T = 12, blue squares in Fig. 6), like the PPI network. We note the appealing similarity between Fig. 6 and Fig. 1, the latter showing the rich-club diversity in real networks.
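For readers who wish to reproduce curves of this kind, the rich-club coefficient can be computed as in the following sketch. It uses the standard definition φ(k) = 2E_{>k} / (N_{>k}(N_{>k} − 1)), where N_{>k} is the number of nodes with degree larger than k and E_{>k} the number of edges among them, and normalizes it by the same quantity measured on a degree-preserving randomization, following the idea of ref. 1. The graph representation (list of neighbour sets), the function names and the use of a single randomized copy are assumptions of this sketch, not details taken from the paper.

```python
import random


def rich_club(adj):
    """Return {k: phi(k)} for every degree k with at least two nodes above it."""
    deg = [len(a) for a in adj]
    phi = {}
    for k in sorted(set(deg)):
        rich = {v for v, d in enumerate(deg) if d > k}
        n = len(rich)
        if n < 2:
            continue  # phi(k) is undefined for fewer than two nodes
        # Each edge inside the rich set is seen from both endpoints.
        e = sum(1 for u in rich for v in adj[u] if v in rich) // 2
        phi[k] = 2.0 * e / (n * (n - 1))
    return phi


def rewire(adj, swaps, seed=0):
    """Degree-preserving randomization by attempted double-edge swaps."""
    rng = random.Random(seed)
    adj = [set(a) for a in adj]  # work on a copy
    edges = [(u, v) for u in range(len(adj)) for v in adj[u] if u < v]
    for _ in range(swaps):
        i = rng.randrange(len(edges))
        j = rng.randrange(len(edges))
        (a, b), (c, d) = edges[i], edges[j]
        # Replace a-b, c-d by a-d, c-b unless that creates loops or duplicates.
        if len({a, b, c, d}) < 4 or d in adj[a] or b in adj[c]:
            continue
        adj[a].remove(b); adj[b].remove(a)
        adj[c].remove(d); adj[d].remove(c)
        adj[a].add(d); adj[d].add(a)
        adj[c].add(b); adj[b].add(c)
        edges[i], edges[j] = (a, d), (c, b)
    return adj


def normalized_rich_club(adj, swaps_per_edge=10, seed=0):
    """rho(k) = phi(k) / phi_rand(k), using one randomized copy of the graph."""
    n_edges = sum(len(a) for a in adj) // 2
    phi = rich_club(adj)
    phi_rand = rich_club(rewire(adj, swaps_per_edge * n_edges, seed))
    return {k: phi[k] / phi_rand[k] for k in phi if phi_rand.get(k, 0.0) > 0.0}
```

Averaging φ_rand(k) over several randomized copies would give a smoother normalization than the single copy used here.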
Discussion
An intriguing question is whether our model captures something fundamental about the growth processes of real networks, or exhibits similar rich-club diversity simply by chance. To answer this question, we have plotted in Fig. 7 the CCDFs (complementary cumulative distribution functions) of the normalized edge lengths in a rich-club network (airports with flights in the US) and a non-rich-club network (the North American power grid), together with those of the networks generated with our model. Panel (a) shows a continuous and significant decrease (on all length scales) of the edge length distributions before the final “natural” cutoff for the airport and the T = 100 networks, caused by the geometry of the continent and of the R-disk, respectively. In panel (b), however, we can observe a clearly visible plateau before the cutoff of the edge lengths in the power grid network. This means that edge lengths are much denser near the cutoff, which in this case is rather “artificial” and is caused by the growth process of the network and not by the underlying geometry. Our model produces a very similar edge length distribution for the setting T = 12. These results hint that networks with no rich-club are subject to a limitation on the length of connections very similar to the one in our model. As a consequence, this length-limiting phenomenon can also account for the emergence of the observed diverse rich-club organization in real networks.
These two examples also underline that our method is parsimonious in the sense that the rich club organization can be tuned by only a single geometric threshold parameter in a growing homophilic model. We think the results presented in this paper are strong indications that the rich club diversity can be placed in a growing/evolutionary perspective, and that they provide deeper insight into the mechanisms resulting in certain rich club behavior during the growth of networks.
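An empirical CCDF of edge lengths, of the kind compared in this paragraph, can be computed as sketched below. The sketch assumes node positions are given as (x, y) pairs and the graph as a list of neighbour sets, defines the CCDF at x as the fraction of edges strictly longer than x, and, as an assumption about what "normalized" means here, optionally divides all lengths by the largest one. The function names are hypothetical.

```python
import math


def edge_lengths(pos, adj):
    """Euclidean length of every undirected edge, each edge counted once."""
    return [math.dist(pos[u], pos[v])
            for u in range(len(adj)) for v in adj[u] if u < v]


def ccdf(values, normalize=False):
    """Return sorted (x, P(X > x)) pairs for the distinct values of the sample."""
    xs = sorted(values)
    n = len(xs)
    # Assumption: normalization means dividing by the largest observed value.
    scale = xs[-1] if normalize else 1.0
    points = []
    for i, x in enumerate(xs):
        if i + 1 < n and xs[i + 1] == x:
            continue  # for ties, keep only the last occurrence
        points.append((x / scale, (n - i - 1) / n))
    return points
```

A plateau in the resulting curve just before the largest length indicates that many edges have lengths close to the cutoff, which is the signature discussed for the power grid and the T = 12 network.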
Methods
Recursive estimation of the number of bridge nodes
Let A(r, T, R) be the area of the intersection of a disk of radius T centered at radial coordinate r and the R-disk, and let p(r, T, R) be the ratio of A(r, T, R) to the area of the R-disk, i.e. \(p(r,T,R)=\frac{A(r,T,R)}{{R}^{2}\pi }\). Further, let us assume that there are already j nodes in the network. The (j + 1)th randomly generated node will connect to its m nearest neighbors. To calculate the number of necessary bridge nodes in this step, the task is to determine which of the m nearest neighbors are farther than T. To ease the computation, the degrees of the neighbors are replaced by their expected value (denoted by \({\overline{k}}_{j}\) and to be determined later). Since the effective distance is computed as the Euclidean distance divided by \(\sqrt{{\overline{k}}_{j}}\), it is approximately equivalent to investigate the expected number of points among the m nearest ones that lie outside the vicinity of radius \(T\sqrt{{\overline{k}}_{j}}\) around the (j + 1)th node. This will be equal to the expected number of newly inserted bridge nodes at this step. Denote the radial coordinate of the (j + 1)th node by r and assume that the previously generated random points and established bridge nodes are still evenly distributed on the R-disk. With this, each of the j nodes falls into the vicinity independently with probability \(p=p(r,T\sqrt{{\overline{k}}_{j}},R)\), so the probability that i (0 ≤ i ≤ j) nodes among the j ones are closer to the (j + 1)th node than \(T\sqrt{{\overline{k}}_{j}}\) is the binomial probability
\[\binom{j}{i}{p}^{i}{(1-p)}^{j-i}\]
and hence the expected number of necessary bridge nodes at this step (for j ≥ m) is
\[\sum _{i=0}^{m-1}(m-i)\binom{j}{i}{p}^{i}{(1-p)}^{j-i}.\]
Note that this is still a conditional expectation, which is to be deconditioned using the density of the radial coordinate r. Towards the deconditioning, first the function A(r, T, R) is to be determined. Clearly, A(r, T, R) = \({T}^{2}\pi \) if r ≤ R − T, i.e. the disk of radius T lies entirely inside the R-disk and the two boundary circles do not intersect. Otherwise, if r ≥ R − T, then by using straightforward geometrical calculations
where
Now, the deconditioning is possible with \(p(r,T,R)=A(r,T,R)/({R}^{2}\pi )\) and the density of the radial coordinate r, which is \(\frac{2r}{{R}^{2}}\). Further, let \(j(N)=N+{b}_{1}+{b}_{2}+\ldots +{b}_{N}\), where N is the number of randomly generated points and \({b}_{l}\), l = 1, …, N, is the expected number of bridge nodes established after the lth random node. To complete the recursive estimation, the expected degree should also be expressed upon the generation of the lth random node. This is
else
For l = 1 let \({b}_{1}=0\), \({\overline{k}}_{1}=0\) and let \({B}_{N}={\sum }_{i=1}^{N}{b}_{i}\). The main recursion can now be expressed as
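The per-step quantities of this section can also be evaluated numerically. The following sketch is an illustration under stated assumptions rather than the authors' recursion: it uses the standard closed form for the area of intersection of two circles (which may be written differently from the expression in the original derivation), models the number of nodes inside the vicinity as a binomial variable as implied by the uniformity assumption, treats the case of fewer than m existing nodes by counting only the nodes actually selected, and takes the expected degree as an input because its update rule is not reproduced here. All function names are hypothetical.

```python
import math


def intersection_area(r, T, R):
    """Area shared by a disk of radius T centred at distance r from the
    origin and a disk of radius R centred at the origin (0 <= r <= R)."""
    if T >= r + R:              # the R-disk is completely covered
        return math.pi * R * R
    if r + T <= R:              # the T-disk lies entirely inside the R-disk
        return math.pi * T * T
    # Standard lens formula for two properly intersecting circles.
    a = max(-1.0, min(1.0, (r * r + T * T - R * R) / (2.0 * r * T)))
    b = max(-1.0, min(1.0, (r * r + R * R - T * T) / (2.0 * r * R)))
    s = (-r + T + R) * (r + T - R) * (r - T + R) * (r + T + R)
    return T * T * math.acos(a) + R * R * math.acos(b) - 0.5 * math.sqrt(max(s, 0.0))


def expected_bridges(j, m, T, R, kbar, steps=400):
    """Expected number of bridge nodes created by the next random node when
    j nodes already exist and their expected degree is kbar."""
    radius = T * math.sqrt(kbar)     # Euclidean radius matching d_eff = T
    selected = min(m, j)             # assumption: only existing nodes are chosen
    dr = R / steps
    total = 0.0
    for s in range(steps):
        r = (s + 0.5) * dr           # midpoint rule over the radial coordinate
        p = intersection_area(r, radius, R) / (math.pi * R * R)
        # Conditional expectation given r: nodes among the selected ones
        # that fall outside the vicinity.
        cond = sum((selected - i) * math.comb(j, i) * p ** i * (1.0 - p) ** (j - i)
                   for i in range(selected))
        total += cond * (2.0 * r / (R * R)) * dr   # weight by the density 2r/R^2
    return total
```

Calling `expected_bridges` repeatedly while updating j and the expected degree would mimic the structure of the recursion; the degree update itself has to be supplied from the original equations.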
Data Availability
The data that support the findings of this study are available from public data repositories. In particular, the topology of the AS-level Internet has been downloaded from CAIDA (Center for Applied Internet Data Analysis, www.caida.org). We have downloaded the airport network from the OpenFlights database (www.openflights.org). We used the DIP^{28} database as the source for the protein-protein interaction network of S. cerevisiae. Finally, the map of the North American power grid has been downloaded from ref. 29.
References
1. Colizza, V., Flammini, A., Serrano, M. A. & Vespignani, A. Detecting rich-club ordering in complex networks. Nature physics 2, 110–115, doi:10.1038/nphys209 (2006).
2. Park, H.-J. & Friston, K. Structural and functional brain networks: from connections to cognition. Science 342, 1238411–1238411, doi:10.1126/science.1238411 (2013).
3. Vaquero, L. M. & Cebrian, M. The rich club phenomenon in the classroom. Scientific reports 3, doi:10.1038/srep01174 (2013).
4. van den Heuvel, M. P. et al. Abnormal rich club organization and functional brain dynamics in schizophrenia. JAMA psychiatry 70, 783–792, doi:10.1001/jamapsychiatry.2013.1328 (2013).
5. Ball, G. et al. Rich-club organization of the newborn human brain. Proceedings of the National Academy of Sciences 111, 7456–7461, doi:10.1073/pnas.1324118111 (2014).
6. Harriger, L., Van Den Heuvel, M. P. & Sporns, O. Rich club organization of macaque cerebral cortex and its role in network communication. PloS one 7, e46497, doi:10.1371/journal.pone.0046497 (2012).
7. Van Den Heuvel, M. P. & Sporns, O. Rich-club organization of the human connectome. The Journal of neuroscience 31, 15775–15786, doi:10.1523/JNEUROSCI.3539-11.2011 (2011).
8. Zhou, S. & Mondragón, R. J. The rich-club phenomenon in the internet topology. IEEE Communications Letters 8, 180–182, doi:10.1109/LCOMM.2004.823426 (2004).
9. McAuley, J. J., da Fontoura Costa, L. & Caetano, T. S. Rich-club phenomenon across complex network hierarchies. Applied Physics Letters 91, 084103, doi:10.1063/1.2773951 (2007).
10. Mondragón, R. J. & Zhou, S. Random networks with given rich-club coefficient. The European Physical Journal B 85, 1–6, doi:10.1140/epjb/e2012-21026-3 (2012).
11. Mondragón, R. J. Network null-model based on maximal entropy and the rich-club. Journal of Complex Networks 2, 288–298, doi:10.1093/comnet/cnu006 (2014).
12. Ma, A. & Mondragón, R. J. Rich-cores in networks. PloS one 10, e0119678, doi:10.1371/journal.pone.0119678 (2015).
13. Xu, X.-K., Zhang, J. & Small, M. Rich-club connectivity dominates assortativity and transitivity of complex networks. Physical Review E 82, 046117, doi:10.1103/PhysRevE.82.046117 (2010).
14. Paris, L. et al. Present limits of very long distance transmission systems. Global Energy Network Institute (1984).
15. Simpson-Porco, J. W., Dörfler, F. & Bullo, F. Voltage collapse in complex power grids. Nature communications 7 (2016).
16. Giles, R. & Li, T. Optical amplifiers transform long-distance lightwave telecommunications. Proceedings of the IEEE 84, 870–883, doi:10.1109/5.503143 (1996).
17. Borondo, J., Borondo, F., Rodriguez-Sickert, C. & Hidalgo, C. A. To each according to its degree: The meritocracy and topocracy of embedded markets. Scientific reports 4, 3784, doi:10.1038/srep03784 (2014).
18. Cohen, R. & Havlin, S. Complex networks: structure, robustness and function (Cambridge University Press, 2010).
19. Kleinberg, J. M. Navigation in a small world. Nature 406, 845–845, doi:10.1038/35022643 (2000).
20. Gulyás, A., Bíró, J. J., Kőrösi, A., Rétvári, G. & Krioukov, D. Navigable networks as nash equilibria of navigation games. Nature communications 6 (2015).
21. Papadopoulos, F., Kitsak, M., Serrano, M. Á., Boguñá, M. & Krioukov, D. Popularity versus similarity in growing networks. Nature 489, 537–540, doi:10.1038/nature11459 (2012).
22. Malkov, Y. A. & Ponomarenko, A. Growing homophilic networks are natural navigable small worlds. Plos ONE e0158162 (2016).
23. Serrano, M. A., Krioukov, D. & Boguñá, M. Self-similarity of complex networks and hidden metric spaces. Physical review letters 100, 078701, doi:10.1103/PhysRevLett.100.078701 (2008).
24. Boguñá, M., Krioukov, D. & Claffy, K. C. Navigability of complex networks. Nature Physics 5, 74–80, doi:10.1038/nphys1130 (2009).
25. Krioukov, D., Papadopoulos, F., Kitsak, M., Vahdat, A. & Boguñá, M. Hyperbolic geometry of complex networks. Physical Review E 82, 036106, doi:10.1103/PhysRevE.82.036106 (2010).
26. Allard, A., Serrano, M. Á., García-Pérez, G. & Boguñá, M. The geometric nature of weights in real complex networks. Nature Communications 8, 14103, doi:10.1038/ncomms14103 (2017).
27. Barabási, A.-L. & Albert, R. Emergence of scaling in random networks. Science 286, 509–512, doi:10.1126/science.286.5439.509 (1999).
28. Xenarios, I. et al. DIP, the database of interacting proteins: a research tool for studying cellular networks of protein interactions. Nucleic acids research 30, 303–305, doi:10.1093/nar/30.1.303 (2002).
29. Wiegmans, B. Gridkit: European and north-american extracts, https://doi.org/10.5281/zenodo.47317 (2016).
Acknowledgements
The research work leading to these results was partially supported by the Hungarian Scientific Research Fund (grant No. OTKA 108947), HSNLab and Ericsson. Yury Malkov is grateful for the support from RFBR, according to the research project No. 163160104 mol_a_dk and by the Government of Russian Federation (agreement #14.Z50.31.0033 with the Institute of Applied Physics of RAS). A. Gulyás was supported by the Janos Bolyai Fellowship of the Hungarian Academy of Sciences.
Author information
Contributions
M.C. and A.G. have developed the model and contributed to the experiments. A.K., J.B., Z.H. and Y.M. contributed to the analysis and the numerical results. All authors reviewed the manuscript.
Ethics declarations
Competing Interests
The authors declare that they have no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Csigi, M., Kőrösi, A., Bíró, J. et al. Geometric explanation of the rich-club phenomenon in complex networks. Sci Rep 7, 1730 (2017). https://doi.org/10.1038/s41598-017-01824-y