Generalised thresholding of hidden variable network models with scale-free property

Balogh, Sámuel G.; Pollner, Péter; Palla, Gergely

doi:10.1038/s41598-019-47628-0

Download PDF

Article
Open access
Published: 02 August 2019

Generalised thresholding of hidden variable network models with scale-free property

Sámuel G. Balogh¹,
Péter Pollner² &
Gergely Palla²

Scientific Reports volume 9, Article number: 11273 (2019) Cite this article

1414 Accesses
1 Citations
1 Altmetric
Metrics details

Subjects

Abstract

The hidden variable formalism (based on the assumption of some intrinsic node parameters) turned out to be a remarkably efficient and powerful approach in describing and analyzing the topology of complex networks. Owing to one of its most advantageous property - namely proven to be able to reproduce a wide range of different degree distribution forms - it has become a standard tool for generating networks having the scale-free property. One of the most intensively studied version of this model is based on a thresholding mechanism of the exponentially distributed hidden variables associated to the nodes (intrinsic vertex weights), which give rise to the emergence of a scale-free network where the degree distribution p(k) ~ k^−γ is decaying with an exponent of γ = 2. Here we propose a generalization and modification of this model by extending the set of connection probabilities and hidden variable distributions that lead to the aforementioned degree distribution, and analyze the conditions leading to the above behavior analytically. In addition, we propose a relaxation of the hard threshold in the connection probabilities, which opens up the possibility for obtaining sparse scale free networks with arbitrary scaling exponent.

Spike sorting with Kilosort4

Article Open access 08 April 2024

Marius Pachitariu, Shashwat Sridhar, … Carsen Stringer

Tandem mass spectrum prediction for small molecules using graph transformers

Article 05 April 2024

Adamo Young, Hannes Röst & Bo Wang

Geometry-enhanced pretraining on interatomic potentials

Article 05 April 2024

Taoyong Cui, Chenyu Tang, … Wanli Ouyang

Introduction

Network theory provides an ubiquitous and sophisticated approach for the characterization of complex systems, possibly composed of many interacting units^1,2. One of the most widely studied features of complex networks is given by the scale-free (SF) property, manifesting in the strong inhomogeneity of the degree distribution, p(k), accompanied by a power-law like decay of p(k) in the large degree regime^3,4,5,6,7. On the modelling ground, several growing mechanisms have been proposed for generating networks without a characteristic degree scale, including the fundamental concept of the Barabási-Albert model together with its modifications and generalizations^3,8,9. Meanwhile it became also evident that not all networks emerge from a growing mechanism^10,11,12, and there are numerous examples where new connections can easily occur between already existing nodes in the system^13,14. In some cases we can also reasonably assume that the rewiring of the network leads towards a more optimal configuration^15,16,17, and that the propensity for creating links is encoded in each node as an intrinsic parameter. Assumptions of this type naturally gave rise to the development of the hidden variable formalism.

Inspired by the nature of protein interaction networks, the hidden variable model was originally introduced by Caldarelli et al. for explaining the emergence of non-growing scale-free networks, where link creation might be related to some intrinsic features of the nodes¹⁰. In a following work, Boguña et al. proposed a systematic analytic framework for generally characterizing classes of random graphs generated with hidden variables¹⁸. Based on this framework, later on a general method was implemented, capable of producing SF networks with a tunable γ scaling exponent⁶. Since then, the applicability of the hidden variable formalism has been confirmed on a large scale^{19,20,21,22,23,24,25,26}, and due to its very general nature, a large variety of further network models ranging from stochastic block models^{27,28,29,30,31} through multifractal graph generators³² and evolving, fitness based network models combined with preferential attachment^33,34 to networks defined over hidden metric spaces^{35,36,37,38,39} can be alternatively interpreted as special forms of this approach.

Although many different variations of this model has been proposed, the basic idea of the concept is to first associate a parameter (hidden variable) to the nodes, usually drawn from a prescribed distribution, and then connect the pairs of nodes according to a probability given by a fixed linking function taking the node variables as arguments. A part of these models can be referred to as geographical, where besides the hidden variables, node also have coordinates in a d dimensional Euclidean space, and the connection function is depending on both the distance and the hidden variables⁴⁰. A very closely related class of models is where the coordinates are distributed in a hyperbolic space instead of a Euclidean one^36,38, which provide a very interesting direction for research on their own, due to that they can generate SF networks with a high clustering coefficient in a natural way and due to their relevance in routing problems³⁸.

A widely studied version of the hidden variable approach is where the linking function acts as a threshold, giving a connection probability 1 when the value of the two variables fulfill some criteria, and 0 otherwise. Non-geographical threshold models of this kind have gained considerable attention^{10,21,22,38,41,42,43}, later on being extended even to the geographical space⁴⁰. An interesting phenomenon observed in the non-geographical thresholded model is that in the infinitely large system size limit the degree distribution seems to be universally characterized by a ∝ k⁻² decay for various fitness distributions^21,22. In the present paper we extend the previously studied families of hidden variable distributions and linking functions that fall into this class, and study the mathematical conditions leading to this specific degree decay exponent analytically. In addition, we also introduce a simple and intuitive relaxation of the former ‘hard’ threshold functions, that allows the modification of the exponent according to numerical simulations.

The Hidden Variable Model

When generating a network with N nodes in this approach, first we need to assign a hidden variable {x_i | x_i ≥ 0 ∀i ∈ {1, …, N}} to each node i, where x_i is drawn from an arbitrary (but normalized) ρ(x) probability distribution. For simplicity, x_i is often referred to as the fitness of node i. After distributing these intrinsic parameters we also have to define a linking function 0 ≤ f(x, y) ≤ 1, based on which the connection probability between nodes i and j can be simply expressed as p(link between i and j) = f(x_i, x_j). Thus, in this model all of the information and the properties of the emerging network is completely encoded in the pre-defined form of ρ(x) and f(x, y). Following the continuous approximation introduced in^6,21, the expected degree of nodes with fitness x can be expressed as

$$k(x)=N{\int }_{0}^{\infty }\,f(x,y)\rho (y){\rm{d}}y$$

(1)

Assuming that k(x) is a monotonically increasing and invertible function of x, according to the rule of transformation of random variables⁶ the degree distribution can be written as

$$p(k)=\int \,\rho (x)\,\delta (k-k(x)){\rm{d}}x={\rho (x)\frac{{\rm{d}}x}{{\rm{d}}k(x)}|}_{\tilde{x}(k)},$$

(2)

where $\tilde{x}(k)$ denotes the inverse function of k(x).

The simplest choice for the connection function is f(x, y) = const., where the connection probability is uniform and independent from the hidden variables, leading to the emergence of an Erdős-Rényi random graph. A more interesting example was shown in¹⁰ with a linking function $f(x,y)=\frac{xy}{{x}_{M}^{2}}$ and ${x}_{M}={{\rm{\max }}}_{i}\{{x}_{i}|i=\mathrm{1,}\,\mathrm{...,}\,N\}$ where (apart from multiplicative constants) the degree distribution inherits the form of the fitness distribution $p(k) \sim \rho (\frac{{x}_{M}^{2}}{N\langle x\rangle }k)$. Thus, by choosing a fitness distribution of $\rho (x) \sim {x}^{-\gamma }$, the obtained network certainly displays the SF property with the same γ exponent. Later on it turned out that for a more general class of linking functions where f(x, y) can be decomposed into a product such as f(x, y) = g(x)g(y), the hidden variable formalism can be regarded as analytically well-treatable, and it is also able to reproduce fat-tailed degree distributions with arbitrary γ exponents⁶. The product form also implies that for randomly chosen links the degrees of the endpoints are un-correlated, thus, from the point of view of degree assortativity the obtained networks are neutral. Another surprising result in ref.¹⁰ is connected to the case where an exponential fitness distribution $\rho (x) \sim {{\rm{e}}}^{-x}$ is chosen together with an f(x, y) corresponding to a threshold function

$$f(x,y)={\rm{\Theta }}(x+y-{\rm{\Delta }}),$$

(3)

where Θ(x) refers to the Heaviside step function, and Δ is a constant with a logarithmic dependency on the system size N. Under these settings a power law decay of the degree distribution was detected with a γ = 2 scaling exponent, providing the first evidence for that SF networks can be generated in this approach even with non power-law like fitness distributions. As we already mentioned in the Introduction, in later studies it was observed that the inverse square decay of the degree distribution is actually a quite general feature, that holds for various other fitness distributions as well^21,22. Further interesting occurrences of the inverse square decay is briefly discussed in⁴⁴.

Generalized Classes of Non-Geographical Thresholded SF Hidden Variable Models with γ = 2

Model class definitions

In this section we introduce a broad set of hidden variable models where the degree decay exponent is equal to γ = 2. A common feature of these models is that they are thresholded in the sense that the linking function f(x, y) has a lower cut-off, controlled by a parameter Δ. The example in (3) is a special case of this, where f(x, y) immediately becomes 1 above the threshold. Here we use a much weaker assumption, namely that f(x, y) is 0 for a certain range of x and y values, and is non-zero (but not necessarily 1) outside this range. This is a far more general way of thresholding the connection probabilities, which allows a very broad range of connection functions to be used in the model, as shall be shown later. The linking functions having this property are denoted as f_Δ(x, y) throughout the paper. Here we introduce sub-classes of hidden variable models, the first one is to which we refer as exponential-like and where

ρ(x) can be written as
$$\rho (x)=H{\rm{^{\prime} }}(x)\,\exp [-\,H(x)],$$
(4)
where H(x) is a differentiable, monotonously increasing function and H′(x) denotes its derivative,
while the thresholded f_Δ(x, y) shows an additive dependency on H(x) and H(y),
$${f}_{{\rm{\Delta }}}(x,y)=\{\begin{array}{ll}0, & {\rm{if}}\,H(x)+H(y)\le {\rm{\Delta }},\\ \tilde{f}(H(x)+H(y))\in [0,1], & {\rm{if}}\,H(x)+H(y) > {\rm{\Delta }},\end{array}$$
(5)
where $\tilde{f}(x,y)$ is assumed to be a general function taking values in the [0, 1] interval. We refer to the second sub-class as power-like, where
ρ(x) can be written as
$$\rho (x)=G^{\prime} (x){G}^{-\alpha }(x),$$
(6)
where G(x) has the same properties as H(x) in (4),
while the thresholded f_Δ(x, y) shows a multiplicative dependency on G(x) and G(y),
$${f}_{{\rm{\Delta }}}(x,y)=\{\begin{array}{ll}\mathrm{0,} & {\rm{if}}\,G(x)G(y)\le {\rm{\Delta }},\\ \tilde{f}(G(x)G(y))\in [0,1] & {\rm{if}}\,G(x)G(y) > {\rm{\Delta }}.\end{array}$$
(7)
And last, if both additive and multiplicative dependency are present (mixed class):
ρ(x) can be written as
$$\rho (x)=\frac{M^{\prime} (x)}{{(1+M(x))}^{\alpha }},$$
(8)
where M(x) has the same properties as H(x) in (4),
while f_Δ(x, y) can be expressed as:

$${f}_{{\rm{\Delta }}}(x,y)=\{\begin{array}{cc}0, & {\rm{i}}{\rm{f}}\,(1+M(x))(1+M(y))\le {\rm{\Delta }},\\ \mathop{f}\limits^{ \sim }(1+M(x)+M(y)+M(x)M(y))\in [0,\,1] & {\rm{i}}{\rm{f}}\,(1+M(x))(1+M(y)) > {\rm{\Delta }}.\end{array}$$

(9)

These generalized sub-classes can be derived from two simple observations. First, it can be shown in general that when replacing the step-like function in (3) by an arbitrary f_Δ(x + y) having a lower cut-off at Δ, the degree distribution of the emerging networks is not affected. Second, the form of the hidden variable distributions and accompanying connection functions given in (4–9) are also in very close relation with the transformations of the (ρ, f) pair that leave the degree distribution of the generated network invariant. To see that, let us assume an arbitrary ρ(x) and f(x, y) yielding a network with a degree distribution of p(k). By transforming the hidden variables using a monotonous function H as x_i = H(z_i) and z_i = H⁻¹(x_i) for all nodes i, according to the rule of transforming random variables the density of the original variable x can be also written as ρ_x(x) = ρ_z(z)/H′(z) = ρ_z(H⁻¹(x))/H′(H⁻¹(x)). Based on that, the expected degree for nodes with variable x given in (2) can be also expressed as

$$k(x)=k(H(z))=N{\int }_{0}^{\infty }\,f(x,y){\rho }_{x}(y){\rm{d}}y=N{\int }_{0}^{\infty }\,f(H(z),H(z^{\prime} )){\rho }_{z}(z^{\prime} ){\rm{d}}z^{\prime} ,$$

(10)

where we have changed the integration variable y to z′ = H⁻¹(y). By combining the expression of (10) with the transformation rule of degrees given in (2), we obtain that a transformed model where ρ_z(z) = ρ_x(H(z))H′(z) and the linking function is given by f(H(x), H(y)), will essentially lead to the same degree distribution as the original model. This conservation law of the degree distribution is also closely related to a general transformation rule of the fitness values written as

$$p(k)={\frac{{\rm{d}}H(z)}{{\rm{d}}z}\rho (H(z))\frac{{\rm{d}}x}{{\rm{d}}{\int }_{0}^{\infty }f(H(z),H(z^{\prime} ))\frac{{\rm{d}}H(z^{\prime} )}{{\rm{d}}z^{\prime} }\rho (H(z^{\prime} )){\rm{d}}z^{\prime} }|}_{{\tilde{z}}_{H}(k)},$$

(11)

which is analogous to (2). The sub-classes of models we defined in this paper exploit this property, where the ‘original’ model is corresponding to the simple model introduced in¹⁰, following the inverse square decay law. Nevertheless, this invariance of the degree distribution under appropriate simultaneous transformation of ρ(x) and f(x, y) is valid for the hidden variable approach in general, also in the case of geographical models. We also note that any model in one of the above defined three sub-classes can generally be mapped into a model in the other sub-class via simple transformations between H(x), G(x) and M(x) given by

$$\begin{array}{ccc}H(x)=\alpha \,\mathrm{ln}\,G(\tilde{x}) & H(x)=\alpha \,\mathrm{ln}(1+M(\tilde{x})) & G(x)=1+M(\tilde{x}\mathrm{).}\end{array}$$

(12)

However, when the goal is to obtain a size independent SF degree distribution, the dependency of Δ = Δ(N) on the number of nodes N can be different in each class. In summary, the previous observations clarify in a simple manner why seemingly different realizations of (ρ, f) can give rise to the emergence of networks with the same degree distributions. In addition, we also gained simple rules for mapping the different realisations of (ρ, f) into each other. A remarkable consequence of the above is that for any ${\tilde{f}}_{{\rm{\Delta }}}$ showing either additive, multiplicative or mixed dependency on its arguments, we can now construct a fitness distribution for which it is guaranteed that the degree distribution of the emerging network will display the $p(k) \sim {k}^{-2}$ behaviour.

For illustration, a few examples from each sub-classes are listed in Table 1, all generating SF networks with a degree decay exponent γ = 2. In addition, in Fig. 1. we show the simulation results for the Weibull fitness distribution from the class of exponential-like distributions, where the corresponding linking function was chosen as

$${f}_{{\rm{\Delta }}}(x,y)=\{\begin{array}{ll}0 & {\rm{if}}\,{x}^{2}+{y}^{2}\le {\rm{\Delta }},\\ 1-\frac{\exp [a-({x}^{2}+{y}^{2})]}{({x}^{2}+{y}^{2})} & {\rm{if}}\,{x}^{2}+{y}^{2} > {\rm{\Delta }},\end{array}$$

(13)

where a is a constant and Δ is defined via the transcendent equation Δ = exp(a − Δ). The fact that the complementary cumulative distribution $F(k)={\int }_{k^{\prime} =k}^{\infty }\,p(k^{\prime} ){\rm{d}}k^{\prime} $ of the degrees behaves as k⁻¹ in the large degree regime in Fig. 1. is in full consistency with the inverse square decay law.

Table 1 Examples for fitness distributions with the accompanying thresholded fitness functions f_Δ(x, y) for each sub-class.

Full size table

The inverse square decay law

Here we show in details that for thresholded hidden variable models falling into the class described in the previous subsection, the degree distribution of the emerging SF network will always have a degree decay exponent of γ = 2. Let us assume that we are dealing with an exponential-like model, where the fitness distribution is given in (4), and the linking function follows (5). Starting from the expression for the average degree given in (2), and multiplying both sides by exp[−H(x)] we can write

$$k(x)\,\exp [-H(x)]=N{\int }_{{H}^{-1}({\rm{\Delta }}-H(x))}^{{\rm{\infty }}}\,\mathop{f}\limits^{ \sim }(H(x)+H(y)){H}^{{\rm{^{\prime} }}}(y)\,\exp [-H(x)-H(y)]{\rm{d}}y,$$

(14)

where we used that f_Δ(x, y) = 0 if H(x) + H(y) ≤ Δ. By a change of variable z = H(x) + H(y) we arrive to an equation where the right hand side is independent of x,

$$k(x)\,\exp [-\,H(x)]=N{\int }_{{\rm{\Delta }}}^{\infty }\,\tilde{f}(z){{\rm{e}}}^{-z}{\rm{d}}z.$$

(15)

Based on (15) we define the integral

$${L}_{{\rm{\Delta }}}(\tilde{f})\equiv {\int }_{{\rm{\Delta }}}^{\infty }\,\tilde{f}(z){{\rm{e}}}^{-z}{\rm{d}}z,$$

(16)

that depends on the form of the actually chosen $\tilde{f}(z)$ appearing in (5). Assuming that this integral exists, using (15–16) we can express the average degree of nodes with fitness x and the derivative of k(x) as

$$k(x)=N{L}_{{\rm{\Delta }}}(\tilde{f})\,\exp [H(x)],$$

(17)

$$\frac{{\rm{d}}k(x)}{{\rm{d}}x}=N{L}_{{\rm{\Delta }}}(\tilde{f})\,\exp [H(x)]H^{\prime} (x\mathrm{).}$$

(18)

In the thermodynamic limit of N → ∞, by substituting (17–18) into (2) we obtain

$$p(k)={\frac{H^{\prime} (x)\exp [-H(x)]}{N{L}_{{\rm{\Delta }}}(\tilde{f})H^{\prime} (x)\exp [H(x)]}|}_{{\tilde{x}}_{H}(k)}={\frac{N{L}_{{\rm{\Delta }}}(\tilde{f})}{{[N{L}_{{\rm{\Delta }}}(\tilde{f})\exp [H(x)]]}^{2}}|}_{{\tilde{x}}_{H}(k)}=\frac{N{L}_{{\rm{\Delta }}}(\tilde{f})}{{k}^{2}},$$

(19)

showing that p(k) is proportional to k⁻², since ${L}_{{\rm{\Delta }}}(\tilde{f})$ is a constant. Moreover, according to (19) we can formulate a very simple condition under which p(k) becomes independent of the system size in the form of

$${L}_{{\rm{\Delta }}}(\tilde{f})=L({\rm{\Delta }},\tilde{f})={\int }_{{\rm{\Delta }}}^{\infty }\,\tilde{f}(z){{\rm{e}}}^{-z}{\rm{d}}z=\frac{1}{N},$$

(20)

which is reducing to $L({\rm{\Delta }},{\rm{\Theta }})={{\rm{e}}}^{-{\rm{\Delta }}}=\frac{1}{N}$ in the case of the simple step-function like connection function given in (3), in complete agreement with the results in^6,18.

Similarly to the case of exponential-like distributions, if we choose the fitness distribution to be power-like as in (6), together with a thresholded connection function given in (7), the expected degree for a node having a hidden variable x can be written as

$$k(x)=N{\int }_{1}^{\infty }\,f(x,y)\rho (y){\rm{d}}y=N{\int }_{{G}^{-1}({\rm{\Delta }}/G(x))}^{\infty }\tilde{f}(G(x)G(y)){G}^{-\alpha }(y)G^{\prime} (y){\rm{d}}y.$$

(21)

By changing to the integration variable z = G(x)G(y) we obtain

$$k(x){G}^{-\alpha +1}(x)=N{\int }_{{\rm{\Delta }}}^{{\rm{\infty }}}\,\mathop{f}\limits^{ \sim }(G(x)G(y)){G}^{-\alpha }(x){G}^{-\alpha }(y){\rm{d}}[G(x)G(y)],$$

(22)

leading to

$$k(x)=N{G}^{\alpha -1}(x){\int }_{{\rm{\Delta }}}^{\infty }\,\tilde{f}(z){z}^{-\alpha }{\rm{d}}z,$$

(23)

where α > 1. The integral

$${K}_{\alpha }(\tilde{f})\equiv {\int }_{{\rm{\Delta }}}^{\infty }\,\tilde{f}(z){z}^{-\alpha }{\rm{d}}z$$

(24)

appearing in (23) depends only on the chosen form of $\tilde{f}(z)$, thus, by assuming that ${K}_{\alpha }(\tilde{f})$ exists we can express k(x) simply as

$$k(x)=N{K}_{\alpha }(\tilde{f}){G}^{\alpha -1}(x\mathrm{).}$$

(25)

By substituting this into the general formula for the degree distribution given in (2) we gain

$$p(k)={\rho (x)\frac{dx}{dk(x)}|}_{{\tilde{x}}_{G}(k)}={\frac{G^{\prime} (x){G}^{-\alpha }(x)}{N(\alpha -\mathrm{1)}{K}_{\alpha }(\tilde{f})G^{\prime} (x){G}^{\alpha -2}(x)}|}_{{\tilde{x}}_{G}(k)} \sim {\frac{1}{{G}^{\mathrm{2(}\alpha -\mathrm{1)}}(x)}|}_{{\tilde{x}}_{G}(k)} \sim {k}^{-2},$$

(26)

proving that the power-like sub-class also leads to the emergence of a SF network with γ = 2. However, in this case the condition for a size independent degree distribution with the $p(k) \sim {k}^{-2}$ property is given by a different formula compared to (20), written as

$${K}_{\alpha }(\tilde{f})\equiv {\int }_{{\rm{\Delta }}}^{\infty }\,\tilde{f}(z){z}^{-\alpha }{\rm{d}}z=\frac{1}{N}.$$

(27)

By following similar mathematical arguments as we did in the exponential- and power-like cases the same behaviour can be obtained for the third sub-class defined through (8–9). Nevertheless, the condition for a size independent SF degree distribution with γ = 2 has yet again, a different form from (20) or (27), given by

$${J}_{\alpha }(\tilde{f})\equiv {\int }_{{\rm{\Delta }}}^{\infty }\,\frac{\tilde{f}(z)}{{\mathrm{(1}+z)}^{\alpha }}{\rm{d}}z=\frac{1}{N}.$$

(28)

An important related remark is that for all fitness distributions and linking functions given in the forms of (4), (5) or (6), (7) or (8), (9) the emerging networks always display the inverse square decay law independent from the specific form of $\tilde{f}$, and thus, the appropriate form of $\tilde{f}$ only determines the condition for obtaining size-independent degree distribution.

Soft Thresholding with Tunable Degree Decay Exponent

The degree decay exponent of SF complex networks characterizing real systems is usually between γ = 2 and γ = 3^1,2. Motivated by that, we extend the models defined in the previous section to allow the emergence of hidden variable networks with a γ > 2 exponent as well in the same framework.

The basic idea is to relax the ‘hard’ threshold in the connection function, controlled by the variable Δ in (3) and (5). Let us start with the Heaviside step function given in (3), which we can intuitively replace by a ‘reversed’ Fermi-Dirac function

$$f(x,y)=f(x+y)=\frac{1}{1+{{\rm{e}}}^{-\beta (x+y-{\rm{\Delta }})}},$$

(29)

where β is a parameter taking positive values. Naturally, at β → ∞ we recover the original step-like connection function (3), whereas for finite β values we obtain a ‘soft’ threshold function. The form of f(x, y) in (29) is similar to that of the linking probability in temperature dependent graph ensembles^45,46. The interpretation of β in this respect is analogous to the inverse temperature, with β → ∞ corresponding to the zero temperature ‘ground’ state, while networks generated with finite β values can be interpreted as states at higher temperatures^36,45. Regarding the case where we can not assume the additive but rather the multiplicative dependency on x and y in (29), a simple approach to relax the step function is to define f(x, y) as

$$f(x,y)=f(xy)=\frac{1}{1+{(\frac{xy}{{\rm{\Delta }}})}^{-\beta }},$$

(30)

converging to f(x, y) = Θ(xy − Δ) in the β → ∞ limit.

In a similar fashion to (29), for the general exponential-like models defined in (4–5) we can replace (5) by

$${f}_{\beta ,{\rm{\Delta }}}(x,y)=\frac{1}{1+{{\rm{e}}}^{-\beta (H(x)+H(y)-{\rm{\Delta }})}},$$

(31)

and for the power-like models given in (6–7) we can change (7) to

$$f(x,y)={f}_{\beta ,{\rm{\Delta }}}(x,y)=\frac{1}{1+{(\frac{{G}^{\alpha }(x){G}^{\alpha }(y)}{{\rm{\Delta }}})}^{-\beta }}\mathrm{.}$$

(32)

Similar form of connection function can be established for the third sub-class given by (8–9) based on (12). For all β dependent connection functions defined above, at β → 0 we obtain a linking kernel that becomes independent of the hidden variables, and thus, the generated network is an Erdös-Rényi random graph. However, in the opposite case, when β → ∞, the connection functions converge to the original ‘hard’ thresholded forms, and the generated networks are scale-free with a decay exponent of γ = 2. Therefore, by tuning the parameter β from 0 to ∞ we can scan through a series of networks starting from the classical random graph, and arriving to a SF network obeying the inverse square decay law at the other end of the spectrum. Presumably, during this transition we may find a finite β range in which the degree distribution is already power law like instead of the Poisson distribution, but the decay exponent γ has not yet reached the γ = 2 limit value.

Our simulation results shown in Fig. 2. provide a strong support for the assumption above. In the four panels we display the complementary cumulative degree distribution for an exponential-like model with $\rho (x)=3{x}^{2}{{\rm{e}}}^{-{x}^{3}}$ at different β parameters.

Characterizing the γ(β) transition

In this section we aim to characterize the main features of the above mentioned transition of γ as a function of the effective temperature for the exponential case. In order to do so, we first define the general β dependent integral for k(x) and perform a transformation of variables (z = H(x) + H(y) − Δ):

$$k(x)=N{\int }_{0}^{\infty }\,\frac{H^{\prime} (y){{\rm{e}}}^{-H(y)}{\rm{d}}y}{1+{{\rm{e}}}^{-\beta (H(x)+H(y)-{\rm{\Delta }})}}=N{{\rm{e}}}^{H(x)-{\rm{\Delta }}}{\int }_{H(x)-{\rm{\Delta }}}^{\infty }\,\frac{{{\rm{e}}}^{-z}}{1+{{\rm{e}}}^{-\beta z}}{\rm{d}}z=N{{\rm{e}}}^{H(x)-{\rm{\Delta }}}{F}_{\beta ,{\rm{\Delta }}}(x)$$

(33)

offering a non-invertible form for the degree variable in general. However, approximate results can still be obtained possibly with logarithmic or sub-power corrections. If $H(x)\ll {\rm{\Delta }}$ and β ∈ (0, 1), the main contribution to the integral comes from the $z\ll 0$ range, where the value of the denominator is large. Based on that, the above expression can be approximated by

$$k(x)=N{{\rm{e}}}^{H(x)-{\rm{\Delta }}}{\int }_{H(x)-{\rm{\Delta }}}^{\infty }\,\frac{{{\rm{e}}}^{-z}}{1+{{\rm{e}}}^{-\beta z}}{\rm{d}}z\approx N{{\rm{e}}}^{H(x)-{\rm{\Delta }}}{\int }_{H(x)-{\rm{\Delta }}}^{\infty }\,{{\rm{e}}}^{(\beta -\mathrm{1)}z}{\rm{d}}z=\frac{N}{1-\beta }{{\rm{e}}}^{\beta (H(x)-{\rm{\Delta }})}\mathrm{.}$$

(34)

Even though this approximation tends to be less and less accurate as the value of β converges to zero, its advantage is that it provides an analytic and invertible expression for the degree distribution written as

$${p}_{\beta }(k)\sim {k}^{-(1+\frac{1}{\beta })}+{\rm{s}}{\rm{u}}{\rm{b}}{\rm{p}}{\rm{o}}{\rm{w}}{\rm{e}}{\rm{r}}\,{\rm{c}}{\rm{o}}{\rm{r}}{\rm{r}}{\rm{e}}{\rm{c}}{\rm{t}}{\rm{i}}{\rm{o}}{\rm{n}}{\rm{s}}.$$

(35)

We note that similar forms have already been established in refs^36,45, however, not for the reversed Fermi-Dirac function. In addition to that, based on (34) we can also formulate an approximate condition for having a size independent degree distribution given by

$${{\rm{\Delta }}}_{\beta }(N)\simeq \frac{1}{\beta }\,\mathrm{ln}\,N.$$

(36)

For β > 1 the integrand appearing in (33) becomes negligible for z = H(x) − Δ < 0, hence the lower bound of the integral can approximately be replaced by zero as

$$k(x)\approx N{{\rm{e}}}^{H(x)-{\rm{\Delta }}}{\int }_{0}^{\infty }\,\frac{{{\rm{e}}}^{-z}}{1+{{\rm{e}}}^{-\beta z}}{\rm{d}}z,$$

(37)

which along similar arguments is yielding

$$\begin{array}{ccc}{p}_{\beta }(k)\sim {k}^{-2}+{\rm{s}}{\rm{u}}{\rm{b}}{\rm{p}}{\rm{o}}{\rm{w}}{\rm{e}}{\rm{r}}\,{\rm{c}}{\rm{o}}{\rm{r}}{\rm{r}}{\rm{e}}{\rm{c}}{\rm{t}}{\rm{i}}{\rm{o}}{\rm{n}}{\rm{s}} & {\rm{a}}{\rm{n}}{\rm{d}} & {{\rm{\Delta }}}_{\beta }(N)\simeq \,{\rm{l}}{\rm{n}}\,N.\end{array}$$

(38)

The results in (38) become exact in the limit of β → ∞. Analogously, the above analysis applies to the power sub-class as well, where the average degree as a function of fitness is written as

$$k(x)=N{\int }_{1}^{\infty }\,\frac{G^{\prime} (y){G}^{-\alpha }(y)}{1+{(\frac{{G}^{\alpha }(x){G}^{\alpha }(y)}{{\rm{\Delta }}})}^{-\beta }}{\rm{d}}y=N{G}^{\alpha -1}(x){{\rm{\Delta }}}^{-\alpha +1}{\int }_{G(x)/{\rm{\Delta }}}^{\infty }\,\frac{{z}^{-\alpha }}{1+{z}^{-\alpha \beta }}{\rm{d}}z,$$

(39)

with $z=\frac{G(x)G(y)}{\Delta }$. Despite the same behaviour of the degree distribution, the formula above suggests that the condition for obtaining a size independent degree distribution requires Δ^αβ to be proportional to N (for β ∈ (0, 1)). Furthermore, it also implies that the accurate characterization of the γ(β) transition for the three sub-classes requires similar considerations.

Our simulation results together with the approximation discussed above are shown in Fig. 3., depicting the transition of the scaling exponent as a function of the effective temperature 1/β. We kept control over the average degree of the generated networks by relying on (36) and (38), ensuring the size independence of the degree distribution.

Discussion

We have revisited the inverse square decay law of non-geographical thresholded hidden variable models, which was already studied in refs^10,21,22 from different perspectives. We provided a far more general framework for thresholding linking mechanisms, where the form of the connection function f(x, y) is not restricted to the usual Heaviside function, but instead can correspond to any general function with values in the [0,1] interval, as long as f(x, y) = 0 for a certain range of x and y values, and is non-zero outside this range. According to our results, this considerably weaker assumption on the form of the thresholded f(x, y) allows a very broad range of connection functions, that combined with properly chosen fitness distributions ρ(x) result in the inverse square decay law, similarly to the models discussed in ref.¹⁰. Along this line we provided three general sub-classes of hidden variable distributions and accompanying connection functions (i.e., the exponential, the power-like and the mixed class) that all generate SF networks with a degree decay exponent of γ = 2, and we also discussed how these different sub-classes are interrelated to each other.

Despite the invariance of the degree distribution obtained by using different f(x, y), the generated networks might show different properties at the level of local network quantities such as degree correlations or the clustering coefficient. For illustration in Fig. 4. we provide simulation results for the clustering coefficient as a function of k, when replacing the Heaviside step function with other possible forms of f_Δ(x + y). According to Fig. 4a, the average clustering coefficient $\bar{C}$ is more or less constant below a characteristic degree and is decaying as a power law above this characteristic degree for the Heaviside step function (red symbols), and also for a connection function f_Δ(x + y) converging to the Heaviside step function in an exponential manner (green symbols). In contrast, for connection functions having a peak at x + y = z = Δ(N) with a decaying tail for larger z values (green and yellow symbols), we can observe a peak in $\bar{C}(k)$, together with a faster (but still power law like) decay in the large degree regime.

We also proposed a relaxation of the ‘hard’ threshold for each sub-class imposed by the Δ controlled boundary in the connection functions. The basic idea was to use a ‘reversed’ Fermi-Dirac function providing a sigmoid transition between low and high linking probability values, where the sharpness of the transition (or in other words, the width of the intermediate linking probability values) is controlled by a parameter β. Based on analogy with temperature dependent graph ensembles⁴⁵, β can be interpreted as a sort of inverse temperature, where the original ‘hard’ thresholded models are recovered in the zero temperature limit of β → ∞. The great advantage of the higher temperature models is that according to numerical simulations, the degree decay exponent becomes larger than γ = 2, and by changing β, it can be tuned to any preferred value in the range of typical γ values measured in real systems. We also generally discussed the criteria in multiple different cases of how to generate networks having degree distribution independent of the size. Hence, the models with the relaxed threshold at finite β values offer a flexible fitness-based approach being adjustable to complicated fitness distributions for generating sparse SF networks with realistic degree decay exponent.

In conclusion, our analysis showed that linking kernels with a general lower-cutoff and having either additive, multiplicative or mixed dependence on their arguments can always generate SF networks together with the appropriately chosen fitness distributions. A further remarkable consequence of the above is that a general mapping can be established between different ρ fitness distributions and possible f linking functions. I.e., for any fitness distribution ρ^* in general there exists a family of thresholded linking functions ${f}_{{\rm{\Delta }}}^{\ast }$ that together give rise to scale-free networks with a γ = 2, and vice versa, for any thresholded linking function ${f}_{{\rm{\Delta }}}^{\ast }$ we can find the corresponding fitness distribution ρ^* together which they display the same property. This might provide an alternative way of understanding how those fitness/activity driven systems exhibit SF behaviour where the distributions of the hidden variables follow non-trivial, complicated forms.

References

Albert, R. & Barabási, A.-L. Statistical mechanics of complex networks. Rev. Mod. Phys. 74, 47–97 (2002).
Article ADS MathSciNet Google Scholar
Mendes, J. F. F. & Dorogovtsev, S. N. Evolution of Networks: From Biological Nets to the Internet and WWW (Oxford Univ. Press, Oxford, 2003).
Barabási, A.-L. & Albert, R. Emergence of scaling in random networks. Sci. 286, 509–512 (1999).
Article ADS MathSciNet Google Scholar
Newman, M. E. J. Scientific collaboration networks. i. network construction and fundamental results. Phys. Rev. E 64, 016131 (2001).
Article ADS CAS Google Scholar
He, B. J. Scale-free brain activity: past, present, and future. Trends Cogn. Sci. 18, 480–487 (2014).
Article Google Scholar
Servedio, V. D. P., Caldarelli, G. & Buttà, P. Vertex intrinsic fitness: How to produce arbitrary scale-free networks. Phys. Rev. E 70, 056126 (2004).
Article ADS Google Scholar
Albert, R. Scale-free networks in cell biology. J. Cell Sci. 118, 4947–4957 (2005).
Article CAS Google Scholar
Bianconi, G. & Barabási, A.-L. Competition and multiscaling in evolving networks. Eur. Lett. 54, 436 (2001).
Article ADS CAS Google Scholar
Dorogovtsev, S., Mendes, J. F. & Samukhin, N. A. Structure of growing networks with preferential linking. Phys. Rev. Lett. 85, 4633–6 (2000).
Article ADS CAS Google Scholar
Caldarelli, G., Capocci, A., De Los Rios, P. & Muñoz, M. A. Scale-free networks from varying vertex intrinsic fitness. Phys. Rev. Lett. 89, 258702 (2002).
Article ADS CAS Google Scholar
Timár, G., Dorogovtsev, S. N. & Mendes, J. F. F. Scale-free networks with exponent one. Phys. Rev. E 94, 022302 (2016).
Article ADS Google Scholar
Seyed-allaei, H., Bianconi, G. & Marsili, M. Scale-free networks with an exponent less than two. Phys. Rev. E 73, 046113 (2006).
Article ADS Google Scholar
Watts, D. J. & Strogatz, S. H. Collective dynamics of’small-world’. networks. Nat. 393, 440–442 (1998).
CAS MATH Google Scholar
Bedogne’, C. & Rodgers, G. J. Complex growing networks with intrinsic vertex fitness. Phys. Rev. E 74, 046115 (2006).
Article ADS Google Scholar
Guimerà, R., Díaz-Guilera, A., Vega-Redondo, F., Cabrales, A. & Arenas, A. Optimal network topologies for local search with congestion. Phys. Rev. Lett. 89, 248701 (2002).
Article ADS Google Scholar
Palla, G., Derényi, I., Farkas, I. & Vicsek, T. Statistical mechanics of topological phase transitions in networks. Phys. Rev. E 69, 046117 (2004).
Article ADS MathSciNet Google Scholar
Kim, J., Kim, I., Han, S. K., Bowie, J. U. & Kim, S. Network rewiring is an important mechanism of gene essentiality change. Sci. Reports 2, 900 (2012).
Article ADS Google Scholar
Boguñá, M. & Pastor-Satorras, R. Class of correlated random networks with hidden variables. Phys. Rev. E 68, 036112 (2003).
Article ADS Google Scholar
Masuda, N. & Konno, N. Vip-club phenomenon: Emergence of elites and masterminds in social networks. Soc. Networks 28, 297–309 (2006).
Article ADS Google Scholar
van der Hofstad, R., Janssen, A. J. E. M., van Leeuwaarden, J. S. H. & Stegehuis, C. Local clustering in scale-free networks with hidden variables. Phys. Rev. E 95, 022307 (2017).
Article ADS Google Scholar
Masuda, N., Miwa, H. & Konno, N. Analysis of scale-free networks based on a threshold graph with intrinsic vertex weights. Phys. Rev. E 70, 036124 (2004).
Article ADS MathSciNet Google Scholar
Fujihara, A., Uchida, M. & Miwa, H. Universal power laws in the threshold network model: A theoretical analysis based on extreme value theory. Phys. A: Stat. Mech. its Appl. 389, 1124–1130 (2010).
Article ADS CAS Google Scholar
Petermann, T. & Rios, D. L. P. Exploration of scale-free networks. The Eur. Phys. J. B 38, 201–204 (2004).
Article ADS CAS Google Scholar
Boguñá, M., Pastor-Satorras, R., Díaz-Guilera, A. & Arenas, A. Models of social networks based on social distance attachment. Phys. Rev. E 70, 056122 (2004).
Article ADS Google Scholar
Starnini, M. & Pastor-Satorras, R. Topological properties of a time-integrated activity-driven network. Phys. Rev. E 87, 062807 (2013).
Article ADS Google Scholar
Garlaschelli, D., Capocci, A. & Caldarelli, G. Self-organized network evolution coupled to extremal dynamics. Nat. Phys 3 (2006).
Holland, P. W., Laskey, K. B. & Leinhardt, S. Stochastic blockmodels: First steps. Soc. Networks 5, 109–137 (1983).
Article MathSciNet Google Scholar
Fienberg, S. E., Meyer, M. M. & Wasserman, S. S. Statistical analysis of multiple sociometric relations. J. Am. Stat. Assoc. 80, 51–67 (1985).
Article Google Scholar
Decelle, A., Krzakala, F., Moore, C. & Zdeborová, L. Inference and phase transitions in the detection of modules in sparse networks. Phys. Rev. Lett. 107, 065701 (2011).
Article ADS Google Scholar
Aicher, C., Jacobs, A. Z. & Clauset, A. Learning latent block structure in weighted networks. J. Complex Networks 3, 221–248 (2015).
Article MathSciNet Google Scholar
Peixoto, T. P. Nonparametric weighted stochastic block models. Phys. Rev. E 97, 012306 (2018).
Article ADS CAS Google Scholar
Palla, G., Lovász, L. & Vicsek, T. Multifractal network generator. Proc. Natl. Acad. Sci. USA (2010).
Eom, Y.-H. & Fortunato, S. Characterizing and modeling citation dynamics. Plos One 6, 1–7 (2011).
Article Google Scholar
Medo, M. C. V., Cimini, G. & Gualdi, S. Temporal effects in the growth of networks. Phys. Rev. Lett. 107, 238701 (2011).
Article ADS Google Scholar
Serrano, M. A., Krioukov, D. & Boguñá, M. Self-similarity of complex networks and hidden metric spaces. Phys. Rev. Lett. 100, 078701 (2008).
Article ADS Google Scholar
Krioukov, D., Papadopoulos, F., Vahdat, A. & Boguñá, M. Curvature and temperature of complex networks. Phys. Rev. E 80, 035101 (2009).
Article ADS Google Scholar
Boguñá, M., Krioukov, D. & Claffy, K. C. Navigability of complex networks. Nat. Phys. 5, 74–80 (2009).
Article Google Scholar
Krioukov, D., Papadopoulos, F., Kitsak, M., Vahdat, A. & Boguñá, M. Hyperbolic geometry of complex networks. Phys. Rev. E 82, 036106 (2010).
Article ADS MathSciNet Google Scholar
Boguñá, M., Papadopoulos, F. & Krioukov, D. Sustaining the internet with hyperbolic mapping. Nat. Commun. 1, 62 (2010).
Article ADS Google Scholar
Masuda, N., Miwa, H. & Konno, N. Geographical threshold graphs with small-world and scale-free properties. Phys. Rev. E 71, 036108 (2005).
Article ADS Google Scholar
Konno, N., Masuda, N., Roy, R. & Sarkar, A. Rigorous results on the threshold network model. J. Phys. A: Math. Gen. 38, 6277 (2005).
Article MathSciNet Google Scholar
Fujihara, A. et al. Limit theorems for the average distance and the degree distribution of the threshold network model. Interdiscip. Inf. Sci. 15, 361–366 (2009).
MathSciNet MATH Google Scholar
Ide, Y., Konno, N. & Masuda, N. Statistical properties of a generalized threshold network model. Methodol. Comput. Appl. Probab. 12, 361–377 (2010).
Article MathSciNet Google Scholar
Caldarelli, G., Caretta Cartozo, C., De Los Rios, P. & Servedio, V. D. P. Widespread occurrence of the inverse square distribution in social sciences and taxonomy. Phys. Rev. E 69, 035101 (2004).
Article ADS Google Scholar
Garlaschelli, D., Ahnert, S. E., Fink, T. M. A. & Caldarelli, G. Low-temperature behaviour of social and economic networks. Entropy 15, 3148–3169 (2013).
Article ADS MathSciNet Google Scholar
Park, J. & Newman, M. E. J. Origin of degree correlations in the internet and other networks. Phys. Rev. E 68, 026112 (2003).
Article ADS Google Scholar

Download references

Acknowledgements

This project has received funding from the European Union’s Horizon 2020 Research and Innovation Programme under Grant Agreement No. 740688 and was partially supported by the National Research, Development and Innovation Office under Grant No. K128780.

Author information

Authors and Affiliations

Department of Biological Physics, Eötvös University, H-1117, Budapest, Hungary
Sámuel G. Balogh
MTA-ELTE Statistical and Biological Physics Research Group, Hungarian Academy of Sciences, H-1117, Budapest, Hungary
Péter Pollner & Gergely Palla

Authors

Sámuel G. Balogh
View author publications
You can also search for this author in PubMed Google Scholar
Péter Pollner
View author publications
You can also search for this author in PubMed Google Scholar
Gergely Palla
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

S.G.B., P.P. and G.P. developed the concept of the study, S.G.B. derived the equations, S.G.B., P.P., G.P. contributed to the interpretation of the results, S.G.B. prepared the table and the figures, S.G.B., P.P. and G.P. wrote the paper.

Corresponding author

Correspondence to Sámuel G. Balogh.

Ethics declarations

Competing Interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Balogh, S.G., Pollner, P. & Palla, G. Generalised thresholding of hidden variable network models with scale-free property. Sci Rep 9, 11273 (2019). https://doi.org/10.1038/s41598-019-47628-0

Download citation

Received: 04 December 2018
Accepted: 19 July 2019
Published: 02 August 2019
DOI: https://doi.org/10.1038/s41598-019-47628-0

This article is cited by

Generalised popularity-similarity optimisation model for growing hyperbolic networks beyond two dimensions
- Bianka Kovács
- Sámuel G. Balogh
- Gergely Palla
Scientific Reports (2022)

Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.