Divisibility patterns of natural numbers on a complex network

Investigation of divisibility properties of natural numbers is one of the most important themes in the theory of numbers. Various tools have been developed over the centuries to discover and study the various patterns in the sequence of natural numbers in the context of divisibility. In the present paper, we study the divisibility of natural numbers using the framework of a growing complex network. In particular, using tools from the field of statistical inference, we show that the network is scale-free but has a non-stationary degree distribution. Along with this, we report a new kind of similarity pattern for the local clustering, which we call “stretching similarity”, in this network. We also show that the various characteristics like average degree, global clustering coefficient and assortativity coefficient of the network vary smoothly with the size of the network. Using analytical arguments we estimate the asymptotic behavior of global clustering and average degree which is validated using numerical analysis.


Introduction
[6][7] The characterization of the structure of real networks is an indispensable part of this study. Despite being random, real networks show certain statistical properties which set them apart from their completely random mathematical counterparts. This hints at underlying organizing principles that shape the structures of real networks [8]. In particular, many real networks are scale-free, which means that the distribution of the degrees of their nodes follows a power law [8,9]. The density of triangles in a network is another important characteristic, measured using a quantity called the clustering coefficient. Empirical studies show that real networks are highly clustered compared with completely random mathematical models such as the Erdős-Rényi graph [8,9]. In the present paper, we report an analysis of a particular deterministic network that resembles real networks in many respects. This network consists of the natural numbers 1, 2, 3, ... as nodes, and if a given number divides another, their corresponding nodes are connected by an undirected link. The network thus constructed, though deterministic, can be studied on an equal footing with random networks because the irregular distribution of primes makes the divisibility relations themselves irregular. It is also helpful to view this network as a growing network in which nodes are added one at a time. A similar network with composite numbers as nodes has already been studied [10].
In the present work we consider a more general setup in which we put all natural numbers on a complex network, with their divisibility relation as the underlying deterministic rule of connection. Using tools from statistical inference, we confirm that this network is scale-free and show that the average degree, global clustering coefficient and assortativity coefficient vary smoothly with the size of the network. This is surprising in view of the fact that the distribution of primes in the sequence of natural numbers is quite irregular. We provide analytical results for the asymptotic behavior of the average degree and the global clustering coefficient of this network. In particular, we show that the global clustering coefficient decays to zero whereas the average degree increases logarithmically. We also report an interesting and novel similarity exhibited by the local clustering coefficients of the nodes in this network, which we call "stretching similarity".
The remainder of the paper is organized as follows. In the next section we describe the construction of the network and show that the network is scale-free. We then describe the existence of stretching similarity in this network. Finally, we show the behavior of the average degree, global clustering coefficient and assortativity coefficient as functions of the size of the network, and analytically obtain the asymptotic trends for the average degree and clustering.

Results
Construction of the network and its scaling properties. The nodes of the present network are the natural numbers 1, 2, 3, ... and there is a link between two nodes if either divides the other. We avoid self-links and all the links are undirected. Since the sequence of natural numbers has a natural ordering, it is helpful to view this network as a growing network with the addition of a new node at each discrete time, as follows:
1. At time t = 1 the network starts with a single node n = 1, and at every time t a node with the number n = t is added to the network.
2. This node connects to all the existing nodes whose numbers divide it.
The network thus constructed is shown in Fig. 1 at two different times, t = 16 and t = 32, which correspond to networks of size N = 16 and N = 32 respectively. In each panel, the size of each node is proportional to its degree and the color of each node is graded according to its clustering coefficient, with whiter nodes having higher values of local clustering.
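The construction rule above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used for the paper; the function names `divisibility_network` and `edge_count` are hypothetical. Looping over each divisor and its multiples yields the same set of links as adding the nodes one at a time and connecting each new node to its divisors.

```python
def divisibility_network(n):
    """Adjacency sets of the divisibility network on the nodes 1..n."""
    adj = {i: set() for i in range(1, n + 1)}
    for d in range(1, n + 1):
        # Link d to each of its proper multiples m = 2d, 3d, ... <= n.
        # Self-links are excluded because the loop starts at 2d.
        for m in range(2 * d, n + 1, d):
            adj[d].add(m)
            adj[m].add(d)
    return adj


def edge_count(adj):
    """Each undirected link is stored twice, once per endpoint."""
    return sum(len(nbrs) for nbrs in adj.values()) // 2
```

For example, in the network of size 16 the node 12 is linked to 1, 2, 3, 4 and 6, while the prime 13 is linked only to 1.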
To find the degree distribution of this network, we grow the network until its size reaches $N = 2^{15} = 32768$. The resulting distribution, shown in Fig. 2, appears to follow a power law ($p(k) \sim k^{-\alpha}$) asymptotically. Using the method of maximum likelihood, we find the scaling index $\alpha \approx 2.03$ for the network of size $N = 2^{15}$. We establish the existence of a power law in the distribution (and hence the fact that this network is scale-free) using the approach described in Clauset et al. [11] (see Methods).
We also study the scaling behavior of the local clustering coefficient with degree. The local clustering coefficient of a node is defined as the fraction of the possible edges among its neighbors that are actually present. For node $i$ with degree $k_i$ this can be written as [13]:
$$c_i = \frac{2E_i}{k_i(k_i - 1)} \tag{1}$$
where $E_i$ is the actual number of edges among the neighbors of node $i$.
In Fig. 3 we show the dependence of the local clustering coefficient of the nodes on their degree. It can be seen that the asymptotic behavior is compatible with a power-law decay with exponent 2. This behavior is similar to that usually observed in real networks [13]; however, the decay of clustering with degree is much faster in the present case.
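As an illustration of the definition of the local clustering coefficient, the following sketch counts the links among the neighbors of a node directly. The helper names are hypothetical, and nodes of degree less than 2 are assigned clustering 0, which is an assumed convention (it matches the value 0 reported in the text for primes connected only to 1).

```python
from itertools import combinations


def divisibility_network(n):
    """Adjacency sets of the divisibility network on the nodes 1..n."""
    adj = {i: set() for i in range(1, n + 1)}
    for d in range(1, n + 1):
        for m in range(2 * d, n + 1, d):
            adj[d].add(m)
            adj[m].add(d)
    return adj


def local_clustering(adj, i):
    """c_i = 2 E_i / (k_i (k_i - 1)), with c_i = 0 when k_i < 2 (assumed)."""
    nbrs = sorted(adj[i])
    k = len(nbrs)
    if k < 2:
        return 0.0
    # E_i: number of links that exist among the neighbours of i.
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))
```

In the network of size 12, node 6 has neighbors 1, 2, 3 and 12; five of the six possible links among them exist (only 2 and 3 are not linked), so its clustering coefficient is 5/6.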
Stretching similarity of local clustering. We now discuss an interesting behavior that sets the network of natural numbers apart from other complex networks. In the network presented here, each node has an identity, namely the number attached to it, and this defines a natural order on the nodes. This means that we can study various properties of the nodes as a function of their labels. This is not possible for other networks because no such unique labeling exists for their nodes. Here we specifically consider the local clustering coefficient of the nodes and study its behavior as a function of the node index. We find that the clustering coefficient $c_i$ of node $i$ varies seemingly irregularly. However, when $c_i$ is plotted against $i$, a global pattern is seen. In Fig. 4 we show this pattern for three different network sizes.
From the figure, it is clear that the global pattern of the local clustering coefficient gets stretched as the size of the network increases, such that the nature of the pattern remains the same. We call this new kind of similarity "stretching similarity"; it seems to be a unique feature of this network, not so far reported for any other complex network. We note from the plots in Fig. 4 that for a network of size N, discontinuous vertical steps occur approximately at the values N/2, N/3, N/4, ... Also, we observe a band of numbers with clustering coefficient 1 between N/3 and N/2; these numbers are the primes and prime powers in that range. This can be seen from the following argument. Consider any prime number p in the interval (N/3, N/2). On the lower side, it is connected only to 1, while on the upper side it is connected only to its multiples. However, every number in this range has only one multiple up to N, namely 2p. Thus the three numbers 1, p and 2p form a triangle, and hence the clustering coefficient of p must be 1. A similar argument shows that prime powers in this range also have clustering coefficient 1. There is another band of numbers with clustering coefficient exactly 0 between N/2 and N, which are also prime numbers. This is because every prime in this range is connected only to 1, making its clustering coefficient 0.
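The argument for the two prime bands can be checked numerically on a small network. The sketch below is illustrative only: the function names are hypothetical, and degree-1 nodes are assigned clustering 0 by assumption.

```python
import math
from itertools import combinations


def divisibility_network(n):
    """Adjacency sets of the divisibility network on the nodes 1..n."""
    adj = {i: set() for i in range(1, n + 1)}
    for d in range(1, n + 1):
        for m in range(2 * d, n + 1, d):
            adj[d].add(m)
            adj[m].add(d)
    return adj


def local_clustering(adj, i):
    """c_i = 2 E_i / (k_i (k_i - 1)), with c_i = 0 when k_i < 2 (assumed)."""
    nbrs = sorted(adj[i])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


def is_prime(p):
    return p >= 2 and all(p % q for q in range(2, math.isqrt(p) + 1))


def prime_bands(n):
    """Clustering of primes with n/3 < p < n/2 and of primes with p > n/2."""
    adj = divisibility_network(n)
    middle = {p: local_clustering(adj, p)
              for p in range(n // 3 + 1, (n - 1) // 2 + 1) if is_prime(p)}
    upper = {p: local_clustering(adj, p)
             for p in range(n // 2 + 1, n + 1) if is_prime(p)}
    return middle, upper
```

For N = 100, the primes 37, 41, 43 and 47 each form the triangle {1, p, 2p} and have clustering 1, as does the prime power 49 (neighbors 1, 7 and 98), while every prime above 50 has clustering 0.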
We also observe an interesting pattern when we plot the difference $\Delta c = c_i - c_{i+1}$ as a function of $i$ in Fig. 5. We find that this pattern is symmetric about $\Delta c = 0$, which can be quantified by finding the local density of values in the plot (see Methods). With increasing size of the network, this pattern also shows stretching similarity.
Topological characteristics of the network. In the present section, we discuss how three of the most important quantities, namely the average degree, the global clustering coefficient and the assortativity coefficient, vary with the size of the network.

Average degree
Here we derive an approximate expression for the average degree of the network as a function of its size. By definition, the average degree of the network is given by:
$$\langle k \rangle = \frac{2m}{n} \tag{2}$$
where $m$ is the total number of edges in the network and $n$ is the size of the network. The value of $m$ is also equal to the sum of the elements in the lower (or upper) triangular part of the adjacency matrix. To find this sum, we interpret the second index of the element $A_{ij}$ of the adjacency matrix as the divisor of the first index whenever $A_{ij} = 1$. In other words, let $A_{ij} = 1$ if and only if $i > j$ and $j \mid i$. Then the sum of the elements in the lower triangular part of the matrix is equal to the number of pairs $(k, j)$ with $k \ge 2$ and $kj \le n$. However, whenever $j > n/2$, all the entries in the $j$th column of the lower triangular part of $A$ are zero. Let $\lfloor x \rfloor$ denote the greatest integer $\le x$. Then $m$ is given by:
$$m = \sum_{j=1}^{\lfloor n/2 \rfloor} \left( \left\lfloor \frac{n}{j} \right\rfloor - 1 \right) = \sum_{j=1}^{n} \left\lfloor \frac{n}{j} \right\rfloor - \sum_{j=\lfloor n/2 \rfloor + 1}^{n} \left\lfloor \frac{n}{j} \right\rfloor - \left\lfloor \frac{n}{2} \right\rfloor \tag{3}$$
It is well known that the first term on the right satisfies an estimate as follows:
$$\sum_{j=1}^{n} \left\lfloor \frac{n}{j} \right\rfloor = n \ln n + (2\gamma - 1)\, n + O(\sqrt{n}) \tag{4}$$
where $\gamma$ is the Euler-Mascheroni constant. Also we observe that, since $\lfloor n/j \rfloor = 1$ for $n/2 < j \le n$:
$$\sum_{j=\lfloor n/2 \rfloor + 1}^{n} \left\lfloor \frac{n}{j} \right\rfloor = n - \left\lfloor \frac{n}{2} \right\rfloor \tag{5}$$
From Eqs. (2), (3), (4) and (5), it follows that:
$$\langle k \rangle = 2 \ln n + 2(2\gamma - 2) + O(n^{-1/2}) \tag{6}$$
Since $\gamma \approx 0.5772$, in the limit of large $n$ we get:
$$\langle k \rangle \approx 2 \ln n - 1.69 \tag{7}$$
This means that the average degree of the network increases logarithmically with its size; this variation is plotted in Fig. 6 (solid line) using Eq. (7). We also calculate the average degree numerically by growing the network up to $N = 2^{15}$, and the results, shown by solid dots in Fig. 6, are found to agree closely with the analytic expression (7). Since the average degree of the network increases with size, the degree distribution of the network is not stationary, although, as shown in the previous section, the network is scale-free at each stage (see Methods).
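The logarithmic growth of the average degree can be checked with a short script. The sketch below assumes the classical Dirichlet estimate for the divisor sum, which leads to $\langle k \rangle \approx 2 \ln n + 2(2\gamma - 2)$; the function names are hypothetical and the analytic form is our reading of the derivation, not a quotation of the original expression.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def average_degree(n):
    """Exact average degree: node j has floor(n/j) - 1 proper multiples."""
    m = sum(n // j - 1 for j in range(1, n + 1))  # total number of links
    return 2.0 * m / n


def average_degree_asymptotic(n):
    """Large-n estimate, assuming Dirichlet's divisor-sum asymptotics."""
    return 2.0 * math.log(n) + 2.0 * (2.0 * EULER_GAMMA - 2.0)
```

The two functions differ noticeably for very small networks (for n = 16 the exact value is 4.25), but the gap shrinks roughly like $n^{-1/2}$ as the network grows.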

Global clustering coefficient
The global clustering coefficient of the network quantifies the density of closed triplets in the network. A connected triplet is a set of 3 nodes connected to each other by exactly 2 links. A closed triplet is a set of 3 nodes connected to each other by exactly 3 links. A triangle in the network counts as three closed triplets (one centered at each node of the triangle). The global clustering coefficient of the network is then defined as:
$$C = \frac{3 \times (\text{number of triangles})}{\text{number of connected triplets}} \tag{8}$$
We estimate the number of triangles $T_n$ in the network using the following strategy. Let us fix a vertex $i$ and calculate the number of triangles in which $i$ is the smallest vertex. The number $i$ has $\lfloor n/i \rfloor - 1$ proper multiples in the range $[1, n]$. Each of them is of the form $ki$ where $k = 2, 3, \ldots, \lfloor n/i \rfloor$, and each such $ki$ in turn has $\lfloor n/(ki) \rfloor - 1$ proper multiples in the range, every one of which completes a triangle with $i$ and $ki$. Thus, $T_n$ is given by:
$$T_n = \sum_{i=1}^{n} \sum_{k=2}^{\lfloor n/i \rfloor} \left( \left\lfloor \frac{n}{ki} \right\rfloor - 1 \right)$$
Using the integral approximation for the above:
$$T_n \approx \int_{1}^{n/2} \int_{2}^{n/x} \left( \frac{n}{xy} - 1 \right) dy\, dx$$
The above is bounded by
$$c\, n (\ln n)^2 \le T_n \le c'\, n (\ln n)^2$$
Here $c$ and $c'$ are constants. Hence we see that:
$$T_n \sim n (\ln n)^2 \tag{12}$$
Let $U(n)$ be the number of connected triplets in the network after the $n$th stage. Then $U(n)$ is given by:
$$U(n) = \sum_{i=1}^{n} \frac{k_i (k_i - 1)}{2} = \frac{n}{2} \left( \langle k^2 \rangle - \langle k \rangle \right)$$
Since we have observed (Fig. 2) that the degree distribution of the network follows a power law $k^{-\alpha}$ with $\alpha \sim 2$, the proportion $p(k)$ of nodes with degree $k$ satisfies $p(k) \sim k^{-2}$. Thus, the expectation of the variable $k^2$ satisfies:
$$\langle k^2 \rangle \sim \sum_{k=1}^{n} k^2\, k^{-2} \sim n$$
Hence we see that
$$U(n) \sim n^2 \tag{15}$$
From Eqs. (8), (12) and (15), the global clustering coefficient decays to zero as the network size goes to infinity. We verify this by numerically computing the global clustering coefficient; the result is shown in Fig. 7(a). However, we note that the Watts-Strogatz clustering coefficient $C_{WS}$ of the network (defined as the average of the local clustering coefficients over all the nodes of the network [15]) does not decay to zero and instead saturates at a constant value of about 0.59. This is clear from Fig. 4, since the pattern repeats with stretching similarity as the network size increases. To the best of our knowledge, there is no other network in which $C_{WS}$ saturates at a high non-zero value while the global clustering coefficient decays to 0.
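The decay of the global clustering coefficient can be explored numerically without storing the whole network. The sketch below rests on one observation, stated here as an assumption of the sketch rather than as the method of the paper: every triangle is a chain a | b | c with a < b < c, so it can be counted at its middle node b as (number of proper divisors of b) times (number of proper multiples of b up to n). All triplets centered on a node, open or closed, are counted as connected triplets. Function names are hypothetical.

```python
def proper_divisor_counts(n):
    """counts[m] = number of divisors d of m with d < m, for m = 1..n."""
    counts = [0] * (n + 1)
    for d in range(1, n + 1):
        for m in range(2 * d, n + 1, d):
            counts[m] += 1
    return counts


def global_clustering(n):
    """C = 3 * (number of triangles) / (number of connected triplets)."""
    below = proper_divisor_counts(n)
    triangles = 0
    triplets = 0
    for b in range(1, n + 1):
        above = n // b - 1              # proper multiples of b up to n
        triangles += below[b] * above   # chains a | b | c with b in the middle
        k = below[b] + above            # degree of node b
        triplets += k * (k - 1) // 2    # pairs of neighbours of b
    return 3.0 * triangles / triplets if triplets else 0.0
```

For n = 4 the only triangle is {1, 2, 4} and there are 5 connected triplets, giving C = 0.6; the value decreases as n grows.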

Assortativity coefficient
The correlation of degrees in the network is an important quantifier of the network structure [16]. If the high-degree nodes of a network tend to connect to low-degree nodes (i.e. if the network has negative degree correlations), the network is said to be disassortative in structure, whereas if nodes of similar degree tend to connect to each other, the network is said to be assortative. Real networks other than social networks are typically disassortative [16], and this has been explained using the fact that the disassortative state is the most likely state of scale-free networks [17]. The assortative or disassortative nature of a network can be quantified using the assortativity coefficient [9]:
$$r = \frac{\sum_{ij} \left( A_{ij} - k_i k_j / 2m \right) k_i k_j}{\sum_{ij} \left( k_i \delta_{ij} - k_i k_j / 2m \right) k_i k_j} \tag{16}$$
where $k_i$ is the degree of the $i$th node, $A_{ij}$ is the $(i, j)$th element of the adjacency matrix, $m$ is the total number of edges in the network and $\delta_{ij}$ is the Kronecker delta. In Fig. 7(b) we show the dependence of $r$ on the size of the network; in spite of the irregularity in the divisibility pattern, $r$ varies smoothly with $n$. It can be seen that $r$ saturates to a negative value as the size of the network increases, and hence the network is disassortative. The disassortative nature of the network of natural numbers is understandable from the following argument. For any link in this network, one end of the link is a divisor (node A) and the other is a multiple (node B). Hence node A is also connected to all the nodes which are multiples of B, but the reverse is not true. This means that the degree of node A tends to be much higher than the degree of node B for a given size of the network, giving a negative value for the overall correlation coefficient.
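The assortativity coefficient can be evaluated for this network directly from the degrees, using the standard adjacency-matrix form of the degree correlation coefficient. The sketch below is a hedged illustration with hypothetical function names; it uses the identities $\sum_{ij} A_{ij} k_i k_j = 2 \sum_{\text{links}} k_i k_j$, $\sum_{ij} k_i \delta_{ij} k_i k_j = \sum_i k_i^3$ and $\sum_{ij} k_i^2 k_j^2 = (\sum_i k_i^2)^2$.

```python
def proper_divisor_counts(n):
    """counts[m] = number of divisors d of m with d < m, for m = 1..n."""
    counts = [0] * (n + 1)
    for d in range(1, n + 1):
        for m in range(2 * d, n + 1, d):
            counts[m] += 1
    return counts


def assortativity(n):
    """Degree correlation coefficient r of the divisibility network 1..n."""
    below = proper_divisor_counts(n)
    # Degree = proper divisors + proper multiples; index 0 is unused padding.
    k = [0] + [below[i] + n // i - 1 for i in range(1, n + 1)]
    two_m = sum(k)                                  # 2m = sum of degrees
    link_sum = sum(k[d] * k[m]
                   for d in range(1, n + 1)
                   for m in range(2 * d, n + 1, d))  # sum over links of k_i k_j
    s2 = sum(x * x for x in k)
    s3 = sum(x ** 3 for x in k)
    numerator = 2.0 * link_sum - s2 * s2 / two_m
    denominator = s3 - s2 * s2 / two_m
    return numerator / denominator
```

For the network of size 4 (links 1-2, 1-3, 1-4 and 2-4) this gives r = -5/7, and the value stays negative as the network grows.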

Discussion
The network of natural numbers constructed using the divisibility relation resembles real networks in many characteristics, such as degree distribution, clustering and degree correlations. In particular, using rigorous methods from statistical inference, we show that the network is scale-free. A unique feature that we find for this network is that its global clustering coefficient decays to zero with size whereas the Watts-Strogatz clustering coefficient saturates at a high constant value.
At the same time, we show that this network exhibits a completely novel type of similarity, which we term "stretching similarity", as the size of the network increases. We believe that the existence of stretching similarity in this network is a reflection of the nature of the distribution of prime numbers and hence deserves more attention. Also, the average degree, global clustering coefficient and assortativity coefficient of this network vary quite smoothly with size, and hence may help us to better understand the divisibility relation between numbers. In conclusion, the work presented here provides a new perspective on the divisibility relations of natural numbers and has the potential to become an important tool in the investigation of the properties of natural numbers.

Methods
Establishing the scale-free nature of the network. The shape of the degree distribution of the network in Fig. 2 hints at the existence of an asymptotic power law in the distribution ($p(k) \sim k^{-\alpha}$ for $k \ge k_{min}$). However, visual inspection to find $k_{min}$, and least-squares fits and related methods to find the exponent $\alpha$ of the power law, are known to produce very poor estimates [18]. Hence we apply the method of maximum likelihood to the degree sequence of the network to find the scaling index $\alpha$ of the power-law distribution [11]. For this, we initially assume that the sequence is drawn from a distribution that follows a power law $k^{-\alpha}$ for all $k \ge k_{min}$. To find this $k_{min}$, we use the approach proposed by Clauset et al. [12] The idea behind this method is to choose as $k_{min}$ the value of $k$ that makes the probability distributions of the data and of the best-fit power-law model as similar as possible above $k_{min}$, where the Kolmogorov-Smirnov statistic is used as the distance between the two distributions [11]. After finding $k_{min}$ using this method, the best estimate of the scaling exponent $\alpha$ is given by the maximum-likelihood estimator of ref. [11], evaluated on the values $k_i$, $i = 1, \ldots, N$, such that $k_i \ge k_{min}$. For the network of size $2^{15}$ studied here, the value of $\alpha$ obtained is 2.03.
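A minimal sketch of the maximum-likelihood step is given below. It assumes the approximate estimator for discrete data from Clauset et al., $\hat{\alpha} \simeq 1 + N \left[ \sum_i \ln \frac{k_i}{k_{min} - 1/2} \right]^{-1}$; the text above does not state which variant of the estimator was used, so this choice, like the function name `fit_alpha`, is an assumption of the sketch.

```python
import math


def fit_alpha(degrees, k_min):
    """Approximate discrete power-law MLE for the tail k >= k_min.

    Assumes alpha ~= 1 + N / sum(ln(k_i / (k_min - 0.5))), the discrete
    approximation given by Clauset et al.; only values k_i >= k_min are used.
    """
    tail = [k for k in degrees if k >= k_min]
    if not tail:
        raise ValueError("no degrees at or above k_min")
    log_sum = sum(math.log(k / (k_min - 0.5)) for k in tail)
    return 1.0 + len(tail) / log_sum
```

In practice `k_min` itself would be chosen by scanning candidate values and keeping the one that minimizes the Kolmogorov-Smirnov distance between the data and the fitted model, as described in the text.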
To validate the existence of the power law, we use the approach described in Clauset et al. [11] In this approach we generate many synthetic data sets from a true power-law distribution and measure how far they fluctuate from the power-law form. We then compare the results with similar measurements on the observed data. If the observed data set is much further from the power-law form than the synthetic ones, the power law is rejected. The p-value is defined as the fraction of the synthetic distances that are larger than the empirical distance. A large p-value is indicative of the existence of a power law in the data. In the present work we calculate the p-values for three different sizes of the network: N = 256, 512 and 1024. For this, we generate 2500 synthetic data sets, which gives p-values accurate to two decimal places: 0.62, 0.95 and 0.98 respectively. The existence of a power-law degree distribution for this network is thus supported by the fact that the p-values rapidly approach 1 as the network size increases.
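The goodness-of-fit procedure can be sketched as follows. This is a deliberately simplified version and not the full procedure of Clauset et al.: it treats the tail as continuous, keeps the lower cutoff fixed instead of re-estimating it for each synthetic set, and ignores the data below the cutoff. All function names and default parameters are hypothetical.

```python
import math
import random


def fit_alpha(tail, x_min):
    """Continuous power-law MLE; requires at least one value above x_min."""
    return 1.0 + len(tail) / sum(math.log(x / x_min) for x in tail)


def ks_distance(tail, alpha, x_min):
    """Largest gap between the empirical CDF and the power-law CDF."""
    xs = sorted(tail)
    n = len(xs)
    gap = 0.0
    for j, x in enumerate(xs):
        model = 1.0 - (x / x_min) ** (1.0 - alpha)
        # Compare the model with the empirical step just below and at x.
        gap = max(gap, abs(model - j / n), abs(model - (j + 1) / n))
    return gap


def p_value(tail, x_min, n_synthetic=200, seed=0):
    """Fraction of synthetic power-law samples that fit worse than the data."""
    rng = random.Random(seed)
    alpha = fit_alpha(tail, x_min)
    observed = ks_distance(tail, alpha, x_min)
    worse = 0
    for _ in range(n_synthetic):
        # Inverse-transform sampling from a continuous power law.
        synth = [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
                 for _ in tail]
        if ks_distance(synth, fit_alpha(synth, x_min), x_min) > observed:
            worse += 1
    return worse / n_synthetic
```

With 2500 synthetic data sets, as used in the text, the p-value is resolved to about two decimal places; the smaller default here only keeps the sketch fast.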
The distribution in Fig. 2 is plotted with logarithmic binning, with successive bin sizes equal to successive powers of 2, and the count in each bin is normalized by dividing it by the bin width. The same strategy is used to show the dependence of the local clustering coefficient C(k) on the degree k in Fig. 3.
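The binning scheme can be sketched as below. The widths 2, 4, 8, ... follow the "positive powers of 2" wording of the caption of Fig. 2, and starting the first bin at k = 1 is an assumption; the function name and the `first_width` parameter are hypothetical.

```python
def log_binned_counts(degrees, first_width=2):
    """Width-normalized counts in bins [1, 3), [3, 7), [7, 15), ...

    Returns a list of (low, high, count / width) tuples, where each bin
    covers low <= k < high and successive widths double.
    """
    k_max = max(degrees)
    bins = []
    low, width = 1, first_width
    while low <= k_max:
        high = low + width
        count = sum(1 for k in degrees if low <= k < high)
        bins.append((low, high, count / width))
        low, width = high, 2 * width
    return bins
```

Dividing by the bin width compensates for the growing bin size, so that a power law still appears as a straight line on a log-log plot.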
Symmetry in difference of successive local clustering coefficients. To establish the global symmetry of the differences in local clustering values $\Delta c$ around the horizontal axis $\Delta c = 0$ (Fig. 5) for any network size, we calculate the local density of points in the plot. For this, we divide the horizontal axis into $2^7 = 128$ cells and the vertical axis into 200 cells. For a network of size $2^N$ (here $N$ denotes the exponent), the whole plot then gets divided into pixels of dimension $0.01 \times 2^{N-7}$. We define the density $\rho(x, y)$ of a particular pixel $(x, y)$ as the ratio of the number of points present in the pixel to the maximum number that can be there, which is equal to $2^{N-7}$ (all the points on the y-axis differing by less than 0.01 are considered the same, so the vertical dimension of each pixel is just 1). For each $x$ we calculate the absolute difference between the corresponding pixels on each side of the line $\Delta c = 0$. If the pattern is symmetric, these absolute differences are expected to be small. We calculate the average of such differences as:
$$\phi(x) = \frac{1}{100} \sum_{y=1}^{100} \left| \rho(x, y) - \rho(x, -y) \right|$$
where $y$ and $-y$ label the $y$th cells above and below the line $\Delta c = 0$. In Fig. 8 we show $\phi(x)$ as a function of $x$; as is clear from the figure, all $\phi$ values are very close to 0, confirming that the pattern is indeed symmetric.
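The symmetry measure can be sketched as follows. This is an illustrative reading of the procedure, with hypothetical names; in particular, averaging over the 100 mirror pairs of vertical cells is our interpretation of "the average of such differences", and points with a difference of exactly 0 are assigned to the first cell above the axis.

```python
def symmetry_profile(delta_c, x_cells=128, y_cells=200):
    """phi(x): mean absolute density difference between mirror cells.

    delta_c[i] is the difference c_i - c_{i+1}, assumed to lie in [-1, 1].
    """
    n = len(delta_c)
    per_cell = n / x_cells          # maximum number of points in one pixel
    half = y_cells // 2
    counts = [[0] * y_cells for _ in range(x_cells)]
    for index, dc in enumerate(delta_c):
        x = min(int(index / per_cell), x_cells - 1)
        y = min(int((dc + 1.0) / 2.0 * y_cells), y_cells - 1)
        counts[x][y] += 1
    phi = []
    for x in range(x_cells):
        # Cell half + j lies above the axis; its mirror image is half - 1 - j.
        diffs = [abs(counts[x][half + j] - counts[x][half - 1 - j]) / per_cell
                 for j in range(half)]
        phi.append(sum(diffs) / half)
    return phi
```

A perfectly mirror-symmetric set of differences gives $\phi(x) = 0$ in every horizontal cell, while a one-sided set gives a strictly positive value.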

Figure 1. Network of natural numbers with two different sizes. (a) t = 16 nodes and (b) t = 32 nodes. In each panel, the size of each node is proportional to its degree and the color of each node is graded according to its clustering coefficient, with whiter nodes having higher values of local clustering.

Figure 2. Degree distribution of the network of natural numbers with logarithmic binning. Sizes of successive bins are equal to successive positive powers of 2 and the count in each bin is normalized by dividing by the bin width. The lower bound of the power-law regime of the distribution in this case is estimated to be k = 123 using the method of Clauset et al. [12] The dotted line in the graph has slope $-\alpha = -2.03$, calculated using the method of maximum likelihood [11]. The existence of the underlying power law is established by calculating the p-value using the Kolmogorov-Smirnov statistic for smaller sizes of the same network (see Methods).

Figure 3. Dependence of local clustering coefficient on degree. The plot is created using logarithmic binning and the count in each bin is normalized by dividing by the bin width. Asymptotically, the local clustering is seen to follow a power law with exponent ∼ 2.

Figure 4. Local clustering coefficient as a function of node index for three different sizes of network. (a) $N = 2^{13}$, (b) $N = 2^{14}$ and (c) $N = 2^{15}$. In any local region of the plot, the values $c_i$ seem to be scattered irregularly. However, with the increase in the network size, the whole pattern is stretched on a global scale. We call this similarity "stretching similarity".

Figure 5. Difference between clustering coefficients of successive nodes i and i + 1 as a function of index i. This pattern is symmetric about the line $\Delta c = 0$ and also shows stretching similarity.

Figure 6. Average degree of the network as a function of size. The solid dots represent the actual values calculated by direct numerical simulations while the solid line is plotted using the analytic expression (7).

Figure 7. Global clustering coefficient and assortativity coefficient as a function of size of the network. (a) The global clustering coefficient (see Eq. (8)) decays to 0 as the size of the network increases. (b) The assortativity coefficient r (see Eq. (16)) saturates to a negative value close to −0.14, implying that the network is disassortative.