Analysis and synthesis of a growing network model generating dense scale-free networks via category theory

We propose a growing network model that can generate dense scale-free networks with an almost neutral degree−degree correlation and a negative scaling of local clustering coefficient. The model is obtained by modifying an existing model in the literature that can also generate dense scale-free networks but with a different higher-order network structure. The modification is mediated by category theory. Category theory can identify a duality structure hidden in the previous model. The proposed model is built so that the identified duality is preserved. This work is a novel application of category theory for designing a network model focusing on a universal algebraic structure.


Recently, the authors proposed a growing network model that can generate dense scale-free networks with arbitrary exponents in the range $\gamma > 1$ [11] by modifying a copying model [12]. However, the networks generated by this model have a rather distorted higher-order structure. They have a strong positive degree−degree correlation: the degree correlation function $k_{nn}(k)$ [13] increases linearly on average as the node degree $k$ increases. In addition, the local clustering coefficient $C(k)$ [14] shows a tendency that is rarely observed in real-world networks: $C(k)$ increases on average as $k$ increases [11]. In this paper, by modifying our previous model [11], we propose a growing network model that can generate dense scale-free networks with a different higher-order structure, namely an almost neutral degree−degree correlation and a negative scaling of the local clustering coefficient. We show that the proposed model can generate dense scale-free networks in which $k_{nn}(k)$ is an almost constant function of $k$ on average and $C(k)$ is a decreasing function of $k$ on average. In particular, the latter is a hallmark of a hierarchical or modular structure and is frequently observed in real-world networks [14,15].
The modification of our previous model relies on category theory [16]. Category theory is a branch of abstract algebra that has been used to extract mathematical structures common to different fields of mathematics and to transfer a theory from one field to another [17]. Recently, it has been suggested that category theory can also have effective applications in different fields of science [18]: control theory [19,20], electrical circuits [21], databases [23], resource theory [24], dynamical systems [25], machine learning [26,27], complex systems design [28], and so on. We use category theory to identify a hidden duality structure in our previous model of growing networks and to exploit it in building a new model. To obtain the new model, we modify the previous model so that the duality structure is preserved. Here, we use only a small part of category theory: the notions of preordered sets and Galois connections. This material is reviewed in "Methods".

Background
Recall that the algorithm of the BA model consists of two steps: growth and preferential attachment (PA) [3]. In the growth step, a new node enters an existing network. The degree of the new node is given as a fixed value $m$. Then, in the PA step, each existing node acquires a new link to the new node with probability proportional to its degree. The network grows as these two steps are repeated indefinitely. The copying model considered in our previous work [11] was originally proposed as a model of the evolution of protein-protein interaction networks driven by gene duplications and mutations [12]. PA is not directly implemented in the algorithm of the copying model, but it arises naturally from it. The copying model replaces the above two steps of the BA model with the copying step and the divergence step, respectively. In the copying step, a new node is produced by copying a randomly chosen existing node together with the links emanating from it. In the divergence step, each link from the new node is deleted with a given probability $0 < p < 1$. PA follows from the copying step because the higher the degree of a node, the higher the probability that it is reached from a randomly chosen node. It is known that the networks generated by the copying model have a power-law degree distribution $p_k \sim k^{-\gamma}$ with $2 \le \gamma \le 3$ when $p \le 1/2$, and that they are dense but not scale-free for $p > 1/2$ [12].
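The copying and divergence steps described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `copying_model`, the two-node initial network, and the adjacency-dictionary representation are assumptions made here and are not taken from the cited works.

```python
import random


def copying_model(n_nodes, p, seed=0):
    """Grow a network by node duplication followed by random link deletion.

    Returns an adjacency dictionary {node: set of neighbors}.
    """
    rng = random.Random(seed)
    # Assumed initial network: two nodes joined by a single link.
    adj = {0: {1}, 1: {0}}
    for new in range(2, n_nodes):
        # Copying step: pick an existing node uniformly at random
        # and copy the links emanating from it.
        source = rng.randrange(new)
        copied = adj[source]
        # Divergence step: each copied link is deleted with probability p.
        kept = {v for v in copied if rng.random() >= p}
        adj[new] = kept
        for v in kept:
            adj[v].add(new)
    return adj
```

A call such as `copying_model(1000, 0.4)` returns the grown network, from which a degree histogram can be computed with `len(adj[v])` for each node `v`. In this sketch a new node may end up isolated if all copied links are deleted.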
In our previous model [11], only the degree of a randomly chosen node is copied when creating a new node. The copied degree is interpreted as an evaluation of the ability to form links ('popularity'). After the copied degree is multiplied by a conversion coefficient $\delta > 0$ from the degree (a result of link formation) to the ability (a cause of link formation), the obtained value is called the virtual degree of the new node. The targets of links from the new node are determined by a weak form of PA called ordinal preferential attachment (OPA). In OPA, the new node connects to randomly chosen existing nodes whose evaluation value of the ability to form links is greater than or equal to the virtual degree of the new node. Thus, our previous model consists of the following two steps: the copying degree step and the OPA step. In the copying degree step, an existing node $y$ is first chosen randomly. Then, the virtual degree $d^*_x$ of the new node $x$ is given as a randomly chosen natural number $k$ less than or equal to $\lceil \delta d_y \rceil$, where $d_y$ is the degree of $y$ and $\lceil r \rceil$ is the smallest integer greater than or equal to a real number $r$. In the OPA step, $d^*_x$ existing nodes $z$ satisfying $d^*_x \le \lceil \delta d_z \rceil$ are randomly chosen and the new node $x$ forms links to them. When the number of existing nodes $z$ satisfying this inequality is less than $d^*_x$, $x$ connects to all such nodes and the OPA step is completed.
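The two steps of the previous model can be sketched as follows. The sketch assumes that every "random" choice is uniform and that the initial network consists of two linked nodes; the function name `grow_opa` and the data layout are hypothetical and are not the authors' implementation.

```python
import math
import random


def grow_opa(n_nodes, delta, seed=0):
    """Copying degree step followed by the OPA step (illustrative sketch)."""
    rng = random.Random(seed)
    # Assumed initial network: two nodes joined by a single link.
    adj = {0: {1}, 1: {0}}
    for x in range(2, n_nodes):
        # Copying degree step: choose y at random, then a virtual degree
        # d_star uniformly (assumption) from 1..ceil(delta * d_y).
        y = rng.randrange(x)
        d_star = rng.randint(1, math.ceil(delta * len(adj[y])))
        # OPA step: candidates are the nodes z with d_star <= ceil(delta * d_z).
        candidates = [z for z in range(x)
                      if d_star <= math.ceil(delta * len(adj[z]))]
        if len(candidates) < d_star:
            targets = candidates          # connect to all such nodes
        else:
            targets = rng.sample(candidates, d_star)
        adj[x] = set(targets)
        for z in targets:
            adj[z].add(x)
    return adj
```

The candidate set is never empty because the copied node $y$ itself always satisfies the inequality, so every new node receives at least one link in this sketch.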
Our previous model can generate dense scale-free networks with exponent $1 < \gamma \le 2$ for $1 \le \delta < e = 2.71828\ldots$ [11]. A theoretical analysis based on the rate equation shows that $\gamma$ is obtained as a nontrivial solution of $\gamma = \delta(\gamma - 1)^2 + \delta^{\gamma - 1}$. The range of power-law regime is given by 1 ≪ k < Mt for γ = 2 and 1 < γ < 2 , respectively. The degree correlation function $k_{nn}(k)$, defined as the average degree of the neighbors of a node with degree $k$, satisfies k nn (k) ≥ δ 2 k for k ≫ 1 , indicating a strong positive degree−degree correlation. For the local clustering coefficient $C(k)$, defined as the probability that a pair of nodes among the neighbors of a node of degree $k$ is linked, we have C(k) ≥ D t k γ for k ≫ 1 when δ < 1.796028 . . . , where $D$ is a constant depending on $\delta$. Thus, $C(k)$ is expected to increase on average as $k$ increases, which was verified by numerical simulation [11].
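As a numerical illustration, the nontrivial exponent can be located by bisection. The sketch below assumes that the exponent equation reads $\gamma = \delta(\gamma - 1)^2 + \delta^{\gamma - 1}$, which is a reading of the formula above chosen because it has the trivial root $\gamma = 1$ and gives $\gamma = 2$ at $\delta = 1$. The function name `gamma_exponent` and the bracketing offset are assumptions.

```python
import math


def gamma_exponent(delta, iterations=100):
    """Nontrivial root in (1, 2] of gamma = delta*(gamma-1)**2 + delta**(gamma-1).

    The equation itself is an assumed reading of the text, not a verified formula.
    """
    if not 1.0 <= delta < math.e:
        raise ValueError("delta must satisfy 1 <= delta < e")

    def f(g):
        return delta * (g - 1.0) ** 2 + delta ** (g - 1.0) - g

    lo, hi = 1.0 + 1e-6, 2.0   # lo sits just above the trivial root gamma = 1
    if f(lo) >= 0.0:
        raise ValueError("delta too close to e for this simple bracket")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Under this reading, `gamma_exponent(1.0)` converges to 2 and the root decreases toward 1 as $\delta$ approaches $e$.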

Results
We modify our previous model so that the modified model still can generate dense scale-free networks but with a different higher-order network structure. We show that the generated networks have an almost neutral degree−degree correlation and a negative scaling of local clustering coefficient. First, we analyze the duality structure of our previous model by category theory. Second, we synthesize a new model preserving the extracted duality structure.
Analysis of the previous model. Let $X$ be the set of nodes in a network generated by our previous model described in Section "Background". We define a map $G : X \to \mathbb{N}$ from $X$ to the set of natural numbers $\mathbb{N}$ by $G(x) = \lceil \delta d_x \rceil$ for $x \in X$. Here, we assume that $d_x > 0$ for all $x \in X$. This condition is satisfied in the course of growth if the initial network is a connected network with two or more nodes. We regard $\mathbb{N}$ as a preordered set with the usual less-than-or-equal-to relation $\le$ between natural numbers. We equip $X$ with a preorder $\le_X$ by defining $x \le_X y :\Leftrightarrow d_x \le d_y$ for $x, y \in X$. The preorder $\le_X$ is a total preorder, namely, for any pair of nodes $x, y$ in $X$, $x \le_X y$ or $y \le_X x$ holds. Two nodes $x, y \in X$ are equivalent, namely, both $x \le_X y$ and $y \le_X x$ hold, if and only if $d_x = d_y$. With this definition of $\le_X$, $G$ becomes a preorder-preserving map: if $x \le_X y$, then $G(x) \le G(y)$. Note that the converse implication also holds for $\delta \ge 1$, which corresponds to the dense regime of interest in this paper. Thus, $x \le_X y$ is equivalent to $G(x) \le G(y)$ when $\delta \ge 1$. $G$ is the map evaluating the ability of each node to form links. The copying degree step in our previous model is restated as follows: a natural number $k$ such that $k \le G(y)$ is chosen randomly for a randomly chosen node $y \in X$.
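The claim that $G$ preserves the preorder for every $\delta > 0$ and also reflects it when $\delta \ge 1$ can be checked by brute force on small degrees. In the sketch below, nodes are represented only by their degrees, and the helper name `preserves_and_reflects` is hypothetical.

```python
import math


def G(degree, delta):
    """Evaluation map: ceil(delta * degree)."""
    return math.ceil(delta * degree)


def preserves_and_reflects(delta, max_degree=50):
    """Check, for degrees 1..max_degree, whether a <= b implies G(a) <= G(b)
    (preservation) and whether G(a) <= G(b) implies a <= b (reflection)."""
    preserves = reflects = True
    for a in range(1, max_degree + 1):
        for b in range(1, max_degree + 1):
            if a <= b and not G(a, delta) <= G(b, delta):
                preserves = False
            if G(a, delta) <= G(b, delta) and not a <= b:
                reflects = False
    return preserves, reflects
```

For example, with $\delta = 1/2$ the degrees 1 and 2 are both mapped to 1, so the converse implication fails, whereas it holds for $\delta = 1$ and $\delta = 3/2$.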
If we restrict the range of $G$ to the interval $[1, \max G(X)] \subseteq \mathbb{N}$, then there exists a preorder-preserving map $F : [1, \max G(X)] \to X$ such that $F(k) \le_X x$ if and only if $k \le G(x)$ for all $k \in [1, \max G(X)]$ and $x \in X$. Here, $F(k)$ is a minimum element of the set $\{x \in X \mid k \le G(x)\}$, which is guaranteed to exist, since $\le_X$ is a total preorder and $X$ is a finite set. Since $G$ is a many-to-one map in general (see Fig. 1), this set can have multiple minimum elements. Any choice from the set of the minimum elements can be used to define $F(k)$. The pair of preorder-preserving maps $(F, G)$ is called a Galois connection, or an adjunction [16,18]. Figure 1(b) illustrates a Galois connection $(F, G)$ for the network shown in Fig. 1(a). We can see that $F(G(x'))$ and $x'$ are equivalent for all $x' \in X$ in Fig. 1(b). This is true in general when $\delta \ge 1$.
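A small sketch may help to see how $F$ can be built from $G$ on a concrete degree sequence. It assumes the Galois condition in the form "$F(k) \le_X x$ if and only if $k \le G(x)$" and picks one minimum element by degree. The names `galois_pair` and `is_galois` and the toy degree table are hypothetical.

```python
import math


def galois_pair(degrees, delta):
    """degrees: {node: degree > 0}. Returns (F, G) as dictionaries."""
    G = {x: math.ceil(delta * d) for x, d in degrees.items()}
    F = {}
    for k in range(1, max(G.values()) + 1):
        # Nodes whose evaluated ability is at least k.
        upper = [x for x in degrees if k <= G[x]]
        # Any minimum element (smallest degree) may serve as F(k).
        F[k] = min(upper, key=lambda x: degrees[x])
    return F, G


def is_galois(F, G, degrees):
    """Check F(k) <=_X x  <=>  k <= G(x) for all k and x."""
    return all((degrees[F[k]] <= degrees[x]) == (k <= G[x])
               for k in F for x in G)


degrees = {"a": 1, "b": 2, "c": 2, "d": 3}   # toy example, not from the paper
F, G = galois_pair(degrees, 1.5)
```

In this toy example, $F(G(x))$ has the same degree as $x$ for every node, in line with the equivalence of $F(G(x'))$ and $x'$ stated above for $\delta \ge 1$.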
Since $G$ evaluates the ability of each node to form links, the map $F$ dual to $G$ can be regarded as expressing a 'realization' process of the ability to form links. Under this interpretation, it might be natural to introduce the following link formation rule: for $k \le G(y)$ obtained in the copying degree step, the targets of the links from the new node $x$ with virtual degree $d^*_x = k$ are chosen as $F(k)$. In more detail, let $M_k$ be the set of the minimum elements of $\{z \in X \mid k \le G(z)\}$; the targets of $x$ are taken from $M_k$. We call this link formation rule the adjunction rule. However, the adjunction rule cannot produce dense scale-free networks, as numerically demonstrated for $\delta = 3/2$ in Fig. 2. On the other hand, in the OPA step of our previous model, the targets of the links from the new node $x$ are randomly chosen from the set $\{z \in X \mid d^*_x \le G(z)\}$ rather than from its minimum elements. We call the rule used in the OPA step the pre-adjunction rule.
Synthesis of a new model. In Section "Analysis of the previous model", we expressed the duality of the degree of nodes (result of link formation/cause of link formation) as a Galois connection $(F, G)$ and restated the copying degree step and the OPA step of our previous model in terms of $(F, G)$. In particular, it turned out that the OPA step uses the pre-adjunction rule for link formation. Now, we obtain a new growing network model by choosing a different Galois connection $(F', G')$ that is isomorphic to $(F, G)$.
First, we describe the algorithm of the new model without referring to category theory. Then, we explain how we obtain the new model from the Galois connection $(F', G')$.
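Given an initial network, the following two steps are repeated indefinitely in the new model:
(1) An existing node $y$ is chosen uniformly at random from the set of all existing nodes. A new node $x$ is generated with virtual degree $d^*_x$, a natural number chosen uniformly at random from the interval $[1, \lceil \delta d_y \rceil]$, where $\delta > 0$ is a parameter.
(2) An existing node $z$ is chosen at random from the nodes satisfying $d^*_x \le \lceil \delta d_z \rceil$, and $x$ forms links to $d^*_x$ nodes chosen at random from $N_z$, the set consisting of $z$ and its neighbors. If $N_z$ contains fewer than $d^*_x$ nodes, $x$ connects to all nodes in $N_z$.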
Let $N_x$ denote the set consisting of a node $x \in X$ and its neighbors, and let $N_X = \{(x, N_x) \mid x \in X\}$. We equip $N_X$ with the preorder $\le_{N_X}$ induced from $\le_X$, namely $(x, N_x) \le_{N_X} (y, N_y) :\Leftrightarrow x \le_X y$, and define $G' : N_X \to \mathbb{N}$ by $G'((x, N_x)) = \lceil \delta d_x \rceil$. Then, $G'$ is a preorder-preserving map. The preordered sets $(X, \le_X)$ and $(N_X, \le_{N_X})$ are isomorphic by the isomorphism $I : X \to N_X$ defined by $I(x) = (x, N_x)$ for each $x \in X$, and we have $G' = G \circ I^{-1}$. Thus, the preorder-preserving maps $F' = I \circ F$ and $G'$ form a Galois connection $(F', G')$ isomorphic to $(F, G)$. Based on the pre-adjunction rule for $(F', G')$, we obtain the above new growing network model consisting of the copying degree step and a new version of the OPA step. We adopt the same procedure for the copying degree step. Let $x$ be a new node to be added and $d^*_x$ its virtual degree. According to the pre-adjunction rule for $(F', G')$, the targets of $x$ are chosen from the set $\{(z, N_z) \in N_X \mid d^*_x \le G'((z, N_z))\}$. However, since $(z, N_z)$ is not a node but a pair of a node and a set of nodes, there is some arbitrariness in how the targets are chosen. Here, we adopt the following procedure: $(z, N_z)$ is chosen randomly from the above set, and then $d^*_x$ nodes are chosen randomly from $N_z$. If the size of the chosen $N_z$ is less than $d^*_x$, $x$ connects to all nodes in $N_z$ and the OPA step is completed. In short, the new node $x$ can form new links with existing nodes $z$ whose ability to form links is greater than or equal to that of $x$, and with the neighbors of such $z$.
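The procedure of the new model can be sketched in Python as follows. The sketch assumes that $N_z$ is the set consisting of $z$ and its neighbors, that all random choices are uniform, and that the initial network is two linked nodes. The function name `grow_new_model` is hypothetical and this is not the authors' code.

```python
import math
import random


def grow_new_model(n_nodes, delta, seed=0):
    """Copying degree step followed by the modified OPA step (illustrative sketch)."""
    rng = random.Random(seed)
    # Assumed initial network: two nodes joined by a single link.
    adj = {0: {1}, 1: {0}}
    for x in range(2, n_nodes):
        # Copying degree step (same as in the previous model).
        y = rng.randrange(x)
        d_star = rng.randint(1, math.ceil(delta * len(adj[y])))
        # Choose z among nodes whose evaluated ability is at least d_star.
        eligible = [z for z in range(x)
                    if d_star <= math.ceil(delta * len(adj[z]))]
        z = rng.choice(eligible)
        # Assumption: N_z is z together with its neighbors.
        pool = sorted(adj[z] | {z})
        if len(pool) < d_star:
            targets = pool                 # connect to all nodes in N_z
        else:
            targets = rng.sample(pool, d_star)
        adj[x] = set(targets)
        for w in targets:
            adj[w].add(x)
    return adj
```

Because the targets are drawn from a single neighborhood, a new node that links to both $z$ and a neighbor of $z$ closes a triangle, which is the mechanism discussed in the text for local clustering.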
We expect this link formation rule to have the following effects on the degree−degree correlation and local clustering. Since a neighbor $z'$ of $z$ does not necessarily satisfy $d^*_x \le G'((z', N_{z'}))$, the degree−degree correlation can be weakened compared to that in our previous model, where the inequality is satisfied for all targets of $x$. On the other hand, if the set of chosen targets of $x$ includes both $z$ and its neighbors $z'$, triangles among $x$, $z$, and $z'$ are formed, increasing the degree of local clustering. This mechanism can result in a negative scaling of the local clustering coefficient [29].
The new model can generate dense scale-free networks when $1 \le \delta < e$. In the following, we focus on this parameter range. We carry out an analysis based on the rate equation [30] similar to that in our previous work [11]. We assume that $p_k \sim k^{-\gamma}$ for $1 \ll k < M' t^{1/\gamma}$, where $1 < \gamma \le 2$, $M' > 0$ is a constant, and the upper limit $M' t^{1/\gamma}$ comes from the constraint on the allowed maximum degree for dense scale-free networks [5]. We also assume that the degree of a neighbor of a randomly chosen node is independent of the degree of the chosen node. Under these assumptions, we self-consistently obtain Eq. (1) for $\gamma$ (see "Methods"). In Fig. 3, degree distributions of numerically simulated networks for three different values of $\delta$ ($\delta = 1$, $3/2$, and $\sqrt{2}$) are compared with the theoretical prediction. Here, $t$ denotes the number of nodes and the initial condition is the network with two different nodes and a single link between them. As networks grow, the scale-free regime is enlarged and the slope in the log-log plot agrees with the value obtained from Eq. (1). The number of links $L$ scales as

$$L = \frac{\langle k \rangle t}{2} \sim \begin{cases} t \ln t & (\delta = 1), \\ t^{2/\gamma} & (1 < \delta < e), \end{cases} \qquad (2)$$

where $\langle k \rangle$ is the average degree. In other words, the average degree $\langle k \rangle$ diverges as $\ln t$ and $t^{2/\gamma - 1}$ for $\delta = 1$ and $1 < \delta < e$, respectively. Thus, the generated networks are expected to be dense. Indeed, Fig. 4 compares the result of numerical simulation and Eq. (2) for the number of links $L$, showing that they are consistent.
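The divergence of the average degree stated above can be made plausible by a short heuristic calculation. The following sketch assumes that the power law $p_k \simeq c\,k^{-\gamma}$ may be used over the whole range $1 \le k < M' t^{1/\gamma}$ and that $\delta = 1$ corresponds to $\gamma = 2$; both are simplifying assumptions made here, and the constant $c$ is introduced only for illustration.

```latex
\langle k \rangle \simeq \sum_{k=1}^{M' t^{1/\gamma}} k\,p_k
\sim c \int_{1}^{M' t^{1/\gamma}} k^{1-\gamma}\,dk
= \begin{cases}
c \ln\!\left(M' t^{1/2}\right) \sim \ln t, & \gamma = 2,\\[4pt]
c\,\dfrac{\left(M' t^{1/\gamma}\right)^{2-\gamma} - 1}{2-\gamma} \sim t^{2/\gamma - 1}, & 1 < \gamma < 2,
\end{cases}
\qquad
L = \frac{t\,\langle k \rangle}{2} \sim
\begin{cases}
t \ln t, & \gamma = 2,\\
t^{2/\gamma}, & 1 < \gamma < 2.
\end{cases}
```

The last relation uses only the fact that each link contributes to the degree of two nodes.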
The numerical results for the degree correlation function $k_{nn}(k)$ and the local clustering coefficient $C(k)$ are shown in Figs. 5 and 6, respectively. The average of $k_{nn}(k)$ is almost constant and thus is consistent with the assumption of the above rate equation analysis. $C(k)$ tends to decrease as $k$ increases, which is the opposite of the trend of $C(k)$ in the networks generated by our previous model. The behavior of $k_{nn}(k)$ and $C(k)$ is consistent with our expectation from the link formation rule discussed above.
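For readers who wish to reproduce such measurements, the two quantities can be computed from an adjacency dictionary as sketched below. The function name `knn_and_clustering` and the input format `{node: set of neighbors}` are assumptions; the definitions follow the ones given in "Background" (average neighbor degree, and the fraction of linked pairs among neighbors, each averaged over nodes of degree $k$).

```python
from collections import defaultdict
from itertools import combinations


def knn_and_clustering(adj):
    """Return (knn, clust): dictionaries mapping degree k to the averages of
    the mean neighbor degree and of the local clustering coefficient."""
    knn_sum = defaultdict(float)
    c_sum = defaultdict(float)
    count = defaultdict(int)
    for v, nbrs in adj.items():
        k = len(nbrs)
        if k == 0:
            continue  # isolated nodes have no neighbors to average over
        count[k] += 1
        knn_sum[k] += sum(len(adj[u]) for u in nbrs) / k
        if k >= 2:
            linked = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
            c_sum[k] += 2.0 * linked / (k * (k - 1))
    knn = {k: knn_sum[k] / count[k] for k in count}
    clust = {k: c_sum[k] / count[k] for k in count if k >= 2}
    return knn, clust
```

The clustering loop visits all neighbor pairs of every node, so for dense networks it is best applied to moderately sized samples.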

Discussion
In this paper, we apply category theory for building a new growing network model that can generate dense scale-free networks. The proposed model is constructed through a modification of our previous model while preserving the duality associated with it. Both our previous and proposed models can generate dense scale-free networks. However, their higher-order network structures are different: Those generated by the former have a positive degree−degree correlation and a positive scaling of local clustering coefficient, while those generated by the latter have an almost neutral degree−degree correlation and a negative scaling of local clustering coefficient.
In Section "Analysis of the previous model", we observed that the adjunction rule for link formation does not generate power-law degree distributions. In the pre-adjunction rule adopted in our previous and proposed models, a kind of fluctuation is introduced, which is crucial for generating dense scale-free networks: taking minimum elements of a set is replaced by a random choice from the set. Formally, such incorporation of randomness can be readily extended to category-theoretic limits or colimits, which are generalizations of minimum or maximum elements of a subset of a preordered set [16,17]. Whether such an extension of the pre-adjunction rule is meaningful in different mathematical models will be investigated elsewhere.
The category-theoretic duality of nodes' degree described in Section "Results" is mathematically rather trivial. However, we have shown that it guides the construction of a non-trivial mathematical model of growing networks. In previous works by one of the authors [31,32], category theory was applied to analyzing the structure of static networks. In this paper, we have presented a novel kind of application of category theory, namely, the design of a dynamic network model. We hope that such an extended application of category theory leads to a deeper understanding of the mathematical structures of network models.

Methods
Preordered sets and Galois connections. For the material in this section, we refer to Chapter 1 of Fong and Spivak [18].
Let $X$ be a set. A preorder on $X$ is a binary relation $\le_X \subseteq X \times X$ satisfying the following two conditions: (i) $x \le_X x$ for all $x \in X$ (reflexivity), and (ii) if $x \le_X y$ and $y \le_X z$, then $x \le_X z$ for all $x, y, z \in X$ (transitivity). A set $X$ equipped with a preorder $\le_X$ is called a preordered set and is denoted by $(X, \le_X)$. Two elements $x, y \in X$ are called equivalent when both $x \le_X y$ and $y \le_X x$ hold. The preorder $\le_X$ is called a total preorder when $x \le_X y$ or $y \le_X x$ holds for all $x, y \in X$.
Let $(X, \le_X)$ and $(Y, \le_Y)$ be preordered sets. A map $F : X \to Y$ is called a preorder-preserving map when $F$ preserves the preorder, namely, if $x_1 \le_X x_2$ then $F(x_1) \le_Y F(x_2)$ for all $x_1, x_2 \in X$. Let $F : X \to Y$ be a preorder-preserving map. If there exists a preorder-preserving map $G : Y \to X$ satisfying $G \circ F = \mathrm{id}_X$ and $F \circ G = \mathrm{id}_Y$, where $\mathrm{id}_X$ and $\mathrm{id}_Y$ are the identity maps on $X$ and $Y$, respectively, then $F$ is called an isomorphism between $(X, \le_X)$ and $(Y, \le_Y)$. Here, $G$ is also an isomorphism and is denoted by $G = F^{-1}$.
Let $F : X \to Y$ and $G : Y \to X$ be preorder-preserving maps for preordered sets $(X, \le_X)$ and $(Y, \le_Y)$. The pair $(F, G)$ is called a Galois connection or an adjunction between $(X, \le_X)$ and $(Y, \le_Y)$ when $F(x) \le_Y y$ is equivalent to $x \le_X G(y)$ for all $x \in X$ and $y \in Y$.
Let $(X, \le_X)$ be a preordered set and $Z \subseteq X$ a subset. An element $z^* \in Z$ is called a minimum element of $Z$ when $z^* \le_X z$ for all $z \in Z$.
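Let $(F, G)$ be a Galois connection as above. It holds that, for each $x \in X$, $F(x)$ is a minimum element of $Z := \{y \in Y \mid x \le_X G(y)\}$. Indeed, $F(x) \le_Y F(x)$ implies $x \le_X G(F(x))$, so $F(x) \in Z$. We also have, for any $y \in Z$, $x \le_X G(y)$, which is equivalent to $F(x) \le_Y y$. Thus, $F(x)$ is a minimum element of $Z$.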
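A standard toy example on the integers may make these notions concrete. In the sketch below, $F(n) = \lceil n/3 \rceil$ and $G(m) = 3m$ are chosen purely for illustration (they are not taken from the paper), and the equivalence "$F(x) \le y$ if and only if $x \le G(y)$" is checked on a finite window. The helper names are hypothetical.

```python
def F(n):
    """Left adjoint in the toy example: ceil(n / 3) in integer arithmetic."""
    return -(-n // 3)


def G(m):
    """Right adjoint in the toy example: multiplication by 3."""
    return 3 * m


def is_galois(F, G, xs, ys):
    """Check F(x) <= y  <=>  x <= G(y) for all x in xs and y in ys."""
    return all((F(x) <= y) == (x <= G(y)) for x in xs for y in ys)


def minimum_of_upper_set(G, x, ys):
    """Minimum element of {y in ys : x <= G(y)}."""
    return min(y for y in ys if x <= G(y))


window = range(-20, 21)
```

On this window, `minimum_of_upper_set(G, x, window)` coincides with `F(x)` for every `x`, which illustrates the statement that $F(x)$ is a minimum element of the set of $y$ with $x \le_X G(y)$.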
Derivation of Eq. (1). Let $1 \le \delta < e$. Let $p_k(t)$ be the fraction of nodes of degree $k$ when the number of existing nodes in a network generated by the proposed model is equal to $t$. In the following, we assume $k > 0$ since we are interested in the regime $k \gg 1$.
The time evolution of $p_k(t)$ follows the rate equation

$$(t + 1)\,p_k(t + 1) = t\,p_k(t) + a_{k-1}(t)\,t\,p_{k-1}(t) - a_k(t)\,t\,p_k(t) + b_k(t), \qquad (3)$$

where $a_k(t)$ is the probability that an existing node of degree $k$ gets a link from a new node $x$ when the number of existing nodes is $t$, and $b_k(t)$ is the probability that the newly added node has degree $k$. Let $q_k(t)$ be the probability that $d^*_x = k$. We have

$$q_k(t) = \sum_{k \le \lceil \delta l \rceil < t} p_l(t) \times \frac{1}{\lceil \delta l \rceil}, \qquad (4)$$

and Let $N_k(t) := t \sum_{k \le \lceil \delta l \rceil < t} p_l(t)$. Let $d_k(t)$ be the probability that $N_z$ for a specific node $z$ of degree $k$ is chosen and a specific node in $N_z$ is chosen as the target for a link from the new node $x$. $d_k(t)$ is given by Let us assume that the degree of a neighbor of a randomly chosen node is independent of the degree of the latter node. Then, the probability that the former is of degree $k'$ given that the latter is of degree $k$, denoted by $p(k'|k)$, does not depend on $k$ and is given by $p(k'|k) = k' p_{k'}(t) / \langle k' \rangle$. Using this, we obtain Now, we assume that $p_k(t) \simeq c k^{-\gamma}$ for $1 \ll k < M t^{1/\gamma}$ and $t \gg 1$, where $c, M > 0$ are appropriate constants and $1 < \gamma \le 2$. We have