Application of hyperbolic geometry in link prediction of multiplex networks

Multilayer networks have recently been introduced to model real systems. In these models, individuals make connections in multiple layers. Transportation networks, biological systems and social networks are some examples of multilayer networks. There are various link prediction algorithms for single-layer networks, and some of them have recently been extended to multilayer networks. In this manuscript, we propose a new link prediction algorithm for multiplex networks using two novel similarity metrics based on the hyperbolic distance of node pairs. We use the proposed methods to predict spurious and missing links in multiplex networks. Missing links are those links that may appear in the future evolution of the network, while spurious links are existing connections that are unlikely to appear if the network is evolving normally. One may interpret spurious links as abnormal links in the network. We apply the proposed algorithm to real-world multiplex networks, and the numerical simulations reveal its superiority over the state-of-the-art algorithms.

based on interlayer and intralayer information to solve the missing link prediction problem in multiplex networks. Hajibagheri et al. 18 proposed a holistic method considering the information of all layers simultaneously in link prediction of a target layer in multiplex networks. Also, Guimerà et al. used Stochastic Block Models to predict missing and spurious links in noisy networks. Zeng et al. 19 studied the impact of spurious link identification methods on distortion of networks' structure and dynamics. In another study, Zhang et al. 20 measured the inter-similarity using the local diffusion processes in bipartite networks. Samei et al. 21 proposed a method to identify spurious links in multiplex networks. In fact, they proposed a method to employ interlayer information to improve the performance of spurious link prediction in the target layer.
In the context of the hyperbolic geometry of networks, Krioukov et al. 22 introduced the mapping of networks to hyperbolic space. They used the underlying hyperbolic geometry to study the functionality and structure of complex networks, and showed that strong clustering and heterogeneous degree distributions are natural reflections of the negative curvature and other properties of the hyperbolic geometry of complex networks. Papadopoulos et al. 23 then studied the impact of popularity and similarity on network growth. They developed a framework suggesting that new connections are made between node pairs by optimizing a trade-off between popularity and similarity. In another work, Papadopoulos et al. 24 presented the HyperMap method to map a network to its underlying hyperbolic space and used the hyperbolic distance as a similarity measure to solve the link prediction problem. Other methods have also been introduced to infer the hidden geometry of complex networks 6,25 . Recently, Muscoloni et al. 26,27 introduced a nonuniform popularity-similarity optimization (N-PSO) model. This model was used to predict missing links using the community structure of the networks, which improved the performance of the link predictors significantly. Muscoloni et al. also proposed an intelligent machine to infer the network hyperbolic geometry based on an "angular coalescence" phenomenon 28 . A minimum curvilinear automaton has recently been proposed to embed networks in hyperbolic geometry and has been used for link prediction 29 .
In this paper, our proposed similarity indices, based on the hyperbolic geometry of the network, use both intralayer and interlayer information to solve the spurious and missing link prediction problems in multiplex networks. The experimental results on four single-layer synthetic networks and six real multiplex networks show that the performance improves when the hyperbolic-based methods are used and the node-pair similarity measures are computed considering both intralayer and interlayer information.

Consider a multiplex network with M layers, in which $G^{\alpha} = (V^{\alpha}, E^{\alpha})$ represents the network of layer α with $V^{\alpha}$ as the set of nodes and $E^{\alpha}$ as the set of links 11,12,30 . We can assume $A^{\alpha} = [a_{ij}^{\alpha}]$ as the adjacency matrix of each layer $G^{\alpha}$, where for 1 ≤ α ≤ M and 1 ≤ i, j ≤ $|V^{\alpha}|$ 31 :
\[ a_{ij}^{\alpha} = \begin{cases} 1 & \text{if nodes } i \text{ and } j \text{ are connected in layer } \alpha \\ 0 & \text{otherwise} \end{cases} \]
In the context of unsupervised link prediction, many similarity measures are defined to find the likelihood of link existence between each node pair (i, j). In multiplex networks, the similarity score in layer α is denoted by $s_{ij}^{\alpha}$. After computing the similarity scores for all potential node pairs in each layer, a ranking method can be used to choose the top-ranked pairs, which have a higher chance of making a connection. The key issue is how to calculate the similarity scores based on the known topology of the networks. Recent studies have shown that the structures of the layers in multiplex networks are mostly dependent 32,33 . Hence, one of the main challenges in solving the link prediction problem in multiplex networks is to find an appropriate similarity measure that can benefit from the relevant information of all layers 30 . Based on this, we use both the information of the target layer (intralayer information) and of the other layers (interlayer information) and combine them based on layer relevance to improve the performance of link prediction compared with single-layer methods.
In the case of missing link prediction, the goal is to estimate the probability of existence of non-observed links based on the current topology of the network and the available node features in network G(V, E). Since the missing links are not known, we assume that a fraction of the observed links E is missing and the goal of link prediction is to identify them. To do so, in each iteration a fraction of the observed links E is removed based on the k-fold decomposition method and the proposed methods are supposed to predict them. In the case of identifying spurious links, the task is to evaluate whether the observed links are reliable enough based on the current topology of the network. To do so, in each iteration, some nonexistent links are randomly added to the link set and the proposed methods are supposed to identify them 34 . Precision is used here to quantify the accuracy of a link prediction method and is defined as:
\[ \mathrm{Precision} = \frac{|TP|}{|TP| + |FP|} \]
where |TP| is the number of positive predictions that are correct and |FP| is the number of positive predictions that are wrong.
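The evaluation protocol for missing links can be sketched in a few lines of Python. This is only an illustrative sketch: the function names (`k_fold_edges`, `precision`, `missing_link_precision`) and the convention of returning as many predictions as there are hidden links are assumptions made for the example, not details taken from the experiments reported here. The spurious-link case is analogous, with random nonexistent links added instead of observed links removed.

```python
import random
from itertools import combinations

def k_fold_edges(edges, k, seed=0):
    """Shuffle the observed links and split them into k disjoint folds."""
    shuffled = sorted(edges)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]

def precision(predicted, positives):
    """|TP| / (|TP| + |FP|) over the predicted node pairs."""
    predicted = list(predicted)
    if not predicted:
        return 0.0
    tp = sum(1 for pair in predicted if pair in positives)
    fp = len(predicted) - tp
    return tp / (tp + fp)

def build_adjacency(nodes, edges):
    """Neighbor sets of an undirected graph."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def missing_link_precision(nodes, edges, score, k=5, seed=0):
    """Average precision over k folds; score(adj, i, j) is any similarity index."""
    edges = {tuple(sorted(e)) for e in edges}
    results = []
    for fold in k_fold_edges(edges, k, seed):
        hidden = set(fold)                      # links to be recovered
        train = edges - hidden                  # observed topology
        adj = build_adjacency(nodes, train)
        candidates = [p for p in combinations(sorted(nodes), 2) if p not in train]
        ranked = sorted(candidates, key=lambda p: score(adj, *p), reverse=True)
        # Assumption: predict as many links as were hidden in this fold.
        results.append(precision(ranked[:len(hidden)], hidden))
    return sum(results) / len(results)
```

Any similarity index with the signature `score(adj, i, j)` can be plugged in, for example a common-neighbors count `lambda adj, i, j: len(adj[i] & adj[j])`.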
node similarity index. Description of the similarity indices is given in the following.
Existing measures.
• Preferential Attachment (PA): This index is based on the node degrees and, for each node pair i and j, is defined as:
\[ s_{ij}^{PA} = |\Gamma_i| \cdot |\Gamma_j| \]
where $|\Gamma_i|$ indicates the number of neighbors of node i.
• Common Neighbors (CN): For each node pair i and j, this index counts the number of neighbors that are common between them and is based on the assumption that node pairs with more common neighbors are more likely to make a connection. It is defined as:
\[ s_{ij}^{CN} = |\Gamma_i \cap \Gamma_j| \]
• CAR: This measure considers both the common neighbors of each node pair and the number of connections between the common neighbors, and is computed as:
\[ s_{ij}^{CAR} = s_{ij}^{CN} \cdot s_{ij}^{LCL} \]
where $s_{ij}^{CN}$ is the number of common neighbors of (i, j) and $s_{ij}^{LCL}$ is the number of links between nodes in the common neighbors set 35 .
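• CJC: This measure is a modified version of the Jaccard measure and is defined as:
\[ s_{ij}^{CJC} = \frac{s_{ij}^{CAR}}{|\Gamma_i \cup \Gamma_j|} \]
where $s_{ij}^{CAR}$ is the CAR similarity measure defined above.
• Hyperbolic distance (HP): The hyperbolic coordinates of the nodes are inferred by maximizing the likelihood
\[ \mathcal{L} = \prod_{i<j} p(x_{ij})^{a_{ij}} \, [1 - p(x_{ij})]^{1 - a_{ij}} \]
where the product is computed over all node pairs i, j, $a_{ij}$ is the corresponding entry of the adjacency matrix of the layer, and $x_{ij}$ is the hyperbolic distance between pair i, j:
\[ x_{ij} = \operatorname{arccosh}\left( \cosh r_i \cosh r_j - \sinh r_i \sinh r_j \cos \Delta\theta_{ij} \right) \approx r_i + r_j + 2 \ln \frac{\Delta\theta_{ij}}{2} \]
with $\Delta\theta_{ij}$ the angular distance between the two nodes, and $p(x_{ij})$ is the Fermi-Dirac connection probability:
\[ p(x_{ij}) = \frac{1}{1 + e^{(x_{ij} - R)/(2T)}} \]
where R ~ ln N. The estimated radial coordinate of node i is based on its degree in the network ($k_i$) via $r_i$ ~ ln N − ln $k_i$. Therefore, if node degrees are correlated in different layers, so will be the radial coordinates 36 .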
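As an illustration of the existing measures, the following Python sketch computes PA, CN, CAR and CJC from neighbor sets, together with the hyperbolic distance and the Fermi-Dirac connection probability. The closed forms used for CAR (CN times the number of links among common neighbors), CJC (CAR divided by the size of the neighbor union), the hyperbolic law of cosines and the probability 1/(1 + exp((x − R)/(2T))) are the standard forms from the literature and are assumed here; the function names and the `adj` dictionary of neighbor sets are hypothetical choices made for the example.

```python
import math

def pa(adj, i, j):
    """Preferential attachment: product of the two node degrees."""
    return len(adj[i]) * len(adj[j])

def cn(adj, i, j):
    """Common neighbors: size of the shared neighborhood."""
    return len(adj[i] & adj[j])

def lcl(adj, i, j):
    """Number of links between the common neighbors of i and j."""
    common = adj[i] & adj[j]
    return sum(1 for u in common for v in adj[u] if v in common and u < v)

def car(adj, i, j):
    """CAR, assumed here as CN multiplied by LCL."""
    return cn(adj, i, j) * lcl(adj, i, j)

def cjc(adj, i, j):
    """CJC, assumed here as CAR normalized by the size of the neighbor union."""
    union = adj[i] | adj[j]
    return car(adj, i, j) / len(union) if union else 0.0

def hyperbolic_distance(r_i, theta_i, r_j, theta_j):
    """Hyperbolic law of cosines for two points given in polar coordinates."""
    dtheta = math.pi - abs(math.pi - abs(theta_i - theta_j) % (2 * math.pi))
    arg = (math.cosh(r_i) * math.cosh(r_j)
           - math.sinh(r_i) * math.sinh(r_j) * math.cos(dtheta))
    return math.acosh(max(arg, 1.0))  # clamp guards against rounding below 1

def fermi_dirac(x, R, T):
    """Connection probability as a function of hyperbolic distance x."""
    return 1.0 / (1.0 + math.exp((x - R) / (2.0 * T)))
```

For two nodes on the same ray (equal angles) the distance reduces to the difference of their radial coordinates, and at x = R the connection probability equals 1/2, which gives two quick sanity checks for an implementation.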
Proposed measures. Node degree or popularity plays an important role in defining the similarity measures and many of them are based on common neighbors and preferential attachment. The underlying principle behind preferential attachment is that new connections are mainly made to more popular nodes. However, Papadopoulos et al. 23 showed that popularity is just one aspect of attractiveness, while similarity could be considered as another aspect. They developed a framework where new connections consider a trade-off between popularity and similarity.
www.nature.com/scientificreports
We know that the degree distribution of many real networks follows a power law. However, as can be seen in Table 1, the degree distribution of the small multiplex networks does not follow a power law. Previous experimental results indicated that HP performed better in networks with a power-law degree distribution and worse in those without one 7 . The likely reason is the way the nodes' radial coordinates are calculated: one of the parameters that the HyperMap method uses to estimate the radial coordinates is the power-law exponent of the network. Therefore, if the network does not have a power-law degree distribution, the link prediction accuracy of HP decreases. To overcome this shortcoming and benefit from both the popularity and similarity features of nodes, we propose two approaches that are detailed in the following.

• Weighted Common neighbors (WCN):
We generate a weighted version of CN that computes the weight of common neighbors considering their hyperbolic distance to the target node pair. There are some studies about converting the original similarity measures to weighted ones; however, it has been shown that such a conversion may reduce the prediction performance 37 . The pseudo-code of the proposed method is as follows:
1. Approximate the hyperbolic coordinates of each node.
2. Compute the matrix H of hyperbolic distances of the existing links in the network.
3. h = average of H.
4. Γ ij = list of common neighbors of node pair (i, j) in the test list of missing or spurious link prediction.
5. For each common neighbor k in Γ ij : if the hyperbolic distance of node pair (i, k) is less than h, node pair (i, k) is a strong tie and has more weight (a fraction of the hyperbolic distance is added to the weight of that common neighbor); otherwise, node pair (i, k) is a weak tie and takes the weight as in CN: WCN(i, j) = WCN(i, j) + 1.
6. Repeat step 5 for node pair (k, j).
7. Sort all links in the test list in decreasing (for missing link prediction) or increasing (for spurious link prediction) order of WCN score.
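The WCN procedure can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the node coordinates are taken as already embedded and are accessed through a user-supplied `dist(u, v)` function, and the extra weight given to a strong tie, `1 + (h - d) / h`, is a hypothetical choice made only for illustration, since the exact strong-tie weight is not specified in the pseudo-code above. The function names are likewise illustrative.

```python
def wcn_scores(adj, dist, test_pairs, strong_weight=None):
    """Weighted common neighbors for the node pairs in test_pairs.

    adj        : dict mapping each node to the set of its neighbors
    dist       : callable (u, v) -> hyperbolic distance of the embedded nodes
    test_pairs : iterable of candidate node pairs (i, j)
    """
    if strong_weight is None:
        # Hypothetical strong-tie weight: closer than average earns more than 1.
        strong_weight = lambda d, h: 1.0 + (h - d) / h

    # Steps 2-3: average hyperbolic distance over the existing links.
    link_dists = [dist(u, v) for u in adj for v in adj[u] if u < v]
    if not link_dists:
        raise ValueError("the network has no links")
    h = sum(link_dists) / len(link_dists)

    scores = {}
    for i, j in test_pairs:
        total = 0.0
        for k in adj[i] & adj[j]:          # step 4: common neighbors
            for end in (i, j):             # steps 5-6: ties (i, k) and (k, j)
                d = dist(end, k)
                total += strong_weight(d, h) if d < h else 1.0
        scores[(i, j)] = total
    return scores

def rank_pairs(scores, task="missing"):
    """Step 7: decreasing order for missing links, increasing for spurious."""
    return sorted(scores, key=scores.get, reverse=(task == "missing"))
```

Note that, following steps 5 and 6 literally, each common neighbor contributes once for the tie (i, k) and once for the tie (k, j), so a pair whose common neighbors are all weak ties scores twice its CN value.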

• Ranking CN and HP (CN-HP)
This method benefits from the advantages of both the CN and HP measures. It uses a ranking method to combine the predictions given by these two measures. To do so, one of the well-known classical rank aggregation methods, Borda's method, is used 38 . It is based on the absolute positioning of the ranked elements rather than their relative rankings. A Borda score for each element is calculated based on its ranking in each list. For a set of full lists L = [L 1 , L 2 , L 3 , …, L n ], the Borda score for element x and list L k is given by:
\[ B_{L_k}(x) = \left| \{\, y \in L_k : L_k(y) < L_k(x) \,\} \right| \]
where $L_k(y)$ denotes the rank value of element y in list $L_k$, so that $B_{L_k}(x)$ counts the elements ranked below x, and the total Borda score of element x is:
\[ B(x) = \sum_{k=1}^{n} B_{L_k}(x) \qquad (12) \]
The advantage of the Borda ranking method is that we can aggregate different kinds of measures with different categories and values and obtain a rank-based score. The computational complexity of this method is linear; however, it does not satisfy the Condorcet criterion. In the proposed method, two lists of CN and HP scores for all node pairs are constructed, and the final score for each node pair is computed by aggregating its ranking scores using the Borda method, i.e. in the case of missing/spurious link prediction, the aggregated scores of CN and HP of all node pairs are computed based on Eq. (12) and are sorted in descending/ascending order. The top-k elements of the final list are the predicted links (k is the number of expected missing/spurious links).
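A minimal Python sketch of the rank aggregation step is given below. It assumes that both score tables cover the same node pairs, that a larger CN score and a smaller hyperbolic distance both indicate a more likely link, and that ties are broken by the (stable) sort order; the function names are illustrative only.

```python
def borda_scores(ranked):
    """Borda score of each element of a best-first list:
    the number of elements ranked below it."""
    n = len(ranked)
    return {x: n - 1 - position for position, x in enumerate(ranked)}

def aggregate_cn_hp(cn_score, hp_distance):
    """Total Borda score of each node pair over the CN list and the HP list."""
    pairs = list(cn_score)
    by_cn = sorted(pairs, key=lambda p: cn_score[p], reverse=True)  # more common neighbors first
    by_hp = sorted(pairs, key=lambda p: hp_distance[p])             # smaller distance first
    b_cn, b_hp = borda_scores(by_cn), borda_scores(by_hp)
    return {p: b_cn[p] + b_hp[p] for p in pairs}

def predict(total, k, task="missing"):
    """Top-k pairs: descending total score for missing links, ascending for spurious."""
    return sorted(total, key=total.get, reverse=(task == "missing"))[:k]
```

Because only positions enter the score, CN counts and hyperbolic distances can be combined without rescaling either of them, which is the property of Borda aggregation used in the method.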
Impact of layer relevance. In the case of HP as a similarity measure, the procedure of mapping each layer to its hyperbolic space can be done in different directions. One direction would be to jointly embed the different layers of a given multiplex and infer single radial and angular coordinates for each node. A second direction would be to aggregate the different layers using different operations such as those proposed in 39 , and then embed the aggregated network to infer single coordinates for nodes. Finally, a third direction would be to infer the node coordinates in each layer independently as considered here.
As mentioned above, we map each layer of each real multiplex to its hyperbolic space using the HyperMap method 6,24 . The method takes the network adjacency matrix and the network parameters T and γ, and then approximates the angular and radial coordinates of all nodes in the network. Parameter γ is the exponent of the power-law degree distribution, which is approximated separately for each layer using the method introduced by Clauset et al. 40 , and T is the temperature. To estimate the values of T, the Nonuniform Popularity × Similarity Optimization (N-PSO) model is used 27 . The N-PSO model grows synthetic complex networks and is equivalent to the hyperbolic H 2 model. The inputs to this model are the final network size N, the average node degree k, the power-law exponent γ and the temperature T. The N-PSO model is used to construct synthetic networks with the same size N, average degree k and power-law exponent γ as the layer, using different values for T. The estimated values of T are then the values that best match the degree distribution and average clustering between the layer and the corresponding synthetic network.
In order to test whether the hyperbolic distance is a good measure for link prediction, we bin the hyperbolic distances of all node pairs and compute the probability of the existence of a link between the pairs in each bin. To this end, the hyperbolic distances of all node pairs are first sorted in ascending order and divided into k bins. Bin b i contains the node pairs with hyperbolic distance in the range [d i , d i+1 ]. Then, the probability p i of having a link between the node pairs of each bin is computed based on the network topology. The results are shown in Fig. 1. As shown, the probability of a link existing between a node pair decreases as their hyperbolic distance increases. Two nodes have a smaller hyperbolic distance the more popular or the more similar to each other they are, in which case the probability of a connection between them increases. Thus, this measure can be a candidate similarity score for the link prediction problem. It is worth noting that the behavior of different layers is almost the same in all multiplex networks, and is more similar in the bigger networks, including Rattus and SacchPomb.
In order to employ the interlayer information, for node pair (i, j) in target layer α, we first calculate its similarity within each layer based on the proposed methods above. To compare the prediction performance of the proposed framework, we use different measures for quantifying the relevance between layers, including link overlap, Pearson correlation, Spearman correlation and hyperbolic angular correlation 12 . The results show that the link overlap has the best effect on the link prediction performance. It is defined in the following.
• Link Overlap (LO): This measure identifies the ratio of common links in two layers, i.e. if α and β are two layers in a multiplex network, LO is the fraction of node pairs that are connected in both layers α and β and is defined as:
In the combined similarity score of Eq. (14), $s_{ij}^{\alpha}$ is the similarity index of target layer α and $s_{ij}^{\beta}$ is the similarity index of any other layer β. $\mu^{\alpha\beta}$ represents the correlation between layers α and β (link overlap), which can be interpreted as the weight of the interlayer information contributed by layer β to link prediction in layer α, and η is a tunable parameter.
The correlations between different layers are shown in Fig. 2. As shown, for all networks the link overlap correlation between different layers is positive. Furthermore, the highest layer relevance belongs to the Vicker network and the lowest relevance belongs to the larger and sparser networks. Our experiments show that LO is mostly consistent with the other correlation metrics, but it has the most positive effect on the extent to which the interlayer information can improve the link prediction performance.
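The role of the layer relevance can be illustrated with a short Python sketch. Both formulas in it are assumptions made for the example and should not be read as the exact expressions used in this work: the link overlap is computed as the number of shared links divided by the number of links present in either layer, and the multiplex score is taken as a convex combination, (1 − η) times the target-layer score plus η times the overlap-weighted sum of the scores in the other layers. The function names and data layout are hypothetical.

```python
def link_overlap(edges_a, edges_b):
    """Assumed form of LO: shared links divided by links present in either layer."""
    a = {tuple(sorted(e)) for e in edges_a}
    b = {tuple(sorted(e)) for e in edges_b}
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def multiplex_scores(layer_scores, target, overlap, eta):
    """Assumed combination of intralayer and interlayer similarity.

    layer_scores : dict layer -> dict node pair -> similarity score
    target       : name of the target layer (alpha)
    overlap      : dict layer -> link overlap with the target layer (mu)
    eta          : tunable weight of the interlayer information, in [0, 1]
    """
    others = [layer for layer in layer_scores if layer != target]
    combined = {}
    for pair, s_target in layer_scores[target].items():
        inter = sum(overlap[b] * layer_scores[b].get(pair, 0.0) for b in others)
        combined[pair] = (1.0 - eta) * s_target + eta * inter
    return combined
```

With η = 0 the sketch falls back to the single-layer score of the target layer, which corresponds to the intralayer-only setting used as the baseline in the comparisons.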

Results
We perform experiments on four single-layer synthetic networks to evaluate the similarity measures and on six real multilayer networks to investigate the impact of interlayer information. The synthetic networks are grown with the N-PSO model described above; their structural features and the precision of the spurious and missing link prediction methods are presented in Figs 3 and 4. In the N-PSO model, the true node coordinates are generated for the networks. In the case of missing link prediction, we remove a fraction of edges using k-fold decomposition in each iteration and regenerate the node coordinates of the new network using the HyperMap method. Similarly, in the case of spurious link prediction, we add a fraction of nonexistent links to the network in each iteration and regenerate the node coordinates of the new network using the HyperMap method. There is no restriction in selecting the parameters of the N-PSO model. It is preferable to generate networks with features that are close to those of real networks (large and sparse with a power-law degree distribution), and the temperature is chosen to be 0.3 and 0.6. Since the two parameters γ and T of HyperMap are set manually, the approximation of the hyperbolic coordinates is more accurate in synthetic networks.
For these reasons, as can be seen, in all cases the performance of hyperbolic distance (HP) is better than that of the other existing measures, and the precision of the proposed methods (Weighted Common Neighbors (WCN) and Rank-CN-HP) is the highest in most cases. Thus, hyperbolic distance and its derived methods can be good choices as similarity measures for link prediction.
As real multilayer networks, we consider six networks (see Table 1). The multilayer networks are converted to multiplex networks by assuming that all layers have the same number of nodes (the maximum number of nodes of all layers). These networks are described as follows: (1) Vicker 41 : It is a 3-layer multiplex network with 29 nodes representing the students of a school in Australia.
The layers are defined as the contact relationship, co-working and best friends.
The experimental results of the proposed link prediction methods on the six real networks are presented in this section. For each multiplex network, the layer with the highest density is chosen as the target layer. In the case of missing link prediction, 15% of the links in the target layer are considered to be hidden and, based on the k-fold decomposition method, the performance of the similarity measures is examined over 20 independent experiments. For spurious link prediction, random links are added to the network and the performance of the similarity measures is examined over 20 independent experiments. In order to evaluate the impact of employing the layer relevance and the extra information of other layers, we separately study the performance of the algorithms in the single-layer setting (when only information of the target layer is considered) and the multiplex setting (when interlayer information is also considered). We employ the layer correlation based on link overlap and compute the similarity measures based on Eq. (14).
Figure 5 shows the precision of missing link prediction for the different similarity measures. For each measure there are two bars. The left bar shows the performance of the similarity measure when only the intralayer information of the target layer is considered, and the right bar shows its performance when both intralayer and interlayer information are used. As can be seen, in all cases incorporating the interlayer information improves the performance of missing link prediction, and this is most pronounced in CElegans. The proposed similarity measure Rank-CN-HP has the best performance in most cases. Figure 6 shows the performance of the algorithms on spurious link prediction. For this problem, we also consider the cases when only intralayer information of the target layer is considered and when both intralayer and interlayer information are considered.
As can be seen, in most cases, including the interlayer information in the prediction process improves the performance. Furthermore, predictions based on the PA similarity measure have the worst performance in most cases. In contrast to missing link prediction, in this case Rank-CN-HP is not better than CN or HP in some of the networks, but WCN has the best performance in all multiplex networks. Our experiments show that in small networks, in the case of HP, the approximated radial coordinates of nodes in hyperbolic space for both true positive (correctly predicted) links and false negative links are almost in the same range, but the average degree of true positive links is significantly higher than that of the false negatives. This means that the radial coordinates of the nodes, which correspond to their popularity, are not precisely approximated, since the degree distributions of the target layers do not obey a power law. HP therefore mostly represents the similarity of node pairs, and thus it is not expected to have high performance in most of these cases. Combining this similarity measure with CN in different ways helps to overcome the shortcoming of HP in covering the popularity attribute of each node. On the other hand, in large networks, and especially in those with a scale-free degree distribution, approximating the underlying hyperbolic geometry is more precise, but these networks are mostly sparse, and similarity measures such as CN, CAR and CJC may not be quite successful in link prediction. Thus, in such cases combining the popularity-based measures with HP can improve the link prediction.
The Rank-CN-HP and WCN methods both use CN as the popularity factor and HP as the similarity factor. The difference is that Rank-CN-HP uses CN and HP independently, i.e. these two measures are first computed independently for each node pair and then combined using the Borda rank aggregation algorithm to obtain a final score that considers both CN and HP with the same weight. In the WCN method, in contrast, we compute the HP distance for the common neighbors of each node pair and compare it with a threshold. If the HP distance is less than the threshold, that node pair is assumed to have a strong tie, i.e. the nodes are more similar to each other, and thus a fraction of the HP distance is added to the weight of that common neighbor; otherwise the weight is computed as in the original CN. Therefore, in the WCN method the two similarity measures depend on each other.

Discussion
In this work, two novel methods based on the hyperbolic geometry of multiplex networks are proposed to discover spurious and missing links. The underlying hyperbolic geometry of complex networks captures two properties of nodes, popularity and similarity, that both play an important role in the link prediction problem. Since the common local similarity measures mostly consider only the node degree (popularity), we suggest enhancing their predictive power by adding the similarity feature to them. In the case of missing link prediction, specifically in social networks, each node is more likely to connect to nodes with similar features (e.g. friends) as well as to popular nodes (influencers). Another hypothesis is that interlayer relevance can be helpful in link prediction. Based on this hypothesis, a method was recently proposed that considered the existing similarity measures in both the target layer and the other layers and combined them via a correlation metric (link overlap) to obtain a multiplex-based similarity measure for spurious link prediction 21 . Building on that research, new measures are proposed here based on the hyperbolic geometry of the network. First, a number of existing similarity measures that are widely used for link prediction are chosen, and then new measures are proposed to solve the spurious and missing link prediction problems. Our experimental results on four synthetic networks and six real-world multiplex networks show that the proposed measures outperform the existing ones in most cases, and that incorporating the interlayer information improves the prediction performance compared with the case in which only intralayer information is considered.