The D-Mercator method for the multidimensional hyperbolic embedding of real networks

One of the pillars of the geometric approach to networks has been the development of model-based mapping tools that embed real networks in their latent geometry. In particular, the tool Mercator embeds networks into the hyperbolic plane. However, some real networks are better described by the multidimensional formulation of the underlying geometric model. Here, we introduce D-Mercator, a model-based embedding method that produces multidimensional maps of real networks in (D+1)-dimensional hyperbolic space, where the similarity subspace is represented as a D-sphere. We used D-Mercator to produce multidimensional hyperbolic maps of real networks and estimated their intrinsic dimensionality in terms of navigability and community structure. Multidimensional representations of real networks are instrumental in the identification of the factors that determine connectivity and in elucidating fundamental issues that hinge on dimensionality, such as the presence of universality in critical behavior.


I. INTRODUCTION
Geometry plays a fundamental role in our understanding of the world and in formulating theories based on geometric principles. In any scientific field, the ability to describe and visualize objects and phenomena with precision is paramount, and geometry serves as a crucial tool for scientific observation, enabling us to perceive, represent, and interpret our surroundings accurately. This transformation of abstract concepts into tangible visualizations facilitates analysis, prediction, and effective communication of scientific ideas. Moreover, measurement lies at the core of scientific inquiry, and geometry provides the framework and the necessary tools for precise quantification.
Within this context, the concept of dimension assumes particular relevance, as geometric properties associated with measurements on a specific system depend on dimensionality. The physical world we inhabit is three-dimensional, and our understanding is rooted in this framework. However, when confronted with systems or phenomena that exist in higher dimensions, we encounter challenges in making sense of them. In such cases, dimensional reduction becomes necessary. Yet, this process carries the risk of losing or distorting information. Hence, it becomes crucial to carefully choose the appropriate dimension for describing the system. The choice of dimensionality plays a pivotal role in preserving the integrity and accuracy of the information we seek to capture and analyze. By selecting the correct dimension, we can ensure that our descriptions and interpretations remain faithful to the complexities of the system under investigation.
Complex networks are amenable to description and modeling using geometric postulates. They can be represented in a simplified and comprehensible geometric framework [1], and the dimensionality question can then be addressed from first principles [2]. One might be tempted to think that these principles could be rooted in explicit geometries underlying real systems. For instance, airport networks, urban networks, and power grids connect geographical locations, and three-dimensional Euclidean space embeds the brain's anatomy. However, in these complex networks, explicit distances explain the tendency of the elements to be linked to each other only to some extent, and a variety of factors related to structural, functional, and evolutionary constraints are also at play.
The discovery of such hidden metric spaces and the understanding of their role have become a major research area, leading to network geometry [23] as a new paradigm within network science. In this context, one of the latest achievements of hyperbolic network geometry has been the discovery that real networks have ultra-low dimensionality and that networks from different domains show unexpected regularities, including tissue-specific biomolecular networks being extremely low dimensional, brain connectomes being close to the three dimensions of their anatomical embedding, and social networks and the Internet requiring slightly higher dimensionality [2].
Nonetheless, previous embedding tools that map network topologies into their latent hyperbolic geometry assumed that the similarity subspace is one-dimensional [24][25][26][27][28][29][30][31][32][33]. Among them, Mercator [33] embeds real networks into the hyperbolic plane on the basis of their congruency with the S^1 model [34], also named H^2 in its purely geometric formulation [4], at the core of the network geometry paradigm. The model explains connectivity in real networks by assuming a one-dimensional spherical similarity space plus a popularity dimension that together define effective distances between nodes in the two-dimensional hyperbolic plane. The likelihood that two nodes are connected decreases with their hyperbolic distance. Mercator uses statistical inference techniques to find the coordinates of the nodes that maximize the congruency between the observed topology and the S^1 model [33,35]. Apart from its accuracy, Mercator has the advantage of systematically inferring not only node coordinates but also global model parameters, and it is able to embed networks with arbitrary degree distributions in reasonable computational time, which makes it competitive for real applications.
Beyond visualization, Mercator maps have been used in a multitude of downstream tasks, including efficient navigation [35][36][37], the detection of modular organization [38,39], the prediction of missing links [26,40], and the implementation of a renormalization group [36,41,42] that brings to light hidden symmetries in the multiscale nature of real networks and enables scaled-down and scaled-up replicas. Other data-driven techniques have been proposed to embed networks in a latent space where connected nodes are kept close to each other [43][44][45][46][47][48][49], but they are not directly comparable as long as their distances are not defined in agreement with the relational and connectivity structure of the network, even if some of them also used the hyperbolic plane [50].
The representations obtained in this way are accurate enough in some cases and for certain applications. However, many real complex networks, despite having an ultra-low dimension, are better represented by similarity subspaces of dimension higher than one [2], often with D = 2 or D = 3. Therefore, embeddings in the most suitable dimension for each system have the potential to describe them without the drawbacks of significant dimensional reduction. Here, we introduce D-Mercator, a model-based embedding method that leverages two different techniques, model-based Laplacian Eigenmaps (LE) and Maximum Likelihood Estimation (MLE), combining them to produce multidimensional maps of real networks in (D+1)-dimensional hyperbolic space, where the similarity subspace is represented as a D-dimensional sphere (D-sphere). We evaluated the quality of the embedding method using synthetic S^D networks. We also produced multidimensional hyperbolic maps of real networks. These maps provide more informative descriptions than their two-dimensional counterparts and reproduce the structure of many real networks more faithfully. Multidimensional representations are instrumental in the identification of factors that determine connectivity in real systems and in addressing fundamental issues that hinge on dimensionality, such as the presence of universality in critical behavior. This makes D-Mercator a qualitative improvement and not a mere quantitative refinement. D-Mercator also allows us to estimate the intrinsic dimensionality of real networks in terms of navigability and community structure, in good agreement with embedding-free estimations [2].

II. RESULTS

D-Mercator is based on the multidimensional formulation of the geometric soft configuration model, the S^D/H^{D+1} model [2,51], which is a multidimensional generalization of the S^1 model [34]. Our approach assumes that real networks are well described by the S^D/H^{D+1} model and can be reverse-engineered to infer the coordinates of the nodes and the parameter β that give the highest congruency with the observed topology.
In the S^D model, a node i is assigned a hidden variable representing its popularity, influence, or importance, denoted κ_i and named hidden degree. It is also assigned a position in the D-dimensional similarity space, chosen uniformly at random and represented as a point on a D-dimensional sphere. The D-sphere is defined as the set of points in (D+1)-dimensional Euclidean space situated at a constant distance R from the origin; each node is therefore assigned a vector v_i ∈ R^{D+1} with ||v_i|| = R. The connection probability between a node i and a node j takes the form of a gravity law:

p_ij = 1 / (1 + χ_ij^β),  with  χ_ij = R ∆θ_ij / (µ κ_i κ_j)^{1/D}.  (1)

The number of nodes in the network is N and, for convenience and without loss of generality, we set the density of nodes on the D-sphere to one, so that

N = [2 π^{(D+1)/2} / Γ((D+1)/2)] R^D.  (2)

The separation ∆θ_ij = arccos(v_i · v_j / R^2) represents the angular distance between nodes i and j in the D-dimensional similarity space. The parameter β > D, named inverse temperature, calibrates the coupling of the network topology with the underlying metric space and controls the level of clustering, which grows with β and goes to zero, in the thermodynamic limit, when β → D^+. Finally, the parameter µ controls the average degree of the network and is defined as

µ = β Γ(D/2) sin(Dπ/β) / (2 π^{1+D/2} ⟨k⟩).  (3)

Hence, given the number of nodes N and the dimensionality D of the similarity subspace, the model is determined by N(D+1)+1 parameters: the hidden variables (κ_i, v_i), i = 1, ..., N, and the parameter β. The hidden degrees can be generated randomly from an arbitrary distribution or taken as a set of prescribed values. The model has the property that the expected degree of a node with hidden variable κ is exactly κ. An illustration of the S^D model for D = 2 is given in Fig. 1. The S^D model can be expressed as a purely geometric model in hyperbolic space, the H^{D+1} model [51], by mapping the expected degree of each node κ_i to a radial coordinate [52]; see the Methods section IV A, where we prove the isomorphism between the S^D and the H^{D+1} model.
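As an illustration of the generative model, the following sketch samples a small S^D network, fixing R by the unit-density condition and µ by the average-degree condition of the standard S^D formulation. All function and variable names are ours for illustration, not D-Mercator's API.

```python
import numpy as np
from math import gamma, pi, sin

def sample_sd_network(kappa, D=2, beta=3.0, seed=0):
    """Sample one network from the S^D model (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    kappa = np.asarray(kappa, dtype=float)
    N = len(kappa)
    # Radius fixing a unit density of nodes on the D-sphere.
    R = (N * gamma((D + 1) / 2) / (2 * pi ** ((D + 1) / 2))) ** (1 / D)
    # mu controls the average degree (beta > D required).
    mu = beta * gamma(D / 2) * sin(D * pi / beta) / (
        2 * pi ** (1 + D / 2) * kappa.mean())
    # Uniform positions on the D-sphere: normalized Gaussian vectors in R^(D+1).
    v = rng.standard_normal((N, D + 1))
    v *= R / np.linalg.norm(v, axis=1, keepdims=True)
    A = np.zeros((N, N), dtype=int)
    for i in range(N):
        for j in range(i + 1, N):
            dtheta = np.arccos(np.clip(v[i] @ v[j] / R**2, -1.0, 1.0))
            chi = R * dtheta / (mu * kappa[i] * kappa[j]) ** (1 / D)
            if rng.random() < 1.0 / (1.0 + chi**beta):  # gravity-law probability
                A[i, j] = A[j, i] = 1
    return A, v, R
```

A network sampled this way has expected degrees close to the prescribed hidden degrees, which is the property exploited by the inference procedure described next.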

A. Multidimensional embedding method
Given a real network with adjacency matrix {a_ij}, the first step in the embedding method requires estimating the nodes' hidden degrees κ_i and the inverse temperature β. This step corrects potential finite-size effects that distort the theoretical correspondence between the expected degree of a node in the S^D model and its hidden degree. Second, the angular coordinates of nodes are inferred using a model-corrected version of LE. Third, the angular coordinates are refined using MLE. Finally, the hidden degrees are readjusted given the newly inferred angular positions.
The estimation of the hidden degrees and of the inverse temperature β is implemented as an iterative process. The initial value of β is chosen randomly between D and D+1, so that the model is in the geometric small-world regime; the quality of the inference does not depend on this initial guess. As initial values for the hidden degrees {κ_i, i = 1, ..., N}, one can use the observed degrees {k_i, i = 1, ..., N} in the original network. The parameter µ is computed from Eq. (3) using the average of the observed degrees ⟨k⟩. The estimation proceeds by adjusting the hidden degrees such that the expected degree of each node in the model matches the observed degree in the original network (see the Methods section IV B 1). Once the hidden degrees are obtained, the theoretical mean of the local clustering coefficient of networks in the S^D ensemble can be evaluated (see the Methods section IV B 2). If its value differs from that of the original network, c, the value of β is adjusted and the process is iterated with the current estimation of the hidden degrees until a predetermined precision is reached.
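The hidden-degree adjustment can be sketched as a damped fixed-point iteration. Here `expected_degree` is a stand-in for the model expectation, which in D-Mercator involves numerical integrals over the S^D connection probability, so this sketch only illustrates the structure of the outer loop, not the actual implementation.

```python
import numpy as np

def fit_hidden_degrees(k_obs, expected_degree, tol=1e-2, max_iter=500, seed=0):
    """Adjust hidden degrees until expected degrees match observed ones (sketch)."""
    rng = np.random.default_rng(seed)
    k_obs = np.asarray(k_obs, dtype=float)
    kappa = k_obs.copy()                      # initial guess: observed degrees
    for _ in range(max_iter):
        k_exp = expected_degree(kappa)        # model's expected degrees
        if np.max(np.abs(k_exp - k_obs)) < tol:
            break
        # random damping factor helps avoid oscillations around the fixed point
        kappa += rng.random(len(kappa)) * (k_obs - k_exp)
        kappa = np.clip(kappa, 1e-10, None)   # hidden degrees stay positive
    return kappa
```

With a toy linear expectation the iteration converges to the exact fixed point, which is the behavior the full method relies on.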
To infer the angular positions of nodes (the vectors v_i), we first find a convenient initial guess using an S^D model-corrected version of LE. The LE method was originally designed for the dimensional reduction of data in Euclidean space [53]: it finds the coordinates of points in R^m given the known distances between pairs of points in R^n, with m ≤ n, by finding a mapping of the set of points {x_i ∈ R^n → y_i ∈ R^m} that minimizes a given loss function. In D-Mercator, the target Euclidean space of the model-corrected version of LE has dimension D+1, and the points to be found, v_i^LE, define the angular positions of network nodes in R^{D+1}. The loss function is

ε = Σ_{i<j} ω_ij |v_i^LE − v_j^LE|^2,

where |v_i^LE − v_j^LE| are Euclidean distances between points i and j in R^{D+1}, and the weights {ω_ij} are chosen so that the technique can be applied to networks congruent with the S^D model.
As in standard LE, each weight ω_ij is a decreasing function of the known Euclidean distance between the nodes but, in contrast, only connected pairs contribute to the loss. An approximation to the "known" distances can be inferred from the network structure by using chord lengths in R^{D+1}, so that the weights are set to

ω_ij = a_ij exp(−d_ij^2 / t),  with  d_ij = 2R sin(⟨∆θ_ij⟩/2),

where ⟨∆θ_ij⟩ is the expected angular distance between nodes i and j, with hidden degrees κ_i and κ_j, in the S^D model, and t is a scaling factor fixed as the variance of all the contributing distances. The set of coordinates {v_i^LE, i = 1, ..., N} that minimizes the loss function corresponds to the solution of the eigenvalue problem of the weighted Laplacian matrix L_ij = I_ij − ω_ij, where I is the diagonal matrix with entries I_ii = Σ_j ω_ij; the j-th coordinate of v_i^LE is given by the i-th component of the j-th Laplacian eigenvector with non-null eigenvalue (the eigenvectors are ordered according to their eigenvalues). Fortunately, for sparse networks, there exist very fast algorithms able to solve the eigenvalue problem of weighted Laplacians [54], so this step of the method is not computationally expensive. Finally, the positions found by solving the eigenvalue problem are normalized so that all points lie on the D-sphere of radius R, i.e., v_i = R v_i^LE / ||v_i^LE||. Note that, since degree-one nodes do not add geometric information, we remove them from the network and add them back once the coordinates of their neighbors are found (see the Methods section IV B 3).
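A minimal numerical sketch of this model-corrected LE step follows. Dense matrices, R = 1, and a precomputed matrix of expected angular separations are simplifying assumptions on our part; D-Mercator uses sparse eigensolvers instead.

```python
import numpy as np

def laplacian_sphere_embedding(A, expected_angle, D=2, R=1.0):
    """Initial spherical coordinates from weighted Laplacian eigenvectors (sketch)."""
    A = np.asarray(A, dtype=float)
    # chord lengths in R^(D+1) corresponding to the expected angular distances
    chord = 2.0 * R * np.sin(expected_angle / 2.0)
    t = np.var(chord[A > 0])                  # scaling factor: variance of distances
    W = np.where(A > 0, np.exp(-chord**2 / max(t, 1e-12)), 0.0)
    L = np.diag(W.sum(axis=1)) - W            # weighted Laplacian L = I - omega
    _, vecs = np.linalg.eigh(L)               # eigenvectors in ascending order
    V = vecs[:, 1:D + 2]                      # skip the trivial null eigenvector
    # normalize every row so all points lie on the D-sphere of radius R
    return R * V / np.linalg.norm(V, axis=1, keepdims=True)
```

Because only connected pairs receive nonzero weights, the eigenproblem stays sparse in practice, which is what keeps this step fast for real networks.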
Using the coordinates inferred by LE as the initial condition, the coordinates in the similarity subspace are fine-tuned by Maximum Likelihood Estimation (MLE) to optimize the probability that the observed network is generated by the S^D model (see the Methods section IV B 4). Nodes are visited sequentially, and new positions are proposed in the vicinity of the mean vector of the node's neighbors. The most favorable proposed position, the one maximizing the local log-likelihood in Eq. (23), is selected, and the process is repeated until the local log-likelihood function reaches a plateau. Notice that the final angular coordinates could be improved further by repeating the refining step taking the previous MLE inference as the initial condition.
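The likelihood-maximization sweep can be sketched as follows. The local log-likelihood below is the standard Bernoulli form implied by the model's connection probability; the proposal rule (Gaussian jitter around the neighbors' mean, a fixed number of candidates) is our simplification of the actual scheme.

```python
import numpy as np

def refine_positions(A, v, kappa, beta, mu, R, D, n_prop=8, sweeps=3, seed=0):
    """MLE refinement pass: move each node to its best candidate position (sketch)."""
    rng = np.random.default_rng(seed)
    A = np.asarray(A)
    v = np.array(v, dtype=float)
    kappa = np.asarray(kappa, dtype=float)
    N = len(v)

    def local_loglik(i, vi):
        dth = np.arccos(np.clip(v @ vi / R**2, -1.0, 1.0))
        chi = R * dth / (mu * kappa[i] * kappa) ** (1 / D)
        p = np.clip(1.0 / (1.0 + chi**beta), 1e-12, 1 - 1e-12)
        mask = np.arange(N) != i              # exclude the node itself
        return np.sum(A[i, mask] * np.log(p[mask])
                      + (1 - A[i, mask]) * np.log(1 - p[mask]))

    for _ in range(sweeps):
        for i in range(N):
            if A[i].sum() == 0:
                continue                      # isolated nodes carry no signal
            center = v[A[i] > 0].mean(axis=0) # proposals near the neighbors' mean
            cands = [v[i]] + [center + 0.3 * rng.standard_normal(D + 1)
                              for _ in range(n_prop)]
            cands = [R * c / np.linalg.norm(c) for c in cands]
            v[i] = max(cands, key=lambda c: local_loglik(i, c))
    return v
```

Since the current position is always among the candidates, each accepted move can only increase the local log-likelihood, so the sweep converges to a plateau.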
The embedding method ends after the hidden degrees are finally readjusted to compensate for deviations from the correspondence between expected and observed degrees, which might have been introduced in the process of estimating the coordinates of nodes in the similarity subspace (see the Methods section IV B 5). An implementation of D-Mercator is publicly available at https://github.com/networkgeometry/d-mercator.
The complexity of Mercator is O(N^2) for sparse networks with N nodes. D-Mercator operates on the basis of the S^D model for an arbitrary value of the dimension D, which requires general equations for inferring the hidden degrees and the parameter β. Some of these equations cannot be solved analytically, and numerical integrations are needed. Therefore, the time complexity of the method increases slightly compared to Mercator, but this only affects prefactors and not the scaling with system size. A detailed comparison of the computational complexity of Mercator and D-Mercator is provided in the Supplementary Information.

B. Validation with synthetic networks

The most stringent way to assess whether a map produced by D-Mercator is reliable consists in testing synthetic networks generated with the S^D model with different topological properties and dimensions. The produced networks can then be embedded with the same dimension used to generate them or with a different one. In this case, the network's ground truth is known, and the accuracy of the embedding can be evaluated with quality measures of congruency. In Fig. 2, we show an example of the capability of D-Mercator to recover the correct coordinates of nodes in synthetic S^2 networks with different values of the exponent γ of the scale-free power-law degree distribution and of the inverse temperature β. Notice that the agreement between the original coordinates and the inferred ones is excellent, with values of the corresponding Pearson correlation coefficient above 0.96. Results for other values of β and γ with dimension D = 3 are reported in Fig. S3, with similar accuracy.
We also checked the inference of the parameter β (see Fig. S4), as well as the agreement between the empirical and theoretical connection probabilities (bottom left panels of Figs. S5-S14), where the empirical connection probability is measured as the fraction of connected pairs as a function of the rescaled distance χ_ij. Again, we found the inference of β to be very precise for all the considered synthetic networks, and the theoretical curve for the connection probability is well recovered. Altogether, these results confirm that D-Mercator is not just a high-fidelity algorithm in terms of the reconstruction of the similarity coordinates of synthetic networks; it also determines correctly all other model parameters, including the hidden degrees and the inverse temperature, and it does so independently of the dimensionality of the network.
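Any embedding of the similarity subspace is recovered only up to a global rotation (and reflection) of the D-sphere, so comparisons like the Pearson correlations above require aligning the inferred coordinates with the ground truth first. A standard construction for this is the orthogonal Procrustes rotation, sketched below; whether the alignment used for the figures is exactly this one is our assumption.

```python
import numpy as np

def align_to_ground_truth(v_inf, v_true):
    """Rotate inferred coordinates onto the true ones (orthogonal Procrustes)."""
    # Q minimizes ||v_inf Q - v_true||_F over orthogonal matrices Q
    U, _, Vt = np.linalg.svd(v_inf.T @ v_true)
    return v_inf @ (U @ Vt)
```

After alignment, per-coordinate correlation coefficients between inferred and original positions become meaningful.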
Next, we show that D-Mercator is able to identify the correct dimension used to generate S^D synthetic networks without prior knowledge of D.

Navigability.
We studied the navigability of D-Mercator maps using greedy routing (GR) [55]. In GR, messages travel the network from a source to a target destination by repeatedly forwarding the message to the neighboring node that is closest to the target in the multidimensional hyperbolic map. In particular, we investigated whether the native dimension of the network, when taken as the dimension of the embedding space, gives the best result as compared to other values. Typically, hyperbolic maps of real networks in D = 1 display high navigability in the region of high clustering and heterogeneous degree distributions [55]. Hence, we generated synthetic networks with a specific dimensionality and these topological characteristics, and obtained hyperbolic maps by embedding them using D-Mercator with different values of the embedding dimension. The performance of greedy routing is evaluated by the fraction of successful attempts, in which messages reach their destination, and by their stretch, defined as the ratio between the hop-lengths of successful greedy paths and the corresponding shortest paths in the network. In Fig. 3(a-d), we show the success rate as a function of the embedding dimension for networks generated in S^1 to S^4. In all cases, the fraction of successful paths varies across D. The corresponding stretch values always remain low and vary consistently, although only slightly, with the success rate, such that the highest success corresponds to the lowest stretch (see Fig. S16). This means that D does not need to be optimized across the two dimensions of success rate and stretch; one can focus on the success rate alone. Strikingly, the success rate is markedly higher when networks are embedded in their native dimension, meaning that geodesics and shortest paths are most congruent in the native dimension. For instance, S^2 networks have the highest success rate when embedded in D = 2, for which the lowest stretch is also observed. This implies that the performance of greedy routing, and in particular the success rate, in maps produced by D-Mercator for different values of the embedding dimension D can help identify the intrinsic dimensionality of real networks.
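Greedy routing itself is simple to state in code. The sketch below drops a message when it revisits a node (a standard looping criterion); `hyp_dist`, standing for the hyperbolic distance in the map, is assumed given.

```python
import numpy as np

def greedy_success_rate(A, hyp_dist, n_pairs=200, seed=0):
    """Fraction of source-target pairs reached by greedy forwarding (sketch)."""
    rng = np.random.default_rng(seed)
    N = len(A)
    neighbors = [np.flatnonzero(A[i]) for i in range(N)]
    success = 0
    for _ in range(n_pairs):
        src, tgt = rng.choice(N, size=2, replace=False)
        cur, visited = src, set()
        while cur != tgt and cur not in visited and len(neighbors[cur]) > 0:
            visited.add(cur)
            # forward to the neighbor geometrically closest to the target
            cur = min(neighbors[cur], key=lambda n: hyp_dist(n, tgt))
        success += (cur == tgt)
    return success / n_pairs
```

The stretch is computed analogously by also counting hops along successful greedy paths and dividing by the shortest-path length.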

Geometric community concentration.
Multidimensional hyperbolic maps of networks are especially convenient for exploring their community structure [56][57][58]. In Mercator maps, geometric communities are defined as regions of the similarity subspace densely populated with richly interconnected nodes [20,39]. Real networks in a variety of domains were found to display geometric communities that correlate well with metadata not informed to the algorithm, for instance world regions in Internet [35] or WTW maps [38], biological pathways in metabolic networks [59], and anatomical brain regions in human connectomes [37].
In synthetic networks, we found that embeddings in dimensions higher than that of the original network are still informative of geometric communities, while the opposite is not true, especially when clustering is moderate or low (see Fig. S17). The intuitive explanation is that it is always possible to find an isometry between a set of points in a metric space and a space of higher dimensionality, whereas the opposite is, in general, not true. Specifically, we generated networks in S^1 with four geometrically localized communities and in S^2 with six geometrically localized communities, and obtained D-Mercator maps using different embedding dimensions. To generate the synthetic communities, we defined spherical caps with apices evenly distributed on the surface of the D-sphere and polar angle ∆θ_T = 0.7. Nodes in each community were then distributed homogeneously within the corresponding cap. The selected value of ∆θ_T organizes nodes into non-overlapping communities. For ∆θ_T > π/4, the communities become mixed and their overlap increases, as shown in Fig. S18.
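Generating such cap-localized communities is straightforward with rejection sampling. For simplicity, this sketch draws the cap apices at random rather than evenly spaced, which is its only deviation from the construction described above.

```python
import numpy as np

def sample_cap_communities(N, D, n_comms, theta_T=0.7, seed=0):
    """Nodes distributed uniformly inside spherical caps on the D-sphere (sketch)."""
    rng = np.random.default_rng(seed)
    apices = rng.standard_normal((n_comms, D + 1))
    apices /= np.linalg.norm(apices, axis=1, keepdims=True)
    v, labels = [], []
    while len(v) < N:
        x = rng.standard_normal(D + 1)         # uniform point on the sphere
        x /= np.linalg.norm(x)
        ang = np.arccos(np.clip(apices @ x, -1.0, 1.0))
        l = int(np.argmin(ang))
        if ang[l] <= theta_T:                  # keep only points inside a cap
            v.append(x)
            labels.append(l)
    return np.array(v), np.array(labels)
```

Larger values of theta_T let neighboring caps overlap, which is how the community mixing mentioned above is introduced.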
To measure how well the communities remain clustered, we measured the geometric concentration of a community l around a node i as

ρ_{i,l} = n_{i,l} / n_{i,g},

where n_{i,l} is the number of nodes of community l among the n_{i,g} considered nodes, and N_l is the total number of nodes in community l. The set of considered nodes can be determined in different ways; here, we take the n_{i,g} geometrically closest neighbors of node i. Notice that, with this definition, the limit n_{i,g} → N implies that ρ_{i,l} goes to N_l/N. To obtain the community concentration of a given network map as a single scalar, ρ, we restricted the computation to the community to which each node belongs and averaged over all nodes in the network. For communities of equal size, n_{i,g} → N then implies ρ → ρ_ran = 1/N_C, where N_C is the number of communities. Finally, we calculated the community concentration as c_C = ρ(n_{i,g} = N/10), i.e., the geometric concentration at the 10% geometrically closest nodes. The results reported in Fig. 3(g) show that D = 1 and D = 2 embeddings of the clustered S^1 synthetic networks display similar geometric concentration of communities, which are clearly discernible visually in the maps for both dimensions, as shown in Fig. 3(e,f). In contrast, when the clustered networks are produced in D = 2, the concentration of communities in one-dimensional maps is clearly worse (see Fig. 3(j)), with some communities split into several chunks scattered all over the circle, as shown in Fig. 3(h). This result illustrates the importance of choosing the most appropriate dimension. Indeed, taking the one-dimensional embedding in this case would result in a very inaccurate quantitative description of the system. Interestingly, despite this result, the validation of the topological properties of the graphs shows that embeddings in both dimensions can largely replicate the properties of the network, such as the degree distribution, clustering spectrum, or average nearest-neighbors degree (Figs. S20-S21). These results explain why one-dimensional embeddings have worked quite well for many real networks, even when their natural dimension is greater than one. However, while one-dimensional maps provide information about the community organization of the network, two-dimensional maps offer a much richer and more accurate representation. Finally, when we increase the value of ∆θ_T, thus introducing mixing between communities, the findings remain consistent (see Fig. S19). Therefore, embedding in the higher dimension is needed to uncover the community structure of the networks.
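Under one natural reading of the definition, with ρ_{i,l} the fraction of a node's n_{i,g} nearest map neighbors that share its community, the concentration score can be computed as below; the 10% cutoff reproduces c_C. Treat this as our sketch of the measure, not the reference implementation.

```python
import numpy as np

def community_concentration(v, labels, frac=0.10):
    """c_C: average own-community fraction among the closest frac of nodes (sketch)."""
    v = np.asarray(v, dtype=float)
    labels = np.asarray(labels)
    N = len(v)
    n_g = max(1, int(frac * N))
    R2 = np.sum(v[0] ** 2)                        # all nodes share the same radius
    ang = np.arccos(np.clip(v @ v.T / R2, -1.0, 1.0))
    np.fill_diagonal(ang, np.inf)                 # a node is not its own neighbor
    rho = 0.0
    for i in range(N):
        nearest = np.argsort(ang[i])[:n_g]        # n_g geometrically closest nodes
        rho += np.mean(labels[nearest] == labels[i])
    return rho / N
```

Well-separated communities yield values near 1, whereas labels uncorrelated with position yield values near the random baseline 1/N_C.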
The analysis above shows that there is a strong congruency between ground-truth communities and embeddings obtained by D-Mercator when the appropriate dimension is selected, as nodes are surrounded mainly by other nodes in the same community. In turn, this result implies that such embeddings can be used to detect similarity-based communities even when the metadata defining groups is not available.

C. Multidimensional maps of real world networks
We compiled data for several real-world complex networks from different domains and embedded them in different dimensions. More specifically, we generated results for six real networks for which metadata reporting categories is available: a sample of the Add-Health study, where adolescent students in six grades are linked by social interactions [60]; the network of international trade in apples from the Food and Agriculture Organization (FAO) of the United Nations [61] (see Fig. S22); the neural network of the C. elegans worm, where communities are the different neuron classes [62] (see Fig. S23); the OpenFlights network of flights between airports around the world [63]; the Foxglove (Digitalis purpurea) network [64], which describes the global organ-wide cellular connectivity of plant hypocotyls (see Fig. S25), with links between cells identified by detecting common surfaces using 3D cellular meshes that represent intercellular association; and the Polbooks network (see Fig. S26) [65], where nodes represent books about U.S. politics published close to the 2004 U.S. presidential election and sold by Amazon.com, edges represent frequent co-purchasing of books by the same buyers, and node attributes indicate the political leaning: liberal, conservative, or moderate. In the geographical networks, we took continents as categories. We report global statistics of the networks in Table I and Table S1.
Results validating the embeddings in terms of the topological congruency between the network and the inferred model can be found in Figs. S27-S32, where we show the degree distribution, the average nearest-neighbors degree, the number of triangles, and the clustering distribution. The probability of connection and other local properties are well reproduced in all dimensions. However, as displayed in the bottom panels of Fig. 4, and taking the Add-Health network as a case example, not all dimensions provide the same efficiency in terms of GR success rate p_s, community concentration c_C, and performance of community detection. To detect communities, we applied hierarchical clustering to the nodes' positions in the D-dimensional similarity space. In particular, we used the agglomerative clustering algorithm from the scikit-learn library [66], which successively merges nodes close to each other in the similarity space, i.e., separated by a small angular distance. The approach can be applied to real networks with a predefined number of clusters, for instance as given by the metadata. To assess the performance of the community detection algorithm, we measured the modularity of the network [67], Q, and also computed the Normalized Mutual Information (NMI) between the predicted communities and the metadata labels. We compared the geometry-based agglomerative clustering method with several topology-based alternatives (see Section XIII in SI). Overall, the geometric community detection method exhibits comparable outcomes in terms of modularity and NMI.
For the networks analyzed in this work, the four metrics provide coinciding information, and each network has a specific dimension that is clearly better. Hence, we propose that the dimension of a network is the consensus value of D among the analyzed structural features, including congruency with metadata. Following this prescription, the hyperbolic dimension of Add-Health displayed in Fig. 4 is D+1 = 3, such that the similarity subspace can be easily visualized as a 2-sphere, as shown in Fig. 4b. Interestingly, performing a one-dimensional embedding results in some of the communities being mixed up (see Fig. 4a). This is the case of the 7th and 8th grades, which appear completely mixed in the one-dimensional embedding, whereas in the two-dimensional one both grades are well separated. Again, this result clarifies the importance of using the most appropriate dimension for the description of the system.
For OpenFlights, we found that the best hyperbolic dimension is D+1 = 2 (see Fig. S24), which indicates that the topology of the airports network is one-dimensional in the Euclidean similarity space, whereas one could have naively expected D = 2. This result highlights the distinction between the geometry of the Earth and the topology of the airports network, where long-range flights between hubs reduce the effective geometry of the planet. The embedding of Foxglove revealed that greedy routing and community concentration achieve their highest values in D+1 = 4. This finding is consistent with the fact that the network is a 3D geometric graph of intercellular associations. Furthermore, our analysis revealed that the Polbooks network is most accurately represented in D+1 = 3 dimensions. Although the highest community concentrations are found at both D+1 = 2 and D+1 = 3, the performance of greedy routing makes it evident that D+1 = 3 better captures the network topology. In all the networks analyzed in this work, the community concentration at the optimal dimension is significantly larger than the random expectation 1/N_C, thereby validating the quality of the embeddings found by D-Mercator.
The similarity maps on the 2-sphere reveal interesting information about the spatial distribution of communities and their relation to the categories defined by metadata. In the Add-Health network, there are six categories corresponding to the grades the students belong to. Students in the lower grades (7 and 8) are clearly separable in the D = 2 similarity map but mixed in the lower dimension. In contrast, the positions of adolescents in the 10th to 12th grades are mixed, indicating that friendships were formed between members of different classes. The countries in the FAO-Apples embedding are quite clearly grouped into continents. The European countries are placed in one similarity region, whereas nodes from Asia and Oceania are positioned on the opposite side of the sphere, reflecting that the apple trade is dense within Europe and largely absent between different continents. Finally, the neurons in C. elegans are divided into five categories: motor neurons, sensory neurons, interneurons, neurons in the pharynx, and sex-specific neurons. Again, the different categories are clearly separated.

III. DISCUSSION
Throughout history, maps have been at the center of political, economic, and geostrategic decisions, becoming a critical piece of our everyday lives and serving as an integral, accurate, and relevant information source. Their appeal is not only visual: they provide a way of storing and presenting information and communicating findings, they let us recognize locational distributions and spatial patterns and relationships, and they make it possible for us to conceptualize processes that operate through space. Our overarching goal is to map real-world complex systems in an embedding metric space that need not be geographical or spatially obvious, but that may be a condensate of the different intrinsic attributes determining how distant, or conversely how similar, the elements of the system are.
Maps in the hyperbolic plane obtained by S^1 model-based optimization are meaningful representations that explain the observed regularities in the interaction fabric of real networks, and they have been used in a multitude of downstream tasks. In some cases, networks are intrinsically one-dimensional and, in general, D = 1 maps offer a good approximation. In most cases, however, multidimensional hyperbolic embeddings of real networks with D > 1 provide more accurate descriptions and will help to discern the role of the different attributes that determine connectivity in complex systems, such as, for instance, the specific role of geographic and cultural factors in economic and social networks.
Community detection will also benefit from multidimensional hyperbolic embeddings, which facilitate the application of the large family of methods based on geometric information and spatial distances [68-70]. We found here that embedding a real network in the proper dimension yields the partition of nodes with the highest modularity when using a hierarchical clustering algorithm in the similarity subspace. However, the interplay of dimensionality with geometric community detection algorithms is not obvious, and this issue will require future investigation. In relation to this, it is also worth mentioning that the interplay between the hyperbolic community structure in higher dimensions and the performance of a variety of tasks is not fully understood yet [52]. A further step would be to scrutinize the coupling between dimensionality and the performance of greedy routing as the conformation of communities is changed.
From a technical point of view, the problem of producing geometric network embeddings is NP-hard, like the majority of optimization and network inference problems, and the solution can be trapped in local optima. This makes a proper validation protocol particularly important, one based on synthetic networks produced by the model, such that the ground truth is known and the results can be compared against it. We believe that machine learning techniques will provide complementary tools in the future for inferring node coordinates and model parameters that approach the global optimum more accurately. This will further increase the likelihood of the hyperbolic model reproducing the original network. This approach could also help optimize the algorithm to deal with larger networks.
Other lines of research for future work concern the extension of D-Mercator to embed networks with weak geometric structure [71]. Applications such as geometric renormalization [12], link prediction [72], community detection [57, 73], the study of geometric temporal networks [74] and bipartite networks [75], and the analysis of geometric Turing patterns [76] will definitely benefit from representing complex networks in their optimal dimension. Beyond network science, our low-dimensional representation can impact fields like machine learning in the short term, where it can be used to improve the relational structure that determines the aggregation and message-passing steps of graph neural networks [77].

A. Proof of the isomorphism between S^D and H^{D+1}
The S^D model can be expressed as a purely geometric model in hyperbolic space, the H^{D+1} model [51], by mapping the expected degree κ_i of each node to a radial coordinate as [52]

r_i = R̂ − (2/D) ln(κ_i/κ_0), with R̂ = 2 ln[2R/(μκ_0²)^{1/D}]. (7)

Using this transformation, the connection probability of the S^D model given in Eq. (1) can be rewritten in the form

p_ij = 1/(1 + χ_ij^β), with χ_ij = RΔθ_ij/(μκ_iκ_j)^{1/D}.

Hidden degrees κ_i and κ_j can then be transformed into radial coordinates using Eq. (7), such that χ_ij = e^{(x_ij − R̂)/2}. Thus, we finally obtain

p_ij = 1/(1 + e^{(β/2)(x_ij − R̂)}), with x_ij = r_i + r_j + 2 ln(Δθ_ij/2),

which is the connection probability in the H^{D+1} model. The quantity x_ij is a good approximation of the hyperbolic distance between two nodes with radial coordinates r_i and r_j separated by an angular distance Δθ_ij. This approximation is very accurate for pairs of nodes separated by Δθ_ij ≫ 2√(e^{−2r_i} + e^{−2r_j}) [4], the fraction of which converges to one in the thermodynamic limit [1].
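As a numerical sanity check of this isomorphism, the probability computed in the S^D form, p_ij = 1/(1 + χ_ij^β), and in the hyperbolic form, p_ij = 1/(1 + e^{(β/2)(x_ij − R̂)}), must coincide once the hidden degrees are mapped to radii. A minimal sketch (all parameter values below are illustrative, not taken from the text):

```python
import math

def radius(kappa, kappa0, R_hat, D):
    # Eq. (7): map a hidden degree to a radial coordinate
    return R_hat - (2.0 / D) * math.log(kappa / kappa0)

# Illustrative parameters (not from the paper)
D, beta = 2, 3.0
mu, kappa0, R = 0.05, 1.0, 10.0
R_hat = 2.0 * math.log(2.0 * R / (mu * kappa0**2) ** (1.0 / D))

kappa_i, kappa_j, dtheta = 3.0, 5.0, 0.7

# S^D form of the connection probability
chi = R * dtheta / (mu * kappa_i * kappa_j) ** (1.0 / D)
p_S = 1.0 / (1.0 + chi**beta)

# Hyperbolic (H^{D+1}) form after mapping hidden degrees to radii
r_i = radius(kappa_i, kappa0, R_hat, D)
r_j = radius(kappa_j, kappa0, R_hat, D)
x_ij = r_i + r_j + 2.0 * math.log(dtheta / 2.0)
p_H = 1.0 / (1.0 + math.exp(0.5 * beta * (x_ij - R_hat)))

print(abs(p_S - p_H))  # agrees up to floating-point error
```

The agreement is exact by construction: χ_ij = e^{(x_ij − R̂)/2} holds identically once r_i is defined by Eq. (7).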

Inferring the hidden degrees
Given a value of β and the corresponding value of μ from Eq. (3):

1. Initialize the hidden degrees by setting κ_i = k_i for all i = 1, …, N, where k_i is the observed degree of node i in the real network.
2. Compute the expected degree of each node i according to the S^D model as

k̄(κ_i) = Σ_{j≠i} ∫_0^π ρ(Δθ) p(a_{κ_iκ_j} = 1 | Δθ) dΔθ,

where ρ(Δθ) is the density of angular distances between two random points on the D-sphere.

3. Adjust the hidden degrees. Let ε_max = max_i{|k̄(κ_i) − k_i|} be the maximal difference between the observed degrees and the expected degrees, and ε a tolerance parameter.
• If ε_max > ε, the set of hidden degrees needs to be corrected. To do so, we set κ_i ← |κ_i + u[k_i − k̄(κ_i)]| for every class of degree k_i, where u ∼ U(0, 1). The random variable u prevents the process from getting trapped in a local minimum. Next, go to step 2 to compute the expected degrees corresponding to the new set of hidden degrees.
• Otherwise, if ϵ max ≤ ϵ, the hidden degrees have been correctly inferred for the current global parameters.
The tolerance parameter used in this work was set to ϵ = 0.01.
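The three steps above can be sketched as follows. The expected-degree evaluation is model dependent, so it is passed in as a black-box function; the linear surrogate in the toy check below is purely illustrative:

```python
import random

def infer_hidden_degrees(degrees, expected_degree, eps=0.01, max_iter=10_000):
    """Steps 1-3: iterate until the expected degree of every node
    matches its observed degree within the tolerance eps.

    `expected_degree(kappas, i)` is a model-dependent black box returning
    the expected degree of node i in the S^D model."""
    kappas = [float(k) for k in degrees]          # step 1: kappa_i = k_i
    for _ in range(max_iter):
        kbar = [expected_degree(kappas, i) for i in range(len(kappas))]
        eps_max = max(abs(kb - k) for kb, k in zip(kbar, degrees))
        if eps_max <= eps:                        # step 3: converged
            return kappas
        for i, k in enumerate(degrees):           # randomized correction;
            u = random.random()                   # u avoids local minima
            kappas[i] = abs(kappas[i] + u * (k - kbar[i]))
    raise RuntimeError("hidden degrees did not converge")

# Toy check with a linear surrogate for the expected degree:
degs = [3, 5, 8]
est = infer_hidden_degrees(degs, lambda ks, i: 0.9 * ks[i])
```

With the surrogate k̄(κ) = 0.9κ, the loop converges to κ_i ≈ k_i/0.9, as the stopping criterion requires.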

Inferring the inverse temperature β
Inferring β requires computing the expected mean local clustering c̄, given the current values of the global parameters as well as the hidden degrees κ(k) computed in Sec. IV B 1.
The method D-Mercator uses is based on the following. Suppose that we want to estimate the expected clustering c̄ of some node of degree k. According to the definition of the mean local clustering, this quantity is the probability that two randomly chosen neighbors of the node are connected, which can be computed in two steps. First, we randomly choose two of its neighbors and draw their distances to the node from the distribution of distances between connected nodes in the model. Second, we compute the distance between the two neighbors and, with it, the probability that they are connected.
Two important points require further clarification.
• The model is uncorrelated at the level of hidden variables. Thus, we can draw the degrees of the two neighbors from the uncorrelated distribution P(k|k′) = kP(k)/⟨k⟩.
• The distribution of the angular distance Δθ between two connected nodes with hidden degrees κ and κ′ reads

ρ(Δθ | a_{κκ′} = 1) = p(a_{κκ′} = 1 | Δθ) ρ(Δθ) / p(a_{κκ′} = 1), (14)

where p(a_{κκ′} = 1 | Δθ) is the probability, given by Eq. (1), that two nodes with hidden degrees κ and κ′ separated by a distance Δθ in the D-dimensional similarity space are connected. This probability is

p(a_{κκ′} = 1 | Δθ) = 1/(1 + [RΔθ/(μκκ′)^{1/D}]^β). (15)

The distribution of distances in the S^D model is

ρ(Δθ) = Γ((D+1)/2)/(√π Γ(D/2)) sin^{D−1}(Δθ), (16)

which becomes increasingly peaked around Δθ = π/2 as D → ∞.
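These densities are easy to code, and the conditional distribution of distances between connected nodes can be sampled by rejection. A minimal sketch (the values of R, μ, and the hidden degrees in the usage line are illustrative):

```python
import math
import random

def rho(dtheta, D):
    # Density of angular distances between two random points on the
    # D-sphere; it peaks at pi/2 as D grows
    c = math.gamma((D + 1) / 2) / (math.gamma(D / 2) * math.sqrt(math.pi))
    return c * math.sin(dtheta) ** (D - 1)

def p_connect(dtheta, kappa1, kappa2, D, beta, mu, R):
    # S^D connection probability at angular distance dtheta
    chi = R * dtheta / (mu * kappa1 * kappa2) ** (1.0 / D)
    return 1.0 / (1.0 + chi**beta)

def sample_connected_distance(kappa1, kappa2, D, beta, mu, R, grid=1000):
    # Rejection sampling from the conditional density, which is
    # proportional to p_connect(dtheta) * rho(dtheta); the envelope is
    # estimated on a grid, which is adequate for a sketch
    f = lambda t: p_connect(t, kappa1, kappa2, D, beta, mu, R) * rho(t, D)
    fmax = max(f(i * math.pi / grid) for i in range(1, grid))
    while True:
        t = random.uniform(0.0, math.pi)
        if random.random() * fmax <= f(t):
            return t

# Illustrative usage: one distance between connected nodes in D = 2
t = sample_connected_distance(2.0, 3.0, D=2, beta=3.0, mu=0.05, R=5.0)
```

The unconditioned density integrates to one for any D, which serves as a quick correctness check of the normalization constant.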
Finally, p(a_{κκ′} = 1) is the connection probability between two nodes with hidden degrees κ and κ′, irrespective of their angular separation, and is given by

p(a_{κκ′} = 1) = ∫_0^π p(a_{κκ′} = 1 | Δθ) ρ(Δθ) dΔθ. (17)

With these tools in hand, the expected mean local clustering is estimated as follows.

1. Initialize the mean local clustering: let c̄(k) represent the expected mean local clustering of degree class k, and set c̄(k) = 0 for all k.

2. Compute the expected mean local clustering spectrum: for every degree class k, repeat m times:

(a) Draw two degrees k_i, i = 1, 2, from P(k_i|k).

(b) Draw the corresponding random variables Δθ_i from the distribution ρ(Δθ_i | a_{κ(k)κ(k_i)} = 1), i = 1, 2, given in Eq. (14).

(c) Generate two random vectors v_i, i = 1, 2, with the given angular separations Δθ_i to the node, compute the angular distance Δθ_12 between them and, with it, the probability p_12 = 1/(1 + [RΔθ_12/(μκ(k_1)κ(k_2))^{1/D}]^β) for neighbors 1 and 2 to be connected, which is added to c̄(k).

Compute the expected mean local clustering as

c̄ = Σ_k P(k) c̄(k)/m.

If |c̄ − c̄_emp| < ε_c, where c̄_emp is the mean local clustering of the network to be embedded, we can accept the current value of β and proceed to the inference of the angular coordinates. Otherwise, β needs to be corrected and the hidden degrees must be recomputed. Since the expected mean local clustering coefficient is a monotonically increasing function of β, the process can be evaluated efficiently using the bisection method. More specifically, we start with a value of β chosen randomly between D and D + 1. Then, while the expected clustering is lower than the observed one, we multiply β by 1.5. We start the bisection method once a value surpassing the observed clustering has been reached. We found that, for ε_c = 0.01, m = 600 samples are enough; to gain higher precision, one must increase m to ensure that the required precision is satisfied.
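The expand-then-bisect search for β described above can be sketched as follows, with the expected-clustering computation abstracted as a black-box monotone function (the exponential surrogate used in the check is purely illustrative):

```python
import math
import random

def infer_beta(target_c, expected_clustering, D, eps_c=0.01, seed_beta=None):
    """Expand-then-bisect search for beta. `expected_clustering(beta)` is a
    model-dependent black box assumed monotonically increasing in beta;
    beta must stay above D, where the expected clustering vanishes."""
    hi = seed_beta if seed_beta is not None else random.uniform(D, D + 1)
    lo = D
    while expected_clustering(hi) < target_c:   # expansion: multiply by 1.5
        lo = hi
        hi *= 1.5
    while True:                                 # bisection on [lo, hi]
        mid = 0.5 * (lo + hi)
        c = expected_clustering(mid)
        if abs(c - target_c) < eps_c:
            return mid
        if c < target_c:
            lo = mid
        else:
            hi = mid

# Toy check with a monotone surrogate for the expected clustering:
surrogate = lambda b: 1.0 - math.exp(-(b - 2.0))
beta_star = infer_beta(0.5, surrogate, D=2, seed_beta=2.5)
```

Monotonicity guarantees that the expansion phase brackets the solution and that bisection then converges to it.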

S^D model-corrected Laplacian Eigenmaps
The expected angular distance between nodes i and j in the S^D model, conditioned on them being connected, can be computed as

⟨Δθ_ij⟩ = ∫_0^π Δθ ρ(Δθ | a_{κ_iκ_j} = 1) dΔθ.

Since degree-one nodes do not add geometric information, we first obtain the positions of the nodes with k > 1 and subsequently reincorporate the nodes with degree one. For each node i with k_i = 1 and its neighbor j, we draw an angular distance Δθ_ij from Eq. (14), given that the two connected nodes have hidden degrees κ_i and κ_j. Then, the position v_i of node i is generated with that angular separation to node j.

Likelihood maximization
Given initial positions of the nodes on the D-sphere, the steps to maximize the congruency between the observed network and the S^D model are:

1. Define an ordering of the nodes: the nodes are visited in the order defined by the network's onion decomposition [78]. Within each layer of the decomposition, the ordering of nodes is random.
2. Find new optimal coordinates: for every node i, we select the optimal coordinates among candidate positions generated in the vicinity of the mean vector of its neighbors. This is achieved in three steps:

(a) Compute the mean coordinates of node i's neighbors. Let node i have k_i neighbors, labeled with the index j = 1, …, k_i. Since the nodes are situated on the D-sphere, we compute their mean vector

v̄_i = Σ_j κ_j v_j / |Σ_j κ_j v_j|,

where the hidden degrees weight the contribution of every neighbor's position vector, as proposed in [79].
(b) Propose new positions around v̄_i: we generate 100 max(ln N, 1) candidate vectors from the multivariate normal distribution with mean v̄_i and a standard deviation σ determined by Δθ_max, the angular distance between the vector v̄_i and the most distant neighbor of node i.
(c) Select the most likely candidate position: compute the local log-likelihood of every candidate position, as well as of node i's current position, according to

ln L_i = Σ_{j≠i} [a_ij ln p_ij + (1 − a_ij) ln(1 − p_ij)],

where a_ij are the entries of the adjacency matrix. Locate node i at the position maximizing the local log-likelihood.
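One update of step 2 can be sketched as follows. The local log-likelihood is model dependent and is passed in as a black box; the choice σ = Δθ_max/2 is an illustrative assumption, not the exact relation used by D-Mercator:

```python
import math
import random

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def angle(u, v):
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(u, v))))
    return math.acos(dot)

def best_position(i, pos, kappas, neighbors, log_lik, n_cand=100):
    """One move of step 2: propose candidates around the kappa-weighted
    mean direction of node i's neighbours and keep the most likely one.
    `log_lik(i, v)` is the model-dependent local log-likelihood."""
    dim = len(pos[i])
    # (a) hidden-degree-weighted mean vector, projected back on the sphere
    # (assumes the weighted sum is non-zero, i.e. no exact cancellation)
    mean = normalize([sum(kappas[j] * pos[j][d] for j in neighbors[i])
                      for d in range(dim)])
    # (b) proposal spread from the most distant neighbour; the factor 1/2
    # is an illustrative assumption
    sigma = max(angle(mean, pos[j]) for j in neighbors[i]) / 2.0
    candidates = [pos[i], mean]
    for _ in range(n_cand):
        candidates.append(normalize([random.gauss(m, sigma) for m in mean]))
    # (c) keep the candidate maximizing the local log-likelihood
    return max(candidates, key=lambda v: log_lik(i, v))

# Toy usage: node 0 should move toward the mean of its two neighbours
pos = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
target = normalize([0.0, 1.0, 1.0])
new = best_position(0, pos, [1.0, 1.0, 1.0], {0: [1, 2]},
                    lambda i, v: -angle(v, target))
```

Because the current position is always among the candidates, the local log-likelihood can never decrease in this step.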

Final adjustment of hidden degrees
The process of adjusting the hidden degrees, given the positions of the nodes in the similarity subspace, such that k̄(κ_i) = k_i is similar to the initial inference of the hidden degrees:

1. Compute the expected degrees: for every node i, set k̄(κ_i) = Σ_{j≠i} p_ij.

2. Correct hidden degrees: let ε_max = max_i{|k̄(κ_i) − k_i|} be the maximal deviation between the observed degrees and the expected degrees. If ε_max > ε, the set of hidden degrees needs to be corrected: set κ_i ← |κ_i + u[k_i − k̄(κ_i)]| for every node i, where u ∼ U(0, 1). Again, the random variable u prevents the process from getting trapped in a local minimum. Next, go to step 1 and compute the expected degrees corresponding to the new set of hidden degrees. Otherwise, if ε_max ≤ ε, the hidden degrees have been inferred for the current global parameters and node positions.
C. Generating synthetic networks with the S^D model

1. The distribution of hidden degrees can be of any form. For the experiments in Figs. 2 and 3, we used a power-law hidden-degree distribution of the form ρ(κ) = (γ − 1)κ_0^{γ−1} κ^{−γ}, with κ > κ_0 = (γ − 2)/(γ − 1) ⟨k⟩ and different values of the characteristic exponent 2 < γ < 3. Hidden degrees are also cut off from above by the natural cut-off [80]. This choice avoids the extreme fluctuations of the maximum hidden degree for γ < 3 that would result, for some network realizations, in expected degrees larger than N. For every node i in the simulated network, we generated the hidden degree κ_i as a random value from this distribution.

2. To assign node positions in the similarity subspace, each node is assigned a vector v_i ∈ R^{D+1} with D + 1 independent and standard normally distributed entries. These entries are subsequently normalized so that the node lies on the sphere of radius R, where R is given in Eq. (2).
3. The hidden degrees and the coordinates in the D-dimensional similarity subspace are used in Eq. (1) to calculate the probability of connection between every pair of nodes. For any given value of β, the value of μ is evaluated from Eq. (3), depending on the target average degree ⟨k⟩, β, and D.
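Steps 1-3 can be sketched end to end as follows. For simplicity, the positions are kept on the unit sphere, so R enters only through the connection probability; in the text, nodes live on a sphere of radius R from Eq. (2) and μ follows from Eq. (3), whereas here both are passed as illustrative parameters:

```python
import math
import random

def sample_kappa(gamma, avg_k, N):
    # Step 1: power-law hidden degrees rho(kappa) ~ kappa^{-gamma} for
    # kappa > kappa0, truncated at the natural cut-off kappa0 * N^{1/(gamma-1)}
    kappa0 = (gamma - 2.0) / (gamma - 1.0) * avg_k
    kappa_c = kappa0 * N ** (1.0 / (gamma - 1.0))
    while True:
        kappa = kappa0 * (1.0 - random.random()) ** (-1.0 / (gamma - 1.0))
        if kappa <= kappa_c:
            return kappa

def sample_position(D):
    # Step 2: uniform point on the D-sphere via a normalized Gaussian vector
    v = [random.gauss(0.0, 1.0) for _ in range(D + 1)]
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def generate_sd_network(N, D, beta, gamma, avg_k, mu, R):
    # Step 3: connect every pair with the S^D connection probability
    kappas = [sample_kappa(gamma, avg_k, N) for _ in range(N)]
    pos = [sample_position(D) for _ in range(N)]
    edges = []
    for i in range(N):
        for j in range(i + 1, N):
            dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(pos[i], pos[j]))))
            chi = R * math.acos(dot) / (mu * kappas[i] * kappas[j]) ** (1.0 / D)
            if random.random() < 1.0 / (1.0 + chi**beta):
                edges.append((i, j))
    return kappas, pos, edges

random.seed(42)
kappas, pos, edges = generate_sd_network(N=30, D=2, beta=3.0, gamma=2.5,
                                         avg_k=6.0, mu=0.05, R=3.0)
```

The pairwise loop makes the O(N²) cost of network generation explicit.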

V. DATA AVAILABILITY
The network datasets used in this study are available from the sources referenced in the manuscript and the supplementary materials. The coordinates and the parameters of the multidimensional hyperbolic embeddings of the real networks are available via the Zenodo platform at https://zenodo.org/doi/10.5281/zenodo.10027084.

VI. CODE AVAILABILITY

I. DIFFERENCES BETWEEN MERCATOR AND D-MERCATOR
Mercator is a tool to embed networks in the hyperbolic plane according to the S^1 model. It also applies two methodologies: the model-adjusted machine-learning Laplacian Eigenmaps (LE) technique and the maximum-likelihood (ML) method. However, there are significant differences between Mercator and D-Mercator.
Node variables. Each node in Mercator is endowed with two variables: a hidden degree κ and an angular position θ on a circle. In D-Mercator, nodes also have hidden degrees; however, they are positioned on the D-sphere, so we assign a (D + 1)-dimensional vector to each of them, and Mercator stands as the particular case of D-Mercator with D = 1. One of the main consequences of this change, purely driven by the increase in dimensionality, affects the distribution of angular distances. In the S^1 model, this distribution is uniform between 0 and π, yet in higher dimensions the distribution becomes increasingly peaked around Δθ = π/2 (see Fig. S1).
No fast version in D-Mercator. Mercator introduced two embedding modes, fast and refined. In the former, the algorithm performs an order-preserving adjustment after applying the S^1-modified LE technique. This step maintains larger gaps between communities while readjusting smaller gaps based on the S^1 model. The fast version of Mercator already gives meaningful embeddings. There is no fast mode in D-Mercator, since the readjustment step is not well defined in higher dimensions and would be as computationally expensive as the ML step, or even more so.

III. ALGORITHM'S TIME COMPLEXITY
We use the trapezoid rule to compute the integrals that cannot be solved analytically, e.g., Eqs. (13) and (17) from the main text. The time complexity depends on the resolution, i.e., the number of steps: O(n). Overall, it does not change the final complexity of D-Mercator, which is O(N²), where N is the network size. Compared to Mercator, D-Mercator is slower, but the time complexity remains the same. A detailed comparison is shown in Fig. S2.
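A composite trapezoid rule with n subintervals makes the O(n) cost per integral explicit; as a check, it recovers the unit normalization of the angular-distance density for D = 2, which reduces to sin(Δθ)/2:

```python
import math

def trapezoid(f, a, b, n=1000):
    # Composite trapezoid rule: O(n) function evaluations per integral
    h = (b - a) / n
    s = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n))
    return s * h

# The D = 2 angular-distance density sin(t)/2 integrates to 1 on [0, pi]
val = trapezoid(lambda t: 0.5 * math.sin(t), 0.0, math.pi)
```

The error of the composite rule scales as O(h²), so n = 1000 already gives several correct digits here.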

A. Community overlap
To measure the overlap of nodes in the S^1 and S^2 models with community structure, we define a simple measure based on the distances between a given node and all community centers. If a node is not closest to its own center, we count it as an overlap. The detailed algorithm is as follows.
1. For each node i with label l:
2. Compute Δθ_il, the angular distance from node i to its center l.
3. Compute the angular distances to all the other centers.
4. If node i is closer to any center other than l, mark this node as an overlap.
5. Repeat for every node and compute the average.

We compare the agglomerative clustering algorithm, which uses the obtained embedding in the best dimension, with topology-based methods for the community detection task. Table S2 shows the obtained values of modularity, whereas Table S3 shows the Normalized Mutual Information (NMI) between the predicted communities and the metadata labels. Table S4 depicts the overlap between the clusters obtained using the agglomerative clustering algorithm and the four topology-based methods. The number of clusters obtained by each method is reported in Table S5.

TABLE S2: Comparison of the community detection performance in terms of modularity (Q) between the agglomerative clustering algorithm based on the embeddings in the best dimension and the topology-based methods: GMM (greedy modularity maximization) [1], the Louvain method [2], Infomap [3], and LPA (Label Propagation Algorithm) [4]. For the agglomerative clustering algorithm we report two cases: (i) the number of clusters is obtained from the metadata, and (ii) the number of clusters is determined by the maximum modularity. The highest value is shown in blue and the second highest in orange.
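The overlap measure of steps 1-5 above can be sketched as follows (the data layout, with positions as unit vectors and centers in a dictionary, is an illustrative choice):

```python
import math

def overlap_fraction(positions, labels, centers):
    """Fraction of nodes closer (in angular distance) to some other
    community centre than to the centre of their own community."""
    def angle(u, v):
        dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(u, v))))
        return math.acos(dot)
    overlaps = 0
    for v, own in zip(positions, labels):
        d_own = angle(v, centers[own])          # distance to own centre
        if any(angle(v, c) < d_own              # closer to another centre?
               for lab, c in centers.items() if lab != own):
            overlaps += 1
    return overlaps / len(positions)

# Toy usage on the circle (D = 1): one of three 'a' nodes sits nearer to 'b'
pts = [[math.cos(a), math.sin(a)] for a in (0.1, -0.1, 1.5)]
frac = overlap_fraction(pts, ['a', 'a', 'a'],
                        {'a': [1.0, 0.0], 'b': [0.0, 1.0]})
```

In this toy example, the node at angle 1.5 rad is nearer to the 'b' center at π/2, so the overlap fraction is 1/3.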
TABLE S3: Comparison of the community detection performance in terms of Normalized Mutual Information (NMI) between the predicted communities and the metadata labels for the agglomerative clustering algorithm based on the embeddings in the best dimension and the topology-based methods: GMM (greedy modularity maximization) [1], the Louvain method [2], Infomap [3], and LPA (Label Propagation Algorithm) [4]. For the agglomerative clustering algorithm we report two cases: (i) the number of clusters is obtained from the metadata, and (ii) the number of clusters is determined by the maximum modularity.

TABLE S4: The overlap, in terms of NMI, between the communities obtained by applying the agglomerative clustering algorithm based on the embeddings in the best dimension and the four topology-based methods. We report two cases: (i) the number of clusters is obtained from the metadata, and (ii) the number of clusters is determined by the maximum modularity.

FIG. 1. Geometric soft configuration model S^D. In D = 2, the similarity subspace corresponds to the surface of a sphere embedded in three dimensions, so it can be represented visually. Nodes are placed on the two-dimensional sphere representing the similarity subspace, and the size of a node is proportional to its expected degree. The angular distances between the pairs of nodes A1-A2 and B1-B2 are highlighted. Light gray lines on the two-sphere represent connections produced according to the model.

FIG. 2. Validation of D-Mercator on synthetic networks. Relationship between the coordinates of synthetic S^2 networks (original) and their embeddings (inferred) with parameters (a, c, e) β = 3, γ = 2.5, N = 2000, ⟨k⟩ = 8 and (b, d, f) β = 5, γ = 2.7, N = 2000, ⟨k⟩ = 9. In the top left corner of each panel, the value of the Pearson correlation coefficient between the inferred and original coordinates is reported. Since the coordinates from the embedding might be rotated, we transform them to minimize the average angular distance between the original and inferred coordinates (see Section III in SI).

FIG. 3. Detecting the dimensionality of synthetic networks. (a)-(d) Greedy routing in multidimensional hyperbolic maps of synthetic networks. The probability of successful paths (p_s) as a function of the embedding dimension for S^D synthetic networks generated in dimensions (a) D = 1, (b) D = 2, (c) D = 3, and (d) D = 4. The black dashed lines show the maximum value of p_s for each dimension, computed from the generated synthetic networks using the real coordinates. Each box ranges from the first quartile to the third quartile, with a horizontal line at the median; the whiskers go from each quartile to the minimum or maximum. Results obtained by averaging over 100 realizations with β = 2.5D, γ = 2.7, and N = 2000. (e)-(j) Community concentration in multidimensional hyperbolic maps of synthetic networks with modular structure. The community concentration c_C for (g) the S^1 model with 4 communities and (j) the S^2 model with 6 communities, embedded in different dimensions. Visualization of the embedding of the S^1 model with 4 communities in (e) D = 1 and (f) D = 2. Visualization of the embedding of the S^2 model with 6 communities in (h) D = 1 and (i) D = 2. Nodes are colored based on their communities, and their size in (f) and (i) is proportional to their expected degree. Results obtained by averaging over 50 realizations with β = 1.5D, γ = 2.7, and N = 2000.

FIG. 4. Case study: the Add-Health dataset. Top panels show the embeddings in (a) D = 1 and (b) two perspectives of the D = 2 similarity space of the D-Mercator embeddings of the network. The size of a node is proportional to its expected degree, and its color indicates the community it belongs to. For the sake of clarity, only the connections with probability p_ij > 0.5, given by Eq. (1), are shown. Bottom panels show the performance of (c) community concentration (c_C), (d) community detection (NMI), (e) modularity (Q), and (f) the success rate of GR (p_s).


FIG. S1: Distribution of angular distances between pairs of nodes for different dimensions. The histograms represent the angular distances between nodes in synthetic S^D networks of size 100, whereas the lines show the analytical solution (Eq. (16) from the main text).

FIG. S2: Comparison of the computational complexity of D-Mercator and Mercator in terms of running time versus the number of nodes in the network. For Mercator we generated S^1 synthetic networks, whereas for D-Mercator we generated S^2 synthetic networks. In both cases, the following parameters were used: β = 2.5D, γ = 2.5. Panel (a) shows the initialization part, where all necessary variables are created; panel (b) shows the time to infer the parameter β and the set of κ's. Panel (c) shows the time required to run the Laplacian Eigenmaps step, and panel (d) the likelihood maximization step. The time needed to run the last step, i.e., adjusting the κ's, is shown in panel (e). Finally, the combined time of the embeddings is presented in panel (f). The results are averaged over 5 realizations. Simulations were conducted on an Intel i7-7700K (8 cores, 4.5 GHz) with 16 GB of RAM.

FIG. S16: Average stretch as a function of the embedding dimension for S^D synthetic networks generated in dimensions (a) D = 1, (b) D = 2, (c) D = 3, and (d) D = 4. See the caption of Fig. S15 for more details.

FIG. S18: Overlap of communities as a function of Δθ_T in synthetic networks. The results are averaged over 100 realizations.

FIG. S22: Case study: the FAO-apples dataset. Top panels show the embeddings in (a) D = 1 and (b) two perspectives of the D = 2 similarity space of the D-Mercator embeddings of the network. The size of a node is proportional to its expected degree, and its color indicates the community it belongs to. For the sake of clarity, only the connections with probability p_ij > 0.5, given by Eq. (1) from the main text, are shown. Bottom panels show the performance of (c) community concentration (c_C), (d) community detection (NMI), (e) modularity (Q), and (f) the success rate of GR (p_s).

FIG. S28: Topological validation of the embeddings of the FAO-apples network. See the caption of Fig. S27 for more details.
XIII. COMPARISON WITH TOPOLOGY-BASED COMMUNITY DETECTION METHODS

TABLE I: Properties of the real networks used in this work. The β and μ values are presented for the best dimension of the embeddings. N_C indicates the number of ground-truth communities, while D is the inferred dimension as determined by the performance of greedy routing, community concentration, and community detection.

TABLE S1: Properties of selected real networks. N_C indicates the number of ground-truth communities. The β_i and μ_i values are presented for the i-th dimension of the embeddings. The † indicates that β could not be inferred in that dimension.
