The hyperbolic geometry of financial networks

Based on data from the European banking stress tests of 2014 and 2016 and the transparency exercise of 2018, we demonstrate for the first time that the latent geometry of financial networks can be well represented by geometry of negative curvature, i.e., by hyperbolic geometry. This allows us to connect the network structure to the popularity-vs-similarity model of Papadopoulos et al., which is based on the Poincaré disc model of hyperbolic geometry. We show that the latent dimensions of 'popularity' and 'similarity' in this model are strongly associated with systemic importance and with geographic subdivisions of the banking system. In a longitudinal analysis over the time span from 2014 to 2018 we find that the systemic importance of individual banks has remained rather stable, while the peripheral community structure exhibits more (but still moderate) variability.

www.nature.com/scientificreports/ ('fire-sale contagion'); see also (French et al. 12 , p. 21ff). Here, we focus on the channel of fire-sale contagion, which has been singled out, both in simulation 21 and in empirical studies 20 , as a key mechanism of financial contagion. Moreover, the propensity for fire-sale contagion can be quantified from available balance sheet data, using liquidity-weighted portfolio overlap (LWPO) 22,23 as an indicator (see "Methods" for details). Our inference of financial networks follows a two-stage procedure: First, we construct a weighted bipartite network in which banks $B = (b_1, \dots, b_n)$ are linked to a common pool of assets $A = (a_1, \dots, a_m)$, which consists of sovereign bonds classified by issuing country and by different levels of maturity. In the second step we perform a one-mode projection of this network onto the node set $B$, using the LWPO of two banks $b_i, b_j \in B$ to determine the weight $w_{ij}$ of the link between the corresponding nodes. For each of the years $y \in \{2014, 2016, 2018\}$, the result is an undirected, weighted network $N_y$ of banks, in which two banks are connected if and only if they hold common assets. The link weight $w_{ij}$, normalized to $[0, 1]$, represents the susceptibility of two banks $b_i, b_j$ to financial contagion, quantified by their LWPO.
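The two-stage construction can be illustrated with a small sketch. It uses the matrix form $L = P D^{-1} P^\top$ given in "Methods", with market depths taken as column sums of the portfolio matrix. The normalization $w_{ij} = L_{ij} / \sqrt{L_{ii} L_{jj}}$ is an assumption made here for illustration (it maps weights to $[0, 1]$); the function names and the toy portfolio are hypothetical.

```python
from math import sqrt

def lwpo_matrix(P):
    """Return L = P D^{-1} P^T for a portfolio matrix P (rows: banks, columns: assets)."""
    n, m = len(P), len(P[0])
    # Market depth d_k: total volume of asset k held by all banks in the sample.
    depth = [sum(P[i][k] for i in range(n)) for k in range(m)]
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Assets held by no bank (depth 0) contribute nothing and are skipped.
            L[i][j] = sum(P[i][k] * P[j][k] / depth[k]
                          for k in range(m) if depth[k] > 0)
    return L

def normalized_weights(L):
    """ASSUMPTION: cosine-type normalization w_ij = L_ij / sqrt(L_ii * L_jj)."""
    n = len(L)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and L[i][i] > 0 and L[j][j] > 0:
                W[i][j] = L[i][j] / sqrt(L[i][i] * L[j][j])
    return W

# Toy example: 3 hypothetical banks, 2 asset classes (values in EUR).
P = [[100.0, 0.0],
     [50.0, 50.0],
     [0.0, 200.0]]
L = lwpo_matrix(P)
W = normalized_weights(L)
print(W)
```

In this toy portfolio the first and third bank hold no common asset, so their link weight is zero, while the second bank overlaps with both.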
Network features. The inferred networks are very dense (densities: $\rho_{2014} = 0.86$, $\rho_{2016} = 0.96$, $\rho_{2018} = 1.00$), i.e., almost all pairs of banks hold some common assets. However, the distribution of weights is highly skewed (see Fig. 1A), with most of the connections exhibiting very small weights. In other words, the networks are dominated by a 'sparse backbone' of a few strong connections, which represent the dominant channels of potential contagion of financial distress. The same skew is present in the distribution of node strengths (see Fig. 1B), with a few strong nodes dominating over a majority of weaker nodes in all years.
To extract more salient connectivity information from these highly connected networks, we also consider the '$p$%-backbone' of each network, formed by the top $p$% of edges ranked by weight. (We have also considered disparity filtering 24 as an alternative method for backbone extraction; see 'Comparison of methods and robustness checks' below.) Figure 2 shows the degree distribution and the local clustering coefficient (as a function of node degree) for the 10%- and the 25%-backbone. While there is no evidence of a scale-free degree distribution, the clustering coefficient displays an interesting pattern: it is highest for medium-degree nodes and then decreases with increasing node degree. This indicates that high-degree nodes (i.e. highly connected banks) typically act as hubs between lower-degree nodes (i.e. 'normal' banks) that are not directly linked to each other.
Latent network geometry. Network representation methods. Our next objective was to uncover the latent geometric network structure and to evaluate the suitability of a hyperbolic network model. (See "Methods" for background on hyperbolic geometry.) To this end, we applied four different network representation methods (one method embedding into two-dimensional Euclidean space $E^2$, two methods embedding into two-dimensional hyperbolic space $H^2$, and one non-geometric method) to the financial networks $N_{2014}$, $N_{2016}$ and $N_{2018}$ and their $p$%-backbones for $p = 10, 25, 50$. The first two methods, multidimensional scaling 25 (MDS) and hydra+ 26,27 , calculate stress-minimizing embeddings of the weighted network distances into Euclidean and hyperbolic geometry, respectively. The third method, Mercator 28 , is a connectivity-based method (i.e. it ignores network weights) and uses a mix of machine learning and maximum likelihood estimation to infer latent coordinates in a popularity-vs-similarity model of hyperbolic geometry; see García-Pérez et al. 28 for details.
As a non-geometric representation method, we used a degree-corrected stochastic block model (dSBM) 16 , as implemented in the R-package randnet 29 , which aims to represent network structure by inferring communities and their connection probabilities; see Karrer and Newman 16 for details. Since Mercator and dSBM are connectivity-based, they can only be meaningfully applied to the network backbones. MDS and hydra+, on the other hand, can also be applied to the full weighted networks and are directly comparable, since they minimize exactly the same objective function and differ only in their target geometries. Figure 3 shows a comparison of the embedding quality of the different methods. For the full networks, we use stress (i.e. the root mean square error between network distances and embedded distances) as an evaluation metric, while for the backbones we use the AUPR (area under the precision-recall curve) from a network reconstruction task (see "Methods" for details).
To check the robustness of our results with respect to the method of backbone extraction, we have repeated the same analysis with backbones determined by disparity filtering 24 . The results are reported in supplementary Figure S1. While reconstruction quality deteriorates for all methods on the disparity-filtered backbones, the advantage of the hyperbolic methods over the non-geometric and Euclidean methods becomes even more pronounced. Overall, we conclude that the latent geometry of the observed financial networks is, at least in low dimension, much better represented by negatively curved (hyperbolic) than by flat (Euclidean) geometry. Moreover, the hyperbolic representations are superior even to the (non-geometric) degree-corrected stochastic block model in terms of network reconstruction quality.
Latent hyperbolic coordinates. As a result of the hyperbolic embeddings we obtain for each bank node $b_i$ latent coordinates $(r_i, \theta_i)$ in the Poincaré disc model of hyperbolic space (see "Methods"). This allows us to connect the network embedding to the popularity-vs-similarity model of Papadopoulos et al. 4 and the $S^1$-model of Ángeles Serrano et al. 28,30 . The hyperbolic embedding of the full banking network of 2018 produced by hydra+ is shown in Fig. 4A. The embedding of the 10%-backbone of the same network produced by Mercator is shown in Fig. 4D. Note that the hydra+ embedding attempts to give a faithful representation of all distances in the weighted network, whereas Mercator only encodes connectivity information and is harder to interpret visually. This difficulty is exacerbated by the laws of hyperbolic geometry, in which seemingly small differences in the radial coordinate can represent large differences in hyperbolic distance.
The popularity-vs-similarity model 4 and the $S^1$-model 28,30 used by Mercator offer a direct interpretation of the latent hyperbolic network coordinates in the Poincaré disc in terms of their popularity dimension (the radial coordinate $r$) and their similarity dimension (the angular coordinate $\theta$). In the context of financial networks, we hypothesized that the popularity dimension of a given bank aligns with its systemic importance, and that its similarity dimension is associated with sub-sectors of the banking system, e.g., along geographic and regional divisions. For the hydra+ embedding, too, a theoretical foundation for interpreting $r$ as popularity dimension and $\theta$ as similarity dimension has been given 27 . However, due to the asymmetric distribution of banks within the Poincaré disc (Fig. 4A) for the hydra+ embedding, we calculate its geodesic polar coordinates $(r_i, \theta_i)$ with respect to the network center-of-weight, rather than the center of the Poincaré disc; see "Methods" for details. (The fact that both approaches, Mercator and re-centered hydra+, lead to qualitatively very similar results can be seen as a validation of this methodology.) For the Mercator method we directly use the coordinates $(r_i, \theta_i)$ from the embedding of the 25%-backbone and perform no additional centering.
To test the first hypothesis (the association between radial coordinate $r$ and systemic importance) we labelled a bank as systemically important in a given year whenever it was included in the contemporaneous list of global systemically important banks (G-SIBs) as published by the Financial Stability Board 33-35 ; see also Table 1. Using a Wilcoxon-Mann-Whitney test, we find a significant association between radial rank and systemic importance in all years and for both methods ($P_{2014} < .0001$ and $P_{2016} < .0001$ for both methods, $P_{2018} = .0038$ for hydra+ and $P_{2018} = .0001$ for Mercator). In Table 2 we report the five top-ranked banks (most central in terms of $r$) for each year.
To test the second hypothesis (the association between similarity dimension $\theta$ and regional banking sub-sectors) we assigned banks to nine regional groups. These regions are reasonably balanced in terms of the number of banks included in the EBA panel. Using ANOVA for circular data (see "Methods") we find a highly significant association between the angular coordinate $\theta$ and the regional group in all three years considered ($P < .0001$ in all years for both methods). This indicates that the peripheral community structure (away from the network core) of the EBA financial network is indeed strongly aligned with geographic and regional divisions in Europe. We have highlighted two different regional groups in Fig. 4B,C to illustrate the association between angular coordinate and regional structure.
Network structure over time. The longitudinal structure of the data set allows us to track changes in the network structure over the whole time span of observations from 2014 to 2018. Note, however, that the samples of banks included by the EBA vary substantially in size and, even when restricted to the smallest sample, are not completely overlapping; see Table 3. Nevertheless, the embedding quality of the hyperbolic methods (reported in Fig. 3) is surprisingly stable over all years. This suggests that the hyperbolic model does indeed capture intrinsic qualities of the network, rather than relying on transitory structural artefacts.
We proceed to analyze the temporal changes in the latent radial coordinate $r$ and angular coordinate $\theta$, corresponding to changes in systemic importance and community structure. Note that the small sample of banks included in the 2016 stress test restricts the number of banks that are included in this longitudinal analysis, cf. Table 3. The scatter plots in [...] Interestingly, Nordea was one of just two banks (together with Royal Bank of Scotland) which were removed from the list of G-SIBs in the subsequent update in 2018 due to decreasing systemic importance 35 . In the Mercator embedding (panel IIa) Nordea does not appear as an outlier, which is likely due to the fact that some structural information is lost when the full network is reduced to its backbone.
Figure 4. Hyperbolic embeddings of the 2018 banking network (see Table 1 for full names). Panel (A) shows the full network embedding produced by the hydra+ method. Also shown is the top decile of strongest links, i.e., the connections with the largest liquidity-weighted portfolio overlap. Banks labelled as systemically important by the Financial Stability Board (G-SIBs) are indicated by asterisks. The black cross marks the capital-weighted hyperbolic center of the banking network. In panels (B) and (C) the Central/Eastern and the Nordic regional groups are highlighted to illustrate regional clustering. Panel (D) shows the hyperbolic embedding of the 10%-backbone of the same network, as produced by the Mercator method.

Discussion
Based on data from the EBA stress tests of 2014 and 2016 and the transparency exercise of 2018, we have presented strong evidence that the latent geometry of financial networks can be well represented by geometry of negative curvature, i.e., by hyperbolic geometry. Calculating embeddings into the Poincaré disc model of hyperbolic geometry has allowed us to visualize this geometric structure and to connect it to the popularity-vs-similarity model of Papadopoulos et al. 4 and the $S^1$-model of Ángeles Serrano et al. 28,30 . We find that the radial coordinate ('popularity') is strongly associated with systemic importance (as assessed by the Financial Stability Board) and the angular coordinate ('similarity') with geographic and regional subdivisions. A longitudinal analysis shows that, in the observation period from 2014 to 2018, the systemic importance of banks within the European banking network has stayed rather stable and has undergone only gradual changes. The peripheral community structure has been more variable, but has remained strongly determined by geographical divisions in all years considered. From a broader perspective, our results indicate that hyperbolic network representations could be important tools for regulators to monitor structural change in financial networks, as they are able to distinguish changes in the systemic importance (popularity) of financial institutions from 'peripheral changes' (similarity), which are less relevant from a regulator's perspective. Furthermore, our research provides an empirical basis for using hyperbolic geometry as a model space for the modelling of contagion processes and their optimal control in financial (or other) networks. Instead of modelling such processes by simulation on individual networks, a geometric model space offers the opportunity to build analytic models that provide deeper insights beyond a specific case.

Methods
Data preparation and inference of financial networks. The financial networks were extracted from three different publicly available data sets stemming from the stress tests (in 2014 and 2016) and the EU-wide transparency exercise (in 2018) of the European Banking Authority (EBA) 14,15 . The data sets contain detailed balance sheet information from all European banks (EU incl. UK + Norway) included in the stress test/transparency exercise of the EBA in the respective year. From these data sets we extracted the portfolio values of all sovereign bonds held by the banks, split by issuing country (38 countries) and three levels of maturity (short: 0M-3M, medium: 3M-2Y, long: 2Y-10Y+), resulting in m = 38 × 3 = 114 different asset classes.
For each year, these data were stored as the weighted adjacency matrix $P$ ('portfolio matrix') of a bipartite network. The $n$ rows of $P$ correspond to the banks in the sample, the $m$ columns to the different asset classes, and the element $P_{ik}$ to the portfolio value (in EUR) of asset $k$ in the balance sheet of bank $i$. To perform a one-mode projection of this bipartite network, we followed Cont and Wagalath 23,37 as well as Cont and Schaanning 22 : We computed the liquidity-weighted portfolio overlap (LWPO) of bank $i$ and bank $j$ as
$$L_{ij} = \sum_{k=1}^{m} \frac{P_{ik} P_{jk}}{d_k}, \tag{1}$$
where $d_k$ is the market depth for asset $k$ 22 . The LWPO measures the impact of a sudden liquidation of the portfolio of bank $i$ on the portfolio value of bank $j$ and vice versa. Hence, it quantifies the risk of fire-sale contagion between the banks in a financial stress scenario. The market depth of asset $k$ was estimated from $P$ as its total volume held by all banks in the sample, i.e., as $d_k = \sum_{i=1}^{n} P_{ik}$. Writing $D$ for the diagonal matrix of market depths, (1) can be succinctly written as the matrix product $L = P D^{-1} P^\top$. Finally, we set the link weight $w_{ij}$ between banks $b_i$ and $b_j$ in the one-mode projection $N$ of the banking network equal to the normalized LWPO between banks $b_i$ and $b_j$.
Background on hyperbolic geometry. The hyperboloid model. Hyperbolic geometry can be characterized as the geometry of a space of constant negative curvature, while the more familiar Euclidean geometry is the geometry of a flat space, i.e., a space of zero curvature. In the hyperboloid model of hyperbolic geometry 38,39 , $d$-dimensional hyperbolic space $H^d$ is defined as the hyperboloid
$$H^d = \{x \in \mathbb{R}^{d+1} : x_0^2 - x_1^2 - \dots - x_d^2 = 1,\ x_0 > 0\},$$
endowed with the hyperbolic distance
$$d_H(x, y) = \operatorname{arcosh}(x_0 y_0 - x_1 y_1 - \dots - x_d y_d).$$
In fact, $H^d$ endowed with the Riemannian metric tensor $ds^2 = -dx_0^2 + dx_1^2 + \dots + dx_d^2$ is a Riemannian manifold and $d_H(x, y)$ is the corresponding Riemannian distance 38,39 . The sectional curvature of this manifold is constant and equal to $-1$. Thus, $H^d$ is indeed a model of geometry of constant negative curvature.
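As a minimal illustration of the hyperboloid model, the sketch below assumes the standard convention in which a point $x = (x_0, x_1, \dots, x_d)$ of $H^d$ satisfies $x_0^2 - x_1^2 - \dots - x_d^2 = 1$ with $x_0 > 0$, and the distance is $d_H(x, y) = \operatorname{arcosh}(x_0 y_0 - x_1 y_1 - \dots - x_d y_d)$. The helper names (`lift`, `lorentz_product`, `hyperbolic_distance`) are hypothetical and not taken from any package used in the paper.

```python
from math import acosh, sinh, sqrt

def lift(u):
    """Lift spatial coordinates u = (x_1, ..., x_d) onto the hyperboloid."""
    x0 = sqrt(1.0 + sum(c * c for c in u))
    return (x0,) + tuple(u)

def lorentz_product(x, y):
    """Lorentz product x_0*y_0 - x_1*y_1 - ... - x_d*y_d (assumed sign convention)."""
    return x[0] * y[0] - sum(a * b for a, b in zip(x[1:], y[1:]))

def hyperbolic_distance(x, y):
    """Hyperbolic distance on the hyperboloid; the clamp guards against round-off."""
    return acosh(max(1.0, lorentz_product(x, y)))

apex = lift((0.0, 0.0))        # the point (1, 0, 0)
x = lift((sinh(2.0), 0.0))     # a point at hyperbolic distance 2 from the apex
y = lift((0.3, -1.2))          # an arbitrary third point
print(hyperbolic_distance(apex, x))
print(hyperbolic_distance(x, y))
```

Every lifted point has Lorentz product 1 with itself, which is a quick sanity check that it lies on the hyperboloid.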
The Poincaré disc model. While the hyperboloid model is convenient for computations, a preferable (and more popular) model for visualizations in dimension $d = 2$ is the Poincaré disc model 38 , which also forms the basis of the popularity-vs-similarity model of Papadopoulos et al. 4 . To obtain the Poincaré disc model, the hyperboloid $H^2$ is mapped to the open unit disc ('Poincaré disc') $D = \{z \in \mathbb{R}^2 : z_1^2 + z_2^2 < 1\}$, parameterized by hyperbolic polar coordinates as $z_1 = \tanh(r/2) \cos\theta$, $z_2 = \tanh(r/2) \sin\theta$, using the stereographic projection 38
$$r = \operatorname{arcosh}(x_0), \qquad \theta = \operatorname{atan}'(x_2, x_1), \tag{2}$$
where $\operatorname{atan}'$ is the quadrant-preserving arctangent. (The quadrant-preserving arctangent $\operatorname{atan}'(x_2, x_1)$, well-defined unless $x_1 = x_2 = 0$, returns the unique angle $\theta \in [0, 2\pi)$ which solves $\tan\theta = x_2 / x_1$ and points to the same quadrant as $(x_1, x_2)$. It is commonly implemented in scientific computing environments (e.g. in MATLAB or R) as atan2.) In the Poincaré disc model, the hyperbolic distance becomes
$$d_B((r_1, \theta_1), (r_2, \theta_2)) = \operatorname{arcosh}\left(\cosh(r_1) \cosh(r_2) - \sinh(r_1) \sinh(r_2) \cos(\theta_1 - \theta_2)\right).$$
Stress-minimizing embeddings and hyperbolic centering. Stress-minimizing embeddings. Stress-minimizing embedding methods aim to find, for each network node $b_i$, latent coordinates $x_i$ in a geometric model space $G$, such that the geodesic distance between $x_i$ and $x_j$ in $G$ matches as closely as possible a given dissimilarity measure $d_{ij}$ (such as the weighted network distance) between nodes $b_i$ and $b_j$. This is achieved by minimizing the stress functional
$$\mathrm{Stress}(x_1, \dots, x_n) = \sqrt{\frac{2}{n(n-1)} \sum_{i<j} \left(d_{ij} - d^{\mathrm{geom}}_G(x_i, x_j)\right)^2}, \tag{3}$$
which measures the root mean square error between given network distances and the corresponding distances in the model space. For Euclidean geometry, this method is well known as multidimensional scaling 25,40 , or (using a weighted stress functional) as Sammon mapping 41 . For hyperbolic space, i.e., when $d^{\mathrm{geom}}_G = d_H$, several optimization methods for (3) have been proposed 26,27,42 . We use the hydra+ method implemented in the package hydra for the statistical computing environment R 43 .
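To make the stress criterion concrete, the following sketch evaluates the root mean square error between a dissimilarity matrix and the hyperbolic distances of an embedding given in polar coordinates $(r, \theta)$. It only evaluates the objective and does not optimize it, so it is not a substitute for hydra+. Averaging over all pairs $i < j$ is an assumption about the normalization; the point coordinates are hypothetical.

```python
from math import acosh, cos, cosh, sinh, sqrt

def disc_distance(p, q):
    """Hyperbolic distance between two points given in polar coordinates (r, theta)."""
    (r1, t1), (r2, t2) = p, q
    arg = cosh(r1) * cosh(r2) - sinh(r1) * sinh(r2) * cos(t1 - t2)
    return acosh(max(1.0, arg))  # clamp guards against round-off below 1

def stress(points, D, dist):
    """Root mean square error between dissimilarities D and embedded distances.

    ASSUMPTION: the mean is taken over all unordered pairs i < j.
    """
    n = len(points)
    sq_err = [(D[i][j] - dist(points[i], points[j])) ** 2
              for i in range(n) for j in range(i + 1, n)]
    return sqrt(sum(sq_err) / len(sq_err))

# Hypothetical embedding of three nodes.
points = [(0.5, 0.0), (1.0, 2.0), (1.5, 4.0)]
# Dissimilarities that the embedding reproduces exactly ...
D_exact = [[disc_distance(p, q) for q in points] for p in points]
# ... and dissimilarities that are off by 0.1 for every pair.
D_shifted = [[d + 0.1 if i != j else 0.0 for j, d in enumerate(row)]
             for i, row in enumerate(D_exact)]

print(stress(points, D_exact, disc_distance))
print(stress(points, D_shifted, disc_distance))
```

Passing a Euclidean distance function instead of `disc_distance` gives the corresponding MDS objective, which is what makes the two methods directly comparable.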
Hyperbolic centering. For a point cloud $x_1, \dots, x_n$ in $H^d$ and non-negative weights $w_1, \dots, w_n$ summing to one, the hyperbolic mean 36 or hyperbolic center of weight 44 can be determined as follows: Calculate the weighted Euclidean mean $\bar{x} = \sum_{i=1}^{n} w_i x_i$ and its 'resultant length' [...]. In dimension $d = 2$, the stereographic projection (2) can then be applied to convert the centered coordinates $x_i$ to centered polar coordinates $(r_i, \theta_i)$ in the Poincaré disc.
Application to financial networks. For the hydra+ embedding, the described methods were applied to the financial networks inferred from the EBA data as follows: We converted the similarity weights $w_{ij}$ (normalized LWPO) to dissimilarities $d_{ij} = 1 - w_{ij}$. We embedded these dissimilarities by minimizing the stress functional (3), using the R-package hydra. For the resulting network embeddings, we calculated the capital-weighted network center $c$ as the weighted hyperbolic mean with weights $w_i$ proportional to the total capital $\sum_{k=1}^{m} P_{ik}$ of bank $i$ invested in all assets $a_1, \dots, a_m$. After centering at the hyperbolic center $c$, we calculated the coordinates $(r_i, \theta_i)$ by the stereographic projection (2).
For the Mercator embedding and the dSBM, we first extracted network backbones, both by simple thresholding and by disparity filtering 24 . The resulting backbones were used as input for the methods provided at https://github.com/networkgeometry/mercator and the implementation of dSBM in the R-package randnet. Mercator outputs latent coordinates $(r_i, \theta_i)$ in the Poincaré disc, and the output of the dSBM method is a matrix of connection probabilities $p_{ij}$ for each node pair.
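Backbone extraction by simple thresholding can be sketched as follows: the edges of the weighted network are ranked by weight and only the top $p$% are kept as an unweighted edge set. Rounding the number of retained edges up and breaking ties by sort order are assumptions of this sketch; the function name and the toy weight matrix are hypothetical. Disparity filtering is not covered here.

```python
from math import ceil

def threshold_backbone(W, p):
    """Return the set of edges (i, j), i < j, among the top p% by weight.

    ASSUMPTION: the number of retained edges is rounded up.
    """
    n = len(W)
    edges = sorted(((W[i][j], i, j)
                    for i in range(n) for j in range(i + 1, n) if W[i][j] > 0),
                   reverse=True)
    keep = ceil(len(edges) * p / 100)
    return {(i, j) for _, i, j in edges[:keep]}

# Toy symmetric weight matrix for 4 hypothetical banks.
W = [[0.0, 0.9, 0.1, 0.2],
     [0.9, 0.0, 0.8, 0.05],
     [0.1, 0.8, 0.0, 0.7],
     [0.2, 0.05, 0.7, 0.0]]

print(sorted(threshold_backbone(W, 50)))
```

The resulting edge set is the kind of unweighted input that connectivity-based methods such as Mercator and the dSBM operate on.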
For multidimensional scaling (MDS) the same methodology as for hydra+ was used, except that Euclidean distance (instead of hyperbolic distance) was used as $d^{\mathrm{geom}}_G$ in the objective function (3).

Analysis of embedding results. AUPR and network reconstruction.
To evaluate the embedding results of the network backbones, we used the following network reconstruction task: To each pair of nodes $(b_i, b_j)$ we assign a score derived from $d^{\mathrm{geom}}_G(x_i, x_j)$, the geodesic distance in the geometric model space $G$, or (in the case of the degree-corrected stochastic block model) the score $p_{ij}$, where $p_{ij}$ is the estimated connection probability between nodes $i$ and $j$. Based on these scores we predict whether an edge is present between nodes $(b_i, b_j)$ or not, and construct the precision-recall (PR) curve 45 of this classifier. The area under the PR curve (AUPR) measures the quality of this predictor, with an AUPR of 1.0 representing perfect prediction.
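The reconstruction task can be sketched in a few lines. Here node pairs are scored by the negative embedded distance (closer pairs are predicted to be connected first), and the AUPR is estimated by average precision, i.e., the mean of the precision values at the ranks of the true edges. Both choices are assumptions for illustration, as are the distance values; the paper may use a different score transformation or integration rule.

```python
def average_precision(scores, labels):
    """Average precision of a ranking: mean precision at the rank of each true edge."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i]:
            hits += 1
            total += hits / rank  # precision among the top `rank` pairs
    return total / hits if hits else 0.0

# Hypothetical embedded distances for four node pairs and their true edge status.
distances = [0.4, 2.5, 0.9, 3.1]
is_edge = [1, 0, 1, 0]

# ASSUMPTION: score = negative distance, so that small distances rank first.
scores = [-d for d in distances]
print(average_precision(scores, is_edge))
```

For the dSBM, the estimated connection probabilities $p_{ij}$ would be passed as scores directly.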
Wilcoxon-Mann-Whitney test. The Wilcoxon-Mann-Whitney test 46 is a non-parametric test to decide whether the distributions of two populations are identical, without assuming that they follow the normal distribution. Let $X$ be a sample of size $m$ from the first population and $Y$ be a sample of size $n$ from the second population. Consider the combined sample of size $m + n$ ordered from least to greatest and denote the rank of $Y_i$ in this joint ordering by $S_i$, $i = 1, \dots, n$. Then the test statistic $W = \sum_{i=1}^{n} S_i$ is the sum of the ranks assigned to the values of $Y$.
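The rank-sum statistic $W$ can be computed directly from its definition. The sketch below assigns average ranks (midranks) to tied values, which is a common convention but an assumption here; the sample values are hypothetical radial coordinates, and the computation of P-values is omitted.

```python
def rank_sum_statistic(x, y):
    """Sum of the ranks of the y-values in the combined, sorted sample.

    ASSUMPTION: tied values receive the average of the ranks they occupy.
    """
    values = sorted(list(x) + list(y))
    midrank = {}
    i = 0
    while i < len(values):
        j = i
        while j < len(values) and values[j] == values[i]:
            j += 1
        # Positions i+1 .. j (1-based) share the same value.
        midrank[values[i]] = (i + 1 + j) / 2
        i = j
    return sum(midrank[v] for v in y)

# Hypothetical radial coordinates: non-G-SIBs (x) and G-SIBs (y).
x = [1.2, 3.4, 2.2]
y = [0.5, 0.9]
print(rank_sum_statistic(x, y))
```

When all values of one sample lie below those of the other, as in this toy example, $W$ attains its minimum $n(n+1)/2$.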
ANOVA for circular data. With the analysis of variance for circular data 36 , we test for the equality of $p$ mean directions from independent circular (i.e. taking values on the unit circle) populations with von Mises distribution and the same (unknown) concentration parameter $\kappa$. We test the null hypothesis $H_0 : \mu_1 = \dots = \mu_p$, where $\mu_i$ are the mean directions of the $p$ populations, each following an $M(\mu_i, \kappa)$ distribution. For any circular observation $\theta$, denote $s = \sin(\theta)$, $c = \cos(\theta)$ and let $\bar{s}_i, \bar{c}_i$ be the averages within the $i$-th population. Let $n_i$ be the sample size and $\bar{R}_i = \sqrt{\bar{s}_i^2 + \bar{c}_i^2}$ the mean resultant length of the $i$-th population, and let $n = \sum_{i=1}^{p} n_i$ be the size of the combined sample and $\bar{R}$ the overall mean resultant length based on all $n$ observations. An identity relating these resultant lengths has the approximate $\chi^2$ decomposition $\chi^2_{n-1} = \chi^2_{n-p} + \chi^2_{p-1}$ 36 , from which the test statistic $F$ can be derived. The null hypothesis is rejected at a given significance level $\alpha$ when $F > F_{p-1, n-p; \alpha}$, where $F_{p-1, n-p; \alpha}$ is the upper $\alpha$-quantile of the F-distribution with $p - 1$ and $n - p$ degrees of freedom 36 .
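The building blocks of the circular ANOVA, the mean direction and the mean resultant length of a sample of angles, can be computed as in the following sketch. It covers only these summary quantities and not the F statistic itself; the function name and the sample angles are hypothetical.

```python
from math import atan2, cos, pi, sin, sqrt

def mean_direction_and_length(thetas):
    """Return (mean direction in [0, 2*pi), mean resultant length in [0, 1])."""
    n = len(thetas)
    s_bar = sum(sin(t) for t in thetas) / n
    c_bar = sum(cos(t) for t in thetas) / n
    length = sqrt(s_bar ** 2 + c_bar ** 2)
    direction = atan2(s_bar, c_bar) % (2 * pi)
    return direction, length

# A tightly concentrated sample has mean resultant length close to 1 ...
print(mean_direction_and_length([0.1, 0.1, 0.1]))
# ... while two opposite directions cancel out (length close to 0).
print(mean_direction_and_length([0.0, pi]))
```

A mean resultant length near 1 within each regional group, combined with clearly different mean directions across groups, is the pattern that the test is designed to detect.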

Data availability
The data analysed during the current study are available from the website of the European Banking Authority at https://www.eba.europa.eu/risk-analysis-and-data/eu-wide-stress-testing and https://eba.europa.eu/risk-analysis-and-data/eu-wide-transparency-exercise/2018.