Dependency Structure and Scaling Properties of Financial Time Series Are Related

We report evidence of a deep interplay between the hierarchical properties of cross-correlations and the multifractality of New York Stock Exchange daily stock returns. The degree of multifractality displayed by different stocks is found to be positively correlated with their depth in the hierarchy of cross-correlations. We propose a dynamical model that reproduces this observation along with an array of other empirical properties. In this model the hierarchical structure of heterogeneous risks plays a crucial role in the time evolution of the correlation matrix, providing an interpretation of the mechanism behind the interplay between cross-correlation and multifractality in financial markets, where the degree of multifractality of stocks is associated with their hierarchical positioning in the cross-correlation structure. The empirical observations reported in this paper offer a new perspective on the merging of univariate multiscaling and multivariate cross-correlation properties of financial time series.


Data and clusters
In this section we report additional results for the most relevant clusters and industrial sectors showing a dependency between multifractality and hierarchical order. In Figure 1 we show the two hierarchical structures recovered by performing DBHT clustering (top plot) and SLCA clustering (bottom plot). For the DBHT clustering one has n ∈ [5,19], whereas for SLCA one finds n ∈ [2,115]. The very wide range found for SLCA seriously jeopardises the use of this clustering algorithm for the purpose of this study. Moreover, as clearly marked on the top plot in Figure 1, the DBHT identifies the clusters uniquely starting from a topological constraint, while the SLCA only provides the hierarchical organisation without clearly specifying clusters. We report in Table 1 the Bloomberg sectors and the number of stocks they contain. The DBHT method produces thirteen clusters that are quite heterogeneous in size and composition. In Table 2 we summarise the composition of the clusters in terms of stocks belonging to different sectors. We find three large clusters containing 54, 56 and 64 stocks respectively, all quite heterogeneous in composition. Clusters 2 and 3 contain a sizeable fraction of stocks belonging to the Financial and Industrial sectors. This is confirmed by the enrichment analysis [1], which validates the identification of the clusters with the respective sectors at a significance level of 1%. The other clusters are significantly smaller but still capture the sector membership quite well: cluster 13, for example, contains stocks from the Utility sector only, clusters 9 and 10 from the Consumer Goods and Basic Materials sectors respectively. Cluster 7 has a very large component of stocks belonging to the Technology sector, while cluster 12 is mainly composed of stocks from the Services sector. For each stock we have computed the hierarchical order n, as defined in the main text.
By construction the three largest clusters are the ones showing larger n on average, therefore including those stocks with more complex hierarchical structure.

Table 2: Percentage of stocks for each sector in each cluster. We report, for each of the 13 clusters identified through the DBHT over the whole time period 2-01-1997 to 31-12-2012, the percentages of the stocks from each of the 9 Bloomberg sectors.
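The enrichment analysis cited above is not detailed in this section. A common way to test whether a sector is over-represented in a cluster is an upper-tail hypergeometric test, and the following minimal sketch assumes that choice. The function name `enrichment_p_value` and the counts in the example are hypothetical; only the total of 342 stocks and the 1% level come from the text.

```python
from math import comb


def enrichment_p_value(n_total, n_sector, n_cluster, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    X counts how many of the n_cluster stocks of a cluster fall in a sector
    of n_sector stocks when the cluster is drawn at random from n_total stocks.
    """
    upper = min(n_sector, n_cluster)
    tail = sum(
        comb(n_sector, k) * comb(n_total - n_sector, n_cluster - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / comb(n_total, n_cluster)


# Hypothetical counts, for illustration only (not taken from Tables 1 and 2).
p_value = enrichment_p_value(n_total=342, n_sector=30, n_cluster=20, n_overlap=12)
is_enriched = p_value < 0.01  # 1% level quoted in the text
```

When many cluster-sector pairs are tested at once, a multiple-testing correction would normally be applied to the threshold; whether and how this was done in the original analysis is not stated here.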
The top plot in Figure 2 shows multifractality plotted versus hierarchical order for stocks belonging to the Basic Materials sector. We can see a very fast growth of multifractality for hierarchical orders in the range [7, 12], followed by an approximately flat trend for higher orders. This behaviour suggests that the dependence between the two variables is most probably non-linear (see the figure captions for quantitative details on the trends). The same inflection suggesting non-linear behaviour is also observed for stocks belonging to the Utilities sector (see bottom plot in Figure 2). These observations hint at a saturation of multifractality beyond a certain order, as if the effect of the many risks associated with the dendrogram nodes decreased with the order of the node. We have also looked at the trends observed on sub clusters, i.e. groups of stocks below the cluster level in the hierarchy. Restricting to hierarchical levels below the clusters has two opposite consequences. On the one hand, it allows us to be stricter in factoring out a larger number of common hierarchical levels, hence making the differentiation between different stocks more relevant. On the other hand, it can significantly reduce the number of stocks and thus potentially increase the influence of noise on the trends. After a thorough analysis of all sub clusters of different order, we found that in all cases the trends recovered on sub clusters are, when significant, the same as those observed on the clusters. We use the notation $S^{j}_{i,k}$ to label such sets, where j = 1, . . . , 13 denotes the cluster label, i denotes the order below the cluster and k = 1, . . . , $2^i$ denotes which of the $2^i$ sub clusters at each order we are considering. We plot in Figure 3 ∆H(1, 2) versus n for cluster no. 6 in Table 1 in the main article and for the two sub clusters $S^{3}_{1,1}$ and $S^{2}_{2,1}$.
The sub cluster $S^{3}_{1,1}$ is very similar to cluster no. 3 shown in the main text: cluster no. 3 in fact results from the merging of $S^{3}_{1,1}$ and $S^{3}_{1,2}$, the latter containing only 5 stocks.

A different data set: London Stock Exchange
The main empirical result reported in this paper has also been verified on a different data set. We have analysed the set of the 185 most capitalised stocks traded at the London Stock Exchange (LSE) in the period 04/01/2000-21/08/2013. Data have been provided by Bloomberg. The DBHT returned a range of hierarchical orders comparable to that obtained for the NYSE, with stocks exhibiting n ∈ [4,17]. We plot in Figure 4 the equivalent of Figure 2 in the main text. We observe the same increasing trend up to n = 12, followed by a drop and a plateau at higher orders, where the observations are however fewer. The regime n ∈ [4,12] confirms the observation reported in the main text for NYSE: within a certain range of orders, multifractality and hierarchical order appear to be positively correlated.

Asian markets: one dominant market mode
We have also analysed the set of the 386 most capitalised stocks listed on the Tokyo Stock Exchange (TSE) and the 136 most capitalised stocks on the Hong Kong Stock Exchange (HKSE) in the period 01/01/2003-26/08/2013. For both markets, the DBHT returns a taxonomy of clusters very different from that found in the NYSE and LSE data. For the Japanese market we found only six clusters, among which there is a gigantic cluster containing 300 stocks, while the remaining stocks are more or less uniformly distributed among the other clusters. A closer inspection of the correlation matrix of this system reveals that, on average, these stocks show correlations larger than those observed in the western markets, with an average correlation of 0.38 (compared to 0.18 found for NYSE and 0.20 for LSE). As a result of this very homogeneous taxonomy of clusters, stocks exhibit hierarchical orders varying in the range [3,121]: as discussed in the main text, a dendrogram with a small number of clusters necessarily forces the hierarchy to develop vertically. For the HKSE we found six clusters and n ∈ [4,51]. It is nearly impossible to distinguish the effect of such a wide range of orders on the multifractal properties of stocks, but the interdependence between the two variables ∆H(1, 2) and n is nonetheless still observed for small n (see Figure 5 (top) for TSE and Figure 5 (bottom) for HKSE). We note that the number of observations for each n is too small to allow any robust statistical conclusion (the errors are consequently very large).
To remove these very large hubs of correlated stocks, we have detrended the time series in order to get rid of the dominant market mode and thus observe a more heterogeneous structure of clusters. Returns are assumed to follow a one-factor linear model of the following form [2]:

$$r_{i,t} = \alpha_i + \beta_i I_t + c_{i,t},$$

where $I_t = \frac{1}{N}\sum_{i=1}^{N} r_{i,t}$ is the composite index with homogeneous weights. After estimating the coefficients $\beta_i$ for each stock, the correlation matrix has been computed on the residuals $c_{i,t}$, which are thus deprived of any common trend.

Figure 5 (caption fragment): [...] Relationship between ∆H(1, 2) and n for n ∈ [3,9], neglecting all higher hierarchical orders (n ∈ [10, 51]) for HKSE. The solid lines are the best fits on the averages. In both cases a positive trend is observed. Correlation coefficients and corresponding p-values are 0.41 and 0.09 for TSE and 0.61 and 0.14 for HKSE.
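The detrending step described above can be sketched as an ordinary least-squares regression of each stock on the equal-weight index, followed by a correlation matrix of the residuals. This is a minimal illustration, not the authors' code: the function names are hypothetical, the estimation method (OLS with an intercept) is an assumption, and only the standard library is used to keep the sketch self-contained.

```python
from math import sqrt
from statistics import fmean


def detrend_market_mode(returns):
    """Regress each stock on the equal-weight index; return (betas, residuals).

    returns: list of N return series (one per stock), all of length T.
    """
    n_obs = len(returns[0])
    # Composite index with homogeneous weights: I_t = (1/N) * sum_i r_{i,t}.
    index = [fmean(series[t] for series in returns) for t in range(n_obs)]
    index_mean = fmean(index)
    index_ss = sum((x - index_mean) ** 2 for x in index)
    betas, residuals = [], []
    for series in returns:
        series_mean = fmean(series)
        beta = sum((x - index_mean) * (y - series_mean)
                   for x, y in zip(index, series)) / index_ss
        alpha = series_mean - beta * index_mean  # intercept (assumed in this sketch)
        betas.append(beta)
        residuals.append([y - alpha - beta * x for x, y in zip(index, series)])
    return betas, residuals


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length series."""
    mean_a, mean_b = fmean(a), fmean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def residual_correlation_matrix(residuals):
    """Correlation matrix of the detrended series c_{i,t}."""
    n = len(residuals)
    return [[pearson(residuals[i], residuals[j]) for j in range(n)]
            for i in range(n)]
```

With an equal-weight index the estimated betas average to one and every residual series is uncorrelated with the index by construction, which gives two quick sanity checks on an implementation.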
As expected, after detrending the number of clusters increases to 21 and the range of n is found to be [5,21]. However, multifractality appears to be completely uncorrelated with n, as can be seen from the (almost) flat trend in Figure 6. In our opinion this observation has the following interpretation: in a very correlated data set, the market mode is by far the most relevant factor affecting the dynamics of the stocks. As a consequence, removing the common trend all but removes the most relevant source of complexity of the time series, whose remaining heterogeneous properties cannot explain the correlation between hierarchical order and multifractality. Note that this interpretation is not at odds with what is postulated in our model, where one single market mode is equivalent to a vanishing hierarchy in which the intermediate nodes do not introduce any diversification in the market, as all stocks share the same set of nodes. The Japanese market thus appears to be much less diversified than the American and British ones, and one factor seems to be enough to describe its dynamics. In general, it remains to be ascertained whether the DBHT is always the correct tool to measure the hierarchy in a very correlated data set, where very large hubs of stocks tend to appear, thwarting the possibility of tracking the effect of the hierarchy on the multiscaling properties of the time series. In sufficiently diversified markets like the NYSE, though, the effect of many risk factors is still reasonably well detected by the DBHT.

Bootstrapping the DBHT
Although the DBHT is a deterministic algorithm, which means that repeating the clustering many times on the same dataset yields the same result, the correlation matrix is inherently noisy and small perturbations of its entries may have dramatic consequences for the topology of the hierarchical structure. In particular, even slight modifications in the ranking of correlations are likely to undermine the organisation recovered with the original configuration. For this reason it makes sense to study the robustness of the DBHT method by validating the dendrogram obtained against some statistical significance level and, to this end, bootstrapping is probably the most suitable tool [3,4]. The estimates whose accuracy needs to be validated are the hierarchical orders of the stocks. We proceed as follows: we construct a number of synthetic resamples of the original dataset, where each time the surrogate data are obtained by the usual procedure of sampling with replacement. Each time the statistic of interest is computed, and at the end the original observation is validated against the distribution of the resample estimates obtained from the surrogate series. Since we are interested in giving accurate estimates of the orders of all stocks, the DBHT is performed on each bootstrapped correlation matrix for a total of 1000 resamples. From each of these dendrograms constructed from surrogate correlation matrices we extract the vector of bootstrapped hierarchical orders $n^{\beta} = (n_1, n_2, \ldots, n_N)^{\beta}$, where N = 342 is the number of stocks in the dataset and β = 1, . . . , 1000. Then we compare the order vector n observed on the empirical time series with the synthetic distribution of bootstrapped values. For each stock, we discard the empirical order as non-reliable if it falls outside the two-sided 0.9 probability interval obtained from the cumulative distribution function of the bootstrapped series.
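The validation procedure above can be sketched generically as follows. This is an illustration under stated assumptions: time observations (rows) are resampled jointly with replacement, the interval bounds are taken as simple order statistics of the bootstrap sample, and `column_means` is a hypothetical stand-in for the statistic, since a DBHT implementation is not reproduced here. In the paper the statistic would map each resampled dataset to the vector of hierarchical orders.

```python
import random


def bootstrap_validate(observations, statistic, n_resamples=1000, level=0.9, seed=0):
    """Flag, per stock, whether the empirical statistic lies inside the
    central `level` interval of its bootstrap distribution.

    observations: list of T rows, each a list of N stock returns.
    statistic: callable mapping such a dataset to a list of N numbers.
    """
    rng = random.Random(seed)
    n_obs = len(observations)
    observed = statistic(observations)
    draws = [[] for _ in observed]
    for _ in range(n_resamples):
        # Sampling with replacement over time observations (assumption).
        resample = [observations[rng.randrange(n_obs)] for _ in range(n_obs)]
        for i, value in enumerate(statistic(resample)):
            draws[i].append(value)
    tail = (1.0 - level) / 2.0
    validated = []
    for value, sample in zip(observed, draws):
        sample.sort()
        lower = sample[int(tail * (len(sample) - 1))]
        upper = sample[int(round((1.0 - tail) * (len(sample) - 1)))]
        validated.append(lower <= value <= upper)
    return validated


def column_means(observations):
    """Hypothetical stand-in statistic; the paper uses DBHT hierarchical orders."""
    n_obs = len(observations)
    return [sum(row[i] for row in observations) / n_obs
            for i in range(len(observations[0]))]
```

A stock whose flag is `False` would be discarded as non-reliable, matching the two-sided 0.9 criterion in the text.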
As an example of the validation procedure, we report the case of the order of the stock Air Products and Chemicals INC in Figure 7: the order n measured on the original DBHT is not validated by the bootstrap test. The number of stocks which fail the bootstrap test is remarkably high and, by coincidence, is exactly half of the total number of stocks. We show in Figure 8 the distribution of the measured order before the bootstrapping (blue bars) and the distribution of the order of the validated stocks (dashed red line). Although the number of stocks is halved, the shape of the distribution remains similar. We then denote the set of valid hierarchical orders as $n^{B}$. After the bootstrap validation, the trends recovered in the plots of ∆H(1, 2) versus n reveal some further hidden structure. We show in Figure 9 the trends observed on clusters 2, 3 and sub cluster $S^{3,B}_{1,1}$, where the stocks shown are only those that have been validated via bootstrapping. As also shown in Figure 8, the stocks excluded are more or less uniform in their order, which results in stocks with very large (above 15) and very small (below 7) order almost disappearing from the set. Although the number of stocks is reduced, the dependence of multifractality on the hierarchical order is seen even more clearly. Moreover, the non-linear behaviour guessed from the plots presented in the previous section is now even more evident from the middle plot in Figure 9.

[...] has then been performed on both tranches separately, after having computed the two weighted correlation matrices $\hat{C}^{w,T_1}$ and $\hat{C}^{w,T_2}$ [5]. We show the two hierarchical structures $H^{T_1}$ and $H^{T_2}$ obtained over the two different time periods in Figure 11. These two hierarchical structures have then been used to simulate the model producing the results reported in the main text. To each stock i = 1, . . . , 25 is associated a hierarchical tree $\Gamma^{T_k}_{i}$, for k = 1, 2, including all the nodes above the stock.
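The tree of nodes above a stock can be read off a dendrogram by walking from the leaf to the root. The sketch below assumes the dendrogram is stored as a child-to-parent map and that the number of risks of a stock equals the number of nodes above it; the precise definition of the hierarchical order is given in the main text, so this count is an assumption here. The function name and the toy dendrogram are hypothetical.

```python
def ancestor_sets(parent):
    """Map each leaf (stock) to the list of dendrogram nodes above it.

    parent: dict mapping every non-root node to its parent node.
    """
    leaves = set(parent) - set(parent.values())
    gamma = {}
    for leaf in leaves:
        path, node = [], leaf
        while node in parent:  # climb until the root is reached
            node = parent[node]
            path.append(node)
        gamma[leaf] = path
    return gamma


# Toy dendrogram (hypothetical): node "a" joins stocks 1 and 2,
# the root "b" joins node "a" and stock 3.
parent = {1: "a", 2: "a", "a": "b", 3: "b"}
gamma = ancestor_sets(parent)
order = {stock: len(nodes) for stock, nodes in gamma.items()}
shared = set(gamma[1]) & set(gamma[3])  # risks common to stocks 1 and 3
```

Comparing `order` computed on two dendrograms, one per tranche, is enough to classify each stock as increasing, decreasing or constant in order.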
We plot in Figure 12 the simulated returns process and volatility process for the stock labelled 11 in Figure 11, as an example of the effect of the changing hierarchy. In the tranche $T_1$ this particular stock shares one common risk with the stock labelled 14, then one common risk with the pair of stocks 6, 15 and so on. In the tranche $T_2$ it shares one common risk with stock 25, one common risk with the pair 3, 9 and so on. Overall the topology of the dendrogram changes quite dramatically in the second tranche and this results in a rather different spreading of the risks. For the stock labelled 11 the number of risks increases from 5 to 8. One can clearly see the effect of the larger number of risks, which causes fluctuations to become much larger.

Figure 12: We plot returns (top) and volatility (bottom) for the simulated process whose hierarchical structure is $\Gamma_{11} = \Gamma^{T_1}_{11}$ for t ∈ [1, 2013] and $\Gamma_{11} = \Gamma^{T_2}_{11}$ for t ∈ [2014, 4026]. The common volatility process $x_t$ has been simulated as discussed in the Methods section of the main text with parameters λ = 0.2 and T = 800.

We performed an extensive analysis studying the measured multifractality and scaling exponents on many synthetic time series, considering separately those whose hierarchical order increases and those whose hierarchical order decreases. We have simulated 1000 realisations of a 25-stock, 2-regime multivariate DHM of length 4000 with hierarchical structures given by $H^{T_1}$ and $H^{T_2}$ on the two tranches $T_1$ (t ∈ [1,2000]) and $T_2$ (t ∈ [2001, 4000]) respectively. Each time we computed the scaling exponents and the multifractality indicator ∆H(1, 2), keeping track of whether, for each stock i, $n^{T_1} > n^{T_2}$ or $n^{T_1} < n^{T_2}$. For the hierarchies in Figure 11 we find 15 stocks whose order increases and 10 stocks whose order decreases. With this analysis one can test whether there is a significant correlation between the hierarchical order and the scaling properties of the synthetic time series. We performed this analysis for different values of the probabilities of the risks, considering all p's to be uniformly distributed in [...] Tables 3, 4 and 5.

Table caption fragment: [...] Standard errors are of order $10^{-3}$ and not reported. All mean values in $T_1$ have been statistically checked to be significantly different from those in $T_2$ via a t-test, with p-values always negligible at the 5% threshold.

From these tables two general facts stand out:
• The observed multifractality is correlated with the order of the hierarchy. In all cases indeed we observed higher multifractality on the tranches where the hierarchical order was larger than on those where it was smaller. As one can read off all tables, simulated time series whose order increases when switching from the first to the second hierarchy show increasing multifractality, whereas series whose order decreases show decreasing multifractality.
• Multifractality is largest for parameters p's in [0.4, 0.6] and decreases significantly in the other two sets of simulations.

The increase or decrease of multifractality and of the scaling exponents H(1) and H(2) has been statistically validated through a two-sided t-test at the 5% level: the p-values are always extremely small, confirming a true differentiation between the scaling properties in $T_1$ and $T_2$.
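The comparison between the two tranches can be illustrated with a two-sample test on the per-realisation values of ∆H(1, 2). The text does not say which variant of the t-test was used, so the sketch below assumes Welch's unequal-variance statistic and, to stay within the standard library, replaces the Student t distribution with a normal approximation, which is reasonable only for large samples such as the 1000 realisations used here. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist, fmean, variance


def welch_t_test(sample_a, sample_b):
    """Welch t statistic and a two-sided p-value (normal approximation).

    sample_a, sample_b: sequences of a scaling measure, e.g. Delta H(1, 2),
    one value per simulated realisation in each tranche.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    standard_error = sqrt(variance(sample_a) / n_a + variance(sample_b) / n_b)
    t_stat = (fmean(sample_a) - fmean(sample_b)) / standard_error
    # Large-sample approximation: the t distribution is close to a normal one.
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(t_stat)))
    return t_stat, p_value
```

Applied to the values measured in $T_2$ and $T_1$ for a given stock, a p-value below 0.05 would correspond to the 5% criterion quoted above.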
Among the 25 stocks analysed there are 4 stocks whose hierarchical order is unchanged, $n^{T_1} = n^{T_2}$. We show in Figure 13 the distribution of ∆H(1, 2) obtained on the two different tranches for these 4 stocks. Compared to the plots in Figure 7 in the main text, we can see how the difference between the values obtained in the two windows is much smaller for the four stocks with $n^{T_1} = n^{T_2}$. This tells us that keeping the hierarchy constant does not have a sizeable impact on the multifractal properties of the stocks. We also report in Tables 6 and 7 the numerical values of the quantiles shown in Figure 7 in the main text. Table 6 shows quantiles for the stocks having $n^{T_1} < n^{T_2}$ while Table 7 shows quantiles for stocks having $n^{T_1} > n^{T_2}$. Finally, quantiles for stocks having $n^{T_1} = n^{T_2}$ are reported in Table 8.

Proof of equation (4)
Figure 13 (caption fragment): [...] for q = {2.5%, 50%, 97.5%} for the 4 DHM simulated time series whose hierarchical order $n^{T_1} = n^{T_2}$. Red lines correspond to p-quantiles in $T_1$ while light blue lines to p-quantiles in $T_2$.

Table 8: In this table we report the set of quantiles obtained from the distribution of the multifractality proxy ∆H(1, 2) on the two tranches $T_1$ and $T_2$ for 1000 simulated time series whose hierarchical order $n^{T_1} = n^{T_2}$. Probabilities p's of the model are initialised to take values in [0.4, 0.6]. One observes an almost rigid shift towards smaller values of the distribution of the observed ∆H(1, 2)s when the order decreases.

Let us consider two return processes $r_{i,t}$ and $r_{j,t}$ having arbitrary hierarchical trees [...] with corresponding risks [...]. Note that $\Gamma_i$ and $\Gamma_j$ may or may not overlap. Denoting by c an arbitrary node in the dendrogram, we consider also the sets [...]. We drop the subscript t for the sake of readability and consider the two random variables $r_i$ and $r_j$ at arbitrary time t. The correlation coefficient between two returns $r_i$ and $r_j$ is given by

$$\rho_{ij} = \frac{\mathrm{Cov}(r_i, r_j)}{\sqrt{\mathrm{Var}(r_i)\,\mathrm{Var}(r_j)}}.$$

The covariance between $r_i$ and $r_j$ is [...]. Since $\langle K_m \rangle = p_m$, we have $\langle e^{2K_m} \rangle = p_m(e^2 - 1) + 1 = \zeta_2(p_m)$ and $\langle e^{K_m} \rangle = p_m(e - 1) + 1 = \zeta_1(p_m)$. One also has [...]. The correlation coefficient therefore is [...]. The last expression can be further simplified by noting that [...] and

$$\prod_{m=1}^{n_j} \zeta_2(p_m) = \prod_{z:\,a_z \in \Gamma_i \cap \Gamma_j} \zeta_2(p_z)\,[\ldots]$$

The correlation coefficient $\rho_{ij}$ then reads [...], which yields [...]. Defining $F_{ij}(p; \Gamma_i, \Gamma_j) = \prod_{l:\,a_l \in \Gamma_i \setminus \Gamma_j} [\ldots]$, we finally have

$$\rho_{ij} = \mathrm{Corr}(\epsilon_i, \epsilon_j)\, F_{ij}(p; \Gamma_i, \Gamma_j).$$
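Several displayed equations of this proof did not survive extraction. As a reading aid, the following LaTeX sketch shows one way the steps fit together. It rests on assumptions not stated in this section: that each return has the form $r_i = \epsilon_i \exp\big(\sum_{m:\,a_m \in \Gamma_i} K_m\big)$, that the $K_m$ are independent Bernoulli variables with success probability $p_m$, independent of the zero-mean Gaussian vector $\epsilon$, and that any volatility factor common to all stocks is ignored. The symbol $\Gamma_i \triangle \Gamma_j$ denotes the nodes belonging to exactly one of the two trees. The explicit form of $F_{ij}$ below is a consequence of these assumptions, not a quotation of the original.

```latex
% Sketch under the assumptions stated in the lead-in; not the original equations.
\begin{align*}
\mathrm{Cov}(r_i, r_j) &= \mathrm{Cov}(\epsilon_i, \epsilon_j)
  \prod_{m:\,a_m \in \Gamma_i \cap \Gamma_j} \zeta_2(p_m)
  \prod_{m:\,a_m \in \Gamma_i \triangle \Gamma_j} \zeta_1(p_m), \\
\mathrm{Var}(r_i) &= \mathrm{Var}(\epsilon_i)
  \prod_{m:\,a_m \in \Gamma_i} \zeta_2(p_m), \\
\rho_{ij} &= \mathrm{Corr}(\epsilon_i, \epsilon_j)
  \prod_{m:\,a_m \in \Gamma_i \triangle \Gamma_j}
  \frac{\zeta_1(p_m)}{\sqrt{\zeta_2(p_m)}}, \\
\zeta_2(p) - \zeta_1(p)^2 &= p\,(1 - p)\,(e - 1)^2 \;\ge\; 0 .
\end{align*}
```

Shared nodes contribute $\zeta_2$ to both the covariance and the two standard deviations and therefore cancel, while each non-shared node leaves a factor $\zeta_1/\sqrt{\zeta_2}$. The last identity shows that each such factor is at most one and equals one only for $p = 0$ or $p = 1$.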
Some important remarks:
• The total correlation is factorized into two separate contributions, the first one associated with the multivariate Gaussian vector $\epsilon_t$ and the second one only dependent on the probabilities of the nodes, that is on the hierarchy.
• The hierarchy factor $F_{ij}(p; \Gamma_i, \Gamma_j)$ is never larger than one and thus acts as a perturbation of $\mathrm{Corr}(\epsilon_i, \epsilon_j)$, damping it according to the values assumed by the parameters p.
• The two limit cases $p_m = 0$, ∀m = 1, . . . , N − 1 and $p_m = 1$, ∀m = 1, . . . , N − 1 both correspond to the hierarchy factor being 1. In both cases the effect of the hierarchy is null and one has $\rho_{ij} = \mathrm{Corr}(\epsilon_i, \epsilon_j)$.
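The remarks above can be checked numerically. The functions $\zeta_1$ and $\zeta_2$ below follow the definitions in the proof, whereas the explicit form of the hierarchy factor, a product of $\zeta_1(p)/\sqrt{\zeta_2(p)}$ over the nodes that belong to only one of the two trees, is an assumption of this sketch chosen to be consistent with those remarks. The function names and the toy trees in the usage note are hypothetical.

```python
from math import e, sqrt


def zeta1(p):
    """Expected value of exp(K) for a Bernoulli variable K with P(K = 1) = p."""
    return p * (e - 1.0) + 1.0


def zeta2(p):
    """Expected value of exp(2K) for a Bernoulli variable K with P(K = 1) = p."""
    return p * (e ** 2 - 1.0) + 1.0


def hierarchy_factor(p, gamma_i, gamma_j):
    """Assumed form of F_ij: one damping term per node not shared by the trees.

    p: dict mapping node label to its probability.
    gamma_i, gamma_j: sets of node labels above stocks i and j.
    """
    factor = 1.0
    for node in gamma_i ^ gamma_j:  # symmetric difference of the two trees
        factor *= zeta1(p[node]) / sqrt(zeta2(p[node]))
    return factor
```

For example, with trees `{"a", "c"}` and `{"b", "c"}` the factor is exactly 1 when all probabilities are 0, equals 1 up to rounding when they are all 1, and is strictly below 1 for intermediate values such as 0.5.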
Overall one can conclude that including a hierarchical structure in the volatility modelling introduces a perturbation to the correlation matrix of the standard multivariate model, which is nonetheless recovered when the probabilities of the internal risks are all null or all unity. These two cases indeed correspond to the hierarchical structure disappearing, as there would not be any difference between different trees. We show in Figure 14 the distribution of observed correlation coefficients on different realisations of the DHM with probabilities taking values in different ranges. We can see how increasing the range of probability values shifts the bulk of the distribution to the right, which eventually matches that of a log-normal multivariate model, where the hierarchical factor is turned off.