Introduction

Classical inverse statistical mechanics has been applied to determine properties of pairwise interaction potentials from ensemble properties in complex networks1. The maximum entropy model characterizes the correlation structure of network activity without assumptions about its mechanistic origin and enables predictions of collective effects2. These maximum entropy models are equivalent to Ising models in statistical physics and, as we show in the following, they allow us to explore the critical behavior of the network.

Despite the wide application of maximum entropy models over the past decade1,2,3,4,5, the main focus has been on characterizing specific interactions between two elements or the community structure of networks. The global dynamical response of an evolving network based on its inherent symmetries, however, has not yet been addressed. Here we apply percolation theory to measure the strength of interactions in a time-dependent, evolving network of the financial market. We show that away from a financial crisis the interaction network is self-similar and exhibits geometrical criticality at a size-independent interaction threshold, while during a crisis the network responds differently at different size scales.

Percolation theory6 is the simplest fundamental model in statistical physics that displays phase transitions and explains the behavior of connected clusters in a random graph. The geometric critical behavior is dominated by the emergence of a giant connected component which controls the global response of the network. Resilience of networks under attack7,8, spreading phenomena and epidemics9,10,11,12,13,14 are examples of diverse problems that can be treated using percolation theory.

The concept of random graphs9,10 was put forward by Erdős and Rényi15, who introduced the simplest imaginable random graph \(ER_n(p)\) of n vertices, in which an edge is placed between any pair of distinct vertices with some fixed probability p. This random graph exhibits a continuous phase transition at a critical threshold \(p_c\), at which global connectivity of the network components emerges sharply, and it serves as the mean-field model of percolation.

We find that despite seemingly strong correlations in the interaction network of the financial market away from a global crisis, it can be modeled, to a good extent, by an \(ER_n(p)\) random graph of n interacting stock markets when the control parameter p is taken to be the strength of pairwise interactions.

In this paper, we study the critical behavior of a financial network consisting of about 400 indices (or vertices) in the S&P 500 whose activities are available as time series. We first build a time-dependent correlation matrix of the stocks from the time series. Then, we extract the interaction matrix among the system’s elements, in which the indirect correlations are eliminated. This interaction matrix is the adjacency matrix of the elements’ interaction network. Given the interaction network of a system, we can reduce the full system to disjoint components with positive intra-component interactions (relative to a given strength threshold, see below). The collective, large-scale information of the system is encoded in the statistics of the largest component, which can also control the dynamics of the system. This collective behavior can also result in large-scale deviations in the system’s state. For example, in a financial market all stock indices may fall together and influence the global index of the market16.

Our analysis of the time series of the S&P 500 indices as the system elements, and its mapping to the percolation problem on networks, reveals that the dynamics of a network of stock markets can be well modeled by the critical random network theory of \(ER_n(p)\) away from a global financial crisis, while around and at the crisis the network departs from criticality. This observation contrasts with ordinary critical phenomena, in which large-scale fluctuations play a crucial role in the behavior of systems in the vicinity of the critical state and are actually responsible for the criticality. Despite large fluctuations in stock prices over a crisis period (Fig. 1, light blue bars), the underlying network model shows non-critical behavior, and fluctuations drive the system out of criticality.

Figure 1

Log-returns of prices for four sample stocks, shown in different colors. The shaded vertical bands mark the periods of major financial crises.

Data preparation

We analyse the available data for “adjusted closing prices” in the S&P 500 index between 2000 and 2017. The time series \(\vec{X}(t)=(X_{1}(t),X_{2}(t),\ldots ,X_{n}(t))^{T}\) of the prices of \(n=396\) stocks are extracted from finance.yahoo.com. We only consider the data for working days and use linear interpolation to fill gaps in the data. For each stock i, as shown in Fig. 1, we work with the normalized log-return \(x_{i}(t)\) of the data17, defined as

$$x_{i}(t)=\frac{x^{\prime}_{i}(t)-\mu (t)}{\sigma (t)}\quad \text{with}\quad x^{\prime}_{i}(t)=\log X_{i}(t)-\log X_{i}(t-1),$$
(1)

where \(\mu (t)\) and \(\sigma (t)\) denote the average and standard deviation of \(x^{\prime}\), respectively.
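
As a rough illustration of Eq. (1), the sketch below computes normalized log-returns for a single price series using only the Python standard library. The function name `normalized_log_returns` and the toy prices are hypothetical, and the code assumes that μ and σ are the mean and standard deviation of \(x^{\prime}\) taken over the supplied series of one stock; the text does not spell out the averaging domain, so this is one possible reading rather than the authors' exact procedure.

```python
import math
import statistics


def normalized_log_returns(prices):
    """Normalized log-returns of one price series, following Eq. (1)."""
    # Raw log-returns x'(t) = log X(t) - log X(t-1).
    raw = [math.log(b) - math.log(a) for a, b in zip(prices, prices[1:])]
    # Assumption: mu and sigma are taken over this stock's own series.
    mu = statistics.fmean(raw)
    sigma = statistics.pstdev(raw)
    return [(r - mu) / sigma for r in raw]


# Hypothetical adjusted closing prices for one stock.
toy_prices = [100.0, 101.0, 99.5, 102.0, 103.5, 101.0]
x = normalized_log_returns(toy_prices)
print(x)
```

With this normalization every series has zero mean and unit variance, which puts stocks with very different volatilities on a common scale before correlations are computed.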

Interaction network

For a given time t and a time window τ (see Fig. 1), we construct a multivariate \(n\times \tau \) matrix \({D}_{\tau }(t)\) for n time series as

$$D_{\tau }(t)=\left[\begin{array}{cccc}x_{1}(t-\tau +1) & x_{1}(t-\tau +2) & \cdots & x_{1}(t)\\ x_{2}(t-\tau +1) & x_{2}(t-\tau +2) & \cdots & x_{2}(t)\\ \vdots & \vdots & \ddots & \vdots \\ x_{n}(t-\tau +1) & x_{n}(t-\tau +2) & \cdots & x_{n}(t)\end{array}\right].$$
(2)

We then build up the correlation matrix \({C}_{\tau }(t)\) among the time-series for each time window as

$$C_{\tau }(t)=D_{\tau }(t)\cdot D_{\tau }(t)^{T}.$$
(3)

In order to monitor the time evolution of the correlation matrix, we move the time t in steps of 30 days over the whole period between 2000 and 2017.
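
The windowing of Eqs. (2) and (3) can be sketched as follows. This is a minimal plain-Python illustration: the helper names (`window_matrix`, `correlation_matrix`, `sliding_correlations`) and the toy data are hypothetical, time is indexed from zero, and the product \(D\cdot D^{T}\) is taken literally as written in Eq. (3), without any additional normalization.

```python
def window_matrix(x, t, tau):
    """Rows of D_tau(t): x_i(t - tau + 1), ..., x_i(t) for every series i."""
    return [series[t - tau + 1 : t + 1] for series in x]


def correlation_matrix(d):
    """C = D . D^T as written in Eq. (3)."""
    return [[sum(a * b for a, b in zip(row_i, row_j)) for row_j in d] for row_i in d]


def sliding_correlations(x, tau, step):
    """Yield (t, C_tau(t)) while moving t forward in increments of `step`."""
    n_times = len(x[0])
    for t in range(tau - 1, n_times, step):
        yield t, correlation_matrix(window_matrix(x, t, tau))


# Hypothetical normalized log-returns for three stocks over six days.
x = [
    [0.1, -0.2, 0.3, 0.0, -0.1, 0.2],
    [0.2, -0.1, 0.2, 0.1, -0.2, 0.1],
    [-0.1, 0.3, -0.2, 0.0, 0.1, -0.3],
]
results = list(sliding_correlations(x, tau=3, step=2))
for t, c in results:
    print(t, c[0])
```

In the setting of the text, `tau` would be the window length and `step` the 30-day increment; the toy values above are chosen only to keep the output small.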

Based on the correlation matrix of Eq. (3), we extract the interaction matrix \(J_{\tau }(t)\) 1,2,3,4,5

$$J_{\tau }(t)=\left[\begin{array}{cccc}j_{11} & j_{12} & \cdots & j_{1n}\\ j_{21} & j_{22} & \cdots & j_{2n}\\ \vdots & \vdots & \ddots & \vdots \\ j_{n1} & j_{n2} & \cdots & j_{nn}\end{array}\right],$$
(4)

whose symmetric elements \(j_{lk}=j_{kl}\) represent the strength of interaction between stocks l and k in the time window [t, t + τ). Because of the finite sample size of our data sets, we use the “Graphical LASSO” technique18,19 to evaluate the interaction matrix \(J_{\tau }(t)\), with the regularization penalty of GLASSO set to 0.1. We use the output Θ of the graphical LASSO as an estimator of the inverse covariance matrix. This matrix appears in the multivariate Gaussian distribution and determines the strength of the interactions among the different dimensions of the probability density function. In comparison with the Ising model, the coupling coefficients are the negatives of the elements of this matrix1,18,19.

The interaction matrix has an important advantage over the correlation matrix: all mediated correlations are eliminated in it. Therefore, if a high correlation between indices A and B is due to their high correlation with a third index C, this effect is eliminated in the interaction matrix.

This concept is related to the partial correlation matrix of multivariate Gaussian noise18 and to the maximum entropy network in inverse statistical physics1. Based on this fact, the elements of the interaction matrix are independent of each other, and one can remove them during the attack process (percolation model).
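
The elimination of mediated correlations can be illustrated with the textbook first-order partial correlation of three variables. The sketch below is not the graphical LASSO estimator used in the text; it only shows, under the assumption of three jointly Gaussian indices A, B and C with hypothetical correlation values, how a sizeable A-B correlation that is entirely mediated by C vanishes once C is controlled for.

```python
import math


def partial_correlation(r_ab, r_ac, r_bc):
    """First-order partial correlation of A and B given C."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac**2) * (1 - r_bc**2))


# Hypothetical case: A and B are each correlated with C at 0.8, and their
# mutual correlation is exactly the product of the two, i.e. fully mediated.
r_ac = 0.8
r_bc = 0.8
r_ab = r_ac * r_bc
mediated = partial_correlation(r_ab, r_ac, r_bc)

# Hypothetical case with an additional direct A-B link on top of the mediation.
direct = partial_correlation(0.9, r_ac, r_bc)
print(mediated, direct)
```

In the first case the partial correlation is zero even though the plain correlation between A and B is 0.64, whereas in the second case a clearly non-zero value survives. This is the sense in which an interaction matrix retains only direct couplings.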

Percolation model analysis

For each interaction matrix \(J_{\tau }(t)\) at time t, we have a corresponding network \(G(t)=(n,E)\) with weighted links. In order to establish a connection with the ordinary percolation problem on networks, we apply a threshold θ to the link weights to transform the network into an unweighted one, \(G_{\theta }(t)=(n,E_{\theta })\). That is, we remove the links whose weights are smaller than the threshold θ and keep the other links by setting their weights equal to 1 (refs 20,21), i.e.,

$$\text{for each}\ e_{ij}\in E_{\theta },\quad e_{ij}=\begin{cases}0, & j_{ij} < \theta \\ 1, & j_{ij}\ge \theta .\end{cases}$$
(5)

Figure 2 illustrates this procedure on a sample network for four different threshold levels. It demonstrates how the giant component is born as the threshold level decreases. For a given graph \(G_{\theta }\) we define the percolation strength \(P_{\infty }(\theta )\) as the probability that a randomly chosen node belongs to the largest connected component of the graph. According to percolation theory, the network may undergo a geometrical phase transition during which the size of the giant connected component jumps to a size of order \(\mathscr{O}(n)\) 9,10. To make the comparison with the \(ER_{n}\) random networks more direct, we replace our percolation parameter θ with the corresponding mean degree \(\bar{k}(G_{\theta })\) of the graph. For the \(ER_{n}\) graphs, the critical point is known to be \(\bar{k}_{c}=1\) 9,10.
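
The thresholding of Eq. (5) and the measurement of \(P_{\infty }\) and \(\bar{k}\) can be sketched as below. The helper names and the 5-node matrix are hypothetical, and the largest component is found with a simple union-find structure, which is one of several possible choices and not necessarily the one used by the authors.

```python
def threshold_edges(j, theta):
    """Edges (a, b), a < b, kept by Eq. (5): those with j[a][b] >= theta."""
    n = len(j)
    return [(a, b) for a in range(n) for b in range(a + 1, n) if j[a][b] >= theta]


def percolation_strength(n, edges):
    """Fraction of the n nodes that lie in the largest connected component."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
    sizes = {}
    for v in range(n):
        root = find(v)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values()) / n


def mean_degree(n, edges):
    """Mean degree of an undirected graph with n nodes."""
    return 2 * len(edges) / n


# Hypothetical symmetric interaction matrix for five stocks.
j = [
    [0.0, 0.9, 0.4, 0.0, 0.0],
    [0.9, 0.0, 0.6, 0.0, 0.0],
    [0.4, 0.6, 0.0, 0.2, 0.0],
    [0.0, 0.0, 0.2, 0.0, 0.1],
    [0.0, 0.0, 0.0, 0.1, 0.0],
]
for theta in (0.8, 0.5, 0.15, 0.05):
    edges = threshold_edges(j, theta)
    print(theta, mean_degree(len(j), edges), percolation_strength(len(j), edges))
```

Lowering θ adds links, raises the mean degree, and lets the largest component grow until it spans the whole toy network, mirroring the birth of the giant component described above.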

Figure 2

A sample interaction network with four different threshold levels on links’ weight. The birth of the giant connected component can be seen by decreasing the threshold level.

The first quantity of interest is \(P_{\infty }(\bar{k})\) for a given interaction network \(G(t)\). In order to investigate the self-similarity of the network at different length scales, we also consider several sub-graphs \(g(t,s)\) of size \(s\le n\), chosen randomly from the original network \(G(t)\). We then measure \(P_{\infty }(\bar{k})\) at each size scale, averaged over about 100 independent realizations for each size s. In Fig. 3, we present our measurements of \(P_{\infty }(\bar{k})\) for two time periods, far from (Fig. 3a) and close to (Fig. 3c) the financial crisis (the Supplementary presents the same quantity for various time periods). As Fig. 3a and c show, \(P_{\infty }(\bar{k})\) behaves differently in these two periods. Our further investigation shows that this difference originates from a difference in the structure of the interaction network. To demonstrate this, we shuffle the networks and repeat our analysis. Since a link’s weight comes from the correlation of agents’ actions on two stocks, shuffling links is equivalent to a situation in which agents buy or sell randomly. We find that away from the crises the shuffled network is very similar to the original network (Fig. 3b), as expected for \(ER_{n}\) random networks, while close to the crises the original network deviates substantially from the shuffled one (Fig. 3d).
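
A minimal sketch of the two randomization steps mentioned above, sub-graph sampling and link shuffling, is given below. The text does not specify the shuffling procedure in detail, so the code assumes the simplest reading: the off-diagonal weights are randomly reassigned to node pairs while the matrix stays symmetric. The function names and the toy matrix are hypothetical.

```python
import random


def random_subgraph(j, s, rng):
    """Interaction matrix restricted to s nodes drawn uniformly at random."""
    nodes = sorted(rng.sample(range(len(j)), s))
    return [[j[a][b] for b in nodes] for a in nodes]


def shuffle_weights(j, rng):
    """Symmetric copy of j with off-diagonal weights reassigned to random pairs."""
    n = len(j)
    pairs = [(a, b) for a in range(n) for b in range(a + 1, n)]
    weights = [j[a][b] for a, b in pairs]
    rng.shuffle(weights)
    shuffled = [[0.0] * n for _ in range(n)]
    for a in range(n):
        shuffled[a][a] = j[a][a]  # diagonal is left untouched
    for (a, b), w in zip(pairs, weights):
        shuffled[a][b] = shuffled[b][a] = w
    return shuffled


# Hypothetical symmetric interaction matrix for four stocks.
j = [
    [0.0, 0.9, 0.4, 0.1],
    [0.9, 0.0, 0.6, 0.0],
    [0.4, 0.6, 0.0, 0.2],
    [0.1, 0.0, 0.2, 0.0],
]
rng = random.Random(0)  # fixed seed so the sketch is reproducible
sub = random_subgraph(j, 3, rng)
shuffled = shuffle_weights(j, rng)
```

The shuffle preserves the distribution of link weights and destroys only their placement, so any difference between the original and shuffled curves can be attributed to network structure rather than to the overall strength of the couplings.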

Figure 3

The giant component probability \({P}_{\infty }(\bar{k})\), as a function of the mean degree \(\bar{k}\). (a) An interaction network far from crises and (b) its shuffled network (2000 Jan-May). (c) A network close to a crisis and (d) its shuffled network (2008 Jul-Nov).

The other quantity of interest is the mean cluster size (or susceptibility) \(\chi (\bar{k})\) defined as22

$$\chi (\bar{k})=\frac{\langle P_{\infty }^{i}(\bar{k})^{2}\rangle _{i}}{\langle P_{\infty }^{i}(\bar{k})\rangle _{i}},$$
(6)

where \(\langle \ldots \rangle _{i}\) denotes averaging over the independent realizations (about 100 for each size s < n) and \(P_{\infty }^{i}\) is the percolation strength computed for the i-th realization. Figure 4 shows our computations of \(\chi (\bar{k})\) for the same two time periods as in Fig. 3, far from (Fig. 4a) and close to (Fig. 4c) the financial crisis (the Supplementary presents the same quantity for various time periods). Figure 4a indicates that far from the crisis, the curves \(\chi (\bar{k})\) for all size scales peak near a single size-independent critical point \(\bar{k}_{c}\approx 1\), very similar to the shuffled variant shown in Fig. 4b. Close to the crisis, in contrast, \(\chi (\bar{k})\) behaves differently at different size scales, with an observable shift from the critical point (Fig. 4c). It also differs from its shuffled version, as is evident in Fig. 4d.
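
Equation (6) reduces to a ratio of two sample averages, as in the short sketch below. The function name `susceptibility` and the sample values are hypothetical; in practice the list would hold the percolation strengths \(P_{\infty }^{i}(\bar{k})\) of the independent realizations at one value of \(\bar{k}\).

```python
def susceptibility(p_values):
    """chi = <P^2>_i / <P>_i over independent realizations, following Eq. (6)."""
    mean_p = sum(p_values) / len(p_values)
    mean_p2 = sum(p * p for p in p_values) / len(p_values)
    return mean_p2 / mean_p


# Hypothetical percolation strengths from three realizations at one mean degree.
samples = [0.2, 0.4, 0.6]
chi = susceptibility(samples)
print(chi)
```

Since \(\langle P^{2}\rangle /\langle P\rangle =\langle P\rangle +\mathrm{Var}(P)/\langle P\rangle \), the quantity equals the mean percolation strength when all realizations agree and grows with the realization-to-realization fluctuations.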

Figure 4

The mean cluster size \(\chi (\bar{k})\) as a function of the mean degree \(\bar{k}\). (a) An interaction network far from crises and (b) its shuffled network (2000 Jan-May). (c) A network close to a crisis and (d) its shuffled network (2008 Jul-Nov).

To quantify the deviation from critical behavior near the crisis, we now measure the difference between the strength of the giant component and its prediction from random network theory9,10, according to which the giant component probability \(P_{\infty }^{\mathrm{theo.}}(\bar{k})\) can be computed by solving the following self-consistent equation (see the Supplementary for more details)

$$P_{\infty }^{\mathrm{theo.}}(\bar{k})=1-e^{-\bar{k}P_{\infty }^{\mathrm{theo.}}(\bar{k})}.$$
(7)

Next, we measure the mean absolute difference \(d(t,s)\) between the theory and our numerical computations as

$$d(t,s)=\left\langle \left|\langle P_{\infty }(\bar{k})\rangle _{i}-P_{\infty }^{\mathrm{theo.}}(\bar{k})\right|\right\rangle _{\bar{k}}.$$
(8)
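
Equations (7) and (8) can be evaluated numerically along the following lines. The sketch solves the self-consistent equation by fixed-point iteration started from \(P=1\), which is one simple option among several, and then averages the absolute deviation over a set of mean degrees. The function names, tolerances, and measured values are hypothetical.

```python
import math


def giant_component_theory(k_mean, tol=1e-12, max_iter=100_000):
    """Solve P = 1 - exp(-k P) (Eq. 7) by fixed-point iteration from P = 1."""
    p = 1.0
    for _ in range(max_iter):
        p_next = 1.0 - math.exp(-k_mean * p)
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    return p  # convergence is slow very close to k_mean = 1


def mean_abs_deviation(k_values, p_measured):
    """d = < | <P>_i - P_theo | >_k over the sampled mean degrees (Eq. 8)."""
    diffs = [abs(p - giant_component_theory(k)) for k, p in zip(k_values, p_measured)]
    return sum(diffs) / len(diffs)


# Hypothetical realization-averaged measurements at four mean degrees.
k_values = [0.5, 1.5, 2.0, 3.0]
p_measured = [0.05, 0.55, 0.80, 0.95]
d = mean_abs_deviation(k_values, p_measured)
print(giant_component_theory(2.0), d)
```

Starting the iteration from \(P=1\) selects the non-trivial solution when \(\bar{k}>1\) and decays to zero when \(\bar{k}<1\), consistent with the critical point \(\bar{k}_{c}=1\) quoted above.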

We plot \(d(t,s)\) for different sub-graph sizes s in Fig. 5 for both the original and shuffled networks. Close to the crisis periods (highlighted in Fig. 5 by light blue bars), \(d(t,s)\) increases. This shows that the critical behavior of the interacting financial network (which was shown to behave like a random network) disappears close to a crisis.

Figure 5

The mean absolute difference between theory and our computations of the giant component probabilities (top panel) for time window τ = 90 and different sub-graph sizes (symbols of different colors). Larger sub-graph sizes exhibit larger fluctuations in time. The thin solid curves in different colors show the same quantity for the corresponding shuffled network of each size, which fluctuates comparatively less around a mean value. The bottom panel shows the S&P 500 index (blue curve) and its increments (red curve). The vertical bars in light blue mark four major crisis periods: (i) Stock market downturn of 2002, (ii) Financial crisis of 2007–08, (iii) 2010 Flash Crash and August 2011 stock markets fall, and (iv) 2015–16 stock market sell off.

Conclusion

The efficiency of financial markets is hotly debated in financial economics. Some studies support the hypothesis of market efficiency (see, for example, refs 23,24). According to this hypothesis, stock prices fluctuate mostly like uncorrelated random variables. In other words, it is impossible to extract information from the past price history of a stock to predict its future and make a profit.

Despite early observations that supported the efficiency hypothesis, some later works challenged it and revealed deviations from it. Although it was hard to find correlations in the time series of an individual stock, noticeable information could be captured when other parameters, such as cross-correlations or the earnings-to-price ratio, were taken into account (see, for example, refs 17,25,26 and references therein).

The methods studied so far have mostly focused on individual indices or their cross-correlations, and an analysis based on aggregate behavior is still lacking. In the present work, we studied the global behavior of indices as an example of an interaction network using the concepts of percolation theory. We find that away from financial crises the interaction network behaves like a random network of Erdős and Rényi15, which similarly exhibits scale invariance and self-similarity at the critical point of a continuous phase transition. When the financial market approaches a crisis, the interaction network deviates from the critical random network and loses its scale invariance, i.e., the system behaves differently at different size scales. The deviations are summarized in Fig. 5, in which our data signal the major market crashes, namely the “Stock market downturn of 2002”, “Financial crisis of 2007–08”, “2010 Flash Crash”, “August 2011 stock markets fall”, and “2015–16 stock market sell off”, through a noticeable growth of the difference with respect to random networks.

During financial crises, usually because of fear spreading in the market, stocks move together and the correlation among them grows. This raises the natural question of whether our main result is just another manifestation of the previous observations concerning the growth of correlation between the indices. To address this, we compute the largest eigenvalue of the correlation matrix27,28,29. As shown in Fig. 6, in accordance with the previous observations, the largest eigenvalue grows significantly over the crises. However, the largest eigenvalues of the “shuffled” correlation matrix behave just as those of the original (unshuffled) matrix, i.e., in both cases the largest eigenvalues grow significantly over the crises (see Fig. 6). This means that the largest eigenvalues of the correlation matrix do not necessarily carry information about the structure of the interaction network. In our analysis, by contrast, the interaction network and its shuffled counterpart behave in two entirely different ways, which provides a systematic means of structural differentiation. We therefore conclude that the off-criticality over the crises is not a simple consequence of the growth in correlations.
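
For completeness, the largest eigenvalue of a symmetric positive semi-definite matrix such as \(C_{\tau }(t)\) can be estimated by power iteration, as sketched below. This is a generic textbook method written in plain Python and is not claimed to be the solver used for Fig. 6; the function name, the random starting vector, and the small test matrix are assumptions made for illustration.

```python
import math
import random


def largest_eigenvalue(c, n_iter=1000, tol=1e-12, seed=0):
    """Largest eigenvalue of a symmetric PSD matrix via power iteration."""
    n = len(c)
    rng = random.Random(seed)
    # A random start is almost surely not orthogonal to the leading eigenvector.
    v = [rng.random() for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    lam = 0.0
    for _ in range(n_iter):
        w = [sum(c[a][b] * v[b] for b in range(n)) for a in range(n)]
        new_lam = sum(x * y for x, y in zip(v, w))  # Rayleigh quotient, |v| = 1
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in w]
        if abs(new_lam - lam) < tol:
            return new_lam
        lam = new_lam
    return lam


# Hypothetical 2 x 2 matrix with eigenvalues 1 and 3.
c = [[2.0, 1.0], [1.0, 2.0]]
print(largest_eigenvalue(c))
```

Because the method only needs matrix-vector products, it applies unchanged to both the original and the shuffled correlation matrices.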

Figure 6

The largest eigenvalues of the original correlation matrix (open circles) and of its shuffled counterpart (open squares) for a window length of 90 working days. Despite the obvious structural difference, the two follow each other and signal the crises similarly.

It should be noted that our observation close to the crises does not contradict the efficient market hypothesis, since it is not yet clear whether it can help one to make a profit. Whether one can profit from such structure is left as an open question for future work.

Extraction of the interaction networks of real systems is a rapidly growing field. Besides the static features of these networks, they can be used to deduce some dynamic features of the system. In this work, we established a correspondence between critical phenomena and the external macroscopic state of a system. This approach could be generalized to other fields where maximum entropy networks are used, such as gene regulatory networks, neural networks, and protein interactions.

Data Availability

The datasets analysed during the current study are available from Yahoo Finance, http://finance.yahoo.com.