Exact results of the limited penetrable horizontal visibility graph associated to random time series and its application

Wang, Minggang; Vilela, André L. M.; Du, Ruijin; Zhao, Longfeng; Dong, Gaogao; Tian, Lixin; Stanley, H. Eugene

doi:10.1038/s41598-018-23388-1


Article
Open access
Published: 23 March 2018


Minggang Wang^1,2,3,
André L. M. Vilela^3,4,
Ruijin Du^3,5,
Longfeng Zhao³,
Gaogao Dong^3,5,
Lixin Tian^1,5 &
…
H. Eugene Stanley³

Scientific Reports volume 8, Article number: 5130 (2018)

Abstract

The limited penetrable horizontal visibility algorithm is an analysis tool that maps time series into complex networks and is a further development of the horizontal visibility algorithm. This paper presents exact results on the topological properties of the limited penetrable horizontal visibility graph associated with independent and identically distributed (i.i.d.) random series. We show that an i.i.d. random series maps onto a limited penetrable horizontal visibility graph with an exponential degree distribution, independent of the probability distribution from which the series was generated. We deduce the exact expressions of the mean degree and clustering coefficient, demonstrate the long-distance visibility property of the graph, and perform numerical simulations to test the accuracy of our theoretical results. We then apply the algorithm to several deterministic chaotic series, namely the logistic map, the Hénon map, the Lorenz system and an energy price chaotic system, and to a real crude oil price series. Our results show that the limited penetrable horizontal visibility algorithm is effective at discriminating chaos from uncorrelated randomness and is able to measure the global evolution characteristics of real time series.


Introduction

Several methodologies for understanding the complicated behavior of nonlinear time series have been recently developed, including chaos analysis^1,2, fractal analysis^3,4, and complexity measurement^5,6. With the development of complex network theories^7,8,9,10, a new multidisciplinary methodology for characterizing nonlinear time series using complex network science has emerged and rapidly expanded^{11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26}. The core idea of these methods is to use an algorithm to transform a nonlinear time series into a corresponding complex network and then to use the topological structure of that network to analyze the properties of the time series.

Currently there are several ways of converting univariate time series into complex networks. The first is the pseudo-periodic time series transition method¹¹, which analyzes pseudo-periodic time series. The second is the visibility graph (VG) method, which was first proposed by Lacasa et al.^12,13 and simplified by Luque et al.^14,15. Bezsudnov et al.¹⁶ proposed a parameter visibility method, and Gao et al.¹⁷ proposed a limited penetrable visibility method (LPVG) and a multiscale limited penetrable horizontal visibility graph (MLPHVG). The third is the phase space reconstruction method^18,19, which begins with a phase space reconstruction of the time series, maps fixed-length time series segments into nodes of a network, and then uses the correlation coefficients (or distances) between these nodes to determine whether they are connected. The fourth is the recurrence network method^20,21, which uses the concept of recurrences in phase space: the recurrence matrix of a time series is interpreted as the adjacency matrix of an associated complex network, whose links connect different points in time if the corresponding states are close neighbors in phase space. The last is the coarse-graining method^22,23,24,25, in which fluctuations of the time series are transformed into signal sequences. Each fixed-length signal sequence is treated as a network node, nodes are connected in chronological order, and a weighted and directed network is thus constructed. In recent years, researchers have also used complex network theories to study multivariate time series^26,27,28,29. These methods all effectively preserve most of the properties of different types of time series, and they have been successfully used in many different fields^{30,31,32,33,34,35,36,37}.

Although abundant empirical results have been obtained using time series complex network algorithms^{11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29}, rigorous theoretical results are still lacking. Only a small amount of literature has presented exact results on the properties of the horizontal visibility graphs (HVG) associated with random series^12,13,14,15. Thus far, no rigorous theory has been developed for the other algorithms. Our goal here is to focus on one type of general horizontal visibility algorithm, the limited penetrable horizontal visibility graph (LPHVG). The parameter ρ of the LPHVG is called the limited penetrable distance. When we set ρ = 0, the LPHVG degenerates into the horizontal visibility graph (HVG)¹⁴; thus the LPHVG is an extended form of the HVG. We derive exact results on the properties of the limited penetrable horizontal visibility graphs associated with independent and identically distributed (i.i.d.) random series. We prove that an i.i.d. random series maps onto a limited penetrable horizontal visibility graph with an exponential degree distribution, which extends the result presented by Luque et al.¹⁴. We deduce the exact mean degree and the clustering coefficient, and we prove that the limited penetrable horizontal visibility graph associated with any i.i.d. random series has small-world characteristics. To verify our theoretical results, we perform numerical simulations, and we apply the method to several deterministic chaotic series and a real crude oil price series; the outcomes confirm the accuracy and usability of our results.

Results

Here we present several exact results for the LPHVG associated with i.i.d. random time series and apply them to deterministic chaotic series generated by the logistic map, the Hénon map, the Lorenz system and an energy price chaotic system, as well as to a crude oil price series.

Degree distribution

Let X(t) be a real valued bi-infinite time series of i.i.d. random variables with a probability density f(x) in which x ∈ [a, b], and consider its associated LPHVG with the limited penetrable distance ρ = 1. Then

$$P(k)\sim \exp[-(k-4)\ln(5/4)]\quad \mathrm{with}\quad k=4,5,\ldots ,\ \forall f(x),$$

(1)

where P(k) is the degree distribution and k is the degree of a node. To prove this result, we first calculate the probability that an arbitrary datum with value x₀ has limited penetrable horizontal visibility (with at most one penetration) of exactly k other data. We list all possible configurations for a datum x₀ with k = 4, 5 and 6 (Figs S1, S2 and S3 in Supplementary Information) and calculate the probabilities P(k = 4), P(k = 5) and P(k = 6) [Eqs (S4), (S9) and (S10), respectively]. We then deduce the rules that determine when a given configuration contributes to P(k) (rules i–iv in Supplementary Information) and obtain a general expression for P(k) [see Eq. (S12)]. The detailed proof is given in Theorem S1 of the Supplementary Information. This is an exact result for a limited penetrable horizontal visibility graph with limited penetrable distance ρ = 1: for every probability distribution f(x), the degree distribution P(k) of the associated LPHVG has the same exponential form. From this result we can also obtain a more general one (Theorem S2 in Supplementary Information). Let X(t) be a real-valued bi-infinite time series of i.i.d. random variables with probability density f(x), x ∈ [a, b], and consider its associated LPHVG with limited penetrable distance ρ. Then

$$P(k)\sim \exp\{-(k-2\rho -2)\ln[(2\rho +3)/(2\rho +2)]\},\quad \rho =0,1,2,\ldots ,\quad \mathrm{with}\quad k=2\rho +2,\,2\rho +3,\ldots ,\ \forall f(x).$$

(2)

Note that when ρ = 0, then P(k) ~ exp[−(k − 2)ln(3/2)]¹⁴. For this case the LPHVG becomes the HVG (Methods Section). When ρ = 1, the result is Eq. (1). Therefore Eq. (2) is an extension of the previous result¹⁴ that indicates that the degree distribution P(k) of LPHVG associated with i.i.d. random time series has a unified exponential form.

To further check the accuracy of our analytical results, we perform several numerical simulations. We generate random series of 3000 data points from uniform, Gaussian, and power-law distributions and construct their associated limited penetrable horizontal visibility graphs. Figure 1(a) and (b) show the degree distributions of the resulting graphs for penetrable distances ρ = 1 and ρ = 2. Circles indicate the series drawn from a uniform distribution, and squares and diamonds indicate the series drawn from Gaussian and power-law distributions, respectively. The solid lines indicate the theoretical results of Eq. (2). We find that the theoretical results agree with the numerics. Note that our theoretical results require an infinitely long time series, i.e., series size N → ∞, so we attribute the deviation of the tail of the degree distribution from the theoretical result solely to finite-size effects. To check these effects, we define the relative error E(k) and the mean relative error ME, which measure the agreement between the finite-size numerical result and the theoretical result. We also use a cutoff value k₀ to denote the onset of finite-size effects. Thus, we write

$$E(k)=\frac{|{P}_{num}(k)-{P}_{the}(k)|}{{P}_{the}(k)}\quad {\rm{and}}\quad ME=\sum _{k}E(k),$$

(3)

where P_num(k) and P_the(k) denote the numerical and theoretical degree distributions, respectively. We generate random series from a uniform distribution with different series sizes N, using 10 realizations for each size. Figure 1(c) shows the test results for the resulting graphs with penetrable distance ρ = 1, and Fig. 1(d) those for ρ = 2. The subplots in Fig. 1(c) and (d) show the relation between the mean relative error ME and the series size N, and the relation between the cutoff value k₀ and the series size N. We find that as the series size N increases, the cutoff value k₀ increases and the mean relative error ME decreases, which agrees with our assertion above.
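As a rough illustration of Eq. (3), the following Python sketch computes E(k) and ME from a list of node degrees. It assumes the normalized form of Eq. (2) that appears as the summand of Eq. (4), namely P(k) = [1/(2ρ + 3)][(2ρ + 2)/(2ρ + 3)]^{k − 2(ρ + 1)}. The function names, the toy degree list, and the choice to skip observed degrees below 2ρ + 2 are illustrative assumptions, not part of the original analysis.

```python
from collections import Counter


def p_theory(k, rho):
    """Normalized exponential degree distribution of Eq. (2), as used in Eq. (4)."""
    q = (2 * rho + 2) / (2 * rho + 3)
    return (1 - q) * q ** (k - 2 * (rho + 1))


def relative_errors(degrees, rho):
    """Return {k: E(k)} of Eq. (3) for every observed degree k >= 2*rho + 2."""
    n = len(degrees)
    counts = Counter(degrees)
    errors = {}
    for k in sorted(counts):
        if k < 2 * (rho + 1):
            # Assumption: degrees below the support of Eq. (2) (boundary
            # nodes of a finite series) are left out of the comparison.
            continue
        p_num = counts[k] / n
        errors[k] = abs(p_num - p_theory(k, rho)) / p_theory(k, rho)
    return errors


def mean_relative_error(degrees, rho):
    """ME of Eq. (3): the sum of E(k) over the observed degrees."""
    return sum(relative_errors(degrees, rho).values())


# Toy degree sequence (hypothetical values, for illustration only).
toy_degrees = [4, 4, 5, 5, 5, 6, 7, 8, 4, 6]
print(relative_errors(toy_degrees, rho=1))
print(mean_relative_error(toy_degrees, rho=1))
```

In practice the degree list would come from an LPHVG built on a simulated series, and the computation would be repeated over several realizations for each series size N.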

Mean degree

Using Eq. (2) we calculate the mean degree 〈k〉 of the LPHVG associated with an uncorrelated random series,

$$\langle k\rangle =\sum _{k}kP(k)=\sum _{k=2(\rho +1)}^{\infty }\frac{k}{2\rho +3}{\left(\frac{2\rho +2}{2\rho +3}\right)}^{k-2(\rho +1)}=4(\rho +1).$$

(4)
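The closed form in Eq. (4) follows from the standard geometric-series identities. As a sketch, write q = (2ρ + 2)/(2ρ + 3) and m = k − 2(ρ + 1) (shorthand introduced here only for this derivation), so that 1 − q = 1/(2ρ + 3) and q/(1 − q) = 2ρ + 2. Then

$$\langle k\rangle =(1-q)\sum _{m=0}^{\infty }[m+2(\rho +1)]{q}^{m}=(1-q)\left[\frac{q}{{(1-q)}^{2}}+\frac{2(\rho +1)}{1-q}\right]=\frac{q}{1-q}+2(\rho +1)=(2\rho +2)+2(\rho +1)=4(\rho +1).$$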

Next, we deduce a more general expression for the mean degree 〈k(T)〉, which depends on the period T of the time series. We consider an infinite periodic series of period T denoted by X_t = {..., x₀, x₁, x₂, ..., x_T, x₁, x₂, ...}, where x₀ = x_T and no value is repeated within a period. Let ρ ≪ T and consider the subseries ${\tilde{X}}_{t}=\{{x}_{0},{x}_{1},{x}_{2},\ldots ,{x}_{T}\}$. Without loss of generality, we assume that x₀ = x_T is the largest value of the subseries and that x₁, ..., x_ρ, x_{T−ρ}, ..., x_{T−1} are the next 2ρ largest values (the 2nd through (2ρ + 1)th largest). We can then construct the LPHVG associated with the subseries ${\tilde{X}}_{t}$. If the LPHVG has E links and x_i is the smallest datum of ${\tilde{X}}_{t}$, then by construction the degree of x_i is 2(ρ + 1). Figure S1 illustrates the result for ρ = 1. We delete node x_i and its 2(ρ + 1) links from the LPHVG, and the resulting graph has E − 2(ρ + 1) links and T nodes. We iterate this process T − (2ρ + 1) times; Fig. 2 presents a graphical illustration for ρ = 1 and T = 10. The total number of deleted links is then E_d = 2(ρ + 1)[T − (2ρ + 1)] and the resulting graph has 2(ρ + 1) nodes, i.e., x₀, x₁, ..., x_ρ, x_{T−ρ}, ..., x_{T−1}, x_T, as shown in Fig. 2(h) for ρ = 1 and T = 10. Because these 2(ρ + 1) nodes are connected by ${E}_{r}=\binom{2(\rho +1)}{2}$ links, the mean degree of a limited penetrable horizontal visibility graph associated with X_t is

$$\begin{array}{rcl}\langle k(T)\rangle =2\left(\frac{{E}_{d}+{E}_{r}}{T}\right) & = & \frac{2[2(\rho +1)(T-(2\rho +1))+(\rho +1)(2\rho +1)]}{T}\\ & = & 4(\rho +1)\left(1-\frac{2\rho +1}{2T}\right),\quad \rho \ll T.\end{array}$$

(5)

Note that Eq. (5) holds for every periodic series and, in the limit T → ∞, for every aperiodic series. It is independent of the deterministic process that generates the series, since the only constraint in its derivation is that no value is repeated within a period. One consequence of Eq. (5) is that the mean degree of the LPHVG associated with any time series is at most 〈k(∞)〉 = 4(ρ + 1), a maximum that is attained for aperiodic series and that agrees with Eq. (4).
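As a worked instance of Eq. (5), take the case ρ = 1 and T = 10 illustrated in Fig. 2. Substituting these values into the expressions for E_d and E_r gives

$${E}_{d}=2(\rho +1)[T-(2\rho +1)]=4\times 7=28,\quad {E}_{r}=\binom{4}{2}=6,\quad \langle k(10)\rangle =\frac{2(28+6)}{10}=6.8=8\left(1-\frac{3}{20}\right),$$

which lies below the aperiodic limit 4(ρ + 1) = 8, as expected.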

To check the accuracy of our analytical result, we generate simple period-50, period-100, period-200, and period-250 time series of 1000 data points each, as illustrated in Fig. 3(a). We construct the limited penetrable horizontal visibility graphs associated with these periodic time series for penetrable distances ρ = 0, 1, 2, ..., 10. Figure 3(b) plots the mean degree of the resulting LPHVGs for the different ρ values; the theoretical prediction agrees well with the numerical results when ρ ≪ T.

Local clustering coefficient

In the LPHVG, nodes with the same degree usually have different clustering coefficients, since a given degree can arise from different configurations and structures (see the proof of Theorem S1). Calculating these coefficients for the different configurations (Theorem S3), we find that they follow no regular pattern, but that the minimum and maximum clustering coefficients of nodes in the LPHVG do. Therefore, based on the degree distribution [Eq. (2)], we can deduce the minimum and maximum local clustering coefficients C_min(k) and C_max(k) of the LPHVG associated with i.i.d. random series. Thus, we write

$${C}_{\min }(k)=\frac{2}{k}+\frac{2\rho (k-2)}{k(k-1)},\quad \rho =0,1,2\quad \mathrm{and}\quad k\ge 2(\rho +1).$$

(6)

$${C}_{\max }(k)=\frac{2}{k}+\frac{4\rho (k-3)}{k(k-1)},\quad \rho =0,1,2\quad \mathrm{and}\quad k\ge 2(2\rho +1).$$

(7)

Using Eqs (2, 6) and (7) we also obtain the local clustering coefficient distribution P(C_min) and P(C_max), i.e.,

$$\begin{array}{rcl}P({C}_{\min }) & = & \frac{1}{2\rho +3}\exp \left\{\left[\frac{\phi +\sqrt{{\phi }^{2}-8{C}_{\min }(2\rho +1)}}{2{C}_{\min }}-2(\rho +1)\right]\ln \left(\frac{2\rho +2}{2\rho +3}\right)\right\},\\ \phi & = & {C}_{\min }+2\rho +2,\end{array}$$

(8)

$$\begin{array}{rcl}P({C}_{\max }) & = & \frac{1}{2\rho +3}\exp \left\{\left[\frac{\varphi +\sqrt{{\varphi }^{2}-8{C}_{\max }(6\rho +1)}}{2{C}_{\max }}-2(\rho +1)\right]\ln \left(\frac{2\rho +2}{2\rho +3}\right)\right\},\\ \varphi & = & {C}_{\max }+4\rho +2.\end{array}$$

(9)

For a proof of this result see Theorem S3 in the Supplementary Information. Figure 4 shows the clustering coefficient C(k) and its distribution P(C) for limited penetrable horizontal visibility graphs associated with different random series of 3000 data points. The solid black and red lines in Fig. 4(a) and (b) are the theoretical prediction of C_min(k) and C_max(k), respectively [see Eqs (6) and (7)]. In Fig. 4(c) and (d) we show a similar analysis for P(C_min) and P(C_max) [see Eqs (8) and (9)].

Figure 4 shows that the theoretical predictions of C_min(k), C_max(k), P(C_min), and P(C_max) agree with the numerics.
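To make the link between Eqs (6)–(7) and Eqs (8)–(9) concrete, the Python sketch below evaluates the two bounds and inverts them using the bracketed root expressions in Eqs (8) and (9). Solving Eq. (6) or Eq. (7) for k gives a quadratic in k, and the larger root recovers the degree. The function names are illustrative and not taken from the paper.

```python
import math


def c_min(k, rho):
    """Minimum local clustering coefficient, Eq. (6)."""
    return 2 / k + 2 * rho * (k - 2) / (k * (k - 1))


def c_max(k, rho):
    """Maximum local clustering coefficient, Eq. (7)."""
    return 2 / k + 4 * rho * (k - 3) / (k * (k - 1))


def k_from_c_min(c, rho):
    """Degree k recovered from C_min via the root appearing in Eq. (8)."""
    phi = c + 2 * rho + 2
    return (phi + math.sqrt(phi ** 2 - 8 * c * (2 * rho + 1))) / (2 * c)


def k_from_c_max(c, rho):
    """Degree k recovered from C_max via the root appearing in Eq. (9)."""
    phi = c + 4 * rho + 2
    return (phi + math.sqrt(phi ** 2 - 8 * c * (6 * rho + 1))) / (2 * c)


# Round trip for rho = 1: the recovered degree should match the input k.
for k in range(6, 11):
    print(k, c_min(k, 1), k_from_c_min(c_min(k, 1), 1),
          c_max(k, 1), k_from_c_max(c_max(k, 1), 1))
```

Substituting the recovered k into the normalized degree distribution of Eq. (2) then yields P(C_min) and P(C_max). For ρ = 0 both bounds reduce to 2/k.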

Long distance visibility

Consider a limited penetrable horizontal visibility graph associated with a bi-infinite sequence of i.i.d. random variables drawn from a continuous probability density f(x). The probability P_ρ(n) that two data points separated by n intermediate data points are connected is given by

$${P}_{\rho }(n)=\frac{2\rho (\rho +1)+2}{n(n+1)}.$$

(10)

Note that P_ρ(n) is again independent of the probability distribution of the random variable x. For a detailed proof of this result see Theorem S4 in the Supplementary Information. Figure 5(a), (b) and (c) show the adjacency matrix A of the limited penetrable horizontal visibility graph associated with a random series for different limited penetrable distances. Wherever A(i, j) = 1, we plot a marker at that matrix element: circles for ρ = 0, triangles for ρ = 1, and diamonds for ρ = 3.

As Fig. 5(a), (b) and (c) show, the adjacency matrix has a typical homogeneous structure in which the entries are concentrated around the main diagonal. In addition, the matrix shows a superposed sparse structure caused by the limited penetrable visibility probability P_ρ(n), which introduces shortcuts into the limited penetrable horizontal visibility graph. These shortcuts indicate that the graph presents small-world features. Figure 5(b) shows that the theoretical result in Eq. (10) agrees with the numerics. These results on the topological properties of the LPHVG associated with i.i.d. random series are exact.

Application to deterministic chaotic time series

These results can be used to discriminate between random and chaotic signals. Methods for analyzing a process and identifying its behavior as chaotic or stochastic have received extensive study in recent decades^38,39,40,41. Most previous algorithms have been phenomenological and computationally complicated. Thus, new methods that can reliably distinguish stochastic from chaotic time series are needed. Recently Lacasa et al.^13,14 used the horizontal visibility algorithm to characterize and distinguish between chaotic and stochastic processes. Here we use our new theory to distinguish chaotic series from random series, and we compare the results with those of the horizontal visibility algorithm^13,14.

We address four deterministic time series. The first is generated by the logistic map⁴²,

$${x}_{t+1}=\mu {x}_{t}(1-{x}_{t})\quad \mathrm{with}\quad \mu =4;$$

the second by the Hénon map⁴³,

$${x}_{t+1}=1+{y}_{t}-a{x}_{t}^{2},\quad {y}_{t+1}=b{x}_{t}\quad \mathrm{with}\quad a=1.4\quad \mathrm{and}\quad b=0.3;$$

the third by the Lorenz chaotic system⁴⁴,

$$\dot{x}=a(y-x),\quad \dot{y}=cx-y-xz,\quad \dot{z}=xy-bz\quad \mathrm{with}\quad a=10,\quad b=8/3\quad \mathrm{and}\quad c=28;$$

and the last by the energy price-supply-economic growth system³⁴,

$$\dot{x}={a}_{1}x+{a}_{2}(C-y)+{a}_{3}(z-{K}_{1}),\quad \dot{y}=-{b}_{1}y+{b}_{2}x-{b}_{3}z(1-z/{K}_{2}),\quad \dot{z}={c}_{1}z(1-z/L)+{c}_{2}yz,$$

where a₁ = 0.3, C = 27, a₂ = 0.5563, a₃ = 0.15, b₁ = 0.4, b₂ = 0.6073, b₃ = 0.3, K₁ = 15, K₂ = 15, c₁ = 0.3, c₂ = 0.006 and L = 19.
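For the two discrete maps, a test series can be generated in a few lines. The Python sketch below iterates the logistic and Hénon maps with the parameter values given above. The initial conditions (x₀ = 0.3 for the logistic map and (x₀, y₀) = (0, 0) for the Hénon map) and the function names are our assumptions, since the text does not specify them. The two continuous systems would additionally need a numerical ODE integrator and are omitted here.

```python
def logistic_series(n, x0=0.3, mu=4.0):
    """Iterate x_{t+1} = mu * x_t * (1 - x_t) and return n values."""
    x = x0  # assumed initial condition, not given in the text
    out = []
    for _ in range(n):
        x = mu * x * (1 - x)
        out.append(x)
    return out


def henon_series(n, x0=0.0, y0=0.0, a=1.4, b=0.3):
    """Iterate the Henon map and return the n values of the x component."""
    x, y = x0, y0  # assumed initial condition, not given in the text
    out = []
    for _ in range(n):
        # Both right-hand sides use the values from the previous step.
        x, y = 1 + y - a * x * x, b * x
        out.append(x)
    return out


logistic = logistic_series(3000)
henon = henon_series(3000)
print(logistic[:3])
print(henon[:3])
```

In practice one might discard an initial transient before building the graph; whether the original analysis did so is not stated.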

Figure 6 shows results for the limited penetrable horizontal visibility graphs of 3000 data points extracted from the two chaotic maps and the two chaotic systems. The top panel shows the numerically calculated degree distributions for (a) ρ = 0, (b) ρ = 1 and (c) ρ = 2. The bottom panel shows the relation between the clustering coefficient and the degree of the nodes. In every case P(k) deviates from Eq. (2) and C(k) deviates from Eqs (6) and (7). We also find that the degree distributions of the LPHVGs associated with these chaotic maps and systems can be approximated by the exponential function $P(k)\sim \exp (-\hat{\lambda }k)$, but $\hat{\lambda }\ne \lambda =\ln [(2\rho +3)/(2\rho +2)]$ in each case. We conjecture that there is a functional relation between the random and chaos dimensions^13,14. Thus the parameter λ is the frontier between random series and chaotic series and can be used to distinguish randomness from chaos. The solid black and red lines in the bottom panel of Fig. 6 are the theoretical predictions of C_min(k) and C_max(k), respectively [see Eqs (6) and (7)]. In particular, both lines coincide in Fig. 6(d), because Eqs (6) and (7) both reduce to 2/k when ρ = 0. This result shows that the limited penetrable horizontal visibility graphs for (e) ρ = 1 and (f) ρ = 2 are better discriminators from the perspective of C(k) than the horizontal visibility graph (ρ = 0) shown in Fig. 6(d).

Application to real crude oil future price series

As a further example, we use data from the U.S. Energy Information Administration on the crude oil future contract 1 (dollars per barrel) from April 4, 1983 to March 28, 1985, a series that we find exhibits chaotic behavior and long-range correlations^45,46. We select 500 sample data points and demonstrate that our method can distinguish chaotic series from random series even when the data sample is small (although the theoretical results require infinite data). Figure 7 shows the results of the LPHVG with ρ = 0, i.e., the HVG [see Fig. 7(a)], and of the LPHVG with ρ = 2 [see Fig. 7(b)] for 500 data points extracted from a Gaussian random series and from crude oil futures. Circles indicate the series drawn from a Gaussian distribution and diamonds indicate the series extracted from crude oil futures. The solid green line is the theoretical value of P(k) given by Eq. (2). We find that the degree distributions associated with the Gaussian random series and the crude oil price series both deviate from Eq. (2). As discussed above, the deviation in the tail of the degree distribution associated with the Gaussian random series is caused solely by finite-size effects. However, it is not clear how the degree distribution associated with the crude oil price differs from that of the Gaussian random series. To quantify the deviations in these two cases, we compute the quantity χ²

$$\begin{array}{l}{\chi }^{2}=\sum _{k}\frac{{[{P}_{{\rm{num}}}(k)-{P}_{{\rm{theo}}}(k)]}^{2}}{{P}_{{\rm{theo}}}(k)},\end{array}$$

(11)

where P_num(k) is the degree distribution of the numerical result and P_theo(k) the theoretical result from Eq. (2). Note that χ² changes with the sample size N. Figure 7(c) shows a plot of the numerical values of χ² of LPHVGs with ρ = 0 and ρ = 2 associated with Gaussian random series and crude oil price series for different data sizes. From Fig. 7(c) we can clearly see that the degree distributions associated with crude oil price deviate from Gaussian random series, which means the crude oil future price sequence is not random but chaotic. Furthermore, we compute the absolute distance δ

$$\begin{array}{l}\delta =|{\chi }_{Gauss}^{2}-{\chi }_{oil}^{2}|,\end{array}$$

(12)

where ${\chi }_{Gauss}^{2}$ and ${\chi }_{oil}^{2}$ are the χ² values for the Gaussian random series and the crude oil price series, respectively. A larger δ means a better distinction between the series. Figure 7(d) plots the numerical values of δ for LPHVGs with ρ = 0 and ρ = 2 for different data sizes. Comparing the results of the HVG (ρ = 0) and the LPHVG (ρ = 2), we find that when N ≥ 500 the LPHVG distinguishes the real time series from uncorrelated randomness better than the HVG does. Therefore, in what follows, we select 500 sample data points in each sliding window, i.e., L = 500, to construct the LPHVG for measuring the global evolution characteristics of the crude oil price series.
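Eqs (11) and (12) are straightforward to compute once the degree distributions are available. The Python sketch below is a minimal illustration in which each distribution is a dictionary mapping a degree k to its probability. The function names and the numerical values are hypothetical, and restricting the sum to degrees with P_theo(k) > 0 is our assumption.

```python
def chi_square(p_num, p_theo):
    """Eq. (11): sum over k of [P_num(k) - P_theo(k)]^2 / P_theo(k)."""
    return sum(
        (p_num.get(k, 0.0) - p) ** 2 / p
        for k, p in p_theo.items()
        if p > 0  # assumption: only degrees in the theoretical support count
    )


def delta(chi2_gauss, chi2_oil):
    """Eq. (12): absolute distance between the two chi-square values."""
    return abs(chi2_gauss - chi2_oil)


# Hypothetical two-point distributions, for illustration only.
p_theo = {4: 0.5, 5: 0.5}
p_gauss = {4: 0.6, 5: 0.4}
p_oil = {4: 0.9, 5: 0.1}

chi2_gauss = chi_square(p_gauss, p_theo)
chi2_oil = chi_square(p_oil, p_theo)
print(chi2_gauss, chi2_oil, delta(chi2_gauss, chi2_oil))
```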

Next we use the LPHVG with ρ = 2 to describe the global evolution of crude oil future prices (for the calculation process, see Methods). Our sample data are from the crude oil future contract 1 (in dollars per barrel) from April 4, 1983 to August 15, 2017 [Fig. 8(a)]. As we know, the fluctuations in crude oil future prices differ between time periods^24,47. From a complex network perspective, the whole crude oil price time series can be divided into five fluctuation periods: a stable fluctuation period (April 4, 1983 to February 10, 2004), a sharp rise period (February 11, 2004 to June 30, 2007), a sharp decline period (July 1, 2007 to November 20, 2008), a sharp rise period (November 20, 2008 to June 19, 2013) and a sharp decline period (June 20, 2013 to August 15, 2017). Here, instead of discussing the fluctuation characteristics of each period, we want to show that the fluctuation behavior in different periods can be revealed by the topological structure of the LPHVG. Thus, for simplicity, we combine the sharp rise and sharp decline periods into one sharp fluctuation period and consider only two periods: a stable period from April 4, 1983 to February 10, 2004 and a period of sharp fluctuations from February 11, 2004 to August 15, 2017. Using our calculation method (see Methods), we establish 82 windows, each spanning 100 weeks of the time series (i.e., L = 500), the first of which runs from April 4, 1983 to March 28, 1985. Because each window moves 20 weeks to generate the next window (i.e., l = 100), two adjacent windows overlap by 80 weeks, so that information carries over from one window to the next. Each window contributes 500 nodes to build the local limited penetrable horizontal visibility graph. Figure 8(b) and (c) show the evolution of the adjacency matrix of the LPHVGs associated with a random series extracted from a uniform distribution and with the crude oil price, respectively. Note that the adjacency matrices of the random time series and of the crude oil price time series differ significantly, but the adjacency matrices of each series in different time windows are similar. Figure 8(d–f) show the evolution of the mean degree, mean clustering coefficient and mean path length, respectively. We find that the mean degree, mean clustering coefficient and mean path length of the LPHVG associated with the random series agree with the theoretical values, but those of the LPHVG associated with the crude oil price series do not. The mean degree of the LPHVG associated with the crude oil price series is smaller than the theoretical value, whereas the mean clustering coefficient and mean path length are larger. These quantities also show different trends in different fluctuation periods. All the quantities plotted take larger values in the sharp fluctuation period than in the stable fluctuation period. Figure 8(g) and (h) show the distance distribution for a random series and for the crude oil price series (see Eqs (14–15) in Methods). Comparing Fig. 8(g) and (h), we find that the range of distances for the crude oil price series is larger than that for the random series, which means that the correlation of crude oil price fluctuations presents high complexity. Based on the result of Fig. 8(g) and Eq. (16), we determine the threshold value θ = 2001; then, using Eq. (17), we obtain the correlation index distribution of the crude oil price series. Note that the i.i.d. random time series has neither short-range nor long-range correlations, but the crude oil price time series has both. Thus using the LPHVG enables us to describe the global evolution of a time series.
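The window bookkeeping described here can be sketched in a few lines of Python. The function name is illustrative, the series length N = 8600 is a hypothetical value chosen only because it reproduces the 82 windows quoted above (the actual number of data points is not stated), and we assume that the rounding in the window count of the Methods section is a floor.

```python
def sliding_windows(n, window, step):
    """Return (start, end) index pairs of windows of length `window`
    that advance by `step` points over a series of n points."""
    if step >= window:
        raise ValueError("step length l must be smaller than window length L")
    count = (n - window) // step + 1  # assumed floor rounding
    return [(s, s + window) for s in range(0, count * step, step)]


N = 8600  # hypothetical series length, not taken from the data
windows = sliding_windows(N, window=500, step=100)
print(len(windows), windows[0], windows[1], windows[-1])
```

Each (start, end) pair selects the segment of the series from which one LPHVG is built, and consecutive windows share `window - step` = 400 points, i.e., the 80-week overlap.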

Discussion

We have introduced a generalization of the horizontal visibility algorithm^13,14, named the limited penetrable horizontal visibility algorithm, in which the limited penetrable distance is ρ. We obtain exact results on several properties of the limited penetrable horizontal visibility graph associated with a general uncorrelated random series, and their reliability has been confirmed by numerical simulations. In particular, the degree distribution of the graph has an exponential form, and the expression for the mean degree holds for every periodic or aperiodic series, independent of the deterministic process that generates it. We also find that the clustering coefficient C has a maximum and a minimum value, which depend on the degree k and the distance ρ. We conclude that the probability ${P}_{\rho }(n)=\frac{2\rho (\rho +1)+2}{n(n+1)}$ introduces shortcuts into the limited penetrable horizontal visibility graph, which therefore exhibits a small-world phenomenon. These results are independent of the distribution from which the series was generated: all uncorrelated random series map onto limited penetrable horizontal visibility graphs with the same degree distribution, mean degree, clustering coefficient distribution, and small-world characteristics.

This algorithm can thus be used as a simple test for discriminating uncorrelated randomness from chaos. We show that the method can distinguish between random series, which follow the theoretical predictions, and chaotic series, which deviate from them. In addition, we employ the method to measure the global evolution characteristics of time series by using the LPHVG with ρ = 2, and the results confirm its validity. Note that we set L = 500 and l = 100 as an example to construct the successive LPHVGs in this paper; the sliding window length L and the step length l can be set to other values depending on the needs of the analysis. If the goal is to study the global evolution characteristics over short periods, L and l can be set to smaller values. Conversely, for long periods, L and l can be set to larger values. As L and l increase, the mean degree and mean path length of the corresponding complex networks increase, as shown in Fig. 9(a) and (c), and the mean clustering coefficient decreases [Fig. 9(b)]. The parameters L and l determine the number of sliding windows: for the crude oil price time series, when L = 500 and l = 100 the number of sliding windows is 82, but it is only 28 if L = 500 and l = 300. We observe that increasing L and l hides the diversity of the real time series. On the other hand, for L = 50 and l = 10 the number of sliding windows is more than 800, and the windows contain much noise. Thus, if the values of L and l are too large or too small, the analysis of the global evolution characteristics of the real time series loses its meaning. Once appropriate parameters L and l are determined, the corresponding threshold θ can be calculated using Eqs (14–16), and we observe similar distribution characteristics of the correlation index of the crude oil price time series, as illustrated in Fig. 9(d–i).

The exact results presented here are an extension of previous work¹⁴, and the method can serve as a preliminary test for locating deterministic fingerprints in time series. To better distinguish chaos from uncorrelated randomness, the limited penetrable parameter ρ can be adjusted. If we determine that P(k) has an exponential tail that deviates from the theoretical values, or that C(k) also deviates from the theoretical results [Eqs (2, 6) and (7)], we can then apply embedding methods to analyze the series. Some problems remain open. The main question is how to deduce theoretical results for the LPHVG associated with more complicated time series, such as logistic maps, fractional Brownian motions and Lorenz equations. Another question is how to use this algorithm to measure chaos and reveal the global evolution characteristics of other time series. Topics for further research could therefore include whether this algorithm can also quantify chaos and how it relates to standard chaos indicators such as Lyapunov exponents and the correlation dimension, as well as how to tune the limited penetrable parameter ρ and how to use the limited penetrable horizontal visibility graph to handle two-dimensional manifolds. The topological properties of the visibility graphs (VG) and of the limited penetrable visibility graphs (LPVG), and expanded applications of the LPHVG, might be considered in future work.

Methods

Limited Penetrable Horizontal Visibility Graph (LPHVG)

The limited penetrable visibility graph (LPVG)³⁷ and the multiscale limited penetrable horizontal visibility graph (MLPHVG)¹⁷ are recent extensions of the VG¹² and HVG^13,14 used to analyze nonlinear time series. The limited penetrable horizontal visibility graph (LPHVG) is a geometrically simpler and analytically solvable version of LPVG³⁷ and MLPHVG¹⁷. To define it, let X_t = {x_i}, i = 1, 2, ..., N be a time series of N real numbers. If we set the limited penetrable distance to ρ, LPHVG maps the time series into a graph with N nodes and an adjacency matrix A. Nodes i and j are connected through an undirected edge (A_ij = A_ji = 1) when x_i and x_j have limited penetrable horizontal visibility (see Fig. 10), i.e., when at most ρ of the intermediate data x_q satisfy

$${x}_{q}\ge \inf \{{x}_{i},{x}_{j}\},\quad q\in (i,j).$$

(13)

This mapping defines the limited penetrable horizontal visibility graph (LPHVG). When we set the limited penetrable distance ρ = 0, LPHVG degenerates into HVG¹⁴. When ρ ≠ 0, there are more connections among the nodes in LPHVG than in HVG. Figure 10(b) shows the newly established connections (red lines) when we infer the LPHVG on the basis of HVG with a limited penetrable distance ρ = 1. Note that the limited penetrable horizontal visibility graph of a given time series has all the properties of its horizontal visibility graph, e.g., it is connected and invariant under all affine transformations of the series data^12,14.
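The visibility criterion above can be sketched as a brute-force construction of the adjacency matrix. This is a minimal illustration, not the authors' implementation: the function name `lphvg_adjacency` is hypothetical, and the code assumes that an intermediate point blocks the view when x_q ≥ min(x_i, x_j), with an edge drawn when at most ρ such points lie between i and j.

```python
def lphvg_adjacency(series, rho=0):
    """Adjacency matrix of the LPHVG of a 1-D series (brute-force sketch).

    Nodes i < j are linked when at most `rho` intermediate values
    satisfy x_q >= min(x_i, x_j); rho = 0 gives the ordinary HVG.
    """
    n = len(series)
    adj = [[0] * n for _ in range(n)]
    for i in range(n - 1):
        for j in range(i + 1, n):
            level = min(series[i], series[j])
            blockers = 0
            # Count intermediate points that block horizontal visibility.
            for q in range(i + 1, j):
                if series[q] >= level:
                    blockers += 1
                    if blockers > rho:
                        break  # already too many blockers for this pair
            if blockers <= rho:
                adj[i][j] = adj[j][i] = 1
    return adj


hvg = lphvg_adjacency([3, 1, 2, 4], rho=0)
lphvg = lphvg_adjacency([3, 1, 2, 4], rho=1)
print(sum(map(sum, hvg)) // 2, sum(map(sum, lphvg)) // 2)  # edge counts
```

For the toy series [3, 1, 2, 4], the pair (x_2, x_4) is blocked by x_3 when ρ = 0 but becomes connected when ρ = 1, so the edge count rises from 5 to 6. The triple loop is chosen for clarity over speed.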

Measurement of the Global Evolution Characteristics of Time Series using LPHVG

To characterize the evolution of the time series X_t = {x_i}, i = 1, 2, ..., N using LPHVG, we divide the entire time series into equal small-scale segments (or windows) of length L, and we define l as the step length between consecutive sliding windows. To ensure that the small-scale segments of the time series are continuous, we require that l < L. This yields T = [(N − L)/l + 1] small-scale time windows, where [...] is the rounding function. For every small-scale time window t, we transform the time series into an LPHVG using the limited penetrable horizontal visibility algorithm. Because the topological structure of the LPHVG changes with time t, we write LPHVG(t). To describe this process, we use the Euclidean distance to measure the relationship between LPHVGs. We define the Euclidean distance between LPHVG(t_m) and LPHVG(t_n) to be

$$d[LPHVG({t}_{m}),LPHVG({t}_{n})]=\sqrt{\sum _{i=1}^{L}\sum _{j=1}^{L}{({a}_{ij}^{({t}_{m})}-{a}_{ij}^{({t}_{n})})}^{2}},\quad {\rm{where}}\quad {a}_{ij}^{({t}_{m})}\in {{\rm{A}}}^{({t}_{m})}\quad {\rm{and}}\quad {a}_{ij}^{({t}_{n})}\in {{\rm{A}}}^{({t}_{n})}.$$

(14)
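The windowing step and the distance between two window graphs can be sketched as follows. The function names `sliding_windows` and `graph_distance` are hypothetical, the rounding function is read as a floor, and the distance is taken in the usual Euclidean form, i.e., the square root of the summed squared differences of the adjacency-matrix entries.

```python
import math


def sliding_windows(series, window, step):
    """Split a series into windows of length `window` taken every `step` points."""
    if not 0 < step < window <= len(series):
        raise ValueError("require 0 < step < window <= len(series)")
    count = (len(series) - window) // step + 1  # assumed floor rounding
    return [series[t * step : t * step + window] for t in range(count)]


def graph_distance(adj_m, adj_n):
    """Euclidean distance between two equally sized adjacency matrices."""
    total = 0
    for row_m, row_n in zip(adj_m, adj_n):
        for a, b in zip(row_m, row_n):
            total += (a - b) ** 2
    return math.sqrt(total)


windows = sliding_windows(list(range(10)), window=4, step=2)
print(len(windows), windows[-1])
print(graph_distance([[0, 1], [1, 0]], [[0, 0], [0, 0]]))
```

Each window would then be mapped to its own L × L adjacency matrix before `graph_distance` is applied to a pair of windows. Because every matrix has the same size L, the entries can be compared position by position.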

We then determine the distance matrix

$${D}_{T\times T}={\{{d}_{{t}_{m},{t}_{n}}\}}_{{t}_{m}=1,2,\ldots ,T;\ {t}_{n}=1,2,\ldots ,T},$$

(15)

and define the threshold θ as

$$\theta =\min {\{{d}_{{t}_{m},{t}_{n}}^{{\rm{rand}}}\}}_{{t}_{m}\ne {t}_{n}},\quad {d}_{{t}_{m},{t}_{n}}^{{\rm{rand}}}\in {D}_{T\times T}^{{\rm{rand}}}.$$

(16)

Here ${{D}}_{T\times T}^{{\rm{rand}}}$ is the distance matrix associated with the independent and identically distributed random time series. From Eq. (16), we can see that θ can be obtained from the distance matrix of LPHVGs associated with the i.i.d. time series, which is the critical value to measure the correlation between the data in different time periods. Using the threshold θ, we define the correlation index γ,

$${\gamma }_{{t}_{m},{t}_{n}}=\begin{cases}0, & {d}_{{t}_{m},{t}_{n}}\ge \theta ,\\ 1-{d}_{{t}_{m},{t}_{n}}/\theta , & {d}_{{t}_{m},{t}_{n}} < \theta .\end{cases}$$

(17)
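Equations (16) and (17) can be illustrated with two small helper functions. This is a sketch under stated assumptions: the names `threshold_from_random` and `correlation_index` are hypothetical, distance matrices are plain nested lists, and the toy matrices below are made-up numbers used only to show the arithmetic.

```python
def threshold_from_random(d_rand):
    """Eq. (16): smallest off-diagonal distance among random-series LPHVGs."""
    size = len(d_rand)
    return min(d_rand[m][n] for m in range(size) for n in range(size) if m != n)


def correlation_index(d, theta):
    """Eq. (17): 0 where d >= theta, otherwise 1 - d / theta."""
    return [[0.0 if x >= theta else 1.0 - x / theta for x in row] for row in d]


# Toy distance matrices (illustrative values, not data from the paper).
d_rand = [[0, 4, 5], [4, 0, 6], [5, 6, 0]]
d_real = [[0, 2, 4], [2, 0, 5], [4, 5, 0]]

theta = threshold_from_random(d_rand)
gamma = correlation_index(d_real, theta)
print(theta, gamma)
```

In this toy case θ = 4, so the pair of windows at distance 2 receives a correlation index of 0.5, while pairs at distance 4 or more receive 0.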

Here ${\gamma }_{{t}_{m},{t}_{n}}$ is the correlation degree of LPHVG at time t_m and time t_n, and ${\gamma }_{{t}_{m},{t}_{n}}$ can be visualized using a recursive graph constructed using the formula

$$\Re ({t}_{m},{t}_{n})={\rm{\Theta }}(\theta -d[LPHVG({t}_{m}),LPHVG({t}_{n})]),\quad {\rm{\Theta }}(x)=\begin{cases}1, & x > 0,\\ 0, & x\le 0,\end{cases}$$

(18)

where Θ(x) is the Heaviside function. We use this formula to plot the dependence between LPHVG(t_m) and LPHVG(t_n) in two-dimensional coordinates in which both the abscissa and the ordinate are the time t. In the recursive graph, when the Euclidean distance between LPHVG(t_m) and LPHVG(t_n) is sufficiently small, i.e., when $\Re ({t}_{m},{t}_{n})=1$, we plot a dot at (t_m, t_n) and (t_n, t_m). Note that dots always appear at (t_m, t_m) and (t_n, t_n), i.e., on the main diagonal (Fig. 8). We can use this graph to characterize the global dynamic changes in correlation.
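The thresholding in Eq. (18) can be sketched in a few lines. The function names `recurrence_matrix` and `recurrence_points` are hypothetical, and the distance matrix and threshold below are made-up values chosen only to show which dots would be plotted.

```python
def recurrence_matrix(d, theta):
    """Eq. (18): entry is 1 where theta - d > 0 (Heaviside step), else 0."""
    return [[1 if theta - x > 0 else 0 for x in row] for row in d]


def recurrence_points(r):
    """Coordinates (t_m, t_n) at which a dot is plotted."""
    return [(m, n) for m, row in enumerate(r) for n, v in enumerate(row) if v == 1]


# Toy distance matrix and threshold (illustrative values, not from the paper).
d = [[0, 2, 4], [2, 0, 5], [4, 5, 0]]
r = recurrence_matrix(d, theta=4)
print(r)
print(recurrence_points(r))
```

Because d is symmetric with zeros on the diagonal and θ is positive here, the resulting matrix is symmetric and its main diagonal is filled, as described above. The point list can be handed to any scatter-plot routine.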

Data Availability

All data generated or analysed during this study are included in this published article.

References

1. Daw, C. S. et al. Self-Organization and Chaos in a Fluidized Bed. Phys. Rev. Lett. 75, 2308–2311 (1995).
2. Mutua, S., Gu, C. & Yang, H. Visibility graphlet approach to chaotic time series. Chaos: An Interdisciplinary Journal of Nonlinear Science 26(5), 053107 (2016).
3. Peng, C. K. et al. Mosaic organization of DNA nucleotides. Phys. Rev. E 49, 1685–1689 (1994).
4. Manshour, P. Complex network approach to fractional time series. Chaos: An Interdisciplinary Journal of Nonlinear Science 25(10), 103105 (2015).
5. Podobnik, B. & Stanley, H. E. Detrended cross-correlation analysis: A new method for analyzing two nonstationary time series. Phys. Rev. Lett. 100, 084102 (2008).
6. Marwan, N. et al. Recurrence plots for the analysis of complex systems. Physics Reports 438(5–6), 237–329 (2007).
7. Barabási, A. L. & Albert, R. Emergence of scaling in random networks. Science 286(5439), 509–512 (1999).
8. Watts, D. J. & Strogatz, S. H. Collective dynamics of small-world networks. Nature 393(6684), 440–442 (1998).
9. Newman, M. E. J. & Watts, D. J. Renormalization group analysis of the small-world network model. Physics Letters A 263(4), 341–346 (1999).
10. Erdös, P. & Rényi, A. On the existence of a factor of degree one of a connected random graph. Acta Mathematica Hungarica 17(3–4), 359–368 (1966).
11. Zhang, J. & Small, M. Complex network from pseudoperiodic time series: Topology versus dynamics. Physical Review Letters 96(23), 238701 (2006).
12. Lacasa, L. et al. From time series to complex networks: The visibility graph. Proceedings of the National Academy of Sciences 105(13), 4972–4975 (2008).
13. Lacasa, L. On the degree distribution of horizontal visibility graphs associated with Markov processes and dynamical systems: diagrammatic and variational approaches. Nonlinearity 27(9), 2063 (2014).
14. Luque, B. et al. Horizontal visibility graphs: Exact results for random time series. Physical Review E 80(4), 046103 (2009).
15. Lacasa, L. & Toral, R. Description of stochastic and chaotic series using visibility graphs. Physical Review E 82(3), 036120 (2010).
16. Bezsudnov, I. V. & Snarskii, A. A. From the time series to the complex networks: The parametric natural visibility graph. Physica A: Statistical Mechanics and its Applications 414, 53–60 (2014).
17. Gao, Z. K. et al. Multiscale limited penetrable horizontal visibility graph for analyzing nonlinear time series. Scientific Reports 6, 35622 (2016).
18. Xu, X., Zhang, J. & Small, M. Superfamily phenomena and motifs of networks induced from time series. Proceedings of the National Academy of Sciences 105(50), 19601–19605 (2008).
19. Wang, M. & Tian, L. From time series to complex networks: The phase space coarse graining. Physica A: Statistical Mechanics and its Applications 461, 456–468 (2016).
20. Marwan, N. et al. Complex network approach for recurrence analysis of time series. Physics Letters A 373(46), 4246–4254 (2009).
21. Donner, R. V. et al. Recurrence networks-a novel paradigm for nonlinear time series analysis. New Journal of Physics 12(3), 033025 (2010).
22. Gao, X. et al. Characteristics of the transmission of autoregressive sub-patterns in financial time series. Scientific Reports 4 (2014).
23. Gao, X. et al. Transmission of linear regression patterns between time series: From relationship in time series to complex networks. Physical Review E 90(1), 012818 (2014).
24. Wang, M. et al. Fluctuation behavior analysis of international crude oil and gasoline price based on complex network perspective. Applied Energy 175, 109–127 (2016).
25. Chen, H. et al. Analysis of the Dynamic Evolutionary Behavior of American Heating Oil Spot and Futures Price Fluctuation Networks. Sustainability 9(4), 574 (2017).
26. Gao, Z. K. et al. Multiscale complex network for analyzing experimental multivariate time series. EPL (Europhysics Letters) 109(3), 30005 (2015).
27. Gao, Z. K. et al. Multilayer Network from Multivariate Time Series for Characterizing Nonlinear Flow Behavior. International Journal of Bifurcation and Chaos 27(04), 1750059 (2017).
28. Lacasa, L., Nicosia, V. & Latora, V. Network structure of multivariate time series. Scientific Reports 5 (2015).
29. Wang, M., Tian, L. & Du, R. Research on the interaction patterns among the global crude oil import dependency countries: A complex network approach. Applied Energy 180, 779–791 (2016).
30. Marwan, N. How to avoid potential pitfalls in recurrence plot based data analysis. International Journal of Bifurcation and Chaos 21(04), 1003–1017 (2011).
31. Marwan, N. & Meinke, A. Extended recurrence plot analysis and its application to ERP data. International Journal of Bifurcation and Chaos 14(02), 761–771 (2004).
32. Du, R. et al. A complex network perspective on interrelations and evolution features of international oil trade, 2002–2013. Applied Energy 196, 142–151 (2017).
33. Wang, M. et al. Systemic risk and spatiotemporal dynamics of the consumer market of China. Physica A: Statistical Mechanics and its Applications 473, 188–204 (2017).
34. Wang, M. & Tian, L. Regulating effect of the energy market: Theoretical and empirical analysis based on a novel energy prices–energy supply–economic growth dynamic system. Applied Energy 155, 526–546 (2015).
35. An, H. et al. Research on patterns in the fluctuation of the co-movement between crude oil futures and spot prices: A complex network approach. Applied Energy 136, 1067–1075 (2014).
36. Zhou, C. et al. Characterizing time series of near-miss accidents in metro construction via complex network theory. Safety Science 98, 145–158 (2017).
37. Gao, Z. K. et al. Limited penetrable visibility graph from two-phase flow for investigating flow pattern dynamics (2013).
38. Grassberger, P. & Procaccia, I. Characterization of strange attractors. Physical Review Letters 50(5), 346 (1983).
39. Sugihara, G. et al. Distinguishing error from chaos in ecological time series. Philosophical Transactions of the Royal Society of London B: Biological Sciences 330(1257), 235–251 (1990).
40. Kaplan, D. T. & Glass, L. Direct test for determinism in a time series. Physical Review Letters 68(4), 427 (1992).
41. Rosso, O. A. et al. Distinguishing noise from chaos. Physical Review Letters 99(15), 154102 (2007).
42. Kocarev, L. & Jakimoski, G. Logistic map as a block encryption algorithm. Physics Letters A 289(4), 199–206 (2001).
43. Gallas, J. A. C. Structure of the parameter space of the Henon map. Physical Review Letters 70(18), 2714 (1993).
44. Grigorenko, I. & Grigorenko, E. Chaotic dynamics of the fractional Lorenz system. Physical Review Letters 91(3), 034101 (2003).
45. Panas, E. & Ninni, V. Are oil markets chaotic? A non-linear dynamic analysis. Energy Economics 22(5), 549–568 (2000).
46. Tabak, B. M. & Cajueiro, D. O. Are the crude oil markets becoming weakly efficient over time? A test for time-varying long-range dependence in prices and volatility. Energy Economics 29(1), 28–36 (2007).
47. Gao, X. et al. Detecting method for crude oil price fluctuation mechanism under different periodic time series. Applied Energy 192, 201–212 (2017).

Acknowledgements

This research was supported by the following foundations: the National Natural Science Foundation of China (71503132, 71690242, 91546118, 11731014, 71403105, 61403171), Qing Lan Project of Jiangsu Province (2017), University Natural Science Foundation of Jiangsu Province (14KJA110001), Jiangsu Center for Collaborative Innovation in Geographical Information Resource Development and Application, CNPq, CAPES, FACEPE and UPE.

Author information

Authors and Affiliations

School of Mathematical Science, Nanjing Normal University, Nanjing, 210042, Jiangsu, China
Minggang Wang & Lixin Tian
Department of Mathematics, Nanjing Normal University Taizhou College, Taizhou, 225300, Jiangsu, China
Minggang Wang
Center for Polymer Studies and Department of Physics, Boston University, Boston, MA, 02215, USA
Minggang Wang, André L. M. Vilela, Ruijin Du, Longfeng Zhao, Gaogao Dong & H. Eugene Stanley
Universidade de Pernambuco, 50720-001, Recife, PE, Brazil
André L. M. Vilela
Energy Development and Environmental Protection Strategy Research Center, Jiangsu University, Zhenjiang, 212013, Jiangsu, China
Ruijin Du, Gaogao Dong & Lixin Tian

Contributions

Lixin Tian, Minggang Wang and H. Eugene Stanley designed the structure of the research; Minggang Wang performed the research and proved the theorem; Minggang Wang, André L. M. Vilela and Ruijin Du designed the model; Minggang Wang and Longfeng Zhao analyzed the data; Minggang Wang, André L. M. Vilela and Gaogao Dong wrote the paper. All co-authors of this paper reviewed the manuscript.

Corresponding author

Correspondence to Lixin Tian.

Ethics declarations

Competing Interests

The authors declare no competing interests.

Additional information

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Supplementary Information

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Wang, M., Vilela, A.L.M., Du, R. et al. Exact results of the limited penetrable horizontal visibility graph associated to random time series and its application. Sci Rep 8, 5130 (2018). https://doi.org/10.1038/s41598-018-23388-1

Received: 02 November 2017
Accepted: 12 March 2018
Published: 23 March 2018
DOI: https://doi.org/10.1038/s41598-018-23388-1
