Detecting dynamic spatial correlation patterns with generalized wavelet coherence and non-stationary surrogate data

Time series measured from real-world systems are generally noisy, complex and display statistical properties that evolve continuously over time. Here, we present a method that combines wavelet analysis and non-stationary surrogates to detect short-lived spatial coherent patterns from multivariate time-series. In contrast with standard methods, the surrogate data proposed here are realisations of a non-stationary stochastic process, preserving both the amplitude and time-frequency distributions of original data. We evaluate this framework on synthetic and real-world time series, and we show that it can provide useful insights into the time-resolved structure of spatially extended systems.

Synchronization is a fundamental phenomenon described in many biological and physical contexts for which there are two or more interacting oscillatory systems [1].The interactions between coupled oscillators in real spatially extended systems continuously create and destroy synchronised states, which can be observed as noisy and transient coherent patterns.The statistical detection of spatial synchrony in networks of coupled dynamical systems is therefore of great interest in disciplines such as geophysics, physiology and ecology [2,3].
Statistical significance of transient coherent patterns cannot be assessed by classical spectral measures and tests, which require signals to be stationary [3,4].Synchrony estimators based on nonparametric methods have the advantage of not requiring any assumption on the time-scale structure of the observed signals.Among them, measures of synchrony or coherence based on wavelet transforms have been widely used to detect interactions between oscillatory components in different real systems, i. e. neural oscillations, business cycles, climate variations or epidemics dynamics [2,3].
In recent years, different significance tests for the wavelet cross-spectrum or wavelet coherence have been developed to detect oscillatory patterns with covarying dynamics [3][4][5].Unfortunately, the statistical assumptions of these tests are not always compatible with the structure of the data considered, and significance levels often depend on the structure of the wavelets applied [5].A rigorous theoretical framework cannot therefore be derived, and Monte Carlo simulations have to be performed to estimate the critical values for a chosen significance level [5].
Surrogate data techniques have been proposed as nonparametric resampling methods for testing general hypotheses on data without making assumptions on the underlying generating process [6].However, time series measurements from real systems generally display irregular fluctuations, long-term trends, or a time-varying spectra.Such properties are incompatible with the main assumptions of standard surrogate data based on Fourier transform [7,8].
Recently, parametric models have been also applied to test wider classes of null hypothesis, including nonstationary behaviour [8].Some limitations of these approaches include the relatively large basis dimension needed to obtain good global optimisation, and the monitoring needed to control the instabilities in the estimated model [9].Recent studies have proposed the use of discrete wavelet transforms for resampling time series such that the multiscale structure of original data is preserved [10,11].The main advantage of discrete wavelet decompositions is their ability to concentrate the signal's variance in a limited number of coefficients.Nevertheless, the number of data points heavily influences this decomposition (the number of scales); which may render the scale decompositions difficult to interpret [3].Although continuous wavelets often yield a redundant decomposition (the information extracted from a given scale band slightly overlaps that extracted from neighbouring scales), they are more robust to noise as compared with other decomposition schemes [3,4,12].
In this work, we use a continuous wavelet-based approach to detect spatial coherent patterns in nonstationary multivariate observations.We generalise the wavelet coherence to multivariate time series and we extend the classic phase-randomised surrogate data algorithm to the time-frequency domain for generating nonstationary surrogates.This procedure preserves both the original amplitude and time-frequency energy (spectrogram) distributions.Hence, the ensemble of these non-stationary surrogates are used to assess the significance of transient coherent patterns found in multivariate time series.We analyse the results obtained by the proposed method in different synthetic and real-world non-stationary data, and we show that this approach can substantially improve the detection of time-varying spatial coherence.We start by considering the time-frequency (TF) distributions obtained by correlating a time series x(t) with a scaled and translated version of a chosen mother wavelet w s,τ (t) =1 √ s w( t−τ s ).In the following, we always consider the mother complex Morlet wavelet defined as w(θ) = π −1/4 exp(−θ 2 /2) × exp(iω 0 θ), where the parameter ω 0 controls the time scale resolution of each wavelet To quantify the relationships between two nonstationary signals, x i (t) and x j (t), the wavelet crossspectrum is given by W i,j (t, f ) = W i (t, f )W * j (t, f ), where * denotes the complex conjugate operator and W k (t, f ) is the wavelet transform of signal x k (t).Let us now consider M zero-mean time series x 1 (t), . . ., x M (t), and define the complex coherence spec- for i, j = 1, . . ., M , where • denotes a smoothing operator both in time and frequency [13].
In bivariate data analysis, the wavelet coherence is defined as To extend this idea to the general case of M 2 signals, we can define a matrix Σ(t, f ) at every point in the time-frequency domain containing all the pairwise coherence spectra [14]: The time-varying spatial coherence (TVSC) can be defined by where λ Σ max denotes the largest eigenvalue of the spectral matrix Σ(t, f ) 1  The values of Ψ(t, f ) are bounded between 0 Ψ(t, f ) 1, reaching the maximum when all the M signals are locally -in the time-frequency plane-pairwise correlated (Σ(t, f ) becomes an all-ones matrix with λ Σ max = M ); and the minimum when all signals are completely uncorrelated (Σ(t, f ) = I and λ Σ max = 1).Interestingly, for the case M = 2, Σ(t, f ) is given by In the bivariate case, this therefore reduces the TVSC to the classic definition of the wavelet coherence Ψ In wavelet-based analysis, test statistics are strongly affected by data's structure, the mother wavelet's properties, and by the smoothing applied [5,13].In this work, the statistical properties of Ψ(t, f ) under the null hypothesis H 0 of M uncorrelated processes are determined by Monte Carlo simulation.To do this, we generate a number of surrogate data realisations x j (t), j = 1, . . ., K by repeating the randomisation procedure K times.The statistical significance of Ψ(t, f ) is assessed by quantifying its statistical deviation from values obtained in the ensemble of surrogate data.To correct for multiple testing, the false discovery rate (FDR) method was applied to each matrix of Ψ(t, f ) values [16].With this approach, the threshold of significance was set such that the expected fraction of false positives over the time-frequency plane is restricted to q 0.05.
A surrogate time series x(t) can be obtained by destroying the phase structure of the original signal x(t) in the time-frequency domain.As the Morlet wavelet is a complex function, the resulting wavelet decomposition W x (t, f ) has both real and imaginary parts.We can therefore write W x (t, f ) in terms of its phase The wavelet-based surrogate algorithm first generates a Gaussian white noise time series to match the original data length and it then derives the wavelet transform to extract the phase φ noise (t, f ).We use this randomised phase and the WT modulus of the original signal to obtain a surrogate time-frequency distribution A surrogate time series can be reconstructed by taking the real part of the inverse wavelet transform (in practice, by just summing the real part of W x (t, f ) over all scales/frequencies).Finally, the surrogate x(t) is rescaled to the distribution of the original data by sorting the data (after the wavelet filtering in the frequency band of interest) according to the ranking of the wavelet-based surrogate [6].As its Fourier-based counterpart, our scheme can be iteratively repeated to better adjust the time-frequency distribution of the surrogate.
To illustrate our surrogate data method, we consider an electroencephalographic (EEG) recording from a pediatric subject with intractable seizures 2 .Although our approach is applicable to any neuroimaging functional method (e.g.EEG, fMRI, and MEG) here we use the EEG as this modality of acquisition has the major feature that collective neural behaviors, i.e., synchronization of large and sparsely distributed cortical assemblies, are reflected as time-varying interactions between EEG signals.The file studied here contains 23 EEG signals sampled at 256 Hz according to the 10-20 bipolar montage [17].
The non-stationarity of epileptic EEG signals is clearly illustrated in Fig. 1-A.One can notice that the frequency content of epileptic oscillations may change rapidly across time over a range of frequencies.The time-frequency plot exhibits a short fast oscillatory behavior (15 − 20 Hz) around t = 3 sec followed by slow and large oscillations accompanying the epileptic seizure after t = 6 sec.As depicted in Fig. 1-B, classical stationary surrogate data (here we used the iAAFT algorithm [6]) is not able to 2 The EEG data was obtained from the open repository CHB-MIT Scalp EEG Database (https://www.physionet.org/pn6/chbmit) replicate the non-stationary oscillations embedded in the original signal.In contrast, the time-varying spectrum of original signal is clearly conserved with our algorithm, as illustrated in Fig. 1-C.Distributions in Fig. 1-D confirm that the three surrogate algorithms conserve the amplitude distributions.
Another paradigmatic example of non-stationary spatial synchrony is that observed in population dynamics.Here, we considered the weekly measles notifications in seven large English cities studied in Ref. [18].Measles epidemics generally exhibit a non-stationary dynamics with a regular and highly epidemics before nationwide vaccination schemes, and an irregular and spatially uncorrelated dynamics in the vaccine era.As illustrated in Fig. 2, the data display multiannual cycles that dramatically varies with time, specially after vaccination.This rich behavior can not be encompassed by classical stationary surrogate data (Fig. 2-B).Instead, the waveletbased method perfectly keeps the variations of epidemic periods observed in the original time series (Fig. 2-C).
We then test now the performance of our framework to detect non-stationary spatial coherent components on two synthetic datasets with time-varying coherence structure.In the first benchmark, the spatial system consists of 5 linear oscillators described by the following multivariate autoregressive (AR) model: where t is a discrete time index, i are independent white noise processes with zero means and unit variances, and k i are coupling strengths.Here, we set k 1 = 0 and k 2 = 0.15 for t < 1000; and k 1 = −0.5 and k 2 = 0.4 for t 1000.
Although the measure of time-varying spatial coherence is supposed to capture linear interactions, numerical evidence shows that Ψ(t, f ) still provides a qualitative description in case of nonlinear oscillators.Indeed, we consider a network of i = 1, . . ., 10 coupled non-identical chaotic Rössler oscillators.The equations of motion read where λ(t) is the time-varying coupling strength, ξij are the elements of the coupling matrix (a random graph with an average number of links per node k m = 4); ω i is the natural frequency of the i th oscillator (randomly assigned from a uniform distribution with values between 0.98 ω i 1.1); η i denotes a Gaussian delta correlated noise with η i (t) = 0 and η i (t)η i (t ) = 2Dδ(t−t ), D = 0.01.Coupling strength λ varies with time as follows: λ = 0.5 for 500 < t < 900 and λ = 0.001 elsewhere.Here, all time series are first centered and set to have zero mean and unit variance.Then, the TVSC values are computed.To estimate the distribution of Ψ(t, f ) under H 0 we generate 100 surrogates for each time series.Significant coherent patterns were detected as statistically different from the H 0 hypothesis by a z-test corrected by a FDR at q 0.05.
In Figs.3-(A-C) we report, respectively, the time series generated by the above models, the significant coherent components detected by Ψ(t, f ) in combination with classical surrogate data and with the non-stationary surrogate data.Results reveal that stationary randomizations of time series detect several large spurious syn-chrony patches on the time-frequency plane, e.g. the large patches before t = 1000 for the coupled AR model, or those out of the synchronous region for the coupled Rössler model (500 < t < 900).This is mainly due to the oscillations created over the whole segment by the stationary surrogate algorithm.Conversely, a detection based on our method considerably reduces the number of false coherent patches, while it clearly identifies the main regions with the highest spatial coherence.Remarkably, right plots of Figure 3 shows that the combination of Ψ(t, f ) with non-stationary surrogate data tests, constitutes a good criterion to assess spatial coherence in the case of nonlinear dynamical time series.
To illustrate our approach on real-world time series, we study the two spatial systems described above: the epileptic EEG recordings and the measles data.The situation with EEG data is illustrated in left panels of Figure 4.The first crucial observation is that, as expected in epilepsy dynamics, spatial coherent patterns are not time invariant, but instead they exhibit a rich time-frequency structure during seizure evolution.Results clearly show that classical surrogate data test may yield to the detection of large synchronous regions, specially at high frequency bands (f 20) Hz.In contrast, non-stationary surrogates improves the time-frequency localization of spatial correlation patterns.A first synchronous pattern seem to involve the low-amplitude fast oscillations often observed during the first seconds of epileptic seizures.Interestingly, the absence of significative values of Ψ(t, f ) between t = 4 − 8 s suggest a desynchronization of some cerebral structures during the build-up of epileptic seizures, just before a wide synchronous spreading to the ensemble of the brain at t = 8 s.This fully agrees with previous findings suggesting a neural desynchronization before the propagation of seizures which could facilitate the development of local pathological recruitments [19].
Right panels of Figure 4 show the results for the measles data.We observe from Ψ(t, f ) values that global interactions between major epidemics change relatively smoothly through time.Classic surrogate analysis can capture epidemic's dynamics at different scales, but does not allow a proper description when they change with time.Indeed, standard surrogate data test reveals no significant spatial correlation patterns.Conversely, our approach clearly detects the main changes in spatial correlation structure: a high spatial coherence between the major epidemic (mainly biennial) component of time series is clearly identified in the pre-vaccine era.The interactions between the smaller epidemics with longer periods observed after vaccination are not found to be statistically significant.This is a remarkable result as it supports previous findings that during the pre-vaccination era, measles dynamics is characterized by a high spatial correlation of biennal epidemic patterns, while the vaccination eliminates large epidemics yielding thus a significant spatial decorrelation [18].In conclusion, in this Letter we have addressed a fundamental problem in complex systems: detecting, from scalar observations, the time scales involved in spatial interactions of complex oscillators with time-varying spectral components.Classical surrogate data tests require time-series to be stationary.Nevertheless, data recorded from real-world systems are generally noisy and nonstationary.In order to study their interactions we propose a complementary approach based on wavelet analysis.Wavelet coherence is generalized as a method for detecting transient but significant coherence between multivariate nonlinear signals.The classic surrogate algorithm is also generalized to produce non-stationary surrogates.Several artificial and real non-stationary, linear and nonlinear time series are examined, in order to demonstrate the advantages of our approach.Surrogate data are produced with the classical and with the proposed method, and the results of statistical comparisons are compared.
Other wavelet-based methods have been used to analyse the relationships between multivariate signals [1,20].Nevertheless, standard surrogate data tests assume stationarity of observations, which strongly affects the significance of the detected coherent patterns.Our results provide evidence of the constructive role of nonstationary surrogate data to uncover changes of correlation patterns in multivariate time series.The detection of spatial correlations in non-stationary multivariate timeseries might provide, more in general, meaningful insights into the functional organisation of complex systems.Applied to other multivariate data (e.g., financial or climate time series), our framework could provide new insights into the structure of spatially extended systems.
BC is partially supported by a grant from Agence Na-

5 FIG. 1 .
FIG. 1. Exemple of the time-frequency (TF) structure for different surrogate algorithms applied to epileptic EEG data.A) The original time series after reconstruction by the wavelet filtering, B) surrogate data generated with the iterative Amplitude Adjusted Fourier Transform (iAAFT) algorithm, C) surrogate generated with our algorithm and D) distributions of amplitudes for the different algorithms (the original data and those generated by the iA-FFT are, by construction, identical).The color maps code for |(Wx(t, f ))| values.Wavelet analysis of EEG were done over the frequency range 0 − 128 Hz.For better visualisation spectra are displayed only for f < 30 Hz.

FIG. 2 .
FIG.2.Exemple of the TF structure for different surrogate algorithms applied to the squared root transformed measles data (Liverpool).Same stipulations as in the caption of Fig.1.Gray box in upper plot A indicates the vaccine era.Black maps of time-frequency plots indicate the cone of influence that delimits the regions not influenced by edge effects[15].

FIG. 3 .
FIG. 3. Ψ(t, f ) values estimated from synthetic time series and statistical differences with those values obtained from different surrogate data.A) The original time series (gray boxes delimitate the region where the system is synchronized), B) surrogate data test of the time-varying spatial coherence with the iAAFT algorithm and C) surrogate data test with our algorithm.The color maps code for Ψ(t, f ) values.Unmasked color regions in panels B-C indicate the q 0.05 significant levels.Black maps indicate the cones of influence that delimit the time-frequency regions not influenced by edge effects.

FIG. 4 .
FIG.4.Ψ(t, f ) values estimated from real spatial systems and statistical differences with those values obtained from different surrogate data tests.For measles data, missing values in each original time series were imputed using a local average, i.e. the mean of the two neighboring time points.Same stipulations as in the caption of Figure3