Spontaneous cortical activity transiently organises into frequency-specific phase-coupling networks

Frequency-specific oscillations and phase-coupling of neuronal populations are essential mechanisms for the coordination of activity between brain areas during cognitive tasks. Therefore, the ongoing activity ascribed to the different functional brain networks should also be able to reorganise and coordinate via similar mechanisms. We develop a novel method for identifying large-scale phase-coupled network dynamics and show that resting networks in magnetoencephalography are well characterised by visits to short-lived transient brain states, with spatially distinct patterns of oscillatory power and coherence in specific frequency bands. Brain states are identified for sensory and motor networks as well as for higher-order cognitive networks. The cognitive networks include a posterior alpha-range (8–12 Hz) and an anterior delta/theta-range (1–7 Hz) network, both exhibiting high power and coherence in areas that correspond to posterior and anterior subdivisions of the default mode network. Our results show that large-scale cortical phase-coupling networks have characteristic signatures in very specific frequency bands, possibly reflecting functional specialisation at different intrinsic timescales.

A relevant related issue is the choice of metric used to measure phase-coupling. There are several alternatives, of which spectral coherence and the phase-locking value (PLV) are two popular examples. Although PLV has been claimed to represent phase-coupling more faithfully than spectral coherence 3, in this work we found PLV to be less robust than coherence at the large scale, partly because our data-driven spectral decomposition (as performed for coherence) is not directly applicable to PLV, so we had to rely on an arbitrary specification of the frequency bands. Furthermore, PLV has its own limitations, such as its dependence on the filtering and the subsequent use of the Hilbert transform for instantaneous phase calculation 4; see also 5 for a comprehensive description of this issue. It is also worth noting that one of the main limitations of spectral coherence is its assumption of stationarity, which is mitigated here by the fact that the HMM breaks, to some extent, the non-stationarity of the signal into short visits to quasi-stationary states. Further, PLV is not automatically immune to power bias either, given that phase-locking is inferred more reliably when the power (and therefore the signal-to-noise ratio) is high. The advantages and disadvantages of different approaches to computing phase-coupling are, however, beyond the scope of this paper, as a satisfactory solution would require properly dealing with the problems of nonlinearity and nonstationarity, which lie at the heart of the limitations of both spectral coherence and the phase-locking value.
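To illustrate PLV's dependence on an a-priori band specification, the following sketch computes PLV in a fixed band via band-pass filtering and the Hilbert transform. The band edges, filter order, and toy signals are our own assumptions for illustration, not the pipeline used in this work.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def plv(x, y, fs, band):
    """Phase-locking value between two signals in a fixed frequency band:
    band-pass filter, take instantaneous phases via the Hilbert transform,
    then average the unit phase-difference vectors."""
    b, a = butter(4, band, btype="bandpass", fs=fs)
    px = np.angle(hilbert(filtfilt(b, a, x)))
    py = np.angle(hilbert(filtfilt(b, a, y)))
    return np.abs(np.mean(np.exp(1j * (px - py))))

# Two alpha-band signals with a fixed phase lag: PLV close to 1.
rng = np.random.default_rng(0)
fs = 250
t = np.arange(0, 10, 1 / fs)
x = np.sin(2 * np.pi * 10 * t) + 0.1 * rng.standard_normal(t.size)
y = np.sin(2 * np.pi * 10 * t + 0.5) + 0.1 * rng.standard_normal(t.size)
print(plv(x, y, fs, (8, 12)))  # close to 1
```

Note that the band `(8, 12)` must be specified by hand, which is exactly the arbitrariness discussed above; the data-driven decomposition used for coherence avoids this step.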
A central parameter in our approach is the number of HMM states, which we have here chosen to be twelve. Importantly, we do not claim this number to be closer to any biological ground truth than, for example, eight or sixteen. Although it is possible to guide the choice of the number of states using quantitative measures such as the free energy 6, or even using non-parametric approaches that automatically determine the number of states 7, in practice different numbers of states simply offer different levels of detail of brain dynamics. Indeed, examining different degrees of abstraction can reveal useful insights. For example, when we ran the proposed approach with six states (see Supplementary Figure 8), the posterior higher-order cognitive state was fused with the first two states depicted in Supplementary Figure 2. This is unsurprising considering their relatively similar spectral and spatial features. In this analysis, the left and right temporal states (see Supplementary Figure 2), which are characterised by high asymmetry and are possibly related to language, were also merged into a single symmetric state containing the patterns of both. In summary, running the HMM with different numbers of states and combining the results in a principled way can provide a hierarchical view of the data that is hidden to other approaches. Thanks to the stochastic scheme of inference 8, HMM runs are not computationally expensive to produce, facilitating these exploratory analyses. For a practical discussion on the choice of the other HMM parameters, refer to Methods. Another important aspect (and a sensible way of guiding the choice of the number of states) is state reliability; that is, how robust are the states across, for example, half-splits of the data? The question of reproducibility is discussed below.
The model specification of the HMM, which assigns state probabilities at each time point, implicitly assumes that only one state is active at any point in time. However, it is worth noting that network multiplexing may still be realised at slower time scales through temporal correlation of the rates of occurrence of the states. At the faster time scale of HMM switching, any conclusion about brain network exclusivity must be made with caution, as exclusivity is by no means necessarily a physiologically meaningful feature of the brain. Addressing the information contained in the state time courses at multiple time scales is an important area for future investigation.
The TDE-HMM is a useful representation of the data, but it is not the only possible one. For instance, a high-order multivariate autoregressive model has the potential to explain very rich dynamics to a similar extent as, but in an alternative manner to, an HMM with a simpler observation model 9,10. Armed with just resting data, it is not possible to disambiguate between these two different descriptions of the data; which one is more appropriate rather depends on the question at hand. A potential reason to use the HMM over a single-state, more complex model (such as a high-order multivariate autoregressive model) is that the HMM explicitly parameterises the time series through the state time courses, opening avenues to investigate, for example, the interactions between rest and task. Further, it is through the use of the HMM in this work that we have been able to successfully identify networks of spatially distinct patterns of oscillatory power and phase-coupling in specific frequency bands, in a manner that has not been achieved previously with other approaches, including the autoregressive model.

Slow-frequency spectral properties within fast state visits
We obtained state visits that were often well under 100ms (Figure 5). How can this be compatible with the slow frequencies (e.g. the delta/theta bands) that characterise the states? Here, we show through simulations that this is theoretically and practically possible. We simulated data in which segments of an 8Hz theta wave are interspersed with unstructured signal, performing three sets of simulations. In each, the durations of the wave segments (which start at random points of the theta period) are sampled from a Poisson distribution with mean 0.025s, 0.05s, and 0.1s, respectively. The separation between segments is sampled from a Poisson distribution with mean 1s in all cases, which makes the different wave occurrences completely phase-independent. Small-variance Gaussian noise is added to the generated signals. We simulated 20min of data at 250Hz for each simulation, and assumed a state time course that is active only during the wave segments; hence, the duration of a wave segment corresponds to the duration of a state visit. We then used the state-wise multitaper used in Results and in 9 to assess the spectral content of the signal. Supplementary Figure 9 shows the spectral estimation on the top, and an example of a wave segment for each mean dwell time on the bottom. In all three cases, and despite the short state visits, the state-wise multitaper is able to find the correct frequency, even though the frequency resolution degrades somewhat as the wave segments (state visits) are made shorter, leaking power toward faster frequencies.
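This simulation can be sketched as follows. For brevity we simulate 2 minutes rather than 20, draw durations from an exponential distribution as a stand-in for the Poisson scheme described above (mean burst duration 0.1s, the longest condition), and use a Welch estimate restricted to the burst samples in place of the state-wise multitaper; all of these are our own simplifications.

```python
import numpy as np
from scipy.signal import welch

rng = np.random.default_rng(0)
fs, dur = 250, 120                      # 2 min sketch instead of the full 20 min
n = fs * dur
x = 0.05 * rng.standard_normal(n)       # small-variance background noise
state = np.zeros(n, dtype=bool)         # "state time course": True during bursts

t = 0
while t < n:
    t += int(rng.exponential(1.0) * fs)            # ~1 s gaps: bursts phase-independent
    seg = max(1, int(rng.exponential(0.1) * fs))   # mean 0.1 s burst (state visit)
    if t + seg >= n:
        break
    phase = rng.uniform(0, 2 * np.pi)              # burst starts at a random phase
    x[t:t + seg] += np.sin(2 * np.pi * 8 * np.arange(seg) / fs + phase)
    state[t:t + seg] = True
    t += seg

# Spectrum restricted to the burst samples (a crude stand-in for the
# state-wise multitaper): the 8 Hz content survives the short visits.
f, p = welch(x[state], fs=fs, nperseg=int(min(256, state.sum())))
print(f[np.argmax(p)])
```

As in the full simulation, the short visits broaden the spectral peak, but power remains concentrated around the true frequency rather than at faster frequencies.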

Leakage reduction and phase-locking coherence estimation
In this work, we used the method proposed in 11 in order to reduce the effect of signal leakage (volume conduction). Without this step, the estimation of phase-locking becomes dominated by a pattern of artefactual local connections that is common to all states. While the approach proposed in 11 has been shown to work well in the context of MEG amplitude correlations, its application to phase-locking networks is less well established. Recently, Pascual-Marqui and colleagues 12 have challenged the aforementioned approach, particularly within the context of estimating phase-locking measures, showing that under certain conditions artefactual connections may arise. The authors also provide an alternative approach, based on the multivariate autoregressive model, that may overcome these issues.
To check the approach presented in 11 in the context of phase-locking, we also applied Pascual-Marqui's approach 12 to our real data, and compared the resulting state-specific phase-locking with the estimations depicted in Figure 2. In Supplementary Figure 10, phase-locking connectivity is shown for the same four HMM states after applying Pascual-Marqui's method. As observed, the differences between Supplementary Figure 10 and Figure 2 are limited, with the method proposed by Pascual-Marqui and colleagues being slightly more conservative; the main features of the HMM states are preserved.
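For intuition, a simplified symmetric-orthogonalisation step in the spirit of 11 can be sketched as follows. This is a Löwdin-style sketch of our own, not the full iterative closest-orthogonal-matrix procedure with amplitude re-scaling used in that work.

```python
import numpy as np

def symmetric_orthogonalise(X):
    """Symmetric (Loewdin-style) orthogonalisation of a (time x regions)
    matrix: the closest set of mutually orthogonal columns to X in the
    least-squares sense, so zero-lag correlations between columns vanish.
    Column norms are restored so amplitudes stay comparable."""
    X = X - X.mean(axis=0)                       # centre each region's signal
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U @ Vt) * np.linalg.norm(X, axis=0)

# Three heavily "leaked" copies of one source: correlations near 1 before,
# exactly zero (on average, over the whole series) after correction.
rng = np.random.default_rng(1)
z = rng.standard_normal((1000, 1))
X = np.hstack([z + 0.1 * rng.standard_normal((1000, 1)) for _ in range(3)])
Xo = symmetric_orthogonalise(X)
C = np.corrcoef(Xo.T)
print(np.max(np.abs(C - np.eye(3))))  # ~0
```

The key point, relevant to the discussion of zero-lag relationships below, is that the orthogonality constraint applies to the whole time series at once, not to every instant.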
A related issue is whether leakage correction, which makes the signals orthogonal across the entire time series, completely precludes zero-lag (or small-lag) relationships, which are central to the theory of communication through coherence 13. Importantly, leakage correction operates at the level of the entire time series, and so only removes zero-lag correlations on average. This means that it is still possible to have transient periods of zero- or small-lag synchronisation. Focusing on the anterior and posterior higher-order cognitive networks, Supplementary Figure 11 illustrates this point by showing the phase at which different regions have a high coherence with the PCC (using a threshold of 0.05). Each dot thus represents a region with high coherence with the PCC at the indicated frequency, and colours represent large-scale cortical areas. Importantly, due to the sign ambiguity issue, it is not possible to distinguish in-phase (0) from anti-phase (π) coherence. If we assume that anti-phase relationships actually represent in-phase relationships, this figure suggests that many of the (transient) phase-locking relationships are in fact close to zero-lag.
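The phase at which two regions cohere, as plotted in Supplementary Figure 11, can be read off the angle of the cross-spectrum. A toy sketch with signals and a lag of our own choosing (not the actual Supplementary Figure 11 pipeline):

```python
import numpy as np
from scipy.signal import coherence, csd

rng = np.random.default_rng(4)
fs = 250
t = np.arange(0, 60, 1 / fs)
common = np.sin(2 * np.pi * 10 * t)          # shared 10 Hz rhythm
x = common + 0.5 * rng.standard_normal(t.size)
y = np.roll(common, 5) + 0.5 * rng.standard_normal(t.size)  # 20 ms (5-sample) lag

f, coh = coherence(x, y, fs=fs, nperseg=512)
f, Sxy = csd(x, y, fs=fs, nperseg=512)
i = np.argmin(np.abs(f - 10))
# High coherence at 10 Hz; phase magnitude ~ 2*pi*10*0.02 ~ 1.26 rad
print(round(float(coh[i]), 2), round(float(np.angle(Sxy[i])), 2))
```

A zero-lag relationship would instead give a cross-spectral phase near 0 (or, with the sign ambiguity discussed above, near π).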

Reproducibility
We assessed the reproducibility of the states by randomly splitting the data into two groups of subjects (half-splits) and running the HMM inference five times separately on each half. We also ran the HMM on the full data set five times. This is intended to evaluate the reproducibility of the results both across different HMM runs and across different subjects. We obtained 12 states from each run and matched the states across runs (between the two partitions and to the full-cohort run) such that the similarity between state pairs is maximal, using Riemannian distances to quantify the dissimilarities between states (see Methods). Supplementary Figure 12a shows the Riemannian distances between each pair of states (within and between runs). We then performed statistical testing on the consistency between runs across halves. Conceptually, we aimed to test whether each pair of matched states (one per half) reliably represents the same process: if the distance between the states within the pair is significantly lower than between any two non-matched states, the state represented by this pair is robust across runs and subjects. This is shown in Supplementary Figure 12b (left), where several states appear to be significantly reliable. A notable exception is the anterior higher-order cognitive state, the reason being its high similarity with state 5 (see Supplementary Figure 12a); that is, these two states are relatively similar and can potentially be mixed in certain runs. If, on the other hand, we test the distance of each pair of matched states against the average distance between any pair of states (that is, a less conservative test), the anterior higher-order cognitive state appears as highly reliable (Supplementary Figure 12b, right). In summary:
- Overall, there is a strong similarity across runs, both within and between half-split partitions of the data (and relative to the full data set).
- Within sessions, some states are relatively similar, suggesting some form of state hierarchy.
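The matching step above can be sketched as follows, using the affine-invariant Riemannian distance between state covariance matrices and the Hungarian algorithm for the pairing. The toy covariances are illustrative only; the real analysis matched the 12 estimated state covariances across runs.

```python
import numpy as np
from scipy.linalg import eigh
from scipy.optimize import linear_sum_assignment

def riemannian_distance(A, B):
    """Affine-invariant Riemannian distance between two SPD matrices:
    sqrt of the sum of squared log-eigenvalues of the generalised
    eigenproblem A v = lambda B v."""
    ev = eigh(A, B, eigvals_only=True)
    return float(np.sqrt(np.sum(np.log(ev) ** 2)))

def match_states(covs_a, covs_b):
    """Pair up two sets of state covariances (e.g. from the two half-splits)
    so that the total distance over matched pairs is minimal."""
    D = np.array([[riemannian_distance(A, B) for B in covs_b] for A in covs_a])
    rows, cols = linear_sum_assignment(D)
    return cols, D[rows, cols]

# Three toy SPD "state covariances" and a slightly perturbed, permuted copy.
rng = np.random.default_rng(2)
def rand_spd(n=4):
    M = rng.standard_normal((n, n))
    return M @ M.T + n * np.eye(n)

covs_a = [rand_spd() for _ in range(3)]
covs_b = [covs_a[i] + 0.01 * np.eye(4) for i in (2, 0, 1)]
perm, dists = match_states(covs_a, covs_b)
print(perm)  # recovers the permutation [1 2 0]
```

The within-pair distances can then be compared against non-matched distances, as in the permutation tests of Supplementary Figure 12b.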

Supplementary Note 4: HMM states from surrogate data
In this paper, we have proposed a model that finds separable, spectrally-defined states from MEG data. Applied to the resting state, we found a number of states with interpretable characteristics. Being aware that the brain's complex dynamics can be equally well represented in multiple ways (see above), it is important to investigate how the HMM states found in the real data differ from those found in surrogate data simulated from complex single-state dynamic models. To test this, we implemented a surrogate data generation procedure in which we kept analogous 1/f dynamics and autocorrelations while breaking the state-specific dynamics, using autoregressive models. In particular, using the 42-channel data used in the rest of the paper before leakage correction, we estimated two models: (i) a multivariate autoregressive model of order 3 (MAR(3), with 42² x 3 parameters), which captured between-channel autocorrelations, and (ii) a collection of univariate autoregressive models (one per channel) of order 21 (AR(21), with 21 x 42 parameters), which did not capture between-channel spectral characteristics but was able to estimate more detailed within-channel spectral features. We then sampled data from these two models, corrected for signal leakage, and applied the HMM.
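The fit-then-sample surrogate scheme can be sketched as follows. This is a minimal least-squares MAR estimator and sampler of our own, shown on a two-channel toy process; the actual models were fitted to the 42-channel data.

```python
import numpy as np

def fit_mar(X, p):
    """Least-squares fit of a multivariate autoregressive model of order p:
    X[t] = sum_k X[t-k] @ A_k + noise, for (time x channels) data X.
    Returns the stacked coefficients (channels*p x channels) and the
    residual noise covariance."""
    T, n = X.shape
    Y = X[p:]
    Z = np.hstack([X[p - k - 1:T - k - 1] for k in range(p)])  # lagged design
    A, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    resid = Y - Z @ A
    return A, np.cov(resid.T)

def sample_mar(A, Sigma, T, n, p, rng):
    """Generate surrogate data with the fitted autocorrelation structure
    but, by construction, no state switching."""
    L = np.linalg.cholesky(Sigma)
    X = np.zeros((T + p, n))
    for t in range(p, T + p):
        z = np.hstack([X[t - k - 1] for k in range(p)])
        X[t] = z @ A + L @ rng.standard_normal(n)
    return X[p:]

# Toy check: recover a known MAR(1) and sample a surrogate from it.
rng = np.random.default_rng(2)
T, n = 4000, 2
A_true = np.array([[0.5, 0.1], [0.0, 0.4]])   # X[t] = A_true @ X[t-1] + noise
X = np.zeros((T, n))
for t in range(1, T):
    X[t] = X[t - 1] @ A_true.T + 0.1 * rng.standard_normal(n)
A_hat, Sigma = fit_mar(X, p=1)
surr = sample_mar(A_hat, Sigma, 1000, n, 1, rng)
```

Running the HMM on `surr` rather than `X` is the operation described above: the surrogate preserves the (stationary) spatiotemporal autocorrelations while removing any genuine state-specific dynamics.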
Consistent with our expectations, and as discussed in previous work 9,10, the autoregressive model is indeed able to represent complex dynamics. Given the complexity of the autoregressive models considered here (the MAR(3) and AR(21) models), a single (low-rank) lagged cross-correlation (as corresponds to an HMM state in our model) cannot represent the data well enough, and therefore several states are necessary. In other words, complex 1/f and cross-spectral dynamics can be represented either by an autoregressive model with a large number of parameters or by a set of less complex models, as we use in this work. As a consequence, different HMM states emerge from these simulated data.
We compared the HMM decomposition obtained from the real data with the HMM decompositions obtained from these synthetic scenarios. Supplementary Figure 13 shows the results. The states obtained from the real data differ markedly from the states obtained from the MAR(3) and AR(21) models. As a first approach, we compared the wideband spectral maps (see Methods) between the surrogate and the real states. For this, we paired up the synthetic states with the real states such that the correlation is maximal. In the MAR(3) case, the correlations between real and synthetic states are relatively low (top left); in the AR(21) case, which captures the within-channel spectral information more faithfully, the correlation is much higher, with the interesting exception of the higher-order cognitive states and the fifth state (top right). As demonstrated in the bottom-right panel, these high correlations are, however, trivially explained by the across-states average power profile being very similar between synthetic and real states; this was expected given the AR(21) model's high explanatory power in the spectral domain. Note that this grand-average correlation is missed in the MAR(3) model case, most likely because leakage correction removes a large extent of the (stationary) information. Most importantly, the states obtained from the real data are much more diverse than those corresponding to either of the synthetic models, as shown in the bottom panels. This, together with the inability to identify the higher-order cognitive states, suggests that the HMM states obtained here cannot be trivially obtained from just any data with the same spatiotemporal autocorrelations.

Simulating data from the HMM
Each state model of the TDE-HMM corresponds to a Gaussian process 14. In order to generate T time points of data from a given state, one can construct a (no. of regions x T) by (no. of regions x T) covariance matrix by rearranging the elements of the (no. of regions x time lags) by (no. of regions x time lags) state covariance matrix. To generate data from the TDE-HMM, we used the two higher-order cognitive states and selected a subset of regions for computational simplicity (ACC, PCC, and the left and right intraparietal sulci). We then sampled 30min of data, alternating between these two states, with state visits set to last between 0.2 and 2s. We then ran the HMM inference on the simulated data. The inference, as shown in Supplementary Figure 14, was able to accurately recover the simulated state time course, with a correlation between simulated and estimated time courses of r=0.98. Furthermore, when simulating from one single state, the HMM inference was able to reduce the complexity of the model by eliminating all states but one.
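The rearrangement from a lagged state covariance into the covariance of T consecutive samples can be sketched as follows. We use a toy stationary autocovariance in place of a fitted state model; the number of regions, lags, and the AR(1)-like decay are our own assumptions for illustration.

```python
import numpy as np

def build_block_toeplitz(ccf, T):
    """Expand a lagged cross-covariance function ccf[l] (regions x regions,
    for lags l = 0..L, with the lag -l block equal to ccf[l].T by
    stationarity) into the (regions*T) x (regions*T) covariance matrix of
    T consecutive samples of the Gaussian process."""
    n = ccf[0].shape[0]
    L = len(ccf) - 1
    C = np.zeros((n * T, n * T))
    for i in range(T):
        for j in range(T):
            l = i - j
            if abs(l) <= L:
                block = ccf[l] if l >= 0 else ccf[-l].T
                C[i * n:(i + 1) * n, j * n:(j + 1) * n] = block
    return C

# Toy stationary process: AR(1)-like autocovariance, two regions, 5 lags.
rho, n, T = 0.5, 2, 50
ccf = [rho ** l * np.eye(n) for l in range(6)]
C = build_block_toeplitz(ccf, T)
rng = np.random.default_rng(3)
x = rng.multivariate_normal(np.zeros(n * T), C).reshape(T, n)  # one state visit
```

Sampling a full session then amounts to drawing such segments from alternating state models, with segment lengths set by the desired state-visit durations.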