## Introduction

When unoccupied by an external task, humans often engage in complex patterns of self-generated thinking. Commonly studied under the rubric of mind-wandering1,2 or daydreaming3, these patterns of thought share the feature of being unrelated to events in the here and now. However, they are also heterogeneous in content4 and outcome5, while their intrinsic nature constitutes a challenge to their measurement6. Although there is a growing understanding that self-generated states constitute an important feature of human cognition, we lack an agreed-upon classification of their defining features7,8, as well as the tools with which to study them9. The current study examined whether it is possible to gain insight into the repertoire of self-generated states by applying advanced machine learning methods to recordings of neural data during periods of wakeful rest, and using these to predict measures of experience.

Our study builds on an emerging literature using advanced analysis techniques to understand the organisation of self-generated cognition. One common method is to reduce the dimensionality of self-reported data in order to identify latent variables that make up participants’ descriptions of their experience. For example, Ruby and colleagues10 used principal component analysis (PCA) and identified patterns of self-generated thought that were distinguished by their associations with either the past or the future. Using lag analysis, they demonstrated that these two states had unique correlates with subsequent mood: prospective self-generated thought increased positive mood, while past thoughts were associated with the reverse pattern. Other authors have used a similar approach to establish that prospective self-generated thoughts are often structured and realistic11 and that they may play a role in the consolidation of personal goals into concrete plans12. Another approach involves the application of clustering techniques to experience sampling data. For example, Andrews-Hanna and colleagues13 used hierarchical clustering to determine distinct patterns of negative and self-relevant thought and specific thinking, each with unique psychological correlates. Other studies have employed techniques that explored the temporal features of experience sampling data. For example, Zanesco14 applied a temporal clustering method to experience sampling data generated across several different experimental paradigms. They found several patterns of ongoing cognition that were similar across tasks, and identified regularities in the order in which the hidden cognitive states emerge.
Our study builds on these findings by applying a temporal decomposition method to neural data recorded during a period of wakeful rest, a situation which has a high prevalence of self-generated mentation15 and in which prior studies have successfully linked neural activity to retrospective reports of experience16,17,18. Our aim was to determine whether the application of advanced machine learning methods on neural data revealed states that had reliable associations with an individual’s self-reported experience during that period.

In our study, therefore, we measured neural function during wakeful rest in a large cohort of healthy individuals using functional magnetic resonance imaging (fMRI). These participants completed in the scanner a set of questions related to their experiences during the scan, and a subset completed questionnaire measures of physical and mental health at a later session. We applied hidden Markov modelling (HMM) to the neural data to reveal distinct brain states corresponding to specific periods of temporally reoccurring patterns of neural mean activity and functional connectivity19. We then used the dwell-times of the states as predictors in a multivariate multiple regression with the reports given at the end of the scan as the outcome measures, aiming to explore their particular relationships in detail. Finally, we investigated how these states are distributed across a multi-dimensional space, formed by three well-described neural hierarchies based on the differentiation of whole-brain connectivity patterns20. A number of recent studies have highlighted a composite of large-scale hierarchies as a way to understand human cortical organisation in health and disease21,22,23. These gradients can present information about the non-linear functional interactions between different neural systems which, as recent studies have shown, can be particularly beneficial in the investigation of the multi-dimensional features of higher-order cognition24,25. To that end, our study aimed to examine the possibility that these hierarchies represent a functional landscape that describes how organised transient neurocognitive states emerge. The study workflow is presented in Fig. 1.

## Results

### Identifying reoccurring neural states at rest

We applied hidden Markov modelling to the resting state fMRI data (see Methods). This method uses Bayesian inference to identify reoccurring patterns of brain network activity; see19,26 for an empirical validation of this approach. Here, we generated HMM solutions capturing mean activity and functional connectivity patterns within the neural data recorded from our sample. We opted for a 7-state solution; however, our results highlighting associations with behaviour remained broadly similar for a decomposition of 9 states. We ran the algorithm 10 times in each case and found that the 7-state solution produced the same decomposition across all iterations. For this reason, we did not ask for a solution with a smaller number of states, so as not to unnecessarily reduce the algorithm’s ability to provide a detailed description of the dataset. Furthermore, for our data and model parameters, the 7-state solution was relatively more stable and had greater split-half reliability than the 9-state solution (see Fig. S2). Accordingly, we focus on the 7-state solution and report a parallel analysis of the 9-state solution in the Supplementary Materials (Fig. S6 and S8). Figure 2 shows the spatial description of neural mean activity in each state and the associated cognitive terms produced by a meta-analysis using Neurosynth27, displayed in the form of word clouds.

### The relationship between neural states and experience

Having identified a set of relatively stable recurring patterns within our data, we next explored how these naturally occurring neural states are related to the experiences reported by our participants. To answer this question, we performed a multivariate analysis of covariance, in which the mean dwell-time of each state was an explanatory variable and the self-reported data were the dependent variables (see Methods for details of the analyses; the specific questions can be found in Table S1). We included motion, age and gender as covariates of no interest.

This analysis revealed that the mean dwell-time of states 3 and 7 had a significant multivariate association with patterns of reports made by our participants at the end of the scan (F(8, 238) = 2.22, $$p = 0.027$$, Wilks’ $$\Lambda$$ = 0.931, partial $$\eta ^2 = 0.069$$ for state 3 and F(8, 238) = 2.18, $$p = 0.03$$, Wilks’ $$\Lambda = 0.932$$, partial $$\eta ^2 = 0.068$$ for state 7). These results are presented in Fig. 3. It can be seen in the radar plot that state 3 is dominated by task-positive systems (frontal parietal and dorsal attention) as well as the medial temporal subsystem of the default mode network (DMN). Experientially, it was associated with a state of autobiographical planning, reflecting high scores on “Future” and “Problem solving” thoughts. In contrast, state 7 was associated with the somatomotor network, the core and lateral temporal subsystems of the DMN, and the limbic system. This state was associated with reports of unintentional, intrusive thoughts about the past. Our analysis therefore identified two distinct neural states that were uniquely associated with experiential foci on different temporal epochs (the past and the future), a pattern that is routinely seen in experience sampling studies (e.g.10,28,29).

Psychological studies have shown that patterns of ongoing thought often have associations with negative affect, and in particular those associated with the past10,29,31. Based on this evidence, our next analysis aimed at identifying whether the mean dwell-time of the states varied with trait measures of well-being recorded in a subset of these participants. This analysis found that two of the states had a multivariate association with the questionnaire data (state 4, F(3,155) = 2.71, $$p = 0.047$$, Wilks’ $$\Lambda = 0.95$$, partial $$\eta ^2 = 0.05$$ and state 7, F(3, 155) =  3.74, $$p = 0.013$$, Wilks’ $$\Lambda = 0.933$$, partial $$\eta ^2 = 0.067$$). These relationships are presented in Fig. 4 where it can be seen that state 4 is associated with higher levels of reports of depression and rumination. In contrast, state 7, also identified in our prior analyses, was most strongly linked to trait, state and social anxiety, rumination, depression, as well as greater symptoms linked to autism. Notably, therefore, our analysis identified that state 7, linked to unpleasant thoughts about the past, was also linked to states characterised by psychiatric symptomatology. Our observation of an association between retrospective experiential states and unpleasant affective states is consistent with prior experience sampling studies10,29,31, increasing our confidence in the association between the neural states and patterns of ongoing thought.
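The multivariate statistics reported for the experience and trait analyses can be cross-checked against one another. The sketch below is an illustration rather than the analysis code used in the study. It assumes that each dwell-time is a single continuous predictor, so that partial $$\eta ^2 = 1 - \Lambda$$ and $$F = \frac{1-\Lambda }{\Lambda }\cdot \frac{df_2}{df_1}$$, and that each design contained an intercept, seven dwell-times and three covariates. The function names are hypothetical.

```python
def partial_eta_squared(wilks_lambda: float) -> float:
    """Partial eta squared for a one-degree-of-freedom multivariate effect."""
    return 1.0 - wilks_lambda


def f_from_wilks(wilks_lambda: float, df1: int, df2: int) -> float:
    """Exact F statistic implied by Wilks' lambda for a one-df hypothesis."""
    return (1.0 - wilks_lambda) / wilks_lambda * df2 / df1


def denominator_df(n: int, n_predictors: int, n_outcomes: int) -> int:
    """Error df (n minus predictors and intercept) minus outcomes plus one."""
    return (n - (n_predictors + 1)) - n_outcomes + 1


# Assumed design: 7 dwell-times + motion, age and gender = 10 predictors.
N_PREDICTORS = 10

# (label, n, number of outcome components, reported Wilks' lambda)
reported = [
    ("experience, state 3", 256, 8, 0.931),
    ("experience, state 7", 256, 8, 0.932),
    ("traits, state 4", 168, 3, 0.950),
    ("traits, state 7", 168, 3, 0.933),
]

for label, n, p, lam in reported:
    df2 = denominator_df(n, N_PREDICTORS, p)
    print(label, p, df2,
          round(partial_eta_squared(lam), 3),
          round(f_from_wilks(lam, p, df2), 2))
```

Under these assumptions the degrees of freedom match those reported (238 and 155), and the recomputed effect sizes and F values agree with the reported ones up to the rounding of $$\Lambda$$.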

Finally, we examined the robustness of our results linking the neural states to the psychological measures using the mean dwell-times of the 9-state solution. In the 9-state solution there were states that were similar to those in the 7-state solution (see Fig. S3). A multivariate analysis showed that state 5, the homologue of state 3 in the 7-state solution, was associated with a similar multivariate pattern of experience, and state 6 (the homologue of state 7 in the 7-state solution) was related to the same pattern of trait measures (Fig. S6). Although no pattern of thoughts was significantly associated with state 6 in the 9-state solution, its experiential correlates were most similar across individuals to those seen in state 7 from the 7-state solution (r = 0.6, $$p <.001$$, see Fig. S8). These analyses suggest that associations of neural patterns with both experience and well-being are relatively well-preserved across small changes in the number of states generated through HMM.

### The relationship between naturally occurring states at rest and neural hierarchies

Having identified network states related to experience and trait measures of well-being, we next explored their association with three well-described neural hierarchies that highlight gradients in connectivity patterns over space. We chose the first three hierarchies from20 which represent the division of unimodal and transmodal systems (gradient 1), the distinction between vision and motor (gradient 2) and the patterns seen during the brain’s response to tasks relative to rest (gradient 3). These were generated using diffusion embedding, a non-linear dimensionality reduction technique, on the Human Connectome Project Data, and identified components describing the maximum variance in functional connectivity patterns (see Methods and initial paper for further details). The spatial distribution of each of these gradients is presented in Fig. 5a along with a meta-analysis of these maps using Neurosynth.

The aim of this analysis was to understand whether there is a relationship between the states inferred by the HMM (see Fig. 2) and the spatial maps which represent these pre-established hierarchies. To achieve this goal, we calculated the pair-wise similarity between each of the hierarchies and each of the states (see “Methods”). We compared the associations between the real states inferred in our analyses to a null distribution of synthetic states. The null distribution was generated by randomly permuting the weights describing the relative contribution of each brain region to the mean neural activity captured by each state (see “Methods”). The results of these analyses are presented in Fig. 5b. The contour plot and histogram on the left show where the real data fall on a subset of the dimensions, while the three-dimensional scatter plot shows the same data in a combined manner. In these plots, the null distribution generated through permutation is represented as the shaded area in grey. We also generated a video that shows the data in a 3-d form (https://github.com/tkarapan/hmm-large-scale-hierarchies/blob/master/7S_Null.mp4).

The synthetic states clustered towards the middle of the three dimensions, indicating that states generated in a random manner tend to not be very similar in spatial terms to the three pre-established neurocognitive hierarchies20. In contrast, the states generated based on real data fall, on average, away from the centre and towards the outer edge of the distribution of synthetic states. Notably, the states with a significant association to experience, states 3 and 7, fall towards the mid-point of both gradient 1 and 2 and, in contrast, are maximally dissociated along the dimension describing how the brain responds to a task; state 3 is more similar to the task positive end of gradient 3, while state 7 shows the reverse. Finally, it can be seen that state 4, which alongside state 7 was associated with trait measures of well-being, falls towards the unimodal end of gradient 1.

To quantify whether the apparent difference between the real and permuted data was robust, we calculated the sum of the weighted distance from the origin of this space for each of the sets of states generated through permutation, and compared this to the same value from the real data. The distribution of these values is presented in the right-hand histogram in Fig. 5b, where it can be seen that the weighted distance sum of the real states from the centre of this space is higher than that of any of the sets of states generated synthetically.

Together, these analyses suggest that states determined through the application of HMM to our real data have an association with the spatial maps describing neural hierarchies that is greater than chance. To identify the robustness of this effect, we performed a parallel analysis based on the 9-state solution to the current data, as well as the published states inferred from applying HMM with the same model parameters on the Human Connectome Project data26, finding comparable results (see Fig. S10). Together, these analyses show that the states that occur naturally at rest, as approximated through HMM, fall at the extremities of a co-ordinate space defined by well-established neural hierarchies. This result implies that macro-scale hierarchies can be thought of as constraining the state space from which the intrinsic network states emerge.

## Discussion

The current study set out to understand whether it is possible to shed light on the repertoire of self-generated states that an individual engages in, through the application of advanced machine learning methods to neural data recorded during periods of wakeful rest. Our results indicated a significant association between the amount of time an individual spends in two neural states and the experiential reports gained at the end of the scan. One state had features mimicking patterns of neural activity seen during complex tasks e.g.32,33 and was associated with patterns of thoughts focused on future problem solving. A second state highlighted the dominance of the default mode network (DMN)34 and was linked to reports of unintentional intrusive thought about the past. Importantly, this state was also linked to traits associated with negative affect (anxiety, depression and rumination). Comparing these results to those from35, where we ran a time-averaged functional connectivity analysis for a subset of this sample, we found that the time-varying analyses used in this study provided additional information which would be otherwise left unexposed. Together, our study adds to a growing consensus on the utility of methods that explore neural activity from a dynamical perspective, as a tool for understanding patterns of ongoing cognition and behaviour9,36.

Our analyses, therefore, establish that techniques that parse continuous neural data into time-varying states can be used as a tool to empirically constrain accounts of the qualitative patterns of ongoing thought. By relating the amount of time an individual spent in specific neural states to the pattern of thoughts reported at the end of the scan, we highlighted many of the themes of self-generated thought that are common within the literature. For example, patterns of future problem solving identified in our study are consistent with a prospective bias to ongoing thought that develops in young adulthood37,38 and is common in multiple cultures (Australia38; Belgium11; China39; Germany10; Japan40; U.K.15; USA41,42). This may reflect a mode of autobiographical planning that is hypothesised to be a potential beneficial outcome of self-generated thought11,41. The second state we identified emphasises unintentional intrusive thoughts from the distant past and was associated with high levels of trait affective disturbance. This pattern of unhappy rumination may reflect the link between negative affective thought and a focus on the past that is often observed in studies of self-generated thought10,29,31,43. Neurally, this state highlights an association between temporal properties of the default mode network and well-being, complementing previous findings linking DMN to both positive26,44 and negative45,46 behavioural traits. Examining both states, we also found a distinction related to the spontaneity of thought patterns associated with neural activity. Spending longer periods in the first state was linked to less spontaneous, more deliberate thoughts about the future, whereas more time spent in the second state was associated with less deliberate, intrusive thoughts about the past. 
These findings are in line with studies reporting positive, constructive, future thinking47,48 and past related, unintentional, intrusive thoughts29,42, highlighting the spontaneity of thought as an important dimension of ongoing experience49. Notably, our study estimated states simply based on neural information without the need to repeatedly sample participants’ experience. In this context, the synergy between the qualitative motifs associated with our neural states and those identified by prior experience sampling studies helps minimise concerns that experience sampling results are in general a consequence of the meta cognitive demands imposed by this technique2.

Finally, our study provides a novel view of the relationships between specific macro-scale hierarchies and patterns of spontaneously occurring neurocognitive states. The two states identified that had experiential correlates fell at opposite extremes of a neurocognitive hierarchy that reflects the neural response to tasks. These data suggest that similar neural hierarchies that support general aspects of how individuals respond to the demands of an external task32 may continue to do so in the absence of any overt external behaviour. Importantly, this neural hierarchy discriminated between patterns of autobiographical planning, which are thought to be advantageous because they help consolidate personal goals12, and those linked to patterns of rumination, which exacerbate unhappiness29. Our results thus suggest that the same neural hierarchy that determines the brain’s response to increasing task demands may also discriminate beneficial and detrimental types of self-generated experiences.

Although our study highlights the utility of advanced machine learning methods in the determination of naturally occurring self-generated states, it has some limitations and leaves several important questions open. For example, previous research has highlighted the potential confounding effects of cardiac and respiratory cycles on the BOLD fMRI signal50,51,52. Since these physiological measures have been shown to relate to behaviour53, acquiring them in future studies can provide a way to control their effect. Furthermore, several studies have shown that changes in arousal and vigilance can explain a meaningful fraction of variance in neural activity and connectivity54,55,56. Given that HMMs can identify distinct neural patterns, it is possible that fluctuations in arousal may be at least partially captured by one or multiple HMM states. Vigilance and arousal, however, can have unique temporal properties, and may relate to cognition in multiple ways57,58,59. Therefore, the magnitude of their effect on neural dynamics, how much of it can be captured by specific algorithms, different ways to account for it, and the extent to which it can be distinguished from the effect of ongoing cognition are nontrivial questions that are left open for future research36,60. In addition, our application of HMMs allowed us to estimate the occurrence of transient states at rest; however, we only gained a single measure of an individual’s experience. This feature of our experimental design allows us to rule out certain meta-cognitive features of experience sampling as confounding our results (see above); however, it also makes unclear the extent to which the observed states are transient or result from more stable trait-like properties of the individual. In the future, studies could overcome this limitation by sampling an individual’s experience at rest on multiple occasions.
This would allow for a more quantified assessment of whether these states occur in a trait-like manner, or whether they are subject to more transient influences. Moreover, although our study highlights that neural information can be used to provide a quantified assessment of the repertoire of states that individuals engage in, it leaves open the specific experiential features that these states may include. In our study, we sampled individuals’ thoughts using 25 questions that were refined through a sequence of empirical investigations, yet it seems likely that these are not an exhaustive list of experiential states, because at least certain features of self-generated experience are probably not captured by the specific items we used. Accordingly, our study establishes that neural information can provide an important window into the types of thoughts individuals may engage in. Nevertheless, future work is needed to establish the most appropriate self-report items to capture the full range of the ontological features of self-generated experience. Finally, our study leaves open the specific duration that states can take. Our study used fMRI to estimate the neural states and we used a repetition time (TR) that was reasonably slow (3 s). This feature of our design acts as a hard limit to the duration of states that our analysis can determine, since we would be unable to determine states that lasted for durations substantially shorter than that of the TR. In the future, it will be possible to identify states with a shorter duration by performing a similar set of analyses, using a method of neuroimaging such as electro/magnetoencephalography that can acquire neural data in a more rapid manner.

## Methods and materials

### Participants

277 healthy participants were recruited from the University of York. Informed consent was obtained for all participants, the study was approved by the York Neuroimaging Centre Ethics Committee, and all research was performed in accordance with relevant guidelines and regulations. 21 participants were excluded from analyses, 1 due to technical issues during the neuroimaging data acquisition and 20 for excessive movement during the fMRI scan (mean framewise displacement61 $$> 0.3$$ mm and/or more than 15% of their data affected by motion), resulting in a final cohort of n = 256 (169 females, $$\mu _{age} = 20.7$$ years, $$\sigma _{age} = 2.4$$).

### Behavioural methods

We sampled participants’ experience during the resting state fMRI scan by asking them to retrospectively report their thoughts at the end of the scan. Experience was measured using a 4-point Likert scale, with the question order randomised (all 25 questions are shown in Table S1). In a subset (n = 168) of the final cohort, we also assessed physical and mental health by administering well-established questionnaire measures at a later, separate session outside of the scanner. Details about each questionnaire are presented in the SI.

### MRI data acquisition

MRI data were acquired on a GE 3 Tesla Signa Excite HDx MRI scanner, equipped with an eight-channel phased array head coil, at the York Neuroimaging Centre, University of York. For each participant, we acquired a sagittal isotropic 3D fast spoiled gradient-recalled echo T1-weighted structural scan (TR = 7.8 ms, TE = minimum full, flip angle = 20$$^{\circ }$$, matrix = 256 × 256, voxel size = 1.13 $$\times$$ 1.13 $$\times$$ 1 mm$$^3$$, FOV = 289 $$\times$$ 289 mm$$^2$$). Resting-state functional MRI data based on blood oxygen level-dependent (BOLD) contrast images with fat saturation were acquired using a gradient single-shot echo-planar imaging sequence with the following parameters: TE = minimum full ($$\approx$$19 ms), flip angle = 90$$^{\circ }$$, matrix = 64 $$\times$$ 64, FOV = 192 $$\times$$ 192 mm$$^2$$, voxel size = 3 $$\times$$ 3 $$\times$$ 3 mm$$^3$$, TR = 3000 ms, 60 axial slices with no gap and slice thickness of 3 mm. Scan duration was 9 minutes, which allowed us to collect 180 whole-brain volumes.

### fMRI data pre-processing

Functional MRI data pre-processing was performed using SPM12 (http://www.fil.ion.ucl.ac.uk/spm) and the CONN toolbox (v.18b) (https://www.nitrc.org/projects/conn)62 implemented in Matlab (R2018a) (https://uk.mathworks.com/products/matlab). Pre-processing steps followed CONN’s default pipeline and included motion estimation and correction by volume realignment using a six-parameter rigid body transformation, slice-time correction, and simultaneous grey matter (GM), white matter (WM) and cerebrospinal fluid (CSF) segmentation and normalisation to MNI152 stereotactic space (2 mm isotropic) of both functional and structural data. Following pre-processing, the following potential confounding effects were removed from the BOLD signal using linear regression: 6 motion parameters calculated at the previous step and their 1st and 2nd order derivatives, volumes with excessive movement (motion greater than 0.5 mm and global signal changes larger than z = 3), signal linear trend, and five principal components of the signal from WM and CSF (CompCor approach63). Finally, data were band-pass filtered between 0.01 and 0.1 Hz. Regarding global signal regression (GSR), studies have shown that it may remove both neuronal and non-neuronal information51,64. We followed multiple pre-processing steps (described above) in an effort to minimise confounding effects and, therefore, opted not to perform GSR in our main analyses, as we were uncertain whether the potential benefit of including GSR in our pipeline would justify the risk of removing true neuronal signal. However, to mitigate the absence of physiological measures (see Discussion), and given the considerable ongoing controversy over the matter65, we also ran the same pipeline following GSR. The findings of this analysis, presented in the Supplementary Materials, were in relative agreement with our original findings.
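The confound-removal and filtering steps can be illustrated with a minimal NumPy sketch. This is not CONN’s implementation: `regress_out` and `bandpass_fft` are hypothetical helper names, the confound matrix is a random placeholder, and a simple FFT mask is assumed in place of whatever filter the toolbox applies.

```python
import numpy as np


def regress_out(bold, confounds):
    """Residualise bold (T x V) against confounds (T x C).

    An intercept and a linear trend are added to the design, mirroring the
    'signal linear trend' term listed among the nuisance regressors.
    """
    n_vols = bold.shape[0]
    design = np.column_stack(
        [np.ones(n_vols), np.linspace(-1.0, 1.0, n_vols), confounds]
    )
    beta, *_ = np.linalg.lstsq(design, bold, rcond=None)
    return bold - design @ beta


def bandpass_fft(ts, tr, low=0.01, high=0.1):
    """Zero all Fourier components outside [low, high] Hz (assumed filter)."""
    freqs = np.fft.rfftfreq(ts.shape[0], d=tr)
    spectrum = np.fft.rfft(ts, axis=0)
    spectrum[(freqs < low) | (freqs > high)] = 0
    return np.fft.irfft(spectrum, n=ts.shape[0], axis=0)


# Toy data with the acquisition parameters from the text: 180 volumes, TR = 3 s.
tr = 3.0
t = np.arange(180) * tr
rng = np.random.default_rng(0)
keep = np.sin(2 * np.pi * 0.05 * t)   # inside the 0.01-0.1 Hz pass band
drop = np.sin(2 * np.pi * 0.15 * t)   # above the pass band
confounds = rng.normal(size=(180, 5))  # placeholder nuisance regressors
bold = (keep + drop)[:, None] + confounds @ rng.normal(size=(5, 1))

residual = regress_out(bold, confounds)
filtered = bandpass_fft((keep + drop)[:, None], tr)
cleaned = bandpass_fft(residual, tr)   # regression first, then filtering
```

With a TR of 3 s the Nyquist frequency is about 0.167 Hz, so the 0.1 Hz upper cut-off sits comfortably within the sampled range.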

### Dimensionality reduction

In order to reduce the dimensionality of the neuroimaging data, we used the 17-network functional parcellation from30, slightly eroded the parcels to avoid signal leakage from neighbouring parcels, masked them with subject-specific grey matter masks, and calculated the average signal from the unsmoothed data within each parcel for each volume per participant. The resulting time series were then standardised and concatenated to form a “(256 participants $$\times$$ 180 volumes) $$\times$$ 17 parcels” matrix that was used as input to the HMM algorithm.
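The construction of the HMM input matrix can be sketched as follows. The text does not state whether standardisation was applied per participant or across the whole sample; the example assumes per-participant, per-parcel z-scoring, and `build_hmm_input` is a hypothetical name.

```python
import numpy as np


def build_hmm_input(parcel_ts):
    """Stack per-participant parcel time series into one matrix.

    parcel_ts: list of (n_volumes, n_parcels) arrays, one per participant.
    Returns the concatenated matrix and the list of session lengths, which
    an HMM needs in order to avoid modelling transitions across participants.
    """
    standardised = []
    for ts in parcel_ts:
        ts = np.asarray(ts, dtype=float)
        # Assumption: z-score each parcel within each participant.
        standardised.append((ts - ts.mean(axis=0)) / ts.std(axis=0))
    lengths = [len(z) for z in standardised]
    return np.vstack(standardised), lengths


# Toy example: 4 participants instead of 256, with 180 volumes and 17 parcels.
rng = np.random.default_rng(0)
toy = [rng.normal(size=(180, 17)) for _ in range(4)]
X, lengths = build_hmm_input(toy)
print(X.shape, lengths)
```

For the full cohort the same procedure would give a 46,080 $$\times$$ 17 matrix (256 $$\times$$ 180 rows).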

For the behavioural data, we applied a principal component analysis (PCA) with varimax rotation to the scores describing the participants’ experience at the end of the resting state scan. As in many of our previous studies (e.g.18,66,67,68), using this kind of decomposition and rotation improves the identification and interpretation of distinct patterns of covariance in our multi-dimensional datasets. The PCA revealed eight orthogonal components with eigenvalues greater than 1. We followed the same procedure for the trait measures of well-being that we acquired for a sub-sample of our cohort, which identified three principal components in their structure (see Fig. S1, and Tables S2 and S3 for the loadings of both decompositions, and the variance explained by each component). The component scores were used as dependent variables in two separate multivariate multiple regressions, run at a later stage in order to investigate the relationship of ongoing experience and of well-being, respectively, with the mean dwell-times of the dynamic neural states described by our HMM decomposition. In order to help with the interpretation of the results, the beta weights from the regressions were mapped from the principal component space back to the multi-dimensional behavioural space and are shown as word clouds in the right-most column of Figs. 3 and 4.
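The decomposition described above (PCA on the item correlations, retention of components with eigenvalues greater than 1, varimax rotation, component scores) can be sketched in NumPy. This is an illustrative implementation under stated assumptions, not the software used in the study: the function names are hypothetical, the data are simulated, and the scores are computed with a least-squares projection onto the rotated loadings.

```python
import numpy as np


def varimax(loadings, max_iter=100, tol=1e-6):
    """Orthogonal varimax rotation of a (n_items, n_components) loading matrix."""
    L = np.asarray(loadings, dtype=float)
    p, k = L.shape
    R = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        Lr = L @ R
        u, s, vt = np.linalg.svd(
            L.T @ (Lr**3 - Lr @ np.diag((Lr**2).sum(axis=0)) / p)
        )
        R = u @ vt
        if criterion != 0.0 and s.sum() < criterion * (1 + tol):
            break
        criterion = s.sum()
    return L @ R


def pca_kaiser_varimax(scores):
    """PCA on the correlation matrix, keep eigenvalues > 1, rotate, score."""
    Z = (scores - scores.mean(axis=0)) / scores.std(axis=0, ddof=1)
    eigval, eigvec = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
    order = np.argsort(eigval)[::-1]
    eigval, eigvec = eigval[order], eigvec[:, order]
    keep = eigval > 1.0                      # eigenvalue-greater-than-1 rule
    loadings = eigvec[:, keep] * np.sqrt(eigval[keep])
    rotated = varimax(loadings)
    # Least-squares component scores for the rotated solution.
    comp_scores = Z @ np.linalg.pinv(rotated).T
    return rotated, comp_scores, eigval


# Simulated questionnaire: 200 respondents, 25 items, 3 latent dimensions.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
items = latent @ rng.normal(size=(3, 25)) + rng.normal(size=(200, 25))
rotated, comp_scores, eigval = pca_kaiser_varimax(items)
```

Because the rotation is orthogonal, the rotated component scores remain uncorrelated, which is what allows them to serve as separate outcome variables in the multivariate regressions.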

### Hidden Markov model

To characterise the dynamics of neural activity, we applied hidden Markov modelling to the concatenated time series of the 17 parcels30. The inference of the model parameters was based on variational Bayes and the minimisation of free energy, as implemented in the HMM-MAR toolbox69. The HMM’s inference assigns state probabilities to each time point of the time series (i.e. reflecting how likely each time point is to be explained by each state) and estimates the parameters of the states, where each state has its own model of the observed data. Each state was represented as a multivariate Gaussian distribution19, described by its mean and covariance. Inference was run at the group level, such that the state spatial descriptions are defined across subjects, whereas the temporal activation of the states is defined at the subject level. This allowed us to discover dynamic temporal patterns of whole-brain activity and functional interactions along with their occurrence (state time series) and transition probabilities for the duration of the whole resting state fMRI scan. Detailed information about the HMM implementation and the variational Bayes inference can be found in19,26,69. As with other probabilistic unsupervised learning methods (e.g. independent component analysis), HMM is sensitive to initial values. In order to account for HMM run-to-run variability, we ran the algorithm 10 times and selected the iteration with the lowest free-energy at the end of the inference. As all 10 decompositions were the same for the 7-state solution, our results would have been robust to choosing any of the 10 iterations. This would not have been the case for the 9-state solution, as these decompositions were not as stable (see Fig. S2). The summary statistic describing the states’ dwell-times was computed after hard classifying the states as being active or not by using Viterbi decoding. The mean dwell-time of each state (measured in TRs, see Fig. 
S5) was then used as an explanatory variable in subsequent multivariate generalised linear models. Dwell-times were standardised and any values $$> 2.5\sigma$$ were substituted with the mean ($$\mu$$ = 0) to minimise the effect of potential outliers. We also calculated two additional summary metrics describing how often participants switch from one state to any other state (switching rate), and the proportion of the scan that participants spend in each state (fractional occupancy), both of which can be seen in Supplementary Fig. S5. All analyses controlled for age, gender and motion during the resting state fMRI scan.
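The summary metrics can be illustrated on a hard-classified (Viterbi) state sequence. The sketch below is not the HMM-MAR implementation: the function names are hypothetical, the switching rate is taken simply as the fraction of volume-to-volume transitions that change state, a state that is never visited is assigned a dwell-time of zero, and the outlier rule is assumed to apply to absolute standardised values.

```python
from itertools import groupby

import numpy as np


def state_metrics(path, n_states):
    """Mean dwell-time (in TRs), fractional occupancy and switching rate."""
    path = list(path)
    # Collapse the sequence into (state, run length) pairs.
    runs = [(state, sum(1 for _ in grp)) for state, grp in groupby(path)]
    mean_dwell = np.zeros(n_states)
    frac_occ = np.zeros(n_states)
    for k in range(n_states):
        lengths = [n for state, n in runs if state == k]
        mean_dwell[k] = np.mean(lengths) if lengths else 0.0
        frac_occ[k] = sum(lengths) / len(path)
    switching_rate = (len(runs) - 1) / (len(path) - 1)
    return mean_dwell, frac_occ, switching_rate


def standardise_and_replace(dwell, threshold=2.5):
    """Z-score across participants; set |z| > threshold to the mean (0)."""
    z = (dwell - dwell.mean(axis=0)) / dwell.std(axis=0)
    z[np.abs(z) > threshold] = 0.0
    return z


# Toy Viterbi path for one participant with three states.
path = [0, 0, 0, 1, 1, 0, 2, 2, 2, 2]
mean_dwell, frac_occ, switching_rate = state_metrics(path, n_states=3)
print(mean_dwell, frac_occ, switching_rate)

# Toy dwell-times for 20 participants and one state, with a single outlier.
dwell = np.ones((20, 1))
dwell[0, 0] = 100.0
z = standardise_and_replace(dwell)
```

In the toy path, state 0 is visited in two runs of 3 and 1 volumes, giving a mean dwell-time of 2 TRs and a fractional occupancy of 0.4, which shows how the two metrics differ.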

### Projection of state maps to a 3-dimensional space of neural hierarchies

Following the HMM inference, each state s can be characterised by parameters ($$\mu _s$$, $$\Sigma _s$$), where $$\mu _s$$ is a vector containing the activation/weights (with respect to the average) of each of the 17 parcels used in our analyses and $$\Sigma _s$$ is the covariance matrix describing their functional interactions. Using the weights of each parcel, we obtained each state’s “mean activity” spatial map (Fig. 2) and calculated its spatial similarity to the first three gradient maps from20. These maps were produced by a diffusion embedding algorithm applied to data from the Human Connectome Project; they are orthogonal and highlight topographies of information exchange based on the differentiation of connectivity patterns. Similarity was calculated as the pairwise correlation between each state spatial map and each gradient, and was used as the state’s co-ordinate for the corresponding dimension in gradient space. Aiming to construct a summary metric describing the topography of the states in this space, we Fisher-z transformed the correlation values (used as gradient space co-ordinates), calculated the distance of each state from the origin weighted by its maximum co-ordinate (as a way to differentiate between points lying on the surface of the same sphere), and added the weighted distances to produce a sum for the set of all states:

$$\sum _{i=1}^{7}\left( \sqrt{\left( S_{iG1}\right) ^{2}+\left( S_{iG2}\right) ^{2}+\left( S_{iG3}\right) ^{2}}\cdot \max \left( S_{iG1},S_{iG2},S_{iG3}\right) \right)$$

where $$S_{iG1}$$, $$S_{iG2}$$, $$S_{iG3}$$ are the Fisher’s z correlation values between the spatial maps of state i and gradient 1, 2, and 3 respectively.
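A direct transcription of this summary metric into NumPy is given below as a sketch. The function name is hypothetical, and the maximum is taken over the signed co-ordinates exactly as the formula is written.

```python
import numpy as np


def weighted_distance_sum(r):
    """Weighted distance sum for a set of states.

    r: (n_states, 3) array of correlations between each state map and
    gradients 1 to 3.
    """
    z = np.arctanh(np.asarray(r, dtype=float))   # Fisher z transform
    distance = np.sqrt((z**2).sum(axis=1))       # Euclidean distance from origin
    weight = z.max(axis=1)                       # maximum (signed) co-ordinate
    return float((distance * weight).sum())


# One state whose Fisher-z co-ordinates are (0.3, 0.4, 0.0):
# distance 0.5 times maximum co-ordinate 0.4.
example = weighted_distance_sum(np.tanh([[0.3, 0.4, 0.0]]))
```

The weighting means that, of two states at the same distance from the origin, the one aligned with a single gradient contributes more than the one spread evenly across gradients.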

### Null distribution

In order to compute a null distribution, we randomly permuted the parcel weights, constructed the synthetic states’ spatial maps and projected them into gradient space, by calculating their similarity to the gradients in the same way as before. We ran 300 permutations, producing 2100 synthetic states and calculated the weighted distance sum of each set in a similar manner as with the set of empirical states.
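The permutation procedure can be sketched as follows. Several details are assumptions made for illustration: the parcel weights are shuffled independently within each state, the gradients are represented as one value per parcel, the state maps and gradients are simulated, and the permutation p-value uses the common add-one correction, which the text does not specify. All function names are hypothetical.

```python
import numpy as np


def gradient_coords(state_maps, gradients):
    """Pearson r between each state map and each gradient (rows are maps)."""
    n_states = state_maps.shape[0]
    return np.corrcoef(state_maps, gradients)[:n_states, n_states:]


def weighted_distance_sum(r):
    """Sum over states of distance from origin times maximum co-ordinate."""
    z = np.arctanh(r)
    return float((np.sqrt((z**2).sum(axis=1)) * z.max(axis=1)).sum())


def null_distribution(state_maps, gradients, n_perm=300, seed=0):
    """Weighted distance sums for sets of synthetic states."""
    rng = np.random.default_rng(seed)
    null = np.empty(n_perm)
    for i in range(n_perm):
        # Assumption: permute parcel weights separately for every state.
        synthetic = np.array([rng.permutation(row) for row in state_maps])
        null[i] = weighted_distance_sum(gradient_coords(synthetic, gradients))
    return null


# Simulated inputs: 3 gradients and 7 state maps over 17 parcels, with each
# state map constructed to resemble one gradient.
rng = np.random.default_rng(1)
gradients = rng.normal(size=(3, 17))
state_maps = gradients[np.arange(7) % 3] + 0.1 * rng.normal(size=(7, 17))

observed = weighted_distance_sum(gradient_coords(state_maps, gradients))
null = null_distribution(state_maps, gradients)       # 300 sets of 7 states
p_value = (1 + np.sum(null >= observed)) / (len(null) + 1)
```

With 300 permutations of 7 states each, the loop produces the 2100 synthetic states mentioned above, and the observed sum is compared against the 300 set-level sums.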