Brief segments of neurophysiological activity enable individual differentiation

Large, openly available datasets and current analytic tools promise the emergence of population neuroscience. The considerable diversity in personality traits and behaviour between individuals is reflected in the statistical variability of neural data collected in such repositories. Recent studies with functional magnetic resonance imaging (fMRI) have concluded that patterns of resting-state functional connectivity can both successfully distinguish individual participants within a cohort and predict some individual traits, yielding the notion of an individual’s neural fingerprint. Here, we aim to clarify the neurophysiological foundations of individual differentiation from features of the rich and complex dynamics of resting-state brain activity using magnetoencephalography (MEG) in 158 participants. We show that akin to fMRI approaches, neurophysiological functional connectomes enable the differentiation of individuals, with rates similar to those seen with fMRI. We also show that individual differentiation is equally successful from simpler measures of the spatial distribution of neurophysiological spectral signal power. Our data further indicate that differentiation can be achieved from brain recordings as short as 30 seconds, and that it is robust over time: the neural fingerprint is present in recordings performed weeks after their baseline reference data was collected. This work, thus, extends the notion of a neural or brain fingerprint to fast and large-scale resting-state electrophysiological dynamics.


Supplementary Notes
MEG fingerprinting is robust against sample demographics
The OMEGA data repository contains 158 participants, with a subset (N=47) scanned on multiple occasions several days apart. OMEGA consists essentially of data from healthy controls with an 18-73-year age span (SD=14.7 years; Supplementary Table 1).

One potential confound that could have inflated our ability to fingerprint individuals is the heterogeneity introduced by the presence of both healthy and clinical populations in the OMEGA cohort. To address this concern, we ran a secondary analysis in which we performed the fingerprinting procedures described in the manuscript with healthy controls only (N=130). The results, reported in Supplementary Table 2, demonstrate that fingerprinting performance was not biased by the patient/control heterogeneity of the OMEGA sample: we observed a decrease of less than 1% in performance relative to fingerprinting from the entire cohort. Further, there was no clear relationship between differentiability and demographics (Supplementary Figure 1), using either connectome (age: r = 0.08, p = 0.2; gender: t = -0.27, p = 0.7; handedness: t = -0.51, p = 0.6; clinical status: t = -0.87, p = 0.3; two-tailed) or spectral fingerprinting (age: r = 0.10, p = 0.1; gender: t = 0.62, p = 0.5; handedness: t = 0.13, p = 0.8; clinical status: t = 0.84, p = 0.3; two-tailed).

Supplementary Figure 1: Differentiability is not associated with demographics
The plots depict demographic variables and corresponding differentiability scores across both (a) connectome and (b) spectral broadband within-session fingerprinting (n = 158)

Differentiation performances of connectome and spectral broadband within-session fingerprinting obtained from the entire repository (healthy controls and patients), and from healthy participants only. Each column reports fingerprinting performances from dataset 1 to dataset 2 and vice-versa (see Figure 1 and Methods for details). Overall, differentiation accuracy decreased slightly, by ~0.9%, when including healthy participants only. Consistent with our

Exemplar participant correlation matrix derived from between-session data used for fingerprinting. The study-identity of participants was determined by the highest correlation statistics taken across rows (e.g., to differentiate dataset-2 from dataset-1) or columns (to differentiate dataset-1 from dataset-2).
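The matching rule described for the correlation matrix can be illustrated with a minimal sketch. It assumes that each participant is summarized by one feature vector per dataset and that rows follow the same participant order in both datasets; the function names (`row_zscore`, `fingerprint`) are hypothetical and are not taken from the authors' pipeline.

```python
import numpy as np


def row_zscore(features):
    """Standardize each participant's feature vector (one row per participant)."""
    centered = features - features.mean(axis=1, keepdims=True)
    return centered / centered.std(axis=1, keepdims=True)


def fingerprint(features_1, features_2):
    """Correlate every dataset-1 participant with every dataset-2 participant.

    Both inputs are (n_participants, n_features) arrays whose rows follow the
    same participant order (an assumption of this sketch).
    """
    z1, z2 = row_zscore(features_1), row_zscore(features_2)
    # corr[i, j] = Pearson correlation between participant i (dataset 1)
    # and participant j (dataset 2).
    corr = z1 @ z2.T / features_1.shape[1]
    truth = np.arange(corr.shape[0])
    # Best dataset-2 match for each dataset-1 participant (maximum along rows).
    acc_rows = float(np.mean(corr.argmax(axis=1) == truth))
    # Best dataset-1 match for each dataset-2 participant (maximum along columns).
    acc_cols = float(np.mean(corr.argmax(axis=0) == truth))
    return corr, acc_rows, acc_cols
```

The two accuracies correspond to the two matching directions. Which direction the original figure assigns to rows and which to columns depends on how the matrix is laid out, so the labels here should be read as illustrative.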

Data reduction from principal component analysis does not improve MEG fingerprinting substantially
Amico and Goñi (1) previously reported improvements to participant differentiation when using data reduction techniques, e.g., principal component analysis (PCA), prior to fingerprinting. We reproduced their approach, using PCA to reduce the dimensionality of the connectome and spectral feature spaces prior to fingerprinting. Our results provided little support for PCA reconstruction improving differentiation accuracy, as shown in Supplementary Figure 5 and Supplementary Table 3. PCA increased differentiability by less than 1.5%. Data reduction had limited beneficial impact, possibly because of high fingerprinting performance at baseline (without data reduction). We also emphasize that we conducted MEG source time series extraction via a PCA of all local time series within each parcel. It is therefore likely that this dimension reduction procedure contributed to improve signal-to-noise ratio and limited the

Performances in differentiation accuracy for connectome and spectral broadband within-session fingerprinting, for both original and PCA-reconstructed data (1). PCA data reduction improved connectome fingerprinting performances only slightly (about 2%). It had virtually no effect on spectral fingerprinting performances.

Fingerprinting with 30-second data segments
We challenged MEG fingerprinting using short 30-second data segments (i.e., shortened within-session fingerprinting). We epoched participants' MEG recordings into three 30-second datasets: the first dataset was the first 30 seconds of the recording after removing the initial five seconds, the second dataset was the 30 seconds immediately following the first dataset, and the last dataset was the last 30-second segment of the recording after removing the last ten seconds (see Figure 1). Cropping the initial and last few seconds from recordings excluded edge, filtering, and other session artifacts. The lengths of the short datasets and epochs were determined from the participant with the shortest available recording. This procedure yielded three data segments for fingerprinting purposes via six possible dataset pairs (i.e., datasets 1 and 2, datasets 2 and 3, and datasets 1 and 3, each in both directions). Results for all possible combinations of datasets are reported in Supplementary Figure 6.

Connectome fingerprinting successfully differentiated individuals across all possible combinations of datasets (Supplementary Figure 6). Fingerprinting from recordings collected closer in time (e.g., dataset-1 and dataset-2) outperformed differentiation from datasets collected further apart in time (e.g., between dataset-1 and dataset-3). Overall, spectral fingerprinting yielded lower differentiation accuracy than connectome fingerprinting, in particular from datasets further apart in time.
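The segmentation into three 30-second datasets can be sketched as sample-index arithmetic. The function names, the sampling-rate argument, and the requirement that the three segments do not overlap are assumptions made for illustration; the source specifies only the segment length and the five- and ten-second trims.

```python
from itertools import permutations


def short_segment_bounds(n_samples, sfreq, seg_s=30.0,
                         trim_start_s=5.0, trim_end_s=10.0):
    """Return (start, stop) sample indices of the three short datasets.

    n_samples: recording length in samples; sfreq: sampling rate in Hz.
    """
    seg = int(round(seg_s * sfreq))
    start = int(round(trim_start_s * sfreq))            # drop the first 5 s
    end = n_samples - int(round(trim_end_s * sfreq))    # drop the last 10 s
    # Assumption of this sketch: the three segments must not overlap.
    if end - start < 3 * seg:
        raise ValueError("recording too short for three non-overlapping segments")
    return [
        (start, start + seg),            # dataset 1: first 30 s after the trim
        (start + seg, start + 2 * seg),  # dataset 2: the next 30 s
        (end - seg, end),                # dataset 3: last 30 s before the trim
    ]


def ordered_pairs(n_segments=3):
    """All (reference, target) pairs of segments, in both directions."""
    return list(permutations(range(n_segments), 2))
```

With three segments, `ordered_pairs()` returns six (reference, target) combinations, matching the six dataset pairs mentioned in the text.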

In a similar fashion, we challenged MEG fingerprinting using short 30-second data segments from different sessions (i.e., between-session fingerprinting). This yielded 6 epochs of data for fingerprinting (i.e., three from each of the first and second recordings, see Figure 1a).

Fingerprinting results averaged across all possible data pairs are reported in Figure 3c. Connectome fingerprinting performances were greater than those from spectral fingerprinting. Differentiation from slower frequency data components performed worse than from higher bands (see main article body for a discussion).

differentiation from all possible combinations of datasets (i.e., dataset 1 to predict dataset 2, dataset 3 to predict dataset 2, etc.; see Methods for details). Differentiation accuracy was higher when datasets were closer in time (i.e., fingerprinting accuracy for dataset 1 to dataset 2 was greater than for dataset 1 to dataset 3). Source data are provided as a Source Data file.

Fingerprinting across recording sessions
We also report fingerprinting accuracy from all possible pairs of datasets for the between-session fingerprinting challenge in Supplementary Figure 7. Overall, spectral fingerprinting outperformed connectome fingerprinting, as discussed in the main text.

Individuals cannot be differentiated from their respective imaging kernels
We verified that the within-session fingerprinting of individuals was not possible from empty-room data (i.e., with no participant under the MEG sensor array) processed through their respective imaging kernel of beamformer weights. Indeed, these weights are defined from individual anatomy and head position under the MEG sensor array, which may have been sufficient information to drive differentiation. We therefore ran the same fingerprinting pipeline on each session's empty-room data transformed through the corresponding individual's beamformer imaging kernel, which was identical for each of the within-session data segments used. Note that for the between-session challenges, the imaging kernels were adjusted to the respective individual head positions measured during each session. These analyses demonstrated that the imaging kernel information did not contribute substantially to MEG fingerprinting (overall performance was below 20% on average).

We also ran the MEG fingerprinting pipeline directly from the sensor data of the empty-room recordings, without transformation through individual imaging kernels, to assess the floor level of differentiation performances from non-brain data only. The data confirmed substantially lower

Results for the empty-room sensor fingerprinting challenge. As expected, differentiation accuracies of connectome and spectral broadband and narrowband fingerprinting were substantially lower than from actual MEG data with individuals present. Source data are provided as a Source Data file.

Fingerprinting from scalp data only
We also performed MEG fingerprinting from individual sensor data, with no MEG source reconstruction, to assess the added value of source modeling. We replicated the above MEG fingerprinting pipelines from the within-, within-shortened, and between-session analyses.

Differentiation performances were lower than with source modeling, especially from signal components in higher frequency bands and for the shortened challenges (see Supplementary Figures 9, 10, and 11). Yet for other signal components and longer durations, individuals remained differentiable from sensor-level data collected between sessions (>60% accuracy from broadband data), albeit with lower accuracy than when using MEG source transformations, which explicitly account for different head positions between sessions.

Taken together with the empty-room fingerprinting tests above, these results provide evidence that brain signals, not environmental conditions, were crucial for individual differentiation.

Salient neurophysiological features for fingerprinting
We reported in the main manuscript intraclass correlations (ICC) to determine which features contributed the most to individual differentiation. We also performed two additional analyses, deriving group consistency and differential power. These two metrics were proposed by Finn and colleagues (2) to identify, respectively, the features that were the most consistent across their cohort, and the features that were the most consistent within individuals but different across participants. Differential power is based on the empirical probability that the edgewise product of a given feature is higher between different individuals than within the same individual. Summing the negative natural log of this probability across participants yields differential power (2). The higher the differential power, the better a feature discriminates between individuals. Results for differential power are plotted in Figures

Differential Power (DP) analysis for broadband spectral fingerprinting of the within-session dataset (see Figure 1). Mean DP plotted within frequency bands according to the Desikan-Killiany atlas (4). The higher the DP, the more a given frequency band and ROI distinguished between individuals. The most characteristic regions and frequencies were medial structures for the beta band, and temporal and central regions for gamma band signals.

Supplementary Figure 15: Group consistency spectral fingerprinting
Group Consistency (GC) analyses for broadband spectral fingerprinting of the within-session dataset (see Figure 1). Mean GC plotted within frequency bands according to the Desikan-Killiany atlas (4). The higher the GC, the more a given frequency band and ROI remained consistent within individuals and across the cohort. The most stable frequencies were the lower bands (delta and theta) and the most consistent regions across individuals were lateral frontal areas.
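As an illustration of the two metrics, the sketch below follows the edgewise-product formulation of Finn and colleagues (2) as we understand it: differential power sums the negative natural log of the empirical probability, so that larger values indicate more discriminative features, and group consistency is the mean within-individual product. The function names and the floor applied to zero probabilities are assumptions of this sketch, not details reported in the source.

```python
import numpy as np


def zscore_rows(x):
    """Standardize each participant's feature vector (rows = participants)."""
    x = x - x.mean(axis=1, keepdims=True)
    return x / x.std(axis=1, keepdims=True)


def differential_power_and_group_consistency(x1, x2):
    """x1, x2: (n_participants, n_features) arrays from two datasets,
    with rows in the same participant order."""
    n = x1.shape[0]
    z1, z2 = zscore_rows(x1), zscore_rows(x2)
    within = z1 * z2                          # phi_ii(e), shape (n, n_features)
    group_consistency = within.mean(axis=0)   # mean within-individual product
    # cross[i, j, e] = phi_ij(e); memory grows as n * n * n_features.
    cross = z1[:, None, :] * z2[None, :, :]
    ref = within[:, None, :]
    # Count the other participants whose cross product exceeds phi_ii(e).
    counts = (cross > ref).sum(axis=1) + (cross.transpose(1, 0, 2) > ref).sum(axis=1)
    prob = counts / (2.0 * (n - 1))
    # Assumption: floor zero probabilities at half a count to keep the log finite.
    prob = np.maximum(prob, 0.5 / (2.0 * (n - 1)))
    differential_power = -np.log(prob).sum(axis=0)
    return differential_power, group_consistency
```

Both outputs have one value per feature, which could then be averaged within frequency bands or atlas regions for plotting.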

Partial Least Squares (PLS) analysis
We tested whether differences in resting-state neurophysiological signals related to meaningful demographic features using an exploratory Partial Least Squares (PLS) analysis. PLS is a multivariate statistical method that relates two data matrices based on latent variables (LV) that explain the highest covariance between the two datasets. Here, our two datasets consist of a demographic matrix (i.e., age, gender, handedness, and clinical status) and a neurophysiological data matrix (i.e., spectral power or functional connectome). Latent variables (which explain the most covariance between both matrices) and their corresponding variance explained are plotted in Supplementary Figure 16. Significance of each latent variable was assessed via permutation tests. Permuting the rows of the data allowed us to compute an associated p-value for each latent variable (see Manuscript). We chose to explore the first significant latent variable, which explained the most variance for each neurophysiological signal feature (i.e., the first component for connectomes and spectral data). The resulting weights associated with the latent neural and demographic components are depicted in Figure 5 along with their bootstrap ratios. These results corroborate how neurophysiological signals at rest, in addition to differentiating individuals, carry meaningful information about participant demographics.
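A common way to implement this kind of analysis is a singular value decomposition of the cross-correlation matrix between the two standardized data matrices, with a row-permutation test on the singular values. The sketch below illustrates that approach under stated assumptions: the function names, the number of permutations, and the standardization step are hypothetical choices, and the original analysis may have relied on a dedicated PLS toolbox with different conventions. Bootstrap ratios are not covered here.

```python
import numpy as np


def pls_svd(X, Y):
    """X: (n_participants, n_neural_features); Y: (n_participants, n_demographics).

    Columns are assumed to be non-constant so they can be standardized.
    """
    n = X.shape[0]
    Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    Yz = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)
    R = Yz.T @ Xz / (n - 1)                  # demographics x neural features
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    cov_explained = s ** 2 / np.sum(s ** 2)  # share of covariance per latent variable
    return U, s, Vt.T, cov_explained         # U: demographic weights, V: neural weights


def permutation_pvalues(X, Y, n_perm=1000, seed=0):
    """P-value per latent variable from permuting the rows of X."""
    rng = np.random.default_rng(seed)
    s_obs = pls_svd(X, Y)[1]
    exceed = np.zeros_like(s_obs)
    for _ in range(n_perm):
        perm = rng.permutation(X.shape[0])
        exceed += pls_svd(X[perm], Y)[1] >= s_obs
    # Add one to numerator and denominator so p-values are never exactly zero.
    return (exceed + 1.0) / (n_perm + 1.0)
```

The columns of `U` and `V` hold the demographic and neural weights of each latent variable, ordered by the amount of covariance explained.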