Resting state alpha oscillatory activity is a valid and reliable marker of schizotypy

Schizophrenia is among the most debilitating neuropsychiatric disorders. However, clear neurophysiological markers that would identify at-risk individuals are still lacking. The aim of this study was to investigate possible alterations in resting alpha oscillatory activity, which is known to be severely altered in patients with schizophrenia, in members of the general population scoring high on the schizotypy trait. Direct comparison of resting-state EEG oscillatory activity between the Low and High Schizotypy Groups (LSG and HSG) revealed a clear right-hemisphere alteration in alpha activity in the HSG. Specifically, the HSG shows a significant slowing of right-hemisphere posterior alpha frequency and an altered distribution of its amplitude, with a tendency towards a reduction in the right hemisphere in comparison to the LSG. Furthermore, altered and reduced connectivity in the right fronto-parietal network within the alpha range was found in the HSG. Crucially, a trained pattern classifier based on these indices of alpha activity successfully differentiated HSG from LSG in held-out test participants, further confirming the specific importance of right-hemispheric alpha activity and intrahemispheric functional connectivity. By combining alpha activity and connectivity measures with a machine learning predictive model optimized in a nested stratified cross-validation loop, the current research offers a promising clinical tool for identifying individuals at risk of developing psychosis (i.e., high-schizotypy individuals).

Schizophrenia is a highly debilitating and complex mental disorder characterized by impairments in integrating sensory and cognitive functions, leading to incoherent perception. Schizophrenia often arises in late adolescence or early adulthood and is typically preceded by a high-risk (prodromal) phase, during which subtle neurocognitive impairments and sub-threshold psychotic symptoms usually emerge 1. For this reason, increasing research efforts are focused on identifying predictive neurobiological markers for early diagnosis in individuals at risk. Heightened risk for the development of a psychotic disorder is associated with schizotypal traits 2. Substantial overlap has been found between schizotypy and schizophrenia at genetic, biological and neurocognitive levels [3][4][5], strongly supporting the claim of a continuous nature of schizotypy. Specifically, genome-wide association (GWA) research indicates that a vast number of independent polymorphisms confer risk [6][7][8] for psychosis proneness, whereas schizophrenia represents the extreme of these multiple quantitative dimensions. These genetic factors explain about 50% of the schizotypic variance 3, whereas the remaining variance can be explained by biological [9][10][11][12] and psychosocial [13][14][15][16] factors. Taken together, these common genetic and environmental underpinnings have led to the assumption that schizotypy reflects the subclinical expression of the symptoms of schizophrenia in the general population [17][18][19].
Several studies of patients at different stages of schizophrenia, including the prodromal phases preceding the onset of the disorder, have recently reported abnormal spontaneous alpha oscillations and altered resting-state functional connectivity of the alpha rhythm [20][21][22][23]. Crucially, alpha rhythm is generated by a complex interplay between thalamic and cortical pacemakers and propagates via short and long range cortico-cortical, cortico-thalamic, and thalamo-cortical connections 24,25. It is well known that large bursts of alpha (7-13 Hz) band activity dominate the human electroencephalogram (EEG) during periods of rest 26. However, whether abnormalities of resting-state alpha rhythm are already present in individuals with high schizotypal personality traits, and can be taken as early risk predictors for these individuals, is still unknown. Converging evidence from clinical samples nonetheless supports this hypothesis. Murphy and Ongur 27 reported decreased peak alpha frequency in first episode psychosis patients. Specifically, they found alpha slowing in posterior regions, while peak alpha frequency did not decrease significantly in frontal and temporal regions. Further supporting the clinical relevance of abnormal peak alpha frequency to schizophrenia, there is evidence for therapeutic effects of individualized alpha frequency transcranial magnetic stimulation on the negative symptoms of schizophrenia 28.
www.nature.com/scientificreports/
Moreover, according to the idea that schizophrenia originates as a disconnection syndrome 29 , first episode psychosis patients show abnormal functional connectivity, as estimated using the phase lag index (PLI), especially in the alpha rhythm. Hence, also alpha PLI, in addition to peak alpha frequency, seems to be valuable for producing clinical significance already at the onset of schizophrenia 20 .
Interestingly, alpha waves propagate from anterior-to-posterior and from the cortex to the thalamus [30][31][32][33] , so that cortico-cortical and cortico-thalamo-cortical connections allow frontal regions to drive posterior alpha activity. All in all, this evidence led to the idea that alpha rhythm plays an important role during top-down processing in healthy conditions [34][35][36] . Rest EEG connectivity studies specifically testing the association between abnormal directionality of the anterior-to-posterior propagating alpha rhythm and schizophrenia risk are currently unavailable though. However, the altered alpha band activity recorded in ultra-high-risk individuals during an auditory oddball task has been proposed to indicate that a deficit in top-down control exists before the onset of schizophrenia 37 . In sum, while there is now enough evidence available indicating that abnormal rest EEG alpha rhythm characterizes already the onset of schizophrenia, its potential role as an early marker of a predisposition toward schizophrenia in non-clinical populations is still poorly investigated 20,38 .
To fill this gap in the literature, in the present study we first established the association between sub-clinical schizotypy and specific indices of rest EEG alpha oscillatory activity (i.e., individual alpha peak frequency, IAF) and connectivity using both non-directional (i.e., weighted phase lag index, wPLI) and directional (i.e., time lag index, TLI) indices. The choice of investigating resting-state EEG features, rather than task-based EEG signals, was motivated by both theoretical and practical reasons. From a theoretical standpoint, schizotypy is defined as a stable personality trait, thus possible alterations of EEG activity should already be present during the resting state. On a practical note, due to its simplicity and versatility, resting-state EEG recording can be considered an efficient screening tool that enables task-independent standardized measures for large scale assessments. Moreover, diagnostic accuracy in early onset psychosis and schizophrenia might be improved using machine learning approaches 39, as already demonstrated in studies using genetic and neuroimaging features 6,40. Similarly, machine learning methods make it possible to classify EEG features and thus identify clinical conditions based on EEG patterns. Indeed, recent machine learning studies in schizophrenia patients identified altered amplitudes and time lag in frontal event-related potentials 41, lower levels of frontal alpha amplitude during working memory tasks 42, functional alterations of the alpha power spectrum over occipital-parietal and frontal areas 43 and altered thalamo-cortical connectivity 44. Therefore, we trained and tested a pattern classifier to create a predictive model able to assess the presence of high schizotypal traits based on the resting-state alpha activity of an individual.

Results
Resting-state EEG activity was recorded in a sample of 48 participants. Participants were divided into two groups based on the presence of schizotypal traits, estimated via the Schizotypal Personality Questionnaire (SPQ) 45. Two groups of 24 participants were subsequently created, based on SPQ score: a Low Schizotypal Group (LSG) with scores below the 20th percentile (Mean score: M = 7.62, Standard Error of the mean: SE = 0.52), and a High Schizotypal Group (HSG) with scores above the 80th percentile (M = 43.29, SE = 1.29). EEG was recorded from 64 scalp electrodes at rest for two minutes, while participants kept their eyes closed. An Independent Component Analysis (ICA) was performed for each participant to identify topographies reflecting activity in frontal and parieto-occipital areas for both the left and right hemisphere, representing our regions of interest (ROIs) for functional connectivity and alpha activity analyses. In line with the notion of a postero-anterior gradient 46
Functional connectivity across groups (LSG and HSG) was estimated based on phase connectivity measures between ROIs at the alpha frequency peak (Fig. 2). In the following, t-tests were adjusted for multiple comparisons, with a corrected significance threshold p value of 0.017 for three comparisons (for details see Materials and methods). An inter-group difference in the wPLI was found over the fronto-parieto-occipital connectivity in the right hemisphere (t(46) = 3.26, p = 0.002, d = 0.94, 90% CI [0.43; 1.44]), with the HSG showing a lower value of wPLI compared to the LSG (M high = 0.08, SE high = 0.01; M low = 0.15, SE low = 0.02). No differences in connectivity were found between groups over the other considered ROIs (all ts < 0.90, all ps > 0.37, all ds < 0.26).
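As a quick consistency check on the effect size reported above, Cohen's d for an independent-samples t-test can be recovered from the t statistic and the group sizes with the standard conversion d = t * sqrt(1/n1 + 1/n2). The sketch below assumes an equal-variance (pooled) t-test with 24 participants per group; the function name is illustrative and is not taken from the authors' analysis code.

```python
import math


def cohens_d_from_t(t, n1, n2):
    """Convert an independent-samples t statistic to Cohen's d (pooled SD)."""
    return t * math.sqrt(1.0 / n1 + 1.0 / n2)


# Right-hemisphere wPLI comparison reported in the text: t(46) = 3.26, 24 per group.
d_wpli = cohens_d_from_t(3.26, 24, 24)
print(round(d_wpli, 2))  # 0.94, matching the reported d
```

Under these assumptions the conversion reproduces the reported d = 0.94.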
Analysis of the time lag yielded similar results, further confirming the nature of the differential effect between HSG and LSG selectively observed between frontal and parieto-occipital ROIs in the right hemisphere (t(46) = 2.85, p = 0.006,
Figure 1. Individual Alpha Frequency (IAF) and Alpha Amplitude. First row: power spectra in the alpha frequency range (7-13 Hz) for left and right frontal and parieto-occipital regions of interest (ROI), divided into high schizotypy group (HSG) in red and low schizotypy group (LSG) in blue. Thin lines indicate single-subject spectra and black lines reflect group means. IAF. Observed alpha frequency peak in Hertz (Hz) for the two groups and for ROIs in both hemispheres (Left vs. Right). Topographies show scalp distribution of alpha frequency peak for the two groups and for the difference between groups. Alpha Amplitude. Observed maximum alpha amplitude in power (10*log10(μV²)) for the two groups and for ROIs in both hemispheres. Topographies show scalp distribution of alpha amplitude for the two groups and for the difference between groups. Corrected significant differences are marked with black asterisks. Uncorrected differences are marked with light-grey asterisks. Error bars represent standard error of the mean. μV (microvolt); Hz (Hertz).
Taken together, EEG data show group differences in the right hemisphere in terms of alpha speed (slower alpha in the HSG) and in the hemispheric distribution of alpha amplitude (asymmetry shifted from frontal to parieto-occipital ROIs for HSG compared to LSG). Moreover, connectivity measures in the HSG support the disconnection syndrome hypothesis 29 by pointing to a reduced and even altered fronto-parieto-occipital connectivity in the right hemisphere.
Machine learning pattern classifiers. According to information gathered from the literature on EEG markers of schizophrenia, a pattern classifier has been trained and tested with the aim of creating a predictive model able to assess the presence of schizotypal traits based on the alpha resting state activity and connectivity (Fig. 3). In particular, we have looked at alpha peak frequency, shown to be generally reduced in schizophrenia 21,27 , alpha amplitude, also shown to be altered in schizophrenia 23,47 , and measures of fronto-parietal connectivity (including wPLI and TLI), as schizophrenia has been defined as a functional disconnection syndrome within the default mode network encompassing fronto-parietal networks 20,48,49 .
Moreover, in order to distinguish between HSG and LSG, these features were included as possible input features of the models either for both right and left hemisphere or separately for each hemisphere. At the same time, we looked separately at frontal and posterior features, with particular focus on posterior alpha activity and intrahemispheric connectivity. This choice was motivated by an important discrepancy in the schizophrenia literature between studies supporting a general alteration of these features 22,23,50, others pointing to a more specific hemispheric alteration 20,38,51,52, and findings revealing altered alpha indices specifically at posterior sites, excluding frontal regions, in patients with first episode psychosis 27. Using a tenfold nested cross-validation (CV) procedure, repeated 1000 times, in the test sets of the outer CV, we observed a sensitivity of 78.8% (4.9%) [mean (standard deviation)], specificity of 69.7% (5.2%), balanced accuracy of 74.3% (3.8%) and area under the receiver operating characteristic curve (AUC) of 0.83 (0.04). The plot of the average ROC curve across 10,000 outer loops of the nested CV (10 folds × 1000 repetitions), along with the standard deviation and the 99.9% confidence interval of the average, is shown in Fig. 4. The ranking of the best feature combinations is reported in Fig. 5.
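For readers less familiar with the performance measures above, the following minimal sketch shows how sensitivity, specificity and balanced accuracy relate to one another. The confusion counts are hypothetical values for a single small test fold, not data from the study, and the function name is illustrative.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, balanced accuracy) from confusion counts."""
    sensitivity = tp / (tp + fn)  # proportion of HSG participants correctly identified
    specificity = tn / (tn + fp)  # proportion of LSG participants correctly identified
    balanced_accuracy = (sensitivity + specificity) / 2
    return sensitivity, specificity, balanced_accuracy


# Hypothetical counts for one outer test fold (3 HSG and 2 LSG participants).
sens, spec, bal = classification_metrics(tp=2, fn=1, tn=2, fp=0)

# Balanced accuracy is the mean of sensitivity and specificity, so the reported
# averages should agree up to rounding: (78.8 + 69.7) / 2 = 74.25, reported as 74.3%.
reported_bal = (78.8 + 69.7) / 2
print(round(bal, 3), reported_bal)
```

Because balanced accuracy is a linear combination of the two rates, averaging it over folds gives the same value as combining the averaged sensitivity and specificity, which is why the reported figures are mutually consistent.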
The first three combinations included only the right hemisphere activity and, overall, accounted for the 96.2% of the total occurrences. The remaining 3.8% was accounted for by the left + right hemisphere activity. Feature combinations including only left features were never selected.
In general, machine learning results suggest that differences between groups are maximal over the right hemisphere, or in other words, that right hemispheric features of alpha activity are the best predictors of schizotypal traits.

Discussion
In the present study, we used a machine learning approach to identify biomarkers able to predict which individuals are at higher risks of developing schizophrenic symptoms. To the best of our knowledge, our study represents the first application of machine learning techniques to investigate resting-state EEG features in assessing the presence of high schizotypy traits. In particular, we investigated possible alterations of resting-state alpha oscillatory activity, known to be impaired in patients with schizophrenia, in the healthy population either high or low in schizotypy traits. To this aim, different parameters of resting-state alpha band activity have been included in the study: (1) alpha-amplitude (2) frontal and parieto-occipital electrodes of both left and right hemisphere) measures of alpha phase connectivity. Crucially, the above-mentioned indices of alpha activity were used as input features of a state-of-the-art pattern classifier, with the aim of building a model able to predict the presence of schizotypy traits of an individual based on resting-state alpha activity, thus even establishing a distinct and influential role of alpha oscillatory activity as electrophysiological marker of schizotypy dimension. The pattern classifier used a nested stratified CV loop to perform, at the same time, in the inner CV loop, selection of the best feature combination in discriminating the presence of schizotypal traits, as well as the best classifier (between C-SVM with linear kernel and logistic regression with L2 penalty) along with their hyper-parameter optimization.
Our results indicate that alterations of IAF are present in the high schizotypy group. In particular, high schizotypy seems to be accompanied by a decreased IAF in the right occipital component. The slowing of resting-state alpha activity has been reported in schizophrenia patients 21 , as well as in first episode psychosis 27 . Here, we found that this slowing is present also in the high schizotypy population, but restricted to the right parieto-occipital region.
In addition to these changes in IAF, high schizotypals also demonstrated an asymmetry of alpha-amplitude in parieto-occipital regions, with reduced alpha-amplitude in the right hemisphere. Diminished resting-state alpha-amplitude has previously been found both in individuals with psychotic disorders and their healthy relatives 22,23, although opposite results have also been reported 47, as well as null differences between first episode psychosis patients and healthy controls 27. These discrepancies could be due to between-study differences in methodology and EEG data processing, illness chronicity or diagnostic heterogeneity, some of which could be clarified by systematically investigating the identified biomarkers both in clinical and subclinical populations.
Finally, measures of long-range connectivity in the alpha range have shown a distinct alteration in HSG. Specifically, similar to IAF results, differences between groups emerge only in the right hemisphere, with a reduced connectivity between frontal and parieto-occipital areas in HSG as measured by wPLI; furthermore, this connectivity seems to follow an opposite, parieto-frontal direction, as shown by the TLI measures. Moreover, no differences in interhemispheric connectivity have been observed between HSG and LSG. Altogether, connectivity analyses point to reduced resting-state functional communication between frontal and parietal areas within the alpha range in the HSG, in line with previous results describing a similar pattern in schizophrenia patients 20 but restricted to the right hemisphere in schizotypy.
To establish whether high schizotypals can be identified at the single-subject level from neural indices, namely their resting-state alpha activity, a pattern classifier was trained and tested which, if the hypothesis holds, should be able to successfully differentiate between HSG and LSG. Apart from being able to predict the participants' group membership (low vs. high schizotypy) based on all the examined features of resting-state alpha activity (balanced accuracy of 74.3%), two interesting dissociations have emerged. The first one concerns frontal and parieto-occipital features, with solely the latter being able to successfully differentiate between the two groups (frontal features do not seem to contribute to differentiating the two groups; see Fig. 5). This outcome is in line with previous findings which have not revealed altered alpha indices over frontal regions in patients with first episode psychosis 27. Secondly, differences in alpha activity between the HSG and LSG seem to be present only in the right hemisphere (feature combinations including only left features were never selected). Throughout the years, several theories have been proposed describing schizophrenia as an interhemispheric imbalance. Among the proposed origins of this imbalance, both hypo-functioning and hyper-functioning of one hemisphere have been hypothesized [53][54][55], although often without firm empirical grounds to dissociate these alternatives. Our results support an altered pattern within the right hemisphere, showing slower and reduced alpha activity and an exclusively intrahemispheric alteration of communication between frontal and parietal regions.
Following the dimensional approach, one could hypothesize either that the right hemisphere dysfunction is more pronounced in schizophrenia or, alternatively, that the first psychotic symptoms emerge when the right hemisphere dysfunction spreads to the left hemisphere, resulting in a more generalized disconnection syndrome. Current research in schizophrenia points not to a selectively hindered right hemispheric activity but to a more widespread dysconnectivity, so future research should systematically investigate whether the neurophysiological prodromal phase leading from a subclinical to a clinical psychotic condition resides in an interhemispheric spreading.

Conclusions
Overall, our results clearly demonstrate that the altered patterns of resting-state alpha activity observed in schizophrenia patients can already be tracked before the onset of psychosis. Specifically, we observe an altered pattern of resting-state alpha oscillatory activity in the high, relative to the low, schizotypy population. Thus, alpha activity seems to represent an important electrophysiological marker, which may signal a higher risk of developing schizophrenia spectrum psychopathology according to the specific indices identified in our study. Interestingly, these differences are most evident in the right posterior region and its functional connections with the right anterior cortex. The right parieto-occipital deficit and fronto-parietal disconnection in the HSG may significantly alter not only sensory processing per se, but also the top-down control of sensory processing. Therefore, this research offers firm ground for future investigations aimed at identifying patterns of neural and cognitive development that precede psychosis in high-risk individuals and at describing neurocognitive (dys)functioning across the schizophrenia spectrum.
Although it is a valid tool for the detection and measurement of schizotypy 45, the SPQ is still a self-report questionnaire and thus faces various methodological issues 56,57. Crucially, the current research has employed computational methods in order to affirm the relevance of the identified oscillatory features as important electrophysiological markers of schizotypy. Specifically, by building a pattern classifier, we were not only able to describe the existence of possible differences in alpha activity between HSG and LSG, but also to demonstrate their ability to successfully identify individuals high in schizotypy. Thus, this approach offers a novel, accurate diagnostic tool able to detect biomarkers that define individuals at risk of developing schizophrenia spectrum disorders, based on resting-state alpha activity. In addition, the inclusion of other features (e.g., genetic and neuroimaging data) would likely enhance the overall performance of the model, although this is not always feasible due to time and resource constraints. Moreover, we note that, due to the relatively small sample, the results obtained with our pattern classifier should be confirmed by extending future machine learning applications to an independent and larger sample of participants. The availability of larger datasets will also pave the way to the adoption of deep learning approaches, which may improve the overall performance. The main aim of the current study was to identify an EEG configurational pattern of activity that could represent a fingerprint of schizophrenia proneness. As such, the focus of this study was not on comparing EEG activity across different mental disorders.
Therefore, by using state-of-the-art machine learning analysis of the EEG patterns implemented here, future studies should empirically test whether the EEG activity pattern identified here is specific for schizophrenia risk or rather represents a transdiagnostic biomarker of risk for psychopathology.
Finally, the question remains of how altered patterns in alpha activity during rest translate into relevant cognitive processing. Both alpha amplitude and long-range fronto-parietal alpha synchronization have well-established roles in visuo-attentional inhibition and selection 58, along with occipital alpha peak frequency acting as a temporal and spatial sampling mechanism [59][60][61][62][63][64][65][66]. Therefore, should we expect these altered patterns to persist even during visuo-attentional tasks, leading to reduced attentional efficiency and altered perception in schizotypy? Future research is expected to address these questions, establishing a tight link between schizotypy and schizophrenia, thus enabling an accurate and detailed description of early markers of psychosis.

Materials and methods
Participants. Participants  As a result, a sample of 48 participants (see Table 1 for detailed demographics) was recruited for electrophysiological data collection.
Each group included one left-handed subject. All participants signed a written informed consent prior to taking part in the study, which was conducted in accordance with the Declaration of Helsinki 67 and approved by the Bioethics Committee of the University of Bologna. No participant had neurocognitive or psychiatric disorders.
EEG recordings. Participants were comfortably seated in a room with dimmed lights. EEG was recorded at rest for two minutes, while participants kept their eyes closed 68. A set of 64 Ag/AgCl electrodes was mounted according to the international 10-20 system (Fast'n Easy-Electrode, Easycap, Herrsching, Germany). Additionally, four EOG channels were positioned: on the outer canthi of both eyes, as well as above and below the left eye. The right and left mastoids were used as the online and off-line reference, respectively. Ground was positioned on the right cheek of the subject. All impedances were kept below 10 kΩ. EEG signals were recorded with a pass band filter of 0.5-50 Hz (as set in Brain Vision Recorder, Brain Products, Gilching, Germany) at a sampling rate of 1000 Hz. Off-line data were resampled at 500 Hz (function pop_resample on EEGLab) and re-referenced to the average of all electrodes.
All EEG analyses were implemented by custom-made routines developed in Matlab R2013a (The MathWorks, Inc., Natick, Massachusetts, United States) using EEGLab toolbox functions (v. 13.0.1) 69 .
EEG data processing. Resting state EEG data were band-pass filtered (using a Hamming windowed sinc FIR filter implemented in the pop_eegfiltnew function on EEGLab) for alpha frequency 6 to 14 Hz, and epoched in 2000 ms temporal windows. An Independent Component Analysis (ICA) was performed for each participant to identify topographies reflecting activity in frontal and parieto-occipital areas for both the left and right hemisphere, representing our regions of interest (ROIs) for alpha analysis (see Fig. 6). The ICA method separates EEG data into distinct information sources (i.e., independent components) and provides the weighted projection from each independent component to each scalp electrode [70][71][72]. In particular, individual alpha frequency peak (IAF) and alpha amplitude were extracted from individual power spectra separately over each ROI in subclusters of electrodes selected by visual inspection of the identified topographies: frontal ROIs (left electrode cluster: F1,  73 and time lag were used as indices of connectivity. In particular, inter-hemispheric connectivity was estimated between right and left parieto-occipital ROI and intra-hemispheric connectivity was estimated in both hemispheres between frontal and parieto-occipital ROIs. First, four templates, reflecting spatial topography of the ROIs, were identified via visual inspection within all ICs: two central frontal components and two parieto-occipital components, one for each hemisphere (left and right). Subsequently, for each subject, relevant ICs were identified via automatic multistep correlational template matching (CORRMAP, 0.80 correlation threshold) 74. Topographies of ICs labeled as frontal and parieto-occipital components were visually inspected and back-projected to the data for frequency, amplitude and connectivity analyses 75. For each participant, a minimum of one and maximum of three components were identified per template.
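The band-pass filtering and epoching step described at the start of this subsection can be sketched in Python as follows. This is only an illustration under stated assumptions: it uses SciPy's firwin and filtfilt rather than EEGLab's pop_eegfiltnew, the filter length (551 taps) is an arbitrary choice, and the function names and the synthetic test signal are hypothetical.

```python
import numpy as np
from scipy.signal import filtfilt, firwin


def bandpass_alpha(data, fs, low=6.0, high=14.0, numtaps=551):
    """Zero-phase band-pass with a Hamming-windowed sinc FIR filter.

    numtaps is an assumed value; the original filter order is not stated.
    """
    taps = firwin(numtaps, [low, high], pass_zero=False, window="hamming", fs=fs)
    return filtfilt(taps, [1.0], data)


def epoch(data, fs, length_s=2.0):
    """Cut a 1-D signal into consecutive, non-overlapping epochs."""
    n = int(round(length_s * fs))
    n_epochs = data.size // n
    return data[: n_epochs * n].reshape(n_epochs, n)


# Synthetic example: a 10 Hz "alpha" component plus a 30 Hz component.
fs = 500.0
t = np.arange(int(20 * fs)) / fs
alpha = np.sin(2 * np.pi * 10 * t)
raw = alpha + np.sin(2 * np.pi * 30 * t)

filtered = bandpass_alpha(raw, fs)   # the 30 Hz component is strongly attenuated
epochs = epoch(filtered, fs)         # shape: (n_epochs, samples per 2000 ms epoch)
print(epochs.shape)
```

Note that filtfilt applies the filter forward and backward, so the effective magnitude response is squared relative to a single pass; this is a convenience of the sketch, not a claim about the original pipeline.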
EEG features extraction. Individual alpha peak and alpha amplitude were extracted from the power spectra of each participant using an automated peak-detection algorithm (function RestingIAF on EEGLab) 76. This algorithm uses a Savitzky-Golay filter (SGF, frequency resolution 0.24 Hz, polynomial order 5 of the SGF), which smooths power spectra and attenuates random noise. Alpha amplitude was defined as the maximum alpha power, expressed in normalized power (10*log10(μV²)). To calculate wPLI, EEG resting state data were divided into 2500 ms non-overlapping windows 77. Then the cross-spectrum of the time series signals was calculated; the wPLI estimates the magnitude of the imaginary part of the cross-spectrum. For each participant, wPLI was estimated as a function of individual alpha frequency peak and 14 × 14 connectivity matrices were generated over the selected electrode clusters (see above). Time lag estimates the mean difference in milliseconds of two time series spectra.
EEG features. Individual independent components (ICs) were analyzed in order to extract electrophysiological features both for frontal and occipito-parietal regions of interest (ROIs) in the right and left hemisphere to be entered in the machine learning pattern classifier.
The following EEG features were extracted: Individual alpha frequency (IAF). For each participant, IAF was defined as the exact frequency in the alpha range (7-13 Hz) containing the maximum power. It was extracted from the individual power spectra in the alpha range and calculated using an automated peak-detection algorithm (function RestingIAF on EEGLab) 76 .
Alpha amplitude. For each participant, alpha amplitude was defined as the maximum power in the alpha range (7-13 Hz), expressed in normalized power (10*log10(μV²)).
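The two spectral features defined above (IAF and alpha amplitude) can be illustrated with a short Python sketch. It is not the RestingIAF routine used in the study: it assumes a Welch power spectrum with 4 s segments, a Savitzky-Golay smoothing window of 11 frequency bins, and a simple maximum search within 7-13 Hz; the function name and the synthetic signal are hypothetical.

```python
import numpy as np
from scipy.signal import savgol_filter, welch


def alpha_peak(signal, fs, band=(7.0, 13.0), window_length=11, polyorder=5):
    """Return (peak frequency in Hz, peak power in dB) within the alpha band."""
    # Welch spectrum with 4 s segments gives a 0.25 Hz frequency resolution.
    freqs, psd = welch(signal, fs=fs, nperseg=int(4 * fs))
    psd_db = 10 * np.log10(psd)
    # Savitzky-Golay smoothing attenuates random fluctuations in the spectrum.
    smooth = savgol_filter(psd_db, window_length=window_length, polyorder=polyorder)
    mask = (freqs >= band[0]) & (freqs <= band[1])
    idx = np.argmax(smooth[mask])
    return float(freqs[mask][idx]), float(smooth[mask][idx])


# Synthetic two-minute recording with a 9.5 Hz rhythm embedded in noise.
rng = np.random.default_rng(0)
fs = 500.0
t = np.arange(int(120 * fs)) / fs
eeg = 2.0 * np.sin(2 * np.pi * 9.5 * t) + rng.normal(size=t.size)

iaf, amp_db = alpha_peak(eeg, fs)
print(iaf, amp_db)
```

A production routine would typically add checks for spectra without a clear peak; this sketch always returns the maximum of the smoothed spectrum within the band.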
Weighted phase lag index (wPLI). This feature was extracted to calculate functional connectivity in the alpha range. This is a measure of phase-based connectivity calculated in a specific frequency, which accounts only for non-zero phase lag/lead relations between two time series signals 73 . wPLI is calculated between two neurophysiologic signals and can assume values between 0 and 1. Larger values of wPLI reflect a consistent phase relation between two signals. If the relation between two signals is random, the wPLI value is 0. Connectivity between frontal and parieto-occipital ROIs in the right hemisphere was estimated on the averaged wPLI values calculated over the following electrode clusters: right frontal ROI (F2,FC2,C2,FC4) and right parieto-

Time lag index (TLI).
This feature adds a further dimension to the wPLI as it provides information about the directionality of the communication between two synchronized signals 78 . It represents the means of the temporal phase lag in the cross-spectrum between time series signals of the selected clusters and, unlike the wPLI, it offers further insight regarding the temporal dimension of the synchronization. Specifically, TLI is used to determine the averaged phase differences in milliseconds of two considered signals 78 . Positive values of the TLI indicate a lag in the phase of the first considered signal with respect to the other, thus indicating the directionality of the communication between two synchronized signals.
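The two connectivity features described above can be sketched as follows. The wPLI part follows the standard definition (absolute mean of the imaginary part of the cross-spectrum divided by the mean of its absolute value). The time-lag part is only one plausible reading of the description, namely the mean cross-spectral phase difference converted to milliseconds at the frequency of interest; it is an assumption and may differ from the TLI implementation used in the study. All function names and the synthetic signals are hypothetical.

```python
import numpy as np


def cross_spectrum_at(x_epochs, y_epochs, fs, freq):
    """Per-epoch cross-spectrum X * conj(Y) at the FFT bin closest to freq."""
    n = x_epochs.shape[1]
    win = np.hanning(n)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    k = np.argmin(np.abs(freqs - freq))
    fx = np.fft.rfft(x_epochs * win, axis=1)[:, k]
    fy = np.fft.rfft(y_epochs * win, axis=1)[:, k]
    return fx * np.conj(fy)


def wpli(x_epochs, y_epochs, fs, freq):
    """Weighted phase lag index across epochs (0 = random, 1 = consistent lag)."""
    im = np.imag(cross_spectrum_at(x_epochs, y_epochs, fs, freq))
    return np.abs(im.mean()) / np.abs(im).mean()


def time_lag_ms(x_epochs, y_epochs, fs, freq):
    """Assumed time-lag measure: positive when x lags y (phase of y minus phase of x)."""
    s = cross_spectrum_at(x_epochs, y_epochs, fs, freq)
    phase_y_minus_x = -np.angle(s.mean())
    return 1000.0 * phase_y_minus_x / (2 * np.pi * freq)


# Synthetic data: 200 epochs of 2500 ms at 500 Hz with a 10 Hz rhythm.
rng = np.random.default_rng(0)
fs, n_epochs, n, f0 = 500.0, 200, 1250, 10.0
t = np.arange(n) / fs
phi = rng.uniform(0, 2 * np.pi, size=(n_epochs, 1))
noise = lambda: 0.1 * rng.normal(size=(n_epochs, n))

y = np.sin(2 * np.pi * f0 * t + phi) + noise()
x_locked = np.sin(2 * np.pi * f0 * t + phi - np.pi / 4) + noise()  # x lags y by 12.5 ms
x_random = np.sin(2 * np.pi * f0 * t + rng.uniform(0, 2 * np.pi, size=(n_epochs, 1))) + noise()

wpli_locked = wpli(x_locked, y, fs, f0)
wpli_random = wpli(x_random, y, fs, f0)
lag = time_lag_ms(x_locked, y, fs, f0)
print(wpli_locked, wpli_random, lag)
```

In this toy example the phase-locked pair yields a wPLI close to 1 and a lag close to 12.5 ms (a quarter of pi at 10 Hz), whereas the pair with independent phases yields a wPLI close to 0.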
EEG data analysis. 2 × 2 × 2 mixed-model ANOVAs were performed on IAF and alpha amplitude, with the between-subject factor GROUP (HSG, LSG) and the within-subject factors HEMISPHERE (left and right) and ROI (frontal and parieto-occipital ICs). Specific differences in the alpha activity were further tested both for alpha frequency (with paired and independent samples one-tailed t-tests, as a directional hypothesis was formulated 14) and alpha amplitude (with paired and independent samples two-tailed t-tests). Between-group planned comparisons were performed on wPLI and TLI using independent samples two-tailed t-tests. p values < 0.05 were considered significant, with Dunn-Sidak correction for multiple comparisons applied where necessary 79, yielding a corrected significance threshold p value of 0.013 for four comparisons (alpha activity analyses) and a corrected significance threshold p value of 0.017 for three comparisons (connectivity analyses).
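The corrected thresholds quoted above follow from the Dunn-Sidak formula, alpha_corrected = 1 - (1 - alpha)^(1/m), where m is the number of comparisons. A minimal sketch (the function name is illustrative):

```python
def sidak_threshold(alpha, m):
    """Per-comparison significance threshold under the Dunn-Sidak correction."""
    return 1.0 - (1.0 - alpha) ** (1.0 / m)


alpha_activity = sidak_threshold(0.05, 4)   # four comparisons (alpha activity analyses)
connectivity = sidak_threshold(0.05, 3)     # three comparisons (connectivity analyses)
print(round(alpha_activity, 3), round(connectivity, 3))  # 0.013 0.017
```

Both values match the thresholds reported in the text after rounding to three decimals.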
Machine learning pattern classifiers. To train, validate and test the classifier, we employed a tenfold nested stratified CV loop (Fig. 7). Empirical evidence suggests that 5- or 10-fold CV should be preferred to leave-one-out (LOO) CV, as consistently reported both by the current literature [80][81][82][83] and by the documentation of state-of-the-art machine learning development tools (see, e.g., https://scikit-learn.org/stable/modules/cross_validation.html). This strategy allowed us to perform simultaneously, in the inner CV loop, the selection of the feature combination that best discriminates the presence of schizotypal traits and the selection of the best classifier (between C-SVM with linear kernel and logistic regression with L2 penalty), along with the optimization of their hyper-parameters. Indeed, given that it is not possible to define a priori which machine learning algorithm is best for the data and the specific problem at hand 84 , we used two well-established classifiers (C-SVM with linear kernel and logistic regression with L2 regularization), which are generally appropriate choices for reducing overfitting in a small sample. In particular, for a binary classification task, a C-SVM constructs a hyperplane in a high-dimensional space separating the training data into two classes. Since, in general, the larger the margin the lower the generalization error of the classifier, a good separation is achieved by the hyperplane that has the largest distance to the nearest training data points of any class 85 . Logistic regression, on the other hand, models the relationship between the categorical dependent variable and one or more independent variables by estimating probabilities using a logistic sigmoid function 86 . The C hyperparameter of both the C-SVM and logistic regression classifiers takes a value that is proportional to the inverse of the regularization strength used during the training phase.
For the C-SVM classifier, for example, the choice of the C value is a trade-off between misclassification of training examples and simplicity of the decision surface. A low C value makes the decision surface smooth, while a high C value aims at classifying all training examples correctly. In this study, we varied the C value of both the C-SVM and logistic regression in the set {0.1, 0.2, 0.3}. We refer to specialized reference textbooks for a deeper description of these state-of-the-art systems [85][86][87] . Once the best estimator (i.e., the classifier/hyper-parameter/feature combination maximizing the balanced accuracy) was found in the inner CV, it was re-trained on the outer training set and tested on the test set held out from the outer CV, to obtain an unbiased estimate of the model's prediction error. This procedure was repeated for each fold of the outer CV. Before each training (in both the inner and the outer CV), each feature was standardized with reference to the training set only. Test set data were not used in any way during the learning process, thus preventing any form of peeking effect 88 .
Figure 7. Scheme of the tenfold nested CV. The inner CV loop is used to perform feature selection, optimize hyper-parameters and select the best classifier, whereas the outer loop estimates the selected models' performance.
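The nested procedure described above can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the function name, the exhaustive search over feature subsets, the random seeds and the synthetic data are all hypothetical. Placing the scaler inside the pipeline ensures that standardization is fitted on training data only.

```python
from itertools import combinations

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def nested_cv_scores(X, y, n_splits=10, seed=0):
    """Outer-fold balanced accuracies of a nested stratified CV (sketch)."""
    n_features = X.shape[1]
    # Assumption: every non-empty feature combination is a candidate.
    subsets = [list(c) for k in range(1, n_features + 1)
               for c in combinations(range(n_features), k)]
    outer = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    inner = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed + 1)
    scores = []
    for train, test in outer.split(X, y):
        best = None
        for cols in subsets:
            # LogisticRegression() uses an L2 penalty by default.
            for clf in (SVC(kernel="linear"), LogisticRegression()):
                pipe = Pipeline([("scale", StandardScaler()), ("clf", clf)])
                search = GridSearchCV(pipe, {"clf__C": [0.1, 0.2, 0.3]},
                                      scoring="balanced_accuracy", cv=inner)
                search.fit(X[train][:, cols], y[train])
                if best is None or search.best_score_ > best[0]:
                    # best_estimator_ is already refitted on the outer training set.
                    best = (search.best_score_, cols, search.best_estimator_)
        _, cols, model = best
        pred = model.predict(X[test][:, cols])
        scores.append(balanced_accuracy_score(y[test], pred))
    return np.array(scores)


# Synthetic data for illustration only: 60 participants, 3 features.
rng = np.random.default_rng(0)
y_demo = np.repeat([0, 1], 30)
X_demo = rng.normal(size=(60, 3))
X_demo[y_demo == 1, 0] += 2.0  # make the first feature informative
print(nested_cv_scores(X_demo, y_demo, n_splits=3))
```

The outer test fold is touched only once per fold, after the inner search has fixed the feature subset, the classifier and C, which is what keeps the outer estimate free of selection bias.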
Since the performance and the selected features may vary depending on how the data are split in each fold of the CV, we repeated the nested stratified CV procedure 1000 times, recording how often each feature combination was selected in each fold of each round of the outer CV. The average and standard deviation, across all repetitions, of sensitivity (the proportion of high schizotypes correctly identified as such), specificity (the proportion of low schizotypes correctly identified as such), balanced accuracy, and AUC were computed to obtain a final model assessment score on the test sets of the outer CV. The average ROC curve 77 across the 10 000 outer loops of the nested CV (10 folds × 1000 repetitions), along with its standard deviation and the 99.9% confidence interval of the average, was also computed.
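The aggregation of test-fold metrics over repeated splits can be sketched as below. To keep the example short, a fixed pipeline (standardization followed by L2 logistic regression with C = 0.1) stands in for the estimator selected by the inner CV; that substitution, the function name, the coding of high schizotypy as label 1 and the synthetic data are assumptions of this sketch.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score, recall_score, roc_auc_score
from sklearn.model_selection import StratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def repeated_cv_summary(X, y, n_repeats=1000, n_splits=10):
    """Mean and standard deviation of test-fold metrics over repeated CV."""
    rows = []
    for repeat in range(n_repeats):
        # A different seed per repetition yields a different data split.
        cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=repeat)
        for train, test in cv.split(X, y):
            model = make_pipeline(StandardScaler(), LogisticRegression(C=0.1))
            model.fit(X[train], y[train])
            pred = model.predict(X[test])
            rows.append([
                recall_score(y[test], pred, pos_label=1),  # sensitivity
                recall_score(y[test], pred, pos_label=0),  # specificity
                balanced_accuracy_score(y[test], pred),
                roc_auc_score(y[test], model.decision_function(X[test])),
            ])
    rows = np.asarray(rows)
    names = ("sensitivity", "specificity", "balanced_accuracy", "auc")
    return {name: (rows[:, i].mean(), rows[:, i].std(ddof=1))
            for i, name in enumerate(names)}


# Synthetic data for illustration only (label 1 = high schizotypy, assumed).
rng = np.random.default_rng(0)
y_demo = np.repeat([0, 1], 30)
X_demo = rng.normal(size=(60, 3))
X_demo[y_demo == 1, 0] += 1.5
for name, (mean, sd) in repeated_cv_summary(X_demo, y_demo, 5, 5).items():
    print(f"{name}: {mean:.3f} +/- {sd:.3f}")
```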