Main

Major depressive disorder (MDD) is one of the most common and debilitating disorders worldwide1. The disorder’s high level of heterogeneity (in both symptoms and neurophysiology) complicates adequate treatment prescription, which may limit treatment response2,3,4. For instance, both antidepressant medication and cognitive-behavioral therapy led to insufficient symptom relief at the group level when treatment was assigned in an arbitrary fashion5, with response rates around 40–50% and remission rates around 30–40% (ref. 6).

While targeting the patient’s individual neurophysiology (for example, precision psychiatry) seems to be infeasible at present, an implementable alternative is treatment stratification (for a discussion, see ref. 6), which reduces heterogeneity within a disorder by identifying subgroups of patients that preferentially respond to a certain treatment, using so-called biomarkers7,8. A nonrandomized, open-label study based on resting-state electroencephalography (EEG) biomarkers prospectively stratified patients with MDD between three antidepressants, which resulted in better clinical outcomes relative to treatment-as-usual7. Importantly, due to its relatively low cost and ease of use, EEG-biomarker stratification is especially suited for widespread implementation in clinical practice.

Several EEG biomarkers for treatment outcome in MDD have been proposed8,9. However, few markers could successfully be replicated. In fact, a recent meta-analysis examining EEG markers of treatment response in MDD raised doubts about their clinical applicability due to publication bias and a lack of cross- and out-of-sample validations10.

One EEG pattern that has shown potential as a stratification biomarker is the individual alpha peak frequency (iAF), which denotes the modal frequency of an individual’s alpha oscillations (7–13 Hz). The iAF has been shown to be associated with cognitive performance and to be aberrant in various mental disorders. For instance, faster iAF has been related to better cognitive performance11,12,13,14, while slower iAF has been associated with higher symptom severity15,16 and less favorable treatment outcome17,18,19. Slower iAF has also been observed across many disorders, such as Alzheimer’s disease20, burnout syndrome21, mild cognitive impairment22, psychosis23, schizophrenia23,24 and attention-deficit hyperactivity disorder (ADHD)25, with this slowing potentially reflecting reduced thalamocortical information transfer.

In patients with ADHD, slower iAF has been related to worse treatment outcome to methylphenidate26 and better treatment outcome to multimodal neurofeedback27. Based on these findings, our group recently developed Brainmarker-I, which is based on the iAF measured during the resting-state EEG. We showed that this biomarker can successfully assign patients with ADHD to the individual best out of several treatment options, with findings confirmed in blinded out-of-sample validations28. For antidepressant medication (amitriptyline and pirlindole), a slow iAF was shown to be predictive of nonresponse29. However, this finding does not generalize across antidepressants, as was shown by subsequent studies reporting an association between slow iAF and better response to the selective serotonin reuptake inhibitor sertraline30. For repetitive transcranial magnetic stimulation (rTMS), a different association has been observed. Specifically, an iAF closer to 10 Hz (that is, the rTMS stimulation frequency) was associated with better improvement to 10-Hz left-dorsolateral prefrontal cortex (L-DLPFC) rTMS31, which was independently replicated, while no association emerged between iAF and outcome of 1-Hz right-DLPFC (R-DLPFC) rTMS32. For electroconvulsive therapy (ECT), to our knowledge, iAF prediction of treatment outcome is unknown.

In this Article, following these promising findings, we aimed to extend Brainmarker-I, developed for ADHD treatment stratification28, to treatments for MDD. We decided a priori to conduct statistical analyses in line with Voetterl et al.28 and the hypotheses outlined below, focusing only on remission as our primary outcome, given its higher clinical relevance and to avoid multiple testing. We first conducted a blinded out-of-sample validation in the double-blind placebo-controlled Establishing Moderators and Biosignatures of Antidepressant Response for Clinical Care (EMBARC) dataset33,34, aiming to replicate the previously mentioned sertraline finding and to demonstrate specificity of iAF-based prediction for sertraline but not placebo. Next, biomarker directions were tested for brain stimulation treatments, focusing on potential treatment stratification of patients with a difficult-to-treat depression. An iAF close to 10 Hz was a priori considered an indication for 10 Hz L-DLPFC rTMS, based on the above-mentioned replicated research31,32. For both 1 Hz and ECT treatment, discovery analyses were conducted and all possible directions of effect were examined. A potential finding was subsequently validated through blinded biomarker-informed prediction of patients’ remission status in unseen datasets. Finally, exploratory analyses testing predictive value of iAF for psychotherapy, ketamine and bupropion treatment were conducted.

Resting-state eyes-closed EEG data were preprocessed for all datasets, in line with previous preprocessing35. The iAF was calculated in accordance with Voetterl et al.28, and each patient was assigned a decile score, with low scores reflecting a slow iAF. Additionally, a synchronization indicator, denoting an iAF between 9.6 Hz and 10.4 Hz at the F3 location, was implemented (Fig. 1) to mark close proximity to 10 Hz. This resulted in three distinct biomarker subgroups that were compared for the different treatments: synchronization, low deciles (decile score 1–5 without synchronization range) and high deciles (decile score 6–10 without synchronization range). Positive predictive values (PPVs) indicated the remission rate within each Brainmarker-I subgroup. A normalized PPV (nPPV) was calculated to be able to compare remission rates that differed between datasets. In short, the respective remission rate of each dataset was set to 100% and the increase or decrease after stratification in relation to these 100% was calculated.

Fig. 1: Visualization of the Brainmarker-I classification.

A filled, pink dot on the left denotes either that the patient has low voltage alpha or that their iAF falls into the frontal synchronization range (9.6–10.4 Hz) (depicted above). The iAF is depicted in Brainmarker-I decile scores from 1 (relatively slow) to 10 (relatively fast). Low deciles (decile 1–5; blue) indicate stratification to ECT, Sync (orange) indicates 10 Hz rTMS treatment stratification, high deciles (decile 6–10; burgundy) indicate 1 Hz rTMS treatment. As visualized, the synchronization range overlaps with the decile scores, depending on the age of the individual (for example, higher deciles overlap more for older age). For subgroup assignment, the synchronization range is leading, that is, if an individual falls into that range, they are assigned to the synchronization group, otherwise the decile score indicates assignment to either low- or high-decile subgroup. A simulation for full group stratification was conducted where remission was calculated for all datasets combined but including only individuals in the respective stratified groups (for example, individuals with a high decile score in the 1 Hz rTMS samples). Sync, synchronization marker.

Finally, number-needed-to-treat (NNT) was calculated, which demonstrates how many patients need to be treated with the treatment recommended by the biomarker to get one more patient to remit compared with treating patients with the same active treatments but in a random fashion (not informed by the biomarker).

Results

Basic information about the different datasets is provided in Table 1 and Table 2. Remission rates of each dataset and treatment group are summarized in Supplementary Table 1.

Blinded sertraline replication

Results for the EMBARC dataset are visualized in Fig. 2.

Fig. 2: Independent validation of better remission rate to sertraline treatment in slow-iAF subgroup in a randomized, double-blind, placebo-controlled trial.

Normalized remission rate in the low-decile subgroup for placebo and sertraline treatment arm after 8 or 16 weeks of treatment.

Since the aim was to replicate previous findings of low iAF and remission to sertraline, the directed hypothesis was that the remission rate would be higher in the low-decile subgroup.

At 8 weeks treatment, the low-decile subgroup showed a slightly higher remission to sertraline treatment compared with group remission (nPPV +9%, PPV 45%, NNT 28), which increased to +15% (PPV 83%, NNT 9) at 16 weeks. For placebo, no direction of effect was found (nPPV +3%) after 8 weeks, or after prolonged treatment at 16 weeks (nPPV −3%).

Brain stimulation treatments

Stratification results of the rTMS and ECT analyses are visualized in Fig. 3. Full results of analyses in all three biomarker subgroups can be found in Supplementary Table 2.

Fig. 3: Normalized remission rates within subgroups that would be assigned to respective treatment according to the biomarker.

Orange color indicates synchronization subgroup, burgundy indicates high-decile subgroup and blue indicates low-decile subgroup.

In line with previous evidence31,32, the remission rate in the synchronization subgroup (iAF between 9.6 Hz and 10.4 Hz) in Dataset-2 for patients who had received 10 Hz rTMS was increased (nPPV +29%, PPV 77%, NNT 6) compared with the total group remission rate. Therefore, 10 Hz rTMS was regarded as first treatment choice for patients with a 10-Hz synchronous iAF.

Of the different subgroups tested in Dataset-2 in patients who had received 1 Hz rTMS treatment, the high-decile group showed the highest remission rate with an nPPV of +14% (PPV 60%, NNT 14). A blinded out-of-sample validation in the unseen rTMS Dataset-3 confirmed this direction of effect with an nPPV of +16% (PPV 50%, NNT 15).

For ECT, the low-decile subgroup in Dataset-4 presented with an increased remission rate of +38% (nPPV; PPV 36%, NNT 10) compared with the total group remission rate. A blinded out-of-sample validation in Dataset-5 corroborated the direction of effect with an nPPV of +18% (PPV 72%, NNT 9).

Brain stimulation treatment stratification

Based on prior findings, we conducted a simulation for stratification between brain stimulation interventions, calculating the weighted average of the PPVs that had previously been determined for each treatment.

The percentage of patients falling into the three different subgroups across all included rTMS and ECT datasets differed (Discussion). For low-decile, synchronization and high-decile subgroup, these were 47%, 30% and 23%, respectively.

Weighing each PPV in the biomarker-allocated subgroups by these percentages, and merging the different treatment samples into one dataset led to an increase in remission rate from 53% to 65% (NNT 9), an increase of normalized remission rate of +24% over the nonstratified remission rate.
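The weighting step can be illustrated with a minimal Python sketch. The subgroup shares are those reported above; the per-subgroup PPVs are arbitrary placeholders, because the pooled PPVs are not listed in this section, so the output is not expected to reproduce the reported 65%. The function name `stratified_remission_rate` is hypothetical.

```python
# Subgroup shares across the pooled rTMS and ECT datasets (taken from the text).
shares = {"low_decile": 0.47, "synchronization": 0.30, "high_decile": 0.23}

# PLACEHOLDER remission rates (PPVs) for each biomarker-allocated subgroup.
# These numbers are illustrative assumptions, not results from the study.
placeholder_ppv = {"low_decile": 0.60, "synchronization": 0.75, "high_decile": 0.55}


def stratified_remission_rate(shares, ppv):
    """Weighted average of subgroup PPVs, weighted by subgroup share."""
    total_share = sum(shares.values())
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("subgroup shares must sum to 1")
    return sum(shares[group] * ppv[group] for group in shares)


rate = stratified_remission_rate(shares, placeholder_ppv)
print(f"simulated stratified remission rate: {rate:.1%}")
```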

Exploratory analyses

For psychotherapy Dataset-6, patients in the low-decile subgroup were more likely to remit, with an nPPV of +19% (PPV 35%, NNT 15). In the ketamine Dataset-7 and in patients of Dataset-1 who received bupropion for 8 weeks, neither low- nor high-decile scores were associated with remission (nPPV −2% and nPPV +1% for low deciles, respectively). Results are visualized in Supplementary Fig. 1.

Confounding factors analyses

To ascertain that the presented findings were not related to differences in depression severity, we conducted one-way analyses of variance between the three biomarker subgroups (low decile without synchronization range, synchronization range, high decile without synchronization range) and baseline depression scores for all main datasets separately. There were no significant differences between groups in any of the datasets (P > 0.147).

Discussion

The present study successfully extends the previously introduced Brainmarker-I for ADHD to MDD treatment, thereby presenting a transdiagnostic and clinically actionable EEG biomarker. Following the previous finding of better treatment response to sertraline in patients with a low iAF30, we aimed to replicate this direction of effect in the randomized, placebo-controlled EMBARC dataset, expecting no effect for placebo. In addition to replicating the previously shown sertraline effect30 for remission after 8 weeks and 16 weeks of sertraline treatment, we demonstrated that this effect is specific to sertraline and does not hold for placebo at either of the two timepoints of outcome. The increase of remission rate to sertraline at week 8 was small (nPPV +9%), probably due to the high placebo remission rate of 29% that did not differ from the week-8 sertraline remission rate of 32%. It is known that placebo response can be substantial in antidepressant trials36,37 and a diminished response to the active antidepressant treatment has been reported in studies that include a placebo arm38. It is conceivable that the sertraline effect at week 8 was diminished by the possibility of receiving the inactive compound.

For 10 Hz rTMS treatment, the effect of better clinical response to 10 Hz rTMS in patients with an iAF closer to 10 Hz had already previously been demonstrated31 and replicated32. We quantified this finding by determining the Brainmarker-I synchronization subgroup in Dataset-2, which showed an increased normalized remission rate of +29% to 10 Hz rTMS compared with the group remission rate. This finding has been linked to the theory of 10 Hz stimulation entraining the endogenous oscillations to the stimulation frequency, with the Arnold tongue model predicting better entrainment the closer the stimulation frequency is to the endogenous frequency31,39.

For 1 Hz rTMS, we explored linear effects in both directions, with either low or high decile scores corresponding to remission. Only high decile scores were associated with increased remission to 1 Hz rTMS (nPPV +14%). This association was successfully replicated in a blinded out-of-sample validation in rTMS Dataset-3, with a 16% higher normalized remission rate in the biomarker-identified subgroup. The same discovery analyses were repeated for ECT treatment. In ECT Dataset-4, the low-decile subgroup presented with a higher normalized remission rate of +38% (PPV 36%) compared with the overall group remission rate. We subsequently replicated this direction of effect in a blinded out-of-sample validation in ECT Dataset-5, with an increased remission rate of +18% (PPV 72%) in the low-decile subgroup.

Given that a slow iAF might be considered an abnormality in the EEG21,22,23,24,40,41, this finding is in line with previous results, showing that patients with EEG abnormalities not only responded better to bilateral than to unilateral ECT, they also responded better to bilateral ECT than the group without abnormalities (77% response versus 67%, respectively)42. Although the publication did not specifically mention slow iAF as one of the assessed abnormalities, our findings support the conclusion of better treatment response to bilateral ECT in patients with EEG abnormalities since most patients in Dataset-4 (74%) and all patients in Dataset-5 received bilateral ECT. Future research is needed to examine whether our finding only holds for bilateral ECT as suggested by the findings by Malaspina et al.42. Interestingly, in a secondary analysis (Supplementary Discussion 1) examining the association between side effects to ECT and iAF in replication Dataset-5, we found that those patients that Brainmarker-I classified as ECT remitters also experienced fewer side effects of any kind (mainly memory impairment) with an nPPV of +23% (PPV 44%). This is a particularly intriguing finding since ECT side effects are the main concern of patients.

Remission rate is generally lower in patients with a lower iAF, and this was also the case in our samples which led to lower PPVs (as reported in the results). In traditional biomarker research where one biomarker predicts treatment success or failure, one might consider these rather low PPVs insufficiently strong for use in clinical practice. However, when considering the idea behind stratification, we see how even small improvements can be clinically meaningful and valuable6. Instead of denying someone a treatment based on an unfavorable prediction, the stratification approach assigns individuals to one of several evidence-based and commonly prescribed treatments based on their worst or best chances to remit. This means that, compared with the alternative one-size-fits-all approach, no harm is done by using stratification (for a more in-depth explanation, see ref. 6).

In this manuscript, we present a stratification solution for difficult-to-treat depression, based partially on previous findings (for example, for 10 Hz) but enhanced by additional recommendations for the best treatment option (of several common interventions) for the low- and high-decile subgroups.

We, moreover, suggest that Brainmarker-I might have potential to inform matched stepped care by suggesting a better chance to remit to sertraline as a first-line treatment for patients in the low-decile subgroup, and to ECT for the same group after sertraline treatment has failed.

When combining all brain stimulation findings and following the tested and validated stratification scheme, the already high remission rate of 53% improved to 65%, an absolute increase of 12 percentage points, with an NNT of 9. This means that nine patients need to be treated with the biomarker-recommended treatment to have one more patient remit compared with active treatment prescribed in an arbitrary way. This NNT is close to the effect of tricyclic antidepressant and SSRI monotherapy (minimum NNT 7)43 compared with placebo. This is rather impressive, considering that the simulated stratified remission rate was not compared with a non-active control treatment but rather with active treatment, meaning it reflects the added effect of biomarker-based stratification.

Since the focus of the present article is treatment stratification, associations between iAF and outcomes outside the context of stratification were not tested, and the presented biomarker was not developed in the classical sense, that is, validated on specificity and sensitivity. Instead, the aim was to determine correlates that help decide between several evidence-based treatments, enriching treatment decisions with a brain-based parameter to be considered in the context of other determining factors, such as treatment history or contraindications. We acknowledge that treatment prescription is often bound by health care policies. The biomarker presented here is therefore only meant as a tool that helps the treating physician to inform treatment prescription, with the final decision lying with the physician in consultation with the patient.

The present manuscript is subject to some limitations. Remission was evaluated by different depression scales across different datasets. However, all remission cutoff criteria used, except for the 17-item Hamilton Rating Scale for Depression (HRSD-17), were in line with the criteria proposed by Riedel et al.44. Similarly, EEG parameters and amplifiers differed across collection locations, resulting in a total of six different EEG systems included. During preprocessing, all data were matched to our own datasets as closely as possible. For the purpose of detecting the alpha peak in frontal electrodes, all EEG data complied with our requirements. Moreover, consistent findings in spite of heterogeneity in acquisition systems highlight the robustness of the biomarker.

The original ECT dataset was small (N = 19) and had an unusually low remission rate (26%) compared with standard ECT remission due to a highly heterogeneous, comorbid patient profile. However, since we successfully replicated our ECT finding in a larger unseen dataset with a remission rate considered normal for ECT, we assume that the small sample size and low remission rate did not affect our finding.

One noticeable feature of Brainmarker-I is that iAFs are not evenly distributed across the three stratification subgroups (Supplementary Table 3). Approximately 40–50% of the patients fall into the asynchronous decile 1–5 subgroup, while the 10-Hz synchronous and asynchronous higher-decile (6–10) subgroups make up the remaining 50–60%. One reason is that an iAF of 9.8 Hz is already considered to fall in the upper alpha range, that is, fast alpha45, making the synchronization marker (9.6–10.4 Hz) overlap more with the higher-decile subgroup.

One limitation linked to the use of an nPPV is that it depends on the prevalence of the biomarker in the total group, since a high remission rate in the biomarker subgroup will contribute more to the total remission rate, the more prevalent that biomarker is in the total group.

On the other hand, due to the rather prominent differences in remission rate between datasets, mentioning only the PPV in itself would also be biased, with a higher remission rate almost automatically resulting in a higher PPV.

Lastly, the high heterogeneity between datasets and their clinical nature complicated assessing other clinical or cognitive variables. Brainmarker-I by definition controls for age and sex. In addition, analyses showed no differences in baseline severity between subgroups in any of the datasets, and the results were validated in heterogeneous clinical, previously unseen datasets, thereby confirming the robustness of the biomarker across changing variables. Nonetheless, it cannot be ruled out that other factors could have influenced or mediated the presented findings. More systematic research is required in the future to examine the link between the introduced biomarker and other cognitive and clinical factors, and to examine whether adding such variables to the biomarker recommendation could potentially improve treatment stratification.

Conclusions

We hereby present a clinically actionable transdiagnostic treatment stratification EEG biomarker that can successfully assign patient subgroups to various ADHD and MDD treatments, and is ready to be implemented in clinical practice.

Methods

Data collection and preprocessing

EEGs for Dataset-2, Dataset-3, Dataset-4 and Dataset-6 were recorded in a standardized manner in accordance with Brain Resource Ltd.35. In short, brain activity was measured from 26 channels of the 10–20 electrode international system (Fp1, Fp2, F7, F3, Fz, F4, F8, FC3, FCz, FC4, T7, C3, Cz, C4, T8, CP3, CPz, CP4, P7, P3, Pz, P4, P8, O1, Oz, O2; Quikcap, NuAmps) with a ground at AFz. Measurements consisted of 4-min resting-state recordings (2 min eyes open, 2 min eyes closed). Sampling frequency (FS) was 500 Hz, and a low-pass filter with an attenuation of 40 dB per decade above 100 Hz was applied before digitization. Horizontal and vertical eye movements were recorded with electrooculography (EOG) electrodes (VEOG upper and lower, HEOG left and right) and skin resistance was kept <10 kΩ for all electrodes.

Artifact rejection was performed with a fully automated, custom Python package46,47,48,49.

In short, bipolar EOG was removed from the EEG signal using Gratton50. A band-pass filter between 0.5 Hz and 100 Hz was applied, and the notch frequency of 50 Hz was removed. The following artifacts were detected and removed: electromyography, sharp channel-jumps (up and down), kurtosis, extreme voltage swing, residual eyeblinks, electrode bridging and extreme correlations. If more than 66% of a channel’s signal was artifactual, it was repaired using a Euclidean distance weighted average of at least three neighboring channels. If neighboring channels were not available due to artifactual data, the channel was removed. Very artifactual data were excluded on the basis of visual inspection.
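The channel-repair step can be sketched as follows. This is an illustration only, not the code of the custom package: it assumes that weights are proportional to the inverse Euclidean distance between electrode positions, which the text does not specify, and the function name `repair_channel` and the electrode coordinates are hypothetical.

```python
import math


def repair_channel(bad_position, neighbors, min_neighbors=3):
    """Interpolate a bad channel from clean neighboring channels.

    bad_position: (x, y, z) coordinates of the bad electrode.
    neighbors: list of ((x, y, z), samples) for clean neighboring channels.
    Weights are proportional to 1 / Euclidean distance (an assumption).
    """
    if len(neighbors) < min_neighbors:
        # Mirrors the rule in the text: without enough neighbors, drop the channel.
        raise ValueError("too few clean neighbors; channel should be removed")
    weights = [1.0 / math.dist(bad_position, pos) for pos, _ in neighbors]
    total = sum(weights)
    n_samples = len(neighbors[0][1])
    return [
        sum(w * samples[i] for w, (_, samples) in zip(weights, neighbors)) / total
        for i in range(n_samples)
    ]


# Three equidistant neighbors (hypothetical coordinates), two samples each.
neighbors = [
    ((1.0, 0.0, 0.0), [1.0, 1.0]),
    ((0.0, 1.0, 0.0), [2.0, 2.0]),
    ((-1.0, 0.0, 0.0), [3.0, 3.0]),
]
repaired = repair_channel((0.0, 0.0, 0.0), neighbors)
```

With equidistant neighbors the weights are equal, so the repaired signal reduces to the plain average of the neighboring signals.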

For full details on preprocessing, see van Dijk et al.51. The Python code used for processing the EEG and calculating the iAF is freely available for download at https://brainclinics.com/resources/.

Data cleaning and artifact rejection for Datasets-1, -5 and -7 were performed in Brain Vision Analyzer version 2.2.0 (Brain Products GmbH) by semi-automatic removal of epochs with signal amplitudes >150 µV.

For Dataset-1 (EMBARC)33, different EEG acquisition systems were used across different sites, leading to different numbers of electrodes (60–128) and FS (250 Hz or 256 Hz). EEGs from all EMBARC locations were downsampled to the lowest FS (250 Hz), and electrodes were adjusted to match the 26 locations listed above.

ECT Dataset-5 (ref. 52) was treated accordingly, resulting in an FS of 200 Hz and 19 channel locations (FC3, FCz, FC4, CP3, CPz, CP4 and Oz missing).

Similarly, the ketamine Dataset-7 (ref. 53) combined three different studies with different FS and channel locations. Matching them to our data resulted in an FS of 500 Hz in two of the studies and 250 Hz in one study (25 patients), and either 18 or 19 channel locations (FC3, FCz, FC4, CP3, CP4, CPz and either Cz or Oz or both missing).

In line with Voetterl et al.28, the primary outcome measure for all datasets was remission, defined as a score of ≤12 on the Beck Depression Inventory-II (BDI-II; for Dataset-2, Dataset-3 and Dataset-6), ≤7 on the HRSD-17 (for Dataset-1 and Dataset-4), ≤2 on the Clinical Global Impression ratings (Dataset-5), and ≤7 on the Montgomery–Åsberg Depression Rating Scale (MADRS, Dataset-7). These were in line with remission as defined by Riedel et al.44, except for the HRSD-17 cutoff, which was based on the original sertraline study30, as the aim was to replicate this finding.

Biomarker development

Brainmarker-I for MDD is based on the same previously reported EEG-biomarker for ADHD28. The biomarker was developed in a large heterogeneous clinical dataset (TDBRAIN+; N = 4,249). A subset of the data, the open-access TDBRAIN dataset (N = 1,274; two decades brainclinics research archive for insights in neurophysiology), is freely available at http://www.brainclinics.com/resources (ref. 53) after login, with all data recorded at Research Institute Brainclinics (Brainclinics Foundation, Nijmegen, the Netherlands). In addition, the data are available on the data repository Synapse at ref. 54.

EEG (pre-)processing, as well as conditions and montages employed, often differ considerably across studies which can hinder replication of findings and thereby implementation of biomarkers in clinical practice. In Voetterl et al.28, a standardized processing pipeline was developed by making use of a biological ground truth, the maturation (speeding-up) of the iAF during childhood and adolescence.

In short, EEGs without measurable alpha oscillations, so-called low-voltage alpha (LVA) EEGs, were identified and excluded from further processing since an alpha peak cannot be determined in these data. Subsequently, 108 processing parameter permutations, comparing reference montage, condition, segmentation and topographical location, were tested against iAF maturation in 1,671 children and adolescents aged <18 years. Curve fitting was performed for males and females separately to find the mathematical model that most closely represented the brain-maturation effect. The permutation resulting in the highest correlation between iAF and age was used for the subsequent analyses.

Divergence values were calculated for each individual by subtracting from the individual’s iAF the model-predicted iAF for the individual’s sex and age, with a negative divergence score reflecting an iAF that is slower than the mean at that age and sex. The divergence values of the full dataset of >4,000 individuals were sorted and divided into ten equal-sized bins that denote the deciles used for assignment to the different subgroups later. For a more detailed description of the LVA and biomarker discovery, see Supplementary Discussion 2 and 3.
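The divergence and decile steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the reference divergences are synthetic, the model-predicted iAF in the example is a made-up value rather than output of the fitted maturation model, and the bin-edge convention and the function names `divergence`, `decile_cutoffs` and `decile_score` are hypothetical.

```python
from bisect import bisect_right


def divergence(iaf_hz, predicted_iaf_hz):
    """Individual iAF minus the model-predicted iAF for that age and sex.

    Negative values indicate an iAF slower than expected.
    """
    return iaf_hz - predicted_iaf_hz


def decile_cutoffs(reference_divergences):
    """Nine cutoffs that split the sorted reference values into ten equal-sized bins."""
    ordered = sorted(reference_divergences)
    n = len(ordered)
    return [ordered[(n * k) // 10] for k in range(1, 10)]


def decile_score(value, cutoffs):
    """Decile score from 1 (relatively slow) to 10 (relatively fast)."""
    return bisect_right(cutoffs, value) + 1


# Synthetic reference divergences (assumption; stands in for the large dataset).
reference = [(i - 49.5) / 50 for i in range(100)]
cutoffs = decile_cutoffs(reference)

# Hypothetical patient: measured iAF 9.1 Hz, model-predicted iAF 9.8 Hz.
score = decile_score(divergence(9.1, 9.8), cutoffs)
```

Because the cutoffs are derived once from the reference data and then reused, new patients can be scored without recomputing the bins.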

The iAF for all treatment datasets was determined by calculating the Fast Fourier Transform of the preprocessed resting-state eyes-closed EEG data, segmented into 5 s and re-referenced to an average reference, based on previous literature28,31.

The highest peak within the frequency range of 7–13 Hz was identified at the 10–20 EEG system locations F3 and Fz, in line with previous predictions28,31,32. Participants with missing clinical data, insufficiently clean EEG data and EEGs with LVA were excluded. The resulting values were divided into decile scores, according to the cutoff values determined in the large TDBRAIN+ dataset. Treatment predictions were made on the basis of low (decile 1–5) or high deciles (decile 6–10) in the Fz electrode. For an example, see Fig. 1.
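The spectral peak step can be sketched as follows, assuming NumPy is available. This is a simplified illustration, not the published pipeline: it takes a single, already re-referenced channel, applies no window function, and treats the largest value of the segment-averaged power spectrum within 7–13 Hz as the peak. The function name `estimate_iaf` and the synthetic test signal are hypothetical.

```python
import numpy as np


def estimate_iaf(signal, fs, segment_s=5.0, band=(7.0, 13.0)):
    """Return the frequency (Hz) with the highest mean power inside `band`."""
    signal = np.asarray(signal, dtype=float)
    n_seg = int(round(segment_s * fs))          # samples per segment
    n_full = len(signal) // n_seg               # number of complete segments
    if n_full == 0:
        raise ValueError("signal is shorter than one segment")
    segments = signal[: n_full * n_seg].reshape(n_full, n_seg)
    segments = segments - segments.mean(axis=1, keepdims=True)  # remove offset
    power = np.abs(np.fft.rfft(segments, axis=1)) ** 2
    mean_power = power.mean(axis=0)             # average spectrum over segments
    freqs = np.fft.rfftfreq(n_seg, d=1.0 / fs)  # bin spacing is 1 / segment_s
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    return float(freqs[in_band][np.argmax(mean_power[in_band])])


# Synthetic 2-min "eyes-closed" signal: 10.2 Hz oscillation plus noise.
rng = np.random.default_rng(0)
fs = 500
t = np.arange(120 * fs) / fs
eeg = np.sin(2 * np.pi * 10.2 * t) + 0.5 * rng.standard_normal(t.size)
iaf = estimate_iaf(eeg, fs)
```

A production implementation would additionally need to verify that the maximum is a genuine local peak, since the largest in-band value can otherwise sit at the band edge when no alpha rhythm is present.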

Additionally, to account for the association between an iAF close to 10 Hz and 10 Hz rTMS, a synchronization indicator was introduced, which denotes an iAF around the stimulation frequency of 10 Hz at the F3 location (Fig. 1). To determine the optimal range for this third biomarker subgroup, we tested different cutoff values that were equidistant from the 10 Hz frequency. Due to the frequency resolution of 0.2 Hz, a result of data segmentation, the possible options were restricted.

Ranges tested were 9.4–10.6 Hz (49% of individuals), 9.8–10.2 Hz (22% of individuals) and 9.6–10.4 Hz (30% of individuals). The range of 9.6–10.4 Hz encompasses approximately a third of the individuals in the dataset and therefore resulted in the best trade-off between the proportion of patients falling into this range and prediction accuracy.
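The restriction on candidate ranges follows directly from the segment length. As a brief worked illustration (the symbols \(T_{\mathrm{seg}}\) for the segment duration, \(\Delta f\) for the frequency resolution and \(k\) for the number of frequency bins on either side of 10 Hz are introduced here for convenience):

\[
\Delta f = \frac{1}{T_{\mathrm{seg}}} = \frac{1}{5\ \mathrm{s}} = 0.2\ \mathrm{Hz}, \qquad
\left[\,10 - k\,\Delta f,\ 10 + k\,\Delta f\,\right]\ \mathrm{Hz} =
\begin{cases}
9.8\text{–}10.2 & k = 1\\
9.6\text{–}10.4 & k = 2\\
9.4\text{–}10.6 & k = 3
\end{cases}
\]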

Since this range overlaps with the low- and high-decile subgroups, patients falling into the synchronization range were excluded from the low- and high-decile subgroups, to obtain three distinct subgroups.
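The resulting three-way assignment can be sketched as follows. This is an illustration of the rule as described in the text and in Fig. 1, not the automated algorithm itself: the inclusive range boundaries, the small floating-point tolerance and the names `assign_subgroup` and `RECOMMENDED_TREATMENT` are assumptions.

```python
SYNC_RANGE_HZ = (9.6, 10.4)  # frontal synchronization range at F3 (from the text)
TOLERANCE_HZ = 1e-6          # assumption: guards against floating-point bin edges

# Treatment indicated for each subgroup, as described in the Fig. 1 legend.
RECOMMENDED_TREATMENT = {
    "synchronization": "10 Hz rTMS",
    "low_decile": "ECT",
    "high_decile": "1 Hz rTMS",
}


def assign_subgroup(iaf_f3_hz, decile_fz):
    """Assign one of three mutually exclusive biomarker subgroups.

    The synchronization range takes precedence; otherwise the Fz decile
    score decides between the low (1-5) and high (6-10) subgroup.
    """
    if not 1 <= decile_fz <= 10:
        raise ValueError("decile score must be between 1 and 10")
    low, high = SYNC_RANGE_HZ
    if low - TOLERANCE_HZ <= iaf_f3_hz <= high + TOLERANCE_HZ:
        return "synchronization"
    return "low_decile" if decile_fz <= 5 else "high_decile"


# Hypothetical patient: iAF of 9.2 Hz at F3 and a decile score of 3 at Fz.
subgroup = assign_subgroup(9.2, 3)
treatment = RECOMMENDED_TREATMENT[subgroup]
```

Checking the synchronization range first is what makes the three subgroups non-overlapping, even though the range spans several decile scores.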

The automated algorithm described in Voetterl et al.28 was used to calculate iAF and decile scores for individuals of all datasets (Fig. 1).

Statistics

PPVs indicate the remission rate within the subsample of patients that Brainmarker-I would have stratified to the respective treatment. An nPPV was calculated to be able to compare predicted remission rates of different datasets, using the formula \(\left(\tfrac{m}{w}-1\right)\times 100\) (m, PPV; w, observed sample remission rate). In short, the respective remission rate of each dataset was set to 100%, and the increase or decrease after stratification in relation to these 100% was calculated.

In addition, NNT was calculated, which determined how many patients need to be treated according to Brainmarker-I stratification to get one more remitter compared to treating patients with the same active treatments but in an arbitrary fashion.
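Both measures can be illustrated with a short Python sketch. The nPPV follows the formula given above. For the NNT, the sketch assumes the conventional definition (the reciprocal of the absolute difference between the stratified and the unstratified remission rate), which is not spelled out in the text, and it returns the unrounded value because the rounding convention is not stated. The function names are hypothetical.

```python
def nppv(ppv, base_rate):
    """Normalized PPV in percent: (m / w - 1) * 100, with m = PPV and
    w = observed remission rate of the whole sample."""
    return (ppv / base_rate - 1.0) * 100.0


def nnt(ppv, base_rate):
    """Number needed to treat, assuming NNT = 1 / (PPV - base rate)."""
    gain = ppv - base_rate
    if gain <= 0:
        raise ValueError("NNT is only defined for a positive gain in remission rate")
    return 1.0 / gain


# ECT Dataset-4: PPV 36% in the low-decile subgroup, overall remission rate 26%.
print(round(nppv(0.36, 0.26)))    # 38
print(round(nnt(0.36, 0.26), 1))  # 10.0
```

Both printed values agree with the figures reported for ECT Dataset-4 (nPPV +38%, NNT 10).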

To test whether potential findings could be explained by differences in depression severity, we conducted one-way analyses of variance between the three biomarker subgroups (low decile without synchronization range, synchronization range, high decile without synchronization range) and baseline depression scores for all main datasets separately. Biomarker calculation was conducted in Python, using modules numpy55, pandas49 and scipy47. All other statistical analyses were performed in IBM SPSS Statistics for Macintosh, Version 27.0.

Datasets

Datasets used in this study are briefly described below. Full details of the samples can be found in their respective published primary papers. Basic information about the different datasets is summarized in Table 1 and Table 2. All studies were approved by their respective institutional review boards (with ethical approval numbers available in the primary publications of the studies).

Table 1 Basic demographic information EMBARC dataset
Table 2 Basic demographic information all datasets

Dataset-1: EMBARC Sertraline

The EMBARC data were precollected data that were specifically requested for secondary analyses (for information on ethical approval, CONSORT diagrams, study protocol and participant inclusion, we refer the reader to the relevant references)33,34. The study was approved by the institutional review boards of all study sites (University of Texas Southwestern Medical Center, Columbia University/Stony Brook, Massachusetts General Hospital, University of Michigan, University of Pittsburgh, and McLean Hospital). All participants provided written consent for the original study from which the data has been used and received financial compensation. Between 29 July 2011 and 15 December 2015, outpatients were recruited at four sites: Columbia University, New York; Massachusetts General Hospital, Boston; University of Michigan, Ann Arbor; and University of Texas Southwestern Medical Center, Dallas. A total of 296 participants were randomized to sertraline or placebo, administered for 8 weeks, and then assessed for treatment response (defined as ≥50% reduction in HRSD-17 scores). The study design stipulated responders to remain on the same drug regimen, and to switch nonresponders to a different medication (sertraline for placebo nonresponders and bupropion for sertraline nonresponders) for the next 8 weeks.

We used these data to conduct a blinded out-of-sample validation analysis, with the directed hypothesis that patients with a low decile score would be more likely to achieve remission with sertraline but not with placebo. We first inspected the nPPV for sertraline and for placebo at the primary endpoint (week 8). As a secondary outcome, we calculated nPPVs after prolonged sertraline or placebo administration (week 16).

Dataset-2 and Dataset-3: rTMS

Dataset-2 and Dataset-3 are open-label, clinical datasets composed of patient data collected at multiple outpatient mental health care clinics in the Netherlands (Brainclinics Treatment, neurocare clinic Nijmegen, neurocare clinic The Hague, and Psychologenpraktijk Timmers Oosterhout) between May 2007 and November 2016 (Dataset-2) and December 2016 and June 2022 (Dataset-3). These studies were not reviewed by an independent ethics committee. Each patient provided written informed consent for data use before collection of the EEG data. In rTMS Dataset-2, 196 patients with MDD received 10 Hz rTMS over the L-DLPFC or 1 Hz rTMS over the R-DLPFC (at 120% resting motor threshold, 1,500 or 1,200 pulses, respectively) concurrent with psychotherapy32. In rTMS Dataset-3, 39 patients received only 1 Hz R-DLPFC stimulation and psychotherapy. All other parameters were the same as in rTMS Dataset-2.

Dataset-4 and Dataset-5: ECT

ECT Dataset-4 comprises data from the Study on Neuroimaging predictors of Outcome in ECT Patients (SNOEP), which was approved by Rijnstate Hospital and the medisch-ethische toetsingscommissies (METC) Arnhem/Nijmegen. Patients who were referred for ECT treatment at Rijnstate Hospital between August 2016 and June 2022 were included. All patients provided written consent for the original study from which the data have been used before study start. Since these data were collected as part of a clinical trajectory, participants did not receive financial compensation. Thirty-nine outpatients with MDD were treated with ECT, 19 of whom had complete EEG and outcome data. Fourteen received bifrontotemporal (BL) stimulation and 5 right unilateral (RUL; according to d’Elia56) stimulation with stimulus dose relative to seizure threshold (SDRST; that is, 6 times seizure threshold (ST) in RUL and 2.5 times ST in BL ECT) and using 0.5 ms pulse width. Resting-state EEG data and HRSD-17 scores were collected before ECT and 2 weeks after the ECT course.

ECT Dataset-5 comprises data from 60 patients who underwent ECT treatment at University Hospital Zurich between 2006 and 2015. This study was not reviewed by an independent ethics committee. All participants provided written consent for the original study from which the data have been used. Since these data were collected as part of clinical treatment, participants received no financial compensation. Patients were treated with 6–12 sessions of bifrontal ECT (pulse width 0.5 ms, SDRST 1.5 of ST)52, with outcome analyzed by Clinical Global Impression ratings.

Stratification between brain stimulation techniques

Discovery analyses were conducted for all brain stimulation techniques except 10 Hz rTMS, for which the direction of effect was informed by previous findings. Since this finding has already been independently replicated32, no blinded out-of-sample validation was conducted. Instead, in Dataset-2, remission was predicted in patients with an iAF in the synchronization range (iAF between 9.6 and 10.4 Hz) who had received 10 Hz rTMS.

For the 1 Hz rTMS and ECT datasets, all possible directions of effect were tested, that is, low decile score (1–5) excluding synchronization range, synchronization range, and high decile score (6–10) excluding synchronization range. Potential findings were subsequently evaluated in blinded, out-of-sample validations in rTMS Dataset-3 and ECT Dataset-5.
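The three-way subgroup definition used in these analyses can be illustrated with a minimal Python sketch. It assumes that the synchronization range is inclusive at both bounds (9.6 and 10.4 Hz) and that each patient already has a decile score from 1 to 10. The function and label names are hypothetical and do not reproduce the authors' code.

```python
# Bounds of the synchronization range in Hz.
# Assumption: both bounds are treated as inclusive.
SYNC_LOW_HZ = 9.6
SYNC_HIGH_HZ = 10.4


def assign_subgroup(iaf_hz, decile):
    """Assign a patient to one of the three biomarker subgroups.

    iaf_hz: individual alpha peak frequency in Hz.
    decile: biomarker decile score, an integer from 1 to 10.
    """
    if not 1 <= decile <= 10:
        raise ValueError("decile must be between 1 and 10")
    # The synchronization range takes precedence over the decile score,
    # so both decile subgroups exclude patients within that range.
    if SYNC_LOW_HZ <= iaf_hz <= SYNC_HIGH_HZ:
        return "synchronization"
    return "low_decile" if decile <= 5 else "high_decile"


# Hypothetical patients, for illustration only.
print(assign_subgroup(10.0, 3))   # synchronization
print(assign_subgroup(8.9, 2))    # low_decile
print(assign_subgroup(11.2, 8))   # high_decile
```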

Lastly, we conducted a simulation for stratification between brain stimulation interventions.

Since patients were not evenly distributed across the different subgroups (low decile, synchronization and high decile), we first determined the percentage of patients that can be expected to be stratified to each subgroup based on all our rTMS and ECT datasets. Subsequently, we used these percentages to calculate the weighted average of the PPVs that were previously determined for each treatment. The resulting PPV and nPPV were the expected remission rate and normalized remission rate following stratification to rTMS and ECT with Brainmarker-I.
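The weighting step in this simulation can be sketched as follows. This is a minimal illustration in Python: the function name is hypothetical, and the subgroup shares and PPVs in the example are made-up placeholders, not values from the study.

```python
def stratified_ppv(subgroup_share, subgroup_ppv):
    """Weighted average of subgroup PPVs.

    subgroup_share: expected share of patients stratified to each subgroup.
    subgroup_ppv: PPV of the treatment assigned to each subgroup.
    Shares are normalized, so fractions or percentages both work.
    """
    if set(subgroup_share) != set(subgroup_ppv):
        raise ValueError("both inputs must cover the same subgroups")
    total_share = sum(subgroup_share.values())
    weighted_sum = sum(subgroup_share[k] * subgroup_ppv[k] for k in subgroup_share)
    return weighted_sum / total_share


# Hypothetical placeholder values, for illustration only.
shares = {"low_decile": 0.4, "synchronization": 0.2, "high_decile": 0.4}
ppvs = {"low_decile": 0.5, "synchronization": 0.6, "high_decile": 0.7}
expected_remission_rate = stratified_ppv(shares, ppvs)
print(round(expected_remission_rate, 2))  # 0.6
```

Weighting by the expected subgroup shares means that a high PPV in a small subgroup contributes proportionally less to the overall expected remission rate than the same PPV in a large subgroup.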

Dataset-6 and Dataset-7: exploratory analyses—psychotherapy, ketamine and bupropion

Dataset-6 comprised patient data from three outpatient mental health care clinics (Synaeda Leeuwarden Fonteinland, Synaeda Drachten, Synaeda Heerenveen), and was therefore not reviewed by an independent ethics committee. Each patient provided written informed consent for data use before EEG and treatment start. Since these data were collected as part of clinical treatment, participants received no financial compensation.

Approval for all three ketamine studies used in Dataset-7 was obtained from the Ethical committee of Prague Psychiatric Centre/National Institute of Mental Health, Czech Republic before patient enrollment. Outpatients were recruited for study participation at Prague Psychiatric Centre, Czech Republic between 2010 and 2022. All patients provided written informed consent. No financial compensation was offered (for more information on ethical approval, study protocol and participant inclusion, we refer the reader to the relevant trial registration and reference53).

Exploratory analyses were performed in Dataset-6 for psychotherapy57, in Dataset-7 for ketamine treatment and in Dataset-1 for the subgroup of sertraline nonresponders who were switched to bupropion (N = 54). These analyses followed the same procedure as the previous analyses but without a directed hypothesis. Datasets are described in more detail in Supplementary Discussion 4.

Clinical trials

Data from the following trials were used in this study: Establishing Moderators and Biosignatures of Antidepressant Response for Clinical Care for Depression (EMBARC58, identifier NCT01407094), QEEG Cordance and EEG Connectivity Changes after Administration of Subanesthetic Ketamine Doses in Patients with Depressive Disorder59, The Role of mTOR (Mammalian Target of Rapamycin) Signaling Pathway in the Antidepressive Effect of Ketamine in Patients with Depressive Disorder60 and Clinical and Neurobiological Predictors of Response to Ketamine: towards Personalized Treatment of Depression61.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.