Introduction

Cognitive impairments are core features of schizophrenia [1] that significantly contribute to functional outcomes [2, 3]. These cognitive impairments are targeted by only a limited number of interventions, and persist following medication stabilization [4, 5]. One promising intervention, auditory-based targeted cognitive training (TCT), aims to drive improvements in higher-order cognitive function (e.g., verbal learning) via enhancement of basic auditory information processing [6,7,8,9]. The “bottom up” approach of TCT is congruent with mounting evidence that abnormalities in early auditory information processing (EAIP) substantially contribute to impairments in higher-order cognitive and psychosocial functioning [10,11,12,13,14,15,16]. Efficacy trials in schizophrenia patients have shown that TCT improves higher-order cognitive functioning [6, 9, 15, 17,18,19,20], particularly in the domain of verbal learning. However, individual responses to TCT vary, with up to 45% of patients exhibiting minimal or no benefit even after extended therapeutic “doses” [21]. Although factors such as illness severity and age have been proposed as potential moderators of patient response to cognitive remediation, strong and reliable predictors of treatment success have not yet been identified [8, 15, 22, 23].

Event-related potential (ERP) components of EAIP, mismatch negativity (MMN) and P3a, have been proposed as candidate biomarkers for predicting response to auditory-based TCT [24,25,26]. MMN and P3a are reliable biomarkers of central auditory system plasticity and discriminability—key processes underlying TCT [27,28,29]. Schizophrenia patients often have large-effect size deficits in MMN and P3a [30, 31] that are associated with (and perhaps contribute to) higher-order cognitive, clinical, and psychosocial functioning [12, 14, 18, 32,33,34]. Importantly, MMN and P3a also show strong relationships with elements of TCT.

First-episode schizophrenia patients with larger MMN reduction have shown greater benefit from TCT [35], while other EEG correlates of EAIP and auditory cortical plasticity have also been linked to treatment gains [9, 36]. Our own previous study demonstrated that among schizophrenia outpatients, MMN amplitude is malleable following the first hour of exposure to TCT [37]. This initial malleability in the underlying MMN and P3a circuitry might thus reflect an intermediate therapeutic process [8, 9, 14, 37] and therefore be predictive of gains with longer, therapeutic courses of treatment.

To the best of our knowledge, no previous studies have examined the extent to which change in MMN and P3a measures from baseline after an initial 1-h dose of TCT predicts patient outcomes. This study therefore aimed to replicate and extend our prior findings of ERP malleability assessed immediately before and after the initial 1 h of TCT (see ref. [37]), and determine whether this malleability predicts TCT outcomes after a 30 h course of treatment. We have previously reported a favorable TCT response in this cohort [15], with significant improvements in MCCB verbal learning t-scores (d = 0.65–0.82), and reductions in auditory hallucinations (d = −0.58), consistent with existing literature [6]. Based on our previous report [37], we hypothesized that there would be significant changes in MMN and P3a in recordings completed immediately before and after the 1st hour of TCT, and that the degree of this initial change would predict post-treatment improvements in verbal learning and positive symptom severity.

Materials and methods

Participants and design

Participants included 45 treatment-refractory patients with schizophrenia or schizoaffective disorder based on a clinical interview using a modified version of the Structured Clinical Interview for DSM-IV-TR [38]. Patients were recruited from a community-based inpatient treatment program and enrolled in the study after they were determined to be clinically stable by their treatment team. Exclusion criteria included an inability to understand the consent processes and/or provide consent or assent, not being a fluent English speaker, previous significant head injury with loss of consciousness >30 min, neurological illness, severe systemic illness, or current mania. After potential subjects indicated a willingness to participate, written informed consent was obtained from the patients with subsequent written approvals from their public guardians/conservators. Participants completed baseline assessments and were then randomized to either treatment-as-usual (TAU; n = 22) or TAU augmented with TCT (n = 23) using stratified random assignment by discrete levels of age, ethnicity, and gender. Participants then underwent either 1 h of TCT or 1 h of computer games (TAU). Since the efficacy of TCT has already been well-established, the TAU group was only seen again for post-study assessments after the completion of the initial hour of computer games. There were no significant group differences in demographics, clinical symptom ratings, or baseline cognitive function (see Table 1; [15]). Facility clinical treatment team members were blinded to TAU vs. TCT group assignment. The nature of the treatment and resource limitations (i.e., staff and physical space) did not allow for blinding of subjects and study staff. Attrition in the current sample is fully described in Supplementary Fig. 1 and by Thomas et al. [15]. One additional subject’s EEG data was lost due to file corruption. Notably, no subjects withdrew from the study due to the tolerability of the EEG assessments. 
The Institutional Review Board of University of California, San Diego approved all experimental procedures (IRB#130874).

Table 1 Comparison of clinical and demographic variables across patient groups

Targeted cognitive training

Details of the cognitive training and outcomes of this same cohort are reported by Thomas et al. [15]. Briefly, participants randomized to TCT completed 3–5 one-hour sessions per week, for a total of ~30 h. Six exercises supplied by BrainHQ by Posit Science Corporation were administered: Sound Sweeps (auditory processing speed), Fine Tuning (auditory perception and processing speed), Syllable Stacks (auditory memory), Memory Grid (auditory memory), To-Do List Training (auditory memory), and Rhythm Recall (auditory memory). Exercises applied an n-up/m-down algorithm to participant responses to estimate threshold. This design ensured that virtually any participant (regardless of initial impairment) could successfully initiate exercises and was continuously challenged at an appropriate level (~80% criterion accuracy) as their abilities improved.
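The adaptive rule described above can be illustrated with a minimal sketch. The exact values of n and m, the step size, and the difficulty scale used by the training exercises are not reported here, so the 1-up/3-down rule, the 1–20 difficulty scale, and the simulated listener below are all assumptions chosen for illustration. A 1-up/3-down staircase is a common choice because it converges near 79% correct, which is close to the ~80% criterion accuracy mentioned in the text.

```python
import random


class Staircase:
    """Minimal 1-up/3-down adaptive staircase (all parameters are illustrative)."""

    def __init__(self, level=10, step=1, n_down=3, min_level=1, max_level=20):
        self.level = level          # current difficulty; higher = harder (hypothetical scale)
        self.step = step
        self.n_down = n_down        # consecutive correct responses needed to get harder
        self.min_level = min_level
        self.max_level = max_level
        self._streak = 0

    def update(self, correct):
        """Adjust difficulty after one response and return the new level."""
        if correct:
            self._streak += 1
            if self._streak == self.n_down:
                # Enough correct responses in a row: make the task harder.
                self.level = min(self.max_level, self.level + self.step)
                self._streak = 0
        else:
            # A single error makes the task easier and resets the streak.
            self.level = max(self.min_level, self.level - self.step)
            self._streak = 0
        return self.level


def simulate(n_trials=2000, seed=0):
    """Run the staircase against a simulated listener and return overall accuracy."""
    rng = random.Random(seed)
    stair = Staircase()
    n_correct = 0
    for _ in range(n_trials):
        # Hypothetical psychometric function: 100% correct at level 1, 50% at level 20.
        p_correct = 1.0 - 0.5 * (stair.level - 1) / 19
        correct = rng.random() < p_correct
        n_correct += correct
        stair.update(correct)
    return n_correct / n_trials
```

Because difficulty tracks performance in both directions, a participant who starts with severe impairment is moved toward easier levels within a few trials, while an improving participant is moved toward harder ones.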

Cognitive assessment

The MATRICS Consensus Cognitive Battery (MCCB) [39, 40] was administered to patients at baseline (TBaseline) prior to group assignment and upon completion of the study (TPost). The MCCB is a reliable and comprehensive neurocognitive battery designed for treatment studies involving schizophrenia populations. Since previous studies [6,7,8,9], including Thomas et al. [15], have demonstrated that TCT produces significant improvements in verbal learning, the MCCB verbal learning t-scores (corrected for age and sex) were the primary outcome measures of interest.

Clinical symptom assessment

The Scale for the Assessment of Positive Symptoms (SAPS) [41] and Negative Symptoms (SANS) [42] were administered to patients at baseline (TBaseline) and upon completion of the study (TPost). As TCT provides auditory-based cognitive training exercises, and because our previous reports showed significant reductions in auditory hallucinations in this cohort [15], the SAPS global hallucination severity rating raw score was the primary measure of interest.

MMN and P3a

All patients underwent EEG recording at baseline (TBaseline), after their initial completion of either 1 h of TCT (TCT) or 1 h of computer games (TAU) (TInitial), and upon completion of the study (TPost). Data were collected from 64 channels using a BioSemi ActiveTwo System with a sampling rate of 1 kHz. Eye blinks and horizontal eye movements were recorded using four additional electrodes placed above and below the left eye and at the outer canthi of both eyes.

A passive auditory oddball paradigm consisting of a pseudorandom sequence of tones presented binaurally at 80 dB using foam insert earphones was used to evoke MMN and P3a responses. The standard stimulus (p = 0.90) had a 50 ms duration at 1000 Hz. All deviant stimuli (p = 0.10) were 125 ms in duration and consisted of a duration deviant (1000 Hz), as well as four unique “sweep deviants” presented in the form of a “pitch sweep” akin to the TCT module of the same name in order to increase ecological validity. These “sweep deviants” varied in terms of starting at the standard tone (1000 Hz) or a deviant tone (500 or 1500 Hz), and the direction of the change in pitch (up vs. down) across the sweep. A minimum of 400 total deviant trials were collected for each participant. All tones had a 5 ms rise/fall and a 500 ms stimulus onset asynchrony. Participants were instructed to ignore the stimuli while viewing a silent movie.
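A stimulus sequence with these proportions can be sketched as follows. The pseudorandomization constraints are not described in the text, so two assumptions are made for illustration: deviants are split evenly across the five deviant types, and every deviant is preceded by at least one standard. The deviant labels are hypothetical placeholders.

```python
import random

# Hypothetical labels: one duration deviant plus four sweep deviants.
DEVIANTS = ["duration", "sweep_1", "sweep_2", "sweep_3", "sweep_4"]


def make_oddball_sequence(n_trials=4000, p_deviant=0.10, seed=0):
    """Return a list of trial labels for a passive oddball run."""
    rng = random.Random(seed)
    n_dev = round(n_trials * p_deviant)
    n_std = n_trials - n_dev

    # Assumption: deviant types are balanced as evenly as possible.
    deviants = [DEVIANTS[i % len(DEVIANTS)] for i in range(n_dev)]
    rng.shuffle(deviants)

    # Assumption: at least one standard precedes each deviant. The remaining
    # standards are dropped at random into the gaps (including a trailing gap).
    gaps = [1] * n_dev + [0]
    for _ in range(n_std - n_dev):
        gaps[rng.randrange(len(gaps))] += 1

    sequence = []
    for gap, deviant in zip(gaps, deviants):
        sequence.extend(["standard"] * gap)
        sequence.append(deviant)
    sequence.extend(["standard"] * gaps[-1])
    return sequence
```

With p = 0.10 for deviants, the stated minimum of 400 deviant trials implies at least 4000 trials in total; at a 500 ms stimulus onset asynchrony this corresponds to 2000 s (about 33 min) of stimulation.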

Eye movement artifacts were removed from continuous files via independent component analysis. Following data segmentation, removal of residual artifacts exceeding ±50 μV was performed. Standard and deviant average waveforms were created, as were deviant minus standard difference waves for measuring MMN and P3a. All waveforms were calculated from electrode Fz across a 500 ms epoch with a 100 ms baseline. Filtering consisted of 0.1 Hz low-cutoff and 20 Hz high-cutoff (12 dB/oct). A 100–200 ms time-window was utilized for MMN peak identification, while 200–300 ms was utilized for P3a peak identification. Mean amplitude was calculated by averaging the activity in a 50 ms window centered on the identified peaks.
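The difference-wave and peak-centered mean amplitude measures described above can be sketched in a few lines. This is a simplified illustration rather than the processing pipeline used in the study: artifact removal and filtering are omitted, the epoch is assumed to span −100 to 399 ms at 1 kHz (one sample per ms), the 50 ms averaging window is implemented as ±25 ms around the peak, and the waveforms are synthetic Gaussian bumps rather than recorded data.

```python
import math


def difference_wave(deviant, standard):
    """Deviant-minus-standard difference wave, sample by sample."""
    return [d - s for d, s in zip(deviant, standard)]


def peak_and_mean_amplitude(wave, times_ms, window, polarity, half_width_ms=25):
    """Find the peak inside `window`, then average the wave within
    +/- half_width_ms of that peak. Returns (peak latency, mean amplitude)."""
    lo, hi = window
    candidates = [i for i, t in enumerate(times_ms) if lo <= t <= hi]
    pick = min if polarity == "negative" else max   # MMN is negative, P3a positive
    peak_i = pick(candidates, key=lambda i: wave[i])
    peak_t = times_ms[peak_i]
    around = [v for v, t in zip(wave, times_ms) if abs(t - peak_t) <= half_width_ms]
    return peak_t, sum(around) / len(around)


def synthetic_epoch():
    """Toy averaged waveforms in microvolts (not real data)."""
    times = list(range(-100, 400))

    def bump(t, centre, amp):
        return amp * math.exp(-((t - centre) ** 2) / (2 * 15 ** 2))

    standard = [0.0 for _ in times]
    deviant = [bump(t, 150, -2.0) + bump(t, 250, 1.5) for t in times]
    return times, standard, deviant


times, standard, deviant = synthetic_epoch()
diff = difference_wave(deviant, standard)
mmn_latency, mmn_amp = peak_and_mean_amplitude(diff, times, (100, 200), "negative")
p3a_latency, p3a_amp = peak_and_mean_amplitude(diff, times, (200, 300), "positive")
```

Averaging around the identified peak, rather than reading off the single peak sample, makes the amplitude estimate less sensitive to residual high-frequency noise.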

Statistical analyses

Addressing our first goal of evaluating MMN and P3a malleability secondary to TCT, changes in activity over time were examined using mixed-effects models [43] for each dependent variable (MMN and P3a amplitudes and latencies). Each model was coded using the ‘lme4’ package in R [44] and included time (TBaseline, TInitial, and TPost) and stimulus type (duration deviant, sweep deviant) as fixed factors, as well as a random effect of time and stimulus type nested within subjects. Follow-up analyses consisted of a direct comparison of ERP activity at each time-point using paired-samples t-tests with a Bonferroni correction to control for multiple comparisons [45].
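The follow-up comparisons can be illustrated with a minimal paired-samples t statistic and a Bonferroni-adjusted significance threshold. This sketch does not reproduce the mixed-effects models, which were fitted in R. The number of comparisons in the correction family is not stated in the text, so the value of three used in the example (one per pair of time points) is an assumption, and the data are made-up numbers.

```python
import math
import statistics


def paired_t(x, y):
    """Paired-samples t statistic for x minus y.

    Returns (t, degrees of freedom, mean difference)."""
    if len(x) != len(y):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / standard_error, n - 1, mean_diff


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_comparisons


# Made-up amplitudes for four participants at two time points.
t_stat, df, mean_diff = paired_t([1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 1.0, 2.0])
threshold = bonferroni_alpha(0.05, 3)  # assumed family of three pairwise tests
```

The p-value for a given t and df would come from the t distribution in a statistics package and be compared against the adjusted threshold.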

In order to address our second goal regarding the predictive utility of MMN and P3a, difference scores in cognition and clinical symptoms were computed by subtracting baseline (TBaseline) from post-TCT (TPost) treatment values. Following our established methods (see ref. [37]), early ERP malleability was calculated by subtracting MMN and P3a in the session recorded immediately before (TBaseline) from the recording that immediately followed the initial 1-h exposure to TCT (TInitial). Multiple regression was used to predict change in MCCB verbal learning t-scores, and change in SAPS global hallucination severity scores, from the early malleability in MMN and P3a mean amplitude and peak latency across both classes of deviant stimulus type. Outliers were defined as scores that were outside 1.5 times the interquartile range (IQR) [45, 46], identifying three outliers in the TAU group. Results were comparable regardless of outlier inclusion; as such, a conservative approach was used whereby outliers were removed from the final analyses. Given the non-equivalence of treatment groups across time points (due to the removal of the TAU group’s active control for the duration of TInitial to TPost), separate regression models were used for TCT and TAU groups. Effect sizes for regression analyses were quantified using standardized regression coefficients (β: small = 0.02, medium = 0.25, large = 0.40).
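The change scores, outlier screen, and regression described above can be sketched as follows. Several choices here are assumptions made for illustration: the outlier rule is read as the conventional Tukey fences (below Q1 − 1.5 × IQR or above Q3 + 1.5 × IQR), quartiles use the default method of Python's statistics.quantiles, and the regression is shown with a single predictor, in which case the standardized coefficient β equals the Pearson correlation and r² = β².

```python
import math
import statistics


def change_scores(later, earlier):
    """Per-participant change, e.g. TInitial minus TBaseline or TPost minus TBaseline."""
    return [b - a for a, b in zip(earlier, later)]


def iqr_outlier_mask(values, k=1.5):
    """True for values beyond k * IQR outside the quartiles (Tukey fences, assumed)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]


def simple_regression(x, y):
    """Ordinary least squares with one predictor.

    Returns the unstandardized slope b, the intercept, the standardized
    coefficient beta (equal to Pearson r here), and r squared."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    r = sxy / math.sqrt(sxx * syy)
    return {"b": slope, "intercept": mean_y - slope * mean_x, "beta": r, "r2": r * r}


# Made-up values: early ERP malleability (x) and change in outcome (y).
malleability = change_scores(later=[3.0, 5.0, 6.0, 9.0, 9.0], earlier=[2.0, 3.0, 3.0, 5.0, 4.0])
outcome_change = [2.0, 4.0, 6.0, 8.0, 10.0]
fit = simple_regression(malleability, outcome_change)
```

In this toy example the outcome change is exactly twice the malleability score, so the fit is perfect; real data would of course show scatter around the line.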

Results

ERP malleability secondary to TCT

A significant main effect of time was observed on MMN amplitude (F [2,27.77] = 4.00, p = 0.030) whereby amplitude became smaller (i.e., more positive) over time, consistent with our previous findings [37]. Specifically, baseline (TBaseline) amplitude was significantly larger than one-hour post TCT (TInitial) (t[20] = −2.49, p = 0.022, Diff = −0.27, 95% CIDiff [−0.50, −0.04]) and at completion of the study (TPost) (t[14] = −3.31, p = 0.005, Diff = −0.42, 95% CIDiff [−0.69, −0.15]). Interestingly, there were no significant differences in mean amplitude between TInitial and TPost (t[14] = −0.086, p = 0.93, Diff = −0.016, 95% CIDiff [−0.42, 0.39]). No significant time by stimulus type interaction was present (F [2,56.29] = 0.23, p = 0.80). Significant reductions in MMN latency over time were detected (F [2,23.49] = 4.02, p = 0.032). Latency at baseline (TBaseline) was significantly larger than one-hour post TCT (TInitial) (t[20] = 2.47, p = 0.023, Diff = 8.67, 95% CIDiff [1.33, 16.01]), and marginally larger than at the completion of the study (TPost) (t[14] = 2.01, p = 0.06, Diff = 8.07, 95% CIDiff [−0.55, 16.69]). There were no significant latency differences between TInitial and TPost (t[14] = −0.55, p = 0.59, Diff = −2.77, 95% CIDiff [−13.61, 8.07]). No stimulus type (duration vs. sweep) by time interaction was present (F [2,56.27] = 1.53, p = 0.23).

A significant main effect of time was detected for P3a amplitude (F [2,18.77] = 4.92, p = 0.019) whereby amplitude increased (i.e., more positive) over time. Although amplitude at baseline (TBaseline) was not significantly smaller than one-hour post TCT (TInitial) (t[20] = −1.33, p = 0.20, Diff = −0.28, 95% CIDiff [−0.72, 0.16]), it was significantly smaller than at the completion of the study (TPost) (t[14] = −3.76, p = 0.002, Diff = −0.77, 95% CIDiff [−1.21, −0.33]). As with MMN amplitude, no significant differences in mean amplitude were seen between TInitial and TPost (t[14] = −0.58, p = 0.57, x̄Diff = −0.17, 95% CIDiff [−0.79, 0.46]), no significant time by stimulus interaction was present (F [2,53.10] = 0.49, p = 0.62), and amplitude for both deviant stimulus types increased over time. The impact of time on P3a peak latency was non-significant (F [2,16.45] = 0.12, p = 0.89), as was the interaction between stimulus type and time (F [2,40.41] = 1.84, p = 0.17). See Fig. 1 for grand average ERP waveforms.

Fig. 1

Grand average of ERP data across patient groups (TCT, TAU) and stimulus type (duration deviant, sweep deviant). Amplitude is measured in μV, time in ms

Predictors of cognitive and clinical changes

As shown in Fig. 2, malleability of duration MMN peak latency significantly predicted change in MCCB verbal learning t-scores, such that initial reductions in latency predicted improvements in verbal learning in the TCT (r² = 0.35, β = −0.59, F [1,14] = 6.89, p = 0.021, 95% CIB [−0.16, −0.02]) but not TAU group (r² = 0.012, β = 0.11, F [1,15] = 0.17, p = 0.69, 95% CIB [−0.11, 0.17]). In addition, as shown in Fig. 3, initial malleability of duration deviant P3a mean amplitude was significantly associated with post-treatment improvements in MCCB verbal learning, such that increases in amplitude predicted improvements in verbal learning t-scores in the TCT (r² = 0.34, β = 0.58, F [1,14] = 6.56, p = 0.024, 95% CIB [0.37, 4.32]) but not TAU group (r² = 0.006, β = −0.079, F [1,14] = 0.081, p = 0.78, 95% CIB [−3.12, 2.39]). All other models predicting change in cognition from MMN or P3a amplitude or peak latency were non-significant (p > 0.05).

Fig. 2

Change (TPost − TBaseline) in MCCB verbal learning t-score predicted by the malleability (TInitial − TBaseline) in duration deviant MMN peak latency (ms) for TCT and TAU groups

Fig. 3

(a) Change (TPost − TBaseline) in MCCB verbal learning t-score predicted by the malleability (TInitial − TBaseline) in duration deviant P3a mean amplitude (µV) for TCT and TAU groups. (b) Change (TPost − TBaseline) in SAPS Global Hallucination Severity raw score predicted by the malleability (TInitial − TBaseline) in sweep deviant P3a mean amplitude (µV) for TCT and TAU groups

A significant relationship between malleability of sweep deviant P3a amplitude and SAPS global hallucination severity score was detected, such that increases in P3a amplitude predicted decreases in positive symptom severity in the TCT (r² = 0.30, β = −0.54, F [1,14] = 5.44, p = 0.04, 95% CIB [−2.86, −0.11]) but not TAU group (r² = 0.075, β = −0.27, F [1,13] = 0.98, p = 0.34, 95% CIB [−0.75, 0.28]). Although non-significant, a trend of increases in duration deviant P3a mean amplitude predicting improvements in SAPS global hallucination severity score was observed (r² = 0.22, β = −0.47, F [1,14] = 3.69, p = 0.08, 95% CIB [−1.54, 0.091]). All other models predicting change in symptoms from MMN or P3a amplitude or peak latency failed to reach statistical significance (p > 0.05).

Discussion

While TCT has demonstrated efficacy for improving verbal learning and clinical symptoms in previous studies, some patients fail to show meaningful benefits from this intensive intervention and there are no established biomarkers that can be used to predict patient outcome. Previous findings in this cohort demonstrated that even severely impaired patients with acute symptoms exhibit significant improvements in verbal learning and auditory hallucinations secondary to TCT [15]. The current data establish that adding high-density EEG is feasible and well-tolerated in the context of a community-based inpatient clinical intervention study. Further, changes in these functional EEG biomarkers of auditory system target engagement predicted improvements in the key clinical outcomes from TCT. Specifically, the degree of malleability in MMN latency and P3a amplitude accounted for approximately 34–35% of the variance in verbal learning improvement and up to 30% of the variance in positive symptom severity reduction in patients who underwent TCT.

Malleability of MMN and P3a secondary to TCT

MMN and P3a are robust and reliable biomarkers that are sensitive to interventions in people with schizophrenia [25, 37, 47]. These measures are also established as indices of neuroplasticity via N-methyl-d-aspartate receptor (NMDAR) functioning [21, 27, 47,48,49,50,51,52]; TCT-driven cognitive gains have been shown to be mediated by auditory system neuroplasticity in patients with schizophrenia [9, 37]. Consistent with these models and our prior findings [37], MMN latency and amplitude, as well as P3a amplitude, evidenced malleability secondary to TCT, with the greatest changes occurring after the initial 1-h dose. This suggests that ERP sensitivity is evident after the 1st hour of exposure to TCT exercises, substantially preceding gains in the desired targeted outcomes after 30 h of treatment. It is possible that this malleability of MMN and P3a reflects an enhancement of the neural circuits underlying EAIP via the neuroplastic mechanisms of TCT [12, 29, 37, 53]. MMN and P3a thus represent sensitive indicators of both auditory-system target engagement [8, 21, 51, 54] and neuroplastic changes induced by TCT [37].

MMN and P3a as biomarkers of TCT outcome

In addition to extending our previous finding of single-dose malleability of MMN and P3a following acute exposure to TCT in a distinct cohort of schizophrenia outpatients, the current longitudinal study demonstrates that the magnitude of these initial changes predicted improvements in key patient outcomes following 30 h of TCT in treatment-refractory inpatients. The malleability of MMN latency and P3a amplitude predicted improvements in verbal learning, while change in P3a amplitude predicted decreases in positive symptom severity. Moreover, these changes in MMN/P3a at 1 h preceded changes in cognitive function and symptom severity measured after 30 h, 3 months later. Collectively, these findings exhibited a pattern whereby malleability of ERP amplitude and latency in the direction of higher functionality predicts greater TCT-related treatment gains. Each patient’s initial ERP malleability after 1 h of TCT could thus be used as a biomarker to predict meaningful patient outcomes in key, targeted outcome measures following 30 h of TCT [8, 37, 47].

To our knowledge, the current data represent the first to establish the utility of using indicators of neural plasticity after an initial dose of TCT to predict full-course TCT-driven improvements in cognition and clinical symptoms in schizophrenia patients. This finding is consistent with a body of research supporting a critical connection between MMN and P3a, TCT, and EAIP [6, 14], as well as research demonstrating the importance of neuroplasticity in TCT-driven gains in cognition [8, 20, 47]. We thus propose a model where patient sensitivity to TCT indexed by MMN and P3a malleability [21, 51, 53] is acted on via TCT-driven improvements in the neural substrates of EAIP [8, 20, 47]. Capacity for TCT-related gain (i.e. improvements in cognition and symptoms) is thus moderated by this initial MMN and P3a malleability—with greater malleability resulting in greater efficacy of EAIP changes secondary to the initial dose of TCT.

Limitations

The current findings should be interpreted in light of some limitations. Anticipated limitations in the sample size did not allow for stratification across education level, baseline cognitive function, or symptom severity. However, post hoc analyses did not indicate significant differences across treatment groups on any of these variables (see Table 1; [15]). The participants in the current study were treatment refractory patients recruited from a community-based inpatient transitional care facility. Although the current findings could have been attenuated given the higher symptom acuity and severity [55], our recent report highlights the effectiveness of TCT in this cohort, and demonstrates that even treatment refractory patients can benefit from cognitive remediation (e.g., ref. [15]). Moreover, inclusion of this population increases the generalizability of these findings to populations that are traditionally underrepresented in research due to limitations imposed by their illness severity. In addition, participant attrition was higher in patients randomized to TCT compared to TAU, although this difference was not statistically significant; no significant clinical or demographic predictors of attrition could be identified [15]. It is likely that attrition was attributable to the higher demands required for participation in the daily TCT cognitive training exercises rather than adverse outcomes from the intervention itself, as discussed in Thomas et al. [15].

Future directions

The current biomarkers were examined cross-sectionally (i.e., pre-treatment and post-treatment) rather than continuously across each training session. Although patient response to TCT in the current sample was not associated with dose effects (i.e. total number of hours completed) [15], future research monitoring ongoing patient response in order to determine the optimal titration of TCT would be beneficial (see Fig. 4). Examination of possible changes in the underlying neural substrates resulting in cortical source redistribution is of particular interest given counterintuitive decreases in MMN response [37]. Likewise, other ERP [56] or oscillatory measures [57,58,59,60] may also hold promise for predicting/monitoring response to TCT and other procognitive interventions. Although the durability of cognitive improvement several months following completion of the full course of TCT is well characterized [19], more detailed longitudinal studies examining the stability of neurophysiologic changes would be beneficial, especially in this “treatment refractory” population. Finally, preliminary analyses attempting to derive clinically relevant change thresholds suggest that any malleability reflecting improvement in these biomarkers could be used to identify patients who will benefit from TCT; however, future research is needed in order to determine cutoff scores for prediction of a clinically meaningful change in both the proposed ERP biomarkers and the outcome measures.

Fig. 4

Decision tree for evaluating patients based on their unique ERP biomarker presentation.

Conclusion

These findings represent a pivotal juncture for the development of procognitive therapeutics by establishing that neurophysiologic changes following initial exposure to treatment robustly predicted therapeutic outcomes in two critical clinical domains after 30 h of treatment administered over ~3 months. Not only are MMN and P3a well validated and reliable markers of auditory system function, but the current findings lend additional support for future examinations of EEG malleability in order to gauge patient response to TCT in routine clinical practice, even for treatment-refractory patients [15, 30]. Moreover, with their significant predictive utility, the use of MMN and P3a malleability as biomarkers of TCT outcome advances the precision medicine approach towards finding the “right treatment, for the right person, at the right time” by improving patient assignment and thereby outcome.

Funding and disclosures

Research reported in this publication was supported by the Department of Veterans Affairs Office of Academic Affiliations Advanced Fellowship Program in Mental Illness Research and Treatment, the Medical Research Service of the Veterans Affairs San Diego Health Care System, the Department of Veterans Affairs Desert-Pacific VISN-22 Mental Illness Research, Education, and Clinical Center (MIRECC), the Sidney R. Baer, Jr. Foundation, the Brain and Behavior Research Foundation, and National Institute of Mental Health of the National Institutes of Health (K23 H102420). Dr. Light reports having been a consultant to Astellas, Boehringer-Ingelheim, Dart Neuroscience, Heptares, Lundbeck, Merck, NeuroSig, Neuroverse, and Takeda. The funding organizations had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; or decision to submit the manuscript for publication. This study was supported by Grant Numbers MH59803 and MH94320. There are no competing financial interests in relation to the work described, and the authors have no conflicts of interest to declare.