Introduction

Bipolar disorder is defined primarily by alterations in behavior and affect, but multiple cognitive deficits have been documented in patients with bipolar disorder as well, including impairments of attention and working memory [1, 2]. Since the onset of bipolar disorder most commonly occurs during adolescence, studying the neurobiology of adolescents with bipolar disorder who have limited treatment exposure may provide important insights into illness mechanisms and the impact of treatment on reducing brain alterations associated with the disorder [3, 4].

Disruptions in cognition and affect modulation are two cardinal features of bipolar disorder [1, 2, 5,6,7]. Affective responses are mediated primarily by the limbic system, with those responses moderated by emotion regulation circuitry that includes a loop from ventral frontal cortex to striatum, thalamus and then back to frontal cortex. Disturbances in limbic and emotion regulation circuitry are central components of the neurobiology of bipolar disorder [8, 9]. Disturbances of cognition are also well established in adult and pediatric bipolar disorder, and are believed to represent both intrinsic aspects of the illness as well as secondary consequences of alterations in affective responsivity and its modulation [2, 10, 11].

Functional imaging is a promising approach for studying the disturbances in regional brain function that contribute to affective and cognitive problems in bipolar disorder, and for determining how different effective drug therapies reduce those abnormalities. Studies in this area, especially those testing for drug treatment effects, are limited by small sample sizes, ongoing drug treatments at the time of randomization, few direct comparisons of drug treatments with different mechanisms of action, testing for drug effects on brain function only at the end of trials rather than also examining early drug effects, and few tests of the degree to which baseline and early drug effects on brain function predict clinical outcomes at the end of trials [12,13,14]. Studies of youth are important not only for identifying potentially unique features of illness in younger individuals, but also because confounding effects of long-term treatment and illness are limited.

Lithium has been a mainstay of bipolar disorder pharmacotherapy for acute mania and is approved by the FDA for the treatment of youth with bipolar disorder [15, 16]. Quetiapine, a mixed dopamine-serotonin receptor antagonist, is also efficacious for manic episodes in youth with bipolar disorder [17]. Prior structural imaging studies have shown that lithium therapy is associated with increased volumes in areas important for mood regulation, while antipsychotic agents generally are not [18]. However, few neuroimaging studies have directly compared the effects of lithium and second-generation antipsychotics in the treatment of acute mania to understand their systems-level impact on the brain connectome [19].

The present study tested for the effects on brain function of acute treatment with lithium or quetiapine for manic or mixed episodes in youth with bipolar disorder. Considering established problems of attention during episodes of illness and connections between attentional and emotional brain regions within the limbic system and emotion regulation circuitry, well characterized attentional tasks are useful to probe for abnormal activation of reciprocal emotional reactivity and top-down compensatory factors involving attentional control. Therefore, we used CPT-IP, a task of sustained attention [20], to examine changes in brain activity during functional magnetic resonance imaging (fMRI) studies. In this randomized, double-blind study, scans were performed pretreatment, and 1 and 6 weeks after treatment initiation in patients, and at parallel intervals in healthy individuals. Since the terminal half-life of quetiapine is much shorter than lithium [21, 22], we hypothesized that (1) quetiapine would lead to a faster normalization of regional brain activation toward that of healthy individuals in the limbic system and emotion regulation circuitry, and (2) altered regional activation within the limbic system and emotion regulation circuitry at the baseline and a shift toward normalization of those changes at 1-week would predict clinical treatment outcome at week six.

Methods

Participants

The University of Cincinnati and the Cincinnati Children’s Hospital Medical Center Institutional Review Boards approved this study. All study participants and their legal guardians provided written informed assent and consent, respectively. Youth (ages 10–17 years) with a DSM-IV diagnosis of bipolar I disorder, during their first hospitalization for a manic or mixed episode, or who had been recently diagnosed in an outpatient setting (n = 107) were recruited from Cincinnati Children’s Hospital Medical Center and the University of Cincinnati Medical Center. Typically developing healthy youth (n = 60) were recruited as a comparison group from the communities in which participants with bipolar disorder resided. Healthy participants had no personal history of mood or psychotic disorders, and no known history of these conditions in their first- or second-degree relatives. Baseline evaluations included the Crovitz Handedness Questionnaire [23], the Duke Tanner Stage Self-assessment [24], the Hollingshead Four-Factor Index of Social Status (SES-Child) [25], the Children’s Depression Rating Scale-Revised (CDRS-R) [26], the Young Mania Rating Scale (YMRS), a review of prior psychiatric and medical treatments, physical exam, vital signs, and laboratory tests including a urine β-HCG and toxicology screen to assess for pregnancy and drug use, respectively. Diagnoses of bipolar I disorder and the absence of Axis I disorders in controls were confirmed by trained raters with established diagnostic reliability (κ > 0.9) via administration of the Washington University in St. Louis Kiddie Schedule for Affective Disorders and Schizophrenia (WASH-U-KSADS) [27].
Participants with bipolar disorder met the following criteria: (1) baseline YMRS scores >20; (2) less than 2 years from onset of bipolar disorder, defined by age at onset of first DSM-IV-TR affective episode (mania, hypomania, depression or mixed); (3) no more than 3 months of lifetime psychotropic medication exposure (with the exception of psychostimulants) and no active psychotropic medication for at least one week before baseline MR scans (72 h for psychostimulants and benzodiazepines). No subjects were taken off medications for the purpose of study participation; all patients were already unmedicated at enrollment. The following inclusion and exclusion criteria were applied to both groups. Inclusion criteria: (1) age 10–18 years and Tanner stage II-V [24]; (2) no lifetime DSM-IV-TR substance use disorder; (3) fluent in English. Exclusion criteria: (1) contraindication to MRI scanning (e.g., braces or claustrophobia); (2) IQ < 70, as determined by the Wechsler Abbreviated Scale of Intelligence [28]; (3) a positive pregnancy test; and (4) a history of major systemic or neurological illness, or an episode of loss of consciousness >10 min.

Treatment procedures

Following baseline clinical evaluation and scanning, adolescents with bipolar disorder were randomized to double-dummy, double-blind treatment with quetiapine or lithium and evaluated weekly for six weeks [29]. The randomization schedule was stratified by presence vs. absence of ADHD, presence vs. absence of psychosis, and mood state (i.e., a mixed vs. manic episode). Acute treatment outcome was assessed using scores from the YMRS [29].

Data acquisition

All subjects were scanned at the local Center for Imaging Research (CIR) using a 4.0 Tesla (4T) Varian Unity INOVA MRI scanner. All participants (patients and controls) were scanned without sedation at baseline and again after 1 and 6 weeks. Padding was used to minimize head movement. Following a scout image, the shim procedure FASTMAP (Fast Automatic Shimming Technique by Mapping Along Projections) was performed to generate a homogeneous magnetic field [30]. An anatomical T1-weighted 3-D brain scan was obtained using a modified driven equilibrium Fourier transform (MDEFT) sequence (TMD = 1.1 s, TR = 13 ms, TE = 6 ms, FOV = 256 × 256 × 192 mm, matrix 256 × 256 × 192 pixels, flip angle = 20 degrees) [31]. A midsagittal localizer scan was used to place 50 contiguous 4 mm coronal slices that covered the entire brain [31] for an fMRI session during which participants performed a CPT-IP task. Functional images were acquired using a T2*-weighted gradient-echo EPI pulse sequence (TR/TE = 3000/30 ms, FOV = 20.8 × 20.8 cm, matrix 64 × 64 pixels, slice thickness = 4 mm, flip angle = 75 degrees). Data from the first two volume acquisitions of each run were discarded during post-processing to avoid non-equilibrium intensity modulation effects.

Continuous performance task-identical pairs version (CPT-IP)

A CPT-IP task was designed using E-Prime software (Psychology Software Tools, Pittsburgh, PA, USA) to explore the integrity of brain substrates supporting sustained attention [20]. Supplementary Figure S1 depicts the task schematic for the CPT-IP. During active blocks of the task, subjects were presented with a series of single-digit numbers and asked to respond by dominant thumb button press when the same number occurred twice consecutively. Control task blocks consisted of the number ‘1’ presented at the same rate as the active task, during which subjects were asked to press the response button for the first five stimuli and then watch the remainder of the control task presentation without responding. Active and control tasks were presented in alternating blocks of 30 s each, with numbers being presented for 700 ms at 750 ms intervals (i.e., a 50 ms gap between stimulus presentations, for a total of 40 numbers/block). Each epoch/block was preceded by a 2-s warning, notifying the participant of the upcoming task. There were five stimuli requiring positive responses per active epoch. Eleven blocks (six control and five active) were obtained for analysis during each scan session. Responses to targets were recorded to calculate response parameters, including sensitivity, response bias, and reaction time.
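As a rough illustration of the response parameters listed above, the sketch below computes a signal-detection sensitivity index (d′) and a response bias index (criterion c) from hit and false-alarm counts. The choice of these two indices, the log-linear correction, and the function name `dprime_and_bias` are assumptions made for illustration; the text does not state which formulas were used. The constants come from the task description, and the response counts in the usage lines are hypothetical.

```python
from statistics import NormalDist

# Task arithmetic taken from the description above.
STIMULI_PER_BLOCK = 40
STIMULUS_INTERVAL_MS = 750
ACTIVE_BLOCKS = 5
TARGETS_PER_ACTIVE_BLOCK = 5

BLOCK_DURATION_S = STIMULI_PER_BLOCK * STIMULUS_INTERVAL_MS / 1000  # 30.0 s
TOTAL_TARGETS = ACTIVE_BLOCKS * TARGETS_PER_ACTIVE_BLOCK            # 25
TOTAL_NONTARGETS = ACTIVE_BLOCKS * STIMULI_PER_BLOCK - TOTAL_TARGETS  # 175


def dprime_and_bias(hits, misses, false_alarms, correct_rejections):
    """Return (d_prime, criterion_c) for one session.

    Assumed convention: a log-linear correction (add 0.5 to each count and
    1 to each total) keeps the z-scores finite when a rate is 0 or 1.
    """
    z = NormalDist().inv_cdf
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    d_prime = z(hit_rate) - z(fa_rate)
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))
    return d_prime, criterion


# Hypothetical participant: 20 of 25 targets detected, 10 false alarms.
d_prime, criterion = dprime_and_bias(20, 5, 10, 165)
print(f"d' = {d_prime:.2f}, c = {criterion:.2f}")
```

Under this convention a larger d′ means better discrimination of targets from non-targets, and a positive criterion indicates a conservative tendency to withhold responses.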

Data preprocessing

Preprocessing of fMRI data was performed using Statistical Parametric Mapping (SPM12, The Wellcome Center for Human Neuroimaging, London, UK) and the SPM-based Conn Toolbox 2018b (McGovern Institute for Brain Research, MIT, Cambridge, MA) [32], running in MATLAB (The MathWorks Inc.; MA, USA). Processing steps included slice-timing correction, realignment and unwarping (subject motion estimation and correction), coregistration to individual T1-weighted images, spatial normalization to Montreal Neurological Institute (MNI) space, spatial smoothing (8-mm Gaussian kernel), and denoising, which included component-based noise correction with aCompCor (white matter and cerebrospinal fluid regions of interest [ROIs], five components each) [33], Artifact Detection Tools (ART)-based scrubbing [34], motion regression (12 regressors: six motion parameters plus six first-order temporal derivatives) and band-pass filtering (0.009 < f < 0.08 Hz). Participants with severe head motion were excluded from further analysis based on the following criteria: mean frame-wise displacement (FD) > 0.2 mm or any FD > 5 mm [35]. Two patients in each treatment group and one healthy individual were excluded.
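The head-motion exclusion rule stated above can be sketched as a small check on a per-volume FD trace. This is a minimal illustration, not the authors' code: the function name `exclude_for_motion` and the example traces are hypothetical, and the sketch assumes FD values have already been computed in millimetres.

```python
def exclude_for_motion(fd_mm, mean_limit=0.2, max_limit=5.0):
    """Return True if a scan meets either stated exclusion criterion.

    fd_mm      : list of frame-wise displacement values (mm), one per volume
    mean_limit : exclude when mean FD exceeds this value (0.2 mm in the text)
    max_limit  : exclude when any single FD exceeds this value (5 mm in the text)
    """
    if not fd_mm:
        raise ValueError("fd_mm must contain at least one value")
    mean_fd = sum(fd_mm) / len(fd_mm)
    return mean_fd > mean_limit or max(fd_mm) > max_limit


# Hypothetical FD traces for three scans.
steady_scan = [0.08, 0.11, 0.09, 0.10]
restless_scan = [0.25, 0.31, 0.28, 0.22]
single_spike_scan = [0.01] * 99 + [6.0]

for name, trace in [("steady", steady_scan),
                    ("restless", restless_scan),
                    ("single spike", single_spike_scan)]:
    print(name, "-> exclude" if exclude_for_motion(trace) else "-> keep")
```

The third trace shows why both criteria are needed: its mean FD is low, but one large displacement still triggers exclusion.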

As in our prior studies [8, 9, 36], we tested for activation effects (increase in activation in task relative to control condition) in 20 ROIs within the limbic system and emotion regulation circuitry, including six subcortical regions (amygdala, thalamus, putamen, globus pallidus, caudate and nucleus accumbens) and four cortical regions (pregenual anterior cingulate cortex [ACC], subgenual ACC, ventrolateral prefrontal cortex [PFC] and ventromedial PFC) in each hemisphere. ROI parcellations were determined using a cross-validated functional atlas (Brainnetome Atlas) [37], with ROI masks illustrated in Supplementary Fig. S2. Individual ROI masks were created using the WFU_PickAtlas MATLAB toolbox (https://www.nitrc.org/projects/wfu_pickatlas). We used the MarsBaR ROI analysis toolbox (http://marsbar.sourceforge.net/) to extract voxelwise t-values for active-control task differences, which were averaged for each ROI.

Task performance results

Task performance for all participants was reported previously [19]. At all three time points, patients showed worse CPT-IP task performance (i.e., lower percent correct and discriminability) than healthy individuals (p < 0.05). Longitudinal analyses showed no significant differences in change over time on any task performance measure among the healthy, lithium and quetiapine groups (p > 0.05).

Statistical analysis

Statistical analyses were performed using R software v.4.0.0 (https://www.r-project.org/). We analyzed ROI activation data using mixed-effects analysis of covariance (ANCOVA) models with group (lithium, quetiapine and healthy control) and time (baseline, Week one and Week six) as factors for each ROI separately. Age, sex and mean FD were treated as covariates. The false discovery rate method was used to correct for multiple comparisons.
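To make the multiple-comparison step concrete, the sketch below implements the Benjamini-Hochberg step-up adjustment in plain Python. Treating the "false discovery rate method" as Benjamini-Hochberg is an assumption, since the text does not name the procedure, and the p values in the usage lines are hypothetical rather than taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p values for four ROIs.
raw = [0.01, 0.04, 0.03, 0.50]
for p, q in zip(raw, benjamini_hochberg(raw)):
    print(f"raw p = {p:.3f} -> adjusted p = {q:.3f}")
```

An ROI would then be declared significant when its adjusted p value falls below the chosen threshold.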

Our primary hypothesis was that there would be significant group-by-time interactions that followed a prespecified pattern. In order for interaction effects to be considered meaningful, in addition to their significance we required that three further conditions be met. First, the differences between healthy control and patients at baseline had to be significant to demonstrate illness-related regional pathology as a target for treatment effects. Second, we required that the within group longitudinal change for the healthy control group had to be nonsignificant. Third, we required that at least one of the treatment groups (lithium or quetiapine) showed a significant change from baseline in the ROI. As an exploratory analysis, we examined correlations between the activation changes in ROIs and mania symptom changes for each of the two drug treatments using partial correlation analysis separately, correcting for age, sex and mean FD. A treatment-by-time interaction analysis was conducted to test for changes in cognitive task performance over time.
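The prespecified pattern described above amounts to a conjunction of four tests per ROI, which can be sketched as follows. The function name, argument names, the 0.05 threshold and the p values in the usage lines are all hypothetical; the sketch only encodes the logic of the decision rules, not the underlying mixed-effects models.

```python
def meets_decision_rules(interaction_p, baseline_patient_vs_control_p,
                         control_change_p, lithium_change_p,
                         quetiapine_change_p, alpha=0.05):
    """Return True when an ROI satisfies all four stated conditions.

    alpha is an assumed significance threshold.
    """
    return (
        interaction_p < alpha                      # group-by-time interaction
        and baseline_patient_vs_control_p < alpha  # illness-related baseline difference
        and control_change_p >= alpha              # no change in healthy controls
        and (lithium_change_p < alpha              # change in at least one
             or quetiapine_change_p < alpha)       # treatment group
    )


# Hypothetical ROI in which only one treatment group changes over time.
print(meets_decision_rules(0.01, 0.02, 0.60, 0.30, 0.01))  # True
# Hypothetical ROI in which healthy controls also change over time.
print(meets_decision_rules(0.01, 0.02, 0.03, 0.30, 0.01))  # False
```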

Machine learning analysis

Machine learning analysis was performed with a linear support vector machine (SVM) to distinguish treatment responders from non-responders at week six. We separately used baseline activation and changes in activation after one week of treatment (i.e., week one – baseline) in the 20 ROIs as features to train our models. Participants with bipolar disorder who demonstrated a minimum 50% reduction of YMRS scores at endpoint were regarded as responders; those with less improvement at week six were treated as non-responders. We evaluated treatment outcome prediction in the quetiapine and lithium treatment groups separately, given that these two drugs may have different neuropsychopharmacology. Ten-fold stratified cross-validation was performed to split training and testing sets. The sole hyperparameter C of the linear SVM was determined via grid search on a set of values (i.e., [10^−3, 10^−2, 10^−1, 1, 10^1, 10^2, 10^3]), and the grid search was performed using another nested 10-fold stratified cross-validation within the training set. Model performance was examined by average balanced accuracy, sensitivity, specificity and area under the receiver operating characteristic curve (AUC) based on testing sets across 10 folds. Statistical significance was estimated using permutation testing (1000 permutations). The whole training process was performed 1000 times with the labels of subjects (responders vs. non-responders) permuted. The p values were then obtained by dividing the number of times that the permuted version performed better than the original model by the number of permutations. We also examined model transferability between the two medication groups, using a procedure described in detail in our previous study [38].
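The responder definition, balanced accuracy and permutation p value described above can be sketched in plain Python. This is a simplified illustration under stated assumptions: `score_fn` is a hypothetical stand-in for the full nested cross-validated SVM training, which is not implemented here, and the labels and predictions in the usage lines are invented toy data.

```python
import random


def is_responder(baseline_ymrs, endpoint_ymrs):
    """Responder: at least a 50% reduction in YMRS from baseline to endpoint."""
    return (baseline_ymrs - endpoint_ymrs) / baseline_ymrs >= 0.5


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = responder)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = len(y_true) - n_pos
    return (tp / n_pos + tn / n_neg) / 2


def permutation_test(labels, score_fn, n_permutations=1000, seed=0):
    """Return (observed score, p value) from label permutation.

    score_fn(labels) should rerun the whole training and evaluation process
    on the given labels. The p value is the fraction of permutations whose
    score is better than the observed score, as described in the text.
    """
    rng = random.Random(seed)
    observed = score_fn(list(labels))
    better = 0
    for _ in range(n_permutations):
        shuffled = list(labels)
        rng.shuffle(shuffled)
        if score_fn(shuffled) > observed:
            better += 1
    return observed, better / n_permutations


# Toy usage: fixed hypothetical predictions stand in for a trained model.
true_labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
toy_predictions = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
observed, p_value = permutation_test(
    true_labels, lambda labels: balanced_accuracy(labels, toy_predictions))
print(f"balanced accuracy = {observed:.2f}, permutation p = {p_value:.3f}")
```

In a real analysis `score_fn` would retrain the classifier on each permuted label set, so the null distribution reflects the entire pipeline rather than a fixed set of predictions.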

To maintain consistency with the preceding analyses, the model transferability test was implemented using the identical prediction pipeline.

Results

Participant characteristics

There were no significant differences among the lithium, quetiapine and healthy controls in age or sex (Table 1). There were significant differences among the healthy control, lithium and quetiapine treatment groups in terms of IQ (F = 8.27, p < 0.001) and socioeconomic status (SES) (F = 16.20, p < 0.001). Compared with healthy controls, both the lithium and quetiapine treatment groups had lower SES and lower IQ (all p < 0.01) but the two treatment groups did not differ (all p > 0.05). The quetiapine treatment group did not differ from the lithium treatment group in baseline YMRS and CGI scores (all p > 0.05). There was a more rapid reduction in YMRS scores in the quetiapine than lithium group (significant treatment-by-time interaction effect in YMRS scores) (F = 3.54, p = 0.030). Also, at both Week one and Week six, the quetiapine group showed greater reductions in mania symptoms from baseline than the lithium group (Fig. 1, p < 0.05).

Table 1 Demographic and clinical characteristics of participants.
Fig. 1: Longitudinal changes of clinical symptoms in patients with bipolar disorder by treatment group.

Clinical symptoms of patients with bipolar disorder were alleviated over the course of treatment. A–E YMRS total, CDRS total, CGI severity mania, CGI severity depression and CGI severity overall. Asterisks mark cross-sectional differences between the lithium and quetiapine treatment groups in mania symptoms (YMRS and CGI-mania; A, C) at weeks 1 and 6. The longitudinal treatment-by-time interaction effect was significant for YMRS (F = 3.54, p = 0.03). *p < 0.05; **p < 0.01. YMRS Young Mania Rating Scale, CDRS Children’s Depression Rating Scale, CGI Clinical Global Impression.

Test-retest reliability of task activation

Test-retest reliability of task activation in the healthy controls was analyzed using the mean beta coefficients of task activation across all ROIs at the three time points. The mean ICC across all ROIs was 0.5, which indicates moderately high and significant test-retest reliability of neural responses to task demands in healthy controls [39].

Influence of treatment methods on brain activation

Three regions demonstrated a differential change over time between groups (significant group-by-time interaction) and also met all other decision rules for identifying differential change over time across participant groups noted above: left amygdala, right putamen and right globus pallidus (Fig. 2, Table 2).

Fig. 2: Longitudinal changes of task-related activation and its correlation with mania symptom changes.

A The treatment-by-time interaction effects were significant in the left amygdala (F = 3.56, p = 0.0074); B right putamen (F = 4.76, p < 0.001); C right globus pallidus (F = 4.13, p = 0.0029); D Correlation between changes in mania symptoms and changes in regional activation after drug treatment. The 0-W6 changes in activation in the right putamen were positively correlated with the 0-W6 changes in YMRS in the quetiapine treatment group (r = 0.29, p = 0.040) but not in the lithium treatment group (r = −0.018, p = 0.92). 0-W6 from baseline to Week six, L left, R right.

Table 2 Influence of treatment methods on brain activation.

For the left amygdala (Fig. 2A), the overall group-by-time interaction was significant (F = 3.56, p = 0.007). The patient group showed reduced task-related activation (F = 6.94, p = 0.009) compared with the control group at baseline. Within group analysis showed that only in the quetiapine treatment group, but not in the healthy or lithium groups, there was a significant increase in activation over time (t = 2.50, p = 0.014).

For the right putamen (Fig. 2B), the overall group-by-time interaction was significant (F = 4.76, p < 0.001). The patient group showed decreased activation (F = 6.48, p = 0.012) compared with the controls at baseline. The within group analysis showed increasing activation over the course of the study in both the lithium (t = 2.52, p = 0.015) and quetiapine treatment group (t = 4.73, p < 0.001), but not in the control group.

For the right globus pallidus (Fig. 2C), the overall group-by-time interaction was significant (F = 4.13, p = 0.003). The patient group showed decreased activation (F = 6.27, p = 0.013) compared with the healthy controls at baseline. The within group analysis showed that in the quetiapine treatment group (t = 3.91, p = 0.002), but not in the lithium or healthy group, there was a significant increase in activation over the six-week trial.

The treatment-by-time interaction for task performance was not significant (p > 0.05), which may indicate that the drug treatments did not lead to differential change in task performance. We tested for baseline differences among the lithium, quetiapine and healthy control groups using F tests. There were significant baseline differences among the three groups in the left amygdala (F = 6.19, p = 0.002), bilateral putamen (Left: F = 4.13, p = 0.018; Right: F = 3.87, p = 0.023) and bilateral globus pallidus (Left: F = 3.63, p = 0.029; Right: F = 4.42, p = 0.014). Post hoc analyses showed that the lithium treatment group had higher left amygdala activation during the task than the quetiapine treatment group (F = 4.87, p = 0.030). We did not observe significant differences between the lithium and quetiapine treatment groups in bilateral putamen or globus pallidus. To reduce the impact of baseline differences on the longitudinal models, we also reran the primary analyses with baseline activation as an additional covariate. Including baseline activation in these ROIs did not significantly change our primary results (all p > 0.05), which indicates that the longitudinal change trajectories remained different among the three groups after accounting for the baseline differences.

Associations among symptom ratings and brain activation

The regional activation changes in the right putamen were positively correlated with the YMRS changes in the quetiapine treatment group (r = 0.29, p = 0.040; Fig. 2D) but not in the lithium treatment group (r = −0.018, p = 0.92). This positive result should be interpreted with caution because it was not corrected for multiple comparisons. Activation changes were not significantly related to clinical changes in the other two ROIs.

Medication response prediction

In the quetiapine group, baseline regional activation of 20 ROIs within the limbic system and emotion regulation system distinguished responders from non-responders with a balanced accuracy of 79% (AUC = 0.865, p < 0.001), while change in activation after one-week of treatment from baseline predicted endpoint YMRS scores with a balanced accuracy of 87.5% (AUC = 0.958, p < 0.001). For the lithium treatment, baseline activation features predicted treatment outcome with balanced accuracy of 60.83% (AUC = 0.715, p = 0.009), while one-week change from baseline in brain activation predicted treatment outcome with a balanced accuracy of 69.2% (AUC = 0.864, p = 0.004). The receiver operating characteristic curves for these four prediction models are presented in Fig. 3.

Fig. 3: Receiver operating characteristic curves for different prediction models based on regional activation within the limbic and emotion regulation system.

Left panel Performance of models based on baseline regional activation. Right panel Performance of models based on one-week change of regional activation.

Model transferability analyses between drug treatments

All models failed to reach comparable performance in the transferability test when prediction models developed for one drug were applied to the other drug treated group (all p > 0.05). Baseline activation features showed chance level performance (i.e., 50%) using the quetiapine model to predict lithium treatment outcome (balanced accuracy, 50.5%) and vice versa (balanced accuracy, 51.4%). Using week 1 change data, both models still showed poor transferability (balanced accuracy of 57.1% for quetiapine model predicting lithium outcome and balanced accuracy of 43.5% for lithium predicting quetiapine outcome).

Discussion

This study examined how treatment with quetiapine or lithium impacts limbic and emotion regulation brain circuitry in youth with bipolar disorder. While both treatments had a normalizing effect on brain function, treatment with quetiapine, relative to lithium, led to a more rapid symptom reduction and normalization of task-related brain activation in the amygdala, putamen, and globus pallidus. Notably, two of these three regions are part of the basal ganglia, and the high density of dopamine receptors in these regions may account for more widespread and rapid changes in these regions following treatment with antipsychotic medications, as they are dopamine receptor antagonists. Baseline alterations of brain function in these regions predicted treatment outcome. The striatum in particular plays a critical role in functions that integrate cognitive and affective processes, and in cortico-striatal-thalamocortical loops, including the circuitry that supports emotion regulation, and is thus likely to be important in the neurobiology of bipolar disorder [40]. The centrality of the amygdala in emotion processing is well established, and abnormalities in this region have been widely reported in previous studies of bipolar disorder [41,42,43]. These findings parallel those from a previous study which observed normalized cortical activation during an affective faces task in patients with pediatric bipolar disorder after receiving treatment with second-generation antipsychotics [42].

Quetiapine induced a more rapid and widespread normalization of brain function in these limbic and striatal structures, suggesting greater effectiveness in normalizing pretreatment functional brain alterations in both limbic and emotion regulation circuitry. This finding with fMRI data parallels clinical change effects in this study, and is consistent with previous studies showing that quetiapine is effective and fast acting for the treatment of mania [44]. The association between putamen regional activation changes and symptomatic improvements in manic adolescents following 6 weeks of treatment supports the clinical relevance of identified treatment-related changes in putamen functional brain activity.

Using machine learning, we found that activation in the limbic and emotion regulation system predicted medication response at the individual level, and the prediction model performed better in the quetiapine than in the lithium group (balanced accuracy 18.2 percentage points higher for baseline activation and 18.3 percentage points higher for changes in activation after one week of treatment). This suggests that the prediction of outcome based on task-based regional brain network activation differs across the two drugs, with greater predictive utility for quetiapine treatment. Moreover, change at week one predicted outcome better than baseline metrics for both treatments. This observation is relatively novel in showing that early changes in brain function only one week after initiation of drug treatment can predict clinical outcome, offering promise for early detection of likely treatment non-response. This observation is consistent with other studies indicating that fMRI features may provide predictors of treatment response approaching the level needed for clinical trials and eventually perhaps for clinical application [45, 46]. In addition, all models failed to reach comparable performance in the transferability test, indicating that the contributing features of the predictive models differed between treatment groups, both for baseline data and for treatment-related change at week 1.

Several limitations should be considered when interpreting these results. First, the specific molecular mechanisms by which lithium and quetiapine treatment impacted regional brain function could not be determined in the present study; consequently, it is not possible to determine how the drugs differentially impacted brain function [47, 48]. Nonetheless, our findings provide useful information about the differential regional effects of the two medications on brain function and their relation to clinical outcome. Second, the aim of this study was to identify abnormal activation of limbic and emotion regulation brain circuitry. Thus, ROI approaches were applied to directly test specific functional neuroanatomic hypotheses [8]. However, these approaches, by definition, limit discovery of novel findings in other brain regions. Third, as most first-episode manic individuals will remit at some point within one year, a longer follow-up period might provide useful information about longer term clinical outcomes and their relations to brain function [49, 50]. Fourth, we found unexpected pretreatment differences in activation between treatment groups in some brain regions (such as the left amygdala) despite randomization. Although additional analyses showed that including baseline activation in these ROIs did not significantly change our primary results, this may still be a limitation in the analysis of differential change in the two treatment groups. Finally, the machine learning models described in this study will require replication with a larger sample size before any application in clinical decision making.

Together, this study suggests that fMRI provides a means to identify functional brain alterations in the untreated state that are relevant for predicting and understanding the course of acute treatment. Further, our findings document a more rapid and widespread pattern of functional brain changes associated with quetiapine than lithium treatment in youth with bipolar disorder, and those changes were evident in both limbic circuitry and in the ventral prefrontal-basal ganglia-thalamocortical circuitry that supports emotion regulation. Moreover, pretreatment functional brain alterations are promising potential biomarkers for predicting end-of-trial clinical outcome. Understanding the regional neurophysiologic effects of antimanic treatment can provide mechanistic understanding at the systems level of how treatments are clinically effective, and may provide strategies for testing novel compounds and potentially in the longer term for making clinical decisions about starting and changing therapeutic intervention.