Mood stabilizers and/or antipsychotics for bipolar disorder in the maintenance phase: a systematic review and network meta-analysis of randomized controlled trials

We searched Embase, PubMed, and CENTRAL from inception until 22 May 2020 to investigate which antipsychotics and/or mood stabilizers are better for patients with bipolar disorder in the maintenance phase. We performed two categorical network meta-analyses. The first included monotherapy studies and studies in which the two drugs used were specified (i.e., aripiprazole, aripiprazole once monthly, aripiprazole+lamotrigine, aripiprazole+valproate, asenapine, carbamazepine, lamotrigine, lamotrigine+valproate, lithium, lithium+oxcarbazepine, lithium+valproate, olanzapine, paliperidone, quetiapine, risperidone long-acting injection, valproate, and placebo). The second included studies on second-generation antipsychotics (SGAs) (i.e., aripiprazole, lurasidone, olanzapine, quetiapine, and ziprasidone) combined with lithium or valproate (LIT/VAL) compared with placebo combined with LIT/VAL. Outcomes were recurrence/relapse rate of any mood episode (RR-any, primary), depressive episode (RR-dep) and manic/hypomanic/mixed episode (RR-mania), discontinuation, mortality, and individual adverse events. Risk ratios and 95% credible intervals were calculated. Forty-one randomized controlled trials were identified (n = 9821; mean study duration, 70.5 ± 36.6 weeks; percent female, 54.1%; mean age, 40.7 years). All active treatments other than carbamazepine, lamotrigine+valproate (no data) and paliperidone outperformed placebo for RR-any. Aripiprazole+valproate, lamotrigine, lamotrigine+valproate, lithium, olanzapine, and quetiapine outperformed placebo for RR-dep. All active treatments, other than aripiprazole+valproate, carbamazepine, lamotrigine, and lamotrigine+valproate, outperformed placebo for RR-mania. Asenapine, lithium, olanzapine, quetiapine, and valproate outperformed placebo for all-cause discontinuation. All SGAs+LIT/VALs other than olanzapine+LIT/VAL outperformed placebo+LIT/VAL for RR-any. Lurasidone+LIT/VAL and quetiapine+LIT/VAL outperformed placebo+LIT/VAL for RR-dep. Aripiprazole+LIT/VAL and quetiapine+LIT/VAL outperformed placebo+LIT/VAL for RR-mania. Lurasidone+LIT/VAL and quetiapine+LIT/VAL outperformed placebo+LIT/VAL for all-cause discontinuation. Treatment efficacy, tolerability, and safety profiles differed among treatments.


Introduction
Bipolar disorder (BD) is a common chronic mental disorder and a major contributor to the global burden of disease, with a worldwide prevalence of approximately 1% [1-3]. Patients with BD repeatedly and irregularly present with mania/hypomania or depression during their lifetimes, which can result in social and occupational disability [4].
Pharmacological treatments are among the primary treatments for BD [4,5]. The most recent guidelines state that clinicians and patients should take the maintenance phase into account when selecting acute phase treatments [6]. A previous network meta-analysis (NMA) reported that, compared with placebo, lithium and quetiapine reduced the recurrence or relapse rate of any mood, depressive, or manic/hypomanic/mixed episodes [7]. Recently, aripiprazole once monthly (AOM) and asenapine were approved for the treatment of BD [8]. We performed a systematic review and NMA of the efficacy, tolerability, and safety of antipsychotics and/or mood stabilizers, and we conducted a risk-benefit analysis of each medication for patients with BD in the maintenance phase.

Methods
This study was performed according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines (PRISMA Checklist) [9] and was registered on Open Science Framework (https://osf.io/h4nw p). The literature search, data extraction, and data input into spreadsheets for analysis were performed simultaneously and independently by at least two authors (TK, TI, YM, KS, and MO). The authors double-checked the accuracy of data transfer and calculations in the study.

Search strategy and inclusion criteria
The information about the literature search is shown in Supplementary Fig. 1. Inclusion criteria were (1) randomized controlled trials (RCTs) of antipsychotics and/or mood stabilizers lasting at least 12 weeks; (2) studies including adult patients with any BD subtype in the maintenance phase; (3) studies including patients with any mood symptoms at recruitment; (4) open studies and those with any level of blinding; and (5) studies with or without enrichment designs. Exclusion criteria were (1) studies with child/adolescent patients with BD; (2) continuation studies that randomly assigned patients with acute symptoms to treatment groups; and (3) monotherapy and/or combination therapy studies of antidepressants with mood stabilizers or antipsychotics.

Data synthesis and outcome measures
The primary outcome was recurrence/relapse rate of any mood episode. Secondary outcomes were recurrence/relapse rate of depressive episodes, recurrence/relapse rate of manic/hypomanic/mixed episodes, all-cause discontinuation, and discontinuation rate due to adverse events. Other outcomes were mortality rate and incidence of individual adverse events. Divalproex was classified as part of the valproate group. Definitions of recurrence/relapses are shown in Supplementary Table 1.

Data extraction
We analyzed the extracted data based on intention-to-treat or modified intention-to-treat principles. When data required for meta-analysis were missing in the articles, we searched for these data in published systematic review articles. Although we attempted to contact the original study investigators to obtain unpublished data, we did not succeed in obtaining these data from all of them.
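As an illustration of how a single two-arm comparison could be expressed as a risk ratio under the intention-to-treat principle, the following minimal sketch uses the number of randomized patients as the denominator in each arm. The function name, the example counts, and the Wald-type interval on the log scale are assumptions made for illustration; the interval is a simple frequentist approximation and is not the credible interval produced by the NMA in this study.

```python
import math
from typing import Tuple


def risk_ratio(events_trt: int, n_trt: int,
               events_ctl: int, n_ctl: int,
               z: float = 1.96) -> Tuple[float, float, float]:
    """Return (RR, lower, upper) for one treatment-vs-control comparison.

    n_trt and n_ctl are the numbers of randomized patients (ITT denominators).
    The interval is a Wald approximation on the log scale (an assumption,
    not the method used in the paper).
    """
    if events_trt <= 0 or events_ctl <= 0:
        raise ValueError("this sketch needs at least one event per arm")
    if events_trt > n_trt or events_ctl > n_ctl:
        raise ValueError("events cannot exceed the number randomized")
    rr = (events_trt / n_trt) / (events_ctl / n_ctl)
    # Standard error of log(RR) for two independent binomial arms.
    se_log = math.sqrt(1 / events_trt - 1 / n_trt + 1 / events_ctl - 1 / n_ctl)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# Hypothetical counts: 20/100 relapses on drug, 40/100 on placebo.
rr, lower, upper = risk_ratio(20, 100, 40, 100)
```

An RR below 1 with an interval that excludes 1 would indicate fewer recurrences/relapses in the treatment arm than in the control arm for that single comparison.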

Meta-analysis methods
Based on the results of our literature search (Supplementary Fig. 1 and Supplementary Table 1), we planned to perform two categorical NMAs. The first included (1) placebo-controlled and head-to-head trials of monotherapy of antipsychotics and/or mood stabilizers, and (2) combination or augmentation studies in which the two drugs used were specified. The second NMA included studies in which second-generation antipsychotics (SGAs) combined with lithium or valproate (LIT/VAL) were compared with placebo+LIT/VAL. A Bayesian NMA based on random-effects models [10] was conducted using the netmeta package [11]. We fitted random-effects Bayesian NMAs, in which we assumed a common random-effects standard deviation for all comparisons in the network. The risk ratio (RR) and 95% credible interval (95% CI) were calculated. The heterogeneity standard deviation was also calculated for all outcomes. The odds ratios and their 95% CIs were calculated for mortality rate and completed suicide rate because incidences of these outcomes were very rare (Supplementary Appendix 1.6-1.7). We assessed network heterogeneity using τ² with the netmeta package. We conducted a statistical evaluation of consistency using the design-by-treatment test (globally) and the node-splitting approach or Separate Direct from Indirect Evidence test (locally). The Bayesian analyses also estimated rank probabilities (i.e., the probability of each treatment obtaining each possible rank as shown by their relative effects). The surface under the cumulative ranking curve (SUCRA) was calculated to rank the interventions. We also performed a meta-regression analysis in the first NMA to examine whether some potentially confounding factors (e.g., publication year, duration of study, number of total patients, percent female, and mean age) were associated with the extent of effect on primary and secondary outcomes.
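To make the ranking step concrete, the sketch below shows one common way to turn rank probabilities into a surface under the cumulative ranking score: the cumulative probabilities of being among the best k treatments are averaged over k = 1 to a - 1, where a is the number of treatments. The function name and the rank probabilities are hypothetical and are not taken from the study data; the code is an illustration of the general technique, not the software used for the analysis.

```python
from typing import Dict, List


def sucra(rank_probs: Dict[str, List[float]]) -> Dict[str, float]:
    """Compute a SUCRA-type score for each treatment.

    rank_probs[t][k] is the probability that treatment t takes rank k + 1
    (rank 1 = best). Each list must sum to 1.
    """
    scores = {}
    for treatment, probs in rank_probs.items():
        n_ranks = len(probs)
        if n_ranks < 2:
            raise ValueError("at least two treatments are required")
        if abs(sum(probs) - 1.0) > 1e-6:
            raise ValueError(f"rank probabilities for {treatment} must sum to 1")
        cumulative = 0.0
        area = 0.0
        # Add up the cumulative probabilities for ranks 1..(n_ranks - 1).
        for p in probs[:-1]:
            cumulative += p
            area += cumulative
        scores[treatment] = area / (n_ranks - 1)
    return scores


# Hypothetical rank probabilities for three treatments (illustrative only).
example = {
    "drug_a": [0.7, 0.2, 0.1],
    "drug_b": [0.2, 0.6, 0.2],
    "placebo": [0.1, 0.2, 0.7],
}
result = sucra(example)
```

A score of 1 would mean a treatment is certain to be the best and 0 that it is certain to be the worst, so a higher score corresponds to a better rank for the outcome in question.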
In addition to the analyses conducted previously [7], we also performed sensitivity analyses for primary and secondary outcomes in the first NMA, in which we gave only half the weight to (1) studies that included both patients with bipolar I disorder (BDI) and patients with other BD (when focusing on studies including only patients with BDI); (2) studies that included rapid-cycling patients with BD (when focusing on studies including only non-rapid-cycling patients with BD because rapid-cycling BD is considered to be more difficult to stabilize than non-rapid-cycling BD); (3) non-double-blind studies (when focusing on double-blind studies); (4) study arms that were "enriched" (when focusing on nonenriched studies); and (5) study arms supported by industry sponsors (when focusing on non-industry sponsorship studies) [12]. We did not perform meta-regression and sensitivity analyses in the second NMA because only six studies were included. In addition, the methodological quality of the included articles was assessed according to the Cochrane Risk of Bias criteria [13]. Funnel plots were used to explore potential publication bias. Lastly, we incorporated results into the Confidence in Network Meta-Analysis (CINeMA) application to assess the credibility of findings from each NMA [14]. CINeMA grades the confidence in results of each treatment comparison as high, moderate, low, or very low.

Results
Three additional studies [53-55] were identified following a manual search through the reference lists of the previous review article [7]. No further studies were found in the clinical trial registers. Although two studies included antidepressant treatment arms [15,28], these studies were included in the NMA because they had both a lithium arm and a placebo arm. Hence, 41 studies, including a total of 9821 patients, with a mean study duration of 70.5 ± 36.6 weeks, were identified and included in this study. Characteristics of these studies are shown in Supplementary Table 1. The percent female was 54.1%, and the mean age was 40.7 years. Twenty-three studies included only patients with BDI. Only four studies included only patients who had depressive episodes at recruitment. Sixteen studies included patients with rapid-cycling BD and 25 studies used enrichment designs. One perphenazine study [51] and two risperidone long-acting injection (RISLAI) studies [43,44] were not included in the NMA because no arms of the study connected to the treatment arms of other studies [51]. Detailed methodological quality analyses of the studies based on the Cochrane Risk of Bias criteria are presented in Supplementary Fig. 2. Three studies were open-label studies [27,30,43]. Twenty-nine studies were industry-sponsored studies. Supplementary Tables 2.1 and 2.2 show the results for the primary outcome in the individual studies included in our systematic review.

Results of the first network meta-analysis
Results of the first NMA are shown in Supplementary Appendix 1.1-1.17.
Aripiprazole+valproate ranked first for reduction of the recurrence/relapse rate of any mood episode and depressive episodes. Asenapine ranked first for reducing manic/hypomanic/mixed episodes and discontinuation due to adverse events. Lithium+valproate had the lowest incidence of all-cause discontinuation. Supplementary Appendix 2.1-2.3 shows two-dimensional graphs of the primary and secondary outcomes.

Meta-regression analysis of primary and secondary efficacy outcomes
A significant association between the extent of effect on the recurrence/relapse rate of manic/hypomanic/mixed episodes and the duration of study was detected (beta = -0.497; 95% CI = -0.985, -0.004; p < 0.05). The heterogeneity variance of the meta-regression analysis was reduced by 21% compared with the unadjusted analysis. Although the unadjusted analysis demonstrated that aripiprazole, aripiprazole+lamotrigine, and paliperidone outperformed placebo in the recurrence/relapse rate of manic/hypomanic/mixed episodes, these differences were not statistically significant in the meta-regression analysis. We did not find any associations between the extent of effect in primary and other secondary outcomes and potentially confounding factors (Supplementary Appendix 1.1-1.5).
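The relative reduction in heterogeneity variance reported here can be read as the share of the unadjusted between-study variance that the covariate accounts for. The following minimal sketch states that arithmetic explicitly; the function name and the two τ² values are hypothetical and were chosen only so that the result equals 21%.

```python
def relative_tau2_reduction(tau2_unadjusted: float, tau2_adjusted: float) -> float:
    """Proportion by which the heterogeneity variance shrinks after adjustment."""
    if tau2_unadjusted <= 0:
        raise ValueError("unadjusted heterogeneity variance must be positive")
    return (tau2_unadjusted - tau2_adjusted) / tau2_unadjusted


# Hypothetical values: tau^2 falls from 0.100 to 0.079 after adjustment.
reduction = relative_tau2_reduction(0.100, 0.079)
percent_reduction = round(100 * reduction)
```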

Sensitivity analyses for primary and secondary outcomes
Relative reductions in heterogeneity variance for recurrence/relapse of any mood episode in the sensitivity analyses focusing on studies including only non-rapid-cycling patients with BD, nonenriched studies, and studies not sponsored by industry were 29%, 21%, and 29%, respectively (Supplementary Appendix 1.1). Although outcomes with aripiprazole and aripiprazole+valproate were superior to placebo in the unadjusted analysis, the results did not reach statistical significance in the sensitivity analyses. The results of other comparisons for this outcome in the unadjusted and sensitivity analyses were similar. We did not detect relative reductions in heterogeneity variance for other outcomes in any of the sensitivity analyses (Supplementary Appendix 1.2-1.5).
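The idea of giving half the weight to studies with a given design feature can be illustrated with a simplified pairwise inverse-variance pooling of log risk ratios. This is only a sketch of the down-weighting principle: the function name, the multiplier argument, and the study values are hypothetical, and the design-adjusted network model used in the actual analysis [12] is more involved than this two-study example.

```python
import math
from typing import Optional, Sequence


def pooled_log_rr(log_rrs: Sequence[float],
                  variances: Sequence[float],
                  tau2: float = 0.0,
                  multipliers: Optional[Sequence[float]] = None) -> float:
    """Inverse-variance pooled log RR with optional per-study weight multipliers.

    A multiplier of 0.5 gives a study half of its usual weight, e.g. for
    enriched or industry-sponsored studies (illustrative assumption).
    """
    if multipliers is None:
        multipliers = [1.0] * len(log_rrs)
    if not (len(log_rrs) == len(variances) == len(multipliers)):
        raise ValueError("inputs must have the same length")
    # Usual random-effects weight 1 / (v + tau2), scaled by the multiplier.
    weights = [m / (v + tau2) for m, v in zip(multipliers, variances)]
    total = sum(weights)
    if total <= 0:
        raise ValueError("total weight must be positive")
    return sum(w * y for w, y in zip(weights, log_rrs)) / total


# Hypothetical data: study 1 (RR 0.5) is enriched, study 2 (RR 0.8) is not.
log_rrs = [math.log(0.5), math.log(0.8)]
variances = [0.04, 0.04]
unadjusted_rr = math.exp(pooled_log_rr(log_rrs, variances))
down_weighted_rr = math.exp(
    pooled_log_rr(log_rrs, variances, multipliers=[0.5, 1.0])
)
```

In this toy example, halving the weight of the study with the larger effect moves the pooled RR toward 1, which is the pattern that would cause a comparison to lose statistical significance in a sensitivity analysis.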

Mortality rate and incidence of individual adverse events
Mortality and completed suicide rates were low and similar for all treatments. Aripiprazole was associated with a higher incidence of extrapyramidal symptoms/use of anticholinergic agents compared with carbamazepine. Lithium was associated with a higher incidence of extrapyramidal symptoms/use of anticholinergic agents compared with placebo, carbamazepine, lamotrigine, olanzapine, and quetiapine. Valproate was associated with a higher incidence of extrapyramidal symptoms/use of anticholinergic agents compared with placebo, carbamazepine, lamotrigine, and quetiapine. Olanzapine was associated with a higher incidence of somnolence compared with placebo, lamotrigine, and lithium. Olanzapine and quetiapine were associated with a lower incidence of insomnia compared with placebo, lamotrigine, and lithium. RISLAI was associated with a higher incidence of prolactin-related adverse events compared with placebo. Lithium was associated with a higher incidence of dry mouth compared with valproate, and quetiapine was associated with a higher incidence of dry mouth compared with placebo and valproate. Lamotrigine, lithium, olanzapine, quetiapine, valproate, and placebo were associated with a higher incidence of headache compared with RISLAI. Valproate was associated with a higher incidence of headache compared with AOM. Lamotrigine was associated with a higher incidence of nausea compared with quetiapine. Lithium was associated with a higher incidence of nausea compared with placebo, olanzapine, and quetiapine. Valproate was associated with a higher incidence of nausea compared with placebo and quetiapine. Lithium was associated with a higher incidence of diarrhea compared with placebo and lamotrigine.

Heterogeneity, inconsistency, and results of the first network meta-analysis graded using the CINeMA system
Global heterogeneity was low to moderate for most outcomes other than insomnia, dry mouth, and increased weight (Supplementary Appendix 1.1-1.17). We also did not detect considerable heterogeneities for most of the outcomes in certain comparisons (Supplementary Appendix 1.1-1.17). We did not find significant global inconsistencies in the primary and secondary outcomes. Percent inconsistency loops in the recurrence/relapse of any mood episode, depressive episodes, manic/hypomanic/mixed episodes, all-cause discontinuation, and discontinuation due to adverse events were: 0%, 13.6%, 9.1%, 0%, and 0%, respectively. However, we detected global inconsistency in insomnia and increased weight. We did not analyze global inconsistencies in prolactin-related adverse events and dry mouth due to insufficient data. Funnel plots with fewer than ten studies might not be meaningful. The confidence in evidence was often low or very low.

Results of the second network meta-analysis
Results of the second NMA are shown in Supplementary Appendix 3.1-3.11. Aripiprazole+LIT/VAL, lurasidone+LIT/VAL, quetiapine+LIT/VAL, and ziprasidone+LIT/VAL were superior to placebo+LIT/VAL in the recurrence/relapse rate of any mood episode. Moreover, lurasidone+LIT/VAL and quetiapine+LIT/VAL were superior to olanzapine+LIT/VAL. Lurasidone+LIT/VAL and quetiapine+LIT/VAL were superior to placebo+LIT/VAL in the recurrence/relapse rate of depressive episodes, and lurasidone+LIT/VAL and quetiapine+LIT/VAL were superior to aripiprazole+LIT/VAL and ziprasidone+LIT/VAL. Aripiprazole+LIT/VAL and quetiapine+LIT/VAL were superior to placebo+LIT/VAL in the recurrence/relapse rate of manic/hypomanic/mixed episodes, and lurasidone+LIT/VAL and quetiapine+LIT/VAL were associated with lower all-cause discontinuation compared with placebo+LIT/VAL. Quetiapine+LIT/VAL was associated with a higher incidence of somnolence compared with placebo+LIT/VAL. Olanzapine+LIT/VAL and quetiapine+LIT/VAL were associated with a lower incidence of insomnia compared with placebo+LIT/VAL. Olanzapine+LIT/VAL and quetiapine+LIT/VAL were associated with a higher incidence of increased weight compared with placebo+LIT/VAL and aripiprazole+LIT/VAL. We did not examine local heterogeneity, and global and local inconsistency for any outcomes in the second NMA due to insufficient data. The confidence in evidence of the second NMA was very low.

Discussion
We performed a systematic review and NMAs of efficacy, acceptability, tolerability, and safety for mono- or combination therapies using mood stabilizers and/or antipsychotics in the treatment of adult patients with BD in the maintenance phase. We extended a previous NMA by adding two SGAs (i.e., asenapine and AOM), by investigating many more adverse effects, and by examining the efficacy and safety of various combination therapies using SGAs and LIT/VAL [7]. Overall, most of the mood stabilizers and/or antipsychotics reduced the recurrence/relapse rates of any mood episode. However, when examining individual mood symptoms, both drug types appeared to be more effective for treating mania than depression.
Aripiprazole+valproate was the best treatment for reducing the recurrence/relapse rates of any mood episode and depressive episodes. However, these differences lost statistical significance in sensitivity analyses adjusting for enrichment design and sponsorship. Lithium+oxcarbazepine ranked high with respect to reducing the recurrence/relapse rates of any mood episode (2nd), depressive episodes (2nd), and manic/hypomanic/mixed episodes (3rd). Lamotrigine+valproate ranked third for reducing the recurrence/relapse rate of depressive episodes. However, these results were based on only one small study (<50 patients in each treatment arm). Lithium+valproate ranked first for all-cause discontinuation, based on the results of a single open-label study. We deemed these results inconclusive, given that the CINeMA ratings showed low or very low confidence levels for these treatments.
Asenapine ranked high with respect to reducing the recurrence/relapse rates of any mood episode (3rd), manic/hypomanic/mixed episodes (1st), all-cause discontinuation (3rd), and discontinuation due to adverse events (1st), which might represent novel insights into the pharmacological treatment of patients with BD in the maintenance phase. Although it did not prevent recurrence/relapse of depressive episodes, asenapine ranked fifth for this outcome. It should be noted that these rankings were based on only one 26-week, double-blind, randomized, placebo-controlled trial of asenapine. Furthermore, asenapine carries the risk of oral hypoesthesia [34], and this distinctive side effect makes it difficult to blind [13]; the asenapine study might therefore be subject to performance and detection biases.
Olanzapine and quetiapine outperformed placebo in all efficacy outcomes and all-cause discontinuation. Quetiapine results should be interpreted with caution because all the quetiapine studies included in our meta-analysis used enrichment designs and were industry sponsored. However, sensitivity analyses adjusting for these factors demonstrated that quetiapine outperformed placebo in all efficacy outcomes. Thus, olanzapine and quetiapine showed good efficacy and acceptability in adult patients with BD in the maintenance phase. However, olanzapine and quetiapine carry a risk of somnolence and dry mouth, respectively. The second NMA demonstrated that combination therapies of these SGAs with LIT/VAL also carried the risk of increased weight.
Recent treatment guidelines recommend lithium as a first-line drug for the treatment of adult patients with BD in the maintenance phase [6,56,57]. The numbers of studies and patients treated with lithium were the largest among the active drugs included in our study (19 studies and 1335 patients). A recent meta-review including RCTs and non-RCTs reported that lithium had anti-suicidal effects for patients with psychiatric disorders including BD [58], although our meta-analysis did not show this effect. Our meta-analysis demonstrated that lithium outperformed placebo in all efficacy outcomes; however, it did not rank highly for these outcomes. Although lithium outperformed placebo regarding all-cause discontinuation, lithium increased discontinuation due to adverse events and carried risks of extrapyramidal symptoms/use of anticholinergic agents, nausea, and diarrhea. Because 17 of the 19 lithium studies included in our meta-analysis did not use enrichment designs, most patients assigned to lithium were not evaluated for the efficacy, acceptability, tolerability, and safety of lithium prior to assignment. Nevertheless, sensitivity analysis of enrichment designs using the design-adjusted model demonstrated similar results to the unadjusted analysis. Accordingly, we concluded that lithium still had benefits for patients with BD in the maintenance phase, provided that due care is taken with its side effects.
A Finnish nationwide cohort of 18,018 patients with BD (mean follow-up time = 7.2 years) demonstrated that lithium and long-acting injectable (LAI) antipsychotics were effective in preventing hospitalization due to mental or physical illness compared with no drug use [59]. Unlike the results of our meta-analysis, the study indicated that lithium was superior to other mood stabilizers and that LAI antipsychotics were markedly better than identical oral formulations of antipsychotics. Quetiapine (most widely used in the study population) showed only an 8% risk reduction. Thus, there appear to be inconsistencies between the results of our meta-analysis, which included RCTs (providing the most robust evidence), and those of the cohort study (reflecting "real-world" routine clinical practice). We could not simply compare results between the studies for the following reasons [59,60]. First, the study durations of RCTs are generally shorter than those of non-RCT studies. Second, the symptoms of trial populations are evaluated in more detail than those of patient populations in clinical practice. Hence, symptoms might be detected earlier, and earlier intervention might be given, in trial populations than in patients in clinical practice. Third, because RCTs often have stringent inclusion and exclusion criteria (e.g., excluding patients with the most comorbidities and the highest severity of illness, such as suicidal ideation and suicide attempts), trial populations are often not representative of those in clinical practice.
Our study has several limitations. First, the confidence in evidence of the first NMA was often low or very low. In the primary outcome, confidence levels were deemed to be low or very low in 90.8% of comparisons with placebo. Second, we did not perform the inconsistency test for dry mouth and prolactin-related adverse events in the first NMA or for any outcome in the second NMA. Third, the range of study durations included in our meta-analysis was 17.3-171.4 weeks. Thus, the long-term efficacy and safety of drugs still need to be verified. Fourth, we did not cover important clinical issues that might inform treatment decision-making in routine clinical practice (e.g., combination with nonpharmacological treatments). Fifth, a cost-effectiveness analysis should be performed and included in the decision-making process.
In conclusion, our study represents the most comprehensive evidence currently available to guide the initial choice of pharmacological treatment for adult patients with BD in the maintenance phase. Clinicians and patients should consider the maintenance phase when selecting the treatment for the acute phase of BD.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.