INTRODUCTION

For decades, lithium and chlorpromazine were the only medicines with regulatory approval for acute mania, although many others were used empirically (Baldessarini and Tarazi, 2005). These included many neuroleptics (first generation antipsychotics (FGAs)) and potent benzodiazepines, which had very limited support from randomized controlled trials (RCTs). Anticonvulsants, including carbamazepine and valproate, have been widely used since the 1980s. Growing numbers of novel antimanic agents have emerged in recent years, including second generation antipsychotics (SGAs), several anticonvulsants, and the antiestrogenic, central protein kinase-C (PKC) inhibitor tamoxifen. Currently, all SGAs (except clozapine), as well as lithium and the anticonvulsants valproate and carbamazepine, are FDA approved for treatment of acute bipolar mania.

Placebo-controlled RCTs for newer antimanic drugs have increased greatly in the past decade, but relatively little information is available about how these compounds or pharmacologically similar groups of them compare in efficacy, or about types of patients most likely to benefit from particular treatments. Recently, the efficacy of some psychotropic agents for treating some mental disorders has been questioned. For example, in their evaluation of antidepressants for major depressive disorder (MDD), Moncrieff and Kirsch (2005) found only a two point difference between drug and placebo on the Hamilton rating scale for depression. Leucht et al (2009) in a meta-analytic review of 38 studies reported a difference of only nine points on the brief psychiatric rating scale between SGAs and placebo in patients diagnosed with schizophrenia. The few previous meta-analytic assessments of relative efficacy of treatments for mania usually involved selective consideration of agents of particular interest, and none has considered outcomes of all available RCTs, as identified through clinical trial registries (Emilien et al, 1996; Perlis et al, 2006b; Scherk et al, 2007; Smith et al, 2007; Tohen et al, 2001). Indeed, bias toward reporting favorable trials, as well as alleged delays or failure to report details of trials not showing separation of a novel treatment from placebo, probably have limited or biased information available (Vieta and Cruz, 2008). As pharmaceutical companies are now expected to post the design of trials before they are conducted, and to publish results promptly, it has become more feasible to attempt comprehensive meta-analyses.

To evaluate efficacy of available drugs compared with placebo for treating acute mania we now present results of primary meta-analyses of 38 monotherapy studies that included 56 drug–placebo comparisons involving 10 800 manic patient subjects. We tested for possible effects of study site counts, sample sizes, and initial illness severity on trial outcomes. We also secondarily considered available direct (head-to-head) comparisons of similar groups of agents.

METHODS

Data Sources

We performed comprehensive searches of the literature, using PubMed/Medline; ClinicalTrials.gov; Cochrane Central Register of Controlled Trials; Controlled-trials.com; and EMBASE/Excerpta Medica databases for RCTs for acute mania in bipolar disorder (last search: 12 January 2010). Search terms were: ‘bipolar, mania, trial’, and names of individual antipsychotic, anticonvulsant, or other drugs. We also reviewed reports in proceedings of meetings of the American and European Colleges of Neuropsychopharmacology, American Psychiatric Association, American College of Psychiatry, and International Conference on Bipolar Disorder since 1990, as well as references from all sources. We also consulted investigators identified as having studied antimania agents, as well as representatives of pharmaceutical companies that market such drugs to identify reports of additional trials, or for information missing from identified reports.

Study Selection

Among initially identified potential studies, we selected monotherapy trials with random assignment to treatment arms that prospectively compared a test agent with placebo or a standard comparator. Therapeutic targets were manic symptoms of acute mania or mixed manic depressive states of adult bipolar I disorder as defined by DSM-IIIR, -IV, or -IV-TR (American Psychiatric Association (APA), 2000), or research diagnostic criteria (RDC; Spitzer et al, 1978). Manic symptoms were rated at baseline and during treatment with a standard rating scale, usually the Young mania rating scale (11-item YMRS, scoring range: 0–60; Young et al, 1978) or mania rating scale (11-item MRS, range: 0–52; Endicott and Spitzer, 1978), which are similar in scoring and apparent responsiveness to treatment effects (Perlis et al, 2006b). Trials using other symptom rating methods, or enrolling participants diagnosed with bipolar II, unspecified (NOS), or schizoaffective disorders were excluded, as were studies without a placebo or standard comparator, or permitting psychotropic agents other than moderate doses of benzodiazepines or chloral hydrate. Double-blind design was required for placebo-controlled studies. However, as comparison trials are uncommon, we included several randomized but unblinded head-to-head trials in secondary analyses. We included data from all registered and completed placebo-controlled trials of acute mania, except two recent studies involving the calcium channel blocker MEM-1003 and clozapine, considered by the investigators as not yet ready for public disclosure.

Data Extraction and Outcome Measures

Information collected included study site counts; proportions of women and men randomized as well as with at least one assessment during treatment (intent-to-treat (ITT) samples); type of presentation (mania or mixed manic depressive states, with or without psychotic features); baseline mania severity (total score and percentage of maximum possible scale score) and mean doses (mg/day) of experimental agents; measures of initial and final group mean mania ratings; nominal study duration; and completion rates for each trial arm. Data were extracted by two reviewers (AY and SÖ), who resolved discrepancies by consensus.

The primary outcome of interest was the Hedges adjusted g, based on the standardized mean difference between changes in mania ratings with test drug vs placebo or a standard comparator (Borenstein et al, 2009). A secondary outcome measure was the rate of attaining response (defined in nearly all studies as 50% reduction of initial mania scores from baseline to end point).
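As an illustration of the primary outcome measure, the following minimal Python sketch computes Hedges' adjusted g and its standard error for a single two-arm comparison, using the standard formulas for a standardized mean difference with small-sample correction. The function name `hedges_g` and the example values are hypothetical and are not taken from any included trial; change scores are expressed as improvement (baseline minus end point), so that a positive g favors the drug.

```python
import math

def hedges_g(change_drug, sd_drug, n_drug, change_placebo, sd_placebo, n_placebo):
    """Return (g, standard error) for one drug vs placebo comparison.

    change_* are mean improvements in mania score (baseline minus end point);
    sd_* are the standard deviations of those change scores.
    """
    df = n_drug + n_placebo - 2
    # Pooled within-group standard deviation of the change scores.
    sd_pooled = math.sqrt(((n_drug - 1) * sd_drug ** 2 +
                           (n_placebo - 1) * sd_placebo ** 2) / df)
    # Cohen's d: the uncorrected standardized mean difference.
    d = (change_drug - change_placebo) / sd_pooled
    # Approximate small-sample correction factor J.
    j = 1.0 - 3.0 / (4.0 * df - 1.0)
    # Variance of d, then of g = J * d.
    var_d = ((n_drug + n_placebo) / (n_drug * n_placebo)
             + d ** 2 / (2.0 * (n_drug + n_placebo)))
    return j * d, math.sqrt(j ** 2 * var_d)

# Hypothetical trial: 12-point vs 8-point mean improvement, SD 10, 100 per arm.
g, se = hedges_g(12.0, 10.0, 100, 8.0, 10.0, 100)
print(f"g = {g:.2f}, 95% CI: {g - 1.96 * se:.2f} to {g + 1.96 * se:.2f}")
```

With samples of this size the correction factor is close to 1, so g and Cohen's d are nearly identical; the correction matters mainly for small trials.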

Meta-Analytic Calculations

We combined outcome data across trials with standard meta-analytic methods. For continuous data (changes in mean mania scores from individual trials), we employed the standardized mean difference: Hedges’ adjusted g (a slightly modified version of Cohen’s d, also generally used by the Cochrane collaboration), as it transforms all effect sizes from individual studies to a common metric and enables inclusion of different outcome measures in the same synthesis (Borenstein et al, 2009). When standard deviations (SD) for change in mean mania scores were not reported, we estimated pooled SD by using standard statistical procedures (Whitley and Ball, 2002). For categorical responder rates, we used risk ratio (RR: response with drug treatment vs placebo or a standard comparator) and absolute difference in responder rates (rate difference (RD)), with the associated number-needed-to-treat (NNT (1/RD): the estimated number of patients to be treated with a drug versus a placebo or standard comparator for one additional patient to benefit (NNTbenefit) or be harmed (NNTharm); Altman, 1988; Borenstein et al, 2009). Numerical results are presented with their 95% confidence intervals (CIs). As the true effects in the trials analyzed were assumed to have been sampled from a distribution of true effects, we used random effects meta-analyses, with or without evidence of heterogeneity based on Q and I² statistics (Borenstein et al, 2009; DerSimonian and Laird, 1986).
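The random effects pooling described above can be sketched as follows, assuming the DerSimonian and Laird method-of-moments estimator of between-study variance (τ²). This is an illustrative outline rather than the routine implemented in the software used for the analyses; the function name `random_effects_pool` and the input values are hypothetical.

```python
import math

def random_effects_pool(effects, variances):
    """Pool per-trial effect sizes with a DerSimonian-Laird random effects model."""
    k = len(effects)
    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    # Method-of-moments estimate of between-study variance, floored at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # I-squared: percentage of total variation attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    # Random effects weights add tau2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return {"pooled": pooled, "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
            "Q": q, "df": df, "I2": i2, "tau2": tau2}

# Three hypothetical comparisons (Hedges' g and its variance).
result = random_effects_pool([0.2, 0.5, 0.9], [0.01, 0.01, 0.01])
print(result)
```

When Q does not exceed its degrees of freedom, τ² is set to zero and the random effects estimate coincides with the fixed-effect estimate.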

We also used meta-regression (with unrestricted maximum likelihood, mixed effects modeling) to evaluate impact on Hedges’ g for drug–placebo comparisons (outcome) of pre-selected factors: study site count, sample size, and baseline mania severity. This variant of multiple regression modeling weights for subject number/arm and variance measures to compute regression equations. The slope (β-coefficient: direct (+) or inverse (−)) of the regression line indicates the direction and strength of a relationship between moderator and outcome. To limit the risk of false-positive (type I) errors arising from multiple comparisons, we adjusted the significance threshold by dividing α (0.05) by the number of meta-regressions.
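A simplified sketch of a single-moderator meta-regression follows. It uses inverse-variance weighted least squares with an optional, externally supplied between-study variance (`tau2`), which is a simplification of the unrestricted maximum likelihood mixed effects modeling described above and not a reproduction of it. All names and data are hypothetical. The Bonferroni-type adjustment of α is included as a one-line helper.

```python
import math

def weighted_meta_regression(moderator, effects, variances, tau2=0.0):
    """Regress effect size on one moderator with inverse-variance weights.

    tau2 is an assumed between-study variance added to each within-study
    variance (0.0 gives a fixed-effect regression).
    """
    w = [1.0 / (v + tau2) for v in variances]
    sum_w = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, moderator)) / sum_w
    y_bar = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, moderator))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar)
              for wi, xi, yi in zip(w, moderator, effects))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # With known variances, Var(slope) = 1 / sum(w * (x - x_bar)^2).
    se_slope = math.sqrt(1.0 / sxx)
    z = slope / se_slope
    # Two-tailed p value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return {"slope": slope, "intercept": intercept, "se": se_slope, "z": z, "p": p}

def bonferroni_alpha(alpha, n_tests):
    """Per-test significance threshold after dividing alpha by the number of tests."""
    return alpha / n_tests

# Hypothetical trials: number of study sites vs Hedges' g.
sites = [1, 10, 30, 50]
g_values = [0.595, 0.55, 0.45, 0.35]
g_variances = [0.04, 0.02, 0.01, 0.01]
fit = weighted_meta_regression(sites, g_values, g_variances)
print(fit["slope"], fit["p"] < bonferroni_alpha(0.05, 3))
```

A negative slope in this sketch corresponds to an inverse association, that is, smaller effect sizes as the moderator increases.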

Studies with relatively large drug–placebo differences may be more likely to be reported, resulting in publication bias that typically overestimates effect size (Borenstein et al, 2009). We examined potential publication bias with a funnel plot of pooled effect size vs its standard error (Sterne and Egger, 2001). We also estimated fail-safe N values (number of additional hypothetical studies with zero effect that would make the summary effect in meta-analysis trivial, defined as Hedges’ g<0.10; Orwin, 1983). Finally, we used Duval and Tweedie's (2000) trim and fill approach to provide a best estimate of unbiased effect size by removing the smallest studies sequentially until funnel plots became symmetrical about the adjusted effect size. For data analyses, we used Comprehensive Meta-Analysis, version 2.2 (BioStat, Englewood, NJ). Statistical significance required two-tailed p<0.05.
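The fail-safe N concept can be illustrated with a short sketch of Orwin's formula, which is based on the unweighted mean effect size. The function name and the example inputs are hypothetical; the criterion for a trivial effect defaults to the value used in this report (Hedges' g of 0.10), and the missing studies are assumed to have a mean effect of zero.

```python
def orwin_fail_safe_n(k, observed_g, trivial_g=0.10, missing_g=0.0):
    """Number of unretrieved studies needed to reduce the mean effect to trivial_g.

    k          -- number of comparisons in the meta-analysis
    observed_g -- observed mean effect size
    trivial_g  -- effect size regarded as trivial
    missing_g  -- assumed mean effect size of the missing studies
    """
    if trivial_g <= missing_g:
        raise ValueError("trivial_g must exceed missing_g")
    if observed_g <= trivial_g:
        return 0.0  # the observed effect is already at or below the criterion
    return k * (observed_g - trivial_g) / (trivial_g - missing_g)

# Hypothetical example: 10 comparisons with a mean Hedges' g of 0.30.
print(orwin_fail_safe_n(10, 0.30))
```

In this hypothetical case, about 20 additional zero-effect comparisons would be needed to bring the mean effect down to 0.10.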

RESULTS

Characteristics of Trials and Subjects

Primary meta-analyses included 56 randomized, double-blind comparisons (13 with negative results) of 17 drugs versus placebo from 38 studies involving a total of 13 093 randomized and 12 920 ITT manic patient subjects (Table 1). Corrected for duplicate counting of placebo arm patients who appear more than once in multiarm trials, 6988 manic patients were randomized to active agents and 3812 to placebo, with at least one follow-up assessment (total n=10 800 ITT patients). Mania symptom ratings used YMRS in 45/56 trials (80.4%), and MRS in 11/56 (19.6%). Most studies (34/38: 89.5%) involved multiple collaborating sites (mean: 29.7±18.9 sites/study; range: 1–70). Manufacturers of tested agents sponsored 89.5% of studies. Placebo-associated improvement in mean mania ratings relative to baseline varied greatly, from −19% (Zarate et al, 2007) or +0.63% (Pope et al, 1991) to +38% (McIntyre et al, 2009a). Likewise, study drop-out rates ranged from 13–15% (Kushner et al, 2006; Smulevich et al, 2005, respectively) to 82% (Hirschfeld et al, 2010) with placebo, and from 11–14% (Bowden et al, 2005; Khanna et al, 2005; Smulevich et al, 2005) to 83% (Hirschfeld et al, 2010) with drug. The impact of these sources of variance lies beyond the scope of this study and is reported separately (Yildiz et al, 2010). Of the 11 072 randomized subjects (corrected for duplicate counting in placebo arms), 5603 (50.6%) were men, and age averaged 39.1±11.7 years. Diagnostic criteria followed DSM-IV or -IV-TR in 92.1% of 38 studies, and less often, DSM-IIIR (5.3%) or RDC criteria (2.6%). Most subjects (73.1%) were diagnosed with mania, whereas 26.5% randomized to drugs and 27.1% given placebo were considered to be in a mixed manic depressive state. However, responses of men vs women, specific age groups, those diagnosed with mania vs mixed states, or outcomes at specific sites were rarely reported separately, precluding direct comparisons.
Psychotic features at intake were noted in 29.3% of subjects (28.0% given drugs and 31.8% given placebo). Nominal trial duration was 3 weeks in 97.4% of studies (considered sufficient for regulatory approval; Table 1). However, rates of protocol completion averaged 65.8% with active agents (34.2% dropout) and 57.4% with placebo (42.6% dropout), in 36/38 studies providing such data, indicating that actual treatment exposure was close to 2 weeks.

Table 1 Characteristics of Included Randomized, Placebo-controlled Monotherapy Trials in Mania (N=37 studies with 54 comparisons)

Secondary meta-analyses involved comparison of a test agent with an established comparison-control drug (with or without a placebo arm), assigned randomly in 31 studies with 33 comparisons (31 (93.9%) double-blind) involving 13 drugs and a total of 6710 manic patients as the ITT sample corrected for duplicate counting of placebo arms (Table 2). These trials rated mania with the YMRS in 77.4%, and MRS in 22.6% of the 31 studies. Multiple sites were involved in 80.6% of these 31 trials (averaging 30.4±23.8 (1–76) sites/study), and drug manufacturers sponsored 77.4% of them. Nominal trial duration was 3 weeks in 21 studies (67.7%) and protocol completion averaged 73.4% (26.6% drop out; Table 2).

Table 2 Characteristics of Included Randomized, Monotherapy Trials Comparing Two Active Drugs for Treatment of Acute Mania (N=27)

Comparisons of Individual Drugs vs Placebo

Meta-analysis indicated statistical superiority over placebo for 13/17 agents tested: aripiprazole (n=1662 subjects), asenapine (n=569), carbamazepine (n=427), cariprazine (n=235), haloperidol (n=1051), lithium (n=1199), olanzapine (n=1335), paliperidone (n=1001), quetiapine (n=1007), risperidone (n=823), tamoxifen (n=74), valproate (n=1046), and ziprasidone (n=663); and lack of efficacy in four others: lamotrigine (n=179), licarbazepine (n=313), topiramate (n=1074), and verapamil (n=20; Table 1; Figure 1). For the 13 effective drugs, the pooled effect size was moderate (in 48 trials involving 11 092 patients, Hedges’ g=0.42, 95% CI: 0.36–0.48; p<0.0001). In contrast, four agents with non-significant summary effects yielded a pooled effect size of <0.10 in seven trials with 1586 subjects (Hedges’ g= −0.03, CI: −0.13 to +0.08; p=0.62). For categorical responder rates, pooled RR for the 13 effective drugs was 1.52 (CI: 1.42–1.62) in 46 trials with 10 669 subjects (p<0.0001), and only 0.98 (CI: 0.82–1.19) in 7 trials of the 4 apparently ineffective agents with 1586 subjects (p=0.87; Table 3).

Figure 1

Forest plot of Hedges’ g with its 95% upper and lower limits (confidence interval (CI)) for mania score changes in 55 drug/placebo comparisons, based on random effects meta-analysis. Filled squares indicate pooled results of individual drugs (and their CI). Drugs are listed according to the magnitude of the pooled effect sizes (Hedges’ g).

Table 3 Results of Random Effects Meta-analyses for the Outcomes of Response as Risk Ratio, Absolute Difference in Responder Rates, and NNT with Drug vs Placebo Comparisons

Comparisons of Drug Classes vs Placebo

On the basis of primary outcome measure Hedges’ g, as a measure of improvement of mania ratings between drugs and placebo, SGAs as a group yielded an overall effect size of 0.40 (CI: 0.32–0.47 in 29 trials involving 7295 patients; p<0.0001). For mood stabilizers (MSs, including carbamazepine, lithium, and valproate), pooled effect size was 0.38 (CI: 0.26–0.50 in 13 trials involving 2672 patients; p<0.0001). The unique central PKC-inhibiting drug tamoxifen yielded an unusually large Hedges’ g of 2.32 (CI: 1.66–2.99; p<0.0001) in two small trials involving a total of 74 patients. Studies involving haloperidol as a standard active comparator (FGA), in its direct comparisons with placebo, yielded a pooled Hedges’ g of 0.54 (CI: 0.34–0.74; p<0.0001) in four trials with 1051 subjects.

With respect to categorical responder rates (Table 3), SGAs vs placebo yielded a pooled RR of 1.47 (CI: 1.36–1.59; 28 trials, 7094 patients, p<0.0001); MSs, as a group yielded pooled RR of 1.59 (CI: 1.39–1.82; 12 trials, 2450 patients, p<0.0001), again indicating similar summary effects and CIs. Tamoxifen yielded an unusually high RR of 7.46 (CI: 1.88–29.7; 2 trials, 74 patients, p=0.004). For haloperidol, RR was 1.58 (CI: 1.29–1.94; 4 trials, 1051 patients, p<0.0001). Estimates of NNTbenefit values (smaller NNT with greater efficacy) ranked: tamoxifen <haloperidol <MSs <SGAs (Table 3).
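For a single trial, the categorical measures reported here (RR, RD, and NNT) can be computed from responder counts as in the following sketch. The function name and the counts are hypothetical and do not come from any included trial; the CI for RR is computed on the log scale, a common convention that is assumed here rather than stated in the source.

```python
import math

def responder_contrast(resp_drug, n_drug, resp_placebo, n_placebo):
    """Return risk ratio (with 95% CI), rate difference, and NNT for one trial."""
    if resp_drug <= 0 or resp_placebo <= 0:
        raise ValueError("both arms need at least one responder for a risk ratio")
    p_drug = resp_drug / n_drug
    p_placebo = resp_placebo / n_placebo
    rr = p_drug / p_placebo
    # Standard error of log(RR).
    se_log_rr = math.sqrt(1.0 / resp_drug - 1.0 / n_drug
                          + 1.0 / resp_placebo - 1.0 / n_placebo)
    rr_ci = (math.exp(math.log(rr) - 1.96 * se_log_rr),
             math.exp(math.log(rr) + 1.96 * se_log_rr))
    # Absolute difference in responder rates and its reciprocal (NNT).
    rd = p_drug - p_placebo
    nnt = math.inf if rd == 0 else 1.0 / abs(rd)
    return {"RR": rr, "RR_CI": rr_ci, "RD": rd, "NNT": nnt}

# Hypothetical trial: 100/200 responders on drug, 66/200 on placebo.
out = responder_contrast(100, 200, 66, 200)
print(out)
```

In this hypothetical trial the responder rates are 50% and 33%, so RD is 0.17 and NNT is about 5.9, conventionally rounded up to 6. A positive RD corresponds to NNTbenefit and a negative RD to NNTharm.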

Direct Comparisons

On the basis of the improvement in mania ratings (Hedges’ g; Table 4), SGAs as a group yielded greater effect size than MSs (in eight trials with 1464 patients, Hedges’ g=0.17, CI: 0.07–0.28, p=0.001). Similarly, comparison of MSs vs all antipsychotics tested (SGAs or haloperidol) also favored the antipsychotics (Hedges’ g=0.18, CI: 0.08–0.28 in 10 trials with 1530 subjects, p<0.0001), and SGAs did not differ from haloperidol (Hedges’ g= −0.001, CI: −0.24 to +0.24 in six trials with 1536 subjects, p=0.99). Similarly, valproate and lithium did not differ significantly (Hedges’ g=0.11, CI: −0.04 to +0.26 in four trials with 679 subjects, p=0.16).

Table 4 Results of Random Effects Meta-analyses for the outcomes of Hedges’ g, Risk Ratio, and Rate Difference (absolute difference in responder rates) with Head-to-head Drug Comparisons

On the basis of categorical responder rates in direct comparisons (Table 4), SGAs again appeared to be somewhat more effective than MSs (RR=0.88, CI: 0.80–0.96, in six trials with 1443 subjects, p=0.006). Antipsychotics (SGAs or haloperidol) were similarly superior to, or faster acting than, MSs (RR=0.88, CI: 0.80–0.97, in seven trials with 1479 subjects, p=0.01). Direct comparisons of haloperidol (the only FGA tested) with SGAs indicated little or no difference (RR=0.93, CI: 0.79–1.10, in seven trials with 2166 patients, p=0.40), as did lithium vs valproate (RR=1.00, CI: 0.81–1.24, in four trials with 679 patients, p=1.00).

Factors Associated with Drug–Placebo Contrasts

Overall inter-study variance in effect sizes of drug–placebo contrasts was substantial (Q=47.6, df=12, p<0.0001; I²=70.4), encouraging consideration of possible explanatory factors. In regression models involving drug arms, we considered only the 13 agents found more effective than placebo, so as to avoid potential confounding by drug inefficacy, which itself would influence treatment effects (drug–placebo contrasts). We tested pre-selected covariates (study site counts, sample size, and initial manic symptom severity) for possible association with observed effect size (Hedges’ g) as a measure of treatment effect (difference in improvements in mania ratings between drug versus placebo), and mean difference (change in mania scores between baseline and final rating) to indicate drug or placebo effects. With these three covariates, statistical significance was set at two-tailed α=0.016 (0.05/3).

We found significant associations between higher number of collaborating study sites and smaller treatment effects (drug versus placebo: 48 trials; slope (β)= −0.007, CI: −0.01 to −0.003, z= −3.79, p=0.00015), as well as larger placebo effects (38 trials; β=+0.11, CI: 0.06–0.15, z=4.67, p<0.0001), but not drug effects (48 trials; β= −0.02, CI: −0.06 to +0.03, z= −0.80, p=0.43). As more study sites correspond to larger patient samples, we found similar associations between larger sample sizes and smaller treatment effects (48 trials; slope (β)= −0.001, CI: −0.003 to −0.0004, z= −2.63, p=0.008), and larger placebo effects (38 trials; β=+0.06, CI: 0.04–0.08, z=6.47, p<0.0001), but not drug effects (48 trials; β= −0.003, CI: −0.02 to +0.01, z= −0.30, p=0.77).

Treatment effects were unrelated to baseline symptom ratings (as the percentage-of-maximum attainable mania scores: 100%=60 for YMRS; 100%=52 for MRS, to avoid confounding by scaling differences) across 47 trials (β=0.43, CI: −0.57 to +0.65, z=0.14, p=0.89). However, higher baseline mania ratings predicted greater improvement with drug (46 trials; β=+0.26, CI: 0.13–0.40, z=3.80, p=0.0002), but not with placebo (36 trials; β=0.02, CI: −0.18 to +0.22, z=0.18, p=0.86).
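The rescaling of baseline ratings to a percentage of the maximum attainable score can be sketched as follows. The scale maxima (60 for YMRS, 52 for MRS) are those given in this report; the function name and the example baseline scores are hypothetical.

```python
# Maximum attainable total score for each mania rating scale.
SCALE_MAX = {"YMRS": 60, "MRS": 52}

def percent_of_maximum(score, scale):
    """Express a total mania score as a percentage of the scale maximum."""
    return 100.0 * score / SCALE_MAX[scale]

# Hypothetical baseline means: the same relative severity on both scales.
print(percent_of_maximum(30, "YMRS"))  # 50.0
print(percent_of_maximum(26, "MRS"))   # 50.0
```

Expressed this way, baseline severity from trials using either scale can enter the same meta-regression as a single covariate.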

Publication Bias

As studies with larger than average effects are more likely to be published, it is possible that the studies in a meta-analysis may overestimate the true effect size because they are based on a biased sample of the target population of studies. As a first step in exploring any evidence of such bias in the present meta-analysis, the funnel plot of the effect size (Hedges’ g) vs its standard error was plotted, which numerically (not visually) indicated some asymmetry in the distribution of the studies (Kendall's tau (τ)=0.19, z=2.02, p=0.04). As a next step for assessment of publication bias we evaluated the possibility that the entire effect is an artifact of bias by calculating Orwin's fail-safe N value, which was 140, suggesting that a large number of trials with zero effect would need to be added to the analysis to make the cumulative effect trivial (defined in this study as Hedges’ g<0.10). We made a concerted effort to include all available completed trials in mania, regardless of publication status, and could only include 38 studies with 56 comparisons (13 being trials with negative findings). Thus, it is very unlikely that we failed to identify such a large number of studies, or that the entire effect is an artifact of bias. For the primary meta-analyses including 56 placebo-controlled comparisons, trim and fill analysis identified and trimmed only one aberrant small study (of tamoxifen with 16 subjects; Zarate et al, 2007), before the funnel plot became symmetric about the adjusted effect size (Hedges’ g) of 0.37 (CI: 0.29–0.45), indicating only a trivial change in the observed overall effect size (Hedges’ g=0.37, CI: 0.31–0.42). When we considered only the trials for effective agents, however, trim and fill analysis did not identify any aberrant studies, and the summary effect remained unchanged at the Hedges’ g of 0.42 (CI: 0.36–0.48). Overall, these considerations indicate that the effect of publication bias in this meta-analysis was negligible.

DISCUSSION

Efficacy of Agents and Groups of Agents

The primary meta-analysis based on 10 800 ITT patients from 38 studies with 56 randomized, double-blind, placebo-controlled comparisons of 17 investigated drugs found that 13 agents (76.5%) were more effective than placebo for acute symptoms of mania. These included all eight SGAs tested, as well as haloperidol as the only FGA tested (widely used but never licensed for mania), tamoxifen (a central PKC inhibitor), two mood-stabilizing anticonvulsants (carbamazepine, valproate), and lithium. Agents that appeared to be most effective compared to placebo (based on effect size as Hedges’ g>0.50) were: tamoxifen (2.32, in two small, single-site trials), risperidone (0.66), carbamazepine (0.61, two trials), haloperidol (0.54), and cariprazine (0.51, one trial), whereas eight other agents had smaller effect sizes: olanzapine (Hedges’ g=0.46), ziprasidone (0.42), asenapine (0.40, two trials), quetiapine (0.40), lithium (0.39), paliperidone (0.30), valproate (0.28), and aripiprazole (0.26; Figure 1). To avoid bias we included all available data in all analyses. Pooled effect size estimates for aripiprazole and paliperidone involved trials with various doses of test drugs, only some of which were effective. When only highest doses were considered, the effect size for aripiprazole (at 30 mg/day) increased only slightly, from Hedges’ g of 0.26 to 0.31 (CI: 0.16–0.46 in five trials with 1405 subjects, p<0.0001) and its dose effects were very limited. Pooled effect size for paliperidone for the highest dose (12 mg/day) vs all doses increased substantially, from Hedges’ g of 0.30 to 0.51 (CI: 0.27–0.76 in two trials with 529 subjects, p<0.0001), and its dose effects were correspondingly robust. Four agents (lamotrigine, S-licarbazepine (principal active metabolite of oxcarbazepine, in one comparison with placebo), topiramate, and verapamil) were apparently ineffective in mania (all Hedges’ g= −0.06 to +0.09; Figure 1).
Of note, the proposed mechanisms of action of effective and ineffective agents did not appear to account for efficacy. For example, some effective and ineffective anticonvulsant–antimanics shared the ability to block sodium channels or to potentiate the inhibitory amino acid neurotransmitter GABA.

Agents found to be more effective than placebo demonstrated moderate absolute differences in responder rates (RDs=0.17), medium overall effect size (Hedges’ g=0.42), and NNT (6; Table 3). Exclusion of two small studies of tamoxifen with large drug–placebo differences did not change these results (RD=0.17, Hedges’ g=0.41, NNT=6). The close similarity of these computed measures of drug-over-placebo efficacy to the meta-analytically pooled efficacy of SGAs in schizophrenia is noteworthy (Leucht et al, 2009).

We also identified 32 direct, head-to-head drug comparisons, but they were limited in the range of drugs studied, and not all were double-blind or placebo controlled (Table 2). Various types of antipsychotic drugs appeared to be somewhat more effective than MSs; and SGAs did not differ appreciably from haloperidol (the only FGA tested; Table 4). Despite compelling evidence of antimanic efficacy for haloperidol (Hedges’ g=0.54), FGAs are no longer commonly used to treat acute mania, owing mainly to their unfavorable risks of short- and long-term adverse effects that need to be balanced against considerable long-term adverse metabolic effects of some SGAs (Baldessarini and Tarazi, 2005). Although relatively few direct comparisons seemed to favor SGAs over MSs for acute mania (Table 4), these groups of drugs showed similar pooled effect sizes when compared with placebo (Table 3). Moreover, all of the trials considered were very short (approximately only 2 weeks, when drop-out rates are considered), raising the possibility that speed-of-clinical action may favor the antipsychotics, especially through their almost immediate nonspecific or sedating actions (Baldessarini and Tarazi, 2005). As full clinical recovery from acute mania typically requires many weeks, the effects of SGAs versus MSs should be followed for longer times (Bowden et al, 2008; Tohen et al, 2008; Vieta et al, 2010b). In the absence of such long term, direct comparisons, one can consider the similar effect sizes of MSs and SGAs, the established neuroprotective and neurotrophic effects of MSs (Chang et al, 2009; Manji et al, 2000), and the long-term adverse metabolic effects of some SGAs (Baldessarini and Tarazi, 2005) in attempting to compare these classes of effective antimanic agents for clinical selection in the treatment of acute mania. 
Whereas the findings of the trials reviewed above strongly indicate that many candidate antimanic agents are significantly more effective than placebo, their similar effect sizes and overlapping CIs make it hard to conclude that one type is superior to another. Moreover, current clinical practice, driven largely by pressures of time and cost, often uses more than one treatment to bring mania under control as quickly as possible, often combinations of MSs (carbamazepine, lithium, valproate), antipsychotics, and potent sedatives, at least temporarily (Baldessarini and Tarazi, 2005; Centorrino et al, 2010). A further question that clinicians may take into account when prescribing antimanic drugs, which goes beyond the scope of this meta-analysis, is their capacity to protect against switch into depression. Thus, the possibility that the most effective antimanic agents might not necessarily be the best to prevent depression may count against their use in clinical practice and might also explain why some combinations are more widely used than others (Vieta et al, 2009).

Factors Associated with Treatment Effects

This large database yielded evidence for smaller drug–placebo contrasts, and greater placebo-associated benefits, in trials of acute mania involving larger numbers of collaborating sites, as well as larger patient samples. Two small, single-site studies of tamoxifen yielded remarkably large apparent therapeutic effects with particularly small placebo effects (Table 3). Post-hoc meta-regression after exclusion of these two tamoxifen trials confirmed the observed associations between higher number of collaborating study sites and smaller drug–placebo contrasts (46 trials; β= −0.05, CI: −0.009 to −0.002, z= −2.86, p=0.004), as well as greater placebo-induced improvement in mania ratings (36 trials; β=+0.06, CI: 0.03–0.10, z=3.66, p=0.00025). Regarding sample sizes, although the association between larger sample sizes and smaller treatment effects was no longer observed, greater placebo-associated benefit in larger trials of acute mania (36 trials; β=+0.04, CI: 0.02–0.05, z=4.17, p=0.00003) was supported after exclusion of the two tamoxifen trials, indicating that small studies are likely to encounter lesser placebo effects. Sterne et al (2001) stated that the effect size may be larger in small studies because of retrieval of a biased sample of the smaller studies, but it is also possible that the effect size really is larger in smaller studies for entirely unrelated reasons: the small studies may have been performed using patients who were quite ill (and therefore more likely to benefit from drugs, as indicated in this report; β=+0.26, p=0.0002), or with better (or worse) quality control than the large ones.

Meta-regression modeling found that drug-associated benefit increased with initial symptom severity (based on mania ratings at intake). In a meta-analysis based on individual responses, Fournier et al (2010) reported that drug–placebo differences, and clinical change in symptoms of MDD during treatment with placebo or antidepressants, all tended to increase as initial severity of depression increased. However, in acute bipolar mania initial manic symptom severity did not appear to enhance observed drug–placebo contrasts, but amplified benefit from the drugs selectively. This may relate to the view that more severely ill patients better represent a target phenotype, or that initially high scores have more room for improvement. These observations suggest that the law of initial values (more deviant initial assessments tend to yield greater change with interventions) may well apply to experimental therapeutics, perhaps with different patterns for different disorders (Benjamin, 1963).

Study Limitations

Despite vigorous efforts to gain access to data from all available relevant trials, it is possible that some findings, especially negative ones, were not accessed. For some treatments, available numbers of trials and subjects were small, and most trials did not provide sufficient data to evaluate effects of treatment exposure time, or of other demographic or clinical factors that might suggest subgroups of particular interest. Also, some subgroup analyses involved particularly few trials or subjects, or involved substantial inter-study variance (eg, effects of verapamil, or of lithium versus anticonvulsants), and their results should be interpreted with caution.

Conclusions

The present comprehensive meta-analysis of randomized, controlled trials of treatments for acute bipolar mania indicates at least moderate effect sizes, with statistical superiority over placebo found with 13/17 drugs, most of which are in common clinical use. In trials of individual drugs vs placebo, efficacy measures (differences in improvement of mania ratings or rates of response (achieving 50% improvement) in 2–3 weeks) and their 95% CIs were similar among most of the effective agents identified, and so do not indicate clear superiority of one agent or drug class over others. Nevertheless, a limited number of direct comparisons indicated that antipsychotic agents (SGAs or haloperidol) may have had somewhat superior apparent efficacy or more rapid action than the group of mood stabilizers tested (carbamazepine, lithium, valproate). Further development of improved antimanic drugs calls for agents with even better efficacy through clinical remission with better short- and long-term tolerability, as well as further testing of relative efficacy of existing compounds in more head-to-head, randomized comparisons.