We conducted a systematic review and meta-analysis of randomized controlled trials that compared second-generation antipsychotic (SGA) drugs with placebo in schizophrenic patients and that considered 13 different outcome measures. Thirty-eight randomized controlled trials with 7323 participants were included. All SGA drugs were more effective than placebo, but the pooled effect size (ES) for overall symptoms (primary outcome) was moderate (−0.51). The absolute difference (RD) in responder rates was 18% (41% responded to drug compared with 24% to placebo, number needed to treat=6). Similar ESs were found for the other efficacy parameters: negative symptoms (ES=−0.39), positive symptoms (ES=−0.48), depression (ES=−0.26), relapse (RD 20%) and discontinuation due to inefficacy (RD 17%). Curiously, the efficacy of haloperidol for negative and depressive symptoms was similar to that of the SGA drugs. In contrast to haloperidol, there was no difference in terms of EPS between any SGA drugs and placebo, and there was also no difference in terms of dropouts due to adverse events. Meta-regression showed a decline in treatment response over time, and a funnel plot suggested the possibility of publication bias. We conclude that the drug versus placebo difference of SGA drugs and haloperidol in recent trials was moderate, and that there is much room for more efficacious compounds. Whether methodological issues account in part for the relatively low efficacy ESs and the scarcity of adverse event differences compared with placebo needs to be established.
Recent critics of psychotropic agents have claimed that these drugs are not efficacious. For example, the efficacy of anticholinesterase inhibitors for Alzheimer's dementia has been questioned,1 as has the efficacy of modern antidepressants, where Moncrieff and Kirsch2 found only a two-point difference between drug and placebo on the Hamilton rating scale for depression. In this context, we present a meta-analysis of 38 randomized controlled trials with 7323 participants comparing second-generation (atypical) antipsychotic (SGA) drugs with placebo. The aim is to assess the efficacy and safety of SGA drugs based on 13 outcomes. This large database allows for some judgments on the efficacy of antipsychotic drugs in general, and the degree of efficacy has implications for the interpretation of comparisons between SGA drugs and conventional antipsychotics. The review also assesses how well it can be documented that the newer drugs cause certain adverse effects. New versus old drug comparisons may establish that the new drug has a lower incidence of adverse effects, but they do not establish whether the newer drug can cause that adverse effect. Finally, the database allows for the discussion of a number of design issues in the context of placebo-controlled research in schizophrenia.
Materials and methods
We searched the register of the Cochrane Schizophrenia Group (CSG) for randomized controlled trials that compared oral routes of administration of SGAs (search terms: amisulpride, aripiprazole, clozapine, olanzapine, quetiapine, risperidone, sertindole, ziprasidone and zotepine) with placebo and/or conventional antipsychotics in the treatment of schizophrenia or related disorders (schizoaffective, schizophreniform or delusional disorder, any diagnostic criteria). There were no language restrictions. The last search was made in August 2005; since then, studies from monthly MEDLINE searches until September 2006 were added. The CSG register is compiled by regular methodical searches in numerous electronic databases (BIOSIS, CINAHL, Dissertation abstracts, EMBASE, LILACS, MEDLINE, PSYNDEX, PsycINFO, RUSSMED and Sociofile), supplemented by the hand searching of relevant journals and numerous conference proceedings (for details see the description of the Cochrane Schizophrenia Group3). We also searched the FDA web site and previous reviews4, 5 including those of the Cochrane Collaboration. Only studies meeting the quality criteria A (adequate randomization) and B (usually studies stated to be randomized without further details) according to the Cochrane handbook were included.6 We used only optimum doses of SGA drugs in fixed-dose studies as determined in controlled dose-finding studies as follows: amisulpride 50–300 mg day−1 for predominantly negative symptoms and 400–800 mg day−1 for positive symptoms, aripiprazole 10–30 mg day−1, olanzapine 10–20 mg day−1, quetiapine >250 mg day−1, risperidone 4–6 mg day−1, sertindole 16–24 mg day−1 and ziprasidone 120–160 mg day−1. It should be noted that there is a debate about the optimum quetiapine doses, but there is no evidence from dose-finding studies that shows higher doses are more efficacious. Indeed, in the studies included here the 750 mg quetiapine per day group was the least effective one.7 Eleven studies had an additional haloperidol arm. The results of the studies' haloperidol groups as compared with placebo were also pooled as a benchmark.
Data extraction and outcome parameters
All data were extracted independently by two reviewers. The first authors (when addresses were available) and all SGA drug manufacturers were contacted for missing data. The primary outcome of interest was the mean overall change of symptoms according to the following hierarchy: the change of the Positive and Negative Syndrome Scale (PANSS8) total score from baseline, if not available the change of the Brief Psychiatric Rating Scale (BPRS9), then values at study end point of these scales, all based on the intent-to-treat data set whenever available. We also analyzed negative symptoms, positive symptoms, depressive symptoms and overall quality of life in a similar fashion. For dichotomous efficacy measures, we analyzed responder rates, relapse rates and dropouts due to inefficacy. The hierarchy for responder rates was 50% or more reduction from baseline on the PANSS/BPRS or better; or a Clinical Global Impression10 of much improved where available; followed by the authors' definitions, which were usually at least 20 or 30% PANSS/BPRS reduction. Adverse effect outcomes were based on use of antiparkinson medication, mean EPS score (Simpson Angus Scale (SAS11), Extrapyramidal Symptoms Rating Scale (ESRS12)), dropouts due to adverse events and sedation. Dropouts for any reason were analyzed as a measure of acceptability of treatment. In a ‘once randomized–analyzed’ approach, we assumed in the case of dichotomous data that participants who dropped out prior to completion had no change in their condition.
Standardized mean differences (SMDs) based on Hedges's adjusted g and its 95% confidence intervals (CIs) were calculated for continuous data. When s.d. were not reported, we either derived them from other measures of variability or P-values, or we used the average s.d. of the other studies. For dichotomous data, relative risks (RRs) and risk differences (RDs) along with their 95% CIs were calculated. We believe that both measures are important. The mathematical properties of RR are somewhat better than those of RD, because they make an adjustment for baseline risks.13 But RRs are often misinterpreted by clinicians.14, 15 The number of patients needed to treat (NNT) or the number of participants needed to harm were calculated as the inverse of the RD. We also showed the percentages in each group, because we feel that this is crucial for the reader to be able to appreciate the results. For example, a RR reduction of 50% is not meaningful if the reader does not know whether these underlying percentages are 60 versus 30% or 4 versus 2%.
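The effect measures described above can be illustrated with a minimal sketch. It is not the software used in the review (calculations there were done with Comprehensive Meta-Analysis); the function names and the example counts are hypothetical, and the formulas are the standard textbook ones for Hedges's adjusted g, RR, RD and NNT, with a normal-approximation 95% CI assumed for g.

```python
import math


def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Hedges's adjusted g (treatment minus control) with a 95% CI.

    Uses the pooled standard deviation and the usual small-sample
    correction factor J = 1 - 3 / (4 * df - 1).
    """
    df = n_t + n_c - 2
    pooled_sd = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / df)
    d = (mean_t - mean_c) / pooled_sd
    j = 1 - 3 / (4 * df - 1)
    g = j * d
    # Approximate variance of d, scaled by J^2 to give the variance of g.
    var_d = (n_t + n_c) / (n_t * n_c) + d ** 2 / (2 * (n_t + n_c))
    se_g = j * math.sqrt(var_d)
    return g, g - 1.96 * se_g, g + 1.96 * se_g


def risk_measures(events_t, n_t, events_c, n_c):
    """Relative risk, risk difference and the unrounded reciprocal 1/|RD|."""
    p_t = events_t / n_t
    p_c = events_c / n_c
    rr = p_t / p_c
    rd = p_t - p_c
    nnt = 1 / abs(rd) if rd != 0 else math.inf
    return rr, rd, nnt


if __name__ == "__main__":
    # Hypothetical single study: mean symptom change of -15 vs -5 points.
    print(hedges_g(-15.0, 20.0, 100, -5.0, 20.0, 100))
    # Hypothetical arms of 100 each, counting NON-responders as events
    # (59 on drug, 76 on placebo, i.e. 41% vs 24% responders).
    print(risk_measures(59, 100, 76, 100))
```

The second call uses round illustrative counts only; pooled estimates in a meta-analysis are weighted across studies, so they need not coincide with a calculation from overall percentages. The reciprocal of the RD is returned unrounded so that the rounding convention for the NNT stays with the caller.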
We explored study heterogeneity by using the I² statistic, a measure estimating what proportion of the total variance across studies is due to heterogeneity rather than chance.16 Since in some of the analyses there was considerable heterogeneity, we applied the random effects model by DerSimonian and Laird17 throughout for the pooling of the studies. Random-effect models are in general more conservative than fixed-effect models, because they take heterogeneity among studies into account. When studies had several arms (for example, risperidone, quetiapine and placebo), we used the mean of the single arms to avoid counting the same participants twice.
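As an illustration of the pooling step, the following sketch implements the standard DerSimonian and Laird method-of-moments estimator together with Cochran's Q and I². It is a simplified reconstruction under textbook formulas, not the code used for the review; the function name and the returned dictionary keys are hypothetical.

```python
import math


def dersimonian_laird(effects, variances):
    """Random-effects pooling (DerSimonian-Laird) with Q, tau^2 and I^2.

    effects   -- per-study effect sizes (e.g. Hedges's g)
    variances -- per-study sampling variances of those effect sizes
    """
    k = len(effects)
    w = [1.0 / v for v in variances]            # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    # Random-effects weights add the between-study variance tau^2.
    w_re = [1.0 / (v + tau2) for v in variances]
    sw_re = sum(w_re)
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sw_re
    se = math.sqrt(1.0 / sw_re)
    return {
        "pooled": pooled,
        "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
        "q": q,
        "tau2": tau2,
        "i2": i2,
    }
```

Because tau² is added to every study's variance, the random-effects confidence interval is never narrower than the fixed-effect one, which is the sense in which the model is more conservative.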
Unrestricted maximum likelihood random effects meta-regression was used to find whether there was a change of the primary efficacy outcome (mean change of overall symptoms) over time using publication year as a moderator.
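The idea of a meta-regression on publication year can be sketched as a weighted straight-line fit. Note the simplification: the review used unrestricted maximum likelihood estimation, whereas this sketch takes the between-study variance `tau2` as a given input (an assumption) and fits by weighted least squares. The function name and the example data are hypothetical.

```python
import math


def weighted_meta_regression(years, effects, variances, tau2=0.0):
    """Weighted least-squares fit of effect size on publication year.

    Weights are 1 / (sampling variance + tau2). Returns the intercept,
    the slope (change in effect size per year) and the slope's standard
    error under the assumption that the weights are known.
    """
    w = [1.0 / (v + tau2) for v in variances]
    sw = sum(w)
    x_bar = sum(wi * x for wi, x in zip(w, years)) / sw
    y_bar = sum(wi * y for wi, y in zip(w, effects)) / sw
    sxx = sum(wi * (x - x_bar) ** 2 for wi, x in zip(w, years))
    sxy = sum(wi * (x - x_bar) * (y - y_bar)
              for wi, x, y in zip(w, years, effects))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    se_slope = math.sqrt(1.0 / sxx)
    return intercept, slope, se_slope


if __name__ == "__main__":
    # Hypothetical studies whose drug-placebo difference shrinks over time.
    years = [1990, 1995, 2000, 2005]
    effects = [-0.8, -0.6, -0.4, -0.2]
    variances = [0.01, 0.01, 0.01, 0.01]
    print(weighted_meta_regression(years, effects, variances))
```

With negative SMDs favoring the drug, a positive slope corresponds to a drug-placebo difference that becomes smaller in later publications.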
We made a sensitivity analysis excluding studies that consisted of patients with predominantly negative symptoms,18, 19, 20, 21, 22, 23 long-term studies on initially stable patients24, 25, 26, 27, 28 and one very short study of only 2 weeks duration.29 Owing to space limitations, we do not show the results here, but any result that deviated to an important extent from the primary analysis will be mentioned.
Studies with negative results are less likely to be published than studies with significant results. The possibility of such publication bias was examined by applying the ‘funnel plot’ method described by Egger et al.30 to the primary outcome (mean change of overall symptoms). All calculations were done with Comprehensive Meta-Analysis Version 2.31 The exact formulae are reported there. Two-sided P-values <0.05 were considered statistically significant.
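Funnel plot asymmetry is commonly quantified with Egger's regression, in which each study's standardized effect (effect divided by its standard error) is regressed on its precision (the reciprocal of the standard error); an intercept far from zero suggests small-study effects. The sketch below is a plain ordinary least-squares version under that standard formulation, not the implementation in Comprehensive Meta-Analysis. The function name is hypothetical, and the P-value lookup from the t distribution is deliberately left out.

```python
import math


def egger_regression(effects, std_errors):
    """Egger's regression intercept with its t statistic.

    Returns (intercept, se_intercept, t_statistic, degrees_of_freedom),
    where the degrees of freedom are the number of studies minus two.
    """
    k = len(effects)
    x = [1.0 / se for se in std_errors]                 # precision
    y = [e / se for e, se in zip(effects, std_errors)]  # standardized effect
    x_bar = sum(x) / k
    y_bar = sum(y) / k
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    df = k - 2
    residual_ss = sum((yi - (intercept + slope * xi)) ** 2
                      for xi, yi in zip(x, y))
    s2 = residual_ss / df
    se_intercept = math.sqrt(s2 * (1.0 / k + x_bar ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, se_intercept, t_stat, df
```

The t statistic would then be compared with a t distribution on k − 2 degrees of freedom; with 35 studies this gives 33 degrees of freedom.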
The searches in the register of the Cochrane Schizophrenia Group yielded 4166 citations. Of those publications that we ordered for inspection, 107 studies were excluded for the following reasons: no or inadequate randomization (N=50), no appropriate intervention or control group (N=29), inappropriate participants (N=2), no usable data (N=24), presentation of only a subgroup (N=1) and very short duration (5 days, N=1). The results of 202 studies comparing SGAs with first-generation antipsychotics will be reported elsewhere. Thirty-eight studies with 7323 participants were included (only the principal publication of each study is referenced): amisulpride (N=5), aripiprazole (N=7), clozapine (N=1), olanzapine (N=6), quetiapine (N=5), risperidone (N=7), sertindole (N=3), ziprasidone (N=4), zotepine (N=3; three studies provided results on two SGA drugs). Most of the studies were short-term and examined patients with positive symptoms, while only six studies examined patients with predominantly negative symptoms (four amisulpride studies, one olanzapine and amisulpride study and one zotepine study). Almost all studies were conducted by pharmaceutical companies and usually for registrational purposes. The minimum duration of washout was usually not more than a few days. The median of mean age was 38 years (see Table 1).
The results in terms of SMDs or RRs for the single drugs are shown in Figures 1–11. Table 2 presents pooled results of the single drugs based on RDs and NNTs. Table 3 presents all results for dropout rates.
All antipsychotics were significantly more efficacious than placebo in the treatment of overall symptoms (primary outcome). Nevertheless, with the exception of clozapine (SMD=–1.65, based on only one study), the effect sizes (ESs) were moderate (pooled ES of all SGA drugs: N=35, n=5568, SMD=−0.51, CI: −0.58 to −0.43, P<0.0001). This point is underscored by the difference in responder rates. Sertindole was not significantly more efficacious than placebo, and quetiapine and zotepine were only significant in the sensitivity analysis (see Figure 2). The pooled RR across SGA drugs was 0.78 (CI: 0.73–0.83, N=28, n=4498, P<0.0001), and the associated RD was –0.18 (CI: −0.22 to −0.14, n=4498, P<0.0001), thus reflecting an 18% difference in responder rates (overall 41 versus 24% responded under SGA drugs and placebo, respectively) or an NNT of 6 (CI: 5–7). The sensitivity analysis found an almost identical RR (0.79) and RD (−0.17). The funnel plot was asymmetrical, raising the possibility that studies with negative results have not been published (Egger's regression intercept, d.f.=33, P<0.001, see Figure 3). The meta-regression with publication year as a moderator suggested the drug–placebo difference may have become smaller over time (see Figure 4). This effect was no longer statistically significant in the sensitivity analysis excluding patients with predominantly negative symptoms and long-term or very short-term studies. It should be noted that the subset in the sensitivity analysis was more homogeneous, but statistical power was also reduced (see Figures 1, 2 and 4; Table 2).
Seven studies on relapse of 6–12 months duration showed that aripiprazole, olanzapine, ziprasidone and zotepine reduced the relapse risk significantly more than placebo. The RR for relapse suggested a more pronounced superiority of the SGA drugs than the RR for responder rates, but the RD was similar (all SGA drugs combined: N=7, n=1371, RR=0.41, CI: 0.28 to 0.59, RD=−0.20, CI: −0.30 to −0.11, NNT=5, CI: 3–9, P<0.0001). Amisulpride was not superior to placebo. Data on clozapine, risperidone and sertindole were not available (see Figure 5).
Amisulpride and zotepine showed no difference in positive symptoms compared with placebo, but for both drugs only studies on patients with predominantly negative symptoms were available. While there were no data on clozapine, the other antipsychotics were significantly more effective than placebo in the treatment of positive symptoms, with ESs ranging between −0.36 (aripiprazole) and −0.82 (risperidone). The pooled ES across SGA drugs was: N=30, n=4941, SMD=−0.48, CI: −0.57 to −0.38, P<0.0001 (see Figure 6).
In contrast to clozapine (only one study) and quetiapine (P=0.07, the effect was significant in the sensitivity analysis excluding the only 2-week study), the other antipsychotics improved negative symptoms more than placebo. The ESs for negative symptoms were usually lower than those for positive symptoms (ES across SGA drugs: N=36, n=5403, SMD=−0.39, CI: −0.45 to −0.33, P<0.0001). It should be noted that most of the studies investigated patients with predominantly positive symptoms. Amisulpride is the only compound for which several studies on patients suffering predominantly from negative symptoms are available. Such populations are more appropriate for examining effects on negative symptoms. One such study showed no superiority of zotepine.23 In one such olanzapine study, the 5 mg day−1 group was effective, but the 20 mg day−1 group was not.20 Haloperidol also reduced overall negative symptoms significantly more than placebo (see Figure 7).
On the basis of more limited data (14 studies), the SGA drugs also reduced depressive symptoms more than placebo (N=14, n=1910, SMD=−0.26, CI: −0.38 to −0.15, P<0.0001). Amisulpride, haloperidol, olanzapine, ziprasidone and zotepine were found statistically significantly superior to placebo. Haloperidol also significantly reduced depression scores (see Figure 8).
Quality of life
Two olanzapine studies20, 41 found olanzapine significantly superior to placebo on overall quality of life (N=2, n=406, SMD=−0.38, CI: −0.59 to −0.17, P=0.0003). Möller et al.23 found no significant superiority of zotepine (combined effect on the physical and the psychic components of the SF-36 scale: n=72, SMD=−0.24, CI: −0.70 to 0.22, P=0.309). No data on the other SGA drugs and haloperidol compared with placebo are available.
Extrapyramidal adverse effects
Although all antipsychotics were numerically more sedating than placebo (see Figure 10), statistical significance was reached only for haloperidol, quetiapine and zotepine using the RR and for aripiprazole, haloperidol and quetiapine using the RD. The pooled effect across SGA drugs was: N=21, n=3367, RR=1.91, CI: 1.44–2.52; RD=0.08, CI: 0.04–0.11, NNT=13, CI: 9–25. It is not possible to disentangle the effects of concomitant benzodiazepines from those of the antipsychotics using meta-analysis (see Figure 11 and Table 2).
Amisulpride, olanzapine, sertindole and zotepine were not associated with significantly lower rates of all-cause dropouts than placebo, whereas aripiprazole, clozapine, quetiapine, risperidone and ziprasidone were. This composite measure of efficacy, tolerability and other factors has been used as a proxy measure for acceptability of treatment.3 The overall dropout rate when all studies were combined was as high as 47% (pooled ES SGA drugs versus placebo: N=37, n=6001, RR=0.75, CI: 0.69–0.82; RD=−0.14, CI: −0.18 to −0.10, NNT=7, CI: 6–10, P<0.0001) (see Table 3).
Dropouts due to insufficient efficacy confirmed that all antipsychotics were superior to placebo (pooled results across SGA drugs: N=36, n=5809, RR=0.52, CI: 0.45–0.59, RD=−0.17, CI: −0.20 to −0.13, NNT=6, CI: 5–8, P<0.0001).
No antipsychotic, not even haloperidol, was associated with significantly increased RR in terms of dropouts due to adverse events, but there was a significantly increased RD for haloperidol and sertindole. The pooled ES across SGA drugs was also not significant: N=31, n=5320, RR=1.1, CI: 0.72–1.51, P=0.81, RD=0.01, CI: −0.01 to 0.03, P=0.46. In the sensitivity analysis, aripiprazole was even associated with fewer dropouts due to adverse events than placebo, while an increased risk was found for ziprasidone, haloperidol and sertindole.
This review, based on 38 randomized controlled trials with 7323 participants, demonstrates the efficacy of SGA drugs over placebo on various measures of response, relapse and discontinuation due to poor efficacy. Nevertheless, the relatively small absolute difference in responder rates of 18%, translating into an NNT of six, and the medium ES for the primary outcome (change of overall symptoms) of −0.51 are striking. Furthermore, we found that the drug–placebo difference diminished over time. This effect had already been reported in an analysis of psychiatric trials by Trikalinos et al.57
Cohen58 described an ES of −0.50 as large enough to be visible to the naked eye, for example, the difference in height between 14-year-old and 18-year-old girls (about 1 inch) or the difference in IQ between clerical and semiskilled workers. We pooled the (usually earlier) studies using the BPRS and found an absolute difference of nine BPRS points between SGA drugs and placebo, which we translate into a difference of one point on the Clinical Global Impression Scale.59 We pooled the more recent studies using the PANSS and found a difference of 10 points. According to Leucht et al.,59 a PANSS total score difference of 15 points reflects minimal improvement according to the CGI.
The meta-analysis confirms that the SGA drugs are no ‘wonder drugs’ in terms of efficacy, and that there is much room for better medication, confirming recent naturalistic studies.60, 61 But the fact that the ESs of the haloperidol arm studies also revealed a moderate effect raises the question whether antipsychotic drugs have previously been overestimated. An early NIMH study has often been quoted as a proof for a strong effect of antipsychotic drugs.62 In this study (n=344), the response rate to drug was 61% compared with 22% in the placebo group, resulting in a response rate difference of 41% or an NNT of 2. In contrast to the more chronic participants in our studies, half of the participants had a first episode of schizophrenia and received antipsychotic drugs for the first time. The Cochrane Review comparing the standard drug chlorpromazine to placebo found an NNT of 4 in the short-term (n=590, 11 studies, response rate drug 65.9%, placebo 41.5%, weighted RD 25%) and 6 in the medium term (n=1121, 13 randomized controlled trials, response rate drug 28.1%, response rate placebo 13.1%, weighted RD 18%).63 The Cochrane Review on haloperidol showed a pronounced superiority over placebo, with an NNT of 3 in both short- and medium-term studies (medium-term results: n=308, eight studies, response rate drug 43.8%, placebo 14.4%, weighted RD 32%64). An older review on maintenance treatment found substantially lower relapse rates of 16% in the antipsychotic group compared with 53% in the placebo group.65 The relapse results of our review also suggested a more pronounced long-term superiority of SGA drugs, at least in terms of RRs. In summary, these reviews highlight that there is a substantial placebo response, which in our sample was 24% based on a response definition of at least 20–30% total score reduction in two-thirds of the studies. Consistent with our meta-regression analyzing publication year as a moderator, the degree of improvement seems to decrease over time.
An obvious question is whether design issues can, at least partly, account for these findings. A mean age of 37.5 years suggests that the participants were relatively chronic. Less chronic patients respond better to antipsychotic drugs. For example, the mean age in the early NIMH study mentioned above was 28.2 years.62 Remission rates of more than 80% have been achieved in 1-year studies of first-episode patients.66, 67 The generalizability of recent studies is called into question by the fact that only 10–15% of the eligible schizophrenic patients are entered into clinical trials.68, 69 ‘Failed studies’ in which neither haloperidol nor the SGA drugs were better than placebo cannot explain the relatively small difference, because the pooled ES of studies with a significantly effective haloperidol arm was similar (Hedges's g=−0.54, RD=−0.16). The high dropout rates in the studies (overall 47%) may decrease the drug–placebo difference, because the antipsychotic drugs do not have time to develop their full effects, and the full deterioration under placebo is also decreased if participants are prematurely taken out of the trial. The Cochrane Review on haloperidol excluded studies with dropout rates higher than 50% and found an NNT of 3.64 We did not apply such an approach, because it is not clear what degree of attrition will clearly bias the results and in which direction. On the other hand, the studies in the Cochrane Review were all published before 1993 and were arguably less ‘sophisticated,’ for example, because some did not use standardized diagnostic criteria and rating scales, had small sample sizes or were carried out in single centers. It should also be noted that haloperidol is a high-potency antipsychotic drug. CATIE and CUtLASS suggested that these results are not necessarily representative of low-potency or intermediate-potency antipsychotics.60, 61 Preliminary work by Davis et al.4 had revealed a similar ES for the difference between haloperidol and placebo equaling 0.60.
The funnel plots raised the possibility of publication bias. This method must be interpreted with caution, because there can be other reasons for the plot asymmetry, especially true heterogeneity, because studies with different SGA drugs and possibly different efficacy were pooled.30 Another issue is that almost all included studies were organized by pharmaceutical companies, which may not have published studies with a small drug–placebo difference, raising the possibility of an ‘industry bias.’70 Nevertheless, even if methodological issues accounted, in part, for the small differences, it is difficult to interpret the effectiveness of SGA drugs in clinical practice. On the one hand, the data clearly show that the SGA drugs are no wonder drugs in terms of efficacy, producing a moderate ES (0.51) in comparison with placebo. But what then do the ESs of those SGA drugs that were more efficacious than first-generation antipsychotics in the analysis by Davis et al.4 mean? They ranged between 0.21 (olanzapine) and 0.49 (clozapine).
Other problems are evident when negative symptoms and depression are considered. With the exception of amisulpride and olanzapine (5 but not 20 mg day−1 20), there is still no proof that SGA drugs are effective for predominantly negative symptoms, because populations with predominantly positive symptoms are simply not appropriate to examine this issue due to secondary effects, and statistical methods such as path analysis can only, in part, account for these effects.71 Even more surprising was that haloperidol decreases not only negative symptoms, but also depressive symptoms significantly more than placebo. It has been said that conventional antipsychotics induce depression rather than alleviate it.72 It may be that the depression-inducing effect is a long-term one, while haloperidol improves depression in the short run. But it is also possible that all these symptoms are truly related and the expression of the same underlying pathology.
While there was substantial evidence that haloperidol produces EPS, none of the SGA drugs induced significantly more EPS than placebo. This finding demonstrates that the EPS risks of all SGA drugs are low, but it does not prove that they are all free from these adverse effects. A meta-analysis comparing SGA drugs with placebo in bipolar mania suggested that some SGA drugs do induce EPS.73 There is some evidence that bipolar patients are more sensitive to EPS than people with schizophrenia.74 But many participants in schizophrenia trials were previously treated with antipsychotics for long periods with washout periods often lasting only a few days. In contrast, many mania patients had much less exposure to antipsychotics. Consequently, carryover effects of prior treatment may have reduced drug–placebo differences. Indeed, if overall rates of use of antiparkinson medication under the different SGA drugs are considered rather than ESs compared with placebo, these rates were considerable for some SGA drugs (amisulpride 2%—please note that these were low doses up to only 300 mg day−1, aripiprazole 13.3%, clozapine—no data available, olanzapine 15.6%, quetiapine 9.5%, risperidone 32.3%, sertindole 12.7%, ziprasidone 21.3%, zotepine 9.5%), although all were clearly lower than for haloperidol (47.6%). The placebo rates varied from 2.5% (amisulpride studies) to 25.9% (risperidone studies). Longer washout periods would improve the sensitivity for detecting EPS differences but are problematic from an ethical point of view.
There were few differences between antipsychotics and placebo in terms of dropouts due to adverse events. Unfortunately, some efficacy-related adverse events such as agitation due to insufficient efficacy are counted as adverse events and falsely inflate the placebo ‘adverse event’ tabulation. For example, we found fewer dropouts due to adverse events with aripiprazole than with placebo. We suggest that only adverse events reflecting side effects should be presented as dropouts due to adverse events to make this outcome a useful global measure of tolerability.
For perspective, even internal medicine drugs rarely cure. It is impossible to make direct comparisons of different treatments of different diseases, but (to give an example of a contemporary drug) the NNT to avoid vascular events or death by statins is higher than 100.75 The implication of our findings for antipsychotic drug development is that there is a great deal of room for improvement, let alone cure.
We are much indebted to the Cochrane Schizophrenia Group. Without access to its register of randomized controlled trials, this review would not have been possible. We also thank AstraZeneca, BristolMyersSquibb, EliLilly, Lundbeck and Sanofi-Aventis for providing unpublished data.
Disclosure/Conflict of interest
This meta-analysis received no funding. Stefan Leucht has received speaker or consultancy honoraria from Sanofi-Aventis, BMS, Lilly, Janssen, Lundbeck and Pfizer. Lilly and Sanofi-Aventis sponsored research projects by Dr Leucht. Werner Kissling has received speaker or consultancy honoraria from Sanofi-Aventis, BMS, Lilly, Janssen, Lundbeck, Bayer and Pfizer. Dieter Arbter, Rolf R Engel and John M Davis have no conflict of interest to declare.