INTRODUCTION

Understanding what given scores of the Brief Psychiatric Rating Scale (BPRS; Overall and Gorham, 1962) and the Positive and Negative Syndrome Scale (PANSS; Kay et al, 1987) mean from a clinical point of view is important for the translation of research results into clinical practice. In previous publications, we compared total scores and percentage reduction scores of both scales with simultaneous Clinical Global Impressions ratings (CGI; Guy, 1976) using so-called linking analyses (Leucht et al, 2005a, 2005b). These analyses provided some anchor points as to which BPRS/PANSS score corresponds to a CGI-severity rating (CGI-S) of, for example, ‘moderately ill’ or ‘severely ill’, and which percentage BPRS/PANSS reduction from baseline corresponds to a CGI-improvement (CGI-C) rating of, for example, ‘minimally better’ or ‘much better’. Whereas in the previous analyses we looked only at the association between the percentage BPRS/PANSS change from baseline and the CGI-improvement, we now carry out an analysis of the absolute BPRS/PANSS change. The former analysis was important for understanding the meaning of the percentage cutoffs that are frequently used to define ‘response’ in a trial (eg a reduction of at least 20 or 50% of the baseline score). The present analysis of the absolute change is important because the primary outcome in antipsychotic drug trials is nowadays usually given as the mean absolute BPRS/PANSS reduction rather than as percentage reduction. Secondly, we analyzed whether the severity of illness at baseline had an impact on the associations between the BPRS/PANSS and the CGI. Finally, the essence of science is the replication of a finding in a different study. Therefore, we replicated the previous analysis linking the BPRS with the CGI using PANSS-derived BPRS (PABPRS) scores from our PANSS database.

PATIENTS AND METHODS

The Databases

We used the same databases that were available for our original investigations of the BPRS and the PANSS (Leucht et al, 2005a, 2005b). The BPRS database (see Table 1) was composed of original patient data from seven randomized controlled studies that compared amisulpride (Möller et al, 1997; Peuskens et al, 1999; Wetzel et al, 1998; Puech et al, 1998; Colonna et al, 2000; Carrière et al, 2000) or olanzapine (Beasley et al, 1996b), with other antipsychotics and/or placebo. All of the latter studies used the original BPRS (1–7 scale); the BPRS was not derived from the PANSS.

Table 1 Patient Characteristics and Mean Baseline and End-Point Values of the Rating Scales (means±SD)

The PANSS and PABPRS database (see Table 1) was composed of seven randomized, double-blind trials that compared olanzapine (Beasley et al, 1996a, 1997; Tollefson et al, 1997; Tran et al, 1997; Lieberman et al, 2003; Breier et al, 2005) or amisulpride (Sechter et al, 2002) with other antipsychotics or placebo.

The trials included patients with schizophrenia, schizoaffective disorder, or schizophreniform disorder according to DSM-III-R or DSM-IV (American Psychiatric Association, 1987, 1994). With one exception (Carrière et al, 2000), all studies used various minimum scores as eligibility criteria to ensure that the patients had florid symptoms, and most of them also required a minimum of positive symptoms by definition. The patients in the study without scale-derived minimum scores (Carrière et al, 2000) were all in-patients and had a mean BPRS of 65 at baseline, so that highly symptomatic patients were also involved in this study. All numbers in Table 1 refer to those patients who had a PANSS and a CGI rating at baseline so that they could be included in at least one linking function. A summary of further characteristics of the studies can be found elsewhere (Leucht et al, 2005a, 2005b).

In the BPRS database, all studies provided ratings on both the CGI-severity score and the CGI-improvement score. In the PANSS database, all studies provided CGI-severity ratings, but only three (Beasley et al, 1997; Breier et al, 2005; Sechter et al, 2002) provided CGI-improvement ratings (n at baseline=1318). A description of the two CGI subscores is provided in Table 2.

Table 2 Description of the Two Clinical Global Impressions Subscores

Statistical Analysis

We again used the SAS program EQUIPERCENTILE (Price et al, 2001), an implementation of the algorithms for equipercentile linking described by Kolen and Brennan (1995), to compare the BPRS/PANSS with the CGI. For further details of the methodology used, please consult Leucht et al (2005a, 2005b).
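The principle of equipercentile linking can be illustrated with a minimal sketch. It is not the SAS program EQUIPERCENTILE and omits any smoothing; it assumes mid-percentile ranks and linear interpolation between observed scores, and the function names and toy numbers are hypothetical. A score on one scale is linked to the score on the other scale that has the same percentile rank in the same sample.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(sorted_scores, x):
    """Mid-percentile rank (0-100): share of scores below x plus half the share equal to x."""
    below = bisect_left(sorted_scores, x)
    equal = bisect_right(sorted_scores, x) - below
    return 100.0 * (below + 0.5 * equal) / len(sorted_scores)


def equipercentile_link(x_scores, y_scores, x):
    """Return the Y score whose percentile rank equals that of x on scale X.

    Simplifying assumption: linear interpolation between the observed Y scores,
    clamped to the observed Y range.
    """
    xs, ys = sorted(x_scores), sorted(y_scores)
    target = percentile_rank(xs, x)
    grid = sorted(set(ys))
    ranks = [percentile_rank(ys, y) for y in grid]  # strictly increasing
    if target <= ranks[0]:
        return float(grid[0])
    if target >= ranks[-1]:
        return float(grid[-1])
    for i in range(1, len(grid)):
        if target <= ranks[i]:
            frac = (target - ranks[i - 1]) / (ranks[i] - ranks[i - 1])
            return grid[i - 1] + frac * (grid[i] - grid[i - 1])


# Toy numbers for illustration only, not study data.
cgi_s = [3, 3, 4, 4, 4, 5, 5, 6]
bprs = [30, 34, 40, 42, 44, 52, 55, 66]
linked = {c: equipercentile_link(cgi_s, bprs, c) for c in (3, 4, 5, 6)}
```

Here `linked` maps each CGI-S value in the toy sample to the BPRS total score with the same percentile rank.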

  1. The previous analyses of the percentage BPRS/PANSS change were important to gain a better understanding of the meaning of cutoffs, frequently used to define response (Leucht et al, 2005a, 2005b). The primary efficacy outcome in antipsychotic drug trials, however, is usually given as the absolute change of the BPRS/PANSS from baseline. Here, therefore, we linked the absolute change rather than the percentage change of the BPRS/PABPRS/PANSS to the CGI-improvement score.

  2. Both the absolute and the percentage changes of the BPRS/PABPRS/PANSS were also linked to the absolute change from baseline of the CGI-severity score.

  3. We analyzed whether the associations between the absolute and percentage BPRS/PANSS change and the CGI depended on the patients’ severity of illness at baseline. In other words, does the same absolute/percentage change of the BPRS/PANSS mean the same clinical improvement according to the CGI, if the patients were initially very severely ill or less severely ill? For this purpose, we divided the patients in two severity groups based on the median of the BPRS/PANSS total score at baseline.

  4. Finally, we tried to replicate the previously found association between the BPRS and the CGI (Leucht et al, 2005a) with the PANSS-derived BPRS (PABPRS) using the PANSS database. It should be noted that there was no overlap in terms of studies in the BPRS and the PANSS database, so this was a completely independent replication.
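The quantities entering these analyses can be written down in a short sketch. The function names are hypothetical, and the `floor` parameter (the minimum possible total score, for example 18 for an 18-item scale rated 1–7) is an assumption, since this section does not state whether a minimum score was subtracted before computing percentages. The split puts patients at or below the baseline median in the less severely ill group.

```python
from statistics import median


def absolute_change(baseline, followup):
    """Absolute reduction from baseline; positive values mean improvement."""
    return baseline - followup


def percentage_change(baseline, followup, floor=0):
    """Percentage reduction from baseline.

    `floor` is an assumed parameter: the lowest possible total score of the scale.
    With floor=0 the raw baseline score is used as the denominator.
    """
    if baseline <= floor:
        raise ValueError("baseline must exceed the scale minimum")
    return 100.0 * (baseline - followup) / (baseline - floor)


def split_by_baseline_median(baselines):
    """Return the median and the indices of the less (<= median) and more (> median) severely ill."""
    cut = median(baselines)
    less = [i for i, b in enumerate(baselines) if b <= cut]
    more = [i for i, b in enumerate(baselines) if b > cut]
    return cut, less, more


# Toy baseline totals for illustration only, not study data.
cut, less_ill, more_ill = split_by_baseline_median([50, 58, 58, 70, 45])
```

With a nonzero `floor`, the same absolute change corresponds to a larger percentage change, so the choice matters when percentage cutoffs are compared across studies.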

In the BPRS database only evaluations at baseline, weeks 1, 2, and 4 were analyzed, and in the PANSS database only evaluations at baseline, weeks 1, 2, 4, and 6 were analyzed. Although the duration of the studies ranged from 4 to 51 weeks, not all studies provided data for other time points, so that trial effects could have biased the results at those time points. For each linking task, we utilized all patients with valid values on both measures, as analyzing only study completers would have implied a selection. In sensitivity analyses, we included only those patients who were still in the studies at weeks 4 and 6, respectively, so that a rating was available at each time point. With the exception of considerable variation at the extremes (especially CGI-C scores reflecting worsening), the results were similar, so only the results of the primary analysis are shown. The results of these sensitivity analyses can be found on the Journal's website (Supplementary Information; these figures are numbered in the same way as in the publication, but in roman rather than arabic numerals).

RESULTS

Correlation between the CGI and the BPRS/PANSS/PABPRS

Table 3 summarizes the Spearman correlation coefficients obtained for the association between various CGI and BPRS/PANSS/PABPRS scores for the whole sample at baseline and at weeks 1–6. They ranged between 0.52 and 0.74, reflecting moderate to strong associations between scores.

Table 3 Correlations between BPRS/PANSS/PABPRS and CGI
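For readers unfamiliar with the coefficient, the following minimal sketch shows how a Spearman correlation can be computed: both variables are converted to ranks (tied values share their average rank, which matters for a 7-point scale such as the CGI), and the Pearson correlation of the ranks is taken. The function names are hypothetical and the sketch is not the software used for Table 3.

```python
from math import sqrt


def average_ranks(values):
    """Ranks starting at 1; tied values receive the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)
```

The function is undefined when one variable is constant, because the denominator is then zero.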

Replication of the Linking Functions ‘CGI-Severity Score vs BPRS Total Score’ and ‘CGI-Improvement Score and Percentage BPRS Reduction’ Using the PABPRS

Figures 1 and 2 show the replication of the linking functions CGI-severity score vs BPRS total score and CGI-improvement score and percentage BPRS reduction using the PABPRS. Although based on a completely different set of studies, the results were similar to those previously reported for the original BPRS (Leucht et al, 2005a). Thus, being considered ‘mildly ill’ on the CGI (CGI-S=3) corresponded to BPRS total scores between 32 and 36 over the weeks (original BPRS in Leucht et al, 2005a: 30–32), ‘moderately ill’ (CGI-S=4) to BPRS total scores of 40–45 (original BPRS: 40–44), ‘markedly ill’ (CGI-S=5) to a BPRS of 52–55 (original BPRS: 52–55), ‘severely ill’ (CGI-S=6) to a BPRS of 64–67 (original BPRS: 65–70), and ‘among the most extremely ill’ (CGI-S=7) corresponded to a BPRS of 83–85 (original BPRS: 84–88).

Figure 1

Linking of PABPRS total score with CGI-severity score.

Figure 2

Linking of PABPRS percentage reduction with CGI-improvement score.

On linking the CGI-C with the percentage PABPRS reduction, the results again proved similar to those in our study on the original BPRS (Leucht et al, 2005a). Thus, ratings of ‘minimally improved’ (CGI-C=3) corresponded to percentage BPRS reductions of 22–34% (original BPRS: 23–30%), ratings of ‘much improved’ (CGI-C=2) corresponded to percentage BPRS reductions of 46–59% (original BPRS: 44–58%), and ‘very much improved’ (CGI-C=1) to percentage BPRS reductions of 77–86% (original BPRS: 71–85%). As in the previous publication, there was a time effect, with larger percentage PABPRS reductions needed at later weeks to link with the same CGI-C score. Expectation effects are a likely reason for these time effects (see Leucht et al, 2005a, 2005b).

Linking of the CGI-Improvement Score to the Absolute Change of the BPRS/PANSS/PABPRS from Baseline

Figure 3a–c shows the linking functions between the CGI-C and the absolute change of the BPRS, the PANSS, and the PABPRS. Please note that owing to space limitations Figure 3c and all subsequent figures concerning the PABPRS marked with a ‘c’ are presented on the journal's website (Supplementary Information). They suggest that a reduction of the BPRS/PABPRS by approximately 10 points (range over the weeks BPRS 8–11, PABPRS 8–12) and of the PANSS by approximately 15 points (range over the weeks 12–18) corresponds to a CGI-C of ‘minimally improved’. A reduction of the BPRS/PABPRS by approximately 20 points (range BPRS 19–22, PABPRS 19–23) and of the PANSS by 33 points (range 30–36) corresponded to ‘much improved’ according to the CGI-C.

Figure 3

(a) Linking of BPRS absolute change with CGI-improvement score. (b) Linking of PANSS absolute change with CGI-improvement score.

Linking of the Absolute Change of the CGI-Severity Score to the Absolute and the Percentage Change of the BPRS/PABPRS/PANSS from Baseline

Figure 4a–c shows the linking functions between the absolute change of the CGI-severity from baseline and the absolute change of the BPRS/PABPRS/PANSS. They suggest that a reduction of the BPRS/PABPRS by approximately 10 points reflects a reduction of the CGI-severity score by one severity step. The same number for the PANSS was 15 points. (Please note that the linking lines do not pass through the origin of the plot in Figure 4a–c, thus the correct numbers are 10/15 points rather than 15/20!)

Figure 4

(a) Linking of BPRS absolute change with CGI-severity score absolute change. (b) Linking of PANSS absolute change with CGI-severity score absolute change.

Figure 5a–c shows the linking functions between the absolute change of the CGI-severity from baseline and the percentage change of the BPRS/PABPRS/PANSS. Please consider again that the linking functions do not pass through the origin of the plot when interpreting these figures. Thus, they suggest that a percentage reduction of the BPRS/PABPRS by approximately 28 percentage points (range BPRS 26–30, PABPRS 27–30) reflects a reduction of the CGI-severity score by one severity step. The same number for the PANSS was 25 percentage points (range 24–28).

Figure 5

(a) Linking of BPRS percentage change with CGI-severity score absolute change. (b) Linking of PANSS percentage change with CGI-severity score absolute change.

Linking Analyses Depending on the Initial Severity of Illness

In Figures 6a–c, 7a–c, 8a–c and 9a–c we analyzed whether the results of the functions linking improvement according to the BPRS/PANSS with the CGI-C and CGI-S change depend on the patients' severity of illness at baseline. For this purpose, the patients were divided into two groups according to their median BPRS/PANSS score at baseline, and the results of both severity groups are shown together in the figures. The figures show a clear impact of baseline severity on the results. In all figures, in less severely ill patients (≤median of the BPRS/PANSS at baseline), a smaller change of the absolute BPRS/PANSS was associated with a certain degree of CGI-improvement than in the more severely ill patients (>median of the BPRS/PANSS at baseline). This effect was attenuated and inconsistent when the percentage change of the BPRS/PANSS was considered instead of the absolute change.

Figure 6

(a) Linking of BPRS percentage change with CGI-improvement score after splitting the patients in two groups according to the median of the BPRS at baseline (median=58). (b) Linking of PANSS percentage change with CGI-improvement score after splitting the patients in two groups according to the median of the PANSS at baseline (median=97).

Figure 7

(a) Linking of BPRS absolute change with CGI-improvement score after splitting the patients in two groups according to the median of the BPRS at baseline (median=58). (b) Linking of PANSS absolute change with CGI-improvement score after splitting the patients in two groups according to the median of the PANSS at baseline (median=97).

Figure 8

(a) Linking of BPRS absolute change with CGI-severity change after splitting the patients in two groups according to the median of the BPRS at baseline (median=58). (b) Linking of PANSS absolute change with CGI-severity change after splitting the patients in two groups according to the median of the PANSS at baseline (median=92).

Figure 9

(a) Linking of BPRS percentage change with CGI-severity change after splitting the patients in two groups according to the median of the BPRS at baseline (median=58). (b) Linking of PANSS percentage change with CGI-severity change after splitting the patients in two groups according to the median of the PANSS at baseline (median=92).

DISCUSSION

In contrast to two previous reports (Leucht et al, 2005a, 2005b), we here linked the absolute change rather than the percentage change of the BPRS/PANSS to the CGI-improvement score, and to the absolute change of the CGI-severity score. As a by-product we also repeated our previous BPRS analysis with a completely different data set based on the PABPRS, because replication is the essence of science. The association between the BPRS and the CGI can now be considered to be a robust finding, at least for similar antipsychotic drug trials in schizophrenic patients with acute exacerbations.

The new analysis can, first, help us to understand what a given absolute change of the BPRS/PANSS in a clinical trial means. For example, a 10/15-point reduction of the BPRS/PANSS corresponds roughly to minimal improvement according to clinical judgment. Similarly, roughly the same reduction of the BPRS/PANSS is necessary for a patient's severity of illness to decrease by one severity step according to the CGI-S.

Going one step further, these numbers may also help to understand what the efficacy differences between antipsychotic drugs mean. For example, in their meta-analysis Davis et al (2003) found an average superiority of amisulpride, olanzapine, and risperidone compared to conventional antipsychotics of approximately 3–4 BPRS and 4–6 PANSS points. As our analysis showed that approximately a 10/15-point difference of the BPRS/PANSS means minimal improvement according to the CGI, the differences found in Davis's meta-analysis fall below this threshold. It is therefore not surprising that many clinicians report that in clinical routine they do not ‘see’ an efficacy superiority of these new generation antipsychotics. The effect size for the difference between clozapine and conventional antipsychotics was, however, twice as large as that found for the former new generation antipsychotics, and hence clinicians may recognize its higher efficacy compared to conventional antipsychotics (Davis et al, 2003). We wish to stress, however, that given the severity of the disorder and its consequences for afflicted individuals, even small differences in efficacy can be important from a public health point of view. Another example is the recent rejection of a widely held belief that there is a delay of the onset of antipsychotic drug action. In a meta-analysis by Agid et al (2003), the BPRS/PANSS was reduced by 13.5% in the first week, and in the study by Leucht et al (2005c) (who could not subtract a placebo effect), by 19%. Both values fall below the threshold of minimal improvement according to the CGI (25% BPRS/PANSS reduction). This may in part explain why on average clinicians do not notice an antipsychotic effect during the first week of treatment. Finally, the results may also be useful for sample size estimates in the planning of antipsychotic drug trials. Such sample size estimates are often based on what a clinically meaningful difference between two interventions would be. It could for example be argued that a 10/15-point difference in BPRS/PANSS scores means minimal clinical improvement, so that a sample size estimate can be based on these numbers.
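As an illustration of such an estimate, the sketch below applies the standard normal-approximation formula for comparing two means, n per group = 2(z₁₋α/₂ + z₁₋β)²σ²/δ², with δ set to a difference regarded as clinically meaningful. The function name, the significance level, the power, and in particular the standard deviation of 20 PANSS points are assumptions chosen for the example; none of them is taken from the present data.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Patients per arm for a two-sided, two-sample comparison of means.

    Normal approximation with equal group sizes and a common standard deviation.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_beta = z.inv_cdf(power)           # quantile corresponding to the desired power
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


# Hypothetical planning values: a 15-point PANSS difference and an assumed SD of 20 points.
n_panss = n_per_group(delta=15, sd=20)
```

Because the required sample size grows with the square of σ/δ, halving the difference to be detected roughly quadruples the number of patients needed per arm.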

The generalizability of the latter findings is, however, limited by effects determined by the severity of the patients' illness at baseline. Splitting the patients in two groups by the BPRS/PANSS at baseline showed that for the less severely ill patients less ‘objective’ reduction of symptoms as measured by the BPRS/PANSS was linked to the same CGI-improvement rating as for the more severely ill patients. For example, a 5-point reduction of the BPRS in a patient who was initially only moderately ill seems to be subjectively perceived as a greater improvement according to the CGI than the same reduction for an initially severely ill patient. One explanation of this finding could be that mildly ill patients do not have many symptoms per se that can improve, so that a small ‘objective’ (BPRS/PANSS) change is ‘subjectively’ (CGI) considered to be an important change. This phenomenon was much less pronounced and inconsistent in the analysis of the percentage change of these scales than in the analysis of the absolute BPRS/PANSS change. It seems, therefore, that clinicians rating the CGI think in ‘relative’ (percentage) terms rather than in ‘absolute’ terms. As the use of the percentage change of the BPRS/PANSS appears to partially level out severity-at-baseline effects, analyzing these scales on a percentage basis rather than on an absolute basis could be considered. It must be emphasized that this severity-at-baseline effect does not challenge our original finding that a 50% reduction of the BPRS/PANSS reflects ‘much improvement’ according to the CGI-C better than the 20% cutoff that has frequently been used (eg Marder and Meibach, 1994; Peuskens, 1995), and is therefore a clinically more meaningful response cutoff in acutely ill, non-refractory patients (Leucht et al, 2005a, 2005b).

Finally, although it is reassuring that it was possible to replicate our previous findings on the BPRS using a different database, all the limitations that have been extensively discussed in our previous reports also apply to the current analyses (Leucht et al, 2005a, 2005b). The most important ones are that the linking functions are no more than a kind of anchoring of the BPRS/PANSS, that only limited information on the psychometric properties of the CGI is available, that ceiling effects are possible, that the ratings have not been made independently, that time effects probably reflecting clinicians' expectations occur, that information on inter-rater reliability is not available, and that the results may not be generalizable to different patient populations (for example, patients with predominantly negative symptoms). Nevertheless, we hope that our results will contribute to a better understanding of clinical trial results.