INTRODUCTION

Despite several decades of research on the mechanisms of action of antidepressants (ADs), the specific therapeutic actions that bring about recovery in depressed patients remain unresolved. In addition, the time of onset of drug-induced clinical actions, the types of behavioral change that precede recovery in treatment-responsive patients, and whether different types of ADs produce similar patterns of early behavioral changes in treatment-responsive patients are not well characterized. We previously reported that the onset of improvement in several symptom domains in hospitalized patients with major depressive disorder who were treated with tricyclic antidepressants (TCAs) occurred within the first 10 days of treatment (Katz et al, 1987). This finding was at variance with the then current view that despite the almost immediate impact of ADs on the functioning of central monoamine transmitter systems, the clinical actions of the drugs ‘lag’, that is, were not thought to occur until after 2–3 weeks of treatment (see Gelenberg and Chesen, 2000). These early clinical effects of TCAs, found in patients who would eventually respond to the drugs, were also contrary to the results of much of the research conducted during the 1980s (eg Quitkin et al, 1984). In our earlier study (Katz et al, 1987), the clinical effects detected within the first 2 weeks were large and were predictive of eventual positive clinical outcome. Findings from a number of other studies (Small et al, 1981; Coryell et al, 1982; Khan et al, 1989; Nagayama et al, 1991; Stassen et al, 1993; Boyer and Feighner, 1994) have also confirmed the positive role of early clinical actions in predicting treatment outcome. The capacity to predict outcome early in treatment would not only shorten the length of a treatment trial but should also decrease morbidity and the risk of suicide from prolonged depressive symptoms (Kiloh et al, 1988).

Our earlier study did not include a placebo control group. Consequently, questions remained about whether these early clinical effects in treatment responders were due to drug or to a placebo response. The lack of a placebo control group in several of the other prediction studies, for example, Nagayama et al (1991), also brought into question the etiology of these reported early clinical actions. Recent reviews of relevant research on the issue of the onset of action of AD drugs (Stassen et al, 1993; Katz et al, 1996/1997; Gelenberg and Chesen, 2000) concluded that none of these studies, including ours, had fully met the requirements of a definitive study on response onset. Thus, the question as to whether ADs do or do not induce specific predictive behavioral actions within the first 2 weeks of drug treatment is still unresolved. Here, we report a randomized, parallel group, placebo-controlled study designed to address this question. The study had two primary aims. The first was to determine, through intensive behavioral examination during the first 2 weeks, the time course and onset of action of the selective serotonin reuptake inhibitor (SSRI), paroxetine, and the selective norepinephrine reuptake inhibitor (NRI), desipramine (DMI). The second was to compare early behavioral changes elicited by the two types of ADs with those associated with placebo treatment.

In contrast to our earlier study, which used the TCAs imipramine and amitriptyline, two drugs with similar pharmacologic effects, the current study evaluated paroxetine and DMI, two ADs with distinct initial pharmacologic profiles (Frazer, 1997). The study focuses on the behavioral impact of the different drug types during the first 2 weeks of treatment and applies advanced statistical methods to determine rate of improvement, timing, and course of early clinical actions. This study is not a clinical trial aimed at evaluating the efficacy of the two active drugs. Rather, the drugs were selected because they have been shown in numerous studies to be efficacious and because they are representatives of the SSRI and selective NRI classes of ADs (see Frazer, 1997). The design allowed us to test hypotheses based on earlier findings (Katz et al, 1994) that early behavioral change in response to the SSRI, paroxetine, would involve anxiety, whereas early change in response to the selective NRI, DMI, would involve motor activity, and that such early improvement would be predictive of a positive outcome at the end of treatment.

METHODS AND PROCEDURES

Research Design

The design entailed randomized, parallel group, double-blind assignment of patients to DMI, paroxetine, or placebo for a period of 6 weeks.

Patient Sample

Patients with a diagnosis of primary major depression, unipolar type, single or recurrent episode, were identified from newly admitted inpatients. All subjects provided written informed consent and the study was conducted as approved by the University of Texas Health Science Center at San Antonio's Institutional Review Board (IRB) and the Dallas VA Medical Center's IRB. Diagnostic interviews were conducted using the Structured Clinical Interview for DSM-IIIR (SCID) (Spitzer and Williams, 1983) and medical chart review. Diagnoses were confirmed at weekly research conferences. The exclusion criteria included a history of neurologic disease, recent head trauma, current psychotic symptoms, active substance dependence, unstable medical problems, suicidal thoughts with intent, or previous electroconvulsive therapy. Patients were required to score at least 18 on the Hamilton Depression Rating Scale (Ham-D, 21-item version) (Hamilton, 1960).

A total of 82 patients were initially enrolled and randomly assigned to treatment. Of these, 12 patients dropped out prior to receiving the minimum 3 weeks of treatment to qualify for the evaluable sample. The reasons for their dropping out are indicated below. A total of 70 patients completed the requirements of the protocol. The patients, 58 male and 12 female, averaged 46 years, with an age range between 20 and 69 years. All were diagnosed with major unipolar depressive disorder and most patients were moderately (56%) to severely (38%) ill (Clinical Global Impressions (CGI)) (Guy, 1976). In all, 78% had had at least one prior episode, with 18 months as the average length of the current episode. The average patient score at baseline on the Ham-D for the sample was 23.5±4.8 (mean±SD) and these scores were essentially equivalent for the patients at the two sites, being 23.8±6.4 in San Antonio and 22.2±3.7 in Dallas.

Baseline Period

Study subjects were treated on an in-patient research unit for 6 weeks at two Veterans Administration Hospitals. All patients were maintained on placebo during an initial 1-week drug ‘washout’ period prior to beginning a 6-week treatment period. Use of a placebo run-in period in this study was primarily carried out to eliminate pre-existing drug that could confound identification of early behavioral changes related to the study medications. Clinical response at the end of the pretreatment period was measured using the Ham-D and CGI (Guy, 1976) scales. Regardless of the initial level of severity, a patient was not included in the treatment study if by day 7 the patient no longer scored at least 18 on the Ham-D. During this baseline period, 20 patients were excluded from the randomization stage primarily because they changed their mind about participating in the study or did not meet the inclusion criteria.

Treatment Period

After the 7-day pretreatment period, patients were randomly assigned to either DMI, paroxetine, or placebo. The dosage of DMI was started at 50 mg and was raised as necessary to a maximum of 350 mg/day to reach a blood level of 125 ng/ml by 7 days. Steady-state concentrations of DMI were reached by day 13 for 80% of the patients. Of the initial patient sample entered into the study, 29 were assigned to treatment with DMI. Of these, three dropped out within 2 weeks due to side effects. In all, 28 patients were assigned to paroxetine. Of these, four did not complete at least 3 weeks of the protocol, two because of personal choice and two because of protocol violation. The dosage for paroxetine ranged from 20 to 60 mg daily. The dosage was adjusted to achieve a steady-state serum concentration of drug of at least 10 ng/ml. Although there is no current evidence of a relationship between blood level and clinical response to paroxetine, measurement of levels assured adequate absorption and metabolism of paroxetine. Steady-state concentrations of paroxetine were reached by day 6 for 94% of these patients. A total of 25 patients were assigned to placebo, and five did not complete 3 weeks of treatment; for four, this was due to personal choice (changed mind because duration of hospitalization was too long) and one had a protocol violation. Thus, of the 102 patients enrolled in the study originally, 32 did not complete the study, 20 of whom dropped out during the initial 1-week placebo period. Drug plasma levels were monitored daily during the first week of the study, twice weekly for the next 2 weeks, then once a week until the end of the study. Thus, we were able to assure equivalent treatment in the active medication groups.

As the focus of the study was on early behavioral change, it was decided a priori that patients who did not complete at least 3 weeks of treatment would not generate useful data for the analyses. As just indicated, 12 patients were, therefore, not included in the analyses. However, of the 70 patients included in the data analyses, five completed 5 weeks of treatment, one completed 4 weeks, and three completed 3 weeks. For such patients, their last behavioral scores were considered to be their final score and they were assigned a behavioral outcome based on these scores.

Measurement of Behavioral Components and Severity of the Depressed State

To address the questions and hypotheses raised, measures of specific behavioral components as well as the severity of the disorder are required. In this multivantaged approach (Katz et al, 1984), doctor ratings included the Ham-D, CGI, Schedule for Affective Disorders and Schizophrenia-Change Version (SADS-C, Endicott and Spitzer, 1978), Global Assessment Scale (GAS, Endicott et al, 1976), and Video Interview Behavioral Evaluation Scales (VIBES) (Katz and Itil, 1974); nurse ratings included the Affective Disorder Rating Scale (ADRS) (Murphy et al, 1980), NIMH Mood Scale (Raskin et al, 1969), and Global Ward Behavior Scale (Raskin et al, 1969); patient self-reports included the Symptoms Checklist (SCL-90; Derogatis et al, 1974), NIMH Mood Scale (Raskin et al, 1969), and the Buss–Durkee Hostility Inventory (Buss and Durkee, 1957). Other measures of the components of depression, for example, psychomotor performance, derived from our earlier work, are described below. These have been described previously (Katz et al, 1982, 1984, 1989).

Behavioral components (state constructs)

Following administration of the rating scales at baseline to a broad sample of patients with affective disorders and normal subjects (Katz et al, 1984), psychometric analyses of scores obtained from the various instruments led to a set of 11 state constructs that measure the major components of depression. These studies included tests of discriminant and construct validity (Katz et al, 1984, 1987). The 11 include disturbed emotions and abnormalities of motor movement, cognition, social behavior, and somatic functioning, specifically, (1) depressed mood, (2) anxiety, (3) retardation of movement and speech, (4) agitation, (5) hostility, (6) somatization, (7) distressed physical, primarily facial, expression, (8) interpersonal sensitivity, (9) positive adaptation that describes assertive, open social behavior in the interview, (10) cognitive impairment, which includes both thinking and concentration problems, and (11) sleep disorder. It is these constructs that are used for statistical analyses and not scores directly obtained on the various behavioral instruments. Based on the hypotheses being tested, construct numbers 1–4 and number 7 were used in the primary statistical analyses.

Severity dimensions

In a principal components analysis of the intercorrelations of the 11 state constructs, three orthogonal dimensions were identified, accounting for approximately 75% of the variance (Katz et al, 1982, 1984). The dimensions, which consist of combined constructs, represent three independent components of severity of the disorder and help to explain the major areas of functioning measured: (I) Depressed mood–retardation of movement and speech (DM-MR dimension). (II) Anxiety–agitation–somatization–sleep disorder, apparently reflecting the many aspects of anxiety, psychological and physical. (III) Hostility–interpersonal sensitivity. Based on the hypotheses being tested, the severity dimension, DM-MR, was used in the primary statistical analyses.

Categorical outcome (multicomponent)

The procedure applied in this study for identifying ‘therapeutic actions’ required classifying patients as responders (essentially recovered) or nonresponders (essentially unresponsive to treatment) after 6 weeks of treatment. It is based on the measurement of two major dimensions of overall outcome: general psychopathology and severity of the depressed state. Patients in the recovered group have to achieve an illness level of no greater than ‘mild’ severity, demonstrate markedly reduced severity scores, and be reported as having a level of functioning on the GAS that would indicate no further evidence of psychiatric disorder (GAS>60). Nonrecovered patients show minimal to no decrease on the severity of illness scale, that is, are still judged to be moderately ill or worse, show a zero to minimal increase on the global improvement scale, continue to have a Ham-D score above 16, and show a level of functioning on the GAS that reflects the continued presence of severe symptomatology or impairment. The specific operational criteria can be found in Appendix 1 of Katz et al (1984). Even after 6 weeks of treatment, it is not possible to classify some patients as responders or nonresponders. Previously, we classified such partially responsive patients as ‘indeterminate’. In this study, which focuses on the identification of clear responders, we have treated the nonresponders and indeterminates as a single group for analysis, subsequently to be called ‘nonresponders’.

A second categorical outcome based on the Hamilton Scale, where ‘marked response’ is defined as a 50% decrease in the total (21 item) score, has also been defined.
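The Hamilton-based criterion reduces to a short calculation. The following is a minimal sketch only; the function names and the example scores are hypothetical and are not taken from the study data. The comparison uses "at least 50%", in line with the wording used in the statistical methods.

```python
def percent_decrease(baseline: float, final: float) -> float:
    """Percent decrease from baseline; positive values mean improvement."""
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    return 100.0 * (baseline - final) / baseline


def is_marked_response(baseline: float, final: float, cutoff: float = 50.0) -> bool:
    """True when the total score fell by at least `cutoff` percent."""
    return percent_decrease(baseline, final) >= cutoff


# Hypothetical Ham-D totals (illustrative values, not study data)
print(is_marked_response(24, 11))  # True: a fall of about 54%
print(is_marked_response(24, 13))  # False: a fall of about 46%
```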

Frequency of Measurement

The behavioral measurements were administered twice weekly during the first 3 weeks of treatment and weekly from weeks 4 to 6. The patients were interviewed in each assessment period utilizing the semistructured question format (20–25 min) that accompanies the SADS-C method (Endicott and Spitzer, 1978). On the basis of that interview, the clinician completes the SADS-C ratings, the HAM-D (it is incorporated into the SADS-C interview), and the brief VIBES (25 mood and behavior items). Since the SADS-C and HAM-D are rated during the course of the interview, the entire interview-rating procedure takes about 30 min. The nurse-rating procedures on the ward, that is, the ADRS, Mood Scale, and GWBS, require 30–45 min. The patient completes the full NIMH Mood Scale and SCL-90 and the psychomotor performance tests at baseline and at 42 days (outcome). These procedures were shortened at the interim assessment points, days 7, 10, 13, 17, 21, 28, and 35, through the use of brief forms of the SCL (60 items) and the Mood Scale (33 items). The assessment procedure at baseline took about 60 min. Owing to the relative brevity of the interim assessments (30–35 min), the patients did not find the tests to be laborious or fatiguing.

Biochemical Assays

Plasma concentrations of drugs were measured by high-performance liquid chromatography exactly as described in Javors et al (2000).

Statistical Methods

Descriptive statistics, such as means and standard deviations, medians, or frequencies were produced for all variables. Unless stated otherwise, the following analyses were run separately for each of the state constructs, severity dimensions, and the Hamilton total score.

Analysis of treatment efficacy

χ2 analysis and Fisher's exact test were used to compare the treatment groups on two binary outcome measures, the categorical outcome and the Hamilton Scale outcome of at least a 50% reduction in the total score. Analysis of covariance (ANCOVA) on the final value of each measure (day 42), with treatment as the independent group and covarying for the baseline value, was used to assess the absolute difference between drug groups after 6 weeks of treatment. Post-test contrasts were used for pairwise comparisons. Survival analysis was used to determine if the overall course of change over the 6-week period was different between different drug groups. For these models, the outcome was a 20% or better improvement in the measure that was sustained for the remainder of the study (Stassen et al, 1993).
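For readers unfamiliar with Fisher's exact test, the two-sided p-value for a 2×2 table of treatment group by response can be sketched as below. This is an illustration only, not the software used in the study; the function name is hypothetical and the counts in the usage line are illustrative rather than study data. The two-sided rule assumed here sums the probabilities of all tables that are no more probable than the observed one.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are treatment groups; columns are responder / nonresponder counts.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x responders in row 1, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small tolerance so that ties with the observed table are counted
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Illustrative counts: 3 of 4 responders in one group, 1 of 4 in the other
print(fisher_exact_two_sided(3, 1, 1, 3))
```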

Analysis of early behavioral actions and onset comparing treatment results in all subjects

  1. To determine if the two drugs and placebo differed in terms of specific behavioral changes at days 7, 10, and 13 of treatment, ANCOVA using the baseline value as a covariate as described above was run for each of these days.

  2. To determine if one drug acts faster than the other and/or placebo in reducing the overall severity or specific symptomatic behaviors during this 2-week treatment period, the linear mixed model (Laird and Ware, 1982) was used to model the longitudinal change in each construct. The mixed model is similar to ordinary least-squares regression, but has the advantage of using all the data. Treatment was included in the model and the interaction between treatment and time effect (slopes test) was tested for significance. This model measures the differences in the rate of change or reduction of a given variable among treatments during the first 2 weeks.

  3. The ‘median time of onset’ was defined as the earliest time point at which 50% of patients changed a minimum of 20% on a given behavioral construct, a change that was then sustained throughout the course of treatment. The treatment groups were compared on the construct measures of behavior and the dimensional measures of severity of the disorder.
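The definition of ‘median time of onset’ given above can be illustrated with a simple counting sketch. It is a simplification under stated assumptions: every patient is assumed to be assessed at every listed day, no censoring is handled (the study itself used survival analysis), and the function names, data layout, and scores are hypothetical.

```python
def sustained_onset_day(days, scores, baseline, min_improvement=0.20):
    """Earliest day from which improvement of at least `min_improvement`
    (as a fraction of baseline) holds at every later assessment.

    Returns None if improvement is not present at the final assessment.
    """
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    improved = [(baseline - s) / baseline >= min_improvement for s in scores]
    onset = None
    # Walk backwards: onset is the start of the final unbroken improved run
    for day, ok in zip(reversed(days), reversed(improved)):
        if not ok:
            break
        onset = day
    return onset


def median_time_of_onset(days, patients, min_improvement=0.20):
    """Earliest assessment day by which at least 50% of patients have
    reached sustained onset; None if that proportion is never reached."""
    onsets = [
        sustained_onset_day(days, p["scores"], p["baseline"], min_improvement)
        for p in patients
    ]
    for day in days:
        reached = sum(1 for o in onsets if o is not None and o <= day)
        if 2 * reached >= len(patients):
            return day
    return None


# Hypothetical construct scores at days 7, 10, and 13 (not study data)
days = [7, 10, 13]
patients = [
    {"baseline": 10, "scores": [7, 9, 7]},     # improvement lapses at day 10
    {"baseline": 10, "scores": [7, 6, 6]},     # sustained from day 7
    {"baseline": 20, "scores": [19, 15, 14]},  # sustained from day 10
    {"baseline": 10, "scores": [10, 10, 9]},   # never reaches 20%
]
print(median_time_of_onset(days, patients))  # 10
```

Walking backwards through the assessments makes the ‘sustained’ requirement explicit: an early improvement that later lapses does not count as onset.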

Analysis of onset of ‘therapeutic’ action within each treatment group

Using the categorical outcome described above, clear responders were compared with nonresponders. These comparisons were conducted within each of the treatment groups separately and were designed to identify the initial behavioral actions that precede positive therapeutic action for each drug. To identify the times of onset of improvement in those behavioral indices that distinguish treatment responders, ANCOVA, covarying for the baseline value with response as the group effect was used. Effect size statistics (Cohen, 1969) were calculated to ascertain the degree to which the improvements that resulted were clinically recognizable.
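As an illustration of the effect size statistic, Cohen's d for two independent groups can be computed as sketched below. This version uses the pooled standard deviation of unadjusted change scores, whereas the analyses described above covaried for baseline. The function names and example values are hypothetical; the 0.2, 0.5, and 0.8 cut-offs are Cohen's conventional benchmarks for small, medium, and large effects.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled SD."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    if pooled_var == 0:
        raise ValueError("pooled variance is zero")
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def effect_size_label(d):
    """Conventional verbal label for the magnitude of d."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"


# Hypothetical change scores for responders and nonresponders (not study data)
d = cohens_d([2, 4, 6], [1, 3, 5])
print(d, effect_size_label(d))  # 0.5 medium
```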

Association of early change with outcome at 6 weeks

To determine whether early drug-induced changes in specific behaviors are correlated with the amount of improvement on a primary outcome measure, namely the Ham-D Total Score, percent improvement in a specific behavior at weeks 1 and/or 2 was correlated with % improvement at the end of 6 weeks on the Ham-D.
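The correlation described here can be sketched in two steps: convert each patient's scores to percent improvement from baseline, then compute the product moment correlation between early improvement on a behavior and 6-week improvement on the Ham-D. The function names and values below are hypothetical and are shown only to make the calculation concrete.

```python
from math import sqrt


def percent_improvement(baseline, later):
    """Percent improvement from baseline (positive means a lower score)."""
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    return 100.0 * (baseline - later) / baseline


def pearson_r(x, y):
    """Product moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences of length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical (baseline, later) pairs for four patients (not study data)
week1_behavior = [percent_improvement(b, s) for b, s in [(10, 6), (12, 11), (8, 5), (9, 9)]]
week6_hamd = [percent_improvement(b, s) for b, s in [(24, 9), (22, 20), (26, 10), (21, 19)]]
print(round(pearson_r(week1_behavior, week6_hamd), 2))
```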

Prediction of outcome

Logistic regression (Hosmer and Lemeshow, 1989) was used to develop an algorithm for estimating the probability that a patient would recover by 6 weeks of treatment based on values on the behavioral constructs after 1 or 2 weeks of treatment. Different models of individual prediction were tested for each drug independently. We used the results of the earlier analyses to select variables for inclusion in the models, which could include just a single construct or a linear combination of several constructs as independent variables. We did not test models including variables that did not discriminate between recovered and nonrecovered subjects at any of the early time points. For each model, a classification table was developed that assumed different values of the probability of recovery as a threshold to predict that a subject would recover. The predicted outcomes were then crosstabulated with the actual outcomes to determine the ‘sensitivity’ (percentage of true positives) and ‘specificity’ (percent of true negatives), at each proposed threshold. The model and threshold that provided the best combination of sensitivity and specificity was then selected as the prediction model for recovery.
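The classification-table step can be sketched as follows. The sketch makes several assumptions: a logistic model has already been fitted, so only its predicted probabilities are used; the ‘best combination’ of sensitivity and specificity is taken to be the largest sum of the two, a rule the text does not specify; and all names, coefficients, and values are hypothetical.

```python
from math import exp


def recovery_probability(x, intercept, slope):
    """Logistic model: probability of recovery given one predictor value.
    The coefficients are placeholders, not fitted values from the study."""
    return 1.0 / (1.0 + exp(-(intercept + slope * x)))


def sensitivity_specificity(probs, recovered, threshold):
    """Classify as 'will recover' when prob >= threshold, then compare
    the predictions with the actual outcomes."""
    tp = fn = tn = fp = 0
    for p, actual in zip(probs, recovered):
        predicted = p >= threshold
        if actual and predicted:
            tp += 1
        elif actual:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one recovered and one nonrecovered patient")
    return tp / (tp + fn), tn / (tn + fp)


def best_threshold(probs, recovered, thresholds):
    """Threshold with the largest sensitivity + specificity (assumed rule)."""
    return max(
        thresholds,
        key=lambda t: sum(sensitivity_specificity(probs, recovered, t)),
    )


# Hypothetical predicted probabilities and actual outcomes (not study data)
probs = [0.9, 0.8, 0.4, 0.3, 0.2]
recovered = [True, True, True, False, False]
for t in (0.25, 0.35, 0.5):
    print(t, sensitivity_specificity(probs, recovered, t))
print(best_threshold(probs, recovered, [0.25, 0.35, 0.5]))  # 0.35
```

Other selection rules, such as requiring a minimum sensitivity before maximizing specificity, would fit the same classification table and may lead to a different threshold.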

RESULTS

Response to Treatment with DMI, Paroxetine, and Placebo at 6 Weeks

DMI was significantly superior to placebo on the 50% improvement Ham-D rating scale criterion, with 62% of patients treated with DMI meeting this criterion compared with 30% of placebo-treated patients (p<0.05). The percentage of patients treated with paroxetine who had this level of improvement (46%) was not significantly different from the proportions for either DMI or placebo. Essentially identical proportions of responders were found using this criterion when applying the HAM-17 scoring rather than the HAM-21, with the same statistical significance (data not shown). Similar trends were seen with the multicomponent categorical index of response with 62, 46, and 45% of patients treated, respectively, with DMI, paroxetine, or placebo categorized as responders. However, the difference in response between DMI and placebo on this index was not significant because of the higher percentage of patients responding to placebo.

Early Behavioral Actions of DMI, Paroxetine, and Placebo

At 1 week of treatment, DMI significantly improved depressed mood and motor retardation when compared with placebo and when these factors were combined to form the DM-MR-dimensional measure (Table 1, Figure 1a and b). The size of the change in motor retardation was greater than that in depressed mood and the change was more durable (Table 1, Figure 1a). These changes in motor retardation and in the severity dimension were sustained at 13 days (Table 1, Figure 1a). These effects of DMI were significantly greater not only than those seen in patients treated with placebo but also than those seen in patients treated with paroxetine. In addition, by 13 days, DMI significantly reduced hostility (Figure 2) and distressed expression compared with placebo-treated patients (Table 1). Such effects were also detectable using the more global severity scale, namely the HAM-21, in that DMI caused a significantly greater reduction in the score at 1 week than did either paroxetine or placebo. At 2 weeks, improvement due to DMI was superior to that caused by paroxetine, but not to that due to placebo.

Table 1 DMI vs Paroxetine vs Placebo: Comparison of Behavioral Changes at 1 and 2 Weeks of Treatment
Figure 1

Early drug actions: Comparing DMI, paroxetine, and placebo across time periods. (a) Motor retardation: At days 7 and 13 of treatment, DMI improved motor retardation significantly more than did either paroxetine or placebo (**p<0.01, ANCOVA) and at day 10, the same effect was observed (*p<0.05, ANCOVA). (b) Severity dimension: DM-MR: At days 7 and 13 of treatment, DMI improved the severity dimension significantly more than did either paroxetine or placebo (**p<0.01, ANCOVA) and at day 10, the improvement caused by DMI was significantly greater than that caused by paroxetine (*p<0.05, ANCOVA). Although behavioral assessments were carried out at days 21, 28, and 35 of treatment, these time points were not essential for the hypothesis being tested. Consequently, values obtained at these time points are omitted from Figures 1a, b and 2 for the clarity of data presentation.

Figure 2

Early drug actions: Comparing treatments across time periods on hostility. DMI and paroxetine reduce hostility at a significantly faster rate during the first 2 weeks than placebo (slopes test, p<0.05). At day 13 of treatment, DMI improved hostility significantly more than placebo (*p<0.05, ANCOVA).

The slopes test indicated a significantly more rapid paroxetine-induced reduction in hostility during the first 2 weeks of treatment than that due to placebo (see Table 1a and Figure 2). Treatment with paroxetine, however, showed no other significant differences from placebo during this period. DMI had greater effects on distressed expression than either paroxetine or placebo had at 2 weeks, and greater effects on cognition than paroxetine had at 2 weeks. No other differences in drug effects on behavioral constructs were found (Table 1).

Therapeutic Actions: Relationship of Early Behavioral Changes to Recovery Status at 6 Weeks

These analyses compared the early behavioral changes in patients who responded with a clear positive outcome to treatment at 6 weeks with those in patients treated with the same drug who were nonresponders at 6 weeks. For DMI-treated patients at 1 week, responders showed significantly greater reductions in motor retardation (p<0.001) and in the dimension, DM-MR (p<0.001), than the nonresponder group (see Table 2a). These differences, particularly in motor retardation and in the severity dimension, DM-MR, constitute ‘large’ effect sizes (Cohen, 1969) (Table 2a). At days 10 and 13, the significant improvements in motor retardation and DM-MR for responders were sustained. The HAM-21 also showed eventual responders to have a drop in overall severity after 1 week that was significantly greater than that in nonresponders. This difference, though, was not sustained after 2 weeks of treatment. Thus, the early DMI-induced behavioral improvement in motor retardation and depressed mood was large and was associated with its therapeutic action.

Table 2 Recovered vs Nonrecovered Patient Groups: Changes at 1 and 2 Weeks of Treatment

For paroxetine, no specific behavioral changes after week 1 were related to response. Anxiety was the earliest behavior influenced in paroxetine responders, in that the anxiety construct improved to a significantly greater extent than in nonresponders at 10 days and this effect appeared to be sustained (Table 2b and Figure 3). Consistent with this effect, the dimension of anxiety/agitation/somatization improved to a significantly greater extent at both 10 and 13 days of treatment in responders than in nonresponders (data not shown). At 2 weeks, reductions were observed in depressed mood, motor retardation, and distressed expression, as well as in the severity dimension, DM-MR, in responders to paroxetine with large effect sizes on depressed mood and DM-MR (Table 2b). This specific pattern of behavioral improvement produced by paroxetine in eventual responders differed from that caused by DMI, in that it appeared later and involved not only anxiety but also depressed mood, motor retardation, and distressed expression. Interestingly, the global severity measure (HAM-21) detected differences between paroxetine responders and nonresponders as early as 1 week, and this difference was sustained at 2 weeks (Table 2b).

Figure 3

Paroxetine responders vs nonresponders: Course of change on anxiety during first 2 weeks of treatment. Paroxetine improved anxiety in eventual responders by day 10 of treatment significantly more than in nonresponders (*p<0.05, ANCOVA).

For placebo, except for a reversal on motor retardation, that is, the responders were worse than nonresponders at week 1, no statistically significant differences in behavioral change that could distinguish responders from nonresponders were found during the first 2 weeks of treatment (data not shown). Global severity, that is, the HAM-21, also did not change significantly more early in treatment in placebo responders than in nonresponders. Therefore, early improvement among placebo-treated patients was unrelated to eventual responder status.

These results were obtained using the categorical index described earlier, which takes into account both improvement and final status on the HAM-D, to divide patients into responders and nonresponders. If the more traditional 50% improvement index in HAM-21 scores is used to create these categories, the population of patients in the two groups is slightly different. It is relevant to ask, then, if the results just presented would be similar if response categorization was based solely on the HAM-21, 50% index. They are; for example, DMI responders still showed significant improvement in motor retardation at 1 and 2 weeks of treatment (p<0.005 and <0.05, respectively) as compared with that measured in nonresponders. This was true for the DM-MR dimension as well (p<0.005 and <0.025, respectively, at these times). The HAM-21 total score also dropped significantly more at 1 week (p<0.05), but not at 2 weeks, of treatment in DMI responders than in nonresponders. Patterns of response to paroxetine in responders and nonresponders (categorized according to the HAM-21 criterion) were somewhat different from those shown in Table 2b, in that the HAM-21 categorization did not detect the early change in anxiety, but showed depressed mood to improve significantly more in responders at 1 week (p<0.025), as well as at 2 weeks. Similarly, motor retardation and the severity dimension, DM-MR, improved significantly more in responders at 2 weeks than in nonresponders. Categorization of placebo responders using the 50% decrease in HAM-21 criterion also did not result in any early improvement in specific behaviors or global severity as compared with that measured in eventual nonresponders. Such data reinforce those presented in Table 2a and b and indicate that early improvement can be used to predict eventual response, irrespective of the criterion used to categorize responders.

Estimating Onset and Time Course for Treatment Responders Through Survival Analyses

Table 3 summarizes the survival analysis results with treatment responders. Responders to DMI showed consistently more rapid improvement on both overall severity (as represented by the dimension of DM-MR; see Figure 4) and several behavioral indices. The time of onset estimates for paroxetine responders were more delayed, being 13 days for these indices. The time for the first sustained response was even later for the placebo responders (Table 3). Patients who responded to either DMI or paroxetine had earlier improvement on some dimensions than responders to placebo. The initial changes elicited by DMI are different from those elicited in patients treated with the SSRI, the former showing more rapid and greater changes in the motor activity and depressed mood aspects of the disorder.

Table 3 Time of Onset of Improvement in Treatment Responders (Survival Analysis)
Figure 4

Median time to onset and time course for treatment responders: DM-MR (survival analysis). There is a significant difference among these curves (p<0.05, Wilcoxon).

Predicting Outcome from Early Behavioral Changes

As shown in Table 4 for DMI-treated patients, there are significant relationships between the amount of improvement on several behaviors at the end of week 1 with percent improvement on the Ham-D total score at the end of 6 weeks of treatment. These relationships are also found with the severity indices DM-MR and Hostility–Interpersonal Sensitivity. In general, similar relationships were found for behavioral changes at the end of week 2.

Table 4 Product Moment Correlations of % Change on Behaviors at 1 and 2 Weeks With % Improvement on Ham-D Total Score at 6 Weeks

These relationships were stronger for DMI than for paroxetine. At 1 week, only a significant improvement in social behavior (positive adaptation) was associated with response at outcome. At 2 weeks, however, improvement in depressed mood, and the severity indices, anxiety–agitation–somatization and DM-MR, were significantly associated with outcome among paroxetine-treated patients. For placebo, only one significant association was found at week 2 between a behavioral characteristic, interpersonal sensitivity, and outcome, which was reduced more in placebo nonresponders than in responders.

Applying Logistic Regression to Develop Algorithms for Predicting Outcome from Early Behavioral Changes

The best predictive model for the patients treated with DMI was the simple model that had only the dimensional construct DM-MR after 1 week as the independent variable. This model achieved a sensitivity of 0.90 and a specificity of 0.88. The same model at 2 weeks was less balanced: it could achieve a sensitivity of 0.70 and a specificity of 0.81 when the balance favored specificity, or 0.90 and 0.69 when it favored sensitivity. For paroxetine, no variables had any predictive value at 1 week. The best paroxetine model was based on a linear combination of the distressed expression, cognitive impairment, and depressed mood constructs from the 2-week data and achieved a sensitivity of 0.85 and a specificity of 0.91. Adding the dimensional construct DM-MR improved the model slightly, to 0.92 sensitivity and 0.91 specificity.
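The trade-off between sensitivity and specificity described above can be illustrated with a minimal sketch. With a single predictor, the predicted probability from a logistic model rises or falls monotonically with that predictor, so a probability cutoff corresponds to a cutoff on the predictor itself. The sketch therefore thresholds the predictor directly and fits no model. The function name, the cutoffs, and the data are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(scores, responded, cutoff):
    """Predict 'responder' when score >= cutoff; return (sensitivity, specificity)."""
    pairs = list(zip(scores, responded))
    tp = sum(1 for s, r in pairs if s >= cutoff and r)      # responders caught
    fn = sum(1 for s, r in pairs if s < cutoff and r)       # responders missed
    tn = sum(1 for s, r in pairs if s < cutoff and not r)   # nonresponders excluded
    fp = sum(1 for s, r in pairs if s >= cutoff and not r)  # nonresponders flagged
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical week-1 percent improvement on a severity dimension (not study data).
improvement = [45, 38, 30, 22, 12, 25, 10, 8, 5, 0]
responder = [True] * 5 + [False] * 5  # outcome at 6 weeks

for cutoff in (10, 20, 30):
    sens, spec = sensitivity_specificity(improvement, responder, cutoff)
    print(f"cutoff {cutoff}%: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

Lowering the cutoff favors sensitivity at the cost of specificity, and raising it does the reverse, which is the sense in which a model can be tuned toward one or the other.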

DISCUSSION

The most important finding in this study is that it is possible to detect, within the first week or two of treatment, clinically significant improvement in those depressed patients who will eventually respond to pharmacotherapy after a 6-week trial. Furthermore, ADs that target either noradrenergic or serotonergic neurons induce initial improvement on different facets of depressive symptomatology. The selective NRI, DMI, initiated improvement through effects on depressed mood and motor retardation. The SSRI, paroxetine, on the other hand, produced initial improvement somewhat more slowly than DMI, initiating recovery by improving anxiety and, somewhat later, depressed mood, distressed expression, and cognitive functioning. In contrast to these consistent early behavioral effects in patients who eventually responded to these drugs, depressed patients who responded to 6 weeks of treatment with placebo showed no consistent early pattern of behavioral improvement. This finding could mean that the onset of placebo response is not initiated by the same neuronal systems affected by most AD drugs.

When interpreting these results, it is important to consider the patient population studied. As the study was carried out in two Veterans Administration hospitals, the majority (83%) of patients were male and had illnesses of sufficient severity to warrant hospitalization. Over three-quarters had at least one prior episode of depression and the average duration of the current episode was quite long, about 1.5 years. As in-patients, they were carefully monitored, with respect both to frequent behavioral assessment and to the development of side effects. This close monitoring probably contributed to the relatively low dropout rate (15%) after randomization, although about 20% dropped out during the initial 1-week placebo run-in period, primarily because they changed their minds about participating owing to the length of hospitalization required.

Another factor to consider is that we are measuring a number of behavioral variables in our patients, which raises the statistical issue of multiple comparisons. Although the behavioral methodology used has 11 state constructs and three severity dimensions, the results of our earlier study (Katz et al, 1987) led to specific hypotheses (as mentioned in the Introduction) that focused on five constructs (depressed mood, anxiety, distressed expression [the physical expression of depressed mood and anxiety], motor retardation, and hostility) as well as the severity dimension of depressed mood/motor retardation. Thus, there were a limited number of variables involved in the primary analyses. Further, many of the primary findings have p-values below the 0.05 level and they are consistent across several methods, for example, ANCOVA, linear mixed model, and survival analyses. We think such considerations strongly reinforce the validity of the major findings.

Efficacy, Onset, and Timing of Clinical Actions

On a traditional efficacy measure such as the 50% drop in the HAM-D score, treatment with DMI caused significantly greater improvement than placebo treatment did, but paroxetine did not. The categorical index resulted in a greater percentage of responders to placebo (45%) than that found using the HAM-D criterion. The likely reason for this difference is that the HAM-D criterion is an absolute one based on a single measure, such that patients who may be just one percentage point apart (49 vs 50% decrease) will be given different outcomes. By contrast, the categorical index criterion uses the HAM-D, the CGI, and GAS so as to measure not only improvement and final score but also general psychopathology and social functioning. Thus, the categorical index criterion is less black and white and perhaps more akin to what a clinician might use when evaluating a patient. The difference in the placebo response rates is attributable to three patients whose HAM-21 scores dropped by 44, 46, and 48%, but who were nonetheless classified as responders using the categorical index.

It is likely that the lower response rate found with paroxetine is due to the fact that most of the depressed patients in this study were male. Both Korstein et al (2000) and Joyce et al (2003) reported that men responded less well to SSRIs such as sertraline and fluoxetine than to TCAs. In these studies, the response rate of male patients to these drugs was about 40–45%, quite similar to the response rate of 46% in this study.

In spite of the fact that the different criteria for assigning outcome generated slightly different populations of patients in the two groups, the results obtained using either method of classification were similar. For example, in patients who responded to DMI after 6 weeks of treatment (using either the HAM-D or categorical index criteria), improvement in motor retardation and depressed mood was detected early in treatment, with such behavioral improvement not observed at these times in nonresponders. Similarly, the ability of paroxetine to initiate improvement in certain behaviors early in treatment in eventual responders was independent of the criterion used to categorize response, although there were some differences in the specific behaviors affected. Importantly, no specific pattern of behavioral improvement was detected early in treatment in placebo responders, irrespective of the criteria used to assign patients to this category.

The results on the onset and timing of response agree generally with those found by Stassen et al (1993, 1997), in that significant improvement occurs as early as 13 days for most ADs. However, our results do not agree with Stassen et al (1993, 1997) on other issues. The results with DMI, for example, indicate that the drug acted more rapidly than 13 days (DMI was not included in the Stassen et al, 1997 analysis). The time pattern with placebo was also different from that found by these authors, in that the placebo responders in this study did not respond similarly to the AD responders. Stassen et al (1993, 1997) conducted their analyses using the total Hamilton Scale score, which confined their generalizations to global severity and did not allow them to be extended to specific behavioral components of the disorder.

Central to the study and to our analysis, however, is the controversy over whether ADs take several weeks to initiate clinical action (eg Quitkin et al, 1984) or whether the behavioral effects begin sooner, within the first week, as originally observed (Kuhn, 1958; Kielholz and Poldinger, 1968; Angst, 1970), and as found in a number of subsequent clinical studies (Katz et al, 1987; Khan et al, 1989; Stassen et al, 1993). Large meta-analytic studies analyzing data post hoc from pharmaceutical trials additionally found evidence of early behavioral changes in the first 2 weeks of treatment (Dunbar and Fuell, 1992; Tollefson and Holman, 1994). While such large sample sizes may allow minimal clinical differences to reach statistical significance, these studies also suggest the onset of AD action early in treatment.

The disagreement about the length of time between the effect of the ADs on the functioning of neurotransmitter systems and clinical response has been in great part a function of the ambiguity of the concept of ‘onset of clinical response’. The fact that marked or full response, conventionally defined as equal to or greater than a 50% reduction on the Ham-D total score, can require several weeks or even months to occur is not disputed. If recovery is used as the definition of onset of clinical response, there is no question that a significant lag time exists between it and initial neurochemical effects of ADs. However, if the onset of clinical response is defined as ‘improvement’ (eg, a 20% sustained reduction of the Ham-D total score; Stassen et al, 1993) rather than recovery, the average time for the onset of clinical response to ADs is 13 days, significantly less than the 20 days required for ‘full response’ (the 50% criterion). Although the figures from similar studies vary somewhat, the results of the Stassen et al (1997) analysis are based on sufficient data to provide reliable estimates of these different time points.
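The distinction between onset of ‘improvement’ and ‘full response’ can be made concrete with a minimal sketch. It assumes one possible operational reading of a sustained reduction: the first assessment day from which the percent reduction from baseline stays at or above the threshold at every later assessment. That reading, the function name onset_day, and the scores are illustrative assumptions, not the published definition or study data.

```python
def onset_day(days, scores, baseline, threshold):
    """First day from which reduction from baseline stays >= threshold, else None."""
    reductions = [(baseline - s) / baseline for s in scores]
    for i, day in enumerate(days):
        # Sustained: this assessment and every later one meet the threshold.
        if all(r >= threshold for r in reductions[i:]):
            return day
    return None


# Hypothetical total scores for one patient with a baseline score of 25.
days = [3, 7, 10, 14, 21, 28, 42]
scores = [24, 21, 19, 19, 16, 12, 10]

print("improvement (20%):", onset_day(days, scores, 25, 0.20))
print("full response (50%):", onset_day(days, scores, 25, 0.50))
```

Under these assumptions the same patient shows onset of improvement well before meeting the 50% criterion, which is the gap between the two definitions discussed above.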

The current study assured equivalent treatment in the medication arms by monitoring drug levels in plasma and examined improvement in a more detailed manner, that is, by replacing the Ham-D total score with multiple measures of the major behavioral components of the disorder. This behavioral methodology results in the detection of specific behavioral change even earlier than 13 days. For DMI, motor activity and depressed mood began to change in 3–7 days; for paroxetine, the reduction in anxiety began by 10 days. Thus, by further refining the measurement of clinical response, it was possible to detect effects on behavior even earlier than previously reported. Further, these behaviors represent critical facets of the depressive disorder and were shown to be correlated with eventual treatment response. They, therefore, can provide the links to later full response and can be tracked alongside further changes in the functioning of the neurotransmitter systems. We conclude that the lag in time between the neurochemical and the initial behavioral actions of effective ADs is a matter of days, not many weeks.

Drug-Induced Effects on Monoamines and Behavioral Changes

It is interesting to consider the early behavioral effects produced by DMI and paroxetine from the perspective of their actions on noradrenergic or serotonergic neurons. DMI appeared to produce an initial stimulatory motor effect that was closely, but secondarily, associated with elevation of mood, and this was followed by improvement in anxiety. Although its effect in reducing motor retardation has been noted frequently since the early descriptions by Kielholz and Poldinger (1968) and Carlsson et al (1969), there has been little hard evidence until now that this is a specific clinical action of DMI, that is, an effect greater than that which occurs with other ADs and/or placebo, and furthermore, which is associated with the recovery process. Tonic noradrenergic activity is closely related to the overall state of behavioral arousal, which may be conceptualized as the degree to which an organism is ‘engaged’ with or responsive to its external environment (Aston-Jones et al, 1991; Jacobs et al, 1991). Brain noradrenergic activity is closely associated with behavioral and physiological indices of arousal, such as locomotion, eye movements, EEG activity, heart rate, and overt behavioral activity (Rasmussen et al, 1986; Berridge and Foote, 1991; Page et al, 1993; Rosario and Abercrombie, 1999). Given the close relationship that has been demonstrated between tonic noradrenergic activity and alertness, behavioral activation and arousal, the tonic elevation of extracellular NE levels produced by selective NRIs such as DMI likely contributes to an alleviation of the ‘inhibitory’ symptoms of depression such as psychomotor retardation, fatigue/languor, and depressed mood. Our data show that DMI exerts prominent and clinically significant initial effects on such components of depression, consistent with enhancement of noradrenergic transmission.

It is more difficult to obtain consensus about what general behavioral roles to ascribe to serotonin. Some consider serotonin to function mainly in a tonic steady-state mode of activity also related primarily to behavioral arousal (Rueter and Jacobs, 1996), although this relationship can be modified (Chaouloff et al, 1999). Others, though, consider that even if there is some constant tonic level of serotonergic activity, there can still be local modulation of serotonin release that may be both brain region and stimulus specific (Kirby et al, 1995, 1997). From a behavioral perspective, the activation of serotonergic neurons has been hypothesized to produce behavioral inhibition (Soubrié, 1986; Spoont, 1992; Handley, 1995), that is, serotonin may promote response suppression, inhibition, passivity, and waiting. More specifically at the behavioral level, it has been associated with the regulation of impulsivity (Soubrié, 1986), aggression (Linnoila et al, 1983), and anxiety (Briley and Chopin, 1991). Clinical research reaffirms the association of the functioning of the serotonin system and impulsive, aggressive behavior (Coccaro et al, 1989), and also indicates a specific and strong association with anxiety (Dunbar and Fuell, 1992; Montgomery, 1992; Gorman, 2002). In view of this, treatment with SSRIs might be expected to improve, at least initially, components of depression reflecting arousal, such as anxiety, agitation, and hostility. Consistent with this view, hostility was the earliest behavioral component in all patients that was shown to be reduced. Moreover, the earliest behavior that improved in responders to paroxetine compared to paroxetine nonresponders was anxiety. Improvement in depressed mood and cognitive impairment in responders occurred somewhat later. This later behavioral effect of paroxetine may not be due to serotonin exclusively, but to sequential or direct interactions between serotonin and noradrenergic or even dopaminergic systems (Willner, 1997; Bonhomme and Esposito, 1998; Bourin et al, 2001).

Our data show that paroxetine and DMI, prototypes for SSRIs and selective NRIs, initiate behavioral improvement somewhat differently. This leads to the possibility that ADs with different mechanisms of action may initiate their own unique pattern of response. Such data are consistent with those generated in the innovative work of Delgado and Moreno (2000), who demonstrated that abruptly reducing the availability of 5-HT through rapid depletion of dietary tryptophan interrupted the therapeutic effect in patients responsive to an SSRI. When the same procedure was applied to patients responsive to a selective NRI, it had no effect. By contrast, catecholamine depletion transiently reversed responses produced by selective NRIs without affecting the response to SSRIs. This work reinforces the idea that the neurobiological mechanisms underlying responsiveness to different classes of ADs involve initial actions on different neurotransmitter systems. Nevertheless, it is evident that different types of ADs can achieve the same therapeutic efficacy.

Drug Actions on the Dimensional Structure of the Depressive Disorder

To determine how changes in discrete behaviors brought about by drug-induced effects on neuroamine systems resolve the complex, clinical state of depression, it is useful to view the impact of the drugs on the three major severity dimensions of the disorder, that is, DM-MR, anxiety–agitation–somatization (‘arousal’), and hostility–interpersonal sensitivity (Katz et al, 1984; Katz and Maas, 1994). The sequence of behavioral actions would indicate that the selective NRI acted first on the DM-retardation dimension, followed by impact on the anxiety arousal dimension. The secondary effects in the sequence, reduction of anxiety and hostility, might then be associated with the somewhat later effect that DMI has on the serotonin system (Frazer, 2000). The therapeutic pattern for paroxetine as an AD, that is, the initial effect on anxiety, has been found in many other studies, so there seems little question that reducing anxiety is a major focus of paroxetine's clinical activity. The pattern is further supported by paroxetine's use as an ‘anxiolytic’, now FDA approved for the treatment of a range of anxiety disorders (Dunbar and Fuell, 1992; Lydiard and Bobes, 2000; Pollack et al, 2001). The sequence of behavioral changes for paroxetine appears to reflect an initial impact on the serotonergic system, that is, the ‘calming’ of the negative arousal and hostility dimensions, followed by elevation of mood and improvement in other aspects of the disorder. The results, therefore, support a clinical analysis of AD actions on depression, which views the SSRI and selective NRI drugs as impacting the three major dimensions of the depressed state in contrasting sequences. The NRI first improves motor retardation-depressed mood and, secondly, arousal components of the illness. By contrast, the SSRI reduces arousal first and then depressed mood.

Placebo Response

In this study, responders to placebo treatment responded significantly more slowly than responders to drugs. The placebo responders also showed no differences in early behavioral changes from those who did not respond to placebo. Thus, placebo responders were ‘late’ in response and showed no pattern of early improvement that was distinctive, as was found with responders to ADs. This result is in contrast to those of Stassen et al (1997) who found the time course of response to placebo to be virtually identical to the time course for responders to a range of active ADs, that is, initial improvement was evident by day 13 for all treatments. The differences in results from the two studies may be due to differences in the clinical populations sampled, although both were primarily in-patient studies. Our data are also in contrast to those of Quitkin et al (1984, 1987), who described a pattern of early response to placebo that was not sustained. The limitations of these studies have been discussed in detail previously (Katz et al, 1996/1997). It is clear that the basis for placebo response is still not well understood. This issue and our results will be explored in greater detail in a subsequent paper.

Prediction of Drug Response

The finding that the onset of therapeutic changes for DMI and paroxetine occurs within the first 2 weeks of treatment suggests that it may be possible in the future to use these early changes to predict outcome at 6 weeks of treatment. As a first step in testing the clinical applicability of these results, we determined whether the amount of improvement produced by each of the drugs on the various behavioral and severity indices at the end of weeks 1 and 2 was correlated with the amount of improvement on a primary measure of outcome, the Hamilton depression total score, at 6 weeks. If so, formulas could be generated utilizing the findings at weeks 1 and 2, to identify the critical behavioral factors to be used in predicting outcome. Decisions might then be made at 2 weeks of treatment to determine whether a patient should be maintained on the initial drug or shifted to a new treatment. The findings in this study indicate that by the end of the first week of treatment, changes in several behavioral variables produced by DMI are associated with clinical response at 6 weeks. They include the severity dimension, DM-MR, and the behavioral facets, depressed mood, anxiety, and somatization (Table 4). These behavioral associations are sustained at 2 weeks. A reduction in severity dimensions is also associated with outcome for paroxetine, but these associations do not appear until the end of 2 weeks of treatment.

These promising relationships were reinforced by the sensitivity and specificity analyses. It appears that status on the DM-MR severity dimension can be used to predict outcome for DMI at 1 week, with 0.90 sensitivity and 0.88 specificity. A combination of behavioral facets, together with the DM-MR dimension, can be used for paroxetine to achieve similarly high sensitivity and specificity at 2 weeks. A study would be required to test whether these high estimates are equally valid when applied in the outpatient situation, but it is clear that high-level prediction is quite possible using a limited set of behavioral and severity indices. The generalizability of our findings may be limited by the fact that most of our patients were men of lower socioeconomic status with recurrent depression. This limitation further underscores the need for replication of this work in an outpatient population.

Prior studies have had great difficulty identifying predictors of AD response based on pretreatment patient characteristics (Joyce and Paykel, 1989). There have been, however, a number of studies over the years (Small et al, 1981; Coryell et al, 1982; Katz et al, 1987; Khan et al, 1989; Nagayama et al, 1991; Boyer and Feighner, 1994) that have demonstrated a relationship between early response to ADs, mainly tricyclics, and treatment outcome. The current study adds that drug-specific types of behavioral responses in the first 1 or 2 weeks of treatment with DMI or paroxetine are highly predictive of later outcome. Consistent with these results, Szegedi et al (2003) found that improvement in the first 2 weeks in depressed patients treated with either mirtazapine or paroxetine was highly predictive of a positive response after 6 weeks of treatment. Lack of early improvement was also highly predictive of lack of improvement after 6 weeks. Thus, an increasing body of data is emerging that refutes the concept that ADs initiate behavioral improvement quite slowly.

Earlier studies may have insufficiently assessed the nature and sequence of the specific behavioral changes that might accompany early drug-induced changes in the functioning of the monoamine systems. The difficulty in describing this neurochemical-behavioral process in earlier studies was influenced by the use of ‘full clinical response’ as the criterion of drug action, rather than treating recovery as a process that begins with changes in parts of the disorder and evolves, sometimes rapidly, into resolution of the full disorder. Our results imply that despite the current emphasis in AD development on selective 5-HT receptor subtypes in order to increase the rate of clinical action (Artigas et al, 1996), more rapid improvement is seen with drugs that target the noradrenergic system than with drugs that target serotonergic neurons. It is important to recognize this in future drug development.