INTRODUCTION

Clinicians treating patients with psychiatric disorders (eg, major depression, substance abuse) face considerable heterogeneity among patients in treatment response. In addition, the course of these illnesses is often characterized by a waxing and waning of symptoms over time. Consequently, the best clinical care may require changes in treatment type and dose over time. Treatment is also driven by additional factors that vary over time such as side-effect severity, the presence or emergence of co-occurring disorders, treatment adherence, drug–drug interactions, and so on. In general, sequential decisions need to be made about when to change treatment intensity or type (eg, switch or augment treatments) and about which treatment should be used next.

Below, characteristics of chronic psychiatric disorders that require sequential decision making are described in more detail. To illustrate the issues we consider the clinical treatment of patients with major depression. However, the points raised are applicable to most other chronic psychiatric disorders (eg, substance abuse and schizophrenia). Adaptive treatment strategies, which operationalize sequential clinical decision making, are introduced. Then we describe specific methodological issues and provide brief descriptions of experimental designs and data analytic tools (based on current technology from the fields of engineering, computer science, and statistics) that address these issues.

CHARACTERISTICS OF CHRONIC DISORDERS THAT REQUIRE ADAPTIVE, SEQUENTIAL DECISION MAKING

Chronic psychiatric disorders require sequential decision making because (1) treatment response is heterogeneous (no single treatment is universally effective), (2) there remains a high risk of relapse or recurrence of symptoms, both during and following treatment, and (3) more intensive or longer-term treatments may increase the possibility of intolerable side effects and increase patient burden.

Heterogeneity of treatment response refers to treatment responses that vary across patients (between-patient heterogeneity in response may lead to sequential decision making in the treatment of acute disorders as well) and/or over time within a patient. For example, 65–70% of patients with major depressive disorder do not achieve remission with a single acute phase of treatment (Trivedi et al, 2006). Response heterogeneity among patients leads clinicians, when presented with a new patient, to try a sequence of treatments to find an effective one. This is a natural consequence of the availability of multiple treatments.

Response heterogeneity leads to another common clinical problem: a meaningful reduction in symptoms without symptom elimination may occur. Should a switch to a different treatment be made, with the risk of losing the benefit of the initial treatment, should a second treatment be added with the risk of increasing side-effect burden, or should time be allowed to pass in the hopes of a gradual improvement (American Psychiatric Association, 2000)?

The waxing and waning course of chronic disorders often requires ongoing treatment modifications to minimize symptom exacerbation and prevent or delay relapse. For example, in a prospective study by Solomon et al (2000) of 318 subjects with major depressive disorder, 25 relapsed within 1 year despite ongoing treatment. The ongoing risk of relapse suggests consideration of further treatment options such as continuing care or simply monitoring.

Furthermore, longer-term and/or higher-intensity treatments require ongoing treatment modifications to reduce patient burden (eg, side effects, lifestyle changes, frequent clinic visits, and so on). Weight gain, a common side effect of medication, may prompt clinicians to recommend an exercise program or begin adjunctive therapy (Hirschfeld, 2003). Adherence, a substantial problem in managing chronic psychiatric disorders (Nemeroff, 2003), is often addressed by conducting behavioral therapies or educational programs.

To summarize, the treatment of chronic psychiatric disorders requires decisions regarding sequencing of treatment and timing of treatment changes. An important consideration is how best to use outcomes observed during treatment (eg, response, burden, and adherence) and pretreatment characteristics (eg, genetics and family history) to inform these decisions.

OPERATIONALIZING ADAPTIVE, SEQUENTIAL DECISION MAKING: ADAPTIVE TREATMENT STRATEGIES

At the core of the management of chronic psychiatric disorders is the idea that important clinical outcomes are systematically tracked and that, at specified times (‘critical decision points’) (Crismon et al, 1999; Adli et al, 2006), clinical decisions are required to optimally control the disorder, maximize functional status, and minimize patient burden and complications. Adaptive treatment strategies (adaptive treatment strategies are also called stepped care strategies (Sobell and Sobell, 2000), treatment algorithms (Rush, 2001), and expert systems (Prochaska et al, 2001)) (Lavori and Dawson, 1998, 2003; Lavori et al, 2000; Murphy et al, 2001; Murphy, 2003; Murphy and McKay, 2004; Collins et al, 2004) provide a framework for operationalizing these key clinical decisions. By operationalizing these decisions, they can be studied and improved upon, with the aim of reducing inappropriate variance in treatment delivery while retaining appropriate flexibility to tailor these decisions to individual patients (Adli et al, 2006; Rush et al, 1998, 1999a, 1999b; Rush, 2005). This section defines adaptive treatment strategies.

In melding the ‘art of medicine’ to the science of adaptive treatment strategies, it is useful to define the terms. The terms tailoring variables, decision options, and decision rules reflect clinical thinking (Collins et al, 2004). Tailoring variables (tailoring variables are also called prescriptive indices (Hollon and Beck, 2004)) are variables that are useful for pinpointing when to alter treatment and for identifying which treatment is best for whom. The best treatment for an individual differs according to the values of these variables. Potential tailoring variables may include variables ascertained before starting a treatment; for example, in the case of alcohol-dependent, depressed patients, the disorder the patient finds most burdensome may be a useful tailoring variable. Other potential tailoring variables may include outcomes obtained during treatment such as the speed of benefit (eg, systematic measurement of symptom severity or biological tests) or adherence and side effects.

Decision options are the range of options available at the point of adaptation. For example, watchful waiting might be one of several initial decision options. Treatment augmentation or switching could include different medications, psychotherapies, or adjunctive components aimed at improving adherence or reducing side effects. Decision options also include the range of possible modalities of delivering treatment (eg, inpatient, day patient, and outpatient). Finally, decision rules provide specific guidance for decision making given the tailoring variables. High-quality decision rules are operationalized. For example, a decision rule concerning whether one should augment or switch treatment might be

‘If the depression has improved but not yet remitted based on a symptom measure and side effects are tolerable, then augment current treatment with either of two medications A or B; if the depression has not improved or side effects are intolerable switch to either of medications C or D; if the depression has remitted continue on current treatment.’

Decision rules need not be strict or categorical, as they can specify a range of treatment options with each option carrying different risks and benefits.
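The example rule quoted above can be written as a small function that maps tailoring variables to a set of decision options. The sketch below is illustrative only: the names PatientStatus and decision_rule are hypothetical, and checking remission first is an assumption, because the quoted rule does not say what to do when depression has remitted but side effects are intolerable.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PatientStatus:
    """Tailoring variables used by the example rule (hypothetical names)."""
    improved: bool                # symptom measure shows improvement
    remitted: bool                # symptom measure meets the remission criterion
    side_effects_tolerable: bool


def decision_rule(status: PatientStatus) -> Tuple[str, List[str]]:
    """Return (action, candidate medications) for the example rule.

    Assumption: remission is checked first, so a remitted patient
    continues current treatment regardless of side effects.
    """
    if status.remitted:
        return ("continue", [])
    if status.improved and status.side_effects_tolerable:
        # Improved but not remitted, tolerable side effects: augment with A or B.
        return ("augment", ["A", "B"])
    # Not improved, or side effects intolerable: switch to C or D.
    return ("switch", ["C", "D"])


print(decision_rule(PatientStatus(improved=True, remitted=False,
                                  side_effects_tolerable=True)))
```

Returning a list of candidate medications rather than a single one reflects the point that a rule can specify a range of options and leave the final choice to the clinician and patient.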

In summary, adaptive treatment strategies are a series of decision rules that repeatedly tailor treatment decisions to the patient.

USING DATA TO DEVELOP ADAPTIVE TREATMENT STRATEGIES

Having highlighted the variety of clinical decision-making issues to be considered in developing adaptive treatment strategies, we now discuss two methodological elements related to the use of data to inform the development of these strategies. First, we present promising experimental designs that inform the construction of decision rules. Second, we present analytic methods for constructing data-driven decision rules. We briefly discuss only the main points, but provide references for interested readers to pursue each issue in greater depth (see Supplementary Information).

Experimental Designs

Why are new experimental designs needed? Consider the gold standard, randomized controlled trials (RCTs). In traditional RCTs, a few treatment conditions are compared with the goal of determining which treatment results in better outcomes. The comparison treatment usually corresponds to a ‘control condition’ (eg, placebo control or ‘treatment as usual’). Thus, traditional RCTs address important but very circumscribed questions such as, will treatment condition A result in better outcomes on average than treatment condition B? Results often provide the first evidence of safety, tolerability, and efficacy of a treatment.

Traditional RCTs are not well suited, however, for determining when or how a specific treatment is best used (eg, when should treatment A be used in the course of trying one and then other treatments to define the best overall treatment strategy or sequence for individual patients). These RCTs also do not answer essential tactical questions (eg, ‘When should a treatment with insufficient response be changed?’ or ‘After response to a specific treatment, is the intensity or type of maintenance treatment important for successful long-term management?’). Answers to these questions are essential to achieving optimal long-term outcomes and to defining an evidence base for system and policy research (Rush and Kupfer, 1995; Rush and Prien, 1995; Rush et al, 1998). The experimental designs discussed below have been developed specifically to address these types of questions (the term ‘design’ refers only to the experimental design, not to the design of a treatment or of an adaptive treatment strategy).

The experimental designs (Lavori and Dawson, 2003; Dawson and Lavori, 2004; Murphy, 2005) are variations on a sequential, multiple assignment, randomized trial (SMART). A SMART design uses multiple randomizations to assist in the construction of a powerful adaptive treatment strategy. A randomization occurs at each critical decision point. SMART designs are not confirmatory experiments and may not involve a control condition (Dawson and Lavori, 2004). Several studies have implemented variations on the SMART design (Stone et al, 1995; Tummarello et al, 1997; Stroup et al, 2003; Fava et al, 2003; Rush et al, 2004; Rabinowitz et al, 2005).

Consider the following hypothetical SMART trial for patients with major depression (this example combines elements of the Fava et al, 2003 and Stroup et al, 2003 trials). In this 16-week trial design, subjects are randomized to different initial treatments (SSRI A vs SSRI B), then nonremitting subjects are re-randomized to different second step treatments. If at the end of the initial 8 weeks the subject's depression level meets the criterion for remission, then he/she continues on current treatment and is provided a continuing care program. If the subject's depression level does not meet the criterion for remission, then he/she is randomized to either a switch to a third SSRI (D) or to an augmentation with an anti-anxiety medication (C) for the remainder of the 16-week period.

Figure 1 highlights the four resulting conditions: (1) ‘medication A for 8 weeks, if disorder does not remit then switch medication to D’, (2) ‘medication A for 8 weeks, if disorder does not remit then augment with C’, (3) ‘medication B for 8 weeks, if disorder does not remit then switch medication to D’, and (4) ‘medication B for 8 weeks, if disorder does not remit then augment with C’. In all four conditions, subjects move to continuing care if their depression remits. Each of the four groups is assigned a particular adaptive treatment strategy. When viewed this way, it is evident that randomization to the four conditions may be conducted before trial initiation (Murphy et al, 2006).
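The four conditions, and the observation that randomization can be carried out before the trial begins, can be sketched in a few lines. This is a minimal illustration, not the procedure of any cited trial: the function names are hypothetical, and equal allocation across the four strategies is an assumption.

```python
import random
from itertools import product

INITIAL_TREATMENTS = ["A", "B"]                    # SSRI A vs SSRI B
SECOND_STEPS = ["switch to D", "augment with C"]   # options for nonremitters


def embedded_strategies():
    """List the four adaptive treatment strategies embedded in the design."""
    return list(product(INITIAL_TREATMENTS, SECOND_STEPS))


def randomize_upfront(rng):
    """Assign a subject to one of the four strategies before the trial starts.

    Assumption: equal allocation, which matches two sequential 1:1
    randomizations.
    """
    return rng.choice(embedded_strategies())


def treatment_path(strategy, remitted_at_week_8):
    """Return the (weeks 1-8, weeks 9-16) treatments a subject receives."""
    first, second = strategy
    if remitted_at_week_8:
        # Remitters stay on the initial medication and move to continuing care.
        return (first, "continue " + first + " with continuing care")
    return (first, second)


rng = random.Random(0)  # fixed seed only so the sketch is reproducible
strategy = randomize_upfront(rng)
print(strategy, treatment_path(strategy, remitted_at_week_8=False))
```

The second-step assignment is drawn at the start but only takes effect for subjects whose depression has not remitted at week 8, which is why up-front and sequential randomization lead to the same four groups.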

Figure 1. A hypothetical SMART trial for depression. R denotes randomization.

Even though the four simple adaptive treatment strategies only use remission as a tailoring variable, in the analysis of the trial data one can assess the usefulness of potential tailoring variables including adherence and side-effect severity for deciding which treatment is the best second step treatment for nonremitters (Murphy, 2005; Murphy et al, 2006). This design also permits a comparison of the two initial treatments in the setting in which an early lack of response is followed by switching or augmenting the treatment. The latter comparison, using the study end point, provides a more clinically relevant comparison than is typical in traditional RCTs.

The SMART design is used to proactively construct and optimize an adaptive treatment strategy. Results of several SMART designs may be needed to fully optimize an adaptive treatment strategy. The optimized strategy should then be tested against an appropriate alternative in a confirmatory RCT. See Murphy et al (2006) for simple data analysis methods and more discussion of SMART designs in the addiction field.

Needed research and collaboration

More careful implementations of SMART experimental designs are needed to provide critical clinically relevant information, to further demonstrate the utility of these designs and to illuminate any unexpected issues.

Constructing Decision Rules from Data

Why are new ways of analyzing data needed? It turns out that in many cases the construction of optimized treatment decision rules requires a more holistic approach than expected. For example, it is tempting to ascertain the best treatment at any given time ignoring future treatments. This approach, however, can lead to erroneous conclusions about the best sequence of treatments. The effect of the sequence cannot be accurately estimated by evaluating only a single treatment episode. For example, an initial course of cognitive therapy for depression may be much more effective in the long term when followed by less frequent therapy sessions (continuation treatment) than when followed by waiting without therapy (Jarrett et al, 1998).

Consider two treatments (A and B) that differ in terms of immediate response, favoring A. But when B is followed by B augmented with C, the longer-term response during the entire time period may exceed the effect of A followed by A augmented with C (Figure 2). This occurs for two reasons. First, a patient who responds to B is more likely to either remain in or progress to remission than a patient who responds to A. Second, treatment B followed by B+C is synergistic; that is, among those who do not respond to B, treatment B+C produces a higher remission rate than treatment A+C produces among those who did not respond initially to A.

Figure 2. A comparison of two strategies. The strategy beginning with medication A has an overall remission rate at 4 months of 58% (16+42%). The strategy beginning with medication B has an overall remission rate at 4 months of 65% (25+40%). Medication A is best if considered as a stand-alone treatment, but medication B is best initially when considered as part of a sequence of treatments.

Similarly, a treatment may be very useful in the long term, but may entail greater cost or inconvenience in the short term. For instance, cognitive therapy may be useful in reducing relapses once the treatment is stopped (Fava et al, 1998, 2001; Hollon et al, 2005), yet it is more time consuming and expensive in the short term. So, too, vagus nerve stimulation (Rush et al, 2005a, 2005b; Sackeim et al, 2001; George et al, 2005) may have only modest or minimal short-term effects, yet in the longer term efficacy may increase. The fact that one should incorporate the effects of future treatment decisions when evaluating present treatment is well known to scientists who work on improving sequential decision making (Parmigiani, 2002; see comments on myopic decisions in Sutton and Barto (1998)). There are methods for constructing decision rules that incorporate the effects of future decisions when evaluating present treatment decisions (Thall et al, 2000; Pineau et al, 2003; Parmigiani, 2002; Qin and Badgwell, 2003; Braun et al, 2001; Sutton and Barto, 1998; Murphy et al, 2001; Murphy, 2003; Robins, 2004). These methods also permit the evaluation of the tailoring variables.

One intuitive computer science technique for constructing decision rules is called ‘Q-learning’ (Sutton and Barto, 1998; Blatt et al, 2004). Q-learning explicitly incorporates the effects of future decisions; it is a generalization of the familiar regression model. To illustrate a simple version of Q-learning, suppose the goal is to minimize the average level of depression over a 4-month period, and suppose that data from the SMART design in Figure 1 are available. Note there are only two key decisions in this rather simple trial: the initial treatment decision and then the second treatment decision (for those not responding satisfactorily to the initial treatment). Suppose further that remission and side-effect level are to be used as tailoring variables in decision making.

In Q-learning with SMART data, the construction of the decision rules works backwards from the last decision to the first decision. As there are two treatment decisions, there are two regressions. Consider the last (here second) treatment decision. This regression uses data from subjects whose depression did not remit by 8 weeks. A simple model uses a summary of depression during weeks 9 through 16 as the dependent variable (Y2) and the regression model

β0 + β1S8 + (β2 + β3S8)T2     (1)

The subscript 8 indicates that the side-effect level, S8, is a summary of side effects up to the end of the eighth week. In general, the regression might include further potential tailoring variables such as the number of past depression episodes, the adherence level during the initial 8 weeks, and the initial treatment to which the subject was assigned. The treatment T2 is coded as 1 if the switch is assigned and is coded as 0 if augmentation is assigned. In this simple case, the decision rule recommends a switch in treatment for a patient with nonremitting depression if β0+β1S8+(β2+β3S8), the predicted depression under a switch, is smaller than β0+β1S8, the predicted depression under augmentation, and recommends an augmentation otherwise (ie, recommend a switch if β2+β3S8<0). If one expects that the higher the side effects S8 are, the better it is to switch treatment, then β3 will be negative.
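The second-step rule can be sketched directly from the fitted coefficients. The sketch assumes the regression model takes the form β0+β1S8+(β2+β3S8)T2, as implied by the comparison described above. The function names and coefficient values are hypothetical and chosen only for illustration; they are not estimates from any trial.

```python
def predicted_y2(beta, s8, t2):
    """Predicted depression summary for side-effect level s8 and treatment t2.

    beta = (b0, b1, b2, b3); t2 is 1 for a switch and 0 for augmentation.
    """
    b0, b1, b2, b3 = beta
    return b0 + b1 * s8 + (b2 + b3 * s8) * t2


def recommend_second_step(beta, s8):
    """Recommend the option with the lower predicted depression.

    A switch is recommended only when it is strictly better,
    ie, when b2 + b3 * s8 < 0; otherwise augmentation is recommended.
    """
    if predicted_y2(beta, s8, 1) < predicted_y2(beta, s8, 0):
        return "switch"
    return "augment"


# Hypothetical coefficients: b3 < 0, so higher side effects favor a switch.
beta = (10.0, 0.5, 2.0, -1.0)
for s8 in (1, 3):
    print(s8, recommend_second_step(beta, s8))
```

With these illustrative values the rule switches treatment once S8 exceeds 2, the point at which β2+β3S8 becomes negative.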

Now consider the initial decision. In this regression we use data from all subjects regardless of whether their depression remitted. It is insufficient to use a summary of depression during the first 8 weeks (Y1) or the indicator of remission as the dependent variable because both of these represent only short-term benefits instead of both short- and long-term benefits of the initial treatment. Instead a term is added to Y1; this term represents longer-term benefits of the initial decision. Denote this additional term by V; model (1) provides V for the nonremitting subjects. In this case, V is the smaller of β0+β1S8+(β2+β3S8) and β0+β1S8. The former term is smaller if a switch in treatment was found to be best. V represents the effect of the initial decision on the depression summary during weeks 9–16 under the best second treatment decision (if the subject's depression did not remit by week 8). If a subject's depression remitted by week 8, then V is simply the predicted Y2 from a regression of Y2 on S8 for the remitting subjects. The regression for the initial treatment decision uses Y1+V as the dependent variable and the regression model α0+α1T1, where treatment T1 is coded as 1 if medication A is assigned and is coded as 0 otherwise. Because the goal is to minimize depression, in this simple case the decision rule recommends medication A if α1 is negative and recommends medication B otherwise. In general one would include further independent variables such as the number of past depression episodes and other subject characteristics.
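The backward, two-regression procedure can be sketched end to end. This is a minimal sketch under stated assumptions, not a definitive implementation: the second-step model is taken to be β0+β1S8+(β2+β3S8)T2, the helper names (ols, toy_data, q_learning, recommend_initial) are hypothetical, and the data are an artificial, noise-free grid built only so that the known coefficients can be recovered. A small least-squares solver is included to keep the example free of external libraries.

```python
def ols(X, y):
    """Least-squares coefficients via normal equations and Gauss-Jordan."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(p)]
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]


def toy_data():
    """Artificial noise-free data (lower y means less depression)."""
    rows = []
    for t1 in (0, 1):                 # 1 = medication A, 0 = medication B
        for s8 in range(5):           # side-effect summary at week 8
            y1 = 14 - 2 * t1
            rows.append(dict(t1=t1, s8=s8, remitted=True, t2=None,
                             y1=y1, y2=4 + 0.2 * s8))
            for t2 in (0, 1):         # 1 = switch, 0 = augment
                y2 = 10 + 0.5 * s8 + (2 - s8) * t2
                rows.append(dict(t1=t1, s8=s8, remitted=False, t2=t2,
                                 y1=y1, y2=y2))
    return rows


def q_learning(data):
    """Fit the second-step regressions first, then the initial-step one."""
    nonrem = [d for d in data if not d["remitted"]]
    rem = [d for d in data if d["remitted"]]
    # Last decision: nonremitters only, with a treatment-by-S8 interaction.
    beta = ols([[1, d["s8"], d["t2"], d["s8"] * d["t2"]] for d in nonrem],
               [d["y2"] for d in nonrem])
    # Remitters: regression of Y2 on S8.
    gamma = ols([[1, d["s8"]] for d in rem], [d["y2"] for d in rem])

    def v(d):
        if d["remitted"]:
            return gamma[0] + gamma[1] * d["s8"]
        augment = beta[0] + beta[1] * d["s8"]
        switch = augment + beta[2] + beta[3] * d["s8"]
        return min(switch, augment)   # value under the best second step

    # Initial decision: all subjects, outcome Y1 + V.
    alpha = ols([[1, d["t1"]] for d in data],
                [d["y1"] + v(d) for d in data])
    return beta, gamma, alpha


def recommend_initial(alpha):
    """Recommend A when it lowers predicted Y1 + V (alpha1 < 0), else B."""
    return "A" if alpha[1] < 0 else "B"


beta, gamma, alpha = q_learning(toy_data())
print([round(b, 3) for b in beta], [round(a, 3) for a in alpha],
      recommend_initial(alpha))
```

Because the toy grid is balanced across the initial treatments, V has the same distribution in both arms and the estimated α1 reflects only the built-in difference in Y1. In real SMART data the initial treatment would also shift remission rates and side effects, which is the longer-term effect that V is meant to capture.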

Needed research and collaboration

Although there are several methods for using data to construct adaptive treatment strategies, these methods have not been evaluated in realistic settings. Collaborations are needed to provide practical evaluations of existing methods, like Q-learning, for developing adaptive treatment strategies. For example, the illustration provided above is somewhat simplistic. In practice, there are often more than two decisions, each may involve a choice between more than two options, the regression models might not be linear, and there may be a variety of outcomes.

DISCUSSION

Effective management of chronic psychiatric disorders presents many challenges. Response heterogeneity is common, treatments may become burdensome, adherence is problematic, and many patients may relapse. Additionally, these disorders often occur in conjunction with other health and social problems. These disorder characteristics and the treatment/social settings in which they occur motivate the development of adaptive treatment strategies. Addressing tactical questions concerning the length of time to wait for treatment response and the choice of subsequent treatment is crucial in this endeavor. We have discussed a variety of promising methodologies for developing adaptive treatment strategies. These methodologies, however, while clearly useful in other scientific domains, are relatively untested in the field of psychiatric disorders. Thus, the primary challenge is to form collaborative teams to evaluate these and other methodologies to construct evidence-based adaptive treatment strategies. These interdisciplinary, collaborative teams can spur new ways to conceptualize sequential decision making and enable the use of new methodologies in improving clinical care for individuals with chronic disorders. The scientific opportunities, potential for improved patient care, and the intellectual challenges entailed in such work provide strong incentives for such efforts.