Introduction

Depressive disorders are a major burden to societies worldwide, being the single largest contributors to years lived with disability1. This is largely due to the often chronic relapsing nature of depression, which underlies the functional and social impairments it brings2. Hence, successful treatment of a particular depressive episode, e.g. by achieving response to an antidepressant medication (ADM), is critical, but only the first step.

Thus, the prevention of relapses is the next step as over half of the patients with one depressive episode will experience a second one and the risk of relapse only increases further thereafter3. Preventing relapses is of paramount importance for the longer-term course of the illness, and a number of strategies exist. One important strategy is continuation and maintenance treatment with ADM, which reduces the risk of relapse4,5,6,7. However, there is still a risk of breakthrough depression, i.e. the development of further depressive episodes while taking ADMs8. Furthermore, many patients want to discontinue their ADM due to side-effects such as weight gain and sexual dysfunction9 or adhere only partially10. At the same time, one in three patients relapses within 6 months after discontinuation4.

Thus, not all patients benefit equally from continuation treatment and there appears to be variation in individual trajectories after the initial response to ADMs11,12,13,14. Markers that identify these trajectories and separate those patients who can safely discontinue their ADMs from those with a higher risk of relapse after discontinuation clearly have the potential of improving this situation.

Indeed, current guidelines take some of this variation into account, and recommend continuation treatment for 4–9 months after the first depressive episode and 2 years or more after recurrent episodes15,16. More recently, guidelines also refer to residual symptoms and physical and psychological comorbidities17. These recommendations are based on evidence derived from the natural course of depression and overall relapse risk18,19,20. However, the importance of these markers has been disputed21: two meta-analyses came to diametrically opposed conclusions about the relevance of the number of prior episodes5,22, and five separate meta-analyses have failed to find an effect of length of ADM treatment on relapse risk after discontinuation4,5,6,22,23.

Several other predictors of relapse after discontinuation exist in the literature21. These include ethnicity24, neurovegetative symptoms25, melancholic subtype25, anxiety26, somatic pain27 and response pattern to drug28,29. Only the last predictor, having a placebo drug response, i.e. fast but unstable response, compared to a true drug response, i.e. slower but sustained response30, has been replicated28,29. Unfortunately, assessing this measure will be difficult in clinical practice.

Two further points complicate this picture. The first is methodological. Studies have mainly asked whether a particular variable differs between groups of patients who do and do not go on to relapse. Unfortunately, while such differences might reach statistical significance, they might still fail to perform well as predictors31. Regression analyses have been used in some studies, but a simple regression bears the risk of overfitting32. To make inferences about a new patient in clinical practice, the predictive power in cases outside of the sample used to fit the regression model needs to be determined, e.g. using cross-validation. To our knowledge, this has not been reported in the literature so far. Second, most studies have been performed in the setting of double-blind randomised controlled trials (RCTs). While this is the ideal approach to examine whether an active compound has a causal role in reducing relapse, it may underestimate relapse rates after discontinuation because medication discontinuation might have psychological effects in addition to direct pharmacological effects.

Here, we report findings from the AIDA study—a two-centre, longitudinal, naturalistic observational study of antidepressant discontinuation. Our first aim was to investigate the extent to which variables which are easily assessable in a naturalistic setting can predict individual relapse risk and possibly guide the decision to discontinue or not. We paid specific attention to previously reported clinical predictors and examined their performance in a naturalistic setting. A secondary goal of this study was to understand the effects of discontinuation itself and how these relate to relapse. Accordingly, we investigated if any of the state-dependent variables changed with discontinuation and if that change differed between relapsers and non-relapsers.

Methods and material

Participants

We recruited patients who decided to discontinue their medication independently of study participation after they were diagnosed with Major Depressive Disorder (MDD) and had (a) experienced one severe33 or multiple depressive episodes, (b) initiated antidepressant treatment during the last depressive episode and (c) now achieved stable remission, i.e. a score of less than 7 on the 17-item Hamilton Depression Rating Scale34 for 30 days. To identify disease and medication effects, we also recruited healthy controls (HC) matched for age, sex and education. See Section S1.1 for detailed inclusion and exclusion criteria. All participants gave informed written consent and received monetary compensation for the time of participation. Ethical approval for the study was obtained from the cantonal ethics commission Zurich (BASEC: PB_2016-0.01032; KEK-ZH: 2014-0355) and the ethics commission at the Campus Charité-Mitte (EA 1/142/14), and procedures were in accordance with the Declaration of Helsinki.

Study design

The study design is depicted in Fig. 1. Trained staff interviewed remitted patients on ADM to assess in- and exclusion criteria during a baseline assessment (BA). The BA consisted of the assessment of current symptoms and present and past diagnoses, as well as a short neuropsychological testing and a questionnaire batch assessing stable traits. Patients meeting inclusion criteria were randomised to one of two study arms. Of note, the first 10 participants at each site were all assigned to arm 1D2 (1/2 represents the number of the main assessment, “D” represents discontinuation). Participants in arm 1D2 underwent the first main assessment (MA1) including a questionnaire assessing state variables, then gradually discontinued their medication over up to 18 weeks and then underwent a second main assessment (MA2). Participants in arm 12D underwent both main assessments before discontinuation. During discontinuation all patients were contacted every 2 weeks for a telephone assessment. This two-arm design allowed us to identify discontinuation effects while controlling for time, learning and repetition effects. After discontinuation, all patients entered a follow-up period of 6 months. During that period, they were contacted for telephone assessments at weeks 1, 2, 4, 6, 8, 12, 16 and 21 to assess relapse status. If telephone assessment indicated a possible relapse, patients were invited to an on-site structured clinical interview (SCID-I35) to assess criteria for relapse, i.e. fulfilling the diagnosis of a depressive episode according to the Diagnostic and Statistical Manual of Mental Disorders (4th ed., text rev.; DSM-IV-TR3). If these criteria were fulfilled, they underwent a final assessment (FA). If no relapse occurred, the FA took place in week 26. HC underwent MA1 only. 
In addition to the measures reported here, participants also underwent functional magnetic resonance imaging, a range of behavioural tasks, electroencephalography and blood sampling during the main assessments. See supplementary Section S1.2 for detailed procedures of the assessment sessions and S1.3 for observer-rated and self-report measures. Participant recruitment took place between July 2015 and January 2018.

Figure 1

Study design: we recruited remitted patients on antidepressant medication (ADM) and matched healthy controls (HC). They were assessed and compared at main assessment 1 (MA1) to identify traits characterising the remitted, medicated state. Next, patients were randomised to either discontinue their medication before MA2 (bottom arm, “discontinuation group”) or enter a waiting period while continuing their ADM, matched to the length of discontinuation time (top arm, “waiting group”). Differences in changes between MA1 and MA2 in the two separate groups were investigated to gain an understanding of the effects underlying discontinuation. Patients in the waiting group discontinued their ADM after MA2. After discontinuation, all patients entered the follow-up (FU) period of 6 months; some patients had a relapse during this period and some finished it without relapse. Differences in characteristics at MA1 between patients who relapsed and patients who did not relapse during FU provide information on which variables relate to relapse risk and can be used to identify predictors of relapse after ADM discontinuation.

Measures

We included 18 measures spanning four categories: demographics, current symptoms, clinical history and treatment. Measures were chosen based on two criteria: (1) they have previously been related to relapse after antidepressant discontinuation21, and (2) they can easily be assessed during a routine clinical visit, do not require extensive training or equipment and have a plausible relation to relapse risk. Individual measures in each category are listed in Table 1 and described in supplementary Section S1.3. Ten of these variables were previously investigated in randomised controlled trials (RCTs). All listed measures can be assessed before discontinuation and will be included in the prediction analysis. We additionally compared discontinuation time between relapsers and non-relapsers, but did not include it in the prediction model. All measures from the category current symptoms were re-assessed at MA2.

Table 1 Participant characteristics and complete-case analyses.

Data analysis

Analyses were performed using Matlab version 9.1.0.441655 (R2016b) according to our a priori analysis plan available at https://gitlab.ethz.ch/tnu/analysis-plans/aidaz_analysis_plan_clinical_prediction.

Association analyses

Candidate predictor variables were first identified by assessing group differences between patients and HC and between relapsers and non-relapsers. Two-sample two-tailed independent t-tests were used for continuous and chi-squared tests for categorical variables (including psychotherapy). We report results using no multiple comparison correction, i.e. considering tests to be significant at p < 0.05 and indicate if they survive correction using false discovery rate (FDR). The former allows for better interpretation of non-significant findings. The latter helps to control for the number of tests we applied since we are investigating a range of variables increasing the risk of false positives. In contrast to Bonferroni correction, FDR-based corrections do not make the assumption that tests are independent.
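The difference between uncorrected and FDR-corrected significance can be illustrated with a minimal sketch. The text does not state which FDR procedure was used, so the Benjamini-Hochberg step-up procedure is assumed here, and the p values and the function name are hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return booleans marking which tests survive FDR control at level q.

    Assumption: Benjamini-Hochberg step-up procedure (not confirmed by the text).
    """
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank k whose ordered p value satisfies p_(k) <= (k / m) * q.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            max_rank = rank
    # All tests up to and including that rank are declared significant.
    survives = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            survives[idx] = True
    return survives


# Hypothetical p values from five association tests.
example_p = [0.001, 0.012, 0.042, 0.048, 0.30]
uncorrected = [p < 0.05 for p in example_p]
corrected = benjamini_hochberg(example_p)
```

In this toy example four tests are significant at the uncorrected level but only two survive the correction, which is the pattern reported for several of the findings below.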

Complete-case analyses can yield biased results. We therefore examined whether patients who dropped out differed from patients who finished the study. For this, we repeated the above analysis procedure comparing patients who finished the study and patients who dropped out after MA1. We next performed Cox proportional hazards regression models, relating predictor variables to time to relapse or dropout. For these analyses, all variables were mean-centered and normalised. We first performed this for each measure individually and then included all measures in the same Cox regression, to compare predictors. Since our goal is to predict relapse after antidepressant discontinuation, we performed the latter analysis first for the time after discontinuation, but repeated the analysis by extending the observation period to include the time of discontinuation. To test the assumption of proportional hazards, we conducted the Schoenfeld individual and global test for each individual predictor variable using the scaled Schoenfeld residuals and visually inspected the plots of the residuals to exclude an association of the residuals with time.
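As a rough illustration of a single-predictor Cox model on a standardised variable, the sketch below evaluates the Cox partial log-likelihood in plain Python. It is not the analysis code: the data are hypothetical, dropouts are assumed to be treated as censored observations, tied event times are assumed absent, and a coarse grid search stands in for a proper optimiser.

```python
import math


def zscore(values):
    """Mean-centre and scale to unit (sample) standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def cox_partial_loglik(beta, times, events, x):
    """Cox partial log-likelihood for one covariate, assuming no tied event times."""
    loglik = 0.0
    for i, (t_i, is_event) in enumerate(zip(times, events)):
        if not is_event:
            continue  # censored subjects enter only through the risk sets
        # Risk set: everyone still under observation at time t_i.
        risk_sum = sum(
            math.exp(beta * x[j]) for j in range(len(times)) if times[j] >= t_i
        )
        loglik += beta * x[i] - math.log(risk_sum)
    return loglik


# Hypothetical data: weeks to relapse (1) or to end of observation (0).
weeks = [5, 8, 12, 20, 26, 26]
relapsed = [1, 1, 0, 1, 0, 0]
impairment = zscore([3.0, 1.0, 2.5, 2.0, 0.5, 1.5])

# Coarse grid search over the regression weight (illustrative only).
grid = [b / 100 for b in range(-300, 301)]
beta_hat = max(grid, key=lambda b: cox_partial_loglik(b, weeks, relapsed, impairment))
```

A positive weight in this parameterisation corresponds to a higher hazard, i.e. a shorter time to relapse, for higher values of the standardised predictor.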

Prediction analyses

To examine whether clinical variables have predictive value, we first fitted a full logistic general linear model (GLM) including all relapsers and non-relapsers to determine which variables made a significant contribution to the prediction, the total variance that can be explained by the combined predictors, the area under the curve, the best threshold as well as the sensitivity and specificity at this threshold.
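The quantities named above (area under the curve, best threshold, sensitivity and specificity) can be computed from predicted probabilities as in the following sketch. The text does not define "best threshold"; maximising Youden's J (sensitivity + specificity − 1) is assumed here, and the scores and labels are hypothetical.

```python
def roc_points(scores, labels):
    """Return (threshold, sensitivity, specificity) for every observed score."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for thr in sorted(set(scores)):
        # A subject is classified as relapser when the score reaches the threshold.
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < thr and y == 0)
        points.append((thr, tp / n_pos, tn / n_neg))
    return points


def auc(scores, labels):
    """AUC as the probability that a relapser outscores a non-relapser (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_threshold(scores, labels):
    """Threshold maximising Youden's J (an assumed definition of 'best')."""
    return max(roc_points(scores, labels), key=lambda pt: pt[1] + pt[2] - 1)


# Hypothetical predicted relapse probabilities; 1 = relapser, 0 = non-relapser.
example_scores = [0.9, 0.8, 0.35, 0.6, 0.4, 0.2, 0.1, 0.3]
example_labels = [1, 1, 1, 0, 0, 0, 0, 0]
example_auc = auc(example_scores, example_labels)
example_best = best_threshold(example_scores, example_labels)
```

Because both the threshold and the weights are tuned on the same subjects, values obtained this way describe the fit to the sample rather than out-of-sample performance.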

However, as there are 18 predictors for 84 data points, any results for the current sample may generalise poorly due to overfitting. To address the high number of predictors relative to the small sample size, we used an elastic net with both L1 and L2 regularisation36 as implemented by the lassoglm function in Matlab. We applied tenfold cross-validation with stratification to optimise the strength of the regularisation parameter (\(\lambda\)). This was repeated for a range of \(\alpha\) values and the optimum was chosen.
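For orientation, the penalised objective behind an elastic net can be written as follows. This is a sketch in the parameterisation commonly used for penalised GLMs, which is assumed here (rather than confirmed from the text) to match the one used by lassoglm; \(N\) denotes the number of subjects, \(\beta_0\) the intercept and \(\beta\) the vector of regression weights.

\[
\min_{\beta_0,\,\beta}\; \frac{1}{N}\,\mathrm{Deviance}(\beta_0,\beta) \;+\; \lambda\left(\alpha\,\lVert\beta\rVert_1 + \frac{1-\alpha}{2}\,\lVert\beta\rVert_2^2\right)
\]

Under this form, \(\alpha = 1\) gives a pure L1 (lasso) penalty, values of \(\alpha\) close to 0 approach an L2 (ridge) penalty, and \(\lambda\) scales the overall strength of the penalty.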

Next we repeated this entire procedure within a nested cross-validation procedure to examine generalisation to data not seen by the algorithm. The outer loop consisted of a leave-one-out cross-validation (LOOCV). One subject was first set aside, then the full GLM or the regularised GLM, respectively, was fitted to all other subjects. Then, the group membership of the left-out subject was predicted using parameter estimates (regression weights) obtained from the other subjects. The classification threshold was set to 0.5. These predictions were used to compute the balanced accuracy and the probability that these predictions would not be better than chance was determined with a binomial test. To determine receiver operating curves for left out subjects, we categorised these subjects as relapsers or non-relapsers for varying thresholds and computed how many subjects were categorised correctly for each threshold.
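The outer leave-one-out loop and its summary statistics can be sketched as below. This is an illustration under stated assumptions, not the study code: the classifier is a hypothetical one-feature stand-in for the (regularised) GLM, the inner tuning loop is omitted, the data are invented, and the binomial test is assumed to be one-sided on the number of correct classifications with a chance level of 0.5.

```python
import math


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = relapse)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    return 0.5 * (tp / n_pos + tn / n_neg)


def binomial_p_at_least(k, n, p=0.5):
    """One-sided probability of at least k successes in n trials at chance level p."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def loocv_predictions(X, y, fit, predict_proba, threshold=0.5):
    """Leave each subject out once, refit on the rest, classify the left-out subject."""
    preds = []
    for i in range(len(y)):
        model = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        preds.append(1 if predict_proba(model, X[i]) >= threshold else 0)
    return preds


# Hypothetical stand-in model: logistic curve centred between the two class means.
def fit_midpoint(X, y):
    pos = [x[0] for x, t in zip(X, y) if t == 1]
    neg = [x[0] for x, t in zip(X, y) if t == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def proba_midpoint(midpoint, x):
    return 1 / (1 + math.exp(-(x[0] - midpoint)))


# Invented, well-separated toy data (one feature per subject).
toy_X = [[0.1], [0.3], [0.2], [0.9], [1.1], [0.8]]
toy_y = [0, 0, 0, 1, 1, 1]
loo_preds = loocv_predictions(toy_X, toy_y, fit_midpoint, proba_midpoint)
bal_acc = balanced_accuracy(toy_y, loo_preds)
n_correct = sum(1 for t, p in zip(toy_y, loo_preds) if t == p)
p_chance = binomial_p_at_least(n_correct, len(toy_y))
```

The key design point is that the left-out subject never contributes to the parameter estimates used to classify it, so any tuning (such as the choice of \(\lambda\) and \(\alpha\)) would have to be repeated inside each fold.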

Discontinuation analyses

To investigate the discontinuation effect and the interaction between discontinuation and relapse, we applied mixed analyses of variance (ANOVAs) with either group (1D2 vs. 12D) or relapse status (relapse vs. no relapse; in the discontinuation group only, i.e. patients who discontinued before MA2) as between-subjects factor and time (MA1 vs. MA2) as within-subject factor.

Exploratory analyses

Due to new findings that quality of life relates to relapse37 and to ensure that our results are not limited to the set of pre-selected variables described in our analysis plan, we ran exploratory analyses comparing quality of life, income, psychiatric family history, alcohol consumption and smoking between patients who would go on to relapse and those who remained well.

Results

Participants

Nineteen (15%) of 123 included patients dropped out of the study prior to the first main assessment and were not further analysed. Of the 104 who completed the first main assessment (28 recruited in Berlin and 76 recruited in Zurich), 91 (88%) completed both main assessments (44 off medication after discontinuation in arm 1D2 and 47 on medication prior to discontinuation in arm 12D). Mean (standard deviation) number of days between first and second main assessment was 72 (40) for group 1D2 and 55 (36) in group 12D (t(85) = 2.02, p = 0.046). Of the 91 patients who completed both main assessments, 89 (86%) achieved antidepressant discontinuation and 83 (67%) reached a study endpoint by either remaining in remission for 6 months, or only restarting antidepressants after reaching criteria for relapse. One additional patient was categorised as relapser after meeting criteria for relapse for 10 days (shorter than the length criterion of 14 days) and quick improvement after treatment re-initiation. Of these 84 patients, 30 (36%) had a relapse during the follow-up period. Detailed reasons for dropouts are depicted in Fig. S1.

Association analyses

Complete-case analysis

Patients and healthy controls (n = 57) were matched for demographic variables but patients had elevated residual depression (t(159) = 5.68, p < 0.001, CI = 1.93–3.99), anxiety (t(159) = 3.56, p < 0.001, CI = 0.56–1.96) and somatic pain symptoms (t(159) = 4.47, p < 0.001, CI = 0.098–0.254) and scored higher on general impairment (t(159) = 5.02, p < 0.001, CI = 0.11–0.26; Table 1). These results survived correction for multiple comparison.

Table 2 Intention-to-treat analyses.
Figure 2

(A) Survival curves for time until relapse during follow-up period for patients who were only treated by a general practitioner (GP) or additionally by a psychiatrist or psychologist. (B) Prediction: Receiver operating curves for a standard general linear model (blue) and a regularised general linear model using least absolute shrinkage and selection operator and elastic net (red) using the full sample (solid lines) and for subjects left out of the fit using leave-one-out (LOO) cross-validation (dashed lines).

We first performed a complete-case analysis on the 84 patients who either reached the follow-up period without relapse or relapsed during that period to maximise the chances of identifying potentially predictive variables. Patients who went on to relapse after ADM discontinuation had increased somatic pain (t(82) = 2.07, p = 0.042, CI = 0.004–0.21) and were more often treated by a general practitioner only rather than a psychiatrist (\(\chi ^2\) = 3.93, p = 0.048), though these differences did not survive correction for multiple comparisons (Table 1). To assess the unique contributions of the predictor variables, all measures were combined in a single multiple regression model. This revealed treatment by GP only as the sole significant variable associated with relapse (b = − 0.94, p = 0.005; Table 1). Of the 30 relapsers, 10 were treated by a GP only, while only 8 of 54 non-relapsers were treated by a GP only.
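The GP-only association can be checked directly from the counts given above (10 of 30 relapsers and 8 of 54 non-relapsers treated by a GP only). The sketch below assumes a Pearson chi-squared test on the 2 × 2 table without continuity correction; the function name is illustrative.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    observed = [a, b, c, d]
    row_totals = [a + b, a + b, c + d, c + d]
    col_totals = [a + c, b + d, a + c, b + d]
    # Sum of (observed - expected)^2 / expected over the four cells.
    stat = sum(
        (o - r * k / n) ** 2 / (r * k / n)
        for o, r, k in zip(observed, row_totals, col_totals)
    )
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Rows: relapsers, non-relapsers; columns: GP only, specialist care.
chi2_stat, chi2_p = chi2_2x2(10, 20, 8, 46)
```

Under these assumptions the statistic comes out at about 3.93 with a p value just below 0.05, in line with the values reported above.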

Complete-case analyses may yield biased results. Patients who dropped out had more residual symptoms (t(102) = − 2.01, CI: − 3.73 to − 0.025, p = 0.047) and more symptoms during the last episode (t(102) = − 2.09, CI: − 1.24 to − 0.033, p = 0.039) (Table S1). These differences did not survive correction for multiple comparisons.

Intention-to-treat analyses

As there were differences between patients who completed the study and those who dropped out, we performed intention-to-treat analyses using Cox proportional hazards including all the 89 patients who completed discontinuation. The Schoenfeld tests for each predictor variable were all non-significant, indicating that the assumptions for the Cox proportional hazards models were met. These results were further confirmed by visual inspection indicating no association of time with the scaled Schoenfeld residuals. The Cox proportional hazards models revealed that general impairment (b = 0.32, p = 0.044, CI = 0.008–0.632) and treatment by GP only (b = 0.36, p = 0.025, CI = 0.045–0.666) were significantly associated with shorter time to relapse, though neither survived correction for multiple comparisons. Of note, no effect was found for the current symptoms and symptoms during the last episode which distinguished patients who dropped out (Table 2). To assess the unique contributions of the predictor variables, all measures were combined in a Cox multiple regression model. This again revealed treatment by GP only as the uniquely significant predictor (b = 0.662, p = 0.005; Table 2, Fig. 2A). GP-only treatment was also the only variable associated with shorter time to relapse in an extended intention-to-treat analysis including an additional 6 patients who initiated but did not complete antidepressant discontinuation (Table 2).

Prediction of relapse

To ascertain whether these findings could inform clinical practice, we next assessed how well clinical variables were able to predict relapses. Individual predictions could only meaningfully be assessed on the complete-case data. The multiple linear regression with all variables included achieved an area under the curve (AUC) of 0.76 with a sensitivity of 0.87 and specificity of 0.53 at the best cut-off (Fig. 2B). The model explained 21% of the variance. Such a performance is suggestive of clinical utility. However, with 18 predictor variables for 84 outcomes, this model may have overfitted the data and therefore may not generalize to new data.

We first examined overfitting through regularization via an elastic net, which pushes regression weights towards zero except for those predictor variables with most predictive power36. A standard approach with elastic nets, namely setting \(\lambda\) to be one standard error larger than the value minimizing deviance, resulted in all regression weights being set to zero. A less stringent regularization using the value of \(\lambda\) that minimized deviance resulted in a model with non-zero weights for five variables only (intelligence, somatic pain, general impairment, severity factor and treatment by GP only; Table 1) with an AUC of 0.74, a specificity of 0.66 and a sensitivity of 0.76 at the best cut-off value (Fig. 2B). Thus, five variables may suffice to predict relapse. However, since this is a within-sample analysis, it is still not clear whether and how well this result would generalise.

To determine how the models’ performances might generalise to new incoming patients, we approximated out-of-sample predictive accuracy using leave-one-out cross-validation (LOOCV). Doing this without regularisation yielded a balanced accuracy of 0.47. With regularisation, the balanced accuracy was 0.49. Neither prediction exceeded chance.

Figure 3

(A–D) Discontinuation effects: changes in symptoms from main assessment one (MA1) to main assessment two (MA2) for depression (A), anxiety (B), somatic pain (C) and general impairment (D) in patients who discontinued between the two assessments and patients who did not discontinue. (E–H) Discontinuation relapse interaction effects: Changes in symptoms from MA1 to MA2 for depression (E), anxiety (F), somatic pain (G) and general impairment (H) in patients who discontinued and either relapsed or remained well during the follow-up period. (I–L) Test–retest reliability for symptom measures: Changes in symptoms from MA1 to MA2 for depression (I), anxiety (J), somatic pain (K) and general impairment (L) in patients who did not discontinue and either relapsed or remained well during the follow-up period. Asterisks indicate a significant difference at p < 0.05 for FDR-corrected p values. Asterisks on top of a line relate to a within-subjects difference between MA1 and MA2 for the group indicated by the line. Asterisks between two lines relate to a between-subjects difference at the indicated time point.

Discontinuation effect

The impact of antidepressant discontinuation on symptoms was examined by comparing changes in symptoms between the two main assessments in individuals randomized to groups 1D2 and 12D (Fig. 1). Discontinuation resulted in changes in residual symptoms in all four domains, including anxiety (F(1,89) = 6.55, p = 0.012), depression (F(1,89) = 1.46, p = 0.001) and general impairment (F(1,89) = 9.99, p = 0.002; Fig. 3A–D and Table S2). Post-hoc tests corrected for multiple comparisons with FDR indicated that no difference between groups existed at MA1, but that one did at MA2, and that the changes were due to an increase in symptoms in the group that discontinued their ADMs. For somatic pain, the interaction effect only showed a trend towards a significant difference (F(1,89) = 3.31, p = 0.072), and post-hoc tests of change only survived FDR correction in the discontinuation group. Excluding from the analyses four patients who received fluoxetine, which is known to have a long half-life of 120 hours, did not change the pattern of results.

Association between discontinuation effect and relapse

We next asked whether the early effect of antidepressant discontinuation is associated with the ultimate risk of relapse. There was no interaction between the change in clinical measures before and after discontinuation (i.e. between the two main assessments in patients who discontinued before MA2, group 1D2; cf. Fig. 1) and relapse (all p > 0.05). Instead, the analysis revealed main effects of relapse across all domains [anxiety (F(1,40) = 8.751, p = 0.005), general impairment (F(1,40) = 11.001, p = 0.003), depression (F(1,40) = 5.615, p = 0.023) and somatic pain (F(1,40) = 4.709, p = 0.036)]. Relapsers in group 1D2 had more symptoms before starting discontinuation, and symptoms in both relapsers and non-relapsers increased after discontinuation to a similar extent (Fig. 3E–H), while there were no changes in the group that did not discontinue before MA2 (i.e. group 12D; Fig. 3I–L, Table S2).

Exploratory results

Quality of life did not differ between subsequent relapsers and non-relapsers (t(82) = − 1.35, p = 0.18) nor was it associated with time to relapse in a Cox regression (b = − 0.28, p = 0.12). Similar results were obtained for income (t(82) = 0.57, p = 0.57; b = 0.9, p = 0.59), education (t(77) = − 0.38, p = 0.7; b = − 0.4, p = 0.83), family history (t(82) = − 0.27, p = 0.79; b = − 0.09, p = 0.65), smoking (t(82) = − 1.32, p = 0.19; b = − 0.22, p = 0.25) and alcohol (t(82) = − 1.18, p = 0.24; b = − 0.21, p = 0.28). The inclusion of the latter variables into our out-of-sample prediction analyses also did not improve the predictive accuracy of the model (balanced accuracy: 0.5).

Discussion

Antidepressant medications are effective in the prevention of relapses and relapse rates after discontinuation are high4. Relapse rates in our study were also high, with one in three patients suffering a relapse within 6 months of discontinuation. This high relapse rate was observed even though the median duration of treatment was around 2 years, and hence at least as long as the duration of treatment recommended for recurrent illness15,17, and despite including only fully remitted patients with HAMD\(^{\mathrm{17}}\) scores below 7. Relapses are important not only because they represent a period of renewed illness, but also because any one episode has a 5–10% risk of becoming chronic42 and because early on in the disease additional episodes may mark the transition between those with a benign outcome and few lifetime episodes, and those with a malignant outcome and high risk of relapses43,44,45,46. This situation makes it evident that there is a clinical need to establish predictors of relapses specifically after antidepressant discontinuation, as such predictors could guide the discontinuation decision and in that way help reduce relapses and possibly even modify the long-term course of the illness.

A first pertinent step is the examination of the predictive power of clinical variables that are easily assessed in clinical practice. Our results suggest that such standard clinical variables carry at best weak predictive power. This conclusion relies on an examination of the likely generalisability of the associations. The approach is motivated by machine-learning approaches32. Rather than asking how well a set of variables can predict a particular outcome within a given dataset, the prediction is assessed on out-of-sample data not used in ascertaining the prediction parameters. Such approaches are standard in the field of machine-learning, and are becoming more prominent in neuroscience and psychiatry (e.g.47,48,49). We note that our cross-validation approach is not perfect as establishing a valid clinical predictor would ideally involve a fully independent dataset, but in our case this analysis indicates that the standard regression results do not carry predictive power.

Several aspects of the results from the standard approach are nevertheless noteworthy. First, in the full regression model and the intention-to-treat analyses including all predictors, GP-only treatment was the only variable significantly associated with relapse. This suggests that better treatment outcomes may be achieved when patients remain in specialist care. This finding has important implications for the clinical setting and might provide insights on how to reduce relapse rates after antidepressant discontinuation overall in this patient population. One possible mechanism underlying this effect could be the impact of psychotherapy that is often delivered by specialists. Although in the current study psychotherapy did not appear to have an effect on relapse rates, our assessment of psychotherapeutic intervention strength was crude, and does leave room for the possibility that relapse risk could be mitigated by means of specific psychotherapeutic input. Indeed, psychotherapeutic techniques explicitly aimed at relapse have been developed50,51. Second, we did not replicate the effects of anxiety on relapse risk26, but the complete-case analyses replicated somatic pain as a risk factor27. Third, the null findings do replicate null findings from RCTs21 in a naturalistic setting. Importantly, the two indicators which clinical guidelines emphasize, namely the number of prior episodes and the length of ADM treatment15,17, both failed to show an association with relapse risk in our naturalistic setting. This mirrors previous findings in RCTs21 and the consistent lack of coherent effects of these measures on relapse risk after ADM discontinuation suggests a revisiting of these recommendations. In a similar vein, we found no effect of residual symptoms, a decision criterion added in the newest version of the guidelines17, on subsequent relapse risk. This is the case despite an influence of residual symptoms on overall relapse risk20 and symptom severity being the best predictor of disease course in studies using similar analysis approaches for patients in a depressive episode47,49. Finally, the lack of effects of any other clinical variable is still surprising given the relation to overall relapse risk of several of them as reviewed previously52,53.

Next, discontinuation was associated with a robust increase in current symptoms across domains. Surprisingly, this increase in symptoms did not appear to be related to prospective relapses. The dissociation we observed raises the possibility that the mechanisms driving symptom increase after discontinuation differ from those driving subsequent relapse even though relapse trajectories on and off medication are similar54. Clinically, the fact that transient symptomatic worsening does not relate to relapse may help clinicians and patients alike to hold their nerve in the face of early worsening of symptoms.

The study has strengths and limitations. Most prominently, the naturalistic setting of the study limits our ability to draw causal inferences: the pharmacological discontinuation effect is confounded with the potential psychological effect of knowing that the medication has been discontinued, and these cannot be disentangled. Additionally, predictors might differ between patients with a true-drug response and patients with placebo response, but our study design does not allow us to disentangle these two groups. As we excluded patients with other psychotropic medications, our results might not generalize to patients who are treated with other medication in parallel. However, the naturalistic design increases the relevance for real-life outpatient care where these effects co-occur. Furthermore, by including patients with comorbid anxiety disorders, we target the most prevalent population treated with antidepressants in primary care19. A further strength is the application of cross-validation to examine generalisability, but the small sample size is an important limitation. The small sample size also limits the identifiability of mechanistically heterogeneous subgroups and complicates the interpretation of null results. However, preparing an a priori analysis plan and carefully executing it adds credibility to our results. Based on the power of our study, the null results exclude large effects. In addition, using data from the same patient sample, we could show that decision time during a physical effort task predicted relapse better than chance in a validation dataset55 and that changes in resting-state functional connectivity between the right dorsolateral prefrontal cortex and the parietal cortex due to discontinuation predicted subsequent relapse using LOOCV56. These results indicate that predictors of relapse can be identified and validated in our sample and that behavioural markers and biomarkers seem to carry more predictive weight for relapse after antidepressant discontinuation. We will continue to examine more potential predictors using behavioural, fMRI, EEG and physiological data from this sample.

Clinical implications

The results of the present study need to be replicated. Nevertheless, they are of potential clinical relevance and suggest several changes to the management of remitted depressive disorders. First, there may be a role for continued specialist care, in particular during and after the discontinuation phase. Second, prominent decision criteria currently used in clinical practice such as length of treatment, number of prior episodes and residual symptoms are poorly predictive of relapse, suggesting that guidelines for antidepressant discontinuation might have to be revisited. Third, both treatment providers and patients need to be informed that discontinuation may be accompanied by a transient re-emergence of depressive symptoms that do not necessarily indicate an imminent relapse.

Conclusion

Easily assessable demographic and clinical variables appear to be of limited use to guide antidepressant discontinuation decisions. Given the importance of the problem, more complex and costly measures should be evaluated.