Testing a machine-learning algorithm to predict the persistence and severity of major depressive disorder from baseline self-reports


Heterogeneity of major depressive disorder (MDD) illness course complicates clinical decision-making. Although efforts to use symptom profiles or biomarkers to develop clinically useful prognostic subtypes have had limited success, a recent report showed that machine-learning (ML) models developed from self-reports about incident episode characteristics and comorbidities among respondents with lifetime MDD in the World Health Organization World Mental Health (WMH) Surveys predicted MDD persistence, chronicity and severity with good accuracy. We report results of model validation in an independent prospective national household sample of 1056 respondents with lifetime MDD at baseline. The WMH ML models were applied to these baseline data to generate predicted outcome scores that were compared with observed scores assessed 10–12 years after baseline. ML model prediction accuracy was also compared with that of conventional logistic regression models. Area under the receiver operating characteristic curve based on ML (0.63 for high chronicity and 0.71–0.76 for the other prospective outcomes) was consistently higher than for the logistic models (0.62–0.70) despite the latter models including more predictors. A total of 34.6–38.1% of respondents with subsequent high persistence or chronicity and 40.8–55.8% with the severity indicators were in the top 20% of the baseline ML-predicted risk distribution, while only 0.9% of respondents with subsequent hospitalizations and 1.5% with suicide attempts were in the lowest 20% of the ML-predicted risk distribution. These results confirm that clinically useful MDD risk-stratification models can be generated from baseline patient self-reports and that ML methods improve on conventional methods in developing such models.
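The two accuracy summaries used in the abstract, area under the receiver operating characteristic curve (AUC) and the share of observed cases falling in the top 20% of predicted risk, can be illustrated with a minimal sketch. The function names and the toy data below are assumptions made for illustration only; they are not taken from the study, and the sketch ignores survey weights and any other design features of the actual analysis.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a random case outscores a random non-case.

    Ties between a case and a non-case count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def share_of_cases_in_top(
    scores: Sequence[float], labels: Sequence[int], fraction: float = 0.2
) -> float:
    """Proportion of observed cases ranked in the top `fraction` of predicted risk.

    Ties at the cut-off are broken by input order (hypothetical choice).
    """
    total_cases = sum(labels)
    if total_cases == 0:
        raise ValueError("need at least one case")
    n_top = max(1, round(fraction * len(scores)))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    cases_in_top = sum(y for _, y in ranked[:n_top])
    return cases_in_top / total_cases


# Toy data, invented for illustration (not study data):
# predicted risk scores and observed outcomes (1 = outcome occurred).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]

toy_auc = auc(toy_scores, toy_labels)
toy_top20 = share_of_cases_in_top(toy_scores, toy_labels, 0.2)
print(f"AUC = {toy_auc:.3f}, cases in top 20% = {toy_top20:.1%}")
```

In this toy example, half of the observed cases sit in the top fifth of the predicted-risk ranking. This is the same kind of concentration statistic the abstract reports, although the figures there come from the validation sample rather than from code like this.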


PdJ was supported by a VICI grant (no: 91812607) from the Netherlands Organization for Scientific Research (NWO-ZonMW). The NCS data collection was supported by the National Institute of Mental Health (NIMH; R01MH46376). The NCS-2 data collection was supported by the National Institute on Drug Abuse (NIDA; R01DA012058). Data analysis for this paper was additionally supported by NIMH grants R01MH070884 and U01MH060220, with supplemental support from the Substance Abuse and Mental Health Services Administration (SAMHSA), the Robert Wood Johnson Foundation (RWJF; Grant 044780) and the John W. Alden Trust. The NCS-2 is carried out in conjunction with the World Health Organization World Mental Health (WMH) Survey Initiative. We thank the staff of the WMH Data Collection and Data Analysis Coordination Centres for assistance with instrumentation, fieldwork and consultation on data analysis. These activities were supported by the NIMH (R01MH070884), the John D and Catherine T MacArthur Foundation, the Pfizer Foundation, the US Public Health Service (R13MH066849, R01MH069864 and R01DA016558), the Fogarty International Center (FIRCA R03TW006481), the Pan American Health Organization, Eli Lilly and Company, Ortho-McNeil Pharmaceutical, GlaxoSmithKline and Bristol-Myers Squibb.


Corresponding author

Correspondence to R C Kessler.

Ethics declarations

Competing interests

RCK has been a consultant for Hoffman La Roche, Johnson & Johnson Wellness and Prevention and Sanofi-Aventis Groupe; has served on an advisory board for Lake Nona Institute; and owns stock in DataStat. AAN has been a consultant for Abbott Laboratories, American Psychiatric Association, Appliance Computing (MindSite), Basilea, Brain Cells, Brandeis University, Bristol-Myers Squibb, Clintara, Corcept, Dey Pharmaceuticals, Dainippon Sumitomo (now Sunovion), Eli Lilly and Company, EpiQ, L.P./Mylan, Forest, Genaissance, Genentech, GlaxoSmithKline, Hoffman La Roche, Infomedic, Lundbeck, Janssen Pharmaceutica, Jazz Pharmaceuticals, Medavante, Merck, Methylation Sciences, Naurex, Novartis, PamLabs, Pfizer, PGx Health, Ridge Diagnostics, Shire, Schering-Plough, Somerset, Sunovion, Takeda Pharmaceuticals, Targacept and Teva; consulted through the MGH Clinical Trials Network and Institute (CTNI) for Astra Zeneca, Brain Cells, Dainippon Sumitomo/Sepracor, Johnson and Johnson, Labopharm, Merck, Methylation Sciences, Novartis, PGx Health, Shire, Schering-Plough, Targacept and Takeda/Lundbeck Pharmaceuticals; had grant/research support from the American Foundation for Suicide Prevention, AHRQ, Brain and Behavior Research Foundation, Bristol-Myers Squibb, Cederroth, Cephalon, Cyberonics, Elan, Eli Lilly, Forest, GlaxoSmithKline, Janssen Pharmaceutica, Lichtwer Pharma, Marriott Foundation, Mylan, NIMH, PamLabs, PCORI, Pfizer Pharmaceuticals, Shire, Stanley Foundation, Takeda and Wyeth-Ayerst; received honoraria from Belvoir Publishing, University of Texas Southwestern Dallas, Brandeis University, Bristol-Myers Squibb, Hillside Hospital, American Drug Utilization Review, American Society for Clinical Psychopharmacology, Baystate Medical Center, Columbia University, CRICO, Dartmouth Medical School, Health New England, Harold Grinspoon Charitable Foundation, IMEDEX, International Society for Bipolar Disorder, Israel Society for Biological Psychiatry, Johns Hopkins University, MJ Consulting, New York State, Medscape, MBL Publishing, MGH Psychiatry Academy, National Association of Continuing Education, Physicians Postgraduate Press, SUNY Buffalo, University of Wisconsin, University of Pisa, University of Michigan, University of Miami, University of Wisconsin at Madison, APSARD, ISBD, SciMed, Slack Publishing and Wolters Kluwer Publishing; owns stock in Appliance Computing (MindSite), Brain Cells, Medavante; and owns the following copyrights: Clinical Positive Affect Scale and the MGH Structured Clinical Interview for the Montgomery Asberg Depression Scale exclusively licensed to the MGH Clinical Trials Network and Institute (CTNI). MAW is an employee of Janssen Pharmaceuticals. HMvL, KJW, RMB, LAB, TC, DDE, IH, JL, PdJ, MVP, AJR, NAS, RAS and AMZ declare no conflict of interest.

Additional information

A complete list of NCS and NCS-2 publications can be found at http://www.hcp.med.harvard.edu/ncs.


The views, opinions and/or findings contained in this article are those of the authors and should not be construed as an official Department of Veterans Affairs position, policy or decision unless so designated by other documentation, or the views of any of the sponsoring organizations, agencies or the US Government.

Cite this article

Kessler, R., van Loo, H., Wardenaar, K. et al. Testing a machine-learning algorithm to predict the persistence and severity of major depressive disorder from baseline self-reports. Mol Psychiatry 21, 1366–1371 (2016). https://doi.org/10.1038/mp.2015.198
