  • Protocol

Toward a unified framework for interpreting machine-learning models in neuroimaging

Abstract

Machine learning is a powerful tool for creating computational models relating brain function to behavior, and its use is becoming widespread in neuroscience. However, these models are complex and often hard to interpret, making it difficult to evaluate their neuroscientific validity and contribution to understanding the brain. For neuroimaging-based machine-learning models to be interpretable, they should (i) be comprehensible to humans, (ii) provide useful information about what mental or behavioral constructs are represented in particular brain pathways or regions, and (iii) demonstrate that they are based on relevant neurobiological signal, not artifacts or confounds. In this protocol, we introduce a unified framework that consists of model-, feature- and biology-level assessments to provide complementary results that support the understanding of how and why a model works. Although the framework can be applied to different types of models and data, this protocol provides practical tools and examples of selected analysis methods for a functional MRI dataset and multivariate pattern-based predictive models. A user of the protocol should be familiar with basic programming in MATLAB or Python. This protocol will help build more interpretable neuroimaging-based machine-learning models, contributing to the cumulative understanding of brain mechanisms and brain health. Although the analyses provided here constitute a limited set of tests and take a few hours to days to complete, depending on the size of data and available computational resources, we envision the process of annotating and interpreting models as an open-ended process, involving collaborative efforts across multiple studies and laboratories.

Fig. 1: Model complexity in neuroimaging and the model interpretation framework.
Fig. 2: A proposed workflow for the procedure.
Fig. 3: Predictive performance of the SVM model (Steps 2 and 3) and the results of feature-level assessment of the linear models (Step 7, options A and B).
Fig. 4: A schematic of the ‘virtual lesion’ analysis (Step 7, option C).
Fig. 5: Layer-wise relevance propagation results (Step 7, option D).
Fig. 6: Generalizability tests (Steps 8–10).
Fig. 7: Examples of biology-level assessment (Step 11) and the representational analysis (Steps 12–15).
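As a rough illustration of the idea behind the 'virtual lesion' analysis named in Fig. 4, the sketch below zeroes out a group of features in a linear model and compares held-out accuracy before and after. It is a toy under stated assumptions, not the protocol's implementation (the authors' code is in the repository listed under Code availability): the synthetic data, the class-mean-difference classifier standing in for a trained model, and the names `make_data`, `fit_mean_difference`, `virtual_lesion`, `signal_region` and `noise_region` are all hypothetical.

```python
import random

def make_data(rng, n, signal=1.5, n_signal=3, n_noise=3):
    """Synthetic samples: the first n_signal features shift with the +1/-1 label."""
    X, y = [], []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1
        row = [label * signal + rng.gauss(0, 1) for _ in range(n_signal)]
        row += [rng.gauss(0, 1) for _ in range(n_noise)]
        X.append(row)
        y.append(label)
    return X, y

def fit_mean_difference(X, y):
    """Toy linear model (assumption): one weight per feature, the class-mean difference."""
    pos = [x for x, label in zip(X, y) if label > 0]
    neg = [x for x, label in zip(X, y) if label < 0]
    return [sum(x[j] for x in pos) / len(pos) - sum(x[j] for x in neg) / len(neg)
            for j in range(len(X[0]))]

def accuracy(weights, X, y):
    """Fraction of samples whose score sign matches the +1/-1 label."""
    hits = 0
    for x, label in zip(X, y):
        score = sum(w * v for w, v in zip(weights, x))
        hits += (score > 0) == (label > 0)
    return hits / len(y)

def virtual_lesion(weights, region):
    """Return a copy of the weights with the features in `region` set to zero."""
    return [0.0 if i in region else w for i, w in enumerate(weights)]

rng = random.Random(0)
X_train, y_train = make_data(rng, 200)
X_test, y_test = make_data(rng, 200)
weights = fit_mean_difference(X_train, y_train)

signal_region = {0, 1, 2}  # hypothetical "region" that carries the signal
noise_region = {3, 4, 5}   # hypothetical "region" that carries only noise

acc_full = accuracy(weights, X_test, y_test)
acc_lesion_signal = accuracy(virtual_lesion(weights, signal_region), X_test, y_test)
acc_lesion_noise = accuracy(virtual_lesion(weights, noise_region), X_test, y_test)

print("full model:", acc_full)
print("signal region lesioned:", acc_lesion_signal)
print("noise region lesioned:", acc_lesion_noise)
```

In this toy setting, lesioning the signal-carrying features should push accuracy toward chance, whereas lesioning the noise features should leave it largely unchanged. With real neuroimaging data the feature groups would typically be anatomical regions or functional networks, and the comparison would be made on properly held-out data.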

Data availability

Sample data used in this protocol are publicly available at https://github.com/cocoanlab/interpret_ml_neuroimaging.

Code availability

The code used in this protocol is publicly available at https://github.com/cocoanlab/interpret_ml_neuroimaging.

Acknowledgements

We would like to thank CANlab members who have contributed to the CANlab tool development, including Yoni Ashar, Luke Chang, Stephan Geuter, Phil Kragel, Bogdan Petre and Dan Weflen (who made >10 GitHub commits) among others. This work was supported by IBS-R015-D1 (Institute for Basic Science, Korea), 2019R1C1C1004512 (National Research Foundation of Korea) and 18-BR-03, 2019-0-01367-BabyMind (Ministry of Science and ICT, Korea) (to C.-W.W.); AI Graduate School Support Program [2019-0-00421] and ITRC Support Program [2019-2018-0-01798] of MSIT/IITP of the Korean government (to J.H., S.C. and T.M.); and NIH R01DA035484 and R01MH076136 (to T.D.W.). The authors have no conflicts of interest to declare.

Author information

Contributions

L.K., T.D.W. and C.-W.W. conceptualized and developed the protocol and implemented the part for linear models. J.H., S.C., S.L., T.M. and C.-W.W. implemented the part for nonlinear models. T.D.W., C.-W.W. and L.K. contributed to the development of CanlabCore tools. All authors reviewed and revised the manuscript.

Corresponding authors

Correspondence to Tor D. Wager or Choong-Wan Woo.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Protocols thanks Monica Rosenberg and the other anonymous reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Related links

Key reference(s) using this protocol

Wager, T. D. et al. N. Engl. J. Med. 368, 1388–1397 (2013): https://doi.org/10.1056/NEJMoa1204471

Woo, C.-W. et al. Nat. Commun. 5, 5380 (2014): https://doi.org/10.1038/ncomms6380

Woo, C.-W. et al. Nat. Commun. 8, 14211 (2017): https://doi.org/10.1038/ncomms14211

Key data used in this protocol

Woo, C.-W. et al. Nat. Commun. 5, 5380 (2014): https://doi.org/10.1038/ncomms6380

Cite this article

Kohoutová, L., Heo, J., Cha, S. et al. Toward a unified framework for interpreting machine-learning models in neuroimaging. Nat Protoc 15, 1399–1435 (2020). https://doi.org/10.1038/s41596-019-0289-5

  • DOI: https://doi.org/10.1038/s41596-019-0289-5
