Quantum reinforcement learning during human decision-making

Abstract

Classical reinforcement learning (CRL) has been widely applied in neuroscience and psychology; however, quantum reinforcement learning (QRL), which shows superior performance in computer simulations, has never been empirically tested on human decision-making. Moreover, all current successful quantum models for human cognition lack connections to neuroscience. Here we studied whether QRL can properly explain value-based decision-making. We compared 2 QRL and 12 CRL models by using behavioural and functional magnetic resonance imaging data from healthy and cigarette-smoking subjects performing the Iowa Gambling Task. In all groups, the QRL models performed well when compared with the best CRL models and further revealed the representation of quantum-like internal-state-related variables in the medial frontal gyrus in both healthy subjects and smokers, suggesting that value-based decision-making can be illustrated by QRL at both the behavioural and neural levels.

Fig. 1: Task diagram and task performance.
Fig. 2: Diagrams of model architecture.
Fig. 3: The AICc and BIC of each model, computed separately for each group.
Fig. 4: The inferred model probability of each model, computed separately for each group.
Fig. 5: The simulation results of each model, computed separately for each group.
Fig. 6: Generalized quantum distance (computed by the QSPP model)-related activity in the control group.
Fig. 7: fMRI results of the uncertainty × penalty/reward interaction.
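
Figs. 3 and 4 rank the models by information criteria. As a rough illustration of how AICc and BIC are conventionally computed from a fitted model's maximized log-likelihood, the sketch below uses the standard textbook definitions. It is not the authors' code (which is available only on request); the function names and the example numbers are hypothetical and are not taken from the study.

```python
import math


def aic(log_lik: float, k: int) -> float:
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_lik


def aicc(log_lik: float, k: int, n: int) -> float:
    """AIC with the small-sample correction term 2k(k + 1) / (n - k - 1)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    return aic(log_lik, k) + 2 * k * (k + 1) / (n - k - 1)


def bic(log_lik: float, k: int, n: int) -> float:
    """Bayesian (Schwarz) information criterion: k ln n - 2 ln L."""
    return k * math.log(n) - 2 * log_lik


if __name__ == "__main__":
    # Hypothetical fit: 4 free parameters, 100 trials, log-likelihood of -120.
    log_lik, k, n = -120.0, 4, 100
    print("AIC :", aic(log_lik, k))
    print("AICc:", aicc(log_lik, k, n))
    print("BIC :", bic(log_lik, k, n))
```

Lower values indicate a better trade-off between fit and complexity under both criteria. BIC penalizes each additional parameter by ln n rather than 2, so it favours simpler models more strongly once n exceeds about 7.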

Data availability

All data are available from the corresponding author on reasonable request.

Code availability

All code used to generate the results central to the main claims in this study is available from the corresponding author on reasonable request.

Acknowledgements

We thank Y. Yang, R. Zha, J. Busemeyer and N. Ma for their inspirational comments. We thank L. Acerbi, G. R. Yang, C. Gneiting, A. Miranowicz, X. Li, Z. Jin and X. Li for their helpful suggestions. This work was supported by grants from the National Key Basic Research Programme (grant nos. 2016YFA0400900 and 2018YFC0831101), the National Natural Science Foundation of China (grant nos. 31471071, 31771221, 61773360, 71671115, 71874170 and 71942003), the Fundamental Research Funds for the Central Universities of China, the MURI Center for Dynamic Magneto-Optics via the Air Force Office of Scientific Research (AFOSR; grant no. FA9550-14-1-0040), the Army Research Office (ARO; grant no. W911NF-18-1-0358), the Asian Office of Aerospace Research and Development (AOARD; grant no. FA2386-18-1-4045), the Japan Science and Technology Agency (JST; via the Q-LEAP programme and CREST grant no. JPMJCR1676), the Japan Society for the Promotion of Science (JSPS; JSPS–RFBR grant no. 17-52-50023 and JSPS–FWO grant no. VS.059.18N), the RIKEN–AIST Challenge Research Fund, the Templeton Foundation, the Foundational Questions Institute (FQXi) and the NTT PHI Laboratory, the Australian Research Council’s Discovery Projects funding scheme under Project DP190101566, the Alexander von Humboldt Foundation and the US Office of Naval Research. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript. We thank the Bioinformatics Centre of the University of Science and Technology of China, School of Life Science for providing supercomputing resources for this project.

Author information

Contributions

L.J.-A., Y.P. and X.Z. conceived the study. Y.L. and Z.W. provided the devices and collected the data. L.J.-A. built the models. L.J.-A. and Z.W. analysed the data. All authors participated in discussions. L.J.-A., D.D., Y.P., F.N. and X.Z. wrote the paper. X.Z. supervised the project and acquired funding.

Corresponding author

Correspondence to Xiaochu Zhang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Primary Handling Editor: Stavroula Kousta.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Methods, Supplementary Results, Supplementary Discussion, Supplementary Figs. 1–13, Supplementary Tables 1–6 and Supplementary References.

Reporting Summary

About this article

Cite this article

Li, JA., Dong, D., Wei, Z. et al. Quantum reinforcement learning during human decision-making. Nat Hum Behav 4, 294–307 (2020). https://doi.org/10.1038/s41562-019-0804-2
