Quantum reinforcement learning during human decision-making

Li, Ji-An; Dong, Daoyi; Wei, Zhengde; Liu, Ying; Pan, Yu; Nori, Franco; Zhang, Xiaochu

doi:10.1038/s41562-019-0804-2

Article
Published: 20 January 2020

Nature Human Behaviour volume 4, pages 294–307 (2020)

Abstract

Classical reinforcement learning (CRL) has been widely applied in neuroscience and psychology; however, quantum reinforcement learning (QRL), which shows superior performance in computer simulations, has never been empirically tested on human decision-making. Moreover, all current successful quantum models for human cognition lack connections to neuroscience. Here we studied whether QRL can properly explain value-based decision-making. We compared 2 QRL and 12 CRL models by using behavioural and functional magnetic resonance imaging data from healthy and cigarette-smoking subjects performing the Iowa Gambling Task. In all groups, the QRL models performed well when compared with the best CRL models and further revealed the representation of quantum-like internal-state-related variables in the medial frontal gyrus in both healthy subjects and smokers, suggesting that value-based decision-making can be illustrated by QRL at both the behavioural and neural levels.

**Fig. 1: Task diagram and task performance.**

**Fig. 2: Diagrams of model architecture.**

**Fig. 3: The AICc and BIC of each model, computed separately for each group.**

**Fig. 4: The inferred model probability of each model, computed separately for each group.**

**Fig. 5: The simulation results of each model, computed separately for each group.**

**Fig. 6: Generalized quantum distance (computed by the QSPP model)-related activity in the control group.**

**Fig. 7: fMRI results of the uncertainty × penalty/reward interaction.**
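
The AICc and BIC named in the Fig. 3 caption are standard penalized-likelihood criteria for comparing fitted models. The sketch below is not the authors' code. It only illustrates the textbook formulas, assuming a model's maximized log-likelihood, its number of free parameters `k` and the number of observations `n` are known. The function names, the two example models and all numeric values are hypothetical.

```python
import math


def aic(log_lik: float, k: int) -> float:
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_lik


def aicc(log_lik: float, k: int, n: int) -> float:
    """AIC with the small-sample correction 2k(k + 1) / (n - k - 1)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    return aic(log_lik, k) + (2 * k * (k + 1)) / (n - k - 1)


def bic(log_lik: float, k: int, n: int) -> float:
    """Bayesian information criterion: k ln n - 2 ln L."""
    return k * math.log(n) - 2 * log_lik


# Hypothetical fits: (maximized log-likelihood, number of free parameters).
models = {"model_A": (-120.0, 4), "model_B": (-118.5, 6)}
n_trials = 100  # hypothetical number of choices per subject

scores = {
    name: {"AICc": aicc(ll, k, n_trials), "BIC": bic(ll, k, n_trials)}
    for name, (ll, k) in models.items()
}

# Lower values indicate a better trade-off between fit and complexity.
best_by_aicc = min(scores, key=lambda name: scores[name]["AICc"])
best_by_bic = min(scores, key=lambda name: scores[name]["BIC"])
```

With these made-up numbers, `model_B` fits slightly better in raw likelihood, but both criteria prefer `model_A` because its two fewer parameters outweigh the small likelihood gain.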

Data availability

All data are available from the corresponding author on reasonable request.

Code availability

All code used to generate the results central to the main claims in this study is available from the corresponding author on reasonable request.

Acknowledgements

We thank Y. Yang, R. Zha, J. Besumeyer and N. Ma for their inspirational comments. We thank L. Acerbi, G. R. Yang, C. Gneiting, A. Miranowicz, X. Li, Z. Jin and X. Li for their helpful suggestions. This work was supported by grants from the National Key Basic Research Programme (grant nos. 2016YFA0400900 and 2018YFC0831101), the National Natural Science Foundation of China (grant nos. 31471071, 31771221, 61773360, 71671115, 71874170 and 71942003), the Fundamental Research Funds for the Central Universities of China, the MURI Center for Dynamic Magneto-Optics via the Air Force Office of Scientific Research (AFOSR; grant no. FA9550-14-1-0040), the Army Research Office (ARO; grant no. W911NF-18-1-0358), the Asian Office of Aerospace Research and Development (AOARD; grant no. FA2386-18-1-4045), the Japan Science and Technology Agency (JST; via the Q-LEAP programme and CREST grant no. JPMJCR1676), the Japan Society for the Promotion of Science (JSPS; JSPS–RFBR grant no. 17-52-50023 and JSPS–FWO grant no. VS.059.18N), the RIKEN–AIST Challenge Research Fund, the Templeton Foundation, the Foundational Questions Institute (FQXi) and the NTT PHI Laboratory, the Australian Research Council’s Discovery Projects funding scheme under Project DP190101566, the Alexander von Humboldt Foundation and the US Office of Naval Research. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript. We thank the Bioinformatics Centre of the University of Science and Technology of China, School of Life Science for providing supercomputing resources for this project.

Author information

Authors and Affiliations

Eye Center, Dept. of Ophthalmology, the First Affiliated Hospital of USTC, Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
Ji-An Li, Zhengde Wei & Xiaochu Zhang
Department of Statistics and Finance, School of Management, University of Science and Technology of China, Hefei, China
Ji-An Li
School of Engineering and Information Technology, University of New South Wales, Canberra, Australian Capital Territory, Australia
Daoyi Dong
Shanghai Key Laboratory of Psychotic Disorders, Shanghai Mental Health Centre, Shanghai Jiao Tong University School of Medicine, Shanghai, China
Zhengde Wei
The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
Ying Liu
Key Laboratory of Applied Brain and Cognitive Sciences, School of Business and Management, Shanghai International Studies University, Shanghai, China
Yu Pan
Theoretical Quantum Physics Laboratory, RIKEN Cluster for Pioneering Research, Wakoshi, Japan
Franco Nori
Department of Physics, The University of Michigan, Ann Arbor, MI, USA
Franco Nori
Hefei Medical Research Centre on Alcohol Addiction, Anhui Mental Health Centre, Hefei, China
Xiaochu Zhang
Academy of Psychology and Behaviour, Tianjin Normal University, Tianjin, China
Xiaochu Zhang
Centres for Biomedical Engineering, University of Science and Technology of China, Hefei, China
Xiaochu Zhang

Contributions

L.J.-A., Y.P. and X.Z. conceived the study. Y.L. and Z.W. provided the devices and collected the data. L.J.-A. built the models. L.J.-A. and Z.W. analysed the data. All authors participated in discussions. L.J.-A., D.D., Y.P., F.N. and X.Z. wrote the paper. X.Z. supervised the project and acquired funding.

Corresponding author

Correspondence to Xiaochu Zhang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Primary Handling Editor: Stavroula Kousta.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Methods, Supplementary Results, Supplementary Discussion, Supplementary Figs. 1–13, Supplementary Tables 1–6 and Supplementary References.

Reporting Summary

About this article

Cite this article

Li, JA., Dong, D., Wei, Z. et al. Quantum reinforcement learning during human decision-making. Nat Hum Behav 4, 294–307 (2020). https://doi.org/10.1038/s41562-019-0804-2

Received: 19 December 2018
Accepted: 02 December 2019
Published: 20 January 2020
Issue Date: March 2020
DOI: https://doi.org/10.1038/s41562-019-0804-2
