Abstract
A current proposal for a computational notion of self is a representation of one’s body in a specific time and place, which includes the recognition of that representation as the agent. This turns self-representation into a process of self-orientation, a challenging computational problem for any human-like agent. Here, to examine this process, we created several ‘self-finding’ tasks based on simple video games, in which players (N = 124) had to identify themselves out of a set of candidates in order to play effectively. Quantitative and qualitative testing showed that human players are nearly optimal at self-orienting. In contrast, well-known deep reinforcement learning algorithms, which excel at learning much more complex video games, are far from optimal. We suggest that self-orienting allows humans to flexibly navigate new settings.
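To make the idea of a 'self-finding' task concrete, the following is a minimal sketch under stated assumptions, not the authors' tasks or models. It assumes a hypothetical unbounded grid with several candidate avatars, exactly one of which follows the player's chosen action while the others move at random. The names `MOVES`, `step` and `find_self` are illustrative only. The agent self-orients by discarding any candidate whose motion contradicts the action it just took.

```python
import random

# Hypothetical action set: each action maps to a (dx, dy) displacement.
MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}


def step(positions, self_index, action, rng):
    """Move the controlled avatar by `action`; every other avatar moves at random."""
    new_positions = []
    for i, (x, y) in enumerate(positions):
        if i == self_index:
            dx, dy = MOVES[action]
        else:
            dx, dy = MOVES[rng.choice(list(MOVES))]
        new_positions.append((x + dx, y + dy))
    return new_positions


def find_self(positions, self_index, rng, max_steps=50):
    """Return the surviving candidate indices and the number of steps taken.

    A candidate survives a step only if it moved exactly as the chosen
    action dictates. The true self always survives in this toy setting
    because the grid has no walls or noise (an assumption of the sketch).
    """
    candidates = set(range(len(positions)))
    steps = 0
    while len(candidates) > 1 and steps < max_steps:
        action = rng.choice(list(MOVES))
        new_positions = step(positions, self_index, action, rng)
        dx, dy = MOVES[action]
        candidates = {
            i
            for i in candidates
            if new_positions[i] == (positions[i][0] + dx, positions[i][1] + dy)
        }
        positions = new_positions
        steps += 1
    return candidates, steps


if __name__ == "__main__":
    rng = random.Random(0)
    start = [(0, 0), (5, 0), (0, 5), (5, 5)]
    survivors, n_steps = find_self(start, self_index=2, rng=rng)
    print(f"Remaining candidates: {survivors} after {n_steps} step(s)")
```

In this toy setting each distractor matches the chosen action by chance with probability 1/4 per step, so the candidate set typically collapses to the true self within a few actions. Noisy or delayed control would call for a probabilistic update instead of hard elimination.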
Data availability
The data that support the findings of this study are available in the Open Science Framework at https://osf.io/bwzth/.
Code availability
All code for data analysis and reproducing the plots is available at https://github.com/Ethical-Intelligence-Lab/probabilisticSelf.
Acknowledgements
For running the artificial models, we used the Harvard Business School compute cluster. This research was funded by Harvard Business School. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.
Author information
Contributions
J.D.F. initiated the research; J.D.F., A.K.U. and Z.O.-U. put together the data and conducted the analyses; and J.D.F., A.K.U., Z.O.-U., L.A.P., J.T. and T.D.U. wrote the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Human Behaviour thanks Nathan Faivre, Tony Prescott and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Supplementary information
Supplementary Information
Supplementary Figs. 1–20 and Tables 1–20.
About this article
Cite this article
De Freitas, J., Uğuralp, A.K., Oğuz-Uğuralp, Z. et al. Self-orienting in human and machine learning. Nat Hum Behav (2023). https://doi.org/10.1038/s41562-023-01696-5
DOI: https://doi.org/10.1038/s41562-023-01696-5