
Self-orienting in human and machine learning

Abstract

A current proposal for a computational notion of self is a representation of one’s body in a specific time and place, which includes the recognition of that representation as the agent. This turns self-representation into a process of self-orientation, a challenging computational problem for any human-like agent. Here, to examine this process, we created several ‘self-finding’ tasks based on simple video games, in which players (N = 124) had to identify themselves out of a set of candidates in order to play effectively. Quantitative and qualitative testing showed that human players are nearly optimal at self-orienting. In contrast, well-known deep reinforcement learning algorithms, which excel at learning much more complex video games, are far from optimal. We suggest that self-orienting allows humans to flexibly navigate new settings.
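To make the task structure concrete, the sketch below is a minimal illustration, not the authors' implementation (their code is linked under Code availability); the class name, grid size, number of candidates and reward values are assumptions chosen for readability. It captures the setup described in the abstract: several candidate avatars share a grid, only one of them responds to the player's actions, and the player must infer which avatar is the 'self' and steer it to a goal.

```python
import random

class SelfFindingGridworld:
    """Minimal self-finding task (illustrative): several candidate avatars
    occupy a grid, but only one (the hidden 'self') responds to the chosen
    action; the others move randomly. An episode ends when the self reaches
    the goal cell."""

    ACTIONS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}

    def __init__(self, size=10, n_candidates=4, seed=None):
        self.size = size
        self.n_candidates = n_candidates
        self.rng = random.Random(seed)
        self.reset()

    def _random_cell(self):
        return (self.rng.randrange(self.size), self.rng.randrange(self.size))

    def reset(self):
        # Place the candidates and the goal at distinct random cells.
        cells = set()
        while len(cells) < self.n_candidates + 1:
            cells.add(self._random_cell())
        cells = list(cells)
        self.goal = cells.pop()
        self.candidates = cells
        # Which candidate the player controls is hidden from the player.
        self.self_idx = self.rng.randrange(self.n_candidates)
        self.steps = 0
        return list(self.candidates), self.goal

    def _move(self, pos, delta):
        # Move one cell, clamped to the grid boundaries.
        x, y = pos
        dx, dy = delta
        return (min(max(x + dx, 0), self.size - 1),
                min(max(y + dy, 0), self.size - 1))

    def step(self, action):
        """Apply the player's action to the true self; distractors move randomly."""
        self.steps += 1
        for i in range(self.n_candidates):
            if i == self.self_idx:
                delta = self.ACTIONS[action]
            else:
                delta = self.ACTIONS[self.rng.choice(list(self.ACTIONS))]
            self.candidates[i] = self._move(self.candidates[i], delta)
        done = self.candidates[self.self_idx] == self.goal
        reward = 1.0 if done else -0.01  # small step cost rewards efficient play
        return list(self.candidates), reward, done


# Example: a random policy eventually solves a level but wastes many steps;
# a self-orienting player would first act to discover which avatar it
# controls, then head straight for the goal.
env = SelfFindingGridworld(seed=0)
obs, goal = env.reset()
done = False
while not done and env.steps < 5000:
    obs, reward, done = env.step(random.choice(list(SelfFindingGridworld.ACTIONS)))
print("finished after", env.steps, "steps")
```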


Fig. 1: The Logic Game.
Fig. 2: Results of study 1 (Logic Game).
Fig. 3: The Contingency Game.
Fig. 4: Results of study 2 (Contingency Game).
Fig. 5: Results of study 3 (Switching Mappings Game).
Fig. 6: The Switching Embodiments Game.
Fig. 7: Results for study 4 (Switching Embodiments Game).
Fig. 8: Results for the mean number of steps during the last 50 levels and (where relevant) all post-perturbation levels.


Data availability

The data that support the findings of this study are available in the Open Science Framework at https://osf.io/bwzth/.

Code availability

All code for data analysis and reproducing the plots is available at https://github.com/Ethical-Intelligence-Lab/probabilisticSelf.


Acknowledgements

The artificial models were run on the Harvard Business School compute cluster. This research was funded by Harvard Business School. The funders had no role in study design, data collection and analysis, the decision to publish or the preparation of the manuscript.

Author information


Contributions

J.D.F. initiated the research; J.D.F., A.K.U. and Z.O.-U. compiled the data and conducted the analyses; and J.D.F., A.K.U., Z.O.-U., L.A.P., J.T. and T.D.U. wrote the manuscript.

Corresponding author

Correspondence to Julian De Freitas.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Human Behaviour thanks Nathan Faivre, Tony Prescott and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Figs. 1–20 and Tables 1–20.

Reporting Summary

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

De Freitas, J., Uğuralp, A.K., Oğuz-Uğuralp, Z. et al. Self-orienting in human and machine learning. Nat Hum Behav 7, 2126–2139 (2023). https://doi.org/10.1038/s41562-023-01696-5

