
A social path to human-like artificial intelligence


Traditionally, cognitive and computer scientists have viewed intelligence solipsistically, as a property of unitary agents devoid of social context. Given the success of contemporary learning algorithms, we argue that the bottleneck in artificial intelligence (AI) advancement is shifting from data assimilation to novel data generation. We bring together evidence showing that natural intelligence emerges at multiple scales in networks of interacting agents via collective living, social relationships and major evolutionary transitions, which contribute to novel data generation through mechanisms such as population pressures, arms races, Machiavellian selection, social learning and cumulative culture. Many breakthroughs in AI exploit some of these processes, from multi-agent structures enabling algorithms to master complex games such as Capture-The-Flag and StarCraft II, to strategic communication in the game Diplomacy and the shaping of AI data streams by other AIs. Moving beyond a solipsistic view of agency to integrate these mechanisms could provide a path to human-like compounding innovation through ongoing novel data generation.


Fig. 1: Learning quality depends on the richness and size of the dataset.
Fig. 2: Social interactions drive compounding innovation.




We thank A. Anand, D. Parkes, T. Schaul and K. Tuyls for helpful comments on early versions of this manuscript.

Author information



All authors contributed ideas and wrote the paper.

Corresponding author

Correspondence to Edgar A. Duéñez-Guzmán.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Machine Intelligence thanks Fernando Santos and Cédric Colas for their contribution to the peer review of this work. Primary Handling Editor: Trenton Jerde, in collaboration with the Nature Machine Intelligence team.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Duéñez-Guzmán, E.A., Sadedin, S., Wang, J.X. et al. A social path to human-like artificial intelligence. Nat Mach Intell 5, 1181–1188 (2023).


