Expertise increases planning depth in human gameplay

Abstract

A hallmark of human intelligence is the ability to plan multiple steps into the future1,2. Despite decades of research3,4,5, it is still debated whether skilled decision-makers plan more steps ahead than novices6,7,8. Traditionally, the study of expertise in planning has used board games such as chess, but the complexity of these games poses a barrier to quantitative estimates of planning depth. Conversely, common planning tasks in cognitive science often have a lower complexity9,10 and impose a ceiling for the depth to which any player can plan. Here we investigate expertise in a complex board game that offers ample opportunity for skilled players to plan deeply. We use model fitting methods to show that human behaviour can be captured using a computational cognitive model based on heuristic search. To validate this model, we predict human choices, response times and eye movements. We also perform a Turing test and a reconstruction experiment. Using the model, we find robust evidence for increased planning depth with expertise in both laboratory and large-scale mobile data. Experts memorize and reconstruct board features more accurately. Using complex tasks combined with precise behavioural modelling might expand our understanding of human planning and help to bridge the gap with progress in artificial intelligence.

Fig. 1: Task and computational model.
Fig. 2: The model accounts for multivariate data and generalizes to unseen data.
Fig. 3: The effects of expertise and time pressure on planning.
Fig. 4: The effects of expertise on planning in mobile data.

Data availability

Data supporting the findings of this study are publicly available at the Open Science Framework (https://osf.io/n2xjm/).

Code availability

Code used in this study is publicly available at the Open Science Framework (https://osf.io/n2xjm/).

References

  1. Miller, K. J. & Venditto, S. J. C. Multi-step planning in the brain. Curr. Opin. Behav. Sci. 38, 29–39 (2021).

  2. Mattar, M. G. & Lengyel, M. Planning in the brain. Neuron 110, 914–934 (2022).

  3. de Groot, A. D. Het Denken van den Schaker (Noord-Holland. Uitgev. Maatschappij, 1946).

  4. Charness, N. in Toward a General Theory of Expertise: Prospects and Limits (eds Ericsson, K. A. & Smith, J.) 39–63 (Cambridge University Press, 1991).

  5. Holding, D. H. Theories of chess skill. Psychol. Res. 54, 10–16 (1992).

  6. Gobet, F. A pattern-recognition theory of search in expert problem solving. Think. Reasoning 3, 291–313 (1997).

  7. Campitelli, G. & Gobet, F. Adaptive expert decision making: Skilled chess players search more and deeper. J. Int. Comput. Games Assoc. 27, 209–216 (2004).

  8. Linhares, A., Freitas, A. E. T., Mendes, A. & Silva, J. S. Entanglement of perception and reasoning in the combinatorial game of chess: differential errors of strategic reconstruction. Cogn. Syst. Res. 13, 72–86 (2012).

  9. Daw, N. D., Gershman, S. J., Seymour, B., Dayan, P. & Dolan, R. J. Model-based influences on humans’ choices and striatal prediction errors. Neuron 69, 1204–1215 (2011).

  10. Huys, Q. J. et al. Bonsai trees in your head: how the Pavlovian system sculpts goal-directed choices by pruning decision trees. PLoS Comput. Biol. 8, e1002410 (2012).

  11. Chase, W. G. & Simon, H. A. Perception in chess. Cogn. Psychol. 4, 55–81 (1973).

  12. Van Harreveld, F., Wagenmakers, E.-J. & Van Der Maas, H. L. The effects of time pressure on chess skill: an investigation into fast and slow processes underlying expert performance. Psychol. Res. 71, 591–597 (2007).

  13. Sheridan, H. & Reingold, E. M. Chess players’ eye movements reveal rapid recognition of complex visual patterns: evidence from a chess-related visual search task. J. Vis. 17, 4 (2017).

  14. Gobet, F. & Simon, H. A. Expert chess memory: revisiting the chunking hypothesis. Memory 6, 225–255 (1998).

  15. Bilalić, M., Langner, R., Erb, M. & Grodd, W. Mechanisms and neural basis of object and pattern recognition: a study with chess experts. J. Exp. Psychol. Gen. 139, 728–742 (2010).

  16. Saariluoma, P. Visuospatial and articulatory interference in chess players’ information intake. Appl. Cogn. Psychol. 6, 77–89 (1992).

  17. Holding, D. H. The Psychology of Chess Skill (Lawrence Erlbaum, 1985).

  18. Holding, D. H. Evaluation factors in human tree search. Am. J. Psychol. 102, 103–108 (1989).

  19. Gobet, F. & Jansen, P. Towards a chess program based on a model of human memory. Adv. Comput. Chess 7, 35–60 (1994).

  20. Holding, D. H. Counting backward during chess move choice. Bull. Psychon. Soc. 27, 421–424 (1989).

  21. Charness, N. in Complex Information Processing 203–228 (Psychology Press, 2013).

  22. Huys, Q. J. et al. Interplay of approximate planning strategies. Proc. Natl Acad. Sci. USA 112, 3098–3103 (2015).

  23. Snider, J., Lee, D., Poizner, H. & Gepshtein, S. Prospective optimization with limited resources. PLoS Comput. Biol. 11, e1004501 (2015).

  24. Kolling, N., Scholl, J., Chekroud, A., Trier, H. A. & Rushworth, M. F. Prospection, perseverance, and insight in sequential behavior. Neuron 99, 1069–1082 (2018).

  25. Pfeiffer, B. E. & Foster, D. J. Hippocampal place-cell sequences depict future paths to remembered goals. Nature 497, 74–79 (2013).

  26. Redish, A. D. Vicarious trial and error. Nat. Rev. Neurosci. 17, 147–159 (2016).

  27. Pezzulo, G., Donnarumma, F., Maisto, D. & Stoianov, I. Planning at decision time and in the background during spatial navigation. Curr. Opin. Behav. Sci. 29, 69–76 (2019).

  28. Miller, K. J., Botvinick, M. M. & Brody, C. D. Dorsal hippocampus contributes to model-based planning. Nat. Neurosci. 20, 1269 (2017).

  29. Groman, S. M., Rich, K. M., Smith, N. J., Lee, D. & Taylor, J. R. Chronic exposure to methamphetamine disrupts reinforcement-based decision making in rats. Neuropsychopharmacology 43, 770–780 (2018).

  30. Akam, T. et al. The anterior cingulate cortex predicts future states to mediate model-based action selection. Neuron 109, 149–163 (2020).

  31. Beck, J. Combinatorial Games: Tic-Tac-Toe Theory Vol. 114 (Cambridge Univ. Press, 2008).

  32. van Opheusden, B. & Ma, W. J. Tasks for aligning human and machine planning. Curr. Opin. Behav. Sci. 29, 127–133 (2019).

  33. Pearl, J. Heuristics: Intelligent Search Strategies for Computer Problem Solving (Addison-Wesley Longman Publishing Co., Inc., 1984).

  34. Bonet, B. & Geffner, H. Planning as heuristic search. Artif. Intell. 129, 5–33 (2001).

  35. Dechter, R. & Pearl, J. Generalized best-first search strategies and the optimality of A*. J. ACM 32, 505–536 (1985).

  36. Callaway, F. et al. Rational use of cognitive resources in human planning. Nat. Hum. Behav. 6, 1112–1125 (2022).

  37. Treisman, A. M. & Gelade, G. A feature-integration theory of attention. Cogn. Psychol. 12, 97–136 (1980).

  38. van Opheusden, B., Acerbi, L. & Ma, W. J. Unbiased and efficient log-likelihood estimation with inverse binomial sampling. PLOS Comput. Biol. 16, e1008483 (2020).

  39. Acerbi, L. & Ma, W. J. Practical Bayesian optimization for model fitting with Bayesian adaptive direct search. Proceedings of the 31st International Conference on Neural Information Processing Systems 1834–1844 (2017).

  40. Turing, A. Computing machinery and intelligence. Mind 59, 433–460 (1950).

  41. Elo, A. E. The Rating of Chessplayers, Past and Present (Arco Pub., 1978).

  42. Chabris, C. F. & Hearst, E. S. Visualization, pattern recognition, and forward search: Effects of playing speed and sight of the position on grandmaster chess errors. Cogn. Sci. 27, 637–648 (2003).

  43. Calderwood, R., Klein, G. A. & Crandall, B. W. Time pressure, skill, and move quality in chess. Am. J. Psychol. 101, 481–493 (1988).

  44. Krusche, M. J., Schulz, E., Guez, A. & Speekenbrink, M. Adaptive planning in human search. Preprint at BioRxiv https://doi.org/10.1101/268938 (2018).

  45. Huang, J., Velarde, I., Ma, W. J. & Baldassano, C. Schema-based predictive eye movements support sequential memory encoding. eLife 12, e82599 (2023).

  46. Dubey, R., Agrawal, P., Pathak, D., Griffiths, T. L. & Efros, A. A. Investigating human priors for playing video games. In Proc. International Conference on Machine Learning (ICML) (2018).

  47. Charness, N., Tuffiash, M., Krampe, R., Reingold, E. & Vasyukova, E. The role of deliberate practice in chess expertise. Appl. Cogn. Psychol. 19, 151–165 (2005).

  48. Brown, N. & Sandholm, T. Superhuman AI for multiplayer poker. Science 365, 885–890 (2019).

  49. Meta Fundamental AI Research Diplomacy Team (FAIR) et al. Human-level play in the game of Diplomacy by combining language models with strategic reasoning. Science 378, 1067–1074 (2022).

  50. Silver, D. et al. A general reinforcement learning algorithm that masters chess, shogi, and go through self-play. Science 362, 1140–1144 (2018).

  51. Hamrick, J. B. et al. Combining q-learning and search with amortized value estimates. In Proc. International Conference on Learning Representations (ICLR) (2020).

  52. Ma, I., Phaneuf, C., van Opheusden, B., Ma, W. J. & Hartley, C. The component processes of complex planning follow distinct developmental trajectories. Preprint at PsyArXiv https://doi.org/10.31234/osf.io/d62rw (2022).

  53. Padoa-Schioppa, C. & Assad, J. A. Neurons in the orbitofrontal cortex encode economic value. Nature 441, 223–226 (2006).

  54. Cornelissen, F. W., Peters, E. M. & Palmer, J. The eyelink toolbox: eye tracking with MATLAB and the psychophysics toolbox. Behav. Res. Methods Instr. Comput. 34, 613–617 (2002).

  55. Zermelo, E. Die Berechnung der Turnier-Ergebnisse als ein Maximumproblem der Wahrscheinlichkeitsrechnung. Math. Z. 29, 436–460 (1929).

  56. Hunter, D. R. MM algorithms for generalized Bradley-Terry models. Ann. Stat. 32, 384–406 (2004).

  57. Sutton, R. S. & Barto, A. G. Reinforcement Learning: An Introduction Vol. 1 (MIT Press, 1998).

  58. Sutton, R. S., McAllester, D. A., Singh, S. P. & Mansour, Y. in Advances in Neural Information Processing Systems 1057–1063 (2000).

  59. Dawson, R. Unbiased Tests, Unbiased Estimators, and Randomized Similar Regions. PhD thesis, Harvard Univ. (1953).

  60. de Groot, M. H. Unbiased sequential estimation for binomial populations. Ann. Math. Stat. 30, 80–101 (1959).

  61. Huyer, W. & Neumaier, A. Global optimization by multilevel coordinate search. J. Glob. Optim. 14, 331–355 (1999).

Acknowledgements

We thank Z. Shu for piloting an early version of the experiment; F. Khalidi for assistance with data collection; and A. Mihali, A. Yoo, M. Honig, L. Acerbi, W. Adler, F. Callaway, T. Griffiths, M. Mattar and the other current members and alumni of the Ma laboratory for discussions. This work was supported by grant number IIS-1344256 to W.J.M. and by Graduate Research Fellowship number DGE1839302 to I.K. from the National Science Foundation.

Author information

Contributions

All of the authors contributed to the conceptualization of the research. B.v.O., G.G., I.K. and Y.L. collected data. B.v.O., I.K., G.G., Y.L. and Z.B. developed software and methodology and performed analysis. B.v.O., I.K. and W.J.M. wrote the paper. W.J.M. supervised the project and acquired funding.

Corresponding author

Correspondence to Bas van Opheusden.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature thanks Quentin Huys and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Model comparison.

We validate our main model specification by comparing to alternatives in three categories: lesions generated by removing model components (red), extensions generated by adding new model components (blue) and modifications generated by replacing a model component with a similar implementation (green). A. Cross-validated log-likelihood per move, across all participants in the laboratory experiments. Error bars indicate mean and s.e.m. of the difference in log-likelihood with the main model. B–F. Same as A., for participants in the human-vs-human, generalization, eye tracking, learning and time pressure experiments.

Extended Data Fig. 2 Parameter validation.

Because model fitting is too computationally expensive for parameter recovery, we assess the reliability of the parameter estimates using less computationally expensive methods. A. Pearson correlation across participants between model parameters estimated in two independent fits. Error bars indicate the confidence interval. B. Same as A., for different sessions in the learning experiment. Error bars indicate s.e.m. across participants. C–D. Same as A–B., for the derived metrics. E. Two-sample Kolmogorov–Smirnov test statistic between the distributions of \(\hat{\theta}_{j}^{\mathrm{lesion}\,i}\) and \(\hat{\theta}_{j}^{\mathrm{full}}\) for each pair of parameters. In all panels, we indicate tests that are significant after correcting for multiple comparisons using false discovery rate by *: α = 0.05, **: α = 0.01, ***: α = 0.001. For significant tests, we additionally report uncorrected two-sided p-values. F. Trade-offs between model parameters, using a Pearson correlation between \(\hat{\theta}_{i}^{\mathrm{full}}\) and \(\hat{\theta}_{j}^{\mathrm{full}}-\hat{\theta}_{j}^{\mathrm{lesion}\,i}\) for each pair of model parameters. G–H. Same as E–F., for the derived metrics.
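The false discovery rate correction mentioned in this caption can be illustrated with a short sketch. The caption does not say which procedure was used, so the Benjamini-Hochberg step-up rule below is an assumption, and the function name and p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Flag tests as significant under the Benjamini-Hochberg step-up rule.

    Assumption: the caption only says "false discovery rate"; BH is one
    common choice and may differ from the procedure used in the study.
    """
    m = len(p_values)
    # Indices of the tests, sorted by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k whose p-value satisfies p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    # Every test ranked at or below max_rank is declared significant.
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            significant[idx] = True
    return significant


# Hypothetical uncorrected p-values for five parameter pairs.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.6]
flags = benjamini_hochberg(p_vals, alpha=0.05)
```

Running the same function with alpha set to 0.01 or 0.001 would give the thresholds behind the two-star and three-star markers, under the same assumption.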

Extended Data Fig. 3 Summary statistics.

Comparing our main model directly to human choices is challenging because the data are high-dimensional and discrete. Instead, we compute summary statistics as a function of the number of pieces on the board, to probe for systematic patterns in the time course of people’s games, such as a tendency to start playing near the centre of the board and gradually expand outwards. We compare moves made in human-vs-human games (green solid lines), moves made by the behavioural model with inferred parameters on the same positions (blue solid lines) and random moves (black dashed lines). For all summary statistics, people deviate considerably from random, and the main model closely matches the human data. All panels depict cross-validated predictions.

Extended Data Fig. 4 Individual differences across summary statistics.

Each panel shows a scatterplot for the same set of summary statistics as in Extended Data Fig. 3, where each point represents a participant in the human-vs-human experiment, the horizontal coordinate the statistic computed on that participant’s moves, and the vertical coordinate the statistic computed on moves made by the model, with parameters inferred for that participant on out-of-sample choices. The Pearson correlation coefficient and two-sided p-value are reported within each panel. The model accurately predicts individual differences between participants.

Extended Data Fig. 5 Example board positions illustrating model components.

To investigate which patterns in the data are explained by tree search and feature dropping, we compare the distribution of choices predicted by the main model against lesion models. A. Example positions from human-vs-human games in which the model with (right column) and without tree search (left column) make highly different predictions (red shade), as quantified by Jensen-Shannon divergence. In each position, we also show the models’ preferred move (with an x) and the move made by the human participant (open circle). These predictions are averaged across simulations with 200 different parameter vectors from fits to human data, to capture positions with robust differences between planning and no planning. Upon inspection, we recognize these positions as ones where the player to move has multiple reasonable options, but to evaluate their quality one has to calculate many moves ahead. For example, in the second position, the move preferred by the No tree model is losing and the one by the main model is drawn, but this relies on a specific 10-move forced sequence that can only be found through explicit search. B. Same as A., but lesioning the feature drop metric, and using the ratio of the predicted probability of the human move as metric for selecting positions. The feature drop mechanism is primarily necessary to account for people’s tendency to overlook possibilities to immediately make four-in-a-row, or block immediate four-in-a-row threats by the opponent.
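As a rough illustration of how two models' move predictions can be compared, the sketch below computes the Jensen-Shannon divergence between two probability distributions over candidate moves. The function names, the base-2 logarithm and the example distributions are assumptions for illustration, not the study's code.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """Symmetric divergence between two distributions over the same moves.

    With base-2 logarithms the value lies between 0 (identical) and 1 (disjoint).
    """
    # Mixture distribution; it is positive wherever p or q is positive,
    # so the KL terms below never divide by zero.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical move probabilities from a model with and without tree search.
with_search = [0.7, 0.2, 0.1, 0.0]
without_search = [0.1, 0.2, 0.1, 0.6]
jsd = jensen_shannon_divergence(with_search, without_search)
```

Positions could then be ranked by this value to find those where the two models disagree most, which is the selection idea described for panel A.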

Extended Data Fig. 6 Turing test.

In the Turing test, we showed participants video segments of sequences of moves, on average 9.38 moves long. A. Classification accuracy in the Turing test as a function of video length. Error bars indicate s.e.m. Participants are at chance level for classification of one-move videos (of which there were 8), and their accuracy only substantially exceeds 50% for sequences longer than 10 moves. A mixed effects linear regression with accuracy as dependent variable and observer-specific random intercepts estimates the increase in accuracy per observed move as only 0.33 ± 0.10%. B. Histogram of the percentage of observers classifying a given video as human-vs-human or computer-vs-computer, for either human games (pink), or computer-generated games (grey). While human games are on average more likely to be classified as human and computer games as computers, there are no videos for which all 30 observers agree, and there is a considerable fraction of videos (63 out of 180) for which a majority of observers respond incorrectly.

Extended Data Fig. 7 Eye tracking.

A. Coefficients in a linear regression predicting participants’ attentional distribution from the distribution of squares that the model includes in its principal variation at each depth. The regression coefficients are significantly greater than zero (one-sample t-test across participants) for depths up to 7, and are highest for depths close to 1. Error bars indicate s.e.m. across participants. B. Example positions from the eye tracking data in which the No feature drop model assigns low probability to the participant’s move. The right column shows the eye movements while the participant contemplates their move. In most positions, the participant spends no time whatsoever looking at the square preferred by the model, suggesting they indeed dropped the relevant four-in-a-row feature.

Extended Data Fig. 8 Playing strength correlations and response times.

A. Planning depth vs Elo rating of all participants in the learning (green) and time pressure experiments (purple). Playing strength correlates with planning depth (ρ = 0.62, p < 0.001). B. Same as A., for feature drop rate (ρ = −0.73, p < 0.001). C. Same as A., for heuristic quality, which does not correlate significantly with playing strength (ρ = 0.11, p = 0.088). D. Response times for participants in each session of the learning experiment. Error bars indicate s.e.m. across participants. Participants play slightly faster in later sessions. Therefore, our finding of increased planning in later sessions is not confounded by an increase in thinking time. Instead, people plan more while using less time. E. Same as D., for the time pressure experiment. The time limit manipulation is effective at increasing participants’ response times, even though they use only a fraction of the available time on average.
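For readers unfamiliar with Elo ratings, the sketch below shows the standard Elo expected-score formula, which maps a rating difference to a predicted game outcome. How ratings were estimated in this study is not described in this caption, so the function name and example ratings are hypothetical.

```python
def elo_expected_score(rating_a, rating_b):
    """Expected score of player A against player B under the standard Elo model.

    A 400-point rating advantage corresponds to 10:1 odds in A's favour.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Hypothetical ratings: equally rated players have an expected score of 0.5.
even_match = elo_expected_score(1500, 1500)
# A stronger player facing a weaker one has an expected score above 0.5.
favourite = elo_expected_score(1900, 1500)
```

The two players' expected scores always sum to 1, so a rating gap can be read directly as a predicted share of points.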

Extended Data Fig. 9 Memory and reconstruction experiment.

A. Error rates in the memory and reconstruction experiment. Although experts are slightly worse than novices in the extra piece error rate (β = 0.0071 ± 0.0031, p = 0.049), experts substantially outperform novices in the missed piece (β = 0.037 ± 0.006, p < 0.001) and the wrong colour rate (β = 0.019 ± 0.003, p < 0.001). B. Scatterplot of total reconstruction time for experts and novices. Each point represents a board position in the memory and reconstruction experiment, the x-coordinate the average time that experts take to finish their reconstruction, and the y-coordinate the same but for novices. Positions from games are coloured pink, randomly scrambled positions grey. Experts take more time to reconstruct pieces (β = 2.73 ± 0.57, p < 0.001), meaning that the error rate result could reflect a speed-accuracy trade-off as opposed to an overall improvement. However, experts reconstruct game-relevant features such as 3-in-a-row more accurately in the same amount of time. C. Example position of the memory and reconstruction experiment. The original board contains a 3-in-a-row feature on the bottom row (yellow shading). In the reconstructions, each circle indicates the distribution of pieces placed by different observers, with the angles of the grey, black and white wedges indicating the probability for that square to be empty, contain a black piece or contain a white piece, respectively. Novices correctly reconstruct the 3-in-a-row feature 42.1% of the time, whereas experts do so 84.2% of the time. Together, these results suggest that players represent boards in memory in terms of game-relevant features.
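The three error types in panel A can be made concrete with a small sketch. The board encoding, the function name and the choice to normalise by the number of squares are all assumptions for illustration; the caption does not define how the rates were computed.

```python
EMPTY, BLACK, WHITE = 0, 1, 2  # hypothetical square encoding


def reconstruction_errors(original, reconstruction):
    """Per-square error rates for one reconstructed board.

    Both boards are flat sequences of EMPTY/BLACK/WHITE of equal length.
    Assumption: rates are normalised by the total number of squares.
    """
    if len(original) != len(reconstruction):
        raise ValueError("boards must have the same number of squares")
    counts = {"extra": 0, "missed": 0, "wrong_colour": 0}
    for true_sq, recon_sq in zip(original, reconstruction):
        if true_sq == EMPTY and recon_sq != EMPTY:
            counts["extra"] += 1         # piece placed on an empty square
        elif true_sq != EMPTY and recon_sq == EMPTY:
            counts["missed"] += 1        # existing piece left out
        elif true_sq != recon_sq:
            counts["wrong_colour"] += 1  # piece present but wrong colour
    n = len(original)
    return {name: count / n for name, count in counts.items()}


# Hypothetical six-square strip of a board and one reconstruction of it.
original = [BLACK, WHITE, EMPTY, EMPTY, BLACK, EMPTY]
attempt = [BLACK, BLACK, WHITE, EMPTY, EMPTY, EMPTY]
rates = reconstruction_errors(original, attempt)
```

In this example each error type occurs once: a wrong colour on the second square, an extra piece on the third and a missed piece on the fifth.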

Extended Data Table 1 Robustness analysis

About this article

Cite this article

van Opheusden, B., Kuperwajs, I., Galbiati, G. et al. Expertise increases planning depth in human gameplay. Nature 618, 1000–1005 (2023). https://doi.org/10.1038/s41586-023-06124-2

  • DOI: https://doi.org/10.1038/s41586-023-06124-2
