A bias–variance trade-off governs individual differences in on-line learning in an unpredictable environment

Abstract

Decisions often benefit from learned expectations about the sequential structure of the evidence. Here we show that individual differences in this learning process can reflect different implicit assumptions about sequence complexity, leading to performance trade-offs. For a task requiring decisions about dynamic evidence streams, human subjects with more flexible, history-dependent choices (low bias) had greater trial-to-trial choice variability (high variance). In contrast, subjects with more history-independent choices (high bias) were more predictable (low variance). We accounted for these behaviours using models in which assumed complexity was encoded by the size of the hypothesis space over the latent rate of change of the source of evidence. The most parsimonious model used an efficient sampling algorithm in which the range of sampled hypotheses represented an information bottleneck that gave rise to a bias–variance trade-off. This trade-off, which is well known in machine learning, may thus also have broad applicability to human decision-making.
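The link between hypothesis-space size and the trade-off can be illustrated with a small simulation. The sketch below is not the authors' sampling algorithm. It is an exact Bayesian filter over a fixed grid of candidate hazard rates, assuming a two-state source with Gaussian observations. The function names, the state means of -1 and +1, and the example grids are all illustrative assumptions. An observer given a single hazard-rate hypothesis cannot adapt its estimate (high bias), whereas an observer given a wider grid lets its estimate move with the data.

```python
import math


def gaussian_lik(x, mean, sd):
    """Likelihood of observation x under a Gaussian source (assumed noise model)."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def filter_step(joint, hazards, lik):
    """One Bayesian update of the joint belief over (hazard hypothesis, state).

    joint[i][s] is the probability that the hazard rate equals hazards[i]
    and the source is currently in state s (0 or 1).
    lik[s] is the likelihood of the new observation under state s.
    """
    updated = []
    for h, (p0, p1) in zip(hazards, joint):
        # Prediction: the source switches state with probability h.
        pred0 = p0 * (1.0 - h) + p1 * h
        pred1 = p1 * (1.0 - h) + p0 * h
        # Correction: weight each state by the new evidence.
        updated.append([pred0 * lik[0], pred1 * lik[1]])
    total = sum(a + b for a, b in updated)
    return [[a / total, b / total] for a, b in updated]


def run_observer(observations, hazards, means=(-1.0, 1.0), sd=1.0):
    """Return (P(state 1), posterior-mean hazard rate) after each observation."""
    n = len(hazards)
    # Flat prior over hazard hypotheses and over the two states.
    joint = [[0.5 / n, 0.5 / n] for _ in hazards]
    trace = []
    for x in observations:
        lik = [gaussian_lik(x, m, sd) for m in means]
        joint = filter_step(joint, hazards, lik)
        p_state1 = sum(row[1] for row in joint)
        mean_h = sum(h * (row[0] + row[1]) for h, row in zip(hazards, joint))
        trace.append((p_state1, mean_h))
    return trace


# Hypothetical evidence stream with one apparent change point.
obs = [1.0, 1.2, 0.8, -1.1, -0.9, -1.0]
narrow = run_observer(obs, hazards=[0.1])  # one hypothesis: fixed hazard rate
wide = run_observer(obs, hazards=[0.05, 0.25, 0.5, 0.75, 0.95])  # larger space
for (p_n, h_n), (p_w, h_w) in zip(narrow, wide):
    print(f"narrow: P1={p_n:.3f} H={h_n:.3f} | wide: P1={p_w:.3f} H={h_w:.3f}")
```

Running both observers on many noisy streams and comparing the spread of their choices would be one way to probe the variance side of the trade-off. That extension is left out here to keep the sketch short.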

Fig. 1: Task.
Fig. 2: Individual differences in hazard-rate learning.
Fig. 3: Relationship between choice variability and adaptivity across subjects.
Fig. 4: Learning models.
Fig. 5: Relationships between adaptivity and choice variability from simulations using the ideal-observer, sampling and delta-rule models.
Fig. 6: Fits of the ideal-observer, sampling and delta-rule models to choice data.
Fig. 7: Effects of prior precision (φ_H) from the 20-sample model on simulated choice patterns.
Fig. 8: A model-independent measure of inference complexity.

Acknowledgements

We thank G. Kroch and T. Kim for help with data collection and K. Krishnamurthy for comments. Funded by NSF-NCS 1533623. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.

Author information

Contributions

C.M.G., J.W.K. and J.I.G. designed the experiment; C.M.G. collected and analysed the data and implemented the models; A.L.S.F. implemented the complexity analysis; all five authors interpreted the results and drafted and/or revised the manuscript.

Corresponding author

Correspondence to Joshua I. Gold.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Glaze, C.M., Filipowicz, A.L.S., Kable, J.W. et al. A bias–variance trade-off governs individual differences in on-line learning in an unpredictable environment. Nat Hum Behav 2, 213–224 (2018). https://doi.org/10.1038/s41562-018-0297-4

  • DOI: https://doi.org/10.1038/s41562-018-0297-4
