A bias–variance trade-off governs individual differences in on-line learning in an unpredictable environment

Glaze, Christopher M.; Filipowicz, Alexandre L. S.; Kable, Joseph W.; Balasubramanian, Vijay; Gold, Joshua I.

doi:10.1038/s41562-018-0297-4

Article
Published: 05 February 2018

Christopher M. Glaze¹,², Alexandre L. S. Filipowicz¹,², Joseph W. Kable², Vijay Balasubramanian³ & Joshua I. Gold¹ (ORCID: orcid.org/0000-0002-6018-0483)

Nature Human Behaviour volume 2, pages 213–224 (2018)

Abstract

Decisions often benefit from learned expectations about the sequential structure of the evidence. Here we show that individual differences in this learning process can reflect different implicit assumptions about sequence complexity, leading to performance trade-offs. For a task requiring decisions about dynamic evidence streams, human subjects with more flexible, history-dependent choices (low bias) had greater trial-to-trial choice variability (high variance). In contrast, subjects with more history-independent choices (high bias) were more predictable (low variance). We accounted for these behaviours using models in which assumed complexity was encoded by the size of the hypothesis space over the latent rate of change of the source of evidence. The most parsimonious model used an efficient sampling algorithm in which the range of sampled hypotheses represented an information bottleneck that gave rise to a bias–variance trade-off. This trade-off, which is well known in machine learning, may thus also have broad applicability to human decision-making.
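
The trade-off described above can be illustrated with a toy simulation. The sketch below is not the authors' model: it assumes that the observer sees switch events directly (in the task they are latent), that hypotheses about the rate of change are drawn from a Beta prior centred on 0.5, and that the estimate is the posterior mean over the sampled hypotheses. The function names, the true rate of 0.1 and the trial counts are hypothetical; only the idea of 20 sampled hypotheses and a prior precision that sets their range is taken from the figure titles.

```python
import math
import random
import statistics


def estimate_hazard(switches, n_samples, prior_precision, rng):
    """Posterior-mean rate of change from a finite set of sampled hypotheses."""
    # Assumed prior: Beta centred on 0.5. A larger precision gives a
    # narrower range of sampled hypotheses (a "simpler" observer).
    a = b = prior_precision / 2.0
    hypotheses = [rng.betavariate(a, b) for _ in range(n_samples)]
    # Clamp away from 0 and 1 so the logarithms below are always defined.
    hypotheses = [min(max(h, 1e-9), 1.0 - 1e-9) for h in hypotheses]
    k, n = sum(switches), len(switches)
    # Log-likelihood of the observed switch count under each hypothesis.
    log_w = [k * math.log(h) + (n - k) * math.log(1.0 - h) for h in hypotheses]
    peak = max(log_w)  # subtract the maximum for numerical stability
    weights = [math.exp(lw - peak) for lw in log_w]
    total = sum(weights)
    return sum(w * h for w, h in zip(weights, hypotheses)) / total


def bias_and_variance(true_h, prior_precision, n_samples=20,
                      n_trials=50, n_runs=2000, seed=0):
    """Bias and variance of the estimate across simulated sessions."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_runs):
        # Each trial switches independently with probability true_h.
        switches = [rng.random() < true_h for _ in range(n_trials)]
        estimates.append(
            estimate_hazard(switches, n_samples, prior_precision, rng))
    bias = statistics.fmean(estimates) - true_h
    variance = statistics.pvariance(estimates)
    return bias, variance


# Narrow hypothesis range (high precision) versus wide range (low precision).
narrow = bias_and_variance(true_h=0.1, prior_precision=1000.0)
wide = bias_and_variance(true_h=0.1, prior_precision=2.0)
print("narrow prior: bias=%.3f variance=%.5f" % narrow)
print("wide prior:   bias=%.3f variance=%.5f" % wide)
```

Under these assumptions the narrow-prior observer stays close to 0.5 regardless of the data (large bias, small variance), whereas the wide-prior observer tracks the true rate more closely but varies more from session to session.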

**Fig. 2: Individual differences in hazard-rate learning.**

**Fig. 3: Relationship between choice variability and adaptivity across subjects.**

**Fig. 5: Relationships between adaptivity and choice variability from simulations using the ideal-observer, sampling and delta-rule models.**

**Fig. 6: Fits of the ideal-observer, sampling and delta-rule models to choice data.**

**Fig. 7: Effects of prior precision (φ_H) from the 20-sample model on simulated choice patterns.**

**Fig. 8: A model-independent measure of inference complexity.**

Acknowledgements

We thank G. Kroch and T. Kim for help with data collection and K. Krishnamurthy for comments. Funded by NSF-NCS 1533623. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.

Author information

Authors and Affiliations

Department of Neuroscience, University of Pennsylvania, Philadelphia, PA, USA
Christopher M. Glaze, Alexandre L. S. Filipowicz & Joshua I. Gold
Department of Psychology, University of Pennsylvania, Philadelphia, PA, USA
Christopher M. Glaze, Alexandre L. S. Filipowicz & Joseph W. Kable
Department of Physics, University of Pennsylvania, Philadelphia, PA, USA
Vijay Balasubramanian

Contributions

C.M.G., J.W.K. and J.I.G. designed the experiment; C.M.G. collected and analysed the data and implemented the models; A.L.S.F. implemented the complexity analysis; all five authors interpreted the results and drafted and/or revised the manuscript.

Corresponding author

Correspondence to Joshua I. Gold.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Figures 1–6.

Life Sciences Reporting Summary

About this article

Cite this article

Glaze, C.M., Filipowicz, A.L.S., Kable, J.W. et al. A bias–variance trade-off governs individual differences in on-line learning in an unpredictable environment. Nat Hum Behav 2, 213–224 (2018). https://doi.org/10.1038/s41562-018-0297-4

Received: 09 April 2017
Accepted: 04 January 2018
Published: 05 February 2018
Issue Date: March 2018
DOI: https://doi.org/10.1038/s41562-018-0297-4
