A neural basis of probabilistic computation in visual cortex


Bayesian models of behavior suggest that organisms represent uncertainty associated with sensory variables. However, the neural code of uncertainty remains elusive. A central hypothesis is that uncertainty is encoded in the population activity of cortical neurons in the form of likelihood functions. We tested this hypothesis by simultaneously recording population activity from primate visual cortex during a visual categorization task in which trial-to-trial uncertainty about stimulus orientation was relevant for the decision. We decoded the likelihood function from the trial-to-trial population activity and found that it predicted decisions better than a point estimate of orientation. This remained true when we conditioned on the true orientation, suggesting that internal fluctuations in neural activity drive behaviorally meaningful variations in the likelihood function. Our results establish the role of population-encoded likelihood functions in mediating behavior and provide a neural underpinning for Bayesian models of perception.

Fig. 1: Alternative models of uncertainty information encoding.
Fig. 2: Behavioral task.
Fig. 3: Encoding and decoding of the stimulus orientation.
Fig. 4: Likelihood functions decoded by the trained full-likelihood decoders.
Fig. 5: Model performance.
Fig. 6: Attribution analysis for means and standard deviations of the likelihood functions.

Data availability

All figures except for Fig. 1 and Extended Data Fig. 4 were generated from raw data or processed data. The data generated and/or analyzed during this study are available from the corresponding author upon reasonable request. No publicly available data were used in this study.

Code availability

Code used for modeling and training the DNNs, as well as for figure generation, can be viewed and downloaded from https://github.com/eywalker/v1_likelihood. All other code used for analysis, including data selection and decision model fitting, can be found at https://github.com/eywalker/v1_project. Finally, code used for electrophysiology data processing can be found on the Tolias lab GitHub organization page (https://github.com/atlab).




The research was supported by a National Science Foundation Grant (no. IIS-1132009 to W.J.M. and A.S.T.), a DP1 EY023176 Pioneer Grant (to A.S.T.) and grants from the US Department of Health & Human Services, National Institutes of Health, National Eye Institute (nos. F30 EY025510 to E.Y.W., R01 EY026927 to A.S.T. and W.J.M., and T32 EY00252037 and T32 EY07001 to A.S.T.) and National Institute of Mental Health (no. F30 MH088228 to R.J.C.). We thank F. Sinz for helpful discussion and suggestions on the DNN fitting to likelihood functions. We also thank T. Shinn for assistance in the behavioral training of the monkeys and experimental data collection.

Author information




All authors designed the experiments and developed the theoretical framework. R.J.C. programmed the experiment. R.J.C. trained the first monkey, and R.J.C. and E.Y.W. recorded data from this monkey. E.Y.W. trained and recorded from the second monkey. E.Y.W. performed all data analyses. E.Y.W. wrote the manuscript, with contributions from all authors. W.J.M. and A.S.T. supervised all stages of the project.

Corresponding authors

Correspondence to Edgar Y. Walker or Wei Ji Ma or Andreas S. Tolias.

Ethics declarations

Competing interests

E.Y.W. and A.S.T. hold equity ownership in Vathes LLC, which provides development and consulting for the open source software (DataJoint) used to develop and operate a data analysis pipeline for this publication.

Additional information

Peer review information Nature Neuroscience thanks Jan Drugowitsch and Robbe Goris for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Number of trials per contrast-session.

Each point corresponds to a single contrast-session, depicting the number of trials performed at the particular contrast.

Extended Data Fig. 2 Example decoded likelihood functions.

Example decoded likelihood functions under Full-Likelihood, Poisson-like and Independent-Poisson based decoders are shown for randomly selected trials from three distinct contrast-sessions from Monkey T.

Extended Data Fig. 3 Performance of the likelihood functions decoded by DNN-based decoders.

a, b, Results on independent Poisson population responses. a, KL divergence between the ground truth likelihood function and the likelihood function decoded with a trained DNN (\(D_{{\mathrm{DNN}}}\)) versus under the independent Poisson assumption (\(D_{{\mathrm{Poiss}}}\)). Each point is a single trial in the test set. The distributions of \(D_{{\mathrm{DNN}}}\) and \(D_{{\mathrm{Poiss}}}\) are shown at the top and right margins, respectively. The distribution of pairwise differences between \(D_{{\mathrm{DNN}}}\) and \(D_{{\mathrm{Poiss}}}\) is shown on the diagonal. b, Example likelihood functions. The ground truth (solid blue), independent-Poisson based (dotted orange), and DNN-based (dashed green) likelihood functions are shown for selected trials from the test set. Four random samples (columns) were drawn from the top, middle and bottom 1/3 of trials sorted by \(D_{{\mathrm{DNN}}}\) (rows). c, d, Same as in a, b but for simulated population responses drawn from a correlated Gaussian distribution whose variance scales with the mean.
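
The comparison in panel a can be illustrated with a minimal sketch, assuming Gaussian tuning curves and a small hypothetical population. It computes a normalized likelihood function under the independent Poisson assumption and the discrete KL divergence between two likelihood functions. The tuning parameters, the orientation grid and the `blurred` stand-in for a poorer decoder are assumptions made for illustration; the DNN decoder itself is not reproduced here.

```python
import math

def tuning(pref, s, gain=10.0, width=20.0, baseline=1.0):
    """Hypothetical Gaussian tuning curve: mean spike count at orientation s (deg)."""
    return baseline + gain * math.exp(-0.5 * ((s - pref) / width) ** 2)

def poisson_likelihood(r, prefs, grid):
    """Normalized likelihood over `grid` under the independent Poisson assumption."""
    log_l = []
    for s in grid:
        f = [tuning(p, s) for p in prefs]
        # log p(r|s) up to the term -sum(log r_i!), which does not depend on s
        log_l.append(sum(ri * math.log(fi) - fi for ri, fi in zip(r, f)))
    peak = max(log_l)  # subtract the maximum for numerical stability
    w = [math.exp(v - peak) for v in log_l]
    z = sum(w)
    return [v / z for v in w]

def kl_divergence(p, q, eps=1e-12):
    """Discrete KL divergence D(p || q) in nats."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)

grid = [float(s) for s in range(-45, 46)]        # orientation grid (deg), assumed
prefs = [-40.0 + 10.0 * i for i in range(9)]     # preferred orientations, assumed
r = [round(tuning(p, 5.0)) for p in prefs]       # noiseless response to s = 5 deg
truth = poisson_likelihood(r, prefs, grid)
blurred = [0.5 * t + 0.5 / len(grid) for t in truth]  # stand-in for a worse decoder
print(kl_divergence(truth, truth), kl_divergence(truth, blurred))
```

A decoder that matches the ground truth yields a divergence of zero, and any mismatch in shape or location increases it, which is why the per-trial divergence serves as a quality measure for the decoded likelihood function.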

Extended Data Fig. 4 Alternative relationships between the likelihood function and the decision.

Possible relationships between variables in the model are indicated by black arrows. We consider two scenarios: in a and c, the likelihood function \({\cal{L}}\) mediates the decision \(\hat C\); in b and d, the likelihood function does not mediate the decision. The gray arrow represents the trial-by-trial fluctuations in the subject’s decisions \(\hat C\) as predicted by the variable. a, b, When not conditioning on the stimulus \(s\), the stimulus can drive correlation among all variables, making it difficult to distinguish the two scenarios. c, d, When conditioning on the stimulus (red push pins), we expect correlation between \(\hat C\) and \({\cal{L}}\) only when \({\cal{L}}\) mediates the decision, allowing us to distinguish the two scenarios. The variable \(\mathbf{r}\) represents the responses of the recorded cortical population and \(\mathbf{r}_{\mathrm{all}}\) represents the responses of all recorded and unrecorded neurons.

Extended Data Fig. 5 Fixed-Uncertainty decoder.

a, A schematic of a DNN for the Fixed-Uncertainty decoder mapping \(\mathbf{r}\) to the decoded likelihood function \(\mathbf{L}\). For each contrast-session, the Fixed-Uncertainty decoder learns a single fixed-shape likelihood function \({\mathbf{L}}_0\) and a network that shifts \({\mathbf{L}}_0\) based on the population response. Therefore, all resulting likelihood functions share the same shape (uncertainty) but differ in the center location from trial to trial. b, Example decoded likelihood functions from randomly selected trials from a single contrast-session for both the Fixed-Uncertainty decoder and the Full-Likelihood decoder.
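
The defining property of the Fixed-Uncertainty decoder, a shared shape with a trial-specific center, can be sketched as follows. This is an illustration under stated assumptions, not the trained decoder: the template is taken to be Gaussian with a hypothetical width `sigma0`, and the per-trial centers are hard-coded rather than produced by a network from the population response.

```python
import math

def normalize(values):
    total = sum(values)
    return [v / total for v in values]

def shifted_template(grid, center, sigma0=6.0):
    """Fixed-shape template L0 (Gaussian here, an assumption) re-centered at `center`."""
    return normalize([math.exp(-0.5 * ((s - center) / sigma0) ** 2) for s in grid])

def mean_and_sd(grid, like):
    """Mean and standard deviation of a normalized likelihood function."""
    mu = sum(s * l for s, l in zip(grid, like))
    var = sum((s - mu) ** 2 * l for s, l in zip(grid, like))
    return mu, math.sqrt(var)

grid = [0.5 * i - 40.0 for i in range(161)]   # -40..40 deg in 0.5 deg steps
stats = []
for center in (-8.0, 0.0, 11.5):              # hypothetical per-trial shifts
    mu, sd = mean_and_sd(grid, shifted_template(grid, center))
    stats.append((mu, sd))
    print(f"center={center:+.1f}  mean={mu:+.2f}  sd={sd:.2f}")
```

Every shifted copy reports the same standard deviation, so any decision model built on this decoder has access to a point estimate but not to trial-to-trial changes in uncertainty.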

Extended Data Fig. 6 Fitted Bayesian decision maker parameters.

Each point corresponds to a single contrast-session, depicting the average fitted parameter value across 10 cross-validation training sets plotted against the contrast of the contrast-session. The solid line and error bars/shaded area depict the mean and the standard error of the mean of the parameter value for binned contrast values, respectively.

Extended Data Fig. 7 Model performance on decision predictions.

a, b, Model performance measured as the proportion of trials correctly predicted by the model as a function of contrast for four decision models based on different likelihood decoders (n=110,695 and n=192,630 total trials across all contrasts for Monkey L and T, respectively). On each trial, the class decision that would maximize the posterior \(P( {\hat C{\mathrm{|}}{\mathbf{r}}} )\) was chosen to yield a concrete classification prediction. c, d, Same as in a, b but with performance measured as the trial-averaged log likelihood of the model. For a, b and c, d, black dashed lines indicate the performance at chance (50% and \(\ln \left( {0.5} \right)\), respectively). e, f, The average trial-by-trial performance of the Full-Likelihood, Poisson-like and Independent Poisson Models is shown relative to the Fixed-Uncertainty Model across contrasts, measured as the average trial difference in the log likelihood (n=110,695 and n=192,630 total trials for Monkey L and T, respectively). Results are shown for the cross-validated datasets. All data points are means, and error bars/shaded areas indicate the standard error of the mean.
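
The two performance measures can be made concrete with a minimal sketch. It assumes two classes defined by zero-mean Gaussian orientation distributions with hypothetical standard deviations `sd1` and `sd2`, and it omits any additional parameters of the fitted decision models. The function `p_class1` turns a likelihood function into a class posterior, and `score` returns the proportion of choices matched by the maximum-posterior class together with the trial-averaged log likelihood.

```python
import math

def gauss(x, mu, sd):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def p_class1(grid, like, sd1=3.0, sd2=15.0, prior1=0.5):
    """Posterior probability of class 1 given a likelihood function over orientation."""
    ev1 = sum(l * gauss(s, 0.0, sd1) for s, l in zip(grid, like))  # sum_s L(s) p(s|C=1)
    ev2 = sum(l * gauss(s, 0.0, sd2) for s, l in zip(grid, like))  # sum_s L(s) p(s|C=2)
    return prior1 * ev1 / (prior1 * ev1 + (1.0 - prior1) * ev2)

def score(p1_list, choices):
    """Proportion of choices matched by the max-posterior class, and mean log likelihood."""
    hits = sum((p > 0.5) == (c == 1) for p, c in zip(p1_list, choices))
    ll = sum(math.log(p if c == 1 else 1.0 - p) for p, c in zip(p1_list, choices))
    return hits / len(choices), ll / len(choices)

grid = [float(s) for s in range(-60, 61)]     # orientation grid (deg), assumed

def like_at(mu, sd):
    """Hypothetical Gaussian-shaped likelihood function on the grid."""
    return [gauss(s, mu, sd) for s in grid]

narrow = p_class1(grid, like_at(7.0, 1.0))    # peak at 7 deg, low uncertainty
wide = p_class1(grid, like_at(7.0, 12.0))     # same peak, high uncertainty
print(narrow, wide)
print(score([narrow, wide, 0.5], [2, 1, 1]))  # toy choices, not recorded data
```

With these assumed class widths, the two likelihood functions share a peak yet favor different classes, which is the kind of information a point estimate alone cannot carry. A model that always outputs 0.5 scores exactly \(\ln(0.5)\) per trial, the chance level marked in c, d.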

Extended Data Fig. 8 Model performance based on population responses to different stimulus windows.

a, c, Average trial-by-trial performance of the Full-Likelihood Model relative to the Fixed-Uncertainty Model across contrasts, measured as the average trial difference in the log likelihood. The models were trained and evaluated on the population response to (a) the first half (0–250 ms, ‘fh’; n=110,816 and n=192,962 total trials for Monkey L and T, respectively) or (c) the second half (250–500 ms, ‘sh’; n=110,887 and n=192,980 total trials for Monkey L and T, respectively) of the stimulus presentation. The results for the original (unshuffled) and the shuffled data are shown in solid and dashed lines, respectively. The squares and triangles mark Monkey L and T, respectively. b, d, Relative model performance summarized across all contrasts based on models trained as described in (a, c). Performance on the original and the shuffled data is shown individually for both monkeys. The trial log likelihood difference between the two models was statistically significant for both stimulus windows, on both the original and the shuffled data, for both monkeys (two-tailed paired t-tests; Monkey L: \(t_{\mathrm{fh},\mathrm{original}}(110815) = 31.29\), \(t_{\mathrm{sh},\mathrm{original}}(110886) = 25.86\), \(t_{\mathrm{sh},\mathrm{shuffled}}(110886) = -6.98\); Monkey T: \(t_{\mathrm{fh},\mathrm{original}}(192961) = 18.48\), \(t_{\mathrm{fh},\mathrm{shuffled}}(192961) = -19.31\), \(t_{\mathrm{sh},\mathrm{original}}(192979) = 19.01\), \(t_{\mathrm{sh},\mathrm{shuffled}}(192979) = -20.17\); all with \(P < 10^{-9}\)), except for the shuffled data in the 0–250 ms window for Monkey L (\(t_{\mathrm{fh},\mathrm{shuffled}}(110815) = 1.89\) with \(P = 0.17\)). The difference between the Full-Likelihood Model on the original and the shuffled data was significant for both monkeys for both stimulus windows (two-tailed paired t-tests; Monkey L: \(t_{\mathrm{fh}}(110815) = 32.73\), \(t_{\mathrm{sh}}(110886) = 37.10\); Monkey T: \(t_{\mathrm{fh}}(192961) = 40.69\), \(t_{\mathrm{sh}}(192979) = 42.78\); all with \(P < 10^{-9}\)). All P values are Bonferroni corrected for the three comparisons. All data points are means, and error bars/shaded areas indicate the standard error of the mean.
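
The statistical procedure named in this caption, a two-tailed paired t-test followed by Bonferroni correction for three comparisons, can be sketched as below. The sketch replaces the t distribution with its normal approximation, which is an assumption that is reasonable only because the degrees of freedom here exceed 100,000; the toy inputs to `paired_t` are made up for illustration.

```python
import math

def paired_t(a, b):
    """Paired t statistic and degrees of freedom for per-trial values a and b."""
    d = [x - y for x, y in zip(a, b)]
    n = len(d)
    mean = sum(d) / n
    var = sum((x - mean) ** 2 for x in d) / (n - 1)   # unbiased sample variance
    return mean / math.sqrt(var / n), n - 1

def two_tailed_p_normal(t):
    """Two-tailed P value using the normal approximation to the t distribution."""
    return math.erfc(abs(t) / math.sqrt(2.0))

def bonferroni(p, n_comparisons):
    """Bonferroni-corrected P value, capped at 1."""
    return min(1.0, p * n_comparisons)

t, df = paired_t([1.0, 2.0, 4.0, 3.0, 5.0], [0.5, 2.5, 3.0, 2.0, 4.5])  # toy data
print(t, df)
p_corrected = bonferroni(two_tailed_p_normal(1.89), 3)  # t value quoted in the caption
print(p_corrected)
```

Applied to the quoted statistic of 1.89, this approximation gives a corrected P value between 0.17 and 0.18, in line with the non-significant result reported for the shuffled first-half data of Monkey L.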

Extended Data Fig. 9 Expected model performance on simulated data and observed effect of shuffling.

a, b, Using the trained Full-Likelihood Model as the ground truth to simulate the behavior, the expected performance of the model on the simulated data was assessed. a, Average trial-by-trial performance of the Full-Likelihood Model relative to the Fixed-Uncertainty Model across contrasts on the simulated data, measured as the trial-averaged difference in the log likelihood. The results for the unshuffled and the shuffled simulated data are shown in solid and dashed lines, respectively. The squares and triangles mark Monkey L and T, respectively. b, Relative model performance summarized across all contrasts. Results are shown for each monkey and for unshuffled and shuffled simulated data. For a and b, all data points are means, and error bars/shaded areas indicate the standard deviation across the 5 simulation repetitions. For b, data points for individual simulation repetitions are depicted by gray icons next to the error bars. c, The dependence of the width of the likelihood function \(\sigma _L\) on the stimulus orientation is depicted for an example contrast-session (Monkey T, 8% contrast, n=1,126 trials) on the original and the shuffled data. The shuffling procedure preserves the relationship between the average likelihood width and the stimulus orientation as desired. All data points are means, and error bars indicate the standard deviation across trials falling in the specific bin.
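
One way a shuffle can preserve the relationship between average likelihood width and stimulus orientation is to permute widths only among trials that fall in the same orientation bin. The sketch below illustrates that idea and is not claimed to be the procedure used for the figure; the bin size, the seed, the function names and the simulated trials are all assumptions.

```python
import random
from collections import defaultdict

def bin_index(s, bin_size):
    """Index of the orientation bin containing s (deg)."""
    return int(s // bin_size)

def shuffle_within_bins(orientations, widths, bin_size=5.0, seed=0):
    """Permute likelihood widths across trials that share an orientation bin."""
    rng = random.Random(seed)
    bins = defaultdict(list)
    for i, s in enumerate(orientations):
        bins[bin_index(s, bin_size)].append(i)
    shuffled = list(widths)
    for idx in bins.values():
        permuted = idx[:]
        rng.shuffle(permuted)
        for src, dst in zip(idx, permuted):
            shuffled[dst] = widths[src]
    return shuffled

def mean_width_per_bin(orientations, widths, bin_size=5.0):
    """Average likelihood width in each orientation bin."""
    groups = defaultdict(list)
    for s, w in zip(orientations, widths):
        groups[bin_index(s, bin_size)].append(w)
    return {b: sum(ws) / len(ws) for b, ws in sorted(groups.items())}

rng = random.Random(1)
orientations = [rng.uniform(-20.0, 20.0) for _ in range(200)]   # simulated trials
widths = [4.0 + 0.1 * abs(s) + rng.gauss(0.0, 0.5) for s in orientations]
shuffled = shuffle_within_bins(orientations, widths)
print(mean_width_per_bin(orientations, widths))
print(mean_width_per_bin(orientations, shuffled))
```

The per-bin averages are identical before and after shuffling, while the pairing between an individual trial and its width is broken, so any remaining predictive power cannot come from trial-specific fluctuations.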

Extended Data Fig. 10 Contributions of multi-units to the total attribution.

a, For each contrast-session, the multi-units were ordered from the largest to the smallest attribution to the likelihood mean \(A_\mu\), and the cumulative attribution over the total of 96 multi-units was plotted (thin gray lines, n=545 total contrast-sessions from Monkey L and T). The average cumulative attribution over all contrast-sessions is depicted by the thick black lines. The results are shown for each attribution method separately. b, Same as in a, but for attribution to the likelihood standard deviation \(A_\sigma\).
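
The cumulative curve described here can be sketched in a few lines. The sketch assumes that units are ranked by the magnitude of their attribution and that the curve is normalized by the total, neither of which is stated in the caption; the five example values stand in for the 96 multi-units.

```python
def cumulative_attribution(attributions):
    """Cumulative share of total absolute attribution, units sorted largest first."""
    magnitudes = sorted((abs(a) for a in attributions), reverse=True)
    total = sum(magnitudes)
    running, curve = 0.0, []
    for m in magnitudes:
        running += m
        curve.append(running / total)
    return curve

curve = cumulative_attribution([0.5, -2.0, 0.25, 1.0, -0.25])  # hypothetical values
print(curve)  # → [0.5, 0.75, 0.875, 0.9375, 1.0]
```

A curve that rises steeply and then flattens indicates that a few units carry most of the attribution, whereas a curve close to the diagonal indicates that the contribution is spread across the population.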

Cite this article

Walker, E.Y., Cotton, R.J., Ma, W.J. et al. A neural basis of probabilistic computation in visual cortex. Nat Neurosci 23, 122–129 (2020). https://doi.org/10.1038/s41593-019-0554-5
