Abstract
The human capacity to recognize complex visual patterns emerges in a sequence of brain areas known as the ventral stream, beginning with primary visual cortex (V1). We developed a population model for mid-ventral processing, in which nonlinear combinations of V1 responses are averaged in receptive fields that grow with eccentricity. To test the model, we generated novel forms of visual metamers, stimuli that differ physically but look the same. We developed a behavioral protocol that uses metameric stimuli to estimate the receptive field sizes in which the model features are represented. Because receptive field sizes change along the ventral stream, our behavioral results can identify the visual area corresponding to the representation. Measurements in human observers implicate visual area V2, providing a new functional account of neurons in this area. The model also explains deficits of peripheral vision known as crowding, and provides a quantitative framework for assessing the capabilities and limitations of everyday vision.
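The pooling idea in the abstract, in which feature responses are averaged within receptive fields that grow with eccentricity, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes linear growth of receptive field diameter with eccentricity, circular hard-edged windows, pixel units, and a random array standing in for a nonlinear V1-like feature map. The names `pooling_diameter`, `pooled_average`, and `scaling`, and the value 0.5, are hypothetical and chosen only for illustration.

```python
import numpy as np

def pooling_diameter(eccentricity, scaling):
    """Receptive field diameter, ASSUMED to grow linearly with eccentricity."""
    return scaling * np.asarray(eccentricity, dtype=float)

def pooled_average(feature_map, center, diameter):
    """Average a 2D feature map inside a circular window (all units in pixels)."""
    rows, cols = np.indices(feature_map.shape)
    dist = np.hypot(rows - center[0], cols - center[1])
    mask = dist <= diameter / 2.0  # the center pixel is always included
    return float(feature_map[mask].mean())

# Stand-in for a nonlinear V1-like feature map (squared random values).
rng = np.random.default_rng(0)
v1_feature = rng.random((64, 64)) ** 2

fixation = (32, 32)
for ecc in (8, 16, 24):  # eccentricity in pixels, along the horizontal meridian
    center = (fixation[0], fixation[1] + ecc)
    diameter = pooling_diameter(ecc, scaling=0.5)  # hypothetical scaling value
    print(ecc, float(diameter), pooled_average(v1_feature, center, diameter))
```

Under these assumptions, windows farther from fixation cover more pixels, so more local detail is discarded by averaging there. Two images that produce the same pooled values in every window would be indistinguishable to such a representation, which is the intuition behind using metamers to estimate the scaling.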
Acknowledgements
We would like to thank R. Rosenholtz for early inspiration and discussions regarding the relationship between texture and crowding, N. Rust for discussions about the nature of information represented in the ventral stream, C. Anderson for discussions about the scaling of receptive fields with eccentricity, M. Landy, A. Girshick and R. Goris for advice on experimental design, C. Ekanadham and U. Rajashekar for advice on the model and analysis, and D. Ganguli, D. Heeger, J. McDermott, E. Merriam and C. Ziemba for comments on the initial manuscript. This work was supported by a National Science Foundation Graduate Student Fellowship to J.F. and a Howard Hughes Medical Institute Investigatorship to E.P.S.
Author information
Contributions
J.F. and E.P.S. conceived the project and designed the experiments. J.F. implemented the model, performed the experiments and analyzed the data. J.F. and E.P.S. wrote the manuscript.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Supplementary information
Supplementary Text and Figures
Supplementary Analysis and Supplementary Figures 1 and 2 (PDF 1527 kb)
About this article
Cite this article
Freeman, J., Simoncelli, E. Metamers of the ventral stream. Nat Neurosci 14, 1195–1201 (2011). https://doi.org/10.1038/nn.2889
DOI: https://doi.org/10.1038/nn.2889