Abstract
The human capacity to recognize complex visual patterns emerges in a sequence of brain areas known as the ventral stream, beginning with primary visual cortex (V1). We developed a population model for mid-ventral processing, in which nonlinear combinations of V1 responses are averaged in receptive fields that grow with eccentricity. To test the model, we generated novel forms of visual metamers, stimuli that differ physically but look the same. We developed a behavioral protocol that uses metameric stimuli to estimate the receptive field sizes in which the model features are represented. Because receptive field sizes change along the ventral stream, our behavioral results can identify the visual area corresponding to the representation. Measurements in human observers implicate visual area V2, providing a new functional account of neurons in this area. The model also explains deficits of peripheral vision known as crowding, and provides a quantitative framework for assessing the capabilities and limitations of everyday vision.
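The pooling idea in the abstract, in which nonlinear combinations of V1-like responses are averaged in receptive fields that grow with eccentricity, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the function names, the hard-edged circular window, the multiplicative stand-in feature, the `pixels_per_degree` value and the `scaling` value are hypothetical. The sketch does not attempt to reproduce the model's actual pooling windows or statistics.

```python
import numpy as np


def pooling_diameter(eccentricity_deg, scaling):
    """Receptive field diameter in degrees, assumed proportional to eccentricity."""
    return scaling * np.asarray(eccentricity_deg, dtype=float)


def pooled_mean(feature_map, center_rc, diameter_px):
    """Average a 2-D feature map inside a hard-edged circular window."""
    rows, cols = np.indices(feature_map.shape)
    dist = np.hypot(rows - center_rc[0], cols - center_rc[1])
    mask = dist <= diameter_px / 2.0
    return feature_map[mask].mean()


# Stand-in for a nonlinear combination of local responses:
# the product of each pixel with its horizontal neighbour (hypothetical feature).
rng = np.random.default_rng(0)
image = rng.standard_normal((256, 256))
feature = image * np.roll(image, 1, axis=1)

pixels_per_degree = 4.0  # assumed display resolution
scaling = 0.5            # assumed ratio of diameter to eccentricity
fixation_rc = (128, 128)

for ecc in (2.0, 8.0, 14.0):
    # Place the pooling region to the right of fixation at this eccentricity.
    center = (fixation_rc[0], fixation_rc[1] + int(ecc * pixels_per_degree))
    diam_px = float(pooling_diameter(ecc, scaling)) * pixels_per_degree
    value = pooled_mean(feature, center, diam_px)
    print(f"eccentricity {ecc:4.1f} deg: diameter {diam_px:5.1f} px, pooled mean {value:+.3f}")
```

Under these assumptions, two images whose pooled values match in every region would be indistinguishable to the sketch, which is the sense in which a larger `scaling` discards more information in the periphery.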
Acknowledgements
We would like to thank R. Rosenholtz for early inspiration and discussions regarding the relationship between texture and crowding, N. Rust for discussions about the nature of information represented in the ventral stream, C. Anderson for discussions about the scaling of receptive fields with eccentricity, M. Landy, A. Girshick and R. Goris for advice on experimental design, C. Ekanadham and U. Rajashaker for advice on the model and analysis, and D. Ganguli, D. Heeger, J. McDermott, E. Merriam and C. Ziemba for comments on the initial manuscript. This work was supported by a National Science Foundation Graduate Student Fellowship to J.F. and a Howard Hughes Medical Institute Investigatorship to E.P.S.
Author information
Contributions
J.F. and E.P.S. conceived the project and designed the experiments. J.F. implemented the model, performed the experiments and analyzed the data. J.F. and E.P.S. wrote the manuscript.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Supplementary information
Supplementary Text and Figures
Supplementary Analysis and Supplementary Figures 1 and 2 (PDF 1527 kb)
About this article
Cite this article
Freeman, J., Simoncelli, E. Metamers of the ventral stream. Nat Neurosci 14, 1195–1201 (2011). https://doi.org/10.1038/nn.2889
DOI: https://doi.org/10.1038/nn.2889