Two relatively independent areas of visual cognition research examine important aspects of visual object understanding: Object Recognition and Perceptual Categorization. These areas have focused on different aspects of the same problems, with surprisingly little overlap. Nevertheless, they have ultimately arrived at complementary conclusions regarding the computational bases of visual object understanding.
Traditionally, computational models in Object Recognition provide a detailed description of the format of object representations, whereas Perceptual Categorization models emphasize how representations are used to make decisions. Both image-based theories and exemplar-based theories have articulated how the same representations can be used to recognize, identify and categorize objects.
Although intuition suggests that object recognition is effortless regardless of changes in viewpoint, and that knowledge about object categories is abstract, there is much evidence to the contrary. Just as recognizing an object is influenced by particular stored views, categorizing an object is influenced by particular stored exemplars. Image-based and exemplar-based models are supported by behavioural, neurophysiological and functional imaging results. There is also some renewed support for abstraction, and new hybrid models attempt to integrate structural descriptions with image-based representations and to integrate abstract category representations with exemplar-based representations.
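The claim that categorization is influenced by particular stored exemplars can be made concrete with a minimal sketch of an exemplar-based classifier, written in the spirit of models such as the generalized context model. It is an illustration under stated assumptions, not the implementation used in any of the work summarized here: the function names, the city-block distance, the sensitivity parameter `c`, the attention `weights` and the toy exemplars are all hypothetical choices.

```python
import math


def similarity(x, y, c=1.0, weights=None):
    """Similarity that decays exponentially with attention-weighted city-block distance."""
    if weights is None:
        # Assumption: equal attention to every dimension unless told otherwise.
        weights = [1.0 / len(x)] * len(x)
    distance = sum(w * abs(a - b) for w, a, b in zip(weights, x, y))
    return math.exp(-c * distance)


def categorize(probe, exemplars, c=1.0, weights=None):
    """Return P(category | probe) from summed similarity to every stored exemplar."""
    evidence = {}
    for features, label in exemplars:
        evidence[label] = evidence.get(label, 0.0) + similarity(probe, features, c, weights)
    total = sum(evidence.values())
    return {label: s / total for label, s in evidence.items()}


# Hypothetical two-dimensional stimuli: two stored exemplars per category.
exemplars = [
    ((0.1, 0.2), "A"),
    ((0.2, 0.1), "A"),
    ((0.8, 0.9), "B"),
    ((0.9, 0.8), "B"),
]

probs = categorize((0.15, 0.15), exemplars, c=5.0)
print(probs)  # the probe lies near the stored "A" exemplars, so "A" is favoured
```

No prototype or rule is stored in this sketch: the category decision depends only on how similar the probe is to each remembered item, so adding or removing a single exemplar changes the outcome for nearby probes.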
Objects can be categorized at several levels of abstraction (for example, animal, mammal, cat, Abyssinian, Max). Some argue that basic-level categorization is the fundamental goal of vision, with identification relying on features other than object shape, whereas early tests of image-based theories emphasized discrimination at the subordinate level. Recently, image-based theorists have argued that categorization at all levels can be accomplished using image-based representations. Early work in Perceptual Categorization suggested that identification and categorization used distinct representations and processes, but recent evidence indicates that a common representational substrate can be used adaptively according to task demands.
Researchers in Object Recognition have traditionally discussed modularity of content: are there specific modules devoted to particular kinds of objects? The Perceptual Categorization literature has focused on debates regarding the modularity of memory systems: are there specific modules devoted to particular tasks, irrespective of object category? In both fields, claims of modularity have been disputed, primarily through demonstrations that non-modular models can account for the observed dissociations.
Traditionally, visual perception was thought to create the representational input to a conceptual system that identified or categorized objects in a linear fashion. Recently, more 'interactive' solutions have been proposed. The evidence indicates that there is an interaction between perception and conceptual knowledge, and that category learning can influence perceptual representations.
A new dynamic approach emphasizes the role of learning in most questions of interest in visual object understanding. Novices can demonstrate visual object understanding in ways that differ qualitatively from those of experts: for instance, people might initially categorize using rules but, with experience, start to retrieve exemplars from memory. Experience with certain categories leads to specialization in the visual system: for example, experts can process non-face objects such as cars, dogs, birds and novel objects in a manner similar to faces, recruiting the same brain areas and eliciting neural responses at the same latency.
Despite their historic differences, current theories of Object Recognition and Perceptual Categorization have begun to consider complementary problems and have converged on similar solutions. Ultimately, a complete account of visual object understanding will demand an integration of the best theoretical constructs from Object Recognition and Perceptual Categorization.
Visual object understanding includes processes at the nexus of visual perception and visual cognition. A traditional approach separates questions that are more associated with perception — how are objects represented by high-level vision — from questions that are more associated with cognition — how are objects identified, categorized and remembered. However, to understand the bridge between perception and cognition, it is fruitful to abandon any sharp distinction between perceptual and cognitive aspects of visual object understanding. We provide a selective review of research from both the Object Recognition and Perceptual Categorization literatures, highlighting relevant behavioural, neuropsychological, neurophysiological and theoretical research into the representations and processes that underlie visual object understanding in humans and primates.
Jennings, C. & Aamodt, S. Computational approaches to brain function. Nature Neurosci. 3, Suppl. 1160 (2000).
Schyns, P. G. Diagnostic recognition: task constraints, object information, and their interactions. Cognition 67, 147–179 (1998).
Schyns, P. G., Goldstone, R. L. & Thibaut, J. -P. The development of features in object concepts. Behav. Brain Sci. 21, 1–54 (1998). This paper brings together several lines of evidence in favour of a theory that category learning involves the flexible creation of new perceptual features.
Barsalou, L. -W. Perceptual symbol systems. Behav. Brain Sci. 22, 577–660 (1999). Argues against amodal theories of conceptual knowledge, but instead proposes a theory that abstract conceptual knowledge is grounded in perceptual experiences.
Goldstone, R. L. & Barsalou, L. W. Reuniting perception and conception. Cognition 65, 231–262 (1998).
Marr, D. Vision: A Computational Investigation into the Human Representation and Processing of Visual Information (Freeman, San Francisco, 1982).
Rosch, E. Cognitive representations of semantic categories. J. Exp. Psychol. Gen. 104, 192–233 (1975).
Fodor, J. A. Modularity of Mind (MIT Press, Cambridge, Massachusetts, 1983).
Goldstone, R. Influences of categorization on perceptual discrimination. J. Exp. Psychol. Gen. 123, 178–200 (1994).
Hintzman, D. L. Human learning and memory: connections and dissociations. Ann. Rev. Psychol. 41, 109–139 (1990).
Ashby, F. G. Multidimensional Models of Perception and Cognition (Lawrence Erlbaum, Hillsdale, New Jersey, 1992). A classic tutorial volume that brings together theories from mathematical psychology on categorization and identification, theories from psychometrics on similarity, choice and preference, and theories from visual perception. The unifying theoretical theme is that all assume probabilistic multidimensional representations of perceptual and cognitive information.
Biederman, I. Recognition-by-components: a theory of human image understanding. Psychol. Rev. 94, 115–147 (1987).
Hummel, J. E. & Biederman, I. Dynamic binding in a neural network for shape recognition. Psychol. Rev. 99, 480–517 (1992).
Biederman, I. & Gerhardstein, P. C. Recognizing depth-rotated objects: evidence and conditions for three-dimensional viewpoint invariance. J. Exp. Psychol. Hum. Percept. Perform. 19, 1162–1182 (1993). Defines the conditions under which a specific structural description theory (RBC) predicts depth invariance for object recognition.
Edelman, S. Computational theories of object recognition. Trends Cogn. Sci. 1, 296–304 (1997).
Biederman, I., Subramaniam, S., Bar, M., Kalocsai, P. & Fiser, J. Subordinate-level object classification reexamined. Psychol. Res. 62, 131–153 (1999).
Hummel, J. E. & Stankiewicz, B. J. Two roles for attention in shape perception: a structural desription model of visual scrutiny. Vis. Cogn. 5, 49–79 (1998). Presents a structural description model of object recognition that integrates both categorical and metric information to account for recognition at both the basic and subordinate levels.
Ullman, S. Aligning pictorial descriptions: an approach to object recognition. Cognition 32, 193–254 (1989).
Mel, B. SEEMORE: combining color, shape, and texture histogramming in a neurally inspired approach to visual object recognition. Neural Comput. 9, 777–804 (1997).
Pinker, S. Visual cognition: an introduction. Cognition 18, 1–63 (1984).
Poggio, T., Bülthoff, H. & Lee, S. W. in BMCV 2000 (Springer, Seoul, 2000).
Bülthoff, H. H. & Edelman, S. Psychophysical support for a two-dimensional view interpolation theory of object recognition. Proc. Natl Acad. Sci. USA 89, 60–64 (1992).
Bülthoff, H. H., Edelman, S. Y. & Tarr, M. J. How are three-dimensional objects represented in the brain? Cereb. Cortex 5, 247–260 (1995).
Logothetis, N. K. & Pauls, J. Psychophysical and physiological evidence for viewer-centered object representations in the primate. Cereb. Cortex 5, 270–288 (1995).
Ullman, S. & Basri, R. Recognition by linear combinations of models. IEEE PAMI 13, 992–1006 (1991).
Tarr, M. J. Rotating objects to recognize them: a case study of the role of viewpoint dependency in the recognition of three-dimensional objects. Psychon. Bull. Rev. 2, 55–82 (1995). Proposes and provides empirical support for a multiple-views-plus-rotation model in which view-specific representations are matched to percepts through a normalization procedure akin to mental rotation.
Edelman, S. Representation and Recognition in Vision (MIT Press, Cambridge, Massachusetts, 1999).
Poggio, T. & Edelman, S. A network that learns to recognize three-dimensional objects. Nature 343, 263–266 (1990).
Poggio, T. & Girosi, F. Regularization algorithms for learning that are equivalent to multilayer networks. Science 247, 978–982 (1990).
Riesenhuber, M. & Poggio, T. Hierarchical models of object recognition in cortex. Nature Neurosci. 2, 1019–1025 (1999). An image-based model from Object Recognition that combines and extends preceding models: early layers of the hierarchical model create representations that are size- and translation-invariant. These representations are matched to view-tuned image-based units, which in turn activate category and identity nodes.
Fukushima, K. Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biol. Cybern. 36, 193–202 (1980).
Schneider, R. & Riesenhuber, M. A Detailed Look at Scale and Translation Invariance in a Hierarchical Neural Model of Visual Object Recognition. CBCL Paper 218/AI Memo 2002-011 (Massachusetts Institute of Technology, Cambridge, Massachusetts, 2002).
Marsolek, C. Dissociable neural subsystems underlie abstract and specific object recognition. Psychol. Sci. 107, 111–118 (1999). Provides empirical support for a dual systems theory of object recognition that includes an abstract subsystem for categorization in the left hemisphere and an exemplar-specific subsystem for individuation in the right hemisphere.
Foster, H. G. & Gilson, S. J. Recognizing novel three-dimensional objects by summing signals from parts and views. Proc. R. Soc. Lond. B 269, 1939–1947 (2002).
Nosofsky, R. M. Attention, similarity, and the identification-categorization relationship. J. Exp. Psychol. Gen. 115, 39–61 (1986). Presents a unified exemplar-based model of categorization and identification, the generalized context model, by integrating the context model of categorization with the similarity-choice model of identification. Because of selective attention to dimensions, similarities can vary systematically between categorization and identification.
Op de Beeck, H., Wagemans, J. & Vogels, R. The effect of category learning on the representation of shape: dimensions can be biased, but not differentiated. J. Exp. Psychol. Gen. 132, 491–511 (2003). Provides new results that might temper theories of novel feature creation during category learning. Extensive category learning can bias the processing of already separable shape dimensions but cannot differentiate already integral shape dimensions.
Op de Beeck, H., Wagemans, J. & Vogels, R. Inferotemporal neurons represent low-dimensional configurations of parameterized shapes. Nature Neurosci. 4, 1244–1252 (2001). Responses of neurons in IT cortex are characterized as low-dimensional representations of complex shape; interestingly, these neurons often show more within-category discrimination than between-category discrimination, indicating a distributed representation that emphasizes exemplar-specific information rather than highlighting category-specific information.
Posner, M. I. & Keele, S. W. On the genesis of abstract ideas. J. Exp. Psychol. 77, 353–363 (1968).
Smith, J. D. & Minda, J. P. Distinguishing prototype-based and exemplar-based processes in dot-pattern category learning. J. Exp. Psychol. Learn. Mem. Cogn. 28, 800–811 (2002).
Knowlton, B. & Squire, L. R. The learning of categories: parallel brain systems for item memory and category knowledge. Science 262, 1747–1749 (1993).
Ashby, F. G. A stochastic version of general recognition theory. J. Math. Psychol. 44, 310–329 (2000).
Ashby, F. G., Alfonso-Reese, L. A., Turken, A. U. & Waldron, E. M. A formal neuropsychological theory of multiple systems in category learning. Psychol. Rev. 105, 442–481 (1998). Presents a neuropsychological theory of category learning that assumes competition between separate implicit procedural learning (mediated by the caudate nucleus and other structures) and verbal rule (mediated by the anterior cingulate, prefrontal cortex and other structures) systems.
Nosofsky, R. M. Choice, similarity, and the context theory of classification. J. Exp. Psychol. Learn. Mem. Cogn. 10, 104–114 (1984).
Hintzman, D. L. 'Schema abstraction' in a mutiple-trace memory model. Psych. Rev. 93, 411–428 (1986).
Nosofsky, R. M. Exemplar-based accounts of relations between classification, recognition, and typicality. J. Exp. Psychol. Learn. Mem. Cogn. 14, 700–708 (1988).
Kruschke, J. K. ALCOVE: an exemplar-based connectionist model of category learning. Psychol. Rev. 99, 22–44 (1992). A connectionist, exemplar-based model of category learning that combines the generalized context model of categorization with an error-driven learning rule to allow learning of selective attention to dimensions and of associations between exemplars and categories.
Lamberts, K. Information-accumulation theory of speeded categorization. Psychol. Rev. 107, 227–260 (2000). Presents a process model of categorization that accounts for both response probabilities and response times. The model assumes that perceptual features are sampled probabilistically over time according to their salience until sufficient information has been acquired to generate a response. Under time pressure, responses are driven by salient features. Under deliberate decisions, responses are driven by diagnostic features.
Nosofsky, R. M. in Rational Models of Cognition (eds Oaksford, M. & Chater, N.) (Oxford Univ. Press, Oxford, 1998).
Nosofsky, R. M. & Palmeri, T. J. An exemplar-based random walk model of speeded classification. Psychol. Rev. 104, 266–300 (1997).
Riesenhuber, M. & Poggio, T. Models of object recognition. Nature Neurosci. 3, Suppl., 1199–1204 (2000).
Tarr, M. J. & Vuong, Q. C. in Steven's Handbook of Experimental Psychology 3rd edn Vol. 1 (eds Pashler, H. & Yantis, S.) 287–314 (John Wiley & Sons, New York, 2002).
Logan, G. D. Toward an instance theory of automatization. Psychol. Rev. 95, 492–527 (1988).
Tjan, B. S. & Legge, G. E. The viewpoint complexity of an object recognition task. Vision Res. 38, 2335–2350 (1998).
Rosseel, Y. Mixture models of categorization. J. Math. Psychol. 46, 178–210 (2002).
Barsalou, L. W. in Building Object Categories (ed. Rakison, D.) (Erlbaum, Mahwah, New Jersey, in the press).
Tarr, M. J. & Bülthoff, H. H. Image-based object recognition in man, monkey and machine. Cognition 67, 1–20 (1998).
Booth, M. C. A. & Rolls, E. T. View-invariant representations of familiar objects by neurons in the inferior temporal visual cortex. Cereb. Cortex 8, 510–523 (1998).
Palmer, S., Rosch, E. & Chase, P. in Attention and Performance IX (eds Long, J. & Baddeley, A.) 135–151 (Lawrence Erlbaum, Hillsdale, New Jersey, 1981).
Tarr, M. J. & Bülthoff, H. H. Is human object recognition better described by geon-structural-descriptions or by multiple-views? J. Exp. Psychol. Hum. Percept. Perform. 21, 1494–1505 (1995).
Wallis, G. & Bülthoff, H. Learning to recognize objects. Trends Cogn. Sci. 3, 22–31 (1999).
Tarr, M. J. & Pinker, S. Mental rotation and orientation-dependence in shape recognition. Cogn. Psychol. 21, 233–282 (1989).
Tarr, M. J., Williams, P., Hayward, W. G. & Gauthier, I. Three-dimensional object recognition is viewpoint-dependent. Nature Neurosci. 1, 275–277 (1998). Tests a fundamental prediction from a specific structural description model (RBC), according to which simple volumes called geons are the building blocks of a system that leads to depth-invariant recognition. Several experiments using various tasks and stimulus conditions find that geon recognition is viewpoint-dependent.
Tarr, M. J. in Perception of Faces, Objects, and Scenes: Analytic and Holistic Processes (eds Peterson, M. A. & Rhodes, G.) 177–211 (Oxford Univ. Press, Oxford, 2003).
Nosofsky, R. & Palmeri, T. J. A rule-plus-exception model for classifying objects in continuous-dimension spaces. Psychon. Bull. Rev. 5, 345–369 (1998).
Homa, D. Abstraction of ill-defined form. J. Exp. Psychol. Hum. Learn. Mem. 4, 407–416 (1978).
Reed, S. K. Pattern recognition and categorization. Cogn. Psychol. 3, 382–407 (1972).
Medin, D. L. & Schaffer, M. M. Context theory of classification learning. Psychol. Rev. 85, 207–238 (1978).
Ashby, F. G. & Waldron, E. M. On the nature of implicit categorization. Psychon. Bull. Rev. 6, 363–378 (1999).
Perrett, D. I., Oram, M. W. & Ashbridge, E. Evidence accumulation in cell populations responsive to faces: an account of generalisation of recognition without mental transformations. Cognition 67, 111–145 (1998).
Tanaka, K. Inferotemporal cortex and object vision. Annu. Rev. Neurosci. 19, 109–139 (1996).
Tovee, M. J., Rolls, E. T. & Azzopardi, P. Translation invariance in the responses to faces of single neurons in the temporal visual cortical areas of the alert macaque. J. Neurophysiol. 72, 1049–1060 (1994).
Op de Beeck, H. & Vogels, R. Spatial sensitivity of macaque inferior temporal neurons. Comp. Neurol. 426, 505–518 (2000).
DiCarlo, J. J. & Maunsell, J. -H. R. Anterior inferotemporal neurons of monkeys engaged in object recognition can be highly sensitive to object retinal position. J. Neurophysiol. 89, 3264–3278 (2003).
Logothetis, N. K. & Sheinberg, D. L. Visual object recognition. Annu. Rev. Neurosci. 19, 577–621 (1996).
Sigala, N., Gabbiani, F. & Logothetis, N. K. Visual categorization and object representation in monkeys and humans. J. Cogn. Neurosci. 14, 187–198 (2002). Monkeys and humans learned novel object categories with feedback. Comparative fits of computational models to observed response probabilities indicated that neither monkeys nor humans learned categories by abstracting a prototype, but instead based categorization decisions either on similarity to exemplars or with respect to a linear decision boundary.
Sigala, N. & Logothetis, N. K. Visual categorization shapes feature selectivity in the primate temporal cortex. Nature 415, 318–320 (2002). Monkeys learned novel categories of objects with four varying feature dimensions, only two of which were diagnostic for categorization. Single-unit recordings of cells in IT cortex revealed neural responses sensitive to dimensional diagnosticity, consistent with the construct of dimensional selective attention found in models such as the generalized context model.
Vogels, R., Biederman, I., Bar, M. & Lorincz, A. Inferior temporal neurons show greater sensitivity to nonaccidental metric shape differences. J. Cogn. Neurosci. 15, 444–453 (2001).
Freedman, D. J., Riesenhuber, M., Poggio, T. & Miller, E. K. A comparison of primate prefrontal and temporal cortices during categorization. J. Neurosci. 23, 5235–5246 (2003). Monkeys learned to categorize 'dogs' and 'cats' that parametrically varied in shape. Neurons in IT cortex seem more involved in encoding object shape whereas neurons in PFC seem more involved in encoding object category, object memory and meaning.
Nosofsky, R. M., Gluck, M. A., Palmeri, T. J., McKinley, S. C. & Glauthier, P. Comparing models of rule-based classification learning: a replication and extension of Shepard, Hovland, and Jenkins (1961). Mem. Cogn. 22, 352–369 (1994).
Gauthier, I. & Palmeri, T. J. Visual neurons: categorization-based selectivity. Curr. Biol. 12, R282–284 (2002).
Kourtzi, Z., Erb, M., Grodd, W. & Bülthoff, H. H. Representation of the perceived 3D object shape in the human lateral occipital complex. Cereb. Cortex 13, 911–920 (2003).
James, T. W., Humphrey, G. K., Gati, J. S., Menon, R. S. & Goodale, M. A. Differential effects of viewpoint on object-driven activation in dorsal and ventral streams. Neuron 35, 793–801 (2002).
Gauthier, I. et al. BOLD activity during mental rotation and viewpoint-dependent object recognition. Neuron 34, 161–171 (2002).
Rosch, E., Mervis, C. B., Gray, W. D., Johnson, D. M. & Boyes-Braem, P. Basic objects in natural categories. Cogn. Psychol. 8, 382–439 (1976).
Jolicoeur, P., Gluck, M. & Kosslyn, S. M. Pictures and names: making the connection. Cogn. Psychol. 16, 243–275 (1984).
McClelland, J. L. & Rumelhart, D. E. Distributed memory and the representation of general and specific information. J. Exp. Psychol. Gen. 114, 159–188 (1985).
Stankiewicz, B. J. Empirical evidence for independent dimensions in the visual representation of three-dimensional shape. J. Exp. Psychol. Hum. Percept. Perform. 28, 913–932 (2002).
Tarr, M. J. & Bülthoff, H. H. Is human object recognition better described by geon structural descriptions or by multiple views? Comment on Biederman and Gerhardstein (1993). J. Exp. Psychol. Hum. Percept. Perform. 21, 1494–1505 (1995).
Hayward, W. G. & Williams, P. Viewpoint dependence and object discriminability. Psychol. Sci. 11, 7–12 (2000). Refutes a common prediction that object recognition should be viewpoint-invariant when an object set is highly discriminable. The results support the notion that differences in object geometry are more important than the difficulty of the categorization in terms of accounting for viewpoint effects.
Laeng, B., Zarrinpar, A. & Kosslyn, S. M. Do separate processes identify objects as exemplars versus members of basic-level categories? Evidence from hemispheric specialization. Brain Cogn. 53, 15–27 (2003).
Vuilleumier, P., Henson, R. N., Driver, J. & Dolan, R. J. Multiple levels of visual object constancy revealed by event-related fMRI of repetition priming. Nature Neurosci. 5, 491–499 (2002).
Tanaka, J., Luu, P., Weisbrod, M. & Kiefer, M. Tracking the time course of object categorization using event-related potentials. Neuroreport 10, 829–835 (1999).
Gauthier, I., Anderson, A. W., Tarr, M. J., Skudlarski, P. & Gore, J. C. Levels of categorization in visual recognition studied using functional magnetic resonance imaging. Curr. Biol. 7, 645–651 (1997).
Shepard, R. N., Hovland, C. I. & Jenkins, H. M. Learning and memorization of classifications. Psychol. Monogr. 75 (1961).
Johnson, K. E. & Mervis, C. B. Effects of varying levels of expertise on the basic level of categorization. J. Exp. Psychol. Gen. 126, 248–277 (1997).
Tanaka, J. W. & Taylor, M. Object categories and expertise: is the basic level in the eye of the beholder? Cogn. Psychol. 23, 457–482 (1991).
Tanaka, J. W. The entry point of face recognition: evidence for face expertise. J. Exp. Psychol. Gen. 130, 534–543 (2001). Supports the notion that people are experts with faces by showing that faces are more frequently identified, are equally quick to identity at the subordinate level (an individual) as the basic level (human), and that face recognition is not impaired by brief presentations in the way the recognition of other objects is.
Wong, A. C. -N. & Gauthier, I. The basic level as the entry point of expert letter recognition. Annu. Meet. Cogn. Neurosci. Soc. Abstr. C295 (2003)
Liu, J. & Kanwisher, N. Stages of processing in face perception: an MEG study. Nature Neurosci. 5, 910–916 (2002).
Bentin, S., Deouell, L. Y. & Soroker, N. Selective visual streaming in face recognition: evidence from developmental prosopagnosia. Neuroreport 10, 823–827 (1999).
Farah, M. J., Wilson, K. D., Drain, M. & Tanaka, J. W. What is 'special' about face perception? Psychol. Rev. 105, 482–498 (1998). An empirical and review paper on behavioural evidence supporting the hypothesis that faces are recognized in a more holistic fashion than other objects.
Palmeri, T. J. Exemplar similarity and the development of automaticity. J. Exp. Psychol. Learn. Mem. Cogn. 23, 324–354 (1997).
Wenger, M. J. & Ingvalson, E. M. A decisional component of holistic encoding. J. Exp. Psychol. Learn. Mem. Cogn. 28, 872–892 (2002). Describes the construct of holistic representation in terms of informational independence, informational separability and decisional separability adopted from general recognition theory; only informational independence and informational separability can be related to traditional notions of holistic representations. But only evidence for decisional separability was observed.
Schall, J. D. On building a bridge between brain and behavior. Annu. Rev. Psychol. (in the press).
Schmolesky, M. T. et al. Signal timing across the macaque visual system. J. Neurophysiol. 79, 3272–3278 (1998).
Tjan, B. Adaptive object representation with hierarchically-distributed memory sites. Adv. Neural Inf. Process. Syst. (in the press).
Ullman, S., Vidal-Naquet, M. & Sali, E. Visual features of intermediate complexity and their use in classification. Nature Neurosci. 5, 682–687 (2002). Provides computational evidence that image-based 'features' of intermediate complexity can be sufficient to categorize objects at the basic level.
Humphreys, G. W., Lamote, C. & Lloyd-Jones, T. -J. An interactive activation approach to object processing: effects of structural similarity, name frequency, and task in normality and pathology. Memory 3, 535–586 (1995).
Waldron, E. M. & Ashby, F. G. The effects of concurrent task interference on category learning: evidence for multiple category learning systems. Psychon. Bull. Rev. 8, 168–176 (2001).
Moscovitch, M., Winocur, G. & Behrmann, M. What is special about face recognition? Nineteen experiments on a person with visual object agnosia and dyslexia but normal face recognition. J. Cogn. Neurosci. 9, 555–604 (1997).
Kanwisher, N., Downing, P., Epstein, R. & Kourtzi, Z. in The Handbook on Functional Neuroimaging (eds Cabeza, R. & Kingstone, A.) 109–152 (MIT Press, Cambridge, Massachusetts, 2001).
Patalano, A. L., Smith, E. E., Jonides, J. & Koeppe, R. A. PET evidence for multiple strategies of categorization. Cogn. Affect. Behav. Neurosci. 1, 360–370 (2001).
Shallice, T. From Neuropsychology to Mental Structure (Cambride Univ. Press, Cambridge, 1988).
Plaut, D. C. Double dissociation without modularity: evidence from connectionist neuropsychology. J. Clin. Exp. Neuropsychol. 17, 291–321 (1995).
Nosofsky, R. & Zaki, S. Dissociations between categorization and recognition in amnesic and normal individuals: an exemplar-based interpretation. Psychol. Sci. 9, 247–255 (1998).
Kinder, A. & Shanks, D. R. Amnesia and the declarative/nondeclarative distinction: a recurrent network model of classification, recognition, and repetition priming. J. Cogn. Neurosci. 13, 648–669 (2001).
Haxby, J. V. et al. Distributed and overlapping representations of faces and objects in ventral temporal cortex. Science 293, 2425–2430 (2001).
Gauthier, I. What constrains the organization of the ventral temporal cortex? Trends Cogn. Sci. 4, 1–2 (2000).
Poldrack, R. -A. et al. Interactive memory systems in the human brain. Nature 29, 546–550 (2001). Provides evidence that both the medial temporal lobes (emphasized early in learning) and the basal ganglia (emphasized later in learning) are involved in category learning.
Palmeri, T. J. & Flanery, M. A. in The Psychology of Learning and Motivation vol. 41 (ed. Ross, B. H.) (Academic, San Diego, California, 2002). Critically evaluates recent evidence for single memory system versus multiple memory system models of categorization and old–new recognition memory.
Malach, R. et al. Object-related activity revealed by functional magnetic resonance imaging in human occipital cortex. Proc. Natl Acad. Sci. USA 92, 8135–8139 (1995).
Kanwisher, N., McDermott, J. & Chun, M. M. The fusiform face area: a module in human extrastriate cortex specialized for face perception. J. Neurosci. 17, 4302–4311 (1997).
Epstein, R. & Kanwisher, N. A cortical representation of the local visual environment. Nature 392, 598–601 (1998).
Tsao, D. Y., Freiwald, W. A., Knutsen, T. A., Mandeville, J. B. & Tootell, R. B. Faces and objects in macaque cerebral cortex. Nature Neurosci. 6, 989–995 (2003).
Squire, L. R. & Zola, S. M. Episodic memory, semantic memory, and amnesia. Hippocampus 8, 205–211 (1998).
Roediger III, H. L., Buckner, R. L. & McDermot, K. B. in Memory: Systems, Process, or Function? (eds Foster, J. K. & Jelicic, M.) 31–65 (Oxford Univ. Press, Oxford, 1999).
Squire, L. R. & Knowlton, B. Learning about categories in the absence of memory. Proc. Natl Acad. Sci. USA 92, 12470–12474 (1995).
Knowlton, B. J., Mangels, J. A. & Squire, L. R. A neostriatal habit learning system in humans. Science 273, 1399–1402 (1996).
Farah, M. J., Levinson, K. L. & Klein, K. L. Face perception and within-category discrimination in prosopagnosia. Neuropsychologia 33, 661–674 (1995).
Rolls, E. Memory systems in the brain. Annu. Rev. Psychol. 51, 599–630 (2000).
Palmeri, T. J. & Flanery, M. A. Learning about categories in the absence of training: profound amnesia and the relationship between perceptual categorization and recognition memory. Psychol. Sci. 10, 526–530 (1999).
Zaki, S. R., Nosofsky, R. M., Jessup, N. M. & Ungerzagt, F. W. Categorization and recognition performance of a memory-impaired group: evidence for single-system models. J. Int. Neuropsychol. Soc. 9, 394–406 (2003).
Zaki, S. R. & Nosofsky, R. M. A single-system interpretation of dissociations between recognition and categorization in a task involving object-like stimuli. Cogn. Affect. Behav. Neurosci. 1, 344–359 (2001).
Johansen, M. K. & Palmeri, T. J. Are there representational shifts during category learning? Cogn. Psychol. 45, 482–553 (2002).
Nosofsky, R. M. & Johansen, M. K. Exemplar-based accounts of 'multiple-system' phenomena in perceptual categorization. Psychon. Bull. Rev. 7, 375–402 (2000).
Smith, J. D. & Minda, J. P. Thirty categorization results in search of a model. J. Exp. Psychol. Learn. Mem. Cogn. 26, 3–27 (2000).
Dixon, M. J., Desmarais, G., Gojmerac, C., Schweizer, T. A. & Bub, D. N. The role of premorbid expertise on object identification in a patient with category-specific visual agnosia. Cogn. Neuropsychol. 19, 401–419 (2002). An elegant case study of a patient with category-specific agnosia. This paper indicates that a better knowledge of attributes for a category of expertise (in this case brass instruments) can facilitate learning of arbitrary associations between concepts and novel shapes.
Gauthier, I., James, T. W., Curby, K. M. & Tarr, M. J. The influence of conceptual knowledge on visual discrimination. Cogn. Neuropsychol. 20, 507–523 (2002).
James, T. W. & Gauthier, I. Brain areas engaged by involuntary access to novel conceptual information during visual judgments. Vision Res. (in the press).
Murtha, S., Chertkow, H., Beauregard, M. & Evans, A. The neural substrate of picture naming. J. Cogn. Neurosci. 11, 399–423 (1999).
Damasio, A. R. Descartes' Error: Emotion, Reason, and the Human Brain (G. P. Putnam's Sons, New York, 1994).
Barsalou, L. W., Solomon, K. O. & Wu, L. L. in Cultural, Typological, and Psychological Perspectives in Cognitive Linguistics: The Proceedings of the 4th Conference of the International Cognitive Linguistics Association Vol. 3. (eds Hiraga, M. K., Sinha, C. & Wilcox, S.) (John Benjamins, Amsterdam, 1999).
James, T. W. & Gauthier, I. Auditory and action semantic feature types activate sensory-specific perceptual brain regions. Curr. Biol. (in the press).
Diamond, R. & Carey, S. Why faces are and are not special: an effect of expertise. J. Exp. Psychol. Gen. 115, 107–117 (1986).
Allen, S. W. & Brooks, L. R. Specializing the operation of an explicit rule. J. Exp. Psychol. Gen. 120, 3–19 (1991).
Palmeri, T. J. & Nosofsky, R. M. Recognition memory for exceptions to the category rule. J. Exp. Psychol. Learn. Mem. Cogn. 21, 548–568 (1995).
Nosofsky, R. M., Palmeri, T. J. & McKinley, S. C. Rule-plus-exception model of classification learning. Psychol. Rev. 101, 53–79 (1994).
Gauthier, I., Skudlarski, P., Gore, J. C. & Anderson, A. W. Expertise for cars and birds recruits brain areas involved in face recognition. Nature Neurosci. 3, 191–197 (2000).
Gauthier, I. & Tarr, M. J. Unraveling mechanisms for expert object recognition: bridging brain activity and behavior. J. Exp. Psychol. Hum. Percept. Perform. 28, 431–446 (2002).
Gauthier, I., Curran, T., Curby, K. M. & Collins, D. Perceptual interference supports a non-modular account of face processing. Nature Neurosci. 6, 428–432 (2003). Provides evidence that expertise with cars can interfere with face processing, specifically at the level of holistic processing taking place as early as 170 ms after the stimulus.
Young, A. W., Hellawell, D. & Hay, D. Configural information in face perception. Perception 16, 747–759 (1987).
Wenger, M. J. & Townsend, J. T. in Computational, Geometric, and Process Perspectives on Facial Cognition: Contexts and Challenges (eds Wenger, M. J. & Townsend, J. T.) (Erlbaum, Hillsdale, New Jersey, 2001).
Maurer, D., Le Grand, R. & Mondloch, C. J. The many faces of configural processing. Trends Cogn. Sci. 6, 255–260 (2002).
Goldstone, R. L. Unitization during category learning. J. Exp. Psychol. Hum. Percept. Perform. 26, 86–112 (2000). Systematically explores the question of whether new perceptual units can be created during category learning.
Dailey, M. N. & Cottrell, G. W. Organization of face and object recognition in modular neural network models. Neural Netw. 12, 1053–1073 (1999).
Riesenhuber, M. & Poggio, T. in Biologically Motivated Computer Vision (eds Lee, S.-W., Bülthoff, H. H. & Poggio, T.) 1–9 (Springer, Berlin Heidelberg, 2000).
Erickson, M. A. & Kruschke, J. K. Rules and exemplars in category learning. J. Exp. Psychol. Gen. 127, 107–140 (1998).
Smith, E. E., Patalano, A. L. & Jonides, J. Alternative strategies of categorization. Cognition 65, 167–196 (1998).
Erickson, M. A. & Kruschke, J. K. Rule-based extrapolation in perceptual categorization. Psychon. Bull. Rev. 9, 160–168 (2002). An example of the recent theoretical and empirical research from Perceptual Categorization that proposes both exemplar-based and rule-based components to category learning.
Sloman, S. A. The empirical case for two systems of reasoning. Psychol. Bull. 119, 3–22 (1996).
Nosofsky, R. & Kruschke, J. K. Single-system models and interference in category learning: commentary on Waldron and Ashby (2001). Psychon. Bull. Rev. 9, 175–180 (2002).
Ashby, F. G., Waldron, E. M., Lee, W. W. & Berkman, A. Suboptimality in human categorization and identification. J. Exp. Psychol. Gen. 130, 77–96 (2001).
Spiridon, M. & Kanwisher, N. How distributed is visual category information in human occipito-temporal cortex? An fMRI study. Neuron 35, 1157–1165 (2002).
Downing, P. E., Jiang, Y., Shuman, M. & Kanwisher, N. A cortical area selective for visual processing of the human body. Science 293, 2470–2473 (2001).
Farah, M. J., Rabinowitz, C., Quinn, G. E. & Liu, G. T. Early commitment of neural substrates for face recognition. Cogn. Neuropsychol. 17, 117–124 (2000).
Morton, J. & Johnson, M. H. CONSPEC and CONLERN: a two-process theory of infant face recognition. Psychol. Rev. 98, 164–181 (1991).
Turati, C., Simion, F., Milani, I. & Umilta, C. Newborns' preference for faces: what is crucial? Dev. Psychol. 38, 875–882 (2002). Elegant set of studies with newborns which suggests that the bias to prefer an upright to an inverted face-like configuration can be explained by a preference for any pattern with more elements in its upper part.
Hasson, U., Levy, I., Behrmann, M., Hendler, T. & Malach, R. Eccentricity bias as an organizing principle for human high-order object areas. Neuron 34, 479–490 (2002). Provides support from fMRI for a centre–periphery model of occipito-temporal cortex, according to which objects requiring foveation (for example, faces) are associated with a centre-biased representation and objects requiring the integration of large-scale features (for example, buildings) are represented in part of the cortex that is periphery-biased.
Lerner, Y., Hendler, T., Ben-Bashat, D., Harel, M. & Malach, R. A hierarchical axis of object processing stages in the human visual cortex. Cereb. Cortex 11, 287–297 (2001).
Malach, R., Levy, I. & Hasson, U. The topography of high-order human object areas. Trends Cogn. Sci. 6, 176–184 (2002).
Gauthier, I., Tarr, M. J., Anderson, A. W., Skudlarski, P. & Gore, J. C. Activation of the middle fusiform 'face area' increases with expertise in recognizing novel objects. Nature Neurosci. 2, 568–573 (1999).
Rossion, B., Gauthier, I., Goffaux, V., Tarr, M. J. & Crommelinck, M. Expertise training with novel objects leads to left lateralized face-like electrophysiological responses. Psychol. Sci. 13, 250–257 (2002).
Tanaka, J. W. & Curran, T. A neural basis for expert object recognition. Psychol. Sci. 12, 43–47 (2001). The first ERP study to provide evidence for expertise effects (with dog and bird images) on the N170 face-selective potential.
Rousselet, G. A., Mace, M. J. & Fabre-Thorpe, M. Is it an animal? Is it a human face? Fast processing in upright and inverted natural scenes. J. Vis. 3, 440–455 (2003). Measures processing speed and the inversion effect for faces and non-face objects in the context of natural scenes. The evidence argues against the involvement of a special face module or a mental rotation mechanism, and supports the use of features of intermediate complexity.
Supported by grants from the NSF, NIMH, NEI and James S. McDonnell Foundation. The authors wish to thank M. J. Tarr and the members of the Perceptual Expertise Network (funded by the James S. McDonnell Foundation) for helpful discussions. We also thank M. Graf for detailed comments on an earlier version of this paper.
The authors declare no competing financial interests.
- IDENTIFICATION
A decision about an object's unique identity. Identification requires subjects to discriminate between similar objects and involves generalization across some shape changes as well as physical translation, rotation and so on.
- CATEGORIZATION
A decision about an object's kind. Categorization requires generalization across members of a class of objects with different shapes.
- RECOGNITION
A decision about whether an object has been seen before. We can recognize an object seen just moments before, as in many experiments from Object Recognition, or an object seen on an earlier occasion, as in many experiments from Perceptual Categorization and the memory literature. Recognition involves generalization across size, location, viewpoint and illumination.
- GEONS
(Geometric ions.) Simple viewpoint-independent volumetric primitives that are the building blocks of object representation for recognition-by-components theory.
- STRUCTURAL DESCRIPTION
A qualitative representation of an object in terms of its three-dimensional primitives (for example, 'geons') and their relative positions. Many structural descriptions are devoid of metric information regarding quantitative aspects of the primitives (specific shapes and sizes) and their positions (specific spatial locations).
A representation of an object that preserves much of the richness of the perceived two-dimensional image. It is viewpoint-specific, or represented in an egocentric frame of reference, and might contain information about illumination, colour and material (but is often proposed to be largely scale- and translation-invariant).
- VIEWPOINT-INDEPENDENT (OR DEPENDENT) PERFORMANCE
Behavioural performance that is invariant to viewing position and independent of experience with particular views is said to be viewpoint-independent. By contrast, viewpoint-dependent performance depends systematically on experience with specific views of an object.
Novel objects that, like faces, all share a common spatial configuration. Their features can be varied systematically to test aspects of object recognition and feature perception.
- BASIC LEVEL
The level at which object descriptions (both functional and perceptual attributes) maximize a combination of informativeness and distinctiveness. Typically, the basic level is the entry level of recognition. Exceptions include atypical category members (such as penguin, palm tree).
- ENTRY LEVEL
The first level of abstraction at which a perceived object triggers its representation in memory. Empirically it is the fastest level at which observers can verify that an object can be given a particular label at some level of the hierarchy (for example, canary, bird or animal).
- CASCADE MODELS
Cascade models are those in which the later stages of information processing can begin before the completion of earlier stages, unlike discrete models in which computations at any given stage are completed before the subsequent step is engaged.
- MODULARITY
A thesis concerning the structure of the mind that is based on special-purpose computational mechanisms termed 'modules'. Fodor (ref. 8) proposed that modules are innate, that they perform their operations on a specific input or domain (for example, faces or speech) and that their operations are informationally encapsulated (not accessible to any other module).
- PROSOPAGNOSIA
Originally defined as the inability to gain a sense of familiarity from known faces, prosopagnosia also now includes a deficit in the perception of faces. It typically occurs in the context of visual agnosia (a visual deficit in object recognition), and only a few cases have been suggested to present with a face-specific deficit.
Palmeri, T., Gauthier, I. Visual object understanding. Nat Rev Neurosci 5, 291–303 (2004). https://doi.org/10.1038/nrn1364