Three-dimensional object recognition is viewpoint dependent

Abstract

The human visual system is faced with the computationally difficult problem of achieving object constancy: identifying three-dimensional (3D) objects via two-dimensional (2D) retinal images that may be altered when the same object is seen from different viewpoints¹. A widely accepted class of theories holds that we first reconstruct a description of the object's 3D structure from the retinal image, then match this representation to a remembered structural description. If the same structural description is reconstructed from every possible view of an object, object constancy will be obtained. For example, in Biederman's² oft-cited recognition-by-components (RBC) theory, structural descriptions are composed of sets of simple 3D volumes called geons (Fig. 1), along with the spatial relations in which the geons are placed. Thus a mug is represented in RBC as a noodle attached to the side of a cylinder, and a suitcase as a noodle attached to the top of a brick. The attraction of geons is that, unlike more complex objects, they possess a small set of defining properties that appear in their 2D projections when viewed from almost any position (e.g., all three views of the brick in Fig. 1 include a straight main axis, parallel edges, and a straight cross section). According to the RBC theory, a complex object can therefore be recognized from its constituent geons, which can themselves be recognized from any viewpoint.
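The idea of a structural description can be made concrete with a small sketch. The Python below is only a toy illustration of the RBC claim summarized above, not a model taken from the article or from Biederman's work: the names `Relation`, `MEMORY`, and `recognize` are hypothetical, and the assumption that every view yields exactly the same description is the theory's premise, hard-coded here rather than computed from images.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """One spatial relation between two geons (hypothetical encoding)."""
    part: str      # geon that is attached, e.g. "noodle"
    position: str  # where it is attached, e.g. "side" or "top"
    base: str      # geon it is attached to, e.g. "cylinder"


# A structural description is modelled as a set of geon relations.
MEMORY = {
    "mug": frozenset({Relation("noodle", "side", "cylinder")}),
    "suitcase": frozenset({Relation("noodle", "top", "brick")}),
}


def recognize(description):
    """Return the stored object whose description matches exactly, else None."""
    for name, stored in MEMORY.items():
        if stored == description:
            return name
    return None


# Assumption (the RBC premise): the description reconstructed from a
# 0-degree view and from a 90-degree view of a mug is identical.
view_0 = frozenset({Relation("noodle", "side", "cylinder")})
view_90 = frozenset({Relation("noodle", "side", "cylinder")})
```

Because matching compares descriptions rather than images, `recognize(view_0)` and `recognize(view_90)` return the same answer in this sketch. That viewpoint invariance is the prediction the article's title disputes.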

Figure 1: The leftmost figure in each row was arbitrarily designated the 0° view; the other two figures represent 45° and 90° rotations of the objects in the depth plane.

Figure 2: Results of the psychophysical experiments.

References

  1. Marr, D. Vision: A Computational Investigation into the Human Representation and Processing of Visual Information (Freeman, San Francisco, 1982).

  2. Biederman, I. Psychol. Rev. 94, 115–147 (1987).

  3. Biederman, I. & Gerhardstein, P. C. J. Exp. Psychol. Hum. Percept. Perform. 19, 1162–1182 (1993).

  4. Hayward, W. G. & Tarr, M. J. J. Exp. Psychol. Hum. Percept. Perform. 23, 1511–1521 (1997).

  5. Jolicoeur, P. Mem. Cognit. 13, 289–303 (1985).

  6. Bülthoff, H. H. & Edelman, S. Proc. Natl. Acad. Sci. USA 89, 60–64 (1992).

  7. Humphrey, G. K. & Khan, S. C. Can. J. Psychol. 46, 170–190 (1992).

  8. Tarr, M. J. Psychonomic Bull. Rev. 2, 55–82 (1995).

  9. Tarr, M. J., Bülthoff, H. H., Zabinski, M. & Blanz, V. Psychol. Sci. 8, 282–289 (1997).

  10. Perrett, D. I. et al. Proc. R. Soc. Lond. B 223, 293–317 (1985).

  11. Logothetis, N. K., Pauls, J. & Poggio, T. Curr. Biol. 5, 552–563 (1995).

  12. Poggio, T. & Edelman, S. Nature 343, 263–266 (1990).

  13. Loftus, G. R. & Masson, M. E. J. Psychonomic Bull. Rev. 1, 476–490 (1994).

Acknowledgements

This research was supported by an Air Force Office of Scientific Research Grant. We thank Jay Servidea and Jaymz Rosoff for their assistance in running the psychophysical studies.

Author information

Correspondence to Michael J. Tarr.
