Abstract
The visual recognition of three-dimensional (3-D) objects on the basis of their shape poses at least two difficult problems. First, there is the problem of variable illumination, which can be addressed by working with relatively stable features such as intensity edges rather than the raw intensity images [1,2]. Second, there is the problem of the initially unknown pose of the object relative to the viewer. In one approach to this problem, a hypothesis is first made about the viewpoint, then the appearance of a model object from such a viewpoint is computed and compared with the actual image [3–7]. Such recognition schemes generally employ 3-D models of objects, but the automatic learning of 3-D models is itself a difficult problem [8,9]. To address this problem in computational vision, we have developed a scheme, based on the theory of approximation of multivariate functions, that learns from a small set of perspective views a function mapping any viewpoint to a standard view. A network equivalent to this scheme will thus 'recognize' the object on which it was trained from any viewpoint.
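The idea of learning a map from arbitrary views to a standard view can be illustrated with a small numerical sketch. The abstract does not specify the approximation technique, so the Gaussian radial basis function interpolant below is an assumption, as are the vertex coordinates, the pinhole camera, the grid of training poses, the nearest-neighbour heuristic for the kernel width, and every function name. A view is represented as the flattened image coordinates of the object's vertices, with vertex correspondence assumed known.

```python
import numpy as np

def rotation(azimuth, elevation):
    """Rotate about the y axis by `azimuth`, then about the x axis by `elevation` (radians)."""
    ca, sa = np.cos(azimuth), np.sin(azimuth)
    ce, se = np.cos(elevation), np.sin(elevation)
    ry = np.array([[ca, 0.0, sa], [0.0, 1.0, 0.0], [-sa, 0.0, ca]])
    rx = np.array([[1.0, 0.0, 0.0], [0.0, ce, -se], [0.0, se, ce]])
    return rx @ ry

def perspective_view(vertices, azimuth, elevation, distance=5.0):
    """Flattened (x, y) image coordinates of the vertices under a pinhole camera."""
    p = vertices @ rotation(azimuth, elevation).T
    depth = distance - p[:, 2]  # camera on the +z axis, looking at the origin
    return (p[:, :2] / depth[:, None]).ravel()

def sq_dists(a, b):
    """Pairwise squared Euclidean distances between the rows of a and the rows of b."""
    return ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)

def fit(train_views, standard_view, ridge=1e-8):
    """Place one Gaussian unit on each training view; every target is the standard view."""
    d2 = sq_dists(train_views, train_views)
    off_diagonal = np.where(np.eye(len(d2), dtype=bool), np.inf, d2)
    sigma = np.sqrt(off_diagonal.min(axis=1)).mean()  # heuristic width (assumption)
    gram = np.exp(-d2 / (2.0 * sigma**2))
    targets = np.tile(standard_view, (len(train_views), 1))
    weights = np.linalg.solve(gram + ridge * np.eye(len(gram)), targets)
    return weights, sigma

def to_standard_view(view, train_views, weights, sigma):
    """Map one view vector to the network's estimate of the standard view."""
    k = np.exp(-sq_dists(view[None, :], train_views) / (2.0 * sigma**2))
    return (k @ weights)[0]

# Hypothetical six-vertex objects: the trained target and an unrelated distractor.
target = np.array([[1.0, 0.2, -0.5], [-0.8, 0.9, 0.3], [0.4, -1.0, 0.8],
                   [-0.3, -0.6, -0.9], [0.7, 0.8, 0.6], [-1.0, -0.1, 0.1]])
distractor = np.array([[0.1, 1.0, 0.4], [0.9, -0.7, -0.2], [-0.9, 0.3, -0.6],
                       [0.8, 0.5, 0.9], [-0.5, -0.9, 0.7], [0.2, 0.1, -1.0]])

# Train on a 6 x 6 grid of poses; the standard view is the frontal pose.
angles = np.linspace(-0.5, 0.5, 6)
train_views = np.array([perspective_view(target, a, e) for a in angles for e in angles])
standard = perspective_view(target, 0.0, 0.0)
weights, sigma = fit(train_views, standard)

# Present a pose that is inside the training range but not on the grid.
novel_pose = (0.02, -0.21)
same_error = np.linalg.norm(
    to_standard_view(perspective_view(target, *novel_pose), train_views, weights, sigma)
    - standard)
other_error = np.linalg.norm(
    to_standard_view(perspective_view(distractor, *novel_pose), train_views, weights, sigma)
    - standard)
print(f"trained object: {same_error:.4f}   distractor: {other_error:.4f}")
```

In this sketch, 'recognition' amounts to thresholding the distance between the network output and the standard view: a novel view of the trained object should map close to the standard view, while a view of a different object should not. Where to set that threshold is left open here.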
References
Marr, D. Vision (Freeman, San Francisco, 1982).
Poggio, T., Gamble, E. B. & Little, J. J. Science 242, 436–440 (1988).
Fischler, M. A. & Bolles, R. C. Commun. ACM 24, 381–395 (1981).
Thompson, D. W. & Mundy, J. L. in Proc. IEEE Conf. Robotics and Automation 208–220 (Raleigh, North Carolina, 1987).
Huttenlocher, D. P. & Ullman, S. in Proc. 1st Int. Conf. Computer Vision 102–111 (IEEE, Washington DC, 1987).
Lowe, D. G. Perceptual Organization and Visual Recognition (Kluwer Academic Publishers, Boston, Massachusetts, 1986).
Ullman, S. Cognition 32, 193–254 (1989).
Grimson, W. E. L. & Lozano-Perez, T. IEEE Trans. Pattern Analysis Machine Intell. 9, 469–482 (1987).
Fan, T. J., Medioni, G. & Nevatia, R. in Proc. 2nd Int. Conf. Computer Vision 474–481 (Florida, IEEE, Washington DC, 1988).
Tsai, R. Y. & Huang, T. S. IEEE Trans. Pattern Analysis Machine Intell. 6, 13–27 (1984).
Longuet-Higgins, H. C. Nature 293, 133–135 (1981).
Ullman, S. The Interpretation of Visual Motion (MIT Press, Cambridge, Massachusetts, 1979).
Koenderink, J. J. & van Doorn, A. J. Biol. Cybern. 32, 211–217 (1979).
Poggio, T. & Girosi, F. Artif. Intell. Lab. Memo No. 1,140 (Artificial Intelligence Laboratory, MIT, Cambridge, 1989).
Poggio, T. & Girosi, F. Science (in the press).
Tikhonov, A. N. & Arsenin, V. Y. Solutions of Ill-posed Problems (Winston, Washington DC, 1977).
Poggio, T., Torre, V. & Koch, C. Nature 317, 314–319 (1985).
Powell, M. J. D. in Algorithms for Approximation. (eds Mason, J. C. & Cox, M. G.) (Clarendon, Oxford, 1987).
Broomhead, D. S. & Lowe, D. Complex Syst. 2, 321–355 (1988).
Rumelhart, D. E., Hinton, G. E. & Williams, R. J. Nature 323, 533–536 (1986).
Perrett, D. I., Mistlin, A. J. & Chitty, A. J. Trends Neurosci. 10, 358–364 (1989).
Edelman, S. & Poggio, T. Optics News 15, 8–15 (1989).
Poggio, T. & Edelman, S. Artif. Intell. Lab. Memo No. 1,181 (Artificial Intelligence Laboratory, MIT, Cambridge, 1989).
Basri, R. & Ullman, S. Artif. Intell. Lab. Memo No. 1,152 (Artificial Intelligence Laboratory, MIT, Cambridge, 1989).
Rock, I. & DiVita, J. Cognitive Psychol. 19, 280–293 (1987).
Edelman, S., Bülthoff, H. & Weinshall, D. Artif. Intell. Lab. Memo No. 1,138 (Artificial Intelligence Laboratory, MIT, Cambridge, 1989).
Edelman, S. & Weinshall, D. Artif. Intell. Lab. Memo No. 1,146 (Artificial Intelligence Laboratory, MIT, Cambridge, 1989).
Jenkins, W. M., Merzenich, M. M. & Ochs, M. T. Soc. Neurosci. Abstr. 10, 665 (1984).
Edelman, G. M. & Finkel, L. in Dynamical Aspects of Neocortical Function (eds Edelman, G. M., Gall, W. E. & Cowan, W. M.) 653–695 (Wiley, New York, 1984).
Gross, C. G., Rocha-Miranda, C. E. & Bender, D. B. J. Neurophys. 35, 96–111 (1972).
Perrett, D. I., Rolls, E. T. & Caan, W. Expl Brain Res. 47, 329–342 (1982).
About this article
Cite this article
Poggio, T., Edelman, S. A network that learns to recognize three-dimensional objects. Nature 343, 263–266 (1990). https://doi.org/10.1038/343263a0
DOI: https://doi.org/10.1038/343263a0