Monocular depth perception from optical flow by space-time signal processing

Abstract
A theory of monocular depth determination is presented. The effect of finite temporal resolution is incorporated by generalizing the Marr-Hildreth edge detection operator $-\nabla^{2}G(r)$, where $\nabla^{2}$ is the Laplacian and $G(r)$ is a two-dimensional Gaussian. The constraint that the edge detection operator in space-time should produce zero-crossings at the same place in different channels, i.e. at different resolutions of the Gaussian, leads to the conclusion that the Marr-Hildreth operator should be replaced by $-\square^{2}G(r,t)$, where $\square^{2}$ is the d'Alembertian $\nabla^{2}-(1/u^{2})(\partial^{2}/\partial t^{2})$ and $G(r,t)$ is a Gaussian in space-time. To ensure that the locations of the zero-crossings are independent of the channel width, $G(r,t)$ has to be isotropic in the sense that the velocity $u$ appearing in the definition of the d'Alembertian must also be used to relate the scales of length and time in $G$. However, the new operator $-\square^{2}G(r,t)$ produces two types of zero-crossing for each isolated edge feature in the image $I(r,t)$. One of these, termed the 'static edge', corresponds to the position of the image edge at time $t$ as defined by $\nabla^{2}I(r,t) = 0$; the other, called a 'depth zero', depends only on the relative motion of the observer and object and is usually found only in the periphery of the field of view. When an edge feature is itself in the periphery of the visual field and these zeros coincide, there is an additional cross-over effect. It is shown how these zero-crossings may be used to infer the depth of an object when the observer and object are in relative motion. If an edge feature is near the centre of the image (i.e. near the focus of expansion), the spatial and temporal slopes of the zero-crossings at the static edge may be used to infer the depth, but, if the edge feature is in the periphery of the image, the cross-over effect enables the depth to be obtained immediately. While the former utilizes sharp spatial and temporal resolution to give detailed three-dimensional information, the cross-over effect relies on longer integration times to give a direct measure of the time-to-contact. We propose that both mechanisms could be used to extract depth information in computer vision systems, and we speculate on how our theory could be used to model depth perception in early visual processing in humans, where there is evidence both of monocular perception of the environment in depth and of looming detection in the periphery of the field of view. In addition, it is shown how a number of previous models are included in our theory, in particular the directional sensor proposed by Marr & Ullman and a method of depth determination proposed by Prazdny.
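
As a concrete illustration, a minimal sketch of the operator follows, assuming the isotropic space-time Gaussian takes the separable form implied by the scaling described above; the channel-width symbol $\sigma$ and the omission of the normalization constant are assumptions of this sketch rather than details taken from the paper:

$$
G(r,t) \;\propto\; \exp\!\left(-\frac{|r|^{2} + u^{2}t^{2}}{2\sigma^{2}}\right),
\qquad
\square^{2} \;=\; \nabla^{2} - \frac{1}{u^{2}}\frac{\partial^{2}}{\partial t^{2}},
$$

so that both the static edges and the depth zeros lie where the filtered image sequence satisfies

$$
-\square^{2}\,(G * I)(r,t) \;=\; 0,
$$

with $*$ denoting convolution over both space and time. Tying the temporal width of $G$ to its spatial width through the velocity $u$ is what keeps the zero-crossing locations independent of the channel width.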
