Many aspects of our perceptual experience are dominated by the fact that our two eyes point forward. Whilst the location of our eyes leaves the environment behind our head inaccessible to vision, co-ordinated use of our two eyes gives us direct access to the three-dimensional structure of the scene in front of us, through the mechanism of stereoscopic vision. Scientific understanding of the different brain regions involved in stereoscopic vision and three-dimensional spatial cognition is changing rapidly, with consequent influences on fields as diverse as clinical practice in ophthalmology and the technology of virtual reality devices.
This article is part of the themed issue ‘Vision in our three-dimensional world’.
Given that we live in a three-dimensional world, it might seem that our forward-facing eyes are in an inconvenient position, as this leaves us blind to a large part of the world immediately around us. Like those of other vertebrates, our eyes are constrained by their optical apparatus, so that a single eye is able to cover at most a hemisphere of visual space. The compound eyes of some insects are placed on stalks to increase the coverage of solid angle, but for many vertebrates the only available compensation is to place the eyes laterally. Each eye can then contribute independently by covering a different part of visual space. Within the viewing sphere around our heads, we humans are by comparison blind to what happens above or behind us.
The constraints on the visibility of three-dimensional space are seemingly even more severe, when we also take into account that the acuity of the human eye is highly non-uniform across the light-sensitive retina. The fovea of each eye is a small region of high acuity, comprising a segment of the viewing sphere with an apex angle of about 5°. This segment covers only 0.2% of the viewing sphere around us. In contrast, this same small region of high acuity is served by something like 30% of the primary visual cortex, and within the cortex serving this part of visual space, individual neurons display highly precise spatial acuities, up to the behavioural performance limit. The compensation for the narrow field of high resolution is that our eyes are highly mobile, with a sophisticated control system for rapidly coordinating the two foveae, so that they are directed at features of interest. One main advantage of co-ordinating these two high acuity foveal systems is the capacity to use both eyes in conjunction to give us a direct sense of depth.
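The 0.2% figure can be checked with a short calculation. A cone of half-angle θ subtends a solid angle of 2π(1 − cos θ), out of 4π steradians for the full sphere; in the sketch below (the function name and parameter choices are illustrative, not from the source), taking the quoted ~5° figure as the cone's half-angle reproduces the ~0.2% coverage.

```python
import math

def spherical_cap_fraction(half_angle_deg: float) -> float:
    """Fraction of the full viewing sphere covered by a cone of the
    given half-angle: solid angle 2*pi*(1 - cos(theta)) over 4*pi."""
    theta = math.radians(half_angle_deg)
    return (1.0 - math.cos(theta)) / 2.0

# Taking the ~5 degree figure as the cone's half-angle gives roughly
# the 0.2% coverage quoted for the fovea.
foveal_fraction = spherical_cap_fraction(5.0)
```

For comparison, the same formula with a 90° half-angle gives 0.5, i.e. the single-eye hemisphere mentioned above.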
(a) Discovery of stereopsis
In just over 20 years' time, it will be 200 years since Charles Wheatstone presented his invention of the stereoscope to the Royal Society and revealed this direct sense of depth. This marked a turning point in our understanding of how we perceive the world around us. Kant had proposed that the concept of space is a necessary precursor for the organization of knowledge. With Wheatstone's discovery, there was a break into this apparently impregnable fortress of spatial cognition. A specific component of spatial vision could be studied separately. Specifically, Wheatstone showed that small differences between images presented separately to the left and right eyes could be combined in vision to give an impression of three-dimensional structure, not present in the individual images.
The discovery had a profound effect. By 1867, Helmholtz devoted a whole chapter of his Handbook of Physiological Optics to ‘Perception of Depth’. This chapter title might appear to promise a general introduction to all aspects of depth perception but, in reality, most of the text is taken up with a discussion of binocular depth and the stereoscopic percept. At the same time, the stereoscope became a great device of popular entertainment through the nineteenth century. The significance of binocular vision continues to be debated from a philosophical standpoint. Meanwhile, the three-dimensional movie experience has now come out of specially equipped cinema theatres into the living room, much as the stereoscope did before it. Stereoscopic viewing is also a standard feature of virtual reality displays, devices whose mass-market popularity is about to be tested.
(b) Stereopsis for pattern recognition
Until the mid-twentieth century, binocular stereoscopic vision was regarded as a two-stage process, in which the brain initially identified spatial forms and objects in monocular pathways. These forms were combined at a binocular stage, at which disparity was extracted to deliver a sense of depth. The use of stereo cameras in aircraft reconnaissance pointed towards a different aspect of stereoscopic vision. This is the capacity of stereoscopic vision to break camouflage by directly revealing spatial form.
This concept was only fully articulated with the insights of Julesz [7,8]. He exploited modern computing technology to create stereograms with entirely random assignments of simple, primitive elements (see fig. 1 in ). This technique demonstrated clearly that pure statistical correlation between the left and right eyes' images was detectable in the absence of a luminance-defined spatial form. Structure within the binocular image-pair was sufficient by itself to break camouflage, even with monocular patterns that had a random and unstructured distribution of luminance in each eye's image. Julesz also articulated the correspondence problem: in a pair of binocular images with large numbers of similar elements, there are multiple possible matches between the features presented to the left eye and those to the right. The brain is confronted with a significant computational task in order to sift out false correspondences and arrive at a globally consistent solution for our three-dimensional percept.
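The random-dot construction is simple enough to sketch in a few lines. In this minimal version (function name and parameters are illustrative, not from the source), each eye's image is pure binary noise; a central square in the right eye's image is shifted horizontally by a few pixels, and the strip uncovered by the shift is refilled with fresh noise so that no monocular edge betrays the square. The square's shape then exists only in the binocular correlation between the two images.

```python
import numpy as np

def random_dot_stereogram(size=128, square=48, disparity=4, seed=0):
    """Minimal Julesz-style random-dot stereogram.

    Each eye sees binary noise; a central square region in the right
    eye's image is shifted left by `disparity` pixels, so the square
    is invisible in either image alone but is defined by the
    correlation between the pair."""
    rng = np.random.default_rng(seed)
    left = rng.integers(0, 2, (size, size))
    right = left.copy()
    lo, hi = (size - square) // 2, (size + square) // 2
    # Copy the central square into the right image, shifted leftwards.
    right[lo:hi, lo - disparity:hi - disparity] = left[lo:hi, lo:hi]
    # Refill the strip uncovered by the shift with fresh noise.
    right[lo:hi, hi - disparity:hi] = rng.integers(0, 2, (square, disparity))
    return left, right

left, right = random_dot_stereogram()
```

The correspondence problem is visible directly in this construction: every white dot in the left image has thousands of identical candidate matches in the right image, and only the globally consistent set of matches recovers the displaced square.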
(c) Neuronal mechanisms of stereoscopic vision
Understanding of stereoscopic vision at the neuronal level came from experiments in which single nerve cells were recorded from the visual cortex of anaesthetized animals [10,11]. Individual neurons are selective for a particular binocular depth of a simple, visual feature, with different neurons sensitive to different depths. This implies an organized set of neurons that sample not just the two-dimensional surface of the retina but also the three-dimensional volume of visual space. Recordings in awake, behaving animals confirmed these findings for the primary visual cortex (area V1) but also demonstrated a new set of properties of the neuronal coding of binocular depth in the various extrastriate visual areas. Here, neurons show sensitivity to the relative disparity between multiple visual features, decision-related firing and a tighter correlation with the perceptual qualities of stereoscopic vision. Most critically, there is evidence for a causal role of some of these neurons in the perception of binocular depth. Targeted intervention in the signals passing through extrastriate areas by means of focal electrical microstimulation [13–16] has a direct influence on perceptual judgments of binocular depth.
(d) Disorders of stereoscopic vision
Understanding the fundamentals of stereoscopic vision, particularly the neural mechanisms involved, is essential for the treatment of amblyopia, a disorder of binocular vision. Within every generation, some 2–4% of individuals cannot acquire depth from binocular inputs. These people often receive clinical treatment in order to try to persuade their eyes to work together. This may include surgical intervention for binocular vision, which although declining as a procedure remains the second-most common reason for elective surgery in childhood in the UK (England and Wales Hospital Episode Statistics, 2010–11, procedures C31.1–C33.6). With or without surgery, the caseload for optometric intervention and treatment is high, with a consequent financial demand on the health-care system. Often, the outcome of these treatments is a degree of binocular co-ordination of the eye movements, without any recovery of the ability to sense stereoscopic depth. It is now clear that amblyopia involves changes in cortical function outside of the primary visual cortex.
2. Current research
This special issue provides an overview of the current state of research activity in the field of three-dimensional vision. The articles range from neurophysiological and psychophysical studies of the fundamental mechanisms of binocular vision to modern forms of therapeutics for improving binocular vision and analysis of the integration of stereoscopic depth signals with other information about the structure of the three-dimensional world. Standing back from the details of the research and clinical findings, there are at least two reasons for wanting to engage in understanding binocular vision. The first is what this research can tell us about how the brain works, processing complex neural signals and extracting meaning from them. The second is cultural and commercial: virtual reality headsets and augmented reality systems are coming to the mass market now. Knowledge about human performance with these three-dimensional viewing aids is critical to getting the best out of these devices.
(a) Fundamental mechanisms of binocular vision
Biologically regarded, stereoscopic vision itself seems to be about prehension and predation. Read and her collaborators examine stereoscopic vision in the praying mantis. This is one of the few invertebrates that seems to have a functioning stereo system, which it utilizes for its infrequent strikes at prey. Several mammals and birds have developed elaborate neural mechanisms for processing stereo information that are primarily concerned with extracting depth and/or breaking camouflage. In some cases, this development may have been driven by an evolutionary advantage for predators, whilst in others prehension is a more likely driver, particularly for eye–hand co-ordination in primates. By comparison, the binocular visual system of rodents appears to be dominated by factors other than stereo and binocular fusion.
The energy model of disparity detection has become canonical for the first-line description of disparity-selective neurons in the primary visual cortex (area V1) of cats and monkeys. There appears now to be a convergence of thinking to update the energy model with a more flexible pooling model, in which the sensory signals from disparity-specific sub-units are collected together across a limited region of the visual field so that they converge on a single neuron in the primary visual cortex. Cumming et al. review the recent neurophysiological evidence that has led to this conclusion, whilst Ohzawa et al. present new experiments that reveal the internal organization of the sub-unit model. Parker et al. exploit the sub-unit model to estimate the cortical architecture responsible for setting the depth range over which stereoscopic disparities are processed and the spatial pooling of those disparity signals. As the paper by Ohzawa et al. demonstrates, this pooling has the capacity to generate a narrower, more precise representation of binocular disparity.
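The core of the energy model can be sketched in one dimension. In this minimal, illustrative version (filter parameters and function names are assumptions, not from the source), each binocular simple cell sums Gabor-filtered inputs from the two eyes, with the right eye's receptive field shifted by the cell's preferred disparity; the complex (energy) unit then sums the squared outputs of a quadrature pair of such simple cells, giving a response that peaks when the stimulus disparity matches the receptive-field shift.

```python
import numpy as np

def gabor(x, sigma=1.0, freq=0.5, phase=0.0):
    """1-D Gabor receptive-field profile."""
    return np.exp(-x**2 / (2 * sigma**2)) * np.cos(2 * np.pi * freq * x + phase)

def energy_response(left, right, x, dx=0.5):
    """Binocular energy unit: sum of squared outputs of a quadrature
    pair of simple cells, each combining left- and right-eye Gabor
    filter outputs, with the right-eye field shifted by dx (the
    unit's preferred disparity)."""
    resp = 0.0
    for phase in (0.0, np.pi / 2):          # quadrature pair
        s = left @ gabor(x, phase=phase) + right @ gabor(x - dx, phase=phase)
        resp += s**2
    return resp

# Disparity tuning: probe with a narrow bright bar presented at a
# range of binocular disparities; the response peaks near dx.
x = np.linspace(-4, 4, 201)
def bar(pos):
    return np.exp(-(x - pos)**2 / 0.02)     # narrow Gaussian "bar"

disparities = np.linspace(-2, 2, 41)
tuning = [energy_response(bar(-d / 2), bar(+d / 2), x, dx=0.5)
          for d in disparities]
```

The pooling models discussed in these papers replace the single quadrature pair here with a weighted sum of many such sub-units spread over a local region of the visual field.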
(b) Stereo vision in models of perception and decision-making
Binocular disparities can be controlled with great precision experimentally, so stereoscopic tasks have been used as the basis for testing models of the relationship between neuronal and behavioural events [13,14,23,24]. There are two distinct streams of work represented in this special issue. For the first, stereo has been used as a route to identify which populations of cortical neurons are responsible for driving perceptual decisions about the three-dimensional configuration of objects and which neuronal signals are relevant [25,26]. Krug et al. show that neurons in extrastriate cortical area V5/MT maintain high sensitivity to small changes in disparity across significant changes in the firing rate. Furthermore, the correlation of the neuron's response with the behavioural decisions of the animal is better predicted by the neuron's sensitivity to disparity than by the neuron's firing rate. Fujita and Doi also build on neurophysiological data. They suggest that the conscious perception of stereo depth, as reported by human observers, may depend on two separate components of the neuronal response identified in recordings from macaque monkeys.
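The coupling between a single neuron's firing and the animal's decision in this line of work is commonly quantified as a choice probability: the area under the ROC curve comparing the neuron's firing-rate distributions on trials grouped by the animal's choice. A minimal sketch on synthetic data (the rate values are invented for illustration):

```python
import numpy as np

def choice_probability(rates_choice_a, rates_choice_b):
    """Choice probability: ROC area for a neuron's firing-rate
    distributions sorted by the animal's perceptual choice.
    0.5 means firing is unrelated to choice; values above 0.5 mean
    higher rates accompany choice A."""
    a = np.asarray(rates_choice_a, float)
    b = np.asarray(rates_choice_b, float)
    # Normalized Mann-Whitney U: P(a > b) + 0.5 * P(a == b).
    greater = (a[:, None] > b[None, :]).mean()
    ties = (a[:, None] == b[None, :]).mean()
    return greater + 0.5 * ties

rng = np.random.default_rng(0)
# Synthetic neuron: slightly higher rates on "near" choices.
near_rates = rng.normal(52, 10, 400)
far_rates = rng.normal(48, 10, 400)
cp = choice_probability(near_rates, far_rates)
```

Identical distributions give a choice probability of exactly 0.5; the modest rate difference above yields a value around 0.6, of the same order as typically reported for extrastriate neurons.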
Another approach has been to consider how stereoscopic information is exploited in the two major cortical processing streams of the primate visual system. These are the dorsal and ventral visual pathways. Broadly regarded, the dorsal visual system has been associated with visuo-motor co-ordination and the perception of the organization of visual space, whilst the ventral visual pathway has been associated with pattern discrimination and object recognition, particularly the ability to generalize recognition across several viewpoints and locations in space. Very roughly, these capacities have been referred to as a ‘Where?-pathway’ for the dorsal stream and a ‘What?-pathway’ for the ventral stream. Also working with macaque monkeys, two papers examine neural coding of depth in conjunction with other cues: Janssen et al. analyse the processing of shape in the ventral visual stream, whilst DeAngelis et al. examine the role of motion parallax in the dorsal visual stream. This theme is also embedded within several other papers in this special issue.
Glennerster takes a radically different line on stereoscopic vision and three-dimensional spatial navigation, arguing that for many practical purposes a full three-dimensional representation of visual space is not essential. Glennerster proposes that navigation tasks can often be supported by representing them as a sequence of movements with an associated two-dimensional image for each step change in the movement sequence, thereby removing the need for a full three-dimensional model of the visual environment. Banks and Guan present findings that suggest that perceptual processing tends towards a metric representation of binocular depth that is presumed to be accurate across x, y and z co-ordinates, by identifying a novel form of depth constancy, analogous to contrast constancy seen in the luminance domain. Like other sources of visual information, stereoscopic disparity may be used for segregation of one region from another, as well as for integration within the bounds of a segregated region to improve depth resolution. Cammack and Harris investigate the factors governing the balance between these two processes.
(c) From basic to clinical science: stereoscopic vision and its failings
One paper points to the way in which modern optometric therapeutics is heading: by exercising the visual systems of amblyopes with a focussed and motivating binocular task, Levi et al. have successfully promoted the recovery of stereo vision. It remains to be seen whether this perceptual training is awakening a residual stereo capability that was present but somehow suppressed in function, or whether there is a genuine capacity in adult life for building neural connections that were not developed in childhood. However, this is an exciting time for new ideas about how to promote recovery in amblyopia.
Quite separately, clinical disturbances of vision following cortical lesions have been a long-standing route into evaluating the contribution of different parts of the cerebral cortex to visual processing. Two papers examine this issue from the perspective of binocular vision. Bridge reviews the field of research into cortical lesions that lead to losses of stereoscopic vision, whilst Murphy et al. present evidence of hemispheric lateralization in parietal cortex for the processing of binocular stereoscopic depth.
(d) Neural architectures for vision: where does stereo fit?
At first glance, the cortical organization for processing of stereoscopic information provides little evidence for specialization of particular cortical areas for binocular vision. As argued earlier, binocularity is fundamental to the nature of human vision, initially through the exertion of dual control over the left and right eyes to bring about binocular convergence and sensory fusion. This leads to a different notion of what cortical specialization might mean for binocular vision. Convergence of signals from right and left eyes occurs within V1, so almost all visual cortical neurons beyond V1 have some binocular properties. Consequently, most neurons in extrastriate visual cortex are to some degree sensitive to changes in stereoscopic depth.
One idea is that stereoscopic information is integrated with other sources of visual information towards the major functions of the dorsal and ventral processing streams [12,36]. Thus neural signals for stereoscopic vision may be found widely within the visual system, but each stream will exhibit a more detailed and differentiated exploitation of stereoscopic information, according to the primary function of each stream. One of the next goals for understanding the processing of stereoscopic information is to examine how different cortical areas interact with one another to perform these functions.
It is striking that the same goal is identifiable at the same time from other developments in neuroscience. The ‘Connectome’ projects for both human and macaque are approaching a stage at which the total number of areas in the cerebral neocortex is countable. Reaching this stage naturally also raises the same question of how these areas interact with one another, so the more general question about cortical function can be addressed by studies of stereoscopic perception, for which the relevant neuronal populations are already defined and the outcome in terms of perception is clearly and objectively defined. A more difficult and open question is the contribution of sub-cortical brain sites to binocular stereoscopic function, whether directly in terms of perception or indirectly through control of binocular eye movements. Given that some species have forms of binocular co-ordination that are driven by factors other than the requirement for high acuity stereopsis, neural circuitry outside the neocortex may be central for these functions. It is likely that humans and macaques have similar versions of these sub-cortical neural circuits, but whether they serve the same or different functions as the equivalent circuits in rodents is unclear at this stage. The next 20 years may succeed in resolving some of these questions at both the cortical and sub-cortical levels.
I declare I have no competing interests.
I received no funding for this study.
One contribution of 15 to a theme issue ‘Vision in our three-dimensional world’.
- Accepted April 7, 2016.
- © 2016 The Author(s)
Published by the Royal Society. All rights reserved.
Guest editor profile
Andrew J. Parker is currently Professor of Physiology and Fellow of St John's College at the University of Oxford. He studied Natural Sciences at the University of Cambridge, where he remained to complete his PhD. He moved to the University of Oxford with a Beit Memorial Fellowship and was appointed to the faculty in 1985. His research covers several aspects of spatial vision and the neuronal mechanisms of perceptual decisions. His present work concentrates on the neurophysiology and neuro-imaging of stereoscopic vision. He has held a Leverhulme Senior Research Fellowship and a Wolfson Merit Award from the Royal Society and presently holds a Presidential International Fellowship from the Chinese Academy of Sciences. His interests in education extend beyond his university teaching to a leading role in the foundation of the first English state school that offers three fully bilingual streams and teaches the European Schools’ curriculum.