The stochastic accumulation framework provides a mechanistic, quantitative account of perceptual decision-making and how task performance changes with experimental manipulations. Importantly, it provides an elegant account of the speed–accuracy trade-off (SAT), which has long been the litmus test for decision models, and also mimics the activity of single neurons in several key respects. Recently, we developed a paradigm whereby macaque monkeys trade speed for accuracy on cue during a visual search task. Single-unit activity in frontal eye field (FEF) was not homomorphic with the architecture of models, demonstrating that stochastic accumulators are an incomplete description of neural activity under SAT. This paper summarizes and extends this work, further demonstrating that the SAT leads to extensive, widespread changes in brain activity never before predicted. We begin by reviewing our recently published work that establishes how spiking activity in FEF accomplishes SAT. Next, we provide two important extensions of this work. First, we report a new chronometric analysis suggesting that increases in perceptual gain with speed stress are evident in FEF synaptic input, implicating afferent sensory-processing sources. Second, we report a new analysis demonstrating selective influence of SAT on frequency coupling between FEF neurons and local field potentials. None of these observations correspond to the mechanics of current accumulator models.
1. The stochastic accumulator framework
The act of choice is a commitment to one course of action instead of other potential actions. The most effective choices are guided by a decision process in which the available evidence for each alternative is weighed. Decades of research, using behavioural analysis and computational modelling of manual or saccadic choice reaction times and accuracy rates, have led to the broad consensus that the decision process can be understood as a stochastic accumulation of evidence [1–4]. Sensory evidence sampled from the environment for each alternative is accumulated, and a choice is enacted when the first accumulation process reaches some criterion. Although there are several variants on this framework, all share several aspects: a baseline or starting level of each accumulator, which can be affected by biases or expectations; an accumulation rate determined by the strength or reliability of the evidence provided in the perceptual input; and a threshold level of accumulation mapped to a particular response (figure 1).
With just a few parameters, stochastic accumulator models account for a large proportion of behavioural variability: they predict the shapes of reaction time distributions and error rates, while simultaneously explaining how those distributions and error rates change as a function of various experimental manipulations. Of particular importance is the speed–accuracy trade-off (SAT), a universal and pervasive phenomenon that must be accommodated by any tenable decision model. According to stochastic accumulator models, SAT is achieved through a modification of the accumulation threshold: when set lower, response time (RT) is shortened due to the smaller excursion accumulators must traverse to terminate the decision process. However, this also reduces the amount of noise that can be averaged out of the accumulation process, thereby increasing the error rate. The reverse holds true for observers placed under accuracy stress. One should appreciate, however, that the model does not simply predict the difference in mean RT and error rate between speed and accuracy conditions. The elegance of the model is borne out in the fact that the shapes of participants' RT distributions change in specific ways, and stochastic accumulator models predict these shapes precisely [1,2,5,7–15]. Given the success of the framework, it is natural to investigate how the brain accomplishes these perceptual decisions. Will the form of neural processes parallel the form of the accumulators? Here, we review our recent work in non-human primates, suggesting that neural activity differs substantially in several ways from the architecture of current accumulator models. Next, we present several novel analyses that further substantiate our conclusion that psychological accumulator models do not capture the diverse reality of neural activity during decision formation.
These include chronometric analyses of single units, local field potentials (LFPs) and electroencephalogram (EEG), as well as the results of a time-frequency, spike-field coherence (SFC) analysis.
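The threshold account of SAT described above can be illustrated with a minimal simulation. The sketch below races two noisy accumulators (one fed by evidence for the correct alternative, one for an error) to a common threshold and compares a low threshold with a high one. It is a toy illustration only: the drift rates, noise level, thresholds and function names are assumptions chosen for clarity, not parameters fitted to any data discussed here.

```python
import random

def simulate_trial(threshold, rng, drift_correct=0.1, drift_error=0.0,
                   noise_sd=1.0, baseline=0.0, max_steps=5000):
    """Race two noisy accumulators; return (rt_in_steps, chose_correct)."""
    correct = error = baseline
    for step in range(1, max_steps + 1):
        # Each accumulator adds its mean evidence plus Gaussian noise.
        correct += drift_correct + rng.gauss(0.0, noise_sd)
        error += drift_error + rng.gauss(0.0, noise_sd)
        if correct >= threshold or error >= threshold:
            # The accumulator that is ahead at the crossing determines choice.
            return step, correct >= error
    return max_steps, correct >= error

def summarize(threshold, n_trials=1000, seed=1):
    """Mean RT (in steps) and error rate for a given threshold."""
    rng = random.Random(seed)
    trials = [simulate_trial(threshold, rng) for _ in range(n_trials)]
    mean_rt = sum(rt for rt, _ in trials) / n_trials
    error_rate = sum(1 for _, ok in trials if not ok) / n_trials
    return mean_rt, error_rate

# Hypothetical settings: a low threshold for speed stress, a high one for accuracy stress.
fast_rt, fast_err = summarize(threshold=5.0)
accurate_rt, accurate_err = summarize(threshold=30.0)
print(f"fast:     mean RT {fast_rt:.1f} steps, error rate {fast_err:.3f}")
print(f"accurate: mean RT {accurate_rt:.1f} steps, error rate {accurate_err:.3f}")
```

With these illustrative settings, the low threshold should yield shorter RTs and more errors than the high threshold, which is the qualitative pattern the framework attributes to speed versus accuracy stress.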
2. Brain regions and cell classes
The relationship between stochastic accumulation and neural activity is most well characterized in the macaque oculomotor system using tasks that require monkeys to make saccadic eye movements to indicate choice [16–20] (but see [21,22]). According to the framework, saccades occur once accumulated sensory evidence reaches a decision threshold. In a sense, the decision process can be likened to a transformation of sensory information into motor execution. It should thus not be surprising that the brain regions critical for saccadic decisions are also those identified with sensorimotor integration, including the frontal eye field (FEF), superior colliculus (SC) and lateral intraparietal area (LIP). These structures are composed of many functionally distinct cell classes [23–25]. The heterogeneity of neural activity during decision-making tasks has been highlighted in many recent studies [23,26–29]. Here, we will focus on two. Pre-saccadic movement neurons exhibit weak or no response to visual stimulation, but increase their firing rate in the period prior to a saccade (figure 2a). In comparison, visual neurons exhibit a vigorous burst of activity following the presentation of a stimulus falling within its receptive field (RF), but have no pre-saccadic modulation. As illustrated in figure 2b, visual neurons exhibit an initially non-selective visual response that evolves to discriminate target items (solid lines) from distractor items (dashed lines) presented in their RF.
Pre-saccadic movement neurons demonstrate several properties that suggest they embody the stochastic accumulation process. First, models predict that the accumulation rate should be proportional to the strength of sensory evidence. For instance, manipulations that alter the visibility of a critical stimulus should be best accounted for by a model allowing the rate parameter to vary across conditions. This has been validated in quantitative fits to human behaviour: manipulations that affect the strength or quality of perceptual information are best captured by models that allow accumulation rate to vary between conditions [10,30,31]. Similarly, the rate at which pre-saccadic neural activity in FEF [17,18,29,32], LIP [16,33,34] and SC [19,35] builds up prior to saccade varies monotonically with the strength of sensory evidence. Second, accumulator models suggest that all else being equal, variability in RT is due to the amount of time required for accumulation to reach a fixed threshold. In other words, accumulator models can capture wide variability in RT without the need for a variable threshold. This remains true even when conditions differ in difficulty, such as with a visual search1 set size manipulation [36,37]. This, too, is borne out in the pre-saccadic activity of movement neurons: for several types of tasks, the level of neural activity at decision is invariant over RT quantiles and task manipulations [16–19], particularly when monkeys cannot predict the nature of the upcoming trial. A small number of studies have shown changing neural threshold when task conditions are presented in blocks or cued trial-to-trial [38,39]. While less specific than single-unit neurophysiology, it is noteworthy that the covariation between model parameter and neural activity has been verified on the larger scale of brain networks in humans, using functional brain imaging [40–45], magnetoencephalography  and EEG [47,48] (for a review, see Bogacz et al. ).
Still more evidence is gleaned from countermanding tasks that require subjects to occasionally cancel a prepared saccade. Movement neurons in both SC and FEF are observed to initially increase but then decline in activity following a stop or change signal [27,49,50]. Crucially, whether or not monkeys can cancel a saccade depends on whether or not a threshold discharge rate was reached [51,52] (for a review, see [53,54]).
The identification of movement neurons with stochastic accumulators also follows from anatomical considerations. Because stochastic accumulators trigger overt choice at the moment the accumulation reaches threshold, so too must accumulator neurons trigger movement at some physiological threshold. Movement neurons are well suited for this, having direct projections to brainstem nuclei responsible for initiating eye movements [55–59].
In comparison, when tested under conditions of visual search, visually responsive neurons in FEF, SC and LIP represent the salience of stimuli presented in their RF through a modulation of firing rate [60–67] (but see [68,69], cf. ). Here, salience refers to both the physical conspicuousness of an item, such as luminance, contrast or status as a feature singleton (bottom-up salience), as well as its behavioural relevance determined by the requirements of the task (top-down salience). The union of top-down and bottom-up influences provides a viable mechanism for segregation of the visual field, a process that plays a central role in several behavioural models of visual search [71–74]. The evolution of the salience map can be quantified by comparing neural activity on trials when a target fell in the RF versus trials in which a distractor appeared in the RF. The time at which neural activity statistically discriminates target from non-target stimuli (the target selection time, TST) as well as the magnitude of discrimination has behavioural consequences. For instance, slower (faster) RTs are associated with a later (earlier) differentiation of target and non-target stimuli. Increasing the number of competitor stimuli both delays and reduces the difference between target and non-target stimuli [76,77], and errors tend to result when the salience map favours distractor over target stimuli [78,79]. Thus, the salience map represented by visual neurons may be the perceptual evidence that is input to stochastic accumulators leading to guided action.
3. Gated accumulator model
The above suggests a straightforward model to explain visual search: visual neurons represent the salience evidence that is accumulated by movement neurons, and saccadic decisions occur when movement neuron activity reaches a fixed threshold. To test this, we used a neurally constrained modelling approach . The model, depicted in figure 3, takes as input spike trains recorded from FEF visual neurons2 and produces as output predicted behaviour (RT and accuracy rate) as well as several key properties of FEF movement neurons. Specifically, the time-course of accumulation suggested by the best-fitting model closely matched the dynamics of real neural activity exhibited by FEF movement cells. The success of this model (and a subsequent extension ), constrained by neural data, indicates that the general framework is viable. To summarize, data stemming from computational modelling, behavioural experiments and single-unit electrophysiology consistently agree that stochastic accumulator models are more than a convenient approximation, but are realized in the neural mechanisms responsible for perceptual decision-making.
4. Speed–accuracy trade-off with non-human primates
One gap in the above is that changes in SAT have never been observed in non-human species. As a consequence, the linchpin observation of decreasing threshold with increasing speed stress has not been validated in single-unit responses. The identification of FEF movement neurons with stochastic accumulators leads to an exceptionally clear prediction: the threshold neural activity—the spiking output in the moments prior to saccade—should vary with SAT condition, such that neural threshold is highest under accuracy stress and lowest under speed stress, consistent with the consensus derived from behavioural and modelling studies described above. To test this hypothesis directly, we trained monkeys to alter SAT settings on cue  and recorded single neuron activity from movement and visual neurons in FEF.
Monkeys performed visual search for a target item presented among seven distractor items (figure 4a). SAT conditions (fast, neutral3 and accurate) were presented in short blocks of 10–20 trials each, cued only by the colour of the fixation point. The speed condition placed more emphasis on fast responding than correct responding through several reward and punishment (timeout) contingencies. Monkeys earned juice reward for responding correctly, but only if their RT met a response deadline predetermined through pilot testing (see ). If monkeys met the response deadline, but chose incorrectly, they were not rewarded but were also not penalized. Responses of any type that did not meet the deadline were followed by a long 4 s timeout. The converse was true for the accuracy condition: monkeys were rewarded for slow, correct responses, but were given timeout if their response was incorrect. We also included a neutral condition that had no response deadline. The use of response deadlines to control RT is highly effective and follows a long tradition in human SAT research [6,82]. We observed that this paradigm was equally effective in monkeys, who produced a classic SAT, characterized by decreasing RT and increasing error rate with speed emphasis (figure 4b). Just as importantly, monkeys adapted their behaviour instantaneously upon presentation of a new SAT cue, demonstrating a voluntary and flexible change of state.
The critical neural data were unambiguous: SAT-related changes in neural activity were not homomorphic with accumulator model architecture. Threshold neural activity did vary with speed stress (figure 5a), but in the direction opposite to predictions (higher threshold for speed stress than accuracy stress). At the same time, threshold remained invariant with RT within conditions, suggesting that threshold variability was yoked not to RT per se, but to a cognitive state elicited by SAT cues. Moreover, SAT instructions affected not just one aspect (threshold) of neural activity, but perceptual processes as well, as evidenced by changes in visually responsive neurons (figure 5b). Specifically, speed stress increased baseline neural activity and the magnitude of visual responses to otherwise identical stimuli, and reduced the time required for visual neurons to select target from distractor items placed in their RF.
5. The integrated accumulator model
We found that SAT is accomplished through a multitude of adjustments in multiple processing stages. Despite these findings showing that conventional accumulator models do not map gracefully onto neural mechanisms, we thought it hasty to dismiss the stochastic accumulator framework altogether. Thus, we sought to reconcile these neural findings with the accumulator model framework based on one constraint of a performance parameter that did not vary with SAT in our task—the velocity and amplitude of the saccades produced under different speed–accuracy instructions were invariant (figure 6).
Saccade brainstem physiology operates like a trigger: the eyes move precisely when omnipause neurons receive threshold inhibition from afferents originating in FEF, SC and elsewhere4 [55–58]. The metrics of the resulting saccade (its velocity and amplitude) are a precise function of the level of omnipause hyperpolarization received. The fact that saccade velocity did not vary with SAT instruction requires that brainstem recipients of FEF output must reach an invariant state at saccade onset. Saccades are ballistic movements, like the flight of an arrow. Equivalent distance and velocity of an arrow require equivalent final tension in the bow at release. However, the bow can be drawn quickly or slowly. Therefore, we reasoned that the threshold neural activity observed in FEF (and we conjecture throughout the pre-saccadic circuit including the SC) must be constrained by the brainstem nuclei that directly receive the accumulating premotor signal.
We believe this appreciation of the final motor circuitry provides an important insight into why the FEF movement neuron activity varies as it does under SAT. The variation in movement neuron threshold with SAT is translated to a fixed threshold in the brainstem. This insight provided the foundation for a reconciliation of our findings with the stochastic accumulator framework by assuming that: (i) FEF movement neuron output is itself integrated in the brainstem (we conjecture that this integration is implicit on the membranes of omnipause neurons); (ii) saccades are produced when this integrated activity reaches a fixed threshold (otherwise saccade velocity and amplitude could not be equivalent); and (iii) the input varies as a function of SAT (the bow string can be drawn slowly or quickly to the desired amount of tension). Our model, known as the integrated accumulator, provided fits to behaviour that were comparable with the standard accumulator models, and it also replicated key features of the neurophysiology. Validation of this model will require recordings from the brainstem during SAT; this is the focus of ongoing work.
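The logic of assumptions (i) to (iii) can be sketched in a few lines. The example below is a deliberately simplified, deterministic illustration and not the fitted integrated accumulator model: it assumes that FEF movement activity rises as a noiseless linear ramp, that the brainstem integrates this activity perfectly (no leak), and that the ramp slopes and the threshold value are arbitrary. The function name `saccade_from_ramp` is hypothetical.

```python
def saccade_from_ramp(fef_rate, brainstem_threshold=5000.0, dt=1.0, max_time=2000.0):
    """Integrate a linearly ramping FEF signal until a fixed brainstem threshold.

    Returns (saccade_time, fef_activity_at_saccade), both in arbitrary units.
    """
    t = 0.0
    integrated = 0.0
    fef_activity = 0.0
    while integrated < brainstem_threshold and t < max_time:
        t += dt
        fef_activity = fef_rate * t       # first stage: FEF movement activity
        integrated += fef_activity * dt   # second stage: integration downstream
    return t, fef_activity

# Assumption (iii): the input ramps faster under speed stress than accuracy stress.
fast_rt, fast_fef = saccade_from_ramp(fef_rate=0.4)
accurate_rt, accurate_fef = saccade_from_ramp(fef_rate=0.1)
print(f"fast:     saccade at t={fast_rt:.0f}, FEF level {fast_fef:.1f}")
print(f"accurate: saccade at t={accurate_rt:.0f}, FEF level {accurate_fef:.1f}")
```

Under these assumptions, the second-stage threshold is identical in both conditions, yet the faster ramp both triggers the saccade earlier and leaves FEF activity at a higher level at saccade time. This is one simple way in which a fixed downstream threshold can coexist with a higher apparent FEF threshold under speed stress.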
6. Speed–accuracy trade-off leads to system-wide modulations
A strength of the integrated accumulator model is the appeal to a multi-stage accumulation process. The standard stochastic accumulator model considers only the deliberative stage of decision, making few assumptions about the nature and representation of the perceptual input, and no provision for any changes in the mechanisms following decision threshold crossing that are required to engage eye movements (or any other body movement). It is equally unclear exactly where stochastic accumulation occurs, and at what level the threshold is implemented (is each neuron an equally weighted accumulator, or is there a consensus within a given area?). In what follows, we highlight the argument for a multi-stage accumulation process. Specifically, we will show that SAT affects the processing of visual stimuli and that this modulation is influenced by sources outside of FEF. This is problematic for any model that localizes stochastic accumulation to a single stage or brain area.
We conducted a chronometric analysis of FEF spikes and LFP5, and also the non-human primate N2pc [79,87,88], an attention-sensitive, target-selective EEG component [89–92] widely considered to reflect activity in extrastriate visual cortex such as V4 [93,94]. We confirmed previous work in showing that: (i) the initial visual response occurs earlier in FEF LFP than in single-unit responses, owing to its reflection of dendritic activity and hence input to an area [95,96]; but (ii) FEF single neurons become selective for context-specific stimuli (the TST) earlier than LFP [87,88,95], suggesting that FEF computes selective responses from initially unselective input; and (iii) target selectivity emerges earliest in single units and latest in N2pc, with LFP becoming selective at intermediate times. This is consistent with other work suggesting that target selectivity computed by FEF neurons, from initially unselective inputs, is later transmitted to extrastriate cortex . Finally, we confirmed that, at least for FEF single neurons, (iv) TSTs for otherwise identical stimuli are earlier under speed stress than under accuracy stress.
We were specifically interested in the moment in time when neural activity significantly increased from baseline (onset time)6, the moment in time when neural activity discriminated the fast and accurate conditions (SAT discrimination time), and the moment in time when neural activity discriminated target from distractor items placed in its RF (TST).
In a new analysis, we computed these metrics in 144 neurons, 224 LFP recordings, and 33 N2pc recordings7 from two monkeys performing the SAT visual search task, all recorded simultaneously. Onset times were computed using ms-by-ms non-parametric t-tests, testing against 0 after baseline correction over the interval from −100 to 0 ms relative to target onset. The same procedure was followed for the SAT discrimination time, testing the fast condition against the accurate condition. Similarly, we compared trials when target items appeared in the RF versus when distractor items fell there to compute the TST. Values were computed and tested statistically using a jack-knife bootstrapping procedure. All statistical comparisons were computed at the population level.
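As a rough illustration of the onset-time computation, the sketch below baseline-corrects each trial over the −100 to 0 ms interval and then tests each post-stimulus millisecond against zero. Several details are assumptions made for the example rather than the procedure used here: a two-sided sign test stands in for the non-parametric test, onset is defined as the first of 10 consecutive significant milliseconds, alpha is set to 0.01, and the data are synthetic. The function names are hypothetical.

```python
import math
import random

def sign_test_p(values):
    """Two-sided sign test of the median against zero."""
    pos = sum(1 for v in values if v > 0)
    neg = sum(1 for v in values if v < 0)
    n = pos + neg
    if n == 0:
        return 1.0
    k = min(pos, neg)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)

def onset_time(trials, times, alpha=0.01, run_length=10):
    """First post-stimulus time that starts `run_length` consecutive significant ms."""
    base_idx = [i for i, t in enumerate(times) if -100 <= t < 0]
    corrected = []
    for trial in trials:
        base = sum(trial[i] for i in base_idx) / len(base_idx)
        corrected.append([v - base for v in trial])  # baseline correction
    run = 0
    for i, t in enumerate(times):
        if t < 0:
            continue
        p = sign_test_p([trial[i] for trial in corrected])
        run = run + 1 if p < alpha else 0
        if run == run_length:
            return times[i - run_length + 1]
    return None

# Synthetic data: 40 trials, 1 ms resolution, a step response starting at 50 ms.
rng = random.Random(0)
times = list(range(-100, 200))
true_onset = 50
trials = [[(5.0 if t >= true_onset else 0.0) + rng.gauss(0.0, 1.0) for t in times]
          for _ in range(40)]
estimated = onset_time(trials, times)
print("estimated onset (ms):", estimated)
```

The same ms-by-ms logic could in principle be adapted to the SAT discrimination time or the TST by comparing two sets of trials (fast versus accurate, or target versus distractor in the RF) at each time point, which would require an unpaired test in place of the one-sample test used in this sketch.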
The results were largely consistent with previous work (figure 7). Onset times were slightly but consistently earlier for LFPs (41 ms) than for single neurons (46 ms). The onset time for N2pc was more variable, but similar on average to single neurons (45 ms). The visual onset time did not differ significantly between SAT conditions, for any signal.
Next, we computed the single unit, LFP and N2pc TSTs. For each signal, TSTs were significantly earlier for the fast when compared with accurate condition (single neurons: 143 versus 162 ms; LFP: 156 versus 167 ms; N2pc: 169 versus 177 ms). It is also clear that for each condition, the linear ordering was such that single units selected earliest, and N2pc selected latest.
Finally, and most importantly, we calculated the SAT encoding time: the moment at which the magnitude of neural activity discriminated the fast from accurate conditions. We observed that the visual response was magnified for the fast when compared with accurate condition earliest in the LFP (45 ms), followed by single neurons (79 ms) and N2pc (110 ms; blue lines). To emphasize this point, we plotted mean activity in the window 50–55 ms post-array onset for each signal type; this was only significant for LFP voltage (expressed as a larger negativity in the fast condition; figure 7, right-hand panels). These results suggest that when monkeys were faced with speed stress, the system was pre-configured to amplify incoming visual signals. Thus, the amplification of perceptual gain is not local to FEF, as it is evident in FEF input before spiking output. It is not puzzling that the fast and accurate conditions can be discriminated very early; monkeys were aware of which condition was to be presented by the colour of a fixation point presented 750–3000 ms prior to the critical stimulus. It is puzzling, though, that SAT condition affected the visual response so much earlier in the LFP when compared with spikes and the N2pc. This result demonstrates that SAT is accomplished by adaptations occurring in multiple brain regions for both visual processing and saccade planning. An important lesson to be learned from this is that if stochastic accumulator models are to be mapped onto brain processes, they should not be identified with one stage, one brain region, or one cell class.
7. Spike-field coherence correlates of speed–accuracy trade-off
Interest has increased in the role of SFC in mediating perceptual, cognitive and motor processes. Briefly, SFC emerges when neurons become entrained with the surrounding network in particular frequency bands over time. A body of evidence suggests that disparate brain regions may communicate not through spike counts per se, but rather through the timing of individual spikes and how those spikes are coupled to the greater surrounding network(s), also oscillating at a particular frequency [100,101]. Particularly important are oscillations in the gamma range, approximately 30–50 Hz or even higher. Several studies have demonstrated that gamma-band coherence is selectively enhanced when attention is directed into a shared RF [97,102–104]. The SFC for two signals can be computed using a variety of methods, most of which involve two steps: first, transforming each continuous signal into the frequency domain while retaining temporal information using a windowed Fourier transform8; and second, computing a normalized correlation between these power spectra over time. Coherence magnitude ranges from 0 to 1 and increases to the extent that signals demonstrate phase and amplitude locking over trials.
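The two steps can be sketched for a single analysis window and a single frequency. The example below tapers each trial with a Hann window, takes the Fourier coefficient at the frequency of interest, and then normalizes the trial-summed cross-spectrum by the two power spectra. It is a minimal illustration, not the analysis pipeline used here: the 200-sample window at an assumed 1 kHz sampling rate, the 40 Hz test frequency, the synthetic spike and LFP generators and all function names are assumptions made for the example.

```python
import cmath
import math
import random

def hann_coefficient(signal, freq_hz, fs_hz):
    """Hann-tapered Fourier coefficient of one trial segment at freq_hz."""
    n = len(signal)
    total = 0j
    for k, x in enumerate(signal):
        taper = 0.5 - 0.5 * math.cos(2.0 * math.pi * k / (n - 1))
        total += taper * x * cmath.exp(-2j * math.pi * freq_hz * k / fs_hz)
    return total

def coherence(spike_trials, lfp_trials, freq_hz, fs_hz=1000.0):
    """Magnitude coherence across trials: |Sxy| / sqrt(Sxx * Syy)."""
    sxy = 0j
    sxx = syy = 0.0
    for spikes, lfp in zip(spike_trials, lfp_trials):
        x = hann_coefficient(spikes, freq_hz, fs_hz)
        y = hann_coefficient(lfp, freq_hz, fs_hz)
        sxy += x * y.conjugate()   # cross-spectrum keeps the phase relation
        sxx += abs(x) ** 2         # spike power
        syy += abs(y) ** 2         # LFP power
    return abs(sxy) / math.sqrt(sxx * syy)

def make_trials(locked, n_trials=50, n_samples=200, fs_hz=1000.0, seed=0):
    """Synthetic 40 Hz LFP plus spikes that are, or are not, phase-locked to it."""
    rng = random.Random(seed)
    spikes_all, lfp_all = [], []
    for _ in range(n_trials):
        lfp_phase = rng.uniform(0.0, 2.0 * math.pi)
        spike_phase = lfp_phase if locked else rng.uniform(0.0, 2.0 * math.pi)
        lfp, spikes = [], []
        for k in range(n_samples):
            angle = 2.0 * math.pi * 40.0 * k / fs_hz
            lfp.append(math.cos(angle + lfp_phase) + rng.gauss(0.0, 0.5))
            p_spike = 0.1 * (1.0 + math.cos(angle + spike_phase))
            spikes.append(1.0 if rng.random() < p_spike else 0.0)
        spikes_all.append(spikes)
        lfp_all.append(lfp)
    return spikes_all, lfp_all

locked_coh = coherence(*make_trials(locked=True), freq_hz=40.0)
unlocked_coh = coherence(*make_trials(locked=False), freq_hz=40.0)
print(f"locked: {locked_coh:.2f}, unlocked: {unlocked_coh:.2f}")
```

Because the phase relation between spikes and LFP is consistent across trials only in the locked case, the cross-spectrum sums constructively there and largely cancels otherwise, so coherence should be high for the locked data and low for the unlocked data. A time-frequency map would repeat this computation over sliding windows and a range of frequencies.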
Given the strong evidence for increases in gamma-band coherence with attention, we wondered how SFC varied with SAT instruction given the dramatic behavioural and neural changes between conditions. We hypothesized that gamma-band coherence would be increased in the accurate condition relative to the fast condition, because the former required more deliberate responses, whereas the latter simply stressed speed. To examine this, we computed the SFC9 between 353 simultaneously recorded FEF single units and LFPs.
The evidence was consistent with the hypothesis. Figure 8 quantifies the relationship for an exemplar session by subtracting coherence magnitude in the accurate condition from the fast condition. As is evident, coherence in the accuracy condition is significantly increased in the gamma band, centred at about 40 Hz (areas enclosed in white are significant at the p < 0.05 level). This elevated SFC was evident across the population of 353 neuron–LFP pairs (figure 9). This is particularly interesting, given that by most metrics, the fast condition elicited more spiking events and greater LFP amplitude. Still, on this finer scale, we observe that when monkeys were instructed to make very accurate decisions, there was an increase in gamma-band coherence between approximately 30 and 40 Hz.
Evident also in figures 8 and 9 is a marked increase in low-frequency coherence for the fast condition over the accurate condition. At first blush, this seems to be an uninteresting reflection of the overall greater magnitude of responses discussed earlier. However, this is not the case, as coherence requires trial-by-trial, moment-by-moment phase and amplitude locking. At least in early visual cortex, increases in low-frequency power have been associated with increased response gain for relevant stimuli and speeded reaction times, just as witnessed here [107,108].
The stochastic accumulator framework continues to inspire and constrain theoretical and empirical work on perceptual decision-making. We have summarized evidence showing that while stochastic accumulator models provide efficient quantitative explanations of overt behaviour, the mapping between model parameters and neural processes is much less clear than we and others had originally imagined. The various analyses reported here highlight the fact that the SAT is a multifaceted phenomenon, accomplished by multiple, distinct modulations of neurons instantiating different stages of processing. Thus, the formal model explanation of SAT by a single parameter cannot be mapped meaningfully onto brain processes. For that matter, no psychological accumulator model includes the variety of effects we report here involving discharge rates, field potentials and coherence. That said, we wish to stress that our objective is not to invalidate the conventional accumulator model. To the contrary, the stochastic accumulator framework continues to provide a sophisticated, formal account of behaviour in many tasks. Furthermore, we have demonstrated that an extension of the framework that incorporates multiple stages of accumulation can reconcile model and neural processes. However, the data used to formulate and constrain the integrated accumulator model were obtained from just one node in a complex network. FEF, while a critical node in the saccade decision circuit, is extensively interconnected with multiple afferent and efferent cortical and subcortical structures. A more complete understanding of SAT, and how the brain accomplishes perceptual decisions generally, awaits further data.
This work was supported by F32-EY019851 to R.P.H., and by R01-EY08890, P30-EY08126, P30-HD015052, and the E. Bronson Ingram Chair in Neuroscience.
One contribution of 17 to a Theme Issue ‘Attentional selection in visual perception, memory and action’.
↵1 A visual search task requires observers to locate some target item presented among non-target distractor items. The stimuli commonly consist of simple or complex shapes or colours, and trials can vary in the number of and characteristics of distractors present. Observers respond in a variety of ways, commonly by button press (item is present or not present) or by eye movement (look at and maintain fixation on the one target item).
↵2 The activity of visual-movement neurons was also sufficient for the model.
↵6 The reader should note that for many neurons, baseline firing rate was significantly greater in the fast condition than in the accurate condition, as detailed in Heitz & Schall . This effect persisted across blocks of trials and itself indicates a persistent cognitive state change. Here, we were interested in effects apart from the baseline shift; therefore, although it is not standard practice, we baseline-corrected spike density functions in the same manner as the LFP and EEG.
↵7 During each experimental session, eight independent electrodes were lowered into FEF; each electrode provided one LFP and some number of single units, typically 1–2 (and occasionally 0). Meanwhile, monkeys were outfitted with several EEG electrodes. Electrodes T5 and T6 in the 10–20 system were used to compute the N2pc (and averaged). Unfortunately, electrode T5 was unusable for one monkey and was not included (for details, see Heitz et al. ).
↵8 We used a 200 ms Hanning window.
↵9 We included only LFPs and single units if they were recorded on separate electrodes to ensure that LFPs were not contaminated by spectral leakage. Statistical significance between the fast and accurate conditions was evaluated using a jack-knife bootstrapping technique .
- © 2013 The Author(s) Published by the Royal Society. All rights reserved.