Royal Society Publishing

Understanding decision-making deficits in neurological conditions: insights from models of natural action selection

Michael J Frank, Anouk Scheres, Scott J Sherman

Abstract

Models of natural action selection implicate fronto-striatal circuits in both motor and cognitive ‘actions’. Dysfunction of these circuits leads to decision-making deficits in various populations. We review how computational models provide insights into the mechanistic basis for these deficits in Parkinson's patients and those with ventromedial frontal damage. We then consider implications of the models for understanding behaviour and cognition in attention-deficit/hyperactivity disorder (ADHD). Incorporation of cortical noradrenaline function into the model improves action selection in noisy environments and accounts for response variability in ADHD. We close with more general clinical implications.

1. Introduction

Fronto-striatal dysfunction can lead to dramatic changes in cognition and action, as evidenced by various disorders with disturbances to this circuitry, including Parkinson's disease (PD), schizophrenia, attention-deficit/hyperactivity disorder (ADHD), obsessive compulsive disorder, Tourette's syndrome, Huntington's disease and addiction (Nieoullon 2002). One might wonder how adaptive evolution of a brain system could lead to the complexity and diversity of behaviours associated with these disorders, especially since these behaviours generally do not occur spontaneously in animals. However, we could also turn this question on its ear and ask: how elegant must a neural system be to lead to more rational human behaviour? It may be an unfortunate but necessary corollary that the complexity required to produce adaptive thought and behaviour may be vulnerable to all manner of issues with the ‘plumbing’, which would have compounding effects on the overall system. Thus, the tradeoffs that come with adaptive human behaviour may be akin to those associated with a car that has electronic seat position control and GPS navigation: these luxurious amenities come with increased risk of something breaking in an unpredictable fashion.

This paper presents an attempt to understand decision-making deficits in various patients with neurological conditions, as informed by neurocomputational models of fronto-striatal circuitry. The prefrontal cortex (PFC)—often considered the seat of abstract thought and executive function—dynamically interacts with multiple subcortical and other cortical areas, and the whole system is dynamically modulated by dopamine and other neurotransmitters. It should be clear that even if our knowledge base for how each of these subsystems worked was perfect, it would nevertheless quickly become intractable to try to connect all the pieces together ‘in your head’ or on a figure. The use of computational models forces one to be explicit about each part of the system, and allows systematic exploration of how changes in a single parameter in one subsystem may propagate through the system and impact cognition and decision making. Of course, several computational approaches can be used at multiple levels of analysis—from the most biophysically detailed to highly abstract frameworks—each having its own merits, and none a panacea. The hallmark of a successful modelling endeavour should therefore be its ability to generate insights into the mechanisms needed to explain phenomena at a particular level of analysis (O'Reilly & Munakata 2000; Dayan 2001).

The computational models described below offer an integrative framework that attempts to link cellular- and systems-level interactions with cognitive dysfunction in patients with dysfunction within the same neural circuitry, including PD, patients with ventromedial/orbital prefrontal damage, ADHD and the effects of medications and surgical treatments of these conditions. The models do not attempt to provide precise quantitative fits to any particular dataset, but rather to develop qualitative patterns of predictions that depend on key principles that drive the system. The development of these principles is constrained by data at both the neural and cognitive level, thereby minimizing the number of plausible models consistent with the data. Moreover, the models have generated novel testable predictions at both the neural and behavioural levels. The details of the models and analyses are described elsewhere and in the electronic supplementary material; here, we focus on the higher level principles.

2. Neurocomputational framework

Building upon other theoretical/modelling work on the role of the basal ganglia (BG)–frontal cortical (FC) system in motor control (e.g. Houk & Wise 1995; Mink 1996; Beiser & Houk 1998; Gurney et al. 2001; Brown et al. 2004), we have developed a series of neurocomputational models that explore the roles of this system in cognitive actions (Frank et al. 2001; Frank 2005; see also Houk 2005; Frank & Claus 2006; O'Reilly & Frank 2006; Houk et al. 2007; Frank 2006). All of the above models share the idea that BG–frontal circuits play a key role in action selection.

Figure 1a shows the basic circuitry included in our models. Two main cell populations in the striatum have opposing effects on the selection of a given action via divergent projections through BG nuclei, thalamus and back to cortex. Activity in ‘Go’ neurons facilitates the execution of a response considered in cortex, whereas ‘NoGo’ activity suppresses (or prevents the facilitation of) competing responses. Dopamine (DA) modulates the relative balance of these pathways, exciting synaptically driven Go activity via D1 receptors, while inhibiting NoGo activity via D2 receptors (Gerfen 1992; Hernandez-Lopez et al. 1997; Aubert et al. 2000; Hernandez-Lopez et al. 2000). Further, via diffuse projections to BG output nuclei (Parent & Hazrati 1995), the subthalamic nucleus (STN) may exert a Global NoGo signal on the execution of all responses, which can prevent premature responding when multiple competing responses are being evaluated (Frank 2006). Simulated DA depletion in the model results in emergent oscillatory activity in BG nuclei; these oscillations are characteristic of Parkinson's tremor (e.g. Bergman et al. 1994; Terman et al. 2002), and are eliminated with both real and simulated STN lesions (Ni et al. 2000; Frank 2006). Thus, although the models are intended to address decision-making functions of BG–FC circuitry, they are also constrained by data at the lower neural level of analysis. Furthermore, by virtue of interactions with different areas of frontal cortex (Alexander et al. 1986; Houk 2005), the models show how the BG can participate in a wide range of cognitive functions, from relatively ‘low-level’ tasks such as procedural learning (Frank 2005) to ‘higher-level’ tasks such as working memory (Frank et al. 2001; O'Reilly & Frank 2006) and decision making (Frank & Claus 2006).

Figure 1

(a) Striato-cortical loops including the direct (‘Go’) and indirect (‘NoGo’) pathways of the BG. The Go cells disinhibit the thalamus via GPi, facilitating the execution of an action represented in cortex. The NoGo cells have an opposing effect by increasing inhibition of the thalamus and suppressing action execution. Dopamine from the SNc excites synaptically driven Go activity via D1 receptors and inhibits NoGo activity via D2 receptors. GPi, internal segment of globus pallidus; GPe, external segment of globus pallidus; SNc, substantia nigra pars compacta; STN, subthalamic nucleus. (b) Neural network model of this circuit (Frank 2005, 2006). Squares represent units, with height reflecting neural activity. The premotor cortex selects one of the four responses (R1–R4) via direct projections from the sensory input and is modulated by BG projections from thalamus. Go units are in the left half of the striatum layer; NoGo in the right half, with separate columns for the four responses. In the case shown, striatum Go is stronger than NoGo for R2, inhibiting GPi, disinhibiting thalamus and facilitating R2 execution in cortex. A tonic level of dopamine is shown in SNc; a burst or dip ensues in a subsequent error feedback phase (data not shown), driving Go/NoGo learning. The STN exerts a dynamic ‘Global NoGo’ function on response execution and adaptively modulates the threshold at which actions are selected depending on the degree of cortical response conflict (Frank 2006). (c) Modelling BG interactions with orbitofrontal cortex in decision making (Frank & Claus 2006). The BG model is as in (b). In addition, medial and lateral OFC areas receive graded reward/punishment magnitude information from the ABL (amygdala) and exert a top-down effect on responding within the striatum, and directly on premotor cortex, allowing more flexible behaviour. OFC_ctxt is a context layer that maintains recent reinforcement information in working memory and biases activity in OFC_med_lat for use in behavioural decisions.

Given that the BG participate in selecting among various competing low-level motor responses, it is natural to extend this functionality to include higher-level decisions. A key question is how do the BG know which decision has the highest value? Insight comes from various experiments showing that when monkeys are rewarded following a correct choice, transient increases in dopamine firing are observed (Schultz 2002). Conversely, choices that do not lead to reward are associated with DA dips (pauses in DA firing) that drop below baseline (e.g. Schultz 2002; Satoh et al. 2003), with longer duration pauses when rewards are highly expected (Bayer 2004). Similar DA-dependent processes have been inferred to occur in humans during positive and negative reinforcement using neuroimaging techniques (Delgado et al. 2000; Holroyd & Coles 2002; Frank et al. 2005). In our models, these DA bursts and dips modify learning in Go and NoGo striatal units. By means of D1 receptors, phasic DA bursts during rewards enhance neural activity and synaptic plasticity in those Go units that are activated by the stimulus-response conjunction, while having opposite effects via D2 receptors in the NoGo pathway; this functionality is supported by various lines of neurobiological evidence (e.g. Centonze et al. 2001; Mahon et al. 2003; Frank 2005; for a recent review, see Frank & O'Reilly 2006). Striatal units not activated by the particular input stimulus do not learn. The net result is that DA bursts support ‘Go’ learning to reinforce the good choice in response to a particular stimulus, while DA dips support ‘NoGo’ learning to avoid bad choices (Brown et al. 2004; Frank 2005). That is, a lack of DA releases NoGo cells from their tonic D2 inhibition, allowing them to become more excited than their Go counterparts and driving ‘Hebbian’ learning in the opposite direction to DA bursts. 
Supporting this account, D2 receptor blockade (simulating the lack of D2 stimulation during dips) is associated with enhanced NoGo activity in the indirect pathway and associated increases in corticostriatal plasticity (Robertson et al. 1992; Centonze et al. 2004).
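The Go/NoGo learning principle described above can be illustrated with a deliberately minimal sketch. This is not the neural network model itself: the function name, the update rule, the learning rate and the starting weights are all illustrative assumptions, chosen only to show how DA bursts and dips could push Go and NoGo associations in opposite directions for a given stimulus.

```python
import random


def train_go_nogo(reward_prob, n_trials=1000, lr=0.02, seed=0):
    """Toy Go/NoGo learner (illustrative assumption, not the published model).

    reward_prob maps each stimulus to its probability of reward.
    Returns two dicts with Go and NoGo association strengths in [0, 1].
    """
    rng = random.Random(seed)
    go = {stim: 0.5 for stim in reward_prob}
    nogo = {stim: 0.5 for stim in reward_prob}
    for _ in range(n_trials):
        # Only the units for the presented stimulus are updated;
        # units for other stimuli do not learn on that presentation.
        for stim, p in reward_prob.items():
            if rng.random() < p:
                # Reward -> DA burst: strengthen Go (D1), weaken NoGo (D2).
                go[stim] += lr * (1.0 - go[stim])
                nogo[stim] -= lr * nogo[stim]
            else:
                # No reward -> DA dip: NoGo is released from D2 inhibition
                # and strengthened, while Go is weakened.
                nogo[stim] += lr * (1.0 - nogo[stim])
                go[stim] -= lr * go[stim]
    return go, nogo


go, nogo = train_go_nogo({"A": 0.8, "B": 0.2})
for stim in ("A", "B"):
    print(stim, "Go:", round(go[stim], 2), "NoGo:", round(nogo[stim], 2))
```

Under these assumptions, the frequently rewarded stimulus ends up with a stronger Go than NoGo association, and the rarely rewarded stimulus shows the reverse pattern.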

As DA bursts and dips reinforce Go and NoGo representations in the BG, our model showed that the most adaptive (i.e. rewarding) responses are facilitated while less adaptive ones are suppressed. Further, as the BG learns to facilitate adaptive responses, the associated adaptive representations become enhanced directly in premotor cortical areas (via modification of input to premotor synaptic strengths). In this way, DA reward processes within the BG may ingrain prepotent motor ‘habits’ in cortical areas (Frank 2005). Once these behaviours are ingrained, there is less need for selective facilitation by the BG. This is consistent with observations that dopaminergic integrity within the BG is critical for the acquisition but not execution of instrumental responses (Smith-Roe & Kelley 2000; Choi et al. 2005), and with recent observations that learning-related activity is initially seen in the BG and only later in frontal cortex (Delgado et al. 2005; Pasupathy & Miller 2005; Seger & Cincotta 2006).

Next, we review how models of this circuitry can account for decision-making deficits in clinical populations.

3. Applying the models to clinical populations

(a) Parkinson's disease and DA manipulations

Parkinson's disease is a progressive neurodegenerative disease that selectively damages dopaminergic cells targeting the BG. The most obvious behavioural changes associated with PD are muscular rigidity, slowness of movements and tremor. However, motor neurons themselves are not damaged and patients can perform movements quite smoothly under some circumstances. Instead, these patients may have difficulty selecting among various competing motor actions and executing the most appropriate one. A long-standing hypothesis is that depleted DA in PD leads to an imbalance of the direct and indirect pathways (Albin et al. 1989). In effect, the threshold for facilitating a motor programme is raised (Mink 1996; Wichmann & DeLong 2003). The observation that treatment with DA agonists and l-Dopa sometimes leads to jerking movements or dyskinesia (McAuley 2003) is consistent with this hypothesis by shifting the balance the other way and making the threshold for motor execution too low (Gerfen 2003).

A number of cognitive changes also exist in PD and these are often complex and seemingly unrelated, ranging from deficits in reinforcement learning and decision making (i.e. choosing among multiple menu items at a restaurant and learning from the outcome of this decision) to working memory (holding and manipulating information in mind, as in mental arithmetic) and attentional control (directing attention to task-relevant versus distracting information). Rather than proposing separate mechanisms for the various cognitive and motor impairments in PD, our approach unifies the diverse pattern of results by adopting a mechanistic approach that attempts to decipher the underlying roles of the BG/dopamine system. In fact, the various deficits can all be accounted for by a reduced dynamic range of DA signals within the BG of the models (Frank 2005). Indeed, although executive dysfunction is sometimes assumed to be due to prefrontal deficits, frontal-like cognitive dysfunction in early-stage PD is correlated with striatal, and not prefrontal, DA measures (Kaasinen et al. 2000; Muller et al. 2000; Remy et al. 2000). Our models suggest that low striatal DA leads to diminished Go signals and difficulty in the updating of prefrontal representations, leading to frontal-like deficits (Frank 2005; Frank & O'Reilly 2006; O'Reilly & Frank 2006; see also Hazy et al. 2007).

We have tested various aspects of the hypothesized roles of the BG/dopamine system in action selection. First, we demonstrated support for a central prediction of our model regarding dopamine involvement in ‘Go’ and ‘NoGo’ cognitive reinforcement learning (Frank et al. 2004). We tested Parkinson patients on and off DA medication. We predicted that decreased DA levels in PD would enable patients to avoid selecting options that had been associated with negative reinforcement, due to spared NoGo learning, but that these patients would have more difficulty making choices which had high reward value (which depends on DA bursts). We further predicted that DA medications used to treat PD (l-Dopa and D2 receptor agonists) should alleviate the Go learning deficit, but would block the effects of DA dips needed to support NoGo learning, as was simulated to account for other medication-induced cognitive deficits in PD (Frank 2005). To test this idea, we developed a paradigm to dissociate the ability to select good actions versus avoiding bad ones. Indeed, patients who had abstained from taking medication were better at avoiding the selection of negative stimuli (NoGo choices) than they were at Go choices. In contrast, patients taking their regular dose of medication were better at Go learning and selection, but were relatively impaired at NoGo learning (Frank et al. 2004). The same learning biases were observed in the model (figure 2a).

Figure 2

(a) BG model Go and NoGo associations recorded from the striatum after learning that choosing stimulus A is rewarding on 80% of trials and choosing stimulus B is rewarding only on 20%. Parkinson's disease was simulated (Sim PD) by reducing DA input to the striatum and medication was simulated (Sim DA Meds) by increasing tonic DA levels and reducing phasic DA dips. These qualitative patterns predicted learning biases in PD patients on and off medication (Frank et al. 2004). (b) The contributions of the STN to decision making were explored (Frank 2006). STN lesions improved PD-like symptoms in the model (not shown), but induced premature and inappropriate responding when having to choose between two positively reinforced responses (80 versus 70%). (c) The orbitofrontal cortex (OFC) is critical in the model for adaptive decision making when the magnitudes of decision outcomes (rewards and losses) are more relevant than their probability of occurrence (Frank & Claus 2006), providing a mechanistic explanation for decision-making deficits in patients with OFC damage and capturing classical irrational choice patterns in normals.

According to the model, l-Dopa medication enhanced Go choices via increases in spike-dependent DA release (Harden & Grace 1995; Pothos et al. 1998), consistent with beneficial l-Dopa effects on other tasks thought to depend on DA bursts (Shohamy et al. 2005). Moreover, the tendency for medication to impair NoGo learning was similarly predicted by the model, as DA medications (especially D2 agonists) would tonically stimulate D2 receptors and may effectively block the effects of DA dips needed to learn NoGo (Frank 2005). This effect was previously simulated in the model to explicitly account for medication-induced reversal learning deficits in PD, in which patients are impaired at learning to reverse stimulus–reward contingencies (Swainson et al. 2000; Cools et al. 2001). Cools et al. (2006) independently confirmed more specific model predictions, showing that reversal learning deficits are selectively observed for NoGo learning to a previously rewarded stimulus. Others have found medication-induced deficits in a task that required negative feedback to be inferred (from a lack of positive feedback); these deficits were only observed in conditions that required learning from incorrect initial guesses (Shohamy et al. 2006). Finally, the preserved NoGo learning in non-medicated patients is readily explained by the notion that this learning depends on DA dips that remove DA from the synapse (so as to disinhibit NoGo neurons expressing D2 receptors). While low tonic DA levels in PD may still be sufficient to inhibit highly sensitive D2 receptors (e.g. Creese et al. 1983; Goto & Grace 2005), these may also make it more probable that all DA is removed from the synapse during DA dips. Further, the D2 receptor supersensitivity observed in PD (e.g. Rinne et al. 1990) would make NoGo neurons particularly sensitive to DA dips.
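The opposite learning biases on and off medication can be sketched with a toy learner in which the effective sizes of DA bursts and dips are scaled separately. The two gain parameters, their values (0.3 versus 1.0) and the update rule are assumptions made purely for illustration; the actual simulations manipulated DA input and tonic/phasic levels within the network model.

```python
import random


def learn_biases(burst_gain, dip_gain, n_trials=1000, lr=0.02, seed=1):
    """Toy sketch: returns (Go strength for A, NoGo strength for B).

    burst_gain and dip_gain are hypothetical knobs scaling how strongly
    DA bursts and DA dips drive learning. A is rewarded on 80% of
    trials, B on 20%.
    """
    rng = random.Random(seed)
    reward_prob = {"A": 0.8, "B": 0.2}
    go = {"A": 0.5, "B": 0.5}
    nogo = {"A": 0.5, "B": 0.5}
    for _ in range(n_trials):
        for stim, p in reward_prob.items():
            if rng.random() < p:
                # DA burst: Go learning, scaled by burst_gain.
                go[stim] += lr * burst_gain * (1.0 - go[stim])
                nogo[stim] -= lr * burst_gain * nogo[stim]
            else:
                # DA dip: NoGo learning, scaled by dip_gain.
                nogo[stim] += lr * dip_gain * (1.0 - nogo[stim])
                go[stim] -= lr * dip_gain * go[stim]
    return go["A"], nogo["B"]


conditions = {
    "intact": (1.0, 1.0),
    "sim_pd": (0.3, 1.0),    # assumed: weakened bursts, spared dips
    "sim_meds": (1.0, 0.3),  # assumed: restored bursts, blunted dips
}
for name, (burst, dip) in conditions.items():
    go_a, nogo_b = learn_biases(burst, dip)
    print(name, "Go(A):", round(go_a, 2), "NoGo(B):", round(nogo_b, 2))
```

With these assumed gains, the simulated PD condition learns NoGo for B better than Go for A, and the simulated medication condition shows the reverse, which mirrors the qualitative dissociation reported above without attempting a quantitative fit.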

(i) DA manipulation in healthy participants

We have also tested predictions for a more general role for BG/dopamine in cognitive function by administering low doses of dopamine agonists/antagonists to young, healthy participants (Frank & O'Reilly 2006). The drugs used (cabergoline and haloperidol) were selective for D2 receptors, which are by far most prevalent in the BG. By acting on presynaptic D2 receptors, low doses of these drugs modulate the amount of phasic DA released in the BG (e.g. Wu et al. 2002). Again, results were consistent with our model: increases in DA were associated with better Go choices, whereas decreases in DA were associated with better NoGo performance. These same effects extended to higher-level cognitive actions. As reviewed in Hazy et al. (2007), our models show that these same Go/NoGo mechanisms can also drive the updating of working memory representations in PFC (Frank et al. 2001; O'Reilly & Frank 2006). In support of this consistent account, drug-induced BG/DA increases selectively enhanced working memory updating of task-relevant (i.e. ‘positively valenced’), but not distracting (‘negatively valenced’) information; DA decreases had the opposite effect (Frank & O'Reilly 2006). Overall, these results show that the BG/DA system modulation of learning and action selection is not restricted to the relatively extreme case of PD.

(ii) Deep brain stimulation in PD

In addition to DA medications, PD patients are increasingly often treated with deep brain stimulation (DBS), a surgical treatment that places electrodes in the STN. This type of therapy generally improves motor symptoms and activities of daily living, but its effects on cognition are not well understood, with both enhancements and impairments reported (e.g. Witt et al. 2004). Our models may be useful in this regard, in that they can simulate when decision-making abilities are enhanced, and when they might be hindered, from increases or decreases in subthalamic activity (Frank 2006). Our model suggests that the STN provides a dynamic ‘global NoGo’ or ‘hold your horses’ signal that prevents premature responding when faced with multiple good decision options (figure 2b). This signal allows the system to take a longer time to integrate over all possibilities before selecting the best choice and is suggestive of a key role of the STN in classical speed–accuracy tradeoffs. This account is also consistent with effects of STN lesions on premature responding in choice paradigms in rats (Baunez & Robbins 1997; Baunez et al. 2001).

Moreover, our model predicts that the STN ‘hold your horses’ signal is dynamically modulated by the degree of decision conflict, as represented in premotor cortical areas, potentially extending into dorsal anterior cingulate (ACC). This region is consistently activated under conflict conditions (Yeung et al. 2004) and has direct projections to the STN (Orieux et al. 2002). Thus, when choosing between two responses that have had similar positive reinforcement histories (‘win/win’ decisions), the associated conflicting cortical representations lead to a larger intensity and longer duration STN signal (Frank 2006). We have previously observed modulation of cortical conflict signals in healthy participants using electrophysiological measures that are thought to reflect anterior cingulate activity (Frank et al. 2005). Notably, these conflict signals depended on the kinds of decisions that should elicit conflict in the particular individual. Those biased to learn more from the positive outcomes of their decisions showed conflict signals when making win/win decisions, whereas those who learned more from their errors showed greater conflict during lose/lose decisions (i.e. when having to choose between two responses that were both likely to be incorrect). Our model predicts that the effect of these conflict signals in modulating choice behaviour may in part be mediated via the STN. We are currently testing more specific predictions from this model in PD patients on and off DBS. Ultimately, we believe that a combined modelling/empirical approach can be used to constrain stimulation parameters to minimize the potential negative impact of DBS on decision making.
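The proposed link between conflict, the STN signal and speed–accuracy tradeoffs can be illustrated with a simple noisy evidence accumulator. This substitutes an abstract accumulator for the neural network model, and the conflict measure, the threshold formula and all parameter values are hypothetical choices made for this sketch only.

```python
import math
import random


def stn_threshold(p_a, p_b, base=0.3, stn_gain=0.8):
    """Hypothetical decision threshold raised by response conflict.

    Conflict is assumed to be high when both options have similar
    reinforcement histories. stn_gain=0 mimics an STN lesion.
    """
    conflict = 1.0 - abs(p_a - p_b)
    return base + stn_gain * conflict


def simulate(p_a, p_b, stn_gain, n_trials=1000, noise=0.3, dt=0.05, seed=2):
    """Return (proportion choosing the better option A, mean steps taken)."""
    rng = random.Random(seed)
    bound = stn_threshold(p_a, p_b, stn_gain=stn_gain)
    n_correct = 0
    total_steps = 0
    for _ in range(n_trials):
        x, steps = 0.0, 0
        # Accumulate noisy evidence favouring A until a bound is reached
        # (the step cap only guards against a runaway loop).
        while abs(x) < bound and steps < 100000:
            x += (p_a - p_b) * dt + noise * math.sqrt(dt) * rng.gauss(0.0, 1.0)
            steps += 1
        n_correct += x > 0
        total_steps += steps
    return n_correct / n_trials, total_steps / n_trials


for label, gain in (("STN intact", 0.8), ("STN lesion", 0.0)):
    acc, rt = simulate(0.8, 0.7, stn_gain=gain)
    print(label, "accuracy:", round(acc, 2), "mean steps:", round(rt, 1))
```

In this sketch, removing the conflict-dependent term lowers the threshold for the 80 versus 70% choice, so responses are faster but less often favour the better option, which is the qualitative pattern attributed to premature responding after STN lesions.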

Finally, the model provides a mechanistic explanation for the observations that Parkinson's patients taking D2 agonists can develop spontaneous onset of pathological gambling (Dodd et al. 2005). The blockade of DA dips during the experience of losses would prevent NoGo learning in the BG, while concurrent l-Dopa medications would preserve Go learning from rewards; this undue biasing of decision outcomes would further ingrain the behaviour as a habitual response. However, the ‘basic’ BG model may not be sufficient to fully account for the data. As described below, frontal reward regions may play a key role in incorporating the inherent graded differences in the magnitudes of gains and losses associated with gambling experiences.

(b) Decision-making deficits in ventromedial/orbitofrontal patients

Despite preliminary support for its predictions, the BG model as described above is not adequate to account for more complex, ‘real-world’ decisions. In particular, it is not well equipped to pay appropriate weight either to relative differences in the magnitudes of gains and losses or to the recency of reinforcement contingencies. For such functions, the more advanced and adaptive orbitofrontal cortex (OFC) may be necessary to complement the functions of the more primitive BG/DA system (Frank & Claus 2006). Indeed, patients with OFC damage (but intact BG/DA system) make dramatic decision-making errors in their everyday lives, as well as in the laboratory (e.g. Bechara et al. 1998). In our explorations of the unique OFC contributions to decision making (figure 1c), the OFC maintains recently experienced rewards and punishments and their relative magnitudes in an active state (via persistent neural firing), and has a top-down effect on the BG and premotor regions to guide behaviour (Frank & Claus 2006). This model is based on a substantial body of evidence for OFC representation of reward and punishment magnitude information, which it receives from the amygdala, and for the persistent maintenance of this activity in working memory (e.g. Hikosaka & Watanabe 2000; Holland & Gallagher 2004; Schoenbaum & Roesch 2005). These representations can bias behaviour (Wallis & Miller 2003) via efferent projections to striatum and motor cortical areas.

Our combined BG/OFC model offers a mechanistic explanation of impaired decision-making processes and reversal learning deficits in OFC patients (Rolls 1996; Bechara et al. 1998; Fellows & Farah 2003), and further accounts for irrational patterns of decisions in healthy populations (e.g. Kahneman & Tversky 1979; Frank & Claus 2006). We showed that the more primitive BG/DA system is sufficient for (relatively slow) learning to make choices based on their frequencies of positive versus negative reinforcement (data not shown; see Frank & Claus 2006). However, OFC integrity is necessary for faster learning of more recent contingencies and for making choices that lead to less probable but larger rewards rather than those that are more certain but yield smaller expected values (figure 2c), consistent with patterns of data observed in rats with and without OFC damage (Mobini et al. 2002). These authors further showed that OFC is necessary for making choices that lead to larger but delayed rewards instead of smaller, immediate rewards. The implication of our model is that choosing based on delayed rewards depends on working memory for action–outcome contingencies and requires suppression of responses that would lead to immediate rewards (which the BG would be able to learn itself). Recent neuroimaging results in humans support this account, showing striatal activity during the selection of immediate rewards, and OFC activity when participants suppressed this choice in favour of a later delayed reward (McClure et al. 2004).
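The distinction between frequency-based and magnitude-based valuation can be made concrete with a small worked example. The two scoring functions, the option names, the payoffs and the additive weighting below are assumptions introduced for illustration; they stand in for, and are much simpler than, the learned BG and OFC representations in the model.

```python
def bg_value(outcomes):
    """Frequency-only score: P(positive outcome) minus P(non-positive outcome).

    outcomes is a list of (probability, magnitude) pairs. Magnitude is
    ignored except for its sign, as a stand-in for slow Go/NoGo learning.
    """
    return sum(p if m > 0 else -p for p, m in outcomes)


def ofc_value(outcomes):
    """Magnitude-sensitive score: probability-weighted outcome magnitude."""
    return sum(p * m for p, m in outcomes)


def choose(options, ofc_weight):
    """Pick the option with the highest combined score.

    ofc_weight=0 is a hypothetical way to mimic OFC damage.
    """
    scores = {
        name: bg_value(outs) + ofc_weight * ofc_value(outs)
        for name, outs in options.items()
    }
    return max(scores, key=scores.get)


# Hypothetical options: a frequent small gain versus a rare large gain.
options = {
    "certain_small": [(0.7, 1.0), (0.3, -1.0)],
    "risky_large": [(0.3, 10.0), (0.7, -1.0)],
}

print("OFC intact:", choose(options, ofc_weight=1.0))
print("OFC lesion:", choose(options, ofc_weight=0.0))
```

Here the frequency-only score favours the option that pays off more often, whereas adding the magnitude-sensitive term shifts the choice to the less probable option with the larger expected value.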

Thus, our model suggests that the core decision-making deficit in OFC patients is in assigning reinforcement value to decisions based on the magnitude and recent temporal context of expected outcomes. In contrast, non-medicated PD patients should be unimpaired at maintaining reward value information in OFC, especially since this frontal area interacts with ventral striatal areas that are spared in mild to moderate PD. However, medication is thought to ‘overdose’ the ventral striatal-OFC circuit with DA (e.g. Cools et al. 2001). In addition to blocking DA dips in the BG, this could prevent the encoding of large losses in OFC, while sparing or even enhancing the magnitudes of gain representations (Frank & Claus 2006). Taken together, this combination could present an attractive account for the documented effects of medication on gambling behaviour in PD (Dodd et al. 2005).

(c) ADHD as a disorder of action selection

ADHD is a common childhood-onset psychiatric condition, characterized by age-inappropriate levels of inattention and/or hyperactivity–impulsivity. In order to qualify for any of the three subtypes of ADHD (inattentive subtype, hyperactive/impulsive subtype or combined subtype), symptoms need to be present in more than one situation (e.g. at home and in school) and need to cause impairment. Prevalence estimates of ADHD in childhood range from 3 to 7% (APA 1994). Although overall ADHD symptoms decline with age, approximately 15% of individuals who had ADHD in childhood meet full criteria for ADHD in adulthood and 65% meet partial criteria (see Faraone et al. 2006 for a meta-analysis). ADHD is a highly heritable psychiatric condition, with a mean heritability estimate of 76% (Faraone et al. 2005). In candidate-gene studies, a number of dopaminergic and serotonergic genes are implicated in ADHD, each with a small effect size (Faraone et al. 2005). The majority of children respond well to psychostimulant drugs, with at least 62% showing significant and clinically relevant reduction of ADHD symptoms (Swanson et al. 2001). Neuropsychological studies have mainly focused on the domain of executive function. Deficits in response inhibition, although modest in effect size, are reliably associated with ADHD (Willcutt et al. 2005). Recent research has shown that motivational processes may, independent of response-inhibition deficits, account for a large proportion of ADHD symptoms (Solanto et al. 2001).

Here, we consider the possibility that ADHD can be thought of as a disorder in action selection. A dysfunction in the circuitry that selects among multiple possible actions and inappropriately facilitates one of them is conceptually attractive for capturing the core deficits in both motor and cognitive domains. The complexity of behavioural phenotypes and associated neurobiological underpinnings motivates the need for solid theoretical foundations (Pennington 2005) that ultimately may help determine when and when not to medicate a symptomatic child. Below, we review evidence for dysfunctional BG-frontal circuits in ADHD, before elaborating potential implications of action selection models. We then show how incorporation of noradrenaline function into the model can account for additional effects of the disorder.

(i) Structural, functional and DA effects in ADHD

A recent review of the literature on structural brain imaging in ADHD clearly demonstrates reduced volumes in frontal and striatal areas (Krain & Castellanos 2006), despite earlier reported inconsistent effects in smaller sample studies (Baumeister & Hawkins 2001). A longitudinal study with the largest samples so far has clearly demonstrated reductions in brain volumes in ADHD (Castellanos et al. 2002), including volumes of total cerebrum, cerebellum, grey and white matter of the frontal lobes, and caudate nucleus when compared with healthy controls. These findings remained unchanged after controlling for differences in estimated IQ, height, weight and handedness, and were not due to the use of psychostimulant drugs. Reduced brain volumes remained stable over time, suggesting that they result from early genetic or environmental influences. Interestingly, by age 16, caudate volumes in ADHD were no longer smaller than those in healthy controls, potentially related to the reduction in ADHD motor symptoms with increasing age.

Functional MRI studies in children and adolescents with ADHD have mainly focused on studying the neural basis of executive control, and response inhibition in particular. Generally, fMRI studies have found reduced activation in striatal and frontal regions in ADHD during executive control and response inhibition, using tasks such as the Go/NoGo task, stop task and Stroop colour-word test (Rubia et al. 1999; Durston et al. 2003; Booth et al. 2005; Vaidya et al. 2005; Zang et al. 2005). Often, reduced fronto-striatal activation in ADHD is accompanied by increased activation in other brain areas (Bush et al. 2005). Studies with medication-naive subjects demonstrate that fronto-striatal abnormalities during executive control are not due to the use of psychostimulant drugs (e.g. Vaidya et al. 2005).

While a complex disorder such as ADHD is unlikely to be a function of any single neurotransmitter, DA dysfunction of some sort (whether genetic, environmental or a combination) is relatively undisputed. In a comprehensive review of the behavioural and biological bases of ADHD, Sagvolden et al. (2005) concluded that hypodopaminergic function in three striato-cortical loops is responsible for core deficits in DA-mediated reinforcement and extinction. This is supported by observations that both children and adults with ADHD have abnormally high densities of dopamine transporters (DATs), which remove too much DA from the synapse (Dougherty et al. 1999; Krause et al. 2000). Some have suggested that low levels of tonic DA are accompanied by heightened phasic DA signals in ADHD, due to reduced DA stimulation onto inhibitory autoreceptors that regulate phasic release (Grace 2001; Solanto 2002). However, other data suggest that stimulants do not have preferential action on autoreceptors (Ruskin et al. 2001). Further, Sagvolden and colleagues propose that the normally tight coupling between tonic and phasic DA is disrupted in ADHD, resulting in stunted phasic DA responses, despite low tonic DA. The latter position fits with findings showing that methylphenidate (Ritalin) increases extracellular striatal DA (Volkow et al. 2001) and enhances synaptic DA associated with phasic responses (Schiffer et al. 2006).

Given the above changes in fronto-striatal and DA systems, it is natural to consider ADHD as a disorder of action selection. Although it may be premature to develop a computational model for all the sources of brain dysfunction in ADHD, we can nevertheless consider the implications of the models with respect to the hypodopaminergic hypothesis, which has gained increasing support. We then consider symptoms that are more readily explained by noradrenergic mechanisms, as informed by other computational models.

By virtue of interactions with multiple frontal circuits, it is possible that a single ‘low-level’ mechanism may be responsible for diverse behavioural effects at the systems level. Thus, reduced BG/DA signals would decrease Go signals for reinforcing appropriate motor behaviours and raise the threshold for when to update information to be robustly maintained in prefrontal cortex. In the BG–PFC models, cortico-cortical projections allow a stimulus present in the environment to reach and activate PFC, independent of BG signals. BG Go signals are particularly important for selectively updating task-relevant information to be maintained once a stimulus is no longer present, and in the face of ongoing distractors (Frank et al. 2001; O'Reilly & Frank 2006; Hazy et al. 2007). Reduced BG Go signals would therefore lead to apparent hypofrontality due to reductions in selective maintenance of task-relevant information and increased distractibility. Further, we think that the same functions may apply with respect to ventral striatum and the updating of orbitofrontal representations of reward value (Frank & Claus 2006). In this case, DA reductions would lead to impairments in the updating and subsequent maintenance of large magnitude, long-term reward values to bias behaviour and motivational processes.

(ii) Reward anticipation and temporal discounting in ADHD

The hypo-DA hypothesis suggests that ADHD may be associated with a core deficit in motivational/reward processes. In a recent fMRI study, adolescents with ADHD had reduced activation in ventral striatum when they anticipated receiving monetary gains (Scheres et al. 2007). This reduction in activation may reflect reduced DA levels in ventral striatum in ADHD and was selectively associated with symptoms of impulsivity–hyperactivity (and not inattention), suggesting distinct neural mechanisms for the subtypes. In certain contexts, ADHD is associated with unusually strong preferences for small immediate rewards over larger delayed rewards (e.g. Sonuga-Barke 2005), consistent with reduced striatal Go signals for updating long-term motivational information in OFC. However, we note that reduced phasic DA and ventral striatal activity should also be associated with reduced sensitivity to immediate rewards (e.g. McClure et al. 2004). Given the above-mentioned reduction in ventral striatal activation during reward anticipation in ADHD (Scheres et al. 2007), one might expect relative impairments in the sensitivity to immediate rewards. Indeed, we recently found evidence for this position in a study on temporal reward discounting in children and adolescents with ADHD (Scheres et al. 2006). In this case, controls were actually more susceptible to immediate rewards: whereas 73% of ADHD subjects maximized their gains by waiting for the large delayed reward, only 58% of the control group did so. We are currently testing the contexts in which preferences are seen for immediate versus delayed rewards in ADHD, and the role of ventral striatum.
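To make the choice between a small immediate and a large delayed reward concrete, the following minimal sketch uses hyperbolic discounting, a common textbook formalization. It is an assumption of this illustration and not necessarily the analysis used in the studies cited above; the function names, reward amounts and discount rates `k` are all hypothetical.

```python
def discounted_value(amount, delay, k):
    """Subjective value of a reward under hyperbolic discounting.

    amount: objective reward size
    delay:  waiting time before the reward arrives (arbitrary units)
    k:      discount rate (hypothetical); larger k means steeper discounting
    """
    return amount / (1.0 + k * delay)


def prefers_delayed(small_now, large_later, delay, k):
    """True if the discounted delayed reward beats the immediate one."""
    return discounted_value(large_later, delay, k) > small_now


# A shallow discounter waits for the larger reward; a steep discounter does not.
shallow = prefers_delayed(small_now=5, large_later=10, delay=30, k=0.01)
steep = prefers_delayed(small_now=5, large_later=10, delay=30, k=0.1)
```

In this framing, a stronger preference for immediate rewards corresponds to a larger `k`, whereas reduced sensitivity to immediate rewards would act on the value assigned to `small_now` itself.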

A clear model prediction is that reduced BG phasic DA should lead to Go learning deficits, which should be ameliorated by DA medications, in the probabilistic learning tasks described above (Frank et al. 2004). The BG–PFC models suggest that BG/DA reductions should also lead to impaired Go signals for updating task-relevant information into prefrontal working memory representations (Frank & O'Reilly 2006). We found consistent evidence for this account in adult ADHD subjects tested on and off their regular dose of stimulant medications (Frank et al. 2007).

(iii) Noradrenaline in ADHD and action selection

One question that the hypodopaminergic hypothesis usually leaves unaddressed is why low DA levels in ADHD are not associated with Parkinson-like symptoms. First, it is probable that DA levels are much lower in PD patients, given that PD symptoms do not arise until DA is depleted by approximately 75–80%. Second, whereas PD patients simply do not have DA available, DA synthesis and availability are intact in ADHD. Thus, patients may try to self-regulate their DA levels, as seen in rats, which self-administer more amphetamine when DA receptors are partially blocked pharmacologically (Robbins & Everitt 1999). Intriguingly, patients may achieve these DA increases through their own hyperactive movements; matrix neurons of the striatum that are involved in motor selection can disinhibit DA release via striatonigral projections (e.g. Joel & Weiner 2000).

Moreover, while DA depletion is the core biological deficit in PD, noradrenaline (NA) regulation is also thought to be disturbed in ADHD (e.g. Biederman & Spencer 1999). The NA hypothesis is particularly well supported by the beneficial effects of the specific NA transporter blocker atomoxetine (e.g. Swanson et al. 2006). In this regard, it is instructive to consider effects of NA in physiological recordings, behaviour and computational models of action selection. While a complete review of this topic is outside the scope of this paper, we present a brief summary (see Aston-Jones & Cohen 2005 for a full review).

Like DA cells, NA-releasing neurons in the locus coeruleus (LC) fire in both tonic and phasic modes. In both electrophysiological recordings and computational simulations, LC cells release phasic NA bursts during periods of focused attention, infrequent target detection and good task performance. This phasic NA burst is thought to reflect the outcome of the response-selection process and serves to facilitate response execution. In contrast, poor performance is accompanied by a high tonic but low phasic state of LC firing. Usher et al. (1999) simulated the effects of these LC modes on action selection such that NA modulated the gain of the activation function in cortical response units. They showed that phasic NA release leads to ‘sharper’ cortical representations and a tighter distribution of reaction times, whereas the high tonic state was associated with more RT variability. They further hypothesized that increases in tonic NA during poor performance may be adaptive, in that they may enable the representation of alternate competing cortical actions during exploration of new behaviours.

This model has clear implications for ADHD. It is possible that ADHD participants are stuck with an intermediate high tonic, low phasic level of NA, leading to a preponderance of multiple cortical representations. This would result in variability in reaction times and distractibility of prefrontal representations. Indeed, studies that report within-subject RT variability consistently show that children with ADHD are more variable in their responses (Leth-Steensen et al. 2000; Castellanos et al. 2005). Notably, a recent study showed that this variability correlated with noradrenaline, but not dopamine function, as measured in urinary metabolites (Llorente et al. 2006).

(iv) Simulating NA function in action selection and ADHD

Given the purported role of the BG circuitry in action selection, one might question whether these cortical NA selection effects would apply within the context of a BG-cortical model. To explore potential interactions between the systems, we added a simulated LC layer to the standard BG model. In particular, we explored the effects of LC modulation of premotor cortical units, which reciprocally project back to the LC (figure 3a). Following Usher et al. (1999), the gain (i.e. slope) of the activation function (see electronic supplementary material) of premotor units was dynamically modulated in proportion to the LC unit response. This effectively makes cortical units more responsive and can increase the network signal-to-noise ratio, as hypothesized for NA (Servan-Schreiber et al. 1990). Thus, whereas the default gain parameter of cortical units in our modelling framework is statically set to 600 (O'Reilly & Munakata 2000), we applied a dynamic function to the gain γ of the premotor units:

[Equation (3.1): embedded image not available]

where LCact ranges from 0 to 1 and is the mean rate-coded activation of LC units. The resulting gain is relatively low when LC activity is low and increases monotonically with increasing LC/NA activation. Low LC firing, and hence low premotor gain, is associated with low-level noisy activation of multiple premotor unit responses (some noise is essential for initial exploration of possible actions; Frank 2005). However, sufficient premotor activity can elicit a phasic burst in LC unit activity via top-down premotor-LC excitatory projections. Critically, this LC burst does not occur unless premotor activity is sufficiently high, such that it is preferentially elicited by stimulus-evoked activity (due to prior stimulus–response learning from the input layer to the desired cortical response). This depiction is consistent with (i) the idea that LC phasic responses reflect the outcome of a task-related decisional process (Aston-Jones & Cohen 2005), (ii) observations that frontal cortical stimulation produces excitatory LC phasic responses (Jodo et al. 1998), and (iii) electrophysiological recordings showing that frontal activity precedes LC phasic activity (Jodo et al. 2000).
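The following minimal sketch illustrates the general idea of LC-dependent gain modulation. The exact form of equation (3.1) is not reproduced in this text, so the linear mapping in `premotor_gain` is a stand-in that merely shares the stated properties (low gain at low LC activity, monotonic increase). The lower bound of 20, the threshold of 0.25 and the thresholded saturating activation function are assumptions of this illustration, not the model's published parameters; only the value 600 comes from the text, where it is the static default gain.

```python
def premotor_gain(lc_act, min_gain=20.0, max_gain=600.0):
    """Hypothetical monotonic mapping from LC activation (0..1) to gain.

    This is NOT equation (3.1); it only mimics its described behaviour.
    """
    if not 0.0 <= lc_act <= 1.0:
        raise ValueError("lc_act must lie in [0, 1]")
    return min_gain + (max_gain - min_gain) * lc_act


def unit_activation(v_m, gain, threshold=0.25):
    """Thresholded, saturating rate code: gain*x / (gain*x + 1).

    x is the membrane potential above threshold; units below threshold
    are silent, and gain sets how steeply activation rises above it.
    """
    x = gain * max(v_m - threshold, 0.0)
    return x / (x + 1.0)


# A weakly supra-threshold (noise-driven) unit under low versus high LC activity.
noisy_v_m = 0.26
quiet_lc = unit_activation(noisy_v_m, premotor_gain(0.05))
burst_lc = unit_activation(noisy_v_m, premotor_gain(1.0))
```

With these illustrative numbers, the same weak input produces only modest activation at low LC activity but strong activation when gain is maximal, which is why a burst that arrives only after stimulus-evoked activity can sharpen the intended response without amplifying pre-stimulus noise.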

Figure 3

(a) Standard BG model with additional simulated cortical noradrenaline (NA) effects. The locus coeruleus (LC) fires phasically upon sufficient activation of premotor units and reciprocally modulates the gain of these units via simulated NA. (b) Normalized distributions for model reaction times (number of processing cycles before the BG facilitates a response). The LC phasic mode is associated with a narrow distribution of reaction times, peaking at 50 cycles. In the tonic mode (LC units tonically 50% activated), noisy activation of both competing responses leads to a bimodal distribution and overall more RT variability, potentially explaining the variability seen in ADHD. In the ‘supra-tonic’ mode, LC activity was tonically set to maximal firing rates, leading to faster RTs. (c) Per cent accuracy in the same simple choice discrimination simulated to generate RT distributions in panel (b). High accuracy is seen in the phasic LC mode, as premotor responsiveness is boosted only in the presence of a task-relevant stimulus–response association. The tonic and supra-tonic modes lead to activation of alternative noisy responses, which can get inappropriately executed if not dynamically modulated by the LC.

Moreover, the cortically driven phasic LC/NA burst reciprocally enhances the gain γ of cortical units, which facilitates the execution of the most active cortical representation by allowing it to dominate over alternative noisy units. This conceptualization is very similar to (and indeed was motivated by) that of Aston-Jones & Cohen (2005), but applies even in our model of BG–cortical interactions. Although the BG circuitry enables the facilitation of a desired response together with suppression of alternative responses, the LC/NA effects modulate the strength of inputs to the BG system. As previously noted, the BG cannot select a desired response itself—this response has first to be sufficiently activated (or ‘considered’) in cortex before the BG can gate its execution (Frank 2005). As we shall see, the LC/NA modulation affects both when the target response is facilitated and in some cases, which response is ultimately executed.

To demonstrate the effect of LC modulation, we trained our BG model to select between two alternative choices in response to two separate input stimuli A and B (each represented by a column of input units). Response 1 (R1) was positively reinforced (DA burst) on 80% of stimulus A trials, whereas R2 was reinforced on 80% of stimulus B trials. Networks were trained for 50 trials and easily learned this simple discrimination via standard BG/DA modulation of Go/NoGo learning. Note that as R1 is increasingly facilitated by the BG in response to stimulus A, Hebbian learning principles drive learning directly between the stimulus A input and R1 premotor cortical units (see above). In this manner, premotor cortex comes to eventually preferentially activate R1 (R2) in response to stimulus A (B), even prior to BG facilitation; this premotor action selection is subsequently facilitated by LC cortical modulation, BG Go signals and associated thalamic activation.
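The training regime just described can be sketched with a tabular stand-in for the network. This is a deliberately simplified abstraction and not the neural model itself: the Go/NoGo weight table, the learning rate, the exploration rate and the assumed 20% reward probability for the non-preferred response are all hypothetical choices made for illustration.

```python
import random

# From the text: R1 is reinforced on 80% of stimulus A trials and R2 on 80%
# of stimulus B trials. The 20% values for the other response are assumed.
REWARD_PROB = {("A", "R1"): 0.8, ("A", "R2"): 0.2,
               ("B", "R1"): 0.2, ("B", "R2"): 0.8}


def new_weights():
    """One Go and one NoGo weight per stimulus-response pair."""
    return {key: {"go": 0.5, "nogo": 0.5} for key in REWARD_PROB}


def choose(weights, stimulus, rng, explore=0.1):
    """Pick the response with the larger Go-minus-NoGo value, with some noise."""
    if rng.random() < explore:
        return rng.choice(["R1", "R2"])
    net = {r: weights[(stimulus, r)]["go"] - weights[(stimulus, r)]["nogo"]
           for r in ("R1", "R2")}
    if net["R1"] == net["R2"]:
        return rng.choice(["R1", "R2"])
    return "R1" if net["R1"] > net["R2"] else "R2"


def update(weights, stimulus, response, rewarded, lrate=0.1):
    """DA burst strengthens Go and weakens NoGo; a DA dip does the reverse."""
    w = weights[(stimulus, response)]
    if rewarded:
        w["go"] += lrate * (1.0 - w["go"])
        w["nogo"] -= lrate * w["nogo"]
    else:
        w["go"] -= lrate * w["go"]
        w["nogo"] += lrate * (1.0 - w["nogo"])


def train(n_trials=50, seed=0):
    """Run the probabilistic discrimination for n_trials and return the weights."""
    rng = random.Random(seed)
    weights = new_weights()
    for _ in range(n_trials):
        stimulus = rng.choice(["A", "B"])
        response = choose(weights, stimulus, rng)
        rewarded = rng.random() < REWARD_PROB[(stimulus, response)]
        update(weights, stimulus, response, rewarded)
    return weights
```

Calling `train()` returns the learned weight table; comparing `go - nogo` for each stimulus-response pair shows which response the sketch has come to favour for each stimulus.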

To assess the effects of NA on reaction times, we generated an RT distribution across 5000 trials of response selection after the initial training (with no further learning). The stimulus onset was delayed by approximately 30 cycles on each trial, so that during initial network processing, premotor activity only reflected intrinsic noise to a similar degree in R1 and R2 units. In the intact simulations, baseline tonic LC firing was low (LCact=0.05), and the resulting low gain of premotor units prevented noisy responses from being amplified. Once the stimulus (e.g. A) was presented, the appropriate response (R1) became preferentially active. The resulting LC burst further facilitated the active R1 representation, which was then swiftly accompanied by a BG Go signal. This scenario leads to a sharp distribution of model reaction times (figure 3b).

To simulate dysfunctional NA processes as hypothesized for ADHD, tonic LC firing was set high (50% maximal firing rate) and premotor to LC connections were severed so that no phasic burst was elicited. In this case, the intermediate gain of premotor units led to enhanced noisy activity in the absence of a task-relevant stimulus. Thus, in stimulus A trials, if R1 happened to be more active than R2 when A was presented, R1 was immediately facilitated. However, if R2 was more active, then the stimulus-evoked R1 activity led to increased response conflict in premotor cortex, due to simultaneous R1/R2 representations. This cortical conflict in turn led to longer decision times (see above discussion on the STN and Frank 2006). The overall pattern across trials led to a bimodal RT distribution for the LC tonic mode, with more variable and somewhat overall slower RTs, as is observed in ADHD. This bimodal distribution demonstrates that the same mechanisms responsible for simultaneous activation of responses (and associated exploratory behaviour) can lead to reaction time variability.
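The logic behind the bimodal distribution can be illustrated with a toy calculation that is far simpler than the network simulation. In this sketch, pre-stimulus noise gives either the correct or the incorrect response a head start, and the decision time is the number of cycles needed for the correct response to lead by a fixed margin. The threshold, drive and noise amplitudes are hypothetical integers chosen only to make the arithmetic exact; the sketch captures the bimodality and variability but not the overall slowing reported for the model.

```python
import random
import statistics

THRESHOLD = 100  # lead of R1 over R2 required to respond (arbitrary units)
DRIVE = 5        # lead gained per cycle once stimulus A is present


def decision_cycles(head_start):
    """Cycles until R1 leads R2 by THRESHOLD, given R1's initial lead.

    A negative head_start means noise favoured the wrong response (R2),
    so the conflict takes longer to resolve. Integer ceiling division.
    """
    remaining = THRESHOLD - head_start
    return max(0, -(-remaining // DRIVE))


def simulate_rts(noise_amplitude, n_trials=1000, seed=0):
    """On each trial, noise of the given size favours R1 or R2 at random."""
    rng = random.Random(seed)
    return [decision_cycles(rng.choice([-1, 1]) * noise_amplitude)
            for _ in range(n_trials)]


# Low premotor gain keeps pre-stimulus noise small; the tonic mode amplifies it.
low_gain_rts = simulate_rts(noise_amplitude=5)
tonic_rts = simulate_rts(noise_amplitude=50)
low_gain_spread = statistics.pstdev(low_gain_rts)
tonic_spread = statistics.pstdev(tonic_rts)
```

With small noise the two possible decision times lie close together, whereas amplified noise separates them into distinct fast and slow modes, which increases the spread of the distribution.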

Finally, to demonstrate the need for a phasic (dynamic) LC signal, and to control for overall differences in premotor unit gain, we ran a ‘supra-tonic’ condition in which the LC units were tonically active at maximal firing rates (to the same degree as maximal phasic activity). The tonically high gain led to overall more excitable premotor units and facilitated response execution, as evidenced by faster RTs. However, the lack of an adaptive LC signal for modulating premotor gain caused networks to be more likely to choose the incorrect response (in this case, R2) when stimulus A arrives (figure 3c). This is because if noise happens to favour R2, the high cortical gain can cause inappropriate execution. Overall, these simulations (and others in the electronic supplementary material) show that a dynamic LC signal is adaptive in modulating motor responsiveness. Simulated LC/NA dysfunction leads to more variable reaction times and simultaneous activation of multiple responses, which could lead to exploratory behaviour. Indeed, in other simulations we found that the same tonic NA parameters that increase RT variability here can also lead to erratic trial-to-trial exploratory behaviour, consistent with our observations that RT variability and exploration were highly correlated in non-medicated ADHD patients (Frank et al. 2007).

The NA account may also explain response inhibition deficits in ADHD. Phasic LC responses would be expected to occur during the infrequent ‘stop-signals’ in inhibition tasks, and these may transiently enhance processing in frontal and BG regions that support response inhibition. Supporting this account, increases in NA by atomoxetine lead to enhanced response inhibition in both healthy participants and those with ADHD (Overtoom 2003; Chamberlain et al. 2006). In the BG, NA may enhance Global NoGo signals via excitatory effects in the STN (Arcos et al. 2003) and/or in inferior frontal regions which in turn activate STN (Aron & Poldrack 2006).

In sum, both DA and NA effects are critical for various deficits in action selection in ADHD. It is plausible that NA effects are primarily involved in response inhibition and variability, while DA effects are involved in motivational/reward processes, supporting the independence of these symptoms (Solanto et al. 2001). We are hopeful that further investigation of the interactions between BG/DA and cortical NA effects in our models will provide increasingly refined predictions that can be tested empirically.

4. Clinical implications of basal ganglia modelling

The classical model of BG connections as described by Gerfen and others has had great heuristic value in explaining the effects of DA replacement therapy and surgical intervention in PD. This model was also of importance in the development of DBS utilizing the STN as the target (Benabid 2003). The evolution of this static anatomical model to a network-based model, which encompasses phasic changes in DA release, multiple feedback loops with variable delays and plastic changes based on Hebbian principles, will be important in further refining our clinical approaches to BG disorders, including the associated deficits in decision making. Chronic D2 stimulation by D2 agonists selectively diminishes the impact of negative consequences while leaving positive rewards intact, an effect that can easily be extrapolated to predict gambling addictions in patients treated with D2 selective agonists. Indeed, such effects have been reported in patients with PD and must be considered during the initiation of DA replacement therapy. As these drugs are now finding widespread use in the treatment of restless legs syndrome (which affects more than 5% of the general population), the effect on decision making has important public health ramifications. Patients with ADHD have increased risk for developing substance abuse later in life (e.g. Disney et al. 1999). Modelling of BG circuitry will be helpful in the development of specific psychological tests to screen for potentially adverse effects of new dopaminergic drugs on behaviour. DBS of BG structures has become the treatment of choice for advanced PD and is increasingly being applied to other neurological conditions including experimental trials in Tourette's syndrome and obsessive compulsive disorder (Dell'Osso et al. 2005). Despite the increasingly common use of this technique, basic questions remain regarding the optimum clinical application.
For example, treatment of PD can be accomplished by DBS in either the GPi or the STN with nearly equal improvement in motor function (Anderson et al. 2005). Modelling suggests that there may be important non-motor, cognitive effects of DBS that differ between these two anatomical targets. Interference with STN function, which is thought to provide a Global NoGo signal in the face of multiple competing incipient motor plans, could lead to impulsive behaviour. At a low level, this could lead to an ‘impulsive gait’ that has been observed in some patients, in whom improvement in motor function leads to increased falls if the patient fails to account for residual difficulties with balance. At a higher level, there may be impairment of fronto-striatal circuitry that leads to more global behavioural impulsivity. Clinical trials that test the predictions of BG modelling on decision making will be critical for selecting the proper therapeutic option.

Selective modulation of BG pathways using advanced neurobiological techniques holds great potential for psychiatric treatment. One important challenge that remains is the separation of effects on motor and cognitive circuits. DA receptor-blocking drugs that are useful in the treatment of schizophrenia and mania have a crossover effect in the motor portion of the striatum, where serious side effects such as Parkinsonism and tardive dyskinesia occur. Similarly, DA replacement therapy and DBS for movement disorders lead to neuropsychiatric side effects due to action in the cognitive and emotional circuits of the caudate and ventral striatum. One approach to this problem would be to capitalize on the wide anatomic separation of motor, cognitive and emotional circuits in the striatum. A viral gene transfer vector that selectively modulates the indirect or direct pathway could be constructed based on cell type-specific promoters coupled to ion channel-modifying sequences which shape the electrical output response of the neuron. When injected into the relevant area of the striatum, this agent would selectively control the Go or NoGo pathway in a limited sector of BG loops. This scenario may not be too far in the future since viral gene transfer vectors that modify synaptic transmission in the indirect pathway are currently in phase 1 trials (Luo et al. 2002).

Having a robust computational model will be critical for exploring how these effects interact at the dynamic systems level and in response to changing task demands. For example, while the described effect of DA on striatal D1 receptors was excitatory, this is only true for neurons that are in the ‘up-state’ (i.e. high membrane potential driven by synaptic activity); D1 activation is inhibitory on those in the ‘down-state’ (Hernandez-Lopez et al. 1997). Since this state-dependent modulation is inherently dynamic in nature, simulation of tonic D1 modulation effects (as in viral gene transfers) will be critical for assessing their potential benefit. It is acknowledged that therapies that operate at the ion channel level will require more detailed biophysical models than those presented here (e.g. Wolf et al. 2005). Nevertheless, abstractions of these functions may still be useful in the systems level of analysis. In our BG model, the D1 membrane potential modulation is simulated by a contrast enhancement function on the gain of Go neuron activation, such that the most active units are strengthened while less active units are suppressed. In this manner, Go learning during DA bursts is restricted to the most active synapses and is prevented in more weakly active synapses (Frank 2005)—allowing the model to learn Go to a response only in an appropriate stimulus context. See Cohen et al. (2002) for a similar discussion on the theoretical benefits of interplay between abstract and biophysically detailed models of D1 receptor effects in prefrontal cortex.
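The contrast enhancement idea can be sketched as follows. This illustration assumes that D1 stimulation raises both the gain and the threshold of a thresholded saturating activation function, so that weakly driven Go units fall silent while strongly driven units become more active. All parameter values and function names are hypothetical and chosen only to show the qualitative effect; they are not the model's published settings.

```python
def go_unit_activation(v_m, gain, threshold):
    """Thresholded, saturating rate code: gain*x / (gain*x + 1)."""
    x = gain * max(v_m - threshold, 0.0)
    return x / (x + 1.0)


def with_d1_modulation(v_m, da_level):
    """Apply hypothetical D1 contrast enhancement scaled by da_level (0..1).

    Higher DA raises gain (boosting strongly active units) and raises
    threshold (silencing weakly active ones). Values are illustrative.
    """
    gain = 10.0 + 40.0 * da_level
    threshold = 0.25 + 0.10 * da_level
    return go_unit_activation(v_m, gain, threshold)


weak_v_m, strong_v_m = 0.30, 0.80
weak_base = with_d1_modulation(weak_v_m, da_level=0.0)
weak_burst = with_d1_modulation(weak_v_m, da_level=1.0)
strong_base = with_d1_modulation(strong_v_m, da_level=0.0)
strong_burst = with_d1_modulation(strong_v_m, da_level=1.0)
```

Because activity-dependent learning scales with unit activation, silencing the weakly active unit during a DA burst is one way to confine Go learning to the most active synapses, as described above.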

Footnotes

  • One contribution of 15 to a Theme Issue ‘Modelling natural action selection’.
