Royal Society Publishing

Is there a brainstem substrate for action selection?

M.D Humphries, K Gurney, T.J Prescott

Abstract

The search for the neural substrate of vertebrate action selection has focused on structures in the forebrain and midbrain, and particularly on the group of sub-cortical nuclei known as the basal ganglia. Yet, the behavioural repertoire of decerebrate and neonatal animals suggests the existence of a relatively self-contained neural substrate for action selection in the brainstem. We propose that the medial reticular formation (mRF) is the substrate's main component and review evidence showing that the mRF's inputs, outputs and intrinsic organization are consistent with the requirements of an action-selection system. The internal architecture of the mRF is composed of interconnected neuron clusters. We present an anatomical model which suggests that the mRF's intrinsic circuitry constitutes a small-world network and extend this result to show that it may have evolved to reduce axonal wiring. Potential configurations of action representation within the internal circuitry of the mRF are then assessed by computational modelling. We present new results demonstrating that each cluster's output is most likely to represent activation of a component action; thus, coactivation of a set of these clusters would lead to the coordinated behavioural response observed in the animal. Finally, we consider the potential integration of the basal ganglia and mRF substrates for selection and suggest that they may collectively form a layered/hierarchical control system.

1. Introduction

A generally effective strategy for designing controllers of autonomous agents is to reverse-engineer biological systems that have evolved as solutions to the control problems. One such problem, the theme of this special issue, is action selection: a mortal agent must continuously choose and coordinate behaviours appropriate to both its context and current internal state if it is to survive. Animals necessarily embody successful solutions to the action-selection problem. Thus, it is natural to wonder what parts of the central nervous system—the neural substrate—have evolved to carry out the action-selection process.

Recent proposals for the neural substrate of the vertebrate action-selection system have focused on the basal ganglia (e.g. Mink & Thach 1993; Graybiel 1995; Doya 1999; Kropotov & Etlinger 1999; Redgrave et al. 1999; Rubchinsky et al. 2003; Grillner et al. 2005 and papers in this volume). This collection of nuclei in the forebrain and midbrain are undoubtedly intimately involved in motor control: damage to the basal ganglia results in a wide variety of disorders with motor symptoms, such as Parkinson's disease (Zigmond & Burke 2002). We have argued that, of all the structures of the vertebrate brain, the basal ganglia have the necessary inputs, outputs and internal connectivity to function as the central switch of an action-selection system (Prescott et al. 1999; Redgrave et al. 1999). Computational modelling of the intrinsic basal ganglia circuitry demonstrated that it is capable of resolving competition between action-representing signals, such that the basal ganglia output expresses the selection of the most appropriate action(s) and suppresses the others (Gurney et al. 2001a,b). At the same time, we readily acknowledged that the basal ganglia do not form the complete vertebrate action-selection system (Redgrave et al. 1999). Animals lacking functioning basal ganglia are not completely impaired, though their behavioural repertoire is undeniably limited. Thus, the basal ganglia are not necessary for all forms of action selection.

Decerebrate animals and altricial (helpless at birth) neonates do not have fully intact basal ganglia but are capable of expressing spontaneous behaviours and coordinated and appropriate responses to stimuli. During decerebration, the entire brain anterior to the superior colliculus is removed leaving only the hindbrain intact (figure 1). Yet, the chronic decerebrate rat can, for example, spontaneously locomote, orient correctly to sounds, groom, perform coordinated feeding actions and discriminate food types (Woods 1964; Lovick 1972; Berntson & Micco 1976; Berridge 1989). Such animals clearly have some form of intact system for simple action selection that enables them to both respond to stimuli with appropriate actions (more complex than simple spinal-level reflexes), and sequence behaviours—as demonstrated by the holding, gnawing and chewing required for eating solid food.

Figure 1

Anatomical locations of the putative action-selection systems. (a) The relative locations of major nuclei and structures including the basal ganglia (hashed) and the mRF shown in a cartoon sagittal section of rat brain. The dashed lines show the location of the three most common decerebration lines—all the brain rostral to the line is removed, leaving hindbrain and spinal cord intact. GP, globus pallidus; SN, substantia nigra; STN, subthalamic nucleus; SC, superior colliculus. (b) Principal reticular formation nuclei and fields in a schematic horizontal section from spinal cord to decerebration line 1 in (a). The main components of the putative brainstem action-selection system are in the medial RF.

Is there then a brainstem substrate for action selection? Such a substrate should have the necessary properties of a system specialized for action selection. We believe these to be the following (Redgrave et al. 1999). First, the system requires inputs that provide information about an animal's internal state and external context. Second, the system requires a method for computing the urgency (or salience) of each available action from the provided information, in some ‘common currency’ that allows comparison of their relative levels of support. Third, the system must have an internal configuration that allows for both the representation and the resolution of competition between actions. Fourth, the system must have outputs allowing the expression of the selected action. In addition, we may identify the substrate by the effect that its manipulations have on the performance of actions.

On this basis, of the structures left intact in the brainstem of decerebrate animals, we propose that the medial reticular formation (mRF) is the most probable substrate of a generalized simple action-selection mechanism. We are not proposing that the mRF subsumes the basal ganglia's action-selection role, but rather that the mRF is capable of performing limited action selection in the absence of basal ganglia.

We are not the first to note that the mRF may function as some form of selection device. Warren McCulloch and colleagues proposed that the mRF was a ‘mode selector’, which sets the global behavioural state of an animal—such as escape, feeding and so on. To demonstrate the plausibility of their proposal, they created one of the first computational neuroscience models and showed that their interpretation of the mRF's structure could perform selection of signals (Kilmer et al. 1969). Their emphasis was on the ascending projections of the RF, the connections to thalamus and cortex being responsible for setting the overall state of the animal. Our emphasis is on the dominant descending projections of the mRF and the potential they have to directly control motor behaviour.

Manipulations of the mRF directly affect actions. An intact mRF is trivially necessary for action selection in the sense that lesions to specific parts of it cause coma and even death in humans (Parvizi & Damasio 2003). Substantial cytoskeletal lesions have also been found in the mRF of Parkinson's disease patients (Braak et al. 2000). Thus, as with damage to the basal ganglia, damage to the mRF may make a significant contribution to the symptomatic motor deficits of this disease. Early studies showed that stimulation of the RF resulted in motor responses (Magoun & Rhines 1946); electrical stimulation of specific mRF regions can elicit locomotion (Kinjo et al. 1990; Whelan 1996). Neurons within other regions of the mRF are critical for the maintenance of posture (Mori 1987), the control of feeding behaviours (Lund et al. 1998) and the generation of eye movements (Moschovakis et al. 1996). In a comprehensive review, Siegel (1979) found that multiple competencies were attributed to the mRF because its neural activity correlated with a wide range of responses to stimuli and with naturally occurring behaviours. He concluded that the only way to reconcile these conflicting data was to assume that mRF neuron activity controlled the specific muscle groups required to perform the behaviour or response being tested. These studies are all consistent with Kuypers' classical division of motor control into a lateral system with fine control of the distal musculature, governed by cortex, and a medial system with gross control of the axial musculature, governed by the medial brainstem (Kuypers 1964).

We will now argue that the mRF has the necessary properties of an action-selection system. A review of its inputs and outputs suggests that receiving information and expressing selection are accounted for. At the outset of this work, we found that no clear current picture of the mRF's internal organization existed. We thus devote considerable attention to our proposal—part of which was published in Humphries et al. (2006)—for its structure, the quantitative models that generate it and the reasons for its existence. Having established a structural organization, we then consider the potential methods of representing and resolving action selection within it. To do so, we use example simulations of a new population-level computational model to illustrate the alternatives. Finally, we briefly consider how the putative basal ganglia and mRF action-selection mechanisms may interact.

2. External connections of the mRF

A substrate for action selection should have access to all the information necessary to compute an appropriate subsequent action. Numerous studies have demonstrated mRF neurons responding to a wide variety of stimuli, and many respond to multiple sensory modalities (Siegel 1979; Scheibel 1984). Classically, the small neurons in the lateral brainstem—the parvicellular area—were thought to relay sensory input to the medial brainstem (Scheibel & Scheibel 1967). However, neurons in the parvicellular area receive input from a limited range of sensory sources (Shammah-Lagnado et al. 1992), and many sensory systems provide primary or secondary afferents directly to the mRF.

The mRF receives input from every one of the body's sensory, pain, vestibular (balance), visceral (organs), proprioceptive (muscle and joint), cardiovascular and respiratory systems. Many of these links have been demonstrated anatomically: direct inputs have been traced from secondary nuclei in the whisker (Kleinfeld et al. 1999), auditory (Cant & Benson 2003) and vestibular systems (Yates & Stocker 1998); the proprioceptive information carried by the ascending dorsal column is directly relayed to the mRF via collaterals from the gracile and cuneate nuclei (Salibi et al. 1980); and the spinoreticular tract and collaterals from the spinothalamic tract, the primary routes for pain signals to the brain, are a major source of fibres reaching the mRF (Fields & Basbaum 1978).

These anatomical inputs are consistent with the multimodal responses recorded from mRF neurons. Individual neurons respond to somatic stimuli (Segundo et al. 1967), and many respond to the stimulation of multiple body locations (Bowsher 1970; Schulz et al. 1983). Similarly, mRF neurons respond to experimental manipulations of the cardiovascular (blood pressure and cardiac rhythm) and respiratory (rhythm, lung inflation and deflation) systems (Langhorst et al. 1983). Again, many of the recorded neurons showed responses to manipulations of both systems. Moreover, a combined study showed that many mRF neurons respond to stimulation of multiple somatic regions and manipulation of both cardiovascular and respiratory systems (Langhorst et al. 1996). Thus, it seems that the mRF has access to all information made available by an animal's external and internal sensory and monitoring systems. Moreover, since these inputs converge on single neurons, they are in a position to extract correlated input, providing a basis for the computation of an action's salience.

A substrate for action selection should also be able to express the outcome of the selection competition. The majority of neurons in the mRF project extensively to all levels of the spinal cord and to the cranial nerves (Torvik & Brodal 1957; Eccles et al. 1976; Jones 1995). Axons of individual reticulospinal neurons can contact multiple spinal levels on both sides of the spinal cord (Peterson 1979). Recent studies have shown that the majority of reticulospinal neurons synapse on spinal interneurons (Matsuyama et al. 2004). The anatomy of the mRF's output is thus consistent with the ability to control the axial musculature (trunk, limbs and neck) and the face.

Reticulospinal neurons have direct control over the activity of central pattern generators (CPGs) located in the spinal cord (Matsuyama et al. 2004) and the brainstem (Lund et al. 1998). Studies of the lamprey swimming CPG—homologous to the mammalian locomotion CPG—have found that the level of mRF neuron activity is directly related to the frequency of oscillation in the CPG, and thus may set the speed of swimming and the angle of turning (Deliagina et al. 2002). Similarly, Noga et al. (2003) proposed that mRF reticulospinal neurons directly drive the putative mammalian locomotion CPG. Thus, there is evidence not only that mRF neurons contact structures able to directly express action, but also that their activity levels may encode the degree of behavioural activation.

3. Internal circuitry of the mRF

The effects of manipulations of the mRF on behaviour and its external connectivity together make a compelling case for the involvement of the mRF in action selection. Demonstrating that it is able to represent and resolve action competitions is impeded by the lack of a clear picture of its internal anatomy. We describe here our recent work to solve this problem.

(a) Known anatomy of the mRF

Classic Golgi staining work by Scheibel & Scheibel (1967) showed the existence of giant-bodied neurons with bifurcating axons and disc-like radial dendritic trees; they proposed that the giant neurons were arranged along the rostrocaudal axis like ‘a stack of poker chips’. However, little work had been done to integrate more recent anatomical studies of the RF into a coherent picture of its internal structure. Therefore, we conducted an extensive literature review, leading us to propose the following structural organization (Humphries et al. 2006).

We identified two main neuron classes. The projection neurons extend a bifurcating axon, predominantly sending the major branch caudally to the spinal cord and the minor branch rostrally towards the midbrain (the giant neuron of the Scheibels' Golgi studies belongs to this class). The neurons make excitatory contacts with their targets, mostly via collaterals regularly branching from the main axon. Typically medium-to-giant in size, projection neurons have a characteristic radial dendritic field extending in the coronal (vertical, mediolateral) plane but not along the rostrocaudal axis. The dendrites thus seem positioned to sample from the multiple fibre tracts traversing the RF along the rostrocaudal axis, carrying the axons of many spinal, cortical and sensory systems. Figure 2a shows the spatial relationships between these tracts, and the projection neurons' dendritic fields and axon trajectories. The interneurons project their axon almost entirely within the RF, predominantly along the mediolateral axis, and make inhibitory contacts with their targets. There is good functional evidence for localized intra-mRF inhibition (Holmes et al. 1994; Iwakiri et al. 1995).

Figure 2

Anatomical organization of the vertebrate mRF. Directional arrows apply to both panels. (a) Sagittal section of the brainstem. The dendritic trees (thick grey lines) of the projection neurons (one neuron body shown, open circle) extend throughout the mRF along the dorsoventral axis but extend little along the rostrocaudal axis. These dendritic trees contact axon collaterals of both passing fibre systems (black dashed line) and far-reaching axons of the projection neurons (the axon of the depicted neuron body is shown by the black solid line); the example fibre system is the spinothalamic tract (ST). (b) The proposed mRF organization: it comprises stacked clusters (three of them are shown) containing medium-to-large projection neurons (open circles) and small-to-medium interneurons (filled circles); cluster limits (grey ovals) are defined by the initial collaterals from the projection neuron axons. The projection neurons' radial dendritic fields allow sampling of ascending and descending inputs both from other clusters (solid black lines) and from passing fibre systems (dashed black line). The interneurons project within their parent cluster. Reproduced from Humphries et al. (2006).

We proposed that these neurons are arranged in a series of stacked clusters, each comprising a mix of projection and interneurons, and each delimited by the initial collateral from the projection neurons' axons—which occurs roughly 100 μm from the initial bifurcation. In other words, a cluster's rostral and caudal borders are defined by the first collateral in those directions from the projection neurons' axons. Thus, the interneurons project only within the cluster and the projection neurons contact only the neurons outside the cluster. This cluster structure is replicated on both sides of the midline (on both sides of the raphe nuclei in figure 1b). The proposed mRF structure is explained further in figure 2b.

(b) An anatomical model of the mRF

In Humphries et al. (2006), we specified a stochastic model that generated a network with the above organization. A network is a combination of a set of nodes and the set of links between those nodes; for the mRF's neural network the nodes are neurons and the links represent synaptic contact. Here, we describe the definitions of the nodes and links for the mRF model—the full mathematical description is given in the electronic supplementary material, A, and further detail in Humphries et al. (2006).

Six parameters completely describe the network's structure. Two parameters determine the number of nodes. Each of the Nc clusters in the network has n neurons (the total number of neurons—nodes—within the network is thus T=Nc×n). One parameter determines the class of neuron the nodes represent. Within each cluster, a certain proportion, ρ, of neurons are deemed to be the projection neurons; the remainder are interneurons.

The other three parameters describe the connectivity and thus define the links between the nodes. The probability of each projection neuron contacting a given cluster is P(c). This models the probability of the projection neuron's axon extending a collateral into that cluster. If a collateral is extended, then P(p) is the probability of the projection neuron forming a connection with any given neuron in that cluster. Finally, P(l) denotes the probability of an interneuron forming a connection with any other given neuron in its own cluster.
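The six parameters can be turned into a network generator along the following lines. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `generate_mrf_network` and its argument names are hypothetical, P(c) is treated as a single constant for every cluster, the first ρ·n neurons of each cluster are arbitrarily labelled as projection neurons, and the sign of each link (excitatory or inhibitory) is not stored.

```python
import numpy as np

def generate_mrf_network(n_clusters, n_per_cluster, rho, p_c, p_p, p_l, seed=0):
    """Sample a directed adjacency matrix for the stochastic cluster model.

    adj[i, j] is True when neuron i contacts neuron j.
    Assumption: p_c is the same for every (neuron, cluster) pair.
    """
    rng = np.random.default_rng(seed)
    total = n_clusters * n_per_cluster                 # T = Nc x n
    cluster = np.repeat(np.arange(n_clusters), n_per_cluster)
    # Assumption: the first round(rho * n) neurons of each cluster are projection neurons.
    n_proj = int(round(rho * n_per_cluster))
    is_proj = (np.arange(total) % n_per_cluster) < n_proj
    adj = np.zeros((total, total), dtype=bool)
    for i in range(total):
        if is_proj[i]:
            # Projection neurons contact only neurons outside their own cluster.
            for c in range(n_clusters):
                if c == cluster[i]:
                    continue
                if rng.random() < p_c:                 # collateral enters cluster c
                    targets = np.flatnonzero(cluster == c)
                    hits = rng.random(targets.size) < p_p
                    adj[i, targets[hits]] = True
        else:
            # Interneurons contact only other neurons in their own cluster.
            same = (cluster == cluster[i]) & (np.arange(total) != i)
            targets = np.flatnonzero(same)
            hits = rng.random(targets.size) < p_l
            adj[i, targets[hits]] = True
    return adj, cluster, is_proj
```

A call such as `generate_mrf_network(4, 10, 0.8, 0.5, 0.3, 0.2)` returns the adjacency matrix together with each node's cluster index and neuron class; the parameter values in that call are illustrative only and are not taken from the anatomical literature.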

We also proposed an alternative generating model for the cluster structure, based on the stochastic model, in which the links were defined by a procedure analogous to the neural development process. Both the existing and new results described below are similar for both the models, so henceforth we refer to them collectively as the anatomical model.

Ranges for the values taken by parameters Nc, ρ and P(c) were defined from anatomical data in the literature. Values for n were chosen to maximize the size of the networks that could be comfortably supported on a desktop PC. The synaptic connection parameters P(p) and P(l) do not have supporting values in the literature, and thus these were free parameters of the model.

(c) Structural properties of the mRF

An extensive exploration of the network properties of the anatomical model showed, to the extent that it captures the mRF's organization (and for all realistic values of the parameters given previously), that the mRF is likely to be a small-world, but not scale-free, network at the individual neuron level (Humphries et al. 2006). A small-world network has two defining properties: its nodes are more clustered—more locally interconnected—than would be expected if the same number of total links were made at random; and its nodes are also linked by shorter paths than would be expected if the same number of total links were made uniformly. Small worlds have been found in many real-world networks, including connections between airports, electricity grids and food webs, suggesting that some general organizational principle is at work (see Albert & Barabasi 2002, for review).
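The two defining quantities, local clustering and shortest path length, can be computed and compared against a random graph with the same number of links. The sketch below is illustrative only: it assumes an undirected boolean adjacency matrix without self-connections, and the function names are hypothetical. It is not the analysis pipeline of Humphries et al. (2006).

```python
from collections import deque
import numpy as np

def clustering_coefficient(adj):
    """Mean local clustering coefficient of an undirected boolean adjacency matrix."""
    coeffs = []
    for i in range(adj.shape[0]):
        nbrs = np.flatnonzero(adj[i])
        k = nbrs.size
        if k < 2:
            coeffs.append(0.0)                       # no neighbour pairs to be linked
            continue
        links = adj[np.ix_(nbrs, nbrs)].sum() / 2    # edges among the neighbours
        coeffs.append(links / (k * (k - 1) / 2))
    return float(np.mean(coeffs))

def mean_path_length(adj):
    """Mean shortest path length over all reachable ordered pairs (breadth-first search)."""
    n = adj.shape[0]
    neighbours = [np.flatnonzero(adj[i]) for i in range(n)]
    total, pairs = 0, 0
    for s in range(n):
        dist = np.full(n, -1)
        dist[s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in neighbours[u]:
                if dist[v] < 0:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        reached = dist > 0
        total += int(dist[reached].sum())
        pairs += int(reached.sum())
    return total / pairs

def random_graph_like(adj, seed=0):
    """Undirected random graph with the same node and edge counts as adj."""
    rng = np.random.default_rng(seed)
    n = adj.shape[0]
    iu, ju = np.triu_indices(n, k=1)
    n_edges = int(adj[iu, ju].sum())
    chosen = rng.choice(iu.size, size=n_edges, replace=False)
    rand = np.zeros((n, n), dtype=bool)
    rand[iu[chosen], ju[chosen]] = True
    return rand | rand.T
```

Comparing `clustering_coefficient(adj)` and `mean_path_length(adj)` with the same measures on `random_graph_like(adj)` gives the random-graph baseline against which clustering is judged.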

Why then is the mRF a small-world network? What functional advantages does it bestow? The structural properties of a small-world network imply certain dynamic properties—of rapid cross-network synchronization, consistent stabilization and persistent activity—that may all be critical to the representation and resolution of competition between actions (briefly reviewed in Humphries et al. 2006). However, the presence of a small world also implies further organizational properties. For example, Mathias & Gopal (2001) demonstrated that, in a one-dimensional ring of nodes, small-world networks were formed when attempting to find the optimal trade-off between the total wire length and the shortest path length. It is not known whether this result is true for any other placement of nodes, such as the irregular node spacing and higher-dimensional space of the proposed mRF cluster structure.

Could the cluster structure have thus evolved to optimize neural connectivity? Other neural structures appear to have optimized component placement to minimize the total wiring length (Cherniak 1994). This may be a priority of neural design, as it reduces energy usage during creation of, maintenance of, and signal propagation along, axons (Laughlin & Sejnowski 2003). We therefore look for the first time at how a cluster structure may reduce the total axonal wire length.

(d) The cluster structure reduces wiring length for a network configuration

To begin, we must define what the wiring length is reduced with respect to. Our two hypotheses, shown in figure 3, are: (H1) the cluster structure could reduce the wiring connecting together neurons fixed in particular positions, i.e. the neuron placement is critical, for example, due to the position of input fibres, and the wiring is arbitrary to some extent; and (H2) the cluster structure could reduce the length of wiring required to achieve a particular network configuration, i.e. the internal wiring is critical and the neuron position is arbitrary to some extent. The second hypothesis is akin to the problem of component placement optimization (Cherniak 1994).

Figure 3

Two hypotheses of wiring efficiency. The total wiring length of a network (left) can be reduced in two ways. Hypothesis 1 (H1): if the node placement is crucial—due, say, to the position of the inputs to the network—then the wiring length may be minimized (for the same number of links) by moving the links while ensuring that each node remains connected. Hypothesis 2 (H2): if the network configuration is crucial, then the wiring length may be minimized by moving the nodes while maintaining the links.

A set of cluster model networks were generated by varying the synaptic connection probabilities (P(p) and P(l)) over their plausible ranges—further details are given in the electronic supplementary material, B. Each node of the network was assigned a three-dimensional position within the estimated volume of its anatomical cluster. The total axonal wire length was then computed by calculating the Euclidean distance between each pair of connected nodes in the network and summing over all pairs. Thus, we are only interested here in direct point-to-point wiring: we take no account of the design of morphological features (dendritic trees and axon branching points) that may have evolved to further reduce the wiring costs. Nevertheless, as the axon length required to connect two neurons is simply a function of the distance between them, a useful comparison can be made with other networks, which also do not account for morphology.

For each generated cluster model network, two random networks were created, one to test each of the two hypotheses just outlined. The first was a randomly wired network in which nodes were placed in the same three-dimensional positions, but pairs of nodes were connected at random until the same total number of links as the compared cluster model network was reached. This model tests H1: if the total wire length for the cluster model was less than for the randomly wired network, then there is evidence that the cluster structure reduces axonal wiring for a given node (neuron) placement. The second was a randomly positioned network in which the cluster model network links were retained, but all nodes were randomly placed in the total three-dimensional volume covered by the clusters. This model tests H2: if the total wire length for the cluster model was less than for the randomly positioned network, then there is evidence that the cluster structure's node (neuron) placement reduces axonal wiring for a given wiring configuration. (This analysis cannot demonstrate that the cluster structure optimizes either wiring configuration or node placement, which would require an exhaustive search of all possible configurations or placements; we simply show here their comparative efficiency.)
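The wire-length measure and the two control networks can be sketched as follows. This is an illustration under stated assumptions rather than the procedure of the electronic supplementary material: the function names are hypothetical, each directed link is counted once (so a reciprocal pair contributes twice), and the volume covered by the clusters is approximated by the axis-aligned bounding box of the node positions.

```python
import numpy as np

def total_wire_length(adj, pos):
    """Sum of Euclidean distances over every directed link in adj.

    adj: (N, N) boolean adjacency matrix; pos: (N, 3) node coordinates.
    """
    src, dst = np.nonzero(adj)
    return float(np.linalg.norm(pos[src] - pos[dst], axis=1).sum())

def randomly_wired(adj, seed=0):
    """H1 control: same node positions and link count, links placed at random."""
    rng = np.random.default_rng(seed)
    n = adj.shape[0]
    off_diag = np.flatnonzero(~np.eye(n, dtype=bool))   # exclude self-connections
    chosen = rng.choice(off_diag, size=int(adj.sum()), replace=False)
    rand = np.zeros((n, n), dtype=bool)
    rand.flat[chosen] = True
    return rand

def randomly_positioned(pos, seed=0):
    """H2 control: same links, nodes scattered uniformly in the bounding box.

    Assumption: the bounding box stands in for the volume covered by the clusters.
    """
    rng = np.random.default_rng(seed)
    return rng.uniform(pos.min(axis=0), pos.max(axis=0), size=pos.shape)
```

With these helpers, H1 is examined by comparing `total_wire_length(adj, pos)` with `total_wire_length(randomly_wired(adj), pos)`, and H2 by comparing it with `total_wire_length(adj, randomly_positioned(pos))`.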

The total wire length for the cluster structure was greater than that of the corresponding randomly wired network, but less than that of the corresponding randomly positioned network, for every generated cluster model network (see electronic supplementary material, B). Therefore, we reject H1 but have evidence for H2: the cluster structure of the mRF does not specifically reduce the axonal wire length for a given neuron placement (H1), but wiring length is comparatively reduced for a given wiring configuration (H2), and thus may explain why the cluster structure has evolved.

4. Action representation in the mRF

Having examined both the structure of the mRF and possible reasons for the structure's existence, we now turn to the question of how that structure supports the representation and resolution of competition between actions. We begin by reviewing existing ideas on the functional organization of the mRF.

(a) Functional organization of mRF

Many researchers have seen no functional organization in the mRF. Early studies report stimulation of the RF resulting in either postural inhibition, via descending projections to the spinal cord (Magoun & Rhines 1946), or desynchronization of the cortical electroencephalogram (EEG), via ascending projections (Moruzzi & Magoun 1949). The latter result gave rise to the well-known concept of the ascending reticular activating system. These results, along with the wide array of overlapping sensory inputs to the mRF that lack a demonstrable organization (other than lateralization; Segundo et al. 1967), led some researchers to assert that mRF output was only a function of general sensory arousal (Scheibel & Scheibel 1967; Hobson & Scheibel 1980).

Though still widely discussed, the division of the RF into just two systems (ascending, facilitatory and descending, inhibitory) was refuted soon after by Sprague & Chambers (1954). By applying micro-stimulation at or near threshold to mRF neurons of awake animals, they were able to elicit a multitude of single and multiple limb movements. They saw little of the reported postural inhibition. More recent micro-stimulation studies of the medial medullary RF have demonstrated both multiple movement and multiple muscle responses following the injection of short trains of low-amplitude current pulses (Drew & Rossignol 1990). (The same micro-stimulation applied to the lateral medullary RF did not consistently result in movement, further evidence that the mRF is the substrate of action selection in the brainstem.) Neurons of the mRF thus have functionally specialized rather than general outputs.

How then might the mRF neurons be functionally organized? They are not topographically organized to match patterns of sensory input; despite numerous attempts, no topographical projections to the mRF have ever been convincingly demonstrated (Segundo et al. 1967; Bowsher 1970; Eccles et al. 1976). Groves et al. (1973) reported that tactile stimuli were encoded in rough somatotopic form in the RF, but the methods used could not distinguish between recording from neuron bodies and that from passing fibres, and their recording sites covered the whole coronal extent of the brainstem (Angel 1977). On the output side, Peterson (1979) proposed a crude topography of the reticulospinal projections, based on the combinations of elicited responses in motoneurons related to the neck, back, forelimb and hindlimb. However, other studies of this system found no anatomical topography of the spinal projections (Torvik & Brodal 1957; Eccles et al. 1976), and neurons responding during movement of those body parts seemed randomly intermingled (Siegel & Tomaszewski 1983).

In spite of the above, there is evidence for a functional organization in the mRF based on common activity patterns. Neighbouring pairs of mRF neurons have correlated activity in both waking (Siegel et al. 1981) and anaesthetized (Schulz et al. 1985) animals, evidence for a common afferent input. In both studies, all neuron pairs separated by more than 200 μm showed no correlations. Similarly, neighbouring mRF neurons have overlapping somatic sensory fields, but distal pairs do not (Schulz et al. 1983). There is thus evidence for neighbouring neurons having common activity patterns. On this basis, we hypothesize that clusters in the mRF are functionally as well as anatomically distinct and are, therefore, the representational unit in the brainstem action-selection system.

We assume that sites in the cranial nerve nuclei and the spinal cord targeted by the projection neurons express the action selected by the mRF system. Many projection neurons have correlated activity with multiple movements, and the activity of near-neighbour projection neurons often does not correlate with the same movement or set of movements (Siegel & Tomaszewski 1983). Thus, the correlated activity between near-neighbour projection neurons in waking animals (Siegel et al. 1981) would lead to the simultaneous recruitment of multiple muscle groups and movement types. We therefore propose that sufficient activation of a cluster's projection neurons would lead to a coordinated behavioural response—as has been demonstrated for some spinal CPGs (§2).

(b) Computational modelling of the mRF

Beyond the work just detailed, there is little direct evidence on the functional organization of the mRF. We must thus explore the potential methods of representing and resolving action selection through simulation by computational models. Moreover, as the Kilmer et al. (1969) model is the only quantitative model of the mRF (discussed further in §4d), and as that model does not reflect the proposed cluster structure of the mRF, we must define a computational model which bases its connectivity on our anatomical models.

(i) Incorporating afferent input

Before examining the dynamics of the cluster organization, we must add definitions to the stochastic anatomical model for afferent input. As reviewed previously, this input comes from multiple sensory and internal monitoring systems. Two parameters are added to define the proportion of neurons receiving input: a proportion of projection neurons, ρs, and a proportion of interneurons, λs, are defined as receiving afferents within each cluster—these proportions are the same for every cluster. Ranges for these two parameters are discussed in the electronic supplementary material, C.1. The result of these additions is that each node in the generated network is assigned a flag, indicating the presence or absence of afferent input.
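The flag assignment can be sketched as below. This is an assumption-laden illustration: the function name `assign_afferent_flags` is hypothetical, the neurons receiving afferents are chosen uniformly at random within each class, and the number chosen per cluster is the proportion rounded to the nearest integer.

```python
import numpy as np

def assign_afferent_flags(cluster, is_proj, rho_s, lambda_s, seed=0):
    """Return a boolean flag per node marking the presence of afferent input.

    cluster: cluster index of each node; is_proj: True for projection neurons.
    rho_s / lambda_s: proportions of projection neurons / interneurons
    receiving afferents, identical for every cluster.
    """
    rng = np.random.default_rng(seed)
    flags = np.zeros(cluster.size, dtype=bool)
    for c in np.unique(cluster):
        for mask, frac in ((is_proj, rho_s), (~is_proj, lambda_s)):
            members = np.flatnonzero((cluster == c) & mask)
            k = int(round(frac * members.size))      # assumption: round to nearest
            if k:
                flags[rng.choice(members, size=k, replace=False)] = True
    return flags
```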

(ii) The computational model

One option for exploring the potential for action selection in the mRF would be to simply implement the anatomical network as a neural network, with an artificial neuron for every node. However, this creates a network of the order of 10^3–10^4 neurons, which prohibits a thorough examination of its dynamic properties in simulation. Moreover, it is rather more detailed than we require to consider the initial list of possible action representations in the mRF.

Instead, we follow a tradition of capturing the global dynamic properties of a neural system using what have been variously called ‘macroscopic’, ‘mean-field’ or ‘population’ models (Wilson & Cowan 1972; Tsodyks et al. 1997; Latham et al. 2000; Yousif & Denham 2005). In this approach, populations of neurons are treated as a statistical ensemble, assuming that the connections between populations are such that functionally meaningful subgroups of neurons cannot be further distinguished. Thus, the model is a set of simplified ordinary differential equations describing the change in the normalized mean firing rate of each population over time; in other words, it is only concerned with temporal dynamics. Nevertheless, if the parameter values and the populations are carefully chosen, then this approach can both reveal similar dynamics to more complex models with individual neural elements and match recorded changes in neural activity (Latham et al. 2000; Yousif & Denham 2005). Moreover, the simplicity of the resulting models allows for a more thorough exploration of their dynamic properties, via both simulation and analysis. Thus, we establish here a population-level model of the mRF.

Given the proposed cluster structure and the hypothesis of projection neurons encoding the action representation, the most natural division of the mRF is into separate populations of projection and interneurons for each cluster. The computational model thus has two vectors encapsulating its behaviour: the projection-neuron activity, c, and the interneuron activity, i. Each vector element is a population: ck is the normalized mean firing rate of the kth cluster's projection-neuron population and ik is the normalized mean firing rate of the kth cluster's interneuron population. These activities evolve according to the differential equations given in the electronic supplementary material, C.2.

The connections between the populations are defined by the underlying network generated by the anatomical model. Each link in the network is assigned a weight value, indicating its relative strength and sign (inhibitory or excitatory). A population in the computational model encapsulates a set of nodes in the network; the connection weight between any pair of populations is thus the mean value of all the weighted links between the nodes of those two populations in the network.
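The averaging step can be made concrete with a minimal sketch. The dictionary representation of the network and the function name are hypothetical, and returning zero when no links exist between two populations is an assumption of this sketch rather than something stated in the text.

```python
def population_weight(links, source_nodes, target_nodes):
    """Mean signed weight of all links from source_nodes to target_nodes.

    links: dict mapping (source_node, target_node) -> signed weight
           (negative for inhibitory links, positive for excitatory ones).
    """
    source_nodes, target_nodes = set(source_nodes), set(target_nodes)
    weights = [
        w for (s, t), w in links.items()
        if s in source_nodes and t in target_nodes
    ]
    # Assumption: populations with no links between them get weight zero.
    return sum(weights) / len(weights) if weights else 0.0
```

For example, two excitatory links of weight 0.4 and 0.2 from one population to another would collapse to a single population-level weight of about 0.3.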

Both the anatomical organization and the neural activity characteristics (§3) are consistent with each cluster having a unique pattern of multimodal input. We thus describe input to the model by the vector u, where each element uk is the normalized mean afferent input to the kth cluster. Each uk's relative contribution to the projection and interneuron populations of the kth cluster is given by the values for ρs and λs, respectively.
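To give a feel for what a population-level model of this kind looks like, the following is a minimal sketch and not the model of the electronic supplementary material, C.2. It assumes generic leaky-integrator dynamics with a piecewise-linear output bounded in [0, 1], forward-Euler integration, and hypothetical names for the weight arrays; only the roles of c, i, u, ρs and λs are taken from the text.

```python
def clip01(x):
    """Piecewise-linear output: normalized firing rate bounded in [0, 1]."""
    return min(1.0, max(0.0, x))


def simulate(u, w_cc, w_ci, w_ic, rho_s=1.0, lambda_s=0.0,
             tau=0.01, dt=0.001, steps=5000):
    """Integrate an assumed leaky-integrator population model to (near) equilibrium.

    u:    afferent input per cluster.
    w_cc: w_cc[j][k] = weight from cluster j projection neurons to cluster k
          projection neurons (excitatory, intercluster).
    w_ci: w_ci[j][k] = weight from cluster j projection neurons to cluster k
          interneurons (excitatory, intercluster).
    w_ic: w_ic[k] = weight from cluster k interneurons to its own projection
          neurons (inhibitory, so negative).
    Returns (c, i): projection-neuron and interneuron outputs per cluster.
    """
    n = len(u)
    a_c = [0.0] * n  # projection-neuron activations
    a_i = [0.0] * n  # interneuron activations
    for _ in range(steps):
        c = [clip01(a) for a in a_c]
        i = [clip01(a) for a in a_i]
        new_c, new_i = [], []
        for k in range(n):
            drive_c = (rho_s * u[k]
                       + sum(w_cc[j][k] * c[j] for j in range(n) if j != k)
                       + w_ic[k] * i[k])
            drive_i = (lambda_s * u[k]
                       + sum(w_ci[j][k] * c[j] for j in range(n) if j != k))
            # Forward-Euler step of tau * da/dt = -a + drive (assumed form).
            new_c.append(a_c[k] + dt / tau * (-a_c[k] + drive_c))
            new_i.append(a_i[k] + dt / tau * (-a_i[k] + drive_i))
        a_c, a_i = new_c, new_i
    return [clip01(a) for a in a_c], [clip01(a) for a in a_i]
```

With all weights set to zero, each cluster's output simply settles to its own scaled input, which is a useful sanity check before any connectivity is added.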

(c) Potential configurations as an action-selection system

We now explore hypotheses of action representation within the cluster structure, using example simulations of the corresponding population-level models to illustrate the ideas. A single instantiation of the anatomical model was used to derive the connection parameters of the computational model—details are given in the electronic supplementary material, C.3. To simplify the discussion, we consider here only the models in which input is received by the projection neurons; the addition of input to the interneurons made little difference to the relative outcomes.

(i) Single-action configuration

The output of each cluster could represent a complete action. The maximum number of representable actions is thus just Nc, and grows by one with each additional cluster. Action selection in such a circuit requires a winner-takes-all (WTA) competition, to reduce the set of potential actions to just the most appropriate one. To form a WTA-like circuit in a fully connected cluster structure (figure 4b), the projection-neuron population of each cluster must receive greater input (i.e. inhibition) from its corresponding interneuron population than from the combined input of its intercluster connections; otherwise, the net effect of any sensory input to the network would be excitatory (in a symmetrical network).

Figure 4

Potential configurations of the mRF cluster architecture as an action-selection mechanism. (These illustrate connection schemes, not relative physical location.) Cluster-specific total afferent input (un) targets only the cluster's projection neuron population (cn), whose outputs drive some form of coherent behavioural response to that particular combination of input from sensory, pain, respiratory systems, etc. A cluster's interneuron population (in) contacts only the projection neuron population. (a) Input values for the example simulations, in which each configuration was instantiated as a population-level model. (b) Each cluster's projection-neuron population represents a single action. Competition between actions is putatively resolved by a WTA-type circuit, formed by stronger relative weighting of the inhibitory within-cluster interneuron connections (open circles) than of the excitatory projection-neuron connections to other clusters (arrows). However, the simulation outputs show that such a single-action configuration does not act as a WTA circuit, but as an amplified relay of the inputs. (c) With all intercluster excitatory connections to projection neurons removed, a traditional WTA circuit seems to be created; yet, the simulation outputs show that this does not form a WTA circuit either. Moreover, it does not account for the existence of the long-range axons. (d) Each cluster's projection-neuron population represents a sub-action. Specific wiring configurations may create a circuit in which the sensory activation of a single cluster recruits other clusters representing compatible (or essential) sub-actions, via the intercluster connections between projection neurons. The combination of sub-actions then creates the coherent behavioural response observed in the animal. In simulation, the sub-action configuration results in appropriate selection for the given inputs: activation of cluster 1 (c1) results in concurrent recruitment of cluster 3 and inhibition of cluster 2.

One option is that intercluster connections to interneurons have a higher weight than intercluster connections to projection neurons in the same target cluster. However, without detailed anatomical data on, for example, bouton counts from a single axon, there is no a priori reason to believe this to be true. The alternative option is that the inhibitory intracluster connection from the cluster's interneuron population to its projection-neuron population has a relatively high (absolute) value when compared with any excitatory intercluster connection weight. Thus, input from other clusters to both the interneuron and projection-neuron populations will result in a net inhibitory effect on the projection-neuron population. Synapse counts from projection-neuron dendritic trees suggest that this may be the case. Roughly 45% of the synapses on a projection neuron are GABAergic (Jones et al. 1991)—and thus inhibitory—and interneurons are the primary (perhaps only) source of GABAergic input (Holmes et al. 1994). Yet, the proportion of interneurons to projection neurons is much smaller than this value. Thus, an interneuron input to a projection neuron would have a disproportionately larger effect than a given projection-neuron input, as it forms more synapses. Therefore, we believe there is a case for assuming that inhibitory weights are stronger than excitatory weights in the mRF (see electronic supplementary material, C.3, for more detail), and thus a WTA circuit may be supported.

Simulation of a population-level model with such an architecture shows that the cluster structure can implement soft selection—simultaneous selection of more than one action. Some thresholding of output would be required to implement hard selection—a true WTA competition—a threshold possibly set by the amount of cluster output required to sufficiently activate target neurons in the cranial nerve nuclei and spinal cord. However, the outputs for this simulation are, roughly, just the ratio of the corresponding inputs, which reduces the mRF architecture to a simple relay system.
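The distinction between soft and hard selection can be illustrated with a simple thresholding sketch. The function name and the example values are hypothetical; the threshold stands in for the amount of cluster output needed to activate the downstream targets.

```python
def selected_clusters(outputs, threshold):
    """Indices of clusters whose output reaches the activation threshold."""
    return [k for k, c in enumerate(outputs) if c >= threshold]


outputs = [0.6, 0.3, 0.5]                 # illustrative cluster outputs
soft = selected_clusters(outputs, 0.2)    # low threshold: several clusters pass
hard = selected_clusters(outputs, 0.55)   # threshold between the top two: one winner
```

Hard selection in this sense depends on the threshold falling between the two largest outputs, which is why closely matched outputs are difficult to separate by thresholding alone.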

Removing the excitatory intercluster connections to the projection neurons leaves only the intercluster projections to interneurons and, thus, would seem more able to implement a WTA circuit (figure 4c). We generate this configuration by setting the projection-to-projection neuron connections to zero. However, simulation of this altered model shows that it does not implement a WTA circuit either: the output of the clusters is little different from their input values. The presence or absence of the long-range connections appears to have little impact on the mRF's ability to act as a selection mechanism if each cluster is assumed to represent a single action. The existence of abundant long-range connections between projection neurons is not in doubt, and thus should be accounted for in a functional model of the mRF. Therefore, we are left to consider the purpose of the long-range intercluster projection-neuron connections.

(ii) Sub-action configuration

It is possible that in the mRF, some cluster-to-cluster projections preferentially target the interneuron populations, while others preferentially target the projection-neuron populations. Thus, the output of a single cluster may simultaneously inhibit some clusters and excite others. Excitation of a target cluster could correspond to recruitment of a compatible, perhaps essential, component of an action; conversely, inhibition of a target cluster could correspond to the prevention of an incompatible, perhaps dangerous, component of an action. The output of each cluster thus activates a sub-action, a component part of a coherent behaviour. This has a representational advantage over a single-action representation: the upper limit of potential unique sub-action combinations is [equation image missing from source], and grows by [equation image missing from source] with each additional cluster.
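One way to see the representational advantage is to enumerate combinations directly. The sketch below assumes that each sub-action is simply either active or inactive and that at least one must be active; this counting convention is an assumption of the sketch and may differ from the expression the authors give.

```python
from itertools import combinations


def sub_action_combinations(n_clusters):
    """All non-empty sets of co-active clusters, with clusters numbered from 1."""
    clusters = range(1, n_clusters + 1)
    return [
        combo
        for size in range(1, n_clusters + 1)
        for combo in combinations(clusters, size)
    ]
```

Under this convention three clusters give seven combinations and four clusters give fifteen, whereas a single-action representation would give three and four actions, respectively.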

An example of a sub-action configuration in the same three-cluster model is shown in figure 4d. To generate this configuration, we again set the appropriate connections to zero (see electronic supplementary material, C.3). In simulation, the outputs of both clusters 1 and 3 exceed the value of their inputs, and both have considerably greater output than cluster 2 (which has a much reduced output compared with its input). Thus, in this configuration, the output pattern is consistent with sub-actions 1 and 3 being activated, and sub-action 2 being suppressed.

Having demonstrated that the sub-action configuration works in principle, we now turn to a preliminary assessment of its robustness over a range of inputs. The configuration depicted in figure 4d supports just two actions: one signalled by the sufficient output of both clusters 1 and 3, and another by the sufficient output of cluster 2. In this initial assessment, we deem sufficient output to mean that the outputs of the required clusters exceed those of all the other clusters—the selection of a sub-action is based solely on the ordering of the output values. Thus, given any set of inputs u, we may define two correct output states:

  1. if the outputs are ordered such that (c1>c2)∧(c3>c2), then action 1 is correctly selected if and only if the input relationship is (u1u2)∨(u3u2) and

  2. if outputs are ordered such that (c2>c3)∧(c2>c1), then action 2 is correctly selected if and only if the input relationship is (u2u1)∨(u2u3),

where ∧ means propositional conjunction (AND) and ∨ means propositional disjunction (OR). All other alternatives are deemed to be incorrect selections (the example in figure 4d fulfils output state 1 and is, therefore, a correct selection). We note that these are hard definitions of correct selection: in particular, both sub-actions that comprise action 1 must be selected together at all times (other interpretations, such as the correct selection of individual sub-actions given appropriate inputs, will be considered in future work).
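The two output states can be expressed as a small classification function. The comparison operator in the input relationships is not legible in the text above, so this sketch takes it as a parameter (`dominates`) and defaults to greater-than-or-equal purely as an assumption; the function name and return labels are likewise hypothetical.

```python
import operator


def classify_selection(c, u, dominates=operator.ge):
    """Classify an equilibrium output c = (c1, c2, c3) for input u = (u1, u2, u3).

    dominates: assumed input comparison (the operator is missing in the source).
    Returns "action 1", "action 2" or "incorrect".
    """
    c1, c2, c3 = c
    u1, u2, u3 = u
    if c1 > c2 and c3 > c2:  # output state 1: clusters 1 and 3 both beat cluster 2
        ok = dominates(u1, u2) or dominates(u3, u2)
        return "action 1" if ok else "incorrect"
    if c2 > c3 and c2 > c1:  # output state 2: cluster 2 beats both others
        ok = dominates(u2, u1) or dominates(u2, u3)
        return "action 2" if ok else "incorrect"
    return "incorrect"       # any other output ordering
```

Passing `operator.gt` instead would make tied inputs count against a selection, so the choice of operator matters most for closely matched inputs.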

To assess the robustness of sub-action selection, we simulated the model just described, varying each element of input vector u over the interval [0, 1] in steps of 0.1, making a total of 1331 simulations (11 values for each of the three elements). For each input vector, the projection-neuron output vector c was assessed at equilibrium to determine whether it signalled correct or incorrect selection, as defined previously. We find the majority of input vectors (75%) result in correct selection (see electronic supplementary material, D). Thus, sub-action selection is robust over a wide range of inputs.
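The input sweep itself is straightforward to enumerate. The sketch below only builds the grid of input vectors; the function name and its parameters are hypothetical, and the levels are computed as integer fractions to avoid accumulating floating-point error.

```python
from itertools import product


def input_grid(n_clusters=3, step_count=10):
    """All input vectors with each element in {0.0, 0.1, ..., 1.0}."""
    levels = [k / step_count for k in range(step_count + 1)]
    return list(product(levels, repeat=n_clusters))
```

With the defaults there are 11 levels per element, so the grid contains 11**3 = 1331 input vectors, each of which would be run to equilibrium and then classified.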

The incorrect selections occurred for input vectors that either had all elements roughly equal, or had at least element u2 and one other equal (with the third element being close to zero). Thus, this simple model of a configuration of the mRF's anatomy lacks a mechanism for resolving selection competitions between closely matched inputs.

(d) Non-local action representation in the mRF

The proposed mapping of clusters to actions (or sub-actions) is not the only possibility: the anatomical organization does not necessarily map directly onto a functional organization. An alternative is suggested by reinterpretation of the model of Kilmer et al. (1969): we could consider their ‘modes’ to be simpler ‘actions’ and take the output of the model to be the activity projected to the spinal cord rather than to the ascending systems. The model then suggests that actions are represented by the parallel long axons of the projection neurons (rather than the clustered neuron bodies), i.e. a few projection neurons from each (or many) of the clusters contribute their axons to a group which represents a single action (or sub-action). The activity transmitted by that axon group to the spinal cord thus recruits the appropriate musculature for the action. Some evidence for this scheme has been found in studies of grooming behaviour under progressive decerebration (Berridge 1989).

Remarkably, the general structure of the Kilmer et al. (1969) model is still consistent with the known organization of the projection neurons in the mRF. We thus tested this model in embodied form (the original authors' long-held wish) as a controller for a robot in a survival task, to evaluate the possibility of it forming an action-selection mechanism (Humphries et al. 2005). We found that the model, as originally proposed, could not sustain action selection, but, by evolving the model with a genetic algorithm, certain configurations could be found that did. Thus, the mRF may also be able to support action selection based on parallel representation of those actions (a sub-action version was not tested). However, inevitably, given its age, several aspects of the model were incorrect or implausible, or omitted features known from more modern studies of the mRF. Thus, a full evaluation of the parallel representation scheme awaits further work that will look at how the proposed anatomical models could support parallel representation in a computational model.

5. Integration of the action-selection systems

The mRF cluster model's inability to resolve competitions between (roughly) equally salient actions suggests the tantalizing possibility that more complex action-selection systems evolved partly to cope with ambiguous situations—complex systems which could, of course, encompass the basal ganglia. It is thus natural to consider how the proposed basal ganglia and mRF action-selection mechanisms may interact.

There are three candidate control architectures which could encapsulate the combined action-selection system, shown in figure 5. First, a strict hierarchy of control, in which decisions made at higher levels limit those of lower levels. This is often taken to imply that lower levels encode more elementary actions than higher levels. The modelling work reported previously supports this and it is consistent with the decomposition of the control of grooming in rats: intact basal ganglia are necessary to correctly sequence the components of the grooming routine (Berridge & Whishaw 1992), but each component is encoded entirely within the brainstem (Berridge 1989). The basal ganglia's primary route to the brainstem is via the pedunculopontine nucleus (PPN), which itself projects heavily into the mRF (Delwaide et al. 2000). Some functional and anatomical data, therefore, support a hierarchical architecture in which the basal ganglia dictate control of the mRF output (figure 5a).

Figure 5

Alternative schemes for integrating the action-selection substrates. (a) A hierarchical architecture: lower levels represent increasingly simple actions, selected by the higher layers. This is consistent with the output of the basal ganglia reaching the mRF via the PPN, and with the results of our modelling work. (b) A layered architecture: the mRF and basal ganglia form separate layers in a control system dealing with increasingly complex stimuli, the higher layers being able to veto the output of the lower layers. This design is consistent with the separate sensory input to the basal ganglia and mRF, and with the basal ganglia's access to the spinal cord via the PPN. (c) A combined architecture: the competences of each layer contribute to the whole system. This is consistent with the evidence for feedback pathways within the neural systems, particularly between the PPN and the basal ganglia. Arrows, excitatory pathways; open circles, inhibitory pathways.

The second alternative is a layered architecture, such as Brooks' subsumption architecture (Brooks 1991). Increasingly complex computations are supported by higher layers of this architecture and, while all layers compute in parallel, higher layers can veto the output of lower layers. There is considerable evidence that the sensorimotor mappings within the vertebrate brain are organized in this fashion (Prescott et al. 1999). Do basal ganglia and mRF circuits thus run in parallel, with basal ganglia output able to veto mRF output if necessary? (See figure 5b.) The motor effects of both Parkinson's disease (Zigmond & Burke 2002) and lateral hypothalamic damage (Teitelbaum et al. 1990), in which the basal ganglia are jammed in ‘off’ mode, suggest that they are continually vetoing lower layers. In addition, the paradoxical results of Parkinson's disease interventions point to the existence of parallel systems. Following drug treatments (L-DOPA), Parkinson's disease patients regain voluntary movement, but continue to have problems controlling their axial musculature (Lakke 1985), which is under the direct control of the mRF. Moreover, surgical interventions often destroy sections of the basal ganglia; the patients' recovery of voluntary movement after surgery (Marsden & Obeso 1994) thus suggests that destruction of the basal ganglia releases other action-selection systems to work. Anatomically, this design has potential in some circuits: the basal ganglia and mRF do receive separate inputs, and the basal ganglia can bypass the mRF and access the spinal cord via the PPN. However, this basal ganglia–PPN–spinal circuit may be limited to only postural control (Takakusaki et al. 2004).
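The veto relationship that defines the layered scheme can be written as a toy arbitration rule. This is a deliberately abstract sketch: the function name, the string-valued actions and the boolean veto signal are all hypothetical and make no claim about how either substrate computes its output.

```python
def layered_output(mrf_action, bg_action, bg_veto):
    """Subsumption-style arbitration between two layers computing in parallel.

    mrf_action: action proposed by the lower (mRF) layer.
    bg_action:  action proposed by the higher (basal ganglia) layer, or None.
    bg_veto:    True if the higher layer vetoes the lower layer's output.
    """
    if bg_veto:
        # The higher layer overrides; its own output may be "no action" (None).
        return bg_action
    return mrf_action
```

In this toy rule a veto that is permanently on with no higher-layer action yields no output at all, while removing the veto lets the lower layer's output through unchanged.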

The third alternative is, thus, some combined hierarchical/layered system and is necessarily supported by the data reviewed previously, which support each of those elements. In addition, a combined system incorporates some form of heterarchy in the control decomposition, in that lower levels can influence higher levels. Anatomically, the PPN projects extensively into the basal ganglia (Inglis & Winn 1995) and the mRF may project into the PPN (Jones 1995; see figure 5c). There is little research on what these ascending projections may be encoding, though the known properties of the PPN and mRF suggest attentional arousal and motor feedback, respectively. Exploration of the functional decomposition of control within the vertebrate action-selection system is thus our next task.

6. Final remarks

The reticular formation is a strange beast: where some see an undifferentiated neuron mass, responsive only to global sensory input, others see a conglomeration of functionally specific units. Both views contain an element of truth. The dense ascending input and intra-RF connectivity point to a system capable of responding to stimulation only with increased activation. Yet, stimulation of individual neurons within it elicits discrete repeatable movements. We hope that by proposing the mRF as an action-selection system, we may unify these disparate views: the dense web of inputs provides the ability to extract correlated sensory information, the internal connectivity provides the substrate for the coordination of behavioural components, and the individual neurons drive the appropriate motor systems.

Our proposal partially rests on the structure of the mRF: if the cluster structure is an accurate depiction of the mRF's internal anatomy, then the most probable method of representing and resolving action competitions is that the activity of a cluster's projection-neuron population encodes the relative selection of an action component. This sub-action configuration has the advantage of both providing a functional role for the collaterals of the long-range axons and increasing the representational capacity of the system. It is possible that both clustered and parallel action representations coexist: competing complex behaviours may be represented by parallel axon activity that recruits the necessary sub-actions for each behaviour by activating the appropriate clusters. Combining these representational schemes with the potential control decomposition across the basal ganglia and mRF makes for a fascinating, if daunting, proposition.

The current work is intended to move us closer to an understanding of the neural substrate of action selection in the vertebrate brain, in part to better constrain the design of controllers for autonomous agents. The utility of this approach depends on the demonstration of the substrate's proposed function in embodied forms, a strategy we and others have pursued for the basal ganglia (Girard et al. 2003; Prescott et al. 2006), and will continue to pursue in our evaluation of the mRF. At the very least, we hope this work inspires re-evaluation of the mRF's functional significance.

Acknowledgments

This work was funded by the EPSRC (GR/R95722/01), a Wellcome Trust VIP award and the European Union Framework 6 ICEA project.

Footnotes

  • One contribution of 15 to a Theme Issue ‘Modelling natural action selection’.
