Metacognition is usually construed as a conscious, intentional process whereby people reflect upon their own mental activity. Here, we instead suggest that metacognition is but an instance of a larger class of representational re-description processes that we assume occur unconsciously and automatically. From this perspective, the brain continuously and unconsciously learns to anticipate the consequences of action or activity on itself, on the world and on other people through three predictive loops: an inner loop, a perception–action loop and a self–other (social cognition) loop, which together form a tangled hierarchy. We ask what kinds of mechanisms may subtend this form of enactive metacognition. We extend previous neural network simulations and compare the model with signal detection theory, highlighting that while the latter approach assumes that both type I (objective) and type II (subjective, metacognition-based) decisions tap into the same signal at different hierarchical levels, our approach is closer to dual-route models in which it is assumed that the re-descriptions made possible by the emergence of meta-representations occur independently and outside of the first-order causal chain. We close by reviewing relevant neurological evidence for the idea that awareness, self-awareness and social cognition involve the same mechanisms.
There is undoubtedly a relationship between awareness and metacognition, for our common understanding of conscious knowledge is simply that it is knowledge that we know we possess. Congruently, it is precisely in those cases where our behaviour is guided by knowledge we do not know we possess that we speak of unconscious knowledge. Colloquially, thus, metacognition, or ‘cognition about cognition’, appears to be fundamental to our understanding of consciousness. However, metacognition is usually construed as a controlled process whereby people intentionally and effortfully reflect upon their own mental activity. Here, we would instead like to suggest that metacognition is but an instance of a larger class of representational re-description processes that, we assume, occur unconsciously, automatically and continuously. From this perspective, the brain is continuously and unconsciously learning to anticipate the consequences of action or activity on itself, on the world and on other people. In so doing, we shall argue, it learns to represent its own activity to itself, thereby developing systems of meta-representations that characterize the manner in which first-order representations are held. Such systems of meta-representations both enable conscious experience (for it is in virtue of such meta-representations that the agent ‘knows that it knows’) and define its subjective character (for each agent's meta-representations will be idiosyncratic, shaped by its experience with the world and with others).
To support these ideas, we begin by discussing the relationships between consciousness and metacognition. Next, we ask what kinds of mechanisms are necessary to subtend metacognition. We argue that signal detection theory (SDT), as applied to the study of consciousness, has a descriptive character that we should like to see replaced by a mechanistic account. We propose such an account in the next section, based on the neural network models we initially introduced in two previous papers [2,3]. Next, we analyse the performance of such models through signal detection analysis, explore their implications for our understanding of consciousness and review relevant neurological evidence. We close by suggesting that consciousness is something that the brain learns to do rather than a static property of certain neural representations and not others. This we call the ‘Radical Plasticity Thesis’.
Metacognition covers a lot of ground. It has been variously construed as the ability to reflect upon one's own mental activity (‘cognition about cognition’), as awareness of possessing task-relevant knowledge (so-called judgement knowledge) or as the introspective mechanism that lies at the core of perceptual awareness (i.e. sensory metacognition). A number of recent papers have addressed both the neurobiological underpinnings of metacognition [5–7] and its functions and mechanisms [8,9].
The complex relationship between consciousness, self-awareness and metacognition is the object of an ongoing debate ([10,11]; see also  for an overview). In a nutshell, the argument hinges on whether metacognition is taken to be a precondition or a consequence of consciousness. Contemporary theories of consciousness, in this respect, roughly fall into one of two categories: those that see capacity for metacognition as a consequence of content becoming conscious and therefore available to higher-order processes and introspection (so-called ‘fame-in-the-brain’ approaches), and those that assume that some form of metacognition is a necessary prerequisite for consciousness.
‘Fame in the brain’ theories, introduced by Dennett [12,13], typically assume that consciousness occurs whenever particular conditions are fulfilled, such as stability and strength or complexity of a knowledge representation, which can result from processes such as re-entrant processing and/or from synchrony of neural processing. Essentially, it is assumed that the brain is a large dynamical system in which stable, attractor states come in and out of existence as a result of continuously operating global constraint satisfaction processes. The main functional consequence of such states is that the information they convey then becomes available to the global workspace [14–16] for further information processing, such as cognitive control or conscious access. However, one problem with ‘fame-in-the-brain’ proposals is that there is no particular property of the information contained in conscious representations, apart from strength, stability or complexity, that sets it qualitatively apart from information contained in unconscious representations. All information remains first-order information in the system, and some of that information somehow gives rise to conscious awareness of it.
As an alternative point of view, approaches that take higher-order or meta-representations as a prerequisite for consciousness hold that in order for content to become conscious, a system needs to be able to represent its internal states to itself. In other words, for a system to be conscious of its internal states, said internal states have to become available to inspection, in addition to serving their first-order functions. As Karmiloff-Smith put it: knowledge in the system has to become knowledge for the system. First-order systems (those that merely transform, however appropriately, inputs into outputs) can never know that they know: they simply lack the appropriate machinery. This points to a fundamental difference between sensitivity and awareness. Sensitivity merely entails the ability to respond in specific ways to certain states of affairs. Sensitivity does not require consciousness in any sense. A thermostat can appropriately be characterized as being sensitive to temperature, just as the carnivorous plant Dionaea muscipula (Venus flytrap) may appropriately be described as being sensitive to movement on the surface of its leaves. But our intuitions tell us that such sensitive systems (thermostats, photodiodes, transistors, cameras, carnivorous plants) are not conscious. It is not that they have only ‘elementary experiences’; they simply have no experiences whatsoever. Sensitivity can involve highly sophisticated knowledge, and even learned knowledge, but such knowledge is always first-order knowledge: it is always knowledge that is necessarily embedded in the very same causal chain through which processing occurs.
Awareness, on the other hand, always seems to minimally entail the ability of knowing that one knows. This ability, after all, forms the basis for the verbal reports we take to be the most direct indication of awareness. And when we observe the absence of such ability to report on the knowledge involved in our decisions, we conclude that the decision was based on unconscious knowledge. Thus, it is when an agent exhibits knowledge of the fact that he is sensitive to some state of affairs that we take this agent to be a conscious agent. This second-order knowledge, we argue, critically depends on learned systems of meta-representations, and forms the basis for conscious experience of the first-order knowledge that is the target of such meta-representations. Despite remaining heavily debated, this higher-order approach to consciousness has received substantial support recently [10,18–22] (see also  for a recent overview) and is currently enjoying renewed interest.
Irrespective of whether one sees metacognition as a consequence of or as a prerequisite to awareness, there remains the question of what mechanisms subtend it. In this respect, Lau has defended the idea that metacognition involves the brain performing signal detection on its own representations. For instance, in a typical visual detection or discrimination task aimed at investigating task performance and awareness, participants have an ‘objective’ discrimination performance and a ‘subjective’ awareness rating. SDT approaches to awareness [9,24–28] model this relationship by assuming that, for each of these judgements, the participant's (and the brain's) task comes down to representing the outside world in terms of stimulus and noise, and looking for decision criteria to set both apart in objective (type I) and subjective (type II) terms. In general terms, this comes down to calculating two sensitivities and two criteria. Type I sensitivity d′1 is, as usual, based on the proportion of hits relative to the proportion of false alarms in the context of the actual task, and criterion c1 represents the bias with which the participant tends to be conservative versus risk-taking (in detection tasks) or to select one response option over the other (in discrimination tasks). Type II sensitivity d′2, by contrast, is the degree to which one can tell apart one's correct from one's incorrect responses, and is based on the proportion of ‘awareness hits’ relative to the proportion of ‘awareness false alarms’. Thus, if awareness is measured by rating one's confidence in one's response, d′2 reflects the proportion of high confidence ratings for correct responses relative to the proportion of high confidence ratings for incorrect responses, whereas c2 reflects the bias towards rating one's confidence as high or low. The relationship between type I and type II SDT analysis has been described in depth elsewhere.
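The four quantities just described can be made concrete with a minimal sketch. It assumes the textbook equal-variance Gaussian formulas, d′ = z(H) − z(FA) and c = −½[z(H) + z(FA)], applied once to the type I hit and false alarm rates and once to the type II ‘awareness’ rates. The function names, the log-linear correction and the trial counts are hypothetical choices made for illustration; they are not taken from the studies cited above, and more refined type II measures exist.

```python
from statistics import NormalDist

_z = NormalDist().inv_cdf  # inverse of the standard normal CDF


def sensitivity_and_criterion(hits, misses, false_alarms, correct_rejections):
    """Return (d', c) from raw counts under equal-variance Gaussian SDT."""
    # Log-linear correction (an assumption of this sketch): adding 0.5 to each
    # cell keeps both rates strictly between 0 and 1, so z() stays finite.
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    d_prime = _z(hit_rate) - _z(fa_rate)
    criterion = -0.5 * (_z(hit_rate) + _z(fa_rate))
    return d_prime, criterion


def type1_measures(trials):
    """trials: list of (stimulus_present, said_present, high_confidence)."""
    hits = sum(1 for s, r, _ in trials if s and r)
    misses = sum(1 for s, r, _ in trials if s and not r)
    fas = sum(1 for s, r, _ in trials if not s and r)
    crs = sum(1 for s, r, _ in trials if not s and not r)
    return sensitivity_and_criterion(hits, misses, fas, crs)


def type2_measures(trials):
    """Same formulas, applied to confidence given type I accuracy."""
    # Type II hit: high confidence on a correct type I response.
    # Type II false alarm: high confidence on an incorrect type I response.
    hits = sum(1 for s, r, conf in trials if s == r and conf)
    misses = sum(1 for s, r, conf in trials if s == r and not conf)
    fas = sum(1 for s, r, conf in trials if s != r and conf)
    crs = sum(1 for s, r, conf in trials if s != r and not conf)
    return sensitivity_and_criterion(hits, misses, fas, crs)


# Made-up data: 100 trials, confidence mostly high when the response is correct.
trials = (
    [(True, True, True)] * 30 + [(True, True, False)] * 10
    + [(True, False, True)] * 2 + [(True, False, False)] * 8
    + [(False, False, True)] * 30 + [(False, False, False)] * 10
    + [(False, True, True)] * 2 + [(False, True, False)] * 8
)
d1, c1 = type1_measures(trials)
d2, c2 = type2_measures(trials)
print(f"type I:  d'={d1:.2f}, c={c1:.2f}")
print(f"type II: d'={d2:.2f}, c={c2:.2f}")
```

Note that the same function serves both levels: only the definition of a ‘hit’ changes, which is exactly the sense in which this kind of analysis treats the type II judgement as signal detection performed on one's own responses.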
However, within this general framework, important differences exist between how ‘fame-in-the-brain’ or higher-order approaches characterize this relationship. Recent modelling work  has laid out the different classes of possible models that follow from the above distinction within a SDT framework. The study distinguishes three types of models: first-order models, which assume that one stream of information accounts for both behavioural output and awareness of this output; dual-channel models, which assume that information that informs behaviour is essentially processed along a different channel from that which informs awareness of this information; and hierarchical models, which assume that information is first processed on a first-order level (which determines behaviour), and that a second-order level is necessary to make the information available to awareness. The modelling results  show that hierarchical SDT models outperform first-order or dual-channel models.
SDT, however, offers essentially a descriptive account of the relationships between type I and type II performance. Here, building on earlier work, we would like to propose a computational account [2,3] of these relationships. This proposal is motivated by different reasons.
First, as mentioned before, both ‘fame in the brain’ and higher-order approaches as operationalized in SDT somehow assume that metacognition, whether a consequence or a prerequisite, is necessarily tied to consciousness. Here, we argue that metacognition may be an instance of a larger class of learning-related representational re-description processes  that, we assume, occur unconsciously and automatically.
Second, we believe that, although SDT might provide a conceptual description of what occurs in any given visual detection or discrimination task (as mentioned above: the brain performing signal detection on itself), it offers no explanation as to how such signal detection might come about and therefore remains largely descriptive: the fact that people behave as if they were performing a signal detection task does not entail that this is how the brain produces the behaviour. This is not an argument about biological plausibility (a point on which neural network models have also been criticized), but about explanatory power. In our opinion, SDT models lack an account of how the brain develops criteria, how it develops a representation of the world, and how it develops awareness. In our view, it is crucial to incorporate an organism's interaction with the world in order to understand how metacognition develops.
Third, conceptually, type II SDT in the context of awareness is somewhat ambiguous. In a type I task, there is, objectively, a stimulus present or not, and we can say there is one, or not—there is no a priori relationship (d′1) between the two. Thus, my ratings can correspond to or diverge from the actual probability of a stimulus being present in the experiment. A type II task (e.g. confidence ratings) is completely different. There is a probability of correct decisions (which is a match between the world and the type I decision), but I do not simply provide subjective ratings that correspond or diverge from this probability. This is because a guess is just that, a guess. Confidence in a response A (instead of B) indeed means that I thought it was A, but when I claim to guess, I do not say ‘A is wrong’, and that it should be B—rather, it means that for all I care it could be either of them. Overall, there are usually no (or very few) trials in which I know I was wrong, I am just not sure whether I was right. Indeed, if I consistently say ‘guess’ only for trials where I make an error, I am in fact fully aware (see zero correlation criterion ). So in principle, irrespective of the relative proportion of guesses on correct versus incorrect trials (the ‘misses’ versus the ‘correct rejections’), those ‘guess’ trials should contribute in equal proportions, or not at all, to how I represent my decisions to myself, since when I guess, I do not state that my type I decision was wrong. Thus, at least in our opinion, type II tasks cannot be seen simply as a higher-level equivalent of type I tasks. There are many ways in which one can define the relationship between type I and type II decision axes, but those described by Maniscalco & Lau  do not include a mechanism that accounts for the accrual over time on both decision axes and how their relationship comes to be established.
Fourth, on a more general note, in our view, SDT, irrespective of whether it is implemented as a first-order, dual-channel, or hierarchical model, assumes (i) that a noisy but rich signal enters the sensory channels and (ii) that the brain represents one or two sensitivities (d′) and sets at least two criteria (c) that allow for the selection of the adequate type I and type II outputs. Apart from the fact that these criteria have to be arbitrarily chosen and hence that there is no explanation of how they come about, this approach is reminiscent of traditional filter models and of spectatorial accounts of cognition in general, whereby the senses receive massive (though noisy) amounts of information, and where the passive observer's brain is merely tasked to extract the signal. In this respect, one of the important variables manipulated by Maniscalco & Lau's  hierarchical models is a decay factor, which determines how much of the first-order information remains for the second-order classification. This suggests, first, that somehow at one point there is an enormous amount of information (rich phenomenal consciousness) that dissipates over time, leaving only limited access to whatever remains [31,32], and second, that consciousness is essentially a passive endeavour. Indeed, using a decay, one has to subscribe to the fact that consciousness ‘slips through our fingers’—whereas in fact, there have been recent findings suggesting that consciousness takes time , and that over this time, many misconstruals can happen [34,35]. 
In fact, it has been argued that conscious content is but an ‘illusion’ created by the brain based on piecemeal sensory input in combination with priors (partial awareness hypothesis ; see also ), a notion that, to some extent, is also in line with an enactive view on consciousness [38,39], whereby the agent, embedded in an environment, is not a spectator but plays an active role in constructing his awareness of that environment and of himself (see below for an elaboration of this idea). Thus, even if one accepts that SDT criteria can be influenced by priors, there is no account of how this might happen. Taken together with the second point, SDT accounts are very useful at a descriptive level, but lack a developmental perspective, both in terms of how they come about through interaction of an organism with the world and in terms of how conscious content is generated based on priors acquired through such interactions. The simulation work we carried out in Pasquali et al.  is an attempt to offer an alternative, computationally oriented, account. We revisit this work in §3.
3. A hybrid neural network approach
We recently proposed a neural network approach to metacognition [2,3]. The core idea of our approach, which bears some resemblance to the actor-critic models introduced by Sutton & Barto , is that two independent networks (a ‘first-order’ network and a ‘second-order’ network) are connected to each other in such a way that the entire first-order network is input to the second-order network (figure 1). This means that all the units of the first-order network are used as input for a second network, which can then in principle learn to discriminate the different ways in which the first-order network's internal representations match the outside world.
Both networks are, for instance, simple feedforward back-propagation networks. The first-order network is trained to perform a simple discrimination task, that is, to produce type I responses, whereas the second is trained to judge the accuracy of the first-order network's responses, that is, to perform type II judgements. In its more general form, as depicted in figure 1, such an architecture would also be sufficient for the second-order network to also perform other kinds of judgements, such as distinguishing between an hallucination and a veridical perception, developing knowledge about the overall geography of the internal representations held by the first-order network, or forming propositional attitudes.
The fundamental difference between this type of model (a ‘metacognitive network’) and SDT models is that the former learns and develops both first- and second-order representations over time. Pasquali et al.  instantiated the general architecture depicted in figure 1 in different ways. One instantiation was a strictly hierarchical model (figure 2a), whereas the other is best described as implementing a hybrid between dual-route models and hierarchical models (figure 2b).
The hierarchical instantiation, which we will here dub ‘hidden unit-readers’ (figure 2a; ; and , simulation 3), directly reads out the first-order network's internal representations from its hidden units (containing the relationships between input and output patterns) . The model is hierarchical because the sensory input needs to be fully processed by the first-order network before it becomes available to the second-order network. The information contained in the second-order network is directly dependent on the information contained in the first-order network where the hidden unit patterns predict both the first-order and the second-order responses.
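The wiring of such a ‘hidden unit-reader’ can be illustrated with a minimal sketch, under stated assumptions: two small sigmoid back-propagation networks written in plain Python, where the second-order network takes the first-order hidden activations as its input and is trained to output a high or low wager on the first-order response. The class `MLP`, the function `train_step`, the layer sizes and the learning rate are all hypothetical choices for illustration and are not the settings of the original simulations.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class MLP:
    """One-hidden-layer feedforward network trained by back-propagation."""

    def __init__(self, n_in, n_hid, n_out, rng):
        # One row of weights per unit; the last entry of each row is the bias.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hid)]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hid + 1)]
                   for _ in range(n_out)]

    def forward(self, x):
        self.x = list(x) + [1.0]
        self.h = [sigmoid(sum(w * v for w, v in zip(row, self.x)))
                  for row in self.w1]
        hb = self.h + [1.0]
        self.y = [sigmoid(sum(w * v for w, v in zip(row, hb)))
                  for row in self.w2]
        return self.y

    def backward(self, target, lr=0.5):
        # Gradient step on the squared error of the most recent forward pass.
        d_out = [(y - t) * y * (1 - y) for y, t in zip(self.y, target)]
        d_hid = [h * (1 - h) * sum(d * row[j] for d, row in zip(d_out, self.w2))
                 for j, h in enumerate(self.h)]
        hb = self.h + [1.0]
        for row, d in zip(self.w2, d_out):
            for j in range(len(row)):
                row[j] -= lr * d * hb[j]
        for row, d in zip(self.w1, d_hid):
            for j in range(len(row)):
                row[j] -= lr * d * self.x[j]


def argmax(values):
    return max(range(len(values)), key=values.__getitem__)


def train_step(first, second, x, target):
    """One trial: type I response, then a type II wager on that response."""
    y = first.forward(x)
    correct = argmax(y) == argmax(target)
    # The second-order network reads the first-order hidden units.
    wager = second.forward(first.h)[0]
    # Its error signal stays inside the second-order network: nothing is
    # propagated back into the first-order weights.
    second.backward([1.0 if correct else 0.0])
    first.backward(target)
    return y, wager, correct


rng = random.Random(0)
first_order = MLP(n_in=4, n_hid=3, n_out=2, rng=rng)
second_order = MLP(n_in=3, n_hid=2, n_out=1, rng=rng)
y, wager, correct = train_step(first_order, second_order,
                               [1.0, 0.0, 0.0, 1.0], [1.0, 0.0])
print(y, wager, correct)
```

The design choice worth noticing is that the second-order network only observes the first-order hidden pattern; in this sketch it cannot alter first-order processing, which is what makes the arrangement strictly hierarchical.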
Re-representing knowledge through meta-representations (i.e. ‘content-explicit representations’) is not sufficient, however: one must also represent oneself as being in possession of that content (‘attitude-explicit representations’ ). Such attitude-explicit representations require access to the relevant first-order knowledge in a manner that is independent from the causal chain in which it is embedded, such that not only the content but also the accuracy of the knowledge is represented. Indeed, it has been suggested that metacognition hinges upon encoding the precision of a representation, because this would allow organisms not only to evaluate what they know, but to engage in prospective error monitoring and optimization of decision-making, for instance, by smoothing the accrual of evidence for the ‘right’ decision over time .
We also explored the characteristics of a second instantiation (figure 2b; ‘comparator units’, : simulations 1 and 2), which indirectly reads out the first-order network's internal representations by comparing first-order input with first-order output (the latter of which is, in fact, the computational consequence of the hidden unit patterns). In these networks, the second-order network lies outside of the first-order causal chain, because the information used by the first-order network to execute its task is not the information used by the second-order network to place a high or a low wager. Thus, they are, in principle, dual-channel models. Still, as both networks ‘plug into’ the same basic knowledge (first-order performance; albeit in a different way, see below), this type of model is effectively a hybrid between hierarchical and dual-route models.
Our hybrid models thus depend on two core assumptions: first, evaluating one's own performance requires that the first-order representations that are responsible for performance be accessed in a manner that is independent from their expression in behaviour. Second, one must possess attitude-explicit representations that require access to the relevant first-order knowledge in a manner that is independent from the causal chain in which it is embedded, such that not only the content but also the accuracy of the knowledge is represented. The first of these assumptions refers to the hierarchical component of the models, whereas the second refers to their dual-channel aspect. Obviously, the notion of independence of the first-order causal chain is also present in dual-channel SDT models. One of the consequences of using SDT models other than dual-channel ones to model type I and type II decisions is that when there is no type I sensitivity, there is no type II sensitivity either: when there is no signal to discriminate between the presence or absence of a stimulus, or between two stimuli, there should in principle be no signal on which to base one's subjective rating, something which, in the context of sensory metacognition, is at least plausible. However, Scott et al. recently demonstrated why a model of metacognition should exhibit such independence. Specifically, they showed, in an artificial grammar learning (AGL) task, that participants could perform better than chance in expressing judgements about their own performance (type II decisions) in spite of the fact that their performance (type I discrimination) was actually at chance. Such findings have two implications. First, strictly first-order and hierarchical models cannot account for such dissociations, which suggests that only dual-channel models have enough generality. Second, such findings support the idea that the information contained in the first-order network can be used in different, perhaps orthogonal decision criteria.
Our hybrid–hierarchical comparator models do precisely that: they use the prediction error of the first-order network in different ways for first- and second-order decisions. In particular, while the first-order network takes its decisions based on the performance error (the standard SSE), the second-order network's decisions are based on a more detailed pattern representation of the first-order error. Thus, the second-order network learns to re-describe the error committed by the first-order network explicitly, as a pattern of activation rather than as a scalar signal. This is what enables it to leverage information that may not be captured by the first-order error. In principle, this might reflect the fact that, even if a first-order decision is predominantly subject to bias without any discriminative sensitivity, there is still enough information in the first-order performance signal to detect when one is wrong and when one is right in a discrimination task. In other words, the second-order network has finer-grained access to the first-order error, precisely because it can ‘look at’ the error by representing it as a (potentially manipulable) pattern of activation, rather than just use it to guide output, as the first-order network does. In light of Scott et al.'s data, this would mean that, even though the overall first-order error with respect to string grammaticality cannot be used to distinguish between strings in a type I task, the way in which those strings elicit errors is detectable by the second-order system, and hence reflected in above-chance type II judgements.
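The difference between a scalar error and an error pattern can be illustrated with a toy example. The sketch assumes that the first-order input and output layers have the same size, so that they can be compared unit by unit; the function names and the activation values are hypothetical and serve only to show that two responses can carry the same summed squared error while producing different comparator patterns.

```python
def summed_squared_error(reference, output):
    """Scalar error: a single number summarising the mismatch."""
    return sum((r - o) ** 2 for r, o in zip(reference, output))


def comparator_pattern(first_order_input, first_order_output):
    """Unit-by-unit difference: the error kept as a pattern of activation."""
    return [i - o for i, o in zip(first_order_input, first_order_output)]


stimulus = [1.0, 0.0, 0.0, 1.0]
response_a = [0.5, 0.0, 0.0, 1.0]  # error located on the first unit
response_b = [1.0, 0.0, 0.0, 0.5]  # same amount of error, on the last unit

# The scalar error cannot tell the two responses apart...
print(summed_squared_error(stimulus, response_a))
print(summed_squared_error(stimulus, response_b))

# ...whereas the comparator pattern can, because it preserves where the
# mismatch occurred.
print(comparator_pattern(stimulus, response_a))
print(comparator_pattern(stimulus, response_b))
```

A second-order network fed with such patterns therefore has access to distinctions that are invisible in the scalar error, which is the property the comparator architecture relies on.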
4. A signal detection theory analysis of the hybrid metacognitive model
Our simulations were able to successfully account for the pattern of associations and dissociations between performance and confidence (or wagering) observed in the Iowa Gambling task, in an AGL task and in blindsight. Here, we sought to analyse the hybrid model's performance in terms of SDT. Thus, we performed SDT analyses on the performance of the network in the AGL task and in blindsight (; electronic supplementary material).
In the AGL task simulation, the first-order network was trained, as in Persaud et al. , to discriminate grammatical from non-grammatical strings of letters, while the second-order network was trained to produce wagers on the first-order network's decisions. We showed  how the model was able to capture the patterns of associations and dissociations between classification performance and wagering in the two conditions (implicit and explicit) tested by Persaud et al. .
Here, to analyse the model's performance using SDT, we replicated our original simulations, inserting a test block—instances of new grammatical strings and of non-grammatical strings—after every block of the learning phase and not only after the third (implicit condition) and the twelfth (explicit condition) block, as was the case in the original study. This small modification of the simulation setup allowed us to capture the networks' performance at every step during the learning phase (figure 3a). As expected, type I sensitivity d′1 steadily increases from 0 to a maximum value through learning, reflecting a progressively larger proportion of hits—correct discriminations of the new grammatical strings—than of false alarms—incorrect discriminations of the ungrammatical strings. In addition, networks tend to lose their initial conservative bias (type I c) as their knowledge develops. At the end of the learning phase, the neural networks end up with perfect knowledge of the grammar, as suggested by a high type I sensitivity and a null type I criterion. Type II sensitivity and criterion follow roughly the same pattern, although d′2 does not increase as much as d′1 and although c2 here appears to already start from a neutral value (but higher initial criterion values were sometimes obtained in other simulations). As a reminder, the second-order network had already been trained in a pre-training phase and no more updates of its internal weights occurred afterwards, that is, during the actual learning phase. Thus, the second-order network behaves as a simple observer of the first-order network's knowledge and yet, its type II performance improves just as well through the learning phase, as reflected by a greater sensitivity and a neutral bias at the end of the task. 
Finally, by comparing type I and type II measures on the figures, one may notice that objective performance seems to have shaped subjective performance in this simulation, just as one would have predicted from a purely hierarchical architecture.
Our second implementation of the hybrid model was dedicated to capturing blindsight. In their blindsight experiment, Persaud et al.  showed that blindsight subject GY (i.e. a patient who, under specific circumstances, makes visual discriminations in the absence of visual awareness), when presented with sub-threshold stimuli in his blind field, displayed above-chance localization performance but failed to maximize his earnings through wagering, suggesting that he was not always aware of the knowledge involved in his decisions for stimulus localization. However, for supra-threshold stimuli (both in normal and blind fields), GY maximized performance as well as earnings. We successfully simulated these results  by pre-training the networks to discriminate among arbitrary positions of a stimulus and to simultaneously place wagers on their own performance. The distinction between supra-threshold and sub-threshold blindsight vision was introduced during a subsequent testing phase, in which the networks classified the patterns they had previously been presented with (supra-threshold), as well as degraded versions of these patterns in which stimulus-to-noise ratio was manipulated by increasing the noise level (sub-threshold). Here, we look at how the model's performance develops over time, and at how the model accounts for blindsight in light of Persaud's data.
To track the model's performance over time, we used the same procedure as for the AGL simulation, inserting test blocks after each block of the pre-training phase. We thus captured the networks' objective and subjective performance through the pre-training phase (results at the 150th block reflecting one's normal performance in a standard subliminal detection task) as well as in a post-test blindsight condition for which the level of background noise in input was raised (figure 3b). After a short period of adaptation (the time required for the networks to learn to see anything, which ends around block 30 of the pre-training phase), type I performance seems to evolve perfectly normally. With training, d′1 starts to increase, as the networks progressively become able to discriminate between noise and signals. However, c1 never reaches the null value, indicating the maintenance of a conservative policy. This, of course, is because a few of the stimuli are displayed below the noise threshold and hence cannot be discriminated properly by the networks. Keeping a conservative bias thus prevents the networks from exhibiting too high a rate of false alarms. By contrast, type II scores seem rather peculiar. By the time the networks ‘learn to see’, type II d′ has reached its maximum value, and type II c is at its lowest, that is, second-order networks have acquired a very high sensitivity but also a very liberal bias. One might think that they are somehow fully ‘open-minded’, which pays off since subjective performance over-rides the lack of objective knowledge in this case. Following this phase, type II sensitivity returns to a more moderate value while the criterion's slope tends towards a conservative value, as if bounded again by type I knowledge. Finally, type II scores in the post-test blindsight situation confirm our earlier findings, that is, a preserved sensitivity but a highly conservative bias.
Although our overall results match the general findings by Persaud et al. , this criterion-setting account of blindsight diverges from the data of Persaud et al., which suggest that a decreased sensitivity, and not a criterion-setting problem was underlying the failure to optimize wagering. However, Overgaard et al. [46,47] showed that this decreased-sensitivity account is linked to the use of dichotomous measures such as the high versus low wagers used by Persaud et al., whereas use of more graded measures reveals that in fact sensitivity is preserved but that patients use a very conservative criterion, which is what our current analysis suggests as well, and what others propose in this issue .
Our analyses thus highlight the hybrid character of the model. Indeed, in the AGL simulation, type II performance directly depends upon type I performance, whereas in the blindsight simulation, the second-order network is able to build relevant meta-knowledge despite the first-order network's poor performance.
In closing, we should stress that the models we have presented have substantial limitations. Two such limitations are worth highlighting. The first is that the models are not dynamical: responses are computed in a single time step, whereas we envision the relevant type I and type II processes as unfolding over time. The second is that the models are not recurrent: the meta-representations developed in the second-order network cannot influence the representations developed in the first-order network. Going beyond these two limitations is important for the following reason: when responses take time to be computed by a first-order network that contains multiple levels (e.g. six or seven layers of hidden units), the second-order network may actually, were it able to influence the states of the first-order network, compute or at least bias the appropriate type I response even before the first-order network has completed its own processing. In other words, the second-order network would then be able to predict future states of the output layer of the first-order network. This would capture a central idea in our framework, namely that the brain continuously learns to predict the consequences of activity in one region for activity in other regions (what we call the ‘inner loop’, see below). Augmenting our models with the necessary computational mechanisms will require using different, fully recurrent, dynamical learning algorithms.
5. Learning to be conscious: metacognition as radical plasticity
What are the implications of this approach to metacognition as a dynamic representational re-description process? First, this approach suggests that metacognition (and hence, consciousness) takes time, at different time scales, that is, over a single trial, over learning and over development. Second, this approach suggests that metacognition, far from being mere filtering as perhaps suggested by SDT, is an active, trained construction process. Recent work supports the idea that one can train people to gain conscious access to their own representations. For instance, participants can be trained to improve their performance in subliminal perception tasks, aversive learning can teach people to make novel olfactory distinctions and imposing a deadline on simultaneous type I and type II ratings interfered with the degree to which participants were able to identify their correct responses (interestingly, type I performance was also affected, but only on those trials for which people had claimed to be sure, suggesting that disruption of this metacognitive signal affects lower-level processing). It has been suggested that gradual learning of (type II) precision estimates over a certain amount of time is particularly useful ‘in situations where the causes of perceptual evidence may change unpredictably over time, and as such may provide a better account of the sort of fluid, ongoing sensorimotor integration that characterizes everyday activities such as riding a bicycle’. Indeed, the creation of a conscious experience of the world may protect us and our brain from piecemeal and unpredictable sensory input.
Third, we suggest that metacognition is but an instance of a larger class of representational re-description processes that, as stated before, occur unconsciously and automatically. From this perspective, the brain is continuously and unconsciously learning to anticipate the consequences of action or activity on itself, on the world and on other people (see below for elaborations on the latter two). There is considerable evidence for such hierarchical predictive mechanisms in the human brain, through which the brain continuously attempts to minimize ‘surprise’ or conflict by anticipating its own future activity based on learned priors. Through these predictive mechanisms, the brain develops systems of meta-representations that characterize and qualify the target first-order representations. Such learned re-descriptions, enriched by the emotional value associated with them, form the basis of conscious experience. Learning and plasticity are, thus, central to metacognition and consciousness, to the extent that experiences occur only in experiencers who have learned to know that they possess certain first-order states and who have learned to care more about certain states than about others. Cleeremans [19,51] has termed this view the ‘Radical Plasticity Thesis’. While this paper is concerned primarily with meta-representation as a prerequisite for consciousness, this ‘caring about’ aspect is equally crucial to our model of consciousness, in that the knowledge that resides in those meta-representations (i.e. the knowledge about the first-order representations) has to have relevance for the organism. It has to matter to an organism whether the first-order state is A or B. Such relevance may be related to prospective error monitoring, or may be related to motivational and emotional components.
The idea that predictive re-description processes take place unconsciously can in fact be argued to form the core of the higher-order thought (HOT) theory of consciousness, according to which a representation is a conscious representation when one is conscious of it. In other words, by HOT, it is in virtue of the occurrence of (unconscious) higher-order thoughts ‘that we are now conscious of some content’, that the content becomes phenomenally conscious. This, we surmise, requires the ability for the agent to re-describe its own states to itself as suggested above. We further suggest that a system's ability to re-describe its own knowledge to itself minimally requires (i) the existence of recurrent structures that enable the system to access its own states and (ii) the existence of predictive models (meta-representations) that make it possible for the system to characterize and anticipate the occurrence of first-order states. Importantly, however, and in contrast to HOT, such meta-representational models (i) may be local and hence occur anywhere in the brain, (ii) can be sub-personal, and (iii) are subject, just like first-order representations, to learning and plasticity mechanisms and, hence, can themselves become automatic.
Note that the proposed metacognitive architecture instantiates the minimal requirements necessary to enable a cognitive system to distinguish between veridical perceptions and hallucinations (something a pure first-order system would be unable to do) and, more generally, to develop the metacognitive knowledge necessary to represent the manner in which its own first-order knowledge is held, that is, propositional attitudes (is this a belief? a hope? a regret?).
6. Beyond consciousness: three predictive loops
As discussed above, the core idea of our proposal is that the brain is continuously and unconsciously learning to anticipate the consequences of action or activity on itself, on the world and on other people. Thus, we have three closely interwoven loops that link the brain with itself, with the world and with other agents, all driven by the same prediction-based mechanisms (figure 4). A first, internal or ‘inner loop’, involves the brain re-describing its own representations to itself as a result of its continuous and unconscious attempts to predict how activity in one region influences activity in other regions. In other words, the brain does not know in and of itself that there is a causal link between, say, activity in the supplementary motor area and activity in primary motor cortex, or between any other cerebral regions that are so causally linked. The knowledge contained in such feedforward links is thus implicit to the extent that there is no mechanism to access it directly. Our proposal, largely based on Friston's own analysis, is that the brain learns to render this implicit knowledge explicit by re-describing it through unconscious prediction-driven mechanisms. This is essentially the mechanism that our simulations attempt to capture.
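As a toy illustration of the inner loop, and not a reproduction of the simulations reported here, the sketch below shows a learner that observes activity in two hypothetical regions, A and B, and uses a simple error-driven (delta-rule) update to form an explicit estimate of the link between them. The link itself (`true_gain`) is hidden from the learner and is known only through its effects. All names, the scalar weight and the parameter values are assumptions chosen for brevity.

```python
import random


def train_inner_loop(steps=2000, lr=0.05, true_gain=0.8, seed=0):
    """Learn to predict activity in region B from activity in region A.

    The causal link B = true_gain * A + noise is never given to the learner;
    only the two activity values are observed at each step.
    """
    rng = random.Random(seed)
    w = 0.0  # the learner's explicit estimate of the hidden link
    for _ in range(steps):
        a = rng.uniform(-1.0, 1.0)                # activity in region A
        b = true_gain * a + rng.gauss(0.0, 0.05)  # resulting activity in region B
        error = b - w * a                         # prediction error
        w += lr * error * a                       # delta rule: reduce squared error
    return w


if __name__ == "__main__":
    print(train_inner_loop())
```

After training, the learned weight approximates the hidden gain: knowledge that was only implicit in the feedforward link is now carried by a separate, accessible parameter, which is the sense of re-description intended above.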
The second loop is the familiar ‘perception–action loop’. It results from the agent as a whole continuously predicting the consequences of its actions on the world.
The third loop is the ‘self–other loop’, and links the agent with other agents, again using the exact same set of prediction-based mechanisms as involved in the other two loops. The existence of this third loop is constitutive of conscious experience, we argue, for it is in virtue of the fact that as an agent I am constantly attempting to model other minds that I am able to develop an understanding of myself. The processing carried out by the inner loop is thus causally dependent on the existence of both the perception–action loop and the self–other loop, with the entire system forming a ‘tangled hierarchy’ (e.g. Hofstadter's concept of ‘a strange loop’) of predictive internal models.
This third predictive loop thus extends beyond the agent into the social world. Consistent with the recent proposal by Carruthers, we surmise that understanding ourselves depends on the ability to anticipate the consequences of our actions on other agents. Roughly, successfully anticipating how other agents will react to the actions we direct towards them requires that we have built internal models of how such agents will react to our actions. We assume that such model building is enabled by automatic prediction of the other's actions in ongoing dynamic interaction [37,54].
Recently, Schilbach et al. [55,56] have suggested that, ontogenetically, becoming an expert in social cognition may crucially depend on social interaction, while later competencies of more detached, reflective social cognition (mirroring, mentalizing) could be a result of reactivating the neural networks forged during social interactions (neural ‘re-use’) and representationally re-describing these interaction-based capacities [1,19]. Crucially for this third loop, the ‘social’ re-description is not an internally generated, qualitatively different representation of discrete knowledge about the world, but an ongoing learning process driven by increasingly complex interactive contexts, for instance, when moving from dyadic to triadic interaction, which creates the possibility and need to communicate with respect to an external, third object or person. In this light, language, for example, might not only be shaped by social interaction, but also the other way around, with the gradual development of language providing a scaffolding that allows implicit social know-how to develop into explicit social knowledge. Social context as a driving force for learning has, indeed, been recognized in language learning, child development and social cognition. Recently, it has also been suggested that mirror neurons might be the result of reinforcement learning [62–64]. Thus, the third loop conceptualizes metacognition as resulting from predictive learning mechanisms that allow agents to learn simultaneously about the environment and about their own internal representations. The ongoing re-descriptions that this entails offer a potential explanation of how implicit precursors to mentalizing (such as gaze following) later develop into explicit Theory of Mind and our capacity to consciously reason about others and ourselves.
Finally, the idea that all three loops may be subtended by the same mechanisms is supported by recent findings that metacognition, social interactions and the processing of self-relevance all involve the recruitment of a common set of brain areas. Using an activation-likelihood estimation (ALE) approach, Schilbach et al. recently investigated the statistical convergence of results from functional neuroimaging studies that had, respectively, targeted social cognition, emotional processing and unconstrained cognition, based on the assumption that a ‘common denominator’ could exist in cognitive terms, consisting in a reliance on introspective processes, in particular, prospective metacognition. By exploring the commonalities of the results from these three individual meta-analyses by means of a conjunction analysis, the authors were, indeed, able to provide empirical evidence for a shared neural network localized in dorso-medial prefrontal cortex and in the precuneus. These two regions are known to be critical hubs in the neurofunctional architecture of the human brain [67–73] and have been shown to be closely related to introspective ability. Crucially, comparing the results of our conjunction analysis with the recent findings by Fleming et al. demonstrates anatomical overlap both in the PFC and the precuneus (figure 5).
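To make the logic of a conjunction analysis concrete, the sketch below implements one common formulation, the voxel-wise minimum statistic: a voxel counts as shared only if its value reaches threshold in every input map. This is a simplified illustration under stated assumptions and not necessarily the procedure of the cited meta-analyses; the maps are short lists of made-up values, and the function name and threshold are hypothetical.

```python
def conjunction(maps, threshold):
    """Voxel-wise minimum-statistic conjunction over several maps.

    Each map is a sequence of per-voxel values on the same voxel grid.
    A voxel survives only if its smallest value across maps reaches the
    threshold; surviving voxels keep that smallest value, others become 0.0.
    """
    n_voxels = len(maps[0])
    if any(len(m) != n_voxels for m in maps):
        raise ValueError("all maps must cover the same voxels")
    result = []
    for values in zip(*maps):
        lowest = min(values)
        result.append(lowest if lowest >= threshold else 0.0)
    return result


if __name__ == "__main__":
    # Made-up values for three hypothetical meta-analytic maps.
    social = [3.0, 0.5, 2.5, 0.0]
    emotional = [2.0, 4.0, 2.2, 1.0]
    unconstrained = [2.8, 3.0, 0.1, 0.0]
    print(conjunction([social, emotional, unconstrained], threshold=1.5))
```

Taking the minimum rather than, say, the mean ensures that a strong effect in one map cannot compensate for the absence of an effect in another, which is what licenses the claim that a region is common to all three domains.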
Interestingly, the two brain regions that appear to be involved both in social cognition and introspective or meta-cognitive processes are part of what has become known as the ‘default mode of brain function’. We have recently argued that this convergence might be taken to suggest that the physiological baseline of the human brain, i.e. the default mode network (DMN), is related to a psychological baseline of social cognition. Here, we extend this argument by suggesting that social interactions might enable introspective processes and conscious experience while relying on changes in the activity of the DMN. Congruently, Carhart-Harris & Friston have recently argued that the DMN might realize the Freudian secondary process, i.e. the ‘mode of cognition of the ego’, or in other words, normal waking consciousness. Strikingly, this analysis is rooted in a Bayesian perspective on the brain, which assumes that the brain uses internal hierarchical models to predict its sensory inputs and suggests that neural activity tries to minimize the ensuing prediction error or (Helmholtzian) free energy [52,74]. Consistent with the proposal of key regions of the DMN subserving introspective processes and social cognition, and our claim that these abilities take time to develop, it has been found that connectivity within the DMN develops through ontogeny [75,76]. Importantly, such developments hinge upon interactions with the environment and might be necessary to establish a balance between internally oriented cognition and engagement with the external world.
Apart from the empirical evidence for an anatomical overlap of the brain regions relevant for introspection and social interaction, Carhart-Harris & Friston's account can also be taken to suggest that all of the three loops, which we assume are relevant for metacognition, rely on similar neural mechanisms, namely internal models that are used to predict network changes based either on sensory input or on endogenously generated activation.
Overall, our perspective is thus akin to the sensorimotor or enactive perspective and to the general conceptual framework provided by forward modelling, in the sense that awareness is linked with knowledge of the consequences of our actions. Crucially, however, we extend the argument inwards (the inner loop) and further outwards (the self–other loop), and specifically towards social cognition (see also ). Our representations of ourselves are shaped by our history of interactions with other agents. Learning about the consequences of the actions that we direct towards other agents requires more sophisticated models than learning about interactions with objects, for agents, unlike objects, can react to actions directed towards them in many different ways as a function of their own internal state. A further important point here is that caretakers act as external selves during development, interpreting what happens to developing children for them, and so providing meta-representations where these are lacking. In this light, theory of mind can thus be understood as rooted in the very same mechanisms of predictive re-descriptions as involved when interacting with the world or with oneself (see also ).
Thus, we end with the following idea, which we call the ‘Radical Plasticity Thesis’: the brain continuously and unconsciously learns not only about the external world and about other agents, but also about its own representations of both. The result of this unconscious learning is conscious experience, in virtue of the fact that each representational state is now accompanied by (unconsciously learnt) meta-representations that convey the mental attitude with which the first-order representations are held. From this perspective, there is nothing intrinsic to neural activity, or to information per se, that makes it conscious. Conscious experience involves specific mechanisms through which particular (i.e. stable, strong and distinctive) unconscious neural states become the target of further processing, which we surmise involves some form of representational re-description in the sense described by Karmiloff-Smith.
B.T. is supported by a European Commission Marie Curie Fellowship FP7-PEOPLE-IEF 237502 ‘Social Brain’. L.S. is supported by the Koeln Fortune Programme of the Medical Faculty, University of Cologne and the Volkswagen Foundation. A.C. is a Research Director with the National Fund for Scientific Research (F.R.S.-FNRS, Belgium).
One contribution of 13 to a Theme Issue ‘Metacognition: computation, neurobiology and function’.
- This journal is © 2012 The Royal Society