The glmS ribozyme-riboswitch is the first known example of a naturally occurring catalytic RNA that employs a small molecule as a coenzyme. Binding of glucosamine-6-phosphate (GlcN6P) activates self-cleavage of the bacterial ribozyme, which is part of the mRNA encoding the metabolic enzyme GlcN6P-synthetase. Cleavage leads to negative feedback regulation. GlcN6P binds in the active site of the ribozyme, where its amine could function as a general acid and electrostatic catalyst. The ribozyme is pre-folded but inactive in the absence of GlcN6P, demonstrating it has evolved strict dependence on the exogenous small molecule. The ribozyme showcases the ability of RNA to co-opt non-covalently bound small molecules to expand its chemical repertoire. Analogue studies demonstrate that some molecules other than GlcN6P, such as l-serine (but not d-serine), can function as weak activators. This suggests how coenzyme use by RNA world ribozymes may have led to evolution of proteins. Primordial cofactor-dependent ribozymes may have evolved to bind their cofactors covalently. If amino acids were used as cofactors, this could have driven the evolution of RNA aminoacylation. The ability to make covalently bound peptide coenzymes may have further increased the fitness of such primordial ribozymes, providing a selective pressure for the invention of translation.
The glmS ribozyme is a catalytic RNA derived from a ligand-dependent self-cleaving mRNA domain conserved among Gram-positive bacteria (reviewed in ). In these organisms, the 5′-untranslated region (5′-UTR) of the mRNA encoding the protein enzyme glucosamine-6-phosphate (GlcN6P) synthetase self-cleaves when it binds to GlcN6P . Self-cleavage of the 5′-UTR leads to degradation of the mRNA, resulting in negative feedback regulation of the protein enzyme by its product metabolite . GlcN6P synthetase is an essential enzyme that catalyses the first committed step in the metabolic pathway that leads to synthesis of the bacterial cell wall. The activity is universally present, but is catalysed by two types of protein enzymes distributed in different taxa . Eukaryotes and Gram-negative bacteria have a GlcN6P synthetase that is allosterically controlled by its reaction product. Gram-negative bacteria such as Escherichia coli also regulate expression of the enzyme at the mRNA level using small antisense RNAs. The GlcN6P synthetase of Gram-positive bacteria, in contrast, is not an allosteric enzyme. Thus, the regulation of protein expression at the level of mRNA by the ribozyme-riboswitch domain is likely to be important . Indeed, it has been shown that replacement of the ribozyme by a catalytically compromised mutant results in abrogation of sporulation by Bacillus subtilis .
In vitro, the self-cleaving domain of the glmS mRNA can be engineered to function as a multiple-turnover catalyst, that is, a ribozyme. Deletion analysis showed that the ribozyme is comprised of approximately 150 nucleotides (nt) and that a single nucleotide 5′ of the cleavage site suffices for maximum GlcN6P-induced activity. Furthermore, approximately 75 nt RNA constructs (with one nt 5′ of the cleavage site) are responsive to GlcN6P, but exhibit reduced catalytic activity . Structure determination of the full-length glmS ribozyme  demonstrated that it is composed of a doubly pseudoknotted approximately 75 nt core domain that encompasses the cleavage site and the GlcN6P-binding site, and a peripheral element that appears to stabilize the core, fully explaining the biochemical results (figure 1).
2. Mechanism of glucosamine-6-phosphate activation of the ribozyme
GlcN6P could activate the glmS ribozyme by two mechanisms. The small molecule could function as an allosteric effector, its binding to an allosteric site leading to a conformational rearrangement that activates the ribozyme. Alternatively, the small molecule could function as a catalytic cofactor or coenzyme, binding to the active site of the ribozyme and providing chemical functionality that is indispensable for catalysis. Structural and biochemical evidence strongly supports a coenzyme function for GlcN6P in glmS ribozyme activation. Crystal structures have been determined of glmS ribozymes from two different Gram-positive bacteria, each in multiple functional states (e.g. GlcN6P-free pre-cleavage, GlcN6P-bound pre-cleavage, GlcN6P-bound transition-state mimic, post-cleavage). The RNA structure is experimentally identical in all states [5–9]. In addition, it was shown that the glmS ribozyme crystallized in the absence of GlcN6P could be activated for cleavage in the crystalline state by briefly soaking the crystals in solutions containing GlcN6P. Structure determination using such soaked crystals revealed an RNA in a post-cleavage state that was structurally indistinguishable from structures obtained from crystals of ribozymes activated and cleaved in solution prior to crystallization . Thus, global motions of the RNA (which would be incompatible with the crystal lattice) are not required for activation, catalysis or release of the 5′-cleavage product. Lack of GlcN6P-induced conformational changes of the glmS ribozyme is also supported by the results of hydroxyl-radical footprinting  and fluorescence resonance energy transfer (FRET) experiments .
GlcN6P binds to the RNA with its amine group in van der Waals contact with the scissile phosphate; that is, the GlcN6P-binding site is the active site of the ribozyme, not an allosteric effector site . The functional importance of the amine group of GlcN6P for glmS ribozyme activity is underscored by studies using the isosteric sugar, glucose-6-phosphate (Glc6P). This compound, which differs from GlcN6P only in having a hydroxyl rather than an amine group at position 2 of the pyranose ring, is not an activator of the ribozyme . Cocrystal structures show that it binds precisely in the same location as GlcN6P. As expected from the foregoing, the conformation of the Glc6P-bound ribozyme is identical to those of the ribozyme in other functional states . Biochemical studies show that Glc6P competes with GlcN6P for binding to the glmS ribozyme, functioning in effect as an antagonist . Since the ribozyme is pre-folded but completely inactive in the absence of GlcN6P, the most parsimonious conclusion is that this compound functions as a catalytic cofactor of the RNA.
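The antagonism described above can be illustrated with a simple competitive-binding calculation. The following is a minimal sketch, not a model taken from the cited studies: it assumes a single shared binding site, equilibrium binding and ligand concentrations in large excess over the RNA. The function name and all dissociation constants and concentrations are hypothetical values chosen only for illustration.

```python
def fraction_activator_bound(activator_conc, kd_activator,
                             competitor_conc=0.0, kd_competitor=1.0):
    """Fraction of ribozyme occupied by the activator (e.g. GlcN6P).

    Simple competitive binding at one shared site:
        theta = (A/Kd_A) / (1 + A/Kd_A + C/Kd_C)
    All concentrations and Kd values must use the same units.
    """
    a = activator_conc / kd_activator
    c = competitor_conc / kd_competitor
    return a / (1.0 + a + c)


if __name__ == "__main__":
    # Hypothetical numbers (micromolar); NOT measured values from the literature.
    kd_glcn6p = 100.0
    kd_glc6p = 100.0
    glcn6p = 100.0
    for glc6p in (0.0, 100.0, 1000.0):
        theta = fraction_activator_bound(glcn6p, kd_glcn6p, glc6p, kd_glc6p)
        print(f"Glc6P = {glc6p:7.1f} uM -> fraction with GlcN6P bound = {theta:.3f}")
```

Under these assumptions, increasing the concentration of the inactive analogue lowers the fraction of ribozyme that carries the activator, which is the behaviour expected of an antagonist that occupies the active site without supplying the amine.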
3. Plasticity of natural and artificial gene-regulatory RNAs
The glmS ribozyme is also a riboswitch, that is, an mRNA domain that controls gene expression in cis in response to the intracellular concentration of a small molecule metabolite or second messenger (reviewed in [13,14]). However, its mechanism of action and its structural rigidity make it an atypical riboswitch. Five distinct mechanisms of gene regulation have been documented for riboswitches. First, the majority of known bacterial riboswitches function by transcriptional attenuation. The ligand-binding (or ‘aptamer’) domain of the riboswitch and a ρ-independent terminator stem-loop compete for folding, with ligand binding altering the efficiency of formation of the latter. (Aptamers are RNAs evolved in vitro to bind to specific ligands .) Second, many bacterial riboswitches function at the level of translation initiation. In these, the ligand-bound conformation of the ligand-binding domain (the ‘aptamer’ domain) of the riboswitch competes with another that either occludes or exposes the Shine–Dalgarno element. Third, ligand binding by some riboswitches results in alternative splicing of introns. The thiamine pyrophosphate (TPP)-responsive thi-box riboswitch is the only riboswitch thus far described in eukaryotes. It is present in some introns in algae, fungi and plants, and it controls alternative splicing by exposing splice junctions or splicing enhancers depending on TPP concentration (reviewed in ). In bacteria, group I self-splicing introns have been described  that select between two alternative splice sites based on binding of cyclic diguanylate (c-di-GMP). Fourth, an S-adenosylmethionine (SAM)-responsive riboswitch has been described that initially functions through transcriptional attenuation, but in which the prematurely terminated transcript induced by ligand binding also functions in trans as an antisense RNA . Finally, the glmS riboswitch is the first known example of a riboswitch that functions by ligand-induced self-cleavage. 
Whereas the first four mechanisms of riboswitch action require that the riboswitch (comprised of aptamer domain and the ‘expression platform’, the RNA sequences that interface with the transcription, translation or splicing machinery) adopt at least two mutually exclusive conformations, the glmS riboswitch is rigid and fully assembled prior to GlcN6P binding, and it is only its catalytic activity that switches (by approximately a million-fold).
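To put the approximately million-fold switch in catalytic activity on an energetic scale, the rate ratio can be converted into an apparent change in activation free energy. The sketch below assumes transition-state-theory behaviour (rate proportional to exp(−ΔG‡/RT)) and a temperature of 25 °C; neither assumption comes from the text, and the function name is hypothetical.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol K)


def barrier_reduction_kcal(rate_enhancement, temperature_k=298.15):
    """Apparent lowering of the activation free energy (kcal/mol).

    Assumes k is proportional to exp(-dG/RT), so that a rate ratio r
    corresponds to ddG = R * T * ln(r).
    """
    if rate_enhancement <= 0:
        raise ValueError("rate_enhancement must be positive")
    return R_KCAL_PER_MOL_K * temperature_k * math.log(rate_enhancement)


if __name__ == "__main__":
    # A million-fold activation, at the assumed temperature of 25 degrees C.
    ddg = barrier_reduction_kcal(1e6)
    print(f"1e6-fold activation ~ {ddg:.1f} kcal/mol lower barrier")
```

With these assumptions a million-fold activation corresponds to roughly 8 kcal/mol, an amount on the order of a few hydrogen bonds or electrostatic contacts, so a rigid active site can plausibly achieve it without refolding.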
4. Ligand selectivity of the glmS ribozyme
Riboswitches and artificial aptamers that recognize their ligands by partially or wholly enveloping them are known to undergo folding transitions upon ligand binding (e.g. [19–25]). Conversely, the rigid and pre-folded glmS ribozyme binds GlcN6P in an open, solvent-accessible pocket (figure 1) . Analogue studies  underscore the importance of three of the functional groups of GlcN6P that interact with the glmS ribozyme: the anomeric hydroxyl, the amine and the phosphate. A variety of compounds that present vicinal amine and hydroxyl groups are weak activators of the glmS ribozyme. Such (presumably adventitious) activators include glucosamine (GlcN), l-serine, serinol, Tris and ethanolamine (figure 2). Importantly, the apparent pKa of the ribozyme reaction tracks the pKa of the amine group of the activating compound . Thus, the pKa of the reaction rises from 7.9 to 8.7 when GlcN is replaced with serinol (the pKa values of their amine groups are 7.8 and 8.8, respectively). This confirms the importance of the amine group of the activator, suggested by the intimate contact between the amine of GlcN6P and the scissile phosphate [5,6,8]. However, analogues also point to the importance of the vicinal hydroxyl group, and to the precise stereochemical relationship between the amine and the hydroxyl groups. Thus, ammonia and methylamine are not activators, and d-serine, in which the stereochemical arrangement of the two functional groups is opposite that of GlcN6P (and l-serine) is not an activator . The ability of GlcN or glucosamine-6-sulphate to serve as weak activators of the glmS ribozyme also indicates that the phosphate of GlcN6P is not chemically essential, even though it is important for increased binding affinity to the RNA. In addition to interacting with the RNA through outer-sphere coordinated metal ions, the phosphate of GlcN6P receives a hydrogen bond from the N1 imine of riboswitch residue G1 .
It was found that replacing G1 with purines with a non-protonated N1 position (such as A, 2-aminopurine, or dimethyladenine) resulted in riboswitches more strongly activated by GlcN than GlcN6P, presumably because of the clash between the phosphate of GlcN6P and the unprotonated N1 of these purines .
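The relationship between the amine pKa of an activator and its protonation state can be made concrete with the Henderson–Hasselbalch equation. The sketch below uses the amine pKa values quoted above for GlcN (7.8) and serinol (8.8); the pH values scanned and the function name are assumptions made for illustration. The calculation describes only the free compound in solution and makes no claim about which protonation state is catalytically relevant or about pKa shifts upon binding to the RNA.

```python
def fraction_protonated(ph, pka):
    """Fraction of an amine in its protonated (ammonium) form.

    Henderson-Hasselbalch: [BH+] / ([BH+] + [B]) = 1 / (1 + 10**(pH - pKa)).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


if __name__ == "__main__":
    # Amine pKa values as quoted in the text for the free activators.
    activators = {"GlcN": 7.8, "serinol": 8.8}
    # Illustrative pH values (assumed, not taken from a specific experiment).
    for ph in (7.0, 7.5, 8.0, 8.5, 9.0):
        row = ", ".join(
            f"{name}: {fraction_protonated(ph, pka):.2f}"
            for name, pka in activators.items()
        )
        print(f"pH {ph:.1f} -> fraction protonated: {row}")
```

Because the midpoint of each titration curve sits at the pKa of the amine, a one-unit difference in amine pKa shifts the whole curve by one pH unit, in line with the observation that the apparent pKa of the reaction moves from 7.9 to 8.7 when GlcN is replaced with serinol.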
5. Catalytic mechanism of the glmS ribozyme
The glmS ribozyme catalyzes the same overall chemical transformation as other well-characterized self-cleaving RNAs, such as the hepatitis delta virus (HDV), hairpin and hammerhead ribozymes (reviewed in ). RNA cleavage proceeds through a concerted (SN2) transesterification in which the 2′-OH of the nucleotide preceding the scissile phosphate (residue −1) attacks the phosphorus of the scissile phosphate. A transition state featuring a pentacovalent phosphorus is resolved by departure of the 5′-oxo group of residue +1, and formation of a 2′,3′-cyclic phosphate (figure 3). A variety of strategies are employed by RNA to catalyse this reaction. The HDV ribozyme employs a water molecule chelated by a magnesium ion to function as a specific base to deprotonate the nucleophilic 2′-OH, and the N3 amine of a cytosine residue (C75) to serve as a general acid, to protonate the leaving group [27,28]. In order that it functions effectively as a general acid, the pKa of the N3 of C75 is perturbed by over 2 pH units from its undisturbed value of approximately 4.2. Indeed, C75 is responsible for most of the catalytic rate enhancement achieved by the HDV ribozyme (figure 4) . The hairpin ribozyme has two nucleobases, G8 and A38, positioned to serve as general base and general acid in the cleavage reaction [30,31]. This ribozyme, unlike the HDV or glmS ribozymes, also catalyzes the reverse ligation reaction. For catalysis of the ligation, the catalytic role of the nucleotides has to be reversed (because of the principle of microscopic reversibility). Raman spectroscopic studies of crystalline hairpin ribozymes show that the pKa of A38 is perturbed by nearly 2 pH units . In addition, crystallographic studies show that G8 and A38 have the ability to preferentially bind to the transition state of the reaction, thus lowering the activation energy by hydrogen bonding . 
The hammerhead ribozyme appears to use yet another strategy, in which N1 of G12 functions as a general base to deprotonate the nucleophilic 2′-OH of the cleavage reaction, and the 2′-OH of G8 functions as a general acid . Like the hairpin ribozyme, the hammerhead ribozyme also catalyses ligation during which the role of the catalytic functional groups must be reversed. Importantly, it can be shown that near-neutral pKa values are not a strict condition for functional groups to serve as general acid–base catalysts [35,36].
The location of the amine group of GlcN6P in the active site of the glmS ribozyme is analogous to that of the N3 of C75 in the HDV ribozyme and A38 in the hairpin ribozyme [5,30,37]. This suggests that the small molecule would function as a general acid catalyst (figure 4). The simplest interpretation of the catalytic inactivity of the glmS ribozyme in the absence of GlcN6P is that the small molecule is responsible for all the catalytic power of the ribozyme. However, mutational analysis does not support this. All known glmS ribozyme isolates have a G residue at position 40 (Thermoanaerobacter tengcongensis numbering scheme). G40 is positioned with its N1 imine within hydrogen bonding distance of the nucleophilic 2′-OH of A−1. Thus, it is possible that G40 functions as a general base in catalysis, or that it serves to orient the nucleophilic 2′-OH during the reaction. Mutation of G40 to A completely abrogates the activity of the ribozyme, underscoring its importance. Structure determination of the G40A mutant reveals that the mutant ribozyme folds into a structure indistinguishable from that of the wild-type with, most surprisingly, GlcN6P bound in precisely the same position as in the active ribozyme . This demonstrates that while GlcN6P is necessary for glmS ribozyme activity, it is not sufficient. It also indicates that whatever the precise role of G40 in catalysis might be, it becomes manifest only when GlcN6P binds the ribozyme. Thus, the active site of the glmS ribozyme is not simply a rigid, passive scaffold for GlcN6P binding. Rather, the RNA and the small molecule appear to tune each other's chemical properties to give rise to catalysis. Two molecular dynamics studies reached conflicting conclusions regarding the potential catalytic role of G40 [38,39]. The exact nature of this coupling between the glmS ribozyme active site and GlcN6P remains unresolved.
6. Expansion of RNA chemistry by exogenous small molecules
The glmS ribozyme is the first characterized example of a natural catalytic RNA that employs a small molecule cofactor. Such a role for small molecules has precedents in artificial RNAs. In vitro selection methodology has been employed by many groups to isolate RNAs capable of catalysing a variety of chemical transformations from pools of random sequences (reviewed in [40,41]). Several such experiments have isolated ribozymes that employ a non-covalently bound small molecule as a cofactor (rather than, or in addition to, a small molecule substrate). For instance, Meli et al.  mutagenized the hairpin ribozyme and isolated variants that are dependent on free adenine for catalytic activity. Biochemical analyses of these ribozymes suggest that the exogenous purine base participates in acid–base catalysis. Starting with a pool of random sequence RNA, Tsukiji et al.  selected alcohol dehydrogenase ribozymes that employ NAD+ as a redox cofactor. These ribozymes were capable of rate enhancements of 10⁷-fold over the uncatalysed reaction. From a structural standpoint, the ability of RNA molecules to bind to adenine or NAD+ and employ them as cofactors is not surprising, given the chemical similarity of these molecules to nucleic acids. Indeed, White  proposed that the fact that many coenzymes (NAD, FAD, CoA, etc.) are nucleotides or contain heterocycles that could be derived from nucleotides (TPP, tetrahydrofolate, pyridoxal phosphate) could indicate that the metabolism of primordial organisms was catalysed by nucleic acids, and that these coenzymes are remnants of the active sites of such ancestral ribozymes.
Structural and biochemical studies of riboswitches and aptamers show that many of these RNAs undergo large conformational rearrangements concomitant with ligand binding. Likewise, the plasticity of RNA is a recurrent theme in RNA–protein interactions (e.g. [45,46]). Noller  has suggested that the fact that many peptide-binding RNAs display large ligand-driven folding transitions may hint at the primordial driving force for the invention of translation. Specifically, peptides could have first served as facilitators for the folding of primordial ribozymes, and translation evolved to make synthesis of such peptides more efficient and reproducible. The successful design of synthetic aptazymes, artificial catalytic RNAs that require binding of a small molecule (by an aptamer domain) in order to achieve catalytic activity, demonstrates that ligand-induced folding of RNA can be functionally coupled to the catalytic competence of ribozymes .
7. Amino acids as primordial ribozyme cofactors?
The mechanism of action of the glmS ribozyme, which does not fold concomitant with ligand binding and which can use cofactors that do not have obvious chemical similarity to nucleotides (amine sugars or even amino acids), suggests how small molecules may have expanded RNA chemistry in the RNA world, rather than functioning as folding chaperones. Activation of the glmS ribozyme by l-serine is particularly suggestive in the context of speculation about cofactor use by primordial ribozymes for two reasons. First, it has often been posited that the greater chemical diversity of amino acids over nucleotides would have been a driving force for the transition from RNA catalysts to protein enzymes (reviewed in ). Serine provides the glmS ribozyme with two chemical groups (a primary amine and a primary hydroxyl) that are absent in RNA, indeed showing that amino acids have more chemical diversity than RNA. However, the greater chemical diversity of amino acids does not, per se, explain why the conventional set of 20 amino acids was chosen. It can be postulated that primordial metabolism was catalysed by numerous ribozymes that shared a set of amino acid cofactors, and that the specific set of contemporary amino acids is the result of a contingent choice of cofactors by early ribozymes that became fixed through the ‘principle of many users’ or ‘principle of continuity’ . Once a particular set of amino acids became required for the function of many genetically distinct ribozymes, this set could not be altered. Any change would have necessitated the simultaneous coevolution of multiple amino acid-dependent ribozymes. This is analogous to the explanation given for the persistence of particular coenzymes that are essential for the function of the majority of protein enzymes . Second, the glmS ribozyme is activated by l-serine but not d-serine.
While this is a trivial stereochemical consequence of the chirality of RNA, it nonetheless suggests that the choice of l-amino acids in early evolution may have been the result of a contingent choice of cofactors by a group of primordial l-amino acid-dependent ribozymes. That choice would also have been fixed through the principle of many users (this does not provide an explanation for the choice of d-ribose to make RNA).
8. Covalent attachment of cofactors to RNA and the evolution of translation
Others have speculated that tRNA aminoacylation originated either as a means of conferring a replicative advantage to a primordial tRNA-like genomic tag  or as a form of post-transcriptional RNA modification that would confer new structural or functional capabilities to the nucleic acid . If, as suggested above, primordial metabolism was catalysed by a set of ribozymes some of which evolved dependence on amino acids functioning as coenzymes, then RNA aminoacylation (and the subsequent evolution of translation) would have arisen as a consequence of the evolutionary pressure to improve the efficiency of such ribozymes by covalent attachment of their cofactors. In vitro selection experiments demonstrate that even fairly simple ribozymes can carry out aminoacylation with high specificity and regioselectivity (e.g. ).
In summary, the glmS ribozyme is a small-molecule-dependent catalytic RNA that is widespread among Gram-positive bacteria, where it regulates a key metabolic pathway. It employs the small molecule GlcN6P not to facilitate folding, but as part of the active site, as a cofactor or coenzyme. This shows that RNA can employ non-covalently bound small molecules to expand its chemical functionality. Moreover, in vitro mechanistic characterization of the glmS ribozyme demonstrates that it can employ l-serine as an adventitious coenzyme. This use of a free amino acid as a coenzyme by a ribozyme suggests how the use of amino acids in modern biology may have derived from the primordial use of these small molecules as coenzymes in the RNA world, and that the specific set of amino acids used throughout contemporary biology may reflect the ‘basis set’ of amino acid cofactors shared among the ribozymes of the RNA world in which translation evolved.
The author thanks past and current members of the Ferré-D'Amaré laboratory for their many contributions, and in particular D. Klein for his work on the glmS ribozyme. This research was supported by the intramural programme of the National Heart, Lung and Blood Institute, National Institutes of Health.
One contribution of 17 to a Discussion Meeting Issue ‘The chemical origins of life and its early evolution’.
This journal is © 2011 The Royal Society