Thermodynamic modelling of organic synthesis has largely focused on deep-sea hydrothermal systems. When seawater mixes with hydrothermal fluids, redox gradients are established that serve as potential energy sources for the formation of organic compounds and biomolecules from inorganic starting materials. This energetic drive, which varies substantially depending on the type of host rock, is available for both abiotic (outside the cell) and biotic (inside the cell) processes. Here, we review and interpret a library of theoretical studies that target organic synthesis energetics. The biogeochemical scenarios evaluated include those in present-day hydrothermal systems and in putative early Earth environments. These studies consistently show that the formation of relatively simple organic compounds and biomolecules can be energy-yielding (exergonic) at conditions that occur in hydrothermal systems. Expanding on our ability to calculate biomass synthesis energetics, we also present here a new approach for estimating the energetics of polymerization reactions, specifically those associated with polypeptide formation from the requisite amino acids.
It is generally accepted that organisms synthesize proteins, lipids, nucleic acids and other cellular constituents at an energetic cost. However, unless we invoke panspermia (the seeding of life from an extraterrestrial source), the prebiotic formation of the molecules that assembled into the first cells must have been thermodynamically favourable somewhere on the early Earth. In the build-up to the earliest metabolic pathways and the emergence of cells, abiotic synthesis of first simple and then more complex organic compounds was necessarily driven by natural geochemical energy sources. Since their discovery in 1977, deep-sea hydrothermal systems have been viewed as one of the most plausible environments for abiotic organic synthesis and the origin of life. Support for this theory is far-reaching and diverse, with complementary arguments from microbiology [1–3] and experimental chemistry [4–6]. A recent review of 50 years of experiments on the emergence of life concluded that ‘the best results [in these experiments] were achieved at temperatures between 60 and 90°C’, consistent with the hydrothermal system argument [7, p. 5485]. Geochemical gradients that became established naturally in such systems could have enabled organic synthesis and subsequently the evolution of free-living organisms. Specifically, the chemical energy required for these processes is embodied in redox disequilibria, and these are as germane for organic synthesis (and ultimately biopolymer formation) on prebiotic Earth as they are in hydrothermal ecosystems today.
Hydrothermal systems—and here we use the term in the broadest sense, to include those on land and those in deep and shallow-seafloor settings—have sparked scientific interest far beyond the emergence of metabolism and life. For example, microbial diversity studies of these systems have revealed the presence of several hundred species of thermophilic archaea and bacteria, including many deeply rooted phyla in the universal tree of life [9–13]. Aerobes and anaerobes, phototrophs and chemotrophs, and autotrophs and heterotrophs can thrive at temperatures approaching and, in some cases, exceeding 100°C [9,14–22]. There is an ever-growing number of heat-loving organisms and robust evidence of their role in many biogeochemical processes. Despite this, the energetics of organic and biomolecule synthesis at elevated temperatures and pressures have not fully entered the mainstream scientific discussion. Such investigations are paramount to a better understanding of life's limits and can be used as predictive tools to explore the Earth's (and other planetary bodies') potentially large and diverse biosphere. Here, we review the numerous thermodynamic studies on this subject, considering the formation of organic compounds both outside the cell (i.e. abiotic reactions) and inside the cell (i.e. anabolism).
It should be noted that with respect to biomass formation, thermodynamic modelling is currently limited to the level of synthesis of biomolecular monomers (i.e. amino acids, nucleotides, fatty acids, saccharides and amines). Several studies have evaluated the thermodynamic costs of synthesizing these monomers from CO2 and other inorganic nutrients [23–25], but calculating the energetics of polymerization into biopolymers such as proteins, lipids and nucleic acids has been faced with substantial hurdles. A second goal of this communication, in addition to the review component, is to advance our understanding of the thermodynamics of polypeptide and protein formation.
2. Sources of thermodynamic data for organic compounds
Two types of data provide the foundation of any effort to assess the energy associated with organic synthesis reactions, regardless of whether these are abiotic or biotic processes. On the one hand, accurate analytical data from natural systems supply concentrations of all reactants and products, as well as ionic strengths required to convert concentrations into thermodynamic activities. On the other hand, internally consistent thermodynamic data provide an independent means of assessing the extent to which constituents of natural systems depart from states of equilibrium and, therefore, represent sources of energy if reactions among them are catalysed. In this section, we review sources of internally consistent thermodynamic data for organic compounds and summarize strategies that can be used to estimate additional values of compounds of interest.
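As a minimal sketch of how the two kinds of data combine, the Python snippet below converts measured molalities into activities and then evaluates the Gibbs energy of a reaction from ΔGr = ΔG° + RT ln Q. It is an illustration under stated assumptions, not the procedure used in the studies reviewed here. Activity coefficients are taken from the Davies equation with a Debye-Hückel parameter A = 0.509, which is appropriate only near 25°C and for ionic strengths up to roughly 0.5 molal; hydrothermal calculations would need temperature- and pressure-dependent parameters. All function names and numerical inputs are hypothetical.

```python
import math

R = 8.314462618  # gas constant, J mol^-1 K^-1

def ionic_strength(species):
    """I = 0.5 * sum(m_i * z_i^2) for an iterable of (molality, charge) pairs."""
    return 0.5 * sum(m * z ** 2 for m, z in species)

def davies_log10_gamma(z, ionic_str, a_dh=0.509):
    """Davies estimate of log10(activity coefficient); a_dh = 0.509 assumes ~25 degC."""
    root = math.sqrt(ionic_str)
    return -a_dh * z ** 2 * (root / (1.0 + root) - 0.3 * ionic_str)

def activity(molality, z, ionic_str):
    """Activity = molality * activity coefficient (neutral species get gamma = 1)."""
    return molality * 10 ** davies_log10_gamma(z, ionic_str)

def reaction_gibbs_energy(dg0, temp_k, reactants, products):
    """dGr = dG0 + R*T*ln(Q); reactants/products are lists of (activity, coefficient)."""
    ln_q = sum(nu * math.log(a) for a, nu in products)
    ln_q -= sum(nu * math.log(a) for a, nu in reactants)
    return dg0 + R * temp_k * ln_q

# Hypothetical fluid: 0.5 molal of a 1:1 electrolyte sets the ionic strength.
I = ionic_strength([(0.5, +1), (0.5, -1)])
a_bicarbonate = activity(2.0e-3, -1, I)   # charged solute, gamma < 1
a_hydrogen = activity(1.0e-3, 0, I)       # neutral solute, gamma = 1 in this model
```

The sign of the resulting ΔGr indicates whether a reaction can release energy at the measured composition; the standard-state term ΔG° is where the internally consistent database enters.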
The notion that data must be internally consistent to be useful in bioenergetic calculations can be understood by reflecting on the variety of experiments that yield thermodynamic properties. Some experiments provide data that apply directly to a compound in a particular form. Examples include direct calorimetric measurements on a sample of a liquid hydrocarbon or a crystalline amino acid. Other experiments provide data on a compound during a phase transition, or during a reaction involving other compounds. One example would be measurements of the heat of solution of a liquid hydrocarbon as it dissolves into an aqueous solution. Another would be measuring the heat of combustion of a crystalline amino acid as it is oxidized into CO2, N2 and H2O vapour. In the case of direct calorimetric measurements, there is little ambiguity about how to assign the resulting thermodynamic properties. When a reaction is involved, the property obtained refers to the reaction itself, and although it must be possible to combine the properties of the constituents in the reaction to obtain the reaction property, there are many possible combinations. Using the hydrocarbon example, obtaining an enthalpy value for the aqueous form from the heat of solution measurement depends on the choice of enthalpy for the liquid form. Likewise, obtaining an enthalpy value of the amino acid from the heat of combustion measurement depends on the enthalpy values selected for the gases CO2, N2 and H2O. To maintain internal consistency, the same data for these gases would need to be used to interpret heat of combustion measurements for another amino acid. In fact, if the same data for CO2 and H2O are used to interpret heat of combustion data for the liquid hydrocarbon, then the resulting thermodynamic properties of the hydrocarbon and the amino acid would be internally consistent, and so on. 
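The dependence described above can be made concrete with a short Hess's law calculation. The sketch below derives an enthalpy of formation for a C-H-O-N compound from a heat of combustion, given chosen reference values for CO2 and H2O vapour (N2 is an element in its reference state, so its enthalpy of formation is zero). The combustion value is a placeholder rather than a measurement, and the function name is hypothetical; the point is only that a different choice for CO2 propagates directly into the derived value.

```python
def formation_enthalpy_from_combustion(n_c, n_h, dh_comb, dhf_co2, dhf_h2o):
    """Enthalpy of formation (kJ/mol) of a compound with n_c carbons and n_h hydrogens.

    Combustion: compound + O2 -> n_c CO2 + (n_h/2) H2O + N2, with reaction
    enthalpy dh_comb = sum(dHf products) - dHf(compound).
    """
    return n_c * dhf_co2 + (n_h / 2.0) * dhf_h2o - dh_comb

DHF_CO2_GAS = -393.51    # kJ/mol, commonly tabulated value for CO2(g)
DHF_H2O_GAS = -241.83    # kJ/mol, commonly tabulated value for H2O(g)
DH_COMB_PLACEHOLDER = -860.0  # kJ/mol, hypothetical heat of combustion (not measured)

# Glycine-like stoichiometry, C2H5NO2: two carbons, five hydrogens.
base = formation_enthalpy_from_combustion(
    2, 5, DH_COMB_PLACEHOLDER, DHF_CO2_GAS, DHF_H2O_GAS)

# Same measurement interpreted with a CO2 value shifted by +0.5 kJ/mol.
alt = formation_enthalpy_from_combustion(
    2, 5, DH_COMB_PLACEHOLDER, DHF_CO2_GAS + 0.5, DHF_H2O_GAS)

offset = alt - base  # n_c * 0.5 = 1.0 kJ/mol of inconsistency
```

Two compounds interpreted with different CO2 values would therefore carry an offset proportional to their carbon numbers, which would appear as a spurious energy in any reaction between them.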
This is the underlying strategy for building an internally consistent database of thermodynamic properties for any substance, which permits any imaginable reaction among those substances to be studied. Because geochemical processes involve minerals, aqueous solutions, gases, silicate melts, petroleum, organic compounds and biomolecules, geochemists have constructed fairly diverse internally consistent thermodynamic databases.
Efforts in geochemistry to build an internally consistent thermodynamic database of organic compounds began with the publication of data for about 80 aqueous species. In the process, consistency of these data was established with thermodynamic data for inorganic aqueous ions and neutral solutes [27,28], as well as for minerals, gases and H2O. Internal consistency was maintained as data for inorganic ions and complexes were expanded [30–35]. These efforts used the revised Helgeson–Kirkham–Flowers (HKF) equation of state for aqueous solutes [36,37]. As a consequence, it became possible to consider the effects of organic oxidation–reduction reactions on the stabilities of minerals containing iron, sulfur and other redox-sensitive elements and vice versa, at the prevailing temperatures and pressures of sedimentary basins and hydrothermal systems. Such analyses led to new ideas regarding metastable equilibrium and hydrolytic disproportionation in geochemical processes [38–41], which provided the foundation for models of abiotic organic synthesis in hydrothermal systems [42–46]. It also became possible to examine the effects of organic acid decarboxylation reactions that produce CO2 on the stabilities of carbonate minerals.
Early efforts focused on expanding the number and variety of organic compounds that could be included in geochemical calculations. These involved new data for aqueous aldehydes, small peptides, alkylphenols, aqueous mono- and dicarboxylic acids and hydroxy acids, aqueous metal–organic complexes involving organic acids [51–53], chloroethylenes and carbohydrates. These developments depended on using experimental data to generate correlation algorithms that permit estimation of data that have not been measured directly, and were extended to thiols and organic sulfides. Additional correlations built on regression of experimental data permitted estimation of parameters for the revised-HKF equation of state.
At the time, the existing high-temperature–pressure thermodynamic data for aqueous organic compounds were relatively limited. This meant that the applicability of correlation algorithms was constrained to compounds with similar molecular structures. This problem was overcome by the introduction of group contribution estimation methods for families of aqueous organic compounds, including amino acids and unfolded proteins [59–61], and for pure hydrocarbons, carboxylic acids, amino acids, thiols, alcohols and other organic compounds as solids, liquids or gases. In the development of group contribution methods, experimental data are regressed to obtain contributions from functional groups of organic compounds. These functional-group contributions can then be summed to estimate data and equation of state parameters for individual compounds. The result is that a greater number and variety of organic compounds can be included in thermodynamic calculations. As an example, data for amino acids as pure solids and as aqueous solutes were combined to calculate solubilities as functions of pH at elevated temperatures and pressures. The group contribution approach for pure compounds was extended to include isoprenoids, hopanes, steranes, polycyclic aromatic compounds and a wide variety of S-bearing organic compounds [64,65].
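The summation step of a first-order group contribution method can be sketched in a few lines of Python. The group values below are invented placeholders, not regressed parameters from any of the cited studies, and the names are hypothetical; the sketch only shows how a property for a compound without direct measurements is assembled from the groups it contains.

```python
# Hypothetical first-order group contributions to a standard-state property
# (kJ/mol). Placeholder numbers for illustration only; real values are
# obtained by regressing experimental data.
GROUP_VALUES = {"CH3": -10.0, "CH2": 8.0, "COOH": -380.0}

def estimate_property(groups, group_values=GROUP_VALUES):
    """Sum group contributions; `groups` maps a group name to its count."""
    return sum(count * group_values[name] for name, count in groups.items())

# n-Carboxylic acids written as CH3-(CH2)n-COOH.
acetic_acid = {"CH3": 1, "COOH": 1}
propanoic_acid = {"CH3": 1, "CH2": 1, "COOH": 1}
dodecanoic_acid = {"CH3": 1, "CH2": 10, "COOH": 1}

estimates = {
    "acetic acid": estimate_property(acetic_acid),
    "propanoic acid": estimate_property(propanoic_acid),
    "dodecanoic acid": estimate_property(dodecanoic_acid),
}
```

In a first-order scheme such as this, each added CH2 group shifts the estimate by the same increment regardless of its neighbours, which is the simplification that second-order methods relax.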
Subsequent developments in group contribution methods for estimating thermodynamic properties of aqueous organic compounds included alkanes, cycloalkanes and aromatic hydrocarbons [66,67], alcohols and ketones, esters, S-bearing compounds, ethers and nitriles. These efforts produced widely applicable but somewhat less precise first-order group contribution methods, and more accurate but less inclusive second-order methods that take into account interactions among functional groups. Results of group contribution estimates are compared with experimental data for dozens of aqueous solutes at the ORCHYD website (orchyd.asu.edu). At nearly the same time, progress was made in extending group contribution estimation procedures to biomolecules, including a new take on amino acids, polypeptides and unfolded proteins, nucleic-acid bases, nucleotides and nucleosides, and magnesium complexes of adenosine nucleotides, as well as oxidized and reduced nicotinamide adenine dinucleotides (NAD) and NAD-phosphates. Internally consistent data and group contribution estimation methods for crystalline forms of many of the same compounds can be found in the latter publications, and internally consistent thermodynamic data for crystalline peptides were also contributed recently.
It is now possible to include organic compounds of increasing complexity from methane to proteins in geochemical calculations. The ability to include biomolecules, in particular, is permitting novel perspectives on how microbes interact with their environment, and how biochemistry operates when constrained by geochemistry. A new analysis provides the energetics associated with the degradation of organic compounds and offers thermodynamic explanations for why certain organic compounds persist in sediments, whereas others are rapidly transformed by microbial processes. This approach permits the explicit inclusion of thermodynamic properties of complex organic compounds in kinetic models of organic compound degradation. The availability of thermodynamic data for proteins permits assessments of their relative stabilities in various geochemical settings. As an example, it was shown that the protein compositions encoded in metagenomic data from a hot spring correlate with the changing oxidation state of the hot spring fluid as it flows and cools along its outflow channel. That study also showed that the energy requirements associated with making each set of proteins are tuned to the environmental conditions, including the activities of various solutes in the hot spring fluid. Much will be discovered as biochemical reactions are placed in the context of external geochemical habitats, as well as geobiochemical environments inside cells.
3. Organic synthesis outside the cell
Organic compounds—defined here as those with at least one C–C bond—can be synthesized biotically or abiotically from inorganic sources or from transformation of other organic compounds. Organic synthesis inside the cell—autotrophy—is covered in a subsequent section; here, we review thermodynamic evaluations of abiotic organic synthesis from CO2/HCO3− and other inorganic reactants in high-temperature aqueous solutions. Most of these studies consider present-day hydrothermal systems, but some also invoke scenarios for the early Earth or Mars. To place these theoretical studies into context, we first briefly remind readers of some of the relevant experimental and field evidence of abiotic organic synthesis at elevated temperatures in aqueous solutions.
(a) Experimental evidence of abiotic organic synthesis
Numerous experimental studies have assessed the potential for abiotic synthesis of organic compounds in geologic systems, focusing in particular on hydrothermal conditions. One primary focus of these studies has been formation of hydrocarbons and lipids by Fischer–Tropsch-type (FTT) synthesis and related reactions (reviewed in ). FTT reactions have been investigated under conditions intended to simulate those in deep-sea hydrothermal systems, including using fluid–rock interactions as the source of H2, inclusion of naturally occurring minerals as catalysts, use of dissolved CO2 as a carbon source and exclusion of a gas phase in the reactor [6,81–86]. When only aqueous phase reactants are included, these studies have so far only documented the formation of a few small organic compounds (methane, formic acid and light hydrocarbons), but these products are consistent with those observed in deep-sea hydrothermal fluids [87–89]. In other experiments, where a gas phase is present during the reaction, FTT synthesis produced higher yields and a more diverse suite of organic compounds that includes hydrocarbons, carboxylic acids and alcohols containing one to more than 30 carbon atoms [6,86]. It should be noted that a free gas phase is common at the low pressures in shallow-sea hydrothermal environments, but does not occur at the in situ pressures of deep-sea vent systems.
Numerous other experiments have investigated abiotic formation of compounds with more direct biological relevance, such as amino acids, sugars and nucleobases, under hydrothermal conditions. Several experimental studies have demonstrated the abiotic synthesis of amino acids during the heating of aqueous solutions containing combinations of formaldehyde, cyanide (CN−, in the form of HCN, KCN or NaCN) and ammonium (NH4+) [90–97]. Sugars are readily formed by the formose reaction when heating solutions of formaldehyde [98,99]; experimental studies have shown that this reaction is promoted by alkaline conditions and the presence of some minerals and glyceraldehydes. Purines can be synthesized by heating aqueous solutions of cyanides [97,100,101], and pyrimidines by including compounds such as malic acid or cyanoacetylene in addition to cyanides [102,103] or by heating solutions of formamide [104,105].
It should be noted that although these studies demonstrate that biomolecules can be synthesized abiotically in hydrothermal solutions, the relevance of the experimental conditions to those of natural hydrothermal systems has been questioned (see, for instance, [106–109]). Most of these experiments are performed with highly reactive reactants such as HCN and formamide that are present in concentrations many orders of magnitude higher than could reasonably be expected to occur in natural environments. When experiments are performed with more realistic concentrations of reactants, biomolecules are not produced in detectable quantities. Furthermore, the biomolecules produced in these experiments rapidly decompose during continued heating, and they are only observed in experiments with short reaction times. Thus, the results may only be relevant to natural systems where fluids have very short residence times.
(b) Field evidence of abiotic organic synthesis
Organic compounds with a probable abiotic origin have been identified in a number of geologic fluids, with settings that include seafloor hydrothermal systems, fracture networks in crystalline rocks within continental and oceanic crust, volcanic fumaroles and hot springs and fluids discharged from serpentinized ultramafic rocks [87,89,110–113]. For the most part, these compounds are limited to methane and light hydrocarbons, although formic acid has also been identified as a possible abiotic component in fluids discharged from serpentinites on the seafloor. Some authors have speculated that more complex organic compounds found in fluids discharged from serpentinites may have an abiotic origin [114,115], but there are few data supporting this claim and their relative abundance does not appear to be consistent with the in situ light hydrocarbons in these systems for which there is considerable evidence of an abiotic source. To date, there have been no reports from modern terrestrial systems of biomolecules such as amino acids, sugars and nucleobases that are thought to have an abiotic origin.
The source of the abiotic hydrocarbons observed in geologic fluids is generally believed to be reduction of dissolved or gaseous CO2, with H2 serving as a reductant. Formation of these compounds apparently occurs through mineral-catalysed FTT reactions. In the case of ultramafic-hosted deep-sea hydrothermal systems, the highly elevated H2 concentrations resulting from fluid–rock interactions in the subsurface provide conditions in which the reduction of CO2 to methane and organic compounds is strongly favoured by thermodynamics [5,23,116]. However, the abundance of hydrocarbons measured in the fluids remains well below the amounts predicted for thermodynamic equilibrium, indicating that the reduction reactions are kinetically limited and proceed only partially towards equilibrium.
In contrast to terrestrial hydrothermal systems, numerous biologically relevant organic compounds with an abiotic origin have been identified in meteorites that have been exposed to varying degrees of aqueous alteration [117–119]. This suite of identified organic compounds includes a variety of amino acids, saccharides and nucleobases. While some of the organic matter found in meteorites appears to have been accreted during formation of the meteorite parent body, many of the biomolecules are thought to have formed as a result of transient hydrothermal events occurring on the parent body.
(c) Thermodynamic evaluations of abiotic organic synthesis
Although experimental and field evidence for abiotic organic synthesis under hydrothermal conditions remains limited to a relatively small number of compounds, the potential for abiotic organic synthesis is striking. Inevitable mixing of cold, oxidized seawater with hot, reduced hydrothermal fluid in submarine vent environments can provide both the energy and the reactant molecules for an array of redox reactions. The naturally established redox disequilibria are required for organic synthesis from inorganic reactants, and they provide a source of catabolic energy for resident microorganisms. In fact, the first calculations of reaction energetics in a hydrothermal vent ecosystem focused on potential catabolic processes. Values of Gibbs energy (ΔGr) for 10 inorganic redox reactions were evaluated for a scenario where seawater mixes with vent fluid from EPR 21°N OBS [121,122]. McCollom & Shock showed that aerobic respiration is thermodynamically favourable at low to moderate temperatures (T < 40°C), but anaerobic catabolisms, including methanogenesis, are the most exergonic options at higher temperatures. Note, however, that this switch in thermodynamically favoured metabolisms is due more to differences in the chemistry of low- and high-temperature mixed fluids than to differences in temperature per se.
It was soon recognized that if complete reduction of CO2 to CH4 (i.e. methanogenesis) was energy yielding, then perhaps incomplete reduction of CO2 to organic compounds would be as well (figure 1). This schematic conceptualizes the electron transfer processes from CO2 to reduced carbon, with abiotic organic synthesis on the top and biomass synthesis on the bottom; the change in carbon oxidation state during catabolic CH4 generation by methanogenic archaea is also depicted. The upper half in figure 1 represents the formation of relatively simple, metastable organic compounds by purely (geo)chemical means, without the advantage of biochemical pathways or enzymes. The lower half represents the enzyme-catalysed intracellular synthesis of biomolecules. This figure is intended to illustrate that if complete reduction of CO2 to CH4 is exergonic, then partial reduction of CO2 to organic compounds and biomass, whether inside or outside the cell, can also be exergonic. We should, perhaps, reiterate here that this thermodynamic argument does not inform on the rates of abiotic organic synthesis reactions, including the relative rates of partial versus complete reduction of CO2.
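The electron bookkeeping behind this argument can be sketched with a short Python calculation of the average oxidation state of carbon. The snippet assumes the usual formal oxidation states (H = +1, O = −2) and C-H-O species only; the function names and the particular list of compounds are illustrative choices, not taken from figure 1.

```python
def carbon_oxidation_state(n_c, n_h, n_o, charge=0):
    """Average formal oxidation state of carbon in a C-H-O species.

    Oxidation states must sum to the net charge:
    n_c * Z_C + n_h * (+1) + n_o * (-2) = charge.
    """
    return (charge - n_h + 2 * n_o) / n_c

def h2_per_carbon_from_co2(z_c):
    """Moles of H2 (two electrons each) needed per carbon to reduce CO2 (Z_C = +4)."""
    return (4 - z_c) / 2.0

species = {
    "CO2": (1, 0, 2),
    "formic acid": (1, 2, 2),
    "acetic acid": (2, 4, 2),
    "ethanol": (2, 6, 1),
    "CH4": (1, 4, 0),
}

# Map each species to (average carbon oxidation state, H2 needed per carbon).
table = {}
for name, (n_c, n_h, n_o) in species.items():
    z_c = carbon_oxidation_state(n_c, n_h, n_o)
    table[name] = (z_c, h2_per_carbon_from_co2(z_c))
```

Methane sits at the reduced extreme (−4, four H2 per carbon), whereas the organic compounds occupy intermediate oxidation states and require fewer electrons per carbon. As the text notes, this bookkeeping says nothing about reaction rates.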
Using essentially the same geochemical framework (mixing seawater with hydrothermal vent fluid), the thermodynamic potentials for abiotic organic synthesis were evaluated for the present-day and the early Earth [42,43,46,123]. The modelling considered CO2/HCO3− reduction with H2 to generate low molecular weight hydrocarbons, alcohols, ketones, aldehydes, carboxylic acids and simple biomolecules. Based on results from earlier thermodynamic models of natural systems, high kinetic barriers were assumed for the formation of light alkanes (including methane), graphite and aromatic compounds, preventing their synthesis. It was shown that at strongly reducing conditions and corresponding high levels of H2, an aqueous mixture of inorganic and organic carbon is thermodynamically more favourable than a solution where all the carbon is CO2/HCO3−. In other words, the synthesis of carboxylic acids and other low molecular weight organic compounds contributes to the Gibbs energy minimization of the overall system. The specifics of which organic compounds are energetically most likely to form, and how much of the total carbon would be organic if equilibrium levels are approached, depend on the temperature, pressure and chemical composition of the system. Key controlling factors are the mineralogy of the host rock and the concentration of H2 in equilibrium with that mineralogy [24,46,116]. At equilibrium, assuming an initial oxygen fugacity (fO2) in the vent fluid set by the pyrite–pyrrhotite–magnetite (PPM) mineral assemblage [124,125], up to approximately 5 per cent of the total carbon would reside in simple organic compounds. The most favourable conditions for thermodynamically stable organic compounds would be at approximately 100–150°C, with acetic acid/acetate dominating the distribution of organic species.
Amino acid synthesis can also be exergonic in hydrothermal systems if the appropriate reaction pathways are open. Modelling demonstrated that at 100°C and 250 bar in a mechanical mixture of seawater and vent fluid (again, based on EPR 21°N), the synthesis of 11 of the 20 amino acids from CO2 and inorganic N and S sources is exergonic. It was further shown that the sum synthesis of all the requisite amino acids for thermophilic proteins (i.e. the primary structure) yielded, rather than consumed, chemical energy: up to 8 kJ mol−1 protein. Furthermore, the formation of amino acids and other organic compounds may result in surplus energy that can perhaps be shunted to the intracellular pathways for making proteins, lipids, nucleic acids and other biopolymers. At a minimum, the favourable energetics in hydrothermal systems may significantly reduce the energetic costs of biomass synthesis.
The discovery and geochemical analysis of a growing number of deep-sea hydrothermal systems spurred thermodynamic modelling to generate an ever more comprehensive picture of abiotic organic synthesis. More than 150 active submarine vents are now known from diverse tectonic settings along the ca 60 000 km-long ocean ridges and in back-arc basins. Most of these systems are insufficiently characterized to permit robust energy modelling, but seven hydrothermal systems were recently investigated for their potential in abiotic organic synthesis. The vent fluids at Rainbow and TAG (Mid-Atlantic Ridge), Kairei (Central Indian Ridge), Endeavour (Juan de Fuca Ridge), 9°N (East Pacific Rise), and the basins at Guaymas (Gulf of California) and Lau (southwest Pacific) are hosted in ultramafic, basalt or andesite rock, and discharging fluids range broadly in pH (2–6) and levels of dissolved hydrogen (0.37–16 mM), total sulfide (1.2–59.8 mM), ammonia (0.01–15.6 mM) and methane (0.01–63.4 mM), among other parameters.
These large differences in hydrothermal fluid compositions translate to large differences in Gibbs energies of organic synthesis reactions. The thermodynamics were consistently most favourable at the peridotite-hosted Rainbow site and the basalt-hosted Guaymas and Kairei sites, where vent fluids have the highest H2 concentrations. In fact, the formation from CO2 of ethane, ethene, ethanol, acetaldehyde, several carboxylic acids and even some amino acids is generally exergonic at Rainbow, Guaymas and Kairei over a wide temperature range, from approximately 10°C to 200°C. At TAG, Lau Basin and Endeavour, where H2 concentrations are relatively low (0.37–0.62 mM), organic synthesis is endergonic (ΔGr > 0) for most reactions and temperature ranges investigated.
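The sensitivity to H2 follows directly from the activity term in ΔGr = ΔG° + RT ln Q. As a rough sketch, the Python snippet below estimates how much the Gibbs energy of acetic acid synthesis (2CO2 + 4H2 → CH3COOH + 2H2O) changes when dissolved H2 rises from the lowest to the highest concentration in the range quoted above, with everything else held fixed. It assumes that the H2 activity scales with concentration (the same activity coefficient in both fluids) and uses an illustrative temperature of 100°C; the function name is hypothetical, and the calculation is not a substitute for the full speciation models in the cited studies.

```python
import math

R = 8.314462618  # gas constant, J mol^-1 K^-1

def gibbs_shift_from_h2(n_h2, temp_k, h2_low, h2_high):
    """Change in dGr (J/mol) when H2 rises from h2_low to h2_high, all else fixed.

    H2 is a reactant with coefficient n_h2, so it enters ln Q as -n_h2 * ln a(H2).
    """
    return -n_h2 * R * temp_k * math.log(h2_high / h2_low)

# 2 CO2 + 4 H2 -> CH3COOH + 2 H2O consumes four H2 per acetic acid.
shift = gibbs_shift_from_h2(n_h2=4, temp_k=373.15, h2_low=0.37e-3, h2_high=16e-3)
shift_kj = shift / 1000.0  # roughly -47 kJ per mole of acetic acid
```

A shift of several tens of kJ mol−1 is large enough to move a reaction between endergonic and exergonic, which is consistent with the contrast between H2-poor and H2-rich sites described above.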
(i) Early Earth scenarios
The examples reviewed above considered present-day hydrothermal systems, but analogous modelling studies for the early Earth revealed even more favourable thermodynamic conditions for abiotic organic synthesis. Clearly, it is difficult to tightly constrain the composition of fluids in these cases, but reasonable assumptions can be made, especially for temperature, pH, major element chemistry and redox states of seawater and vent fluids on the early Earth. The ocean chemistry of Hadean Earth is often modelled after modern seawater, but with little, if any, dissolved oxygen. Current views on the chemistry of the Hadean atmosphere also help constrain the composition of Hadean seawater. Despite some vocal opposition [127,128], it is widely accepted that the atmosphere on the early Earth was predominantly CO2 and N2, with traces of H2, CO, CH4, NH3, H2O and reduced sulfurous gases, but essentially no free O2 [129–134]. The high CO2, in particular, would have caused warm global ocean temperatures and moderately acid pH. Because deep-sea hydrothermal fluid compositions are tightly controlled by fluid–rock interactions, and the geologic record shows that volcanic rocks from the early Earth had essentially the same bulk composition as those erupting today, the corresponding hydrothermal fluids would likely have had much the same range of compositions that we observe today.
Shock & Schulte considered a range of redox states in their modelling of organic synthesis in early Earth hydrothermal systems. They showed that a combination of inorganic and organic compounds is thermodynamically more favourable than a system with only inorganic carbon. As in the present-day example mentioned earlier, carboxylic acids dominate, but they peak at different temperatures (50–200°C), depending on the initial fO2 of the model hydrothermal fluid. It was calculated that carboxylic acids could account for up to approximately 45 per cent of the total carbon if the early Earth hydrothermal fluid is initially in redox equilibrium with the PPM mineral assemblage. This number reaches more than 90 per cent with hydrothermal fluids initially in redox equilibrium with fayalite–magnetite–quartz (FMQ). In fact, 100 per cent of the total carbon could be tied up in a combination of various simple aqueous organic compounds, if the initial redox state is reducing enough.
The oxidation states of submarine hydrothermal vent fluids on the early Earth likely would have varied considerably from those in present-day systems hosted in mid-ocean ridge basalt. The calculated redistribution of carbon into organic compounds, which is largely controlled by the oxidation state of the hydrothermal fluid, is illustrated in figure 2 for the early Earth. Calculations depicted in this figure, recasting results from Shock & Schulte, reveal the dependence of abiotic organic synthesis on the oxidation state of the fluid, as tracked by the redox parameter fO2. All of the results shown are for 100°C, which is reached by mixing hot vent fluids (initially at 350°C) with cold seawater (at 2°C). By combining results from models where the oxidation state of the vent fluids varies from those set at the high end by the PPM mineral assemblage to one log fO2 unit below the oxidation state set by the FMQ mineral assemblage at the low end, it is possible to track the effects of redox variations in 100°C mixtures. It can be seen that CO2/HCO3− are dominant at the more oxidized conditions towards the right-hand side of figure 2a, but that they are replaced in metastable equilibrium states by organic acids, alcohols, ketones and alkenes as the fluid mixtures become more reduced. This trend is consistent with a shift to higher H : C ratios and lower average oxidation states of carbon in the organic compounds. This shift is reflected in figure 2b, which shows the distribution of the organic acids that are summed in the single curve labelled ‘organic acids’ in figure 2a. Note that acetic acid dominates the most oxidized mixtures, followed by propanoic acid in somewhat more reduced conditions, and that the entire distribution of compounds flips over so that dodecanoic acid becomes dominant in the most reduced mixtures. The variability in fO2 values attainable in these examples of 100°C mixtures reflects the composition of the rocks hosting the submarine hydrothermal fluids.
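For orientation, the proportion of vent fluid in a 100°C mixture can be estimated with a simple conservative-mixing calculation. The Python sketch below assumes that both fluids have the same, temperature-independent heat capacity and that no heat is lost to the surroundings. That is a simplification, and the cited models may treat mixing differently. The function names are hypothetical.

```python
def vent_fraction(t_mix, t_seawater, t_vent):
    """Mass fraction of vent fluid in a mixture at t_mix (same units for all three).

    Assumes equal, constant heat capacities for both end-member fluids.
    """
    if not min(t_seawater, t_vent) <= t_mix <= max(t_seawater, t_vent):
        raise ValueError("mixture temperature must lie between the end-members")
    return (t_mix - t_seawater) / (t_vent - t_seawater)

def mixed_concentration(f_vent, c_vent, c_seawater):
    """Concentration of a conservative (non-reacting) solute in the mixture."""
    return f_vent * c_vent + (1.0 - f_vent) * c_seawater

f = vent_fraction(t_mix=100.0, t_seawater=2.0, t_vent=350.0)   # about 0.28
seawater_per_vent = (1.0 - f) / f                              # about 2.6

# A solute present only in the vent fluid (arbitrary units) is diluted to f.
diluted = mixed_concentration(f, c_vent=1.0, c_seawater=0.0)
```

Under these assumptions, a 100°C mixture contains roughly 2.6 parts seawater for each part vent fluid, so vent-derived reductants such as H2 are diluted to a little over a quarter of their end-member concentrations before any reaction is considered.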
The synthesis and stability of amino acids, nucleobases, ribose and deoxyribose have also been considered for the early Earth scenarios [123,135]. Redox disequilibria established from mechanical mixing of model Hadean seawater and hydrothermal fluid provide a substantial energetic drive for abiotic amino acid synthesis. As one example, thermodynamic calculations showed that at 250 bar and 100–200°C, the formation of most of the 20 protein-forming amino acids from CO2 may be exergonic . Owing mostly to the major geochemical differences in seawater, the energetics are far more favourable in the Hadean Earth scenario than in the present-day analogue. It also has been suggested that the reduction of CO (rather than CO2) may have represented the first steps in the emergence of chemolithoautotrophic life [136,137]. Energy considerations show, however, that amino acid synthesis from CO, compared with that from CO2, is less favourable but still exergonic. Note that this is due in part to the two-electron difference in the carbon oxidation state between CO and CO2.
4. Organic synthesis inside the cell
In an early attempt to quantify the energetic costs of anabolism, Morowitz  divided cellular biomass into its constituent monomers—amino acids, nucleotides, fatty acids, saccharides and amines. This approach was later adopted and adapted to analyse energy flow for biomass synthesis in heterotrophic [139–141] and chemoautotrophic [23–25] microorganisms; the latter is reviewed here. Gibbs energies are calculated for the formation of biomonomers from inorganic starting materials (HCO3−, NH4+, HPO42−, HS−, H+ and H2). The extracellular concentrations of the reactants reflect the geochemical system of interest, and the intracellular concentrations of the biomonomers are those in the model bacterium Escherichia coli.
In several studies, the energetics were evaluated at specific oxic (actually microoxic) and anoxic conditions, as well as at conditions in present-day and early Earth hydrothermal systems [23–25]. For the microoxic example, the Eh was set at 0.77 V, equivalent to an oxygen concentration of 0.1 per cent air saturation (which is less than 1 µM, a concentration that is representative for an ecosystem inhabited by microaerophiles); for the anoxic example, the Eh was set at −0.27 V (typical of ecosystems inhabited by methanogens and other strict anaerobes). It was shown that the energy requirements for the autotrophic synthesis of all the biomass monomers are approximately 13 times greater under microoxic than anoxic conditions, approximately 18 400 J g−1 compared with approximately 1400 J g−1 of dry cellular biomass. When the N and S sources were NO3− and SO42− (instead of NH4+ and HS−), the energetic cost under microoxic conditions was higher still, approximately 21 600 J g−1 cells or 15 times that under anoxic conditions. This is consistent with the previously recognized higher biomass yield per unit energy in anaerobic autotrophs compared with their aerobic counterparts .
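The ratios quoted above follow directly from the three cost figures. As a minimal arithmetic check (the variable and function names are illustrative and not taken from the cited studies):

```python
# Energetic costs of autotrophic monomer synthesis as quoted in the text,
# in J per gram of dry cellular biomass.
COST_MICROOXIC_NH4_HS = 18_400   # microoxic; N and S supplied as NH4+ and HS-
COST_MICROOXIC_NO3_SO4 = 21_600  # microoxic; N and S supplied as NO3- and SO4^2-
COST_ANOXIC = 1_400              # anoxic


def cost_ratio(cost: float, reference: float) -> float:
    """Return how many times larger `cost` is than `reference`."""
    return cost / reference


ratio_reduced_sources = cost_ratio(COST_MICROOXIC_NH4_HS, COST_ANOXIC)
ratio_oxidized_sources = cost_ratio(COST_MICROOXIC_NO3_SO4, COST_ANOXIC)

print(f"microoxic/anoxic (NH4+, HS-):     {ratio_reduced_sources:.1f}")
print(f"microoxic/anoxic (NO3-, SO4^2-):  {ratio_oxidized_sources:.1f}")
```

Rounded to whole numbers, these ratios correspond to the factors of approximately 13 and 15 given in the text.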
The energetics of biomass synthesis were also calculated in 12 deep-sea hydrothermal systems. In computer models, seawater was mixed with vent fluids in basalt (Edmond, Endeavor, EPR 21°N, Lucky Strike, TAG, Menez Gwen), peridotite (Rainbow, Logatchev, Lost City), felsic rock (Brothers, Mariner) and a troctolite–basalt hybrid (Kairei). The key geochemical parameters varied widely: pH (2.7–9), H2 (0.04–16 millimolal, mm), H2S (0.1–9.7 mm), NH4+ (0.1–503 µm) and CH4 (0.007–2.5 mm), among others. Consequently, the energetics of biomonomer synthesis differed markedly among the systems. Predominantly because of the high H2 levels, the formation of biomass yielded the most energy in the peridotite and troctolite–basalt hybrid systems, up to approximately 900 J g−1 dry cell mass. As noted , this energy yield may lessen the overall ATP requirement in growing cells, or allow ‘surplus’ ATP to be diverted to drive other, endergonic biomass synthesis reactions. In the basalt-hosted and felsic rock-hosted systems, the energetics were far less favourable even at the optimum conditions considered, with values ranging from −400 to +275 J g−1 dry cell mass.
The effect of temperature on biomass synthesis energetics was also investigated. Note that in the modelling, temperature is coupled to fluid chemistry; both are direct functions of mechanical mixing of high-temperature hydrothermal fluid of one composition with low-temperature seawater of another. It was shown that the Gibbs energies for the formation of total cell biomass as a function of temperature and seawater : hydrothermal fluid (SW : HF) mixing ratio minimize between approximately 10°C and 50°C and a SW : HF ratio of approximately 50–5 (figure 3). In seven of the 12 systems investigated (Rainbow, Logatchev, Kairei, Lost City, Endeavor, EPR 21°N, Lucky Strike), this minimum is at ΔGr < 0, indicating that the synthesis of cellular biomonomers is exergonic at these conditions. It should be emphasized that these calculations consider only the net energetics of reaction from inorganic compounds to biomonomers; other possible energy costs are not included so that the total anabolic process may well have a positive Gibbs energy. It should also be noted that the energetics differed demonstrably among the different biomolecule families (data not shown). Amino acid and fatty acid synthesis reactions were generally the most favourable and exergonic. The formation of amines, saccharides (both with ΔGr ≈ 0 J g−1 dry cell mass) and nucleotides (ΔGr > 0) is energetically much less favourable. In fact, nucleotide synthesis was endergonic in each system and at all conditions considered, perhaps reflecting the structural complexity (e.g. double-bonded carbon–nitrogen rings) and the relatively high carbon redox state (figure 1) of these compounds.
(a) From biomonomers to biopolymers
Sections 2 and 3 summarize the substantial progress that has been made in determining how chemical and physical variables affect the formation energetics of organic compounds and relatively simple biomolecules. However, microorganisms are largely composed of biomacromolecules, such as RNA, DNA, proteins, lipids and polysaccharides—these are polymeric versions of the monomers discussed earlier. Owing to the scarcity of thermodynamic data, the energetics of biomacromolecule polymerization have not received the same level of attention as their constituent monomers. Recent advances in theoretical biogeochemistry have narrowed this gap, however, and these advances are used here to compute the energetics of amino acid polymerization into proteins.
As with the condensation of nucleotides to nucleic acids and monosaccharides to polysaccharides, the polymerization of amino acids into polypeptides is a dehydration reaction that can be written as

$$\mathrm{AA}_i + \mathrm{P}_n \rightarrow \mathrm{P}_{n+1} + \mathrm{H_2O}, \qquad (4.1)$$

where AAi represents any of the 20 protein-forming amino acids and Pn denotes a (poly)peptide consisting of n amino acid residues. The Gibbs energy of amino acid polymerization, ΔGr, similar to that of any other chemical reaction, is a function of temperature, pressure and composition, and it can be quantified using

$$\Delta G_r = \Delta G_r^\circ + RT \ln Q, \qquad (4.2)$$

where $\Delta G_r^\circ$ stands for the standard state Gibbs energy of reaction, R and T refer to the gas constant and temperature (in kelvin), respectively, and Q denotes the reaction quotient, which is defined later. Values of $\Delta G_r^\circ$ can be estimated at a given temperature and pressure using recently developed group contribution approaches [59,74].
The group contribution algorithm used to calculate values of $\Delta G_r^\circ$ can be represented as

$$\Delta G_r^\circ = \Delta G^\circ_{[\mathrm{PBB}]} + \Delta G^\circ_{\mathrm{H_2O}} - \Delta G^\circ_{[\mathrm{AABB}]}, \qquad (4.3)$$

where $\Delta G^\circ_{[\mathrm{PBB}]}$ and $\Delta G^\circ_{[\mathrm{AABB}]}$ refer to standard state Gibbs energies of formation of the protein backbone and amino acid backbone groups, respectively, which have been determined in the aqueous [61,74] and crystalline states , and $\Delta G^\circ_{\mathrm{H_2O}}$ is that of water. Values of $\Delta G_r^\circ$ are shown in figure 4 from 0°C to 150°C at 500 bar (50 MPa) and at pressures corresponding to the liquid–vapour saturation pressure for water, PSAT. It can be seen in figure 4 that at both pressures, $\Delta G_r^\circ$ of amino acid polymerization increases with increasing temperature, reaching maxima at approximately 120–150°C; with a further increase in temperature, $\Delta G_r^\circ$ decreases (data not shown). It is worth noting that the pressure effect (independently of temperature) accounts for only approximately 0.5 kJ per mole of peptide bond formed, and the temperature effect from 0°C to 150°C (independently of pressure) accounts for up to 1.5 kJ per mole of peptide bond. It should also be pointed out that the polymerization energetics for these calculations are taken to be independent of the amino acid identity. This may be an oversimplification [74,77], but too few data are available in the literature to fully represent this process for all of the 20 common amino acids.
It cannot be overemphasized that the values in figure 4 refer only to standard state conditions. In order to take into account the effects of intracellular amino acid and peptide concentrations on the Gibbs energy of polymerization, values of Q in equation (4.2) must be calculated. This can be done with

$$Q = \frac{a_{\mathrm{P}}}{\prod_i a_{\mathrm{AA}_i}^{\nu_i}}, \qquad (4.4)$$

where $a_{\mathrm{P}}$ refers to the activity of the polypeptide of interest, $a_{\mathrm{AA}_i}$ designates the activity of the ith amino acid and νi indicates the stoichiometric coefficient of the ith amino acid. Activity is related to concentration, C, through individual activity coefficients, γ, consistent with

$$a = \gamma C. \qquad (4.5)$$

The curves in figure 5 illustrate the effect of concentration on the energetics of polypeptide polymerization as a function of temperature for the formation of a hypothetical protein consisting of the median number of amino acid residues (278) in E. coli protein (table 1). Values of ΔGr were calculated with equations (4.2)–(4.4) for several total amino acid concentrations from 0.01 to 100 mM, including the average concentration of a given amino acid in E. coli cytosol, 6.5 mM (table 2). The concentration of the hypothetical E. coli peptide (P278; table 2) was 8.7 µM, and values of activity coefficients (γ) for all aqueous species were taken to be unity. Not surprisingly, more energy is needed to polymerize a mole of this E. coli protein for lower amino acid concentrations than for higher values (figure 5). In fact, for every order of magnitude increase in amino acid concentration, there is a corresponding decrease of about 1.5 MJ required to polymerize a mole of this protein. Additionally, the effect of temperature on the energetics of protein formation is strongest at the lowest concentration, increasing from approximately 15 MJ (mol protein)−1 at 0°C to approximately 19 MJ (mol protein)−1 at 150°C.
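The size of the concentration effect can be rationalized from the RT ln Q term alone. The sketch below assumes that the reaction quotient for forming the whole protein has the amino acid activities in the denominator, each raised to its stoichiometric coefficient, that these coefficients sum to the 278 residues of the hypothetical protein, and that activity coefficients are unity; the function name is illustrative.

```python
import math

R = 8.314462618  # gas constant, J mol^-1 K^-1


def delta_g_shift_per_decade(n_residues: int, temp_c: float) -> float:
    """Change in the RT ln Q term (J per mol protein) when every amino acid
    activity increases tenfold at fixed polypeptide activity.

    Assumes Q = a_P / prod(a_AA_i ** nu_i) with sum(nu_i) == n_residues,
    so a tenfold increase in all a_AA_i lowers ln Q by n_residues * ln(10).
    """
    temp_k = temp_c + 273.15
    return -n_residues * R * temp_k * math.log(10.0)


shift_0c = delta_g_shift_per_decade(278, 0.0)
shift_25c = delta_g_shift_per_decade(278, 25.0)

print(f"0 degC:  {shift_0c / 1e6:.2f} MJ (mol protein)^-1 per decade")
print(f"25 degC: {shift_25c / 1e6:.2f} MJ (mol protein)^-1 per decade")
```

Under these assumptions, the shift at low temperatures is of the same order as the roughly 1.5 MJ per order of magnitude noted in the text, and its magnitude grows with temperature because it scales with T.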
In a similar set of calculations, the Gibbs energies of polymerizing proteins of different lengths were also evaluated (figure 6). Here, the curves represent the energy required to polymerize a mole of median-length protein for three organisms, Pyrobaculum aerophilum, Bifidobacterium longum and Homo sapiens. The first two organisms have, respectively, the shortest and longest median protein lengths of the 83 prokaryotes given in Brocchieri & Karlin ; Homo sapiens was included because it is the only species that can read this. It is noteworthy that over the temperature range considered here (0–150°C), the median protein in B. longum requires over 50 per cent more energy to polymerize than the median protein in P. aerophilum. The average concentrations of amino acids and polypeptides in E. coli were used along with equations (4.2)–(4.4) to generate the curves in figure 6. Again, activity coefficients were taken to be unity.
To put these numbers into perspective, the Gibbs energy required to make all the peptide bonds in 1 g of dry E. coli is 0.191 kJ at 25°C, 1 bar, [AA] = 6.5 mM and [P278] = 8.7 µM (table 2), while ΔGr for the de novo synthesis of all the amino acids in 1 g of dry E. coli was estimated at 11.4 kJ (dry gram)−1 and 0.69 kJ (dry gram)−1 under microoxic and anoxic conditions, respectively . That is, the energy required to form all the peptide bonds in a given mass of E. coli cells in an anoxic environment is nearly 28 per cent of the energy required to synthesize all the amino acids in those proteins from inorganic constituents, such as HCO3−, NH4+ and HS−. Because the energy required to make these amino acids under microoxic conditions is more than an order of magnitude larger than that under anoxic conditions, the energy of polymerization in this instance is a rather trivial component of the total energy required to make proteins from inorganic precursors. This may seem at odds with the commonly held notion that polymerization of amino acids into peptides requires substantially more ATP than the de novo synthesis of amino acids . Note, however, that the results in the current study compare the total energy required to synthesize amino acids from inorganic precursors to the subsequent polymerization of those amino acids. That is, the reactions describing the intracellular conversion of CO2 (and other reactants) into amino acids are commonly endergonic, requiring reducing power (e.g. NADPH) that is, in turn, generated from the oxidation of electron donors found in the environment. The energy associated with this transfer of electrons is accounted for in the present communication, but not in the ATP budget for microbial growth given by Stouthamer .
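The comparison above reduces to two simple ratios. A short check using only the values quoted in the text (variable names are illustrative):

```python
# Values quoted in the text, all in kJ per gram of dry E. coli.
PEPTIDE_BOND_ENERGY = 0.191  # forming all peptide bonds (25 degC, 1 bar)
AA_SYNTH_MICROOXIC = 11.4    # de novo amino acid synthesis, microoxic
AA_SYNTH_ANOXIC = 0.69       # de novo amino acid synthesis, anoxic

# Polymerization energy as a fraction of amino acid synthesis energy.
frac_anoxic = PEPTIDE_BOND_ENERGY / AA_SYNTH_ANOXIC
frac_microoxic = PEPTIDE_BOND_ENERGY / AA_SYNTH_MICROOXIC

# How much more costly amino acid synthesis is under microoxic conditions.
microoxic_over_anoxic = AA_SYNTH_MICROOXIC / AA_SYNTH_ANOXIC

print(f"polymerization / synthesis (anoxic):    {frac_anoxic:.1%}")
print(f"polymerization / synthesis (microoxic): {frac_microoxic:.1%}")
print(f"microoxic / anoxic synthesis cost:      {microoxic_over_anoxic:.1f}")
```

The anoxic fraction is just under 28 per cent, whereas the microoxic fraction is below 2 per cent, which is why polymerization is a minor term in the microoxic case.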
Up to this point, the energetics of amino acid synthesis have been reviewed and a procedure for calculating the energetics of amino acid polymerization into polypeptides has been demonstrated. However, in order to quantify the energetics of forming many biomacromolecules, one also must account for the energy associated with folding macromolecules into functional forms. For instance, most proteins must assume a particular three-dimensional conformation in order to perform their various structural and enzymatic roles. Below, we combine calorimetric data and the information in tables 1 and 2 to estimate the energy associated with protein folding in E. coli.
Based on compiled calorimetric data , the average standard state Gibbs energy of folding for 11 polypeptides at 25°C and 1 bar is −0.34 kJ (mol amino acid residue)−1, with a standard deviation of 0.097 kJ (mol amino acid residue)−1. For a protein of 278 amino acid residues (the median length in E. coli), this corresponds to −94 kJ (mol protein)−1. Using the law of mass action,

$$\Delta G_r^\circ = -RT \ln K \qquad (4.6)$$

and

$$K = \frac{a_f}{a_u}, \qquad (4.7)$$

this corresponds to an equilibrium activity ratio of folded (af) to unfolded proteins (au) equal to 2.94 × 10^16 (activity coefficients were taken to be unity). Given the aqueous volume in an E. coli cell (table 1), the concentration of a newly synthesized, yet unfolded, protein in E. coli is 2.37 × 10−9 mol l−1. Taking into account that the average concentration of the median protein in E. coli is 8.7 × 10−6 mol l−1, the Gibbs energy of folding a mole of the median protein in E. coli is −74 kJ (mol protein)−1 ±21 kJ (using standard deviation data calculated from data in Privalov & Gill ) where Q = af/au = 3670. This is an exergonic process; hence, protein folding should be spontaneous at 25°C. As a side note, using data from Privalov & Gill , the standard state Gibbs energy of protein folding at 110°C and 1 bar for an E. coli protein of median length is +278 kJ (mol protein)−1. That is, this hypothetical protein is not predicted to be folded under these conditions.
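The folding numbers above can be reproduced from the law of mass action. The following sketch assumes the standard relations ΔG° = −RT ln K and ΔG = ΔG° + RT ln Q with Q = af/au, unit activity coefficients, and the rounded standard state value of −94 kJ (mol protein)−1 quoted in the text; variable names are illustrative.

```python
import math

R = 8.314462618e-3   # gas constant, kJ mol^-1 K^-1
T = 298.15           # 25 degC in kelvin

# Standard state Gibbs energy of folding for the 278-residue protein,
# using the rounded value quoted in the text (-0.34 kJ per residue x 278).
DG0_FOLD = -94.0     # kJ (mol protein)^-1

# Concentrations quoted in the text (mol per litre), with activity
# coefficients taken to be unity so that activities equal concentrations.
C_FOLDED = 8.7e-6    # average concentration of the protein in E. coli
C_UNFOLDED = 2.37e-9 # one newly synthesized, unfolded protein per cell

# Equilibrium constant from the law of mass action: dG0 = -RT ln K.
K_fold = math.exp(-DG0_FOLD / (R * T))

# Reaction quotient and the Gibbs energy of folding at cellular conditions.
Q_fold = C_FOLDED / C_UNFOLDED
dg_fold = DG0_FOLD + R * T * math.log(Q_fold)

print(f"K  = {K_fold:.3g}")
print(f"Q  = {Q_fold:.0f}")
print(f"dG = {dg_fold:.1f} kJ (mol protein)^-1")
```

Because Q is many orders of magnitude smaller than K, the Gibbs energy of folding remains negative at these concentrations, consistent with the conclusion that folding is exergonic at 25°C.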
Those who use thermodynamics to evaluate the potential of organic/biomolecule synthesis in natural systems are often confronted with criticism along the lines of: (i) laboratory experiments do not support the predictions; (ii) field evidence does not support the predictions; and (iii) in biology, it is more about kinetics than thermodynamics. Let us be clear: the thermodynamic approach reviewed here informs on what is and is not energetically possible. It relies on the most accurate and internally consistent thermodynamic properties at the appropriate temperatures and pressures, as well as on detailed compositional data for the environment of interest. It serves as a framework within which to interpret experimental data and field evidence, but also to design better experiments and analyse for a wider array of compounds. The thermodynamic approach does not, however, map out what reactions will happen or what compounds will be synthesized in nature. There are certainly kinetic inhibitions to many organic reactions in hydrothermal systems, but there is also abundant evidence that some organic reactions equilibrate rapidly [151–155]. In other words, reactions involving organic compounds in hydrothermal systems are under a combination of kinetic and thermodynamic controls. Kinetic rate laws can be built into these models in the future, but at present, the requisite data are sparse. For now, thermodynamic models provide a means to evaluate which compounds might be in equilibrium and which are not, and they can identify the reactions for which there is an energetic drive at the conditions of interest.
One contribution of 14 to a Discussion Meeting Issue ‘Energy transduction and genome function: an evolutionary synthesis’.
© 2013 The Author(s). Published by the Royal Society. All rights reserved.