## Abstract

This paper explores the ability of molecular evolution to take control of collective physical phases, making the first decisive step from independent replicators towards cell-like collective structures. We develop a physical model of replicating combinatorial molecules in a ternary fluid of hydrocarbons, amphiphiles and water. Such systems are being studied experimentally in various laboratories to approach the synthesis of artificial cells, and are also relevant to the origin of cellular life. The model represents amphiphiles by spins on a lattice (with Ising coupling in the simplest case), coupled to replicating molecules that may diffuse on the lattice and react with each other. The presence of the replicating molecules locally modulates the phases of the complex fluid, and the physical replication process and/or mobility of the replicating molecules is influenced by the local amphiphilic configuration through an energetic coupling. Consequently, the replicators can potentially modify their environment to enhance their own replication. Through this coupling, the system can associate hereditary properties, and the potential for autonomous evolution, to self-assembling mesoscale structures in the complex fluid. This opens a route to analyse the evolution of artificial cells. The models are studied using Monte Carlo simulation, and demonstrate the evolution of phase control. We achieve a unified combinatorial framework for the description of isotropic families of spin-lattice models of complex phases, opening up the physical study of their evolution.

## 1. Introduction

Physical and chemical models of evolving molecular systems, both theoretical and experimental, have provided important insight into the structure of evolution (Eigen *et al*. 1989; Johns & Joyce 2005). To date, however, most of the modelling of evolving chemical systems has been based on chemical reaction kinetics (Eigen *et al*. 1989), sometimes in conjunction with molecular structure as in RNA landscapes (Eigen *et al*. 1989; Fontana *et al*. 1989; Schuster 2006), with spatial effects when considered, studied in terms of reaction–diffusion equations (Boerlijst & Hogeweg 1991; Böddeker 1995; Cronhjort & Blomberg 1997). The interplay of collective self-assembling mesoscale structures (Coveney *et al*. 1996; Mayer & Rasmussen 2000; Nielsen *et al*. 2004) with the evolution of molecules has not yet been investigated in a coherent physical model. The current work aims to bridge this gap and thereby to open up the joint investigation of dynamical self-organization with a single model exhibiting coupled self-assembly and evolution. Models of amphiphilic self-assembly have been much investigated (Gompper & Schick 1991; Elliott *et al*. 2006), not least owing to their relevance to cellular membranes, but although the relations between molecular properties (such as head group size and chain length) and the equilibrium phase diagram are now quite well understood, the evolution of molecules utilizing collective phases based on this relation is not. The evolution of genetic molecules with surfactant properties was proposed to be important for the origin of life (Colgate *et al*. 2003). The physical connection between evolution and self-assembly in complex fluids is the topic here.

Rather than starting with a simulation of a physical abstraction of a particular protocell (Rasmussen *et al*. 2003, 2004), we seek to explore the general class of evolvable multiphase systems in a physically grounded model. This work extends insights from evolving spatial reaction–diffusion systems (Böddeker 1995; McCaskill 1997; McCaskill *et al*. 2001) to such multiphase fluids, where mesoscale structures create mobility barriers and local thermodynamic reaction conditions for replicators. In terms of evolution theory, the question is how different replicators can cooperate to induce appropriate collective phases for their proliferation. The effects of mesoscale structures on molecular evolution are bound to be important because it is now well accepted that spatial localization of chemicals, even in simple reaction–diffusion systems, enables complexity in evolving systems (McCaskill 1997; Altmeyer *et al*. 2004). In particular, the Gray–Scott mechanism of resource-limited cooperative replication was shown (Pearson 1993) to support growing and dividing spot patterns, reminiscent of cells, and these structures actually in turn promote the evolutionary stability of cooperative replication. In any case, the importance of cell-like mesoscale structures for biological evolution need hardly be stressed, so our goal is to reduce the model complexity to the point where the key processes and phenomena can be illuminated by simple, but still grounded, physicochemical models and simulation.

Our model is intended to remain as simple as possible while retaining the essential features of an amphiphilic system under the control of a combinatorial family of replicating molecules. To this end, for the amphiphilic system, we base our model on the simple spin-lattice models introduced by Ising (1925), Widom (1986) and Hansen *et al*. (1991), which we shall however generalize to embed them in the context of a general evolvable family of isotropic models with local interactions. The Ising model will serve as a simplest reference system for highlighting the basic physics entailed in coupling collective physical phases with evolving molecules, towards life-like systems and demonstrating that important issues of collective evolutionary self-organization can be addressed in the simplest models of phase transitions. The Widom case establishes the applicability of these considerations to amphiphile systems relevant to emerging life. We will study our model in two and three dimensions: in two dimension for visual simplicity and computational speed and in three dimension for more physical realism and topological variety.

In addition to the chemically simple amphiphile–hydrocarbon–water system, we assume that a combinatorially complex class of molecules, capable in principle of self-replication, may also be present. These combinatorial molecules are assumed to influence the free energies of local amphiphile configurations, and consistent with this influence, their diffusion on the lattice is kinetically biased by these induced free energy changes. If we include some specific influences of the local phase on the replication or degradation processes for these replicators, we arrive at a rich thermodynamically consistent model of amphiphile phase-dependent and phase-influencing replication that may serve as an elementary physical model for studying the evolutionary transition from chemical to cellular systems. Thermodynamic consistency is important if we are to ensure a physically sound model of information flow in the interplay between self-assembly and catalytic self-organization.

## 2. The evolving self-assembling phases model

In this section, we first introduce the basic homogeneous, spin-lattice thermodynamic models and their extensions to a family of more general isotropic Hamiltonians. We then discuss our choices in formulating kinetics (non-equilibrium treatment) for these models, before describing our main innovation of heterogeneously modulated spin-lattice systems. The local modulation of thermodynamic parameters is attributed to the presence of a family of combinatorial molecules, which are introduced as initially static, then as mobile diffusing and finally as reacting and self-replicating entities.

### (a) The basic Widom spin-lattice ternary model

The underlying amphiphile model we have chosen is motivated by the success of Widom's spin-lattice model and its relatives in modelling the phase diagram of ternary fluids. The Widom model (Widom 1986) reduces the complexity of ternary fluids comprising hydrocarbon, water and amphiphiles to a two-state model, bringing it remarkably close to the simple Ising model, which can also be applied to fluid mixtures.

The Widom model is based on a regular square or cubic lattice of sites with two states (+, −) that represent a collection of either hydrophobic or hydrophilic ends of molecules meeting at the site. In this model, all species are modelled as dimers, straddling two sites of the lattice, with hydrocarbon (oil) molecules (−, −), water (+, +) and amphiphiles (+, −) or (−, +).Owing to their prohibitively large energy, configurations, where mixtures of hydrophobic and hydrophilic ends meet, are excluded. This is reasonable for systems containing amphiphiles, where oil–water interfaces will always be mediated by amphiphiles. Although the model cannot well describe the isolated amphiphiles in water and hence the critical micelle concentration, it has proved effective in reproducing many features of the phase diagram of oil–water–amphiphile emulsions. The model is grand canonical such that both material and energy can be freely exchanged with the environment. In particular, activity coefficients are used to specify the concentration biases of water to oil and amphiphiles to water and oil. For example, changing a local ‘spin’ state may induce a change in the number of amphiphiles in the system.

The equilibrium thermodynamics can be derived from the partition function, which is defined in box 1

*K*when the ends of the two amphiphiles meet at a lattice site, rather than two ends of oil or water dimers, reflecting differences in the interactions of amphiphiles compared with water or oil. Furthermore, the hydrophobic ends of amphiphiles can have a different energy

*K*(1+

*λ*) from the hydrophilic ends

*K*(1−

*λ*). The relative activity coefficients,

*z*

_{1}and

*z*

_{2}, describe the entropic cost in the grand canonical ensemble of number fluctuations away from equal occupation by water versus oil, Δ

*n*

_{+}, and amphiphile versus homophile, Δ

*n*

_{±}, respectively.

### (b) A general evolvable local isotropic Hamiltonian

The Widom model is a particular instance of a local energy model for an amphiphilic system, assigning energies to amphiphilic spin configurations consistent with translational, rotational and reflection symmetries of space. In fact, examining the most general model with nearest neighbour interactions, with a view to later permit the evolution of arbitrary local amphiphilic interactions, we find a reduced number of distinct local spin configurations, subject to spatial symmetries depending on the lattice. Although the situation is simpler on a square 4-neighbour than a hexagonal 6-neighbour lattice in two dimensions, we have used the hexagonal lattice for some of our simulations owing to known artefacts on a square lattice. In three dimensions, the greater number of neighbours appears to allow enough configurational flexibility even for a cubic lattice.

Extensions to allow longer chain amphiphiles, as in the Larson model (Larson 1988, 1989, 1992), which allow further mesoscale structures such as vesicles, as well as for an explicit presentation of amphiphiles (e.g. Schick & Shih 1986), are postponed to subsequent investigations. Extensions to include higher-order spin terms (e.g. four spin) with next to nearest neighbours, allowing the alignment costs for amphiphiles to be considered, while of interest in capturing certain properties of the phase diagram, are clearly in a later round of investigation for evolutionary models. Thus, while the Widom model is known to have certain limitations in describing some aspects of the ternary phase diagram, essentially these simple spin-lattice models are known to explain the main features of the progression of phases in micellar systems with oil, water and amphiphiles (Widom 1986).

In a general isotropic (nearest neighbour) lattice model of this kind, the energies are determined by the interactions between adjacent spins and spin triples, independent of their detailed relative orientation. Allowing spatially symmetric additive contributions to the energy, up to spin triples, with no distinction between different geometric arrangements of triples, the local energy depends on six parameters,(2.1)where is the central spin at the lattice site in three dimensions or in two dimensions and(2.2)

The parameters are positive integer constants depending on the lattice: for the square lattice (6,3,1) and for hexagonal (two-dimensional) and cubic (three-dimensional) lattices (15,5,1).

For the Ising model, the only non-zero values of *η* are *η*_{2} and *η*_{4}, equal to the Ising field strength *H* and the spin coupling constant *J*. In the case of the original Widom model, the parameters *η*_{l} are(2.3)

To illustrate the extended formula, the Widom parameters can be used to calculate the local energy contribution of the central site in the configuration given in figure 1 as follows (hexagonal two-dimensional lattice):(2.4)

The energy contribution comes from only three triples with hydrophobic ends of amphiphiles meeting at site, as expected for the Widom parameter special case.

If we completely differentiate configurations (equating the energies of states identical under rotation and reflection), then we arrive at a somewhat larger number of different energies *ς*_{i} for the general nearest neighbour Hamiltonian (not restricted to lower-order spin terms), depending on the lattice: square (12), hexagonal (28) and cubic (20). The number of parameters is still significantly reduced compared with the number of local configurations (32,128,128) by the physical isotropy requirement. In addition to the interaction energy parameters, the lattice models contain parameters defining the grand canonical ensemble (either activity coefficients or chemical potentials): *z*_{1} for Ising and {*z*_{1}, *z*_{2}} for Widom. While these have also been subject to genetic modulation, we defer this to later investigations.

### (c) Kinetics of the multiphase system

Although the models are amenable to extensive theoretical analysis in equilibrium, using, for example, mean field, self-consistent field and transfer matrix techniques, and this allows a characterization of the phase diagram for the homogeneous case, our primary interest in this paper is in the kinetic perturbation of these equilibrium phases by the combinatorial molecules and so we need to incorporate kinetics.

Arguably, the simplest kinetics consistent with the above partition function is a Monte Carlo (MC) model (Metropolis *et al*. 1953) with single spin flips resulting in a free energy change Δ*F*_{1} being accepted with the probability *P*_{1}=Min(1, e^{−ΔF1/kT}). Such a single spin flip dynamics is equivalent to collectively exchanging all halves of dimeric molecules at a lattice site. In particular, this means, for example, that spin flips can cause amphiphiles and then water to appear in the middle of an oil droplet. This makes some sense for an open system in two dimensions, if we imagine the exchange of molecules with an off-surface reservoir for example, but is less easy to justify in three dimensions, unless we view the system to be in continual turnover in connection with reservoirs of dimeric molecules of the various types. Restricting the dynamics to interchange of neighbouring pairs of spins (Morawietz *et al*. 1992) would conserve the number of hydrophilic and hydrophobic molecule ends, but not the number of amphiphiles. The kinetics with such a semi-conservative dynamics tends to be much less efficient in equilibrating the system, and so we have opted for the more efficient dynamics in this work, while aware of its limitations.

For the single spin flip MC dynamics, the resulting energy difference must be computed for the central site and its nearest neighbours. The change in the number of oil, water and amphiphilic dimers induced by the spin change must be computed only at the central site to determine the entropic cost of the transition associated with the activity coefficients. The resulting free energy differences (box 1) can be computed efficiently. The traditional spin formulation can be replaced by a neighbour spin table lookup, enabling the physical transition probabilities with detailed balance to be considered in the broader context of stochastic cellular automata update rules. In particular, the traditional additive energy contributions from combinations of small numbers of spins can be extended to nonlinear combinations of spins of higher order than the nonlinear triplet terms.

### (d) Combinatorial energetic variation coupled to the multiphase system

Owing to the ephemeral nature of amphiphiles in the Widom model, it is not productive to try to attach combinatorial information about interaction properties to amphiphile configurations. The number fluctuations would result in a rapid loss of this information or require artificial reservoirs to keep track of the dynamics of hidden molecules. Instead, we chose to consider separate combinatorial molecules that modulate the hydrophilic interaction properties (e.g. *K* and *λ* or *z*_{1} and *z*_{2}) at a lattice site, thereby influencing the amphiphile system, but not subject to rapid number fluctuations. This has the additional advantage of separating the time-scale for combinatorial change from the relaxation time of the amphiphile system. In the discussion below, we will also refer to these combinatorial molecules as *templates* or *replicators*. Portions of these molecules encoding parameters that specify the modulation of the amphiphilic interactions (e.g. *K* and *λ*) at a given lattice site are referred to as genes.

A static population of such combinatorial molecules corresponds to a heterogeneous spin-lattice amphiphile system model, with fixed but spatially varying energy (and/or entropy) parameters at each site. In two dimensions, this could correspond to a surface local modulation of hydrophilicity by a solid support. The result is a generalization of the spin-glass random energy model for disordered solids (Anderson 1958), but is not necessarily of special interest in the context of fluid amphiphilic systems. Below, we shall therefore allow the combinatorial molecules to diffuse. Regular gradients of the genetic parameters can be employed in both two and three dimensions to scan a whole range of parameter conditions for interesting local phases. Such fixed multidimensional gradient scans have already been found useful for reaction–diffusion systems such as the Scott–Gray system (Pearson 1993) and were found useful here in tuning the starting parameters of our model. We have set up our model to allow genetic parameters to potentially influence *z*_{1} and/or *z*_{2} and one of *J* for the Ising model, *K* and/or *λ*, in the Widom case, a subset of the six extended Widom parameters *η*_{l} in the isotropic triple spin extension, or the 12, 28 or 20 isotropic site energy parameters *ς*_{i} (for square, hexagonal or cubic lattices, respectively) in the full nearest neighbour spin model. We typically consider a linear range of variation between maximum and minimum values. This is also the range of modulation we shall investigate when allowing template mobility and evolution as described below.

Figure 2 shows two typical gradient images of a simulation of the amphiphile spin-lattice system obtained for the (non-extended) Widom model itself. If we let the simulations run long enough, and make the system large enough, the gradient portrait is equivalent to an equilibrium phase diagram. Note the transitions between homogeneous, micellar, lamellar and liquid crystalline phases which are observable in the model. The micellar interpretation is problematic in the Widom model (they are actually small emulsion droplets), which also cannot exhibit the CMC transition, because it does not allow for isolated amphiphiles. Also, note that the gradients must be very weak to be able to infer accurately, from the local phase at a location in the gradient field, what the phase of the homogeneous amphiphile system with these local parameters would be. This is especially true for gradients in the activity coefficients ln *z*_{1} and ln *z*_{2}, since spin configurations can appear at one location and migrate (by single spin flips) to another. Nonetheless, the gradient simulations not only give a rapid indication of the equilibrium phase behaviour of the system, but also seem to capture some transient phenomena resulting from heterogeneity, which is likely to be relevant in the context of evolution.

### (e) Mobile combinatorial molecules

If we allow the combinatorial molecules to diffuse, then we have a novel mobile-fluid phase model. However, it appears at first sight unavoidable that allowing such diffusion will break the thermodynamic consistency of the amphiphile model. The problem is that diffusion of such molecules induces free energy changes in the amphiphile system through their modulation of its local energetics, and these changes need to be reflected in biasing movement. These free energy changes must be taken into account for diffusive interchange of a pair of neighbouring combinatorial molecules, which we do using a MC weighting as above. This means that for a fixed amphiphile (spin) configuration, the free energy difference, Δ*F*_{2}, on interchanging the combinatorial molecules (thermodynamic parameters) between two sites, which may have either sign, is calculated (including the energetic effects induced at neighbouring sites) and the interchange accepted with probability *P*_{2}=Min(1, e^{−ΔF2/kT}).

The advantage is now that we have a very natural and thermodynamically consistent feedback of amphiphile system information controlling the microscopic mobility properties of a combinatorial family of molecules. This is in fact what is required if we are to move beyond the simple pattern formation properties of reaction–diffusion systems as proposed by Turing (1952) to include the possibility of evolution. In cell biology, this model might also be used to represent the biased diffusion of protein molecules embedded in a lipid bilayer membrane.

### (f) Template chemistry of combinatorial molecules

Chemical reactions among the combinatorial molecules allow a dynamical self-organization of the phase behaviour of the amphiphile system to take place and, in turn, the amphiphile system can modulate the chemical reactions, both through influencing the reaction rates directly via the local phase or indirectly through the influence on template mobility and effective mutual affinities via the induced amphiphile energetics (as above). We wish to include chemical reactions that allow the proliferation and destruction of combinatorial molecules, which we shall then call replicators. Chemically, no complex biochemical apparatus is required for replication, as shown by the known experimental examples of enzyme-free template chemistry such as in von Kiedrowski (1986), although to date only limited combinatorial complexity has been achieved without enzymatic mediation. Non-explicit cooperative effects can occur even in the simple replication mechanism mediated by the amphiphile system.

Simple error-prone replication (*X*→2*X*) of a combinatorial family of molecules X corresponds to the well-studied (Eigen *et al*. 1989) physicochemical model of the quasispecies introduced by Eigen (1971). The cooperative replication mechanism (2*X*→3*X*) is a standard model for the problem of functional exploitation (McCaskill 1997) and its spatial resolution, which is germane to the origin of life and generates proliferating cell-like reaction–diffusion patterns (self-replicating spots; Pearson 1993), which have been shown to allow evolutionary stabilization of the cooperative functionality with a much simpler mechanism than that of higher-order hypercycles (Eigen 1979) via spirals (Boerlijst & Hogeweg 1991). In the current investigation, it is of primary interest to investigate the indirect cooperative effect via the amphiphile phase system without such explicit higher-order catalysis.

Note that detailed mechanisms of templating, e.g. via polymerization from monomers or ligation of fragments, are not considered in this work. In particular, the common (Szathmary & Demeter 1987) chemical replicator model with parabolic growth and coexistence of multiple templates is not investigated here, since we are interested in the evolution of combinatorially more diverse replicator functionality. The phenomenon of coexisting templates itself is also not so interesting, despite much recent attention, being a property of the simpler self-inhibition by double-strand formation (with or without the resource B) as shown in (Biebricher *et al*. 1984), and studied for ligation (Anderson 1983) abstracting Blum's model (Blum 1962).

To connect replication with the amphiphile system of our model, three things are necessary. Firstly, we need to translate the mechanisms into nearest neighbour configurational changes on the lattice. We consider all replicators *X* as existing on the sites of the amphiphile lattice, occupying a lattice site with utmost single occupancy, with default amphiphile energetics being employed in their absence. Replicators *X* modulate the energetic properties of the amphiphile lattice at the site they occupy. When new combinatorial species are created, they are placed at a neighbouring lattice site, overwriting the original site occupant. Secondly, we need to preserve the thermodynamic consistency of the amphiphile system by testing reaction outcomes thermodynamically. Placing a replicated combinatorial molecule at a neighbouring site, affecting some subset of local amphiphile thermodynamic parameters (e.g. *ζ*s and *z*s), induces a free energy change Δ*F*_{3} on the amphiphile system, by the same local thermodynamic contributions described above. As described above, the reaction can be accepted or rejected via the MC probability *P*_{3}=Min(1, e^{−ΔF3/kT}). Thirdly, direct influences of the local phase on the reaction can be included optionally: the reaction takes place only if the local site is hydrophobic or hydrophilic, or only at the interface between like or different phases.

To prevent complete saturation, the chemical destruction of combinatorial molecules is also considered. In this work, all modulators are assumed to be destroyed at a constant rate, but at a rate that may depend on the amphiphile phase. In addition to the possibilities of specific destruction rates in either the hydrophobic (−) or hydrophilic (+) phases, we may also consider the possibility of specific destruction either at an interface (a site with an oppositely signed neighbour) or in bulk (the reverse). Although we have begun exploration of a broader range of reaction behaviours in our model, in this paper we concentrate on the simplest scenario in which not only destruction but also replication rates are not directly sequence independent. This highlights the role of the complex fluid coupling in determining the evolutionary outcome. Antagonistic scenarios where cooperation between replicators has to overcome differential non-cooperating proliferation have been shown elsewhere (McCaskill *et al*. 2001) to be kept in check by spatial correlations in a manner similar to the neutral case adopted here for simplicity. We have also chosen the simplest independent replication mechanism without specific resources and considered just two simple scenarios where sequence independent replication is allowed either in the hydrocarbon phase or at the interface.

### (g) Combinatorial encoding of amphiphile modulation, reaction rates and mutation

In order to provide a general modelling framework, we divide the genetic effects into two classes: those affecting the amphiphile thermodynamics and those affecting reaction rates. The first class encodes both the local amphiphile energetic parameters and the two chemical potentials. The second class specifies reaction probabilities, typically just one such as for replication, and this is not investigated in this work. The thermodynamic values are encoded in binary sequences, abstracting more general combinatorial structure, and interpolate linearly between maximum and minimum values. The ranges for the signed parameters *ς* (or *z*) are given as parameters to the simulation in the form *ς*=*ς*_{0}±*s*Δ*ς*, where *s* is a finite precision signed binary fraction encoded by the sequence, between −1 and 1. Sixteen bits were used per parameter for *z*_{1} and *z*_{2}, and for *K* and *λ* in the Widom case. For the extended parameters *ς*_{i}, bit lengths between 2 and 5 were used, depending on the lattice (5 for square two-dimensional lattice and 2 for hexagonal lattice).

Mutation is performed as single bit substitutions with a constant uniform probability. We allow multiple mutations with exponentially decreasing probability, by repeating the mutation decision for a sequence until it fails. Note that while the extended parameters *ς*_{i} can be restricted to values equivalent to the original Widom case (yielding parameters *K* and *λ*), their mutation must also be restricted in order to study the evolution within the restricted Widom model.

### (h) Simulations and software

The computer code was written jointly in Mathematica v. 5.2 (Wolfram Research) and the GNU GCC C/C++ compiler, using the Mathlink connection. This allowed algebraic computations of parameters to be interfaced with optimized run-time code and provided powerful high-level graphics and GUI handling capabilities. The simulations were performed on G4 and G5 Macintosh processors under OS X 10.4, with typical equilibrating CPU run-times in the order of seconds to minutes and evolution run-times in the order of minutes to an hour. The graphical display language SDL was used for intermediate graphical display.

### (i) Discussion: coupling of system thermodynamics and kinetics

The presence of a genetic molecule locally modifies the free energy of the complex fluid due to the hydrophobic and hydrophilic (or amphiphilic) interactions and Newton's Third Law dictates that this interaction has to be symmetric so that the ternary system locally influences the genetic molecule in the same manner. If, for instance, a template is amphiphilic, it will tend to locate itself at a nearby water–oil interface if possible and have a lower probability of leaving this state. Consequently, the combinatorial molecule diffusion is phase dependent since, for example, an amphiphilic combinatorial molecule is much more likely to diffuse along an oil–water interface and much less likely to diffuse into the oil or the water phases. Conversely, the presence of the amphiphilic combinatorial molecule will influence the local energetics of the substrate to produce more local interface if possible. The gene–phase interaction strength can be adjusted in the model using the parameters Δ*ς*_{i}, since these specify the maximum magnitude of sequence-dependent physicochemical properties of the combinatorial molecule on the local entropic or energetics of the complex fluid.

Further, it is assumed that the genetic molecules' ability to replicate and/or survive depends on their environment; for example, a hydrophilic combinatorial molecule will replicate with a much higher probability in an aqueous local environment. These catalytic genetic properties are reduced to a case-by-case study in our model: the combinatorial molecules can replicate (either at the same rate or at a rate determined by their sequence) only in a predetermined local phase: either oil, water, interface, bulk or ubiquitously. Mixed scenarios are not considered at this stage. Attention here is devoted to the neutral case, where direct sequence-dependent effects on replication rate are absent: a combinatorial molecule can only influence its survival indirectly via interactions with the complex fluid phases.

Since the underlying multiphase system is represented in a grand canonical ensemble, there is no synthetic cost associated with mass transport of water, oil or amphiphilic molecules into and out of the multiphase system. This means that the local energetic influence that a combinatorial molecule imposes on the complex fluid can in a metabolically neutral manner drive a local mass transport of water, oil and amphiphiles. This is equivalent to assuming that the complex fluid components are freely available or readily synthesized and do not represent combinatorially complex entities. Each combinatorial molecule thereby slightly modifies the local fluid composition and thus the local phase. This resulting phase will in turn influence the combinatorial molecule mobility, the location, as well as the replication rate (figure 3).

The combinatorial molecules are not represented in the grand canonical ensemble because the heredity of their information is a definitive characteristic limiting evolutionary survival. Since there is free energy exchange between the complex fluid and the combinatorial molecules, the combinatorial molecules are represented in the canonical ensemble. Since our simplified model does not explicitly include any metabolic processes, and only the combinatorial molecules are conservatively synthesized by the system, genetic replication and loss defines the non-equilibrium energetic pumping of the overall system. The overall system consists of the template molecules coupled with the ternary fluid system. Each replication event is associated with a free energy cost and this (arbitrary, but fixed) free energy is supplied to the system from outside. We can safely assume that the free energy associated with this pumping (per molecule) is much greater than the free energy associated with the relaxation and self-assembly processes occurring in the multiphase system (per molecule), as the former is associated with the formation of covalent bonds while the latter is associated with the hydrophobic effect.

Owing to these gene–ternary fluid interactions, evolutionary adaptation happens in two fundamentally different ways in this system: (i) in the manner that is traditionally considered, where the generated variation in the template population, due to random mutations, enables selection for templates with the fastest replication rate for a given environment, which is normally called *adaptation to the local environment*, and (ii) the combinatorial molecules can also be selected for as they *adapt the local environment* to enhance their own replication rate. This is possible because the presence of the templates modifies the local thermodynamic conditions of the ternary system, which in turn modifies the thermodynamic (and kinetic) conditions for these combinatorial molecules. Thus, particular templates can be selected for, which modify the local environment, which in turn enhance their replication rate.

This adaptation of the local environment is a collective (cooperative) effect (of multiple genetic molecules) owing to the way local interactions collectively induce phases in the complex fluid. Such population-dependent interactions and environmental influences are known potentially to complicate Darwinian evolution, producing outcomes other than simple adaptive optimization (including limit cycle and more complex oscillatory or chaotic evolutionary dynamics). Spatial effects then clearly play an important role in the feedback between genetic influence on the complex fluid and its influence on genetic survival.

## 3. The combinatorial self-assembly dynamics without reactions

Before considering any reactions or evolution of the combinatorial molecules, it is important to gain some understanding of the nature of this coupled dynamical system: how it affects the amphiphile phase diagram and how it impacts on the free diffusion of modulators. This will be the subject of this section.

Although we shall demonstrate this two-way coupling using generalizations of the Widom model and its essential precursor, the random coupling Ising model, we expect the type of phenomena we demonstrate here to be a general feature of combinatorially coupled complex fluid models.

Note that the influence of random static, non-reacting combinatorial molecules is related to the random energy models as introduced in the study of spin glasses and other disordered solids (Anderson 1958). The simplest case is just an Ising model with a random site energy (or single spin *H* parameter) or random coupling terms *J*. In this static case, it is known that the disorder itself can induce localization thresholds in the mobility of particles subject to this Hamiltonian, but this is unlikely to be the case if the random energies are themselves mobile. In the case of random *H*, the perturbations on the Ising model behaviour appear local. In the case of random *J*, the critical temperature itself can be shifted radically by random perturbations as shown in the gradient plot in figure 4. In order to demonstrate the main effect, we chose reference systems with zero interaction energies. In the lower plots, static inhomogeneous perturbations of the Ising model, as applicable to a binary fluid, are shown. Note that in the absence of the fixed combinatorial molecules, there is no ordered phase at any temperature since *J* is set to 0. The correlated patches in the ordered regime are at fixed locations in the random *J* model. The figures were created with MC simulations starting with a random spin configuration.

A similar result is achieved for the Widom model with fixed combinatorial molecules perturbing *K*^{+}=(1+*λ*) and *K*^{−}=(1−*λ*), as shown in figure 4*c*. Note that, it is more natural to treat these two quantities as independent random variables subject to genetic control, than *K* and *λ* directly, because they represent the separate energies of the juxtaposition of hydrophilic or hydrophobic ends of two amphiphiles ends, respectively, compared with water or hydrocarbon. Although these plots are noisy, being subject to the uncorrelated perturbations of the combinatorial molecules, more correlated perturbations induced by the combinatorial molecules can have a stronger influence.

If we now allow the combinatorial molecules to diffuse, as befits disordered fluids, in contrast with solids, then a number of new phenomena emerge. Recall that the random walks must be biased by the energetic changes induced in the binary (Ising) or ternary (Widom) fluid for thermodynamic consistency, as discussed in §2. Consequently, the mobile combinatorial molecules can both influence the phase of the complex fluid in which they float, and their spatial arrangement and mobilities can be influenced by the induced phases. In figure 5, we show this coupled phase behaviour for the Ising model coupled to mobile elements, which randomly perturb the coupling constant *J* from 0. This describes the way in which a random population of combinatorial molecules with varying degrees of amphiphilicity interacts with a homogeneously mixed two-component fluid to induce mesoscale structuring of the fluid and combinatorial sorting and clustering of the combinatorial amphiphiles. Note that the combinatorial molecules themselves may experience effective, attractive or repulsive interactions allowing aggregation, for example, induced by their impact on amphiphile energetics. We note that such examples of noise-induced order may assist the onset of cooperative action by populations of genetic elements, through correlating collocation with function autonomously.

A similar non-random clustering of the diffusing combinatorial molecules, interacting with a Widom ternary emulsion system of hydrocarbon, water and surfactant, occurs as shown in figure 6. So, in addition to the vertical gradient, we also show a horizontal gradient in the activity coefficient of amphiphile (*z*_{2}). Near *z*_{2}=0, at the left and right extremities of the plot, the higher temperature phase is the mixed system, whereas at lower amphiphile activities, in the centre, the higher temperature phase is the separated fluid. The spatially resolved inverse mobility plot in figure 6*b* confirms the structured feedback on the combinatorial molecules through their perturbation of the complex fluid. The values shown in figure 6*b* are calculated as the free energy cost in units of *kT* for an interchange of two neighbouring amphiphiles in the vertical direction. The plot looks similar for interchanges in the horizontal direction. The mobilities of the combinatorial molecules become strongly correlated in spatial clusters of molecules having similar interaction parameters below the transition to the higher temperature phases.

In figure 7, we show the corresponding free energies that drive the equilibration process for the Widom model case shown in figure 6. Note that equilibration is essentially complete after 1000 MC updates per site. It is worthwhile to observe that the combinatorial molecules relax to lower the free energy of the system, at the expense of the configurational free energy associated with the ternary fluid, consistent with the equilibrium state containing more structure than in the absence of the combinatorial molecules.

In summary, in this section, we have shown that mobile combinatorial molecules can both modulate the phase of complex fluids and themselves be ordered by this interaction. It is significant that random perturbations by combinatorial molecules can induce phase transitions to more ordered phases of complex fluids (as in noise-induced order). Note that although static random energy and random coupling models have been studied in solid-state areas such as spin glasses, the equivalent for a multiphase liquid system with mobile elements does not appear to have received attention until now. The ability of molecular interactions with a complex fluid to provide structured pattern formation in a system of combinatorial molecules introduces a rich range of novel pattern-forming mechanisms to act as a substrate for evolution. This will be explored in §4.

## 4. The evolving replicator–amphiphile system

In this section, we shall explore the coupled evolution and self-assembly which our model captures. This is accomplished by allowing reactions to take place among the combinatorial molecules, in particular, replication, which enables inheritance and subsequently evolution. The simplest case is that of the Ising model as applicable to a simple binary fluid, as discussed in §3, which we shall examine first. This exemplifies the character of evolution coupled to phase control in a thermodynamically consistent way. The second case is the Widom model, and since it incorporates a combinatorial treatment of amphiphiles, this of course is more relevant to the self-organization of protocells. Both models are special cases of the general isotropic local Hamiltonian introduced in §2, which will be investigated in the future for more general evolution.

Consider firstly the evolution of a population of combinatorial molecules which can only replicate in an organic phase, but can influence the coupling constant *J*, determining the interaction energy between the organic molecules and water, in a sequence-dependent fashion. Since non-zero coupling constants can support collective phases, it is conceivable that templates with sequences for appropriate values of *J* can cooperate to produce a local oily phase and thereby increase the replication probability and survival of these templates. As in §3, we set up a temperature-gradient environment (from 0.25 to 2.25), with the environmental coupling coefficient *J*_{0}=0, so that the initial fluid is in a well-mixed state containing equal amount of water and organic material. The range of combinatorial variation is set to ±1. There is no combinatorial variation in replication probability or loss rate, which are set to 0.5 and 0.1 per MC update (unit time). We use simple replication and the binary landscape described in §2 and a single mutation rate of 0.01 (per replication event, with multiple mutation events per replication allowed). The results are shown in figures 8 and 9. Note that at low temperatures, the steady-state population supports some ‘parasitic’ combinatorial molecules, which do not contribute in full to maintaining the oily phase (figure 8*e*).

The evolution of combinatorial molecules capable of acting as or producing amphiphilic molecules will be the subject of the remainder of this section. This is of some significance to the evolution of cellular life, as suggested by the experiments on self-reproducing surfactant systems (Luisi & Varela 1989). Already in the Ising model simplification, some useful insights can be gained, so we present an evolutionary simulation before proceeding to our main result on the Widom model. If replication of template molecules can only occur at the interface between hydrophobic and hydrophilic phases, an assumption relevant to the Los Alamos protocell model (Rasmussen *et al*. 2003), then the situation changes dramatically from the replication in oil scenario shown above (figures 8 and 9).

We commence with a finite environmental value to the coupling constant *J*=0.5, so the system is phase separated at nearly all temperatures in the gradient simulation (*T* from 0.5 to 1; figure 11*a*). We allow a population of combinatorial molecules to influence the coupling constant as in the previous example. Although we could have attained more rapid evolution with the binary sequence landscape used in the previous case, we have used a unary encoding of the coupling constant here (30 bits), which serves to illustrate a landscape where significant functionality is rare and many mutations from random sequences are required to reach this.

The evolution shows increasing adaptations of the combinatorial molecules to produce interface via negative *J* values and thereby an emulsion of the oil–water system. Initially, the population is localized near the macroscopic interface, with minor departures determined by the relative magnitudes of diffusion and loss probabilities. Interface forms more easily at high temperatures, so the time course (figure 10) shows a successive occupation towards the low-temperature centre. The steady-state evolved population occasionally completely fills the space, but predominantly a small phase-separated region at low temperature remains (figure 11*c*). On average, five adaptive mutations are required to achieve the coupling values of the evolved population, with more negative values being adapted in the central low-temperature region (figure 11*e*). Compare this with the neutral case, where local spatial correlations induced by the replication–diffusion system are visible (figure 11*f*). The free energy traces in figure 10*b* show that the evolution drives the system steadily, but non-monotonically, to a higher free energy state with about half the free energy in the genetic modulation.

The final example discussed in this section demonstrates the evolutionary adaptation of replicators' modulator genes to enhance their proliferation via genetic control of self-assembled amphiphile structures. The example is not just selection from among the variants present in an initial population, but a sequence of at least two separate adaptive events. The first occurs some time after the simulation starts, when a colony of replicators first successfully adapts to the watery phase. This adaptation changes the water phase by creating lots of micelles, thus maximizing interface (thus, maximizing reproduction). The second, later adaptation arises independently in a number of new colonies that learn to survive in the oily phase by creating reverse micelles, thus maximizing interface. The timing of these adaptations indicates that they were not present in the initial population (or were present but did not survive).

The example uses an environment created by a gradient of *z*_{1} ranging from −2.7 to 2.7, which creates two bulk phases (water and oil) with a complex interface between them. The replicators evolve different adaptations to survive in these niches, as well as at the interface. Replicators are chosen to replicate only at the interface. Other thermodynamic parameters are *kT*=0.2, *z*_{2}=1.3, *K*=0.25 and *λ*=0.01. Figure 12 illustrates the approach to equilibrium behaviour of the amphiphile system with no replicators present in figure 12*a*,*b*. In figure 12*b*, we see the watery phase on the right (white), the oily phase on the left (black), complex phase in the middle with a few micelles (black dots in white), a few reverse micelles (white dots in black), and various bilayer and inverted bilayer structures that create a large amount of interface (amphiphiles, at the border between white and black). Figure 12*c* illustrates the asymptotic effect of the successfully evolved replicators on that equilibrium. Note that the replicators have changed the self-assembled structures by filling the watery phase with micelles and the oily phase with reverse micelles. Comparison of figure 12*b* with figure 12*c* dramatizes the effect of coupling replicators and self-assembled structures. The replicators have changed the amphiphilic structures to create much more interface, which they need to reproduce, thus evolving a coupling with the amphiphiles that produces a favourable local environment for themselves.

Figure 13 illustrates key statistics measured as the replicators evolve. Note that the population shows signs of making two significant adaptations, the first around time-step 300 and the second around time-step 800. The timing of the first adaptation (around time-step 300) corresponds to the onset of a steady fall in the mean value of gene 1; the corresponding characteristic rise and fall in the variance in gene 1 and the rise of the replicator concentration to its first plateau. The timing of the second adaptation (around time-step 800) is confirmed by the slight final bump in mean value of gene 1 in figure 13*d* when the new oily living subpopulations grow and raise the mean gene 2 value; this timing is corroborated by the corresponding rise of the replicator concentration to its second plateau and the variance dynamics in that gene (not shown).

Figure 14 illustrates the evolutionary sequence. The amphiphiles and replicators together are shown on the left and the amphiphiles alone are shown on the right. We clearly see two separate kinds of adaptation: the first involves colonizing the watery niche by creating interface via micelles and the second involves colonizing the oily niche by creating interface via reverse micelles. Comparison of equilibrium spin configurations with and without replicators shows the dramatic effects of coevolution of replicators and amphiphiles.

Figure 15 illustrates the evolutionary activity waves (Bedau & Packard 1991; Bedau *et al*. 1997) indicating the evolution of significant adaptations. The activity statistics were computed by measuring persistence of genes in the population, attaching a counter to each gene, and propagating the counter unchanged to offspring, except when a mutation occurs, when the counter is set to zero. Activity is plotted by forming the distribution of persistence counter values in the population and observing the evolution of that distribution as a function of time. Formation and propagation of a wave in such a plot indicates the introduction of an innovation that persists owing to survival value conferred on replicators by a particular gene value. The large wave A originating at the origin corresponds to certain combinatorial molecules that originated in the few initial replicators that eventually seeded the first successful population. The first adaptation corresponds to activity wave starting near time-step 200, when the population learns to survive in the watery phase by mutations in gene 1 that have the effect of creating micelles and thus increasing the amount of interface. Successive adaptations continue and the most successful version of this adaptation originates around time-step 300, when the population level of replicators shows a significant increase. The second adaptation corresponds to activity wave originating around time-step 600 when a small subset of the population migrates into the oily phase and learns to create reverse micelles. This figure confirms that the main adaptive events are not mere selection from variability that is present in the initial population, but instead involves new innovations. The many waves starting around time-step 800 correspond to a succession of different subpopulations that independently migrate into the oily phase and adapt to it.

## 5. Discussion and conclusion

We have shown how the physics of complex fluids can be coupled with that of physicochemical evolution and reaction–diffusion kinetics to yield physicochemical evolution and to elucidate the way in which pattern formation in collective physical phases (self-assembly) can become the object of natural selection. We have explored this path of combinatorial evolution from simple Ising models of phase transitions to as far as spin-lattice models of ternary amphiphile systems such as the Widom model and beyond, which are known to capture essential features of the complex phase diagram of real emulsion systems.

The characteristic of this paper is clearly that of opening up a large new area of inquiry with a presentation of first results, rather than an exhaustive analysis of all intermediate models. We have not attempted to use analytical treatments of the equilibrium properties of spin-lattice models in the analysis of the evolutionary system. It would be useful to have results on the equilibrium properties of the non-reacting versions of the coupled models we have presented of complex combinatorial fluids. The replica method and renormalization techniques (McCaskill 1984) have proved successful in analysing the steady-state properties of combinatorial replicator models with statistical ensembles of fitness values and could be applied here.

The evolution of pattern-forming systems has been studied in the context of reaction–diffusion systems by one of us in previous work, both experimentally and theoretically (McCaskill 1997), and this work has also yielded analytical tools (McCaskill *et al*. 2001), which may be relevant to the current context. It is clear that microreactor technology, with laser-induced fluorescence detection, can be used to create controlled gradients of temperature and concentrations with which the phase space screens utilized for rapid insight in this work can be realized experimentally. The control of non-equilibrium self-assembly of binary and ternary fluids has been the subject of much recent investigation (Thorsen *et al*. 2001), not least owing to potential biotechnological applications. Epstein has shown that emulsions can also be used to modify the properties of reaction–diffusion systems (Vanag & Epstein 2001). However, chemical examples of the mutual interaction of complex fluid phases with immersed reaction–diffusion kinetics appear to be novel outside existing living cells. Since such systems can capture the essential physics of this coupling, they provide an important stepping stone in the transition between inanimate and animate matter.

The spin-lattice approach suffers a serious drawback when it comes to modelling the dynamics of collective effects. No inertial effects are included in the dynamics, and so although collective movement of mesoscale structures can be observed, we cannot hope to obtain the correct scaling of diffusion of these structures with increasing size, nor hydrodynamic interactions, nor the appropriate response to external forces. In separate work, we are pursuing dissipative particle dynamics (Bedau & Brown 1999; Buchanan *et al*. 2006) to create inertial evolvable models conserving the numbers of molecules in the system, but these more explicit off-lattice simulations are more time consuming and there are still open questions about valid physical parameterization in pushing the integration time-step well beyond that of molecular dynamics through stochastic integration.

We have not yet explored the influence of a specific free energy or chemical building block resource on the self-organization in detail. Our modelling framework is equipped with a resource plane with diffusing resources that are consumed during replication. All the simulations reported in this paper, however, were performed with immediate replenishment of a uniformly dense resource population. The pattern formation ability of resource depletion has been powerfully illustrated in the context of reaction–diffusion systems by the self-replicating spots in the Scott–Gray model (Pearson 1993).

Various authors have constructed theoretical models of components of protocells. Closest in spirit to the current approach is that of Ikegami (Ono & Ikegami 2000; Madina *et al*. 2003) who employed a more complex orientation-dependent model (cf. Dawson & Kurtovic 1990) of a membrane-forming system to analyse the chemical kinetics in complex fluids supporting cell-like entities. The spectrum of models presented here, with amphiphilic structure modelled by spin configurations on a lattice, is simpler and not biased towards membrane systems. It enables a sharper characterization of energy couplings and free energy flows that drive pattern formation. An evolving system coupled to a non-evolving one in two layers was also explored in a different and non-thermodynamic context (the game of life CA) by Taylor (2004). In connection with the origin of life, although our work employs a sequential combinatorial representation of genetic information, this is not critical for many of the conclusions, so that the current approach may also be applicable to compositional information models of protocells as introduced by Lancet (Serge *et al*. 2000).

An important property of the current model is that combinatorial molecules must act together in a concerted fashion if they are to induce a local phase change (which then impacts either directly on replication rate or indirectly via mobility changes or induced aggregation on template survival). This means that the central problem of natural selection with cooperative functionality is present in the coupling of combinatorial selection with phase control. In order to quantify this relationship, one could analyse the minimum density of combinatorial molecules required to induce the phase change for a given level of parameter influence. Note that it is well known that the outcome of cooperative selection is very strongly dependent on the mobilities of the molecules carrying the heritable information and is supported only for an intermediate range of mobilities (McCaskill *et al*. 2001). It is an open question whether a system can coevolve mobility and cooperative functionality to place itself in this window. It does appear that part of the functionality of a protocell is to provide a strong control of the mobility of genetic information.

## Acknowledgments

The bulk of the work was supported by the EU in the PACE integrated project, EC-FP6-IST-FET-IP-002035. All authors acknowledge travel grants from the Santa Fe Institute in 2003, European Center for Living Technology 2004. Additional support for S.R. 2005 came from the LDRD-DR Protocell Assembly (PAs) project. The corresponding author wishes to thank Manfred Eigen for his continuing support and his research group BioMIP, especially T. Maeke and S. McCaskill, for their advice and consideration.

## Footnotes

↵† On leave, Friedrich Schiller University, Jena.

One contribution of 13 to a Theme Issue ‘Towards the artificial cell’.

- © 2007 The Royal Society