## Manfred Eigen

Print publication date: 2013

Print ISBN-13: 9780198570219

Published to British Academy Scholarship Online: May 2013

DOI: 10.1093/acprof:oso/9780198570219.001.0001

Show Summary Details

# (p.667) Appendix A4 The Mathematics of Darwinian Systems

Source:
From Strange Simplicity to Complex Familiarity
Publisher:
Oxford University Press

(p.667) Appendix A4

The Mathematics of Darwinian Systems

Abstract: Optimization is studied as the interplay of selection, recombination and mutation. The underlying model is based on ordinary differential equations (ODEs), which are derived from chemical kinetics of reproduction, and hence it applies to sufficiently large — in principle infinite — populations. A flowreactor (continuously stirred tank reactor, CSTR) is used as an example of a realistic open system that is suitable for kinetic studies. The mathematical analysis is illustrated for the simple case of selection in the CSTR. In the following sections the kinetic equations are solved exactly for idealized conditions. A brief account on the influences of finite population size is given in the last section.

# 1. Replication in the flowreactor

Replication and degradation of molecular species I i, (i=1,2,&,n) in the flowreactor (Figure A4.1) follows the mechanism

(A4.1)
$Display mathematics$

(p.668)

Figure A4.1 The flowreactor for the evolution of RNA molecules.

A stock solution containing all materials for RNA replication ([A] = a0), including an RNA polymerase, flows continuously at a flow rate r into a well‐stirred tank reactor (CSTR) and an equal volume containing a fraction of the reaction mixture ([*] = {a, b, ci}) leaves the reactor (for different experimental setups see Watts).[1] The population of RNA molecules (I1, I2, …, In present in the numbers N1,N2, …,Nn with $N = ∑ i = 1 n N i$) in the reactor fluctuates around a mean value, N±√N. RNA molecules replicate and mutate in the reactor, and the fastest replicators are selected. The RNA flowreactor has been used also as an appropriate model for computer simulations. [2, 3, 4]There, other criteria for selection than fast replication can be applied. For example, fitness functions are defined that measure the distance to a predefined target structure and mean fitness increases during the approach towards the target.[4]

(p.669) and is described by the following (n + 2) kinetic differential equations
(A4.2)
$Display mathematics$

The variables a(t), b(t), and ci(t) are concentrations, [A]=a, [B]=b and [Ii] = ci, which are defined by a=NA/(V NL), b=NB/(V NL), and c_i= Ni/(V NL) where V is the volume and NL is Avogadro's number, the number of particles in one mole of substance. The particle numbers N are discrete and non‐negative quantities, whereas concentrations are assumed to be continuous because NL=6.023 × 1023 is very large.1

The equations (A4.2) sustain (n+1) stationary states fulfilling the conditions for i=1,2,&,n. Every stationarity conditions for one particular class of replicating molecules I i

$Display mathematics$

has two solutions (i) $c̄ i = 0$ and . Since any pair of type (ii) conditions is incompatible,2 only two types of solutions remain: (i) , the state of extinction, because no replicating molecule ‘survives’ and (ii)n states with and . Steady‐state analysis through linearisation and diagonalisation of the Jacobian matrix at the stationary points yields the result that only one of the n states is asymptotically stable, in particular the one for the species I m that fulfils

(A4.3)
$Display mathematics$

Accordingly, species I m is selected and we call this state the state of selection. The proof is straightforward and yields simple expressions for the eigenvalues λk (k=0,1,&,n) of the Jacobian matrix when degradation is neglected, dj=0 (j=1, 2, &, n). For the state of extinction we find

(A4.4)
$Display mathematics$

(p.670) It is asymptotically stable as long as r 〉 km a0 is fulfilled. If r 〈 km a0 then r 〈 kja0 ∀ j≠ m is valid by definition because of the selection criterion (A4.3) for dj=0. For all other n pure states the eigenvalues of the Jacobian are:

(A4.5)
$Display mathematics$

All pure states except the state at which I m is selected (state of selection: cm=a0-r/km, cj=0, j=1, &, n, j≠ m) have at least one positive eigenvalue and are unstable. Therefore we observe indeed selection of the molecular species with the largest value of kj (or kj a0 −dj, respectively), because only at $c̄ m ≠ 0$ are all eigenvalues of the Jacobian matrix negative.

It is worth indicating that the dynamical system (A4.2) has a stable manifold $ȳ = ā + b̄ + ∑ i = 1 n c̄ i = a 0$ since $ẏ = ȧ + ḃ + ∑ i = 1 n ċ i = ( a 0 − y ) r$. The sum of all concentrations, y(t), follows a simple exponential relaxation towards the steady state $ȳ = a 0$:

$Display mathematics$

with the flow rate r being the relaxation constant.

# 2. Selection

As shown in the previous section the basis of selection is reproduction in the form of a simple autocatalytic elementary step, (A)+I i→2I i. We idealise the system by assuming that the material consumed in the reproduction process, A, is present in excess. Therefore, its concentration is constant and can be subsumed as a factor in the rate constant: fi=ki [A]. In addition we neglect the degradation terms by putting di=0 ∀ i. In terms of chemical reaction kinetics selection based on reproduction without recombination and mutation is described by the dynamical system

(A4.6)
$Display mathematics$

As before the variables ci(t) are the concentrations of the genotypes Ii, the quantities fi are reproduction rate parameters corresponding to overall replication rate constants in molecular systems or, in general, the fitness values of the genotypes. A global flux Փ(t) has been introduced in order to regulate the growth of the system. (p.671) Transformation to relative concentrations, xi=ci/c, and adjusting the global flux Փ(t) to zero net‐growth simplifies further analysis:3

(A4.7)
$Display mathematics$

The relative concentrations xi(t) fulfil $∑ j = 1 n x j ( t ) = 1$ and the flux Փ(t) is the mean growth rate of the population. Because of this conservation relation only n−1 variables xi are independent. In the space of n Cartesian variables, n, the x variables represent a projection of the positive orthant onto the unit simplex (Figure A4.2)

$Display mathematics$

The simplex $S n ( 1 )$ is an invariant manifold of the differential equation (A4.7). This means that every solution curve x(t)=(x1(t),x2(t),&,xn(t) that starts in one point of the simplex will stay on the simplex forever.

Figure A4.2 The unit simplex.

Shown is the case of three variables (c1, c2, c3) in Cartesian space ℝ(3) projected onto the simplex $S 3 ( 1 )$. The condition x 1 + x 2 + x 3 = 1 defines an equilateral triangle in ℝ(3) with the three unit vectors, e1 = (1, 0, 0), e2 = (0, 1, 0) and e3 = (0, 0, 1) as corners.

(p.672) In order to analyse the stability of $S n ( 1 )$ we relax the conservation relation $∑ i = 1 n x i ( t ) = c ( t )$ and assume that only the conditions
$Display mathematics$

are fulfilled. According to this assumption all rate parameters are strictly positive — a condition that will be replaced by the weaker one fi≥0∀i≠ k∧ fk〈0 — and the concentration variables are non‐negative quantities. Stability of the simplex requires that all solution curves converge to the unit simplex from every initial condition, $t → ∞ ( ∑ i = 1 n x i ( t ) ) = 1$. This conjecture is proved readily: from $∑ i = 1 n x i ( t ) = c ( t )$ follows

(A4.8)
$Display mathematics$

For $ċ = 0$ we find the two stationary states: a saddle point at $c̄ = 0$ and an asymptotically stable state at $c̄ = 1$. There are several possibilities to verify its asymptotic stability, we choose to solve the differential equation and find:

$Display mathematics$

Starting with any positive initial value c(0) the population approaches the unit simplex. When it starts on $S n$ it stays there even in the presence of fluctuations.4 Therefore, we restrict population dynamics to the simplex without loosing generality and characterise the state of a population at time t by the vector x(t) which fulfils the $L ( 1 )$ norm $∑ i = 1 n x i ( t ) = 1$.

The necessary and sufficient condition for the stability of the simplex, Փ(t)〉0, enables us to relax the condition for the rate parameters fi. In order to have a positive flux it is sufficient that one rate parameter is strictly positive provided the corresponding variable is non‐zero:

$Display mathematics$

For the variable xk it is sufficient that xk(0)〉0 holds because xk(t)≥ xk(0) when all other products fjxj were zero at t=0. This relaxed condition for the flux is important for the handling of lethal mutants with fj=0.

(p.673) The time dependence of the mean fitness or flux Փ is given by

(A4.9)
$Display mathematics$

Since a variance is always non‐negative, equation (A4.9) implies that Փ(t) is a non‐decreasing function of time. The value {f}=0 refers to a homogeneous population of the fittest variant, and then Փ(t) cannot increase any further. Hence it has been optimised during selection.

It is also possible to derive analytical solutions for equation (A4.7) by a transform called integrating factors ([5], p.322ff.):

(A4.10)
$Display mathematics$

Insertion into (A4.7) yields

$Display mathematics$

where we have used zi(0)=xi(0) and the condition . The solution finally is of the form

(A4.11)
$Display mathematics$

Under the assumption that the largest fitness parameter is non‐degenerate, {fi; i=1,2,&,n} = fmfi∀i ≠ m, every solution curve fulfilling the initial condition xi(0)〉0 approaches a homogenous population: $lim t → ∞ x m ( t ) = x̄ m = 1$ and $lim t → ∞ x i ( t ) = x̄ i = 0 ∀ i ≠ m$, and the flux approaches the largest fitness parameter monotonously, Փ(t)→ fm (examples are shown in Figure A4.3).

Qualitative analysis of stationary points and their stability yields the following results: (p.674)

1. (i) The only stationary points of equation ((A4.7)) are the corners of the simplex, represented by the unit vectors ek = {xk = 1, xi = 0∀i ≠ k},

2. (ii) only one of these stationary points is asymptotically stable, the corner where the mean fitness Փ adopts its maximal value on the simplex $e m : x̄ m = 1$ defined by {f i; i=1,2,&,n} = fmfi ∀i≠ m), one corner is unstable in all directions, a source where the value of Փ is minimal $e s : x̄ s = 1$ defined by {f i; i=1,2,&,n}= fsfi ∀ i≠ s), and all other n−2 equilibria are saddle points, and

3. (iii) since xi(0)=0 implies xi(t)=0 ∀ t〉0, every subsimplex of $S n ( 1 )$ is an invariant set, and thus the whole boundary of the simplex consists of invariant sets and subsets down the corners (which represent members of class $S 1 ( 1 ) )$.

Although Փ(t) represents a Liapunov function for the dynamical system (A4.7) and its existence is sufficient for the proof of global stability for selection of the fittest being the species with the largest fk value, it is of interest that the differential equation ((A4.7)) can be interpreted as a generalised gradient system [6, 7, 8] through the introduction of a Riemann‐type metric based on a generalised scalar product defined at position x

$Display mathematics$

(A4.12)
$Display mathematics$

the potential V(x) increases steadily along the orbits,

$Display mathematics$

and it does so at a maximal rate, since the velocity vector, being equal to the gradient, points at the position of maximum increase of V. In other words, the velocity vector points in the direction of steepest ascent, which is always orthogonal to the constant level sets of V. In gradient systems we observe optimisation of the potential function V(x) along all orbits.

For the purpose of illustration we choose an example, equation (A4.12) with

$Display mathematics$

(p.675) The time derivative of the potential function is obtained by

$Display mathematics$

The potential is increasing until it reaches asymptotically the maximal value V=0. Solutions of the differential equation are computed by integration: xi(t)=xi(0) exp (−fit) ∀ i=1,&,n. The result derived from dV}/dt is readily verified, since limt→∞ xi(t)=0 ∀ i=1,&,n and hence limt→∞ V(t)=0.

Equation ((A4.7)), on the other hand, is not an ordinary gradient system: it fulfills the optimisation criterion but the orbits are not orthogonal to the constant level sets of V(x)=Փ(x) (see Figure A4.3). In such a situation, it is often possible to achieve the full gradient properties through a generalisation of the scalar product that is tantamount to a redefinition of the angle on the underlying space, ℝn or , respectively. We shall describe here the formalism by means of the selection equation ((A4.7)) as an example.

The potential function is understood as a map, $V ( x ) : ℝ n ⇒ ℝ 1$. The derivative of the potential DV(x) is the unique linear map $L : ℝ n ⇒ ℝ 1$ that fulfils for all $y ∈ ℝ n$:

$Display mathematics$

where for o(y) holds o(y)/║ y ║ → 0 as ║ y ║→ 0. To L corresponds a uniquely defined vector l ∈ℝn such that 〈l, y〉=L(y) where 〈*,*〉 in the conventional Euclidean inner product is defined by $〈 u,v 〉 = ∑ i = 1 n u i v i$ for u, v∈ℝn. This special vector ℓ is the gradient of the potential V, which can be defined therefore by the following mapping of y into ℝ1:

(A4.13)
$Display mathematics$

The conventional Euclidean inner product is associated with the Euclidean metric, ║ x ║=〈x, x1/2.

It is verified straightforwardly that equation ((A4.7)) does not fulfill the condition of a conventional gradient (A4.13). The idea is now to replace the Euclidean metric by another more general metric that allows the properties of the gradient system. We introduce a generalised inner product corresponding to a Riemann‐type metric where the conventional product terms are weighted by the co‐ordinates of the position vector z:

(A4.14)
$Display mathematics$

Expression (A4.14), [*,*]z, defines an inner product in the interior of $ℝ + n$, because it is linear in u and v, and satisfies [u,u]z≥ 0 with the equality fulfilled if and only

(p.676)

Figure A4.3 Selection on the unit simplex.

In the upper part of the figure we show solution curves x(t) of equation ((A4.7)) with n = 3. The parameter values are: f 1 = 1 [t−1], f 2 = 2 [t−1] and f 3 = 3 [t−1], where [t−1] is an arbitrary reciprocal time unit. The two sets of curves differ with respect to the initial conditions:

(i) x(0) = (0.90, 0.08, 0.02), dotted curves, and (ii) x(0) = (0.9000, 0.0999, 0.0001), full curves. Color code: x1(t) green, x2(t) red and x3(t) black. The lower part of the figure shows parametric plots x(t) on the simplex $S 3 ( 1 )$. Constant level sets of Փ are straight lines (grey).

(p.677) if ║ u ║=0. Based on this choice of the inner product we redefine the gradient:
(A4.15)
$Display mathematics$

The differential DV (x) is defined completely by V(x) and hence is independent of the choice of an inner product; the gradient, however, is not because it depends on the definition (A4.15). Shahshahani [6] has shown that the relation dx/dt=Grad[Փ(x)] is fulfilled for Fisher's selection equation (A4.19; see section 5) with $Φ = ∑ i = 1 n ∑ j = 1 n a i j x i x j$. As an example for the procedure we consider here the simple selection equation ((A4.7)) with $Φ = ∑ i = 1 n f i x i$.

The differential equation ((A4.7)) is conceived as a generalised gradient system and we find:

$Display mathematics$

By application of equation (A4.15) we obtain

$Display mathematics$

which can be derived by conventional differentiation from

$Display mathematics$

By straightforward computation we find the desired result:

$Display mathematics$

With the new definition of the scalar product, encapsulated in the definition of ‘Grad’, the orbits of equation ((A4.7)) are perpendicular to the constant level sets of Փ(x).

# 4. Complementary replication

Often the molecular mechanism of replication proceeds through an intermediate represented by a polynucleotide molecule with a complementary sequence: (A)+I +I ++I and (A)+I I +I +. In analogy to equation ((A4.7)) and with $f 1 = f̃ + [ A ] , f 2 = f̃ − [ A ] , x 1 = [ I + ]$, and x2=[I ] we obtain the following differential equation 9–11:

(A4.16)
$Display mathematics$

(p.678) Applying the integrating factor transformation (A4.10) yields the linear equation

$Display mathematics$

The eigenvalues and (right‐hand) eigenvectors of the matrix W are

(A4.17)
$Display mathematics$

Straightforward calculation yields analytical expressions for the two variables (see paragraph mutation) with the initial concentrations x 1(0) and x 2(0), and $γ 1 ( 0 ) = f 1 x 1 ( 0 ) + f 2 x 2 ( 0 )$ and $γ 2 ( 0 ) = f 1 x 1 ( 0 ) + f 2 x 2 ( 0 )$ as abbreviations:

(A4.18)
$Display mathematics$

After a sufficiently long time the negative exponential has vanished and we obtain the simple result

$Display mathematics$

After an initial period, the plus‐minus pair, I ±, grows like a single autocatalyst with a fitness value $f = f 1 f 2$ and a stationary ratio of the concentrations of complementary stands $x 1 / x 2 ≈ f 2 / f 1$.

# 5. Recombination

Recombination of n alleles on a single locus is described by Ronald Fisher's [12] selection equation,

(A4.19)
$Display mathematics$

(p.679) As in the simple selection case the two conditions a ij}〉 0∀i,j = 1,2,&,n and x i≥ 0∀i=1,2,&,n will guarantee Փ(t)≥0. Summation of allele frequencies, $∑ i = 1 n x i ( t ) = c ( t )$, yields again equation (A4.8) for ċ and hence for $∑ i = 1 n x i ( 0 ) = 1$ the population is confined again to the unit simplex.

The rate parameters a ij form a quadratic matrix

$Display mathematics$

The dynamics of equation (A4.19) for general matrices A may be very complicated [13]. In case of Fisher's selection equation, however, we are dealingwith a symmetric matrix for biological reasons,5 and then the differential equation can be subjected to straightforward qualitative analysis.

The introduction of mean rate parameters $ā i = ∑ j = 1 n a i j x j$ facilitates the forthcoming analysis.The time dependence of Փ is now given by

(A4.20)
$Display mathematics$

Again we see that the flux Փ(t) is a non‐decreasing function of time, and it approaches an optimal value on the simplex. This result is often called Fisher's fundamental theorem of evolution (see, for example, [14]).

Qualitative analysis of equation (A4.19) yields 2n−1 stationary points, which depending on the elements of matrix A may lie in the interior, on the boundary or outside the unit simplex $s n ( 1 )$. In particular, we find at most one equilibrium point on the simplex and one on each subsimplex of the boundary. For example, each corner, (p.680) represented by the unit vector $e k = { x̄ k = 1 , x i = 0 ∀ i ≠ k }$, is a stable or unstable stationary point. Where there is an equilibrium in the interior of $S n ( 1 )$ it may be stable or unstable depending on the elements of A. In summary, this leads to a rich collection of different dynamical scenarios which share the absence of oscillations or chaotic dynamics. As said above, multiple stationary states do occur and more than one may be stable. This implies that the optimum, which Փ(t) is approaching, need not be uniquely defined. Instead Փ(t) may approach one of the local optima and then the outcome of the selection process will depend on initial conditions [14, 15, 16, 17].

Three final remarks are important for a proper understanding of Fisher's fundamental theorem: (i) selection in the one‐locus system when it follows equation (A4.19) optimises mean fitness of the population, (ii) the outcome of the process need not be unique since the mean fitness Փ may have several local optima on the unit simplex, and (iii) optimisation behaviour that is susceptible to rigorous proof is restricted to the one locus model since systems with two or more gene loci may show different behaviours of Փ(t).

# 6. Mutation

The introduction of mutation into the selection equation ((A4.7)) based on knowledge from molecular biology is due to Manfred Eigen [9]. It leads to

(A4.21)
$Display mathematics$

Mutations and error‐free replication are understood as parallel reaction channels, and the corresponding reaction probabilities are contained in the mutation matrix

$Display mathematics$

where Qij expresses the frequency of a mutation I jI i. Since the elements of Q are defined as reaction probabilities and a replication event yields either a correct copy or a mutant, the columns of Q sum up to one, $∑ i = 1 n Q i j = 1$, and hence, Q is stochastic matrix. In case one makes the assumption of equal probabilities for I jI i and I iI j, as it is made for example in the uniform error rate model (see equation (A4.29) and [10,18], Q is symmetric and hence a bistochastic matrix.6 The mean fitness or flux Փ is described by the same expression as in the selection‐only case (p.681) (A4.7). This implies that the system converges to the unit simplex, as it did without mutations. For initial values of the variables chosen on the simplex, $∑ i = 1 n x i ( 0 ) = 1$, it remains there.

In the replication‐mutation system the boundary of the unit simplex, $S n ( 1 )$, is not invariant. Although no orbit starting on the simplex will leave it, which is a conditio sine qua non for chemical reactions requiring non‐negative concentrations, trajectories flow from outside the positive orthant into $S n ( 1 )$. In other words, the condition x i(0)=0 does not lead to x i(t)=0 ∀ t〉0. The chemical interpretation is straightforward: If a variant I i is not present initially, it can, and depending on Q commonly will, be formed through a mutation event.

## 6.1 Exact solution

Before discussing the role of the flux Փ in the selection‐mutation system, we shall derive exact solutions of equation (A4.21) following a procedure suggested in [19,20]. At first the variables xi(t) are transformed as in the selection‐only case (A4.10):

$Display mathematics$

From $∑ x = 1 n x i ( t ) = 1$ follows straightforwardly — again as in the selection‐only case — the equation,

$Display mathematics$

What remains to be solved is a linear first‐order differential

(A4.22)
$Display mathematics$

which is readily done by means of standard linear algebra. We define a matrix W ={Wij = Qij fj} = Q∙F where F ={Fij=fiδij is a diagonal matrix, and obtain the differential equation in matrix form, ż=W∙z. Provided matrix W is diagonalisable, which will always be the case when the mutation matrix Q is based on real chemical reaction mechanisms, the variables z can be transformed linearly by means of an invertible n × n matrix L= {ℓij i,j=1,&,n} with L−1=hij; i,j=1,&,n} being its inverse,

$Display mathematics$

(p.682) such that L−1∙ W ∙ L=λ is diagonal. The elements of λ, λk, are the eigenvalues of the matrix W. The right‐hand eigenvectors of W are given by the columns of L, ℓj=ℓi,j; i = 1,&,n, and the left‐hand eigenvectors by the rows of L−1, hk = (hk,i; i = 1,&,n, respectively. These eigenvectors can be addressed as the normal modes of selection‐mutation kinetics. For strictly positive off‐diagonal elements of W, implying the same for Q which says nothing more than every mutation I iI j is possible, although the probability might be extremely small, the Perron—Frobenius theorem holds (see, for example, [21] and next paragraph) and we are dealing with a non‐degenerate largest eigenvalue λ0,

(A4.23)
$Display mathematics$

and a corresponding dominant eigenvector ℓ0 with strictly positive components, ℓ10〉0 ∀ i=1,&,n.7 In terms of components the differential equation in ζ has the solutions

(A4.24)
$Display mathematics$

Transformation back into the variables z yields

(A4.25)
$Display mathematics$

with the initial conditions encapsulated in the equation

(A4.26)
$Display mathematics$

From here we obtain eventually the solutions in the original variables xi in analogy to equation (A4.11)

(A4.27)
$Display mathematics$

## 6.2 Perron‐Frobenius theorem

Perron—Frobenius theorem comes in two versions [21], which we shall now consider and apply to the selection‐mutation problem. The stronger version provides (p.683) a proof for six properties of the largest eigenvector of non‐negative primitive matrices8 T:

1. (i) The largest eigenvalue is real and positive, λ_0〉0,

2. (ii) a strictly positive right eigenvector ℓ0 and a strictly positive left eigenvector h 0 are associated with λ0,

3. (iii) λ0〉|λk| holds for all eigenvalues λk≠λ0,

4. (iv) the eigenvectors associated with λ0 are unique up to constant factors,

5. (v) if l ≤ B ≤ T is fulfilled and β is an eigenvalue of B, then |β|≤λ0, and, moreover, |β|=λ0 implies B=T,

6. (vi) λ 0 is a simple root of the characteristic equation of T.

The weaker version of the theorem holds for irreducible matrices9 T. All the above given assertions hold except (iii) has to be replaced by the weaker statement

(iii) λ0〉|λk| holds for all eigenvalues λk≠λ0

Irreducible cyclic matrices can be used straightforwardly as examples in order to demonstrate the existence of conjugate complex eigenvalues (an example is discussed below). Perron—Frobenius theorem, in its strict or weaker form, holds not only for strictly positive matrices T〉0 but also for large classes of mutation or value matrices W≡T being a primitive or irreducible non‐negative matrix) with off‐diagonal zero entries corresponding to zero mutation rates. The occurrence of a non‐zero element $t i j ( m )$ in Tm implies the existence of a mutation path I jI k→&→ I I i with non‐zero mutation frequencies for every individual step. This condition is almost always fulfilled in real systems.

## 6.3 Complex eigenvalues

In order to address the existence of complex eigenvalues of the value matrix W we start by considering the straightforward case of a symmetric mutation matrix Q. Replication rate parameters fi are subsumed in a diagonal matrix: F={f i δi,j;i,j=1,&,n}, the value matrix is obtained as product W=Q∙F, and, in general, W is (p.684) not symmetric. A similarity transformation,

$Display mathematics$

yields a symmetric matrix [22], since $F 1 2 · Q · F 1 2$ is symmetric if Q is. Symmetric matrices have real eigenvalues and as a similarity transformation does not change the eigenvalues W has only real eigenvalues if Q is symmetric.

The simplest way to yield complex eigenvalues is by the introduction of cyclic symmetry into the matrix Q in such a way that the symmetry with respect to the main diagonal is destroyed. An example is the matrix

$Display mathematics$

with different entries Qij. For equal replication parameters the eigenvalues contain complex n th roots of one, $γ k n = 1$ or γk= exp (2π ik/n), i=1,&, n, and for n≥3 most eigenvalues come in complex conjugate pairs. As mentioned earlier symmetry in mutation frequencies is commonly not fulfilled in nature. For point mutations the replacement of one particular base by another one does usually not occur with the same frequency as the inverse replacement, GA versus AG, for example. Needless to say, cyclic symmetry in mutation matrices is also highly improbable in real systems. The validity the of Perron—Frobenius theorem, however, is not affected by the occurrence of complex conjugate pairs of eigenvectors. In addition, it is unimportant for most purposes whether a replication‐mutation system approaches the stationary state monotonously or through damped oscillations (see next paragraph).

## 6.4 Mutation and optimisation

In order to consider the optimisation problem in the selection‐mutation case, we choose the eigenvectors of W as the basis of a new co‐ordinate system (Figure A4.4):

$Display mathematics$
where the vectors ei are the unit eigenvectors of the conventional Cartesian co‐ordinate system and lk the eigenvectors of W. The unit eigenvectors represent the corners of $S n ( 1 )$ and in complete analogy we denote the space defined by the vectors $S ˜ n ( 1 )$. Formally, the transformed differential equation

(p.685)

Figure A4.4 The quasi‐species on the unit simplex.

Shown is the case of three variables (x 1, x 2, x 3) on $S 3 ( 1 )$. The dominant eigenvector, the quasi‐species denoted by ℓ0, is shown together with the two other eigenvectors, ℓ1 and ℓ2.The simplex is partitioned into an optimization cone (white, red trajectories) where the mean replication rate $f ̄ ( t )$ is optimised, two other zones where $f ̄ ( t )$ may also decrease (grey) and the master cone, which is characterised by non‐increasing $f ̄ ( t )$ and which contains the master sequence (white, blue trajectories). Here, I 3 is chosen to be the master sequence. Solution curves are presented as parametric plots x(t). In particular, the parameter values are f 1 = 1.9[t−1], f 2 = 2.0[t−1] and f 3 = 2.1[t−1], and the Q−matrix was assumed to be bistochastic with the elements Qii = 0.98 and Qij = 0.01 for i,j = {1, 2, 3}. Then the eigenvalues and eigenvectors of W are:

k

λk

k

2k

3k

0

2.065

0.093

0.165

0.742

1

1.958

0.170

1.078

−48

2

1.857

1.327

−0.224

−0.103

The mean replication rate $f ̄ ( t )$ is monotonously increasing along red trajectories, monotonously decreasing along the blue trajectory and not necessarily monotonous along green trajectories. Constants level sets of Փ are straight lines (grey).

(p.686)
$Display mathematics$

is identical to equation ((A4.7)) and hence the solutions are the same,

$Display mathematics$

as well as the maximum principle on the simplex

(A4.9a)
$Display mathematics$

The difference between selection and selection‐mutation comes from the fact that the simplex $S ˜ n$ does not coincide with the physically defined space $S n$ (see Figure A4.4 for a low‐dimensional example). Indeed only the dominant eigenvector ℓ0 lies in the interior of $S n ( 1 )$: it represent the stable stationary distribution of genotypes or quasi‐species [10] towards which the solutions of the differential equation (A4.21) converge. All other n−1 eigenvectors, ℓ1,&,ℓn−1 lie outside $S n ( 1 )$ in the not physical range where one or more variables xi are negative. The quasi‐species ℓ0 is commonly dominated by a single genotype, called the master sequence I m, having the largest stationary relative concentration, $x ̄ m 〉 〉 x ̄ i ∀ i ≠ m$, reflecting, for not too large mutation rates, the same ranking as the elements of the matrix W, $W m m 〉 〉 W i i ∀ i ≠ m$; im. As sketched in Figure A4.4 the quasi‐species is then situated close to the unit vector em in the interior of $S n ( 1 )$.

For the discussion of the optimisation behaviour the simplex is partitioned into three zones: (i) the zone of maximization of Փ(t), the (large) lower white area in figure A4.4 where equation (A4.9a) holds and which we shall denote as optimisation cone, (ii) the zone that includes the unit vector of the master sequence, em, and the quasi‐species, ℓ0, as corners, and that we shall characterize as master cone,10 and (iii) the remaining part of the simplex $S n ( 1 )$ (zones (iii) are coloured grey in Figure A4.4). It is straightforward to prove that increase of Փ(t) and monotonous convergence towards the quasi‐species is restricted to the optimisation cone [23]. From the properties of the selection equation ((A4.7)) we recall and conclude that the boundaries of the simplex $S n ( 1 )$ are invariant sets. This implies that no orbit of the differential equation (A4.21) can cross these boundaries. The boundaries of $S n ( 1 )$, on the other hand, are not invariant but they can be crossed exclusively in one direction: (p.687) from outside to inside.11 Therefore, a solution curve starting in the optimisation cone or in the master cone will stay inside the cone where it started and eventually converge towards the quasi‐species, ℓ0.

In zone (ii), the master cone, all variables ξk except ξ0 are negative and ξ0 is larger than one in order to fulfill the L(1)-norm condition $∑ k = 0 n − 1 ξ k = 1$. In order to analyse the behaviour of Փ(t) we split the variables into two groups, ξ0 the frequency of the quasi‐species and the rest [23], k; k = 1,&,n − 1} with $∑ k = 1 n − 1 ξ k = 1 − ξ 0$:

$Display mathematics$

Next we replace the distribution of λk values in the second group by a single λ‐value, $λ ˜$ and find:

$Display mathematics$

After a few simple algebraic operations we find eventually

(A4.28)
$Display mathematics$

For the master cone with ξ0≥1, this implies dՓ(t)/dt≤0, as the flux is a non‐increasing function of time. Since we are only interested in the sign of dՓ/dt, the result is exact, because we could use the mean value $λ ˜ = λ ̄ = ( ∑ k = 1 n − 1 λ k ξ k ) / ( 1 − ξ 0 )$, the largest possible value λ1 or the smallest possible value λn−1 without changing the conclusion. Clearly, the distribution of λk‐values matters for quantitative results. It is worth mentioning that equation (A4.28) applies also to the quasi‐correct coneand gives the result that Փ(t) is non‐decreasing. Decrease of mean fitness or flux Փ(t) in the master cone is readily illustrated. Consider, for example, a homogeneous population of the master sequence as the initial condition: xm(0)=1 and Փ(0)=fm. The population becomes inhomogeneous because mutants are formed. Since all mutants have lower replication constants by definition, (fifm∀i≠ m), Փ becomes smaller. Finally, the distribution approaches the quasi‐species ℓ0 and limt→∞Փ(t)=λ0fm.

An extension of the analysis from the master cone to zone (iii), where not all ξk values with k≠0 are negative, is not possible. It has been shown by means of numerical examples that dՓ(t)/dt may show non‐monotonous behavior and can go through a maximum or a minimum at finite time [23].

## (p.688) 6.5 Mutation rates and error threshold

In order to illustrate the influence of mutation rates on the selection process we apply (i) binary sequences, (ii) the uniform error rate approximation,

(A4.29)
$Display mathematics$

with dij being the Hamming distance between the two sequences I i and I j, ν the chain length and p the mutation or error rate per site and replication, and (iii) a simple model for the distribution of fitness values known as single peak fitness landscape [24],

$Display mathematics$

which represents a kind of mean field approximation. The mutants with the master sequence I 1 are ordered in mutant classes: the zero‐error class contains only the reference sequence (I 1), the one‐error class comprises all single‐point mutations, the two‐error class all double‐point mutations, etc. Since the error rate p is independent of the particular sequence and all molecules belonging to the same mutant class have identical fitness values fk, it is possible to introduce new variables for entire mutant classes Γk:

(A4.30)
$Display mathematics$

The mutation matrix Q has to be adjusted to transitions between classes [24,25]. for mutations from class Γl into Γk we calculate:

(A4.31)
$Display mathematics$

The mutation matrix Q for error classes is not symmetric, Qkl≠ Qkllk as follows from equation (A4.31).

A typical plot of relative concentrations against error rate is shown in Figure A4.5. At vanishing error rates, lim p→0, the master sequence is selected, $lim t → ∞ y 0 ( t ) = y ̄ = 1$, and all other error classes vanish in the long time limit. Increasing error rates are reflected by a decrease in the stationary relative concentration of the master sequence and a corresponding increase in the concentration of all mutant classes. Apart from $y ̄ 0 ( p )$ all concentrations $y ̄ k ( p )$ with k〈ν/2 go through a maximum and approach pairwise the curves for $y ̄ v − k$ at values of p that increase with p. At p = 0.5 the eigenvalue problem can be solved exactly: the largest eigenvalue is strictly positive λ0 〉 0: it corresponds to an eigenvector ℓ0, which is the uniform distribution in relative stationary concentrations $x ̄ 1 = x ̄ 2 = … = x ̄ n = 1 / n$, and

(p.689)

Figure A4.5 Error thresholds in the quasi‐species model.

The figures show the stationary distribution of relative concentrations of mutant classes as functions of the error rate, $y ̄ k ( p )$, for sequences of chain length ν = 20. The population on a single peak landscape (upper part, σ = 2) gives rise to a sharp transition between the ordered regime, where relative concentrations are determined by fitness values fk and mutation rates Qkl (A4.31), and the domain of the uniform distribution where all error classes are present proportional to the numbers of sequences in them, $| Γ k | = ( v k )$. The colour code is chosen such that the error classes with the same frequency, for example Γ0 and Γν, Γ1 and Γν−1, etc., have identical colours and hence curves with the same colour merge above threshold. The population on a hyperbolic fitness landscape (lower part, σ = 1.905) shows a smoother transition that can be characterised as weak error threshold. Careful observation shows that the coalescence of curves with different colours at p ≈ 0.05 is accidental since they diverge again at higher error rates.

this implies $y ̄ k = ( v k )$ for the class variables. The uniform distribution is a result of the fact that at p = 0.5 = 1 − p correct digit replication and errors are equally probable (for binary sequences) and therefore we may characterise this scenario as random replication. All other eigenvalues vanish at p = 0.5: λ12=&=λn−1=0.

(p.690) The mutant distribution $y ̄ ( p )$ comes close to the uniform distribution already around p≈0.035 in Figure A4.5, and stays constant for the rest of the p values (0.035〈 p 〈0.5). The narrow transition from the ordered replication (0〈p〈0.035) to random replication (p〉0.035) is called the error threshold. An approximation based on neglect of mutational back‐flow and using ln(1−p)≈−p yields a simple expression for the position of the threshold [9]:

(A4.32)
$Display mathematics$

The equation defines a maximal error rate p max above which no ordered — non‐uniform — stationary distributions of sequences exist (see also section 7). In the current example (we calculate p max=0.03466 in excellent agreement with the value observed in computer simulations. RNA viruses commonly have mutation rates close to the error threshold [26]. Error rates can be increased by pharmaceutical drugs interfering with virus replication and accordingly a new antiviral strategy has been developed, which drives virus replication into extinction either by passing the error threshold [28] or by extinction. Recently, the mechanism of lethal mutagenesis in virus infections has been extensively discussed [29,30].

Several model landscapes describing fitness by a monotonously decreasing function of the Hamming distance from the master sequence, f(d), are often applied in population genetics, examples are:

$Display mathematics$

Interestingly, all three model landscapes do not sustain sharp error thresholds as observed with the single peak landscape. On the hyperbolic landscape the transition is less sharp than on the single peak landscape and may be called weak error threshold. The linear and quadratic landscapes show rather gradual and smooth transitions from the quasi‐species towards the uniform mutant distribution (Figure A4.6). Despite the popularity of smooth landscapes in populations genetics, they are not supported by knowledge derived from biopolymer structures and functions. In contrast, the available data provide strong evidence that the natural landscapes are rugged and properties do not change gradually with Hamming distance.

In order to generalise the results derived from model landscapes to more realistic situations, random variations of rate constants for individual sequences were superimposed upon the fitness values of a single peak landscape — whereby the mean value (p.691) $f ̄ − m$ was kept constant [31, pp. 29–60]. Then, the curves for individual sequences within an error class differ from each other and form a band that increases in width with the amplitude of the random component. Interestingly, the error threshold phenomenon is thus retained and the critical value p max is shifted to lower error rates. Another very general approach to introduce variation into the value matrix without accounting for the underlying chemical reaction mechanism was taken by Walter Thirring and coworkers [32]. They also found a sharp transition between ordered and disordered domains.

## 6.6 Population entropies

Population entropies are suitable measures for the width of mutant distributions. For steady states they are readily computed from the largest eigenvector of matrix W:

(A4.33)
$Display mathematics$

where the expression on the right‐hand side refers to mutant classes.The pure state at p = 0 has zero entropy, S(0) = 0. For the uniform distribution the entropy is maximal, and for binary sequences we have

Figure A4.6 Smooth transitions in the quasi‐species model.

The two figures show stationary mutant distributions as functions of the error rate, $y ̄ k ( p )$, for sequences of chain length ν= 20. The upper figure was calculated for a linear landscape (σ = 1.333), the lower figure for a quadratic landscape (σ = 1.151) of fitness values.The transitions are smooth in both cases.

(p.692)

Figure A4.7 Population entropy on different fitness landscapes.

The plot shows the population entropy as functions of the error rate, S(p), for sequences of chain length ν = 20.The results for individual landscapes are colour coded: single peak landscape black, hyperbolic landscape red, linear landscape blue and quadratic landscape green.The corresponding values for the superiority of the master sequence are σ = 2, 1.905, 1.333 and 1.151, respectively.

$Display mathematics$

Between these two extremes, 0≤ p≤0.5, the entropy is a monotonously increasing function of the error rate, p. Figure A4.7 shows the entropy S(p) on the four model landscapes applied in Figures A4.5 and A4.6. The curves reflect the threshold behaviour encountered in the previous paragraphs (Figures A4.5 and A4.6): the entropy on the single peak landscape makes a sharp kink at the position of the error threshold, the curve for the entropy on the hyperbolic landscape has a similar bend at the threshold but the transition is smoother, whereas the entropies for the two other landscapes are curved differently and approach smoothly the maximum value, Smax=ν ln 2.

## 6.7 Lethal mutants

It is important to note that a quasi‐species can exist also in cases where the Perron—Frobenius theorem is not fulfilled. As an example we consider an extreme case of lethal mutants: only genotype I 1 has a positive fitness value, f1 〉 0 and f2 = & = fn = 0, and hence only the entries W k1=Q k1f1 of matrix W are non‐zero:

$Display mathematics$

Accordingly, W is not primitive in this example, but under suitable conditions x̄ = (Q11, Q21,…,Qn1) is a stable stationary mutant distribution and for — correct replication occurs more frequently than a particular (p.693) mutation — genotype I 1 is the master sequence. On the basis of a rather idiosyncratic mutation model consisting of a one‐dimensional chain of sequences the claim was raised that no quasi‐species can be stable in the presence of lethal mutants and accordingly, no error thresholds could exist [33]. Recent papers [30,34], however, used a realistic high‐dimensional mutation model and presented analytical results as well as numerically computed examples for error thresholds in the presence of lethal mutations.

In order to be able to handle the case of lethal mutants properly we have to go back to absolute concentrations in a realistic physical setup, the flowreactor applied in section 1 and shown in Figure A4.1. We neglect degradation and find for I 1 being the only viable genotype:12

(A4.34)
$Display mathematics$

Computation of stationary states is straightforward and yields two solutions, (i) the state of extinction with ā=a0 and $a ̄ = a 0$, and (ii) a state of quasi‐species selection consisting of I 1 and its mutant cloud at the concentrations $a ̄ = r / ( Q 11 k 1 ) , c ̄ 1 = Q 11 a 0 − r / k 1$ and $c ̄ i = c ̄ 1 ( Q i 1 / Q 11 )$ for i=2,&,n.

As an example we compute a maximum error rate for constant flow, r=r0, again applying the uniform error rate model (A4.29):

$Display mathematics$

where di1 again is the Hamming distance between the two sequences I i and I 1. Instead of the superiority σ of the master sequence — which diverges since f −m = 0 because of f2=&=fn=0 — we use the dimensionless carrying capacity η, which can be defined to be

$Display mathematics$

for the flowreactor. The value of p, at which the stationary concentration of the master sequence $c ̄ ( p )$ and those of all other mutants vanishes, represents the analogue of the error threshold (32), and for the sake of clearness it is called the extinction threshold. Using ln(1−p)≈−p again we obtain:

(A4.35)
$Display mathematics$

(p.694)

Figure A4.8 Lethal mutants and replication errors

The model for lethal mutants corresponding to a single peak landscape with k 1 = 1 and k2 = … = k n = 0 is studied in the flowreactor. The concentrations of the master sequence (black) and the mutant classes (red, dark orange, light orange, etc.; full lines) are shown as functions of the error rate p. For the purpose of comparison the parameters were chosen with ν = 20, r = 1, a 0 = 2, and η = 2. The plots are compared to the curves for the master sequence (grey; broken curve) and the one error class (light red; broken curve) in figure A4.5 (single peak landscape) with f 1 = 2, f 2 = … = f n = 1, ν = 20, and σ = 2.

The major difference between the error threshold (A4.32) and the extinction threshold (A4.35) concerns the state of the population at values pp max: replication with non‐zero fitness of mutants leads to the uniform distribution, whereas the population goes extinct in the lethal mutant case. Accordingly, the transformation to relative concentrations fails and equation ((A4.7)) is not applicable. In Figure A4.8 we show an example for the extinction threshold with ν=20 and η=2. For this case the extinction threshold is calculated from (A4.35) to occur at p max=0.03466 compared to a value of 0.03406 observed in computer simulations. In the figure we see also a comparison of the curves for the master sequence and the one error class for the single peak landscape and the lethality model. The agreement of the two curves for the master sequences is not surprising since the models were adjusted to coincide in the values $c ̄ ( 0 ) = 1$ and p max=ln 2/20. The curves for the one‐error classes show some difference that is due to the lack of mutational backflow in the case of lethal variants.

# 7. Limitations of the approach

An implicit assumption of the mathematical analysis of Darwinian selection presented here is the applicability of kinetic differential equations to describe selection and mutation in populations. In principle the approach by ordinary differential (p.695) equations (ODEs) neglects finite size effects and hence is exact in principle for an infinite population size only. Biological populations, however, may be small and low–frequency mutants may be present often in a single copy or very few copies only. The uniform distribution at error rates above the threshold can never be achieved in reality because the numbers of possible polynucleotide sequences — 4ν yielding, for example, 6 × 1045 sequences of tRNA length — are huge compared to typical populations ranging from 106 to 1015 individuals in replication experiments with bacteria, viruses or RNA molecules. Typical situations in biology may thus differ drastically from scenarios in chemistry where large populations are distributed upon a few chemical species. Are the results derived from the differential equations then representative for real systems? Two situations can be distinguished: (i) individual mutations are rare events and it is extremely unlikely that the same mutation will occur twice or is precisely reversed after it has occurred, and (ii) mutations are sufficiently frequent and occur in both directions within the time of observation. The second scenario is typical for virus evolution and in vitro evolution experiments with molecules. The first case seems to be fulfilled with higher organisms. Bacteria may be in an intermediate situation.

In scenario (i), i.e. at low mutation rates, the exact repetition of a given mutation is of very low probability. Back‐mutations, precise inversions of mutations, are also of probability zero for all practical purposes. So‐called compensatory mutations are known, but they are not back‐mutations, they are rather caused by second mutations that compensate the effect of the first mutation. A phenomenon called Muller's ratchet [35] in population genetics becomes effective in finite populations. Since lost mutants are not replaced, all variants starting with the fittest one will disappear sooner or later, and it is a only matter of time before a situation is reached where all genotypes have been replaced by others no matter what their fitness values were. For a comparison between the error threshold phenomenon and Muller's ratchet see [33].

The frequent mutation scenario (ii) allows for modelling and studying the kinetic equations of reproduction and selection as stochastic processes [25, 36, 37, 38] — examples are multitype branching or birth‐and‐death processes — as well as for computer simulations [39] (for an overview of stochastic modeling see, for example [40]). In essence, the solutions of stochastic models are time‐dependent probability distributions instead of solution curves. The mean or expectation value of the distribution coincides with the deterministic (ODE) solution, since all reactions in the kinetic model are (pseudo) first order. The (relative) width of the distribution increases with growing mutation rate and deceasing population size, and the error threshold phenomenon is reproduced as a superposition of error propagation and finite size effects. The expression for the error threshold can be readily extended to finite populations [25]. Formation of stable quasi‐species requires a replication fidelity that is higher the smaller the population size.

(p.696) At error rates above threshold the kinetic ODEs predict the uniform distribution of sequences as stationary solution of equation (A4.21).13 Differences in fitness values do not matter under these conditions and there is no preferred master sequence. Realistic populations are far too small to form uniform distributions of sequences and hence the deterministic model fails. Below threshold the quasi‐species can be visualised as a localisation of the population in some preferred region of sequence space with high fitness values (or at least one particularly high fitness value) [18, 41]. Above threshold the population is no more localised and drifts randomly in sequence space.14 At the same time, populations are also too small to occupy a coherent region in sequence space and break up into smaller clones, which migrate in different directions as described for the neutral evolution case [3,44].

How relevant is the error threshold in realistic situations? According to the results presented in section 6.5 the question boils down to an exploration of natural fitness landscapes: are biopolymer landscapes rugged or smooth? All evidence obtained so far points towards a rather bizarre structure of these landscapes. Single nucleotide exchanges may lead to large effects, small effects or no consequences at all, as in the case of neutral mutations. Since biomolecules are usually optimised with respect to their functions within an organism, most mutations have deleterious effects or no effect. Biopolymer landscapes have three characteristic features, which are hard to visualize: (i) high dimensionality (ii) ruggedness and (iii) neutrality. Where equally fit genotypes are nearest or next nearest neighbors in sequence space they form joint quasi‐species as described in [23]. When they are not closely related, however, neutral evolution in the sense of Motoo Kimura is observed [45]. In the case of neutrality in genotype space a selection model can still be formulated in phenotype space [46, 47]. The variables are concentrations of phenotypes that are obtained by lumping together all concentrations of genotypes, which form the same phenotype. Then, an analysis similar to the one presented here can be carried out. The genotypic error threshold is relaxed and the system gives rise to a phenotypic error threshold below which the fittest or master phenotype is conserved in the population. The ODE model is readily supplemented by a theory of phenotype evolution based on new concept of evolutionary nearness of phenotypes in sequence space [4, 48, 49], which is confirmed by computer simulations of RNA structure optimization in a flowreactor of the type shown in Figure A4.1 [4, 48, 50]. Random drift of populations on neutral subspaces of sequence space has also been dealt with [50]. A series of snapshots shows the spreading of a population that breaks up into individual clones in (p.697) full agreement with earlier models [44, 51]. Computer simulations were also successful in providing evidence for the occurrence of error thresholds in stochastic replication‐mutation systems [52].

## Notes:

(2) We do not consider degenerate or neutral cases, di=dj and ki=kj, here (see also section A4.7).

(3) Care is needed for the application of relative co‐ordinates, because the total concentration c(t) might vanish and then relative co‐ordinates become spurious quantities (see subsection 6.7).

(4) Generalisation to arbitrary but finite population sizes c≠1 is straightforward: For $∑ i = 1 n x i ( 0 ) = c 0$ the equation Generalisation to arbitrary $x · t = f i x i − ( x i / c 0 ) ∑ j = 1 n f j x j , i = 1 , 2 , … , n$ plays the same role as equation ((A4.7)) did for $∑ i = 1 n x i ( 0 ) = 1$.

(5) The assumption for Fisher's equation is based on insensitivity of phenotypes to the origin of the parental alleles on chromosomes. Phenotypes derived from genotype a i a j are assumed to develop the same properties, no matter which allele, a i a j, on the chromosomal locus comes from the mother and which comes from the father. New results on genetic diseases have shown, however, that this assumption can be questioned.

(6) Symmetry in the direction of mutations is commonly not fulfilled in nature. It is introduced as a simplification, which facilitates the construction of computer models for equation (A4.21). Moreover, the assumption of a symmetric mutation matrix Q is not essential for the analytic derivation of solutions.

(7) We introduce here an asymmetry in numbering rows and columns in order to point at the special properties of the largest eigenvalue λ0 and the dominant eigenvector l 0.

(8) A square non‐negative matrix T=tij;i,j=1,&,n; tij≥0 is called primitive if there exists a positive integer m such that Tm is strictly positive: Tm, which implies $T m = { t i j ( m ) ; i , j = 1 , … , n ; t i j ( m ) 〉 0 }$

(9) A square non#x2010;negative matrix T=tij;i,j=1,&,n; tij≥0 is called irreducible if for every pair (i,j) of its index set there exists a positive integer mij≡ m(i,j) such that $t i j m i j 〉 0$. An irreducible matrix is called cyclic with period d if the period of (all) its indices satisfies d〉1, and it is said to be acyclic if d=1.

(10) The exact geometry of the optimisation cone or the master cone is a polyhedron that can be approximated by a pyramid rather than a cone. Nevertheless we prefer the inexact notion cone because it is easier to memorise and to imagine in high‐dimensional space.

(11) This is shown easily by analysing the differential equation, but follows also from the physical background: no acceptable process can lead to negative particle numbers or concentrations. It can, however, start at zero concentrations and this means the orbit begins at the boundary and goes into the interior of the physical concentration space, here the simplex $S n ( 1 )$.

(12) We use ki for the rate constants as in section 1, since a(t) is a variable here.

(13) As mentioned before the uniform distribution is the exact stationary solution of equation ((A4.21)) for equal probabilities of correct and incorrect incorporation of a nucleotide, which is the case at an error rate p=1−p=0.5 for binary sequences.

(14) The mutation rate can be seen as an analogue to temperature in spin systems and the error threshold corresponds to a phase transition. The relation between the selection‐mutation equation and spin systems was studied first by Ira Leuthäusser [42,43]