• Non ci sono risultati.

Small world effects in evolution

N/A
N/A
Protected

Academic year: 2021

Condividi "Small world effects in evolution"

Copied!
9
0
0

Testo completo

(1)

Small world effects in evolution

Franco Bagnoli*

Dipartimento di Matematica Applicata, Universita` di Firenze, Via S. Marta, 3 I-50139 Firenze, Italy and INFM, INFN, Sezione di Firenze, Largo E. Fermi, 2 I-50125 Firenze, Italy

Michele Bezzi†

SISSA Programme in Neuroscience, Via Beirut 2-4 I-34103, Trieste, Italy and INFM, Sezione di Firenze, Largo E. Fermi, 2 I-50125 Firenze, Italy

共Received 31 July 2000; revised manuscript received 9 February 2001; published 26 July 2001兲 For asexual organisms point mutations correspond to local displacements in the genotypic space, while other genotypic rearrangements represent long-range jumps. We investigate the spreading properties of an initially homogeneous population in a flat fitness landscape, and the equilibrium properties on a smooth fitness land-scape. We show that a small-world effect is present: even a small fraction of quenched long-range jumps makes the results indistinguishable from those obtained by assuming all mutations equiprobable. Moreover, we find that the equilibrium distribution is a Boltzmann one, in which the fitness plays the role of an energy, and mutations that of a temperature.

DOI: 10.1103/PhysRevE.64.021914 PACS number共s兲: 87.10.⫹e, 05.20.⫺y, 05.45.⫺a

I. INTRODUCTION

Darwinian evolution on asexual organisms acts by two mechanisms: mutations, which increase the genetic diversity, and selection, which fixes and reduces this diversity. We classify mutations into point mutations, corresponding to lo-cal displacements on the genotypic space 共defined more ac-curately in the following兲, and other types of mutation or rearrangement, which imply larger jumps. Mutations are quite rare, so we can safely assume that for each generation at most only one mutation occurs. From a biological point of view, this discussion applies to unicellular asexual organ-isms, for which there is no distinction between somatic and germ cells. Moreover, we do not consider the possibility of recombination 共exchange of genetic material兲. For nonre-combinant 共asexual兲 organisms, the combined effects of re-production and mutations correspond to a random walk on the genotypic space. Even sexual 共recombinant兲 populations do transmit asexually some part of their genotype, such as mithocondrial or y-chromosome DNA, to which the follow-ing analysis applies. Furthermore, we assume a constant en-vironment.

We schematize the effects of selection by the selective reduction of the survival probability. Selection results from several constraints and can affect the frequency of certain genotypes, or even eliminate them from the population. We classify the components of selection into static and dynamic ones. In the first class we put the constraints that are inde-pendent of the population distribution, such as the function-ality of a certain protein or the tuning of a metabolic path. The static part of the selection is equivalent to the concept of ‘‘fitness landscape’’关1,2兴. The dynamic part of the selection originates from the competition among individuals. We as-sume in this schematization that the competition arises only

because the population shares some external resource that is evenly distributed, i.e., we disregard the effects of interindi-vidual competition关3,4兴. In this limit the effects of competi-tion do not depend on the distribucompeti-tion of genotypes, and simply limit the total size of the population.

The most important effects of the selection can be roughly schematized by assuming that certain genotypes are forbid-den, so that they are eliminated from the accessible space. Let us consider for the moment that evolution takes place in a flat fitness landscape. In this framework, there are no inter-actions among individuals: the evolution is given by the simple superposition of all possible lineages. The probability distribution of the population in the genotypic space at a given time can be obtained by summing over all possible individual ‘‘histories’’ in a way similar to the path integral formulation of statistical mechanics 关5兴. In this view, the phylogenetic lineage of an individual is given by the path connecting the genotypes of ancestors.

This assumption has to fail somewhere, since otherwise evolution would be equivalent to a diffusion process, without anything favoring the formation of species. It is often as-sumed that speciation events are rare and occur in a very short time关6兴, either due to some change in the fitness land-scape 共caused by an external catastrophe like the fall of a large meteorite, or by an internal rearrangement induced by coevolution兲 or because small isolated populations, escaping from competition, are free to explore their genotypic space 共genetic drift兲 and find a path to a higher fitness peak 关7兴. Therefore, it is usually assumed that one can reconstruct the diverging time of two species from their genotypic distance from a single speciation event关8兴.

It is known, however, from recent investigations about the small-world phenomenon 关9兴 that diffusion is seriously af-fected even by a small fraction of long-range jumps关10–12兴. This fact could have dramatic consequences in our under-standing of the large-scale evolution mechanism. If the short-range mutations are dominant, the evolution is equivalent to a diffusion process in the genotypic space. Assuming that the *Electronic address: bagnoli@dma.unifi.it

(2)

fitness landscape is formed by mountains separated by val-leys, and that the crests of the mountains are almost flat, then the large-scale evolution is dominated by the times needed to cross a valley by chance, while the short scale is dominated by the neutral exploration of crests 关6,13兴.

Vice versa, if the long-range mutations are important 共eventually amplified by the small-world mechanism兲, the time needed to connect two genotypes does not depend on their distance nor on the shape of the fitness landscape and the fitness maxima are quickly populated. Moreover, in this scenario, the speciation phenomenon should not be ascribed to the ‘‘discovery’’ of a preexisting niche, but rather to the formation of that niche in a given ecosystem due to internal interactions 共coevolution兲 or external physical changes 共ca-tastrophes兲. After formation, the niche is quickly populated because of long-range mutations. In this framework, allopat-ric speciation关7兴 loses its fundamental importance, and sym-patric speciation due to coevolution关14,15兴 becomes a plau-sible alternative.

In order to evaluate quantitatively the relevance of these speculations, we introduce an individual-based model of an evolving population. We assume that the genetic information 共the genotype兲 of an individual is represented by a binary string x⫽(x1,x2, . . . ,xL) of L symbols xi⫽0,1 共multilocus model with two alleles兲. In this way we are modeling haploid organisms, i.e., bacteria or viruses, or, more appropriately, more archaic, prebiotic entities. The choice of a binary code is not fundamental but certainly makes things easier. It can be justified by thinking of a purine-pyrimidine coding, or of ‘‘good’’ and ‘‘bad’’ alleles for genes 共units of genetic infor-mation兲. In this second version, 0 represents a good gene and 1 a bad one. This is a sort of ‘‘minimal model’’ which is often used to model the evolution of genetic populations 关14–18兴. For bacteria, there are indications that major and minor codons work in such a way关19兴. The genotypic space is thus a Boolean hypercube of L dimensions, and each indi-vidual sits on a corner of this cube, according to its genotype. The 2Lcorners of this hypercube represent all possible geno-types, which are at a maximum distance equal to L.

In the next section, we formalize the model and introduce the mathematical representation of mutation mechanisms. Then, in Sec. III, we consider the consequences of short- and range mutations for a flat fitness landscape. In the long-range case all genotypes are connected, regardless of their mutual distance: this case can be considered the equivalent of a mean-field approximation. We derive analytically the rate of spreadingv and the characteristic spreading time␶of a genetically homogeneous inoculum in the short-range (vs,␶s) and long-range (vl,␶l) cases. In the first case,vs is independent of the genotype length L, andsgrows linearly with L; the opposite happens in the long-range case. Clearly, this different behavior has dramatic evolutionary conse-quences as L becomes large.

In general, however, only a small set of all possible long-range mutations is observed in real organisms. We thus com-pute numerically the rate of spreading v in a mixed

short-range and sparse long-short-range case. A small-world effect can be observed: in the limit of large genotype lengths, a vanish-ing fraction of long-range mutations cooperates with the

short-range ones to give essentially the mean-field results. In Sec. IV, we show that the effects of mutations and selection can be separated in the limit of a very smooth fit-ness landscape and mean-field long-range mutations. In this approximation we obtain analytically the result that the asymptotic probability distribution is a Boltzmann equilib-rium one, in which the fitness plays the role of energy and mutations correspond to temperature. We compute numeri-cally the asymptotic probability distribution for several fit-ness landscapes, finding that the equilibrium hypothesis is verified. As a nontrivial example of small-world effects in evolution, we checked numerically that this scenario still holds for a sparse long-range mutation matrix. Conclusions and perspectives are drawn in the last section.

II. THE MODEL

In order to be specific, let us assume that the genotype of an individual is represented by a Boolean string of length L. In this way, the genotypic space is a Boolean hypercube with 2L nodes.

The genotype of an individual is represented by a string

x⫽(x1,x2, . . . ,xL) of L Boolean symbols xi⫽0,1. Each po-sition corresponds to a locus whose gene has two allelic forms. In this way we are modeling haploid共only one copy of each gene兲 organisms, i.e., bacteria or viruses, or, more appropriately, more archaic, prebiotic entities. The genotype can also be viewed as a spin configuration. In this case we use the symbol ␴i⫽2xi⫺1.

We consider two kinds of mutation: point mutations, which interchange a 0 with a 1, and all other mutations, which do not alter the length of the genotype共transposition and inversions兲. We define the distance d(x,y) between two genotypes x and y as the minimal number of point mutations needed to pass from x to y 共Hamming distance兲:

d共x,y兲⫽

i⫽1 L

共xi⫺yi兲2.

All possible genotypes of length L are distributed on the 2L vertices of a hypercube. A point mutation corresponds to a unit displacement on that hypercube共short-range jump兲.

The occurrence of point mutations in real organisms de-pends on the identity of the symbol and on its position on the genotype; in the present approximation, however, we assume that all point mutations are equally likely. Moreover, since the probability of observing a mutation is quite small, we impose the condition that at most one mutation is possible in one generation. The probability of observing a point muta-tion from genotype y to genotype x is given by the short-range mutation matrix Ms(x,y). Denoting bys the prob-ability of a point mutation per generation, we have

Ms共x,y兲⫽

1⫺␮s if x⫽ys L if d共x,y兲⫽1 0 otherwise. 共1兲

(3)

Other types of mutation correspond to long-range jumps in the genotypic space. The simplest approximation consists in assuming all mutations equiprobable. Let us denote by␮l the probability per generation of this kind of mutation. The long-range mutation matrix Ml(x,y) is defined as

Ml共x,y兲⫽

1⫺␮l if x⫽yl

2L⫺1

otherwise. 共2兲

In the real world, only some kinds of mutation are actually observed. We model this fact by replacing Ml with a sparse matrix Mˆl. We introduce the sparseness index s, which is the average number of nonzero off-diagonal elements of Mˆl. The sum of these off-diagonal elements still gives␮l. In this case Mˆlis a quenched sparse matrix, and Mlcan be consid-ered the average of the annealed version.

After considering both types of mutation, the overall mu-tation matrix is M⫽MlMs or M⫽MˆlMs for the quenched version.

We model our population at the level of the probability distribution of genotypes p⬅p(t), thus disregarding spatial effects. The evolution equation for p is

p

共x兲⫽ A共x兲

y M共x,y兲p共y兲 A ¯ , 共3兲

where the selection function A(x) corresponds to the average reproduction rate of individuals with genotype x, and A¯ ⫽兺xA(x) p(x) is the average reproduction rate of the

popu-lation 关1,2,20,21兴. We write A(x) in an exponential form A(x)⫽exp关V(x)兴, and we denote V(x) the fitness landscape. The selection does not act directly on the genotype, but rather on the phenotype 共how an individual appears to oth-ers兲. The phenotype of a given genotype can be interpreted as an array of morphological characteristics. We consider the simplest case, in which the phenotype is univocally deter-mined by the genotype x, which is not the general case, since polymorphism or age dependence is usually present. The general mapping between genotype and phenotype is largely unknown and is expected to be quite complex. The effects of some genes are additive 共nonepistatic兲, while others can in-teract in a simple 共control genes兲 or complex 共morphologic genes兲 way.

A possible way of approximating these effects is to use the following form for the fitness V(x):

V共x兲⫽H L i

⫽1 LiJ L⫺1

i⫽1 L⫺1 ␴ii⫹1⫹K共x兲, 共4兲 where␩(x) is a random function of x, uniformly distributed between⫺1 and 1 关

(x)(y)

⫽␦xy兴. The ‘‘field’’ term H

represents a nonepistatic contribution to the fitness, in which all genes have equal weight. The ‘‘ferromagnetic’’ term J represents simple interactions between pairs of genes 共even though in general those are not symmetric兲 and corresponds

to a weakly rough landscape. Finally,K modulates a widely rough landscape and can be thought of as an approximation of the effects of complex 共spin-glass-like兲 interactions among genes.

III. SPREADING AND SMALL-WORLD EFFECTS ON A FLAT FITNESS LANDSCAPE

In this section we study the case of a flat fitness landscape 共no selection兲, i.e., H⫽J⫽K⫽0. In this way we are model-ing the evolution on the crest of a mountain, assummodel-ing that all deleterious mutations are immediately lethal 共the part of the genotype that can originate this kind of mutation is not considered兲, and that we can neglect the small variation of fitness along the crest. This landscape is the one usually con-sidered in the theory of neutral evolution关13兴. Let us assume that this crest is colonized by a founder deme genotypically homogeneous 共a ␦ peak in the genotypic distribution兲. We want to obtain the average time needed to populate the crest according to the different mutation schemes. Since the fitness landscape is flat, A¯ is a constant.

We are interested in the behavior of ␳(t), which is the average distance of the population from a given starting genotype x0, i.e.,

共t兲⫽

x

p共x,t兲d共x,x0兲. 共5兲

We introduce the spreading velocityv as

v⫽⳵␳

t

t⫽0

.

In the Appendix we obtain the spectral properties of the mutation matrices Msand Ml. The corresponding spreading velocities in the limit of small time interval t compared to the characteristic spreading time ␶s⫽L/2s 共short range兲 and ␶l⫽1/␮l共long range兲 are vs⫽␮s, Eq.共A3兲, and vl⫽Ll/2, Eq.共A4兲. If short- and long-range mutations coexist, one has

v⫽vs⫹vl, Eq.共A5兲.

One can approximate the behavior of real biological sys-tems by considering a mixture of short-range mutations, which occur with a relatively high frequency, and sparse long-range mutations, with sparseness index s.

We investigated the sparse case by numerical simulations, for some genotype lengths L. As shown in Fig. 1, as soon as the sparseness index s⬎0, the numerical value of v becomes very close to the mean-field one, Eq. 共A5兲. Notice that the average distance from the inoculum, ␳(t), is rather insensi-tive to the distribution of p. The actual distribution can be quite different from the one obtained with the mean-field matrix Ml.

This transition may be interpreted as an indication of a small-world effect, and that there exists a first-order transi-tion at s⫽0 关22兴. However, the standard deviation of the spreading velocity, S(v), appears to diverge at s⫽0 with an

exponent 1/2, as shown in Fig. 2.

These results suggest the presence of a small-world phe-nomenon in evolution: the rare and sparse long-range

(4)

muta-tions cooperate synergetically with the short-range ones to give essentially a mean-field effect. We checked that the pre-vious results hold also for a nonflat共but smooth兲 fitness land-scape (␮s⫽10⫺5, ␮l⫽10⫺4 and H⫽10⫺4, J⫽0, K⫽0). Again, as soon as s⬎0, the spreading velocity increases from the short-range values to the mean-field ones.

Our findings agree qualitatively with those in the litera-ture 关10,11兴, although we used a slightly different setup. In-stead of rewiring links we added long-range jumps and our mutation matrix共which is the adjacency matrix of the prob-lem兲 in general is not symmetric for the long-range part. Moreover, we are more interested in metric properties 共the spreading velocity兲 than geometrical ones 共the ‘‘chemical distance’’兲.

IV. EQUILIBRIUM PROPERTIES AND SMALL-WORLD EFFECTS ON A SMOOTH FITNESS LANDSCAPE In the following we shall study the effects of the coopera-tion between short-range and sparse long-range mutacoopera-tions on the equilibrium properties of the model.

Let us study the model in the presence of a smooth static

fitness landscape. In this case the fitness A does not depend explicitly on the distribution p, and Eq.共3兲 can be linearized by using unnormalized variables z(x,t) satisfying

z共x,t⫹1兲⫽

y

A共y兲M共x,y兲z共y,t兲, 共6兲 with the correspondence

p共x,t兲⫽ z共x,t兲

y

z共y,t兲 .

In vectorial terms, Eq. 共6兲 can be written as

z共t⫹1兲⫽MAz共t兲, 共7兲

where Mxy⫽M(x,y) and Axy⫽A(x)xy.

When one takes into consideration only point mutations (M⬅Ms), Eq. 共6兲 can be read as the transfer matrix of a two-dimensional Ising model 关23–25兴, for which the geno-typic element ␴i(t) corresponds to the spin in row t and column i, and z(,t) is the restricted partition function of row t. The effective HamiltonianV 共up to constant terms兲 of a possible genealogical history 兵x(t)其 or 兵␴(t)其 from time 1⭐t⭐T is V⫽

t⫽1 T⫺1

i⫽1 Li共t兲i共t⫹1兲⫹V„x共t兲…

, 共8兲 where␥⫽⫺ln关␮s/(1⫺␮s)兴.

This unusual two-dimensional Ising model has long-range coupling along the row 共depending on the choice of the fit-ness function兲 and ferromagnetic coupling along the time direction 共for small short-range mutation probability兲. In or-der to obtain the statistical properties of the system one has to sum over all possible configurations 共stories兲, eventually selecting the right boundary conditions at time t⫽1. The bulk properties of Eq.共8兲 cannot be reduced in general to the equilibrium distribution of a one-dimensional system, since the transition probabilities among rows do not obey detailed balance. Moreover, the temperature-dependent Hamiltonian 共8兲 does not allow an easy identification between energy and selection, and temperature and mutation, which is naively expected by the biological analogy with an adaptive walk.

A. Long-range mutations

Let us first consider the long-range mutation case. Equa-tion 共7兲, reformulated according to Eq. 共3兲, corresponds to

z共t⫹兲⫽共AMl兲␶z共t兲.

Since it is easier to consider the effects of A and Ml sepa-rately, let us study in what limit they commute. The norm of the commutator on the asymptotic probability distribution p is

兩兩关AMl兴兩兩⫽

xy 兩关AMlxy

p共y兲兩,

FIG. 1. The spreading velocityv vs sparseness s for three values of the genotype length L;l⫽10⫺4, ␮s⫽10⫺3, flat fitness

land-scape. The dashed lines indicate the mean-field value v⫽␮s

⫹L␮l/2. Average over 100 runs.

FIG. 2. The standard deviation S(v) of the spreading velocity v

vs sparseness s for three values of the genotype length L;l

⫽10⫺4,

s⫽10⫺3, flat fitness landscape. The dashed line

(5)

and it is bounded by␮lc, where c⫽maxxy兩Axx⫺Ayy兩. In the

limit␮lc→0 共i.e., a very smooth landscape兲, to first order in c we have

共AMl兲␶⫽AMl⫹O共␶ 2

lc兲A␶⫺1Ml␶⫺1, which is analogous to the Trotter product formula.

When ␶ is of order 1/␮l, Ml␶ is a constant matrix with elements equal to 1/2L, and thus Mlp is a constant

probabil-ity distribution. If␮l is large enough, (AMl)␶⫽AMl␶. The asymptotic probability distribution p˜ is thus propor-tional to the diagonal of A1/␮l:

p

˜共x兲⫽C exp

V共x兲l

, 共9兲

i.e., a Boltzmann distribution with Hamiltonian V(x) and temperature ␮l. This corresponds to the naive analogy be-tween evolution and equilibrium statistical mechanics. In other words, the genotypic distribution is equally populated if the phenotype is the same, regardless of the genotypic distance since we used long-range mutations.

We have checked this hypothesis numerically, by iterating Eq. 共3兲 for a time T large enough to be sure of having reached the asymptotic state. We plotted the logarithm of the probability distribution p˜ (x) versus the value of the Hamil-tonian V(x). We computed the slope 1/e of the linear re-gression. The quantity ␮e is the effective ‘‘temperature’’ of the probability distribution according to the equilibrium hy-pothesis.

The results for the mean-field mutation matrix Ml are shown in Fig. 3. We see that the equilibrium hypothesis is well verified in the limit ␮lⰇc; and that convergence is

FIG. 3. Numerical check for long-range mutations. In the simulations we set L⫽8, ␮l⫽0.1, and␮s⫽0. We varied H 共a,b兲, J 共c,d兲, and

K 共e,f兲, setting all other parameters to zero. 共a兲 H⫽0.01, J⫽0, K⫽0; 共b兲 H⫽0.001, J⫽0, K⫽0; 共c兲 H⫽0, J⫽0.01, K⫽0; 共d兲 H ⫽0, J⫽0.001, K⫽0; 共e兲 H⫽0, J⫽0, K⫽0.1; 共f兲 H⫽0, J⫽0, K⫽0.01. In the figures t indicates the number of generations,e the

(6)

faster for a rough landscape. In all cases the effective tem-perature␮eis close to the expected value, i.e., to the muta-tion rate␮l.

B. Short-range mutations

The above results hold only qualitatively for pure short-range mutations as shown in Fig. 4. The small dispersion of points in the figure implies that the genotypes can be divided into evenly populated groups sharing approximately the same fitness. This is always the case for the additive fitness landscape (H contribution兲, since in this case the position of symbols in the genotype has no influence, and the short-range mutations are able to homogenize the distribution in-side a group. This is only approximately valid for the J contributions, since in this case the fitness depends on pairs of symbols, while mutations act only on single symbols.

However, as shown in Fig. 4共d兲, the homogeneous state is reached for a sufficiently high mutation probability. Finally, the homogeneous state is never reached for the very rough landscape case (H contribution兲, even though the linear re-lation is satisfied on average.

In order to obtain a quantitatively correct prediction, one has to consider that the resulting slope ␮e is related to the second largest eigenvalue ␭1 of the mutation matrix by␮ ⫽1⫺␭1. When the fitness depends only on the ‘‘external field’’ termH, in the asymptotic state the short-range muta-tions connect groups of equal fitness. This fact is reflected by the vanishing dispersion of points in Figs. 4共a兲 and 4共b兲.

In all cases, we expect that in the limit of a very smooth landscape ␮etends toward the limiting value 2␮s/L of Eq. 共A2兲. This limit is reached faster when only the additive H term is present. This implies that for a large genotypes the effective temperature due to short-range mutations is

FIG. 4. Numerical check for short-range mutations. In the simulations we set L⫽8,␮l⫽0, and␮s⫽0.1. We varied H 共a,b兲, J 共c,d兲, and

K 共e,f兲, setting all other parameters to zero. 共a兲 H⫽0.1, J⫽0, K⫽0; 共b兲 H⫽0.01, J⫽0, K⫽0; 共c兲 H⫽0, J⫽0.01, K⫽0; 共d兲 H⫽0, J⫽0.001, K⫽0; 共e兲 H⫽0, J⫽0, K⫽0.01; 共f兲 H⫽0, J⫽0, K⫽0.001. In the figures t indicates the number of generations,e the

(7)

vanishing.

In the opposite case, when the selection is strong, the application of the matrix A ‘‘rotates’’ the distribution p in a way that is essentially random with respect to the Fourier eigenvectors of Ms. Thus, the effective second eigenvalue of the mutation matrix is given by 1⫺␮s, obtained by averag-ing over all the eigenvalues of Eq. 共A2兲. Consequently, we obtain ␮e⯝␮s. This limit is reached faster in the case of a strongly disordered fitness landscape, i.e., in the limit of largeK. When only the J term is present, one observes an intermediate case.

C. Small-world effects

Let us now consider the case of a sparse long-range mu-tation matrix, coupled to a stronger short-range matrix, i.e., ␮sⰇ␮l. We performed simulations with ␮s⫽0.1 and ␮l ⫽0.01, L⫽6, 8, and 10, and varying s. The results are sum-marized in Table I.

The effects of the two kinds of mutation are additive, implying that, for small ␮l, the distributions are visually similar to the short-range case, Fig. 4. However, as soon as s is greater than a threshold sc, the effective temperature␮e increases by the expected long-range contribution ␮l. This increment is very relevant when the effect of short-range mutations is vanishing, i.e., for smooth landscapes (H and J contributions兲 and large genotypes. The effect is less relevant in the disordered case (K contributions兲, since in this case the contribution to the effective temperature by the small-range mutations does not decrease with increasing L.

By increasing the weight of the sparse long-range muta-tions, the probability distribution also becomes similar to the long-range case, Fig. 3. We performed simulations with ␮s ⫽0.1 and␮l⫽0.1, L⫽6,8,10, and varying s. The results are shown in Table I. Notice that the value of sc 共estimated vi-sually from the plot of data兲 is rather insensitive to L and the parameters of the fitness V(x).

We quantified the distance between the resulting distribu-tion and the Boltzmann one Eq.共9兲 through the computation of the average square difference from linear regression, ␹2. In Fig. 5 we show that ␹2 becomes very small as soon as s ⬎sc, which seems to vanish with L→⬁.

This implies that, for large genotypes, a vanishing fraction of long-range mutations, coupled to small-range ones, is suf-ficient to establish the statistical mechanics analogy of selec-tion and mutaselec-tions.

V. CONCLUSIONS

We studied some simple models of asexual populations evolving on a smooth fitness landscape, in the presence of point mutations共small-range jumps in genotypic space兲 and other genetic rearrangements 共long-range jumps兲.

We computed analytically the spreading velocity of an initially homogeneous inoculum on a flat fitness landscape, for the short-range and the long-range mean-field 共all muta-tions equiprobable兲 cases. Since in a real situation only a small set of all possible mutations can occur, we also con-sidered the quenched version of the long-range mutation ma-trix. In this case we showed that a small-world effect is present, since even a small number of quenched long-range jumps makes the results indistinguishable from those ob-tained by assuming all mutations equiprobable. These results still hold for a smooth fitness landscape.

We investigated this issue further, studying the equilib-rium properties of the system in the presence of a smooth fitness landscape. In this framework, it was possible to show that the equilibrium distribution is a Boltzmann one, in which the fitness plays the role of an energy, and mutations that of a temperature. We checked this result numerically for different fitness landscapes, and a mean-field long-range mu-tation mechanism. As in the previous case, a small-world phenomenon appears, since similar results can be obtained TABLE I. Effective temperature␮e for several values of the genotype length L and parameters of the

fitness V(x), Eq.共4兲. Here␮s⫽0.1,␮l⫽0.01; the row labeled H stands for H⫽0.1, J⫽K⫽0, and the row

labeledJ stands for J⫽0.1, H⫽K⫽0, the row labeled K stands for K⫽0.1, H⫽J⫽0. We report the value of␮e

(0)

for sparseness s⫽0 共only small-range mutations兲 and the average value␮¯e for s⬎sc, where scis

estimated visually from the plot of data共see also Fig. 5兲.

V L⫽6 L⫽8 L⫽10 L⫽12 ␮e (0) ␮¯e sce (0) ␮¯e sce (0) ␮¯e sce (0) ␮¯e sc H 0.035 0.045 3 0.025 0.035 3 0.020 0.030 2 0.017 0.027 2 J 0.071 0.082 4 0.053 0.063 4 0.042 0.052 4 0.035 0.045 3 K 0.080 0.105 6 0.090 0.105 6 0.096 0.107 6 0.098 0.111 3

FIG. 5. Scaling of the average square difference from linear regression, ␹2, vs sparseness s, for three values of the genotype

length L;l⫽0.1,␮s⫽0.1, H⫽0, J⫽0, K⫽0.1. Averages taken

(8)

using a combination of sparse long-range and short-range mutations.

We wish to acknowledge our participation in the Dynam-ics of Complex Systems 关26兴 group. The numerical simula-tions were performed using the computational facilities of INFM PAIS 1999-G-IS-Firenze.

APPENDIX: SPECTRAL PROPERTIES OF THE MEAN-FIELD MUTATION MATRICES

Both Ms and Ml, Eqs.共1兲 and 共2兲, are Markov matrices. Moreover, they are circular matrices, since the value of a given element does not depend on its absolute position but only on the distance from the diagonal. This means that their spectrum is real, and that the largest eigenvalue is ␭0⫽1.

Since the matrices are irreducible, the corresponding eigen-vector␰0is nondegenerate, and corresponds to the flat

distri-bution ␰0(x)⫽1/2L. In analogy with circular matrices in the

usual space, one can obtain their complete spectrum using the analog of Fourier transform in a Boolean hypercubic space. Let us define the ‘‘Boolean scalar product’’䉺:

x䉺y⫽i⫽1

L xiyi,

where the symbol 丣 represent the sum modulus two共XOR兲

and the multiplication can be substituted by an AND共which

has the same effect on Boolean quantities兲. This scalar prod-uct is obviously distributive with respect to the XOR:

共xy兲䉺z⫽共x䉺z兲共y䉺z兲.

Note that the operation xy is performed bitwise between

the two genotypes: (xy)i⫽xiyi.

Given a function f (x) of a Boolean quantity x苸兵0,1其L, its ‘‘Boolean Fourier transform’’ is f˜(k) (k苸兵0,1其L)

共k兲⫽ 1

2L

x 共⫺1兲

x䉺kf共x兲.

The antitransformation operation is determined by the defi-nition of the Kronecker delta

k0⫽ 1 2L

x 共⫺1兲 x䉺k, and is given by f共x兲⫽

k 共⫺1兲 x䉺k共k兲.

One can easily verify that the Fourier vectors ␰k(x) ⫽(⫺1)x䉺kare eigenvectors of both M

land Ms, with eigen-values

0⫽1,

k⫽1⫺␮l⫺ ␮l

2L⫺1 共A1兲

for the long-range case, and ␭0⫽1,

k⫽1⫺

2␮sd共k,0兲

L 共A2兲

for the short-range case, where d(x,y) is the Hamming dis-tance between genotypes x and y.

The computation of ␳(t), Eq. 共5兲, is easily performed in Fourier space, using the analog of the Parsifal theorem:

x f共x兲g共x兲⫽

x

k

k 共k兲g˜共k

兲共⫺1兲x䉺(kk⬘) ⫽2L

k

k 共k兲g˜共k

兲␦kk⬘⫽2L

k 共k兲g˜共k兲,

where we have used the property kk

⫽0 if and only if

ki⫽ki

for each component i. Let us denote by en the unit vector along direction n, i.e., (en)i⫽␦ni.

The Fourier transform of the distance d(x,y) is obtained considering that (⫺1)(xy)䉺en gives 1 if x

n⫽yn and ⫺1 otherwise; thus d共x,y兲⫽1 2

Ln

⫽0 L⫺1 共⫺1兲(xy)䉺en

. We obtain d ˜x 0共k兲⫽ 1 2L

x d共x,x0兲共⫺1兲x䉺k ⫽1 2L 1 2

x

L

n⫽0 L⫺1 共⫺1兲(xx0)䉺en

共⫺1兲x䉺kL 2␦k0⫺ 1 2

n

⫽0 L⫺1 共⫺1兲x0䉺en

x 共⫺1兲 x䉺(ken)

L2k0⫺ 1 2 n

⫽0 L⫺1 共⫺1兲x0䉺en ken; i.e., d ˜ x0共k兲⫽

L/2 if k⫽0 ⫺1/2 if k⫽en and 共x0n⫽0 1/2 if k⫽en and 共x0兲n⫽1 0 otherwise.

The probability distribution p(x,t) can be expanded on the eigenvector basis␰k(x) of M:

p共x,0兲⫽

k

akk共x兲⫽

k

(9)

and p共x,t兲⫽Mtp共x,0兲⫽

k akk t k共x兲, i.e., p ˜共k,t兲⫽a kk t . Thus ␳共t兲⫽

k d ˜ x0共k兲p˜共k,t兲⫽

k d ˜ x0共k兲akk tL20 t

n⫽0 L⫺1 1 2aenen t .

If at t⫽0 the distribution is concentrated at x0⫽0 关p(x,0)⫽x0兴 then ak⫽1 for all k. In both the short- and

long-range cases, ␭en does not depend on n 关see Eqs. 共A1兲

and共A2兲兴, and thus

共t兲⫽L2共␭0

t⫺␭

1

t兲. For the short-range case we have

s共t兲⫽ L 2

1⫺

1⫺ 2␮s L

t

L 2

1⫺exp

t ln

1⫺ 2␮s L

冊册冎

L2

1⫺exp

ts

冊册

,

and the characteristic spreading time is ␶s⫽L/2s. For t small compared with␶s (L→⬁) we have

s共t兲⯝st⬅vst. 共A3兲 For long-range mutations we have

l共t兲⫽ L 2

1⫺

1⫺␮l⫺ ␮l enL⫺1

t

L2

1⫺exp

tl

冊册

with␶l⫽1/␮l. For tⰆ␶l(␮l→0) we have

l共t兲⯝ Ll

2 t⬅vlt. 共A4兲

The behavior of ␳(t) for short times, vanishing mutation probabilities, and large genotypes, Eqs. 共A3兲 and 共A4兲, is rather trivial. In these approximations one can neglect back mutations, and obtain the same results on a Cayley tree. However, the full analysis gives the exact behavior of ␳(t) for all times.

In the mixed case one has M⫽MsMl. Since Ms and Ml share the same eigenvectors, ␭⫽␭sl and, in the previous approximations,

v⫽vs⫹vl. 共A5兲

关1兴 R. A. Fisher, The Genetical Theory of Natural Selection 共Do-ver, New York, 1930兲.

关2兴 S. Wright, in Proceedings of the 6th International Congress on Genetics, Ithaca, edited by D.F. Jones 共Brooklyn Botanical Garden, New York, 1936兲, pp. 356–366.

关3兴 F. Bagnoli and M. Bezzi, Phys. Rev. Lett. 79, 3302 共1997兲; e-print cond-mat/9708101.

关4兴 F. Bagnoli and M. Bezzi, Annu. Rev. Comput. Phys. 8, 265 共2000兲; e-print cond-mat/9906164.

关5兴 F. Bagnoli and M. Bezzi, in Path Integrals from peV to TeV, edited by R. Casalbuoni et al. 共World Scientific, Singapore, 1999兲, p. 493.

关6兴 S. Gould and N. Elredge, Paleobiology 3, 114 共1997兲; Nature 共London兲 366, 223 共1993兲.

关7兴 E. Mayr, Populations, Species and Evolution 共Harvard Univer-sity Press, Cambridge, MA, 1970兲.

关8兴 J. Maynard Smith, Evolutionary Genetics 共Oxford University Press, Oxford, 1998兲.

关9兴 D.J. Watts and S.H. Strogatz, Nature 共London兲 393, 440 共1998兲.

关10兴 M. Barthe´le´my and L. A. Nunes Amaral, Phys. Rev. Lett. 82, 3180共1999兲; 82, 5180 共1999兲.

关11兴 A. Barrat and M. Weigt, Eur. Phys. J. B 13, 547 共2000兲. 关12兴 R. Monasson, Eur. Phys. J. B 12, 60 共1999兲.

关13兴 M. Kimura, The Neutral Theory of Molecular Evolution 共Cam-bridge University Press, Cam共Cam-bridge, 1983兲.

关14兴 A. S. Kondrashov and F. A. Kondrashov, Nature 共London兲 400, 351共1999兲.

关15兴 U. Dieckmann and M. Doebeli, Nature 共London兲 400, 354 共1999兲.

关16兴 W. Eigen, Naturwissenschaften 58, 465 共1971兲; W. Eigen and P. Schuster, ibid. 64, 541共1977兲.

关17兴 M. Doebeli, J. Evol. Biol. 9, 893 共1999兲.

关18兴 M. Doebeli, J. Evol. Biol. 9, 893 共1999兲; U. Dieckmann and M. Doebeli, Nature共London兲 400, 354 共1999兲.

关19兴 F. Bagnoli and P. Lio´, J. Theor. Biol. 173, 271 共1995兲. 关20兴 D. L. Hartle, A Primer of Population Genetics, 2nd ed.

共Sinauer, Sunderland, MA, 1988兲.

关21兴 L. Peliti, in Physics of Biomaterials: Fluctuations, Self-Assembly and Evolution, edited by T. Riste and D. Sherrington 共Kluwer, Dordrecht, 1996兲 pp. 287–308; e-print cond-mat/9505003.

关22兴 M. Argollo de Menezes, C. F. Moukarzel, and T. J. P. Penna, Europhys. Lett. 50, 5共2000兲; e-print cond-mat/9903426. 关23兴 I. Leutha¨usser, J. Stat. Phys. 48, 343 共1987兲.

关24兴 P. Tarazona, Phys. Rev. A 45, 6038 共1992兲.

关25兴 T. Wiehe, E. Baake, and P. Schuster, J. Theor. Biol. 177, 1 共1995兲.

Riferimenti

Documenti correlati

Gubler et al [27] in a series of 14 patients treated by OTSC clip placement for acute iatrogenic perforations of the GI tract demonstrated closure of the GI defects is possible

The most relevant difference between the airborne and spaceborne data sets was observed relative to the vertical profiles yielded by the TomoSAR processor, in that the airborne

Biogas and syngas contribution to the total electric power production for the integrated plants exploiting the manure from all the considered animal families for

La composizione dei sistemi di involucro eseguiti nei confronti della stazione funiviaria di Punta Helbronner (alla quota di 3.466 m), collocata a monte del collegamento, si

Objective: An observational case–control study was designed to analyse the discriminative value of ultrasound (US)-detected joint effusion compared with physical examination in

Ora, se è vero che l’autore di una decisione cautelare (fors’anche in ragione dell’effettività propria della giurisdizione cautelare) deve poter essere (anche) altro da quello

H&E stained samples showed the presence of perifollicular changes 108. ranging from a slight thickening of basophilic appearance (figure 1a) to