Poincaré Recurrences in Mixed Dynamical Systems and in Genomic Sequences


Università degli Studi di Bologna

Faculty of Mathematical, Physical and Natural Sciences

Ph.D. Programme in Physics, XVIII Cycle, under joint supervision with

the Université du Sud Toulon-Var, France

Ph.D. Thesis


Luca Rossi

Thesis supervisor at the Università di Bologna: Prof. Giorgio Turchetti

Thesis supervisor at the Université du Sud Toulon-Var: Prof. Sandro Vaienti

Coordinator of the Ph.D. Programme in Physics: Prof. Roberto Soldati

Scientific disciplinary sector: FIS/01 – MAT/07

Bologna, March 2006


In this Ph.D. thesis I present the results obtained from the study of Poincaré recurrences for mixed dynamical systems, that is, systems composed of invariant regions, and from an application of such recurrences to the analysis of coding and noncoding genomic sequences. These results have also been published in the papers [1], [2], [3] and [4].

In this respect, I would like to clarify that when I use terms like “we” or “our,” I implicitly refer, besides myself, to Giorgio Turchetti and Sandro Vaienti, who have been my supervisors and the persons I mainly collaborated with.


Introduction

During his study of the three-body problem, Poincaré gave the proof of the following theorem, which appeared in the famous memoir [5] of 1890:

If a flow preserves volume and has only bounded orbits then for each open set there exist orbits that intersect the set infinitely often.

It is well known that this theorem played a crucial role in the development of statistical mechanics at the end of the nineteenth century. The opponents of the atomistic hypothesis considered it one of the strongest arguments against the possibility of regarding matter as a collection of particles moving according to Newton’s laws of dynamics. In fact, this would have contradicted the laws of thermodynamics, which are well confirmed by experience. However, a few years later Boltzmann was able to reconcile these two apparently incompatible conceptions of matter, and this helped to clarify some of the most controversial conceptual issues in the framework of statistical mechanics.

Recently, the study of Poincaré recurrences has received growing attention, above all within the theory of dynamical systems (see Refs. [6] and [7] for an overview). This is mainly due to the fact that Poincaré recurrences may be used to investigate the ergodic and statistical properties concerning the global dynamics of a system over wide regions of the phase space (see, for example, Refs. [8] and [9]). In this respect, changes in the recurrence statistics have been observed for transitions from normal to anomalous transport (see Refs. [10] and [11]). Thanks to Poincaré recurrences it is also possible to compute the metric entropy of a system equipped with an ergodic measure [12]. Moreover, they seem to be connected, at least for particular kinds of dynamical systems, to other quantities used to describe the fractal properties of the dynamics [13].

In recent years the statistics of first return times has been extensively studied, and rigorous results have been obtained in two different situations: for systems with strong mixing properties and for zero-entropy systems like irrational rotations. For these two cases the global features of recurrence are fairly well understood.

In fact, for a wide class of strongly mixing systems it has been proved (see Refs. [8] and [14]–[21]) that the limit recurrence statistics is exponential, even if the systems are not uniformly hyperbolic. However, this result was obtained by choosing the shrinking set as a ball whose radius goes continuously to zero, or as a cylinder originating from a dynamical partition of the phase space. Moreover, the convergence to the function e^{−t} holds for almost every point of the phase space of these systems: for a point which is not generic (for example, a periodic point), it has been proved that the limit statistics can be different.

Contrary to the behaviour enjoyed by the strongly mixing systems, for the one-dimensional irrational rotations of the circle there are at most three possible first returns in each subset [22, 23]. This prevents the existence of any limit recurrence statistics, unless the shrinking sets are chosen in a very particular class of intervals with strong arithmetic properties [24, 25].

A recent paper [26] shows that for all aperiodic ergodic dynamical systems any kind of distribution can be obtained, provided that the decreasing sequence of sets is chosen suitably around all points, but in general such sets will not be balls or cylinders.

Unfortunately, for systems of higher physical importance, like low-dimensional Hamiltonian systems, very little is known, despite the interest in obtaining analytical results. In this respect, by studying a model of the hyperbolic part of the phase space of a Hamiltonian system near a hierarchical islands structure, lower and upper bounds were produced for the limit statistics of first return times in terms of a power law [27]. This example worked out a self-similar structure of the phase space, in the same spirit as the model proposed in Refs. [28]–[30] for the dynamics of sticky sets in Hamiltonian systems.

Although it is difficult to obtain rigorous results, interesting indications about the behaviour of significant systems may come from careful numerical investigations. In this regard, the intensive numerical studies performed by several authors suggest that, in the thin stochastic layer surrounding a chain of islands, the decay of Poincaré recurrences could follow a power law due to the sticking phenomenon, which is believed to be responsible for the anomalous diffusion modeled by Lévy-like processes [11, 31].

Furthermore, a mixture of exponential and power law decays has been observed in a model of stationary flow with hexagonal symmetry, when the transport is anomalous [10].

1.1 Poincaré recurrences

Before introducing the notion of Poincaré recurrences, I would like to briefly recall some of the basic definitions concerning dynamical systems.

In this respect, let us consider a dynamical system (Ω, T, µ), where T is a transformation defined on the phase space Ω, and µ represents a probability measure, that is µ(Ω) = 1.

Definition 1.1 (invariant measure) Given a dynamical system (Ω, T , µ), the measure µ is said to be invariant with respect to T if for any measurable set A ⊆ Ω it holds µ(T⁻¹(A)) = µ(A).

The following theorem, due to Poincaré, applies to such dynamical systems, which are therefore also called recurrent systems.

Theorem 1.1 (Poincaré) Let (Ω, T, µ) be a dynamical system whose measure µ is T-invariant. Then, for any measurable set A ⊆ Ω, µ-almost every point x ∈ A returns to A infinitely many times, that is, there exist infinitely many positive integers k such that T^k(x) ∈ A.

Among the possible definitions of ergodicity, the one that developed from statistical mechanics and that was known as Boltzmann’s ergodic hypothesis is probably the most significant from a physical point of view.

Definition 1.2 (ergodic system) Let (Ω, T, µ) be a dynamical system whose probability measure µ is T-invariant. If for any integrable function f we have

$$\lim_{N\to\infty} \frac{1}{N} \sum_{k=0}^{N-1} f\left(T^k(x)\right) = \int_\Omega f(x)\, d\mu \tag{1.1}$$

µ-almost everywhere, the system is said to be ergodic.

An equivalent definition of ergodicity is the following.

Definition 1.3 (ergodic measure) Given a dynamical system (Ω, T, µ) whose probability measure µ is T-invariant, µ is said to be T-ergodic if for any f, g ∈ L²(Ω) we have:

$$\lim_{N\to\infty} \frac{1}{N} \sum_{k=0}^{N-1} \int_\Omega f\left(T^k(x)\right) g(x)\, d\mu = \int_\Omega f(x)\, d\mu \int_\Omega g(x)\, d\mu. \tag{1.2}$$

Finally, there is a class of systems, characterized by a rapid decay of the correlations, which has played an important role in the recent development of the theory of Poincaré recurrences.

Definition 1.4 (strongly mixing system) A dynamical system (Ω, T, µ), whose probability measure µ is T-invariant, is said to be strongly mixing if for any f, g ∈ L²(Ω) it holds:

$$\lim_{k\to\infty} \int_\Omega f\left(T^k(x)\right) g(x)\, d\mu = \int_\Omega f(x)\, d\mu \int_\Omega g(x)\, d\mu. \tag{1.3}$$

Let us now consider the statistics of Poincaré recurrences, also known as the statistics of first return times. To this end, let (Ω, T, µ) be a dynamical system equipped with a T-invariant probability measure. Taking a measurable set A ⊆ Ω and a point x ∈ A, the first return time of x into A is defined as

$$\tau_A(x) = \min\left( \left\{ k \in \mathbb{N} : T^k(x) \in A \right\} \cup \{+\infty\} \right). \tag{1.4}$$

Thus, τ_A(x) is a positive integer or, if x does not return (that is, T^k(x) ∉ A for any k ∈ N), τ_A(x) = ∞. The mean return time into A is given by:

$$\langle \tau_A \rangle = \int_A \tau_A(x)\, d\mu_A, \tag{1.5}$$

where µ_A denotes the conditional measure with respect to A: µ_A(B) = µ(B ∩ A)/µ(A), for any measurable B ⊆ Ω. For ergodic systems Kac proved the following important result.

Theorem 1.2 (Kac) If the dynamical system (Ω, T, µ) is ergodic, then for any set A ⊆ Ω, with µ(A) > 0, it holds:

$$\langle \tau_A \rangle = \frac{1}{\mu(A)}. \tag{1.6}$$
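As an illustration of Kac's theorem, the following minimal sketch estimates the mean return time for an irrational rotation of the circle, which is ergodic with respect to the Lebesgue measure. The rotation number, the set A = [0, eps) and the grid of starting points are illustrative choices, not taken from the text, and the function names are hypothetical.

```python
import math

def first_return_time(x, alpha, eps, max_iter=10**6):
    """First return time of x to A = [0, eps) under x -> x + alpha mod 1."""
    y = x
    for k in range(1, max_iter + 1):
        y = (y + alpha) % 1.0
        if y < eps:
            return k
    return math.inf  # no return observed within max_iter iterations

def kac_mean_return_time(alpha, eps, n_samples=2000):
    """Average of tau_A over a uniform midpoint grid of starting points in A."""
    total = 0.0
    for i in range(n_samples):
        x = (i + 0.5) * eps / n_samples
        total += first_return_time(x, alpha, eps)
    return total / n_samples

alpha = (math.sqrt(5.0) - 1.0) / 2.0   # irrational rotation number (assumption)
eps = 0.05                              # Lebesgue measure of A = [0, eps)
print(kac_mean_return_time(alpha, eps), 1.0 / eps)
```

The two printed numbers should be close to each other, up to the discretization error of the grid.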

We may then introduce the statistics of first return times as

$$F_A(t) = \mu_A\left( \left\{ x \in A : \tau_A(x)/\langle \tau_A \rangle > t \right\} \right). \tag{1.7}$$

One of the main questions is whether the limit statistics,

$$F_{\bar x}(t) = \lim_{\mu(A)\to 0} F_A(t), \tag{1.8}$$

exists when the set A shrinks toward a given point x̄ ∈ Ω. Note that, from now on, I will drop the dependence on the point x̄ in the notation of the limit statistics of first return times, writing it simply as F(t), since it will be clear from the context which point is being considered.

Besides the recurrence statistics, it is possible to define the distribution of first return times:

$$G_{r,A}(t) = \mu_A\left( \left\{ x \in A : \tau_A(x)/\langle \tau_A \rangle \le t \right\} \right), \tag{1.9}$$

denoting by G_r(t) the corresponding limit distribution for µ(A) → 0, when it exists. It is easy to see that G_{r,A}(t) = 1 − F_A(t).

An extension of the notion of statistics of first return times is represented by the distributions of the number of visits, whose definition is based on successive return times. For this purpose, let us consider the following quantity:

$$\xi_A(t; x) = \sum_{j=1}^{\lfloor \langle \tau_A \rangle t \rfloor} \left( \chi_A \circ T^j \right)(x), \tag{1.10}$$

where χ_A is the characteristic function of a measurable set A ⊆ Ω, while the symbol ⌊ · ⌋ represents the integer part function. It is easy to verify that ξ_A(t; x) counts how many times a point x ∈ A returns into A during the first ⌊⟨τ_A⟩ t⌋ iterations of the map T.

We will be mainly interested in the distributions of the number of visits in A,

$$F_{k,A}(t) = \mu_A\left( \left\{ x \in A : \xi_A(t; x) = k \right\} \right), \tag{1.11}$$

in the limit for µ(A) → 0, denoting the limit distributions, whenever they exist, by

$$F_k(t) = \lim_{\mu(A)\to 0} F_{k,A}(t). \tag{1.12}$$

Of particular interest is the distribution of order k = 0; in this case, in fact, Eq. (1.11) gives the statistics (with respect to t) of first return times as defined in Eq. (1.7). In this respect, I would like to remark that I will refer to the recurrence statistics equally as F_A(t) or F_0,A(t).
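The definitions (1.7), (1.10) and (1.11) can be turned into a simple empirical estimator. The sketch below is only an illustration under stated assumptions: the sample points are assumed to be distributed according to µ_A, the mean return time is replaced by its sample average, and the irrational rotation used in the example is an arbitrary choice. The name `visit_statistics` is hypothetical.

```python
import math

def visit_statistics(step, in_A, samples, t, max_iter=10**6):
    """Estimate F_A(t) and the visit distributions F_{k,A}(t) from points of A.

    step    : function implementing one iteration of the map T
    in_A    : indicator function of the set A
    samples : starting points in A, assumed distributed according to mu_A
    """
    # First pass: first return time of every sample point
    # (capped at max_iter if no return is observed).
    taus = []
    for x in samples:
        y, k = step(x), 1
        while not in_A(y) and k < max_iter:
            y, k = step(y), k + 1
        taus.append(k)
    mean_tau = sum(taus) / len(taus)            # empirical <tau_A>, Eq. (1.5)

    # Second pass: number of visits to A within floor(<tau_A> t) iterations.
    horizon = math.floor(mean_tau * t)
    counts = {}
    for x in samples:
        y, visits = x, 0
        for _ in range(horizon):
            y = step(y)
            visits += in_A(y)
        counts[visits] = counts.get(visits, 0) + 1

    n = len(samples)
    F_A = sum(tau / mean_tau > t for tau in taus) / n       # Eq. (1.7)
    F_k = {k: c / n for k, c in sorted(counts.items())}     # Eq. (1.11)
    return F_A, F_k

# Example: rotation by an irrational angle, A = [0, eps).
alpha = (math.sqrt(5.0) - 1.0) / 2.0
eps = 0.05
samples = [(i + 0.5) * eps / 1000 for i in range(1000)]
F_A, F_k = visit_statistics(lambda x: (x + alpha) % 1.0,
                            lambda x: x < eps, samples, t=0.8)
print(F_A, F_k)
```

In this example the estimate of F_{0,A}(t) coincides with that of F_A(t), in agreement with the remark above on the distribution of order k = 0.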

1.2 Main results

The main purpose of the present work has been to study Poincaré recurrences for systems where regular and chaotic motions coexist, trying to give rigorous results when possible or, otherwise, performing accurate numerical investigations.

To this end, G. Turchetti, S. Vaienti and I started by considering, as a model of the dynamics of systems showing regular behaviour, the following skew map defined on the cylinder C = T × [0, 1],

$$R : \begin{cases} x' = x + y \bmod 1, \\ y' = y, \end{cases} \tag{1.13}$$

which is area-preserving with respect to the usual Lebesgue measure, and has zero entropy. Perturbing this simple map leads, according to the KAM theory, to a transformation that is integrable only on a subset of C whose Lebesgue measure approaches one as the amplitude of the perturbation vanishes. As an example, the standard map reduces to R when the coupling parameter goes to zero. Moreover, this transformation has been used to describe the flow on a square billiard [32].

It is interesting to observe that this map describes a shear flow in which, for almost all ordinates y, the dynamics along the fiber placed at y is given by the irrational rotation x′ = x + y mod 1. Since the velocity of rotation is different for each invariant torus T, this map is also referred to as “anisochronous rotations” on the two-dimensional cylinder.

Although irrational rotations are ergodic, the map R is not. However, it enjoys a sort of local mixing property, caused by filamentation, which seems responsible for the existence of limit recurrence statistics.

The first result I obtained for this map was a rigorous proof that the statistics of first return times for a particular kind of domain of C exists and follows an asymptotic power law like t⁻². This also allowed me to prove the existence of the limit recurrence statistics for the fixed points of the map. Furthermore, having developed an algorithm which reproduces the dynamics of R and is characterized by an algebraic computational complexity, I obtained strong numerical evidence that the result on the asymptotic polynomial decay of the statistics is valid even for a generic subset of the cylinder.

Subsequently, I investigated the distributions of the number of visits, which, as seen, represent an extension of the statistics of first return times.

Although R is horizontally almost everywhere foliated by irrational rotations (which seem to admit piecewise constant limit distributions only if the shrinking domain is chosen in a descending chain of renormalization intervals), the analysis of the distributions of the number of visits suggests the existence of the corresponding limit laws for domains that shrink in an arbitrary way around points of the cylinder. In particular, for square sets containing the fixed points of the map, the distributions present even in this case an asymptotic decay like t⁻².

Since the same features may be found for the distributions computed by assuming that the differences between successive return times are independent, we believe that, although our skew map is not ergodic, the local mixing property it enjoys plays some role here. Moreover, through accurate numerical investigations I verified that an asymptotic power law decay like a t^{−β} also holds for arbitrary rectangular domains not containing the fixed points, although in this case the exponent β is usually greater than two and seems to grow along with the order k.

This will be discussed in detail in Chapter 2.

In Chapter 3, instead, I will present the results concerning Poincaré recurrences for mixed systems, that is, systems composed of regions that are invariant with respect to the dynamics. We proved that the distributions of the number of visits (which include, of course, the statistics of first return times) for a domain A that intersects the boundary between two invariant regions are a linear superposition of the distributions characteristic of such regions, weighted by coefficients equal to the relative size of the intersection of A with each invariant region. Under a condition of continuity, this also holds in the limit when A shrinks toward a point of the boundary. I checked this result numerically for a system whose invariant components are two strongly mixing maps, verifying that the general formula obtained describes very well the behaviour of the distributions of the number of visits for domains crossing the two invariant regions.

Such a formula also helps to explain why, by coupling a generic regular system and a mixing system together (whose distributions are assumed to follow a polynomial and an exponential decay respectively), the regular region asymptotically gives the main contribution, which appears as a power law tail, to the distributions computed for domains of finite positive measure containing the boundary. I would like to remark that this effect has already been observed by other authors, without however receiving a theoretical explanation.
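The dominance of the power law tail can be seen with simple arithmetic. The following toy sketch superposes a power law a t^(−β) and an exponential e^(−t) with hypothetical weights; the weights, the prefactor a and the exponent are illustrative assumptions, and the sketch ignores any rescaling of the time variable that the actual superposition formula may involve.

```python
import math

def mixed_statistics(t, w_regular, a=0.5, beta=2.0):
    """Toy superposition for t >= 1: weight w_regular on the power law
    a * t**(-beta), weight (1 - w_regular) on the exponential exp(-t)."""
    regular = a * t ** (-beta)
    mixing = math.exp(-t)
    return w_regular * regular + (1.0 - w_regular) * mixing

w = 0.01   # hypothetical: only 1% of the domain lies in the regular region
for t in (1.0, 5.0, 10.0, 20.0, 40.0):
    total = mixed_statistics(t, w)
    power_law_share = w * 0.5 * t ** -2.0 / total
    print(f"t = {t:5.1f}  F = {total:.3e}  power-law share = {power_law_share:.3f}")
```

Even with a very small weight on the regular component, the share of the power law term approaches one as t grows, because the exponential term decays faster than any power of t.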

Concerning the limit distributions for points belonging to the boundary, the theorem assuring their existence may not be directly applied when our skew map is coupled with an arbitrary strongly mixing transformation. However, I was able to show, by proving it in a particular situation and by numerical investigations in more general cases, that the limit distributions are governed by the mixing component, although their expression differs from the Poissonian one found in the pure mixing case.

Although this two-dimensional system represents a rather simple model, the results obtained appear to describe well what happens for systems of higher physical interest, such as the standard map, when we consider domains where regular and chaotic motions coexist. In particular, this model seems to provide a possible explanation for the existence of the power law tails observed for the distributions of domains lying in the chaotic “sea” far away from regular orbits.

In this respect, I investigated the standard map in a regime in which the stochastic layer between the regular orbits and the chaotic sea may be considered as a sharp boundary. The numerical analysis performed for domains wholly contained in the regular region has shown that, as happens for the simple model represented by our skew map, the distributions asymptotically follow a power law decay with an exponent near 2, and there is some evidence suggesting that the exponent slightly grows as the order k increases.

For domains lying in the chaotic sea and far away from the integrable region, the distributions depart from the expected Poissonian behaviour and still decay like a t^{−β}, with β ≃ 2. This is reasonably due to the fact that most of the orbits originating from points of the chaotic sea closely approach, sooner or later, the regular region. Moreover, the distributions concerning domains that intersect the stochastic layer are in agreement with the results obtained for mixed systems.

The distributions of the number of visits therefore appear capable of capturing some of the fundamental features of the dynamics, and of providing some information about the relative measures of the components where it differs.

Finally, in Chapter 4 I will discuss the application of the statistics of first return times to genomic sequences regarded as a special kind of dynamical system. In particular, I tried to understand whether the capability of Poincaré recurrences to capture the different qualitative properties of the dynamics could be used as a tool able to distinguish between the coding and noncoding regions of genomes.

Unfortunately this does not happen, because the statistics of first return times follows the same exponential behaviour for both coding and noncoding sequences. However, taking into account that this behaviour is typical of strongly mixing systems, it seems sensible to interpret the results obtained as suggesting that if long-range correlations are present in the sequences, their weight should be negligible, at least compared to the one of the short-range correlations.


Chapter 2

Shear flow

2.1 Skew map

My study of the recurrence properties of regular dynamics started by considering the following integrable skew map, which is defined over the cylinder C = T × [0, 1],

$$R : \begin{cases} x' = x + y \bmod 1, \\ y' = y. \end{cases} \tag{2.1}$$

It is an area preserving transformation, with respect to the usual Lebesgue measure µ, and has zero entropy. This map describes a shear flow and its behaviour is rather simple. In fact, each point (x, y) ∈ C is transformed according to a one-dimensional rotation whose rotation number is y. Thus, the cylinder C appears to be foliated by invariant tori, and the rotation velocity changes along the y axis.
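A minimal sketch of the map may help to visualize the shear. The function names and the two starting points below are illustrative assumptions; the code only iterates the definition of R.

```python
def skew_map(point):
    """One iteration of R: (x, y) -> (x + y mod 1, y)."""
    x, y = point
    return ((x + y) % 1.0, y)

def orbit(point, n):
    """Return the first n + 1 points of the orbit, starting point included."""
    points = [point]
    for _ in range(n):
        points.append(skew_map(points[-1]))
    return points

# Two points with the same abscissa on nearby fibers drift apart linearly
# in the number of iterations, while each stays on its own fiber.
a = orbit((0.2, 0.300), 50)
b = orbit((0.2, 0.301), 50)
print(a[-1], b[-1])
```

The ordinate never changes along an orbit, and points with y = 0 are fixed, while the horizontal separation between the two orbits grows by the difference of their rotation numbers at each step.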

2.1.1 Noteworthy properties

Despite its simple behaviour, the map (2.1) presents some interesting features. First, it enjoys a sort of local mixing property; G. Turchetti proved (see Ref. [1] for more details) that in a particular, although important, case, the autocorrelation decay goes like O(n⁻¹), with n being the number of iterations of the map R.

More precisely, let us consider the cylinder C_ε = T × [0, ε], and define the conditional measure µ_ε as

$$\mu_\varepsilon(A) = \frac{\mu(A)}{\mu(C_\varepsilon)} = \frac{\mu(A)}{\varepsilon}, \qquad A \subseteq C_\varepsilon. \tag{2.2}$$

Note that the cylinder C_ε is invariant with respect to R, and µ_ε is the invariant measure therein.

Proposition 2.1 Given the dynamical system (C_ε, R, µ_ε), the following property holds for domains like A_ε = [x, x + ε] × [0, ε], A_ε ⊆ C_ε, after n iterations of the map R:

$$\left| \mu_\varepsilon\left( A_\varepsilon \cap R^n(A_\varepsilon) \right) - \mu_\varepsilon^2(A_\varepsilon) \right| = O(n^{-1}). \tag{2.3}$$

It seems sensible to expect that the same may be valid even for a generic domain, although in this case it is not easy to give a proof. Of course this result differs from the usual mixing condition, since it has a local character and does not require the ergodicity of the system.
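The decay of the autocorrelation can be checked numerically. The sketch below assumes that the domain is the square of side eps resting on the line y = 0 inside the cylinder of height eps, with eps < 1/2, and that the measure is the Lebesgue measure normalized on that cylinder. On each fiber the image of the base interval is the same arc shifted by n y, so the overlap is computed fiber by fiber and averaged with a midpoint rule. The function name and the grid size are hypothetical choices.

```python
def overlap_measure(n, eps, grid=100_000):
    """Normalized measure of A ∩ R^n(A) for A = [0, eps] x [0, eps], eps < 1/2.

    On the fiber at height y the two arcs of length eps overlap on a length
    max(0, eps - d), where d is the circle distance between n*y and 0.
    """
    total = 0.0
    for j in range(grid):
        y = (j + 0.5) * eps / grid
        shift = (n * y) % 1.0
        d = min(shift, 1.0 - shift)
        total += max(0.0, eps - d)
    return total / grid   # (1/eps) times the integral over [0, eps]

eps = 0.1
for n in (1, 3, 7, 15, 33, 75, 155):
    corr = abs(overlap_measure(n, eps) - eps ** 2)   # eps**2 is the squared measure of A
    print(n, corr, n * corr)
```

In this sketch the product n · corr stays bounded as n grows, which is the behaviour expected from an O(n⁻¹) decay.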

Such a local mixing property, caused by filamentation, appears to be responsible for the existence of a continuous limit statistics of first return times, despite the fact that for each irrational y coordinate the corresponding one-dimensional rotation does not admit, in general, a limit statistics.

In this respect, the skew map R shows another interesting property. To compute the recurrence statistics F_A(t), one needs to know the mean return time ⟨τ_A⟩. But, since the transformation (2.1) is not ergodic, we cannot apply Kac’s theorem to replace ⟨τ_A⟩ with the inverse of the measure of the set A. Nevertheless, a direct computation of the mean return time can still be performed, obtaining for ⟨τ_A⟩ a value which is very similar to Kac’s formula. In fact, denoting by µ_x and µ_y the Lebesgue measures along the x and y axes, respectively, the following result holds.

Proposition 2.2 For the map (2.1), the mean return time ⟨τ_A⟩ into an arbitrary measurable domain A ⊆ T × [0, 1] of positive Lebesgue measure is given by

$$\langle \tau_A \rangle = \frac{\mu_y(I_A)}{\mu(A)}, \tag{2.4}$$

where

$$I_A = \left\{ y \in [0, 1] : \mu_x(A_y) > 0 \right\}, \qquad A_y = \left\{ (x', y') \in A : y' = y \right\}.$$

In other words, the mean return time equals the ratio between the measure of the part of the cylinder “visited” by all the images Rⁿ(A), for n → ∞ [that is, µ_y(I_A) µ_x(T) ≡ µ_y(I_A)], and the measure of A itself.

Although Eq. (2.4) may seem involved, in many cases the result is very simple. If, for example, we choose A = [x, x + ε] × [y, y + ε], the mean return time is ⟨τ_A⟩ = 1/ε.
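The value of the mean return time for a square domain of side eps can be checked by brute force. The sketch below is only illustrative: the position of the square, the grid sizes and the function name are assumptions, and the average over the square is approximated with a midpoint grid.

```python
def skew_mean_return_time(x0, y0, eps, nx=500, ny=100, max_iter=10**6):
    """Grid estimate of the mean return time into
    A = [x0, x0 + eps] x [y0, y0 + eps] under R(x, y) = (x + y mod 1, y).

    Assumes x0 + eps <= 1, so that the x-range of A does not wrap around.
    """
    total = 0
    for j in range(ny):
        y = y0 + (j + 0.5) * eps / ny       # the ordinate is invariant
        for i in range(nx):
            x = x0 + (i + 0.5) * eps / nx
            k = 0
            while k < max_iter:
                x = (x + y) % 1.0
                k += 1
                if x0 <= x <= x0 + eps:
                    break
            total += k
    return total / (nx * ny)

print(skew_mean_return_time(0.2, 0.3, 0.1), 1.0 / 0.1)
```

The two printed values should be close, up to the discretization error of the grid, even though the map is not ergodic and Kac's theorem does not apply directly.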

2.2 Statistics of first return times

By considering the map (2.1), it is possible to rigorously prove that the statistics of first return times exists for square domains of C whose lower side is ‘placed’ on the fixed points of R.

At first, H. Hu had the idea of solving the problem by means of a geometric construction. G. Turchetti subsequently developed a way to compute the recurrence statistics for square domains of side ε = 1/m, with m ∈ N. I extended this proof by first finding the statistics of return times for square subsets of side ε = n/m, with n, m ∈ N, and then treating the case where ε is an arbitrary real number between 0 and 1. This last result allowed me to obtain an explicit expression for the limit statistics when the square domain shrinks continuously toward one of the fixed points of R.

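Proposition 2.3 Given the map (2.1), if A = [x, x + ε] × [0, ε] ⊂ C, with 0 < ε < 1 and 0 ≤ x < 1, then the statistics F_A of first return times into A is

$$F_A(t) = \begin{cases}
1, & \text{if } 0 \le t < t_1, \\[4pt]
\dfrac{1}{2}, & \text{if } t_n \le t < t_{n+1},\ t_n < t_{\bar n}, \\[8pt]
\dfrac{1}{2}\left[ 1 - \dfrac{(t_n - 1 + \varepsilon)^2}{\varepsilon\, t_n} \right], & \text{if } t_n \le t < t_{n+1},\ t_n = t_{\bar n}, \\[8pt]
\dfrac{(1 - \varepsilon)^2}{2\, t_n (t_n - \varepsilon)}, & \text{if } t_n \le t < t_{n+1},\ t_n > t_{\bar n},
\end{cases} \tag{2.5}$$

where n̄ = ⌊1/ε⌋ and t_n = n/⟨τ_A⟩ = nε, n ∈ N. Furthermore, the limit statistics exists and is given by

$$F(t) \equiv \lim_{\varepsilon \to 0^+} F_A(t) = \begin{cases}
1, & \text{if } t = 0, \\[4pt]
\dfrac{1}{2}, & \text{if } 0 < t < 1, \\[8pt]
\dfrac{1}{2}\, t^{-2}, & \text{if } t \ge 1.
\end{cases} \tag{2.6}$$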
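The two laws can be compared directly. The sketch below evaluates the finite-size statistics of Proposition 2.3 and the limit law, writing the branch for t_n = t_n̄ with the denominator ε t_n; this reading is an assumption, chosen so that the branches agree when 1/ε is an integer. The function names are hypothetical, and the floating-point floor may be off by one when t/ε or 1/ε is extremely close to an integer.

```python
import math

def F_square(t, eps):
    """Statistics of first return times for A = [x, x + eps] x [0, eps]."""
    n = math.floor(t / eps)          # index such that t_n <= t < t_{n+1}
    n_bar = math.floor(1.0 / eps)
    t_n = n * eps
    if n == 0:
        return 1.0
    if n < n_bar:
        return 0.5
    if n == n_bar:
        return 0.5 * (1.0 - (t_n - 1.0 + eps) ** 2 / (eps * t_n))
    return (1.0 - eps) ** 2 / (2.0 * t_n * (t_n - eps))

def F_limit(t):
    """Limit statistics for eps -> 0+."""
    if t == 0:
        return 1.0
    if t < 1.0:
        return 0.5
    return 0.5 * t ** -2.0

for t in (0.5, 1.5, 2.3, 10.0):
    print(t, F_square(t, 1e-3), F_limit(t))
```

For small ε the finite-size values are already close to the limit law, and the difference shrinks further as ε decreases.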

What about more general cases? Unfortunately, already for subsets like A = [x, x + ε] × [y, y + ε] with y > 0, the use of the geometric method is very involved, because the lower side of A is no longer invariant. However, in order to investigate the recurrence statistics in such situations, above all the asymptotic behaviour of F_A(t), it is possible to turn to numerical computations. So, I decided to develop a numerical algorithm which reproduces the geometric construction itself. The advantage of this approach over more conventional statistical methods is that it yields highly accurate results with limited computational resources, both in memory and in runtime. In fact, the final accuracy is only affected by the propagation of round-off errors.

In order to test the reliability of my numerical algorithm, I started by checking the results of the program for domains A = [x, x + ε] × [0, ε], for which the analytical expression of the statistics is given by the law (2.5). As an example, in the case of the set A = [0, ε] × [0, ε], with ε = 10⁻³, and t from 0 to 1000 (corresponding to 10⁶ iterations of R), the maximum difference between the computed and exact values is less than 10⁻¹⁶.

Subsequently, I computed F_A(t) for a large sample of different domains A = [0, ε] × [y, y + ε], varying both ε and y. Since I was mainly interested in the asymptotic behaviour of the recurrence statistics, I used a least-squares method to fit the statistics obtained against the function a t^{−β}, with t sufficiently large. The best-fit value found for the exponent β is always very near to 2, and in most cases β differs from 2 by less than 5 × 10⁻⁴, as shown in Fig. 2.1.

Thus, it seems sensible to conclude that there is strong evidence that the limit statistics F(t) asymptotically follows a power law decay, F(t) ∼ t⁻² (Fig. 2.2).

Figure 2.1: Values of ∆ = β − 2 (vertical axis) as a function of y (horizontal axis), obtained by fitting the statistics F_A(t) against the function a t^{−β}, with 3000 < t < 5000, for domains A = [0, ε] × [y, y + ε], ε = 10⁻².

2.3 Irrational rotations

It is interesting, at this point, to compare the results just obtained for the skew map (2.1) with the behaviour shown by Poincaré recurrences in the case of irrational rotations. In this respect, let us consider the following map, which represents a rotation by an angle α (the so-called rotation number) over the unit circle T:

$$R_\alpha : x' = x + \alpha \bmod 1. \tag{2.7}$$

It is clear from the definition that we may take, without loss of generality, 0 ≤ α < 1.

2.3.1 Existence of limit laws

A first important result concerning Poincaré recurrences for the transformation (2.7) was given by Slater [22] in 1967. He proved that for irrational rotations (that is, rotations whose rotation number α is an irrational real number) there exist, at most, three different return times into a given interval.
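This property is easy to observe numerically. The following sketch collects the first return times of a grid of starting points for a few intervals; the rotation number, the intervals and the function name are illustrative assumptions.

```python
import math

def return_times(alpha, a, b, n_samples=2000, max_iter=10**6):
    """Set of first return times into [a, b) under x -> x + alpha mod 1,
    observed on a uniform midpoint grid of starting points."""
    times = set()
    for i in range(n_samples):
        x = a + (i + 0.5) * (b - a) / n_samples
        for k in range(1, max_iter + 1):
            x = (x + alpha) % 1.0
            if a <= x < b:
                times.add(k)
                break
    return times

alpha = math.sqrt(2.0) - 1.0            # illustrative irrational rotation number
for a, b in ((0.0, 0.05), (0.3, 0.37), (0.62, 0.6275)):
    print((a, b), sorted(return_times(alpha, a, b)))
```

Each printed list should contain at most three values, whatever the interval chosen.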

We could wonder, then, whether the statistics of first return times F_{A_n}(t), obtained by fixing a point x̄ ∈ T and by taking a sequence of intervals A_n ⊆ T that shrink toward x̄, converges to a limit. In general, there is no limit law for an arbitrary sequence {A_n}, since the value of the return times depends (usually in a rather involved way) on the subset A_n considered. However, Z. Coelho and E. De Faria were able to show [24] that limit distributions of first entry times G_e(t) exist for irrational rotations when the shrinking subsets A_n are chosen in an appropriate way.

Figure 2.2: Plot of the recurrence statistics F_A(t) (log₁₀ F(t) versus log₁₀ t), for the domain A = [0, ε] × [y, y + ε], with ε = 10⁻² and y = 10⁻¹. The dashed line represents the linear fit in the interval t ∈ [10², 10³].

To construct the sequence {A_n} used in the proof of the existence of G_e(t), they consider the continued fraction expansion of the rotation number α, which may be written as α = [0, a₁, a₂, a₃, . . .], if 0 < α < 1. The truncated expansion of order n of α is then given by p_n/q_n = [0, a₁, . . . , a_n], where p_n and q_n satisfy the following recurrence relations,

$$p_k = a_k p_{k-1} + p_{k-2}, \qquad q_k = a_k q_{k-1} + q_{k-2}, \tag{2.8}$$

with p₋₂ = 0, p₋₁ = 1 and q₋₂ = 1, q₋₁ = 0. Now, choosing an arbitrary point x̄ on the circle T, they define A_n as the closed interval with endpoints R_α^{q_{n−1}}(x̄) and R_α^{q_n}(x̄) that contains x̄. This means that A_n = [R_α^{q_n}(x̄), R_α^{q_{n−1}}(x̄)] if n is odd, and that the endpoints are in the opposite order if n is even.
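The recurrence relations for p_n and q_n translate directly into a few lines of code. This is a minimal sketch with a hypothetical function name; the coefficient list passed in the example is the beginning of the expansion [0, 1, 1, 1, . . .].

```python
def convergents(coeffs):
    """Convergents p_n/q_n of [a_0, a_1, a_2, ...] from the recurrences (2.8)."""
    p_prev2, p_prev1 = 0, 1      # p_{-2}, p_{-1}
    q_prev2, q_prev1 = 1, 0      # q_{-2}, q_{-1}
    result = []
    for a in coeffs:
        p = a * p_prev1 + p_prev2
        q = a * q_prev1 + q_prev2
        result.append((p, q))
        p_prev2, p_prev1 = p_prev1, p
        q_prev2, q_prev1 = q_prev1, q
    return result

# With all coefficients equal to 1, the denominators q_n are Fibonacci numbers.
print(convergents([0, 1, 1, 1, 1, 1, 1, 1]))
```

The denominators q_n give the iterates of the rotation that define the endpoints of the intervals A_n.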

Since we were mostly interested in return times, starting from this work of Coelho and De Faria we tried to demonstrate the existence of the limit statistics of first return times F(t) too. This was possible by using a recent result [33], which establishes the following relation between G_e(t) and F(t):

$$G_e(t) = \int_0^t F(s)\, ds. \tag{2.9}$$

Thus, knowing the expression of G_e(t) as found by Coelho and De Faria, we could explicitly compute the limit statistics,

$$F(t) = \begin{cases}
1, & \text{if } 0 \le t < t_a, \\[4pt]
\dfrac{1}{1 + \omega}, & \text{if } t_a \le t < t_b, \\[8pt]
0, & \text{if } t \ge t_b,
\end{cases} \tag{2.10}$$

where

$$t_a = \frac{\omega (1 + \theta)}{1 + \theta\omega}, \qquad t_b = \frac{1 + \theta}{1 + \theta\omega}. \tag{2.11}$$

The real numbers 0 < θ ≤ 1 and 0 ≤ ω < 1 are related to the coefficients a_i of the continued fraction expansion of the rotation number α.

More recently, Turchetti has independently given a simple proof [25] concerning the existence of F(t) for irrational rotations when α is taken as a quadratic irrational with all the coefficients of the continued fraction expansion equal, that is α = [0, a, a, a, . . .].

2.4 Distributions of the number of visits

As noted in the Introduction, the statistics of first return times may be considered as the zero-order distribution of the number of visits. The next logical step in our analysis of Poincaré recurrences for the transformation (2.1) has therefore been to investigate whether limit distributions of the number of visits F_k(t) exist for a generic order k > 0. Since our skew map is horizontally almost everywhere foliated by irrational rotations, we first studied the behaviour of such distributions in the case of irrational rotations.

2.4.1 Successive returns for irrational rotations

In order to investigate the limit distributions of the number of visits for irrational rotations, we considered the same sequence of intervals A_n used to obtain the statistics of first return times, hoping that it would also be appropriate for the distributions of higher order.

Figure 2.3: Distributions of the number of visits F_{k,A_20}(t), plotted against t, of order k = 1, 2 and 3.

In this respect, we decided to take the rotation number equal to the golden ratio γ = (√5 − 1)/2, which exhibits the very simple continued fraction expansion γ = [0, 1, 1, 1, . . .], thus making it easy to obtain the intervals A_n. I performed the analysis for several orders k, computing for each of them the distribution F_{k,A_n}(t), with n from 10 to 20.

Although it is not possible to deal with limit distributions by means of numerical methods, the results obtained strongly suggest the existence of the limit distributions F_k(t). I found in fact that the distributions F_{k,A_n}(t) with the same order k are very close to each other, regardless of the value of n, and this despite the presence of statistical fluctuations and effects due to the finite size of the intervals A_n (the measure of A_n goes from about 2 × 10⁻² for n = 10 to 2 × 10⁻⁴ for n = 20). Figs. 2.3 and 2.4 show the distributions referring to the smallest interval only, namely A_20, since we are interested in the limit for µ(A) → 0. However, the distributions of the same order computed for different values of n would appear practically indistinguishable in the graph.

It is worthwhile to note some of the features of the distributions F_{k,A_n}(t) numerically obtained. First, their support is an interval and, for any k, it can be partitioned into three subintervals I_k^(l), I_k^(c) and I_k^(r) (the leftmost, the central and the rightmost, respectively) in such a way that F_{k,A_n}(t) is constant on each of these subintervals. In particular, F_{k,A_n}(t) = 1 if t ∈ I_k^(c). Moreover, the intervals I_k^(r) and I_{k+1}^(l) practically coincide (in this regard, the distribution for k = 3 is reported in both figures to show clearly that this is true even for I₂^(r), I₃^(l) and I₃^(r), I₄^(l)). For every distribution studied we have that µ(I_k^(l)) ≃ µ(I_k^(r)) ≃ 0.447 and µ(I_k^(c)) ≃ 0.724, except for k = 2 and k = 5, where µ(I_k^(c)) ≃ 0.276. Interestingly enough, the measure of the support of F_{k,A_n}(t) for k = 1, 3 and 4 is about 1.618, that is, close to 1/γ.

Figure 2.4: Distributions of the number of visits F_{k,A_20}(t), plotted against t, of order k = 3, 4 and 5.

It would certainly be interesting to obtain an analytic proof of the existence of the limit distributions F_k(t) and a theoretical explanation of their properties.

2.4.2 Distributions for the skew map

Irrational rotations, as seen, do not admit limit distributions of the number of visits unless the shrinking neighborhoods are taken in a suitable way.

Nevertheless, we were confident that, as for the statistics of first return times, it would be possible to show the existence of limit distributions for the skew map (2.1).

We started by considering the particular situation of square domains like A_ε = [0, ε] × [0, ε] whose side, of length ε, goes continuously to zero.

Unfortunately, it soon became clear that in this case a geometric proof, like the one carried out for the first return times, would be exceedingly complicated, and so would the development of a reliable and efficient numerical algorithm implementing the corresponding geometrical construction. The only viable solution therefore seemed to be to resort to a statistical method. In this respect, the numerical computations I performed clearly suggest that limit laws exist, as I will show later.

However, in order to understand this fact from a theoretical point of view, S. Vaienti and I developed a heuristic, but quantitative, argument which provides predictions very close to the numerical observations.

Theoretical investigation

For this purpose, it is necessary to consider another equivalent characterization of the distributions of the number of visits. Let us begin by introducing the kth return time of a point x ∈ A into a subset A,

\[
\tau_A^k(x) =
\begin{cases}
0, & \text{if } k = 0, \\
\tau_A^{k-1}(x) + \tau_A\bigl(T^{\tau_A^{k-1}(x)}(x)\bigr), & \text{if } k \ge 1,
\end{cases}
\tag{2.12}
\]

[note that τ_A¹(x) = τ_A(x)]. Subsequently, we may define the distribution of the kth return time as

\[
P_{k,A}(t) = \mu_A\!\left( x \in A : \frac{\tau_A^k(x)}{\langle \tau_A \rangle} \le t \right). \tag{2.13}
\]

We then observe that Eq. (1.11) can be rewritten as

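\[
F_{k,A}(t) = \mu_A\!\left( x \in A : \frac{\tau_A^k(x)}{\langle \tau_A \rangle} \le t \;\wedge\; \frac{\tau_A^{k+1}(x)}{\langle \tau_A \rangle} > t \right) = P_{k,A}(t) - P_{k+1,A}(t). \tag{2.14}
\]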
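The definitions (2.12) to (2.14) translate directly into a counting procedure. The sketch below, in Python, applies it to the rotation by γ; the interval A = [0, 0.05), the uniform grid of 2000 starting points and the names `return_times`, `P` and `F` are assumptions made for this illustration and are not the domains or the code used for the figures.

```python
import math

GAMMA = (math.sqrt(5) - 1) / 2      # rotation number of Sec. 2.4.1
A_LEN = 0.05                        # hypothetical interval A = [0, 0.05)

def step(x):
    """One iteration of the rotation x -> x + gamma (mod 1)."""
    return (x + GAMMA) % 1.0

def in_A(x):
    return 0.0 <= x < A_LEN

def return_times(x, kmax):
    """Return [tau^1, ..., tau^kmax] for a starting point x in A, Eq. (2.12)."""
    times, n = [], 0
    while len(times) < kmax:
        x, n = step(x), n + 1
        if in_A(x):
            times.append(n)
    return times

K_MAX = 3
starts = [(i + 0.5) * A_LEN / 2000 for i in range(2000)]    # uniform grid in A
taus = [[0] + return_times(x, K_MAX + 1) for x in starts]   # taus[i][k] = tau^k
mean_tau = sum(ts[1] for ts in taus) / len(taus)            # estimate of <tau_A>

def P(k, t):
    """Eq. (2.13): fraction of points with tau^k / <tau_A> <= t."""
    return sum(1 for ts in taus if ts[k] <= t * mean_tau) / len(taus)

def F(k, t):
    """Eq. (2.14), counted directly: tau^k <= t <tau_A> < tau^(k+1)."""
    return sum(1 for ts in taus if ts[k] <= t * mean_tau < ts[k + 1]) / len(taus)
```

Since the return times are increasing in k, the direct count `F(k, t)` agrees with `P(k, t) - P(k + 1, t)` up to rounding, and `mean_tau` should be close to 1/µ(A) = 20, as Kac's theorem requires for an ergodic rotation.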

Since

\[
\tau_A^k = \tau_A + (\tau_A^2 - \tau_A) + \ldots + (\tau_A^k - \tau_A^{k-1}), \tag{2.15}
\]

it is also possible to regard the function P_{k,A}(t) as the distribution of the sum of the differences, normalized by ⟨τ_A⟩⁻¹, of consecutive return times up to the kth return. The distribution of the difference between two consecutive return times (normalized by ⟨τ_A⟩⁻¹) follows the same law as the distribution of the first return (see Ref. [8]), because the measure µ_A is invariant with respect to the induced map on A and because

\[
\tau_A^k - \tau_A^{k-1} = \tau_A \circ T^{\tau_A^{k-1}}. \tag{2.16}
\]

Now, if the variables τ_A/⟨τ_A⟩, (τ_A² − τ_A)/⟨τ_A⟩, . . . , (τ_A^k − τ_A^{k−1})/⟨τ_A⟩ were independent and identically distributed (i.i.d.) with the same distribution function G_{r,A}(t), then it is well known that the distribution function of their sum would be the following convolution product:

\[
P_{k,A}(t) = \underbrace{G_{r,A}(t) * G_{r,A}(t) * \ldots * G_{r,A}(t)}_{k \text{ times}}. \tag{2.17}
\]

In the case of highly mixing systems [for instance φ-, α- and (φ, f )-mixing systems] for which the limit distribution of first return times G_r(t) is almost everywhere given by 1 − e^−t, the differences of the normalized successive return times become asymptotically independent when µ(A) → 0. The strategy adopted in Ref. [8] to compute, for a suitable choice of the sets A, the Poisson law

\[
P_{k,A}(t) - P_{k+1,A}(t) \longrightarrow \frac{e^{-t}\, t^k}{k!}, \tag{2.18}
\]

was just based on this fact.
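The step from the convolution product (2.17) to the Poisson law (2.18) can be checked numerically when G_r(t) = 1 − e^−t. The Python sketch below computes P_{k+1}(t) as the Stieltjes convolution of P_k with G_r, using a trapezoid rule on a uniform grid; the grid step, the range 0 ≤ t ≤ 6 and the function names are choices of this example, not part of the procedure of Ref. [8].

```python
import math

def convolve_with_exponential(P, h):
    """Q(t_i) = integral over 0 <= s <= t_i of P(t_i - s) e^{-s} ds (trapezoid rule)."""
    n = len(P)
    w = [math.exp(-j * h) for j in range(n)]       # density of G_r on the grid
    Q = [0.0] * n
    for i in range(1, n):
        acc = 0.5 * (P[i] * w[0] + P[0] * w[i])    # end points, weight 1/2
        acc += sum(P[i - j] * w[j] for j in range(1, i))
        Q[i] = acc * h
    return Q

h, n = 0.01, 601                                   # grid on [0, 6]
grid = [i * h for i in range(n)]
P = {1: [1.0 - math.exp(-t) for t in grid]}        # P_1 = G_r
for k in range(1, 4):
    P[k + 1] = convolve_with_exponential(P[k], h)  # Eq. (2.17), one factor at a time

def poisson(k, t):
    """Right-hand side of Eq. (2.18)."""
    return math.exp(-t) * t ** k / math.factorial(k)

max_err = max(abs(P[k][i] - P[k + 1][i] - poisson(k, grid[i]))
              for k in range(1, 4) for i in range(n))
```

The quantity `max_err` measures the largest deviation between P_k − P_{k+1} and e^−t t^k/k! on the grid; it is controlled by the discretization error of the trapezoid rule and decreases when `h` is reduced.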

In this regard, although our skew map is not ergodic, it nonetheless enjoys a sort of local mixing property. This suggested that we try to obtain the distributions of the number of visits by assuming that, even in such a situation, the differences of successive return times are asymptotically independent. As seen before, the limit statistics of first return times F(t) for the sets A_ε, when ε → 0, is given by Eq. (2.6). Since the corresponding limit distribution is G_r(t) = 1 − F(t), under the preceding assumption we can write

\[
P_k(t) = \underbrace{G_r(t) * G_r(t) * \ldots * G_r(t)}_{k \text{ times}}, \tag{2.19}
\]

and

\[
\bar F_k(t) = P_k(t) - P_{k+1}(t). \tag{2.20}
\]

In particular, it holds that

\[
\bar F_1(t) = G_r(t) - \int_{-\infty}^{+\infty} G_r(t - s)\, dG_r(s). \tag{2.21}
\]

A rather straightforward computation of the Stieltjes integral then gives:

\[
\bar F_1(t) =
\begin{cases}
0, & \text{if } t = 0, \\[4pt]
\dfrac{1}{4}, & \text{if } 0 < t < 2, \\[8pt]
\dfrac{1}{4t^2} + \dfrac{1}{4(t-1)^2} + \dfrac{3}{2t^3} + \dfrac{3\log(t-1)}{t^4} + \dfrac{6 - 7t}{4t^3(t-1)^2}, & \text{if } t \ge 2.
\end{cases}
\tag{2.22}
\]

We may note that, when t is large, F̄_1(t) behaves like (1/2) t⁻². Through a similar, but more cumbersome, computation, we could obtain F̄_2(t) too,

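\[
\bar F_2(t) =
\begin{cases}
0, & \text{if } t = 0, \\[4pt]
\dfrac{1}{8}, & \text{if } 0 < t < 1, \\[8pt]
\dfrac{1}{4} - \dfrac{1}{8t^2}, & \text{if } 1 \le t < 2, \\[8pt]
O\!\left(t^{-2}\right), & \text{if } t \gg 2.
\end{cases}
\tag{2.23}
\]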
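A quick numerical sanity check of the piecewise expressions (2.22) and (2.23) is sketched below in Python. It only evaluates the formulas as they are written above, checking that the branches join continuously at t = 2 and t = 1 and that t² F̄_1(t) approaches 1/2; the function names are illustrative, and the branch of F̄_2 for large t, which is not given explicitly, is deliberately left out.

```python
import math

def F1_bar(t):
    """Eq. (2.22), evaluated as written."""
    if t == 0:
        return 0.0
    if t < 2:
        return 0.25
    return (1 / (4 * t**2) + 1 / (4 * (t - 1)**2) + 3 / (2 * t**3)
            + 3 * math.log(t - 1) / t**4
            + (6 - 7 * t) / (4 * t**3 * (t - 1)**2))

def F2_bar(t):
    """Eq. (2.23) on 0 <= t < 2 only; the tail is known only as O(t^-2)."""
    if t == 0:
        return 0.0
    if t < 1:
        return 0.125
    if t < 2:
        return 0.25 - 1 / (8 * t**2)
    raise ValueError("explicit expression not available for t >= 2")

jump_F1_at_2 = F1_bar(2.0) - 0.25            # mismatch between the branches at t = 2
jump_F2_at_1 = F2_bar(1.0) - 0.125           # mismatch between the branches at t = 1
tail_coefficient = 1e4**2 * F1_bar(1e4)      # should be close to 1/2
```

Both jumps vanish, so the two distributions are continuous where the branches meet, and `tail_coefficient` is close to 0.5, in agreement with the asymptotic behaviour (1/2) t⁻².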

Using a recursive argument, it is possible to show that

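\[
\bar F_k(t) =
\begin{cases}
0, & \text{if } t = 0, \\[4pt]
\dfrac{1}{2^{k+1}}, & \text{if } 0 < t < 1, \\[8pt]
O\!\left(t^{-2}\right), & \text{if } t \gg 2.
\end{cases}
\tag{2.24}
\]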

I would like to point out two interesting features of the distributions F̄_k(t):

(i) for 0 < t < 1, the distributions present a plateau whose height is given by 1/2^(k+1), and this is the only explicit dependence on k that we were able to easily detect;

(ii) for t → ∞, all the F̄_k(t) exhibit the same behaviour whatever the order k, decaying like (1/2) t⁻².
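The plateau in (i) can be traced to a simple mechanism, sketched here under an assumption that is not stated explicitly in the text: that the limit law G_r has an atom of mass g at t = 0 and assigns no further mass to the interval 0 < t < 1. A sum of k independent non-negative variables, each of which is either 0 or at least 1, is then smaller than 1 only if all of them vanish, so that

```latex
\begin{aligned}
P_k(t) &= g^k, && 0 < t < 1, \\
\bar F_k(t) &= P_k(t) - P_{k+1}(t) = g^k (1 - g), && 0 < t < 1, \\
\bar F_1(t) &= \tfrac{1}{4} \;\Longrightarrow\; g(1 - g) = \tfrac{1}{4}
  \;\Longrightarrow\; g = \tfrac{1}{2}
  \;\Longrightarrow\; \bar F_k(t) = \frac{1}{2^{k+1}}, && 0 < t < 1.
\end{aligned}
```

Under this assumption the value 1/4 of F̄_1 in Eq. (2.22) fixes g = 1/2, and the height 1/2^(k+1) of Eq. (2.24) follows for every k.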

Figure 2.5: Distribution of the number of visits of order k = 1 computed for the domain A_ε = [0, ε] × [0, ε], with ε = 10⁻². The dotted line represents the function (2.22). [Log-log plot of F_1(t) versus t, for 10⁻¹ ≤ t ≤ 10⁴.]

Numerical analysis

We compared the distributions F̄_k(t), computed under the assumption that the differences of successive return times are asymptotically independent, with the ones obtained through numerical investigations.

The qualitative features described above still seem to persist, although there is some discrepancy between the two kinds of distributions. In particular, both show an initial plateau, even if the height of the numerical one differs from the expected value. More interestingly, all the numerical distributions appear to decay like (1/2) t⁻², at least after a transient peak (see Figs. 2.5 and 2.6).

The discrepancy between the two kinds of distributions is plausibly due to some sort of weak correlation between the differences of successive returns. Note that this would be in agreement with a result of Z. Coelho and E. De Faria [24], showing that the limit joint distributions of the differences of successive entry times are not given by the product of the individual limit distributions.

Figure 2.6: Distribution of the number of visits of order k = 2 computed for the domain A_ε = [0, ε] × [0, ε], with ε = 10⁻². The dashed line represents the function (1/2) t⁻². [Log-log plot of F_2(t) versus t, for 10⁻¹ ≤ t ≤ 10⁴.]

I also investigated the behaviour of the distributions of the number of visits F_{k,A}(t) with A = [0, ε] × [y_0, y_0 + ε] ⊂ C, y_0 > 0, for several values of the parameters y_0 and ε. I found that an asymptotic power-law decay preceded by a peak seems to hold even for such domains, as shown in Fig. 2.7; note how the peaks narrow and shift toward larger t when k increases, while their height slightly decreases. In order to estimate the decay exponent β, I used a least-squares method to fit the numerical distributions to the function a t^(−β). In this more general case β is usually greater than 2, and the mean value of its distribution appears to be positively correlated with the order k (Fig. 2.8).
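One common way to carry out such a fit is a linear least-squares regression of log F against log t over a chosen window. The Python sketch below does exactly that; whether the fit in the text was performed in log-log form or directly on a t^(−β) is not stated, so this is an assumption, as are the function name `fit_power_law` and the synthetic test data. The window 30 ≤ t ≤ 100 is the one quoted in the caption of Fig. 2.8.

```python
import math

def fit_power_law(ts, Fs, t_min, t_max):
    """Fit F ~ a * t**(-beta) by linear least squares on (log t, log F)."""
    pts = [(math.log(t), math.log(F)) for t, F in zip(ts, Fs)
           if t_min <= t <= t_max and F > 0]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx                       # slope of the log-log line
    a = math.exp(mean_y - slope * mean_x)   # intercept gives log a
    return a, -slope                        # beta is minus the slope

# Synthetic data following exactly (1/2) t^-2, used only to exercise the fit.
ts = [float(t) for t in range(1, 201)]
Fs = [0.5 * t ** -2 for t in ts]
a_fit, beta_fit = fit_power_law(ts, Fs, 30.0, 100.0)
```

On these synthetic data the fit returns a close to 0.5 and β close to 2; on numerical distributions the result depends on the fitting window, which should lie beyond the initial peak.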

Even the distributions obtained for rectangular domains like A = [x_0, x_0 + ε] × [y_0, y_0 + δ] present similar features: in particular, they decay following a power law with an exponent greater than, but close to, 2. Moreover, the mean return time computed numerically is still ⟨τ_A⟩ = 1/ε.

In conclusion, the inverse-square decay in t of the limit distributions, whatever the order k, seems to be typical of the fixed points (which lie along the x axis) of our skew map, while as soon as one considers other points, the exponent β increases weakly with k. However, the difference between the distributions of the number of visits for periodic and generic points appears to be a general fact of recurrences, as we will also see in the next chapters.

Figure 2.7: Distributions of the number of visits of order k = 1, 2, 5, 10 and 20, computed for the domain A = [0, ε] × [y_0, y_0 + ε], with y_0 = 0.35 and ε = 10⁻². The dashed line represents the function (1/2) t⁻². [Log-log plot of F_k(t) versus t, for 10⁰ ≤ t ≤ 10³.]

Figure 2.8: Distributions of the best-fit parameter β, obtained through a least-squares method in the range 30 ≤ t ≤ 100 from the distributions of the number of visits of order k = 1, 2 and 3. For each order k, the distributions of the number of visits have been computed for twenty domains of side ε = 10⁻². The arrows show the position of the mean value of β. [Horizontal axis: β, from 1.95 to 2.15.]


Chapter 3

Mixed dynamical systems

In this chapter I will consider the following situation. Suppose we take a subdomain A of a measurable phase space Ω which intersects two regions of Ω that are invariant with respect to a transformation T acting on Ω. On these two regions the map T is defined so that it behaves in two different ways; for example, it may be mixing on one of the components and simply ergodic (or not ergodic at all) on the other.

In this case, one could wonder about the existence of a limit recurrence statistics (or, more generally, of limit distributions of the number of visits) when A shrinks around a point which belongs to the common boundary of the two regions, in such a way that it still intersects the two invariant components.

3.1 Distributions of the number of visits

Let us consider a measurable space Ω and a map T acting on it. Moreover, let µ be a T-invariant measure; note that it can be any invariant measure, not necessarily the Lebesgue measure. Suppose now that the dynamical system (Ω, T, µ) splits into two subsystems (Ω_1, T_1, µ) and (Ω_2, T_2, µ), where

\[
\Omega = \Omega_1 \cup \Omega_2, \qquad \mu(\Omega_1 \cap \Omega_2) = 0. \tag{3.1}
\]

The maps T_1 and T_2, defined over Ω_1 and Ω_2 respectively, satisfy the following conditions:

\[
T_1 = T|_{\Omega_1 \setminus (\Omega_1 \cap \Omega_2)}, \qquad T_2 = T|_{\Omega_2 \setminus (\Omega_1 \cap \Omega_2)}, \tag{3.2}
\]

that is, they coincide with T except possibly on the zero-measure boundary Ω_1 ∩ Ω_2 of the invariant regions.

In order to study the behaviour of the distributions of the number of visits for neighborhoods of points which belong to this common boundary, let us take a neighborhood A of a point x̄ ∈ Ω_1 ∩ Ω_2, such that µ(A) > 0. Then, denote by A_1 and A_2 the two components of A, that is, A_1 = A ∩ Ω_1 and A_2 = A ∩ Ω_2; of course A = A_1 ∪ A_2.

The sequence of domains A shrinking around ¯x is chosen in such a way that the relative weights

\[
w_1(A) = \frac{\mu(A_1)}{\mu(A)}, \qquad w_2(A) = \frac{\mu(A_2)}{\mu(A)}, \tag{3.3}
\]

have a finite limit when µ(A) → 0; namely, we will assume that the following limits exist and are different from zero,

\[
w_1 = \lim_{\mu(A) \to 0} w_1(A), \qquad w_2 = \lim_{\mu(A) \to 0} w_2(A). \tag{3.4}
\]

It is easy to prove that the mean return time into A is related to those into A_1 and A_2 as follows:

\[
\langle \tau_A \rangle = w_1(A)\, \langle \tau_{A_1} \rangle + w_2(A)\, \langle \tau_{A_2} \rangle. \tag{3.5}
\]

In fact, from the definition of mean return time and of conditional measure, it holds that

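\[
\begin{aligned}
\langle \tau_A \rangle &= \int_A \tau_A(x)\, d\mu_A \\
&= \frac{1}{\mu(A)} \int_A \tau_A(x)\, d\mu \\
&= \frac{\mu(A_1)}{\mu(A)} \int_{A_1} \tau_{A_1}(x)\, d\mu_{A_1} + \frac{\mu(A_2)}{\mu(A)} \int_{A_2} \tau_{A_2}(x)\, d\mu_{A_2} \\
&= w_1(A)\, \langle \tau_{A_1} \rangle + w_2(A)\, \langle \tau_{A_2} \rangle.
\end{aligned}
\]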

Furthermore, denoting by F_{k,A_1}(t) and F_{k,A_2}(t) the distributions of the number of visits in A_1 and A_2 respectively, it is possible to show that

\[
F_{k,A}(t) = w_1(A)\, F_{k,A_1}\bigl(w'_1(A)\, t\bigr) + w_2(A)\, F_{k,A_2}\bigl(w'_2(A)\, t\bigr), \tag{3.6}
\]

where

\[
w'_1(A) = \frac{\langle \tau_A \rangle}{\langle \tau_{A_1} \rangle}, \qquad w'_2(A) = \frac{\langle \tau_A \rangle}{\langle \tau_{A_2} \rangle}. \tag{3.7}
\]

Then, assuming that the limits

\[
F_{k,1}(t) = \lim_{\mu(A) \to 0} F_{k,A_1}(t), \qquad F_{k,2}(t) = \lim_{\mu(A) \to 0} F_{k,A_2}(t), \tag{3.8}
\]

and

\[
w'_1 = \lim_{\mu(A) \to 0} w'_1(A), \qquad w'_2 = \lim_{\mu(A) \to 0} w'_2(A), \tag{3.9}
\]

are well defined, we proved the following proposition.

Proposition 3.1 Under the existence of the limits (3.4), (3.8) and (3.9), the limit distributions of the number of visits, for points on the boundary Ω_1 ∩ Ω_2, exist and are given by

\[
F_k(t) = w_1\, F_{k,1}(w'_1 t) + w_2\, F_{k,2}(w'_2 t), \qquad (k \ge 0) \tag{3.10}
\]

at the points of continuity of both F_{k,1} and F_{k,2}.

In other words, when the conditions of Proposition 3.1 are satisfied, the limit distributions F_k(t) are a linear superposition of the limit distributions F_{k,i} corresponding to the invariant components A_i of the shrinking neighborhood A, weighted by the relative size of such components.
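The bookkeeping behind Eqs. (3.3), (3.5), (3.7) and (3.10) can be made concrete with a small Python sketch. The measures and mean return times used below are hypothetical numbers chosen only to exercise the formulas, and the helper names `weights` and `superpose` are introduced for this example.

```python
import math

def weights(mu_A1, mu_A2, tau_A1, tau_A2):
    """Relative weights (3.3) and time rescalings (3.7), using (3.5) for <tau_A>."""
    mu_A = mu_A1 + mu_A2
    w1, w2 = mu_A1 / mu_A, mu_A2 / mu_A
    tau_A = w1 * tau_A1 + w2 * tau_A2          # Eq. (3.5)
    return w1, w2, tau_A / tau_A1, tau_A / tau_A2

def superpose(F1, F2, w1, w2, w1p, w2p):
    """Right-hand side of Eq. (3.10) as a function of t."""
    return lambda t: w1 * F1(w1p * t) + w2 * F2(w2p * t)

# Hypothetical values: mu(A_1) = 0.002, mu(A_2) = 0.001,
# <tau_A1> = 250 and <tau_A2> = 1000.
w1, w2, w1p, w2p = weights(0.002, 0.001, 250.0, 1000.0)

# Two illustrative component laws (pure exponentials).
F = superpose(lambda t: math.exp(-t), lambda t: math.exp(-t), w1, w2, w1p, w2p)
```

A useful consistency check follows from Eq. (3.5): dividing it by ⟨τ_A⟩ gives w_1/w'_1 + w_2/w'_2 = 1, which the values returned by `weights` satisfy by construction.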

As an application of these results, I first studied a one-dimensional system whose phase space is partitioned into two mixing invariant regions, and then a more interesting two-dimensional system obtained by coupling the skew map (2.1) with the so-called Arnold’s “cat map”.

3.2 Coupling of mixing maps

In order to check Eq. (3.10), I considered the following one-dimensional, sawtooth-like map defined over the interval Ω = [−1, 1],

\[
T(x) =
\begin{cases}
T_1(x), & \text{if } -1 \le x < 0, \\
T_2(x), & \text{if } 0 \le x \le 1,
\end{cases}
\tag{3.11}
\]


where

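\[
T_1(x) =
\begin{cases}
-3x - 3, & \text{if } -1 \le x < -\tfrac{2}{3}, \\
3x + 1, & \text{if } -\tfrac{2}{3} \le x < -\tfrac{1}{3}, \\
-3x - 1, & \text{if } -\tfrac{1}{3} \le x \le 0,
\end{cases}
\tag{3.12}
\]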

and

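\[
T_2(x) =
\begin{cases}
3x, & \text{if } 0 \le x < \tfrac{1}{3}, \\
2 - 3x, & \text{if } \tfrac{1}{3} \le x < \tfrac{2}{3}, \\
3x - 2, & \text{if } \tfrac{2}{3} \le x \le 1.
\end{cases}
\tag{3.13}
\]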

Note that the two subsets Ω_1 = [−1, 0] and Ω_2 = [0, 1] are invariant under T_1 and T_2, respectively.
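The branches (3.11) to (3.13) are simple enough to be checked with exact rational arithmetic. The following Python sketch implements them with `fractions.Fraction` and verifies the invariance of the two subsets on a grid, together with the behaviour of the boundary point 0 under each branch map; the grid of 301 points is an arbitrary choice for this illustration.

```python
from fractions import Fraction as Fr

def T1(x):
    """Eq. (3.12), acting on [-1, 0]."""
    if x < Fr(-2, 3):
        return -3 * x - 3
    if x < Fr(-1, 3):
        return 3 * x + 1
    return -3 * x - 1

def T2(x):
    """Eq. (3.13), acting on [0, 1]."""
    if x < Fr(1, 3):
        return 3 * x
    if x < Fr(2, 3):
        return 2 - 3 * x
    return 3 * x - 2

def T(x):
    """Eq. (3.11): T1 on [-1, 0), T2 on [0, 1]."""
    return T1(x) if x < 0 else T2(x)

# Images of a rational grid stay inside the respective subset.
left_ok = all(-1 <= T1(Fr(-i, 300)) <= 0 for i in range(301))
right_ok = all(0 <= T2(Fr(i, 300)) <= 1 for i in range(301))

# Orbit of the boundary point under each branch map.
orbit_T1 = [Fr(0), T1(Fr(0)), T1(T1(Fr(0)))]   # 0 -> -1 -> 0
orbit_T2 = [Fr(0), T2(Fr(0))]                  # 0 -> 0
```

The orbit 0 → −1 → 0 under T_1 and the fixed point T_2(0) = 0 are the two periodic behaviours that meet at the boundary point.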

The map T is piecewise linear and preserves the Lebesgue measure. Moreover, it is continuous except at the point x̄ = 0, which is a periodic point of period two for T_1 and a fixed point for T_2. Thus, whenever we take a ball of radius ε around x̄ = 0, the statistics of first returns on the left and on the right of x̄ will follow, respectively, those around a periodic point of period 2 and around a fixed point; in this situation, the limit statistics for ε → 0 will not follow the exponential-one law e^−t, as it does for generic points. Instead, it is possible to use a result due to M. Hirata [15], according to which, for one-dimensional Markov maps like T_1 and T_2, the statistics of first return times in a ball of radius ε around a periodic point x of period P is given, in the limit ε → 0, by the following formula:

\[
F(t) = \rho_x\, e^{-\rho_x t}, \tag{3.14}
\]

where

\[
\rho_x = 1 - e^{\,u(x) + u(T(x)) + \ldots + u(T^{P-1}(x))}. \tag{3.15}
\]

Here u(x) is the potential associated with the invariant Gibbs measure; in the present case, in which µ is the Lebesgue measure, it holds that

\[
u(x) = \ln \frac{1}{|T_1'|} = \ln \frac{1}{|T_2'|} = \ln(1/3) \tag{3.16}
\]

and, therefore, ρ_x̄ is 8/9 for T_1 and 2/3 for T_2, consistently with the fact that, in general, we expect ρ_x to approach 1 for periodic points of increasing period.
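The two values of ρ quoted above follow from Eq. (3.15) by a one-line computation, reproduced in the Python sketch below. The function name `rho` and the list of longer periods are illustrative; the constant potential ln(1/3) is the one of Eq. (3.16).

```python
import math

def rho(potential_along_orbit):
    """Eq. (3.15): rho_x = 1 - exp(sum of u over one period of the orbit)."""
    return 1.0 - math.exp(sum(potential_along_orbit))

u = math.log(1 / 3)          # Eq. (3.16): constant potential for slope 3

rho_T1 = rho([u, u])         # period-two point of T1: 1 - 1/9 = 8/9
rho_T2 = rho([u])            # fixed point of T2:      1 - 1/3 = 2/3

# With a constant potential, rho = 1 - 3**(-P) for a point of period P.
rho_by_period = [rho([u] * P) for P in range(1, 7)]
```

The list `rho_by_period` increases monotonically towards 1, which is the trend mentioned in the text for periodic points of increasing period.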

Figure 3.1: Statistics of first return times F_A(t) obtained for the set A = [−10⁻³, 10⁻³]. The dashed line represents the law (3.17). [Plot of log_10 F(t) versus t, for 0 ≤ t ≤ 12.]

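Then, by using Proposition 3.1, the limit statistics of first return times at x̄ = 0 reads

\[
\begin{aligned}
F_0(t) &= w_1\, F_{0,1}(w'_1 t) + w_2\, F_{0,2}(w'_2 t) \\
&= w_1\, \frac{8}{9}\, e^{-\frac{8}{9} w'_1 t} + w_2\, \frac{2}{3}\, e^{-\frac{2}{3} w'_2 t}.
\end{aligned}
\tag{3.17}
\]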

Since the maps T_1 and T_2 are mixing, one may immediately see, thanks to Kac’s theorem, that ⟨τ_{A_i}⟩ = µ(Ω_i)/µ(A_i), i = 1, 2. So, under the existence of the limits (3.4) and (3.9), we have w'_i = [µ(Ω)/µ(Ω_i)] w_i. The prescribed choice of the set A as a symmetric interval around the boundary point x̄ thus implies w_1 = w_2 = 1/2 and w'_1 = w'_2 = 1. Substituting these values into Eq. (3.17) gives a result which is well confirmed by the numerical computations, as shown, for example, in Fig. 3.1.
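For the symmetric interval the substitution is short enough to be written out. The Python sketch below evaluates the rescalings from the relation w'_i = [µ(Ω)/µ(Ω_i)] w_i, using the Lebesgue measures µ(Ω) = 2 and µ(Ω_i) = 1 of the intervals [−1, 1], [−1, 0] and [0, 1], and then the resulting law (3.17); the function names are introduced for this example only.

```python
import math

def rescaled_weight(mu_Omega, mu_Omega_i, w_i):
    """w'_i = [mu(Omega) / mu(Omega_i)] * w_i, as obtained from Kac's theorem."""
    return mu_Omega / mu_Omega_i * w_i

w1 = w2 = 0.5                              # symmetric interval around 0
w1p = rescaled_weight(2.0, 1.0, w1)        # = 1
w2p = rescaled_weight(2.0, 1.0, w2)        # = 1

def F0(t):
    """Eq. (3.17) with rho = 8/9 on the left of 0 and rho = 2/3 on the right."""
    return (w1 * (8 / 9) * math.exp(-(8 / 9) * w1p * t)
            + w2 * (2 / 3) * math.exp(-(2 / 3) * w2p * t))

log10_F0 = [math.log10(F0(t)) for t in range(0, 13)]   # range shown in Fig. 3.1
```

With these values F_0(t) = (4/9) e^(−8t/9) + (1/3) e^(−2t/3), so that F_0 tends to 7/9 as t → 0⁺ and the slower exponential, coming from the fixed point of T_2, dominates at large t.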

Figure 3.2: Distributions of the number of visits of order k = 1 and 2, computed for the map given by Eq. (3.11) in the interval A = [−ε, ε], with ε = 5 × 10⁻⁴. The dotted lines represent the theoretical predictions; in particular the one for k = 1 corresponds to formula (3.20). [Semi-log plot of F_k(t) versus t, for 0 ≤ t ≤ 25.]

A similar situation occurs if we want to apply Proposition 3.1 to compute the distributions of the number of visits, of order k > 0, at x̄ = 0. That is, we need to know the distributions around periodic points. In this respect, since the two mixing maps T_1 and T_2 are conjugate to a Bernoulli shift on three symbols with equal weights, it is possible to use a general formula recently proved by N. Haydn and S. Vaienti, which gives the limit distribution of order k for cylinders C_n around periodic points of period P,

\[
F_k(t) \equiv \lim_{n \to \infty} \mu_{C_n}\bigl\{ x \in C_n : \xi_{C_n}(t; x) = k \bigr\}
= (1 - p^P)\, e^{-(1 - p^P)\, t} \sum_{j=0}^{k} \binom{k}{j} \frac{p^{P(k-j)}\, (1 - p^P)^{2j}}{j!}\, t^j, \tag{3.18}
\]

where p = µ(C_{n+1})/µ(C_n). Note that for k = 0 this formula coincides with the one found by M. Hirata, since then it reads

\[
F_0(t) = (1 - p^P)\, e^{-(1 - p^P)\, t}. \tag{3.19}
\]

I would like to note that the distributions (3.18) can be obtained, under the assumption that the differences of successive return times are asymptotically independent, by convolving F_0(t) according to the procedure described in Sec. 2.4.2.

Considering, for example, a sequence of shrinking cylinders C_n centered around the point x̄ = 0 (so that w_1 = w_2 = 1/2 and, by Kac’s theorem,


Contents

Chapter 1

Introduction

1.1 Poincar´ e recurrences

1.2 Main results

Chapter 2

Shear flow

2.1 Skew map

2.2 Statistics of first return times

2.3 Irrational rotations

2.4 Distributions of the number of visits

Chapter 3

Mixed dynamical systems

3.1 Distributions of the number of visits

3.2 Coupling of mixing maps