Linear response theory for complex systems


Faculty of Mathematical, Physical and Natural Sciences


Master degree thesis

Giancarlo De Luca

 : Paolo Grigolini

 July 


Preface

1 Complex systems, power laws and subordination theory
    Characteristics of complex systems
        Power laws
    Mesoscopic phenomena and stochastic processes
        Stochastic processes: some definitions
    Subordination theory and renewal processes
        Subordination theory

2 Ergodicity, ergodicity breaking and non stationarity
    Boltzmann’s ergodic hypothesis
    Mathematical theory of ergodicity and Birkhoff theorem
        Invariant measure
        Ergodic measures and Birkhoff’s theorem, ergodic and invariant version
    Ergodicity of time series
    Ergodicity breaking
        Toward an event driven ergodicity breaking: recurrence time for discrete random walk
        Ergodicity breaking in sub diffusive systems
        Subordinated renewal processes

3 Linear Response Theory
    Traditional linear response theory
        Static response
        Dynamic linear response
        Velocity autocorrelation function
    Stochastic resonance
    Linear response theory for complex systems
        Event driven systems
        The Onsager principle
        Linear response theory for event driven Poissonian processes
        Non Poissonian event driven linear response theory
        The fluctuation-dissipation theorem
        Dichotomous non-Poissonian case
        Response to a harmonic perturbation
        Phenomenological response to harmonic perturbation: the “Freud effect”
        Liquid crystals experiment
    Further considerations and conclusions
        Fokker-Planck equation
        Complexity matching
        Conclusions


In this work we present a proposal for modeling perturbations of complex systems. In chapter I we introduce the main mathematical tools and definitions of our model: renewal processes. Our attempt to model the perturbation of complex systems will be limited to those systems for which a renewal perspective is allowed.

In chapter II we shall show, with the help of numerical simulations, that these systems exhibit a non stationary, thus non ergodic, behavior.

In chapter III we shall present our proposal for a perturbation theory of complex systems. Kubo’s fluctuation-dissipation theorem,

$$\langle A(t)\rangle_{\mathrm{pert}} - \langle A(t)\rangle_{\mathrm{unpert}} = \varepsilon \int_0^t \chi(t-s)\,B(s)\,ds, \qquad \chi(t-s) = \frac{d}{ds}C(t-s),$$

holds for stationary processes, so we have to extend it in order to use it for complex renewal systems.

In those cases the perturbation can act either on the event generating operator (thus perturbing the leading process without affecting the event occurrence times) or on the global interaction (thus perturbing the waiting time distribution). The first approach, which we refer to as “phenomenological”, gives $\chi(t,s) = \frac{d}{ds}C(t,s)$; the second one, which we call “dynamic”, gives $\chi(t,s) = -\frac{d}{dt}C(t,s)$. In the stationary case both prescriptions lead back to Kubo’s theorem.

We assert that the “dynamic” approach is the one that better describes our processes, and we then extend this theory to non dichotomous processes. In this case a new term appears besides the linear response term. If the perturbation is harmonic, $A\cos(\omega t + \varphi)$, the linear response theory leads to a response of the following form:

$$\langle A(t)\rangle_{\mathrm{pert}} - \langle A(t)\rangle_{\mathrm{unpert}} = \varepsilon B R(t)\cos(\omega t + \Phi),$$

where $R(t)$ decays as an inverse power law in $t$ with an exponent fixed by $\mu$, and $B$ and $\Phi$ depend on the specific characteristics of the system. We then illustrate an experimental result on liquid crystal dynamics that confirms our theory.


1 Complex systems, power laws and subordination theory

Answering a question like “what is complexity science?” is still a very hard task. Complexity science is a very recent discipline and, in spite of an exponentially increasing number of results, it still lacks the support of a unifying theory accepted by the majority of the scientists working in the field. The large variety of systems studied, the diversity of behaviors usually labeled as “complex” and its interdisciplinary status have made it a very fast changing discipline in which many different approaches, each with its pros and cons, coexist: there is not yet a commonly recognized foundation, even if some typical behaviors are recognized.

As a discipline, complexity science suffers from the difficulty of defining what a complex system is. A typical heuristic reply to this question is a negative one: a system is complex if it is neither completely deterministic nor completely stochastic. Complex systems stand, in a certain way, between Newtonian physics (that is, the physics of large scales) and statistical and quantum physics (the physics of small scales).

Although this definition is correct, it is too vague to be satisfactory, and we would like to elucidate some specific behaviors of complex systems.

1.1 Characteristics of complex systems

This said, we would like to give a more “positive” definition of a complex system, by listing some properties that we regard as hallmarks of complex behavior (see []).

We want to identify three different types of behavior that can characterize a complex system:

Chaos. As usually defined, a chaotic system is a causal system with unpredictable evolution. This is historically the first example of a complex system.

Non linearity. Non linear systems, that is, systems whose outputs are not proportional to their inputs, are another class of systems that exhibit complex dynamics (e.g. limit cycles, bifurcations, period doubling).

Self-organization and cooperation. Complex systems like neural networks, scale-free complex networks, cellular automata and decision-making networks show some typical dynamics characterized by different forms of self-organization, by the birth and death of coherent structures (patterns), and by power law behaviors.


We shall be mainly interested in strongly cooperative systems whose evolution can be characterized by a renewal process.

We shall not consider in our discussion, any specific model, but rather we shall propose a generalized theory that may be applied whenever an event driven renewal description is plausible.

We shall then refer to some recent experimental evidence on liquid crystals [, ] which shows that an event driven description is possible in this case, and so we shall compare our theoretical proposals with experimental results.

1.1.1 Power laws

In complex systems the emergence of power laws is ubiquitous. Power laws have been found to govern the occurrence times of large earthquakes [, ], and to model the behavior of financial markets [], rainfall [] and many other phenomena.

Brain dynamics too seems to follow a power law distribution of events: similarities between Omori’s law for earthquakes and the distribution of epileptic seizures have been found []. Many complex networks too, like the World Wide Web [] (see Figure 1.1), social networks and human dynamics (e.g. electronic correspondence [] and traditional correspondence []), exhibit a complex topology characterized by a power law distribution of the degrees of the nodes. Those networks, called scale-free complex networks [, ], shed light on the nature of the emergence of power laws in complex systems.

Figure 1.1: A figure taken from [?] which shows the emergence of a power law distribution for real networks: the distribution function of connectivity for various large networks. (A) Actor collaboration graph. (B) WWW. (C) Power grid data. The dashed lines are power laws with slope µ.

While purely random traditional network models (e.g. Erdős-Rényi graphs [], Watts-Strogatz small world []) are characterized by a degree distribution that is mainly Poissonian, real-world complex networks, which have a very strong cooperative behavior, are better modeled by scale-free models.

Power laws are in fact able to correctly model what has been called sporadicity. Moreover, compared with Poissonian distributions, power laws allow rare events to occur with a higher, non negligible probability.
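The difference in the weight of rare events can be illustrated numerically. The following is a minimal sketch, not part of the original analysis: it assumes an exponential waiting time law and a power law survival probability of the form $(T/(T+t))^{\mu-1}$, with the hypothetical values $\mu = 2.5$ and $T = 0.5$ chosen so that both laws have unit mean.

```python
import math

def exponential_survival(t, mean=1.0):
    """Probability that a Poissonian (exponential) waiting time exceeds t."""
    return math.exp(-t / mean)

def power_law_survival(t, mu=2.5, T=0.5):
    """Probability that a waiting time with density ~ 1/(T + t)**mu exceeds t.

    mu and T are illustrative values, not taken from the text.
    """
    return (T / (T + t)) ** (mu - 1.0)

# Both laws have unit mean here: T / (mu - 2) = 0.5 / 0.5 = 1.
for t in (1.0, 10.0, 100.0):
    print(f"t = {t:6.1f}  exponential: {exponential_survival(t):.3e}  "
          f"power law: {power_law_survival(t):.3e}")
```

For waiting times a hundred times longer than the mean, the exponential law gives a vanishingly small probability, while the power law still gives a probability of a few parts in ten thousand.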

We consider the exponent µ the only parameter which governs the universality class of a system driven by a power law distribution, and experimental observations are able to determine it.

Many complex physical systems have been shown to exhibit a power law decay from a non-equilibrium condition: a recent example is provided by liquid crystals []. We are therefore interested in analyzing power law event driven processes as a model from which to extract theoretical predictions.

Let us point out here that power laws are asymptotic. They can be useful to model a universal long term behavior, but not the transient one, which is strongly dependent on the microscopic details.

In order to facilitate calculations we have to make some assumptions on the form of the distribution we shall use. These assumptions introduce biases, and influence the choice of the parameters, but they should not affect the asymptotic behavior of our results.

In this chapter we shall present some functions that exhibit an asymptotic power law behavior. We shall use these functions in our calculations.

Mittag-Leffler Function derivative

Mittag-Leffler function [] has frequently been considered to extend the concept of expo-nential. To understand it we have digress a little and give a rapid introduction to Fractional derivation.

There are many ways we could extend the concept of derivation. Liouville’s guess on exponential function (i.e.Dαeax = aαeax),has been historically the first attempt to extend the concept of Derivative but soon Liouville was confronted to the problems of this definition (it was not a coherent definition). More then a century took to mathematicians to give a coherent theory.

A fractional derivative cannot be a local operator. For the derivative operator defined over a L_{space this is obvious, since it has to can be constructed by the mean of infinite}

series of operator. But in general this is not obvious unless we use one of the many forms in which fractional Derivatives may be expressed, the Riemann-Liouville form.

For $q < 0$ we set [] []

$${}_aD_t^{(q)} X(t) = \frac{1}{\Gamma(-q)} \int_a^t \frac{X(\xi)}{(t-\xi)^{q+1}}\,d\xi$$

and we extend this definition to $\alpha = q + n$, with $n$ a positive integer:

$${}_aD_t^{(\alpha)} X(t) = {}_aD_t^{(q+n)} X(t) = \frac{d^n}{dt^n}\,{}_aD_t^{(q)} X(t).$$

Here the non locality of this operator is clear.

We want to introduce the Mittag-Leffler function as a generalization of the exponential function (see []). We know that the exponential function is the solution of the ordinary kinetic equation

$$D_t X_i(t) = -c_i X_i(t).$$

Integrating (notice that, according to the Riemann-Liouville definition above, integration is nothing but the ${}_0D_t^{(-1)}$ operator) we have:

$$X_i(t) - X_i(0) = -c_i\,{}_0D_t^{(-1)} X_i(t).$$

We can thus generalize this equation by dropping the indices and letting ${}_0D_t^{(-1)} \to {}_0D_t^{(-\nu)}$, that is

$$X_\nu(t) - X_\nu(0) = -c^\nu\,{}_0D_t^{(-\nu)} X_\nu(t).$$

This equation can be solved and we obtain

$$X_\nu(t) = X_\nu(0) \sum_{k=0}^{\infty} \frac{(-1)^k (ct)^{\nu k}}{\Gamma(\nu k + 1)} = X_\nu(0)\,E_\nu(c^\nu t^\nu).$$

We call the function

$$E_\nu(t) = \sum_{k=0}^{\infty} \frac{(-1)^k t^k}{\Gamma(\nu k + 1)}$$

the Mittag-Leffler function.

As this derivation shows, the Mittag-Leffler function is a kind of interpolating function between the exponential law and the power law.
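The series can be checked numerically for small arguments. The sketch below is only an illustration under stated assumptions: it truncates the series at a fixed number of terms (the hypothetical parameter `terms`), uses the sign convention of this section, in which the factor $(-1)^k$ is part of the definition, and is reliable only for moderate values of $t$, where the alternating sum does not suffer from cancellation.

```python
import math

def mittag_leffler(t, nu, terms=60):
    """Truncated series E_nu(t) = sum_k (-1)**k * t**k / Gamma(nu*k + 1).

    Sign convention of this section: for nu = 1 this is exp(-t).
    Intended for moderate t only (roughly 0 <= t <= 5) and 0 < nu <= 2.
    """
    total = 0.0
    for k in range(terms):
        total += (-1.0) ** k * t ** k / math.gamma(nu * k + 1.0)
    return total

# nu = 1 recovers the exponential decay.
print(mittag_leffler(1.0, 1.0), math.exp(-1.0))

# nu = 1/2 has the known closed form exp(t**2) * erfc(t).
print(mittag_leffler(1.0, 0.5), math.exp(1.0) * math.erfc(1.0))
```

For large arguments a direct summation of this kind is not adequate, and an asymptotic expansion or a dedicated algorithm would be needed.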

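Let us now consider the function $\psi_{ML}(t) = -\frac{d}{dt}E_\alpha(\lambda^\alpha t^\alpha)$. Its Laplace transform is

$$\hat\psi_{ML}(s) = \frac{1}{1 + \lambda^{-\alpha}s^{\alpha}} \qquad \text{with } \alpha \in (0, 1].$$

Bochner’s theorem assures us that $\psi_{ML}(t)$ is actually a probability density function. For $s \to 0$ we have

$$\hat\psi_{ML}(s) \sim 1 - \lambda^{-\alpha}s^{\alpha},$$

and so, using the Tauberian theorem for the Laplace transform ([], ch. V) and setting $\mu = \alpha + 1$, for $t \to \infty$ we have

$$\psi_{ML}(t) \sim \frac{\mu - 1}{\Gamma(2-\mu)\,\lambda^{\mu-1}}\,\frac{1}{t^{\mu}},$$

obtaining an asymptotic power law.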

Figure 1.2: Manneville map, $y_{n+1}$ versus $y_n$.

Manneville’s Map and Manneville’s distribution

In [], Paul Manneville proposed a model for intermittent turbulence, which we shall call Manneville’s map:

$$y_{n+1} = M(y_n) = y_n + \alpha y_n^{z} \pmod 1$$

with $z > 1$. This function is plotted in Figure 1.2.

As Gaspard and Wang found [], the dynamics of the Manneville map has a very distinctive behavior:

$1 \le z \le 3/2$: normal dynamics (Gaussian fluctuations);

$3/2 \le z \le 2$: transient anomalous dynamics;

$2 \le z$: anomalous dynamics (Lévy fluctuations).

As we see from Figure 1.3, the dynamics of the Manneville model is characterized by a certain form of clustering: long laminar phases interrupted by chaotic bursts.

Figure 1.3: Manneville series, $y_n$ versus $n$.
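The alternation of laminar phases and bursts can be reproduced by iterating the map directly. The following is a minimal sketch: the parameter values ($z = 1.8$, $\alpha = 1$) and the initial condition are hypothetical choices made for illustration, and the helper names are not from the original work.

```python
def manneville_step(y, z=1.8, alpha=1.0):
    """One iteration of the map y -> y + alpha * y**z (mod 1)."""
    return (y + alpha * y ** z) % 1.0

def manneville_series(y0, n_steps, z=1.8, alpha=1.0):
    """Return the trajectory [y_0, y_1, ..., y_n]."""
    ys = [y0]
    for _ in range(n_steps):
        ys.append(manneville_step(ys[-1], z, alpha))
    return ys

def laminar_lengths(ys):
    """Number of steps between consecutive jumps.

    Assuming alpha <= 1, the sequence grows monotonically during a laminar
    phase, so a jump (the mod 1 wrap) is detected where y_{n+1} < y_n.
    """
    lengths, current = [], 0
    for prev, nxt in zip(ys, ys[1:]):
        current += 1
        if nxt < prev:
            lengths.append(current)
            current = 0
    return lengths

ys = manneville_series(0.1234, 10000)
lengths = laminar_lengths(ys)
print(len(lengths), max(lengths) if lengths else None)
```

A histogram of `lengths` over a long run is one way to look for the slowly decaying tail of the laminar phase durations discussed next.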

Having established the intermittent nature of $y$, we aim to calculate the probability density function of the time between two consecutive jumps []. To do that we take a continuous time limit, for example by considering the differential equation

$$y' = \alpha y^{z}.$$

The solution of this equation is given by

$$\alpha(\tau - \tau_0) = \int_{y_0}^{y} \frac{dy'}{y'^{\,z}} = \frac{1}{1-z}\left(\frac{1}{y^{z-1}} - \frac{1}{y_0^{z-1}}\right).$$

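Thus the time distance between two consecutive jumps (that is, by the structure of the map, we set $\tau_0 = 0$ and $y = 1$, with $y_0$ the point from which the laminar phase starts) is given by

$$\alpha\tau = \frac{1}{z-1}\left(\frac{1}{y_0^{z-1}} - 1\right).$$

By inversion of this equation we get

$$y_0 = \zeta(\tau) = \left(\frac{1}{(z-1)\alpha\tau + 1}\right)^{1/(z-1)}.$$

Since $y_0 \sim U(0,1)$, and a smaller $y_0$ gives a longer $\tau$, we have

$$\psi_M(t) = \frac{d}{dt}\mathrm{Prob}(\tau < t) = \frac{d}{dt}\mathrm{Prob}(y_0 > \zeta(t)) = -\frac{d}{dt}\left(\frac{1}{(z-1)\alpha t + 1}\right)^{1/(z-1)} = (\mu - 1)\frac{T^{\mu-1}}{(T+t)^{\mu}},$$

where we have set $\mu = \frac{z}{z-1}$ and $T = \frac{1}{(z-1)\alpha}$.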

We can apply Gaspard and Wang’s analysis [] to this case and obtain:

$\mu \ge 3$: normal dynamics (Gaussian fluctuations, finite mean and variance);

$2 \le \mu \le 3$: transient anomalous dynamics (finite mean, variance not defined);

$1 \le \mu \le 2$: anomalous dynamics (Lévy fluctuations, mean and variance not defined).

Thus, Manneville’s intermittency is governed by a power law. We shall see that the Laplace transform (see []) of probability density functions is of fundamental importance for our theory, but in this case a closed form in terms of elementary functions is not available. We have, in fact,

$$\hat\psi_M(s) = \int_0^\infty e^{-st}\,(\mu-1)\frac{T^{\mu-1}}{(T+t)^{\mu}}\,dt = (\mu-1)\,T^{\mu-1}s^{\mu-1}e^{sT}\,\Gamma(1-\mu, sT),$$

where $\Gamma(x,\alpha) = \int_\alpha^\infty e^{-t}t^{x-1}\,dt$ is the upper incomplete Gamma function []. Since $\psi_M(t)$ is a probability density function, $\hat\psi_M(0) = 1$; we can compute its asymptotic behavior for $s \to 0$.

If $1 < \mu < 2$ we can expand the function and obtain

$$\hat\psi_M(s) \sim 1 - \Gamma(2-\mu)(sT)^{\mu-1}.$$

If $2 < \mu < 3$ we obtain instead

$$\hat\psi_M(s) \sim 1 + \frac{sT}{2-\mu} - \Gamma(2-\mu)(sT)^{\mu-1} = 1 - s\langle t\rangle - \Gamma(2-\mu)(sT)^{\mu-1},$$

where $\langle t\rangle = T/(\mu-2)$ is the mean waiting time.


Lévy function

Lévy’s distribution is another function which can be used as an asymptotic power law. Lévy introduced his distribution in the study of the following classes of distributions [, ].

Definition 1.1.1 (Infinitely divisible distributions). Let $\varphi_X(t)$ be the characteristic function of a probability distribution (i.e. $\varphi_X(t) = E_\mu(e^{itX})$ where $X \sim \mu$). A probability distribution is said to be infinitely divisible if for any $n$ there exists a probability measure $\nu$ whose characteristic function $\lambda_n(t)$ satisfies

$$\varphi_X(t) = (\lambda_n(t))^n.$$

Among all the infinitely divisible distributions a particular class is of wide interest. To explore this point let us use the following definition.

Definition 1.1.2 (Stable distribution). A distribution is stable if it is stable under convolution, that is if, for any $a_1, a_2, b_1, b_2$, there exist $a, b \in \mathbb{R}$ such that its characteristic function $\varphi_X(t)$ satisfies

$$\varphi_{aX+b}(t) = \varphi_{a_1X+b_1}(t)\,\varphi_{a_2X+b_2}(t).$$

If two independent random variables $X_1$ and $X_2$ are distributed according to a stable distribution, then their sum (rescaled and translated) is also distributed according to the same distribution.

The most widely known stable distribution is the Gaussian distribution. Lévy and Khintchine have shown that the only possible attractors of probability distributions are stable distributions, and Lévy has given a canonical representation theorem.

Theorem 1.1.1 (Lévy-Khintchine representation theorem). The most general form for the characteristic function $L_{\alpha,\beta}(k)$ of a stable distribution is given by

$$\ln L_{\alpha,\beta}(k) = i\gamma k - c|k|^{\alpha}\left(1 - i\beta\frac{k}{|k|}\,\omega(k,\alpha)\right)$$

where

$$\omega(k,\alpha) = \begin{cases} \tan\left(\frac{\pi\alpha}{2}\right) & \text{if } \alpha \ne 1 \\ -\frac{2}{\pi}\ln|k| & \text{if } \alpha = 1 \end{cases}$$

and where $\gamma$ is arbitrary, $c > 0$, $\alpha \in (0, 2]$ and $-1 \le \beta \le 1$.

Since $\gamma$ and $c$ only shift and rescale the distribution, they do not contribute to its shape. $\alpha$ and $\beta$ instead determine the shape of the Lévy distribution. The first is called the characteristic exponent since it governs the asymptotic behavior of the distribution. Indeed, the (bilateral) Laplace transform of the distribution can be calculated from the characteristic function and, expanding near the origin (taking $\gamma = 0$, $c = 1$ and $k = is$), we have

$$\hat\psi_L(s) = L_{\alpha,\beta}(is) \sim 1 - |k|^{\alpha} + i\beta|k|^{\alpha-1}k\,\omega(k,\alpha),$$

from which we get that, for $0 < \alpha < 2$,

$$\psi_L(x) \propto \frac{1}{|x|^{1+\alpha}} \quad \text{for } x \to \pm\infty;$$

for $\alpha = 2$ the function becomes a Gaussian distribution.

$\beta$ is called the skewness parameter since it controls the symmetry of the function. For $\beta = 0$ we obtain the symmetric Lévy function; for $\beta = -1$ the distribution is concentrated on the half line $[\gamma, \infty)$.

1.2 Mesoscopic phenomena and stochastic processes

Phenomena usually studied by physicists have well defined physical scales. Newtonian dynamics, classical (equilibrium) statistical physics and general relativity investigate macroscopic phenomena in which fluctuations can be neglected; quantum physics investigates microscopic phenomena in which quantum fluctuations are no longer negligible. Recent advances in technology (e.g. molecule tracking) have enabled scientists to investigate phenomena whose typical scales are not large enough for fluctuations due to the microscopic dynamics to be totally neglected, and yet not small enough for a complete quantum mechanical treatment to be set up. For these phenomena the name mesoscopic phenomena has been proposed.

The natural framework in which those systems are studied is that of stochastic processes. Stochastic processes can be seen as a kind of “microscopic phenomenological description” of the systems: that is, we take the microscopic dynamics into account through a fluctuating variable which describes it “phenomenologically”.

This approach is obviously not new to physics: it has indeed been widely used in those fields which, ahead of their time, studied phenomena we could today define as mesoscopic (e.g. non equilibrium statistical physics, Brownian motion etc.). We want to point out, nonetheless, that recent advances in experimental techniques have enabled us to study an extremely rich variety of new systems and phenomena which cannot be interpreted from within the usual perspective applied in those disciplines. New “interpretational” paradigms are needed to describe these new fundamental phenomena (and some, like self-organized criticality, have already been provided and have shown a powerful explanatory strength).

1.2.1 Stochastic processes: some definitions

The usual mathematical description of probability is, in certain respects, quite cumbersome. Here we shall limit ourselves to stating some definitions and properties needed in what follows. As reference books we have mainly used [] and [] (and also [], []).

Definition 1.2.1 (Stochastic process). Let $L(X, \mu, B)$ be a probability space. A stochastic (or random) process is a collection of stochastic variables $\{X_t\}_{t\in T}$ parameterized over a set $T$ and assuming values in $\mathbb{R}^n$.

If $T = \mathbb{R}$ we shall call the process a continuous time stochastic process. If $T = \mathbb{N}$ we shall call the process a discrete time stochastic process.

This definition enables us to translate every concept we already have on stochastic variables to stochastic processes.

Definition 1.2.2 (Finite dimensional distributions). Given a stochastic process $\{X_t\}_{t\in T}$ over the probability space $L(X, \mu, B)$, for any finite set of indexes $\{t_1, \dots, t_k\}$ we define the finite dimensional distributions of the process as the set of measures $\mu_{t_1,\dots,t_k}$ over $\mathbb{R}^{nk}$ defined by

$$\mu_{t_1,\dots,t_k}(F_1 \times \dots \times F_k) = \mathrm{Prob}(X_{t_1} \in F_1 \cap X_{t_2} \in F_2 \cap \dots \cap X_{t_k} \in F_k).$$

The mathematical definition of our processes allows us to interpret them in three different ways: either as random variables (i.e. measurable functions over our probability space, $X_t : X \to \mathbb{R}^n$), or as functions defined over the set $T \times X$ (i.e. instead of interpreting it as $X_t(F)$ we look at it as $X : (t, F) \in T \times X \to \mathbb{R}^n$). A third possibility is to interpret them as model functions for physics problems.

Definition 1.2.3 (Path). For any fixed $F \in B$ we call the function

$$f_F(t) = X(t, F)$$

a path of our process.

As is usually done for any set of variables, we can define some statistical properties for our processes. Two statistical properties are particularly important.

Definition 1.2.4 (Mean). Let $\{X_t\}_{t\in T}$ be a process on the probability space $L(X, \mu, B)$, and let $\mu_t$ be the 1-dimensional distribution as defined in Definition 1.2.2. We call the function

$$\mu(t) = E(X_t) = \int X_t\,d\mu_t$$

the mean of the process.

Definition 1.2.5 (Autocorrelation). Let $\{X_t\}_{t\in T}$ be a process on the probability space $L(X, \mu, B)$, and let $\mu_{t,s}$ be the 2-dimensional distribution as defined in Definition 1.2.2. We call the function

$$C(t, s) = E(X_t X_s) = \int X_t X_s\,d\mu_{t,s}$$

the autocorrelation of the process.

Among the great variety of processes, a particular class of continuous time processes is very important; it is characterized by the Markov property.

Definition 1.2.6 (Markov processes). Let $\{X_t\}_{t\in T}$ be a process on the probability space $L(X, \mu, B)$. We say that the process is a Markov process if, for any set of indexes $\{t_1, \dots, t_{k-1}, s\} \in \mathbb{R}_+^k$ such that $t_1 < t_2 < \dots < t_{k-1} < s$, the process has the Markov property

$$\mathrm{Prob}(X_s \in F_s \mid X_{t_{k-1}} \in F_{t_{k-1}} \cap \dots \cap X_{t_1} \in F_{t_1}) = \mathrm{Prob}(X_s \in F_s \mid X_{t_{k-1}} \in F_{t_{k-1}}).$$

Using the definition of conditional probability and the definition of finite dimensional distributions, the previous property translates into

$$\mu_{s,t_{k-1},\dots,t_1}(G \times F_{k-1} \times \dots \times F_1) = \frac{\mu_{s,t_{k-1}}(G \times F_{k-1}) \cdot \mu_{t_{k-1},t_{k-2}}(F_{k-1} \times F_{k-2}) \cdot \ldots \cdot \mu_{t_2,t_1}(F_2 \times F_1)}{\mu_{t_{k-1}}(F_{k-1}) \cdot \ldots \cdot \mu_{t_2}(F_2)},$$

that is, the 1 and 2 dimensional distributions totally determine the process.

If $\mu_{t,s}(F, G) = \mu_{t-s,0}(F, G)$ the Markov process will be called time homogeneous, otherwise time inhomogeneous.

Since we are interested in modeling physical systems, we content ourselves with choosing $\mathbb{R}$ as the basic space. We will moreover assume that the probabilities involved can be expressed in terms of their probability density functions (to be precise, with a slight abuse of notation, we shall include among those densities also tempered distributions like Dirac’s $\delta(t)$).

In the following, unless otherwise stated, we shall indicate the processes simply by $X_t$, or even $X(t)$ where no confusion is possible, and the probability density function of our process will be simply denoted by $p(x, t)$, with $p(x, t)\,dx = \mathrm{Prob}(X_t \in [x, x + dx])$. Conditional probabilities will analogously be written as $p(x, t \mid y, t')$.

Markov Chains

Among all Markov processes an important class shall be studied in more detail.

Definition 1.2.7 (Markov chain). A Markov process $\{X_t\}$ over a countable (or finite) subset of $\mathbb{R}$ is called a Markov chain.

Since the possible outcomes of the stochastic process are countable, we shall use a more comfortable notation. We label each possible outcome, which we shall refer to as a state, with an integer $i$, and we denote by $\pi_i(t)$ the probability that the state $i$ occurs at time $t$.

We will call transition probability $W_{ij}(t, s)$ the conditional probability $\mathrm{Prob}(i, t \mid j, s)$. Obviously $W_{ij}(t, t) = \delta_{ij}$, where $\delta_{ij}$ is the Kronecker symbol.

If the Markov property holds we can write

$$\pi_i(t) = \sum_j W_{ij}(t, s)\,\pi_j(s).$$

It is straightforward to derive some important properties, like the following.

Proposition 1.2.1 (Chapman-Kolmogorov-Smoluchowski). For all $i, j$ and for all $s < u < t$ the transition probabilities satisfy

$$W_{ij}(t, s) = \sum_k W_{ik}(t, u)\,W_{kj}(u, s).$$

We shall show that, under certain regularity hypotheses, a system of differential equations can be obtained.

Let us assume that $W_{ij}(t + dt, t) = \delta_{ij} + K_{ij}(t)\,dt + o(dt)$ and $\pi_i(t + dt) = \pi_i(t) + \frac{d}{dt}\pi_i(t)\,dt + o(dt)$. We can thus write:

$$\frac{d}{dt}\pi_i(t)\,dt = \sum_j K_{ij}(t)\,dt\,\pi_j(t).$$

If the series $\sum_j K_{ij}(t)\pi_j(t)$ still converges we can take limits and obtain

$$\frac{d}{dt}\pi_i(t) = \sum_j K_{ij}(t)\,\pi_j(t),$$

which we shall call the time inhomogeneous master equation.

If the Markov chain is time homogeneous, the transition probability will satisfy $W_{ij}(t, s) = W_{ij}(t - s, 0)$ and thus $W_{ij}(t + dt, t) = W_{ij}(dt, 0)$.

If $W_{ij}(dt, 0) \to \delta_{ij}$ at least linearly in $dt$, we can take the limit for $dt \to 0$ and write

$$\frac{d}{dt}\pi_i(t) = \sum_j K_{ij}\,\pi_j(t)$$

where $K_{ij} = \frac{d}{dt}W_{ij}(t, 0)\big|_{t=0}$. This equation is usually known in physics as the master equation.

Since the $\pi_i(t)$ are probabilities, we have to require that $\sum_i \pi_i(t) = 1$ at every time. If $\frac{d}{dt}\sum_i \pi_i(t) = \sum_i \frac{d}{dt}\pi_i(t)$, we have that $\sum_i \sum_j K_{ij}\pi_j(t) = 0$. If we change the order of summation¹ we obtain that the $K_{ij}$ satisfy the condition

$$\sum_i K_{ij} = 0.$$

Under these conditions ($K_{ii} = -\sum_{j \ne i} K_{ji}$) we can restate the master equation in the form

$$\frac{d}{dt}\pi_i(t) = \sum_{j \ne i} \left[K_{ij}\,\pi_j(t) - K_{ji}\,\pi_i(t)\right].$$

The physical interpretation of the master equation is now clear. If we interpret $\pi_i$ as the occupation probability of the state $i$, then $K_{ij}\pi_j(t)$ measures the growth of the occupation of state $i$ due to particles that leave the state $j$ to go to the state $i$, while $K_{ji}\pi_i(t)$ measures the decrease of the occupation of state $i$ due to particles that leave state $i$ to go to state $j$.

A Markov chain is said to have reached equilibrium if its probability distribution is time independent.

If our Markov chain satisfies a master equation, an equilibrium exists if $\sum_j K_{ij}\pi_j = 0$ admits a solution. If the states are infinitely many we cannot establish a priori whether an equilibrium exists. If we are dealing with a finite state homogeneous Markov chain, the existence of an equilibrium is guaranteed. In fact, the $K_{ij}$ are the entries of a matrix $K$ and the $\pi_i$ can be thought of as the elements of a vector $\pi$. Since the condition $\sum_i K_{ij} = 0$ says that one row is a linear combination of the other ones, $K$ is singular and $\mathrm{Ker}(K)$ contains at least one possible equilibrium solution.



¹ The hypotheses we have made are trivially true for finite dimensional Markov chains.
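The relaxation of a finite state chain toward an element of $\mathrm{Ker}(K)$ can be illustrated by integrating the master equation numerically. The following is a minimal sketch under stated assumptions: the 3-state matrix `K` is a hypothetical example whose columns sum to zero, and the integration uses plain Euler steps with a hypothetical step size.

```python
def master_equation_rhs(K, pi):
    """Right hand side of d(pi_i)/dt = sum_j K[i][j] * pi[j]."""
    n = len(pi)
    return [sum(K[i][j] * pi[j] for j in range(n)) for i in range(n)]

def relax_to_equilibrium(K, pi0, dt=0.01, steps=5000):
    """Integrate the homogeneous master equation with explicit Euler steps."""
    pi = list(pi0)
    for _ in range(steps):
        rhs = master_equation_rhs(K, pi)
        pi = [p + dt * r for p, r in zip(pi, rhs)]
    return pi

# Hypothetical rate matrix: each column sums to zero, so probability is conserved.
K = [[-0.5, 0.2, 0.1],
     [0.3, -0.4, 0.3],
     [0.2, 0.2, -0.4]]

pi_eq = relax_to_equilibrium(K, [1.0, 0.0, 0.0])
print(pi_eq, master_equation_rhs(K, pi_eq))
```

At the end of the run the right hand side is numerically zero, that is, the final vector lies in the kernel of `K` while remaining normalized.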


Discrete time Markov Chain

The previous discussion cannot be carried out for discrete time Markov chains, since limits are not allowed. This is not a serious obstacle: we shall repeat our discussion to obtain a similar result.

Transition probabilities will now depend on two discrete indices, $W_{ij}(n, m)$, with $W_{ij}(n, n) = \delta_{ij}$. In this case the Markov property translates nicely, since in the discrete case there is a “last step before”, that is:

$$\pi_i(n) = \sum_j W_{ij}(n, k)\,\pi_j(k) = \sum_j W_{ij}(n, n-1)\,\pi_j(n-1).$$

We can define $K_{ij}(n-1) = W_{ij}(n, n-1) - \delta_{ij}$ and restate the previous condition as

$$\pi_i(n) = \sum_j \left(\delta_{ij} + K_{ij}(n-1)\right)\pi_j(n-1),$$

which is the discrete analog of the inhomogeneous master equation.

If the process is finite dimensional we can adopt a vector form, with a matrix $(K)_{ij} = K_{ij}$ and a vector $\pi(n)$ whose components are $\pi_i(n)$. The discrete time master equation then becomes:

$$\pi(n) = \pi(n-1) + K(n-1)\,\pi(n-1).$$

Thus

$$\pi(n) = \Pi(n)\,\pi(0)$$

where $\Pi(n) = (I + K(n-1))(I + K(n-2)) \cdots (I + K(0))$ is called the propagator of the system. For a time homogeneous discrete Markov chain we have $\Pi(n) = \Pi(1)^n$.

Coin tossing

Fair coin tossing may be seen as a Markov chain. There are only two states, which we denote + and −, and at each step the system moves to the state + with probability 1/2 or to the state − with probability 1/2.

The most general form of the operator $K$ fitting the constraint $\sum_i K_{ij} = 0$ is

$$K = \begin{pmatrix} a & b \\ -a & -b \end{pmatrix}.$$

In order for $I + K$ to send every state to + or − with probability 1/2, so that the equilibrium is $(1/2, 1/2)$, we must set

$$K = \begin{pmatrix} -1/2 & 1/2 \\ 1/2 & -1/2 \end{pmatrix}.$$

We notice that $K^2 + K = 0$. We refer to this throughout as a dichotomous process.


Dice throwing

A generalization of the previous problem is that of a Markov chain in which the system has $k$ states and at each time moves to one of those states with a given probability $\pi_k$. Calling $\pi(n)$ the vector of probabilities at time $n$, $\pi_{eq}$ the steady distribution and $K$ the transition matrix previously defined, we have

$$\pi(1) = (I + K)\,\pi(0) = \pi_{eq}$$

and

$$\pi(2) = (I + K)\,\pi(1) = \pi_{eq},$$

thus yielding the matrix equations

$$K^2 + K = 0$$

and

$$K\,\pi_{eq} = 0.$$

We shall refer to this throughout as a multichotomous process.

Random Walk

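Another important example of a discrete Markov chain is the random walk. In this case we choose our transition probabilities to be constant:

$$W_{ij} = p\,\delta_{i,j+1} + q\,\delta_{i,j-1} \quad \text{with } p + q = 1.$$

The master equation then reads

$$\pi_i(n) = p\,\pi_{i-1}(n-1) + q\,\pi_{i+1}(n-1).$$

The solution is easily obtained. For a walker starting from the origin, the number $k$ of steps to the right after $n$ steps follows a binomial distribution and the position is $i = 2k - n$, so that

$$\pi_i(n) = \binom{n}{\frac{n+i}{2}}\,p^{\frac{n+i}{2}}\,q^{\frac{n-i}{2}}$$

for $n + i$ even and $|i| \le n$, and $\pi_i(n) = 0$ otherwise.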

1.3 Subordination theory and renewal processes

As we pointed out earlier, we are mainly interested in systems which exhibit a complex behavior characterized by the presence of abrupt transitions (which we call “events”) between two or more states, with a power law distribution density of the time distance between two consecutive events.

A benchmark characteristic of those systems is ageing, that is, the system maintains a memory of the moment of its preparation.

The theoretical framework which best suits the description of these systems is that of the subordination theory of renewal systems. A substantial treatment of this topic within advanced probability theory can be found in Feller’s work ([] and []), and a general review of renewal theory can be found in Cox’s work [].

We shall first illustrate the behavior of blinking quantum dots as a prototype of the systems we are interested in.


Blinking quantum dots: a prototypical system

Quantum dots (nanocrystals of semiconductors) are intensively studied since they seem to promise important applications such as light-emitting diodes, solid state lighting and lasers. Investigation of their nature has pointed out a characteristic behavior: fluorescence intermittency [].

This intermittency, subsequently called blinking, still has a not completely understood microscopic origin (even if many interpretations have been advanced) and constitutes a major problem to be solved in order to use semiconductor nanocrystals at their best.²

Figure 1.4: A figure taken from [] which shows the typical blinking behavior of quantum dots

The behavior of the blinking is complex; that is, it cannot be described by a Poissonian waiting time distribution. As shown in [], this typical behavior cannot be interpreted as a consequence of a slow modulation of parameters, since no ageing is possible within the framework of that theory. Renewal subordination theory, instead, has ageing as one of its benchmarks.

Figure 1.5: A figure taken from [] which shows the typical intensity over time behavior of quantum dots

Figure . shows a typical intensity over time fluctuation of blinking quantum dots. Anal-ysis of those data shows that the permanence time is a random variable which is roughly



recently a possible solution to this problem has been proposed [] but still it does not unveil the nature of this behavior

(24)

distributed like∼ 

tµ.

To model these systems we imagine that the interaction among units has the effect of creating abrupt transitions from one state to another. This is equivalent to assuming that the process is modeled by a coin tossing Markov chain in which the time between two tosses, due to complex interactions, is no longer constant but is distributed according to a power law.

This is the basic idea of subordination theory and we shall analyze it now.

1.3.1 Subordination theory

We start with a definition

Definition 1.3.1 (Subordinated process). Let $\{X_n\}$ be a discrete time stochastic process defined over $\mathbb{R}$, and $\{T_n\}_{n>0}$ a discrete time stochastic process defined over $\mathbb{R}^+$. We define the subordinated process of $X_n$ to $T_n$ as the continuous time process $\xi(t)$ given by

$$\xi(t) = \begin{cases} X_0 & \text{if } t < T_1 \\ X_n & \text{if } T_n < t < T_{n+1} \end{cases}$$

We shall call $\{X_n\}$ the leading process and $\{T_n\}_{n>0}$ the subordination generating process. We want to point out that this is a mathematical model. The only physical process is the result of subordination, $\xi(t)$; both the leading process and the subordination generating process are phenomenological descriptions of collective interaction.
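As an illustration of Definition 1.3.1, the following Python sketch builds $\xi(t)$ from a coin-tossing leading process and power-law waiting times. The function names are hypothetical, and the sampler assumes a survival probability of the form $\Psi(t) = (T/(t+T))^{\mu-1}$; both are choices made for this example only.

```python
import bisect
import random

def sample_waiting_time(mu, T, rng):
    """Inverse-transform sample, assuming Psi(t) = (T / (t + T))**(mu - 1)."""
    u = 1.0 - rng.random()                      # uniform in (0, 1]
    return T * (u ** (-1.0 / (mu - 1.0)) - 1.0)

def event_times(n_events, mu, T, rng):
    """Event times T_1 < T_2 < ... obtained by summing independent waiting times."""
    times, t = [], 0.0
    for _ in range(n_events):
        t += sample_waiting_time(mu, T, rng)
        times.append(t)
    return times

def subordinated_value(leading, times, t):
    """xi(t) = X_0 for t < T_1 and X_n for T_n < t < T_{n+1}.

    `leading` is [X_0, X_1, ...] and `times` is [T_1, T_2, ...].
    """
    n = bisect.bisect_left(times, t)            # number of events strictly before t
    return leading[n]

rng = random.Random(0)
times = event_times(1000, mu=1.5, T=1.0, rng=rng)
leading = [rng.choice((-1, 1)) for _ in range(len(times) + 1)]   # coin tossing
xi_sample = [subordinated_value(leading, times, t) for t in (0.5, 5.0, 50.0)]
```

The leading process needs one more element than there are event times, because $X_0$ is the value held before the first event.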

Surely, in certain cases we can give the leading process a microscopic interpretation, like the modeling of collisions in an ideal gas if we accept Boltzmann’s Stosszahlansatz. The waiting time distribution, in this case, will have to be inferred from the statistical properties of ideal gases.

Complex phenomena usually do not allow such a simple interpretation in terms of local microscopic vs. global macroscopic behavior, since both processes involved in the subordination structure emerge from cooperative global interactions. The distinction is rather made in terms of the effects both processes give rise to (i.e. the distinction we make is an a posteriori phenomenological one that enables us to propose a model).

Another fact that should be stressed is that subordination is a key mechanism for exploring cooperative systems, which need not be simple fundamental physical systems. Studies of neural and social networks have shown that they exhibit this characteristic behavior (which, moreover, happens to be tunable).

Independent Increment Processes and renewal processes

The previous definition is rather general: we could choose an arbitrary process as subordination generating process. The first simplification we shall make is to consider independent increment processes:

Definition 1.3.2 (Independent increment processes). A stochastic process $\{X_t\}$ is said to be an independent increment process if, for any $s < t < w < u$, $X_t - X_s$ and $X_u - X_w$ are independent variables.


The previous condition implies that the probability density function of the increment depends only on the time difference (i.e. $\mathrm{Prob}(X_t - X_s \in [x, x + dx]) = f(x, t - s)$). Among all independent increment processes we are particularly interested in renewal processes.

Definition 1.3.3 (Renewal process). A discrete time independent increment process $\{T_n\}$ defined on $\mathbb{R}^+$ is called a renewal process.

The reason why processes of this kind are called renewal processes will be clarified in the next section. We first point out the most important property of these systems.

A renewal process $\{T_n\}$ is totally determined by only one distribution

$$\psi(t) = \mathrm{Prob}(T_{n+1} - T_n \in [t, t + dt]) = \mathrm{Prob}(T_1 - T_0 \in [t, t + dt]) = f(t, 1)$$

which in the context of subordination theory we shall call the waiting time distribution.

The independent increment process condition enables us, in fact, to obtain all the other conditions immediately. If we define the events

$$A(t, n) = \{T_n - T_0 \in [t, t + dt]\}$$

by the independent increment hypothesis those events are independent. In particular the event $A(t, n)$ can be split recursively in this way:

$$A(t, n) = \bigcup_{t'} \left[A(t - t', 1) \cap A(t', n - 1)\right],$$

that is, we consider the probability of having the $n$-th event in $[t, t + dt]$ as the probability of finding the $(n-1)$-th at $t' < t$ while the last interval has length $t - t'$ (obviously this works because $T_n \in \mathbb{R}^+$). We immediately write

$$\psi_n(t) = \mathrm{Prob}(A(t, n)) = \mathrm{Prob}\Big(\bigcup_{t'} \left[A(t - t', 1) \cap A(t', n - 1)\right]\Big) = \int_0^t \psi_{n-1}(t')\,\psi(t - t')\,dt'.$$

If we consider the Laplace transform $\hat\psi(u)$ of the waiting time distribution (for its main properties see the Appendix), the Laplace transform of $\psi_n(t)$ is

$$\hat\psi_n(u) = \hat\psi(u)^n.$$
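The convolution structure of $\psi_n$ can be checked numerically. The sketch below assumes, purely for illustration, a Poissonian waiting time distribution $\psi(t) = g\,e^{-gt}$, for which the two-fold convolution is known in closed form, $\psi_2(t) = g^2 t\,e^{-gt}$ (the inverse Laplace transform of $(g/(u+g))^2$). The helper name `convolve` and the grid parameters are hypothetical.

```python
import math

def convolve(f, g, dt):
    """Trapezoidal approximation of (f * g)(t_k) = int_0^{t_k} f(t') g(t_k - t') dt'
    for samples of f and g on the uniform grid t_k = k * dt."""
    out = []
    for k in range(len(f)):
        if k == 0:
            out.append(0.0)
            continue
        s = 0.5 * (f[0] * g[k] + f[k] * g[0])          # end points
        s += sum(f[j] * g[k - j] for j in range(1, k))  # interior points
        out.append(s * dt)
    return out

rate, dt, n_steps = 1.0, 0.01, 500
grid = [k * dt for k in range(n_steps)]
psi = [rate * math.exp(-rate * t) for t in grid]            # psi_1 = psi
psi_2 = convolve(psi, psi, dt)                               # psi_2 = psi_1 * psi
exact = [rate ** 2 * t * math.exp(-rate * t) for t in grid]
max_err = max(abs(a - b) for a, b in zip(psi_2, exact))
```

Applying `convolve(psi_2, psi, dt)` again would give an approximation of $\psi_3$, and so on, mirroring the recursion for $\psi_n$.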

Renewal hypothesis: waiting time distribution as renewal failure time distribution

To clarify why discrete time, positive valued independent increment processes are called renewal processes, we have to think of the waiting time distribution as a failure time distribution. In his classical monograph on renewal theory ([]), Cox gives a simple but insightful description of what a renewal process is.

Renewal theory was originally linked to the study of probabilistic problems connected with the failure and replacement of components. Its typical terminology may sound a little odd to a scientist’s ear, but we shall use it, for now, so that the reader can easily find it in the specialized literature.


Let us then think of a robotized assembly line. It will work efficiently if all its components are working. But even the best constructed robot will sooner or later suffer some failure (due to wear, for example). We now assume that every time a robot fails it is immediately and completely restored to a perfectly working state. This is called the renewal hypothesis.

For simplicity’s sake (and because we are actually interested in this kind of mechanism) we consider a line with a single robot.

We can model the failure of such a robot through a real positive random variable $T$ called the failure time. This failure time gives rise to a failure time distribution $f(t)$. The probability that the system has not yet broken down at time $t$ is called the survival probability and has the obvious expression

$$\Psi(t) = \mathrm{Prob}(T > t) = \int_t^\infty f(t')\,dt'.$$

We can construct a discrete time process $\{T_n\}$ by setting $T_n$ to be the time of the $n$-th failure and renewal. This is, by construction, a discrete time, positive valued independent increment process as previously defined, and the reason why it is called a renewal process is now clear.

The renewal hypothesis enables us to give a nice description of the failure time distribution $f(t)$.

Let us consider a key property of renewal processes called the failure rate:

$$g(t) = \lim_{\Delta t \to 0^+} \frac{\mathrm{Prob}(T \in [t, t + \Delta t] \mid t < T)}{\Delta t}.$$

Since

$$\mathrm{Prob}(T \in [t, t + \Delta t] \mid t < T) = \frac{\mathrm{Prob}((T \in [t, t + \Delta t]) \cap (T > t))}{\mathrm{Prob}(T > t)}$$

we get

$$g(t) = \frac{f(t)}{\Psi(t)}.$$

By definition $f(t) = -\frac{d}{dt}\Psi(t)$. Therefore:

$$g(t) = -\frac{\Psi'(t)}{\Psi(t)} = -\frac{d}{dt} \log \Psi(t).$$

Integrating this equation and noticing that $\Psi(0) = 1$, we obtain

$$\Psi(t) = \exp\left(-\int_0^t g(t')\,dt'\right).$$

Therefore, the occurrence time of a renewal process is completely characterized by its failure rate. The last equation enables us to analyze some choices of $g(t)$.

• $g(t) = g_0$: in this case the failure rate is constant and we obtain $\Psi(t) = \exp(-g_0 t)$ and subsequently $f(t) = g_0 \exp(-g_0 t)$: this is the case of the Poissonian failure time distribution. This is the typical “failure” mechanism in traditional physics (e.g. radioactive decay, usual statistical physics phenomena, etc.).

• $g(t) \sim A t^{\alpha}$ with $\alpha > 0$: in this case we get a probability whose tails are super-exponentially depressed, $\Psi(t) \sim \exp(-A t^{\alpha+1}/(\alpha + 1))$.

• $g(t) \sim A t^{\alpha}$ with $-1 < \alpha < 0$: in this case we get sub-exponential distributions which asymptotically give rise to what are called stretched exponential distributions, $\Psi(t) \sim \exp(-A t^{\gamma}/\gamma)$ with $\alpha + 1 = \gamma \in [0, 1]$.

• $g(t) \sim A/t$: in this case we get power laws; in fact $\Psi(t) \sim \exp(\log(t^{-A})) = 1/t^{A}$.

• $g(t) \sim t^{\alpha}$ with $\alpha < -1$: in this case we find that the construction is impossible, since it would lead to immortality; that is, $f(t)$ is not normalized to 1.

This analysis shows, then, that power laws are a limiting case of failure time distributions, one that seems to model sporadicity correctly.

We notice, moreover, that $g(t)$ is not constant, so the failure rate, that is the probability of decaying per unit time, changes over time: the system is ageing, in the sense that from an estimate $g^*$ of $g(t)$ we can get an estimate of the age of the system (i.e. the time elapsed since the last failure), $g^{-1}(g^*)$.

A specific choice of $g(t)$ lets us derive the power laws we have already presented. If

$$g(t) = \frac{r_0}{1 + r_1 t}$$

we obtain $\Psi(t) = (1 + r_1 t)^{-r_0/r_1}$. If we call $\mu = 1 + \frac{r_0}{r_1}$ and $T = \frac{1}{r_1}$ we obtain back Manneville’s distribution ..

More complicated (i.e. non-analytical) choices lead to Lévy and Mittag-Leffler distributions.

The rate of events per unit time

Let us consider the random variable

$$N(t) = \#\{\text{events occurred in } [0, t]\}$$

We may ask what the mean number of events is. This calculation is easily carried out if we notice that the probability of having $n$ events before time $t$, that is of the event $B(n, t) = \{n \text{ events have occurred before time } t\}$, can be split as before:

$$\mathrm{Prob}(B(n, t)) = \mathrm{Prob}\Big(\bigcup_{t'} \big(A(n, t') \cap A(0, t')\big)\Big) = \int_0^t \psi_n(t - t')\,\Psi(t')\,dt'$$

and thus the mean is easily written out:

$$H(t) = E(N(t)) = \sum_n n\,\mathrm{Prob}(B(n, t)) = \sum_{n=0}^{\infty} \int_0^t n\,\psi_n(t - t')\,\Psi(t')\,dt'$$

Using the Laplace transform we have:

$$\hat H(u) = \frac{1 - \hat\psi(u)}{u} \sum_{n=0}^{\infty} n\,\hat\psi^n(u) = \frac{1 - \hat\psi(u)}{u}\,\hat\psi(u)\,\frac{d}{d\hat\psi(u)} \sum_{n=0}^{\infty} \hat\psi^n(u) = \frac{1}{u}\,\frac{\hat\psi(u)}{1 - \hat\psi(u)}$$

and thus

$$H(t) = \int_0^t \sum_{n=1}^{\infty} \psi_n(t')\,dt'.$$


We can now define a crucial quantity for our renewal processes, the mean rate of events:

$$R(t) = \frac{d}{dt} H(t) = \frac{d}{dt} E(N(t)) = \sum_{n=1}^{\infty} \psi_n(t)$$

Another way to understand what $R(t)$ is consists in considering the event $E = \{\text{an event occurs at time } t\}$. It can easily be split into a union of mutually exclusive events, that is

$$\mathrm{Prob}(E) = \mathrm{Prob}\Big(\bigcup_n A(n, t)\Big) = \sum_{n=1}^{\infty} \psi_n(t)\,dt = R(t)\,dt.$$
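A simple Monte Carlo estimate of $H(t) = E(N(t))$ may help fix ideas. The sketch assumes a Poissonian waiting time distribution with rate $g$, for which $\hat\psi(u) = g/(u+g)$ and therefore $\hat R(u) = \hat\psi/(1 - \hat\psi) = g/u$, so that $R(t) = g$ and $H(t) = g\,t$. The function name and the numerical values are hypothetical.

```python
import random

def count_events(t_max, draw_waiting_time):
    """Number of renewal events N(t_max) in one realization of the process."""
    t, n = 0.0, 0
    while True:
        t += draw_waiting_time()
        if t > t_max:
            return n
        n += 1

rng = random.Random(1)
rate, t_max, n_runs = 2.0, 5.0, 4000      # illustrative values (assumption)

counts = [count_events(t_max, lambda: rng.expovariate(rate)) for _ in range(n_runs)]
mean_count = sum(counts) / n_runs          # Monte Carlo estimate of H(t_max)
expected = rate * t_max                    # H(t) = g t in the Poissonian case
```

Replacing `rng.expovariate(rate)` with a power-law sampler gives an estimate of $H(t)$ in the non-Poissonian case, where $R(t)$ is no longer constant.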

For the Manneville power law, using the Tauberian theorem and the asymptotic expansions . and ., we write for $1 < \mu < 2$

R_{(u) ∼ (}uT)−µ Γ( − µ) (.) and thus R_{(t) ∼}  Tµ−_Γ_{( − µ)Γ(µ − )}. (.) and for < µ <  R_{(u) ∼}  ⟨t⟩u + (uT) −µ Γ( − µ) (.) and thus R_{(t) ∼}  ⟨τ⟩[+ Tµ− − µ  tµ−]. (.)

Subordinated renewal processes

We can now completely analyze subordinated renewal processes. Our analysis is based on the seminal works of Montroll and Weiss on the Continuous Time Random Walk (CTRW) [].

Our aim is to obtain the distribution of the subordinated process, $p(\xi, t)$.

The key idea is to consider that, according to our definition, the two processes are independent. Since we know by hypothesis the distribution of the leading process, $\pi(\xi, n)$, and our waiting time distribution, we have everything we need. In fact the pdf is

$$p(\xi, t)\,d\xi = \mathrm{Prob}\big[(X_0 \in [\xi, \xi + d\xi] \cap \text{no event occurred until } t) \cup \cdots \cup (X_n \in [\xi, \xi + d\xi] \cap \text{exactly } n \text{ events occurred before time } t) \cup \cdots\big]$$

Then, by independence, we can write

$$p(\xi, t)\,d\xi = \mathrm{Prob}\Big(\bigcup_n \big(B(n, t) \cap X_n \in [\xi, \xi + d\xi]\big)\Big) = \sum_n \mathrm{Prob}(B(n, t))\,\pi(\xi, n)\,d\xi$$

Now we have all the pieces of information needed to write what is sometimes known as the Montroll-Weiss equation:

$$p(\xi, t) = \sum_{n=0}^{\infty} \int_0^t \psi_n(t - t')\,\Psi(t')\,\pi(\xi, n)\,dt'.$$

This is the most general form the Montroll-Weiss equation can take unless we make some further hypotheses on our system.


Generalized master equation

If $\{X_n\}$ is a finite, time homogeneous, discrete Markov chain, adopting our shortcut notation³ we can write:

$$p(t) = \sum_{n=0}^{\infty} \int_0^t \psi_n(t - t')\,\Psi(t')\,\Pi(1)^n\,dt'\,\pi(0).$$

Taking the Laplace transform of both sides we write:

$$\hat p(u) = \frac{1 - \hat\psi(u)}{u} \sum_{n=0}^{\infty} \big(\hat\psi(u)\,\Pi(1)\big)^n\,\pi(0).$$

Since both $|\hat\psi|$ and $\|\Pi(1)\|$ are less than 1, we can sum the geometric series and, considering that $p(0) = \pi(0)$, we have:

$$\hat p(u) = \frac{1 - \hat\psi(u)}{u}\,\frac{1}{1 - \hat\psi(u)\,\Pi(1)}\,p(0).$$

Defining $K = \Pi(1) - I$ and rearranging we obtain:

$$u\,\hat p(u) - p(0) = \frac{u\,\hat\psi(u)}{1 - \hat\psi(u)}\,K\,\hat p(u).$$

Transforming back we obtain the Generalized Master Equation

$$\frac{d}{dt} p(t) = \int_0^t \Phi(t - t')\,K\,p(t')\,dt',$$

where the quantity $\Phi(t)$ is called the memory kernel and is defined by its Laplace transform:

$$\hat\Phi(u) = \frac{u\,\hat\psi(u)}{1 - \hat\psi(u)}.$$

We thus see that subordination induces a loss of Markovianity, that is, it introduces memory into the process.
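As a consistency check (a worked example added here for illustration), consider the Poissonian case $\psi(t) = g\,e^{-gt}$ with constant rate $g$, for which the memory kernel should reduce to a delta function:

```latex
\hat\psi(u) = \frac{g}{u+g}, \qquad 1 - \hat\psi(u) = \frac{u}{u+g}
\;\Longrightarrow\;
\hat\Phi(u) = \frac{u\,\hat\psi(u)}{1-\hat\psi(u)} = g
\;\Longrightarrow\;
\Phi(t) = g\,\delta(t), \qquad \frac{d}{dt}\,p(t) = g\,K\,p(t).
```

Under this assumption the Generalized Master Equation becomes an ordinary, memoryless master equation; memory appears only for non-exponential waiting time distributions.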

Ageing

Renewal processes derived by subordination are characterized by ageing. To see this, let us suppose a renewal system is prepared at time 0, and our observation starts at time $s$. Obviously the first occurrence time is no longer governed by our waiting time distribution. We have in fact (a graphical sketch can be found in Figure 1.6) to find the distribution of the event $O = \{\text{the first observable event occurs at time } t \text{, given that the observation of the system started at time } s\}$.

As usual we can split this event, for any $0 < t' < s < t$, as follows:

$$O = \bigcup_{t'} \bigcup_n \big(A(n, t') \cap A(1, t - t')\big)$$



³ We label states by $j$ and consider the probability vector $\pi(n)$ of components $\pi(j, n)$ and the vector $p(t)$ of the probabilities $p(i, t)$.


Figure 1.6: A visual sketch of aged waiting time calculation

that is, by disjunction and independence we can write the waiting time distribution of age $s$:

$$\psi(t, s) = \mathrm{Prob}(O) = \sum_{n=0}^{\infty} \int_0^s \psi_n(t')\,\psi(t - t')\,dt' = \psi(t) + \int_0^s R(q)\,\psi(t - q)\,dq.$$

We can associate with it the survival probability of age $s$ by integrating:

$$\Psi(t, s) = \int_t^{\infty} \sum_{n=0}^{\infty} \int_0^s \psi_n(t')\,\psi(t'' - t')\,dt'\,dt''.$$

Changing the order of integration and using formula . we obtain:

$$\Psi(t, s) = \Psi(t) + \int_0^s R(t')\,\Psi(t - t')\,dt'$$

We can obtain this prescription directly by considering the stochastic failure rate, that is:

$$r(t) = g(t - t_i)$$

where $t_i$ is the time of the last event before $t$. In a certain way $r(t)$ represents the failure rate of the entire process (see Figure 1.7). Remembering the definition of $g$ we can write:

$$\Psi(t, s) = \left\langle \int_0^s \big(\delta(q) + R(q)\big)\,e^{-\int_q^t r(\tau)\,d\tau}\,dq \right\rangle = \int_0^s \big(\delta(q) + R(q)\big)\,e^{-\int_0^{t-q} g(\tau)\,d\tau}\,dq = \Psi(t) + \int_0^s R(q)\,\Psi(t - q)\,dq$$

We want to point out that for Poissonian processes we have

$$g(t) = r(t) = R(t) = \frac{1}{\langle t \rangle}$$


Figure 1.7: A simple example of r(t) corresponding to equation g(t) =  +t


2 Ergodicity, ergodicity breaking and non stationarity

The aim of this chapter is to introduce the reader to the concept of ergodicity as it has been conceived by physicists and mathematicians, and to analyze some physical phenomena which exhibit “ergodicity breaking”. It will be shown, in particular, that complex systems are likely to be “non ergodic systems”.

2.1 Boltzmann’s Ergodic hypothesis

The theory of “irreversibility” was always a hard problem for the physicists of the XIX century. Clausius’ law, which had been confirmed by experiments, posed a complicated problem: how can irreversibility arise from fundamental microscopic laws, which are invariant under time reversal?

It was not until the end of the century that a solution appeared, thanks to the work of Boltzmann.

Ludwig Boltzmann had already begun to organize his theories about irreversibility while building his kinetic theory. His H functional seemed, then, to provide a good mathematical instrument to show that irreversibility could be derived from his kinetic theory, but he was still unable to link his “phenomenological” theory to the fundamental microscopic laws.

During the ’ and ’ of the XIX century Boltzmann proposed in his papers his Ergodic Hypothesis as the foundation of his, then innovative, theory of irreversibility. The form in which the ergodic hypothesis is usually stated nowadays is to be ascribed to Ehrenfest, who stated it in a review of  []:

Boltzmann-Ehrenfest’s Ergodic hypothesis. A dynamical system, during its evolution, will take all the microscopic configurations compatible with a given macroscopic state (i.e. a single trajectory will cover the whole phase space during its evolution).

To be fair, Boltzmann never stated his hypothesis this way; he limited himself to assuming a “uniform probability” over phase space.

The evolution of conservative systems is known to follow Liouville’s equation

$$\partial_t \rho = L \rho$$

where $\rho(p_i, q_i) \prod_i d^d p_i\, d^d q_i$ is the measure on the phase space (MPS).

Liouville’s theorem guarantees that time evolution preserves the phase space measure. Normalizing the MPS we get a probability space. Thus, for conservative systems, this probability is invariant under time evolution. This means that we can “safely” consider temporal means of a variable $f$:

$$\lim_{t \to \infty} \frac{1}{t} \int_0^t f(x(t'))\,dt'$$

In the Boltzmann-Gibbs framework, we “introduce” a measure of our ignorance of the actual initial conditions of the system by defining a new space, the Gibbs ensemble, which is nothing but the set of infinitely many copies of the given dynamical system at a fixed time, each of which is the time evolution of one of the possible compatible initial conditions. We associate a probability measure with each phase space configuration in the usual frequency limit way and use it to calculate averages.

The ergodic hypothesis is, roughly speaking, nothing but the assumption that the MPS and the Gibbs measure are the same; that is, the temporal mean and the Gibbs average coincide.

As stated earlier, the Boltzmann-Ehrenfest hypothesis has been proved to be false. The original Boltzmann hypothesis has been weakened and stated in a more “realistic” way:

Ergodic hypothesis (weak form). The set of values taken by a dynamical system is dense in the set of all the microscopic configurations compatible with a given macroscopic state.

In this form, which has enabled mathematicians to state and prove ergodic theorems, the ergodic hypothesis has been proved to hold for some dynamical systems, but it is still not clear why it should be true for all of them. Moreover, even if it guarantees the existence of time and ensemble averages and their equivalence, it has been shown that for an arbitrary observable the time needed to reach equilibrium is exponential in the number of elements of the system.

Most authors (e.g. Landau []) tend to diminish the importance of this hypothesis as a foundational hypothesis of statistical physics, and in recent years many examples of ergodicity breaking have been shown to exist.

We will show that the complex systems of our interest are non ergodic.

2.2 Mathematical theory of ergodicity and Birkhoff theorem

Mathematicians tried to establish a well founded theory of ergodicity during the XX century, and succeeded in establishing very powerful results, which are linked to the mathematical theory of dynamical systems. Before correctly stating the main, and best known, result of this theory, we have to give some preliminary definitions (see also []).

As we have shown in the previous section, Boltzmann’s ergodic hypothesis allows us to associate with any system a probability space $(X, \mathcal{B}, \mu)$ which describes how certain microscopic configurations lead to a given macroscopic configuration. In this theory the macroscopic value of a dynamical variable $A$ is calculated as the mean $\langle A \rangle = \int A\,d\mu$.

Traditionally a statistical dynamical system is described mathematically by a flow on a metric space.

Definition 2.2.1 (Flow). Let $X$ be a metric space. We define a flow over $X$ as a collection of maps $\{T_t : X \to X\}$ indexed over a given set $I$ such that:

i. $T_t T_s = T_{t+s}$

ii. $T_0 = \mathrm{Id}$

Generically mathematicians call ergodic any asymptotic property of a dynamical system expressed by a flow. To find a connection with the main problem of statistical physics we confine ourselves to considering temporal means of a dynamical variable $f$ (i.e. an $L^1$ function of the dynamical variables).

The mathematical ergodic theory aims to analyze the temporal mean

$$\bar f = \lim_{T \to \infty} \frac{1}{T} \int_0^T f(T_t x)\,dt$$

and its relation with the spatial mean

$$\langle f \rangle = \int_X f\,d\mu$$

One of the most fundamental questions of the mathematical theory of ergodicity is to assess when the temporal mean of $f$ is equal to its spatial mean.

2.2.1 Invariant Measure

Before continuing our discussion we have to consider some definitions.

Definition 2.2.2 (Invariant measure). Let $(X, \mathcal{B}, \mu)$ be a probability space. Then the measure $\mu$ is said to be invariant with respect to the map $T : X \to X$ if $\mu(A) = \mu(T^{-1}A)$ for every $A \in \mathcal{B}$.

An obvious characterization of invariant measures is the following:

Lemma 2.2.1. A map $T$ preserves $\mu$ if and only if $\int f\,d\mu = \int f \circ T\,d\mu$ for all $f$ in $L^1(X, \mathcal{B}, \mu)$.

A trivial generalization of the previous definition can be obtained for flows.

Definition 2.2.3 (Invariant measure). Let $(X, \mathcal{B}, \mu)$ be a probability space. Then the measure $\mu$ is said to be invariant with respect to the flow $T_t$, for $t$ in $I$, if $\mu(A) = \mu(T_t^{-1}A)$ for all $t$ in $I$.

From now on we shall consider $I = \mathbb{N}$ and so $T_n = T^n$. Obviously, if a temporal mean exists we can confine ourselves to considering discrete flows. In this case invariance for flows is simply $T$ invariance.

Let us state one of the most fundamental results of ergodic theory:

Theorem 2.2.1 (Poincaré Recurrence Theorem). Let $T : X \to X$ be a measurable transformation on a probability space $(X, \mathcal{B}, \mu)$ preserving $\mu$. Let $A \in \mathcal{B}$ be such that $\mu(A) > 0$; then for almost all points $x \in A$ the orbit $\{T^n x\}_{n \ge 0}$ returns to $A$ infinitely often.


Proof. Let us define the set

$$F = \{x \in A : T^n x \notin A,\ \forall n > 0\}$$

First we note that $T^{-n}F \cap T^{-m}F = \emptyset$ for $n > m$. Were it not so, we would have, for $w \in T^{-n}F \cap T^{-m}F$, $T^m w \in F$ and $T^{n-m}(T^m w) \in F \subset A$, contradicting the definition of $F$. We can thus write

$$\sum_n \mu(T^{-n}F) = \mu\Big(\bigcup_n T^{-n}F\Big) \le 1$$

but $\mu$ is $T$-invariant, so every term of the sum equals $\mu(F)$ and the inequality can hold only if $\mu(F) = 0$.

2.2.2 Ergodic measures and Birkhoff’s theorem ergodic and invariant version

A stronger property is needed to establish Birkhoff theorem

Definition 2.2.4 (Ergodic measure). Let $(X, \mathcal{B}, \mu)$ be a probability space. Then the measure $\mu$ is said to be ergodic with respect to $T : X \to X$ if for every set $B \in \mathcal{B}$ with $B = T^{-1}B$, $\mu(B) = 0$ or $\mu(B) = 1$.

As before, we can characterize ergodic measures in a simple way.

Lemma 2.2.2. A map $T$ is ergodic with respect to $\mu$ if and only if, for every $f \in L^1(X, \mathcal{B}, \mu)$, $f = f \circ T$ implies that $f$ is constant.

Now we can state the first version of Birkhoff theorem

Theorem 2.2.2 (Birkhoff theorem). Let $f \in L^1(X, \mathcal{B}, \mu)$. If $\mu$ is ergodic then

$$\lim_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N-1} f(T^n x) = \int f\,d\mu$$

for almost every $x$ in $X$.
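The statement of the theorem can be illustrated numerically. The sketch below uses an example that is not taken from the text: the rotation of the unit interval by an irrational angle, $Tx = x + \alpha \bmod 1$, which is ergodic with respect to the Lebesgue measure. For the observable $f(x) = \cos^2(2\pi x)$ the spatial mean is $1/2$. Names and numerical values are hypothetical.

```python
import math

def birkhoff_average(f, T, x0, n_iter):
    """(1/N) * sum_{n=0}^{N-1} f(T^n x0), computed along a single orbit."""
    total, x = 0.0, x0
    for _ in range(n_iter):
        total += f(x)
        x = T(x)
    return total / n_iter

alpha = math.sqrt(2.0) - 1.0                     # irrational rotation angle (assumption)

def rotation(x):
    return (x + alpha) % 1.0

def observable(x):
    return math.cos(2.0 * math.pi * x) ** 2

time_avg = birkhoff_average(observable, rotation, x0=0.1, n_iter=100000)
space_avg = 0.5                                  # integral of cos^2(2 pi x) over [0, 1]
```

Choosing a rational `alpha` instead makes every orbit periodic, so the time average then generally depends on the starting point `x0` and need not match the spatial mean.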

The proof is quite technical and not very significant from a physical point of view. We assume without loss of generality that $\int f\,d\mu = 0$; if it is not so, we can substitute $f$ with $f - \int f\,d\mu$. The main idea of the proof is to show that the set

$$E_\varepsilon(f) = \Big\{x \in X : \limsup_{N \to \infty} \frac{1}{N} \Big| \sum_{n=0}^{N-1} f(T^n x) \Big| \ge \varepsilon \Big\}$$

has null measure (i.e. $\mu(E_\varepsilon(f)) = 0$). We first prove two sublemmas.

sublemma 2.2.2.1. $\mu(E_\varepsilon(f)) \le \frac{1}{\varepsilon} \int |f|\,d\mu$


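Proof. We write $f = f_+ - f_-$, where $f_+(x) = \max(f(x), 0)$ and $f_-(x) = \max(-f(x), 0)$. Obviously $|f| = f_+ + f_-$. Now we define

$$E^M_\varepsilon(f_+) = \Big\{x \in X : \exists\, 1 \le N \le M,\ \sum_{n=0}^{N-1} f_+(T^n x) \ge \varepsilon N \Big\}$$

and

$$E^M_\varepsilon(f_-) = \Big\{x \in X : \exists\, 1 \le N \le M,\ \sum_{n=0}^{N-1} f_-(T^n x) \ge \varepsilon N \Big\}$$

for $M \ge 1$. We observe that

$$\sum_{n=0}^{P-1} f_+(T^n x) \ge \varepsilon \sum_{j=0}^{P-M} \chi_{E^M_\varepsilon(f_+)}(T^j x)$$

and

$$\sum_{n=0}^{P-1} f_-(T^n x) \ge \varepsilon \sum_{j=0}^{P-M} \chi_{E^M_\varepsilon(f_-)}(T^j x)$$

where we have bounded $f_\pm$ from below by 0 or $\varepsilon$. Thus, integrating both sides of these two inequalities and using the invariance of $\mu$, we write:

$$\int \sum_{n=0}^{P-1} f_+(T^n x)\,d\mu(x) = P \int f_+\,d\mu \ge \varepsilon (P - M)\,\mu(E^M_\varepsilon(f_+))$$

and analogously:

$$\int \sum_{n=0}^{P-1} f_-(T^n x)\,d\mu(x) = P \int f_-\,d\mu \ge \varepsilon (P - M)\,\mu(E^M_\varepsilon(f_-))$$

for all $M \ge 1$. Dividing by $P$ and letting $P \to \infty$ we have:

$$\int f_\pm\,d\mu \ge \varepsilon\,\mu(E^M_\varepsilon(f_\pm))$$

and thus

$$\mu(E_\varepsilon(f)) \le \limsup_{M \to \infty} \mu(E^M_\varepsilon(f_+)) + \limsup_{M \to \infty} \mu(E^M_\varepsilon(f_-)) \le \frac{1}{\varepsilon} \Big( \int f_+\,d\mu + \int f_-\,d\mu \Big).$$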

Now we need to be able to control the size of the upper bound, and to do this we prove a second sublemma.

sublemma 2.2.2.2. If $\int f\,d\mu = 0$, then for every $\delta > 0$ there exists a function $h \in L^\infty(X, \mathcal{B}, \mu)$ for which $\int |f - (h \circ T - h)|\,d\mu < \delta$.


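Proof. Let $S$ be defined by

$$S = \{h \circ T - h : h \in L^\infty(X, \mathcal{B}, \mu)\}$$

and $B_0$ by

$$B_0 = \Big\{f \in L^1(X, \mathcal{B}, \mu) : \int f\,d\mu = 0\Big\}.$$

We show that $S$ is dense in $B_0$. The Hahn-Banach theorem guarantees that we only need to show that every continuous linear functional vanishing on $S$ also vanishes on $B_0$.

As is well known, for every continuous linear functional $\alpha(f)$ defined on $L^1(X, \mathcal{B}, \mu)$ there exists a function $k \in L^\infty(X, \mathcal{B}, \mu)$ such that $\alpha(f) = \int f \cdot k\,d\mu$. Now let us suppose that $\alpha$ vanishes on $S$; thus $\int (h \circ T - h) \cdot k\,d\mu = 0$ and, if $h = k$, we have $\int (k \circ T) \cdot k\,d\mu = \int k^2\,d\mu$.

We can then write, using the invariance of $\mu$:

$$\int (k \circ T - k)^2\,d\mu = \int (k \circ T)^2\,d\mu + \int k^2\,d\mu - 2 \int (k \circ T)\,k\,d\mu = 2 \Big( \int k^2\,d\mu - \int (k \circ T) \cdot k\,d\mu \Big) = 0$$

We have that $k = k \circ T$, and so $k$ must be constant by the ergodicity hypothesis. We can thus write, for every $f \in B_0$, $0 = k \int f\,d\mu = \int f k\,d\mu = \alpha(f)$, which proves the lemma.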

We can now proceed to prove Birkhoff theorem.

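Birkhoff’s theorem proof. As done earlier, we consider without loss of generality $f \in B_0$. Let $\delta > 0$. Using sublemma 2.2.2.2 we choose $h$ so that $\int |f - (h \circ T - h)|\,d\mu \le \delta$. Then

$$E_\varepsilon(f) = E_\varepsilon\big([f - (h \circ T - h)] + (h \circ T - h)\big) \subset E_{\varepsilon/2}(f - (h \circ T - h)) \cup E_{\varepsilon/2}(h \circ T - h)$$

and so:

$$\mu(E_\varepsilon(f)) \le \mu(E_{\varepsilon/2}(f - (h \circ T - h))) + \mu(E_{\varepsilon/2}(h \circ T - h)).$$

But $\forall x \in X$ we can write

$$\frac{1}{N} \Big| \sum_{n=0}^{N-1} (h \circ T - h)(T^n x) \Big| = \frac{1}{N} |h(T^N x) - h(x)| \le \frac{2\|h\|_\infty}{N}$$

and so $\mu(E_{\varepsilon/2}(h \circ T - h)) = 0$. Using sublemma 2.2.2.1 we have

$$\mu(E_{\varepsilon/2}(f - (h \circ T - h))) \le \frac{\int |f - (h \circ T - h)|\,d\mu}{\varepsilon/2} \le \frac{2\delta}{\varepsilon}$$

and thus, $\delta$ being arbitrary, $\mu(E_\varepsilon(f)) = 0$, which proves the result.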

2.3 Ergodicity of time series

It is a well known fact that Dynamical Systems like those considered in the previous section are in fact Markov Chains (see []).

In a certain way the Markov chain perspective is nothing but a microscopic phenomenological description of the effect of the global dynamics of the system. From this perspective we wonder how ergodicity is expressed in a stochastic process.


In the previous sections we have seen that ergodicity is roughly equivalent to saying that temporal means equal statistical means. Thus, for a single process, we can express ergodicity as follows (see []):

Definition 2.3.1 (Strict ergodic process). A stochastic process is ergodic if all its statistical means can be calculated through a single realization of the process.

As above, we can confine ourselves to considering a weaker form of ergodicity, that is:

Definition 2.3.2 (Wide sense ergodic process). A stochastic process is ergodic in the wide sense if the following hold:

$$\bar X_t = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} X(t')\,dt' = E[X(t)]$$

and

$$R_{XX}(\tau) = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} X(t')\,X(t' + \tau)\,dt' = E[X_t X_{t+\tau}]$$

It is natural to wonder what the equivalent concept of invariance is in the language of stochastic processes. In this case too, little work is needed to translate the concept:

Definition 2.3.3 ((Strictly) Stationary processes). A random process $\{X_t\}$ is called a (strictly) stationary process if its cumulative distributions satisfy

$$F_{X_{t_1} \ldots X_{t_n}}(x_{t_1}, \ldots, x_{t_n}) = F_{X_{t_1+\tau} \ldots X_{t_n+\tau}}(x_{t_1}, \ldots, x_{t_n}),$$

for all $t_i, \tau \in \mathbb{R}$.

Usually a weaker form of stationarity is sufficient to get useful results, namely that only the first and second moments are stationary:

Definition 2.3.4 (Wide sense Stationary processes). A random process $\{X_t\}$ is called a (weakly) stationary process if its mean satisfies

$$E[X_t] = E[X_{t+\tau}] = \mu$$

and its autocovariance (or autocorrelation) satisfies

$$E[X_t X_{t+\tau}] = E[X_0 X_\tau] = C(\tau)$$

for all $t, \tau \in \mathbb{R}$.

In a stationary process, thus, we can begin an observation at any time and we shall still be able to access all the information on the process.

As shown in the previous chapter, ergodicity is a stronger property than invariance: the same holds for ergodic and stationary processes.

Proposition 2.3.1. Ergodicity in the wide sense implies stationarity in the wide sense


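Proof. The proof is almost trivial. In fact, if the limits exist, the first condition of Definition 2.3.2 reads

$$E[X(t)] = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} X(t')\,dt' = \mu$$

and the second reads

$$E[X_t X_{t+\tau}] = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} X(t')\,X(t' + \tau)\,dt' = R_{XX}(\tau).$$

Neither right-hand side depends on $t$, which is precisely wide sense stationarity.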

Thus proving that a process (or a dynamical system) is not stationary is enough to show that it is not ergodic.

But stationarity does not imply ergodicity. To see this, let $U$ be a random variable with mean $\mu$. Let us consider the process defined as follows:

$$X_t = \begin{cases} U & \text{if } t = 0 \\ X_0 & \text{if } t > 0 \end{cases}$$

By construction this is a stationary process, but it is clearly non ergodic. In fact $\langle X_t \rangle = \mu$ but $\bar X_t = U$.
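The difference between the two averages in this counterexample can be made concrete with a few lines of Python. The Gaussian law chosen for $U$ and all numerical values are assumptions made only for this illustration.

```python
import random

rng = random.Random(3)
mu, n_realizations, n_steps = 1.0, 5000, 200     # illustrative values (assumption)

def realization():
    """One path of the process: U is drawn once and X_t = U for every t."""
    u = rng.gauss(mu, 1.0)
    return [u] * n_steps

paths = [realization() for _ in range(n_realizations)]

# Time average along each single path: it reproduces that path's own U.
time_averages = [sum(p) / n_steps for p in paths]

# Ensemble average at a fixed time (here t = 50): it estimates mu.
ensemble_mean = sum(p[50] for p in paths) / n_realizations
```

The ensemble mean does not depend on the time at which it is taken, which reflects stationarity, while the time averages are scattered from path to path and never converge to $\mu$.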

2.4 Ergodicity breaking

When it comes to “ergodicity breaking”, many physicists think of the usual critical phenomena. Second order phase transitions have, in fact, provided a very rich experimental ground upon which physicists have built a very well founded theory (see []). Typically, in those systems, ergodicity breaking is explained as a consequence of spontaneous symmetry breaking at a certain critical temperature $T_c$ (e.g. Curie Law for magnetization).

Similar but slightly different systems are those which undergo critical dynamics. In this case the system is thought to be in a non equilibrium state and is expected to regress to equilibrium during its time evolution. In a totally ergodic system, regression to equilibrium should occur with a precise and fixed “mean regression time”, which is nothing but the “time correlation length”.

When the system is near a critical point this turns out to be false: the nearer the system is to the critical point, the more its “time correlation length” grows. The system exhibits what is called critical slowing down.

The simple and rough Van Hove model [] had already shown this, and the models later proposed by Kawasaki in the late sixties [], together with the work of Hohenberg and Halperin [] [], who managed to give a Renormalization Group description of critical dynamics, have confirmed it.

All these theories have shown that the typical behavior of a system near a critical point satisfies what is called the dynamical scaling hypothesis, that is, the typical time behaves like: