

BAYESIAN NONPARAMETRIC CONSTRUCTION OF THE FLEMING-VIOT PROCESS WITH FERTILITY SELECTION

Matteo Ruggiero and Stephen G. Walker

Università degli Studi di Pavia and University of Kent

Abstract: This paper provides a construction in the Bayesian framework of the Fleming-Viot measure-valued diffusion with diploid fertility selection, and highlights new connections between Bayesian nonparametrics and population genetics. Via a generalisation of the Blackwell-MacQueen Pólya-urn scheme, a Markov particle process is defined such that the associated process of empirical measures converges to the Fleming-Viot diffusion. The stationary distribution, known from Ethier and Kurtz (1994), is then derived through an application of the Dirichlet process mixture model and shown to be the de Finetti measure of the particle process. The Fleming-Viot process with haploid selection is derived as a special case.

Key words and phrases: Blackwell-MacQueen urn-scheme, Dirichlet process mixture, fertility selection, Fleming-Viot process, Gibbs sampler.

1. Introduction and Preliminaries

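The Fleming-Viot process, introduced by Fleming and Viot (1979), is a diffusion on the space $\mathcal{P}(\mathcal{X})$ of Borel probability measures on $\mathcal{X}$, endowed with the topology of weak convergence, where $\mathcal{X}$ is a locally compact complete separable metric space called the type space. The general form of the generator which provides the Fleming-Viot process is given by Ethier and Kurtz (1993) as

$$
\begin{aligned}
A\phi(\mu) ={}& \frac{1}{2}\int_{\mathcal{X}}\int_{\mathcal{X}} \mu(dx)\{\delta_x(dy)-\mu(dy)\}\,\frac{\partial^2\phi(\mu)}{\partial\mu(x)\,\partial\mu(y)} + \int_{\mathcal{X}}\mu(dx)\,G\Big(\frac{\partial\phi(\mu)}{\partial\mu(\cdot)}\Big)(x)\\
&+ \int_{\mathcal{X}}\int_{\mathcal{X}}\mu(dx)\mu(dy)\,R\Big(\frac{\partial\phi(\mu)}{\partial\mu(\cdot)}\Big)(x,y) + \int_{\mathcal{X}}\int_{\mathcal{X}}\mu(dx)\mu(dy)\,\big(\sigma(x,y)-\langle\sigma,\mu^2\rangle\big)\frac{\partial\phi(\mu)}{\partial\mu(x)},
\end{aligned}
$$

where $\partial\phi(\mu)/\partial\mu(x)=\lim_{\varepsilon\to0+}\varepsilon^{-1}\{\phi(\mu+\varepsilon\delta_x)-\phi(\mu)\}$, $\mu^2$ denotes product measure, and we take the domain $\mathcal{D}(A)$ to be the set of all $\phi\in B(\mathcal{P}(\mathcal{X}))$, where $B(\mathcal{P}(\mathcal{X}))$ is the set of bounded, Borel measurable functions on $\mathcal{P}(\mathcal{X})$, of the form $\phi(\mu)=F(\langle f_1,\mu\rangle,\dots,\langle f_m,\mu\rangle)=F(\langle f,\mu\rangle)$, where $\langle f,\mu\rangle=\int f\,d\mu$ and, for $m\ge1$, we have $f_1,\dots,f_m\in\mathcal{D}(G)$ and $F\in C^2(\mathbb{R}^m)$. Also, $G$ is the generator of a Feller semigroup on the space $\hat C(\mathcal{X})$ of continuous functions vanishing at infinity, known as the mutation operator, $R$ is a bounded linear operator from $B(\mathcal{X})$ to $B(\mathcal{X}^2)$, known as the recombination operator, and $\sigma\in B_{\mathrm{sym}}(\mathcal{X}^2)$ is called the selection intensity function. Here $B_{\mathrm{sym}}(\mathcal{X}^2)$ is the set of nonnegative, bounded, symmetric, Borel measurable functions on $\mathcal{X}^2$. We assume throughout the paper that $R\equiv0$, i.e., there is no recombination in the model.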

Ethier and Kurtz (1986) showed that when there is neither selection nor recombination, and the mutation operator is

$$Gf(x)=\frac{1}{2}\theta\int[f(z)-f(x)]\,\nu_0(dz), \tag{1.1}$$

where $\theta>0$ and $\nu_0$ is a nonatomic probability measure on $\mathcal{X}$, then the stationary distribution of the Fleming-Viot process (in this case often called the neutral diffusion model) is the Dirichlet process with parameter $(\theta,\nu_0)$, denoted by $\Pi_{\theta,\nu_0}$.

The Dirichlet process, introduced by Ferguson (1973), is defined as follows. Let $\alpha$ be a finite measure on $\mathcal{X}$, endowed with its Borel sigma-algebra $\mathcal{B}(\mathcal{X})$. A random probability measure $\mu^*$ on $\mathcal{X}$ is said to be a Dirichlet process with parameter $\alpha$ if for every measurable partition $B_1,\dots,B_k$ of $\mathcal{X}$, the vector $(\mu^*(B_1),\dots,\mu^*(B_k))$ has the Dirichlet distribution with parameters $(\alpha(B_1),\dots,\alpha(B_k))$. A recent contribution by Walker, Hatjispyros and Nicoleris (2007) showed how the neutral diffusion model is closely related to Bayesian nonparametrics.
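The finite-dimensional property in Ferguson's definition can be illustrated numerically. The following is a minimal sketch, not part of the original construction: it assumes a hypothetical three-set partition with masses `nu0_masses` under $\nu_0$, takes $\alpha=\theta\nu_0$ as in the neutral model above, and draws the Dirichlet vector by normalising independent Gamma variables (a standard construction). All function and variable names are illustrative.

```python
import random

def sample_dirichlet(alphas, rng):
    """Draw one Dirichlet(alphas) vector by normalising independent Gamma(a, 1) draws."""
    gammas = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(gammas)
    return [g / total for g in gammas]

def dp_masses_on_partition(theta, nu0_masses, rng):
    """Masses (mu*(B_1), ..., mu*(B_k)) of a Dirichlet process with alpha = theta * nu0."""
    return sample_dirichlet([theta * m for m in nu0_masses], rng)

rng = random.Random(0)
nu0_masses = [0.5, 0.3, 0.2]   # hypothetical values nu0(B_1), nu0(B_2), nu0(B_3)
draws = [dp_masses_on_partition(2.0, nu0_masses, rng) for _ in range(20000)]

# Monte Carlo estimate of E[mu*(B_k)], which should be close to nu0(B_k).
means = [sum(d[k] for d in draws) / len(draws) for k in range(3)]
```

Each draw is a probability vector over the partition, and the empirical means should approach the masses assigned by $\nu_0$, since the mean of a Dirichlet process is its normalised parameter measure.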

Assuming (1.1) holds, define $\phi_m(\mu)=\langle f,\mu^m\rangle$, for $f\in B(\mathcal{X}^m)$ and $\mu^m$ being an $m$-fold product measure, and consider a diploid selection function $\sigma\in B_{\mathrm{sym}}(\mathcal{X}^2)$ and no recombination. Then the generator of the Fleming-Viot process is (cf., e.g., Donnelly and Kurtz (1999))

$$\sum_{i=1}^m\langle G_if,\mu^m\rangle+\frac{1}{2}\sum_{1\le k\ne i\le m}\langle\Phi_{ki}f-f,\mu^m\rangle+\sum_{i=1}^m\Big(\langle\sigma_{i\cdot}(\cdot,\cdot)f,\mu^{m+1}\rangle-\langle\sigma(\cdot,\cdot)\otimes f,\mu^{m+2}\rangle\Big), \tag{1.2}$$

where $G_i$ is (1.1) operating on $f$ as a function of $x_i$ alone, $\Phi_{ki}f$ is the function of $m-1$ variables obtained by setting the $i$th and the $k$th variables in $f$ equal, $\sigma_{i\cdot}(\cdot,\cdot)$ denotes $\sigma(x_i,x_{m+1})$, and $\sigma(\cdot,\cdot)\otimes f$ denotes $\sigma(x_{m+1},x_{m+2})f(x_1,\dots,x_m)$.

Ethier and Kurtz (1994) showed that in this case the stationary distribution is

$$\Pi(d\mu)=C\,e^{\langle\sigma,\mu^2\rangle}\,\Pi_{\theta,\nu_0}(d\mu). \tag{1.3}$$

The purpose of the present work is to extend Walker, Hatjispyros and Nicoleris (2007) and further detail how Bayesian nonparametrics is connected to population genetics, and to the Fleming-Viot diffusion in particular. More specifically, this paper provides an explicit construction of the Fleming-Viot process with diploid selection, whose generator is (1.2), based on ideas borrowed from the Bayesian nonparametric literature. Our arguments present interesting applications of some typically Bayesian models, such as Pólya predictive schemes, Gibbs sampling procedures, and Bayesian hierarchical models, to population genetics. The outline of the construction is the following. First a generalisation of the Blackwell-MacQueen Pólya urn scheme is introduced by means of a mixture model and the key predictive distribution is computed. Then we define an $\mathcal{X}^n$-valued Markov jump process, representing the evolution in time of the individuals, whose transitions are based on the new predictive, and it is shown that, with a particular choice for the selection function, the associated process of empirical measures converges weakly to a Fleming-Viot process with diploid fertility selection. By exploiting the properties of the Gibbs sampling algorithm, it is then shown that the stationary distribution of the measure-valued diffusion, known to be (1.3), is the de Finetti measure of the exchangeable variables introduced in the mixture model. In Section 3 some insight into the functional form of $\Pi(d\mu)$ will be provided from a Bayesian viewpoint. Finally, the Fleming-Viot process with haploid selection is derived as a special case.

2. Conditional Predictive Density

Let $p_n(dx_1,\dots,dx_n)$, for $n$ even (which we assume henceforth), be the exchangeable law associated with the Blackwell-MacQueen urn scheme (cf. (2.3) below and Blackwell and MacQueen (1973)). Let $P_n$ denote a pairing of $\{1,\dots,n\}$ such that, given $P_n$, $k$ is paired with $j_k$. The distribution of the pairings is assumed to be uniform. We will also use $\tilde\sigma_n(x_k,x_{j_k})$, where $\tilde\sigma_n\in B_{\mathrm{sym}}(\mathcal{X}^2)$. Then we consider a generalisation of $p_n$ via the function $\tilde\sigma_n(x,y)$, by introducing

$$q_n(dx_1,\dots,dx_n,P_n)\propto p_n(dx_1,\dots,dx_n)\prod_k\tilde\sigma_n(x_k,x_{j_k}). \tag{2.1}$$

In Section 3 a representation of (2.1) in terms of a mixture model is provided. It is clear that integrating out $P_n$, we obtain an exchangeable law. Removing one element of the vector $(x_1,\dots,x_n)$, say $x_i$, the predictive, jointly with $P_n$, is

$$q_n(dx_i,P_n\mid x_{-i})\propto p_n(dx_i\mid x_{-i})\,\tilde\sigma_n(x_i,x_{j_i}),$$

where $x_{-i}$ denotes $(x_1,\dots,x_{i-1},x_{i+1},\dots,x_n)$. Now

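$$q_n(P_n\mid x_{-i})\propto\int p_n(dw\mid x_{-i})\,\tilde\sigma_n(w,x_{j_i}),$$

and since from (2.1) we can write

$$q_n(dx_i\mid x_{-i},P_n)=\frac{p_n(dx_i\mid x_{-i})\,\tilde\sigma_n(x_i,x_{j_i})}{\int p_n(dw\mid x_{-i})\,\tilde\sigma_n(w,x_{j_i})},$$

we obtain

$$q_n(dx_i\mid x_{-i})\propto\sum_{j\ne i}^n q_n(P_n\mid x_{-i})\,\frac{p_n(dx_i\mid x_{-i})\,\tilde\sigma_n(x_i,x_j)}{\int p_n(dw\mid x_{-i})\,\tilde\sigma_n(w,x_j)}\propto p_n(dx_i\mid x_{-i})\sum_{j\ne i}^n\tilde\sigma_n(x_i,x_j).$$

Thus we have

$$q_n(dx_i\mid x_{-i})=\frac{p_n(dx_i\mid x_{-i})\sum_{j\ne i}^n\tilde\sigma_n(x_i,x_j)}{\int p_n(dw\mid x_{-i})\sum_{j\ne i}^n\tilde\sigma_n(w,x_j)}. \tag{2.2}$$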

When $p_n$ is derived from the Dirichlet process prior, the predictive for $x_i$ is given by the Blackwell-MacQueen urn scheme

$$p_n(dx_i\mid x_{-i})=\frac{\theta\,\nu_0(dx_i)+\sum_{k\ne i}\delta_{x_k}(dx_i)}{\theta+n-1}. \tag{2.3}$$

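Then the predictive (2.2) can be written as

$$
\begin{aligned}
q_n(dx_i\mid x_{-i})&=\frac{\theta\sum_{j\ne i}^n\tilde\sigma_n(x_i,x_j)\,\nu_0(dx_i)+\sum_{k\ne i}^n\sum_{j\ne i}^n\tilde\sigma_n(x_i,x_j)\,\delta_{x_k}(dx_i)}{\int\big[\theta\sum_{j\ne i}^n\tilde\sigma_n(w,x_j)\,\nu_0(dw)+\sum_{k\ne i}^n\sum_{j\ne i}^n\tilde\sigma_n(w,x_j)\,\delta_{x_k}(dw)\big]}\\
&=\frac{\theta_{n,i}\,\nu_{n,i}(dx_i)+\sum_{k\ne i}^n\sum_{j\ne i}^n\tilde\sigma_n(x_i,x_j)\,\delta_{x_k}(dx_i)}{\theta_{n,i}+\sum_{k\ne i}^n\sum_{j\ne i}^n\tilde\sigma_n(x_k,x_j)},
\end{aligned}
\tag{2.4}
$$

where

$$\theta_{n,i}=\theta\int\sum_{j\ne i}^n\tilde\sigma_n(w,x_j)\,\nu_0(dw), \tag{2.5}$$

$$\nu_{n,i}(dw)=\frac{\sum_{j\ne i}^n\tilde\sigma_n(w,x_j)\,\nu_0(dw)}{\int\sum_{j\ne i}^n\tilde\sigma_n(w,x_j)\,\nu_0(dw)}. \tag{2.6}$$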

Expression (2.4) is the transition density of the Markov particle process defined in Section 4, and the full conditional distribution driving the Markov chain constructed via a Gibbs sampler in the next section. Observe that in (2.4) a larger $\tilde\sigma_n$ implies a larger probability for the first coordinate of being selected to update $x_i$; that is, the larger the $\tilde\sigma_n$, the higher the fitness of the individual who is [...] or quality that makes an individual more likely to give birth. In this framework, this is mathematically modeled by the pair-dependent coefficient $\tilde\sigma_n$ that quantifies the degree of possession of that property or, in other words, the level of fitness. In population genetics terms, such a function describes the intensity of fertility selection. When $\tilde\sigma_n(x,y)\equiv1$ for all $n$ we recover the Dirichlet case, that is (2.3).
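To make the reweighting in (2.4) concrete, here is a small sketch on a finite type space. This is an illustrative simplification only: the paper takes $\nu_0$ nonatomic, whereas the code assumes hypothetical types $\{0,\dots,K-1\}$ with masses `nu0` and a symmetric matrix `sigma_t` standing in for $\tilde\sigma_n$. On such a space (2.2) says the predictive weight of type $w$ is the urn weight $\theta\nu_0(w)+\#\{k\ne i: x_k=w\}$ multiplied by $\sum_{j\ne i}\tilde\sigma_n(w,x_j)$.

```python
def urn_predictive(i, x, theta, nu0):
    """Blackwell-MacQueen predictive (2.3) for x_i on a finite type space {0, ..., K-1}."""
    others = [x[k] for k in range(len(x)) if k != i]
    n = len(x)
    return [(theta * nu0[w] + others.count(w)) / (theta + n - 1) for w in range(len(nu0))]

def selection_predictive(i, x, theta, nu0, sigma_t):
    """Predictive (2.4): urn weights reweighted by sum_{j != i} sigma_t[w][x_j]."""
    others = [x[k] for k in range(len(x)) if k != i]
    weights = []
    for w in range(len(nu0)):
        fitness = sum(sigma_t[w][xj] for xj in others)
        weights.append(fitness * (theta * nu0[w] + others.count(w)))
    total = sum(weights)
    return [wt / total for wt in weights]

x = [0, 0, 1, 2]          # hypothetical current types of n = 4 particles
nu0 = [0.2, 0.3, 0.5]     # hypothetical masses of nu0 on the three types
theta = 1.5
neutral = [[1.0] * 3 for _ in range(3)]
# Symmetric matrix in which any pair involving type 0 is twice as fit.
favour_type0 = [[2.0, 2.0, 2.0], [2.0, 1.0, 1.0], [2.0, 1.0, 1.0]]

p_urn = urn_predictive(3, x, theta, nu0)
p_neutral = selection_predictive(3, x, theta, nu0, neutral)
p_selected = selection_predictive(3, x, theta, nu0, favour_type0)
```

With a constant selection function the reweighting cancels and the urn scheme is recovered, while increasing the fitness of pairs involving type 0 raises the probability that the updated particle is of type 0.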

Note that, from (2.1), it is also possible to derive the distribution of the pairing. Indeed

$$q_n(P_n)\propto\int_{\mathcal{X}^n}p_n(dx_1\dots dx_n)\prod_k\tilde\sigma_n(x_k,x_{j_k}),$$

from which the key role of the selection function is also clear: since a pair with higher fitness gives a higher value of $\tilde\sigma_n$, those individuals which are fitter when paired will increase the probability of that specific pair occurring.

3. Gibbs Sampling

The Gibbs sampler (see Geman and Geman (1984) and Gelfand and Smith (1990)) is an iterative procedure that belongs to the more general class of Markov chain Monte Carlo algorithms, whose use is nowadays widespread in the Bayesian literature. The Gibbs sampler can be briefly outlined as follows. Suppose we want to sample from $f(z_1,\dots,z_n\mid\vartheta)$, but this is not feasible. Given a vector of initial values $(z_1^{(0)},\dots,z_n^{(0)})$, we sample updates for each $z_i$ from the so-called full conditional distribution $f(z_i\mid z_1,\dots,z_{i-1},z_{i+1},\dots,z_n,\vartheta)$, which is typically easier to deal with. This is done iteratively, so that

$$
\begin{aligned}
z_1^{(1)}&\sim f(z_1\mid z_2^{(0)},\dots,z_n^{(0)},\vartheta),\\
z_2^{(1)}&\sim f(z_2\mid z_1^{(1)},z_3^{(0)},\dots,z_n^{(0)},\vartheta),\\
&\;\;\vdots\\
z_n^{(1)}&\sim f(z_n\mid z_1^{(1)},\dots,z_{n-1}^{(1)},\vartheta),\\
z_1^{(2)}&\sim f(z_1\mid z_2^{(1)},\dots,z_n^{(1)},\vartheta),
\end{aligned}
$$

and so on. This produces a Markov chain whose stationary distribution is given by the joint distribution $f(z_1,\dots,z_n\mid\vartheta)$, so that after a sufficient number of iterations an arbitrarily large sample from the distribution of interest is available (see for example Gelfand and Smith (1990)). Observe that the stationary distribution of a single component $z_i$ is $f(z_i\mid\vartheta)$. Note also that the result still holds if the coordinates are updated in a random order (called random scan), as long as all coordinates are visited infinitely often.
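The scheme just outlined can be sketched on a toy target. The example below is an illustration chosen for simplicity, not the model of this paper: it assumes a standard bivariate normal with correlation `rho`, for which the full conditionals are $z_1\mid z_2\sim N(\rho z_2,1-\rho^2)$ and symmetrically for $z_2$. The function name and the burn-in length are arbitrary choices.

```python
import math
import random

def gibbs_bivariate_normal(rho, n_iter, rng, z1=0.0, z2=0.0):
    """Systematic-scan Gibbs sampler for a standard bivariate normal with correlation rho."""
    sd = math.sqrt(1.0 - rho * rho)
    chain = []
    for _ in range(n_iter):
        z1 = rng.gauss(rho * z2, sd)   # z1 ~ f(z1 | z2)
        z2 = rng.gauss(rho * z1, sd)   # z2 ~ f(z2 | z1), using the freshly updated z1
        chain.append((z1, z2))
    return chain

rng = random.Random(1)
chain = gibbs_bivariate_normal(rho=0.6, n_iter=20000, rng=rng)[1000:]  # drop burn-in

# Sample moments of the chain, to be compared with the target (means 0, correlation 0.6).
m1 = sum(a for a, _ in chain) / len(chain)
m2 = sum(b for _, b in chain) / len(chain)
cov = sum((a - m1) * (b - m2) for a, b in chain) / len(chain)
v1 = sum((a - m1) ** 2 for a, _ in chain) / len(chain)
v2 = sum((b - m2) ** 2 for _, b in chain) / len(chain)
corr = cov / math.sqrt(v1 * v2)
```

After burn-in the sample moments of the chain should be close to those of the joint target, which is the stationarity property used in the rest of this section.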

In this section the theoretical properties of the Gibbs sampler are exploited to derive the stationary distribution of a chain of random distribution functions, which will be of great help in Section 5 for the derivation of the stationary distribution of the Fleming-Viot process. We can represent $q_n$ in (2.1) in the following way. We take $\mu\sim\Pi_{\theta,\nu_0}$ and $x_1,\dots,x_n\mid\mu$ to be independent and identically distributed according to $\mu$. We then assume there is a conditional density for continuous variables $(y_1,\dots,y_n)$, with each $y_i$ defined on the real line, whereby

$$\tilde p_n(y_1=1,\dots,y_n=1\mid x_1,\dots,x_n,P_n)=\prod_k n\,\tilde\sigma_n(x_k,x_{j_k}),$$

the product being over $n/2$ terms. Hence, $q_n(dx_1,\dots,dx_n,P_n)$ is the conditional law of $(x_1,\dots,x_n,P_n)$ given $(y_1=1,\dots,y_n=1)$. Here we are considering an $n$ factor in the product because we are going to take $\tilde\sigma_n$ of order $n^{-1}$. The reason for this choice will be made clear in Section 4.

Consider now a Gibbs sampler algorithm implemented on $(x_1,\dots,x_n,\mu)$, where at each iteration $x_1,\dots,x_n$ are sampled from the full conditionals (2.4), that is $q_n(dx_i\mid x_1,\dots,x_{i-1},x_{i+1},\dots,x_n)$, and $\mu$ is sampled from the Dirichlet process conditional on $(x_1,\dots,x_n)$, denoted by $\Pi_{\theta,\nu_0}(\cdot\mid x_1,\dots,x_n)$. Note that (see Ferguson (1973))

$$\Pi_\alpha(\,\cdot\mid x_1,\dots,x_n)=\Pi_{\alpha+\sum_{i=1}^n\delta_{x_i}}(\,\cdot\,).$$

From the properties of the Gibbs sampler it follows that the stationary distribution of the $\mathcal{X}^n$-valued Markov chain generated by $(x_1,\dots,x_n)$ is given by $q_n(dx_1\cdots dx_n)$. Furthermore, since

$$\tilde p_n(y_1=1,\dots,y_n=1\mid\mu,P_n)=\Big[n\iint\tilde\sigma_n(x,y)\,\mu(dx)\mu(dy)\Big]^{n/2},$$

which does not depend on $P_n$, the stationary distribution of the chain of random distribution functions is the law of $(\mu\mid y_1=1,\dots,y_n=1)$, denoted by $\Pi_n(d\mu)$, given according to Bayes' theorem by

$$\Pi_n(d\mu)=\frac{\Pi_{\theta,\nu_0}(d\mu)\,\tilde p_n(y_1=1,\dots,y_n=1\mid\mu)}{\int\Pi_{\theta,\nu_0}(d\nu)\,\tilde p_n(y_1=1,\dots,y_n=1\mid\nu)};$$

i.e., it is proportional to

$$\Big[n\iint\tilde\sigma_n(x,y)\,\mu(dx)\mu(dy)\Big]^{n/2}\,\Pi_{\theta,\nu_0}(d\mu). \tag{3.1}$$

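Note that $\mathcal{P}(\mathcal{X})$ is compact if $\mathcal{X}$ is, in which case $\{\Pi_n,\,n\ge1\}$ is tight. If we now set $\tilde\sigma_n(x,y)=n^{-1}[1+2n^{-1}\sigma(x,y)]$, where $\sigma\in B_{\mathrm{sym}}(\mathcal{X}^2)$, we obtain

$$\Pi_n(d\mu)\propto\Big[1+\iint\frac{1}{n/2}\,\sigma(x,y)\,\mu(dx)\mu(dy)\Big]^{n/2}\,\Pi_{\theta,\nu_0}(d\mu)$$

and taking the limit for $n\to\infty$ yields

$$\Pi(d\mu)\propto\exp\Big\{\iint\sigma(x,y)\,\mu(dx)\mu(dy)\Big\}\,\Pi_{\theta,\nu_0}(d\mu).$$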

This is the stationary distribution of the chain of random distribution functions when the sample size, i.e. the dimension of the chain, grows to infinity, and is also the de Finetti measure of the sequence $(x_1,x_2,\dots\mid y_1=1,y_2=1,\dots)$. The de Finetti measure of an infinite exchangeable sequence $(z_1,z_2,\dots)$ is defined as the unique probability measure $Q$ on the space $\mathcal{P}(\mathcal{X})$ such that, for every $n>0$,

$$(z_1,\dots,z_n)\sim\int_{\mathcal{P}(\mathcal{X})}\xi^n(d\,\cdot)\,Q(d\xi),$$

where $\xi$ is a probability measure sampled from $Q$ and $\xi^n$ denotes an $n$-fold product measure. In other words, conditional on $\xi$, the random variables $(z_1,\dots,z_n)$ are iid $\xi$. See Aldous (1985) for more details.
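The passage from (3.1) to the exponential form rests on the elementary limit $(1+a/(n/2))^{n/2}\to e^a$ with $a=\langle\sigma,\mu^2\rangle$. The following sketch checks this numerically for a fixed discrete measure; the two-point type space, the matrix `sigma` and the weights `mu` are hypothetical values chosen only for illustration.

```python
import math

def pair_integral(sigma, mu):
    """<sigma, mu^2> = sum_x sum_y sigma[x][y] mu[x] mu[y] for a discrete measure mu."""
    k = len(mu)
    return sum(sigma[a][b] * mu[a] * mu[b] for a in range(k) for b in range(k))

def finite_n_weight(sigma, mu, n):
    """Unnormalised density [1 + <sigma, mu^2> / (n/2)]^(n/2) of Pi_n against Pi_{theta,nu0}."""
    return (1.0 + pair_integral(sigma, mu) / (n / 2)) ** (n / 2)

sigma = [[0.5, 1.0], [1.0, 0.2]]   # hypothetical symmetric selection intensities
mu = [0.4, 0.6]                    # hypothetical discrete probability measure

limit = math.exp(pair_integral(sigma, mu))
weights = [finite_n_weight(sigma, mu, n) for n in (10, 100, 10000, 1000000)]
```

The finite-$n$ weights increase towards `limit` as $n$ grows, which is the pointwise convergence of the unnormalised densities behind the displayed limit.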

4. The Particle Process and the Associated Measure-Valued Process

In this section we construct an $\mathcal{X}^n$-valued Markov particle process based on (2.4) and an associated $\mathcal{P}(\mathcal{X})$-valued process, and show that, in a special case for the function $\tilde\sigma_n$, for large $n$ the latter converges in distribution to the Fleming-Viot process with diploid fertility selection.

Consider a vector of $n$ particles, and define the particle process as follows. Instantaneously after each transition, a particle $x_i$, for $1\le i\le n$, is selected with uniform probability, and a holding time is sampled from an exponential distribution of parameter $\lambda_{n,i}=\lambda_n(x_i)$. At the next transition, the $i$th particle is replaced with a random sample from (2.4). Since the holding time depends on $x_i$, which belongs to the current state only, the process is clearly Markovian. Consider now the Markov chain embedded at jump times. Since the transition laws are given by $q_n(dx_i\mid x_{-i})$, the chain can equivalently be obtained by implementing a Gibbs sampler on $q_n(dx_1,\dots,dx_n)$, of which (2.4) is the full conditional distribution. This ensures that $q_n$ is the stationary distribution of the $\mathcal{X}^n$-valued chain and, given that between jumps the vector is constant, of the continuous-time process as well.
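The dynamics described above can be simulated directly. The sketch below is an illustration under stated assumptions rather than the authors' implementation: the type space is taken finite, $\{0,\dots,K-1\}$, with hypothetical masses `nu0` (the paper assumes $\nu_0$ nonatomic); the scaling $\tilde\sigma_n=n^{-1}[1+2n^{-1}\sigma]$ of Section 3 is used; and the jump rate is the particular choice made later in (4.1). All names are illustrative.

```python
import random

def sigma_tilde(n, sigma, a, b):
    """Scaled selection function 1/n + 2 sigma(a, b) / n^2, as in Section 3."""
    return 1.0 / n + 2.0 * sigma[a][b] / n ** 2

def jump(x, theta, nu0, sigma, rng):
    """One transition: pick i uniformly, draw the holding time, resample x_i from (2.4)."""
    n, k_types = len(x), len(nu0)
    i = rng.randrange(n)
    others = x[:i] + x[i + 1:]
    # s[w] = sum_{j != i} sigma_tilde(w, x_j)
    s = [sum(sigma_tilde(n, sigma, w, xj) for xj in others) for w in range(k_types)]
    theta_ni = theta * sum(nu0[w] * s[w] for w in range(k_types))   # (2.5) on a finite space
    atom_mass = sum(s[xk] for xk in others)     # sum_{k != i} sum_{j != i} sigma_tilde(x_k, x_j)
    rate = n * (theta_ni + atom_mass) / 2.0     # rate choice (4.1)
    holding = rng.expovariate(rate)
    # Unnormalised predictive weights of (2.4) for each candidate type w.
    weights = [s[w] * (theta * nu0[w] + others.count(w)) for w in range(k_types)]
    new_type = rng.choices(range(k_types), weights=weights)[0]
    return x[:i] + [new_type] + x[i + 1:], holding

def simulate(x0, theta, nu0, sigma, n_jumps, seed=0):
    """Run n_jumps transitions; return the final configuration and the elapsed time."""
    rng = random.Random(seed)
    x, t = list(x0), 0.0
    for _ in range(n_jumps):
        x, dt = jump(x, theta, nu0, sigma, rng)
        t += dt
    return x, t

nu0 = [0.5, 0.5]
sigma = [[1.0, 0.0], [0.0, 0.0]]   # hypothetical: only pairs of type 0 carry extra fitness
x_final, t_final = simulate([0, 1] * 5, theta=1.0, nu0=nu0, sigma=sigma, n_jumps=2000)
freq_type0 = x_final.count(0) / len(x_final)
```

The empirical measure of `x_final` (here summarised by `freq_type0`) is one state of the measure-valued process whose large-$n$ limit is studied in this section.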

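For $f\in B(\mathcal{X}^n)$, the generator of the $\mathcal{X}^n$-valued process is

$$A_nf(x)=\sum_{i=1}^n\frac{\lambda_{n,i}}{n}\int\big[f(\eta_i(x|y))-f(x)\big]\left[\frac{\theta_{n,i}\,\nu_{n,i}(dy)+\sum_{k\ne i}^n\sum_{j\ne i}^n\tilde\sigma_n(y,x_j)\,\delta_{x_k}(dy)}{\theta_{n,i}+\sum_{k\ne i}^n\sum_{j\ne i}^n\tilde\sigma_n(x_k,x_j)}\right],$$

where $\eta_i(x|z)=(x_1,\dots,x_{i-1},z,x_{i+1},\dots,x_n)$, which equals

$$
\begin{aligned}
&\sum_{i=1}^n\frac{\lambda_{n,i}\,\theta_{n,i}}{n\big[\theta_{n,i}+\sum_{k\ne i}^n\sum_{j\ne i}^n\tilde\sigma_n(x_k,x_j)\big]}\int[f(\eta_i(x|y))-f(x)]\,\nu_{n,i}(dy)\\
&\quad+\sum_{i=1}^n\sum_{k\ne i}^n\sum_{j\ne i}^n\frac{\lambda_{n,i}\,\tilde\sigma_n(x_k,x_j)}{n\big[\theta_{n,i}+\sum_{k\ne i}^n\sum_{j\ne i}^n\tilde\sigma_n(x_k,x_j)\big]}\,[f(\eta_i(x|x_k))-f(x)].
\end{aligned}
$$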

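If we let the Poisson intensity rate be

$$\lambda_{n,i}=\frac{n\big[\theta_{n,i}+\sum_{k\ne i}^n\sum_{j\ne i}^n\tilde\sigma_n(x_k,x_j)\big]}{2}, \tag{4.1}$$

we obtain

$$\sum_{i=1}^n\frac{1}{2}\theta_{n,i}\int[f(\eta_i(x|y))-f(x)]\,\nu_{n,i}(dy)+\sum_{1\le k\ne i\ne j\le n}\frac{1}{2}\tilde\sigma_n(x_k,x_j)\,[f(\eta_i(x|x_k))-f(x)]. \tag{4.2}$$

Consider now the same choice for $\tilde\sigma_n$ as in Section 3, i.e.,

$$\tilde\sigma_n(x,y)=\frac{1}{n}+\frac{2}{n^2}\,\sigma(x,y), \tag{4.3}$$

where $\sigma\in B_{\mathrm{sym}}(\mathcal{X}^2)$. Substituting (2.5), (2.6), and (4.3) in (4.2) yields

$$A_nf(x)=\frac{1}{n}\sum_{1\le j\ne i\le n}G_i^{n,\sigma_j}f(x)+\frac{1}{2}\sum_{1\le k\ne i\le n}[f(\eta_i(x|x_k))-f(x)]+\frac{1}{n^2}\sum_{1\le k\ne i\ne j\le n}\sigma(x_k,x_j)\,[f(\eta_i(x|x_k))-f(x)],$$

where

$$G^{n,\sigma_j}f(x)=\frac{1}{2}\theta\int[f(y)-f(x)]\Big(1+\frac{2\sigma(y,x_j)}{n}\Big)\nu_0(dy) \tag{4.4}$$

and $G_i^{n,\sigma_j}$ is the operator $G^{n,\sigma_j}$ applied to the $i$th argument of $f$.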

As in Donnelly and Kurtz (1999), define the probability measure $\mu^{(m)}$ on $\mathcal{X}^m$, $m\le n$, by

$$\mu^{(m)}=\frac{1}{n(n-1)\cdots(n-m+1)}\sum_{1\le i_1\ne\cdots\ne i_m\le n}\delta_{(x_{i_1},\dots,x_{i_m})}. \tag{4.5}$$
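The measure in (4.5) samples $m$ particles without replacement, in contrast with the $m$-fold product of the empirical measure. A small sketch of the difference follows; the configuration `x` and the test function `same_type` are hypothetical, and the helper names are not from the paper.

```python
from itertools import permutations

def integrate_mu_m(f, x, m):
    """<f, mu^(m)>: average of f over ordered m-tuples of particles at distinct positions."""
    tuples = list(permutations(x, m))   # n (n-1) ... (n-m+1) tuples, as in (4.5)
    return sum(f(*t) for t in tuples) / len(tuples)

def integrate_product(f2, x):
    """<f, mu^2> for the product of the empirical measure with itself (positions may repeat)."""
    return sum(f2(a, b) for a in x for b in x) / len(x) ** 2

def same_type(a, b):
    """Indicator that two sampled particles have the same type."""
    return 1.0 if a == b else 0.0

x = [0, 0, 1]   # hypothetical configuration of n = 3 particles
without_replacement = integrate_mu_m(same_type, x, 2)   # 2 of the 6 ordered pairs match
with_replacement = integrate_product(same_type, x)      # 5 of the 9 pairs match
```

For small $n$ the two integrals differ noticeably; the difference comes only from tuples with repeated positions, whose proportion vanishes as $n$ grows for fixed $m$.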

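Also, for $f\in B(\mathcal{X}^m)$, $m\le n$, define $\phi^{(m)}(\mu)=\langle f,\mu^{(m)}\rangle$ and $A_n\phi^{(m)}(\mu)=\langle A_nf,\mu^{(m)}\rangle$, where $\langle f,\mu\rangle=\int f\,d\mu$. Then the generator for the process of empirical measures in the $n$-dimensional case is

$$A_n\phi^{(n)}(\mu)=\frac{1}{n}\sum_{1\le j\ne i\le n}\langle G_i^{n,\sigma_j}f,\mu^{(n)}\rangle+\frac{1}{2}\sum_{1\le k\ne i\le n}\langle\Phi_{ki}f-f,\mu^{(n)}\rangle+\frac{1}{n^2}\sum_{1\le k\ne i\ne j\le n}\langle\sigma_{kj}(\cdot,\cdot)(\Phi_{ki}f-f),\mu^{(n)}\rangle,$$

where $\sigma_{kj}(\cdot,\cdot)$ denotes $\sigma(x_k,x_j)$, and $\Phi_{ki}f$ is the function obtained from $f$ by replacing the coordinate at level $i$ with the coordinate at level $k$. Observe that for $f\in B(\mathcal{X}^m)$, $m\le n$,

$$
\begin{aligned}
\sum_{i=m+1}^n\sum_{j\ne i}^n\langle G_i^{n,\sigma_j}f,\mu^{(n)}\rangle&=0,\\
\sum_{i=m+1}^n\sum_{k\ne i}^n\langle\Phi_{ki}f-f,\mu^{(n)}\rangle&=0,\\
\sum_{i=m+1}^n\sum_{k\ne i}^n\sum_{j\ne i}^n\langle\sigma_{kj}(\cdot,\cdot)(\Phi_{ki}f-f),\mu^{(n)}\rangle&=0,
\end{aligned}
$$

given that in all cases $x_i$ is not an argument of $f$ and thus $f$ does not change.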

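Further,

$$\sum_{i=1}^m\sum_{k=m+1}^n\langle\Phi_{ki}f-f,\mu^{(n)}\rangle=0$$

given that $\langle\Phi_{ki}f,\mu^{(n)}\rangle=\langle f,\mu^{(n)}\rangle$ when $x_k$ is not an argument of $f$. Hence, when $f\in B(\mathcal{X}^m)$, $m\le n$, we have

$$
\begin{aligned}
A_n\phi^{(m)}(\mu)={}&\frac{1}{n}\sum_{1\le j\ne i\le m}\langle G_i^{n,\sigma_j}f,\mu^{(m)}\rangle+\frac{n-m}{n}\sum_{i=1}^m\langle G_i^{n,\sigma_{m+1}}f,\mu^{(m+1)}\rangle+\frac{1}{2}\sum_{1\le k\ne i\le m}\langle\Phi_{ki}f-f,\mu^{(m)}\rangle\\
&+\frac{1}{n^2}\sum_{1\le k\ne i\ne j\le m}\Big[\langle\sigma_{kj}(\cdot,\cdot)\Phi_{ki}f,\mu^{(m)}\rangle-\langle\sigma_{kj}(\cdot,\cdot)f,\mu^{(m)}\rangle\Big]\\
&+\frac{n-m}{n^2}\sum_{i=1}^m\sum_{j\ne i}^m\Big[\langle\sigma_{ij}(\cdot,\cdot)f,\mu^{(m)}\rangle-\langle\sigma_{\cdot j}(\cdot,\cdot)f,\mu^{(m+1)}\rangle\Big]\\
&+\frac{n-m}{n^2}\sum_{1\le k\ne i\le m}\Big[\langle\sigma_{k\cdot}(\cdot,\cdot)\Phi_{ki}f,\mu^{(m+1)}\rangle-\langle\sigma_{k\cdot}(\cdot,\cdot)f,\mu^{(m+1)}\rangle\Big]\\
&+\frac{(n-m)^2}{n^2}\sum_{i=1}^m\Big[\langle\sigma_{i\cdot}(\cdot,\cdot)f,\mu^{(m+1)}\rangle-\langle\sigma(\cdot,\cdot)\otimes f,\mu^{(m+2)}\rangle\Big],
\end{aligned}
\tag{4.6}
$$

where we used the notation $\sigma_{h\cdot}(\cdot,\cdot)f=\sigma(x_h,x_{m+1})f(x_1,\dots,x_m)$ and $\sigma(\cdot,\cdot)\otimes f=\sigma(x_{m+1},x_{m+2})f(x_1,\dots,x_m)$.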

Note that in the fifth term we have $\sigma_{ij}$, since the operator $\Phi_{ki}$ has replaced the particle at level $i$ in $f$ with $x_k$, which is the reason for the different dimension of integration. The same applies to the last term. Given now that (4.4) converges to

$$Gf(x)=\frac{1}{2}\theta\int[f(y)-f(x)]\,\nu_0(dy),$$

we have that $\langle G_i^{n,\sigma_{m+1}}f,\mu^{(m+1)}\rangle$ converges to $\langle G_if,\mu^{(m+1)}\rangle=\langle G_if,\mu^{(m)}\rangle$. The last identity is due to the fact that the $(m+1)$-th dimension refers to an argument of $\sigma$ in (4.4), which vanishes in the limit.

Since in addition, for large $n$, $\mu^{(m)}$ is essentially product measure, the limiting operator is

$$A\phi_m(\mu)=\sum_{i=1}^m\langle G_if,\mu^m\rangle+\frac{1}{2}\sum_{1\le k\ne i\le m}\langle\Phi_{ki}f-f,\mu^m\rangle+\sum_{i=1}^m\Big(\langle\sigma_{i\cdot}(\cdot,\cdot)f,\mu^{m+1}\rangle-\langle\sigma(\cdot,\cdot)\otimes f,\mu^{m+2}\rangle\Big), \tag{4.7}$$

where $\phi_m(\mu)=\langle f,\mu^m\rangle$ for $f\in B(\mathcal{X}^m)$. $A\phi_m(\mu)$ is the generator of the Fleming-Viot process with diploid fertility selection (cf. (1.2)). The above computation implies that the measure-valued process, constructed by means of the generalised Pólya urn scheme, converges in distribution to the Fleming-Viot process with fertility selection.

Theorem 4.1. Let $\mathcal{X}$ be a compact Polish space, $f\in B(\mathcal{X}^m)$ for $m\le n$, $\mu^{(m)}$ be as in (4.5), and $\sigma\in B_{\mathrm{sym}}(\mathcal{X}^2)$. Let the mutation operator $G^{n,\sigma_j}$ be defined by (4.4) and $\Phi_{ki}:\mathcal{X}^n\to\mathcal{X}^{n-1}$ be defined, for $k,i\le n$, by

$$\Phi_{ki}f(x_1,\dots,x_n)=f(x_1,\dots,x_{i-1},x_k,x_{i+1},\dots,x_n).$$

Let $\{\mu_n(t)\}_{t>0}$ be a Markov process with sample paths in $D_{\mathcal{P}(\mathcal{X})}([0,\infty))$ and with generator (4.6). Let $\{\mu(t)\}_{t>0}$ be the Fleming-Viot process with generator (4.7). [...]

Proof. Given the convergence of (4.6) to (4.7), the result follows by an application of Theorems 1.6.1 and 4.2.5 of Ethier and Kurtz (1986), together with the observation that the martingale problem for (4.7) is well-posed (cf. Ethier and Kurtz (1993)).

Observe that for σ ≡ 0, i.e. when there is no selection, (4.7) reduces to the generator of the neutral diffusion model whose stationary distribution is the Dirichlet process. This is consistent with the generalisation of the predictive distribution described in Section 2, which for σ ≡ 0 reduces to the Blackwell-MacQueen case.

5. Stationary Distribution

As stated in the Introduction, it was shown by Ethier and Kurtz (1994) that the measure-valued process with generator (4.7) has stationary distribution given by (1.3). In this section we provide a different proof of this result, based on the construction of the previous sections. In particular, in Section 3 the use of the Gibbs sampler enabled us to elicit the stationary distribution of the chain of random distribution functions. What remains to be done is to connect the de Finetti measure of the sequence with the empirical measure of the particles when the population size goes to infinity.

Theorem 5.2. Let $\mathcal{X}$ be a compact Polish space, and let $\{\mu(t)\}_{t>0}$ be the Fleming-Viot process with generator (4.7). Then

$$\Pi(d\mu)=C\exp\Big\{\iint\sigma(x,y)\,\mu(dx)\mu(dy)\Big\}\,\Pi_{\theta,\nu_0}(d\mu) \tag{5.1}$$

is the stationary distribution of $\{\mu(t)\}_{t>0}$, where $C$ is a constant and $\Pi_{\theta,\nu_0}$ denotes the Dirichlet process with parameters $(\theta,\nu_0)$.

Proof. In Section 4 we showed that $q_n(dx_1,\dots,dx_n)$ is the stationary distribution of the $\mathcal{X}^n$-valued particle process. Then, for fixed $t\ge0$ in steady state of the particle process, from Section 3 it follows that $(x_1,\dots,x_n\mid\mu,y_1=1,\dots,y_n=1)$ are i.i.d. $\mu$ with $\mu\sim\Pi_n$, where $\Pi_n$ is proportional to (3.1). Hence the limit as $n$ tends to infinity of $n^{-1}\sum_{i=1}^n\delta_{x_i}$ has distribution $\Pi_\infty$ (see, for example, Aldous (1985)). That is, the limiting distribution of the empirical measure of the particles $(x_1,\dots,x_n)$ is given by the de Finetti measure of the infinite exchangeable sequence $(x_1,x_2,\dots)$ conditional on $(y_1=1,y_2=1,\dots)$. Since the particles are in steady state, this holds for every $s\ge t$.

Furthermore, the $C_{\mathcal{P}(\mathcal{X})}[0,\infty)$ martingale problem for $A$ is well posed (cf. Ethier and Kurtz (1993)), that is, $A$ characterises a unique solution. In this case Lemma 4.9.1 of Ethier and Kurtz (1986) states that if the limiting process has the same distribution at all instants, this is a stationary distribution. Uniqueness follows from Ethier and Kurtz (1994).

Clearly, when $\sigma(x,y)\equiv0$ we recover the Dirichlet process.

6. Haploid Case

In this section we show how the parameters of the model simplify in the special case of haploid selection. The Fleming-Viot process with haploid fertility selection has generator given by

$$\sum_{i=1}^m\langle G_if,\mu^m\rangle+\frac{1}{2}\sum_{1\le k\ne i\le m}\Big(\langle\Phi_{ki}f,\mu^{m-1}\rangle-\langle f,\mu^m\rangle\Big)+\sum_{i=1}^m\Big(\langle\sigma_i(\cdot)f,\mu^m\rangle-\langle\sigma(\cdot)\otimes f,\mu^{m+1}\rangle\Big), \tag{6.1}$$

where $\sigma_i(\cdot)$ denotes $\sigma(x_i)$ and $\sigma(\cdot)\otimes f$ denotes $\sigma(x_{m+1})f(x_1,\dots,x_m)$ (cf. Donnelly and Kurtz (1999)). Its stationary distribution is

$$\Pi(d\mu)=C\,e^{2\langle\sigma,\mu\rangle}\,\Pi_{\theta,\nu_0}(d\mu), \tag{6.2}$$

where $\langle\sigma,\mu\rangle=\int\sigma(x)\,\mu(dx)$. Note that (6.2) is a special case of (1.3), when $\sigma(x,y)=\sigma(x)+\sigma(y)$. See also Ethier and Shiga (2000).
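The reduction of (1.3) to (6.2) follows from $\langle\sigma,\mu^2\rangle=2\langle\sigma,\mu\rangle$ whenever $\sigma(x,y)=\sigma(x)+\sigma(y)$ and $\mu$ is a probability measure. A quick numerical check on a discrete measure is sketched below; the haploid intensities `s` and the weights `mu` are hypothetical.

```python
def single_integral(s, mu):
    """<s, mu> = sum_x s[x] mu[x] for a discrete probability measure mu."""
    return sum(s[a] * mu[a] for a in range(len(mu)))

def pair_integral(sigma2, mu):
    """<sigma2, mu^2> = sum_x sum_y sigma2[x][y] mu[x] mu[y]."""
    k = len(mu)
    return sum(sigma2[a][b] * mu[a] * mu[b] for a in range(k) for b in range(k))

s = [0.3, 1.2, 0.7]    # hypothetical haploid selection intensities sigma(x)
mu = [0.2, 0.5, 0.3]   # hypothetical probability weights, summing to one
# Diploid intensity built additively from the haploid one: sigma(x, y) = sigma(x) + sigma(y).
additive = [[s[a] + s[b] for b in range(3)] for a in range(3)]

lhs = pair_integral(additive, mu)    # exponent appearing in (1.3)
rhs = 2.0 * single_integral(s, mu)   # exponent appearing in (6.2)
```

The two exponents agree up to floating-point error, so the densities in (1.3) and (6.2) coincide for additive selection.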

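A construction analogous to that presented so far can be carried out starting from

$$\tilde p_n(y_1=1,\dots,y_n=1\mid x_1,\dots,x_n)=\prod_{i=1}^n\tilde\sigma_n(x_i)$$

together with $x_1,\dots,x_n\mid\mu\sim^{\mathrm{iid}}\mu$ and $\mu\sim\Pi_{\theta,\nu_0}$. Proceeding as in Section 2 we obtain

$$q_n(dx\mid x_1,\dots,x_{j-1},x_{j+1},\dots,x_n)=\frac{\theta_n\,\nu_n(dx)+\sum_{i\ne j}^n\tilde\sigma_n(x_i)\,\delta_{x_i}(dx)}{\theta_n+\sum_{i\ne j}^n\tilde\sigma_n(x_i)}, \tag{6.3}$$

where $\theta_n=\theta\int\tilde\sigma_n(x)\,\nu_0(dx)$ and $\nu_n(dx)=\tilde\sigma_n(x)\,\nu_0(dx)\,[\int\tilde\sigma_n(x)\,\nu_0(dx)]^{-1}$. Defining a Markov particle process as in Section 4, where now the transition law for the new particle is given by (6.3), and letting $\lambda_{n,i}=2^{-1}n\,(\theta_n+\sum_{k\ne i}\tilde\sigma_n(x_k))$, we obtain

$$\sum_{i=1}^n\frac{1}{2}\theta_n\int[f(\eta_i(x|y))-f(x)]\,\nu_n(dy)+\sum_{1\le k\ne i\le n}\frac{1}{2}\tilde\sigma_n(x_k)\,[f(\eta_i(x|x_k))-f(x)].$$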

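Consider the particular choice $\tilde\sigma_n(x)=1+2n^{-1}\sigma(x)$, where $\sigma$ is a bounded nonnegative measurable function on $\mathcal{X}$; this yields

$$A_nf(x)=\sum_{i=1}^n\frac{1}{2}\theta\int[f(\eta_i(x|y))-f(x)]\Big\{1+\frac{2\sigma(y)}{n}\Big\}\nu_0(dy)+\frac{1}{2}\sum_{1\le k\ne i\le n}[f(\eta_i(x|x_k))-f(x)]+\frac{1}{n}\sum_{1\le k\ne i\le n}\sigma(x_k)\,[f(\eta_i(x|x_k))-f(x)].$$

Proceeding as in Section 4 we can derive the generator for the process of the empirical measures in the $n$-dimensional case through

$$A_n\phi^{(n)}(\mu)=\sum_{i=1}^n\langle G_i^nf,\mu^{(n)}\rangle+\frac{1}{2}\sum_{1\le k\ne i\le n}\langle\Phi_{ki}f-f,\mu^{(n)}\rangle+\frac{1}{n}\sum_{1\le k\ne i\le n}\langle\sigma_k(\cdot)(\Phi_{ki}f-f),\mu^{(n)}\rangle,$$

where $G^nf(x)=\frac{1}{2}\theta\int[f(z)-f(x)]\{1+2\sigma(z)/n\}\,\nu_0(dz)$. When $f\in B(\mathcal{X}^m)$, $m<n$,

$$
\begin{aligned}
A_n\phi^{(m)}(\mu)={}&\sum_{i=1}^m\langle G_i^nf,\mu^{(m)}\rangle+\frac{1}{2}\sum_{1\le k\ne i\le m}\langle\Phi_{ki}f-f,\mu^{(m)}\rangle+\frac{1}{n}\sum_{1\le k\ne i\le m}\Big[\langle\sigma_k(\cdot)\Phi_{ki}f,\mu^{(m)}\rangle-\langle\sigma_k(\cdot)f,\mu^{(m)}\rangle\Big]\\
&+\frac{n-m}{n}\sum_{i=1}^m\Big[\langle\sigma_i(\cdot)f,\mu^{(m)}\rangle-\langle\sigma(\cdot)\otimes f,\mu^{(m+1)}\rangle\Big].
\end{aligned}
$$

The limiting operator is then

$$A\phi_m(\mu)=\sum_{i=1}^m\langle G_if,\mu^m\rangle+\frac{1}{2}\sum_{1\le k\ne i\le m}\langle\Phi_{ki}f-f,\mu^m\rangle+\sum_{i=1}^m\Big(\langle\sigma_i(\cdot)f,\mu^m\rangle-\langle\sigma(\cdot)\otimes f,\mu^{m+1}\rangle\Big), \tag{6.4}$$

where $G^n$ has been replaced by $G$, defined in (1.1).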

Following an analogous procedure to that used in the proof of Theorem 5.2, one can show that the stationary distribution of the Fleming-Viot process with generator (6.4) is (6.2), as we know from Ethier and Kurtz (1994) and Ethier and Shiga (2000).

Acknowledgement

M. Ruggiero is indebted to the Department of Statistics and Applied Mathematics, University of Turin, and to the Institute of Mathematics, Statistics and Actuarial Science, University of Kent, where the research was carried out. M. Ruggiero was supported by the CRT Foundation, Turin, and by MIUR, grant 2006/133449.


References

Aldous, D. (1985). Exchangeability and related topics. École d'Été de Probabilités de Saint Flour XIII. Lecture notes in Math. 1117. Springer, Berlin.

Blackwell, D. and MacQueen, J. B. (1973). Ferguson distributions via Pólya urn schemes. Ann. Statist. 1, 353-355.

Donnelly, P. and Kurtz, T. G. (1999). Genealogical processes for Fleming-Viot models with selection and recombination. Ann. Appl. Probab. 9, 1091-1148.

Ethier, S. N. and Kurtz, T. G. (1986). Markov Processes: Characterization and Convergence. Wiley, New York.

Ethier, S. N. and Kurtz, T. G. (1993). Fleming-Viot processes in population genetics. SIAM J. Control Optim. 31, 345-386.

Ethier, S. N. and Kurtz, T. G. (1994). Convergence to Fleming-Viot processes in the weak atomic topology. Stochastic Process. Appl. 54, 1-27.

Ethier, S. N. and Shiga, T. (2000). A Fleming-Viot process with unbounded selection. J. Math. Kyoto Univ. 40, 337-361.

Ferguson, T.S. (1973). A Bayesian analysis of some nonparametric problems. Ann. Statist. 1, 209-230.

Fleming, W. H. and Viot, M. (1979). Some measure valued Markov processes in population genetics theory. Indiana Univ. Math. J. 28, 817-843.

Gelfand, A. E. and Smith, A. F. M. (1990). Sampling-based approaches to calculating marginal densities. J. Amer. Statist. Assoc. 85, 398-409.

Geman, S. and Geman, D. (1984). Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images. IEEE Trans. Patt. Anal. Mach. Intelligence 6, 721-740.

Walker, S. G., Hatjispyros, S. J. and Nicoleris, T. (2007). A Fleming-Viot process and Bayesian nonparametrics. Ann. Appl. Probab. 17, 67-80.

Dipartimento di Economia Politica e Metodi Quantitativi, Università degli Studi di Pavia, Via San Felice 5, 27100 Pavia, Italy.

E-mail: matteo.ruggiero@unipv.it

Institute of Mathematics, Statistics and Actuarial Science, University of Kent, CT2 7NZ Canterbury, UK.

E-mail: S.G.Walker@kent.ac.uk
