• Non ci sono risultati.

View of Bayesian nonparametric duration model with censorship

N/A
N/A
Protected

Academic year: 2021

Condividi "View of Bayesian nonparametric duration model with censorship"

Copied!
12
0
0

Testo completo

(1)

BAYESIAN NONPARAMETRIC DURATION MODEL WITH CENSORSHIP

J. Hakizamungu, J.M. Rolin

1.INTRODUCTION

Nonparametric Bayesian methods have known a strong development after Ferguson’s (1973, 1974) presentation of the Dirichlet process. Soon after number of authors brought new contributions on the subject including among others An-toniak (1974), Doksum (1974), Susarla and Van-Rysin (1976), Ferguson and Phadia (1979), Rolin (1983, 1992a, 1992b), Florens and Rolin (1988), Rolin (1997), Florens, Mouchart and Rolin (1999).

This paper is concerned with nonparametric i.i.d. durations models with cen-sored observations and we establish by a simple and unified approach the general structure of a Bayesian nonparametric estimator for a survival function S . We then give some corollaries with Dirichlet prior random measures. These results are essentially supported by prior and posterior independence properties.

In general, the estimation of a survival function S has to cope with the techni-cal difficulty of the partial observability of the latent duration variable X . This situation is modelized by introducing a censoring variable Y . The observable variables are then a duration variable T =min(X,Y) and an indicator variable

{ }

1X Y

D= ≤ which indicates if X is observed or censored. The estimation proc-ess will be done using T and D .

This paper is organized as follows: we start by the specification of the model and some assumptions, we present afterwards nonparametric Bayesian estimators of the survival function. In an appendix, we recall some properties of random Dirichlet measures and present the Bayesian nonparametric estimator using purely discrete prior distributions.

2.SPECIFICATION OF THE MODEL

Let X and Y be two independent random variables being respectively a la-tent duration and a censoring time with respective survival functions S and G . The joint survival function is therefore:

(2)

( , )x y P X( x Y, y S G| , ) S x( ) G y( )

Φ = > > = × (1)

Assumption 1. A priori S and G are independent, i.e., S



G .

Therefore, by standard properties of conditional independence, we obtain the following result:

Lemma 1. Assumption 1 and equation(1) are equivalent to (X, S )



Y, G( ) If {(X Yi, i) : 1≤ ≤i n} is a sample of size n of (X,Y), i.e.,

1 ( i i)| i n X , Y S, G ≤ ≤



and (X Yi, i)| ,S G≈(X Y S G, )| , ∀1≤i ≤n, the observable data are defined by

) , min( i i i X Y T = and 1{ } i i Y X i D = ≤ ∀1≤i ≤n (2)

By elementary properties of conditional independence (see, e.g., Mouchart and Rolin (1984) or Florens, Mouchart and Rolin (1990)), we deduce the following result:

Lemma 2.

i) The observed random variables are independent given Y1n = {Yi: 1≤ ≤i n},

i.e., 1 1 ( i i)| , n i n T , D S, G Y ≤ ≤



(3)

ii) The distribution of (Ti,Di) conditionally on ( , , 1 )

n Y G S only depends on S and Yi , i.e., 1 ( ,T Di i)



( , ,S G Yn)| S, Y (4) and that implies

1 1 1

(Tn,Dn)



G S, Y | n (5)

Since the distribution of Y1n is independent of S , we have a Bayesian cut (see Florens, Mouchart and Rolin (1990)) and the inference may be totally separated into the inference on G through the marginal model generating Y1n and the in-ference on S through the conditional model generating (T1n,D1n) given Y1n.

(3)

Equivalently, Y1n may be considered as “known” or “fixed” for the estimation of S .

As will be shown later on, we have in fact a stronger result : the inference on

S does not require the knowledge of the censoring times if (T1n,D1n) are ob-served. This means that the knowledge of “unactive” censoring times (Yi greater than Xi) is unnecessary.

Lemma 3. The following conditional independence relation is verified :

1| 1, 1

n n n

S



Y T D (6)

In conclusion, whatever is the distribution of G , the posterior distribution of

S will be the same in the joint model as the posterior distribution of S in the model conditional on Y1n , i.e., in the model considering that the censoring times are fixed and known.

Let us now specify the prior distribution of S or equivalenty the prior distri-bution of the associated hazard function H defined by

) ( ln ) (t S t H =− (7)

Assumption 2. The measure H* associated to the hazard function H is purely random, i.e., for all {Bj: 1≤ jk}, measurable finite partition of =[ ∞0, )

+

R ,

we have the following independence property:



k j j B H ≤ ≤ 1 * ) ( (8)

This assumption amounts to say that S is a neutral to the right process and Doksum (1974) has shown that the independence properties are preserved a pos-teriori (see also Rolin (1983)). The prior distribution is therefore only specified through independence properties. Note also that the Dirichlet process is a neutral to the right process.

3.NONPARAMETRIC BAYESIAN ESTIMATOR

We want to analyse the posterior distribution of S given ( 1 , 1)

n n D T . For } ,..., 1 { n J ⊂ , let = ∩{ =1}∩ ∩{ =0} ∈ ∈J i i J i i J D D A c . On this set, 1 ( J, Jc) n Y X T = where XJ ={Xi :iJ} and { : c} i J Y i J

(4)

Theorem 1. Whatever is the prior distribution of S , for any measurable function K , 1 1 1 E ( ) ( )| , E[ ( )| , , ] E ( )| , c c c c i J J i J n n n i J J i J K S S Y X Y K S T D Y S Y X Y ∈ ∈ ª º « » « » ¬ ¼ = ª º « » « » ¬ ¼

on AJ (9)

This proves Lemma 3, since the above formula shows that the conditional ex-pectation only depends on the observed censoring times.

The above formula may be rewritten in terms of the order statistics of the ob-servable lifetimes. Let {Zj , 1≤ j ≤ M} be the order statistics (the distinct ob-servable lifetimes in increasing order) of T1n. We define the number of individu-als at risk at time Zj by

¦

≤ ≤ ≥ = n i Z T j i j N 1 } { 1 (10)

and respectively, the number of individuals censored and the number of individu-als uncensored at time

Z

j by

¦

≤ ≤ = = = n i D Z T j i j i F 1 } 0 , { 1 and

¦

≤ ≤ = = = n i D Z T j i j i E 1 } 1 , { 1 (11)

With these notations, Theorem 1 may be rewritten as

Corollary 1. For any measurable function K ,

1 1 1 1 E ( ) ( ) | , E[ ( )| , ] E ( ) | , j c j c F j J J j M n n F j J J j M K S S Z X Y K S T D S Z X Y ≤ ≤ ≤ ≤ ª º « » « » ¬ ¼ = ª º « » « » ¬ ¼

on AJ (12)

Proof. Let K and Ki , 1≤i ≤n, be arbitrary measurable functions, by definitions and independence properties, we have

(5)

= E ( ) ( ,1)1{ i i} ( , 0)1{ i i}| 1 c n i i X Y i i X Y i J i J K S K XK Y > Y ∈ ∈ ª º « » « » ¬

¼ = E ( ) ( ,1)1{ i i} ( , 0) ( )| 1 c n i i X Y i i i i J i J K S K XK Y S Y Y ∈ ∈ ª º « » « » ¬

¼ because 1 1 | , n i i n X S Y ≤ ≤



. Now, since 1| n J S



Y X , if we define E ( ) ( )| , ( , ) E ( )| , c c c c c i J J i J J J i J J i J K S S Y X Y L X Y S Y X Y ∈ ∈ ª º « » « » ¬ ¼ = ª º « » « » ¬ ¼

we have 1 E ( ) ( )| , c n i J i J K S S Y X Y ∈ ª º « » « » ¬

¼= L(XJ,YJc ) E c ( )| , 1 n i J i J S Y X Y ∈ ª º « » « » ¬

¼ Therefore, 1 1 E ( ) ( , )1 | J n i i i A i n K S K T D Y ≤ ≤ ª º « » ¬

¼= 1 1 E ( , c ) ( , )1 | J n J J i i i A i n L X Y K T D Y ≤ ≤ ª º « » ¬

¼

Now, for Zjt <Zj+1, according to Docksum’s result H(t)= −lnS(t) may be decomposed into the following sum of independent terms conditionally on

c J J Y X , , i.e., * * 1 1 1 ( ) (( l , l ]) (( l , ]) l j H t H ZZ H Zt ≤ ≤ =

¦

+ (13)

This is equivalent to say that S(t) is a product of independent terms condition-ally on c J J Y X , , i.e., 1 1 E ( ) ( , )1 | J n i i i A i n K S K T D Y ≤ ≤ ª º « » ¬

¼

(6)

) ( ) ( ) ( ) ( ) ( 1 l j l 1 j l Z S t S Z S Z S t S =

× ≤ ≤ − (14)

Now, if S* is the measure associated to the survival function S and if we suppose that S* is a random measure which follows the Dirichlet law, i.e.,

) ( * *

a Di

S ≈ , where a* is a finite measure, Ferguson (1973) has shown that for uncensored observations, S* is a posteriori a Dirichlet measure. Therefore,

* * | J, c ( u) J S X YDi a +N where 1 i i u X i T i J i n N ε Dε ∈ ≤ ≤ =

¦

=

¦

(15)

is the counting measure of uncensored observations (εx is the Dirac probability measure at point x ).

From properties of the Dirichlet measure (see Appendix 1), we deduce that,

* * 1 1 ( ) | , [( )(( , )], (( , ]) ] ( ) c l J J u l l l l l S Z X Y Beta a N Z a Z Z E S Z − − ≈ + ∞ + and that * * ( ) | , [( )(( , ]), (( , ])] ( ) J Jc u j j S t X Y Beta a N t a Z t S Z ≈ + ∞

Combining these properties with Theorem 1, we obtain the following result:

Theorem 2. If H* is purely random (Assumption 2), then for Zjt <Zj+1, we may write S(t) as a product of independent terms conditionally on (T1n,D1n), namely ) ( ) ( ) ( ) ( ) ( 1 l j l 1 j l Z S t S Z S Z S t S =

× ≤ ≤ −

Moreover if S ≈* Di(a*) and a t( )=a*([0, ]),t ∀ >t 0, then, for 1≤l ≤ M ,

1 1 1 1 ( ) | , [ ( ) ( ) , ( ) ( ) ] ( ) n n l l l l l l l l S Z T D Beta a a Z N E a Z a Z E S Z − − ≈ ∞ − + − − + (16)

(7)

and 1 1 1 ( ) | , [ ( ) ( ) , ( ) ( )] ( ) n n j j j S t T D Beta a a t N a t a Z S Z + ≈ ∞ − + − (17)

Proof. Note that

([ , ]) 1 1 1 ( ) ( ) ( ) c l j N Z F l j j M l M l S Z S Z S Z ∞ ≤ ≤ ≤ ≤ − ­ ½ = ® ¾ ¯ ¿

where ([ , ]) c l j l j M N Z F ≤ ≤ ∞ =

¦

c

N denotes therefore the counting measure of censored observations, i.e.,

1 (1 ) i c i T i n N D ε ≤ ≤ =

¦

Applying Theorem 1 provides the result if we notice that

1 (( , ]) ([ , ]) u l c l l l l l N Z ∞ +N Z ∞ =NE =N + +F and (( , ]) u c N t ∞ +N ([Zj+1, ])∞ =(N +u Nc) ([Zj+1, ])∞ =Nj+1

The same type of computations provides the following corollary:

Corollary 2. Conditionally on (T1n,D1n), ) ( ) ( ) ( ) ( ) ( ) ( 1 1 − × − = − − j j j j j j Z S Z S Z S Z S Z S Z S where 1 1 1 ( ) ( ) | , ( ) ( ) j j n n j j S Z S Z T D S ZS Z − −



1 1 1 1 ( ) | , [ ( ) ( ) , ( ) ( )] ( ) j n n j j j j j S Z T D Beta a a Z N a Z a Z S Z − − − ≈ ∞ − − + − − (18)

(8)

1 1 ( ) | , [ ( ) ( ) , ( ) ( ) ] ( ) j n n j j j j j j j S Z T D Beta a a Z N E a Z a Z E S Z ≈ ∞ − + − − − + − (19)

This last formula shows that, if a* is a diffuse measure, then, a posteriori, there is a jump only at the death observations (Ej >0).

Taking the posterior expectation provides the Susarla-Van Ryzin estimator

Corollary 3. If S ≈* Di(a*), then, for Zjt <Zj+1,

1 1 1 1 1 1 1 1 1 ˆ ( ) [ ( )| , ] ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) {1 } ( ) ( ) ( ) n n SV j l l l l j j j l l j l l j l l S t E S t T D a a t N a a Z N E a a Z N a a Z N a a t N F a n a a Z N + ≤ ≤ + − + ≤ ≤ + = ∞ − + ∞ − + = × ∞ − + ∞ − + ∞ − + = × + ∞ + ∞ − +

(20) where a t( )=a*([0, ])t ,∀t >0.

Considering the non informative case, i.e., a(∞)=0, we obtain the Kaplan-Meier estimator

Corollary 4. In the noninformative case, the posterior expectation is given by:

{ : } ˆ ( ) 1 j j KP j Z t j E S t N ≤ ­ ½ ° ° = ®¾ ° ° ¯ ¿

(21) APPENDIX

1. Random Dirichlet measures

We first recall the definition of a Dirichlet measure (Ferguson (1973)).

Definition: m~* is a random Dirichlet measure defined on (A, M ) with parame-ter a*, a finite measure on (A, M ), and we write ~m ≈* Di(a*), if for all partitions

∈ ≤

j

j j l B

B ,1 },

{ M, the random vector m~*(B1),...,m~*(Bl) follows the Dirich- let law with parameter a*(B1),..., a*(Bl) characterized by

(9)

* 1 * 1 1 * * 1 1 1 1 { } 1 [ ] j j j l j j l a (B ) j j j j v j l j l j j l Ƅ a (B ) P m (B ) dv v dv Ƅ a (B ) ≤ ≤ ≤ ≤ − ­ ½ ° = ° ¦ ® ¾ ≤ ≤ ≤ ≤ ° ° ¯ ¿ ≤ ≤ ª º « » ª º «¬ »¼ ∈ = × × « » « » ¬ ¼

¦





(22) In particular, ∀BM * * * ( ) [ ( ), ( c)] m B ≈Beta a B a B (23)

The Dirichlet measure may also be characterized as follows (see, e.g., Rolin (1992a). Let us define the following σ -algebra

*

{ ( ) : , }

Bm C CB CM

F

and the conditional probability * * * ( ) ( | ) ( ) m C B m C B m B ∩ =   

Proposition. Let ~m* be a random probability on (A, M ). Then m ≈~* Di(a*)if and only if, ∀C,BM with 0<a*(B)<1,

(i)m C B*( | )



c

B

F

(ii)m C B*( | )≈Beta a C[ (* ∩B a C), *( cB)] (24)

2. Estimator using a purely discrete prior distribution

Let {tj :1≤ jm} with t1<t2 <...<tm be the support of S*. The hazard rate, for 1≤ j ≤ m , is defined by

) ( ) ( 1 ) ( ) ( 1 1 − − = − − = Λ j j j j j t S t S t S t S (25)

These hazard rates characterize entirely S* since, for 1≤ j ≤m ,

1 ( )j (1 l) l j S t ≤ ≤ Λ =

− and

(10)

* 1 ({ })j ( )j ( j ) j (1 l ) l j S t S t S t ≤ < Λ Λ = − − =

Now, if , for 1≤ j ≤m, Nj, Ej et Fj denote respectively the number of in-dividuals at risk, uncensored and censored at time tj, the likelihood of the sam-ple is given by j j F m j j E m j j n S t S t L ({ }) ( ) 1 1 *

< ≤ ≤ ≤ × =

or, in terms of hazard rates, by

1 (1 ) j j j E N E n j j j m L − ≤ < Λ Λ =

(26)

For the prior distribution, note that H* is a purely random measure if and only if the Λj ’s are independent random variables. In view of the likelihood, this

property will hold a posteriori. Now, if S* is a Dirichlet measure with purely dis-crete parameter a*, we have

* *

[ ({ }), (( , ])] [ , (1 )]

j Beta a tj a tj Beta c lj j cj l j

Λ ≈ ∞ = − (27)

where l j =Ej) and cj =a*([ , ])tj ∞ =a( )∞ −a t( j− .)

By taking cj, 1j < m, to be arbitrary positive constants, we see that we may

use more general priors than Dirichlet measures to obtain tractable posterior dis-tributions. This gives the following result

Theorem 3. If a priori, 1 j j m ≤ < Λ



and for 1j < m, j Λ ≈ Beta [ , (1 )] j j j j c l cl (28) where l j =Ej) and j

c , 1≤ j <m are arbitrary positive constants, then, a posteriori, 1 1 1 | n, n j j m T D ≤ < Λ



and for 1≤ j <m, j Beta Λ ≈ [ , (1 ) ] j j j j j j j c l +E cl +NE (29)

Note that cj indicates the degree of credibility that the statistician allows to the prior distribution because the variance of the hazard rates is given by

(11)

(1 ) ( ) 1 j j j j l l V c Λ = − + As a corollary, we obtain

Corollary 5. The Bayesian nonparametric estimator of the survival function for

m j < ≤ 1 is given by: 1 1 1 ˆ ( ) [ ( )| n, n] 1 l l l BN j j l j l l c l E S t E S t T D c N ≤ ≤ ­ + ½ = = ®¾ + ¯ ¿

(30)

We note that if cj 0 we obtain the Kaplan-Meier estimator. This is the same as to use a non-informative prior law as fj)∝λ−j1(1−λj)−1

Department of Mathematics, HAKIZAMUNGU JOSEPH

University of Benin, Togo

Institut de Statistique, ROLIN JEAN-MARIE

Université catholique de Louvain, Belgium

REFERENCES

C.E. ANTONIAK, (1974), Mixtures of Dirichlet processes with applications to Bayesian nonparametric

problems, “Annals of Statistics”, 2, 1152-1174.

K. DOKSUM, (1974), Tailfree and neutral random probabilities and their posterior distributions,

“An-nals of Probability”, 2, 183-201.

T. FERGUSON, (1973), A Bayesian analysis of some nonparametric problems, “Annals of statistics”,

1, 209-230.

T. FERGUSON, (1974), Prior distributions on spaces of probability measures, “Annals of statistics”, 2,

615-629.

T. FERGUSON, E.G PHADIA, (1979), Bayesian nonparametric estimation based on censored data,

“An-nals of statistics”, 7, 163-186.

J.P. FLORENS, M. MOUCHART, J.M. ROLIN, (1990), Elements of Bayesian Statistics, Marcel Dekker,

New York.

J.P. FLORENS, M. MOUCHART, J.M. ROLIN, (1999), Semi-and non-parametric Bayesian analysis of

dura-tion models with Dirichlet priors: a survey, “International Statistical Review”, 67, 2, 187-210.

J.P. FLORENS, J.M. ROLIN, (1998), Simulation of posterior distributions in nonparametric censored

analy-sis,Institut de Statistique discussion paper n°9818, , Université Catholique de Louvain, Louvain-La-Neuve, Belgique.

J. HAKIZAMUNGU, (1992), Modèles de durées: Approche Bayésienne semiparamétrique, Thèse de

doc-torat, Université Catholique de Louvain, Belgique.

M. MOUCHART, J.M. ROLIN, (1984), A note on conditional independence with statistical applications,

“Statistica”, 557-584.

(12)

sta-tistical models from parametric to nonparametric using Bayesian approaches, Lecture notes in sta-tistics, Springer-Berlin, 108-133.

J.M ROLIN, (1992a), Some useful properties of the Dirichlet processes, Core discussion paper

n°9207. Université Catholique de Louvain, Louvain-La-Neuve, Belgique.

J.M ROLIN, (1992b), On the distribution of jumps of the Dirichlet processes, Core discussion paper

n°9259. Université Catholique de Louvain, Louvain-La-Neuve, Belgique.

J.M ROLIN, (1998), Bayesian survival analysis, “Encyclopedia of Biostatistics”, 1, 271-286, John

Wiley, New-York.

V. SURSALA, J. VAN RYZIN, (1976), Nonparametric Bayesian estimation of survival curves from

incom-plete observations, “JASA”, 71, 897-902.

RIASSUNTO Modello non parametrico Bayesiano di durata con dati censurati

Il seguente articolo tratta di modelli i.i.d. non parametrici di durata con dati censurati e considera una struttura generale di uno stimatore bayesiano non parametrico per una fun-zione di sopravvivenza attraverso un approccio semplice e unificato. Per le distribuzioni a prior di Dirichlet, si descrive in modo completo la struttura della distribuzione a posteriori della funzione di sopravvivenza. Questi risultati sono essenzialmente sostenuti dalle pro-prietà di indipendenza a priori e a posteriori.

SUMMARY Baesian nonparametric duration model with censorship

This paper is concerned with nonparametric i.i.d. durations models with censored ob-servations and we establish by a simple and unified approach the general structure of a bayesian nonparametric estimator for a survival function S. For Dirichlet prior distribu-tions, we describe completely the structure of the posterior distribution of the survival function. These results are essentially supported by prior and posterior independence properties.

Riferimenti

Documenti correlati

On October 2002, D˘ anut¸ Marcu from Bucharest submitted to our journal a short paper, entitled “Some results on the independence number of a graph”.. It can be downloaded from our

La storia della società piemontese è strettamente legata alla famiglia Agnelli già dalla fondazione nel 1897 e lungo tutte le tappe più importanti dello stesso club:

No repeat emission was detected from an FRB during this time placing weak limits on bursting or flaring sources; a more detailed and long-term study would be needed to rule

Se il docente risulta avere tutti i dati, una volta individuate le relative soglie e ottenuto il numero di articoli, citazioni e l’indice-H, viene “costruita” la tabella

dichiarazione; gli articoli 1 e 2 stabiliscono i concetti di libertà ed eguaglianza ; gli articoli da 12 a 17 stabiliscono i diritti che l’individuo vanta nei

Three fits are shown: fitting all data with a single PL (black dashed line), or with a broken PL (black line) as well as a fit to the data excluding obvious flaring points (blue

We study the behaviour of the cyclotron resonant scattering feature (CRSF) of the high-mass X-ray binary Vela X-1 using the long-term hard X-ray monitoring performed by the Burst

In particolare, Diego Di Masi ha curato il paragrafo 1 “Dal minimal al maximal: un approccio basato sull’agency per l’educazione alla cittadinanza” e il