COMPATIBILITY OF CONDITIONAL DISTRIBUTIONS
Patrizia Berti, Emanuela Dreassi, Pietro Rigo
Università di:
Modena e Reggio-Emilia, Firenze, Pavia
Pisa, December 7, 2014
THE PROBLEM
Up to technicalities, the problem can be stated as follows:
Let X and Y be real random variables. Suppose we are given the (measurable) kernels
α = {α(x, ·) : x ∈ R} and β = {β(x, ·) : x ∈ R}
where α(x, ·) and β(x, ·) are probability measures on R.
Is there a joint distribution for (X, Y) such that
P(Y ∈ · | X = x) = α(x, ·) and P(X ∈ · | Y = y) = β(y, ·)
for almost all (x, y) ∈ R^2?
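For intuition, when X and Y take finitely many values and every entry of both kernels is strictly positive, the question can be settled by direct computation. The sketch below is an illustration under those assumptions, not the method of the talk: if a joint law exists, then α(x, y)/β(y, x) = g(y)/f(x), where f and g are the marginals of X and Y, so f is forced up to normalization and both conditionals can be checked. The names `compatible_joint`, `alpha`, `beta` and the tolerance `tol` are hypothetical.

```python
def compatible_joint(alpha, beta, tol=1e-9):
    """Return a joint law p[x][y] with conditionals alpha and beta, or None.

    alpha[x][y] stands for P(Y = y | X = x), beta[y][x] for P(X = x | Y = y).
    Assumption: all entries are strictly positive and every row sums to 1.
    """
    n, m = len(alpha), len(alpha[0])
    # If p exists, alpha[x][0] / beta[0][x] = g(0) / f(x), so the marginal f
    # of X must be proportional to beta[0][x] / alpha[x][0].
    weights = [beta[0][x] / alpha[x][0] for x in range(n)]
    total = sum(weights)
    f = [w / total for w in weights]
    # Candidate joint law: p(x, y) = f(x) * alpha(x, y).
    p = [[f[x] * alpha[x][y] for y in range(m)] for x in range(n)]
    # Its conditional of X given Y = y must reproduce beta(y, .).
    for y in range(m):
        g_y = sum(p[x][y] for x in range(n))
        for x in range(n):
            if abs(p[x][y] / g_y - beta[y][x]) > tol:
                return None
    return p


# Kernels obtained from a genuine joint law are compatible.
joint = [[0.1, 0.2], [0.3, 0.4]]
alpha_ok = [[v / sum(row) for v in row] for row in joint]
cols = [sum(row[y] for row in joint) for y in range(2)]
beta_ok = [[joint[x][y] / cols[y] for x in range(2)] for y in range(2)]
recovered = compatible_joint(alpha_ok, beta_ok)

# Here Y is a fair coin whatever X is, yet X is supposed to track Y closely.
alpha_bad = [[0.5, 0.5], [0.5, 0.5]]
beta_bad = [[0.9, 0.1], [0.1, 0.9]]
rejected = compatible_joint(alpha_bad, beta_bad)
```

In the first example the function recovers the original joint law; in the second it returns `None`, since no joint law has both kernels as its conditionals.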
MORE GENERALLY
The previous problem is actually a special case of the following. Let X_1, ..., X_k be real random variables and
Y_j = (X_1, ..., X_{j-1}, X_{j+1}, ..., X_k), 1 ≤ j ≤ k.
Suppose we are given the (measurable) kernels
α_j = {α_j(y, ·) : y ∈ R^{k-1}}
where α_j(y, ·) is a probability measure on R.
Is there a joint distribution for (X_1, ..., X_k) such that
P(X_j ∈ · | Y_j = y) = α_j(y, ·)
for all 1 ≤ j ≤ k and almost all y ∈ R^{k-1}?
If yes, the kernels α_1, ..., α_k are compatible (or consistent).
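To see why the two-variable problem is a special case, take k = 2 and write X = X_1, Y = X_2. The identification below is spelled out for convenience and uses only the notation already introduced:

```latex
\[
k = 2:\qquad Y_1 = X_2, \quad Y_2 = X_1,
\]
\[
P(X_1 \in \cdot \mid X_2 = y) = \alpha_1(y, \cdot) = \beta(y, \cdot),
\qquad
P(X_2 \in \cdot \mid X_1 = x) = \alpha_2(x, \cdot) = \alpha(x, \cdot).
\]
```

So α_1 plays the role of β and α_2 plays the role of α.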
MOTIVATIONS AND REMARKS
(i) We focus on real random variables for the sake of simplicity.
Nothing important changes if each X_j takes values in a measurable space (𝒳_j, ℬ_j). Similarly, (X_1, ..., X_k) can be replaced by an infinite sequence X_1, X_2, ...
(ii) Let C be a collection of probability measures on R^k. The kernels α_1, ..., α_k are C-compatible if they are the conditional distributions induced by some member of C.
(iii) Conceptually, we are just dealing with a coherence problem within the Kolmogorovian setting. Indeed, those familiar with de Finetti’s coherence can think of this problem as
Coherence + Disintegrability
Technically, however, the problem is much harder than de Finetti’s coherence.
(iv) In some statistical frameworks, such as
Gibbs sampling, Multiple data imputation, Spatial statistics
the conditional distributions of
X_j given Y_j = (X_1, ..., X_{j-1}, X_{j+1}, ..., X_k)
are needed. Sometimes, such conditional distributions are not derived from a joint distribution for (X_1, ..., X_k). Rather:
One first selects the kernels α_1, ..., α_k, with each α_j regarded as the conditional distribution of X_j given Y_j, and then checks compatibility of α_1, ..., α_k.
Note that, if α_1, ..., α_k fail to be compatible, any subsequent analysis does not make sense!!!
(v) Compatibility issues also occur in
Statistical mechanics, Machine learning, Bayesian statistics
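Remark (iv) can be made concrete in the finite two-variable case. The sketch below is an illustration, not taken from the talk: it alternates the two updates of a Gibbs sampler, computes the stationary law of X by power iteration, and compares the joint law seen right after updating Y with the one seen right after updating X. Assuming all kernel entries are strictly positive (so that the stationary law is unique), the two joint laws agree exactly when the kernels are compatible. The names `sweep_laws`, `max_gap`, `alpha`, `beta` and the iteration count are hypothetical choices.

```python
def sweep_laws(alpha, beta, iters=2000):
    """Joint laws at the two phases of a two-variable Gibbs sweep.

    alpha[x][y] stands for P(Y = y | X = x), beta[y][x] for P(X = x | Y = y).
    Assumption: strictly positive entries, so power iteration converges.
    """
    n, m = len(alpha), len(alpha[0])
    f = [1.0 / n] * n  # initial guess for the stationary law of X
    for _ in range(iters):
        # One full sweep: draw Y from alpha(x, .), then X from beta(y, .).
        g = [sum(f[x] * alpha[x][y] for x in range(n)) for y in range(m)]
        f = [sum(g[y] * beta[y][x] for y in range(m)) for x in range(n)]
    # Law of Y obtained from the stationary law of X.
    g = [sum(f[x] * alpha[x][y] for x in range(n)) for y in range(m)]
    after_y = [[f[x] * alpha[x][y] for y in range(m)] for x in range(n)]
    after_x = [[g[y] * beta[y][x] for y in range(m)] for x in range(n)]
    return after_y, after_x


def max_gap(p, q):
    """Largest entrywise difference between two joint laws."""
    return max(abs(a - b) for rp, rq in zip(p, q) for a, b in zip(rp, rq))


# Compatible pair: both kernels come from the joint law
# [[0.1, 0.2], [0.3, 0.4]].
alpha_ok = [[1 / 3, 2 / 3], [3 / 7, 4 / 7]]
beta_ok = [[0.25, 0.75], [1 / 3, 2 / 3]]
gap_ok = max_gap(*sweep_laws(alpha_ok, beta_ok))

# Incompatible pair: Y ignores X, while X is supposed to track Y.
alpha_bad = [[0.5, 0.5], [0.5, 0.5]]
beta_bad = [[0.9, 0.1], [0.1, 0.9]]
gap_bad = max_gap(*sweep_laws(alpha_bad, beta_bad))
```

For the incompatible pair the two phases give different joint laws, so the "distribution" the sampler targets depends on which coordinate was updated last.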
RESULTS
In BDR (Stochastics, 2013; J. Multivariate Analysis, 2014) compatibility of α_1, ..., α_k is characterized under the assumptions that
(i) Each X_j has compact support or
(ii) Each kernel α_j has a density with respect to some reference measure λ_j, namely
α_j(y, dx) = f_j(x | y) λ_j(dx)
Furthermore,
(iii) C-compatibility of α_1, ..., α_k is characterized for
C = {exchangeable laws on R^k} and
C = {identically distributed laws on R^k}
We next focus on (iii)
THM1: Let C = {exchangeable laws on R^k}. Suppose
α_1 = ... = α_k = α and α(y, ·) = α[π(y), ·]
for all permutations π. Then,
α_1, ..., α_k are compatible ⇔ α_1, ..., α_k are C-compatible
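The symmetry hypothesis of THM1 is easy to test when the variables take finitely many values. The following sketch assumes k = 3, so that y has two coordinates and the only non-trivial permutation swaps them; the kernel is stored as a dictionary from y to a probability vector. The names `is_permutation_invariant` and `kernel` and the data are hypothetical and only illustrate the condition α(y, ·) = α[π(y), ·]. The check does not test compatibility itself.

```python
from itertools import permutations


def is_permutation_invariant(kernel, tol=1e-12):
    """Check alpha(y, .) == alpha(pi(y), .) for every permutation pi of y.

    kernel maps each conditioning tuple y to a list of probabilities.
    Assumption: kernel is defined on every rearrangement of each of its keys.
    """
    for y, law in kernel.items():
        for perm in permutations(y):
            other = kernel[perm]
            if any(abs(a - b) > tol for a, b in zip(law, other)):
                return False
    return True


# Hypothetical kernel for k = 3 with binary variables: the law of X_j given
# the other two coordinates depends only on how many of them equal 1.
symmetric = {
    (0, 0): [0.8, 0.2],
    (0, 1): [0.5, 0.5],
    (1, 0): [0.5, 0.5],
    (1, 1): [0.2, 0.8],
}
# Changing one entry breaks the symmetry between (0, 1) and (1, 0).
asymmetric = dict(symmetric)
asymmetric[(1, 0)] = [0.6, 0.4]

symmetric_ok = is_permutation_invariant(symmetric)
asymmetric_ok = is_permutation_invariant(asymmetric)
```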
THM2: Let C = {identically distributed laws on R^k}. Suppose
k = 2; X_1, X_2 take values in a finite set; α_1 irreducible
Then, α_1, α_2 are C-compatible if and only if
α_1(x, y) > 0 ⇔ α_2(y, x) > 0
∏_{i=1}^{n}