LUCA PRATELLI AND PIETRO RIGO
Abstract. Let X = {X_t : 0 ≤ t ≤ 1} be a centered Gaussian process with continuous paths, and let

I_n = (a_n/2) ∫_0^1 t^{n−1} (X_1^2 − X_t^2) dt,

where the a_n are suitable constants. Fix β ∈ (0, 1), c_n > 0 and c > 0, and denote by N_c the centered Gaussian kernel with (random) variance cX_1^2. Under a Hölder condition on the covariance function of X, there is a constant k(β) such that

‖P(√c_n I_n ∈ ·) − E[N_c(·)]‖ ≤ k(β) (a_n/n^{1+α})^β + |c_n − c|/c for all n ≥ 1,

where ‖·‖ is total variation distance and α the Hölder exponent of the covariance function. Moreover, if a_n/n^{1+α} → 0 and c_n → c, then √c_n I_n converges ‖·‖-stably to N_c, in the sense that

‖P_F(√c_n I_n ∈ ·) − E_F[N_c(·)]‖ → 0

for every measurable F with P(F) > 0. In particular, such results apply to X = fractional Brownian motion. In that case, they strictly improve the existing results in [5] and provide an essentially optimal rate of convergence.
1. Introduction
Malliavin calculus is a powerful tool which leads to effective results in a wealth of frameworks. Being a general approach, however, Malliavin calculus cannot be expected to give optimal results in every specific problem. Indeed, it may happen that (much) stronger results can be obtained through standard methods specifically tailored to the problem at hand.
This paper provides an example of this fact.
In [5], Nourdin, Nualart and Peccati establish general results on Malliavin operators and then apply such results to some (meaningful) special cases. Following this route, they get stable limit theorems for quadratic functionals of fractional Brownian motion.
Let B be a fractional Brownian motion with Hurst parameter H on the probability space (Ω, F, P). Define

A_n = (n^{1+H}/2) ∫_0^1 t^{n−1} (B_1^2 − B_t^2) dt.
The asymptotics of A_n and of other analogous functionals (such as weighted power variations) is investigated in various papers; see e.g. [3], [5], [6] and references therein. One more reason for dealing with A_n is that, for H ≥ 1/4,

∫_0^1 t^n B_t dB_t = A_n/n^H − H/(2H + n),

where the stochastic integral is meant in Skorohod's sense (it reduces to an Itô integral if H = 1/2); see [1].

2010 Mathematics Subject Classification. 60B10, 60F05, 60G15.
Key words and phrases. Fractional Brownian motion, Gaussian process, Stable convergence, Total variation distance.
Let c = H Γ(2H) and let N_c be the Gaussian random probability measure with mean 0 and (random) variance cB_1^2. We denote by E[N_c(·)] the probability measure on B(R) given by A ↦ E[N_c(A)]. Equivalently, E[N_c(·)] may be regarded as the probability distribution of √c U B_1, where U is a standard normal random variable independent of B_1.
In [5, Theorem 3.6] it is shown that, if H ∈ [1/2, 1), then A_n converges stably to N_c and

‖P(A_n ∈ ·) − E[N_c(·)]‖ ≤ k n^{−(1−H)/15} for all n ≥ 1,

where ‖·‖ is total variation distance and k a constant independent of n.
In this paper it is proved that, for every H ∈ (0, 1) and β ∈ (0, 1), there is a constant k(H, β) such that

‖P(A_n ∈ ·) − E[N_c(·)]‖ ≤ k(H, β) n^{−β a(H)} for all n ≥ 1, (1)

where

a(H) = 1/2 − |1/2 − H|.
Furthermore, for fixed ε > 0,

‖P(A_n ∈ ·) − E[N_c(·)]‖ is not O(n^{−a(H)−ε}). (2)
Roughly speaking, ‖P(A_n ∈ ·) − E[N_c(·)]‖ can be estimated, for every H ∈ (0, 1) (and not only for H ∈ [1/2, 1)), with a rate arbitrarily close to n^{−H} or n^{−(1−H)}, according to whether H < 1/2 or H ≥ 1/2. Also, in view of (2), such a rate is quite close to optimal.
In addition, A_n converges stably to N_c in a very strong sense. In fact,

‖P_F(A_n ∈ ·) − E_F[N_c(·)]‖ → 0 (3)

for every F ∈ F with P(F) > 0, where P_F(·) = P(· | F) is the probability measure conditional on F and E_F denotes expectation under P_F.
Both (1) and (3) are proved by standard, elementary tools. Thus, a natural question is whether they can also be obtained by Malliavin calculus.
A last note is that the class of functionals covered by this paper is actually richer than it appears so far. In fact, (1) and (3) are strengthened as follows.
(i) Up to replacing n^{1+H} and c with appropriate constants, (1) and (3) are still true if B is any centered Gaussian process with a suitable covariance function.
(ii) Let K_t = B_t − E(B_t B_1) B_1 and T = (K_{t_1}, ..., K_{t_m}), where t_1, ..., t_m ∈ [0, 1]. Then, inequality (1) generalizes into

‖P((T, A_n) ∈ ·) − Q_c(·)‖ ≤ k n^{−β a(H)} for all n ≥ 1,

where the constant k depends on t_1, ..., t_m, H and β, while Q_c is a suitable probability measure on B(R^{m+1}). Further, the pairs (T, A_n) converge stably (in the strong sense mentioned above) to the product kernel δ_T × N_c.

(iii) Our method of proof allows us to handle functionals more general than A_n, such as

A'_n = (n^{1+H}/p) ∫_0^1 t^{n−1} (B_1^p − B_t^p) {1 + g(K_t)} dt,

where K has been defined in (ii), g is a suitable function and p ≥ 2 is any integer. In particular, contrary to Malliavin calculus, the rate of convergence of A'_n does not depend on the degree of g when g is a polynomial.

Points (i)-(ii) are developed in Sections 3-4, while point (iii) is only briefly discussed in Section 5. However, A'_n can be handled by exactly the same techniques used for A_n.
2. Preliminaries
2.1. Notation. All random elements involved in this paper are defined on a fixed probability space (Ω, F, P).
If F ∈ F and P(F) > 0, we let P_F(A) = P(A ∩ F)/P(F) for all A ∈ F and we write E_F to denote expectation under P_F.
Let X be a topological space. Then, B(X) denotes the Borel σ-field on X and P(X) the collection of probability measures on B(X). Also, ‖·‖ is the total variation distance on P(X), namely,

‖µ − ν‖ = sup_{A ∈ B(X)} |µ(A) − ν(A)| whenever µ, ν ∈ P(X).
A measurable map L : Ω → P(X) is called a random probability measure on X or a kernel on X. Measurability means that ω ↦ L(ω)(A) is F-measurable for fixed A ∈ B(X).
We denote by δ_x the unit mass at the point x and we write X ∼ µ to mean that µ is the probability distribution of the random variable X. Further, N(a, b) is the Gaussian law on B(R) with mean a ∈ R and variance b ≥ 0 (with N(a, 0) = δ_a). Finally, the standard normal density is denoted by φ, namely, φ(x) = (2π)^{−1/2} exp(−x^2/2).
2.2. A (natural) extension of stable convergence. Let G ⊂ F be a sub-σ-field.
For each U ⊂ F, write

U^+ = {F ∈ U : P(F) > 0}.
Also, let X be a separable metric space, L a random probability measure on X and X_n a sequence of X-valued random variables.
Then, X_n converges to L stably with respect to G if

P_G(X_n ∈ ·) → E_G[L(·)] weakly

as n → ∞ for all G ∈ G^+. Here, E_G[L(·)] stands for the probability measure on B(X) given by A ↦ E_G[L(A)].
Stable convergence can be extended, consistently with Rényi's original ideas [7], as follows. Fix a distance d on P(X) and say that X_n converges to L, d-stably with respect to G, if

d( P_G(X_n ∈ ·), E_G[L(·)] ) → 0

as n → ∞ for all G ∈ G^+. Such a definition reduces to the previous one if d is any distance metrizing weak convergence of probability measures.
In the sequel, we let

G = F and d = ‖·‖ = total variation distance.

Since G = F, for simplicity, G is not mentioned at all. Hence, X_n is said to converge ‖·‖-stably to L if

lim_n ‖P_F(X_n ∈ ·) − E_F[L(·)]‖ = 0 (4)

for all F ∈ F^+.
A last note is that, to prove ‖·‖-stable convergence, it suffices to get (4) for certain conditioning events F.
Lemma 1. Let U ⊂ F be such that: (i) Ω ∈ U; (ii) U is closed under finite intersections; (iii) σ(U) ⊃ σ(L, X_1, X_2, ...). If condition (4) holds for all F ∈ U^+, then X_n converges ‖·‖-stably to L.
Proof. By (i)-(ii) and the inclusion-exclusion formulae, condition (4) holds for each F in the field generated by U. Hence, it can be assumed that U is a field. Note also that, for any F, G ∈ F^+,

‖P_F(X_n ∈ ·) − E_F[L(·)]‖ ≤ { 2P(F∆G) + ‖P_G(X_n ∈ ·) − E_G[L(·)]‖ } / P(F).
Next, fix F ∈ σ(U)^+ and ε > 0. Since U is a field, there is G ∈ U^+ such that 2P(F∆G)/P(F) < ε. Thus, condition (4) holds for all F ∈ σ(U)^+. It follows that

sup_{A ∈ B(X)} | E[1_{X_n ∈ A} V] − E[L(A) V] | → 0

provided the random variable V is bounded and σ(U)-measurable. Therefore, because of (iii), for each F ∈ F^+ one obtains

‖P_F(X_n ∈ ·) − E_F[L(·)]‖ = (1/P(F)) sup_{A ∈ B(X)} | E[1_{X_n ∈ A} E(1_F | σ(U))] − E[L(A) E(1_F | σ(U))] | → 0.
2.3. Technical lemmas. A few simple facts are needed to prove our main result (Theorem 6). They are most probably known, but we provide proofs since we are not aware of any reference.
Lemma 2. Let X_n and X be real random variables. If

E(|X_n|^p) = O(1) and ‖P(X_n ∈ ·) − P(X ∈ ·)‖ = O(n^{−b})

for all p > 0 and some b > 0, then

|E(X_n) − E(X)| = O(n^{−b+ε}) for all ε > 0.
Proof. First note that E(|X|) ≤ lim inf_n E(|X_n|) < ∞. Up to considering positive and negative parts, it can be assumed X_n ≥ 0 and X ≥ 0. Then, for all n ≥ 1, t_0 > 0 and p > 1, one obtains

|E(X_n) − E(X)| = | ∫_0^∞ (P(X_n > t) − P(X > t)) dt |
≤ t_0 ‖P(X_n ∈ ·) − P(X ∈ ·)‖ + ∫_{t_0}^∞ (P(X_n > t) + P(X > t)) dt
≤ t_0 c n^{−b} + ∫_{t_0}^∞ (E(X_n^p) + E(X^p))/t^p dt
≤ t_0 c n^{−b} + ∫_{t_0}^∞ (c_p/t^p) dt = t_0 c n^{−b} + c_p t_0^{1−p}/(p − 1),

where c and c_p are suitable constants. Thus, given ε > 0, it suffices to take t_0 = n^ε and p > max(1, b/ε).
Lemma 3. If a_1, a_2 ∈ R, 0 ≤ b_1 ≤ b_2 and b_2 > 0, then

‖N(a_1, b_1) − N(a_2, b_2)‖ ≤ 1 − √(b_1/b_2) + |a_1 − a_2|/√(2π b_2).
Proof. The lemma is trivially true if b_1 = 0. Hence, it can be assumed b_1 > 0. Let u = √(b_1/b_2). Since u ≤ 1, then φ(x) ≤ φ(ux) for each x ≥ 0. Thus,

‖N(a_1, b_1) − N(a_1, b_2)‖ = ‖N(0, b_1) − N(0, b_2)‖
= (1/2) ∫_{−∞}^∞ | (1/√b_1) φ(x/√b_1) − (1/√b_2) φ(x/√b_2) | dx = ∫_0^∞ |φ(x) − u φ(ux)| dx
≤ (1 − u) ∫_0^∞ φ(x) dx + u ∫_0^∞ (φ(ux) − φ(x)) dx
= (1 − u)/2 + 1/2 − u/2 = 1 − u.
Letting α = (a_1 − a_2)/√b_2, one also obtains

‖N(a_1, b_2) − N(a_2, b_2)‖ = ‖N(α, 1) − N(0, 1)‖
= (1/2) ∫_{−∞}^∞ |φ(x − α) − φ(x)| dx = ∫_{−|α|/2}^{|α|/2} φ(x) dx ≤ |α|/√(2π).

Therefore,

‖N(a_1, b_1) − N(a_2, b_2)‖ ≤ ‖N(a_1, b_1) − N(a_1, b_2)‖ + ‖N(a_1, b_2) − N(a_2, b_2)‖ ≤ 1 − u + |α|/√(2π).
If a_1 = a_2, Lemma 3 is well known; see e.g. [4, Proposition 3.6.1]. In this case, one trivially obtains

‖N(a, b_1) − N(a, b_2)‖ ≤ |b_1 − b_2|/b_i for each i such that b_i > 0.

As already pointed out, Lemma 3 is most probably known even if a_1 ≠ a_2, but we do not know of any reference.
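The bound in Lemma 3 is easy to check numerically. The sketch below is ours, not part of the paper (all function names are invented for illustration): it approximates the total variation distance between two Gaussian laws by a midpoint Riemann sum of the density difference and compares it with the right-hand side of the lemma.

```python
import math

def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def tv_normal(a1, b1, a2, b2, grid=100000, span=12.0):
    # Total variation = (1/2) * integral of |density difference| (midpoint rule).
    lo, hi = min(a1, a2) - span, max(a1, a2) + span
    h = (hi - lo) / grid
    s = sum(abs(normal_pdf(lo + (i + 0.5) * h, a1, b1)
                - normal_pdf(lo + (i + 0.5) * h, a2, b2)) for i in range(grid))
    return 0.5 * s * h

def lemma3_bound(a1, b1, a2, b2):
    # Right-hand side of Lemma 3, assuming 0 < b1 <= b2.
    return 1.0 - math.sqrt(b1 / b2) + abs(a1 - a2) / math.sqrt(2.0 * math.pi * b2)

for (a1, b1, a2, b2) in [(0.0, 1.0, 0.0, 2.0), (0.3, 0.5, -0.2, 1.5), (1.0, 2.0, 1.0, 2.0)]:
    assert tv_normal(a1, b1, a2, b2) <= lemma3_bound(a1, b1, a2, b2) + 1e-6
```

For instance, with a_1 = a_2 = 0, b_1 = 1, b_2 = 2, the computed distance is about 0.17 against a bound of 1 − 1/√2 ≈ 0.29, so the lemma is loose but of the right order.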
In the next result, U, T_1, ..., T_m are real random variables and T = (T_1, ..., T_m).
Lemma 4. Let λ and µ be the probability distributions of T and (T, U), respectively, and let

ν = λ × N(0, b), where b > 0.

If (T, U) has a centered Gaussian distribution, then

‖µ − ν‖ ≤ (r/√b) { | √b − √(E(U^2)) | + Σ_{i=1}^m |E(U T_i)| },

where the constant r depends only on λ.
Proof. Throughout this proof, each x ∈ R^m is regarded as a column vector, x′ is the transpose of x, and

‖x‖_* = Σ_{i=1}^m |x_i|.
Let Σ be the covariance matrix of T. If rank(Σ) = 0, then T_1 = ... = T_m = 0 a.s. and the lemma is trivially true because of Lemma 3. If 0 < rank(Σ) < m, there is a subvector T′ of T and a (linear) function h such that T = h(T′) a.s. and T′ has non-singular covariance matrix. In that case, denoting by V a random variable independent of T such that V ∼ N(0, b), one obtains

‖µ − ν‖ = ‖P((T, U) ∈ ·) − P((T, V) ∈ ·)‖ = ‖P((h(T′), U) ∈ ·) − P((h(T′), V) ∈ ·)‖
≤ ‖P((T′, U) ∈ ·) − P((T′, V) ∈ ·)‖.

Thus, up to replacing T with T′, it can be assumed rank(Σ) = m.
Let C′ = (E(U T_1), ..., E(U T_m)). For t ∈ R^m, define

µ_t = N( C′ Σ^{−1} t, E(U^2) − C′ Σ^{−1} C ).

Then, {µ_t : t ∈ R^m} is a version of the conditional distribution of U given T. By Lemma 3,

‖µ_t − N(0, b)‖ ≤ (1/√b) { | √b − √(E(U^2) − C′ Σ^{−1} C) | + |C′ Σ^{−1} t| }
≤ (1/√b) { | √b − √(E(U^2)) | + √(C′ Σ^{−1} C) + |C′ Σ^{−1} t| }.

Hence, there is a constant r_0 depending only on Σ such that

‖µ_t − N(0, b)‖ ≤ (r_0/√b) { | √b − √(E(U^2)) | + ‖C‖_* + ‖C‖_* ‖Σ^{−1} t‖_* }.

Define

r = r_0 ( 1 + ∫ ‖Σ^{−1} t‖_* λ(dt) )

(recall that λ is the probability distribution of T). Then,

‖µ − ν‖ ≤ ∫ ‖µ_t − N(0, b)‖ λ(dt)
≤ (r_0/√b) { | √b − √(E(U^2)) | + ‖C‖_* + ‖C‖_* ∫ ‖Σ^{−1} t‖_* λ(dt) }
≤ (r/√b) { | √b − √(E(U^2)) | + ‖C‖_* }.
The last lemma plays a key role in proving Theorem 6.
Lemma 5. Let U be a standard normal random variable and D = {U > x}, where x ∈ R ∪ {−∞}. Then, there is a constant q ≥ 1 (independent of x) such that

‖P_D(rU^2 + yU + z ∈ ·) − P_D(yU ∈ ·)‖ ≤ min{ 1, (q/P(D)) (|r| + |z|)/|y| }

for all r, y, z ∈ R with y ≠ 0.
Proof. On noting that

‖P_D(rU^2 + yU + z ∈ ·) − P_D(yU ∈ ·)‖ = ‖P_D((r/y)U^2 + U + z/y ∈ ·) − P_D(U ∈ ·)‖,

it can be assumed y = 1.
By Lemma 3, ‖P(U + z ∈ ·) − P(U ∈ ·)‖ ≤ |z|. Hence,

‖P(rU^2 + U + z ∈ ·) − P(U ∈ ·)‖ ≤ ‖P(rU^2 + U + z ∈ ·) − P(U + z ∈ ·)‖ + |z|
= ‖P(rU^2 + U ∈ ·) − P(U ∈ ·)‖ + |z|.
Define

η = P( {U > x} ∆ {rU^2 + U + z > x} ).

Then,

P(D) ‖P_D(rU^2 + U + z ∈ ·) − P_D(U ∈ ·)‖ ≤ η + ‖P(rU^2 + U + z ∈ ·) − P(U ∈ ·)‖
≤ η + |z| + ‖P(rU^2 + U ∈ ·) − P(U ∈ ·)‖.
Since {U + z > x} ⊂ {rU^2 + U + z > x} if r > 0 and {rU^2 + U + z > x} ⊂ {U + z > x} if r < 0, one also obtains

η ≤ P( {U > x} ∆ {U + z > x} ) + P( {U + z > x} ∆ {rU^2 + U + z > x} )
≤ |z| + | P(U + z > x) − P(rU^2 + U + z > x) |
≤ |z| + ‖P(rU^2 + U ∈ ·) − P(U ∈ ·)‖.
In addition, since −U ∼ U,

‖P(−rU^2 + U ∈ ·) − P(U ∈ ·)‖ = ‖P(rU^2 + U ∈ ·) − P(U ∈ ·)‖.

To summarize, to prove the lemma, it suffices to see that there is a constant q ≥ 1 such that

‖P(rU^2 + U ∈ ·) − P(U ∈ ·)‖ ≤ q r for each r ∈ (0, 1]. (5)
We finally prove (5). Define

q = 7 + 6 ∫_0^∞ u^3 φ( 2u/(1 + √(1 + 4u)) ) du

and fix r ∈ (0, 1]. Let g be the density of rU^2 + U with respect to Lebesgue measure. Then, g(u) = 0 if u ≤ −1/(4r) and g(u) = φ_1(u) + φ_2(u) if u > −1/(4r), where

φ_1(u) = φ( (−1 + √(1 + 4ru))/(2r) ) / √(1 + 4ru),  φ_2(u) = φ( −(1 + √(1 + 4ru))/(2r) ) / √(1 + 4ru).
On noting that P(U < −1/c) ≤ c for all c > 0 and

∫_{−1/(4r)}^∞ φ_2(u) du = ∫_{−∞}^{−1/(2r)} φ(u) du = P(U < −1/(2r)),

one obtains
2 ‖P(rU^2 + U ∈ ·) − P(U ∈ ·)‖ = ∫_{−∞}^∞ |g(u) − φ(u)| du ≤ 6r + ∫_{−1/(4r)}^∞ |φ_1(u) − φ(u)| du
≤ 6r + ∫_{−1/(4r)}^∞ φ( (−1 + √(1 + 4ru))/(2r) ) (4r|u|/√(1 + 4ru)) du + ∫_{−1/(4r)}^∞ | φ( (−1 + √(1 + 4ru))/(2r) ) − φ(u) | du.
Since r ≤ 1,

∫_{−1/(4r)}^∞ φ( (−1 + √(1 + 4ru))/(2r) ) (4r|u|/√(1 + 4ru)) du = 4r ∫_{−1/(2r)}^∞ |ru^2 + u| φ(u) du ≤ 4r E(U^2 + |U|) ≤ 8r.
Since

(−1 + √(1 + 4ru))/(2r) = 2u/(1 + √(1 + 4ru)) and | 2u/(1 + √(1 + 4ru)) − u | ≤ 4ru^2,

the mean value theorem yields
∫_{−1/(4r)}^∞ | φ( (−1 + √(1 + 4ru))/(2r) ) − φ(u) | du
≤ ∫_{−1/(4r)}^0 4ru^2 ( 2|u|/(1 + √(1 + 4ru)) ) φ(u) du + ∫_0^∞ 4ru^2 u φ( 2u/(1 + √(1 + 4ru)) ) du
≤ 8r ∫_{−1/(4r)}^0 |u|^3 φ(u) du + 4r ∫_0^∞ u^3 φ( 2u/(1 + √(1 + 4u)) ) du
≤ 12r ∫_0^∞ u^3 φ( 2u/(1 + √(1 + 4u)) ) du.
Collecting all these facts together,

‖P(rU^2 + U ∈ ·) − P(U ∈ ·)‖ ≤ 3r + 4r + 6r ∫_0^∞ u^3 φ( 2u/(1 + √(1 + 4u)) ) du = q r.

Thus, condition (5) holds, and this concludes the proof.

Although conceptually simple, the above proof is quite cumbersome. As suggested by an anonymous referee, it could be notably shortened by exploiting Stein's method to get (5). However, as noted in Section 1, one goal of this paper is to use elementary tools only. Thus, we have not adopted this shorter proof.
3. Results
From now on, X = {X_t : 0 ≤ t ≤ 1} is a centered Gaussian process with continuous paths and covariance function

f(s, t) = E(X_s X_t).

We assume f(1, 1) = E(X_1^2) = 1. More importantly, we assume that there are two constants α > 0 and γ > 0 such that

|f(s, t) − f(s, 1)| ≤ γ (1 − t)^α for all s, t ∈ [0, 1]. (6)
Note that, since the X-paths are continuous, condition (6) implies continuity of f .
Let

K_t = X_t − f(t, 1) X_1.

The process K is a sort of Brownian bridge based on X. In particular, K is independent of X_1. In the sequel, we fix m ≥ 1 and t_1, ..., t_m ∈ [0, 1], and we let

T = (K_{t_1}, ..., K_{t_m}).
We aim to investigate

I_n = (a_n/2) ∫_0^1 t^{n−1} (X_1^2 − X_t^2) dt,

where n ≥ 1 and the constant a_n is given by

a_n = { E[ ( ∫_0^1 t^{n−1} f(t, 1) K_t dt )^2 ] }^{−1/2}.

The expectation involved in a_n is finite but may be null. Thus, the convention 1/0 = ∞ is adopted.
For each c ≥ 0,

N_c = N(0, cX_1^2)

denotes the centered Gaussian kernel with variance cX_1^2. Also, Q_c is the product probability measure on B(R^{m+1}) given by

Q_c = P(T ∈ ·) × E[N_c(·)].
We are now able to state our main result.
Theorem 6. Let X be a centered Gaussian process with continuous paths and covariance function satisfying (6). Fix m ≥ 1, t_1, ..., t_m ∈ [0, 1], and define T = (K_{t_1}, ..., K_{t_m}). Then, for all β ∈ (0, 1), c_n > 0 and c > 0, one obtains

‖P((T, √c_n I_n) ∈ ·) − Q_c(·)‖ ≤ |c_n − c|/c + k (a_n/n^{1+α})^β

for all n ≥ 1, where k is a constant depending on t_1, ..., t_m and β. Moreover, if a_n/n^{1+α} → 0 and c_n → c, then (T, √c_n I_n) converges ‖·‖-stably to the product kernel δ_T × N_c.

Based on Theorem 6, if a_n/n^{1+α} → 0 and c_n → c, a conjecture is that (K, √c_n I_n) converges ‖·‖-stably to δ_K × N_c, where K is regarded as a random element with values in the space of continuous functions on [0, 1]. This is not true, however, as shown by Example 9.
Theorem 6 is actually a consequence of the following lemma, which is of possible independent interest.
Lemma 7. Let D = {X_1 > x}, where x ∈ R ∪ {−∞}, and let Q_{1,D} be the product probability measure on B(R^{m+1}) given by

Q_{1,D} = P(T ∈ ·) × E_D[N_1(·)].

(Note that Q_{1,Ω} = Q_1.) For every β ∈ (0, 1) there is a constant k, depending on x, t_1, ..., t_m and β, such that

‖P_D((T, I_n) ∈ ·) − Q_{1,D}(·)‖ ≤ k (a_n/n^{1+α})^β for all n ≥ 1.
Finally, Theorem 6 implies the results mentioned in Section 1, which in turn largely improve [5, Theorem 3.6].
Recall that a fractional Brownian motion is a centered Gaussian process with covariance function

f(s, t) = ( |s|^{2H} + |t|^{2H} − |s − t|^{2H} )/2,

where the Hurst parameter H ranges in (0, 1). Such an f satisfies condition (6) with α = min(1, 2H).
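To verify (6) for this f, note that 2(f(s, t) − f(s, 1)) = t^{2H} − 1 + (1 − s)^{2H} − |s − t|^{2H} for s, t ∈ [0, 1]. The following short derivation is ours (the constant γ = 2 is not claimed to be optimal):

```latex
2\,\bigl|f(s,t)-f(s,1)\bigr|
  \;\le\; \bigl|1-t^{2H}\bigr| \;+\; \Bigl|\,|s-t|^{2H}-(1-s)^{2H}\Bigr|
  \;\le\;
  \begin{cases}
    2\,(1-t)^{2H} & \text{if } 2H\le 1,\\[3pt]
    4H\,(1-t)     & \text{if } 2H\ge 1,
  \end{cases}
```

using |a^θ − b^θ| ≤ |a − b|^θ for a, b ≥ 0 and θ ∈ (0, 1], |a^θ − b^θ| ≤ θ|a − b| for a, b ∈ [0, 1] and θ ≥ 1, and ||s − t| − (1 − s)| ≤ 1 − t. In both cases |f(s, t) − f(s, 1)| ≤ 2(1 − t)^{min(1, 2H)}, so (6) holds with γ = 2 and α = min(1, 2H).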
Corollary 8. Let B be a fractional Brownian motion with Hurst parameter H. Define

c = H Γ(2H), a(H) = 1/2 − |1/2 − H|, A_n = (n^{1+H}/2) ∫_0^1 t^{n−1} (B_1^2 − B_t^2) dt,

and T = (K_{t_1}, ..., K_{t_m}) for some m ≥ 1 and t_1, ..., t_m ∈ [0, 1]. Then, (T, A_n) converges ‖·‖-stably to the product kernel δ_T × N_c. In addition, for each H ∈ (0, 1) and β ∈ (0, 1), one obtains

‖P((T, A_n) ∈ ·) − Q_c(·)‖ ≤ k n^{−β a(H)} for all n ≥ 1,

where k is a constant depending on H, t_1, ..., t_m and β.
As noted in Section 1 (see condition (2)), the rate of convergence provided by Corollary 8 is nearly optimal.
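As an illustration of Corollary 8 (our own simulation sketch, not from the paper; the grid, sample sizes and function names are arbitrary choices of ours), one can simulate fractional Brownian motion by Cholesky factorization of its covariance matrix, discretize A_n, and compare the empirical moments of A_n with those of the limit kernel N_c, whose mixing variance is E(cB_1^2) = c:

```python
import math
import random

def fbm_cov(s, t, H):
    # Covariance of fractional Brownian motion.
    return 0.5 * (s ** (2 * H) + t ** (2 * H) - abs(s - t) ** (2 * H))

def cholesky(mat):
    # Plain lower-triangular Cholesky factorization.
    n = len(mat)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(mat[i][i] - acc) if i == j else (mat[i][j] - acc) / L[j][j]
    return L

def sample_A_n(n, H, n_paths=400, grid=100, seed=1):
    # A_n = (n^{1+H}/2) * int_0^1 t^{n-1} (B_1^2 - B_t^2) dt, via a Riemann sum.
    rng = random.Random(seed)
    ts = [(i + 1.0) / grid for i in range(grid)]        # grid points; B_1 is the last one
    L = cholesky([[fbm_cov(s, t, H) for t in ts] for s in ts])
    w = [ts[i] ** (n - 1) / grid for i in range(grid)]  # quadrature weights for t^{n-1} dt
    out = []
    for _ in range(n_paths):
        z = [rng.gauss(0.0, 1.0) for _ in range(grid)]
        B = [sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(grid)]
        out.append(0.5 * n ** (1 + H) * sum(w[i] * (B[-1] ** 2 - B[i] ** 2) for i in range(grid)))
    return out

H, n = 0.3, 30
c = H * math.gamma(2 * H)   # mixing variance of the limit kernel N_c
xs = sample_A_n(n, H)
var = sum(x * x for x in xs) / len(xs) - (sum(xs) / len(xs)) ** 2
# Deterministic check: E(A_n) = H n^H / (n + 2H), against the same quadrature.
mean_quad = 0.5 * n ** (1 + H) * sum(((i + 1.0) / 100) ** (n - 1)
                                     * (1.0 - ((i + 1.0) / 100) ** (2 * H)) / 100
                                     for i in range(100))
assert abs(mean_quad - H * n ** H / (n + 2 * H)) < 0.05 * H * n ** H / (n + 2 * H)
```

With these choices the empirical variance of A_n should already be of the same order as c = HΓ(2H) ≈ 0.45; of course this is an illustration, not a proof.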
4. Proofs
Our proofs rely on two simple facts. First, I_n can be written as

I_n = r_n X_1^2 + Y_n X_1 + Z_n, (7)

where the r_n are constants and the (Y_n, Z_n) are random variables independent of X_1. Second, because of Lemma 5, the asymptotic behavior of I_n essentially agrees with that of Y_n X_1.
To prove (7), just note that, since X_t = K_t + f(t, 1) X_1,

I_n = (a_n/2) ∫_0^1 t^{n−1} { (1 − f(t, 1)^2) X_1^2 − 2 f(t, 1) K_t X_1 − K_t^2 } dt.

Thus, one obtains (7), with (Y_n, Z_n) independent of X_1, by letting

r_n = (a_n/2) ∫_0^1 t^{n−1} (1 − f(t, 1)^2) dt,
Y_n = −a_n ∫_0^1 t^{n−1} f(t, 1) K_t dt,  Z_n = −(a_n/2) ∫_0^1 t^{n−1} K_t^2 dt.
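The identity (7) is purely algebraic, so it can be verified path by path on a discretized process. The sketch below is ours: it uses Brownian motion, for which f(s, t) = min(s, t) and hence f(t, 1) = t, and sets a_n = 1, since a_n only rescales both sides of (7).

```python
import math
import random

rng = random.Random(42)
grid, n = 200, 10
ts = [(i + 1.0) / grid for i in range(grid)]

# Brownian motion on the grid: cumulative sums of independent N(0, 1/grid) increments.
X = []
acc = 0.0
for _ in range(grid):
    acc += rng.gauss(0.0, math.sqrt(1.0 / grid))
    X.append(acc)
X1 = X[-1]
K = [X[i] - ts[i] * X1 for i in range(grid)]        # K_t = X_t - f(t,1) X_1

w = [ts[i] ** (n - 1) / grid for i in range(grid)]  # weights for int_0^1 t^{n-1} (.) dt
I_n = 0.5 * sum(w[i] * (X1 ** 2 - X[i] ** 2) for i in range(grid))
r_n = 0.5 * sum(w[i] * (1.0 - ts[i] ** 2) for i in range(grid))
Y_n = -sum(w[i] * ts[i] * K[i] for i in range(grid))
Z_n = -0.5 * sum(w[i] * K[i] ** 2 for i in range(grid))

assert abs(I_n - (r_n * X1 ** 2 + Y_n * X1 + Z_n)) < 1e-12  # identity (7), path by path
```

The same check goes through for any covariance f, since only X_t = K_t + f(t, 1)X_1 is used.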
Two more remarks are in order. First, up to replacing k with max(k, 1), the inequalities in Theorem 6 and Lemma 7 are trivially true if a_n > n^{1+α} (in particular, if a_n = ∞). Accordingly, in the sequel, we assume that

a_n ≤ n^{1+α} for each n ≥ 1.
Second, for each ε > 0, the Stirling formula implies

( n^{1+ε}/Γ(ε + 1) ) ∫_0^1 t^{n−1} (1 − t)^ε dt = n^ε n!/Γ(n + ε + 1) = 1 + O(n^{−1}). (8)
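Identity (8) can be tested numerically (this check is ours). Since ∫_0^1 t^{n−1}(1 − t)^ε dt = Γ(n)Γ(ε + 1)/Γ(n + ε + 1), the left-hand side of (8) equals n^ε n!/Γ(n + ε + 1), which is easy to evaluate through log-gamma functions:

```python
import math

def lhs(n, eps):
    # (n^{1+eps} / Gamma(eps+1)) * Beta(n, eps+1) = n^eps * n! / Gamma(n+eps+1)
    return math.exp((1.0 + eps) * math.log(n) + math.lgamma(n) - math.lgamma(n + eps + 1.0))

for eps in (0.5, 1.0):
    for n in (10, 100, 1000):
        assert abs(lhs(n, eps) - 1.0) < 2.0 / n  # consistent with 1 + O(1/n)

# Cross-check the Beta integral by a midpoint Riemann sum (n = 10, eps = 0.5).
grid = 200000
quad = sum(((i + 0.5) / grid) ** 9 * math.sqrt(1.0 - (i + 0.5) / grid)
           for i in range(grid)) / grid
exact = math.exp(math.lgamma(10.0) + math.lgamma(1.5) - math.lgamma(11.5))
assert abs(quad / exact - 1.0) < 1e-3
```

For ε = 1 the ratio is exactly n/(n + 1), which makes the O(n^{−1}) error term explicit.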
4.1. Proof of Lemma 7. Let β ∈ (0, 1). Denote by λ_n and λ*_n the probability distributions of (Y_n, Z_n) and (T, Y_n, Z_n), respectively. Since I_n = r_n X_1^2 + Y_n X_1 + Z_n and X_1 is independent of (T, Y_n, Z_n), one obtains

‖P_D((T, I_n) ∈ ·) − P_D((T, Y_n X_1) ∈ ·)‖
≤ ∫ ‖P_D((t, r_n X_1^2 + yX_1 + z) ∈ ·) − P_D((t, yX_1) ∈ ·)‖ λ*_n(dt, dy, dz)
≤ ∫ ‖P_D(r_n X_1^2 + yX_1 + z ∈ ·) − P_D(yX_1 ∈ ·)‖ λ_n(dy, dz)
≤ ∫ min{ 1, (q/P(D)) (|r_n| + |z|)/|y| } λ_n(dy, dz)
= E[ min{ 1, (q/P(D)) (|r_n| + |Z_n|)/|Y_n| } ],

where q is a constant and the last inequality depends on Lemma 5. Thus, to prove Lemma 7, it suffices to see that
E[ min{ 1, (q/P(D)) (|r_n| + |Z_n|)/|Y_n| } ] = O( (a_n/n^{1+α})^β ) (9)

and

‖P_D((T, Y_n X_1) ∈ ·) − Q_{1,D}(·)‖ = O( (a_n/n^{1+α})^β ). (10)
We begin with (9). Fix an integer m ≥ 1 and define

δ = sup_t (1 + |f(t, 1)|) and µ_n(dt) = n 1_{[0,1]}(t) t^{n−1} dt.

By condition (6), |1 − f(t, 1)| ≤ γ(1 − t)^α. By the Stirling formula (8),

n ∫_0^1 t^{n−1} (1 − t)^{mα} dt ≤ b n^{−mα} for some constant b.
Hence,

|2 r_n|^m = (a_n/n)^m | ∫_0^1 (1 − f(t, 1)^2) µ_n(dt) |^m
≤ (a_n/n)^m ∫_0^1 |1 − f(t, 1)^2|^m µ_n(dt)
≤ (a_n/n)^m δ^m ∫_0^1 |1 − f(t, 1)|^m µ_n(dt) ≤ (a_n/n)^m (γδ)^m ∫_0^1 (1 − t)^{mα} µ_n(dt)
= (a_n/n)^m (γδ)^m n ∫_0^1 t^{n−1} (1 − t)^{mα} dt ≤ b (γδ)^m (a_n/n^{1+α})^m.
As to Z_n, first note that K_t is a centered Gaussian random variable with variance

E(K_t^2) = f(t, t) − f(t, 1)^2 ≤ |1 − f(t, t)| + |1 − f(t, 1)^2|
≤ δ |f(t, 1) − f(t, t)| + 2 |1 − f(t, 1)| ≤ 3γδ (1 − t)^α,

where the last inequality is by (6). Since X_1 ∼ N(0, 1),

E(K_t^{2m}) = E(X_1^{2m}) E(K_t^2)^m ≤ E(X_1^{2m}) (3γδ)^m (1 − t)^{mα}.

Therefore,

E(|2 Z_n|^m) ≤ (a_n/n)^m E[ ∫_0^1 K_t^{2m} µ_n(dt) ] = (a_n/n)^m ∫_0^1 E(K_t^{2m}) µ_n(dt)
≤ E(X_1^{2m}) (3γδ)^m (a_n/n)^m n ∫_0^1 t^{n−1} (1 − t)^{mα} dt ≤ b E(X_1^{2m}) (3γδ)^m (a_n/n^{1+α})^m.
Hence, for each fixed m ≥ 1, there is a constant b(m) such that

E[ (|r_n| + |Z_n|)^m ] ≤ (1/2) { |2 r_n|^m + E(|2 Z_n|^m) } ≤ b(m) (a_n/n^{1+α})^m.

Next, take m > β/(1 − β). On noting that min(1, u) ≤ u^β for every u ≥ 0 (since 0 < β < 1), one obtains
E[ min{ 1, (q/P(D)) (|r_n| + |Z_n|)/|Y_n| } ] ≤ (q/P(D))^β E[ (|r_n| + |Z_n|)^β |Y_n|^{−β} ]
≤ (q/P(D))^β E[ (|r_n| + |Z_n|)^m ]^{β/m} E[ |Y_n|^{−mβ/(m−β)} ]^{(m−β)/m},

where the second inequality depends on the Hölder inequality with exponent p = m/β. Further,

E[ |Y_n|^{−mβ/(m−β)} ] = E[ |X_1|^{−mβ/(m−β)} ] < ∞,

since Y_n ∼ N(0, 1) ∼ X_1 and mβ < m − β. Thus, one finally obtains
E[ min{ 1, (q/P(D)) (|r_n| + |Z_n|)/|Y_n| } ] ≤ b* (a_n/n^{1+α})^β

for some constant b*.
It remains to prove (10). For each u ∈ R \ {0}, let ν_u be the product probability measure on B(R^{m+1}) given by

ν_u = P(T ∈ ·) × N(0, u^2).

Recalling that T = (K_{t_1}, ..., K_{t_m}) and Y_n u ∼ N(0, u^2), Lemma 4 implies

‖P((T, Y_n u) ∈ ·) − ν_u(·)‖ ≤ (r/|u|) Σ_{i=1}^m |E(K_{t_i} Y_n u)| = r Σ_{i=1}^m |E(K_{t_i} Y_n)|,

where the constant r depends only on the distribution of T. Conditioning on X_1, it follows that
‖P_D((T, Y_n X_1) ∈ ·) − Q_{1,D}(·)‖ ≤ (1/P(D)) ∫_x^∞ ‖P((T, Y_n u) ∈ ·) − ν_u(·)‖ φ(u) du
≤ (r/P(D)) Σ_{i=1}^m |E(K_{t_i} Y_n)|.
On the other hand,

−E(K_{t_i} Y_n) = a_n E[ ∫_0^1 t^{n−1} f(t, 1) K_{t_i} K_t dt ] = a_n ∫_0^1 t^{n−1} f(t, 1) E(K_{t_i} K_t) dt
= a_n ∫_0^1 t^{n−1} f(t, 1) { f(t_i, t) − f(t_i, 1) f(t, 1) } dt.
By condition (6),

| f(t, 1) { f(t_i, t) − f(t_i, 1) f(t, 1) } | ≤ γδ(δ − 1) (1 − t)^α.

Therefore, using (8) again, there is a constant b** such that

‖P_D((T, Y_n X_1) ∈ ·) − Q_{1,D}(·)‖ ≤ ( m r γδ(δ − 1)/P(D) ) a_n ∫_0^1 t^{n−1} (1 − t)^α dt ≤ b** a_n/n^{1+α}.

Finally, since a_n ≤ n^{1+α} and β ∈ (0, 1), one also obtains

‖P_D((T, Y_n X_1) ∈ ·) − Q_{1,D}(·)‖ ≤ b** (a_n/n^{1+α})^β.

Thus, condition (10) holds, and this concludes the proof of Lemma 7.
4.2. Lemma 7 implies Theorem 6. First note that ‖µ − ν‖ = ‖µ ∘ h^{−1} − ν ∘ h^{−1}‖ whenever µ and ν are probability laws on B(R^{m+1}) and h : R^{m+1} → R^{m+1} is bijective and Borel. Given a constant b > 0, take

h(u, v) = (u, v/√b) for u ∈ R^m and v ∈ R.

On noting that Q_b ∘ h^{−1} = Q_1, one obtains

‖P((T, √b I_n) ∈ ·) − Q_b(·)‖ = ‖P((T, I_n) ∈ ·) − Q_1(·)‖.
Fix β ∈ (0, 1), c_n > 0 and c > 0. By Lemma 7,

‖P((T, √c_n I_n) ∈ ·) − Q_{c_n}(·)‖ = ‖P((T, I_n) ∈ ·) − Q_1(·)‖ ≤ k (a_n/n^{1+α})^β

for some constant k and every n ≥ 1. By Lemma 3,

‖Q_{c_n}(·) − Q_c(·)‖ ≤ E{ ‖N(0, c_n X_1^2) − N(0, cX_1^2)‖ } ≤ |c_n − c|/c.

Therefore,

‖P((T, √c_n I_n) ∈ ·) − Q_c(·)‖ ≤ |c_n − c|/c + k (a_n/n^{1+α})^β for all n ≥ 1.
Next, suppose a_n/n^{1+α} → 0 and c_n = c = 1. We have to show that (T, I_n) converges ‖·‖-stably to δ_T × N_1. To this end, because of Lemma 1, we can restrict to those conditioning events F of the type

F = {S ∈ A, X_1 > x},

where S = (K_{s_1}, ..., K_{s_p}) for some p ≥ 1 and s_1, ..., s_p ∈ [0, 1], A ∈ B(R^p) and x ∈ R ∪ {−∞}. Take one such F, with P(F) > 0, and write D = {X_1 > x}. Then,

P_F((T, I_n) ∈ ·) = P_D(S ∈ A, (T, I_n) ∈ ·)/P(S ∈ A)

and

E_F[δ_T × N_1(·)] = E_D[1_{S ∈ A} δ_T × N_1(·)]/P(S ∈ A).
Let T* = (S, T) and Q*_{1,D} = P(T* ∈ ·) × E_D[N_1(·)]. By Lemma 7, applied with T* and Q*_{1,D} in the place of T and Q_{1,D}, there is a constant k such that

‖P_F((T, I_n) ∈ ·) − E_F[δ_T × N_1(·)]‖ ≤ ‖P_D((T*, I_n) ∈ ·) − Q*_{1,D}(·)‖/P(S ∈ A) ≤ k (a_n/n^{1+α})^β/P(S ∈ A).

Hence, (T, I_n) converges ‖·‖-stably to δ_T × N_1 since a_n/n^{1+α} → 0.
Finally, suppose a_n/n^{1+α} → 0 and c_n → c. Fix F ∈ F^+. Arguing as above,

‖P_F((T, √c_n I_n) ∈ ·) − E_F[δ_T × N_{c_n}(·)]‖ = ‖P_F((T, I_n) ∈ ·) − E_F[δ_T × N_1(·)]‖.

Thus,

‖P_F((T, √c_n I_n) ∈ ·) − E_F[δ_T × N_c(·)]‖
≤ ‖P_F((T, I_n) ∈ ·) − E_F[δ_T × N_1(·)]‖ + ‖E_F[δ_T × N_{c_n}(·)] − E_F[δ_T × N_c(·)]‖
≤ ‖P_F((T, I_n) ∈ ·) − E_F[δ_T × N_1(·)]‖ + |c_n − c|/c → 0.

Therefore, (T, √c_n I_n) converges ‖·‖-stably to δ_T × N_c, and this concludes the proof of the implication "Lemma 7 ⇒ Theorem 6".
4.3. Theorem 6 implies Corollary 8. Define c = H Γ(2H), a(H) = 1/2 − |1/2 − H|, and recall that α = min(1, 2H) and A_n = (n^{1+H}/a_n) I_n