EDGEWORTH AND CORNISH FISHER EXPANSIONS AND CONFIDENCE INTERVALS FOR THE DISTRIBUTION, DENSITY AND QUANTILES OF KERNEL DENSITY ESTIMATES C.S. Withers, S. Nadarajah
1. INTRODUCTION
Our aim here is to present nonparametric confidence intervals (CIs) for densities on $\Re$ based on kernel estimates. Suppose we have a random sample $X_1, \ldots, X_n$ of size $n$ on $\Re$ with empirical distribution $\hat F(x)$, from a distribution $X \sim F(x)$ with density $f(x_0)$ and derivatives $f_{\cdot r}(x_0)$, $r = 1, 2, \ldots$, finite at some point $x_0$. Let $K(z)$ be a function on $\Re$ with the finite values
$B_{ri} = B_{ri}(K) = \int z^r K(z)^i \, dz$
for $r = 0, 1, \ldots$, $i = 1, 2, \ldots$ and $B_{01} = 1$. The kernel estimate for $f(x_0)$ of kernel $K$ and bandwidth (or smoothing) parameter $h$ is
$\hat f(x_0) = \int K_h(x_0 - y) \, d\hat F(y) = n^{-1} \sum_{i=1}^{n} U_{1i}$, (1)
where $K_h(x) = h^{-1} K(x/h)$ and $U_{1i} = K_h(x_0 - X_i)$. The bandwidth $h = h(n)$ is assumed to satisfy $h \to 0$ as $n \to \infty$. This estimate has mean
$T_h(F, K) = \int K_h(x_0 - y) \, dF(y) = \int K(z) f(x_0 - hz) \, dz = \sum_{r=0}^{\infty} f_{\cdot r}(x_0) (-h)^r B_{r1} / r!$. (2)
Note that the existence of the full Taylor series in (2) is not needed for the results in this paper. The boundedness of the required number of derivatives of $f$ is sufficient to validate the results presented.
Now suppose that the kernel $K$ is of order $p \ge 1$, that is, $B_{r1} = 0$ for $1 \le r < p$ and $B_{p1} \ne 0$. So, $E \hat f(x_0) = f(x_0) + O(h^p)$. Also $\mathrm{var}\{\hat f(x_0)\} = \{f(x_0) B_{02} + O(h)\}/(nh)$
so its minimum asymptotic mean square error (AMSE) is
$E|\hat f(x_0) - f(x_0)|^2 = O(n^{-\gamma})$ at $h = cn^{-\alpha}$ (3)
for $c > 0$ and $\alpha = 1/(2p+1)$. This rate $\gamma$ holds of course for any $c > 0$, not just the AMSE-optimal $c = c(f, x_0)$ say. Compare Parzen (1962). (Allowing $c$ to depend on $x_0$ means that $\hat f(x_0)$ no longer integrates to one.) In fact, this choice of $\alpha$ achieves minimum asymptotic integrated MSE (AIMSE)
$\int E|\hat f(x) - f(x)|^2 \, dx = O(n^{-\gamma})$ (4)
under suitable regularity conditions for $\gamma = 2p/(2p+1)$ and $c > 0$, not just the AIMSE-optimal $c = c(f)$ say. See Theorem 2.1.7 of Prakasa Rao (1983), and (3.4) of Terrell and Scott (1992). Using either AMSE or AIMSE, optimal $c$ can be estimated using an initial estimate for $f$. See, for example, Scott et al. (1977). However, the use of such an estimate may destroy its optimality property.
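As an illustration of (1) and (3), the following minimal Python sketch evaluates the kernel estimate at a point. The Gaussian kernel (order $p = 2$), the default $c = 1$ and the function names are assumptions made for this example only; they are not prescribed by the text.

```python
import math

def gaussian_kernel(z):
    """Standard normal density, a symmetric kernel of order p = 2 (assumed here)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def bandwidth(n, c=1.0, p=2):
    """Bandwidth h = c * n**(-alpha) with alpha = 1/(2p+1), as in (3)."""
    return c * n ** (-1.0 / (2 * p + 1))

def kde_at(x0, sample, h, kernel=gaussian_kernel):
    """Kernel estimate (1): the average of K_h(x0 - X_i), with K_h(x) = K(x/h)/h."""
    return sum(kernel((x0 - x) / h) for x in sample) / (len(sample) * h)
```

For example, `kde_at(0.0, sample, bandwidth(len(sample)))` estimates $f(0)$ from a list `sample` of observations.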
Kernel estimates were introduced by Rosenblatt (1956). Since then there has been a huge literature on the subject. We mention only a few here. Silverman (1986) and Terrell and Scott (1992) show that the gain in optimizing $c$ is generally not great compared with fixed $c$ estimators. For example, the latter have asymptotic efficiency about 0.9 or 0.8 for $f$ normal or Cauchy. (Terrell and Scott (1992) scale $K$ so that $B_{p1} = \pm 1$.) Prakasa Rao (1983) has many related results. For example, some give convergence rates for $\sup_x |\hat f(x) - f(x)|$. See Theorems 2.1.10, 12, 15 and 20 and Theorems 2.2.3 and 2.2.4 for similar results. Silverman (1986) and Terrell and Scott (1992) also consider estimates of the form
$\hat f(x_0) = \int K_{h(y)}(x_0 - y) \, d\hat F(y)$
for $K_h$ of (1). When $p = 2$ one can choose $h(y)$ to achieve (4) as if $p = 4$, that is with $\gamma = 8/9$. Optimizing $c$ is one way of deciding how to choose $h$ or equivalently how to scale $K$. Another way is to replace $h$ by $h(F_n) = cn^{-\alpha} \mu_2(F_n)^{1/2}$, where $\mu_2(F)$ is the variance and $F_n$ the empirical distribution. One can show that this does not affect the optimal rate $\gamma$. Silverman (1986, page 45) suggests this with $\alpha = 1/(2p+1)$ for $p = 2$ and $c = 1.06$.
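The data-scaled bandwidth $h(F_n) = cn^{-\alpha} \mu_2(F_n)^{1/2}$ can be sketched as follows. This is a hedged illustration: the function name is hypothetical, and $\mu_2(F_n)$ is taken to be the variance of the empirical distribution (divisor $n$), which is an interpretation of the notation rather than something the text spells out.

```python
import math

def scaled_bandwidth(sample, c=1.06, p=2):
    """h = c * n**(-1/(2p+1)) * sqrt(variance of the empirical distribution)."""
    n = len(sample)
    mean = sum(sample) / n
    # Variance of the empirical distribution F_n (divisor n, an assumption).
    var = sum((x - mean) ** 2 for x in sample) / n
    return c * n ** (-1.0 / (2 * p + 1)) * math.sqrt(var)
```

With the defaults $c = 1.06$ and $p = 2$ this reduces to $1.06 \, \hat\sigma \, n^{-1/5}$.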
Turning now to CIs for $f(x_0)$, the AMSE and AIMSE criteria are no longer relevant. The obvious criterion is to minimise the coverage error (the difference between the actual coverage probability and the nominal level of the CI) or rather the asymptotic coverage error (ACE). Theorem 2.1.18 of Prakasa Rao (1983) gives a consistent CI for $f(x_0)$ but the ACE is not given. The first order CIs are based on Studentizing. Their optimal bandwidth $h$ again has the form (3), giving ACE $= O(n^{-\beta})$. We shall refer to $\beta$ as the ACE rate. Its possible values are $\beta > 0$.
The contents of this paper are organized as follows. Section 2 provides Edgeworth and Cornish Fisher expansions for Studentized versions of the kernel density estimate, $\hat f(x_0)$, and its extensions. These expansions are used to derive various CIs for $f(x_0)$. Section 3 derives first order one- and two-sided CIs using asymptotic Studentization ($\varepsilon = 1$) and empirical Studentization ($\varepsilon = p$), where $\varepsilon$ is an indicator variable. Section 4 derives second order one- and two-sided CIs. Section 5 considers choosing the ACE-optimal constant $c$ in (3) for first order CIs based on empirical Studentization. Finally, some conclusions are noted in Section 6.
2. CORNISH FISHER EXPANSIONS
Here, we lay the groundwork by showing that the kernel density estimate has Cornish-Fisher expansions with parameter $m = nh$. Moments and cumulants of the kernel estimate $\hat f(x_0)$ in (1) are derived in Section 2.1. These quantities are used in Sections 2.2–2.5 to provide Cornish Fisher expansions for Studentized versions of $\hat f(x_0)$. Section 2.2 derives Cornish Fisher expansions for
$Y_n = m^{1/2} (\hat w_1 - w_1) k_{2h}^{-1/2}$, (5)
where $w_1 = f(x_0)$, $\hat w_1 = \hat f(x_0)$, $U_1 = K_h(x_0 - X)$, $\kappa_{rh} = \kappa_r(U_1)$ and $k_{rh} = h^{r-1} \kappa_{rh}$. Section 2.3 derives Cornish Fisher expansions for
$Y_n = m^{1/2} (\hat W - W)$, (6)
where $W = (w_1, \ldots, w_q)'$, $\hat W = (\hat w_1, \ldots, \hat w_q)'$, $q \ge 1$, $w_{ih} = T_h(F, K_i)$, $K_i(z) = K(z)^i$ and $\hat w_{ih} = T_h(\hat F, K_i)$. Note that we have suppressed the dependency of $w_{ih}$ on $h$ by simply writing $w_i$ for $w_{ih}$. Section 2.4 derives Cornish Fisher expansions for
$Y_{n0} = m^{1/2} (t(\hat W) - t(W)) a_{21}^{-1/2}$, (7)
where $t(W)$ is a real function with finite derivatives and satisfies the asymptotic expansion (see (19) and Withers (1982)):
$\kappa_r(t(\hat W)) \approx \sum_{i=r-1}^{\infty} a_{ri} m^{-i}$ (8)
for $r \ge 1$ and for certain bounded functions $a_{ri} = a_{rih}(w)$ given by Withers (1982). Finally, Section 2.5 derives Cornish Fisher expansions for
$Y_{n1} = m^{1/2} t(\hat W)$ (9)
for $t(\hat W) = (\hat w_1 - f(x_0)) \hat k_{2h}^{-1/2}$ and $\hat k_{2h}$ an estimate of $k_{2h}$.
The expansions given are “formal”. We have given the expansions in the fullest possible forms, rather than say to the first or the second order, because they could lead to better approximations and have wider applicability. Often expansions give better approximations even if they diverge. We have not attempted to check the existence or the validity of the expansions. Some Cramér-type conditions and conditions on differentiability may be required to ensure existence and validity; see, for example, Hall (1992) and Garcia-Soidan (1998). This task is beyond the purpose of the present paper.
2.1 Moments and cumulants
The estimate $\hat f(x_0)$ of (1) can be viewed as the mean of a random sample of size $n$, where each observation of the sample is distributed as $U_1$. Its $i$th moment is
$m_{ih} = E U_1^i = \int K_h(x_0 - y)^i \, dF(y) = h^{1-i} w_{ih}$.
By (2),
$w_{ih} = \sum_{r=0}^{\infty} f_{\cdot r}(x_0) B_{ri} (-h)^r / r!$ (10)
$= w_{i0} + O(h)$ (11)
for $w_{i0} = f(x_0) \int K(z)^i \, dz$. So, $w_{ih}$ is of magnitude 1 and $m_{ih} = O(h^{1-i})$. Typically $K$ is symmetric so that (10) is a power series in $h^2$. Both parametric and nonparametric approaches, using $\hat w$ and $\hat F$, have their merits. We can view $\hat w_i$ as the mean of a random sample of size $n$, where each observation of the sample is distributed as
$U_i = K_{ih}(x_0 - X) = h^{-1} K_i((x_0 - X)/h)$. (12)
The $r$th cumulant of $\hat w_1$ is given by
$\kappa_r(\hat w_1) = n^{1-r} \kappa_{rh} = m^{1-r} k_{rh}$. (13)
The moments and cumulants of $\hat w$ are polynomials in $h$ and $w$. An expression for the general $k_{rh} = k_{rh}(w)$ as a polynomial in $h$ and $w$ is given in Appendix B. One can now use (10) to expand $k_{rh}$ as a formal power series in $h$ with coefficients functions of the derivatives of $f$ at $x_0$. Note that $k_{rh}$ is bounded as $h \to 0$ and
$\lim_{h \to 0} k_{rh} = \lim_{h \to 0} h^{r-1} m_{rh} = f(x_0) B_{0r}$. (14)
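The quantities of this subsection are straightforward to compute from data. The sketch below is an illustration under stated assumptions: a Gaussian kernel and hypothetical function names. The formula for $k_{2h}$ is the one quoted later in (34); the formula for $k_{3h}$ is obtained here from the standard relation $\kappa_3 = m_3 - 3 m_1 m_2 + 2 m_1^3$ together with $m_{ih} = h^{1-i} w_{ih}$, and is not quoted from the text.

```python
import math

def gaussian_kernel(z):
    """Standard normal density (an assumed choice of kernel K)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def w_hat(i, x0, sample, h, kernel=gaussian_kernel):
    """Sample mean of U_i = K((x0 - X)/h)**i / h, the empirical version of w_ih."""
    return sum(kernel((x0 - x) / h) ** i for x in sample) / (len(sample) * h)

def k2h(w1, w2, h):
    """k_2h = h * kappa_2(U_1) written as a polynomial in h and w."""
    return w2 - h * w1 ** 2

def k3h(w1, w2, w3, h):
    """k_3h = h**2 * kappa_3(U_1) written as a polynomial in h and w."""
    return w3 - 3.0 * h * w1 * w2 + 2.0 * h ** 2 * w1 ** 3
```

Applied to $\hat w$, these return $h$ times the variance and $h^2$ times the third central moment of the observed $U_{1i}$, in line with $k_{rh} = h^{r-1} \kappa_{rh}$.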
2.2 Cornish Fisher expansions for (5)
By (13), $\hat f(x_0)$ behaves like the mean of a random sample of size $m = nh$ with $r$th cumulant $k_{rh}$. (Hall (1992, page 211) gave a heuristic justification for this.) By (13), the cumulants of $\hat w_1 = \hat f(x_0)$ satisfy the Cornish-Fisher assumption with respect to $m = nh$. That is, $\kappa_r(\hat w_1) = O(m^{1-r})$ as $m \to \infty$. So, the distribution of $Y_n$ in (5) satisfies the Cornish-Fisher expansions
$P_n(x) = P(Y_n \le x) \approx \Phi(x) - \phi(x) \sum_{r=1}^{\infty} m^{-r/2} h_r(x, \lambda_{rh})$, (15)
$dP_n(x)/dx \approx \phi(x) \{1 + \sum_{r=1}^{\infty} m^{-r/2} \bar h_r(x, \lambda_{rh})\}$, (16)
$\Phi^{-1}(P_n(x)) \approx x - \sum_{r=1}^{\infty} m^{-r/2} f_r(x, \lambda_{rh})$, (17)
$P_n^{-1}(\Phi(x)) \approx x + \sum_{r=1}^{\infty} m^{-r/2} g_r(x, \lambda_{rh})$, (18)
where $\Phi$ and $\phi$ are the unit normal distribution and density. Note that $h_r$, $\bar h_r$, $f_r$, $g_r$ are functions given in Cornish and Fisher (1960) for $r \le 4$ or in Withers (1984) and that $\lambda_{rh}$ are the coefficients of the expansions. For example, $h_1(x, \lambda_{rh}) = f_1(x, \lambda_{rh}) = g_1(x, \lambda_{rh}) = \lambda_{3h}(x^2 - 1)/6$ and $\bar h_1(x, \lambda_{rh}) = \lambda_{3h}(x^3 - 3x)/6$. (An alternative derivation is to apply Example 1 of Withers (1983) with $T(F) = T_h(F) = \int K_h(x_0 - y) \, dF(y)$, $a_{ri} = 0$ if $i \ne r - 1$, and $a_{ri} = \kappa_{rh}$ if $i = r - 1$. So, $\kappa_r(\hat f(x_0)) = \kappa_{rh} n^{1-r} = k_{rh} m^{1-r}$, and $Y_n = n^{1/2} (\hat f(x_0) - \kappa_{1h}) \kappa_{2h}^{-1/2} = m^{1/2} (\hat w_1 - k_{1h}) k_{2h}^{-1/2}$ has an Edgeworth-Cornish-Fisher expansion with respect to $m$ in terms of the coefficients $A_{r,r-1} = \kappa_{rh} \kappa_{2h}^{-r/2} = \lambda_{rh} h^{1-r/2}$ for $\lambda_{rh}$ of (17). By (18), one can replace $n$ and $\{A_{r,r-1}\}$ in the Cornish-Fisher expansions by $m$ and $\{\lambda_{rh}\}$.) The expansion (15) was noted in (4.83) of Hall (1992). (There is a slip there: $\mu_{40}$ in $p_2(y)$ in the second equation of page 212 should be $\mu_{40} - 3h\mu_{20}^2$. This slip is also on pages 269 and 271.)
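To make (15) concrete, here is a small Python sketch of the expansion truncated after its first correction term, with $h_1(x, \lambda) = \lambda_3 (x^2 - 1)/6$ as quoted above. The function names are hypothetical and the truncation at $r = 1$ is a choice made for this example.

```python
import math

def norm_cdf(x):
    """Unit normal distribution function Phi."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_pdf(x):
    """Unit normal density phi."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def edgeworth_cdf(x, m, lam3):
    """One-term version of (15): Phi(x) - phi(x) * m**-0.5 * h_1(x, lam3)."""
    h1 = lam3 * (x * x - 1.0) / 6.0
    return norm_cdf(x) - norm_pdf(x) * h1 / math.sqrt(m)
```

The correction vanishes at $x = \pm 1$ and is largest in magnitude at $x = 0$, where it shifts the approximation by $\lambda_3 \phi(0) / (6 m^{1/2})$.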
2.3 Cornish Fisher expansions for (6)
The joint $r$th order cumulants of $\hat w$ have magnitude $m^{1-r}$ for the same reason that $k_{rh}$ is bounded:
$\kappa(\hat w_{i_1}, \ldots, \hat w_{i_r}) = m^{1-r} k_h(i_1 \ldots i_r)$ (19)
for $k_h(i_1 \ldots i_r) = h^{r-1} \kappa(U_{i_1}, \ldots, U_{i_r}) = O(1)$ and $U_i$ of (12). This expresses $k_h(i_1 \ldots i_r)$ in the form $k_{i_1 \ldots i_r h}(F)$. We can also write $k_h(i_1 \ldots i_r) = k_{i_1 \ldots i_r h}(w)$ as a polynomial in $h$ and $w$: see Appendix B. For example, $k_h(ij) = w_{i+j} - h w_i w_j$. Now fix $q$ and set $V = (k_h(ij) : 1 \le i, j \le q)$. So, $V$ is $q \times q$. For $x$ in $\Re^q$, let $\Phi_V(x)$ and $\phi_V(x)$ be the distribution and density of a normal $q$-vector with mean 0 and covariance $V$. Then for $Y_n$ in (6) the Edgeworth expansion (15) has an extension of the form
$P(Y_n \le x) \approx \Phi_V(x) + \sum_{r=1}^{\infty} m^{-r/2} b_r(-\partial/\partial x, k_h) \Phi_V(x) / r!$, (20)
where, for $s$ in $\Re^q$, $b_r(s, k_h) = B_r(c(s))$, $B_r(c) = \sum_{i=0}^{r} B_{ri}(c)$, $B_r(c)$ and $B_{ri}(c)$ are the complete and partial exponential Bell polynomials tabled on page 307 of Comtet (1974), $c_r(s) = k_{r+2,r+1}(s) / \{(r+2)(r+1)\}$ for $k_{r,r-1}(s) = \kappa(U_{i_1}, \ldots, U_{i_r}) s_{i_1} \cdots s_{i_r}$, and we use the tensor sum convention of implicitly summing repeated pairs of indices $i_1, \ldots, i_r$ over their range $1, \ldots, q$. (See Appendix B for a definition of these Bell polynomials.) Equation (5.67) of Hall (1992) gives a double expansion version of this for $q = 2$.
2.4 Cornish Fisher expansions for (7)
By Withers (1982), the Cornish-Fisher expansions (15)–(18) also hold with $Y_n$ replaced by $Y_{n0}$ in (7) and $\lambda_{rh}$ replaced by
$A_h = \{A_{ri} = a_{ri} a_{21}^{-r/2}\}$ (21)
provided that $a_{21}$ is bounded away from 0 and that some regularity conditions ensuring
$\sup_{|t_1| + \cdots + |t_q| > \varsigma} \left| \int_{-\infty}^{\infty} \exp\left\{ \sqrt{-1} \sum_{j=1}^{q} t_j K(u)^j \right\} dF(x - hu) \right| < 1 - C(\varsigma) h$ (22)
hold, where $C(\cdot)$ is some bounded real valued function. Also by Withers (1982), the first few coefficients in (8) are:
$a_{21} = t_{\cdot i} t_{\cdot j} k_h(ij)$, $a_{11} = t_{\cdot ij} k_h(ij)/2$, $a_{32} = c_{21} + 3c_{23}$ (23)
for $t_{\cdot i_1 \ldots i_r} = \partial^r t(W) / \partial w_{i_1} \cdots \partial w_{i_r}$, $c_{21} = t_{\cdot i} t_{\cdot j} t_{\cdot l} k_h(ijl)$, and $c_{23} = t_{\cdot i} k_h(ij) t_{\cdot jl} k_h(lq) t_{\cdot q}$, where we use the tensor summation convention. For example,
$P(Y_{n0} \le x) \approx \Phi(x) - \phi(x) \sum_{r=1}^{\infty} m^{-r/2} h_r(x)$ (24)
for $h_r(x) = h_r(x, A_h) = \sum_{i=0}^{3r-1} P_{rih} He_i(x)$, where $He_r(x) = \phi(x)^{-1} (-d/dx)^r \phi(x)$ is the $r$th Hermite polynomial, $P_{rih}$ is a polynomial in $A_h$, and the last summation is restricted to $r - i$ odd. For example, $h_1(x, A_h) = \sum_{i=0,2} P_{1ih} He_i(x)$ and $h_2(x, A_h) = \sum_{i=1,3,5} P_{2ih} He_i(x)$ for $P_{10h} = A_{11h}$, $P_{12h} = A_{32h}/6$, $P_{21h} = (A_{11h}^2 + A_{22h})/2$, $P_{23h} = (4 A_{11h} A_{32h} + A_{43h})/24$, and $P_{25h} = A_{32h}^2/72$. By Withers (2000), $He_r(x) = E(x + \sqrt{-1} N)^r = \psi_r$ say, where $N \sim N(0, 1)$, $\psi_0 = 1$, $\psi_1 = x$, $\psi_2 = x^2 - 1$, $\psi_3 = x^3 - 3x$, $\psi_4 = x^4 - 6x^2 + 3$, $\psi_5 = x^5 - 10x^3 + 15x$, ....
2.5 Cornish Fisher expansions for (9)
Now consider the Studentized estimate of $f(x_0)$ given by (9). We shall consider two types of estimates with $a_{21} = 1 + O(h^\varepsilon)$ for $\varepsilon = 1$ or $p$. We can apply (24) by noting that $Y_{n1} \le z$ if and only if $Y_{n0} \le x = a_{21}^{-1/2} (z - m^{1/2} t(W))$. So, $x = z + \delta$ for $\delta = \delta_0 + \delta_1 z$,
$\delta_0 = -a_{21}^{-1/2} m^{1/2} t(W) = O(m^{1/2} h^p)$, and $\delta_1 = a_{21}^{-1/2} - 1 = O(h^\varepsilon)$. (25)
Also at $x = z + \delta$,
$\phi(x) He_i(x) = \phi(z) He_i(z) - \delta \phi(z) He_{i+1}(z) + O(\delta^2)$,
$\phi(x) h_1(x) = \phi(z) h_1(z) - \delta \phi(z) \sum_{i=0,2} P_{1ih} He_{i+1}(z) + O(\delta^2)$,
$P(Y_{n1} \le z) = P(Y_{n0} \le x) = \Phi(x) - \phi(x) \sum_{r=1}^{3} m^{-r/2} h_r(x) + O(m^{-2})$
$= \Phi(z) - \phi(z) \{ -\delta + z\delta^2/2 + \sum_{r=1}^{3} m^{-r/2} h_r(z) - m^{-1/2} \delta \sum_{i=0,2} P_{1ih} He_{i+1}(z) \} + O(\delta^3 + m^{-1/2} \delta^2 + m^{-1} \delta + m^{-2})$, (26)
giving
$P(|Y_{n1}| \le z) = 2\Phi(z) - 1 - 2\phi(z) \{ z(-\delta_1 + \delta_0^2/2) + m^{-1} h_2(z) - m^{-1/2} \delta_0 \sum_{i=0,2} P_{1ih} He_{i+1}(z) \} + O(\delta_0^3 + m^{-1/2} \delta_0^2 + m^{-1} \delta_0 + m^{-2})$. (27)
3. FIRST ORDER CIs
The first order CIs for $f(x_0)$ are obtained as usual by Studentizing. This can be done using the empirical estimate of $m \, \mathrm{var} \hat f(x_0) = k_{2h}(w)$ or by estimating its asymptotic approximation of (14), $k_{2h}(w) \approx f(x_0) \int K(z)^2 \, dz$.
3.1 First order one-sided CIs based on asymptotic studentization
Choose
$t(w_1) = (w_1 - f(x_0)) (w_1 B_{02})^{-1/2}$. (28)
Then $a_{21} = t_{\cdot 1}^2 k_{2h} = 1 + O(h)$. So, $\varepsilon = 1$ in (25) and, by (26),
$P(Y_{n1} \le z) = \Phi(z) + O(e_{n1})$ (29)
for $e_{n1} = m^{-1/2} + h + m^{1/2} h^p$. Taking $h$ as in (3), $e_{n1} = O(n^{-\beta})$ with $\beta = \min((1 - \alpha)/2, \alpha, (p + 1/2)\alpha - 1/2)$.
The case $p \ge 2$: $\beta$ has its maximum $1/3$ at $\alpha = 1/3$. So, a one-sided lower CI for $f(x_0)$ of level $\Phi(z) + O(n^{-1/3})$ and “half-width” $O(m^{-1/2}) = O(n^{-1/3})$ is given by $Y_{n1} \le z$, that is
$f(x_0) \ge L = \hat w_1 - (B_{02} \hat w_1 / m)^{1/2} z$. (30)
Replacing $z$ by $-z$, a one-sided upper CI for $f(x_0)$ of level $\Phi(z) + O(n^{-1/3})$ and “half-width” $O(n^{-1/3})$ is given by $Y_{n1} \ge -z$, that is
$f(x_0) \le U = \hat w_1 + (B_{02} \hat w_1 / m)^{1/2} z$. (31)
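The case $p = 1$: $\beta = \min((1 - \alpha)/2, \alpha, (3\alpha - 1)/2)$ has its maximum $1/4$ at $\alpha = 1/2$, where $(1 - \alpha)/2 = (3\alpha - 1)/2$. So, both the error and width of these one-sided CIs of nominal level $\Phi(z)$ are $O(n^{-1/4})$.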
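The limits (30) and (31) can be computed as in the following sketch. It assumes a Gaussian kernel, for which $B_{02} = \int \phi(z)^2 \, dz = 1/(2\sqrt{\pi})$; the function and constant names are hypothetical, and the bandwidth $h$ is supplied by the caller.

```python
import math

# B_02 for the Gaussian kernel: the integral of phi(z)**2 is 1/(2*sqrt(pi)).
B02_GAUSS = 1.0 / (2.0 * math.sqrt(math.pi))

def gaussian_kernel(z):
    """Standard normal density (an assumed choice of kernel K)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def asymptotic_limits(x0, sample, h, z, kernel=gaussian_kernel, b02=B02_GAUSS):
    """Return (L, U) of (30) and (31); each is a one-sided limit of nominal level Phi(z)."""
    m = len(sample) * h
    w1 = sum(kernel((x0 - x) / h) for x in sample) / m  # kernel estimate (1)
    half = math.sqrt(b02 * w1 / m) * z
    return w1 - half, w1 + half
```

A different kernel requires passing its own value of $B_{02}$ through `b02`.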
Note 1. Note $t(w_1)$ is strictly increasing. So, $Y_{n1} \le z$ if and only if $\hat w_1 \le t^{-1}(m^{-1/2} z)$ if and only if $Y_n \le z_0 = m^{1/2} \{t^{-1}(m^{-1/2} z) - k_{1h}\} k_{2h}^{-1/2}$. So, an asymptotic expansion for the left hand side of (29) is given by the right hand side of (15) at $x = z_0$. Alternatively, we can “stabilize the variance” by choosing $s(\hat w_1)$ with variance $\approx s_{\cdot 1}(w_1)^2 w_1 B_{02} / m = 1/m$, that is $s(w_1) = 2 (w_1 / B_{02})^{1/2}$. That is, choosing $t(w_1) = s(w_1) - s(f(x_0))$, a CI with the same properties as (30) is given by $Y_{n1} = m^{1/2} t(\hat w_1) \le z$, that is
$f(x_0) \ge \tilde L = \{\hat w_1^{1/2} - (B_{02}/m)^{1/2} z/2\}^2$. (32)
Replacing $z$ by $-z$, a one-sided upper CI for $f(x_0)$ of level $\Phi(z) + O(n^{-1/3})$ and “half-width” $O(n^{-1/3})$ is given by
$f(x_0) \le \tilde U = \{\hat w_1^{1/2} + (B_{02}/m)^{1/2} z/2\}^2$. (33)
A similar idea was used in S3, see page 210 of Hall (1992).
Note (11) suggests estimating $\theta = w_1$ by $\hat\theta = \sum_{i=1}^{\infty} \tau_i \hat\theta_i$ for $\hat\theta_i = \hat w_i / B_{0i}$ and $\{\tau_i\}$ constants adding to 1. Its bias is $O(h)$, or $O(h^2)$ if $K$ is symmetric (about 0). Also $m \, \mathrm{var}(\hat\theta) = \sum_{i,j=1}^{\infty} \tau_i \tau_j k_h(ij)$ so that $E(\hat\theta - \theta)^2 = m^{-1} \theta \tau' B \tau + O(h^2 + h/m)$ for $B = (B_{0,i+j})$. To this degree of accuracy this MSE is minimized by $\tau = B^{-1} 1 / 1' B^{-1} 1$, where $1$ is an infinite vector of 1's, giving $E(\hat\theta - \theta)^2 = m^{-1} \theta / 1' B^{-1} 1 + O(h^2 + h/m)$. Apart from the question of if and how these infinite sums need to be truncated, clearly this estimate can be used to provide an alternative CI to that of (30). However, we shall see that its asymptotic efficiency relative to the following method is 0 for $p \ge 3$.
3.2 First order one-sided CIs based on empirical studentization
Choose
$t(W) = (w_1 - f(x_0)) k_{2h}(w)^{-1/2}$, (34)
where $k_{2h}(w) = k_{2h} = w_2 - h w_1^2$. Then $a_{21} = t_{\cdot 1}^2 k_h(11) + 2 t_{\cdot 1} t_{\cdot 2} k_h(12) + t_{\cdot 2}^2 k_h(22) = 1 + O(h^p)$ since $t_{\cdot 1} = k_{2h}^{-1/2} + O(h^{p+1})$ and $t_{\cdot 2} = O(h^p)$. So, $\varepsilon = p$ in (25). So, by (26),
$P(Y_{n1} \le z) = \Phi(z) + O(e_{n1}') = \Phi(z) + O(n^{-\beta})$
for $e_{n1}' = m^{-1/2} + m^{1/2} h^p$ and $\beta = \min((1 - \alpha)/2, (p + 1/2)\alpha - 1/2)$ with the maximum
$\beta = p/(2p + 2)$ (35)
at $\alpha = 1/(p + 1)$. So, a one-sided lower CI for $f(x_0)$ of level $\Phi(z) + O(n^{-p/(2p+2)})$ and “half-width” $O(m^{-1/2}) = O(n^{-p/(2p+2)})$ is given by $Y_{n1} \le z$, that is
$f(x_0) \ge L' = \hat w_1 - (k_{2h}(\hat w)/m)^{1/2} z$. (36)
Replacing $z$ by $-z$ gives the corresponding one-sided upper CI
$f(x_0) \le U' = \hat w_1 + (k_{2h}(\hat w)/m)^{1/2} z$. (37)
So, for $p = 1$ or 2, empirical and asymptotic Studentization achieve the same $\beta$ but for $p \ge 3$ empirical Studentization is superior.
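A minimal sketch of the empirically Studentized limits (36) and (37) follows, using $k_{2h}(\hat w) = \hat w_2 - h \hat w_1^2$. The Gaussian kernel and the function names are assumptions of this example; the guard against a slightly negative variance estimate caused by rounding is an implementation detail added here.

```python
import math

def gaussian_kernel(z):
    """Standard normal density (an assumed choice of kernel K)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def empirical_limits(x0, sample, h, z, kernel=gaussian_kernel):
    """Return (L', U') of (36) and (37) using k_2h(w_hat) = w2_hat - h * w1_hat**2."""
    m = len(sample) * h
    k_vals = [kernel((x0 - x) / h) for x in sample]
    w1 = sum(k_vals) / m                   # w1_hat: the kernel estimate (1)
    w2 = sum(v * v for v in k_vals) / m    # w2_hat: same form with K replaced by K**2
    k2 = max(w2 - h * w1 * w1, 0.0)        # guard against rounding below zero
    half = math.sqrt(k2 / m) * z
    return w1 - half, w1 + half
```

Unlike the asymptotic version, no kernel constant such as $B_{02}$ is needed, since the variance is estimated from the same kernel evaluations.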
3.3 First order two-sided CIs
By (27) with $h$ of (3), $P(|Y_{n1}| \le z) = 2\Phi(z) - 1 + O(e_{n2})$ for
$e_{n2} = \delta_1 + \delta_0^2 + m^{-1} \sim h^\varepsilon + m h^{2p} + m^{-1} \sim n^{-\beta}$
and $\beta = \min(\varepsilon\alpha, 1 - \alpha, (2p + 1)\alpha - 1)$.
If $\varepsilon = 1$ the corresponding two-sided CI of level $2\Phi(z) - 1 + O(n^{-\beta})$ is given by $L \le f(x_0) \le U$ for $L$ and $U$ of (30) and (31). An alternative with the same asymptotic properties is $\tilde L \le f(x_0) \le \tilde U$, for $\tilde L$ and $\tilde U$ of (32) and (33). If $\varepsilon = p$ (i.e. using empirical Studentization), the corresponding two-sided CI of level $2\Phi(z) - 1 + O(n^{-\beta})$ is given by $L' \le f(x_0) \le U'$ for $L'$ and $U'$ of (36) and (37).
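The ACE rates of Section 3 are each the largest value of a minimum of linear functions of $\alpha$, so they can be checked by a simple search. The sketch below uses exact rational arithmetic over a grid of $\alpha$ values; the function names and the grid size are assumptions of this example. For empirical Studentization the one-sided error has no separate $h^\varepsilon$ term, but including $\varepsilon\alpha$ with $\varepsilon = p$ is harmless because $p\alpha \ge (p + 1/2)\alpha - 1/2$ for $\alpha \le 1$.

```python
from fractions import Fraction

def one_sided_rate(alpha, eps, p):
    """Exponents of m**-0.5, h**eps and m**0.5 * h**p when h = c * n**-alpha."""
    half = Fraction(1, 2)
    return min((1 - alpha) * half, eps * alpha, (p + half) * alpha - half)

def two_sided_rate(alpha, eps, p):
    """Exponents of h**eps, m**-1 and m * h**(2p) when h = c * n**-alpha."""
    return min(eps * alpha, 1 - alpha, (2 * p + 1) * alpha - 1)

def best_rate(rate, eps, p, grid=840):
    """Maximise rate over alpha = k/grid, 0 < k < grid; returns (beta, alpha)."""
    alphas = [Fraction(k, grid) for k in range(1, grid)]
    alpha = max(alphas, key=lambda a: rate(a, eps, p))
    return rate(alpha, eps, p), alpha
```

For instance, `best_rate(two_sided_rate, 2, 2)` returns $\beta = 2/3$ at $\alpha = 1/3$, in agreement with $\beta = p/(p+1)$ at $\alpha = 1/(p+1)$. The grid size 840 is chosen so that the fractions arising for small $p$ lie on the grid.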
4. SECOND ORDER CIs
4.1 Second order one-sided CIs using asymptotic studentization
Taking $t$ of (28), by (26), $P(Y_{n1} \le z) = \Phi(z) - m^{-1/2} h_1(z) \phi(z) + O(\delta + m^{-1})$ so $P(Y_{n1} - m^{-1/2} h_1(z) \le z) = \Phi(z) + O(\delta + m^{-1})$. So, one would expect that
$P(Y_{n1} - m^{-1/2} \hat h_1(z) \le z) = \Phi(z) + O(\delta + m^{-1})$ (38)
if $\hat h_1(z) = h_1(z) + O_p(m^{-1/2})$. Let us take $\hat h_1(z) = h_{1h}(z, \hat w)$. We now confirm (38).
Given $z$, set $t_m(W) = t(w_1) - m^{-1} h_{1h}(z, w)$ for $t(w_1)$ of (28). By Lemma 5.1 of Withers (1983),
$\kappa_r(t_m(\hat W)) \approx \sum_{i=r-1}^{\infty} a_{ri}'' m^{-i}$
for certain functions $a_{ri}'' = a_{ri}''(w)$. Also $a_{10}'' = a_{10}$, $a_{21}'' = a_{21}$, $a_{11}'' = a_{11} - h_1(z)$, and $a_{32}'' = a_{32}$. Set $Y_{n0}'' = m^{1/2} (t_m(\hat W) - t(W)) a_{21}^{-1/2}$ and $Y_{n1}'' = m^{1/2} t_m(\hat W)$. Then $Y_{n0}'' \le x$ if and only if $m^{1/2} (t_m(\hat w) - t(w)) \le y = a_{21}^{1/2} x$ if and only if $Y_{n1}'' = m^{1/2} t_m(\hat w) \le z = y + m^{1/2} t(w)$ and this occurs with probability
$\Phi(x) - m^{-1/2} h_1''(x) \phi(x) + O(m^{-1}) = \Phi(z) - m^{-1/2} h_1''(z) \phi(z) + O(e_{n2})$
since $x = z + O(h + m^{1/2} h^p)$. Here, $h_1''$ is $h_1$ for the coefficients $a_{ri}''$. So, $h_1''(z) = A_{11}'' + A_{32}(z^2 - 1)/6 = h_1(z) - a_{21}^{-1/2} h_1(z) = O(h)$ since $A_{11}'' = A_{11} - a_{21}^{-1/2} h_1(z)$. So, $P(Y_{n1}'' \le z) = \Phi(z) + O(e_{n2})$. This confirms (38).
So, a second order lower one-sided CI is
$f(x_0) \ge L_2 = \hat w_1 - (B_{02} \hat w_1 / m)^{1/2} (z + m^{-1/2} \hat h_1(z))$.
For $h$ of (3), its error behaves as $\sim \delta + m^{-1} \sim h^\varepsilon + m^{1/2} h^p + m^{-1} \sim n^{-\beta}$ for $\beta = \min(1 - \alpha, \varepsilon\alpha, (p + 1/2)\alpha - 1/2)$. (For $p \ge 2$ this rate improves on $\beta$ using the method of Section 3.2.) Replacing $z$ by $-z$, a second order upper one-sided CI of level $\Phi(z) + O(n^{-\beta})$ is
$f(x_0) \le U_2 = \hat w_1 + (B_{02} \hat w_1 / m)^{1/2} (z - m^{-1/2} \hat h_1(z))$.
Recall that $h_1(z) = A_{11} + A_{32}(z^2 - 1)/6$ is given by (21)-(23) and (20). Now suppose we take instead $\hat h_{10}(z) = \lim_{h \to 0} h_{1h}(z, \hat w)$. This introduces another error of magnitude $O_p(h)$ since $h_{1h}(z, w)$ has a power series expansion in $h$. The effect is to add a term of magnitude $m^{-1/2} h$ into $e_{n2}$ in (38). But $m^{-1/2} h = O(e_{n2})$ so there is no change to the above $\alpha$ and $\beta$. Using $\lambda_{30} = \lim_{h \to 0} \lambda_{3h} = w_{20}^{-3/2} w_{30} = f(x_0)^{-1/2} B_{02}^{-3/2} B_{03}$, we obtain
$a_{210} = 1$, $a_{110} = -\lambda_{30}/2$, $a_{320} = -2\lambda_{30}$, $h_{10}(z) = -\lambda_{30}(2z^2 + 1)/6$. (39)
4.2 Second order one-sided CIs using empirical studentization
Now consider the second type of Studentizing. By (26),
$P(Y_{n1} - m^{-1/2} h_1(z) \le z) = \Phi(z) + O(e_{n2}')$
for $e_{n2}' = m^{-1} + m^{1/2} h^p \sim n^{-\beta}$ and $\beta = \min(1 - \alpha, p\alpha, (p + 1/2)\alpha - 1/2)$. As before one can show that
$P(Y_{n1} - m^{-1/2} \hat h_1(z) \le z) = \Phi(z) + O(e_{n2}')$.
So, a second order lower one-sided CI of level $\Phi(z) + O(n^{-\beta})$ for $\beta = 2p/(2p + 3)$ and $\alpha = 3/(2p + 3)$ is given by
$f(x_0) \ge L_2' = \hat w_1 - k_{2h}(\hat w)^{1/2} \{m^{-1/2} z + m^{-1} h_{1h}(z, \hat w)\}$.
The corresponding upper one-sided CI is
$f(x_0) \le U_2' = \hat w_1 + k_{2h}(\hat w)^{1/2} \{m^{-1/2} z - m^{-1} h_{1h}(z, \hat w)\}$.
So, a two-sided CI of level $2\Phi(z) - 1 + O(n^{-\beta})$ is given by $L_2' \le f(x_0) \le U_2'$.
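The second order limits $L_2'$ and $U_2'$ can be sketched as follows. This illustration makes two assumptions that go beyond the text of this subsection: $h_{1h}(z, \hat w)$ is replaced by its $h \to 0$ limit $h_{10}(z) = -\lambda_{30}(2z^2 + 1)/6$ of (39), and $\lambda_{30} = w_{20}^{-3/2} w_{30}$ is estimated by $\hat w_2^{-3/2} \hat w_3$. The Gaussian kernel and the function names are likewise hypothetical.

```python
import math

def gaussian_kernel(z):
    """Standard normal density (an assumed choice of kernel K)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def second_order_limits(x0, sample, h, z, kernel=gaussian_kernel):
    """Return (L2', U2') with h_1h(z, w_hat) approximated by h_10(z) of (39)."""
    m = len(sample) * h
    k_vals = [kernel((x0 - x) / h) for x in sample]
    w1 = sum(k_vals) / m
    w2 = sum(v ** 2 for v in k_vals) / m
    w3 = sum(v ** 3 for v in k_vals) / m
    root = math.sqrt(max(w2 - h * w1 * w1, 0.0))  # k_2h(w_hat)**0.5
    lam3 = w3 / w2 ** 1.5                         # assumed estimate of lambda_30
    h10 = -lam3 * (2.0 * z * z + 1.0) / 6.0       # h_10(z) of (39)
    lower = w1 - root * (z / math.sqrt(m) + h10 / m)
    upper = w1 + root * (z / math.sqrt(m) - h10 / m)
    return lower, upper
```

Relative to the first order limits, both end points are shifted by the same amount $-k_{2h}(\hat w)^{1/2} h_{10}(z)/m$, while the width is unchanged.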
4.3 An alternative approach
For $j \ge 1$, Example 1 of Withers (1983) gives a one-sided CI of level $\Phi(x) + O(n^{-j/2})$ of the form $V_{jn}(\hat F, x) \le \mu(F)$ for a mean $\mu(F) = EY$ for $Y = h(X)$ and $h(x)$ a given function, where, for example,
$V_{jn}(\hat F, x) = \mu(\hat F) + \sum_{r=1}^{j} n^{-r/2} q_r(\hat F, x)$,
$q_1(F, x) = -\mu_2^{1/2} x$, $q_2(F, x) = \mu_2^{-1} \mu_3 (1 + 2x^2)/6$, and $\mu_r = \mu_r(Y)$. Taking $Y = K_h(x_0 - X)$, we have $n^{-r/2} q_r(F, z) = m^{-r/2} q_r(w, z)$, where $q_1(w, z) = -\nu_{2h}^{1/2} z$, $q_2(w, z) = \nu_{2h}^{-1} \nu_{3h} (1 + 2z^2)/6$ and $\nu_{rh} = h^{r-1} \mu_r(Y)$. See Appendix B for these as polynomials in $h$ and $w$. Replacing $x$ by $z$, this suggests evaluating the coverage probability of the following $j$th order CI for $f(x_0)$:
$L_j'' = V_{jm}(\hat w, z) \le f(x_0)$, (40)
where $V_{jm}(\hat w, z) = \hat w_1 + \sum_{r=1}^{j} m^{-r/2} q_r(\hat w, z)$. We now evaluate the probability of (40) for the case $j = 2$. Note (40) holds if and only if $m^{1/2} t_m(\hat W) \le z$ for $t_m(W) = t(W) + m^{-1} t_1(W)$, $t_1(W) = w_2^{-3/2} w_3 (1 + 2z^2)/6$, and $t$ of (34). By Lemma 5.1 of Withers (1983),
$\kappa_r(t_m(\hat W)) \approx \sum_{i=r-1}^{\infty} a_{ri}' m^{-i}$, (41)
where the leading $a_{ri}'$ are given in terms of $a_{ri}$ for $t = t_0$ by $a_{r,r-1}' = a_{r,r-1}$ and $a_{11}' = a_{11} + t_1(W)$. So, the Cornish-Fisher expansions hold for $Y_{n0}' = m^{1/2} (t_m(\hat W) - t(W)) a_{21}^{-1/2}$ with polynomials $h_i'(x)$ say. By an argument similar to that proving (38) one obtains the probability of (40) as $\Phi(z) - m^{-1/2} h_1'(z) \phi(z) + O(e_{n2}')$ for $h_1'(z) = h_1(z) + a_{21}^{-1/2} t_1(W) = O(h)$ so that the ACE has magnitude $m^{-1/2} h + e_{n2}'$. If $p \ge 3$ this gives ACE rate $\beta = 2/3$ for $\alpha = 1/3$. If $p \le 3$ this gives ACE rate $\beta = 2p/(2p + 3)$ for $\alpha = 3/(2p + 3)$, the same as achieved by a one-sided CI by empirical Studentization.
4.4 An improvement
One can show that we can improve the rate given above for $p \ge 3$ to that given for $p \le 3$ if we choose $t_1(w)$ so that $h_1'(z) = O(h^\varepsilon)$ with $3\varepsilon \ge p$. (This follows since the ACE then has magnitude $m^{-1/2} h^\varepsilon + e_{n2}'$.) We now show how to achieve this with $\varepsilon = p$.
Write $t$ of (34) as $N D^{-1/2}$. So, $N = O(h^p)$ is not a function of $w$, but $D$ and the derivatives of $N$ and $D$ are. For example, $D_{\cdot 1} = -2h w_1$. Also
$t_{\cdot 1} = t_1 - N D^{-3/2} D_{\cdot 1}/2 = t_1 + O(h^{p+1})$,
$t_{\cdot 2} = -N D^{-3/2}/2 = O(h^p)$,
$t_{\cdot 11} = t_{11} + 3N D^{-5/2} D_{\cdot 1}^2/4 - N D^{-3/2} D_{\cdot 11}/2 = t_{11} + O(h^{p+1})$,
$t_{\cdot 12} = t_{12} + 3N D^{-5/2} D_{\cdot 1}/4 = t_{12} + O(h^{p+1})$,
$t_{\cdot 22} = 3N D^{-5/2}/4 = O(h^p)$
for $t_1 = D^{-1/2}$, $t_{11} = -D^{-3/2} D_{\cdot 1}$ and $t_{12} = -D^{-3/2}/2$. In this way we can approximate the derivatives of $t$ to $O(h^p)$, and so using (21)-(23), approximate the coefficients $a_{ri}$ and $h_r(z)$ to $O(h^p)$. Call these approximations $a_{ri} = a_{ri}(w)$ and $h_r(z) = h_{rh}(z, w)$. For example,
$a_{11} = -\lambda_{3h}/2$, $a_{32} = c_{21} + 3c_{23}$, $c_{21} = D^{-3/2} k_{3h}$,
$c_{23} = -D^{-5/2} D_{\cdot 1} k_{2h}^2 - D^{-5/2} k_{2h} k_h(12) = (2h w_1 k_{2h} - w_3 + h w_1 w_2) k_{2h}^{-3/2}$,
$h_1(z) = h_{1h}(z, w) = a_{11} + a_{32}(z^2 - 1)/6$.
Now choose $t_m(w) = t(w) - m^{-1} h_{1h}(z, w)$. By (41), $h_1'(z) = h_1(z) + a_{21}^{-1/2} t_1(w) = O(h^p)$ so that the ACE of the CI $m^{1/2} t_m(\hat w) \le z$ has magnitude $e_{n2}'$ giving ACE rate $\beta = 2p/(2p + 3)$ for $\alpha = 3/(2p + 3)$, the same as achieved for a one-sided CI by empirical Studentization. This gives the second order lower one-sided CI
$f(x_0) \ge L_2 = \hat w_1 - k_{2h}(\hat w)^{1/2} (m^{-1/2} z + m^{-1} h_{1h}(z, \hat w))$
of level $\Phi(z) + O(n^{-\beta})$. The corresponding upper CI is
$f(x_0) \le U_2 = \hat w_1 + k_{2h}(\hat w)^{1/2} (m^{-1/2} z - m^{-1} h_{1h}(z, \hat w))$.
4.5 Second order two-sided CIs using empirical studentization
By (27),
$P(|Y_{n1}| \le z) = 2\Phi(z) - 1 - 2\phi(z) m^{-1} h_2(z) + O(\delta_0^2 + m^{-2}) = 2\Phi(z) - 1 - 2\phi(z) m^{-1} h_{2h}(z, w) + O(e_{n3})$
for $e_{n3} = h^p + m h^{2p} + m^{-2}$ so that
$P(|Y_{n1}| \le z + m^{-1} h_{2h}(z, w)) = 2\Phi(z) - 1 + O(e_{n3})$
and this also holds with $w$ replaced by $\hat w$. Also $e_{n3} \sim n^{-\beta}$ for
$\beta = \min(p\alpha, -1 + (2p + 1)\alpha, 2 - 2\alpha) = 2p/(p + 2)$ at $\alpha = 2/(p + 2)$.
This gives the two-sided CI $|f(x_0) - \hat w_1| \le (k_{2h}(\hat w)/m)^{1/2} (z + m^{-1} h_2(z, \hat w))$ of asymptotic level $2\Phi(z) - 1$ and ACE rate $\beta = 2p/(p + 2)$.
5. OPTIMIZING THE CONSTANT C IN THE SMOOTHING PARAMETER (3)
5.1 One-sided first order CI based on empirical studentization
Consider the one-sided first order CI (36) with $\alpha = 1/(p + 1)$. We noted that, for any $c > 0$, its coverage error is $O(n^{-\beta})$, where $\beta = p/(2p + 2)$. We now show how to choose $c = c(f, x_0)$ to minimize the asymptotic coverage error (ACE). Set $\rho_r = f_{\cdot r}(x_0) B_{r1} (-1)^r / r!$. Since $K$ is of order $p$, $\rho_p \ne 0$. We now prove the following in terms of $h_{10}(z)$ of (39). For $\rho_p > 0$,
$c(f, x_0) = [h_{10}(z) / \{(2p + 1) \rho_p\}]^{1/(p+1)}$.
For $\rho_p < 0$,
$c(f, x_0) = [h_{10}(z) / \rho_p]^{1/(p+1)}$ (42)
and the ACE is improved to $O(n^{-\beta_1})$, where $\beta_1 = (p + 2)/(2p + 2) > \beta$. This surprising result is analogous to the result of Silverman (1986) quoted in the introduction. (As for that result, in practice one needs to study the effect of replacing $\rho_p$ by an estimate. We shall not consider that here.)
We now prove these results. By (25), $x = z - m^{1/2} h^p \rho_p k_{2h}^{-1/2} + O(\delta_2 + h^p)$ for $\delta_2 = m^{1/2} h^{p+1}$, and
$P(Y_{n1} \le z) = \Phi(z) - \phi(z) e_{n3} + O(\delta_2 + m^{-1} + h^p)$
for
$e_{n3} = m^{1/2} h^p \rho_p k_{2h}^{-1/2} + m^{-1/2} h_1(z) = e_{n30} + O(m^{-1/2} h)$,
$e_{n30} = m^{1/2} h^p \rho_p + m^{-1/2} h_{10}(z) = n^{-\beta} a(c)$,
$a(c) = \rho_p k_{2h}^{-1/2} c^{p+1/2} + c^{-1/2} h_{10}(z)$,
and $\alpha$, $\beta$ as in (35). We want to minimize the asymptotic value of the error $|e_{n30}|$, that is, minimize $|a(c)|$. We assume that $\int K^3 > 0$ so that $w_{30} > 0$. The minimizing $c$ is as given. For $\rho_p < 0$, $|a(c)| = 0$ at $c$ of (42), giving an ACE that behaves as $m^{1/2} h^{\varepsilon + 1} \sim n^{-\beta_1}$ since $m^{-1} + h^p \sim n^{-p/(p+1)}$ and $p/(p + 1) > \beta_1$ for $p > 2$. For $\rho_p < 0$, one can show that we can improve this ACE rate $\beta_1$ to $\beta_2 = (p/2 + 1)/(p + 1) > \beta_1$ by replacing $h = cn^{-\alpha}$ by $h = cn^{-\alpha}(1 + c_3 n^{-\alpha})$ for $c$ of (42) and $c_3 = N/D$ for $N = c^{p+3/2} \rho_{p+1} + c h_{11}(z)$, and $D = -(p + 1/2) c^{p+1/2} \rho_p + c^{-1/2} h_{10}(z)/2$, where $h_{1r}(z)$ is defined by the expansion $h_1(z) = h_{1h}(z) = \sum_{r=0}^{\infty} h_{1r}(z) h^r$. In fact, this process may be repeated using a third term in $h$ to further increase the ACE rate above $\beta_2$.
5.2 Two-sided first order CI based on empirical studentization
First note that 2 1
21 1 2 20 p p ( p ) a = − w− ρ h +O h + and 2 1 1 w20 php O h( p ) δ = − ρ + + . So, by (27), 1 2 1 (| n | ) 2 ( ) 1 2 ( ) ( ) ( ) P Y ≤z = Φ z − − φ z n a c−β +O n−β for 1 1/(p 1) α = + , 1 p/(p 1) β = + , β2 =β1+α1= , 1 a c( )=γ0c−1−γ1cp +γ2c2p+1, γ0=h z20( ),
2 1/2 1 20 20 1 0 1 0,2 / p i i ( ) i zw w P He z γ ρ − − + = = +
∑
, and 1 2 2 zw20ρpγ = − . The optimal c
minimizes $|a(c)|$. Again there are two cases. If the minimum is positive the CI has ACE rate $\beta_1 = p/(p + 1)$. But if there exists $c$ such that $a(c) = 0$ then the CI has ACE rate $\beta_2 = 1$. Since $\gamma_2 > 0$, one such $c$ exists if $\gamma_0 < 0$, while if $\gamma_1 > 0$ there will be two such $c$ if $\gamma_1$ is sufficiently large but otherwise no such $c$; that is, if $\gamma_1 \ge \gamma_{10}$, where $\gamma_{10}$ is given by eliminating $c_0$ from $a(c_0) = \dot a(c_0) = 0$ for $\dot a(c)$ the derivative of $a(c)$.
6. CONCLUSIONS
In this paper, we have given first order and second order one- and two-sided CIs for $f(x_0)$. These CIs are based on Edgeworth and Cornish Fisher expansions for Studentized versions of the kernel density estimate, $\hat f(x_0)$. Tables 1 and 2 summarize the main CIs we give in terms of their ACE rate $\beta$ achieved at their optimal choice of $\alpha$ and their subsection.
TABLE 1
ACE rate β and α for (3) for one-sided CIs (ε = 1 for asymptotic Studentization, ε = p for empirical Studentization, p* = best ACE rate achievable using the c in (3), p = order of the kernel). Entries are given as "β at α".

ε     first order CIs              §      second order CIs             §
1     1/3 at 1/3 if p ≥ 2          3.1    1/2 at 1/2 if p ≥ 2          4.1.1, Hall
1     1/4 at 1/2 if p = 1          3.1    2/5 at 3/5 if p = 1          4.1.1, Hall
p     p/(2p+2) at 1/(p+1)          3.2    2p/(2p+3) at 3/(2p+3)        4.1.2, 4.1.4, Hall, page 222
p*    (p+2)/(2p+2) at 1/(p+1)      5.1
TABLE 2
ACE rate β and α for (3) for two-sided CIs (ε = 1 for asymptotic Studentization, ε = p for empirical Studentization, p* = best ACE rate achievable using the c in (3), p = order of the kernel). Entries are given as "β at α".

ε     first order CIs              §      second order CIs             §
1     1/2 at 1/2                   3.3    2/3 at 2/3                   4.5
p     p/(p+1) at 1/(p+1)           3.3    2p/(p+2) at 2/(p+2)          4.5
p*    1 at 1/(p+1)                 5.2
Hall (1992, Section 4.4.3, page 222) gives one- and two-sided CIs based on bootstrapping with $\beta = 2p/(2p + 3)$ and $\beta = p/(p + 1)$ using $\alpha = 3/(2p + 3)$ and $\alpha = 1/(p + 1)$, respectively. (This is for the case where bias is not estimated. Otherwise the formulas for $\beta$ become too complicated.) We have used two types of Studentizations for the first order CIs given in Section 3. Using asymptotic Studentization, we obtain $\beta = 1/3$ and $1/2$ for one- and two-sided CIs using $\alpha = 1/3$ and $1/2$ for $p \ge 2$. Using the usual empirical Studentization, these values of $\beta$ are improved to $\beta = p/(2p + 2)$ for one-sided CIs, and $\beta = p/(p + 1)$ for two-sided CIs, both using $\alpha = 1/(p + 1)$. The second order one- and two-sided CIs given (Section 4) have $\beta = 2p/(2p + 3)$ for one-sided CIs, and $\beta = 2p/(p + 2)$ for two-sided CIs using $\alpha = 3/(2p + 3)$ and $2/(p + 2)$.
We have also considered two cases for choosing the ACE-optimal constant $c$ in (3) for first order CIs based on empirical Studentization. In one case, using this $c$ increases the ACE rate to $(p + 2)/(2p + 2)$ for one-sided CIs and to 1 for two-sided CIs. This result is analogous to that of Silverman (1986).
ACKNOWLEDGMENTS
The authors would like to thank the Editor-in-Chief and the referee for carefully reading the paper and for their comments which greatly improved the paper.
Applied Mathematics Group CHRISTOPHER S. WITHERS
Industrial Research Limited
School of Mathematics SARALEES NADARAJAH
University of Manchester
REFERENCES
L. BIGGERI, (1999), Diritto alla ‘privacy’ e diritto all’informazione statistica, in Sistan-Istat, “Atti della Quarta Conferenza Nazionale di Statistica”, Roma, 11-13 novembre 1998, Roma, Istat, Tomo 1, pp. 259-279.
L. COMTET, (1974), Advanced Combinatorics, Reidel, Dordrecht, Holland.
R. A. FISHER, E. A. CORNISH, (1960), The percentile points of distributions having known cumulants, “Technometrics”, 2, pp. 209-225.
P. GARCIA-SOIDAN, (1998), Edgeworth expansions for triangular arrays, “Communications in Statistics-Theory and Methods”, 27, pp. 705-722.
P. HALL, (1992), The Bootstrap and Edgeworth Expansion, Springer-Verlag, New York.
E. PARZEN, (1962), On the estimation of a probability density and mode, “Annals of Mathematical Statistics”, 33, pp. 1065-1076.
B. L. S. PRAKASA RAO, (1983), Nonparametric Functional Estimation, Academic Press, Orlando.
M. ROSENBLATT, (1956), Remarks on some nonparametric estimates of a density function, “Annals of Mathematical Statistics”, 27, pp. 832-837.
D. W. SCOTT, R. A. TAPIA, J. R. THOMPSON, (1977), Kernel density estimation by discrete maximum penalized-likelihood criteria, “Annals of Statistics”, 8, pp. 820-832.
B. W. SILVERMAN, (1986), Density Estimation for Statistics and Data Analysis, Chapman and Hall, New York.
G. R. TERRELL, D. W. SCOTT, (1992), Variable kernel density estimation, “Annals of Statistics”, 20, pp. 1236-1265.
C. S. WITHERS, (1982), The distribution and quantiles of a function of parameter estimates, “Annals of the Institute of Statistical Mathematics, Series A”, 34, pp. 55-68.
C. S. WITHERS, (1983), Expansions for the distribution and quantiles of a regular functional of the empirical distribution with applications to nonparametric CIs, “Annals of Statistics”, 11, pp. 577-587.
C. S. WITHERS, (1984), Asymptotic expansions for distributions and quantiles with power series cumulants, “Journal of the Royal Statistical Society, Series B”, 46, pp. 389-396.
C. S. WITHERS, (2000), A simple expression for the multivariate Hermite polynomial, “Statistics and Probability Letters”, 47, pp. 165-169.
APPENDIX A: WAYS TO CONSTRUCT KERNELS OF HIGHER ORDER
We saw in Section 1 that $\hat f(x_0)$, the kernel estimate with kernel $K$ of order $p$, has bias $O(h^p)$. One can reduce the bias to $O(h^{p+1})$ by using

$$\hat f_0(x_0) = \hat f(x_0) - \hat f_{\cdot p}(x_0)\, B_{p1} (-h)^p / p!,$$

where $\hat f_{\cdot p}(x) = (d/dx)^p \hat f(x)$. But this amounts to replacing the kernel $K$ by a kernel of order $p+1$, namely

$$K_0(z) = K(z) - K_{\cdot p}(z)\, B_{p1} (-1)^p / p! = \left(1 - D^p B_{p1} (-1)^p / p!\right) K(z)$$

for $D = \partial/\partial z$. If $K$ is symmetric then $p$ is even so $K_0$ is of order $p+2$.
Example 1. Take $K(z) = \phi(z)$ so $p = 2$. Set $D = \partial/\partial z$. Then

$$K_0(z) = (1 - D^2/2)\phi(z) = (1 - He_2(z)/2)\phi(z) = (3 - z^2)\phi(z)/2$$

has order 4,

$$K_1(z) = (1 + D^4/8)(1 - D^2/2)\phi(z) = (1 - He_2(z)/2 + He_4(z)/8 - He_6(z)/16)\phi(z) = (45 - 65z^2 + 17z^4 - z^6)\phi(z)/16$$

has order 6, and

$$K_2(z) = (1 + D^6/24)(1 + D^4/8)(1 - D^2/2)\phi(z) = \left(1 - He_2(z)/2 + He_4(z)/8 - He_6(z)(1/16 - 1/24) - He_8(z)/(24 \times 2) + He_{10}(z)/(24 \times 8) - He_{12}(z)/(24 \times 8 \times 2)\right)\phi(z)$$

has order 8.
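The order-raising step of this appendix can be checked with exact arithmetic. The Python sketch below is illustrative only: it assumes the starting kernel has the form $q(z)\phi(z)$ with $q$ a polynomial stored as a coefficient list, and the helper names (`gauss_moment`, `kernel_moment`, `d_dz`, `order`, `raise_order`) are hypothetical names introduced here, not notation from the paper. It relies on the standard normal moments and on the identity $D[q\phi] = (q' - zq)\phi$.

```python
from fractions import Fraction
from math import factorial

def gauss_moment(k):
    """E[Z^k] for Z ~ N(0,1): 0 for odd k, (k-1)!! for even k."""
    if k % 2:
        return Fraction(0)
    out = Fraction(1)
    for j in range(k - 1, 0, -2):
        out *= j
    return out

def kernel_moment(poly, r):
    """B_{r1}: integral of z^r q(z) phi(z) dz, where poly[k] is the z^k coefficient of q."""
    return sum(c * gauss_moment(k + r) for k, c in enumerate(poly))

def d_dz(poly):
    """Coefficients of q' - z*q, so that d/dz [q(z) phi(z)] = (q' - z*q)(z) phi(z)."""
    out = [Fraction(0)] * (len(poly) + 1)
    for k, c in enumerate(poly):
        if k > 0:
            out[k - 1] += k * c
        out[k + 1] -= c
    return out

def order(poly, max_r=40):
    """Smallest r >= 1 with B_{r1} != 0 (the order of the kernel)."""
    for r in range(1, max_r):
        if kernel_moment(poly, r) != 0:
            return r
    raise ValueError("no nonzero moment found below max_r")

def raise_order(poly):
    """One step K -> (1 - D^p B_{p1} (-1)^p / p!) K, with p the order of K."""
    p = order(poly)
    coef = kernel_moment(poly, p) * (-1) ** p / factorial(p)
    dp = list(poly)
    for _ in range(p):          # apply D = d/dz exactly p times
        dp = d_dz(dp)
    n = max(len(poly), len(dp))
    a = list(poly) + [Fraction(0)] * (n - len(poly))
    b = dp + [Fraction(0)] * (n - len(dp))
    return [x - coef * y for x, y in zip(a, b)]

phi = [Fraction(1)]             # q(z) = 1, i.e. K = phi
k0 = raise_order(phi)           # coefficients of (3 - z^2)/2
```

Starting from `phi`, the call `raise_order(phi)` returns the coefficients 3/2, 0, -1/2 of $(3 - z^2)/2$, and `order(k0)` returns 4; calling `raise_order` again on the result produces the next kernel in the sequence.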
APPENDIX B: CUMULANTS IN TERMS OF MOMENTS
Here, we show how to express $k_{rh} = m^{r-1}\kappa_r(\hat w_1)$ and $k_h(i_1 \ldots i_r) = m^{r-1}\kappa(\hat w_{i_1}, \ldots, \hat w_{i_r})$ of (13) and (19) as polynomials in $h$ and $w = (w_1, w_2, \ldots)$ of (10). Let $U$ be a real random variable with finite moments $m_r = EU^r$ and cumulants $\kappa_r$ defined as usual by

$$\sum_{r=1}^{\infty} \kappa_r t^r / r! \equiv \ln(1 + S(t))$$

for all $t$ in $\mathbb{C}$ for which the moment generating function $1 + S(t) = Ee^{Ut}$ exists. By equation (2), page 160 of Comtet (1974),

$$\kappa_r = \sum_{i=1}^{r} (-1)^{i-1} (i-1)!\, B_{ri}(m), \qquad (43)$$

where for $m = (m_1, m_2, \ldots)$, $B_{ri}(m)$ is the partial exponential Bell polynomial defined by

$$\left(\sum_{r=1}^{\infty} m_r t^r / r!\right)^i / i! = \sum_{r=i}^{\infty} B_{ri}(m)\, t^r / r!$$

for $i = 0, 1, \ldots$. These polynomials $B_{ri}(m)$ are tabled by Comtet on page 307. For example, $B_{r1}(m) = m_r$, $B_{rr}(m) = m_1^r$, $B_{32}(m) = 3m_1m_2$, $B_{42}(m) = 4m_1m_3 + 3m_2^2$, and $B_{43}(m) = 6m_1^2m_2$. An explicit formula for $B_{ri}(m)$ is given by

$$B_{ri}(m) = \sum \left\{\pi(n)\, m_1^{n_1} \cdots m_r^{n_r} : n \in N^r,\ n_1 + \cdots + n_r = i,\ n_1 + 2n_2 + \cdots + rn_r = r\right\}$$

for $\pi(n)$ the partition function defined by

$$\pi(n) = r! \Big/ \prod_{i=1}^{r} \left(n_i!\; i!^{\,n_i}\right)$$

for $r = n_1 + 2n_2 + \cdots + rn_r$. Now take $U = U_1 = K_h(x_0 - X)$, where $X$ has distribution $F$ on $\Re$. In the notation of Section 2, $m_r = m_{rh} = h^{1-r}w_r$, and $\kappa_r = \kappa_{rh} = h^{1-r}k_{rh}$. It follows that

$$k_{rh} = \sum_{i=1}^{r} (-1)^{i-1} (i-1)!\, h^{i-1} B_{ri}(w), \qquad (44)$$

a polynomial in $h$ of degree $r - 1$. For example,

$$k_{1h} = w_1, \quad k_{2h} = w_2 - hw_1^2, \quad k_{3h} = w_3 - 3hw_1w_2 + 2!\,h^2w_1^3,$$

$$k_{4h} = w_4 - h(4w_1w_3 + 3w_2^2) + 2!\,h^2(6w_1^2w_2) - 3!\,h^3w_1^4,$$

$$k_{5h} = w_5 - h(5w_1w_4 + 10w_2w_3) + 2!\,h^2(10w_1^2w_3 + 15w_1w_2^2) - 3!\,h^3(10w_1^3w_2) + 4!\,h^4w_1^5.$$
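Equations (43) and (44) are straightforward to evaluate numerically. The following Python sketch is a minimal illustration under stated assumptions: it computes the partial exponential Bell polynomials by the standard recurrence $B_{ri} = \sum_{j=1}^{r-i+1} \binom{r-1}{j-1} m_j B_{r-j,\,i-1}$ (a swapped-in technique, not the explicit partition formula above), and the function names `partial_bell` and `k_rh` are hypothetical names chosen here.

```python
from fractions import Fraction
from math import comb, factorial

def partial_bell(r, i, m):
    """Partial exponential Bell polynomial B_{ri}(m), with m[j-1] holding m_j."""
    if r == 0 and i == 0:
        return Fraction(1)
    if r == 0 or i == 0:
        return Fraction(0)
    # Standard recurrence on the size j of the block containing the first element.
    return sum(comb(r - 1, j - 1) * m[j - 1] * partial_bell(r - j, i - 1, m)
               for j in range(1, r - i + 2))

def k_rh(r, w, h):
    """Equation (44): sum over i of (-1)^(i-1) (i-1)! h^(i-1) B_{ri}(w)."""
    return sum((-1) ** (i - 1) * factorial(i - 1) * h ** (i - 1) * partial_bell(r, i, w)
               for i in range(1, r + 1))

# With h = 1 and w = m, (44) reduces to (43). For the unit exponential
# distribution, m_r = r! and the cumulants are kappa_r = (r-1)!.
m_exp = [factorial(r) for r in range(1, 6)]
kappa = [k_rh(r, m_exp, 1) for r in range(1, 6)]
assert kappa == [1, 1, 2, 6, 24]
```

The exponential check exercises only the case $h = 1$; for general $h$ the same function reproduces the polynomials $k_{1h}, \ldots, k_{5h}$ listed above when given any numerical values of $w_1, \ldots, w_5$.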
Analogous to $w_r = w_{rh} = h^{r-1}EY^r$ and $k_{rh} = h^{r-1}\kappa_r(Y)$, define $\nu_{rh} = h^{r-1}\mu_r(Y)$. Expanding in terms of non-central moments gives

$$\nu_{rh} = \sum_{j=0}^{r} \binom{r}{j} (-hw_1)^j w_{r-j} = \sum_{j=0}^{r-2} \binom{r}{j} (-hw_1)^j w_{r-j} + (r-1)(-h)^{r-1} w_1^r$$

for $w_0 = h^{-1}$. For example, $\nu_{2h} = k_{2h} = w_2 - hw_1^2$ and $\nu_{4h} = k_{4h} + 3hk_{2h}^2$. The multivariate forms of (43) and (44) can now be written as

$$\kappa_{i_1 \ldots i_r} = \sum_{i=1}^{r} (-1)^{i-1} (i-1)!\, B_i^{i_1 \ldots i_r}(m)$$

and

$$k_h(i_1 \ldots i_r) = \sum_{i=1}^{r} (-1)^{i-1} (i-1)!\, h^{i-1} B_i^{i_1 \ldots i_r}(w),$$

where $B_i^{i_1 \ldots i_r}(m)$ is the multivariate form of $B_{ri}(m)$. These may be written down on sight. For example, $B_{42}(m) = 4m_1m_3 + 3m_2^2$ is replaced by

$$B_2^{i_1 \ldots i_4}(m) = \sum^{4} m_{i_1} m_{i_2 i_3 i_4} + \sum^{3} m_{i_1 i_2} m_{i_3 i_4},$$

where $\sum^{N} f(i_1 \ldots i_r)$ is $f(j_1 \ldots j_r)$ summed over all $N$ permutations $j_1 \ldots j_r$ of $i_1 \ldots i_r$ giving different terms. So, $B_2^{i_1 \ldots i_4}(m)$ is the sum of

$$\sum^{4} m_{i_1} m_{i_2 i_3 i_4} = m_{i_1} m_{i_2 i_3 i_4} + m_{i_2} m_{i_3 i_4 i_1} + m_{i_3} m_{i_4 i_1 i_2} + m_{i_4} m_{i_1 i_2 i_3}$$

and

$$\sum^{3} m_{i_1 i_2} m_{i_3 i_4} = m_{i_1 i_2} m_{i_3 i_4} + m_{i_1 i_3} m_{i_2 i_4} + m_{i_1 i_4} m_{i_2 i_3}.$$

So,

$$k_h(i_1) = w_{i_1}, \quad k_h(i_1 i_2) = w_{i_1 + i_2} - hw_{i_1}w_{i_2},$$

$$k_h(i_1 i_2 i_3) = w_{i_1 + i_2 + i_3} - h\sum^{3} w_{i_1 + i_2} w_{i_3} + 2!\,h^2 w_{i_1} w_{i_2} w_{i_3},$$

$$k_h(i_1 i_2 i_3 i_4) = w_{i_1 + i_2 + i_3 + i_4} - h\left\{\sum^{4} w_{i_1 + i_2 + i_3} w_{i_4} + \sum^{3} w_{i_1 + i_2} w_{i_3 + i_4}\right\} + 2!\,h^2 \sum^{6} w_{i_1 + i_2} w_{i_3} w_{i_4} - 3!\,h^3 w_{i_1} w_{i_2} w_{i_3} w_{i_4}.$$

For example, $k_h(112) = w_4 - h(2w_1w_3 + w_2^2) + 2h^2w_1^2w_2$.

SUMMARY
Edgeworth and Cornish Fisher expansions and confidence intervals for the distribution, density and quantiles of Kernel density estimates
We show that kernel density estimates $\hat f(x_0)$ of bandwidth $h = h(n) \to 0$ satisfy the Cornish-Fisher assumption with parameter $m = nh$. This allows Cornish-Fisher expansions about the normal for standardized and Studentized kernel density estimates in powers of $m^{-1/2}$ for smooth functions $t$. The expansions given are formal and the conditions for existence/validity are not explored. The expansions lead to first order confidence intervals (CIs) for $f(x_0)$ of level $1 - \omega + O(n^{-\beta})$, where $\beta = p/(2p+2)$ for one-sided CIs and $\beta = p/(p+1)$ for two-sided CIs, where $p$ is the order of the kernel used. The second order one- and two-sided CIs are given with $\beta = 2p/(2p+3)$ and $\beta = 2p/(p+2)$. We show how to choose the bandwidth for asymptotic optimality.