Analysis of spatio-temporal point patterns with replication

Jonatan A. González^{1,*}, Ute Hahn^{2}, and Jorge Mateu^{1}

^{1} Departamento de Matemáticas, Universitat Jaume I, Castellón, Spain; [email protected], [email protected]

^{2} Centre for Stochastic Geometry and Advanced Bioimaging, Department of Mathematics, University of Aarhus, Aarhus C, Denmark; [email protected]

^{*} Corresponding author

Abstract. We develop and apply several methods for the analysis of replicated spatio-temporal point patterns in order to identify structural differences between groups of patterns. First, we calculate a number of functional descriptors of each spatio-temporal pattern to investigate departures from completely random patterns, both among subjects and among groups. The distributions of our functional descriptors and of the derived test statistics are unknown, so for nonparametric inference we use bootstrap and permutation procedures to estimate the null distribution of our test statistics, even though the null hypothesis is not supported by these data. A simulation study provides evidence of the validity and power of our procedures.

Keywords. K-function, non-parametric test, spatio-temporal point process, subsampling, permutation test.

1 Spatio-temporal tests

We assume that $X$ is a spatio-temporal point process with a separable, locally integrable intensity function $\lambda(u,v) = \lambda_1(u)\lambda_2(v)$, where $\lambda_1$ and $\lambda_2$ are non-negative integrable functions; for a stationary and isotropic spatio-temporal process, $\lambda(u,v)$ takes a constant value $\lambda$. The K-function of a stationary, isotropic spatio-temporal process is defined as

$$K(r,t) = \lambda^{-1}\,\mathbb{E}^{!}_{0}\big(N[c(r,t)]\big), \qquad (1)$$

where $c(r,t)$ denotes a spatio-temporal cylinder of radius $r$ and height $2t$, and $N(\cdot)$ is the number of points in a spatio-temporal volume. The conditional expectation can be interpreted as the expected number of further events within distance $r$ and time $t$ of an arbitrary event (taken as the origin). Note that the K-function measures pattern structure independently of the spatio-temporal intensity. Under a homogeneous spatio-temporal Poisson process, whose spatial and temporal components are independent homogeneous Poisson processes on $\mathbb{R}^2$ and $\mathbb{R}^+$ respectively, $K(r,t) = 2\pi r^2 t$, which is the volume of a cylinder with base radius $r$ and height $2t$. An estimator of $K(r,t)$ is given by

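$$\hat K(r,t) = \frac{1}{\hat\lambda^{2}\,|W \times T|} \sum_{i \neq j} \frac{\mathbf{1}\{\|u_i - u_j\| \le r\}\,\mathbf{1}\{|t_i - t_j| \le t\}}{e_2(u_i,u_j)\,e_1(t_i,t_j)}, \qquad (2)$$

where $e_d(\cdot)$ is a $d$-dimensional edge correction function.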
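As an illustration of the estimator (2), the following is a minimal sketch in Python with NumPy. It assumes no edge correction at all, that is $e_2 = e_1 = 1$, so the estimate is biased downward for events near the boundary of $W \times T$. The function name `stk_estimate` and its arguments are hypothetical and only serve the example.

```python
import numpy as np

def stk_estimate(xy, times, r_grid, t_grid, area, duration):
    """Naive space-time K estimate on an (r, t) grid; ASSUMES e_2 = e_1 = 1."""
    xy = np.asarray(xy, dtype=float)          # (n, 2) spatial locations u_i
    times = np.asarray(times, dtype=float)    # (n,) event times t_i
    n = len(times)
    volume = area * duration                  # |W x T|
    lam_hat = n / volume                      # intensity estimate
    # Pairwise spatial and temporal distances between all events.
    d_space = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    d_time = np.abs(times[:, None] - times[None, :])
    off_diag = ~np.eye(n, dtype=bool)         # keep ordered pairs with i != j
    d_space, d_time = d_space[off_diag], d_time[off_diag]
    counts = np.empty((len(r_grid), len(t_grid)))
    for a, r in enumerate(r_grid):
        for b, t in enumerate(t_grid):
            counts[a, b] = np.count_nonzero((d_space <= r) & (d_time <= t))
    return counts / (lam_hat ** 2 * volume)

# Toy pattern: only the first two events are close in both space and time.
xy = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0)]
times = [0.0, 0.05, 0.9]
K = stk_estimate(xy, times, r_grid=[0.2], t_grid=[0.01, 0.1],
                 area=4.0, duration=1.0)
print(K)
```

In practice the unit weights would be replaced by a proper edge correction for the spatial window and the time interval.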

1.1 Diggle’s spatio-temporal test procedure

To test for differences between independent replicates of empirical spatial K-functions, [1, 2] suggested a bootstrap procedure. We develop a similar but more general test for the space-time case. Suppose the original sample consists of $g$ groups of sizes $m_1, \ldots, m_g$. Let $w_{ij} = n_{ij}/n_i$, with $n_i = \sum_{j=1}^{m_i} n_{ij}$, and $n = \sum_{i=1}^{g} n_i$, where $n_{ij}$ is the number of points in the $j$-th pattern of group $i$. Given an estimated descriptor $\hat K_{ij}(r,t)$ for each pattern, we define the estimated group-specific and overall mean functions, as usual in heteroscedastic ANOVA, by

$$\bar K_i(r,t) = \sum_{j=1}^{m_i} w_{ij}\,\hat K_{ij}(r,t) \quad \text{and} \quad \bar K(r,t) = \frac{1}{n} \sum_{i=1}^{g} n_i\,\bar K_i(r,t); \qquad (3)$$

and the statistic

$$D_{st} = \sum_{i=1}^{g} \int_0^{r_0}\!\int_0^{t_0} \frac{n_i}{r^2 t}\,\big[\bar K_i(r,t) - \bar K(r,t)\big]^2 \,\mathrm{d}t\,\mathrm{d}r, \qquad (4)$$

which is a natural extension of the statistic proposed by [2] to measure differences between groups.

The sampling variation of $\hat K_{ij}(r,t)$ increases with $r$ and $t$, so we use the weighting factor $1/(r^2 t)$, which down-weights the space-time K-function estimates at large $r$ and $t$, where their variance is largest. The statistic $D_{st}$ is a sensible measure of the extent to which the group-specific mean K-functions differ and is analogous to a residual sum of squares in a conventional one-way ANOVA.
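The mean functions (3) and the statistic (4) can be sketched numerically as follows. This is only an illustration under stated assumptions: the K-function estimates are stored on a uniform grid with $r > 0$ and $t > 0$, the double integral is approximated by a plain Riemann sum, and the names `group_means` and `d_st` are hypothetical.

```python
import numpy as np

def group_means(K_hat, n_pts, groups):
    """Weighted group means, group point totals n_i and overall mean, as in (3)."""
    K_hat = np.asarray(K_hat, dtype=float)    # (patterns, len(r), len(t))
    n_pts = np.asarray(n_pts, dtype=float)    # n_ij, points per pattern
    groups = np.asarray(groups)               # group label of each pattern
    means, totals = [], []
    for g in np.unique(groups):
        sel = groups == g
        n_i = n_pts[sel].sum()
        w = n_pts[sel] / n_i                  # weights w_ij = n_ij / n_i
        means.append(np.tensordot(w, K_hat[sel], axes=1))
        totals.append(n_i)
    means, totals = np.array(means), np.array(totals)
    overall = np.tensordot(totals, means, axes=1) / totals.sum()
    return means, totals, overall

def d_st(K_hat, n_pts, groups, r_grid, t_grid):
    """Riemann-sum approximation of (4); ASSUMES a uniform grid with r, t > 0."""
    means, totals, overall = group_means(K_hat, n_pts, groups)
    r = np.asarray(r_grid, dtype=float)
    t = np.asarray(t_grid, dtype=float)
    weight = 1.0 / (r[:, None] ** 2 * t[None, :])     # factor 1 / (r^2 t)
    integrand = (totals[:, None, None] * weight * (means - overall) ** 2).sum(axis=0)
    return float(integrand.sum() * (r[1] - r[0]) * (t[1] - t[0]))

# Two groups whose K-functions differ by a factor of two.
r_grid = np.linspace(0.01, 0.1, 10)
t_grid = np.linspace(0.5, 5.0, 10)
f = 2 * np.pi * r_grid[:, None] ** 2 * t_grid[None, :]   # Poisson K-function
K_hat = np.stack([f, f, 2 * f, 2 * f])
print(d_st(K_hat, [10, 10, 10, 10], ["A", "A", "B", "B"], r_grid, t_grid))
```

A finer grid or a higher-order quadrature rule would give a better approximation of the integral; the Riemann sum is used here only to keep the sketch short.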

1.1.1 A Bootstrap Procedure

The interest focuses on testing the null hypothesis that K-functions do not differ between groups, i.e.

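$$H_0 : \mathbb{E}(\bar K_1(r,t)) = \mathbb{E}(\bar K_2(r,t)) = \cdots = \mathbb{E}(\bar K_g(r,t)) \quad \text{for all } r \text{ and } t$$

$$H_1 : \mathbb{E}(\bar K_u(r,t)) \neq \mathbb{E}(\bar K_v(r,t)) \quad \text{for some } r, \text{ some } t \text{ and some } u \text{ and } v.$$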

The distribution of $D_{st}$ is analytically intractable, but a pure randomization test can be performed by permuting the $\hat K_{ij}(r,t)$ across groups and recomputing $D_{st}$ to obtain its exact conditional distribution. We generate bootstrap samples as follows. In the first step, residual spatio-temporal functions are defined as

$$\hat R_{ij}(r,t) = n_{ij}^{1/2}\,\big[\hat K_{ij}(r,t) - \bar K_i(r,t)\big]. \qquad (5)$$

Under the null or the alternative hypotheses, the $\hat R_{ij}(r,t)$ are approximately exchangeable quantities, since the sampling variance of each $\hat K_{ij}(r,t)$ is proportional to $n_{ij}^{-1}$. Note that

$$\hat K_{ij}(r,t) = \bar K_i(r,t) + n_{ij}^{-1/2}\,\hat R_{ij}(r,t).$$

Then, we obtain a random sample, without replacement, of functional residuals and define

$$\hat K_{ij}^{boot}(r,t) = \bar K(r,t) + n_{ij}^{-1/2}\,\hat R_{ij}^{boot}(r,t). \qquad (6)$$

To determine the bootstrap p-value, the observed value of $D_{st}$ is ranked among the corresponding bootstrap values $D_{st}^{boot}$. We then analyse a set of simulations generated by varying parameters such as the number of patterns per group or the intensity.
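The residual bootstrap in (5) and (6) can be sketched as below. This is a minimal sketch, not a definitive implementation: the integral in $D_{st}$ is again replaced by a Riemann sum on a uniform grid, the p-value uses the common convention $(1 + \#\{D_{st}^{boot} \ge D_{st}\})/(B + 1)$, which the text does not specify, and the names `d_st` and `bootstrap_pvalue` are hypothetical.

```python
import numpy as np

def d_st(K_hat, n_pts, groups, weight, cell):
    """Statistic (4) as a Riemann sum; weight = 1/(r^2 t) on the grid, cell = dr*dt."""
    totals, means = [], []
    for g in np.unique(groups):
        sel = groups == g
        totals.append(n_pts[sel].sum())
        means.append(np.tensordot(n_pts[sel] / totals[-1], K_hat[sel], axes=1))
    totals, means = np.array(totals), np.array(means)
    overall = np.tensordot(totals, means, axes=1) / totals.sum()
    return float((totals[:, None, None] * weight * (means - overall) ** 2).sum() * cell)

def bootstrap_pvalue(K_hat, n_pts, groups, weight, cell, n_boot=999, seed=0):
    """Bootstrap p-value for D_st obtained by permuting the residuals (5)."""
    rng = np.random.default_rng(seed)
    K_hat = np.asarray(K_hat, dtype=float)
    n_pts = np.asarray(n_pts, dtype=float)
    groups = np.asarray(groups)
    group_mean = np.empty_like(K_hat)         # group mean repeated per pattern
    overall = np.zeros(K_hat.shape[1:])       # overall mean function
    for g in np.unique(groups):
        sel = groups == g
        n_g = n_pts[sel].sum()
        mean_g = np.tensordot(n_pts[sel] / n_g, K_hat[sel], axes=1)
        group_mean[sel] = mean_g
        overall += n_g * mean_g / n_pts.sum()
    root_n = np.sqrt(n_pts)[:, None, None]
    resid = root_n * (K_hat - group_mean)                 # equation (5)
    d_obs = d_st(K_hat, n_pts, groups, weight, cell)
    d_boot = np.empty(n_boot)
    for b in range(n_boot):
        perm = rng.permutation(len(resid))                # sample without replacement
        K_boot = overall + resid[perm] / root_n           # equation (6)
        d_boot[b] = d_st(K_boot, n_pts, groups, weight, cell)
    # ASSUMED convention: rank of the observed value among the bootstrap values.
    p_value = (1 + np.count_nonzero(d_boot >= d_obs)) / (n_boot + 1)
    return d_obs, p_value

# Synthetic example: group B has a K-function three times that of group A.
rng = np.random.default_rng(1)
r = np.linspace(0.01, 0.1, 8)
t = np.linspace(0.5, 4.0, 8)
f = 2 * np.pi * r[:, None] ** 2 * t[None, :]
weight = 1.0 / (r[:, None] ** 2 * t[None, :])
cell = (r[1] - r[0]) * (t[1] - t[0])
groups = np.array(["A"] * 5 + ["B"] * 5)
scale = np.where(groups == "A", 1.0, 3.0)[:, None, None]
K_hat = scale * f * (1 + 0.01 * rng.standard_normal((10, 8, 8)))
n_pts = rng.integers(50, 100, size=10)
d_obs, p_value = bootstrap_pvalue(K_hat, n_pts, groups, weight, cell, n_boot=199)
print(d_obs, p_value)
```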

The simulation study indicates that this way of bootstrapping by permutation of residuals may fail to reproduce the distribution of the test statistic: the p-values are non-uniform in some of the simulated scenarios. This leads us to believe that the generalisation of Diggle’s statistic is not sufficient for comparisons in the spatio-temporal case.

1.2 Spatio-temporal Hahn’s permutation test

Because the spatio-temporal Diggle’s test yields non-uniform p-values under the null hypothesis, we give a generalised spatio-temporal version of the Studentized permutation test proposed by [3], whose p-values are uniformly distributed under the null hypothesis by construction. Consider the estimates of $K_{ij}(r,t)$ obtained with an unbiased estimator. Let

$$\bar K_i(r,t) = \frac{1}{m_i} \sum_{j=1}^{m_i} \hat K_{ij}(r,t) \quad \text{and} \quad s_i^2(r,t) = \frac{1}{m_i - 1} \sum_{j=1}^{m_i} \big(\hat K_{ij}(r,t) - \bar K_i(r,t)\big)^2$$

denote the empirical mean and variance of the K-function estimates in a given group $i$. We define a statistic related to the t-statistic as

$$T_{st} = \sum_{1 \le i < j \le g} \int_0^{r_0}\!\int_0^{t_0} \frac{\big(\bar K_i(r,t) - \bar K_j(r,t)\big)^2}{m_i^{-1} s_i^2(r,t) + m_j^{-1} s_j^2(r,t)} \,\mathrm{d}t\,\mathrm{d}r. \qquad (7)$$

The statistic $T_{st}$ may lead to tests that are sensitive to heteroscedasticity. In such cases, we prefer the statistic

$$\bar T_{st} = \sum_{1 \le i < j \le g} \int_0^{r_0}\!\int_0^{t_0} \frac{\big(\bar K_i(r,t) - \bar K_j(r,t)\big)^2}{m_i^{-1} \bar s_i^2(r,t) + m_j^{-1} \bar s_j^2(r,t)} \,\mathrm{d}t\,\mathrm{d}r, \qquad (8)$$

where

$$\bar s_i^2(r,t) = \frac{r^2 t}{r_0 t_0} \int_0^{r_0}\!\int_0^{t_0} \frac{s_i^2(u,v)}{u^2 v} \,\mathrm{d}v\,\mathrm{d}u.$$

Under heteroscedasticity, using $\bar T_{st}$ instead of $T_{st}$ gives a better-performing test. As expected, the Studentized permutation tests generally perform better than Diggle’s tests, as shown by a new set of simulations.
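A Studentized permutation test based on the statistic (7) can be sketched as follows. The sketch makes several assumptions that are not part of the text: the integral is approximated by a Riemann sum on a uniform grid, the empirical variances are strictly positive at every grid point (otherwise the integrand is undefined), the p-value uses the convention $(1 + \#\{T^{perm} \ge T\})/(B + 1)$, and the names `t_st` and `permutation_pvalue` are hypothetical. The smoothed-variance version (8) is not implemented here.

```python
import numpy as np
from itertools import combinations

def t_st(K_hat, groups, cell):
    """Statistic (7) as a Riemann sum over the (r, t) grid; cell = dr * dt."""
    labels = np.unique(groups)
    mean, var_over_m = {}, {}
    for g in labels:
        K_g = K_hat[groups == g]
        mean[g] = K_g.mean(axis=0)                        # group mean function
        var_over_m[g] = K_g.var(axis=0, ddof=1) / len(K_g)  # s_i^2 / m_i
    total = 0.0
    for a, b in combinations(labels, 2):                  # all pairs i < j
        total += ((mean[a] - mean[b]) ** 2 / (var_over_m[a] + var_over_m[b])).sum()
    return float(total * cell)

def permutation_pvalue(K_hat, groups, cell, n_perm=999, seed=0):
    """Permutation p-value: group labels are shuffled across patterns."""
    rng = np.random.default_rng(seed)
    K_hat = np.asarray(K_hat, dtype=float)    # (patterns, len(r), len(t))
    groups = np.asarray(groups)
    t_obs = t_st(K_hat, groups, cell)
    t_perm = np.array([t_st(K_hat, rng.permutation(groups), cell)
                       for _ in range(n_perm)])
    # ASSUMED convention for the Monte Carlo p-value.
    p_value = (1 + np.count_nonzero(t_perm >= t_obs)) / (n_perm + 1)
    return t_obs, p_value

# Synthetic example: group B has a K-function three times that of group A.
rng = np.random.default_rng(2)
r = np.linspace(0.01, 0.1, 8)
t = np.linspace(0.5, 4.0, 8)
f = 2 * np.pi * r[:, None] ** 2 * t[None, :]
cell = (r[1] - r[0]) * (t[1] - t[0])
groups = np.array(["A"] * 6 + ["B"] * 6)
scale = np.where(groups == "A", 1.0, 3.0)[:, None, None]
K_hat = scale * f * (1 + 0.05 * rng.standard_normal((12, 8, 8)))
t_obs, p_value = permutation_pvalue(K_hat, groups, cell, n_perm=199)
print(t_obs, p_value)
```

Permuting labels rather than residuals keeps each permuted sample a genuine rearrangement of the observed estimates.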

2 Database

We use a climate database recording the locations of tornado occurrences in the USA over 59 years (see Figure 1). We want to detect possible differences between the occurrence patterns of tornadoes in the hot (Spring - Summer) and cold (Autumn - Winter) weather seasons. The database exhibits pronounced spatial inhomogeneity, which must be addressed before applying any of the tests we have proposed. We therefore divide the spatial region into subregions (as in a tessellation) according to areas of higher or lower intensity, so that the intensity is approximately homogeneous within each subregion and the Studentized permutation test can be applied to our data.

[Figure 1 panels: Autumn - Winter, Spring - Summer; time axis: year, with ticks from 1953 to 2003.]

Figure 1: Observed point patterns of tornado occurrences in the U.S. from 1953 until 2012, in two groups corresponding to two climatic seasons.

Acknowledgments. This research was supported by grant MTM2010-14961 from the Spanish Ministry of Science and Education and by the Centre for Stochastic Geometry and Advanced Bioimaging. We thank the Storm Prediction Center of the National Oceanic and Atmospheric Administration (NOAA), which is part of the National Weather Service (NWS) and the National Centers for Environmental Prediction (NCEP), for providing the dataset.

References

[1] Diggle, Peter J. and Lange, Nicholas and Benes, Francine M. (1991). Analysis of Variance for Replicated Spatial Point Patterns in Clinical Neuroanatomy, Journal of the American Statistical Association, 86, 618–625.

[2] Diggle, Peter J. and Mateu, Jorge and Clough, Helen E. (2000). A comparison between parametric and non-parametric approaches to the analysis of replicated spatial point patterns, Advances in Applied Probability, 32(2), 331–343.

[3] Hahn, Ute (2012). A Studentized Permutation Test for the Comparison of Spatial Point Patterns, Journal of the American Statistical Association, 107, 754–764.