Statistical Scaling - Multiscale Analysis

3.2 Multiscale Analysis

3.2.1 Statistical Scaling

X1+ . . . + Xm

m , . . . ,Xmi+1+ . . . + X(m+1)i

m , . . .

. (3.6)

This section is organized as follows. First we look at statistical scaling in Section 3.2.1 that aims at determining the Hurst parameter of the process. In Section 3.2.2 a method to analyze multifractal scaling, that results in the Legendre spectrum, is described with which we will compare our approximating MAP with the real traffic trace. The fitting procedure is based on another way of examining a finite sequence of number, the Haar wavelet transform (Section 3.2.3), because it allows us to compute the desired properties of our MAP structure analytically.

The introduction to statistical and multifractal scaling given hereinafter is partly based on [75] and [68].

3.2.1 Statistical Scaling

Recently, it has been agreed [44, 54, 55] that when one studies a traffic trace the most significant parameter to be estimated is the degree of self-similarity, usually given by the so-called Hurst-parameter. The aim of the statistical approach, based on the theory self-similarity, is to find the Hurst-parameter.

The standard definition of self-similarity is stated for continuous-time processes: Y = {Y (t), t > 0} is self-similar if

Y (t)= a^d ^−HY (at), ∀t > 0, ∀a > 0, 0 < H < 1, (3.7) where = denotes equality in the sense of finite-dimensional distributions and H is the Hurst-parameter.^d The most broadly applied signal model satisfying (3.7) is the fractional Brownian motion [73, 54, 55]

whose power lies in its simple parameterization. It is fully determined by its mean, variance and the Hurst-parameter.

There are several, different definitions of self-similarity involving stationary sequences X = {Xⁱ, i ≥ 1};

discrete-time stochastic process X = {Xi, i ≥ 1} is said to be exactly self-similar if

X = m^d ^1−HX^(m) (3.8)

for all aggregation level m. In other words X is called to be exactly self-similar if X and X^(m)are identical within a scale factor in the sense of finite-dimensional distributions. (We note here that if X is the incremental process of an exactly self-similar continuous-time process it satisfies (3.8) for all aggregation level m.) A stationary sequence is said to be asymptotically self-similar if (3.8) holds as m → ∞. A covariance-stationary sequence X is exactly second-order similar or asymptotically second-order self-similar if m^1−HX^(m) has the same variance and auto-correlation as X for all aggregation level m, or as m → ∞.

It is worth to point out that (3.8) and stationarity imply that either E[X] = 0, or E[X] = ±∞, or H = 1. But H = 1 implies as well that Xi = Xj, ∀i, j almost surely. As a consequence, to test for statistical self-similarity makes sense only having zero-mean data, i.e., the data has to be centered before the analysis. On the other hand, as we show later, multifractal analysis may be carried out on data with non-zero mean as well.

In the rest of this section, methods for estimating the Hurst-parameter of datasets are recalled. Beside the procedures described in the following, others as well can be found in the literature. See [6] for an exhaustive discussion on this subject.

It is important to note that the introduced statistical tests of self-similarity, based on a finite number of samples, provide an approximate value of H only for the considered range of scales. Nothing can be said about the higher scales and the asymptotic behavior based on these tests.

Application of the procedures will be presented by applying the estimators to the first trace of the well-known Bellcore dataset that contains local-area network (LAN) traffic collected in 1989 on an Ethernet at the Bellcore Morristown Research and Engineering facility. It may be downloaded from the WEB site collecting traffic traces [80]. The trace was first analyzed in [28].

Variance-Time Plot

As it was proposed in [75] one may perform a test of self-similarity by analyzing the behavior of the absolute moments of the aggregated process. If X is exactly self-similar

log(E[|X^(m)|^q]) = log(E[|m^H−1X|^q]) = q(H − 1)log(m) + log(E[|X|^q]). (3.9)

According to (3.9), in case of a self-similar process plotting log(E[|X^(m)|^q]) against log(m) for a fixed q results in a straight line. The slope of the line is q(H − 1). Based on the above observations the test is

0.001 0.01 0.1 1

1 10 100 1000 10000 100000 1e+06 1e+07

Variance

Aggregation level

Variance time plot least square fit

Figure 3.5: Variance-time plot and its least square fit for the Bellcore trace

10 100 1000 10000 100000 1e+06

log10(R/S(n))

n R/S plot least square fit

Figure 3.6: R/S plot and its least square fit for the Bellcore trace

performed as follows. Having a series of length N , the moments may be estimated as

E[|X^(m)|^q] = 1

where bxc denotes the largest integer number smaller or equal to x. To test for self-similarity log(E[|X^(m)|^q]) is plotted against log(m) and a straight line is fitted to the curve. If the straight line shows good correspondence with the curve, then the process is self-similar and its Hurst-parameter may be calculated by the slope of the straight line.

The variance-time plot, which is used widespread to gain evidence of self-similarity, is the special case with q = 2. It depicts the behavior of the 2nd moments for the centered data, i.e., log(Var(X^(m))) versus log(m). For self-similar time series, the slope of the variance-time plot, −β, is greater than −1. The Hurst parameter can be calculated as H = 1 − (β/2). The variance-time plot for the analyzed Bellcore trace is depicted in Figure 3.5. The Hurst-parameter given by the variance-time plot is 0.83.

R/S Plot

The R/S method is one of the oldest tests for self-similarity, it is discussed in detail in [48]. For a time series, X = {Xⁱ, i ≥ 1}, with sample variance

the R/S statistic, or the rescaled adjusted range, is given by:

R/S(n) = 1 "

R/S(n) is the scaled difference between the fastest and the slowest arrival period considering n arrivals.

For stationary LRD processes R/S(n) ≈ (n/2)^H. To determine the Hurst parameter based on the R/S statistic the dataset is divided into blocks, log[R/S(n)] is plotted versus log n and a straight line is fitted on the points. The slope of the fitted line is the estimated Hurst parameter.

The R/S plot for the analyzed Bellcore trace is depicted in Figure 3.6. The Hurst-parameter deter-mined based on the R/S plot is 0.78.

Whittle Estimator

The Whittle estimator is based on the maximum likelihood principle assuming that the process under analysis is Gaussian. The estimator, unlike the previous ones, provides the estimate through a non-graphical method. This estimation takes more time to perform but it has the advantage of providing confidence intervals as well. For details see [29, 6]. For the Bellcore trace, the estimated value of the Hurst parameter is 0.82 and its 95% confidence interval is [0.79, 0.84].

Nel documento Budapest University of Technology and Economics Department of Telecommunications (pagine 27-30)