
An algorithm for automated phase picking and localization of seismic events.


Academic year: 2021



Contents

1 Introduction 4

2 An overview of different picking techniques 6

2.1 Allen’s picker and Baer and Kradolfer’s picker . . . 6

2.2 Higher order statistics . . . 8

2.3 Autoregressive models: the AIC picker . . . 10

2.4 The fractal dimension . . . 13

2.5 Polarization filtering . . . 15

2.6 Neural networks . . . 17

3 A program for phase picking and event localization 20

3.1 Trace filtering . . . 22

3.1.1 Butterworth filters . . . 22

3.2 P phase picking . . . 24

3.2.1 AIC picker . . . 24

3.2.2 Kurtosis picker . . . 29

3.2.3 Weighting and final P onset determination . . . 30

3.3 The localization algorithm . . . 31

3.3.1 Discarding outliers . . . 36

3.4 S phase picking . . . 38

3.4.1 Last localization and program output . . . 45

4 Results 47

4.1 The picker . . . 47

4.2 The localization algorithm . . . 50

5 Conclusions 61

Bibliography . . . 63


Abstract

In recent years the spread of digital seismometry has made available a large amount of data useful for the localization of seismic events of different kinds. Analyzing this information, however, requires an enormous amount of work, which made evident the need to develop methods and procedures to automatically identify P and S arrivals on seismograms.

When performed manually, the reading of arrival times (picking) can be very time consuming and can be affected by systematic errors caused by the subjectivity involved in identifying the different seismic phases. This becomes particularly problematic when comparing data coming from different sources, or even the same data analyzed by different operators, since the criteria used to choose the arrival times will certainly differ. These problems clearly do not arise if the choice of arrival times is automated; moreover, using automatic picking methods, a larger number of events can be located more quickly. In this thesis a Matlab code was developed that, starting from the seismic traces, automatically identifies the P and S arrivals and locates each event using the grid-search method. The algorithm aims to be versatile and fast both in the arrival-time identification stage and in the localization stage.

After a review of some of the techniques found in the literature for picking both P and S waves, a detailed description of the developed algorithm follows; finally, the results obtained by testing the algorithm on real data are presented.


Chapter 1

Introduction

During the past years, a great amount of digital data has been made available from seismic stations, all of which could potentially be used to locate seismic events. Given the amount of work necessary to process these data manually, the necessity to develop techniques to automatically detect P and S onsets in seismograms became apparent.

Manually detecting onsets is a process that is both extremely time consuming and prone to systematic errors due to subjective bias. It is also impractical when trying to compare data coming from different sources, or even from the same dataset when more than one person has worked on the phase picking, since it is unlikely that the same criteria will be used. These problems are easily overcome if the whole process is automated; in particular, using automated picking algorithms, it is possible to quickly locate a great number of events.

It must be said, of course, that manual picking, despite the aforementioned disadvantages, is generally more accurate; hence the need to refine automated picking methods in order to achieve results comparable to those obtained manually. In addition, an accurate onset picking leads to a better event location.

Various techniques for onset time identification can be found in the literature, both for single-station and three-component recordings. Many of these techniques exploit the concept of a characteristic function (CF), that is, a function obtained by transforming the original signal in a way that enhances the property of interest, namely the arrival time of the seismic wave. Allen (1978), who introduced the concept of characteristic function,


used as CF the square of the trace plus a weighted square of its first derivative. A modified version of Allen's CF was proposed by Baer & Kradolfer (1987), who also implemented a dynamic signal threshold; this algorithm is still used in automated picking systems (Aldersons, 2004). CFs obtained using higher-order statistics have also been proposed, e.g. by Saragiotis et al. (2002).

Another family of pickers exploits autoregressive (AR) techniques based on the Akaike Information Criterion (AIC); many examples can be found in the literature, e.g. Takanami & Kitagawa (1988) and Leonard & Kennett (1999) both use an AR-AIC method to pick phase arrival times. It is also possible to calculate the AIC function directly from the seismograms, without using the autoregressive coefficients, as shown by Maeda (1985).

These, however, are not the only possibilities. Boschetti et al. (1996) used the fractal dimension, while Murat & Rudman (1992) and McCormack et al. (1993) used methods based on neural networks; other methods include spatial correlation as well. Moreover, these algorithms perform differently with regard to speed and accuracy in the determination of onset times, which is why some phase-detection algorithms integrate more than one technique in order to obtain better results.

For a more accurate earthquake location, and in particular for the determination of hypocentral depths, the estimation of S onset times is of great importance. However, S phase picking is more difficult than P phase picking, since S waves are usually contaminated by the tail of the P waves and possibly by converted waves generated at different interfaces. Despite the inherent difficulty, there are different ways to approach the problem. For instance, some of the previously mentioned algorithms may be used to detect S phase onset times on the horizontal components, where S waves are predominant. Another method is instead based on polarization analysis of data coming from three-component recordings (Cichowicz, 1993).

In the context of automated picking techniques, my work aims to develop an algorithm that can provide both a quick detection of P and S phase onsets and the localization of events recorded by a seismic network. The work is organized as follows: first I present an overview of some of the methods used for automated picking of both P and S phases, then a detailed description of the algorithm that I developed. Finally, I discuss the data used to test the algorithm and present the results obtained.


Chapter 2

An overview of different

picking techniques

2.1

Allen’s picker and Baer and Kradolfer’s picker

The picker developed by Allen (1978) exploits a characteristic function (CF) of the seismic trace defined as

f(t) = x(t)^2 + K x'(t)^2,   (2.1)

where x(t) is the seismic trace and x'(t) its first derivative, while K is a weighting constant that depends on the sampling rate and noise level of the seismic stations. An example of this characteristic function can be seen in figure 2.1.

Given this characteristic function, the picker calculates its STA (Short Time Average) and LTA (Long Time Average); the LTA is then multiplied by a certain reference level R, thus defining a threshold for the detection of phase arrivals. When the STA value exceeds this threshold (STA > LTA·R), an onset is declared and the time is stored.
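As an illustration, this trigger logic can be sketched in a few lines of Python (the thesis code itself is written in Matlab); the values of K, R and the window lengths below are hypothetical choices, not those of Allen (1978):

```python
def allen_picker(x, dt, K=1.0, sta_len=10, lta_len=100, R=3.0):
    """Sketch of an STA/LTA trigger on Allen's CF of eq. (2.1)."""
    # characteristic function: squared trace plus weighted squared derivative
    cf = [x[i] ** 2 + K * ((x[i] - x[i - 1]) / dt) ** 2
          for i in range(1, len(x))]
    for i in range(sta_len + lta_len, len(cf)):
        lta = sum(cf[i - sta_len - lta_len:i - sta_len]) / lta_len
        sta = sum(cf[i - sta_len:i]) / sta_len
        if lta > 0 and sta > R * lta:   # onset declared when STA > LTA * R
            return i                    # approximate onset index in samples
    return None
```

On a synthetic trace whose amplitude jumps at a known sample, the trigger fires within a few samples of the jump.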

Baer and Kradolfer (1987) observed that the use of a fixed threshold prevented the detection of weak signals in the presence of a high level of noise; moreover, the picker proposed by Allen was very sensitive to changes in frequency, but not to changes in amplitude. To solve these issues they suggested


Figure 2.1: A seismogram and its Allen characteristic function (red).

a new characteristic function:

f(t) = (e(t)^4 − ē(t)^4) / σ(t)^2,   (2.2)

where the overbar denotes the mean of e(t)^4 computed up to time t.

Here σ(t)^2 is the variance calculated from the beginning of the trace up to point t, while e(t)^2 is the envelope function:

e(t)^2 = x(t)^2 + x'(t)^2 / w(t).   (2.3)

The envelope function is squared again in the CF (giving e(t)^4) to achieve a more distinct signal-to-noise behaviour. In the above formula, w(t) is the instantaneous frequency of the signal; however, since it is not convenient to calculate it for every time instant, the authors replaced it with the following weighting parameter:

Σ_{t=1}^{τ} x_t^2 / Σ_{t=1}^{τ} x'_t^2.   (2.4)

Using this parameter to weight the envelope function, both the squared amplitude and the squared first derivative are of the same order of magnitude; this solution, as the authors pointed out, yielded better results than simply substituting the instantaneous frequency with the dominant frequency of the seismic trace.

2.2

Higher order statistics

Since the statistical properties of a seismic trace change significantly upon the arrival of an earthquake, it is possible to use these properties to identify a phase onset.

Let X be a random variable with a distribution p(x); the expectation value for this distribution is defined as

E(X) = Σ_x x p(x).   (2.5)

Given this definition, we can define the statistical moment and the central statistical moment of order k as

μ_k = E(X^k) = Σ_x x^k p(x)   (moment of order k);

c_k = E[(X − E(X))^k] = Σ_x (x − μ_1)^k p(x)   (central moment of order k).

The second central moment is the variance, which describes the spread of the data around the mean value. A further characterization of the data includes skewness and kurtosis:

skewness = E[(X − E(X))^3] / E[(X − E(X))^2]^(3/2),

kurtosis = E[(X − E(X))^4] / E[(X − E(X))^2]^2.

Skewness is a measure of symmetry or, more precisely, of the lack of symmetry; its value is zero for a normal distribution, negative for data skewed left and positive for data skewed right. Kurtosis, on the other hand, measures whether the data are peaked or flat relative to a normal distribution: data sets with high kurtosis tend to have a distinct peak near the mean, decline rather rapidly, and have heavy tails, while data sets with low kurtosis tend to have a flat top near the mean rather than a sharp peak, a uniform distribution being the extreme case. Since the kurtosis of a normal distribution is three, this value is often subtracted from the above definition; with this convention the kurtosis can take negative values as well.

Figure 2.2: Histograms of four different statistical distributions and their skewness and kurtosis values; each distribution is generated from 10,000 samples (from http://www.itl.nist.gov).

Figure 2.2 shows some examples of statistical distributions and their values of skewness and kurtosis.
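As a small illustration of the definitions above, the sample skewness and kurtosis can be computed directly from the central moments (a pure-Python sketch; a real picker would use vectorized library routines):

```python
def skewness(xs):
    n = len(xs)
    m = sum(xs) / n
    c2 = sum((x - m) ** 2 for x in xs) / n   # second central moment (variance)
    c3 = sum((x - m) ** 3 for x in xs) / n   # third central moment
    return c3 / c2 ** 1.5

def kurtosis(xs):
    n = len(xs)
    m = sum(xs) / n
    c2 = sum((x - m) ** 2 for x in xs) / n
    c4 = sum((x - m) ** 4 for x in xs) / n
    return c4 / c2 ** 2                      # equals 3 for a normal distribution
```

A symmetric sample has zero skewness, and a flat, uniform-like sample has kurtosis below three, in line with figure 2.2.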

On the seismic trace, both skewness and kurtosis are calculated in moving time windows. Once the functions have been calculated on the selected portion of the trace, they can be used to pick arrival times; in fact, while the noise can generally be considered Gaussian, when the time window reaches the P phase onset both skewness and kurtosis increase strongly, as can be seen in figure 2.3. This property also holds for the S wave arrival, although the increase is not as large as for the P wave arrival.

Because there is no simple mathematical way to detect the change in behaviour of these functions, however, some transformations need to be applied before a time for the phase arrival can be picked.

For example, Baillard et al. (2014) apply a succession of three transformations to the kurtosis function in order to pick P and S phase arrivals:

Figure 2.3: An example of a seismogram and its kurtosis (red) and skewness (green) functions.

• the first transformation eliminates all the negative slopes, yielding a non-decreasing function of time;

• the second transformation subtracts a linear trend so that the first and last points of the function are zero; the phase onsets then become local minima of the function;

• the last transformation subtracts from each value the next value of the function and sets all positive values to zero, leaving a function that is zero everywhere except in correspondence of the phase onsets.

Phase onsets are picked on this function. However, as observed by the authors, to improve the picker accuracy the results should be computed for different window sizes and frequency bandwidths.
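The three transformations can be sketched as follows (a Python illustration of my reading of Baillard et al., 2014; the pick-by-minimum rule and variable names are simplifications of mine):

```python
def baillard_cf(K):
    """Apply the three transformations to a kurtosis series K."""
    n = len(K)
    # 1) keep only the positive increments -> non-decreasing function F1
    F1 = [K[0]]
    for i in range(1, n):
        d = K[i] - K[i - 1]
        F1.append(F1[-1] + (d if d > 0 else 0.0))
    # 2) subtract the linear trend so that F2[0] = F2[-1] = 0
    a = (F1[-1] - F1[0]) / (n - 1)
    F2 = [F1[i] - F1[0] - a * i for i in range(n)]
    # 3) difference with the next value, keeping only the non-positive part
    return [min(0.0, F2[i] - F2[i + 1]) for i in range(n - 1)]

def pick_onset(F3):
    # the onset is taken at the most negative value of the transformed function
    return min(range(len(F3)), key=lambda i: F3[i])
```

On a synthetic kurtosis series that jumps at a known index, the minimum of the transformed function falls at the jump.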

2.3

Autoregressive models: the AIC picker

A way to analyze seismograms is to treat them as time series. Statistically, a time series is a stochastic event of which we only have one observation; this observation is just one of the infinite possible observations that could have been made. Thus, our time series can be treated as one particular realization of a stochastic process, the totality of the realizations being called an ensemble. The analysis of a time series is concerned with evaluating the properties of the probability model which generated the observed series.


Given the difficulty of specifying the probability distribution for any set of times t_1, ..., t_n and for every value of n, the probability distribution is usually described in terms of its moments, in particular the mean, variance and covariance of the process.

A particular class of stochastic processes are the stationary processes; a time series (X_t)_{t∈T}, where T is the time interval in which the process is defined, is said to be strictly stationary if

f_{X_{t1},...,X_{tn}}(x_1, ..., x_n) = f_{X_{t1+h},...,X_{tn+h}}(x_1, ..., x_n)   (2.6)

for all h ∈ R, n ∈ N and x ∈ R^n. This means that a time shift does not affect the properties of the distribution. A weaker definition of stationarity requires that the mean value and standard deviation of the process are time independent and that the correlation function depends only on the time difference t_2 − t_1.

One way of characterizing time series is in terms of an autoregressive (AR) process, in which every value depends on a linear combination of past values. The order of the autoregressive model depends on the number of past values employed; an AR model of order p can be written as:

X_t = a_1 X_{t−1} + ... + a_p X_{t−p} + ε_t,   (2.7)

where ε_t is white noise with zero mean and variance σ_ε^2.

Although seismic traces cannot be considered stationary processes, since the properties of the signal evidently change in time, it is possible to divide them into intervals that can be considered stationary; each of these intervals can be represented by an appropriate AR model. Typically, seismic noise can be represented with an AR process of lower order than the one representing the actual signal.

If we divide a time series X_t of length M into two subseries of lengths N_1 and N_2, we can describe each subseries with an appropriate AR model of order p_1 and p_2, e.g. for the first subseries:

X_{1t} = Σ_{n=1}^{p_1} a_{1n} X_{t−n} + ε_{1t},   (1 ≤ t < N_1),

and similarly for the second.


The AR model is to be determined for each subseries and for each choice of the separation point. This can be achieved using the AIC (Akaike Information Criterion), proposed by Akaike (1973) as a means to select the best statistical model. In the case of locally stationary AR models, the AIC is given by:

AIC = −2(maximum log likelihood) + 2(number of parameters).   (2.8)

If we consider two subseries, the likelihood of the locally stationary AR model will be:

∏_{i=1}^{2} (1/(2πσ_i^2))^{N_i/2} exp{−(1/(2σ_i^2)) Σ_{t=1}^{N_i} (X_{it} − Σ_{n=1}^{p_i} a_{in} X_{i,t−n})^2},   (2.9)

and the corresponding log likelihood

−(1/2) Σ_{i=1}^{2} {N_i log 2πσ_i^2 + (1/σ_i^2) Σ_{t=1}^{N_i} (X_{it} − Σ_{n=1}^{p_i} a_{in} X_{i,t−n})^2}.   (2.10)

The maximum of the log likelihood (Kitagawa and Akaike, 1978) is obtained for

σ_i^2 = (1/N_i) Σ_{t=1}^{N_i} (X_{it} − Σ_{n=1}^{p_i} a_{in} X_{i,t−n})^2,   (2.11)

and substituting this value in the above formula we obtain

−(1/2) Σ_{i=1}^{2} (N_i log 2πσ_i^2 + N_i) = −(M/2)(1 + log 2π) − (1/2) Σ_{i=1}^{2} N_i log σ_i^2,

since N_1 + N_2 = M.

The two AR models are fitted independently and the AIC is calculated for each subseries. The total value of the AIC for the whole time series will be the sum of those two values:

AIC = N_1 log σ_1^2 + N_2 log σ_2^2 + 2(p_1 + p_2).   (2.12)

By calculating the AIC for each pair of intervals obtained by shifting the separation point, we obtain a function whose minimum indicates the position of the phase onset time on the seismic trace, as shown in figure 2.4.

Figure 2.4: A portion of a seismic trace containing the P phase arrival and the corresponding AIC function (below). The red dot on the seismic trace, corresponding to the minimum of the AIC function, is the P phase onset.

Since the order and the coefficients of the autoregressive models are not known and must be computed for each portion of the analyzed seismogram, Maeda (1985) proposed, as an alternative method, to compute the AIC directly from the seismic trace, without using the AR coefficients. For a trace x of N samples the AIC is then

AIC(k) = k log var(x[1, k]) + (N − k − 1) log var(x[k + 1, N]),   (2.13)

where k = 1, ..., N.

It must be noted that, since the AIC picker considers the phase onset to be the global minimum of the AIC function, it is important to cut an appropriate time window to be analyzed in order to pick the correct phase onset time.
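A minimal Python sketch of the Maeda formulation of eq. (2.13) follows (the thesis implementation is in Matlab; the variance floor guarding log(0) is my own addition):

```python
import math

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def aic_pick(x):
    """Return the sample index k minimizing AIC(k) of eq. (2.13)."""
    N = len(x)
    best_k, best_aic = None, float("inf")
    for k in range(2, N - 2):                 # keep both segments non-trivial
        v1 = max(variance(x[:k]), 1e-12)      # floor avoids log(0)
        v2 = max(variance(x[k:]), 1e-12)
        aic = k * math.log(v1) + (N - k - 1) * math.log(v2)
        if aic < best_aic:
            best_k, best_aic = k, aic
    return best_k
```

On a trace that switches from low-amplitude to high-amplitude oscillation at a known sample, the minimum of the AIC falls at the transition, provided the analysis window is cut appropriately around the onset.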

2.4

The fractal dimension

A fractal is an object or quantity that displays self-similarity on all scales. This quantity does not need to exhibit exactly the same structure at all scales, but the same types of structures must appear on all scales.

Seismic traces are not self-similar throughout their length; however, they can be divided into appropriate intervals that show properties of self-similarity, and for each of these intervals a different fractal dimension can be found. This


property can be used to find the arrival time of a seismic wave, since a change in the fractal dimension of the trace can be taken to mark a phase onset.

There are different methods to calculate the fractal dimension of a curve (Klinkenberg, 1994); regarding phase picking techniques, Boschetti et al. (1996), in their study analyzed two different methods to automatically pick phase arrivals using fractal dimension:

Figure 2.5: Approximation of the length of a curve using different step sizes and the resulting log-log graph. As the step size is reduced, the polyline follows the real curve more closely (from [5]).

divider method: at the basis of this method is the calculation of the length of the curve by approximating it with a series of straight segments; if f(x) represents our curve, the length of each segment is

l(i) = √(Δx^2 + (f(Δx·i) − f(Δx·(i−1)))^2),   (2.14)

where Δx is the fixed step used to divide the curve into segments. The total length L is the sum of the lengths of all the segments and, if the


curve presents a fractal behaviour, then

L(Δx) ∝ Δx^(1−D),   (2.15)

where D is the fractal dimension.

The more the step size is reduced, the more closely the segments follow the original shape of the curve. Of course, choosing steps either too small or too large in relation to the total length of the curve and its variability will prevent the recognition of any structure. A plot of the length of the curve versus the step size on a log-log graph then gives a straight line, whose slope 1 − D yields the fractal dimension (see figure 2.5).

Hurst method: this method works by taking into consideration portions of

the curve of progressively greater length. For each of these progressively longer windows the range of the data R, that is, the difference between the highest and lowest value of the curve inside the window, is computed; R is then normalized by dividing it by the standard deviation of the data (σ). If the data show a fractal behaviour, then

R/σ = F τ^H,   (2.16)

where F is a constant, τ is the window size and H is the Hurst exponent, related to the fractal dimension by the relation D = 2 − H. The Hurst exponent can be obtained by plotting R/σ against the window size on a log-log graph.

The result of this study pointed out that the Hurst method requires much less computation time, but yields good results only when the noise level is not too high; the divider method, on the other hand, despite being more demanding from the computational point of view, is more robust in the presence of higher levels of noise.
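The divider method can be sketched in Python as follows (an illustration under my own assumptions: the analytic test curve, step list and least-squares fit are mine; Boschetti et al. work on sampled traces):

```python
import math

def curve_length(f, dx, xmax):
    """Length of the polyline approximating f with step dx, as in eq. (2.14)."""
    n = int(round(xmax / dx))
    return sum(math.hypot(dx, f(dx * i) - f(dx * (i - 1)))
               for i in range(1, n + 1))

def fractal_dimension(f, steps, xmax=1.0):
    """Estimate D from the slope of log L vs log dx, using L(dx) ~ dx**(1-D)."""
    pts = [(math.log(dx), math.log(curve_length(f, dx, xmax))) for dx in steps]
    mx = sum(p for p, _ in pts) / len(pts)
    my = sum(q for _, q in pts) / len(pts)
    slope = (sum((p - mx) * (q - my) for p, q in pts)
             / sum((p - mx) ** 2 for p, _ in pts))
    return 1.0 - slope
```

For a straight line the polyline length does not depend on the step, so the estimated dimension is 1, as expected for a smooth curve.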

2.5

Polarization filtering

Polarization filtering is a technique that requires three-component recordings of a seismic event. The three components are recorded along the north (n), east (e) and vertical (z) directions. However, it is convenient to


rotate our reference frame so that the axes are oriented along the polarization directions of the P, SH and SV waves. To do so, the first step is to calculate the covariance matrix for the three components of the signal:

| cov(z, z)  cov(z, n)  cov(z, e) |
| cov(n, z)  cov(n, n)  cov(n, e) |   (2.17)
| cov(e, z)  cov(e, n)  cov(e, e) |

where the covariance between two vectors of length N is defined as

cov(x, y) = (1/N) Σ_{i=1}^{N} x_i y_i.   (2.18)

The covariance matrix is symmetric and its elements are real, so its eigenvalues λi are real and its eigenvectors Ui form an orthogonal basis. The eigenvalues are ordered from the largest to the smallest (λ1 ≥ λ2 ≥ λ3), and the λiUi are called the principal axes of the matrix; they are, as already pointed out, directed along the polarization directions of the body waves. The previous (z, n, e) components are now rotated into (R, S, T) components, related to the former by the following relationship:

| R |   | u11  u12  u13 |   | z |
| S | = | u21  u22  u23 | · | n |   (2.19)
| T |   | u31  u32  u33 |   | e |

where the u_ij, for j = 1, 2, 3, are the direction cosines relative to the i-th principal direction.

After the rotation into the new reference frame, several properties, all calculated in moving time windows, can be extracted using the eigenvalues and the eigenvectors of the covariance matrix. Some of the properties that can be evaluated by polarization analysis are the following:

Degree of rectilinearity: given the eigenvalues of the covariance matrix (λ1 ≥ λ2 ≥ λ3), the degree of rectilinearity (Samson, 1973) is

Rec(t) = [(λ1 − λ2)^2 + (λ1 − λ3)^2 + (λ2 − λ3)^2] / [2(λ1 + λ2 + λ3)^2].   (2.20)

For circular polarization (λ1 = λ2 = λ3) Rec = 0, while for linear polarization (λ2 = λ3 = 0) Rec = 1; elliptical polarization yields intermediate values of the rectilinearity. Since both P and S waves are linearly polarized, this parameter is expected to be close to one upon their arrival.

Energy ratio between the estimated transverse energy and the total energy:

E(t) = Σ_i (n_i^2 + e_i^2) / Σ_i (z_i^2 + n_i^2 + e_i^2).   (2.21)

This parameter approaches one for S waves and tends towards zero for P waves; moreover, Cichowicz (1993) noted that since E(t) is not very sensitive to the presence of noise in a signal, it may be especially suitable for the detection of S wave arrivals.

Dip: the dip of maximum polarization is defined as

D(t) = tan^{−1}( u11 / √(u12^2 + u13^2) ),   (2.22)

where the u1j are the components of the eigenvector U1 relative to the largest eigenvalue. The value of the dip of maximum polarization ranges between ±90°; a horizontal maximum polarization would have a dip of 0° (Vidale, 1986).

Azimuth: the azimuth can be calculated from the horizontal orientation of the rectilinear motion, again making use of the eigenvector relative to the dominant polarization direction (Jurkevics, 1988):

A(t) = tan^{−1}(u12 / u13).   (2.23)

The parameters obtained from polarization analysis can be used either on their own to detect a phase onset or, as proposed by Cichowicz (1993), combined together to obtain a characteristic function on which it is easier to pick arrival times.
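Two of these parameters, the rectilinearity of eq. (2.20) and the energy ratio of eq. (2.21), are straightforward to compute once the eigenvalues and the component windows are available (a Python sketch; in practice they would be evaluated in moving windows):

```python
def rectilinearity(l1, l2, l3):
    """Degree of rectilinearity, eq. (2.20); l1 >= l2 >= l3 are eigenvalues."""
    num = (l1 - l2) ** 2 + (l1 - l3) ** 2 + (l2 - l3) ** 2
    return num / (2.0 * (l1 + l2 + l3) ** 2)

def energy_ratio(z, n, e):
    """Transverse-to-total energy ratio, eq. (2.21)."""
    horiz = sum(ni * ni + ei * ei for ni, ei in zip(n, e))
    total = sum(zi * zi + ni * ni + ei * ei for zi, ni, ei in zip(z, n, e))
    return horiz / total
```

Linear polarization (λ2 = λ3 = 0) gives Rec = 1, circular polarization (λ1 = λ2 = λ3) gives Rec = 0, and purely horizontal motion gives E = 1.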

2.6

Neural networks

Neural networks aim to reproduce the way neurons work. A neuron can be schematically thought of as composed of three parts:


• dendrites that receive inputs from other neurons;

• a soma, which collects and elaborates the received inputs;

• an axon, which transmits the output signal to other neurons.

A neural-network structure can be defined as a collection of parallel processors connected together; a learning law permits the processors to adapt to the specific information environment. Each processing element in the network can be represented as a node; the nodes are connected with each other in a layered structure, where each node can have multiple inputs but just one output, which can be sent to other nodes. A neural network is composed of two principal layers, an input layer and an output layer; the layers that may be present between these are called hidden layers.

Figure 2.6: Scheme of a neural network; the information received from the input parameters is processed through different layers, until an output is determined. The arrows represent the path followed by the information flow (from docs.opencv.org).

Figure 2.6 shows a schematic example of a neural network; the direction of the information flow in the network is indicated by the arrows. Each connection is weighted, so that the information is appropriately scaled when it is transmitted to the next node. Neural networks must be trained to recognize patterns and solve specific problems; in particular, training a neural network means finding an appropriate set of weights.

Many architectures for neural networks have been developed, the most commonly used being the backpropagation neural network (e.g. Murat & Rudman, 1992; McCormack et al., 1993). The procedure to train this network can be summarized as follows:

1. apply an input vector (of which the output is already known) to the network and calculate the output;

2. compare the result with the known output value;

3. correct the weights according to the final errors;

4. repeat the process until the output of the neural network is sufficiently accurate.

The number of training data and the size of the network depend on the problem at hand; the initial weights used to train the network, instead, should be small random numbers (McCormack et al., 1993, suggest choosing them in the interval ±0.25). Each weight in the network is then corrected using a recursive formula; if w_ij is the weight given to the connection between the nodes i and j, then

w_ij(k + 1) = w_ij(k) + α(1 − β) δ_j O_i + β Δw_ij(k).   (2.24)

This recursive technique, called the momentum technique (Freeman and Skapura, 1991), increases the speed of convergence by adding a fraction of the previous change when calculating the weight correction. This additional term tends to keep the weight changes going in the same direction, hence the term momentum. In the formula, Δw_ij(k) stands for w_ij(k) − w_ij(k − 1), δ_j is the error in the output of the j-th node, O_i is the output of the i-th node, α is a gain constant that ranges between 0 and 1, and β is the momentum, whose value is also less than one.
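The update of eq. (2.24) is easy to state in code (a sketch for a single weight; the values of α, β, δ_j and O_i below are arbitrary illustrations, not values from the cited works):

```python
def momentum_update(w, dw_prev, delta_j, o_i, alpha=0.5, beta=0.9):
    """One application of eq. (2.24) to a single weight w_ij."""
    dw = alpha * (1.0 - beta) * delta_j * o_i + beta * dw_prev  # new delta-w
    return w + dw, dw   # updated weight and the change to reuse at step k + 1
```

Repeated calls keep a fraction β of the previous change, so successive corrections in the same direction accumulate.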

The network training procedure is the search for the set of weight parameters that minimizes the error function. Once a network has been trained, using an appropriate set of training data, it will recognize certain input patterns and will produce the desired output. In the case of picking phase arrivals, the training data should consist of examples of first breaks from several traces that are representative of the variations in amplitude, phase and frequency expected to be encountered in the first-break events (for example, along a seismic profile).


Chapter 3

A program for phase picking

and event localization

The program presented works with cut traces from three-component recordings. Two different algorithms are employed to pick the onset of the P phases; then an S phase onset is determined and the event is localized. The picked phases used by the localization algorithm presented here are stored in an output file and can be used subsequently.

On the next page a block diagram illustrates the structure of the program in more detail. After reading and separating the different components of the seismic traces, a bandpass filter is applied and a P onset is determined; all the incorrect times are discarded and the remaining phases are used to attempt a first localization of the event. After this localization, some of the P phases are discarded and a second localization is attempted using only the remaining phases; then an appropriate window for the detection of the S wave onset is cut and the S onset determined. Finally, a last localization is performed using both P and S phase arrival times.


Block diagram of the program:

1. Cut seismic traces
2. Filtering
3. P phase picking
4. Removal of stations with incorrect P picking
5. Are there enough stations remaining? If not, exit the program.
6. Localization, first try
7. Removal of outliers
8. Localization, second try
9. Removal of outliers
10. Are there at least four stations remaining? If not, the outlier stations are not discarded.
11. S phase picking
12. Last localization
13. Output: picked phases and localization parameters


3.1

Trace filtering

All the components of the seismic trace are filtered using a bandpass filter; the corner frequencies can be chosen by the user. When choosing them, however, it must be considered that they will be the same for all the traces that are going to be analyzed.

Before calculating the coefficients of the filter, a window function is applied to all the components of the trace to prevent leakage when calculating the Fourier transform of the signal. If no windowing is applied, in the frequency domain the signal looks as if it had been multiplied by a rectangular window, whose Fourier transform is the sinc function; the convolution of this function with the Fourier transform of the signal is what causes the leakage, that is, the introduction of false frequency components in the Fourier transform of the signal. To reduce this phenomenon it is convenient to multiply the signal in the time domain by a window function that goes to zero less steeply at both ends.

In this algorithm the window function used is the Tukey window, a window that is cosine-tapered at the edges but retains a flat top. The Tukey window is defined as:

w(x) = (1/2){1 + cos((2π/r)[x − r/2])}        for 0 ≤ x < r/2,
w(x) = 1                                       for r/2 ≤ x < 1 − r/2,   (3.1)
w(x) = (1/2){1 + cos((2π/r)[x − 1 + r/2])}    for 1 − r/2 ≤ x ≤ 1,

where x is the position along the trace normalized to [0, 1] and r the ratio of the cosine-tapered length to the total length of the trace; the r/2 factor in the above equations arises from the symmetry of the window. Linear trends are removed as well before applying the filter, which in this case is a Butterworth bandpass filter. The filter is applied in both directions in order to obtain a zero-phase filtering of the trace.
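The window of eq. (3.1) can be sketched in pure Python (in practice one would call a library routine, e.g. Matlab's tukeywin; the taper fraction r = 0.5 below is an arbitrary choice):

```python
import math

def tukey(x, r=0.5):
    """Tukey (tapered cosine) window of eq. (3.1); r is the tapered fraction."""
    if x < r / 2:
        return 0.5 * (1.0 + math.cos(2.0 * math.pi / r * (x - r / 2)))
    if x < 1.0 - r / 2:
        return 1.0                       # flat top
    return 0.5 * (1.0 + math.cos(2.0 * math.pi / r * (x - 1.0 + r / 2)))
```

The window is zero at both ends, rises smoothly over the tapered fraction r/2 on each side, and is one on the flat top.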

3.1.1 Butterworth filters

A filter can be described using its transfer function; in terms of frequency, if ω_c is the cutoff frequency, the (squared) transfer function can be written as

|H(ω)|^2 = 1 / (1 + ε^2 F(ω)^2),   (3.2)

Figure 3.1: Outline of the transfer function of a lowpass filter; the shaded areas represent, from left to right, the passband, transition region and stopband of the filter.

where F(ω) is the characteristic function of the filter and ε a coefficient taking into account the degree of error in the passband or stopband. In figure 3.1, the shaded regions represent the areas where the transfer function of the filter must lie. Butterworth filters are designed to have a flat response both in the passband and the stopband, although to achieve this result they have a relatively wide transition region from passband to stopband compared to other filters. This means that a filter of higher order is required to meet a given stopband specification.

Figure 3.2: The transition region of a Butterworth lowpass filter of different orders. As the order of the filter increases the transition region becomes narrower (from cnx.org).


Referring to equation 3.2, for a Butterworth filter the characteristic function is F(ω) = (ω/ω_c)^n, where n is the order of the filter, and ε = 1, since the response of the filter is flat both in the passband and the stopband. Thus the transfer function becomes:

|H(ω)|^2 = 1 / (1 + (ω/ω_c)^{2n}).   (3.3)

This is the transfer function for a lowpass filter, which is the typical prototype for all filters; highpass filters are obtained by modifying 3.3 into

\[
|H(\omega)|^2 = 1 - \frac{1}{1 + \left(\frac{\omega}{\omega_c}\right)^{2n}},
\tag{3.4}
\]

while bandpass and bandstop filters are obtained by combining the above functions.

The filter used in this algorithm is a fourth order bandpass filter.
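Equations 3.3 and 3.4 can be evaluated directly to see how the order controls the transition region of figure 3.2. This is an illustrative sketch (the thesis code is in Matlab); the function names are chosen here for clarity.

```python
import numpy as np

def butter_lowpass_mag2(w, wc, n):
    """Squared magnitude response of an order-n Butterworth lowpass (eq. 3.3)."""
    return 1.0 / (1.0 + (w / wc) ** (2 * n))

def butter_highpass_mag2(w, wc, n):
    """Highpass response obtained from the lowpass prototype (eq. 3.4)."""
    return 1.0 - butter_lowpass_mag2(w, wc, n)

# At the cut-off frequency the lowpass response is always 1/2 (-3 dB), and
# increasing the order narrows the transition region (compare figure 3.2).
freqs = np.linspace(0.01, 4.0, 400)
resp_order4 = butter_lowpass_mag2(freqs, 1.0, 4)
```

Note that the lowpass and highpass responses sum to one at every frequency, which is exactly the construction of equation 3.4.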

3.2 P phase picking

The picking of P wave arrivals is carried out only on the Z component of the recorded seismic traces. Two different algorithms are used to detect the onsets; each pick is then weighted, and a weighted mean of the two obtained values is taken as the P arrival time. Moreover, in order to detect incorrect picks, the difference between the times picked by the two algorithms is evaluated and, if this difference is larger than a certain value, the pick is labeled as incorrect and discarded.

3.2.1 AIC picker

The first algorithm employed is the AIC picker; as suggested by Maeda, the algorithm is implemented without calculating the order of the autoregressive models.

As already pointed out in section 2.3, this algorithm considers the P phase arrival to be coincident with the global minimum of the AIC function; figures 3.3 and 3.4, however, show that the global minimum of this function is not necessarily coincident with the P wave arrival time. It may happen that we either have only one minimum which is not located in correspondence of the P onset, or more than one minimum where the global one does not mark the P arrival time.

(a) AIC function of the whole seismic trace. It can be seen that the minimum of the function does not correspond to the P onset as it should; moreover, in correspondence of the P onset, the AIC function does not even have a local minimum.

(b) AIC function of an appropriate portion of the same seismic trace. This time the global minimum corresponds to the P wave arrival.

(a) AIC function of the whole seismic trace. There is a local minimum corresponding to the P wave arrival time, but the global minimum is situated further along the trace.

(b) AIC function of an appropriate portion of the same seismic trace. The global minimum is now situated in correspondence of the P wave arrival time.

To avoid the problem it is necessary to cut the seismic traces appropriately; in particular we need a window which both contains the P onset and is short enough to ensure that the global minimum of the function corresponds to the P wave arrival; at the same time, the window should be long enough to contain an appropriate number of samples.

The trigger algorithm

The algorithm used to cut a portion of the seismic trace containing the P onset is variance-based. The variance of a vector x of length N is defined as

\[
v = \frac{1}{N}\sum_{i=1}^{N} (x_i - \bar{x})^2
\tag{3.5}
\]

where $\bar{x}$ is the mean value of vector x.

If we calculate the variance of a seismic trace on a moving window, we obtain a function similar to the one in figure 3.5.

Figure 3.5: A seismic trace and its variance (red). The variance is calculated along the trace on a moving window of four seconds.

It can be seen that the variance is close to zero in the portion of the trace which contains only noise, then it increases significantly upon the arrival of a seismic wave, reaches a maximum and then decreases when the event comes to an end. This property has been used to detect the endpoints of the window we need to cut.

In particular the seismic trace is divided into n segments s_1, s_2, ..., s_n of equal length and for each segment the value of the variance v_1, v_2, ..., v_n is calculated. The variance of the segments containing only noise will be very small; on the other hand, when a seismic wave arrives, the corresponding segment will have a significantly higher variance. When the event ends the variance will again assume lower values.

Figure 3.6: In the plot above, the red lines correspond to the endpoints of the window identified by the trigger algorithm. Below, the value of the variance for each segment of the seismic trace.

To detect the changes in the variance of the trace we calculate the ratios v_{i+1}/v_i between each value of the variance and the value relative to the preceding segment; this ratio is assigned to segment s_i. Since the variance of segments which do not contain the signal is very close to zero, the highest ratio will be found upon the arrival of the P waves. However, since the P onset may be located at the end of a segment, the maximum ratio may refer not to the correct segment but to the following one. Thus, to make sure that the window contains the P onset, the start time of the window is chosen to be the start time of the segment preceding the one to which the maximum ratio refers.


The end time of the window is instead the start time of the first segment whose variance ratio falls below a certain value. If this condition cannot be met, the window is automatically assigned a fixed length.

The length of the trigger window is also assigned automatically in those cases where the noise level of the trace is particularly high.
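The trigger logic described above can be sketched in a few lines of NumPy (the thesis code is in Matlab). The number of segments, the stop ratio and the default window length are illustrative values, not the thesis' tuned parameters.

```python
import numpy as np

def trigger_window(trace, n_seg=10, stop_ratio=1.5, default_len=3):
    """Variance-based trigger: return (first, last) segment indices of the
    window expected to contain the P onset."""
    segments = np.array_split(np.asarray(trace, float), n_seg)
    v = np.array([s.var() for s in segments])
    # Ratio v[i+1]/v[i] is assigned to segment i; guard against zero variance.
    ratios = v[1:] / np.maximum(v[:-1], 1e-20)
    k = int(np.argmax(ratios))      # segment to which the highest ratio refers
    start = max(k - 1, 0)           # start one segment earlier, to be safe
    # End: start of the first later segment whose ratio drops below stop_ratio,
    # otherwise fall back to a fixed default length.
    later = np.where(ratios[k + 1:] < stop_ratio)[0]
    end = k + 1 + int(later[0]) if later.size else min(start + default_len, n_seg - 1)
    return start, end

# Synthetic trace: low-amplitude noise, then a stronger arrival at sample 600.
t = np.arange(1000)
trace = 0.01 * np.sin(0.7 * t)
trace[600:] += np.sin(0.05 * t[600:])
start, end = trigger_window(trace, n_seg=10)
```

With ten segments of 100 samples, the largest variance ratio belongs to the segment where the arrival falls, and the returned window starts one segment earlier so the onset cannot be missed.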

3.2.2 Kurtosis picker

This picker exploits the properties of the kurtosis function. The value of the kurtosis is calculated on a moving window. However, to reduce the computing time necessary to calculate its value on the whole trace, the function is only calculated on a window spanning from the beginning of the trace to the end point of the window determined by the trigger algorithm.

Figure 3.7: Above, the part of the seismic trace on which the kurtosis function is calculated; below, the kurtosis function relative to the trace. The red dot on the seismic trace is the P arrival time found with the kurtosis picker.

Once the kurtosis function has been calculated, the P wave arrival time can be extracted from the properties of the function which, as shown in figure 3.7, increases strongly upon the arrival of a seismic wave.

If the moving window used to calculate the kurtosis contains n elements, then the value of the kurtosis calculated at every point depends on the n − 1 points preceding it; this means that the maximum of the kurtosis function will not coincide with the P wave arrival time but will instead be located at a later time, as can be seen in figure 3.8. Moreover, the delay of the maximum of the kurtosis function with respect to the P onset depends strongly on the signal and thus cannot be predicted and corrected beforehand.

Figure 3.8: A portion of the seismic trace containing the P onset (above) and the respective kurtosis function (below); the red dot represents the picked onset, while the black dot corresponds to the maximum of the kurtosis function, which is late compared to the P wave arrival time.

To find the correct onset it is therefore best to consider the differences between consecutive values of the kurtosis function and to set a threshold on them to determine the arrival time.
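A minimal Python sketch of this idea follows (the thesis code is in Matlab): a moving-window kurtosis is computed and the pick is placed where its first difference exceeds a threshold. The window length and threshold are illustrative, not the thesis' tuned values.

```python
import numpy as np

def kurtosis_pick(trace, win=50, threshold=5.0):
    """Pick where the moving-window kurtosis first jumps by more than
    `threshold` between consecutive samples; returns None if no jump."""
    x = np.asarray(trace, float)
    kurt = np.full(len(x), np.nan)
    for i in range(win - 1, len(x)):
        seg = x[i - win + 1:i + 1]
        s = seg.std()
        if s > 0:
            kurt[i] = np.mean((seg - seg.mean()) ** 4) / s ** 4
    jumps = np.diff(kurt)                 # NaNs at the start compare False
    hits = np.where(jumps > threshold)[0]
    return int(hits[0]) + 1 if hits.size else None

# Synthetic trace: weak noise, then an impulsive arrival at sample 300.
t = np.arange(600)
trace = 0.01 * np.sin(0.7 * t)
trace[300:] += np.sin(0.3 * np.arange(300))
pick = kurtosis_pick(trace)
```

Using the difference of the kurtosis, rather than its maximum, avoids the signal-dependent delay discussed above.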

3.2.3 Weighting and final P onset determination

Once the times have been picked, the final P wave arrival time is determined by a weighted mean of the times picked by the two algorithms, the weight being the signal-to-noise ratio relative to each pick.

It must be noted that, since the whole process is automated, a P onset will be found for all traces, including those containing only noise (see figure 3.9). In general the times determined by the two algorithms are close to each other, so if the difference between the picks is too great, one or both of them must be incorrect. In the case of a trace which contains only noise, both picks will be incorrect; to eliminate these traces, the difference between the times picked by the two algorithms is taken into account and, if it is greater than a certain value, the trace is discarded.
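The combination rule can be sketched as follows (the thesis code is in Matlab); the 1 s consistency threshold is an illustrative value, as the thesis does not state the actual one.

```python
def combine_picks(t_aic, t_kur, snr_aic, snr_kur, max_diff=1.0):
    """Weighted mean of the two picks, with the signal-to-noise ratio of each
    pick as its weight; picks further apart than max_diff seconds are flagged
    as inconsistent and the trace is discarded (returns None)."""
    if abs(t_aic - t_kur) > max_diff:
        return None
    return (snr_aic * t_aic + snr_kur * t_kur) / (snr_aic + snr_kur)

# The pick with the higher SNR dominates the weighted mean:
t_p = combine_picks(12.30, 12.34, snr_aic=8.0, snr_kur=2.0)  # -> 12.308
```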


Figure 3.9: A trace containing only noise. The green line marks the time picked by the kurtosis picker, while the red line the one picked by the AIC picker. The times picked by the two algorithms are very different.

3.3 The localization algorithm

After the stations with incorrect picks have been discarded, it is possible to localize the event using the remaining P phases. The localization of a seismic event requires the knowledge of four variables: three spatial coordinates $p_0 = (x, y, z)$ and the origin time $t_0$. The time variable can be eliminated with the following reasoning: given an origin time $t_0$, the arrival time of the P wave at the i-th station will be

\[
t_i = t_0 + f(p_0, p_i),
\tag{3.6}
\]

where $p_i$ are the coordinates of station i and the function $f(p_0, p_i)$ represents the travel time from the origin to the i-th station.

For each station the arrival time is given by 3.6; if there are N stations, the sum of all the arrival times will be

\[
\sum_{i=1}^{N} t_i = \sum_{i=1}^{N} t_0 + \sum_{i=1}^{N} f(p_0, p_i).
\tag{3.7}
\]

If we divide 3.7 by N and subtract it from equation 3.6 we find

\[
t_i - \frac{1}{N}\sum_{i=1}^{N} t_i = t_0 - \frac{1}{N}\sum_{i=1}^{N} t_0 + f(p_0, p_i) - \frac{1}{N}\sum_{i=1}^{N} f(p_0, p_i),
\tag{3.8}
\]

where $t_0 - \frac{1}{N}\sum_{i=1}^{N} t_0 = 0$ since the origin time is the same for all stations. Then, for the i-th station the arrival time will be given by

\[
t_i' = f(p_0, p_i)',
\tag{3.9}
\]

where

\[
t_i' = t_i - \frac{1}{N}\sum_{i=1}^{N} t_i, \qquad
f(p_0, p_i)' = f(p_0, p_i) - \frac{1}{N}\sum_{i=1}^{N} f(p_0, p_i).
\tag{3.10}
\]

The use of this method to eliminate one of the unknown variables reduces the computing time; however, it also means that the algorithm can only localize the event spatially, while the origin time must be calculated separately.

The localization method used in this work is a grid search, which locates all the events inside a grid of user-defined dimensions. In this case the length and width of the grid are determined taking as endpoints the coordinates of the stations furthest away from each other, while the depth can be chosen arbitrarily. The spacing of the points inside the grid can also be chosen by the user according to the level of precision desired; a very tight grid, however, might result in extremely long computing times.

In the case of a homogeneous medium of velocity v, the travel time from the hypocenter of an earthquake to the i-th station can be easily calculated as

\[
t_i = \frac{\sqrt{(x_i - x_0)^2 + (y_i - y_0)^2 + (z_i - z_0)^2}}{v}.
\tag{3.11}
\]

In a grid search algorithm this travel time is calculated for each point inside the given grid to create the model space. Provided an appropriate velocity model, it is also possible to use a non-homogeneous medium; however, since computing all the travel times in a non-homogeneous model takes quite a long time, it is convenient to calculate the travel time matrix separately and then import it into the grid search function.

As discussed above, to avoid taking the origin time into consideration, the mean of the computed arrival times is subtracted from them at every point in the grid. The same procedure applies to the measured arrival times.

Once the model space has been created we want to know which model best fits the observed data. To compute the misfit function in the algorithm it is possible to choose whether to use the L1 or the L2 norm. If e is the error function, representing the difference between the observed and predicted data, the L2 norm is defined as

\[
\|e\|_2 = \left( \sum_{i=1}^{N} |e_i|^2 \right)^{\frac{1}{2}},
\tag{3.12}
\]

while the L1 norm is

\[
\|e\|_1 = \sum_{i=1}^{N} |e_i|.
\tag{3.13}
\]

In order to find the best fit between the model and the observed data it is necessary to find the minimum of the selected norm, that is, the point $p_0$ that minimizes $\|e\|_2$ or $\|e\|_1$. The convenience of a grid search algorithm lies in the fact that it is not necessary to actually invert the data, because the grid and the velocity model allow us to create a model space that we only need to compare to our data. This is especially convenient with regard to the use of the L1 norm, whose error function is not differentiable at every point (in particular, the function is not differentiable wherever the predicted data equal the observed data).

In this case, given the travel times $t_c$ for every point in the grid and the observed travel times $t_{obs}$, the error function is simply $e = t_c - t_{obs}$. The best fitting model is then the one that minimizes $|t_c - t_{obs}|$ (in the case of the L1 norm) or $|t_c - t_{obs}|^2$ (in the case of the L2 norm). The point in the grid whose travel time minimizes the norm function is considered the hypocenter. This obviously means that the error in the localization will be at least as big as the spacing of the grid.

If, along with the P wave arrivals, the S wave arrival times are also known, it is possible to use them as well in the localization algorithm. In that case, if $tp_c$ and $tp_{obs}$ are the calculated and observed P arrival times and $ts_c$ and $ts_{obs}$ the calculated and observed S arrival times, the function to minimize will be $|tp_c - tp_{obs}| + |ts_c - ts_{obs}|$ in the case of the L1 norm and $|tp_c - tp_{obs}|^2 + |ts_c - ts_{obs}|^2$ in the case of the L2 norm.


(a) First localization attempt.


(c) Third localization attempt.

Figure 3.9: Slices of the residual matrix corresponding to the minimum along the Z axis; the small triangles in the plots are the stations used for the localization.

It is possible to choose which norm to use for every localization attempt. In this code the localization is attempted three times. The first time the L1 norm is used because it is less sensitive to the presence of outliers; after the outliers have been removed, a second localization is attempted using the L2 norm. This second localization is important to cut an appropriate window for the picking of the S wave arrival times; in fact, among the outputs of the localization algorithm there are also the theoretical P and S travel times relative to the hypocenter, which are used to determine the length of the window where we expect to find the S onsets. The third and final localization, also performed using the L2 norm, is only carried out once both P and S arrival times are available. Figure 3.9 shows the slices of the residual matrix corresponding to the minimum along the z axis; the pictures refer to the three localizations performed during the execution of the program.


3.3.1 Discarding outliers

Ideally, if the velocity model used is exact and the picks are all correct, we should have a perfect match between the observed data and the model. Since this is never the case, after a localization attempt each station will have a residual showing how much the observed time differs from the one calculated by the model. Assuming the model to be correct, a station whose residual is too big compared to the others must have an incorrect pick and needs to be discarded to find a better solution.

Figure 3.9 shows the residuals for the stations used in every localization attempt; in the plot referring to the first localization attempt, the presence of outliers is particularly evident. To discard the stations whose residuals r are too high compared to the others, the mean m and standard deviation σ of the residuals are calculated. For the i-th station:

• if m − σ < r_i < m + σ the station is not discarded;

• if r_i < m − σ or r_i > m + σ the station is discarded.

In figure 3.10 the residuals relative to each station used in the localization are plotted; the points outside the green lines are those that are going to be discarded.
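The two-sided rejection rule above amounts to a one-line mask; a NumPy sketch (the thesis code is in Matlab):

```python
import numpy as np

def keep_stations(residuals):
    """Keep stations whose residual lies strictly within one standard
    deviation of the mean residual, as in section 3.3.1."""
    r = np.asarray(residuals, float)
    m, s = r.mean(), r.std()
    return (r > m - s) & (r < m + s)   # boolean mask: True = keep

# Example: the fourth station (residual 5.00 s) is flagged as an outlier.
mask = keep_stations([0.10, -0.20, 0.05, 5.00, 0.00])
```

Note that a single large outlier inflates both the mean and the standard deviation, which is why the first localization uses the more robust L1 norm before this test is applied.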


(a) Residuals after the first localization attempt


(c) Residuals after the third localization attempt.

Figure 3.9: Residuals after each localization attempt. On the x axis are the observed times minus their mean, while on the y axis are the computed times minus their mean; the residuals are given by the difference between these two quantities. The line going from the bottom left to the top right of each plot locates the points with zero residual, so the closer a point is to the line, the smaller the residual of the station it represents (next to every point is the name of the corresponding station). In the third localization the blue dots are the residuals for P waves and the red dots the residuals for S waves.

While after the first localization the algorithm always looks for and discards outliers, after the second localization it is possible to decide whether to discard more stations or not. For example, if the recordings of an event are available from only a very small number of stations, it might be better to avoid discarding too many of them, especially if the quality of the recordings is not high. In any case, if after discarding the outliers for the second time fewer than four stations remain, the algorithm automatically restores the discarded stations.

3.4 S phase picking

To find S wave arrival times through polarization filtering we need not only the Z component of the traces, but the N and E components as well.

(a) Distribution of residuals after the first localization attempt.

(b) Distribution of residuals after the second localization attempt.

Figure 3.10: Plots of the residuals relative to the stations used for the first and second localization respectively. The blue line is the mean value and the green lines mark one standard deviation on either side of it; all the stations whose residuals fall outside the green lines are discarded.

The first step is to cut an appropriate window in which to find the S onset: the starting point of the window needs to be located after the P wave arrival time, otherwise the algorithm may pick the P onset again instead of the actual S onset. Moreover, the window must be tailored to account for the different distances of the stations from the hypocenter: the further a station is from the hypocenter of an earthquake, the wider the time gap between the P and S arrivals will be. Thus, to determine the length of the window, the difference ∆t between the computed P and S arrival times is calculated for every trace:

• if ∆t < 2 s, the length of the window is 5 s;
• if 2 s < ∆t < 4 s, the length of the window is 7.5 s;
• if ∆t > 4 s, the length of the window is 10 s.

Figure 3.10 shows examples of windows of different lengths depending on the time difference between the P and S arrival times.
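The window-length rule above is a simple lookup; in code it could read as follows (the boundary cases ∆t exactly 2 s or 4 s are assigned here by convention, since the rule leaves them unspecified):

```python
def s_window_length(dt_ps):
    """Window length (s) for the S-phase search, from the computed
    P-S time difference dt_ps (s)."""
    if dt_ps < 2.0:
        return 5.0
    if dt_ps <= 4.0:   # boundary values folded into the middle case
        return 7.5
    return 10.0
```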


(b) Windowing of 7.5 seconds.

(c) Windowing of 10 seconds.

Figure 3.10: The three components of the seismic trace recorded by a station; in red, the portion of the trace cut by the first windowing.

After the length of the window has been defined, all the components of the trace (Z, N and E) are cut at the same points in time. It must be noted that all the components of the traces should be at least $t_p + \Delta t_{win}$ long, where $t_p$ is the P wave arrival time and $\Delta t_{win}$ is the length of the window. If this is not the case for all the components of a trace, the window length is adjusted so that its end point coincides with the end point of the trace; if, however, this requirement is met only by one or two components of a trace, then the components of that trace have different lengths and the trace is discarded.

This first windowing provides only a rough idea of the position of the S onset. To restrict the search to a smaller portion of the trace it is useful to exploit the concept of the length of a curve seen in section 2.4; as a matter of fact, we expect the length of a seismic trace to increase upon the arrival of the S waves, especially on the N and E components, on which the change in amplitude is more apparent. This is the reason why only these components are used to calculate the length of the trace.

As already seen in section 2.4, to calculate the length of a curve it is first of all necessary to choose a step size ∆l. To every step corresponds a segment that approximates the curve in that interval; summing n steps we obtain the length of a portion of the curve of length n·∆l.

Since we want to compare the lengths of consecutive portions of the seismic trace in order to detect the S wave arrival, the windows are further divided into smaller segments, and the length of the seismic trace is calculated for each segment of the window. The lengths of consecutive segments are then compared to find the portion of the trace that contains the S wave arrival. Once the segment has been determined, its center is taken as the S onset. Since this procedure is applied to both the N and E components of the seismic trace, it is possible to find different S wave arrival times for the two components; in this case the S onset is taken to be the mean of the two values. Figure 3.11 shows the N and E components of a seismic trace (the second trace of fig. 3.10) and their respective lengths, as well as the S onset determined using this technique.
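The curve-length refinement can be sketched as follows (the thesis code is in Matlab; the step size, segment count and synthetic trace are illustrative):

```python
import numpy as np

def segment_lengths(trace, dt, n_seg):
    """Approximate curve length of each of n_seg consecutive portions of the
    trace, summing sqrt(dt^2 + dy^2) over consecutive samples (section 2.4)."""
    lengths = []
    for seg in np.array_split(np.asarray(trace, float), n_seg):
        dy = np.diff(seg)
        lengths.append(np.sum(np.sqrt(dt ** 2 + dy ** 2)))
    return np.array(lengths)

def s_onset_from_length(trace, dt, n_seg):
    """S onset estimate: center of the segment whose length increases most
    with respect to the preceding one."""
    L = segment_lengths(trace, dt, n_seg)
    k = int(np.argmax(np.diff(L))) + 1     # segment where the length jumps
    seg_len = len(trace) // n_seg
    return k * seg_len + seg_len // 2      # center sample of that segment

# Synthetic horizontal component: weak coda, stronger S arrival at sample 500.
t = np.arange(1000)
x = 0.05 * np.sin(0.2 * t)
x[500:] += 0.8 * np.sin(0.4 * t[500:])
onset = s_onset_from_length(x, dt=0.01, n_seg=10)
```

In the thesis this estimate is computed on both the N and E components and the two results are averaged.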

Once we have determined the position of the S onset in the seismic trace with more accuracy, a new windowing cuts all the components of the seismic trace around the S wave arrival time found with the above procedure; the window is symmetric with respect to this time. As mentioned above, the first window was cut in a way that prevented it from containing the P wave arrival time; cutting a new window, however, there is the possibility that the S onset taken as the center of the window is so close to the P onset that the new window will actually contain the P onset as well. If this happens the length of the window is reduced accordingly for all the components of the seismic trace before proceeding with the next step.

(a) Above: N component of the seismic trace. Below: length of the trace.

(b) Above: E component of the seismic trace. Below: length of the trace.

Figure 3.11: Cuts of the N and E components of the seismic trace and their lengths. The length of the trace increases upon the arrival of the S waves; the red dot on the seismic trace represents the S onset found making use of the length of the curve.

Figure 3.12: Above: the N and E components of a seismic trace, on the left and on the right respectively; the red dots represent the P onsets and the green dots the S onsets. Below: four parameters extracted using polarization filtering; they are, from left to right, the sum of the eigenvalues, which is the parameter used to determine the S onset in this algorithm, the rectilinearity, the dip and the azimuth.

The definitive S onset is determined using the polarization filtering technique. This technique requires all three components of the seismic trace to build the covariance matrix from which a certain number of parameters can be extracted, as already seen in section 2.5. Figure 3.12 shows the N and E components of a seismic trace with some parameters determined using polarization filtering. The parameter that gives the best result with the developed algorithm is the sum of the eigenvalues of the covariance matrix, which increases strongly upon the arrival of the S waves.

In particular this algorithm works as follows:

1. a moving window of fixed length spans the selected cuts of the seismic trace's components, advancing one point at a time;

2. every time the window moves forward along the trace the covariance matrix is calculated; from the matrix we extract the eigenvalues, whose sum is the parameter used to determine the S onset;

3. the result is a function representing the sum of the eigenvalues of the covariance matrix along the selected portion of the seismic trace. The S onset is determined by differentiating this function and then using the AIC picker to find the exact time on the seismic trace.
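Steps 1 and 2 above can be sketched in NumPy (the thesis code is in Matlab). Since the sum of the eigenvalues of a covariance matrix equals its trace, i.e. the sum of the three component variances, the eigenvalue decomposition is shown here mainly for illustration; the window length, synthetic trace and detection threshold are all illustrative choices.

```python
import numpy as np

def eigsum_function(z, n, e, win=50):
    """Sum of the eigenvalues of the three-component covariance matrix,
    computed on a moving window advancing one sample at a time."""
    comps = np.vstack([z, n, e]).astype(float)
    out = np.zeros(comps.shape[1])
    for i in range(win - 1, comps.shape[1]):
        c = np.cov(comps[:, i - win + 1:i + 1])   # 3x3 covariance matrix
        out[i] = np.linalg.eigvalsh(c).sum()      # equals np.trace(c)
    return out

# Synthetic three-component trace with an S-like amplitude jump at sample 400.
t = np.arange(800)
z = 0.05 * np.sin(0.30 * t)
n = 0.05 * np.sin(0.31 * t)
e = 0.05 * np.sin(0.29 * t)
n[400:] += np.sin(0.40 * t[400:])
e[400:] += np.sin(0.45 * t[400:])
es = eigsum_function(z, n, e)
onset_idx = int(np.argmax(es > 0.1))   # first window containing S energy
```

The eigenvalue-sum curve stays near the noise variance before the arrival and rises sharply once the window begins to include S energy, which is the behaviour exploited in step 3.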

The S picks are weighted as well. The window used for the picking of the S waves is divided in two parts by the S onset; ideally the part of the signal to the right of that point contains only S waves, while to the left there should be no S wave contribution to the signal. Considering this, the ratio between the two parts of the window is used as the weight for the S wave arrival time picks.

3.4.1 Last localization and program output

At this point, we have both P and S arrival times at our disposal and a last localization run is attempted; from this final run of the localization algorithm we obtain the definitive position of the hypocenter.

The localization algorithm only provides the coordinates of the hypocenter, but we still need to locate the event in time. To do so, the distance of each station from the hypocenter is calculated and the stations are ordered from closest to furthest. If we plot the arrival times at every station against their distance from the hypocenter, the points lie along a line whose intersection with the y axis represents the time when the event took place (figure 3.13).
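The intercept of that line can be obtained with an ordinary least-squares fit; a NumPy sketch (the thesis code is in Matlab; station distances and times below are synthetic):

```python
import numpy as np

def origin_time(distances, arrival_times):
    """Least-squares line through the (distance, arrival time) points;
    the intercept at zero distance is the origin time of the event."""
    slope, intercept = np.polyfit(distances, arrival_times, 1)
    return intercept

# Synthetic P arrivals: t = t0 + d/v with t0 = 100 s and v = 5.5 km/s.
d = np.array([3.0, 7.0, 12.0, 20.0, 31.0])
t_arr = 100.0 + d / 5.5
t0 = origin_time(d, t_arr)
```

As a side benefit, the slope of the fitted line is the inverse of the apparent propagation velocity, so the fit also gives a rough consistency check on the velocity model.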

After the time of the event has been determined the program ends and an output is produced; the output file contains not only the coordinates of the hypocenter and the time of the event, but also the picked P and S phases along with their errors. The phases can also be used to localize the event using other, more sophisticated algorithms.


Figure 3.13: P wave arrival times (black) and S wave arrival times (red) picked by the algorithm, ordered by distance from the hypocenter. The lines are interpolations of the points and their intersection with the y axis is the time when the event took place.


Chapter 4

Results

4.1 The picker

The algorithm produced particularly good results in the identification of P wave arrival times. The following histograms show the differences between manual and automatic picks for both the AIC and the kurtosis-based picker; the algorithm has been tested on 2373 seismograms, but the data presented only contain those picks whose difference from the manual picks is less than 0.3 s in absolute value, which reduces the number of samples by approximately six hundred.

The largest differences in the picked times may be due to different reasons, the most frequent being:

1. especially in traces with a very low level of noise, the presence of a disturbance right before the P wave arrival can cause the algorithm to pick an earlier time instead of the real P wave arrival time as the P onset. Conversely, in traces where the level of noise is high compared to the amplitude of the P waves, the algorithm is prone to pick a later time with respect to the real P onset;

2. the algorithm works on cut traces, so, in the eventuality of multiple events happening very close to one another, it is possible for a cut to contain not just one event but two or more of them. Manual picks in this case refer to the first event of the series; however, it is not possible to determine whether the times picked by the algorithm refer to the first event or to a following one, resulting in time differences of up to tens of seconds.


Figure 4.1: Differences between manual and automatic AIC-based picks. Each of the 38 bins in the histogram has a width of 0.016 s, which is twice the sampling interval of the analysed traces. Most of the automatic picks are within two samples of the manual picks.

The AIC picker, as can be seen in figure 4.1, has an approximately symmetric distribution; the kurtosis-based picker, on the other hand, is not symmetrical. Looking at figure 4.2, the distribution has a single peak, in correspondence of the bin containing the picks that differ from the manual picks by between zero and 0.016 s; the shape of the distribution, however, suggests that the times picked using this method are either very close to the times picked manually or tend to be late compared to them; in fact the total number of picks that are early compared to the manual picks is approximately the same as the total number of picks that are late.

S picks are more difficult to identify correctly, even manually. Out of all the seismograms analyzed by the algorithm, only half of those having a manually picked P onset also had a manually picked S onset. Among those, the histogram in figure 4.3 represents the picks whose difference from the manual picks is less than 0.4 s in absolute value.

The distribution is not centered at zero but appears to be shifted towards the right; moreover, it presents a second peak between 0.3 and 0.4 s.

Figure 4.2: Differences between manual and automatic kurtosis-based picks. Each of the 38 bins in the histogram has a width of 0.016 s, which is twice the sampling interval of the analysed traces. The majority of the picks lie between zero and 0.016 s from the manual picks, but it is apparent that most of the picks with a larger time difference tend to be late compared to the manual picks.

The differences between S wave arrival times picked manually and automatically are bigger than those relative to the P wave arrival times. The determination of the S onset in the algorithm in fact depends on various factors that influence the precision of the pick itself:

• errors in the picked P wave arrival times;
• an incorrect velocity model;
• the presence of the P coda and other seismic phases.

The P onsets and the velocity model are both important factors in the determination of the window used to find the S onset in each seismic trace: an error in the position or in the length of the window may result in an error in the S onset. The contamination of the S waves with the P coda, and possibly other seismic phases, instead influences the position of the S pick within the designated window; the second peak in the distribution, for example, is probably due to the picking of another phase arriving at the station before the S phase.

Figure 4.3: Differences between manual and automatic S picks. Each of the 36 bins in the histogram has a width of 0.024 s, which is three times the sampling interval of the analysed traces. The picks are, for the most part, gathered in the positive region of the histogram, meaning that the S onsets found by the automatic picker tend to be early compared to the manual ones.

4.2 The localization algorithm

The program has been tested on 1513 events that had already been located, in order to compare the results; the existing locations were obtained using the program Hypoellipse. The area covered by the seismic network that recorded the events (figure 4.4) is the area surrounding Larderello, a region affected by geothermal phenomena.

The spacing of the grid used by the localization algorithm is one kilometer on the horizontal plane and half a kilometer along the vertical direction; this spacing defines the maximum level of precision that is theoretically possible to achieve with this algorithm. Moreover, as already noted in 3.3, the size of the grid is determined by the coordinates of the stations furthest away from each other on the horizontal plane, and it extends for 15 km along the vertical axis. This means that the algorithm cannot correctly locate events


Figure 4.4: Map of the area in which the seismic network used to test the algorithm is located. The shaded region in the map represents the maximum area covered by the grid search algorithm.

Table 4.1: Velocity model of the area surrounding the seismic network, used by the program Hypoellipse to locate seismic events.

Layer   Velocity (km/s)   Depth (km)   Thickness (km)   vP/vS
  1         4.300            0.000          4.000       1.800
  2         4.800            4.000          0.200       1.800
  3         5.000            4.200          0.400       1.800
  4         5.300            4.600          1.400       1.800
  5         5.500            6.000          1.500       1.800
  6         5.700            7.500          2.500       1.800
  7         6.000           10.000         30.000       1.800
  8         7.500           40.000       1000.000       1.800

that happen outside the designated grid; in this case 324 out of 1513 events are located outside the grid.

The velocity model used by the grid search algorithm is a one-dimensional model with P wave velocity vP = 5.5 km/s and velocity ratio vP/vS = 1.8.

The velocity model used by the program Hypoellipse, on the contrary, is a two-dimensional layered velocity model (table 4.1). This velocity model is an average model of the whole area surrounding the seismic network and was obtained by inverting the travel times of a subset of 200 events considered


to be particularly significant.

The results of the localizations obtained with the grid search algorithm are shown in figures 4.5, 4.6 and 4.7, where the plotted quantities are respectively the differences in epicentral location, depth and hypocentral location with respect to the original localization.

Figure 4.5: Differences in epicentral location between the localization performed by the grid search algorithm and the original localization.

Figure 4.6: Differences in depth between the localization performed by the grid search algorithm and the original localization.

The location of the hypocentre is affected by the errors in both the epicentral location and the depth of the earthquake, so the results are worse


Figure 4.7: Differences in hypocentral location between the localization performed by the grid search algorithm and the original localization.

than those obtained considering only the position of the epicentre: in fact, only 7% of the event locations fall within two kilometers of the original hypocentral location and 17% within two to four kilometers, against 27% and 24% respectively for the epicentral location. The results of the localization performed by the grid search algorithm are affected not only by the spacing and extension of the grid, but also by the different velocity model used to invert the data, since the original velocity model was a two-dimensional layered model.
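The relation between the two error measures is simple geometry: the hypocentral difference combines the epicentral offset and the depth error in quadrature, so it can only be larger than either. A minimal Python sketch (the helper name is mine, not from the thesis code):

```python
import math

def location_differences(loc_a, loc_b):
    """Differences between two hypocentres given as (x, y, z) in km,
    with z the depth: epicentral, depth and hypocentral offsets."""
    d_epi = math.hypot(loc_a[0] - loc_b[0], loc_a[1] - loc_b[1])
    d_depth = abs(loc_a[2] - loc_b[2])
    d_hypo = math.hypot(d_epi, d_depth)   # combined in quadrature
    return d_epi, d_depth, d_hypo
```

For example, an event mislocated by 1.5 km in epicentre and 2.0 km in depth ends up 2.5 km from the original hypocentre, which is why the hypocentral statistics are systematically worse than the epicentral ones.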

Another localization test has been made on the same dataset, this time using the P and S phases picked by the algorithm as input for the hypocenter location program Hypoellipse, with the same two-dimensional layered velocity model that had been used to localize the events using the manual picks. The results are shown in figures 4.8, 4.9 and 4.10, representing respectively the differences in epicentral distances, depths and hypocentral distances compared to the original localization, obtained using the manually picked P and S onset times.

The results obtained using the automatic picks but a different, more sophisticated localization method are definitely better than those obtained using the grid search algorithm implemented in the developed program: 45% of the epicentral locations and 24% of the hypocentral locations fall within two kilometers of the original locations, against the 27% and 7% obtained using the grid search algorithm. Since the localization of the


Figure 4.8: Differences in epicentral location between the localizations performed by the program Hypoellipse using the automatically picked phases and those using the manually picked phases.

Figure 4.9: Differences in depth between the localizations performed by the program Hypoellipse using the automatically picked phases and those using the manually picked phases.

events with the automatic picks was made using the same program and the same velocity model as the one used to locate the events with the manual picks, the differences in the localizations are to be attributed to the differences in the picked times. Moreover, the P and S times selected by the algorithm to locate the events are not necessarily the same as the manually picked P and S times, for two reasons:


Figure 4.10: Differences in hypocentral location between the localizations performed by the program Hypoellipse using the automatically picked phases and those using the manually picked phases.

1. in one or, possibly, two stages of the developed algorithm, some of the stations (those with the highest residuals with respect to the localization performed by the grid search algorithm) are excluded from the localization, meaning that their arrival times are not recorded and cannot be used in other programs;

2. the program Hypoellipse also chooses which arrival times to use depending on the residuals associated with each arrival time, so the P and S phases selected by the program to locate each event may not be the same as those used in the original localization.
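The exclusion mechanism in point 1 can be sketched as follows (a Python illustration; the number of discarded stations and the threshold used in the actual Matlab code may differ):

```python
def discard_outliers(residuals, max_exclude=2, threshold=0.5):
    """Return the indices of the picks to keep: drop up to `max_exclude`
    picks whose absolute residual exceeds `threshold` seconds,
    starting from the largest residual."""
    # station indices sorted by decreasing absolute residual
    order = sorted(range(len(residuals)),
                   key=lambda i: abs(residuals[i]), reverse=True)
    dropped = {i for i in order[:max_exclude]
               if abs(residuals[i]) > threshold}
    return [i for i in range(len(residuals)) if i not in dropped]
```

For instance, with residuals [0.1, -0.05, 1.2, 0.3, -0.9] s, the picks at indices 2 and 4 would be discarded and the remaining three used for the next localization step.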

The residual times for the P and S phases resulting from the localization with the program Hypoellipse using the times picked by the algorithm are plotted in figures 4.11 to 4.15; the first histogram shows the residuals relative to all the stations, while the other histograms show the results for each station.

We expect to find a distribution of the residuals that is approximately normal and centered at zero for both P and S phases. This is not exactly the case, as can be seen in figure 4.11: the distribution of the residuals relative to the P onsets is shifted towards the right, while the distribution of the residuals relative to the S onsets is slightly shifted to the left.


Figure 4.11: P and S residuals at all stations after the localization attempted using the program Hypoellipse and the automatic picks from the algorithm.

The residuals are the differences between the times picked on the seismograms and the times computed after the localization, so they can depend on errors in the picking, a locally incorrect velocity model, or a combination of the two. In particular, by looking at the residuals computed for each station it is possible to get feedback on the velocity model: if all the residuals are shifted to the left the velocity model might be too slow; vice versa, it might be too fast.
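This diagnostic can be expressed as a tiny check on the mean residual at a station (a sketch with an arbitrary tolerance, not part of the thesis code):

```python
def model_feedback(residuals, tol=0.05):
    """Interpret the mean residual (observed minus computed time, in s)
    at a station: a consistently negative mean suggests the computed
    travel times are too long (model too slow), a positive mean the
    opposite. `tol` is an arbitrary significance threshold in seconds."""
    mean = sum(residuals) / len(residuals)
    if mean < -tol:
        return "model may be too slow"
    if mean > tol:
        return "model may be too fast"
    return "no clear bias"
```

Applied per station, this gives a quick hint of where the average velocity model departs from the local structure, without replacing a proper travel-time inversion.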


Figure 4.12: P and S residuals at stations ‘la01’ (a), ‘la03’ (b), ‘la04’ (c) and ‘la05’ (d) after the localization attempted using the program Hypoellipse and the automatic picks from the algorithm.


Figure 4.13: P and S residuals at stations ‘la06’ (a), ‘la07’ (b), ‘la08’ (c) and ‘la09’ (d) after the localization attempted using the program Hypoellipse and the automatic picks from the algorithm.


Figure 4.14: P and S residuals at stations ‘la10’ (a), ‘la11’ (b), ‘la12’ (c) and ‘trif’ (d) after the localization attempted using the program Hypoellipse and the automatic picks from the algorithm.
