Digital sound processing Digital sound processing

(1)

Digital sound processing Digital sound processing

Convolution Convolution Digital Filters Digital Filters

FFT FFT

Fast Convolution

(2)

Sampling Sampling

 Sampling an electrical signal means capturing its value Sampling an electrical signal means capturing its value repeatedly at a constant, very fast rate.

repeatedly at a constant, very fast rate.

 Sampling frequency (f Sampling frequency (f

_s_s

) is defined as the number of samples ) is defined as the number of samples captured per second

captured per second

 The sampled value is known with finite precision, given by the The sampled value is known with finite precision, given by the

“number of bits” of the analog-to-digital converter, which is

“number of bits” of the analog-to-digital converter, which is limited (typically ranging between 16 and 24 bits)

limited (typically ranging between 16 and 24 bits)

 Of consequence, in a time-amplitude chart, the Of consequence, in a time-amplitude chart, the analog waveform is approximated by a sequence analog waveform is approximated by a sequence

of points, which lye in the knots of a lattice, as of points, which lye in the knots of a lattice, as

both time and amplitude are integer multiplies of both time and amplitude are integer multiplies of

small “sampling units” of time and amplitude

(3)

Time/frequency discretization Time/frequency discretization



VV

Analog signal (true) Analog signal (true) Digital signal (sampled) Digital signal (sampled)

(4)

Fidelity of sampled signals Fidelity of sampled signals

 Can a sampled digital signal represent faithfully the original analog one?

 YES, but only if the following “Shannon theorem”

is true:

“Sampling frequency must be at least twice of the largest frequency in the signal being

sampled”

A frequency equal to half the sampling frequency is named the “Nyquist frequency”– for avoiding the presence of signals at frequencies higher than the Nyquist’s one, an

analog low-pass filter is inserted before the sampler.

It is called an “anti Aliasing” filter.

(5)

Common cases Common cases

 CD audio – fCD audio – fss = 44.1 kHz – discretization = 16 bit = 44.1 kHz – discretization = 16 bit

Nyquist frequency is 22.05 kHz, the anti-aliasing starts at 20 kHz, so that at Nyquist frequency is 22.05 kHz, the anti-aliasing starts at 20 kHz, so that at 22.05 kHz the signal is already attenuated by at least 80 dB. Hence the filter 22.05 kHz the signal is already attenuated by at least 80 dB. Hence the filter is very steep, causing a lot of artifacts in time domain (ringing, etc.)

is very steep, causing a lot of artifacts in time domain (ringing, etc.)

 DAT recorder – fs = 48 kHz – discretization = 16 bitDAT recorder – fs = 48 kHz – discretization = 16 bit

Nyquist frequency is 24 kHz, the anti-aliasing starts at 20 kHz, so that at 24 Nyquist frequency is 24 kHz, the anti-aliasing starts at 20 kHz, so that at 24 kHz the signal is already attenuated by at least 80 dB. Now the filter is less kHz the signal is already attenuated by at least 80 dB. Now the filter is less steep, and the time-domain artifacts are almost gone.

steep, and the time-domain artifacts are almost gone.

 DVD Audio – fDVD Audio – f_s_s = 96 kHz – discretization = 24 bit = 96 kHz – discretization = 24 bit

Nyquist frequency is 48 kHz, but the anti-aliasing starts around 24 kHz, with Nyquist frequency is 48 kHz, but the anti-aliasing starts around 24 kHz, with a very gentle slope, so that at 48 kHz the signal is attenuated by more than a very gentle slope, so that at 48 kHz the signal is attenuated by more than

120 dB.

 Such a gentle filter is very “short” in time domain, hence there are virtually no Such a gentle filter is very “short” in time domain, hence there are virtually no time-domain artifacts.

time-domain artifacts.

(6)

Impulse Response Impulse Response

System System under test under test Unit pulse

Unit pulse  System’s impulse responseSystem’s impulse response Time of flight

Time of flight Direct sound Direct sound

Early reflections Early reflections Reverberant tail Reverberant tail

(7)

A simple linear system A simple linear system

CD player

CD player AmplifierAmplifier LoudspeakerLoudspeaker MicrophoneMicrophone Real-world system (one input, one output) Real-world system (one input, one output)

Block diagram Block diagram

x(x()) h(h()) y(y())

Input signal

Input signal System’s System’s Impulse Impulse Response Response (Transfer function) (Transfer function)

Output signal Output signal

“SYSTEM”“SYSTEM”

Analyzer Analyzer

(8)

FIR Filtering (Finite Impulse Response) FIR Filtering (Finite Impulse Response)

) i

( x )

(

x      h (  )  h ( i    ) y (  )  y ( i    )

The effect of the linear system h on the signal x passing through it is The effect of the linear system h on the signal x passing through it is described by the mathematical operation called “convolution”, defined by:

described by the mathematical operation called “convolution”, defined by:

 ⁱ ^j    ^h ^j

x )

i ( y

1 N

0 j





 

^



This “sum of products” is also called FIR filtering, and models This “sum of products” is also called FIR filtering, and models

accurately any kind of linear systems.

This is usually written, in compact notation, as:

  ⁱ ^h   ^j

x )

i (

y  

“convolution” operator“convolution” operator

(9)

IIR Filtering (Infinite Impulse Response) IIR Filtering (Infinite Impulse Response)

) i

( x )

(

x      y (  )  y ( i    )

Alternatively, the filtering caused by a linear system can also be described Alternatively, the filtering caused by a linear system can also be described

by the following recursive formula:

 ⁱ ^j    ^a ^j ^y  ⁱ ^j    ^b ^j

x )

i (

y ^N ¹

1 j 1

N 0 j











   ^







• In practice, the filter is computed not only from the input samples x, but also as a In practice, the filter is computed not only from the input samples x, but also as a function of the output samples y, obtained at the previous time steps.

function of the output samples y, obtained at the previous time steps.

• In many cases, this method allows for representinging faithfully the behaviour of the In many cases, this method allows for representinging faithfully the behaviour of the system with a much smaller number of coefficients than when employing FIR

system with a much smaller number of coefficients than when employing FIR filtering.

filtering.

• However, modern algorithms on fast computers make However, modern algorithms on fast computers make FIR filtering preferable and even faster

FIR filtering preferable and even faster

 



 



) j ( b

)

j

(

a

(10)

The FFT Algorithm The FFT Algorithm

 The Fast Fourier Transform (FFT) is often employed in Acoustics, with The Fast Fourier Transform (FFT) is often employed in Acoustics, with two goals:

two goals:

►Performing spectral analysis with constant bandwidth

►Fast FIR filtering

 FFT transforms a segment of time-domain data in the corresponding FFT transforms a segment of time-domain data in the corresponding spectrum, with constant frequency resolution, starting at 0 Hz (DC) up to spectrum, with constant frequency resolution, starting at 0 Hz (DC) up to Nyquist frequency (which is half of the sampling frequency)

Nyquist frequency (which is half of the sampling frequency)

 The longer the time segment, the narrower will be the frequency The longer the time segment, the narrower will be the frequency resolution:

resolution:

[N sampled points in time] =

> >

[N/2+1 frequency bands] [N/2+1 frequency bands]

(the +1 represents the band at frequency 0 Hz, that is the DC component (the +1 represents the band at frequency 0 Hz, that is the DC component – but in acoustics, this is always with zero energy…)

– but in acoustics, this is always with zero energy…)

(11)

The FFT Algorithm The FFT Algorithm

The number of points in the time block must be a power of 2 – for example:

4096, 8192, 16384, etc.

Time signal (64 points) Time signal (64 points)

FFTFFT

Frequency spectrum Frequency spectrum

(32 bands + DC) (32 bands + DC)

IFFTIFFT

The inverse transform is also possible The inverse transform is also possible

(from frequency to time) (from frequency to time)

(12)

Complex spectrum, autospectrum Complex spectrum, autospectrum

 FFT yields a complex spectrum, at every frequency we get a value made of FFT yields a complex spectrum, at every frequency we get a value made of a real and an imaginary parts (Pr, Pi), or, equivalently, by modulus and

a real and an imaginary parts (Pr, Pi), or, equivalently, by modulus and phase

phase

 In many cases the phase is considered meaningless,and only the In many cases the phase is considered meaningless,and only the magnitude of the spectrum is plotted in dB:

magnitude of the spectrum is plotted in dB:

         

 





 



 



 

 



 



 







o2 2 10

o i 2 r 2

10 p

2 / N 2

1 0 N

4 3 2 1

p

f ' P f

log P p 10

f P f

log P 10

f L

} P

,...

P , P , P { ]

FFT [

} p ,..., p

, p , p , p {

• The second version of the formula contains the definition of the The second version of the formula contains the definition of the

Autospectrum, that is the product, at every frequency, of the spectral Autospectrum, that is the product, at every frequency, of the spectral complex number P(f) with its complex conjugate P’(f)

complex number P(f) with its complex conjugate P’(f)

(13)

Complex spectrum, autospectrum Complex spectrum, autospectrum

In other cases, also In other cases, also

the phase the phase

information is information is

relevant, and is relevant, and is

charted separately charted separately

(mainly when the (mainly when the

FFT is applied to an FFT is applied to an

impulse response).

(14)

“ “ leakage” and “windows” leakage” and “windows”

 One of the assumptions of Fourier analysis is that the time-segment One of the assumptions of Fourier analysis is that the time-segment analysed represents a complete period of a periodic waveform

analysed represents a complete period of a periodic waveform

 This is generally UNTRUE: the imperfect connection of the end of a This is generally UNTRUE: the imperfect connection of the end of a segment with the beginning of the next one (identical, as the signal is segment with the beginning of the next one (identical, as the signal is assumed to be periodic), causes a “click”, which produces a wide-band assumed to be periodic), causes a “click”, which produces a wide-band

“white noise”, contaminating the whole spectrum (“leakage”):

Theoretical Theoretical

spectrum spectrum

Leakage Leakage

(15)

“ “ leakage” and “windows” leakage” and “windows”

 If we want to analyze a generic, aperiodic signal, we need to “window” the If we want to analyze a generic, aperiodic signal, we need to “window” the signal inside the block being analyzed, bringing it to zero at both ends

signal inside the block being analyzed, bringing it to zero at both ends

 To this purpose, many differnet types of “windows” are used, named To this purpose, many differnet types of “windows” are used, named

“Hanning”, “Hamming”, “Blackmann”, “Kaizer”, “Bartlett”, “Parzen”, etc.

(16)

Window overlapping Window overlapping

 The problem is that events occurring near the ends of two adjacent blocks The problem is that events occurring near the ends of two adjacent blocks are substantially not analyzed

are substantially not analyzed

 To avoid this loss of information, instead of shifting the analysis window by To avoid this loss of information, instead of shifting the analysis window by one whole block, we need to analyze partially-overlapped blocks, with at one whole block, we need to analyze partially-overlapped blocks, with at least 50% overlapping, usually overalpped at 75% or even more

least 50% overlapping, usually overalpped at 75% or even more

Block 1

Block 1 WindowWindow FFTFFT

Block 2

Block 3

(17)

1717

Averaging, waterfall, spectrogram Averaging, waterfall, spectrogram

 Once a sequence of FFT spectra is obtained, we can average them either Once a sequence of FFT spectra is obtained, we can average them either exponentially (Fast, Slow) o linearly (Leq), emulating a SLM

exponentially (Fast, Slow) o linearly (Leq), emulating a SLM

 Alternatively, we can visualize how the spectrum changes over time, by two Alternatively, we can visualize how the spectrum changes over time, by two graphical representations called “waterfall” and “spectrogram” (or “sonogram”) graphical representations called “waterfall” and “spectrogram” (or “sonogram”)

(18)

Convolution is signifcantly faster if performed in Convolution is signifcantly faster if performed in

frequency domain:

Problems

Problems • The whole lenght of signal must be recorded before being processedThe whole lenght of signal must be recorded before being processed

• if N is large, a lot of memory is required.if N is large, a lot of memory is required.

x(n)x(n) FFTFFT X(k)X(k)

X(k)

X(k)  H(k) H(k)

Y(k)Y(k)

y(n)

y(n) IFFTIFFT

x(n)

x(n)  h(n) h(n)

y(n)y(n)

Solution

Solution • “ “Overlap & Save”AlgorithmOverlap & Save”Algorithm

Fast FIR filtering with FFT

(19)

FFTFFT N-pointN-point

xx IFFTIFFT Xm(k)H(k) Xm(k)H(k)

Select last Select last N – Q + 1 N – Q + 1 samples samples

Append to Append to

y(n)y(n)

x x

_m_m

(n) (n)

h(n) h(n)

Convoluzione veloce FFT con Overlap & Save (Oppenheim &

Shafer, 1975):

Problems

Problems • Excessive processing latency between input and outputExcessive processing latency between input and output

• If N is large, a lot of memory is still requiredIf N is large, a lot of memory is still required Solution

Solution • “ “uniformly-partitioned Overlap & Save”uniformly-partitioned Overlap & Save”

Overlap & Save

(20)

Filter’s impulse Filter’s impulse response

response h(n) h(n) is also is also partitioned in a partitioned in a number of blocks number of blocks

Now latency and Now latency and memory occupation memory occupation reduce to the length reduce to the length

of one single block of one single block

11^st^st block block 22^nd^nd block block 33^rd^rd block block 44^th^th block block

Uniformly Partitioned Overlap & Save

Digital sound processing Digital sound processing