Digital sound processing Digital sound processing
Convolution Convolution Digital Filters Digital Filters
FFT FFT
Fast Convolution
Fast Convolution
Sampling Sampling
Sampling an electrical signal means capturing its value Sampling an electrical signal means capturing its value repeatedly at a constant, very fast rate.
repeatedly at a constant, very fast rate.
Sampling frequency (f Sampling frequency (f
ss) is defined as the number of samples ) is defined as the number of samples captured per second
captured per second
The sampled value is known with finite precision, given by the The sampled value is known with finite precision, given by the
“number of bits” of the analog-to-digital converter, which is
“number of bits” of the analog-to-digital converter, which is limited (typically ranging between 16 and 24 bits)
limited (typically ranging between 16 and 24 bits)
Of consequence, in a time-amplitude chart, the Of consequence, in a time-amplitude chart, the analog waveform is approximated by a sequence analog waveform is approximated by a sequence
of points, which lye in the knots of a lattice, as of points, which lye in the knots of a lattice, as
both time and amplitude are integer multiplies of both time and amplitude are integer multiplies of
small “sampling units” of time and amplitude
small “sampling units” of time and amplitude
Time/frequency discretization Time/frequency discretization
VV
Analog signal (true) Analog signal (true) Digital signal (sampled) Digital signal (sampled)
Fidelity of sampled signals Fidelity of sampled signals
Can a sampled digital signal represent faithfully the original analog one?
YES, but only if the following “Shannon theorem”
is true:
“Sampling frequency must be at least twice of the largest frequency in the signal being
sampled”
A frequency equal to half the sampling frequency is named the “Nyquist frequency”– for avoiding the presence of signals at frequencies higher than the Nyquist’s one, an
analog low-pass filter is inserted before the sampler.
It is called an “anti Aliasing” filter.
Common cases Common cases
CD audio – fCD audio – fss = 44.1 kHz – discretization = 16 bit = 44.1 kHz – discretization = 16 bit
Nyquist frequency is 22.05 kHz, the anti-aliasing starts at 20 kHz, so that at Nyquist frequency is 22.05 kHz, the anti-aliasing starts at 20 kHz, so that at 22.05 kHz the signal is already attenuated by at least 80 dB. Hence the filter 22.05 kHz the signal is already attenuated by at least 80 dB. Hence the filter is very steep, causing a lot of artifacts in time domain (ringing, etc.)
is very steep, causing a lot of artifacts in time domain (ringing, etc.)
DAT recorder – fs = 48 kHz – discretization = 16 bitDAT recorder – fs = 48 kHz – discretization = 16 bit
Nyquist frequency is 24 kHz, the anti-aliasing starts at 20 kHz, so that at 24 Nyquist frequency is 24 kHz, the anti-aliasing starts at 20 kHz, so that at 24 kHz the signal is already attenuated by at least 80 dB. Now the filter is less kHz the signal is already attenuated by at least 80 dB. Now the filter is less steep, and the time-domain artifacts are almost gone.
steep, and the time-domain artifacts are almost gone.
DVD Audio – fDVD Audio – fss = 96 kHz – discretization = 24 bit = 96 kHz – discretization = 24 bit
Nyquist frequency is 48 kHz, but the anti-aliasing starts around 24 kHz, with Nyquist frequency is 48 kHz, but the anti-aliasing starts around 24 kHz, with a very gentle slope, so that at 48 kHz the signal is attenuated by more than a very gentle slope, so that at 48 kHz the signal is attenuated by more than
120 dB.
120 dB.
Such a gentle filter is very “short” in time domain, hence there are virtually no Such a gentle filter is very “short” in time domain, hence there are virtually no time-domain artifacts.
time-domain artifacts.
Impulse Response Impulse Response
System System under test under test Unit pulse
Unit pulse System’s impulse responseSystem’s impulse response Time of flight
Time of flight Direct sound Direct sound
Early reflections Early reflections Reverberant tail Reverberant tail
A simple linear system A simple linear system
CD player
CD player AmplifierAmplifier LoudspeakerLoudspeaker MicrophoneMicrophone Real-world system (one input, one output) Real-world system (one input, one output)
Block diagram Block diagram
x(x()) h(h()) y(y())
Input signal
Input signal System’s System’s Impulse Impulse Response Response (Transfer function) (Transfer function)
Output signal Output signal
“SYSTEM”“SYSTEM”
Analyzer Analyzer
FIR Filtering (Finite Impulse Response) FIR Filtering (Finite Impulse Response)
) i
( x )
(
x h ( ) h ( i ) y ( ) y ( i )
The effect of the linear system h on the signal x passing through it is The effect of the linear system h on the signal x passing through it is described by the mathematical operation called “convolution”, defined by:
described by the mathematical operation called “convolution”, defined by:
i j h j
x )
i ( y
1 N
0 j
This “sum of products” is also called FIR filtering, and models This “sum of products” is also called FIR filtering, and models
accurately any kind of linear systems.
accurately any kind of linear systems.
This is usually written, in compact notation, as:
This is usually written, in compact notation, as:
i h j
x )
i (
y
“convolution” operator“convolution” operator
IIR Filtering (Infinite Impulse Response) IIR Filtering (Infinite Impulse Response)
) i
( x )
(
x y ( ) y ( i )
Alternatively, the filtering caused by a linear system can also be described Alternatively, the filtering caused by a linear system can also be described
by the following recursive formula:
by the following recursive formula:
i j a j y i j b j
x )
i (
y N 1
1 j 1
N 0 j
• In practice, the filter is computed not only from the input samples x, but also as a In practice, the filter is computed not only from the input samples x, but also as a function of the output samples y, obtained at the previous time steps.
function of the output samples y, obtained at the previous time steps.
• In many cases, this method allows for representinging faithfully the behaviour of the In many cases, this method allows for representinging faithfully the behaviour of the system with a much smaller number of coefficients than when employing FIR
system with a much smaller number of coefficients than when employing FIR filtering.
filtering.
• However, modern algorithms on fast computers make However, modern algorithms on fast computers make FIR filtering preferable and even faster
FIR filtering preferable and even faster
) j ( b
)
j
(
a
The FFT Algorithm The FFT Algorithm
The Fast Fourier Transform (FFT) is often employed in Acoustics, with The Fast Fourier Transform (FFT) is often employed in Acoustics, with two goals:
two goals:
►Performing spectral analysis with constant bandwidth
►Fast FIR filtering
FFT transforms a segment of time-domain data in the corresponding FFT transforms a segment of time-domain data in the corresponding spectrum, with constant frequency resolution, starting at 0 Hz (DC) up to spectrum, with constant frequency resolution, starting at 0 Hz (DC) up to Nyquist frequency (which is half of the sampling frequency)
Nyquist frequency (which is half of the sampling frequency)
The longer the time segment, the narrower will be the frequency The longer the time segment, the narrower will be the frequency resolution:
resolution:
[N sampled points in time] =
[N sampled points in time] =
> >
[N/2+1 frequency bands] [N/2+1 frequency bands](the +1 represents the band at frequency 0 Hz, that is the DC component (the +1 represents the band at frequency 0 Hz, that is the DC component – but in acoustics, this is always with zero energy…)
– but in acoustics, this is always with zero energy…)
The FFT Algorithm The FFT Algorithm
The number of points in the time block must be a power of 2 – for example:
The number of points in the time block must be a power of 2 – for example:
4096, 8192, 16384, etc.
4096, 8192, 16384, etc.
Time signal (64 points) Time signal (64 points)
FFTFFT
Frequency spectrum Frequency spectrum
(32 bands + DC) (32 bands + DC)
IFFTIFFT
The inverse transform is also possible The inverse transform is also possible
(from frequency to time) (from frequency to time)
Complex spectrum, autospectrum Complex spectrum, autospectrum
FFT yields a complex spectrum, at every frequency we get a value made of FFT yields a complex spectrum, at every frequency we get a value made of a real and an imaginary parts (Pr, Pi), or, equivalently, by modulus and
a real and an imaginary parts (Pr, Pi), or, equivalently, by modulus and phase
phase
In many cases the phase is considered meaningless,and only the In many cases the phase is considered meaningless,and only the magnitude of the spectrum is plotted in dB:
magnitude of the spectrum is plotted in dB:
o2 2 10
o i 2 r 2
10 p
2 / N 2
1 0 N
4 3 2 1
p
f ' P f
log P p 10
f P f
log P 10
f L
} P
,...
P , P , P { ]
FFT [
} p ,..., p
, p , p , p {
• The second version of the formula contains the definition of the The second version of the formula contains the definition of the
Autospectrum, that is the product, at every frequency, of the spectral Autospectrum, that is the product, at every frequency, of the spectral complex number P(f) with its complex conjugate P’(f)
complex number P(f) with its complex conjugate P’(f)
Complex spectrum, autospectrum Complex spectrum, autospectrum
In other cases, also In other cases, also
the phase the phase
information is information is
relevant, and is relevant, and is
charted separately charted separately
(mainly when the (mainly when the
FFT is applied to an FFT is applied to an
impulse response).
impulse response).
“ “ leakage” and “windows” leakage” and “windows”
One of the assumptions of Fourier analysis is that the time-segment One of the assumptions of Fourier analysis is that the time-segment analysed represents a complete period of a periodic waveform
analysed represents a complete period of a periodic waveform
This is generally UNTRUE: the imperfect connection of the end of a This is generally UNTRUE: the imperfect connection of the end of a segment with the beginning of the next one (identical, as the signal is segment with the beginning of the next one (identical, as the signal is assumed to be periodic), causes a “click”, which produces a wide-band assumed to be periodic), causes a “click”, which produces a wide-band
“white noise”, contaminating the whole spectrum (“leakage”):
“white noise”, contaminating the whole spectrum (“leakage”):
Theoretical Theoretical
spectrum spectrum
Leakage Leakage
“ “ leakage” and “windows” leakage” and “windows”
If we want to analyze a generic, aperiodic signal, we need to “window” the If we want to analyze a generic, aperiodic signal, we need to “window” the signal inside the block being analyzed, bringing it to zero at both ends
signal inside the block being analyzed, bringing it to zero at both ends
To this purpose, many differnet types of “windows” are used, named To this purpose, many differnet types of “windows” are used, named
“Hanning”, “Hamming”, “Blackmann”, “Kaizer”, “Bartlett”, “Parzen”, etc.
“Hanning”, “Hamming”, “Blackmann”, “Kaizer”, “Bartlett”, “Parzen”, etc.
Window overlapping Window overlapping
The problem is that events occurring near the ends of two adjacent blocks The problem is that events occurring near the ends of two adjacent blocks are substantially not analyzed
are substantially not analyzed
To avoid this loss of information, instead of shifting the analysis window by To avoid this loss of information, instead of shifting the analysis window by one whole block, we need to analyze partially-overlapped blocks, with at one whole block, we need to analyze partially-overlapped blocks, with at least 50% overlapping, usually overalpped at 75% or even more
least 50% overlapping, usually overalpped at 75% or even more
Block 1
Block 1 WindowWindow FFTFFT
Block 2
Block 2 WindowWindow FFTFFT
Block 3
Block 3 WindowWindow FFTFFT
1717
Averaging, waterfall, spectrogram Averaging, waterfall, spectrogram
Once a sequence of FFT spectra is obtained, we can average them either Once a sequence of FFT spectra is obtained, we can average them either exponentially (Fast, Slow) o linearly (Leq), emulating a SLM
exponentially (Fast, Slow) o linearly (Leq), emulating a SLM
Alternatively, we can visualize how the spectrum changes over time, by two Alternatively, we can visualize how the spectrum changes over time, by two graphical representations called “waterfall” and “spectrogram” (or “sonogram”) graphical representations called “waterfall” and “spectrogram” (or “sonogram”)
Convolution is signifcantly faster if performed in Convolution is signifcantly faster if performed in
frequency domain:
frequency domain:
Problems
Problems • The whole lenght of signal must be recorded before being processedThe whole lenght of signal must be recorded before being processed
• if N is large, a lot of memory is required.if N is large, a lot of memory is required.
x(n)x(n) FFTFFT X(k)X(k)
X(k)
X(k) H(k) H(k)
Y(k)Y(k)
y(n)
y(n) IFFTIFFT
x(n)
x(n) h(n) h(n)
y(n)y(n)
Solution
Solution • “ “Overlap & Save”AlgorithmOverlap & Save”Algorithm
Fast FIR filtering with FFT
Fast FIR filtering with FFT
FFTFFT N-pointN-point
FFTFFT N-pointN-point
xx IFFTIFFT Xm(k)H(k) Xm(k)H(k)
Select last Select last N – Q + 1 N – Q + 1 samples samples
Append to Append to
y(n)y(n)
x x
mm(n) (n)
h(n) h(n)
Convoluzione veloce FFT con Overlap & Save (Oppenheim &
Convoluzione veloce FFT con Overlap & Save (Oppenheim &
Shafer, 1975):
Shafer, 1975):
Problems
Problems • Excessive processing latency between input and outputExcessive processing latency between input and output
• If N is large, a lot of memory is still requiredIf N is large, a lot of memory is still required Solution
Solution • “ “uniformly-partitioned Overlap & Save”uniformly-partitioned Overlap & Save”
Overlap & Save
Overlap & Save
Filter’s impulse Filter’s impulse response
response h(n) h(n) is also is also partitioned in a partitioned in a number of blocks number of blocks
Now latency and Now latency and memory occupation memory occupation reduce to the length reduce to the length
of one single block of one single block
11stst block block 22ndnd block block 33rdrd block block 44thth block block