A Spherical Microphone Array For

(1)

40 ^th AES Conference

Tokyo, 8-10 October 2010

A Spherical Microphone Array For

Synthesizing Virtual Directive Microphones In Live Broadcasting And In Post Production

Angelo Farina, Andrea Capra, Lorenzo Chiesi, Leonardo Scopece

Strategie Tecnologiche

Centro Ricerche e Innovazione Tecnologica Academic Spinoff University of Parma

(2)

RAI RAI - - state of art state of art

• The Holophone H2 Pro is a microphone system equipped with 8 capsules placed on a egg‐shaped framework. The audio signals are delivered directly in G‐format or using an audio mixer.

• The directivity of each

Holophone capsule was

measured in the Anechoic

Room of “Università’ di

Ferrara” (Italy)

(3)

250 Hz

‐30

‐25

‐20

‐15

‐10

‐5 0 1

2 3 4 5 6 7

8 9 10

11 12

13 14

15 16

17 18 19 20 21 22 23 24 25 26 27 28 29 3130 3332 3534 36 37 3938 4140 4342 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58

59 60

61 62

63 64

6566676869 7071 72

C CS L R LS RS Top

500 Hz

‐30

‐25

‐20

‐15

‐10

‐5 0 1

2 3 4 5 6 7

89 10

11 12

13 14

15 16

17 18 19 20 21 22 23 24 25 26 27 28 29 3130 3332 3534 37 36 3938 4140 4342 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59

60 61

62 63

646566676869 7071 72

C CS L R LS RS Top

1000 Hz

‐30

‐25

‐20

‐15

‐10

‐5 0

1 2 3 4 5 6

7 8 9

10 11

12 13

14 15

16 17 18 19 20 21 22 23 24 25 26 27 28 29 3130 3332 3534 37 36 3938 4140 4342 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59

60 61

62 63

646566676869 7071 72

C CS L R LS RS Top

2000 Hz

‐30

‐25

‐20

‐15

‐10

‐5 0 1

2 3 4 5 6 78

9 10

11 12

13 14

15 16

17 18 19 20 21 22 23 24 25 26 27 28 29 30 3231 3433 3635 37 3938 4140 4342 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58

59 60

61 62

63 64

6566676869 7071 72

C CS L R LS RS Top

4000 Hz

‐30

‐25

‐20

‐15

‐10

‐5 0

1 2 3 4 5

6 7 8 9

10 11

12 13

14 15

16 17 18 19 20 21 22 23 24 25 26 27 28 29 3130 3332 3534 37 36 3938 4140 4342 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59

60 61

62 63

646566676869 7071 72

C CS L R LS RS Top

8000 Hz

‐30

‐25

‐20

‐15

‐10

‐5 0 1

2 3 4 5 6 7

89 10

11 12

13 14

15 16

17 18 19 20 21 22 23 24 25 26 27 28 29 3130 3332 3534 37 36 3938 4140 4342 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59

60 61

62 63

646566676869 7071 72

C CS L R LS RS Top

Holophone

Holophone polar patterns polar patterns

(4)

• Directivity and angles between single capsules are not changeable in post‐processing.

• There isn’t enough separation between sources because of the low directivity of the capsules.

For this reason the probe should be placed very close to the scene that is object of recordings

• Surround imaging is in any case inaccurate, albeit the recording sound spacious and with good frequency response (thanks to the DPA omnis)

HOLOPHONE PROBLEMS:

(5)

The The EIGENMIKE EIGENMIKE ^TM ^TM

 Array with 32 ½” capsules of good quality, frequency response up to 20 kHz

 Preamplifiers and A/D converters inside the sphere, with ethernet interface

 Processing on the PC thanks to a VST plugin (no GUI)

(6)

The The EIGENMIKE EIGENMIKE ^TM ^TM

 The Eigenmike is designed for High Order Ambisonics processing

 His software computes internally the spherical harmonics up to third order

 Up to 8 virtual microphones can be synthesized, with one of 7 pre-defined patterns

(7)

The The EIGENMIKE EIGENMIKE ^TM ^TM

(8)

The RAI project The RAI project

• “Virtual” microphones with high directivity, controlled by mouse/joystick in order to follow in realtime actors on the stage. They should be capable to modify their directivity in a sort of

“acoustical zoom”.

• Surround recordings with microphones that can be modified (directivity, angle, gain, ecc..) in post‐production.

GOALS:

(9)

GOALS GOALS

We want to synthesize virtual

microphones highly directive,

steerable, and with variable

directivity pattern

(10)

MICROPHONE ARRAYS:

MICROPHONE ARRAYS: TYPES AND PROCESSING TYPES AND PROCESSING

Linear Array Planar Array Spherical Array

Processing Algorithm

SIGNAL PROCESSOR x

₁

(t)

x

₂

(t)

x

M

(t) M inputs

y

1

(t) y

2

(t)

y

V

(t)

... ... V outputs

.. . x

₁

(t)

x

2

(t)

x

M

(t)

h

1,1

(t) h

_2,1

(t)

h

M,1

(t)

y

₁

(t)

(11)

General Approach General Approach

• Whatever theory or method is chosen, we always start with M microphones, providing M signals x _m , and we derive from them V signals y _v

• And, in any case, each of these V outputs can be expressed by:

processor M inputs

V outputs

 



 ^M

m

v m m

v t x t h t

y

1 , ( )

) ( )

(

(12)

Microphone

Microphone arrays: target, processing arrays: target, processing

Is the time‐domain sampled waveform attempting to emulate that of a virtual microphone characterized by a certain polar pattern and aiming (our target)

Pre‐calculated filters

x ₁ (t) x ₂ (t) x ₃ (t) x ₄ (t)

y _v (t)

 



 ^M

m

v m m

v t x t h t

y

1 , ( ) )

( )

(

(13)

Traditional approaches Traditional approaches

• The processing filters h _m,v are usually

computed following one of several, complex mathematical theories, based on the solution of the wave equation (often under certaing simplifications), and assuming that the

microphones are ideal and identical

• In some implementations, the signal of each

microphone is processed through a digital

filter for compensating its deviation, at the

expense of heavier computational load

(14)

Traditional Spherical Harmonics approach Traditional Spherical Harmonics approach

Spherical Harmonics (H.O.Ambisonics)

Virtual microphones

A fixed number of “intermediate” virtual microphones is computed,

then the dynamically-positioned virtual microphones are obtained

by linear combination of these intermediate signals.

(15)

Novel approach Novel approach

• No theory is assumed: the set of h _m,v filters are derived directly from a set of impulse response measurements, designed according to a least‐squares principle.

• In practice, a matrix of impulse responses is measured, and the matrix has to be numerically inverted (usually employing some regularization technique).

• This way, the outputs of the microphone array are maximally close to the ideal responses prescribed

• This method also inherently corrects for transducer deviations and acoustical artifacts (shielding,

diffractions, reflections, etc.)

(16)

The microphone array impulse responses c _m,d , are measured for a number of D incoming directions.

We get a matrix C of measured impulse responses for a large number P of directions

m=1…M mikes d=1…D sources

c _ki

Novel approach Novel approach

 





 







D , M d

, M 2

, M 1

, M

D , m d

, m 2

, m 1

, m

D , 2 d

, 2 2

, 2 1

, 2

D , 1 d

, 1 2

, 1 1

, 1

c ...

c c

...

c ...

c c

...

c ...

c c

c ...

c c

C

(17)

The virtual microphone which we want to synthesize must be specified in the same D directions where the impulse responses had been measured. Let’s choose a high‐order cardioid of order n as our target virtual

microphone.

This is just a direction‐dependent gain.

The theoretical impulse response

coming from each of the D directions is:

Target Directivity Target Directivity

    ⁿ

Q n  ,   0 . 5  0 . 5  cos(  )  cos(  )

 ^ ^  ^ ^

 _d ,

d Q

p

(18)

Novel approach Novel approach

m = 1…M microphones

d = 1…D directions

Applying the filter matrix H to the measured impulse responses C, the system should behave as a virtual microphone with wanted directivity

h

₁

(t) h

₂

(t)

h

_M

(t)

p _d (t)

Target function

c _1,d (t)

A_M,v

δ(t)

A_1,v δ(t)

A_2,v δ(t)

 

 

M 

m

d m

d

m h p d D

c

1 , 1 ..

But in practice the result of the filtering will never be exactly equal to the prescribed functions p _d …..

c _2,d (t)

c _M,d (t)

(19)

Novel approach Novel approach

We go now to frequency domain, where convolution becomes simple multiplication at every frequency k, by taking an N-point FFT of all those impulse responses:

We now try to invert this linear equation system at every frequency k, and for every virtual microphone v:

 





 

 

 0 .. / 2

..

1

1 , ,

, k N

D P d

H C

M

m

d k

m k

d m

   

  _k ^DxV _DxM

k MxV

C H  P

This over-determined system doesn't admit an exact solution, but it is possible

to find an approximated solution with the Least Squares method

(20)

Least

Least - - squares solution squares solution

We compare the results of the numerical inversion with the theoretical response of our target microphones for all the D directions, properly delayed, and sum the

squared deviations for defining a total error:

The inversion of this matrix system is now performed adding a regularization parameter  , in such a way to minimize the total error (Nelson/Kirkeby

approach):

It revealed to be advantageous to employ a frequency-dependent regularization parameter  _k .

P

     

  _k _MxD   _k _DxM _k   _MxM

k j MxD DxV

k k MxV

I C

C

e Q

H C









  ^





*

(21)

Spectral shape of the regularization Spectral shape of the regularization

parameter parameter   _k _k

• At very low and very high frequencies it is advisable to increase the value of  .

 _H

 _L

(22)

It is possible to compute just once the following term:

Then, whenever a new set filters is required, this is generated simply applying to R the gains Q of the target microphone:

FIR filters realtime synthesis algorithm:

Thanks to Hermitian symmetry properties, a real-FFT algorithm can be employed

Real Real - - time synthesis of the filters h time synthesis of the filters h

   

 

_k _MxD

 

_k _DxM _k

 

_MxM

k j k MxD

k MxD

I C

C

e R C







 

^





*

  ^H _k _MxV ^   ^R _k _MxD ^   ^Q _DxV

Time-domain windowing

[R _k ] _MxD

[Q _k ] _DxV

[H _k ] _MxV

[h _n ] _MxV

N-point

real-IFFT

(23)

Anechoic measurements Anechoic measurements

 A large anechoic room was employed for full-range measurements

 A computer-controlled turntable was employed for rotation at fixed angular steps

 The AURORA software was employed for generating the test signals and the control pulses for the turntable

 ESS (Exponential Sine Sweep) test signal

 The loudspeaker response was measured with a class-0 B&K microphone,

and a suitable inverse filter applied to the test signal

(24)

 Small angular steps

 These are the verification measurements, employed for checking the polar responses of the virtual microphones

Angular step: 5°

N. of measurements: 72

ESS signal duration: 10 seconds

Total measurement time: approximately 18 minutes

Measurements in the horizontal plane

(25)

Subsequent rotations of the probe along a meridian, after selecting the meridian by manual

rotation of the support.

Support for rotating the probe

 Reduced angular resolution for shortening measurement times Angular step: 10° x 10°

N. of measurements: 684 (36 meridians x 19 parallels, including the poles) ESS signal duration: 10 sec

Total measurmenet time: approximately 3 hours

Measurements on the whole sphere

(26)

 Eccentricity of the support and of the turntable

Small errors in the source-microphone distance are introduced by the measurement method

This causes significant phase errors in the measured impulse responses

These errors can be reduced only applying proper time shifts to the measured data

 manual handling of the microphone support

Causes:

Mechanical problems

(27)

Plot of the mechanical errors for 18 complete rotations For any measurement, the position of the

center of the array is estimated as the average of the 32 values of the time-of-flight

DC component

Sinusoidal component

Random noise

The phase coherence is compromised above 1 kHz

The time of flight was obtained by the maximum of the Hilbert’s transform of the impulse responses

Mechanical problems

(28)

 The distance error is compensated by time-shifting the measured impulse responses

 By implementing the time shift in frequency domain, it was possible to apply fractional delays

 After the sinusidal error component has been removed, only a residual

“random noise” error remains, which is less than 2mm:

Residual positioning error after compensation

 The optimal values of these time

shifts are computed by interpolating the sinusoidal component of the error

Optimized time shift (in samples) for correcting the positioning error

Now the phase coeherence is good up to 17 kHz

Mechanical problems: compensation

(29)

31Hz 63Hz 125Hz 250Hz 500Hz 1kHz 2kHz 4kHz 8kHz 16kHz

Verification of the compensation – synthesis of a virtual 3° order cardioid

Without compensation After compensation

 The compensation of the positioning error significantly improved the directivity patterns at 4 and 8 kHz

 At very high frequency the directivity pattern is corrupted due to spatial aliasing

Mechanical problems: compensation

(30)

 The frequency response is quite flat between 30 Hz and 13 kHz

 The microphone support causes troubles at higher frequencies (old type)

Characterization of a single capsule

(31)

31Hz 63Hz 125Hz 250Hz 500Hz 1kHz 2kHz 4kHz 8kHz 16kHz

Virtual cardioid microphones of 3° order

Software Eigenmike

^TM

(HOA) Novel approach

 Better frequency response

 Better directivity control at low frequency

 Upper frequency limit again limited by spatial aliasing, now it reaches 10 kHz In any case, the novel approach is always better than traditional HOA

Comparison with H.O.A.

(32)

Sennheiser MK 70-1

31Hz 63Hz 125Hz 250Hz 500Hz 1kHz 2kHz 4kHz 8kHz 16kHz - 6dB

- 6dB

 Similar beam width ( 60° at -3dB)

 Constant directivity with frequency:

no colouring outside the beam

- 6dB

85Hz 85Hz - 6dB

 Comparable frequency bandwidth

Comparison with a Sennheiser shotgun

(33)

 The processing is now performed on a dedicated Linux “black box”

 Low cost hardware (the processing unit is below 400 Euros)

 Realtime synthesis of processing filters with our novel algorithm

 Aiming and directivity of the virtual microphones can be changed in realtime under control of a joystick or with the mouse

Architecture of the system for realtime processing

(34)

The real‐time microphone system

(35)

 Linux (Ubuntu Minimal, no GUI) operating system

 BRUTEFIR convolution ebgine, opne source (by Anders Torger)

 Our new processing engine implementing calculation of the coefficients and communication with the controller

Software for realtime processing

(36)

 Up 7 virtual microphones are controlled simultaneously

 Realtime movement and zooming with joystick or mouse control

 The aiming of the microphones is

projected over a 360° panoramic realtime video image, from a mirrored camera

 Total audio latency: 23 ms

 Microphone aiming/zooming update ratio: 14 times per second

Caricamento termine indipendente dalla direttività

precalcolato tramite Matlab

Attesa richiesta variazione microfoni

virtuali

TCP/IP

Sintesi dei filtri FIR relativi ai microfoni virtuali modificati

Sostituzione dei filtri nel convolutore mediante double-buffering e cross-fading

Conferma sintesi eseguita

TCP/IP Start

Filter computation engine written in C

Software for realtime processing

(37)

Application examples Application examples

 Example 1 with static image background

 Example 2 with realtime unwrapped video

 Example 3 – outdoor recording of a train passage

 Example 4 – impulse response measurements

at La Scala theater

(38)

(39)

Hardware for 360° video

• A Logitech webcam is mounted under a parabolic mirror, inside a Perspex tube

• The video stream from the Logitech webcam is processed with a realtime video-unwrapping software, written by Adriano Farina in “Processing”, a

Java-based programming language and environment

• It is possible to record the unwrapped

video stream to a standard MOV file

(40)

Video unwrapping

Original image

Unwrapped image

(41)

Hardware for 360° video

(42)

Recording a train passage

(43)

Recording a train passage

• 24 virtual cardioids have been generated, spaced 15° on the horizontal plane

• The directivity synthesized was the narrowest possible (target function: 10 ^th order)

• The processing was possible in realtime thanks to the X-volver VST plugin

(44)

Recording a train passage

40 42 44 46 48 50 52 54 56

-1 80 -1 65 -1 50 -1 35 -1 20 -1 05 -90 -75 -60 -45 -30 -15 0 15 30 45 60 75 90 105 120 135 150 165 180

(45)

Recording a train passage

(46)

IR measurements at La Scala

(47)

IR measurements at La Scala

(48)

Conclusions

• A new spherical microphone array provides robust hardware platform for capturing spatial information

• HOA processing of the signals is possible for synthesizing virtual microphones with directivity patterns up to 3 ^rd order

• A novel processing method has been developed, not relying anymore on spherical harmonics, capable of providing sharper virtual microphones, up to 6 ^th order – equivalent

A Spherical Microphone Array For

40 th AES Conference

Tokyo, 8-10 October 2010