• Non ci sono risultati.

A Spherical Microphone Array For 

N/A
N/A
Protected

Academic year: 2021

Condividi "A Spherical Microphone Array For "

Copied!
48
0
0

Testo completo

(1)

40 th AES Conference

Tokyo, 8-10 October 2010

A Spherical Microphone Array For 

Synthesizing Virtual Directive Microphones  In Live Broadcasting And In Post Production

Angelo Farina, Andrea Capra,  Lorenzo Chiesi, Leonardo Scopece

Strategie Tecnologiche

Centro Ricerche e Innovazione Tecnologica Academic Spinoff University of Parma

(2)

RAI RAI - - state of art state of art

• The  Holophone  H2  Pro  is  a  microphone  system  equipped  with  8  capsules  placed  on  a  egg‐shaped  framework.  The  audio  signals  are delivered directly in G‐format or using an  audio mixer. 

• The directivity of each 

Holophone capsule was 

measured in the Anechoic 

Room of “Università’ di 

Ferrara” (Italy)

(3)

250 Hz

‐30

‐25

‐20

‐15

‐10

‐5 0 1

2 3 4 5 6 7

8 9 10

11 12

13 14

15 16

17 18 19 20 21 22 23 24 25 26 27 28 29 3130 3332 3534 36 37 3938 4140 4342 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58

59 60

61 62

63 64

6566676869 7071 72

C CS L R LS RS Top

500 Hz

‐30

‐25

‐20

‐15

‐10

‐5 0 1

2 3 4 5 6 7

89 10

11 12

13 14

15 16

17 18 19 20 21 22 23 24 25 26 27 28 29 3130 3332 3534 37 36 3938 4140 4342 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59

60 61

62 63

646566676869 7071 72

C CS L R LS RS Top

1000 Hz

‐30

‐25

‐20

‐15

‐10

‐5 0

1 2 3 4 5 6

7 8 9

10 11

12 13

14 15

16 17 18 19 20 21 22 23 24 25 26 27 28 29 3130 3332 3534 37 36 3938 4140 4342 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59

60 61

62 63

646566676869 7071 72

C CS L R LS RS Top

2000 Hz

‐30

‐25

‐20

‐15

‐10

‐5 0 1

2 3 4 5 6 78

9 10

11 12

13 14

15 16

17 18 19 20 21 22 23 24 25 26 27 28 29 30 3231 3433 3635 37 3938 4140 4342 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58

59 60

61 62

63 64

6566676869 7071 72

C CS L R LS RS Top

4000 Hz

‐30

‐25

‐20

‐15

‐10

‐5 0

1 2 3 4 5

6 7 8 9

10 11

12 13

14 15

16 17 18 19 20 21 22 23 24 25 26 27 28 29 3130 3332 3534 37 36 3938 4140 4342 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59

60 61

62 63

646566676869 7071 72

C CS L R LS RS Top

8000 Hz

‐30

‐25

‐20

‐15

‐10

‐5 0 1

2 3 4 5 6 7

89 10

11 12

13 14

15 16

17 18 19 20 21 22 23 24 25 26 27 28 29 3130 3332 3534 37 36 3938 4140 4342 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59

60 61

62 63

646566676869 7071 72

C CS L R LS RS Top

Holophone

Holophone polar patterns polar patterns

(4)

• Directivity and angles between single capsules  are not changeable in post‐processing.

• There isn’t enough separation between sources  because of the low directivity of the capsules. 

For this reason the probe should be placed very  close to the scene that is object of recordings

• Surround imaging is in any case inaccurate,  albeit the recording sound spacious and with  good frequency response (thanks to the DPA  omnis)

HOLOPHONE PROBLEMS:

HOLOPHONE PROBLEMS:

(5)

The The EIGENMIKE EIGENMIKE TM TM

 Array with 32 ½” capsules of good quality, frequency response up to 20 kHz

 Preamplifiers and A/D converters inside the sphere, with ethernet interface

 Processing on the PC thanks to a VST plugin (no GUI)

(6)

The The EIGENMIKE EIGENMIKE TM TM

 The Eigenmike is designed for High Order Ambisonics processing

 His software computes internally the spherical harmonics up to third order

 Up to 8 virtual microphones can be synthesized, with one of 7 pre-defined patterns

(7)

The The EIGENMIKE EIGENMIKE TM TM

(8)

The RAI project The RAI project

• “Virtual” microphones  with  high  directivity,  controlled by mouse/joystick in order to follow  in realtime actors on the stage. They should be  capable  to  modify  their  directivity  in  a  sort  of 

“acoustical zoom”.

• Surround  recordings  with  microphones  that  can  be  modified  (directivity,  angle,  gain,  ecc..)  in post‐production.  

GOALS:

(9)

GOALS GOALS

We want to synthesize virtual

microphones highly directive,

steerable, and with variable

directivity pattern

(10)

MICROPHONE ARRAYS:

MICROPHONE ARRAYS: TYPES AND PROCESSING TYPES AND PROCESSING

Linear Array Planar Array Spherical Array

Processing Algorithm

SIGNAL PROCESSOR x

1

(t)

x

2

(t)

x

M

(t) M inputs

y

1

(t) y

2

(t)

y

V

(t)

... ... V outputs

.. . x

1

(t)

x

2

(t)

x

M

(t)

h

1,1

(t) h

2,1

(t)

h

M,1

(t)

y

1

(t)

(11)

General Approach General Approach

• Whatever theory or method is chosen, we always  start with M microphones, providing M signals x m ,  and we derive from them V signals y v

• And, in any case, each of these V outputs can be  expressed by:

processor M inputs

V outputs

 

M

m

v m m

v t x t h t

y

1

, ( )

) ( )

(

(12)

Microphone

Microphone arrays: target, processing arrays: target, processing

Is the time‐domain sampled waveform attempting to  emulate that of a virtual microphone characterized by a  certain polar pattern and aiming (our target)

Pre‐calculated filters

x 1 (t) x 2 (t) x 3 (t) x 4 (t)

y v (t)

 

M

m

v m m

v t x t h t

y

1

, ( ) )

( )

(

(13)

Traditional approaches Traditional approaches

• The processing filters h m,v are usually 

computed following one of several, complex  mathematical theories, based on the solution  of the wave equation (often under certaing  simplifications), and assuming that the 

microphones are ideal and identical

• In some implementations, the signal of each 

microphone is processed through a digital 

filter for compensating its deviation, at the 

expense of heavier computational load

(14)

Traditional Spherical Harmonics approach Traditional Spherical Harmonics approach

Spherical Harmonics (H.O.Ambisonics)

Virtual microphones

A fixed number of “intermediate” virtual microphones is computed,

then the dynamically-positioned virtual microphones are obtained

by linear combination of these intermediate signals.

(15)

Novel approach Novel approach

• No theory is assumed: the set of h m,v filters are derived  directly from a set of impulse response measurements,  designed according to a least‐squares principle.

• In practice, a matrix of impulse responses is measured,  and the matrix has to be numerically inverted (usually  employing some regularization technique).

• This way, the outputs of the microphone array are  maximally close to the ideal responses prescribed

• This method also inherently corrects for transducer  deviations and acoustical artifacts (shielding, 

diffractions, reflections, etc.)

(16)

The microphone array impulse responses c m,d , are  measured for a number of D incoming directions. 

We get a matrix C of measured impulse responses for a large  number P of directions

m=1…M mikes d=1…D sources

c ki

Novel approach Novel approach

 

 

 

 

 

 

 

 

D , M d

, M 2

, M 1

, M

D , m d

, m 2

, m 1

, m

D , 2 d

, 2 2

, 2 1

, 2

D , 1 d

, 1 2

, 1 1

, 1

c ...

c ...

c c

...

...

...

...

...

...

c ...

c ...

c c

...

...

...

...

...

...

c ...

c ...

c c

c ...

c ...

c c

C

(17)

The virtual microphone which we want to synthesize  must be specified in the same D directions where the  impulse responses had been measured. Let’s choose a  high‐order cardioid of order n as our target virtual 

microphone.

This is just a direction‐dependent gain. 

The theoretical impulse response 

coming from each of the D directions is:

Target Directivity Target Directivity

    n

Q n  ,   0 . 5  0 . 5  cos(  )  cos(  )

d ,

d Q

p

(18)

Novel approach Novel approach

m = 1…M microphones

d = 1…D directions

Applying the filter matrix H to the measured impulse responses C, the system should behave as a virtual microphone with wanted directivity

h

1

(t) h

2

(t)

h

M

(t)

p d (t)

Target function

c 1,d (t)

AM,v

δ(t)

A1,v δ(t)

A2,v δ(t)

 

 

M

m

d m

d

m h p d D

c

1

, 1 ..

But in practice the result of the filtering will never be exactly equal to the prescribed functions p d …..

c 2,d (t)

c M,d (t)

(19)

Novel approach Novel approach

We go now to frequency domain, where convolution becomes simple multiplication at every frequency k, by taking an N-point FFT of all those impulse responses:

We now try to invert this linear equation system at every frequency k, and for every virtual microphone v:

 

 

 

 0 .. / 2

..

1

1

, ,

, k N

D P d

H C

M

m

d k

m k

d m

   

  k DxV DxM

k MxV

C HP

This over-determined system doesn't admit an exact solution, but it is possible

to find an approximated solution with the Least Squares method

(20)

Least

Least - - squares solution squares solution

We compare the results of the numerical inversion with the theoretical response of our target microphones for all the D directions, properly delayed, and sum the

squared deviations for defining a total error:

The inversion of this matrix system is now performed adding a regularization parameter  , in such a way to minimize the total error (Nelson/Kirkeby

approach):

It revealed to be advantageous to employ a frequency-dependent regularization parameter  k .

P

     

  k MxD   k DxM k   MxM

k j MxD DxV

k k MxV

I C

C

e Q

H C

 

*

*

(21)

Spectral shape of the regularization Spectral shape of the regularization

parameter parameter   k k

• At very low and very high frequencies it is  advisable to increase the value of   .

H

L

(22)

It is possible to compute just once the following term:

Then, whenever a new set filters is required, this is generated simply applying to R the gains Q of the target microphone:

FIR filters realtime synthesis algorithm:

Thanks to Hermitian symmetry properties, a real-FFT algorithm can be employed

Real Real - - time synthesis of the filters h time synthesis of the filters h

   

 

k MxD

 

k DxM k

 

MxM

k j k MxD

k MxD

I C

C

e R C

 

*

*

  H k MxV   R k MxD   Q DxV

Time-domain windowing

[R k ] MxD

[Q k ] DxV

[H k ] MxV

[h n ] MxV

N-point

real-IFFT

(23)

Anechoic measurements Anechoic measurements

 A large anechoic room was employed for full-range measurements

 A computer-controlled turntable was employed for rotation at fixed angular steps

 The AURORA software was employed for generating the test signals and the control pulses for the turntable

 ESS (Exponential Sine Sweep) test signal

 The loudspeaker response was measured with a class-0 B&K microphone,

and a suitable inverse filter applied to the test signal

(24)

 Small angular steps

 These are the verification measurements, employed for checking the polar responses of the virtual microphones

Angular step: 5°

N. of measurements: 72

ESS signal duration: 10 seconds

Total measurement time: approximately 18 minutes

Measurements in the horizontal plane

Measurements in the horizontal plane

(25)

Subsequent rotations of the probe along a meridian, after selecting the meridian by manual

rotation of the support.

Support for rotating the probe

 Reduced angular resolution for shortening measurement times Angular step: 10° x 10°

N. of measurements: 684 (36 meridians x 19 parallels, including the poles) ESS signal duration: 10 sec

Total measurmenet time: approximately 3 hours

Measurements on the whole sphere

Measurements on the whole sphere

(26)

 Eccentricity of the support and of the turntable

Small errors in the source-microphone distance are introduced by the measurement method

This causes significant phase errors in the measured impulse responses

These errors can be reduced only applying proper time shifts to the measured data

 manual handling of the microphone support

Causes:

Mechanical problems

Mechanical problems

(27)

Plot of the mechanical errors for 18 complete rotations For any measurement, the position of the

center of the array is estimated as the average of the 32 values of the time-of-flight

DC component

Sinusoidal component

Random noise

The phase coherence is compromised above 1 kHz

The time of flight was obtained by the maximum of the Hilbert’s transform of the impulse responses

Mechanical problems

Mechanical problems

(28)

The distance error is compensated by time-shifting the measured impulse responses

By implementing the time shift in frequency domain, it was possible to apply fractional delays

After the sinusidal error component has been removed, only a residual

“random noise” error remains, which is less than 2mm:

Residual positioning error after compensation

The optimal values of these time

shifts are computed by interpolating the sinusoidal component of the error

Optimized time shift (in samples) for correcting the positioning error

Now the phase coeherence is good up to 17 kHz

Mechanical problems: compensation

Mechanical problems: compensation

(29)

31Hz 63Hz 125Hz 250Hz 500Hz 1kHz 2kHz 4kHz 8kHz 16kHz

Verification of the compensation – synthesis of a virtual 3° order cardioid

Without compensation After compensation

 The compensation of the positioning error significantly improved the directivity patterns at 4 and 8 kHz

 At very high frequency the directivity pattern is corrupted due to spatial aliasing

Mechanical problems: compensation

Mechanical problems: compensation

(30)

 The frequency response is quite flat between 30 Hz and 13 kHz

 The microphone support causes troubles at higher frequencies (old type)

Characterization of a single capsule

Characterization of a single capsule

(31)

31Hz 63Hz 125Hz 250Hz 500Hz 1kHz 2kHz 4kHz 8kHz 16kHz

Virtual cardioid microphones of 3° order

Software Eigenmike

TM

(HOA) Novel approach

 Better frequency response

 Better directivity control at low frequency

 Upper frequency limit again limited by spatial aliasing, now it reaches 10 kHz In any case, the novel approach is always better than traditional HOA

Comparison with H.O.A.

Comparison with H.O.A.

(32)

Sennheiser MK 70-1

31Hz 63Hz 125Hz 250Hz 500Hz 1kHz 2kHz 4kHz 8kHz 16kHz - 6dB

- 6dB

 Similar beam width ( 60° at -3dB)

 Constant directivity with frequency:

no colouring outside the beam

- 6dB

85Hz 85Hz - 6dB

 Comparable frequency bandwidth

Comparison with a Sennheiser shotgun

Comparison with a Sennheiser shotgun

(33)

 The processing is now performed on a dedicated Linux “black box”

 Low cost hardware (the processing unit is below 400 Euros)

 Realtime synthesis of processing filters with our novel algorithm

 Aiming and directivity of the virtual microphones can be changed in realtime under control of a joystick or with the mouse

Architecture of the system for realtime processing

Architecture of the system for realtime processing

(34)

The real‐time microphone system

(35)

 Linux (Ubuntu Minimal, no GUI) operating system

 BRUTEFIR convolution ebgine, opne source (by Anders Torger)

 Our new processing engine implementing calculation of the coefficients and communication with the controller

Software for realtime processing

Software for realtime processing

(36)

 Up 7 virtual microphones are controlled simultaneously

 Realtime movement and zooming with joystick or mouse control

 The aiming of the microphones is

projected over a 360° panoramic realtime video image, from a mirrored camera

 Total audio latency: 23 ms

Microphone aiming/zooming update ratio: 14 times per second

Caricamento termine indipendente dalla direttività

precalcolato tramite Matlab

Attesa richiesta variazione microfoni

virtuali

TCP/IP

Sintesi dei filtri FIR relativi ai microfoni virtuali modificati

Sostituzione dei filtri nel convolutore mediante double-buffering e cross-fading

Conferma sintesi eseguita

TCP/IP Start

Filter computation engine written in C

Software for realtime processing

Software for realtime processing

(37)

Application examples Application examples

 Example 1 with static image background

 Example 2 with realtime unwrapped video

 Example 3 – outdoor recording of a train passage

 Example 4 – impulse response measurements

at La Scala theater

(38)
(39)

Hardware for 360° video

• A Logitech webcam is mounted under a parabolic mirror, inside a Perspex tube

• The video stream from the Logitech webcam is processed with a realtime video-unwrapping software, written by Adriano Farina in “Processing”, a

Java-based programming language and environment

• It is possible to record the unwrapped

video stream to a standard MOV file

(40)

Video unwrapping

Original image

Unwrapped image

(41)

Hardware for 360° video

(42)

Recording a train passage

(43)

Recording a train passage

• 24 virtual cardioids have been generated, spaced 15° on the horizontal plane

• The directivity synthesized was the narrowest possible (target function: 10 th order)

• The processing was possible in realtime thanks to the X-volver VST plugin

(44)

Recording a train passage

40 42 44 46 48 50 52 54 56

-1 80 -1 65 -1 50 -1 35 -1 20 -1 05 -90 -75 -60 -45 -30 -15 0 15 30 45 60 75 90 105 120 135 150 165 180

(45)

Recording a train passage

(46)

IR measurements at La Scala

(47)

IR measurements at La Scala

(48)

Conclusions

• A new spherical microphone array provides robust hardware platform for capturing spatial information

• HOA processing of the signals is possible for synthesizing virtual microphones with directivity patterns up to 3 rd order

• A novel processing method has been developed, not relying anymore on spherical harmonics, capable of providing sharper virtual microphones, up to 6 th order – equivalent

• An advanced GUI with realtime 360° video background has been developed for aiming correctly up to 7 virtual

microphones during live broadcasting or in post-productions

• The first operational tests confirm that operation is smooth

and reliable, and the quality of the signal provided by these

virtual microphones is comparable or better than state-of-the-

art shotgun mikes

Riferimenti

Documenti correlati

Il sistema sviluppato è stato quindi utilizzato per il rilievo e la ricostruzione di casse di compressori presso gli stabilimenti di Massa della Nuovo Pignone.. Gli ottimi

Numerical results show that both protocols enhance performances of the al- ready existent round robin scheduler both in terms of capacity and fairness.. Fair- ness is obviously

A fixed number of “intermediate” virtual microphones is computed (B-format), then the dynamically-positioned virtual microphones are obtained by linear combination of

For the receiver side we have found that the Memory Acceleration technique is an effective optimization method that permits us to achieve the real-time target with a

The offering of the Notes has not been registered with the Commissione Nazionale per la Società e la Borsa (“CONSOB”) (the Italian securities exchange commission) pursuant to the

Moreover, the full spatial sonic behaviour of a theatre, which includes information about energy, intensity and location of early reflections in the room, is required to determine

9 Array with 32 ½” capsules of good quality, frequency response up to 20 kHz 9 Preamplifiers and A/D converters inside the sphere, with ethernet interface 9 Processing on the PC

• The audio interface is an EMIB Firewire interface based on TCAT DICE chip:. - supported by any