3D Virtual Microphone System employing 3D Virtual Microphone System employing spherical and cylindrical spherical and cylindrical Microphone Arrays Microphone Arrays

(1)

The University of Parma

Leonardo Scopece [email protected] Leonardo Scopece [email protected] Angelo Farina

[email protected] Angelo Farina [email protected]

3D Virtual Microphone System employing 3D Virtual Microphone System employing

spherical and cylindrical spherical and cylindrical

Microphone Arrays Microphone Arrays

Alberto Morello [email protected] Alberto Morello [email protected]

(2)

The 3DVMS project

(3)

The origin of a new multichannel shooting and recording system

• The project was started by Rai Research Centre and Advanced Industrial Design in Acoustic (A.I.D.A.), spin-off of the University of Parma, in 2009

• It resulted in the patent of an innovative

system for live shooting and recording, called 3D Virtual Microphone System

• Starting from a sperical microphone probe, the system can synthesize up to 7 virtual microphones, which can be moved in

realtime, with variable directivity (zooming) capability

3

(4)

Our first 3 -order spherical microphone

(5)

RAI RAI – – previous state of art previous state of art

• The Holophone H2 Pro is a microphone system equipped with 8 capsules placed on a egg-shaped framework. The audio signals are delivered directly in G-format or using an audio mixer.

• The directivity of

Holophone’s capsules was measured in an Anechoic Room

5

(6)

6

(7)

• Directivity and angles between single capsules are not changeable in post-processing.

• There isn’t enough separation between sources

because of the low directivity of the capsules. For this reason the probe should be placed very close to the scene that is object of recordings

• Surround imaging is in any case inaccurate, albeit the recording sound spacious and with good frequency response (thanks to the DPA omnis)

HOLOPHONE PROBLEMS:

7

(8)

The EIGENMIKE The EIGENMIKE

^TM^TM

 Array with 32 ½” capsules of good quality, frequency response up to 20 kHz

 Preamplifiers and A/D converters inside the sphere, with ethernet interface

 Processing on the PC thanks to a VST plugin (no GUI)

8

(9)

The EIGENMIKE

^TM^TM

software (HOA) software (HOA)

9

(10)

Spherical Harmonics (H.O.Ambisonics)

A fixed number of “intermediate” virtual microphones is computed (B-format), then the dynamically-positioned virtual microphones are obtained by linear combination of these intermediate signals. This limits both dynamic range and frequency range.

Virtual microphones

(11)

The RAI-CRIT project The RAI-CRIT project

• “Virtual” microphones with high directivity, controlled by mouse/joystick in order to follow in realtime actors on the stage. They should be capable to modify their directivity in a sort of “acoustical zoom”.

• Surround recordings with microphones that can be modified (directivity, angle, gain, ecc..) in post- production.

• Get rid of problems with Spherical Harmonics signals

GOALS:

11

(12)

GOALS GOALS

We want to synthesize virtual microphones highly directive, steerable, and with variable directivity pattern

12

(13)

Live processing and recording for post- production

 A “service” video is diplayed on the Mac’s screen. This allows to follow the action in scene, and to see

where the synthetized microphones are aimed. The microphones can be moved, fllowing the action, with

almost imperceptible latency. The processed signals are available either as analog or digital (opticala ADAT) in realtime.

 The system can also be used as a multitrack recorder: the “raw” 32 capsule signals are recorded in a W64 file, for later processing in the studio.

(14)

Zoom and dynamic aiming

 7 synthesized signals can be output through ADAT. The 7 virtual microphones can be

“mapped”: for example, on a audio mixer and across a simple programmer for each single microphone you can select the spatial position and its polar pattern

 In real time, no more than 7 virtual microphones can be elaborated, while during the post-production phase the

number of virtual microphones

can be increased at will.

(15)

Boheme Azimut Elevation Order

Microfono 1 -35 5 4

Microfono 2 25 5 4

Microfono 3 -5 5 4

Microfono 4 -130 15 1,5

Microfono 5 115 15 1,5

Microfono 6 -35 -45 2

Microfono 7 25 -45 2

 It is possible to program, either from a GUI, or from a mixer, the optimal

microphone configuration for different kinds of

recording spaces

Configuration Example

(16)

• The 3DVMS technology

(17)

 The processing was initially performed on a dedicated Linux “black box”

 Low cost hardware (the processing unit is below 400 Euros)

 Realtime synthesis of processing filters with our novel algorithm

 Aiming and directivity of the virtual microphones can be changed in realtime under control of a joystick or with the mouse

Architecture of the system for realtime processing Architecture of the system for realtime processing

But nowadays all the processing is performed on a powerful MacBook Pro with Core i7 processor and mouse control

But nowadays all the processing is performed on a powerful

MacBook Pro with Core i7 processor and mouse control

(18)

The real-time microphone system

• The “Black box” runs a special version of Linux, optimized for low latency and multitasking on

multicore processors, and the open-source convolution engine BruteFIR

• The “Black box” runs a special version of Linux, optimized for low latency and multitasking on

multicore processors, and the open-source convolution engine BruteFIR

• The Laptop operates with a GUI written in Python, and controlled with a mouse or a

joystick

• The Laptop operates with a GUI written in Python, and controlled with a mouse or a

joystick

• A panoramic camera provides the

background live video imaging

• A panoramic camera provides the

background live video imaging

(19)

Hardware for 360° video

• A 2 Mp hires Logitech webcam is mounted under a parabolic

mirror, inside a Perspex tube

• The video stream from the Logitech webcam is processed with a realtime video-

unwrapping software, written by Adriano Farina in “Processing”, a Java-based programming language and environment

• It is possible to record the

unwrapped video stream to a

standard MOV file

(20)

Unwrapped image Original image

(21)

Examples

(22)

Video Sample: ScreenRecording

(23)

Example with multiple speakers and a single, movable virtual microphone inside a Reverberant room (post

processing)

23

(24)

Example of operation from a very unfavourable shooting position

Arlecchino servo di due padroni

“ Piccolo Teatro” , Milan 20 october 2010

24

(25)

5 fixed virtual microphones in post-production

25

(26)

Video Sample: Parlato

(27)

Example of post-processing with 7 fixed microphones

La Bohème Theater Regio Turin

20 may 2010

27

(28)

Video Sample: Boheme

(29)

7.1 recording of symphonic music

Concerto in re maggiore op. 35

per violino e orchestra P. I.

Tchaikovsky

Conservatory of Turin 22 november 2010

29

(30)

Video Sample: Tchaikovsky

(31)

Permanent installation of 2 Eigenmikes at

Auditorium Toscanini in

Turin

(32)

Auditorium Toscanini

• The two permanently installed Eigenmikes are almost invisible

(33)

Auditorium Toscanini – Stage coverage

• Original setup employing 22 cardioid microphones

(34)

Auditorium Toscanini – Stage coverage

• The same with 14 virtual microphones (3DVMS)

(35)

Realtime broadcasting

over Internet in 5.1

(36)

System Layout

• 5.1 channels form the mixer are captured by a PC with an RME soundcard and encoded and roadcasted in realtime through Windows Media Streaming

To the Rai Internet streaming server

(37)

Windows Media Streaming

• Up to 7.1 surround in Windows Media Audio format

(38)

Ongoing research activity

(39)

Cylindrical microphone arrays

Craig Jin, University of Sydney, 2007:

•Cylndrical 2D array

•32 DPA capsules on a ring

•Rigid cylindrical body

(D=100mm)

(40)

The new RAI 3D microphone array

• Cylndrical 3D array

• 32 Sennheiser capsules on the whole surface

• Rigid cylindrical body (D=110mm)

• Parabolic mirror

• IP camera 5 Mp

• The microphones are

stacked also vertically

(41)

Advantages of 3D cylindrical array

• Better angular resolution on the azimuth than on the elevation angle

• Integrated high quality panoramic video

• Same electronics and software as the Eigenmike

• Usage not constrained by previous patents and intellectual property

Disadvantages

• Bulky, heavy, no shock mount

• Currently no windshield (work in progress)

(42)

3D Audio playback

(43)

Spatial PCM Sampling

Whilst Sherical Harmonics are the “spatial” equivalent of the Fourier analysis of a wavefront,

Our “virtual microphone” approach is the equivalent of representing a waveform with a sequence of pulses

(PCM, pulse code modulation)

1

2 32 3

32 virtual microphones

for 3D map

(44)

Spatial PCM Sampling

PCM modelling of a waveform and of a spatial balloon

A waveform is represented by a sequence of pulses, a balloon is a “sea urchin” of spikes

(45)

Spatial Fourier Sampling

A waveform can be expanded as the sum of a number of sinusoids (Fourier), Exactly as a balloon can be represented by the sum of a number of spherical harmonics (Ambisonics)

(46)

Not-uniform spatial sampling with 32 spatial Diracs

Unfortunately, the largest regular polyhedron has 20 faces (icosaedron) – a 32-points sampling is slightly irregular

1 2

3

4 5

6

7

8

9

10

11 12

13

14

15

16

17 18

19 20

21 22

23

24

25 26

27 28

29

30

31

32 17

19 21

25

(47)

16-loudspeakers playback system

From 32 virtual microphones to loudspeakers

    ^s ^y ^*   ^f

) t ( f ) t ( y )

t (

s

³²

1 i

r , i i

r

    



Eigenmike^TM

Matrix of 32x16 FIR filters

32 virtual microphone signals y

16 speaker feeds s

(48)

The transfer functions of the sound system are measured

The Eigenmike^TM is placed in the center of the loudspeaker rig, and the transfer functions [k] are measured from each

loudspeaker to each of the 32 virtual microphones.

Then we impose that the resulting signals {y_out} are identical to the virtual

microphone signals recorded by the Eigenmike in the original room {y}:

And consequently we solve for the unknown filters [f]

k₁

k₂

k_R

    ^y

_out

^ ^s ^*   ^k ^   ^y ^*     ^f ^* ^k

Virtual microphone signals y_out

(49)

From 32 virtual microphones to loudspeakers

High resolution 3D sound systems:

•ISVR (Southampton, UK) – 40 loudpeakers

•Casa della Musica, (Parma, ITALY)

•16 loudspeakers

(50)

3D sound system at University of Parma

Horizontal Ambisonics octagon Ambisonics 3D cubeStandard Stereo

Frontal Stereo-Dipole Rear Stereo DipoleUpper Stereo Dipole

(51)

Conclusions

• The most versatile method currently available for capturing a 3D acoustical scene is to employ a microphone array, and to derive a number of virtual microphones