Ambiophonic Principles for the Recording and Reproduction
of Surround Sound for Music
Angelo Farina (1), Ralph Glasgal (2), Enrico Armelloni (1), Anders Torger (1)
(1) Industrial Engineering Dept., University of Parma, Via delle Scienze 181/A
Parma, 43100 ITALY – HTTP://pcfarina.eng.unipr.it
(2) Ambiophonics Institute, 4 Piermont Road, Rockleigh, New Jersey 07647, USA
HTTP://www.ambiophonics.org
The Ambiophonics method
Ambiophonics is a hybrid method for creating a realistic spatial reproduction of staged music, starting from two-channel recordings but extensible to various kinds of microphone arrangements, up to discrete multichannel
The system is based on two independently designed groups of loudspeakers: a Stereo Dipole, responsible for the reproduction of the direct sound and early reflections coming from the stage, and a surround periphonic array, driven by real-time convolution with room impulse responses
The Stereo Dipole
Cross-talk cancellation allows the recorded signals to be replicated at the ears of the listener
The surround convolution
Design of cross-talk canceling filters
First, a binaural measurement is made in front of the Stereo Dipole loudspeakers
Theory of cross-talk canceling filters
The regularization parameter, ε, has to be adjusted by trial and error
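The design step can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes the four measured responses are named hll, hlr, hrl and hrr (loudspeaker to ear), uses one common frequency-domain form of regularized inversion of the 2x2 matrix, and treats eps (the regularization parameter) and n_fft as free parameters chosen by the user.

```python
import numpy as np

def crosstalk_filters(hll, hlr, hrl, hrr, eps=1e-3, n_fft=4096):
    """Regularized inversion of a 2x2 matrix of measured impulse responses.

    Returns the four inverse filters (fll, flr, frl, frr), each n_fft long.
    The names and the constant (frequency-independent) eps are assumptions.
    """
    Hll, Hlr, Hrl, Hrr = (np.fft.rfft(h, n_fft) for h in (hll, hlr, hrl, hrr))
    # Determinant of the 2x2 transfer matrix at each frequency bin
    det = Hll * Hrr - Hlr * Hrl
    # Regularized reciprocal: eps keeps the gain bounded where det is small
    inv_det = np.conj(det) / (np.abs(det) ** 2 + eps)
    spectra = [Hrr * inv_det, -Hlr * inv_det, -Hrl * inv_det, Hll * inv_det]
    # Circular shift by half the length so the filters become causal
    return [np.roll(np.fft.irfft(F, n_fft), n_fft // 2) for F in spectra]
```

A larger eps gives shorter, better-behaved filters at the price of less accurate cancellation, which is why the value is normally tuned by listening or by inspecting the resulting filters.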
Example
Measured impulse responses h
Example
Computed long-FIR inverse filters f
Warped FIR cross-talk cancellation
Today's DSP boards are not powerful enough for convolving long inverse FIR filters
Warping can be used for concentrating the computing power in the frequency range where it is most needed
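As a rough illustration of the idea, a warped FIR filter can be obtained from an ordinary FIR structure by replacing every unit delay with a first-order allpass section. The sketch below is plain Python, not the SHARC implementation; the function names and the warping coefficient lam are assumptions made for this example.

```python
import numpy as np

def allpass(x, lam):
    """First-order allpass D(z) = (z^-1 - lam) / (1 - lam * z^-1)."""
    y = np.zeros(len(x))
    x_prev = y_prev = 0.0
    for n, xn in enumerate(x):
        y[n] = -lam * xn + x_prev + lam * y_prev
        x_prev, y_prev = xn, y[n]
    return y

def warped_fir(x, taps, lam):
    """FIR structure whose unit delays are replaced by allpass sections."""
    stage = np.asarray(x, dtype=float)
    out = np.zeros(len(stage))
    for b in taps:
        out += b * stage             # tap output of the current stage
        stage = allpass(stage, lam)  # "warped delay" feeding the next tap
    return out
```

With lam = 0 each allpass section reduces to a plain one-sample delay and the structure becomes a conventional FIR filter; a non-zero lam redistributes the frequency resolution of a fixed number of taps.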
Warped FIR implementation on a SHARC
The WFIR structure was coded in assembly on the AD21061 and AD21065L processors; the assembly code of the main loop is shown here:
Subjective blind comparison: FIR vs. WFIR
14 normal-hearing subjects (6 females, 8 males)
Two sound samples: binaural recording of natural sounds and a piece of pop music (Elton John)
5-level scale (insufficient, mediocre, sufficient, fair, good)
The listener was free to switch at will between the two processing algorithms, denoted simply as A and B
Classic ANOVA of the subjective responses
“Virtual Ambisonics” surround
Measurement of 3D (B-format) impulse responses in theatres, with two source positions on the stage
The IRs are processed, deriving the responses of several directive microphones
Each soundtrack of the original stereo recording is convolved with the corresponding IR
For each loudspeaker, the results of the two convolutions are mixed
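The last two steps above can be sketched as follows. This is an offline illustration under stated assumptions: left and right are the two tracks of the stereo recording, and irs_left[i] / irs_right[i] are hypothetical names for the impulse responses derived for loudspeaker i from the left and right source positions on the stage.

```python
import numpy as np

def speaker_feeds(left, right, irs_left, irs_right):
    """Convolve each stereo track with its IR and mix, one feed per loudspeaker."""
    feeds = []
    for ir_l, ir_r in zip(irs_left, irs_right):
        a = np.convolve(left, ir_l)    # left track through the left-source IR
        b = np.convolve(right, ir_r)   # right track through the right-source IR
        feed = np.zeros(max(len(a), len(b)))
        feed[:len(a)] += a
        feed[:len(b)] += b             # mix the two convolutions
        feeds.append(feed)
    return feeds
```

A real-time system would replace np.convolve with a block-based frequency-domain convolver, but the signal flow per loudspeaker is the same.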
Measurements in 3 Italian theatres
Synthesis of directive microphones
The WXYZ channels of a B-format IR can be processed, extracting a single (mono) response of a virtual microphone pointing along a given unit vector r (rx, ry, rz):
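One common way to do this, assuming a virtual cardioid and the traditional B-format convention in which W is recorded 3 dB below the directional channels, is V = 0.5 (√2 W + rx X + ry Y + rz Z). The sketch below implements that relation; the cardioid pattern and the function name are assumptions, and other patterns would use a different weighting between W and the directional terms.

```python
import numpy as np

def virtual_mic(w, x, y, z, r):
    """Virtual cardioid pointing along r, from B-format channels W, X, Y, Z.

    Assumes W carries the traditional 1/sqrt(2) gain; r is normalized here.
    """
    r = np.asarray(r, dtype=float)
    r = r / np.linalg.norm(r)
    return 0.5 * (np.sqrt(2.0) * w + r[0] * x + r[1] * y + r[2] * z)
```

Under these assumptions a plane wave arriving from direction r is passed with unit gain, while one arriving from the opposite direction is cancelled.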
The Double-reverberation problem
When an impulse response is reproduced in another reverberant space, the resulting reverberant tail is the convolution of the two reverberant tails
Hardware implementation
A complete Ambiophonics system can now be implemented by coupling a general-purpose DSP unit (cross-talk cancellation) with convolution-based reverberators
Software implementation
The preferred implementation is by means of a simple software convolver and a cheap, modern PC. Two solutions are currently available:
Latency vs. performance
The software implementation is based on frequency-domain convolution (overlap-and-save), which inherently introduces some latency.
Furthermore, the audio stream I/O on a PC is always buffered, so an intrinsic latency is caused by the buffer size
BruteFIR distinguishes itself from other convolvers by implementing partitioned convolution: the impulse response is subdivided into many segments of equal length, which reduces the latency to twice the length of a segment, instead of twice the length of the whole IR.
On modern CPUs, partitioned convolution is more efficient than traditional unpartitioned overlap-and-save, reducing the CPU load by 20-50%, and can bring the overall latency below 100 ms.
Very efficient FFT implementations are freely available (Intel NSP, FFTW), and thus the computing power of a PC is enough for real-time convolution of 20 IRs, at 44.1 kHz, 32 bits, each being 65,536 points long. The demonstration machine, installed in room 22, is an old Pentium-II 400 MHz.
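The partitioned scheme described above can be sketched in NumPy as a uniformly partitioned overlap-and-save convolver. This is an offline illustration of the technique, not BruteFIR's code; the function name and the block parameter (the segment length) are assumptions.

```python
import numpy as np

def partitioned_convolve(x, ir, block=256):
    """Uniformly partitioned overlap-and-save convolution of x with ir."""
    n_parts = -(-len(ir) // block)                  # ceil(len(ir) / block)
    ir_pad = np.zeros(n_parts * block)
    ir_pad[:len(ir)] = ir
    # Spectrum of every IR segment, each zero-padded to 2 * block samples
    H = np.fft.rfft(ir_pad.reshape(n_parts, block), 2 * block, axis=1)

    n_out = len(x) + len(ir) - 1
    n_blocks = -(-n_out // block)
    x_pad = np.zeros(n_blocks * block)
    x_pad[:len(x)] = x

    fdl = np.zeros_like(H)      # delay line holding past input spectra
    prev = np.zeros(block)
    out = np.zeros(n_blocks * block)
    for b in range(n_blocks):
        cur = x_pad[b * block:(b + 1) * block]
        fdl = np.roll(fdl, 1, axis=0)               # age the stored spectra
        fdl[0] = np.fft.rfft(np.concatenate([prev, cur]))
        # Row k of fdl (input from k blocks ago) meets IR segment k
        y = np.fft.irfft((fdl * H).sum(axis=0), 2 * block)
        out[b * block:(b + 1) * block] = y[block:]  # drop the aliased half
        prev = cur
    return out[:n_out]
```

Each output block only needs one forward FFT of 2 * block samples and one inverse FFT, regardless of the IR length, so the processing delay is tied to the segment length rather than to the whole impulse response.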
Subjective comparative experiment
9 normal-hearing subjects (males)
Three sound samples:
Simple ranking test between three systems: Stereo-Dipole, Virtual Ambisonics, complete Ambiophonics
Each listener can switch freely among the three systems during the playback
Conclusions
Ambiophonics proved to give significant advantages over the two surround systems that constitute it.
It recreates a realistic virtual acoustic space by means of convolution with proper digital filters
The computational power required can be obtained cheaply by means of a modern PC
The system can be configured for different numbers and positions of loudspeakers
The “sweet spot” can easily accommodate three persons, and even far from this area the overall acoustic impression remains that of being in a concert hall.
Internet Links