1	ROOM IMPULSE RESPONSES AS TEMPORAL AND SPATIAL FILTERS
2	Topics
3	Basic sound propagation scheme
4	Traditional measurement methods: impulsive sources such as balloons or a blank pistol.
5	Modern electroacoustical methods: The sound is generated by an omnidirectional loudspeaker, driven by a computer-generated signal. The same computer also records the room’s response through one or more omnidirectional microphones. Directive microphones (binaural, figure-of-eight) can be used as well. Several types of test signal have been developed, providing good immunity to background noise and easy deconvolution of the impulse response: MLS (Maximum Length Sequence, pseudo-random white noise); TDS (Time Delay Spectrometry, essentially a linear sine sweep, also known in Japan as “stretched pulse”); ESS (Exponential Sine Sweep). Each of these test signals can be combined with different deconvolution techniques, resulting in a number of “different” measurement methods. For both theoretical and practical reasons, the current preference is generally for ESS with non-circular deconvolution.
6	Measurement process: The desired result is the linear impulse response of the acoustic propagation, h(t). It can be recovered from the known test signal x(t) and the measured system output y(t). The effects of the nonlinear part K and of the background noise n(t) must be excluded.
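The block diagram of this slide is not reproduced in the text. As a hedged sketch of the model it describes, assume that the nonlinear part K acts on the test signal before the linear propagation h(t), and that the noise n(t) is added at the output (the symbol for convolution is an assumption of this note):

```latex
y(t) = n(t) + K\left[x(t)\right] \otimes h(t)
```

Under these assumptions, when K is negligible and n(t) is small the relation reduces to y(t) ≈ x(t) ⊗ h(t), so h(t) follows from deconvolving y(t) by x(t).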
7	Hardware: PC and audio interface
8	Hardware: loudspeaker & microphone
9	Software
10	MLS method: x(t) is a periodic binary signal generated by a suitable shift register, configured for the maximum length of the period.
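A minimal sketch of such a shift-register generator, in plain Python. The function name `mls`, the register order and the feedback taps (4, 3) are illustrative assumptions, not taken from the slides; this tap pair gives the maximal period of 15 samples for a 4-stage register.

```python
def mls(order, taps):
    """Return one period (2**order - 1 samples) of a +1/-1 sequence.

    `taps` are the 1-based register stages XORed to form the feedback bit.
    The period is maximal only for suitable tap choices.
    """
    state = [1] * order                      # any non-zero start state works
    seq = []
    for _ in range(2 ** order - 1):
        seq.append(1 if state[-1] else -1)   # output is taken from the last stage
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]
        state = [feedback] + state[:-1]      # shift the register by one stage
    return seq


x = mls(4, (4, 3))                           # hypothetical 4-stage example


def circular_autocorrelation(seq, lag):
    n = len(seq)
    return sum(seq[i] * seq[(i + lag) % n] for i in range(n))
```

The property that makes the sequence useful for measurements is its circular autocorrelation: it equals the period length at lag 0 and -1 at every other lag, which is close to a periodic unit pulse.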
11	MLS deconvolution: The re-recorded signal y(i) is cross-correlated with the excitation signal by means of a fast Hadamard transform. The result is the required impulse response h(i), provided the system is linear and time-invariant.
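The cross-correlation step can be illustrated with a direct (slow) circular cross-correlation; the fast Hadamard transform computes the same result more efficiently and is not reproduced here. This is a sketch under stated assumptions: the helper names, the 15-sample sequence and the toy impulse response are all hypothetical.

```python
def mls(order, taps):
    """One period of a +1/-1 shift-register sequence (see assumptions above)."""
    state = [1] * order
    seq = []
    for _ in range(2 ** order - 1):
        seq.append(1 if state[-1] else -1)
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]
        state = [feedback] + state[:-1]
    return seq


def circular_convolve(x, h):
    """Steady-state response of a linear system h to the periodic signal x."""
    n_per = len(x)
    return [sum(h[m] * x[(n - m) % n_per] for m in range(len(h)))
            for n in range(n_per)]


def mls_deconvolve(y, x):
    """Circular cross-correlation of y with x, scaled by 1 / (L + 1)."""
    n_per = len(x)
    return [sum(y[n] * x[(n - k) % n_per] for n in range(n_per)) / (n_per + 1)
            for k in range(n_per)]


x = mls(4, (4, 3))                  # hypothetical 15-sample excitation
h_true = [1.0, 0.5, -0.25]          # toy impulse response, shorter than one period
y = circular_convolve(x, h_true)    # noiseless, linear, time-invariant "system"
h_est = mls_deconvolve(y, x)
offset = sum(h_true) / (len(x) + 1) # constant bias of the estimate
```

Because the autocorrelation of the sequence is -1 rather than 0 away from lag 0, the estimate equals the true response minus the constant `offset`; the bias shrinks as the sequence gets longer.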
12	MLS example
13	MLS example
14	Exponential Sine Sweep method: x(t) is a sine signal whose frequency is varied exponentially with time, starting at f₁ and ending at f₂.
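A sketch of how such a sweep can be generated, using the usual closed form x(t) = sin( (2π·f₁·T / ln(f₂/f₁)) · (exp(t·ln(f₂/f₁)/T) − 1) ), where T is the sweep duration. The function names, sampling rate and band limits below are illustrative assumptions.

```python
import math


def exp_sine_sweep(f1, f2, duration, fs):
    """Samples of an exponential sine sweep from f1 to f2 (Hz) lasting `duration` s."""
    n_samples = int(round(duration * fs))
    rate = math.log(f2 / f1)
    k = 2.0 * math.pi * f1 * duration / rate
    return [math.sin(k * (math.exp(rate * n / (fs * duration)) - 1.0))
            for n in range(n_samples)]


def instantaneous_frequency(f1, f2, duration, t):
    """Frequency reached by the sweep at time t (seconds)."""
    return f1 * math.exp(t * math.log(f2 / f1) / duration)


# Hypothetical parameters: 100 Hz to 2 kHz in 0.5 s at 8 kHz sampling rate.
x = exp_sine_sweep(100.0, 2000.0, 0.5, 8000)
```

The instantaneous frequency grows by a fixed ratio per unit time, so each octave takes the same time to sweep.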
15	Test Signal – x(t)
16	Measured signal - y(t): The nonlinear behaviour of the loudspeaker causes many harmonics to appear.
17	Inverse Filter – z(t): The IR is deconvolved by convolving the measured signal y(t) with the inverse filter z(t) [the equalized, time-reversed x(t)].
18	Deconvolution of Log Sine Sweep: The “time reversal mirror” technique is employed: the system’s impulse response is obtained by convolving the measured signal y(t) with the time-reversed test signal x(-t). As the log sine sweep does not have a “white” spectrum, proper equalization is required.
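The deconvolution described here can be sketched in plain Python as follows. The equalization is implemented, as an assumption of this sketch, by weighting the sweep in proportion to its instantaneous frequency (+6 dB/octave) before time reversal, which compensates the falling spectrum of the exponential sweep. The result is not normalized in amplitude, all names and parameters are hypothetical, and a practical implementation would use FFT-based convolution instead of the direct loop.

```python
import math


def exp_sine_sweep(f1, f2, duration, fs):
    n_samples = int(round(duration * fs))
    rate = math.log(f2 / f1)
    k = 2.0 * math.pi * f1 * duration / rate
    return [math.sin(k * (math.exp(rate * n / (fs * duration)) - 1.0))
            for n in range(n_samples)]


def inverse_filter(x, f1, f2):
    """Time-reversed sweep, weighted in proportion to instantaneous frequency."""
    n = len(x)
    rate = math.log(f2 / f1)
    # Weight is f(t)/f2: small at the sweep start, approaching 1.0 at its end.
    weights = [math.exp(rate * (i / n - 1.0)) for i in range(n)]
    return [x[i] * weights[i] for i in reversed(range(n))]


def convolve(a, b):
    """Direct linear (non-circular) convolution."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, av in enumerate(a):
        if av == 0.0:
            continue
        for j, bv in enumerate(b):
            out[i + j] += av * bv
    return out


fs, f1, f2 = 8000, 100.0, 2000.0
x = exp_sine_sweep(f1, f2, 0.2, fs)
z = inverse_filter(x, f1, f2)
delay = 40                                   # toy "room": pure delay, gain 0.5
y = [0.0] * delay + [0.5 * v for v in x]
ir = convolve(y, z)
peak = max(range(len(ir)), key=lambda i: abs(ir[i]))
```

With linear convolution the recovered impulse appears at index `len(x) - 1 + delay`, that is, the system delay shifted by the length of the inverse filter.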
19	Result of the deconvolution: The last impulse response is the linear one; the preceding ones are the harmonic distortion products of the various orders.
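For an exponential sweep of duration T from f₁ to f₂, the response of the N-th order harmonic is known to precede the linear response by Δt = T·ln(N) / ln(f₂/f₁). A small sketch, with a hypothetical function name and an assumed 15 s sweep from 20 Hz to 20 kHz:

```python
import math


def harmonic_advance(order, f1, f2, duration):
    """Seconds by which the order-N distortion response precedes the linear one."""
    return duration * math.log(order) / math.log(f2 / f1)


# Assumed sweep: 15 s, 20 Hz to 20 kHz.
advances = [harmonic_advance(n, 20.0, 20000.0, 15.0) for n in (1, 2, 3, 4)]
```

For these assumed values the second-order response lands about 1.5 s ahead of the linear one, which is why the individual responses can be separated by simple time windowing.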
20	IR Selection: After the sequence of impulse responses has been obtained, it is possible to select and isolate just one of them:
21	ESS example
22	ESS example
23	Maximum Length Sequence vs. Sweep
24	Post processing of impulse responses: A special plugin has been developed for the computation of STI according to IEC-EN 60268-16:2003.
25	Post processing of impulse responses: A special plugin has been developed for the analysis of acoustical parameters according to ISO 3382.
26	The new AQT plugin for Audition: The new module is still under development and will allow very fast computation of the AQT (Dynamic Frequency Response) curve from within Adobe Audition.
27	Spatial analysis by directive microphones: The initial approach was to use directive microphones to gather some information about the spatial properties of the sound field “as perceived by the listener”. Two apparently different approaches emerged: binaural dummy heads and pressure-velocity microphones.
28	“objective” spatial parameters: Attempts were made to “quantify” the “spatiality” of a room by means of “objective” parameters, based on two-channel impulse responses measured with directive microphones. The best-known “spatial” parameter is IACC (Inter-Aural Cross-Correlation), based on binaural IR measurements.
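IACC is commonly defined as the maximum absolute value of the normalized cross-correlation between the left and right impulse responses, for lags within ±1 ms. The sketch below follows that definition; the function name is hypothetical, and the choice of integration window (for example the first 80 ms) is left to the caller, who is assumed to pass already-trimmed responses.

```python
import math


def iacc(h_left, h_right, fs, max_lag_ms=1.0):
    """Maximum absolute normalized cross-correlation for |lag| <= max_lag_ms."""
    n = min(len(h_left), len(h_right))
    max_lag = int(round(max_lag_ms * 1e-3 * fs))
    energy_l = sum(v * v for v in h_left[:n])
    energy_r = sum(v * v for v in h_right[:n])
    norm = math.sqrt(energy_l * energy_r)
    if norm == 0.0:
        return 0.0
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                acc += h_left[i] * h_right[j]
        best = max(best, abs(acc) / norm)
    return best


# Toy example: the right ear receives a copy delayed by 10 samples (about 0.2 ms).
left = [0.0] * 100
right = [0.0] * 100
left[50], left[51] = 1.0, 0.5
right[60], right[61] = 1.0, 0.5
value = iacc(left, right, 48000)
```

A pure interaural delay shorter than 1 ms leaves IACC at 1, while responses that only match at larger lags drive it towards 0.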
29	“objective” spatial parameters: Other “spatial” parameters are the lateral energy ratios LE, LF and LFC. These are defined from a two-channel impulse response: the first channel is a standard omnidirectional microphone, the second a “figure-of-eight” microphone:
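The formula of this slide is not reproduced in the text. As a sketch, LF is usually computed as the energy of the figure-of-eight response between 5 ms and 80 ms divided by the energy of the omnidirectional response between 0 and 80 ms. The function name is hypothetical, and both responses are assumed to start at the arrival of the direct sound.

```python
def lateral_fraction(h_omni, h_fig8, fs):
    """LF: lateral energy (5-80 ms) over total energy (0-80 ms)."""
    n5 = int(round(0.005 * fs))
    n80 = int(round(0.080 * fs))
    lateral = sum(v * v for v in h_fig8[n5:n80])
    total = sum(v * v for v in h_omni[:n80])
    return lateral / total


# Toy example at 1 kHz sampling: direct sound plus one fully lateral reflection.
fs = 1000
h_omni = [0.0] * 100
h_fig8 = [0.0] * 100
h_omni[0] = 1.0      # direct sound, frontal (in the null of the figure-of-eight)
h_omni[20] = 1.0     # reflection at 20 ms
h_fig8[20] = 1.0     # the same reflection, arriving from the side
h_fig8[2] = 1.0      # energy before 5 ms is excluded from the numerator
lf = lateral_fraction(h_omni, h_fig8, fs)
```

In this toy case half of the early energy is lateral, so the ratio is 0.5.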
30	Robustness of spatial parameters: Both IACC and LF depend strongly on the orientation of the microphones. Binaural and pressure-velocity measurements were performed in two theatres, using a rotating table to turn the microphones.
31	Are binaural measurements reproducible? Experiment performed in an anechoic room: same loudspeaker, same source and receiver positions, 5 binaural dummy heads.
32	Are binaural measurements reproducible? 90° incidence: at low frequency IACC is almost 1; at high frequency the differences between the heads become evident.
33	Are binaural measurements reproducible? Diffuse field: the differences between the heads are now dramatic.
34	Are LF measurements reproducible? Experiment performed in the Auditorium of Parma: same loudspeaker, same source and receiver positions, 5 pressure-velocity microphones.
35	Are LF measurements reproducible? At 7.5 m distance, the results already exhibit significant scatter.
36	Are LF measurements reproducible? At 25 m distance, the scatter is even larger.
37	3D extension of the pressure-velocity measurements: The Soundfield microphone allows simultaneous measurement of the omnidirectional pressure and of the three Cartesian components of particle velocity (figure-of-eight patterns).
38	3D Impulse Response (Gerzon, 1975)
39	The Waves project (2003): The original idea of Michael Gerzon was finally put into practice in 2003, thanks to the Israel-based company WAVES. More than 50 theatres all around the world were measured, capturing 3D IRs (4-channel B-format with a Soundfield microphone). The measurements also included binaural impulse responses and a circular array of microphone positions. More details on WWW.ACOUSTICS.NET
40	Directivity of transducers: LookLine D200 dodecahedron
41	Directivity of transducers
42	What about source directivity? Current 3D IR sampling is still based on the use of an “omnidirectional” source. A 3D IR measured in this way provides no information about the sound field generated inside the room by a directive source (e.g., a musical instrument or a singer). Dave Malham suggested representing the source directivity, too, with a set of spherical harmonics, called O-format; this is perfectly reciprocal to the representation of the microphone directivity with the B-format signals (Soundfield microphone). Consequently, a complete and reciprocal spatial transfer function can be defined, employing a 4-channel O-format source and a 4-channel B-format receiver:
43	1st order MIMO impulse response: If only spherical harmonics of order 0 and 1 are taken into account, a complete spatial transfer function measurement requires 16 impulse responses:
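The figure of this slide is not reproduced in the text. One way to write the 16 responses, as a hedged sketch, is a 4x4 matrix linking the four source components to the four receiver components. The subscripts s (source) and r (receiver) and the index order of h are notation assumed here, and every product stands for a convolution:

```latex
\begin{bmatrix} W_r \\ X_r \\ Y_r \\ Z_r \end{bmatrix}
=
\begin{bmatrix}
h_{WW} & h_{WX} & h_{WY} & h_{WZ} \\
h_{XW} & h_{XX} & h_{XY} & h_{XZ} \\
h_{YW} & h_{YX} & h_{YY} & h_{YZ} \\
h_{ZW} & h_{ZX} & h_{ZY} & h_{ZZ}
\end{bmatrix}
\otimes
\begin{bmatrix} W_s \\ X_s \\ Y_s \\ Z_s \end{bmatrix}
```

In this notation the element in the first row and first column is the conventional omnidirectional-source, omnidirectional-receiver impulse response.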
44	Limits of the 1st-order method: Although mathematically elegant and easy to implement with existing hardware, the 1st-order method presented here cannot faithfully represent the complex directivity pattern of a human voice or of a human ear:
45	Limits of the 1st-order method: The polar pattern of a binaural dummy head is even more complex, as shown here (1 kHz, right ear):
46	How to get better spatial resolution? The answer is simple: analyze the spatial distribution of both source and receiver by means of a higher-order spherical harmonics expansion. Spherical harmonics analysis is the equivalent, in the space domain, of Fourier analysis in the time domain. Just as a complex time-domain waveform can be thought of as the sum of a number of sine and cosine functions, so a complex spatial distribution around a given notional point can be expressed as the sum of a number of spherical harmonic functions.
47	Higher-order spherical harmonics expansion
48	3rd-order microphone (Trinnov - France): Arnaud Laborie developed a compact 24-capsule microphone array; by means of advanced digital filtering, spherical harmonic signals up to 3rd order are obtained (16 channels).
49	4th-order microphone (France Telecom): Jerome Daniel and Sebastien Moreau built samples of 32-capsule spherical arrays; these allow the extraction of microphone signals up to 4th order (25 channels).
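The channel counts quoted for these arrays follow from the number of spherical harmonics up to a given order N, which is (N + 1)². A short check, with hypothetical function names:

```python
def ambisonic_channels(order):
    """Number of spherical-harmonic signals up to and including `order`."""
    return (order + 1) ** 2


def mimo_impulse_responses(order):
    """Impulse responses needed when source and receiver share the same order."""
    return ambisonic_channels(order) ** 2


channels = {n: ambisonic_channels(n) for n in range(5)}
```

Order 1 gives 4 channels and a 4x4 set of 16 impulse responses; orders 3 and 4 give 16 and 25 channels respectively.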
50	Multichannel software for high-order: Plogue Bidule can be used as multichannel host software, running a number of VST plugins developed by France Telecom. These include spherical harmonics extraction from the spherical microphone arrays, rotation and manipulation of the multichannel B-format signals, and final rendering either on head-tracked headphones or on a static array of loudspeakers (high-order Ambisonics).
51	Verification of high-order patterns: Sebastien Moreau and Olivier Warusfel verified the directivity patterns of the 4th-order microphone array in the anechoic room of IRCAM (Paris).
52	Frequency extension of the patterns
53	High-order sound source: The Center for New Music and Audio Technologies (CNMAT) at the University of California, Berkeley developed a new 120-loudspeaker, digitally controlled sound source, capable of synthesizing sound emission according to spherical harmonics patterns up to 5th order.
54	Technical details of high-order source: Class-D embedded amplifiers
55	Accuracy of spatial synthesis: The spatial reconstruction error of a 120-loudspeaker array is frequency dependent, as shown here:
56	Advanced digital filtering techniques: A set of digital filters can be employed for synthesizing the required spatial pattern (spherical harmonics), when dealing with either a microphone array or a loudspeaker array. Whatever theory or method is chosen, we always start with N input signals x_i and derive from them M output signals y_j. In any case, each of these M outputs can be expressed by:
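The expression of this slide is not reproduced in the text. A natural reading, assumed here, is that each output is the sum over all inputs of the input convolved with its own filter: y_j = Σ_i x_i ⊗ h_ij. A minimal sketch with hypothetical names, assuming that all inputs share one length and all filters share one length:

```python
def convolve(a, b):
    """Direct linear convolution of two sample lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, av in enumerate(a):
        for j, bv in enumerate(b):
            out[i + j] += av * bv
    return out


def apply_filter_matrix(inputs, filters):
    """inputs: N signals; filters[i][j]: FIR filter from input i to output j."""
    n_out = len(filters[0])
    outputs = []
    for j in range(n_out):
        acc = None
        for i, x in enumerate(inputs):
            part = convolve(x, filters[i][j])
            acc = part if acc is None else [a + b for a, b in zip(acc, part)]
        outputs.append(acc)
    return outputs


# Toy case: N = 2 inputs, M = 1 output.
x1, x2 = [1.0, 0.0], [0.0, 1.0]
h = [[[1.0, 1.0]],      # filter from input 1 to output 1
     [[2.0, 0.0]]]      # filter from input 2 to output 1
y = apply_filter_matrix([x1, x2], h)
```

For N inputs and M outputs this requires N x M filters, which is why the design of the filter matrix dominates the remaining slides.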
57	Example with a microphone array: The sound field is sampled at N points by means of a microphone array.
58	Traditional design of digital filters: The processing filters h_ij are usually computed following one of several complex mathematical theories, based on the solution of the wave equation (often under certain simplifications), and assuming that the microphones are ideal and identical. In some implementations, the signal of each microphone is processed through a digital filter that compensates for its deviation, at the expense of a heavier computational load.
59	Novel approach: No theory is assumed: the set of h_ij filters is derived directly from a set of impulse response measurements, and designed according to a least-squares principle. In practice, a matrix of filtering coefficients is formed, and this matrix has to be numerically inverted (usually employing some regularization technique). In this way, the outputs of the microphone array are as close as possible to the prescribed ideal responses. This method also inherently corrects for transducer deviations and acoustical artifacts (shielding, diffraction, reflections, etc.).
60	Example: synthesizing 0-order shape
61	Example: synthesizing 0-order shape
62	System’s least-squares inversion: For computing the matrix of N filtering coefficients h_i0, a least-squares method is employed. A “total squared error” e_tot is defined as:
63	Kirkeby’s regularization: During the computation of the inverse filter, usually performed in the frequency domain, one often has to compute a ratio between complex spectra: H = A/D. Computing the reciprocal of the denominator D is generally not trivial, as the inverse of a complex, mixed-phase signal is generally unstable. The Nelson/Kirkeby regularization method is usually employed for this task:
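The formula of this slide is not reproduced in the text. The regularized ratio is commonly written as H = A·Conj(D) / (Conj(D)·D + ε), where ε is a small, real, non-negative regularization parameter; the sketch below applies that form bin by bin. The function name and the sample spectra are hypothetical.

```python
def kirkeby_ratio(a_spec, d_spec, eps):
    """Regularized A/D, evaluated independently for each frequency bin."""
    result = []
    for a, d, e in zip(a_spec, d_spec, eps):
        d = complex(d)
        result.append(complex(a) * d.conjugate() / (d.conjugate() * d + e))
    return result


# Toy spectra: the second bin of D is exactly zero, so plain division would fail.
A = [1 + 0j, 2j]
D = [2 + 0j, 0j]
H = kirkeby_ratio(A, D, [0.0, 0.1])
```

Where D is large compared with ε the result is practically A/D; where D approaches zero the result stays bounded instead of diverging.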
64	Spectral shape of the regularization parameter ε(ω): At very low and very high frequencies it is advisable to increase the value of ε.
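One possible way to build such a frequency-dependent parameter is sketched below: a low value inside the useful band, a high value outside it, and a smooth transition on a logarithmic scale in between. The function name, the band limits, the two levels and the one-third-octave transition width are all assumptions of this sketch, not values from the slides.

```python
import math


def regularization(f, f_lo, f_hi, eps_in, eps_out, width_oct=1.0 / 3.0):
    """Regularization value at frequency f (Hz): eps_in in band, eps_out far outside."""
    if f <= 0.0:
        return eps_out
    if f_lo <= f <= f_hi:
        return eps_in
    # Distance outside the band, measured in octaves.
    dist = math.log2(f_lo / f) if f < f_lo else math.log2(f / f_hi)
    frac = min(dist / width_oct, 1.0)
    # Interpolate between the two levels on a logarithmic scale.
    return eps_in * (eps_out / eps_in) ** frac


# Hypothetical settings: useful band 80 Hz to 16 kHz.
in_band = regularization(1000.0, 80.0, 16000.0, 1e-3, 1.0)
transition = regularization(70.0, 80.0, 16000.0, 1e-3, 1.0)
far_below = regularization(10.0, 80.0, 16000.0, 1e-3, 1.0)
```

The larger value outside the band keeps the inverse filter from boosting frequencies where the transducers deliver little signal.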
65	Example for a 4-channel microphone: DPA-4 A-format microphone, made of 4 closely-spaced cardioids. A set of 4x4 filters is required for getting B-format signals. A global approach is used for minimizing errors over the whole sphere.
66	IR measurements on the DPA-4
67	Computation of the inverse filters: A set of 16 inverse filters is required (4 inputs, 4 outputs = 1st-order B-format). For each of the 84 measured directions, a theoretical response can be computed for each of the 4 output channels (W, X, Y, Z). So 84x4 = 336 conditions can be set:
68	Real-time implementation
69	Complete high-order MIMO method: By employing massive arrays of transducers, it is nowadays feasible to sample the acoustical temporal-spatial transfer function of a room. Currently available hardware and software tools make this practical only up to 4th order, which means 25 inputs and 25 outputs. A complete measurement for a given source-receiver position pair takes approximately 10 minutes (25 sine sweeps of 15 s each are generated one after the other, while all the microphone signals are sampled simultaneously). However, it has been seen that real-world sources can already be approximated quite well with 2nd-order functions, and even human HRTF directivities are reasonably approximated with 3rd-order functions.
70	Conclusions: The sine sweep method proved to be systematically superior to the MLS method for measuring electroacoustical impulse responses. In fact, it is now employed in top-grade measurement systems, including Audio Precision (TM) and Bruel & Kjaer’s DIRAC software. Traditional methods for measuring “spatial parameters” proved to be unreliable and do not provide complete information. The 1st-order Ambisonics method can be used for generating and recording sound with a limited amount of spatial information. For better spatial resolution, High-Order Ambisonics can be used, limiting the spherical-harmonics expansion to a reasonable order (2nd, 3rd or 4th). Experimental hardware and software tools have been developed (mainly in France, but also in the USA), making it possible to build an inexpensive, complete measurement system. From the complete matrix of measured impulse responses it is easy to derive any suitable subset, including a highly accurate binaural rendering over head-tracked headphones.