Conversion between stereo (UHJ) and Ambix

 

This page explains how to use the X-volver plugin for converting between stereo (UHJ) and B-format (Ambix variant) surround formats using any multichannel VST host, such as my favourite one, Plogue Bidule (Audio Mulch, Reaper and many others can also be used).

UHJ is a "standard" stereo waveform, which includes information capable of driving a complete horizontal surround system (for example, equipped with 5, 6 or even 8 loudspeakers). The derivation of the signals optimized to feed the loudspeakers passes through two steps:

1) Extraction of 1st order B-format signals (W, X and Y channels) from the original stereo waveform. There is no Z channel: the result is horizontal only, sorry! The output is in Ambix format, since nowadays we want to be compatible with Youtube and Facebook.

2) Decoding of the B-format signal for the particular loudspeaker array employed, using traditional Ambisonics decoding or another equivalent technology.

Only the first step is addressed here. For the second step I suggest the use of free plugins such as Ambix, O3A, Wigware, IEM, etc.

A B-format Ambix signal is, in general, a 4-channel signal (WYZX). W is the sound pressure signal, as captured by a perfectly omnidirectional microphone. X, Y and Z are the "first order spherical harmonics", which in practice are proportional to the Cartesian components of the particle velocity vector.

The Cartesian reference conforms to ISO standards (for example, ISO 2631), as in the following picture:

So we consider first the case in which You have a UHJ stereo signal (taken, for example, from a Nimbus CD), and You want to extract a B-format signal from it. Of course, the Z channel cannot be extracted, and will be set to zero (horizontal-only surround).

The UHJ (LR) to B-format (WXY) conversion is described by the following formulas:

W = 0.5*(0.982*L + 0.982*R + j*0.164*L - j*0.164*R)

X = 0.5*(0.419*L + 0.419*R - j*0.828*L + j*0.828*R)

Y = 0.5*(0.763*L - 0.763*R + j*0.385*L + j*0.385*R)
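
As an illustration, the three formulas above can be applied directly to a pair of sample arrays. The following is only a minimal sketch in Python with NumPy, not the filter matrix distributed with this page. It assumes that j denotes a +90 degree phase shift (the usual UHJ convention), it uses a single FFT over the whole signal (so the phase shift is circular, which is acceptable only for a quick offline check), and it applies no gain scaling beyond the coefficients written above. The function names phase_shift_90 and uhj_to_ambix are hypothetical.

```python
import numpy as np


def phase_shift_90(x):
    """Advance every frequency component of a real signal by 90 degrees.

    Assumption: this is what the operator j means in the formulas above.
    """
    spectrum = 1j * np.fft.rfft(x)
    spectrum[0] = 0.0            # DC cannot be phase shifted
    if len(x) % 2 == 0:
        spectrum[-1] = 0.0       # neither can the Nyquist bin
    return np.fft.irfft(spectrum, n=len(x))


def uhj_to_ambix(left, right):
    """Return a (4, n) array in Ambix channel order W, Y, Z, X."""
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    jl = phase_shift_90(left)    # j*L
    jr = phase_shift_90(right)   # j*R

    # Same coefficients and signs as the formulas in the text.
    w = 0.5 * (0.982 * left + 0.982 * right + 0.164 * jl - 0.164 * jr)
    x = 0.5 * (0.419 * left + 0.419 * right - 0.828 * jl + 0.828 * jr)
    y = 0.5 * (0.763 * left - 0.763 * right + 0.385 * jl + 0.385 * jr)
    z = np.zeros_like(w)         # horizontal only: Z is set to zero

    return np.stack([w, y, z, x])
```

For long recordings a block-based convolution with finite impulse responses, as done on this page with X-volver, is the more practical route; the sketch is meant only to make the arithmetic of the formulas explicit.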

These equations were implemented in the frequency domain and transformed back into three stereo impulse responses in the time domain, at a 48 kHz sampling rate. They were packed in a single 4-channel file to be used as a 2x4 filter matrix inside the X-volver VST plugin, with 2 inputs and 4 outputs.
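
The filter-matrix processing described above amounts to convolving each input with one impulse response per output and summing the results. The sketch below shows that operation in Python with NumPy. The array layout (inputs x outputs x taps) and the name apply_filter_matrix are assumptions made for the example and do not describe the file format actually read by X-volver.

```python
import numpy as np


def apply_filter_matrix(inputs, filters):
    """Convolve n_in input channels with an n_in x n_out matrix of FIR filters.

    inputs:  array of shape (n_in, n_samples)
    filters: array of shape (n_in, n_out, n_taps)  (assumed layout)
    returns: array of shape (n_out, n_samples + n_taps - 1)
    """
    inputs = np.asarray(inputs, dtype=float)
    filters = np.asarray(filters, dtype=float)
    n_in, n_out, n_taps = filters.shape
    if inputs.shape[0] != n_in:
        raise ValueError("number of input channels does not match the matrix")

    outputs = np.zeros((n_out, inputs.shape[1] + n_taps - 1))
    for i in range(n_in):
        for o in range(n_out):
            # Each output is the sum of every input filtered by its own response.
            outputs[o] += np.convolve(inputs[i], filters[i, o])
    return outputs
```

With a 2x4 matrix, the two inputs would be the UHJ left and right channels and the four outputs the Ambix channels, the filters feeding the Z output being all zeros.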

To extract the three non-zero channels W, X and Y from the 2-channel UHJ stereo signal, the processing paths are as follows:

In practice, the conversion from UHJ to Ambix can be done in a single step using X-volver, as shown here:

If your goal was just to create the "spatial audio" soundtrack for your 360-degree video before uploading it to Youtube or Facebook, then you are almost done. You just have to mux your 360-degree video with your new 4-channel Spatial Audio soundtrack.

There are two methods for muxing video and audio and adding the proper Spatial Media Metadata required by Facebook and Youtube to recognize that it is a 360-degree video with Ambix spatial audio:

  1. Using FFMPEG and the Google Spatial Media Metadata Injector
  2. Using the Facebook Spatial Workstation

In the first case, this is the command that makes FFMPEG merge the audio and video tracks:

 

  ffmpeg -i your_video.mp4 -i your_spatial_audio.wav -map 0:v -map 1:a -c:v copy -c:a copy -shortest MyVideo.mov

 

Please note that the output will be inside a MOV container, as this is the preferred format for preserving audio quality on Youtube and Facebook (no lossy compression).
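
If you prefer to script this step, the same command can be launched from Python. This is only a sketch: it assumes that the ffmpeg executable is on your PATH, and the function names build_mux_command and mux_video_and_audio are hypothetical. The options are exactly those of the command line shown above.

```python
import subprocess


def build_mux_command(video_path, audio_path, output_path):
    """Return the ffmpeg argument list used on this page for muxing."""
    return [
        "ffmpeg",
        "-i", video_path,      # input 0: the 360-degree video
        "-i", audio_path,      # input 1: the 4-channel Ambix soundtrack
        "-map", "0:v",         # take the video stream from input 0
        "-map", "1:a",         # take the audio stream from input 1
        "-c:v", "copy",        # do not re-encode the video
        "-c:a", "copy",        # do not re-encode the audio
        "-shortest",           # stop at the end of the shorter stream
        output_path,
    ]


def mux_video_and_audio(video_path, audio_path, output_path):
    """Run ffmpeg and raise CalledProcessError if it reports a failure."""
    subprocess.run(build_mux_command(video_path, audio_path, output_path), check=True)
```

For example, mux_video_and_audio("your_video.mp4", "your_spatial_audio.wav", "MyVideo.mov") reproduces the command given above.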

Now you need to inject the Spatial Media Metadata using the Google Spatial Media Metadata Injector:

Just select the flags for 360-degree video and for Spatial Audio, then click on "Inject Metadata" and specify a new name for the resulting MOV file, which will be ready to be uploaded to Youtube or Facebook.

The alternative, simpler method is to first install the free Facebook Spatial Workstation. One of the modules it installs is named the Encoder. Just launch it, specify the correct format and filenames for your 360-degree video and for your 4-channel Spatial Audio Ambix soundtrack, specify the output format you want to get (Youtube Video with 1st order Ambix soundtrack) and then click on "Encode":

The result will be an MP4 video with the proper metadata already injected, ready to be uploaded to Youtube.


If instead you want to further decode your Ambix soundtrack to a "surround" loudspeaker array (typically in the ITU 5.1 arrangement), then you can use a decoder plugin, as shown here:

Note that in this case the output is routed to a 6-channel WAV file, and that the channel numbering must conform to the Microsoft Wave Format Extensible standard (which is also the SMPTE & ITU specification), which is defined as:

  1. Front Left

  2. Front Right

  3. Center

  4. LFE (unused in this case)

  5. Left Surround (Rear Left)

  6. Right Surround (Rear Right)

The above channel order is also the order typically expected at the input of Dolby-AC3/DTS/MLP/WMA "surround" encoders.
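
When the loudspeaker feeds come out of your own processing as separate signals, they have to be placed in the six columns of the file in exactly the order listed above. A possible sketch in Python with NumPy follows; the feed names and the function pack_5_1 are hypothetical, and any channel that is not supplied (here typically the LFE) is left silent.

```python
import numpy as np

# Channel order listed above (positions 1 to 6 of the 6-channel WAV file).
WAVE_5_1_ORDER = (
    "front_left",
    "front_right",
    "center",
    "lfe",
    "left_surround",
    "right_surround",
)


def pack_5_1(feeds, n_samples):
    """Arrange named loudspeaker feeds into an (n_samples, 6) frame array."""
    unknown = set(feeds) - set(WAVE_5_1_ORDER)
    if unknown:
        raise ValueError("unknown channel names: %s" % sorted(unknown))

    frames = np.zeros((n_samples, len(WAVE_5_1_ORDER)))
    for column, name in enumerate(WAVE_5_1_ORDER):
        if name in feeds:
            frames[:, column] = feeds[name]   # missing feeds stay at zero
    return frames
```

The resulting array has one row per sample frame and one column per channel, which is the layout most audio file libraries expect when writing a multichannel WAV file.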

Remember that You can download the filter matrix and the Plogue Bidule patches from my web site, just click here.

You may also be interested in the inverse procedure, that is, converting a B-format (Ambix) signal into a UHJ stereo file. See the corresponding new Ambix to UHJ conversion page for this.


Angelo Farina, August 2017 - August 2018