What is Ambisonics?

Ambisonics is a spatial audio format that has been around since the 1970s. Its popularity has risen and fallen a few times since then, but it is now undergoing a resurgence with the emergence of virtual reality (VR). In fact, Ambisonics is now used by some of the biggest video platforms on the internet: YouTube and Facebook!

How does Ambisonics work?

Ambisonics works in three steps: encoding, processing and decoding. The encoding stage takes your mono or stereo track and uses a plugin such as the aXPanner to convert it to B-format, a multichannel format in which the gain and polarity relationships between the channels are determined by the source position (azimuth and elevation). B-format is independent of the playback system and can’t be listened to directly. First-order Ambisonics (FOA) has 4 channels: W (omnidirectional) and X, Y and Z (aligned with the Cartesian axes).
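To make the encoding step concrete, here is a minimal sketch of how a single mono sample could be encoded to first-order B-format. It assumes the traditional B-format convention (azimuth measured anticlockwise from straight ahead, elevation measured upwards from the horizontal, W scaled by 1/√2). The function name `encode_foa` is made up for this illustration and is not how any particular plugin works internally.

```python
import math

def encode_foa(sample, azimuth_deg, elevation_deg):
    """Encode one mono sample to first-order B-format (W, X, Y, Z).

    Assumption: traditional B-format convention, with azimuth measured
    anticlockwise from the front and W scaled by 1/sqrt(2).
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)               # omnidirectional component
    x = sample * math.cos(az) * math.cos(el)  # front-back axis
    y = sample * math.sin(az) * math.cos(el)  # left-right axis
    z = sample * math.sin(el)                 # up-down axis
    return w, x, y, z

# A source straight ahead puts all of its directional energy into X,
# while a source hard left puts it into Y instead.
front = encode_foa(1.0, 0.0, 0.0)
left = encode_foa(1.0, 90.0, 0.0)
```

Notice that only the gains and polarities of the four channels change with the source position, which is exactly the relationship described above. Other conventions (for example AmbiX) order and scale the channels differently, so check which one your tools expect.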

The processing step is where you manipulate your sound sources, either individually or as a group. This manipulation can include dynamic range compression, EQ-ing, adding effects such as delay, etc. You can also rotate the whole scene at once. Of course, the B-format signals have to be treated carefully to avoid destroying the spatial information: the direction of each source is carried by the relationships between the channels, so processing that changes one channel differently from the others can shift or smear the image. The aX Plugins were designed specifically with this in mind.

The decoding step converts from B-format to loudspeaker or headphone signals. Decoding for headphones (currently the most convenient and common listening method) uses Head-Related Transfer Functions (HRTFs), which mimic the acoustic characteristics of the head and ears. This is known as binaural decoding and is available in the aXMonitor series. The aim of this sort of decoder is to create a scene that sounds realistic, not “inside the head” like normal stereo over headphones. Perfect for immersive VR! Decoding to loudspeakers requires quite a lot of loudspeakers arranged (ideally) in a sphere around the listening area. It is an amazing (but expensive) sonic experience.
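As a rough illustration of loudspeaker decoding, the sketch below uses one of the simplest approaches: pointing a virtual cardioid microphone at each loudspeaker. It assumes first-order B-format in the traditional convention (W scaled by 1/√2). The square layout and the function name `decode_virtual_cardioids` are hypothetical. Real decoders, and binaural decoders based on HRTFs in particular, are considerably more sophisticated than this.

```python
import math

def decode_virtual_cardioids(w, x, y, z, speaker_directions):
    """Return one feed per loudspeaker from first-order B-format.

    Assumption: each feed is a virtual cardioid microphone aimed at the
    loudspeaker, given as (azimuth, elevation) in degrees.
    """
    feeds = []
    for az_deg, el_deg in speaker_directions:
        az = math.radians(az_deg)
        el = math.radians(el_deg)
        feed = 0.5 * (
            math.sqrt(2.0) * w
            + x * math.cos(az) * math.cos(el)
            + y * math.sin(az) * math.cos(el)
            + z * math.sin(el)
        )
        feeds.append(feed)
    return feeds

# Hypothetical horizontal square: front-left, back-left, back-right, front-right.
square = [(45.0, 0.0), (135.0, 0.0), (-135.0, 0.0), (-45.0, 0.0)]

# B-format for a unit-amplitude source at front-left (azimuth 45 degrees).
az = math.radians(45.0)
w, x, y, z = 1.0 / math.sqrt(2.0), math.cos(az), math.sin(az), 0.0

feeds = decode_virtual_cardioids(w, x, y, z, square)
```

In this example the front-left loudspeaker gets the full signal, its two neighbours get half, and the opposite loudspeaker gets nothing. The same B-format signal could be decoded to a completely different layout just by changing the list of directions, which is what makes B-format independent of playback.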

What is Higher Order Ambisonics?

Higher Order Ambisonics (HOA) increases the number of B-format channels, which increases the spatial resolution of the sound scene. The Ambisonics order can theoretically go to infinity, but in practice DAW plugins are limited to seventh order (64 channels).

An HOA signal of a particular order contains all of the channels required for any lower order. For example, a third order signal contains the first and second order channels, exactly as if you had encoded directly to those orders. Just drop the channels at the end that correspond to the higher order components and you’ve down-mixed to a lower order. Easy!

What this means is that it is always best to mix in the highest order available (and that your CPU can handle), even if it is above the current requirements for the project. Why? Because you will have the lower order signals you need now, but you will also be able to go back and get the higher spatial-resolution signals if/when the decoding requirements improve. It is the best way to future-proof your Ambisonics work!
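The channel counts follow a simple rule: a full 3D signal of order N has (N + 1)² channels, which gives 4 channels at first order and 64 at seventh. The sketch below shows that rule and the "drop the channels at the end" down-mix. It assumes the channels are stored in ascending order (as in the ACN channel ordering), and the helper names are invented for this example.

```python
def channel_count(order):
    """Number of channels in a full 3D Ambisonics signal of the given order."""
    return (order + 1) ** 2

def truncate_to_order(channels, target_order):
    """Down-mix by keeping only the channels needed for target_order.

    Assumption: channels are sorted by ascending Ambisonics order,
    so the lower order channels always come first.
    """
    needed = channel_count(target_order)
    if len(channels) < needed:
        raise ValueError("signal has too few channels for the target order")
    return channels[:needed]

# Stand-in labels for the 16 channels of a third order mix.
third_order = [f"ch{i}" for i in range(channel_count(3))]

# Keep only the first 4 channels to get a first order signal.
first_order = truncate_to_order(third_order, 1)
```

Going the other way is not possible: you cannot recover the higher order channels from a lower order mix, which is why mixing at the highest practical order keeps your options open.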

Why use (Higher Order) Ambisonics for VR?

(Higher Order) Ambisonics is a scene-based technique, which means that it is possible to apply processing to the whole scene efficiently. For example, your final mix will probably have dozens of sound sources that you want to respond to movements made by the end-user. Ambisonics lets you rotate the sound field as a whole, rather than having to do this for each individual source in your scene.
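Here is a minimal sketch of what rotating the whole scene looks like at first order, for a rotation about the vertical axis (yaw). It assumes the traditional B-format convention with X pointing forwards and Y pointing left. The function name `rotate_yaw` is hypothetical. Higher orders need larger rotation matrices, but the idea is the same.

```python
import math

def rotate_yaw(w, x, y, z, angle_deg):
    """Rotate a first-order B-format sample anticlockwise about the vertical axis.

    Assumption: X points forwards and Y points left. W (omnidirectional)
    and Z (vertical) are unaffected by a yaw rotation.
    """
    a = math.radians(angle_deg)
    x_rot = x * math.cos(a) - y * math.sin(a)
    y_rot = x * math.sin(a) + y * math.cos(a)
    return w, x_rot, y_rot, z

# A sample of a mixed scene: here just one source straight ahead,
# but it could equally be the sum of dozens of encoded sources.
scene = (1.0 / math.sqrt(2.0), 1.0, 0.0, 0.0)

# Rotating the scene by 90 degrees moves everything in it to the left.
rotated = rotate_yaw(*scene, 90.0)
```

The cost of this operation depends only on the number of B-format channels, not on how many sources were mixed into the scene. To compensate for a listener turning their head, you would rotate the scene by the opposite of the head rotation.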

Ambisonics is currently used for spatialisation in YouTube 360 and Facebook 360. It is also available in the upcoming version 3.0 of VLC Media Player. These platforms offer a huge opportunity to bring your VR content to a large audience. Since audio is key to a user’s immersion in a VR scene, it is worth learning how to use Ambisonics. To help you, I have put together a short tutorial here.