@Hiradur
Last active February 20, 2024 17:52
Guide for achieving speaker-based immersive (3D) audio in hundreds of PC games using OpenAL Soft and Ambisonics

Speaker-based immersive (3D) audio in hundreds of PC games via Ambisonics using OpenAL Soft

Introduction

It's possible to achieve speaker-based immersive (3D) audio in many PC games that don't seem to be supported by current proprietary object-based audio technologies, using some tinkering and a technology called Ambisonics. Since Ambisonics seems to have gone largely unnoticed in the consumer space so far, I will explain what it is, what its benefits are, what content is available right now, and how to set it up on a PC for gaming.

Theoretical Background

What is Ambisonics?

Many should be familiar with multi-channel audio based on discrete speaker feeds or the more recent object-based audio. Ambisonics uses neither of these approaches. Instead, it describes a continuous full-sphere sound field around a single point in space. A so-called Ambisonics decoder uses this information in combination with a decoding matrix, which is specific to a given speaker layout, to reproduce this sound field as well as possible with the speakers available. Ambisonics sound fields are stored and transmitted in B-Format.

What are the benefits?

Ambisonics is not limited to a specific number of speakers or transmission channels. It is also not constrained by a maximum number of audio objects that can be transmitted in parallel. That being said, higher precision requires more transmission channels (Higher-order Ambisonics). It is continuous, theoretically allowing for smooth rotation of sources around the listener on a full sphere. Reproduction of the sound field takes a psychoacoustic model into account to improve localization of sources. Interestingly, with Ambisonics, no matter where the virtual source of a sound is located, all speakers always emit a signal for that sound. However, these signals are filtered and vary in level so that the listener perceives the sound as coming from its intended virtual position.

Ambisonics was first conceived in the 1970s. To my knowledge, the original patents surrounding it have expired. If this is true, interested parties could use it without paying licensing fees, which could facilitate the creation of content with immersive audio.
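To make the channel cost of higher orders concrete: a full-sphere Ambisonics stream of order N carries (N + 1)² channels. The following is a small sketch of that arithmetic; the function name ambi_channels is made up for this illustration.

```shell
#!/bin/sh
# Number of B-Format channels for full-sphere Ambisonics of a given order.
# Formula: (order + 1)^2. The function name is illustrative only.
ambi_channels() {
    n=$1
    echo $(( (n + 1) * (n + 1) ))
}

# Print the channel count for orders 0 to 3.
for order in 0 1 2 3; do
    echo "order $order: $(ambi_channels "$order") channels"
done
```

Second order, for example, needs 9 channels between the source and the decoder, and third order needs 16.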

Ambisonics playback chain and routing

To give a basic understanding of how audio flows through an Ambisonics system, the following diagrams show the playback chain.

Visualization of an Ambisonics playback chain

In principle, a setup for Ambisonics playback looks like this:

source (audio player playing an Ambisonics file, game ...)
  |
  | B-Format audio stream
  v
Ambisonics decoder <-- decoding matrix that is appropriate for the speaker setup
  |
  | digital speaker feeds
  v
sound card(s)/audio device(s)
  |
  | digital or analog low-level speaker feeds
  v
amplifier(s)
  |
  | analog high-level speaker feeds
  v
speakers

For instance, when using OpenAL Soft (as an audio library loaded by a game), it would look like this:

      game <-- OpenAL Soft library with Ambisonics rendering
|-----|  |---------------------------| OpenAL Soft outputs one of the following two options
|                                    |
| decoded stream (2.0, 5.1, 7.1)     | B-Format audio stream
|                                    v
|                                Ambisonics decoder
|                                    |
|---------------|--------------------| digital speaker feeds
                v
      sound card(s)/audio device(s)
                v
             amplifier
                v
             speakers

Channel routing

The routing of decoded streams to speaker channels happens either at the software level on the PC or physically through cabling. For example, you could use a regular 7.1 output to drive a customized speaker layout that looks nothing like a typical 7.1 setup (for instance, 2 of the speakers could be used as ceiling speakers).

Ambisonics Content

Games seem to be a good provider of continuous full-sphere surround sound content given that players can interactively navigate through a world and the position of audio sources is often dynamic. There are hundreds of games where Ambisonics can be used already, although some tinkering is required.

Supported games

A large number of games can support Ambisonics thanks to OpenAL Soft, a free and open source 3D audio library. It renders internally in Ambisonics and can output either decoded streams for setups with up to 7.1 channels or undecoded B-Format streams for use with Ambisonics decoders.

  • Games that use the OpenAL API natively (list 1, list 2). For some games, it might be necessary to replace a bundled OpenAL driver with OpenAL Soft and configure the latter for your setup.
  • Games that use the DirectSound3D API (list 1, list 2). This is possible thanks to DSOAL, a wrapper library that essentially translates DirectSound3D calls to OpenAL calls. If you use OpenAL Soft as the OpenAL driver, the use of Ambisonics is possible.
  • Games that use the A3D API might also work when using an A3D to DirectSound3D wrapper, DSOAL and OpenAL Soft (not tested by myself).

Note on compatible applications/games on GNU/Linux

On GNU/Linux, the following applications should work:

  • Linux-native games using OpenAL for 3D audio
    • some games might require you to override the OpenAL library that is shipped with the game with the library of OpenAL Soft
  • Windows games using OpenAL for 3D audio running in Wine
    • Ensure that the OpenAL shim of Wine is being used, which passes the OpenAL calls from the Wine environment to the Linux host system. If a game ships its own OpenAL32.dll, the Wine shim might not be used. In this case, delete or rename the OpenAL32.dll shipped with the game.
    • When the OpenAL shim of Wine is being used, all OpenAL calls are passed to the OpenAL library of the Linux host system. This means that any configuration of OpenAL Soft needs to be done on the Linux host, not inside Wine.
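
The renaming step for Windows games in Wine could be scripted roughly as follows. This is only a sketch: the function name, the ".disabled" suffix and the example path are my own choices, and you should look at what it finds before relying on it.

```shell
#!/bin/sh
# Rename every OpenAL32.dll found below a game directory so that the copy
# shipped with the game is no longer picked up.
# The ".disabled" suffix is an arbitrary choice.
disable_bundled_openal() {
    game_dir=$1
    # -iname matches regardless of case (supported by GNU and BusyBox find).
    find "$game_dir" -type f -iname 'openal32.dll' -exec sh -c '
        for dll in "$@"; do
            mv -- "$dll" "$dll.disabled" && echo "renamed: $dll"
        done
    ' sh {} +
}

# Usage (the path is a placeholder for your game directory):
#   disable_bundled_openal "$HOME/games/some-game"
```

Renaming instead of deleting makes it easy to restore the original file if the game stops working.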

This does not work (well):

  • Windows games using the DSOAL wrapper, which translates DirectSound3D (DS3D) calls to OpenAL calls.
    • This is because for this wrapper, a Windows OpenAL library (dll) has to be used that runs inside Wine. Wine does not seem to support passing arbitrary audio channels to the Linux host, which would be required to pass Ambisonics B-Format of higher orders from Wine to the Linux host.
    • There is a workaround, but it has awful and unstable latency:
      1. Make OpenAL Soft write B-Format output to a file that is a named pipe using the wave file writer driver.
      2. Play back the named pipe with mpv on the Linux host.

Content beyond gaming

Even though unrelated to gaming, I'd like to mention a few other sources for Ambisonics content since some nice demo material is available for free.

Free Ambisonics recordings

Commercial Ambisonics recordings

How to use Ambisonics

In general, to set up an Ambisonics listening rig, 4 steps have to be performed:

  1. Plan your speaker layout and set up your speakers accordingly
    • Theoretically, you can place any number of speakers in any way you want. However, not all layouts work equally well. For example, if you have 7 speakers and put all of them in front of you, localization to the sides and rear will be poor.
    • My recommendations:
      • Use a well-known layout as a base (for backwards compatibility) and extend it, i.e. use 5.1 or 7.1 and add some height speakers
      • Take a look at the "One for All" speaker layout for maximum compatibility with non-Ambisonics content
  2. Generate a decoding matrix for your speaker layout
    • This step can be skipped if a decoding matrix for your speaker layout already exists
      • However, if you use, for instance, a 5.1 speaker layout but with different angles compared to the reference layout, you should still generate your own decoding matrix to account for the different angles.
  3. Set up the audio backend chain including the Ambisonics decoder
    • This is operating system dependent; I can only provide instructions for GNU/Linux
  4. Set up games for Ambisonics output

Set up the software

All software mentioned in the following is free and open source software. This guide assumes you use a PC as the audio source. The biggest problem right now is getting more than 7.1 channels from a PC to the speakers. Many of you probably use AV receivers, which currently (AFAIK) do not accept more than 7.1 PCM channels via HDMI. I'm aware of two workarounds:

  • combine multiple sound cards and receivers/amplifiers to drive more than 7.1 speakers
  • use professional audio interfaces with an arbitrary number of channels in combination with powered speakers

If neither of these options is acceptable to you, there is a 3D speaker layout specifically designed to be used with receivers that can process at least 7.1 audio channels. It's called 3D7.1, and OpenAL Soft already ships with a decoding matrix for it (which has to be enabled in the configuration file).

Generate a decoding matrix for your speaker layout

OpenAL Soft already ships with some decoding matrices for standard speaker layouts. If you have a custom setup, you most likely need to generate a matching decoding matrix yourself:

  1. Install GNU Octave and download the Ambisonics Decoding Toolkit (adt).
  2. Launch Octave and navigate to the folder to which you extracted adt.
  3. Double-click adt_initialize.m to open it, then click the "save file and execute" button in the top bar (it looks like a cog with a play button in front of it).
  4. Go to the examples directory and make a copy of one of the example files with a speaker layout that is close to your target layout (e.g. run_dec_itu.m).
  5. In the new file, modify the speaker numbers, angles (both azimuth and elevation) and distances. Look at the code that is already there; it shouldn't be hard to figure out how to do this.
  6. Once your speaker configuration is correct, click the "save file and execute" button again.

If everything went well, you should now find several new files in the decoders directory inside the adt directory. The most important one is the *.ambdec file. It contains the decoding matrix that can be loaded with ambdec or OpenAL Soft.
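
To check from a terminal which decoder files were generated, something like the following sketch could be used. It only assumes the decoders directory mentioned above; the function name and the example path are placeholders.

```shell
#!/bin/sh
# List the *.ambdec decoder files below the "decoders" directory of an
# extracted adt folder, one path per line, sorted by name.
list_ambdec_files() {
    adt_dir=$1
    find "$adt_dir/decoders" -type f -name '*.ambdec' | sort
}

# Usage (the path is a placeholder for where you extracted adt):
#   list_ambdec_files "$HOME/adt"
```

One of the printed paths is what you later pass to ambdec or reference from the OpenAL Soft configuration.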

Set up the audio backend chain including the Ambisonics decoder

This depends on the operating system you'd like to use. Since I only use GNU/Linux myself, I cannot provide instructions for other operating systems. You need a facility that is capable of connecting audio input and output ports of software running on the PC and also capable of connecting audio output of software to hardware device outputs. On GNU/Linux, this facility is provided by JACK. Since this guide is about immersive audio in the consumer space, the following is assumed for this scenario: The user has at least two audio devices, such as a USB audio interface (onboard audio or a sound card is also ok) and an AV receiver that is connected to the GPU via HDMI. These two audio devices are combined to provide enough channels for immersive audio.

  1. First, we need to install some software, namely QJackCtl and ambdec. On Debian-based systems, such as Ubuntu, run:
sudo apt install qjackctl ambdec

  2. Launch QJackCtl, a GUI for the JACK audio server.

  3. Configure the JACK audio server for the first audio device. In the settings window of QJackCtl, choose the USB audio interface/onboard audio/sound card you have. Selecting the GPU as the main device doesn't work, at least in my case. Should you later experience audio glitches, you might need to fine-tune the sample rate, frames/period and periods/buffer settings, but we will ignore these for this guide.

  4. Start the JACK audio server. In QJackCtl, use the Start button to launch the JACK audio server.

  5. Configure the second audio device (the GPU). We use alsa_out to make the GPU available for audio output in JACK. The number after -c sets the number of channels. For typical AV receivers, this should be either 6 or 8, depending on whether they output 5.1 or 7.1.

# To determine the correct device name and number for your GPU output, use aplay -l.
# You might need to experiment to determine which device number belongs to which
# HDMI/DP port.
alsa_out -d hw:NVidia,7 -c 8 &

Should you later notice a timing difference between the two audio devices, you might need to enable direct mode on your AV receiver, which disables all internal processing to minimize latency.

  6. Launch an Ambisonics decoder that decodes B-Format to a given speaker layout. Be sure to replace the placeholder after -c.
ambdec -p /usr/share/ambdec/presets/ -c <path to the preset that is specific to your speaker layout> -V 0 &
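
Finding the device name to pass to alsa_out -d can be made a little easier with a small filter over the output of aplay -l. This is a sketch under two assumptions: aplay prints its English output format, and the GPU outputs are labelled "HDMI" in that output. The function name is made up.

```shell
#!/bin/sh
# Convert "aplay -l" output (read from standard input) into ALSA device names
# of the form hw:CARD,DEVICE, keeping only lines that mention HDMI.
# Assumes the English output format, hence LC_ALL=C in the usage line below.
hdmi_devices() {
    grep 'HDMI' |
        sed -n 's/^card [0-9]*: \([^ ]*\) \[.*\], device \([0-9]*\): .*$/hw:\1,\2/p'
}

# Usage:
#   LC_ALL=C aplay -l | hdmi_devices
```

An example line such as "card 1: NVidia [HDA NVidia], device 7: HDMI 1 [HDMI 1]" becomes hw:NVidia,7. You still need to try the candidates to find out which one belongs to the port your receiver is connected to.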

Note: For playing back Ambisonics recordings using mpv, you can follow this guide.

Set up games for Ambisonics output

Currently, I only know how to make games output Ambisonics B-Format when they use the OpenAL API. For compatible games, see the "Supported games" section above. In the following, I describe how to configure OpenAL Soft to output Ambisonics B-Format and how to configure ambdec to decode it properly.

  1. Extract the template of the OpenAL Soft configuration file to your home folder (this command overwrites your previous configuration file!).
gunzip /usr/share/doc/libopenal1/examples/alsoftrc.sample.gz -c > ~/.alsoftrc
  2. In ~/.alsoftrc, set the following settings:
    • drivers = jack
    • channels = ambi2 or another value depending on how many speakers you have
    • ambi-format = ambix
    • spawn-server = false
    • connect-ports = false
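
As an alternative to editing the full template, a minimal configuration file containing only these settings could be written as sketched below. The placement of the keys in the [general] and [jack] sections follows the layout of alsoftrc.sample as I understand it; please verify it against the sample shipped with your OpenAL Soft version. The function name is made up.

```shell
#!/bin/sh
# Write a minimal OpenAL Soft configuration with the settings listed above.
# The section names follow the layout of alsoftrc.sample; check them against
# the sample shipped with your OpenAL Soft version.
write_alsoft_config() {
    target=$1
    cat > "$target" <<'EOF'
[general]
drivers = jack
channels = ambi2
ambi-format = ambix

[jack]
spawn-server = false
connect-ports = false
EOF
}

# Usage (this overwrites an existing file at that path):
#   write_alsoft_config "$HOME/.alsoftrc"
```

A short file like this is easier to review than the full template, but you lose the explanatory comments of the sample.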

Launching and connecting a game

  1. Launch the game.

  2. Open the connections window in QJackCtl.

  3. Connect the output ports of alsoft to ambdec.

    • for the ambix format (as configured above), connect the ports straight through, i.e.:
    # alsoft ambdec
    #    0 -> 0
    #    1 -> 1
    #    2 -> 2
    #    ...

  4. In ambdec, in the configuration menu, set input scaling to SN3D.

  5. Optionally save your ambdec configuration.

  6. Save your connections in QJackCtl. Once everything is connected correctly, open the patchbay in QJackCtl, create a new preset, and answer yes when asked whether you want to create a patchbay configuration based on the current connections. If you activate this preset in the patchbay menu, all ports will be connected automatically the next time you launch the software, as long as the preset is active.
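
If you prefer the command line over the connections window, the straight-through mapping can be sketched with jack_lsp and jack_connect. The helper below only prints the commands so that you can review them first. The function name is made up, and the port name patterns in the usage comment are assumptions, so check the real names with jack_lsp.

```shell
#!/bin/sh
# Print one jack_connect command per port pair: line N of the first file is
# paired with line N of the second file (straight-through mapping).
# Nothing is connected here; review the printed commands, then run them.
print_straight_connections() {
    src_list=$1
    dst_list=$2
    tab=$(printf '\t')
    paste "$src_list" "$dst_list" | while IFS=$tab read -r src dst; do
        # Skip incomplete pairs if the two lists differ in length.
        if [ -z "$src" ] || [ -z "$dst" ]; then
            continue
        fi
        printf "jack_connect '%s' '%s'\n" "$src" "$dst"
    done
}

# Usage (the port name patterns are assumptions; check them with jack_lsp):
#   jack_lsp | grep '^alsoft:' > alsoft_ports.txt
#   jack_lsp | grep '^ambdec:in' > ambdec_ports.txt
#   print_straight_connections alsoft_ports.txt ambdec_ports.txt
```

Make sure both lists are in channel order before running the printed commands, since a wrong order leads to incorrect spatialization.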

Troubleshooting
  • Set these environment variables to find debug information from OpenAL Soft in 0_alsoft.log:
ALSOFT_LOGFILE=0_alsoft.log
ALSOFT_LOGLEVEL=3
  • If there is no audio at all: double-check that OpenAL Soft is connected to ambdec in JACK
    • store the connections in a patchbay preset and activate it to have the ports connected automatically next time
  • If there is audio with incorrect spatialization: double-check that all channels are connected correctly and that the correct Ambisonics channel ordering is used
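
The two logging variables can be set for a single launch instead of globally. A minimal sketch, in which the wrapper name and the game binary are placeholders:

```shell
#!/bin/sh
# Run a command with OpenAL Soft logging enabled for that command only.
# The log file name and level are the ones suggested above.
run_with_alsoft_log() {
    env ALSOFT_LOGFILE=0_alsoft.log ALSOFT_LOGLEVEL=3 "$@"
}

# Usage (the game binary name is a placeholder):
#   run_with_alsoft_log ./some-game
```

Because the log file name is a relative path, the log ends up in the working directory of the game process.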

Closing words

I'm rather new to Ambisonics myself and what I've covered above is only the tip of the iceberg. Ambisonics UHJ, Higher-order Ambisonics, and optimization of speaker layouts and decoding matrices are just a few topics that I didn't go into. I don't claim that everything I've written above is correct, and I appreciate any constructive criticism.

I find the idea of having a format that basically scales to an infinite number of speakers positioned in arbitrary layouts very intriguing. Additionally, with UHJ encoding, there seems to be a handy format for transmitting 360° 2D surround sound using only 2 channels (stereo) for compatibility and efficiency. Imagine that any video stream you watch on the Internet would play just fine in stereo on any device, but would also let you enable a UHJ decoder to get 360° 2D surround sound on appropriate systems. The same would theoretically be possible for 2-channel music... According to the OpenAL Soft developer, it might even be possible to encode 3- or 4-channel UHJ (required for full-sphere 3D surround sound) into a stereo file if it has a bit depth of 24 bits. The first 16 bits would be used for 2-channel UHJ, whereas the remaining 8 bits would carry the other two channels.

The biggest problem of Ambisonics right now clearly is the lack of consumer-friendliness. I think this could largely be resolved, since it should be possible to hide most of the underlying complexity: ideally, modern AV receivers would accept B-Format and decode it to user-configured speaker layouts. Should receivers not be powerful enough to calculate a decoding matrix themselves, the user could upload one from a USB stick or through other means.

Further reading

If you are interested in learning more about Ambisonics, here is some reading I recommend:

Acknowledgments

I'd like to thank the following people:

  • KittyCat, for developing OpenAL Soft and for tirelessly answering the many questions I brought up over the years.
  • I Drink Lava and the 3D audio community, for compiling a lot of valuable information around 3D audio APIs and their use in games.
  • The many people that generously provide some of their Ambisonics recordings for free.
  • The many people that I didn't mention explicitly, such as the contributors of the software and tools I mentioned in this guide.