@dpboard
Last active November 16, 2023 08:47
Building an 'Automatic Gain Control' with SoX

SoX is a great tool for converting and processing audio. As an engineer working at a radio station I find it especially useful for processing audio files so the volume is even and smooth throughout. A recent use case was preparing news bulletins recorded by a non-technical journalist so they sounded clear and loud for an Amazon Alexa skill.

What is an AGC?

Automatic Gain Control (AGC) is an audio processing technique designed to smooth out volume differences between different parts of an audio programme. The majority of radio stations will have a device at their transmitter (for example, an Optimod) that does just that. Parts of the broadcast that are too quiet are boosted and parts that are too loud are attenuated.

For those familiar with audio compressors, an AGC is basically a compressor with a high ratio but long attack and release times. The ratio has to be high so the audio level is tightly controlled to stay at the 'target' audio level. The attack and release have to be long so that the AGC sounds 'transparent' and doesn't introduce any 'pumping' or distortion artefacts.

A plot of the input level vs the output level would look like this:

[Figure: example AGC transfer function]

In this example low-level sounds below an input level of -40dB are left untouched (level in = level out). This is so our AGC doesn't try and boost the volume of background noises (which would make the audio sound pretty nasty). We call this level the threshold.

Above our threshold of -40dB the plot becomes a flat line at an output level of -20dB. Parts of the audio between -40dB and -20dB are boosted up to -20dB. Parts of the audio between -20dB and 0dB are pulled down. The effect is that the whole audio programme ends up being smoothed out to our target level of -20dB.
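The idealised curve described above can be sketched in a few lines of shell. This is only an illustration of the static input/output mapping, assuming whole-dB levels and the -40dB threshold and -20dB target from the example; the function name agc_out is made up for this sketch and it does no audio processing.

```shell
# Idealised AGC transfer curve, using whole dB values.
threshold=-40   # at or below this level, audio is left untouched
target=-20      # above the threshold, audio is held at this level

# agc_out is a hypothetical helper: prints the output level for an input level.
agc_out() {
    if [ "$1" -le "$threshold" ]; then
        echo "$1"          # level in = level out
    else
        echo "$target"     # boosted up or pulled down to the target
    fi
}

for level in -60 -40 -30 -20 -10 0; do
    echo "in: ${level}dB -> out: $(agc_out "$level")dB"
done
```

Everything above the threshold lands on -20dB, which is the flat part of the plot.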

How to do this in SoX

This kind of AGC can be built in SoX using the compand effect. Given a threshold of -40dB and a target of -20dB, our compand effect would look like this:

compand 2,2 -40,-40,-35,-20,0,-20 -10 -60 1

Let's break that down:

  • 2,2 - Attack and release (SoX calls the release 'decay'), in seconds and separated by a comma. We want both to be nice and slow, so 2s should do it.
  • -40,-40,-35,-20,0,-20 - The next part is our transfer function which is just a list of the points on the above graph in the form in1,out1,in2,out2,in3,out3...
  • -10 - Overall gain to apply. This depends on exactly what our target level is, but it is good advice to give yourself plenty of headroom and then normalise/boost/limit later on in the SoX processing chain if necessary.
  • -60 - 'Initial volume'. You need to give SoX a hint about what the level of the first bit of audio is going to be, so it knows at what level to set the compander initially. As most audio starts out with a short period of silence, a large-ish negative number is recommended.
  • 1 - Delay. This means the volume adjustment lags behind the volume detection by 1s. It is generally a good idea to set the delay to roughly the same order as your attack and release parameters, so the AGC can 'look ahead'. This means it will pre-emptively reduce the volume before a suddenly loud piece of audio. I find this to be a good thing to have to avoid hurting your audience's ears!
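Put together, a full invocation might look like the sketch below. The file names bulletin.wav and bulletin-agc.wav are placeholders, and the attack and decay times are written with a comma between them, which is the form SoX's compand effect accepts. The script only calls sox if it is installed and the input file exists; otherwise it prints the command it would have run.

```shell
# Placeholder file names; substitute your own.
in_file="bulletin.wav"
out_file="bulletin-agc.wav"

# attack,decay   transfer function   gain   initial volume   delay
agc="compand 2,2 -40,-40,-35,-20,0,-20 -10 -60 1"

if command -v sox >/dev/null 2>&1 && [ -f "$in_file" ]; then
    # $agc is deliberately unquoted so it splits into separate arguments
    sox "$in_file" "$out_file" $agc
else
    echo "would run: sox $in_file $out_file $agc"
fi
```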

The exact parameters will depend on the exact nature of your input audio, so experiment. However, I recommend figuring out what the average level of most of your unprocessed audio will be.

A good way to do this is to import your unprocessed files in Audacity and measure the average RMS level with dpMeter II. Do this for a few files (if you intend to batch process) and work out an average. Note this as the target level.
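If you have noted down an RMS reading for each file, a small helper can work out the average for you. This is a rough sketch that takes a plain arithmetic mean of the dB figures, which should be close enough for picking a starting point; mean_db is a made-up name and the three readings are invented for illustration.

```shell
# mean_db is a hypothetical helper: arithmetic mean of dB readings,
# printed to one decimal place.
mean_db() {
    [ "$#" -gt 0 ] || return 1   # nothing to average
    printf '%s\n' "$@" | awk '{ sum += $1 } END { printf "%.1f\n", sum / NR }'
}

# Invented RMS readings from three files, in dB
mean_db -22.5 -19.0 -21.5
# prints -21.0
```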

Given a certain target level the following compand effect will give you a good starting point:

compand 2,2 -60,-60,-50,<target>,0,<target> <target> -60 1
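For batch processing it can be handy to generate that effect string from the target level. The sketch below simply fills in the template above, with a comma between the attack and decay times as SoX's compand expects; agc_effect is a hypothetical helper name, not part of SoX.

```shell
# agc_effect is a hypothetical helper: prints the compand starting point
# for a given target level in dB (for example -20).
agc_effect() {
    printf 'compand 2,2 -60,-60,-50,%s,0,%s %s -60 1\n' "$1" "$1" "$1"
}

agc_effect -20
# prints: compand 2,2 -60,-60,-50,-20,0,-20 -20 -60 1

# Possible use (file names are placeholders):
#   sox in.wav out.wav $(agc_effect -20)
```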
@StuartIanNaylor

compand 2:2 -40,-40,-35,-20,0,-20 -10 -60 1

sox FAIL compand: there must be an even number of attack/decay parameters

@AndreyBocharnikov

AndreyBocharnikov commented Mar 1, 2022

If anybody else gets the error above, change 2:2 to 2,2. You may also get an error after copying that command; as it is said here, that is due to the minus sign. Delete it and type it again. It might help; it helped me.

@dannybpng

sox can provide you with the RMS and lots of other information from the command line with:

sox narration.wav -n stat -rms
