dpboard/1-SOXAGC.md

## 1-SOXAGC.md

      
Building an 'Automatic Gain Control' with SoX

SoX is a great tool for converting and processing audio. As an engineer working at a radio station, I find it especially useful for processing audio files so the volume is even and smooth throughout. A recent use case was preparing news bulletins recorded by a non-technical journalist so they sounded clear and loud for an Amazon Alexa skill.
What is an AGC?

Automatic Gain Control (AGC) is an audio processing technique designed to smooth out volume differences between different parts of an audio programme. The majority of radio stations will have a device at their transmitter (for example, an Optimod) that does just that. Parts of the broadcast that are too quiet are boosted and parts that are too loud are attenuated.
For those familiar with audio compressors, an AGC is basically a compressor with a high ratio but long attack and release times. The ratio has to be high so the audio level is tightly controlled to stay at the 'target' audio level. The attack and release have to be long so that the AGC sounds 'transparent' and doesn't introduce any 'pumping' or distortion artefacts.
A plot of the input level vs the output level would look like this (see 2-agctransferfunction.png):

In this example, low-level sounds below an input level of -40dB are left untouched (level in = level out). This is so our AGC doesn't try to boost the volume of background noises (which would make the audio sound pretty nasty). We call this level the threshold.
Above our threshold of -40dB the plot becomes a flat line at an output level of -20dB. Parts of the audio between -40dB and -20dB are boosted up to -20dB. Parts of the audio between -20dB and 0dB are pulled down. The effect is that the whole audio programme ends up being smoothed out to our target level of -20dB.
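
The curve described above can be sketched numerically. This is only a minimal illustration, not SoX's internal implementation: `agc_out_db` is a hypothetical helper name, and the short ramp between -40dB and -35dB is an assumption taken from the compand points used later in this article (it avoids a sudden jump at the threshold).

```shell
# Hypothetical helper: idealised AGC transfer curve (input dB -> output dB).
# Assumes threshold -40 dB, target -20 dB, and a linear ramp from -40 to -35 dB.
agc_out_db() {
  awk -v x="$1" 'BEGIN {
    if (x <= -40)      y = x                    # below threshold: left untouched
    else if (x <= -35) y = -40 + (x + 40) * 4   # ramp: 20 dB rise over 5 dB of input
    else               y = -20                  # everything else sits at the target
    printf "%g\n", y
  }'
}

# Example: print the output level for a few input levels.
for level in -60 -40 -38 -30 -10 0; do
  printf '%4s dB in -> %s dB out\n' "$level" "$(agc_out_db "$level")"
done
```

Quiet passages between the threshold and the target come out louder, loud passages come out quieter, and anything below the threshold passes through unchanged.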
How to do this in SoX

This kind of AGC can be built in SoX using the compand effect. Given a threshold of -40dB and a target of -20dB, our compand effect would look like this:
compand 2,2 -40,-40,-35,-20,0,-20 -10 -60 1

Let's break that down:

2,2 - Attack and release (SoX calls the second value 'decay'), in seconds. We want both to be nice and slow, so 2s should do it.
-40,-40,-35,-20,0,-20 - The next part is our transfer function which is just a list of the points on the above graph in the form in1,out1,in2,out2,in3,out3...
-10 - Overall gain to apply. This depends on exactly what our target level is, but it is good advice to give yourself plenty of headroom and then normalise/boost/limit later on in the SoX processing chain if necessary.
-60 - 'Initial volume'. You need to give SoX a hint about what the level of the first bit of audio is going to be, so it knows at what level to set the compander initially. As most audio starts out with a short period of silence, a large-ish negative number is recommended.
1 - Delay. This means the volume adjustment lags behind the volume detection by 1s. It is generally a good idea to set the delay to roughly the same order as your attack and release parameters, so the AGC can 'look ahead'. This means it pre-emptively reduces the volume before a suddenly loud piece of audio. I find this to be a good thing to have to avoid hurting your audience's ears!
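
Putting the pieces together, a complete invocation might look like the sketch below. `agc_file` is a hypothetical wrapper and the file names are placeholders. The attack/decay pair is written with a comma, which is the form the SoX manual documents for `compand`.

```shell
# Hypothetical wrapper: apply the AGC described above to one file.
# Usage: agc_file input.wav output.wav   (file names are placeholders)
agc_file() {
  src="$1"
  dst="$2"
  # 2,2                     attack and decay (release), both 2 s
  # -40,-40,-35,-20,0,-20   transfer function points (in,out pairs in dB)
  # -10                     overall gain in dB, leaving headroom
  # -60                     initial volume estimate in dB
  # 1                       delay in seconds (look-ahead)
  sox "$src" "$dst" compand 2,2 -40,-40,-35,-20,0,-20 -10 -60 1
}
```

Wrapping the effect in a function keeps the parameters in one place, which helps when batch processing several bulletins with the same settings.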

The exact parameters will depend on the nature of your input audio, so experiment. However, I recommend figuring out what the average level of most of your unprocessed audio will be.
A good way to do this is to import your unprocessed files in Audacity and measure the average RMS level with dpMeter II. Do this for a few files (if you intend to batch process) and work out an average. Note this as the target level.
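
If you would rather stay on the command line, SoX's own `stats` effect reports an RMS level that can serve a similar purpose, though its reading will not necessarily match dpMeter II exactly. The sketch below is an alternative under stated assumptions rather than the workflow described above: `rms_db` and `average_db` are hypothetical helpers, the parsing assumes the overall figure is the fourth field of the `RMS lev dB` line, and the average is a plain arithmetic mean of the dB readings.

```shell
# Hypothetical helper: overall RMS level (dB) of one file via SoX's stats effect.
# stats writes its report to stderr, hence the 2>&1.
rms_db() {
  sox "$1" -n stats 2>&1 | awk '/^RMS lev dB/ { print $4 }'
}

# Hypothetical helper: arithmetic mean of dB readings, to one decimal place.
average_db() {
  printf '%s\n' "$@" | awk '{ sum += $1; n++ } END { if (n) printf "%.1f\n", sum / n }'
}

# Example (file names are placeholders):
#   average_db "$(rms_db bulletin1.wav)" "$(rms_db bulletin2.wav)"
```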
Given a certain target level the following compand effect will give you a good starting point:
compand 2,2 -60,-60,-50,<target>,0,<target> <target> -60 1
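
The template can be filled in with a small helper. This is only a sketch: `agc_effect` is a hypothetical name, it simply substitutes the target level into the template above (using the comma-separated attack/decay form that the SoX manual documents), and it does no validation of the value you pass in.

```shell
# Hypothetical helper: print the compand arguments for a given target level (dB).
agc_effect() {
  target="$1"
  printf 'compand 2,2 -60,-60,-50,%s,0,%s %s -60 1\n' "$target" "$target" "$target"
}

# Example (file names are placeholders; the unquoted substitution is
# intentional so the shell splits the output into separate arguments):
#   sox input.wav output.wav $(agc_effect -20)
```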


## 2-agctransferfunction.png

      